Online Learning of Independent Cascade Models with Node-Level Feedback

Shuoguang Yang
Corresponding Author
Shuoguang Yang
[email protected]
https://orcid.org/0000-0002-0915-196X
Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR, China
Search for more papers by this author
,
Van-Anh Truong
Van-Anh Truong
[email protected]
Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027
Search for more papers by this author

Corresponding Author

Shuoguang Yang

Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR, China

Search for more papers by this author

Van-Anh Truong

[email protected]

Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027

Search for more papers by this author

Published Online:13 Jun 2025https://doi.org/10.1287/moor.2021.0117

References

[1] Abbasi-Yadkori Y, Pál D, Szepesvári C (2011) Improved algorithms for linear stochastic bandits. Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ, eds. NIPS’11: Proc. 25th Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 2312–2320.Google Scholar
[2] Agrawal S, Avadhanula V, Goyal V, Zeevi A (2017) Thompson sampling for the MNL-bandit. Conf. Learning Theory (PMLR, New York), 76–78.Google Scholar
[3] Agrawal S, Avadhanula V, Goyal V, Zeevi A (2019) MNL-bandit: A dynamic learning approach to assortment selection. Oper. Res. 67(5):1453–1485.Link, Google Scholar
[4] Audibert J-Y, Bubeck S, Lugosi G (2014) Regret in online combinatorial optimization. Math. Oper. Res. 39(1):31–45.Link, Google Scholar
[5] Bharathi S, Kempe D, Salek M (2007) Competitive influence maximization in social networks. Deng X, Graham FC, eds. Internet Network Econom. WINE 2007, Lecture Notes in Computer Science, vol. 4858 (Springer, Berlin, Heidelberg), 306–311.Google Scholar
[6] Bickel PJ, Ritov Y, Tsybakov AB (2009) Simultaneous analysis of lasso and Dantzig selector. Ann. Statist. 37(4):1705–1732.Crossref, Google Scholar
[7] Chen K, Hu I, Ying Z (1999) Strong consistency of maximum quasi-likelihood estimators in generalized linear models with fixed and adaptive designs. Ann. Statist. 27(4):1155–1163.Crossref, Google Scholar
[8] Chen N, Li A, Yang S (2020) Revenue maximization and learning in products ranking. Preprint, submitted December 7, https://arxiv.org/abs/2012.03800.Google Scholar
[9] Chen W, Wang Y, Yuan Y (2013) Combinatorial multi-armed bandit: General framework and applications. ICML’13: Proc. 30th Internat. Conf. Machine Learn., vol. 28 (PMLR, New York), 151–159.Google Scholar
[10] Chen W, Wang Y, Yuan Y, Wang Q (2016) Combinatorial multi-armed bandit and its extension to probabilistically triggered arms. J. Machine Learn. Res. 17(1):1746–1778.Google Scholar
[11] Chen W, Lin T, Tan Z, Zhao M, Zhou X (2016) Robust influence maximization. KDD’16: Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 795–804.Google Scholar
[12] Cheung WC, Tan V, Zhong Z (2019) A Thompson sampling algorithm for cascading bandits. 22nd Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 438–447.Google Scholar
[13] Combes R, Talebi MS, Proutiere A, Lelarge M (2015) Combinatorial bandits revisited. Cortes C, Lee DD, Sugiyama M, Garnett R, eds. NIPS’15: Proc. 29th Internat. Conf. Adv. Neural Inform. Processing Systems, vol. 2 (MIT Press, Cambridge, MA), 2116–2124. Google Scholar
[14] Fisher RA (1997) On an absolute criterion for fitting frequency curves. Statist. Sci. 12(1):39–41.Google Scholar
[15] Gai Y, Krishnamachari B, Jain R (2012) Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations. IEEE/ACM Trans. Networking 20(5):1466–1478.Crossref, Google Scholar
[16] Goyal A, Bonchi F, Lakshmanan LVS (2010) Learning influence probabilities in social networks. WSDM ‘10: Proc. 3rd ACM Internat. Conf. Web Search Data Mining (Association for Computing Machinery, New York), 241–250.Google Scholar
[17] He X, Kempe D (2016) Robust influence maximization. Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 885–894.Google Scholar
[18] He X, Kempe D (2018) Stability and robustness in influence maximization. ACM Trans. Knowledge Discovery Data 12(6):1–34.Crossref, Google Scholar
[19] Katariya S, Kveton B, Szepesvari C, Wen Z (2016) DCM bandits: Learning to rank with multiple clicks. Balcan MF, Weinberger KQ, eds. ICML’16: Proc. 33rd Internat. Conf. Machine Learn., vol. 48 (PMLR, New York), 1215–1224.Google Scholar
[20] Kempe D, Kleinberg J, Tardos É (2003) Maximizing the spread of influence through a social network. KDD’03: Proc. 9th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 137–146.Google Scholar
[21] Kveton B, Szepesvari C, Wen Z, Ashkan A (2015) Cascading bandits: Learning to rank in the cascade model. Bach F, Blei D, eds. ICML’15: Proc. 32nd Internat. Conf. Machine Learn., vol. 36 (PMLR, New York), 767–776.Google Scholar
[22] Kveton B, Wen Z, Ashkan A, Szepesvari C (2015a) Combinatorial cascading bandits. Cortes C, Lee DD, Sugiyama M, Garnett R, eds. NIPS’15: Proc. 29th Internat. Conf. Adv. Neural Inform. Processing Systems, vol. 1 (MIT Press, Cambridge, MA), 1450–1458.Google Scholar
[23] Kveton B, Wen Z, Ashkan A, Szepesvari C (2015b) Tight regret bounds for stochastic combinatorial semi-bandits. Proc. 18th Internat. Conf. Intelligence Statist. (PMLR, New York), 535–543.Google Scholar
[24] Lattimore T, Szepesvári C (2020) Bandit Algorithms (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
[25] Lei S, Maniu S, Mo L, Cheng R, Senellart P (2015) Online influence maximization. PKDD’15: Proc. 21st ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 645–654.Google Scholar
[26] Li L, Lu Y, Zhou D (2017) Provably optimal algorithms for generalized linear contextual bandits. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (PMLR, New York), 2071–2080.Google Scholar
[27] Li S, Kong F, Tang K, Li Q, Chen W (2020) Online influence maximization under linear threshold model. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 1192–1204.Google Scholar
[28] Mannor S, Shamir O (2011) From bandits to experts: On the value of side-observations. Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ, eds. NIPS’11: Proc. 25th Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 684–692.Google Scholar
[29] McAuley J, Leskovec J (2012) Learning to discover social circles in ego networks. NIPS’12: Proc. 26th Internat. Conf. Neural Inform. Processing Systems, vol. 1 (Curran Associates Inc., Red Hook, NY), 539–547.Google Scholar
[30] Nemhauser GL, Wolsey LA, Fisher ML (1978) An analysis of approximations for maximizing submodular set functions—I. Math. Programming 14:265–294.Crossref, Google Scholar
[31] Netrapalli P, Sanghavi S (2012) Learning the graph of epidemic cascades. Sigmetrics Performance Eval. Rev. 40(1):211–222.Crossref, Google Scholar
[32] Oh M-h, Iyengar G (2021) Multinomial logit contextual bandits: Provable optimality and practicality. Proc. AAAI Conf Artificial Intelligence, vol. 35 (AAAI Press, Washington, DC), 9205–9213.Google Scholar
[33] Saito K, Nakano R, Kimura M (2008) Prediction of information diffusion probabilities for independent cascade model. Lovrek I, Howlett RJ, Jain LC, eds. Knowledge-Based Intelligent Inform. Engrg. Systems KES 2008, Lecture Notes in Computer Science, vol. 5179 (Springer, Berlin, Heidelberg), 67–75.Google Scholar
[34] Tang Y, Xiao X, Shi Y (2014) Influence maximization: Near-optimal time complexity meets practical efficiency. SIGMOD’14: Proc. 2014 ACM SIGMOD Internat. Conf. Management Data (Association for Computing Machinery, New York), 75–86.Google Scholar
[35] Toh K-C, Todd MJ, Tütüncü RH (1999) SDPT3—A Matlab software package for semidefinite programming, version 1.3. Optim. Methods Software 11(1–4):545–581.Crossref, Google Scholar
[36] Valko M (2016) Bandits on graphs and structures. Unpublished PhD thesis, École normale supérieure de Cachan (ENS Cachan), Paris.Google Scholar
[37] Vaswani S, Duttachoudhury N (2013) Learning influence diffusion probabilities under the linear threshold model. Computer Science Department, University of British Columbia, Vancouver.Google Scholar
[38] Vaswani S, Lakshmanan L, Schmidt M (2015) Influence maximization with bandits. Preprint, submitted February 27, https://arxiv.org/abs/1503.00024.Google Scholar
[39] Wang Q, Chen W (2017) Improving regret bounds for combinatorial semi-bandits with probabilistically triggered arms and its applications. von Luxburg U, Guyon I, Bengio S, Wallach H, Fergus R, eds. NIPS’17: Proc. 31st Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 1161–1171.Google Scholar
[40] Wen Z, Kveton B, Valko M, Vaswani S (2017) Online influence maximization under independent cascade model with semi-bandit feedback. von Luxburg U, Guyon I, Bengio S, Wallach H, Fergus R, eds. NIPS’17: Proc. 31st Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 3022–3032.Google Scholar
[41] Yang S, Wang S, Truong V-A (2019) Online learning and optimization under a new linear-threshold model with negative influence. Preprint, submitted November 8, https://arxiv.org/abs/1911.03276.Google Scholar
[42] Zong S, Ni H, Sung K, Ke NR, Wen Z, Kveton B (2016) Cascading bandits for large-scale recommendation problems. Ihler A, Janzing D, eds. UAI’16: Proc. 32nd Conf. Uncertainty Artificial Intelligence (AUAI Press, Arlington, VA), 835–844.Google Scholar

cover image Mathematics of Operations Research

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Received:May 12, 2021
Accepted:January 25, 2025
Published Online:June 13, 2025

Cite as

Shuoguang Yang, Van-Anh Truong (2025) Online Learning of Independent Cascade Models with Node-Level Feedback. Mathematics of Operations Research 0(0).

https://doi.org/10.1287/moor.2021.0117

Keywords

Acknowledgments

The authors express their sincere gratitude to the reviewers, the associate editor, and the area editor for their valuable suggestions that have significantly improved this paper.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Online Learning of Independent Cascade Models with Node-Level Feedback

References

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News