Online Learning of Independent Cascade Models with Node-Level Feedback

Published Online:https://doi.org/10.1287/moor.2021.0117

References

  • [1] Abbasi-Yadkori Y, Pál D, Szepesvári C (2011) Improved algorithms for linear stochastic bandits. Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ, eds. NIPS’11: Proc. 25th Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 2312–2320.Google Scholar
  • [2] Agrawal S, Avadhanula V, Goyal V, Zeevi A (2017) Thompson sampling for the MNL-bandit. Conf. Learning Theory (PMLR, New York), 76–78.Google Scholar
  • [3] Agrawal S, Avadhanula V, Goyal V, Zeevi A (2019) MNL-bandit: A dynamic learning approach to assortment selection. Oper. Res. 67(5):1453–1485.LinkGoogle Scholar
  • [4] Audibert J-Y, Bubeck S, Lugosi G (2014) Regret in online combinatorial optimization. Math. Oper. Res. 39(1):31–45.LinkGoogle Scholar
  • [5] Bharathi S, Kempe D, Salek M (2007) Competitive influence maximization in social networks. Deng X, Graham FC, eds. Internet Network Econom. WINE 2007, Lecture Notes in Computer Science, vol. 4858 (Springer, Berlin, Heidelberg), 306–311.Google Scholar
  • [6] Bickel PJ, Ritov Y, Tsybakov AB (2009) Simultaneous analysis of lasso and Dantzig selector. Ann. Statist. 37(4):1705–1732.CrossrefGoogle Scholar
  • [7] Chen K, Hu I, Ying Z (1999) Strong consistency of maximum quasi-likelihood estimators in generalized linear models with fixed and adaptive designs. Ann. Statist. 27(4):1155–1163.CrossrefGoogle Scholar
  • [8] Chen N, Li A, Yang S (2020) Revenue maximization and learning in products ranking. Preprint, submitted December 7, https://arxiv.org/abs/2012.03800.Google Scholar
  • [9] Chen W, Wang Y, Yuan Y (2013) Combinatorial multi-armed bandit: General framework and applications. ICML’13: Proc. 30th Internat. Conf. Machine Learn., vol. 28 (PMLR, New York), 151–159.Google Scholar
  • [10] Chen W, Wang Y, Yuan Y, Wang Q (2016) Combinatorial multi-armed bandit and its extension to probabilistically triggered arms. J. Machine Learn. Res. 17(1):1746–1778.Google Scholar
  • [11] Chen W, Lin T, Tan Z, Zhao M, Zhou X (2016) Robust influence maximization. KDD’16: Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 795–804.Google Scholar
  • [12] Cheung WC, Tan V, Zhong Z (2019) A Thompson sampling algorithm for cascading bandits. 22nd Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 438–447.Google Scholar
  • [13] Combes R, Talebi MS, Proutiere A, Lelarge M (2015) Combinatorial bandits revisited. Cortes C, Lee DD, Sugiyama M, Garnett R, eds. NIPS’15: Proc. 29th Internat. Conf. Adv. Neural Inform. Processing Systems, vol. 2 (MIT Press, Cambridge, MA), 2116–2124. Google Scholar
  • [14] Fisher RA (1997) On an absolute criterion for fitting frequency curves. Statist. Sci. 12(1):39–41.Google Scholar
  • [15] Gai Y, Krishnamachari B, Jain R (2012) Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations. IEEE/ACM Trans. Networking 20(5):1466–1478.CrossrefGoogle Scholar
  • [16] Goyal A, Bonchi F, Lakshmanan LVS (2010) Learning influence probabilities in social networks. WSDM ‘10: Proc. 3rd ACM Internat. Conf. Web Search Data Mining (Association for Computing Machinery, New York), 241–250.Google Scholar
  • [17] He X, Kempe D (2016) Robust influence maximization. Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 885–894.Google Scholar
  • [18] He X, Kempe D (2018) Stability and robustness in influence maximization. ACM Trans. Knowledge Discovery Data 12(6):1–34.CrossrefGoogle Scholar
  • [19] Katariya S, Kveton B, Szepesvari C, Wen Z (2016) DCM bandits: Learning to rank with multiple clicks. Balcan MF, Weinberger KQ, eds. ICML’16: Proc. 33rd Internat. Conf. Machine Learn., vol. 48 (PMLR, New York), 1215–1224.Google Scholar
  • [20] Kempe D, Kleinberg J, Tardos É (2003) Maximizing the spread of influence through a social network. KDD’03: Proc. 9th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 137–146.Google Scholar
  • [21] Kveton B, Szepesvari C, Wen Z, Ashkan A (2015) Cascading bandits: Learning to rank in the cascade model. Bach F, Blei D, eds. ICML’15: Proc. 32nd Internat. Conf. Machine Learn., vol. 36 (PMLR, New York), 767–776.Google Scholar
  • [22] Kveton B, Wen Z, Ashkan A, Szepesvari C (2015a) Combinatorial cascading bandits. Cortes C, Lee DD, Sugiyama M, Garnett R, eds. NIPS’15: Proc. 29th Internat. Conf. Adv. Neural Inform. Processing Systems, vol. 1 (MIT Press, Cambridge, MA), 1450–1458.Google Scholar
  • [23] Kveton B, Wen Z, Ashkan A, Szepesvari C (2015b) Tight regret bounds for stochastic combinatorial semi-bandits. Proc. 18th Internat. Conf. Intelligence Statist. (PMLR, New York), 535–543.Google Scholar
  • [24] Lattimore T, Szepesvári C (2020) Bandit Algorithms (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • [25] Lei S, Maniu S, Mo L, Cheng R, Senellart P (2015) Online influence maximization. PKDD’15: Proc. 21st ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 645–654.Google Scholar
  • [26] Li L, Lu Y, Zhou D (2017) Provably optimal algorithms for generalized linear contextual bandits. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (PMLR, New York), 2071–2080.Google Scholar
  • [27] Li S, Kong F, Tang K, Li Q, Chen W (2020) Online influence maximization under linear threshold model. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 1192–1204.Google Scholar
  • [28] Mannor S, Shamir O (2011) From bandits to experts: On the value of side-observations. Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ, eds. NIPS’11: Proc. 25th Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 684–692.Google Scholar
  • [29] McAuley J, Leskovec J (2012) Learning to discover social circles in ego networks. NIPS’12: Proc. 26th Internat. Conf. Neural Inform. Processing Systems, vol. 1 (Curran Associates Inc., Red Hook, NY), 539–547.Google Scholar
  • [30] Nemhauser GL, Wolsey LA, Fisher ML (1978) An analysis of approximations for maximizing submodular set functions—I. Math. Programming 14:265–294.CrossrefGoogle Scholar
  • [31] Netrapalli P, Sanghavi S (2012) Learning the graph of epidemic cascades. Sigmetrics Performance Eval. Rev. 40(1):211–222.CrossrefGoogle Scholar
  • [32] Oh M-h, Iyengar G (2021) Multinomial logit contextual bandits: Provable optimality and practicality. Proc. AAAI Conf Artificial Intelligence, vol. 35 (AAAI Press, Washington, DC), 9205–9213.Google Scholar
  • [33] Saito K, Nakano R, Kimura M (2008) Prediction of information diffusion probabilities for independent cascade model. Lovrek I, Howlett RJ, Jain LC, eds. Knowledge-Based Intelligent Inform. Engrg. Systems KES 2008, Lecture Notes in Computer Science, vol. 5179 (Springer, Berlin, Heidelberg), 67–75.Google Scholar
  • [34] Tang Y, Xiao X, Shi Y (2014) Influence maximization: Near-optimal time complexity meets practical efficiency. SIGMOD’14: Proc. 2014 ACM SIGMOD Internat. Conf. Management Data (Association for Computing Machinery, New York), 75–86.Google Scholar
  • [35] Toh K-C, Todd MJ, Tütüncü RH (1999) SDPT3—A Matlab software package for semidefinite programming, version 1.3. Optim. Methods Software 11(1–4):545–581.CrossrefGoogle Scholar
  • [36] Valko M (2016) Bandits on graphs and structures. Unpublished PhD thesis, École normale supérieure de Cachan (ENS Cachan), Paris.Google Scholar
  • [37] Vaswani S, Duttachoudhury N (2013) Learning influence diffusion probabilities under the linear threshold model. Computer Science Department, University of British Columbia, Vancouver.Google Scholar
  • [38] Vaswani S, Lakshmanan L, Schmidt M (2015) Influence maximization with bandits. Preprint, submitted February 27, https://arxiv.org/abs/1503.00024.Google Scholar
  • [39] Wang Q, Chen W (2017) Improving regret bounds for combinatorial semi-bandits with probabilistically triggered arms and its applications. von Luxburg U, Guyon I, Bengio S, Wallach H, Fergus R, eds. NIPS’17: Proc. 31st Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 1161–1171.Google Scholar
  • [40] Wen Z, Kveton B, Valko M, Vaswani S (2017) Online influence maximization under independent cascade model with semi-bandit feedback. von Luxburg U, Guyon I, Bengio S, Wallach H, Fergus R, eds. NIPS’17: Proc. 31st Internat. Conf. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 3022–3032.Google Scholar
  • [41] Yang S, Wang S, Truong V-A (2019) Online learning and optimization under a new linear-threshold model with negative influence. Preprint, submitted November 8, https://arxiv.org/abs/1911.03276.Google Scholar
  • [42] Zong S, Ni H, Sung K, Ke NR, Wen Z, Kveton B (2016) Cascading bandits for large-scale recommendation problems. Ihler A, Janzing D, eds. UAI’16: Proc. 32nd Conf. Uncertainty Artificial Intelligence (AUAI Press, Arlington, VA), 835–844.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.