A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization
Published Online: 10 Mar 2026
https://doi.org/10.1287/ited.2025.0174

