A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization

Published Online: https://doi.org/10.1287/ited.2025.0174

References

  • Agrawal S, Goyal N (2012) Analysis of Thompson sampling for the multi-armed bandit problem. Mannor S, Srebro N, Williamson RC, eds. Proc. 25th Ann. Conf. Learn. Theory, vol. 23 (PMLR, Cambridge, MA), 39.1–39.26.
  • Agrawal S, Goyal N (2013) Thompson sampling for contextual bandits with linear payoffs. Dasgupta S, McAllester D, eds. Proc. 30th Internat. Conf. Machine Learn., vol. 28 (PMLR, Cambridge, MA), 127–135.
  • Aramayo N, Schiappacasse M, Goic M (2023) A multiarmed bandit approach for house ads recommendations. Marketing Sci. 42(2):271–292.
  • Chapelle O, Li L (2011) An empirical evaluation of Thompson sampling. Shawe-Taylor J, Zemel R, Bartlett P, Pereira F, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 24 (Curran Associates, Red Hook, NY), 1–9.
  • Clément B, Roy D, Oudeyer PY, Lopes M (2014) Online optimization of teaching sequences with multi-armed bandits. Stamper J, Pardos ZA, Mavrikis M, McLaren BM, eds. Proc. 7th Internat. Conf. Ed. Data Mining (International Educational Data Mining Society, Worcester, MA), 269–272.
  • Corbett AT, Anderson JR (1994) Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling User-Adapt. Interactions 4(4):253–278.
  • Cully A, Demiris Y (2020) Online knowledge level tracking with data-driven student models and collaborative filtering. IEEE Trans. Knowledge Data Engrg. 32(10):2000–2013.
  • Da Silva FL, Slodkowski BK, Da Silva KKA, Cazella SC (2023) A systematic literature review on educational recommender systems for teaching and learning: Research trends, limitations and opportunities. Ed. Inform. Tech. (Dordrecht) 28(3):3289–3328.
  • De Kerpel L, Benoit D (2025) A reward-informed semi-personalized bandit approach for enhancing accuracy and serendipity in online slate recommendations. ACM Trans. Recommender Systems (ACM, New York).
  • Ferreira KJ, Simchi-Levi D, Wang H (2018) Online network revenue management using Thompson sampling. Oper. Res. 66(6):1586–1602.
  • Fornasiero M, Malucelli F, Pazzi R, Schettini T (2021) Empowering optimization skills through an orienteering competition. INFORMS Trans. Ed. 22(1):1–8.
  • Hakkal S, Lahcen AA (2024) XGBoost to enhance learner performance prediction. Comput. Ed. Artificial Intelligence 7:100254.
  • Heffernan NT, Heffernan CL (2014) The ASSISTments ecosystem: Building a platform that brings scientists and teachers together for minimally invasive research on human learning and teaching. Internat. J. Artificial Intelligence Ed. 24(4):470–497.
  • Honda J, Takemura A (2014) Optimality of Thompson sampling for Gaussian bandits depends on priors. Kaski S, Corander J, eds. Proc. 17th Internat. Conf. Artificial Intelligence Statist., vol. 33 (PMLR, Cambridge, MA), 375–383.
  • Huang L, Wang CD, Chao HY, Lai JH, Yu PS (2019) A score prediction approach for optional course recommendation via cross-user-domain collaborative filtering. IEEE Access 7:19550–19563.
  • Intayoad W, Kamyod C, Temdee P (2020) Reinforcement learning based on contextual bandits for personalized online learning recommendation systems. Wireless Personal Comm. 115(4):2917–2932.
  • Khanal SS, Prasad P, Alsadoon A, Maag A (2020) A systematic review: Machine learning based recommendation systems for e-learning. Ed. Inform. Tech. (Dordrecht) 25(4):2635–2664.
  • Krahenbuhl KS (2016) Student-centered education and constructivism: Challenges, concerns, and clarity for teachers. Clearing House 89(3):97–105.
  • Liu Y, Feng J, Lu J (2017) Collaborative filtering algorithm based on rating distance. Kim CH, Lee HW, Lee DH, Sakurai K, eds. Proc. 11th Internat. Conf. Ubiquitous Inform. Management Comm. (Association for Computing Machinery, New York), 1–7.
  • Liu YE, Mandel T, Brunskill E, Popovic Z (2014) Trading off scientific knowledge and user learning with multi-armed bandits. Accessed August 7, 2025, https://api.semanticscholar.org/CorpusID:4103970.
  • Maclean KDS, Bayley T (2024) That’s incorrect and let me tell you why: A scalable assessment to evaluate higher order thinking skills. INFORMS Trans. Ed. 25(1):23–34.
  • Manickam I, Lan AS, Baraniuk RG (2017) Contextual multi-armed bandit algorithms for personalized learning action selection. Proc. IEEE Internat. Conf. Acoustics Speech Signal Processing, 6344–6348.
  • Meng Z, McCreadie R, Macdonald C, Ounis I (2020) Exploring data splitting strategies for the evaluation of recommendation models. Proc. 14th ACM Conf. Recommender Systems (Association for Computing Machinery, New York), 681–686.
  • Nafea SM, Siewe F, He Y (2019) On recommendation of learning objects using Felder-Silverman learning style model. IEEE Access 7:163034–163048.
  • Neshaei SP, Davis RL, Hazimeh A, Lazarevski B, Dillenbourg P, Käser T (2024) Towards modeling learner performance with large language models. Proc. 17th Internat. Conf. Ed. Data Mining (International Educational Data Mining Society, Worcester, MA), 759–768.
  • Pardos ZA, Baker RS, San Pedro M, Gowda SM, Gowda SM (2014) Affective states and state tests: Investigating how affect and engagement during the school year predict end-of-year learning outcomes. J. Learn. Analytics 1(1):107–128.
  • Patikorn T, Baker RS, Heffernan NT (2020) ASSISTments longitudinal data mining competition special issue: A preface. J. Ed. Data Mining 12(2):i–xi.
  • Reeves KA, Hernandez-Gantes V, Centeno G, Gushi Nurnberg C (2021) Game—Constructivist exercises to enhance teaching of probability and statistics for engineers. INFORMS Trans. Ed. 22(1):55–64.
  • Sergis S, Sampson DG (2016) Learning object recommendations for teachers based on elicited ICT competence profiles. IEEE Trans. Learn. Tech. 9(1):67–80.
  • Tarus JK, Niu Z, Yousif A (2017) A hybrid knowledge-based recommender system for e-learning based on ontology and sequential pattern mining. Future Generation Comput. Systems 72:37–48.
  • Thompson WR (1933) On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3–4):285–294.
  • van de Pol J, Volman M, Beishuizen J (2010) Scaffolding in teacher–student interaction: A decade of research. Ed. Psych. Rev. 22(3):271–296.
  • Wu D, Lu J, Zhang G (2015) A fuzzy tree matching-based personalized e-learning recommender system. IEEE Trans. Fuzzy Systems 23(6):2412–2426.