sDTM: A Supervised Bayesian Deep Topic Model for Text Analytics

Published Online: https://doi.org/10.1287/isre.2022.1124
