Guided Diverse Concept Miner (GDCM): Uncovering Relevant Constructs for Managerial Insights from Text

Published Online: https://doi.org/10.1287/isre.2020.0494

References

  • Abbasi A, Zhou Y, Deng S, Zhang P (2018) Text analytics to support sense-making in social media: A language-action perspective. MIS Quart. 42(2):427–464.
  • Abbasi A, Li J, Adjeroh D, Abate M, Zheng W (2019) Don’t mention it? Analyzing user-generated content signals for early adverse event warnings. Inform. Systems Res. 30(3):1007–1028.
  • Abrahams AS, Fan W, Wang GA, Zhang Z, Jiao J (2015) An integrated text analytic framework for product defect discovery. Production Oper. Management 24(6):975–990.
  • Airoldi EM, Bischof JM (2016) Improving and evaluating topic models and other models of text. J. Amer. Statist. Assoc. 111(516):1381–1403.
  • Archak N, Ghose A, Ipeirotis PG (2011) Deriving the pricing power of product features by mining consumer reviews. Management Sci. 57(8):1485–1509.
  • Bass FM (1995) Empirical generalizations and marketing science: A personal view. Marketing Sci. 14(3 Suppl):G6–G19.
  • Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J. Machine Learn. Res. 3(January):993–1022.
  • Caballero AJ (2023) Document topic extraction with large language models (LLM) and the latent Dirichlet allocation (LDA) algorithm. Accessed April 17, 2024, https://towardsdatascience.com/document-topic-extraction-with-large-language-models-llm-and-the-latent-dirichlet-allocation-e4697e4dae87.
  • Carey S (2009) The Origin of Concepts (Oxford University Press, Oxford, UK).
  • Chai Y, Li W (2019) Toward deep learning interpretability: A topic modeling approach. Proc. Internat. Conf. Inform. Systems (Association for Information Systems, Atlanta).
  • Chang J, Boyd-Graber JL, Gerrish S, Wang C, Blei DM (2009) Reading tea leaves: How humans interpret topic models. Bengio Y, Schuurmans D, Lafferty JD, Williams CKI, Culotta A, eds. Adv. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 288–296.
  • Chen T, Guestrin C (2016) XGBoost: A scalable tree boosting system. Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (ACM, New York), 785–794.
  • Chen W, Gu B, Ye Q, Zhu KX (2019) Measuring and managing the externality of managerial responses to online customer reviews. Inform. Systems Res. 30(1):81–96.
  • Chen J, He J, Shen Y, Xiao L, He X, Gao J, Song X, Deng L (2015) End-to-end learning of LDA by mirror-descent back propagation over a deep architecture. Cortes C, Lawrence N, Lee D, Sugiyama M, Garnett R, eds. Proc. 28th Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 1765–1773.
  • Choi AA, Cho D, Yim D, Moon JY, Oh W (2019) When seeing helps believing: The interactive effects of previews and reviews on e-book purchases. Inform. Systems Res. 30(4):1164–1183.
  • Clemons EK, Gao GG, Hitt LM (2006) When online reviews meet hyperdifferentiation: A study of the craft beer industry. J. Management Inform. Systems 23(2):149–171.
  • Dhurandhar A, Iyengar V, Luss R, Shanmugam K (2017) Tip: Typifying the interpretability of procedures. Preprint, submitted June 9, https://arxiv.org/abs/1706.02952.
  • Dieng AB, Ruiz FJ, Blei DM (2020) Topic modeling in embedding spaces. Trans. Assoc. Comput. Linguistics 8:439–453.
  • Efron B, Tibshirani R (1997) Improvements on cross-validation: The .632+ bootstrap method. J. Amer. Statist. Assoc. 92(438):548–560.
  • Feifer J (2013) The Amazon whisperer. Accessed April 17, 2024, https://www.fastcompany.com/3021229/chaim-pikarski-the-amazon-whisperer.
  • Gardenfors P (2004) Conceptual Spaces: The Geometry of Thought (MIT Press, Cambridge, MA).
  • Garvin DA (1984) What does “product quality” really mean? MIT Sloan Management Rev. (October 15), https://sloanreview.mit.edu/article/what-does-product-quality-really-mean/.
  • Garvin DA (1987) Competing on the 8 dimensions of quality. Harvard Bus. Rev. 65(6):101–109.
  • Goldstone RL, Son JY (2005) Similarity. Holyoak KJ, Morrison RG, eds. The Cambridge Handbook of Thinking and Reasoning (Cambridge University Press, Cambridge, UK), 13–36.
  • Grootendorst M (2022) BERTopic: Neural topic modeling with a class-based TF-IDF procedure. Preprint, submitted March 11, https://arxiv.org/abs/2203.05794.
  • Grootendorst M (2023) Topic modeling with Llama 2. Accessed April 17, 2024, https://towardsdatascience.com/topic-modeling-with-llama-2-85177d01e174.
  • Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D (2018) A survey of methods for explaining black box models. ACM Comput. Surveys 51(5):1–42.
  • Han S, Shin M, Park S, Jung C, Cha M (2023) Unified neural topic model via contrastive learning and term weighting. Vlachos A, Augenstein I, eds. Proc. 17th Conf. Eur. Chapter Assoc. Comput. Linguistics (Association for Computational Linguistics, Stroudsburg, PA), 1802–1817.
  • Harris ZS (1954) Distributional structure. Word 10(2–3):146–162.
  • Huang S, Tran TD (2018) Sparse signal recovery via generalized entropy functions minimization. IEEE Trans. Signal Processing 67(5):1322–1337.
  • Jackendoff R (1989) What is a concept, that a person may grasp it? Mind Language 4(1–2):68–102.
  • Jagarlamudi J, Daumé H III, Udupa R (2012) Incorporating lexical priors into topic models. Daelemans W, ed. Proc. 13th Conf. Eur. Chapter Assoc. Comput. Linguistics (Association for Computational Linguistics, Stroudsburg, PA), 204–213.
  • Jurafsky D (2000) Speech & Language Processing (Pearson Education India, Noida, India).
  • Kuhn TS (2012) The Structure of Scientific Revolutions (University of Chicago Press, Chicago).
  • Lau JH, Newman D, Baldwin T (2014) Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality. Wintner S, Goldwater S, Riezler S, eds. Proc. 14th Conf. Eur. Chapter Assoc. Comput. Linguistics (Association for Computational Linguistics, Stroudsburg, PA), 530–539.
  • LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444.
  • Lee D, Hosanagar K (2019) How do recommender systems affect sales diversity? A cross-category investigation via randomized field experiment. Inform. Systems Res. 30(1):239–259.
  • Lee D, Hosanagar K, Nair H (2018) Advertising content and consumer engagement on social media: Evidence from Facebook. Management Sci. 64(11):5105–5131.
  • Lipton ZC (2018) The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue 16(3):31–57.
  • Liu X, Lee D, Srinivasan K (2019) Large-scale cross-category analysis of consumer review content on sales conversion leveraging deep learning. J. Marketing Res. 56(6):918–943.
  • Liu Y, Liu Z, Chua TS, Sun M (2015) Topical word embeddings. Gunning D, Yeh PZ, eds. Proc. AAAI Conf. Artificial Intelligence, vol. 29(1) (AAAI, Palo Alto, CA).
  • Lu J, Lee D, Kim TW, Danks D (2019) Good explanation for algorithmic transparency. Preprint, submitted November 11, https://dx.doi.org/10.2139/ssrn.3503603.
  • Lundberg S, Lee SI (2017) A unified approach to interpreting model predictions. Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Proc. 31st Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 4768–4777.
  • Margolis E, Laurence S, eds. (1999) Concepts: Core Readings (MIT Press, Cambridge, MA).
  • Margolis E, Laurence S (2023) Concepts. Zalta EN, Nodelman U, eds. The Stanford Encyclopedia of Philosophy, Fall 2023 ed. (Metaphysics Research Lab, Stanford University, Stanford, CA).
  • Mcauliffe JD, Blei DM (2008) Supervised topic models. Platt J, Koller D, Singer Y, Roweis S, eds. Proc. 20th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 121–128.
  • Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Adv. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 3111–3119.
  • Miller T (2018) Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267(2019):1–38.
  • Mimno D, McCallum A (2008) Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. Barzilay R, Johnson M, eds. Proc. 2011 Conf. Empirical Methods Natural Language Processing (Association for Computational Linguistics, Stroudsburg, PA), 262–272.
  • Miranda S, Berente N, Seidel S, Safadi H, Burton-Jones A (2022) Editor’s comments: Computationally intensive theory construction: A primer for authors and reviewers. MIS Quart. 46(2):iii–xviii.
  • Moody CE (2016) Mixing Dirichlet topic models and word embeddings to make lda2vec. Preprint, submitted May 6, https://arxiv.org/abs/1605.02019.
  • Murdoch WJ, Singh C, Kumbier K, Abbasi-Asl R, Yu B (2019) Interpretable machine learning: Definitions, methods, and applications. Preprint, submitted January 14, https://arxiv.org/abs/1901.04592.
  • Murphy G (2004) The Big Book of Concepts (MIT Press, Cambridge, MA).
  • Netzer O, Lemaire A, Herzenstein M (2019) When words sweat: Identifying signals for loan default in the text of loan applications. J. Marketing Res. 56(6):960–980.
  • Netzer O, Feldman R, Goldenberg J, Fresko M (2012) Mine your own business: Market-structure surveillance through text mining. Marketing Sci. 31(3):521–543.
  • Newman D, Lau JH, Grieser K, Baldwin T (2010) Automatic evaluation of topic coherence. Kaplan R, Burstein J, Harper M, Penn G, eds. Human Language Tech. 2010 Annual Conf. North American Chapter Assoc. Comput. Linguist. (Association for Computational Linguistics, Stroudsburg, PA), 100–108.
  • Osherson DN, Smith EE (1981) On the adequacy of prototype theory as a theory of concepts. Cognition 9(1):35–58.
  • Pariser E (2011) The Filter Bubble: How the New Personalized Web Is Changing What We Read and How We Think (Penguin, New York).
  • Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. Moschitti A, Pang B, Daelemans W, eds. Proc. 2014 Conf. Empirical Methods Natural Language Processing (EMNLP) (Association for Computational Linguistics, Stroudsburg, PA), 1532–1543.
  • Pham CM, Hoyle A, Sun S, Iyyer M (2023) TopicGPT: A prompt-based topic modeling framework. Preprint, submitted November 2, https://arxiv.org/abs/2311.01449.
  • Ransbotham S, Lurie NH, Liu H (2019) Creation and consumption of mobile word of mouth: How are mobile reviews different? Marketing Sci. 38(5):773–792.
  • Ras G, van Gerven M, Haselager P (2018) Explanation methods in deep learning: Users, values, concerns and challenges. Escalante H, Escalera S, Guyon I, Baró X, Güçlütürk Y, Güçlü U, van Gerven M, eds. Explainable and Interpretable Models in Computer Vision and Machine Learning, Springer Series on Challenges in Machine Learning (Springer, Cham, Switzerland), 19–36.
  • Ribeiro MT, Singh S, Guestrin C (2016) “Why should I trust you?”: Explaining the predictions of any classifier. Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York).
  • Roberts ME, Stewart BM, Tingley D, Lucas C, Leder-Luis J, Gadarian SK, Albertson B, Rand DG (2014) Structural topic models for open-ended survey responses. Amer. J. Political Sci. 58(4):1064–1082.
  • Rosch E (2002) Principles of categorization. Levitin DJ, ed. Foundations of Cognitive Psychology: Core Readings (MIT Press, Cambridge, MA), 251–270.
  • Rudin C (2019) Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 1(5):206–215.
  • Schölkopf B, Locatello F, Bauer S, Ke NR, Kalchbrenner N, Goyal A, Bengio Y (2021) Toward causal representation learning. Proc. IEEE 109(5):612–634.
  • Shi B, Lam W, Jameel S, Schockaert S, Lai KP (2017) Jointly learning word embeddings and latent topics. Proc. 40th Internat. ACM SIGIR Conf. Res. Development Inform. Retrieval (Association for Computing Machinery, New York), 375–384.
  • Sloutsky VM, Deng W (2019) Categories, concepts, and conceptual development. Lang. Cogn. Neurosci. 34(10):1284–1297.
  • Solomon KO, Medin DL, Lynch EB (1999) Concepts do more than categorize. Trends Cogn. Sci. 3(3):99–105.
  • Sridhar D, Daumé H III, Blei D (2022) Heterogeneous supervised topic models. Trans. Assoc. Comput. Linguist. 10:732–745.
  • Srivastava A, Sutton C (2017) Autoencoding variational inference for topic models. Preprint, submitted March 4, https://arxiv.org/abs/1703.01488.
  • Sunstein CR (2018) Republic: Divided Democracy in the Age of Social Media (Princeton University Press, Princeton, NJ).
  • Tibshirani R (1996) Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B Statist. Methodology 58(1):267–288.
  • Timoshenko A, Hauser JR (2019) Identifying customer needs from user-generated content. Marketing Sci. 38(1):1–20.
  • Toubia O, Iyengar G, Bunnell R, Lemaire A (2019) Extracting features of entertainment products: A guided LDA approach informed by the psychology of media consumption. J. Marketing Res. 56(1):18–36.
  • Vayansky I, Kumar SA (2020) A review of topic modeling methods. Inform. Systems 94(2020):101582.
  • Wang X, Yang Y (2020) Neural topic model with attention for supervised learning. Chiappa S, Calandra R, eds. Proc. Twenty Third Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 1147–1156.
  • Wang H, Prakash N, Hoang NK, Hee MS, Naseem U, Lee RKW (2023) Prompting large language models for topic modeling. 2023 IEEE Internat. Conf. Big Data (BigData) (IEEE, Piscataway, NJ), 1236–1241.
  • Wernicke S (2015) How to use data to make a hit TV show. Accessed April 17, 2024, https://www.ted.com/talks/sebastian_wernicke_how_to_use_data_to_make_a_hit_tv_show.
  • Xu W, Hu W, Wu F, Sengamedu S (2023) Detime: Diffusion-enhanced topic modeling using encoder-decoder based LLM. Preprint, submitted October 23, https://arxiv.org/abs/2310.15296.
  • Xun G, Li Y, Gao J, Zhang A (2017) Collaboratively improving topic discovery and word embeddings by coordinating global and local contexts. Proc. 23rd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 535–543.
  • Yang Y, Zhang K, Fan Y (2023) SDTM: A supervised Bayesian deep topic model for text analytics. Inform. Systems Res. 34(1):137–156.
  • Zhang K, Moe W (2021) Measuring brand favorability using large-scale social media data. Inform. Systems Res. 32(4):1128–1139.
  • Zhu J, Ahmed A, Xing EP (2012) MedLDA: Maximum margin supervised topic models. J. Machine Learn. Res. 13(August):2237–2278.