A Structural Topic and Sentiment-Discourse Model for Text Analysis

Li Chen
https://orcid.org/0000-0002-2878-5351
Samuel Curtis Johnson Graduate School of Management, Cornell University, Ithaca, New York 14853

Shawn Mankad (Corresponding Author)
https://orcid.org/0000-0001-7945-8556
Poole College of Management, North Carolina State University, Raleigh, North Carolina 27695

Published Online: 16 Oct 2024
https://doi.org/10.1287/mnsc.2022.00261

Volume 71, Issue 7

July 2025

Pages iv-vi, 5419-6318

Information

Received: January 27, 2022
Accepted: April 29, 2024
Published Online: October 16, 2024

Cite as

Li Chen, Shawn Mankad (2024) A Structural Topic and Sentiment-Discourse Model for Text Analysis. Management Science 71(7):5767-5787.

https://doi.org/10.1287/mnsc.2022.00261

Acknowledgments

The authors thank department editor Anindya Ghose, an anonymous associate editor, and three anonymous referees for their constructive comments and suggestions. The authors are grateful for insightful discussions with Molly Roberts and Brandon Stewart. Thanks also go to seminar participants at UC San Diego, Shanghai Jiao Tong University, North Carolina State University, University of Michigan, and the Federal Reserve Bank of Philadelphia for their helpful feedback. Both authors contributed equally to this research.
