Divide and Contrast: A Text-Based Method for Firm Market Risk Prediction

Yi He
School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China

Yi Yang
https://orcid.org/0000-0001-8863-112X
School of Business and Management, Hong Kong University of Science and Technology, Hong Kong

Defu Lian
School of Data Science, School of Computer Science and Technology, University of Science and Technology of China, Chengdu 611731, China

Kunpeng Zhang (Corresponding Author)
https://orcid.org/0000-0002-1474-3169
Robert H. Smith School of Business, University of Maryland, College Park, Maryland 20742

Published Online: 25 Apr 2025
https://doi.org/10.1287/ijoc.2023.0195

INFORMS Journal on Computing

Volume 38, Issue 2

March-April 2026

Pages iv, 341-691, iii

Received: June 13, 2023
Accepted: February 14, 2025
Published Online: April 25, 2025

Cite as

Yi He, Yi Yang, Defu Lian, Kunpeng Zhang (2025) Divide and Contrast: A Text-Based Method for Firm Market Risk Prediction. INFORMS Journal on Computing 38(2):531-547.

https://doi.org/10.1287/ijoc.2023.0195
