Analyzing Firm Reports for Volatility Prediction: A Knowledge-Driven Text-Embedding Approach

Published Online: https://doi.org/10.1287/ijoc.2020.1046

References

  • Adhikari A, Ram A, Tang R, Lin J (2019) Rethinking complex neural network architectures for document classification. Proc. 2019 Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Language Tech., vol. 1 (Association for Computational Linguistics, Stroudsburg, PA), 4046–4051.
  • Andrzejewski D, Zhu X, Craven M (2009) Incorporating domain knowledge into topic modeling via Dirichlet forest priors. Proc. 26th Annual Internat. Conf. Machine Learn. (Association for Computing Machinery, New York), 25–32.
  • Arora S, Liang Y, Ma T (2017) A simple but tough-to-beat baseline for sentence embeddings. Proc. Internat. Conf. Learn. Representations.
  • Arora S, Li Y, Liang Y, Ma T, Risteski A (2016) A latent variable model approach to PMI-based word embeddings. Trans. Assoc. Comput. Linguist. 4:385–399.
  • Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. Proc. Internat. Conf. Learn. Representations.
  • Bao Y, Datta A (2014) Simultaneously discovering and quantifying risk types from textual risk disclosures. Management Sci. 60(6):1371–1391.
  • Bengio Y, Courville A, Vincent P (2013) Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Machine Intelligence 35(8):1798–1828.
  • Bernard D, Alexander K, Raman U (2007) Equilibrium portfolio strategies in the presence of sentiment risk and excess volatility. Working Paper No. 13401, National Bureau of Economic Research, Cambridge, MA.
  • Black F, Scholes M (1973) The pricing of options and corporate liabilities. J. Political Econom. 81(3):637–654.
  • Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J. Machine Learn. Res. 3(January):993–1022.
  • Bodnaruk A, Loughran T, McDonald B (2015) Using 10-K text to gauge financial constraints. J. Financial Quant. Anal. 50(4):623–646.
  • Boudoukh J, Feldman R, Kogan S, Richardson M (2018) Information, trading, and volatility: Evidence from firm-specific news. Rev. Financial Stud. 32(3):992–1033.
  • Büschken J, Allenby GM (2016) Sentence-based text analysis for customer reviews. Marketing Sci. 35(6):953–975.
  • Christoffersen PF, Diebold FX (2000) How relevant is volatility forecasting for financial risk management? Rev. Econom. Statist. 82(1):12–22.
  • Das SR, Chen MY (2007) Yahoo! for Amazon: Sentiment extraction from small talk on the web. Management Sci. 53(9):1375–1388.
  • Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. Proc. 2019 Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Language Tech., vol. 1 (Association for Computational Linguistics, Stroudsburg, PA), 4171–4186.
  • Drucker H, Burges CJC, Kaufman L, Smola AJ, Vapnik V (1997) Support vector regression machines. Mozer MC, Jordan MI, Petsche T, eds. Proc. Ninth Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 155–161.
  • Dyer T, Lang M, Stice-Lawrence L (2017) The evolution of 10-K textual disclosure: Evidence from latent Dirichlet allocation. J. Accounting Econom. 64(2–3):221–245.
  • Faruqui M, Dodge J, Jauhar SK, Dyer C, Hovy E, Smith NA (2015) Retrofitting word vectors to semantic lexicons. Proc. 2015 Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Language Tech. (Association for Computational Linguistics, Stroudsburg, PA), 1606–1615.
  • Frankel R, Johnson M, Skinner DJ (1999) An empirical examination of conference calls as a voluntary disclosure medium. J. Accounting Res. 37(1):133–150.
  • Gunning D (2017) Explainable artificial intelligence (XAI). Report DARPA/I20, Defense Advanced Research Projects Agency, Arlington, VA.
  • Hu M, Liu B (2004) Mining and summarizing customer reviews. Proc. 10th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 168–177.
  • Huang AH, Lehavy R, Zang AY, Zheng R (2017) Analyst information discovery and interpretation roles: A topic modeling approach. Management Sci. 64(6):2833–2855.
  • Jain S, Wallace BC (2019) Attention is not explanation. Proc. 2019 Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Language Technologies, vol. 1 (Association for Computational Linguistics, Stroudsburg, PA), 3543–3556.
  • Jegadeesh N, Wu D (2013) Word power: A new approach for content analysis. J. Financial Econom. 110(3):712–729.
  • Kearney C, Liu S (2014) Textual sentiment in finance: A survey of methods and models. Internat. Rev. Financial Anal. 33(May):171–185.
  • Kim Y (2014) Convolutional neural networks for sentence classification. Proc. 2014 Conf. Empirical Methods Natural Language Processing (Association for Computational Linguistics, Stroudsburg, PA), 1746–1751.
  • Kogan S, Levin D, Routledge BR, Sagi JS, Smith NA (2009) Predicting risk from financial reports with regression. Proc. Human Language Tech.: 2009 Annual Conf. North Amer. Chapter Assoc. Comput. Linguistics (Association for Computational Linguistics, Stroudsburg, PA), 272–280.
  • Kothari SP, Li X, Short JE (2009) The effect of disclosures by management, analysts, and business press on cost of capital, return volatility, and analyst forecasts: A study using content analysis. Accounting Rev. 84(5):1639–1670.
  • Larcker DF, Zakolyukina AA (2012) Detecting deceptive discussions in conference calls. J. Accounting Res. 50(2):495–540.
  • Li F (2010) The information content of forward-looking statements in corporate filings—A naïve Bayesian machine learning approach. J. Accounting Res. 48(5):1049–1102.
  • Li X, Chen K, Sun SX, Fung T, Wang H, Zeng DD (2016) A commonsense knowledge-enabled textual analysis approach for financial market surveillance. INFORMS J. Comput. 28(2):278–294.
  • Loughran T, McDonald B (2011) When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. J. Finance 66(1):35–65.
  • Loughran T, McDonald B (2013) IPO first-day returns, offer price revisions, volatility, and form S-1 language. J. Financial Econom. 109(2):307–326.
  • Loughran T, McDonald B (2016) Textual analysis in accounting and finance: A survey. J. Accounting Res. 54(4):1187–1230.
  • Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. Proc. 26th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 3111–3119.
  • Miller GA (1995) WordNet: A lexical database for English. Comm. ACM 38(11):39–41.
  • Pardoe D, Stone P, Saar-Tsechansky M, Keskin T, Tomak K (2010) Adaptive auction mechanism design and the incorporation of prior knowledge. INFORMS J. Comput. 22(3):353–370.
  • Pennebaker JW, Francis ME, Booth RJ (2001) Linguistic Inquiry and Word Count: LIWC2001 (Lawrence Erlbaum Publishers, Mahwah, NJ).
  • Pennington J, Socher R, Manning C (2014) GloVe: Global vectors for word representation. Proc. 2014 Conf. Empirical Methods Natural Language Processing (Association for Computational Linguistics, Stroudsburg, PA), 1532–1543.
  • Poon SH, Granger C (2005) Practical issues in forecasting volatility. Financial Anal. J. 61(1):45–56.
  • Poon SH, Granger CW (2003) Forecasting volatility in financial markets: A review. J. Econom. Lit. 41(2):478–539.
  • Qin Y, Yang Y (2019) What you say and how you say it matters: Predicting stock volatility using verbal and vocal cues. Proc. 57th Annual Meeting Assoc. Comput. Linguistics (Association for Computational Linguistics, Stroudsburg, PA), 390–401.
  • Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I (2019) Language models are unsupervised multitask learners.
  • Rekabsaz N, Lupu M, Baklanov A, Hanbury A, Duer A, Anderson L (2017) Volatility prediction using financial disclosures sentiments with word embedding-based IR models. Proc. 55th Annual Meeting Assoc. Comput. Linguistics (Association for Computational Linguistics, Stroudsburg, PA), 1712–1721.
  • Schölkopf B, Simard P, Smola AJ, Vapnik V (1998) Prior knowledge in support vector kernels. Proc. 1997 Conf. Advances Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 640–646.
  • Sharman R, Kishore R, Ramesh R (2004) Computational ontologies and information systems II: Formal specification. Comm. Assoc. Inform. Systems 14(1):9.
  • Spasic I, Ananiadou S, McNaught J, Kumar A (2005) Text mining and ontologies in biomedicine: Making sense of raw text. Briefings Bioinform. 6(3):239–251.
  • Sridharan SA (2015) Volatility forecasting using financial statement information. Accounting Rev. 90(5):2079–2106.
  • Tetlock PC (2007) Giving content to investor sentiment: The role of media in the stock market. J. Finance 62(3):1139–1168.
  • Tsai MF, Wang CJ (2014) Financial keyword expansion via continuous word vector representations. Proc. 2014 Conf. Empirical Methods Natural Language Processing (Association for Computational Linguistics, Stroudsburg, PA), 1453–1458.
  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Proc. 31st Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 5998–6008.
  • Wagstaff K, Cardie C, Rogers S, Schrödl S (2001) Constrained K-means clustering with background knowledge. Proc. 18th Internat. Conf. Machine Learn., vol. 1 (Morgan Kaufmann, San Francisco), 577–584.
  • Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E (2016) Hierarchical attention networks for document classification. Proc. 2016 Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Language Tech. (Association for Computational Linguistics, Stroudsburg, PA), 1480–1489.
  • You H, Zhang XJ (2009) Financial reporting complexity and investor underreaction to 10-K information. Rev. Accounting Stud. 14(4):559–586.
  • Zhang D, Zhou ZH, Chen S (2007) Semi-supervised dimensionality reduction. Proc. SIAM Conf. Data Mining (Society for Industrial and Applied Mathematics, Philadelphia), 629–634.