Analyzing Firm Reports for Volatility Prediction: A Knowledge-Driven Text-Embedding Approach
Published Online: 15 Apr 2021
https://doi.org/10.1287/ijoc.2020.1046

