How Much Can Machines Learn Finance from Chinese Text Data?

