Mind the Gap: Accounting for Measurement Error and Misclassification in Variables Generated via Data Mining
Published Online: 29 Jan 2018
https://doi.org/10.1287/isre.2017.0727

