Efficient Methods for Sampling Responses from Large-Scale Qualitative Data

Surendra N. Singh
School of Business, University of Kansas, Lawrence, Kansas 66045

Steve Hillmer
School of Business, University of Kansas, Lawrence, Kansas 66045

Ze Wang
College of Business Administration, University of Central Florida, Orlando, Florida 32816

Published Online: 15 Mar 2011, https://doi.org/10.1287/mksc.1100.0632

Abstract

The World Wide Web contains a vast corpus of consumer-generated content that holds invaluable insights for improving the product and service offerings of firms. Yet text mining, the typical method for extracting diagnostic information from online content, has limitations. As a starting point, we propose analyzing a sample of comments before initiating text mining. Using a combination of real data and simulations, we demonstrate that a sampling procedure that selects respondents whose comments contain a large amount of information is superior to the two most popular sampling methods, simple random sampling and stratified random sampling, in gaining insights from the data. In addition, we derive a method that determines the probability of observing diagnostic information repeated a specific number of times in the population, which will enable managers to base sample size decisions on the trade-off between obtaining additional diagnostic information and the added expense of a larger sample. We illustrate one of the methods using a real data set from a website containing qualitative comments about staying at a hotel and demonstrate how sampling qualitative comments can be a useful first step in text mining.
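To make the sample-size trade-off concrete, here is a minimal sketch using the standard hypergeometric calculation for simple random sampling without replacement. It is not the derivation from the paper, whose details are not given in the abstract. The function name `prob_observed` and the example numbers (a population of 10,000 comments, an item repeated 20 times) are hypothetical and chosen only for illustration.

```python
from math import comb


def prob_observed(population_size: int, repeats: int, sample_size: int) -> float:
    """Probability that a simple random sample (without replacement) contains
    at least one of the `repeats` comments that mention a given item.

    Assumption: each comment either mentions the item or does not, and the
    item appears in exactly `repeats` of the `population_size` comments.
    """
    if not 0 <= repeats <= population_size:
        raise ValueError("repeats must be between 0 and population_size")
    if not 0 <= sample_size <= population_size:
        raise ValueError("sample_size must be between 0 and population_size")
    # Probability that every sampled comment comes from the comments that
    # do NOT mention the item. math.comb returns 0 when the sample is larger
    # than the number of such comments, so the miss probability becomes 0.
    miss = comb(population_size - repeats, sample_size) / comb(population_size, sample_size)
    return 1.0 - miss


if __name__ == "__main__":
    # Hypothetical numbers: 10,000 comments, an item mentioned in 20 of them.
    for n in (50, 100, 250, 500, 1000):
        print(n, round(prob_observed(10_000, 20, n), 3))
```

Running the loop for several sample sizes shows how quickly the chance of seeing a rarely repeated item rises with sample size, which is the kind of information a manager could weigh against the cost of reading more comments.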

Volume 30, Issue 3

May-June 2011

Pages 389-564

Article Information

Received: October 10, 2008
Accepted: December 08, 2010
Published Online: March 15, 2011

Cite as

Surendra N. Singh, Steve Hillmer, Ze Wang (2011) Efficient Methods for Sampling Responses from Large-Scale Qualitative Data. Marketing Science 30(3):532-549.

https://doi.org/10.1287/mksc.1100.0632
