Cost-Effective Quality Assurance in Crowd Labeling

Jing Wang (Corresponding Author)
[email protected]
School of Business and Management, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong

Panagiotis G. Ipeirotis
[email protected]
Leonard Stern School of Business, New York University, New York, New York 10012

Foster Provost
[email protected]
Leonard Stern School of Business, New York University, New York, New York 10012

Published Online: 9 Feb 2017
https://doi.org/10.1287/isre.2016.0661


Information Systems Research, Volume 28, Issue 1, March 2017, Pages iii-vi, 1-202

Received: August 14, 2014
Accepted: June 02, 2016
Published Online: February 09, 2017

Cite as

Jing Wang, Panagiotis G. Ipeirotis, Foster Provost (2017) Cost-Effective Quality Assurance in Crowd Labeling. Information Systems Research 28(1):137-158.

https://doi.org/10.1287/isre.2016.0661
