Learning from Crowdsourced Multi-labeling: A Variational Bayesian Approach

Junming Yin
https://orcid.org/0000-0001-6018-7813
Department of Management Information Systems, University of Arizona, Tucson, Arizona 85721; Tepper School of Business, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213

Jerry Luo
Department of Mathematics, University of Arizona, Tucson, Arizona 85721

Susan A. Brown
https://orcid.org/0000-0002-0484-4428
Department of Management Information Systems, University of Arizona, Tucson, Arizona 85721

Published Online: 23 Aug 2021

Information Systems Research

Volume 32, Issue 3

September 2021

Pages iii-viii, 675-1097, C2

Received: January 16, 2019
Accepted: October 22, 2020
Published Online: August 23, 2021

Cite as

Junming Yin, Jerry Luo, Susan A. Brown (2021) Learning from Crowdsourced Multi-labeling: A Variational Bayesian Approach. Information Systems Research 32(3):752–773.

https://doi.org/10.1287/isre.2021.1000
