The IQ of the Crowd: Understanding and Improving Information Quality in Structured User-Generated Content

Roman Lukyanenko
College of Business, Florida International University, Miami, Florida 33199

Jeffrey Parsons
Faculty of Business Administration, Memorial University of Newfoundland, St. John’s, Newfoundland A1B 3X5, Canada

Yolanda F. Wiersma
Department of Biology, Memorial University of Newfoundland, St. John’s, Newfoundland A1B 3X5, Canada

Published Online: 13 Oct 2014
https://doi.org/10.1287/isre.2014.0537

Information Systems Research

Volume 25, Issue 4

Special Section: Information, Technology, and the Changing Nature of Work

December 2014

Pages 667-891

Received: May 04, 2012
Published Online: October 13, 2014

Cite as

Roman Lukyanenko, Jeffrey Parsons, Yolanda F. Wiersma (2014) The IQ of the Crowd: Understanding and Improving Information Quality in Structured User-Generated Content. Information Systems Research 25(4):669-689.

https://doi.org/10.1287/isre.2014.0537
