Augmenting Social Bot Detection with Crowd-Generated Labels
Published Online: 12 May 2022, https://doi.org/10.1287/isre.2022.1136

