Detecting Human Trafficking: Automated Classification of Online Customer Reviews of Massage Businesses

