A Bias Correction Approach for Interference in Ranking Experiments

Published Online: https://doi.org/10.1287/mksc.2022.0046

References

  • Bajari P, Burdick B, Imbens G, Rosen I, Richardson T, McQueen J (2020) Multiple randomization designs for interference. Accessed June 2023, https://assets.amazon.science/c1/94/0d6431bf46f7978295d245dd6e06/double-randomized-online-experiments.pdf.
  • Blake T, Coey D (2014) Why marketplace experimentation is harder than it seems: The role of test-control interference. Proc. 15th ACM Conf. Econom. Comput. (ACM, New York), 567–582.
  • Chakraborty I, Kim M, Sudhir K (2022) Attribute sentiment scoring with online text reviews: Accounting for language structure and missing attributes. J. Marketing Res. 59(3):600–622.
  • Chapelle O, Joachims T, Radlinski F, Yue Y (2012) Large-scale validation and analysis of interleaved search evaluation. ACM Trans. Inform. Systems 30(1):1–41.
  • Chen Y, Canny JF (2011) Recommending ephemeral items at web scale. Proc. 34th Internat. ACM SIGIR Conf. Res. Development Inform. Retrieval (ACM, New York), 1013–1022.
  • Clarke B (2016) Why these tech companies keep running thousands of failed experiments. https://www.fastcompany.com/3063846/why-these-tech-companies-keep-running-thousands-of-failed.
  • De los Santos B, Koulayev S (2017) Optimizing click-through in online rankings with endogenous search refinement. Marketing Sci. 36(4):542–564.
  • Dhillon P, Aral S (2021) Modeling dynamic user interests: A neural matrix factorization approach. Preprint, submitted September 19, https://arxiv.org/abs/2102.06602.
  • Dudík M, Langford J, Li L (2011) Doubly robust policy evaluation and learning. Proc. 28th Internat. Conf. Machine Learn. (Omni Press, Madison), 1097–1104.
  • Dzyabura D, Peres R (2019) Visual elicitation of brand perception. Preprint, submitted December 18, https://dx.doi.org/10.2139/ssrn.3496538.
  • Eckles D, Karrer B, Ugander J (2017) Design and analysis of experiments in networks: Reducing bias from interference. J. Causal Inference 5(1):20150021.
  • Falk K (2019) Practical Recommender Systems (Manning, Shelter Island).
  • Fradkin A (2019) A simulation approach to designing digital matching platforms. Boston University Questrom School of Business Research Paper, MA.
  • Ghose A, Ipeirotis PG, Li B (2014) Examining the impact of ranking on consumer behavior and search engine revenue. Management Sci. 60(7):1632–1654.
  • Goli A, Reiley D, Zhang H (2021) Personalized versioning: Product strategies constructed from experiments on Pandora. Preprint, submitted July 8, https://dx.doi.org/10.2139/ssrn.3874243.
  • Gomez-Uribe CA, Hunt N (2015) The Netflix recommender system: Algorithms, business value, and innovation. ACM Trans. Management Inform. Systems 6(4):1–19.
  • Ha-Thuc V, Dutta A, Mao R, Wood M, Liu Y (2020) A counterfactual framework for seller-side A/B testing on marketplaces. Proc. 43rd Internat. ACM SIGIR Conf. Res. Development Inform. Retrieval (ACM, New York), 2288–2296.
  • Hitsch GJ, Misra S (2018) Heterogeneous treatment effects and optimal targeting policy evaluation. Preprint, submitted February 6, https://dx.doi.org/10.2139/ssrn.3111957.
  • Holtz D, Aral S (2020) Limiting bias from test-control interference in online marketplace experiments. Preprint, submitted May 20, https://dx.doi.org/10.2139/ssrn.3583596.
  • Holtz D, Lobel R, Liskovich I, Aral S (2020) Reducing interference bias in online marketplace pricing experiments. Preprint, submitted April 26, https://arxiv.org/abs/2004.12489.
  • Horvitz DG, Thompson DJ (1952) A generalization of sampling without replacement from a finite universe. J. Amer. Statist. Assoc. 47(260):663–685.
  • Imbens GW, Rubin DB (2015) Causal Inference in Statistics, Social, and Biomedical Sciences (Cambridge University Press, Cambridge, UK).
  • Jamieson K, Jain L (2018) A bandit approach to multiple testing with false discovery control. Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (NIPS, San Diego, CA), 3664–3674.
  • Jannach D, Jugovac M (2019) Measuring the business value of recommender systems. ACM Trans. Management Inform. Systems 10(4):1–23.
  • Joachims T, Swaminathan A, Schnabel T (2017) Unbiased learning-to-rank with biased feedback. Proc. 10th ACM Internat. Conf. Web Search Data Mining (ACM, New York), 781–789.
  • Johari R, Pekelis L, Walsh D (2022) Always valid inference: Continuous monitoring of A/B tests. Oper. Res. 70(3):1806–1821.
  • Johari R, Li H, Liskovich I, Weintraub G (2020) Experimental design in two-sided platforms: An analysis of bias. Preprint, submitted February 13, https://arxiv.org/abs/2002.05670.
  • Katukuri J, Könik T, Mukherjee R, Kolay S (2014) Recommending similar items in large-scale online marketplaces. IEEE Internat. Conf. Big Data (IEEE, Piscataway, NJ), 868–876.
  • Kohavi R, Tang D, Xu Y (2020) Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing (Cambridge University Press, Cambridge, UK).
  • Liu J, Toubia O, Hill S (2021) Content-based model of web search behavior: An application to TV show search. Management Sci. 67(10):6378–6398.
  • Manski CF (2013) Identification of treatment response with social interactions. Econom. J. 16(1):S1–S23.
  • Nandy P, Basu K, Chatterjee S, Tu Y (2019) A/B testing in dense large-scale networks: Design and inference. Preprint, submitted January 29, https://arxiv.org/abs/1901.10505.
  • Netzer O, Feldman R, Goldenberg J, Fresko M (2012) Mine your own business: Market-structure surveillance through text mining. Marketing Sci. 31(3):521–543.
  • Pennington J, Socher R, Manning CD (2014) GloVe: Global vectors for word representation. Proc. 2014 Conf. Empirical Methods Natural Language Processing, 1532–1543.
  • Rafieian O, Yoganarasimhan H (2021) Targeting and privacy in mobile advertising. Marketing Sci. 40(2):193–218.
  • Rubin DB (1990) Formal modes of statistical inference for causal effects. J. Statist. Planning Inference 25(3):279–292.
  • Saveski M, Pouget-Abadie J, Saint-Jacques G, Duan W, Ghosh S, Xu Y, Airoldi EM (2017) Detecting network effects: Randomizing over randomized experiments. Proc. 23rd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (ACM, New York), 1027–1035.
  • Ursu RM (2018) The power of rankings: Quantifying the effect of rankings on online consumer search and purchase decisions. Marketing Sci. 37(4):530–552.
  • Yoganarasimhan H (2020) Search personalization using machine learning. Management Sci. 66(3):1045–1070.
  • Yoganarasimhan H, Barzegary E, Pani A (2023) Design and evaluation of optimal free trials. Management Sci. 69(6):3220–3240.