Is Relevancy Everything? A Deep-Learning Approach to Understand the Effect of Image-Text Congruence
Published Online: 9 May 2025. https://doi.org/10.1287/mnsc.2022.01896