Multimodal Sentiment Analysis for Cross-Cultural Consumer Behaviour: Understanding the Popularity of Korean Products in India through Social Media

Alimpia Roy; Gurpreet Singh

doi:10.55524/ijirem.2026.13.3.5

Abstract

The Korean Wave (Hallyu) has significantly influenced consumer behaviour in India, yet understanding the nuanced sentiment patterns that drive cross-cultural product adoption remains challenging. Traditional consumer behaviour studies rely primarily on surveys and unimodal text analysis, which fail to capture the rich multimodal nature of social media discourse. In this paper, we propose CultureFuse, a novel multimodal sentiment analysis framework that leverages both textual and visual modalities to analyse Indian consumers’ perceptions of Korean products. Our framework employs multilingual BERT for textual feature extraction and category-conditioned visual encoding, combined through a Cross-Modal Attention Fusion (CMAF) mechanism with adaptive gating. We construct Hallyu India-MM, a curated multimodal dataset of 133 consumer review samples spanning Korean beauty, food, fashion, and entertainment products popular among Indian consumers. Under 5-fold cross-validation, our multimodal CultureFuse approach achieves 90.9% accuracy in sentiment classification, outperforming the text-only BERT (72.1%), TF-IDF+SVM (60.1%), and late-fusion (78.1%) baselines. The learned adaptive gating mechanism assigns, on average, 62.1% weight to textual features and 37.9% to visual features, with category-dependent variation. Our per-category analysis reveals that visual features are particularly discriminative for the beauty and fashion categories (100% accuracy), while food products remain more challenging across all modalities (73.3%). This work bridges multimodal AI and consumer behaviour research, demonstrating that cross-modal attention fusion substantially improves the understanding of cross-cultural consumption patterns.
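
The abstract does not spell out how CMAF is implemented, so the following is only a minimal illustrative sketch of one common way to combine cross-modal attention with a two-way adaptive gate. The function names, the single-head attention, the mean pooling, and the softmax gate over two modality logits are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    # Scaled dot-product attention: every query row attends over all
    # rows of keys_values (used here as both keys and values).
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ keys_values

def gated_fusion(text_tokens, image_tokens, w_gate, b_gate):
    # Hypothetical fusion step (an assumption, not the paper's exact design).
    # text_tokens: (n_text, d), image_tokens: (n_image, d)
    # w_gate: (2, 2 * d), b_gate: (2,)
    # Each modality is enriched with context from the other, then pooled.
    text_vec = (text_tokens + cross_attention(text_tokens, image_tokens)).mean(axis=0)
    image_vec = (image_tokens + cross_attention(image_tokens, text_tokens)).mean(axis=0)
    # The gate yields two non-negative weights that sum to 1.
    logits = w_gate @ np.concatenate([text_vec, image_vec]) + b_gate
    weights = softmax(logits)
    fused = weights[0] * text_vec + weights[1] * image_vec
    return fused, weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    text = rng.normal(size=(5, d))   # stand-in for text token embeddings
    image = rng.normal(size=(3, d))  # stand-in for visual token embeddings
    w = rng.normal(size=(2, 2 * d)) * 0.1
    b = np.zeros(2)
    fused, weights = gated_fusion(text, image, w, b)
    print("fused shape:", fused.shape)
    print("text weight, image weight:", weights)
```

In a gate of this form the two weights always sum to one, which is consistent with the reported average split of 62.1% text and 37.9% visual; in a trained model the gate parameters would be learned rather than sampled at random as they are here.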

Keywords

Multimodal Sentiment Analysis, Cross-cultural Consumer Behaviour, Korean Wave, Hallyu, Vision Transformer, BERT, Cross-Modal Attention, Cultural Affinity.

Cite this article as

A. Roy, G. Singh, "Multimodal Sentiment Analysis for Cross-Cultural Consumer Behaviour: Understanding the Popularity of Korean Products in India through Social Media", International Journal of Innovative Research in Engineering and Management (IJIREM), Vol-13, Issue-3, Page No-38-46, 2026. Available from: https://doi.org/10.55524/ijirem.2026.13.3.5