Thesis Topics Fall 25/26

Context Sensitivity and Cultural Bias in Vision-Language Models

Cross-cultural research on perception and cognition has shown that people from different cultural backgrounds process information in distinct ways: East Asians tend toward holistic processing that attends to context, whereas Westerners tend toward analytic processing that focuses on salient objects [1]. These cultural patterns raise important questions for computational models trained largely on linguistic data. Vision-Language Models (VLMs), in particular, learn to connect textual and visual information, and their outputs may reflect not only structural properties of language but also culturally embedded modes of reasoning. Models trained mostly on English text learn within a predominantly Western linguistic and cultural context, which may shape their attentional patterns and descriptive tendencies. This project examines whether such VLMs exhibit cultural biases consistent with an analytic perceptual style, and how these biases manifest in image description tasks. The goal is to systematically analyze culturally grounded attentional patterns in each model's English-language output, evaluate their implications for fairness and inclusivity, and establish a foundation for understanding how cultural cognition is implicitly reproduced in large-scale multimodal training.

[1] Masuda, T., & Nisbett, R. E. (2001). Attending holistically versus analytically: Comparing the context sensitivity of Japanese and Americans. Journal of Personality and Social Psychology, 81(5), 922-934.