This paper highlights how text-to-image (T2I) models often reinforce Western cultural norms, leading to misrepresentation and harm for minority groups. Although cultural sensitivity is difficult to evaluate, the authors developed and validated a community-based, mixed-methods evaluation approach through a state-of-the-art review and co-creation workshops with 59 participants from 19 countries. Combining quantitative and qualitative methods, the study reveals agreements and differences across communities and shows how unequal power structures in training data distort cultural representation. To address the challenges of high resource demands and cultural fluidity, the authors propose a context-based, iterative evaluation methodology and offer actionable recommendations to improve the cultural representation practices of T2I model stakeholders.
This research follows a three-part process, with each part producing its own outcome.
Part I: State-of-the-Art Review examines how culture is defined in GenAI and T2I research, current uses of T2I, ethical concerns, evaluation methods, and potential harms. It results in a comprehensive framework of cultural elements and sub-elements grouped into demographic, semantic material, and semantic non-material categories.
Part II: Co-Creation Workshops involve three virtual sessions with 34 culturally diverse participants to explore T2I use, expand cultural elements, generate images, and develop evaluation criteria. The outcome is a set of evaluation metrics and an approach for comparative analysis.
Part III: Comparative Evaluation analyzes T2I outputs created from 13 prompts across 8 countries, assessed by 25 evaluators. This produces an evaluated image repository and a structured checklist to help users systematically assess cultural representation in T2I outputs.
This work has been submitted to CHI 2026 and was presented at Responsible AI UK, where I also facilitated a workshop.





