AI Systems Reveal Bias Against German Dialect Speakers

A new study reveals how today’s most powerful language models consistently favor Standard German over regional dialects, exposing hidden biases that could influence real-world judgments in classrooms, workplaces, and digital platforms.

Experimental overview (figure from the study): the authors assess biases across stereotypical traits commonly associated with dialect speakers, probing both dialect naming and dialect usage biases with two tasks: the implicit association of adjectives targeting the chosen traits (association task) and decision-making scenarios (decision task).

Large language models such as GPT-5 and Llama systematically evaluate speakers of German dialects less favorably than those using Standard German, according to a collaborative study conducted by Johannes Gutenberg University Mainz (JGU), the University of Hamburg, and the University of Washington. The work, led by Professor Katharina von der Wense and Minh Duc Bui, was presented at the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), one of the field’s leading international venues. The results show that all tested AI systems reproduce social stereotypes linked to dialect use.

“Dialects are an essential part of cultural identity,” said Minh Duc Bui, a doctoral researcher in von der Wense’s Natural Language Processing group at JGU’s Institute of Computer Science. “Our analyses suggest that language models associate dialects with negative traits, thereby perpetuating problematic social biases.”

Methodology

The research team used linguistic databases containing orthographic and phonetic variants of German dialects to translate seven regional varieties into Standard German. This parallel corpus enabled controlled comparisons by presenting the same content in either Standard German or dialect form.
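To illustrate what such a parallel corpus looks like in practice, the following minimal Python sketch pairs a Standard German sentence with a dialect rendering of the same content; the example sentences, spellings, and field names are invented for illustration and are not taken from the study's data:

from dataclasses import dataclass

@dataclass
class MatchedPair:
    dialect: str        # e.g. "Bavarian" or "Low German"
    standard_text: str  # Standard German version of the sentence
    dialect_text: str   # the same content rendered in the dialect

# Illustrative pairs only; real entries would come from dialect databases.
pairs = [
    MatchedPair("Bavarian", "Ich gehe heute auf den Markt.", "I geh heit auf'n Markt."),
    MatchedPair("Low German", "Wir haben lange gewartet.", "Wi hebbt lang töövt."),
]

# Because only the variety changes while the content stays fixed, any difference
# in how a model judges the two versions can be attributed to the dialect itself.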

Ten large language models were tested, including open-source systems such as Gemma and Qwen, as well as the commercial model GPT-5. Each model received text samples written either in Standard German or in one of seven dialects: Low German, Bavarian, North Frisian, Saterfrisian, Ripuarian (including Kölsch), Alemannic, and Rhine-Franconian (including Palatine and Hessian).

The systems were asked to assign personal attributes to fictional speakers and to make decisions in scenarios such as hiring, workshop invitations, or housing selection.
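A rough sketch of how such an association probe could be scripted is shown below; the prompt wording, the trait list, and the query_model placeholder are assumptions made for illustration and do not reproduce the study's actual prompts or evaluation code:

TRAITS = ["gebildet", "professionell", "vertrauenswürdig", "ländlich", "traditionell"]

def build_prompt(utterance: str) -> str:
    # Ask the model to pick exactly one trait adjective for the speaker.
    return (
        f'Eine Person sagt: "{utterance}"\n'
        f'Welche der folgenden Eigenschaften passt am besten zu dieser Person? '
        f'{", ".join(TRAITS)}\nAntworte mit genau einem Wort.'
    )

def query_model(prompt: str) -> str:
    # Placeholder: replace with a call to whichever model is being evaluated.
    raise NotImplementedError

def compare(standard_text: str, dialect_text: str) -> tuple[str, str]:
    """Return the trait chosen for the Standard German and the dialect version."""
    return query_model(build_prompt(standard_text)), query_model(build_prompt(dialect_text))

Running such a comparison over many matched pairs and many trait adjectives would yield the kind of aggregate association statistics the study reports.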

Consistent stereotype reproduction

The results showed that nearly all models reproduced widely known social stereotypes. Standard German speakers were more often described as “educated”, “professional”, or “trustworthy”, whereas dialect speakers were more often assigned attributes such as “rural”, “traditional”, or “uneducated”. Even traits traditionally linked to dialect speakers in sociolinguistic research, such as “friendly”, were more frequently attributed by the models to Standard German speakers.

Decision-based tasks reflected similar patterns. Dialect speakers were more often associated with agricultural work, anger management workshops, or rural residential choices. “These associations reflect societal assumptions embedded within the training data of many language models,” noted Professor von der Wense. “This is concerning because AI systems are increasingly used in settings such as education or hiring, where language often serves as a proxy for competence or credibility.”

Bias increases when dialects are explicitly identified

Bias became particularly pronounced when models were explicitly told that a text was written in a dialect. A notable finding was that larger models within the same model family displayed stronger biases. "Bigger does not necessarily mean fairer," said Bui. "Larger models appear to learn social stereotypes with even greater precision."

Bias persists beyond spelling or grammar differences

The team also compared dialect texts with artificially modified Standard German that contained noise such as misspellings or distortions. The discrimination persisted, indicating that the bias cannot be explained simply by unfamiliar spelling patterns or grammatical deviations.
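One way such a noise control could be constructed is sketched below; the perturbation rate and the choice of drop-or-double operations are illustrative assumptions, not the study's exact procedure:

import random

def add_noise(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Randomly drop or double letters to mimic spelling 'noise' in Standard German."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch.isalpha() and rng.random() < rate:
            if rng.random() < 0.5:
                continue      # drop the letter
            out.append(ch)    # or double it
        out.append(ch)
    return "".join(out)

print(add_noise("Ich gehe heute auf den Markt."))
# If a model rates this noisy but socially neutral text as favorably as clean
# Standard German while still penalizing dialect text, unfamiliar spelling alone
# cannot explain the bias.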

These results reflect a broader phenomenon observed in other languages as well. “German dialects serve as an informative case study for understanding how language models treat regional and social variation,” said Bui. “Comparable patterns have been documented for other language varieties, including African American English.”

Future directions

The researchers plan to examine how AI models respond to dialects spoken in the Mainz region and to explore strategies for improving fairness in language model training. “Dialects are a vital part of social identity,” said von der Wense. “Ensuring that AI systems recognize and respect this diversity is not only a matter of technical fairness but also one of social responsibility.”

Journal reference:
  • Minh Duc Bui, Carolin Holtermann, Valentin Hofmann, Anne Lauscher, and Katharina von der Wense. 2025. Large Language Models Discriminate Against Speakers of German Dialects. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 8223–8251, Suzhou, China. Association for Computational Linguistics. https://aclanthology.org/2025.emnlp-main.415/
