Study Finds Hidden Bias in AI Text Evaluation When Author Identity is Revealed

A University of Zurich study reveals that LLMs judge the same text differently depending on the author’s nationality or source, exposing hidden bias in AI evaluations and highlighting the need for transparency, governance, and human oversight.

Research: Source framing triggers systematic bias in large language models. Image Credit: SsCreativeStudio / Shutterstock

Large Language Models (LLMs) are increasingly used not only to generate content but also to evaluate it. They are asked to grade essays, moderate social media content, summarize reports, screen job applications, and perform a variety of other tasks.

However, heated discussions have arisen in both the media and academia about whether such evaluations are consistent and unbiased. Some LLMs are suspected of promoting certain political agendas: Deepseek, for example, is often characterized as having a pro-Chinese perspective, and OpenAI as being "woke".

Although these beliefs are widely discussed, they remain unsubstantiated to date. UZH researchers Federico Germani and Giovanni Spitale have now investigated whether LLMs really exhibit systematic biases when evaluating texts. The results show that LLMs indeed deliver biased judgments, but only when information about the source or author of the evaluated message is revealed.

Testing LLM judgment across controversial topics

The researchers included four widely used LLMs in their study: OpenAI o3-mini, Deepseek Reasoner, xAI Grok 2, and Mistral. First, they tasked each LLM with creating fifty narrative statements on 24 controversial topics, such as vaccination mandates, geopolitics, and climate change policy.

Then they asked the LLMs to evaluate all the texts under different conditions: sometimes no source for the statement was provided, and sometimes it was attributed to a human of a certain nationality or to another LLM. This produced a total of 192,000 assessments, which were then analyzed for bias and for agreement both between different LLMs and within the same LLM.
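To make the setup concrete, the framing conditions can be pictured as variations of a single evaluation prompt. The sketch below is purely illustrative; the statement and prompt wording are invented for this article and are not the prompts used in the study.

```python
# Illustrative sketch only: the framing conditions described above, reconstructed
# as hypothetical prompt templates (not the authors' actual prompts).

STATEMENT = "Vaccination mandates are a justified public-health measure."  # invented example


def build_evaluation_prompt(statement: str, source: str | None) -> str:
    """Ask a model to rate agreement with a statement, optionally naming a purported source."""
    attribution = f" The statement was written by {source}." if source else ""
    return (
        "Rate your agreement with the following statement on a scale from 0 to 100 "
        "and briefly justify your score." + attribution + f"\n\nStatement: {statement}"
    )


# Conditions compared in the study: no source, a human author of a given
# nationality, and another LLM as the purported author.
CONDITIONS = {
    "no_source": None,
    "human_author": "a person from China",
    "llm_author": "another large language model",
}

for name, source in CONDITIONS.items():
    print(f"--- {name} ---")
    print(build_evaluation_prompt(STATEMENT, source))
    print()
```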

High consistency without author information

The good news: When no information about the source of the text was provided, the evaluations of all four LLMs showed a high level of agreement, over ninety percent, across all topics. "There is no LLM war of ideologies," concludes Spitale. "The danger of AI nationalism is currently overhyped in the media."

Bias emerges when source identity is revealed

However, the picture changed completely when fictional sources of the texts were provided to the LLMs. Then suddenly a deep, hidden bias was revealed. The agreement between the LLM systems was substantially reduced and sometimes disappeared completely, even if the text stayed exactly the same.

Most striking was a strong anti-Chinese bias across all models, including China's own Deepseek. Agreement with the content of a text dropped sharply when "a person from China" was (falsely) named as the author. "This less favourable judgement emerged even when the argument was logical and well-written," says Germani. On geopolitical topics such as Taiwan's sovereignty, for example, Deepseek reduced its agreement by up to 75 percent simply because it expected a Chinese author to hold a different view.

Also surprising: the LLMs trusted humans more than other LLMs. Most models scored their agreement with arguments slightly lower when they believed the texts were written by another AI. "This suggests a built-in distrust of machine-generated content," says Spitale.

Hidden bias threatens AI evaluation integrity

Altogether, the findings show that when asked to evaluate a text, AI doesn't just process the content; it also reacts strongly to the identity of the author or the source. Even small cues, such as the author's nationality, can push LLMs toward biased reasoning. Germani and Spitale argue that this could lead to serious problems if AI is used for content moderation, hiring, academic reviewing, or journalism. The danger of LLMs isn't that they are trained to promote a political ideology; it is this hidden bias.

"AI will replicate such harmful assumptions unless we build transparency and governance into how it evaluates information", says Spitale. This has to be done before AI is used in sensitive social or political contexts. The results don't mean people should avoid AI, but they should not trust it blindly. "LLMs are safest when they are used to assist reasoning, rather than to replace it: useful assistants, but never judges."

Guidelines to minimize bias in AI evaluations

How to avoid LLM evaluation bias

1. Make the LLM identity-blind: Remove all identity information about the author and source of the text; for example, avoid phrases like "written by a person from X / by model Y" in the prompt.

2. Check from different angles: Run the same questions twice, e.g., with and without a source mentioned in the prompt. If the results change, you have likely hit a bias. Or cross-check with a second LLM: if divergence appears when you add a source, that is a red flag. A minimal sketch of this check follows this list.

3. Force the focus away from the source: Structured criteria help anchor the model in content rather than identity. Use a prompt such as: "Score this using a 4-point rubric (evidence, logic, clarity, counter-arguments), and explain each score briefly."

4. Keep humans in the loop: Treat the model as a drafting aid and add a human review to the process, especially when an evaluation affects people.
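As a minimal sketch of guidelines 1 to 3, assuming the OpenAI Python SDK and an illustrative model name (neither of which is prescribed by the study), the same identity-stripped text can be scored with the rubric prompt twice, once blind and once with a fabricated source, and the two outputs compared:

```python
# A minimal sketch, assuming the OpenAI Python SDK (`pip install openai`) and an
# OPENAI_API_KEY in the environment. The rubric wording, model name, and source
# note are illustrative choices, not taken from the study.
from openai import OpenAI

client = OpenAI()

RUBRIC_PROMPT = (
    "Score this using a 4-point rubric (evidence, logic, clarity, counter-arguments), "
    "and explain each score briefly.\n\nText:\n{text}"
)


def score_text(text: str, source_note: str = "") -> str:
    """Score a text with the rubric; optionally prepend a source attribution so a
    blind run and a framed run can be compared (guideline 2)."""
    prompt = (source_note + "\n\n" if source_note else "") + RUBRIC_PROMPT.format(text=text)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-capable model would do here
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# The text to evaluate, stripped of author and source details (guideline 1).
essay = "Vaccination mandates are a justified public-health measure because ..."

blind = score_text(essay)
framed = score_text(essay, source_note="The text was written by a person from China.")

# If the two outputs diverge noticeably, treat the evaluation as potentially
# biased and escalate to human review (guideline 4).
print(blind)
print(framed)
```

Any noticeable gap between the blind score and the framed score is the red flag described in guideline 2.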
