Why AI Still Can’t Fool Us: Study Reveals How Chatbots Miss Human Conversation Cues

From filler words to subtle small talk, researchers show that AI may speak fluently, but it still doesn’t sound human, revealing a lingering gap between imitation and true conversational intuition.

Research: Can Large Language Models Simulate Spoken Human Conversations? Image Credit: Krot_Studio / Shutterstock

It is easy to be impressed by artificial intelligence. Many people use large language models such as ChatGPT, Copilot, and Perplexity to help with a variety of tasks, or simply for entertainment.

But how good are these large language models at pretending to be human?

Not very, according to recent research.

"Large language models speak differently than people do," said Associate Professor Lucas Bietti from the Norwegian University of Science and Technology's (NTNU) Department of Psychology.

Bietti was one of the authors of a recent research article on the topic, available through PubMed Central (PMC). The lead author is Eric Mayor from the University of Basel, and the final author is Adrian Bangerter from the University of Neuchâtel.

Test Setup: Human vs. Machine

The researchers tested four large language models: ChatGPT-4, Claude 3.5 Sonnet, Vicuna, and Wayfarer.

  • First, they compared transcripts of real phone conversations between humans with conversations simulated by the large language models.
  • Then, they tested whether other people could distinguish the human phone conversations from those produced by the language models.

For the most part, people are not fooled—or at least not yet. So what are the language models doing wrong?

Exaggerated Imitation and Social Cues

When people talk to each other, a certain amount of imitation goes on: we adapt our words and the way we speak to the other person. This imitation, however, is usually quite subtle.

"Large language models are a bit too eager to imitate, and this exaggerated imitation is something that humans can pick up on," explained Bietti.

This is called 'exaggerated alignment'.

But that is not all.

Problems with Filler Words and Conversation Flow

Movies with bad scripts usually have conversations that sound artificial. In such cases, the scriptwriters have often forgotten that conversations do not only consist of the necessary content words. In real, everyday conversations, most of us include small words called 'discourse markers'.

These are words like 'so', 'well', 'like', and 'anyway'.

These words have a social function because they can signal interest, belonging, attitude, or meaning to the other person. They can also be used to structure the conversation.

Large language models are still terrible at using these words.

"The large language models use these small words differently, and often incorrectly," said Bietti.

This helps to expose them as non-human. But there is more.

Struggles with Openings and Closings

When you start talking to someone, you probably do not get straight to the point. Instead, you might begin by saying 'hey' or 'so, how are you doing?' or 'oh, fancy seeing you here'. People tend to engage in small talk before moving on to what they actually want to talk about.

This shift from introduction to business takes place more or less automatically for humans, without being explicitly stated.

"This introduction, and the shift to a new phase of the conversation, are also difficult for large language models to imitate," said Bietti.

The same applies to the end of the conversation. We usually do not end a conversation abruptly as soon as the information has been conveyed to the other person. Instead, we often end the conversation with phrases like 'alright, then', 'okay', 'talk to you later', or 'see you soon'.

Large language models do not quite manage that part either.

Human-Like AI: Still a Work in Progress

Altogether, these features cause so much trouble for the large language models that the conclusion is clear:

"Today's large language models are not yet able to imitate humans well enough to consistently fool us," said Bietti.

Developments in this field are now progressing so rapidly that large language models will most likely be able to do this quite soon—at least if we want them to. Or will they?

"Improvements in large language models will most likely manage to narrow the gap between human conversations and artificial ones, but key differences will probably remain," concluded Bietti.

For the time being, large language models are still not human-like enough to fool us. At least not every time.
