False Certainty? AI chatbots act confident even when clueless

New research sheds light on a critical dimension of artificial intelligence: confidence accuracy. The research assesses how reliably large language models (LLMs) such as ChatGPT, Gemini, Sonnet, and Haiku can judge their own certainty when providing information. The findings of the paper titled "Quantifying Uncert-AI-nty: Testing the Accuracy of LLMs’ Confidence Judgments" are published in the peer-reviewed journal Memory & Cognition.
The study offers an evidence-based examination of whether LLMs can mimic a distinctly human trait: evaluating and reporting how sure they are about the answers they give. It asks whether LLMs are capable not only of producing accurate outputs, but also of indicating when they might be wrong - a key factor in how humans weigh advice, assign credibility, and make informed decisions.
Are AI systems capable of meaningful confidence judgments?
The researchers set out to determine whether today's most advanced LLMs can provide confidence scores that accurately reflect their own performance. In essence, the study investigates if these systems know what they know, and what they don't. This is critical for a range of real-world applications, including medical diagnostics, legal decision-support, academic tutoring, and consumer advice, where misplaced confidence from an AI system could lead to costly or dangerous outcomes.
To measure this capacity, the research team designed five experimental studies spanning both aleatory and epistemic uncertainty. Aleatory uncertainty covered tasks governed by chance and unpredictability, such as predicting NFL game outcomes and Oscar winners. Epistemic uncertainty covered knowledge-based tasks, such as answering general trivia questions, identifying Pictionary drawings, and questions about university life. Each model, along with human participants, provided both an answer and a self-rated confidence score for each response.
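To make that measurement concrete, the sketch below shows one common way to score confidence accuracy from such paired data: comparing mean stated confidence against actual accuracy and computing a Brier score. The numbers are invented for illustration and are not drawn from the study's data.

```python
# Minimal sketch: quantifying confidence accuracy from paired responses.
# The records below are hypothetical illustrations, not the study's results.

# Each record: (self-rated confidence on a 0-100 scale, whether the answer was correct)
responses = [
    (90, True), (85, False), (70, True), (60, True),
    (95, True), (80, False), (55, False), (75, True),
]

confidences = [c / 100 for c, _ in responses]           # rescale to the 0-1 range
outcomes = [1.0 if correct else 0.0 for _, correct in responses]

mean_confidence = sum(confidences) / len(confidences)
accuracy = sum(outcomes) / len(outcomes)

# Positive values indicate overconfidence: stated certainty exceeds actual accuracy.
overconfidence = mean_confidence - accuracy

# Brier score: mean squared gap between confidence and outcome (lower = better calibrated).
brier = sum((c - o) ** 2 for c, o in zip(confidences, outcomes)) / len(responses)

print(f"mean confidence: {mean_confidence:.2f}")
print(f"accuracy:        {accuracy:.2f}")
print(f"overconfidence:  {overconfidence:+.2f}")
print(f"Brier score:     {brier:.2f}")
```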
Across these varied domains, the LLMs demonstrated a consistent ability to generate confidence scores that aligned reasonably well with their performance outcomes. On average, their overall accuracy was comparable to that of human participants and sometimes slightly better. In particular, ChatGPT and Gemini emerged as the most consistently accurate models in expressing confidence, suggesting that some LLMs are already functioning at near-human levels in certain metacognitive dimensions.
Do LLMs suffer from overconfidence like humans?
The study found that while LLMs can report confidence, they often do so with an overconfident tone - a trait shared with their human counterparts. The models frequently expressed high certainty in cases where their answers were incorrect, a pattern that raises important questions about reliability and risk.
This overconfidence, though modest in scale, can become problematic when LLMs are deployed in scenarios where users may blindly trust system outputs. Without the ability to reliably signal when they are likely to be wrong, LLMs could give end-users a false sense of security. Overconfidence in AI, just as in humans, can lead to poor decisions when it is not corrected or checked.
The researchers also discovered that LLMs failed to adjust their confidence based on prior mistakes. In other words, the systems did not learn from earlier performance patterns. Unlike humans, who tend to reduce their confidence after being proven wrong repeatedly, LLMs did not exhibit this metacognitive adaptability. This lack of feedback sensitivity means that current AI systems may consistently present inaccurate information with unjustified conviction.
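The sketch below illustrates, with invented numbers, what such a feedback loop could look like in principle: tracking past confidence against past correctness and shrinking future confidence estimates by the observed gap. It is a simplified illustration of the general idea, not a description of how any of the tested models actually work.

```python
# Illustrative sketch (not from the paper): a feedback loop that lowers reported
# confidence once past confidence has outrun past accuracy. All names are hypothetical.

class ConfidenceRecalibrator:
    def __init__(self):
        self.past_confidences = []
        self.past_outcomes = []

    def record(self, confidence, was_correct):
        """Store one graded answer: stated confidence (0-1) and whether it was right."""
        self.past_confidences.append(confidence)
        self.past_outcomes.append(1.0 if was_correct else 0.0)

    def adjust(self, raw_confidence):
        """Shift a new confidence estimate down by the overconfidence observed so far."""
        if not self.past_confidences:
            return raw_confidence
        mean_conf = sum(self.past_confidences) / len(self.past_confidences)
        accuracy = sum(self.past_outcomes) / len(self.past_outcomes)
        overconfidence = max(0.0, mean_conf - accuracy)
        return max(0.0, min(1.0, raw_confidence - overconfidence))

recal = ConfidenceRecalibrator()
recal.record(0.90, False)   # confidently wrong
recal.record(0.80, True)
recal.record(0.85, False)   # confidently wrong again
print(recal.adjust(0.90))   # the new estimate is pulled down after repeated misses
```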
How do these findings impact the future of AI deployment?
As LLMs become embedded in everything from digital assistants to workplace productivity tools, the need for metacognitive awareness becomes more than a technical detail; it becomes an ethical and practical imperative.
These findings suggest that confidence ratings provided by LLMs, while informative, should not yet be treated as fully reliable indicators of correctness. Developers and platform designers must therefore be cautious when implementing features that display AI-generated confidence, especially in applications involving safety, fairness, or user trust.
From a research perspective, the authors urge the development of AI models with improved metacognitive feedback loops. This could include architectures that allow LLMs to reflect on their own accuracy, adjust confidence estimates over time, and even decline to answer when uncertainty is high. Transparency about uncertainty could also form the backbone of more responsible AI-human collaboration frameworks.
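As a rough illustration of the "decline to answer when uncertainty is high" idea, the sketch below wraps a hypothetical model call and abstains whenever the self-reported confidence falls below a threshold. Both ask_model and the threshold value are assumptions made for the example, not part of the study or any real API.

```python
# Hedged sketch of confidence-gated answering: abstain when self-rated certainty is low.

CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff, chosen only for illustration

def ask_model(question: str) -> tuple[str, float]:
    # Hypothetical placeholder: a real system would query an LLM and elicit
    # a confidence score alongside the answer.
    return "Canberra", 0.55

def answer_or_abstain(question: str) -> str:
    answer, confidence = ask_model(question)
    if confidence < CONFIDENCE_THRESHOLD:
        return f"I'm not confident enough to answer this (confidence {confidence:.2f})."
    return answer

print(answer_or_abstain("What is the capital of Australia?"))
```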
- FIRST PUBLISHED IN: Devdiscourse