How much should you trust AI? New research offers a data-driven answer


COE-EDP, VisionRI | Updated: 29-04-2026 14:16 IST | Created: 29-04-2026 14:16 IST

Artificial intelligence (AI) may be getting more explainable, but users still do not trust it when it matters most. A new study finds that even advanced explainable AI (XAI) systems fail to earn consistent confidence from human users, exposing a critical gap between machine accuracy and real-world decision-making.

A study titled “Toward a novel measure of user trust in XAI systems,” published in Frontiers in Computer Science, introduces a new framework designed to objectively measure user trust by combining system performance with user behavior. The authors propose a behavioral, performance-linked measurement that reflects both how accurate an AI system is and whether users actually trust its outputs.

Bridging performance and perception in AI trust measurement

The rise of deep learning has significantly improved predictive performance across domains, but its complexity has made it difficult for users to understand how decisions are made. This lack of transparency has driven the development of XAI techniques, which aim to provide explanations for AI outputs and improve user confidence.

However, the study highlights a major gap in the field: while explainability has advanced rapidly, the ability to measure whether explanations actually build trust has lagged behind. Existing approaches often rely on questionnaires, scales, or user feedback, which reflect attitudes rather than behavior and are inherently subjective.

The authors point out that trust in AI is not static but evolves based on system performance. When AI systems make errors, user trust declines, even if overall performance remains high. This dynamic relationship between accuracy and trust has been widely observed in prior research and forms the basis of the study’s proposed solution.

To address this, the researchers introduce a novel framework that integrates trust into the traditional confusion matrix used in machine learning evaluation. Instead of measuring only correct and incorrect predictions, the model also accounts for whether users choose to trust or distrust each output.

This results in four key categories: trusted correct predictions, untrusted correct predictions, trusted incorrect predictions, and untrusted incorrect predictions. By combining these elements, the framework captures both the technical performance of the system and the behavioral response of the user.

The approach allows for the adaptation of well-known evaluation metrics such as precision, recall, and F1-score, transforming them into trust-aware indicators. These metrics provide a more nuanced understanding of how well an AI system aligns with user expectations and decision-making patterns.
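To make the idea concrete, here is a minimal Python sketch of how such trust-aware metrics could be computed from per-prediction observations. The article does not give the paper's exact formulas, so the mapping below (trusted-correct as the "true positive" of trust, and the corresponding precision, recall, and F1 definitions) is an illustrative assumption, not the authors' published formulation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class TrustConfusion:
    """Counts for the four trust/correctness outcomes described in the article.

    Illustrative assumption: the metric definitions below are one plausible
    reading of a 'trust-aware' confusion matrix, not the paper's exact ones.
    """
    trusted_correct: int = 0      # user trusted a correct prediction (desired)
    untrusted_correct: int = 0    # user rejected a correct prediction (undertrust)
    trusted_incorrect: int = 0    # user trusted a wrong prediction (overtrust)
    untrusted_incorrect: int = 0  # user rejected a wrong prediction (desired)


def tally(observations: Iterable[Tuple[bool, bool]]) -> TrustConfusion:
    """Each observation is (prediction_was_correct, user_trusted_it)."""
    c = TrustConfusion()
    for correct, trusted in observations:
        if correct and trusted:
            c.trusted_correct += 1
        elif correct and not trusted:
            c.untrusted_correct += 1
        elif not correct and trusted:
            c.trusted_incorrect += 1
        else:
            c.untrusted_incorrect += 1
    return c


def trust_precision(c: TrustConfusion) -> float:
    """Of everything the user trusted, how much was actually correct?"""
    denom = c.trusted_correct + c.trusted_incorrect
    return c.trusted_correct / denom if denom else 0.0


def trust_recall(c: TrustConfusion) -> float:
    """Of everything that was correct, how much did the user trust?"""
    denom = c.trusted_correct + c.untrusted_correct
    return c.trusted_correct / denom if denom else 0.0


def trust_f1(c: TrustConfusion) -> float:
    """Harmonic mean of the trust-aware precision and recall above."""
    p, r = trust_precision(c), trust_recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0
```

Under this reading, low trust-aware precision signals overtrust (users accepting wrong outputs) and low trust-aware recall signals undertrust (users rejecting correct outputs), while ordinary accuracy stays blind to both.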

Detecting overtrust and undertrust in AI systems

The research identifies problematic trust behaviors that are often overlooked in traditional evaluations. Specifically, the framework distinguishes between overtrust, where users rely on incorrect predictions, and undertrust, where users reject correct outputs. These behaviors have significant implications for real-world AI deployment. Overtrust can lead to critical errors, particularly in high-risk domains such as healthcare, while undertrust can reduce the effectiveness of otherwise accurate systems.

To validate their approach, the researchers conducted three case studies. The first involved hypothetical scenarios designed to simulate extreme trust behaviors. These included a “perfect user” who trusts only correct predictions, an “overtrusting user” who trusts all outputs, and a “never-trust user” who rejects all predictions.

The proposed metrics successfully differentiated between these behaviors, even when overall system performance remained constant. Traditional methods, by contrast, often failed to distinguish between desirable and undesirable trust patterns.
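A short usage example makes the point. The data below are hypothetical, not the study's: ten predictions with fixed 80 percent accuracy, evaluated by the three stylized users, using the illustrative trust-aware F1 defined in the sketch above.

```python
# Hypothetical demonstration (not the study's data): 10 predictions, 8 correct.
# System accuracy is identical for all three users; only trust behavior differs.
correctness = [True] * 8 + [False] * 2

perfect_user      = [(c, c) for c in correctness]      # trusts only correct outputs
overtrusting_user = [(c, True) for c in correctness]   # trusts everything
never_trust_user  = [(c, False) for c in correctness]  # rejects everything

for name, obs in [("perfect", perfect_user),
                  ("overtrusting", overtrusting_user),
                  ("never-trust", never_trust_user)]:
    print(f"{name:>12}: trust-F1 = {trust_f1(tally(obs)):.2f}")

# Expected output with these illustrative definitions:
#      perfect: trust-F1 = 1.00
# overtrusting: trust-F1 = 0.89
#  never-trust: trust-F1 = 0.00
```

Plain accuracy reports 0.80 for all three users; only the trust-aware score separates the desirable behavior from overtrust and blanket rejection.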

In the second case study, the researchers applied the framework to a real machine learning model used for classifying blood cell images. Despite high model accuracy, the trust metrics revealed that overtrust could still occur, demonstrating that strong performance alone does not guarantee appropriate user reliance.

This finding challenges a common assumption in AI development that improving accuracy will automatically lead to better user outcomes. Instead, the study shows that trust must be calibrated alongside performance to ensure safe and effective use.

Real-world testing reveals fragile trust in medical AI systems

The third case study provides the most direct insight into real-world implications. The researchers tested their framework in a healthcare setting, using an AI system designed to detect COVID-19 pneumonia from chest X-ray images. Two senior radiologists evaluated the system through an interactive interface that displayed predictions and visual explanations. The participants were asked to indicate whether they agreed with each result, providing a behavioral measure of trust.

Despite the system achieving a classification accuracy of 80 percent, the trust metrics revealed extremely low levels of user confidence. The F1-scores, which combine trust and performance, were recorded at 0.17 and 0.03 for the two users, indicating a general lack of trust in the system.

Further analysis showed significant variability between users, even when evaluating the same images. While the raw agreement rate between the two radiologists was high, statistical measures revealed inconsistencies driven by differences in trust behavior. This highlights the inherently subjective nature of trust and the importance of measuring it in a structured, objective way.

The study also found that the way explanations are presented can influence trust. When more visual information was included in the explanations, trust levels tended to decrease, suggesting that overly complex or noisy explanations may reduce user confidence rather than improve it.

These findings highlight the fragility of trust in AI systems, particularly in high-stakes environments where decisions have significant consequences. They also demonstrate the limitations of existing evaluation methods, which may fail to capture these dynamics.

Toward more reliable and interpretable AI deployment

Measuring trust in AI systems requires a shift from subjective, survey-based approaches to objective, behavior-driven metrics. By integrating performance data with user decisions, the proposed framework provides a more accurate and actionable measure of trust.

This approach offers several advantages. It enables clearer interpretation of results, supports the identification of specific trust-related issues, and allows for more granular analysis of user behavior. It also aligns with broader efforts in AI research to develop more transparent and accountable systems.

The study also acknowledges limitations. The proposed metrics may be less sensitive in certain scenarios, particularly when dealing with extreme overtrust behaviors. The pilot study also involved a limited number of participants, highlighting the need for larger-scale validation.

Nevertheless, the findings represent a significant step forward in addressing one of the most persistent challenges in AI adoption. As AI systems become more integrated into critical decision-making processes, ensuring that users trust them appropriately will be essential.

FIRST PUBLISHED IN: Devdiscourse