Is AI becoming a risky source of medical knowledge in healthcare systems?
A new study published in Healthcare highlights that large language models in healthcare are not simply tools for administrative support but are increasingly being treated as sources of clinical knowledge, raising critical concerns about trust, accountability, and patient safety.
The study, “Governing Generative AI in Healthcare: A Normative Conceptual Framework for Epistemic Authority, Trust, and the Architecture of Responsibility,” presents a framework to address how healthcare systems should govern the use of generative AI. It introduces the Epistemic Authority–Trust–Responsibility (ETR) Architecture, a structured model designed to guide how AI outputs are classified, trusted, and regulated in clinical environments.
The research argues that current governance approaches are insufficient because they focus primarily on issues such as bias, privacy, and accuracy while overlooking more fundamental questions. These include the nature of knowledge produced by AI systems, the conditions under which such outputs should be trusted, and the distribution of responsibility when AI contributes to clinical decisions.
AI outputs blur line between assistance and clinical knowledge
The study identifies a fundamental shift in how AI is used in healthcare. Large language models are no longer limited to administrative functions such as drafting documents or summarizing patient records. Instead, they are increasingly involved in tasks that resemble clinical reasoning, including suggesting diagnoses, generating treatment recommendations, and responding to patient inquiries.
This evolution creates a critical challenge: AI-generated content often appears indistinguishable from authoritative medical knowledge. As a result, clinicians and patients may treat these outputs as reliable evidence, even when they are produced through statistical pattern recognition rather than validated clinical reasoning.
The study also notes that AI systems can accept incorrect medical information as true, particularly when it is presented in an authoritative format. This raises concerns about misinformation risks in clinical settings, where the consequences of error can be severe. At the same time, evidence suggests that access to AI tools does not necessarily improve diagnostic accuracy, reinforcing the need for careful governance rather than blind adoption.
To address this issue, the study proposes a four-tier classification system for AI outputs. These tiers range from low-risk administrative drafts to high-risk clinical evidence claims. Each level requires a different degree of verification, ensuring that governance intensity matches the potential impact on patient care.
At the lowest level, AI outputs function as drafts that require basic review. At higher levels, outputs include diagnostic suggestions and evidence-based claims that demand rigorous validation, traceability, and institutional oversight. This structured approach aims to prevent healthcare systems from treating all AI outputs equally, a practice that the study identifies as a major governance failure.
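To make the idea concrete, the sketch below shows one way an institution might encode such a tiered scheme in code. The tier names and the specific review steps are illustrative assumptions rather than the study's own labels; the article specifies only that tiers range from administrative drafts to clinical evidence claims and that verification intensity should scale with risk.

```python
from enum import IntEnum

class OutputTier(IntEnum):
    """Illustrative four-tier classification of generative AI outputs,
    ordered from lowest to highest clinical risk."""
    ADMINISTRATIVE_DRAFT = 1     # e.g. a drafted discharge letter
    PATIENT_COMMUNICATION = 2    # e.g. a drafted reply to a patient inquiry
    DIAGNOSTIC_SUGGESTION = 3    # e.g. a proposed differential diagnosis
    CLINICAL_EVIDENCE_CLAIM = 4  # e.g. a treatment recommendation citing evidence

# Hypothetical mapping of each tier to the verification it would require;
# the study only states that governance intensity should match potential impact.
REQUIRED_VERIFICATION = {
    OutputTier.ADMINISTRATIVE_DRAFT: ["basic human review"],
    OutputTier.PATIENT_COMMUNICATION: ["clinician sign-off"],
    OutputTier.DIAGNOSTIC_SUGGESTION: ["clinician sign-off", "source traceability"],
    OutputTier.CLINICAL_EVIDENCE_CLAIM: [
        "clinician sign-off", "source traceability", "institutional oversight"
    ],
}

def verification_steps(tier: OutputTier) -> list[str]:
    """Return the review steps an output at this tier would need before clinical use."""
    return REQUIRED_VERIFICATION[tier]

if __name__ == "__main__":
    print(verification_steps(OutputTier.CLINICAL_EVIDENCE_CLAIM))
```

The point of such a structure is simply that no output reaches a clinician or patient without the review obligations attached to its tier, rather than all outputs being handled the same way.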
Trust in AI depends on verifiability, context, harm assessment, and reversibility
The study emphasizes that trust in healthcare AI is not simply a matter of user confidence but must be grounded in specific conditions that justify reliance. It identifies four key factors that determine whether trust in AI outputs is warranted.
- Verifiability requires that AI-generated claims can be traced back to reliable sources. Without this capability, clinicians cannot assess the validity of the information they receive.
- Contextual fit ensures that the level of oversight aligns with the clinical risk associated with the output.
- Harm assessment involves evaluating the potential consequences of incorrect AI outputs. In high-risk scenarios, even a small probability of error can have significant implications for patient safety.
- Reversibility considers whether the effects of an AI-informed decision can be undone if it proves to be incorrect.
Together, these four conditions form a framework for calibrating trust in AI systems. The study argues that failing to meet them leads to either over-reliance or under-utilization, both of which can undermine clinical outcomes.
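As a rough illustration, the sketch below encodes the four conditions as a simple checklist. The all-or-nothing decision rule is an assumption made for clarity; the study describes the conditions qualitatively and does not prescribe a specific threshold.

```python
from dataclasses import dataclass

@dataclass
class TrustAssessment:
    """The four conditions the study identifies for warranted trust in an AI output."""
    verifiable: bool         # can the claim be traced back to reliable sources?
    contextual_fit: bool     # does oversight match the clinical risk of the output?
    low_expected_harm: bool  # are the consequences of an incorrect output acceptable?
    reversible: bool         # can an AI-informed decision be undone if it proves wrong?

def reliance_warranted(a: TrustAssessment) -> bool:
    """A deliberately strict rule (an assumption, not the study's own threshold):
    reliance is warranted only when all four conditions hold."""
    return all([a.verifiable, a.contextual_fit, a.low_expected_harm, a.reversible])

# Example: an untraceable recommendation in a high-stakes, hard-to-reverse scenario
# fails the check, flagging a risk of over-reliance.
print(reliance_warranted(TrustAssessment(False, True, False, False)))  # False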
A key concept introduced in this context is the “epistemic placebo.” This refers to governance measures that create the appearance of oversight without providing meaningful safeguards. For example, generic statements about human oversight may give the impression of safety but lack clearly defined roles, processes, or accountability mechanisms.
The study warns that epistemic placebos are particularly dangerous because they mimic effective governance while failing to protect patients. They can also create a false sense of compliance with regulatory standards, reducing the incentive for institutions to implement more robust safeguards.
Responsibility gaps expose systemic weaknesses in AI-driven healthcare
One of the most significant challenges identified is the issue of responsibility. When AI systems contribute to clinical decisions, it becomes difficult to determine who is accountable for the outcomes.
Traditional models of medical responsibility assume a clear chain of accountability, typically centered on the clinician and the healthcare institution. However, the introduction of AI complicates this structure. Developers design the systems, institutions deploy them, and clinicians use them, but no single actor has complete control over the decision-making process.
This creates what the study describes as a responsibility gap. In cases where AI-informed decisions lead to patient harm, it is unclear whether accountability should rest with the developer, the healthcare provider, or the institution. To address this issue, the study proposes a structured responsibility model based on six key governance functions. These include model validation, output classification, verification, harm detection, audit trail maintenance, and lifecycle management.
Responsibility for these functions is distributed across four groups: developers, healthcare institutions, clinical teams, and external auditors. Each group has clearly defined roles, ensuring that no aspect of governance is left unassigned. This approach shifts the focus from reactive accountability to proactive design. By defining responsibilities before deployment, healthcare systems can reduce ambiguity and improve their ability to manage risks associated with AI.
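A minimal sketch of what such a responsibility matrix could look like appears below. The six governance functions and four stakeholder groups are those reported by the study, but the specific assignments are hypothetical, since the article does not state which group owns which function.

```python
# Hypothetical responsibility matrix: each of the study's six governance functions
# is assigned at least one owner among its four stakeholder groups. The individual
# assignments below are illustrative assumptions.
RESPONSIBILITY_MATRIX: dict[str, list[str]] = {
    "model validation":        ["developers", "external auditors"],
    "output classification":   ["healthcare institutions"],
    "verification":            ["clinical teams"],
    "harm detection":          ["clinical teams", "healthcare institutions"],
    "audit trail maintenance": ["healthcare institutions"],
    "lifecycle management":    ["developers", "healthcare institutions"],
}

def unassigned_functions(matrix: dict[str, list[str]]) -> list[str]:
    """Flag governance functions left without an owner -- the 'responsibility gap'
    the framework is designed to prevent."""
    return [fn for fn, owners in matrix.items() if not owners]

print(unassigned_functions(RESPONSIBILITY_MATRIX))  # [] means every function has an owner
```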
The study also acknowledges practical challenges in implementing such frameworks, particularly in resource-limited settings. It suggests simplified governance models for these environments, including restricting AI use to lower-risk applications and consolidating oversight responsibilities.
Critical regulatory window will shape future of AI in healthcare
Global initiatives such as the European Union’s AI Act and guidelines from the World Health Organization are beginning to establish standards for AI governance in healthcare. However, the research argues that the period between 2025 and 2027 represents a critical transition phase. During this time, healthcare institutions are making decisions about how to integrate AI into their workflows, often without fully developed governance structures.
The choices made during this period are likely to have long-term implications. If institutions adopt AI systems without clear frameworks for classification, trust, and responsibility, governance norms may be shaped by default, influenced more by technology vendors than by patient-centered policies.
Governance, as the study stresses, must operate at the institutional level, bridging the gap between high-level regulations and day-to-day clinical practice. Existing evaluation standards focus on individual AI systems, but they do not address how these systems should be integrated into complex healthcare environments.
First published in: Devdiscourse

