ChatGPT fails philosophical test: AI coherence isn’t enough to prove a human-like mind

CO-EDP, VisionRI | Updated: 28-06-2025 09:51 IST | Created: 28-06-2025 09:51 IST

A newly published study challenges the assumption that AI can fully replicate the nuanced coherence found in human-to-human conversations, raising philosophical concerns about whether machines can ever truly possess what we perceive as a “mind.”

The peer-reviewed study, titled “The Lack of Other Minds as the Lack of Coherence in Human–AI Interactions”, was published in Philosophies. Conducted by Lin Tang of Southwest University, the research interrogates AI’s discursive abilities through a comparative analysis of conversational coherence in human–AI interactions (HAI) versus human–human interactions (HHI), with a particular focus on OpenAI’s ChatGPT. The study reframes a classic philosophical question, the problem of other minds, within the empirical territory of discourse analysis, asserting that coherence in dialogue is a measurable proxy for cognitive presence.

Can AI interactions be taken as evidence of another mind?

How can one know that other minds exist? This riddle has long haunted epistemology and the philosophy of mind, and it has traditionally been addressed through inferential theories such as the argument from analogy or inference to the best explanation. This work relocates the question to the realm of discourse. Instead of examining whether AI possesses consciousness, the study evaluates whether AI can participate in conversation in a way that implies coherent understanding, a fundamental marker of mind-like engagement.

By analyzing how both human and AI interlocutors construct discursive coherence, the paper treats linguistic exchange as a manifestation of mutual mental modeling. Tang applies four coherence “lines” drawn from the philosophy of language (dictional, intentional, emotional, and rational) to assess whether conversational responses from ChatGPT reflect the kind of mental synchrony expected in human communication. In this framework, each coherence line embodies a cognitive function: the dictional line links linguistic forms, the intentional line denotes shared meaning, the emotional line conveys affective resonance, and the rational line ensures logical continuity.

If an entity can navigate these coherence structures effectively, it arguably displays traits associated with mind-like functioning. However, Tang finds that while ChatGPT is capable of operating along these lines to a limited extent, it falls significantly short in integrating them with the complexity and adaptability seen in human discourse.
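
The study itself is qualitative, but its claim that coherence is a measurable proxy invites a computational reading. The Python sketch below is purely illustrative and not taken from the paper: it stubs the four lines in a simple profile and scores only the dictional line, using an assumed Jaccard overlap of content words. Every name, metric, and value here is a hypothetical choice, not Tang's method.

```python
# Illustrative sketch only: none of these names, formulas, or values
# come from the study. The four coherence "lines" are stubbed as a
# profile; only the dictional line gets a crude, assumed metric
# (Jaccard overlap of content words between a prompt and its reply).
from dataclasses import dataclass

STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "it", "but", "can"}

@dataclass
class CoherenceProfile:
    dictional: float    # reuse of linguistic forms across turns
    intentional: float  # shared topical reference (stubbed)
    emotional: float    # affective resonance (stubbed)
    rational: float     # logical continuity (stubbed)

def content_words(turn: str) -> set[str]:
    """Lowercase a turn, strip trailing punctuation, drop function words."""
    return {w.strip(".,!?").lower() for w in turn.split()} - STOPWORDS

def dictional_score(prompt: str, reply: str) -> float:
    """Jaccard overlap of content words: a rough proxy for phrase reuse."""
    a, b = content_words(prompt), content_words(reply)
    return len(a & b) / len(a | b) if a | b else 0.0

if __name__ == "__main__":
    prompt = "Do you think machines can ever understand emotions?"
    reply = "Machines can model emotions statistically, but understanding is harder."
    profile = CoherenceProfile(
        dictional=dictional_score(prompt, reply),
        intentional=0.0,  # would need topic modelling
        emotional=0.0,    # would need affect analysis
        rational=0.0,     # would need entailment checking
    )
    print(profile)
```

On the sample exchange, the reused words “machines” and “emotions” yield a low but nonzero dictional score, while the other three lines remain unscored placeholders; the point is only that the framework lends itself to operational measures, not that any particular measure was used in the paper.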

Where does AI fall short in conversational coherence?

The research reveals that ChatGPT does exhibit proficiency in basic coherence building. It maintains consistent dictional and intentional lines by reusing phrases and aligning responses with contextual topics. It also demonstrates some ability to construct rational coherence by offering logically consistent replies. In scenarios involving mild emotional expressions, ChatGPT uses polite or humorous strategies to maintain a conversational tone.

However, its performance noticeably degrades when faced with more emotionally charged or unpredictable dialogue. Emotional coherence, a crucial component of human conversation, is where AI's limitations become most visible. The study finds that ChatGPT's emotional responses are often formulaic, lacking the subtlety, spontaneity, and variability found in human reactions. Even when prompted with emotionally provocative inputs, ChatGPT resorts to predefined patterns, revealing an absence of authentic affective understanding.

More critically, the research underscores a gap in integration. Humans naturally and dynamically interweave all four coherence lines within a single conversation, shifting fluidly between rational justification, emotional engagement, semantic reference, and linguistic cohesion; ChatGPT tends to fixate on one or two lines per interaction. This results in a constrained discursive range, giving AI conversations a repetitive and somewhat artificial quality. The inability to harmonize multiple coherence lines simultaneously prevents ChatGPT from approximating the creative and intuitive nature of human dialogue.
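
To make the fixation claim concrete, here is a second illustrative sketch. The same hedges apply: the per-turn scores are invented for demonstration, and the 0.5 activation threshold is an assumption, not a figure reported by Tang. It simply counts how many coherence lines clear a threshold in each turn, one crude way the “one or two lines per interaction” pattern could be surfaced.

```python
# Illustrative only: the per-turn scores below are invented to make the
# contrast concrete, and the 0.5 "activation" threshold is an assumption.
# The point is the shape of the comparison: counting how many coherence
# lines are simultaneously active in each conversational turn.
ACTIVE = 0.5  # hypothetical threshold for a line counting as "in play"

human_turns = [  # invented scores: several lines active at once
    {"dictional": 0.7, "intentional": 0.8, "emotional": 0.6, "rational": 0.7},
    {"dictional": 0.5, "intentional": 0.7, "emotional": 0.8, "rational": 0.6},
]
ai_turns = [  # invented scores: fixation on one or two lines
    {"dictional": 0.9, "intentional": 0.7, "emotional": 0.1, "rational": 0.3},
    {"dictional": 0.8, "intentional": 0.6, "emotional": 0.1, "rational": 0.2},
]

def active_lines(turn: dict[str, float]) -> int:
    """Count coherence lines whose score clears the threshold."""
    return sum(score >= ACTIVE for score in turn.values())

for label, turns in (("human", human_turns), ("AI", ai_turns)):
    print(label, "active lines per turn:", [active_lines(t) for t in turns])
```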

The study further notes that ChatGPT occasionally exhibits logical inconsistencies, such as reversing prior positions within a multi-turn exchange. This undermines its rational coherence and raises concerns about its capacity to model conversation partners in a sustained or contextually adaptive way, something human interlocutors do instinctively.

Does discursive coherence imply the presence of mind?

By translating the age-old question of other minds into a pragmatic linguistic test, the study argues that the success of discourse depends not just on grammatical accuracy but on the capacity to co-construct meaning and intent. Human communication involves more than sentence generation; it requires shared mental modeling, anticipation of others’ responses, and adaptive feedback based on emotional and logical cues. These capabilities, the research suggests, are currently beyond AI’s grasp.

The author reinforces that the existence of discursive coherence alone does not confirm the presence of a mind; what matters is the richness, adaptability, and integrative capacity of that coherence. While ChatGPT mimics coherence structures, it does so from the outside in, manipulating form without genuinely participating in the meaning-making process that characterizes human cognition.

In contrasting AI and human discourse, the study illustrates that HHI exhibits layered and multidimensional coherence, where emotional tension, semantic nuance, and logical reasoning converge. By contrast, HAI remains relatively linear and surface-level, reliant on predefined templates and probabilistic patterns rather than genuine insight or empathy.

Current AI systems, the study suggests, are not mind-like in any philosophically or linguistically robust sense. Despite appearing fluent and responsive, AI's coherence lacks the creative spontaneity and affective depth that signify mutual mind recognition. This renders AI interactions functionally impressive but cognitively hollow.

Implications for the future of human–AI communication

The findings have significant implications for the future development of conversational AI. As AI continues to permeate domains that demand human-like interaction, from education to healthcare to customer service, the expectation that these systems can fully replicate human understanding should be tempered.

The study calls for more targeted research into emotional and pragmatic coherence, suggesting that advancing AI discourse models should prioritize not only semantic and logical accuracy but also the capacity to engage with emotional and contextual complexity. Without this, AI risks remaining a convincing impersonator rather than a true conversational partner.

First published in: Devdiscourse