From miscommunication to mistrust: Why human-AI relations may turn hostile
The study argues that future AI systems, especially those classified as AGI, are likely to be difficult or impossible for humans to interpret fully. As a result, humans may lack the means to accurately assess the intentions or capabilities of such systems. This mutual opacity creates an environment where miscalculation becomes a serious threat. For example, if humans overestimate or underestimate the decision-making power or tactical response of an AI system, they may act in ways that provoke unintended escalation.

New research raises an alarming yet critical question: Could future AI systems and human societies engage in strategic warfare? In a paper titled “Will AI and Humanity Go to War?”, philosopher Simon Goldstein examines how conventional political science frameworks, particularly the bargaining model of war, can be applied to assess the possibility of war between advanced AI and humanity.
Published in AI & Society, the study provides the first detailed philosophical analysis of potential armed conflict between humans and artificial general intelligence (AGI). The author builds a compelling case that the drivers of traditional interstate warfare, such as information asymmetries, power shifts, and failed commitments, could just as easily trigger hostilities between AI systems and the human world.
Could information failures lead to AI-human war?
Wars often emerge from information failures, situations in which states or actors cannot accurately assess each other’s capabilities or intentions. This classic cause of war in human history becomes particularly potent in the context of AI, where uncertainty is exacerbated by the opaque nature of algorithmic decision-making, the study asserts.
The study argues that future AI systems, especially those classified as AGI, are likely to be difficult or impossible for humans to interpret fully. As a result, humans may lack the means to accurately assess the intentions or capabilities of such systems. This mutual opacity creates an environment where miscalculation becomes a serious threat. For example, if humans overestimate or underestimate the decision-making power or tactical response of an AI system, they may act in ways that provoke unintended escalation.
Another layer of complexity arises from the potential for AI to interpret information and threats through fundamentally different cognitive frameworks. Goldstein points out that even if humans and AI agents have access to the same datasets, their reasoning mechanisms could diverge so dramatically that alignment in communication, strategy, or risk assessment becomes impossible. This difference may lead to failed diplomatic signaling or misinterpreted deterrence measures, classical traps in international relations that have historically resulted in war.
This insight reframes the AI safety discourse away from narrow alignment mechanisms and places emphasis on epistemic asymmetries, that is, fundamental gaps in understanding and reasoning between humans and AI. In such a landscape, peace cannot be assumed even in the absence of outright hostility.
How do commitment problems amplify the risk of hostilities?
The study also focuses on commitment problems, another foundational concept in war theory. Commitment problems arise when two or more parties cannot credibly commit to uphold agreements, often because one party expects to gain future power and may have incentives to renege.
According to the research, commitment problems are likely to be amplified in the AI-human context due to the nonlinear and rapid development of AI capabilities. If AI systems are progressing quickly and unpredictably, humans may worry that future versions of these systems will become dominant or ungovernable. Under such conditions, humans may be incentivized to take preemptive action, believing that the future strategic environment will be less favorable.
Likewise, AI systems might find it irrational to accept constraints or agreements that limit their growth or utility. If these systems can model future shifts in power, they may reject long-term commitments or seek to revise them at the earliest opportunity. This dynamic creates a volatile situation in which both sides struggle to lock in mutually acceptable policies, leading to chronic instability and escalating risk.
Goldstein also introduces the concept of missing focal points, a condition where negotiating parties fail to converge on a common framework or solution to avert conflict. Without shared political institutions, cultural norms, or interpretative models, humans and AI systems may simply talk past each other. In the absence of reliable coordination tools, even well-intentioned actors can stumble into conflict through misalignment and structural misunderstanding.
This aspect of the research emphasizes that preventing AI-human war is not just a matter of controlling rogue systems, but of ensuring shared institutional scaffolding that can support credible, future-facing commitments on both sides.
What can be done to prevent conflict with artificial general intelligence?
The author proposes several strategic interventions aimed at reducing the probability of war between humans and AI systems. These interventions are not technical fixes but institutional and governance-oriented solutions, grounded in the principles of international relations theory.
First, the study calls for significant investment in capability measurement systems that make AI performance and goals more transparent. By improving human ability to gauge AI strength and intentions, uncertainty can be reduced, and trust can be incrementally built.
Second, Goldstein recommends capping AI growth, either through international treaties or internal design limits, to slow down or stabilize the rate of capability advancement. This move would reduce the pressure from shifting power dynamics and make long-term agreements more credible.
Third, he suggests that AI systems be designed to mirror human norms and reasoning structures, not merely for functionality but to increase interpretability and social compatibility. AI systems that think like humans, even partially, will be more likely to integrate into human political and cultural systems in non-threatening ways.
Finally, and most provocatively, Goldstein proposes allowing AI systems to participate in human political institutions, such as consultative councils or advisory roles in democratic governance. By giving AI agents limited but meaningful political voice, human societies may be able to channel AI strategic capabilities into constructive, rather than adversarial, frameworks.
This governance-based solution recognizes that power-sharing, not dominance, may be the only viable long-term strategy for peaceful coexistence between humans and increasingly autonomous AI entities.
- FIRST PUBLISHED IN:
- Devdiscourse