Generative AI adoption in finance may concentrate risk, not diversify it

A new risk assessment from Miranda McClellan of Schwarzman College, Tsinghua University, argues that large language models (LLMs) used for stock picking could synchronize investor behavior and amplify market shocks, creating vulnerabilities that traditional model-by-model checks miss.
The paper, “AI and Financial Fragility: A Framework for Measuring Systemic Risk in Deployment of Generative AI for Stock Price Predictions,” is published in the Journal of Risk and Financial Management. It proposes a quantitative way to gauge exogenous, market-level risk from LLM-driven trading and outlines technical, cultural, and regulatory levers to contain it.
What new risk does AI introduce into markets?
The study distinguishes underexplored exogenous risk (system-level instability triggered by external factors and coordinated behavior) from the endogenous risks usually tracked in AI evaluation (accuracy, bias, model failure). The author argues that when many firms deploy similar LLMs, their outputs can converge, producing simultaneous buy or sell signals that inflate bubbles or accelerate crashes across sectors and borders.
To capture this, McClellan develops a covariance-and-correlation metric that measures how closely different LLMs' stock-price predictions move together. The experiment spans eight general-purpose LLMs (GPT, Gemini, Claude, DeepSeek, Qwen, Doubao, Cohere, and Mistral) applied to 11 stocks across three sectors, technology, automobiles, and communications, and multiple time horizons. The goal is not to score accuracy but to quantify the relationship between models' predictions as a proxy for market-level coordination risk.
The finding is stark: all eight models were positively correlated. In other words, the systems tended to point in the same direction, raising the likelihood of synchronized trades that can magnify volatility, strain liquidity, and undermine resilience when conditions change. The paper warns that this kind of homogeneity converts LLM adoption into a potential amplifier of fragility, not just a faster decision engine.
The framework also clarifies why technical fixes alone may disappoint. Classic ensemble tactics, combining models to hedge errors, do not neutralize correlated signals; they can still push markets in lockstep. As models become more capable and easier to deploy, correlation pressures may rise, increasing the need for policy intervention that complements engineering controls.
How does the framework measure systemic risk?
Borrowing from modern portfolio theory, the study applies covariance and correlation coefficients to LLM outputs, treating each model like an asset whose risk can be diversified away only if the models' predictions are not all moving in tandem. The pipeline selects models and stocks, constructs prompts using financial indicators, gathers outputs over five timeframes, and then computes pairwise relationships among the predictions. The dataset spans U.S., European, and Chinese equities to reflect real-world exposure to geopolitical competition and cross-market contagion.
By focusing on the relationship between models rather than their absolute accuracy, the method surfaces systemic dynamics that typical AI benchmarks ignore. It directly tests whether a “portfolio” of trading LLMs offers diversification benefits or, as the results show, concentrates risk by moving together. That shift from model-centric metrics to market-centric diagnostics is the paper’s central contribution.
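The portfolio-theory intuition can be made concrete with the standard equal-weight variance formula. A sketch with illustrative numbers (not taken from the paper), assuming each model's signal has the same volatility `sigma` and an average pairwise correlation `rho`:

```python
import math

def portfolio_vol(n_models: int, sigma: float, rho: float) -> float:
    """Volatility of an equal-weight combination of n correlated signals:
    sigma_p^2 = sigma^2 / n + ((n - 1) / n) * rho * sigma^2."""
    var = sigma**2 / n_models + (n_models - 1) / n_models * rho * sigma**2
    return math.sqrt(var)

sigma = 0.10  # assumed per-model signal volatility
# Uncorrelated signals: combining 8 models cuts risk sharply.
print(portfolio_vol(8, sigma, 0.0))
# Strongly correlated signals: risk barely falls below a single model's.
print(portfolio_vol(8, sigma, 0.9))
```

With `rho = 0` the combined volatility shrinks roughly as 1/sqrt(n); with `rho` near 1 it stays close to `sigma`, which is why a portfolio of positively correlated LLMs concentrates rather than diversifies risk.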
The author positions this as a practical tool for firms and supervisors to monitor correlated behavior as GenAI proliferates in trading workflows. The metric, the paper argues, can inform cross-border coordination on AI governance to reduce manipulation risk and maintain stability during stress.
What policy guardrails does the study recommend?
Because correlated outputs make coordinated action more likely, the paper urges multi-level policy that brings together industry bodies, firms, and regulators rather than relying on self-regulation or single-jurisdiction rules. It proposes data-driven governance that requires transparency on model use, mandates routine systemic-risk analysis for high-impact AI trading tools, and empowers local enforcement to audit and sanction misuse.
The analysis includes jurisdiction-specific readings. For example, the paper suggests that China’s centralized financial oversight and existing AI registries could enable stricter pre-deployment controls and even exchange-level exposure caps on LLM-driven strategies, while still supporting innovation. By contrast, the U.S. and EU are encouraged to align transparency, audit capacity, and crisis-coordination playbooks across financial supervisors to avoid regulatory fragmentation.
Most importantly, the study asserts that regulatory attention must shift beyond consumer harm to include market structure risks. Without enforceable standards that map and mitigate correlated AI behavior, the sector could face algorithmic bandwagon effects: rapid, correlated trades that move prices irrespective of fundamentals.
Limitations and next steps
The author notes two boundaries to the evidence. First, proprietary hedge-fund models, often trained on richer data, were not included; the experiment used widely available general-purpose LLMs to reflect what many firms can access today. Second, the framework examines price-prediction use cases; other LLM deployments in finance, such as execution or risk operations, warrant separate testing. Even so, the results point to a material systemic-risk signal that should be measured and governed as adoption scales.
To support replication, the paper provides a public repository containing prompts, indicators, model responses, and the correlation calculations. The author calls for extending the analysis to additional sectors, such as healthcare or shipping, to map where correlated AI behavior might pose the greatest macro-financial risk.
FIRST PUBLISHED IN: Devdiscourse