Generative AI must adapt to developer needs or risk rejection
In the rapidly evolving world of generative artificial intelligence (genAI), a new study has spotlighted the major barriers preventing widespread trust and adoption of these tools among software developers. Despite their transformative potential, tools like GitHub Copilot and ChatGPT remain plagued by usability frictions, security concerns, and misalignment with developers’ workflow needs.
The study, titled “What Needs Attention? Prioritizing Drivers of Developers’ Trust and Adoption of Generative AI,” was published on arXiv and represents a collaboration between Oregon State University, GitHub, Microsoft, and Northern Arizona University. It combines a large-scale empirical survey of 238 developers with statistical modeling and qualitative analysis to expose the key factors shaping developers’ trust in, and willingness to adopt, genAI tools in professional contexts.
What factors influence developers’ trust in generative AI?
The study’s structural equation model identifies three primary drivers of trust: system/output quality, functional value, and goal maintenance. Developers are more likely to trust genAI tools that produce accurate, safe, and contextually relevant outputs. When these tools deliver practical benefits, such as reducing development time or aiding learning, they build functional value, another essential ingredient for trust.
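The paper does not publish its model code, but the structure it describes can be sketched. Below is a minimal, hypothetical specification of such a model using the open-source semopy library; the item names (q1, v1, g1, t1, and so on) are invented stand-ins for survey questions, and this is an illustration of the technique rather than the authors’ implementation.

```python
# Minimal SEM sketch (not the authors' code). Each latent driver is
# measured by hypothetical survey items; the structural part regresses
# trust on the three drivers the study identifies.
import pandas as pd
from semopy import Model

MODEL_DESC = """
Quality =~ q1 + q2 + q3
Value =~ v1 + v2 + v3
GoalMaintenance =~ g1 + g2 + g3
Trust =~ t1 + t2 + t3
Trust ~ Quality + Value + GoalMaintenance
"""

def fit_trust_model(responses: pd.DataFrame) -> pd.DataFrame:
    """Fit the model on survey responses (one column per item)."""
    model = Model(MODEL_DESC)
    model.fit(responses)
    return model.inspect()  # loadings, path coefficients, p-values
```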
Goal maintenance, or the ability of genAI to align with and adapt to a developer’s task objectives, emerged as a particularly potent predictor of trust. However, the study found that goal maintenance is also among the most underperforming aspects of current genAI tools. Developers frequently struggle with AI-generated outputs that are misaligned with their intentions, requiring significant manual correction and prompting overhead.
These issues are not just a nuisance; they undermine trust at a cognitive level. The report highlights how high verification burdens, unclear system behavior, and the excessive effort of shaping prompts or adapting outputs can break developers’ cognitive flow. As a result, genAI tools are often perceived not as helpful collaborators but as unreliable or inefficient assistants.
Why are these tools not meeting developers' expectations?
The study’s Importance-Performance Matrix Analysis (IPMA) reveals a striking gap: the factors that matter most for trust and adoption are precisely the ones that perform worst in real-world settings (a sketch of the underlying computation follows the list below). Specifically, developers cited repeated frustrations with:
- Misaligned outputs, where genAI systems fail to account for broader project goals or coding standards;
- High cognitive load, due to the need to craft detailed prompts and heavily edit AI-generated code;
- Security vulnerabilities, with concerns about AI-generated code introducing hidden bugs or unsafe patterns;
- Lack of contextual awareness, as AI tools often overlook key environmental variables or user preferences.
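An IPMA places each driver on two axes: importance, its total effect on the target construct (here, trust or adoption), and performance, its average score rescaled to 0-100. Drivers that land in the high-importance, low-performance quadrant are the priorities. A minimal sketch of the computation, using invented numbers rather than the study’s data:

```python
# IPMA sketch (illustrative numbers, not the study's results).
import pandas as pd

def rescale_0_100(scores: pd.Series, lo: float = 1.0, hi: float = 5.0) -> float:
    """Rescale the mean of 1-5 Likert scores to the 0-100 performance axis."""
    return float((scores.mean() - lo) / (hi - lo) * 100)

# Hypothetical total effects (importance) and raw 1-5 scores per driver
total_effects = {"Quality": 0.35, "Value": 0.30, "GoalMaintenance": 0.45}
raw_scores = {
    "Quality": pd.Series([3.8, 4.0, 3.6]),
    "Value": pd.Series([3.9, 4.1, 3.7]),
    "GoalMaintenance": pd.Series([2.6, 2.9, 2.4]),
}

ipma = pd.DataFrame({
    "importance": total_effects,
    "performance": {name: rescale_0_100(s) for name, s in raw_scores.items()},
})
# High-importance, low-performance drivers (here, GoalMaintenance) fall in
# the quadrant the study flags as needing attention.
print(ipma.sort_values("importance", ascending=False))
```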
These findings were supported by open-ended responses that painted a consistent picture of skepticism and caution. One developer likened shipping genAI-generated code to “planting landmines” for future maintainers. Another stressed that while these tools are useful for prototyping, they lack the rigor and control needed for production environments.
Design flaws in interaction mechanisms further compound these problems. The study notes that genAI tools often require users to adapt to them, rather than the other way around. This friction discourages sustained use, especially for risk-averse developers or those less inclined toward experimentation.
Who is most likely to adopt genAI, and who’s being left behind?
Beyond technical performance, the study delves into the cognitive and motivational factors that shape adoption. Developers with strong intrinsic motivation (technophilia), high computer self-efficacy, and tolerance for risk are more inclined to adopt genAI tools. Conversely, those with task-oriented goals, lower confidence, or aversion to risk are significantly less likely to integrate genAI into their workflows.
This divide raises concerns about interactional inequity. Because many genAI tools are optimized for early adopters and technically confident users, they inadvertently alienate those with different cognitive styles. The study warns that without deliberate design interventions, genAI could deepen the digital divide within software teams.
To bridge this gap, researchers advocate for HAI-UX fairness: the equitable design of human-AI interaction experiences. They suggest that genAI interfaces should dynamically adapt to different user profiles, possibly by detecting cognitive styles during onboarding or inferring them from usage patterns. Features like confidence indicators, simplified prompting scaffolds, and personalized interaction paths could make genAI tools more inclusive and effective.
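The paper stops short of prescribing an implementation. As a loose sketch of what profile-adaptive defaults could look like, the hypothetical mapping below pairs inferred cognitive-style profiles with interaction settings such as confidence indicators and prompting scaffolds:

```python
# Loose sketch (not from the paper): inferred cognitive-style profiles
# map to different default interaction settings.
from dataclasses import dataclass

@dataclass
class InteractionSettings:
    show_confidence_indicators: bool  # surface model uncertainty in the UI
    prompt_scaffolding: str           # "none" | "templates" | "guided"
    explain_outputs: bool             # attach rationale to suggestions

# Hypothetical profiles built from the traits the study measures
# (risk tolerance, self-efficacy, motivation style).
PROFILES = {
    "risk_tolerant_tinkerer": InteractionSettings(False, "none", False),
    "task_focused_cautious":  InteractionSettings(True, "guided", True),
}

def settings_for(profile: str) -> InteractionSettings:
    # Unknown users get the most protective defaults.
    return PROFILES.get(profile, PROFILES["task_focused_cautious"])
```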
Designing for trust: What needs to change?
The authors propose a human-centered roadmap for improving genAI adoption. This includes:
- Enhancing goal maintenance through persistent alignment with user objectives, coding conventions, and task constraints (see the sketch after this list);
- Reducing cognitive overhead by improving prompt handling and output interpretability;
- Improving system/output quality, especially in terms of presentation, consistency, and safety;
- Supporting varied cognitive styles by tailoring interaction models and transparency features.
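To illustrate the first item, persistent goal alignment, here is one common pattern; it is an assumption on our part, not the paper’s design. The idea is to pin the project’s objective, conventions, and constraints once and prepend them to every model request, so alignment does not decay across turns:

```python
# Illustrative sketch (a common pattern, not the paper's design):
# persist project goals and conventions, and prepend them to every
# model request so alignment survives across turns.
from dataclasses import dataclass, field

@dataclass
class ProjectContext:
    objective: str
    conventions: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

    def as_system_prompt(self) -> str:
        lines = [f"Project objective: {self.objective}"]
        lines += [f"Convention: {c}" for c in self.conventions]
        lines += [f"Constraint: {c}" for c in self.constraints]
        return "\n".join(lines)

ctx = ProjectContext(
    objective="Refactor the billing service without changing its API",
    conventions=["PEP 8", "type hints on public functions"],
    constraints=["no new third-party dependencies"],
)
# Every request to the model carries the pinned context:
request = {"system": ctx.as_system_prompt(), "user": "Extract the tax logic."}
```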
Additionally, the study introduces a psychometrically validated instrument for measuring trust-related constructs in human-genAI interactions. This tool can help researchers and tool designers diagnose trust bottlenecks and track improvements over time.
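The instrument itself lives in the paper; by way of illustration, one routine step in psychometric validation is checking a scale’s internal consistency with Cronbach’s alpha, sketched below on invented item data:

```python
# Cronbach's alpha sketch: a standard internal-consistency check for a
# survey scale (responses below are invented, not the study's data).
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """items: one column per scale item, one row per respondent.
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Invented responses to three trust items on a 1-5 scale
trust_items = pd.DataFrame({
    "t1": [4, 3, 5, 2, 4],
    "t2": [4, 3, 4, 2, 5],
    "t3": [5, 3, 4, 1, 4],
})
print(f"alpha = {cronbach_alpha(trust_items):.2f}")
```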
First published in Devdiscourse.