AI conversational agent enhances road safety and driver experience
A team of researchers from Virginia Tech has found that a ChatGPT-powered conversational agent can enhance both driving stability and the overall user experience compared with a pre-scripted voice assistant or no in-car assistant at all. The findings come from a motion-simulator study designed to assess the safety and usability of large language model (LLM)-driven assistants in real-time driving environments.
Titled “ChatGPT on the Road: Leveraging Large Language Model-Powered In-vehicle Conversational Agents for Safer and More Enjoyable Driving Experience”, the research systematically evaluates how a free-flowing, bidirectional dialogue with an AI agent influences driving performance, trust, and preference. The study responds to rising interest in, and concern about, integrating generative AI tools into high-stakes, real-world contexts such as road transportation.
Testing the impact of LLM-powered conversations on driving performance
The experiment involved 40 drivers in a controlled motion simulator, using a within-subjects design to compare three conditions: no in-car agent, a pre-scripted one-way voice assistant, and a ChatGPT-based conversational agent (CARA) capable of multi-turn, adaptive dialogue. Each participant experienced all three conditions, allowing for direct performance comparisons.
Driving data were recorded at a 100 Hz sampling rate to track key metrics: longitudinal and lateral acceleration variability, lane deviation, and steering torque variation. Participants also completed questionnaires measuring affect, social perception, perceived competence, animacy, likeability, perceived intelligence, perceived safety, and trust.
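As a rough illustration of how stability metrics like these might be derived from a 100 Hz log, the sketch below computes sample standard deviation as the variability measure. The paper's exact definitions are not given in this article, so both the formula choice and the sample values are assumptions:

```python
import statistics

SAMPLE_RATE_HZ = 100  # the study logged driving data at 100 Hz

def variability(samples):
    """Sample standard deviation: one plausible reading of 'variability'."""
    return statistics.stdev(samples)

# Hypothetical short excerpts of logged signals (fabricated values).
long_accel = [0.10, 0.12, 0.09, 0.11, 0.10]   # longitudinal accel, m/s^2
lat_accel = [0.02, 0.03, 0.01, 0.02, 0.02]    # lateral accel, m/s^2
lane_offset = [0.05, 0.04, 0.06, 0.05, 0.05]  # offset from lane centre, m

metrics = {
    "longitudinal_accel_variability": variability(long_accel),
    "lateral_accel_variability": variability(lat_accel),
    "lane_deviation_variability": variability(lane_offset),
}
```

Under this reading, lower values across conditions would correspond to the "reduced variability" the study reports for the ChatGPT-based agent.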
The results were unambiguous in certain areas. CARA consistently outperformed the other two conditions on several stability measures. Drivers using the ChatGPT-based assistant showed reduced variability in longitudinal acceleration, lateral acceleration, and lane positioning. The only metric showing no significant difference was steering torque variation. These findings counter a common concern that extended AI-driven conversations might distract drivers; instead, the data suggest that well-designed LLM-powered interactions can coincide with more stable driving behaviour.
How the agent affects user perceptions and trust
The study also examined how participants perceived and trusted the different systems. CARA earned higher ratings for competence than the pre-scripted agent, and its animacy scores were also significantly stronger. While perceived likeability, intelligence, and safety trended in CARA’s favour, these differences did not reach statistical significance.
Trust emerged as a particularly telling dimension. Affective trust, the emotional willingness to rely on the system, was significantly higher for CARA than for the alternatives. Cognitive trust, which relates to rational, evidence-based confidence, did not differ across conditions. This suggests that while the ChatGPT-based agent may foster stronger emotional connections and comfort, it does not necessarily alter users’ rational assessments of system reliability.
Preferences were also strongly skewed toward the AI-powered assistant. In exit interviews, 25 of 39 participants ranked CARA as their top choice, while 18 of 38 ranked the no-agent condition as the least preferred. These responses indicate that drivers not only tolerated but actively favoured having a responsive conversational partner in the vehicle, provided it could engage in natural, adaptive dialogue.
The nature of conversations and real-world implications
The research went beyond quantitative measures to analyse the content of conversations between drivers and CARA. Across 251 logged dialogue instances, seven primary categories emerged: real-time driving assistance, action requests such as controlling music or vehicle functions, location-based recommendations, general question-and-answer exchanges, entertainment topics, deliberate agent testing, and personal reflection or self-disclosure.
This variety illustrates the versatility of LLM-based systems in responding to driver needs beyond navigation or basic command execution. However, it also highlights areas where further safeguards may be necessary; for example, ensuring that casual or entertainment-focused exchanges do not inadvertently distract drivers during high-demand driving moments.
The findings represent an early but important indication that LLM-based in-vehicle agents can be deployed in ways that support, rather than impair, safe driving. Still, the authors stress the need for further research under more varied driving conditions, including real-world road testing, and in scenarios involving emergencies or malicious AI outputs. Issues of privacy, system robustness, and resilience to misleading or unsafe advice remain central concerns before such systems can be adopted widely.
- FIRST PUBLISHED IN: Devdiscourse