AI literacy may decide who gains and who falls behind in the workplace
Generative AI could deliver productivity gains in workplaces and classrooms, but only when users remain engaged enough to check, adapt and learn from its outputs, a new study finds. The study suggests that AI assistance can create a productivity paradox: more powerful tools may sometimes reduce output, weaken skills and widen gaps between users who know how to work with AI and those who passively rely on it.
The study, titled “Human-AI Productivity Paradoxes: Modeling the Interplay of Skill, Effort, and AI Assistance,” was published as an arXiv preprint. The paper develops a formal model of human-AI interaction to explain why empirical evidence on generative AI’s impact remains mixed, with some studies showing productivity gains and others showing losses in output quality, learning and long-term skill formation.
AI can substitute for human effort in the short run, but that substitution may reduce the very effort needed to maintain skills, detect mistakes and build future expertise. The model identifies three mechanisms behind this risk: skill development, AI unreliability and AI literacy.
AI assistance can reduce effort and trigger long-term skill loss
In a basic model, human skill, human effort and AI assistance all contribute to task output. In this simple setting, AI works as expected: it reduces the effort needed to produce an outcome while improving productivity. For tasks such as writing, coding, translation or customer service, AI can act as a substitute for part of the work that people would otherwise perform.
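The paper's exact equations are not reproduced here, but the baseline logic can be sketched in a few lines of Python. Everything below, from the Cobb-Douglas-style functional form to the parameter values, is an illustrative assumption rather than the authors' specification:

```python
# Illustrative baseline (assumed functional form, not the paper's):
# output rises with skill, effort and AI assistance, so AI can
# substitute for part of the effort a task would otherwise require.

def output(skill: float, effort: float, ai: float,
           a: float = 0.5, b: float = 0.3, c: float = 0.2) -> float:
    """Cobb-Douglas-style production: more AI assistance raises
    output at any fixed level of skill and effort."""
    return (skill ** a) * (effort ** b) * ((1.0 + ai) ** c)

print(output(skill=1.0, effort=1.0, ai=0.0))  # baseline, no AI: 1.0
print(output(skill=1.0, effort=0.7, ai=1.0))  # less effort, AI help: ~1.03
```

In this static picture the trade looks strictly favorable: the AI-assisted worker produces slightly more while exerting 30 percent less effort.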
But the model changes when skill development is treated as an ongoing process. The authors argue that skills do not remain fixed. They rise or fall depending on how much effort people exert over time. If workers or students use AI in ways that reduce their active engagement, they may lose opportunities to practice, learn and improve. The short-term gain from assistance can then be offset by long-term deterioration in human capability.
This produces the first productivity paradox: AI may improve productivity for a person at a fixed skill level, but when AI also changes the overall skill distribution by reducing effort, the long-term result can be lower productivity. The authors compare the mechanism to a reversal effect in which gains visible within individual skill groups can disappear or reverse once changes across the whole population are considered.
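Continuing the sketch above, and again using assumed dynamics and parameters rather than the paper's, a toy simulation shows how the reversal can play out: the AI-assisted, low-effort worker wins every short-run comparison, yet ends up less productive once skill settles at its new, lower level.

```python
# Skill dynamics (assumed form): a fraction of skill decays each period
# and practice rebuilds it, so sustained low effort drags skill down.

def next_skill(skill: float, effort: float,
               decay: float = 0.1, learn: float = 0.15) -> float:
    return (1.0 - decay) * skill + learn * effort

def long_run_output(effort: float, ai: float, periods: int = 100) -> float:
    """Let skill converge under a fixed effort level, then measure
    output with the production sketch defined earlier."""
    skill = 1.0
    for _ in range(periods):
        skill = next_skill(skill, effort)
    return output(skill, effort, ai)

print(long_run_output(effort=1.0, ai=0.0))  # no AI, full effort: ~1.22
print(long_run_output(effort=0.5, ai=1.0))  # AI, half effort:    ~0.81
```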
Knowledge-intensive organizations such as firms and schools often measure AI success through immediate output, such as faster drafting, faster coding or quicker problem-solving. The study suggests that these metrics can miss hidden losses if AI use reduces the effort needed for learning, debugging, reasoning or conceptual understanding.
The authors point to emerging evidence from education and software development that supports this concern. They cite a randomized controlled trial in which developers learning a new Python library with AI assistance did not show statistically significant productivity gains but scored 17 percent lower on follow-up evaluations of conceptual understanding, code reading and debugging. The paper notes that interaction patterns requiring higher cognitive effort, such as asking conceptual questions or manually adapting AI suggestions, helped preserve skill formation, while passive delegation did not.
AI does not merely change the speed of work; it also changes the structure of effort. When users remain cognitively involved, AI may support learning and output. When they offload too much, it can become a shortcut that weakens future performance.
The authors’ model identifies a condition under which this problem becomes more likely: when skill development is more sensitive to human effort than productivity is to AI assistance. In real settings, this means AI becomes risky when the effort people stop making is more important for future learning than the AI boost is for current output. In such cases, greater AI assistance can reduce steady-state productivity.
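Schematically, and in notation assumed here rather than taken from the paper, the condition amounts to comparing two channels in the total derivative of long-run output with respect to AI assistance:

```latex
% Long-run output Y(s^*(e), e, a), with AI assistance a, effort e
% falling as a rises (de/da < 0), and steady-state skill s^*(e).
\frac{\mathrm{d}Y}{\mathrm{d}a}
  \;=\;
  \underbrace{\frac{\partial Y}{\partial a}}_{\text{direct AI boost}}
  \;-\;
  \underbrace{\left(
      \frac{\partial Y}{\partial e}
      + \frac{\partial Y}{\partial s}\,\frac{\mathrm{d}s^{*}}{\mathrm{d}e}
    \right)\left|\frac{\mathrm{d}e}{\mathrm{d}a}\right|}_{\text{effort and skill foregone}}
```

Productivity declines whenever the second term dominates, which is exactly the case the authors flag: skill formation responding strongly to effort while output responds only weakly to AI assistance.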
The finding is significant for organizations planning large-scale AI deployment. A company may see early productivity improvement after giving employees AI tools, but later face knowledge gaps if workers rely too heavily on generated answers. Schools may see students complete assignments faster, but later discover weaker understanding. The risk is not merely that AI makes an occasional mistake. It is that routine AI reliance can change how people build and retain expertise.
Unreliable AI can cannibalize human effort and lower productivity
The second productivity paradox arises from AI unreliability. Generative AI systems can be highly capable on some tasks and inaccurate on others. They can produce confident answers that are wrong, incomplete or poorly suited to the specific problem. The study models this by treating AI assistance as uncertain: the tool may provide useful help in some cases and fail in others.
The key problem is that users may decide how much effort to exert before they know whether the AI output is reliable. If they expect strong AI help, they may reduce their own effort. But if the AI fails, the loss of human effort can outweigh the benefit of AI assistance. The result is an unexpected decline in productivity.
The authors describe this mechanism as effort cannibalization. As AI becomes more capable on average, users may trust it more and work less. If the AI is still unreliable, that reduced effort can leave users exposed when the system gives weak or wrong support. Under certain production conditions, the model shows that increasing AI proficiency while holding reliability constant can reduce productivity.
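A stylized Python sketch of effort cannibalization, under strong assumptions that are not the paper's model: effort and AI help act as substitutes, the AI delivers help q with probability p and nothing otherwise, and the user plans effort around the AI's average assistance before knowing whether it will succeed.

```python
# Assumed setup: output is (effort + assistance)^b, effort costs w per
# unit, and the user chooses effort as if assistance equals its mean
# p*q, so a more proficient AI crowds out more human effort up front.

def expected_output(q: float, p: float = 0.6,
                    b: float = 0.5, w: float = 0.5) -> float:
    """Expected output when the AI delivers q with probability p, else 0."""
    target = (b / w) ** (1.0 / (1.0 - b))  # planned effort + mean AI help
    effort = max(0.0, target - p * q)      # effort cannibalized by trust
    return p * (effort + q) ** b + (1 - p) * effort ** b

for q in (0.0, 0.5, 1.0):
    print(q, round(expected_output(q), 3))
# 0.0 -> 1.0, 0.5 -> 0.992, 1.0 -> 0.963: average output falls as the
# AI grows more proficient while its reliability stays fixed.
```

The decline comes entirely from the failure state: the user has already surrendered the effort that would have cushioned a bad AI answer.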
This finding challenges a common assumption in AI policy and business strategy: that improving model output quality will automatically improve productivity. The study argues that average output quality is not enough; reliability is also a key consideration. A tool that is brilliant some of the time and misleading at other times can be worse for productivity than a less powerful tool that users approach with greater caution.
The model also helps explain why empirical studies of AI productivity have produced conflicting results. In some settings, AI improves output because the task falls within the system’s capability and users can effectively apply the result. In other settings, especially complex work beyond the reliable frontier of the tool, AI can disrupt performance. The paper links this to the idea of a jagged technological frontier, where AI performs well on some difficult tasks but fails on others that may appear simpler.
In the workplace, AI performance should not be judged only by benchmark scores or average gains. Organizations need risk-based measures that capture variability, failure rates and the cost of mistakes. A customer service agent, software developer, analyst or student may not suffer equally from all AI errors. Some errors are easy to detect and fix. Others are subtle, costly and likely to be accepted if the user has reduced oversight.
Unreliable AI becomes especially dangerous when users cannot easily verify the output, the model suggests. In that setting, they may optimize their effort based on expectations rather than actual quality. If the tool usually helps but sometimes fails in important ways, users can become over-reliant. Productivity then falls not because AI has no value, but because the system changes human behavior in ways that leave less effort available when it is most needed.
Developers may need to build interfaces that slow users down at critical moments, request clarifying input, flag uncertainty or encourage review. Instead of maximizing frictionless automation in every case, systems may need to preserve enough human effort to protect quality and learning.
Usage frictions can be beneficial when they prompt users to think, check and adapt. In education, this could mean AI tools that ask students to explain reasoning before revealing answers. In software development, it could mean tools that request assumptions, tests or manual review before producing final code. In professional settings, it could mean systems that provide uncertainty signals and require human validation for high-impact decisions.
AI literacy could determine who gains and who falls behind
The third mechanism identified in the study is AI literacy. The authors define AI literacy as the ability to collaborate effectively with AI and critically evaluate its outputs. This includes recognizing when AI is wrong, knowing how to adapt a suggestion, and understanding when additional human effort is needed.
The model shows that AI literacy can lead to skill polarization. Users with higher skill and stronger AI literacy are better able to detect failures and compensate with extra effort. That effort helps them maintain or improve their skills. Lower-skill users, by contrast, may be less able to identify poor outputs and more likely to over-rely on AI. Over time, the gap between these groups can widen.
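A final toy sketch, again under assumed forms and parameters rather than the paper's: if AI literacy is modeled as the probability of catching an AI failure and compensating with extra, skill-building effort, two users with the same tool drift apart.

```python
import random

# Literacy here (an assumption for illustration) is the chance of
# spotting an AI failure and responding with extra effort, which then
# feeds the same decay-and-practice skill dynamic sketched earlier.

def long_run_skill(literacy: float, periods: int = 200,
                   fail_rate: float = 0.3, base_effort: float = 0.4,
                   extra_effort: float = 0.8, decay: float = 0.1,
                   learn: float = 0.15, seed: int = 0) -> float:
    rng = random.Random(seed)
    skill = 1.0
    for _ in range(periods):
        failed = rng.random() < fail_rate
        caught = failed and rng.random() < literacy
        effort = extra_effort if caught else base_effort
        skill = (1.0 - decay) * skill + learn * effort
    return skill

print(round(long_run_skill(literacy=0.9), 2))  # high literacy: higher skill
print(round(long_run_skill(literacy=0.2), 2))  # low literacy: lower skill
```

The gap emerges even though both users face the same tool and the same failure rate; only their responses to failure differ.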
This finding complicates a popular claim that generative AI will naturally level the playing field by helping lower-skilled workers or students catch up. The authors acknowledge that some empirical studies show AI can benefit lower-skilled users in certain tasks. But their model shows that when AI output is unreliable and literacy differs across users, AI can also amplify inequality.
The mechanism resembles a rich-get-richer pattern. Users with early advantages are better positioned to use AI productively. Their stronger domain knowledge helps them question, verify and improve AI outputs. That keeps them engaged and reinforces their skill base. Users with weaker skills may receive the same AI tool but lack the judgment needed to use it safely. Instead of catching up, they may become more dependent.
This is a major concern for schools, universities and firms trying to democratize expertise through AI. Providing access to AI tools is not the same as providing the ability to benefit from them. If users lack training in verification, prompt framing, domain reasoning and error detection, AI adoption may widen the distance between high-performing and low-performing groups.
AI rollout should be paired with AI literacy training, especially in knowledge-critical settings. Workers and students need to know not only how to ask AI for help, but also how to challenge it, test it and learn from the interaction. The authors argue that institutions anticipating skill shortfalls should introduce remedial training and knowledge management programs.
The research also suggests that AI system designers have a role in shaping user behavior. Tools designed as instant answering machines may encourage passive delegation. Tools designed as dialogue partners may encourage users to reason, revise and stay involved. The authors cite recent product directions such as interactive interview modes and study-focused interfaces as examples of systems that can shift AI use from answer delivery toward guided exploration.
The paper extends the concern to regulation as well, noting that the European Union’s Artificial Intelligence Act requires organizations deploying AI systems to take account of users’ knowledge, experience, education and training, as well as the context of use.
To sum up, GenAI can increase productivity, but it can also produce hidden losses when it reduces effort, weakens skill development, or encourages over-reliance on unreliable outputs. The biggest risk is not simply that AI will make mistakes. It is that humans will stop doing the work needed to notice, correct and learn from those mistakes.
FIRST PUBLISHED IN: Devdiscourse

