China centralizes, U.S. outsources: How nations shape AI through data labor

A landmark comparative study sheds light on data annotation, the hidden engine of artificial intelligence, and exposes the vastly different regulatory, labor, and geopolitical landscapes shaping this foundational activity in the United States and China. Titled “Global Data Empires: Analysing Artificial Intelligence Data Annotation in China and the USA” and published in Big Data & Society, the study systematically analyzes how these two digital superpowers build and manage the invisible labor behind machine learning.
By scrutinizing policy documents, platform structures, and labor practices, the study reveals how national ideologies and platform governance models have led to distinct “data annotation regimes,” each reflecting broader socio-political priorities. It offers a critical lens on how geopolitical rivalry, platform infrastructure, and data labor converge to reinforce global digital hierarchies.
The study identifies stark contrasts in the organization, control, and protection of data labor in China and the United States. In China, the annotation industry is characterized by tight state control, deep corporate-state partnerships, and a distinct emphasis on national strategic goals. Annotation firms are often located in lower-tier cities and embedded in China’s broader economic development plans, including policies such as “Revitalizing the Countryside” and the push for “New Infrastructure.”
Chinese annotation work is heavily shaped by “AI platformization,” wherein centralized platforms, such as Baidu’s EasyDL and Huawei’s ModelArts, structure the entire annotation pipeline. These platforms integrate client requests, labor management, workflow distribution, and model training into tightly controlled digital ecosystems. Workers are often framed as contributors to national technological progress, and annotation activities are supported by local governments through subsidies, tax relief, and urban planning measures. However, protections for workers remain minimal, with low wages, limited autonomy, and heavy surveillance embedded into the labor process.
In contrast, the United States operates a fragmented, market-driven annotation economy. The study highlights a “platform outsourcing” model where major firms, such as Amazon, Microsoft, and Google, depend on third-party vendors and crowd work platforms like Amazon Mechanical Turk and Appen. This distributed infrastructure allows for cost flexibility and task scalability but results in little oversight, sparse worker protections, and highly precarious employment conditions.
American annotators often operate as independent contractors, facing algorithmic task allocation, inconsistent pay, and opaque performance metrics. The work is framed as entrepreneurial gig labor rather than national service, reflecting broader neoliberal ideologies. Unlike China, where the state explicitly incentivizes data work, U.S. regulators have taken a mostly hands-off approach, leaving annotation largely unregulated.
How do these regimes reflect geopolitical and economic priorities?
The study argues that data annotation regimes are more than operational choices: they are embedded within national strategies for technological sovereignty. China’s approach reflects a techno-nationalist agenda, where annotation labor is mobilized to achieve leadership in AI development. Through policies like the “New Generation AI Development Plan,” annotation is treated as a strategic asset, with extensive coordination between firms and government ministries.
This orientation results in vertical integration, where state-backed platforms control not only the data supply chain but also its ideological framing. The study highlights how data labor is nationalized, symbolically positioned as both a patriotic duty and a key input in achieving China’s AI goals. The result is an ecosystem with strong top-down direction but limited space for worker agency or dissent.
In the U.S., annotation is shaped by market imperatives and corporate outsourcing logics. American tech companies maximize efficiency and flexibility by subcontracting annotation to global vendors or anonymized crowd workers. The lack of labor protections is a feature, not a flaw, ensuring minimal liability and cost. This system aligns with U.S. digital capitalism, where the extraction of labor value occurs at arm’s length, and worker invisibility is maintained.
Moreover, while both nations pursue AI supremacy, their pathways reflect divergent values. China’s centralized model prioritizes state-led digital infrastructure and ideological control, while the U.S. favors modularity, scalability, and capital accumulation. These differences shape not only the labor experience but also the global flows of data, expertise, and AI innovation.
What are the risks and implications of these divergent models?
The consequences of these annotation regimes are significant and far-reaching. In both countries, data annotators remain marginalized, their contributions undervalued despite being essential to machine learning. The study warns that this invisibility exacerbates labor exploitation and erodes the ethical foundations of AI systems. Annotators are often required to engage with sensitive, disturbing, or personally invasive content without psychological or legal support.
China’s model risks entrenching authoritarian control, as data annotation is yoked to state surveillance goals and techno-political agendas. Workers are not just underpaid but systematically surveilled, with annotation sometimes extending to facial recognition and social scoring systems. The lack of transparency raises human rights concerns and highlights the coercive potential of centralized AI labor regimes.
In the U.S., precarity and fragmentation pose different risks. Annotators work in informational silos, with limited knowledge of how their tasks feed into larger AI systems. This disconnect reduces their ability to resist unethical uses or demand accountability. Moreover, outsourcing allows companies to shift ethical responsibility, creating a vacuum in which no single actor assumes moral stewardship over AI’s foundational labor.
The study calls for stronger regulatory frameworks, global labor standards, and platform design reforms to improve transparency, worker protections, and democratic governance in AI development. It suggests that countries can no longer ignore the data workers behind intelligent systems, especially as generative AI and LLMs scale in capability and reach.
FIRST PUBLISHED IN: Devdiscourse