Securing decentralized AI: Blockchain system proposed for identity and integrity

CO-EDP, VisionRI | Updated: 22-08-2025 16:15 IST | Created: 22-08-2025 16:15 IST

A new study published in the journal Big Data and Cognitive Computing lays out a blockchain-based system designed to ensure that decentralized artificial intelligence (DAI) devices learn only from verified, trustworthy data. The research underscores how this model could reshape the way autonomous systems, from smartphones and cars to robots, self-train while avoiding risks tied to malicious inputs and unreliable participants.

The paper, “Proposal of a Blockchain-Based Data Management System for Decentralized Artificial Intelligence Devices,” sheds light on the vulnerabilities of existing centralized AI models and sets out a decentralized data management framework that uses blockchain to secure integrity, identity, and accountability across participants in the AI ecosystem.

Why data validation is crucial for decentralized AI

Artificial intelligence has shifted from being a centralized service delivered through cloud platforms to being embedded in a wide range of consumer devices. This trend, known as decentralized AI, enables devices such as autonomous vehicles, mobile phones, and service robots to operate independently. However, the study warns that without careful safeguards, these systems face the risk of learning from contaminated or unreliable data sources.

The authors argue that AI trained with undisclosed, unvalidated datasets can propagate privacy violations, bias, deepfakes, and misinformation. Centralized systems also fail in environments where connectivity to servers is limited, undermining their utility. Decentralized AI offers resilience by allowing devices to train locally, but it requires a reliable supply of validated data to remain effective.

To address this challenge, the proposed framework introduces a mechanism where devices can access data that has been vetted by third-party validators. The system not only enables learning from trusted knowledge data but also blocks access for devices that repeatedly generate low-quality or harmful outputs. By doing so, it attempts to establish an ecosystem of accountability that deters misuse and improves the reliability of AI at the edge.
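The accountability mechanism described above can be sketched in a few lines. This is an illustrative stand-in, not the paper's implementation: the class name, strike threshold, and device identifiers are all assumptions made for the example.

```python
# Hypothetical sketch of the accountability idea: devices that repeatedly
# generate low-quality outputs lose access to validated data.
# Names and the three-strike threshold are illustrative assumptions.

class AccessRegistry:
    """Tracks quality strikes per device and revokes access past a threshold."""

    def __init__(self, max_strikes: int = 3):
        self.max_strikes = max_strikes
        self.strikes: dict[str, int] = {}

    def report_low_quality(self, device_id: str) -> None:
        # A validator or peer reports a harmful or low-quality output.
        self.strikes[device_id] = self.strikes.get(device_id, 0) + 1

    def may_access(self, device_id: str) -> bool:
        # Devices below the strike threshold keep access to validated data.
        return self.strikes.get(device_id, 0) < self.max_strikes


registry = AccessRegistry(max_strikes=3)
for _ in range(3):
    registry.report_low_quality("device-42")
print(registry.may_access("device-42"))  # False after three strikes
print(registry.may_access("device-7"))   # True: no strikes recorded
```

In the paper's framework this bookkeeping would live on the blockchain rather than in a local object, so revocation decisions remain auditable by all participants.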

How the blockchain-based model works

The framework features a role model consisting of consumers, producers, suppliers, validators, and the blockchain system. Consumers are the decentralized AI devices that require validated knowledge data for training. Producers create raw or knowledge data, suppliers store it, and validators assess its integrity, quality, and compliance with rules before approving it. The blockchain itself functions as a trust anchor that manages identity, access control, and historical records of data processing.

The model adopts a hybrid on-chain and off-chain design. Only critical records such as hashes of datasets, validation results, and transaction histories are stored on-chain to ensure traceability. The larger payloads (raw data, knowledge data, and validation details) reside off-chain for efficiency. This division balances scalability with accountability, ensuring that devices can verify the authenticity of datasets without overloading the blockchain.
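The hash-anchoring pattern behind this split can be shown with standard-library hashing. A plain list stands in for the blockchain and a dictionary for supplier storage; these structures, and all names here, are assumptions for illustration, not the paper's design.

```python
import hashlib

# Sketch of the hybrid design: bulky payloads live off-chain, while only
# their SHA-256 digests are anchored "on-chain". The ledger is a plain
# list standing in for a real blockchain.

off_chain_store: dict[str, bytes] = {}   # supplier-side storage
on_chain_ledger: list[dict] = []         # digests and metadata only

def register_dataset(dataset_id: str, payload: bytes) -> None:
    # Supplier keeps the payload off-chain and records its digest on-chain.
    off_chain_store[dataset_id] = payload
    on_chain_ledger.append({
        "dataset_id": dataset_id,
        "sha256": hashlib.sha256(payload).hexdigest(),
    })

def verify_dataset(dataset_id: str) -> bool:
    """A consumer re-hashes the off-chain payload and checks it on-chain."""
    payload = off_chain_store[dataset_id]
    record = next(r for r in on_chain_ledger if r["dataset_id"] == dataset_id)
    return hashlib.sha256(payload).hexdigest() == record["sha256"]

register_dataset("knowledge-001", b"validated knowledge data")
print(verify_dataset("knowledge-001"))   # True: payload untampered

off_chain_store["knowledge-001"] = b"tampered data"
print(verify_dataset("knowledge-001"))   # False: hash mismatch detected
```

Because only fixed-size digests go on-chain, the ledger stays small no matter how large the datasets grow, which is the scalability argument the authors make.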

The service flow begins with data producers submitting knowledge or raw data to suppliers. These suppliers store the information off-chain while registering its integrity proof on-chain. Validators then fetch the data, confirm its integrity against the recorded hash, and assess it for issues such as personal information exposure, bias, intellectual property violations, or unsafe content. The outcomes of these checks are stored both off-chain and on-chain, providing a permanent, verifiable record. Only when this process is complete can AI devices access the validated datasets for self-learning.
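The ordering of this service flow, from submission through validation to consumer access, can be modeled as a small state machine. The status names and check categories below are illustrative assumptions, not terminology from the paper, and the integrity check against the on-chain hash is assumed to have already passed.

```python
# Minimal sketch of the service flow: supplier registers data, a validator
# assesses it, and AI devices may learn only from validated datasets.
# Status names and check categories are illustrative assumptions.

CHECKS = ("privacy", "bias", "ip", "safety")  # validator content checks

datasets: dict[str, dict] = {}

def supplier_register(dataset_id: str) -> None:
    # Data arrives from a producer; the supplier marks it awaiting validation.
    datasets[dataset_id] = {"status": "PENDING", "results": {}}

def validator_assess(dataset_id: str, results: dict[str, bool]) -> None:
    # The validator records each content check; the dataset is approved
    # only if every check passes.
    rec = datasets[dataset_id]
    rec["results"] = results
    passed = all(results.get(check, False) for check in CHECKS)
    rec["status"] = "VALIDATED" if passed else "REJECTED"

def consumer_can_learn(dataset_id: str) -> bool:
    # AI devices may train only on datasets that completed validation.
    return datasets[dataset_id]["status"] == "VALIDATED"

supplier_register("d1")
validator_assess("d1", {c: True for c in CHECKS})
print(consumer_can_learn("d1"))  # True: all checks passed

supplier_register("d2")
validator_assess("d2", {"privacy": False, "bias": True,
                        "ip": True, "safety": True})
print(consumer_can_learn("d2"))  # False: failed the privacy check
```

The key property is the gate at the end: no dataset reaches a learning device unless its validation record exists and is positive.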

The authors emphasize that this model is designed to evolve continuously. Devices will not only learn from validated external knowledge but also combine it with their own local data, maintaining adaptive intelligence while reducing exposure to corrupted or malicious sources.

What security risks and standards are addressed

The research identifies a series of major security threats in decentralized AI environments, ranging from data theft and malware insertion to abuse of validator authority and identity theft targeting both devices and validators. To mitigate these risks, the authors outline corresponding security requirements anchored in cryptographic and governance measures.

Encryption of off-chain data using algorithms such as AES or SEED prevents theft of raw or knowledge data. Integrity checks with secure hashing ensure that no tampered dataset is used in learning. Security audits of validators and alignment with standards such as ISO/IEC 27000 establish procedural safeguards against collusion or negligence. The adoption of decentralized identifiers and verifiable credentials for both AI devices and validators strengthens trust in identity management across the ecosystem.
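The identity side of these measures can be hinted at with a simplified sketch. Real decentralized identifiers and verifiable credentials rely on asymmetric signatures; the stand-in below uses standard-library HMAC so the example stays self-contained. The issuer key, function names, and shared-key scheme are assumptions for illustration only, not the mechanisms specified in the paper.

```python
import hashlib
import hmac

# Simplified credential sketch: an issuer binds a tag to a device identity,
# and verifiers check that binding. A real DID/VC system would use
# asymmetric signatures; HMAC is a stdlib-only stand-in.

ISSUER_KEY = b"validator-issuer-secret"  # hypothetical issuer secret

def issue_credential(device_id: str) -> str:
    """Issuer derives a tag bound to the device identifier."""
    return hmac.new(ISSUER_KEY, device_id.encode(), hashlib.sha256).hexdigest()

def verify_credential(device_id: str, tag: str) -> bool:
    """A party holding the issuer key checks the identity binding."""
    expected = issue_credential(device_id)
    # compare_digest avoids timing side channels during comparison.
    return hmac.compare_digest(expected, tag)

cred = issue_credential("robot-007")
print(verify_credential("robot-007", cred))  # True: credential valid
print(verify_credential("robot-999", cred))  # False: identity mismatch
```

A credential presented by one device cannot be replayed under another identity, which is the property the framework needs to deter identity theft targeting devices and validators.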

The system is also designed with future regulation and standardization in mind. It aligns with ongoing international discussions around AI management frameworks, such as ISO/IEC 23894 for AI risk management and ISO/IEC 42001 for AI management systems. The authors argue that blockchain-based auditing and identity tracking will help meet compliance requirements under evolving AI governance regimes.

Beyond security, the proposed system carries significant policy, technological, and economic implications. By reducing dependence on undisclosed centralized datasets, it lowers costs for developers who would otherwise face continuous retraining expenses. It also accelerates device learning, as validated knowledge data can be ingested more efficiently than raw data collection. The ability to trade validated datasets in marketplaces opens commercial opportunities, while restrictions on unreliable devices serve as a deterrent against the misuse of AI.

Towards a standardized future for decentralized AI

The study calls for international collaboration to standardize such blockchain-based frameworks. The authors highlight ongoing work in ISO/TC 307/JWG 4, which focuses on blockchain and distributed ledger security, privacy, and identity. They also point to efforts in Korea to translate the model into an ICT standard through the Telecommunications Technology Association.

Future research, they suggest, must refine the methodologies for data validation, lifecycle management of device credentials, and the energy efficiency of on-chain and off-chain storage. They also stress the importance of creating viable service models for ranking and classifying knowledge data to ensure long-term adoption and scalability.

FIRST PUBLISHED IN: Devdiscourse