New hybrid AI model accurately predicts harmful algal blooms using buoy data

A hybrid machine learning system embedded in autonomous water-monitoring buoys has demonstrated unprecedented accuracy in predicting harmful algal blooms (HABs), marking a significant leap in the global effort to safeguard water quality. The study, titled “BloomSense: Integrating Automated Buoy Systems and AI to Monitor and Predict Harmful Algal Blooms,” was published in the journal Water.
Developed by researchers at Hohai University in China, the system leverages deep learning models, ResNet-18 for spatial feature extraction and Long Short-Term Memory (LSTM) networks for temporal pattern detection, to predict concentrations of chlorophyll-a (Chl-a), a key indicator of algal bloom activity. The models are optimized to operate on energy-efficient buoys using four low-cost inputs: water temperature, pH, electrical conductivity (EC), and Chl-a. The framework integrates a real-time alert system that activates when Chl-a concentrations exceed 10 µg/L, aligning with World Health Organization thresholds for bloom risk.
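The alert rule described above can be sketched in a few lines. This is only an illustration: the function and constant names are hypothetical, and the strict comparison is an assumption based on the article's wording that alerts fire when concentrations "exceed" 10 µg/L.

```python
# Hypothetical illustration; only the 10 µg/L threshold comes from the article.
CHL_A_ALERT_UG_PER_L = 10.0


def should_alert(predicted_chl_a_ug_per_l: float) -> bool:
    """Return True when a predicted Chl-a concentration exceeds the threshold."""
    # Strict ">" is an assumption: the article says alerts fire when values "exceed" 10 µg/L.
    return predicted_chl_a_ug_per_l > CHL_A_ALERT_UG_PER_L


def alert_flags(predictions):
    """Map a sequence of predicted concentrations to bloom-alert booleans."""
    return [should_alert(p) for p in predictions]
```

For example, `alert_flags([3.2, 12.5])` flags only the second reading.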
How does BloomSense outperform traditional monitoring techniques?
The study targets core limitations in current water quality monitoring, notably the slow detection cycles of manual sampling and the spatial and temporal blind spots of satellite-based remote sensing. Using two solar-powered EM1250 buoys deployed in Spain’s As Conchas Reservoir, the researchers collected over 218,000 water quality records at 15-minute intervals across 38 months. This high-frequency dataset enabled the team to train and validate their machine learning models against dynamic, real-world conditions including seasonal nutrient fluctuations and sudden temperature shifts.
Unlike existing models that rely on expensive or infrequently collected data, BloomSense uses a cost-effective sensor configuration and advanced algorithms to enhance real-time predictive capability. Feature selection is performed using Random Forest (RF), followed by deep feature extraction via ResNet-18 and sequential data processing with LSTM networks. The integrated approach accounts for both spatial anomalies and time-dependent variations, two key factors in HAB formation and progression.
Evaluation across two datasets, representing inland (Dam Buoy) and dynamic (Beach Buoy) environments, showed that BloomSense consistently outperformed standalone ResNet-18 and LSTM models. In regression tasks, the proposed system achieved up to a 26.2% reduction in mean absolute error (MAE) compared to baseline models. For classification tasks such as triggering bloom alerts, F1-scores improved by as much as 70.2% relative to simpler architectures.
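For readers unfamiliar with the metrics quoted above, the sketch below shows how MAE, F1-score, and a percentage change between a baseline and an improved model are typically computed. It assumes the reported figures are relative changes rather than percentage-point differences, which the article does not spell out. All function names are illustrative and none of this is the study's own code.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = bloom alert)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_reduction(baseline, improved):
    """Percentage by which an error metric fell relative to the baseline."""
    return (baseline - improved) / baseline * 100.0


def relative_improvement(baseline, improved):
    """Percentage by which a score rose relative to the baseline."""
    return (improved - baseline) / baseline * 100.0
```

With toy numbers (not taken from the study), a baseline MAE of 2.0 and a hybrid-model MAE of 1.5 give `relative_reduction(2.0, 1.5) == 25.0`.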
What makes the AI model reliable in real-time forecasting?
The robustness of BloomSense stems from its hybrid architecture and adaptive training pipeline. By applying the Synthetic Minority Oversampling Technique (SMOTE), the researchers addressed class imbalance in bloom versus non-bloom data, a frequent challenge in environmental datasets. ResNet-18’s skip (residual) connections captured nonlinear spatial patterns in water quality indicators, while LSTM modeled memory-based dependencies across hours and days, essential for forecasting bloom trajectories.
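The core idea of SMOTE can be shown with a simplified sketch: each synthetic sample is placed at a random point on the line between a minority-class sample and one of its nearest minority-class neighbours. The function name, the parameters `k` and `seed`, and the data layout are assumptions made for illustration; the study's actual implementation and settings are not described in the article.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class samples.

    Simplified illustration of SMOTE; `minority` is a list of equal-length
    numeric feature vectors (for example temperature, pH, EC, Chl-a).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(base, m))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

Because every synthetic point lies between two real bloom samples, the oversampled class stays within the region the real data already covers rather than simply duplicating rows.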
The model was trained on four input configurations: the original readings, hourly statistical aggregations, daily statistical aggregations, and a mixed input combining them. This enabled the system to generalize across different timescales of water quality fluctuation. Notably, the model achieved strong recall (up to 0.79) at the critical WHO threshold of 10 µg/L Chl-a, ensuring high sensitivity to potential HAB events.
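The sketch below shows one way hourly and daily aggregates could be derived from 15-minute readings and combined with the raw values. The article does not say which statistics were used or whether windows overlapped, so mean, minimum, and maximum over non-overlapping windows are assumptions, as are all names in the code.

```python
from statistics import mean

READINGS_PER_HOUR = 4   # 15-minute sampling interval, per the article
READINGS_PER_DAY = 96   # 24 hours x 4 readings


def window_stats(values, window):
    """Summarise consecutive non-overlapping windows with mean, min, and max."""
    stats = []
    for start in range(0, len(values) - window + 1, window):
        chunk = values[start:start + window]
        stats.append({"mean": mean(chunk), "min": min(chunk), "max": max(chunk)})
    return stats


def mixed_input(values):
    """Pair each raw reading with the statistics of the hour it belongs to."""
    hourly = window_stats(values, READINGS_PER_HOUR)
    usable = len(hourly) * READINGS_PER_HOUR  # drop a trailing incomplete hour
    rows = []
    for i, value in enumerate(values[:usable]):
        h = hourly[i // READINGS_PER_HOUR]
        rows.append([value, h["mean"], h["min"], h["max"]])
    return rows
```

Daily features follow the same pattern with `window_stats(values, READINGS_PER_DAY)`.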
An ablation study confirmed the importance of each model component. Removing either ResNet-18 or LSTM reduced accuracy, while combining both with RF-based feature selection delivered the most balanced precision-recall trade-offs. MAE dropped by 20.9% to 25.5% across datasets when the full model was used, validating the effectiveness of this spatial–temporal fusion in both stable and complex aquatic settings.
Can this model be scaled for global water safety?
The success of BloomSense introduces a scalable, deployable model for water authorities and environmental managers worldwide. The buoy system operates autonomously with solar power and GSM communications, making it suitable for remote or resource-constrained regions. Its low-cost sensor design and automated feature processing eliminate the need for specialized on-site expertise.
This advancement is especially timely as HABs intensify globally due to climate change, rising water temperatures, and nutrient runoff from agriculture and industry. By providing continuous 15-minute monitoring, BloomSense fills a critical gap in water management infrastructure. It reduces dependency on satellite imaging, which is often hampered by cloud cover and poor spatial resolution, and overcomes the limitations of lab-based sampling that can lag behind actual bloom conditions.
Moreover, the system has potential for multi-region replication, particularly in freshwater reservoirs, lakes, and coastal areas vulnerable to eutrophication. The research team suggests future integration of explainable AI tools like SHAP to further refine the interpretability of model outputs and understand the secondary impacts of system variables, such as battery level, on data reliability and prediction performance.
First published in: Devdiscourse