Machine learning forecasts deadly Rift Valley Fever with 99.7% accuracy in Kenya

CO-EDP, VisionRI | Updated: 22-07-2025 15:51 IST | Created: 22-07-2025 15:51 IST

Country: Kenya

A newly published study offers a breakthrough in disease surveillance by deploying machine learning (ML) techniques to predict Rift Valley fever (RVF) outbreaks in Kenya. Titled “Machine Learning Approach to Predicting Rift Valley Fever Disease Outbreaks in Kenya” and published in Zoonotic Diseases, the research introduces high-performing classification models capable of forecasting outbreaks based on historical and environmental data spanning three decades.

The study leverages 30 years of climatic and epidemiological data, drawing from regions heavily impacted by RVF. By integrating variables such as rainfall, humidity, elevation, slope, and soil clay content, the researchers aimed to build predictive models that identify the onset of RVF outbreaks with near-perfect accuracy. The initiative marks a pioneering effort in using ML to manage a climate-sensitive zoonotic disease endemic to Africa.

Why RVF prediction matters: Public health and climate dynamics

Rift Valley Fever is a viral zoonotic disease affecting both livestock and humans, with outbreaks typically tied to specific ecological and meteorological triggers. The virus, first identified in Kenya in 1931, has become a recurring health and economic threat, particularly in pastoral regions of sub-Saharan Africa.

The disease is known for its strong association with climatic variables, especially excessive rainfall, high humidity, and the presence of clay-heavy soils that promote the formation of stagnant water pools. These conditions support the breeding of Aedes mosquitoes, which are the primary vectors for RVF. The study draws attention to these correlations, reinforcing earlier findings that environmental conditions are essential in shaping outbreak patterns.

Using data from 1981 to 2010 across Kenya’s diverse topographies, the researchers compiled variables including monthly rainfall, humidity, slope, elevation, and clay content. They observed that RVF cases were most concentrated in Rift Valley (26.8%), Eastern (20.6%), and Northeastern (18.9%) provinces, while highland regions like Nyanza and Western provinces reported no cases. The team used this geographic disparity to strengthen the environmental modeling aspect of the study.

Evaluating machine learning algorithms for epidemic forecasting

The study deployed an extensive array of ML algorithms, including Logistic Regression (LR), Linear Discriminant Analysis (LDA), Gaussian Naive Bayes (NB), K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Decision Trees (CART), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). These models were evaluated on a range of metrics, including sensitivity (also known as recall), specificity, precision, F1 score, ROC-AUC, and PR-AUC, to assess their predictive performance.

Several models, including LR, LDA, and SVM, reached about 99.7% accuracy, yet their sensitivity (the share of actual outbreaks that a model correctly flags) remained critically low. Most models registered near-zero sensitivity and precision, the two metrics that matter most for forecasting actual outbreak events. This led the researchers to focus on metrics better suited to rare events, such as the Precision-Recall Area Under the Curve (PR-AUC), where the XGBoost classifier emerged as the most effective.
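A short sketch can show how high accuracy and near-zero sensitivity coexist when outbreaks are rare. The labels below are hypothetical, not the study's data: the 0.3% outbreak rate is an assumption chosen only so that a model that never predicts an outbreak lands on 99.7% accuracy.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = outbreak)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def summarize(y_true, y_pred):
    """Return accuracy, sensitivity, and precision (0.0 when undefined)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical data: 3 outbreak records among 1,000 (assumed 0.3% prevalence).
y_true = [1] * 3 + [0] * 997
# A trivial "model" that never predicts an outbreak.
y_pred = [0] * 1000

metrics = summarize(y_true, y_pred)
print(metrics)
```

In this toy case the classifier misses every outbreak, yet it is still right on 997 of 1,000 records, which is why accuracy alone says little about rare-event forecasting.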

According to the analysis, the XGBoost model achieved a PR-AUC score of 0.911 and a ROC-AUC of 0.022, significantly outperforming other contenders. By contrast, Random Forest, a widely used algorithm in epidemiological studies, ranked lower with a PR-AUC of 0.5736 and ROC-AUC of 0.0089. These results highlight the importance of choosing evaluation metrics that reflect real-world prediction challenges, particularly when data are imbalanced and disease events are rare.
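To make the two curve-based metrics concrete, here is a minimal sketch of how they can be computed from predicted scores. It is not the study's code: ROC-AUC is computed as the probability that a randomly chosen outbreak record is scored above a randomly chosen non-outbreak record, and PR-AUC is approximated by average precision, one common estimator. The scores are hypothetical and assume no positive record ties with a negative one.

```python
def roc_auc(y_true, scores):
    """Share of (positive, negative) pairs where the positive is ranked higher.

    Tied pairs count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision measured at the rank of each true positive."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical scores: 2 outbreak records among 100.
# One outbreak is ranked first; the other sits behind 5 false alarms.
y_true = [1, 1] + [0] * 98
scores = [0.9, 0.6] + [0.8] * 5 + [0.1] * 93

roc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
print(f"ROC-AUC: {roc:.3f}  average precision: {ap:.3f}")
```

In this toy example a handful of false alarms barely moves ROC-AUC, because the many correctly ranked non-outbreak records dominate it, while average precision drops noticeably. That sensitivity to false alarms is what makes precision-recall metrics informative when positives are rare.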

The study employed pre-processing techniques such as Isolation Forests to remove outliers from a dataset originally comprising over 180,000 records. Cross-validation and balanced train-test splits were used to ensure robustness. Nonetheless, the authors acknowledged limitations such as random data splitting, which can place records from the same period in both the training and test sets and so obscure temporal trends that are crucial in epidemiological forecasting.
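The splitting limitation can be illustrated with a small sketch that contrasts a random split with a chronological one. Everything here is an assumption for illustration: the record layout, the 20% test share, and the 2005 cutoff year are hypothetical and are not taken from the study.

```python
import random


def random_split(records, test_fraction=0.2, seed=0):
    """Shuffle records and hold out a fraction, ignoring when they occurred."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def chronological_split(records, cutoff_year):
    """Train on years before the cutoff, test on the cutoff year and later."""
    train = [r for r in records if r["year"] < cutoff_year]
    test = [r for r in records if r["year"] >= cutoff_year]
    return train, test


# Hypothetical monthly records covering 1981 to 2010.
records = [{"year": y, "month": m} for y in range(1981, 2011) for m in range(1, 13)]

rand_train, rand_test = random_split(records)
chrono_train, chrono_test = chronological_split(records, cutoff_year=2005)

# Years that appear on both sides of each split.
rand_overlap = {r["year"] for r in rand_train} & {r["year"] for r in rand_test}
chrono_overlap = {r["year"] for r in chrono_train} & {r["year"] for r in chrono_test}

print("years shared by random split:", len(rand_overlap))
print("years shared by chronological split:", len(chrono_overlap))
```

With a random split, the model is tested on months drawn from the same years it trained on, so the evaluation does not resemble forecasting a future season. The chronological split keeps every test record strictly later than the training data.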

Implications for surveillance, policy, and future research

The study underscores the transformative potential of AI in managing climate-sensitive zoonotic diseases. Accurate forecasting models such as XGBoost offer a powerful tool for early detection, targeted vaccination, vector control, and resource allocation. These capabilities are especially critical in countries like Kenya, where RVF poses persistent risks to both agricultural livelihoods and human health.

The authors recommend future enhancements that include temporal-aware modeling, integration of genomic data, and longitudinal surveillance to improve the biological relevance and predictive power of machine learning models. They emphasize the need for interdisciplinary collaboration, bringing together epidemiologists, climatologists, data scientists, and public health officials, to build robust, real-time surveillance systems.

FIRST PUBLISHED IN:
Devdiscourse
