Machine learning key to early detection of acute malnutrition in Sub-Saharan Africa

CO-EDP, VisionRI | Updated: 21-05-2025 18:35 IST | Created: 21-05-2025 18:35 IST
  • Country: Kenya

A new study presents compelling evidence that machine learning can significantly improve the forecasting of acute childhood malnutrition in Kenya, potentially transforming how public health authorities plan interventions and allocate resources. The research, published in PLOS ONE under the title “Forecasting Acute Childhood Malnutrition in Kenya Using Machine Learning and Diverse Sets of Indicators”, was led by a multidisciplinary team including researchers from Microsoft AI for Good, Amref Health Africa, and several academic institutions.

The study demonstrates that combining routine health system data with satellite-derived agricultural indicators allows artificial intelligence models to more accurately forecast malnutrition rates in sub-counties across Kenya. With acute malnutrition affecting an estimated 45 million children under five globally, the ability to anticipate a crisis even a few months in advance could be the difference between life and death in resource-limited settings.

How do AI models forecast malnutrition more effectively?

The researchers used a combination of clinical indicators from Kenya’s District Health Information Software 2 (DHIS2) and satellite data measuring Gross Primary Productivity (GPP) to build machine learning (ML) models. The models aimed to predict malnutrition outcomes one, three, and six months ahead at the sub-county level. These predictions were stratified using the five-point Integrated Food Security Phase Classification for Acute Malnutrition (IPC-AMN), ranging from low (<3%) to extreme (≥30%) prevalence.
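
To make the classification step concrete, here is a minimal sketch of how a forecast prevalence could be mapped to an IPC-AMN phase. Only the low (<3%) and extreme (≥30%) cutoffs come from the study as summarized above; the intermediate thresholds are illustrative assumptions, not the IPC's official values.

```python
# Minimal sketch: map a forecast malnutrition prevalence to a 1-5 IPC-AMN phase.
def ipc_amn_phase(prevalence: float) -> int:
    """Return the IPC-AMN phase (1-5) for a prevalence expressed in [0, 1]."""
    if prevalence < 0.03:    # low (<3%), as reported in the study
        return 1
    elif prevalence < 0.10:  # assumed intermediate cutoff
        return 2
    elif prevalence < 0.15:  # assumed intermediate cutoff
        return 3
    elif prevalence < 0.30:  # assumed intermediate cutoff
        return 4
    return 5                 # extreme (>=30%), as reported in the study

print(ipc_amn_phase(0.12))  # -> phase 3
```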

They tested two primary ML models, Logistic Regression and Gradient Boosting, against the Window Average model, a statistical baseline representing current decision-making practices in Kenya’s Ministry of Health. The Gradient Boosting model consistently outperformed the baseline, achieving a mean AUC of 0.89 for one-month forecasts and 0.86 for six-month horizons, versus 0.76 and 0.73 for the baseline model respectively.
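
The sketch below illustrates this comparison on synthetic data, under simplifying assumptions: a binary target (whether a sub-county sits at or above a given IPC-AMN phase the following month), a window average that scores each sub-county by the mean of its last three observed prevalence values, and scikit-learn's GradientBoostingClassifier standing in for the paper's gradient-boosting setup. The column names, 10% cutoff, and three-month window are hypothetical.

```python
# Hedged sketch: window-average baseline vs. gradient boosting on a synthetic panel.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "subcounty": np.repeat(np.arange(40), 30),  # 40 sub-counties x 30 months
    "t": np.tile(np.arange(30), 40),
    "prevalence": rng.beta(2, 20, 1200),        # synthetic malnutrition rates
    "gpp": rng.normal(0.0, 1.0, 1200),          # synthetic satellite signal
})

# Baseline score: mean prevalence over the previous three months.
df["window_avg"] = (
    df.groupby("subcounty")["prevalence"]
      .transform(lambda s: s.shift(1).rolling(3).mean())
)

# One-month-ahead label: is next month's prevalence above an assumed 10% cutoff?
df["next_prev"] = df.groupby("subcounty")["prevalence"].shift(-1)
df = df.dropna()
df["above_phase"] = (df["next_prev"] > 0.10).astype(int)

features = ["window_avg", "gpp"]
train, test = df[df["t"] < 24], df[df["t"] >= 24]  # time-based split

gbm = GradientBoostingClassifier().fit(train[features], train["above_phase"])
print("baseline AUC:", roc_auc_score(test["above_phase"], test["window_avg"]))
print("GBM AUC:", roc_auc_score(test["above_phase"],
                                gbm.predict_proba(test[features])[:, 1]))
```

On this random synthetic panel both scores will hover near chance; the snippet shows the mechanics of the comparison, not the AUCs the study reports.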

Performance held even when the model was trained on historical data and validated with new information from late 2023 to early 2024, suggesting the framework’s robustness over time. Notably, the Gradient Boosting model excelled at forecasting the most critical IPC-AMN category (≥30%) with AUCs greater than 0.98 across all horizons.

Which data indicators are most useful for predictions?

Among the diverse indicators tested, previous malnutrition outcomes ranked highest in predictive power, followed by sub-county identifiers and satellite-based GPP data. Interestingly, models trained solely on GPP data still outperformed the statistical baseline, demonstrating the power of remotely sensed agricultural indicators in capturing early signals of food insecurity and health risk.

Clinical data included rates of underweight children, feeding practices, low birth weight, anemia among pregnant women, and nutritional supplement distribution. These were normalized by the number of children visiting health facilities each month. While useful, clinical indicators alone were less predictive than outcome history and GPP, especially when forecasting malnutrition risks below 15%, where traditional models faltered the most.
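
As a concrete illustration of that normalization step, the toy snippet below converts raw clinical counts into per-visit rates. All column names and figures are invented for illustration.

```python
# Sketch: normalize clinical counts by monthly health-facility visits.
import pandas as pd

clin = pd.DataFrame({
    "subcounty": ["A", "A", "B"],
    "month": ["2023-01", "2023-02", "2023-01"],
    "underweight_count": [120, 95, 40],
    "low_birth_weight_count": [15, 12, 6],
    "facility_visits": [2400, 2100, 800],
})

# Express each count as a rate per child seen at a facility that month.
for col in ["underweight_count", "low_birth_weight_count"]:
    clin[col.replace("_count", "_rate")] = clin[col] / clin["facility_visits"]

print(clin[["subcounty", "month", "underweight_rate", "low_birth_weight_rate"]])
```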

Feature importance rankings confirmed the dominance of past malnutrition rates and GPP-derived metrics over most clinical indicators. This suggests that when real-time clinical data is unavailable or delayed, satellite signals could serve as viable proxies for early warning systems.
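
For readers curious how such rankings are typically produced, the short continuation below reads scikit-learn's impurity-based importances from the gradient-boosting sketch above (reusing its `gbm` model and `features` list). The study's exact attribution method is not detailed here, so treat this as one plausible mechanism rather than the authors' procedure.

```python
# Continuation of the earlier sketch: rank features by impurity-based importance.
import pandas as pd

ranking = pd.Series(gbm.feature_importances_, index=features)
print(ranking.sort_values(ascending=False))  # study found outcome history and GPP on top
```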

What are the practical and policy implications?

The practical implications of this research are far-reaching. Forecasts with one- to six-month lead times can inform targeted, timely responses, including nutrition-specific interventions like supplementary feeding and nutrition-sensitive strategies such as improved sanitation and cash transfers. Given that malnutrition weakens immune systems and exacerbates child mortality, early warnings can significantly improve health outcomes and resource efficiency.

The models’ ability to forecast moderate acute malnutrition (MAM) and severe acute malnutrition (SAM) separately allows for nuanced planning. Forecasting SAM remains challenging due to its rarity, but even here the machine learning models achieved AUC scores above 0.8, indicating strong potential for guiding high-risk interventions.

The study also underscores the adaptability of the forecasting framework to new administrative levels and data types. For instance, Kenya’s 320 sub-counties posed a mismatch with the 290-unit boundary dataset used for GPP aggregation. The authors reconciled this by focusing on 240 overlapping sub-counties for GPP analysis, while all other models retained full coverage using DHIS2 data.
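
A minimal sketch of that reconciliation, with toy sub-county lists standing in for the real 320-unit DHIS2 roster and the 290-unit boundary file:

```python
# Sketch: keep only sub-counties present in both data sources.
import pandas as pd

dhis2 = pd.DataFrame({"subcounty": ["Kibra", "Langata", "Turkana East", "Wajir South"]})
gpp = pd.DataFrame({"subcounty": ["Kibra", "Langata", "Turkana East", "Isiolo North"]})

# GPP-based models use this intersection; DHIS2-only models keep the full roster.
overlap = dhis2.merge(gpp, on="subcounty", how="inner")
print(overlap["subcounty"].tolist())  # -> ['Kibra', 'Langata', 'Turkana East']
```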

Work is currently underway with Kenya’s Ministry of Health to integrate these AI models into a co-developed decision-support system for continuous forecasting. The goal is to establish an operational pipeline that can deliver automated, frequent updates to national and local health authorities. Such a system could later be adapted for use in the 80+ low- and middle-income countries using DHIS2 globally.

  • First published in: Devdiscourse