Clear Sky Science
Prediction of under-five mortality using supervised machine learning algorithms in 23 sub-Saharan African countries
Why this study matters to families
In many parts of Sub-Saharan Africa, far too many children still die before their fifth birthday, even as global child survival has improved. This study asks a hopeful question: can modern computer tools sift through huge health surveys to spot which children are most at risk, early enough for health workers and governments to act? By blending public health and machine learning, the researchers aim to turn existing data into practical guidance that could help save young lives.
Taking a fresh look at a stubborn problem
Child deaths in Sub-Saharan Africa remain the highest in the world, with large differences from one country to another. These gaps reflect uneven access to clinics, deep economic hardship, and limits on services for mothers and newborns. Previous attempts to predict which children are most vulnerable often used small samples or simple methods, making their results hard to trust or apply broadly. The team behind this study set out to build stronger, more reliable prediction tools that reflect the realities of millions of families across the region.
Turning big surveys into a picture of risk
The researchers combined recent Demographic and Health Survey data from 23 countries, covering nearly 191,000 children born in the five years before each survey. For every child, they considered a wide range of details: the mother’s age and schooling, household wealth, family size, where the family lives, the type of work parents do, the mother’s age at her first birth, use of antenatal and postnatal care, place of delivery, and how hard it is to reach health services. They prepared the data, rebalanced it so that the much smaller group of children who had died was not swamped by those who survived, and used a feature-selection method to focus on the most informative factors before training several computer models.
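For readers curious about what balancing and feature selection look like in practice, here is a minimal sketch in Python. The summary does not say which techniques the team used, so this sketch swaps in two simple stand-ins: random oversampling (copying records from the smaller group until both groups are the same size) and a ranking of yes/no factors by the gap in death rate between their two groups. The function names, field names, and toy records are all hypothetical and are not taken from the survey data.

```python
import random
from collections import Counter

def oversample_minority(rows, label_key="died", seed=0):
    """Duplicate randomly chosen minority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw extra copies with replacement to reach the majority-class size.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced

def rank_features(rows, features, label_key="died"):
    """Score each binary feature by the absolute gap in death rate between its two groups."""
    scores = {}
    for name in features:
        rates = []
        for value in (0, 1):
            group = [r[label_key] for r in rows if r[name] == value]
            rates.append(sum(group) / len(group) if group else 0.0)
        scores[name] = abs(rates[1] - rates[0])
    return sorted(scores, key=scores.get, reverse=True)

# Toy records (hypothetical, not survey data): 1 = yes, 0 = no.
records = (
    [{"poorest": 1, "facility_birth": 0, "urban": 0, "died": 1}] * 2
    + [{"poorest": 0, "facility_birth": 1, "urban": 1, "died": 0}] * 10
    + [{"poorest": 1, "facility_birth": 1, "urban": 0, "died": 0}] * 6
    + [{"poorest": 0, "facility_birth": 0, "urban": 1, "died": 0}] * 2
)

balanced = oversample_minority(records)
class_counts = Counter(r["died"] for r in balanced)
ranking = rank_features(balanced, ["poorest", "facility_birth", "urban"])
print(class_counts, ranking)
```

Without balancing, a model could score well simply by predicting that every child survives, which is why the rarer outcome is given equal weight before training. In a real analysis the balancing step would normally be applied only to the training portion of the data, so that the evaluation still reflects how rare child deaths actually are.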

Letting algorithms learn from patterns
Seven different supervised learning algorithms were tested, including familiar tools such as logistic regression and decision trees, as well as more powerful “ensemble” methods that combine many simple models. Each algorithm learned to distinguish between children who survived and those who died before age five, and was judged on how often it was correct, how well it found truly high-risk cases, and how clearly it separated high and low risk overall. The random forest approach, which builds many decision trees and averages their results, emerged as the clear leader. It correctly classified children in roughly 94% of cases and showed excellent ability to tell high-risk from low-risk situations.
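The three yardsticks described above correspond to standard measures: accuracy (how often the model is right), recall (how many truly high-risk cases it catches), and the area under the ROC curve, or AUC (how well its scores separate high from low risk). The sketch below computes each from scratch in Python on a handful of made-up labels and scores; the numbers are illustrative only and have no connection to the study's results, and the 0.5 cutoff is an assumption.

```python
def accuracy(y_true, y_pred):
    """Share of all cases where the predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Share of true high-risk cases (label 1) that the model flagged."""
    flagged = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(flagged) / len(flagged)

def roc_auc(y_true, scores):
    """Chance that a random positive case outscores a random negative one; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = died before age five, 0 = survived.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]  # hypothetical model risk scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed cutoff of 0.5

print(accuracy(y_true, y_pred), recall(y_true, y_pred), roc_auc(y_true, scores))
```

Looking at all three together matters because deaths are rare: a model can be accurate overall while still missing many of the children at greatest risk, which recall and AUC are designed to reveal.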
Seeing inside the black box
To make the model’s decisions understandable to health planners and clinicians, the team used a technique called SHAP that shows how each factor pushes a prediction toward higher or lower risk. Across the region, several themes stood out. Children whose families reported serious difficulty reaching care, those born to mothers who had their first baby before age 18, and those living in the poorest households faced markedly higher predicted risk. By contrast, children of mothers in their mid-twenties, those born in health facilities, and those whose families could obtain recommended pregnancy and postnatal care had a lower predicted chance of dying. Visual SHAP plots for individual children illustrated how a specific mix of barriers and protections adds up to a personal risk profile.
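SHAP builds on Shapley values, an idea from game theory for sharing credit fairly among contributors. The sketch below computes exact Shapley values by brute force for a tiny, invented risk formula with three yes/no factors. It is meant only to show the principle: the SHAP software uses faster methods suited to large models, and the formula, weights, and factor names here are assumptions made for illustration, not the study's model.

```python
from itertools import combinations
from math import factorial

def predict_risk(child):
    """Hypothetical risk formula with one interaction term; not the study's model."""
    risk = 0.05
    risk += 0.10 * child["hard_to_reach_care"]
    risk += 0.06 * child["poorest_household"]
    risk -= 0.03 * child["facility_birth"]
    risk += 0.04 * child["hard_to_reach_care"] * child["poorest_household"]
    return risk

def shapley_values(model, instance, baseline):
    """Exact Shapley values: each feature's weighted average marginal effect over all coalitions."""
    names = list(instance)
    n = len(names)
    values = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                # Features in the coalition take the child's values; the rest stay at baseline.
                without = {f: instance[f] if f in coalition else baseline[f] for f in names}
                with_feature = dict(without, **{name: instance[name]})
                total += weight * (model(with_feature) - model(without))
        values[name] = total
    return values

child = {"hard_to_reach_care": 1, "poorest_household": 1, "facility_birth": 1}
baseline = {"hard_to_reach_care": 0, "poorest_household": 0, "facility_birth": 0}
contributions = shapley_values(predict_risk, child, baseline)
print(contributions)
```

In this toy case the contributions add up exactly to the difference between the child's predicted risk and the baseline risk. That additive property is what allows a per-child plot to show each barrier or protection as a push up or down from a common starting point.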

From numbers to action
The study shows that, when fed large, recent, and representative survey data, machine learning models can give a reliable early warning about which children are most likely to die before age five in Sub-Saharan Africa. Just as important, the interpretability tools highlight familiar but powerful levers for change: making clinics easier to reach, delaying very early childbearing, and reducing extreme poverty. For a lay reader, the message is straightforward: computers are not replacing doctors or nurses, but they can help point scarce resources toward the families who need them most, turning data into a practical roadmap for saving children’s lives.
Citation: Asnake, A.A., Gebrehana, A.K., Asmare, Z.A. et al. Prediction of under-five mortality using supervised machine learning algorithms in the 23 sub-Sharan African countries. Sci Rep 16, 9131 (2026). https://doi.org/10.1038/s41598-026-40401-0
Keywords: under-five mortality, Sub-Saharan Africa, machine learning, child health risk factors, public health prediction