Clear Sky Science

Explainable hybrid machine learning model for predicting stunting and identifying key risk factors among Ethiopian children under five

2026-04-09

Why child growth prediction matters

Across low-income countries, many children do not grow as tall or as strong as they should because of a long-term lack of good food, repeated illness, and poor living conditions. This condition, called stunting, harms learning, health, and future earnings. In Ethiopia, more than one in three children under five are affected. The study summarized here explores how a new type of computer program can help health workers spot which children are most at risk, using information that surveys already collect, while also explaining how and why the program reaches its conclusions.

Seeing stunting as more than a number

The researchers began with data from the 2019 Ethiopian Demographic and Health Survey, which includes details on thousands of children under five and their families. For each child, the survey records height and age so that stunting level can be grouped into three categories: normal growth, moderate stunting, or severe stunting. Because far fewer children fall into the severe group than the normal group, the team carefully rebalanced the data so that the computer would learn to recognize all three categories fairly rather than being biased toward the most common one. They then cleaned, transformed, and checked the information to make sure it was suitable for analysis.
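The grouping and rebalancing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the common WHO convention that a height-for-age z-score below -2 counts as stunted and below -3 as severely stunted, and it rebalances by simple random oversampling. The summary does not say which cut-offs or which rebalancing method the authors used, and the function names and z-scores below are hypothetical.

```python
import random
from collections import Counter


def stunting_category(haz):
    """Map a height-for-age z-score to a growth category.

    Assumes the WHO convention: below -3 is severe, below -2 is moderate.
    """
    if haz < -3.0:
        return "severe"
    if haz < -2.0:
        return "moderate"
    return "normal"


def oversample(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes are equal in size.

    Assumption: plain random oversampling, chosen here only for illustration.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical z-scores, for illustration only (not survey data).
haz_scores = [0.4, -1.2, -0.3, -1.9, 0.1, -2.4, -2.8, -3.5]
labels = [stunting_category(z) for z in haz_scores]
rows = [[z] for z in haz_scores]

balanced_rows, balanced_labels = oversample(rows, labels)
print("before:", Counter(labels))
print("after: ", Counter(balanced_labels))
```

In practice, rebalancing of this kind is usually applied to the training portion only, so that the test set still reflects how rare severe stunting really is.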

Figure 1. How AI sorts Ethiopian children into growth risk groups using everyday family and community data.

Blending two smart tools into one

Instead of relying on a single type of machine learning model, the authors created a hybrid system that combines two strong approaches. One part, called Extra Trees, builds many decision trees that excel at finding patterns in mixed data, such as region, family size, and birth history. The other part, called a multilayer perceptron, is a simple deep learning network that can capture more subtle relationships once the data have been transformed. In their design, the tree-based model first processes the data and passes rich signals to the neural network, which then produces the final prediction of whether a child is normal, moderately stunted, or severely stunted.
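One way such a two-stage design could be wired together with scikit-learn is sketched below. It is not the authors' implementation. It assumes that the "rich signals" are the class probabilities produced by Extra Trees, appended to the original features before they reach the network. The data are synthetic, and all settings (number of trees, layer size, split ratio) are illustrative guesses.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic three-class data standing in for the survey features.
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Stage 1: Extra Trees. Out-of-fold probabilities are used for the training
# rows so the network does not learn from signals the trees produced for
# rows they had already seen.
trees = ExtraTreesClassifier(n_estimators=200, random_state=0)
train_signal = cross_val_predict(trees, X_train, y_train, cv=5,
                                 method="predict_proba")
trees.fit(X_train, y_train)
test_signal = trees.predict_proba(X_test)

# Stage 2: scale the original features plus the tree signals, then let a
# small multilayer perceptron make the final three-class prediction.
scaler = StandardScaler()
Z_train = scaler.fit_transform(np.hstack([X_train, train_signal]))
Z_test = scaler.transform(np.hstack([X_test, test_signal]))

mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
mlp.fit(Z_train, y_train)
pred = mlp.predict(Z_test)
print("test accuracy on synthetic data:", (pred == y_test).mean())
```

The out-of-fold step is a design choice of this sketch: it guards against the second stage simply memorizing overconfident tree outputs on the training rows.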

Accuracy with caution

The hybrid model was trained on more than eleven thousand child records and tested on a separate set. It scored about 94% on each of accuracy, precision, recall, and F1, and performed similarly well in cross-validation, suggesting that its predictions are stable rather than a fluke of one sample. A detailed confusion matrix revealed that the model is especially good at telling clearly normal children from clearly severely stunted children, while most errors occur at the boundary between moderate and severe stunting. The authors stress that the survey provides only a snapshot in time, so the model finds strong associations rather than proving that any one factor directly causes stunting.
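To make these four measures concrete, the sketch below computes them from a three-class confusion matrix. The matrix is invented for illustration and is not the one reported in the paper. It is only shaped so that most mistakes sit between the moderate and severe groups. Averaging the per-class scores equally (macro averaging) is also an assumption, since the summary does not say how the authors averaged.

```python
def summarize(cm):
    """Return accuracy, per-class (precision, recall, F1), and macro averages.

    cm[i][j] counts children whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    per_class = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append((precision, recall, f1))

    macro = tuple(sum(scores[j] for scores in per_class) / n for j in range(3))
    return accuracy, per_class, macro


# Hypothetical counts; rows and columns are ordered normal, moderate, severe.
cm = [
    [95, 4, 1],
    [5, 88, 7],
    [0, 8, 92],
]

accuracy, per_class, macro = summarize(cm)
print(f"accuracy: {accuracy:.3f}")
for name, (p, r, f) in zip(["normal", "moderate", "severe"], per_class):
    print(f"{name:9s} precision={p:.3f} recall={r:.3f} f1={f:.3f}")
print("macro precision, recall, f1:", tuple(round(v, 3) for v in macro))
```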

Figure 2. How a two-part AI model combines clues like age, region, and birth spacing to flag stunting risk levels.

Opening the black box

High accuracy alone is not enough for public health decisions, because policy makers and clinicians need to understand why a system flags a child as at risk. To address this, the study uses explainable artificial intelligence tools, in particular a method called LIME, which breaks down each prediction into contributions from individual factors. By examining feature importance and local explanations, the researchers found that child age, region of residence, the time gap between births, and the number of children under five in the household were the most influential predictors. Other helpful signals included maternal education, household wealth, and access to clean water, echoing earlier nutrition studies.
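The idea behind a local explanation can be shown with a stripped-down sketch written from scratch in NumPy. It is not the LIME library and not the authors' code. It follows the same basic recipe (perturb one record, ask the model for predictions, weight nearby samples more heavily, fit a simple linear model), but leaves out refinements such as feature selection. The `black_box` function and the feature names are hypothetical stand-ins for a trained model and real survey variables.

```python
import numpy as np


def black_box(X):
    """Hypothetical risk score standing in for a trained model's output.

    Depends on features 0 and 1 and ignores feature 2 entirely.
    """
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.0 * X[:, 1])))


def local_explanation(predict, x, n_samples=2000, scale=0.5,
                      kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear model around x; return one weight per feature."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the record with Gaussian noise.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    # 2. Ask the black box what it predicts for each perturbed record.
    y = predict(Z)
    # 3. Give more weight to perturbed records that stay close to x.
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / (kernel_width ** 2))
    # 4. Weighted least squares with an intercept column.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1:]  # drop the intercept


feature_names = ["feature_a", "feature_b", "feature_c"]  # hypothetical names
x = np.zeros(3)
weights = local_explanation(black_box, x)
for name, value in zip(feature_names, weights):
    print(f"{name}: {value:+.3f}")
```

A positive weight means that, near this particular record, raising the feature pushes the predicted risk up. A weight close to zero means the model is locally insensitive to that feature.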

From prediction to practical action

For a general reader, the main message is that careful use of artificial intelligence can help health workers move from simply counting how many children are stunted to identifying which children and communities need help most urgently. The hybrid model does not tell us the ultimate causes of stunting, but it offers a reliable, transparent way to spot children at higher statistical risk based on readily available survey questions. Used alongside clinical judgment, it could guide targeted programs in nutrition, clean water, and family planning, helping Ethiopia and similar countries focus limited resources where they can do the most to protect child growth and potential.

Citation: Wudu, T.K., Endalew, A.A. & Dires, A.A. Explainable hybrid machine learning model for predicting stunting and identifying key risk factors among Ethiopian children under five. Sci Rep 16, 16204 (2026). https://doi.org/10.1038/s41598-026-46417-w

Keywords: childhood stunting, Ethiopia, machine learning, explainable AI, child nutrition