Clear Sky Science · en

Comparative evaluation of machine learning and deep learning approaches for compressive strength prediction of geopolymer concrete

· Back to index

A New Way to Design Greener Concrete

Concrete is the backbone of modern cities, but making traditional cement releases large amounts of carbon dioxide. Geopolymer concrete offers a cleaner alternative, yet engineers still struggle to predict how strong it will be once it hardens. This study shows how modern data tools, including machine learning and deep learning, can learn from hundreds of past experiments to forecast the strength of geopolymer concrete mixes and guide greener construction with fewer costly trial batches.

Why Greener Concrete Is Hard to Get Right

Geopolymer concrete replaces much of ordinary cement with industrial by-products such as fly ash, slag, metakaolin, and silica fume. These powders react with alkaline liquids to form a solid, stone-like material that can match or exceed the strength of standard concrete while cutting carbon emissions by up to about 70 percent. But this greener recipe is delicate: strength depends on many intertwined factors, including the type and amount of powders, the dosage and concentration of alkaline liquids, the amount of water and chemical additives, grains of sand and gravel, fibers added for toughness, and how long and under what conditions the material cures. Because all of these ingredients interact in complex, non‑linear ways, simple equations are not enough to predict how strong a new mix will be.

Figure 1
Figure 1.

Building a Big-Picture Data Set

To tackle this complexity, the authors assembled a large, unified database from 19 published studies, covering 594 different geopolymer concrete mixes. Each record describes key ingredients such as the amounts of fly ash, slag, metakaolin, silica fume, sand, gravel, water, alkaline liquids, and superplasticizer, along with steel and polypropylene fibers and the curing age in days. The team carefully cleaned the data: they removed duplicates, harmonized units and variable names, and kept only those mixes with complete information. This produced a diverse but consistent set that spans a wide range of strengths, from relatively weak to very high-performance concretes, mirroring the real spread found across many laboratories.

Letting Machines Learn from Past Mixes

With this database in hand, the researchers trained a suite of prediction models. On the classical side, they used several machine learning methods, including decision trees, random forests, and advanced “boosting” techniques that combine many small models into a strong overall predictor. On the deep learning side, they tested two neural networks of different depths and a modern tabular model called TabNet, which uses an attention mechanism to focus on the most informative ingredients. The data were split so that 80 percent of the mixes were used to teach the models, while the remaining 20 percent were kept aside to test how well each method could predict unseen results using standard measures of error and goodness of fit.

Which Models Worked Best and Why

The clear winner among the machine learning approaches was a boosting method called CatBoost, which delivered very accurate strength predictions and matched the test data closely across the full range of values. Another boosted model, XGBoost, came in a close second. Among the deep learning tools, TabNet performed best and effectively rivaled the top machine learning model, while a deeper neural network with more layers actually did worse, likely because the dataset was too modest in size for such a complex structure. To make these “black box” predictions more transparent, the authors used an explanation tool called SHAP, which assigns an importance score to each input. These analyses showed that slag content and curing age consistently dominate strength prediction, followed by other binder ingredients, alkaline dosage, and aggregate amounts, while steel and plastic fibers play only a secondary role for compressive strength, even though they can strongly affect cracking and toughness.

Figure 2
Figure 2.

What This Means for Future Construction

Put simply, this work shows that computers can learn from hundreds of past geopolymer concrete recipes to provide fast and reliable estimates of how strong a new, eco‑friendly mix will be. Instead of casting and crushing large numbers of test cylinders, engineers can use these models to narrow down promising combinations of binder powders, activator liquids, and curing times before going to the lab. The study also highlights which ingredients matter most, giving practical guidance: getting slag content, activator levels, and curing conditions right is far more important for compressive strength than fine‑tuning fiber amounts. While the models still need to be validated on larger and more standardized datasets, this framework marks a significant step toward data‑guided, lower‑carbon concrete design.

Citation: Ezz, H., Bakr, S.M. Comparative evaluation of machine learning and deep learning approaches for compressive strength prediction of geopolymer concrete. Sci Rep 16, 14652 (2026). https://doi.org/10.1038/s41598-026-50705-w

Keywords: geopolymer concrete, machine learning, deep learning, compressive strength prediction, sustainable construction