Clear Sky Science
Hybrid generative–ensemble approach for predicting recycled aggregate concrete strength properties
Why this research matters for our built environment
Concrete is everywhere in modern life, but making it consumes huge amounts of sand, stone, and cement and pumps carbon dioxide into the air. One promising way to cut this impact is to reuse broken concrete from old structures as new building material. The catch is that concrete made with recycled pieces does not always behave the same way as concrete made from fresh rock. This study shows how modern data tools can help engineers predict the strength of such greener concrete mixes before they ever pour a single test block.

Turning rubble into a resource
Construction and demolition sites generate mountains of concrete waste each year. Instead of sending this rubble to landfills, it can be crushed and reused as aggregate, the gravel-like skeleton inside new concrete. Replacing natural sand and stone with recycled pieces helps save dwindling natural resources and lowers the overall environmental footprint of building projects. However, recycled particles often carry old cement on their surfaces, have more pores, and form weaker contact zones inside the new mix. These quirks can reduce how much load the concrete can safely carry, which makes designers cautious about using high levels of recycled material.
Learning from past mixes
To tackle this challenge, the researchers gathered data from 112 different concrete recipes that used both natural and recycled aggregates. For each mix they recorded how much water, cement, sand, gravel, and recycled material it contained, along with four key results: compressive strength, split tensile strength, flexural strength, and stiffness. Because 112 examples is a small dataset for training flexible data models, the team first used a generative tool, a conditional variational autoencoder, to create thousands of additional synthetic mixes that mimic the patterns of the real ones. This step let the models see a wider variety of realistic combinations, while their predictions were still checked against real-world test results.
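For readers who want the idea in symbols, the textbook training objective of a conditional variational autoencoder can be written as below. This is the standard form, not a formula quoted from the study: here x stands for one mix record, c for whatever information the generator is conditioned on (this summary does not say what the authors chose), and z for the hidden code the network learns.

```latex
\mathcal{L}(\theta,\phi;\,x,c)
  = \mathbb{E}_{q_\phi(z \mid x,c)}\big[\log p_\theta(x \mid z,c)\big]
  - D_{\mathrm{KL}}\big(q_\phi(z \mid x,c)\,\|\,p(z \mid c)\big)
```

The first term rewards the decoder for reconstructing real mixes, and the second keeps the learned codes close to a simple prior. Once trained, a new synthetic mix is produced by drawing z from the prior p(z | c) and passing it through the decoder p_θ(x | z, c).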
Testing a toolbox of data models
The team then compared seven different machine learning approaches for predicting each of the four strength properties from the mix ingredients. Some were simple linear models, which assume straight-line relationships, while others were more flexible tree-based methods and support vector machines that can capture twists and turns in the data. They trained and checked these models using cross-validation, so that every prediction used for assessment was made on data the model had not seen during training, and they reserved a separate hidden test set of real mixes for the final score. Gradient boosting and support vector regression clearly stood out, giving highly accurate and stable predictions across all four properties and outperforming both basic linear fits and the standard empirical equations found in design codes, especially when recycled content was high.
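The comparison described above can be sketched in a few lines of Python with scikit-learn. This is a minimal illustration and not the authors' code: the data are randomly generated, the strength formula is invented, the column meanings and hyperparameters are assumptions, and only three of the seven model families are shown.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Made-up stand-in data (NOT the study's dataset): 112 mixes with
# hypothetical columns water, cement, sand, gravel, recycled aggregate.
rng = np.random.default_rng(0)
n_mixes = 112
low = [150, 300, 600, 900, 0]
high = [220, 500, 800, 1200, 600]
X = rng.uniform(low, high, size=(n_mixes, 5))

# Invented strength rule for illustration only: strength falls as the
# water-to-cement ratio and the recycled content rise, plus noise.
water_cement = X[:, 0] / X[:, 1]
y = 80.0 - 70.0 * water_cement - 0.01 * X[:, 4] + rng.normal(0.0, 1.5, n_mixes)

# Hold back a test set that plays no part in model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "svr": make_pipeline(StandardScaler(), SVR(C=100.0)),
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    # Each fold is scored on mixes the model did not see while fitting.
    cv_r2 = cross_val_score(model, X_train, y_train, cv=cv, scoring="r2").mean()
    # Final score on the held-out test set after refitting on all training data.
    test_r2 = model.fit(X_train, y_train).score(X_test, y_test)
    results[name] = (cv_r2, test_r2)
    print(f"{name:18s} CV R2 = {cv_r2:.3f}   test R2 = {test_r2:.3f}")
```

Because the data here are synthetic, the ranking of models it prints says nothing about the study's findings; the point is only the workflow of cross-validating on training data and scoring once on held-out mixes.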

Peeking inside the black box
Powerful data models are only useful to engineers if they can be trusted and understood. To open up the black box, the authors used a technique called feature attribution, which measures how much each ingredient in the mix pushes a prediction up or down. They found that the binder side of the recipe, namely the water-to-cement ratio and the amount of cement, is the main driver of strength in compression, tension, and bending. In contrast, stiffness is governed mostly by the aggregates themselves, with recycled fine particles playing a particularly strong role. Higher recycled fine content tends to make the concrete more flexible because these grains are less stiff and carry old, weaker mortar. These patterns match long-standing laboratory observations, giving confidence that the model is learning real physical behavior rather than noise.
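A rough sense of how ingredients can be ranked is given by the sketch below. It uses permutation importance from scikit-learn, a simpler stand-in that this summary does not attribute to the authors: it shuffles one ingredient at a time and measures how much the prediction quality drops, so it ranks ingredients by influence but, unlike the attribution method described above, does not say whether an ingredient pushes a prediction up or down. The data, the feature names, and the strength rule are all invented for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical feature names and made-up data (NOT the study's dataset).
feature_names = ["water", "cement", "sand", "gravel", "recycled_aggregate"]
rng = np.random.default_rng(1)
low = [150, 300, 600, 900, 0]
high = [220, 500, 800, 1200, 600]
X = rng.uniform(low, high, size=(300, 5))

# Invented rule: strength depends on the water-to-cement ratio and on
# recycled content; sand and gravel have no effect in this toy example.
y = 80.0 - 70.0 * X[:, 0] / X[:, 1] - 0.01 * X[:, 4] + rng.normal(0.0, 1.5, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Shuffle each column of the held-out data in turn and record the mean
# drop in R2; a larger drop means the model relies more on that column.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

ranking = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name:20s} mean drop in R2 = {score:.3f}")
```

In this toy setup water and cement come out on top simply because the invented rule was built around their ratio, which loosely mirrors the binder-driven pattern reported for strength.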
From smart predictions to smarter design
In simple terms, this work shows that data-driven tools can help engineers quickly screen greener concrete mixes that include recycled rubble while still meeting safety and performance needs. The study demonstrates that certain modern algorithms can predict how strong and stiff a proposed mix will be with high accuracy, and can highlight which changes in water, cement, or aggregate content matter most. While the current results are limited to the range of mixes found in the underlying studies, the same workflow can be expanded as more data become available. This paves the way for practical design aids that guide builders toward more sustainable concrete choices without sacrificing structural reliability.
Citation: Awoyera, P.O., Simwanda, L., Vasić, M.V. et al. Hybrid generative–ensemble approach for predicting recycled aggregate concrete strength properties. Sci Rep 16, 15205 (2026). https://doi.org/10.1038/s41598-026-42598-6
Keywords: recycled concrete, machine learning, material strength, sustainable construction, data-driven design