Clear Sky Science
Curtain grouting volume prediction using a Bayesian-optimized stacking ensemble model with SHAP analysis
Why predicting hidden cement work matters
Deep below dams and tunnels, engineers inject liquid cement into cracks in rock to keep water from leaking through. This process, called grouting, is expensive and hard to see directly, so workers must guess how much grout to pump at each drill hole. Guess too low and water may still seep; guess too high and money, materials, and time are wasted. This study shows how modern data tools can help engineers predict the right amount of grout more reliably before work begins.
Sealing cracks under big water projects
Large dams and underground caverns rely on curtains of hardened grout to block the underground paths that water would otherwise use to escape. Workers drill many holes in the rock and pump in cement slurry, but the rock below the surface is irregular, with fractures that vary in size and connectivity. Because the process is hidden and conditions change from place to place, predicting the volume of grout each hole will need has long been a major challenge for safe design and cost control in big water and hydropower projects.
From rules of thumb to learning from data
For decades, engineers have used simplified formulas, comparisons with past projects, or computer simulations of fluid flow to estimate grout use. These approaches helped but often struggled when faced with complicated real rock and noisy field measurements. In this study, the authors instead turned to machine learning, which learns patterns directly from data. They assembled 778 real grouting records from a large water project in Xinjiang, each describing conditions at a drill hole: its position in the drilling order, depth, length of the treated section, hole width, how easily water flowed through the rock beforehand, the starting mix of water and cement, and the pressure used during grouting. The outcome to be predicted was the actual volume of grout pumped.
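As a rough illustration, one grouting record with the seven inputs and the single target described above might be represented as follows. This is only a sketch: the class name, field names, and sample values are hypothetical, and the article does not give units or the dataset's actual column names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroutingRecord:
    """One drill-hole record; all field names are illustrative assumptions."""
    hole_sequence: int          # position of the hole in the drilling order
    depth: float                # depth of the grouted section
    section_length: float       # length of the treated section
    hole_diameter: float        # width of the drill hole
    pre_grout_permeability: float  # how easily water flowed beforehand
    water_cement_ratio: float   # starting mix of water and cement
    grouting_pressure: float    # pressure used during grouting
    grout_volume: float         # target: volume of grout actually pumped

    def features(self) -> list:
        """Return the seven model inputs in a fixed order (target excluded)."""
        return [
            float(self.hole_sequence),
            self.depth,
            self.section_length,
            self.hole_diameter,
            self.pre_grout_permeability,
            self.water_cement_ratio,
            self.grouting_pressure,
        ]


# Made-up values purely to show the shape of a record.
example = GroutingRecord(
    hole_sequence=2,
    depth=35.0,
    section_length=5.0,
    hole_diameter=76.0,
    pre_grout_permeability=12.5,
    water_cement_ratio=3.0,
    grouting_pressure=1.5,
    grout_volume=420.0,
)
feature_vector = example.features()
```

Keeping the target separate from `features()` makes it harder to leak the answer into the model inputs by accident.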

Blending several smart models into one
Rather than rely on a single algorithm, the team used a strategy called stacking, which lets several different prediction models work together. Three computer models that are especially good at handling complex patterns in tabular data were chosen as the first layer. Each one examined the same seven input factors and produced its own estimate of grout volume. A simple but carefully controlled regression model then took these three estimates and blended them into a final prediction. To make sure each model used its internal settings as effectively as possible, the researchers turned to Bayesian optimization, a method that explores and tunes many combinations of settings in an organized, data-driven way instead of by trial and error.
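The stacking idea described above can be sketched with scikit-learn. Everything specific here is an assumption: the article does not name the three first-layer models or the blending model, so random forest, gradient boosting, and extra trees stand in for the former and ridge regression for the latter. The data is synthetic, not the Xinjiang records, and the Bayesian tuning step is left out.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 300 rows, 7 input columns, 1 target.
# The relationship below is invented purely so the models have a pattern to learn.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 7))
y = 3.0 * X[:, 4] + 2.0 * X[:, 6] * X[:, 2] + rng.normal(0.0, 0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# First layer: three different tree-based regressors (assumed choices).
base_models = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("gbr", GradientBoostingRegressor(random_state=0)),
    ("et", ExtraTreesRegressor(n_estimators=50, random_state=0)),
]

# Second layer: a regularized linear model blends the three estimates.
# With cv=5 the blender is trained on out-of-fold predictions, so it does
# not see estimates that the base models made on their own training rows.
stack = StackingRegressor(
    estimators=base_models,
    final_estimator=Ridge(alpha=1.0),
    cv=5,
)
stack.fit(X_train, y_train)

predictions = stack.predict(X_test)
r2 = stack.score(X_test, y_test)  # coefficient of determination on held-out rows
```

In a tuned version, settings such as `n_estimators` or `alpha` would be chosen by a search procedure rather than fixed by hand as they are here.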
Checking accuracy and opening the black box
To test their approach, the authors compared the stacking model with its three individual component models, both before and after fine-tuning. They measured how closely predictions matched real grout volumes using standard error scores. The stacked model with Bayesian tuning performed best, explaining about 92 percent of the variation in grout volume while keeping average errors relatively small. To open the black box, the study also used SHAP analysis, a method that attributes each prediction to the individual input factors, to show which conditions most strongly influence grout volume.
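The share of variation a model explains is usually reported as the coefficient of determination, R². The article does not list which error scores were used, so the sketch below assumes three common ones (R², root mean squared error, and mean absolute error) and applies them to made-up numbers rather than the study's results.

```python
import math


def r_squared(actual, predicted):
    """Fraction of the variation in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the target."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented grout volumes for four holes (units unspecified).
actual_volumes = [100.0, 150.0, 200.0, 250.0]
predicted_volumes = [110.0, 140.0, 210.0, 240.0]

score_r2 = r_squared(actual_volumes, predicted_volumes)   # 0.968
score_rmse = rmse(actual_volumes, predicted_volumes)      # 10.0
score_mae = mae(actual_volumes, predicted_volumes)        # 10.0
```

An R² of about 0.92, as reported for the tuned stacking model, would mean that roughly 8 percent of the variation in grout volume was left unexplained.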
What this means for real-world grouting
The study concludes that a carefully tuned, stacked machine learning model can give more accurate and stable predictions of grout volume than individual models or older rule-based methods, at least for the project studied. By quantifying which conditions matter most and how they interact, the approach can help engineers plan drilling and pumping strategies, allocate materials, and manage risks more efficiently in complex ground conditions. While the model still needs testing and adaptation for other sites, it offers a practical path toward data-informed control of this crucial but largely hidden part of dam and tunnel construction.
Citation: Ma, Y., Yuan, Z., Xiong, B. et al. Curtain grouting volume prediction using a Bayesian-optimized stacking ensemble model with SHAP analysis. Sci Rep 16, 15374 (2026). https://doi.org/10.1038/s41598-026-45538-6
Keywords: curtain grouting, grouting volume prediction, machine learning, Bayesian optimization, SHAP analysis