Clear Sky Science
Pareto-optimized stacked ensemble machine learning framework for predicting bearing capacity of driven piles from static load test data
Why strong foundations matter
Every building, bridge, or wind turbine ultimately depends on what lies beneath it. Hidden below the surface, long concrete columns called piles carry the weight of structures down to firm soil or rock. If engineers overestimate how much load these piles can safely carry, the result can be settlement, cracking, or even failure. If they underestimate it, designs become overly conservative and projects become unnecessarily expensive. This study explores how modern machine learning can turn routine site tests into far more accurate predictions of pile strength, helping make construction both safer and more economical.
From simple rules to data-driven insight
Traditionally, engineers have relied on formulas, small-scale experiments, and numerical simulations to estimate the bearing capacity of driven piles. These methods work reasonably well, but they struggle with the messy reality of layered soils, groundwater, and installation effects. A field test known as the static load test is considered the gold standard: a pile is loaded step by step while its settlement is measured. However, this approach is slow, costly, and hard to repeat at many locations. At the same time, geotechnical practice routinely collects simple hammer-blow data from the standard penetration test, or SPT, which gives a rough measure of soil resistance with depth. The central idea of this paper is to use machine learning to link these widely available SPT results, along with pile geometry and site details, directly to high-quality load test outcomes.

Building a smarter prediction engine
The authors compiled a database of 472 static load tests on driven reinforced-concrete piles in a Vietnamese province with varied but systematically documented soil conditions. For each test, they recorded pile diameter, thickness of several soil layers, ground and pile elevations, depth to the pile tip, and average SPT hammer-blow counts along the pile shaft and at the tip. These inputs capture both the size of the pile and the strength of the surrounding soil, which together control how much load the pile can carry. The goal was to train a model that, when given these inputs for a new site, would output a reliable estimate of the pile’s axial bearing capacity without the need for a full-scale load test.
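To make the shape of one database entry concrete, here is a minimal Python sketch of a single load-test record. The class name, field names, units, and example values are all assumptions made for illustration; the article does not say how the authors stored their data or how many soil layers they recorded.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PileTestRecord:
    """One static load test (hypothetical layout; names and units are assumed)."""
    diameter_m: float                       # pile diameter
    layer_thicknesses_m: Tuple[float, ...]  # thickness of each recorded soil layer
    ground_elevation_m: float               # ground elevation
    pile_elevation_m: float                 # pile elevation
    tip_depth_m: float                      # depth to the pile tip
    spt_shaft_avg: float                    # average SPT blow count along the shaft
    spt_tip_avg: float                      # average SPT blow count at the tip
    measured_capacity_kn: float             # target: capacity from the load test

    def features(self) -> List[float]:
        """Return the model inputs in a fixed order, excluding the target."""
        return [
            self.diameter_m,
            *self.layer_thicknesses_m,
            self.ground_elevation_m,
            self.pile_elevation_m,
            self.tip_depth_m,
            self.spt_shaft_avg,
            self.spt_tip_avg,
        ]


# Invented example values, not taken from the study's database.
example = PileTestRecord(
    diameter_m=0.4,
    layer_thicknesses_m=(2.0, 5.5, 8.0),
    ground_elevation_m=3.2,
    pile_elevation_m=2.7,
    tip_depth_m=18.0,
    spt_shaft_avg=12.0,
    spt_tip_avg=30.0,
    measured_capacity_kn=1450.0,
)
print(len(example.features()))
```

Keeping the measured capacity out of `features()` reflects the setup described above: the inputs describe the pile and the soil, and the load-test result is what the model is asked to predict.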
Combining several minds into one
Rather than relying on a single algorithm, the study uses a “stacked” ensemble: three different machine learning models (random forest, k-nearest neighbors, and extreme gradient boosting) are trained in parallel, and their predictions are then combined by a second-level model. This meta-model is tuned using Pareto optimization, a multi-objective search inspired by evolutionary strategies that looks for the best available trade-offs between competing goals, here maximizing accuracy on unseen data while avoiding overfitting. Through repeated testing and five-fold cross-validation, the best combination achieved a coefficient of determination (R², a measure of how well predictions match reality, where 1 is a perfect match) of about 0.95 on the test set, along with substantially lower error than any single model.
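As a rough illustration of this kind of trade-off, the Python sketch below scores a handful of hypothetical candidate models on two objectives, both to be minimized: error on held-out data (taken here as 1 minus the coefficient of determination) and the gap between training and held-out scores, used as a simple stand-in for overfitting. The candidate names and numbers are invented and are not results from the study, and the authors' actual objectives and search procedure may differ. A candidate stays on the Pareto front if no other candidate is at least as good on both objectives and strictly better on one.

```python
from math import fsum


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = fsum(y_true) / len(y_true)
    ss_res = fsum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = fsum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def dominates(a, b):
    """True if objective tuple a is no worse than b everywhere and better somewhere.

    All objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, objectives in candidates.items():
        dominated = any(
            dominates(other, objectives)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical candidates: (held-out error = 1 - R^2, train/held-out gap).
candidates = {
    "A": (0.05, 0.04),   # most accurate, but a larger gap
    "B": (0.06, 0.01),   # slightly less accurate, much smaller gap
    "C": (0.07, 0.05),   # worse than A on both objectives
    "D": (0.10, 0.005),  # least accurate, smallest gap
    "E": (0.06, 0.03),   # same error as B, but a larger gap
}
print(pareto_front(candidates))  # ['A', 'B', 'D']
```

The front only narrows the field: choosing among A, B, and D still requires a judgment about how much accuracy to give up for a smaller gap between training and held-out performance.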
Seeing inside the black box
To make the system usable in practice, engineers need to understand what drives its predictions. The authors therefore used modern interpretability tools, particularly SHAP (SHapley Additive exPlanations), sensitivity analysis, and simple parameter sweeps. These revealed that pile diameter is the single most influential factor, with changes in diameter strongly shifting predicted capacity. Soil properties also play a key role: average SPT counts along the pile shaft and at the tip emerge as major contributors, reflecting the importance of both side friction and end bearing. Thickness of the upper soil layers matters as well, whereas some elevation-related variables have only minor impact. When the model’s inputs are varied in physically meaningful ways, for example by increasing pile diameter or placing the tip in denser soil, the predicted capacities respond in line with basic geotechnical principles, suggesting that the model is learning sensible rather than spurious patterns.
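The idea behind Shapley-style attribution can be shown on a toy example. The Python sketch below defines a deliberately simple, hypothetical capacity function (shaft friction plus end bearing, with made-up coefficients) and computes exact Shapley values for one pile relative to a single baseline pile by averaging each input's marginal effect over every ordering of the inputs. This is not the authors' model, not a design formula, and not how the SHAP library is implemented, since practical tools approximate these values for models with many inputs. It only illustrates the quantity being estimated.

```python
from itertools import permutations
from math import pi


def toy_capacity(diameter_m, length_m, n_shaft, n_tip):
    """Hypothetical capacity in kN; the coefficients 2.0 and 300.0 are invented."""
    shaft = pi * diameter_m * length_m * (2.0 * n_shaft)    # side friction term
    tip = (pi * diameter_m ** 2 / 4.0) * (300.0 * n_tip)    # end bearing term
    return shaft + tip


def shapley_values(model, x, baseline):
    """Exact Shapley values of each input in x, measured against one baseline input.

    For every ordering of the inputs, switch them from the baseline value to
    the value in x one at a time and record how much the prediction changes.
    """
    names = list(x)
    totals = dict.fromkeys(names, 0.0)
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)
        previous = model(**current)
        for name in ordering:
            current[name] = x[name]
            updated = model(**current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


# Invented inputs: a reference pile and a larger pile in denser soil.
baseline = {"diameter_m": 0.3, "length_m": 15.0, "n_shaft": 8.0, "n_tip": 15.0}
pile = {"diameter_m": 0.4, "length_m": 18.0, "n_shaft": 12.0, "n_tip": 30.0}

phi = shapley_values(toy_capacity, pile, baseline)
for name, value in sorted(phi.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.1f} kN")
```

By construction, the four contributions add up to the difference between the predicted capacity of the pile and that of the baseline, which is what lets this kind of analysis split a single prediction into per-input shares.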

How this helps real-world projects
By carefully blending several machine learning approaches and optimizing them against multiple goals, this work delivers a prediction tool that is both accurate and interpretable. For sites with similar geology and test practices to those in the study, engineers could use standard SPT data and design parameters to obtain reliable estimates of pile capacity, reduce the number of expensive load tests, and better target safety margins. The authors caution that their data come from a single region and that long-term time effects and some installation details are not fully captured, so judgment and local calibration remain essential. Even so, the framework shows how data-driven models, when transparently analyzed, can become practical partners in foundation design rather than opaque black boxes.
Citation: Abdellatief, M., ElNemr, A. & Altahrany, A. Pareto-optimized stacked ensemble machine learning framework for predicting bearing capacity of driven piles from static load test data. Sci Rep 16, 11360 (2026). https://doi.org/10.1038/s41598-026-43660-z
Keywords: pile foundation, bearing capacity, geotechnical engineering, machine learning, static load test