Clear Sky Science
Assessment of Lasso and Ridge models for soil swelling potential prediction
Why the Ground Beneath Buildings Matters
Many buildings and roads may look solid, but the soil beneath them can quietly swell and shrink as it gains or loses moisture. This hidden movement can crack walls, tilt foundations, and damage pipelines, costing billions in repairs worldwide. The paper summarized here explores how modern data tools can help engineers predict when and where soil is likely to swell, so that homes, highways, and other structures can be designed to stay safe and stable.

Soils That Breathe with Water
Not all soils behave the same way when they get wet. Fine-grained, clay-rich soils can absorb water and expand like a sponge, then contract as they dry. This repeating cycle is especially severe in arid and semi‑arid regions, where strong wet–dry swings are common. Traditionally, engineers have relied on time‑consuming laboratory tests to measure properties such as how much clay is present, how dense the soil is, and how sticky or plastic it becomes when mixed with water. These tests reveal whether a soil is likely to heave and crack, but they are expensive to run for every site and every project.
Turning Soil Measurements into Smart Predictions
To reduce the need for constant laboratory testing, the researchers assembled four datasets drawn from earlier studies in Asia, Africa, and Europe. Together, these sets contained 273 soil samples, each described by 16 measurements, including clay and silt content, moisture levels, density, and standard consistency measures. For every sample, the team also had a measured "swell potential," indicating how much the soil expanded in a controlled test. They cleaned and normalized the data, removed redundant information, and split the samples into training and testing groups so that each prediction method would be judged on data it had not seen before.
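The preparation steps described above can be sketched in a few lines of Python. The summary does not say which normalization or split ratio the authors used, so this sketch assumes z-score scaling and a 20% hold-out. The sample rows, column choices, and function names are invented for illustration and are not the study's data or code.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(train_rows):
    """Compute per-column mean and standard deviation from training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 so a constant column does not cause division by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_scaler(rows, means, stds):
    """Z-score every value using statistics learned from the training rows."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Invented rows: [clay content %, liquid limit %, dry density g/cm^3]
samples = [
    [35.0, 48.0, 1.62], [52.0, 71.0, 1.48], [28.0, 39.0, 1.71],
    [61.0, 83.0, 1.41], [44.0, 57.0, 1.55], [19.0, 31.0, 1.78],
    [57.0, 76.0, 1.45], [33.0, 45.0, 1.66], [48.0, 64.0, 1.51],
    [25.0, 36.0, 1.74],
]

train_rows, test_rows = train_test_split(samples, test_fraction=0.2, seed=0)
means, stds = fit_scaler(train_rows)
train_scaled = apply_scaler(train_rows, means, stds)
test_scaled = apply_scaler(test_rows, means, stds)  # reuses training statistics
```

Fitting the scaler on the training rows alone is a deliberate choice here: it keeps information about the held-out samples from leaking into the model, which is the point of judging a method on data it has not seen.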
Old Versus New Ways to Read the Ground
The core of the study is a head-to-head comparison between a familiar statistical tool and two newer approaches. The traditional tool, called multiple linear regression, estimates swell potential as a weighted sum of all the soil measurements. The newer methods, known as Lasso and Ridge regression, use the same kind of weighted sum but add a "penalty" on the size of the weights, which discourages the model from relying too heavily on any one input or from using too many at once. In practice, Lasso can shrink the weights of less important properties all the way to zero, spotlighting the handful of soil features that truly drive swelling, while Ridge keeps all features but dampens their weights when they are strongly interrelated.
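The difference between the two penalties is easiest to see in a textbook special case, sketched below in Python. It assumes the inputs are standardized and mutually uncorrelated (an orthonormal design), with the Lasso loss written as ½‖y − Xb‖² + λ‖b‖₁ and the Ridge loss as ½‖y − Xb‖² + (λ/2)‖b‖². Under those assumptions each penalized weight follows directly from the ordinary regression weight. The feature names, starting weights, and λ value are invented for illustration and are not results from the study.

```python
def ridge_shrink(ols_coef: float, lam: float) -> float:
    """Ridge: scale the weight toward zero; it stays nonzero for any finite lam."""
    return ols_coef / (1.0 + lam)


def lasso_shrink(ols_coef: float, lam: float) -> float:
    """Lasso: soft-threshold; a weight with |value| <= lam becomes exactly zero."""
    magnitude = max(abs(ols_coef) - lam, 0.0)
    return magnitude if ols_coef >= 0 else -magnitude


# Invented ordinary-regression weights on standardized inputs (not study results).
ols_weights = {
    "liquid_limit": 0.80,
    "clay_content": 0.45,
    "silt_content": 0.10,
    "dry_density": -0.30,
}
lam = 0.2  # hypothetical penalty strength

lasso_weights = {name: lasso_shrink(w, lam) for name, w in ols_weights.items()}
ridge_weights = {name: ridge_shrink(w, lam) for name, w in ols_weights.items()}

for name in ols_weights:
    print(f"{name:13s} lasso={lasso_weights[name]:+.3f} ridge={ridge_weights[name]:+.3f}")
```

In this sketch the smallest weight drops out entirely under Lasso, while Ridge keeps all four and reduces each by the same proportion. With correlated inputs, as in real soil data, neither method has such a simple closed form and the weights are found numerically.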
What the Models Learned About Swelling Soils
Across the four datasets, the regularized models, Lasso and Ridge, consistently produced more reliable predictions than the traditional method. On the best-behaved datasets, Lasso in particular tracked measured swell potential very closely, with only small average errors and a high fraction of the natural variation explained. Both regularized methods handled noisy, overlapping soil properties much better than ordinary linear regression, which often over- or under-estimated swelling, especially at higher values. The analysis also confirmed that certain traits, such as liquid limit, plasticity, and clay content, are the main signals of swelling risk, while factors tied to denser, less porous soil tend to reduce that risk.
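"Small average errors" and "fraction of the variation explained" correspond to standard regression scores. The summary does not name the exact error measures used, so the Python sketch below assumes mean absolute error (MAE) and root mean squared error (RMSE), alongside the coefficient of determination (R²). The measured and predicted values are invented for illustration.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction misses."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: like MAE, but penalizes large misses more."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Fraction of the variation in the measured values explained by the model."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented swell-potential values in percent (not data from the study).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(f"MAE={mae(measured, predicted):.3f}")
print(f"RMSE={rmse(measured, predicted):.3f}")
print(f"R^2={r_squared(measured, predicted):.3f}")
```

An R² close to 1 together with low MAE and RMSE on the held-out samples is the pattern the summary describes for the better-performing models.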

From Data to Safer Foundations
For a non‑specialist, the key message is that existing soil tests can now be combined with data‑driven models to quickly flag areas where the ground is likely to lift and crack structures. By using Lasso and Ridge regression, engineers can focus on a smaller set of truly important soil properties and obtain more accurate estimates of swelling without running every possible test at every site. This allows foundations, pavements, and earthworks in swelling‑prone regions to be designed with appropriate safeguards from the outset, reducing the chance of costly damage later in the life of a structure.
Citation: Bility, M.T., Vora, T., Lakhani, P.N. et al. Assessment of Lasso and Ridge models for soil swelling potential prediction. Sci Rep 16, 11922 (2026). https://doi.org/10.1038/s41598-026-39917-2
Keywords: expansive soil, soil swelling, machine learning, Lasso regression, geotechnical engineering