Clear Sky Science · en
AI-enhanced soil classification with incomplete CPT data for offshore wind farm
Why Seafloor Soils Matter for Wind Power
Offshore wind turbines stand in some of the harshest environments on Earth, and their stability depends on what lies hidden beneath the waves: the seafloor soils that hold their massive foundations in place. Testing these soils in deep water is expensive and often incomplete, leaving engineers to make big decisions with patchy information. This study shows how artificial intelligence can make sense of limited test data to reliably classify seafloor soils, helping designers place offshore wind turbines more safely and efficiently.

From Seabed Tests to Soil Types
Engineers commonly probe the seabed with cone penetration tests (CPTs), in which an instrumented metal cone is pushed into the ground while sensors record how strongly the soil resists the tip, how much friction develops along a sleeve just behind it, and how the water pressure in the soil’s pores changes. These measurements are traditionally interpreted with charts such as the Robertson classification chart, which links patterns in the readings to different soil behaviors, such as loose sand, stiff clay, or mixtures in between. The snag is that these charts assume complete, clean data, something rarely achieved when ships, waves, and deep water complicate offshore investigations.
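For readers who want to see what reading the chart looks like in numbers, the sketch below uses the soil behaviour type index Ic, a widely quoted single-number approximation of the Robertson chart. It is an illustration under stated assumptions, not the study's procedure: the article does not say which version of the chart the authors used, the Ic shortcut ignores pore pressure and covers only zones 2 to 7, and the function names and example values are hypothetical.

```python
import math

# Upper Ic bound, zone number, and description for each soil behaviour type.
# Assumption: commonly quoted Ic boundaries; the article does not state
# which version of the Robertson chart the authors applied.
IC_ZONES = [
    (1.31, 7, "gravelly sand to dense sand"),
    (2.05, 6, "sands: clean sand to silty sand"),
    (2.60, 5, "sand mixtures: silty sand to sandy silt"),
    (2.95, 4, "silt mixtures: clayey silt to silty clay"),
    (3.60, 3, "clays: silty clay to clay"),
    (math.inf, 2, "organic soils: peats"),
]


def soil_behaviour_index(qt_kpa, fs_kpa, sigma_v0_kpa, sigma_v0_eff_kpa):
    """Return (Qt, Fr, Ic) from cone resistance, sleeve friction and stresses."""
    q_net = qt_kpa - sigma_v0_kpa  # net cone resistance
    if q_net <= 0 or fs_kpa <= 0 or sigma_v0_eff_kpa <= 0:
        raise ValueError("inputs must give positive net resistance, friction and stress")
    qt_norm = q_net / sigma_v0_eff_kpa   # normalized cone resistance Qt
    fr_pct = 100.0 * fs_kpa / q_net      # normalized friction ratio Fr, in percent
    ic = math.sqrt((3.47 - math.log10(qt_norm)) ** 2
                   + (math.log10(fr_pct) + 1.22) ** 2)
    return qt_norm, fr_pct, ic


def classify(ic):
    """Map an Ic value to (zone number, description)."""
    for upper, zone, description in IC_ZONES:
        if ic < upper:
            return zone, description


# Hypothetical readings in kPa: a sand-like record and a clay-like record.
for reading in [(10000.0, 50.0, 200.0, 100.0), (600.0, 20.0, 200.0, 100.0)]:
    _, _, ic = soil_behaviour_index(*reading)
    print(round(ic, 2), classify(ic))
# The first record lands in zone 6 (sands), the second in zone 3 (clays).
```

Every input in this calculation must be present, which is exactly why a chart-based approach struggles when a sensor reading or a stress estimate is missing.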
Building a Virtual Library of Seafloor Conditions
To tackle the problem of missing information, the researchers first created an enormous virtual library of possible soil conditions instead of relying only on scarce field tests. They used realistic ranges for key cone measurements (tip resistance, sleeve friction, pore water pressure, and stresses in the ground) to generate more than 200,000 synthetic test records. These records were then translated into soil behavior zones using the Robertson chart, effectively pairing each simulated test with a soil “label.” By drawing some samples evenly across each range and others according to realistic statistical patterns, they ensured this synthetic library covered everything from very soft clays to dense sands and gravels.
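A minimal sketch of how such a library could be sampled is given below. Everything specific in it is an assumption made for illustration: the parameter ranges, the use of a truncated lognormal as the "realistic statistical pattern", the half-and-half split between the two sampling schemes, and the way effective stress is tied to total stress. The study's actual ranges and distributions are not given in this summary.

```python
import math
import random

# Hypothetical parameter ranges in kPa, chosen only for illustration.
RANGES_KPA = {
    "qt": (100.0, 60000.0),       # cone tip resistance
    "fs": (1.0, 1000.0),          # sleeve friction
    "u2": (1.0, 2000.0),          # pore water pressure
    "sigma_v0": (20.0, 1000.0),   # total vertical stress
}


def draw_uniform(rng, lo, hi):
    """Sample evenly across the range."""
    return rng.uniform(lo, hi)


def draw_lognormal(rng, lo, hi):
    """Sample a lognormal centred on the range's geometric mean, truncated to [lo, hi]."""
    mu = 0.5 * (math.log(lo) + math.log(hi))
    sigma = (math.log(hi) - math.log(lo)) / 4.0  # range spans about +/- 2 sigma
    while True:  # rejection sampling keeps values inside the range
        value = rng.lognormvariate(mu, sigma)
        if lo <= value <= hi:
            return value


def build_library(n_records, seed=0):
    """Return a list of synthetic CPT records, alternating the two schemes."""
    rng = random.Random(seed)
    library = []
    for i in range(n_records):
        draw = draw_uniform if i % 2 == 0 else draw_lognormal
        record = {name: draw(rng, lo, hi) for name, (lo, hi) in RANGES_KPA.items()}
        # Assumption: effective stress is a random fraction of total stress.
        record["sigma_v0_eff"] = record["sigma_v0"] * rng.uniform(0.3, 1.0)
        record["scheme"] = draw.__name__
        library.append(record)
    return library


library = build_library(1000, seed=42)
print(len(library), library[0]["scheme"], library[1]["scheme"])
```

Each record would then be passed through the classification chart to attach its soil label. Mixing the two schemes is one way to get both broad coverage of rare conditions and a realistic concentration of records around typical values.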
Teaching Machines to Read the Seafloor
With this synthetic library in hand, the team trained several machine-learning models to predict soil behavior directly from cone test inputs. They tested random forests, neural networks, support vector machines, and decision trees, carefully tuning each model’s internal settings. Among them, the random forest model stood out, reproducing the Robertson-based soil zones almost exactly and achieving more than 92% accuracy when its predictions were later converted to a standard soil classification. The model’s internal checks showed that cone tip resistance, sleeve friction, and effective vertical stress were the most influential inputs. This matches long-standing geotechnical understanding and suggests the model was learning physically meaningful relationships rather than spurious patterns.
Staying Accurate When Data Are Missing
The crucial test was whether the AI could still perform well when some measurements were missing, mimicking real offshore surveys where equipment fails or certain sensors are not used. The researchers systematically removed each input in turn and replaced it with uncertain values, then repeatedly ran the model in a Monte Carlo fashion to see how accuracy shifted on average. When the most critical inputs—cone resistance, sleeve friction, or effective stress—were missing, accuracy dropped considerably, confirming their importance. Yet when other parameters like pore water pressures or total stress were absent, the model still retained high accuracy, often above 90%. This shows the framework can continue to deliver useful soil classifications even when parts of the cone test record are incomplete.
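The missing-input experiment can be sketched in a few lines. The version below is a simplified illustration, not the authors' code: it assumes a missing input is replaced by a uniform random draw from a plausible range, and it uses a toy one-rule classifier (`toy_predict`, hypothetical) in place of the trained random forest so that the example stays self-contained.

```python
import random
import statistics


def accuracy_with_missing(predict, records, labels, missing, bounds,
                          n_runs=100, seed=0):
    """Mean and standard deviation of accuracy when one input is treated as missing.

    In each run the `missing` field of every record is overwritten with a
    uniform random draw from `bounds` (an assumed stand-in for an unknown value).
    """
    rng = random.Random(seed)
    lo, hi = bounds
    scores = []
    for _ in range(n_runs):
        hits = 0
        for record, label in zip(records, labels):
            trial = dict(record)                 # leave the original untouched
            trial[missing] = rng.uniform(lo, hi)
            hits += predict(trial) == label
        scores.append(hits / len(records))
    return statistics.mean(scores), statistics.stdev(scores)


def toy_predict(record):
    """Toy stand-in classifier: depends on tip resistance only (values in kPa)."""
    return "sand" if record["qt"] > 5000.0 else "clay"


# Hypothetical records; labels come from the toy classifier on complete data.
data_rng = random.Random(1)
records = [{"qt": data_rng.uniform(100.0, 20000.0),
            "u2": data_rng.uniform(0.0, 500.0)} for _ in range(500)]
labels = [toy_predict(r) for r in records]

acc_without_u2, _ = accuracy_with_missing(toy_predict, records, labels,
                                          "u2", (0.0, 500.0))
acc_without_qt, _ = accuracy_with_missing(toy_predict, records, labels,
                                          "qt", (100.0, 20000.0))
print(acc_without_u2, acc_without_qt)
```

In this toy setup, losing the pore pressure reading costs nothing because the classifier never uses it, while losing tip resistance cuts accuracy sharply. That mirrors the pattern reported in the study, where the drop in accuracy depends on how influential the missing input is.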

Proving the Method in Real Offshore Wind Farms
To move beyond virtual data, the team tested their trained random forest model on 229,808 actual cone test records from 99 boreholes at offshore wind farm sites in Taiwan and the Netherlands. None of these real measurements were used in training. The AI’s soil behavior predictions were compared with independent classifications based on a widely used engineering system that groups soils by grain size and plasticity. The model correctly matched these classifications over 92% of the time, outperforming several existing machine-learning approaches. Repeated simulations also showed that the predictions fell within a narrow 95% confidence band, indicating stable and reliable behavior.
What This Means for Future Offshore Wind
For non-specialists, the takeaway is that the study provides a smart assistant for reading the seafloor. Instead of replacing established engineering charts, the AI framework extends them to messy, real-world situations where data are incomplete or imperfect. By learning from a carefully designed synthetic library and then proving itself on large offshore datasets, the method offers a practical way to classify seabed soils quickly and consistently. This can reduce costly extra testing, support safer foundation designs, and ultimately help bring more offshore wind projects online with greater confidence in what lies beneath each turbine.
Citation: Ku, CY., Wu, TY., Liu, CY. et al. AI-enhanced soil classification with incomplete CPT data for offshore wind farm. Sci Rep 16, 10589 (2026). https://doi.org/10.1038/s41598-026-46356-6
Keywords: offshore wind, seafloor soils, cone penetration test, machine learning, foundation design