Clear Sky Science
Effect of feature extraction on underwater moving body cavitation pressure reconstruction and prediction
Why underwater bubbles matter
When a fast-moving object darts through water, it can leave more than a wake behind. The sudden drop and surge of pressure can create clouds of vapor bubbles that violently collapse against its surface. These events, known as cavitation, can rattle the vehicle, slow it down, and even damage its skin. Engineers want to predict where and how hard these pressure spikes will hit, but traditional testing in tanks or huge computer simulations is slow and expensive. This study explores how modern data techniques can squeeze more insight from small amounts of simulation data, helping designers make underwater vehicles that are faster, safer, and cheaper to develop.

From bubble storms to numbers
The researchers focused on a simple but demanding case: a slender underwater body shooting vertically upward toward the water surface at high speed. As it moves, pressure sensors distributed over its body record how the pressure rises and falls at hundreds of points. Capturing this with detailed fluid simulations requires tens of millions of grid cells and tiny time steps, so each run can take days. As a result, instead of millions of experimental samples, the team had only a few hundred simulated “pressure movies” and an even smaller subset, just 68 cases, with carefully identified peak pressure values. The central challenge was how to turn these dense, high-dimensional pressure histories into a smaller, more meaningful set of features that still preserves the most important behavior.
Three ways to see the hidden patterns
To tackle this, the authors compared three feature extraction strategies, essentially three ways of compressing each long pressure record into a short description. The first, Principal Component Analysis, rotates the data into a new set of directions that capture the largest overall swings, a bit like finding the best viewing angle to see the main shape of a cloud of points. The second, Fast Independent Component Analysis (FastICA), tries to tease apart overlapping “source signals,” separating distinct physical effects such as smooth flow and sudden bubble collapse. The third, a one-dimensional convolutional auto-encoder, is a compact neural network that learns to compress and then reconstruct the pressure histories by scanning along the body with small filters that look for local patterns such as sharp peaks or gentle recoveries. All three methods were trained on unlabeled simulation data to reproduce the original pressure evolution as faithfully as possible.
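For readers who want to see what "compressing a pressure record" looks like in practice, here is a minimal sketch of the first method, Principal Component Analysis, written with NumPy. It is an illustration under stated assumptions, not the authors' code: the data are synthetic stand-ins (a few smooth sine modes plus noise), and the function names, the 200-case sample size, and the choice of five components are all hypothetical. Only the 795-point record length is taken from the study.

```python
import numpy as np

def pca_fit(records, n_components):
    """Return the mean record and the top principal directions (as rows)."""
    mean = records.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal directions,
    # ordered from largest to smallest captured variance.
    _, _, vt = np.linalg.svd(records - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_compress(records, mean, components):
    """Project each record onto the kept directions (the short description)."""
    return (records - mean) @ components.T

def pca_reconstruct(features, mean, components):
    """Rebuild full-length records from the short feature vectors."""
    return features @ components + mean

# Synthetic stand-in data (assumption): 200 records of 795 points each,
# built from 5 smooth modes plus a little noise.
rng = np.random.default_rng(0)
n_cases, n_points, n_modes = 200, 795, 5
x = np.linspace(0.0, 1.0, n_points)
modes = np.stack([np.sin((m + 1) * np.pi * x) for m in range(n_modes)])
records = (rng.normal(size=(n_cases, n_modes)) @ modes
           + 0.01 * rng.normal(size=(n_cases, n_points)))

mean, components = pca_fit(records, n_components=5)
features = pca_compress(records, mean, components)
rebuilt = pca_reconstruct(features, mean, components)

# Relative reconstruction error over the whole data set.
rel_error = np.linalg.norm(rebuilt - records) / np.linalg.norm(records)
print(features.shape, f"relative error: {rel_error:.3%}")
```

Each 795-point record shrinks to five numbers here, and the rebuilt records can be compared with the originals to measure what the compression lost. Real cavitation data would be far less tidy than these sine modes, so the error printed by this sketch says nothing about the errors reported in the study.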
Rebuilding the pressure story
In the first set of tests, the team asked a simple question: if you keep only a small number of extracted features, how well can you rebuild the full pressure history? Both classical tools performed strongly. Using about three dozen components, the independent-component approach best reproduced the detailed pressure evolution along the body, closely followed by the principal-component method. The neural-network auto-encoder, in contrast, tended to smooth out the sharpest spikes, a sign that its pooling layers were discarding some of the rapid, localized changes that mark intense cavitation events. Quantitatively, all three methods kept the average reconstruction error below two percent, but the independent-component method was consistently the most accurate in this purely “copy what you saw” task.
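The smoothing effect attributed to pooling can be illustrated with a toy example. The sketch below is not the authors' auto-encoder. It assumes average pooling over non-overlapping windows followed by simple repeat upsampling, and it uses a hypothetical 800-point trace (chosen so the length divides evenly by the window width) with a single two-sample spike. The type of pooling used in the study is not stated here, so treat this purely as a picture of the mechanism.

```python
import numpy as np

def avg_pool(signal, width):
    """Average non-overlapping windows of `width` samples."""
    return signal.reshape(-1, width).mean(axis=1)

def upsample(pooled, width):
    """Repeat each pooled value `width` times (nearest-neighbour upsampling)."""
    return np.repeat(pooled, width)

# Hypothetical pressure trace: smooth background plus one narrow spike.
n_points, width = 800, 4
x = np.linspace(0.0, 1.0, n_points)
trace = 0.2 * np.sin(2 * np.pi * x)
trace[400:402] += 1.0  # two-sample spike standing in for a collapse event

# Pool down by a factor of 4, then stretch back to the original length.
rebuilt = upsample(avg_pool(trace, width), width)

rel_error = np.linalg.norm(rebuilt - trace) / np.linalg.norm(trace)
print(f"original peak: {trace.max():.3f}, rebuilt peak: {rebuilt.max():.3f}")
print(f"relative reconstruction error: {rel_error:.3%}")
```

Because the spike is narrower than the pooling window, its height is averaged with the quieter samples around it and cannot be recovered by upsampling alone. A trained network has learned filters that can partly compensate, so the loss in a real auto-encoder would differ from this toy case.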

Finding the most dangerous hit
The second test focused on what matters most for design: predicting the single strongest pressure surge at a sensor location, using only a small set of labeled examples. Here the story flipped. The researchers used the same simple prediction network in all cases and varied only its inputs: either the raw 795-point pressure record or the much shorter feature vectors from each extraction method. When fed features from the convolutional auto-encoder, the predictor’s error in estimating the peak pressure dropped by roughly ten percent compared with using the raw data. Features from the principal-component method gave a more modest three-percent improvement. Surprisingly, the independent-component method, which had excelled at reconstruction, made peak prediction worse. The authors argue that this happens because the peak is not an isolated, independent “source,” but the combined result of several interacting processes, which clashes with the assumptions built into that method.
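The experimental design described above (one fixed predictor, different inputs) can be sketched in a few lines. This is a rough illustration under several loudly stated assumptions: the data are synthetic, the predictor is a closed-form ridge regression swapped in for the study's prediction network, the features come from PCA rather than the auto-encoder, and every name and parameter below is hypothetical. Only the 795-point record length and the 68 labeled cases come from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
N_POINTS, N_MODES = 795, 5
x = np.linspace(0.0, 1.0, N_POINTS)
modes = np.stack([np.sin((m + 1) * np.pi * x) for m in range(N_MODES)])

def make_records(n_cases):
    """Synthetic stand-in records: offset + a few smooth modes + small noise."""
    weights = 0.3 * rng.normal(size=(n_cases, N_MODES))
    noise = 0.01 * rng.normal(size=(n_cases, N_POINTS))
    return 2.0 + weights @ modes + noise

def ridge_fit(inputs, targets, penalty=1e-2):
    """Closed-form ridge regression with an appended intercept column."""
    design = np.hstack([inputs, np.ones((len(inputs), 1))])
    gram = design.T @ design + penalty * np.eye(design.shape[1])
    return np.linalg.solve(gram, design.T @ targets)

def ridge_predict(inputs, weights):
    design = np.hstack([inputs, np.ones((len(inputs), 1))])
    return design @ weights

def mean_relative_error(predicted, actual):
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)))

# Plenty of unlabeled records, but only 68 labeled ones (peak value known).
unlabeled = make_records(300)
labeled = make_records(68)
peaks = labeled.max(axis=1)  # hypothetical label: the peak of each record

# Feature extractor fitted on unlabeled data only (PCA as a stand-in).
mean = unlabeled.mean(axis=0)
_, _, vt = np.linalg.svd(unlabeled - mean, full_matrices=False)
components = vt[:N_MODES]
labeled_features = (labeled - mean) @ components.T

# Same predictor, same train/test split, two different inputs.
train, test = slice(0, 50), slice(50, 68)
raw_weights = ridge_fit(labeled[train], peaks[train])
raw_error = mean_relative_error(
    ridge_predict(labeled[test], raw_weights), peaks[test])
feat_weights = ridge_fit(labeled_features[train], peaks[train])
feature_error = mean_relative_error(
    ridge_predict(labeled_features[test], feat_weights), peaks[test])

print(f"raw input error:     {raw_error:.3%}")
print(f"feature input error: {feature_error:.3%}")
```

The point of the sketch is the structure of the comparison: the extractor sees only unlabeled data, and the predictor and data split are held fixed so that any difference in error comes from the inputs. Which input wins on this synthetic data has no bearing on the results reported in the study.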
What this means for future underwater designs
For non-specialists, the key message is that smart data compression can make small, hard-won cavitation datasets far more useful. Methods that simply rebuild the overall pressure field are not necessarily the best for forecasting the most damaging spikes. In this study, a compact neural network that learned its own features from the data turned out to be the most helpful for predicting peak pressures, even though it lagged in raw reconstruction fidelity. By showing how different feature extraction tools succeed or fail under tight data constraints, the work offers a roadmap for using machine learning to speed up the design of high-speed underwater vehicles, while still respecting the complex physics of cavitation.
Citation: Qiang, Y., He, Z., Chen, W. et al. Effect of feature extraction on underwater moving body cavitation pressure reconstruction and prediction. Sci Rep 16, 9065 (2026). https://doi.org/10.1038/s41598-026-40012-9
Keywords: cavitation, underwater vehicles, feature extraction, machine learning, pressure prediction