Clear Sky Science · en
A coupled spatial reduction-reconstruction and LSTM framework (SRR-LSTM) for groundwater level prediction in large irrigation districts
Why farmers and cities should care about hidden water
In many dry regions, the water that keeps crops growing and taps flowing does not come from rivers or reservoirs we can see, but from vast underground reserves called aquifers. As irrigation expands and droughts intensify, these hidden water stores are being drained faster than they are refilled. Managing them wisely demands tools that can predict how groundwater levels will change across large farming districts, month by month and field by field, without requiring supercomputers or decades of measurements. This study introduces a new way to do exactly that for a major irrigation district in Northeast China.
A thirsty landscape under pressure
The research focuses on the Taobei Irrigation District, a 1,904-square-kilometer farming region on the plains of the Tao’er River Basin. The area has a semi-arid climate: most of its modest rainfall arrives in just a few summer months, while evaporation is high. Since the early 1990s, irrigated land, especially water-hungry paddy fields, has expanded dramatically, just as a series of dry years reduced river flows. As a result, groundwater has sometimes supplied more than 90 percent of irrigation water. The consequence is a broad, deep “cone” of lowered groundwater levels centered on the paddy fields. Water tables there now sit 7–10 meters or more below the levels of past decades and have even dropped beneath the riverbed, which reverses the natural river–aquifer exchange and stresses local ecosystems.

From slow physics to faster smart models
Scientists have long used physics-based computer models, such as MODFLOW, to simulate groundwater behavior. These models solve the equations that describe how water moves through the subsurface, grid cell by grid cell. They are accurate but slow, especially when exploring many combinations of climate, river flow, and pumping policies. Machine learning and deep learning models can be much faster, but past attempts often treated a whole region with a single model or relied on just a few wells, making it hard to capture how differently groundwater behaves near rivers, under cities, or beneath various crops. The challenge is to keep enough physical realism and spatial detail while cutting computation time to something useful for real-world management.
A smart way to group the land
The authors propose a “spatial reduction–reconstruction” framework, abbreviated SRR-LSTM, that combines a classic clustering method with a modern deep learning network. First, they run an existing detailed surface–subsurface model (SWAT-MODFLOW) under 16 scenarios that mix different climate futures and pumping intensities, generating long groundwater level histories for every 1-kilometer grid in the district. Next, they group grids into clusters with similar traits—such as land use, elevation, aquifer thickness, and how strongly groundwater levels fluctuate—using a method called K-means. For each cluster, they select a representative “control” grid and train a Long Short-Term Memory (LSTM) neural network to predict that grid’s groundwater level from monthly rainfall, evapotranspiration, river flow, pumping, and the previous month’s water level.
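The grouping step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the six grid cells and their feature values are invented, the farthest-point seeding is an assumption made so the result is reproducible, and picking the cell nearest each cluster center as the "control" grid is one plausible reading of "representative".

```python
import math

# Hypothetical grid cells: [paddy fraction, elevation (m),
# aquifer thickness (m), groundwater level fluctuation (m)].
# These numbers are invented for illustration, not taken from the study.
features = [
    [0.90, 140.0, 35.0, 2.8],
    [0.80, 141.0, 33.0, 2.5],
    [0.85, 139.0, 36.0, 3.0],
    [0.10, 160.0, 20.0, 0.6],
    [0.05, 162.0, 18.0, 0.5],
    [0.15, 158.0, 21.0, 0.8],
]

def standardize(rows):
    """Scale every feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iters=100):
    """Plain K-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(k), key=lambda j: dist2(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

def pick_control_grids(points, labels, centroids):
    """For each cluster, return the index of the cell closest to its center."""
    picks = {}
    for j, center in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            picks[j] = min(members, key=lambda i: dist2(points[i], center))
    return picks

scaled = standardize(features)
labels, centroids = kmeans(scaled, k=2)
control = pick_control_grids(scaled, labels, centroids)
print("cluster of each cell:", labels)
print("control cell per cluster:", control)
```

Standardizing first matters because elevation in meters would otherwise swamp a land-use fraction between 0 and 1. In the framework described above, each selected control cell would then get its own LSTM model.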

Rebuilding a detailed map from a few smart models
Once these control-grid models are trained, the framework tests how well each model predicts groundwater levels at every grid in the district, building an accuracy map. Each grid is then assigned to whichever model predicts it best, and extra control grids are added where accuracy is poor, such as along the outer edge of the drawdown cone and near the river. This “accuracy-driven” reassignment effectively carves the district into zones where a shared model works well. In the final setup, nine LSTM models working in parallel can reproduce the high-resolution groundwater map every month. Accuracy is measured with the Nash–Sutcliffe Efficiency, a score on which 1 means a perfect match. Evaluated against the detailed physics model and compared with three alternative schemes, SRR-LSTM scores above 0.9 for 96 percent of the grids, far more than the 11–49 percent reached by the simpler schemes, while cutting computation time by about 80 percent.
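The reassignment logic can be illustrated with a short Python sketch. It assumes the reference groundwater levels and each model's predictions are already available as plain lists. The grid names, model names, and numbers are hypothetical, and using 0.9 as the cutoff for flagging a grid is an assumption borrowed from the accuracy figure quoted above, not a rule stated by the authors.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 is no better
    than always predicting the mean of the observed series."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / spread

def assign_best_model(reference, predictions):
    """Map each grid to the (model name, NSE) pair with the highest NSE.

    reference:   {grid: [levels]} from the detailed physics model
    predictions: {model: {grid: [levels]}} from each control-grid model
    """
    assignment = {}
    for grid, observed in reference.items():
        scores = {model: nse(observed, preds[grid])
                  for model, preds in predictions.items()}
        best = max(scores, key=scores.get)
        assignment[grid] = (best, scores[best])
    return assignment

def needs_new_control(assignment, threshold=0.9):
    """Grids where even the best model falls short (threshold is an assumption)."""
    return [grid for grid, (_, score) in assignment.items() if score < threshold]

# Hypothetical monthly groundwater levels (m) at three grids.
reference = {
    "g1": [10.0, 9.5, 9.0, 9.4],
    "g2": [12.0, 11.8, 11.5, 11.9],
    "g3": [8.0, 7.0, 6.5, 7.5],
}
predictions = {
    "model_A": {"g1": [10.1, 9.4, 9.0, 9.5],
                "g2": [12.5, 12.3, 12.0, 12.4],
                "g3": [8.5, 7.9, 7.6, 8.0]},
    "model_B": {"g1": [10.6, 10.0, 9.6, 10.0],
                "g2": [12.0, 11.8, 11.6, 11.9],
                "g3": [8.4, 7.6, 7.2, 7.9]},
}

assignment = assign_best_model(reference, predictions)
flagged = needs_new_control(assignment)
for grid, (model, score) in assignment.items():
    print(grid, model, round(score, 3))
print("candidates for an extra control grid:", flagged)
```

In this toy case one grid is served well by each model, while the third is predicted poorly by both. That third grid is the kind of location where the framework would add a new control grid and train another model.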
Seeing which forces matter most
To open the black box of deep learning, the team uses an explanation tool called SHAP, which reveals how much each input—rainfall, pumping, river flow, and so on—contributes to the predictions at different places. Across the heart of the irrigation area, heavy pumping outweighs rainfall in shaping groundwater trends, explaining the persistence and expansion of the drawdown cone beneath paddy fields. In contrast, in upstream croplands farther from the cone, rain plays a bigger role. River flow shows a strong positive impact near the channel, especially upstream: when flows exceed certain thresholds, leakage from the river provides noticeable recharge to the aquifer. However, this benefit levels off at high flows, and in downstream sections weakened river flows limit the recharge potential. The analysis also shows that when pumping is intense, the same river flow produces more recharge because the water table is lower, steepening the gradient from river to aquifer.
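The idea behind SHAP can be shown with a toy calculation. The sketch below computes exact Shapley values by averaging each input's marginal effect over every order in which the inputs could be switched from a baseline to their actual values. It is a simplification: the real SHAP library averages over background data and uses faster approximations, and the response function here, with its coefficients and its river-pumping interaction term, is entirely hypothetical rather than anything fitted in the study.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each input."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]   # switch this input to its actual value
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orders) for name, total in totals.items()}

def toy_level_change(inputs):
    """Hypothetical monthly change in groundwater level (arbitrary units).
    Rain and river flow raise the level, pumping lowers it, and river
    leakage is assumed to grow when pumping is heavy."""
    return (0.004 * inputs["rain"]
            + 0.01 * inputs["river"]
            - 0.02 * inputs["pumping"]
            + 0.0001 * inputs["river"] * inputs["pumping"])

baseline = {"rain": 30.0, "river": 20.0, "pumping": 10.0}
month = {"rain": 80.0, "river": 50.0, "pumping": 60.0}

phi = shapley_values(toy_level_change, month, baseline)
total_change = toy_level_change(month) - toy_level_change(baseline)
for name, value in phi.items():
    print(name, round(value, 3))
print("sum of attributions:", round(sum(phi.values()), 3),
      "| total change:", round(total_change, 3))
```

The attributions always add up to the difference between the prediction and the baseline, which is what lets them be read as each driver's share of the change. The interaction term is split evenly between river flow and pumping, so an input's share can depend on what the other inputs are doing.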
What this means for managing hidden water
For non-specialists, the main message is that we can now predict underground water changes over large farming regions with both fine spatial detail and practical speed, even under many possible future climates and pumping policies. By grouping areas that behave similarly and giving each group its own tailored deep learning model, the SRR-LSTM framework preserves local differences that matter for management—such as where cutting pumping will have the biggest effect, or how much extra river flow is needed before recharge really kicks in. At the same time, tools like SHAP turn complex neural networks into decision aids that clarify which levers—rainfall, river operations, or groundwater extraction—most strongly control groundwater levels in each part of the landscape. Together, these advances can help irrigation districts design more targeted, sustainable strategies to protect the invisible water that underpins food production and rural livelihoods.
Citation: Wei, H., Wei, G., Yu, B. et al. A coupled spatial reduction-reconstruction and LSTM framework (SRR-LSTM) for groundwater level prediction in large irrigation districts. Sci Rep 16, 7450 (2026). https://doi.org/10.1038/s41598-026-37618-4
Keywords: groundwater, irrigation, machine learning, LSTM, water management