Clear Sky Science
Machine Learning-Based Reconstructions of Historical Daily and Monthly Runoff for the Laurentian Great Lakes
Why Past River Flows Matter Today
The Great Lakes hold nearly one-fifth of the world’s fresh surface water and provide drinking water, transportation, power, and recreation for millions of people. Yet the rivers that feed these lakes have not always been carefully measured. Many stream gauges have shut down or suffered gaps in their records, making it hard to know how much water has actually been flowing into the lakes over the past decades. This paper introduces a new way to digitally “replay” more than 60 years of daily river runoff into the Great Lakes, giving communities and planners a much clearer picture of past wet and dry periods and helping them prepare for future climate and water-level changes. 
Filling in the Missing Pieces
Stream gauges are the basic tools for measuring how much water flows down a river. Across North America, thousands of these have been discontinued or have patchy records because of equipment failures and funding cuts. Traditional methods for filling in the gaps often focus on individual rivers or use global computer models that are too coarse to capture local details. They can struggle to track daily ups and downs in flow, which are crucial for understanding floods, droughts, and how water moves through local systems. The Great Lakes basin adds extra complexity: nearly a third of the area has no gauges at all, it straddles the U.S.–Canada border, and it includes a patchwork of climates, soils, cities, farms, forests, and wetlands.
Teaching a Neural Network to Read Rivers
The authors turn to a type of artificial intelligence called a Long Short-Term Memory (LSTM) network, a form of neural network designed to learn patterns that unfold over time. They train separate LSTM models for four historical periods between 1950 and 2013, so that slow shifts in climate, land use, and human water management are better captured. Each model is fed three kinds of information for hundreds of watersheds across the Great Lakes region: daily weather data (rain and temperature), fixed physical traits of each catchment (like elevation, slope, soils, and land cover), and—crucially—streamflow from nearby “donor” gauges that behave similarly. A companion machine-learning tool first learns which gauges tend to rise and fall together, then selects the best donors for each target site. This setup allows the network to learn how rainfall and snowmelt translate into river flow across a wide variety of landscapes.
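To make the donor idea concrete, here is a minimal sketch of one simple way to rank candidate donor gauges. It is not the authors' companion tool: it swaps in a plain Pearson correlation computed over days when both gauges reported a value. The function names, the `min_r` threshold, and the toy flow values are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation using only days where both series have a value (None = missing)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return float("nan")  # too few overlapping days to say anything
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant series carries no co-variation signal
    return sxy / math.sqrt(sxx * syy)

def rank_donors(target, candidates, k=2, min_r=0.5):
    """Return up to k candidate names whose flow correlates best with the target.

    min_r is a hypothetical cutoff: gauges that do not rise and fall with
    the target are dropped rather than used as donors.
    """
    scored = []
    for name, series in candidates.items():
        r = pearson(target, series)
        if not math.isnan(r) and r >= min_r:
            scored.append((r, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Toy daily flows (made-up numbers); the target gauge has one missing day.
target = [1.0, 2.0, None, 4.0, 3.0, 2.0]
candidates = {
    "gauge_a": [2.0, 4.1, 6.0, 8.2, 6.1, 3.9],  # rises and falls with the target
    "gauge_b": [5.0, 4.0, 3.0, 1.0, 2.0, 4.0],  # moves in the opposite direction
    "gauge_c": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],  # flat, so uninformative
}
donors = rank_donors(target, candidates)
print(donors)
```

In this toy case only gauge_a passes the cutoff. A real donor-selection step would also have to weigh distance, basin similarity, and how much the records overlap in time.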
From Individual Rivers to Whole Lakes
Once trained, the best-performing model (which uses both climate and donor gauge information) is used to generate complete daily runoff records for 1951–2013 at 656 gauged locations. The team then goes a step further: they apply the same model to 128 shoreline basins that drain directly into five lake systems: Superior, Michigan-Huron, St. Clair, Erie, and Ontario. For each of these basins, the model estimates flows in ungauged areas and blends them with any available observations from interior gauges. These daily values are then added up to produce monthly totals of water flowing into each lake.
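The blending and adding-up steps can be pictured with a small sketch. It rests on assumptions that go beyond this summary: each basin's daily runoff is treated as a gauged part (the observation when one exists, otherwise the model estimate) plus a modeled ungauged part, and values are daily volumes in arbitrary units. The function names and numbers are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

def blend_daily(observed_gauged, modeled_gauged, modeled_ungauged):
    """Daily basin runoff: observation where available (None = missing), else the model, plus the ungauged estimate."""
    basin = []
    for obs, mod, ungauged in zip(observed_gauged, modeled_gauged, modeled_ungauged):
        gauged = mod if obs is None else obs
        basin.append(gauged + ungauged)
    return basin

def monthly_totals(start, daily_values):
    """Sum consecutive daily values, beginning on `start`, into (year, month) totals."""
    totals = defaultdict(float)
    for offset, value in enumerate(daily_values):
        day = start + timedelta(days=offset)
        totals[(day.year, day.month)] += value
    return dict(totals)

# Four days spanning a month boundary: 30 Jan to 2 Feb 1951 (made-up values).
observed_gauged = [10.0, None, 12.0, 11.0]   # the gauge missed 31 Jan
modeled_gauged = [9.0, 10.5, 12.5, 11.5]
modeled_ungauged = [2.0, 2.0, 3.0, 3.0]

daily = blend_daily(observed_gauged, modeled_gauged, modeled_ungauged)
totals = monthly_totals(date(1951, 1, 30), daily)
print(totals)
```

If the daily values were mean flow rates rather than volumes, each would first need to be multiplied by the number of seconds in a day before summing.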
Checking the Digital Rivers Against Reality
The researchers carefully test how well their model performs at sites it has seen before and at sites it has never used for training. Across decades, the model reproduces daily flows more accurately than a simpler version that relies only on climate and basin traits, especially when it can draw on nearby donor gauges. It does particularly well at “filling in the blanks” for gauges with partial records, while still performing competitively in completely ungauged basins. When the team compares their reconstructed monthly runoff into the lakes with several existing products used by agencies, they find strong agreement in recent decades and clearer, more pronounced peak flows in many cases, especially when compared with models that rely solely on rainfall–runoff equations.
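This summary does not say which skill score the researchers used. As an illustration only, the sketch below computes the Nash-Sutcliffe efficiency, a score widely used in hydrology to compare simulated and observed flows: 1 means a perfect match, and 0 means the simulation is no better than always predicting the observed average. The function name and data are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - (sum of squared errors) / (variance of observations about their mean)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed series is constant; the score is undefined")
    return 1.0 - squared_error / spread

observed = [1.0, 2.0, 3.0, 4.0, 5.0]           # made-up daily flows
perfect = nash_sutcliffe(observed, observed)    # simulation equals observation
baseline = nash_sutcliffe(observed, [3.0] * 5)  # always predicts the observed mean
print(perfect, baseline)
```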
What This Means for Great Lakes Water Levels
The new dataset offers one of the most detailed and long-running views of how water flowed from land into the Great Lakes over roughly six decades, from 1951 to 2013. For water managers, this means better inputs for lake water-balance models, sharper estimates of past floods and droughts, and more confidence when planning for uncertain future water levels under a changing climate. Because the method relies on long-running weather data and existing gauge networks, rather than on relatively recent satellite records, it can be extended further back in time and to other regions of the world that face similar monitoring gaps. In short, this work provides a powerful “rewind button” for river flow, helping communities understand where Great Lakes water has come from in the past so they can manage it more wisely in the years ahead.
Citation: Gupta, R.S., Wi, S. & Steinschneider, S. Machine Learning-Based Reconstructions of Historical Daily and Monthly Runoff for the Laurentian Great Lakes. Sci Data 13, 624 (2026). https://doi.org/10.1038/s41597-026-07000-0
Keywords: Great Lakes runoff, streamflow reconstruction, machine learning hydrology, water level variability, climate change impacts