Clear Sky Science
Machine learning-based Direct Normal Irradiance (DNI) forecasting using satellite data for Concentrated Solar Power (CSP) plants with Thermal Energy Storage (TES)
Why Predicting Sunlight Hours Matters
As more countries look to the sun to power their homes and industries, a big question looms: how do we keep the lights on when the sky changes by the minute? Concentrated solar power plants, which use mirrors to focus sunlight and store heat, can supply electricity even after dark. But to run them efficiently and cheaply, operators must know in advance how strong the sunshine will be. This study shows how free weather satellite data and modern machine learning can deliver practical, affordable sunlight forecasts tailored to these next-generation solar plants.
Turning Space Images into Useful Sunlight Forecasts
The researchers focus on a key quantity called direct normal irradiance, or DNI: the portion of sunlight that travels in a straight line and can be concentrated by mirrors. Unlike installations built from regular solar panels, these power plants rely almost entirely on DNI, which is highly sensitive to clouds, moisture, and particles in the air. Instead of installing costly ground cameras and instruments at each site, the team uses data from the Himawari-8 geostationary satellite, which watches much of Asia from space every ten minutes. From this stream of images and derived products, they collect 17 variables, including different measures of sunlight, cloud type, water vapor, temperature, and the angle of the sun, and feed them into machine-learning models trained to predict DNI from 30 minutes to 6 hours ahead.
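To make the forecasting setup concrete, here is a minimal sketch of how a ten-minute series can be turned into training pairs for a chosen lead time. It is an illustration only: it uses lagged DNI as the sole input, whereas the study draws on 17 satellite-derived variables, and the function name, the number of lags, and the sample values are assumptions made for this example.

```python
from typing import List, Tuple

STEP_MINUTES = 10  # satellite refresh interval mentioned in the text


def make_samples(
    dni: List[float],
    n_lags: int,
    horizon_minutes: int,
) -> Tuple[List[List[float]], List[float]]:
    """Pair the last `n_lags` DNI values with the DNI `horizon_minutes` later."""
    if horizon_minutes % STEP_MINUTES != 0:
        raise ValueError("horizon must be a multiple of the 10-minute step")
    steps_ahead = horizon_minutes // STEP_MINUTES
    features: List[List[float]] = []
    targets: List[float] = []
    # t is the index of the most recent observation available to the model
    for t in range(n_lags - 1, len(dni) - steps_ahead):
        features.append(dni[t - n_lags + 1 : t + 1])
        targets.append(dni[t + steps_ahead])
    return features, targets


# Toy series in W/m^2, invented purely for illustration
series = [0.0, 50.0, 120.0, 300.0, 450.0, 600.0, 640.0, 700.0, 720.0, 710.0]
X, y = make_samples(series, n_lags=3, horizon_minutes=30)
```

Each row of X holds the three most recent readings, and the matching entry of y is the value three steps (30 minutes) later. A 6-hour forecast would use the same pairing with a shift of 36 steps.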

Smart Algorithms that Learn from the Sky
To find the best forecasting tool, the authors test 24 different machine-learning approaches and compare their accuracy, speed, and computational cost. They discover that a technique called Ensemble Bagging Regression excels for very short-term forecasts, especially at the 30-minute horizon, where it captures fine details in recent changes. For longer lead times, an exponential Gaussian process regression model stands out. This model not only predicts future DNI but also gives a built-in measure of uncertainty, which is valuable for planning. Across forecast horizons up to six hours, the best model maintains strong performance, with errors small enough to be absorbed by the natural buffering ability of thermal storage systems in concentrated solar plants.
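The "built-in measure of uncertainty" can be illustrated with a small Gaussian process sketch. The code below uses one common form of the exponential kernel, signal_var * exp(-|a - b| / length), with a single input and a zero-mean prior. The parameter values, function names, and data are assumptions chosen for illustration and are not the configuration used in the study.

```python
import math
from typing import List, Tuple


def exp_kernel(a: float, b: float, length: float, signal_var: float) -> float:
    """Exponential kernel: signal_var * exp(-|a - b| / length)."""
    return signal_var * math.exp(-abs(a - b) / length)


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(
    xs: List[float],
    ys: List[float],
    x_new: float,
    length: float = 1.0,
    signal_var: float = 1.0,
    noise_var: float = 1e-6,
) -> Tuple[float, float]:
    """Return the posterior mean and variance at x_new."""
    n = len(xs)
    K = [
        [exp_kernel(xs[i], xs[j], length, signal_var)
         + (noise_var if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    k_star = [exp_kernel(x, x_new, length, signal_var) for x in xs]
    alpha = solve(K, ys)      # (K + noise I)^-1 y
    v = solve(K, k_star)      # (K + noise I)^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    prior_var = exp_kernel(x_new, x_new, length, signal_var)
    var = prior_var - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)


# Toy, normalized observations invented for illustration
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.2, 0.8, 0.9, 0.1]
mean_near, var_near = gp_predict(xs, ys, 1.0)   # at an observed point
mean_far, var_far = gp_predict(xs, ys, 50.0)    # far from all observations
```

Close to the observations the predicted variance is tiny, and far from them it returns to the prior variance. That widening is the signal an operator could use to plan more cautiously at longer lead times.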
Reading the Weather’s Hidden Signals
Beyond accuracy, the team wants to understand what drives the model’s predictions. They use an explainable AI technique known as SHAP, which assigns each input variable a contribution to the forecast, much like splitting the bill fairly among dinner guests. This analysis reveals that the angle of the sun in the sky and cloud-related properties are the most influential factors across many time horizons. Clear-sky versions of sunlight measures (what the radiation would be with no clouds) also play a major role, along with temperature and simple past values of DNI at short lags. Moist air, ozone, and high relative humidity tend to reduce expected sunshine, especially in humid summer months, while clear, dry conditions in late winter and early spring boost it. These insights reassure operators that the model is responding to physically sensible signals rather than hidden quirks in the data.
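The bill-splitting idea behind SHAP can be sketched with exact Shapley values on a tiny model. The code below enumerates every subset of inputs, which is only feasible for a handful of features; practical SHAP tools rely on faster approximations. The toy model, its three inputs, and all numbers are invented for illustration and are not taken from the study.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values; absent features take their baseline value."""
    n = len(x)

    def value(coalition: frozenset) -> float:
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_dni(features: Sequence[float]) -> float:
    """Hypothetical model: clear-sky DNI dimmed by cloud, minus a humidity penalty."""
    clear_sky_dni, cloud_fraction, humidity = features
    return clear_sky_dni * (1.0 - cloud_fraction) - 100.0 * humidity


x = [800.0, 0.25, 0.5]         # the case being explained
baseline = [600.0, 0.5, 0.5]   # a reference "typical" case
phi = shapley_values(toy_dni, x, baseline)
```

The contributions add up exactly to the difference between the forecast for x and the forecast for the baseline, and the humidity input, which is the same in both cases, receives zero credit.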

From Flat Plains to Mountain Valleys
A key test is whether a model trained in one place can be trusted in others. The authors train their system at a site in Bhagalpur, India, then apply it to five additional locations across South Asia, ranging from low-lying plains to the Himalayan foothills. On flatter terrain, such as Jhansi and Kharagpur, the forecasts remain highly reliable, with stable error levels and little systematic over- or underestimation. In contrast, stations in more rugged or complex landscapes, such as Makalu in the Himalayas, show greater swings in accuracy and bias at certain time lags. These patterns suggest that local topography and unique weather behavior, such as rapid cloud formation around mountains, can challenge a purely satellite-based model and may require extra tailoring.
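Comparisons like this usually rest on two simple numbers per site: a typical error size and a systematic bias. The sketch below computes root-mean-square error and mean bias error, two widely used metrics; whether the study uses exactly these is an assumption here. The site labels and values are placeholders, not results from the paper.

```python
import math
from typing import Dict, List, Tuple


def rmse(forecast: List[float], observed: List[float]) -> float:
    """Root-mean-square error: typical size of the forecast error."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mean_bias_error(forecast: List[float], observed: List[float]) -> float:
    """Mean bias error: positive means the model overestimates on average."""
    diffs = [f - o for f, o in zip(forecast, observed)]
    return sum(diffs) / len(diffs)


# Placeholder data (W/m^2): (forecast, observed) per hypothetical site
sites: Dict[str, Tuple[List[float], List[float]]] = {
    "flat_site": ([510.0, 590.0, 700.0], [500.0, 600.0, 700.0]),
    "mountain_site": ([560.0, 650.0, 760.0], [500.0, 600.0, 700.0]),
}

report = {
    name: (rmse(f, o), mean_bias_error(f, o)) for name, (f, o) in sites.items()
}
```

In this made-up example the flat site has errors that cancel out (bias near zero), while the mountain site is consistently too high, the kind of systematic drift that would call for local tailoring.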
What This Means for Future Solar Power
For operators of concentrated solar power plants with heat storage, the study delivers a practical message: they may not need the most expensive, ultra-precise forecasting hardware to run their plants effectively. Thanks to the natural smoothing provided by thermal energy storage, a well-designed satellite-based model with moderate errors can still support confident planning of power output over the next few hours. By relying on free, wide-coverage satellite data and interpretable machine learning, this approach offers a scalable way to cut costs and lower the price of solar electricity across sun-rich regions, especially in South Asia. With further refinements that account for mountains, aerosols, and local weather patterns, such forecasting systems could become a backbone technology for a more reliable, solar-powered grid.
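The smoothing role of thermal storage can be shown with a toy energy balance. In this sketch the plant promises a fixed output based on a forecast, the sun delivers something slightly different, and the storage tank makes up or absorbs the gap. The function, the units, and all numbers are assumptions for illustration and do not model any real plant.

```python
from typing import List, Tuple


def simulate_storage(
    collected: List[float],
    dispatch: List[float],
    capacity: float,
    initial: float,
) -> Tuple[List[float], float, float]:
    """Track stored heat per step; return (levels, shortfall, spilled)."""
    level = initial
    levels: List[float] = []
    shortfall = 0.0  # promised energy that could not be delivered
    spilled = 0.0    # collected heat that did not fit in the tank
    for c, d in zip(collected, dispatch):
        level += c - d
        if level < 0.0:
            shortfall += -level
            level = 0.0
        if level > capacity:
            spilled += level - capacity
            level = capacity
        levels.append(level)
    return levels, shortfall, spilled


# Hypothetical heat per time step (arbitrary energy units)
forecast = [10.0, 10.0, 10.0, 10.0]  # what the model predicted
actual = [12.0, 7.0, 11.0, 9.0]      # what the mirrors actually collected
dispatch = forecast                  # output promised from the forecast

with_buffer = simulate_storage(actual, dispatch, capacity=20.0, initial=5.0)
no_buffer = simulate_storage(actual, dispatch, capacity=20.0, initial=0.0)
```

With some heat already in the tank, the forecast misses never reach the grid: the stored level drifts between 4 and 7 units and the shortfall stays at zero. Starting from an empty tank, the same forecast error produces a shortfall of 1 unit.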
Citation: Rathnayake, N., Wijewardane, S. Machine learning-based Direct Normal Irradiance (DNI) forecasting using satellite data for Concentrated Solar Power (CSP) plants with Thermal Energy Storage (TES). Sci Rep 16, 11257 (2026). https://doi.org/10.1038/s41598-026-41733-7
Keywords: solar forecasting, concentrated solar power, satellite data, machine learning, thermal energy storage