Clear Sky Science

Data-driven combination of METAR observations and CAMS reanalysis aerosols to enhance satellite retrieval of surface solar irradiance

Why sunlight forecasts matter

Keeping the lights on in a future powered by solar energy depends on knowing how much sunlight will reach the ground, not just on clear blue days but also when the air is thick with dust, smoke or pollution. In many regions where solar power is growing fast, such as North Africa, India, China and southern Africa, tiny airborne particles can dim the sun almost as much as clouds do, disrupting the electricity output of solar farms. This study explores a new way to use everyday weather reports from airports, together with global aerosol data from atmospheric models, to sharpen satellite-based estimates of how much solar energy actually reaches the Earth’s surface.

Airborne particles that hide the sun

Solar power planners usually rely on satellites and computer models to estimate surface sunlight. These tools work well for tracking clouds, but have a harder time with aerosols – the dust, smoke and haze that float in the air. Satellite instruments struggle when clouds block their view, ground-based monitoring networks are sparse, and global models smooth over local events like a passing dust storm or nearby wildfire. The widely used McClear model, for example, draws on aerosol data from the Copernicus Atmosphere Monitoring Service (CAMS), with grid cells tens of kilometres wide and values updated only every few hours. That is often too coarse to capture the sharp, local swings in air pollution that strongly affect how much sunlight reaches a particular solar plant.

Turning airport visibility into solar insight

A surprisingly rich source of local aerosol information comes from METAR reports – standardised weather observations from airports around the world. Pilots need to know how far they can see along the runway, so visibility is measured automatically every 30 minutes and archived globally. While visibility is influenced not just by aerosols but also by humidity, fog and rain, it still carries valuable clues about how much the air is dimming sunlight, especially during dust and smoke events. The researchers combined these visibility readings and other METAR parameters with CAMS aerosol data and simple solar geometry (such as the sun’s height in the sky), feeding them into a set of machine-learning models designed to infer how much clear-sky solar energy should reach the ground.
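To make the link between visibility and dimming concrete, here is a minimal Python sketch. It uses the classical Koschmieder relation, which converts a visibility distance into an approximate extinction coefficient, and then bundles it with an aerosol value and the sun's elevation into one input row. The article does not say the researchers used this relation, and the function and field names (`extinction_from_visibility`, `build_feature_row`, `cams_aod`) are hypothetical, chosen only for illustration.

```python
import math

# Conventional contrast threshold behind the Koschmieder relation.
CONTRAST_THRESHOLD = 0.02


def extinction_from_visibility(visibility_km: float) -> float:
    """Approximate atmospheric extinction coefficient (per km) from visibility."""
    if visibility_km <= 0:
        raise ValueError("visibility must be positive")
    # ln(1 / 0.02) is about 3.912, so extinction is roughly 3.912 / visibility.
    return math.log(1.0 / CONTRAST_THRESHOLD) / visibility_km


def build_feature_row(visibility_km: float, cams_aod: float,
                      solar_elevation_deg: float) -> dict:
    """Assemble one illustrative input row (field names are assumptions)."""
    return {
        "visibility_km": visibility_km,
        "extinction_per_km": extinction_from_visibility(visibility_km),
        "cams_aod": cams_aod,  # aerosol optical depth from the global model
        "sin_elevation": math.sin(math.radians(solar_elevation_deg)),
    }


if __name__ == "__main__":
    # 10 km visibility corresponds to an extinction of roughly 0.39 per km.
    print(build_feature_row(10.0, 0.4, 30.0))
```

Shorter visibility gives a larger extinction value, which is the direction a learning model would need to pick up. As the article notes, humidity, fog and rain also shorten visibility, so this number is a clue about aerosols rather than a direct measurement.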

Figure 1.

Learning from sunlight without clear days

One major obstacle is that clear-sky sunlight, the amount that would arrive with no clouds at all, is rarely measured directly. Instead of discarding all cloudy periods, the team devised a “pseudo clear-sky” target. They started from actual solar measurements at the ground and satellite images that describe how cloudy each scene is. By mathematically separating out the cloud effect and normalising by the sunlight at the top of the atmosphere, they obtained a clean target quantity between 0 and 1 that machine-learning models can learn from, even when the sky is not perfectly clear. Models including gradient-boosting methods (XGBoost, LightGBM, CatBoost), Random Forests, neural networks and even an experimental quantum variational circuit were trained at a single site in Cairo, then tested at seven other stations across Africa and Asia that experience everything from urban smog to Saharan dust storms and biomass-burning smoke.
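The "pseudo clear-sky" idea can be sketched in a few lines of Python. This is a simplified illustration, not the study's exact formulation: it assumes the cloud effect is described by a satellite-derived cloud index `n`, that the fraction of clear-sky sunlight getting through is simply `1 - n`, and that the result is divided by the sunlight arriving at the top of the atmosphere. All names are hypothetical.

```python
def pseudo_clear_sky_target(measured_ghi: float, cloud_index: float,
                            toa_horizontal: float) -> float:
    """Estimate a normalised clear-sky fraction from a cloudy measurement.

    measured_ghi   : ground-measured sunlight on a horizontal surface (W/m^2)
    cloud_index    : satellite cloudiness, 0 = clear, 1 = fully overcast
    toa_horizontal : sunlight at the top of the atmosphere (W/m^2)
    """
    if toa_horizontal <= 0:
        raise ValueError("sun must be above the horizon")
    # Assumed simplification: clouds let through a fraction (1 - n).
    clear_sky_index = 1.0 - cloud_index
    if clear_sky_index <= 0:
        raise ValueError("scene too cloudy to remove the cloud effect")
    # Undo the cloud effect to get the sunlight a cloud-free sky would give.
    pseudo_clear_ghi = measured_ghi / clear_sky_index
    # Normalise by top-of-atmosphere sunlight and keep the value in [0, 1].
    fraction = pseudo_clear_ghi / toa_horizontal
    return min(max(fraction, 0.0), 1.0)


if __name__ == "__main__":
    # 400 W/m^2 measured under half-cloudy skies, 1000 W/m^2 above the atmosphere.
    print(pseudo_clear_sky_target(400.0, 0.5, 1000.0))
```

In this worked case the cloud-free estimate is 400 / 0.5 = 800 W/m^2, and dividing by 1000 W/m^2 gives a target of 0.8. The lower the target, the more the cloud-free atmosphere (including its aerosols) is dimming the sun.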

Outperforming traditional models in dusty, hazy air

To judge success, the team did not look at the learned clear-sky values in isolation. Instead, they plugged them into the Heliosat-3 method, which turns satellite-observed cloud brightness into all-sky surface sunlight, and compared the results with ground measurements. Across all test sites, the best-performing model, CatBoost, modestly but consistently reduced the average error compared to Heliosat-3 driven by McClear. Improvements were strongest for moderate visibility ranges between about 6 and 8 kilometres and during dust and sand events, where one model (LightGBM) cut error by about one-fifth. Smoke events showed smaller but still noticeable gains, while general haze did not benefit. The experimental quantum model, although less accurate overall, reached its accuracy with far fewer adjustable parameters, hinting at future potential as quantum hardware matures.
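The evaluation logic described above can be sketched as follows. The code assumes the common Heliosat-style multiplicative form, in which all-sky sunlight is the clear-sky value scaled by a cloud-dependent clear-sky index, and it uses root-mean-square error as the error measure. Both choices are illustrative assumptions, as the article does not spell out the metric, and the function names are hypothetical.

```python
import math


def all_sky_irradiance(clear_sky_fraction: float, toa_horizontal: float,
                       clear_sky_index: float) -> float:
    """Turn a learned clear-sky fraction back into all-sky surface sunlight."""
    clear_sky_ghi = clear_sky_fraction * toa_horizontal
    # Clouds scale the clear-sky value down by the clear-sky index.
    return clear_sky_ghi * clear_sky_index


def rmse(estimates: list, observations: list) -> float:
    """Root-mean-square error between estimates and ground measurements."""
    if not estimates or len(estimates) != len(observations):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(squared) / len(squared))


def relative_error_reduction(baseline_rmse: float, candidate_rmse: float) -> float:
    """Fraction by which the candidate lowers the baseline error."""
    if baseline_rmse <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_rmse - candidate_rmse) / baseline_rmse


if __name__ == "__main__":
    # Made-up numbers: a baseline error of 50 W/m^2 lowered to 40 W/m^2.
    print(relative_error_reduction(50.0, 40.0))
```

With these made-up numbers the reduction is (50 - 40) / 50 = 0.2, which is what a cut of "about one-fifth" means in practice. Judging the final all-sky values against ground measurements, rather than the learned clear-sky values alone, tests whether the improvement survives the full satellite retrieval chain.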

Figure 2.

What this means for solar power

For solar operators and grid managers, even modest improvements in sunlight estimates can translate into better forecasts of power production, fewer surprises for system operators and more reliable integration of solar energy into the grid. This study shows that routine airport visibility reports, when smartly combined with global aerosol data and satellite cloud images, can help correct important weaknesses of existing physics-based models in regions with heavy dust or pollution. As machine-learning models are expanded to more locations, include more detailed aerosol information and better account for local conditions, they could become a powerful companion to traditional methods, making solar power a more predictable and dependable part of the world’s energy mix.

Citation: Roy, A., Heinemann, D., Schroedter-Homscheidt, M. et al. Data-driven combination of METAR observations and CAMS reanalysis aerosols to enhance satellite retrieval of surface solar irradiance. Sci Rep 16, 6716 (2026). https://doi.org/10.1038/s41598-026-39971-w

Keywords: solar irradiance, aerosols, machine learning, METAR visibility, photovoltaic forecasting