Clear Sky Science

A hybrid TimeGAN–xLSTM–Transformer framework for photovoltaic power forecasting under complex environmental conditions

Why Better Solar Power Forecasts Matter

As more homes, businesses, and entire cities plug into solar power, keeping the lights on becomes a forecasting challenge. Sunshine may be free, but it is also fickle: clouds, haze, temperature swings, and changing seasons all make solar output bounce up and down. Grid operators must know, hours ahead, how much electricity solar farms will produce so they can balance supply and demand safely and cheaply. This paper presents a new artificial intelligence (AI) framework that learns from past data and even creates realistic new data to make solar power forecasts far more reliable under messy, real-world weather conditions.

Solar Growth Meets Weather Chaos

China’s rapid expansion of photovoltaic (PV) installations mirrors a worldwide trend: solar power is becoming a backbone of modern electricity systems. Unlike coal or gas plants, however, PV output cannot simply be dialed up on command; it depends on the atmosphere. Clouds may roll in, fog may linger, or the air may heat up and thin out, and each of these pushes a panel’s power output up or down. To keep the grid stable, operators rely on three main types of forecasts: single-value (point) predictions, ranges of likely values (interval forecasts), and full probability-based scenarios. Traditional tools often need huge historical datasets and still struggle with rare but critical events, such as sudden drops or spikes in solar output. They also have trouble capturing the tangled relationships among sunlight, temperature, humidity, and power generation over time.

Teaching an AI to Invent Realistic Solar Days

The first key idea in this work is to “grow” the dataset instead of accepting its limits. The authors use a model called TimeGAN, designed specifically for time-series data, to generate synthetic solar power records that look and behave like real ones. TimeGAN learns how PV output changes step by step in tandem with weather factors such as sunlight and temperature. After training, it can produce new sequences that share the same patterns, including extreme ups and downs that may be rare in the original data. Tests using visualization tools show that the synthetic data overlap closely with real measurements, both in local detail and overall distribution. When a simple prediction model is trained on this expanded dataset, its errors drop dramatically, confirming that these “imagined” solar days help the AI generalize better to unseen conditions.
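
To make the idea of "growing" the dataset concrete, here is a minimal Python sketch of how an augmented training set might be assembled. It is not the authors' code. The function names, the window length, and the toy values are all assumptions made for illustration, and `placeholder_generator` is only a stand-in that jitters real windows; a trained TimeGAN sampler would take its place in practice.

```python
import random

def make_windows(series, length, stride=1):
    """Slice a list of per-step records into overlapping fixed-length windows."""
    return [series[i:i + length]
            for i in range(0, len(series) - length + 1, stride)]

def placeholder_generator(real_windows, n, noise=0.02, seed=0):
    """Stand-in for a trained TimeGAN sampler (this is NOT TimeGAN).

    It copies randomly chosen real windows and adds small Gaussian jitter,
    so the example runs without any deep-learning dependency.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n):
        base = rng.choice(real_windows)
        synthetic.append([[max(0.0, v + rng.gauss(0.0, noise)) for v in step]
                          for step in base])
    return synthetic

def augment(real_windows, synthetic_windows):
    """Combine real and synthetic windows into one training set."""
    return real_windows + synthetic_windows

# Toy data (made up): each step is [power, sunlight, temperature], scaled to [0, 1].
series = [[t / 10, 0.5, 0.3] for t in range(10)]

real = make_windows(series, length=4)        # windows cut from measured data
synthetic = placeholder_generator(real, n=5) # hypothetical extra sequences
train = augment(real, synthetic)             # what the forecaster would be trained on
```

The forecasting model never needs to know which windows were measured and which were generated; it simply sees a larger and more varied training set.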

Figure 1.

Blending Short-Term Twitches and Long-Term Trends

The second pillar of the framework is a blend of two powerful sequence-learning models. An extended form of Long Short-Term Memory (LSTM), dubbed xLSTM, handles the fine structure of solar output. Unlike a standard LSTM, xLSTM uses richer memory structures and multiple time scales, allowing it to track quick changes, such as a passing cloud, as well as slower shifts across hours or days. On top of this, the authors place a Transformer module, an architecture best known for its success in language models. The Transformer pays attention to relationships across distant time steps, effectively deciding which past moments matter most when predicting the future. Together, these components form a pipeline: TimeGAN enriches the training data, xLSTM extracts layered temporal features, and the Transformer weighs them globally to generate accurate forecasts.
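
The way a Transformer decides "which past moments matter most" can be illustrated with scaled dot-product attention, the standard building block of that architecture. The sketch below is a toy, single-query version in plain Python. The feature vectors and the query are invented for illustration; in the framework described above they would come from the xLSTM stage, and the paper's actual layer sizes and settings are not given here.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Weighted average of the value vectors, one output per feature dimension.
    context = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
    return weights, context

# Hypothetical features for four past time steps (e.g. encoder outputs).
past = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.0, 0.8]]
query = [1.0, 0.0]  # made-up representation of "the moment being forecast"

weights, context = attend(query, past, past)
```

Past steps whose features resemble the query receive larger weights, so here the first and third steps contribute most to `context`, no matter how far back in the sequence they sit.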

Figure 2.

Testing the Model on Real Solar Farms

The researchers validate their approach using six months of data from a real distributed PV cluster in China’s State Grid, sampled every 15 minutes and including power output, temperature, humidity, and sunlight levels. They compare their hybrid TimeGAN–xLSTM–Transformer framework with more conventional LSTM and Transformer models. The new model cuts root-mean-square error by about 48 percent and mean absolute error by roughly 44 percent relative to the best traditional baselines, and its percentage error drops to around 2.7 percent. The benefit of TimeGAN-based data augmentation is also clear: models trained without synthetic data perform far worse, especially when faced with sharp fluctuations in solar power.
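
For readers who want to see what these error measures compute, the following Python sketch implements them on a handful of made-up numbers. It assumes that the "percentage error" mentioned above is the mean absolute percentage error, which the summary does not state explicitly, and it skips zero readings (such as night-time output) in that metric to avoid dividing by zero. None of the values below come from the paper.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error: penalizes large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error: the average size of a miss."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted, eps=1e-9):
    """Mean absolute percentage error, ignoring near-zero actual values."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if abs(a) > eps]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

def relative_reduction(baseline, improved):
    """Fraction by which an error metric shrinks relative to a baseline."""
    return (baseline - improved) / baseline

# Made-up power readings and forecasts (arbitrary units), four 15-minute steps.
actual = [0.0, 2.0, 4.0, 3.0]
predicted = [0.0, 2.5, 3.5, 3.0]

print(f"RMSE = {rmse(actual, predicted):.4f}")
print(f"MAE  = {mae(actual, predicted):.4f}")
print(f"MAPE = {mape(actual, predicted):.1f}%")
```

A statement such as "cuts root-mean-square error by about 48 percent" corresponds to `relative_reduction(baseline_rmse, new_rmse)` being close to 0.48, where both arguments are RMSE values measured on the same test data.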

What This Means for Everyday Power Use

In simple terms, the study shows that combining realistic “imagined” data with a layered AI design can make solar power forecasts much more dependable, even when the weather misbehaves. For everyday life, better forecasts mean fewer blackouts, less wasted backup power from fossil fuels, and smoother integration of renewable energy into the grid. As solar installations spread across cities and countryside alike, tools like this hybrid TimeGAN–xLSTM–Transformer framework can help power systems plan ahead with greater confidence, bringing us closer to a cleaner, low-carbon energy future.

Citation: Chu, B., Shu, J., Zhao, C. et al. A hybrid TimeGAN–xLSTM–Transformer framework for photovoltaic power forecasting under complex environmental conditions. Sci Rep 16, 8782 (2026). https://doi.org/10.1038/s41598-026-36073-5

Keywords: solar power forecasting, photovoltaic energy, deep learning, time series data, renewable energy integration