Clear Sky Science
Temperature trend prediction with explainable artificial intelligence and PCA based machine learning: a case study of Zonguldak, Turkey
Why local temperature trends matter
For many communities, climate change can feel like a distant, global story. Yet its effects show up most clearly in local weather: hotter summers, shifting rainfall, and more intense storms. This study focuses on Zonguldak, a coastal and industrial province on Turkey’s Western Black Sea, asking a practical question: can modern artificial intelligence not only forecast local temperature trends accurately, but also explain how it reaches those forecasts, so planners and residents can trust and use the results?
Turning raw weather records into usable clues
To tackle this question, the researchers gathered more than two decades of monthly weather records for Zonguldak, covering the years 2000 to 2022. The dataset included average, minimum and maximum air temperature, several measures of rainfall, wind direction and speed, and humidity. Before any computer model could learn from these data, the team cleaned and standardized them: missing values were filled in, text labels such as wind directions were converted into numbers, and all variables were put on a common scale so that no single measurement would dominate the others simply because of its units.
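The three cleaning steps described above can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: the summary does not say how missing values were filled or how wind directions were encoded, so mean imputation and simple integer codes are assumptions here, and the sample values and function names are hypothetical.

```python
import numpy as np

# Hypothetical monthly records; None marks a missing value.
avg_temp = [6.1, None, 9.8, 14.2]
wind_dir = ["N", "NE", "N", "SW"]

def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    arr = np.array([np.nan if v is None else v for v in values], dtype=float)
    arr[np.isnan(arr)] = np.nanmean(arr)
    return arr

def encode_labels(labels):
    """Map each distinct text label to an integer code, in sorted order (assumed encoding)."""
    codes = {label: i for i, label in enumerate(sorted(set(labels)))}
    return np.array([codes[label] for label in labels], dtype=float), codes

def standardize(column):
    """Rescale a column to zero mean and unit standard deviation."""
    return (column - column.mean()) / column.std()

temp_filled = fill_missing_with_mean(avg_temp)
wind_codes, mapping = encode_labels(wind_dir)

# Every column now shares the same scale, whatever its original units were.
features = np.column_stack([standardize(temp_filled), standardize(wind_codes)])
```

Integer codes impose an artificial ordering on compass directions, so a real analysis might prefer a different encoding; the point here is only that text becomes numbers and that every column ends up on a comparable scale.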
Distilling complex weather patterns
Climate data are notoriously tangled: many variables change together, and some are strongly linked. To simplify this web without losing important information, the researchers used a technique called principal component analysis (PCA). Rather than looking at each original measurement separately, PCA creates a small number of new “summary” factors, each a weighted blend of the original measurements, that capture most of the variation in the data. In this study, the team kept enough of these factors to preserve 95 percent of that variation. The most important factor, known as the first principal component, turned out to blend temperature and wind in a meaningful way: higher minimum and maximum temperatures pushed this factor up, while stronger winds pulled it down.
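To make the idea of keeping "enough factors" concrete, the sketch below computes principal components with NumPy and keeps the smallest number whose combined share of the variation reaches a chosen threshold (95 percent, as in the study). The data are synthetic stand-ins, the 276 rows assume one complete record per month for 2000 to 2022, and the function name pca_keep_variance is hypothetical.

```python
import numpy as np

def pca_keep_variance(X, target=0.95):
    """Project X onto the fewest principal components whose cumulative
    share of the total variance reaches `target`."""
    Xc = X - X.mean(axis=0)                          # center each column
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)                  # variance share per component
    k = int(np.searchsorted(np.cumsum(explained), target)) + 1
    k = min(k, len(S))                               # guard against rounding near 1.0
    return Xc @ Vt[:k].T, Vt[:k], explained[:k]

# Synthetic stand-in data (not the Zonguldak records).
rng = np.random.default_rng(0)
n_months = 276
warmth = rng.normal(size=n_months)
wind = rng.normal(size=n_months)
X = np.column_stack([
    warmth + 0.1 * rng.normal(size=n_months),        # stand-in for minimum temperature
    warmth + 0.1 * rng.normal(size=n_months),        # stand-in for maximum temperature
    wind + 0.1 * rng.normal(size=n_months),          # stand-in for wind speed
    rng.normal(size=n_months),                       # stand-in for humidity
])
X = (X - X.mean(axis=0)) / X.std(axis=0)             # put columns on a common scale

scores, loadings, explained = pca_keep_variance(X)
print(len(explained), "components kept;", round(float(explained.sum()), 3), "of variance")
```

Each row of loadings shows how strongly every original variable feeds into one component, which is how a statement such as "temperature pushes the first component up while wind pulls it down" can be read off.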

Choosing the most reliable forecasting engines
With these streamlined climate factors in hand, the team tested a suite of machine learning methods to predict the monthly average temperature. Some were simple straight-line models; others, like neural networks and boosted trees, can capture more tangled relationships. The researchers split the data into training and test sets and evaluated each method with several measures of error and fit. Despite the buzz around complex “black box” systems, the clear winners here were two straightforward linear approaches, called linear regression and ridge regression. These models consistently produced the lowest errors and explained more than 90 percent of the variation in the test data, showing that, for this region and time scale, temperature behaves in a largely linear, predictable way.
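The comparison between the two linear models can be illustrated with a short NumPy sketch. It fits ordinary linear regression and ridge regression in closed form, then scores both on held-out data using root-mean-square error and R². Everything specific here is an assumption for illustration: the data are synthetic, the 80/20 random split and the ridge penalty of 1.0 are arbitrary choices, and the summary does not state which settings the authors used.

```python
import numpy as np

def fit_ridge(X, y, alpha=0.0):
    """Closed-form ridge regression; alpha=0 reduces to ordinary least squares.
    Centering X and y first keeps the intercept out of the penalty."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def rmse(y, y_hat):
    """Root-mean-square error: typical size of a prediction miss."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r2(y, y_hat):
    """Share of the variation in y that the predictions account for."""
    return float(1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2))

# Synthetic stand-in: four component scores and a mostly linear target.
rng = np.random.default_rng(42)
X = rng.normal(size=(276, 4))
y = 14.0 + X @ np.array([6.0, -1.5, 0.5, 0.0]) + rng.normal(scale=0.5, size=276)

# Assumed 80/20 random split into training and test rows.
idx = rng.permutation(276)
train, test = idx[:220], idx[220:]

results = {}
for name, alpha in [("linear", 0.0), ("ridge", 1.0)]:
    coef, intercept = fit_ridge(X[train], y[train], alpha)
    y_hat = X[test] @ coef + intercept
    results[name] = (rmse(y[test], y_hat), r2(y[test], y_hat))

for name, (err, fit) in results.items():
    print(f"{name}: RMSE={err:.3f}, R2={fit:.3f}")
```

When the underlying relationship really is close to a straight line, as in this synthetic example, both models score almost identically; ridge regression mainly helps when inputs are strongly correlated or the sample is small.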
Opening the black box of AI decisions
Accuracy alone is not enough when forecasts will inform infrastructure, agriculture, or health planning. To see why the models made particular predictions, the team turned to explainable AI tools. They trained a tree-based model well suited to such analysis and used two complementary methods: “permutation importance,” which measures how much predictions worsen when one factor is shuffled, and SHAP values, which assign each factor a contribution to each individual prediction. Both approaches pointed to the same story: the first principal component dominated the model’s decisions, with secondary roles for a few other components. Looking back at how this leading factor is built, the analysis showed that warmer conditions (higher minimum and maximum temperatures) strongly raise the predicted average temperature, while faster winds tend to suppress it. Humidity and rainfall played more modest roles.
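Both explanation methods can be sketched without special libraries. The example below is a simplification under stated assumptions: it uses a hypothetical, already-fitted linear model instead of the tree-based model the team trained, synthetic data instead of the real component scores, and the closed-form SHAP values that hold for a linear model when inputs are treated as independent. The function names are illustrative.

```python
import numpy as np

def permutation_importance(predict_fn, X, y, n_repeats=20, seed=0):
    """Average rise in mean squared error when one column of X is shuffled."""
    rng = np.random.default_rng(seed)
    base_error = np.mean((y - predict_fn(X)) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_shuffled = X.copy()
            X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
            scores[j] += np.mean((y - predict_fn(X_shuffled)) ** 2) - base_error
    return scores / n_repeats

def linear_shap(coef, X, background):
    """SHAP values for a linear model with inputs treated as independent:
    each input contributes coef_j * (x_j - average of x_j)."""
    return coef * (X - background.mean(axis=0))

# Synthetic stand-in: three component scores, the first one dominant.
rng = np.random.default_rng(1)
X = rng.normal(size=(276, 3))
coef = np.array([5.0, 1.0, 0.0])     # hypothetical fitted weights
intercept = 14.0
y = X @ coef + intercept + rng.normal(scale=0.3, size=276)

def model(A):
    """Hypothetical fitted linear model used for illustration."""
    return A @ coef + intercept

importance = permutation_importance(model, X, y)
shap_values = linear_shap(coef, X, X)

print("permutation importance:", np.round(importance, 2))
print("mean |SHAP| per input: ", np.round(np.abs(shap_values).mean(axis=0), 2))
```

Permutation importance gives one score per input for the model as a whole, while SHAP values give one contribution per input for every individual prediction; in this sketch the contributions in each row add up to the gap between that prediction and the average prediction.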

What this means for people and planners
In plain terms, the study demonstrates that it is possible to build temperature forecast tools that are both accurate and understandable. For Zonguldak, simple, well-tested statistical models, guided by carefully distilled climate factors, performed as well as or better than more elaborate AI systems. The explainability analyses confirmed that the models behave in a physically sensible way: predicted average temperature rises with higher minimum and maximum temperatures and falls as winds strengthen. This combination of performance and transparency makes the framework a promising blueprint for other regions seeking to monitor local climate trends and design adaptation strategies based on trustworthy, interpretable evidence.
Citation: Arslan, R.U., Aksoy, B. & Yapıcı, İ.Ş. Temperature trend prediction with explainable artificial intelligence and PCA based machine learning: a case study of Zonguldak, Turkey. Sci Rep 16, 4910 (2026). https://doi.org/10.1038/s41598-026-35173-6
Keywords: temperature prediction, climate change, machine learning, explainable AI, principal component analysis