Clear Sky Science
Forecasting occupational accidents in Turkey using multivariate ARMAX and NLARX models
Why predicting workplace accidents matters
Every year in Turkey, hundreds of thousands of workers are injured and thousands lose their lives in workplace accidents. For governments, employers and unions, knowing whether accidents are likely to rise or fall in the coming years is crucial for planning inspections, training and safety investments. This study asks a simple but important question: can we use past accident statistics to reliably forecast future accidents, and if so, which type of mathematical model does the best job?
A closer look at Turkey’s accident record
The authors draw on official monthly data from the Turkish Social Security Institution, covering the period from 2013—when a new occupational health and safety law came into force—through the end of 2023. To keep the picture clear, they divide the workforce into four groups: insured workers who have not had an accident, those with minor accidents, those with major accidents and those involved in fatal accidents. Looking at these groups together reveals that accident patterns are not isolated. Changes in minor accidents, for example, can ripple through to serious injuries and deaths, especially in high-risk sectors such as construction, mining and transportation. The team’s goal is to capture these intertwined trends with models that can learn from the past and project them into the future. 
From simple curves to linked time lines
Many earlier studies have forecast one series at a time, treating each type of accident as if it evolved independently. Here, the researchers instead adopt a multivariate time-series approach that allows the four groups to influence one another over time. They test two families of models. The first, ARMAX (autoregressive moving average with exogenous inputs), is a linear model: it assumes that future values can be expressed as weighted combinations of past values and random noise. The second, NLARX (nonlinear autoregressive with exogenous inputs), adds nonlinear terms such as squared and interaction effects, allowing for more complex responses. Because suitable monthly data on the wider economy and individual sectors are missing, both models focus solely on the internal dynamics of the accident statistics themselves, rather than adding outside drivers like unemployment or production levels.
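To make the contrast between the two model families concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a simplified setting: the moving-average part of ARMAX is omitted, so the linear model is closer to a plain vector autoregression, and the nonlinear variant simply appends squares and pairwise products of the lagged values. The data are synthetic, and every function and variable name is hypothetical.

```python
import numpy as np

def lagged_regressors(Y, n_lags):
    """Return (X, target): X holds the previous n_lags rows of Y for each time step."""
    T = Y.shape[0]
    cols = [Y[n_lags - k : T - k] for k in range(1, n_lags + 1)]
    return np.hstack(cols), Y[n_lags:]

def add_quadratic_terms(X):
    """Append squares and pairwise products of the columns of X (assumed nonlinearity)."""
    i, j = np.triu_indices(X.shape[1])
    return np.hstack([X, X[:, i] * X[:, j]])

def fit_least_squares(X, target):
    """Fit target ~ [1, X] by ordinary least squares; returns the coefficient matrix."""
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])
    coef, *_ = np.linalg.lstsq(X1, target, rcond=None)
    return coef

def predict_one_step(X, coef):
    """One-step-ahead predictions from regressors X and fitted coefficients."""
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])
    return X1 @ coef

# Synthetic stand-in for four coupled monthly series (NOT the real accident data).
rng = np.random.default_rng(0)
A_true = 0.6 * np.eye(4) + 0.05          # each series depends on all four
Y = np.zeros((132, 4))
for t in range(1, 132):
    Y[t] = Y[t - 1] @ A_true.T + rng.normal(scale=1.0, size=4)

X_lin, target = lagged_regressors(Y, n_lags=1)
X_nl = add_quadratic_terms(X_lin)

coef_lin = fit_least_squares(X_lin, target)
coef_nl = fit_least_squares(X_nl, target)

# In-sample sum of squared errors for each model.
sse_lin = np.sum((target - predict_one_step(X_lin, coef_lin)) ** 2)
sse_nl = np.sum((target - predict_one_step(X_nl, coef_nl)) ** 2)
print(f"in-sample SSE, linear: {sse_lin:.2f}, with quadratic terms: {sse_nl:.2f}")
```

Because the quadratic model contains the linear one as a special case, its in-sample error can never be larger. That says nothing about which model forecasts better on unseen months, which is why a held-out test period is needed.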
How the models were built and judged
Using specialized system-identification tools, the authors convert the accident records into a structured dataset and split it into a training portion (the first 80 months) and a testing portion (the remaining 52 months). They then fit both linear and nonlinear models to the training data and ask each model to predict the test period. Accuracy is measured with a normalized mean squared error (NMSE) score, which compares the gap between predicted and observed curves across all months and all four groups. By scanning through many possible model structures and keeping only parameters that are statistically significant, they reduce the risk of overcomplicated formulas that merely memorize the past. This careful procedure lets them compare how well the linear and nonlinear approaches generalize beyond the data they were trained on.
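The split and the error score described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code. The normalization shown (squared error divided by the squared deviation of the observed values from their mean, computed per series) is one common convention and may differ from the one the authors used. The series are synthetic, the persistence baseline is only a placeholder forecast, and all names are hypothetical.

```python
import numpy as np

def nmse(observed, predicted):
    """Normalized mean squared error per column (assumed convention).

    0 means a perfect match; 1 means no better than predicting the
    mean of the observed values.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    num = np.sum((observed - predicted) ** 2, axis=0)
    den = np.sum((observed - observed.mean(axis=0)) ** 2, axis=0)
    return num / den

n_months = 132   # January 2013 through December 2023
n_train = 80     # first 80 months for fitting, remaining 52 for testing

# Synthetic stand-in for the four monthly series (NOT the real data).
rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=(n_months, 4)), axis=0)

train, test = series[:n_train], series[n_train:]

# Placeholder forecast: repeat the last training value over the test period.
baseline = np.tile(train[-1], (test.shape[0], 1))
nmse_baseline = nmse(test, baseline)
print("NMSE of the persistence baseline per series:", nmse_baseline)
```

Splitting by time rather than at random matters here: the test months all come after the training months, so the score reflects genuine forecasting rather than interpolation.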
What the forecasts reveal
The results show a clear pattern. Overall, the linear ARMAX model delivers very accurate fits to the historical data and low forecasting errors for all four groups. It performs especially well for insured workers without accidents and for minor accidents, where the predicted curves track the real data closely over more than four years of testing. The nonlinear NLARX model shines for the accident-free group, where it slightly outperforms the linear approach, and it matches the linear model for minor accidents and fatalities. However, its forecasts for major accidents are noticeably less stable, with larger deviations as the prediction horizon extends. A closer look at the linear model's parameters suggests that the accident-free and minor-accident groups are governed by many modest but significant influences, while major accidents and fatalities are driven by a few strong, dominant effects.
What it means for safety policy
For non-specialists, the bottom line is that relatively simple, well-designed linear models can already provide reliable early warnings about how different categories of occupational accidents are likely to evolve in Turkey. Because these models explicitly track how minor, major and fatal accidents move together over time, they can help decision-makers spot emerging problems in the more dangerous categories and act before deaths spike. Nonlinear models add value in some stable groups, but they are not yet consistently better where it matters most: predicting serious injuries and fatalities. The study suggests that authorities can confidently use multivariate linear forecasts to guide targeted inspections, stricter enforcement in high-risk sectors and better allocation of training and prevention resources, while future work that incorporates richer data on sectors and working conditions may further refine these predictive tools.
Citation: Kaplanvural, S., Tosyalı, E. & Ekmekçi, İ. Forecasting occupational accidents in Turkey using multivariate ARMAX and NLARX models. Sci Rep 16, 5696 (2026). https://doi.org/10.1038/s41598-026-36210-0
Keywords: occupational accidents, time series forecasting, workplace safety, Turkey, statistical modeling