Clear Sky Science
Evaluation of the level of responsibility in pedestrian crashes using machine learning algorithms
Why this matters for everyday walkers and drivers
Crossing the street or driving through town feels routine, but when a crash happens, lives can change in an instant—followed by painful questions about who is to blame. This study looks at pedestrian crashes in one Spanish city and asks whether modern computer tools can help police and judges sort out responsibility more fairly and consistently, using patterns hidden in real accident data.
Turning tragic crashes into useful data
The researchers gathered detailed information from 510 real-world accidents in Badajoz, Spain, all of which had gone through full legal proceedings. For each crash, they recorded how responsibility was ultimately shared between driver and pedestrian using five categories, ranging from crashes where the driver was fully responsible to those where blame was shared equally. Alongside this, they coded 14 simple yes-or-no facts about each case, grouped into four areas: human behavior (such as alcohol, drugs, attention, reaction time), technology (vehicle inspection and pedestrian clothing visibility), street environment (location and lighting), and rules of the road (license status, speed, and mobile phone use).

Teaching computers to recognize blame patterns
With this dataset, the team tested several machine learning methods—computer programs that learn patterns from examples. They compared well-known approaches and focused on three that worked best for this task: decision trees, Naïve Bayes, and support vector machines. Each model was trained on 60% of the crashes and then challenged to predict responsibility categories for the remaining 40%. To avoid the models simply “memorizing” the data, the researchers used cross-validation techniques and carefully balanced the less common categories, such as cases where responsibility was exactly 50–50.
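To make the 60/40 split concrete, here is a minimal sketch in plain Python. It assumes a stratified split, which is one common way to keep rare categories represented in both parts; the article does not say whether the authors split the data this way. The function name, the label names, and the toy counts are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.6, seed=0):
    """Split indices into train/test so each class keeps roughly the same share in both."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)                          # random order within each class
        cut = round(len(idx) * train_fraction)    # e.g. 60% of this class goes to training
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)

# Invented toy labels (NOT the study's data): one common class, one rare class, one other.
labels = ["driver_full"] * 50 + ["shared_equal"] * 10 + ["other"] * 40
train_idx, test_idx = stratified_split(labels)

rare_in_train = sum(1 for i in train_idx if labels[i] == "shared_equal")
print(len(train_idx), len(test_idx), rare_in_train)
```

With a purely random split, a rare category such as the 50–50 cases could end up almost entirely in one part; splitting class by class avoids that.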
Cleaning up the signals before asking the computer
More information is not always better. The team first checked whether any of the 14 variables essentially repeated the same story. They found two strongly overlapping pairs: alcohol use and drug use tended to go together, both for drivers and for pedestrians. Keeping only one variable from each pair reduced the list to 12 distinct factors. Models trained on this cleaner set of inputs actually performed better: removing redundant information reduced noise and helped the algorithms make clearer distinctions between different responsibility levels.
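The redundancy check described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it measures overlap between two yes-or-no variables with the phi coefficient (the Pearson correlation for 0/1 data) and drops a variable when it overlaps too strongly with one already kept. The article does not name the overlap measure or the cutoff; the 0.7 threshold, the column names, and the toy values are invented.

```python
from math import sqrt

def phi(x, y):
    """Phi coefficient: Pearson correlation between two lists of 0/1 values."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

def drop_redundant(columns, threshold=0.7):
    """Keep a column only if |phi| with every already-kept column is below the threshold."""
    kept = []
    for name in columns:
        if all(abs(phi(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Invented toy data (NOT from the study): eight cases, three yes/no variables.
columns = {
    "driver_alcohol": [1, 1, 0, 0, 1, 0, 0, 1],
    "driver_drugs":   [1, 1, 0, 0, 1, 0, 0, 0],
    "poor_lighting":  [0, 1, 0, 1, 0, 1, 0, 1],
}
kept = drop_redundant(columns)
print(kept)
```

In this toy data the two substance-use columns agree on seven of eight cases, so only the first one is kept, while the unrelated lighting column survives.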
Which model won, and which factors really matter
Across many tests, the decision tree model came out on top. It achieved about 78% overall accuracy with the reduced 12-variable set and was faster and lighter on computing resources than the other methods. Decision trees have another advantage: they naturally show which pieces of information weigh most heavily in the final decision. In this study, by far the most influential factor—nearly half of the decision weight—was whether the driver had a valid license. Next in importance were where the pedestrian was located (especially whether they were at or near a crosswalk), whether the driver was under the influence of alcohol or drugs, and whether the driver was distracted by a mobile phone. Pedestrian distractions, clothing visibility, and lighting also played roles, but to a lesser extent.
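One way a decision tree can rank its inputs is by how much each yes-or-no question reduces the mixing of outcomes in the groups it creates. The sketch below computes that reduction (the Gini impurity decrease) for a single split per variable and normalizes the scores so they sum to 1. It assumes the Gini criterion, which the article does not confirm, and a full tree would add up such decreases over all of its splits. The variable names and toy data are invented.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 when all labels agree, higher when they are mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(feature, labels):
    """Impurity drop from splitting the cases on a 0/1 feature."""
    no_group = [y for x, y in zip(feature, labels) if x == 0]
    yes_group = [y for x, y in zip(feature, labels) if x == 1]
    n = len(labels)
    weighted = (len(no_group) / n) * gini(no_group) + (len(yes_group) / n) * gini(yes_group)
    return gini(labels) - weighted

# Invented toy data (NOT from the study): eight cases, two outcome categories.
labels = ["driver"] * 4 + ["shared"] * 4
features = {
    "no_valid_license": [1, 1, 1, 1, 0, 0, 0, 0],  # separates the outcomes perfectly
    "poor_lighting":    [1, 0, 1, 0, 1, 0, 1, 0],  # tells us nothing about the outcome
}

scores = {name: gini_decrease(col, labels) for name, col in features.items()}
total = sum(scores.values())
importance = {name: score / total for name, score in scores.items()}
print(importance)
```

A variable that cleanly separates the outcome categories earns a large share of the total weight, while one that leaves each group as mixed as before earns none.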

From courtroom help to safer streets
Some situations remained hard for the algorithms to judge, especially rare cases where responsibility was exactly shared between driver and pedestrian. The authors argue that such borderline situations should still be carefully reviewed by human experts. Even so, the tools they developed can support judges and traffic police by providing an objective, data-driven “second opinion,” highlighting when patterns match past rulings and freeing professionals to focus on the most complex cases. Just as importantly, the findings point to clear priorities for prevention: enforcing licensing rules, cracking down on drunk or drugged driving, limiting phone use behind the wheel, and protecting pedestrians at crossings. In everyday terms, the study shows that both smarter computers and safer behavior can help decide responsibility more fairly—and reduce the number of people who end up in these tragic situations at all.
Citation: Moreno-Sanfélix, A., Gragera-Peña, F.C. & Jaramillo-Morán, M.A. Evaluation of the level of responsibility in pedestrian crashes using machine learning algorithms. Sci Rep 16, 12093 (2026). https://doi.org/10.1038/s41598-026-42875-4
Keywords: pedestrian safety, traffic responsibility, machine learning, decision trees, road crashes