Clear Sky Science · en

Machine learning for prompt estimation of macroseismic intensity from seismometric data in Italy

2026-02-04

Why fast quake assessments matter

When the ground starts shaking, emergency teams have only minutes to decide where to send rescuers and resources. Yet the usual way of describing how strongly an earthquake is felt at the surface – macroseismic intensity, measured in Italy on a Mercalli-type scale – often arrives hours, days, or even months later, after people fill in questionnaires and experts survey the damage. This article explores how modern machine learning can turn the first seismometer readings into quick, reasonably accurate maps of how strongly an earthquake has been felt, helping authorities react faster and more confidently.

From felt reports to rapid estimates

Traditional intensity estimates in Italy rely on two main data streams. One consists of expert field surveys logged in an official database, which focus on damaged locations but take time to organize. The other comes from the online “Hai Sentito Il Terremoto” system, where citizens report what they felt and saw, providing many low- and moderate-intensity observations. Both sources measure intensity on the Mercalli-Cancani-Sieberg scale, which ranks shaking from very weak to destructive based on human and building responses. To link these human-centered measures with instrument readings, the authors merged the two datasets around each seismic station, averaging all reported intensities within 5 km to obtain a single representative value for that area and rounding it to a whole-number class from 1 to 8.
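The merging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the function names, the use of great-circle (haversine) distance, and the round-half-up rule are all choices made for this example and are not taken from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def station_intensity_class(station, reports, radius_km=5.0):
    """Average the reported intensities within radius_km of a station and
    round the mean to a whole-number class clipped to the range 1..8.
    Returns None when no report falls inside the radius."""
    nearby = [
        r["intensity"] for r in reports
        if haversine_km(station["lat"], station["lon"],
                        r["lat"], r["lon"]) <= radius_km
    ]
    if not nearby:
        return None
    mean = sum(nearby) / len(nearby)
    rounded = math.floor(mean + 0.5)  # assumption: round half up
    return max(1, min(8, rounded))

# Hypothetical station and reports, invented for illustration only.
station = {"lat": 43.00, "lon": 13.00}
reports = [
    {"lat": 43.01, "lon": 13.01, "intensity": 4},  # roughly 1.4 km away
    {"lat": 43.02, "lon": 13.00, "intensity": 5},  # roughly 2.2 km away
    {"lat": 43.00, "lon": 13.03, "intensity": 5},  # roughly 2.4 km away
    {"lat": 43.50, "lon": 13.00, "intensity": 8},  # about 56 km away, ignored
]
station_class = station_intensity_class(station, reports)
```

Here the three nearby reports average to about 4.7, so the station would be assigned class 5, while the distant report has no effect.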

Teaching a forest of models to read shaking

The researchers framed intensity estimation as a classification problem: given early measurements, predict which of eight intensity classes applies in the area around each station. They used a Random Forest, an ensemble of many decision trees that each make a simple series of “if–then” splits on inputs such as magnitude, depth, distance from the source, and direct ground motion measures like peak ground acceleration, velocity, and displacement. Trained on 5,466 observations from 523 earthquakes across Italy (2008–2020), the model learned complex, non-linear links between what seismometers record and what people report. To handle the fact that strong shaking is rarer in the data, the authors adjusted the training so that all intensity levels counted equally, preventing the model from focusing only on the most common, weaker events.
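One common way to make every class count equally is to weight each training example inversely to how often its class occurs. The sketch below shows that idea in plain Python; it is an assumption that the study used this particular scheme, and the label counts are invented toy values. The formula is the same heuristic scikit-learn applies when `class_weight="balanced"` is requested.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Per-class weight: n_samples / (n_classes * class_count).
    Rare classes get large weights, common classes small ones."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}

# Toy labels (invented): weak shaking is common, strong shaking is rare.
labels = [2] * 6 + [3] * 3 + [6] * 1
weights = balanced_class_weights(labels)

# Total weight carried by each class after reweighting.
totals = {cls: weights[cls] * labels.count(cls) for cls in weights}
```

With these weights, the single class-6 example carries as much total weight as the six class-2 examples, so a tree split that misclassifies the rare strong event is penalised as heavily as one that misclassifies all the weak ones.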

Checking against established rules

To see whether the machine-learning approach truly adds value, the team compared its predictions with two widely used families of empirical relationships. The first, called Intensity Prediction Equations, estimates intensity mainly from the earthquake’s magnitude, depth, and distance, assuming shaking fades with distance in a smooth way. The second, Ground Motion to Intensity Conversion Equations, turns instrument readings of peak motion into expected intensity classes. These formulas are compact and easy to apply, but they cannot fully capture how local geology, building stock, or wave direction influence the shaking people feel. By contrast, the Random Forest naturally integrates both source parameters and ground motion measures, and can adapt to subtle patterns in the Italian dataset without prespecifying a rigid mathematical form.
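To make the contrast concrete, the two families of empirical relationships can be written as short functions. The functional forms below are generic textbook-style simplifications, not the specific equations used in the study, and every coefficient is passed in as a parameter because the values shown in the usage lines are placeholders invented for illustration.

```python
import math

def ipe_intensity(magnitude, depth_km, epicentral_km, c0, c1, c2):
    """Generic IPE-style form: intensity rises with magnitude and falls
    smoothly with the logarithm of hypocentral distance."""
    hypocentral_km = math.sqrt(epicentral_km ** 2 + depth_km ** 2)
    return c0 + c1 * magnitude - c2 * math.log10(hypocentral_km)

def gmice_intensity(peak_motion, a, b):
    """Generic GMICE-style form: intensity is linear in log10 of a peak
    ground-motion value (for example acceleration or velocity)."""
    return a + b * math.log10(peak_motion)

# Placeholder coefficients, NOT published values.
near = ipe_intensity(5.5, depth_km=8.0, epicentral_km=6.0,
                     c0=1.0, c1=1.0, c2=2.0)
far = ipe_intensity(5.5, depth_km=8.0, epicentral_km=60.0,
                    c0=1.0, c1=1.0, c2=2.0)
from_motion = gmice_intensity(100.0, a=2.0, b=1.5)
```

Each function has a fixed shape with a handful of coefficients, which is what makes such formulas compact but also rigid: nothing in them can respond to site-specific effects unless an extra term is added by hand.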

Seeing inside the black box and its limits

Because emergency managers need to understand the basis of automated decisions, the authors built simpler “surrogate” decision trees that mimic the Random Forest’s behavior. These smaller trees can be drawn as diagrams, showing which ground motion thresholds separate low from high intensity and where variables like acceleration and velocity dominate. This analysis revealed that direct ground-motion measures, especially peak acceleration and velocity, carry more weight than magnitude or depth alone. The authors also introduced a simple way to flag how uncertain each surrogate-tree prediction is, using measures of how mixed the training examples are within each final branch. At the same time, they found that very strong intensities remain hard to predict, in part because they are naturally rare in the historical record, leading to occasional underestimation of the highest shaking levels.
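The uncertainty flag can be illustrated with two standard measures of how mixed the training labels are in a leaf (a "final branch") of a decision tree: Gini impurity and Shannon entropy. This is a minimal sketch; the article does not say which measures or cut-offs the authors used, so the function names and the 0.5 threshold are assumptions made for the example.

```python
import math
from collections import Counter

def gini_impurity(leaf_labels):
    """0.0 for a pure leaf; approaches 1 as classes become evenly mixed."""
    n = len(leaf_labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(leaf_labels).values())

def entropy_bits(leaf_labels):
    """Shannon entropy in bits of the class distribution in a leaf."""
    n = len(leaf_labels)
    return -sum((k / n) * math.log2(k / n)
                for k in Counter(leaf_labels).values())

def leaf_prediction(leaf_labels, gini_threshold=0.5):
    """Majority class of a leaf plus a flag raised when the leaf is mixed.
    The threshold is a hypothetical value chosen for illustration."""
    majority = Counter(leaf_labels).most_common(1)[0][0]
    gini = gini_impurity(leaf_labels)
    return {"class": majority, "gini": gini, "uncertain": gini > gini_threshold}

# Toy leaves (invented): training intensity classes that ended up in each leaf.
clean_leaf = leaf_prediction([4, 4, 4, 5])
mixed_leaf = leaf_prediction([3, 3, 4, 4, 5, 5])
```

The first leaf is dominated by class 4 and is not flagged, while the second holds three classes in equal shares, so its prediction would be marked as uncertain.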

Real-world test during a recent Italian earthquake

The team evaluated their framework on a notable real event: a magnitude 5.5 earthquake off the Adriatic coast near Pesaro-Urbino in 2022. Within about 15 minutes, seismologists had the necessary source and ground-motion information, but only around 90 public intensity reports had been filed, giving a very patchy picture. Using just the instrumental data, the Random Forest and its surrogate tree generated detailed intensity estimates around hundreds of stations in under two seconds on a standard computer. When later compared with the much denser map built from more than 12,000 citizen reports collected over days, the machine-learning maps captured both the overall felt area and the spread of moderate shaking remarkably well, and matched or outperformed the classical equations.

What this means for people living with earthquakes

Overall, the study shows that a carefully trained machine-learning system can take the first minutes of seismometer data and produce rapid, reasonably transparent maps of earthquake impact. These maps do not replace detailed surveys or crowd-sourced reports, but they can bridge the dangerous early gap when authorities must choose where to send ambulances, firefighters, and structural inspectors with very limited information. By combining advanced algorithms with interpretable simplified models and basic uncertainty flags, the framework offers a practical step toward faster, more informed response to earthquakes in Italy and could be adapted to other regions facing similar seismic risks.

Citation: Patelli, L., Cameletti, M., De Rubeis, V. et al. Machine learning for prompt estimation of macroseismic intensity from seismometric data in Italy. Sci Rep 16, 7265 (2026). https://doi.org/10.1038/s41598-026-35740-x

Keywords: earthquake intensity, machine learning, random forest, seismic hazard, Italy