Clear Sky Science
Preparatory phase of large earthquakes illuminated by unsupervised categorization of earthquake catalog features
Why This Matters for People Living with Earthquake Risk
Communities around the world live with the fear of devastating earthquakes, yet scientists still cannot say exactly when and where the next big one will strike. This study explores whether subtle changes in everyday small earthquakes can reveal when a fault is quietly preparing for a much larger event. By using advanced pattern-finding techniques on earthquake catalogs, the researchers test if it is possible to spot a genuine “run-up” to major quakes—while also recognizing when no such warning exists.
Listening to the Fault Through Many Small Quakes
Large earthquakes do not usually happen out of nowhere. Before a major rupture, faults often experience changes such as foreshocks, swarms of small quakes, or slow, creeping motion. However, these preparatory phases vary greatly from place to place, and in some cases seem to be absent. The authors gather detailed catalogs of small and moderate earthquakes from five well-studied regions, including those struck by the 2023 Kahramanmaraş earthquake in Türkiye, the 2009 L’Aquila earthquake in Italy, and the 2014 Iquique megathrust earthquake in Chile. For each area, they examine years of seismicity leading up to the mainshock, looking for patterns that might signal a fault approaching failure.
From Raw Catalogs to Families of Related Events
Instead of treating every earthquake as an isolated point, the team groups events into “families” that are close in space, time, and magnitude. Each family contains a mainshock (its largest event) and its associated foreshocks and aftershocks. Around each event, the researchers calculate dozens of descriptive measures: how quickly quakes are occurring, how tightly they cluster in space and time, how much strain they release, and how the sizes of quakes are distributed. These event-based measurements are then averaged within each family and combined with simple descriptors of the family’s internal structure (for example, whether it looks more like a simple aftershock sequence or a more diffuse swarm). The result is a compact fingerprint for each family that captures how the local fault segment is behaving. 
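To make the idea of a family "fingerprint" concrete, here is a minimal Python sketch. It assumes family labels have already been assigned to each event, and the handful of features it computes (event rate, spatial spread, foreshock fraction) are illustrative stand-ins chosen for this example, not the dozens of measures used in the study. The names `Event`, `family_fingerprint`, and `fingerprints` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import hypot
from statistics import mean


@dataclass
class Event:
    t: float       # origin time, days since catalog start
    x: float       # easting, km
    y: float       # northing, km
    mag: float     # magnitude
    family: int    # family label, assumed to be assigned beforehand


def family_fingerprint(events):
    """Summarize one family with a few illustrative (hypothetical) features."""
    events = sorted(events, key=lambda e: e.t)
    main = max(events, key=lambda e: e.mag)   # largest event in the family
    duration = events[-1].t - events[0].t
    cx = mean(e.x for e in events)            # family centroid
    cy = mean(e.y for e in events)
    return {
        "n_events": len(events),
        "mainshock_mag": main.mag,
        # Events per day; single-event families get a rate of zero.
        "rate_per_day": len(events) / duration if duration > 0 else 0.0,
        # Mean distance of events from the family centroid.
        "spread_km": mean(hypot(e.x - cx, e.y - cy) for e in events),
        # Share of events before the largest one: low values resemble a
        # mainshock-aftershock sequence, higher values a more swarm-like one.
        "foreshock_fraction": sum(e.t < main.t for e in events) / len(events),
    }


def fingerprints(catalog):
    """Group events by family label and compute one fingerprint per family."""
    families = defaultdict(list)
    for event in catalog:
        families[event.family].append(event)
    return {fid: family_fingerprint(evs) for fid, evs in families.items()}
```

Each family ends up as one short row of numbers, which is the form a clustering algorithm needs as input.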
Letting the Data Organize Itself
Rather than telling the computer in advance what a “warning pattern” should look like, the authors use unsupervised machine learning. Specifically, they apply a k-means algorithm that automatically sorts earthquake families into categories with similar fingerprints. These categories range from more stable behavior—events spread out in time and space and releasing little strain—to more critical behavior, marked by tight clustering, strong interaction between events, and concentrated strain release. Crucially, the algorithm does not know when the big earthquake occurs; it simply groups families based on their features. The researchers then examine when and where the most “critical” categories appear in relation to the eventual mainshocks.
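The sorting step can be illustrated with a bare-bones k-means in plain Python. This is a sketch under stated assumptions, not the authors' implementation: the two features per family (rate and spatial spread) are made up for the example, features are z-scored first so that no single one dominates the distance, and the starting centroids come from a deterministic farthest-point seeding chosen here only to keep the example reproducible. The function names are hypothetical.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each feature column; constant columns are left unscaled."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Farthest-point seeding: start from the first row, then repeatedly add
    # the row farthest from all centroids chosen so far.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sqdist(r, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sqdist(r, centroids[j])) for r in rows
        ]
        if new_labels == labels:   # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids


# Hypothetical fingerprints: [events per day, spatial spread in km].
families = [[0.1, 5.0], [0.2, 6.0], [0.15, 5.5],   # sparse, spread out
            [3.0, 0.5], [3.2, 0.4], [2.9, 0.6]]    # dense, tightly clustered
labels, _ = kmeans(standardize(families), k=2)
```

Note that nothing in the input says which group is "critical". The algorithm only separates families with different fingerprints, and the interpretation comes afterwards.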
Where the Fault Really Warms Up—and Where It Does Not
For three earthquakes known to have clear preparatory phases (Kahramanmaraş, L’Aquila, and Iquique), the method successfully picks out long-lasting, highly localized families that appear shortly before the mainshock and stand apart from earlier activity. In these cases, the critical categories are associated with dense clusters of events, shrinking spatial footprints, and higher strain release, consistent with a fault segment focusing stress and damage prior to failure. By contrast, for two other events, the 2016 Amatrice earthquake in Italy and the 2024 Noto earthquake in Japan, the workflow finds no uniquely critical family that persists up to the mainshock. Amatrice appears to be preceded by relative quiet, and Noto by complex swarm activity involving fluids, suggesting that not all large earthquakes display a clear seismic warning in the catalog.
Toward Practical Early Information, with Care
Finally, the authors test whether their approach could work in something closer to real time. They train their categorization on an earlier period of data, then slide forward through the catalog to see when a new, distinct category appears. In the three cases with known preparatory phases, a marked change in the clustering measure occurs weeks to months before the large quake, hinting that the approach could eventually support operational earthquake forecasting. Still, the study emphasizes important limits: the method can only detect preparatory phases that actually produce detectable seismicity, relies on high-quality catalogs, and requires expert interpretation to judge whether a newly emerging category is truly critical. In short, this framework does not “predict” earthquakes, but it offers a physically grounded way to highlight when and where small quakes may be indicating that a fault is entering a more dangerous state.
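The train-then-slide-forward idea can be sketched as a simple novelty check. The summary does not specify which clustering measure the authors track, so the one used here is an assumption made for illustration: the distance from each new family fingerprint to its nearest reference centroid, flagged when it exceeds a multiple of the largest such distance seen in the training period. The function names and the `factor` parameter are hypothetical.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_distance(row, centroids):
    """Distance from one fingerprint to its closest reference centroid."""
    return min(sqdist(row, c) for c in centroids) ** 0.5


def first_novel_index(train_rows, centroids, stream_rows, factor=2.0):
    """Index of the first streamed fingerprint that fits no reference category.

    A fingerprint counts as novel when it lies farther from every reference
    centroid than `factor` times the largest distance observed in training
    (an assumed rule). Returns None if nothing is flagged.
    """
    threshold = factor * max(nearest_distance(r, centroids) for r in train_rows)
    for i, row in enumerate(stream_rows):
        if nearest_distance(row, centroids) > threshold:
            return i
    return None


# Reference categories learned on an earlier period (made-up numbers).
centroids = [[0.0, 0.0], [5.0, 5.0]]
train = [[0.0, 1.0], [1.0, 0.0], [5.0, 4.0], [4.0, 5.0]]
# Fingerprints arriving later, in time order.
stream = [[0.5, 0.5], [5.0, 6.0], [10.0, 0.0], [0.0, 0.0]]
flagged = first_novel_index(train, centroids, stream)
```

A flag of this kind only says that recent activity looks unlike anything in the reference period. As the study stresses, judging whether the new category is actually critical still requires expert interpretation.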
Citation: Karimpouli, S., Martínez-Garzón, P., Núñez-Jara, S. et al. Preparatory phase of large earthquakes illuminated by unsupervised categorization of earthquake catalog features. Nat Commun 17, 4024 (2026). https://doi.org/10.1038/s41467-026-72279-x
Keywords: earthquake forecasting, seismicity patterns, machine learning, foreshocks, fault mechanics