Clear Sky Science · en
The INGV data registry as a curated metadata infrastructure for Earth Science data stewardship
Why This Matters for Anyone Curious About Data
Every day, Italy’s National Institute of Geophysics and Volcanology (INGV) records enormous amounts of information about how our planet behaves. Turning this flood of numbers into knowledge that scientists, emergency managers, and the public can actually use is surprisingly hard. This article explains how INGV built a kind of master catalog for its data, one focused not on storing the files themselves but on describing them clearly and consistently, so that valuable observations about earthquakes, volcanoes, oceans, and the environment are easier to find, trust, and reuse.

From Scattered Records to a Single Map
INGV is a large organization spread across many offices, laboratories, and observatories throughout Italy. Its researchers monitor earthquakes, erupting volcanoes, the sea floor, the atmosphere, and much more, producing thousands of different datasets. In the past, these were scattered across project websites, institutional servers, and external archives, making it difficult even for INGV itself to know what it had. To meet growing expectations for “Open Science” in Europe—where data are shared widely and early—the institute adopted a “data-first” approach. Instead of waiting for scientific papers to be published, INGV now prioritizes releasing data and their descriptions quickly, complete with stable digital identifiers so that they can be cited and reused on their own.

A Catalog of Descriptions, Not a Giant Hard Drive
The heart of this effort is the INGV Data Registry, a curated catalog that holds only metadata—the standardized descriptions of each dataset—rather than the data files themselves. Each entry in the Registry points to where the data physically live, whether on INGV servers or in external platforms such as Zenodo or specialized Earth science repositories. Since its launch in 2019, the Registry has grown steadily to nearly 800 records, covering most of the institute’s earthquake, environmental, and volcano-related data. The catalog uses international description formats so that its entries can be read easily by other systems across Europe and beyond. Every record receives a permanent digital code (a DOI) and links the dataset to the people and institutions involved via global researcher and organization IDs.
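
For readers who think in code, the idea of a metadata-only entry can be sketched as a small record that describes a dataset and points to it without containing it. This is only an illustration: the class and field names, the placeholder DOI, the placeholder URL, and the ORCID-style researcher ID are assumptions, not the Registry's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Creator:
    name: str
    researcher_id: str  # assumed to be an ORCID-style global identifier


@dataclass
class RegistryRecord:
    """A description of a dataset; the data files themselves live elsewhere."""
    doi: str
    title: str
    data_location: str  # where the files physically live (hypothetical field)
    creators: list = field(default_factory=list)

    def doi_url(self):
        # DOIs resolve through the doi.org proxy.
        return f"https://doi.org/{self.doi}"


record = RegistryRecord(
    doi="10.1234/example-dataset",                        # placeholder DOI
    title="Example seismic catalog",
    data_location="https://example.org/data/example",     # placeholder URL
    creators=[Creator("Example Researcher", "0000-0000-0000-0000")],
)
print(record.doi_url())  # https://doi.org/10.1234/example-dataset
```

The point of the sketch is that the record stays small and stable even if the files behind `data_location` are large or hosted on an external platform.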

How Quality and Trust Are Built In
To keep this catalog reliable, INGV designed a three-step checking process that combines automatic tests with human review. When a researcher creates a new entry, an internal web tool checks for missing essentials such as author identifiers, time and place coverage, and licensing information. Only when these basic issues are fixed can the record move forward. Then, staff in the Data Management Office look at the entry’s completeness and confirm that the webpage where the DOI leads is accessible and properly structured. After that, local scientific managers and national department heads review the record for accuracy and strategic fit before it becomes visible to the public. This “human in the loop” design aims to keep data as open as possible while also protecting sensitive information, respecting privacy rules, and meeting new expectations for research security.
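
The checking process described above can be sketched roughly as follows. This is a minimal illustration, not INGV's actual tool: the field names, stage names, and functions are hypothetical, chosen only to mirror the automatic test and the two human reviews mentioned in the text.

```python
# Hypothetical field names; the article lists what is checked but not
# the Registry's real schema.
REQUIRED_FIELDS = ("author_ids", "temporal_coverage", "spatial_coverage", "license")

# The three review steps described in the text, followed by publication.
STAGES = ("automatic_check", "data_management_office", "scientific_review", "published")


def missing_essentials(record):
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def advance(record, stage, approved=True):
    """Return the stage a record moves to, or the same stage if it is held back."""
    if stage == "published":
        return stage
    if stage == "automatic_check":
        # The automatic step passes only when nothing essential is missing.
        passed = not missing_essentials(record)
    else:
        # The later steps depend on a human reviewer's decision.
        passed = approved
    return STAGES[STAGES.index(stage) + 1] if passed else stage


# A draft with a placeholder author ID and a license, but no coverage details.
draft = {"author_ids": ["0000-0000-0000-0000"], "license": "CC-BY-4.0"}
print(missing_essentials(draft))          # ['temporal_coverage', 'spatial_coverage']
print(advance(draft, "automatic_check"))  # automatic_check
```

In this sketch a record cannot reach a human reviewer until the automatic step finds nothing missing, and a reviewer can hold it at any later stage by passing `approved=False`.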

Connecting to the Wider World of Science
The Registry is not a closed box; it sits at the center of a wider web of services. Once approved, each metadata record is automatically published on INGV’s open data portal and made available through several programming interfaces used by other institutions. European research infrastructures for solid Earth science, ocean observing systems, national and European open data portals, and global DOI services can all harvest these descriptions. This makes INGV’s datasets visible within a worldwide graph of linked research objects, where data, software, articles, people, and organizations are all connected. At the same time, the system helps INGV’s own managers keep track of what has been produced, which is especially important during crises such as major earthquakes or eruptions, when many temporary monitoring networks are deployed and new data streams appear rapidly.

Looking Ahead to Smarter Discovery
Although the Registry already improves how INGV’s data are organized and shared, the authors note several remaining challenges. Some researchers still upload data to outside platforms without registering them, weakening the institute’s overview. The growing volume of entries can be overwhelming for newcomers, who may not know which datasets are relevant. To address this, INGV is planning more intuitive, visual ways to browse the catalog and to integrate it with new institutional repositories. The team is also testing automated tools that score how well each dataset follows the “FAIR” principles (findable, accessible, interoperable, and reusable) and exploring how to make the descriptions clearer for artificial intelligence systems that increasingly help users search for information.

What This Means for Our Understanding of Earth
For non-specialists, the key message is simple: when data are carefully described, given stable identities, and checked for quality, they become much more powerful. INGV’s Data Registry turns a patchwork of separate archives into a coherent, navigable landscape of information about how the Earth behaves. This makes it easier for scientists worldwide to combine Italian earthquake and volcano data with other sources, reproduce past studies, and build new ones more quickly. In the long run, such metadata infrastructures help transform raw measurements into shared knowledge that can improve hazard assessment, support civil protection, and deepen our understanding of the restless planet we live on.

Citation: Locati, M., Mazza, S., Montalto, P. et al. The INGV data registry as a curated metadata infrastructure for Earth Science data stewardship. Sci Data 13, 607 (2026). https://doi.org/10.1038/s41597-026-06980-3
Keywords: earth science data, research data catalog, open science, metadata registry, FAIR principles