Clear Sky Science · en

FAIR data gaps and collaboration willingness among hemoglobinopathy research centers

Why better data can change lives

Serious blood disorders like sickle cell disease and thalassemia affect millions of people worldwide, often with lifelong complications. Doctors and scientists already collect a great deal of information about these patients, from simple blood tests to complex genetic scans. But much of this information sits in isolated computer files or hospital systems that cannot easily talk to one another. This article describes an international effort to find out how well hemoglobin disorder centers are organizing their data today, where the biggest gaps are, and how ready they are to work together so that patient information can more effectively drive new discoveries and better care.

Who took part in this data health check

The study surveyed 44 teams in 22 countries that belong to HELIOS, a European-led network focused on inherited blood diseases. Participants included doctors, laboratory scientists, data specialists, and patient advocates, reflecting the many people involved in managing these conditions. Centers ranged from highly resourced hospitals in Western Europe to clinics and labs in countries with fewer research resources, including parts of Eastern Europe, Africa, and Asia. Together they care for patients with sickle cell disease and alpha- and beta-thalassemia, three of the most common inherited blood disorders worldwide.

Figure 1

What data these centers actually have

The survey shows that most centers already collect the basics needed to understand these diseases. Nearly all have information on patients’ age and sex, routine laboratory measurements, and the key gene changes that cause their blood problems. However, more detailed information is harder to find. Fewer centers capture structured records of symptoms, hospital procedures, or imaging tests in a way that computers can easily search. The scarcest resources are advanced data types such as whole-genome sequencing, other “omics” data, or device-based measurements, which tend to be available only in better-funded research settings. Centers in less-resourced countries were especially likely to lack these rich data types, making it harder for them to join cutting-edge studies.

How the data are stored and shared

The study also looked at how centers store their information and whether it is ready to be combined across sites. Almost half of the respondents keep research data in local databases, often using familiar tools such as spreadsheets or REDCap, a common research platform. Very few use international standards designed to make health data plug-and-play across institutions. Only one in five centers reported using widely accepted medical coding systems for symptoms or diagnoses, and none reported using popular “common data models” that are becoming the norm in large health studies. Practical safeguards such as formal rules for how long data are kept or standardized methods for removing personal identifiers were also uncommon, and many staff were unsure what their own institutions were doing.

Figure 2

Openness to working together

Despite these technical gaps, the human side of collaboration looks much more promising. Most teams had already used their data for published or ongoing research, and an overwhelming 95% said they were interested in taking part in multi-center studies. A striking 86% were willing to join “federated” projects, where data stay on local servers and only summary results are shared, an approach that can help protect privacy and ease legal concerns. Even some centers that had not yet used their data for research said they were ready to participate, suggesting a pool of untapped information that could be brought into future projects with the right support.
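
To make the “federated” idea concrete, here is a minimal sketch in Python of how a pooled average could be computed when each center shares only a count and a sum rather than patient-level records. It is an illustration under stated assumptions, not the method used by the study or by HELIOS: the names SiteSummary, summarize_locally, and pooled_mean are hypothetical, and the hemoglobin values are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class SiteSummary:
    """Aggregate statistics a center shares (no patient-level rows)."""
    n: int
    total: float


def summarize_locally(values: list[float]) -> SiteSummary:
    # Runs inside each center; only the count and the sum are returned.
    return SiteSummary(n=len(values), total=sum(values))


def pooled_mean(summaries: list[SiteSummary]) -> float:
    # The coordinator combines counts and sums, never individual records.
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("no observations reported by any site")
    return sum(s.total for s in summaries) / n


# Invented hemoglobin values (g/dL) held at three hypothetical centers.
site_a = [8.0, 9.0, 10.0]
site_b = [7.0, 9.0]
site_c = [11.0]

summaries = [summarize_locally(v) for v in (site_a, site_b, site_c)]
print(pooled_mean(summaries))  # 9.0
```

Sharing sums and counts gives the same average as pooling every record, which is why this pattern suits simple statistics. Real federated platforms typically add further safeguards, for example refusing to report results from very small groups, because summaries of only a handful of patients can still reveal information about individuals.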

What needs to change next

When the researchers compared current practices to the ideal of “FAIR” data (information that is Findable, Accessible, Interoperable, and Reusable), they found a mixed picture. About half the centers had at least basic documentation describing their datasets, and more than half reported that their data had already been used in research with proper approvals. But almost none had deposited datasets in public repositories or adopted the shared technical standards that make combining data from different places straightforward. The authors argue that targeted investments in infrastructure, training, and simple governance toolkits could help centers, particularly in under-resourced regions, move toward common formats and safer, more consistent data handling.

Why this matters for patients

For people living with sickle cell disease or thalassemia, the details of data standards may sound abstract, but they have very real consequences. When information about thousands of patients can be combined and compared, researchers are better able to spot which treatments work best, which complications to watch for, and how new gene-based therapies perform over the long term. This survey shows that while many hemoglobin disorder centers still rely on fragmented and inconsistent data systems, there is a strong appetite for working together—especially through privacy-preserving networks that let data stay close to home. Turning that willingness into well-organized, connected data could speed up discoveries and help ensure that patients, no matter where they live, benefit from the best available evidence.

Citation: Tamana, S., Yiangou, K., Orphanou, K. et al. FAIR data gaps and collaboration willingness among hemoglobinopathy research centers. Sci Data 13, 582 (2026). https://doi.org/10.1038/s41597-026-06950-9

Keywords: hemoglobinopathies, rare disease data, FAIR data, federated research, data interoperability