Clear Sky Science · en

Predicting cardiopulmonary exercise testing outcomes in congenital heart disease through multimodal data integration and geometric learning

Why this heart study matters

For people born with heart defects, growing up and living into adulthood often means facing uncertainty: Will my heart keep up with daily life, exercise, or major surgery? This study explores whether information already collected in routine care—heart tracings and clinic letters—can be combined and analysed with modern computing techniques to predict how well a patient’s heart and lungs will perform during exercise, without always needing a demanding test.

Understanding fitness from breath and heartbeat

Doctors frequently use a specialised treadmill or bike exam, called a cardiopulmonary exercise test, to see how much oxygen a person can use and how efficiently they breathe out carbon dioxide. These measurements give a powerful snapshot of overall fitness and future health risk, especially in adults living with congenital heart disease. However, the test is time-consuming, requires special equipment, and is not available to every patient or at every hospital.

Bringing together scattered patient information

The researchers gathered several kinds of information from 436 adults with congenital heart disease who were being followed at a specialist centre in Scotland. They digitised more than four thousand standard 12‑lead electrocardiograms (brief recordings of the heart’s electrical activity) and converted written clinic letters and exercise reports into structured, computer‑readable form. From these documents they extracted key details about each person’s diagnoses, heart operations, and medications, while stripping out identifying information. For the 258 patients who had undergone exercise testing, they focused on two core measures known to predict survival: peak oxygen uptake and the amount of breathing needed to clear carbon dioxide.

Figure 1.

Finding patterns with geometry instead of brute force

Because congenital heart disease is relatively rare and highly varied, the team could not rely on enormous data sets like those used to train many modern artificial intelligence systems. Instead, they represented each ECG as a summary of how the signals from the different leads vary together—a mathematical fingerprint of the heart’s electrical pattern. These fingerprints take the form of covariance matrices, which the authors analysed using tools from a branch of mathematics called Riemannian geometry. In practical terms, this allowed them to measure similarities between heart signals more sensitively and to create realistic new synthetic examples by smoothly “mixing” existing patients’ patterns, helping the computer model learn from a small and imbalanced sample.
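To make the idea of "mixing" ECG fingerprints more concrete, here is a minimal sketch in Python. It is not the authors' code. The article says only that covariance matrices were analysed with Riemannian geometry, so the specific choice of the affine-invariant metric, the function names (`ecg_covariance`, `riemannian_distance`, `geodesic_mix`), the `shrinkage` parameter, and the random stand-in signals are all assumptions made for illustration.

```python
import numpy as np


def _spd_func(mat, func):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * func(vals)) @ vecs.T


def ecg_covariance(ecg, shrinkage=1e-3):
    """Lead-by-lead covariance of an ECG array shaped (n_leads, n_samples).

    A small multiple of the identity is added (an assumption of this sketch)
    so the matrix stays strictly positive definite. That matters for real
    12-lead ECGs, where four limb leads are linear combinations of the others.
    """
    cov = np.cov(ecg)
    n = cov.shape[0]
    return cov + shrinkage * (np.trace(cov) / n) * np.eye(n)


def _whiten(a, b):
    """Return A^(-1/2) B A^(-1/2), symmetrised against round-off."""
    a_inv_sqrt = _spd_func(a, lambda v: 1.0 / np.sqrt(v))
    m = a_inv_sqrt @ b @ a_inv_sqrt
    return (m + m.T) / 2.0


def riemannian_distance(a, b):
    """Affine-invariant distance between two SPD matrices (assumed metric)."""
    vals = np.linalg.eigvalsh(_whiten(a, b))
    return float(np.sqrt(np.sum(np.log(vals) ** 2)))


def geodesic_mix(a, b, t):
    """Point a fraction t of the way from A to B along the geodesic.

    t=0 returns A, t=1 returns B, and values in between give a synthetic
    covariance matrix that is still symmetric positive definite.
    """
    a_sqrt = _spd_func(a, np.sqrt)
    inner = _spd_func(_whiten(a, b), lambda v: v ** t)
    out = a_sqrt @ inner @ a_sqrt
    return (out + out.T) / 2.0


# Random signals stand in for two patients' 12-lead recordings (hypothetical data).
rng = np.random.default_rng(0)
ecg_a = rng.standard_normal((12, 5000))
ecg_b = 2.0 * rng.standard_normal((12, 5000))

cov_a, cov_b = ecg_covariance(ecg_a), ecg_covariance(ecg_b)
synthetic = geodesic_mix(cov_a, cov_b, t=0.5)  # halfway "mixture" of the two

print(riemannian_distance(cov_a, cov_b), riemannian_distance(cov_a, synthetic))
```

The reason for interpolating along a geodesic rather than averaging entry by entry is that the result is guaranteed to remain a valid covariance matrix and sits at a proportional distance between the two originals under the chosen metric. Whether the authors used this exact interpolation scheme is not stated in the article.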

Blending words and waves for better predictions

The study compared several approaches to predicting exercise performance from these data. Models that used only basic ECG measurements, such as the standard interval and rate values reported on routine printouts, performed poorly. When the researchers instead fed in the richer ECG fingerprints, prediction accuracy improved noticeably. The biggest gains came when they combined those ECG fingerprints with information drawn from clinic letters, so that the model “knew” both how the heart’s electricity behaved and what conditions, operations, and medications the person had. With this fusion of data plus their geometry‑based augmentation, the computer’s estimates of peak oxygen uptake correlated moderately well with the actual test results, outperforming simpler methods both in continuous prediction and in grouping patients into risk bands.
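One way to picture the fusion step is sketched below in Python. The article does not describe how the two data sources were actually joined or which model was used, so everything here is an assumption for illustration: the log-Euclidean flattening of each covariance matrix, plain concatenation with binary clinical flags, the closed-form ridge regression, the helper names (`log_vectorise`, `fuse`, `fit_ridge`), and the randomly generated data.

```python
import numpy as np


def log_vectorise(cov):
    """Flatten an SPD matrix into a vector via its matrix logarithm."""
    vals, vecs = np.linalg.eigh(cov)
    log_cov = (vecs * np.log(vals)) @ vecs.T
    rows, cols = np.triu_indices(cov.shape[0])
    # Off-diagonal entries are scaled by sqrt(2) so that the vector's
    # Euclidean norm equals the Frobenius norm of the log-matrix.
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return log_cov[rows, cols] * weights


def fuse(covs, clinical):
    """Concatenate ECG fingerprint vectors with clinical feature rows."""
    ecg_part = np.stack([log_vectorise(c) for c in covs])
    return np.hstack([ecg_part, clinical])


def fit_ridge(x, y, alpha=1.0):
    """Closed-form ridge regression; returns (weights, intercept)."""
    x_mean, y_mean = x.mean(axis=0), y.mean()
    xc = x - x_mean
    gram = xc.T @ xc + alpha * np.eye(x.shape[1])
    w = np.linalg.solve(gram, xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w


# Hypothetical data: 40 patients, 12-lead covariances, 3 yes/no clinical flags.
rng = np.random.default_rng(1)
covs = []
for _ in range(40):
    signal = rng.standard_normal((12, 200))
    covs.append(signal @ signal.T / 200)
clinical = rng.integers(0, 2, size=(40, 3)).astype(float)
peak_vo2 = rng.normal(25.0, 5.0, size=40)  # placeholder targets, not real values

features = fuse(covs, clinical)  # 78 ECG values + 3 clinical flags per patient
w, b = fit_ridge(features, peak_vo2, alpha=10.0)
predicted = features @ w + b

# In-sample correlation on random data; it says nothing about real performance.
r = np.corrcoef(peak_vo2, predicted)[0, 1]
print(features.shape, r)
```

In a real analysis the correlation would be computed on patients held out from training, and the clinical columns would come from the structured information extracted from clinic letters rather than random flags.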

Figure 2.

What this means for patients and care teams

The work does not yet replace exercise testing, and the authors acknowledge that their classification accuracy is still too modest for direct clinical decision‑making. But their results show that carefully designed models, which respect the structure of the data and draw on both heart tracings and narrative clinical information, can meaningfully anticipate how well a person with congenital heart disease will cope with physical exertion. In the future, with larger and more diverse data sets, similar tools could help flag patients whose fitness is slipping before symptoms become obvious, support decisions about surgery or lifestyle changes, and extend advanced risk assessment to hospitals that lack full exercise‑testing facilities.

Citation: Alkan, M., Veldtman, G. & Deligianni, F. Predicting cardiopulmonary exercise testing outcomes in congenital heart disease through multimodal data integration and geometric learning. Sci Rep 16, 9910 (2026). https://doi.org/10.1038/s41598-026-38687-1

Keywords: congenital heart disease, cardiopulmonary exercise testing, electrocardiogram, machine learning, risk prediction