Clear Sky Science

Spectroscopic and machine learning approaches for clinical subtyping in systemic sclerosis

Why a blood test for a rare disease matters

Systemic sclerosis is a rare autoimmune disease that scars the skin and internal organs, often damaging the lungs and blood vessels. Doctors struggle to predict which patients will develop the most severe forms, because today’s blood tests only tell part of the story. This study explores whether a quick, minimally invasive test that shines infrared light through a drop of blood, combined with computer analysis, could help sort patients into more precise groups and guide care in the future.

Figure 1.

Looking for hidden clues in a drop of blood

Instead of searching for one specific molecule, the researchers used a technique called infrared spectroscopy, which reads the combined “fingerprint” of many chemicals in the blood at once. Each type of molecule—such as fats, proteins, and sugars—absorbs infrared light in a slightly different way. By measuring these patterns in 59 people with systemic sclerosis, the team asked whether the overall chemical makeup of the blood differed between two main forms of the disease (diffuse and limited) and between patients with and without scarring of the lungs, known as interstitial lung disease.

Subtle differences in fats and proteins

The infrared measurements revealed a series of peaks that correspond to major ingredients of blood, including the building blocks of proteins and lipids (fats). When the researchers averaged the spectra across patients, they saw small but consistent shifts in regions linked to protein structure and blood fats—especially in bands known to reflect how proteins are folded and how fatty molecules are arranged. These differences appeared when comparing diffuse versus limited disease, and in a milder way when comparing patients with and without lung involvement. However, when they looked at the size of individual peaks or simple ratios between peaks, the differences were not strong enough to be statistically convincing on their own.

Figure 2.

Letting computers find patterns people can’t see

To dig deeper into the data, the team turned to multivariate statistics and machine learning. First, they used a method that compresses thousands of infrared data points into a few new coordinates that capture most of the variation between samples. In this reduced space, samples from the two disease subtypes showed a tendency to cluster apart along the main axis, suggesting a real underlying biochemical difference, though there was still noticeable overlap. Then the researchers trained several computer models to classify the blood spectra, including decision trees, k‑nearest neighbors, support vector machines, neural networks, and random forests. After careful tuning, these models reached moderate accuracy in telling the diffuse and limited forms apart, with the random forest approach performing best overall, while distinctions based on lung scarring or other clinical features were weaker.
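The compression step described above is commonly done with principal component analysis (PCA), although this summary does not name the method the authors used. The following is a minimal sketch under that assumption: it finds the first principal component of a handful of made-up "spectra" by power iteration. The spectra, the `shift` pattern, and all function names are hypothetical and are not taken from the study. The toy groups are built to separate cleanly, unlike the overlapping groups the researchers report.

```python
import random

def first_principal_component(rows, iterations=100):
    """Return (direction, scores) for the first principal component of rows."""
    n, d = len(rows), len(rows[0])
    # Center every feature (one per wavenumber) on zero.
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Power iteration on X^T X, applied as X^T (X v) to avoid building the matrix.
    v = [1.0 / d ** 0.5] * d
    for _ in range(iterations):
        proj = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(d)]
        norm = sum(w * w for w in v) ** 0.5
        v = [w / norm for w in v]
    # Each sample's coordinate along the main axis of variation.
    scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
    return v, scores

# Hypothetical toy data: six absorbance values per sample, not real spectra.
rng = random.Random(0)
base = [0.2, 0.5, 1.0, 0.6, 0.3, 0.1]
shift = [0.0, 0.0, 0.3, -0.2, 0.0, 0.0]  # invented group difference

def toy_spectrum(shifted):
    return [b + (s if shifted else 0.0) + rng.uniform(-0.02, 0.02)
            for b, s in zip(base, shift)]

group_a = [toy_spectrum(True) for _ in range(5)]
group_b = [toy_spectrum(False) for _ in range(5)]
direction, scores = first_principal_component(group_a + group_b)
scores_a, scores_b = scores[:5], scores[5:]
print("group A scores:", [round(s, 3) for s in scores_a])
print("group B scores:", [round(s, 3) for s in scores_b])
```

The sign of a principal component is arbitrary, so only the separation between the two sets of scores is meaningful, not which group lands on the positive side. In practice, an established library implementation would likely be preferred over hand-written power iteration.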

Promise and limits of an emerging blood test

Although the machine learning models did better than chance, their reliability and ability to assign robust probabilities were not yet strong enough for routine clinical use. The results were affected by the modest number of patients and by imbalances between groups, which can cause some models to favor the more common subtype. The authors emphasize that better pre‑processing of the spectra, smarter selection of the most informative regions, and larger, more diverse patient cohorts are needed. They also suggest that combining infrared fingerprints with other modern techniques, such as metabolomics or protein profiling, could sharpen the signal.
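The effect of group imbalance can be illustrated with a small, purely hypothetical example. The sketch below compares plain accuracy with balanced accuracy (the mean of per-class recall) for a "model" that always predicts the more common class. The 30-to-10 split and the labels `majority` and `minority` are invented for illustration and do not reflect the study's cohort or the metrics the authors used.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced cohort (not the study's numbers).
y_true = ["majority"] * 30 + ["minority"] * 10
# A degenerate model that ignores the spectrum and always picks the common class.
y_pred = ["majority"] * len(y_true)

plain = accuracy(y_true, y_pred)
balanced = balanced_accuracy(y_true, y_pred)
print(plain, balanced)
```

Here the always-majority model looks respectable on plain accuracy yet scores no better than a coin flip on balanced accuracy, which is why imbalance-aware metrics are often reported alongside raw accuracy for small clinical cohorts.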

What this could mean for patients

For people living with systemic sclerosis, this work does not immediately change diagnosis or treatment, but it points toward a future in which a simple, low‑cost blood test could help doctors sort patients into biologically meaningful subgroups and spot early signs of lung damage. The study shows that the blood’s overall chemical signature carries information about how the disease behaves, and that smart algorithms can begin to read that signature. With further refinement and larger studies, this approach could become a helpful companion to existing tests, improving risk assessment and guiding more personalized care.

Citation: Miziołek, B., Miszczyk, J., Paja, W. et al. Spectroscopic and machine learning approaches for clinical subtyping in systemic sclerosis. Sci Rep 16, 6929 (2026). https://doi.org/10.1038/s41598-026-37690-w

Keywords: systemic sclerosis, infrared spectroscopy, blood biomarkers, machine learning, interstitial lung disease