Clear Sky Science
Diagnosis model of early malignant pulmonary nodules based on clinical laboratory data
Why tiny spots in the lung matter
When a routine scan of the chest reveals a small spot in the lung, doctors and patients face a difficult question: is it an early cancer that needs swift treatment, or a harmless change that can be safely watched? Current imaging tests often cannot tell the difference, leading some people to undergo unnecessary surgery while others may have dangerous delays in diagnosis. This study explores whether a simple blood test, interpreted with modern computer techniques, can help sort these lung “nodules” into lower- and higher-risk groups.
Limits of today’s scanning and blood tests
Low-dose CT scans are now widely used to screen people at risk of lung cancer because they can reveal tumors while they are still small. But these scans also detect many benign nodules, and the great majority of positive findings can turn out not to be cancer. Traditional blood markers used in lung clinics have not solved this problem; they tend to miss many early cancers and cannot reliably guide decisions on their own. As a result, doctors often rely heavily on experience and subtle scan details, and their readings can differ from one specialist to another.
Turning immune fingerprints into clues
One promising idea is to look at the body’s own immune response. Long before a lung tumor is large enough to be seen on a scan, the immune system may recognize abnormal proteins from cancer cells and produce matching antibodies. A panel of seven such autoantibodies has already been approved in China and is known to be quite specific for lung cancer, but it still misses many cases. The researchers in this study asked whether combining this antibody panel with routine lab measurements taken from a standard blood draw could paint a fuller picture of cancer risk.

Teaching computers to recognize patterns
The team analyzed data from 310 patients who had lung nodules confirmed by tissue examination: 142 had early-stage cancer and 168 had benign conditions such as scars or inflammation. For each person, they recorded levels of the seven autoantibodies, basic information like sex, and a wide range of common blood test results, including measures of blood cells, proteins, and inflammation. Using a statistical method to trim away less useful information, they narrowed the list to 12 key factors. They then trained and compared 11 different machine learning approaches, a family of algorithms that learn patterns from examples rather than relying on fixed formulas.
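For readers who want to see what "training and comparing" several approaches looks like in practice, here is a minimal sketch in Python using the scikit-learn library. It is not the authors' pipeline: the data are synthetic (310 rows and 12 columns, chosen only to echo the cohort size and the number of key factors), only three of many possible algorithms are shown, and the names X, y, candidates and mean_auc are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (NOT the study data):
# 310 "patients", 12 "factors", roughly 168 benign (class 0) vs 142 cancer (class 1).
X, y = make_classification(
    n_samples=310, n_features=12, n_informative=5, n_redundant=2,
    weights=[168 / 310], random_state=0,
)

# Three example algorithms; the study compared 11, which the article does not list.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "naive_bayes": GaussianNB(),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Score every candidate the same way: mean area under the ROC curve (AUC)
# across 5 cross-validation folds.
mean_auc = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in candidates.items()
}

# Print the candidates from best to worst.
for name, auc in sorted(mean_auc.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: AUC = {auc:.3f}")
```

Scoring every candidate on the same folds with the same metric is what makes the comparison fair; which algorithm comes out ahead on this synthetic data says nothing about the study's results.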
A focused model built for the clinic
Among all the tested approaches, a method called random forest stood out for its balance of accuracy and stability when evaluated on an independent group of patients. To keep the future test practical, the researchers used an explanation tool to see which inputs contributed most to the model’s decisions. This allowed them to shrink the model down to just five blood-based features: one common clotting protein known as fibrinogen and four of the autoantibodies, named p53, SOX2, MAGE A1, and GBU4-5. Even in this slimmed-down form, the model retained nearly all of the full 12-factor version’s ability to distinguish cancerous from non-cancerous nodules.
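The idea of slimming a model down to its most useful inputs can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the data are synthetic, the article does not name the explanation tool, and this sketch substitutes the random forest's built-in impurity-based importance scores to rank the inputs. The variable names (full_model, small_model, top5) are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the study data): 310 rows, 12 candidate features.
X, y = make_classification(
    n_samples=310, n_features=12, n_informative=5, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step 1: fit a random forest on all 12 features and score it on held-out data.
full_model = RandomForestClassifier(n_estimators=300, random_state=0)
full_model.fit(X_train, y_train)
full_auc = roc_auc_score(y_test, full_model.predict_proba(X_test)[:, 1])

# Step 2: rank features by the forest's impurity-based importance
# (an assumed substitute for the unnamed explanation tool) and keep the top 5.
top5 = np.argsort(full_model.feature_importances_)[::-1][:5]

# Step 3: refit using only those 5 columns and score on the same held-out data.
small_model = RandomForestClassifier(n_estimators=300, random_state=0)
small_model.fit(X_train[:, top5], y_train)
small_auc = roc_auc_score(y_test, small_model.predict_proba(X_test[:, top5])[:, 1])

print(f"12-feature AUC: {full_auc:.3f}")
print(f" 5-feature AUC: {small_auc:.3f}")
```

Comparing the two scores on the same held-out patients is the check that matters: if the smaller model performs nearly as well, the dropped inputs were adding little.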

How this tool might be used
In testing, the model correctly identified many true cancers while maintaining high specificity, meaning that most benign nodules were correctly classified as low-risk. However, its sensitivity (the share of cancers it detects) was about two-thirds, too low for it to serve as a stand-alone screening test. Instead, the authors suggest it could become an “extra voice” at the decision table: helping doctors recognize people whose nodules are very unlikely to be cancer and who might safely avoid immediate invasive procedures, while still relying on imaging and clinical judgment for final decisions.
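Because sensitivity, specificity, and related measures are easy to mix up, a small worked example may help. The counts below are invented purely for illustration (they are not results from the study); they were picked so that sensitivity comes out at about two-thirds. The function name diagnostic_metrics is likewise just an assumption for this sketch.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute four common test-accuracy measures from confusion-matrix counts.

    tp: cancers flagged as high-risk      fn: cancers missed (called low-risk)
    tn: benign nodules called low-risk    fp: benign nodules flagged as high-risk
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of cancers the test detects
        "specificity": tn / (tn + fp),  # share of benign nodules called low-risk
        "ppv": tp / (tp + fp),          # share of high-risk calls that are cancer
        "npv": tn / (tn + fn),          # share of low-risk calls that are benign
    }


# Hypothetical counts for illustration only (NOT taken from the study):
# 30 cancers (20 detected, 10 missed) and 30 benign nodules (27 cleared, 3 flagged).
metrics = diagnostic_metrics(tp=20, fn=10, tn=27, fp=3)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these made-up numbers, specificity is 0.90 but only about 73% of "low-risk" calls are truly benign, because a third of the cancers were missed. Imaging and clinical judgment keep the final say for that reason when sensitivity is modest.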
What this means for patients
This research offers a proof of concept that information already available from a routine blood draw, when interpreted by a carefully checked machine learning model, can help clarify the risk that a lung nodule is malignant. The authors even built a web-based calculator so other groups can test the approach. For now, the work remains experimental: it was done at a single hospital, in a modest number of patients, and mainly in one type of lung cancer. Larger, multi-center studies will be needed to show whether the model truly improves care and is reliable in everyday practice. If those future tests succeed, such tools could reduce unnecessary operations, focus attention on patients who need it most, and make the discovery of a small lung spot a little less frightening.
Citation: Liu, L., Li, H., Miao, Y. et al. Diagnosis model of early malignant pulmonary nodules based on clinical laboratory data. Sci Rep 16, 12172 (2026). https://doi.org/10.1038/s41598-026-42111-z
Keywords: lung cancer, pulmonary nodules, autoantibodies, machine learning, blood biomarkers