Clear Sky Science · en
Identification of diagnostic and prognostic biomarkers in lung adenocarcinoma through integrated bioinformatics analysis and real time PCR validation
Why spotting lung cancer early matters
Lung cancer is one of the deadliest cancers, in large part because it is often discovered too late. The most common form, lung adenocarcinoma, can grow quietly for years before causing symptoms. This study explores whether patterns in our blood and tumor tissue can reveal the disease much earlier. By combining big genetic datasets with artificial intelligence and then double‑checking the results in real patients, the researchers aim to find simple blood markers that could one day help doctors detect lung cancer sooner and guide treatment.
Looking for warning signs in genes
The team began with RNA sequencing data from 522 people: 506 with lung adenocarcinoma and 16 healthy controls. RNA is the "working copy" of our genes and reflects which genes are turned on or off in cells. After carefully cleaning and normalizing the data, they compared gene activity levels between cancer and non‑cancer samples. This revealed 3,513 genes whose activity differed significantly between the two groups. These genes, called differentially expressed genes, formed the raw material for a computer model that could learn to distinguish cancer from healthy tissue based on gene patterns.
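To make the idea of a "differentially expressed gene" concrete, here is a minimal sketch in Python. It is not the study's pipeline: this summary does not name the software or cutoffs the authors used, so the gene names, expression values, and thresholds below are all invented for illustration. The sketch compares average activity between the two groups (a log2 fold change) and checks that the gap is large relative to the sample-to-sample noise (a Welch t statistic).

```python
import math
from statistics import mean, variance

def log2_fold_change(case, control, pseudocount=1.0):
    """Log2 ratio of group means; the pseudocount avoids division by zero."""
    return math.log2((mean(case) + pseudocount) / (mean(control) + pseudocount))

def welch_t(case, control):
    """Welch's t statistic: difference in means divided by its standard error."""
    se = math.sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / se

def call_differential(expression, min_abs_lfc=1.0, min_abs_t=3.0):
    """Label each gene 'up', 'down', or 'unchanged' using hypothetical cutoffs."""
    calls = {}
    for gene, (case, control) in expression.items():
        lfc = log2_fold_change(case, control)
        t = welch_t(case, control)
        if abs(lfc) >= min_abs_lfc and abs(t) >= min_abs_t:
            calls[gene] = "up" if lfc > 0 else "down"
        else:
            calls[gene] = "unchanged"
    return calls

# Invented, already-normalized expression values: (cancer samples, control samples).
toy_expression = {
    "GENE_UP": ([64.0, 70.0, 58.0, 66.0], [15.0, 17.0, 16.0, 14.0]),
    "GENE_DOWN": ([7.0, 9.0, 8.0, 6.0], [30.0, 34.0, 31.0, 33.0]),
    "GENE_FLAT": ([20.0, 22.0, 19.0, 21.0], [21.0, 20.0, 22.0, 19.0]),
}

calls = call_differential(toy_expression)
for gene, label in calls.items():
    print(gene, label)
```

A real analysis would also convert each statistic into a p-value and correct for testing thousands of genes at once. The sketch leaves that out to stay short.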

Teaching computers to recognize cancer
To sort through thousands of genes, the researchers used a deep learning approach, a kind of artificial intelligence inspired by networks of brain cells. They built a neural network with several hidden layers that took in gene activity data and learned to classify each sample as cancerous or healthy. The model was trained on most of the data and then tested on a separate portion that it had never seen before. Performance was striking: the system correctly identified cases and controls with about 98% accuracy, an area under the curve (AUC) reported as 1.0, the top of the scale, and an extremely low error rate in its probability estimates. The first two figures are compatible: AUC measures how well the model ranks cancer samples above healthy ones, while accuracy also depends on the cutoff used to call a sample cancerous. From this model they pulled out 20 genes that contributed most strongly to its decisions, highlighting a short list of promising candidates for further study.
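The three performance figures mentioned above can be illustrated with a short Python sketch. The labels and predicted probabilities are invented, and the Brier score is used here only as one common way to measure error in probability estimates; this summary does not say which measure the authors reported. The example also shows how a model can rank every cancer sample above every healthy one (AUC of 1.0) and still misclassify a sample at a fixed 0.5 cutoff.

```python
def accuracy(labels, probs, threshold=0.5):
    """Fraction of samples classified correctly at the given probability cutoff."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)

def roc_auc(labels, probs):
    """AUC as the chance that a random positive scores above a random negative."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier_score(labels, probs):
    """Mean squared gap between predicted probability and the true 0/1 label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

# Invented held-out test set: 1 = cancer, 0 = healthy.
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.99, 0.97, 0.95, 0.92, 0.90, 0.60, 0.55, 0.10, 0.05, 0.02]

acc = accuracy(labels, probs)
auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
print(f"accuracy={acc:.2f} auc={auc:.2f} brier={brier:.4f}")
```

In this toy data the healthy sample scored at 0.55 is wrongly called cancerous, so accuracy drops below 100%, yet every cancer sample still scores higher than every healthy one, which keeps the AUC at its maximum.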
From computer predictions to real blood tests
Finding gene patterns in large databases is only useful if those patterns show up in real people. To test this, the researchers collected blood from 30 lung adenocarcinoma patients (all with early‑ to mid‑stage disease and no prior treatment) and 30 healthy volunteers matched for age and sex. Using a laboratory method called real‑time PCR, they measured how strongly several predicted marker genes were expressed in blood cells. Four genes in particular stood out. CYP2C9, KRT14, and PECAM1 were much more active in patients’ blood than in that of healthy volunteers, while A2M was less active. For example, CYP2C9 levels were about four times higher and KRT14 about eight times higher in patients, whereas A2M was roughly half as abundant. These clear differences suggest that a combined blood test for these markers could help identify who has lung adenocarcinoma.
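Statements such as "four times higher" usually come from comparing PCR cycle counts. The Python sketch below uses the widely used 2^(-ΔΔCt) calculation as an assumption, since this summary does not state how the authors computed their fold changes. All cycle threshold (Ct) values are invented, and the "reference gene" stands for any stably expressed gene used for normalization. A lower Ct means the gene was detected earlier, so more of it was present.

```python
def fold_change_ddct(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression (case vs. control) by the 2^(-ddCt) calculation.

    Each delta Ct normalizes the target gene to a reference gene within a group;
    the difference between groups (ddCt) is then converted to a fold change,
    assuming the amount of product doubles with every PCR cycle.
    """
    delta_case = ct_target_case - ct_ref_case
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_case - delta_ctrl)

# Invented mean Ct values, chosen only to give round fold changes.
four_fold_up = fold_change_ddct(24.0, 18.0, 26.0, 18.0)   # ddCt = -2
eight_fold_up = fold_change_ddct(23.0, 18.0, 26.0, 18.0)  # ddCt = -3
half_as_much = fold_change_ddct(25.0, 18.0, 24.0, 18.0)   # ddCt = +1

print(four_fold_up, eight_fold_up, half_as_much)
```

Because each cycle represents a doubling, a gap of two cycles corresponds to a fourfold difference and a gap of three cycles to an eightfold difference, while a gene that appears one cycle later in patients is about half as abundant.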

Clues about outlook and disease behavior
The study went beyond simple yes‑or‑no diagnosis. By linking gene activity to clinical information such as tumor size, spread, stage, and patient survival, the team identified genes that may predict how a person’s cancer will behave. Several genes, including CYP2C9, KCNV1, KRT24, SIRPD, PECAM1, and a non‑coding gene called LOC730668, were associated with patient outcomes. Some appear tied to blood vessel growth that feeds tumors, while others relate to how cancer cells interact with the immune system or resist cell death. External checks in multiple independent datasets showed that most of these candidate markers behaved consistently, increasing confidence that the findings are not a fluke of one dataset.
What this could mean for patients
In plain terms, this work shows that a combination of five genes (A2M, CYP2C9, KCNV1, KRT24, and SIRPD) can flag lung adenocarcinoma with high sensitivity in genetic data, and that four candidate markers (CYP2C9, KRT14, PECAM1, and A2M) show clear, measurable changes in blood. While these markers are not yet ready for routine screening, they offer a promising blueprint for future blood tests that could detect lung cancer earlier, when it is more curable. They might also help doctors estimate how aggressive a tumor is and tailor treatment accordingly. Further studies in larger and more diverse groups of patients will be needed, but the results suggest that artificial intelligence, paired with careful laboratory validation, can accelerate the search for practical, minimally invasive tools to fight lung cancer.
Citation: Hossein Zadeh, R., Hossein Zadeh, R., Hajimoradi, M. et al. Identification of diagnostic and prognostic biomarkers in lung adenocarcinoma through integrated bioinformatics analysis and real time PCR validation. Sci Rep 16, 6679 (2026). https://doi.org/10.1038/s41598-026-35971-y
Keywords: lung adenocarcinoma, biomarkers, deep learning, blood test, early cancer detection