Clear Sky Science · en
A hybrid deep learning approach integrating CNN and transformer for lung cancer classification using CT scans
Why this research matters to patients and families
Lung cancer is one of the deadliest cancers worldwide, largely because it is often found too late. This study explores how advanced computer vision can help doctors read lung CT scans more accurately and consistently, so that suspicious spots in the lungs can be flagged earlier and with fewer errors, potentially supporting faster and better-informed clinical decisions.
Seeing inside the chest with digital eyes
Doctors commonly rely on CT scans to look for tiny growths in the lungs that may signal cancer. These growths, called nodules, can be very small and subtle, especially in the early stages of disease. Normal lung tissue, harmless nodules, and dangerous tumors can look surprisingly similar, even to experienced specialists. Small changes in image quality, background tissue, or noise in the scan can further obscure the difference. Because of this, some cancers are missed, while other findings trigger false alarms that lead to unnecessary follow-up tests.

Teaching computers to spot patterns in lung scans
The researchers designed a deep learning system, called C-Swin, to help classify lung CT images into three categories: normal, benign (noncancerous), and malignant (cancerous). Deep learning systems learn directly from large numbers of example images, rather than relying on hand-crafted rules. C-Swin combines two powerful ideas. A convolutional neural network (CNN) focuses on fine details such as edges, textures, and small shapes that reveal the structure of a nodule. At the same time, a transformer module, inspired by tools used in language translation, looks at the image more broadly, considering how regions relate to one another across the whole lung.
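To make the idea of a convolutional filter picking up edges concrete, here is a minimal pure-Python sketch. It is not the C-Swin code: the function name conv2d_valid, the toy image, and the hand-picked kernel are all assumptions for illustration. A trained CNN learns its kernels from data instead of having them written by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the response map.

    Like most deep learning libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical toy "scan": dark tissue on the left, bright tissue on the right.
toy_image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Hand-picked kernel that responds to dark-to-bright vertical boundaries.
edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = conv2d_valid(toy_image, edge_kernel)
print(response)  # [[0, 27, 27, 0]]
```

The response is zero over the flat regions and large where the window straddles the boundary, which is the sense in which convolution highlights local detail.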
Focusing on what really matters in the image
To make the most of CT scans, the team introduced a special attention mechanism that helps the model concentrate on relevant areas while ignoring background distractions. The CT image is divided into small patches or windows. Within these windows, the model learns which areas carry the most useful information for judging whether tissue is healthy or not. By shifting and combining windows in different directions, the network preserves relationships between neighboring regions and captures both close-up details and longer-range structures in the lungs. An additional gating component helps the system emphasize subtle but important patterns and suppress less helpful signals, refining how the model distinguishes harmless nodules from dangerous ones.
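The windowing and shifting described above can be sketched in a few lines of pure Python. This follows the general shifted-window idea used in Swin-style transformers. It is an illustration under assumptions, not the authors' implementation: the summary does not state C-Swin's window size or shift amount, and the function names partition_windows and cyclic_shift are hypothetical. The attention and gating computations themselves are left out.

```python
def partition_windows(grid, size):
    """Split a square grid into non-overlapping size x size windows (row-major)."""
    n = len(grid)
    windows = []
    for r0 in range(0, n, size):
        for c0 in range(0, n, size):
            windows.append([row[c0:c0 + size] for row in grid[r0:r0 + size]])
    return windows


def cyclic_shift(grid, shift):
    """Roll the grid up and left by `shift` cells, wrapping around the edges."""
    rolled_rows = grid[shift:] + grid[:shift]
    return [row[shift:] + row[:shift] for row in rolled_rows]


# A 4x4 grid of patch indices standing in for image patches.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

regular = partition_windows(grid, 2)                   # first pass: plain windows
shifted = partition_windows(cyclic_shift(grid, 1), 2)  # second pass: shifted windows

print(regular[0])  # [[0, 1], [4, 5]]
print(shifted[0])  # [[5, 6], [9, 10]]
```

In the first pass, patches 5, 6, 9, and 10 sit in four different windows. After the shift they share one window, so information can flow between regions that were previously separated. Practical implementations typically also mask attention between patches that become neighbors only because of the wrap-around.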

Putting the system to the test
The authors trained and evaluated C-Swin using a publicly available CT dataset collected from Iraqi hospitals, which includes images of healthy lungs, benign nodules, and malignant cases. Because medical datasets are often small, they expanded the training set using data augmentation, such as flipping and rotating images, to mimic a wider variety of scans. After careful preprocessing and training, the model correctly classified images with an accuracy of about 96 percent and achieved similarly high scores for precision, recall, and F1-score. Precision reflects how many flagged cases are truly positive (fewer false alarms), recall reflects how many true cases are caught (fewer missed cancers), and the F1-score balances the two. In repeated tests using different splits of the data, the results stayed stable, and statistical checks showed that C-Swin performed significantly better than several existing deep learning approaches.
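For readers who want to see how precision, recall, and F1-score are computed for a three-class problem like this one, here is a minimal sketch. The labels and predictions below are invented toy values, not results from the study, and the helper name per_class_metrics is an assumption.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return precision, recall, and F1 for each label (one-vs-rest)."""
    metrics = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)  # correct hits
        fp = sum(1 for t, p in pairs if t != label and p == label)  # false alarms
        fn = sum(1 for t, p in pairs if t == label and p != label)  # misses
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Invented toy example: 4 malignant, 2 benign, 2 normal scans.
y_true = ["malignant"] * 4 + ["benign"] * 2 + ["normal"] * 2
y_pred = ["malignant", "malignant", "malignant", "benign",
          "benign", "malignant", "normal", "normal"]

scores = per_class_metrics(y_true, y_pred, ["normal", "benign", "malignant"])
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(accuracy)             # 0.75
print(scores["malignant"])  # precision, recall, and f1 are all 0.75
```

In this toy case one malignant scan is labeled benign (a miss, which lowers malignant recall) and one benign scan is labeled malignant (a false alarm, which lowers malignant precision).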
What this could mean for future care
Although this study does not replace the judgment of a radiologist, it shows that a carefully designed combination of local and global image analysis can help computers home in on the same lung regions that experts consider most important. Grad-CAM visualizations, which highlight the image areas influencing the model’s decisions, suggest that C-Swin tends to focus on lesion regions rather than irrelevant background. The authors note that the work is based on a single, relatively small dataset, so broader testing in different hospitals and on different scanners is still needed. If validated on larger and more diverse collections of scans, such systems could become useful assistants in the reading room, helping clinicians prioritize cases, reduce oversights, and potentially support earlier detection of lung cancer.
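The Grad-CAM step mentioned above follows a simple recipe: average the gradient of the class score over each feature map to get a weight per map, take the weighted sum of the maps, and keep only the positive part. The sketch below applies that recipe to invented 2x2 feature maps and gradients. In practice both come from a trained network, and the function name grad_cam is an assumption.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into a heatmap using Grad-CAM weighting.

    activations, gradients: lists of equally sized 2D lists, one per channel.
    """
    height, width = len(activations[0]), len(activations[0][0])
    # One weight per channel: the spatial average of that channel's gradient.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    heatmap = []
    for r in range(height):
        row = []
        for c in range(width):
            value = sum(w * a[r][c] for w, a in zip(weights, activations))
            row.append(max(0.0, value))  # ReLU keeps only positive evidence
        heatmap.append(row)
    return heatmap


# Invented values for two feature maps and their gradients.
activations = [
    [[1, 0], [0, 2]],
    [[0, 3], [1, 0]],
]
gradients = [
    [[1, 1], [1, 1]],    # channel 0 supports the class (weight 1.0)
    [[-1, -1], [0, 0]],  # channel 1 argues against it (weight -0.5)
]

heatmap = grad_cam(activations, gradients)
print(heatmap)  # [[1.0, 0.0], [0.0, 2.0]]
```

Bright cells in the resulting map mark the regions that pushed the model toward its decision. Such a map is typically upsampled and overlaid on the original scan.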
Citation: Yousafzai, S.N., Nasir, I.M., Mansour, S. et al. A hybrid deep learning approach integrating CNN and transformer for lung cancer classification using CT scans. Sci Rep 16, 15420 (2026). https://doi.org/10.1038/s41598-026-41161-7
Keywords: lung cancer, CT imaging, deep learning, medical AI, image classification