Clear Sky Science

Semi-supervised multi-class pneumonia classification using a CNN-cascade forest framework

2026-02-05

Why smarter pneumonia scans matter

Pneumonia remains one of the world’s leading killers, yet many hospitals, especially those with fewer specialists, still rely on busy clinicians visually inspecting chest X‑rays or CT scans. That makes it hard not only to spot pneumonia but also to tell what kind it is: bacterial, viral, fungal, or a more general inflammatory picture. This article describes a new artificial intelligence (AI) system designed to help. It uses both X‑ray and CT images, learns even from scans that have never been labelled by experts, and distinguishes several pneumonia subtypes with a reported overall accuracy of 98.86 percent.

From simple yes/no to richer answers

Most existing AI tools for lung infection work like a basic smoke detector: they say “pneumonia” or “no pneumonia” and stop there. Clinicians, however, need more nuance. Different causes of pneumonia respond to different drugs, carry different risks, and often look subtly different on imaging. The authors set out to build a system that could separate five categories—bacterial, viral, fungal, general pneumonia, and normal lungs—so that automated tools would provide guidance closer to what an experienced radiologist offers, rather than a simple red‑flag alert.

Pairing two kinds of scans for a fuller picture

To train and test their method, the researchers assembled a dataset of 4,578 chest images from public collections, in which each patient contributed both an X‑ray and a CT scan acquired during the same clinical episode. X‑rays are quick and cheap but flatten the chest into a single two‑dimensional view, so fine detail can be lost; CT scans are slower and more expensive but show fine structural detail. By carefully matching the two modalities at the patient level and removing inconsistent or questionable cases, the team created a realistic, class‑imbalanced dataset that reflects everyday medicine: some types of pneumonia, such as fungal infection, are much rarer than others.
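The article does not say how the patient-level matching was carried out. As a rough illustration only, the sketch below pairs scans by a patient identifier and drops patients whose two scans are missing, duplicated, or labelled differently. The `Scan` record, its field names, and the filtering rules are all assumptions made for this example, not the authors' pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Scan:
    # All fields are hypothetical; the article does not describe the metadata.
    patient_id: str
    modality: str  # "xray" or "ct"
    label: str     # e.g. "bacterial", "viral", "fungal", "general", "normal"
    path: str


def pair_by_patient(scans: Iterable[Scan]) -> List[Tuple[Scan, Scan]]:
    """Return (xray, ct) pairs for patients with one consistent scan of each kind."""
    by_patient: Dict[str, Dict[str, List[Scan]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for scan in scans:
        by_patient[scan.patient_id][scan.modality].append(scan)

    pairs = []
    for patient_id in sorted(by_patient):
        xrays = by_patient[patient_id]["xray"]
        cts = by_patient[patient_id]["ct"]
        # Assumed rule: require exactly one scan per modality.
        if len(xrays) != 1 or len(cts) != 1:
            continue
        # Assumed rule: drop patients whose two scans carry different labels.
        if xrays[0].label != cts[0].label:
            continue
        pairs.append((xrays[0], cts[0]))
    return pairs
```

A real curation step would likely also involve manual review of questionable images, which a rule like this cannot replace.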

How the hybrid AI learns from labelled and unlabelled scans

The proposed system, called CNN‑Enhanced Cascade Forest (CE‑Cascade), combines two types of machine learning. First, a deep convolutional network known as ResNet processes each image and turns it into a high‑dimensional fingerprint that captures textures, shapes, and patterns linked to pneumonia. Rather than having the network predict the diagnosis directly, the system passes these fingerprints to a “cascade forest”: multiple layers of decision‑tree ensembles that repeatedly refine the signal, zooming in on local patches in the image and building more complex patterns at each stage. Crucially, the authors embed this hybrid model in a semi‑supervised framework: once an initial version is trained on expert‑labelled scans, it is allowed to assign “pseudo‑labels” to unlabelled images, but only when it is very confident. Those high‑confidence cases are then folded back into training, expanding the effective dataset without additional human labour.
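The pseudo-labelling idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `select_confident` and `self_training_round`, the 0.95 threshold, and the `fit` callback (a stand-in for the ResNet plus cascade forest pipeline) are all hypothetical, since the article does not give the actual confidence cutoff or training code.

```python
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]
Probabilities = Sequence[float]
Model = Callable[[Features], Probabilities]


def select_confident(
    probabilities: Sequence[Probabilities], threshold: float
) -> List[Tuple[int, int]]:
    """Return (sample_index, predicted_class) where the top probability >= threshold."""
    selected = []
    for index, probs in enumerate(probabilities):
        best_class = max(range(len(probs)), key=lambda c: probs[c])
        if probs[best_class] >= threshold:
            selected.append((index, best_class))
    return selected


def self_training_round(
    fit: Callable[[List[Features], List[int]], Model],
    labelled_x: List[Features],
    labelled_y: List[int],
    unlabelled_x: List[Features],
    threshold: float = 0.95,  # assumed value; not taken from the paper
):
    """Train on labelled data, then move confident unlabelled samples into it."""
    model = fit(labelled_x, labelled_y)
    probabilities = [model(x) for x in unlabelled_x]
    confident = select_confident(probabilities, threshold)
    chosen = {index for index, _ in confident}

    # Confident predictions become pseudo-labels and join the training set.
    new_x = list(labelled_x) + [unlabelled_x[i] for i, _ in confident]
    new_y = list(labelled_y) + [label for _, label in confident]
    # Everything else stays unlabelled for a possible later round.
    remaining = [x for i, x in enumerate(unlabelled_x) if i not in chosen]
    return new_x, new_y, remaining
```

Calling `self_training_round` repeatedly, retraining each time on the enlarged set, gives the usual self-training loop. The threshold controls the trade-off: a higher value adds fewer scans but reduces the risk of training on wrong pseudo-labels.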

What the system achieved in practice

Using this approach, the CE‑Cascade model achieved an overall classification accuracy of 98.86 percent across all five categories, with similarly high scores on both X‑ray and CT data. It not only outperformed simpler neural networks but also beat more advanced contenders, including deep convolutional models with attention mechanisms and transformer‑based systems. Adding pseudo‑labelled scans consistently improved the quality of the predictions, boosting several evaluation scores and making the model more robust to limited expert annotation. The method also generalized well when trained on one modality and tested on the other, suggesting that it had learned disease‑related patterns rather than quirks of a particular scanner type.
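Because the dataset is unbalanced, overall accuracy alone can hide weak performance on a rare class, which is one reason studies like this report several scores. The article does not list the exact metrics used, so the sketch below is only an assumed example: it contrasts plain accuracy with per-class recall and their unweighted (macro) average. The function names are hypothetical, and any labels passed in are the caller's own, not the paper's data.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> Dict[str, float]:
    """For each class, the fraction of its true samples that were predicted correctly."""
    recalls = {}
    for cls in classes:
        total = sum(t == cls for t in y_true)
        hits = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        # A class with no true samples gets 0.0 rather than a division error.
        recalls[cls] = hits / total if total else 0.0
    return recalls


def macro_recall(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> float:
    """Unweighted mean of per-class recall, so rare classes count equally."""
    recalls = per_class_recall(y_true, y_pred, classes)
    return sum(recalls.values()) / len(classes)
```

For instance, a model that labels every scan "normal" in a set of eight normal and two fungal cases would score 80 percent accuracy while its recall on the fungal class is zero, which the macro average exposes.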

From lab benchmark to bedside helper

For non‑specialists, the key takeaway is that this work moves AI‑assisted chest imaging closer to something clinicians can actually use. Instead of a black‑box tool that merely says “pneumonia: yes or no,” the CE‑Cascade framework offers detailed, multi‑class output and does so efficiently enough for routine deployment. By learning from both labelled and unlabelled scans and by drawing strength from the complementary views of X‑rays and CT images, it sets a high bar for future systems. If translated into clinical software and paired with clear explanations of which image regions drive its decisions, such a model could help doctors triage patients faster, choose more appropriate treatments, and extend expert‑level image interpretation to hospitals that currently lack it.

Citation: Muthukumaraswamy, P., Yuvaraj, T. & Krishnamoorthy, R. Semi-supervised multi-class pneumonia classification using a CNN-cascade forest framework. Sci Rep 16, 7448 (2026). https://doi.org/10.1038/s41598-026-38849-1

Keywords: pneumonia imaging, medical AI, chest X-ray, CT scan, semi-supervised learning