Clear Sky Science
A two-stage preprocessing and classification approach for accurate COVID-19 detection in X-ray images
Why smarter X-ray reading matters
During the COVID-19 pandemic, doctors have relied mainly on lab tests to confirm who is infected. But lab equipment can be scarce, and results may take time—especially in places with limited resources. Chest X-rays, by contrast, are quick and widely available, yet even expert radiologists can struggle to spot the often subtle lung changes caused by COVID-19. This paper explores a way to teach computers to read X-rays more reliably by cleaning up the images first and then applying modern artificial intelligence, with the goal of helping clinicians make faster and more accurate decisions.
Seeing the important details in noisy images
A chest X-ray contains a jumble of information: bones, soft tissue, shadows, and machine artifacts all layered together. Standard deep-learning systems try to learn directly from these raw images, which means the network must figure out on its own which parts of each picture are important. That is hard when images vary in brightness, contrast, and quality. The authors tackle this by adding a dedicated preprocessing step before the X-rays reach the neural network. They use a computer-vision technique called Maximally Stable Extremal Regions (MSER), which scans the image at many brightness levels and picks out regions that remain stable as the threshold changes. These stable patches often correspond to meaningful structures—such as areas of abnormal lung tissue—while ignoring more random clutter.
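The stability idea can be illustrated with a minimal sketch. It is not the authors' implementation and not a full MSER (which tracks every connected component of the image across all thresholds). It follows only the single dark region containing one seed pixel, and the names `region_area`, `most_stable_threshold`, `delta` and `toy_image` are assumptions made for this example.

```python
from collections import deque

def region_area(img, seed, t):
    """Area of the 4-connected region of pixels <= t that contains `seed`."""
    rows, cols = len(img), len(img[0])
    if img[seed[0]][seed[1]] > t:
        return 0  # the seed itself is brighter than the threshold
    seen = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and img[nr][nc] <= t):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)

def most_stable_threshold(img, seed, delta=2):
    """Sweep 8-bit thresholds and return (threshold, variation) where the
    region's area changes least between t - delta and t + delta."""
    best_t, best_var = None, float("inf")
    for t in range(delta, 256 - delta):
        area = region_area(img, seed, t)
        if area == 0:
            continue  # region does not exist yet at this threshold
        grown = region_area(img, seed, t + delta)
        shrunk = region_area(img, seed, t - delta)
        variation = (grown - shrunk) / area  # relative area change
        if variation < best_var:
            best_t, best_var = t, variation
    return best_t, best_var

# Toy 5x5 image: a dark 2x2 patch (value 10) on a bright background (200).
toy_image = [[200] * 5 for _ in range(5)]
for r, c in ((1, 1), (1, 2), (2, 1), (2, 2)):
    toy_image[r][c] = 10

threshold, variation = most_stable_threshold(toy_image, seed=(1, 1))
print(threshold, variation, region_area(toy_image, (1, 1), threshold))
```

In this toy image the patch keeps the same four-pixel area over a wide range of thresholds, so its variation drops to zero. That unchanging area is the sense in which a region is "stable".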

Letting a virtual wolf pack tune the settings
MSER is powerful but fussy: its performance depends heavily on several numerical settings that control how big a region can be, how much its brightness may vary, and how smooth its edges should appear. Manually testing all combinations for thousands of X-rays would be tedious and slow. To solve this, the researchers enlist a nature-inspired optimization method called the Grey Wolf Optimizer. In this algorithm, a virtual pack of “wolves” explores different parameter combinations, gradually moving toward those that produce the cleanest and most informative regions in the images. The quality of each setting is scored using a measure that rewards regions that are both internally uniform and clearly separated from their surroundings. Over a few dozen iterations per image, the pack converges on a set of parameters that makes suspicious lung areas stand out clearly.
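The following is a minimal sketch of a basic Grey Wolf Optimizer, included only to show how such a pack-based search proceeds. The parameter names and ranges in `BOUNDS`, the `TARGET` setting, and `toy_fitness` are all hypothetical. In particular, `toy_fitness` is a simple stand-in for the region-quality score described above, which this summary does not specify in detail.

```python
import random

# Hypothetical search ranges for three MSER-like settings (assumptions).
BOUNDS = [(1.0, 10.0),   # brightness step between thresholds
          (0.0, 0.1),    # minimum region size, as a fraction of the image
          (0.05, 1.0)]   # maximum allowed area variation
TARGET = (5.0, 0.02, 0.25)  # made-up "ideal" setting for the toy objective

def toy_fitness(x):
    """Stand-in score (lower is better): normalised squared distance to TARGET."""
    return sum(((v - t) / (hi - lo)) ** 2
               for v, t, (lo, hi) in zip(x, TARGET, BOUNDS))

def grey_wolf_optimize(fitness, bounds, n_wolves=20, n_iters=60, seed=0):
    """Minimise `fitness` inside the box `bounds` with a basic GWO."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_score = None, float("inf")
    for it in range(n_iters + 1):
        ranked = sorted(wolves, key=fitness)
        leaders = [list(w) for w in ranked[:3]]  # alpha, beta, delta wolves
        score = fitness(leaders[0])
        if score < best_score:  # remember the best setting seen so far
            best_pos, best_score = list(leaders[0]), score
        if it == n_iters:
            break
        a = 2.0 * (1 - it / n_iters)  # shrinks from 2 toward 0: explore, then exploit
        moved = []
        for wolf in wolves:
            pos = []
            for i, (lo, hi) in enumerate(bounds):
                pulls = []
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[i] - wolf[i])
                    pulls.append(leader[i] - A * D)
                # Average the three pulls and keep the value inside its range.
                pos.append(min(max(sum(pulls) / 3, lo), hi))
            moved.append(pos)
        wolves = moved
    return best_pos, best_score

best_setting, best_score = grey_wolf_optimize(toy_fitness, BOUNDS)
print(best_setting, best_score)
```

In a real pipeline the fitness function would run MSER on an image with the candidate settings and score the resulting regions. Only that function would need to change; the search loop stays the same.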
Teaching neural networks with cleaner pictures
Once the X-ray images are refined in this way, they are passed to a deep convolutional neural network—the same class of models widely used for face recognition and self-driving cars. The authors tested their two-stage pipeline with three different network families: a very simple custom network, DenseNet (a stronger and more complex architecture), and MobileNet (a lighter model designed for mobile devices). For each of these, they compared performance on two public chest X-ray datasets, first using raw images and then using images preprocessed by the MSER–Grey Wolf combination. Across the board, the preprocessed images led to higher accuracy, fewer false alarms, and faster training. In one case, accuracy jumped from about 90% to nearly 99%, while training time was cut by roughly two-thirds.
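To make "accuracy" and "false alarms" concrete, here is a small sketch of how the two figures are usually computed from a classifier's predictions. The labels below are toy values invented for illustration, not data from the study, and the function name is an assumption.

```python
def accuracy_and_false_alarm_rate(y_true, y_pred):
    """Return (accuracy, false-alarm rate) for binary labels, 1 = COVID-19."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    accuracy = (tp + tn) / len(pairs)
    # Share of healthy cases that were wrongly flagged as positive.
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, false_alarm_rate

# Toy example: 3 positive and 5 negative cases, one miss and one false alarm.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(accuracy_and_false_alarm_rate(y_true, y_pred))  # (0.75, 0.2)
```

Read this way, a rise in accuracy from about 90% to nearly 99% means the share of misclassified images falls from roughly 10% to about 1%.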

How robust is this approach in the real world?
To check whether the method simply memorized quirks of a single dataset, the authors ran a tougher test: they trained their models on one collection of X-rays and then evaluated them on a different collection the models had never seen. Performance naturally dipped, but the systems using the new preprocessing step held up far better than those trained on raw images alone. The study also included two further checks: an ablation analysis that used MSER without automatic tuning, and a comparison with a more conventional contrast-enhancement method. In both cases, the gains largely vanished. This suggests that the key ingredient is not just “enhancing” the images, but guiding the algorithm to isolate stable, disease-relevant regions in a principled, data-driven way.
What this means for patients and doctors
The work does not replace PCR tests or expert radiologists, and the authors note limitations: the data come from a limited set of hospitals, and they did not test the method on every modern network architecture or on other lung diseases. Still, their results show that thoughtfully designed preprocessing can make existing AI systems much more accurate and efficient. By helping neural networks focus on the most informative parts of each chest X-ray, the two-stage approach offers a practical path toward reliable, automated support tools for diagnosing COVID-19 and, potentially, other respiratory conditions—especially in settings where rapid decisions and limited computing resources are everyday realities.
Citation: González, A., Gutiérrez, V.G., Cuevas, E. et al. A two-stage preprocessing and classification approach for accurate COVID-19 detection in X-ray images. Sci Rep 16, 13514 (2026). https://doi.org/10.1038/s41598-026-41861-0
Keywords: COVID-19 X-ray diagnosis, medical image preprocessing, deep learning in radiology, computer-aided detection, chest imaging AI