Clear Sky Science

Hybrid feature selection with novel deep learning model for COVID-19 risk prediction

Why predicting COVID-19 risk still matters

Even as the world learns to live with COVID-19, the virus has not disappeared. New variants keep emerging, hospitals can still be strained, and vulnerable people remain at higher risk of severe illness or death. Doctors therefore need fast and reliable ways to estimate how likely an infected patient is to become seriously ill. This paper presents a new computer model that uses hospital data and advanced artificial intelligence to predict COVID-19 risk more accurately, potentially helping clinicians decide who needs closer monitoring, early treatment, or intensive care.

From raw patient records to usable signals

The study begins with a very large clinical dataset: more than one million anonymized patients, each described by 21 simple, mostly yes-or-no features such as age group, underlying conditions, and other risk factors. Real-world hospital data are messy, so the first step is to “clean” them. The authors apply a mathematical transformation called log scaling, which compresses extreme values and stretches out clusters of very small values. This makes the data more stable and easier for algorithms to handle, reducing the chance that unusual numbers or sparse indicators will mislead the model.
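As a rough illustration of the log-scaling idea, the sketch below applies log(1 + x) to a handful of made-up values. The exact transform used in the study is not given in this summary, so the choice of `math.log1p`, the helper name `log_scale`, and the sample numbers are all assumptions.

```python
import math


def log_scale(values):
    """Apply log(1 + x) to each value (assumed form of log scaling).

    log1p is used so that zeros, which are common in yes-or-no
    indicators, map to zero instead of causing a math error.
    """
    if any(v < 0 for v in values):
        raise ValueError("this sketch assumes non-negative inputs")
    return [math.log1p(v) for v in values]


# Hypothetical raw values: several small numbers and one extreme outlier.
raw = [0, 1, 9, 99, 9999]
scaled = log_scale(raw)

for before, after in zip(raw, scaled):
    print(f"{before:>5} -> {after:.3f}")
```

In this toy example the order of the values is preserved, the difference between 0 and 1 stays clearly visible, and the outlier 9999 is pulled down to a value below 10.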

Picking the most telling signs

Not every recorded variable is equally helpful for prediction, and too many weak signals can actually confuse an artificial intelligence system. The researchers therefore perform feature selection, a process that filters out less useful information and keeps the most informative factors. Their hybrid approach combines two ideas: one measure looks at how well a feature separates high-risk from low-risk patients, and another checks how strongly features overlap with each other. By balancing these two viewpoints on a common scale, the method favors features that are both powerful and not redundant. This trimming speeds up training, reduces overfitting, and focuses the model on the most clinically relevant patterns.
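One way to picture this balancing act is the minimal sketch below. It is not the authors' algorithm: it assumes absolute Pearson correlation with the outcome as the relevance measure, mean absolute correlation with the other features as the overlap measure, min-max rescaling as the common scale, and a simple difference as the final score. The feature names and toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def min_max(scores):
    """Rescale a list of scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_scores(features, labels):
    """Score each feature as (scaled relevance) minus (scaled redundancy)."""
    names = list(features)
    # Relevance: how strongly the feature tracks the outcome.
    relevance = [abs(pearson(features[n], labels)) for n in names]
    # Redundancy: how strongly the feature overlaps with the other features.
    redundancy = []
    for n in names:
        others = [abs(pearson(features[n], features[m])) for m in names if m != n]
        redundancy.append(sum(others) / len(others) if others else 0.0)
    rel, red = min_max(relevance), min_max(redundancy)
    return {n: r - d for n, r, d in zip(names, rel, red)}


# Hypothetical toy data: 8 patients, 1 = high risk.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "diabetes":    [1, 1, 1, 0, 0, 0, 0, 1],  # informative, overlaps with nothing
    "obesity":     [1, 1, 0, 1, 0, 0, 1, 0],  # informative ...
    "obesity_dup": [1, 1, 0, 1, 0, 0, 1, 0],  # ... but exactly duplicated
    "noise":       [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the outcome
}

scores = hybrid_scores(features, labels)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0], scores)
```

On this toy data the feature that is both informative and non-overlapping ranks first, while the duplicated pair is penalized for redundancy and the noise feature earns no relevance credit.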

Figure 1.

Blending pattern recognition with fuzzy reasoning

The core of the paper is a new prediction engine called the Fuzzy-Deep Kronecker Recurrent Neural Network, or Fuzzy-DKRNN. It blends several complementary techniques. One component, a Deep Kronecker Network, is designed to uncover compact, structured patterns hidden in the clinical data. Another component, a deep recurrent network, is well suited to capturing dependencies and trends, for example when a combination of factors over time influences risk. On top of these, the authors layer a fuzzy logic system. Instead of making only hard yes-or-no decisions, fuzzy rules express statements such as “if several risk indicators are moderately high, the patient is likely high risk.” Each rule carries a degree of certainty, enabling the model to handle the uncertainty and gray areas that are common in medicine.
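To make the fuzzy-rule idea concrete, here is a small sketch of a single rule of the kind quoted above. It does not reproduce the Fuzzy-DKRNN architecture. The triangular membership function, its breakpoints, the use of the minimum as the fuzzy "and", the certainty weight of 0.9, and the indicator values are all assumptions chosen for illustration.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def moderately_high(x):
    """Degree (0 to 1) to which a normalized indicator is 'moderately high'.

    The breakpoints 0.4, 0.7, 1.0 are hypothetical.
    """
    return triangular(x, 0.4, 0.7, 1.0)


def rule_high_risk(indicators, certainty=0.9):
    """IF all indicators are moderately high THEN the patient is high risk.

    The fuzzy 'and' is taken as the minimum membership, and the result
    is weighted by the rule's own degree of certainty.
    """
    firing_strength = min(moderately_high(v) for v in indicators)
    return firing_strength * certainty


# Hypothetical normalized risk indicators for one patient.
patient = [0.70, 0.55, 0.85]
support = rule_high_risk(patient)
print(f"support for 'high risk': {support:.2f}")
```

For this patient the weakest indicator is only about half "moderately high", so the rule supports the high-risk conclusion to a degree of about 0.45 rather than giving a hard yes or no.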

How well does the model perform?

The authors test their Fuzzy-DKRNN model against several state-of-the-art alternatives, including systems based on chest X-ray images, traditional machine learning, and other deep learning approaches. On standard measures such as accuracy, precision, recall, and F1-score, their method consistently comes out ahead. In its best configuration, the model correctly classifies about 91% of cases overall, and it is good both at detecting patients who will become severely ill and at avoiding unnecessary alarms for those who will not. These gains hold up when the amount of training data and the internal validation settings are varied, suggesting that the approach is robust rather than tuned to one specific scenario.
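For readers unfamiliar with these measures, the sketch below computes all four from a list of true outcomes and a list of predictions. The function name and the ten example patients are hypothetical and are not drawn from the study's data.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = severe)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # severe, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not severe, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # unnecessary alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed severe case

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical outcomes for 10 patients (1 = became severely ill).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recall measures how many of the truly severe cases were caught, precision measures how many alarms were justified, and F1 combines the two into a single number.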

Figure 2.

What this means for patients and hospitals

In plain terms, this work shows that combining careful data cleaning, smart selection of key risk factors, and a hybrid of deep learning with fuzzy logic can produce more reliable COVID-19 risk predictions from routine clinical information. Such a tool will not replace doctors, but it could serve as an early-warning assistant: flagging patients who need closer monitoring, guiding the distribution of scarce resources like intensive care beds, and ultimately helping reduce preventable deaths. The same strategy could also be adapted to other diseases where early risk detection from complex clinical data is crucial.

Citation: P, G.S., Kathiravan, M., Shanthi, S. et al. Hybrid feature selection with novel deep learning model for COVID-19 risk prediction. Sci Rep 16, 4106 (2026). https://doi.org/10.1038/s41598-026-35013-7

Keywords: COVID-19 risk prediction, deep learning, fuzzy logic, clinical decision support, medical AI models