Clear Sky Science

An explainable artificial intelligence framework for ischemic heart disease prediction using enhanced squirrel search feature selection

2026-04-01

Why smarter heart checks matter

Heart disease is still the world’s top killer, and doctors often have to juggle limited time, many test results, and complex computer tools. This study shows how an explainable artificial intelligence system can help doctors spot ischemic heart disease early while keeping its reasoning clear. Instead of hiding its decisions in a black box, the system highlights which few test measurements matter most, helping clinicians use computer support without losing trust or control.

Figure 1. AI system using key medical tests helps doctors quickly assess a patient’s heart disease risk.

Heart trouble and the data behind it

Ischemic heart disease happens when blood flow to the heart is reduced, often due to clogged arteries. Doctors look at many pieces of information, such as age, blood pressure, cholesterol, chest pain, and the results of heart scans. The UCI Heart Disease dataset used in this work collects 303 patient records with 13 such factors plus a label showing the presence or absence of disease. While this richness helps prediction, too many overlapping or unhelpful measurements can confuse both humans and computers, slow analysis, and sometimes reduce accuracy.

Cleaning and slimming the medical record

Before a computer can learn from the data, the raw records must be cleaned and reshaped. The researchers filled in missing numbers using a method that borrows information from similar patients, scaled all measurements to a common range, turned text-like categories such as chest pain type into numerical form, and carefully checked unusual values to separate real extreme cases from likely errors. They also balanced the numbers of sick and healthy patients by generating realistic extra examples of the smaller group, and removed features that were almost duplicates of each other. The result is a tidy table where each column is meaningful and ready for analysis.
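
To make these cleaning steps concrete, here is a minimal sketch in plain Python of three of them: filling a missing value from the most similar patients, scaling to a common range, and turning a category into numbers. The article does not name the exact methods, so the nearest-neighbour imputation, the 0 to 1 scaling range, the function names, and the four toy patient rows are all assumptions made for illustration, not the authors' pipeline. Balancing the classes and removing near-duplicate features are not shown.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean of the k most similar rows.

    Similarity is measured only on columns that both rows have observed.
    (Assumed method: the article only says values are borrowed from
    similar patients.)
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    distance = math.sqrt(sum(shared) / len(shared))
                    candidates.append((distance, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def min_max_scale(rows):
    """Rescale every column to the range 0..1 (constant columns become 0)."""
    scaled_columns = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(r) for r in zip(*scaled_columns)]


def one_hot(values):
    """Turn category labels into 0/1 indicator columns, one per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


# Toy records (invented): age, resting blood pressure, cholesterol.
patients = [
    [63, 145, 233],
    [41, 130, 204],
    [62, 140, None],   # cholesterol is missing for this patient
    [40, 128, 200],
]
chest_pain = ["typical", "asymptomatic", "typical", "atypical"]

filled = knn_impute(patients, k=2)
scaled = min_max_scale(filled)
encoded, categories = one_hot(chest_pain)
```

In this toy table the third patient's missing cholesterol is filled from the two patients closest in age and blood pressure. Imputing before scaling, as done here, lets the blood pressure column dominate the distance; a fuller pipeline might scale first or weight the columns.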

How flying squirrels inspire feature choice

The core idea of the study is that the computer does not need all available measurements to make good predictions. Instead, it should automatically search for a small set of the most informative ones. To do this, the authors use an optimization method inspired by how flying squirrels search for food in a forest. In their Enhanced Squirrel Search Optimization procedure, each “squirrel” represents a possible subset of features, and the group collectively glides through the search space, adjusting its moves when progress stalls. The best-performing combinations are kept and refined, seeking the smallest group of measurements that still supports highly accurate decisions.
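
The general idea of a population searching over feature subsets can be sketched in a few lines of Python. This is a deliberately simplified, generic search and not the authors' Enhanced Squirrel Search Optimization procedure: the update rule, the stall handling, every parameter value, and the toy scoring function are assumptions for illustration. In a real study the score would come from training and validating a classifier on the chosen features rather than from a fixed list of "informative" columns.

```python
import random

N_FEATURES = 13              # the dataset described above has 13 factors
INFORMATIVE = {2, 7, 9, 11}  # hypothetical useful columns, used only by the toy score


def fitness(mask):
    """Toy score: reward informative features, charge a small cost per feature kept."""
    chosen = {i for i, bit in enumerate(mask) if bit}
    if not chosen:
        return 0.0
    hit_rate = len(chosen & INFORMATIVE) / len(INFORMATIVE)
    size_penalty = 0.02 * len(chosen)
    return hit_rate - size_penalty


def move_toward(mask, leader, rng, copy_prob=0.5, flip_prob=0.05):
    """Move a candidate toward the leader: copy some leader bits, then flip a few."""
    new_mask = []
    for bit, lead_bit in zip(mask, leader):
        if rng.random() < copy_prob:
            bit = lead_bit
        if rng.random() < flip_prob:
            bit = 1 - bit
        new_mask.append(bit)
    return new_mask


def search(n_agents=12, n_iters=40, patience=5, seed=0):
    """Return (best mask, its score, score of the best starting mask)."""
    rng = random.Random(seed)
    # Each agent is one candidate subset: bit i == 1 means "keep feature i".
    agents = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
              for _ in range(n_agents)]
    best = max(agents, key=fitness)
    initial_best_score = fitness(best)
    stalled = 0
    for _ in range(n_iters):
        # When progress has stalled, take larger random steps to explore.
        flip_prob = 0.25 if stalled >= patience else 0.05
        agents = [move_toward(a, best, rng, flip_prob=flip_prob) for a in agents]
        challenger = max(agents, key=fitness)
        if fitness(challenger) > fitness(best):
            best, stalled = challenger, 0   # keep the better subset
        else:
            stalled += 1
    return best, fitness(best), initial_best_score


best_mask, best_score, initial_score = search()
selected = [i for i, bit in enumerate(best_mask) if bit]
```

Because the best subset is only ever replaced by a strictly better one, the final score can never be worse than the best starting subset. The small per-feature cost is what pushes the search toward compact subsets instead of simply keeping everything.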

Figure 2. Selected heart test measurements flow through an explainable model to sort patients into healthy and at-risk groups.

Teaching the model and opening the black box

Once the squirrel-inspired search picks an optimal subset of features, a Random Forest model is trained to predict who has heart disease. Random Forest uses many slightly different decision trees whose votes are combined, making the final prediction robust to noise in the data. On the chosen features, the model reaches around 96 to 98 percent accuracy and a very high score for telling sick and healthy patients apart. To make its logic understandable, the researchers then apply two explanation tools. One, called SHAP, shows which factors are most influential across the whole dataset, while the other, LIME, zooms in on single patients to show how their specific values push the prediction toward higher or lower risk.
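
The two ideas in this step, combining tree votes and assigning credit to individual measurements, can be illustrated with a tiny hand-built example. The sketch below does not use the SHAP or LIME libraries and is not the authors' model: the three one-rule "trees", their thresholds, the feature names, and the patient and baseline values are all invented for illustration. It computes exact Shapley values by enumerating feature subsets, which is the quantity SHAP is built to estimate and is only practical for a handful of features. LIME's local surrogate models are not shown.

```python
from itertools import combinations
from math import factorial

FEATURES = ["chest_pain_type", "st_depression", "max_heart_rate"]


# Three hand-written stand-ins for trees; thresholds are invented.
def tree_a(p):
    return 1.0 if p["chest_pain_type"] >= 3 and p["st_depression"] > 1.0 else 0.0


def tree_b(p):
    return 1.0 if p["st_depression"] > 2.0 else 0.0


def tree_c(p):
    return 1.0 if p["max_heart_rate"] < 130 else 0.0


TREES = [tree_a, tree_b, tree_c]


def forest_risk(p):
    """Fraction of trees voting 'disease', as in a forest's combined vote."""
    return sum(tree(p) for tree in TREES) / len(TREES)


def shapley_values(model, patient, baseline):
    """Exact Shapley values for one patient.

    A feature that is 'absent' from a subset takes its baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {g: (patient[g] if g in subset else baseline[g])
                           for g in FEATURES}
                with_f = dict(without)
                with_f[f] = patient[f]
                total += weight * (model(with_f) - model(without))
        phi[f] = total
    return phi


patient = {"chest_pain_type": 4, "st_depression": 2.5, "max_heart_rate": 150}
baseline = {"chest_pain_type": 1, "st_depression": 0.0, "max_heart_rate": 160}

risk_patient = forest_risk(patient)
risk_baseline = forest_risk(baseline)
phi = shapley_values(forest_risk, patient, baseline)
```

For this invented patient the contributions add up exactly to the gap between the patient's risk score and the baseline's. The heart rate gets no credit because it never crosses the threshold of the one rule that uses it, while the exercise-related measurement gets the largest share because two of the three rules depend on it.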

What this means for patients and doctors

In plain terms, the study builds a heart disease prediction helper that is both sharp and talkative. By trimming the input down to a handful of key measurements and then using clear visual explanations, the system can tell a clinician not only that a patient is likely to have ischemic heart disease, but also which findings, such as a certain scan result or a level of exercise-related change, are driving that call. This balance of accuracy, simplicity, and clarity makes the approach better suited for real clinics, and it could be extended in the future to larger hospitals and richer data sources like wearable devices and imaging exams.

Citation: Cenitta, D., Arul, N., Arjunan, R.V. et al. An explainable artificial intelligence framework for ischemic heart disease prediction using enhanced squirrel search feature selection. Sci Rep 16, 15422 (2026). https://doi.org/10.1038/s41598-026-46823-0

Keywords: ischemic heart disease, explainable AI, heart disease prediction, feature selection, random forest