Clear Sky Science
Enhanced cervical cancer diagnosis using a novel Bayesian fusion ensemble method with explainable AI
Why this matters for women’s health
Cervical cancer remains a major killer of women worldwide, especially where access to specialists and advanced tests is limited. Yet the disease is highly treatable when caught early. This study explores how carefully designed artificial intelligence (AI) can turn simple clinical and lifestyle information—such as age, smoking habits, and routine screening test results—into a highly reliable early warning tool that doctors can use at the bedside or in small clinics.

The global problem behind the numbers
Cervical cancer is largely caused by infection with high‑risk types of human papillomavirus (HPV). It often progresses silently, showing few symptoms until it is advanced, when women may experience abnormal bleeding, pelvic pain, or infertility. In 2020, more than 600,000 new cases were reported worldwide, with almost 90% of deaths occurring in low‑ and middle‑income countries where regular Pap or HPV testing is difficult to maintain. Existing screening methods are effective but can be labor‑intensive, require trained personnel, and still miss some cases. This creates a strong need for tools that can accurately flag high‑risk women using the kinds of information clinics already collect.
Turning patient histories into a risk score
The researchers built a hybrid machine‑learning system that analyzes 36 pieces of information from each patient. These include age, number of sexual partners, age at first intercourse, smoking status, use of hormonal contraception, history of sexually transmitted diseases, and results of common cervical tests such as the Schiller and Hinselmann exams and cytology. Because real medical records often have gaps, the team used a technique called GAIN (generative adversarial imputation nets) to fill in missing values with plausible estimates while preserving realistic patterns in the data. They then applied a feature‑selection method called Boruta to sift through all variables and keep only those that truly helped predict whether a biopsy, the gold‑standard test, showed cancer or precancer.
Balancing rare cases and finding clear signals
Like many medical datasets, the cervical cancer records contained far more women without disease than with it. If left uncorrected, this imbalance can cause a computer model to learn mostly from the majority group and quietly ignore subtle signs of cancer. To prevent this, the team used random oversampling to create a more even mix of positive and negative cases. They then compressed the data into a smaller set of informative patterns using two mathematical tools, Independent Component Analysis and Principal Component Analysis. This combination removed noise and redundancy while keeping the key signals that distinguish high‑risk from low‑risk patients.
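To make the balancing step concrete, here is a minimal sketch of random oversampling in plain Python. It is an illustration only, not the authors' code: the function name `random_oversample`, the toy ages, and the 8-to-2 class split are all assumptions made for the example.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Only original rows of this class are candidates for duplication.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical toy data: 8 negative biopsies (0) and 2 positive ones (1).
rows = [[age] for age in (21, 25, 30, 34, 38, 41, 45, 52, 29, 47)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice, oversampling is normally applied only to the training split, so that duplicated rows cannot leak into the data used to evaluate the model.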

Blending two minds into one decision
At the heart of the system is a new “Bayesian fusion ensemble,” which blends the strengths of two widely used models: decision trees and random forests. Instead of letting each model vote equally, the fusion method weights their contributions based on how well they perform during validation. The result is a single, sharpened risk estimate for each woman. Across multiple rounds of testing, this approach reached about 99.9% accuracy, identified every high‑risk case (perfect recall), and produced an ideal score on a standard measure of diagnostic quality (AUC‑ROC = 1.00). On this dataset, in other words, it missed no cancers and raised almost no false alarms.
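The idea of weighting two models by their validation performance can be sketched as follows. This summary does not give the paper's exact fusion formula, so the sketch assumes one common Bayesian choice: each model's weight is proportional to a prior times the likelihood it assigned to the true validation labels. All names and numbers here (`fusion_weights`, `fuse`, the toy probabilities) are hypothetical.

```python
def validation_likelihood(probs, labels):
    """Likelihood of the true labels given a model's predicted P(positive)."""
    like = 1.0
    for p, y in zip(probs, labels):
        like *= p if y == 1 else (1.0 - p)
    return like

def fusion_weights(val_probs_by_model, val_labels, prior=None):
    """Posterior-style weights: prior times validation likelihood, normalized."""
    names = list(val_probs_by_model)
    prior = prior or {n: 1.0 / len(names) for n in names}
    unnorm = {
        n: prior[n] * validation_likelihood(val_probs_by_model[n], val_labels)
        for n in names
    }
    total = sum(unnorm.values())
    return {n: v / total for n, v in unnorm.items()}

def fuse(probs_by_model, weights):
    """Weighted average of the models' risk estimates for one patient."""
    return sum(weights[n] * probs_by_model[n] for n in weights)

# Hypothetical validation set: true labels and each model's P(positive).
val_labels = [1, 0, 1, 0]
val_probs = {
    "decision_tree": [0.7, 0.4, 0.6, 0.3],
    "random_forest": [0.9, 0.2, 0.8, 0.1],
}
weights = fusion_weights(val_probs, val_labels)

# Fused risk for a new (hypothetical) patient.
fused = fuse({"decision_tree": 0.55, "random_forest": 0.80}, weights)
print(weights, round(fused, 3))
```

The model that explained the validation labels better (here the toy random forest) receives the larger weight, so the fused estimate sits closer to its prediction. A real implementation would typically work with log-likelihoods and clip probabilities away from 0 and 1, since a single confident mistake would otherwise drive a model's weight to zero.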
Opening the black box for doctors
Because doctors must understand why an algorithm flags a patient as high risk, the team added explainable AI tools called SHAP and LIME. These methods break down each prediction and show which factors pushed the decision toward “cancer” or “no cancer.” They confirmed that Schiller, Hinselmann, and cytology results were the strongest drivers of risk, with age, number of sexual partners, smoking, and past infections also playing important roles. Finally, the researchers wrapped the model in a web‑based application that clinics can use in real time: staff enter patient information, the system returns a risk score, and the explanation panel highlights the main reasons behind that score.
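SHAP builds on Shapley values, which split a prediction into per-feature contributions. The sketch below computes exact Shapley values for a tiny, made-up risk rule by averaging each feature's marginal effect over every ordering of the features. The scoring rule `toy_risk`, the feature names, and the all-zero baseline patient are assumptions for illustration. They are not the paper's model, and the SHAP library itself relies on faster approximations and a background dataset rather than this brute-force enumeration.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction: average each feature's
    marginal contribution over every ordering in which features are
    switched from the baseline value to the patient's value."""
    names = list(x)
    contrib = {n: 0.0 for n in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        prev = model(current)
        for n in order:
            current[n] = x[n]
            new = model(current)
            contrib[n] += new - prev
            prev = new
    return {n: c / len(orderings) for n, c in contrib.items()}

def toy_risk(f):
    # Hypothetical scoring rule with one interaction term.
    return (0.05 + 0.5 * f["schiller"] + 0.25 * f["cytology"]
            + 0.1 * f["schiller"] * f["smoker"])

patient = {"schiller": 1, "cytology": 0, "smoker": 1}
baseline = {"schiller": 0, "cytology": 0, "smoker": 0}

phi = shapley_values(toy_risk, patient, baseline)
print(phi)
```

The contributions add up to the gap between the patient's score and the baseline score, which is the property that lets an explanation panel show how much each factor pushed the risk up or down.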
What this means for patients and clinics
This work shows that when thoughtfully designed and transparently explained, AI can turn routine clinical and behavioral data into a powerful early‑warning system for cervical cancer. The model does not replace biopsies or expert judgment, but it can help overburdened clinics quickly spot women who most need further testing, especially in resource‑limited settings. With larger and more diverse datasets in the future, and by extending the approach to other types of health data, such tools could become an integral part of everyday screening and help prevent thousands of avoidable deaths.
Citation: Islam, O., Assaduzzaman, M., Akter, S. et al. Enhanced cervical cancer diagnosis using a novel Bayesian fusion ensemble method with explainable AI. Sci Rep 16, 12306 (2026). https://doi.org/10.1038/s41598-026-35334-7
Keywords: cervical cancer screening, medical AI, machine learning, women's health, early detection