Clear Sky Science

The Trust-Aware XAI (TAXAI) framework: a quantitative model for interpretable and reliable clinical AI systems

2026-04-02

Why trust matters when computers help doctors

Hospitals are turning to artificial intelligence to read scans, spot disease early, and guide treatment choices. Yet many doctors and patients are uneasy about relying on software they cannot fully see into. This paper introduces a way to measure how much trust we should place in medical AI systems, not just how well they perform. By turning trust into a number, it aims to help clinicians, regulators, and developers decide when an AI tool is safe and reliable enough to use in real care.

From black boxes to clearer reasoning

Modern AI systems can match or even exceed human experts at reading medical images and analyzing patient data. But these systems often act like black boxes, offering a prediction without a clear explanation. Existing explanation tools can draw heat maps on an X-ray or list which lab values influenced a decision, but they rarely say whether those explanations are reliable, fair, or stable over time. The authors argue that simply showing how a model behaves is not enough in high-stakes settings like diagnosis and cancer staging; we also need evidence that the explanations themselves can be trusted.

Figure 1. How medical data, AI and ethics combine to produce a single trust score for clinical decisions

Three pillars of a trustworthy medical AI

The study proposes the Trust-Aware XAI (TAXAI) framework, which treats trust as a combination of three pillars. The first is fidelity, meaning how closely an explanation matches what the underlying model is actually doing. The second is interpretability alignment, which checks whether the highlighted regions or features match how clinicians reason about a case. The third pillar is compliance and reliability, which brings in ideas of fairness between patient groups, stability of results under small changes, and the ability to reproduce findings across runs and sites. Each of these pillars is measured on a scale from zero to one so they can be compared and combined.
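To make the idea of a zero-to-one pillar score concrete, here is a minimal sketch of one common way to estimate fidelity: the fraction of cases in which a simple surrogate explanation reproduces the model's own prediction. This summary does not state which fidelity metric the paper uses, so the function name `agreement_fidelity` and the agreement-rate definition are assumptions made purely for illustration.

```python
def agreement_fidelity(model_labels, surrogate_labels):
    """Fraction of cases where the surrogate reproduces the model's label.

    Hypothetical illustration only: the agreement rate is one common
    fidelity measure, not necessarily the metric used in the paper.
    The result always lies between 0.0 and 1.0.
    """
    if len(model_labels) != len(surrogate_labels):
        raise ValueError("label sequences must have the same length")
    if len(model_labels) == 0:
        raise ValueError("need at least one case")
    # Count the positions where the two label sequences agree.
    matches = sum(m == s for m, s in zip(model_labels, surrogate_labels))
    return matches / len(model_labels)
```

A score of 1.0 would mean the explanation mirrors the model on every case examined, while a score near 0.0 would mean the explanation describes something other than what the model actually does.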

Turning trust into a single clear score

TAXAI pulls these three ingredients together into a single Trust Index, a number between zero and one. This index is calculated by assigning weights to each pillar, which can be tuned for different settings. For example, during early model development, more weight might be given to technical accuracy, while regulators may prefer to emphasize fairness and reliability. The authors prove that with their formula, the Trust Index stays within clear bounds, responds in a predictable way when any component improves or worsens, and remains stable under small shifts in the chosen weights. This makes it easier to compare trust levels across different models, datasets, and explanation methods.
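The weighting idea described above can be sketched in a few lines of code. This summary does not reproduce the paper's exact formula, so the sketch assumes the simplest form consistent with the description: a weighted average of the three pillar scores, with non-negative weights normalised to sum to one. The names `PillarScores` and `trust_index`, and the default of equal weights, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PillarScores:
    """Three pillar scores, each required to lie in [0, 1]."""
    fidelity: float
    alignment: float
    compliance: float

    def __post_init__(self):
        for name in ("fidelity", "alignment", "compliance"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


def trust_index(scores, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted average of the pillar scores.

    Assumed form for illustration, not the paper's published formula.
    """
    if len(weights) != 3:
        raise ValueError("expected exactly three weights")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    values = (scores.fidelity, scores.alignment, scores.compliance)
    # Dividing by the weight total makes this a convex combination,
    # which keeps the result inside [0, 1].
    return sum(w * v for w, v in zip(weights, values)) / total


if __name__ == "__main__":
    # Made-up inputs, chosen only to show the call pattern.
    example = PillarScores(fidelity=0.9, alignment=0.8, compliance=0.7)
    print(trust_index(example))
    print(trust_index(example, weights=(0.2, 0.3, 0.5)))
```

Under this assumed form, the three properties mentioned in the text are easy to see. The index cannot leave the zero-to-one range because it is an average of values in that range. Raising any pillar score can never lower the index because every weight is non-negative. And because each score is at most one, shifting the normalised weights changes the index by no more than the total absolute change in the weights.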

Figure 2. How separate checks on accuracy, clinician alignment and fairness merge into one overall trust signal

Testing the framework on varied medical tasks

To show how TAXAI works in practice, the authors apply it to several common medical AI problems. These include detecting lung cancer from CT scans, reading chest X-rays for pneumonia and COVID-19, grading lung tissue in histology images, classifying breast cancer from tabular test results, spotting brain tumors in MRI images, and predicting diabetes risk from clinical records. For each task, they attach well-known explanation tools such as SHAP, LIME, and Grad-CAM to standard machine learning and deep learning models. They then compute fidelity, interpretability alignment, and compliance scores, and roll them up into Trust Index values. Across these settings, the Trust Index typically falls between 0.85 and 0.94, suggesting that the framework yields consistent, interpretable trust scores rather than erratic or dataset-specific behavior.

Connecting algorithms with ethics and policy

The work also places TAXAI in the broader context of medical regulation. New rules in regions such as the European Union and guidance from agencies like the U.S. Food and Drug Administration call for transparency, fairness, and ongoing oversight for AI that influences patient care. TAXAI is presented as a layer that sits on top of existing models and explanation tools, converting their outputs into trust signals that can feed into audits, documentation, and clinical governance. The authors stress that TAXAI does not attempt to replace existing explainer methods; instead, it provides a structured way to judge how ready an explainable system is for use as medical software.

What this means for future AI in the clinic

In plain terms, this paper shows how trust in medical AI can be treated like any other measurable quality, such as accuracy or speed. By breaking trust into technical, human, and ethical parts, then recombining them into a clear index, TAXAI offers hospitals and regulators a common yardstick for comparing systems. While the current work focuses on computational tests rather than live clinical trials, it lays a foundation for future tools such as trust dashboards and clinician-in-the-loop studies. If adopted, such an approach could help move medical AI from impressive demonstrations toward dependable, well-governed tools that doctors and patients feel more comfortable relying on.

Citation: Pal, M., Saha, H.N. & Chakrabarti, A. The Trust-Aware XAI (TAXAI) framework: a quantitative model for interpretable and reliable clinical AI systems. Sci Rep 16, 15455 (2026). https://doi.org/10.1038/s41598-026-44167-3

Keywords: trust in medical AI, explainable AI healthcare, clinical decision support, AI fairness and reliability, Trust Index framework