Clear Sky Science

A multi-scale hybrid ResNet–transformer with distance-aware learning for interpretable BI-RADS mammographic classification

2026-02-20

Why this matters for patients and doctors

Breast cancer screening saves lives, but deciding which mammograms truly signal danger is difficult, even for experts. This study presents a new computer tool that aims to sort breast images into the familiar BI-RADS risk categories more accurately and more transparently. By doing so, it could help reduce unnecessary biopsies, catch dangerous cancers earlier, and give radiologists an extra, understandable opinion rather than a mysterious black-box answer.

From breast images to risk categories

Mammograms are typically reported using the BI-RADS scale, which ranges from "normal" to "highly suspicious of cancer." The most critical categories are BI-RADS 3, 4, and 5, where decisions about short-term follow-up or biopsy are made. Human readers can disagree, especially when reading mammograms of women with dense breast tissue, and this can lead to both missed cancers and avoidable alarms. To support radiologists, the authors trained an artificial intelligence system on a public mammography dataset in which expert readers had already labeled each image as BI-RADS 1 through 5. They also had to tackle a common problem: the dataset contained many more normal and clearly benign images than high-risk ones, which can bias learning if left uncorrected.

How the new smart reader sees images

The proposed system combines two recent ideas in image analysis. First, it uses a deep convolutional network (ResNet-50) to pick up fine details such as edges, textures, and small bright specks that may indicate microcalcifications. Second, it adds a transformer module, a design originally popularized in language models, to connect distant parts of the image and reason about overall patterns and symmetry. Before any of this, the mammograms are carefully preprocessed: they are resized, contrast is boosted to make subtle lesions stand out, and rarer categories are balanced by oversampling so that the model sees enough examples of suspicious and malignant cases during training.
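
The balancing step described above can be illustrated with a small sketch. This is not the authors' code: the function name `oversample`, the toy image ids, and the choice to match every class to the size of the largest one are assumptions made here for illustration. It simply duplicates randomly chosen examples from rarer categories until all categories are equally represented.

```python
import random
from collections import Counter

def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    matches the size of the largest class (illustrative assumption)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies (with replacement) from the existing examples.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Hypothetical toy data: seven image ids with their BI-RADS labels.
image_ids = ["a", "b", "c", "d", "e", "f", "g"]
birads = [1, 1, 1, 1, 2, 2, 5]
balanced_ids, balanced_labels = oversample(image_ids, birads)
class_counts = Counter(balanced_labels)
```

In general practice, this kind of duplication is applied only to the training images, so that the test set still reflects the original mix of cases.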

Teaching the model that some mistakes matter more

Standard computer models treat all category errors as equally bad, but that is not how medicine works. Mistaking a clearly normal breast for clearly malignant is much more serious than confusing two neighboring categories such as BI-RADS 2 and 3. To reflect this, the authors introduce a "distance-aware" learning strategy. Instead of only counting right and wrong, the loss function also measures how far the prediction sits from the true category on the ordered BI-RADS scale. Large jumps are punished more than near misses, nudging the system to draw smoother, more clinically sensible boundaries between risk levels and to avoid the most dangerous misclassifications.
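
One way to picture such a penalty is sketched below. The exact formula used in the study is not given in this summary, so the form shown here is an assumption: ordinary cross-entropy plus a term equal to the expected distance, in category steps, between the predicted and true BI-RADS level. The names `distance_aware_loss` and `weight` are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distance_aware_loss(logits, true_index, weight=0.5):
    """Cross-entropy plus an ordinal penalty (assumed form, not the paper's).

    The penalty is the expected absolute distance between the predicted
    category and the true one, so probability placed far from the true
    category costs more than probability placed next to it.
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[true_index])
    expected_distance = sum(p * abs(k - true_index) for k, p in enumerate(probs))
    return cross_entropy + weight * expected_distance

# True category is BI-RADS 2 (index 1 when categories 1..5 map to 0..4).
# Both predictions give the true category the same probability, so their
# cross-entropy is identical; only the distance term differs.
near_miss = distance_aware_loss([0.0, 1.0, 3.0, 0.0, 0.0], true_index=1)  # leans to BI-RADS 3
far_miss = distance_aware_loss([0.0, 1.0, 0.0, 0.0, 3.0], true_index=1)   # leans to BI-RADS 5
```

With `weight=0` the two predictions would be scored the same, which is the behavior of a standard loss; any positive weight makes the far miss more costly than the near miss.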

How well it performed in tests

Trained and tuned in two stages, the system showed strong performance when evaluated on a held-out test set. It correctly assigned BI-RADS categories about 92% of the time overall and achieved very high scores on measures that account for class imbalance and multi-class difficulty. Importantly, it was especially accurate for BI-RADS 4 and 5, the groups that most often lead to biopsies. Almost all of its errors were between neighboring categories, which aligns with the real-world gray areas that radiologists face. Additional analyses showed that its predicted probabilities were well calibrated, meaning its stated confidence matched how often it was right, and that it outperformed several recent deep-learning approaches designed for similar tasks.
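
To make "well calibrated" concrete, the sketch below computes expected calibration error, a common summary of the gap between stated confidence and actual accuracy. Whether the study used this particular measure is not stated in this summary; the function name, the ten equal-width bins, and the toy numbers are assumptions for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Average gap between confidence and accuracy, weighted by bin size.

    confidences: the model's probability for its chosen category (0 to 1).
    correct: True where the chosen category matched the label.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += len(items) / n * abs(avg_conf - accuracy)
    return ece

# Hypothetical toy cases: ten predictions, each made with 90% confidence.
well_calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
```

A value near zero means the stated confidence can be taken roughly at face value; a large value means the model claims more (or less) certainty than its track record supports.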

Seeing where the model looks

Because trust is crucial in medicine, the authors went beyond raw numbers and examined what parts of the image drove the system’s decisions. Using Grad-CAM, a technique that highlights influential regions, they found that the model consistently focused on lesions, dense tissue patterns, and clusters of tiny bright spots rather than irrelevant background. Visualizations of the internal feature space showed clear grouping of images by BI-RADS level, particularly separating suspicious and malignant exams from normal and benign ones. Decision-curve analysis suggested that, if used alongside clinicians, the tool could reduce unnecessary interventions while preserving or improving cancer detection, though this remains to be tested in real-world practice.
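
Decision-curve analysis rests on a quantity called net benefit, which weighs correctly flagged cancers against unnecessary referrals at a chosen risk threshold. The sketch below uses the standard net-benefit formula with hypothetical toy data; how the study reduced its five BI-RADS categories to a refer-or-not decision is not described in this summary, so the binary setup here is an assumption.

```python
def net_benefit(risks, has_cancer, threshold):
    """Net benefit of referring everyone whose predicted risk >= threshold.

    Standard formula: TP/n - FP/n * threshold / (1 - threshold),
    where the threshold reflects how many unnecessary referrals one is
    willing to accept per cancer found.
    """
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, has_cancer) if r >= threshold and y)
    fp = sum(1 for r, y in zip(risks, has_cancer) if r >= threshold and not y)
    return tp / n - fp / n * threshold / (1 - threshold)

# Hypothetical toy data: predicted cancer risk for six exams, two with cancer.
risks = [0.9, 0.8, 0.3, 0.2, 0.1, 0.05]
has_cancer = [True, True, False, False, False, False]
threshold = 0.25

model_benefit = net_benefit(risks, has_cancer, threshold)
# Baseline: refer every patient, modeled as risk 1.0 for all.
refer_all_benefit = net_benefit([1.0] * len(risks), has_cancer, threshold)
```

In this toy example the model-guided strategy finds both cancers with one unnecessary referral instead of four, so its net benefit is higher than referring everyone. A decision curve repeats this comparison across a range of thresholds.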

What the results mean going forward

This work shows that combining convolutional and transformer ideas, and explicitly teaching the model that some mistakes are worse than others, can produce an AI assistant that is both accurate and more aligned with clinical thinking. While the study is limited by a modest, single-source dataset and lacks external validation, it points toward AI systems that not only match radiologists in performance but also explain their focus and confidence. With further testing on larger and more varied populations, such tools could become reliable partners in breast cancer screening, helping ensure that the right women get the right follow-up at the right time.

Citation: Singh, M., Mohan, A., Tripathi, U. et al. A multi-scale hybrid ResNet–transformer with distance-aware learning for interpretable BI-RADS mammographic classification. Sci Rep 16, 10033 (2026). https://doi.org/10.1038/s41598-026-40906-8

Keywords: breast cancer screening, mammography AI, BI-RADS classification, deep learning in radiology, explainable medical imaging