Clear Sky Science

HMC-transducer: hierarchical mamba-CNN transducer for robust liver tumor segmentation

Why better tumor maps matter

For patients with liver or kidney cancer, doctors rely on CT scans to decide whether surgery, radiation, or other treatments are possible. A key step is drawing precise outlines around every tumor in three dimensions, a job that is slow, painstaking, and inconsistent when done by hand. This paper introduces a new kind of artificial intelligence system that can automatically trace these tumors more accurately and consistently than previous methods, potentially helping clinicians plan treatments faster and with greater confidence.

Seeing the whole picture in 3D scans

Liver tumors are notoriously difficult to outline because they vary widely in size and shape and often blend into the surrounding tissue. Traditional deep-learning tools called convolutional neural networks (CNNs) are very good at spotting fine details in images, but they struggle to understand long-range relationships—how a structure in one part of a scan relates to another far away. Newer models called Transformers can capture this broad context but become extremely expensive to run on large 3D CT volumes, limiting their practicality in real hospitals. The authors argue that to truly succeed, a system must be both detail-oriented and globally aware, without demanding supercomputer levels of power.
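The cost gap described above can be illustrated with simple arithmetic. The sketch below is not from the paper: the volume size (128 x 128 x 128 voxels) and the patch size (4) are hypothetical values chosen only to show how full self-attention, which compares every token with every other token, scales against a model that visits each token once per pass.

```python
def token_count(depth, height, width, patch):
    """Number of non-overlapping cubic patches (tokens) in a 3D volume."""
    return (depth // patch) * (height // patch) * (width // patch)


def attention_pairs(n_tokens):
    """Full self-attention scores every token against every token: n^2 pairs."""
    return n_tokens * n_tokens


def scan_steps(n_tokens):
    """A linear-time sequence model visits each token once per pass: n steps."""
    return n_tokens


# Hypothetical scan: 128 x 128 x 128 voxels cut into 4 x 4 x 4 patches.
n = token_count(128, 128, 128, patch=4)
print("tokens:", n)
print("attention pairs:", attention_pairs(n))
print("linear scan steps:", scan_steps(n))
```

Doubling the number of tokens doubles the linear cost but quadruples the attention cost, which is why large 3D volumes are where the difference matters most.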

A new hybrid brain for medical images

To meet this need, the researchers designed the HMC-Transducer, a hybrid architecture that marries CNNs with a newer family of models called state space models, specifically one known as Mamba. The CNN parts focus on crisp local details such as sharp tumor edges. The Mamba parts track how information flows across an entire 3D scan at a computing cost that grows only linearly with the size of the scan, avoiding the quadratic growth of self-attention in Transformers. A specially designed “direction-aware 3D Mamba” block processes the scan along three axes (head-to-foot, left-to-right, and front-to-back) so that the model respects real anatomical structure instead of flattening the volume into a one-dimensional string of numbers.
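The idea of scanning along three axes can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' block: a fixed-decay recurrence stands in for Mamba's learned, input-dependent state update, the volume is a nested list indexed as `vol[z][y][x]`, and the three scans are merged by a plain average. The names `scan_axis` and `direction_aware_scan` are hypothetical.

```python
from itertools import product


def scan_axis(vol, axis, decay=0.5):
    """Run the recurrence h_t = decay * h_(t-1) + x_t along one axis of vol[z][y][x].

    The fixed `decay` is a stand-in (assumption) for a learned state space update.
    """
    shape = (len(vol), len(vol[0]), len(vol[0][0]))
    out = [[[0.0] * shape[2] for _ in range(shape[1])] for _ in range(shape[0])]
    a, b = [k for k in range(3) if k != axis]  # the two axes held fixed per line
    for i, j in product(range(shape[a]), range(shape[b])):
        h = 0.0  # state is reset at the start of every line through the volume
        for t in range(shape[axis]):
            idx = [0, 0, 0]
            idx[axis], idx[a], idx[b] = t, i, j
            z, y, x = idx
            h = decay * h + vol[z][y][x]
            out[z][y][x] = h
    return out


def direction_aware_scan(vol, decay=0.5):
    """Average the scans along axis 0, 1, and 2 (averaging is an assumption)."""
    scans = [scan_axis(vol, axis, decay) for axis in range(3)]
    depth, height, width = len(vol), len(vol[0]), len(vol[0][0])
    return [[[sum(s[z][y][x] for s in scans) / 3.0
              for x in range(width)]
             for y in range(height)]
            for z in range(depth)]


# Tiny 2 x 2 x 2 volume with a single bright voxel at the origin.
volume = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
volume[0][0][0] = 1.0
mixed = direction_aware_scan(volume)
print(mixed[0][0][0], mixed[1][0][0], mixed[1][1][1])
```

In this toy version the bright voxel influences its neighbors along each axis separately, while the diagonally opposite corner stays at zero. Because every line is scanned in its own spatial direction, neighbors in the volume remain neighbors in the sequence, which is the property a single flattened ordering loses.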

Figure 1.

Letting the model decide what matters where

A central innovation is how these two types of features are combined. Rather than simply adding or stacking the CNN and Mamba outputs, the HMC-Transducer uses a gated fusion mechanism that learns, for every tiny region in the scan, how much to trust local detail versus global context. In areas with clear, sharp boundaries, the gate can lean on CNN features; where tumors are fuzzy, infiltrative, or sit near major blood vessels, it can give more weight to the broader view from Mamba. Experiments show that this adaptive blending produces tighter, more stable segmentations than either CNNs or Mamba-based models alone, and clear improvements over earlier hybrid designs that fuse features in a fixed, non-adaptive way.
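A gated blend of this kind can be written down compactly. The following is a minimal sketch, assuming one scalar feature per region and a gate computed by a sigmoid over a weighted sum of the two inputs. The weights `w_local`, `w_global`, and `bias` are hypothetical placeholders for parameters a real model would learn, and the paper's actual gate may be structured differently.

```python
import math


def sigmoid(v):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def gated_fusion(local_feat, global_feat, w_local, w_global, bias):
    """Blend local (CNN-like) and global (Mamba-like) features region by region.

    For each region: gate = sigmoid(w_local * l + w_global * g + bias),
    fused = gate * l + (1 - gate) * g. The gate parameters are assumptions.
    """
    fused, gates = [], []
    for l, g in zip(local_feat, global_feat):
        gate = sigmoid(w_local * l + w_global * g + bias)
        gates.append(gate)
        fused.append(gate * l + (1.0 - gate) * g)
    return fused, gates


# Two regions: one where the local feature is strong, one where it is weak.
local_features = [0.9, 0.2]
global_features = [0.4, 0.8]
fused, gates = gated_fusion(local_features, global_features,
                            w_local=2.0, w_global=-2.0, bias=0.0)
for gate, value in zip(gates, fused):
    print(f"gate={gate:.3f} fused={value:.3f}")
```

A gate near 1 means the output follows the local feature, and a gate near 0 means it follows the global one. Fixed addition or stacking corresponds to holding the gate constant everywhere, which is the non-adaptive behavior the text contrasts this design with.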

Tested across organs, scanners, and hospitals

The team evaluated their approach on three major public datasets: LiTS17 and MSD-Liver for liver tumors, and KiTS21 for kidney tumors. Across these benchmarks, HMC-Transducer consistently achieved higher overlap with expert-drawn tumor maps than strong baselines, including the widely used nnU-Net and leading Transformer and Mamba models. It also generalized better when trained on one liver dataset and tested on another collected at different hospitals, a scenario that mimics real-world deployment with varying scanners and imaging protocols. In head‑to‑head tests, large “foundation models” such as SAM and its medical variants, used out of the box without specialized training, lagged far behind, highlighting that task-specific, carefully tuned systems are still needed for high‑stakes pixel‑level decisions in medicine.
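Overlap between a predicted tumor map and an expert-drawn one is commonly measured with the Dice coefficient. This summary does not name the metric the authors used, so the sketch below is an assumption about a typical choice rather than a description of their evaluation. Masks are given as flat lists of 0 and 1, and scoring two empty masks as a perfect match is a convention picked here.

```python
def dice_score(pred, truth):
    """Dice = 2 * |P and T| / (|P| + |T|) for two flat binary masks.

    Returns 1.0 when both masks are empty (a convention chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 1.0 if total == 0 else 2.0 * overlap / total


# Four voxels: the prediction and the expert mask agree on one tumor voxel.
predicted = [1, 1, 0, 0]
expert = [1, 0, 1, 0]
print(dice_score(predicted, expert))
```

A score of 1.0 means the two outlines coincide exactly and 0.0 means they share no voxels, so higher values correspond to the "higher overlap" reported above.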

Figure 2.

From lab results to clinical help

To a non-specialist, the takeaway is that this work moves tumor mapping software closer to what doctors actually need: a tool that is both trustworthy and efficient. By combining two complementary ways of “seeing”—one that excels at small details, and one that excels at the big picture—the HMC-Transducer draws liver and kidney tumors more accurately and more reliably than earlier systems, while still running on standard high-end hospital hardware. Although further steps are required before routine clinical use, including broader testing on other organs and imaging types, the approach represents a promising advance toward automated 3D tumor maps that could support faster diagnoses, more precise surgeries, and more personalized cancer care.

Citation: Zhu, J., Xu, C., Lei, C. et al. HMC-transducer: hierarchical mamba-CNN transducer for robust liver tumor segmentation. npj Digit. Med. 9, 176 (2026). https://doi.org/10.1038/s41746-026-02361-7

Keywords: liver tumor segmentation, medical imaging AI, deep learning, CT scan analysis, hybrid neural networks