Clear Sky Science

A multimodal multi-scale transformer for virtual pretreatment patient-specific QA of SBRT using portal-dosimetry fluence maps

Checking cancer treatment plans more safely

Before a patient receives high-dose radiation for cancer, medical teams must double-check that the treatment plan will hit the tumor accurately while sparing healthy tissue. This safety check is vital but often slow and labor-intensive. The study described here explores whether an artificial intelligence system can reliably predict whether a plan will pass quality checks, helping clinics treat patients more safely and efficiently.

Figure 1. AI reviews high-dose cancer radiation plans in advance using detector images and plan details to gauge delivery safety.

Why these radiation treatments are challenging

Stereotactic body radiation therapy, or SBRT, delivers very high radiation doses in only a few sessions to small targets in the body such as tumors in the lung, liver, brain, spine, or prostate. Because the dose falls off sharply outside the target, even tiny errors in how the beam is delivered can affect nearby healthy tissue. Clinics therefore perform a safety step called patient-specific quality assurance, in which they compare the planned dose with what the treatment machine actually produces. This is usually done by taking special images with a built-in detector and then running detailed checks, a process that can be time-consuming.
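The planned-versus-delivered comparison is commonly summarized with gamma analysis, which combines a dose-difference tolerance with a distance-to-agreement tolerance. The sketch below is a minimal one-dimensional, brute-force illustration of a global gamma passing rate, not the clinical software or the authors' code. The function name, the 3%/3 mm defaults, and the 10% low-dose cutoff are assumptions chosen for illustration.

```python
import math


def gamma_passing_rate(planned, measured, spacing_mm,
                       dose_tol=0.03, dta_mm=3.0, low_dose_cutoff=0.10):
    """Global 1D gamma passing rate in percent (brute-force search).

    planned, measured: dose profiles sampled on the same grid.
    spacing_mm: distance between neighbouring samples.
    dose_tol: dose tolerance as a fraction of the planned maximum (assumed 3%).
    dta_mm: distance-to-agreement tolerance (assumed 3 mm).
    low_dose_cutoff: measured points below this fraction of the planned
        maximum are skipped (assumed 10%).
    """
    d_max = max(planned)
    dose_norm = dose_tol * d_max
    evaluated = 0
    passed = 0
    for i, d_meas in enumerate(measured):
        if d_meas < low_dose_cutoff * d_max:
            continue  # ignore low-dose points
        evaluated += 1
        # Smallest combined (distance, dose) deviation over all planned points.
        best_sq = min(
            ((i - j) * spacing_mm / dta_mm) ** 2
            + ((d_meas - d_plan) / dose_norm) ** 2
            for j, d_plan in enumerate(planned)
        )
        if math.sqrt(best_sq) <= 1.0:
            passed += 1
    return 100.0 * passed / evaluated if evaluated else float("nan")
```

Tightening `dose_tol` and `dta_mm` makes the criterion stricter, which is why a single plan can be scored under several criteria at once.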

Turning detector images and plan details into predictions

The authors built a computer model that learns to predict how well a radiation plan will match what is delivered, using information that clinics already compute. One input is a kind of picture called a fluence map, created by the machine’s electronic detector during a test run. It shows how the beam’s intensity is spread across the field. The second input is a set of numbers that describe how complex the beam shapes and motions are, including how many monitor units are used and how much the multileaf collimator moves and modulates the beam. Together, these give the model both a visual impression of the dose pattern and a summary of how difficult the plan is to deliver.
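As a rough illustration of what such beam descriptors can look like, the sketch below derives three simple numbers from a beam: total monitor units, total multileaf collimator leaf travel, and mean aperture area. The data layout (a list of control points, each a list of left and right leaf positions in millimetres), the 5 mm leaf width, and the choice of features are all hypothetical; the study's actual feature set is not reproduced here.

```python
def complexity_features(monitor_units, control_points, leaf_width_mm=5.0):
    """Summarize one beam with a few illustrative complexity numbers.

    control_points: list of control points; each control point is a list of
        (left_mm, right_mm) positions, one tuple per leaf pair (assumed layout).
    leaf_width_mm: width of each leaf pair (assumed constant).
    """
    # Total distance travelled by all leaves between consecutive control points.
    travel = 0.0
    for prev, curr in zip(control_points, control_points[1:]):
        for (l0, r0), (l1, r1) in zip(prev, curr):
            travel += abs(l1 - l0) + abs(r1 - r0)

    # Open aperture area at each control point, then the average.
    areas = [
        sum((right - left) * leaf_width_mm for left, right in cp)
        for cp in control_points
    ]
    mean_area = sum(areas) / len(areas)

    return {
        "monitor_units": monitor_units,
        "total_leaf_travel_mm": travel,
        "mean_aperture_area_mm2": mean_area,
    }
```

More leaf travel and smaller apertures for the same monitor units generally indicate a more heavily modulated beam, which is the kind of difficulty signal the numeric input is meant to carry.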

How the new AI model works

Instead of relying only on traditional image-based neural networks, the team designed a multimodal transformer, a type of AI architecture that can weigh relationships across an entire image and across many features at once. The visual part looks at versions of each fluence map at several scales, allowing it to capture both fine detail and broader patterns. In parallel, another branch processes the numerical beam descriptors. A fusion step then combines these two streams into a single representation that is used to predict nine different measures of plan agreement known as gamma passing rates, each reflecting a different strictness level in comparing planned and measured dose.
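The data flow described above can be sketched in a few lines. The example below builds a multi-scale pyramid of a fluence map by average pooling and then fuses a per-scale image summary with the numeric plan features by simple concatenation. It deliberately omits the transformer layers and the nine-output prediction head. The pooling factors, the mean/max summary, and concatenation as the fusion step are assumptions standing in for the authors' design.

```python
def average_pool(image, factor):
    """Downsample a 2D list by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - factor + 1, factor):
        pooled_row = []
        for c in range(0, cols - factor + 1, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            pooled_row.append(sum(block) / len(block))
        pooled.append(pooled_row)
    return pooled


def multiscale_pyramid(image, factors=(1, 2, 4)):
    """Return the image at several resolutions (assumed pooling factors)."""
    return [average_pool(image, f) for f in factors]


def fuse(pyramid, plan_features):
    """Concatenate a (mean, max) summary of each scale with the plan features."""
    image_summary = []
    for level in pyramid:
        values = [v for row in level for v in row]
        image_summary += [sum(values) / len(values), max(values)]
    return image_summary + list(plan_features)
```

In a real model each pyramid level would be split into patch tokens and passed through attention layers before fusion; the fused vector would then feed a regression head with one output per gamma criterion.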

Figure 2. AI combines beam images and plan complexity features step by step to estimate how closely treatment will match the plan.

Testing on real patient plans from two hospitals

To evaluate performance, the researchers trained and tested the model on 147 SBRT treatment plans, covering 1,265 individual beams from two separate cancer centers. These included a range of tumor sites and target sizes. The new model was compared with several well-known deep learning systems that rely mainly on image data. Across all nine gamma criteria and on both hospital datasets, the transformer model produced the lowest prediction errors. Statistical tests confirmed that these gains were unlikely to be due to chance. When the authors turned off parts of the architecture in ablation experiments, performance dropped, showing that both the multi-scale image processing and the inclusion of beam complexity information were important.
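A comparison of this kind can be illustrated with a small sketch. The summary does not say which error metric or statistical test was used, so the example assumes mean absolute error on the gamma passing rate and an exact paired sign-flip permutation test on per-beam errors. Both function names are hypothetical.

```python
from itertools import product


def mean_absolute_error(true_rates, predicted_rates):
    """Average absolute gap between measured and predicted passing rates."""
    return sum(abs(t - p) for t, p in zip(true_rates, predicted_rates)) / len(true_rates)


def paired_sign_flip_p_value(errors_a, errors_b):
    """Two-sided exact sign-flip test on paired per-beam errors.

    Enumerates all 2**n sign assignments, so it is only practical for small n;
    larger samples would need random sampling of sign flips instead.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    observed = abs(sum(diffs))
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            count += 1
    return count / total
```

A small p-value here means that a difference in error as large as the observed one would rarely arise if the two models were equally accurate on each beam.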

What this means for future cancer care

The study shows that an AI system combining detector images with plan complexity details can accurately forecast whether SBRT treatment fields will pass quality checks. For now, the authors see this approach as a screening tool that can flag beams likely to pass and highlight those that deserve closer inspection, helping clinics focus their time where it matters most. With further testing across more centers and careful integration into clinical workflows, such models could reduce the workload of routine checks while maintaining, or even improving, the safety of high-precision radiation therapy.

Citation: You, HQ., Zheng, JJ. & He, XS. A multimodal multi-scale transformer for virtual pretreatment patient-specific QA of SBRT using portal-dosimetry fluence maps. Sci Rep 16, 15313 (2026). https://doi.org/10.1038/s41598-026-46687-4

Keywords: stereotactic body radiation therapy, radiation treatment quality assurance, multimodal deep learning, portal dosimetry, transformer neural network