Clear Sky Science

RadRepro CBCT: An Open-Access CBCT Phantom Dataset for Improved Standardization and Reproducibility of Radiomics Research

Why this matters for future cancer care

Modern cancer treatment increasingly relies on computers to read medical scans and spot patterns that humans might miss. These patterns, called “radiomic features,” could one day predict how a tumor will behave or how a patient will respond to therapy. But there is a major hurdle: the same patient scanned on different machines, or with slightly different settings, can produce very different numbers. This paper introduces a new open, carefully designed test dataset that helps researchers worldwide check and improve how reliable these image-based measurements really are.

Figure 1.

Turning everyday scanners into reliable measuring tools

The study focuses on cone-beam computed tomography (CBCT), a type of 3D X-ray scan already built into many radiation therapy machines. CBCT is used right before or during treatment to verify that the patient is positioned correctly and to track how tumors and normal tissues change over time. Because CBCT scans are taken so frequently, they are a rich source of information for radiomics research. However, CBCT images are typically noisier and of lower quality than standard diagnostic CT scans, which makes the extracted measurements more fragile and less trustworthy if not carefully tested.

A stand-in patient that never changes

To tackle this, the authors used a physical test object known as a phantom. Unlike real patients, a phantom does not move, lose weight, or change biologically. The team chose a widely available model called Catphan 503, which is already supplied with many treatment machines. It is a compact cylinder with well-defined plastic inserts that mimic different materials. By adding an oval “body” ring around it, they created a scanning setup that roughly resembles the size and shape of a human torso. Because the design is standardized, clinics around the world can reproduce the same conditions and compare their results directly.

Systematically stressing the scanners

The phantom was scanned on four CBCT systems from two major manufacturers used in radiation oncology. For each machine, the researchers deliberately varied key imaging settings: the amount of X-ray exposure, the thickness of the image slices, and the type of image-smoothing filters used during reconstruction. They also repeated the same scan multiple times and shifted the phantom in different directions inside the imaging field to mimic changes in patient position. In total, this produced 120 three-dimensional scan volumes, all from the same unchanging phantom but under many slightly different technical conditions.

Figure 2.

From images to numbers, step by step

For each scan, the team defined six precise regions inside the phantom that contain different materials, including Teflon, various plastics, and air. These regions were drawn once and then mapped consistently to every scan using automated alignment, avoiding human-to-human variation. The images were converted to a common file format and processed with an open-source software package that follows international standards for radiomics. All images were resampled to uniform 3D pixels (voxels) so that textures could be measured fairly, and the same intensity scale and binning rules were used throughout. From each region, the authors extracted 107 numerical features describing basic brightness, shape, and more complex texture patterns.
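To make the role of shared binning rules concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline or the API of the software package they used. The function names, the fixed-bin-width convention anchored at a chosen minimum, and the sample intensities are all assumptions made for illustration.

```python
import math
from collections import Counter


def discretize(values, bin_width, minimum=0.0):
    """Fixed-bin-width discretization (one common convention, assumed here).

    Each intensity is mapped to an integer bin index starting at 1,
    counted in steps of `bin_width` above `minimum`.
    """
    return [int(math.floor((v - minimum) / bin_width)) + 1 for v in values]


def first_order_features(values, bin_width, minimum=0.0):
    """Compute a few simple brightness features for one region.

    Mean and variance use the raw intensities; entropy uses the
    discretized intensities, so it depends on the binning rule.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    counts = Counter(discretize(values, bin_width, minimum))
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mean": mean, "variance": variance, "entropy": entropy}


# Hypothetical intensities from one region (illustrative values only).
region = [0.0, 10.0, 20.0, 30.0]

fine = first_order_features(region, bin_width=10.0)
coarse = first_order_features(region, bin_width=25.0)
print("bin width 10:", fine)
print("bin width 25:", coarse)
```

In this toy case the mean and variance are identical for both calls, but the entropy changes with the bin width. That is the kind of pipeline-induced difference a single, fixed binning rule is meant to rule out.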

A shared testbed for fair comparison

The outcome of this work is not a new prediction model, but a carefully curated public dataset. It includes the raw CBCT images from all scanners, the region maps, and the full table of extracted features, along with the exact code used. Researchers can use it to see which radiomic features stay stable when scanner settings change, which ones are too sensitive to trust, and how different analysis pipelines compare. In practical terms, this dataset is like a common ruler that lets teams around the world check whether their image-based measurements are consistent. Over time, such standardization should help turn radiomics from a promising idea into a dependable tool that can genuinely guide personalized treatment in the clinic.
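One simple way to ask which features stay stable is to compute each feature's coefficient of variation across repeated scans of the same region. The sketch below assumes that approach. The summary does not say which stability metric the authors or dataset users apply, and the feature names, values, and 10% threshold are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        # Undefined for a zero mean; treat as maximally unstable.
        return float("inf")
    return statistics.stdev(values) / abs(mean)


def stable_features(table, threshold=0.10):
    """Return the names of features whose variation stays within `threshold`.

    `table` maps a feature name to its values across scans of the
    same phantom region. The 10% default is an illustrative choice.
    """
    return sorted(
        name
        for name, values in table.items()
        if coefficient_of_variation(values) <= threshold
    )


# Hypothetical feature values from four scans of one region.
feature_table = {
    "mean_intensity": [100.0, 101.0, 99.0, 100.0],
    "texture_contrast": [1.0, 2.0, 0.5, 3.0],
}

print(stable_features(feature_table))
```

With a real feature table, the same check could be run separately for each varied setting (exposure, slice thickness, reconstruction filter, phantom position) to see which setting a feature is sensitive to.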

Citation: Hatamikia, S., Steiner, E., Muniya, E.J. et al. RadRepro CBCT: An Open-Access CBCT Phantom Dataset for Improved Standardization and Reproducibility of Radiomics Research. Sci Data 13, 454 (2026). https://doi.org/10.1038/s41597-026-06781-8

Keywords: radiomics, cone-beam CT, phantom dataset, radiation therapy imaging, image standardization