Clear Sky Science

Predicting DNA damage yields and assessing beam quality for protons and carbon ions using a DBSCAN algorithm


Sharper Cancer Beams

Modern cancer treatments increasingly rely on beams of charged particles, such as protons and carbon ions, to attack tumors while sparing nearby healthy tissue. Yet the same physical dose of radiation does not always cause the same amount of biological damage. This paper asks a practical question: can we predict how “harsh” a given particle beam will be on DNA using a simpler, faster method than today’s heavy computer simulations?

Why DNA Breaks Matter

When radiation passes through our cells, it leaves behind a trail of tiny energy deposits in water and DNA. These events can snap one or both strands of the DNA double helix. Single-strand breaks are often repairable, while double-strand breaks, especially when clustered together, are more likely to kill a cell or lead to mutations. Clinicians currently rely mainly on physical quantities, such as dose and linear energy transfer (LET), to plan treatments, but these quantities cannot fully explain how often serious DNA damage occurs. A more direct link between beam properties and DNA breakage could help design particle therapies that are both more effective against tumors and safer for patients.

Figure 1.

Clustering Tiny Hits Into Meaningful Damage

The authors build on an idea from data science: cluster analysis. Instead of simulating every chemical step after radiation hits water, they simulate only the initial “track structure” of energy deposits made by protons and carbon ions in liquid water. They then apply a widely used clustering algorithm, DBSCAN, to identify groups of damage points. Any interaction depositing at least 17.5 electronvolts is counted as a potential strand break. If at least two such points fall within about 2.1 nanometers of each other (a distance similar to the width of DNA), they are grouped into a cluster, which is interpreted as a double-strand break. Isolated points are treated as single-strand breaks. The grouping distance was tuned so that the model reproduces detailed benchmark simulations; with that setting, the team turns raw tracks into estimated yields of simple and complex DNA damage.
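The grouping rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It relies on the fact that when the minimum cluster size is two, DBSCAN amounts to linking every pair of points closer than the grouping distance and collecting the connected groups. The function name `classify_breaks`, the input layout, and the toy coordinates are assumptions made for this example; only the 17.5 eV threshold and the 2.1 nm distance come from the text.

```python
import math
from itertools import combinations

E_THRESHOLD_EV = 17.5  # from the text: minimum deposit counted as a potential strand break
EPS_NM = 2.1           # from the text: grouping distance, roughly the width of DNA


def classify_breaks(deposits, eps=EPS_NM, e_min=E_THRESHOLD_EV):
    """Count isolated points (read as SSBs) and clusters (read as DSBs).

    deposits: list of (x_nm, y_nm, z_nm, energy_eV) tuples (hypothetical layout).
    Returns (n_ssb, n_dsb).
    """
    # Keep only deposits energetic enough to count as potential strand breaks.
    pts = [(x, y, z) for x, y, z, e in deposits if e >= e_min]

    # Union-find: link every pair of points within eps of each other.
    parent = list(range(len(pts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(pts)), 2):
        if math.dist(pts[i], pts[j]) <= eps:
            parent[find(i)] = find(j)

    # Group sizes: size 1 is an isolated point, size >= 2 is a cluster.
    sizes = {}
    for i in range(len(pts)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1

    n_ssb = sum(1 for s in sizes.values() if s == 1)
    n_dsb = sum(1 for s in sizes.values() if s >= 2)
    return n_ssb, n_dsb


# Toy track (made-up coordinates, not simulation output).
toy_deposits = [
    (0.0, 0.0, 0.0, 20.0),
    (1.0, 0.0, 0.0, 18.0),   # 1.0 nm from the first point: same cluster
    (10.0, 0.0, 0.0, 30.0),
    (11.5, 0.0, 0.0, 17.5),  # 1.5 nm from the previous point: same cluster
    (14.0, 0.0, 0.0, 25.0),  # 2.5 nm from its nearest neighbor: isolated
    (50.0, 0.0, 0.0, 10.0),  # below 17.5 eV: ignored
]
n_ssb, n_dsb = classify_breaks(toy_deposits)
```

The all-pairs loop is fine for a toy example but scales quadratically; a real track with many deposits would call for a spatial index or a library DBSCAN implementation.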

A New Way to Score a Beam

From the clustering results, the authors introduce a new metric called the Quality of Beam, or QoB: how many clusters are produced per particle per micrometer of path. They then normalize this by the energy that the particle deposits along its path, yielding a quantity with units of “clusters per unit energy.” For therapeutic protons spanning 0.5 to 200 megaelectronvolts (MeV), this normalized QoB shows a closely linear relationship with the number of double-strand breaks predicted by a trusted, much more elaborate model. This means a simple conversion factor can translate normalized QoB directly into double-strand and single-strand break yields, bypassing full water-radiolysis simulations while staying consistent with earlier work.
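As a rough illustration of the arithmetic, the sketch below computes QoB and its normalized form, and estimates a single conversion factor from paired values. It is one plausible reading of the description above, not the paper's exact definition: the function names, the choice of keV as the energy unit, and the assumption that the conversion is a straight line through the origin are all assumptions made for this example.

```python
def qob(n_clusters, n_particles, path_um):
    """Clusters per particle per micrometer of path."""
    return n_clusters / (n_particles * path_um)


def normalized_qob(n_clusters, n_particles, path_um, edep_kev_per_particle):
    """QoB divided by the energy deposited per micrometer (clusters per keV).

    edep_kev_per_particle: mean energy one particle deposits over path_um
    (keV is an assumed unit for this sketch).
    """
    edep_per_um = edep_kev_per_particle / path_um
    return qob(n_clusters, n_particles, path_um) / edep_per_um


def fit_conversion_factor(nqob_values, dsb_yields):
    """Least-squares slope of a line through the origin: dsb ~ k * nqob.

    Assumes strict proportionality, which is a simplification made here.
    """
    num = sum(x * y for x, y in zip(nqob_values, dsb_yields))
    den = sum(x * x for x in nqob_values)
    return num / den


# Placeholder numbers chosen only to exercise the functions; not from the paper.
example_qob = qob(n_clusters=50, n_particles=10, path_um=5.0)
example_nqob = normalized_qob(50, 10, 5.0, edep_kev_per_particle=25.0)
example_k = fit_conversion_factor([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Once a factor `k` has been fitted against a reference model, a new beam's double-strand break yield would be estimated as `k` times its normalized QoB, which is the shortcut the paragraph above describes.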

Figure 2.

Comparing Protons and Carbon Ions

The same framework was applied to carbon ions, which have denser tracks and are used in some specialized cancer centers. Using the proton-optimized settings, the model still found a tight linear link between normalized QoB and double-strand breaks for carbon ions up to a certain LET (about 160–200 kiloelectronvolts per micrometer). Beyond that, the trend bends over: additional energy no longer increases the number of new clusters, a behavior known as the “overkill” effect. Here, so much energy is poured into already-damaged regions that extra ionizations add little new biological effect. Importantly, the curve of normalized QoB versus LET for both protons and carbon ions mirrors published measurements of relative biological effectiveness (RBE) in cells, capturing a rise, a broad maximum, and a downturn at very high LET, a regime where LET alone falls short as a predictor.

What This Means for Future Treatments

To a non-specialist, the key message is that not all radiation of the same physical dose harms cells in the same way. What matters is how energy is distributed at the nanometer scale around DNA. This study shows that by treating radiation tracks like data points and grouping them with a clustering algorithm, one can quickly estimate how often serious DNA breaks occur and define a new measure of beam “quality” that better reflects biological impact. For protons, the method can directly predict single- and double-strand break yields using a single conversion factor. For heavier ions, some tuning is still needed, but the same approach highlights important effects like overkill. In the long run, such biologically informed beam metrics could help refine particle therapy plans, aiming tumor-killing power precisely where it is needed while reducing unintended harm to healthy tissue.

Citation: Chaibura, S., Liamsuwan, T. Predicting DNA damage yields and assessing beam quality for protons and carbon ions using a DBSCAN algorithm. Sci Rep 16, 10327 (2026). https://doi.org/10.1038/s41598-026-40571-x

Keywords: proton therapy, carbon ion therapy, DNA damage, radiation quality, cluster analysis