Clear Sky Science
DNA-based identification of plants and the genomic nature of plant species differences
Why tiny DNA changes matter for saving plants
Plants underpin our food, oxygen, and ecosystems, but even experts often struggle to tell closely related species apart. This matters for tracking biodiversity loss, enforcing trade rules, and restoring habitats. This study takes a deep look at how differences in plant DNA are spread across the genome and asks a practical question: how much and what kind of DNA information do we really need to reliably tell plant species apart?
From barcodes to whole genomes
Scientists already use short DNA stretches called barcodes to identify many animals and plants. In animals, a single mitochondrial gene often works very well. In plants, however, the standard barcodes from plastid DNA and a ribosomal region frequently blur the lines between species, especially in recently evolved groups. This is partly because plant species often hybridize, typically pass on plastid DNA only through seeds rather than pollen, and sometimes form new species quickly without much change in these standard barcode regions. To go beyond these limits, the authors gathered nuclear DNA data from many genes across the genome, which offers a more complete picture of how plant species differ.

Checking if named species form natural genetic groups
The team compiled results from 151 studies covering 134 plant genera and 1713 species, each sampled with multiple individuals and multiple nuclear DNA regions. They asked whether individuals assigned to the same species cluster together on family trees built from nuclear DNA, a pattern called monophyly. About 70 percent of species did so, while roughly 30 percent did not form neat, separate branches. Such mismatches can reflect real biological processes such as recent splits, ongoing gene flow, hybrid origins, or polyploidy, as well as unresolved or inconsistent taxonomy. The finding confirms that many but not all named plant species correspond to clear genetic lineages when viewed through the nuclear genome.
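The monophyly check can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the nested-tuple tree, the sample names, and the function names are all hypothetical, and the tree is assumed to be rooted. A species counts as monophyletic here if some branch of the tree contains all of its sampled individuals and nothing else.

```python
from collections import defaultdict


def clade_leaf_sets(tree):
    """Collect the set of sample names under every node of a rooted tree.

    The tree is a nested tuple; a plain string is a leaf (one sampled plant).
    """
    clades = set()

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        clades.add(leaves)
        return leaves

    walk(tree)
    return clades


def monophyly_by_species(tree, species_of):
    """Map each species to True if its samples form exactly one clade."""
    clades = clade_leaf_sets(tree)
    members = defaultdict(set)
    for sample, species in species_of.items():
        members[species].add(sample)
    return {sp: frozenset(samples) in clades for sp, samples in members.items()}


# Toy example (invented data): species C is nested inside species B.
tree = (("a1", "a2"), ("b1", ("b2", ("c1", "c2"))))
species_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}

result = monophyly_by_species(tree, species_of)
share = sum(result.values()) / len(result)  # fraction of monophyletic species
```

In this toy tree, A and C each sit on their own branch, while the two B samples can only be grouped together by also including C, so B is not monophyletic.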
How many unique DNA changes mark each species
Next, the researchers examined 27 datasets in more detail to count species-specific single nucleotide polymorphisms, or SNPs, which are single-letter DNA changes fixed in one species but absent from close relatives. Across 462 species, 89 percent had at least one such unique SNP, with a typical density of about 193 unique SNPs per million DNA letters, though the range was wide. Some genera showed thousands of unique SNPs per million bases, while recently split groups had almost none. When species labels were shuffled randomly, the apparent signal of unique SNPs mostly vanished, showing that these markers reflect real biological differences rather than chance. Even species that did not form clean branches often carried some unique SNPs, suggesting useful diagnostic markers can exist even in complicated groups.
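The counting of unique SNPs and the label-shuffling control can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the aligned toy sequences, the sample names, and both function names are invented, and a site is counted for a species only when all of its samples share one letter that appears in no other sample.

```python
import random
from collections import defaultdict


def unique_snp_counts(alleles, species_of):
    """Count sites where a species is fixed for a letter found in no other sample."""
    groups = defaultdict(list)
    for sample, species in species_of.items():
        groups[species].append(sample)
    n_sites = len(next(iter(alleles.values())))
    counts = {species: 0 for species in groups}
    for site in range(n_sites):
        for species, samples in groups.items():
            own = {alleles[s][site] for s in samples}
            others = {alleles[s][site] for s in alleles if s not in samples}
            if len(own) == 1 and own.isdisjoint(others):
                counts[species] += 1
    return counts


def shuffled_counts(alleles, species_of, seed=0):
    """Repeat the count after randomly reassigning species labels to samples."""
    rng = random.Random(seed)
    samples = list(species_of)
    labels = [species_of[s] for s in samples]
    rng.shuffle(labels)
    return unique_snp_counts(alleles, dict(zip(samples, labels)))


# Toy alignment (invented data): five sites, two samples per species.
alleles = {
    "a1": "AACGT", "a2": "AACGT",
    "b1": "CACTT", "b2": "CACTT",
    "c1": "CAGTT", "c2": "CAGTC",
}
species_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}

observed = unique_snp_counts(alleles, species_of)  # {'A': 2, 'B': 0, 'C': 1}
shuffled = shuffled_counts(alleles, species_of, seed=1)
```

Species B has no unique SNP in this toy data, and the last site varies within species C, so it is not counted as fixed. Comparing `observed` with counts from many shuffles gives a rough sense of how much of the signal depends on the real species labels.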
How much DNA is enough to tell species apart
The authors then asked how many nuclear SNPs are needed, on average, to reach the same discrimination between species as in the full datasets. By repeatedly drawing random subsets of SNPs from 23 genera, they found that species separation improves quickly between about 100 and 500 SNPs, then levels off around 1500 SNPs, where roughly 90 percent of the distinguishable species are recovered. Around 3000 SNPs, almost all genera reach a clear plateau in performance. For studies that track whole genes rather than scattered SNPs, the pattern was similar: often 100 genes or fewer gave nearly the same power as hundreds of genes, and in several genera a single especially informative gene matched the performance of the full data. In two challenging groups, using only seven to nine of the best genes equaled the discrimination from more than 600 or 800 genes.
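The subsampling idea can be sketched as follows. The criterion used here is a stand-in chosen for illustration, not the measure used in the study: a species counts as discriminated if it keeps at least one fixed, private letter among the sampled sites. The toy data and function names are hypothetical.

```python
import random


def discriminated(alleles, species_of, sites):
    """Species that keep at least one fixed, private letter at the given sites."""
    found = set()
    for species in set(species_of.values()):
        own_samples = [s for s in alleles if species_of[s] == species]
        other_samples = [s for s in alleles if species_of[s] != species]
        for site in sites:
            own = {alleles[s][site] for s in own_samples}
            others = {alleles[s][site] for s in other_samples}
            if len(own) == 1 and own.isdisjoint(others):
                found.add(species)
                break
    return found


def subsampling_curve(alleles, species_of, sizes, replicates=100, seed=0):
    """Average share of full-data species recovered from random site subsets."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alleles.values())))
    full = discriminated(alleles, species_of, range(n_sites))
    if not full:
        return {k: 0.0 for k in sizes}
    curve = {}
    for k in sizes:
        total = 0.0
        for _ in range(replicates):
            subset = rng.sample(range(n_sites), k)  # draw k sites without replacement
            total += len(discriminated(alleles, species_of, subset)) / len(full)
        curve[k] = total / replicates
    return curve


# Toy alignment (invented data): five sites, two samples per species.
alleles = {
    "a1": "AACGT", "a2": "AACGT",
    "b1": "CACTT", "b2": "CACTT",
    "c1": "CAGTT", "c2": "CAGTC",
}
species_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}

curve = subsampling_curve(alleles, species_of, sizes=[1, 3, 5])
```

With real data, plotting `curve` against subset size would show where adding more SNPs stops improving recovery, which is the kind of plateau described above.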

What this means for future plant DNA tests
These results show that most plant species do form coherent genetic groups and usually carry some unique DNA changes in their nuclear genomes. They also reveal that high-resolution identification does not always require thousands of genes: a well-chosen set of a few to a few hundred nuclear regions, or a few thousand SNPs, can be enough. This opens the door to new, more powerful nuclear-based DNA tests that can better separate closely related species, improve environmental monitoring, and highlight where current names do not match genetic reality. Developing these tools will require coordinated efforts and more genome data, but the study gives a quantitative roadmap for building the next generation of plant DNA identification methods.
Citation: Huang, W., Li, DZ., Antonelli, A. et al. DNA-based identification of plants and the genomic nature of plant species differences. Commun Biol 9, 673 (2026). https://doi.org/10.1038/s42003-026-09858-7
Keywords: plant DNA barcoding, species identification, nuclear genome, biodiversity monitoring, genetic markers