Clear Sky Science
Microsatellite/SSR dataset: characterization of pear cultivars of the German Fruit Genebank
Why old pear trees still matter
All across Germany, old pear trees stand in orchards, gardens, and farm fields. Many belong to traditional varieties that have quietly fed families for generations, yet their true identities and relationships are often uncertain. This study describes a large, carefully checked dataset that reveals how these pears are related at the genetic level and how reliably they have been named, creating a solid foundation for conserving them and using them in future breeding and research.

Saving a living library of pears
Germany’s Fruit Genebank is a nationwide network that looks after traditional fruit varieties. Within this system, a special Pear Network focuses on preserving pears that are historically or culturally important to Germany, or that have valuable qualities such as flavor or storage life. The collections are spread across eight partner institutions, each maintaining living trees. Until now, however, the names and identities of many pear trees were based mainly on old records and local tradition, which can be unreliable. To turn this patchwork into a trustworthy "living library," the network set out to combine expert fruit description in the field with modern DNA analysis.
Looking closely at fruit and trees
Experienced pomologists (specialists in fruit varieties) visited the orchards over several years during the ripening season. They studied many features of the fruit, such as its shape, skin color, and russeting patterns, the stalk and the base of the stalk, and the form of the seeds. These details were compared with descriptions in historic books and with samples from the experts’ own reference collections. Several experts assessed each cultivar, discussed uncertain cases, and recorded under standardized rules how confident they were that a tree truly matched a given variety name. This step ensured that visible traits and historical knowledge were fully used before turning to genetic tools.
Reading the pears’ DNA fingerprints
In parallel, young leaves were collected from the same trees, frozen, and sent to a laboratory for molecular analysis. There, DNA was extracted and examined at 17 specific spots in the genome called microsatellites or simple sequence repeats (SSRs): short, repeated segments whose length differs among varieties. Using a streamlined test that could read several of these markers at once, the team produced a genetic fingerprint for each sample. Computer programs then compared all fingerprints, grouping together samples that were at least 80 percent identical, under the assumption that each group represented a single cultivar, including any bud sports, spontaneous mutants of a variety that these markers cannot tell apart from the original.
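As a rough illustration of this grouping step, the Python sketch below clusters fingerprints at the 80 percent threshold. The summary does not say how similarity was measured or how groups were formed, so two assumptions are made here: similarity is the share of markers with identical allele calls, and samples are linked transitively (single linkage). The marker names, tree names, and allele sizes are invented for the example.

```python
from itertools import combinations


def similarity(a, b):
    """Share of jointly scored markers with identical allele calls (assumed measure)."""
    shared = [m for m in a if m in b]
    if not shared:
        return 0.0
    same = sum(1 for m in shared if a[m] == b[m])
    return same / len(shared)


def cluster(profiles, threshold=0.8):
    """Group samples by single linkage: any pair at or above the threshold is joined."""
    parent = {name: name for name in profiles}

    def find(x):
        # Follow parent links to the group's root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in combinations(profiles, 2):
        if similarity(profiles[x], profiles[y]) >= threshold:
            parent[find(x)] = find(y)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical fingerprints: marker name -> allele sizes (made-up values).
tree_a = {"M1": (120, 124), "M2": (98, 102), "M3": (150, 150),
          "M4": (201, 205), "M5": (88, 92)}
tree_b = dict(tree_a, M5=(88, 94))  # differs from tree_a at one of five markers
tree_c = {"M1": (118, 130), "M2": (96, 100), "M3": (148, 152),
          "M4": (199, 203), "M5": (90, 96)}

profiles = {"tree_A": tree_a, "tree_B": tree_b, "tree_C": tree_c}
groups = cluster(profiles)
print(sorted(sorted(g) for g in groups))
```

With single linkage, two samples can land in the same group through a chain of intermediate matches even if they are less than 80 percent identical to each other; a stricter rule would require every pair in a group to pass the threshold.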

From many trees to one profile per variety
After removing poor-quality or mismatched entries, the researchers retained 1,945 samples grouped into 421 genetic clusters. For each cluster, they built a single representative DNA profile by taking, marker by marker, the allele (that is, the variant pattern) that occurred most often among the samples. In some cases this profile matched a real tree; in others it was a "synthetic" fingerprint that did not occur in any single tree but best summarized the group. They also recorded how frequent each chosen variant was, allowing users to spot cultivars or markers with high internal variation. Additional information, such as the trees’ number of chromosome sets (ploidy level) and an international code (PUNQ) that links pear genotypes across countries, was added to make the dataset easier to use worldwide.
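The majority rule described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' script: each sample is assumed to carry one allele call per marker, stored as a tuple, and the marker names and allele sizes are invented.

```python
from collections import Counter


def representative_profile(samples):
    """Return (profile, support) for one cluster.

    profile: marker -> most frequent allele call among the samples
    support: marker -> share of samples carrying that call
    """
    markers = sorted({m for s in samples for m in s})
    profile, support = {}, {}
    for m in markers:
        calls = [s[m] for s in samples if m in s]  # skip samples missing this marker
        call, count = Counter(calls).most_common(1)[0]
        profile[m] = call
        support[m] = count / len(calls)
    return profile, support


# Hypothetical cluster of three trees (made-up marker names and allele sizes).
samples = [
    {"M1": (120, 124), "M2": (98, 102), "M3": (150, 150)},
    {"M1": (120, 124), "M2": (98, 104), "M3": (150, 154)},
    {"M1": (120, 126), "M2": (98, 104), "M3": (150, 150)},
]

profile, support = representative_profile(samples)
print(profile)
print(support)
```

In this toy cluster the resulting profile is "synthetic": each tree disagrees with it at one marker, so no single tree carries it. A support value well below 1.0 marks a marker where the samples in the cluster disagree. When two calls are equally common, `Counter.most_common` returns the one encountered first, so a real pipeline would need an explicit tie rule.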
Checking quality and making data open
To ensure reliability, the raw data and the automatically generated representative profiles were independently reviewed. The team confirmed that more than 90 percent of representative profiles could be traced directly to at least one real tree, and only about 8 percent were purely synthetic summaries. Quality controls in the lab, repeat measurements, and comparison with established international reference varieties further strengthened confidence in the results. All data are stored in an open repository, along with explanations of each column and the computer scripts used to process the information, so that other researchers can reproduce and build on the work.
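One way to picture the traceability check is the small Python sketch below, which counts how many representative profiles are identical to at least one real tree in their cluster. The data layout (a list of pairs holding a representative profile and its member profiles) and all values are assumptions made for the example, not the published format.

```python
def is_traceable(representative, members):
    """True if at least one member tree carries exactly the representative profile."""
    return any(member == representative for member in members)


def traceable_share(clusters):
    """Share of clusters whose representative profile matches a real tree."""
    hits = sum(1 for rep, members in clusters if is_traceable(rep, members))
    return hits / len(clusters)


# Hypothetical clusters: (representative profile, list of member profiles).
clusters = [
    (
        {"M1": (120, 124), "M2": (98, 104)},
        [{"M1": (120, 124), "M2": (98, 104)},   # identical to the representative
         {"M1": (120, 124), "M2": (98, 102)}],
    ),
    (
        {"M1": (118, 130), "M2": (96, 100)},
        [{"M1": (118, 130), "M2": (96, 102)},   # differs at M2
         {"M1": (118, 132), "M2": (96, 100)}],  # differs at M1
    ),
]

share = traceable_share(clusters)
print(f"{share:.0%} of representative profiles match a real tree")
```

Here the first cluster's profile is found in a real tree and the second is a purely synthetic summary, so the share is one half.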
What this means for pears and people
For non-specialists, the outcome can be viewed as a cleaned, well-labeled catalog of Germany’s traditional pear varieties at the DNA level. Curators can now verify whether a tree is truly the variety its label claims; breeders can search for parents with useful traits; and scientists can study how pear diversity developed over time. Because the dataset is compatible with international coding systems, it also helps link German collections to pear resources in other countries. In short, this work transforms scattered orchard trees into a transparent, globally connected genetic resource that supports conservation, agriculture, and the enjoyment of diverse pears for years to come.
Citation: Broschewitz, L., Schramm, B., Flachowsky, H. et al. Microsatellite/SSR dataset: characterization of pear cultivars of the German Fruit Genebank. Sci Data 13, 391 (2026). https://doi.org/10.1038/s41597-026-07010-y
Keywords: pear genetic diversity, fruit genebank, microsatellite markers, cultivar identification, plant conservation