Clear Sky Science
Haplotype-resolved chromosome-level genome assembly of an autohexaploid oil camellia tree Camellia osmantha
Why a tree that makes cooking oil matters
Many people know camellias as pretty garden shrubs, but some of their relatives are workhorses of the kitchen. Oil squeezed from camellia seeds is prized in parts of Asia as a heart-friendly cooking oil rich in healthy fats and protective plant compounds. One rising star is Camellia osmantha, a tough, high‑yield tree that can produce far more oil per hectare than traditional varieties. To fully tap its potential, scientists need to understand its genetic blueprint. This study delivers exactly that: a detailed, high‑resolution map of the tree’s DNA, opening the door to better harvests, healthier oil, and trees that can thrive in a warming world.

A new oil tree with big promise
Camellia osmantha is a recently recognized species of oil camellia. It combines several traits farmers care about: strong tolerance to heat, cold, and drought, and unusually high oil production—about twice that of typical commercial camellia oil trees at just five years of age. Like many crop plants bred for yield, it has an especially complex genome: instead of the usual two copies of each chromosome, it carries six. This “autohexaploid” nature means its DNA is massive, around five times the size of the human genome, and full of repeated sequences. Such complexity has made it hard to build a clean, accurate genome map with earlier technologies.
Cracking a very large genetic puzzle
To tackle this challenge, the researchers combined several cutting‑edge DNA sequencing methods. Long, highly accurate reads from a PacBio HiFi platform provided stretches of genetic code thousands of letters long, while Hi‑C data captured how pieces of DNA are folded and packed inside the cell, clues that help stitch fragments together into whole chromosomes. They also collected RNA data from leaves to see which genes are actually switched on. Using new assembly algorithms designed for polyploid plants, the team pieced together a 14.38‑billion‑base‑pair genome and, crucially, separated it into six distinct but matching “haplotypes,” each representing one complete set of chromosomes.
Six full copies, seen clearly for the first time
The final assembly anchored 11.08 billion base pairs onto 90 long, chromosome‑like scaffolds, neatly grouped into six versions of 15 chromosomes. One version, called Haplotype 1, was especially complete and clean, with only a small number of gaps and benchmarks showing over 95% completeness. Across the genome, the scientists cataloged a vast landscape of repeated DNA, especially long terminal repeat elements that make up nearly half of the sequence. On top of this structural map, they identified 60,212 protein‑coding genes, and confirmed that almost all of them carry recognizable functional parts, suggesting that the gene set is both broad and reliable.
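For readers who like to check the numbers, the headline figures can be tied together with a few lines of arithmetic. The Python sketch below is only an illustration built from the values quoted in this summary. The function name `summarize` and the human genome size of roughly 3.1 billion base pairs are assumptions added here, not taken from the study.

```python
# Figures quoted in the text, in billions of base pairs (Gb).
ASSEMBLY_GB = 14.38        # total size of the assembled genome
ANCHORED_GB = 11.08        # sequence placed onto chromosome-like scaffolds
HAPLOTYPES = 6             # autohexaploid: six copies of each chromosome
CHROMOSOMES_PER_SET = 15   # chromosomes in one haplotype

# Assumption (not from the article): the human haploid genome is about 3.1 Gb.
HUMAN_GB = 3.1


def summarize():
    """Derive a few summary numbers from the quoted figures (hypothetical helper)."""
    scaffolds = HAPLOTYPES * CHROMOSOMES_PER_SET
    return {
        # Number of chromosome-like scaffolds implied by 6 sets of 15.
        "scaffolds": scaffolds,
        # Share of the assembly that was anchored onto those scaffolds.
        "anchored_fraction": ANCHORED_GB / ASSEMBLY_GB,
        # Average anchored scaffold length in millions of base pairs (Mb).
        "mean_scaffold_mb": ANCHORED_GB * 1000 / scaffolds,
        # Assembly size relative to the assumed human genome size.
        "ratio_to_human": ASSEMBLY_GB / HUMAN_GB,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

Under these assumptions, six sets of 15 chromosomes give the 90 scaffolds mentioned above, a little over three quarters of the assembled sequence is anchored, and the full assembly comes out between four and five times the size of the human genome.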
Genes linked to oil and flowering
With the genome in hand, the team looked specifically for genes tied to traits people care about. They found 3,269 transcription factors—key “control switches” for other genes—and 2,655 genes resembling known disease‑resistance genes, which may help breeders select trees that shrug off pests and pathogens. Most exciting from an agricultural standpoint, they pinpointed 80 genes involved in building oils and fats, including enzymes that start fat synthesis and others that fine‑tune the types of fatty acids stored in seeds. They also cataloged 497 genes related to flowering time and flower development, important levers for adapting trees to different climates and growing seasons.

A foundation for better trees and better oil
By resolving each of the six chromosome copies and carefully annotating tens of thousands of genes, this work turns an enormous, tangled mass of DNA into a usable reference manual for Camellia osmantha. Plant breeders and molecular biologists can now trace which versions of genes are linked to higher oil yield, better oil quality, stronger disease resistance, or resilience to heat and drought. In practical terms, the study provides a roadmap for developing new camellia oil varieties that are more productive, more robust, and better suited to feeding people in a changing climate—all starting from a clearer picture of what lies inside this remarkable tree’s cells.
Citation: Zhang, Z., Hao, B., Li, M. et al. Haplotype-resolved chromosome-level genome assembly of an autohexaploid oil camellia tree Camellia osmantha. Sci Data 13, 395 (2026). https://doi.org/10.1038/s41597-026-06786-3
Keywords: Camellia osmantha, plant genome, polyploid crops, edible oils, crop breeding