Clear Sky Science – Articles (en)

GENOME ASSEMBLY ARTICLES

Genome assembly is the process of reconstructing the full DNA sequence of an organism from many shorter fragments generated by sequencing technologies. It is central to modern genomics because most platforms cannot read entire chromosomes in a single pass.

Two major data types drive current assemblies. Short reads are highly accurate but only a few hundred bases long, which complicates reconstruction in repetitive regions. Long reads can span thousands to millions of bases, helping to bridge repeats and structural variants, but they usually have higher per-base error rates. Modern projects often combine both, using long reads to build the backbone and short reads to polish it.

Assembly typically begins with read preprocessing and error correction, followed by graph-based reconstruction. Overlaps or shared subsequences between reads are used to build contigs, which are contiguous stretches of sequence with no gaps. Additional information from long-range data such as mate-pair libraries, linked reads, Hi-C, or optical maps allows contigs to be ordered and oriented into scaffolds that approximate chromosomes.
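To make the idea of building contigs from overlaps concrete, here is a minimal sketch in Python of greedy suffix-prefix overlap merging. It is a toy illustration under strong assumptions (error-free reads, a single strand, no repeats), not how any production assembler works; real tools use overlap or de Bruijn graphs and must handle sequencing errors and reverse complements. The function names, the example reads, and the `min_len` threshold are all hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no such overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Toy assumption: reads are error-free and all come from the same strand.
    Returns the sequences (contigs) left when no overlap >= min_len remains.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                n = overlap(a, b, min_len)
                if n > best_len:
                    best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        # Join the two sequences, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads sampled from the sequence ATGGCGTACCTTA.
reads = ["ATGGCGT", "GCGTACC", "TACCTTA"]
contigs = greedy_assemble(reads, min_len=3)
print(contigs)
```

With these three reads, each adjacent pair shares four bases, so the sketch merges them into a single contig. In a repetitive genome the same greedy choice can join reads that do not belong together, which is one reason practical assemblers rely on graph structures and long-range data instead.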

Quality assessment is critical. Metrics such as N50, total assembly length, and the number of contigs summarize the contiguity and size of an assembly, while gene content analyses test biological completeness. Persistent challenges include segmental duplications, centromeres, telomeres, and other highly repetitive or structurally complex regions.
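As an illustration of the N50 metric, the sketch below uses the common definition: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the total assembly length. The function name and the contig lengths are made up for the example.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached half of the assembly
            return length


# Hypothetical contig lengths (in bases) for a tiny assembly.
contig_lengths = [80, 70, 50, 40, 30, 20, 10]
print("contigs:", len(contig_lengths))
print("total length:", sum(contig_lengths))
print("N50:", n50(contig_lengths))  # 80 + 70 = 150 is half of 300, so N50 is 70
```

A higher N50 means more of the assembly sits in long contigs, but the value says nothing about correctness: wrongly joined contigs also raise it, which is why it is read alongside completeness checks.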

Recent advances aim at producing telomere-to-telomere assemblies, capturing diploid variation by phasing maternal and paternal haplotypes, and improving algorithms to handle large genomes efficiently. These developments are transforming fields from evolutionary biology and agriculture to medical genetics by delivering increasingly accurate reference genomes and enabling precise variant discovery.

2026-04-01

A chromosome-scale genome assembly of Tigridiopalma magnifica

2026-03-23

A chromosomal-level genome assembly of Phoxinus grumi (Cypriniformes: Leuciscidae)

2026-03-20

Chromosome-scale Genome Assembly of the Critically Endangered Blue-crowned Laughingthrush (Pterorhinus courtoisi, Leiothrichidae)

2026-03-18

Chromosome-level genome assembly and annotation of the Rhinogobio ventralis, an endangered endemic fish from the Yangtze River

2026-03-06

First Chromosome-level Genome Assembly and Annotation of an Endangered Freshwater Stingray (Fontitrygon garouaensis) from Africa

2026-01-24

A chromosome level reference genome for the pecan weevil, Curculio caryae

A near telomere-to-telomere chromosome-level genome assembly of Rhodiola yunnanensis (Crassulaceae)

Chromosome-level genome assembly and annotation of the schizothoracine fish Gymnodiptychus pachycheilus

Chromosomal-level genome assembly of Ichthyurus bourgeoisi Gestro using PacBio HiFi and Hi-C sequencing

High-Quality Genome Assemblies of Two Prototheca wickerhamii Strains

A chromosome-level genome assembly of the South African indigenous, Kolbroek pig, Sus scrofa domesticus

High-quality metagenome assembly from nanopore reads with nanoMDBG

An Improved Chromosome-Level Genome Assembly and Comprehensive Annotation of the Model Ascidian Ciona savignyi

Chromosome-scale genome assembly and annotation of the water monitor lizard, Varanus salvator

Chromosome-level genome assembly of the deep-sea solemyid bivalve Acharax haimaensis

A chromosome-scale genome assembly of Wu’s rock agama (Laudakia wui) from low-altitude habitats

Near telomere-to-telomere genome assembly of the stone loach (Traccatichthys pulcher)

A telomere-to-telomere genome assembly of Castanopsis orthacantha (Fagaceae)

Chromosome-scale genome of the burrowing sea anemone Paracondylactis sinensis

A phased, near-telomere-to-telomere chromosome-scale reference genome of the Moroccan argan tree

A chromosome-scale genome assembly of the striped fruit fly Zeugodacus scutellatus (Diptera: Tephritidae)

Genome Assembly and Characterization of the Endangered Long-armed Scarab Beetle, Cheirotonus jansoni

Chromosome-level genome assembly of the social amoeba Heterostelium pallidum

Chromosome-level genome assembly of the medicinal plant Ophiorrhiza japonica Blume

Chromosome-level genome assembly and annotation of the critically endangered Siberian crane (Leucogeranus leucogeranus)

Chromosome-level genome assembly of the casuarina moth, Lymantria xylina Swinhoe (1903)

A chromosome-level genome assembly of Coryphaenoides armatus

A chromosome-level genome assembly for the mulberry thrips Pseudodendrothrips mori (Thysanoptera: Thripidae)

A high-quality chromosome level genome assembly of the South African indigenous Nguni goat (Capra hircus)

Fungal photobiont and microbiome genome composition in the Cladonia uncialis tripartite symbiosis

Haplotype-resolved chromosome-level genome assemblies of nineteen apple (Malus domestica Borkh.) cultivars

Telomere-to-telomere gap-free genome assembly of the Opsariichthys evolans (Cypriniformes: Cyprinidae)

Chromosome-level genome assembly of narrow-leaf bur-reed (Sparganium angustifolium Michx., Typhaceae)

Chromosome-level genome assemblies of two maize inbred lines with contrasting plant architectures

The Neijiang pig T2T genome reveals domestication history and germplasm traits of Southwest Chinese local breeds

A telomere-to-telomere genome assembly for Cyperus difformis

Chromosome-level genome assembly of the dwarf cattail Typha minima