Largest genome assembly in Brassicaceae: retrotransposon-driven genome expansion and karyotype evolution in Matthiola incana

Chen D, Yang T, Chen H, Zhang X, Huang F, Wan S, Lu Z, Liu C, Lei Y, Jiang H, Liao B, King GJ, Lysak MA, Tan C, Ge X.

Plant Biotechnology Journal (2025): 1-17.

Abstract

Matthiola incana, commonly known as stock or gillyflower, is a widely grown ornamental plant whose genome is substantially larger than those of other species in the mustard family. However, the evolutionary history behind such a large genome (~2 Gb) is still unknown. Here, we obtained a high-quality chromosome-scale genome assembly of M. incana by integrating PacBio HiFi reads, Illumina short reads and Hi-C data. The resulting assembly consists of seven pseudochromosomes with a total length of 1965 Mb and contains 38 245 gene models. Phylogenetic analysis indicates that M. incana and other taxa of the supertribe Hesperodae represent an early-diverging lineage in the evolutionary history of the Brassicaceae. Through comparative analysis, we revisited the ancestral Hesperodae karyotype (AHK, n = 7) and found several differences from the well-established ancestral crucifer karyotype (ACK, n = 8) model, including extensive inter- and intra-chromosomal rearrangements. Our results suggest that the primary reason for genome obesity in M. incana is the massive expansion of long terminal repeat retrotransposons (LTR-RTs), particularly from the Angela, Athila and Retand families. CHG methylation is markedly reduced in regions where the highest density of Copia-type LTR-RTs coincides with the lowest density of Gypsy-type LTR-RTs; these regions correspond to the putative centromeres. Based on insertion times and methylation profiling, recently inserted LTR-RTs were found to have a significantly different methylation pattern from that of older ones.
