TY  - JOUR
A1  - Bilgrav Saether, Kristine
A1  - Eisfeldt, Jesper
A1  - Bengtsson, Jesse D.
A1  - Lun, Ming Yin
A1  - Grochowski, Christopher M.
A1  - Mahmoud, Medhat
A1  - Chao, Hsiao-Tuan
A1  - Rosenfeld, Jill A.
A1  - Liu, Pengfei
A1  - Ek, Marlene
A1  - Schuy, Jakob
A1  - Ameur, Adam
A1  - Dai, Hongzheng
A1  - Undiagnosed Diseases Network
A1  - Hwang, James Paul
A1  - Sedlazeck, Fritz J.
A1  - Bi, Weimin
A1  - Marom, Ronit
A1  - Wincent, Josephine
A1  - Nordgren, Ann
A1  - Carvalho, Claudia M.B.
A1  - Lindstrand, Anna
T1  - Leveraging the T2T assembly to resolve rare and pathogenic inversions in reference genome gaps
Y1  - 2024/11/01
JF  - Genome Research
JO  - Genome Research
SP  - 1785
EP  - 1797
DO  - 10.1101/gr.279346.124
VL  - 34
IS  - 11
UR  - http://genome.cshlp.org/content/34/11/1785.abstract
N2  - Chromosomal inversions (INVs) are particularly challenging to detect due to their copy-number neutral state and association with repetitive regions. Inversions represent about 1/20 of all balanced structural chromosome aberrations and can lead to disease by gene disruption or altering regulatory regions of dosage-sensitive genes in cis. Short-read genome sequencing (srGS) can only resolve ∼70% of cytogenetically visible inversions referred to clinical diagnostic laboratories, likely due to breakpoints in repetitive regions. Here, we study 12 inversions by long-read genome sequencing (lrGS) (n = 9) or srGS (n = 3) and resolve nine of them. In four cases, the inversion breakpoint region was missing from at least one of the human reference genomes (GRCh37, GRCh38, T2T-CHM13) and a reference agnostic analysis was needed. One of these cases, an INV9 mappable only in de novo assembled lrGS data using T2T-CHM13 disrupts EHMT1 consistent with a Mendelian diagnosis (Kleefstra syndrome 1; MIM#610253). Next, by pairwise comparison between T2T-CHM13, GRCh37, and GRCh38, as well as the chimpanzee and bonobo, we show that hundreds of megabases of sequence are missing from at least one human reference, highlighting that primate genomes contribute to genomic diversity. Aligning population genomic data to these regions indicated that these regions are variable between individuals. Our analysis emphasizes that T2T-CHM13 is necessary to maximize the value of lrGS for optimal inversion detection in clinical diagnostics. These results highlight the importance of leveraging diverse and comprehensive reference genomes to resolve unsolved molecular cases in rare diseases.
ER  - 