RT Journal
A1 Xiao, Lucinda C.
A1 Semwal, Ayush
A1 St John, Brianna
A1 Zeglinski, Kathleen
A1 Su, Shian
A1 Lancaster, James
A1 Xue, Shifeng
A1 Reversade, Bruno
A1 Ritchie, Matthew E.
A1 Magdinier, Frédérique
A1 Blewitt, Marnie E.
A1 Gouil, Quentin
T1 Complete genetic and epigenetic architecture of D4Z4 macrosatellites in FSHD, BAMS, and reference cohorts with D4Z4End2End
JF Genome Research
JO Genome Research
YR 2026
FD March 23
DO 10.1101/gr.280907.125
UL http://genome.cshlp.org/content/early/2026/03/20/gr.280907.125.abstract
AB The D4Z4 locus is a macrosatellite array on Chromosome 4q normally comprising 8 to >100 3.3-kb repeat units. Its size and repetitiveness render it refractory to most sequencing technologies; consequently, its genetic and epigenetic architectures remain incompletely understood despite their relevance to facioscapulohumeral muscular dystrophy (FSHD). Current FSHD molecular testing relies on complex, multistep and low-resolution assays, which aim to identify contractions on permissive haplotypes (FSHD type 1) or epigenetic reactivation due to pathogenic variants in the epigenetic machinery, most often in SMCHD1 (FSHD type 2). Recent guideline updates highlight the need for more accurate and comprehensive diagnostic approaches. Here, we leverage ultra-long whole-genome and Cas9-targeted sequencing to develop a fast and accurate workflow, D4Z4End2End, for comprehensive genetic and methylation analysis of D4Z4 alleles. We apply it to samples from two controls, four FSHD1 patients, four FSHD2 patients, and two patients with Bosma arhinia microphthalmia syndrome (BAMS) caused by SMCHD1 variants, as well as publicly available data from 30 B-lymphoblastoid cell lines from the 1000 Genomes Project and Human Pangenome Reference Consortium. We attain high-depth sequencing of full-length D4Z4 arrays of up to 40 repeat units (∼132 kb), accurately capture contracted arrays, genetic mosaicism, and pathogenic SMCHD1 variants, and generate consensus sequences of all D4Z4 alleles. We identify new allelic variants, analyze complex D4Z4 rearrangements including in-cis duplications, and reveal length- and SMCHD1-dependent methylation patterns across the D4Z4 array. Our findings offer insights into D4Z4 genetics and epigenetics, and demonstrate the potential of long-read nanopore sequencing to accelerate FSHD research and diagnostics.