Figure 5.

A 120 kb SV in IGHC. (A) Alignment of PacBio HiFi reads from healthy donor 1 (HD1) to the hg38 reference genome. The top IGV track shows PacBio HiFi reads grouped by origin: the centromeric (H1.1) and telomeric (H1.2) sides of the duplication in haplotype 1 (H1), and haplotype 2 (H2). The middle track shows a single ultra-long ONT read (>246 kb) from H1 spanning the duplication, with half of it aligning as a supplementary read. The bottom track shows the coordinates of the constant genes. (B) Schematic representation of the large duplication involving four constant genes (IGHE–IGHA1) within HD1 H1. (C,D) Gene usage frequency of each subisotype as determined by FLAIRR-seq in HD1. (C) The duplicated genes are stratified by the part of the duplication from which they originate, except for IGHG4, for which the H2 and H1.1 alleles were identical (ambiguous). (D) Frequency of usage of the nonduplicated genes from each haplotype. (E) Representative phylogenetic trees from clonal families containing both copies of duplicated constant genes. Branch lengths indicate mutational distances between nodes, and node colors denote subisotype and gene location.
