
Confirmation of HaplotypeSV TSM loci with short-read data. (A) The ratio of REF to ALT allele mapping coverage (r = depthREF/depthALT) reflects the genotype: r ≈ 1 (and thus log10(r) ≈ 0) for a heterozygote; r ≫ 1 for a REF homozygote and r ≪ 1 for an ALT homozygote. The log10 ratios agree between the NA12878 Platinum and Chromium data sets, and 95.5% of the 883 loci identified in HaplotypeSV data are called as containing at least one ALT allele with a posterior probability of at least 0.99 in both data sets; 17 loci are called homozygous REF in at least one data set and 10 in both (top-right corner). (B) Of the 783 variable loci in the NA12878 Chromium data, all but one are called as containing at least one ALT allele in the parents. For the 290 loci homozygous for ALT in NA12878 (orange), both parents contain at least one ALT allele. “Unknown” indicates inconsistent variant data or a posterior probability below 0.99. One pseudocount was added to all values to avoid division by zero. In the legend, the inferred genotypes for the two data sets are separated by a colon, and X1/X2 represents the alternative arrangements of the two alleles, X1|X2 and X2|X1.
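As a minimal sketch of the coverage ratio described in panel A, the log10 ratio with a pseudocount could be computed as below. The function name, the read depths, and the choice to add the pseudocount to both depths before dividing are illustrative assumptions; the posterior-probability genotype calling is not reproduced here.

```python
import math


def log10_depth_ratio(depth_ref, depth_alt, pseudocount=1):
    """Return log10(r) with r = (depth_ref + pseudocount) / (depth_alt + pseudocount).

    Assumption: the pseudocount is added to both depths before division,
    so that a depth of zero never causes a division by zero.
    """
    return math.log10((depth_ref + pseudocount) / (depth_alt + pseudocount))


# Hypothetical read depths, for illustration only.
balanced = log10_depth_ratio(30, 30)  # equal coverage: r = 1, heterozygote-like
ref_only = log10_depth_ratio(60, 0)   # r >> 1, homozygous REF-like
alt_only = log10_depth_ratio(0, 60)   # r << 1, homozygous ALT-like
```

With equal REF and ALT coverage the value is zero, while the two homozygote-like cases fall symmetrically on either side of zero, which is the pattern the scatter plot relies on.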











