RT Journal
A1 Vaddadi, Kavya
A1 Lin, Mao-Jan
A1 Majidian, Sina
A1 Mun, Taher
A1 Langmead, Ben
T1 Minimizing reference bias with an imputed personalized reference
JF Genome Research 
JO Genome Research 
YR 2026 
FD March 05 
DO 10.1101/gr.280989.125 
SP gr.280989.125 
UL http://genome.cshlp.org/content/early/2026/03/05/gr.280989.125.abstract 
AB Pangenome indexes reduce reference bias in sequencing data analysis. However, bias can be reduced further by using a personalized reference, e.g., a diploid human reference constructed to match a donor individual's alleles. We present a new impute-first alignment framework that combines elements of genotype imputation and read alignment. We first genotype the individual using a subsample of the input reads. Using a reference panel and an efficient imputation algorithm, we then impute a personalized diploid reference. Finally, we index the personalized reference and apply a read aligner (either linear or graph) to align the full read set to it. On the HG002 sample, this framework achieves a higher variant-calling F1 score (99.77%) than a traditional linear aligner (99.62%), a graph pangenome aligner (99.72%), and a graph personalized-pangenome aligner (99.75%), with a substantial reduction in the number of errors (38.73% compared to a linear aligner, 14.97% compared to a graph aligner, and 6.05% compared to a personalized graph). An imputed reference can have comparable efficiency to a pangenome reference, making it an overall advantageous choice for whole-genome DNA sequencing experiments. Advantages of our impute-first approach include that it (a) fully considers linkage disequilibrium and produces a phased diploid reference as an output, (b) produces accurate personalized references even from low-coverage data, and (c) is compatible with both graph and linear reference representations, achieving its highest variant-calling F1 accuracy using a standard linear aligner (BWA-MEM).