Table 1.

The steps, inputs, outputs, and tools used in the tested impute-first workflows

| Step | Input | Tool | Output |
|---|---|---|---|
| A. Personalization | | | |
| Read sampling | Donor reads: whole-genome DNA-seq reads (Baid et al. 2020) from HG001/NA12878, HG002/NA24385, HG003/NA24149, HG004/NA24143, and HG005/NA24631 | seqtk (https://github.com/lh3/seqtk) | Reads sampled to 0.01×, 0.05×, 0.1×, 0.2×, 0.5×, 1×, 2×, 5×, 10×, and 20× average coverage |
| Alignment and genotyping | Genotype panels: HGSVC2 (Ebert et al. 2021), HGSVC3 (Logsdon et al. 2025), HPRC_filtered VCF (Ebler 2022), excluding respective samples and family members; reference: GRCh38 primary assembly (Church et al. 2015); reads: output from sampling step | Bowtie 2 (Langmead and Salzberg 2012) + BCFtools (Li 2011) | Rough genotype calls in VCF format |
| Imputation | Imputation panel and reference: same as previous step; genotype calls: output from genotyping step | Beagle (v5.1) (Browning et al. 2018); Glimpse (v1.0.0) (Rubinacci et al. 2021) | Personalized reference as phased VCF file |
| Personalized reference construction | Personalized reference: from imputation step; reference: GRCh38 primary assembly | BCFtools (bcftools consensus) | Personalized reference as diploid FASTA |
| B. Downstream analysis | | | |
| B.1. Variation-graph reference | | | |
| Graph construction and Indexing | Personalized reference as phased VCF file: from construction step; reference: GRCh38 primary assembly | vg (v1.55.0) autoindex (Garrison et al. 2018) | Indexed graph reference |
| Alignment and Lifting | Donor reads; graph reference: from previous step | vg (v1.55.0) surject (Sirén et al. 2021) | Aligned reads |
| Variant calling and Evaluation | Aligned reads: from previous step; true variants: HG001, HG002, HG003, HG004, and HG005 VCF from GIAB (Zook et al. 2016) high-confidence region annotations, etc. | DeepVariant v1.5.0 (Poplin et al. 2018); hap.py v0.3.15 (The Global Alliance for Genomics and Health Benchmarking Team et al. 2019) | Variant calls as VCF; benchmarking metrics |
| B.2. Multi-linear-haplotype reference | | | |
| Indexing | Personalized reference: From construction step; T2T-CHM13v1.0 genome assembly (Nurk et al. 2022) | bwa index (Li 2013) | Indexed reference |
| Alignment and Lifting | Donor reads HG001/NA12878, HG002/NA24385, HG003/NA24149, HG004/NA24143, and HG005/NA24631; indexed reference: from previous step | bwa mem (Li 2013) and levioSAM2 lift and levioSAM2 reconcile (Chen et al. 2024) | Aligned reads |
| Variant calling and Evaluation | Aligned reads: from previous step; true variants: HG001, HG002, HG003, HG004, and HG005 VCF from GIAB high-confidence region annotations, etc. | DeepVariant v1.5.0; hap.py v0.3.15 | Variant calls as VCF; benchmarking metrics |
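
The read-sampling step in part A of the table can be illustrated with a minimal Python sketch. It converts each target coverage into the fraction of reads to keep and assembles a `seqtk sample` argument list. The table does not state the full depth of the donor reads, the random seed, or any file names, so `SOURCE_COV`, the seed value, and `donor_R1.fastq.gz` below are hypothetical placeholders, and the helper names are introduced here for illustration only. The sketch builds the commands without running them.

```python
from typing import List

# Target coverages listed in the read-sampling row of Table 1.
TARGET_COVERAGES = [0.01, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20]


def sampling_fraction(target_cov: float, source_cov: float) -> float:
    """Fraction of reads to keep so that source_cov is reduced to target_cov."""
    if source_cov <= 0 or target_cov <= 0:
        raise ValueError("coverages must be positive")
    if target_cov > source_cov:
        raise ValueError("target coverage exceeds the source coverage")
    return target_cov / source_cov


def seqtk_sample_cmd(fastq: str, fraction: float, seed: int = 11) -> List[str]:
    """Argument list for `seqtk sample -s<seed> <in.fq> <frac>`.

    The seed value is an assumption; using the same seed for both mate
    files keeps paired reads in sync.
    """
    return ["seqtk", "sample", f"-s{seed}", fastq, f"{fraction:.6f}"]


if __name__ == "__main__":
    SOURCE_COV = 40.0  # hypothetical full depth of the donor read set
    for cov in TARGET_COVERAGES:
        frac = sampling_fraction(cov, SOURCE_COV)
        # Print the command instead of executing it.
        print(" ".join(seqtk_sample_cmd("donor_R1.fastq.gz", frac)))
```

For paired-end data, the same command would be issued once per mate file with an identical seed. The actual seed and source depth used in the study are not given in the table.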