Results on HiFi reads
| Data set | Assembler | Size (Mb) | LG50 | LG90 | NG50 (Mb) | NGA50 (Mb) | Complete (%) | Duplicated (%) | QV | # misasm (structural) | # misasm (local) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CHM13 | GNNome | 3051 | 12 | 31 | 111.3 | 111.0 | 99.53 | 0.71 | 54.24 | 44 | 86 |
| CHM13 | Hifiasm | 3052 | 12 | 32 | 87.7 | 87.7 | 99.55 | 0.70 | 55.86 | 23 | 101 |
| CHM13 | HiCanu | 3297 | 16 | 57 | 69.7 | 69.7 | 99.54 | 2.79 | 43.30 | 24 | 51 |
| CHM13 | Verkko | 3030 | 101 | 348 | 9.4 | 9.4 | 99.44 | 0.77 | 51.61 | 43 | 30 |
| M. musculus | GNNome | 2643 | 38 | 140 | 23.0 | 19.3 | 99.62 | 3.30 | 45.40 | 707 | 1053 |
| M. musculus | Hifiasm | 2613 | 40 | 150 | 21.1 | 18.7 | 99.62 | 1.93 | 45.67 | 706 | 1007 |
| M. musculus | HiCanu | 2651 | 67 | 271 | 11.2 | 10.5 | 99.56 | 2.65 | 43.77 | 781 | 1205 |
| M. musculus | Verkko | 2609 | 54 | 204 | 15.9 | 14.6 | 99.60 | 1.95 | 45.72 | 705 | 1005 |
| A. thaliana | GNNome | 139 | 5 | 13 | 12.4 | 12.4 | 99.89 | 1.09 | 52.08 | 129 | 90 |
| A. thaliana | Hifiasm | 151 | 5 | 13 | 12.4 | 12.4 | 99.90 | 1.07 | 44.52 | 342 | 56 |
| A. thaliana | HiCanu | 152 | 6 | 16 | 8.6 | 8.6 | 99.87 | 3.20 | 40.30 | 106 | 52 |
| A. thaliana | Verkko | 158 | 6 | 18 | 10.3 | 10.3 | 99.87 | 1.04 | 39.75 | 229 | 54 |
| G. gallus | GNNome | 1114 | 31 | 135 | 10.8 | 10.1 | 95.79 | 2.99 | 49.35 | 2434 | 8391 |
| G. gallus | Hifiasm | 1087 | 27 | 123 | 11.5 | 11.4 | 96.14 | 2.03 | 51.08 | 2164 | 7231 |
| G. gallus | Verkko | 1041 | 83 | 410 | 3.8 | 3.7 | 95.44 | 1.08 | 49.65 | 1340 | 3819 |

The best-achieved results are in bold. Size is the total length of the assembly. The reference lengths are 3054 Mb, 2728 Mb, 133 Mb, and 1053 Mb for CHM13 (v1.1), M. musculus (GRCm39), A. thaliana (Col-XJTU), and G. gallus (bGalGal1 maternal), respectively. LG50 (LG90) is the smallest number of contigs that together cover 50% (90%) of the genome. NG50 and NGA50 were computed with minigraph (Li et al. 2020). “Complete” gives the percentage of reference single-copy genes that are found in the assembly, while “duplicated” gives the percentage of reference single-copy genes that align to multiple positions in the assembly. Both were computed with compleasm (Huang and Li 2023). Quality value (QV) measures per-base consensus accuracy and was computed with yak by comparing k-mers in contigs to k-mers found in short reads (Cheng et al. 2021). Short reads were not available for G. gallus, so we computed its QV with PacBio HiFi reads instead. The numbers of structural and local misassemblies (# misasm) were computed with QUAST (Mikheenko et al. 2018). The full QUAST report for the HiFi data is given in Supplemental Table S1.
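
For readers less familiar with these contiguity and accuracy measures, the definitions above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce the table: the function names `lgx_ngx` and `qv_to_error_rate` and the toy contig lengths are hypothetical. The sketch assumes that "genome" in the LG50 definition means the reference length, that NGx is the length of the last contig needed to reach x% of that length, and that QV is reported on the usual Phred scale (QV = -10 log10 of the per-base error rate).

```python
from typing import Optional, Sequence, Tuple


def lgx_ngx(contig_lengths: Sequence[int], genome_size: int,
            x: float = 50.0) -> Optional[Tuple[int, int]]:
    """Return (LGx, NGx) for a set of contigs, or None if undefined.

    LGx: smallest number of contigs, taken longest first, whose summed
         length reaches x% of genome_size.
    NGx: length of the last (shortest) contig in that set.
    """
    target = genome_size * x / 100.0
    covered = 0
    # Walk the contigs from longest to shortest, accumulating length.
    for count, length in enumerate(sorted(contig_lengths, reverse=True), start=1):
        covered += length
        if covered >= target:
            return count, length
    # The assembly is too short to cover x% of the genome.
    return None


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error rate."""
    return 10 ** (-qv / 10)


if __name__ == "__main__":
    toy_contigs = [40, 30, 20, 10]   # hypothetical contig lengths
    toy_genome_size = 100            # hypothetical reference length
    print(lgx_ngx(toy_contigs, toy_genome_size, 50))
    print(lgx_ngx(toy_contigs, toy_genome_size, 90))
    print(qv_to_error_rate(50.0))
```

With the toy input, two contigs (40 + 30) are needed to reach half of the genome size, so LG50 is 2 and NG50 is 30. Under the Phred assumption, a QV of 50 corresponds to roughly one error per 100,000 bases. NGA50 is not covered here because it additionally requires breaking contigs at alignment boundaries against the reference.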











