Duplex + ultra-long curated assembly statistics for S. lycopersicum and Z. mays compared to existing reference genomes
| Assembly | Total BP (Mbp) | Contigs | Contig NG50 (Mbp) | LAI | Gaps | QV | Errors | T2T ctgs |
|---|---|---|---|---|---|---|---|---|
| Solanum lycopersicum Heinz 1706 | ||||||||
| Reference SL5.0 | 801.78 | 73 | 41.70 | 15.80 | 60 | 60.77 | 14 | 0/12 |
| Verkko + curation | 814.61 | 20 | 68.51 | 15.89 | 2 | 51.81 | 7 | 11/12 |
| Zea mays B73 | ||||||||
| Reference Zm5.0 | 2178.29 | 1393 | 47.04 | 29.12 | 708 | 52.18 | 93 | 0/10 |
| Verkko + curation | 2192.15 | 26 | 209.62 | 30.35 | 9 | 60.55 | 26 | 6/10 |
Total BP: the total length of assembly bases, in megabases. Contigs: the number of sequences in the assembly after splitting at gaps of at least three Ns. Contig NG50: the length of the shortest contig such that half of the genome is in contigs of this length or greater. LAI: the LTR assembly index (Ou et al. 2018) for each assembly; higher is better. Gaps: the total number of gaps (runs of at least three Ns) in the assembly; lower is better. QV: the Phred (Ewing and Green 1998) log-scaled quality score calculated using Merqury (Rhie et al. 2020); higher is better. Errors: an estimate of assembly errors based on VerityMap alignments and discordant k-mers (Mikheenko et al. 2020); lower is better. T2T ctgs: the count of telomere-to-telomere contigs for each assembly; higher is better. A contig is defined as T2T if it has the canonical telomere sequence (TTTAGGG) within 10 kbp of both its start and its end and contains no gaps. Bold denotes the best result for each metric and species.
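The definitions above can be illustrated with a minimal Python sketch. This is not the pipeline used to produce the table (the QV and error counts come from Merqury and VerityMap); it only restates the legend's definitions as code. All function names are hypothetical, and the `min_copies` parameter is an assumption, since the legend does not say how many telomere repeat copies are required.

```python
import re
from math import log10

# A gap is a run of at least three Ns, as in the table legend.
GAP = re.compile(r"N{3,}", re.IGNORECASE)
TELOMERE = "TTTAGGG"  # canonical plant telomere repeat from the legend


def split_contigs(scaffold: str) -> list[str]:
    """Split a scaffold into contigs at gaps of three or more Ns."""
    return [piece for piece in GAP.split(scaffold) if piece]


def count_gaps(scaffold: str) -> int:
    """Count gaps (runs of three or more Ns) in a scaffold."""
    return len(GAP.findall(scaffold))


def ng50(contig_lengths: list[int], genome_size: int) -> int:
    """Shortest contig length such that contigs of this length or
    greater cover at least half of genome_size (0 if never reached)."""
    covered = 0
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return 0


def phred_qv(error_rate: float) -> float:
    """Phred-scaled quality: QV = -10 * log10(per-base error rate)."""
    return -10 * log10(error_rate)


def qv_to_error_rate(qv: float) -> float:
    """Invert the Phred scale to a per-base error rate."""
    return 10 ** (-qv / 10)


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def has_telomere(window: str, min_copies: int = 2) -> bool:
    """True if the window holds min_copies tandem repeats on either strand.
    min_copies is an assumed threshold, not taken from the source."""
    window = window.upper()
    forward = TELOMERE * min_copies
    return forward in window or revcomp(forward) in window


def is_t2t(contig: str, window: int = 10_000, min_copies: int = 2) -> bool:
    """Gapless contig with telomere repeats within `window` bp of both ends."""
    if GAP.search(contig):
        return False
    return (has_telomere(contig[:window], min_copies)
            and has_telomere(contig[-window:], min_copies))
```

On the Phred scale, a QV of 60 corresponds to one expected error per million bases, so the Z. mays Verkko assembly's QV of 60.55 implies roughly one error per 1.1 million bases (`1 / qv_to_error_rate(60.55)`).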











