Fast and space-efficient taxonomic classification of long reads with hierarchical interleaved XOR filters

Table 1.

Results of computational requirement benchmark

RefSeq

Method      Build time (mm:ss)  Build RAM (GB)  Index size (GB)  Query time (mm:ss)  Query RAM (GB)
Centrifuge  344:50              385.30          18.4             20:14               18.4
MetaMaps    287:17              331.55          254.5            315:04              334.9
Kraken2     83:06               27.42           26.6             2:50                28.6
KMCP        7:01                5.84            16.1             36:49               17.2
Ganon2      29:13               53.34           36.3             3:12                39.1
Taxor       81:39               13.38           9.8              3:35                9.8

GTDB

Method      Build time (hh:mm)  Build RAM (GB)  Index size (GB)  Query time (mm:ss)  Query RAM (GB)
Centrifuge  -                   -               -                -                   -
MetaMaps    -                   -               -                -                   -
Kraken2     08:30               182.7           180.4            4:46                183.6
KMCP        00:49               24.61           94.3             353:10              96.0
Ganon2      03:40               448.18          320.9            4:11                324.9
Taxor       17:58               87.01           71.4             5:09                71.4
  • Reference indexes of a RefSeq database consisting of 21,003 bacterial, viral, archaeal, and fungal genomes and of the whole GTDB database were built for all tools. For index construction, we measured the elapsed time, peak memory usage, and resulting index size. For the query benchmark, we measured the elapsed time and peak memory usage for classifying 426,213 ONT reads. Build and query times were measured using 30 threads on an HPC node. Bold numbers mark the fastest time, lowest memory requirements, or smallest index size among the benchmarked tools.
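Because the table reports RefSeq build times in mm:ss but GTDB build times in hh:mm, the two columns cannot be compared directly. The following minimal sketch converts both to seconds for the four tools that have values in both databases. The helper name `to_seconds` and the dictionary layout are illustrative assumptions and are not part of any benchmarked tool. The values are copied from Table 1.

```python
def to_seconds(value: str, fmt: str) -> int:
    """Convert a 'mm:ss' or 'hh:mm' table entry to seconds (hypothetical helper)."""
    major, minor = (int(part) for part in value.split(":"))
    if fmt == "mm:ss":
        return major * 60 + minor
    if fmt == "hh:mm":
        return major * 3600 + minor * 60
    raise ValueError(f"unsupported format: {fmt}")


# Build times copied from Table 1: (RefSeq in mm:ss, GTDB in hh:mm).
build_times = {
    "Kraken2": ("83:06", "08:30"),
    "KMCP": ("7:01", "00:49"),
    "Ganon2": ("29:13", "03:40"),
    "Taxor": ("81:39", "17:58"),
}

# Normalize both columns to seconds so they share one unit.
build_seconds = {
    tool: (to_seconds(refseq, "mm:ss"), to_seconds(gtdb, "hh:mm"))
    for tool, (refseq, gtdb) in build_times.items()
}

for tool, (refseq_s, gtdb_s) in build_seconds.items():
    ratio = gtdb_s / refseq_s
    print(f"{tool}: RefSeq {refseq_s} s, GTDB {gtdb_s} s, ratio {ratio:.1f}x")
```

The hh:mm entries carry no seconds field, so the converted GTDB values are only accurate to the minute.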

This Article

  1. Genome Res. 34: 914-924
