RT Journal A1 Zhang, Wenhai A1 Liu, Yuansheng A1 Li, Guangyi A1 Xu, Jialu A1 Chen, Enlian A1 Schönhuth, Alexander A1 Luo, Xiao T1 Strain-level metagenomic profiling using pangenome graphs with PanTax JF Genome Research JO Genome Research YR 2026 FD February 01 VO 36 IS 2 SP 405 OP 420 DO 10.1101/gr.280858.125 UL http://genome.cshlp.org/content/36/2/405.abstract AB Microbes are omnipresent, thriving in a range of habitats, from oceans to soils, and even within our gastrointestinal tracts. They play a vital role in maintaining ecological equilibrium and promoting the health of their hosts. Consequently, understanding the diversity in terms of strains in microbial communities is crucial, as variations between strains can lead to different phenotypic expressions or diverse biological functions. However, current methods for taxonomic classification from metagenomic sequencing data have several limitations, including their reliance solely on species resolution, support for either short or long reads, or their confinement to a given single species. Most notably, most existing strain-level taxonomic classifiers rely on the sequence representation of multiple linear reference genomes, which fails to capture the sequence correlations among these genomes, potentially introducing ambiguity and biases in metagenomic profiling. Here, we present PanTax, a pangenome graph-based taxonomic profiler that overcomes the shortcomings of sequence-based approaches, because pangenome graphs possess the capability to depict the full range of genetic variability present across multiple evolutionarily or environmentally related genomes. PanTax provides a comprehensive solution to taxonomic classification for strain resolution, compatibility with both short and long reads, and compatibility with single or multiple species. Extensive benchmarking results demonstrate that PanTax drastically outperforms state-of-the-art approaches, primarily evidenced by its significantly higher F1 score at the strain level, while maintaining comparable or better performance in other aspects across various data sets.