Daehwan Kim; Li Song; Florian P. Breitwieser; Steven L. Salzberg

Centrifuge: rapid and sensitive classification of metagenomic sequences

Figure 1.

Compression of genome sequences before building the Centrifuge index. All genomes are compared, and similarities are computed based on shared 53-mers. In the figure, genomes G₁ and G₂ are the most similar pair. Sequences of G₂ that are ≥99% identical to G₁ are discarded, and the remaining “unique” sequences from G₂ are added to genome G₁, creating a merged genome, G₁₊₂. Similarity between all genomes is then recomputed using the merged genomes. Sequences in genome G₃ that are <99% identical to G₁₊₂ are then added to the merged genome, creating genome G₁₊₂₊₃. This process repeats for the entire Centrifuge database until each merged genome has no sequences ≥99% identical to any other genome.
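
The merge step described in the caption can be illustrated with a toy sketch. This is not Centrifuge's implementation: the function names, the fixed-size windows, and the use of the fraction of shared k-mers as a stand-in for "≥99% identical" are all assumptions made for illustration. Only the k-mer length of 53 and the 99% threshold come from the caption.

```python
from itertools import combinations

K = 53  # k-mer length stated in the caption


def kmers(seq, k=K):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def most_similar_pair(genomes, k=K):
    """Return the names of the two genomes that share the most k-mers."""
    sets = {name: kmers(seq, k) for name, seq in genomes.items()}
    return max(combinations(sorted(genomes), 2),
               key=lambda pair: len(sets[pair[0]] & sets[pair[1]]))


def unique_segments(base, other, k=K, window=200, threshold=0.99):
    """Keep windows of `other` that are not near-identical to `base`.

    Assumption: a window counts as ">= 99% identical" when at least
    `threshold` of its k-mers also occur in `base`. This is a crude
    proxy chosen for the sketch, not the method used by Centrifuge.
    """
    base_kmers = kmers(base, k)
    kept = []
    for start in range(0, len(other), window):
        segment = other[start:start + window]
        segment_kmers = kmers(segment, k)
        if not segment_kmers:
            kept.append(segment)  # too short to assess, keep it
            continue
        shared = len(segment_kmers & base_kmers) / len(segment_kmers)
        if shared < threshold:
            kept.append(segment)
    return kept


def merge_pair(base, other, k=K, window=200, threshold=0.99):
    """Append the unique windows of `other` to `base` (G1 + unique G2)."""
    return base + "".join(unique_segments(base, other, k, window, threshold))
```

With a small k and short toy sequences, `most_similar_pair` picks the two genomes with the largest k-mer overlap, and `merge_pair` appends only the windows of the second genome that the first does not already cover. Repeating the two calls on the merged genomes, until no pair shares near-identical windows, mirrors the loop the caption describes.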

Published in Advance October 17, 2016, doi: 10.1101/gr.210641.116 Genome Res. 2016. 26: 1721-1729