Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons

  1. Bruce W Birren1
  1. 1 Broad Institute;
  2. 2 Baylor College of Medicine;
  3. 3 Washington University School of Medicine;
  4. 4 J. Craig Venter Institute;
  5. 5 Lawrence Berkeley National Laboratory;
  6. 6 University of Colorado
  1. * Corresponding author; email: bhaas{at}broadinstitute.org

Abstract

Bacterial diversity among environmental samples is commonly assessed with PCR-amplified 16S rRNA gene (16S) sequences. Perceived diversity, however, can be influenced by sample preparation, primer selection and formation of chimeric 16S amplification products. Chimeras are hybrid products between multiple parent sequences that can be falsely interpreted as novel organisms, thus inflating apparent diversity. We developed a new chimera detection tool called Chimera Slayer (CS). CS detects chimeras with greater sensitivity than previous methods, performs well on short sequences such as those produced by the 454 Genome Sequencer, and can scale to large data sets. By benchmarking CS performance against sequences derived from a controlled DNA mixture of known organisms and a simulated chimera set, we provide insights into the factors that affect chimera formation such as sequence abundance, the extent of similarity between 16S genes, and PCR conditions. Chimeras were found to reproducibly form among independent amplifications and contributed to false perceptions of sample diversity and the false identification of novel taxa, with less abundant species exhibiting chimera rates exceeding 70%. Shotgun metagenomic sequences of our mock community appear to be devoid of 16S chimeras, supporting a role for shotgun metagenomics in validating novel organisms discovered in targeted sequence surveys.
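To make the notion of a chimera concrete, the following is a minimal sketch of how a two-parent chimera can be simulated from aligned sequences and how a simple two-parent model can recover the breakpoint. It is a toy illustration only and is not the Chimera Slayer algorithm. The function names (`make_chimera`, `identity`, `best_two_parent_split`) and the example sequences are hypothetical, and the parents are assumed to be pre-aligned to equal length.

```python
def make_chimera(parent_a: str, parent_b: str, breakpoint: int) -> str:
    """Join the 5' part of parent_a to the 3' part of parent_b.

    Assumes both parents are already aligned to the same length.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to equal length")
    if not 0 < breakpoint < len(parent_a):
        raise ValueError("breakpoint must fall strictly inside the sequence")
    return parent_a[:breakpoint] + parent_b[breakpoint:]


def identity(x: str, y: str) -> float:
    """Fraction of aligned positions at which x and y carry the same base."""
    if len(x) != len(y):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(a == b for a, b in zip(x, y))
    return matches / len(x)


def best_two_parent_split(query: str, parent_a: str, parent_b: str):
    """Try every breakpoint and return (breakpoint, identity) for the
    A-left / B-right model that matches the query best."""
    best_breakpoint, best_score = 0, 0.0
    for bp in range(1, len(query)):
        model = parent_a[:bp] + parent_b[bp:]
        score = identity(query, model)
        if score > best_score:
            best_breakpoint, best_score = bp, score
    return best_breakpoint, best_score


# Hypothetical toy parents, chosen so the breakpoint is easy to see.
parent_a = "AAAAAAAAAACCCCCCCCCC"
parent_b = "GGGGGGGGGGTTTTTTTTTT"
chimera = make_chimera(parent_a, parent_b, 10)

print(chimera)
print(identity(chimera, parent_a), identity(chimera, parent_b))
print(best_two_parent_split(chimera, parent_a, parent_b))
```

In this toy case neither parent alone explains the chimera well, while the two-parent model matches it exactly. That gap between single-parent and two-parent fit is the kind of signal a chimera detector can look for. Real 16S sequences are far more similar to one another, which makes the signal much weaker.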

  • Received July 11, 2010.
  • Accepted December 29, 2010.

This manuscript is Open Access.