Efficient de novo assembly of bacterial genomes using low coverage short read sequencing
Abstract
We developed a novel approach for de novo genome assembly using sequence data from high-throughput short read sequencing technologies. By combining data generated from 454 and Illumina sequencing platforms, we were able to reliably assemble genomes into large scaffolds at a fraction of the traditional cost. We applied this method to two isolates of the phytopathogenic bacterium Pseudomonas syringae. Sequencing and reassembly of the well-studied tomato and Arabidopsis pathogen, PtoDC3000, facilitated development and testing by synteny of our method. Sequencing of a distantly related rice pathogen, Por1_6, demonstrated our method's efficacy for assembly of novel genomes. Our assembly of Por1_6 yielded an N50 scaffold size of 531,821 bp with >75% of the predicted genome covered by scaffolds over 100,000 bp. One of the critical phenotypic differences between strains of P. syringae is the range of plant hosts they infect. This is largely determined by their complement of type III effector proteins. The genome of Por1_6 is the first sequenced for a P. syringae isolate that is a pathogen of monocots and, as might be predicted, its complement of type III effectors differs substantially from those of the previously sequenced isolates of this species. The genome of Por1_6 helps to define an expansion of the P. syringae pan-genome, a corresponding contraction of the core-genome, and a further diversification of the type III effector complement for this important plant pathogen species.
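For readers unfamiliar with the assembly statistics quoted in the abstract, the following is a minimal sketch of how an N50 scaffold size and the fraction of a genome covered by large scaffolds are commonly computed. It is not taken from the paper's pipeline: the function names `n50` and `fraction_in_large_scaffolds` are hypothetical, the scaffold lengths are toy values, and the sketch assumes N50 is defined relative to the total assembled length (some studies use the predicted genome size instead).

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    Assumed definition: the length L such that scaffolds of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def fraction_in_large_scaffolds(lengths, genome_size, min_len=100_000):
    """Fraction of a predicted genome size held in scaffolds longer than min_len."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(length for length in lengths if length > min_len) / genome_size


# Toy scaffold lengths (illustrative only, not data from the paper).
toy_scaffolds = [600_000, 300_000, 50_000, 50_000]
toy_n50 = n50(toy_scaffolds)
toy_fraction = fraction_in_large_scaffolds(toy_scaffolds, genome_size=1_000_000)
```

Under this definition, an N50 of 531,821 bp would mean that at least half of the assembled bases lie in scaffolds of that length or longer.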
Footnotes
- Received July 15, 2008.
- Accepted November 5, 2008.
This manuscript is Open Access.
- Copyright © 2008, Cold Spring Harbor Laboratory Press











