RT Journal
A1 Gur-Arie, Riva
A1 Cohen, Cyril J.
A1 Eitan, Yuval
A1 Shelef, Leora
A1 Hallerman, Eric M.
A1 Kashi, Yechezkel
T1 Simple Sequence Repeats in Escherichia coli: Abundance, Distribution, Composition, and Polymorphism
JF Genome Research
JO Genome Research
YR 2000
FD January 01
VO 10
IS 1
SP 62
OP 71
DO 10.1101/gr.10.1.62
UL http://genome.cshlp.org/content/10/1/62.abstract
AB Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes. [The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AF209020–209030 and AF209508–209518.]