First Pass Annotation of Promoters on Human Chromosome 22

  1. Matthias Scherf1,3,4,
  2. Andreas Klingenhoff1,3,
  3. Kornelie Frech3,
  4. Kerstin Quandt3,
  5. Ralf Schneider1,
  6. Korbinian Grote1,
  7. Matthias Frisch3,
  8. Valérie Gailus-Durner1,
  9. Alexander Seidel1,
  10. Ruth Brack-Werner2, and
  11. Thomas Werner1,3
  1. GSF-National Research Center for Environment and Health, (1) Institute of Mammalian Genetics; (2) Institute of Molecular Virology, Neuherberg, Germany; (3) Genomatix Software GmbH, Munich, Germany

Abstract

The publication of the first almost complete sequence of a human chromosome (chromosome 22) is a major milestone in human genomics. Together with the sequence, an excellent annotation of genes was published which certainly will serve as an information resource for numerous future projects. We noted that the annotation did not cover regulatory regions; in particular, no promoter annotation has been provided. Here we present an analysis of the complete published chromosome 22 sequence for promoters. A recent breakthrough in specific in silico prediction of promoter regions enabled us to attempt large-scale prediction of promoter regions on chromosome 22. Scanning of sequence databases revealed only 20 experimentally verified promoters, of which 10 were correctly predicted by our approach. Nearly 40% of our 465 predicted promoter regions are supported by the currently available gene annotation. Promoter finding also provides a biologically meaningful method for “chromosomal scaffolding”, by which long genomic sequences can be divided into segments starting with a gene. As one example, the combination of promoter region prediction with exon/intron structure predictions greatly enhances the specificity of de novo gene finding. The present study demonstrates that it is possible to identify promoters in silico on the chromosomal level with sufficient reliability for experimental planning and indicates that a wealth of information about regulatory regions can be extracted from current large-scale (megabase) sequencing projects. Results are available on-line at http://genomatix.gsf.de/chr22/.
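
The "chromosomal scaffolding" idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `scaffold_by_promoters`, the coordinates, and the simplification that every predicted promoter marks a forward-strand segment start are all assumptions made for illustration. Strand information and the prediction step itself are left out.

```python
from typing import List, Tuple


def scaffold_by_promoters(seq_length: int, promoter_starts: List[int]) -> List[Tuple[int, int]]:
    """Split [0, seq_length) into half-open segments that each begin at a promoter.

    Simplification (not from the paper): strand is ignored and every promoter
    position is treated as the start of a new segment.
    """
    # Keep only in-range positions, de-duplicated and sorted left to right.
    starts = sorted({p for p in promoter_starts if 0 <= p < seq_length})
    segments = []
    for i, start in enumerate(starts):
        # A segment ends where the next promoter begins, or at the sequence end.
        end = starts[i + 1] if i + 1 < len(starts) else seq_length
        segments.append((start, end))
    return segments


# Hypothetical coordinates, for illustration only.
segments = scaffold_by_promoters(1_000_000, [120_500, 48_000, 730_250])
```

In this sketch the stretch upstream of the first predicted promoter is not assigned to any segment. Each resulting segment could then be handed to an exon/intron structure predictor, which is the kind of combination the abstract describes.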

Footnotes

  • 4 Corresponding author.

  • E-MAIL scherf@gsf.de; FAX 49 89–5490 8399.

  • Article and publication are at www.genome.org/cgi/doi/10.1101/gr.154601.

    • Received July 6, 2000.
    • Accepted December 29, 2000.