ARTICLE

A code for transcription initiation in mammalian genomes

    • 1 Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan;
    • 2 ARC Centre in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Qld 4072, Australia;
    • 3 The Bioinformatics Centre, Department of Molecular Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 København N, Denmark;
    • 4 Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
    • 5 Present address: CBRC, AIST, 2-42 Aomi, Koto-ku, Tokyo, 135-0064, Japan.
    • 6 Corresponding authors. E-mail [email protected]; fax +81-3-3599-8081. E-mail [email protected]; fax 15-3532-5669.
Published November 21, 2007. Vol 18 Issue 1, pp. 1-12. https://doi.org/10.1101/gr.6831208
Download PDF Please log-in to or register for your personal account in order to access PDF Cite Article Permissions Share
cover of Genome Research Vol 36 Issue 4
Current Issue:

Abstract

Genome-wide detection of transcription start sites (TSSs) has revealed that RNA Polymerase II transcription initiates at millions of positions in mammalian genomes. Most core promoters do not have a single TSS, but an array of closely located TSSs with different rates of initiation. As a rule, genes have more than one such core promoter; however, defining the boundaries between core promoters is not trivial. These discoveries prompt a re-evaluation of our models for transcription initiation. We describe a new framework for understanding the organization of transcription initiation. We show that initiation events are clustered on the chromosomes at multiple scales—clusters within clusters—indicating multiple regulatory processes. Within the smallest of such clusters, which can be interpreted as core promoters, the local DNA sequence predicts the relative transcription start usage of each nucleotide with a remarkable 91% accuracy, implying the existence of a DNA code that determines TSS selection. Conversely, the total expression strength of such clusters is only partially determined by the local DNA sequence. Thus, the overall control of transcription can be understood as a combination of large- and small-scale effects; the selection of transcription start sites is largely governed by the local DNA sequence, whereas the transcriptional activity of a locus is regulated at a different level; it is affected by distal features or events such as enhancers and chromatin remodeling.

Loading
Loading
Back to top