What is a gene, post-ENCODE? History and updated definition

Figure 2.

Biological complexity revealed by ENCODE. (A) Representation of a typical genomic region portraying the complexity of transcripts in the genome. (Top) DNA sequence with annotated exons of genes (black rectangles) and novel transcriptionally active regions, or TARs (hollow rectangles). (Bottom) The various transcripts that arise from the region from both the forward and reverse strands. (Dashed lines) Spliced-out introns. Conventional gene annotation would account for only a portion of the transcripts coming from the four genes in the region (indicated). Data from the ENCODE project reveal that many transcripts span multiple gene loci, some using distal 5′ transcription start sites. (B) Representation of the various regulatory sequences identified for a target gene. For Gene 1 we show all the component transcripts, including many novel isoforms, in addition to all the sequences identified to regulate Gene 1 (gray circles). We observe that some of the enhancer sequences are actually promoters for novel splice isoforms. Additionally, some of the regulatory sequences for Gene 1 might actually be closer to another gene, so the target would be misidentified if it were chosen purely on the basis of proximity.

This Article

  1. Genome Res. 17: 669-681
