High resolution mapping of Twist to DNA in Drosophila embryos: Efficient functional analysis and evolutionary conservation
- Anil Ozdemir1,5,
- Katherine I. Fisher-Aylor1,5,
- Shirley Pepke2,
- Manoj Samanta3,
- Leslie Dunipace1,
- Kenneth McCue1,
- Lucy Zeng4,
- Nobuo Ogawa4,
- Barbara J. Wold1,6 and
- Angelike Stathopoulos1,6
- 1 Division of Biology, California Institute of Technology, Pasadena, California 91125, USA;
- 2 Center for Advanced Computing Research, California Institute of Technology, Pasadena, California 91125, USA;
- 3 Systemix Institute, Redmond, Washington 98053, USA;
- 4 Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- 5 These authors contributed equally to this work.
Abstract
Cis-regulatory modules (CRMs) function by binding sequence-specific transcription factors, but the relationship between in vivo physical binding and the regulatory capacity of factor-bound DNA elements remains uncertain. We investigate this relationship for the well-studied Twist factor in Drosophila melanogaster embryos by analyzing genome-wide factor occupancy and testing the functional significance of Twist-occupied regions and of motifs within those regions. Twist ChIP-seq data efficiently identified previously studied Twist-dependent CRMs and robustly predicted new CRM activity in transgenesis, with newly identified Twist-occupied regions supporting diverse spatiotemporal patterns (>74% positive, n = 31). Some, but not all, candidate CRMs require Twist for proper expression in the embryo. The Twist motifs most favored in genome ChIP data (in vivo) differed from those most favored by Systematic Evolution of Ligands by EXponential enrichment (SELEX) (in vitro). Furthermore, the majority of ChIP-seq signals could be parsimoniously explained by a CABVTG motif located within 50 bp of the ChIP summit and, of these, CACATG was most prevalent. Mutagenesis experiments demonstrated that different Twist E-box motif types are not fully interchangeable, suggesting that the ChIP-derived consensus (CABVTG) includes sites having distinct regulatory outputs. Further analysis of position, frequency of occurrence, and sequence conservation revealed significant enrichment and conservation of CABVTG E-box motifs near Twist ChIP-seq signal summits, preferential conservation of the ±150 bp surrounding Twist-occupied summits, and enrichment of GA- and CA-repeat sequences near Twist-occupied summits. Our results show that high-resolution in vivo occupancy data can be used to drive efficient discovery and dissection of global and local cis-regulatory logic.
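To make the motif criterion concrete, the following is a minimal sketch of how one might scan for CABVTG sites within 50 bp of a ChIP summit. It is not the authors' pipeline: the function names, the 0-based coordinate convention, and the requirement that a site lie entirely inside the window are assumptions made for illustration. The only facts taken from the text are the CABVTG consensus and the 50 bp distance; B (C, G, or T) and V (A, C, or G) follow the standard IUPAC nucleotide codes.

```python
import re

# Standard IUPAC codes needed for the CABVTG consensus:
# B = C, G, or T (not A); V = A, C, or G (not T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "B": "[CGT]", "V": "[ACG]"}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regex that also reports overlapping sites."""
    body = "".join(IUPAC[base] for base in consensus)
    # A zero-width lookahead lets overlapping sites (e.g. CACATGTG) both be found.
    return re.compile("(?=(" + body + "))")


def find_motifs_near_summit(sequence, summit, consensus="CABVTG", window=50):
    """Return (start, site) pairs for sites lying within `window` bp of `summit`.

    Hypothetical helper: `summit` is a 0-based index into `sequence`, and a site
    counts only if it falls entirely inside [summit - window, summit + window].
    CABVTG is its own reverse complement as a degenerate pattern, so scanning
    one strand is sufficient for this particular consensus.
    """
    seq = sequence.upper()
    start = max(0, summit - window)
    end = min(len(seq), summit + window + 1)
    region = seq[start:end]
    pattern = consensus_to_regex(consensus)
    return [(start + m.start(), m.group(1)) for m in pattern.finditer(region)]


if __name__ == "__main__":
    # Toy sequence (invented for illustration): one site near the summit, one far away.
    toy = "T" * 40 + "CACATG" + "T" * 100 + "CATATG" + "T" * 10
    print(find_motifs_near_summit(toy, summit=43))
```

On the toy sequence, only the CACATG site near the summit is reported; widening `window` would also pick up the distal CATATG site. Sequences such as CAAATG or CATTTG are excluded because B does not include A and V does not include T.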
Footnotes
- 6 Corresponding authors.
E-mail angelike@caltech.edu.
E-mail woldb@caltech.edu.
- [Supplemental material is available for this article. The microarray data from this study have been submitted to the NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) under accession no. GSE26285, and the sequence data from this study have been submitted to the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi) under accession no. SRA027330.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.104018.109.
- Received September 17, 2010.
- Accepted January 4, 2011.
- Copyright © 2011 by Cold Spring Harbor Laboratory Press











