Cross-species de novo identification of cis-regulatory modules with GibbsModule: application to gene regulation in embryonic stem cells

Dan Xie1, Jun Cai1, Na-Yu Chia2, Huck Ng2, and Sheng Zhong1,3

1 University of Illinois at Urbana-Champaign; 2 Genome Institute of Singapore

Abstract

We introduce the GibbsModule algorithm for de novo detection of cis-regulatory motifs and modules in eukaryotic genomes. GibbsModule models the co-expressed genes within one species as sharing a core cis-regulatory motif, and each group of homologous genes as sharing a homologous cis-regulatory module (CRM) characterized by a similar composition of motifs. Without relying on a pre-computed alignment, GibbsModule iteratively updates the core motif shared by co-expressed genes and traces the homologous CRMs that contain the core motif. GibbsModule achieved substantial improvements in both precision and recall compared with peer algorithms on a number of synthetic and real datasets. Applying GibbsModule to the binding regions of the Kruppel-like factor (Klf) transcription factor in embryonic stem cells (ESCs), we discovered a motif that differs from a previously published Klf motif identified by a SELEX experiment but is consistent with mutagenesis analysis. The Sox2 motif was found to collaborate with the Klf motif in ESCs. We used quantitative chromatin immunoprecipitation (ChIP) analysis to test whether GibbsModule could distinguish functional from non-functional binding sites. All 7 tested binding sites within GibbsModule-predicted CRMs had higher ChIP signals than the 7 tested binding sites located outside predicted CRMs. GibbsModule is available at http://biocomp.bioen.uiuc.edu/GibbsModule.
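
The iterative motif update described in the abstract can be illustrated with a generic single-species Gibbs site sampler. The sketch below is not the GibbsModule implementation: it omits the cross-species CRM tracing entirely, assumes exactly one motif occurrence per sequence, and uses a uniform background. All function names and parameters (`build_profile`, `site_weight`, `gibbs_motif_sampler`, `pseudocount`, `iterations`, `seed`) are hypothetical and chosen only for illustration.

```python
import random

ALPHABET = "ACGT"


def build_profile(sites, pseudocount=1.0):
    """Column-wise base probabilities estimated from aligned sites.

    Assumption: every site has the same length and contains only A/C/G/T.
    """
    width = len(sites[0])
    profile = []
    for j in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[j]] += 1
        total = sum(counts.values())
        profile.append({base: counts[base] / total for base in ALPHABET})
    return profile


def site_weight(window, profile, background=0.25):
    """Likelihood ratio of a window under the profile vs. a uniform background."""
    weight = 1.0
    for j, base in enumerate(window):
        weight *= profile[j][base] / background
    return weight


def gibbs_motif_sampler(seqs, width, iterations=2000, seed=0):
    """Sample one motif start position per sequence (needs >= 2 sequences)."""
    rng = random.Random(seed)
    # Start from random site positions in every sequence.
    positions = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        # Hold one sequence out and rebuild the motif from the others.
        i = rng.randrange(len(seqs))
        others = [
            seqs[k][positions[k]:positions[k] + width]
            for k in range(len(seqs))
            if k != i
        ]
        profile = build_profile(others)
        # Score every window of the held-out sequence and resample its site
        # with probability proportional to the likelihood ratio.
        n_windows = len(seqs[i]) - width + 1
        weights = [
            site_weight(seqs[i][p:p + width], profile) for p in range(n_windows)
        ]
        positions[i] = rng.choices(range(n_windows), weights=weights)[0]
    return positions
```

Calling `gibbs_motif_sampler(seqs, width)` on a list of promoter sequences returns one sampled start position per sequence. Because the sampler is stochastic, results depend on `seed` and `iterations`, and a single run is not guaranteed to recover a planted motif.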

Footnotes

    • Received October 31, 2007.
    • Accepted May 5, 2008.
ACCEPTED MANUSCRIPT

This Article

  1. Genome Res. gr.072769.107 Copyright © 2008, Cold Spring Harbor Laboratory Press
