Modeling transcriptional regulation of model species with deep learning

  Olga G. Troyanskaya (1,4,6)
  1. Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA;
  2. Graduate Program in Quantitative and Computational Biology, Princeton University, Princeton, New Jersey 08544, USA;
  3. Yutaka Seino Distinguished Center for Diabetes Research, Kansai Electric Power Medical Research Institute, Kobe, 650-0047, Japan;
  4. Flatiron Institute, Simons Foundation, New York, New York 10010, USA;
  5. Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544, USA;
  6. Department of Computer Science, Princeton University, Princeton, New Jersey 08540, USA
  • Corresponding author: ogt{at}cs.princeton.edu
    Abstract

    To enable large-scale analyses of transcriptional regulation in model species, we developed DeepArk, a set of deep learning models of cis-regulatory activity for four widely studied species: Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, and Mus musculus. DeepArk accurately predicts the presence of thousands of different context-specific regulatory features, including chromatin states, histone marks, and transcription factors. In vivo studies show that DeepArk can predict the regulatory impact of any genomic variant (including rare or previously unobserved variants) and enables the regulatory annotation of understudied model species.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.266171.120.

    • Freely available online through the Genome Research Open Access option.

    • Received May 18, 2020.
    • Accepted April 19, 2021.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
