Modeling transcriptional regulation of model species with deep learning

  1. Olga G Troyanskaya1,4
  1. Princeton University;
  2. Kansai Electric Power Medical Research Institute;
  3. Flatiron Institute
  • * Corresponding author; email: ogt{at}genomics.princeton.edu
  • Abstract

    To enable large-scale analyses of regulatory logic in model species, we developed DeepArk, a set of deep learning models of the cis-regulatory codes of four widely studied species: Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, and Mus musculus. DeepArk accurately predicts the presence of thousands of different context-specific regulatory features, including chromatin states, histone marks, and transcription factors. In vivo studies show that DeepArk can predict the regulatory impact of any genomic variant (including rare or previously unobserved variants) and enables the regulatory annotation of understudied model species.

    • Received May 18, 2020.
    • Accepted April 19, 2021.

    This manuscript is Open Access.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International license), as described at http://creativecommons.org/licenses/by-nc/4.0/.

    1. Genome Res. gr.266171.120 Published by Cold Spring Harbor Laboratory Press
