Cross-species analysis of enhancer logic using deep learning
- Liesbeth Minnoye¹,
- Ibrahim Ihsan Taskiran¹,
- David Mauduit¹,
- Maurizio Fazio²,
- Linde Van Aerschot¹,
- Gert Hulselmans¹,
- Valerie Christiaens¹,
- Samira Makhzami¹,
- Monika Seltenhammer³,
- Panagiotis Karras⁴,
- Aline Primot⁵,
- Edouard Cadieu⁵,
- Ellen van Rooijen⁶,
- Jean-Christophe Marine⁴,
- Giorgia Egidy⁷,
- Ghanem Elias Ghanem⁸,
- Leonard Zon²,
- Jasper Wouters¹ and
- Stein Aerts¹,⁹
- 1 VIB-KU Leuven Center for Brain & Disease Research;
- 2 Boston Children's Hospital and Dana Farber Cancer Institute, Harvard Medical School;
- 3 Medical University of Vienna;
- 4 VIB-KU Leuven Center for Cancer Biology;
- 5 CNRS-University of Rennes 1;
- 6 Boston Children's Hospital and Dana Farber Cancer Institute, Harvard Medical School;
- 7 Université Paris-Saclay;
- 8 Université Libre de Bruxelles
Abstract
Deciphering the genomic regulatory code of enhancers is a key challenge in biology as this code underlies cellular identity. A better understanding of how enhancers work will improve the interpretation of noncoding genome variation, and empower the generation of cell type-specific drivers for gene therapy. Here we explore the combination of deep learning and cross-species chromatin accessibility profiling to build explainable enhancer models. We apply this strategy to decipher the enhancer code in melanoma, a relevant case study due to the presence of distinct melanoma cell states. We trained and validated a deep learning model, called DeepMEL, using chromatin accessibility data of 26 melanoma samples across six different species. We demonstrate the accuracy of DeepMEL predictions on the CAGI5 challenge, where it significantly outperforms existing models on the melanoma enhancer of IRF4. Next, we exploit DeepMEL to analyse enhancer architectures and identify accurate transcription factor binding sites for the core regulatory complexes in the two different melanoma states, with distinct roles for each transcription factor, in terms of nucleosome displacement or enhancer activation. Finally, DeepMEL identifies orthologous enhancers across distantly related species where sequence alignment fails, and the model highlights specific nucleotide substitutions that underlie enhancer turnover. DeepMEL can be used from the Kipoi database to predict and optimise candidate enhancers, and to prioritise enhancer mutations. In addition, our computational strategy can be applied to other cancer or normal cell types.
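The idea of prioritising enhancer mutations with a sequence model can be illustrated with in silico saturation mutagenesis: every position of a candidate sequence is substituted with each alternative base, and the change in the model's score is recorded. The sketch below is not DeepMEL and does not use the Kipoi API. The scorer `toy_score`, its default motif, and all function names are hypothetical stand-ins chosen so the example runs on its own; a trained model's prediction function would take the place of `toy_score`.

```python
from typing import Callable, List, Tuple

BASES = "ACGT"


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA string as rows of 4 indicators in A, C, G, T order.

    Any other character (for example N) becomes an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def toy_score(seq: str, motif: str = "CACGTG") -> float:
    """Hypothetical stand-in for a trained model: counts motif occurrences."""
    seq = seq.upper()
    windows = range(len(seq) - len(motif) + 1)
    return float(sum(seq[i:i + len(motif)] == motif for i in windows))


def in_silico_mutagenesis(
    seq: str, score_fn: Callable[[str], float]
) -> List[Tuple[int, str, str, float]]:
    """Score every single-nucleotide substitution against the reference.

    Returns (position, ref_base, alt_base, delta) tuples, where
    delta = score(mutant) - score(reference).
    """
    seq = seq.upper()
    ref_score = score_fn(seq)
    effects = []
    for pos, ref in enumerate(seq):
        for alt in BASES:
            if alt == ref:
                continue
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects.append((pos, ref, alt, score_fn(mutant) - ref_score))
    return effects


def rank_mutations(
    effects: List[Tuple[int, str, str, float]]
) -> List[Tuple[int, str, str, float]]:
    """Order substitutions by absolute effect size, largest first."""
    return sorted(effects, key=lambda e: abs(e[3]), reverse=True)


if __name__ == "__main__":
    candidate = "TTTCACGTGTTT"  # made-up sequence, for illustration only
    for pos, ref, alt, delta in rank_mutations(
        in_silico_mutagenesis(candidate, toy_score)
    )[:5]:
        print(f"{pos}\t{ref}>{alt}\t{delta:+.1f}")
```

With the toy scorer, only substitutions inside the motif change the score, so they rank first. With a real model, `one_hot` would typically produce the input tensor and `score_fn` would wrap the model's prediction for the class of interest.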
- Received January 30, 2020.
- Accepted June 15, 2020.
- Published by Cold Spring Harbor Laboratory Press
This manuscript is Open Access.
This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International license), as described at http://creativecommons.org/licenses/by/4.0/.