Identifying gene function and module connections by the integration of multispecies expression compendia

  Johan Auwerx (1)
  1. Laboratory of Integrative Systems Physiology, Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland;
  2. Institute of Mathematics, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland;
  3. Gene Expression Core Facility, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland;
  4. SV-IT, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland;
  5. Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland;
  6. Department of Ecology and Evolution, University of Lausanne, Lausanne 1015, Switzerland;
  7. Laboratory of Metabolic Signaling, Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland;
  8. Department of Genetics, Genomics and Informatics, University of Tennessee, Memphis, Tennessee 38163, USA
  • Corresponding author: admin.auwerx{at}epfl.ch
    Abstract

    The functions of many eukaryotic genes are still poorly understood. Here, we developed and validated a new method, termed GeneBridge, which is based on two linked approaches to impute gene function and bridge genes with biological processes. First, Gene-Module Association Determination (G-MAD) allows the annotation of gene function. Second, Module-Module Association Determination (M-MAD) allows predicting connectivity among modules. We applied the GeneBridge tools to large-scale multispecies expression compendia—1700 data sets with over 300,000 samples from human, mouse, rat, fly, worm, and yeast—collected in this study. G-MAD identifies novel functions of genes—for example, DDT in mitochondrial respiration and WDFY4 in T cell activation—and also suggests novel components for modules, such as for cholesterol biosynthesis. By applying G-MAD on data sets from respective tissues, tissue-specific functions of genes were identified—for instance, the roles of EHHADH in liver and kidney, as well as SLC6A1 in brain and liver. Using M-MAD, we identified a list of module-module associations, such as those between mitochondria and proteasome, mitochondria and histone demethylation, as well as ribosomes and lipid biosynthesis. The GeneBridge tools together with the expression compendia are available as an open resource, which will facilitate the identification of connections linking genes, modules, phenotypes, and diseases.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.251983.119.

    • Freely available online through the Genome Research Open Access option.

    • Received May 1, 2019.
    • Accepted October 31, 2019.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
