ERC2.0 evolutionary rate covariation update improves inference of functional interactions across large phylogenies

  1. Maria Chikina (4)
  1. Department of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA;
  2. Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA;
  3. Department of Cell Biology and Physiology, Washington University School of Medicine, St. Louis, Missouri 63110, USA;
  4. Department of Computational and Systems Biology, University of Pittsburgh, Pennsylvania 15213, USA;
  5. Department of Earth Sciences, University of Durham, Durham DH1 3LE, United Kingdom
  • Corresponding author: nclark{at}pitt.edu
    Abstract

    Evolutionary rate covariation (ERC) is an established comparative genomics method that identifies sets of genes sharing patterns of sequence evolution, which suggests shared function. Whereas many functional predictions of ERC have been empirically validated, its predictive power has hitherto been limited by its inability to tackle the large numbers of species in contemporary comparative genomics data sets. This study introduces ERC2.0, an enhanced methodology for studying ERC across phylogenies with hundreds of species and tens of thousands of genes. ERC2.0 improves upon previous iterations of ERC in algorithm speed, normalizing for heteroskedasticity, and normalizing correlations via Fisher transformations. These improvements have resulted in greater statistical power to predict biological function. In exemplar yeast and mammalian data sets, we demonstrate that the predictive power of ERC2.0 is improved relative to the previous method, ERC1.0, and that further improvements are obtained by using larger yeast and mammalian phylogenies. We attribute the improvements to both the larger data sets and improved rate normalization. We demonstrate that ERC2.0 has high predictive accuracy for known annotations and can predict the functions of genes in nonmodel systems. Our findings underscore the potential for ERC2.0 to be used as a single-pass computational tool in candidate gene screening and functional predictions.
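    As a rough illustration of the correlation-normalization step named in the abstract, the sketch below correlates two genes' branch-specific relative rates and applies a Fisher transformation. It is a minimal sketch under stated assumptions, not the ERC2.0 implementation: the function name `fisher_transformed_erc`, the input format (one relative rate per branch, with NaN where a gene is absent), and the scaling by sqrt(n - 3) are all assumptions made for illustration.

    ```python
    import numpy as np

    def fisher_transformed_erc(rates_a, rates_b):
        """Correlate two genes' per-branch relative rates, then Fisher-transform.

        Hypothetical helper for illustration only. Inputs are equal-length
        sequences of relative rates, one per branch; NaN marks a branch on
        which the gene is missing.
        """
        a = np.asarray(rates_a, dtype=float)
        b = np.asarray(rates_b, dtype=float)

        # Use only branches where both genes have a rate estimate.
        shared = ~(np.isnan(a) | np.isnan(b))
        n = int(shared.sum())
        if n < 4:
            raise ValueError("need at least 4 shared branches")

        # Pearson correlation across the shared branches.
        r = np.corrcoef(a[shared], b[shared])[0, 1]

        # Fisher transformation: arctanh(r), clipped so |r| = 1 stays finite.
        # Scaling by sqrt(n - 3) (assumed here) puts gene pairs with different
        # numbers of shared branches on a comparable scale.
        r_clipped = np.clip(r, -0.999999, 0.999999)
        z = np.arctanh(r_clipped) * np.sqrt(n - 3)
        return r, z, n

    # Toy example with made-up rates for six branches.
    gene_a = [0.1, -0.2, 0.3, float("nan"), 0.5, -0.4]
    gene_b = [0.2, -0.1, 0.25, 0.0, float("nan"), -0.3]
    r, z, n = fisher_transformed_erc(gene_a, gene_b)
    print(f"shared branches = {n}, r = {r:.3f}, scaled Fisher z = {z:.3f}")
    ```

    The reason for transforming at all is that a raw correlation computed from few shared branches is noisier than one computed from many; a scaled Fisher z makes scores from gene pairs with different species coverage easier to compare.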

    Footnotes

    • Received February 24, 2025.
    • Accepted July 3, 2025.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
