Genome-wide strand asymmetry in massively parallel reporter activity favors genic strands

  1. Gregory M. Cooper1
  1. HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
  2. Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, Alabama 35899, USA
  3. Calico Life Sciences LLC, South San Francisco, California 94080, USA
  4. Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
  5. Howard Hughes Medical Institute, Seattle, Washington 98195, USA
  6. Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, Washington 98195, USA
  • Corresponding authors: rmyers{at}hudsonalpha.org, gcooper{at}hudsonalpha.org
  • Abstract

    Massively parallel reporter assays (MPRAs) are useful tools for characterizing regulatory elements in human genomes. An aspect of MPRAs that is not typically the focus of analysis is their intrinsic ability to differentiate activity levels for a given sequence element when it is placed in each of its two possible orientations relative to the reporter construct. Here, we describe pervasive strand asymmetry of MPRA signals in data sets from multiple reporter configurations in both published and newly reported data. These effects are reproducible across different cell types and in different treatments within a cell type and are observed both within and outside of annotated regulatory elements. For elements in gene bodies, MPRA strand asymmetry favors the sense strand, suggesting that function related to endogenous transcription is driving the phenomenon. Similarly, we find that within Alu mobile element insertions, strand asymmetry favors the transcribed strand of the ancestral retrotransposon. The effect is consistent across the multiplicity of Alu elements in human genomes and is more pronounced in less diverged Alu elements. We identify sequence features that drive MPRA strand asymmetry and show that it can be predicted from sequence alone. We see some evidence for RNA stabilization and transcriptional activation mechanisms and hypothesize that the effect is driven by natural selection favoring efficient transcription. Our results indicate that strand asymmetry is a pervasive and reproducible feature in MPRA data. More importantly, the fact that MPRA asymmetry favors naturally transcribed strands suggests that it stems from preserved biological functions that have a substantial, global impact on gene and genome evolution.

    Footnotes

    • Received August 26, 2020.
    • Accepted February 18, 2021.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
