Massive turnover of functional sequence in human and other mammalian genomes

  Gerton Lunter (2,3)
  1 MRC Functional Genomics Unit, University of Oxford
  2 The Wellcome Trust Centre for Human Genetics, University of Oxford
  * Corresponding author; email: gerton.lunter{at}well.ox.ac.uk

Abstract

Despite the availability of dozens of animal genome sequences, two key questions remain unanswered: first, what fraction of any species' genome confers biological function, and second, are apparent differences in organismal complexity reflected in an objective measure of genomic complexity? Here, we address both questions by applying, across the mammalian phylogeny, an evolutionary model that estimates the amount of functional DNA that is shared between two species' genomes. Our first main finding is that, as the divergence between mammalian species increases, the predicted amount of pairwise shared functional sequence drops off dramatically. We show by simulations that this is not an artefact of the method, but rather indicates that functional (and mostly non-coding) sequence is turning over at a very high rate. Second, we estimate that between 200 and 300 Mb (~6.5-10%) of the human genome is under functional constraint, which includes 5-8 times as many constrained non-coding bases as bases that code for protein. By contrast, in D. melanogaster we estimate only 56-66 Mb to be constrained, implying a ratio of non-coding to coding constrained bases of about 2. This suggests that, rather than genome size or protein-coding gene complement, it is the number of functional bases that might best mirror our naïve preconceptions of organismal complexity.
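The arithmetic behind the quoted figures can be checked with a short sketch. This is illustrative only and is not the evolutionary model used in the paper. It rests on two assumptions that do not appear in the abstract: a human genome size of roughly 3,080 Mb, and a pairing of the lower constrained estimate (200 Mb) with the lower ratio (5) and the upper estimate (300 Mb) with the upper ratio (8). The function names are hypothetical.

```python
# Assumption: approximate human genome size in Mb (not stated in the abstract).
HUMAN_GENOME_MB = 3080.0


def constrained_fraction(constrained_mb, genome_mb):
    """Fraction of the genome that is under functional constraint."""
    return constrained_mb / genome_mb


def split_constrained(constrained_mb, noncoding_to_coding_ratio):
    """Split a constrained total into (coding, non-coding) Mb.

    If non-coding = ratio * coding, then total = coding * (1 + ratio).
    """
    coding = constrained_mb / (1.0 + noncoding_to_coding_ratio)
    return coding, constrained_mb - coding


if __name__ == "__main__":
    # Human: pairing of (total, ratio) values is an assumption, see lead-in.
    for total_mb, ratio in [(200.0, 5.0), (300.0, 8.0)]:
        frac = constrained_fraction(total_mb, HUMAN_GENOME_MB)
        coding, noncoding = split_constrained(total_mb, ratio)
        print(f"human {total_mb:.0f} Mb: {frac:.1%} of genome, "
              f"coding {coding:.1f} Mb, non-coding {noncoding:.1f} Mb")

    # D. melanogaster: ratio of about 2 over the 56-66 Mb range.
    for total_mb in (56.0, 66.0):
        coding, noncoding = split_constrained(total_mb, 2.0)
        print(f"fly {total_mb:.0f} Mb: "
              f"coding {coding:.1f} Mb, non-coding {noncoding:.1f} Mb")
```

Under these assumptions, both ends of the human range imply the same amount of constrained coding sequence (about 33 Mb), so the 5-8 fold range of the ratio reflects uncertainty in the non-coding estimate alone.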

Footnotes

    This manuscript is Open Access.