Research

Error-prone polymerase activity causes multinucleotide mutations in humans

    • 1Department of Mathematics, University of California Berkeley, Berkeley, California 94703, USA;
    • 2Department of Integrative Biology, University of California Berkeley, Berkeley, California 94703, USA;
    • 3Department of Statistics, University of California Berkeley, Berkeley, California 94703, USA;
    • 4Center for Bioinformatics, University of Copenhagen, 2200 Copenhagen, Denmark
Published July 30, 2014. Vol 24 Issue 9, pp. 1445-1454. https://doi.org/10.1101/gr.170696.113

Abstract

About 2% of human genetic polymorphisms have been hypothesized to arise via multinucleotide mutations (MNMs), complex events that generate SNPs at multiple sites in a single generation. MNMs have the potential to accelerate the pace at which single genes evolve and to confound studies of demography and selection that assume all SNPs arise independently. In this paper, we examine clustered mutations that are segregating in a set of 1092 human genomes, demonstrating that the signature of MNM becomes enriched as large numbers of individuals are sampled. We estimate the percentage of linked SNP pairs that were generated by simultaneous mutation as a function of the distance between affected sites and show that MNMs exhibit a high percentage of transversions relative to transitions, findings that are reproducible in data from multiple sequencing platforms and cannot be attributed to sequencing error. Among tandem mutations that occur simultaneously at adjacent sites, we find an especially skewed distribution of ancestral and derived alleles, with GC → AA, GA → TT, and their reverse complements making up 27% of the total. These mutations have been previously shown to dominate the spectrum of the error-prone polymerase Pol ζ, suggesting that low-fidelity DNA replication by Pol ζ is at least partly responsible for the MNMs that are segregating in the human population. We develop statistical estimates of MNM prevalence that can be used to correct phylogenetic and population genetic inferences for the presence of complex mutations.
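The two conventions the abstract relies on, splitting mutations into transitions and transversions and grouping each tandem mutation with its reverse complement, can be illustrated with a minimal sketch. The function names below are hypothetical and chosen only for illustration; this is not the authors' pipeline. It assumes only the standard definitions: a transition swaps purine for purine (A, G) or pyrimidine for pyrimidine (C, T), and a mutation on one strand is equivalent to its reverse complement on the other.

```python
# Illustrative helpers only; names are hypothetical, not from the paper.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PURINES = {"A", "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (uppercase ACGT only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_transition(ancestral, derived):
    """True for purine<->purine or pyrimidine<->pyrimidine changes.

    Any other single-base change is a transversion.
    """
    if ancestral == derived:
        raise ValueError("ancestral and derived alleles must differ")
    return (ancestral in PURINES) == (derived in PURINES)


def canonical_tandem(ancestral, derived):
    """Give a strand-independent label for a tandem (adjacent-site) mutation.

    The mutation and its reverse complement describe the same event seen
    from opposite strands, so both map to one label: the lexicographically
    smaller (ancestral, derived) pair. That tie-break is an arbitrary
    choice made for this sketch.
    """
    forward = (ancestral, derived)
    reverse = (reverse_complement(ancestral), reverse_complement(derived))
    return min(forward, reverse)


if __name__ == "__main__":
    # GC -> AA and GC -> TT are reverse complements of each other.
    print(canonical_tandem("GC", "AA"), canonical_tandem("GC", "TT"))
    # GA -> TT and TC -> AA are reverse complements of each other.
    print(canonical_tandem("GA", "TT"), canonical_tandem("TC", "AA"))
    print(is_transition("A", "G"), is_transition("A", "C"))
```

With this grouping, counts for GC → AA and GC → TT fall under one label, as do GA → TT and TC → AA, which is the sense in which the abstract reports these classes "and their reverse complements" together.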
