New guidelines for DNA methylome studies regarding 5-hydroxymethylcytosine for understanding transcriptional regulation

  1. Kevin Y. Yip (affiliations 1, 2, 4, 5, 6)
  1. Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong;
  2. Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong;
  3. School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong;
  4. Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong;
  5. CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong;
  6. Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
  7. These authors contributed equally to this work.

  • Corresponding authors: alfredcheng{at}cuhk.edu.hk, kevinyip{at}cse.cuhk.edu.hk
  • Abstract

    Many DNA methylome profiling methods cannot distinguish between 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC). Because 5mC typically acts as a repressive mark whereas 5hmC is an intermediate form during active demethylation, the inability to separate their signals could lead to incorrect interpretation of the data. Is the extra information contained in 5hmC signals worth the additional experimental and computational costs? Here we combine whole-genome bisulfite sequencing (WGBS) and oxidative WGBS (oxWGBS) data in various human tissues to investigate the quantitative relationships between gene expression and the two forms of DNA methylation at promoters, transcript bodies, and immediate downstream regions. We find that 5mC and 5hmC signals correlate with gene expression in the same direction in most samples. Considering both types of signals increases the accuracy of expression levels inferred from methylation data by a median of 18.2% as compared to having only WGBS data, showing that the two forms of methylation provide complementary information about gene expression. Differential analysis between matched tumor and normal pairs is particularly affected by the superposition of 5mC and 5hmC signals in WGBS data, with at least 25%–40% of the differentially methylated regions (DMRs) identified from 5mC signals not detected from WGBS data. Our results also confirm a previous finding that methylation signals at transcript bodies are more indicative of gene expression levels than promoter methylation signals. Overall, our study provides data for evaluating the cost-effectiveness of some experimental and analysis options in the study of DNA methylation in normal and cancer samples.
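
As a rough illustration of why pairing the two assays separates the signals: standard bisulfite sequencing reads both 5mC and 5hmC as methylated, whereas oxidative bisulfite sequencing reads only 5mC, so the 5hmC level at a site can be approximated by their difference. The sketch below assumes per-site read counts are already available; the function names and the clipping of negative differences to zero are illustrative assumptions, not the procedure used in this study.

```python
def methylation_level(methylated_reads, total_reads):
    """Fraction of reads called methylated at one cytosine (hypothetical helper)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= methylated_reads <= total_reads:
        raise ValueError("methylated_reads must lie between 0 and total_reads")
    return methylated_reads / total_reads


def split_5mc_5hmc(bs_level, oxbs_level):
    """Approximate (5mC, 5hmC) levels at one site from paired assays.

    bs_level:   level from bisulfite sequencing, which reflects 5mC + 5hmC
    oxbs_level: level from oxidative bisulfite sequencing, which reflects 5mC only
    """
    five_mc = oxbs_level
    # Sampling noise can make the difference negative; clipping to zero is
    # an assumption made for this sketch, not a recommendation.
    five_hmc = max(0.0, bs_level - oxbs_level)
    return five_mc, five_hmc


# Example: 16 of 20 reads methylated in WGBS, 10 of 20 in oxWGBS.
bs = methylation_level(16, 20)
oxbs = methylation_level(10, 20)
five_mc, five_hmc = split_5mc_5hmc(bs, oxbs)
```

A site with no apparent 5hmC, or with a noisy negative difference, yields a 5hmC estimate of zero under this clipping rule; a statistical model over read counts would be a more careful alternative.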

    Footnotes

    • Received May 30, 2018.
    • Accepted February 11, 2019.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
