High-fidelity bidirectional translation between single-cell transcriptomes and DNA methylomes with scBOND

  1. Shengquan Chen1,3
  1. 1School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China;
  2. 2College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China;
  3. 3Academy for Advanced Interdisciplinary Studies, Nankai University, Tianjin 300071, China
  1. 4 These authors contributed equally to this work.

  • Corresponding author: chenshengquan{at}nankai.edu.cn
  • Abstract

    Single-cell multiomic sequencing technologies have offered unprecedented insights into cellular heterogeneity by jointly profiling gene expression and epigenetic landscapes at single-cell resolution. However, the application of these technologies remains limited owing to technical challenges and high costs. Computational approaches for cross-modality translation provide a promising solution to these limitations by enabling the inference of one modality from another. Yet existing methods for cross-modality translation between single-cell RNA sequencing (scRNA-seq) and single-cell DNA methylation (scDNAm) data face limitations of their own, including unidirectionality, inadequate modeling of context-specific DNA methylation–expression associations, neglect of biological relevance in evaluation, and poor performance when paired training data are limited. To fill these gaps, we introduce scBOND, a bidirectional cross-modal translation framework tailored for scRNA-seq and scDNAm profiles. scBOND leverages a mixture-of-experts block to capture context-dependent regulatory patterns, while implementing a self-attention mechanism and a feature recalibration module to enhance biological signal fidelity. Extensive experiments demonstrate that scBOND consistently outperforms baseline methods in both translation directions, yielding high-accuracy translation while preserving cellular structure. In mouse embryonic data, scBOND preserves subtle, functionally significant differences between closely related cell types, which are undetected in the original data. Downstream analyses confirm that scBOND effectively recovers tissue-specific signals in human brain neurons. Moreover, using RNA-only data, we reconstruct scDNAm profiles and identify cell-type- and stage-specific regulatory mechanisms in the oligodendrocyte lineage. To further improve model generalization when paired data are scarce, we propose scBOND-Aug, a variant of scBOND equipped with a biologically informed data augmentation strategy, which demonstrates superior results with limited paired data.
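
    As a rough illustration of the mixture-of-experts idea mentioned above, the NumPy sketch below passes each cell's feature vector through several experts and combines their outputs with softmax gate weights. It is not scBOND's implementation: the function names (`softmax`, `moe_forward`), the dimensions, the use of purely linear experts, and the dense (non-sparse) gating are all assumptions made for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def moe_forward(x, gate_w, expert_w):
    """Dense mixture-of-experts forward pass (illustrative only).

    x        : (n_cells, d_in) input features, e.g. a latent embedding per cell
    gate_w   : (d_in, n_experts) gating weights (hypothetical parameter)
    expert_w : (n_experts, d_in, d_out) one linear map per expert (hypothetical)
    """
    # Per-cell mixing weights over experts; each row sums to 1.
    gates = softmax(x @ gate_w)                          # (n_cells, n_experts)
    # Apply every expert to every cell.
    expert_out = np.einsum("nd,edo->neo", x, expert_w)   # (n_cells, n_experts, d_out)
    # Combine expert outputs with the gate weights.
    out = np.einsum("ne,neo->no", gates, expert_out)     # (n_cells, d_out)
    return out, gates

# Toy data: 5 cells, 8 input features, 3 experts, 4 output features.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
gate_w = rng.normal(size=(8, 3))
expert_w = rng.normal(size=(3, 8, 4))
out, gates = moe_forward(x, gate_w, expert_w)
```

    Because the gate depends on the input, different cells can weight the experts differently, which is one way a model might represent context-dependent relationships between modalities.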

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.281350.125.

    • Freely available online through the Genome Research Open Access option.

    • Received August 24, 2025.
    • Accepted March 2, 2026.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
