spRefine denoises and imputes spatial transcriptomic data with a reference-free framework powered by genomic language model
- 1 Interdepartmental Program in Computational Biology & Bioinformatics, Yale University, New Haven, Connecticut 06511, USA
- 2 Department of Biostatistics, Yale University, New Haven, Connecticut 06511, USA
- 3 Department of Computer Science, Yale University, New Haven, Connecticut 06511, USA
- 4 Department of Computer Science, Northeastern University, Boston, Massachusetts 02115, USA
- 5 Broad Institute, Cambridge, Massachusetts 02142, USA
Abstract
The analysis of spatial transcriptomic data is hindered by high noise levels and missing gene measurements, challenges that are compounded by the higher cost of spatial data compared with traditional single-cell data. To overcome these challenges, we introduce spRefine, a deep learning framework that leverages genomic language models to jointly denoise and impute spatial transcriptomic data. Our results demonstrate that spRefine yields more robust cell- and spot-level representations after denoising and imputation, substantially improving data integration. In addition, spRefine serves as a strong framework for model pretraining and for the discovery of novel biological signals, as highlighted by multiple downstream applications across data sets of varying scales. Notably, spRefine enhances the accuracy of spatial aging clock estimations and uncovers new aging-related relationships associated with key biological processes, such as loss of neuronal function, offering new insights for analyzing aging effects with spatial transcriptomics.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.281001.125.
- Freely available online through the Genome Research Open Access option.
- Received June 2, 2025.
- Accepted January 21, 2026.
This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











