Method

Likelihood-based deconvolution of bulk gene expression data using single-cell references

    • 1 Department of Mathematics, University of California, Berkeley, California 94720, USA;
    • 2 Department of Statistics, University of California, Berkeley, California 94720, USA;
    • 3 Computer Science Division, University of California, Berkeley, California 94720, USA;
    • 4 Department of Biostatistics, University of Florida, Gainesville, Florida 32611, USA;
    • 5 Chan Zuckerberg Biohub, San Francisco, California 94158, USA;
    • 6 These authors contributed equally to this work.
Published July 22, 2021. Vol 31 Issue 10, pp. 1794-1806. https://doi.org/10.1101/gr.272344.120

Abstract

Direct comparison of bulk gene expression profiles is complicated by distinct cell type mixtures in each sample that obscure whether observed differences are actually caused by changes in the expression levels themselves or are simply a result of differing cell type compositions. Single-cell technology has made it possible to measure gene expression in individual cells, achieving higher resolution at the expense of increased noise. If carefully incorporated, such single-cell data can be used to deconvolve bulk samples to yield accurate estimates of the true cell type proportions, thus enabling one to disentangle the effects of differential expression and cell type mixtures. Here, we propose a generative model and a likelihood-based inference method that uses asymptotic statistical theory and a novel optimization procedure to perform deconvolution of bulk RNA-seq data to produce accurate cell type proportion estimates. We show the effectiveness of our method, called RNA-Sieve, across a diverse array of scenarios involving real data and discuss extensions made uniquely possible by our probabilistic framework, including a demonstration of well-calibrated confidence intervals.
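
To make the deconvolution problem concrete, the sketch below treats a bulk profile as a proportion-weighted mixture of per-cell-type mean expression vectors and recovers the proportions by least squares constrained to the probability simplex. This is an illustration only and not the RNA-Sieve likelihood model: it ignores noise and variance structure, the data are a noise-free toy example, and the names project_to_simplex, deconvolve, and signatures are hypothetical.

```python
# Toy deconvolution sketch (NOT RNA-Sieve): constrained least squares
# solved by projected gradient descent. All names and data are illustrative.

def project_to_simplex(v):
    """Euclidean projection of v onto {p : p >= 0, sum(p) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def deconvolve(signatures, bulk, n_iter=500):
    """Estimate cell type proportions from a bulk expression vector.

    signatures[g][k] is the mean expression of gene g in cell type k
    (assumed here to come from a single-cell reference).
    """
    n_genes, n_types = len(signatures), len(signatures[0])
    # Step size 1 / ||M||_F^2 is no larger than 1 / L for the
    # objective 0.5 * ||M p - b||^2, so each step is safe.
    step = 1.0 / sum(x * x for row in signatures for x in row)
    p = [1.0 / n_types] * n_types  # start from uniform proportions
    for _ in range(n_iter):
        residual = [
            sum(signatures[g][k] * p[k] for k in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        grad = [
            sum(signatures[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        p = project_to_simplex([p[k] - step * grad[k] for k in range(n_types)])
    return p


# Four genes, three cell types; the last gene is uninformative (equal in all types).
signatures = [
    [10.0, 1.0, 1.0],
    [1.0, 10.0, 1.0],
    [1.0, 1.0, 10.0],
    [5.0, 5.0, 5.0],
]
true_proportions = [0.5, 0.3, 0.2]
bulk = [
    sum(s * p for s, p in zip(row, true_proportions)) for row in signatures
]
estimated = deconvolve(signatures, bulk)
print([round(x, 4) for x in estimated])
```

In this noise-free setting the estimate matches the mixing proportions used to build the bulk vector. Real single-cell references are noisy, which is the situation the likelihood-based approach described in the abstract is designed to handle.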
