TY  - JOUR
A1  - Guo, Shuai
A1  - Liu, Xiaoqian
A1  - Cheng, Xuesen
A1  - Jiang, Yujie
A1  - Ji, Shuangxi
A1  - Liang, Qingnan
A1  - Koval, Andrew
A1  - Li, Yumei
A1  - Owen, Leah A.
A1  - Kim, Ivana K.
A1  - Aparicio, Ana
A1  - Lee, Sanghoon
A1  - Sood, Anil K.
A1  - Kopetz, Scott
A1  - Shen, John Paul
A1  - Weinstein, John N.
A1  - DeAngelis, Margaret M.
A1  - Chen, Rui
A1  - Wang, Wenyi
T1  - A deconvolution framework that uses single-cell sequencing plus a small benchmark data set for accurate analysis of cell type ratios in complex tissue samples
Y1  - 2025/01/01
JF  - Genome Research
JO  - Genome Research
SP  - 147
EP  - 161
DO  - 10.1101/gr.278822.123
VL  - 35
IS  - 1
UR  - http://genome.cshlp.org/content/35/1/147.abstract
N2  - Bulk deconvolution with single-cell/nucleus RNA-seq data is critical for understanding heterogeneity in complex biological samples, yet the technological discrepancy across sequencing platforms limits deconvolution accuracy. To address this, we utilize an experimental design to match inter-platform biological signals, hence revealing the technological discrepancy, and then develop a deconvolution framework called DeMixSC using this well-matched, that is, benchmark, data. Built upon a novel weighted nonnegative least-squares framework, DeMixSC identifies and adjusts genes with high technological discrepancy and aligns the benchmark data with large patient cohorts of matched-tissue-type for large-scale deconvolution. Our results using two benchmark data sets of healthy retinas and ovarian cancer tissues suggest much-improved deconvolution accuracy. Leveraging tissue-specific benchmark data sets, we applied DeMixSC to a large cohort of 453 age-related macular degeneration patients and a cohort of 30 ovarian cancer patients with various responses to neoadjuvant chemotherapy. Only DeMixSC successfully unveiled biologically meaningful differences across patient groups, demonstrating its broad applicability in diverse real-world clinical scenarios. Our findings reveal the impact of technological discrepancy on deconvolution performance and underscore the importance of a well-matched data set to resolve this challenge. The developed DeMixSC framework is generally applicable for accurately deconvolving large cohorts of disease tissues, including cancers, when a well-matched benchmark data set is available.
ER  - 