RT Journal A1 Borozan, Bartol A1 Prusina, Tomislav A1 Borozan, Luka A1 Ševerdija, Domagoj A1 Rojas Ringeling, Francisca A1 Matijević, Domagoj A1 Canzar, Stefan T1 Optimal marker genes for c-separated cell types with SepSolve JF Genome Research JO Genome Research YR 2025 FD December 01 VO 35 IS 12 SP 2770 OP 2780 DO 10.1101/gr.280637.125 UL http://genome.cshlp.org/content/35/12/2770.abstract AB The identification of cell types in single-cell RNA-seq studies relies on the distinct expression signature of marker genes. A small set of target genes is also needed to design probes for targeted spatial transcriptomic experiments and to target proteins in single-cell spatial proteomics or for cell sorting. Although traditional approaches have relied on testing one gene at a time for differential expression between a given cell type and the rest, more recent methods have highlighted the benefits of a joint selection of markers that together distinguish all pairs of cell types simultaneously. However, existing methods either consider all pairs of individual cells, which becomes intractable even for medium-sized data sets, or ignore intra-cell-type expression variation entirely by collapsing all cells of a given type to a single representative. Here, we address these limitations and propose to find a small set of genes such that cell types are c-separated in the selected dimensions, a notion introduced previously in learning a mixture of Gaussians. To this end, we formulate a linear program that naturally takes into account expression variation within cell types without including each pair of individual cells in the model, leading to a highly stable set of marker genes that allow to accurately discriminate between cell types and that can be computed to optimality efficiently.