ScisTree2 enables large-scale inference of cell lineage trees and genotype calling using efficient local search
- ¹School of Computing, University of Connecticut, Storrs, Connecticut 06269, USA;
- ²Division of Hematology/Oncology, Boston Children's Hospital, Boston, Massachusetts 02115, USA;
- ³Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA;
- ⁴Institute for Systems Genomics, University of Connecticut, Storrs, Connecticut 06269, USA
Abstract
In a multicellular organism, cell lineages share a common evolutionary history. Knowing this history can facilitate the study of development, aging, and cancer. Cell lineage trees represent the evolutionary history of cells sampled from an organism. Recent developments in single-cell sequencing have greatly facilitated the inference of cell lineage trees. However, single-cell data are sparse and noisy, and single-cell data sets are growing rapidly in size. Accurate inference of cell lineage trees from large single-cell data sets is therefore computationally challenging. In this paper, we present ScisTree2, a fast and accurate approach for cell lineage tree inference and genotype calling based on the infinite-sites model. ScisTree2 relies on an efficient local search approach to find optimal trees, and it calls single-cell genotypes based on the inferred cell lineage tree. Experiments on simulated and real biological data show that ScisTree2 achieves better overall accuracy while being significantly more efficient than existing methods. To the best of our knowledge, ScisTree2 is the first model-based cell lineage tree inference and genotype calling approach that is capable of handling data sets from tens of thousands of cells or more.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.280542.125.
- Freely available online through the Genome Research Open Access option.
- Received February 13, 2025.
- Accepted August 26, 2025.
This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.











