Partitioned multi-MUM finding for scalable pangenomics with MumemtoM

  1. Ben Langmead
  1. Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
  • Corresponding authors: vshivak1{at}jhu.edu, langmea{at}cs.jhu.edu
  • Abstract

    Pangenome collections are growing to hundreds of high-quality genomes. This necessitates scalable methods for constructing pangenome alignments that can incorporate newly sequenced assemblies. We previously developed Mumemto, which computes maximal unique matches (multi-MUMs) across pangenomes using compressed indexing. In this work, we introduce MumemtoM (Mumemto Merge), comprising two new partitioning and merging strategies. Both strategies enable highly parallel, memory-efficient, and updateable computation of multi-MUMs. One of the strategies, called string-based merging, can also conduct the merges in an order that follows the shape of a phylogenetic tree, naturally yielding the multi-MUMs for the tree’s internal nodes as well as the root. With these strategies, Mumemto now scales to 474 human haplotypes, making it the only multi-MUM method able to do so. MumemtoM also introduces a time–memory tradeoff that allows Mumemto to be tailored to more scenarios, including resource-limited settings.
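
    To make the central object concrete, the sketch below enumerates multi-MUMs by brute force on three toy sequences. It assumes the usual definition: a substring that occurs exactly once in every sequence and cannot be extended left or right while still matching in all of them. The function names (`count_occurrences`, `brute_force_multi_mums`) and the `min_len` parameter are hypothetical and chosen for illustration; this is not how Mumemto or MumemtoM computes matches, since those tools rely on compressed indexing plus partitioning and merging.

```python
def count_occurrences(text, pattern):
    """Count (possibly overlapping) occurrences of pattern in text."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def brute_force_multi_mums(seqs, min_len=1):
    """Return [(match, positions)] for every multi-MUM of length >= min_len.

    Illustrative only: the running time is polynomial but far too slow
    for real genomes.
    """
    ref = seqs[0]
    mums = []
    for i in range(len(ref)):
        for j in range(i + min_len, len(ref) + 1):
            cand = ref[i:j]
            # Uniqueness: exactly one occurrence in every sequence.
            if any(count_occurrences(s, cand) != 1 for s in seqs):
                continue
            pos = [s.find(cand) for s in seqs]
            # Flanking characters; None marks a sequence boundary.
            lefts = [s[p - 1] if p > 0 else None for s, p in zip(seqs, pos)]
            rights = [
                s[p + len(cand)] if p + len(cand) < len(s) else None
                for s, p in zip(seqs, pos)
            ]
            # Maximality: the match cannot grow in either direction
            # while still matching in all sequences.
            left_max = None in lefts or len(set(lefts)) > 1
            right_max = None in rights or len(set(rights)) > 1
            if left_max and right_max:
                mums.append((cand, pos))
    return mums


toy = ["ACGTAGGCT", "TACGTCGGCT", "ACGTTGGCTA"]
print(brute_force_multi_mums(toy, min_len=3))
# [('ACGT', [0, 1, 0]), ('GGCT', [5, 6, 5])]
```

    Each reported match comes with its start offset in every sequence, which is the information a downstream alignment step would anchor on.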

    Footnotes

    • Received May 16, 2025.
    • Accepted November 5, 2025.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.


    1. Genome Res. © 2026 Shivakumar and Langmead; Published by Cold Spring Harbor Laboratory Press
