Markup | Genome Research

Table 1.

Applications of prefix-free parsing

Category	Subcategory	Paper	Contribution
Data structure	BWT, SA, LCP	Boucher et al. (2019), Kuhnle et al. (2020), and Rossi et al. (2022)	Builds the BWT, SA, and LCP using PFP
	BWT, SA, LCP	Oliva et al. (2023) and Ferro et al. (2024)	Builds the BWT, SA, and LCP using recursive PFP
	eBWT	Boucher et al. (2021a, 2024)	Computes the extended BWT (eBWT) for a collection of strings using PFP
	BWT	Díaz-Domínguez et al. (2025)	Shows how to merge BWTs built with PFP
	Compressed suffix tree	Boucher et al. (2021b) and Oliva et al. (2022)	Shows how PFP output can be modified to support full suffix tree functionality
String matching	MEMs	Rossi et al. (2022) and Boucher et al. (2024)	Computes maximal exact matches (MEMs) using data structures built with PFP
	MEMs, MUMs	Shivakumar and Langmead (2025)	Computes maximal exact (MEM) or unique (MUM) matches across sequences using data structures built with PFP
	MEMs, full read alignment	Varki et al. (2025)	MEM-based short-read aligner built on index constructed with PFP
	MS	Boucher et al. (2021c) and Rossi et al. (2022)	Computes matching statistics (MS) with data structures built with PFP
	PMLs	Ahmed et al. (2021, 2023)	Computes an approximation of MS called pseudomatching lengths (PMLs) using data structures built from PFP
	LF-mapping, BWT	Hong et al. (2024)	Shows that leveraging PFP phrases for LF-mapping yields better performance than standard character-level LF-mapping
Compression	LZ77	Hong et al. (2023)	Constructs LZ77 factorization using PFP
	RePair	Gagie et al. (2019)	Builds RePair grammar from PFP output
	RePair	Kim et al. (2024)	Builds RePair grammar from recursive PFP output

[i] The table highlights a subset of works that use PFP, categorized by their primary contribution. Although grouped by category, many of these papers build on one another, resulting in significant overlap.