Figure 3.

Extraction of k-mer coordinates from sequence ACTAGCTAGCTAG for $k = 3$ (panel A) and subsequent compression with a diff-transform (panel B), where the coordinates at a node's successor are expected to be the same but incremented by +1, as these nodes are likely to be consecutive in the input sequence(s). The symmetric set difference $A \Delta B := (A \cup B) \setminus (A \cap B)$ is used as a diff-operation. Thus, for example, $L_\delta(\mathrm{TAG}) = (\{3, 7, 11\} + 1) \,\Delta\, \{4, 8\} = \{12\}$. The inverse transform is performed losslessly by $L(v) = (L(v_{\mathrm{succ}}) \,\Delta\, L_\delta(v)) - 1$, which follows from the following property: $(A \Delta B) \Delta B = A \;\; \forall A, B$.
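The TAG example in the caption can be checked with a minimal Python sketch. It assumes 1-based k-mer start positions, which is what the coordinates {3, 7, 11} imply. It represents coordinate lists as plain Python sets and takes AGC as the only successor of TAG in this sequence. The function names `kmer_coordinates`, `diff_transform`, and `inverse_transform` are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
from collections import defaultdict


def kmer_coordinates(seq, k):
    """Map each k-mer to the set of its 1-based start positions in seq."""
    coords = defaultdict(set)
    for i in range(len(seq) - k + 1):
        coords[seq[i:i + k]].add(i + 1)  # assumption: 1-based coordinates
    return dict(coords)


def diff_transform(l_v, l_succ):
    """Shift the node's coordinates by +1, then take the symmetric difference
    with the successor's coordinates."""
    return {c + 1 for c in l_v} ^ l_succ


def inverse_transform(l_delta_v, l_succ):
    """Undo the diff: symmetric difference with the successor, then shift by -1."""
    return {c - 1 for c in (l_succ ^ l_delta_v)}


coords = kmer_coordinates("ACTAGCTAGCTAG", 3)
l_tag = coords["TAG"]   # {3, 7, 11}
l_agc = coords["AGC"]   # {4, 8}, coordinates of the successor of TAG

l_delta_tag = diff_transform(l_tag, l_agc)      # {12}
restored = inverse_transform(l_delta_tag, l_agc)  # {3, 7, 11}
print(l_delta_tag, restored == l_tag)
```

Only the single coordinate 12 has to be stored for TAG instead of three, because two of its three shifted coordinates already appear at the successor.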
