Combining DNA and protein alignments to improve genome annotation with LiftOn

Figure 4.

Examples in which LiftOn improves the current T2T-CHM13 annotation. (A) In the NM_006065.5 transcript of the SIRPB1 gene, the current CHM13 annotation omits three coding exons. The LiftOn version finds those exons and increases the DNA sequence identity from 81% to 98%. (B) In transcript NM_001134939.1 of OAZ3, the CHM13 annotation incorporates a partial CDS in the second exon, leading to a truncated protein. LiftOn corrects this misannotation, increasing the protein sequence identity from 5.42% to 100%. (C) In transcript XM_047448259.1 of EPHA2, the published annotation chooses the wrong start codon. LiftOn finds a better start codon that improves the protein sequence identity from 2.4% to 98.7%. (D) In transcript NM_001099772.2 from CYP4B1, LiftOn shifts the donor site of the seventh coding exon by 11 nucleotides, fixing a frameshift and improving the protein sequence identity from 53% to 99%.

This Article

  1. Genome Res. 35: 311-325