Combining DNA and protein alignments to improve genome annotation with LiftOn

Table 2.

Statistics for LiftOn at both the gene and transcript levels, after mapping the RefSeq release v220 annotation from the human genome (GRCh38) to Pan troglodytes (mPanTro3-v1.1), from Drosophila melanogaster (genome assembly release 6 + ISO1MT) to Drosophila erecta (Prin_Dsim_3.1), and from Mus musculus (GRCm39) to Rattus norvegicus (mRatBN7.2).

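| Lift-over experiment | Feature | Reference/Target | Total feature count | Protein-coding: single copy | Protein-coding: extra copy | Protein-coding: extra copy count | Protein-coding: total | Noncoding: single copy | Noncoding: extra copy | Noncoding: extra copy count | Noncoding: total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Homo sapiens to Pan troglodytes | Gene | Reference (GRCh38) | 37,986 | | | | 19,927 | | | | 18,059 |
| Homo sapiens to Pan troglodytes | Gene | Target (Pan_trog_v1) | 38,879 | 19,485 | 157 | 535 | 20,177 | 17,596 | 271 | 835 | 18,702 |
| Homo sapiens to Pan troglodytes | Transcript | Reference (GRCh38) | 160,561 | | | | 130,528 | | | | 30,033 |
| Homo sapiens to Pan troglodytes | Transcript | Target (Pan_trog_v1) | 161,222 | 128,859 | 450 | 899 | 130,208 | 29,233 | 404 | 1377 | 31,014 |
| Drosophila melanogaster to Drosophila erecta | Gene | Reference (D. melanogaster) | 16,005 | | | | 13,962 | | | | 2043 |
| Drosophila melanogaster to Drosophila erecta | Gene | Target (D. erecta) | 15,276 | 13,180 | 140 | 152 | 13,472 | 1804 | 0 | 0 | 1804 |
| Drosophila melanogaster to Drosophila erecta | Transcript | Reference (D. melanogaster) | 33,176 | | | | 30,749 | | | | 2427 |
| Drosophila melanogaster to Drosophila erecta | Transcript | Target (D. erecta) | 32,143 | 29,739 | 138 | 150 | 30,027 | 2116 | 0 | 0 | 2116 |
| Mus musculus to Rattus norvegicus | Gene | Reference (M. musculus) | 35,551 | | | | 22,192 | | | | 13,359 |
| Mus musculus to Rattus norvegicus | Gene | Target (R. norvegicus) | 33,706 | 20,490 | 126 | 164 | 20,780 | 12,926 | 0 | 0 | 12,926 |
| Mus musculus to Rattus norvegicus | Transcript | Reference (M. musculus) | 119,745 | | | | 96,192 | | | | 23,553 |
| Mus musculus to Rattus norvegicus | Transcript | Target (R. norvegicus) | 115,855 | 92,767 | 125 | 163 | 93,055 | 22,800 | 0 | 0 | 22,800 |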
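
The target-row figures in Table 2 can be cross-checked with a few lines of arithmetic. The sketch below is not part of LiftOn. It assumes two relations that are inferred from the numbers rather than stated in the caption: that each group total equals single copy + extra copy + extra copy count, and that the total feature count equals the protein-coding total plus the noncoding total. The names `TARGET_ROWS` and `find_inconsistencies` are hypothetical and used only for illustration.

```python
# Target-row counts transcribed from Table 2. Each entry holds
# (experiment, feature, total, protein_coding, noncoding), where each of the
# two groups is (single copy, extra copy, extra copy count, group total).
TARGET_ROWS = [
    ("H. sapiens to P. troglodytes", "gene", 38879,
     (19485, 157, 535, 20177), (17596, 271, 835, 18702)),
    ("H. sapiens to P. troglodytes", "transcript", 161222,
     (128859, 450, 899, 130208), (29233, 404, 1377, 31014)),
    ("D. melanogaster to D. erecta", "gene", 15276,
     (13180, 140, 152, 13472), (1804, 0, 0, 1804)),
    ("D. melanogaster to D. erecta", "transcript", 32143,
     (29739, 138, 150, 30027), (2116, 0, 0, 2116)),
    ("M. musculus to R. norvegicus", "gene", 33706,
     (20490, 126, 164, 20780), (12926, 0, 0, 12926)),
    ("M. musculus to R. norvegicus", "transcript", 115855,
     (92767, 125, 163, 93055), (22800, 0, 0, 22800)),
]


def find_inconsistencies(rows):
    """Return descriptions of rows whose parts do not add up to their totals."""
    problems = []
    for experiment, feature, total, coding, noncoding in rows:
        for label, (single, extra, extra_count, group_total) in (
            ("protein-coding", coding),
            ("noncoding", noncoding),
        ):
            # Assumed relation: total = single copy + extra copy + extra copy count.
            if single + extra + extra_count != group_total:
                problems.append(f"{experiment} {feature} {label}: group total mismatch")
        # Assumed relation: total features = protein-coding total + noncoding total.
        if coding[3] + noncoding[3] != total:
            problems.append(f"{experiment} {feature}: overall total mismatch")
    return problems


problems = find_inconsistencies(TARGET_ROWS)
print(problems)  # prints []
```

An empty list means every target row is internally consistent under the two assumed relations. The reference rows list only totals, so they are left out of the check.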

This Article

  1. Genome Res. 35: 311-325
