Table 3.

Numbers of the predicted and annotated genes as well as individual CDSs (including all alternative CDSs)

| Species | GeneMark-ETP: No. of protein-coding genes | GeneMark-ETP: No. of predicted genomic CDSs | RefSeq annotation: No. of protein-coding genes | RefSeq annotation: No. of annotated genomic CDSs |
|---|---|---|---|---|
| C. elegans | 18,820 | 19,806 | 19,969 | 28,544 |
| A. thaliana | 26,449 | 27,708 | 27,445 | 40,827 |
| D. melanogaster | 12,850 | 14,138 | 13,951 | 22,395 |
| S. lycopersicum | 24,420 | 26,341 | 25,158 | 31,911 |
| D. rerio | 28,608 | 31,961 | 25,610 | 42,929 |
| G. gallus | 17,275 | 21,433 | 17,279 | 38,534 |
| M. musculus | 23,956 | 27,686 | 22,405 | 58,318 |

[i] Note that the genomic CDSs and the corresponding transcript CDSs are supposed to be identical in sequence. The “order excluded” reference databases were used by GeneMark-ETP (see section “Data sets”).
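
As a reading aid, the counts in Table 3 can be turned into an average number of distinct CDSs per protein-coding gene, which gives a rough sense of how many alternative CDSs are predicted versus annotated for each species. The following is a minimal sketch, not part of the GeneMark-ETP pipeline; the names `TABLE3` and `cds_per_gene` are hypothetical, and the counts are copied from the table above.

```python
# Counts copied from Table 3, in the order:
# (predicted genes, predicted CDSs, annotated genes, annotated CDSs).
# TABLE3 and cds_per_gene are illustrative names, not from the paper.
TABLE3 = {
    "C. elegans": (18820, 19806, 19969, 28544),
    "A. thaliana": (26449, 27708, 27445, 40827),
    "D. melanogaster": (12850, 14138, 13951, 22395),
    "S. lycopersicum": (24420, 26341, 25158, 31911),
    "D. rerio": (28608, 31961, 25610, 42929),
    "G. gallus": (17275, 21433, 17279, 38534),
    "M. musculus": (23956, 27686, 22405, 58318),
}


def cds_per_gene(table):
    """Return {species: (predicted ratio, annotated ratio)}, rounded to 2 places.

    Each ratio is the number of genomic CDSs divided by the number of
    protein-coding genes in the same column group of Table 3.
    """
    ratios = {}
    for species, (pred_genes, pred_cds, ann_genes, ann_cds) in table.items():
        ratios[species] = (
            round(pred_cds / pred_genes, 2),
            round(ann_cds / ann_genes, 2),
        )
    return ratios


if __name__ == "__main__":
    print(f"{'Species':<16} {'GeneMark-ETP':>12} {'RefSeq':>8}")
    for species, (pred, ann) in cds_per_gene(TABLE3).items():
        print(f"{species:<16} {pred:>12.2f} {ann:>8.2f}")
```

These ratios are descriptive only: they summarize the table and say nothing about the accuracy of the individual predicted CDSs.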