Figure 4.

A stop codon in the annotated gene E02C12.7. (A) A representation, derived from ACeDB, of the tandem gene cluster containing E02C12.6, E02C12.7, E02C12.8, E02C12.9, E02C12.10, E02C12.11, and E02C12.12, which shows homology to a putative choline kinase. The grey arrows indicate the extent of the annotated genes, with the gene structure predictions depicted at top. The black arrows represent the gene units after realignment of the coding sequences as described in the text. The ruler is marked in 500 base pair units. (B) A BLIXEM window from ACeDB with E02C12.8 and E02C12.7 depicted in the top half of the window. Each horizontal line is a homology and the region within the box, covering the gap between the two annotated genes, is expanded to give the lower part of the figure. The theoretical translation across this window is given in the three reading frames, (+1), (+2), and (+3), although only the third reading frame is relevant here. The homologous sequence for three other genes in this tandem cluster, E02C12.6, E02C12.10, and E02C12.12, is retained from the BLIXEM window, whereas other homologous sequences have been removed for clarity. The sequence homology extends upstream from the predicted start of E02C12.7 (MIIDFVPNIQ…) into the predicted intergenic region. A small intron would then link from the homology of E02C12.8 to that of E02C12.7, matching the coding regions of E02C12.6, E02C12.10, E02C12.12, and others, seamlessly. This perfect alignment only fails because of the stop codon (arrow) in the sequence VYCLK*FDNE, which led to the prediction of two gene units. In fact, E02C12.8 and E02C12.7 appear to form a single pseudogene.
