Schematic representation of chimeric transcripts and potentially functional ORFs. Exonic sequences are boxed and were deduced from comparative analyses between the DNA sequences of the transcripts and genomic clones. Sequences highly similar (86%–100% identity) to other known gene-exonic sequences are colored as follows: green, GUSB derived; blue, OCLN derived; orange, NAIP derived. These exons are numbered according to the exonic sequences they are derived from; for example, N7 is the sequence paralogous to NAIP exon 7. GUSB-derived chimeric transcripts are drawn to scale. For the NAIP-OCLN-derived transcripts, only partial information was available. The 3′ part of these cDNAs was previously assigned as NAIP exon 17 (Roy et al. 1995). However, further in silico analysis revealed that the 3′ ends actually consisted of four exons homologous to OCLN exons 5–9, and were associated with the exonic sequence X4, whose genomic origin remains undetermined. (Rp) Repeated sequence as follows: (Rp.1) LTR/pTR5; (Rp.2) LINE/HAL1-SINE/Alu Sx; (Rp.3) MER3-SINE/Alu Y; (Rp.4) LTR/pTR5-LINE/L1. (C161) Exonic sequence containing the CATT1-G1/C161 dinucleotide repeat marker. Gray brackets numbered a–l indicate the extent of potentially functional ORFs initiated from an ATG codon in a reasonable initiation context (Kozak 1984). For genomic localizations of sequences paralogous to exons C161 and X3, see Figure 5B, and for exons X, X1, and X2, see Figures 4 and 5B.
