Connecting Sequence and Biology in the Laboratory Mouse



Figure 3

Combining cluster visualization tools with the MGI FANTOM2 data table for accurate integration in MGI.

(A) Alignment view display, a cluster visualization tool available at the FANTOM2 Web interface. FANTOM2 sequences grouped in RIKEN cluster (locus ID) 22339 are shown as colored bars. RIKEN clone IDs are shown to the left of each sequence, as are the corresponding row numbers for the sequences in the MGI FANTOM2 table in B. Sequences are aligned with respect to the top sequence (black), and features such as sequence similarity (color-coded as shown) and gaps are displayed relative to it. The green arrows above the sequences represent predicted CDS regions (shown). The gaps in sequences 5 and 6 (intron) reveal the presence of an unspliced intron in sequences 3 and 8. Note the truncation of the CDS at this position in sequences 3 and 8. Sequences 5 and 6 are properly spliced. Sequences 3, 4, and 7 are partial transcripts. Non-RIKEN sequences are not shown in this view.

(B) MGI FANTOM2 data table display of the FANTOM2 sequences in A and two non-RIKEN sequences (blue) included in this cluster union (R Cluster 3268). Rows correspond to sequences and columns to sequence features. Rows are color-coded to reflect sequence origin or other status (as shown). Sequences 3 and 8 are marked as problem sequences because they contain an unprocessed intron (Seq Qual: Problem-in). Sequence 6 was selected as the representative clone (Seq Note: Representative). Before the FANTOM2 load, sequences 1 and 3 were associated with MGI gene Dnajc5 and sequence 4 with MGI gene 2610314I24Rik (RA symbol). After the FANTOM2 load, all sequences are associated with MGI gene Dnajc5 (Final symbol 1).

(C) Integration in MGI. The FANTOM1 clone 2610314I24 (sequence 4 in A, B) does not overlap the coding region of Dnajc5 and was represented as a unique MGI gene during the FANTOM1 load (Symbol: 2610314I24Rik), whereas FANTOM1 clone 1810057D19 (sequence 3 in A, B), which does overlap the CDS, was associated with the Dnajc5 gene. FANTOM2-new sequences reveal that sequence 4 is actually derived from the 3′-UTR region of Dnajc5 and that sequence 3 contains an intron that truncates the CDS. This information triggered a merge in MGI, in which the 2610314I24Rik gene was withdrawn and set equal to Dnajc5. The MGI accession ID of the withdrawn gene (MGI:1919766) became a secondary accession ID for the Dnajc5 gene (shown), and all information previously associated with 2610314I24Rik was migrated to Dnajc5. The nomenclature history for the Dnajc5 gene details this event. The molecular segment record for clone D030049H18 (sequence 8 in A, B), an intron-containing transcript (problem sequence), is also shown. A note is attached to the molecular segment records of problem sequences to inform users that curators have judged the sequence to have some type of problem.

Key to MGI FANTOM2 table columns (see Methods for descriptions): SeqID indicates RIKEN Seqid; clone ID, RIKEN cloneid; GenBank ID, DDBJ/EMBL/GenBank seqid; RA MGI ID, MGI ID to which the sequence was associated before the FANTOM2 load; RA symbol, gene symbol corresponding to the RA MGI ID; Seq length, sequence length (bp); locus ID, RIKEN cluster ID; UniGene ID, NCBI UniGene cluster ID; TIGR TC, TIGR cluster ID; R cluster, cluster union ID; locus stat, RIKEN status code; RIKEN #, RIKEN number code; MGI status, MGI status code; MGI #, MGI number code; BLAST group ID; Seq qual, sequence quality; Seq note, sequence note (to designate Representative clone); final MGI ID, MGI ID to which the sequence is associated after the FANTOM2 load; and final symbol 1, gene symbol corresponding to the Final MGI ID.
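
The merge described in C can be illustrated with a small sketch. It is not MGI's actual schema or load code: the GeneRecord class, its field names, and the merge_genes function are hypothetical, and the primary ID used for Dnajc5 ("MGI:0000000") is a placeholder because the caption does not give it. Only MGI:1919766 and the two clone IDs come from the caption. The sketch shows the three effects the caption names: the withdrawn gene's accession ID becomes a secondary ID of the surviving gene, its associated sequences migrate, and the event is recorded in the nomenclature history.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class GeneRecord:
    """Hypothetical, simplified stand-in for an MGI gene record."""
    symbol: str
    primary_id: str
    secondary_ids: List[str] = field(default_factory=list)
    clone_ids: Set[str] = field(default_factory=set)
    history: List[str] = field(default_factory=list)
    withdrawn: bool = False


def merge_genes(old: GeneRecord, kept: GeneRecord) -> GeneRecord:
    """Withdraw `old` and fold its IDs and sequences into `kept`."""
    # The withdrawn gene's accession IDs remain resolvable as secondary IDs.
    kept.secondary_ids.append(old.primary_id)
    kept.secondary_ids.extend(old.secondary_ids)
    # Everything associated with the withdrawn gene migrates to the kept gene.
    kept.clone_ids |= old.clone_ids
    old.clone_ids = set()
    old.withdrawn = True
    # Record the event in the nomenclature history of both records.
    note = f"{old.symbol} withdrawn, = {kept.symbol}"
    kept.history.append(note)
    old.history.append(note)
    return kept


# Placeholder primary ID for Dnajc5; the real ID is not stated in the caption.
dnajc5 = GeneRecord("Dnajc5", "MGI:0000000", clone_ids={"1810057D19"})
rik = GeneRecord("2610314I24Rik", "MGI:1919766", clone_ids={"2610314I24"})
merge_genes(rik, dnajc5)
```

After the call, a lookup by either accession ID can be resolved to the Dnajc5 record, which is why the withdrawn ID is kept as a secondary ID rather than discarded.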

Genome Res. 13: 1505-1519
