Table 3.

Mouse-Human Sequence Conservation in Selected Genomic Regions

                   Non-exonic, non-repetitive (unmasked) sequence
Genomic region[i]  Total conserved (%)[ii]  Highly conserved (%)[iii]  G+C (%)[iv]  Length (bp)[v]  Masked (%)[vi]  Reference[vii]
HOXA               99.3                     21.3                       50.7         93,211          15.2            Unpublished
TCR                77.8                     7.0                        44.0         77,115          21.0            Koop and Hood 1994
FHIT               58.1                     7.6                        37.1         331,123         42.1            Shiraishi et al. 2001
CFTR               53.2                     4.9                        34.9         247,331         41.3            Ellsworth et al. 2000
BTK                49.6                     4.9                        41.1         43,504          41.0            Oeltjen et al. 1997
SNCA               44.4                     1.0                        34.6         84,504          29.8            Touchman et al. 2001
DIST1              40.9                     0.8                        55.3         64,841          45.7            Flint et al. 2001
MECP2              39.7                     5.9                        47.8         59,670          56.9            Reichwald et al. 2000
CD4                35.6                     3.3                        51.9         106,531         50.8            Ansari-Lari et al. 1998
CECR               21.3                     1.8                        45.9         368,778         52.5            Footz et al. 2001
WS region          20.3                     1.1                        48.9         573,537         49.7            This paper
MYO15              15.4                     3.7                        56.9         46,035          47.7            Liang et al. 1999
ERCC2              11.0                     0                          58.5         15,721          61.7            Lamerdin et al. 1996

[i] Listed here are 13 genomic regions for which mouse and human genomic sequence is available for comparative analyses. In all cases except for the WS region, finished sequence was available for both mouse and human; in these cases, the name of a known (human) gene within the sequenced region is given. In the case of the WS region, the ∼1.4 Mb of finished mouse sequence was analyzed and an attempt was made to remove mouse sequence for which the orthologous human sequence was not available.

[ii] Annotated exons and sequences identified by the RepeatMasker program (using the default settings) were masked in the human sequence (or the mouse sequence in the case of the WS region). The mouse and human sequences were then aligned with the BLASTZ component of PipMaker (using the default settings). In all cases except for the WS region, the human sequence was used as the reference for the PipMaker analysis. Shown in this column is the percentage of the non-exonic, non-repetitive sequence within a mouse-human alignment, reflecting the amount of unmasked sequence with at least moderate levels of mouse-human sequence conservation.

[iii] Percentage of the non-exonic, non-repetitive (unmasked) sequence within a gap-free mouse-human sequence alignment of ≥100 bp in length and ≥70% nucleotide identity.

[iv] Percentage of G+C nucleotides in the non-exonic, non-repetitive (unmasked) sequence.

[v] Total length (in bp) of the non-exonic, non-repetitive (unmasked) sequence.

[vi] Percentage of the entire region masked as repetitive or exonic.

[vii] All of the mouse and human genomic sequences used for the analysis summarized in this table are in GenBank. When available, a citation reporting the mouse and/or human sequence for the region is provided.
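
As an illustration of how the column definitions in notes [ii] through [vi] relate to one another, the following is a minimal Python sketch, not the procedure actually used to build the table. It assumes a reference sequence in which exons and repeats have already been replaced by the character N, a list of aligned intervals, and a list of gap-free blocks with their percent identity, all in 0-based, half-open coordinates on the reference. The function name `table_columns`, its parameters, and the input layout are hypothetical; parsing of RepeatMasker or PipMaker output is not shown.

```python
def table_columns(masked_seq, alignments, gap_free_blocks, mask_char="N"):
    """Compute the per-region statistics described in notes [ii]-[vi].

    masked_seq      -- reference sequence with exons/repeats set to mask_char (assumed)
    alignments      -- (start, end) intervals lying within any mouse-human alignment
    gap_free_blocks -- (start, end, percent_identity) for each gap-free block
    Coordinates are 0-based and half-open (an assumption of this sketch).
    """
    seq = masked_seq.upper()
    unmasked = [base != mask_char for base in seq]
    length = sum(unmasked)  # note [v]: unmasked length in bp
    if length == 0:
        raise ValueError("sequence is entirely masked")

    def pct_unmasked_covered(intervals):
        # Mark every reference position that falls inside an interval,
        # then count only the positions that are also unmasked.
        covered = [False] * len(seq)
        for start, end in intervals:
            for i in range(start, end):
                covered[i] = True
        hits = sum(1 for c, u in zip(covered, unmasked) if c and u)
        return 100.0 * hits / length

    # Note [iii]: gap-free blocks of >=100 bp with >=70% nucleotide identity.
    strong = [(s, e) for s, e, ident in gap_free_blocks
              if e - s >= 100 and ident >= 70.0]

    gc = sum(1 for base, u in zip(seq, unmasked) if u and base in "GC")

    return {
        "total_conserved_pct": pct_unmasked_covered(alignments),    # note [ii]
        "highly_conserved_pct": pct_unmasked_covered(strong),       # note [iii]
        "gc_pct": 100.0 * gc / length,                              # note [iv]
        "length_bp": length,                                        # note [v]
        "masked_pct": 100.0 * (len(seq) - length) / len(seq),       # note [vi]
    }


# Toy input: 100 masked bases, then 200 bp of G/C, then 100 bp of A/T.
toy_seq = "N" * 100 + "GC" * 100 + "AT" * 50
toy_alignments = [(50, 250)]
toy_blocks = [(100, 220, 80.0), (220, 250, 95.0), (300, 400, 60.0)]
stats = table_columns(toy_seq, toy_alignments, toy_blocks)
```

In the toy input only the first gap-free block meets both thresholds of note [iii]: the second is shorter than 100 bp and the third falls below 70% identity. Note that the percentages in columns [ii] through [iv] use the unmasked length as the denominator, whereas the masked percentage in column [vi] uses the length of the entire region.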