Is “Junk” DNA Mostly Intron DNA?

Table 1.

Estimated Intergenic Fractions

| | Homo sapiens | Drosophila melanogaster | Caenorhabditis elegans | Arabidopsis thaliana |
|---|---|---|---|---|
| Euchromatin (kb) | 3180000 | 123000 | 97800 | 130000 |
| Sequenced DNA (kb) | 369000 | 123000 | 91000 | 119000 |
| Gene-to-gene (kb) | 45.4 | 9.0 | 5.3 | 4.7 |
| cDNA aligned | 1061 | 1628 | 583 | 1401 |
| Genomic quality | 1.2 | 23.3 | 2.4 | 15.7 |
| Nested genes | 6% | 8% | 4% | 1% |
| 5th percentile (kb) | 2.5 | 0.9 | 0.8 | 0.9 |
| Genomic length, mean (kb) | 43.4 | 9.5 | 5.0 | 2.6 |
| 95th percentile (kb) | 165.5 | 36.3 | 14.2 | 5.4 |
| %, missing half | 11% | 10% | 21% | 30% |
| Intergenic DNA | Discussed in text of article | 3% | 10% | 46% |
  • The first three rows list the euchromatic genome size, the amount of genomic sequence that was analyzed, and the annotation-based estimate of the gene-to-gene distance. The next three rows describe the cDNA alignments: the number of aligned cDNAs, our quality assessment for the genomic contigs (i.e., the median of the genomic contig size divided by the genomic length for the 95th-percentile gene), and our estimate of the frequency of nested genes (i.e., genes on the reverse strand or inside an intron). The following three rows give the genomic length by its arithmetic mean and its 5th and 95th percentile values. Next, we indicate what fraction of the largest genes would have to be unidentified for half of the intragenic space to be missing. The last row lists the intergenic fraction, computed by correcting the mean genomic length for nested genes, dividing that by the mean gene-to-gene distance, and subtracting the result from one. Note: In Drosophila melanogaster, we do not count scaffold joins longer than 1 kb as contiguous when computing the genomic quality. All lengths are reported in kb.
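The last-row calculation described in the legend can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the legend does not spell out how the mean genomic length is "corrected" for nested genes, so the sketch assumes the mean is simply scaled by one minus the nested-gene frequency. The function name `intergenic_fraction` and the dictionary layout are hypothetical; the numbers are the rounded values printed in Table 1.

```python
# Rounded values copied from Table 1 (lengths in kb, nested genes as a fraction).
TABLE_1 = {
    "Drosophila melanogaster": {"mean_length": 9.5, "nested": 0.08, "gene_to_gene": 9.0},
    "Caenorhabditis elegans": {"mean_length": 5.0, "nested": 0.04, "gene_to_gene": 5.3},
    "Arabidopsis thaliana": {"mean_length": 2.6, "nested": 0.01, "gene_to_gene": 4.7},
}


def intergenic_fraction(mean_length, nested, gene_to_gene):
    """Estimate the intergenic fraction as described in the table legend.

    ASSUMPTION: the nested-gene correction is modeled as scaling the mean
    genomic length by (1 - nested), i.e. nested genes add no extra
    occupied sequence. The source does not state the exact form.
    """
    corrected_length = mean_length * (1.0 - nested)
    return 1.0 - corrected_length / gene_to_gene


if __name__ == "__main__":
    for species, row in TABLE_1.items():
        fraction = intergenic_fraction(**row)
        print(f"{species}: {100 * fraction:.1f}% intergenic")
```

With the rounded table inputs, this gives about 2.9%, 9.4%, and 45.2% for the three species, close to the tabulated 3%, 10%, and 46%. The small differences are presumably due to rounding of the published inputs, although that too is an assumption. Homo sapiens is left out because the table defers its intergenic fraction to the article text.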

This Article

  1. Genome Res. 10: 1672-1678
