GFScan: A Gene Family Search Tool at Genomic DNA Level

Table 5.

Comparison Results with BLAST

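| | GFScan | TBLASTN (E < E_m) | TBLASTN (E < 1e−4) | TBLASTN (E < 10) | BLASTN (E < 10) |
|---|---|---|---|---|---|
| A. NGIC Family (E_m = 9e−6) | | | | | |
| Known member | 37 | 37 | 37 | 37 | 37 |
| Location found | 38 | 45 | 48 | 59 | 33 |
| Known location found | 29 | 29 | 29 | 29 | 28 |
| Potential candidates | 8 | 8 | 8 | 8 | 5 |
| False positives | 1 | 8 | 11 | 22 | 0 |
| Known location missed | 8 | 8 | 8 | 8 | 9 |
| B. CA Family (E_m = 9e−10) | | | | | |
| Known member | 14 | 14 | 14 | 14 | 14 |
| Location found | 19 | 19 | 23 | 38 | 16 |
| Known location found | 12 | 12 | 12 | 12 | 11 |
| Potential candidates | 6 | 6 | 6 | 6 | 5 |
| False positives | 1 | 1 | 5 | 20 | 0 |
| Known location missed | 2 | 2 | 2 | 2 | 3 |
| C. DH-Domain Family (E_m = 1e−8) | | | | | |
| Known member | 8 | 8 | 8 | 8 | 8 |
| Location found | 9 | 11 | 16 | 44 | 5 |
| Known location found | 7 | 7 | 7 | 7 | 5 |
| Potential candidates | 1 | 1 | 1 | 1 | 0 |
| False positives | 1 | 3 | 8 | 36 | 0 |
| Known location missed | 1 | 1 | 1 | 1 | 3 |
| D. ETS-Domain Family (E_m = 1e−10) | | | | | |
| Known member | 19 | 19 | 19 | 19 | 19 |
| Location found | 26 | 34 | 37 | 58 | 15 |
| Known location found | 18 | 18 | 18 | 18 | 14 |
| Potential candidates | 8 | 8 | 8 | 8 | 1 |
| False positives | 0 | 8 | 11 | 32 | 0 |
| Known location missed | 1 | 1 | 1 | 1 | 5 |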
  • E_m: The minimum E-value threshold needed for TBLASTN to find all known members.

  • Potential candidates: Genomic locations that are not related to known members but whose translated protein can match the regular expression pattern of the gene family in the PROSITE database.

  • False positives: Genomic locations where no gene family member is located (see Methods for details).
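The counts in Table 5 can be cross-checked and summarized with a short script. The sketch below is illustrative only and is not part of GFScan. It assumes the five data columns are GFScan, TBLASTN at three E-value cutoffs, and BLASTN, as reconstructed from the flattened header. The two identities it asserts (locations found = known + candidates + false positives, and known members = known found + known missed) are observations that hold for the numbers in the table, not definitions stated by the authors. The names `TABLE5`, `COLUMNS`, and `summarize` are hypothetical.

```python
# Counts transcribed from Table 5. Assumed column order (reconstructed header):
COLUMNS = ["GFScan", "TBLASTN E<E_m", "TBLASTN E<1e-4", "TBLASTN E<10", "BLASTN E<10"]

TABLE5 = {
    "NGIC": {"known_members": 37,
             "found": [38, 45, 48, 59, 33],
             "known_found": [29, 29, 29, 29, 28],
             "candidates": [8, 8, 8, 8, 5],
             "false_pos": [1, 8, 11, 22, 0],
             "known_missed": [8, 8, 8, 8, 9]},
    "CA": {"known_members": 14,
           "found": [19, 19, 23, 38, 16],
           "known_found": [12, 12, 12, 12, 11],
           "candidates": [6, 6, 6, 6, 5],
           "false_pos": [1, 1, 5, 20, 0],
           "known_missed": [2, 2, 2, 2, 3]},
    "DH": {"known_members": 8,
           "found": [9, 11, 16, 44, 5],
           "known_found": [7, 7, 7, 7, 5],
           "candidates": [1, 1, 1, 1, 0],
           "false_pos": [1, 3, 8, 36, 0],
           "known_missed": [1, 1, 1, 1, 3]},
    "ETS": {"known_members": 19,
            "found": [26, 34, 37, 58, 15],
            "known_found": [18, 18, 18, 18, 14],
            "candidates": [8, 8, 8, 8, 1],
            "false_pos": [0, 8, 11, 32, 0],
            "known_missed": [1, 1, 1, 1, 5]},
}


def summarize(table):
    """Check the row identities and return per-family, per-method fractions."""
    summary = {}
    for family, rows in table.items():
        for i, method in enumerate(COLUMNS):
            found = rows["found"][i]
            known = rows["known_found"][i]
            fp = rows["false_pos"][i]
            # Identities observed in the table (not stated by the authors).
            assert found == known + rows["candidates"][i] + fp
            assert rows["known_members"] == known + rows["known_missed"][i]
            summary[(family, method)] = {
                # Share of known members whose location was recovered.
                "known_recovered": known / rows["known_members"],
                # Share of reported locations classed as false positives.
                "false_pos_fraction": fp / found,
            }
    return summary


if __name__ == "__main__":
    for (family, method), stats in summarize(TABLE5).items():
        print(f"{family:5s} {method:15s} "
              f"recovered={stats['known_recovered']:.2f} "
              f"false_pos={stats['false_pos_fraction']:.2f}")
```

Reading the two fractions side by side shows the trade-off the table reports: relaxing the TBLASTN cutoff leaves the number of known locations recovered unchanged while the false-positive share grows.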
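The PROSITE check mentioned for potential candidates can be illustrated by converting a PROSITE-style pattern into a regular expression and scanning a protein sequence. This is a minimal sketch and not the authors' implementation. It handles only the common pattern elements (`x`, `[...]`, `{...}`, repeat counts, and `<`/`>` anchors on whole elements). The pattern and the sequence are made up for illustration and do not correspond to any of the four families in the table. The function names `prosite_to_regex` and `find_pattern` are hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simplified PROSITE-style pattern to a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchor at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchor at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate the residue specification from an optional (n) or (n,m).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        # "[...]" and single letters are already valid regex fragments.
        if low and high:
            core += "{" + low + "," + high + "}"
        elif low:
            core += "{" + low + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_pattern(protein: str, pattern: str):
    """Return (start, end, matched_text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.end(), m.group()) for m in regex.finditer(protein)]


if __name__ == "__main__":
    toy_pattern = "C-x(2,4)-C-x(3)-[LIVM]-{P}-H"   # illustrative, not a real family
    toy_protein = "MKCAACQWEVTHR"                  # illustrative sequence
    print(prosite_to_regex(toy_pattern))
    print(find_pattern(toy_protein, toy_pattern))
```

In a setting like the one described in the footnote, the input would be the translation of a candidate genomic location, and a hit would support labelling it a potential candidate rather than a false positive.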

This Article

  1. Genome Res. 12: 1142-1149
