Table 1.

Table of Novel Domains

Domain | Description | Length (aa) | Sec. struct. pred. | Pred. function | No. of proteins | Associated domains | Species | Acc. no. of a representative sequence (domain borders)
Part A. Domains Present in Different Species
JmjC[ii] | Jumonji related family | 100 | β | Metallo-enzymes | 140 | BRIGHT, jmjN, PHD, FBOX, LRR, C2, TPR, PLAc, CXXC, ZnF_C2H2 | Eu, y, a, c, d, h | O14607[iv] (1042–1205)
CSZ | Domain in chromatin remodeling S1 domain containing and Zinc finger proteins | 750 | α/β | DNA-binding, Chromatin modulation | 35 | S1, SH2, C2HC, HhH | Eu, y, a, c, d, h | P34703 (389–1120)
RPR | Proteins involved in regulation of nuclear pre-mRNA | 120 | α | Protein-interaction | 40 | RRM, PWWP, SURP, G-Patch | y, a, c, d, h | Q9SJQ7 (88–225)
DDT[ii] | Different transcription and chromosome remodeling factors | 60 | α | DNA-binding | 30 | AT_Hook, PHD, HOX, BROMO, MBD | y, a, c, d, h | Q9UIG2[iv] (102–161)
TLDc | TBC, LysM and other proteins | 220 | α/β+β | Enzyme | 30 | TBC, LysM, R3H, FBOX | y, a, c, d, h | Q9VNA1[iv] (1163–1325)
PUG | Protein kinases, UBA or UBX domain containing proteins and glycanases | 60 | α/β | RNA-binding | 25 | C2H2, UBA, TGc, UBX, S_TKc, STYKc | y, a, c, d, h | Q9MAT3 (323–386)
HSA | Helicases and SANT domains | 70 | α | DNA-binding | 20 | SANT, BROMO, DEXDc, HELIc | y, a, c, d, h | P25439[iv] (501–573)
PSP | Proline-rich, in spliceosome associated proteins | 60 | α | RNA- or snRNP-binding | 15 | SAP, C2HC | y, a, c, d, h | O16997 (200–357)
FYRN | Trithorax and X-chromosome inactivating proteins | 40 | α/β | Unknown | 25 | PHD, SET, PWWP | a, c, d, h | Q24742[iv] (1869–1914)
FYRC | Trithorax and X-chromosome inactivating proteins | 90 | α/β | Unknown | 25 | PHD, SET, PWWP | a, c, d, h | Q24742[iv] (3495–3583)
RUN[ii] | TBC, PH, FYVE and other proteins | 65 | α | GTPase signalling | 40 | DENN, TBC, PLAT, PH, C1, FYVE, GST, SH3 | c, d, h | BAB14033 (115–178)
TCH | Transcription factors and CHROMO domain helicases | 50 | α/β | Unknown | 20 | CHROMO, PHD, TFSM2, DEXDc, HELIc, SANT, BROMO | c, d, h | O15025[iv] (882–931)
DZF | DSRM or ZnF_C2H2 domain containing proteins | 250 | α/β | Unknown | 40 | C2H2, DSRM | c, d, h | O88531 (762–1016)
NEUZ | Domain in neuralized-like proteins | 120 | β | Unknown | 10 | SOCS, RING, SPRY, SH2 | c, d, h | Q19299 (199–321)
ZnF_TTF | Domain in transposases and transcription factors | 100 | α + β | Metal-binding | 20 | KRAB, BTB | a, d, h | Q9ZWT4 (100–199)
Part B. Species-Specific Domains
FBD | Domain in FBOX and other domain containing plant proteins | 80 | α/β | Unknown | 160 | FBOX, LRRcap, BRCT, AAA | a | Q9LXJ7[iv] (304–382)
ZnF_PMZ | Plant mutator transposase zinc finger domain | 27 | α/β | Metal-binding | 125 | AT_Hook, ZnF_C2HC, PHD | a | Q9SH73 (3212–3239)
SPK | SET and PHD domain containing proteins and protein kinases | 120 | α/β | Protein-interaction | 40 | SET, ICE_p10, ICE_p20, ZnF_C2HC, PHD, STYKc | c | Q9XU06 (139–250)
Part C. Domains, Newly Recognized Divergent Subfamilies
ZnF_BED[ii] | BED zinc finger, related to C2H2/C2H2 zinc fingers (based on pattern similarity) | 60 | β | Metal-binding | 50 | AT_Hook, PTPc_DSPc | y, a, c, d, h | Q9LWM2 (169–224)
CPDc | Catalytic domain of ctd-like phosphatases, related to phosphatase superfamily (based on pattern similarity) | 120 | α/β | Phosphatase | 70 | BRCT, DSRM, UBQ | y, a, c, d, h | Q9PTJ8 (93–236)
RWD | RING finger and WD repeat containing proteins and DEXDc helicases, related to the UBCc domain (revealed by hmm searches) | 110 | α/β | Protein-interaction | 60 | S_TKc, RING, WD, UPF29, DEXDc, HELIc | y, a, c, d, h | Q9QZ05[iv] (25–137)
BTP | Bromodomain transcription factors and PHD domain containing proteins, related to archaeal histone-like transcription factors, defined by PFAM (revealed by PSI-Blast results with less significance (E = 0.041)) | 90 | α | DNA-binding | 25 | AT_Hook, BROMO, PHD | y, a, c, d, h | Q9S7R9 (41–131)
MADF | Zinc finger, PHD domain and WD repeats containing proteins, related to SANT domain (after the second iteration Q9SR68 bridges to SANT domains (E = 0.002)) | 90 | α | DNA- or Protein-binding | 60 | C2H2, PHD, WD | Virus, a, c, d | Q9V5Y9 (22–110)
ZnF_DBF | Zinc finger in DBF-like proteins, related to C2H2 zinc fingers (revealed by pattern similarity and hmm searches, E value = 1.4) | 50 | α | Metal-binding | 10 | BRCT, AT_Hook | y, d, h | O93843 (590–638)
CHK | C4-zinc finger and HLH domain containing kinase subfamily of choline kinases (after the second iteration P35790 bridges to choline kinases, defined by PFAM (E = 0.003)) | 200 | α/β | Enzyme | 70 | ZnF_C4, HLH, i.c | Eu, c, d | Q9VBT6 (129–321)
Part D. Family-Specific Extensions of Known Domains
AWS | Associated with SET domain, subdomain of PRESET[iii] (hmm searches, E value = 0.52) | 50 | α/β | Histone modification | 25 | SET, PWWP, AT_Hook, WW, PHD, POSTSET, BAH | y, a, c, d, h | P46995 (63–119)
POX | Domain associated with HOX-domains | 50 | α | Unknown | 20 | HOX | a | Q38897 (199–337)
PRE_C2HC | Associated with zinc fingers | 70 | α/β | Unknown | 15 | ZnF_C2HC | d | O44939[iv] (546–616)

[i] First column, domain name; second column, domain description (e.g., associated domains or well-described proteins); third column, approximate domain length (number of amino acids); fourth column, secondary structure prediction (Rost et al. 1994) (α: domain consists of α-helices; β: domain consists of β-strands; α/β: domain consists of α-helices and β-strands); fifth column, predicted function of the novel domain; sixth column, number of proteins containing the novel domain; seventh column, names of associated domains (domain names follow the Simple Modular Architecture Research Tool (http://smart.embl-heidelberg.de) (Schultz et al. 1998, 2000), or the domain is defined by Pfam (Bateman et al. 2000))†; eighth column, representative species containing the novel domain; ninth column, accession number of a representative protein and the region of the detected domain in amino acids. Abbreviations: Eu, eubacteria; Virus, viruses; y, yeast; a, Arabidopsis thaliana; c, Caenorhabditis elegans; d, Drosophila melanogaster; h, Homo sapiens.

[ii] Novel domain is accepted, in press, or published recently.

[iii] Unpublished domain.

[iv] Additional HMM searches are needed to define all novel domain-containing proteins.

[v] +The more conserved parts of the domains FYRN and FYRC were called ATA1 and ATA2 in the human ALR protein (Prasad et al. 1997) and FYR (merged into one domain) in plant proteins (Balciunas and Ronne 2000), respectively.
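
For readers who want to process the table programmatically, the following is a minimal sketch of how the species codes and the accession/domain-border cells described in footnote [i] could be parsed. It is an illustration only: the names `DomainBorders`, `parse_borders`, `expand_species`, and `SPECIES` are hypothetical and not part of the original work, and the cell format (accession, optional footnote marker, then start and end residues in parentheses) is assumed from the rows shown above.

```python
import re
from dataclasses import dataclass

# Species abbreviations as listed in footnote [i] of the table.
SPECIES = {
    "Eu": "eubacteria",
    "Virus": "viruses",
    "y": "yeast",
    "a": "Arabidopsis thaliana",
    "c": "Caenorhabditis elegans",
    "d": "Drosophila melanogaster",
    "h": "Homo sapiens",
}


@dataclass
class DomainBorders:
    """Representative sequence and the residue range of the domain (hypothetical helper)."""
    accession: str
    start: int
    end: int

    @property
    def span(self) -> int:
        # Inclusive residue count between the two borders.
        return self.end - self.start + 1


# Assumed cell layout: accession, optional footnote marker such as [iv],
# then "(start–end)" written with an en dash or a plain hyphen.
_CELL = re.compile(
    r"^\s*(?P<acc>[A-Z0-9]+)\s*(?:\[[ivx]+\])?\s*"
    r"\((?P<start>\d+)\s*[–-]\s*(?P<end>\d+)\)\s*$"
)


def parse_borders(cell: str) -> DomainBorders:
    """Parse a cell such as 'O14607[iv] (1042–1205)'."""
    match = _CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized accession cell: {cell!r}")
    return DomainBorders(match["acc"], int(match["start"]), int(match["end"]))


def expand_species(cell: str) -> list[str]:
    """Expand a comma-separated species cell such as 'y, d, h' to full names."""
    return [SPECIES[code.strip()] for code in cell.split(",")]


if __name__ == "__main__":
    borders = parse_borders("O14607[iv] (1042–1205)")
    print(borders.accession, borders.start, borders.end, borders.span)
    print(expand_species("y, d, h"))
```

The inclusive span computed this way is the length of the listed region in the representative sequence, which can differ from the approximate domain length given in the third column.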