Data-Mining Approaches Reveal Hidden Families of Proteases in the Genome of Malaria Parasite

Table 1.

Ninety-two (92) P. falciparum Protease Homologs Predicted From Comparative Genomic Analysis

Columns: Catalytic class; Protease family; Protease nomenclature; Gene ID; Protease homolog with highest BLAST score, given as accession (species, protease name) and E-score; Pfam domain structure, given as domain ID (name) and E-score; AP
Aspartic A1 PM I PF14_0076 P39898(P. falciparum, PM I) 0 PF00026 (eukaryotic  asp protease) 1.5e-59 +
PM II PF14_0077 P46925(P. falciparum, PM II) 0 PF00026 (eukaryotic  asp protease) 8.9e-52
HAP(PM III) PF14_0078 CAB40630 (P. falciparum, HAP) 0 PF00026 (eukaryotic  asp protease) 3.3e-30 +
PM IV PF14_0075 AAC15794 (P. malariae, PM) 0 PF00026 (eukaryotic  asp protease) 5.1e-56
PM V PF13_0133 Q05744 (Chicken, Cathepsin D) 9e-08 PF00026 (eukaryotic  asp protease) 0.00037 +
PM VI PFC0495w CAC20153 (Eimeria tenella,  eimepsin) 2e-99 PF00026 (eukaryotic  asp protease) 1.1e-40
PM VII PF10_0329 CAC20153 (Eimeria tenella,  eimepsin) 8e-28 PF00026 (eukaryotic  asp protease) 1.3e-11 +
PM VIII PF14_0623 CAC20153 (Eimeria tenella,  eimepsin) 1e-35 PF00026 (eukaryotic  asp protease) 7.5e-11
PM IX PF14_0281 AAD56283(Pseudopleuronectes  americanus, pepsinogen A  form IIa) 1e-33 PF00026 (eukaryotic  asp protease) 4.5e-26
PM X PF08_0108 AAA31096 (Pig, pepsinogen A) 1e-42 PF00026 (eukaryotic  asp protease) 2.5e-26 +
Cysteine C1 Falcipain-1 PF14_0553 P25805(P. falciparum,  falcipain-1) 0 PF00112 (Papain  family) 9.5e-92
Falcipain-2 PF11_0165 AAF63497(P. falciparum,  falcipain 2) 0 PF00112 (Papain  family) 3.0e-84 +
Falcipain-3 PF11_0162 AAF86352(P. falciparum,  falcipain-3) 0 PF00112 (Papain  family) 4.8e-79
Papain PF11_0161 AAF97809(P. falciparum,  Falcipain 2) 1e-134 PF00112 (Papain  family) 8.8e-84 +
DPP I PFL2290w AAD02704 (Dog, DPP I) 5e-30 PF00112 (Papain  family) 9.9e-16
DPP I PFD0230c AAD02704 (Dog, DPP I) 6e-06 PF00112 (Papain  family) 2.7e-05 +
Cathepsin C PF11_0174 AAL48191 (Human,  Cathepsin C) 5e-23 PF00112 (Papain  family) 3.3e-14 +
SERA PFB0360c H71617(P. falciparum, SERA) 0 PF00112 (Papain  family) 2.1e-51
SERA PFB0325c F71617(P. falciparum, SERA) 0 PF00112 (Papain  family) 2.9e-10
SERA PFB0330c G71617(P. falciparum, SERA) 0 PF00112 (Papain  family) 2.9e-46
SERA PFB0335c H71617(P. falciparum, SERA) 0 PF00112 (Papain  family) 3.4e-16
SERA PFB0340c B71617(P. falciparum, SERA) 0 PF00112 (Papain  family) 1.5e-53
SERA PFB0345c C71617(P. falciparum, SERA) 0 PF00112 (Papain  family) 1.2e-47
SERA PFB0350c D71617(P. falciparum, SERA) 0 PF00112 (Papain  family) 8.8e-51
SERA PFB0355c H71617(P. falciparum, SERA) 0 PF00112 (Papain  family) 1.5e-49
Papain PFI0135c G71617(P. falciparum, SERA) 1e-177 PF00112 (Papain  family) 2.2e-34
C2 Calpain MAL13P1.310 NP_497964 (C. elegans,  Calpain) 2e-35 PF00648 (Calpain  family) 8.0e-13
C12 UCH1 PF14_0577 BAB47136(Rat, UCH 13) 5e-31 PF01088 (UCH 1) 1.3e-37
UCH1 PF11_0177 NP_062508 (Mouse, UCH37) 6e-30 PF01088 (UCH 1) 1.5e-17
C13 GPI8p  transamidase PF11_0298 CAB96076(P. falciparum,  GPI8p transamidase) 1e-179 PF01650 (Peptidase  C13) 2.7e-13 +
C14 Metacaspase PF13_0289 CAD24804(T. brucei,  metacaspase) 5e-32 No hits
C19 UCH2 PFA0220w NP_566486 (Arabidopsis,  UBP25) 1e-12 PF00442 (UCH2) 1.2e-09
UCH2 PFD0165w O57429(Chicken, UCH2) 5e-13 PF00442 (UCH2) PF00443 (UCH2) 1.9e-06 9.2e-07
UCH2 PFD0680c AAG42755 (Arabidopsis, UBP14) 1e-34 PF00442 (UCH2) 8.5e-11
UCH2 PFE1355c NP_566680 (Arabidopsis,  UBP7) 3e-30 PF00442 (UCH2) PF00443 (UCH2) 7.5e-07 1.9e-14
UCH2 PFE0835w O94966 (Human, UBP19) 3e-51 PF00442 (UCH2) PF00443 (UCH2) 1.8e-11 1.7e-20
UCH2 MAL7P1.147 NP_568171 (Arabidopsis,  UBP12) 1e-54 PF00442 (UCH2) PF00443 (UCH2) 3.7e-10 1.9e-16
UCH2 PFI0225w AAC68865(Chicken, UBP66) 1e-18 PF00442 (UCH2) 9.9e-23
UCH2 PF13_0096 NP_494298 (C. elegans, UBP) 5e-58 PF00442 (UCH2) PF00443 (UCH2) 4.3e-08 0.00048
UCH2 PF14_0145 T41069(Fission yeast, UCH) 1e-20 PF00442 (UCH2) PF00443 (UCH2) 4.8e-05 3.1e-08
C48 Sumo1 protease PFL1635w XP_128236 (Mouse, SUMO  protease) 7e-33 PF02902 (Ulp1  protease) 2.3e-38
Ulp2 peptidase MAL8P1.157 O13769 (Fission yeast, Ulp2) 1e-06 PF02902 (Ulp1  protease) 9.2e-07
C56 DJ-1 peptidase MAL6P1.153 BAB79527 (Chicken, DJ-1  protease) 6e-16 PF01965 (DJ-1/PfpI  protease) 8.1e-18
Metallo M1 AMPN MAL13P1.56 O96935(P. falciparum,  M1 peptidase) 0 PF01433 (Peptidase  family M1) 6.2e-74
M3 Dcp PF10_0058 NP_601500 (Corynebacterium glutamicum, Zn-dependent peptidases) 1e-04 PF01432 (Peptidase family M3) 6.4e-10
Neurolysin MAL13P1.184 Q02038(Pig, neurolysin) 4e-09 PF01432 (Peptidase  family M3) 1.8e-07
M16 MPPa PFE1155c AAF00541 (Toxoplasma gondii, MPPa) 1e-109 PF00675 (Insulinase, family M16) 6.4e-19
MPPb PFI1625c AAK51086(Avicennia  marina, MPPb) 1e-108 PF00675 (Insulinase,  family M16) 1.3e-55
M16 peptidase PF11_0189 NP_593544 (yeast  metallopeptidase) 1e-29 PF00675 (Insulinase,  family M16) 0.0076
Insulysin PF11_0226 P35559(Rat, Insulysin) 3e-10 PF00675 (Insulinase,  family M16) 1.0e-05
Falcilysin PF13_0322 AAF06062(P. falciparum,   falcilysin) 0 PF00675 (Insulinase,  family M16) 0.19
Pitrilysin PF14_0382 T06521(Pea pitrilysin) 5e-27 PF00675 (Insulinase,  family M16) 0.0018
M17 AMPL PF14_0439 NP_194821 (Arabidopsis  AMPL) 2e-83 PF00883 (Peptidase  family M17) 9e-150
M18 DNPE PFI1570c AAL16034(Coccidioides  immitis, DNPE) 7e-55 PF02127 (Peptidase  family M18) 3.6e-31
M22 GCP PF10_0299 NP_194003 (Arabidopsis  GCP) 4e-54 PF00814 (Glycoprotease family) 1.3e-53
M24A AMPM PFE1360c AAG33975(Arabidopsis,  AMPM) 3e-61 PF00557 (Peptidase  family M24) 2.5e-48
AMPM MAL8P1.140 P53582(Human, AMP1) 1e-26 PF00557 (Peptidase  family M24) 9.1e-06 +
AMPM PF10_0150 P53582(Human, AMP1) 1e-104 PF00557 (Peptidase  family M24) 5.4e-67
AMPM PF14_0327 AAL76285(P. falciparum,  AMPM2) 0 PF00557 (Peptidase  family M24) 8.7e-50
M24B AMPP PF14_0517 CAC59823(Tomato, AMPP) 3e-85 PF00557 (Peptidase  family M24) 2.8e-07
M41 Ftsh peptidase PF11_0203 NP_006787 (Human, AFG3) 1e-163 PF01434 (Peptidase  family M41) PF00004 (AAA) 2.6e-80 4.0e-80
Ftsh peptidase PFL1925w NP_422020 (Caulobacter  crescentus, cell division  protein FtsH) 1e-122 PF01434 (Peptidase  family M41) PF00004 (AAA) 3.2e-93 5.7e-90
Ftsh peptidase PF14_0616 NP_568787 (Arabidopsis, FtsH) 1e-141 PF01434 (Peptidase family M41) PF00004 (AAA) 2.7e-69 9.9e-85
Serine S1 DegP protease MAL8P1.126 NP_568577 (Arabidopsis, DegP protease) 4e-52 PF00089 (Trypsin) 1.8e-07
Neurotrypsin-like PF14_0067 BAA23986 (Mouse, neurotrypsin) 7e-17 PF01477 (PLAT/LH2 domain) 1.4e-09
S8 Subtilase-1 PFE0370c CAA05627 (P. falciparum, subtilase-1) 0 PF00082 (Subtilase family) 2.6e-15
Subtilase-2 PF11_0381 CAB43592 (P. falciparum, Subtilase-2) 0 PF00082 (Subtilase family) 1.6e-35
Subtilase-like PFE0355c CAA05627 (P. falciparum, subtilase-1) 2e-22 PF00082 (Subtilase family) 5.6e-19
S9 ACPH PFC0950c P13676 (Rat, ACPH) 1e-17 PF00561 (α/β hydrolase fold) 0.021
S14 Clp PFC0310c P54416 (Synechocystis sp Clp1) 6e-42 PF00574 (Clp protease) 1.8e-65 +
Clp PF14_0348 NP_567521 (Arabidopsis clp) 2e-29 PF00574 (Clp protease) 3.0e-37 +
ClpB PF08_0063 AAA88777 (P. berghei, ClpB) 0 PF00574 (Clp protease) 1.0e-16
ClpB PF14_0063 NP_439019 (Haemophilus influenzae ClpB) 3e-95 PF00004 (AAA) 7.1e-06 +
ClpC PF11_0175 T07807 (Soybean clp) 1e-154 PF00574 (Clp protease) 0.0026
S16 Lon PF14_0147 AAA61616 (Human, Lon) 4e-53 PF00004 (AAA) 1.4e-17
S26A SP1 PF13_0118 T40251 (Fission yeast, IMP) 2e-10 PF00461 (Signal peptidase) 0.055
S26B signalase MAL13P1.167 AAD19813 (Drosophila, signalase SPC21) 3e-46 PF00461 (Signal peptidase) 3.7e-19
S54 Rhomboid PFE0340c NP_523536 (Drosophila, rhomboid-5) 6e-07 PF01694 (Rhomboid family) 5.5e-27
Rhomboid MAL8P1.16 NP_654179 (Bacillus anthracis rhomboid) 2e-07 PF01694 (Rhomboid family) 3.8e-27
Threonine T1 Proteasome α1 PF14_0716 P92188 (Trypanosoma cruzi,α1) 8e-57 PF00227 (Proteasome) 7.7e-39
Proteasome α2 MAL6P1.88 Q9LSU2 (Rice, α2) 3e-73 PF00227 (Proteasome) 1.4e-49
Proteasome α3 PFC0745c O24362 (Spinach, α3) 3e-44 PF00227 (Proteasome) 2.0e-21
Proteasome α4 PF13_0282 O81148 (Arabidopsis,α4) 1e-64 PF00227 (Proteasome) 7.4e-47
Proteasome α5 PF07_0112 Q95083 (Drosophila,α5) 4e-70 PF00227 (Proteasome) 2.1e-47
Proteasome α6 MAL8P1.128 Q9LSU3 (Rice, α6) 4e-52 PF00227 (Proteasome) 2.8e-30
Proteasome α7 MAL13P1.270 O24616 (Arabidopsis,α7) 3e-66 PF00227 (Proteasome) 8.5e-43
Proteasome β1 PFE0915c P42742 (Arabidopsis,β1) 3e-44 PF00227 (Proteasome) 1.2e-39
Proteasome β2 MAL8P1.142 Q9LST6 (Rice, β2) 1e-35 PF00227 (Proteasome) 2.9e-17
Proteasome β3 PFA0400c P25451 (Yeast, β3) 8e-35 PF00227 (Proteasome) 1.5e-18
Proteasome β4 PF14_0676 XP_079788 (Drosophila,β4) 4e-36 PF00227 (Proteasome) 1.4e-22
Proteasome β6 PFI1545c O43063 (Fission yeast, β6) 4e-26 PF00227 (Proteasome) 1.1e-10
Proteasome β7 PF13_0156 Q99436 (Human, β7) 1e-74 PF00227 (Proteasome) 5.5e-35
  • A cut-off criterion of E-score ≤ 1e-04 was employed to define protease homologs. The nomenclature of the protease families is: A1 (pepsin), C1 (papain), C2 (calpain), C12 (ubiquitin carboxyl-terminal hydrolase, family 1, UCH1), C13 (hemoglobinase), C14 (caspase), C19 (ubiquitin C-terminal hydrolase family 2, UCH2), C48 (ubiquitin-like protease, Ulp), C56 (DJ-1 peptidase), M1 (alanyl aminopeptidase, AMPN), M3 (thimet oligopeptidase), M16 (pitrilysin and mitochondrial processing peptidase, MPP), M17 (leucyl aminopeptidase, AMPL), M18 (aspartyl aminopeptidase, DNPE), M22 (O-sialoglycoprotein endopeptidase, GCP), M24A (methionyl aminopeptidase, AMPM), M24B (X-Pro dipeptidase, AMPP), M41 (FtsH endopeptidase), S1 (trypsin), S8 (subtilisin), S9 (acylaminoacyl-peptidase, ACPH), S14 (Clp), S16 (Lon protease, La), S26A (prokaryotic signal peptidase I, SP1), S26B (signalase), S54 (rhomboid), and T1 (threonine endopeptidase).

  • Abbreviations for proteases include: PM, plasmepsin; DPP I, dipeptidyl-peptidase I; UBP, ubiquitin-specific protease; IMP, mitochondrial inner membrane peptidase; SPC21, microsomal signal peptidase 21 kDa subunit.

  • Previously characterized proteases with proteolytic activity are highlighted in bold. The 23 proteases predicted by the official annotation published in PlasmoDB are highlighted in italic.

  • Potential candidate proteases Calpain, Metacaspase, and Signal peptidase I (SP1) are highlighted in bold italic.

  • +/− indicates that the gene is predicted to contain/not contain an apicoplast transit peptide (AP column).
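
The E-score cut-off in the first footnote can be illustrated with a minimal Python sketch. It assumes the cut-off is applied to the BLAST E-score column, which the footnote does not state explicitly. The function names are hypothetical, the first three rows are copied from Table 1, and the last row is an invented counter-example included only to show a rejected hit.

```python
# Minimal sketch of the E-score cut-off from the Table 1 footnote.
# Assumption: the cut-off is applied to the BLAST E-score column.
# Function names are hypothetical and not taken from the article.

E_SCORE_CUTOFF = 1e-04  # cut-off stated in the footnote


def parse_e_score(text: str) -> float:
    """Convert a table E-score such as '0', '9e-08' or '0.00037' to a float."""
    return float(text)


def is_protease_homolog(blast_e_score: str, cutoff: float = E_SCORE_CUTOFF) -> bool:
    """Return True when the BLAST E-score is at or below the cut-off."""
    return parse_e_score(blast_e_score) <= cutoff


# (protease nomenclature, gene ID, BLAST E-score)
rows = [
    ("PM V", "PF13_0133", "9e-08"),            # from Table 1
    ("Ulp2 peptidase", "MAL8P1.157", "1e-06"),  # from Table 1
    ("Falcipain-1", "PF14_0553", "0"),          # from Table 1
    ("hypothetical hit", "HYPOTHETICAL_ID", "0.05"),  # invented, not in Table 1
]

passing = [name for name, _gene_id, e_score in rows if is_protease_homolog(e_score)]
print(passing)
```

Only the invented row is filtered out. Note that several Pfam E-scores in the table (for example 0.19 for falcilysin) exceed 1e-04, which is why the sketch applies the cut-off to the BLAST column alone.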

This Article

  1. Genome Res. 13: 601-616
