Comparative Analysis of Apoptosis and Inflammation Genes of Mice and Humans
- John C. Reed1,3,4,
- Kutbuddin Doctor1,
- Ana Rojas1,
- Juan M. Zapata1,
- Christian Stehlik1,
- Loredana Fiorentino1,
- Jason Damiano1,
- Wilfried Roth1,
- Shu-ichi Matsuzawa1,
- Ruchi Newman1,
- Shinichi Takayama1,
- Hiroyuki Marusawa1,
- Famming Xu1,
- Guy Salvesen1,
- RIKEN GER Group2,
- GSL Members3,5, and
- Adam Godzik1
- 1The Burnham Institute, La Jolla, California 92037, USA
- 2Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
- 3Genome Science Laboratory, RIKEN, Hirosawa, Wako, Saitama 351-0198, Japan
Abstract
Apoptosis (programmed cell death) plays important roles in many facets of normal mammalian physiology. Host-pathogen interactions have provided evolutionary pressure for apoptosis as a defense mechanism against viruses and microbes, sometimes linking apoptosis mechanisms with inflammatory responses through NFκB induction. Proteins involved in apoptosis and NFκB induction commonly contain evolutionarily conserved domains that can serve as signatures for identification by bioinformatics methods. Using a combination of public (NCBI) and private (RIKEN) databases, we compared the repertoire of apoptosis and NFκB-inducing genes in humans and mice from cDNA/EST/genomic data, focusing on the following domain families: (1) Caspase proteases; (2) Caspase recruitment domains (CARD); (3) Death Domains (DD); (4) Death Effector Domains (DED); (5) BIR domains of Inhibitor of Apoptosis Proteins (IAPs); (6) Bcl-2 homology (BH) domains of Bcl-2 family proteins; (7) Tumor Necrosis Factor (TNF)-family ligands; (8) TNF receptors (TNFR); (9) TIR domains; (10) PAAD (PYRIN; PYD, DAPIN); (11) nucleotide-binding NACHT domains; (12) TRAFs; (13) Hsp70-binding BAG domains; (14) endonuclease-associated CIDE domains; and (15) miscellaneous additional proteins. After excluding redundancy due to alternative splice forms, sequencing errors, and other considerations, we identified cDNAs derived from a total of 227 human genes among these domain families. Orthologous murine genes were found for 219 (96%); in addition, several unique murine genes were found, which appear not to have human orthologs. This mismatch may be due to the still fragmentary information about the mouse genome or genuine differences between mouse and human repertoires of apoptotic genes. With this caveat, we discuss similarities and differences in human and murine genes from these domain families.
Apoptosis is a form of programmed cell death that plays an important role in many facets of normal mammalian physiology, including embryological development, tissue homeostasis, and immune cell education (Metzstein et al. 1998). Defects in apoptosis regulation are implicated in the pathogenesis of multiple diseases, perhaps explaining why the study of apoptosis has emerged as one of the fastest growing areas of biomedical research in recent years (Thompson 1995; O'Reilly and Strasser 1999; Reed 2000).
Apoptosis also represents an important defense mechanism against pathogens. For example, cell suicide can provide a mechanism for depriving viruses of a host for replication, thus limiting viral spread (Miller 1997). Also, some of the families of proteins involved in apoptosis regulation participate in inflammatory responses to microbial pathogens. For instance, Caspase-family proteases are critical effectors of the apoptotic program, but some of these proteases are responsible for cleavage and activation of pro-inflammatory cytokines such as pro-Interleukin-1β and pro-Interleukin-18 (Thornberry and Lazebnik 1998). Similarly, some proteins involved in Caspase activation can also participate in triggering induction of NFκB family transcription factors, which regulate expression of numerous genes important for inflammatory responses, as well as innate and acquired immunity (Karin and Lin 2002). NFκB also regulates the expression of several genes involved in apoptosis control, for example, including expression of anti-apoptotic members of the Bcl-2, Inhibitor of Apoptosis (IAP), and Death Effector Domain (DED)-family of proteins (Reed 2002). Thus, the worlds of apoptosis and inflammation are often closely intertwined.
Proteins involved in apoptosis commonly contain evolutionarily conserved domains that can serve as signatures for identification, permitting application of bioinformatics techniques to analysis of families of apoptosis-regulatory proteins. Previously, we used bioinformatics approaches to mine human genomic and EST databases for the presence of expressed or putative genes containing signature domains associated with apoptosis, including the (1) Caspase protease fold; (2) Caspase-associated recruitment domain (CARD); (3) Death Domain (DD); (4) DED; (5) BIR domain of IAP proteins; (6) Bcl-2 homology (BH) domains of Bcl-2 family proteins; (7) nucleotide-binding NACHT domains; and (8) CIDE domains of apoptotic endonucleases, assembling this information into a database (http://apoptosis-db.org). In addition, several families of proteins containing other types of domains implicated either in the regulation of the core apoptotic machinery or in control of closely linked inflammatory response pathways were also organized, including (1) Tumor Necrosis Factor (TNF)-family ligands; (2) TNF receptors (TNFR); (3) TIR domains; (4) PAAD (Pyrin; PYD, DAPIN); (5) TRAFs; (6) REL (NFκB) and IκB family proteins; and (7) BAG domains. These data for human genes thus provided a foundation for performing a comparative analysis with murine genes, including those identified from cDNA sequences deposited into either public databases at NCBI or a collection of cDNA sequence data from the RIKEN mouse transcriptome project (Kawai et al. 2001; Bono et al. 2002). A comparative analysis of human and mouse genes that comprise the aforementioned 15 domain families is provided here. The findings reveal some interesting and presumably functionally important species-specific differences among genes devoted to regulation of apoptosis and inflammation.
RESULTS AND DISCUSSION
After excluding alternative splice forms and adjusting for redundancy due to proteins that contain two or more of the domains of interest, sequences corresponding to human apoptosis and inflammation genes found apparent orthologous matches in either the public databases or RIKEN collection of murine cDNAs in 220 of 228 cases (96%), including 10/11 Caspases; 18/23 CARD; 33/33 DD; 11/11 DED; 27/24 Bcl-2; 7/8 IAPs; 5/5 CIDEs; 12/19 PAAD; 16/20 NACHT; 18/14 TRAF and TRAF-related proteins (including TRAFs, TEFs, Meprins, Siah, and TRAF-binding proteins); 17/18 TNF-family ligands; 27/29 TNFRs; 14/16 TIR; 5/5 RELs; 7/8 IκBs; 6/6 BAG; and 33/35 miscellaneous apoptosis-relevant proteins (Table 1). In 21 cases, murine orthologs of human genes that were absent from the public databases were represented in the RIKEN collection.
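The per-family counts above can be tabulated to check the overall percentage and to flag the families in which more murine than human genes were counted. The following is a minimal sketch, not part of the original analysis; the dictionary name `COUNTS`, the short family labels, and the helper `percent_with_ortholog` are illustrative, and each pair is read as (mouse count, human count) as quoted in the text.

```python
# (mouse, human) gene counts per domain family, as quoted in the text above.
COUNTS = {
    "Caspases": (10, 11), "CARD": (18, 23), "DD": (33, 33), "DED": (11, 11),
    "Bcl-2": (27, 24), "IAP": (7, 8), "CIDE": (5, 5), "PAAD": (12, 19),
    "NACHT": (16, 20), "TRAF": (18, 14), "TNF ligands": (17, 18),
    "TNFR": (27, 29), "TIR": (14, 16), "REL": (5, 5), "IkB": (7, 8),
    "BAG": (6, 6), "Miscellaneous": (33, 35),
}


def percent_with_ortholog(matched: int, total: int) -> int:
    """Percentage of human genes with an apparent murine ortholog, rounded."""
    return round(100 * matched / total)


# Overall figure reported in the text: 220 of 228 human genes.
overall = percent_with_ortholog(220, 228)

# Families where the murine count exceeds the human count.
more_in_mouse = sorted(f for f, (mouse, human) in COUNTS.items() if mouse > human)

# The raw per-family sum of human genes is larger than 228 because proteins
# carrying two or more signature domains appear in several families.
human_sum = sum(human for _, human in COUNTS.values())

print(overall, more_in_mouse, human_sum)
```

The gap between the per-family sum and the 228 nonredundant human genes reflects the adjustment for multidomain proteins described in the text.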
Summary of Protein Domain Family Comparisons for Humans and Mice
The automated annotation scheme revealed only five cases of novel murine proteins containing at least one of the signature domains of interest, which were not recognized previously. At the same time, few groups of human proteins systematically lack murine orthologs, implying that most of the genes of interest arose early in mammalian evolution. The final evaluation of such cases has to wait for the completion of the mouse genome. However, some species-specific differences are apparent between mouse and human that indicate recent amplification of certain genes. In several cases, these are represented by tandem extra copies of the relevant genes on the same chromosomes. A brief description of each of the domain families follows.
Caspases
Caspases represent a family of intracellular cysteine proteases that either induce apoptosis or that are required for proteolytic processing of certain pro-inflammatory cytokines (for review, see Thornberry and Lazebnik 1998). The cysteine protease fold that comprises the Caspase domain is composed of ~20 kD large and ~10 kD small catalytic subunits that are generated upon proteolytic cleavage from a proprotein precursor (Fesik 2000). The Caspases contain amino-terminal prodomains of variable length, with the upstream initiator proteases generally having larger prodomains than downstream effector proteases. The larger prodomains serve as protein-interaction modules for controlling Caspase activation via the proposed induced proximity mechanism (Salvesen and Dixit 1999).
We found cDNAs representing 10 members of the Caspase family of cysteine proteases in mice, compared with 11 in humans (Table 2). Most striking is the absence of Caspase-10 in mice; this gene is also absent from all public databases. This member of the Caspase family is a close homolog of Caspase-8, containing two tandem copies of the DED in its amino-terminal prodomain, upstream of the carboxy-terminal catalytic domain that defines membership in this protease family. In humans, the Caspase-8 and Caspase-10 genes are located adjacent to each other on chromosome 2, implying a recent gene duplication event. Other notable differences between human and mouse are found in Caspase-4 and Caspase-5 of man, which both appear to be orthologs of murine Caspase-11 on the basis of phylogeny analysis. In humans, a predicted gene is found on chromosome 11 with striking nucleic acid sequence similarity to murine Caspase-12, but the predicted ORF contains a termination codon prior to the region encoding the catalytic domain (Fischer et al. 2002). All of the Caspases of mouse and human were deposited previously in the NCBI database.
Comparison of Caspases of Mice and Humans
Mouse cDNAs corresponding to a single para-Caspase (MALT) were also identified, suggesting that both mice and humans possess a single para-Caspase gene. The murine para-Caspase sequence was found in the RIKEN, but not the NCBI database. This protein is implicated in NFκB regulation (Uren et al. 2000).
CARDs
The CARD is a protein interaction module, generally comprised of a bundle of six α-helices (Hofmann et al. 1997; Fesik 2000). This domain is commonly implicated in regulation of Caspases that contain CARDs in their amino-terminal prodomains (mouse Caspases 1, 2, 9, 11, and 12; human Caspases 1, 2, 4, 5, and 9), or in regulation of NFκB activation.
In contrast to the 23 human genes encoding CARD-carrying proteins (excluding CARD-carrying Caspases), cDNAs representing only 18 of these were identified for mice in either the NCBI or RIKEN databases (Table 3). Present in both human and mouse were Apaf1; Arc (Nop30); ASC (TMS-1; PyCARD); Bcl-10 (CIPER, CARMEN, mE10, cE10, CLAP); Bimp1 (CARD10, Carma3); Bimp2 (CARD14, Carma2); Cardiak (RIP2, RICK); CARD6; CARP; cIAP1 (HIAP1, MIHB); cIAP2 (HIAP2, MIHC); Helicard (Mda5); NAC (NALP1, DEFCAP, CARD7); Nod1 (CARD4); Nod2 (CARD15); and RAIDD (CRADD). In addition, partial cDNAs were found that likely represent the murine orthologs of Bimp3 (CARD11; Carma1) and CLAN (Ipaf; CARD12), but the CARD-encoding region was missing from the sequences, and thus, caution must be exercised in ascribing these proteins to the CARD family. Missing from the available transcriptome of mice were CARD9, COP (Pseudo-ICE), COP2 (predicted genomic fragment), Iceberg, and TUCAN (Cardinal, CARD8, NDPP, Dakar). Five of the murine CARD-family members were found in the RIKEN, but not NCBI databases, namely Arc (Nop30), Bimp3, CARD6, CARP, and Nod1. Although some reports have suggested the presence of a CARD in the protein CIITA (Nickerson et al. 2001), our analysis using various structure-prediction programs such as FFAS, RPS-BLAST, CDART, and SMART failed to confirm this hypothesis; thus, it was not included here (Table 3).
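The CARD-family comparison amounts to a set difference between the human and murine gene lists. The sketch below simply restates the names given in the text as Python sets; the variable names are illustrative, and the partial murine cDNAs (Bimp3, CLAN) are kept separate because their CARD-encoding region was not observed.

```python
# Human CARD-family genes named in the text (CARD-carrying Caspases excluded).
HUMAN_CARD = {
    "Apaf1", "Arc", "ASC", "Bcl-10", "Bimp1", "Bimp2", "Bimp3", "Cardiak",
    "CARD6", "CARD9", "CARP", "cIAP1", "cIAP2", "CLAN", "COP", "COP2",
    "Helicard", "Iceberg", "NAC", "Nod1", "Nod2", "RAIDD", "TUCAN",
}

# Murine orthologs supported by cDNAs covering the CARD-encoding region.
MOUSE_CARD_FULL = {
    "Apaf1", "Arc", "ASC", "Bcl-10", "Bimp1", "Bimp2", "Cardiak", "CARD6",
    "CARP", "cIAP1", "cIAP2", "Helicard", "NAC", "Nod1", "Nod2", "RAIDD",
}

# Murine orthologs supported only by partial cDNAs lacking the CARD region.
MOUSE_CARD_PARTIAL = {"Bimp3", "CLAN"}

mouse_card = MOUSE_CARD_FULL | MOUSE_CARD_PARTIAL

# Human genes with no murine counterpart in the available data.
missing_in_mouse = sorted(HUMAN_CARD - mouse_card)

print(len(HUMAN_CARD), len(mouse_card), missing_in_mouse)
```

Keeping the partial clones in their own set makes it easy to report the comparison with or without the tentatively assigned orthologs.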
Comparison of CARD-Family Proteins of Humans and Mice
The absence of Iceberg, COP, and COP2 in the available mouse cDNA and genomic collections, if true, suggests that humans have evolved additional mechanisms for controlling activation of Caspase-1. These proteins consist essentially of just a CARD with strong sequence similarity to the CARD found in pro-Caspase-1, and they have been shown in the cases of Iceberg and COP (Pseudo-ICE) to bind and inhibit activation of pro-Caspase-1 (Humke et al. 2000; Druilhe et al. 2001; Lee et al. 2001). Similarly, the absence of TUCAN in mice suggests an additional level of complexity to regulating Caspase-9 in humans, as this CARD-containing protein reportedly binds and suppresses activation of Caspase-9 (Pathan et al. 2001). Species-specific differences in Caspase-9 regulation have been reported previously (Reed et al. 2000; Rodriguez et al. 2000).
DEDs
DED is a protein-interaction module similar to the CARD, which is generally comprised of a 6 α-helical bundle (Eberstadt et al. 1998). DED-containing proteins have been implicated in apoptosis regulation via interactions with DED-containing Caspases (Caspase-8 and Caspase-10 in human; Caspase-8 in mouse). Excluding DED-containing Caspases, cDNAs representative of 11 DED-family genes were identified in humans and orthologous sequences were found in all 11 of these in mice (Table 4). The DED-containing proteins listed in Table 4 comprise proteins with classical DEDs, as well as proteins with DED-like domains.
Comparison of DED Family Proteins of Humans and Mice
DDs
The DD is another protein-interaction module belonging to the same superfamily that includes the CARD, DED, and PAAD (PYRIN, PYD, DAPIN; Fesik 2000). This domain is commonly implicated in either NFκB induction or Caspase activation, typically involving interactions with members of the TNF-family of cytokine receptors (Ashkenazi and Dixit 1998). Of the 33 DD-containing proteins recognized in human, 32 orthologs and one novel DD-containing protein were represented in the cDNA sequence data available at NCBI and RIKEN (Table 5). At the time of our analysis, only four of the DD-family proteins were found uniquely in the RIKEN database (Ankyrin-2, IRAK2, MALT, and TRADD).
Comparison of DD Family Proteins of Humans and Mice
Absent from the available mouse data was DR4, one of the receptors for TRAIL (Apo2L), an apoptosis-inducing member of the TNF family. In humans, DR4 and DR5 are highly homologous proteins encoded by tandem genes on chromosome 8p21, both of which bind TRAIL and activate Caspases involved in apoptosis. Thus, it appears that a recent gene duplication event in humans has increased the complexity of the DD-containing TRAIL receptors.
Mouse ESTs encoding a predicted 228 amino-acid protein containing a DD were identified in both the NCBI (GenBank AV149215) and RIKEN (2610311B09) databases. The DD of this protein shares 36% amino-acid sequence identity (55% sequence similarity) with the DD of p75-NTR, and is preceded by a predicted transmembrane (TM) domain, indicative of an integral membrane protein. Homologous ESTs were found in humans (GenBank AI688486; BE839192), but the predicted ORFs contained a termination codon preceding the DD-encoding region, indicating that the predicted human protein lacks a DD. We have tentatively termed the predicted mouse protein, NRDD, for NTR-related death domain.
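Percent identity and percent similarity figures such as those quoted above are derived from a pairwise alignment by counting identical and conservatively substituted positions. The sketch below illustrates one common convention and is not the procedure used in this study: the residue groupings in `SIMILAR_GROUPS` are a simplified, hypothetical stand-in for a substitution matrix, gapped columns are excluded from the denominator by assumption, and the example sequences are invented.

```python
# Hypothetical, simplified groups of residues treated as "similar".
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]
GROUP_OF = {aa: i for i, group in enumerate(SIMILAR_GROUPS) for aa in group}


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Return (% identity, % similarity) over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: columns containing a gap are not scored
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


# Invented toy alignment, for illustration only.
print(identity_and_similarity("ACDE-K", "ACEEGR"))
```

Because similarity counts identical positions as well, the similarity value is always at least as large as the identity value, as in the 36% and 55% figures reported for the NRDD death domain.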
IAPs
The IAP-family proteins function as apoptosis suppressors (Deveraux and Reed 1999). All members of this family contain at least one copy of a zinc-binding fold, termed the BIR domain (Miller 1999). Several IAPs have been reported to directly bind and suppress Caspase-family proteases (Deveraux and Reed 1999). Humans have eight genes encoding IAP-family proteins: Naip, cIAP1, cIAP2, XIAP, Survivin, Apollon (BRUCE), ML-IAP (Livin; K-IAP), and ILP2 (TsIAP). Murine cDNAs corresponding to seven of these IAP-family proteins were identified in either the NCBI or RIKEN databases (Table 6), with ILP2 missing from the available mouse transcriptome data. Sequence information for ML-IAP was uniquely found in the RIKEN database at the time of analysis, whereas the other IAPs were found in NCBI. Interestingly, multiple tandem copies of naip-related genes have been found on mouse chromosome 13 (Endrizzi et al. 2000). In contrast to humans, who express only one NAIP gene (Roy et al. 1995; La Casse et al. 1998), mice appear to express at least three versions of the NAIP protein from distinct genes (see legend to Table 6 for details).
Comparison of IAP and IAP-Related-Family Proteins of Humans and Mice
Several IAP antagonists have been identified in humans, including SMAC (Diablo), Omi (HtrA2), and XAF. ESTs or cDNAs corresponding to all of these were observed in mice, with XAF sequence data uniquely found in the RIKEN databases.
Bcl-2
Proteins of the Bcl-2 family are critical regulators of apoptosis, whose functions include governing mitochondria-dependent steps in cell death pathways (Green and Reed 1998; Kroemer and Reed 2000). EST and cDNA data corresponding to 24 human and 27 mouse Bcl-2 family genes were identified (Table 7). These included (1) the multidomain members of the family, which contain Bcl-2 Homology (BH) domains, BH1, BH2, BH3, and (sometimes) BH4 (Bcl-2, BclX, Mcl1, Bcl-W, Bfl1 [A1], Bcl-B, Diva [Boo], Bax, Bak, Bok [Mtd]), which have been documented or predicted to share structural similarity with the α-helical pore-forming domains of certain bacterial toxins (Fesik 2000); (2) Bcl-GL, which possesses BH2 and BH3 domains (Guo et al. 2001); (3) several BH3-only proteins (Bad, Bid, Bim [Bod], Bmf, Bik [Blk], Noxa [APR], Puma, Hrk [Dp5]; Huang and Strasser 2000); (4) proteins with a BH3-like domain (Nip3 [Bnip3], Nix [Nip3L], Map1, p193; Chen et al. 1999); and (5) a protein containing a putative BH2 domain (Bcl2L12; Scorilas et al. 2001). The human ortholog of mouse Diva (Boo) appears to be Bcl-B, on the basis of phylogeny analysis (data not shown). Interestingly, four copies of the A1 gene of mice (known as Bfl1 in humans) have been identified. Three of these A1 genes (A1a, A1b, A1d) are closely linked on chromosome 9 (NCBI Locus ID12044, ID12045, and ID12047), indicating a recent gene-amplification event in mice (Orlofsky et al. 2002). Bcl-2 family genes are well known for producing splice variants that encode proteins sometimes having opposing functions (e.g., Bcl-XL vs. Bcl-XS) (Boise et al. 1993; Reed 1999). Humans and mice appeared to share many of these splicing variants (data not shown).
Comparison of Bcl-2-Family Proteins of Humans and Mice
In humans, 18 putative Bcl-2-binding proteins have been described that lack sequence similarity with Bcl-2 and its relatives, including R-Ras, Raf1, Prp1, BAG1, Flip, ANT1, ANT2, ANT3, VDAC1, VDAC2, BAR, BI-1, RTN-x, Smn, Apaf1, Aven, Nip1, and Nip2. ESTs or cDNAs corresponding to 16 of these 18 proteins were identified for mice. Only Adenine Nucleotide Translocator-3 (ANT3) and RTN-x were absent from the available murine data (data not shown).
TNF-Family Ligands
Many TNF-family cytokines regulate pathways implicated in either suppression or induction of apoptosis (Baud and Karin 2001; Locksley et al. 2001). Humans have 18 genes encoding proteins that contain a conserved carboxy-terminal domain spanning ~150 amino acids, which is termed the TNF homology domain (THD). This domain is involved in ligand trimerization and receptor binding. Most family members contain predicted TM domains and are trimeric Type II transmembrane proteins, in which the carboxyl terminus is predicted to be oriented toward the outside of the cell and the amino terminus toward the cell interior. Some TNF-family ligands are released from the cell surface by proteolysis. ESTs or cDNAs corresponding to all of these TNF-family ligands, except AITRL, were identified in the mouse databases (Table 8). Thus, this family of proteins is highly conserved between humans and mice.
Comparison of TNF-Superfamily of Humans and Mice
TNF-Family Receptors
The receptors for TNF-family ligands contain a conserved cysteine-rich domain (CRD) present in one to four copies, typically preceded by a hydrophobic leader peptide sequence and followed by a TM domain, indicative of Type I transmembrane proteins that are sorted to the cell surface (Locksley et al. 2001). Some of these receptors lack a membrane-anchoring TM domain and are secreted from cells, whereas others are released from the cell surface by proteolysis. Humans contain 29 genes encoding TNF-family receptors, including 8 that contain a DD in their cytosolic tails. Sequence data from the NCBI and RIKEN databases demonstrated the presence of 25 mouse TNF-family receptors with orthologs in humans, as well as 2 additional receptors (SOBa and SOBb) found only in mice (Table 9). Missing from the mouse databases were the TRAIL receptor DR4 (as mentioned above), as well as two TRAIL decoy receptors, DcR1 (TNFRsF10c) and DcR2 (TNFRsF10d), and the FasL decoy receptor DcR3 (TNFRsF6b), which possess the extracellular ligand-binding domain but lack signal-transducing cytosolic domains. Thus, humans may have evolved a greater diversity of signal-transducing (DR4 and DR5) and decoy (DcR1 and DcR2; vs. only OPG in mouse) receptors for TRAIL, allowing greater fine tuning of responses to this TNF-family ligand. Analogously, no mouse ESTs or cDNAs were found that encode the decoy receptor, DcR3, which competes for binding to FasL (TNFSF6; Roth et al. 2001), LIGHT (TNFSF14; Yu et al. 1999), and TL1A (TNFSF15; Migone et al. 2002).
Comparison of TNF-Receptors Superfamily of Humans and Mice
Interestingly, although the ligand was not present for mouse, ESTs encoding the mouse homolog of the TNF-family receptor AITR were identified, suggesting either that eventually the corresponding murine ligand will be identified or that another member of the TNF family can bind this receptor, analogous to some TNF-family ligand/receptor combinations in which more than one ligand competes for a given receptor (Kovacsovics et al. 2002).
TIRs
Toll-like receptors (TLRs) play important roles in innate immunity (Aderem and Ulevitch 2000). These Type I transmembrane receptors contain Leucine Rich Repeat (LRR) domains in their extracellular region, which bind various molecules made by microbial pathogens, as well as certain endogenous proteins such as heat-shock proteins when released by cell lysis (Wagner 2001; Vabulas et al. 2002). The TIR domain represents a ~130 amino acid fold related to flavodoxin consisting of a 5-strand parallel β-sheet surrounded by two layers of parallel α-helices (Xu et al. 2000). The TIR domain is found in the cytosolic (intracellular) tails of TLRs, as well as in certain intracellular adapter proteins that interact with TLRs in the context of transducing signals important for innate immunity, including activation of NFκB (Silverman and Maniatis 2001). Of the 14 TIR-containing proteins identified previously in humans (IL-1R, TLR1, TLR2, TLR3, TLR4, TLR5, TLR6, TLR7, TLR8, TLR9, TLR10, MyD88, and Mal [TIRAP]), orthologous sequences were found for 13 in mice, the sole exception being TLR10 (Table 10).
Comparison of TIR-Containing Protein in Humans and Mice
TRAFs/TEFs
TRAFs constitute a family of adapter proteins that share an ~180 amino acid fold, the TRAF domain, which is comprised of a bundle of eight β-strands, preceded by an α-helical segment that forms coiled-coil interactions, stabilizing these domains into trimers (Chung et al. 2002). TRAFs bind the cytosolic (intracellular) regions of TNF-Rs, certain adapter proteins involved in TNF-R signaling (Tradd), and some IAP-family members. They link these proteins to downstream protein kinases involved in induction of NFκB and Jun amino-terminal kinases (JNKs), among other signaling proteins and pathways (Arch et al. 1998; Bradley and Pober 2001; Chung et al. 2002). Six TRAFs have been identified in humans, with clear orthologs found for all of these in mice (Table 11). Although lacking the TRAF domain, six TRAF-binding proteins have also been described in humans, including I-TRAF (TANK) (Cheng and Baltimore 1996; Rothe et al. 1996), Trip (Lee et al. 1997), MIP-T3 (Ling and Goeddel 2000), TTRAP (Pype et al. 2000), KRC (Oukka et al. 2002), T2BP (Kanamori et al. 2002), and T2BP-like (J. Zapata, J.C. Reed, unpubl.). Murine orthologs for all of these are evident in murine databases, with partial clones of MIP-T3 (3930402D05) and T2BP-like (9830144P17) found exclusively in the RIKEN database.
Comparison of TRAF- and TEF-Family Proteins of Humans and Mice
The β-strand region of the TRAF domain (the so-called C-TRAF domain) shares documented or predicted structural similarity with a variety of other types of proteins, including Meprins (Uren and Vaux 1996), Siah-family proteins (Tartaglia et al. 1991; Reed 2002), and the TEFs (TRAF-Encompassing Factors) TEF1 (Ubiquitin-Specific Protease-7 [USP7]), TEF2 (SPOP), and TEF3 (Mulibrey Nanism [MUL]; Zapata et al. 2001). A total of eight human proteins have been identified that contain these TRAF-like folds, including Meprin-1a and Meprin-1b on chromosomes 6p21 and 18q12, respectively (Uren and Vaux 1996), Siah-1 and Siah-2 (Hu et al. 1997), TEF1, TEF2, and TEF3 (Zapata et al. 2001), and TEF4 (J. Zapata, J.C. Reed, unpubl.). The human TEF4 protein is highly similar (76% identical) to TEF2 (SPOP), containing a TRAF-like domain and a POZ domain. In mice, orthologs for all of these human proteins were found in the EST/cDNA data, with the exception of TEF4. Partial cDNAs encoding mouse TEF3 were uniquely found in the RIKEN database (Table 11). Although lacking an ortholog of TEF4, cDNAs encoding TEF2/TEF4-like proteins were identified in mice that appear to arise from different genes, termed mouse TEF5, TEF6, TEF7, and TEF8 (Table 11). Phylogeny analysis suggests that these TEF2/TEF4-like proteins are not strictly orthologous to their human counterparts. Why mice have evidently evolved several TEF2/TEF4-related genes is unclear, and awaits elucidation of the functions of these proteins. Mice also contain an additional Siah-family gene relative to humans, apparently as a result of gene duplication, in which the resulting protein products (Siah1a, Siah1b) share 98% amino acid sequence identity (Della et al. 1993).
The functions of most of these proteins containing TRAF-like folds are unknown, and may not be directly relevant to either apoptosis or inflammation, although Siah-family proteins (which function as E3s in targeting proteins for ubiquitination) have been reported to regulate apoptosis and NFκB in some contexts (Hu et al. 1997; Polekhina et al. 2002).
PAADs/PYRINs
The PAAD domain (also known as PYRIN, PYD, DAPIN) is comprised of a bundle of five or six α-helices (Martinou and Green 2001; Pawlowski et al. 2001; Staub et al. 2001; Bertin and DiStefano 2000; Espejo et al. 2002), and represents another branch of the DD superfamily, which contains DDs, DEDs, and CARDs. Although the functions of PAAD-containing proteins are still being elucidated, recent data suggest they operate as homotypic protein-interaction motifs that mediate interactions with proteins involved in activation of inflammatory Caspases (e.g., Caspase-1) and NFκB (Fiorentino et al. 2002; Manji et al. 2002; Perfettini et al. 2002; Srinivasula et al. 2002; Wang et al. 2002). Hereditary mutations in some genes encoding PAAD-family proteins have been implicated in autoimmune and hyperinflammatory syndromes, further supporting a role for these proteins in regulating inflammatory responses (Consortium 1997; Hoffman et al. 2001).
In humans, 19 candidate PAAD-family genes have been identified, including ASC (PyCARD), NAC (NALP1, PYPAF1, DEFCAP, CARD7), Cryopyrin (PYPAF1, NALP3), Pyrin (Marenostrin), AIM2, IFI16, POP1 (ASC2), POP2, and PAN1 through PAN11 (also known as NALPs and PYPAFs) (Table 12). In some cases, the PAAD domains are present as the only motifs within these proteins (POP1, POP2), but in most instances, the PAAD is combined with various other domains, including CARDs (ASC, NAC), nucleotide-binding NACHT domains (see below) (NAC, Cryopyrin, PAN1–PAN11), LRRs (NAC, Cryopyrin, PAN1–PAN11), HIN200, B-Box, or SPRY domains (Pawlowski et al. 2001). Mice appear to contain substantially fewer PAAD-family genes, as revealed through the available cDNA sequence data. Only 12 of the PAADs were identified within the mouse EST/cDNA data, representing the orthologs of Pyrin, Cryopyrin, ASC, NAC, IFI16, PAN1, PAN2, PAN3, PAN5, PAN6, PAN7, and PAN11. Of note, the available murine cDNA sequence data corresponding to the apparent NAC ortholog lack the carboxy-terminal CARD domain found in the human protein, suggesting that (1) incomplete cDNAs were sequenced, (2) a true difference exists in the structures of the human and mouse proteins, or (3) a true ortholog of NAC is not present in mice, and instead, the candidate cDNA clones should be viewed as having originated from a paralogous gene more similar to the PANs and Cryopyrin in domain organization. Similarly, the Pyrin protein of mice contains the PAAD and B-box domains found in the human protein, but is lacking the carboxy-terminal SPRY domain of its human counterpart. The expansion of PAN family genes in humans correlates with a cluster of eight of these genes on human chromosome 19, suggesting recent gene duplication events.
Comparison of Human and Mouse PAAD-Family Proteins
NACHT Domains
The NACHT domain represents a nucleotide-binding fold of unknown structure, which is found in many proteins of importance for apoptosis and inflammatory reactions (Koonin and Aravind 2000). Because this domain forms oligomers (at least in some cases), it can serve as a scaffold for bringing associated proteins such as proteases and protein kinases into close apposition, inducing their activation through the induced proximity mechanism (Inohara et al. 2000). The Caspase-9-activating protein of mammals, Apaf1, contains a related but distinct nucleotide-binding domain, called an NB-ARC, which is found in Apaf1 homologs in animals (Dark in Drosophila; Ced4 in Caenorhabditis elegans), and in pathogen-response proteins of plants (Chinnaiyan et al. 1997).
In humans, 20 candidate NACHT family members have been identified from a combination of cDNA, EST, and genomic data (A. Rojas, K. Pawlowski, F. Xu, J.C. Reed, and A. Godzik, in prep.), including (1) Nod1 (CARD4), Nod2 (CARD15), and Ipaf (CLAN), which contain amino-terminal CARDs and carboxy-terminal LRRs flanking the NACHT domain; (2) Cryopyrin (PYPAF1; NALP3) and the PAN-family proteins, PAN1–PAN11 (NALP2–NALP12) (PYPAF2–PYPAF12), which contain amino-terminal PAADs and carboxy-terminal LRRs flanking the NACHT domain; (3) NAC (NALP1, CARD7, DEFCAP), which contains an amino-terminal PAAD, followed by NACHT, and then LRRs and a carboxy-terminal CARD; (4) Naip, which has three BIRs preceding the NACHT, followed by LRRs; (5) CIITA (MATER), which contains an amino-terminal α-helical domain with CARD-like features upstream of the NACHT, followed by LRRs; and (6) the NACHT-only proteins, NOP1 and NOP2 (Table 13). In mice, orthologs for these proteins were found for only 16 of the 20 candidates. These included Naip, Nod1, Nod2, Ipaf, ASC, NAC, Cryopyrin, Pyrin, PAN1, PAN2, PAN3, PAN5, PAN11, and CIITA (MATER). Thus, on the basis of the available data, it appears that humans have evolved additional diversity in NACHT-family proteins, relative to mice. As mentioned above, a cluster of eight PAN-family genes is found on chromosome 19p13, suggesting recent gene amplification events (A. Rojas, K. Pawlowski, F. Xu, J.C. Reed, and A. Godzik, in prep.).
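The domain architectures described above explain why a single gene can be counted in several domain families, and why per-family counts must be adjusted for redundancy. The sketch below encodes a few of these architectures as ordered domain lists, amino terminus first; it is an illustration only, with simplified architectures (one entry per domain type, except the three BIRs of Naip) and illustrative names such as `ARCHITECTURES` and `SIGNATURE_DOMAINS`.

```python
# Simplified domain order (amino to carboxy terminus) for selected proteins,
# following the descriptions in the text.
ARCHITECTURES = {
    "Nod1": ["CARD", "NACHT", "LRR"],
    "Nod2": ["CARD", "NACHT", "LRR"],
    "Ipaf": ["CARD", "NACHT", "LRR"],
    "Cryopyrin": ["PAAD", "NACHT", "LRR"],
    "NAC": ["PAAD", "NACHT", "LRR", "CARD"],
    "Naip": ["BIR", "BIR", "BIR", "NACHT", "LRR"],
}

# Signature domains that define the families compared in this study.
SIGNATURE_DOMAINS = ["BIR", "CARD", "NACHT", "PAAD"]

# Membership of each family: every protein carrying that domain at least once.
per_family = {
    domain: {name for name, arch in ARCHITECTURES.items() if domain in arch}
    for domain in SIGNATURE_DOMAINS
}

# Summing family sizes counts multidomain proteins more than once.
redundant_total = sum(len(members) for members in per_family.values())

# The union of all families gives the nonredundant number of genes.
unique_total = len(set().union(*per_family.values()))

# Proteins that belong to two or more families.
multi_family = sorted(
    name for name in ARCHITECTURES
    if sum(name in members for members in per_family.values()) >= 2
)

print(redundant_total, unique_total, multi_family)
```

In this toy set, every protein carries the NACHT domain plus at least one other signature domain, so each is counted in two or three families before redundancy is removed.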
Comparison of NACHT-Family Proteins of Humans and Mice
NFκB, IκB, and IKK Family Proteins
NFκB family transcription factors play important roles in host defense and cell survival (for review, see Ghosh and Karin 2002; Karin and Lin 2002). The DNA-binding activity of these transcription factors is rapidly induced in mammalian cells by a variety of cytokines and by certain molecules produced by bacteria, inducing transcription of several genes involved in inflammation and apoptosis, including TNF-family cytokines (TNF, LTα, LTβ, FasL), molecules involved in TNF-family receptor signaling (TRAF1), Caspase-inhibitors (c-IAP2, FLIP), and Bcl-2-family members (Bfl-1, Bcl-X).
The NFκB (Rel)-family represents a group of structurally related transcription factors containing an evolutionarily conserved amino-terminal domain spanning ~300 amino acids, which is termed the Rel homology domain (RHD). This domain is involved in DNA binding and dimerization, and is also responsible for interactions with a family of endogenous NFκB suppressors, the IκB family (Ghosh et al. 1998; Ghosh and Karin 2002). Five members of the NFκB family are evident in human and mouse databases: RelA (p65), RelB, NFκB-1 (p50/p105), NFκB-2 (p52/p100), and c-Rel (Table 14). The nuclear factor of activated T cells (NF-AT) also contains an RHD-like domain, although it is not commonly considered a member of the Rel-family.
Comparison of Rel-Family and Related Proteins of Humans and Mice
The NFκB-family transcription factors are composed of homo- and hetero-dimeric pairs of Rel-family proteins. Regulation of these transcription factors is complex, involving a diversity of mechanisms. In general, however, activity of NFκB-family proteins is controlled by a counteracting family of suppressors, the IκB-family, that sequesters these transcription factors in the cytosol. IκB-family proteins contain conserved Ankyrin-repeat structures, which bind RHDs. Eight members of the IκB family have been found in humans. Seven of these are also found in mice: IκBα, IκBβ, IκBε, IκBζ, Bcl3, IκB-like-1, and IκB-like-2 (Table 14). Mouse databases also reveal an alternative form of NFκB-1, termed IκBγ, that is identical to the carboxy-terminal half of NFκB-1 and is produced by transcription of the NFκB-1 (p50/p105) gene from an alternative internal promoter (Ghosh et al. 1998).
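As a rough illustration of the combinatorics behind "homo- and hetero-dimeric pairs," the sketch below enumerates every unordered pairing of the five Rel-family members named above. This is only a theoretical upper bound: the text does not state which pairs actually form or are functional in cells, and the ASCII names (`NFkB-1`, `NFkB-2`) and the helper `possible_dimers` are illustrative choices, not identifiers from the study.

```python
from itertools import combinations_with_replacement

# The five Rel-family members listed in the text (ASCII spellings are ours).
REL_FAMILY = ["RelA", "RelB", "NFkB-1", "NFkB-2", "c-Rel"]


def possible_dimers(members):
    """Return all unordered pairs, including self-pairs (homodimers).

    This is a purely combinatorial upper bound; it makes no claim that
    every pair is formed or active in vivo.
    """
    return list(combinations_with_replacement(members, 2))


dimers = possible_dimers(REL_FAMILY)
homodimers = [pair for pair in dimers if pair[0] == pair[1]]
heterodimers = [pair for pair in dimers if pair[0] != pair[1]]

print(f"{len(dimers)} possible pairs: "
      f"{len(homodimers)} homodimers, {len(heterodimers)} heterodimers")
```

Five subunits therefore allow at most 15 distinct pairings, which gives a sense of why regulation of this family is described as complex.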
Release of NFκB typically entails degradation of IκB-family proteins, resulting from a mechanism involving phosphorylation by IκB-Kinases (IKKs or IκB-Ks; Table 14), followed by ubiquitin-dependent proteasome-mediated destruction. IKKs contain a conserved serine-kinase domain and a putative leucine-zipper domain. So far, five members of the IKK family have been found in human and mouse databases. Of those, IκBKα, IκBKβ, and IκBKγ form a protein complex in which IκBKα and IκBKβ represent the catalytic subunits, whereas IκBKγ is a regulatory subunit (lacking a kinase domain) and is not structurally related to the α and β subunits (Karin and Ben-Neriah 2000; Ghosh and Karin 2002). IκBKε (Shimada et al. 1999) and Tank Binding Kinase (TBK)-1 (Pomerantz and Baltimore 1999) are also members of the IκB kinase family (Table 14).
BAGs
The BAG domain is composed of an anti-parallel two α-helix structure that docks with high affinity onto the ATPase domain of 70-kD heat shock proteins (Hsp70), modulating their function as molecular chaperones and helping to target Hsp70-family proteins onto specific target proteins (Takayama and Reed 2001). Hsp70-family molecular chaperones play roles in suppression of apoptosis, and examples of involvement of BAG-family proteins in suppressing cell death have been reported for BAG1 (Rap46), BAG3 (Bis), BAG4 (Sodd), and BAG6 (Scythe) (for review, see Briknarova et al. 2002). Humans contain six identifiable BAG-family proteins, which contain the Hsp70-binding BAG domain in association with various other domains. Mice also contain six orthologous BAG-family members, with cDNA sequence data for one of these uniquely found in the RIKEN database (Table 15). Thus, the BAG-family proteins appear to be highly conserved in humans and mice.
Comparison of BAG-Family Proteins of Humans and Mice
CIDE Domains and Apoptotic Endonucleases
DNA fragmentation is often considered a hallmark of apoptosis, reflecting the activation of endonucleases that cleave DNA between nucleosomes (Wyllie 1980). A unique domain is found in the apoptotic endonuclease DFF40 (CAD) and its homologs (Liu et al. 1997; Inohara et al. 1998; Sakahira et al. 1998), called the CIDE or CIDE-N domain. The CIDE-N domain represents a ~75-amino-acid fold consisting of a twisted five-stranded β sheet with two α-helices arranged in an α/β roll (Lugovskoy et al. 1999). The CIDE-containing endonuclease DFF40 (CAD) is held in an inactive state by a specific chaperone protein, DFF45 (ICAD), which also contains the CIDE domain, and which associates via CIDE-CIDE interactions (Zhou et al. 2001). The endonuclease DFF40 (CAD) becomes released upon cleavage of chaperone DFF45 (ICAD) by Caspases, thus linking endonuclease activation to activation of the cell death proteases. Five CIDE family members are evident in the transcriptomes of humans and mice (Table 16).
Comparison of CIDE-Containing Proteins in Humans and Mice
In addition to CIDE-family endonucleases, two proteins that are sequestered in mitochondria have been identified that can contribute directly or indirectly to genome digestion. These are EndoG, an evolutionarily conserved endonuclease (Li et al. 2001), and AIF, a flavoprotein that somehow promotes large-scale cleavage of genomic DNA during apoptosis (Susin et al. 1999). Humans and mice contain cDNAs encoding EndoG and AIF. Overall, therefore, the proteins associated with apoptosis-associated genome destruction appear to be well-conserved in humans and mice.
Miscellaneous Apoptosis-Relevant Proteins
We also compared the human and murine databases with respect to a variety of miscellaneous proteins of reported relevance to apoptosis regulation, including Cytochrome c (the Apaf1-activating protein), several Bcl-2-binding proteins (see above), and p53 and its relatives p63 and p73 (transcription factors that can regulate the expression of multiple apoptosis genes), finding conservation of these proteins in humans and mice.
Combinations of Domains
Notably, 39 of the proteins described above contain more than one of these domains. For example, several Caspases combine the Caspase protease fold with either CARD or DED domains. Para-caspase combines the DD with a Caspase-like fold. Several proteins also combine the NACHT domain with either CARD or PAAD domains. Only one predicted protein was identified in either humans or mice that combines the DD and DED, namely FADD. Only RAIDD combines both the CARD and DD domains. NAC and ASC were the two proteins found that combine the CARD and PAAD domains. No predicted proteins were found that combine a CARD with a DED domain or that pair a DED with a PAAD domain. Only one protein contains both BIR and NACHT domains, namely NAIP. Only cIAP1 and cIAP2 combine the BIR and CARD domains. Only MyD88 combines the DD and TIR domains. NFκB-1 and NFκB-2 have DDs in association with RHDs. We speculate that these few proteins that contain more than one of the signature domains implicated in apoptosis or inflammation represent critical points of cross-talk between the domain-families. This speculation is supported by gene ablation studies in mice in some instances (Kuida et al. 1995, 1998; Li et al. 1995; Nakagawara et al. 1997; Adachi et al. 1998; Hakem et al. 1998; Takahashi et al. 1998; Zhang et al. 1998; Kawai et al. 1999; Fitzgerald et al. 2001; Kabra et al. 2001).
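The domain combinations listed above can be thought of as queries against a protein-to-domain table. The following is a minimal sketch of such a lookup, assuming a hand-transcribed subset of the multi-domain proteins named in this paper. It is not the full set of 39, LRRs are omitted because they are not among the signature domains, and the names `DOMAINS` and `proteins_with` are hypothetical.

```python
# Signature domains per protein, transcribed from the text (partial list;
# ASCII spellings such as "NFkB-1" are ours).
DOMAINS = {
    "FADD": {"DD", "DED"},
    "RAIDD": {"CARD", "DD"},
    "NAC": {"PAAD", "NACHT", "CARD"},
    "ASC": {"CARD", "PAAD"},
    "NAIP": {"BIR", "NACHT"},
    "cIAP1": {"BIR", "CARD"},
    "cIAP2": {"BIR", "CARD"},
    "MyD88": {"DD", "TIR"},
    "NFkB-1": {"DD", "RHD"},
    "NFkB-2": {"DD", "RHD"},
    "Nod1": {"CARD", "NACHT"},
    "Nod2": {"CARD", "NACHT"},
    "Ipaf": {"CARD", "NACHT"},
    "Cryopyrin": {"PAAD", "NACHT"},
}


def proteins_with(*required):
    """Return the sorted names of proteins containing every required domain."""
    wanted = set(required)
    return sorted(name for name, doms in DOMAINS.items() if wanted <= doms)


print(proteins_with("DD", "DED"))     # the DD + DED combination
print(proteins_with("CARD", "PAAD"))  # the CARD + PAAD combination
print(proteins_with("CARD", "DED"))   # a combination reported as absent
```

Because the table is only a subset, an empty result here reflects the transcribed entries rather than an exhaustive database search.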
Conclusions
Comparison of the cDNA (EST) record of humans and mice reveals remarkable conservation of expressed genes involved in apoptosis and inflammation. In some cases, humans contain additional genes not evident in mouse, suggestive of more complex or more precise regulation of events such as Caspase-1 activation or signaling by TNF-family receptors. Despite the incompleteness of the mouse data, in several cases we can correlate absence of certain genes with obvious duplications in the human genome, which clearly happened after divergence of mouse and human ancestors. It is interesting to note that many of these human-specific expansions can be related to apoptosis triggered by infections or external stimuli. However, in other cases, mice have expanded gene numbers (e.g., A1, NAIP, TEFs) relative to humans, implying greater redundancy in some aspects of apoptosis regulation. Knowledge of the similarities and differences in the repertoire of expressed genes involved in apoptosis and inflammation in humans and mice lays a foundation for understanding the utility and limitations of the mouse models of disease that are used for validating targets for drug discovery and for testing new therapeutic agents prior to entering clinical trials in humans.
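To make the degree of conservation concrete, the sketch below tallies the human gene counts and mouse ortholog counts that are stated in the text for six of the families (NACHT, Rel, IκB, IKK, BAG, CIDE). It covers only those families and is not a recomputation of the study's overall totals; the dictionary `COUNTS` and the function names are illustrative.

```python
# (human genes, mouse orthologs) as stated in the text for each family.
COUNTS = {
    "NACHT": (20, 16),
    "Rel": (5, 5),
    "IkB": (8, 7),
    "IKK": (5, 5),
    "BAG": (6, 6),
    "CIDE": (5, 5),
}


def conserved_fraction(human, mouse):
    """Fraction of human genes with an identified mouse ortholog."""
    return mouse / human


def totals(counts):
    """Sum human and mouse counts across all listed families."""
    human = sum(h for h, _ in counts.values())
    mouse = sum(m for _, m in counts.values())
    return human, mouse


for family, (human, mouse) in COUNTS.items():
    pct = 100 * conserved_fraction(human, mouse)
    print(f"{family:6s} {mouse:2d}/{human:2d} ({pct:.0f}%)")

human_total, mouse_total = totals(COUNTS)
print(f"subset total {mouse_total}/{human_total}")
```

In this subset, the NACHT family accounts for most of the human genes without an identified mouse ortholog, consistent with the human-specific expansion discussed above.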
METHODS
Several parallel approaches were taken for comparative analysis of apoptosis genes in human and mouse. First, amino acid sequences of human apoptotic genes from our collection, as described earlier, were used for TBLASTN searches (Altschul et al. 1997) of the RIKEN EST collection in a search for homologous sequences. Second, amino acid translations of the RIKEN data were subjected to an automated annotation procedure, using PSI-BLAST to find homologous sequences of human apoptotic genes, followed by structure-based analysis using FFAS in conjunction with a library containing three-dimensional structures of the relevant protein folds. Third, a procedure similar to the second one was used for finding murine genes in public databases, including the NCBI nonredundant (NR) protein database and Jackson Laboratories.
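Searching a nucleotide collection with protein queries, as TBLASTN does, relies on translating each nucleotide sequence in all six reading frames. The sketch below illustrates only that translation step using the standard genetic code; it is not the BLAST implementation and performs no alignment or scoring. The function names and the short test sequence are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g., with N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Map frame labels (+1..+3, -1..-3) to translated protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for label, protein in frames.items():
    print(label, protein)
```

Each translated frame can then be compared against a protein query, which is why ESTs of unknown orientation and reading frame remain searchable.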
All identified murine genes were checked by phylogeny and multiple sequence-alignment methods to assign orthologs to human proteins where possible (Thompson et al. 1994; Li et al. 2000). For most of the protein domain families, multiple sequence alignments were prepared using T-Coffee (Notredame et al. 2000), and the resulting alignments were used as input to generate neighbor-joining (NJ) trees using Clustal (Thompson et al. 1997).
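For readers unfamiliar with NJ trees, the following is a minimal sketch of the neighbor-joining procedure applied to a pairwise distance matrix. It is a generic textbook-style implementation, not the Clustal code, and it starts from distances directly rather than deriving them from an alignment. The taxon labels `a` to `e` and their distances are a toy example, not data from this study.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted NJ tree.

    names: list of taxon labels.
    dist:  dict mapping frozenset({x, y}) to the distance between x and y.
    Returns a list of (node, neighbor, branch_length) edges.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    new_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Total distance from each node to all other active nodes.
        total = {i: sum(d[frozenset((i, k))] for k in nodes if k != i)
                 for i in nodes}
        # Join the pair that minimizes Q(i, j) = (n-2)*d(i,j) - r_i - r_j.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)]
                   - total[p[0]] - total[p[1]])
        dij = d[frozenset((i, j))]
        len_i = 0.5 * dij + (total[i] - total[j]) / (2 * (n - 2))
        len_j = dij - len_i
        u = f"node{new_id}"
        new_id += 1
        edges += [(i, u, len_i), (j, u, len_j)]
        # Distances from the new internal node to the remaining nodes.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (d[frozenset((i, k))]
                                              + d[frozenset((j, k))] - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Toy additive distance matrix for five hypothetical sequences.
labels = ["a", "b", "c", "d", "e"]
matrix = [[0, 5, 9, 9, 8],
          [5, 0, 10, 10, 9],
          [9, 10, 0, 8, 7],
          [9, 10, 8, 0, 3],
          [8, 9, 7, 3, 0]]
distances = {frozenset((labels[x], labels[y])): matrix[x][y]
             for x, y in combinations(range(len(labels)), 2)}

tree = neighbor_joining(labels, distances)
for node, neighbor, length in tree:
    print(f"{node} -- {neighbor}: {length}")
```

In an ortholog-assignment setting, a mouse sequence that groups with a single human sequence in such a tree, rather than with a whole subfamily, is the natural ortholog candidate.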
Footnotes
- Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1053803.
- 4 Corresponding author. E-MAIL jreed{at}burnham.org; FAX (858) 646-3194.
- 5 Takahiro Arakawa,2 Piero Carninci,2,3 Jun Kawai,2,3 and Yoshihide Hayashizaki.2,3
- Received January 6, 2003.
- Accepted April 8, 2003.
- Cold Spring Harbor Laboratory Press