Protein Coding Palindromes Are a Unique but Recurrent Feature in Rickettsia

Table 2.

Full Size Repeat Insertions within the Rickettsia conorii Genes

| Gene name | Function | Phylogenetic distribution | Size (bp) | Location of repeat insertion | Structure data |
| --- | --- | --- | --- | --- | --- |
| RPE-1 | | | | | |
| coxB (RC0555) | Cytochrome c oxidase polypeptide II | RPG--YOAE | 945 | 16..159 | 1OCC/Bos taurus |
| era (RC0158) | GTP-binding protein | RPG-SYO-E | 1017 | 10..144 | 1EGA/Escherichia coli |
| gltX (RC0966) | Glutamyl-tRNA synthetase | RPGCSYOAE | 1539 | 1006..1149 | 1GLN/Thermus thermophilus |
| gmk (RC1194) | Guanylate kinase | RPGC-YO-E | 687 | 19..126 | 1GKY/Saccharomyces cerevisiae |
| hemC (RC0706) | Porphobilinogen deaminase | RPGC-YOAE | 1053 | 772..918 | 1PDA/Escherichia coli |
| kdtA (RC0118) | 3-deoxy-D-manno-octulosonic-acid transferase | RP-C--O-- | 1392 | 147..290 | |
| mesJ (RC0067) | Cell cycle protein MesJ | RPGCSYO-- | 1434 | 625..768 | |
| mviN (RC0898) | Virulence factor MviN | RP-CSYO-- | 1665 | 6..149 | |
| pcnB (RC0015) | Poly(A) polymerase | RPGCSYO-E | 1308 | 120..263 | |
| rlpA (RC0537) | Rare lipoprotein A precursor | RP--SYO-- | 960 | 33..175 | |
| ssrA | tmRNA precursor | RPGCSYO-- | 474 | 78..223 | |
| truB (RC0665) | tRNA pseudouridine 55 synthase | RPGCSYOAE | 1035 | 784..927 | |
| ubiG (RC0965) | 3-demethylubiquinone-9 3-methyltransferase | RP------E | 867 | 148..291 | |
| ubiH (RC0848) | Ubiquinone biosynthesis protein | RP---Y--E | 1293 | 31..177 | |
| RC1039 | Split gene of mannose-1-phosphate guanylyltransferase | -P---YOA- | 627 | 9..151 | |
| RC0071 | Unknown function | RP---YO-E | 1218 | 985..1128 | |
| RC0127 | Unknown function | --------- | 222 | 42..185 | |
| RC0183 | Unknown function | --------- | 1158 | 706..840 | |
| RC0209 | Unknown function | R-------- | 279 | 22..165 | |
| RC0659 | Unknown function | R-------- | 582 | 31..174 | |
| RC0675 | Unknown function | --------- | 225 | 18..162 | |
| RC0809 | Unknown function | RPGCSYO-- | 735 | 349..492 | |
| RC1172 | Unknown function | --------- | 345 | 18..161 | |
| RC1201 | Unknown function | --------- | 240 | 100..243 | |
| RPE-2 | | | | | |
| atpG (RC1236) | ATP synthase γ chain | RPG--YO-E | 969 | 622..726 | 1H8E/Bos taurus |
| ksgA (RC1022) | Dimethyladenosine transferase | RPGCSYOAE | 945 | 370..468 | 1YUB/Streptococcus pneumoniae |
| nuoC (RC0483) | NADH dehydrogenase I chain C | RPG--YOAE | 726 | 199..303 | |
| RC0698 | Unknown function | R-------- | 1002 | 100..201 | |
| RC0715 | Unknown function | R-------- | 753 | 299..374 | |
| RPE-3 | | | | | |
| envZ (RC0592) | Osmolarity sensor protein EnvZ | RP------- | 1437 | 34..149 | |
| lpxB (RC0440) | Lipid-A-disaccharide synthase | RP-C-YO-- | 1338 | 1138..1253 | |
| murD (RC0560) | UDP-N-acetylmuramoylalanine-D-glutamate ligase | RPGCSYO-- | 1500 | 796..917 | 1E0D/Escherichia coli |
| ptrB (RC0377) | Protease II | RPG------ | 2187 | 526..641 | 1QFM/Sus scrofa (pig) |
| RPE-4 | | | | | |
| RC0521 | Unknown function | --------- | 336 | 29..122 | |
| RC0679 | Unknown function | --------- | 1806 | 243..337 | |
| RPE-5 | | | | | |
| RC0340 | Unknown function | --------- | 171 | 45..165 | |
| rnpB | RNA subunit (M1 RNA) of ribonuclease P | RPGCSYOAE | 458 | 118..228 | |
| RPE-6 | | | | | |
| RC0614 | Unknown function | --------- | 387 | 198..334 | |
| RPE-7 | | | | | |
| RC1210 | Unknown function | --------- | 303 | 30..129 | |
| RR-1 | | | | | |
| RC1196 | Unknown function | --------- | 180 | 9..35 | |

  • Abbreviations for the organism groups are as follows. R: Rickettsia (Rickettsia prowazekii); P: Proteobacteria (Escherichia coli K-12, Haemophilus influenzae, Xylella fastidiosa, Vibrio cholerae, Pseudomonas aeruginosa, Buchnera sp., Neisseria meningitidis serogroup A and B, Helicobacter pylori 26695 and J99, Campylobacter jejuni); G: Gram positive bacteria (Bacillus subtilis, Bacillus halodurans, Mycoplasma genitalium, Mycoplasma pneumoniae, Ureaplasma urealyticum, Mycobacterium tuberculosis); C: Chlamydia (Chlamydia trachomatis, Chlamydia muridarum, Chlamydia pneumoniae CWL029, AR39 and J138); S: Spirochete (Borrelia burgdorferi, Treponema pallidum); Y: Cyanobacteria (Synechocystis); O: Other bacteria (Deinococcus radiodurans, Aquifex aeolicus, Thermotoga maritima); A: Archaea (Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, Halobacterium sp., Thermoplasma acidophilum, Pyrococcus horikoshii, Pyrococcus abyssi, Aeropyrum pernix); E: Eukaryotes (Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster). When there is no homolog within an organism group, ‘-’ replaces the organism abbreviation.

  • Gene size without stop codon.

  • The repeat location is indicated by the base position within the gene.

  • Protein Data Bank identifiers and the organism names for the available structure data.
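
The table's encodings can be unpacked mechanically. The following is a minimal Python sketch, not part of the original article, that decodes a nine-character phylogenetic distribution string into group names and computes the length of a repeat insertion from its location. The function names `decode_distribution` and `repeat_span` are hypothetical, and the length calculation assumes that both positions in a location such as `16..159` are 1-based and inclusive, which the table does not state explicitly.

```python
# Organism-group letters in the fixed order used by the
# "Phylogenetic distribution" column (R, P, G, C, S, Y, O, A, E).
GROUPS = [
    ("R", "Rickettsia"),
    ("P", "Proteobacteria"),
    ("G", "Gram positive bacteria"),
    ("C", "Chlamydia"),
    ("S", "Spirochete"),
    ("Y", "Cyanobacteria"),
    ("O", "Other bacteria"),
    ("A", "Archaea"),
    ("E", "Eukaryotes"),
]


def decode_distribution(code):
    """Return the names of the groups in which a homolog is reported."""
    if len(code) != len(GROUPS):
        raise ValueError(f"expected {len(GROUPS)} characters, got {len(code)}")
    present = []
    for char, (letter, name) in zip(code, GROUPS):
        if char == letter:
            present.append(name)
        elif char != "-":
            # Each position may only hold its own letter or '-'.
            raise ValueError(f"unexpected character {char!r} at the {letter} position")
    return present


def repeat_span(location):
    """Parse 'start..end' and return (start, end, length in bp).

    Assumption: both coordinates are 1-based and inclusive.
    """
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    return start, end, end - start + 1


if __name__ == "__main__":
    # coxB (RC0555) row from the table above.
    print(decode_distribution("RPG--YOAE"))
    print(repeat_span("16..159"))
```

Under the inclusive-coordinate assumption, the coxB insertion at 16..159 spans 144 bp, and its distribution string reports homologs in every group except Chlamydia and Spirochete.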

This Article

  1. Genome Res. 12: 808-816
