Properties of Vibrio fischeri Superintegron Cassettes
| Gene cassette[i] | Cassette coordinates[ii] (bp) | Length of attC site[iii] (bp) | Name[iv]/length of ORF (bp) | G + C content of ORF (%) | Sequence similarity,[v] E-value,[vi] and motifs/activity[vii] |
| p667 + p641 inserts[viii] | |||||
| c667-1 | <1–39 | — | —[ix] | — | — |
| c667-2* | 40–601 | 98 | ND | — | BLASTX detects a 56-codon segment that shows homology to intron maturase |
| c667-3* | 602–1161 | 99 | ND | — | — |
| c667-4* | 1162–1724 | 98 | ND | — | — |
| c667-5/c641-1 | 1725–3415 | 116 | Orfc667-5/1539 | 29.4 | NH—3 transmembrane helices |
| c641-2 | 3416–4070 | 116 | Orfc641-2/426 | 35.7 | 40% identity to the Yersinia pestis putative acetyltransferase CAC93527 (E = 4e − 24) |
| c641-3 | 4071–4123> | — | ND | — | NH |
| p668 insert | |||||
| c668-1 | 1–686 | 118 | Orfc668-1/513 | 39.0 | 39% identity to the V. cholerae VCA1017 methylated-DNA–protein-cysteine S-methyltransferase hypothetical OGT (E = 3e − 22) |
| c668-2* | 687–1264 | 117 | ND | — | |
| c668-3 | 1265–1780 | 116 | Orfc668-3[x]/147 | 34.0 | NH |
| c668-4* | 1781–2339 | 97 | ND | — | — |
| c668-5 | 2340–3251 | 116 | Orfc668-5/756 | 31.7 | NH—signal peptide—2 transmembrane helices |
| c668-6 | 3552–4003> | — | Orfc668-6/711> | 30.2 | NH |
| p669 insert | |||||
| c669-1 | <1–404 | 116 | Orfc669-1/<280 | 33.3 | NH—1 transmembrane helix |
| c669-2 | 405–1145 | 118 | CcdB/315; CcdA/246 | 33.3; 34.2 | CcdB: gyrase-inhibiting protein, 42% identity to the F plasmid CcdB; CcdA: CcdB antidote, 42% identity to Escherichia coli O157:H7 CcdA (AE005182.1) |
| c669-3 | 1146–1910 | 118 | Orfc669-3/615 | 33.3 | 40% identity to Bacillus halodurans transcriptional repressor of sporulation and degradative enzymes production NP_241278 (E = 2e − 35) |
| c669-4 | 1911–3268 | 116 | Orfc669-4/1200 | 34.9 | 37% identity to the N-terminal 240 aa of Streptomyces coelicolor Serine/threonine protein kinase T36502 (E = 1e − 30) |
| c669-5* | 3269–3829 | 98 | — | — | — |
| c669-6** | 3830–4483> | — | Orfc669-6/632> | 34.3 | NH |
| p672 insert | |||||
| c672-1 | <1–79 | — | —[ix] | — | — |
| c672-2 | 80–682 | 116 | Orfc672-2/444 | 35.6 | 28% identity to the S. coelicolor putative acetyl transferase (E = 1e − 07) |
| c672-3* | 683–1260 | 117 | ND | — | — |
| c672-4 | 1261–2254 | 118 | Orfc672-4/843> | 39.3 | NH—signal peptide sequence—2 transmembrane helices |
| c672-5* | 2255–2813 | 98 | ND | — | — |
| c672-6 | 2814–3438> | — | Orfc672-6/605> | 33.6 | NH |
| p789 insert | |||||
| c789-1 | 1974–2679 | 116 | Orfc789-1/513 | 29.5 | 40% identity to the V. cholerae VCA0405 60 N-terminal amino acids and 35% identity to the Lactococcus lactis prophage pi2 protein 2 (AE006335_3) 75 C-terminal amino acids—signal peptide sequence—2 transmembrane helices |
| c789-2** | 2680–3028> | — | Orfc789-2/327> | 33.7 | NH (identical to c669-6) |
[i] Cassettes (c) have been named according to their plasmid number (see Table 1) and their position in the insert; the two families of repeated cassettes are indicated by * and **, respectively.
[ii] Sequence missing at the 5′ or 3′ end of an incomplete cassette is indicated by < or >, respectively.
[iii] The given attC site length is from the last Y of the inverse core site (RYYYAAC) to the G located upstream of the recombination point in the core site of the integrated cassette (GTTRRRY).
[iv] ORFs are in the classical positive orientation, that is, in the same direction as their associated attC site. ORFs in the opposite orientation are underlined. ND, no ORF > 150 bp detected.
[v] NH, no homologous protein detected by BLAST analysis (http://www.ncbi.nlm.nih.gov/BLAST/). When related to a V. cholerae SI cassette, the corresponding VCAxxx name is underlined.
[vi] Number of equal-scoring matches expected by chance; only results with E-values ≤ 10⁻² were considered.
[vii] Demonstrated activities are in bold characters. Motifs were identified through the CDD search option in BLAST analysis and with the signal peptide and transmembrane segment prediction programs SignalP and TMHMM (Center for Biological Sequence Analysis; http://www.cbs.dtu.dk/services/).
[viii] Because the p667 and p641 inserts overlap each other, we annotated the corresponding contig and deposited it in GenBank.
[ix] Irrelevant, as the sequence only corresponds to the 3′ part of the attC site.
[x] This 147-bp ORF is preceded by a canonical ribosome-binding site and consequently has been considered as a potential gene.
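
The attC length convention in note [iii] can be illustrated with a minimal Python sketch. It is not the procedure used to build the table. It assumes that the length is counted inclusively, from the last Y of the inverse core site (RYYYAAC) through the first G of the core site (GTTRRRY), and that the first core site found downstream of the inverse core site is the relevant one. The names `iupac_to_regex` and `attc_length` are hypothetical, and the test sequence is synthetic.

```python
import re

# IUPAC codes used in the two consensus motifs (R = purine, Y = pyrimidine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def iupac_to_regex(motif):
    """Translate an IUPAC consensus such as RYYYAAC into a regex pattern."""
    return "".join(IUPAC[base] for base in motif)


INVERSE_CORE = re.compile(iupac_to_regex("RYYYAAC"))
CORE = re.compile(iupac_to_regex("GTTRRRY"))


def attc_length(seq):
    """Return the attC length under the assumed inclusive convention.

    Counts from the last Y of the first inverse core site to the G that
    opens the next core site downstream. Returns None if either motif
    is not found.
    """
    seq = seq.upper()
    inv = INVERSE_CORE.search(seq)
    if inv is None:
        return None
    core = CORE.search(seq, inv.end())
    if core is None:
        return None
    last_y = inv.start() + 3   # fourth base of RYYYAAC is the last Y
    g_pos = core.start()       # first base of GTTRRRY is the G
    return g_pos - last_y + 1  # inclusive count (assumption)


# Synthetic example: inverse core GCCTAAC, a 10-nt spacer, core GTTAGGC.
example = "AA" + "GCCTAAC" + "A" * 10 + "GTTAGGC" + "TT"
print(attc_length(example))
```

On real cassette sequence, a simple first-match search of this kind may pick up chance occurrences of either motif, so any hit would need to be checked against the cassette boundaries given in the table.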