Table 2.

Properties of Vibrio metschnikovii Superintegron Cassettes

| Gene cassette[i] | Cassette coordinates[ii] (bp) | Length of attC site[iii] (bp) | Name[iv]/length of ORF (bp) | G + C content of ORF (%) | Sequence similarity,[v] E-value,[vi] and motifs[vii] |
|---|---|---|---|---|---|
| p253 insert | | | | | |
| c253-1* | <1–987[viii] | 117 | ND | | (ISVme1 insertion) |
| c253-2 | 988–1531 | 116 | Orfc253-2/387 | 37.2 | 70% identity to the V. cholerae VCA0890 Glyoxylase I family protein |
| c253-3 | 1532–2036 | 118 | Orfc253-3/360 | 40.1 | 94% identity to the V. cholerae VCA0338 and VC0415 |
| c253-4 | 2037–2825 | 117 | Orfc253-4/642 | 36.6 | NH; 1 transmembrane helix |
| c253-5 | 2826–3373 | 118 | Orfc253-5/396 | 39.4 | 66% identity to the V. cholerae VCA0414 and VC0425; signal peptide sequence |
| c253-6 | 3374–3926> | | Orfc253-6a/171 | 39.3 | 100% identity to the V. cholerae VCA0474 C-terminal 56 aa, see text |
| | | | Orfc253-6b/320> | 42.5 | 99% identity to the V. cholerae VCA0475 (30% identity to phage P1 Doc) |
| p273 insert | | | | | |
| c273-1 | <1–490 | 118 | Orfc273-1/<375 | 38.7 | 48% identity to the C-terminal part of Bacillus halodurans hypothetical protein BH3804 (E = 7e-6) |
| c273-2* | 491–992 | 117 | ND | | |
| c273-3 | 993–1697 | 118 | Orfc273-3/558 | 38.0 | NH |
| c273-4 | 1698–2385 | 118 | Orfc273-4/336 | 49.4 | NH |
| c273-5 | 2386–3367 | 117 | Orfc273-5/837 | 36.9 | 22% identity to HphI restriction endonuclease (E = 4e-8) |
| c273-6* | 3368–4055 | 117 | ND | | |
| c273-7 | 4056–4557> | | Orfc273-7/361> | 31.7 | NH; signal peptide sequence |
| p372 insert | | | | | |
| c372-1 | <1–821 | 118 | Orfc372-1/<689 | 31.7 | NH |
| c372-2 | 822–1316 | 117 | Orfc372-2/291 | 34.7 | 34% identity to Salmonella enterica hypothetical protein CAD05348 (E = 0.01); signal peptide sequence |
| c372-3 | 1317–2077 | 117 | Orfc372-3/609 | 35.3 | NH; signal peptide sequence; 6 transmembrane helices |
| c372-4 | 2078–2931 | 118 | Orfc372-4/708 | 36.6 | 40% identity to the Salmonella typhimurium LT2 putative aspartate racemase AAL21891 (E = 7e-41) |
| c372-5* | 2932–3621 | 117 | ND | | |
| c372-6 | 3622–4242>[ix] | | ND | | (ISVme1 insertion) |
| p374 insert | | | | | |
| c374-1 | 658–1711 | 117 | Orfc374-1/879 | 34.0 | NH |
| c374-2** | 1712–2607 | 118 | Orfc374-2/669 | 38.9 | 20.5% identity to Lactococcus lactis methyltransferase CAA68045 (E = 0.001) |
| c374-3 | 2608–3383 | 117 | Orfc374-3/633 | 36.8 | 26.3% identity to Agrobacterium tumefaciens hypothetical methyltransferase AAK87648 (E = 3e-13) |
| c374-4** | 3384–4279 | 118 | Orfc374-4/669 | 39.3 | 98% identity to c374-2 |
| c374-5 | 4280–5137 | 118 | Orfc374-5/705 | 36.3 | NH |
| c374-6 | 5138–5601 | 116 | ND | | |
| c374-7 | 5602–6437> | | Orfc374-7/771 | 40.6 | 29% identity to Sinorhizobium meliloti hypothetical oxidoreductase CAC46573 (E = 7e-22) |
| PCR (VMR1 + VMR2) | | | | | |
| Vme1 | 1–437> | / | OrfVme1/396 | 39.4 | 100% identity to c253-5 |
| Vme2 | 1–578> | / | OrfVme2a/267 | 39.1 | 92% identity to the V. cholerae VCA0332 |
| | | | OrfVme2b/243 | 40.0 | 82% identity to V. cholerae VCA0333 |
| Vme4* | 1–393> | / | ND | | |
| Vme9 | 1–502> | / | OrfVme9/375 | 33.6 | 46% identity between the last 40 C-terminal aa and a central segment of chicken paxillin B55933; signal peptide sequence; 2 transmembrane helices |
| Vme11 | 1–496> | / | OrfVme11/414 | 41.3 | 91% identity to the V. cholerae VCA0476 |
| Vme12 | 1–335> | / | OrfVme12/198 | 34.9 | 36% identity to the V. cholerae VCA0426 (E = 4e-6) |
| Vme23* | 1–394> | / | ND | | |

[i] Cassettes (c) are named according to their plasmid number (see Table 1) and their position in the insert or, for the cassettes obtained from the (VMR1 + VMR2) PCR, by the prefix Vme followed by a number; the two families of repeated cassettes are indicated by * and **, respectively.

[ii] Sequences missing 5′ or 3′ in incomplete cassettes are indicated by < and >, respectively.

[iii] The given attC site length is from the last Y of the inverse core site (RYYYAAC) to the G located upstream of the recombination point in the core site of the integrated cassette (GTTRRRY).

[iv] ORFs are in the classical positive orientation, that is, in the same direction as their associated attC site. ORFs in the opposite orientation are underlined; ND, no ORF > 150 bp detected.

[v] NH, no homologous protein detected by BLAST analysis (http://www.ncbi.nlm.nih.gov/BLAST/). When related to a V. cholerae SI cassette, the corresponding VCAxxx name is underlined.

[vi] Number of equal-scoring matches expected by chance; only results with a value ≤ 10^-2 were considered.

[vii] Motifs were identified through the CDD search option in BLAST analysis and by using the signal peptide and transmembrane segment prediction programs SignalP and TMHMM (Center for Biological Sequence Analysis; http://www.cbs.dtu.dk/services/).

[viii] The sequence from 1 to 540 corresponds to a previously undescribed IS, ISVme1.

[ix] The sequence from 3703 to the end corresponds to an ISVme1, identical to the one found in cassette c253-1.
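
The attC length convention of footnote [iii] can be illustrated with a minimal Python sketch. It is not the procedure used to build the table. The function names, the toy sequence, the inclusive counting of both boundary bases, and the pairing of the first inverse core match with the first downstream core match are all assumptions made for illustration; real attC sites are normally delimited with additional structural criteria.

```python
import re

# IUPAC codes needed for the two consensus motifs (R = purine, Y = pyrimidine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def iupac_to_regex(motif):
    """Translate an IUPAC consensus such as 'RYYYAAC' into a regex pattern."""
    return "".join(IUPAC[base] for base in motif)


INVERSE_CORE = re.compile(iupac_to_regex("RYYYAAC"))
CORE = re.compile(iupac_to_regex("GTTRRRY"))


def attc_length(seq):
    """Length from the last Y of RYYYAAC to the G of GTTRRRY, both included.

    Hypothetical helper: it pairs the first inverse core match with the
    first core match found downstream, which is a simplification.
    Returns None if either motif is missing.
    """
    seq = seq.upper()
    inverse_core = INVERSE_CORE.search(seq)
    if inverse_core is None:
        return None
    core = CORE.search(seq, inverse_core.end())
    if core is None:
        return None
    start = inverse_core.start() + 3  # 0-based index of the last Y in RYYYAAC
    end = core.start()                # 0-based index of the G in GTTRRRY
    return end - start + 1


# Toy sequence (invented, not from V. metschnikovii): inverse core, spacer, core.
toy = "GCCTAAC" + "A" * 10 + "GTTAGGC"
print(attc_length(toy))
```

On the toy sequence the measured span is the four bases TAAC, the ten spacer bases, and the G of the core site.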