Title:
USE OF HIGHLY-SPECIFIC ENDOPEPTIDASES
Document Type and Number:
WIPO Patent Application WO/2015/084197
Kind Code:
A1
Abstract:
The present invention relates to the use of certain proteins as site-specific glutamyl endopeptidases of novel, high specificity to substrates to digest proteins within unique amino acid sequences for targeted processing of proteins or peptides from the carboxyl side of the glutamic acid residue, including the cleavage of peptide or protein tags from tag fusion proteins.

Inventors:
ŁOBOCKA MAŁGORZATA (PL)
GĄGAŁA URSZULA (PL)
BELINO-STUDZIŃSKA PAULINA (PL)
DĘBSKI JANUSZ (PL)
DADLEZ MICHAŁ (PL)
Application Number:
PCT/PL2014/000134
Publication Date:
June 11, 2015
Filing Date:
December 04, 2014
Assignee:
INST BIOCHEMII I BIOFIZYKI POLSKIEJ AKADEMII NAUK (PL)
SZKOŁA GŁÓWNA GOSPODARSTWA WIEJSKIEGO (PL)
International Classes:
C07K14/00
Other References:
PARK J.W. ET AL.: "Purification and biochemical characterization of a novel glutamyl endopeptidase secreted by a clinical isolate of Staphylococcus aureus", INTERNATIONAL JOURNAL OF MOLECULAR MEDICINE, vol. 27, no. 5, 1 May 2011 (2011-05-01), XP055176357, ISSN: 1107-3756, DOI: 10.3892/ijmm.2011.625
PRASAD L ET AL: "THE STRUCTURE OF A UNIVERSALLY EMPLOYED ENZYME: V8 PROTEASE FROM STAPHYLOCOCCUS AUREUS", ACTA CRYSTALLOGRAPHICA SECTION D: BIOLOGICAL CRYSTALLOGRAPHY, MUNKSGAARD PUBLISHERS LTD. COPENHAGEN, DK, vol. 60, no. 2, 1 February 2004 (2004-02-01), pages 256 - 259, XP009037067, ISSN: 0907-4449, DOI: 10.1107/S090744490302599X
SHINJI KAKUDO ET AL: "Purification, Characterization, Cloning, and Expression of a Glutamic Acid-specific Protease from Bacillus licheniformis ATCC 14580", INC. OF BIOLOGICAL CHEMISTRY, 25 November 1992 (1992-11-25), pages 23782 - 23788, XP055175853, Retrieved from the Internet [retrieved on 1992]
MEIJERS ROB ET AL: "The crystal structure of glutamyl endopeptidase from Bacillus intermedius reveals a structural link between zymogen activation and charge compensation", BIOCHEMISTRY, AMERICAN CHEMICAL SOCIETY, US, vol. 43, no. 10, 16 March 2004 (2004-03-16), pages 2784 - 2791, XP002332542, ISSN: 0006-2960, DOI: 10.1021/BI035354S
STREIFF M B ET AL: "Expression and proteolytic processing of the darA antirestriction gene product of bacteriophage P1", VIROLOGY, ELSEVIER, AMSTERDAM, NL, vol. 157, no. 1, 1 March 1987 (1987-03-01), pages 167 - 171, XP023048944, ISSN: 0042-6822, [retrieved on 19870301], DOI: 10.1016/0042-6822(87)90325-4
M. B. LOBOCKA ET AL: "Genome of Bacteriophage P1", JOURNAL OF BACTERIOLOGY, vol. 186, no. 21, 15 October 2004 (2004-10-15), pages 7032 - 7068, XP055175990, ISSN: 0021-9193, DOI: 10.1128/JB.186.21.7032-7068.2004
CHOI SI; SONG HW; MOON JW; SEONG BL.: "Recombinant enterokinase light chain with affinity tag: expression from Saccharomyces cerevisiae and its utilities in fusion protein technology.", BIOTECHNOL BIOENG, vol. 75, 2001, pages 718 - 24
LENART A.; DUDKIEWICZ M.; GRYNBERG M.; PAWLOWSKI K.: "CLCAs-A Family of Metalloproteases of intriguing phylogenetic distribution and with cases of substituted catalytic sites.", PLOS ONE, vol. 8, no. 5, 2013, pages E62272
LOBOCKA M. B.; SVARCHEVSKY A. N.; RYBCHIN V. N.; YARMOLINSKY M. B.: "Characterization of the primary immunity region of the Escherichia coli linear plasmid prophage N15.", J BACTERIOL, vol. 178, no. 10, 1996, pages 2902 - 2910
PARKS TD; LEUTHER KK; HOWARD ED; JOHNSTON SA; DOUGHERTY WG.: "Release of proteins and peptides from fusion proteins using a recombinant plant virus proteinase.", ANAL BIOCHEM., vol. 216, 1994, pages 413 - 7, XP024763375, DOI: doi:10.1006/abio.1994.1060
STREIFF M.B.; IIDA S.; BICKLE T.A.: "Expression and proteolytic processing of the darA antirestriction gene product of bacteriophage P1", VIROLOGY, vol. 157, no. 1, 1987, pages 167 - 171, XP023048944, DOI: doi:10.1016/0042-6822(87)90325-4
TERPE K.: "Overview of tag protein fusions: from molecular and biochemical fundamentals to commercial systems.", APPL MICROBIOL BIOTECHNOL, vol. 60, no. 5, 2003, pages 523 - 533, XP002298417
VERGIS JM; WIENER MC.: "The variable detergent sensitivity of proteases that are utilized for recombinant protein affinity tag removal.", PROTEIN EXPR PURIF., vol. 78, 2011, pages 139 - 42
YARMOLINSKY M.B.; STERNBERG N.L.: "Bacteriophage P1.", 1988, PLENUM PRESS, article "The Bacteriophages", pages: 291 - 438
Attorney, Agent or Firm:
PADEE, Grażyna (kl. A. lok. 20, 00-663 Warszawa, PL)
Claims:

1. Use of a protein which has at least 60% identity with the amino acid sequence specified in either SEQ ID no. 1, or with the sequence specified in SEQ ID no. 2, or with the sequence specified in SEQ ID no. 3, as a site-specific glutamyl endopeptidase for endoproteolytic digestion of proteins or peptides on the carboxyl side of a glutamic acid residue, between the glutamic acid residue and:

- serine residue within the ESV amino acid sequence motif for the protein being a product of SEQ no. 1 gene;

- alanine residue within the EAI amino acid sequence motif for the protein being either the product of SEQ no. 2 or SEQ no. 3 gene.

2. Use according to claim 1, characterised in that either Pro (SEQ no. 1) or DdrB (SEQ no. 2) or DdrB-C (SEQ no. 3) proteins are used, being the products of pro or ddrB P1 bacteriophage genes, or the products thereof modified by point mutations, deletions or insertions, or of homologous genes where the modification determines at least 60% identity of these products with the amino acid sequence specified either in SEQ ID no. 1, or the sequence specified in SEQ ID no. 2, or the sequence specified in SEQ ID no. 3.

3. Use according to claims 1 or 2, characterised in that the protein of sequence SEQ ID no. 1 is used to digest a protein or peptide within the ESV sequence motif nested in the region of the protein or peptide of sequence: MXαXβXαΦΦESVααX, where: M is a methionine residue, X is any amino acid residue, α is any small amino acid residue, β is any polar amino acid residue, Φ is any medium or large amino acid residue.

4. Use according to claims 1 or 2 or 3, characterised in that the protein of sequence SEQ ID no. 1 is used preferably to digest proteins or peptides at sites of amino acid sequences: MDATNAMLESVAAE and MQGISGGLFESVSGG.

5. Use according to claims 1 or 2, characterised in that the protein of either SEQ ID no. 2 or SEQ ID no. 3 is used to digest proteins or peptides within the EAI sequence motif, which at the digestion region is preceded directly by a large or medium hydrophobic amino acid residue.

6. Use according to claims 1 or 2 or 5, characterised in that the protein of either SEQ ID no. 2 or SEQ ID no. 3 is used preferably to digest proteins or peptides within the EATFFYDTPIHWCATDLLEAISSTRLQLHRT sequence.

Description:
Use of highly-specific endopeptidases

The subjects of the present invention are highly-specific endopeptidases cleaving proteins within unique amino acid sequences, and the use thereof for targeted protein processing, including the cleavage of peptide or protein tags from proteins prepared by fusion with such tags.

Currently used technologies for the purification of many proteins rely on the production of these proteins in the form of their recombinant derivatives, where a specific protein is connected by a short linker, usually consisting of a few, and up to twenty, amino acid residues, with a so-called affinity tag - a peptide or protein that can be easily purified under standard conditions by affinity chromatography (Terpe, 2003). The untagged native form of the protein is then obtained by cleaving off the tag with a highly-specific endopeptidase and separating the resulting mixture of two proteins by removing the tag by affinity chromatography.

Methods for the preparation of pure proteins employing an intermediate purification of their recombinant affinity-tagged derivatives are among the most cost-effective, and offer the highest recovery yield. They do not require de novo development of a purification procedure each time, because standard purification conditions developed for the tag can be applied, with little, if any, need for optimisation. This process allows for the preparation of purified proteins, such as Yersinia pestis V antigen, for vaccine production (Carr et al., 1999).

However, the possibility to use these methods is limited by the very low number of available endopeptidases digesting proteins in regions with rare combinations of amino acid sequences. In fact, an endopeptidase can be used only if the amino acid sequence that it recognises is present in the linker between the protein and the tag but absent from both the protein of interest and the tag. So far, protein purification involving an intermediate purification step of recombinant derivatives employs only a few endopeptidases: enterokinase, TEV (tobacco etch virus) protease, thrombin, human rhinovirus 3C (HRV3C) protease, and factor Xa (Terpe, 2003) (Vergis and Wiener, 2011). These endopeptidases digest proteins, respectively, in the regions of the following amino acid sequence motifs (↓ marks the cleaved bond): DDDDK↓X (Choi et al., 2001), ENLYFQ↓(G/S) (Parks et al., 1994), LVPR↓GS, LEVLFQ↓GP (Birch et al., 1995) and I(E/D)GR↓X (Choi et al., 2001). However, a number of proteins contain amino acid sequence motifs recognised by the above-mentioned endopeptidases. Therefore, the identification of new endopeptidases recognising rare amino acid sequence motifs other than those mentioned above is essential for the development of techniques for protein purification using the recombinant forms of these proteins, and the search for new endopeptidases of differing specificities has become the goal of an increasing number of studies.
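The limitation described above can be illustrated by scanning a protein sequence for the recognition motifs of the commonly used endopeptidases: any internal hit rules out the corresponding enzyme for tag removal from that protein. The following is a minimal Python sketch. The regular expressions are simplified renderings of the motifs listed above, and the function name and the example sequence are hypothetical.

```python
import re

# Simplified recognition motifs of commonly used tag-removal endopeptidases.
# Illustrative only: the real enzymes tolerate some sequence variation.
KNOWN_SITES = {
    "enterokinase": r"DDDDK",
    "TEV protease": r"ENLYFQ[GS]",
    "thrombin": r"LVPRGS",
    "HRV3C protease": r"LEVLFQGP",
    "factor Xa": r"I[ED]GR",
}

def find_internal_sites(sequence):
    """Return {protease: [0-based start positions]} for motifs found in sequence."""
    hits = {}
    for name, pattern in KNOWN_SITES.items():
        positions = [m.start() for m in re.finditer(pattern, sequence)]
        if positions:
            hits[name] = positions
    return hits

# Hypothetical target protein that happens to contain a factor Xa motif.
example = "MKTAYIAKQRIEGRSLVNAG"
print(find_internal_sites(example))  # {'factor Xa': [10]}
```

A protein for which this scan returns an empty dictionary would, under these simplified patterns, be compatible with any of the five enzymes.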

The purpose of the presented invention is to provide enzymes enabling protein digestion within their internal regions, which are characterised by different amino acid sequences from the sequences recognised and digested by previously identified and purified endopeptidases, and that comprise amino acid sequence motifs that are rare in proteins but necessary for their recognition and cleavage by endopeptidase.

The choice of a potential source of endopeptidases was driven by the fact that proteolytic processing of some proteins occurs frequently as a stage of virion maturation in tailed bacteriophages. Another factor stimulating the choice was that the mature virions of tailed bacteriophages released after the lysis of bacterial cells can be obtained in large volumes and separated from bacterial proteins using standard procedures, while the sequence analysis of more than twenty proteins contained in the virion offers an opportunity to identify the proteins that undergo proteolytic processing. P1 bacteriophage was chosen as a potential source of such endopeptidases, because the reference literature presents reports suggesting the processing of two virion proteins of this phage (Streiff et al., 1987) (Lobocka et al., 2004).

There are scientific reports describing DdrB and Pro proteins coded by the late operons of the P1 bacteriophage, whose products are associated with progeny virion production (Yarmolinsky and Sternberg, 1988) (Lobocka et al., 2004). The genes encoding both proteins have been identified. Some researchers suggested, before the identification of the nucleotide sequence of the P1 phage genome, that the product of the gene located next to or within the region of a gene later identified as Pro, may be involved in the proteolytic processing of P1 virion protein DarA, which plays an antirestriction function (Streiff et al., 1987), but this relationship has not been further investigated. Research carried out by the authors of the present invention revealed that the Pro protein is not associated with the processing of DarA. In turn, an amino acid sequence motif similar to that found in many zinc-dependent metalloproteases involved in various metabolic and regulatory functions was identified in the DdrB protein. However, the function of this protein was not analysed (Lobocka et al., 2004) (Lenart et al., 2013).

The authors of the present invention found that among proteins contained in the virion, as many as four, i.e. DarA*, gp23*, DdrB-N and DdrB-C, are the products of proteolytic processing of their precursors in the regions characterised by amino acid sequence motifs rarely identified in proteins (IEATFFYDTPIHWCATDLLE↓AISSTRLQLHRT, MDATNAMLE↓SVAAE and MQGISGGLFE↓SVSGG), different from those recognised by previously identified endopeptidases. Surprisingly, it was found that the gene products of pro and ddrB of P1 bacteriophage are highly-specific endopeptidases responsible for proteolytic processing of these protein precursors. In summary, the present invention opens new prospects for the use of either Pro or DdrB for the cleavage of affinity tags from the proteins fused with tags through linkers containing respective amino acid sequence motifs recognised by either Pro or DdrB.

The essence of the invention is the use of a protein which has at least 60% identity with the amino acid sequence of either SEQ ID no. 1, or with the sequence SEQ ID no. 2, or with the sequence of SEQ ID no. 3, as a site-specific glutamyl endopeptidase for endoproteolytic digestion of proteins or peptides on the carboxyl side of a glutamic acid residue, between the glutamic acid residue and:

- serine residue within the ESV amino acid sequence motif for the protein of the amino acid sequence specified in SEQ no. 1;

- alanine residue in the EAI amino acid sequence motif for the protein of amino acid sequence specified either in SEQ no. 2 or SEQ no. 3.

Preferably, Pro (SEQ no. 1) or DdrB (SEQ no. 2) or DdrB-C (SEQ no. 3) proteins are used, which are the products of bacteriophage P1 pro or ddrB genes, or the products of these genes modified by point mutations, deletions or insertions, or homologous genes where the modification determines at least 60% identity of these products with the amino acid sequence specified either in SEQ ID no. 1 or the sequence specified in SEQ ID no. 2, or the sequence specified in SEQ ID no. 3.

Preferably, the ESV sequence in the region of the cleavage site is nested in the region of protein sequence:

MXαXβXαΦΦE↓SVααX,

where:

M is a methionine residue,

X is any amino acid residue,

α is any small amino acid residue,

β is any polar amino acid residue,

Φ is any medium or large hydrophobic amino acid residue.

The cleavage site is marked with the ↓ symbol.
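The consensus defined above can be expressed as a regular expression. The following Python sketch is an illustration only: this document does not enumerate which residues count as small, polar, or medium or large hydrophobic, so the class memberships below are placeholder assumptions, and the function name is hypothetical. The sketch is checked only against the DdrB site quoted in this document.

```python
import re

# ASSUMPTION: the residue classes are not enumerated in the source;
# these memberships are placeholders chosen for illustration.
SMALL = "GAS"            # alpha: small residues
POLAR = "STNQ"           # beta: polar residues
HYDROPHOBIC = "LIFMVWY"  # Phi: medium or large hydrophobic residues

# M X alpha X beta X alpha Phi Phi E | S V alpha alpha X
# The lookahead keeps the match end exactly at the cleaved bond (after E).
PRO_SITE = re.compile(
    rf"M.[{SMALL}].[{POLAR}].[{SMALL}][{HYDROPHOBIC}]{{2}}E(?=SV[{SMALL}]{{2}}.)"
)

def pro_cut_positions(sequence):
    """Return the 0-based indices at which the chain would be split (after E)."""
    return [m.end() for m in PRO_SITE.finditer(sequence)]

site = "MQGISGGLFESVSGG"  # DdrB processing site quoted in this document
cuts = pro_cut_positions(site)
print(cuts)                                       # [10]
print([site[:c] + "|" + site[c:] for c in cuts])  # ['MQGISGGLFE|SVSGG']
```

Widening or narrowing the three class strings changes which candidate linkers are accepted, so they should be replaced by experimentally supported sets before any practical use.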

Preferably, digestion with the protein of the sequence SEQ ID no. 1 occurs in sites of amino acid sequence: MDATNAMLE↓SVAAE and MQGISGGLFE↓SVSGG.

Preferably, the EAI motif in the region of cleavage by the protein of sequence SEQ ID no. 2 or protein of sequence SEQ ID no. 3 is preceded by a large or medium hydrophobic amino acid residue. Preferably, digestion with the protein of the sequence SEQ ID no. 2 or the protein with the sequence SEQ ID no. 3 occurs within the sequence EATFFYDTPIHWCATDLLEAISSTRLQLHRT.

Preferably, digestion by the protein of the sequence SEQ ID no. 1, or the protein of the sequence SEQ ID no. 2, or the protein of the sequence SEQ ID no. 3 occurs in a reaction environment of neutral or near-neutral pH, characteristic for Escherichia coli cells growing under aerobic conditions.

Because of their unique specificity, Pro and DdrB proteins can be used for digesting proteins within highly-specific or rare amino acid sequence motifs, thus they can be used as specific endopeptidases for the digestion of synthetically introduced linkers of a specific amino acid sequence between proteins or peptides, including the cleavage of affinity tags in respective fusion proteins, separated from the tags by specific peptide linkers. Moreover, we found that a certain tolerance of Pro and DdrB to specific amino acid substitutions in the regions of sites recognised and digested by these endopeptidases allows for a wider range of amino acid sequences in peptide linkers used for digestion by Pro or DdrB.
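As a rough illustration of tag removal through such a linker, the following Python sketch splits a hypothetical fusion protein at an EAI motif directly preceded by a hydrophobic residue. The tag, the target protein, the residue set treated as large or medium hydrophobic, and the helper name are assumptions made for illustration. Only the linker is taken from the DarA processing region quoted in this document.

```python
import re

# ASSUMPTION: residues treated as "large or medium hydrophobic";
# the source does not list them explicitly.
HYDROPHOBIC = "LIFMVWY"
DDRB_SITE = re.compile(rf"(?<=[{HYDROPHOBIC}])E(?=AI)")

def cleave_at_eai(sequence):
    """Split sequence after every E of a hydrophobic-E-A-I site."""
    fragments, start = [], 0
    for m in DDRB_SITE.finditer(sequence):
        fragments.append(sequence[start:m.end()])
        start = m.end()
    fragments.append(sequence[start:])
    return fragments

tag = "MHHHHHH"        # hypothetical N-terminal His tag
linker = "ATDLLEAISS"  # taken from the DarA cleavage region
target = "MKVLAT"      # hypothetical protein of interest
print(cleave_at_eai(tag + linker + target))
# ['MHHHHHHATDLLE', 'AISSMKVLAT']
```

Because the cleaved bond lies between E and A, the linker residues from alanine onward stay attached to the downstream fragment in this sketch, which is worth considering when a linker is designed.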

Such utilization of the invention allows for the endoproteolytic separation of protein or peptide fusion products connected with peptide linkers characterised by the sequence recognised by Pro, DdrB and DdrB-C proteins, or modified forms of these proteins coded by pro, ddrB or ddrB-C genes, containing point mutations, insertions or deletions.

Another preferred result of the invention is the possibility to use peptides containing amino acid sequence motifs recognised and digested by the endopeptidases identified herein, as linkers connecting sequences between proteins and peptide or protein tags in protein fusions with such tags. The method for the preparation of the invention is presented in figures and tables.

Individual figures present:

Fig. 1 - identification of P1 bacteriophage virion proteins being the products of proteolytic digestion of their precursors within internal sequences.

Fig. 2 - peptides obtained after digestion with trypsin of shorter forms of gp23 (gp23*), DdrB (DdrB-N and DdrB-C) and DarA (DarA*) proteins.

Fig. 3 - sequences of wild-type pro, ddrB and hxr genes of P1 phage and genes inactivated by the substitution of their internal fragments in the P1 phage genome with the kanamycin resistance cassette.

Fig. 4 - identification of the amino acid sequence adjacent to the ESV motif, preferred for the recognition and digestion of the substrate protein with the Pro protein.

Fig. 5 - proteins obtained at subsequent stages of gp23-(His)6 protein purification.

Fig. 6 - identification of the endoproteolytic processing products of the gp23- (His)6 protein with Pro endopeptidase.

Fig. 7 - amino acid changes in the sequence of the DdrB protein in mutant proteins of decreased toxicity.

Fig. 8 - effects of proteins of various ddrB mutants on the growth kinetics of E. coli cells.

Fig. 9 - effects of proteins of various ddrB mutants on the morphology of E. coli cells.

Fig. 10 - identification of the amino acid sequence near the EAI motif, preferred for the recognition and digestion of the substrate protein by DdrB and DdrB-C proteins.

Fig. 1A presents an image of the layer containing P1 phage virions after separation in caesium chloride density gradient. The left panel (a) presents the suspension of P1 phage virions overlayed onto the density gradient of caesium chloride solutions of indicated densities. The right panel (b) presents P1 phage virions after centrifugation in caesium chloride density gradient. A layer containing purified phages is visible on the interface of phases of density 1.45 and 1.50 g/ml, as demonstrated by a plaque assay on a layer of Escherichia coli cells.

Fig. 1B presents the proteins of the P1 bacteriophage virion separated in 10% polyacrylamide gel with 1% SDS. MW1 - molecular weight marker (PageRuler™ Prestained Protein Ladder, SM0671), MW2 - molecular weight marker (PageRuler™ Unstained Protein Ladder, SM0661). 1. Structural proteins of the P1 phage obtained after a single centrifugation in caesium chloride density gradient (40 µg of protein was applied onto the gel), 2. Structural proteins of the P1 phage purified by double centrifugation in caesium chloride density gradient (40 µg of protein was applied onto the gel). After separation and staining, the stained gel fragments containing proteins (visible as bands) were cut out from the gel, and the proteins contained in them were identified by the mass analysis of their peptides obtained after trypsin digestion. Proteins which are indicated by arrows migrated in the gel faster than expected from their molecular mass calculated based on the amino acid sequence that was predicted from the DNA sequence of their genes. Two forms of DdrB protein (DdrB-C and DdrB-N, respectively) migrated in the gel congruently with reference standard proteins of molecular weight lower than that of the ddrB gene translation product (108.8 kDa) predicted from the sequence of the ddrB gene. Forms of gp23 and DarA that migrated faster than the predicted products of genes 23 and darA were designated as gp23* and DarA*. Results from the analysis of peptide composition of the identified proteins are presented in Table 1.

Fig. 2 presents peptides that were obtained after trypsin digestion of shorter forms of proteins gp23 (gp23*), DdrB (DdrB-N and DdrB-C) and DarA (DarA*), identified using mass spectrometry in P1 virion proteins purified by centrifugation in caesium chloride density gradient and separation by polyacrylamide gel electrophoresis with SDS (SDS-PAGE) (see Fig. 1B). The identified peptides are marked in bold font on the background of the precursor protein sequences (gp23, DdrB and DarA) predicted after the digestion of these proteins with trypsin. Trypsin digestion sites are underlined.

Fig. 3 presents the sequences of wild-type pro, ddrB and hxr genes of P1 phage and derivatives of these genes inactivated by the substitution of their internal fragments in the P1 phage genome with the kanamycin resistance cassette. Sequences that were removed from the wild-type genes are highlighted in grey. Sequences of the inserted kanamycin cassette in the respective P1 phage mutants: pro::kan, ddrB::kan and hxr::kan, obtained by the inactivation of pro, ddrB and hxr genes, are underlined. All mutants were obtained by recombination, through the exchange of the homologous fragments of the P1 genome containing wild-type genes for fragments containing the pro, ddrB and hxr genes and their flanking P1 genome sequences that had been cloned in high- or medium-copy-number plasmids and inactivated by the insertion of the kanamycin resistance cassette. The recombinants were selected as resistant to kanamycin, and then analysed for the presence of the kanamycin resistance cassette in the target genes and for the lack of plasmid sequences by the analysis of amplification products with DNA of the obtained P1 phage mutants as a template and primers complementary to the cassette flanking regions.

Fig. 4A presents the comparison of gp23 and DdrB protein digestion regions. The following abbreviations for amino acid residues were used in the CONSENSUS sequence of the optimal digestion region: M - methionine residue, X - any amino acid residue, α - any small amino acid residue, β - any polar amino acid residue, Φ - any medium or large hydrophobic amino acid residue. Positions P10, P9, P8, P7, P6, P5, P4, P3, P2, P1, P1', P2', P3', P4', P5', P6', P7' and P8' of amino acid residues are designated according to the conventional system for endoproteolytic processing sites, where a peptide bond cleavage occurs between residues at positions P1 and P1'.

Fig. 4B presents the comparison of sequences in the region of the ESV motif in P1 virion proteins other than the Pro substrates. Amino acid residues incongruent with the CONSENSUS sequence specified in part (A) are marked with an asterisk. The lack of amino acid residues in the UpfA protein, which is shorter than the other proteins, is marked as -.

Fig. 5 presents an image of proteins, separated by SDS-PAGE under denaturing conditions, and obtained in subsequent purification steps on a nickel column (Ni-NTA from Qiagen, cat. no. 31014) of the gp23-(His)6 protein from cells containing a plasmid with the cloned gene 23 of the P1 bacteriophage, extended at the 3' terminus with a DNA fragment of the GGGGGGCACCACCACCACCACCAC sequence, encoding two glycine residues and six histidine residues (23-(his)6). The protein was purified according to the protocol specified by the producer of Ni-NTA columns. 1. FT - flow-through fraction of extract of E. coli K12 cells with an empty control plasmid, 2. FT - flow-through fraction of extract of E. coli K12 cells with a plasmid carrying 23-(his)6 gene, 3. W1 - the first wash fraction of extract of E. coli K12 cells with a plasmid carrying 23-(his)6 gene, 4. W1 of extract of E. coli K12 cells with an empty control plasmid, 5. W2 of extract of E. coli K12 cells with a plasmid carrying 23-(his)6 gene, 6. W2 of extract of E. coli K12 cells with an empty control plasmid, 7. E1 of extract of E. coli K12 cells with a plasmid carrying 23-(his)6 gene, 8. E2 of extract of E. coli K12 cells with a plasmid carrying 23-(his)6 gene, 9. E1 of extract of E. coli K12 cells with an empty control plasmid. (FT - fraction passing through the column after loading it with the cell extract, W1 - fraction eluted with wash buffer W1, W2 - fraction of free proteins eluted with wash buffer W2, E1, E2 - fractions of gp23-(His)6 protein bound to the bed material, eluted with E buffer), MW - marker proteins of specified molecular weight (in kDa).
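The coding capacity of the appended DNA fragment can be checked by translating it codon by codon. The following Python sketch uses only the two standard-genetic-code assignments needed here (GGG for glycine, CAC for histidine); the function name is hypothetical.

```python
# Only the two codons present in the fragment are needed
# (standard genetic code: GGG = glycine, CAC = histidine).
CODON_TABLE = {"GGG": "G", "CAC": "H"}

def translate(dna):
    """Translate a DNA string codon by codon using CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

tag_dna = "GGGGGGCACCACCACCACCACCAC"
print(translate(tag_dna))  # GGHHHHHH
```

The result, two glycine residues followed by six histidine residues, agrees with the description of the 23-(his)6 construct.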

Fig. 6 presents the identification of products of the gp23-(His)6 protein endoproteolytic processing with Pro endopeptidase.

Fig. 6A presents the separation of eluted fraction proteins (E2) in 10% polyacrylamide gel with SDS, under denaturing conditions, obtained in subsequent purification steps on a nickel column (Ni-NTA from Qiagen, cat. no. 31014) of the forms of gp23-(His)6 protein from P1 phage virions propagated in cells producing the gp23-(His)6 protein. P1 phage virions were obtained after the induction of P1 phage lytic development in lysogens containing a plasmid with a cloned 23-(his)6 gene. 1 - Elution fraction E2 of proteins from P1 phage virions precipitated from the lysate. 2 - Elution fraction E2 of proteins from the supernatant left after the precipitation of P1 virions from the lysate. MW - Marker proteins of specified molecular weight (in kDa). The purification procedure was identical to that presented in the legend for Fig. 5.

Fig. 6B presents fragments of various forms of gp23 and gp23-(His)6 proteins visible as bands marked 1a, 1b, 2a and 2b, after staining of the gel specified in part (A) and identified, after the analysis of their peptides obtained by trypsin digestion, by mass spectrometry. Peptide identification was carried out according to the procedure described in the legend to Table 1. Two bands representing each of the obtained forms of protein are the consequence of co-purification of the gp23-(His)6 protein fused with the six-histidine tag (encoded by the plasmid), and its processing product, with both untagged forms of gp23 encoded by the gene present in the P1 phage genome.

Fig. 7 presents amino acid changes in the sequence of the DdrB protein of mutant proteins with reduced toxicity. The sequence of the part of DdrB that corresponds to DdrB-C is marked in bold font. Amino acids replaced in the proteins of individual mutants are marked above the sequence with symbols highlighted in grey. Numbers next to the symbols of individual replaced amino acids stand for the numbers of the allele of the ddrB gene coding a protein with the specified change in the amino acid sequence, as shown in Table 3. The motif of the DdrB amino acid sequence similar to the motif found in zinc-dependent metalloproteases is underlined.

Fig. 8 presents growth curves of exemplary derivatives of the BL21(DE3)pLysS strain carrying plasmids of the pUGA10 series with selected point mutations in the ddrB gene. In addition, the growth curve for the reference BL21(DE3)pLysS/pET28a strain, which does not contain the ddrB gene, is presented. The moment of the addition of IPTG, an inducer of expression of the ddrB gene and its mutant versions, from the T7 phage promoter in pET28a-derived plasmids, is indicated on the graph by a vertical arrow. All of the obtained mutations except ddrB-12 abolished the toxic effects of the DdrB protein on cells.

Fig. 9 presents the morphology of BL21(DE3)/pLysS E. coli strain cells two hours after the induction of protein synthesis in selected ddrB mutants. Cells of the BL21(DE3)/pLysS strain containing plasmids of the pUGA10 series were cultured in a rich liquid growth medium, with shaking, identically as described in the legend of Fig. 8. Two hours after the induction of transcription from the T7 phage promoter with 0.5 mM IPTG, 50 µl of every mutant culture and the control strain was collected, and Gram-stained specimens were prepared. A. control BL21(DE3)/pLysS/pET28a, B. control BL21(DE3)/pET28a, C. BL21(DE3)pLysS/pUGA10-3 (ddrB-8), D. BL21(DE3)pLysS/pUGA10-5 (ddrB-22), E. BL21(DE3)pLysS/pUGA10-2 (ddrB-30), F. BL21(DE3)pLysS/pUGA10-1 (ddrB-12). The figure presents example photographs of selected mutants, because the images of all mutants except ddrB-12 were comparable.

Fig. 10 presents the identification of the amino acid sequence flanking the EAI motif, preferred for the recognition and digestion of a substrate protein by DdrB and DdrB-C proteins. The CONSENSUS sequence in the closest neighbourhood of the EAI motif was determined based on the comparison between the sequence of this region in the DarA protein and E. coli proteins (A). The following abbreviations for amino acid residues were used in the CONSENSUS sequence of the optimal digestion region: M - methionine residue, X - any amino acid residue, α - any small amino acid residue, β - any polar amino acid residue, Φ - any medium or large hydrophobic amino acid residue. Positions P11, P10, P9, P8, P7, P6, P5, P4, P3, P2, P1, P1', P2', P3', P4', P5', P6', P7' and P8' of amino acid residues are designated according to the conventional system for endoproteolytic processing sites, where a peptide bond cleavage occurs between residues at positions P1 and P1'.

The object of the invention is presented in more detail in examples. In the examples, we used DdrB protein of reduced activity - a product of the mutated ddrB gene. The mutagenesis of the ddrB gene enabled us to isolate mutants with reduced activity of DdrB protein, suitable for cloning in E. coli cells and for DdrB specificity analysis.

Example 1

Identification of endoproteolytic digestion products among the proteins of the P1 phage virion.

P1 phage virions were purified by centrifugation in caesium chloride density gradient, and then the obtained P1 virion proteins were separated in polyacrylamide gel under denaturing conditions. The major proteins, visible in gel as blue bands after staining with Coomassie blue, were identified using mass spectrometry, based on the results of the mass analysis of peptides obtained after digesting these proteins with trypsin (Fig. 1). Additionally, we analysed peptides obtained after trypsin digestion of whole-virion proteins to identify those whose termini did not correspond with the predicted termini of peptides obtained after trypsin digestion.

Four proteins marked as gp23*, DarA*, DdrB-N and DdrB-C out of all 28 proteins identified as bands after the separation in polyacrylamide gel under denaturing conditions migrated incongruently with the molecular mass predicted based on the translation of the genes encoding these proteins (respectively: 23, darA, and ddrB) into amino acid sequences (Table 1, Fig. 1).

None of the peptides obtained after trypsin digestion of the protein marked as gp23* represented a fragment between residues 1 and 120 of the predicted product of gene 23 (Fig. 2). Moreover, instead of a group of peptides representing the predicted fragment of the gene 23 product, from amino acid residue 116 to 135, we only obtained a group of shorter peptides. Their sequence did not begin with the amino acid residue predicted for N-termini after trypsin digestion [K(116)_AMLESVAAEMMSVSDGVMR_(135)L], but with the serine residue at position 121 of gp23 (in bold in the presented sequence for this region). This indicated that the fragments resulted from the digestion of gp23 after the glutamic acid residue at position 120 in the ESV sequence and represent the N-terminus of the shorter form of the gp23 protein formed during the morphogenesis of virions as a result of protease digestion. These peptides were consistently present in all samples obtained by trypsin digestion of a protein migrating in gel at a position characteristic of proteins of size about 48 kDa.
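The reasoning above can be reproduced with an in-silico trypsin digest of the quoted gp23 region. The following Python sketch uses a simplified rule (cleavage after every K or R, no proline exception, no missed cleavages), which is an assumption of this illustration and not a description of the search settings actually used; the function and variable names are hypothetical.

```python
def trypsin_digest(sequence):
    """Cut after every K or R (simplified rule).

    Returns (0-based start offset, peptide) pairs.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append((start, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):
        peptides.append((start, sequence[start:]))
    return peptides

# Region of gp23 quoted above; K is residue 116, so offset 0 = residue 116.
region = "KAMLESVAAEMMSVSDGVMRL"
FIRST_RESIDUE = 116

# Unprocessed gp23: the fully tryptic peptide spans residues 117-135.
expected = trypsin_digest(region)
# Processed gp23*: the chain starts at the serine following E of ESV.
processed = trypsin_digest(region[region.index("ESV") + 1:])

print(expected[1])                            # (1, 'AMLESVAAEMMSVSDGVMR')
print(processed[0][1])                        # SVAAEMMSVSDGVMR
print(FIRST_RESIDUE + region.index("ESV"))    # 120, the position of E
```

The peptide from the processed chain begins with a residue that does not follow K or R, which is the signature used in the text to infer cleavage by a protease other than trypsin.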

Unexpectedly, two major peptide fractions representing the DdrB protein, marked as DdrB-N and DdrB-C, corresponded to proteins migrating according to molecular masses of approx. 47 kDa and approx. 61 kDa, lower than the molecular mass of the product predicted from the ddrB gene sequence (108.8 kDa) (Fig. 2). Peptides representing the N-terminus of the DdrB-N protein began with a methionine residue, as did the sequence predicted for the unprocessed DdrB protein, and their sequence was identical to the sequence predicted for the N-terminus of DdrB. However, peptides representing the C-terminus of DdrB-N ended with a glutamic acid residue (E) at position 441 of the predicted unprocessed DdrB. Peptides representing the N-terminus of the DdrB-C protein began with amino acid residue 442 of the sequence predicted for the complete DdrB protein, i.e. the serine residue (S) placed immediately after the glutamic acid residue that ends the sequence of the DdrB-N protein. Because the trypsin digestion sites adjacent to residues 441 and 442 in the DdrB protein are located at positions 421 and 461 [R(421)_QVSQELENEGMQGISGGLFESVSGGSYNGVAPYTSLLLHR_(461)A], the sequences actually obtained for the terminal peptides of the DdrB-N and DdrB-C proteins indicate that these proteins are the products of endoproteolytic processing of the DdrB protein within the ESV sequence, between the glutamic acid residue at position 441 and the serine residue at position 442.

The major peptide fraction representing the DarA protein corresponded to the protein designated by us as DarA*, migrating at a position corresponding to a molecular mass of 69-70 kDa, lower than the predicted molecular mass of the translation product of the darA gene (79 kDa). Moreover, the outermost N-terminal peptides representing trypsin digestion products of DarA* began with an alanine residue (A) at position 69 of the darA gene translation product, indicating that the site of proteolytic processing of DarA, leading to the formation of DarA*, is located in the EAI sequence, between the glutamic acid residue at position 68 and the alanine residue at position 69 of DarA (Fig. 2). The trypsin digestion sites adjacent to this alanine residue are at positions 37 and 74, so if DarA were not processed, its digestion should produce a peptide of the sequence [R(37)_YLMTESNTLEEIEATFFYDTPIHWCATDLLEAISSTR_(74)L].
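The site assignments above follow from simple coordinate arithmetic on the tryptic fragments quoted in square brackets. The following is a minimal Python sketch of that arithmetic, not part of the described method: the helper name `cleavage_site` is hypothetical, and the fragments and their first-residue numbers are copied from the bracketed sequences in the text (the residue after the flanking K or R is taken as the fragment start).

```python
# Tryptic fragments quoted in the text:
# protein -> (number of the first residue, fragment sequence, processed motif)
FRAGMENTS = {
    "gp23": (117, "AMLESVAAEMMSVSDGVMR", "ESV"),
    "DdrB": (422, "QVSQELENEGMQGISGGLFESVSGGSYNGVAPYTSLLLHR", "ESV"),
    "DarA": (38, "YLMTESNTLEEIEATFFYDTPIHWCATDLLEAISSTR", "EAI"),
}


def cleavage_site(start, fragment, motif):
    """Return the residue numbers (P1, P1') flanking a cut after the motif's first residue."""
    idx = fragment.find(motif)  # first occurrence of the motif in the fragment
    if idx < 0:
        raise ValueError(f"motif {motif!r} not found in fragment")
    p1 = start + idx            # position of the glutamic acid residue
    return p1, p1 + 1


sites = {name: cleavage_site(*args) for name, args in FRAGMENTS.items()}
for name, (p1, p1_prime) in sites.items():
    print(f"{name}: cut between residues {p1} and {p1_prime}")
```

Under these inputs the computed positions agree with those stated in the text (120/121 for gp23, 441/442 for DdrB, 68/69 for DarA).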

Proteins separated by SDS-PAGE under denaturing conditions were electrotransferred onto a PVDF membrane and stained with Ponceau S. Pieces of the membrane containing proteins at the locations predicted for gp23*, DarA* and DdrB-C were then cut out, the proteins were eluted from the membrane, and the sequence of their N-termini was determined using the Edman degradation method. The N-terminal sequences of the gp23*, DdrB-C and DarA* proteins identified by Edman degradation were found to be identical to the identified N-terminal peptide sequences of those proteins that did not represent N-termini that could result from trypsin digestion (Table 2). In Table 2, the N-terminal amino acid sequence of the DdrB-C protein obtained from the ambiguous reading is marked as ambiguous.

This finally confirmed that gp23*, DdrB-N, DdrB-C and DarA* represent the products of processing of precursor proteins (gp23, DdrB and DarA, respectively), formed by digestion with site-specific endoprotease(s) in the regions corresponding to the positions between amino acid residues 120 and 121 of gp23 (in the E↓SV sequence), 441 and 442 of DdrB (in the E↓SV sequence) and 68 and 69 of DarA (in the E↓AI sequence).

Example 2.

Identification of endoproteases specific for internal sequences of gp23, DdrB and DarA.

To test whether the pro and ddrB gene products play a role in the processing of the gp23, DdrB and DarA proteins, and what that role could be, we constructed mutants of the P1 phage in which the pro or ddrB gene was inactivated by replacing a large fragment of the gene with a kanamycin resistance cassette (Fig. 3). The pro gene is the last gene of a two-gene operon, so the insertion results in the inactivation of only this gene. The ddrB gene is the second-to-last gene in a seven-gene operon. Therefore, the insertion inactivates ddrB, but may also cause a polar effect, manifested by the lack of activity of hxr, the terminal gene in the operon. To differentiate the phenotype associated with the inactivation of the hxr gene from the phenotype associated only with the inactivation of ddrB, we constructed an additional mutant with a kanamycin resistance cassette inserted in the hxr gene (Fig. 3). The presence of the kanamycin cassette in the pro, ddrB or hxr gene of each mutant was confirmed by analysis of PCR products obtained by amplifying the mutant P1 genomic DNA with primers complementary to the regions of the P1 genome on both sides of the kanamycin cassette in each mutant.

The proteomes of whole virions from purified P1 mutants containing inactivated pro, ddrB or hxr genes and of the wild-type phage were compared for the presence of processed and unprocessed forms of the gp23, DdrB and DarA proteins. To do so, we compared peptides obtained by trypsin digestion of proteins from whole virions, identified using mass spectrometry. In samples of the hxr::kan mutant we identified peptides from the N-termini of the gp23*, DdrB-C and DarA* proteins, which shows that the hxr gene product is not associated with the processing of these proteins. In samples of the ddrB::kan mutant we did not identify any unprocessed or processed peptide forms of DdrB, in agreement with the inactivation of ddrB in this mutant. However, peptides representing the N-terminus of the gp23* protein were present in the virions of this mutant, which rules out the involvement of the DdrB protein in gp23 processing. Unexpectedly, the virion proteins of the ddrB::kan mutant included peptides representing the N-terminus of the unprocessed form of the DarA protein, as well as peptides representing fragments from positions 37 to 74 of DarA containing a continuous EAI sequence, the site at which endoproteolytic digestion of DarA occurs in the wild-type strain. In samples of the pro::kan mutant there were no peptides ending at position 441 or beginning at position 442 of unprocessed DdrB, no peptides beginning at position 121 of unprocessed gp23 and no peptides beginning at position 69 of unprocessed DarA.

The results indicate conclusively that DdrB is the only protein involved in the endoproteolytic processing of DarA, and Pro is the only protein involved in the endoproteolytic processing of gp23 and DdrB. This is in agreement with the identity of the sequence (ESV) in the closest neighbourhood of the processing site in gp23 and DdrB proteins, and with the difference of the sequence in the closest neighbourhood of the processing site in DarA (EAI).

Example 3.

Identification of amino acid residues adjacent to the ESV motif and optimal for the digestion by Pro endopeptidase.

The presence of the ESV motif at the site of the proteolytic digestion of both gp23 and DdrB by Pro indicates the preferential digestion of substrate proteins by Pro within this motif. However, the ESV motif was also identified in four other virion proteins that are not proteolytically processed (Pro, gpU, gp25 and UpfA). This was established from the migration of these proteins, during separation in polyacrylamide gel with SDS, according to their predicted molecular mass, and from the fact that after trypsin digestion only peptides containing continuous ESV motifs were found. This shows the essentiality of the ESV motif as well as of the sequences adjacent to it for the recognition and digestion of proteins by Pro, including the essentiality of the sequences adjacent to the ESV motif in the gp23 and DdrB proteins (MDATNAMLESVAAE and MQGISGGLFESVSGG, respectively, corresponding to positions P10, P9, P8, P7, P6, P5, P4, P3, P2, P1, P1', P2', P3', P4', P5' according to the standard system for designation of endoproteolytic processing sites).

The fact that Pro digests two proteins of slightly different amino acid sequences adjacent to the ESV motif, and that Pro is unable to digest four other proteins containing the ESV motif, was used to determine the sequence adjacent to the ESV motif which is preferred for the recognition and digestion by Pro (Fig. 5). Comparison of the amino acid sequence adjacent to the ESV motif within gp23 and DdrB indicates the range of tolerance with respect to the sequences surrounding the ESV motif at the site of recognition and digestion by Pro (Fig. 4). Based on this, the amino acid sequence of sites that are preferably recognised and digested by Pro was identified as: MXαXβXαΦΦE↓SVααX,

where:

M is a methionine residue,

X is any amino acid residue,

α is any small amino acid residue,

β is any polar amino acid residue,

Φ is any medium or large hydrophobic amino acid residue.

The cleavage site is marked with the ↓ symbol.

None of the P1 proteins other than Pro substrates contains the identified amino acid sequence motifs adjacent to the ESV motif.
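The preferred site MXαXβXαΦΦE↓SVααX can be expressed as a regular expression for screening candidate sequences. The sketch below is only an illustration under stated assumptions: the text does not list which residues count as small, polar or hydrophobic, so the three class strings are assumed memberships, and the names `PRO_SITE` and `find_cut` are hypothetical.

```python
import re

# Assumed residue classes; the text does not define their membership.
SMALL = "GASCT"          # alpha: small residues (assumption)
POLAR = "STNQCYHDEKR"    # beta: polar residues (assumption)
HYDROPHOBIC = "VILMFWY"  # Phi: medium or large hydrophobic residues (assumption)

# P10 ... P1 | P1' ... P5' written out position by position.
PRO_SITE = re.compile(
    "M."                      # P10 methionine, P9 any residue
    f"[{SMALL}]."             # P8 small, P7 any residue
    f"[{POLAR}]."             # P6 polar, P5 any residue
    f"[{SMALL}]"              # P4 small
    f"[{HYDROPHOBIC}]{{2}}"   # P3 and P2 hydrophobic
    "E"                       # P1 glutamic acid
    "SV"                      # P1' serine, P2' valine
    f"[{SMALL}]{{2}}."        # P3' and P4' small, P5' any residue
)


def find_cut(sequence):
    """Return the index of the P1' residue (cut lies just before it), or None."""
    match = PRO_SITE.search(sequence)
    return None if match is None else match.start() + 10


DDRB_SITE = "MQGISGGLFESVSGG"  # DdrB fragment quoted in Example 3
cut = find_cut(DDRB_SITE)
print(DDRB_SITE[:cut], "|", DDRB_SITE[cut:])
```

With these assumed classes the DdrB fragment is matched and split after the glutamic acid residue. The gp23 fragment as printed (MDATNAMLESVAAE) is not matched by this strict position-by-position reading, so the class sets or spacing would need adjusting before any practical use.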

Example 4.

Testing the possibility of Pro application for the cleavage of tagged sequences in proteins.

To test the applicability of Pro for cleaving protein fragments incorporating affinity tags we used a natural Pro substrate, gp23, with a 6xHis tag at the C-terminal end (Fig. 5).

P1 bacteriophage containing the wild-type pro gene was used as an intracellular source of Pro protein. Lytic phage development was induced in P1 lysogens containing a plasmid with the cloned 23-6xHis gene. Next, using the affinity of the 6xHis tag for a nickel resin, we isolated the forms of gp23-6xHis protein present in virions and in the cell supernatant from the cell lysate obtained after completion of phage lytic development. We analysed their mobility under denaturing conditions using SDS polyacrylamide gel electrophoresis, and identified their peptide composition using mass spectrometry (Fig. 6). We isolated both the processed and unprocessed forms of gp23-6xHis, which indicates the applicability of Pro for the cleavage of affinity-tagged protein fragments, including the digestion of fusion proteins.

Example 5.

Identification of DdrB amino acid sequence motifs necessary for its proteolytic activity, and cloning of mutated ddrB gene forms.

Attempts to clone the wild-type ddrB gene in expression vectors such as pET28a, in which the cloned gene is controlled by the inducible promoter for T7 phage RNA polymerase, failed. They failed even when the ligation products of an amplified fragment incorporating ddrB and cut pET28a were used to transform cells containing a plasmid (pLysS) coding for T7 phage lysozyme, a T7 RNA polymerase inhibitor. In these experiments we only obtained ddrB gene versions with amber or frameshift mutations, which indicated the high toxicity of the DdrB protein to E. coli cells.

To identify whether or not the toxicity of DdrB to E. coli cells is associated with the activity of either DdrB-N or DdrB-C domains of DdrB we experimentally cloned separate fragments of the ddrB gene coding each of the two domains. The fragment of the ddrB gene coding the N-terminal part of the protein corresponding to DdrB-N was cloned successfully. However, we were unable to successfully clone the fragment of the ddrB gene coding the C-terminal part of the protein corresponding to DdrB-C, with an added start codon (ATG), and preceded by the sequence coding the ribosome binding site in mRNA. We only obtained clones containing amber mutations or frameshift mutations. This allows for the conclusion that the cytotoxic activity of DdrB results from the toxicity of the C-terminal part of the protein corresponding to DdrB-C.

The toxicity phenotype was used for the positive selection of ddrB mutants coding inactivated proteins or proteins with reduced activity. As a template for the mutagenesis of the ddrB gene we used the pUGAIO plasmid carrying a mutated version of the 3' part of the ddrB gene, coding DdrB-C. The SnaI-XhoI (665 bp) fragment containing the mutated 3' part of ddrB in this plasmid was substituted with fragments from SnaI-XhoI libraries obtained by amplification of the 3' part of the wild-type ddrB gene using P1 phage DNA as a template and two primers: 5'-ATACTCGAGATCTGCATTATTTAGCTCCTTT and 5'-CGGTCCACGAGAAAGTG. The ligation mixture was used to transform BL21(DE3) strain cells without additional plasmids and BL21(DE3) cells containing the pLysS plasmid, in which the activity of T7 RNA polymerase is reduced by the T7 lysozyme encoded by this plasmid. Transformants were plated on a medium selective for the pUGAIO plasmid or its derivatives. The colonies obtained were passaged onto the same type of medium. Next, plasmids were isolated from these colonies and analysed by restriction mapping for the presence of DNA fragments corresponding to the subcloned fragment of the ddrB gene. Plasmids containing subcloned fragments of the ddrB gene of the predicted length were tested by DNA sequencing to identify mutations in the ddrB gene.

We obtained 8 types of single and 3 types of double mutations in the ddrB gene resulting in a sequence change in the C-terminal part of the DdrB protein, corresponding to DdrB-C (Table 3; Fig. 7). Unexpectedly, as many as 7 single mutants obtained from independently prepared libraries had a single change in DNA that resulted in the substitution of the glutamic acid residue at position 873 of DdrB by a lysine residue (E873K). In the wild-type DdrB, the glutamic acid residue changed in the proteins of these mutants lies precisely within a short motif of the DdrB amino acid sequence (HELGH) that is similar to the HEXXH motif characteristic of the active centres of zinc-dependent metalloproteases. One mutation caused the substitution of the histidine residue at position 876 of DdrB, which is also within the HELGH motif, by a leucine residue (H876L). Two mutations replaced sense codons, corresponding respectively to the tryptophan residue at position 871 and to the aspartic acid residue at position 725 of the DdrB protein, both located before the HELGH motif, by stop codons, causing premature termination of mRNA translation and the formation of mutant proteins lacking the fragment containing this motif. Both the accumulation of toxicity-reducing mutations in the region of the ddrB gene coding for the HELGH motif and the premature termination of ddrB translation before the region coding for the HELGH motif in some mutants demonstrate the fundamental role of HELGH for the activity of DdrB, and indicate the role of DdrB as a zinc-dependent metalloprotease.

Apart from single mutants coding the DdrB protein modified within the HELGH motif, or proteins lacking the C-terminal fragment containing this motif, we also obtained mutants coding amino acid changes adjacent to this region (E865V) or in other regions of the 3' part of the ddrB gene (I670N, N896Y, L984S). Two of the double mutants obtained coded DdrB proteins containing substituted amino acid residues outside the HELGH motif (E825K and E968Q; and E825K and L984S), and one coded a truncated DdrB protein containing a single amino acid substitution outside the HELGH motif (E865V and Q943X, where X denotes a stop codon replacing the glutamine (Q) codon in the ddrB gene).
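The relation of each listed mutation to the HELGH motif can be tabulated mechanically. The sketch below is illustrative only: the function names are hypothetical, and the motif span 872-876 is inferred from the statement that E873 and H876 both lie within HELGH (it is not given explicitly in the text).

```python
import re

# HELGH span in DdrB, inferred from E873 (2nd residue) and H876 (5th residue).
MOTIF_START, MOTIF_END = 872, 876

SINGLE = ["E873K", "H876L", "W871X", "D725X", "E865V", "I670N", "N896Y", "L984S"]
DOUBLE = [("E825K", "E968Q"), ("E825K", "L984S"), ("E865V", "Q943X")]


def parse_substitution(code):
    """Split e.g. 'E873K' into (wild-type residue, position, new residue); 'X' is a stop."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"unrecognised mutation code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def classify(code):
    """Describe a mutation relative to the HELGH motif."""
    _, position, new = parse_substitution(code)
    if new == "X":
        # A stop at or before the motif end removes all or part of the motif.
        if position <= MOTIF_END:
            return "motif lost (premature stop)"
        return "stop downstream of motif"
    if MOTIF_START <= position <= MOTIF_END:
        return "substitution in motif"
    return "substitution outside motif"


# HELGH is an instance of the general HEXXH pattern.
assert re.fullmatch("HE..H", "HELGH") is not None

for code in SINGLE:
    print(code, "->", classify(code))
for pair in DOUBLE:
    print(" + ".join(pair), "->", [classify(code) for code in pair])
```

This only sorts the mutations by position; it says nothing about residual activity, which the text assesses experimentally.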

The protein of one mutant (DdrB N896Y) retained a significant level of activity, which was established from the growth inhibition and filamentation of E. coli cells after induction of expression of this mutant gene cloned in the plasmid (Fig. 8; Fig. 9). Therefore, the gene encoding the DdrB N896Y protein is a ddrB allele suitable for testing the activity and specificity of the DdrB protein in E. coli cells.

Example 6.

Identification of amino acid residues adjacent to the EAI motif and optimal for digestion by DdrB endopeptidase.

Site-specific digestion of the DarA protein by DdrB between the glutamic acid residue at position 68 and the alanine residue at position 69 (E↓A) indicates that DdrB specifically recognises and digests proteins within the EA sequence, and that DdrB has the activity of a glutamyl endoprotease. EA is a common motif in proteins. However, apart from DarA, we found no other P1 virion proteins which, when separated by polyacrylamide gel electrophoresis under denaturing conditions, would produce a visible band migrating at a position characteristic of a protein of molecular mass lower than predicted, and would begin with an alanine residue or end with a glutamic acid residue adjacent to an alanine residue in the predicted precursor protein sequence. To find out whether DdrB has a trace amount of endoproteolytic activity with substrates other than DarA that contain an alanine residue placed directly after a glutamic acid residue, we analysed peptides obtained after trypsin digestion of virion proteins from the wild-type P1 phage and the ddrB::kan P1 mutant, searching for peptides either ending with a glutamic acid residue or beginning with an alanine residue and not produced by trypsin cleavage at both ends. We found no significant differences (except those in DarA) between the virions of the wild-type phage and the mutant. Therefore, because of the cytotoxic effect of DdrB, we tried to identify such peptides among the proteins of Escherichia coli. Identification was carried out in extracts of lysogens of the wild-type P1 phage and of the ddrB::kan mutant, after the induction of phage lytic development. Because DdrB is a late protein formed by the phage, extracts were prepared by cell sonication just before cell lysis.
After digesting lysate proteins from the ddrB::kan mutant with trypsin we identified a low number of peptide products formed by the cleavage of proteins between a glutamic acid residue and an alanine residue, but none of them had an isoleucine residue after the alanine residue in the digestion region of the analysed proteins. When analysing peptides obtained from cell extracts after induction of wild-type P1 phage lytic development we obtained a low number of peptides whose N- or C-terminal sequences indicated that they were formed by endoproteolytic digestion within the EAI sequence, identical to the sequence within the region of DarA digested by DdrB.
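The screening criterion described above (a peptide with one end that trypsin could not have produced, located at an E/A junction) can be sketched as follows. This is a simplified illustration, not the search procedure actually used: the function names are hypothetical, trypsin is modelled as cutting after every K or R with no exceptions, and only the first occurrence of a peptide in the protein is considered. The test region is the DarA stretch quoted in Example 1 together with its flanking residues.

```python
TRYPSIN_RESIDUES = "KR"  # simplified rule: trypsin cuts after every K or R


def terminus_status(protein, peptide):
    """Return (N-terminus tryptic?, C-terminus tryptic?) for a peptide within a protein."""
    start = protein.find(peptide)
    if start < 0:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)
    n_tryptic = start == 0 or protein[start - 1] in TRYPSIN_RESIDUES
    c_tryptic = end == len(protein) or peptide[-1] in TRYPSIN_RESIDUES
    return n_tryptic, c_tryptic


def is_glu_ala_product(protein, peptide):
    """True if a non-tryptic end of the peptide lies between E and A in the protein."""
    start = protein.find(peptide)
    end = start + len(peptide)
    n_tryptic, c_tryptic = terminus_status(protein, peptide)
    n_hit = (not n_tryptic) and protein[start - 1] == "E" and peptide[0] == "A"
    c_hit = (not c_tryptic) and peptide[-1] == "E" and protein[end] == "A"
    return n_hit or c_hit


# DarA residues 37-75 as quoted in Example 1 (flanking R and L included).
DARA_REGION = "R" + "YLMTESNTLEEIEATFFYDTPIHWCATDLLEAISSTR" + "L"

for peptide in ("AISSTR",
                "YLMTESNTLEEIEATFFYDTPIHWCATDLLE",
                "YLMTESNTLEEIEATFFYDTPIHWCATDLLEAISSTR"):
    print(peptide, is_glu_ala_product(DARA_REGION, peptide))
```

In this example the two fragments flanking the E68/A69 junction are flagged, while the intact tryptic peptide spanning the junction is not.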

Comparison between the sequence of the digestion region in DarA (IEATFFYDTPIHWCATDLLEAISSTRLQLHRT) and the sequences of regions adjacent to the EAI motif in processed proteins of E. coli indicated the preference of DdrB for the recognition and digestion of proteins in which the digested EAI motif is preceded by a large or medium hydrophobic amino acid residue (Fig. 10).

Table 1. Identification of P1 phage virion proteins that migrate during SDS-polyacrylamide gel electrophoresis under denaturing conditions incongruently with the molecular masses predicted based on their gene sequences.

Proteins visible as bands after SDS-polyacrylamide gel electrophoresis and staining with Coomassie blue were analysed by liquid chromatography and mass spectrometry (LTQ FT ICR). Prior to the analysis, proteins were reduced with 100 mM DTT, alkylated with iodoacetamide and digested with trypsin. The resulting peptides were eluted from gel slices with 0.1% trifluoroacetic acid (TFA) and 2% acetonitrile (ACN). Peptide mixtures were applied to the RP- 8 precolumns (LC Packings) using 0.1% TFA as a mobile phase and then transferred to the nanocolumn-HPLC RP-18 (nanoACQUITY UPLC BEH C18 - Waters 186003545) using an acetonitrile gradient (0% - 60% ACN within 30 minutes) in the presence of 0.05% formic acid, at a flow rate of 150 nl/min. Eluate from the column was passed directly to the LTQ-FT-MS ion source operating in data-dependent MS to MS/MS switching mode. The data obtained were processed with the Mascot data filter and then searched using the MASCOT search engine (Matrix Science, London, UK, installed at http://proteom.pl/mascot) against the data for proteins from the NCBI database.

1) proteins whose peptides were identified in most of the analysed bands; 2, 3) first and second product of DdrB protein processing; 4, 6) unprocessed and processed form of gp23 protein.

Table 2. Comparison of the N-terminal sequences of P1 phage processed structural proteins determined based on peptide analysis by mass spectrometry and the N-terminal sequences determined by Edman degradation.

Protein (accession no. of complete sequence in the NCBI database) | N-terminal sequence determined based on LC-MS/MS analysis | N-terminal sequence determined by Edman degradation | Number of identified amino acid residues
gp23* (NCBI YP_006526.1) | SVAAEMMSVSDGVMRLPLFLAMILP | SVAAEMMSVSDGVMRLPLFLAMILP | 25
DarA* (NCBI YP_006494.1) | AISSTRLQLHRTMQAFVRALNQKLN | AISSTRLQLHRTMQAFVRALNQKLN | 25
DdrB-C (NCBI YP_006491.1) | SVSGGSYNGVAPYTSLLLHRASGIKDI | S(V+A+S)(N+S+Q+D)(G+N+Y+L)(G+K+S+R) (ambiguous) | 5

Table 3. Changes in the amino acid sequence of the DdrB protein resulting from random mutations in the ddrB gene that caused the abolishment or reduction of its gene product toxicity.

* X means the stop codon; (1) mutant with reduced toxicity of DdrB.

Alleles or mutants obtained after the transformation of BL21(DE3) cells without the pLysS plasmid are marked in bold; alleles or mutants obtained during attempts to clone the wild-type ddrB gene are underlined.