

Title:
RHAMNOSE-POLYSACCHARIDES
Document Type and Number:
WIPO Patent Application WO/2020/249737
Kind Code:
A1
Abstract:
The present invention relates to a method of synthesizing a rhamnose polysaccharide. The invention also relates to a synthetic streptococcal polysaccharide, a streptococcal glycoconjugate, an immunogenic composition or vaccine comprising the streptococcal polysaccharide or glycoconjugate, and the polysaccharide, glycoconjugate, immunogenic composition or vaccine for use in raising an immune response in an animal or for use in treating or preventing a disease, condition or infection with a streptococcal aetiology.

Inventors:
DORFMUELLER HELGE (GB)
Application Number:
PCT/EP2020/066314
Publication Date:
December 17, 2020
Filing Date:
June 12, 2020
Assignee:
UNIV DUNDEE (GB)
International Classes:
C12P19/04; A61K39/09; C12P21/00
Domestic Patent References:
WO2019016188A12019-01-24
Foreign References:
US20170340718A12017-11-30
Other References:
SHIBATA Y ET AL: "Expression and characterization of streptococcal rgp genes required for rhamnan synthesis in Escherichia coli", INFECTION AND IMMUNITY, AMERICAN SOCIETY FOR MICROBIOLOGY, US, vol. 70, no. 6, 1 June 2002 (2002-06-01), pages 2891 - 2898, XP002987172, ISSN: 0019-9567, DOI: 10.1128/IAI.70.6.2891-2898.2002
VAN SORGE NINA M ET AL: "The Classical Lancefield Antigen of Group A Streptococcus Is a Virulence Determinant with Implications for Vaccine Design", CELL HOST & MICROBE, ELSEVIER, NL, vol. 15, no. 6, 11 June 2014 (2014-06-11), pages 729 - 740, XP029033068, ISSN: 1931-3128, DOI: 10.1016/J.CHOM.2014.05.009
JEFFREY S. RUSH ET AL: "The molecular mechanism of N -acetylglucosamine side-chain attachment to the Lancefield group A carbohydrate in Streptococcus pyogenes", JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 292, no. 47, 11 October 2017 (2017-10-11), US, pages 19441 - 19457, XP055712863, ISSN: 0021-9258, DOI: 10.1074/jbc.M117.815910
SAMANTHA L. VAN DER BEEK ET AL: "Streptococcal dTDP-L-rhamnose biosynthesis enzymes: functional characterization and lead compound identification", MOLECULAR MICROBIOLOGY, vol. 111, no. 4, 31 January 2019 (2019-01-31), GB, pages 951 - 964, XP055724823, ISSN: 0950-382X, DOI: 10.1111/mmi.14197
AZUL ZORZOLI ET AL: "Group A, B, C, and G Streptococcus Lancefield antigen biosynthesis is initiated by a conserved [alpha]-d-GlcNAc-[beta]-1,4-l-rhamnosyltransferase", JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 294, no. 42, 10 September 2019 (2019-09-10), US, pages 15237 - 15256, XP055724746, ISSN: 0021-9258, DOI: 10.1074/jbc.RA119.009894
CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, 2019
BERROW NSALDERTON DSAINSBURY SNETTLESHIP JASSENBERG RRAHMAN NSTUART DIOWENS RJ: "A versatile ligation-independent cloning method suitable for high-throughput expression screening applications", NUCLEIC ACIDS RESEARCH, vol. 35, no. 6, 1 March 2007 (2007-03-01), pages e45, XP055004202, DOI: 10.1093/nar/gkm047
REGLINSKI ET AL., NPJ VACCINES, vol. 3, pages 53
VAN SORGE, N. M.COLE, J. N.KUIPERS, K.HENNINGHAM, A.AZIZ, R. K.KASIRER-FRIEDE, A.LIN, L.BERENDS, E. T. M.DAVIES, M. R.DOUGAN, G.: "The Classical Lancefield Antigen of Group A Streptococcus Is a Virulence Determinant with Implications for Vaccine Design", CELL HOST MICROBE, vol. 15, 2014, pages 729 - 740, XP029033068, DOI: 10.1016/j.chom.2014.05.009
KRISTIAN, S. A.DATTA, V.WEIDENMAIER, C.KANSAL, R.FEDTKE, I.PESCHEL, A.GALLO, R. L.NIZET, V.: "D-alanylation of teichoic acids promotes group a streptococcus antimicrobial peptide resistance, neutrophil survival, and epithelial cell invasion", J. BACTERIOL., vol. 187, 2005, pages 6719 - 6725, XP002468300, DOI: 10.1128/JB.187.19.6719-6725.2005
HENNINGHAM, A.DAVIES, M. R.UCHIYAMA, S.SORGE, N. M. VANLUND, S.CHEN, K. T.WALKER, M. J.COLE, J. N.NIZET, V.: "Virulence Role of the GlcNAc Side Chain of the Lancefield Cell Wall Carbohydrate Antigen in Non-M1-Serotype Group A Streptococcus", MBIO, vol. 9, 2018, pages e02294 - 17
LE BRETON, Y.BELEW, A. T.FREIBERG, J. A.SUNDAR, G. S.ISLAM, E.LIEBERMAN, J.SHIRTLIFF, M. E.TETTELIN, H.EI-SAYED, N. M.MCLVER, K. S: "Genome-wide discovery of novel M1T1 group A streptococcal determinants important for fitness and virulence during soft-tissue infection", PLOS PATHOG., vol. 13, 2017, pages e1006584
SHELBURNE, S. A.KEITH, D.HORSTMANN, N.SUMBY, P.DAVENPORT, M. T.GRAVISS, E. A.BRENNAN, R. G.MUSSER, J. M.: "A direct link between carbohydrate utilization and virulence in the major human pathogen group A Streptococcus", PROC. NATL. ACAD. SCI. U. S. A., vol. 105, 2008, pages 1698 - 1703
LANCEFIELD, R. C.: "A Serological Differentiation of Human and Other Groups of Hemolytic Streptococci", J. EXP. MED., vol. 57, 1933, pages 571 - 595
MCCARTY, M.: "Further studies on the chemical basis for serological specificity of group a streptococcal carbohydrate", J. EXP. MED., vol. 108, 1958, pages 311 - 323
RUSH, J. S.EDGAR, R. J.DENG, P.CHEN, J.ZHU, H.VAN SORGE, N. M.MORRIS, A. J.KOROTKOV, K. V.KOROTKOVA, N.: "The molecular mechanism of N-acetylglucosamine side-chain attachment to the Lancefield group A carbohydrate in Streptococcus pyogenes", J. BIOL. CHEM., vol. 292, 2017, pages 19441 - 19457
MISTOU, M.-Y.SUTCLIFFE, I. C.SORGE, N. M. VAN: "Bacterial glycobiology: rhamnose-containing cell wall polysaccharides in Gram-positive bacteria", FEMS MICROBIOL. REV., vol. 40, 2016, pages 464 - 479
COLIGAN, J. E.KINDT, T. J.KRAUSE, R. M.: "Structure of the streptococcal groups A, A-variant and C carbohydrates", IMMUNOCHEMISTRY, vol. 15, 1978, pages 755 - 760, XP023681823
KRAUSE, R. M.MCCARTY, M.: "Studies on the Chemical Structure of the Streptococcal Cell Wall", J. EXP. MED., vol. 114, 1961, pages 127 - 140
EDGAR, R. J.HENSBERGEN, V. P. VANRUDA, A.TURNER, A. G.DENG, P.BRETON, Y. L.EI-SAYED, N. M.BELEW, A. T.MCLVER, K. S.MCEWAN, A. G.: "Discovery of glycerol phosphate modification on streptococcal rhamnose polysaccharides", NAT. CHEM. BIOL., vol. 15, 2019, pages 463, XP036760391, DOI: 10.1038/s41589-019-0251-4
H. HEYMANNZELEZNICK, L. D.BOLTRALIK, J. J.BARKULIS, S. S.SMITH, C.: "Biosynthesis of Streptococcal Cell Walls: A Rhamnose Polysaccharide", SCIENCE, vol. 140, 1963, pages 400 - 401
HEYMANN, H.MANNIELLO, J. M.BARKULIS, S. S.: "Structure of streptococcal cell walls. V. Phosphate esters in the walls of group A Streptococcus pyogenes", BIOCHEM. BIOPHYS. RES. COMMUN., vol. 26, 1967, pages 486 - 491, XP024840568, DOI: 10.1016/0006-291X(67)90574-8
VAN HENSBERGEN, V. P.MOVERT, E.DE MAAT, V.LUCHTENBORG, C.LE BRETON, Y.LAMBEAU, G.PAYRE, C.HENNINGHAM, A.NIZET, V.VAN STRIJP, J. A.: "Streptococcal Lancefield polysaccharides are critical cell wall determinants for human Group IIA secreted phospholipase A2 to exert its bactericidal effects", PLOS PATHOG., vol. 14, 2018, pages e1007348
SEWELL, E. W. C.BROWN, E. D.: "Taking aim at wall teichoic acid synthesis: new biology and new leads for antibiotics", J. ANTIBIOT. (TOKYO)., vol. 67, 2014, pages 43 - 51
HUANG, D. H.RAMA KRISHNA, N.PRITCHARD, D. G.: "Characterization of the group A streptococcal polysaccharide by two-dimensional 1 H-nuclear-magnetic-resonance spectroscopy", CARBOHYDR. RES., vol. 155, 1986, pages 193 - 199, XP026619010, DOI: 10.1016/S0008-6215(00)90145-9
VAN DER BEEK, S. L.LE BRETON, Y.FERENBACH, A. T.CHAPMAN, R. N.VAN AALTEN, D. M. F.NAVRATILOVA, I.BOONS, G.-J.MCLVER, K. S.VAN SORG: "GacA is essential for Group A Streptococcus and defines a new class of monomeric dTDP-4-dehydrorhamnose reductases (RmID", MOL. MICROBIOL., vol. 98, 2015, pages 946 - 962
LE BRETON, Y.BELEW, A. T.VALDES, K. M.ISLAM, E.CURRY, P.TETTELIN, H.SHIRTLIFF, M. E.EI-SAYED, N. M.MCLVER, K. S.: "Essential Genes in the Core Genome of the Human Pathogen Streptococcus pyogenes", SCI. REP., vol. 5, 2015, pages 9838
SHIBATA, Y.YAMASHITA, Y.OZAKI, K.NAKANO, Y.KOGA, T.: "Expression and characterization of streptococcal rgp genes required for rhamnan synthesis in Escherichia coli", INFECT. IMMUN., vol. 70, 2002, pages 2891 - 2898, XP002987172, DOI: 10.1128/IAI.70.6.2891-2898.2002
BRUYERE, T.WACHSMANN, D.KLEIN, J. P.SCHOLLER, M.FRANK, R. M.: "Local response in rat to liposome-associated Streptococcus mutans polysaccharide-protein conjugate", VACCINE, vol. 5, 1987, pages 39 - 42, XP023710436, DOI: 10.1016/0264-410X(87)90007-7
CARTEE, R. T.FORSEE, W. T.BENDER, M. H.AMBROSE, K. D.YOTHER, J.: "CpsE from type 2 Streptococcus pneumoniae catalyzes the reversible addition of glucose-1-phosphate to a polyprenyl phosphate acceptor, initiating type 2 capsule repeat unit formation", J. BACTERIOL., vol. 187, 2005, pages 7425 - 7433
OZAKI, K.SHIBATA, Y.YAMASHITA, Y.NAKANO, Y.TSUDA, H.KOGA, T.: "A novel mechanism for glucose side-chain formation in rhamnose-glucose polysaccharide synthesis", FEBS LETT., vol. 532, 2002, pages 159 - 163, XP004395362, DOI: 10.1016/S0014-5793(02)03661-X
VETTING, M. W.FRANTOM, P. A.BLANCHARD, J. S.: "Structural and enzymatic analysis of MshA from Corynebacterium glutamicum: substrate-assisted catalysis", J. BIOL. CHEM., vol. 283, 2008, pages 15834 - 15844
JURTSHUK, P.: "Bacterial Metabolism. in Medical Microbiology", 1996, UNIVERSITY OF TEXAS MEDICAL BRANCH AT GALVESTON
PARSONAGE, D.NEWTON, G. L.HOLDER, R. C.WALLACE, B. D.PAIGE, C.HAMILTON, C. J.DOS SANTOS, P. C.REDINBO, M. R.REID, S. D.CLAIBORNE, : "Characterization of the N-acetyl-a-D-glucosaminyl I-malate synthase and deacetylase functions for bacillithiol biosynthesis in Bacillus anthracis", BIOCHEMISTRY, vol. 49, 2010, pages 8398 - 8414
LOMBARD, V.GOLACONDA RAMULU, H.DRULA, E.COUTINHO, P. M.HENRISSAT, B.: "The carbohydrate-active enzymes database (CAZy) in 2013", NUCLEIC ACIDS RES., vol. 42, 2014, pages D490 - 495, XP055519748, DOI: 10.1093/nar/gkt1178
JAMES, D. B. A.YOTHER, J.: "Genetic and Biochemical Characterizations of Enzymes Involved in Streptococcus pneumoniae Serotype 2 Capsule Synthesis Demonstrate that Cps2T (WchF) Catalyzes the Committed Step by Addition of β1-4 Rhamnose, the Second Sugar Residue in the Repeat Unit", J. BACTERIOL., vol. 194, 2012, pages 6479 - 6489
SCHAGGER, H.: "Tricine-SDS-PAGE", NAT. PROTOC., vol. 1, 2006, pages 16 - 22
WALDO, G. S.STANDISH, B. M.BERENDZEN, J.TERWILLIGER, T. C.: "Rapid protein-folding assay using green fluorescent protein", NAT. BIOTECHNOL., vol. 17, 1999, pages 691 - 695, XP002334382, DOI: 10.1038/10904
DRUZHININA, T. N.DANILOV, L. L.TORGOV, V. I.UTKINA, N. S.BALAGUROVA, N. M.VESELOVSKY, V. V.CHIZHOV, A. O.: "11-Phenoxyundecyl phosphate as a 2-acetamido-2-deoxy-a-d-glucopyranosyl phosphate acceptor in O-antigen repeating unit assembly of Salmonella arizonae 0:59", CARBOHYDR. RES., vol. 345, 2010, pages 2636 - 2640, XP027506778, DOI: 10.1016/j.carres.2010.09.021
ROBINSON, P. T.PHAM, T. N.UHRIN, D.: "In phase selective excitation of overlapping multiplets by gradient-enhanced chemical shift selective filters", J. MAGN. RESON. SAN DIEGO CALIF 1997, vol. 170, 2004, pages 97 - 103, XP055454236, DOI: 10.1016/j.jmr.2004.06.004
RUCKER, F. J.OSORIO, D.: "The effects of longitudinal chromatic aberration and a shift in the peak of the middle-wavelength sensitive cone fundamental on cone contrast", VISION RES., vol. 48, 2008, pages 1929 - 1939, XP023783206, DOI: 10.1016/j.visres.2008.06.021
Attorney, Agent or Firm:
DR PAUL CHAPMAN (GB)
Claims:
CLAIMS:

1. A method of synthesizing a rhamnose polysaccharide, the method comprising

(i) transferring a rhamnose moiety to a hexose monosaccharide, disaccharide or trisaccharide using a hexose-β-1,4-rhamnosyltransferase, a hexose-α-1,2-rhamnosyltransferase and/or a hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof to form a disaccharide, trisaccharide or tetrasaccharide comprising a rhamnose moiety at a non-reducing end of the disaccharide, trisaccharide or tetrasaccharide;

(ii) generating the rhamnose polysaccharide by extending from the rhamnose moiety at the non-reducing end of the disaccharide, trisaccharide or tetrasaccharide using a heterologous bacterial enzyme Streptococcus pyogenes Group A carbohydrate enzyme C (GacC) and/or Streptococcus pyogenes Group A carbohydrate enzyme G (GacG) or an enzymatically active homologue, variant or fragment thereof.

2. The method according to claim 1, wherein the method is performed in a bacterium species heterologous to the bacterium species from which the enzyme GacC and/or GacG, or an enzymatically active homologue, variant or fragment thereof is derived.

3. The method according to claim 1 or claim 2, wherein the hexose-β-1,4-rhamnosyltransferase is not a GlcNAc-β-1,4-rhamnosyltransferase.

4. The method according to any one of the preceding claims, wherein the hexose-β-1,4-rhamnosyltransferase is a Glc-β-1,4-rhamnosyltransferase or an enzymatically active fragment or variant thereof.

5. The method according to claim 4, wherein the Glc-β-1,4-rhamnosyltransferase comprises a WchF enzyme, or an enzymatically active fragment or variant thereof.

6. The method according to claim 5, wherein the WchF enzyme comprises SEQ ID NO:36, or an enzymatically active fragment or variant thereof.

7. The method according to any one of the preceding claims, wherein the hexose-α-1,2-rhamnosyltransferase is a galactose-α-1,2-rhamnosyltransferase or an enzymatically active fragment or variant thereof.

8. The method according to claim 7, wherein the galactose-α-1,2-rhamnosyltransferase comprises a WbbR enzyme, or an enzymatically active fragment or variant thereof.

9. The method according to claim 8, wherein the WbbR enzyme comprises SEQ ID NO:37, or an enzymatically active fragment or variant thereof.

10. The method according to any one of the preceding claims, wherein the hexose-α-1,3-rhamnosyltransferase is a GlcNAc-α-1,3-rhamnosyltransferase, a diNAcBac-α-1,3-rhamnosyltransferase, a Glc-α-1,3-rhamnosyltransferase, a galactose-α-1,3-rhamnosyltransferase or an enzymatically active fragment or variant thereof.

11. The method according to claim 10, wherein the GlcNAc-α-1,3-rhamnosyltransferase comprises a WbbL enzyme, or an enzymatically active fragment or variant thereof and the galactose-α-1,3-rhamnosyltransferase comprises a WsaD enzyme, or an enzymatically active fragment or variant thereof.

12. The method according to claim 11, wherein the WbbL enzyme comprises SEQ ID NO:38, or an enzymatically active fragment or variant thereof.

13. The method according to claim 11, wherein the WsaD enzyme comprises SEQ ID NO:41, or an enzymatically active fragment or variant thereof.

14. The method according to any one of the preceding claims, wherein the enzymatically active homologue of GacC and/or GacG is selected from a homologue from a Streptococci Group B, Group C, Group G, S. mutans, S. uberis or an enzymatically active fragment or variant thereof.

15. The method of any one of the preceding claims, wherein the method is performed in a gram-negative bacterium/bacteria.

16. The method of any one of the preceding claims, wherein the method is performed in E. coli.

17. The method according to any one of the preceding claims, wherein step ii) further comprises using one or more additional enzymes from the Gac cluster of bacterial enzymes, or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof.

18. The method according to any one of the preceding claims, the method further comprising:

(iii) conjugating the rhamnose polysaccharide to an acceptor molecule using an O-oligosaccharyltransferase capable of recognising the hexose monosaccharide at the reducing end of the rhamnose polysaccharide to form a rhamnose glycoconjugate.

19. The method according to claim 18 when dependent on claim 2, 15 or 16, wherein the O-oligosaccharyltransferase is heterologous to the bacteria in which the method is performed.

20. The method according to claim 16 or claim 17, wherein the O-oligosaccharyltransferase comprises PglB, PglL, PglS or WsaB or an enzymatically active homologue, fragment or variant thereof.

21. The method according to any one of claims 18 to 20, wherein the acceptor molecule comprises a peptide or a protein.

22. The method of any one of claims 18 to 21, wherein the method further comprises purifying the rhamnose glycoconjugate, optionally wherein purifying comprises affinity or size exclusion chromatography.

23. A product obtainable using the method according to any one of claims 1 to 22.

24. A synthetic streptococcal polysaccharide, the polysaccharide having a non-reducing end comprising a linear chain of rhamnose moieties and a reducing end comprising a hexose monosaccharide, disaccharide or trisaccharide,

wherein the polysaccharide comprises an α-1,3 bond or an α-1,2 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties; or the polysaccharide comprises a β-1,4 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties and the hexose monosaccharide, disaccharide or trisaccharide does not comprise N-acetylglucosamine.

25. The synthetic streptococcal rhamnose polysaccharide according to claim 24, wherein the polysaccharide comprises an α-1,3 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties and the hexose comprises N-acetylglucosamine, N,N’-diacetylbacillosamine, glucose or galactose.

26. The synthetic streptococcal rhamnose polysaccharide according to claim 24, wherein the polysaccharide comprises an α-1,2 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties and the hexose comprises galactose.

27. The synthetic streptococcal rhamnose polysaccharide according to claim 24, wherein the polysaccharide comprises a β-1,4 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties and the hexose comprises glucose.

28. The synthetic streptococcal rhamnose polysaccharide according to any one of claims 24 to 27, wherein the polysaccharide comprises a polysaccharide or a fragment or variant thereof selected from the group consisting of a Group A, Group B, Group C and Group G carbohydrate.

29. A streptococcal rhamnose glycoconjugate comprising the rhamnose polysaccharide according to any one of claims 24 to 28 conjugated to an acceptor.

30. The streptococcal glycoconjugate according to claim 29, wherein the polysaccharide is conjugated to the acceptor at the reducing end of the polysaccharide.

31. The streptococcal glycoconjugate according to claim 29 or claim 30, wherein the acceptor comprises a peptide or a protein.

32. An immunogenic composition or vaccine comprising the product of claim 23, the synthetic streptococcal rhamnose polysaccharide according to any one of claims 24 to 28 or the streptococcal glycoconjugate according to any one of claims 29 to 31.

33. The immunogenic composition or vaccine according to claim 32, wherein the immunogenic composition or vaccine further comprises a pharmaceutically acceptable and/or sterile excipient, carrier and/or diluent.

34. The immunogenic composition or vaccine according to claim 32 or claim 33, wherein the immunogenic composition or vaccine further comprises an antigen, polypeptide and/or adjuvant.

35. The product of claim 23, the synthetic streptococcal rhamnose polysaccharide according to any one of claims 24 to 28, the streptococcal glycoconjugate according to any one of claims 29 to 31, or the immunogenic composition or vaccine according to any one of claims 32 to 34 for use in raising an immune response in an animal or for use in treating or preventing a disease, condition or infection with a streptococcal aetiology.

36. A bacterial host cell, the bacterial host cell comprising a hexose-β-1,4-rhamnosyltransferase, a hexose-α-1,2-rhamnosyltransferase or a hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof and a heterologous bacterial enzyme GacC and/or GacG or an enzymatically active homologue, variant or fragment thereof.

37. A kit of parts, the kit comprising:

(i) A nucleic acid sequence encoding a hexose-β-1,4-rhamnosyltransferase, a hexose-α-1,2-rhamnosyltransferase or a hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof; and

(ii) A nucleic acid sequence encoding a heterologous bacterial enzyme GacC and/or GacG or an enzymatically active homologue, variant or fragment thereof.

Description:
Rhamnose-Polysaccharides

Field

The present invention relates to a method of synthesizing a rhamnose polysaccharide. The invention also relates to a synthetic streptococcal polysaccharide, a streptococcal glycoconjugate, an immunogenic composition or vaccine comprising the streptococcal polysaccharide or glycoconjugate, and the polysaccharide, glycoconjugate, immunogenic composition or vaccine for use in raising an immune response in an animal or for use in treating or preventing a disease, condition or infection with a streptococcal aetiology.

Background

The genus Streptococcus is a group of versatile Gram-positive bacteria that infect a wide range of hosts and are responsible for a remarkable number of illnesses.

Streptococcus pyogenes (Group A Streptococcus, GAS) is a human-exclusive pathogenic Gram-positive bacterium that causes a variety of illnesses. An estimate of the global burden of this organism, which is probably conservative, suggests that over 700 million individuals are affected per year worldwide, with diseases as varied as impetigo, pharyngitis, scarlet fever, necrotising fasciitis, meningitis and toxic shock syndrome, amongst other illnesses. Moreover, autoimmune post-infection sequelae, such as acute rheumatic fever, acute glomerulonephritis or rheumatic heart disease, can affect individuals who have previously suffered from GAS infections, extending the list of clinical manifestations caused by this pathogen. The Group A Carbohydrate (GAC) is a peptidoglycan-anchored rhamnose-polysaccharide (RhaPS) from Streptococcus pyogenes that is essential to bacterial survival and contributes to the ability of Streptococcus pyogenes to infect the human host.

Streptococcus agalactiae (Group B Streptococcus, GBS) is a (pathogenic) commensal bacterium which is carried by 20-40% of all adult humans. 25% of women carry GBS in the vagina, where it normally resides without symptoms. However, in pregnant women, GBS is a recognised cause of preterm delivery, maternal infections, stillbirths and late miscarriages. Despite current prevention strategies, 1 in every 1000 babies born in the UK develops a GBS infection. Preterm babies are known to be at particular risk of GBS infection as their immune systems are not as well developed. This results in one baby per week dying in the UK from GBS infection and one baby surviving with long-term disabilities.

Group C Streptococcus (GCS) can cause epidemic pharyngitis and cellulitis clinically indistinguishable from GAS disease in humans. It is also known to cause septicaemia, endocarditis, septic arthritis and necrotizing infections in patients with predisposing conditions such as diabetes or cancer, or in elderly patients. In equine animals, GCS is the cause of the highly contagious and serious upper respiratory tract infection known as strangles, which is enzootic worldwide.

Group G Streptococcus (GGS) are significant human pathogens that cause cutaneous infections. GGS also infect the oropharynx, the gastrointestinal tract and the female genital tract. Other infections associated with GGS include several potentially life-threatening infections such as septicaemia, endocarditis, meningitis, peritonitis, pneumonitis, empyema, and septic arthritis.

Antimicrobial options for effectively controlling, treating and preventing GAS infections are becoming more limited. This is due to emerging antibiotic resistance, pandemic development and the spread of hypervirulent strains. There is thus a clear need for the development of a safe and effective vaccine candidate. For a vaccine to be capable of targeting most of the over 120 different GAS serotypes, it will need to be based on a ubiquitous, conserved and essential GAS target. One such target is the GAC, which is not only an essential structural component of the pathogen, but is also a virulence determinant.

Current forms of vaccine development are limited to chemical and enzymatic extraction methods from native bacteria as well as chemical conjugation to an acceptor compound, for example a protein or peptide. This is labour-intensive and results in a limited yield and quality of product. There is a clear need for a method of producing a GAS polysaccharide which is less labour-intensive and which gives a homogeneous, pure polysaccharide in high yield. The present invention is devised with these issues in mind.

Description

In its broadest sense, the present disclosure relates to a method of synthesizing a polysaccharide, specifically a rhamnose polysaccharide.

According to a first aspect there is provided a method of synthesizing a rhamnose polysaccharide, the method comprising:

(i) transferring a rhamnose moiety to a hexose monosaccharide, disaccharide, or trisaccharide using a hexose-β-1,4-rhamnosyltransferase, a hexose-α-1,2-rhamnosyltransferase and/or a hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof to form a disaccharide, trisaccharide or tetrasaccharide comprising a rhamnose moiety at a non-reducing end of the disaccharide, trisaccharide or tetrasaccharide; and

(ii) generating the rhamnose polysaccharide by extending from the rhamnose moiety at the non-reducing end of the disaccharide, trisaccharide or tetrasaccharide using a heterologous bacterial enzyme Streptococcus pyogenes Group A carbohydrate enzyme C (GacC) and/or Streptococcus pyogenes Group A carbohydrate enzyme G (GacG) or an enzymatically active homologue, variant or fragment thereof.

The bacterial species from which the enzyme GacC and/or the enzyme GacG or an enzymatically active homologue, variant or fragment thereof is derived is heterologous to the bacterial species from which the hexose-β-1,4-rhamnosyltransferase, the hexose-α-1,2-rhamnosyltransferase, the hexose-α-1,3-rhamnosyltransferase or enzymatically active fragment or variant thereof used in step (i) is derived.

The present inventor has discovered for the first time that the Streptococcus pyogenes enzyme GacB, which initiates the synthesis of the GAC rhamnose polysaccharide, is an α-D-GlcNAc-β-1,4-L-rhamnosyltransferase. Entirely surprisingly, the inventor has found that these rhamnose polysaccharides can be synthesized using rhamnosyltransferases from bacterial species different to those from which the GacB is derived. In other words, the inventor has found that rhamnose polysaccharides can be synthesized using rhamnosyltransferases from bacterial species other than S. pyogenes. This is entirely unexpected given that the function of GacB was previously unknown. It is also surprising that enzymes from different species can work together to synthesize a rhamnose polysaccharide.

In some embodiments step (ii) comprises generating the rhamnose polysaccharide by extending from the rhamnose moiety at the non-reducing end of the disaccharide, trisaccharide or tetrasaccharide using the heterologous bacterial enzyme GacC or an enzymatically active homologue, variant or fragment thereof.

Polysaccharide is a known term of the art used to denote a molecule comprising a plurality of identical or different monosaccharides, typically more than four monosaccharides. The term rhamnose polysaccharide, as used herein, will thus be understood to refer to a molecule comprising a plurality, typically more than four, rhamnose moieties, optionally attached to one or more other monosaccharide moieties. Conveniently, the rhamnose polysaccharide may be a single straight chain of repeating units comprising rhamnose, bound to each other by α-1,3 or α-1,2 bonds. Each repeating unit may consist only of rhamnose, or each repeating unit may comprise rhamnose and one or more different monosaccharides. An exemplary repeating unit which comprises rhamnose is a rhamnose-galactose disaccharide repeating unit. Each/any repeating unit and/or rhamnose moiety may or may not include any side group. In one embodiment no side groups are present and in another embodiment one or more side groups, such as a sugar, with or without additional modifications, such as glycerol-phosphate or phosphate, may be present.

In embodiments the method is performed in a bacterium.

In such embodiments the method will be understood to be a microbiological method. Embodiments other than those carried out in a bacterium will be understood to be in vitro methods. By “bacterium”, this will be understood to refer to a bacterial cell. It will be appreciated that the invention also encompasses the method being performed in bacteria. Such microbiological methods are ideal for the production of large and homogenous quantities of a particular product, in this instance a rhamnose polysaccharide.

The rhamnose polysaccharide produced by the method will be understood to be a synthetic rhamnose polysaccharide. A synthetic rhamnose polysaccharide, as the skilled person will appreciate, will be understood to refer to a rhamnose polysaccharide which is not the result of a naturally occurring process. This is because the method of the first aspect uses enzymes, the combination of which is not naturally occurring. In one embodiment, the bacterium is a Streptococcus species other than Streptococcus pyogenes, an Escherichia species, such as E. coli, or a Shigella species, such as Shigella dysenteriae or Shigella flexneri.

Typically, the rhamnose polysaccharide produced by the method is a streptococcal polysaccharide. For example, the polysaccharide may comprise a polysaccharide or a fragment or variant thereof selected from the group consisting of a Group A, Group B, Group C and Group G carbohydrate.

By rhamnose moiety, this will be understood to refer to a rhamnose monosaccharide or a derivative thereof. It will be appreciated that derivatives of rhamnose refer to a rhamnose monosaccharide(s) which has been modified by the addition or replacement of one or more groups or elements in the rhamnose monosaccharide, provided that at least one carbon of the rhamnose monosaccharide is still capable of forming a glycosidic bond with at least one other rhamnose monosaccharide or rhamnose moiety. Derivatives of rhamnose may encompass acetyl or methyl forms of rhamnose, amino-rhamnose, carboxyethyl-rhamnose, halogenated rhamnose and rhamnose phosphate. Unless context otherwise dictates, hereinafter reference will generally be made to a rhamnose moiety, but this should not be construed as limiting.

Halogenated rhamnose will be understood to refer to a rhamnose monosaccharide wherein one or more groups of the rhamnose, for example one or more OH groups, is replaced with a halogen, for example fluorine or chlorine, to form a fluorinated or chlorinated rhamnose, respectively.

Amino-rhamnose will be understood to refer to a rhamnose monosaccharide where one or more groups of the rhamnose is replaced by an amine group.

An example acetyl-rhamnose may comprise 2-O-acetyl-α-L-rhamnose, while an example methyl-rhamnose may comprise 3-O-methyl--L-rhamnose. Another exemplary derivative of rhamnose may comprise carboxyethyl-rhamnose, for example 4-O-(1-carboxyethyl)-L-rhamnose.

By enzymatically active fragment or variant, we include that the sequence of the relevant enzyme can vary from the naturally occurring sequence with the proviso that the fragment or variant substantially retains the enzymatic activity of the enzyme. By retain the enzymatic activity of the enzyme it is meant that the fragment and/or variant retains at least a portion of the enzymatic activity as compared to the native enzyme. Typically, the fragment and/or variant retains at least 50%, such as 60%, 70%, 80%, 90%, 95%, 97%, 98% or 99% activity. In some instances, the fragment and/or variant may have a greater enzymatic activity than the native enzyme. In some embodiments, the fragment and/or variant may display an increase in another physiological feature as compared to the native enzyme. For example, the fragment and/or variant may possess a greater half-life in vitro and/or in vivo, as compared to the native enzyme. The test for determining the half-life of an enzyme, or a fragment or variant thereof, will be known to the skilled person. Briefly, an in vitro test may involve incubating the enzyme at a particular temperature and pH for different time periods. At the end of each time period, the activity of the enzyme, or fragment or variant thereof, can be measured using an enzymatic assay, which is well known to the skilled person.
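The in vitro half-life test outlined above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that residual activity follows first-order decay, so that the logarithm of activity falls linearly with incubation time; the source does not specify a decay model, and the function names `half_life` and `retained_activity_percent` are hypothetical.

```python
import math


def half_life(times, activities):
    """Estimate half-life from residual activity measured after each incubation period.

    Assumption (not from the source): activity decays first-order, so
    ln(activity) = ln(A0) - k * t and half-life = ln(2) / k.
    The slope is obtained by ordinary least squares on (t, ln(activity)).
    """
    if len(times) != len(activities) or len(times) < 2:
        raise ValueError("need at least two paired time/activity measurements")
    if any(a <= 0 for a in activities):
        raise ValueError("activities must be positive to take logarithms")

    logs = [math.log(a) for a in activities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("no decay observed; half-life is undefined")
    return math.log(2) / -slope


def retained_activity_percent(variant_activity, native_activity):
    """Activity of a fragment or variant as a percentage of the native enzyme."""
    if native_activity <= 0:
        raise ValueError("native activity must be positive")
    return 100.0 * variant_activity / native_activity


# Illustrative numbers only (not experimental data): activity halves every 10 minutes.
t_half = half_life([0, 10, 20, 30], [100.0, 50.0, 25.0, 12.5])
retained = retained_activity_percent(60.0, 100.0)  # compare against e.g. the 50% level
```

Under these assumptions, the same routine can be applied to the native enzyme and to a fragment or variant, and the two half-lives compared directly.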

The enzyme GacC, as used herein, will be understood to refer to the Streptococcus pyogenes Group A carbohydrate enzyme C (UniProtKB - Q9A0G4 (Q9A0G4_STRP1)). An exemplary amino acid sequence encoding GacC is provided by SEQ ID NO: 1. The enzyme GacG, as used herein, will be understood to refer to the Streptococcus pyogenes Group A carbohydrate enzyme G (UniProtKB - Q9A0G0 (Q9A0G0_STRP1)). In some embodiments, the enzyme GacG comprises or consists of SEQ ID NO:2, or an enzymatically active fragment or variant thereof.

GacG (or an enzymatically active homologue, variant or fragment thereof) may be used instead of or in addition to GacC in the method of the invention. GacC is a rhamnose-α-1,3-rhamnosyltransferase, while GacG is a predicted dual-function glycosyltransferase that synthesizes the repeating unit of the GAC (alpha-1,3-alpha-1,2).

“Homologue” may encompass enzymes which exhibit at least about 20%, 30%, 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to a GacC or GacG amino acid sequence.

In some embodiments the enzymatically active homologue is a homologue of GacC.

The degree of (or percentage) “homology” between two or more amino acid sequences may be calculated by aligning the sequences and determining the number of aligned residues which are identical and adding this to the number of conservative amino acid substitutions. The combined total is then divided by the total number of residues compared and the resulting figure is multiplied by 100 - this yields the percentage homology between aligned sequences.
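The calculation described above can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned (equal length, with "-" marking gaps), that alignment columns containing a gap are not counted as residues compared, and that the conservative groups are those listed later in this description for variants (Gly/Ala; Val/Ile/Leu; Asp/Glu; Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr). The function name and the handling of gaps are assumptions made for illustration, not part of the source.

```python
# Conservative substitution groups, one-letter codes:
# Gly/Ala; Val/Ile/Leu; Asp/Glu; Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("GA"), set("VIL"), set("DE"), set("NQ"),
    set("ST"), set("KR"), set("FY"),
]


def percent_homology(aligned_a, aligned_b):
    """Percentage homology of two pre-aligned sequences.

    (identical residues + conservative substitutions) / residues compared * 100.
    Assumption: columns with a gap ('-') in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    identical = 0
    conservative = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: not a compared residue pair
        compared += 1
        if x == y:
            identical += 1
        elif any(x in group and y in group for group in CONSERVATIVE_GROUPS):
            conservative += 1

    if compared == 0:
        raise ValueError("no residue pairs to compare")
    return 100.0 * (identical + conservative) / compared


if __name__ == "__main__":
    # Toy alignment: G=G identical, A/G and V/L conservative, K/D neither.
    print(percent_homology("GAVK", "GGLD"))
```

In the toy alignment, one identical pair plus two conservative substitutions out of four compared residues gives three quarters of the positions counted as homologous.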

Typically, a homologue of GacC or GacG encompasses an enzyme which substantially retains the enzymatic activity of GacC or GacG.

In some embodiments the homologue of GacC comprises or consists of rfbG. RfbG is an alpha-1,3-rhamnosyltransferase derived from Shigella flexneri which has 30% identity to GacC. Thus, in the context of the present invention, rfbG is an enzymatically active homologue of GacC. In some embodiments, rfbG comprises or consists of SEQ ID NO: 3. RfbG may be identified using the UniProtKB - A0A2D0WWB9 (A0A2D0WWB9_9ENTR).

The homologue of GacC or GacG may comprise or consist of rfbG, an enzyme derived from a Lancefield group species other than S. pyogenes and/or from a non-Lancefield group Streptococcus species other than S. pneumoniae. In some embodiments the homologue of GacC or GacG is an enzyme derived from a Lancefield group species other than S. pyogenes and/or from a non-Lancefield group Streptococcus species other than S. pneumoniae.

As the skilled person will be aware, the Lancefield group of bacteria refers to a group of different bacterial species, primarily Streptococcus species, which are catalase-negative and coagulase-negative. The grouping is based on the carbohydrate composition of the cell wall antigens.

Lancefield group bacteria include:

• Group A - Streptococcus pyogenes, Streptococcus dysgalactiae subsp. equisimilis

• Group B - Streptococcus agalactiae

• Group C - Streptococcus equisimilis, Streptococcus equi, Streptococcus zooepidemicus, Streptococcus dysgalactiae, Streptococcus dysgalactiae subsp. equisimilis

• Group D - Enterococcus faecalis, Enterococcus faecium, Enterococcus durans and Streptococcus bovis

• Group E - Enterococci

• Group F, G & L - Streptococcus anginosus, Streptococcus dysgalactiae subsp. equisimilis

• Group H - Streptococcus sanguis

• Group K - Streptococcus salivarius

• Group L - Streptococcus dysgalactiae

• Group M & O - Streptococcus mitior

• Group N - Lactococcus lactis

• Group R & S - Streptococcus suis

The non-Lancefield group Streptococcus species may comprise Streptococcus mutans or S. uberis. In some embodiments the non-Lancefield group Streptococcus species may comprise or consist of S. mutans.

The enzymatically active homologue of GacC or GacG may be selected from a homologue from the Streptococcus Group B, Group C, Group G, S. mutans, S. uberis or an enzymatically active fragment or variant thereof. In some embodiments the enzymatically active homologue of GacC or GacG may be selected from a homologue from the Streptococcus Group B, Group C, Group G, S. mutans, or an enzymatically active fragment or variant thereof.

In some embodiments, the enzymatically active homologue of GacC is selected from a homologue of GacC from the Streptococcus Group B, Group C, Group G, S. mutans, S. uberis or an enzymatically active fragment or variant thereof. The skilled person will be aware of Streptococcal homologues to GacC. For example, the Group B homologue of GacC may be GbcC (UniProtKB - Q8DYQ2 (Q8DYQ2_STRA5)). The Group C homologue of GacC may be GccC (UniProtKB - M4YWQ3 (M4YWQ3_STREQ)). The Group G homologue of GacC may be GgcC (UniProtKB - C5WFT8 (C5WFT8_STRDG)), while the S. mutans homologue of GacC may be SccC (UniProtKB - A0A0E2EN43 (A0A0E2EN43_STRMG)). The S. uberis homologue of GacC may be SucC (UniProtKB - B9DU25 (B9DU25_STRU0)).

The amino acid sequence of GbcC may comprise or consist of SEQ ID NO:4. The amino acid sequence of GccC may comprise or consist of SEQ ID NO:5, while the amino acid sequence of GgcC may comprise or consist of SEQ ID NO:6. In some embodiments SccC comprises or consists of SEQ ID NO:7. The amino acid sequence of SucC may comprise or consist of SEQ ID NO:8.

In some embodiments, the enzymatically active homologue of GacG is selected from a homologue of GacG from the Streptococcus Group C, Group G, S. mutans, S. uberis or an enzymatically active fragment or variant thereof. Suitable enzymatically active homologues of GacG include, but are not limited to, the Group C homologue of GacG, GccG, the Group G homologue of GacG, GgcG, the S. uberis homologue of GacG, SucG, and the S. mutans homologue of GacG, SccG.

In some embodiments, GccG comprises or consists of SEQ ID NO:9. In some embodiments GccG comprises or consists of two proteins. The two proteins may comprise or consist of SEQ ID NOs: 10 and 11.

GgcG may comprise or consist of two proteins. The two proteins may have the UniProtKBs C5WFU2 (C5WFU2_STRDG) and C5WFU3 (C5WFU3_STRDG), respectively. In some embodiments GgcG may comprise or consist of SEQ ID Nos 12 and 13. SucG may comprise or consist of the amino acid sequence identified by the UniProtKB - B9DU29 (B9DU29_STRU0). For example, SucG may comprise or consist of the amino acid sequence SEQ ID NO: 14.

SccG may comprise or consist of the amino acid sequence identified by the UniProtKB - O82878 (O82878_STRMG). In some embodiments, SccG comprises or consists of the amino acid sequence SEQ ID NO: 15.

The enzymatically active homologue of GacC or GacG may be selected from a homologue from S. mutans, S. uberis or a fragment or variant thereof.

In some embodiments step (ii) comprises generating the rhamnose polysaccharide by extending from the rhamnose moiety at the non-reducing end of the disaccharide, trisaccharide or tetrasaccharide using an enzymatically active homologue of GacC and/or GacG from S. mutans, or an enzymatically active variant or fragment thereof.

The invention also encompasses nucleic acid sequences encoding the enzymes (and/or enzymatically active fragments, variants or homologues) of the present invention.

As used herein, when an enzyme is “derived from” a particular bacterial species, this means that the enzyme is naturally occurring in the particular bacterial species. In the context of the present invention, an enzyme “derived from” a particular bacterial species may include an enzyme endogenous to the bacterium in which the method may be performed, an enzyme or a nucleic acid encoding the enzyme isolated from the particular bacterial species, or variants or fragments thereof. In embodiments where the method is performed in a bacterium, the enzyme or nucleic acid encoding the enzyme isolated from the particular bacterial species may be transferred into the bacterium in which the method is performed.

In embodiments where the method is performed in a bacterium, the enzyme(s) of step (i) and/or the enzyme(s) of step (ii) may be overexpressed in the bacterium. By “overexpressed”, this will be understood to refer to a level of expression of the enzyme higher than that which would be observed for the naturally occurring enzyme when endogenously expressed in its native bacterium. Various techniques for overexpression are known to those skilled in the art. Further information regarding overexpression techniques may be found in Current Protocols in Molecular Biology (2019) which is incorporated herein by reference.

In the context of the present invention, heterologous is used to refer to different. A heterologous bacterial species will be understood to mean a bacterial species different to another, or a bacterial genus different to another bacterial genus.

It will be appreciated that in the context of the present invention, heterologous does not encompass a bacterial strain being different to another bacterial strain (i.e. two strains, for example, of S. mutans).

By “variants” of an enzyme we include insertions, deletions and substitutions of the amino acid sequence, either conservative or non-conservative, wherein the physico-chemical properties of the respective amino acid(s) are not substantially changed (for example, conservative substitutions such as Gly, Ala; Val, Ile, Leu; Asp, Glu; Asn, Gln; Ser, Thr; Lys, Arg; and Phe, Tyr). The skilled person will appreciate that such conservative substitutions should not affect the functionality of the respective enzyme. Moreover, small deletions within non-functional regions of the enzyme can also be tolerated and hence are considered “variants” for the purpose of the present invention. “Variants” also include recombinant enzyme proteins in which the amino acids have been post-translationally modified, by for example, glycosylation, or disulphide bond formation. The experimental procedures described herein can be readily adopted by the skilled person to determine whether a “variant” can still function as an enzyme.

It is preferred if the variant has an amino acid sequence which has at least 75%, yet still more preferably at least 80%, in further preference at least 85%, in still further preference at least 90% and most preferably at least 95%, 97%, 98% or 99% identity with the “naturally occurring” amino acid sequence of the enzyme.

It will be appreciated that variants also encompass variants of the nucleic acid sequence encoding the enzyme. In particular, we include variants of the nucleotide sequence where such changes do not substantially alter the enzymatic activity of the enzyme which it encodes. A skilled person would know that such sequences can be altered without the loss of enzymatic activity. In particular, single changes in the nucleotide sequence may not result in an altered amino acid sequence following expression of the sequence.

In some embodiments the method is performed in a bacterium species heterologous to the bacterium species or genera from which the enzyme GacC and/or GacG or an enzymatically active homologue, variant or fragment thereof is derived. In some embodiments the method is performed in a gram-positive bacterium. The method may be performed in a gram-negative bacterium. For example, the method may be performed in a gram-negative bacterium such as E. coli or Campylobacter species. Other suitable gram-negative bacteria will be known to the skilled person. In embodiments, the bacterium species may be heterologous to the bacterium species or genera from which the hexose-β-1,4-rhamnosyltransferase, hexose-α-1,2-rhamnosyltransferase or hexose-α-1,3-rhamnosyltransferase is derived.

In some embodiments the method is performed in E. coli.

Step ii) of the method may comprise using one or more additional enzymes from the Gac cluster of bacterial enzymes, or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof.

As the skilled person will appreciate, GacB is one of a number of enzymes encoded by one gene cluster in S. pyogenes. This gene cluster, which may otherwise be referred to as the Gac gene cluster (gacA-gacL, MGAS5005_Spy_0602-0613), is understood to encode 12 different enzymes, as defined by van Sorge et al., 2014. The 12 enzymes are GacA, GacB, GacC, GacD, GacE, GacF, GacG, GacH, GacI, GacJ, GacK and GacL. Thus, step ii) of the method may further comprise using one or more additional enzymes from the Gac cluster of bacterial enzymes, or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof. Thus, in some embodiments, step ii) of the method comprises using one or more additional enzymes selected from GacA, GacC, GacD, GacE, GacF, GacG, GacH, GacI, GacJ, GacK, GacL or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof.

In some embodiments, step ii) of the method further comprises using one or more enzymatically active homologue(s), or enzymatically active variant(s) or fragment(s) thereof, of one or more of GacA, GacC, GacD, GacE, GacF, GacG, GacH, GacI, GacJ, GacK, GacL.

The one or more enzymatically active homologue(s) may be derived from S. mutans and/or S. uberis.

In some embodiments the one or more enzymatically active homologue(s) is derived from S. mutans.

Step ii) may further comprise using the enzyme GacA or an enzymatically active homologue, fragment or variant thereof. In some embodiments step ii) may comprise using the enzymes GacC and GacG, or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof. In some embodiments step ii) comprises using the enzymes GacC, GacA and GacG, or one or more enzymatically active homologues, variants or fragments thereof. Step ii) may further comprise using the enzymes GacD, GacE, and GacF or one or more enzymatically active homologue(s), fragment(s) or variant(s) thereof.

Step ii) may comprise using the enzymes GacC, GacA, GacG, GacD, GacE, and GacF or one or more enzymatically active homologue(s), fragment(s) or variant(s) thereof.

In some embodiments step ii) comprises using the enzymes GacA, GacC, GacD, GacE, GacF, GacG, GacH, GacI, GacJ, GacK and GacL, or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof.

Step ii) may comprise using the enzymatically active homologues from S. mutans and/or S. uberis of GacA, GacC, GacD, GacE, GacF, GacG and GacH.

In some embodiments, step ii) comprises using the enzymatically active homologues from S. mutans of GacA, GacC, GacD, GacE, GacF, GacG and GacH.

GacA may comprise or consist of SEQ ID NO: 16. Without wishing to be bound by theory, GacA is believed to function to synthesize the rhamnose moieties required for the generation of the rhamnose polysaccharide. GacG is believed to be involved in the generation of the rhamnose polysaccharide by extending from the rhamnose moiety at the reducing end.

GacD and GacE may function to form an ATP-dependent ABC transporter. As the skilled person will appreciate, an ATP-dependent ABC transporter translocates substrates across membranes. Thus, without wishing to be bound by theory, GacD and GacE may assist in transporting the rhamnose polysaccharide across the bacterial membrane such that it can then be presented on the bacterial cell wall.

GacH may comprise or consist of SEQ ID NO: 17. GacH can also be identified using UniProtKB - J7M7C2 (J7M7C2_STRP1).

In some embodiments step ii) further comprises using the enzymes GacH, GacI, GacJ, GacK and GacL, or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof. It is thought that GacI and/or GacJ may enhance the catalytic efficiency of the method of synthesizing the rhamnose polysaccharide.

Enzymatically active homologues of GacA may be selected from a homologue of GacA from the Streptococcus Group B, Group C, Group G, S. mutans, S. uberis or an enzymatically active fragment or variant thereof. For example, the Streptococcus Group B homologue of GacA is RmlD. The Streptococcus Group C homologue of GacA is RmlD, as is the Streptococcus Group G homologue of GacA.

The Streptococcus Group B homologue of GacA, RmlD, may have the UniProtKB - A0A0E1EP43 (A0A0E1EP43_STRAG). In some embodiments the Streptococcus Group B homologue of GacA, RmlD, comprises or consists of SEQ ID NO: 18.

The Streptococcus Group C homologue of GacA, RmlD, may have the UniProtKB - K4Q921 (K4Q921_STREQ). In some embodiments the Streptococcus Group C homologue of GacA, RmlD, comprises or consists of SEQ ID NO: 19.

The Streptococcus Group G homologue of GacA, RmlD, may have the UniProtKB - A0A2X3AIL5 (A0A2X3AIL5_STRDY). The Streptococcus Group G homologue of GacA may comprise or consist of SEQ ID NO:20.

The S. mutans homologue of GacA may be identified using the UniProtKB - O33664 (O33664_STRMG). In some embodiments the S. mutans homologue of GacA may comprise or consist of SEQ ID NO:21.

The S. uberis homologue of GacA may be identified using the UniProtKB - B9DU23 (B9DU23_STRU0). In some embodiments the S. uberis homologue of GacA may comprise or consist of SEQ ID NO:22.

Enzymatically active homologues of GacD, GacE and/or GacF, may be selected from homologues from the Streptococcus Group C, Group G, S. mutans, S. uberis or an enzymatically active fragment or variant thereof. Suitable homologues of GacD include, but are not limited to, the Streptococcus Group C enzyme GccD, the Streptococcus Group G enzyme GgcD and the S. mutans enzyme SccD. Suitable homologues of GacE include, but are not limited to the Streptococcus Group C enzyme GccE, the Streptococcus Group G enzyme GgcE and the S. mutans enzyme SccE. Suitable homologues of GacF include, but are not limited to the Streptococcus Group C enzyme GccF, the Streptococcus Group G enzyme GgcF, the S. mutans enzyme SccF and the S. uberis enzyme SucF.

In some embodiments GccD comprises or consists of the amino acid sequence SEQ ID NO:23. GccE may be identified using the UniProtKB - A0A380KIL0 (A0A380KIL0_STREQ). In some embodiments GccE comprises or consists of the amino acid sequence SEQ ID NO:24. GccF may be identified using the UniProtKB - A0A3S4QIR3 (A0A3S4QIR3_STREQ). Optionally, GccF comprises or consists of SEQ ID NO:25.

In some embodiments GgcD comprises or consists of the amino acid sequence SEQ ID NO:26. GgcD may be identified using the UniProtKB - C5WFT9 (C5WFT9_STRDG).

In some embodiments GgcE is identified by the UniProtKB - M4YXS7 (M4YXS7_STREQ). Optionally, GgcE comprises or consists of SEQ ID NO:27. GgcF may be identified by the UniProtKB - C5WFU1 (C5WFU1_STRDG). In some embodiments GgcF comprises or consists of SEQ ID NO:28.

SccD may comprise or consist of SEQ ID NO:29. Optionally, SccD is identified using the UniProtKB - I6L8Z4 (I6L8Z4_STRMU).

SccE may comprise or consist of SEQ ID NO:30. Optionally, SccE is identified using the UniProtKB - I6L8X8 (I6L8X8_STRMU).

SccF may be identified using the UniProtKB - O82877 (O82877_STRMG). Optionally, SccF comprises or consists of SEQ ID NO:31.

SucD may be identified using the UniProtKB - B9DU26 (B9DU26_STRU0). In some embodiments SucD comprises or consists of SEQ ID NO:32.

SucE may be identified using the UniProtKB - B9DU27 (B9DU27_STRU0). In some embodiments SucE comprises or consists of SEQ ID NO:33.

SucF may be identified using the UniProtKB - B9DU28 (B9DU28_STRU0). In some embodiments SucF comprises or consists of the amino acid sequence SEQ ID NO:34. An enzymatically active homologue of GacH may comprise or consist of the S. mutans enzyme SccH, or an enzymatically active fragment or variant thereof. The enzyme SccH may be identified using the UniProtKB - Q8DUS0 (Q8DUS0_STRMU).

In some embodiments SccH comprises or consists of SEQ ID NO:35.

In some embodiments the hexose-β-1,4-rhamnosyltransferase is not an N-acetylglucosamine (GlcNAc)-β-1,4-rhamnosyltransferase. In some embodiments the hexose-β-1,4-rhamnosyltransferase is not GacB.

By “hexose-β-1,4-rhamnosyltransferase”, this will be understood to be an enzyme capable of transferring a rhamnose moiety to a hexose such that a β-1,4 linkage is formed between the hexose and the rhamnose moiety. Once the rhamnose moiety is transferred, it will be understood that the hexose is at the reducing end and the rhamnose moiety is at the non-reducing end, i.e. the end from which the chain is extended to generate the rhamnose polysaccharide.

The hexose-β-1,4-rhamnosyltransferase may comprise or consist of an allose-β-1,4-rhamnosyltransferase, an altrose-β-1,4-rhamnosyltransferase, a glucose-β-1,4-rhamnosyltransferase, a mannose-β-1,4-rhamnosyltransferase, a xylose-β-1,4-rhamnosyltransferase, an idose-β-1,4-rhamnosyltransferase, a galactose-β-1,4-rhamnosyltransferase, a talose-β-1,4-rhamnosyltransferase, a diacetylbacillosamine-β-1,4-rhamnosyltransferase or an enzymatically active fragment or variant thereof.

In some embodiments, the hexose-β-1,4-rhamnosyltransferase comprises a glucose(Glc)-β-1,4-rhamnosyltransferase or an enzymatically active fragment or variant thereof. As the skilled person will appreciate, a glucose(Glc)-β-1,4-rhamnosyltransferase is an enzyme capable of transferring a rhamnose moiety to a glucose, thereby forming a β-1,4 linkage between the glucose and the rhamnose moiety. The hexose-β-1,4-rhamnosyltransferase may comprise a WchF enzyme, or an enzymatically active fragment or variant thereof. The WchF enzyme will be understood to be derived from S. pneumoniae and is a glucose(Glc)-β-1,4-rhamnosyltransferase.

In some embodiments the WchF enzyme comprises SEQ ID NO:36, or an enzymatically active fragment or variant thereof. The enzymatically active fragment or variant of WchF may have at least 30% amino acid sequence identity to the WchF enzyme.

In some embodiments the enzymatically active fragment or variant of WchF has at least 80%, at least 85%, at least 90%, at least 95%, at least 97% or at least 99% amino acid identity to the WchF enzyme. For example, homologues of WchF from S. mitis, S. oralis, S. pseudopneumoniae and S. peroris share 87%, 93%, 87% and 81% amino acid identity to WchF, respectively. In the context of the present invention, these particular homologues will thus be understood to be enzymatically active variants of WchF.

The hexose-α-1,2-rhamnosyltransferase may comprise or consist of an allose-α-1,2-rhamnosyltransferase, an altrose-α-1,2-rhamnosyltransferase, a glucose-α-1,2-rhamnosyltransferase, a mannose-α-1,2-rhamnosyltransferase, a xylose-α-1,2-rhamnosyltransferase, an idose-α-1,2-rhamnosyltransferase, a galactose-α-1,2-rhamnosyltransferase, a talose-α-1,2-rhamnosyltransferase, a diacetylbacillosamine-α-1,2-rhamnosyltransferase, a GlcNAc-α-1,2-rhamnosyltransferase or an enzymatically active fragment or variant thereof.

In some embodiments the hexose-α-1,2-rhamnosyltransferase comprises or consists of a galactose-α-1,2-rhamnosyltransferase or an enzymatically active fragment or variant thereof. The hexose-α-1,2-rhamnosyltransferase may comprise a WbbR enzyme, or an enzymatically active fragment or variant thereof. As the skilled person will appreciate, the WbbR enzyme (WP_001045977.1 - UniProtKB - Q32EG0 (Q32EG0_SHIDS)) is derived from Shigella dysenteriae and is a galactose-α-1,2-rhamnosyltransferase.

The WbbR enzyme may comprise or consist of SEQ ID NO:37.

The hexose-α-1,3-rhamnosyltransferase may comprise or consist of an allose-α-1,3-rhamnosyltransferase, an altrose-α-1,3-rhamnosyltransferase, a glucose-α-1,3-rhamnosyltransferase, a mannose-α-1,3-rhamnosyltransferase, a xylose-α-1,3-rhamnosyltransferase, an idose-α-1,3-rhamnosyltransferase, a galactose-α-1,3-rhamnosyltransferase, a talose-α-1,3-rhamnosyltransferase, a diacetylbacillosamine-α-1,3-rhamnosyltransferase, a GlcNAc-α-1,3-rhamnosyltransferase or an enzymatically active fragment or variant thereof.

In some embodiments the hexose-α-1,3-rhamnosyltransferase comprises or consists of a GlcNAc-α-1,3-rhamnosyltransferase, a diNAcBac-α-1,3-rhamnosyltransferase, a Glc-α-1,3-rhamnosyltransferase, a galactose-α-1,3-rhamnosyltransferase or a fragment or variant thereof. The hexose-α-1,3-rhamnosyltransferase may comprise or consist of a GlcNAc-α-1,3-rhamnosyltransferase or a galactose-α-1,3-rhamnosyltransferase or an enzymatically active fragment or variant thereof.

The GlcNAc-α-1,3-rhamnosyltransferase may comprise a WbbL enzyme, or an enzymatically active fragment or variant thereof. The WbbL enzyme is derived from E. coli. The WbbL enzyme may comprise or consist of SEQ ID NO:38, or an enzymatically active fragment or variant thereof.

The enzymatically active fragment or variant of WbbL may have at least 20% or at least 25% amino acid sequence identity to the WbbL enzyme. For example, a homologous enzyme, also known as WbbL, having 27% amino acid identity to WbbL has been identified in Mycobacterium tuberculosis. Thus, in the context of the present invention, this homologue will be understood to be an enzymatically active variant of WbbL. This homologous enzyme to WbbL, derived from Mycobacterium tuberculosis, may comprise or consist of SEQ ID NO: 39. Another suitable homologue of WbbL comprises or consists of the enzyme rfbF, derived from Shigella flexneri. RfbF may comprise or consist of SEQ ID NO:40. RfbF can be identified using the UniProtKB - A0A2Y2Z3I0 (A0A2Y2Z3I0_SHIFL).

The galactose-α-1,3-rhamnosyltransferase may comprise a WsaD enzyme, or an enzymatically active fragment or variant thereof. The WsaD enzyme is derived from Geobacillus stearothermophilus. In some embodiments the WsaD enzyme comprises or consists of SEQ ID NO:41.

Enzymatically active fragments or variants of WsaD may be derived from other Bacilli strains, for example Brevibacillus species and Paenibacillus species. The enzymatically active fragments or variants of WsaD may have at least 20%, 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98% or at least 99% amino acid identity to WsaD.

The inventors have surprisingly found that a chimera of the hexose-β-1,4-rhamnosyltransferase, the hexose-α-1,2-rhamnosyltransferase, or the hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof, with GacB or an enzymatically active variant, fragment or homologue thereof is capable of transferring the rhamnose moiety to a hexose monosaccharide, disaccharide or trisaccharide. Thus, in some embodiments, transferring a rhamnose moiety to a hexose monosaccharide, disaccharide or trisaccharide uses a chimera of GacB with the hexose-β-1,4-rhamnosyltransferase, hexose-α-1,2-rhamnosyltransferase, hexose-α-1,3-rhamnosyltransferase, or enzymatically active fragments or variants thereof. It will be appreciated that in such embodiments the hexose-β-1,4-rhamnosyltransferase is not GacB.

The chimera may comprise at least the C terminus region of GacB linked to the N terminus region of the hexose-β-1,4-rhamnosyltransferase, the hexose-α-1,2-rhamnosyltransferase, the hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof. In some embodiments the chimera comprises the C terminus region of GacB linked to the N terminus region of WchF.

In some embodiments the chimera comprises the full amino acid sequence of GacB except for the initial 50, 100, 150, 160, 170, 180, 190 or 200 amino acids, which are replaced with the corresponding amino acids of the hexose-β-1,4-rhamnosyltransferase, hexose-α-1,2-rhamnosyltransferase, hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof. An example chimera may comprise the amino acid sequence of GacB except that the first 178 amino acids of GacB are replaced with the corresponding WchF amino acids (amino acids 1-186).
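The N-terminal replacement described above amounts to a simple splice of two amino acid sequences. The Python sketch below shows the bookkeeping for the example chimera, in which the first 178 residues of GacB are replaced with residues 1-186 of WchF. The function name is hypothetical, and the two sequences are placeholder strings of arbitrary length, not the real GacB or WchF sequences (which are given in the sequence listing).

```python
def build_chimera(donor_seq, acceptor_seq, donor_n_residues, acceptor_n_replaced):
    """Return donor N-terminal region joined to the acceptor C-terminal region.

    donor_seq            sequence supplying the N terminus (e.g. WchF)
    acceptor_seq         sequence supplying the C terminus (e.g. GacB)
    donor_n_residues     number of N-terminal donor residues to keep
    acceptor_n_replaced  number of N-terminal acceptor residues to discard
    """
    if not 0 <= donor_n_residues <= len(donor_seq):
        raise ValueError("donor_n_residues is outside the donor sequence")
    if not 0 <= acceptor_n_replaced <= len(acceptor_seq):
        raise ValueError("acceptor_n_replaced is outside the acceptor sequence")
    return donor_seq[:donor_n_residues] + acceptor_seq[acceptor_n_replaced:]


if __name__ == "__main__":
    # Placeholder sequences only: 'W' stands in for WchF residues and 'G' for
    # GacB residues. The lengths (300 and 385) are arbitrary illustration values.
    wchf_placeholder = "W" * 300
    gacb_placeholder = "G" * 385

    chimera = build_chimera(wchf_placeholder, gacb_placeholder, 186, 178)
    print(len(chimera))   # 186 donor residues + (385 - 178) acceptor residues
    print(chimera[:190])  # shows the junction between the two regions
```

Because the two replaced regions differ in length (186 versus 178 residues), the chimera is not the same length as the acceptor sequence, which is why the two counts are kept as separate parameters.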

The hexose monosaccharide, disaccharide or trisaccharide to which the rhamnose moiety is transferred can be any hexose. In embodiments the hexose monosaccharide is not a rhamnose moiety.

In embodiments wherein the rhamnose moiety is transferred to a hexose disaccharide or trisaccharide, the monosaccharides of the di or trisaccharide may be the same or different to each other. For example, the disaccharide may comprise two galactose monosaccharides. Alternatively, the disaccharide may comprise a GlcNAc and a galactose. The GlcNAc may be at the reducing end of the disaccharide, and the galactose at the non-reducing end.

The disaccharide may comprise one rhamnose moiety. The trisaccharide may comprise one or two rhamnose moieties.

In some embodiments, the monosaccharide at the reducing end of the hexose monosaccharide, disaccharide or trisaccharide to which the rhamnose moiety is transferred (so the hexose monosaccharide or first monosaccharide of the disaccharide or trisaccharide) is a glucose or a glucose derivative. In the context of the present invention, glucose derivative will be understood to refer to GlcNAc or diNAcBac. In some embodiments the hexose monosaccharide, disaccharide or trisaccharide does not comprise GlcNAc.

It will be appreciated that the monosaccharide at the non-reducing end of the hexose monosaccharide, disaccharide or trisaccharide determines the specificity of the rhamnosyltransferase. This is because the rhamnosyltransferase transfers the rhamnose moiety to the monosaccharide at the non-reducing end of the hexose monosaccharide, disaccharide or trisaccharide. Thus, when the monosaccharide at the non-reducing end is galactose, the hexose rhamnosyltransferase will be a galactose rhamnosyltransferase.

The disaccharide or trisaccharide may comprise a rhamnose moiety at its non-reducing end.

An exemplary disaccharide may comprise a glucose at the reducing end linked to a rhamnose moiety at the non-reducing end. Other exemplary disaccharides include, but are not limited to, a diNAcBac at the reducing end linked to a rhamnose moiety at the non-reducing end, or a galactose at the reducing end linked to a rhamnose moiety at the non-reducing end.

Exemplary trisaccharides include, but are not limited to a glucose at the reducing end linked to a hexose which is linked to a rhamnose moiety at the non-reducing end, a diNAcBac at the reducing end linked to a hexose which is linked to a rhamnose moiety at the non-reducing end, or a GlcNAc at the reducing end linked to a hexose which is linked to a rhamnose moiety at the non-reducing end. Optionally, the hexose of the trisaccharide may be a rhamnose moiety or a galactose.

When reference is made to a “link” between hexoses, this will be understood to refer to a glycosidic bond. In the di or trisaccharide, the glycosidic bond between two hexoses in the di or trisaccharide may be an alpha (α) or a beta (β) glycosidic bond. The alpha bond may be an alpha-1,3 or an alpha-1,2 bond. The beta bond may be a beta-1,4 bond.

The features of the hexose monosaccharide, disaccharide and trisaccharide as described herein are also applicable to the hexose monosaccharide, disaccharide and trisaccharide, as appropriate of the streptococcal polysaccharide of the invention.

Further examples of monosaccharides, disaccharides and trisaccharides to which the rhamnose moiety can be transferred in step i) of the method and/or which comprise or consist of the hexose monosaccharide, disaccharide or trisaccharide of the streptococcal polysaccharide of the invention are provided in Example 2.

In embodiments wherein step (i) comprises transferring a rhamnose moiety to a hexose disaccharide or trisaccharide, the method may further comprise forming the hexose disaccharide or trisaccharide. The hexose disaccharide or trisaccharide may be formed using a hexosyltransferase, i.e. an enzyme capable of transferring a hexose to another hexose. For the hexose trisaccharide, if each monosaccharide of the trisaccharide is the same (for example the trisaccharide is formed of three glucoses), then one hexosyltransferase can be used to transfer each hexose to the other to form the trisaccharide. However, in embodiments where the hexose trisaccharide is formed of at least two different hexoses, then two different hexosyltransferases will be required to form the hexose trisaccharide.

When the method further comprises forming the hexose disaccharide, the hexose disaccharide may be formed using a hexose-α-1,3-hexosyltransferase or an enzymatically active fragment or variant thereof. A hexose-α-1,3-hexosyltransferase will be understood to refer to an enzyme which is capable of transferring a hexose to another hexose to form an α-1,3 bond. In the context of the present invention, bond may otherwise be used to refer to linkage. In some embodiments, the hexose disaccharide is formed using a hexose-α-1,3-galactosyltransferase. The hexose-α-1,3-galactosyltransferase may comprise or consist of a GlcNAc-α-1,3-galactosyltransferase, optionally the enzyme WbbP, or an enzymatically active fragment or variant thereof. The enzyme WbbP may be identified using the UniProtKB - Q53982 (Q53982_SHIDY). In some embodiments WbbP may comprise or consist of the amino acid sequence SEQ ID NO:42. Thus, in some embodiments the disaccharide consists of a GlcNAc at its reducing end and a galactose at its non-reducing end, the two hexoses linked via an α-1,3 bond.

In some embodiments the method comprises forming the hexose disaccharide using the enzyme WbbP, or an enzymatically active fragment or variant thereof, followed by transferring a rhamnose moiety to the hexose disaccharide using the enzyme WbbR, or an enzymatically active fragment or variant thereof.

The hexose disaccharide may be formed using a hexose-α-1,3-rhamnosyltransferase or an enzymatically active fragment or variant thereof. For example, the hexose disaccharide may be formed using a galactose-α-1,3-rhamnosyltransferase, for example WsaD or an enzymatically active fragment or variant thereof. It will be appreciated in such embodiments that the hexose disaccharide is formed of a galactose at the reducing end and a rhamnose moiety at the non-reducing end. When the hexose disaccharide is formed using a galactose-α-1,3-rhamnosyltransferase, the enzyme WsaP optionally may also be used in the formation of the disaccharide, for example to attach a lipid to the galactose. The enzyme WsaP is derived from Geobacillus stearothermophilus. WsaP may be identified using the UniProtKB - Q7BG44 (Q7BG44_GEOSE). In some embodiments the WsaP enzyme comprises or consists of SEQ ID NO:43.

Enzymatically active fragments or variants of WsaP may be derived from other Bacilli strains, for example Brevibacillus species and Paenibacillus species. The enzymatically active fragments or variants of WsaP may have at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98% or at least 99% amino acid identity to WsaP.
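As an illustration of how such an identity threshold might be checked, the following Python sketch computes percent identity from two sequences that have already been aligned. The function names, the toy sequences and the choice to count only ungapped alignment columns are assumptions made for illustration; the specification does not prescribe an alignment tool or a particular identity formula.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and use '-'
    as the gap character. Other conventions (e.g. dividing by the full
    alignment length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """True if the pair reaches 'at least threshold_percent' identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy, hypothetical sequences (not WsaP): 6 matches over 7 ungapped columns.
reference = "MKT-LLVAG"
candidate = "MKSALLV-G"
identity = percent_identity(reference, candidate)
```

With these toy inputs the pair would satisfy an "at least 85%" threshold but not an "at least 90%" threshold.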

The hexose disaccharide may be extended using a hexose-α-1,2-hexosyltransferase or an enzymatically active fragment or variant thereof to form a trisaccharide or tetrasaccharide prior to further extension from the rhamnose moiety at the non-reducing end of the trisaccharide or tetrasaccharide using a heterologous bacterial enzyme GacC and/or GacG or an enzymatically active homologue, variant or fragment thereof. Exemplary hexose-α-1,2-hexosyltransferases may include, but not be limited to, WsaC and WsaE. WsaC may be identified by the UniProtKB - Q7BG54 (Q7BG54_GEOSE). Optionally, WsaC comprises or consists of SEQ ID NO:44. WsaE may be identified by the UniProtKB - Q7BG51 (Q7BG51_GEOSE). Optionally, WsaE may comprise or consist of SEQ ID NO:45.

When the method further comprises forming the hexose trisaccharide, two monosaccharides may be linked together as described for the disaccharide, followed by the transfer of a further hexose to the non-reducing end of the disaccharide using an additional hexosyltransferase. The additional hexosyltransferase may comprise hexose-rhamnosyltransferases, such that a rhamnose moiety is transferred to the non-reducing end. Suitable hexose-rhamnosyltransferases may include any of the hexose-rhamnosyltransferases described herein. Suitable hexose-rhamnosyltransferases may include a rhamnose-α-1,3-rhamnosyltransferase, for example the enzyme WbbQ or WsaC, or an enzymatically active variant or fragment thereof. WbbQ may be identified using the UniProtKB - A0A090NIC3 (A0A090NIC3_SHIDY). In some embodiments WbbQ comprises or consists of SEQ ID NO:46.

In some embodiments the hexose trisaccharide is formed using a rhamnose-α-1,3-rhamnosyltransferase which is not GacC. Further information regarding exemplary hexosyltransferases for use in the present invention is provided in the Examples.

The hexose monosaccharide, disaccharide or trisaccharide to which the rhamnose moiety is transferred may be linked to a lipid. Thus, step i) may comprise transferring a rhamnose moiety to a lipid-linked hexose monosaccharide, disaccharide or trisaccharide. The link between the hexose monosaccharide, disaccharide or trisaccharide and the lipid may comprise an undecaprenyl-diphosphate.

The method may further comprise a step (step (iii)) of conjugating the rhamnose polysaccharide to an acceptor molecule using an O-oligosaccharyltransferase capable of recognising the hexose monosaccharide at the reducing end of the rhamnose polysaccharide to form a rhamnose glycoconjugate.

O-oligosaccharyltransferases are enzymes used to catalyse the transfer of a carbohydrate moiety to a target protein, in a process known as protein glycosylation. Protein glycosylation is the process of covalently attaching carbohydrate moieties, i.e. a polysaccharide, to a protein substrate. O-oligosaccharyltransferases function by cleaving a phosphate-monosaccharide bond at a reducing end of a polysaccharide. To be capable of interacting with the substrate, the O-oligosaccharyltransferase must be capable of recognising the first two monosaccharides after the phosphate bond. The substrate may otherwise be referred to as an acceptor. Thus, the acceptor molecule may comprise a peptide or a protein. This results in the formation of a glycoconjugate comprising the rhamnose polysaccharide of the invention. Such glycoconjugates are particularly useful as antigens, which can be used in immunogenic compositions or vaccines. In addition, when the method is performed in a bacterium, the process of glycosylation leads to the presentation of the glycoconjugate on the surface of the bacterium. This enables the glycoconjugate to be isolated from the bacterium for further use, or alternatively enables the whole bacterium to be used as an antigen, which can be used in an immunogenic composition or vaccine.

In some embodiments the O-oligosaccharyltransferase is capable of recognising a glucose or glucose derivative. In such embodiments the hexose monosaccharide at the reducing end of the rhamnose polysaccharide will be a glucose or a glucose derivative, such as N-acetyl glucosamine (GlcNAc).

The O-oligosaccharyltransferase may comprise PglB, PglL, PglS or WsaB or an enzymatically active homologue, fragment or variant thereof. The PglB enzyme may be derived from a Campylobacter species, for example Campylobacter jejuni or Campylobacter lari. Without wishing to be bound by theory, it is believed that the PglB enzyme is capable of recognising any hexose except for glucose.

The PglL enzyme may be derived from Neisseria meningitidis. It is believed that the PglL enzyme is capable of recognising any hexose except for glucose.

The PglS enzyme may be derived from Acinetobacter species. It is believed that the PglS enzyme is capable of recognising glucose.

The WsaB enzyme is derived from Geobacillus stearothermophilus. Enzymatically active variants of the WsaB enzyme can be derived from other Geobacillus species.

In some embodiments, the O-oligosaccharyltransferase is derived from a bacterial species heterologous to the bacteria in which the method is performed.

The method may further comprise an additional step of purifying the rhamnose glycoconjugate. Purifying may comprise high performance liquid chromatography (HPLC), for example recycling-HPLC, affinity or size exclusion chromatography. Other suitable methods of purification will be known to the skilled person.

It will be appreciated that the method can be carried out at an industrial scale. As the skilled person will be aware, the bacteria in which the method can be performed are grown in liquid media. Such liquid media comprising the bacteria can be used to fill an industrial scale bioreactor, for example at a volume of at least 50, 100 or 1000 litres. This advantageously results in the synthesis of a substantial amount of the polysaccharide product of the invention. A commonly used liquid medium is Luria Broth, which may otherwise be referred to as Lysogeny Broth. Other liquid media will be known to the skilled person.

When the method is performed in bacteria, the method may be a fed-batch method. “Fed batch” is a term familiar to a person skilled in the art. Nevertheless, for the purposes of clarity, “fed batch” will be understood to refer to a method of synthesis in which nutrients are supplied to the bacteria via the liquid media during cultivation. Suitable nutrients will be known to the skilled person. Some exemplary, but non-limiting nutrients may include a rhamnose moiety, a hexose other than a rhamnose moiety and/or divalent cations including, but not limited to magnesium and/or manganese.

In some embodiments the rhamnose moiety comprises rhamnose. Rhamnose may be supplied to the liquid media in the D or the L isoform, preferably the L isoform.

Which hexose other than a rhamnose moiety is supplied to the liquid media depends on the composition of the rhamnose polysaccharide produced by the method. If the hexose monosaccharide, disaccharide or trisaccharide to which the rhamnose moiety is transferred comprises glucose, then the skilled person will appreciate that a suitable nutrient to be supplied to the liquid media would be glucose. If the hexose monosaccharide, disaccharide or trisaccharide comprises galactose, then the skilled person will appreciate that a suitable nutrient to be supplied to the liquid media would be galactose. Thus, the hexose for supply to the liquid media may be selected from one or more of allose, altrose, glucose, mannose, xylose, idose, galactose, talose, diacetylbacillosamine, GalNAc or GlcNAc, as appropriate.

The rhamnose moiety and/or other hexose may (each) be supplied to the liquid media at a final concentration in the liquid media of 0.1, 0.25, 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or 15 g/L. In some embodiments the rhamnose moiety and/or other hexose is (each) supplied to the liquid media at a final concentration in the liquid media of about 4 g/L.
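By way of a worked example, the Python sketch below converts a target final concentration in g/L into the mass of sugar needed for a given culture volume, and into the volume of a concentrated feed stock that would deliver that mass. The function names and the 200 g/L stock concentration are hypothetical and are not taken from the specification; the final volume is assumed to include the added stock.

```python
def grams_required(target_g_per_l: float, final_volume_l: float) -> float:
    """Mass of sugar (g) giving the target final concentration in the final volume."""
    if target_g_per_l < 0 or final_volume_l <= 0:
        raise ValueError("concentration must be >= 0 and volume must be > 0")
    return target_g_per_l * final_volume_l


def stock_volume_l(target_g_per_l: float, final_volume_l: float,
                   stock_g_per_l: float) -> float:
    """Volume of feed stock (L) that contains the required mass of sugar."""
    if stock_g_per_l <= target_g_per_l:
        raise ValueError("stock must be more concentrated than the target")
    return grams_required(target_g_per_l, final_volume_l) / stock_g_per_l


# About 4 g/L in a 100 L culture, fed from a hypothetical 200 g/L stock.
mass_g = grams_required(4.0, 100.0)            # 400 g of sugar
feed_l = stock_volume_l(4.0, 100.0, 200.0)     # 2 L of stock
```

The same arithmetic applies to any of the listed concentrations and to any reactor volume.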

The rhamnose moiety and/or other hexose may (each) be supplied to the liquid media at a final concentration in the liquid media of 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1.0 mg/mL.

In embodiments the rhamnose moiety is supplied to the liquid media as L-rhamnose. L-rhamnose may be supplied to the liquid media at a final concentration in the liquid media of 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1.0 mg/mL.

When magnesium is fed to the liquid media, this may be supplied in the form of MgSO4 or MgCl2. The MgSO4 or MgCl2 may be supplied to the liquid media to form a final concentration in the media of between 0 and 10 mM.
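The relationship between a millimolar target and the mass of salt to weigh out can be sketched as follows. The molar masses are standard values for the anhydrous salts, which is an assumption: the specification does not state which form is used, and hydrated forms have higher molar masses. The function and dictionary names are illustrative only.

```python
# Molar masses of the anhydrous salts in g/mol (assumed form; hydrates differ).
MOLAR_MASS_G_PER_MOL = {
    "MgSO4": 120.37,
    "MgCl2": 95.21,
}


def salt_mass_g(salt: str, final_mm: float, final_volume_l: float) -> float:
    """Mass (g) of salt giving final_mm (mmol/L) in final_volume_l litres."""
    if not 0 <= final_mm <= 10:
        # The text gives a range of between 0 and 10 mM.
        raise ValueError("final concentration outside the 0 to 10 mM range")
    if final_volume_l <= 0:
        raise ValueError("volume must be > 0")
    moles = (final_mm / 1000.0) * final_volume_l
    return moles * MOLAR_MASS_G_PER_MOL[salt]


mgso4_g = salt_mass_g("MgSO4", 10.0, 1.0)     # about 1.20 g for 10 mM in 1 L
mgcl2_g = salt_mass_g("MgCl2", 10.0, 100.0)   # about 95.2 g for 10 mM in 100 L
```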

Prior to step i), when the method is performed in a bacterium the method may further comprise the introduction of one or more nucleic acids encoding one or more of the enzymes described herein into the bacterium. For example, the method may further comprise the introduction of a nucleic acid encoding the O-oligosaccharyltransferase and/or a nucleic acid encoding the hexose-β-1,4-rhamnosyltransferase, the hexose-α-1,2-rhamnosyltransferase, the hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof into the bacterium. In some embodiments, the method further comprises the introduction of a nucleic acid encoding the bacterial enzyme GacC and/or the bacterial enzyme GacG or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof into the bacterium. The enzyme can then be expressed from its respective nucleic acid. The nucleic acid(s) encoding the one or more enzymes may further comprise a nucleic acid sequence encoding an endogenous or constitutive promoter and/or an artificial ribosome binding site.

Methods for the introduction of one or more nucleic acids into a bacterium are well known to those skilled in the art. One commonly used method is that of transformation. As used herein, transforming or transformation (which may otherwise be referred to as transfecting or transfection) refers to the process of introducing free nucleic acid into a cell by allowing the nucleic acid to cross the plasma membrane of the cell. By free nucleic acid, this will be understood to refer to nucleic acid which is not contained within a virus, virus-like particle or other organism; i.e. the nucleic acid is independent of an organism (although it will be appreciated that the nucleic acid may be derived or isolated from the nucleic acid sequence of an organism).

Methods of transfection typically involve altering the plasma membrane such that free nucleic acid can cross the plasma membrane (for example, electroporation methods) or complexing the free nucleic acid with a reagent that enables the free nucleic acid to cross the plasma membrane.

It will be appreciated that the nucleic acid for transfection may be in the form of a plasmid, this being a circular strand of nucleic acid. Hence, a plasmid may comprise one or more nucleic acid(s) encoding the one or more enzymes.

The nucleic acid is typically DNA, although RNA may also or alternatively be envisaged.

Transfecting may comprise polyethylenimine, poly-L-lysine, calcium phosphate, electroporation or liposomal-based methods. In embodiments, transfecting may comprise polyethylenimine, calcium phosphate or liposomal-based methods.

It will be appreciated that a variety of liposomal-based reagents are available commercially for liposomal-based methods of transfection. Liposomal methods may include, but may not be limited to lipofectamine-based transfection or FuGENE®HD (Promega Corporation, Wisconsin, USA) -based transfection.

Further information regarding transformation/transfection techniques may be found in Current Protocols in Molecular Biology (2019) which is incorporated herein by reference.

The plasmid may further comprise appropriate regulatory sequences, including promoter sequences, terminator fragments, enhancer sequences, marker genes and/or other sequences. For further details see, for example, Sambrook & Russell, Molecular Cloning: A Laboratory Manual: 3rd edition.

The plasmid may be further engineered to contain regulatory sequences that act as enhancer and promoter regions and lead to efficient transcription of the fusion protein sequence carried on the construct. Many parts of the regulatory unit are located upstream of the coding sequence of the heterologous gene and are operably linked thereto. The regulatory sequences can direct constitutive or inducible expression of the heterologous coding sequence. Such regulatory sequences are especially suitable if expression is wanted to occur in a time specific manner. Expression may be induced by supplying the liquid media with an inducer. The inducer may comprise or consist of arabinose, IPTG or rhamnose. Regulatory sequences which can direct inducible expression when exposed to arabinose, IPTG or rhamnose will be known to the skilled person.

Arabinose may be supplied to the liquid media at a final concentration in the liquid media of 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 g/L. Optionally, arabinose is supplied to the liquid media at a concentration of about 2 g/L.

IPTG may be supplied to the liquid media at a final concentration in the liquid media of 0.1 to 5 mM. In some embodiments IPTG is supplied to the liquid media at a final concentration in the liquid media of 0.1 to 2 mM, preferably at a concentration of about 1 mM.

L-rhamnose may be supplied to the liquid media at a final concentration of 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1.0 mg/mL as an inducer.

Also provided is a product obtainable using the method according to the first aspect. A product obtainable by the method according to the first aspect is especially pure and homogenous due to its synthetic method of production. The product of this invention is therefore ideally suited to commercial use, for example for the production on a large scale for use as an antigen or for use in research applications.

According to a third aspect there is provided a synthetic streptococcal polysaccharide, the polysaccharide having a non-reducing end comprising a linear chain of rhamnose moieties and a reducing end comprising a hexose monosaccharide, disaccharide or trisaccharide, the hexose monosaccharide, disaccharide or trisaccharide being as described in relation to the method aspect. The polysaccharide comprises an α-1,3 bond or an α-1,2 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties, or the polysaccharide comprises a β-1,4 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties and the hexose monosaccharide, disaccharide or trisaccharide does not comprise N-acetylglucosamine.
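The linkage condition of the third aspect can be read as a simple logical test, sketched below in Python. This is a simplified illustration rather than a construction of the claim: the class, the field names and the sugar labels are hypothetical, and only the junction between the reducing-end unit and the rhamnose chain is modelled.

```python
from dataclasses import dataclass
from typing import FrozenSet

ALPHA_1_3 = "alpha-1,3"
ALPHA_1_2 = "alpha-1,2"
BETA_1_4 = "beta-1,4"


@dataclass(frozen=True)
class Polysaccharide:
    # Bond between the reducing-end mono/di/trisaccharide and the rhamnose chain.
    junction_linkage: str
    # Sugars present in the reducing-end mono/di/trisaccharide.
    reducing_end_sugars: FrozenSet[str]


def satisfies_linkage_condition(ps: Polysaccharide) -> bool:
    """alpha-1,3 or alpha-1,2 always qualifies; beta-1,4 only without GlcNAc."""
    if ps.junction_linkage in (ALPHA_1_3, ALPHA_1_2):
        return True
    if ps.junction_linkage == BETA_1_4:
        return "GlcNAc" not in ps.reducing_end_sugars
    return False


# A beta-1,4 junction on GlcNAc is excluded; the same bond on glucose is not.
glcnac_beta = Polysaccharide(BETA_1_4, frozenset({"GlcNAc"}))
glucose_beta = Polysaccharide(BETA_1_4, frozenset({"Glc"}))
glcnac_alpha = Polysaccharide(ALPHA_1_3, frozenset({"GlcNAc"}))
```

Under this reading, a GlcNAc-containing reducing end is excluded only when the junction is a β-1,4 bond.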

As the inventors have found, the naturally occurring GAC from S. pyogenes comprises a GlcNAc (N-acetylglucosamine) monosaccharide linked by a β-1,4 glycosidic bond to a linear chain of rhamnose monosaccharides. By altering this natural composition of the reducing end sugars, the inventors have generated a synthetic polysaccharide which retains the chemical composition and antigenic capacity of the alpha-1,2-alpha-1,3 rhamnose disaccharide repeat units of GAC, while enabling production of the polysaccharide at an industrial scale and at high levels of purity and tightly regulated size distribution to increase product length homogeneity.

Thus, typically, the polysaccharide comprises a polysaccharide or a fragment or variant thereof selected from the group consisting of a Group A, Group B, Group C and Group G carbohydrate.

In some embodiments the polysaccharide comprises an α-1,3 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties. The hexose monosaccharide, disaccharide or trisaccharide may comprise N-acetylglucosamine, N,N’-diacetylbacillosamine, glucose or galactose.

In some embodiments the polysaccharide comprises an α-1,2 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties. The hexose may comprise galactose. In some embodiments the polysaccharide comprises a β-1,4 bond between the hexose monosaccharide, disaccharide or trisaccharide and the linear chain of rhamnose moieties and the hexose comprises glucose.

According to a fourth aspect there is provided a streptococcal rhamnose glycoconjugate comprising the streptococcal polysaccharide according to the third aspect conjugated to an acceptor. Glycoconjugates have strong antigenic potential and so rhamnose glycoconjugates of the invention have particular utility in raising an immune response, for example as part of or as an immunogenic composition or vaccine.

In embodiments the polysaccharide is conjugated to the acceptor at the reducing end of the polysaccharide. The acceptor may comprise a peptide or a protein.

In some embodiments the streptococcal rhamnose glycoconjugate is expressed on the surface of a bacterial host cell, optionally a gram-negative bacterium such as E. coli. Thus, the invention also encompasses a bacterial host cell comprising the streptococcal rhamnose glycoconjugate of the fourth aspect on its cell surface. Conveniently, expression on the cell surface of the bacterial host cell enables ease of isolation of the glycoconjugate. Even more conveniently, this means that the bacterial host cell which comprises the streptococcal rhamnose glycoconjugate on its cell surface can be used as a component of or as an immunogenic composition or vaccine without requiring isolation of the glycoconjugate from the bacterial host cell. This reduces the time and cost necessary to produce the glycoconjugate for downstream use as an immunogenic composition or vaccine.

Thus, according to a fifth aspect there is provided a bacterial host cell comprising a hexose-β-1,4-rhamnosyltransferase, a hexose-α-1,2-rhamnosyltransferase or a hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof and the heterologous bacterial enzyme GacC and/or GacG or an enzymatically active homologue, variant or fragment thereof as described herein.

The bacterial host cell may be heterologous to the species from which the hexose-β-1,4-rhamnosyltransferase, the hexose-α-1,2-rhamnosyltransferase or the hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof is derived. Optionally, the bacterial host cell is a gram-negative bacterium such as E. coli. The bacterial host cell may comprise the enzymes described herein and/or the nucleic acid sequences encoding the enzymes.

According to a sixth aspect there is provided an immunogenic composition or vaccine comprising the rhamnose polysaccharide of the second or third aspect or the streptococcal glycoconjugate according to the fourth aspect. The immunogenic composition or vaccine may further comprise a pharmaceutically acceptable and/or sterile excipient, carrier and/or diluent.

In some embodiments the immunogenic composition or vaccine further comprises an antigen, polypeptide and/or adjuvant.

The composition may further comprise a pharmaceutically acceptable carrier, diluent or excipient. A “pharmaceutically acceptable carrier” as referred to herein is any physiological vehicle known to those of ordinary skill in the art useful in formulating pharmaceutical compositions. A “diluent” as referred to herein is any substance known to those of ordinary skill in the art useful in diluting agents for use in pharmaceutical compositions. The agent may be mixed with, or dissolved, suspended or dispersed in the carrier, diluent or excipient.

The composition may be in the form of a capsule, tablet, liquid, ointment, cream, gel, hydrogel, aerosol, spray, micelle, transdermal patch, liposome or any other suitable form that may be administered to an animal suffering from, or at risk of developing a disease, condition or infection with a streptococcal aetiology.

The compositions and/or vaccines of this invention may be formulated for oral, topical (including dermal and sublingual), intramammary, parenteral (including subcutaneous, intradermal, intramuscular and intravenous), transdermal and/or mucosal administration. In embodiments the compositions and vaccines of this invention may be formulated for parenteral administration, optionally subcutaneous, intradermal, intramuscular and/or intravenous administration.

There is also provided the rhamnose polysaccharide of the second or third aspect, the streptococcal glycoconjugate according to the fourth aspect, or the immunogenic composition or vaccine according to the sixth aspect for use in raising an immune response in an animal or for use in treating or preventing a disease, condition or infection with a streptococcal aetiology.

The animal may be any mammalian subject, for example a dog, cat, rat, mouse, human, sheep, goat, donkey, horse, cow, pig and/or chicken. In embodiments, the animal is an ovine animal, a caprine animal, an equine animal, a porcine animal, a bovine animal or a human. In embodiments, the animal is an ovine animal. By “ovine animal”, this will be understood to include sheep.

The skilled person will appreciate that the term “caprine” includes goats, while “bovine” includes cattle. Equine is a term that will be understood to include horses. As used herein, the term “porcine” includes pigs.

An immune response which contributes to an animal’s ability to resolve an infection/infestation and/or which helps reduce the symptoms associated with an infection/infestation may be referred to as a “protective response”. In the context of this invention, the immune responses raised through exploitation of the rhamnose polysaccharides described herein may be referred to as “protective” immune responses. The term “protective” immune response may embrace any immune response which: (i) facilitates or effects a reduction in host pathogen burden; (ii) reduces one or more of the effects or symptoms of an infection/infestation; and/or (iii) prevents, reduces or limits the occurrence of further (subsequent/secondary) infections.

Thus, a protective immune response may prevent an animal from becoming infected/infested with a particular pathogen and/or from developing a particular disease or condition.

An “immune response” may be regarded as any response which elicits antibody (for example IgA, IgM and/or IgG or any other relevant isotype) responses and/or cytokine or cell mediated immune responses. The immune response may be targeted to the rhamnose polysaccharide of the invention. For example, the immune response may comprise antibodies which have affinity for epitopes of or the entire rhamnose polysaccharide.

Also provided is a method of treating an animal having a disease, condition or infection with a streptococcal aetiology, the method comprising administering to the animal a therapeutically effective amount of the rhamnose polysaccharide of the second or third aspect, the streptococcal glycoconjugate according to the fourth aspect, or the immunogenic composition or vaccine according to the sixth aspect.

A therapeutically effective amount will be understood to refer to an amount sufficient to eliminate, reduce or prevent a disease, condition or infection with a streptococcal aetiology.

The rhamnose polysaccharide, glycoconjugate or the immunogenic composition or vaccine may be administered as a single dose or as multiple doses. Multiple doses may be administered in a single day (e.g. 2, 3 or 4 doses at intervals of e.g. 3, 6 or 8 hours). The agent may be administered on a regular basis (e.g. daily, every other day, or weekly) over a period of days, weeks or months, as appropriate.

It will be appreciated that optimal doses to be administered can be determined by those skilled in the art, and will vary depending on the particular agent in use, the strength of the preparation, the mode of administration and the advancement or severity of the disease, condition or infection with a streptococcal aetiology. Additional factors depending on the particular subject being treated will result in a need to adjust dosages, including subject age, weight, gender, diet, and time of administration. Known procedures, such as those conventionally employed by the pharmaceutical industry (e.g. in vivo experimentation, clinical trials, etc.), may be used to establish specific formulations for use according to the invention and precise therapeutic dosage regimes.

Also provided is a kit of parts, the kit comprising:

(i) A nucleic acid sequence encoding a hexose-β-1,4-rhamnosyltransferase, a hexose-α-1,2-rhamnosyltransferase or a hexose-α-1,3-rhamnosyltransferase, or an enzymatically active fragment or variant thereof; and

(ii) A nucleic acid sequence encoding the heterologous bacterial enzyme GacC and/or GacG or an enzymatically active homologue, variant or fragment thereof.

Suitable nucleic acid sequences for the kit of parts are as described herein in relation to the method of the invention.

In some embodiments, the kit further comprises one or more nucleic acid sequences encoding an O-oligosaccharyltransferase as described herein.

Further nucleic acid sequences which the kit may comprise may include one or more nucleic acid sequences encoding one or more of the following 12 enzymes GacA, GacD, GacE, GacF, GacH, GacI, GacJ, GacK and GacL, or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof.

In some embodiments the kit further comprises a nucleic acid sequence encoding GacA, or an enzymatically active homologue, variant or fragment thereof. In some embodiments the kit comprises a nucleic acid sequence encoding GacG, or an enzymatically active homologue, variant or fragment thereof.

In some embodiments the kit comprises nucleic acid sequences encoding GacG and GacC, or one or more enzymatically active homologue(s), variant(s) or fragment(s) thereof.

In some embodiments the kit further comprises nucleic acid sequences encoding the enzymes GacA, GacD, GacE, and GacF or one or more enzymatically active homologues, fragments or variants thereof.

The kit may further comprise one or more nucleic acid sequences encoding a reporter gene. The reporter sequence may encode a gene or peptide/protein, the expression of which can be detected by some means. Suitable reporter sequences may encode genes and/or proteins, the expression of which can be detected by, for example, optical, immunological or molecular means. Exemplary reporter sequences may encode, for example, fluorescent and/or luminescent proteins. Examples may include sequences encoding firefly luciferase (Luc, including codon-optimised forms), green fluorescent protein (GFP) and red fluorescent protein (dsRed). One or both of the nucleic acid sequences described in (i) and (ii) of the kit may comprise the reporter sequence.

The kit may optionally further comprise bacteria, for example gram-negative bacteria such as E. coli. The bacteria may be heterologous to the bacterial species from which the hexose-β-1,4-rhamnosyltransferase, the hexose-α-1,2-rhamnosyltransferase, the hexose-α-1,3-rhamnosyltransferase or enzymatically active fragment or variant thereof is derived.

It will be appreciated that the plurality of nucleic acid sequences may be provided in one or a plurality of plasmids.

All of the features described herein (including any accompanying claims, abstract and drawings) may be combined with any of the above aspects in any combination, unless otherwise indicated.

Detailed Description

The invention will now be described by way of example with reference to the following figures, which show:

Figure 1 A) shows a gene complementation strategy and map of S. pyogenes and S. mutans genes required to produce the rhamnose chain. S. mutans cluster: sccA (Smu0824), sccB (Smu0825), sccC (Smu0826), sccD (Smu0827), sccE (Smu0828), sccF (Smu0829), sccG (Smu0830). S. pyogenes cluster: gacA (M5005_Spy_0602), gacB (M5005_Spy_0603), gacC (M5005_Spy_0604), gacD (M5005_Spy_0605), gacE (M5005_Spy_0606), gacF (M5005_Spy_0607), gacG (M5005_Spy_0608). B) Bacterial complementation assay. Western blot of whole cell samples probed with anti-Group A antibody. Legends on the figure;

Figure 2 shows a western blot of whole cell samples probed with anti-GAC antibody showing the complementation of ΔsccB or ΔgacB with sccB_TTG, sccB_ATG and gacB;

Figure 3 shows a thin layer chromatography analysis of radiolabelled lipid-linked oligosaccharides extracted from E. coli cells expressing the empty vector, S. mutans SccAB-DEFG, S. pyogenes GacB or S. mutans SccB;

Figure 4 shows an in vitro assessment of GacB’s activity detected by MALDI-MS. Spectra obtained from the products of the enzymatic reaction between dTDP-Rha and: A. Acceptor 1 (C13-PP-GlcNAc). B. Acceptor 1 + GacB-GFP. C. Acceptor 1 + GacB cleaved (no GFP). D. Acceptor 2 (Phenol-O-C11-PP-GlcNAc). E. Acceptor 2 + GacB-GFP. F. Acceptor 2 + GacB cleaved (no GFP). G. Acceptor 2 + GacB-D160N-F-GFP. H. Acceptor 2 + GacB-Y182N-F-GFP;

Figure 5 shows an in vitro assessment of GacB’s specificity towards different activated nucleotide sugar donors using MALDI-MS. Spectra obtained from the products of the enzymatic reaction between GacB-GFP, acceptor 2 and either dTDP-Rha (A), UDP-Glc (B), UDP-GlcNAc (C) or UDP-Rha (D). The conversion to the product (818 m/z and 840 m/z) was observed only when dTDP-Rha was used as nucleotide sugar donor;

Figure 6 shows an in vitro assessment of GacB’s metal ion dependency via MALDI MS. Spectra obtained from the products of the enzymatic reaction between dTDP-Rha, acceptor 2 (A), and either: GacB-GFP (B), 1 mM MgCI 2 (C), 1 mM MnCI 2 (D), or EDTA (E). The conversion to the product (818 m/z and 840 m/z) was observed in all conditions where GacB- GFP was present, regardless of the addition of metal ions or the metal chelator;

Figure 7 shows A) 800 MHz 1H NMR spectra of (a) acceptor substrate 1, (b) product 1, (c) acceptor substrate 2, (d) product 2. B) Partial 2D ROESY spectrum of product 1 showing the correlations between the H1 of β-L-Rha and protons of rhamnose (R) and GlcNAc (G). The F2 cross section through H1 of Rha is shown in red. C) The chemical structures with proton numbering;

Figure 8 shows a schematic representation of the RhaPS initiation within different Streptococcus species in comparison to the capsule polysaccharide in S. pneumoniae. RhaPS biosynthesis is initiated on Und-P by GacO (green background), followed by the action of GacB (turquoise), generating the conserved core structure Und-PP-GlcNAc-Rha. Percentage of the amino acid sequence identity, positive amino acids, and gaps within the sequence compared to GacO or GacB are given below each homolog: S. mutans serotype c SccB, Streptococcus agalactiae (GBS) RfaB, Streptococcus dysgalactiae subsp. equisimilis 167 (GCS) RgpAc, Streptococcus dysgalactiae subsp. equisimilis ATCC 12394 (GGS) Rs03945. The specific carbohydrate composition extending the lipid-linked core structure of each group is depicted on the right side. Repeating units (RU) of the carbohydrates are highlighted (light pink background); symbolic representations of the sugar residues are shown in the figure legend;

Figure 9 shows (top) anti-lipid A and anti-GAC western blot of E. coli total cell lysate. WchF complementation of the ΔgacB gene cluster restores RhaPS biosynthesis in 21548 cells (lacking Und-PP-GlcNAc, inactive wecA gene), whilst GacB and the other homologous enzymes fail to initiate RhaPS biosynthesis. (Below) All gene combinations result in functional RhaPS biosynthesis in CS2775 cells (containing Und-PP-GlcNAc, functional wecA gene);

Figure 10 A) shows phylogenetic relationships amongst forty-eight partially or completely sequenced streptococcal pathogens. The tree was constructed based on a multiple sequence alignment of GacB homologs using the default neighbour-joining clustering method of Clustal Omega. The tree was plotted using the iTOL online tool. Black squares at the branches indicate species with fully sequenced genomes. B) Bar charts associated with each node indicate the percentage amino acid identity of the respective homologs to GacB (blue) or GacO (magenta);

Figure 11 Left) shows an anti-GAC western blot of total cell lysate of E. coli 21548 cells expressing the ΔgacB gene cluster and either gacB, gacB mutants or a gacB-WchF chimera. The GacB-WchF chimera complements the ΔgacB RhaPS cluster, suggesting that the N-terminal WchF domain is sufficient to alter the acceptor substrate specificity for GacB from Und-PP-GlcNAc to Und-PP-Glc. Right) Loading control - Coomassie-stained membrane after Western blotting;

Figure 12 is a schematic diagram to show the composition of the naturally occurring GAC;

Figure 13 is a schematic diagram to illustrate an embodiment of the invention;

Figure 14 is a schematic diagram to illustrate another embodiment of the invention;

Figure 15 is a schematic diagram to illustrate a further embodiment of the invention;

Figure 16 is a schematic diagram to illustrate another embodiment of the invention;

Figure 17 is a schematic diagram to illustrate embodiments of the invention;

Figure 18 is another schematic diagram to further illustrate the invention;

Figure 19 is an anti GAC Western Blot to show that WbbL can be used instead of GacB or SccB in a method according to the invention. The figure shows an anti-GAC Western blot of total E. coli lysate from cells expressing the gene cluster RmID-SccC-SccD-SccE-SccF-SccG (deltaSccB) and GacA-GacC-GacD-GacE-GacF-GacG (deltaGacB) complemented with empty plasmid controls or WbbL. Arabinose induction concentrations stated in %;

Figures 20 and 21 are images of radiolabelled lipid-linked oligosaccharides prepared in vivo;

Figure 22 shows the results from E. coli complementation studies;

Figure 23 shows the results of phylogenetic studies of the GacO, GacB and GacC enzymes from Streptococci spp.;

Figure 24 shows the functional characterisation of GacC and how GacC installs poly-rhamnose to an adaptor/stem;

Figure 25 shows assignment of proton and carbon sugar signals as obtained from 2D TOCSY and NOESY spectra and how this translates into the rhamnose polysaccharide molecule;

Figure 26 shows a Western blot image obtained from generating rhamnose polysaccharides with a WbbPQR adaptor/stem;

Figure 27 shows a schematic of rhamnose polysaccharides generated from Shigella spp. adaptor/stem and GAC repeat units; and

Figure 28 shows that rhamnose polysaccharides prepared in accordance with the present invention are capable of acting as substrates for an E. coli glycoconjugation system.

Example 1 - GacB is an α-D-GlcNAc β-1,4-L-rhamnosyltransferase

Introduction

S. pyogenes relies on different mechanisms to withstand the host’s defences (1-5). These mechanisms are supported by the synthesis of a wide array of virulence factors, amongst which is the Group A Carbohydrate (GAC), a surface polysaccharide that constitutes between 40% and 60% of the bacterial cell wall (6-9). GAC is composed of a [→3)α-Rha(1→2)α-Rha(1→] rhamnose polysaccharide (RhaPS) backbone with a β-D-GlcNAc(1→3) side chain modification on every α-1,2-linked rhamnose (9-11). Recent structural examinations and composition analysis of the GAC also suggest the presence of glycerol phosphate (GroP) (12), an observation that remained unnoticed for over fifty years (13, 14). Further, Edgar et al. demonstrated that approximately 25% of GAC side chain GlcNAcs are decorated with GroP, imparting a negative charge to this polymer that has implications for S. pyogenes biology and defence mechanisms (12, 13, 15). This feature, previously identified in other surface glycans (16, 17), provided new insight into the structural composition, biosynthesis and function of GAC.

GAC is proposed to be synthesised by twelve proteins, GacABCDEFGHIJKL, encoded in one gene cluster (i.e. MGAS5005_spy0602-0613) that has been found in all S. pyogenes strains identified so far (1, 18). Through sequencing of transposon mutant libraries, Le Breton et al. discovered that eight of these genes, gacABCDEFG and gacL, are essential for S. pyogenes survival (4, 19). This information supports the observation by van Sorge et al., who identified via insertional mutagenesis that the first three genes of the cluster (gacABC) are essential (1).

It is currently hypothesized that the GAC is formed in five consecutive steps: (i) lipid-linked acceptor initiation, (ii) [→3)α-Rha(1→2)α-Rha(1→] RhaPS backbone synthesis, (iii) membrane translocation, (iv) post-translocational chain modifications in the extracellular environment and (v) linkage to the peptidoglycan (9). The cytoplasmic pool of dTDP-rhamnose is supplied by the enzymes encoded in two separate gene clusters, rmlABC and gacA/rmlD (16).

Despite the recent findings, some pressing questions remain unanswered regarding the biosynthesis of the GAC. For example, the products of six of the twelve genes that constitute the GAC cluster (gacBCDEFG) have not yet been characterised, leaving the GAC initiation, RhaPS backbone biosynthesis and translocation steps unknown.

As a means of attaining more information on the GAC initiation step, we conducted an in-depth examination of the second enzyme encoded in the GAC gene cluster. Here we demonstrate that GacB, in disagreement with its preliminary genetic annotation and currently proposed action (8), is the first retaining rhamnosyltransferase that catalyses the transfer of L-rhamnose from dTDP-β-L-rhamnose. GacB forms a β-1,4 glycosidic bond with the lipid-linked GlcNAc-diphosphate through a metal-independent mechanism. More importantly, our research on phylogenetically related homologs from other important human pathogenic streptococci, in particular from the Lancefield groups B, C and G streptococci, reveals that the role of GacB is well conserved within the Streptococcus genus, suggesting a common first committed step for the production of RhaPS from all Lancefield groups.

Experimental procedures

Bioinformatics analysis

Alignment of protein sequences was performed using NCBI Blast Global align (https://goo.gl/vB9zmD) and ClustalOmega (https://goo.gl/8FbvYP) (49). Molecular weight predictions were obtained using the ProtParam tool at the Expasy server (http://www.expasy.org/). Topological predictions were generated using both SpOctopus (http://octopus.cbr.su.se/) and the TMHMM algorithms (www.cbs.dtu.dk/services/TMHMM/).

Secondary structure predictions were generated using either Phyre2 (https://goo.gl/zrGKJ7) or RaptorX (raptorx.uchicago.edu) homology recognition engines, and these structures were viewed and analysed using the PyMOL Molecular Graphics System (educational version 1.8, Schrödinger, LLC). The Carbohydrate Active Enzymes database (CAZy) (http://www.cazy.org/) (50) was examined to obtain information about the classification and characterization of carbohydrate active enzymes. Phylogenetic relationships were established using Clustal Omega, Clustal X and the interactive tree of life iTOL (22).

Bacterial strains and growth conditions

E. coli strains DH5α and MC1061 were used interchangeably as host strains for the propagation of recombinant plasmids and plasmid integration. E. coli CS2775, a strain lacking the Rha modification on the lipopolysaccharide, was used as the host strain to evaluate the production of RhaPS. E. coli 21548 is an Und-PP-GlcNAc deficient strain that contains a wecA deletion, serving as a negative control for the production of RhaPS. E. coli strain C43 (DE3) was used for the production of recombinant protein. All E. coli strains were grown in LB media. Unless otherwise indicated, all bacterial cultures were incubated at 37°C in a shaking incubator at 200 rpm. Where necessary, media were supplemented with one or more antibiotics to the following final concentrations: carbenicillin (Amp) at 100 µg/mL, erythromycin (Erm) at 300 µg/mL or kanamycin (Kan) at 50 µg/mL.

Molecular genetic techniques

Table 1 shows the DNA sequence of the forward and reverse oligonucleotide primer pairs used to amplify, delete, or mutagenise the genes of interest. All primers were obtained from Integrated DNA Technologies (IDT). All PCR reactions were performed using a SimpliAmp Thermal Cycler from ThermoFisher Scientific with standard procedures. Constructs were cloned using standard molecular biology procedures, including restriction enzyme digest and ligation. All constructs were validated with DNA sequencing.

Table 1

Determination of RhaPS production

50 µL of OD600-normalised overnight cultures grown at 37°C were mixed with 50 µL of 6 x SDS-loading buffer and resolved in 20% Tricine-SDS gels (29). Assessment of the RhaPS production was performed via immunoblotting on PVDF membranes following the traditional immunoblotting technique. Primary antibody: rabbit-raised anti-Streptococcus pyogenes Group A carbohydrate polyclonal antibody (Abcam, ab21034). Secondary antibody: goat-raised anti-rabbit IgG HRP conjugate (Biorad, 170-6515). Immunoreactive signals were captured using a GENESYS™ 10S UV-Vis Spectrophotometer (Thermo Scientific) after exposure to the Clarity Western ECL (Biorad).

Extraction and radiolabelling of lipid-linked oligosaccharides

Radiolabelled lipid-linked saccharides (LLS) of induced E. coli CS2775 cells bearing the selected plasmids were extracted using 1:1 CHCl3/CH3OH and water-saturated butan-1-ol (1:1 v/v) solution to determine the addition of sugar residues in vivo after glucose D-[6-3H(N)] (Perkin Elmer) supplementation (1 mCi/mL). The incorporated radioactivity was measured in a Beckman LS6000SE scintillation counter. The organic phase containing the LLSs was normalised to 0.05 µCi/µL. The samples were separated via thin layer chromatography (TLC) on a HPTLC Silica Gel 60 plate (Merck) using a C:M:AC:A:W mobile phase (180 mL chloroform + 140 mL methanol + 9 mL 1 M ammonium acetate + 9 mL 13 M ammonia solution + 23 mL distilled water), then dried and sprayed with En3Hance liquid (Perkin Elmer). Radioautography images were obtained using Carestream® Kodak® BioMax® XAR Film and MS Intensifying Screens after 5 to 10 days.

Purification of recombinantly expressed membrane associated proteins

The purification was conducted following the established protocol from Waldo et al. (30) with the following modifications. Overnight cultures of E. coli C43 (DE3) cells expressing C-terminal GFP-fusion proteins were diluted 1:100, incubated for 3 hours until OD600 = 0.6, induced with 0.5 mM IPTG and shifted to room temperature overnight, all at 200 rpm shaking. GFP expression was detected through in-gel fluorescence using a Fuji FLA-5000 laser scanner. Cloning, expression and purification of GacB-WT, GacB-D160N-GFP and GacB-Y182F-GFP: plasmids containing GFP-His8-tagged recombinant proteins were constructed as described in Table 1 into the vector pWaldo-E (30). For protein production and purification purposes, the vectors were transformed into E. coli C43 (DE3) cells and expressed as described above. The cells were fractionated using an Avestin C3 High-Pressure Homogeniser (Biopharma, UK) and spun down at 4000 x g. Further centrifugation of the supernatant at 200 000 x g for 2 h rendered 2-3 g of membrane containing the GacB-GFP proteins. Membranes were solubilised in Buffer 1 (500 mM NaCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, 2.7 mM KCl, pH 7.4, 20 mM imidazole, 0.44 M TCEP) with the addition of 1% DDM (Anatrace) for 2 h at 4°C, and bound to a 1 mL Ni-Sepharose 6 Fast Flow (GE Healthcare) column, prewashed with Buffer 1 plus 0.03% DDM. Elution was conducted using Buffer 1 supplemented with 250 mM imidazole and 0.03% DDM. Imidazole was removed using a HiPrep 26/10 desalting column (GE Healthcare) equilibrated with Buffer (PBS, 0.03% DDM, 0.4 mM TCEP). The GFP-His tag was removed with PreScission Protease cleavage in a 1:100 ratio overnight at 4°C. Cleaved GacB proteins were collected after negative IMAC. Protein identity and purity were determined by tryptic peptide mass fingerprinting and matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF), respectively (University of Dundee ‘Fingerprints’ Proteomics Facility).

Synthesis of acceptors 1 and 2

Acceptor 2 (P1-(11-phenoxyundecyl)-P2-(2-acetamido-2-deoxy-α-D-glucopyranosyl) diphosphate) was synthesised as the sodium salt from phenoxyundecyl dihydrogen phosphate and 2-acetamido-2-deoxy-3,4,6-tri-O-acetyl-α-D-glucopyranosyl dihydrogen phosphate according to the procedure by T.N. Druzhinina et al. 2010 (94). Acceptor 1 (P1-tridecyl-P2-(2-acetamido-2-deoxy-α-D-glucopyranosyl) diphosphate) was synthesised from tridecyl dihydrogen phosphate (obtained similarly to phenoxyundecyl dihydrogen phosphate) by the same procedure as described for acceptor 2.

GacB in vitro enzymatic reaction

Purified GacB-WT-GFP, GacB-D160N-GFP, GacB-Y182F-GFP and the GacB (tag-less) protein (0.15 mg/mL final concentration) were mixed in 100 µL of TBS buffer supplemented with 1 mM TDP-Rha as sugar donor and 1 mM acceptor 1 (C13-PP-GlcNAc) or 1 mM acceptor 2 (Phenol-O-C11H22-PP-GlcNAc) as acceptor substrate. The reaction was incubated for 3 h to 24 h at 30°C. The assay mixture was adjusted with the exchange of the nucleotide sugar donor to UDP-Rha or UDP-GlcNAc and with the addition of either 1 mM MgCl2, 1 mM MnCl2, or 1 mM EDTA to define the essentiality of metal dependency.

Mass Spectrometry analysis

Matrix-assisted Laser Desorption Ionization Time-of-Flight (MALDI-TOF) was used to analyse the acceptors and products of the GacB in vitro assay. 100 µL reaction samples were purified over 100 µL Sep-Pak C18 cartridges (Waters, UK), pre-equilibrated with 5% EtOH. The bound samples were washed with 800 µL H2O and 800 µL 15% EtOH, then eluted in two fractions with a) 800 µL 30% EtOH and b) 800 µL 60% EtOH. The two elution fractions were dried in a speed vac and resuspended in 20 µL 50% MeOH. 1 µL of sample was mixed with 1 µL 2,5-dihydroxybenzoic acid (DHB) matrix (15 mg/mL in 30:70 acetonitrile:0.1% TFA) and 1 µL was added to the MALDI grid. Samples were analyzed by MALDI in an Autoflex speed mass spectrometer set up in reflectron positive ion mode (Bruker, Germany).

NMR analysis

The purified GacB in vitro assay products (0.5-2 mg) were dissolved in D2O (550 µL) and measured at 300 K. The spectra were acquired on a 4-channel Avance III 800 MHz Bruker NMR spectrometer equipped with a 5 mm TCI CryoProbe™ with automated matching and tuning. 1D spectra were acquired using relaxation and acquisition times of 5 and 1.8 s, respectively. Between 32 and 512 scans were acquired using a spectral width of 11 ppm. J connectivities were established in a series of 1D and 2D TOCSY experiments with mixing times between 20 and 120 ms. Selective 1D TOCSY spectra (32) were acquired using 40 ms Gaussian pulses and a DIPSI-2 sequence (33) (γB1/2π = 10 kHz) for a spin lock of between 20 and 120 ms. The following parameters were used to acquire 2D TOCSY and ROESY experiments: 2048 and 768 complex points in t2 and t1, respectively, spectral widths of 11 and 8 ppm in F2 and F1, yielding t2 and t1 acquisition times of 116 and 60 ms, respectively. Sixteen scans were acquired for each increment using a relaxation time of 1.5 s. The overall acquisition time was 6-7 hours per experiment. A forward linear prediction to 4096 points was applied in F1. A zero filling to 4096 was applied in F2. A cosine square window function was used for apodization prior to Fourier transformation in both dimensions. The ROESY mixing time was applied in the form of a 250 ms rectangular pulse at γB1/2π = 4167 Hz. The DIPSI-2 sequence (γB1/2π = 10 kHz) was applied for a 20, 80 and 120 ms spin lock. 2D magnitude mode HMBC experiments: 2048 and 128 complex points in t2 and t1, respectively, spectral widths of 6 and 500 ppm in F2 and F1, yielding t2 and t1 acquisition times of 0.35 s and 0.6 ms, respectively. Two scans were acquired for each of 128 increments using a relaxation time of 1.2 s. The overall acquisition time was 8 minutes. A forward linear prediction to 512 points was applied in F1; zero filling to 4096 was applied in F2. A sine square window function was used for apodization prior to Fourier transformation in both dimensions.
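The relationship between point count, spectral width and acquisition time in the TOCSY/ROESY parameters can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the quoted point counts are total (real plus imaginary) points, so that the number of complex pairs is half the quoted value. Under that assumption the quoted 116 ms and 60 ms are reproduced for an 800 MHz proton frequency. The HMBC values are not covered by this sketch.

```python
def acquisition_time_s(total_points, sw_ppm, freq_mhz):
    """Acquisition time (s) for one dimension of an NMR experiment.

    Assumption (not stated in the text): total_points counts real +
    imaginary points, i.e. total_points / 2 complex pairs.
    sw_ppm * freq_mhz converts the spectral width from ppm to Hz.
    """
    if total_points <= 0 or sw_ppm <= 0 or freq_mhz <= 0:
        raise ValueError("all arguments must be positive")
    sw_hz = sw_ppm * freq_mhz
    return (total_points / 2) / sw_hz


if __name__ == "__main__":
    # 2D TOCSY/ROESY parameters quoted in the text, 800 MHz 1H frequency
    direct = acquisition_time_s(2048, 11, 800.0)
    indirect = acquisition_time_s(768, 8, 800.0)
    print(f"direct dimension:   {direct * 1000:.1f} ms")
    print(f"indirect dimension: {indirect * 1000:.1f} ms")
```

The same function can be reused for other spectral widths or field strengths by changing the arguments.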

GacC / homologous enzymes protein purification

For production of recombinant proteins, target genes (GacC, GbcC, Cps2F, SccC) were synthesized using IDT’s gBlock gene fragment synthesis service. Wild-type sequences for GacC and its homologs were PCR amplified with overhangs designed for cloning into pOPINF¹, which contains an N-terminal 6xHistidine tag for affinity purification. Cloning into pOPINF was carried out using In-Fusion™ cloning technology (Clontech). The resulting plasmids were then transformed into DH5α competent cells for propagation and extraction (miniprep kit; Qiagen). Plasmids carrying an insert were identified by size comparison to an empty control pOPINF plasmid using gel electrophoresis, and were subsequently confirmed by DNA sequencing. For insertion of point mutations, wild-type plasmids were used as templates to PCR amplify 2 overlapping fragments containing the desired point mutation. Fragments were designed to contain a minimum of a 15 bp overlap, and were cloned into pOPINF and sequence verified as for wild-type plasmids. A full list of primers used for both wild-type and mutant cloning can be found in Table A.

Sequence-verified plasmids were then transformed into C43 cells for protein expression. For activity assays, 1 L of E. coli culture typically yielded enough protein for >50 assays (1 mg/L). Cultures were grown at 37 °C with shaking at 200 RPM to an OD of 0.6-1, at which point they were transferred to 18 °C for 1 hour before induction with 0.5 mM isopropyl β-D-thiogalactopyranoside (IPTG). Cultures were left shaking at 18 °C overnight. Following centrifugation of the culture at 3000 x g, proteins were extracted in Buffer A0 (50 mM HEPES pH 7.5, 300 mM NaCl, 10% glycerol, 2 mM TCEP) supplemented with protease inhibitors, using an Avestin C3 cell disruptor according to the manufacturer’s instructions. Lysed cultures were then subjected to ultracentrifugation at 200,000 x g and the supernatant was collected. The supernatant containing the soluble proteins of interest was then purified over a Nickel-affinity (Thermofisher) column using wash Buffer A (50 mM HEPES pH 7.5, 300 mM NaCl, 10% glycerol, 2 mM TCEP, 20 mM imidazole) and elution Buffer B (50 mM HEPES pH 7.5, 300 mM NaCl, 10% glycerol, 2 mM TCEP, 400 mM imidazole) according to the manufacturer’s instructions. Elution fractions containing the target proteins were then passed over a desalting column, pre-equilibrated with Buffer A0, to remove imidazole. Protein samples were concentrated to 0.5-1 mg/ml and snap frozen in liquid nitrogen until use.

TABLE A

1. Berrow NS, Alderton D, Sainsbury S, Nettleship J, Assenberg R, Rahman N, Stuart DI, Owens RJ. A versatile ligation-independent cloning method suitable for high-throughput expression screening applications. Nucleic Acids Research. 2007 Mar 1;35(6):e45.

HPLC Assay

For in vitro enzyme analyses, 50 µL reactions were set up to include 2.5 mM synthetic lipid acceptor Ph-O-C11H22-PP-alpha-NAG, 12.5 mM TDP-L-rhamnose, 0.5 - 1.5 mM GacB-GFP, and 1.25 - 2.5 mM GacC or homolog/mutant of interest, topped up to 50 µL with TBS Buffer supplemented with 2 mM MnCl2. Reactions were incubated at 30°C and, when the desired timepoints were reached, quenched with 50 µL acetonitrile and left on ice for 15 minutes. Reactions were spin filtered at 14,000 RPM in a benchtop centrifuge to remove precipitated protein before being injected onto an XBridge BEH Amide OBD Prep column (130 Å, 5 µm, 10 x 250 mm) connected to an HPLC system fitted with a UV detector set to 270 nm (Ultimate 3000, Thermo). Samples were applied to the column at 4 ml/min using Running Buffer A (95% acetonitrile, 10 mM ammonium acetate, pH 8) and Running Buffer B (50% acetonitrile, 10 mM ammonium acetate, pH 8) over a gradient of increasing concentration of B. Increasingly polar products with additional sugar residues eluted later into the gradient, with the triple rhamnosylated GacC product typically eluting ~14 min into a 36 min run. Products purified from the HPLC were dried in a speed vacuum to remove excess acetonitrile, before being freeze-dried to remove residual water and ammonium acetate. Samples could be stored at -20 °C for structural analysis.

NMR analysis of the GacC product

For NMR analysis at the University of Dundee, HPLC purified products (0.5 - 2 mg) were resuspended in 600 µL of D2O and NMR spectra were recorded at 293 K. The spectra were acquired on a Bruker AVANCE III HD 500 MHz NMR Spectrometer equipped with a 5-mm QCPI cryoprobe. NMR spectra were recorded as described for the GacB reaction product. Spectra were analysed using Bruker TopSpin (4.0.7).

Results

GacB is required for the biosynthesis of the GAC RhaPS chain

To investigate the GacB function and to identify potential catalytic residues, we used E. coli as a heterologous expression system to study the GAC RhaPS backbone biosynthesis. We constructed two vectors carrying the homologous genes from S. pyogenes, gacACDEFG (gacA-G; ΔgacB) and gacB (Figure 1A).

The RhaPS chain is presumed to be translocated to the outer membrane in E. coli, which naturally contains rhamnose attached to the lipopolysaccharides. Thus, to avoid unspecific binding of the anti-GAC antibody, all transformations were made using an rfaS-deficient strain (20). The interruption of the rfaS gene impedes the attachment of rhamnose to the LPS on the bacterial outer membrane, rendering a strain that lacks endogenous rhamnose on its surface (20). The role of GacB was investigated using the traditional complementation strategy depicted in Figure 1.

We investigated the production of RhaPS by gacA-G from our complementation approach using immunoblots of total cell lysates (Figure 1B). If the expression of GacBCDEFG is sufficient to produce the RhaPS chain, then we should be able to detect the synthesised RhaPS using a specific anti-GAC antibody. The results showed that E. coli cells lacking the gacA-G gene cluster (empty vector) did not produce RhaPS (Figure 1, lane 2). Likewise, transformants bearing the ΔgacB or ΔsccB plasmids lost reactivity with the GAC antibody (Figure 1, lanes 3 and 5). In contrast, co-transformation of sccB + ΔsccB or gacB + ΔgacB restored the RhaPS production, underlining the essentiality of sccB and gacB for the biosynthesis of the GAC backbone (Figure 1, lanes 4 and 6).

In order to investigate if GacB and SccB are catalysing the same reaction, we tested the ability of GacB to functionally substitute SccB and vice versa by co-transforming ΔsccB + gacB and ΔgacB + sccB. In all cases, SccB and GacB were interchangeable (Figure 2). GacB’s predicted initiation codon was different from that of S. mutans SccB, with the latter using TTG instead of ATG (Figure 2). We decided to test two versions of SccB: one with TTG as the initiation codon and the other with ATG. Both versions rendered an active enzyme that could complement either ΔsccB or ΔgacB (Figure 2). Unless stated otherwise, all further work was conducted using sccB constructs with the native TTG start codon.

GacB extends a lipid-linked precursor

We investigated whether GacB is a GT that uses GlcNAc-PP-Und as an acceptor. We performed an in vivo experiment generating radiolabelled lipid-linked oligosaccharides (LLO), which were isolated from the bacterial membrane and separated via thin-layer chromatography (TLC). Based on the annotation as a rhamnosyltransferase, radiolabelled dTDP-β-L-rhamnose would be the preferred sugar donor for GacB. However, this compound is not commercially available, therefore tritiated glucose was chosen as an alternative. Inside the bacterial cell, glucose is used as a substrate to synthesise a wide array of organic components, including dTDP-L-rhamnose (25).

We hypothesised that GacB transfers an activated sugar from a (radiolabelled) nucleotide sugar donor to a membrane-bound acceptor monosaccharide-PP-Und, e.g. GlcNAc-PP-Und. Therefore, we expected a change in size of the membrane-bound acceptor, compared to the signal of the monosaccharide lipid-linked acceptor after running the samples on a TLC plate. As negative control, we used E. coli CS2775 (ΔrfaS) transformed with the empty vector. This transformant showed a signal consistent with the generation of monosaccharide-PP-Und (Figure 3, lane 1). Upon expression of either the gacB or sccB genes, we observed the accumulation of a radioactive signal that migrated more slowly on the TLC plate, suggesting a higher molecular mass for these compounds (Figure 3, lanes 3 and 4). The same shift was observed for the sccAB-DEFG (ΔsccC) construct (Figure 3, lane 2), demonstrating that sccB and gacB can glycosylate a lipid-linked precursor. Based on the literature, we assume that the upper radiolabelled band corresponds to GlcNAc-PP-Und, and the lower one to Rha-GlcNAc-PP-Und (8, 9).

GacB is a rhamnosyltransferase that transfers rhamnose from TDP-β-L-Rha onto GlcNAc-PP-lipid acceptors

The observed band shift suggested that GacB adds a monosaccharide to a lipid-linked precursor, most likely GlcNAc-PP-Und. We investigated this hypothesis using recombinantly produced and purified GacB WT and amino acid mutants (mutants D160N and Y182F). We established an in vitro assay using the predicted nucleotide sugar donor, TDP-β-L-rhamnose, and a synthetic acceptor substrate. We tested two of these synthetic substrates designed to mimic the native lipid-linked acceptor: C13H27-PP-GlcNAc (acceptor 1) or phenyl-O-C11H22-PP-GlcNAc (acceptor 2) (Figure 7C). The reactions were purified and characterised using matrix-assisted laser desorption ionisation mass spectrometry (MALDI-MS) in positive ion mode.

The MALDI-MS spectra of the enzymatic reaction (Figure 4) confirmed that GacB catalyses the addition of one rhamnose to both acceptor substrates when incubated with TDP-β-L-Rha (Figure 4B and E). Acceptor 1 possesses a molecular weight of 563 Da and is detected at both m/z = 608 [M-1H+2Na]+ and m/z = 630 [M-2H+3Na]+ (Figure 4A). GacB-GFP and GacB lacking the GFP tag modified the acceptor, resulting in one predominant peak at m/z = 776 [M-2H+3Na]+ (Figure 4B, C). In this spectrum, we can also observe an additional peak of lower intensity at m/z = 754 [M-1H+2Na]+, corresponding to the modified acceptor 1 coupled with 2 Na+ ions, instead of 3 Na+ ions. In both cases, the products are shifted by m/z = 146 compared to the unmodified acceptor, which is consistent with the addition of one rhamnose via a glycosidic linkage. The same mass shift was observed for the second acceptor; the peaks of the unmodified acceptor 2 (Figure 4D) were detected at m/z = 672 [M-1H+2Na]+ and m/z = 694 [M-2H+3Na]+, while the product peaks emerge at m/z = 818 [M-1H+2Na]+ and m/z = 840 [M-2H+3Na]+ (Figure 4E and 4F). We also tested the ability of GacB to catalyse the rhamnosylation of GlcNAc-α-1-P, but the reaction rendered no detectable product (data not shown), suggesting that the enzyme interacts not only with the GlcNAc-P, but might require the second phosphate and the lipid component to recognise the acceptor substrate.
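The peak assignments above follow from simple adduct arithmetic, which can be sketched as below. This is an illustrative check only, using nominal (integer) masses, which is a simplification that happens to match the rounded m/z values quoted in the text. The function names are hypothetical, and the neutral mass of 627 for acceptor 2 is not stated in the text but back-calculated here from the quoted peak at m/z 672.

```python
# Nominal (integer) masses; a simplification matching the rounded m/z values.
H = 1
NA = 23
RHA_RESIDUE = 146  # rhamnose (164) minus water (18) lost on glycosidic bond formation


def sodium_adduct_mz(neutral_mass, n_na):
    """m/z of the singly charged ion [M-(n-1)H+nNa]+ for neutral mass M."""
    if n_na < 1:
        raise ValueError("at least one sodium is needed for a positive ion")
    return neutral_mass - (n_na - 1) * H + n_na * NA


def expected_peaks(acceptor_mass):
    """Expected (2 Na, 3 Na) peaks for an acceptor and its mono-rhamnosylated product."""
    product_mass = acceptor_mass + RHA_RESIDUE
    return {
        "acceptor": (sodium_adduct_mz(acceptor_mass, 2), sodium_adduct_mz(acceptor_mass, 3)),
        "product": (sodium_adduct_mz(product_mass, 2), sodium_adduct_mz(product_mass, 3)),
    }


ACCEPTOR_1 = 563  # molecular weight stated in the text
ACCEPTOR_2 = 627  # assumption: back-calculated from m/z 672 = M - 1 + 46

if __name__ == "__main__":
    for name, mass in (("acceptor 1", ACCEPTOR_1), ("acceptor 2", ACCEPTOR_2)):
        print(name, expected_peaks(mass))
```

For acceptor 1 this gives 608/630 for the substrate and 754/776 for the product; for acceptor 2 it gives 672/694 and 818/840, in line with the spectra described.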

We further investigated GacB’s specificity towards the sugar-nucleotide donor. In particular, we tested if GacB is selective for thymidine-based nucleotides or tolerates uridine-based nucleotides such as UDP-Glc, UDP-GlcNAc and UDP-Rha. As shown before, in the presence of TDP-β-L-Rha, two products consistent with the incorporation of rhamnose plus either two or three sodium cations were observed in the spectrum (Figure 5A). In contrast, no product peaks were observed with UDP-α-D-Glc or UDP-α-D-GlcNAc as substrates (Figure 5B and C), while residual activity was detected for UDP-β-L-Rha (Figure 5D). These data demonstrate that GacB does not tolerate α-D-configured nucleotide sugars. Furthermore, GacB has specificity towards the deoxyribose (TDP-rhamnose) and/or requires binding of the thymine methyl group.

Finally, we assessed metal ion dependency in vitro. Compared to the control reaction (Figure 6B), we noticed no significant differences in the rhamnosylation activity of the enzyme when GacB was supplemented with MgCl2, MnCl2 or EDTA as a metal chelator (Figure 6C, D, E), indicating that GacB does not require a divalent metal ion for its activity.

Together, these data confirmed our previous conclusions drawn from the radiolabelled LLS assay (Figure 3). This is the first in vitro evidence revealing that GacB is a metal-independent rhamnosyltransferase that catalyses the initiation step in the GAC RhaPS backbone biosynthesis by transferring a single rhamnose to GlcNAc-PP-Und using TDP-β-L-Rha as the exclusive activated nucleotide sugar donor.

Investigation of GacB’s catalytic residues

We were unable to obtain diffraction-quality crystals from the detergent-extracted protein, which would ultimately have revealed detailed insights into the catalytic region. We constructed a GacB structural model based on two enzymes that belong to the GT-4 family of GTs: Bacillus anthracis’ SaBshA (PDB entry 3mbo) (72) and Corynebacterium glutamicum’s MshA (PDB ID: 3c4v) (24). SaBshA shares 15% identity in 64 out of 424 amino acids. MshA is a ‘homologous’ GT that shares 16% identical residues in a sequence stretch of 71 residues out of 446. Based on the scarce information provided by the structural models and the multiple sequence alignment described in detail below, we mutated several residues that are highly conserved in over forty pathogenic streptococci species.
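The identity percentages quoted for the two template structures are consistent with a simple ratio of identical residues to compared residues. The sketch below illustrates that arithmetic; the function name is hypothetical, and it assumes the quoted counts mean "identical residues out of residues compared", which is one reading of the text.

```python
def percent_identity(identical, compared):
    """Percentage of identical residues among the residues compared."""
    if compared <= 0:
        raise ValueError("compared must be positive")
    if not 0 <= identical <= compared:
        raise ValueError("identical must lie between 0 and compared")
    return 100.0 * identical / compared


# Counts quoted in the text for the two GT-4 family templates
TEMPLATES = {
    "SaBshA (3mbo)": (64, 424),
    "MshA (3c4v)": (71, 446),
}

if __name__ == "__main__":
    for name, (identical, compared) in TEMPLATES.items():
        print(f"{name}: {percent_identity(identical, compared):.1f}% identity")
```

Rounded to whole numbers, the two ratios correspond to the 15% and 16% given in the text. Identities this low are why the text treats the resulting structural models as providing only scarce information.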

Our in vitro E. coli system is the first one that enables the study of GacB mutant proteins, allowing the identification of those mutants that abrogate or reduce the production of the RhaPS backbone. Conducting this in S. pyogenes is not possible, since deletion of the gacB gene renders cells inviable (1, 20). We used the information available from the GT models mentioned above and the sequence alignment of multiple streptococci to select residues that might be involved in substrate binding, which tends to be conserved among GTs. Through in situ mutagenesis, we constructed nine recombinant versions of GacB containing the following amino acid substitutions: D126A, D126N, E222A, E222Q, D160A, D160N, Y182A, Y182F and K131R. The latter mutation was included as a negative control, since K131 is a conserved, predicted surface residue that is presumably not involved in catalysis and whose substitution is not expected to inactivate the enzyme.
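
The substitution notation used above (wild-type residue, 1-based position, replacement residue) can be applied to a sequence programmatically. The sketch below is a minimal, hypothetical helper, not the method used in the source. The toy sequence is a placeholder and is not the GacB sequence.

```python
import re

# Matches notations such as "D126A": wild-type residue, position, new residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, mutation: str) -> str:
    """Return a copy of `sequence` carrying the given point substitution."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation notation: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1  # notation is 1-based, Python strings are 0-based
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Placeholder sequence for illustration only (hypothetical, not GacB).
toy_sequence = "MKDLE"
toy_mutant = apply_substitution(toy_sequence, "D3A")
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors when a construct list such as D126A or Y182F is applied to a real sequence.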

We found that substitution of D160 with an asparagine led to a drastic reduction in the production of the RhaPS chain, while an alanine residue did not cause such a significant effect. This suggests that the D160 carboxyl group might be required for catalysis and that, in the alanine mutant, it can potentially be replaced by a water molecule. A more severe effect was observed with mutations of Y182. The alanine substitution of Y182 (Y182A) impeded the RhaPS backbone biosynthesis significantly, while Y182F completely inactivated GacB, suggesting an essential role for the Y182 hydroxyl group in GacB’s enzymatic activity.

We further investigated the mutants D160N and Y182F in an in vitro assay using recombinantly expressed and purified GacB-GFP fusions. The MALDI-MS analysis of the reaction products from GacB-D160N-GFP and GacB-Y182F-GFP revealed that both mutants lacked enzymatic activity in vitro (Figure 4G and H). These results support the hypothesis that the residues D160 and Y182 play a role in substrate binding or catalysis.

Finally, we created three N-terminally truncated versions of GacB in an attempt to determine whether the enzyme remains active in the absence of the residues predicted to be associated with the membrane. Our results showed that truncation of the first 22 (GacB23-385), 75 (GacB76-385) and 118 residues (GacB119-385) led to inactivation of the enzyme when assessed through the complementation assay. Their inability to complement ΔgacB suggests that the N-terminal domain is required for activity and supports the hypothesis that GacB is a membrane-associated rhamnosyltransferase.
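
The naming of the truncation constructs follows directly from the number of residues removed. The short Python sketch below shows that bookkeeping. The helper name is hypothetical, and a placeholder string of 385 residues stands in for the GacB sequence, which is not reproduced here.

```python
from typing import Tuple


def n_terminal_truncation(name: str, sequence: str, removed: int) -> Tuple[str, str]:
    """Remove the first `removed` residues; label the construct by kept range."""
    if not 0 <= removed < len(sequence):
        raise ValueError("number of removed residues is out of range")
    first_kept = removed + 1  # 1-based index of the first retained residue
    label = f"{name}{first_kept}-{len(sequence)}"
    return label, sequence[removed:]


# Placeholder of 385 residues (assumption: only the length matters here).
placeholder = "X" * 385
constructs = [n_terminal_truncation("GacB", placeholder, n) for n in (22, 75, 118)]
labels = [label for label, _ in constructs]
lengths = [len(seq) for _, seq in constructs]
```

With these inputs the labels are GacB23-385, GacB76-385 and GacB119-385, matching the construct names in the text.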

GacB is a retaining β-1,4-rhamnosyltransferase

The current gene annotation suggests that GacB is an inverting α-1,2-rhamnosyltransferase (1, 8). This annotation is incompatible with the acceptor sugar GlcNAc, since its carbon at position C2 is already decorated with the N-acetyl group. Therefore, GacB can only transfer the rhamnose onto the available hydroxyl groups on C3, C4 or C6. In addition, the GAC backbone is composed of repeating units of rhamnose connected via an α-1,3-1,2 linkage (9, 12), suggesting that GacB would be the only rhamnosyltransferase of this pathway using a retaining mechanism of action. According to the CAZy database, the GacB sequence is classified as a member of the GT-4 family, whose members are classified as retaining GTs (27). If that classification is correct for GacB, the stereochemical configuration at the anomeric centre of the sugar donor, TDP-β-L-rhamnose, should be retained in the final product. In order to elucidate whether GacB is an inverting or a retaining rhamnosyltransferase, we conducted nuclear magnetic resonance (NMR) spectroscopy on the purified reaction products 1 and 2. 1H NMR spectra were collected at 800 MHz both to establish the structural integrity of acceptors 1 and 2 (Figure 7A) and to determine the chemical structure of their products after the enzymatic reaction (Products 1 and 2). The NMR parameters were determined through one- and two-dimensional (1D and 2D) experiments, including 2D total correlation spectroscopy (TOCSY) (Figure 7B); the chemical shifts are summarised in Table 2. For both acceptors, the anomeric proton of α-D-GlcNAc appeared as a doublet of doublets with 3J(H1,H2) = 3.4 Hz and 3J(H1,P) = 7.2 Hz. Proton H2 of α-D-GlcNAc was also split by a 3J(H2,P) = 2.4 Hz coupling with P. A 2D 1H, 31P HMQC spectrum (data not shown) revealed a correlation of both of these protons with P at -13.5 ppm. Another correlation appeared between the 31P at -10.6 ppm and protons of the adjacent CH2 groups of the alkyl chain, confirming the integrity of the acceptor substrate. For acceptor 2, a typical pattern of signals of a monosubstituted benzene with integral intensities of 2:2:1 was observed.

The addition of rhamnose to both acceptor substrates was accompanied by the appearance of a characteristic signal in the anomeric region of the spectrum (4.88 ppm, H1) next to the water signal. The anomeric configuration of this monosaccharide was established in several ways. The measured 3J(H1,H2) coupling constant of 1.0 Hz indicated a β-L configuration (1.1 and 1.8 Hz reported for β-L- and α-L-Rha, respectively). A rotating-frame nuclear Overhauser effect spectroscopy (ROESY) spectrum (Figure 4B) showed spatial proximity of H1 of rhamnose with four other protons. Among these were the H2, H3 and H5 protons of rhamnose, the latter two confirming a 1,3-diaxial arrangement between H1, H3 and H5 that is indicative of a β-L-Rha configuration. Finally, a comparison of 1H chemical shifts of rhamnose with those of α-L- and β-L-rhamnopyranose (Figure 7C) showed good agreement with those of β-L-rhamnose (75), thus confirming the configuration of this ring. The fourth ROESY cross peak of H1 of rhamnose was with H4 of GlcNAc, revealing the presence of a (1→4) linkage between the two monosaccharides. This observation was further supported by a comparison of GlcNAc 1H chemical shifts of acceptor substrates and products. Here, an increased chemical shift (+0.21 ppm) was observed for H4 upon glycosylation, while the average of the absolute values of the differences between the chemical shifts of the other corresponding protons of GlcNAc was 0.03 ppm. As expected, the signals of the alkyl and aryl sidechains practically did not change in the respective acceptor-product pairs. In conclusion, 1H NMR spectroscopy revealed the formation of a β-L-Rha-(1→4)-D-GlcNAc moiety and the integrity of the product.
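
The coupling-constant comparison in the preceding paragraph amounts to choosing the reference value closest to the measurement. The Python sketch below shows only that single comparison, using the three values quoted in the text; the function name is hypothetical. It does not represent the full assignment, which in the text also rests on the ROESY contacts and the chemical shift comparison.

```python
from typing import Dict


def closest_reference(measured_hz: float, references_hz: Dict[str, float]) -> str:
    """Return the label whose reference coupling constant is nearest the measurement."""
    if not references_hz:
        raise ValueError("at least one reference value is required")
    return min(references_hz, key=lambda label: abs(references_hz[label] - measured_hz))


# Reference 3J(H1,H2) values as quoted in the text (in Hz).
reported = {"beta-L-Rha": 1.1, "alpha-L-Rha": 1.8}
assignment = closest_reference(1.0, reported)
```

Because the two reference values differ by less than 1 Hz, a nearest-value comparison of this kind is best treated as one supporting line of evidence rather than a standalone test.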

Group A, B, C and G Streptococcus share a common RhaPS initiation step

In addition to S. mutans SccB, GacB homologs with a high degree of sequence identity are found in other streptococcal species of clinical importance, such as the Streptococcus species from Group B (GBS), Group C (GCS) and Group G (GGS). All homologous enzymes are situated in the corresponding gene clusters encoding the biosynthesis of their Lancefield antigens, i.e. the Group B, C and G carbohydrate (15). The homologous gene products share 67%, 89% and 89% amino acid identity to GacB, respectively (Table 2, Figure 8). With varying degrees of evidence depending on the species, there is a general understanding of the chemical structure of the RhaPS of these streptococci (9). The currently accepted structures for GAC, GBC, GCC, GGC and SCC are summarised in Figure 8. Remarkably, none of the investigations that led to the understanding of the surface carbohydrate structures includes data describing the mechanism of action of the enzymes involved in the priming step of each RhaPS biosynthesis.

Based on the high sequence identity to GacB, we hypothesised that the carbohydrate biosynthesis of the Group A, Group B, Group C and Group G Streptococcus possesses a conserved initiation step, in which the first rhamnose residue is transferred onto the lipid-linked acceptor, forming Rha-β-1,4-GlcNAc-PP-Und. We tested the ability of the homologs from GBS, GCS and GGS (GbsB, GcsB and GgsB, respectively) to functionally substitute GacB in the production of the RhaPS chain (Figure 9). Our results show that all homologous proteins were able to restore the RhaPS backbone when their genes were co-expressed with the ΔgacB expression plasmid, suggesting that these enzymes can perform the same enzymatic reaction.

We showed that GacB requires GlcNAc-PP-Und as acceptor, but it is possible that the enzymes from GBS, GCS and GGS use a different lipid-linked acceptor substrate, such as Glc-PP-Und. Thus, to determine whether the GacB homologs require GlcNAc-PP-Und as lipid acceptor, we conducted the complementation assay using E. coli ΔwecA cells, which lack GlcNAc-PP-Und (23). As a positive control we identified S. pneumoniae WchF, a Glc-1,4-β-rhamnosyltransferase that uses exclusively Glc-PP-Und as substrate (28). As expected, GacB was unable to restore the RhaPS chain when co-transformed with the ΔgacB vector in the absence of GlcNAc-PP-Und (Figure 9A, lane 2). The GacB homologs from GBS, GCS and GGS also failed to produce the RhaPS backbone (Figure 9A, lanes 4-6), but could replace GacB function in the ΔrfaS strain (Figure 9B). Only WchF, which uses a Glc-PP-Und acceptor for the transfer of a rhamnose residue, restored the RhaPS biosynthesis in the absence of GlcNAc-PP-Und (Figure 9A, lane 3). Combined with the data from our in vitro enzymatic reactions, these results suggest that the GacB homologues from GBS, GCS and GGS are also GlcNAc-1,4-β-rhamnosyltransferases that require GlcNAc-PP-Und as membrane-bound acceptor.

Most streptococcal pathogens are predicted to have a GlcNAc-1,4-β-rhamnosyltransferase

S. pneumoniae wchF encodes a Glc-β-1,4-rhamnosyltransferase that requires Glc-PP-Und as acceptor (28). It shares 51% amino acid identity with GacB, compared to 67-89% for the homologous enzymes from GBS, GCS, GGS and S. mutans. Towards a better understanding of the conservation of GacB in the Streptococcus genus, we extended our bioinformatics analysis to search for other strains that harbour GacB homologous genes. We found 48 human/veterinary pathogenic Streptococcus species with a single GacB homolog, sharing 50 to 94% sequence identity (Table 2, Figure 10). Five of our 48 identified species showed a percentage identity equal to or lower than 51% (S. mitis, S. pneumoniae, S. oralis subsp. tigurinus, S. peroris and S. pseudopneumoniae), while all other encoded proteins presented more than 65% identity to GacB. For simplicity, we will refer to the five Streptococcus strains with low amino acid identity as the ‘low identity’ subgroup, and to the rest of the species as the ‘high identity’ subgroup.
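
The subgroup assignment described above is a simple threshold rule on percentage identity to GacB. A minimal Python sketch is given below, using only the cut-offs stated in the text (51% or lower, and above 65%). The function name and the "unassigned" label for values falling between the two cut-offs are assumptions added for completeness; the text reports no species in that range.

```python
def identity_subgroup(percent_identity_to_gacb: float) -> str:
    """Assign a GacB homolog to a subgroup by its percent identity to GacB."""
    if not 0 <= percent_identity_to_gacb <= 100:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity_to_gacb <= 51:
        return "low identity"
    if percent_identity_to_gacb > 65:
        return "high identity"
    return "unassigned"  # hypothetical label; no such species is reported


# Identity values quoted in the text for the named homologs.
quoted = {"GBS": 67, "GCS": 89, "GGS": 89, "S. pneumoniae WchF": 51}
subgroups = {name: identity_subgroup(value) for name, value in quoted.items()}
```

With the quoted values, the GBS, GCS and GGS homologs fall in the ‘high identity’ subgroup and WchF in the ‘low identity’ subgroup.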

The sequence analysis paired with the complementation assay led us to hypothesise that all GacB homologs encompassed in the ‘high identity’ subgroup possess GlcNAc-β-1,4-rhamnosyltransferase activity. In contrast, the ‘low identity’ subgroup contains S. pneumoniae WchF, a known Glc-1,4-β-rhamnosyltransferase (28). All five members of the ‘low identity’ subgroup exhibit very high sequence identity (>90%) when compared to WchF.

GacO from S. pyogenes, the WecA homolog, was shown to be responsible for the biosynthesis of GlcNAc-PP-Und (8, 9), the substrate for GacB. We therefore hypothesised that the ‘low’ and ‘high identity’ subgroups utilise different substrates, and investigated whether an equivalent discrepancy is observed when comparing the sequence identity of the GacO homologs. Within the 48 pathogenic streptococcal genomes (Table 2, Figure 10), we found that all strains from the ‘high identity’ subgroup share a gacO homologue with 63-92% sequence identity. Importantly, each genome from the ‘low identity’ subgroup contains a gene product with 30% or less sequence identity to GacO. This subgroup presents gene products that have high homology to S. pneumoniae Cps2E, which transfers Glc-1-P to P-Und to generate Glc-PP-Und (28). The S. mitis, S. oralis subsp. tigurinus, S. peroris and S. pseudopneumoniae homologues share 98% sequence identity with Cps2E.

The degree of phylogenetic conservation of GacB in the Streptococcus genus highlights the importance of this gene for the survival and pathogenesis of streptococcal pathogens. Overall, these results lead us to propose that, in those streptococcal species that have GacB homologs with a high degree of identity (>65%), these homologs are GlcNAc-β-1,4-rhamnosyltransferases that catalyse the first committed step in the biosynthesis of their surface RhaPS by transferring rhamnose from TDP-β-L-rhamnose to the membrane-bound GlcNAc-PP-Und. In contrast, we postulate that the species within the ‘low identity’ subgroup, in accordance with the function of S. pneumoniae serotype 2 WchF, contain a rhamnosyltransferase that acts on lipid-linked Glc-PP-Und.

Table 2 Sequence conservation in % for GacB and GacO homologous enzymes from 48 species of the Streptococcus genus.

GacB’s N-terminal domain encodes specificity for the GlcNAc acceptor

We performed a multiple sequence alignment of the GacB homologs from all 48 streptococcal pathogens to identify the most variable and conserved regions in the protein sequence. We observed a higher discrepancy between the ‘high identity’ and the ‘low identity’ subgroups in their N-terminal domains (Table 2). More precisely, a region of low sequence conservation is identifiable between the GacB amino acid residues 40 and 80, suggesting that this section of the domain is involved either in the GlcNAc acceptor sugar recognition or in essential protein-protein interactions.

We knew from our previous experiment that GacB cannot initiate the RhaPS biosynthesis in a wecA deletion background (Figure 9A, lane 2). Based on this information, and in order to identify residues involved in sugar acceptor recognition, we introduced mutations in the GacB amino acid sequence. The goal was to salvage the RhaPS initiation step in a wecA-deficient E. coli strain in which GacB mutants recognise a lipid-linked sugar acceptor other than GlcNAc-PP-Und.

Therefore, we investigated a structural model based on the GacB homolog from Bacillus anthracis, BaBshA (PDB entry 3mbo), which suggested that residues L128, R131 and GNT100 may potentially be involved in sugar acceptor recognition. We mutated these residues to mimic those found in WchF. In complementation assays, GacB L128H_R131L failed to complement ΔgacB in a ΔwecA background (Figure 11, lane 2). Following a sequential approach, we modified the GacB primary sequence by introducing additional amino acid substitutions that corresponded to those found in WchF: L128H_R131L_GNT100ARC and L128H_R131L_GNT100ARC_A105P. None of these mutants recognised glucose to initiate the rhamnose chain, and thus none restored GacB’s activity. Finally, we replaced the first 178 residues of GacB with the corresponding WchF amino acids (1-186). When expressed in a wecA deletion background, this WchF-GacB chimera was able to synthesise the RhaPS backbone on the exclusive acceptor substrate Glc-PP-Und (Figure 11, lane 5).

Discussion

This work sheds light on the first committed step of the GAC biosynthesis and provides insight into the function of GacB, the first metal-independent, retaining and non-processive α-D-GlcNAc-β-1,4-L-rhamnosyltransferase reported. This insight is depicted schematically in Figure 12, which shows the elucidated structure of GAC as well as the endogenous S. mutans enzymes involved in the synthesis of each section. Other enzymes from Gram-negative and Gram-positive bacteria that are involved in polysaccharide biosynthesis use lipid-linked GlcNAc as acceptor and either dTDP-L- or GDP-D-rhamnose sugar nucleotides; however, their reactions result in an α-1,3 or α-1,4 glycosidic bond (29-31). Also, the fact that the GAC backbone is composed of repeating units of rhamnose connected via an α-1,3-1,2 linkage (9, 13) suggests that GacB is the only rhamnosyltransferase of this pathway using a retaining mechanism of action.

We have also shown that streptococcal RhaPS can be synthesized in a recombinant expression system, namely E. coli, onto a different acceptor, Und-PP-Glc, using the enzyme WchF. This is depicted schematically in Figure 13. Specifically, Figure 13 demonstrates how the enzyme WchF can be used to transfer a rhamnose moiety to a glucose monosaccharide to form a disaccharide, the disaccharide having the glucose at the reducing end and the rhamnose moiety at the non-reducing end. The enzyme WchF facilitates the formation of a β-1,4 glycosidic bond between the two monosaccharides. A rhamnose polysaccharide is then generated by extension from the rhamnose moiety at the non-reducing end of the disaccharide using the bacterial enzyme GacC or its enzymatically active homologue GbcC. WchF is derived from S. pneumoniae, which is heterologous to the bacteria (S. mutans and S. agalactiae) from which GacC or GbcC are derived. In this particular embodiment, the method was carried out in E. coli, which is also a different species to the bacteria from which WchF, GacC and GbcC are derived.

This results in the formation of a synthetic streptococcal polysaccharide having a non-reducing end comprising a linear chain of rhamnose moieties and a reducing end comprising a glucose monosaccharide, the polysaccharide comprising a β-1,4 bond between the glucose and the linear chain of rhamnose moieties. As the skilled person will appreciate, this differs from the naturally occurring GAC (which is shown in Figure 12) due to the monosaccharide at the reducing end being glucose rather than GlcNAc.

Example 2

To further illustrate the invention, this Example is directed to further exemplary methods of synthesis and the rhamnose polysaccharide of the invention.

Figure 14 is another exemplary embodiment of the invention. Figure 14 shows how the enzyme WbbL, which is derived from E. coli, can be used to transfer a rhamnose moiety to a GlcNAc monosaccharide. This forms a disaccharide having the GlcNAc at its reducing end and the rhamnose moiety at the non-reducing end, with an α-1,3 glycosidic bond between the rhamnose moiety and the GlcNAc. The rhamnose polysaccharide is then generated by extension from the rhamnose moiety at the non-reducing end of the disaccharide using the bacterial enzyme GacC or its enzymatically active homologue GbcC. Since WbbL is derived from E. coli, it is derived from a bacterial species heterologous to the bacterial species from which GacC and GbcC are derived.

In this particular example, the method is performed in E. coli, although other bacteria can be envisaged for this purpose. Thus, in this particular embodiment, WbbL can be endogenous to the E. coli or it can be overexpressed in the E. coli.

This method, as Figure 14 shows, results in the generation of a synthetic streptococcal polysaccharide having a non-reducing end comprising a linear chain of rhamnose moieties and a reducing end comprising a GlcNAc monosaccharide, the polysaccharide comprising an α-1,3 bond between the GlcNAc and the linear chain of rhamnose moieties. This differs from the endogenous GAC (as shown in Figure 12), as GAC contains a β-1,4 bond between the GlcNAc and the linear chain of rhamnoses. Any other enzyme which is a hexose-α-1,3-rhamnosyltransferase could be used instead of WbbL, as shown schematically in Figure 15. Figure 15 differs from Figure 14 in that the monosaccharide is a glucose rather than a GlcNAc. Thus, the product of Figure 15 is a synthetic streptococcal polysaccharide having a non-reducing end comprising a linear chain of rhamnose moieties and a reducing end comprising a glucose monosaccharide, the polysaccharide comprising an α-1,3 bond between the glucose and the linear chain of rhamnose moieties. This differs from the endogenous GAC (shown in Figure 12) by the inclusion of the glucose and the α-1,3 bond.

Other methods of synthesis are within the scope of the present invention. Figure 16 shows such an exemplary method. In this method, a diNAcBac-α-1,3-rhamnosyltransferase is used to transfer a rhamnose moiety to a diNAcBac monosaccharide. Thus, a disaccharide is formed having the diNAcBac at its reducing end and the rhamnose moiety at the non-reducing end. The two monosaccharides are linked with an α-1,3 glycosidic bond. The rhamnose polysaccharide is then generated by extension from the rhamnose moiety at the non-reducing end of the disaccharide using the bacterial enzyme GacC or its enzymatically active homologue GbcC. The diNAcBac-α-1,3-rhamnosyltransferase is derived from a bacterial species different to the bacterial species from which GacC or its enzymatically active homologue GbcC is derived.

The method of Figure 16 leads to the generation of a synthetic streptococcal polysaccharide having a non-reducing end comprising a linear chain of rhamnose moieties and a reducing end comprising a diNAcBac monosaccharide, the polysaccharide comprising an α-1,3 bond between the diNAcBac and the linear chain of rhamnose moieties. This differs from the endogenous GAC (as shown in Figure 12), as GAC contains a β-1,4 bond between a GlcNAc and the linear chain of rhamnoses.

Figure 17 demonstrates another exemplary method and product. In this method, a disaccharide, trisaccharide or tetrasaccharide can be formed before extending from the rhamnose moiety. For the disaccharide, the galactose-α-1,2-rhamnosyltransferase WbbR is used to transfer a rhamnose moiety to a galactose monosaccharide. This forms a disaccharide having the galactose at its reducing end and the rhamnose moiety at its non-reducing end. The rhamnose polysaccharide is then generated by extending from this rhamnose moiety to form a linear chain of rhamnose moieties. In this example, extension is performed using the enzymes GacC, GacG or GbcC (see penultimate schematic of Figure 17 and top schematic). WbbR is derived from Shigella, which is a different bacterial species to the Streptococcus from which GacC, GacG or GbcC are each derived. This method leads to the production of a synthetic streptococcal polysaccharide having a non-reducing end comprising a linear chain of rhamnose moieties and a reducing end comprising a galactose monosaccharide, the polysaccharide comprising an α-1,2 bond between the galactose and the linear chain of rhamnose moieties. An alternative embodiment, as also depicted by the top and penultimate schematics of Figure 17, is the formation of a trisaccharide before extending from the rhamnose moiety. For the trisaccharide, the enzyme WbbP is used to transfer a galactose monosaccharide to a GlcNAc, thus forming an α-1,3 glycosidic bond between the two monosaccharides. The enzyme WbbR is then used as described above for the disaccharide, such that a rhamnose moiety is transferred to the galactose. After this, extension can occur as detailed for the disaccharide above.

To the left of Figure 17 is a spot blot (positive antibody blot). Each spot represents a sample from one experiment; each row represents a triplicate of the same conditions. For each experiment, the sample from the reaction was applied as a spot, and an anti-GAC antibody was used to determine whether the reaction was successful in forming the rhamnose polysaccharide. The middle row shows triplicates of samples obtained from reactions where the enzyme WbbP is used to transfer a galactose monosaccharide to a GlcNAc, followed by the enzyme WbbR and then GacG. The spot blot to the left confirms that this reaction is capable of producing the rhamnose polysaccharide of the invention.

WbbP can alternatively be used to form a disaccharide (i.e. a galactose monosaccharide at its non-reducing end linked by an α-1,3 glycosidic bond to a GlcNAc at its reducing end), following which the rhamnose polysaccharide is generated by extension from the rhamnose moiety at the non-reducing end of the disaccharide (see bottom schematic of Figure 17). The spot blot row to the left of this schematic confirms that this reaction is also capable of producing the rhamnose polysaccharide of the invention.

Optionally, one or two additional rhamnose moieties can be transferred to the rhamnose moiety linked to the galactose to form a tetra- or pentasaccharide, prior to the step of extension as detailed above. The one or two additional rhamnose moieties can be transferred using the enzyme WbbQ, followed by further extension using GacC or GbcC, as shown in the third schematic of Figure 17. The spot blot row to the left of this schematic confirms that a reaction containing WbbP, WbbR, WbbQ and GacC was successful in generating a rhamnose polysaccharide according to the present invention.

The tri-, tetra- or pentasaccharide methods result in the generation of a synthetic streptococcal polysaccharide having a non-reducing end comprising a linear chain of rhamnose moieties and a reducing end comprising a GlcNAc and a galactose, the polysaccharide comprising an α-1,2 bond between the linear chain of rhamnose moieties and the galactose and an α-1,3 bond between the galactose and the GlcNAc.
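
The stepwise routes described in this Example can be written down as ordered lists of transferase steps. The Python sketch below is a hypothetical bookkeeping aid only: the function, the tuple layout and the text notation for the glycan are assumptions introduced here, and the code does not model enzyme specificity. The enzyme names, sugars and linkages in the two routes are those stated in the text.

```python
from typing import List, Tuple

# (enzyme, sugar added, linkage formed to the previously added sugar)
Step = Tuple[str, str, str]


def build_glycan(first_sugar: str, steps: List[Step]) -> str:
    """Write the glycan as text, newest sugar on the left, by applying steps in order."""
    glycan = first_sugar
    for _enzyme, sugar, linkage in steps:
        glycan = f"{sugar}-({linkage})-{glycan}"
    return glycan


# Trisaccharide route: WbbP adds galactose to GlcNAc, then WbbR adds rhamnose.
trisaccharide_route: List[Step] = [
    ("WbbP", "Gal", "alpha-1,3"),
    ("WbbR", "Rha", "alpha-1,2"),
]
trisaccharide = build_glycan("GlcNAc", trisaccharide_route)

# WchF route of Figure 13: a single rhamnose transferred onto glucose.
wchf_disaccharide = build_glycan("Glc", [("WchF", "Rha", "beta-1,4")])
```

Keeping the enzyme name in each step makes it straightforward to list which heterologous transferases a given route combines before the chain is extended by GacC, GacG or GbcC.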

In embodiments wherein a rhamnose moiety is transferred to a disaccharide or trisaccharide, it is envisaged that any combination of hexoses may be used to form the di- or trisaccharide using alpha or beta bonds as described herein. This is depicted in Figure 18. Likewise, for the extension of the rhamnose polysaccharide from the rhamnose moiety, it is envisaged that any enzymatically active homologue of GacC, GacG, or a fragment or variant thereof, could be used, provided that α-1,2 and/or α-1,3 glycosidic bonds are formed between each pair of rhamnose moieties.

Figure 19 confirms that WbbL can be used instead of GacB or SccB in a method of the invention to produce the rhamnose polysaccharide. The figure shows an anti-GAC Western blot of total E. coli lysate from cells expressing the gene cluster RmlD-SccC-SccD-SccE-SccF-SccG (deltaSccB) and GacA-GacC-GacD-GacE-GacF-GacG (deltaGacB) complemented with empty plasmid controls or WbbL. The first column is a ladder. The second column confirms that GAC was not produced in E. coli cells having an RgpA deletion, while the third column confirms that the expression of WbbL alone in RgpA-deficient cells did not restore GAC synthesis. The third column shows the lysate from E. coli cells having an RgpA deletion but also expressing the gene cluster GacA-GacC-GacD-GacE-GacF-GacG (deltaGacB). No GAC was found in these cells. However, the fourth column shows that when WbbL is expressed in the cells of the third column, GAC is produced. The same result is observed when RgpA-deficient cells express the gene cluster RmlD-SccC-SccD-SccE-SccF-SccG (deltaSccB) together with WbbL (see duplicates of last two columns). These data confirm that WbbL can be used with heterologous enzymes from other species to produce a rhamnose polysaccharide according to the present invention.

Figure 20 confirms that GacC introduces up to five rhamnose sugars onto the product generated by GacB. Figure 20 shows radiolabelling of lipid-linked oligosaccharides (LLOS) in vivo (E. coli): a film exposure of a TLC plate with radiolabelled LLOS from E. coli CS2775 bearing gacB (lane 1) or gacBC (lane 2). Homologues of GacC can function in a similar manner. Figure 21 shows results similar to those shown in Figure 20, but using GbcC, GccC and GgcC, the homologous enzymes from Group B, C and G Streptococci. Figure 21 shows a film exposure of a TLC plate with radiolabelled LLOS from E. coli CS2775 bearing gacB and gacC (lane 1), gacB alone (lane 2), gacB and gbcC (lane 3), gacB and gccC (lane 4), or gacB and ggcC (lane 5). GacC, GbcC, GccC and GgcC are homologous enzymes from Group A, B, C and G Streptococci, and the figure shows that all transfer 3-5 rhamnose sugars onto the product of GacB.

Similarly, the inventor has shown that the GacC enzyme function is conserved amongst Streptococci and is able to complement the SccC enzyme of E. coli. Figure 22 shows:

A) Gene complementation strategy. sccC gene replaced with homologous genes gacC, gbcC, gccC, ggcC.

B) Immunoblots of whole-cell lysates for the bacterial complementation assay probed with anti-Group A antibody.

The complementation study confirms that GacC enzyme function is conserved amongst Streptococci from Group B, C and G and S. mutans.

Phylogenetic analysis of the GacO, GacB and GacC enzymes shows a high degree of similarity, and hence that function is conserved in Streptococci. Pathogenic strains are all expected to produce RhaPS with an identical adapter/stem and, as such, all are suitable for use in accordance with the present invention.

Figure 23 shows: A) Phylogenetic tree based on GacB ortholog protein sequences identified from forty-eight pathogenic streptococci. An asterisk after the species name indicates that the ortholog sequence was not retrieved from a whole sequenced genome. Sequences were aligned using the default neighbour-joining clustering method of Clustal Omega and then plotted using the iTOL online tool. B) The bar charts indicate the degree of homology in percentage to S. pyogenes GacO (red), GacB (blue) or GacC (green). The figures next to the GacO, GacB and GacC labels represent the step catalysed by S. pyogenes. The figures in the indentation at the centre of the figure are based on our current knowledge of the role of S. pneumoniae Cps2E, Cps2T (WchF) and Cps2F (James 2013).

Figure 24 shows that GacC rhamnosylates synthetic LLO substrate (GacB product) in vitro.

A) HPLC analysis showing that GacC extends a chemoenzymatic lipid-linked disaccharide generated using GacB with 3 additional rhamnose residues. The chemical linkage was subsequently analysed by NMR.

B) Chemical drawing of GacB/C reactions with in vitro acceptor substrate.

Further studies (not all data shown) by the inventors using NMR and mass spectrometry techniques confirm that GacC can add up to 4 rhamnose sugars and that GacC is an inverting alpha-1,3-rhamnosyltransferase. Figure 25 shows the full assignment of proton and carbon sugar signals. 1H assignments were based on the analysis of several F1-band-selective 2D TOCSY spectra. 13C signals were assigned using 2D 1H, 13C HSQC. Linkages were assigned using a 2D NOESY experiment. Chemical shifts for each of the sugar residues agree well with published data for 1H and 13C signals for glycopyranoses.

The inventor has further shown that the rhamnose polysaccharide in accordance with the present invention may be generated using different enzyme combinations. Figure 26 shows that the rhamnose polysaccharide according to the present invention may be generated using enzymes from Shigella dysenteriae in combination with E. coli, and from Shigella dysenteriae in combination with Streptococcus mutans. Figure 26 shows a whole-cell Western blot using anti-Group A Carbohydrate antibody. Total E. coli cell lysates were separated by SDS-PAGE. NewRhaPS are built by Shigella dysenteriae gene products combined with S. mutans / Group A Streptococcus gene products. RmlD_GacD_E_F_G plus WbbP_Q_R are sufficient to build NewRhaPS. NewRhaPS can also be built with RmlD_SccC_D_E_F_G plus WbbP_Q_R.

Based on the above evidence, it is expected that Shigella spp. can be further used in order to provide the adaptor/stem and GAC repeat units, as shown schematically in Figure 27. In a native system, the GacB and GacC enzymes install the adaptor/stem region (red box) before GacG installs the immunogenic repeat unit. As an example, the figure shows 3 alpha1,3-rhamnose sugars installed by GacC.

Replacement of the GacB/C enzymes (replacement of the GlcNAc-beta1,4-rhamnose-alpha1,3-rhamnose adaptor/stem) to generate NewRhaPS provides an alternative that maintains the immunogenic repeat unit (proposed to be introduced by GacG enzyme activity). Replacing the adaptor region (green box) with an O-Otase compatible polysaccharide/oligosaccharide is sufficient to build the immunogenic polysaccharide (alpha1,2-alpha1,3 rhamnose).

As described herein, the rhamnose polysaccharides of the present invention may be conjugated with a suitable protein and presented on the surface of a bacterium. Figure 28 shows that rhamnose polysaccharides prepared in accordance with the present invention are suitable substrates for use in an E. coli glycoconjugation system. A periplasmic expression test system was set up in accordance with the procedure described by Reglinski et al., npj Vaccines (2018) 3:53. Figure 28 shows that NewRhaPS are a compatible substrate for O-Otase (PglB) / for Protein Glycan Coupling Technology (PGCT):

Lanes 1-8: Periplasmic expression of test protein NanA (in accordance with Reglinski) +/- active/inactive NewRhaPS system.

Lanes 5 and 7 show that two different expression conditions for the NewRhaPS system are positive for NanA-NewRhaPS glycosylation.

Lane 9: GAC chemically extracted from S. pyogenes (positive control for GAC antibody).

This description should not be construed as limiting and it will be appreciated that other variants and embodiments thereof fall within the scope of the present invention.

REFERENCES

1. van Sorge, N. M., Cole, J. N., Kuipers, K., Henningham, A., Aziz, R. K., Kasirer-Friede, A., Lin, L., Berends, E. T. M., Davies, M. R., Dougan, G., Zhang, F., Dahesh, S., Shaw, L., Gin, J., Cunningham, M., Merriman, J. A., Hutter, J., Lepenies, B., Rooijakkers, S. H. M., Malley, R., Walker, M. J., Shattil, S. J., Schlievert, P. M., Choudhury, B., and Nizet, V. (2014) The Classical Lancefield Antigen of Group A Streptococcus Is a Virulence Determinant with Implications for Vaccine Design. Cell Host Microbe. 15, 729-740

2. Kristian, S. A., Datta, V., Weidenmaier, C., Kansal, R., Fedtke, I., Peschel, A., Gallo, R. L., and Nizet, V. (2005) D-alanylation of teichoic acids promotes group a streptococcus antimicrobial peptide resistance, neutrophil survival, and epithelial cell invasion. J. Bacteriol. 187, 6719-6725

3. Henningham, A., Davies, M. R., Uchiyama, S., Sorge, N. M. van, Lund, S., Chen, K. T., Walker, M. J., Cole, J. N., and Nizet, V. (2018) Virulence Role of the GlcNAc Side Chain of the Lancefield Cell Wall Carbohydrate Antigen in Non-M1-Serotype Group A Streptococcus. mBio. 9, e02294-17

4. Le Breton, Y., Belew, A. T., Freiberg, J. A., Sundar, G. S., Islam, E., Lieberman, J., Shirtliff, M. E., Tettelin, H., El-Sayed, N. M., and McIver, K. S. (2017) Genome wide discovery of novel M1T1 group A streptococcal determinants important for fitness and virulence during soft-tissue infection. PLoS Pathog. 13, e1006584

5. Shelburne, S. A., Keith, D., Horstmann, N., Sumby, P., Davenport, M. T., Graviss, E. A., Brennan, R. G., and Musser, J. M. (2008) A direct link between carbohydrate utilization and virulence in the major human pathogen group A Streptococcus. Proc. Natl. Acad. Sci. U. S. A. 105, 1698-1703

6. Lancefield, R. C. (1933) A Serological Differentiation of Human and Other Groups of Hemolytic Streptococci. J. Exp. Med. 57, 571-595

7. McCarty, M. (1958) Further studies on the chemical basis for serological specificity of group a streptococcal carbohydrate. J. Exp. Med. 108, 311-323

8. Rush, J. S., Edgar, R. J., Deng, P., Chen, J., Zhu, H., van Sorge, N. M., Morris, A. J., Korotkov, K. V., and Korotkova, N. (2017) The molecular mechanism of N-acetylglucosamine side-chain attachment to the Lancefield group A carbohydrate in Streptococcus pyogenes. J. Biol. Chem. 292, 19441-19457

9. Mistou, M.-Y., Sutcliffe, I. C., and Sorge, N. M. van (2016) Bacterial glycobiology: rhamnose-containing cell wall polysaccharides in Gram-positive bacteria. FEMS Microbiol. Rev. 40, 464-479

10. Coligan, J. E., Kindt, T. J., and Krause, R. M. (1978) Structure of the streptococcal groups A, A-variant and C carbohydrates. Immunochemistry. 15, 755-760

11. Krause, R. M., and McCarty, M. (1961) Studies on the Chemical Structure of the Streptococcal Cell Wall. J. Exp. Med. 114, 127-140

12. Edgar, R. J., Hensbergen, V. P. van, Ruda, A., Turner, A. G., Deng, P., Breton, Y. L., El-Sayed, N. M., Belew, A. T., McIver, K. S., McEwan, A. G., Morris, A. J., Lambeau, G., Walker, M. J., Rush, J. S., Korotkov, K. V., Widmalm, G., Sorge, N. M. van, and Korotkova, N. (2019) Discovery of glycerol phosphate modification on streptococcal rhamnose polysaccharides. Nat. Chem. Biol. 15, 463

13. Heymann, H., Zeleznick, L. D., Boltralik, J. J., Barkulis, S. S., and Smith, C. (1963) Biosynthesis of Streptococcal Cell Walls: A Rhamnose Polysaccharide. Science. 140, 400-401

14. Heymann, H., Manniello, J. M., and Barkulis, S. S. (1967) Structure of streptococcal cell walls. V. Phosphate esters in the walls of group A Streptococcus pyogenes. Biochem. Biophys. Res. Commun. 26, 486-491

15. van Hensbergen, V. P., Movert, E., de Maat, V., Luchtenborg, C., Le Breton, Y., Lambeau, G., Payre, C., Henningham, A., Nizet, V., van Strijp, J. A. G., Brugger, B., Carlsson, F., McIver, K. S., and van Sorge, N. M. (2018) Streptococcal Lancefield polysaccharides are critical cell wall determinants for human Group IIA secreted phospholipase A2 to exert its bactericidal effects. PLoS Pathog. 14, e1007348

16. Sewell, E. W. C., and Brown, E. D. (2014) Taking aim at wall teichoic acid synthesis: new biology and new leads for antibiotics. J. Antibiot. (Tokyo). 67, 43-51

17. Huang, D. H., Rama Krishna, N., and Pritchard, D. G. (1986) Characterization of the group A streptococcal polysaccharide by two-dimensional 1H-nuclear-magnetic-resonance spectroscopy. Carbohydr. Res. 155, 193-199

18. van der Beek, S. L., Le Breton, Y., Ferenbach, A. T., Chapman, R. N., van Aalten, D. M. F., Navratilova, I., Boons, G.-J., McIver, K. S., van Sorge, N. M., and Dorfmueller, H. C. (2015) GacA is essential for Group A Streptococcus and defines a new class of monomeric dTDP-4-dehydrorhamnose reductases (RmlD). Mol. Microbiol. 98, 946-962

19. Le Breton, Y., Belew, A. T., Valdes, K. M., Islam, E., Curry, P., Tettelin, H., Shirtliff, M. E., El-Sayed, N. M., and McIver, K. S. (2015) Essential Genes in the Core Genome of the Human Pathogen Streptococcus pyogenes. Sci. Rep. 5, 9838

20. Shibata, Y., Yamashita, Y., Ozaki, K., Nakano, Y., and Koga, T. (2002) Expression and characterization of streptococcal rgp genes required for rhamnan synthesis in Escherichia coli. Infect. Immun. 70, 2891-2898

21. Bruyere, T., Wachsmann, D., Klein, J. P., Scholler, M., and Frank, R. M. (1987) Local response in rat to liposome-associated Streptococcus mutans polysaccharide-protein conjugate. Vaccine. 5, 39-42

22. Cartee, R. T., Forsee, W. T., Bender, M. H., Ambrose, K. D., and Yother, J. (2005) CpsE from type 2 Streptococcus pneumoniae catalyzes the reversible addition of glucose-1-phosphate to a polyprenyl phosphate acceptor, initiating type 2 capsule repeat unit formation. J. Bacteriol. 187, 7425-7433

23. Ozaki, K., Shibata, Y., Yamashita, Y., Nakano, Y., Tsuda, H., and Koga, T. (2002) A novel mechanism for glucose side-chain formation in rhamnose-glucose polysaccharide synthesis. FEBS Lett. 532, 159-163

24. Vetting, M. W., Frantom, P. A., and Blanchard, J. S. (2008) Structural and enzymatic analysis of MshA from Corynebacterium glutamicum: substrate-assisted catalysis. J. Biol. Chem. 283, 15834-15844

25. Jurtshuk, P. (1996) Bacterial Metabolism in Medical Microbiology, 4th Ed. (Baron, S. ed), University of Texas Medical Branch at Galveston, Galveston (TX)

26. Parsonage, D., Newton, G. L., Holder, R. C., Wallace, B. D., Paige, C., Hamilton, C. J., Dos Santos, P. C., Redinbo, M. R., Reid, S. D., and Claiborne, A. (2010) Characterization of the N-acetyl-a-D-glucosaminyl l-malate synthase and deacetylase functions for bacillithiol biosynthesis in Bacillus anthracis. Biochemistry (Mosc.). 49, 8398-8414

27. Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P. M., and Henrissat, B. (2014) The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 42, D490-495

28. James, D. B. A., and Yother, J. (2012) Genetic and Biochemical Characterizations of Enzymes Involved in Streptococcus pneumoniae Serotype 2 Capsule Synthesis Demonstrate that Cps2T (WchF) Catalyzes the Committed Step by Addition of b1-4 Rhamnose, the Second Sugar Residue in the Repeat Unit. J. Bacteriol. 194, 6479-6489

29. Schagger, H. (2006) Tricine-SDS-PAGE. Nat. Protoc. 1, 16-22

30. Waldo, G. S., Standish, B. M., Berendzen, J., and Terwilliger, T. C. (1999) Rapid protein-folding assay using green fluorescent protein. Nat. Biotechnol. 17, 691-695

31. Druzhinina, T. N., Danilov, L. L., Torgov, V. I., Utkina, N. S., Balagurova, N. M., Veselovsky, V. V., and Chizhov, A. O. (2010) 11-Phenoxyundecyl phosphate as a 2-acetamido-2-deoxy-a-d-glucopyranosyl phosphate acceptor in O-antigen repeating unit assembly of Salmonella arizonae O:59. Carbohydr. Res. 345, 2636-2640

32. Robinson, P. T., Pham, T. N., and Uhrin, D. (2004) In phase selective excitation of overlapping multiplets by gradient-enhanced chemical shift selective filters. J. Magn. Reson. San Diego Calif 1997. 170, 97-103

33. Rucker, F. J., and Osorio, D. (2008) The effects of longitudinal chromatic aberration and a shift in the peak of the middle-wavelength sensitive cone fundamental on cone contrast. Vision Res. 48, 1929-1939

SEQUENCES

SEQ ID NO:1 GacC

MNINILLSTYNGERFLAEQIQSIQRQTVNDWTLLIRDDGSTDGTQDIIRTFVKEDKRIQW

INEGQTENLGVIKNFYTLLKHQKADVYFFSDQDDIWLDNKLEVTLLEAQKHEMTAPL LV

YTDLKVVTQHLAVCHDSMIKTQSGHANTSLLQELTENTVTGGTMMITHALAEEWTTC

DGLLMHDWYLALLASAIGKLVYLDIPTELYRQHDANVLGARTWSKRMKNWLTPHHLV

NKYWWLITSSQKQAQLLLDLPLKPNDHELVTAYVSLLDMPFTKRLATLKRYGFRKNR I

FHTFI FRSLVVTLFGYRRK

SEQ ID NO:2 GacG

MNRILLYVHFNKYNKISAHVYYQLEQMRSLFSKIVFISNSKVSHEDLKRLKNHCLIDEFL

QRKNKGFDFSAWHDGLIIMGFDKLEEFDSLTIMNDTCFGPIWEMAPYFENFEEKETV D

FWGITNNRGTKAFKEHVQSYFMTFKNQVIQNKVFQQFWQSIIEYENVQEVIQHYETQ L

TSILLNEGFSYQTVFDTRKAESSFMPHPDFSYYNPTAILKHHVPFIKVKAIDANQHI APY

LLNLIRETTNYPIDLIVSHMSQISLPDTKYLLSQKYLNCQRLAKQTCQKVAVHLHVF YVD

LLDEFLTAFENWNFHYDLFITTDSDIKRKEIKEILQRKGKTADIRVTGNRGRDIYPM LLL

KDKLSQYDYIGHFHTKKSKEADFWAGESWRKELIDMLVKPADSILSAFETDDIGIII ADIP

SFFRFNKIVNAWNEHLIAQEMMSLWRKMDVKKQIDFQAMDTFVMSYGTFVWFKYDA

LKSLFDLELTQNDIPSEPLPQNSILHAIERLLVYIAWGDSYDFRIVKNPYELTPFID NKLL

NLREDEGAHTYVNFNQMGGIKGALKYIIVGPAKAMKYIFLRLMEKLK

SEQ ID NO:3 RfbG

MHSSDQKRVAVLMATYNGECWIEEQLKSIIEQKDVDISIFISDDLSTDNTLNICEEFQLS

YPSIINILPSVNKFGGAGKNFYRLIKDVDLENYDYICFSDQDDIWYKDKIKNAIDCL VFN

NANCYSSNVIAYYPSGRKNLVDKAQSQTQFDYFFEAAGPGCTYVIKKETLIEFKKFI IN

NKNAAQDICLHDWFLYSFARTRNYSWYIDRKPTMLYRQHENNQVGANISFKAKYKRL

GLVRNKWYRKEVTKIANALADDSFVNNQLGKGYIGNLILALSFWKLRRKKADKIYIL LM

LILNIF

SEQ ID NO:4 GbcC

MKVNILMATYNGEKFLAQQIESIQKQTFKEWNLLIRDDGSSDKTCDIIRNFTAKDSRIRF

INENEHHNLGVIKSFFTLVNYEVADFYFFSDQDDVWLPEKLSVSLEAAKHKASDVPL LV

YTDLKVVNQELNILQDSMIRAQSHHANTTLLPELTENTVTGGTMMINHALAEKWFTP N

DILMHDWFLALLAASLGEIIYLDLPTQLYRQHDNNVLGARTMDKRFKILREGPKSIF TRY

WKLIHDSQKQASLIVDKYGDIMTANDLELIKCFIKIDKQPFMTRLRWLWKYGYSKNQ FK

HQVVFKWLIATNYYNKR

SEQ ID NO: 5 GccC

MNINILLSTYNGERFLAEQIQSIQKQTIKDWTLLIRDDGSTDRTPDIIREFVKQDQRIQW

INENQIENLGVIKNFYTLLKYQAADVYFFSDQDDIWLEDKLEVTLLEAQKHDLSKPL LVY

TDLKVVNQQLEITHASMIKTQSAHANTTLLQELTENTVTGGTMMINQALAKEWNTCE G

LLMHDWYLALVAAARGKLVCLDIPTELYRQHDANVLGARTWSKRMKHWLRPHQLIRK

YWWLITSSQQQAQLLLDLPLQPKDRDMVEAYVSLLTMSLTKRLATLKTYGFRKNRAF

HTLVFWSLVITLFGYRRK

SEQ ID NO:6 GqcC

MNINILLSTYNGERFLAEQIQSIQKQTIKDWTLLIRDDGSTDRTPDIIREFVKQDQRIQW

INENQIENLGVIKNFYTLLKYQAADVYFFSDQDDIWLEDKLEVTLLEAQKHDLSKPL LVY

TDLKVVNQQLEITHASMIKTQSAHANTTLLQELTENTVTGGTMMINQALAKEWNTCE G

LLMHDWYLALVAAARGKLVYLDIPTELYRQHDANVLGARTWSKRMKHWLRPHQLIRK

YWWLITSSQQQAQLLLDLPLQPKDRDMVEAYVSLLTMSLTKRLATLKTYGFRKNRAF

HTLVFWSLVITLFGYRRK

SEQ ID NO: 7 SccC

MKVNILMSTYNGQEFIAQQIQSIQKQTFENWNLLIRDDGSSDGTPKIIADFAKSDARIRF

I NADKREN FGVI KN FYTLLKYEKADYYFFSDQDDVWLPQKLELTLASVEKEN NQI PLMV

YTDLTVVDRDLQVLHDSMIKTQSHHANTSLLEELTENTVTGGTMMVNHCLAKQWKQC

YDDLIMHDWYLALLAASLGKLIYLDETTELYRQHESNVLGARTWSKRLKNWLRPHRL V

KKYWWLVTSSQQQASHLLELDLPAANKAIIRAYVTLLDQSFLNRIKWLKQYGFAKNR A

FHTFVFKTLIITKFGYRRK

SEQ ID NO:8 SucC

MKINILMSTYNGEKFLAEQIESIQKQTVTDWTLLIRDDGSSDRTPEIIQDFVAKDSRIHF

INADHRINFGVIKNFFTLLKYEEADYYFFSDQDDVWLPHKIETSLNKAKELEKNRPF LIY TDLTIVNQSLETIHESMISFQSDHANTTLLEELTENTVTGGTALINHALAELWTDDKDLL

MHDWFLALLASAMGNLVYINEATELYRQHDRNVLGARTWSKRLKTWSKPHLMLNKY

WWLIQSSQQQAQKLLDLPLSSDKRKLVEHYVTLLEKPLMTRLRDLKKYGYKKNRAFH

TFVFRMLIITKIGYRRTVKNGIIQ

SEQ ID NO: 9 GccG

MNRVLLYVHFNKYNKVSKHIYYQLEKLRPLFTTVVFISNSKVEQKELENLQKQRLIDSFI

QRENKGFDFAAWHDGMMKIGFDDLTLCDSLTIMNDTCFGPLWGMAPYFEKFDNNQS

VDFWGLTNNRKTSSFKEHIQSYFITFKQHVIQSDAFLNFWKTIKEYDDVQEVIQKYE TQ

VTTTLLEAGFNYQTVFDTREADSSFM LH PDFSYYN PTAI LQH RVPFI KVKAI DANQH ITP

YLLNMIEEETTYPVDLIISHMSQVGLPDAKYLLARKYLPFESLVTQNVPRIAVHLHV FYV

DLLNEFLEGFASWEFQYDLYITTDTQEKKEAIEKLLVQSNRHAHLYVTGNVGRDVLP M

LLLKDKLRDYDYIGHFHTKKSKEADFWAGESWRKELINMLIKPANEIVRSFENNDIG IVI

ADIPSFFRFNKIVDAWNEHLIAPEMMRLWKEMGLKKEIDFQSMDTFVMSYGTFVWFK

FDALKPLFDLDLTVDDIPKEPLPQNSILHAIERLLVYIAWDRFYDFRIVKNPYNLSP FIDN

KLLNLRESGGARTYVNFDHMGGIKGALKYIIIGPARAMKYIVKRVLKSKR

SEQ ID NO: 10 GccG Protein 1

MNRVLLYVHFNKYNKVSKHIYYQLEKLRPLFTTVVFISNSKVEQKELENLQKQRLIDSFI QRENKGFDFAAWHDGMMKIGFDDLTLCDSLTIMNDTCFGPLWGMAPYFEKFDNNQS VDFWGLTNNRKTSSFKEHIQSYFITFKQHVIQSDAFLNFWKTIKEYDDVQEVIQKYETQ VTTTLLEAGFNYQTVFDTREADSSFM LH PDFSYYN PTAI LQH RVPFI KVKAI DANQH ITP YLLNMIEEETTYPVDLIISHMSQVGLPDAKYLLARKYLPFESLVTQNVPRIAVHLHVFYV DLLNEFLEGFASWEFQYDLYITTDTQEKRKQLKNY

SEQ ID NO: 11 GccG Protein 2

MGVSVRPLYYNRYSRKKEAIEKLLVQSNRHAHLYVTGNVGRDVLPMLLLKDKLRDYD

YIGHFHTKKSKEADFWAGESWRKELINMLIKPANEIVRSFENNDIGIVIADIPSFFR FNKI

VDAWNEHLIAPEMMRLWKEMGLKKEIDFQSMDTFVMSYGTFVWFKFDALKPLFDLDL

TVDDIPKEPLPQNSILHAIERLLVYIAWDRFYDFRIVKNPYNLSPFIDNKLLNLRES GGA

RTYVNFDHMGGIKGALKYIIIGPARAMKYIVKRVLKSKR

SEQ ID NO: 12 GqcG Protein 1

MIGKIIRSYQDEGGRATLRKIRQRLQGGGHPQSAGKIDLNRIPIMPQLEDIAQADYINHP YQRPAKLDKKQLNIAWVSPPVGKGGGGHTTISRFVKYLQSQGHHITFYIYHNNTIEQS

AKEAQEIFSKAYGIEVAVDDLKNFSNQDLVFATSWETAYAVFNLKSENLHKFYFVQD F

EPIFYGVGSRYKLAEATYKFGFYGITAGKWLTHKLKDYHMDADYFNFGADTDIYKPK A

PLQKKKKIAFYARAHTERRGFELGVMALKIFKDKHPEYDIEFFGQDMSHYDIPFDFI DR

GILNKEELAAIYHESVACLVLSLTNVSLLPLELLVAGCIPVMNSGDNNTMVLGENDD IAY

AEAYPVALAEELCKAVERSDIDTYANEMSQKYDGVSWENSYRKVEEIIRREVIND

SEQ ID NO: 13 GqcG Protein 2

MTDKIKATVFIPVYNGENDHLEETLTALYTQKTDFSWNVMITDSESKDRSVAIIETFAER

YGNLQLIKLKKSDYSHGATRQMAAELSSAEYMVYLSQDAVPANEHWLAEMLKPFTIH

HDIVAVLGKQKPRIGCFPAMKYDINAVFNEQGVAGAITLWTRQEESLKGKYTKESFY S

DVCSAAPRDFLVNEIGYRSVPYSEDYEYGKDILDAGYMKAYNSDAIVEHSNDVLLSE Y

KQRIFDETYNVRRNSGVTTPISVSTVLIQFLKSSVKDAMKIVSDQDYSWKRKLYWLA V

NPLFHFEKWRGMRLANSVDMTKDNSKHSLENSKSKG

SEQ ID NO:14 SucG

MKRLLLYVHFNKYNRLSPHVLYQLKKMRPLFSNLIFISNSSLNDSDRQELLSSGLVNEV

IQRQNIGFDFAAWRDGMATVGFESLSEYDNVTIMNDTCFGPLWDMKPYFLTYEDDEE

VDFWGLTNNRQTKEFDEHIQSYFISFKKTVLSNETFLHFWRTVQDFTDVQDVIKNYE T

QVTTGLLKEGFRYKCIFNTVTADASGMLHADFSYYNPTAILKHQVPFIKVKTIDANQ SIA

PYLLQVIKNQTDYPVDLIVSHMSDIHYPDAPYLLSQKYLEKQEESDLKVSEHSIAVH LH

VFYVDLLEEFLHAFTSFKFPFDLYITTDKSEKESEIKAILDSFRVSAKIVVTGNIGR DVLP

MLKLKDELSQYDYIGHFHTKKSKEADFWAGESWRNELIDMLIKPANTIINQFEDPAI GIII

ADIPSFFRFNKIVTPLNEHLIAPEMNKLWEKMNLSKTIDFEQFDTFVMSYGTFVWFK YD

ALKPLFDLNLKDGDVPKEPLPQNSILHAVERLLIYIAWDSHFDFRIAKNNVELTPFL DNK

LLNDKSNSLPNTYVDFTYMGGIKGALKYIFIGPARAIKYIYIRTKEKIFNG

SEQ ID NO: 15 SccG

MKRLLLYVHFNKYNRVSSHVVYQLTQMRSLFSKVIFISNSQVADADVKMLREKHLIDD

FIQRQNSGFDFAAWRDGMVFVGFDELVTYDSVTTMNDTCFGPLWEMYSIYQEFETK

TTVDFWGLTNNRATKSFREHIQSYFISFKASVLRSTAFRDFWENIKEYQDVQKVIDQ Y

ETKVTTTLLDAGFQYDVVFDTTKEDASHMLHADFSYYNPTAILNHRVPFIKVKAIDN NQ

HITPYLLNDIQKNSTYPIDLIVSHMSEINYPDFSYLLGHKYVKKRERVDLKNQKVAV HLH

VFYVDLLEEFLTAFKQFHFSYDLFITTDSDDKKAEIEEILSANGQEAQVFVTGNIGR DVL

PMLKLKNYLSAYDFVGHFHTKKSKEADFWAGQSWREELIDMLVKPADNILAQLQQNP KIGLVIADMPTFFRYNKIVDAWNEHLIAPEMNTLWQKMGMTKKIDFNAFHTFVMSYGT

FVWFKYDALKPLFDLNLTDDDVPEEPLPQNSILHAIERLLIYIAWNEHYDFRISKNP VDL

TPFIDNKLLNERGNSAPNTFVDFNYMGGIKGAFKYIFIGPARAVKYILKRSLQKIKS

SEQ ID NO: 16 GacA

MLENTKILRKVFYLWQKGELMILITGSNGQLGTELRYLLDERGVDYVAVDVAEMDITNE

DKVEAVFAQVKPTLVYHCAAYTAVDAAEDEGKALNEAINVTGSENIAKACGKYGATL V

YISTDYVFDGNKPVGQEWVETDHPDPKTEYGRTKRLGELAVERYAEHFYIIRTAWVF

GNYGKNFVFTMEQLAENHSRLTVVNDQHGRPTWTRTLAEFMCYLTENQKAFGYYHL

SNDAKEDTTWYDFAKEILKDKAVEVVPVDSSAFPAKAKRPLNSTMNLDKAKATGFVI P

TWQEALKAFYQQGLKK

SEQ ID NO: 17 GacH

MIKDTFLKTNWLNISHHIILLVFGFYFSFYSLAKELVSSTAQPVNYYAHLLNVSFVGYII

SLIGLSYYLSRQVSRQLFLKTSFIVISYLIVSYWVQITQHLNDKRFDIWSLTKNQFY QFQ

ALPSLLI I LVMATLI Kl LVAYFAI EKDRFGLLGYQGNTFSVALI LAVVPINDIH LLKLIS

SRFSELVTAGNSQIALLKISGLLIVLLVIFATIIYVVLNALKHLKSNKPSFSVAATT SLF

LALVFNYTFQYGVKGDEALLGYYVFPGATLFQIVAITLVALLAYVITNRYWPTTFFL LIL

GTIISVVNDLKESMRSEPLLVTDFVWLQELGLVTSFVKKSVIVEMVVGLAICIVVAW YLH

GRVLAGKLFMSPVKRASAVLGLFIVSCSMLIPFSYEKEGKILSGLPIISALNNDNDI NWL

GFSTNARYKSLAYVWTRQVTKKIMEKPTNYSQETIASIAQKYQKLAEDINKDRKNNI AD

Q

TVIYLLSESLSDPDRVSNVTVSHDVLPNIKAIKNSTTAGLMQSDSYGGGTANMEFQT LT

SLPFYNFSSSVSVLYSEVFPKMAKPHTISEFYQGKNRIAMHPASANNFNRKTVYSNL G

FSKFLALSGSKDKFKNIENVGLLTSDKTVYNNILSLINPSESQFFSVITMQNHIPWS SDY

PEEIVAEGKNFTEEENHNLTSYARLLSFTDKETRAFLEKLTQINKPITVVFYGDHLP GLY

PDSAFNKHIENKYLTDYFIWSNGTNEKKNHPLINSSDFTAALFEHTDSKVSPYYALL TE

VLNKASVDKSPDSPEVKAIQNDLKNIQYDVTIGKGYLLKHKTFFKISR

SEQ ID NO:18 Group B RMID

MILITGANGQLGSELRHLLDERTQEYVAVDVAEMDITNAEMVDKVFEEVKPSLVYHCA

AYTAVDAAEDEGKELDFAINVTGTENVAKAAAKHDATLVYISTDYVFDGEKPVGQEW E

VDDLPDPKTEYGRTKRMGEELVEKYASKFYTIRTAWVFGNYGKNFVFTMQNLAKTHK TLTVVNDQHGRPTWTRTLAEFMTYLAENQKDFGYYHLSNDAKEDTTWYDFAVEILKD

TDVEVKPVDSSQFPAKAKRPLNSTMSLEKAKATGFVIPTWQDALKEFYKQEVKK

SEQ ID NO:19 Group C RMID

MILITGSNGQLGTELRYLLDERHVDYVAVDVAEMDITDADKVEAVFAQVKPTLVYHCA

AYTAVDAAEDEGKALNEAINVTGSENIAKACGKYGATLVYISTDYVFDGNKPVGQEW L

ETDVPDPQTEYGRTKRLGELAVEQYAEHFYIIRTAWVFGNYGKNFVFTMQQLAEKHP

RLTVVNDQHGRPTWTRTLAEFMCYLAENQKAFGYYHLSNDAKEDTTWYDFAKEILKD

KAVEVVPVDSSAFPAKAKRPLNSTMNLDKAKATGFVIPTWQEALKEFYQQDRHQ

SEQ ID NO:20 Group G RMID

MILITGSNGQLGTELRYLLDERHVDYVAVDVAEMDITDADKVEAVFAQVKPTLVYHCA

AYTAVDAAEDEGKALNEAINVTGSENIAKACGKYGATLVYISTDYVFDGNKPVGQEW L

ETDVPDPQTEYGRTKRLGELAVEQYAEHFYIIRTAWVFGNYGKNFVFTMQQLAEKHP

RLTVVNDQHGRPTWTRTLAEFMCYLAENQKAFGYYHLSNDAKEDTTWYDFAKEILKD

KAIEVVPVDSSAFPAKAKRPLNSTMNLDKAKATGFVIPTWQEALKEFYQQDRHQ

SEQ ID NO:21 RmID S. mutans

MILITGSNGQLGTELRHLLNERNEDYVAVDVAEMDITKAEKVDEVFLQVKPSLVYHCA

AYTAVDAAEDEGKELDYAINVTGTENIAKACEKYNATLVYISTDYVFDGEKPVGQEW E

VDDKPDPKTEYGRTKRLGEEAVEKYVKNFYIIRTAWVFGNYGKNFVFTMQHLAKSHN

SLTVVNDQHGRPTWTRTLAEFMTYLAENQKEYGYYHLSNDATEDTTWYDFALEILKD

TDVVVKPVDSSQFPAKAKRPLNSTMSLTKAKATGFVIPTWQEALQEFYKQDVKK

SEQ ID NO:22 RmID S. uberis

MILITGSNGQLGTELRYLLDERNVEYVAVDVAEMDITNPDMVDEVFAQVKPTLVYHCA

AYTAVDAAEDEGKALNQAINVDGTVNIAKACQKYNATLVYISTDYVFDGTKTVGQEW L

ETDIPDPKTEYGRTKRLGEEAVEKYVDQFYIIRTAWVFGHYGKNFVFTMQNLAKTHP K

LTVVNDQYGRPTWTRTLAEFMCHLTENQKDYGYYHLSNDSKEDTSWYDFAKEILKDT

DVEVVPVDSSAFPAKAKRPLNSTMNLDKAKATGFVIPTWQEALNEFYKQEVKK

SEQ ID NO:23 GccD

MNFLTKKNRILLREMVKTDFKLRYQGSAIGYLWSILKPLMMFTIMYLVFIRFLRLGGNIP

HFPVALLLANVIWSFFSEATSMGMVSIVSRGDLLRKLNFSKHIIVFSAILGALINFL INLVV

VLIFALINGVTISNYAYFSFFLFIELVVFVVGIALLLSTVFVYYRDLAQVWEVLLQA GMYA TPIIYPITFVLEGHPLAAKILMLNPIAQMIQDFRYLLIDRANVTIWQMSTNWFYIAIPYL IPF

ILLFIGITVFKKNATKFAEII

SEQ ID NO:24 GccE

MTNNKIAVKVEHVSKSFKLPTEATKSFRTTLVNRFRGIKGFTEQQVLKDINFEVHKGDF

FGIVGRNGSGKSTLLKIISQIYVPEKGQVTVDGKMVSFIELGVGFNPELTGRENVYM N

GAMLGFTKEEINAMYDDIVDFAELHDFMNQKLKNYSSGMQVRLAFSVAIKAQGDVLI L

DEVLAVGDEAFQRKCNDYFMERKDSGKTTILVTHDMGAVKKYCNRAVLIEDGLVKAY

GEPFDVANQYSVDNTETKEELQDSEKVAISDIVQQLRVNLTSKQRITPKEIISFEVS YEV

LRDEPTYIAFSLTDMDRNIWVYNDNSRDQLVEGIGKKTISYQCHLSHLNDIKLKLEV TV

RDKDGQMLLFSTAEQSPKIIIQRDDITSDDFSALDSASGLYQRNGQWTFS

SEQ ID NO:25 GccF

MHKVSIICTNYNKAPWLGEALDSFLNQKTNFEVDIIVIDDASTDESKTILEDYQTRFPEK

ITLLFNDHNLGITKTWIKACLYAKGKYIARCDGDDYWTDDLKLQKQVDALEASKYSK W

SNTDFDFVDNKGKVLHSNVFETGYIPFTDTYEKVLALKGMTMASTWVVDAELMRFVN

QKINIETPDDTFDMQLELFQLTSLTYINDSTTVYRMTSNSDSRPADKKRMIHRIKQL LQ

TQVFYLAKYPQANIPQIANLLMEQDGKNELRIHELSCLINDLRQELNEKTEQQKERE FE

IKEIIENQSRQICELTHQYNCVINSRRWKYMSKLIDFIRRKK

SEQ ID NO:26 GqcD

MNFLTKKNRILLREMVKTDFKLRYQGSFIGHLWSILKPMLLFTIMYLVFVRFLKFDDGT PHYAVSLLLGMVTWNFFTEATNMGMLSIVSRGDLLRKINFPKEIIVISSVVGATINYFIN I LVVFAFALINGVQPSFGVFILIPLFLELFLFATGVAFILATLFVKYRDMGPIWEVMLQAG MYGTPIIYSITYIIQRGHLGIAKVMMMNPLAQIIQELRHFIVYSGATINWDIFENKFFTL IPII LSLSAFVIGYVI FKRNAKKFAEI L

SEQ ID NO:27 GqcE

MSEKKVVLSVDSVSKSFKLPTEASNSLRTSLVNYFKGIKGYTEQHVLDDISFQVEEGD

FFGIVGRNGSGKSTLLKIISKIYEPEKGTVTVDGKLVPFIELGVGFNPELTGRENVF MN

GALLGFSRDEVAAMYDDIVSFAELHDFMDQKLKNYSSGMQVRLAFSIAIKAKGDILI LD

EVLAVGDEAFQRKCFDYFAQLKREHKTVILVTHSMEQVQRFCNKAMLIDKGHHMEVG

TPLEISQIYKQLNGLNVAKESAKETENNGISLSSQFINHKDDTLTFTFDVHFEQTIE DPV

LTFTIHKDTGELLYRWVSDEEVEGSIMIKNHKVSIDFAIQNIFPNGKFTTEFGVKSR DRS

KEYAMFSGICNFELINRGKSGNNIYWKPETTVKLS

SEQ ID NO:28 GqcF

MRMYQGKRFLLTHIWLRGFSGAEINILELATYLKEAGAQVEVFTFLAKSPMLDEFQKN

GIPVIDDSDYPFDVSQYDVVCSAQNIIPPAMIEALGKSQEKLPKFIFFHMAALPEHV LEQ

PYIYQLEKKISSATLAISEEIVNKNLKRFFKDIPNLHYYPNPAPESYAAMEHLKKQS PERI

LVISNHPPQEVIDMEPLLAKKGIHVDYFGVWSDHYELVTPELLASYDCVVGIGKNAQ Y

CLVMGKPIYIYDHFKGPGYLTETNFEAAALNNFSGRGFEEQEKTAEELVDDLLEHYQ S

AQAFQHNHLYDYRSRYTISTIVDHIYKSINIIPKAIAPLEQVDVEYIKAITLFIRTR LVRLEN

DVANLWEAVHRYEQLDRKATAKREALEQLLTAKTTELNLIKTSRMFKLYQLLWRIKG F

FFRKEHLKRAK

SEQ ID NO:29 SccD

MDFFSRKNRILLKELIKTDFKLRYQGSAIGYLWSILKPLMLFAIMYIVFVRFLPLGGDVP

HWPVALLLGNVIWTFFQETTMMGMVSVVTRGDLLRKLNFSKQTIVFSAVSGAAINFG I

NVIVVLIFALLNGVTFTFRWNLFLLIPLFLELLLFSTGIAFILSTLYVRYRDIGPVW EVILQ

GGFYGTPIIYSLTYIATRSVVGAKLLLLSPIAQIIQDMRHILIDPANVTIWQMINHK SIA

VIPYLVPIFVFIIGFLVFNYNAKKFAEII

SEQ ID NO:30 SccE

MTKNNIAVKVDHVSKYFKLPVESTQSLRTALVNRFKGIKGYKKQHVLRDIDFEVEKGD

FFGIVGRNGSGKSTLLKIISQIYVPEQGKVTVDGKLVSFIELGVGFNPELTGRENVY MN

GAMLGFTTEEVDTMYQDIVDFAELQDFMNQKLKNYSSGMQVRLAFSVAIKAQGDVLI L

DEVLAVGDEAFQRKCNDYFLERKNSGKTTILVTHDMAAVKKYCNKAVLIDDGLIKAI GE

PFDVANQYSLDNTDQIVEDKQEEEAAVQEEEQIVVDNLEVKLLSANRMTPRDSIRFE IS

YNVLADVGTYIALSLTDVDRNIWIYNDNSLDYLSSGSGKKRVFYECHLKSLNDIKLK LE

VTVRDKQGQMLAFSSATNTPIISINRDDLEGDDKSAMDSASGLIQRNGQWQFS

SEQ ID NO:31 SccF

M VKVSI ICTNYN KGSWIGEAI DSFLKQETSFPYEI 11 VDDASTDHSVH 11 KTYQKQYPDL

IRAFFNQENQGITKTWSDICKKARGQYIARCDGDDYWIDPFKLQKQIDLLETSPESK W

SNTDFDMVDSKGNIIHKDVLKNNIIPFMDSYEKMLALKGMTMASTWLVETKLMLEIN DR

INKDAVDDTFNIQLELFKKTKLAFLRDSTTVYRMDAESDSRSKDSEKLAQRFDRLLE TQ

LEYIEKYPDSDYKKVLEYLLPKHNDFEKVLAQDGKNVWDNQQITIYLAKGDDQEFSE E

NCFQFPLQHSGNIQLTFPENIRKIRIDLSEIPSYYRQVSLVNTTVNTELLPTWTNAK VFG

YSYYFI APDPQMIYDLTAQEGQDFKLTYEWFNVDQPSQPDFLANHLVKELDQKKVELKM LSPY KYQYQKAVAERDLYLEQLNEMVVRYNSVTHSRRWTIPTKIINLFRRKK

SEQ ID NO:32 SucD

MELFSKKNRILLKELVKTDFKLRYQGSAIGYLWSILKPLLMFTIMYLVFIRFLRLGGSVP HFPVALLLANVIWSFFSEATGMGMVSIVTRGDLLRKLNFSKHTIVFSAVLGALINFSINL VVVLI FALI NGVTISPFAYMAI PLFI ELLI LAVGVALLLSTLFVYYRDLAQVWEVLMQAAM YATPIIYPITFVSDKNPLAAKILMLNPLAQMIQDLRFLLIDRANATIWQMSNHWYYVMIP Y LI PFLVLALGI LVFN KNAKKFAEI I

SEQ ID NO:33 SucE

MSTRDIAVKVEHVSKSFKLPTEATKSFRTTLVNRFRGIKGYTEQKVLKDINFEVKKGDF

FGIVGRNGSGKSTLLKIISQIYVPEKGTVTVEGKMVSFIELGVGFNPELTGRENVYM NG

AMLGFTQEEVDAMYEDIVDFAELHDFMNQKLKNYSSGMQVRLAFSVAIKAQGDVLIL D

EVLAVGDEAFQRKCNDYFMERKESGKTTILVTHDMAAVKKYCNRAVLIEDGLVKALG

DPDDVANQYSFDNAIASETVEKKEDGKSTEKKESQLISDFSAQLLTKPQISPDEDIT ISF

SYNVLKNMETHVALSFIDIDTNLGLYNDNSMSLKTNGQGQKTVTMTCQMSYLNHAKL

KLAATVRDKDKH PLAFLPVN El PVI LI DRKVDASN ESEWDANTGI LRRSSQWT*

SEQ ID NO:34 SucF

MKKILFVSPTGTLDNGAEISITNLMVLLTQEGYDIINVIPKIKHSTHDAYLHKMRENQIK

VYELDYTNWWWESAPGDKIGHLEDRSAYYQKYIYEIRKIIAEEAVDLVITSTANLFQ GA

LAAACERIPHYWIIHEFPLDEFAYYKELIPFIEEYSDKIFTVEGKLTEFLRPLLKES QKLF

PFVPFVNIKKNNNLKTGEETRLISISRINENKNQLELLKAYQSMAEPKPELLFVGDW DD

SYKEKCDDFIQSHQLKTVRFLGHQSNPWNLMTDKDILVLNSKMETFGLVFVEALIQG IP

VLASNNYGYSSVVDYFGCGKLYHLGDEKELVALLNEFVTNFSEEKKKSLTQSFMVEE

KYTIEKSYCALLDAISNENSVKSDRPIWLSQFLGAYNPLSTFSPAGKESISIYYRDE NGN

WSENQKLVFSLFNRDSFTFSVPKGMTRIRLDMSERPSYYDKITLVDSDTMTQLLPTN V

SGFEENNSFYFNHSDPQMEFNVSFSKNNVFQLSYQLANLENIFQDSFLPNQLVQKLL

SFKEKQSDLEMLKIENHQLQEKNKLKQEQLEEMVVRYNSVIHSRRWSIPTKMINFLR R

KK

SEQ ID NO:35 SccH

MKQLKKIWDMLGKQKLLIFIFIFALNVTLRNYDLLIGRRANSSLSFKVISKNFDIMIEHW E

ALPSHFKIIGGVCLVIYVLSILGLSFYLSKNLKKTFFIELLLGYGLYIVISYFLAVT RELNNE SFKIWDLAKNHFFQPYFLPTLVLIIVCTLALNYLIRVKMKRSHLSRKMTLLLENFSETEF L

LTGLIVSFILSDTLYVKLLQESLRAYYHKPLAYESLLFLYTLLTLILFSVIVEACFN AYRSIK

LNRPNLSLAFVSSLLFATIFNYAFQYGLKNDADLLGKYIVPGATAYQILVLTAAGFF LYLII

NRYLLVTFLIVILGSIITVVNVLKVGMRNEPLLVTDFAWVTNIRLLARSVNANIIFS TLLILA

ALILLYLFLRKRLLQGKITENHRLKVGLISSICLLGFSIFIIFRNEKGSKIVNGIPV ISQVNN

WVDIGYQGFYSNASYKSLMYVWTKQVTKSIMDKPSDYSKERILKLAKKYNNVANKIN K

VRTENISNQTVIYILSESFSDPDRVKGVNLSRDVIPNIKQIKEKTTSGLMHSDGYGG GT

ANMEFQSLTGLPYYNFNSSVSTLYTEVVPDMSVFPSISNQFKSKN RVVIHPSSASNYS

RKYVYDKLKFPTFVASSGTSDKITHSEKVGLNVSDKTTYQNILDKINPSQSQFFSVM T

MQNHVPWASDEPSDVVATGKGYTKDENGSLSSYARLLTYTDKETKDFLAQLSQLKH

KVTVVFYGDHLPGLYPESAFKKDPDSQYQTDYFIWSNYNTKTLNHSYVNSSDFTAEL L

EHTNSKVSPYYALLTEVLDNTTVGHGKLTKEQKEIANDLKLIQYDITVGKGYIRNYK GF

FDIR

SEQ ID NO:36 WchF PHD0486

MKQSVYIIGSKGIPAKYGGFETFVEKLTEYQKDGNIQYYVACMRENSAKSGFTADTFE

YNGAICYNIDVPNIGPARAIAYDIAAVNKAIELSKGNKDEAPIFYILACRIGPFISG LKKKIR

SIGGRLLVNPDGHEWLRAKWSLPVRKYWKFSEQLMVKHADLLVCDSKNIEKYIREDY

KQYQPKTTYIAYGTDTTPSSLKSEDAKVRNWYREKGVSENGYYLVVGRFVPENNYET

MIREFIKSKSNKDFVLITNVEQNKFYDQLLKETGFDKDLRVKFVGTVYDQELLKYIR EN

AFAYFHGHEVGGTNPSLLEALASTKLNLLLDVGFNREVGEDGAIYWKKDELAHVIEE V

ERFDEGDITELDEKSSQRIADAFTWEKIVSDYEEVFTV

SEQ ID NO:37 WbbR

MNKYCILVLFNPDISVFIDNVKKILSLDVSLFVYDNSANKHAFLALSSQEQTKINYFSIC E

NIGLSKAYNETLRHILEFNKNVKNKSINDSVLFLDQDSEVDLNSINILFETISAAES NVMI

VAGNPIRRDGLPYIDYPHTVNNVKFVISSYAVYRLDAFRNIGLFQEDFFIDHIDSDF CSR

LIKSNYQILLRKDAFFYQPIGIKPFNLCGRYLFPIPSQHRTYFQIRNAFLSYRRNGV TFN

FLFREIVNRLIMSIFSGLNEKDLLKRLHLYLKGIKDGLKM

SEQ ID NO:38 WbbL PHD0480

MVYIIIVSHGHEDYIKKLLENLNADDEHYKIIVRDNKDSLLLKQICQHYAGLDYISGGVY G

FGHNNNIAVAYVKEKYRPADDDYILFLNPDIIMKHDDLLTYIKYVESKRYAFSTLCL FRD

EAKSLHDYSVRKFPVLSDFIVSFMLGINKTKIPKESIYSDTVVDWCAGSFMLVRFSD FV RVNGFDQGYFMYCEDIDLCLRLSLAGVRLHYVPAFHAIHYAHHDNRSFFSKAFRWHL

KSTFRYLARKRILSNRNFDRISSVFHP

SEQ ID NO:39 WbbL

MVAVTYSPGPHLERFLASLSLATERPVSVLLADNGSTDGTPQAAVQRYPNVRLLPTG

ANLGYGTAVNRTIAQLGEMAGDAGEPWGDDWVIVANPDVQWGPGSIDALLDAASRW

PRAGALGPLIRDPDGSVYPSARQMPSLIRGGMHAVLGPFWPRNPWTTAYRQERLEP

SERPVGWLSGSCLLVRRSAFGQVGGFDERYFMYMEDVDLGDRLGKAGWLSVYVPS

AEVLHHKAHSTGRDPASHLAAHHKSTYIFLADRHSGWWRAPLRWTLRGSLALRSHL

MVRSSLRRSRRRKLKLVEGRH

SEQ ID NO:40 RfbF

MNSNIYAVIVTYNPELKNLNALITELKEQNCYVVVVDNRTNFTLKDKLADIEKVHLICLG RNEGIAKAQNIGIRYSLEKGAEKIIFFDQDSRIRNEFIKKLSCYMDNENAKIAGPVFIDR D KSHYYPICNIKKNGLREKIHVTEGQTPFKSSVTISSGTMVSKEVFEIVGMMDEELFIDYV DTEWCLRCLNYGILVHIIPDIEMVHAIGDKSVKICGINIPIHSPVRRYYRVRNAFLLLRK N H VPLLLSI REVVFSLI HTTLI IATQKN Kl EYM KKH I LATLDGI RGITGGGRYNA

SEQ ID NO:41 WsaD

MDISIIIVNYNTPKLTVEAIESILKSKTKYSYEIIVVDNHSSDDSVRILKGKFPNIVVIE NKQ

NVGFSKANNQAIKLSKGRYILLLNSDTIVKEDTIEKMIEFMDKSKKVGASGCEVVLP NG

ELDRACHRGFPTPEASFYYLVGLARLFPRSRRFNQYHLGYMNLNEPHPIDCLVGAFM

MVRREVIEQVGLLDEEFFMYGEDIDWCYRIKQAGWEIYYCPFTSIIHYKGASSKKKP FK

IVYEFHRAMFLFHRKHYARKYPFIVNCLVYTGIAAKFILSAIINTFRKIGG

SEQ ID NO:42 WbbP

MKISIIGNTANAMILFRLDLIKTLTKKGISVYAFATDYNDSSKEIIKKAGAIPVDYNLSR

SGINLAGDLWNTYLLSKKLKKIKPDAILSFFSKPSIFGSLAGIFSGVKNNTAMLEGL GFL

FTEQPHGTPLKTKLLKNIQVLLYKIIFPHINSLILLNKDDYHDLIDKYKIKLKSCHI LGG

IGLDMNNYCKSTPPTNEISFIFIARLLAEKGVNEFVLAAKKIKKTHPNVEFIILGAI DKE

NPGGLSESDVDTLIKSGVISYPGFVSNVADWIEKSSVFVLPSYYREGVPRSTQEAMA M

GRPILTTNLPGCKETIIDGVNGYVVKKWSHEDLAEKMLKLINNPEKIISMGEESYKL ARE

RFDANVNNVKLLKILGIPD

SEQ ID NO:43 WsaP

MVKVIRGRERFLTKLYAFVDFAMMQGAFFLAWVLKFKVFHNGVGGHLPLEDYLFWSF

VYGAIAIVIGYLVELYAPKRKEKFSNELAKVLQVHTLSMFVLLSVLFTFKTVDVSRS FLLL

YFAWNLILVSIYRYIVKQSLRTLRKKGYNKQFVLIIGAGSIGRKYFENLQMHPEFGL EVV

GFLDDFRTKHAPEFAHYKPIIGQTADLEHVLSHQLIDEVIVALPLQAYPKYREIIAV CEK

MGVRVSIIPDFYDILPAAPHFEIFGDLPIINVRDVPLDELRNRVLKRSFDIVFSLVA IIVTS

PIMLLIAIGIKLTSPGPIIFKQERVGLNRRTFYMYKFRSMKPMPQSVSDTQWTVESD PR

RTKFGAFLRKTSLDELPQFFNVLKGDMSIVGPRPERPFFVEKFKKEIPKYMIKHHVR P

GITGWAQVCGLRGDTSIQERIEHDLFYIENWSLWLDIKIILLTITNGLVNKNAY

SEQ ID NO:44 WsaC

MEMPLVSIVVATYFPRTDFFEKQLQSLNNQTYENIEIIICDDSANDAEYEKVKKMVENII

SRFPCKVIRNEKNVGSNKTFERLTQEANGDYICYCDQDDIWLSEKVERLVNHITKHH C

TLVYSDLSLIDENDRIIHKSFKRSNFRLKHVHGDNTFAHLINRNSVTGCAMMIRADV AK

SAIPFPDYDEFVHDHWLAIHAAVKGSLGYIKEPLVWYRIHLGNQIGNQRLVNITNIN DYI

RHRIEKQGNKYRLTLERLSLTLQQKQLVYFQIHLTEARKKFSQKPCLGNFFKIVPLI KYD

IILFLFELMIFTVPFTCSIWIFKKLKY

SEQ ID NO:45 WsaE

MERCRMNKKIPFDQYQRYKNAAEIINLIREENQSFTILEVGANEHRNLEHFLPKDQVTY

LDIEVPEHLKHMTNYIEADATNMPLDDNAFDFVIALDVFEHIPPDKRNQFLFEINRV AKE

GFLIAAPFNTEGVEETEIRVNEYYKALYGEGFRWLEEHRQYTLPNLEETEDILRKEN IE

YVKFEHGSLLFWEKLMRLHFLVADRNVLHDYRFMIDDFYNKNIYEVDYIGPCYRNFI VV

CRDKAKREFIQSIYEKRKQNSYLKNSTISKLNELENSIYSLKIIDKENQIYKKSLEI TEQLL

EDLKLKEQQIIEKIQTIKKKTEMIELQNQKIQELKIECENKSIENNNLYSQLLEKEN YIKQ

LQNQAESMRIKNRLKKILNFSFIKYVRKIINIIFRRKFKFKLQPVHHLEWSNGKWLV LGR

DPHFILKGGSYPSSWTIIQWRASANSSALLRLYYDTGGGFSENQSFNLGKIGNDINR D

YECVICLPENIHLLRLDIEGEISEFELENLTFTSISRLEVFYKSFINHCRKRNIKNY KELYS

LI KKLFI LVRREGLKSI WYRAKQKLSM ELLSEDPYEVFLN VSSKVDKEI VLSEI KKLKYK

PKFSVI LPVYN VEEKWLRKCI DSVLNQWYPYWELCI VDDNSSKDYI KPVLEEYSN RDS

RIKTVFRSNNGHISEASNTALEIATGDFIALLDHDDELAPEALYENAVLLNEHPDAD MIY

SDEDKITKDGKRHSPLFKPDWSPDTLRSQMYIGHLTVYRTNLVRQLGGFRKGFEGSQ

DYDLALRVAEKTNNIYHIPKILYSWREIETSTAVNPSSKPYAHEAGLKALNEHLERV FG

KGKAWAEETEYLFVYDVRYAI PEDYPLVSI 11 PTKDN I ELLSSCIQSI LDKTTYPNYEI LI M

NNNSVMEETYSWFDKQKENSKIRIIDAMYEFNWSKLNNHGIREANGEVFVFLNNDTI VI SEDWLQRLVEKALREDVGTVGGLLLYEDNTIQHAGVVIGMGGWADHVYKGMHPVHN

TSPFISPVINRNVSASTGACLAIAKKVIEKIGGFNEEFIICGSDVEISLRALKMGYV NIYDP

YVRLYHLESKTRDSFIPERDFELSAKYYSPYREIGDPYYNQNLSYNHLIPTIRS

SEQ ID NO:46 WbbQ

MARSGGVVIKKKVAAIIITYNPDLTILRESYTSLYKQVDKIILIDNNSTNYQELKKLFEK

KEKIKIVPLSDNIGLAAAQNLGLNLAIKNNYTYAILFDQDSVLQDNGINSFFFEFEK LVS

EEKLNIVAIGPSFFDEKTGRRFRPTKFIGPFLYPFRKITTKNPLTEVDFLIASGCFI KLE

CIKSAGMMTESLFIDYIDVEWSYRMRSYGYKLYIHNDIHMSHLVGESRVNLGLKTIS LH

GPLRRYYLFRNYISILKVRYIPLGYKIREGFFNIGRFLVSMIITKNRKTLILYTIKA IKDG

INNEMGKYKG
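
By way of illustration only, and not forming part of the disclosed sequences, a listing laid out as above (a "SEQ ID NO" header line followed by wrapped sequence lines) could be read into a program roughly as follows. This is a minimal sketch: the function name parse_listing, the header pattern and the toy records ToyA and ToyB are assumptions made for the example and are not taken from the application. The sketch assumes that each header sits on its own line and that any whitespace inside a sequence line is a layout artefact rather than part of the sequence.

```python
import re

# Assumed header shape: "SEQ ID NO:12 Name", with an optional colon and
# optional space before the number. The character class also accepts the
# look-alike spellings "N0" and "NQ" that can arise in scanned text.
HEADER = re.compile(r"^SEQ ID N[O0Q]:?\s*(\d+)\s+(.+)$")


def parse_listing(text):
    """Return {seq_id: (name, sequence)} from a plain-text sequence listing."""
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank separator lines carry no sequence data
        match = HEADER.match(line)
        if match:
            current = int(match.group(1))
            records[current] = (match.group(2).strip(), "")
        elif current is not None:
            name, seq = records[current]
            # Drop every whitespace character inside the wrapped line.
            records[current] = (name, seq + "".join(line.split()))
    return records


# Hypothetical toy listing, not sequences from the application.
sample = """SEQ ID NO:1 ToyA

MKVNIL MAT

YNGE

SEQ ID N0:2 ToyB

MNRILL
"""

records = parse_listing(sample)
for seq_id, (name, seq) in sorted(records.items()):
    print(seq_id, name, len(seq), seq)
```

A header that shares a line with the tail of the preceding sequence would not be recognised by this pattern and would need to be separated first.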