

Title:
SEQUENCES OF MEGAPLASMID PMP118, CHROMOSOMAL SEQUENCES AND INDIVIDUAL SEQUENCES THEREOF OF LACTOBACILLUS SALIVARIUS STRAIN UCC118
Document Type and Number:
WIPO Patent Application WO/2007/020617
Kind Code:
A2
Abstract:
The present invention relates to an isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 1 or a variant or fragment thereof. The present invention also refers to an isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No 2 to SEQ ID No. 30 or a variant or fragment thereof.

Inventors:
O'TOOLE PAUL W (IE)
FITZGERALD GERALD F (IE)
Application Number:
PCT/IE2006/000088
Publication Date:
February 22, 2007
Filing Date:
August 18, 2006
Assignee:
UNIV COLLEGE CORK NAT UNIV IE (IE)
O'TOOLE PAUL W (IE)
FITZGERALD GERALD F (IE)
International Classes:
C07K14/335; A61P1/04; A61P1/14; C12N15/52; C12N15/74
Other References:
FLYNN SARAH ET AL: "Characterization of the genetic locus responsible for the production of ABP-118, a novel bacteriocin produced by the probiotic bacterium Lactobacillus salivarius subsp. salivarius UCC118" MICROBIOLOGY (READING), vol. 148, no. 4, April 2002 (2002-04), pages 973-984, XP002410387 ISSN: 1350-0872
UNIVERSITY COLLEGE CORK, MICROBIOLOGY, IRELAND, CORK: "Lactobacillus salivarius subsp. salivarius UCC118 plasmid pSF118-20, complete sequence" [Online] 17 December 2004 (2004-12-17), XP002410388 Retrieved from the Internet: <URL:http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genome&cmd=Retrieve&dopt=Overview&list_uids=18106> cited in the application
UNIVERSITY COLLEGE CORK, MICROBIOLOGY, IRELAND, CORK: "Lactobacillus salivarius subsp. salivarius UCC118 plasmid pSF118-44, complete sequence" [Online] 17 December 2004 (2004-12-17), XP002410389 Retrieved from the Internet: <URL:http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genome&cmd=Retrieve&dopt=Overview&list_uids=18107> cited in the application
CLAESSON M J ET AL: "Multireplicon genome architecture of Lactobacillus salivarius" PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF USA, NATIONAL ACADEMY OF SCIENCE, WASHINGTON, DC, US, vol. 103, no. 17, April 2006 (2006-04), pages 6718-6723, XP002402646 ISSN: 0027-8424
Attorney, Agent or Firm:
O'BRIEN, John, A. et al. (Third Floor, Duncairn House, 14 Carysfort Avenue, Blackrock, County Dublin, IE)
Claims:


1. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 1 or a variant or fragment thereof or a sequence complementary thereto.

2. A polynucleotide comprising a nucleic acid sequence of SEQ ID No. 1 or a variant or fragment thereof isolated from a probiotic bacterium.

3. A polynucleotide comprising a nucleic acid sequence of SEQ ID No. 1 or a variant or fragment thereof isolated from Lactobacillus.

4. A polynucleotide as claimed in claim 3 isolated from Lactobacillus salivarius subsp salivarius.

5. A polynucleotide as claimed in claim 3 or 4 isolated from Lactobacillus salivarius subsp salivarius UCC118 [NCIMB 40829].

6. A polynucleotide as claimed in any preceding claim comprising a nucleic acid sequence that is at least 70% identical to the nucleic acid sequence of SEQ ID No. 1.

7. A polynucleotide as claimed in any preceding claim wherein the fragment comprises at least 30 contiguous nucleic acids of SEQ ID No. 1.

8. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 2 or a variant or fragment thereof or a sequence complementary thereto.

9. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 3 or a variant or fragment thereof or a sequence complementary thereto.

10. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 4 or a variant or fragment thereof or a sequence complementary thereto.

11. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 5 or a variant or fragment thereof or a sequence complementary thereto.

12. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 6 or a variant or fragment thereof or a sequence complementary thereto.

13. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 7 or a variant or fragment thereof or a sequence complementary thereto.

14. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 8 or a variant or fragment thereof or a sequence complementary thereto.

15. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 9 or a variant or fragment thereof or a sequence complementary thereto.

16. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 10 or a variant or fragment thereof or a sequence complementary thereto.

17. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 11 or a variant or fragment thereof or a sequence complementary thereto.

18. A polynucleotide as claimed in any preceding claim which encodes a gene whose function is essential to probiotic activity or function.

19. An isolated polynucleotide as claimed in any preceding claim that encodes a gene involved in any one or more of sugar utilisation and sugar phosphate metabolism, environmental sensing mechanism, adhesion, bile resistance and amino acid metabolism and/or acid resistance.

20. An isolated polynucleotide as claimed in any preceding claim that encodes a plasmid transfer/conjugate operon (LSL1812) [SEQ ID No. 6] and linked genes.

21. An isolated polynucleotide as claimed in any preceding claim that encodes an oligopeptide transport protein Opp (LSL1882) [SEQ ID No. 11].

22. An isolated polypeptide encoded by a polynucleotide as claimed in any preceding claim.

23. A genetic construct comprising a polynucleotide as claimed in any preceding claim.

24. Use of a polynucleotide as claimed in any preceding claim in the selection or production of probiotic bacteria.

25. A recombinant vector comprising a polynucleotide as claimed in any preceding claim.

26. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 14 (LSL0152) or a variant or fragment thereof.

27. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 15 (LSL0152) or a variant or fragment thereof.

28. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 16 (LSL0311) or a variant or fragment thereof.

29. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 17 (LSL1085) or a variant or fragment thereof.

30. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 18 (LSL1319) or a variant or fragment thereof.

31. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 19 (LSL1319) or a variant or fragment thereof.

32. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 20 (LSL1335) or a variant or fragment thereof.

33. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 21 (LSL1832b) or a variant or fragment thereof.

34. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 22 (LSL0350) or a variant or fragment thereof.

35. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 23 (LSL0351) or a variant or fragment thereof.

36. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 24 (LSL0352) or a variant or fragment thereof.

37. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 25 (LSL0873) or a variant or fragment thereof.

38. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 26 (LSL1401b) or a variant or fragment thereof.

39. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 27 (LSL1401b) or a variant or fragment thereof.

40. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 28 (LSL1832c) or a variant or fragment thereof.

41. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 29 (LSL1832d) or a variant or fragment thereof.

42. An isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 30 (LSL1838) or a variant or fragment thereof.

43. An isolated polynucleotide as claimed in any of claims 26 to 42 wherein the nucleic acid sequence is at least 70% identical to the nucleic acid sequence of any one or more of SEQ ID No. 14 to SEQ ID No. 30.

44. An isolated polynucleotide as claimed in any of claims 26 to 42 wherein the fragment comprises at least 30 contiguous nucleic acids to the nucleic acid sequence of any one or more of SEQ ID No. 14 to SEQ ID No. 30.

45. An isolated polynucleotide as claimed in any of claims 26 to 42 or variant or fragment thereof which encode a polypeptide or protein having probiotic function and/or activity.

46. A polynucleotide as claimed in any of claims 26 to 42 which encodes a gene involved with intestinal cell adherence properties of Lactobacillus.

47. A polypeptide or protein encoded by a polynucleotide as claimed in claim 46.


48. A polypeptide or protein as claimed in claim 47 capable of mediating adherence to epithelial cells and modulating epithelial gene expression to improve gut barrier function.

49. Use of a polypeptide or protein as claimed in claim 47 or 48 for the preparation of a medicament for use in the prophylaxis or treatment of undesirable inflammatory activity.

50. Use of a polypeptide or protein as claimed in claim 47 or 48 for the preparation of a medicament for use in generating an immune response.

51. Use of a polypeptide or protein as claimed in claim 47 or 48 for engineering hyperadhesive mutants.

52. Use of a polypeptide or protein as claimed in claim 47 or 48 to modulate or alter the metabolism of Lactobacillus.

53. A vaccine comprising a polypeptide or protein as claimed in claim 47 or 48.

54. A formulation comprising a polypeptide or protein as claimed in claim 47 or 48.

55. A recombinant expression vector comprising the nucleic acid of SEQ ID No. 1.

Description:

"A Product"

The invention relates to the identification and isolation of genes in the Lactobacillus salivarius genome involved in probiotic activity or function, and a novel megaplasmid that may be exploited for strain alteration.

Introduction

Lactobacillus salivarius subsp salivarius strain UCC118 is a probiotic bacterium of human origin. It has been previously shown to have desirable properties including acid resistance, bile resistance, adhesion to human cells and potent anti-microbial activity.

WO98/35014 describes strains of Lactobacillus salivarius isolated from resected and washed human gastrointestinal tract which inhibit a broad range of Gram positive and Gram negative microorganisms and which secrete a product having antimicrobial activity into a cell-free supernatant.

WO00/41707 describes the use of Lactobacillus salivarius in the prophylaxis or treatment of undesirable inflammatory activity, especially gastrointestinal inflammatory activity such as inflammatory bowel disease (IBD), irritable bowel syndrome (IBS), ulcerative colitis or Crohn's disease. The inflammatory activity may also be due to cancer. The strain therefore has use in the prophylaxis or treatment of a number of disease states including gastrointestinal inflammatory activity such as pouchitis, post infection colitis, diarrhoeal disease associated with Clostridium difficile or with Rotavirus, post infective diarrhoeal disease, inflammatory activity due to gastrointestinal cancer, systemic inflammatory disease or an autoimmune disorder such as rheumatoid arthritis, or undesirable inflammatory activity due to cancer.

Lactobacillus salivarius subsp salivarius strain UCC118 has demonstrated colonisation ability and therapeutic effects in human subjects [1]. It has also been shown to reduce arthritis, inflammation and tumour development in mice [2].

There is, however, a need for more detailed information relating to the biology of lactic acid bacteria (LAB), in particular in relation to their interaction with a host.

The invention is directed towards a better understanding of the mechanisms involved in their activity.

Statements of Invention

According to the invention there is provided an isolated polynucleotide comprising a nucleic acid sequence of SEQ ID No. 1 or a variant or fragment thereof or a sequence complementary thereto.

The invention also provides a polynucleotide comprising a nucleic acid sequence of SEQ ID No. 1 or a variant or fragment thereof isolated from a probiotic bacterium.

The invention also provides a polynucleotide comprising a nucleic acid sequence of SEQ ID No. 1 or a variant or fragment thereof isolated from Lactobacillus.

Preferably the polynucleotide is isolated from Lactobacillus salivarius subsp salivarius. Most preferably the polynucleotide is isolated from Lactobacillus salivarius subsp salivarius UCC118 [NCIMB 40829].

In one embodiment of the invention the polynucleotide comprises a nucleic acid sequence that is at least 70% identical to the nucleic acid sequence of SEQ ID No. 1.

In one embodiment of the invention the polynucleotide comprises at least 30 contiguous nucleic acids of SEQ ID No. 1.
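The two thresholds stated above (at least 70% identity to a reference sequence, and a fragment of at least 30 contiguous nucleic acids) can be illustrated with a short Python sketch. It is illustrative only and is not part of the specification: the text does not name an alignment method or an identity convention, so the sketch assumes the two sequences have already been aligned to equal length (with `-` for gaps) and divides matches by the alignment length. The function names `percent_identity` and `shares_contiguous_run` are hypothetical, and only the given strand is examined (the complementary strand is not checked).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns carrying the same base (gaps never match).

    Assumption: both inputs are pre-aligned to the same length, with '-'
    marking gaps. The denominator is the alignment length; other
    conventions exist and give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def shares_contiguous_run(candidate: str, reference: str, min_len: int = 30) -> bool:
    """True if any window of min_len bases of candidate occurs in reference."""
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) < min_len:
        return False
    return any(
        candidate[i:i + min_len] in reference
        for i in range(len(candidate) - min_len + 1)
    )
```

In practice the alignment itself would come from a dedicated tool, and the identity figure depends on how gaps and overhangs are counted, so a value near the 70% boundary should be read with that convention in mind.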

The invention further provides an isolated polynucleotide comprising a nucleic acid sequence selected from any one or more of SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, SEQ ID No. 6, SEQ ID No. 7, SEQ ID No. 8, SEQ ID No. 9, SEQ ID No. 10 and SEQ ID No. 11 or a variant or fragment thereof or a sequence complementary thereto.

In one embodiment of the invention the polynucleotide encodes a gene whose function is essential to probiotic activity or function.

In another embodiment of the invention the isolated polynucleotide encodes a gene involved in any one or more of sugar utilisation and sugar phosphate metabolism, environmental sensing mechanism, adhesion, bile resistance and amino acid metabolism and/or acid resistance.

In one embodiment of the invention the isolated polynucleotide includes a plasmid transfer/conjugation operon (LSL1812 and linked genes) [SEQ ID No. 6].

In one embodiment of the invention the isolated polynucleotide encodes an oligopeptide transport protein Opp (LSL1882) [SEQ ID No. 11].

The invention further provides a polypeptide encoded by a polynucleotide as hereinbefore described.

The invention further provides a genetic construct comprising a polynucleotide as hereinbefore described.

The invention also provides use of a polynucleotide as hereinbefore described in the selection or production of probiotic bacteria.

The invention also provides use of a recombinant vector comprising a polynucleotide as hereinbefore described.

The invention further provides an isolated polynucleotide comprising a nucleic acid sequence selected from any one or more of SEQ ID No. 14 (LSL0152), SEQ ID No. 15 (LSL0152), SEQ ID No. 16 (LSL0311), SEQ ID No. 17 (LSL1085), SEQ ID No. 18 (LSL1319), SEQ ID No. 19 (LSL1319), SEQ ID No. 20 (LSL1335), SEQ ID No. 21 (LSL1832b), SEQ ID No. 22 (LSL0350), SEQ ID No. 23 (LSL0351), SEQ ID No. 24 (LSL0352), SEQ ID No. 25 (LSL0873), SEQ ID No. 26 (LSL1401b), SEQ ID No. 27 (LSL1401b), SEQ ID No. 28 (LSL1832c), SEQ ID No. 29 (LSL1832d), SEQ ID No. 30 (LSL1838) or a variant or fragment thereof.

The invention also provides an isolated polynucleotide wherein the nucleic acid sequence is at least 70% identical to the nucleic acid sequence of any one or more of SEQ ID No. 14 to SEQ ID No. 30.

The invention also provides an isolated polynucleotide wherein the fragment comprises at least 30 contiguous nucleic acids to the nucleic acid sequence of any one or more of SEQ ID No. 14 to SEQ ID No. 30.

In one embodiment the isolated polynucleotide encodes a polypeptide or protein having probiotic function and/or activity. In another embodiment the polynucleotide encodes a gene involved with intestinal cell adherence properties of Lactobacillus.

The invention also provides a polypeptide or protein encoded by a polynucleotide as hereinbefore described.

In one embodiment the polypeptide or protein is capable of mediating adherence to epithelial cells and modulating epithelial gene expression to improve gut barrier function.

In another embodiment the polypeptide or protein is used for the preparation of a medicament for use in the prophylaxis or treatment of undesirable inflammatory activity.

The polypeptide or protein may be used for the preparation of a medicament for use in generating an immune response, or for engineering hyperadhesive mutants.

The polypeptide or protein may also be used to modulate or alter the metabolism of Lactobacillus.

The invention further provides a vaccine or formulation comprising a polypeptide or protein as hereinbefore described. The vaccine or formulation may be prepared using commonly used excipients and/or carriers.

A megaplasmid is defined as an autonomously replicating episome greater than 100 kb, which has a plasmid-related replication origin and partition mechanisms, but no genes for essential functions such as ribosomal RNA, or transfer RNA [3]. The invention also provides a megaplasmid or portion thereof which may be introduced into other bacteria, or used as a cloning vector.

The term genome or genomic sequence is taken to mean the sequence of the chromosome of Lactobacillus salivarius. The term plasmid is taken to designate any extrachromosomal piece of DNA contained in the Lactobacillus.

The term variant or fragment thereof is taken to mean a derivative fragment generated by minimal genetic modification including but not restricted to point mutation, rearrangement or in vitro optimisation.

Brief description of the drawings

The invention will be more clearly understood from the following description thereof given by way of example only with reference to the accompanying drawings in which:

Fig. 1 is a genome atlas of L. salivarius UCC118;

Fig. 2A shows a pulsed field gel electrophoresis (PFGE) of total genomic DNA of Lactobacillus salivarius UCC118 run either digested (+) or undigested (-) with the labelled restriction enzymes. The in silico fragment number predictions for the chromosome, megaplasmid, pSF44 and pSF20 are indicated in the accompanying table. This shows that there is complete concordance, within the resolution limits of the gel for smaller fragments;

Fig. 2B shows a Southern hybridization of the Lactobacillus salivarius UCC118 genomic digests from Fig. 2A, using a gene probe based on abpα+abpβ (encoding bacteriocin ABP118 structural peptides) localized in the megaplasmid pMP118 identified in strain UCC118;

Fig. 3A shows a pulsed field gel electrophoresis (PFGE) of genomic DNA of strains UCC118 (lanes 1 and 2), 4231 (lanes 3), 4331 (lanes 4), 43310 (lanes 5), and 43324 (lanes 6). S1, treatment without (-) or with (+) S1 nuclease;

Fig. 3B shows a Southern blot using a probe specific to the genes abpα+abpβ (encoding bacteriocin ABP118 structural peptides) localized in the megaplasmid pMP118 identified in strain UCC118;

Fig. 4 is a graph showing the disruption of the sortase gene srtA (LSL1606) of Lactobacillus salivarius UCC118 having a significant effect on Lactobacillus salivarius UCC118 adhesion to intestinal epithelial cells (HT29 cell line). Values graphed are the mean of three independent biological replicates, each with technical duplicates. Error bars are standard error of the mean;

Figs. 5A and 5B show impaired epithelial cell adhesion by mutants lacking sortase, or sortase-dependent proteins, in Lactobacillus salivarius UCC118. Adhesion to HT29 cells of L. salivarius UCC118 wild type and mutants lacking the indicated proteins, as determined by viable count method (panel A) and semiquantitative real-time PCR (panel B). Results shown are averages of three independent experiments. Percentage adhesion is expressed as relative adherence compared to the wild type strain and the error bars represent standard error of the mean (SEM). Statistically significant differences (p < 0.05) are determined by Student's t-test and indicated with an asterisk;

Fig. 6 is a graph showing the functional categorisation of the UCC118 genome with the gene content assigned to functional categories;

Fig. 7 is a table showing the relatedness of UCC118 to other lactobacilli. The table shows a summary of organism distribution among the 10 best Smith-Waterman hits for each UCC118 gene from the four replicons;

Fig. 8A is a pulsed field gel electrophoresis of L. salivarius;

Fig. 8B is a Southern hybridisation corresponding to Fig. 8A. Megaplasmids of varying size are found in L. salivarius. Panel A: PFGE separation of genomic DNA of 10 L. salivarius strains. Panel B: Corresponding Southern hybridization with the LSL_1739 repA probe. (+) or (-) indicate treatment with S1 nuclease. Arrows to left indicate λ DNA concatemers used as size standard. Prominent bands in Panel A, S1 nuclease treated samples, are chromosomal DNA. In panel B, non-specific hybridization with chromosomal DNA occurred in some samples. The estimated sizes of the megaplasmids are: AH4231, 210 kb; UCC118, 242 kb; UCC119, 195 kb; DSM20492, 240 kb; DSM20555, 380 kb; NCIMB8816, 180 kb; NCIMB8817, 145 kb; DSM20554, 260 kb; NCIMB702343, 160 kb; JCM1230, 100 kb;

Fig. 9 is a diagrammatic representation of the molecular organisation of the intact sortase-dependent proteins of L. salivarius UCC118 and their homologues;

Fig. 10A is a Southern hybridisation of a sortase gene deletion mutant; Fig. 10B is a schematic overview of the genome structure of a sortase gene deletion mutant. Verification of the genome structure of a sortase-gene deletion mutant. Panel A: Southern hybridization. The fragments expected following digestion are indicated by arrows, and fragment sizes are indicated in kilobase pairs. Panel B: Schematic overview. The sortase gene is indicated as a box whereas the probe is indicated as a hatched box; and

Fig. 11 shows expression analysis of genes encoding sortase-dependent proteins. Transcription of genes for sortase-dependent proteins investigated by RT-PCR. PCR was performed on cDNA prepared from stationary phase cells grown in MRS broth. Arrows indicate sizes in base pairs. Genes are indicated above the lanes. Gene labels with 5' or 3' suffixes indicate expression was tested upstream or downstream of the internal stop codon.

Detailed description

To determine the physiology of Lactobacillus salivarius UCC118 and identify genes potentially involved in the bacterium's interaction with a host, the genome of Lactobacillus salivarius UCC118 was sequenced and analysed using standard methodology.

We identified and isolated a megaplasmid present in the genome sequence. The circular megaplasmid is composed of 242,436 basepairs. The nucleic acid sequence for the megaplasmid is given in SEQ ID No. 1. The megaplasmid has been named pMP118. We have found that the megaplasmid plays a major role in the probiotic properties of Lactobacillus salivarius.

We have also identified genes in the genome of L. salivarius which encode proteins involved in the intestinal cell adherence properties of L. salivarius UCC118.

Assembly of the genome sequence of Lactobacillus salivarius UCC118 revealed that the genome consisted of the following:

1. A plasmid pSF-20 composed of 20,417 basepairs

2. A plasmid pSF-44 composed of 44,013 basepairs

3. A circular chromosome composed of 1,827,111 basepairs (NCBI database accession no. CP000233)

4. A circular megaplasmid (pMP118), composed of 242,436 basepairs (SEQ ID No. 1, and NCBI database accession no. CP000234)

The arrangement of the Lactobacillus salivarius UCC118 genome is shown in the genome atlas presented in Fig. 1. Sequence analysis shows that it has an asymmetric GC skew pattern.

Table 1 shows the content of the UCC118 genome.

The sequences of the plasmids pSF-20 and pSF-44 are known (the sequences are available on the NCBI website under accession Nos. NC_006529 and NC_006530 respectively).

The genome of L. salivarius UCC118 described herein reveals the presence of a 242-kb megaplasmid, which, although apparently dispensable for viability based on gene content, confers on the strain a large number of contingency metabolic capabilities and traits directly related to GI tract survival or competitiveness.

The megaplasmid pMP118 may be used as a vector. The minimal replication origin of pMP118 may be adapted as a vector to clone extremely large fragments of DNA. A method of cloning large DNA fragments in a bacterial system is discussed in reference 26.

General Genome Features.

The genome sequence of L. salivarius subsp. salivarius strain UCC118 consists of 2,133,977 nucleotides with an average GC content of 33.04%. The genome comprises four replicons (Fig. 1): a circular chromosome of 1,827,111 nucleotides, a previously undescribed megaplasmid of 242,436 nucleotides designated pMP118, and two previously described plasmids of 20.4 kb (pSF118-20) and 44 kb (pSF118-44) (21). Multireplicon genome architecture may include minichromosomes, which are difficult to formally distinguish from megaplasmids [3]. We designated pMP118 as a megaplasmid for the following reasons: it contains neither tRNA nor rRNA genes; it has plasmid-related replication and partition proteins; and it does not contain the only copy in the genome of any known essential gene.
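As a consistency check, the total of 2,133,977 nucleotides is the sum of the four replicon sizes given in the description (the exact plasmid sizes of 20,417 and 44,013 basepairs are those listed earlier for pSF-20 and pSF-44). The short Python sketch below simply carries out that arithmetic and reports each replicon's share of the genome; the variable names are illustrative.

```python
# Replicon sizes in basepairs, as stated in the description.
REPLICONS_BP = {
    "chromosome": 1_827_111,
    "pMP118": 242_436,
    "pSF118-44": 44_013,
    "pSF118-20": 20_417,
}

# Total genome size is the sum over all four replicons.
total_bp = sum(REPLICONS_BP.values())

# Percentage of the genome carried by each replicon.
share_percent = {
    name: 100.0 * size / total_bp for name, size in REPLICONS_BP.items()
}

if __name__ == "__main__":
    print(f"total: {total_bp:,} bp")
    for name, pct in share_percent.items():
        print(f"{name}: {pct:.2f}% of genome")
```

On these figures the megaplasmid accounts for roughly 11% of the total sequence and the chromosome for roughly 86%.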

The megaplasmid of the invention has a number of distinguishing properties. It is the first megaplasmid to be identified in a probiotic bacterium. It is the largest plasmid identified to date in gram-positive bacteria and it is the largest plasmid identified in the lactic acid bacteria (LAB) group to which L. salivarius belongs.

L. salivarius Megaplasmid-Encoded Properties.

There are no genes on the megaplasmid pMP118 that might be considered strictly essential for viability. For example, pMP118 harbors an additional copy of rpsN (LSL_1944, ribosomal protein S14P), which is a paralog of the chromosomal gene LSL_1422 and is homologous to rpsN2 of L. plantarum and L. johnsonii. The gene on pMP118 encoding a bile-salt hydrolase (choloylglycine hydrolase; LSL_1801) is one of only two genes encoding bile-inactivating enzymes detected in the genome. The copy number of pMP118 was estimated by PCR to be 4.7 ± 0.6 copies per chromosome equivalent in stationary phase, so gene dosage effects would contribute to amplifying the contribution of LSL_1801 to bile resistance. The L. salivarius UCC118 chromosome harbors one copy each of ldhL and ldhD genes, whereas pMP118 encodes an additional copy of the ldhD gene (LSL_1887), whose product is 42% identical to the L. plantarum enzyme. D-lactate is an important component of cell wall precursors in L. plantarum (17), and the additional pMP118-encoded LdhD could increase the efficiency of D-lactate production, provided that the gene is expressed and the gene product is catalytically active. In addition, LSL_1901 on pMP118 encodes a bifunctional acetaldehyde/alcohol dehydrogenase, which is the only enzyme in this strain that catalyzes the formation of ethanol from acetyl-CoA via acetaldehyde. Although not essential, the presence of this additional reductive pathway on pMP118 likely would improve the redox-balancing capability of strain UCC118.

We used a combination of hybridization and S1 nuclease treatment, together with pulsed-field gel electrophoresis (PFGE), to investigate the plasmid content of L. salivarius strains from varying sources. Nuclease S1 preferentially nicks and then linearizes megaplasmids because of their torsional stress [4]. L. salivarius comprises two subspecies, salivarius and salicinius [5], and both were found to contain megaplasmids (Fig. 8), of sizes varying from 100 kb to 380 kb. All of these megaplasmids hybridized with a pMP118 repA gene probe. (Prominent bands in S1 nuclease-treated samples in Fig. 8A are chromosomal DNA, which caused nonspecific hybridization in Fig. 8B in some samples.) It is noteworthy that pMP118 contains a tract of genes that show low relatedness to known or suspected conjugation genes, thus representing a functional or remnant plasmid transfer locus.

The ability to disseminate by conjugation would explain the apparently universal presence of pMP118-related plasmids (in strains so far tested), and this potential tra locus also might be involved in mobilization of smaller plasmids. The megaplasmid pMP118 is the largest plasmid from lactic acid bacteria in current nucleotide databases. However, plasmids of more than 100 kb in Lactobacillus spp. have been reported.

Lactacin F production by L. acidophilus strain 88 was shown by conjugation analysis to be linked to a 110-kb plasmid, pPM68 [6]. A plasmid of 150 kb was identified in L. gasseri by PFGE and was suggested to be linear on the basis of electrophoretic behavior [7]. However, its size and conformation were not confirmed. Given that many plasmid-profile studies of lactic acid bacteria [8] predated PFGE technology or did not employ conditions required to separate undigested replicons from the chromosome [4], it is possible that megaplasmids have a wider distribution in these bacteria than was previously recognized. The sequence of pMP118 reported here, in the context of the whole genome, uniquely illustrates the contribution of a megaplasmid to diverse metabolic and phenotypic properties of L. salivarius by both integrating with and extending chromosomally encoded features. The circular chromosome of L. salivarius UCC118 is the smallest Lactobacillus chromosome so far sequenced, 57.5 kb smaller than that of L. sakei and 165.5 kb smaller than that of L. johnsonii. The existence of a circular chromosome is the default expectation for a conventional bacterial genome.

The fact that pMP118 contains a repertoire of genes that likely confer metabolic flexibility, seen in the context of significant megaplasmid size variation in other strains, strongly suggests that the multireplicon genome architecture of L. salivarius bestows on the species a dynamic and flexible genetic complement. This architecture could be in response to dietary fluctuations in host species, flexible niches in the GI tract, or adaptation to different hosts.

Sequence analysis

The sequence analysis carried out suggests that the megaplasmid of Lactobacillus salivarius UCC118 and potentially other Lactobacillus salivarius strains is an autonomously replicating, stable repository for contingency genes that confer a selective advantage upon the strain for surviving and interacting with epithelial cells in the human gastrointestinal tract.

Probiotic properties

Other Lactobacillus salivarius strains were also examined for the presence of a megaplasmid (Fig. 3). All strains harboured a megaplasmid. The presence of the megaplasmid in a number of other strains of human origin with known probiotic properties (4) suggests that the megaplasmid contributes to the probiotic properties of Lactobacillus salivarius UCC118.

Analysis of the genes encoded by the megaplasmid reveals that many of the genes are likely to contribute to the probiotic properties of Lactobacillus salivarius UCC118. The functions of the genes were inferred by sequence homology to other proteins.

The megaplasmid is therefore believed to play an important role in the probiotic properties of Lactobacillus salivarius.

Biosynthetic Capabilities.

L. salivarius has an alanine dehydrogenase gene (EC 1.4.1.1; LSL_1768) located on the megaplasmid, which is unique among published Lactobacillus genomes but present in Lactobacillus casei and Lactobacillus delbrueckii draft genomes (www.jgi.doe.gov). This enzyme catalyzes the NAD+-dependent reversible reductive amination of pyruvate into alanine. It is most frequently found in the genus Bacillus and has been exploited for engineering of L. lactis to produce L-alanine [9]. By virtue of the pMP118-encoded enzyme, L-alanine can be synthesized from pyruvate.

Linked to this alanine dehydrogenase gene is a gene encoding a putative alanine permease (LSL_1767); both genes show elevated GC content (40.37% and 41.5% for LSL_1767 and LSL_1768, respectively), indicating lateral transfer.
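The GC comparison above can be reproduced with a short calculation. The following is a minimal sketch, not the annotation pipeline used for the genome; the function names are hypothetical, and the default genome average of 32.9% is the value reported for the UCC118 genome in the Examples.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases of a DNA sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]  # skip N and other codes
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def gc_excess(gene_seq: str, genome_average: float = 32.9) -> float:
    """Return gene GC% minus the genome-wide average GC% (default: 32.9%)."""
    return gc_content(gene_seq) - genome_average
```

A clearly positive excess, such as the roughly 7.5 to 8.6 percentage points implied by the figures quoted for LSL_1767 and LSL_1768, is one commonly used hint of lateral acquisition; it is not proof on its own.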

The megaplasmid also encodes a paralog (LSL_1927) for one of the two enzymes required for conversion of pyruvate to L-aspartate. The megaplasmid also harbors two genes (LSL_1931 and LSL_1932) predicted to encode the alpha and beta subunits of L-serine dehydratase (EC 4.3.1.17), which catalyzes pyruvate-serine interconversion. Serine formed from pyruvate can subsequently be converted to glycine by a chromosomally encoded enzyme. Serine may be thiolated to cysteine by CysK (EC 2.5.1.47; LSL_0026 and LSL_1718), and cysteine can be converted to methionine by using four chromosomally encoded enzymes. Genes whose products are predicted to synthesize aspartate from pyruvate, and lysine and threonine from aspartate, were also annotated. Unlike L. plantarum, L. salivarius appears to lack the genes required for synthesis of tryptophan and related amino acids.

Carbohydrate Metabolism and Transport.

L. salivarius is currently regarded as homofermentative [5], meaning that sugars can be fermented only via the Embden-Meyerhof-Parnas pathway, and genes for the complete glycolysis pathway are present in the chromosome. Interestingly, genes for the pentose phosphate pathway also were found in the L. salivarius UCC118 genome, suggesting that it should be grouped among the facultatively heterofermentative lactobacilli. The genome sequence suggested that L. salivarius UCC118 would be able to assimilate ribose. This suggestion was experimentally confirmed by growth on ribose as a sole carbon source, and, furthermore, we detected lactate, acetate, and ethanol by HPLC in the culture medium, confirming the heterofermentative status of this strain.

The two key enzymes of the pentose phosphate pathway, transketolase (LSL_1946) and transaldolase (LSL_1888, LSL_1947), are encoded by the megaplasmid (Fig. 9).

The gene products of pMP118 are not essential for nucleotide biosynthesis when L. salivarius UCC118 is grown on glucose, but the presence of an additional copy of a gene for ribose-5-phosphate isomerase on pMP118 (LSL_1806) may increase the flexibility and flux of this pathway. The presence of pMP118 would be essential for the growth of UCC118 if pentose were used as the sole carbon source. Growth on ribose, aided by genes resident on pMP118, may confer a competitive advantage on the strain when living in the human GI tract, because ribose might be an abundant carbon source in the GI tract because of RNA degradation. A gene on pMP118 encoding fructose biphosphatase (LSL_1903) and a chromosomally encoded phosphoenolpyruvate carboxykinase (LSL_0395) were identified. The first enzyme catalyzes the formation of fructose-6-phosphate, which is required for the synthesis of glucosamine-6-phosphate and its derivatives involved in peptidoglycan formation. The second enzyme catalyzes the formation of phosphoenolpyruvate by decarboxylation of oxaloacetate, while hydrolyzing ATP, a rate-limiting step in gluconeogenesis. Although L. sakei 23K has fructose biphosphatase and L. plantarum WCFS1 has phosphoenolpyruvate carboxykinase, L. salivarius UCC118 and L. casei ATCC334 (www.jgi.doe.gov) seem to be the only sequenced Lactobacilli that have both enzymes (with the assistance of pMP118 in L. salivarius), suggesting the presence of a functional gluconeogenesis pathway. When glucose is exhausted, as can be encountered in certain regions of the GI tract, the gluconeogenesis pathway might be activated. The presence of a complete pentose phosphate pathway to generate glyceraldehyde-3-phosphate, coupled with gluconeogenesis from pyruvate, could be an adaptation to pentose-based growth.

In addition, genes encoding enzymes involved in rhamnose and N-acetylneuraminic acid (sialic acid) catabolism as well as sorbitol utilization are present on pMP118. Collectively, the presence of pMP118 appears likely to increase the metabolic flexibility and, thus, competitiveness of L. salivarius UCC118.

The inferred properties of the Lactobacillus strain that are encoded by the megaplasmid include, but are not restricted to, the following (locus numbers from the genome sequence in parentheses):

• Sugar (rhamnose and sorbitol) utilization and sugar-phosphate metabolism (gene LSL_1752 (SEQ ID No. 3) and linked genes, and gene LSL_1890 (SEQ ID No. 4) and linked genes, respectively). This would confer a competitive advantage on the bacterium in the gut, allowing it to outgrow other bacteria including pathogens.


• An environmental sensing mechanism (two-component system; gene LSL_1803 (SEQ ID No. 5) and linked genes). This would allow the bacterium to react quickly to changes in the gut environment, and thus confer a competitive advantage.

• A plasmid transfer/conjugation operon (gene LSL_1812 (SEQ ID No. 6) and linked genes). This cluster of genes may promote dissemination of the megaplasmid to other Lactobacilli, or may govern the mobilization of the smaller plasmids pSF-20 and pSF-44, both of which have genes which render them mobilizable, but are incapable of mobilizing themselves as they lack the transfer genes. The megaplasmid could thus be a key element that may be exploited for introducing genes into other LAB, including by non-GMO methods such as protoplast fusion and conjugation.

• Potential adhesion molecules including LSL_1838 (SEQ ID No. 7), and LSL_1816 (SEQ ID No. 8), a putative fibrinogen-keratin binding protein.

• An alanine permease (LSL_1767 (SEQ ID No. 9)) and linked dehydrogenase that may have a role in amino acid metabolism, or acid resistance, both of which are relevant for GI tract colonization and survival.

• A bile salt hydrolase (LSL_1801 (SEQ ID No. 10)), which is expected to contribute to the resistance of Lactobacillus salivarius UCC118 to bile, conferring a survival and competitive advantage.

• An oligopeptide transport protein Opp (LSL_1882 (SEQ ID No. 11)), which is predicted to allow uptake of short-length amino acid chains from the external environment.

• A locus encoding a two-component bacteriocin Abp118.


The presence of a potential adhesin gene on pMP118 indicated that general adhesion mechanisms might be encoded by the complete genome.

Secreted proteins

The predicted secreted protein complement of L. salivarius UCC118 contains 119 proteins. Among these 119 proteins, only 10 are likely sortase-dependent proteins. The megaplasmid is predicted to encode eight of the secreted proteins. There is a single sortase gene srtA (LSL_1606) in the genome and genes encoding putative signal peptidase I (LSL_0876) and signal peptidase II (LSL_0825).

Sortase protein and adhesins of L. salivarius UCC118.

Many gram-positive bacteria produce cell surface proteins that are anchored on the cell surface by covalent linkage to peptidoglycan [10]. The enzyme that is responsible for this linkage of surface proteins is called sortase [10]. There are several classes of sortase enzymes. Some bacteria have several such sortase enzymes, while others, including Lactobacillus salivarius, have only one.

In Lactobacillus salivarius UCC118 the gene is srtA (LSL_1606). Other Lactobacilli whose genomes have been sequenced and published (L. acidophilus, L. plantarum, L. johnsonii) also have only one sortase gene [11-13].

The cell surface proteins that require the sortase enzyme for anchoring on the cell surface are referred to here as sortase-dependent proteins or SDPs. Sortase acts upon the SDPs by recognizing sequence patterns at the carboxy-terminal end of the SDP, which it then bonds to the peptidoglycan in the cell wall [14]. These sequence patterns can be identified in genome sequences by, for example, computer analysis [15].
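As an illustration of such a computer analysis, the sketch below scans the carboxy-terminal end of a protein sequence for a sortase recognition motif. It is a deliberately simplified stand-in: a regular expression covering the LPQTG and LPQMG motifs reported for UCC118, rather than the hidden Markov model used in the Examples. The pattern, the function name and the 50-residue tail length are all assumptions made for illustration.

```python
import re

# Simplified pattern: L, P, any residue, T or M, then G. It matches the LPQTG and
# LPQMG motifs named in the text; it is an assumption, not a full sortase-substrate model.
SORTASE_MOTIF = re.compile(r"LP.[TM]G")


def find_sortase_motif(protein: str, tail_length: int = 50):
    """Return (1-based position, motif) of the last match in the C-terminal tail, or None."""
    tail_start = max(0, len(protein) - tail_length)
    # Matches must start inside the C-terminal tail of the protein.
    matches = list(SORTASE_MOTIF.finditer(protein, tail_start))
    if not matches:
        return None
    last = matches[-1]
    return last.start() + 1, last.group()
```

A hit from a pattern of this kind is only a candidate; a real screen would also check for the hydrophobic stretch and charged tail that normally follow the motif.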


We found in the genome sequence of Lactobacillus salivarius UCC118 genes for 10 SDPs. The carboxy-terminal end of these proteins is conserved relative to SDPs in other Lactobacilli and indeed other gram-positive bacteria. However, the sequence of the rest of the SDP molecules, which protrude into the external environment, is much less conserved. The relationship of the SDPs of Lactobacillus salivarius UCC118 to other proteins is shown in Table 2.

Table 2 - Putative sortase-dependent proteins of L. salivarius UCC118 with relevant properties.

LSL_1832b; sapA; LPQMG; pMP118; 775; 85; Streptococcal surface protein A precursor (AAC44101), S. gordonii; 3e-75

LSL_1902b; hypothetical; LPQTG; pMP118; 49; 5.3; Hypothetical protein (XP_500168), Yarrowia lipolytica; 5.5

LSL_2020b; cna; LPQTG; pSF118-44; 646+325; na; Collagen binding precursor (ABA12809), L. paracasei subsp. paracasei; 0.0

a AA, amino acid; b for pseudogenes, the number of amino acids is indicated upstream and downstream of the internal stop codon, respectively; c bold font indicates a gene fragment; d na, not applicable; e BLAST hits were generated by comparing the 6-frame translation output of the target nucleotide sequence to a protein database using the BLAST-X algorithm; the protein accession number is indicated in parentheses, followed by the corresponding organism


Four of the eight SDP genes identified were found by bioinformatic analysis to be corrupted by mutations and are therefore apparently pseudogenes, or non-functional genetic remnants. The possibility remains, however, that these mutations are suppressed when the probiotic bacterium is in the host GI tract, and that the corresponding proteins are actually produced in vivo.

Four genes were found to be functional, in terms of transcription, translation, and absence of stop codons or other interruptions, and encoded the following predicted proteins (locus numbers from genome sequence in parentheses):

1. LspA (LSL_0311) SEQ ID No. 16

2. LspB (LSL_1085) SEQ ID No. 17

3. LspC (LSL_1335) SEQ ID No. 20

4. LspD (LSL_1838) SEQ ID No. 30

Adhesion is an important property of probiotic bacteria, as it allows them to interact with the host, and is thus implicated in colonization, immunomodulation and pathogen exclusion [16]. Cell surface proteins are the primary adhesion interface between bacteria and their host [17]. In probiotic bacteria such as Lactobacillus salivarius UCC118, impairment of SDP anchoring would result in reduced adhesion to epithelial cells. We therefore constructed a sortase gene knockout (KO), using standard genetic techniques, to determine the function of SDPs.

We found that SDPs contribute significantly to epithelial cell adhesion, indicating that LspA (SEQ ID No. 16) in particular, and potentially LspD (SEQ ID No. 30), are likely to be individually or collectively responsible for this phenotype.

Lactobacillus sakei is widely used for meat fermentation and food preservation, and thus is commonly used as a negative control for epithelial cell adhesion, as it is not a human GI tract-associated species [18]. The sortase gene KO strain, though significantly less adherent than the parental Lactobacillus salivarius UCC118 strain, still adhered at around 50% of the wild-type level, and at almost double the level shown by L. sakei (Fig. 4). This indicated that proteins other than SDPs contributed to the intestinal cell adherence properties of Lactobacillus salivarius UCC118.

These putative sortase-independent adhesion proteins are referred to as Sortase Independent Adhesins or SIAs. To compile a list of SIAs, bioinformatic approaches were used, including searching for cell export signals (SIGNAL-P), Transmembrane Helix Markov Models (TMHMM) and homology scores against known or suspected adhesins in other bacteria. Table 3 lists putative sortase-independent adhesins of Lactobacillus salivarius UCC118.

Table 3

1. "Hom." indicates homologue

2. Percentage identity of protein to homologue in last column, in overlapping sequences, and percentage similarity based on chemically similar amino acids/conservative substitutions

Potential applications of adhesion activity conferred by proteins identified in the Lactobacillus salivarius UCC118 genome include the following.


The adhesin protein or derivative, fragment, or recombinant products thereof may be used for improving gut barrier function and/or competitively excluding potential pathogens from binding to and/or invading epithelial cells. They may also be used for mediating adherence of microorganisms to epithelial cells.

The protein may be used for the preparation of a medicament for use in generating an immune response, for engineering hyperadhesive mutants or for the preparation of a medicament for use in regulating cell cycle and/or invasive behaviour of tumour cells. The invention will be more clearly understood from the following examples.

EXAMPLES

Lactobacillus salivarius is part of a distinct clade in 16S-based phylogenetic trees of the genus, well separated from other sequenced Lactobacillus species. To examine the phylogenomics of UCC118 we employed a shotgun approach to sequence the genome. Overall coverage of 10-fold was achieved. The currently assembled genome size is 2.07 Mb; it contains 2120 predicted coding regions and has an average GC content of 32.9%. The genome includes 76 tRNA genes and 7 rRNA operons. Four prophage remnants were detected, of which three were highly degenerated. A large number of genes potentially related to probiotic function and host interaction were inferred by homology.

Comparison of the UCC118 sequence to other Lactobacillus genomes reveals no long-range synteny with other species. The greatest number of shared orthologs is with Lactobacillus plantarum (54%). The UCC118 genome also showed orthology of 41% with Lactobacillus acidophilus, 40% with Lactobacillus johnsonii, 38% with Lactobacillus gasseri and 48% with Enterococcus faecalis. The genome sequence analysis, and its comparison with that of other Lactobacilli, provides a rational basis for understanding the metabolism and host interaction of Lactobacillus salivarius.


A deposit of a biologically pure culture of Lactobacillus salivarius strain UCC118 was made at the National Collections of Industrial and Marine Bacteria Limited (NCIMB) on November 27, 1996 and accorded the accession number NCIMB 40829. The strain of Lactobacillus salivarius is described in detail in WO98/35014.

Example 1 - Genome sequence determination for Lactobacillus salivarius UCC118.

In brief, total genomic DNA of Lactobacillus salivarius UCC118 was randomly sheared by sonication to produce fragments of 1.5 - 2.0 kb, and cloned into the plasmid vector pGEM-T for bulk sequencing. Pilot sequencing indicated over-representation of fragments derived from pSF-20 and pSF-44, two plasmids present in Lactobacillus salivarius UCC118. To circumvent this problem, the genomic DNA was partially denatured by treatment with alkali-sodium dodecyl sulphate; the denatured chromosomal DNA was recovered by low-speed centrifugation, leaving the plasmid DNA in solution. The enriched chromosomal DNA was recovered and used for plasmid bank construction and bulk sequencing. 10-fold coverage of the genome was achieved by sequencing approximately 38,000 reads. Cloning and DNA sequencing were performed under contract by MWG Biotechnology, Ebersberg, Germany.
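The coverage figure can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 2.07 Mb genome size is the assembled size reported above, and the mean read length is not stated in this description, so the value computed here is merely the length implied by the other three numbers.

```python
def fold_coverage(n_reads: int, mean_read_length_bp: float, genome_size_bp: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * mean_read_length_bp / genome_size_bp


def implied_read_length(coverage: float, genome_size_bp: int, n_reads: int) -> float:
    """Mean read length (bp) implied by a given coverage, genome size and read count."""
    return coverage * genome_size_bp / n_reads


# About 38,000 reads at 10-fold coverage of a 2.07 Mb genome imply reads of roughly 545 bp.
read_length = implied_read_length(10, 2_070_000, 38_000)
```

A mean usable read length of this order is plausible for the 1.5 - 2.0 kb insert library described, but it should be treated as a back-calculation rather than a reported value.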

The initial DNA sequence information was assembled into 110 scaffolds, organized into 18 contigs. To orient and link these contigs and to complete the genome sequence, approximately 500 inverse PCR reactions and 70 direct PCR reactions were performed. An additional 540 sequencing reactions were completed. Sequence data were assembled using Staden software, with in-built Phred and Phrap modules. Genes were predicted by a combination of standard openly available software packages comprising Orpheus, Glimmer, ZCurve 1.0, and Critica. These programs were run locally on a Linux-based server.


Sequence analysis of the megaplasmid also showed that it had an asymmetric GC skew pattern (Fig. 1), which is typical of megaplasmids, especially those that are conjugative [19].

Example 2 - DNA fragment analysis by Pulsed Field Gel Electrophoresis and Southern Hybridization.

To confirm that the assembly of the Lactobacillus salivarius UCC118 genome sequence was correct, and to verify the existence of the megaplasmid by an independent method, we subjected total genomic DNA preparations from Lactobacillus salivarius UCC118 to Pulsed-Field Gel Electrophoresis (PFGE), in combination with Southern hybridization. This analysis (Fig. 2) confirmed the existence of a 242 kb plasmid in Lactobacillus salivarius UCC118.

Standard methodology was employed. Low melting point agarose, PFGE-certified agarose, and λ DNA PFGE marker were all purchased from Bio-Rad Laboratories. Sarkosyl (N-lauroylsarcosine), lysozyme, proteinase K, mutanolysin, and phenylmethylsulfonyl fluoride (PMSF) were purchased from Sigma-Aldrich. Aspergillus oryzae S1 nuclease was purchased from Roche. Agarose gel plugs of high molecular weight DNA for PFGE were prepared by a modification of a published protocol. Bacteria were grown in MRS broth until they reached stationary phase. A 1.5 ml volume of cells was centrifuged (10,000 g, 1 min), washed once with 1 ml NT buffer (1 M NaCl, 10 mM Tris-HCl, pH 7.6) and re-pelleted. The cell pellet was resuspended in 450 μl NET buffer (1 M NaCl, 100 mM EDTA, 10 mM Tris-Cl, pH 7.6). Immediately after resuspension, an equal volume of melted 2% (w/v) low melting point (LMP) agarose, prepared in 0.125 M EDTA (pH 7.6) and maintained at 50°C, was added. Cell suspension and LMP agarose were mixed carefully without introducing bubbles. Gel plugs were formed by pipetting 300-μl volumes into plug molds and allowed to solidify at 4°C. Up to ten plugs per strain were added to 5 ml of NET buffer containing 1% (w/v) sarkosyl, 10 mg/ml lysozyme, and 40 U/ml mutanolysin, and incubated at 37°C for 24 h. The lysozyme solution was replaced with 5 ml of 0.5 M EDTA (pH 8.0) containing 1% (w/v) sarkosyl and 0.5 mg/ml proteinase K, and the plugs were incubated at 37°C for 24 h. This step was repeated with a fresh proteinase K solution. Plugs were treated with 10 ml TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) containing 1 mM PMSF (freshly prepared) at 37°C for 1 h to inactivate the proteinase. This was followed by two 1-h incubations in 10 ml TE buffer at room temperature to remove the PMSF. Plugs were stored in 10 mM Tris-HCl, 100 mM EDTA (pH 8.0) at 4°C.

Prior to incubation with selected restriction enzymes, gel plugs were cut into 3-mm slices with a sterile glass coverslip and soaked three times for 15 min in 1 ml 10 mM Tris-HCl, 0.1 mM EDTA (pH 8.0) at room temperature. Each slice was pre-incubated with 100 μl of the restriction buffer recommended for the enzyme (New England Biolabs) for 30 min at room temperature; the buffer was then replaced with 100 μl of fresh buffer containing 20 units of restriction enzyme. Restriction digests were carried out overnight at temperatures recommended by the supplier.

For S1 nuclease treatment, single slices were soaked at room temperature for 30 min in 200 μl S1 buffer (50 mM NaCl, 30 mM sodium acetate (pH 4.5), 5 mM ZnSO4). The S1 buffer was replaced with another 200 μl S1 buffer containing 1 unit of Aspergillus oryzae S1 nuclease, and the slices were incubated at 37°C for 30 min. The reaction was stopped by transferring the slices to 200 μl of 0.5 M EDTA (pH 8.0) on ice.

Plug slices were loaded directly into the wells of a 1% (w/v) pulsed-field grade agarose gel melted in 0.5 x TBE (89 mM Tris-borate, 2 mM EDTA, pH 8.3) running buffer. The wells were sealed with melted 1% LMP agarose. DNA fragments were resolved using a CHEF-DR II pulsed-field system (Bio-Rad Laboratories) at 13 V/cm for 24 hours with 0.5 x TBE running buffer maintained at 10°C. Linearly ramped pulse times were selected depending on the size of DNA fragments to be resolved. A frequently used protocol is 3 seconds to 80 seconds. Gels were stained in distilled water containing 0.5 μg/ml ethidium bromide for 60 min under dark conditions.
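A linear ramp means that the pulse (switch) time rises in proportion to elapsed run time. The sketch below interpolates between the 3-second and 80-second limits over the 24-hour run quoted above; the function name is hypothetical, and treating the ramp as a simple straight line over the whole run is an assumption about how the instrument applies it.

```python
def switch_time(elapsed_h: float, initial_s: float = 3.0,
                final_s: float = 80.0, run_h: float = 24.0) -> float:
    """Pulse time in seconds at a given elapsed run time, assuming a straight-line ramp."""
    if not 0 <= elapsed_h <= run_h:
        raise ValueError("elapsed time must lie within the run")
    return initial_s + (final_s - initial_s) * elapsed_h / run_h
```

Under this assumption the pulse time halfway through the run is 41.5 seconds; longer pulse times toward the end of the run favour the resolution of larger fragments.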

Gels were depurinated for 10 min in 0.2 M HCl, denatured for 30 min in 0.5 M NaOH, 1.5 M NaCl, neutralized for 45 min in 0.5 M Tris (pH 7.5), 1.5 M NaCl, transferred by capillary action overnight to Hybond-N+ nylon membranes (Amersham Biosciences), and cross-linked to the membrane with UV light. For detection of the megaplasmid from Lactobacillus salivarius strains by Southern hybridization, membranes were probed with a 410-bp PCR product, referred to as abpα+β. This PCR product, covering the genes encoding the α and β peptides of the previously identified bacteriocin ABP118, was amplified from the total genomic DNA of strain UCC118 using forward primer YL007 (5'-AAGGAATTTACAGTATTGACAG-3') [SEQ ID No. 12] and reverse primer YL008 (5'-ACGGCAACTTGTAAAACCA-3') [SEQ ID No. 13]. 100 nanograms of probe DNA was labeled with the enzyme horseradish peroxidase according to the instructions of the ECL direct nucleic acid labeling and detection kit (Amersham Biosciences). Membranes were pre-hybridized in 10 ml ECL hybridization buffer containing 5% blocking agent and 0.3 M NaCl at 42°C for 30 min, after which the labeled probe was added to the pre-hybridization buffer. Hybridization was done at 42°C for 16 h. Membranes were washed three times for 20 min in 6 M urea, 0.4% SDS, 0.1x SSC (20x SSC: 0.3 M sodium citrate, 3 M NaCl, pH 7.0) at 42°C and three times for 5 min in 2x SSC at room temperature. Autoradiographs were produced by exposing Hyperfilm ECL for 1 h to 16 h at room temperature.
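The expected size of a PCR product such as the abpα+β probe can be checked in silico once a template sequence is available. The sketch below is a minimal illustration, not part of the described method: the function names are hypothetical, only exact primer matches on the given strand are considered, and the primers are the YL007 and YL008 sequences quoted above, written without internal spaces.

```python
YL007 = "AAGGAATTTACAGTATTGACAG"  # forward primer, SEQ ID No. 12
YL008 = "ACGGCAACTTGTAAAACCA"     # reverse primer, SEQ ID No. 13


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template: str, forward: str, reverse: str):
    """Length of the product bounded by exact primer matches, or None if not found."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so search for its reverse complement.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start
```

Run against the pMP118 sequence (SEQ ID No. 1), a function of this kind would be expected to return 410 if the primers match exactly; that expectation follows from the product size stated above and has not been recomputed here.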

Example 3 - Detection of megaplasmids in other L. salivarius strains

Four other Lactobacillus salivarius strains were investigated for the presence of the megaplasmid by PFGE and Southern hybridization. The published ability of the single-strand-specific endonuclease S1 to linearize megaplasmids, and thus convert them to a more abundant linear form, was exploited [4]. All four strains harboured a megaplasmid (Fig. 3), detected either by DNA staining or by hybridization with an Abp118-based probe (the gene for Abp118 is on the Lactobacillus salivarius UCC118 megaplasmid). The megaplasmids of strains 43310 [NCIMB 41093] and 43324 [NCIMB 41044] were judged to be the same size as the megaplasmid of the invention, according to migration in PFGE. The megaplasmids of strains 4231 [NCIMB 41047] and 4331 were smaller, estimated at around 200 kb. These strains are described in detail in WO03/010298.

Example 4 - Gene knock-out construction by a two-plasmid integration strategy.

In brief, the regions flanking the sortase gene srtA (SEQ ID No. 30) of Lactobacillus salivarius UCC118 were amplified by PCR, ligated, and cloned into the pORI19 vector [20]. The resulting plasmid therefore contained a piece of DNA homologous to both sides of the srtA gene, but lacked the srtA gene itself. The plasmid was introduced into Lactobacillus salivarius UCC118 cells harboring the plasmid pVE6007, which has a temperature-sensitive origin of replication [20]. Integrants were selected by culturing the transformants in MRS broth plus erythromycin at 44°C, a restrictive temperature for pVE6007 replication. This produced a single cross-over, on one side of the chromosomal srtA gene. The second cross-over, resulting in loss of the integrated plasmid and the srtA gene, was isolated by serial subculture in MRS in the absence of erythromycin. The resulting mutant has lost the srtA gene to the precise extent defined by the original amplified fragments, and its genomic structure was confirmed by Southern hybridization and PCR.

For inactivating the lspA, lspB and lspD genes, a plasmid integration strategy based upon gene fragments internal to the coding sequence of the respective gene was employed [20]. The plasmid integration event was performed as for the srtA clean deletion strategy above, but only a single cross-over was obtained, to produce a strain with pORI19 integrated into the respective gene. Mutations were verified by PCR with primers spanning from flanking DNA to the inserted plasmid sequence.

The sortase KO thus lacks the sortase gene, but is otherwise identical to the parental Lactobacillus salivarius UCC118 strain.

Example 5 - Assessment of adhesion of Lactobacillus salivarius UCC118 strains and derivatives to epithelial cells in vitro.

HT-29 enterocytic cell lines [21] were cultured as monolayers in DMEM (Dulbecco's modified Eagle's medium; Invitrogen) supplemented with 10% (w/v) foetal calf serum (Invitrogen). Cells were grown in 75 cm² tissue culture flasks (Costar, Cambridge, MA, USA) at 37°C in a humidified atmosphere containing 5% CO₂. At 95% confluency the monolayers were passaged by incubating with 0.25% trypsin (Invitrogen) for 10 min at 37°C. The adhesion of the strains was examined using a modified version of a previously described method. In brief, HT-29 cells were seeded at 1 x 10⁶ cells/well in a 6-well plate (Greiner) and grown for 10-12 days until confluence. Prior to the assay, epithelial cells were washed once in serum-free DMEM. Bacteria were washed once in sterile PBS and suspended to 1 x 10⁸ CFU/ml in PBS. A 1 ml volume of bacterial suspension was combined with 1 ml serum-free DMEM and added to the epithelial cells (MOI 50:1), and bacteria were allowed to adhere for 30 min at 37°C in a humidified atmosphere containing 5% CO₂. Unadhered bacteria were removed by washing the cells five times using sterile PBS. Epithelial cells with adhering bacteria were scraped off using a rubber scraper (Greiner) and resuspended in 1 ml PBS. Serial dilutions were plated on MRS agar (Oxoid) and percentage adhesion was determined. Adhesion of mutants was expressed relative to that of wild type, where wild type was set to 100%.
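The adhesion figures can be derived from plate counts as sketched below. This is an illustration under stated assumptions: the function names are hypothetical, and percentage adhesion is taken to mean recovered CFU as a percentage of the CFU added, which the description does not spell out.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """CFU/ml of the undiluted suspension from a colony count on one dilution plate."""
    return colonies * dilution_factor / plated_volume_ml


def percent_adhesion(recovered_cfu: float, inoculum_cfu: float) -> float:
    """Adherent bacteria recovered, as a percentage of the bacteria added (assumed definition)."""
    return 100.0 * recovered_cfu / inoculum_cfu


def relative_adhesion(mutant_percent: float, wild_type_percent: float) -> float:
    """Mutant adhesion relative to wild type, with wild type set to 100%."""
    return 100.0 * mutant_percent / wild_type_percent
```

For example, with purely hypothetical counts, a mutant adhering at 2.5% against a wild type adhering at 5.0% would be reported as 50% relative adhesion.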

Adhesion of the Lactobacillus salivarius UCC118 sortase gene KO to epithelial cells grown in tissue culture was reduced reproducibly and statistically significantly, compared to the parental strain (Fig. 4; p = 0.004 by paired Student's t test). This demonstrates that SDPs contribute to the adhesion of Lactobacillus salivarius UCC118, in a widely accepted and published laboratory model for human GI tract adhesion.

Mutants were therefore created that lacked expression of the genes lspA, lspB and lspD by the plasmid pORI19 integration strategy. Adhesion to intestinal epithelial cells was not significantly altered in either the lspB or lspD mutants. However, a significant reduction (p = 0.0060; paired Student's t test, one-tailed distribution) was observed for the mutant lacking expression of the lspA gene (Fig. 5), while a consistent, but statistically not significant, reduction was observed for the lspB mutant.
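The paired Student's t test used in these comparisons reduces to a one-sample test on the per-experiment differences. The sketch below computes only the t statistic and its degrees of freedom using the Python standard library; the function name is hypothetical, and converting t to a p value requires the t distribution from a statistics package, which is deliberately left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(wild_type: list[float], mutant: list[float]) -> tuple[float, int]:
    """Return (t, degrees of freedom) for paired measurements from the same experiments."""
    if len(wild_type) != len(mutant) or len(wild_type) < 2:
        raise ValueError("need at least two paired observations of equal length")
    diffs = [w - m for w, m in zip(wild_type, mutant)]
    spread = stdev(diffs)  # sample standard deviation of the paired differences
    if spread == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (spread / sqrt(len(diffs)))
    return t, len(diffs) - 1
```

Pairing is appropriate here because wild type and mutant are assayed side by side in each experiment, so day-to-day variation in the cell monolayers cancels in the differences.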

Example 6 - Identification of cell wall-anchored proteins.

Using the annotated genome sequence of L. salivarius UCC118, a bioinformatic approach was employed to identify secreted proteins, including those predicted to be cell wall anchored. The results were compared to the data from parallel analyses of the genomes of L. plantarum WCFS1, L. acidophilus NCFM, L. johnsonii NCC 533, and L. sakei 23K (Table 4).

Table 4. Genome-wide survey of cell wall anchored proteins in L. salivarius UCC118 and comparison with available Lactobacillus genomes

For each genome, "No." is the number of proteins with the feature; ">60", "30-60" and "<30" are the numbers of those proteins at the corresponding BLAST-NR % identity cut-offs.

                                   | L. salivarius UCC118 (a) | L. plantarum WCFS1*      | L. johnsonii NCC 533*    | L. acidophilus NCFM      | L. sakei 23K
Export feature                     |   No.   >60  30-60   <30 |   No.   >60  30-60   <30 |   No.   >60  30-60   <30 |   No.   >60  30-60   <30 |   No.   >60  30-60   <30
Signal sequence (b)                |   124    27     70    27 |   217    30    151    36 |   128    93     27     8 |   173    60     78    35 |   144    35     78    31
SPaseI cleavage (c)                |    44     6     27    11 |    86    12     51    23 |    31    21      9     1 |    56    10     28    18 |    50    10     25    15
Cell wall anchored
N or C terminal anchored proteins  |    80    18     41    21 |   131    20     99    12 |    97    72     18     7 |   117    53     47    17 |    94    24     54    16
LPXTG anchors (d)                  |    10     0      5     5 |    27     0     15    12 |    16     3      9     4 |    12     1      6     5 |     4     0      2     2
Lipoprotein anchors (e)            |     3     1      2     0 |     3     0      3     0 |     1     1      0     0 |     5     3      0     2 |     2     0      2     0
Choline binding domain (f)         |     0                    |     0                    |     0                    |     0                    |     0
Peptidoglycan binding domain (f)   |     0                    |     1     0      1     0 |     1     0      1     0 |     0                    |     0
GW repeats (g)                     |     4     1      2     1 |    11     3      5     3 |     1     1      0     0 |     3     0      3     0 |     3     1      1     1
LysM domain (f)                    |     9     0      6     3 |    11     6      5     0 |     1     0      1     0 |     1     0      1     0 |     4     1      2     1
WxL domain (h)                     |     1     0      1     0 |     6     0      3     3 |     0                    |     0                    |     9     0      7     2

* organisms in which proteins with LPXTG-anchors were previously identified by Boekhorst et al.; (a) values tabulated include pseudogenes; (b) signal sequence prediction using Hidden-Markov-Model in SignalP3.0 with P > 0.95 as cut-off; (c) cleavage site prediction using Neural Network Model in SignalP3.0 with Cmax > 0.52 and Ymax > 0.32 as cut-offs; (d) LPXTG-anchors as predicted by HMM for sortase-substrates; (e) lipoprotein prediction as described by Sutcliffe et al.; (f) Pfam database with cut-off E-value of E < 10⁻⁵; (g) manual screening for presence of GW residues in repetitive regions; (h) prediction based on presence of [LI]TW[TS]L-motif in C-terminal sequence
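The cut-offs listed in the footnotes to Table 4 translate directly into simple filters. The sketch below assumes that the SignalP scores have already been parsed into plain numbers; it does not call SignalP, the function names are hypothetical, and the 60-residue window used for the C-terminal WxL search is an illustrative assumption.

```python
import re

WXL_MOTIF = re.compile(r"[LI]TW[TS]L")  # motif quoted in the Table 4 footnotes


def is_predicted_secreted(hmm_probability: float) -> bool:
    """Signal sequence call: SignalP HMM probability strictly above 0.95."""
    return hmm_probability > 0.95


def has_spase1_site(c_max: float, y_max: float) -> bool:
    """Signal peptidase I cleavage call: Cmax > 0.52 and Ymax > 0.32."""
    return c_max > 0.52 and y_max > 0.32


def has_wxl_motif(protein: str, tail_length: int = 60) -> bool:
    """True if the motif occurs within the last tail_length residues (assumed window)."""
    return WXL_MOTIF.search(protein[-tail_length:]) is not None
```

Applying the three filters to every predicted protein and counting the hits per genome would give tallies of the kind shown in the corresponding rows of the table.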


L. salivarius UCC118 possesses the second-largest genome (2.13 Mb) of the fully sequenced lactobacilli and is the only sequenced Lactobacillus strain harboring a megaplasmid [22]. Using SignalP3.0, we identified 119 proteins predicted to be secreted, the majority (108) of which are encoded by the chromosome. Eight are encoded by the megaplasmid (pMP118), two are encoded by the 44-kb plasmid pSF118-44, and one is encoded by the 20-kb plasmid pSF118-20. Deduced products of an additional five pseudogenes were predicted to be secreted, with two encoded by the chromosome and three encoded by pMP118. Of the 119 proteins, 44 were predicted to be cleaved by signal peptidase I, and 3 were predicted to be cleaved by signal peptidase II. The majority of secreted proteins will remain associated with the cell membrane (Table 4). For other Lactobacillus genomes analyzed, the distributions of identity levels for existing database entries were similar, with a preponderance of proteins with values centrally distributed around identities of 30 to 60% (Table 4).

Example 7 - Identification of sortase substrates

We used a combination of manual inspection of proteins predicted to be secreted and the hidden Markov model of Boekhorst et al. [23] to identify sortase substrates. The hidden Markov model was also used to search the genomes of L. sakei and L. acidophilus, whereas sortase substrates for L. plantarum and L. johnsonii have been described previously [23]. We identified 10 proteins containing sortase substrates in L. salivarius (Table 2 above), one of which is encoded by the previously characterized 44-kb plasmid pSF118-44 [24]. Four of these proteins are encoded by pMP118, and five are encoded by chromosomal genes. Two SDPs are theoretical products of gene fragments, and four theoretical proteins were derived from pseudogenes caused by interruption with an internal stop codon or a frameshift. One of the pseudogenes, designated LSL_0152, encodes a protein which shares 30% identity with the mucus-binding protein Mub of L. reuteri. LSL_0152 is interrupted by a stop codon (TGA) at nucleotide position 499. Another pseudogene, LSL_1319, shows 21% identity to the R28 protein of Streptococcus pyogenes, which is involved in binding to epithelial cells. The DNA sequence of LSL_1319 is interrupted by a stop codon (TAA) at nucleotide position 667. The third pseudogene, LSL_2020b, is located on pSF118-44. The encoded protein shares 25% identity with the collagen adhesin of Staphylococcus aureus. LSL_2020b is interrupted after 1,938 base pairs by the stop codon TAA. For these pseudogenes, there is no evidence of a second ribosome binding site with a start codon, which could lead to translation of the distal fragment of the gene. Whereas the other pseudogenes are disrupted by a stop codon, we identified a pseudogene (LSL_1774b) with a frameshift in its sequence which introduced a stop codon.
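Internal stop codons of the kind described for these pseudogenes can be located by reading the coding sequence codon by codon from its start. The sketch below is a minimal illustration with a hypothetical function name; it assumes the sequence begins at the first base of the start codon and uses the three standard stop codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(cds: str):
    """Return (1-based nucleotide position, codon) of the first in-frame stop, or None."""
    cds = cds.upper()
    # Step through complete codons only, in the reading frame set by the first base.
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CODONS:
            return i + 1, codon
    return None
```

A position such as 499 or 667 is consistent with an in-frame stop, since both are one more than a multiple of three; a stop found well before the expected end of the gene marks the sequence as a candidate pseudogene.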

The product of LSL_1774b is homologous (32% identity) to a 1,480-amino-acid proteinase (PrtR) of the human isolate L. rhamnosus BGT10. Apart from the pseudogenes, two gene fragments harboring a sortase recognition sequence were also identified (Table 5). Both gene fragments are located on pMP118. LSL_1832b is a 2.3-kb gene fragment whose derived amino acid sequence harbors an LPQMG sortase recognition motif. The fragment is homologous (17% identity) to the C-terminal region of a 1,575-amino-acid salivary agglutinin-binding protein of Streptococcus gordonii. The smallest gene fragment harboring a sortase recognition motif is LSL_1902b (147 base pairs). It has no homology with proteins in the nonredundant BLAST database.

Apart from the six interrupted/partial genes containing sortase recognition motifs, we identified four predicted sortase-anchored proteins which are intact, designated Lactobacillus surface proteins A, B, C, and D (LspA, LspB, LspC, and LspD, respectively) (Table 2 above).

LspA (LSL_0311) is a 1,209-amino-acid protein which contains seven repeats of 79 amino acids (R1 to R7) (Fig. 9). R1 and R7 are the least conserved repeats, sharing 73% identity, whereas R2 to R6 are more conserved, sharing 92% identity. Pfam analysis revealed that each of these repeats is similar to mucus-binding domains (PF06458), with E values ranging from 10^-1 to 10^-6 but with all scores being above the gathering threshold. BLAST-NR searches did not reveal homology to a functionally characterized protein; the closest homologue is a hypothetical protein of Streptococcus suis (ZP_00874951), as shown in Fig. 9.

LspB (LSL_1085) is an 827-amino-acid protein (Fig. 9) containing an LPQMG cleavage motif. Three 13-amino-acid repeats were identified at the C-terminal end of the protein. The repeats are 100% identical, and Pfam analysis revealed no predicted function. The top BLAST hit for LspB is an enterococcal surface protein (Esp) of Enterococcus faecium (AAQ89938) which has no assigned function (Fig. 9).

LspC (LSL_1335) is 785 amino acids in size and has four repeats of 97 amino acids (Fig. 9). There is over 98% identity among these repeats, and their sequences are similar to those of mucus-binding domains, as predicted by Pfam analysis, with E values ranging between 10^-3 and 10^-4 but with all scores being above the gathering threshold. It is homologous to the 3,269-amino-acid mucus-binding protein (Mub [AAF25576]) previously characterized in L. reuteri (42). This protein has two types of repeats. One set of repeats is divergent, with 15 to 85% identity, whereas the second set of repeats is conserved, displaying 91% identity. Both types have been shown to be involved in binding to mucin components (42). The four repeats of LspC show a higher sequence identity to the divergent repeats of Mub (13% identity), whereas there is very low identity (5% identity) to the conserved repeats of Mub.

LspD (LSL_1838) is encoded by pMP118 and consists of 493 amino acids (Fig. 9). No repeats were identified, and the top BLAST hit is a hypothetical protein of the fungus Magnaporthe grisea (15.4% identity). Similar homology was noted for a sortase-dependent hypothetical protein of Streptococcus agalactiae (NP_735436). Since LspD is plasmid encoded, it is noteworthy that there is 15% identity to PrgA, a hypothetical surface exclusion protein of Enterococcus faecalis. Surface exclusion proteins block the conjugative transfer of plasmids to cells bearing identical or closely related plasmids.
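The repeat comparisons above are expressed as percent identity. The following minimal sketch shows one common way to compute such a value for two pre-aligned sequences of equal length. It is an assumption that identity is counted over aligned, non-gap columns; the text does not state which definition or alignment tool was used, and the function name and the two short peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-').

    Both inputs must already be aligned and therefore of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical 10-residue repeats that differ at one position.
repeat_1 = "MKTLPQAGDS"
repeat_2 = "MKTLPQTGDS"
print(percent_identity(repeat_1, repeat_2))
```

Other definitions divide by the length of the shorter sequence or by the full alignment length including gaps, so values from different tools are not always directly comparable.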

NATI68/CλVO/IE

Example 8 - Construction of an isogenic sortase mutant.

Previous studies targeting the sortase gene have shown that sortase-dependent proteins play a role in adhesion and virulence in a range of organisms. In order to investigate whether a sortase-dependent protein(s) in L. salivarius UCC118 is involved in adhesion, we constructed a mutant strain lacking the sortase gene (LSL_1606). The small size of sortase did not allow us to disrupt the gene by plasmid integration, and we therefore opted for a gene deletion, using a double-crossover strategy. Upstream and downstream flanking regions of 772 bp and 818 bp, respectively, were amplified. The upstream flanking amplicon includes the first 13 codons of the sortase gene, whereas the downstream flanking amplicon includes the last 3 codons. Both flanking amplicons were joined by SOE-PCR and cloned into pORI19. The resultant recombinant plasmid, pLS001, was transformed into L. salivarius UCC118 harboring pVE6007, and a double-crossover mutant was obtained. The deletion of the sortase gene in strain UCC118 was verified by Southern hybridization (Fig. 10). A PCR using wild-type genomic DNA with the primer pair JP144-JP149 resulted in a 1.1-kb amplicon which was used as a probe. Genomic DNAs of both the wild-type strain and the sortase mutant were digested with XhoI. The hybridization patterns showed bands of 5.8 kb and 5.1 kb for the wild-type and mutant strains, respectively (Fig. 10). An XbaI-XhoI double digest produced bands of 3.6 kb and 2.9 kb for the wild-type and mutant strains, respectively, confirming the deletion of sortase. The strain lacking the sortase gene was designated UCC118ΔsrtA.
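The Southern hybridization result can be checked with simple arithmetic: if the same region is deleted, the band shift between wild type and mutant should be the same in both digests. The following minimal sketch uses only the band sizes and codon counts given above; the variable and function names are hypothetical, and the band sizes are gel estimates quoted to 0.1 kb, so the computed shift is approximate.

```python
# Band sizes in kb as (wild type, mutant), taken from the text above.
bands_kb = {
    "XhoI": (5.8, 5.1),
    "XbaI-XhoI": (3.6, 2.9),
}


def band_shift_bp(wild_type_kb: float, mutant_kb: float) -> int:
    """Approximate size of the deleted region in bp, from a band shift in kb."""
    return round((wild_type_kb - mutant_kb) * 1000)


shifts = {digest: band_shift_bp(*sizes) for digest, sizes in bands_kb.items()}

# Codons of the sortase gene retained by the flanking amplicons (13 + 3).
retained_bp = (13 + 3) * 3

for digest, shift in shifts.items():
    print(f"{digest}: band shift of about {shift} bp")
print(f"Sortase coding sequence retained in the mutant: {retained_bp} bp")
```

Both digests give the same shift of about 0.7 kb, which is what a single deletion at the sortase locus would be expected to produce.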

Example 9 - The sortase mutant has reduced adhesion to epithelial cells.

Following the construction of UCC118ΔsrtA, we tested the strain for adhesion to intestinal epithelial cells. UCC118ΔsrtA adhered significantly less to HT29 cells (P = 0.04) than the wild-type strain did (Fig. 5A). We also employed a semiquantitative real-time PCR method to validate the viable count method (Fig. 5B). The adhesion of the UCC118ΔsrtA mutant was also significantly reduced (P = 0.04) as measured by this method, at 61% of the level of the wild-type strain. The adhesion of UCC118ΔsrtA to Caco-2 cells was also reduced significantly (68%; P = 0.007) compared to that of the wild-type strain, as determined by real-time PCR, but this reduction was less than that observed for the sortase gene mutant on HT29 cells. Collectively, these data indicate that one or more sortase-dependent proteins are involved in adhesion to human epithelial cells.

Example 10 - Transcriptional analysis of sortase-dependent proteins.

We employed endpoint reverse transcription-PCR (RT-PCR) to test if genes encoding sortase-dependent proteins in L. salivarius UCC118 were expressed in vitro when the strain was cultured in MRS broth. RNA was prepared from stationary-phase cells and reverse transcribed. For each target gene, internal primers were designed, and for the three pseudogenes, primers were designed upstream and downstream of the internal stop codon. After 50 cycles of PCR, no gene expression was detected for the chromosomally located gene lspC and the pseudogene LSL_2020b, which is carried on the previously described 44-kb plasmid pSF118-44 (41). The remainder of the sortase-dependent proteins were expressed (Fig. 11). Previously, it was reported that the adhesion of L. salivarius UCC118 to HT29 cells is growth phase dependent (59), with a significant increase upon entry into stationary phase. We therefore investigated whether there was differential gene expression of sortase-dependent proteins in the two growth phases by performing real-time PCR, using RNA isolated from cells in the respective growth phases. All of the genes except the pseudogene LSL_1319 were transcribed at higher levels in stationary phase than in logarithmic phase (Table 6). The pseudogene LSL_0152 was transcribed at 2- to 2.5-fold higher levels in stationary phase than in logarithmic phase, while the transcription of lspB and lspD increased dramatically (Table 6). No gene expression was detected for lspC and LSL_2020b in logarithmic-phase cells (data not shown).

Table 6 - Differential expression of genes encoding sortase-dependent proteins in L. salivarius UCC118

Gene            Fold up-regulation ± SEM (stationary phase/log phase) a

lspA            1.7 ± 0.2

lspB            5.2 ± 2.5

lspD            10.5 ± 3.0

LSL_0152 (1)    2.0 ± 0.6

LSL_0152 (2)    2.5 ± 0.4

LSL_1319 (1)    1.2 ± 0.1

LSL_1319 (2)    0.8 ± 0.3

a Values tabulated represent fold up-regulation ± standard error of the mean (SEM) in stationary-phase cells compared to logarithmic-phase cells, relative to the constitutively expressed 16S rRNA gene. Data shown are averages of three independent experiments. Numbers in parentheses following gene names indicate expression upstream of the internal stop codon (1) or downstream of the internal stop codon (2).
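The values in Table 6 are fold changes normalized to the 16S rRNA gene and averaged over three experiments. The text does not state how the fold changes were calculated. The following minimal sketch assumes the widely used 2^-ΔΔCt calculation purely for illustration; the function names and all threshold-cycle (Ct) values are hypothetical and are not data from this work.

```python
import math
from statistics import mean, stdev


def fold_change(ct_target_stat: float, ct_ref_stat: float,
                ct_target_log: float, ct_ref_log: float) -> float:
    """Fold up-regulation in stationary phase relative to log phase.

    Each target Ct is first normalized to the reference gene (here, 16S rRNA)
    measured in the same sample; the two normalized values are then compared.
    """
    delta_stat = ct_target_stat - ct_ref_stat
    delta_log = ct_target_log - ct_ref_log
    return 2.0 ** -(delta_stat - delta_log)


def mean_and_sem(values):
    """Mean and standard error of the mean; needs at least two values."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical Ct values for three independent experiments:
# (target stationary, 16S stationary, target log, 16S log)
experiments = [
    (20.0, 10.0, 22.0, 10.0),
    (20.5, 10.0, 22.0, 10.0),
    (21.0, 10.0, 22.0, 10.0),
]
folds = [fold_change(*cts) for cts in experiments]
avg, sem = mean_and_sem(folds)
print(f"fold up-regulation: {avg:.1f} ± {sem:.1f}")
```

Under this convention a value above 1 indicates higher transcription in stationary phase and a value below 1 indicates lower transcription, as for the downstream region of LSL_1319 in Table 6.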

Example 11 - Role in adherence of LspA, LspB, and LspD.

Transcriptional analysis showed that the expression of lspC was not detected after 50 cycles of PCR, and we therefore omitted this gene as a target for disruption. Internal gene fragments of lspA, lspB, [...] 0.56). These data were corroborated by the semiquantitative PCR assay (Fig. 12B), by which statistical significance was detected only for the adhesion reduction of the lspA knockout strain. Adhesion to Caco-2 cells was also significantly reduced for the UCC118 lspA knockout strain (11%; P = 0.009) but was not significantly reduced for the lspB and lspD mutants (92% and 94%, respectively, as determined by real-time PCR [data not shown]).

The chromosome of L. salivarius UCC118 potentially encodes 108 secreted proteins, which comprise 6.2% of the chromosomally located open reading frames (ORFs). This is lower than the proportions for L. acidophilus, L. plantarum, L. sakei, and L. johnsonii, all of which devote 7% of their coding capacities to secreted proteins. Interestingly, there are only eight ORFs identified on pMP118 of L. salivarius whose products are predicted to be secreted, which is 1.9% of the plasmid-located ORFs. Six of these encode proteins with hypothetical functions, including a protein with an LPQTG sorting motif (lspD) and one putative thioredoxin. The two secreted proteins encoded by pMP118 with assigned functions are an oligopeptide binding protein and an amino acid transporter. The plasmids pSF118-20 and pSF118-44 contribute little to the predicted L. salivarius UCC118 secretome, encoding one and two secreted proteins, respectively. Ten proteins were identified in L. salivarius UCC118 as sortase substrates by manual screening, and searching the genome with a hidden Markov model (40) did not identify additional potential SDPs. Among these 10 proteins, two sortase substrates were encoded by gene fragments and four were encoded by pseudogenes interrupted by a single stop codon or frameshift.
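The secretome figures quoted above can be cross-checked with a short calculation. The following minimal sketch sums the per-replicon counts and back-calculates the approximate number of ORFs implied by each percentage. The implied ORF totals are derived here from the rounded percentages and are not figures stated in the text; the variable and function names are hypothetical.

```python
# Predicted secreted proteins per replicon, as stated in the text.
secreted = {
    "chromosome": 108,
    "pMP118": 8,
    "pSF118-44": 2,
    "pSF118-20": 1,
}
total_secreted = sum(secreted.values())


def implied_orf_count(secreted_count: int, percent_of_orfs: float) -> int:
    """Approximate ORF total implied by a count and its rounded percentage."""
    return round(secreted_count / (percent_of_orfs / 100.0))


# Back-calculated approximations, not values given in the text.
chromosome_orfs = implied_orf_count(secreted["chromosome"], 6.2)
megaplasmid_orfs = implied_orf_count(secreted["pMP118"], 1.9)

print(f"total predicted secreted proteins: {total_secreted}")
print(f"implied chromosomal ORFs: about {chromosome_orfs}")
print(f"implied pMP118 ORFs: about {megaplasmid_orfs}")
```

The total agrees with the 119 secreted proteins reported from the SignalP analysis.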

The following are potential applications for the UCC118 chromosome and the megaplasmid pMP118 of the invention:

The megaplasmid may be used to transform other probiotic lactobacilli and thus improve or manipulate their desirable properties, including but not restricted to growth rate in synthetic medium (this refers to the sugar utilization genes on the megaplasmid), gut colonization, gut persistence, and host immunomodulation. Megaplasmids of other L. salivarius strains may also be used in this way.

The individual genes carried by pMP118 may be used to improve the properties of Lactobacillus strains.

pMP118 or other Lactobacillus megaplasmids may be used to mobilize other plasmids into recipient strains including, but not restricted to, other lactic acid bacteria.

Derivatives may be made of the replication origin of pMP118, with and without the putative conjugation-related genes, and these derivative plasmids may be used as cloning vectors in Lactobacillus spp. or other Gram-positive bacteria.

The adherence of UCC118 or other lactic acid bacteria may be manipulated based upon knowledge of the surface proteins as hereinbefore described.

The gut colonization, gut persistence, and host immunomodulation of UCC118 or other lactic acid bacteria may be modulated based upon knowledge of surface proteins as hereinbefore described.

The genes and corresponding proteins as hereinbefore described may be used to confer de novo gut interaction properties, or to improve the gut interaction of heterologous bacterial species.

The proteins as hereinbefore described may be used to modulate the colonization of the gut by microbial pathogens.

Expression levels of the genes and corresponding proteins as hereinbefore described may be manipulated to produce strains with superior properties.

Genes from pMP118 and the UCC118 chromosome may be used or manipulated to increase the stress resistance of UCC118 or other lactic acid bacteria. Stress factors include, but are not restricted to, bile, acid, high/low temperature, and osmotic stress.

Genes from pMP118 and the UCC118 chromosome may be used or manipulated to modulate clumping of the cells of UCC118 or other lactic acid bacteria.

Genes from pMP118 and the UCC118 chromosome may be used or manipulated to modulate or alter the metabolism of UCC118 or other lactic acid bacteria.

The invention is not limited to the embodiments hereinbefore described, which may be varied in detail.

References

1. Dunne, C., L. Murphy, S. Flynn, L. O'Mahony, S. O'Halloran, M. Feeney, D. Morrissey, G. Thornton, G. Fitzgerald, C. Daly, B. Kiely, E.M. Quigley, G.C. O'Sullivan, F. Shanahan, and J.K. Collins, Probiotics: from myth to reality. Demonstration of functionality in animal models of disease and in human clinical trials. Antonie Van Leeuwenhoek, 1999. 76(1-4): p. 279-92.

2. Sheil, B., J. McCarthy, L. O'Mahony, M.W. Bennett, P. Ryan, J.J. Fitzgibbon, B. Kiely, J.K. Collins, and F. Shanahan, Is the mucosal route of administration essential for probiotic function? Subcutaneous administration is associated with attenuation of murine colitis and arthritis. Gut, 2004. 53(5): p. 694-700.

3. Ng, W.V., S.A. Ciufo, T.M. Smith, R.E. Bumgarner, D. Baskin, J. Faust, B. Hall, C. Loretz, J. Seto, J. Slagel, L. Hood, and S. DasSarma, Snapshot of a large dynamic replicon in a halophilic archaeon: megaplasmid or minichromosome? Genome Res., 1998. 8(11): p. 1131-41.

4. Barton, B.M., G.P. Harding, and A.J. Zuccarelli, A general method for detecting and sizing large plasmids. Anal. Biochem., 1995. 226(2): p. 235-40.

5. Rogosa, M., R.F. Wiseman, J.A. Mitchell, M.N. Disraely, and A.J. Beaman, Species differentiation of oral lactobacilli from man including description of Lactobacillus salivarius nov spec and Lactobacillus cellobiosus nov spec. J. Bacteriol., 1953. 65(6): p. 681-99.

6. Muriana, P.M. and T.R. Klaenhammer, Conjugal transfer of plasmid-encoded determinants for bacteriocin production and immunity in Lactobacillus acidophilus 88. Appl. Environ. Microbiol., 1987. 53(3): p. 553-560.

7. Roussel, Y., C. Colmin, J.M. Simonet, and B. Decaris, Strain characterization, genome size and plasmid content in the Lactobacillus acidophilus group (Hansen and Mocquot). J. Appl. Bacteriol., 1993. 74(5): p. 549-56.

8. Wang, T.T. and B.H. Lee, Plasmids in Lactobacillus. Crit. Rev. Biotechnol., 1997. 17(3): p. 227-72.

9. Hols, P., M. Kleerebezem, A.N. Schanck, T. Ferain, J. Hugenholtz, J. Delcour, and W.M. de Vos, Conversion of Lactococcus lactis from homolactic to homoalanine fermentation through metabolic engineering. Nat. Biotechnol., 1999. 17(6): p. 588-92.

10. Paterson, G.K. and T.J. Mitchell, The biology of Gram-positive sortase enzymes. Trends Microbiol., 2004. 12(2): p. 89-95.

11. Altermann, E., W.M. Russell, M.A. Azcarate-Peril, R. Barrangou, B.L. Buck, O. McAuliffe, N. Souther, A. Dobson, T. Duong, M. Callanan, S. Lick, A. Hamrick, R. Cano, and T.R. Klaenhammer, Complete genome sequence of the probiotic lactic acid bacterium Lactobacillus acidophilus NCFM. Proc. Natl. Acad. Sci. U S A, 2005. 102(11): p. 3906-12.

12. Kleerebezem, M., J. Boekhorst, R. van Kranenburg, D. Molenaar, O.P. Kuipers, R. Leer, R. Tarchini, S.A. Peters, H.M. Sandbrink, M.W. Fiers, W. Stiekema, R.M. Lankhorst, P.A. Bron, S.M. Hoffer, M.N. Groot, R. Kerkhoven, M. de Vries, B. Ursing, W.M. de Vos, and R.J. Siezen, Complete genome sequence of Lactobacillus plantarum WCFS1. Proc. Natl. Acad. Sci. U S A, 2003. 100(4): p. 1990-5.

13. Pridmore, R.D., B. Berger, F. Desiere, D. Vilanova, C. Barretto, A.C. Pittet, M.C. Zwahlen, M. Rouvet, E. Altermann, R. Barrangou, B. Mollet, A. Mercenier, T. Klaenhammer, F. Arigoni, and M.A. Schell, The genome sequence of the probiotic intestinal bacterium Lactobacillus johnsonii NCC 533. Proc. Natl. Acad. Sci. U S A, 2004. 101(8): p. 2512-7.

14. Navarre, W.W. and O. Schneewind, Proteolytic cleavage and cell wall anchoring at the LPXTG motif of surface proteins in gram-positive bacteria. Mol. Microbiol., 1994. 14(1): p. 115-21.

15. Roche, F.M., R. Massey, S.J. Peacock, N.P. Day, L. Visai, P. Speziale, A. Lam, M. Pallen, and T.J. Foster, Characterization of novel LPXTG-containing proteins of Staphylococcus aureus identified from genome sequences. Microbiology, 2003. 149(Pt 3): p. 643-54.

16. Tuomola, E., R. Crittenden, M. Playne, E. Isolauri, and S. Salminen, Quality assurance criteria for probiotic bacteria. Am. J. Clin. Nutr., 2001. 73(2 Suppl): p. 393S-398S.

17. Servin, A.L., Antagonistic activities of lactobacilli and bifidobacteria against microbial pathogens. FEMS Microbiol. Rev., 2004. 28(4): p. 405-40.

18. Champomier-Verges, M.C., S. Chaillou, M. Cornet, and M. Zagorec, Lactobacillus sakei: recent developments and future prospects. Res. Microbiol., 2001. 152(10): p. 839-48.

19. Gilmour, M.W., N.R. Thomson, M. Sanders, J. Parkhill, and D.E. Taylor, The complete nucleotide sequence of the resistance plasmid R478: defining the backbone components of incompatibility group H conjugative plasmids through comparative genomics. Plasmid, 2004. 52(3): p. 182-202.

20. Law, J., G. Buist, A. Haandrikman, J. Kok, G. Venema, and K. Leenhouts, A system to generate chromosomal mutations in Lactococcus lactis which allows fast analysis of targeted genes. J. Bacteriol., 1995. 177(24): p. 7011-8.

21. Neeser, J.R., A. Chambaz, M. Golliard, H. Link-Amster, V. Fryder, and E. Kolodziejczyk, Adhesion of colonization factor antigen II-positive enterotoxigenic Escherichia coli strains to human enterocyte like differentiated HT-29 cells: a basis for host-pathogen interactions in the gut. Infect. Immun., 1989. 57(12): p. 3727-34.

22. Claesson, M.J., Y. Li, S. Leahy, C. Canchaya, J.-P. van Pijkeren, A.M. Cerdeno-Tarraga, J. Parkhill, S. Flynn, G.C. O'Sullivan, J.K. Collins, D. Higgins, F. Shanahan, G.F. Fitzgerald, D. van Sinderen, and P.W. O'Toole, Multireplicon genome architecture of Lactobacillus salivarius. Proc. Natl. Acad. Sci. U S A, 2006. 103: p. 6718-6723.

23. Boekhorst, J., M.W. de Been, M. Kleerebezem, and R.J. Siezen, Genome-wide detection and analysis of cell wall-bound proteins with LPxTG-like sorting motifs. J. Bacteriol., 2005. 187(14): p. 4928-34.

24. Flynn, S., Molecular characterisation of bacteriocin producing genes and plasmid encoded functions of the probiotic strain Lactobacillus salivarius subsp. salivarius UCC118, in Dept. Microbiology. 2001, University College Cork, Ireland: Cork, Ireland, p. 287.

25. Ventura, M., C. Canchaya, V. Bernini, E. Altermann, R. Barrangou, S. McGrath, M. Claesson, Y. Li, S. Leahy, C.D. Walker, R. Zink, E. Neviani, J. Steele, J. Broadbent, T.R. Klaenhammer, G.F. Fitzgerald, P.W. O'Toole, and D. van Sinderen, Comparative genomics and transcriptional analysis of prophages identified in the genomes of Lactobacillus gasseri, Lactobacillus salivarius and Lactobacillus casei. Appl. Env. Microbiol., 2006. 72: p. 3130-3146.

26. Hosoda, F., S. Nishimura, H. Uchida, and M. Ohki, An F factor based cloning system for large DNA fragments. Nucleic Acids Res., 1990. 18(13): p. 3863-9.

Review of Sequence listings

SEQ ID No. 1   The circular megaplasmid composed of 242,436 base pairs (CP000234)

SEQ ID No. 2   The circular chromosome of 1,827,111 bp (CP000233)

SEQ ID No. 3   LSL_1752

SEQ ID No. 4   LSL_1890

SEQ ID No. 5   LSL_1803

SEQ ID No. 6   LSL_1812

SEQ ID No. 7   LSL_1838

SEQ ID No. 8   LSL_1816

SEQ ID No. 9   LSL_1767

SEQ ID No. 10  LSL_1801

SEQ ID No. 11  LSL_1882

SEQ ID No. 12  forward primer YL007 (5'-AAGGAATTTACAGTATTGACAG-3')

SEQ ID No. 13  reverse primer YL008 (5'-ACGGCAACTTGTAAAACCA-3')

Sortase protein and adhesins

SDP

SEQ ID No. 14  LSL_0152

SEQ ID No. 15  LSL_0152

SEQ ID No. 16  LSL_0311

SEQ ID No. 17  LSL_1085

SEQ ID No. 18  LSL_1319

SEQ ID No. 19  LSL_1319

SEQ ID No. 20  LSL_1335

SEQ ID No. 21  LSL_1832b

SEQ ID No. 30  LSL_1838

SIA

SEQ ID No. 22  LSL_0350

SEQ ID No. 23  LSL_0351

SEQ ID No. 24  LSL_0352

SEQ ID No. 25  LSL_0873

SEQ ID No. 26  LSL_1401b

SEQ ID No. 27  LSL_1401b

SEQ ID No. 28  LSL_1832c

SEQ ID No. 29  LSL_1832d