Title:
THE USE OF THE KLUYVEROMYCES MARXIANUS INULINASE GENE PROMOTER FOR PROTEIN PRODUCTION
Document Type and Number:
WIPO Patent Application WO/1994/013821
Kind Code:
A1
Abstract:
The invention provides a nucleic acid sequence derivable from a yeast, e.g. from a Kluyveromyces yeast, preferably from the strain K. marxianus var. marxianus (CBS 6556), and comprising at least a regulatory region derivable from a gene encoding a polypeptide having inulinase activity or a functional modification thereof. Regulatory regions comprise a promoter, a terminator, and a sequence encoding a secretory signal necessary for secreting a gene product from a yeast. The DNA sequence of the inulinase gene from K. marxianus var. marxianus (CBS 6556) is given, including its regulatory regions. The regulatory regions can be used for preparing vectors suitable for transforming a host, preferably a yeast, for the production of desired expression products, e.g. proteins, RNA suitable for flavouring purposes, and metabolites. A process for production of an expression product is also provided.

Inventors:
CHAPMAN JOHN WILLIAM (NL)
MUSTERS WOUTER (NL)
ROUWENHORST ROBERT JAN (NL)
TOSCHKA HOLGER YORK (DE)
VERBAKEL JOHANNES MARIA A (NL)
Application Number:
PCT/EP1993/003547
Publication Date:
June 23, 1994
Filing Date:
December 09, 1993
Assignee:
QUEST INT (NL)
CHAPMAN JOHN WILLIAM (NL)
MUSTERS WOUTER (NL)
ROUWENHORST ROBERT JAN (NL)
TOSCHKA HOLGER YORK (DE)
VERBAKEL JOHANNES MARIA A (NL)
International Classes:
C12N9/24; C12N9/40; C12N15/55; C12N15/81; (IPC1-7): C12N15/81; C12N1/19; C12N15/11; C12N15/55; C12N15/56; C12N15/62
Domestic Patent References:
WO1991000920A2 1991-01-24
Other References:
BERGKAMP, R. ET AL.: "Multiple-copy integration of the alpha-galactosidase gene from Cyamopsis tetragonoloba into the ribosomal DNA of Kluyveromyces lactis", CURRENT GENETICS, vol. 21, no. 4-5, April 1992 (1992-04-01), BERLIN, D, pages 365 - 370
ROUWENHORST, R. ET AL.: "Structure and properties of the extracellular inulinase of Kluyveromyces marxianus CBS 6556", APPLIED AND ENVIRONMENTAL MICROBIOLOGY, vol. 56, no. 11, November 1990 (1990-11-01), AMERICAN SOCIETY FOR MICROBIOLOGY, pages 3337 - 3345
BERGKAMP R J M ET AL: "Expression of an alpha-galactosidase gene under control of the homologous inulinase promoter in Kluyveromyces", APPLIED MICROBIOLOGY AND BIOTECHNOLOGY 40 (2-3). 1993. 309-317
Claims:
C L A I M S
1. A nucleic acid sequence derivable from a yeast and comprising at least a regulatory region derivable from a gene encoding a polypeptide having inulinase activity or a modified sequence of said nucleic acid sequence also having regulatory activity.
2. A nucleic acid sequence according to claim 1, wherein said yeast belongs to the genus Kluyveromyces, preferably belonging to the species Kluyveromyces marxianus, more preferably said yeast being the strain K. marxianus var. marxianus deposited at the CBS at Baarn, The Netherlands under number CBS 6556.
3. A nucleic acid sequence according to any of the previous claims, comprising at least one region selected from the group consisting of a promoter, a termination signal, and a sequence encoding a secretory signal necessary for secreting a gene product from a yeast, the latter preferably being derivable from the inulinase gene of a Kluyveromyces yeast.
4. A nucleic acid sequence derivable from a yeast and comprising at least one regulatory region derivable from a gene encoding a polypeptide having inulinase activity according to any of the previous claims, comprising at least a part of the nucleic acid sequence of Figure 5 or an equivalent nucleic acid sequence, preferably comprising at least polynucleotide 737 to 1 of Figure 5 having promoter activity or comprising at least polynucleotide 1 to 48 of Figure 5 encoding the inulinase presequence.
5. A recombinant nucleic acid sequence according to any of the previous claims, wherein the regulatory region is operably linked to DNA encoding a specific RNA sequence not coding for a specific protein.
6. A vector comprising a nucleic acid sequence according to any of the claims 1-4, said nucleic acid sequence being operably linked to a structural gene, such as a gene encoding a polypeptide having inulinase activity or α-galactosidase activity, preferably a yeast vector.
7. A recombinant host cell comprising a nucleic acid sequence according to any of the claims 1-5 or a vector according to claim 6.
8. A process for producing a desired expression product, wherein a host cell comprising a vector according to claim 6 is cultured under conditions enabling the structural gene to be expressed and optionally the resulting desired expression product is isolated.
9. A process for producing RNA, wherein a host cell comprising a recombinant nucleic acid sequence according to claim 5 is cultured under conditions enabling production of the RNA, whereby preferably the resulting RNA influences the formation of at least one metabolite or the resulting RNA is a flavouring component.
10. Use of a part of a nucleic acid sequence according to any of claims 1-4 as a probe or a primer, said part having a length of at least 10 nucleotides.
Description:
The use of the Kluyveromyces marxianus Inulinase gene promoter for protein production.

Technical Field

The subject invention lies in the field of DNA technology. In particular the invention covers a nucleic acid sequence derivable from a yeast and comprising at least a regulatory region derivable from a gene encoding a polypeptide having inulinase activity. The invention is also directed at an expression vector comprising the aforementioned nucleic acid sequence and is furthermore directed at the use of said nucleic acid sequence or expression vector for producing a desired expression product.

Background

Yeast strains of the genus Kluyveromyces have been used for the production of enzymes for many years, and the growth of these strains has been extensively studied. Kluyveromyces marxianus var. marxianus strains (hereinafter also called Kluyveromyces marxianus or K. marxianus) are well known for their ability to utilize a large variety of compounds as carbon and energy sources for growth. Since these strains are able to grow at high temperatures and exhibit high growth rates, they are promising hosts for the industrial production of heterologous proteins.

Among the substrates that support growth are polysaccharides such as inulin, xylan and pectin, which are degraded by extracellular enzymes. Inulinase (EC 3.2.1.7) is an extracellular enzyme that enables the yeast to grow on fructans such as inulin, and on sucrose. The enzyme occurs in two forms: part of the enzyme is secreted into the culture fluid as a dimer and part is retained in the cell wall as a tetramer. The relative amounts of the cell wall and supernatant enzyme depend on cultivation conditions, a situation similar to that of invertase (EC 3.2.1.26) of Saccharomyces cerevisiae. The two enzymes differ in substrate specificity for inulin and sucrose, a fact normally expressed in the S/I ratio (Vandamme et al., 1983).

The pKD1 plasmid (Falcone et al., 1986), originally found in Kluyveromyces drosophilarum (now regarded as a variety of Kluyveromyces lactis), belongs to the family of yeast double stranded circular plasmids and does not confer any evident phenotype. Based on plasmid pKD1, several commercially attractive expression systems for high level expression of prochymosin (v/d Berg et al., 1991) and human serum albumin (Fleer et al., 1991) have been developed for the yeast Kluyveromyces lactis.

As known from S. cerevisiae, a high copy plasmid based expression system has the advantage of supplying the host with a sufficient number of gene copies to obtain high-level expression, while integration into the genome in single or low copy number increases the mitotic stability under non selective growth conditions. To combine the benefits of both expression systems, the concept of a multicopy integration system in the rDNA locus of S. cerevisiae has already been successfully proven (Verbakel, 1991). The potential of these constructs for stable, multicopy integration into the genome has been demonstrated for different organisms, genes and auxotrophic markers (Lopes, 1990; Verbakel, 1991; Bergkamp et al., 1991). Progress has been made elsewhere to stabilize a plasmid borne expression system (Fleer et al., 1991); the adaptation of multicopy integration into the genome of K. marxianus is however a more favourable option and is currently under investigation. To that end the first leu2 mutant of strain CBS 6556 has been made, in this specification named KMS1, in which the LEU2 gene is inactivated through integration of a pPGK/neomycin resistance (NeoR) cassette (Bergkamp et al., 1991). In the process of transferring the multicopy integration system from S. cerevisiae to K. marxianus, a vector was first developed that is capable of integrating into the genome of K. lactis by targeted homologous recombination in the ribosomal DNA locus (Bergkamp et al., 1992). Using this vector system, expression of an α-galactosidase gene was obtained from a fusion construct containing the S. cerevisiae GAL7 promoter and the SUC2 invertase signal sequence. With a maximum number of integrated plasmids of about 15, a level of about 250 mg/l α-galactosidase was obtained, with a secretion efficiency of about 95%. Compared to the ARS- or pKD1-derived K. lactis vectors containing the fusion construct, the multicopy integrants exhibit a considerably higher stability under non-selective growth conditions.

However, in addition to the importance of a stable, high copy system, the strength and effectiveness of the surrounding regulatory sequences seem to be crucial factors for high level expression, at least in K. marxianus. This is supported by results from the same group, where attempts to use the S. cerevisiae GAL7 and PGK promoters for expression in K. marxianus have, in contrast to their effects in K. lactis, only led to an extremely low yield. On the other hand, the homologous ORF1 promoter of killer plasmid k1 and the LAC4 promoter have already been successfully tested by different companies (Fleer et al., 1991; v/d Berg et al., 1991).

Yet another difficulty is the proficient secretion of recombinant protein by Kluyveromyces, especially when the protein is expressed in large quantities. Even though some heterologous secretion/signal sequences have been shown to be functional in K. lactis, as for example the human serum albumin prepro-sequence (Fleer et al., 1991), there is a strong demand for an efficient homologous signal sequence, especially from K. marxianus.

Description

Since inulinase is known to be expressed in very high concentrations under appropriate cultivation conditions in K. marxianus, the present invention is directed in particular at the cloning of regulatory regions, such as the promoter sequence and the signal sequence of the inulinase gene, as promising components for the development of an expression system.

This invention therefore relates generally to a nucleic acid sequence derivable from a yeast and comprising at least a regulatory region derivable from a gene encoding a polypeptide having inulinase activity. Said nucleic acid sequence can be combined with nucleic acid sequences encoding other homologous or heterologous genes to bring these genes under the control of at least one strong inulinase regulatory sequence.

"Nucleic acid sequence" as used herein refers to a polymeric form of nucleotides of any length, thus to both single and double stranded deoxyribonucleic acid (DNA) sequences, to ribonucleic acid (RNA) sequences, as well as to modifications thereof. In principle a single stranded nucleic acid DNA refers to the primary structure of the molecule. In general the term "polypeptide" refers to a molecular chain of amino acids with a biological activity and does not refer to a specific length of the product and if required can be modified in vivo or in vitro. This modification can for example take the form of glycosylation, amidation, carboxylation or phosphorylation; thus, inter alia, peptides, oligopeptides and proteins are included. In this instance the polypeptide has inulinase activity.

Yet another major aspect of the present invention is related to the isolation, characterization and the use of the signal sequence of a polypeptide having inulinase activity, and parts thereof, for secretion of any overexpressed product from yeast, in particular from Kluyveromyces. A nucleic acid sequence according to the invention can therefore optionally further comprise a nucleic acid sequence encoding a secretory signal of inulinase. The invention further relates to a vector containing the nucleic acid sequences as described and also relates to micro-organisms containing said vectors or nucleic acid sequences.

The invention is also directed at modified sequences of the aforementioned nucleic acid sequences according to the invention, said modified sequences also having regulatory activity. The term "a modified sequence" covers nucleic acid sequences having the regulatory activity equivalent to or better than the nucleic acid sequence derivable from a yeast and comprising at least a regulatory region derivable from a gene encoding a polypeptide having inulinase activity. Such an equivalent nucleic acid sequence can have undergone substitution, deletion or insertion, or a combination of the aforementioned, of one or more nucleotides resulting in a modified nucleic acid sequence without concomitant loss of regulatory activity occurring. Such modified nucleic acid sequences fall within the scope of the present invention. In particular modified sequences capable of hybridizing with the non modified nucleic acid sequence and still maintaining at least the regulatory activity of the unmodified nucleic sequence fall within the scope of the invention.

The term "a part of" covers a nucleic acid sequence being a subsequence of the nucleic acid sequence derivable from a yeast and comprising at least a regulatory region derivable from a gene encoding a polypeptide having inulinase activity. The term "a part of" also covers a subsequence that is specific for the nucleic acid sequence derivable from a yeast and comprising at least a regulatory region derivable from a gene encoding a polypeptide having inulinase activity, said subsequence having a length of at least ten nucleotides and being capable of hybridizing to a regulatory region of a yeast inulinase gene under stringent conditions, said subsequence being suitable for use as a probe or a primer. The invention is in fact also directed at such use of a nucleic acid sequence according to the invention.
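The "part of" criterion above can be illustrated with a minimal sketch. It assumes that an exact match to the target strand or to its reverse complement is an acceptable crude stand-in for hybridization under stringent conditions, which real hybridization is not limited to. The function names and the example sequences are hypothetical and are not taken from Figure 5.

```python
# Minimal sketch: does a candidate probe meet the length criterion and match
# a target sequence on either strand? Exact matching is only a rough proxy
# for hybridization under stringent conditions (assumption of this sketch).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_candidate_probe(probe: str, target: str, min_len: int = 10) -> bool:
    """True if the probe has at least min_len nucleotides and occurs in the
    target, either as written or as its reverse complement."""
    probe = probe.upper()
    target = target.upper()
    if len(probe) < min_len:
        return False
    return probe in target or reverse_complement(probe) in target


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(is_candidate_probe("ACGTTGCAAG", "TTTACGTTGCAAGTTT"))
    print(is_candidate_probe("ACGTTGCAAG", "GGCTTGCAACGTGG"))
    print(is_candidate_probe("ACGT", "TTTACGTTGCAAGTTT"))
```

A practical probe or primer check would also consider mismatches, melting temperature and uniqueness within the genome; none of that is modelled here.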

In particular the invention is directed at a nucleic acid sequence derivable from a yeast of the genus Kluyveromyces. A suitable example of a yeast from which a nucleic acid sequence according to the invention can be derived is a Kluyveromyces of the species K. marxianus. Of this species the strain K. marxianus var. marxianus is eminently suitable for deriving a nucleic acid sequence according to the invention. This strain, deposited in 1974 at the Centraal Bureau voor Schimmelcultures (CBS) in Baarn, The Netherlands under accession number CBS 6556, is freely available and is also known as NRRL Y-7571 and ATCC 26548. Preferably, the nucleic acid sequence according to the invention comprises at least a promoter as regulatory region. The nucleic acid sequence according to the invention can also comprise an enhancer sequence enabling a higher level of expression of any nucleic acid sequence operably linked to the promoter. The nucleic acid sequence can further either comprise an activating sequence upstream of the promoter (UAS) that can be activated by an inducer, said inducer being a compound that binds to said UAS, or a repressing sequence upstream of the promoter (URS) that can be derepressed by a component that inactivates the repressor being present at said URS. In older literature the term "operator" was sometimes used. Instead of this term UAS and URS are now often used. It is also possible for the nucleic acid sequence according to the invention to comprise a termination signal as regulatory region. Naturally, the nucleic acid sequence according to the invention can comprise one or more regulatory regions. A nucleic acid sequence according to the invention can comprise solely the promoter as regulatory region or a combination thereof with an enhancer, UAS or URS.

A nucleic acid sequence according to the invention can also further comprise termination signal sequences, although these are not always required to end expression of the desired expression product. A nucleic acid sequence according to the invention can further comprise a sequence encoding a secretory signal necessary for secreting a gene product from a yeast. This will be preferred when intracellular production of a desired expression product is not sufficient and extracellular production of the desired expression product is required. Secretory signals comprise the prepro or pre sequence of the inulinase gene for example. A secretory signal derivable from the inulinase gene of a Kluyveromyces yeast is particularly favoured. The specific embodiment of the nucleic acid sequence according to the invention will however depend on the goal that is to be achieved upon using a sequence according to the invention.

With the help of DNA oligonucleotides deduced from either existing protein sequencing data of inulinase from K. marxianus or newly obtained protein sequence analysis, for example, a 290 bp DNA fragment has been generated by use of the PCR technique, and the fragment has further been used for the isolation of chromosomal DNA fragments containing the whole inulinase gene of K. marxianus including the regulatory regions, such as the promoter, the signal sequence and the termination sequence. The invention is therefore in particular directed at a nucleic acid sequence derivable from a yeast and comprising at least a regulatory region derivable from a gene encoding a polypeptide having inulinase activity in any of the embodiments described above, said nucleic acid sequence comprising at least a part of the nucleic acid sequence of figure 5 or an equivalent nucleic acid sequence. The term "equivalent nucleic acid sequence" has the same meaning as given above for "a modified nucleic acid sequence".

Yet another aspect of this invention relates to the isolated nucleic acid fragment of K. marxianus containing an open reading frame encoding 556 amino acids, with nucleotides encoding a prepro-peptide sequence of 23 amino acids at the amino terminus. The calculated molecular weight of the corresponding gene product is 62.5 kDa, which is in good agreement with the 64 kDa experimentally determined for the corresponding polypeptide (Rouwenhorst et al., 1990).
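As a rough plausibility check on these figures, the mass of an unmodified polypeptide can be approximated from its length. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this specification; the 62.5 kDa value quoted above was presumably computed from the actual residue composition, so only approximate agreement is expected.

```python
# Back-of-the-envelope mass estimate from chain length.
# AVERAGE_RESIDUE_MASS_DA is a rule-of-thumb assumption, not a value from the
# specification; glycosylation and other modifications are ignored.

AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate polypeptide mass in kDa for a chain of n_residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    orf_length = 556  # amino acids encoded by the open reading frame
    print(f"Estimated mass for {orf_length} residues: "
          f"{estimate_mass_kda(orf_length):.1f} kDa")
```

For 556 residues this gives roughly 61 kDa, the same order as the 62.5 kDa calculated and the 64 kDa measured values cited in the text.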

A further aspect of the invention is directed at processes for the construction of either episomal or integrating expression vectors containing the described regulatory sequence or sequences. In the given examples the expression and secretion potential of the obtained INU promoter and the INU signal sequences have been tested by constructing a variety of new vectors for expression of a heterologous α-galactosidase gene in Kluyveromyces. The resulting constructs were tested in K. marxianus, variety marxianus. Yet another aspect of this invention relates to a method of transforming a Kluyveromyces strain capable of producing a heterologous protein through fusion with the prepro- and the pre-part of the homologous inulinase signal sequence. High expression levels and nearly complete secretion were obtained with all episomal plasmids that were constructed. A strain transformed with a construct containing the whole prepro-sequence secreted up to 150 mg/L enzyme when grown in shake flask, which is an approximately 100 fold increase compared to the vectors containing non homologous S. cerevisiae promoters and signal sequences.

In another embodiment of the present invention the PCR technique is used in combination with the use of class IIS restriction enzymes, to facilitate primarily the functional new recombination of the described DNA fragments. As a typical example, BspMI, a constituent of the group of IIS restriction endonucleases, cuts every DNA sequence 4 bp 5' of the specific recognition site "ACCTGC", thereby generating 5' N4 protruding ends (reviewed by Szybalski et al., 1991). The advantage of these enzymes, particularly in combination with PCR, is the nearly complete independence from a given sequence within modern molecular working procedures. Introduction of the recognition sequence into the non priming part of a primer used for PCR allows subsequent generation of any desired end of the PCR fragment.
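The sequence independence of a class IIS cut can be sketched in a few lines. The sketch assumes the cut geometry commonly catalogued for BspMI, ACCTGC(4/8), in which the strand carrying ACCTGC is cut 4 nucleotides beyond the site and the opposite strand 8 nucleotides beyond it, leaving a four-nucleotide 5' protruding end. That geometry, the strand orientation, the function name and the example sequence are assumptions of this illustration and are not taken from the specification.

```python
# Sketch of a class IIS digest, assuming the ACCTGC(4/8) geometry.
# Only the first site on the given (top) strand is considered.

SITE = "ACCTGC"
TOP_OFFSET = 4     # cut position on the top strand, counted past the site
BOTTOM_OFFSET = 8  # cut position on the bottom strand, counted past the site


def bspmi_digest_top(seq: str):
    """Return (site_side, overhang, far_side) in top-strand notation,
    or None if no complete cut is possible.

    site_side: top strand up to the top-strand cut (contains the site)
    overhang:  the four nucleotides between the two cuts
    far_side:  top strand beyond the bottom-strand cut
    """
    seq = seq.upper()
    start = seq.find(SITE)
    if start < 0:
        return None
    top_cut = start + len(SITE) + TOP_OFFSET
    bottom_cut = start + len(SITE) + BOTTOM_OFFSET
    if bottom_cut > len(seq):
        return None
    return seq[:top_cut], seq[top_cut:bottom_cut], seq[bottom_cut:]


if __name__ == "__main__":
    # Hypothetical primer-derived end: site, 4-nt spacer, chosen overhang.
    print(bspmi_digest_top("GGACCTGCAAAATTCGGGGG"))
```

Because the overhang lies outside the recognition site, the four protruding nucleotides can be chosen freely when the site is placed in the non-priming part of a PCR primer, which is the property the text relies on.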

Furthermore, this invention relates to a process for producing a desired expression product wherein a host cell comprising a vector according to the invention is cultured under conditions enabling the expression of the structural gene and optionally the isolation of the desired expression product.

The invention also relates to a method for producing RNA, wherein a host cell comprising a recombinant nucleic acid sequence according to the invention is cultured under conditions enabling production of the RNA, whereby said recombinant nucleic acid sequence further comprises a regulatory region operably linked to DNA encoding a specific RNA sequence not encoding a specific protein. Such a process can for example be used to produce RNA itself as the desired expression product, or to produce RNA that influences the formation of at least one metabolite as the desired expression product. The amount of metabolite can be increased by using the regulatory region or regions according to the invention (as described above in various suitable embodiments) in combination with a nucleic acid sequence encoding a protein that influences the formation of the metabolite.

The production of anti-sense RNA, that binds to sense RNA encoding an expression product, can be used for decreasing the amount of said desired expression product, whereby the latter can be a specific protein or a protein influencing the formation of a metabolite.

A nucleic acid sequence according to the invention can therefore also comprise an anti-sense nucleic acid sequence in combination with one or more of the regulatory regions according to the invention.

A process for producing RNA according to the invention can be directed at the production of an RNA sequence that functions as a flavouring component. The nucleic acid sequence according to the invention is therefore not only to be considered useful for overexpression of a proteinaceous gene product but also for producing RNA.

Brief Description of the Figures

Figure 1.

Protein sequence analysis of two forms of inulinase from K. marxianus after CNBr-digestion. Fragment 1 in figure 1 corresponds to Seq. ID. No. 1 and fragment 2 in figure 1 corresponds to Seq. ID. No. 2. The expected cleavage after a Met residue is indicated by an arrow; small letters in the sequence indicate very likely residues. Identification is based on either homology with invertase (n) or strong suspicion.

Figure 2. a) DNA oligonucleotides derived from the amino acid sequence of the internal CNBr-fragments 1 and 2. Fragment 1 in figure 2a corresponds to Seq. ID. No. 3 and fragment 2 in figure 2a corresponds to Seq. ID. No. 4. b) DNA probes from the mature N-terminus of secreted inulinase. The number of nucleotides is given in brackets, as well as the abbreviations used in the text (probe KLM 04 corresponds to Seq. ID. No. 5, probe KLM 05 corresponds to Seq. ID. No. 6, probe KLM 08 corresponds to Seq. ID. No. 7 and probe KLM 09 corresponds to Seq. ID. No. 8). In cases where mixed oligonucleotides are used during DNA synthesis, the corresponding letters are given; the orientation of the DNA oligonucleotides is mentioned.

Figure 3.

Nucleotide sequence (Seq. ID. No. 15) of the 280 nucleotides long PCR fragment of the N-terminal coding region of the inulinase gene in pTZ18R. The localization of the two corresponding PCR primers, as well as their code, is given. In the line "seq" the experimentally determined amino acid sequence of inulinase from K. marxianus is given. The deduced amino acid sequence (Seq. ID. No. 16) is mentioned in the line below the DNA sequence; here, all amino acids identical to amino acids of Saccharomyces cerevisiae invertase are underlined.

Figure 4.

First restriction endonuclease cleavage map of the region around the inulinase gene of K. marxianus. Restriction sites were located through KpnI double digestions. Not all restriction sites of the given restriction enzymes are mentioned. The 0 kbp mark refers to the 5' end of the coding sequence of inulinase. In the upper part of the figure about 22 kbp are mapped, while the lower part displays a refinement of approximately 2.5 kbp around the target sequence.

Figure 5. Nucleotide sequence (Seq. ID. No. 9) of the inulinase gene (INU1) of Kluyveromyces marxianus. The TATAAA box, transcription start sites, the putative MIG1 binding site as well as the predicted recognition site for the signal peptidase (G-V-S-A-↓-S-V-I) and the processing site for a KEX2-like endoprotease (K-R-↓) are indicated. Numbering starts with the ATG start codon; the deduced amino acid sequence (Seq. ID. No. 10) is given in one letter code below the coding part.

Figure 6.

Autoradiogram of the primer extension assay. Results are shown for the primer extension assays in the presence of [α-32P]dCTP with total RNA from repressed [1] and derepressed grown cells [2]. The size of the obtained fragments is indicated.

A: assay with primer p21T; B: assay with primer p16T;

C: specificity control; both primers with total RNA from S. cerevisiae. For further details see text.

Figure 7.

Schematic representation of the construction for the inulinase promoter/signal sequence link to α-galactosidase with the help of PCR generated fragments. The beginning of the mature protein and the first amino acid of the pre-protein are indicated by arrows. Digestion of plasmids such as for example pSK1 with EcoRI and EagI removes the DNA part comprising the GAL7 promoter and the SUC2 signal sequence in such a way that an in frame fusion of inulinase signal sequences with the α-galactosidase gene was directly possible. For the in frame fusion of the whole prepro-sequence to α-galactosidase an oligonucleotide complementary to the coding strand was used as PCR primer, said oligonucleotide comprising the recognition site of BspMI. After PCR, digestion of the product with EcoRI and BspMI created sticky ends that were compatible with the ends of the original vector, for example pSK1. By changing the hybridizing part of the PCR primer a similar fragment was obtained for the in frame connection of the pre-sequence to α-galactosidase.

Figure 8.

Schematic representation of the construction routes of plasmids suitable for the expression of a heterologous gene, here for example α-galactosidase, within K. lactis. The construction route is only given for an episomal plasmid based expression system. The sequence of the α-galactosidase gene is indicated in black, shadowed areas indicate yeast sequences, solid lines are bacterial sequences, and the direction of transcription is indicated.

Figure 9.

Schematic representation of the construction route of plasmids pUR2431 and pUR2432, examples of construction intermediates for the expression of α-galactosidase from the INU promoter, containing either the intact prepro-signal sequence or only the pre-part of the inulinase signal sequence. The promoter sequence is shaded, the empty box indicates both versions of the signal sequence, and the example of a heterologous gene, here α-galactosidase, is given as a black box. The direction of transcription is indicated by arrows.

Figure 10.

Schematic representation of the construction of plasmids pUR2433, pUR2434, pUR2435 and pUR2436, all variants of episomal expression plasmids for the expression of α-galactosidase in Kluyveromyces strains having leu2 determined auxotrophy.

Figure 11.

Schematic representation of the construction of pUR2437 and pUR2438, vectors for integrating multiple copies of a homologous or heterologous gene, such as α-galactosidase, into the rDNA of K. marxianus. The overall structure of one rDNA unit as well as the 3.5 kbp EcoRI fragment actually used are drawn schematically.

Figure 12.

Sequence of the K. marxianus URA3 gene (corresponding to Seq. ID. No. 11) and its deduced amino acid sequence (corresponding to Seq. ID. No. 12).

Figure 13.

The construction of plasmid pKMU2, which was used for the construction of a food-grade K. marxianus leu2 mutant. (A) Plasmid pKML1 contains a K. marxianus LEU2 gene on a 5 kb EcoRI fragment. (B) Plasmid pKMU2, where the intact LEU2 gene is replaced by a leu2::URA3 disruption. The small boxes indicate the URA3 promoter and terminator regions.

Figure 14. Structure of plasmid pKUR2431 (A) and the chromosomal organization of the INU1 locus after integration of the plasmid at the XhoI site (B).

Figure 15.

Structure of plasmids pUR2439 and pUR2440, which contain K. marxianus DNA 5' to the previously cloned sequences. These plasmids are based on pBluescript (Stratagene) and the inserted K. marxianus DNA is shown as the dark shaded boxes.

Figure 16.

The sequence of the 479 bp EcoRI fragment 5' to the previously cloned sequences. The sequence (corresponding to Seq. ID. No. 13) begins with the 5' EcoRI site from the previously cloned INU1 DNA.

Figure 17. Sequence (corresponding to Seq. ID. No. 14) of primer INUT used for sequencing across the EcoRI site 5' to the inulinase gene.

Figure 18. Structure of plasmid pUR2445, which contains the 470 bp EcoRI fragment from pUR2440 5' to the INU1 sequences in pUR2434. The location and orientation of the approximately 470 bp EcoRI fragment are indicated.

Experimental

The following experimental section is offered by way of example and should not be considered a limitation of the scope of the invention.

Molecular biological procedures

Mostly standard methods were used as described in Sambrook, J., Fritsch, E.F., & Maniatis, T., 1989. Molecular Cloning: A Laboratory Manual. Second edition. Cold Spring Harbor Laboratory Press. Any modifications used are described below.

Strains, plasmids and growth conditions

E. coli strain JM109 (endA1, recA1, gyrA96, thi, hsdR17, rk-, mk+, relA1, supE44; Yanisch-Perron et al., 1985) was used for amplification of plasmids. Transformation of JM109 was carried out according to Cohen et al., 1973.

K. marxianus var. marxianus CBS 6556 (= ATCC 26548) was obtained from the Yeast Division of the Centraalbureau voor Schimmelcultures, Delft, The Netherlands, and maintained on YEPD agar (1% yeast extract, 2% peptone, 2% glucose, 2% agar) slopes.

Genomic DNA was isolated from a 200 ml YEPD overnight culture and incubated with lyticase (Sigma Chemical Company) from Arthrobacter luteus according to the manufacturer. Total DNA was isolated as described by Struhl et al. (1979).

The K. marxianus strains were transformed with the plasmids pUR2431 up to and including pUR2445. Transformation of the Kluyveromyces strains was performed as described by Carter et al., 1988. Transformants were recovered on selective YNB-plates (0.67% YNB, 2% glucose, 2% agar) supplemented with the essential amino acids (tryptophan 20 μg/ml or leucine 20 μg/ml). The same liquid medium was used for precultures, cultivated twice overnight at 30°C and diluted 1:10 in YP medium containing 5% sucrose (YPS) for derepression of the INU promoter.

Example 1. Generation of DNA oligonucleotides

To acquire a set of DNA oligonucleotides for PCR, mixed DNA oligonucleotides corresponding to the recently determined N-terminal amino acid sequence of forms I and II [Rouwenhorst et al., 1990] were synthesized. As a potential source of further DNA probes, and to allow application of the PCR technique, the amino acid sequences of two internal CNBr fragments from the secreted inulinase (form I) and the cell wall bound inulinase (form II) were determined. To this end, the reduced and carboxyamidomethylated proteins were subjected to overnight incubation in 70% formic acid in the dark under N2. After addition of water the mixtures were freeze dried, yielding the CNBr digests of inulinase forms I and II respectively. The fragments obtained were separated by reversed phase chromatography on a Bakerbond C4 wide pore column (4.6 × 250 mm) mounted in a Waters HPLC. Elution was achieved using a linear gradient of acetonitrile in 0.1% trifluoroacetic acid in water. Detection was carried out at 214 and 254 nm.

The chromatograms obtained from digests of inulinase forms I and II showed a number of poorly resolved peaks; the overall pattern, however, was similar for both digests and enabled collection of two fractions from both runs, which were subjected to sequence analysis. The outcomes of the runs are given in Fig. 1 (corresponding to Seq. ID. No. 1 and 2).

No differences could be detected between forms I and II in the amino acid sequences derived for the isolated fractions. In one case [fragment 2 in Fig. 1/Seq. ID. No. 2] peptide bond hydrolysis occurred at a C-terminal Trp residue, which is rare but not impossible under the given circumstances. The nucleotide sequences were selected in such a way that PCR could generate the genetic information of the intervening sequence. From these sequence results sets of mixed oligonucleotides were synthesized using an Applied Biosystems 380A synthesizer. The sequences of these DNA oligonucleotides are given in Fig. 2 (corresponding to Seq. ID. No. 3 to 8).

Example 2. Cloning the 5' coding region of inulinase

With two of the DNA oligonucleotide probes obtained from the N-terminal and internal protein sequences [KLM09 and KLM06, respectively], PCR amplification on total K. marxianus genomic DNA was carried out (Perkin Elmer Cetus DNA Thermal Cycler) in 100 μl of 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.001% gelatine, with 0.2 mM of each dNTP, 100 pmol of the DNA oligonucleotides KLM06 and KLM09, approximately 0.5 μg of BamHI digested DNA and 1 U of AmpliTaq polymerase. Incubation parameters were set as follows: 32 cycles of 1 min at 95°C, 2 min at 50°C and 2.5 min at 72°C.
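As a rough planning aid, the cycling profile and primer amounts quoted above can be turned into a total programmed hold time and a primer concentration. The following is only a sketch: it ignores ramp times between temperature steps, and it assumes that the 100 pmol refers to each primer rather than to both together, which the text does not state explicitly. All names in the code are illustrative.

```python
# Values taken from the PCR description above; ramp times between steps are ignored.
CYCLES = 32
STEP_MINUTES = {"denaturation_95C": 1.0, "annealing_50C": 2.0, "extension_72C": 2.5}
REACTION_VOLUME_UL = 100.0
PRIMER_PMOL = 100.0  # assumption: 100 pmol of each primer, not 100 pmol in total


def total_hold_minutes(cycles, step_minutes):
    """Sum of all programmed hold times over the whole run, in minutes."""
    return cycles * sum(step_minutes.values())


def concentration_micromolar(amount_pmol, volume_ul):
    """pmol per microlitre is numerically equal to micromol per litre."""
    return amount_pmol / volume_ul


if __name__ == "__main__":
    print("hold time (min):", total_hold_minutes(CYCLES, STEP_MINUTES))
    print("primer (uM):", concentration_micromolar(PRIMER_PMOL, REACTION_VOLUME_UL))
```

Under these assumptions the run comprises just under three hours of programmed hold time, and each primer is present at 1 μM.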

The reaction formed a specific 290-bp fragment, which was subcloned into the SmaI site of the E. coli plasmid pTZ18R (Mead et al., 1986) and introduced into E. coli JM109 by the transformation protocol described by Chung et al. (1989). One of the positive clones, designated pUR2415, was further characterized: the DNA was isolated and purified according to the Qiagen (Qiagen Inc., Chatsworth, California) protocol and subsequently characterized by DNA sequencing. The sequence of the PCR clone is given in Fig. 3 (corresponding to Seq. ID. No. 15).

Comparison of the DNA sequence obtained with known sequence data for invertase from S. cerevisiae further confirmed the authentic origin of the PCR product, since it displayed significant sectional homology with the supposed N-terminus of invertase, as described by Rouwenhorst et al. (1990).

Example 3. Restriction map of the DNA around the inulinase gene

The NcoI-BamHI fragment of pUR2415 was further used as a 32P-labelled probe for the construction of a physical map of the DNA region around the 5' coding sequence of the inulinase gene. To this end, chromosomal DNA of K. marxianus was digested with several restriction enzymes, separately and in combination. After electrophoresis of the DNA fragments the gel was placed for 15 min in 0.25 M HCl, 15 min in 0.4 M NaOH, 0.6 M NaCl and 15 min in 0.5 M Tris, 1.5 M NaCl. The DNA was transferred onto Hybond-N filters (Amersham International plc) by vacuum blotting (LKB 2016 Vacugene) for 2 hours in 10× SSC (1.5 M NaCl, 0.15 M Na3 citrate) and finally UV crosslinked for 15 min.

By using KpnI double digestions the resulting signals were arranged into a first physical map spanning about 25 Kbp (given in Fig. 4).

Example 4. Cloning of the inulinase gene

Results of the chromosomal restriction analysis revealed two positive overlapping DNA fragments of 2.0 Kbp for EcoRI and 4.0 Kbp for PstI, respectively. To isolate clones containing the inulinase promoter, the signal sequence and the polyadenylation/termination sequences, both digested DNA pools were subcloned into pTZ19. To this end, about 8 μg of chromosomal DNA was digested with EcoRI and PstI separately and resolved by agarose gel electrophoresis. DNA fragments of about 2.0 Kbp from the EcoRI digest and fragments between 3.5 and 4.5 Kbp from the PstI digest were isolated from the gel and purified with the Geneclean II kit (Bio 101 Inc). A small amount of each isolated pool was again loaded onto an agarose gel, the bands were transferred to Hybond membrane and hybridized with the 32P-labelled PCR fragment to verify the presence of the hybridizing band within the isolated pool. Since both fractions contained the corresponding DNA fragments, the isolated EcoRI and PstI DNA fragment pools were ligated into the correspondingly digested pTZ19 plasmids and transformed into E. coli JM109 by standard procedures.

The colonies obtained were subjected to colony hybridization after replica plating onto Hybond-N filters (Amersham International plc) and plasmid amplification on LB plates containing 500 μg/ml chloramphenicol. After 8 hours incubation at 37°C each filter was subjected to the following wash procedure: 5 min in 1.5 M NaCl, 5 min in 0.5 M NaOH and twice in 1.5 M NaCl, 0.5 M Tris-HCl. The DNA was finally fixed to the filter by UV crosslinking. Hybridisation was done in 50 mM Tris pH 7.4, 10 mM EDTA pH 7.0, 1 M NaCl, 0.5% SDS, 0.1% Na-pyrophosphate, 0.2% ficoll, 0.2% polyvinylpyrrolidone, 0.2% BSA and 0.01 mg denatured salmon sperm DNA at 68°C. The [α-32P]dCTP (Amersham International plc; 370 MBq/ml; 110 TBq/mmol) labelled DNA probe was prepared using a Multiprime DNA labelling kit from Amersham Corporation, purified by elution over a Sephadex G-50 column in TES (10 mM Tris-HCl pH 7.4, 1 mM EDTA pH 8.0, 0.2% SDS) and then denatured by incubation for 2 minutes at 100°C prior to use. After overnight incubation the filters were washed twice for 20 min in 2× SSC, 0.1% SDS, 0.1% Na-pyrophosphate and twice for 20 min in 0.1× SSC, 0.1% SDS, 0.1% Na-pyrophosphate at 68°C. Positive clones were detected after overnight exposure of the dried filters to Kodak X-ray film.
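The wash buffers above are dilutions of an SSC stock. The sketch below shows how the component molarities and the required stock volume scale with the stated strength. It assumes the 10× SSC composition given in Example 3 (1.5 M NaCl, 0.15 M Na3 citrate); the 500 ml batch size in the usage lines is purely hypothetical and does not come from the text.

```python
# Composition of 10x SSC as stated in Example 3.
SSC_10X_MOLAR = {"NaCl": 1.5, "Na3_citrate": 0.15}


def ssc_molarities(strength):
    """Component molarities of SSC at the given strength (e.g. 2 for 2x SSC)."""
    return {name: conc * strength / 10.0 for name, conc in SSC_10X_MOLAR.items()}


def stock_volume_ml(strength, final_volume_ml):
    """Volume of 10x stock needed to prepare final_volume_ml at the given strength."""
    return final_volume_ml * strength / 10.0


if __name__ == "__main__":
    for strength in (2, 0.1):
        print(strength, ssc_molarities(strength), stock_volume_ml(strength, 500.0))
```

The SDS and Na-pyrophosphate in the wash solutions are added separately and are not covered by this calculation.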

To verify the specificity of the spots obtained, filters carrying presumed positive clones were washed for 30 min in 2% SDS at 90°C and rehybridized with a DIG-labelled DNA probe of the PCR fragment, using the DIG luminescent detection kit from Boehringer Mannheim according to the manufacturer's protocol.

Plasmid DNA was isolated from putative positive colonies, digested with appropriate enzymes and analysed by Southern hybridization as follows: after electrophoresis of the digested DNA the gel was placed for 15 min in 0.25 M HCl, 15 min in 0.4 M NaOH, 0.6 M NaCl and 15 min in 0.5 M Tris, 1.5 M NaCl. The DNA was transferred onto Hybond-N filters (Amersham International plc) by vacuum blotting (LKB 2016 Vacugene) for 2 hours in 10× SSC and finally UV crosslinked for 15 min. In this experiment the probe was labelled through random primed incorporation of DIG-UTP according to the protocol of the manufacturer (DIG DNA labelling kit, Boehringer Mannheim). After overnight hybridisation with the denatured probe at 68°C (in 5× SSC, 0.1% N-laurylsarcosine, 0.02% SDS, 1% blocking agent) filters were washed as described in the manufacturer's instructions. The fragments found in the plasmids which hybridized with the probe were in good agreement with the physical map given in Fig. 4. One of the pTZ19R plasmids containing the 2.0 Kbp EcoRI insert, designated pUR2421, and one pTZ19R plasmid comprising the 4.0 Kbp PstI fragment, designated pUR2422, were further utilized to determine the DNA sequence of the total inulinase gene.

Example 6. DNA sequence analysis of the complete inulinase gene

DNA sequencing was mainly done as described by Sanger et al., 1977, and Hsiao et al., 1991, using the Sequenase version 2.0 kit from United States Biochemical Company, according to the protocol with T7 DNA polymerase (Amersham International plc) and [α-35S]dATP (Amersham International plc; 370 MBq/ml; 22 TBq/mmol).

The complete sequence was determined from the recombinant plasmids pUR2421 and pUR2422 by subcloning fragments and by a primer walking strategy. Both DNA fragments showed the expected overlap. In summary, 3223 bp were sequenced on both DNA strands, including the promoter region extending over about 0.75 Kbp upstream of the putative start codon and a sequence of about 0.83 Kbp downstream of the putative stop codon, including the putative polyadenylation site and termination regions. The sequence is given in Fig. 5 (corresponding to Seq. ID. No. 9 and 10). Sequence comparison of the coding part of the inulinase gene of the present invention showed about 98% homology with the very recently published inulinase coding sequence of K. marxianus ATCC 12424 (Laloux et al., 1991). The homology is less striking for the 50 nucleotides before the prepro-sequence given in the same publication, corresponding to about one third of the leader sequence before the start codon from pUR2421. This difference is probably due to strain variation.

At the amino acid sequence level, invertase and inulinase display a homology of 69%, which is even higher than the homology between the invertases from S. cerevisiae and Schwanniomyces occidentalis. Therefore, both enzymes should be treated as variations of the same enzymatic activity rather than as different enzymes. Since the N-terminus of secreted mature inulinase was identified by protein sequencing, it was easy to distinguish the coding part of the precursor protein, which has a deduced 23 amino acid sequence displaying some characteristic prepro-features. A TATAAA box was identified 270 bp in front of the supposed ATG start codon, indicating the presence of a promoter element.

Example 7. Determination of transcription start

To test the functionality of the detected promoter structure and to identify the transcription start points of the cloned inulinase fragment, primer extension experiments were carried out. For the primer extension assay two DNA oligonucleotides were used, complementary to nucleotides 98 to -84 and 18 to 48 of the DNA sequence given in Fig. 5 (which corresponds to Seq. ID. No. 9).

primer p21T 5'- AGC ACT GAC TCC TGC CAA TGG AAG CAA GAG (Seq. ID. No. 17)

primer p16T 5'- TCT CTA TGG CAT AGA GA (Seq. ID. No. 18)

To further confirm the specificity of the signals, two different total RNA preparations were chosen for the reverse transcription. To this end, K. marxianus cells were grown as described in YPS medium (Rouwenhorst et al., 1990) under non-repressive conditions. From a 100 ml culture, cells were harvested and total RNA was isolated as described (Koehrer et al., 1991). A reverse transcriptase reaction was carried out in the presence of either [α-32P]dCTP or [α-35S]dATP. For each experiment about 10 μg RNA and 100 pg of primer were dissolved in 40 μl hybridisation buffer, containing 50 mM Tris-HCl pH 8.3, 75 mM KCl, 3 mM MgCl2, 10 mM DTT and 40 U RNase inhibitor (Boehringer Mannheim). The mixture was incubated at 65°C for 5 min and slowly cooled down to room temperature. 1 μl [α-32P]dCTP (Amersham International plc; 370 MBq/ml; 110 TBq/mmol) or [α-35S]dATP, 1 μl (25 U) reverse transcriptase (Biolabs), dATP, dGTP and dTTP to a final concentration of 0.1 mM and dCTP to a final concentration of 0.01 mM were added to a final volume of 50 μl and incubated for 1 hour at 37°C (for reactions containing [α-35S]dATP the nucleotide concentrations were 0.1 mM for dGTP, dTTP and dCTP and 0.01 mM for dATP). The mixture was precipitated with ethanol and subsequently loaded onto a 5% polyacrylamide gel together with the DNA sequence reactions generated from the same primers (Fig. 6).

In all experiments, three dominant signals emerged, coinciding in each case with T-174, C-170 and C-167 of Fig. 5 (which corresponds to Seq. ID. No. 9). These nucleotides are located about 100 nucleotides downstream of the TATAAA box [position -276 to -271 in Fig. 5]. The results of this part of the invention therefore place the start of transcription for the inulinase gene within the region TAATCAGCAATT, defining the length of the uncommonly long 5' non-coding sequence as 174, 170 or 167 nucleotides. Directly downstream of this transcription initiation region we recognized a sequence [TAAATCCGGG, nucleotides -163 to -153 in Fig. 5 (which corresponds to Seq. ID. No. 9)] that perfectly matches the MIG1 binding consensus sequence of the S. cerevisiae SUC2, GAL4 and GAL1 genes [WWWWTSYGGGG] (Nehlin et al., 1991). The MIG1 gene product is known to be involved in glucose repression of the GAL genes and directly controls SUC2 expression in S. cerevisiae (Nehlin et al., 1990). In contrast to the presumably related SUC2 gene, where the MIG1 binding site is located at -446 to -435, this sequence motif is closer to the start codon in K. marxianus and more similar in location to the MIG1 site in the GAL1 and GAL4 promoters. However, in contrast to the regulation of the GAL gene family of S. cerevisiae, the inulinase and invertase promoters seem to be regulated solely by glucose repression. This allows the construction of a strong, non-repressible promoter by exchanging the putative MIG1 DNA binding site in the inulinase promoter.
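The comparison between the putative site and the degenerate consensus can be reproduced with a small matcher based on the standard IUPAC nucleotide ambiguity codes (W = A or T, S = C or G, Y = C or T). This is only an illustrative sketch, not the procedure used in this work. Note that the motif as printed above has 10 nucleotides whereas the consensus has 11 positions, so the sketch compares only the overlapping positions and makes no statement about the final consensus position.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_consensus(consensus, sequence):
    """True if each base of `sequence` is allowed by the code at the same
    position of `consensus`; only the overlapping positions are compared."""
    return all(base in IUPAC[code] for code, base in zip(consensus, sequence))


MIG1_CONSENSUS = "WWWWTSYGGGG"   # as quoted in the text
INU_SITE = "TAAATCCGGG"          # as printed in the text (10 nt)

if __name__ == "__main__":
    print(matches_consensus(MIG1_CONSENSUS, INU_SITE))
```

Exchanging a single base at a constrained position, for example the pyrimidine at the seventh position for A, makes the same check fail, which is the kind of change the text envisages for removing the putative binding site.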

The functional importance of the sequence around the AUG start codon during initiation of translation in eukaryotes is still a point of discussion, and has led to the formulation of a consensus sequence for S. cerevisiae (Hamilton et al., 1987). The inulinase gene shows little homology with this sequence, and even the formulation of a preliminary Kluyveromyces ATG context consensus sequence does not improve the homology significantly. Taking into account the known high expression level of inulinase in the natural host, the idea of improving protein expression by adapting the AUG context in Kluyveromyces mRNA seems less promising on the basis of the present information.

Example 8. PCR of Kluyveromyces regulatory sequences

Sequence analysis of the prepro-sequence of the leader region of the cloned inulinase promoter in pUR2421 revealed a predicted recognition site for the signal peptidase (G-V-S-A-↓-S-V-I) (von Heijne, 1986) and, 5 amino acids after it, a putative processing site for a KEX2-like endoprotease (..K-R-↓-..) (Fuller et al., 1988). To test the functional importance of the prepro-sequence, DNA oligonucleotides were synthesized, one creating the complete prepro-inulinase sequence, and the second appropriate for the direct, in-frame attachment of the DNA coding for the inulinase signal peptide to a given coding sequence. Two oligonucleotides containing the BspMI recognition site, one with sequence information for new restriction sites and the sequence complementary to the DNA from the putative KEX2 protease site, and the second complementary to the signal peptidase cleavage site, were utilized for the generation of suitable promoter/signal sequence fragments by PCR. For the assembly of promoter and secretion signal fragments from the inulinase gene, PCR amplification of this part of the inulinase gene was performed. The primers were designed in such a way that perfect couplings with mature alpha-galactosidase were obtained. Two versions were made, one with the inulinase pre-pro signal sequence and the other with only the putative inulinase pre signal sequence (see Fig. 7). The following primers were used:

INP 01: 5'-GGAATTCTCAAACCGAAATG-3' (Seq. ID. No. 19)

INP 02: 5'-CCCAAGCTTACCTGCCATGGGCCCTCTTGTAATTGATAACTG-3' (Seq. ID. No. 20)

INP 03: 5'-CCCAAGCTTACCTGCATGCGGCCGCACTGACTCCTGCCAATG-3' (Seq. ID. No. 21)
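The restriction sites carried by the three primers can be located with a simple string search. The sketch below is illustrative only: the recognition sequences for EcoRI, HindIII, BspMI and EagI are the commonly documented ones and are supplied here as an assumption, since they are not listed in this text, and only the strand as written is scanned (BspMI has a non-palindromic site).

```python
# Commonly documented recognition sequences (assumption, not taken from this text).
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BspMI": "ACCTGC",
    "EagI": "CGGCCG",
}

# Primer sequences as given above (5' to 3').
PRIMERS = {
    "INP 01": "GGAATTCTCAAACCGAAATG",
    "INP 02": "CCCAAGCTTACCTGCCATGGGCCCTCTTGTAATTGATAACTG",
    "INP 03": "CCCAAGCTTACCTGCATGCGGCCGCACTGACTCCTGCCAATG",
}


def find_sites(sequence):
    """Map enzyme name to the 0-based index of the first occurrence of its
    recognition sequence on the strand as written; absent enzymes are omitted."""
    return {
        enzyme: sequence.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in sequence
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

On these sequences the search finds an EcoRI site in INP 01 and HindIII and BspMI sites in both INP 02 and INP 03, which is consistent with the EcoRI/HindIII and EcoRI/BspMI digestions described in the following examples.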

Two PCR mixes of 100 μl were made, each containing 40 pg pUR2421 cut with KpnI, 1.5 mM MgCl2, 1 U AmpliTaq polymerase, and buffers and dNTPs as appropriate. One reaction mixture contained 100 pmoles of INP 01 and INP 02, the other 100 pmoles of INP 01 and INP 03. 25 cycles were performed in a Perkin Elmer Cetus DNA Thermal Cycler, each cycle consisting of 1:00 min at 95°C, 1:30 min at 48°C and 2:00 min at 72°C. Afterwards, the reaction products were treated with proteinase K (Crowe et al., 1991) before digestion with EcoRI and HindIII was performed (48 hours at 37°C). The fragments were then isolated from a gel and ligated with pTZ19R digested with EcoRI and HindIII. The resulting plasmids, pUR2427 and pUR2428 (INP 01/INP 02 and INP 01/INP 03 PCR products respectively), were transformed into E. coli; several colonies for both constructions were cultivated, the plasmids were isolated and purified, and the sequence was confirmed by DNA sequence analysis.

Example 9. Construction of K. lactis expression plasmids

The E. coli - Kluyveromyces shuttle vector pSK1 is a pKD1 derivative (Chen et al., 1989; Bianchi et al., 1987) and contains a unique EagI site at the junction between the S. cerevisiae SUC2 (invertase) signal sequence and the α-galactosidase gene. The vector also comprises a K. lactis TRP1 gene as selectable marker in K. lactis and the ampicillin resistance gene as selectable marker in E. coli.

Digestion of said plasmids pUR2427 and pUR2428 with EcoRI and BspMI produces two fragments, which can easily be ligated with the EcoRI/EagI digested Kluyveromyces vector pSK1, thereby replacing only the EcoRI-EagI fragment, containing the GAL7 promoter and SUC2 signal sequence, with the EcoRI/BspMI fragments from pUR2427 and pUR2428. This results in two vectors, one with the DNA sequence encoding the expected signal peptide (= pre-sequence) directly linked to the α-galactosidase gene (pUR2429), and a second with the DNA encoding the natural prepro-sequence linked in frame to the α-galactosidase gene (pUR2430). Said episomal plasmids can immediately be used to transform the trp- mutant strain of K. lactis, for example by electroporation (Bolen et al., 1990), in which strain pSK1 derivatives are known to be stably maintained. Expression and secretion of α-galactosidase can be determined under induced and non-induced conditions by known procedures (Verbakel, J., 1991).

By digestion of said plasmids with EcoRI and HindIII a fragment comprising both the promoter and the DNA encoding the leader sequence, as well as the α-galactosidase gene, can be transferred into existing vectors for targeted homologous recombination into the K. lactis rDNA (Bergkamp et al., 1992). The potential of these constructs for stable, multicopy integration into the genome has been shown for different organisms, genes and auxotrophic markers (Lopes, 1990; Verbakel, 1991; Giuseppin et al., 1991). Substitution, for example, of the GAL7 promoter and SUC2 signal sequence in the plasmids pMIRKGAL-T±l,2 and 3 for said inulinase promoter prepro- or pro-sequences, followed by transformation of the constructs into K. lactis MSK110 (a, uraA, trp1::URA3), should give high and stable expression and secretion of α-galactosidase. DNA fragments comprising the whole given promoter sequence or functional parts thereof, but without the part coding for the prepro-sequence, can also be used for intracellular overexpression of homologous and/or heterologous genes.

Example 10. Construction of K. marxianus episomal plasmids

To evaluate the function of the DNA sequences obtained in the natural host, the pTZ19R derivatives pUR2427 and pUR2428 were digested with EcoRI and BspMI to release the PCR fragments, which were used to replace the EcoRI/EagI fragment in pSY9. pSY9 is an E. coli / S. cerevisiae shuttle vector containing a unique EagI site at the junction between the S. cerevisiae SUC2 (invertase) signal sequence and the α-galactosidase gene and an EcoRI site in front of the GAL7 promoter; the vector also comprises a LEU2 gene copy from S. cerevisiae, and the ampicillin resistance gene and the MB1 origin for selection and maintenance in E. coli (M. Harmsen, unpublished). The two variant vectors, one with the direct connection of the expected signal peptide to the α-galactosidase gene (pUR2432), and the second with the natural prepro-sequence linked in frame to the α-galactosidase gene (pUR2431), are not able to replicate in K. marxianus (Figure 9). Finally, to obtain the Kluyveromyces episomal expression vectors, the naturally occurring plasmid pKD1 was linearized with EcoRI and ligated into the EcoRI site of pUR2431 and pUR2432, thereby yielding 4 new plasmids, with the DNA encoding either the pre- or the prepro-sequence and in each case both possible orientations of the pKD1 vector backbone (pUR2433 - pUR2436).

Example 11. Construction of K. marxianus integrating plasmids

To obtain mitotically stable integration into the ribosomal DNA locus of the K. marxianus genome, homologous rDNA sequences were used to target the linearized integrating vector to the rDNA locus. In addition to its use in S. cerevisiae, this approach has been successfully applied for multicopy integration into the rDNA locus of K. lactis using the vectors pMIRKM1 and pMIRKM2 (Bergkamp et al., 1992). These plasmids include either the LEU2 or the LEU2d gene of S. cerevisiae and homologous rDNA sequences from K. marxianus. Since the heterologous LEU2d gene apparently is not able to functionally complement the LEU2 gene disruption in the K. lactis strain (Bergkamp et al., personal communication), only the vector with the intact LEU2 gene was used for further K. marxianus constructions. A 3.5 kb EcoRI fragment of the K. marxianus rDNA, containing the 3' end of the 17S rDNA, the 5.8S rDNA and the 5' end of the 26S rDNA gene, which has very recently been cloned but has not yet been completely sequenced (Bergkamp, personal communication), was ligated into the EcoRI sites of pUR2431 and pUR2432. After transformation into E. coli the transformants were found to contain only plasmids in which the inulinase-α-gal expression cassette was joined to the 26S rDNA gene part (pUR2437 comprising the prepro-sequence, and pUR2438 containing the pre-sequence).

Example 12. Expression and secretion in K. marxianus

The outcome of the complete construction process was 6 different expression plasmids for K. marxianus, with the following characteristics:

pUR2433: episomal vector; INU promoter + prepro-sequence + α-gal in orientation I.

pUR2434: episomal vector; INU promoter + prepro-sequence + α-gal in orientation II.

pUR2435: episomal vector; INU promoter + pre-sequence + α-gal in orientation I.

pUR2436: episomal vector; INU promoter + pre-sequence + α-gal in orientation II.

pUR2437: integration vector; INU promoter + prepro-sequence + α-gal in orientation I.

pUR2438: integration vector; INU promoter + pre-sequence + α-gal in orientation I.

The four episomal plasmids and the two integration vectors were (after linearization at the unique XbaI site) transformed into the existing K. marxianus leu2 strain KMS1 by known procedures (Carter et al., 1988). In this strain, the gene coding for β-isopropylmalate dehydrogenase was inactivated through integration of a dominant selection marker [G418] under the control of the PGK promoter. The resulting strain is leu-, NeoR. Some of the clones obtained were grown overnight at 37°C in YNB medium, diluted 1:10 into 50 ml of YPS in a shake flask and grown for 48 hr at 37°C under non-repressing conditions. The expression level of α-galactosidase was determined enzymatically, intra- and extracellularly, by known procedures (Verbakel, 1991; Giuseppin et al., 1991). The copy number was provisionally estimated by Southern blot analysis. The α-galactosidase expression assays confirmed the benefit of homologous regulatory sequences for high level expression in K. marxianus. Application of the inulinase promoter DNA sequence increased the expression of α-galactosidase up to 150 fold compared to experiments in which the S. cerevisiae PGK or GAL7 promoters were used (data not shown). Moreover, the natural connection of this promoter to the corresponding signal sequence led to nearly 100% secretion of the heterologous protein. Here, the use of the complete precursor sequence, including the S-V-I-N-Y-K-R pro-peptide sequence, appears to increase not only the amount of secreted α-galactosidase, but also the amount of protein produced. This finding is in some conflict with the conclusion given by Fleer et al., 1991, where the deletion of the pro-sequence of human serum albumin (HSA) did not influence the capacity of Kluyveromyces lactis to express and secrete rHSA. On the other hand, both experiments show that the final proteolytic removal of the pro-peptide from the mature product, presumably by a KEX2 equivalent protease, is not a rate limiting step. Whether the pro-sequence plays an appreciable role in secretion, translation or mRNA stabilization cannot yet be decided. Some influence of the orientation of the DNA cassette within the plasmid has been found, at least for the episomal expression systems (compare pUR2433 with pUR2434, and pUR2435 with pUR2436). This effect might be related to undetected copy number variations, plasmid stability effects, or transcriptional interference with other transcription processes on the plasmid in orientation I.

Expression from the integration vectors (pUR2437 and pUR2438) was very low, a result which could be correlated with the low copy number present in the cell. One possible explanation for this effect is the presence of the intact heterologous S. cerevisiae LEU2 marker gene on the cassette, since its promoter might be strong enough to supply the leu deficient cell with sufficient gene product, even in this single copy configuration.

By making use of the very recently cloned and sequenced LEU2 gene of K. marxianus (Bergkamp et al., 1991), one can further enhance stability and expression of homologous and heterologous genes in the described K. marxianus strain KMS1, by replacing the LEU2 copy from S. cerevisiae in pUR2437 and pUR2438 with the corresponding homologous LEU2 and promoter-deficient LEU2d gene copies from K. marxianus on the integration cassette.

Moreover, the cloned LEU2 gene can be further used to obtain a disruption/deletion leu2 auxotrophic mutant strain without insertion of heterologous DNA.

Example 13. Construction of KMS3

For the construction of a strictly homologous K. marxianus leu2 mutant, the URA3 gene of strain CBS 6556 was isolated first and subsequently utilized for the disruption of the LEU2 locus. The last step in the biosynthesis of pyrimidine is catalysed by orotidine-5'-phosphate decarboxylase (EC 4.1.1.23), an enzyme which in S. cerevisiae is encoded by the URA3 gene. The URA3 gene is one of the most commonly used selection markers because of the availability of counter selection for the marker (Boeke et al., 1984).

Yeast cells having an active URA3 gene are unable to grow in medium containing 5-fluoro-orotic acid (5-FOA), while ura3 mutants grow normally. The URA3 gene of K. marxianus CBS 6556 was isolated by screening a genomic K. marxianus DNA bank (Bergkamp et al., 1991), inserted in the vector lambda L47.1 (Loenen et al., 1980), with a radioactively labelled S. cerevisiae URA3 DNA fragment. Three phage clones which hybridized with the URA3 probe were isolated by standard techniques. Restriction analysis followed by Southern analysis detected in all cases a 2.5 Kbp EcoRI/SphI fragment, which was subcloned in pUC19, resulting in plasmid pKMU1. The insert carried by this plasmid and several subclones thereof were sequenced by the methods described; the DNA sequence and the corresponding amino acid translation are given in Figure 12 (corresponding to Seq. ID. No. 11 and 12). The determined DNA sequences showed 71% homology on the DNA level and 81% homology on the amino acid level with the corresponding S. cerevisiae URA3 gene and its product (Rose et al., 1984).

For the construction of the food-grade leu2 mutant, spontaneous ura- mutants were selected by plating K. marxianus wt cells on 5-FOA plates. Out of 10^8 cells, 4 uracil requiring mutants were obtained; one of these (named KMS2) was used for further construction work. Plasmid pKMU2 was constructed by replacing part of the 5.1 kb EcoRI LEU2 fragment in plasmid pKML1 (Bergkamp et al., 1991) with the K. marxianus URA3 gene as indicated in Figure 13: a 1 kb StuI/EcoRV fragment containing a large part of the coding sequence of the LEU2 gene was replaced by a 2.5 kb EcoRI/SphI fragment containing the K. marxianus URA3 gene. EcoRI and SphI sticky ends were made blunt by use of T4 DNA polymerase in the presence of all four dNTPs. The linear 6.5 kb EcoRI fragment containing the leu2 gene disruption leu2::URA3 was then used to transform KMS2 by electroporation as described by Meilhoc et al., 1990, with selection on medium lacking uracil to select for uracil prototrophy. Of the 75 transformants obtained, 12 also displayed a leucine requirement, indicating that in 63 transformants heterologous recombination had occurred; these transformants were not investigated any further. Southern analysis of 3 of the 12 leu- transformants confirmed that in all cases the wild type copy of the LEU2 gene had been replaced by the leu2::URA3 fragment. One of the newly obtained leu2 transformants, designated KMS3, was stable even during long term growth under non-selective conditions. This non-reverting K. marxianus leu2 strain is suitable for overexpression of homologous or heterologous proteins in, for example, the food industry.
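The counts reported above translate directly into frequencies, which can be convenient when comparing strains or transformation batches. The following sketch merely restates the reported numbers (4 mutants per 10^8 plated cells; 12 leucine-requiring transformants out of 75); the function and variable names are illustrative.

```python
def fraction(count, total):
    """Simple proportion, with a guard against an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


# Counts as reported in the text.
PLATED_CELLS = 10**8
URA_MUTANTS = 4
TRANSFORMANTS = 75
LEU_REQUIRING = 12

ura_mutant_frequency = fraction(URA_MUTANTS, PLATED_CELLS)      # spontaneous ura- mutants per plated cell
leu2_replacement_fraction = fraction(LEU_REQUIRING, TRANSFORMANTS)  # share of transformants that became leu-
other_integrants = TRANSFORMANTS - LEU_REQUIRING

if __name__ == "__main__":
    print(ura_mutant_frequency, leu2_replacement_fraction, other_integrants)
```

On these figures roughly one transformant in six carried the disruption at the LEU2 locus.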

Example 14. Single copy integration of the α-galactosidase gene cassette into the INU1 locus

Multi-copy integration of an mRNA producing promoter-gene cassette into the rDNA locus generates an unusual DNA arrangement consisting of sequences from the gene desired for expression, which are transcribed by RNA polymerase II, and the stable rRNA genes, transcribed by RNA polymerases I and III. To test the potential influence of the surrounding rDNA on the expression of α-galactosidase under the control of the INU promoter, this combination was also tested at a different locus. The cassette was therefore integrated into the inulinase (INU1) locus through a single cross-over, thereby recombining the INU promoter with the wild type 5' upstream sequence.

For the construction of inulinase integration plasmids the Cyamopsis tetragonoloba α-galactosidase cassette with the described promoter fragment and the prepro-sequence of the K. marxianus inulinase (INU1) gene, and the S. cerevisiae PGK terminator, were combined in a plasmid incapable of replication in K. marxianus. To achieve this, the 804 bp long EcoRI/BspMI fragment of plasmid pUR2427 was ligated into the EcoRI/EagI digested plasmid pSY9 (M. Harmsen, unpublished). The resulting plasmid, pUR2431 (Figure 9), was linearized with XhoI within the promoter sequence of the INU1 gene prior to transformation into strain KMS3, thereby preferentially targeting the integration to the chromosomal INU1 locus. The expected integration event creates a chromosomal situation in which the α-galactosidase gene is placed under the regulation of the INU1 promoter within the wt chromosomal 5' DNA context, whereas the chromosomal INU1 gene is placed under the control of the cloned INU1 promoter fragment used in the fusion constructs (Figure 14). The transformants obtained were analyzed for both α-galactosidase and inulinase production; α-galactosidase was measured as described earlier, while inulinase was measured as described by Rouwenhorst et al. (1988). Results for 4 different transformants are summarized below; the total α-galactosidase production and the extracellular inulinase activity of pUR2431 transformants and of the untransformed yeast strain KMS3 after growth for 24 h in YPS medium are given.

The transformant designated IG4 showed a strikingly high α-galactosidase production level compared to the other 3 transformants, concomitant with rather low inulinase production. Southern analysis of all 4 transformants revealed that only in transformant IG4 was plasmid pUR2431 integrated into the chromosomal inulinase locus in the expected manner, while in all three other cases the integration event occurred elsewhere in the genome (not shown). Hence, the low inulinase production of this transformant might be caused by the lack of a further activating 5' DNA sequence element not present on the utilized fusion construct, an interpretation in accordance with the low expression results of the other transformants, where the integration took place elsewhere and where the α-galactosidase gene is not under the control of the INU1 promoter and further 5' upstream chromosomal inulinase sequences.

Example 15. Cloning of further INU1 upstream sequences.

To obtain additional upstream sequences of the INU1 promoter, which may enhance the expression directed by the described INU1 promoter, total chromosomal DNA of K. marxianus was isolated as described by Struhl et al. (1979) and digested with different restriction endonucleases, all having a recognition sequence within the first 500 bp of the cloned and sequenced DNA fragment containing the INU1 locus. The fragments were separated by electrophoresis and the agarose gel subsequently subjected to Southern analysis. For the identification of additional 5' sequences, plasmid pUR2421 was digested with EcoRI and NcoI, and the approximately 290 bp DNA fragment containing 5' sequence of the cloned promoter was isolated and used for the synthesis of a digoxigenin-labelled DNA fragment, which was subsequently used as a DNA probe. From the specifically hybridizing signals obtained, the EagI digestion product with an apparent length of about 1.9 kb and the XhoI digestion product with a length of approximately 1 kb were chosen for further cloning. The EagI digested K. marxianus DNA was separated by electrophoresis and fragments in the region of 1.9 kb were purified from the gel by the procedure described earlier.

These fragments were ligated into EagI digested Bluescript vector (Stratagene) and the products introduced by transformation into E. coli JM109. The transformants were screened by colony hybridization as described earlier, using the DIG-labelled DNA probe containing 5' sequences from the INU1 promoter described above. A similar approach was used to clone the XhoI fragment containing upstream sequences from INU1, but in this case the vector was digested with XhoI prior to ligation with XhoI digested K. marxianus DNA of approximately 1 kb in size.

Using these techniques, plasmids pUR2440 and pUR2439 were identified, which contain the approximately 1.9 kb EagI fragment and the approximately 1.1 kb XhoI fragment of the INU1 promoter, respectively (Figure 15). The DNA sequence of an approximately 470 bp region immediately 5' to the previously cloned INU1 sequences was determined by the techniques described earlier and is shown in Figure 16 (corresponding to Seq. ID. No. 13). Plasmid pUR2440 was deposited at the Centraalbureau voor Schimmelcultures, Baarn, The Netherlands, under accession number CBS 648.93.

Example 16. Construction of expression vectors containing longer derivatives of the INU1 promoter.

To obtain an autonomously replicating vector carrying the extended INU1 promoter sequence, plasmid pUR2434 was partially digested with EcoRI and the approximately 11.6 kb linear fragment was isolated from a gel and dephosphorylated. The approximately 470 bp EcoRI fragment from pUR2440 was subsequently ligated into this vector. The ligation mix was used to transform E. coli JM109 to ampicillin resistance, and the plasmid DNA from the resulting transformants was analysed by digestion with NcoI to identify those which contained the insert adjacent to the existing INU1 promoter sequences in pUR2434. Sequencing across the relevant EcoRI site in pUR2440 using primer INUT (Figure 18), which hybridizes within the previously sequenced region of the INU1 promoter, was carried out to determine the sequence of the extended promoter. The orientations of the cloned approximately 470 bp EcoRI fragments were confirmed by sequencing using primer INUT. Two plasmids were identified which carried the fragment in the correct orientation and 4 were found with the fragment in the opposite orientation. A still longer upstream region can be cloned into pUR2434 by cloning the 1.4 kb fragment from pUR2440, produced by partial digestion with EcoRI, into pUR2434 partially digested with EcoRI as described above. The correct orientation of the fragment can easily be determined by digestion of the resulting plasmids with EagI, which will release the approximately 1.9 kb fragment cloned in pUR2440.

Plasmid pUR2445 (Figure 18), carrying the extended promoter, and pUR2434 were introduced by electroporation (Bolen et al., 1990) into K. marxianus strain KMS3 with selection for leucine prototrophy. Representatives of the resulting transformants were grown overnight in minimal medium at 37°C and diluted 1:10 into 10 ml of YEP, 5% sucrose induction medium. These cultures were grown for a further 24 hours at 37°C and the α-galactosidase levels in the fermentation medium determined. The results are shown below:

From these results it appears that extension of the promoter has a beneficial effect upon enzyme production levels. Multi-copy rDNA integrative plasmids carrying a longer INU1 promoter can similarly be constructed using the 1.9 kb EagI fragment from pUR2440. The integrative vector pUR2437 can be partially digested with EcoRI and either the approximately 470 bp EcoRI fragment from pUR2440 or the 1.4 kb fragment formed by partial digestion of pUR2440 with EcoRI inserted upstream of the existing INU1 sequences. The resulting ligation mix can be introduced into E. coli JM109 and plasmids containing inserts identified by digestion with EcoRI. The orientation of the approximately 470 bp fragment could be confirmed by sequencing, and that of the approximately 1.4 kb EcoRI fragment by digestion of the plasmids with EagI, which should give a fragment of approximately 1.9 kb in addition to those derived from the vector. The plasmids so produced can be linearized by digestion with XbaI and introduced by transformation into K. marxianus strain KMS1 or KMS3 with selection for leucine prototrophy.
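The orientation check by restriction digestion used in this example can be illustrated with a minimal sketch. It computes the fragment sizes expected when a circular plasmid is cut at known sites, and shows that the two possible insert orientations give different diagnostic fragments. All coordinates, lengths and function names below are hypothetical values chosen for illustration; they are not the actual maps of pUR2434, pUR2437 or pUR2440.

```python
def circular_digest(plasmid_length, cut_positions):
    """Return sorted fragment sizes (bp) for a circular molecule cut at the given positions."""
    sites = sorted(set(p % plasmid_length for p in cut_positions))
    if not sites:
        return [plasmid_length]  # uncut circle stays one molecule
    sizes = []
    for i, site in enumerate(sites):
        nxt = sites[(i + 1) % len(sites)]
        # Distance to the next site, wrapping around the circle;
        # a single site linearizes the plasmid to its full length.
        sizes.append((nxt - site) % plasmid_length or plasmid_length)
    return sorted(sizes)


def sites_after_insertion(vector_length, vector_sites, insert_at,
                          insert_length, insert_sites, reverse):
    """Map enzyme sites of vector and insert onto the recombinant plasmid."""
    if reverse:
        # Flipping the insert mirrors its internal site coordinates.
        insert_sites = [insert_length - s for s in insert_sites]
    shifted_vector = [s if s < insert_at else s + insert_length
                      for s in vector_sites]
    shifted_insert = [insert_at + s for s in insert_sites]
    return vector_length + insert_length, shifted_vector + shifted_insert


# Hypothetical example: a 10000 bp vector with one enzyme site at 2000,
# receiving a 1400 bp insert at position 1500 that carries one site at offset 100.
for reverse in (False, True):
    total, sites = sites_after_insertion(10000, [2000], 1500, 1400, [100], reverse)
    print("reverse" if reverse else "forward", circular_digest(total, sites))
```

With these assumed coordinates the forward orientation yields fragments of 1800 and 9600 bp and the reverse orientation yields 600 and 10800 bp, so the smaller band distinguishes the two orientations on a gel.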

Literature References

Berg, v/d J.H., Laken, v/d K.J., Ooyen, A.J.J., Renniers, T., Rietveld, K., Schaap, A., Brake, A.J., Bishop, R.J., Schultz, K., Moyer, D., Richmann, M., Shuster, J., 1990. Kluyveromyces as a host for heterologous gene expression: expression and secretion of prochymosin. Biotech. 8, 135-139.

Bergkamp, R.J.M., Geerse, R.H., Verbakel, J.M.A., Musters, W., Planta, R.J., 1991. Cloning and disruption of the LEU2 gene of Kluyveromyces marxianus CBS 6556. Yeast, Vol 7, 963-970.

Bergkamp, R.J.M., Kool, I.M., Geerse, R.H., Planta, R.J., 1992. Multiple copy integration of the α-galactosidase gene from Cyamopsis tetragonoloba into the ribosomal DNA of Kluyveromyces lactis. Curr. Genet. 21, 365-370.

Bianchi, M.M., Falcone, C., Chen, X.J., Wesolowski-Louvel, M., Frontali, L., and Fukuhara, H., 1987. Transformation of the yeast Kluyveromyces lactis with new vectors derived from the 1.6 μm circular plasmid pKD1. Curr. Genet. 12: 185-192.

Boeke, J.D., LaCroute, F., Fink, G.R. (1984). A positive selection system for mutants lacking orotidine-5'-phosphate decarboxylase activity in yeast: 5-fluoro-orotic acid resistance. Mol. Gen. Genet. 197, 345-346.

Bolen, P.L., and McCutchan, J.E., 1990. Electroporation of Kluyveromyces lactis. In Mager, W.H., and Planta, R.J. (eds.). 15th International conference on yeast genetics and molecular biology, The Hague, Netherlands. Wiley, Chichester.

Carter, B.L.A., Irani, M., Mackay, V.I., Seale, R.L., Sledziewski, A.V., Smith, R.A., 1988. Expression and secretion of foreign genes in yeast. In: Glover, D.M. (Ed.) DNA Cloning III. A practical approach. IRL Press, Oxford. 141-161.

Chen, X.J., Bianchi, M.M., Suda, K., and Fukuhara, H., 1989. The host range of pKD1 derived plasmids in yeast. Curr. Genet. 16, 95-98.

Chung, C.T., Niemela, S.L., Miller, R.H., 1989. One-step preparation of competent E. coli: Transformation and storage of bacterial cells in the same solution. Proc. Natl. Acad. Sci. USA 86: 2172-2175.

Crowe, J.S., Cooper, H.J., Smith, M.A., Sims, M.J., Parker, D., and Gewert, D., 1991. Improved cloning efficiency of polymerase chain reaction (PCR) products after proteinase K digestion. Nucl. Acids Res. Vol. 19, p. 184.

Falcone, C., Saliola, M., Chen, X.J., Frontali, L., Fukuhara, H., 1986. Analysis of a 1.6-μm circular plasmid from the yeast Kluyveromyces drosophilarum: structure and molecular dimorphism. Plasmid 15: 248-252.

Fleer, R., Yeh, P., Amellal, N., Maury, I., Fournier, A., Bacchetta, F., Baduel, G., Jung, G., Hote, H.L., Becquart, J., Fukuhara, H., Mayaux, J.F., 1991. Stable multicopy vectors for high-level secretion of recombinant human serum albumin by Kluyveromyces yeasts. Biotech., Vol 9, 969-975.

Fuller, R.S., Sterne, R.E., Thorner, J., 1988. Enzymes required for yeast prohormone processing. Ann. Rev. Physiol. 50, 345-362.

Giuseppin, M.L.F., Lopes, M.T.S., Planta, R.J., Verbakel, J.M.A., Verrips, C.T., 1991. Process for preparing a protein by a fungus transformed by multicopy integration of an expression vector. PCT International WO 91/00920.

Hamilton, R., Watanabe, K.C., de Boer, H.A., 1987. Compilation and comparison of the sequence context around the AUG start codons in S. cerevisiae mRNAs. Nucl. Acids Res. 15, 3381-3593.

Heijne, van, G., 1986. A new method for predicting signal sequence cleavage sites. Nucl. Acids Res. 16: 4683-4690.

Hsiao, K., 1991. A fast and simple procedure for sequencing double stranded DNA with Sequenase. Nucl. Acids Res. 19: 2787.

Kohrer, K., and Domdey, H., 1991. Preparation of high molecular weight RNA, pp. 398-405. In Guthrie, C., Fink, G.R. (eds.), Methods in Enzymology 194: Guide to yeast genetics and molecular biology, Academic Press, Inc., Harcourt Brace Jovanovich, San Diego.

Laloux, O., Cassert, J.P., Delcour, J., Van Beeumen, J., Vandenhaute, J., 1991. Cloning and sequencing of the inulinase gene of Kluyveromyces marxianus var. marxianus ATCC 12424. FEBS Lett. 289, 64-68.

Loenen, W.A.M., Brammar, W.J., 1980. A bacteriophage lambda vector for cloning large DNA fragments made with several restriction sites. Gene 20, 249-259.

Lopes, T.A., Klootwijk, J., Venstra, A.E., van der Aar, P.C., van Heerikhuizen, H., Raue, H.A., Planta, R.J., 1989. High-copy-number integration into the ribosomal DNA of S. cerevisiae: a new vector for high level expression. Gene 79, 199-206.

Mead, D.A., Szczesna-Skorupa, E., and Kemper, B., 1986. Single-stranded 'blue' T7 promoter plasmids: a versatile tandem promoter system for cloning and protein engineering. Protein Engineering 1: 67-74.

Meilhoc, E., Masson, J., Teissie, J. (1990). High efficiency transformation of intact yeast cells by electric field pulses. Bio/Technology 8, 223-227.

Nehlin, J.O., Ronne, H., 1990. Yeast MIG1 repressor is related to the mammalian early growth response and Wilms' tumor finger proteins. EMBO J. 9, 2891-2898.

Nehlin, J.O., Carlberg, M., Ronne, H., 1991. Control of yeast GAL genes by MIG1 repressor: a transcriptional cascade in the glucose response. EMBO J. 10, 3373-3377.

Rose, M., Grisafi, P., Botstein, D. (1984). Structure and function of the yeast URA3 gene: expression in E. coli. Gene 29, 113-124.

Rouwenhorst, R.J., Visser, L.E., van der Baan, Scheffers, W.A., van Dijken, J.P. (1988). Production, distribution and kinetic properties of inulinase in continuous culture of Kluyveromyces marxianus CBS 6556. Appl. Environ. Microbiol. 54: 1131-1137.

Rouwenhorst, R.J., Hensing, M., Verbakel, J., Scheffers, W.A., and van Dijken, J.P., 1990. Structure and properties of the extracellular inulinase of Kluyveromyces marxianus CBS 6556. Appl. Environ. Microbiol. 56: 3337-3345.

Sambrook, J., Fritsch, E.F., Maniatis, T., 1989. Molecular cloning. A laboratory manual. Second edition. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.

Sanger, F., Nicklen, S., and Coulson, A.R., 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.

Struhl, K., Stinchcomb, D.T., Scherer, S., and Davis, R.W., 1979. High-frequency transformation of yeast: autonomous replication of hybrid DNA molecules. Proc. Natl. Acad. Sci. USA 76: 1035-1039.

Szybalski, W., Kim, S.C., Hasan, N., Podhajska, A.J., 1991. Class-IIS restriction enzymes: a review. Gene 100, 13-26.

Tanguy-Rougeau, C., Wesolowski-Louvel, M., Fukuhara, H., 1988. The Kluyveromyces KEX1 gene encodes a subtilisin-type serine protease. FEBS Lett. 234, 464-470.

Vandamme, E.J., Derycke, D.G., 1983. Microbial inulinases. Fermentation process; properties and applications. Adv. Appl. Microbiol. 29, 139-176.

Verbakel, J.M.A., 1991. Heterologous gene expression in the yeast Saccharomyces cerevisiae. PhD thesis.

Yanisch-Perron, C., Vieira, J., and Messing, J., 1975. Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC vectors. Gene 33: 103-119.

Samples of Escherichia coli transformed with plasmids pUR2421 and pUR2422 were deposited under the Budapest Treaty at the Centraalbureau voor Schimmelcultures (CBS) in Baarn, The Netherlands, on 27 May 1992. They received deposit numbers CBS 265.92 and CBS 266.92, respectively. These plasmids were mentioned in the specification in Examples 5 and 6 and in Example 8. Plasmid pUR2440 (Example 15) was also deposited at the CBS under accession number CBS 648.93.