

Title:
METHOD FOR PRODUCING DITERPENES
Document Type and Number:
WIPO Patent Application WO/2009/044336
Kind Code:
A2
Abstract:
The present invention provides a method of producing diterpene compounds, said method comprising contacting a particular polypeptide having a diterpene synthase activity with the diterpene precursor geranylgeranyl pyrophosphate (GGPP). In particular, said method may be carried out in vitro or in vivo to produce labdenediol diphosphate, labdenediol and/or sclareol, which are very useful compounds in the fields of perfumery and flavoring. The present invention also provides the amino acid sequence of the polypeptide used in the method. A nucleic acid derived from Salvia sclarea and encoding the polypeptide of the invention, an expression vector containing said nucleic acid, as well as a non-human organism or a cell transformed to harbor the same nucleic acid, are also part of the present invention.

Inventors:
SCHALK MICHEL (FR)
Application Number:
PCT/IB2008/053973
Publication Date:
April 09, 2009
Filing Date:
September 30, 2008
Assignee:
FIRMENICH & CIE (CH)
SCHALK MICHEL (FR)
International Classes:
C12P7/02; C12N9/10; C12P17/06
Foreign References:
US20040072323A1 2004-04-15
Other References:
BANTHORPE D V ET AL: "ACCUMULATION OF THE ANTI-FUNGAL DITERPENE SCLAREOL BY CELL CULTURES OF SALVIA SCLAREA AND NICOTIANA GLUTINOSA" PHYTOCHEMISTRY, PERGAMON PRESS, GB, vol. 29, no. 7, January 1990 (1990-01), pages 2145-2148, XP000671796 ISSN: 0031-9422
CYR ANTHONY ET AL: "A modular approach for facile biosynthesis of labdane-related diterpenes." JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, vol. 129, no. 21, 30 May 2007 (2007-05-30), pages 6684-6685, XP002473660 ISSN: 0002-7863
DATABASE EBI [Online] 1 May 2000 (2000-05-01), IMAI R: "Tomato cDNA for copalyl diphosphate synthase" XP002473661 Database accession no. Q9ST35
DATABASE EBI [Online] 5 October 1999 (1999-10-05), IMAI R: "Tomato cDNA for copalyl diphosphate synthase" XP002473664 Database accession no. AB015675
Attorney, Agent or Firm:
SALVATERRA-GARCIA, Maria de Lurdes (Rue de la Bergère, P.O. Box 148 Meyrin 2, CH)
Claims:


1. A method for producing at least one diterpene and/or labdenediol diphosphate comprising a) contacting geranylgeranyl pyrophosphate (GGPP) with a polypeptide having a diterpene synthase activity and comprising an amino acid sequence at least 70% identical to SEQ ID NO:1 or 2; and b) optionally, isolating the at least one diterpene and/or the labdenediol diphosphate produced in step a).

2. The method of claim 1, wherein said polypeptide comprises an amino acid sequence at least 80%, more preferably at least 90% identical to SEQ ID NO: 1 or 2.

3. The method of claim 2, wherein said polypeptide comprises the amino acid sequence set out in SEQ ID NO:1 or 2.

4. The method of claim 3, wherein said polypeptide consists of the amino acid sequence set out in SEQ ID NO: 1 or 2.

5. The method of claim 1, wherein said at least one diterpene is a labdane derivative.

6. The method of claim 5, wherein said at least one diterpene is labdenediol and/or sclareol.

7. The method of claim 1, carried out at pH 7 or below, more preferably at pH 6.

8. The method of claim 7, wherein a naturally occurring organic acid is used to reach the prescribed pH.

9. The method of any of claims 1 to 8, carried out in a reaction medium free of Mg²⁺.

10. The method of claim 1, wherein step a) is carried out by cultivating a non-human organism or cell capable of producing GGPP and transformed to express said polypeptide under conditions conducive to the production of at least one diterpene and/or labdenediol diphosphate.

11. The method of claim 10, further comprising, prior to step a), transforming the non-human organism or cell capable of producing GGPP with a nucleic acid encoding said polypeptide, so that said organism expresses said polypeptide.

12. The method of claim 10 or 11, wherein said non-human organism is a plant, a prokaryote or a fungus.

13. The method of claim 10 or 11, wherein said non-human organism is a microorganism.

14. The method of claim 13, wherein said microorganism is a bacteria or yeast.

15. The method of claim 14, wherein said bacteria is E. coli and said yeast is Saccharomyces cerevisiae.

16. The method of claim 10 or 11, wherein said non-human cell is a plant cell.

17. A polypeptide having a diterpene synthase activity and comprising an amino acid sequence at least 70% identical to SEQ ID NO: 1 or 2.

18. The polypeptide of claim 17, comprising an amino acid sequence at least 80%, more preferably at least 90%, identical to SEQ ID NO: 1 or 2.

19. The polypeptide of claim 18, comprising the amino acid sequence set out in SEQ ID NO: 1 or 2.

20. The polypeptide of claim 19, consisting of the amino acid sequence set out in SEQ ID NO: 1 or 2.

21. The polypeptide of any of claims 17 to 20, derived from Salvia sclarea.

22. A nucleic acid encoding the polypeptide of any of claims 17 to 21.

23. The nucleic acid of claim 22, comprising a nucleotide sequence at least 70% identical to SEQ ID NO: 3, SEQ ID NO:4 or the complement thereof.

24. The nucleic acid of claim 23, comprising a nucleotide sequence at least 80%, preferably at least 90%, identical to SEQ ID NO: 3, SEQ ID NO:4 or the complement thereof.

25. The nucleic acid of claim 24, comprising a nucleotide sequence identical to SEQ ID NO:3, SEQ ID NO:4 or the complement thereof.

26. The nucleic acid of claim 25, consisting of the nucleotide sequence set out in SEQ ID NO:3, SEQ ID NO:4 or of the complement thereof.

27. The nucleic acid of any of claims 23 to 26, derived from Salvia sclarea.

28. An expression vector comprising the nucleic acid of any of claims 23 to 27.

29. The expression vector of claim 28, in the form of a viral vector, a bacteriophage or a plasmid.

30. The expression vector of claim 28 or 29, including the nucleic acid of the invention operably linked to at least one regulatory sequence which controls transcription, translation initiation or termination, such as a transcriptional promoter, operator or enhancer or an mRNA ribosomal binding site and, optionally, including at least one selection marker.

31. A non-human organism transformed with the expression vector of any of claims 28 to 30, so that it harbors the nucleic acid of any of claims 23 to 27 and heterologously expresses or over-expresses the polypeptide of any of claims 17 to 21.

32. The non-human organism of claim 31, wherein said non-human organism is a plant, a prokaryote or a fungus.

33. The non-human organism of claim 31, wherein said non-human organism is a microorganism.

34. The non-human organism of claim 33, wherein said microorganism is a bacteria or yeast.

35. The non-human organism of claim 34, wherein said bacteria is E. coli and said yeast is Saccharomyces cerevisiae.

36. A higher eukaryotic cell transformed with the expression vector of any of claims 28 to 30, so that it harbors the nucleic acid of any of claims 23 to 27 and expresses the polypeptide of any of claims 17 to 21.

37. The higher eukaryotic cell of claim 36, wherein said higher eukaryotic cell is a plant cell.

38. A method for producing at least one polypeptide having a diterpene synthase activity comprising: a) transforming a non-human host organism or cell with the expression vector of any of claims 28 to 30, so that it harbors the nucleic acid of any of claims 23 to 27 and expresses or over-expresses the polypeptide encoded by said nucleic acid; b) culturing the organism under conditions conducive to the production of said polypeptide.

39. A method for preparing a variant polypeptide having a diterpene synthase activity comprising the steps of: a) selecting a nucleic acid according to any of claims 23 to 27; b) modifying the selected nucleic acid to obtain at least one mutant nucleic acid; c) transforming host cells or unicellular organisms with the mutant nucleic acid sequence to express a polypeptide encoded by the mutant nucleic acid sequence; d) screening the polypeptide for a functional polypeptide having at least one modified property; and, e) optionally, if the polypeptide has no desired variant diterpene synthase activity, repeat the process steps (a) to (d) until a polypeptide with a desired variant diterpene synthase activity is obtained.

Description:

METHOD FOR PRODUCING DITERPENES

Technical field

The present invention provides a method of producing diterpene compounds, said method comprising contacting a particular polypeptide having a diterpene synthase activity with the diterpene precursor geranylgeranyl pyrophosphate (GGPP). In particular, said method may be carried out in vitro or in vivo to produce labdenediol diphosphate, labdenediol and/or sclareol, which are very useful compounds in the fields of perfumery and flavoring. The present invention also provides the amino acid sequence of the polypeptide used in the method. A nucleic acid derived from Salvia sclarea and encoding the polypeptide of the invention, an expression vector containing said nucleic acid, as well as a non-human organism or a cell transformed to harbor the same nucleic acid, are also part of the present invention.

Prior art

Terpenoids or terpenes represent a family of natural products found in most organisms (microorganisms, animals and plants). Terpenoids are made up of five-carbon units called isoprene units and are classified by the number of these units present in their structure. Thus monoterpenes, sesquiterpenes and diterpenes are terpenes containing 10, 15 and 20 carbon atoms respectively. Diterpenoids, for example, are widely found in the plant kingdom and over 2500 diterpenoid structures have been described (Connolly and Hill, Dictionary of terpenoids, 1991, Chapman & Hall, London). Terpene molecules have been of interest for thousands of years because of their flavor and fragrance properties and their cosmetic, medicinal and antimicrobial effects. Plant extracts obtained by different means such as steam distillation or solvent extraction are used as a source of terpenes. Terpene molecules are often used as such, but in some cases chemical reactions are used to transform the terpenes into other high value molecules.

Biosynthetic production of terpenoids involves enzymes called diterpene synthases. These enzymes convert an acyclic terpene precursor into one or more terpene products. In particular, diterpene synthases produce diterpenes by cyclization of the precursor geranylgeranyl pyrophosphate.

Sclareol is a naturally occurring diterpene molecule extensively used as starting material for the synthesis of fragrance molecules with ambergris notes. These syntheses were developed to provide an alternative to ambergris, a waxy substance secreted by the intestines of the sperm whale. Ambergris is highly appreciated for its pleasant odor and has historically been used as a perfume ingredient. Due to its high price and the increasing demand for ambergris, and particularly due to the protection of the whale species, chemical syntheses of ambergris constituents and of molecules with ambergris character have been developed. Amongst these molecules, Ambrox® (registered trademark of Firmenich SA, Switzerland) is the most widely appreciated substitute for ambergris. The most widely used starting material for the synthesis of Ambrox® is the diterpene-diol sclareol. Labdenediol is also suitable as starting material for this synthesis.

Generally, the price and availability of plant natural extracts are dependent on the abundance, oil yield and geographical origin of the plants. In addition, the availability and quality of natural extracts are very much dependent on climate and other local conditions, leading to variability from year to year and rendering the use of such ingredients in high quality perfumery very difficult or even impossible in some years. Therefore, it would be an advantage to provide a source of sclareol and/or labdenediol which is less subject to fluctuations in availability and quality. Chemical synthesis would seem to be an evident option for the preparation of sclareol. However, given its highly complex structure, an economic synthetic process for the preparation of sclareol is still difficult. A biochemical pathway leading to the synthesis of sclareol and/or labdenediol would therefore be of great interest.

The biosynthesis of terpenes in plants and other organisms has been extensively studied and is not further detailed here, but reference is made to Dewick, Nat. Prod. Rep., 2002, 19, 181-222, which reviews the state of the art of terpene biosynthetic pathways.

WO 2008/007031 discloses a protein having a syn-copalyl-8-ol diphosphate synthase activity, the nucleotide sequence encoding said protein, as well as a vector and a transgenic non-human organism comprising said nucleic acid. This syn-copalyl-8-ol diphosphate synthase is nevertheless very different from the polypeptide of the invention, because the protein there disclosed has an amino acid sequence only 44% identical to the one of the invention.

Several other diterpene synthases have already been identified. In particular, US 7,238,514 B2 discloses a number of diterpene synthases, the nucleic acids encoding them as well as unicellular organisms transformed to express each of these synthases together with a GGPP synthase, thus producing diterpenes in vivo. Nevertheless, none has the same specific diterpene synthase activity as the polypeptide provided in the present invention. Moreover, the amino acid and nucleotide sequences disclosed in that document are very different from the sequences of the present invention. Among the described synthases, the closest to the polypeptides of the present invention is a Stevia rebaudiana copalyl pyrophosphate synthase (Cpps1) designated by SEQ ID NO:385 in US 7,238,514 B2 and by the accession number AAB87091.1. This polypeptide and the one of the invention only share 39% identity. Moreover, there is no evidence in this prior art document that the described diterpene synthase is useful for the production of the compounds of interest according to the present invention.

Copalyl diphosphate synthases having a certain percentage of sequence identity with the sequences of the present invention have also been found in the sequence databases. Nevertheless, the percentage of identity between the known diterpene synthases and the polypeptides of the invention is relatively low. The closest synthases to the ones of the invention are two copalyl diphosphate synthases (one from Solanum lycopersicum (BAA84918) and one from Cucurbita maxima (AAD04293 and AAD04292)), a putative copalyl diphosphate synthase from Scoparia dulcis (BAD91286) and a hypothetical protein from Vitis vinifera. The sequences of these proteins share only 41% identity with the ones of the invention and there is strictly no information on their activity or lack of activity with regard to the production of the compounds of interest according to the present invention.

In addition to the difference between the sequences themselves, it also has to be pointed out that the properties of copalyl diphosphate are very different from those of the products that can be obtained through the method of the invention. In particular, copalyl diphosphate is of no use in the field of perfumery and flavoring, whereas the compounds produced by the method of the present invention are of high interest in these technical fields, as explained above.

One document of the prior art relates specifically to a sclareol synthase (Banthorpe, Brown and Morris, Partial purification of farnesyl pyrophosphate: Drimenol cyclase and geranylgeranyl pyrophosphate: Sclareol cyclase, using cell culture as a source of material, Phytochemistry 31, 1992, 3391-3395). In this reference, a partially purified protein from Nicotiana glutinosa is identified as a sclareol synthase, but no indication is given regarding the amino acid sequence of that protein, the nucleotide sequence of the nucleic acid encoding it or the use of that protein in a method for producing diterpenes and/or labdenediol diphosphate and, more particularly, in a method for the biosynthesis of labdenediol diphosphate, labdenediol and/or sclareol.

Despite extensive studies of terpene cyclization, the isolation and characterization of the enzymes is still difficult, particularly in plants, due to their low abundance, their often transient expression patterns, and the complexity of purifying them from the mixtures of resins and phenolic compounds in tissues where they are expressed.

It is an objective of the present invention to provide methods for making diterpenes and/or labdenediol diphosphate in an economic way, as indicated above. Accordingly, the present invention has the objective to produce diterpenes while having little waste, a more energy and resource efficient process and while reducing dependency on fossil fuels. It is a further objective to provide enzymes capable of synthesizing diterpenes, which are useful as perfumery and/or aroma ingredients.

Abbreviations Used

bp base pair

BSA bovine serum albumin

DNA deoxyribonucleic acid

cDNA complementary DNA

dNTP deoxy nucleotide triphosphate

dT deoxy thymine

DTT dithiothreitol

FID flame ionization detector

FPP farnesyl-pyrophosphate

GC gas chromatograph

GGOH geranyl geraniol

GGPP geranylgeranyl pyrophosphate

GLOH geranyllinalool

idi isopentenyl diphosphate isomerase

IPTG isopropyl-D-thiogalactopyranoside

LB lysogeny broth

LOH labdenediol

LPP labdenediol diphosphate

MES 2-(N-morpholino)ethanesulfonic acid

MOPSO 3-(N-morpholino)-2-hydroxypropanesulfonic acid

MS mass spectrometer

PCR polymerase chain reaction

RMCE recombinase-mediated cassette exchange

RT-PCR reverse transcription - polymerase chain reaction

3'-/5'-RACE 3' and 5' rapid amplification of cDNA ends

RNA ribonucleic acid

mRNA messenger ribonucleic acid

nt nucleotide

RNase ribonuclease

SDS-PAGE SDS-polyacrylamide gel electrophoresis

Description of the invention

The present invention provides a method to biosynthetically produce at least one diterpene and/or labdenediol diphosphate in an economic, reliable and reproducible way.

One object of the present invention is therefore a method for producing at least one diterpene and/or labdenediol diphosphate comprising a) contacting geranylgeranyl pyrophosphate (GGPP) with a polypeptide having a diterpene synthase activity and comprising an amino acid sequence at least 70% identical to SEQ ID NO:1 or 2; and b) optionally, isolating the at least one diterpene and/or the labdenediol diphosphate produced in step a).

The method can be carried out in vitro as well as in vivo, as will be explained in detail further on.

The "at least one diterpene" produced is defined as an unsaturated hydrocarbon based on a C20 structure composed of four isoprene units (C5H8), and which may be acyclic or cyclic. As used herein, the word "diterpene" is intended to include diterpenes as well as diterpene derivatives, including compounds that have undergone one or more steps of functionalization such as hydroxylations, isomerizations, oxido-reductions, dimethylations or acylations. More generally, as used herein, a "derivative" is understood as any compound obtained from a known or hypothetical parent substance and containing essential elements of that parent substance.

According to a preferred embodiment of the invention, the at least one diterpene produced is a labdane derivative, labdane derivatives being intended as any compound containing the essential structural elements of labdane, as represented in Figure 1. According to a more preferred embodiment the at least one diterpene produced is labdenediol and/or sclareol. According to an even more preferred embodiment, the method of the invention is a method for producing labdenediol diphosphate. The products obtained by the method of the invention are dependent on the conditions under which the method is carried out. These conditions will be detailed later on.

As a "diterpene synthase" or as a "polypeptide having a diterpene synthase activity", we mean here a polypeptide capable of catalyzing the synthesis of a diterpene and/or of a diterpene diphosphate ester starting from the acyclic terpene precursor geranylgeranyl pyrophosphate (GGPP). In particular, the "diterpene synthase" or the "polypeptide having a diterpene synthase activity" is defined as a polypeptide capable of catalyzing the formation of labdenediol, sclareol and/or labdenediol diphosphate starting from GGPP. Terpene synthases are often named by reference to the compound, of which they catalyze the formation. For example, a polypeptide capable of catalyzing the formation of labdenediol diphosphate starting from GGPP can be named a labdenediol diphosphate synthase. The ability of a polypeptide to catalyze the synthesis of a particular diterpene and/or of labdenediol diphosphate can be confirmed by performing the enzyme assay as detailed in the Examples.

According to the present invention, polypeptides are also meant to include truncated polypeptides provided that they keep their diterpene synthase activity as defined above and that they share at least the defined percentage of identity with the corresponding fragment of SEQ ID NO:1 or SEQ ID NO:2.

According to a preferred embodiment, the method for producing at least one diterpene and/or labdenediol diphosphate comprises contacting GGPP with a polypeptide having a diterpene synthase activity and comprising an amino acid sequence at least 75%, preferably 80%, preferably 85%, preferably 90%, more preferably 95% and even more preferably 98% identical to SEQ ID NO:1 or 2. According to a more preferred embodiment, said polypeptide comprises the amino acid sequence SEQ ID NO:1 or 2. In an even more preferred embodiment, said polypeptide consists of SEQ ID NO:1 or 2.

According to a more preferred embodiment, the polypeptide having a diterpene synthase activity as intended in any embodiment of the method of the invention is a polypeptide comprising an amino acid sequence at least 70% identical to any of SEQ ID NO:43 to 46, which are truncated forms of SEQ ID NO:1, or to any of SEQ ID NO:51 to 54, which are truncated forms of SEQ ID NO:2. Preferably said polypeptide comprises an amino acid sequence at least 75%, preferably at least 80%, preferably at least 85%, preferably at least 90%, more preferably at least 95% and most preferably at least 98% identical to any of SEQ ID NO:43 to 46 or to any of SEQ ID NO:51 to 54. According to a more preferred embodiment, said polypeptide comprises any of SEQ ID NO:43 to 46 or any of SEQ ID NO:51 to 54. According to an even more preferred embodiment, said polypeptide consists of any of SEQ ID NO:43 to 46 or of any of SEQ ID NO:51 to 54.

The percentage of identity between two peptidic or nucleotidic sequences is a function of the number of amino acids or nucleic acids residues that are identical in the two sequences when an alignment of these two sequences has been generated. Identical residues are defined as residues that are the same in the two sequences in a given position of the alignment. The percentage of sequence identity, as used herein, is calculated from the optimal alignment by taking the number of residues identical between two sequences dividing it by the total number of residues in the shortest sequence and multiplying by 100. The optimal alignment is the alignment in which the percentage of identity is the highest possible. Gaps may be introduced into one or both sequences in one or more positions of the alignment to obtain the optimal alignment. These gaps are then taken into account as non-identical residues for the calculation of the percentage of sequence identity.
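The calculation defined in the preceding paragraph can be illustrated with a short sketch. It assumes that the two sequences have already been optimally aligned by an external program and are supplied as equal-length strings in which gaps are written as "-"; the function name percent_identity, the gap symbol and the example sequences are hypothetical and are not taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Follows the definition given in the text: identical residues at the same
    alignment position, divided by the residue count of the shortest sequence,
    multiplied by 100. Gap positions count as non-identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Count alignment columns holding the same residue; a gap never matches.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Number of residues (gaps excluded) in each original sequence.
    len_a = len(aligned_a) - aligned_a.count(gap)
    len_b = len(aligned_b) - aligned_b.count(gap)
    shortest = min(len_a, len_b)
    if shortest == 0:
        raise ValueError("each sequence must contain at least one residue")

    return 100.0 * identical / shortest


# Hypothetical 7-column alignment: 5 identical columns, shortest sequence has 6 residues.
print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))  # 83.3
```

The sketch does not search for the optimal alignment itself; it only scores an alignment that has already been generated.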

Alignment for purposes of determining the percentage of amino acid or nucleic acid sequence identity can be achieved in various ways using computer programs, for instance publicly available computer programs available on the world wide web. Preferably, the BLAST program (Tatiana et al., FEMS Microbiol Lett., 1999, 174:247-250) set to the default parameters, available from the National Center for Biotechnology Information (NCBI) at http://www.ncbi.nlm.nih.gov/BLAST/bl2seq/wblast2.cgi, can be used to obtain an optimal alignment of peptidic or nucleotidic sequences and to calculate the percentage of sequence identity.

The polypeptide to be used when the method is carried out in vitro can be obtained by extraction from any organism expressing it, using standard protein or enzyme extraction technologies. If the host organism is a unicellular organism or cell releasing the polypeptide of the invention into the culture medium, the polypeptide may simply be collected from the culture medium, for example by centrifugation, optionally followed by washing steps and re-suspension in suitable buffer solutions. If the organism or cell accumulates the polypeptide within its cells, the polypeptide may be obtained by disruption or lysis of the cells and further extraction of the polypeptide from the cell lysate.

The polypeptides, either in an isolated form or together with other proteins, for example in a crude protein extract obtained from cultured cells or microorganisms, may then be suspended in a buffer solution at optimal pH. If adequate, salts, DTT, BSA and other kinds of enzymatic co-factors may be added in order to optimize enzyme activity. Appropriate conditions are described in more detail in the Examples further on.

GGPP may then be added to the suspension or solution, which is then incubated at optimal temperature, for example between 15 and 40°C, preferably between 25 and 35°C, more preferably at 30°C. After incubation, the at least one diterpene and/or the labdenediol diphosphate produced may be isolated from the incubated solution by standard isolation procedures, such as solvent extraction and distillation, optionally after removal of polypeptides from the solution.

Labdenediol diphosphate, labdenediol ((13-E)-labda-13-ene-8α,15-diol) and/or sclareol ((13-R)-labda-14-ene-8α,13-diol) may be obtained. Other compounds that may be formed are manoyl oxides ((13R)-8,13-epoxy-14-labdene and (13S)-8,13-epoxy-14-labdene; see Figure 1). The exact product profile is dependent on the conditions in which the method is carried out. Examples of product profiles obtained when the method is carried out in vitro with a crude protein extract comprising the polypeptide of sequence SEQ ID NO:1 are provided in Figure 7.

Labdenediol diphosphate is the direct product of the enzymatic reaction catalyzed by the diterpene synthase used in the method of the invention. This product may then readily undergo chemical or enzymatic modifications, depending on the conditions under which the reaction is carried out, thus leading for example to the formation of sclareol and/or labdenediol (Figure 13). An increased production of sclareol and labdenediol is of particular interest for the purpose of the present invention since they can in turn be used for the chemical synthesis of Ambrox®, a well appreciated ingredient in the perfume industry. Labdenediol diphosphate is also a compound of interest, since it may be the substrate of further enzymatic reactions potentially leading to other useful molecules.

When the pH of the reaction medium is 7 or below, the amount of sclareol obtained increases relative to the other products. The more acidic the medium is, the more the relative concentration of sclareol increases. Therefore, according to a preferred embodiment, the method of the invention is carried out at pH 7 or below, more preferably at pH 6. According to an even more preferred embodiment, the acid used to reach the adequate pH is a naturally occurring acid, for example citric acid. This last embodiment is particularly advantageous as it provides a product that fulfills the regulatory conditions for qualification as a "natural product". According to another advantageous embodiment of the invention, the method is carried out in a reaction medium free of Mg²⁺. Under these conditions, the production of sclareol is considerably favored compared to the other products.

The respective proportions of each of the products obtained by the method of the invention are also dependent on the presence or the absence of phosphatases. After treatment with phosphatases, only traces of labdenediol diphosphate are found, the production of labdenediol being considerably favored. Conversely, the proportion of labdenediol diphosphate and of sclareol increases with the suppression or inhibition of phosphatases, for example with Na3VO4 for alkaline phosphatases.

According to another preferred embodiment, the method for producing at least one diterpene and/or labdenediol diphosphate is carried out in vivo. In this case, step a) of the above-described method comprises cultivating a non-human organism or cell capable of producing GGPP and transformed to express a polypeptide having a diterpene synthase activity and comprising an amino acid sequence at least 70% identical to SEQ ID NO:1 or 2 under conditions conducive to the production of at least one diterpene and/or labdenediol diphosphate.

According to a more preferred embodiment, the method further comprises, prior to step a), transforming a non-human organism or cell capable of producing GGPP with a nucleic acid encoding a polypeptide having a diterpene synthase activity and comprising an amino acid sequence at least 70% identical to SEQ ID NO:1 or 2, so that said organism expresses said polypeptide.

According to a preferred embodiment, said nucleic acid comprises a nucleotide sequence at least 70% identical to SEQ ID NO:3, SEQ ID NO:4 or the complement thereof. According to a more preferred embodiment, said nucleic acid comprises a nucleotide sequence at least 75%, preferably 80%, more preferably 85%, more preferably 90%, more preferably 95% and even more preferably 98% identical to SEQ ID NO:3, SEQ ID NO:4 or the complement thereof. According to a more preferred embodiment, the nucleic acid comprises a nucleotide sequence identical to SEQ ID NO:3, SEQ ID NO:4 or the complement thereof. According to an even more preferred embodiment, the nucleic acid consists of SEQ ID NO:3, SEQ ID NO:4 or the complement thereof.

According to a more preferred embodiment, the nucleic acid comprises a nucleotide sequence at least 70% identical to any of SEQ ID NO:39 to 42, which are truncated forms of SEQ ID NO:3, to any of SEQ ID NO:47 to 50, which are truncated forms of SEQ ID NO:4, or to the complement thereof. Preferably said nucleic acid comprises a nucleotide sequence at least 75%, preferably at least 80%, preferably at least 85%, preferably at least 90%, more preferably at least 95% and most preferably at least 98% identical to any of SEQ ID NO:39 to 42, to any of SEQ ID NO:47 to 50 or to the complement thereof. According to a more preferred embodiment, said nucleic acid comprises any of SEQ ID NO:39 to 42, SEQ ID NO:47 to 50 or the complement thereof. According to an even more preferred embodiment, said nucleic acid consists of any of SEQ ID NO:39 to 42, of any of SEQ ID NO:47 to 50 or of the complement thereof.

These embodiments of the invention are particularly advantageous since it is possible to carry out the method in vivo without previously isolating the polypeptide. The reaction occurs directly within the organism or cell transformed to express said polypeptide.

The organism or cell is meant to "express" a polypeptide, provided that the organism or cell is transformed to harbor a nucleic acid encoding said polypeptide, this nucleic acid is transcribed to mRNA and the polypeptide is found in the host organism or cell. The term "express" encompasses "heterologously express" and "over-express", the latter referring to levels of mRNA, polypeptide and/or enzyme activity over and above what is measured in a non-transformed organism or cell. A more detailed description of suitable methods to transform a non-human organism or cell is given later on, in the part of the specification that is dedicated to such transformed non-human organisms or cells as specific objects of the present invention, and in the examples.

A particular organism or cell is meant to be capable of producing GGPP when it produces GGPP naturally or when it does not produce GGPP naturally but is transformed to produce GGPP, either prior to the transformation with a nucleic acid as described herein or together with said nucleic acid. Organisms or cells transformed to produce a higher amount of GGPP than the naturally occurring organism or cell are also encompassed by the "organisms or cells capable of producing GGPP". Methods to transform organisms, for example microorganisms, so that they produce GGPP are already known in the art. Such methods can for example be found in Huang, Roessner, Croteau and Scott, Engineering Escherichia coli for the synthesis of taxadiene, a key intermediate in the biosynthesis of taxol, Bioorg Med Chem., 9(9), 2001, 2237-2242. According to a preferred embodiment, the organism accumulates GGPP naturally or is transformed to accumulate this precursor.

To carry out the invention in vivo, the host organism or cell is cultivated under conditions conducive to the production of diterpenes and/or labdenediol diphosphate. Accordingly, if the host is a transgenic plant, optimal growth conditions are provided, such as optimal light, water and nutrient conditions. If the host is a unicellular organism, conditions conducive to the production of the diterpenes and/or the labdenediol diphosphate may comprise addition of suitable cofactors to the culture medium of the host. In addition, a culture medium may be selected so as to maximize diterpene and/or labdenediol diphosphate synthesis. Optimal culture conditions are described in more detail in the following Examples.

Non-human organisms suitable to carry out the method of the invention in vivo may be any non-human multicellular or unicellular organisms. In a preferred embodiment, the non-human organism used to carry out the invention in vivo is a plant, a prokaryote or a fungus. In a more preferred embodiment the non-human organism is a microorganism. According to an even more preferred embodiment said microorganism is a bacterium or a fungus, preferably yeast. Most preferably, said bacterium is E. coli and said yeast is Saccharomyces cerevisiae.

Any plant may be used to carry out the method of the invention in vivo. Particularly useful plants are those that naturally produce high amounts of terpenes. In a more preferred embodiment, the plant is selected from the family of Solanaceae, Poaceae, Brassicaceae, Fabaceae, Malvaceae, Asteraceae or Lamiaceae. For example, the plant is selected from the genera Nicotiana, Solanum, Sorghum, Arabidopsis, Brassica (rape), Medicago (alfalfa), Gossypium (cotton), Artemisia, Salvia and Mentha. Preferably, the plant belongs to the species of Nicotiana tabacum. Any prokaryote or fungus can be used to carry out the method of the invention and, similarly, any microorganism can be used. However, it has to be mentioned that several of these host organisms, for example microorganisms, do not produce GGPP naturally. To be suitable to carry out the method of the invention, these organisms have to be transformed to produce said precursor. They can be so transformed either before the modification with the nucleic acid encoding the polypeptide having a diterpene synthase activity or simultaneously.

Isolated higher eukaryotic cells can also be used, instead of complete organisms, as hosts to carry out the method of the invention in vivo. Suitable eukaryotic cells may be any non-human cell, but are preferably plant cells.

An important tool to carry out the method of the invention is the polypeptide itself.

A polypeptide having a diterpene synthase activity and comprising an amino acid sequence at least 70% identical to SEQ ID NO:1 or 2 is therefore another object of the present invention.

According to a preferred embodiment, the diterpene synthase comprises an amino acid sequence at least 75%, preferably 80%, preferably 85%, preferably 90%, more preferably 95% and even more preferably 98% identical to SEQ ID NO:1 or 2. According to another preferred embodiment, the polypeptide comprises the amino acid sequence SEQ ID NO:1 or 2. According to a more preferred embodiment, the polypeptide consists of SEQ ID NO:1 or 2.

According to a more preferred embodiment, the diterpene synthase is a polypeptide comprising an amino acid sequence at least 70% identical to any of SEQ ID NO:43 to 46, which are truncated forms of SEQ ID NO:1, or to any of SEQ ID NO:51 to 54, which are truncated forms of SEQ ID NO:2. Preferably said diterpene synthase comprises an amino acid sequence at least 75%, preferably at least 80%, preferably at least 85%, preferably at least 90%, more preferably at least 95% and most preferably at least 98% identical to any of SEQ ID NO:43 to 46 or to any of SEQ ID NO:51 to 54. According to a more preferred embodiment, said diterpene synthase comprises any of SEQ ID NO:43 to 46 or any of SEQ ID NO:51 to 54. According to an even more preferred embodiment, said diterpene synthase consists of any of SEQ ID NO:43 to 46 or of any of SEQ ID NO:51 to 54.
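The identity thresholds above can be illustrated with a short calculation. The following is only a sketch: this part of the specification does not state how the two sequences are aligned or how gaps are counted, so the counting rule used here (identical columns divided by the columns in which at least one sequence has a residue) and the function names are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two sequences that are ALREADY aligned.

    Assumption (not taken from the specification): gaps are written as '-',
    columns that are gaps in both sequences are ignored, and identity is
    identical columns / remaining columns * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # empty column in both sequences, not counted
        columns += 1
        if a == b:  # a gap cannot match here: both-gap columns were skipped
            matches += 1
    if columns == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences, not taken from the sequence listing.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))  # 9 of 10 columns identical
print(meets_threshold("AAAA", "AATT"))               # 50% identity, below 70%
```

A real comparison against SEQ ID NO:1 to 4 would first require an alignment step, which is deliberately left out of this sketch.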

According to another preferred embodiment of the invention, the polypeptide is derived from Salvia sclarea.

As used herein, the terms "diterpene synthase" or "polypeptide having a diterpene synthase activity" refer to a genus of polypeptide or peptide fragment that encompasses the amino acid sequences identified herein, as well as truncated polypeptides, provided that they keep their diterpene synthase activity as defined above and that they share at least the defined percentage of identity with the corresponding fragment of SEQ ID NO:1 or 2.

Naturally occurring peptide variants are also encompassed by the invention. Examples of such variants are proteins that result from alternate mRNA splicing events or from proteolytic cleavage of the polypeptides described herein. Variations attributable to proteolysis include, for example, differences in the N- or C-termini upon expression in different types of host cells, due to proteolytic removal of one or more terminal amino acids from the polypeptides of the invention.

The nucleic acid encoding the polypeptide having a diterpene synthase activity, as defined above, is a necessary tool to modify non-human organisms or cells intended to be used when the method is carried out in vivo. A nucleic acid encoding a polypeptide as defined in any of the above embodiments is therefore another object of the invention.

According to a preferred embodiment, the nucleic acid comprises a nucleotide sequence at least 70% identical to SEQ ID NO:3, SEQ ID NO:4 or the complement thereof. According to a more preferred embodiment, said nucleic acid comprises a nucleotide sequence at least 75%, preferably 80%, more preferably 85%, more preferably 90%, more preferably 95% and even more preferably 98% identical to SEQ ID NO:3, SEQ ID NO:4 or the complement thereof. According to a more preferred embodiment, the nucleic acid comprises a nucleotide sequence identical to SEQ ID NO:3, SEQ ID NO:4 or the complement thereof. According to an even more preferred embodiment, the nucleic acid consists of SEQ ID NO:3, SEQ ID NO:4 or the complement thereof.

According to a more preferred embodiment, the nucleic acid comprises a nucleotide sequence at least 70% identical to any of SEQ ID NO:39 to 42, which are truncated forms of SEQ ID NO:3, to any of SEQ ID NO:47 to 50, which are truncated forms of SEQ ID NO:4, or to the complement thereof. Preferably said nucleic acid comprises a nucleotide sequence at least 75%, preferably at least 80%, preferably at least 85%, preferably at least 90%, more preferably at least 95% and most preferably at least 98% identical to any of SEQ ID NO:39 to 42, to any of SEQ ID NO:47 to 50 or to the complement thereof. According to a more preferred embodiment, said nucleic acid comprises any of SEQ ID NO:39 to 42, any of SEQ ID NO:47 to 50 or the complement thereof. According to an even more preferred embodiment, said nucleic acid consists of any of SEQ ID NO:39 to 42, of SEQ ID NO:47 to 50 or of the complement thereof.

According to another preferred embodiment of the invention, the nucleic acid is derived from Salvia sclarea.

The nucleic acid of the invention can be defined as including deoxyribonucleotide or ribonucleotide polymers in either single- or double-stranded form (DNA and/or RNA). The term "nucleotide sequence" should also be understood as comprising a polynucleotide molecule or an oligonucleotide molecule in the form of a separate fragment or as a component of a larger nucleic acid. Nucleic acids of the invention also encompass certain isolated nucleotide sequences including those that are substantially free from contaminating endogenous material. The nucleic acid of the invention may be truncated, provided that it encodes a polypeptide encompassed by the present invention, as described above.

The nucleic acids obtained by mutations of SEQ ID NO:3, of SEQ ID NO:4 or of the complement thereof are also encompassed by an embodiment of the invention, provided that the resulting nucleic acid retains the desired diterpene synthase activity. Mutations may be any kind of mutations of these nucleic acids, such as point mutations, deletion mutations, insertion mutations and/or frame shift mutations. Variant nucleic acids may be prepared in order to adapt their nucleotide sequence to a specific expression system. For example, bacterial expression systems are known to express polypeptides more efficiently if amino acids are encoded by a preferred codon. Due to the degeneracy of the genetic code, wherein more than one codon can encode the same amino acid, multiple DNA sequences can code for the same polypeptide, all these DNA sequences being encompassed by the invention.
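The degeneracy argument can be made concrete with a small calculation. The sketch below builds the standard genetic code, counts how many distinct DNA sequences encode a given peptide, and back-translates a peptide using a caller-supplied table of preferred codons. The preferred-codon table shown is a hypothetical placeholder, not a measured codon usage for any host, and the function names are illustrative only.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons by the amino acid they encode.
SYNONYMOUS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMOUS.setdefault(aa, []).append(codon)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide (one-letter code)."""
    return prod(len(SYNONYMOUS[aa]) for aa in peptide.upper())


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1, ignoring any incomplete final codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def back_translate(peptide: str, preferred: dict) -> str:
    """Back-translate using preferred codons; fall back to any synonymous codon."""
    return "".join(preferred.get(aa, SYNONYMOUS[aa][0]) for aa in peptide.upper())


# Hypothetical preferred codons for an imaginary host (illustration only).
HYPOTHETICAL_PREFERRED = {"D": "GAT", "G": "GGC", "S": "AGC", "W": "TGG"}

peptide = "DGSWGD"
print(count_encodings(peptide))  # 2 * 4 * 6 * 1 * 4 * 2 = 384
dna = back_translate(peptide, HYPOTHETICAL_PREFERRED)
print(dna, translate(dna) == peptide)
```

Any of the 384 sequences counted for this six-residue peptide translates to the same amino acids, which is the sense in which several DNA sequences code for one polypeptide.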

Another important tool for transforming host organisms or cells suitable to carry out the method of the invention in vivo is an expression vector comprising a nucleic acid according to any embodiment of the invention. Such a vector is therefore also an object of the present invention.

An "expression vector" as used herein includes any linear or circular recombinant vector including but not limited to viral vectors, bacteriophages and plasmids. The skilled person is capable of selecting a suitable vector according to the expression system. In one embodiment, the expression vectors include the nucleic acid of the invention operably linked to at least one regulatory sequence, which controls transcription, translation, initiation and termination, such as a transcriptional promoter, operator or enhancer, or an mRNA ribosomal binding site and, optionally, including at least one selection marker. Nucleotide sequences are "operably linked" when the regulatory sequence functionally relates to the nucleic acid of the invention.

The expression vectors of the present invention may be used in the methods for preparing a genetically transformed host organism and/or cell, in host organisms and/or cells harboring the nucleic acids of the invention and in the methods for producing or making polypeptides having a diterpene synthase activity, as disclosed further below.

Recombinant non-human organisms and cells transformed to harbor the nucleic acid of the invention, so that they heterologously express or over-express the polypeptide of the invention, are also very useful tools to carry out the method of the invention. Such non-human organisms and cells are therefore another object of the present invention.

The non-human organisms of the invention may be any non-human multicellular or unicellular organisms. In a preferred embodiment, the non-human organism of the invention is a plant, a prokaryote or a fungus. In a more preferred embodiment the non-human organism is a microorganism. According to an even more preferred embodiment said microorganism is a bacterium or a fungus, preferably yeast. Most preferably, said bacterium is E. coli and said yeast is Saccharomyces cerevisiae.

Any plant can be transformed as described herein. Particularly useful plants are plants that naturally produce high amounts of terpenes. In a more preferred embodiment, the plant is selected from the family of Solanaceae, Poaceae, Brassicaceae, Fabaceae, Malvaceae, Asteraceae or Lamiaceae. For example, the plant is selected from the genera Nicotiana, Solanum, Sorghum, Arabidopsis, Brassica (rape), Medicago (alfalfa), Gossypium (cotton), Artemisia, Salvia and Mentha. Preferably, the plant belongs to the species of Nicotiana tabacum. Any prokaryote or fungus can be transformed according to the present invention and, similarly, any microorganism can be transformed. However, it has to be mentioned that several of these host organisms, for example microorganisms, do not produce GGPP naturally and therefore have to be transformed to produce said precursor. They can be so transformed either before the modification with the nucleic acid encoding the polypeptide having a diterpene synthase activity or simultaneously.

Isolated higher eukaryotic cells can also be transformed, instead of complete organisms. As higher eukaryotic cells, we mean here any non-human eukaryotic cell except yeast cells. Preferred higher eukaryotic cells are plant cells. The term "transformed" refers to the fact that the host was subjected to genetic engineering to comprise one, two or more copies of any of the nucleic acids of the invention. Preferably the term "transformed" relates to hosts heterologously expressing the polypeptides of the invention, as well as over-expressing them. Accordingly, in an embodiment, the present invention provides a transformed organism, in which the polypeptide of the invention is expressed in higher quantity than in the same organism not so transformed.

There are several methods known in the art for the creation of transgenic host organisms or cells such as plants, fungi, prokaryotes, or cell cultures of higher eukaryotic organisms. Appropriate cloning and expression vectors for use with bacterial, fungal, yeast, plant and mammalian cellular hosts are described, for example, in Pouwels et al., Cloning Vectors: A Laboratory Manual, 1985, Elsevier, New York and Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, 1989, Cold Spring Harbor Laboratory Press. Cloning and expression vectors for higher plants and/or plant cells in particular are available to the skilled person. See for example Schardl et al., Gene 61:1-11, 1987.

Methods for transforming host organisms or cells to harbor transgenic nucleic acids, such as those of the present invention, are familiar to the skilled person. For the creation of transgenic plants, for example, current methods include: electroporation of plant protoplasts, liposome-mediated transformation, Agrobacterium-mediated transformation, polyethylene-glycol-mediated transformation, particle bombardment, microinjection of plant cells, and transformation using viruses. In one embodiment, transformed DNA is integrated into a chromosome of a non-human host organism and/or cell such that a stable recombinant system results. Any chromosomal integration method known in the art may be used in the practice of the invention, including but not limited to, recombinase-mediated cassette exchange (RMCE), viral site-specific chromosomal insertion, adenovirus, and pronuclear injection.

In order to carry out the method for producing at least one diterpene and/or labdenediol diphosphate in vitro, as set out above, it is very advantageous to provide a method of making at least one polypeptide having a diterpene synthase activity. Therefore, the invention provides a method for producing at least one polypeptide having a diterpene synthase activity comprising the steps of:

a) transforming a non-human host organism or cell with the expression vector of the invention, so that it harbors a nucleic acid according to the invention and expresses or over-expresses the polypeptide encoded by said nucleic acid;

b) culturing the organism under conditions conducive to the production of said polypeptide.

A "polypeptide variant" as referred to herein means a polypeptide having a diterpene synthase activity and being substantially homologous to a native polypeptide, but having an amino acid sequence different from that encoded by any of the nucleic acid sequences of the invention because of one or more deletions, insertions or substitutions.

Variants can comprise conservatively substituted sequences, meaning that a given amino acid residue is replaced by a residue having similar physicochemical characteristics. Examples of conservative substitutions include substitution of one aliphatic residue for another, such as Ile, Val, Leu, or Ala for one another, or substitutions of one polar residue for another, such as between Lys and Arg; Glu and Asp; or Gln and Asn. See Zubay, Biochemistry, Addison-Wesley Pub. Co., (1983). The effects of such substitutions can be calculated using substitution score matrices such as PAM-120, PAM-200, and PAM-250, as discussed in Altschul (J. Mol. Biol. 219:555-65, 1991). Other such conservative substitutions, for example substitutions of entire regions having similar hydrophobicity characteristics, are well known.
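The conservative substitutions named above can be expressed as a simple lookup. This is a minimal sketch that encodes only the residue groups listed in the text; it is a yes/no classification, not a PAM score, and the function name is an assumption made for illustration. A graded assessment would need one of the substitution matrices cited above, which are not reproduced here.

```python
# Residue groups taken from the examples given in the text (three-letter codes).
CONSERVATIVE_GROUPS = [
    {"Ile", "Val", "Leu", "Ala"},  # aliphatic residues
    {"Lys", "Arg"},                # polar pair
    {"Glu", "Asp"},                # polar pair
    {"Gln", "Asn"},                # polar pair
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` by `replacement` stays within one listed group.

    Identical residues are not a substitution and therefore return False.
    Residues outside the listed groups always return False in this sketch,
    which does not mean that no conservative replacement exists for them.
    """
    if original == replacement:
        return False
    return any(original in group and replacement in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("Leu", "Ile"))  # both aliphatic
print(is_conservative("Lys", "Glu"))  # different groups
```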

Naturally occurring peptide variants are also encompassed by the invention. Examples of such variants are proteins that result from alternate mRNA splicing events or from proteolytic cleavage of the polypeptides described herein. Variations attributable to proteolysis include, for example, differences in the N- or C-termini upon expression in different types of host cells, due to proteolytic removal of one or more terminal amino acids from the polypeptides encoded by the sequences of the invention.

Variants of the polypeptides of the invention may be used to attain desired enhanced or reduced enzymatic activity, modified regiochemistry or stereochemistry, or altered substrate utilization or product distribution. Furthermore, variants may be prepared to have at least one modified property, for example an increased affinity for the substrate, an improved specificity for the production of one or more desired compounds, a different product distribution, a different enzymatic activity, an increase of the velocity of the enzyme reaction, a higher activity or stability in a specific environment (pH, temperature, solvent, etc.), or an improved expression level in a desired expression system. A variant or site-directed mutant may be made by any method known in the art.

As stated above, the invention provides recombinant and non-recombinant, isolated and purified polypeptides, such as from Salvia sclarea. Variants and derivatives of native polypeptides can be obtained by isolating naturally occurring variants, or the nucleotide sequence of variants, of other or same plant lines or species, or by artificially programming mutations of nucleotide sequences coding for native terpene synthases. Alterations of the native amino acid sequence can be accomplished by any of a number of conventional methods.

Polypeptide variants resulting from a fusion of additional peptide sequences at the amino and carboxyl terminal ends of the polypeptides of the invention can be used to enhance expression of the polypeptides, be useful in the purification of the protein or improve the enzymatic activity of the polypeptide in a desired environment or expression system. Such additional peptide sequences may be signal peptides, for example. Accordingly, the present invention encompasses variants of the polypeptides of the invention, such as those obtained by fusion with other oligo- or polypeptides and/or those which are linked to signal peptides.

Therefore, in an embodiment, the present invention provides a method for preparing a variant polypeptide having a diterpene synthase activity and comprising the steps of:

(a) selecting a nucleic acid according to any of the embodiments set out above;

(b) modifying the selected nucleic acid to obtain at least one mutant nucleic acid;

(c) transforming host cells or unicellular organisms with the mutant nucleic acid sequence to express a polypeptide encoded by the mutant nucleic acid sequence;

(d) screening the polypeptide for a functional polypeptide having at least one modified property; and

(e) optionally, if the polypeptide has no desired variant diterpene synthase activity, repeating process steps (a) to (d) until a polypeptide with a desired variant diterpene synthase activity is obtained.

In step (b), a large number of mutant nucleic acid sequences may be created, for example by random mutagenesis, site-specific mutagenesis, or DNA shuffling. The detailed procedures of gene shuffling are found in Stemmer, DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution. Proc Natl Acad Sci U S A., 1994, 91(22): 10747-10751. In short, DNA shuffling refers to a process of random recombination of known sequences in vitro, involving at least two nucleic acids selected for recombination. For example, mutations can be introduced at particular loci by synthesizing oligonucleotides containing a mutant sequence, flanked by restriction sites enabling ligation to fragments of the native sequence. Following ligation, the resulting reconstructed sequence encodes an analog having the desired amino acid insertion, substitution, or deletion. Alternatively, oligonucleotide-directed site-specific mutagenesis procedures can be employed to provide an altered gene wherein predetermined codons can be altered by substitution, deletion or insertion.

Accordingly, SEQ ID NO:3 or SEQ ID NO:4 may be recombined with each other and/or with other diterpene synthase encoding nucleic acids, for example isolated from an organism other than Salvia sclarea. Thus, mutant nucleic acids may be obtained and separated, which may be used for transforming host cells according to standard procedures, for example such as disclosed in the present Examples.

In step (d), the polypeptide obtained in step (c) is screened for a modified property, for example a desired modified enzymatic activity. Examples of desired enzymatic activities, for which an expressed polypeptide may be screened, include enhanced or reduced enzymatic activity, as measured by KM or Vmax value, modified regiochemistry or stereochemistry and altered substrate utilization or product distribution. The screening of enzymatic activity can be performed according to procedures familiar to the skilled person and those disclosed in the present Examples.
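The two kinetic parameters mentioned for screening are conventionally related to the reaction rate by the Michaelis-Menten equation, v = Vmax·[S] / (KM + [S]). The sketch below assumes that this simple model applies to the enzyme, which the specification does not state, and the parameter values are hypothetical numbers chosen only to show how a lower KM changes the rate at a fixed substrate concentration.

```python
def michaelis_menten_rate(substrate: float, v_max: float, k_m: float) -> float:
    """Initial reaction rate under the Michaelis-Menten model.

    substrate and k_m must be in the same concentration unit;
    the result is in the unit of v_max.
    """
    if substrate < 0 or v_max < 0 or k_m <= 0:
        raise ValueError("substrate and v_max must be >= 0 and k_m must be > 0")
    return v_max * substrate / (k_m + substrate)


# Hypothetical parameters (illustration only, not measured values).
substrate_uM = 80.0  # substrate concentration in micromolar
reference = michaelis_menten_rate(substrate_uM, v_max=1.0, k_m=20.0)
variant = michaelis_menten_rate(substrate_uM, v_max=1.0, k_m=5.0)
print(reference, variant, variant > reference)
```

At a substrate concentration equal to KM the rate is exactly half of Vmax, which is the usual reading of KM when comparing a variant with its parent enzyme.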

Step (e) provides for repetition of process steps (a)-(d), which may preferably be performed in parallel. Accordingly, by creating a significant number of mutant nucleic acids, many host cells may be transformed with different mutant nucleic acids at the same time, allowing for the subsequent screening of an elevated number of polypeptides. The chances of obtaining a desired variant polypeptide may thus be increased at the discretion of the skilled person.
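The repeated cycle of mutation and screening in steps (a) to (e) can be pictured as a loop. The following is a toy simulation under stated assumptions, not a laboratory protocol: the "library" is a list of strings carrying one random point mutation each, and `screen` is a hypothetical stand-in for transforming host cells, expressing the polypeptide and assaying it. The scoring function in the usage example (counting G and C bases) has no biological meaning and only makes the loop runnable.

```python
import random

BASES = "ACGT"


def point_mutants(sequence, n_mutants, rng):
    """Step (b), simulated: n_mutants copies of sequence, each with one base changed."""
    mutants = []
    for _ in range(n_mutants):
        pos = rng.randrange(len(sequence))
        new_base = rng.choice([b for b in BASES if b != sequence[pos]])
        mutants.append(sequence[:pos] + new_base + sequence[pos + 1:])
    return mutants


def evolve(parent, screen, is_desired, library_size=50, max_rounds=20, seed=0):
    """Repeat mutate-and-screen rounds until is_desired(score) holds or rounds run out.

    screen(sequence) -> number is a hypothetical assay; higher is better.
    Returns (best_sequence, rounds_used).
    """
    rng = random.Random(seed)
    best = parent
    for round_no in range(1, max_rounds + 1):
        library = point_mutants(best, library_size, rng)   # step (b)
        scored = [(screen(m), m) for m in library]         # steps (c) and (d), simulated
        top_score, top = max(scored)                       # whole library screened per round
        if top_score > screen(best):
            best = top                                     # keep the improved variant
        if is_desired(screen(best)):
            return best, round_no
    return best, max_rounds                                # step (e) exhausted


# Toy usage: the "assay" just counts G and C bases.
def toy_screen(seq):
    return seq.count("G") + seq.count("C")


best, rounds = evolve("ATATATATAT", toy_screen, lambda score: score >= 3)
print(best, toy_screen(best), rounds)
```

Screening the whole library in each round mirrors the point made above that many mutants can be handled in parallel, so that each repetition evaluates many candidates rather than one.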

In an embodiment, the present invention provides a method for preparing a nucleic acid encoding a variant polypeptide having a diterpene synthase activity, the method comprising the steps (a)-(e) disclosed above and further comprising the step of: (f) if a polypeptide having a desired variant diterpene synthase activity was identified, acquiring the mutant nucleic acid obtained in step (c), which was used to transform host cells or unicellular organisms to express the variant diterpene synthase following steps (c) and (d).

Polypeptide variants also include polypeptides having a specific minimal sequence identity with any of the polypeptides comprising the amino acid sequences according to SEQ ID NO:1 or SEQ ID NO:2.

All the publications mentioned in this application are incorporated by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.

Description of the drawings

Figure 1: Structures of the diverse compounds cited in the description.

Figure 2: Sequence of FN23 (SEQ ID NO:17), a fragment of a S. sclarea diterpene synthase, obtained by PCR amplification from the cDNA library. The deduced amino acid sequence is shown above the nucleotide sequence. The positions and orientation of the sense and anti-sense oligonucleotides specific for FN23 are shown above and below the sequence respectively (SEQ ID NO:18 to 20 and SEQ ID NO:24 to 26).

Figure 3: Full length sequence of the fragment SaTpsl (SEQ ID NO:28) obtained by assembly of the cDNA fragments FN23 (SEQ ID NO:17), FN30 (SEQ ID NO:23) and FN40 (SEQ ID NO:27). The deduced amino acid sequence is shown below the nucleotide sequence.

Figure 4: Alignment of the amino acid sequences SEQ ID NO:1 and 2 deduced from Sa3 (SEQ ID NO:3) and Sa9 (SEQ ID NO:4), two closely related diterpene synthase-encoding cDNAs isolated for the purpose of the present invention. Identical residues are in white letters and residues differing between the two sequences are in black letters.

Figure 5: SDS-PAGE analysis of the crude soluble protein extracts from E. coli cells expressing the Sa3 and Sa9 proteins (SEQ ID NO:1 and 2). Lanes 1 and 8: molecular weight standards; lanes 2 and 7: control proteins obtained from cells transformed with the plasmid without insert; lanes 3 and 4: proteins from cells transformed with pETDuet-Sa3; lanes 5 and 6: proteins from cells transformed with pETDuet-Sa9. The gel was stained for total protein using Coomassie blue.

Figure 6: GC-MS analysis of the products generated by the recombinant Sa3 protein. (A) Total ion chromatogram of the products obtained from the incubation of GGPP with a crude protein extract from E. coli expressing Sa3 (SEQ ID NO:1). (B) Total ion chromatogram of the extract obtained from the incubation of GGPP with a control protein extract. The mass spectra of peaks 1 and 2 are shown on the left side of the chromatogram. Peaks 3, 4 and 5 have been identified as (+)-manoyl oxide, (+)-13-epi-manoyl oxide and geranylgeraniol respectively by comparison of the retention times and mass spectra with those of authentic standards (data not shown). (C) and (D) Total ion chromatograms obtained with a sclareol and a labdenediol standard respectively. The mass spectrum of each standard is presented next to each chromatogram.

Figure 7: Examples of product profiles observed when incubating the unpurified S. sclarea diterpene synthases with GGPP under different conditions. Crude protein extracts from E. coli expressing the recombinant Sa3 (SEQ ID NO:1) were incubated with 80 μM GGPP in 50 mM MOPSO pH 7 in a final volume of 1 mL. Variations in the composition of the incubations were as follows: 100 μL protein, 20 μM MgCl2 (A); 100 μL protein, 20 μM MgCl2, 6 mM Na3VO4 (B); 100 μL protein, 6 mM Na3VO4, without MgCl2 (C); 50 μL protein, 6 mM Na3VO4, without MgCl2 (D).

Figure 8: SDS-PAGE analysis of the affinity purified recombinant sage diterpene synthase Sa3 (SEQ ID NO:1) expressed in E. coli. Lane M, molecular weight standard; lane 1, crude soluble protein extract from control cells; lane 2, crude soluble protein extract from cells transformed with pET28-Sa3; lane 3, flow-through fractions; lanes 4 to 7, washing fractions; lanes 8 to 10, elution fractions with 250 mM L-histidine. The gel was stained for total protein using Coomassie blue.

Figure 9: GC analysis of the products obtained after incubation of the affinity purified Sa3 (SEQ ID NO:1) with GGPP in the absence of MgCl2 and Na3VO4. (A) Direct solvent extract; (B) Solvent extract of the same sample after alkaline phosphatase treatment.

Figure 10: Effect of pH on enzymatic conversion of geranylgeranyl diphosphate (GGPP). GGPP was incubated for 16 hours at pH 6, 7 and 9 with purified Sa3 protein (SEQ ID NO:1) and the incubations were extracted for GC analysis (A). The remaining aqueous phases were then treated with alkaline phosphatase and extracted for GC analysis (B).

Figure 11: Effect of pH and enzyme on labdenediol diphosphate (LPP). Purified LPP was incubated for 12 hours at pH 6, 7 and 9 with purified Sa3 protein and the incubations were extracted for GC analysis (A). In parallel, the same incubations were performed but leaving out the enzyme (C). The remaining aqueous phases were then treated with alkaline phosphatase and extracted for GC analysis (B and D, respectively). Epoxy: labdenediol epoxides.

Figure 12: (A) N-terminal sequences of the full-length and truncated Sa3 recombinant diterpene synthases (SEQ ID NO:34 and SEQ ID NO:35 to 38). (B) SDS-PAGE analysis of the full-length and truncated versions of the Sa3 diterpene synthases expressed in E. coli. Lane M, molecular weight standard; lane 1, crude soluble protein extract from control cells; lane 2, crude soluble protein extract from cells transformed with pET28-Sa3; lane 3, purified histidine-tagged Sa3; lanes 4 and 5, respectively 1 and 0.5 μL of crude soluble protein extract from cells transformed with pETDuet-Sa3; lanes 6 to 9, 0.5 μL of crude soluble protein extract from cells transformed with pETDuet containing the four sequential deletions. The gel was stained for total protein using Coomassie blue.

Figure 13: Cyclization mechanism of GGPP to labdenediol diphosphate (LPP) and further conversion into sclareol, labdenediol and epoxy labdene.

Specific embodiments of the invention or examples

The invention will now be described in further detail by way of the following Examples.

Example 1

Isolation of diterpene synthase cDNAs from Salvia sclarea

A. Plant material and RNA extraction

Salvia sclarea flowers, flower buds and leaves were collected in fields of Bassins (Switzerland) and directly frozen in liquid nitrogen. The different parts of the plants were extracted separately with ethyl acetate and the sclareol concentration was evaluated by GC analysis (capillary SPB-1 column (Supelco); initial temperature of 100°C followed by a gradient of 10°C/min to 240°C). Sclareol was found to be present in all parts of the plants. The mature flowers contained the highest concentration of sclareol (up to 1% w/w), but sclareol was already present in the early stages of the developing flower buds. Assuming that the expression of the genes responsible for the biosynthesis of a metabolite precedes the detectable accumulation of this metabolite, we used developing flower buds (1.5 to 2 cm length, 1-2 days old) for the isolation of mRNA. Total RNA was extracted using the Concert™ Plant RNA Reagent from Invitrogen (Carlsbad, CA) and the mRNA was purified by oligo(dT)-cellulose affinity chromatography using the FastTrack® 2.0 mRNA Isolation Kit (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. A cDNA library was constructed using the Marathon™ cDNA Amplification Kit (Clontech, Mountain View, CA).

B. Polymerase chain reactions for amplification of diterpene synthase cDNAs

Amino acid sequences of class I and II diterpene synthases from different plants were aligned and conserved motifs were selected. Degenerate oligonucleotide sequences were deduced from these conserved amino acid motifs. Thus the motifs SAYDTAWVA (SEQ ID NO:5) and DGSWGD (SEQ ID NO:6), located in the amino-terminal end of diterpene synthases, were used to design the forward oligonucleotides DT1F (5'-TCDGCNTAYGAYACNGCWTGGGTDG-3' (SEQ ID NO:7)) and DT2F (5'-GAYGGNTCNTGGGGHGA-3' (SEQ ID NO:8)). The motif DxDDTAM (SEQ ID NO:9, x being any amino acid), found in the central part of diterpene synthase amino acid sequences and postulated to be involved in the interaction with the diphosphate moiety of GGPP in class II diterpene synthases, was used to design the forward primer DT3F (5'-GAYRTNGAYGAYACNGCNATGG-3' (SEQ ID NO:10)) and the reverse primer DT3R (5'-CCATNGCNGTRTCRTCNAYRTC-3' (SEQ ID NO:11)). Another motif, DVW(I/L)GK(T/S) (SEQ ID NO:12, 13, 14 and 15), found in some diterpene synthases, was used to design the reverse primer DT4R (5'-GTYTTNCCNAKCCANACRTCRYYT-3' (SEQ ID NO:16)).
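The degenerate primers above are written with IUPAC ambiguity codes, where for example Y stands for C or T and N for any base. The sketch below, offered as an illustration rather than as the authors' design procedure, counts how many distinct oligonucleotides a degenerate primer represents and checks that DT3R is the reverse complement of DT3F, as expected for a forward and reverse primer designed on the same motif. The function names are assumptions.

```python
from math import prod

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC code (the code for the complemented base set).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def degeneracy(primer: str) -> int:
    """Number of distinct non-degenerate sequences represented by the primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def reverse_complement(primer: str) -> str:
    """Reverse complement of a primer that may contain IUPAC ambiguity codes."""
    return "".join(COMPLEMENT[base] for base in reversed(primer.upper()))


DT2F = "GAYGGNTCNTGGGGHGA"        # SEQ ID NO:8
DT3F = "GAYRTNGAYGAYACNGCNATGG"   # SEQ ID NO:10
DT3R = "CCATNGCNGTRTCRTCNAYRTC"   # SEQ ID NO:11

print(degeneracy(DT2F))                    # 2 * 4 * 4 * 3 = 96
print(degeneracy(DT3F))                    # 1024
print(reverse_complement(DT3F) == DT3R)    # True
```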

PCRs were performed using these primers in all possible combinations of forward and reverse primers. The PCR mixture contained 0.4 μM of each primer, 300 μM of each dNTP, 5 μL of 10X HotStarTaq® DNA polymerase buffer (Qiagen), 2 μL of 50- to 250-fold diluted cDNA and 0.5 μL of HotStarTaq® DNA polymerase in a final volume of 50 μL. The cycling conditions were: 35 cycles of 45 sec at 94°C, 45 sec at 50°C and 2 min at 72°C; and 10 min at 72°C. The sizes of the PCR products were evaluated on a 1% agarose gel. The bands corresponding to the expected size were excised from the gel, purified using the QIAquick® Gel Extraction Kit (Qiagen) and cloned in the pCR®2.1-TOPO vector using the TOPO TA Cloning Kit (Invitrogen, Carlsbad, CA). Inserted cDNA fragments were then subjected to DNA sequencing and the sequences were compared against the GenBank non-redundant protein database (NCBI) using the BLASTX algorithm (Altschul et al, 1990). Of the different PCRs performed, only the combination of primers DT3F (SEQ ID NO:10) and DT4R (SEQ ID NO:16) gave DNA fragments with the expected size and with sequence homology to diterpene synthases. All fragments obtained from this amplification had exactly the same sequence. This 354 bp sequence was named FN23 (SEQ ID NO:17) (Figure 2).
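The screening design described above can be sketched in a few lines of Python. This is an illustration only: the grouping of primers into forward and reverse lists follows the text, while the variable names are hypothetical, and the time calculation covers only the programmed hold times stated in the text (no ramping or enzyme activation step).

```python
from itertools import product

# Degenerate primers as named in the text.
FORWARD_PRIMERS = ["DT1F", "DT2F", "DT3F"]
REVERSE_PRIMERS = ["DT3R", "DT4R"]

# Every forward/reverse pairing tested in the screen.
primer_pairs = list(product(FORWARD_PRIMERS, REVERSE_PRIMERS))

# Cycling program from the text: 35 cycles of 45 s / 45 s / 2 min,
# followed by a final 10 min extension.
CYCLES = 35
STEP_SECONDS = [45, 45, 120]      # 94 degC, 50 degC, 72 degC
FINAL_EXTENSION_SECONDS = 600

total_seconds = CYCLES * sum(STEP_SECONDS) + FINAL_EXTENSION_SECONDS

if __name__ == "__main__":
    for fwd, rev in primer_pairs:
        print(f"{fwd} + {rev}")
    print(f"Programmed hold time: {total_seconds / 60:.1f} min")
```

Note that the DT3F/DT3R pair targets the same motif on opposite strands, so only the remaining pairings can be expected to span a stretch of cDNA.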

C. Full length cDNA isolation by rapid amplification of cDNA ends (RACE)

Oligonucleotides specific for the FN23 sequence (SEQ ID NO:17) were designed: FN23-F1 (5'-GCACGGATACGACGTCGATCCAAATGTAC-3' (SEQ ID NO:18)), FN23-F2 (5'-GGGCTGCTCAACTAAGATTTCCAGGAG-3' (SEQ ID NO:19)) and FN23-F3 (5'-GGGTGATATCCGACCACTTATTTGATGAG-3' (SEQ ID NO:20)) (Figure 2). These primers were used in RT-PCR in combination with oligodT primers extended with an adaptor sequence (5'-aattcggtacccgggatcc(T)17-3') (SEQ ID NO:21). The composition of the RT-PCR reaction mixture was the following: 10 μl 5X Qiagen OneStep RT-PCR buffer, 400 μM each dNTP, 400 nM each primer, 2 μl Qiagen OneStep RT-PCR Enzyme Mix, 1 μl RNasin® Ribonuclease Inhibitor (Promega Co., Madison, WI) and 1250 ng total RNA in a final volume of 50 μl. The thermal cycler conditions were: 30 min at 50°C (reverse transcription); 15 min at 95°C (DNA polymerase activation); 35 cycles of 45 sec at 94°C, 45 sec at 50°C and 90 sec at 72°C; and 10 min at 72°C. A second round of PCR was performed using the RT-PCR products as template with the adapteurP primer (5'-aattcggtacccgggatcc-3' (SEQ ID NO:22)) in combination with the same or nested FN23-specific primers. This PCR approach provided a 1271 bp cDNA fragment (FN30 (SEQ ID NO:23)) having a 192 bp perfect overlap with the FN23 fragment (SEQ ID NO:17) and containing the 3' end, including the stop codon and the 3' non-coding sequence of the corresponding cDNA. For amplification of the 5' end of the cDNA, anti-sense oligonucleotides specific for FN23 were designed: FN23-R1 (5'-CATGGCATCTTCAACCCCAGCTTTATCTCATC-3' (SEQ ID NO:24)), FN23-R2 (5'-GTGGTCGGATATCACCCATCTTTCTTGAAGTCG-3' (SEQ ID NO:25)) and FN23-R3 (5'-CATTGGAGATGCAGACTCGACCGATTGACC-3' (SEQ ID NO:26)) (Figure 2). These primers were used for 5'RACE using the S. sclarea cDNA library following the Marathon™ cDNA Amplification Kit protocol (Clontech, Mountain View, CA). The thermal cycling conditions were as follows: 1 min at 94°C, 5 cycles of 30 sec at 94°C and 4 min at 72°C, 5 cycles of 30 sec at 94°C and 4 min at 70°C, 20 cycles of 30 sec at 94°C and 4 min at 68°C. This 5'RACE provided a 1449 bp cDNA fragment (FN40 (SEQ ID NO:27)) having a 227 bp perfect overlap with FN23 (SEQ ID NO:17). Comparison with known diterpene synthase sequences revealed that the FN40 fragment (SEQ ID NO:27) contained the translation initiation codon and an 87 bp non-coding region. The assembly of the three cDNA fragments (FN23, FN30 and FN40 (SEQ ID NO:17, 23 and 27)) provided a full length cDNA sequence (SaTps1) of 2655 bp with an open reading frame of 2355 bp (SEQ ID NO:28) coding for a 785-residue protein (SEQ ID NO:29) having strong homology with diterpene synthases and in particular with copalyl diphosphate synthases. The nucleotide sequence of SaTps1 and the corresponding amino acid sequence are presented in Figure 3. The DxDD motif, involved in protonation-initiated cyclization, was present in the amino acid sequence (position 372). The DDxD motif, involved in ionization-initiated cyclization, was not found in the protein sequence, but an aspartate/glutamate-rich motif was found in the carboxy-terminal end region.
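The fragment lengths and overlaps reported above can be cross-checked arithmetically. The sketch below is only an illustration (the function name `assembled_length` is hypothetical); it assumes the fragments are ordered 5' to 3' as FN40, FN23, FN30, with each stated overlap shared between neighbouring fragments.

```python
def assembled_length(fragment_lengths, overlaps):
    """Length of a contig built from consecutive fragments.

    fragment_lengths: lengths in bp, ordered 5' to 3'.
    overlaps: bp shared between each pair of neighbouring fragments.
    """
    if len(overlaps) != len(fragment_lengths) - 1:
        raise ValueError("need exactly one overlap per neighbouring pair")
    return sum(fragment_lengths) - sum(overlaps)


# Values from the text: FN40 (1449 bp), FN23 (354 bp), FN30 (1271 bp);
# FN40/FN23 overlap of 227 bp and FN23/FN30 overlap of 192 bp.
full_length = assembled_length([1449, 354, 1271], [227, 192])

# Number of codons in the reported 2355 bp open reading frame.
codons = 2355 // 3

if __name__ == "__main__":
    print("Assembled cDNA length (bp):", full_length)
    print("Codons in ORF:", codons)
```

Under these assumptions the summed fragments minus the overlaps reproduce the reported 2655 bp full-length sequence, and the 2355 bp open reading frame corresponds to 785 codons.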

Example 2

Heterologous expression in E. coli

The pETDuet-1 plasmid (Novagen, Madison, WI), designed for expression under the control of a T7 promoter, was used for expression in E. coli cells. To construct the expression plasmid, the open reading frame of SaTps1 (SEQ ID NO:28) was amplified by PCR from the cDNA library with the forward and reverse primers SaTps-Nde (5'-TACTGACATATGACTTCTGTAAATTTGAGCAGAGCACC-3' (SEQ ID NO:30)) and SaTps-Kpn (5'-TTGGTACCTCATACAACCGGTCGAAAGAGTACTTTG-3' (SEQ ID NO:31)), designed to introduce an NdeI site immediately before the start codon and a KpnI site after the stop codon. Since the open reading frame contains an NdeI site at position 1614, this amplification was performed in two steps by overlap extension PCR (Horton et al, Gene 77, 61-68, 1989), using the primers SaTps-Nde (SEQ ID NO:30) and SaTps-Kpn (SEQ ID NO:31) in combination with the primers Satps-mut1f (5'-GTTGGAGTGGATCCACATGCAGGAATGGTAC-3' (SEQ ID NO:32)) and Satps-mut1r (5'-GTACCATTCCTGCATCTGGATCCACTCCAAC-3' (SEQ ID NO:33)), designed to remove the NdeI site without altering the amino acid sequence. The resulting cDNAs were first ligated in the pCR2.1-TOPO plasmid using the TOPO TA Cloning Kit (Invitrogen, Carlsbad, CA) and the sequences of the inserts were verified prior to sub-cloning as NdeI-KpnI fragments into the pETDuet-1 vector.
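The cloning design above relies on restriction sites carried by the primers. As a minimal sketch (the helper `find_sites` is hypothetical; the recognition sequences CATATG for NdeI and GGTACC for KpnI are the standard ones), the following Python snippet locates these sites in the primer sequences as printed in the text, reading each string left to right.

```python
# Standard recognition sequences of the two enzymes used for sub-cloning.
NDEI = "CATATG"
KPNI = "GGTACC"

# Primer sequences as printed in the text.
PRIMERS = {
    "SaTps-Nde": "TACTGACATATGACTTCTGTAAATTTGAGCAGAGCACC",
    "SaTps-Kpn": "TTGGTACCTCATACAACCGGTCGAAAGAGTACTTTG",
    "Satps-mut1f": "GTTGGAGTGGATCCACATGCAGGAATGGTAC",
}


def find_sites(sequence, site):
    """Return the 0-based start positions of every occurrence of site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, "NdeI:", find_sites(seq, NDEI), "KpnI:", find_sites(seq, KPNI))
```

In this reading, SaTps-Nde carries the NdeI site whose ATG serves as the start codon, SaTps-Kpn carries the KpnI site, and the mutagenic primer Satps-mut1f contains CACATG rather than CATATG, consistent with removal of the internal NdeI site.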

Analysis of the sequences of several clones obtained by amplification from the cDNA library with the SaTps1-specific primers showed some variability at several positions of the cDNA sequence. Seven positions were identified at which two different amino acids can be found. One position was found where insertion of a serine residue occurred in some of the clones. These positions are listed in the table below.

These variations seemed to occur in a random manner in the eleven different clones sequenced, suggesting that at least two very closely related isoforms of a diterpene synthase are present in the S. sclarea genome and that the PCR amplification approach led to shuffling of the sequences. Two clones, Sa3 (SEQ ID NO:3) and Sa9 (SEQ ID NO:4), representative of the sequence variability (Figure 4), were selected for the heterologous expression and enzyme characterization experiments.

The plasmids pETDuet-Sa3 and pETDuet-Sa9 were transferred into BL21(DE3) E. coli cells (Novagen, Madison, WI). Single colonies of transformed cells were used to inoculate 5 ml LB medium. After 5 to 6 hours of incubation at 37°C, the cultures were transferred to a 20°C incubator and left for 1 hour for equilibration. Expression of the protein was then induced by the addition of 1 mM IPTG and the culture was incubated overnight at 20°C. The next day, the cells were collected by centrifugation, resuspended in 0.1 volume of 50 mM MOPSO pH 7, 10% glycerol and lysed by sonication. The extracts were cleared by centrifugation (30 min at 20,000 g), and the supernatants containing the soluble proteins were used for further experiments.

The crude protein extracts from pETDuet-Sa3 and pETDuet-Sa9 transformed cells were analyzed by SDS-PAGE and compared to protein extracts obtained from cells transformed with the empty pETDuet plasmid. The recombinant Sa3 and Sa9 proteins (SEQ ID NO:1 and 2) were clearly detected and the apparent molecular weight was estimated at 90 kDa, a value consistent with the calculated molecular weight of 83 kDa (Figure 5).

Example 3

Enzymatic activity of the crude protein extract

The enzymatic assays were performed in Teflon-sealed glass tubes using 50 to 100 μl of protein extract in a final volume of 1 mL of 50 mM MOPSO pH 7, 10% glycerol supplemented with 20 mM MgCl2 and 50 to 200 μM purified geranylgeranyl diphosphate (GGPP) (prepared as described by Keller and Thompson, Rapid synthesis of isoprenoid diphosphates and their isolation in one step using either thin layer or flash chromatography, J. Chromatogr 645(1), 1993, 161-167). The tubes were incubated for 5 to 48 hours at 30°C and the enzyme products were extracted twice with one volume of pentane. After concentration under a nitrogen flow, the extracts were analyzed by GC and GC/MS as described above (Example 1) and compared to extracts from control proteins (obtained from cells transformed with the empty plasmid).

Analysis of the products from the incubation of Sa3 (SEQ ID NO:1) and Sa9 (SEQ ID NO:2) with GGPP revealed the production of several compounds with mass spectra characteristic of labdane diterpenes (Figure 1). In the conditions described above, sclareol was formed as a minor product (peak 2 in Figure 6). The major product formed by the recombinant enzyme (peak 1 in Figure 6) was found to be (E)-13-labdene-8,15-diol (labdenediol). The identity of these two products was confirmed by coincidence of their retention times and mass spectra with those of authentic standards (Figure 6). Two additional enzyme products were detected (peaks 3 and 4) and were confirmed to be (13R)-8,13-epoxy-14-labdene and (13S)-8,13-epoxy-14-labdene (manoyl oxides).

Variations in the composition of the incubation mixtures led to alteration of the relative amounts of the different diterpene alcohols produced, and particularly of the ratio of sclareol to labdenediol. Addition of 6 mM sodium ortho-vanadate (Na3VO4), an inhibitor of alkaline phosphatases, resulted in a decrease of the labdenediol produced and an increase of sclareol. Omission of the Mg2+ cation led to the inversion of the ratio of the two products, with sclareol becoming the major product. Variations in the amount of protein extract used in the incubations also affected the product profile. For instance, decreasing the volume of protein extract from 100 to 50 μl resulted in a product profile where sclareol was the major product, with only trace amounts of labdenediol and epoxy-labdenes and with a lower overall productivity. Figure 7 illustrates the variations in product profile observed when incubating Sa3 (SEQ ID NO:1) or Sa9 (SEQ ID NO:2) with GGPP in different conditions.

Example 4

Purification of the sage diterpene synthases and enzymatic activities

To further characterize the recombinant diterpene synthases and to elucidate the possible contribution of phosphatase activities in the protein extracts, we undertook to purify the Sa3 and Sa9 enzymes (SEQ ID NO:1 and 2).

The pCR2.1-TOPO plasmids containing the Sa3 and Sa9 cDNAs (SEQ ID NO:3 and 4) (Example 2) were digested with NdeI and SacI and the inserts were ligated into the pET28a(+) plasmid (Novagen). The resulting expression plasmids (pET28-Sa3 and pET28-Sa9) contain the cDNAs with a 5'-end modification designed to express the proteins with an N-terminal hexa-histidine tag. Purification was performed under native conditions using the ProBond™ Purification System (Invitrogen) following the manufacturer's protocol, except that, for the elution, imidazole was replaced by L-histidine to minimize inhibition of the enzyme. Using this approach, the Sa3 and Sa9 recombinant enzymes (SEQ ID NO:1 and 2) could be purified to apparent homogeneity (Figure 8).

The affinity-purified enzymes were incubated for 12 hours at 30°C with 200 μM GGPP in MOPSO pH 7, 10% glycerol, with 1 mM DTT, without MgCl2 and without Na3VO4. The incubation was extracted with pentane and the extract analyzed by GC. None of the diterpene products previously observed with the unpurified enzyme were detected. Treatment of the aqueous phase with 6 units of bacterial alkaline phosphatase (Sigma), followed by pentane extraction and GC analysis of the organic phase, revealed the presence of labdenediol and traces of epoxy-labdenes (Figure 9). Thus, in these conditions the enzyme showed only labdenediol-diphosphate (LPP) synthase activity. Addition of 6 mM Na3VO4, which had a positive effect on the production of sclareol and labdenediol with the crude protein extracts, had no effect on the enzymatic activity with the recombinant enzyme.

We then undertook to clarify why sclareol or manoyl oxides, previously obtained as direct products of the enzymes, were no longer observed with the purified enzymes. If sclareol were obtained from LPP by an ionization-initiated-type enzymatic reaction, it would require a divalent cation for the chelation of the diphosphate moiety in the active site. The incubations were thus repeated in the presence of several cations: MgCl2, MnCl2, CuCl2, ZnCl2, NiCl2, CaCl2, CdCl2 or CoCl2 at concentrations of 0.25, 1, 3 and 10 mM. The incubations were extracted and analyzed by GC to evaluate the production of diterpenes. To hydrolyse any diterpene diphosphate ester remaining in the solution, the aqueous phase was treated with alkaline phosphatase, or with HCl (at a final concentration of 0.3 M HCl, pH 2) in cases where the alkaline phosphatase was inhibited by the presence of the cations, and re-extracted with pentane. Nevertheless, none of these incubations resulted in direct production of diterpene. In addition, the LPP synthase activity was inhibited by some of the cations: Mn2+ at concentrations of 0.25 mM or higher, Ca2+ and Cd2+ at concentrations of 3 mM or higher and Co2+ at a concentration of 10 mM.
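The cation screen described above amounts to a grid of eight salts at four concentrations. The following Python sketch is only an illustration of that design (the names `INHIBITION_THRESHOLD_MM` and `is_inhibitory` are hypothetical); it enumerates the conditions and flags those at which the text reports inhibition of the LPP synthase activity.

```python
from itertools import product

# Divalent cations (tested as chloride salts) and concentrations from the text.
CATIONS = ["Mg", "Mn", "Cu", "Zn", "Ni", "Ca", "Cd", "Co"]
CONCENTRATIONS_MM = [0.25, 1, 3, 10]

# Lowest concentration (mM) at which inhibition of LPP synthase
# activity is reported in the text; cations not listed were not
# reported as inhibitory.
INHIBITION_THRESHOLD_MM = {"Mn": 0.25, "Ca": 3, "Cd": 3, "Co": 10}


def is_inhibitory(cation, concentration_mm):
    """True if the text reports inhibition at this cation and concentration."""
    threshold = INHIBITION_THRESHOLD_MM.get(cation)
    return threshold is not None and concentration_mm >= threshold


conditions = list(product(CATIONS, CONCENTRATIONS_MM))
inhibited = [c for c in conditions if is_inhibitory(*c)]

if __name__ == "__main__":
    print("Conditions tested:", len(conditions))
    for cation, conc in inhibited:
        print(f"LPP synthase inhibited: {cation}2+ at {conc} mM")
```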

Example 5

Effect of pH on the enzymatic activity

The above results suggest that the two diterpene synthase isoforms do not catalyze the second cyclization step (from LPP to sclareol). The sclareol observed previously in some of the incubations with crude protein extracts could be formed by a non-enzymatic reaction. Diphosphate esters are pH sensitive and particularly unstable in acidic conditions. In addition, acid hydrolysis of allylic diphosphate esters often proceeds with allylic rearrangement, as illustrated by the hydrolysis of GGPP to a mixture of geranyllinalool and geranylgeraniol with a ratio increasing with the decrease of the pH. The formation of sclareol in some of the in vitro assays could therefore result from acid hydrolysis of LPP.

We thus performed assays at various pH values and the production of diterpenes and diterpene diphosphate esters was evaluated as described above in Example 4. Incubations were performed in various buffers (MES, MOPSO and Tris) at pH 6, 6.2, 7, 8 and 9. Figure 10 shows examples of GC profiles of extracts obtained from incubations at different pH values before and after hydrolysis of the diphosphate esters by alkaline phosphatase treatment. Sclareol was detected only in the incubations at pH between 6 and 7, and the ratio of sclareol to LPP or labdenediol was higher with increasing acidification of the medium. At pH 7 and higher the major product of the enzyme was LPP.

This pH dependence supported the idea of a non-enzymatic conversion of LPP. To confirm this hypothesis, we evaluated the stability of LPP at different pH values. For this purpose, LPP was synthesized enzymatically and purified. Large scale incubations (4 × 10 mL) were performed with the Sa3 or Sa9 recombinant diterpene synthase in conditions promoting the formation of LPP: in 50 mM MOPSO pH 7 containing 200 μM GGPP, without divalent cations, for 16 hours at 30°C. The incubations were pooled and concentrated by freeze-drying to a volume of 7 mL, and 3 mL of isopropanol:conc. NH4OH:H2O (6:2.5:0.5) were added. The sample was then subjected to silica flash chromatography using the isopropanol:NH4OH:H2O mixture described above as mobile phase. Fractions were collected and analyzed by TLC, as described by Keller and Thompson (J. Chromatogr., 645(1), 1993, 161-167). Fractions containing LPP were pooled, concentrated by freeze-drying, diluted in 10 mM ammonium hydrogen carbonate:methanol (1:1) and stored at -20°C.

The purified LPP was then incubated for 12 hours at different pH values, with or without purified diterpene synthase, and the conversion of LPP to diterpenes was evaluated either by direct solvent extraction of the incubation or by extraction after alkaline phosphatase treatment. Figure 11 shows GC-FID traces of selected examples of these experiments obtained at pH 6, 7 and 9. As observed previously in the incubation of GGPP with the diterpene synthase, formation of sclareol was observed at acidic pH and LPP was stable at alkaline pH. The same profiles were observed whether the enzyme was present or not. This experiment shows that the sclareol observed in some of the incubations, and particularly at pH 7 or lower, is formed by acid rearrangement of LPP and is not directly catalyzed by the diterpene synthases Sa3 or Sa9 (SEQ ID NO:1 or 2). The formation of the two isomers of manoyl oxide at a ratio of approximately 1:1 could be explained by an acid-catalyzed dehydration of LPP (Ravn et al, Org Lett. 2, 573-586, 1999) (Figure 11). Thus the enzymatic step catalyzed by Sa3 and Sa9 is limited to the conversion of GGPP to LPP.

Example 6

N-terminal deletions of the diterpene synthases

In plants, diterpene synthases are located in the plastids. This compartmentalization is controlled by a transport mechanism that recognizes an N-terminal transit peptide signal. Thus, diterpene synthases are generally expressed as pre-proteins and are processed in the plastids by cleavage of the peptide signal, resulting in a mature protein. Analysis of the N-terminal sequence of Sa3 and Sa9 (SEQ ID NO:34) did not reveal any clear evidence for the presence of a transit peptide. Experiments were thus performed to evaluate the effect of sequential N-terminal deletions on the enzymatic activity. Four truncated cDNAs were made for Sa3, resulting in deletion of 17, 37, 53 and 63 amino acids, respectively. Each construct was made by PCR using one of four different forward primers, each designed to anneal at the position of one of the desired truncations and to introduce an NdeI restriction site followed by an ATG translation initiation codon (Sa3_del1, attaCATATGCTGCAGCTACAGCCGGAATTTCATGCCG (SEQ ID NO:35); Sa3_del2, attaCATATGGCGCCCTTGACCTTGAGTTGCCAAATCC (SEQ ID NO:36); Sa3_del3, attaCATATGATAGCTGAATTGAGAGTAACAAGCCTGG (SEQ ID NO:37); Sa3_del4, attaCATATGGCGTCGCAAGCGAGTGAAAAAGAC (SEQ ID NO:38)). These primers were used in combination with the primer SaTps-Kpn (SEQ ID NO:31) (Example 2) and the four cDNAs obtained were ligated in the pETDuet-1 plasmid. Heterologous expression of the proteins was performed in E. coli as described in Example 2. Figure 12 shows an SDS-PAGE analysis comparing the level of production of the heterologous proteins obtained with the different full-length and truncated constructs. An improved expression level was clearly observed, especially for the two largest deletions. These results are typical for plastid-localized terpene synthases and reflect an improved solubility and/or stability of the mature protein compared to the pre-protein.

Example 7

In vivo production of diterpenes in E. coli using Sa3 and Sa9

In vivo production of diterpenes using Sa3 and Sa9 (SEQ ID NO:1 and 2) was evaluated in E. coli. Cells transformed with pETDuet-Sa3 (i.e. expressing only the Sa3 diterpene synthase) were grown in LB medium at 20°C for 24 hours after induction of the protein expression by IPTG. After addition of an internal standard (2 μg of farnesol), the culture was extracted with pentane, concentrated and analyzed by GC. In those cultures no diterpene compounds were detected.

These first observations were not surprising and are consistent with the fact that E. coli does not produce any C20-derived terpenes and thus does not accumulate any free GGPP. For the production of diterpenes in E. coli, the diterpene synthase needs to be expressed together with a GGPP synthase. The CrtE gene from Pantoea agglomerans, encoding a GGPP synthase, was selected and was synthesized with codon optimization and addition of appropriate restriction sites at the 3' and 5' ends. This gene was ligated into the pETDuet-Sa3 plasmid, thus providing the pETDuet-Sa3-CrtE plasmid containing both the Sa3 and CrtE genes under the control of the T7 promoter. Culture of E. coli cells transformed with this plasmid and extraction of the medium as described above revealed the presence of labdenediol at a concentration of 30 to 70 μg/L of culture. A second plasmid was constructed for the co-expression of two enzymes located upstream in the pathway: farnesyl diphosphate (FPP) synthase and isopentenyl diphosphate isomerase (idi). The two genes were amplified by PCR from E. coli cells and ligated into the pACYCDuet-1 plasmid as NcoI-EcoRI and NdeI-XhoI fragments, respectively, providing the plasmid pACYCDuet-FPPS-idi. The latter plasmid was co-transformed with the pETDuet-Sa3-CrtE plasmid in E. coli BL21 cells and the culture evaluated as described above. Quantities of labdenediol of up to 0.2 mg/L were obtained with cells expressing the four enzymes.
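Titers such as those above are typically derived from GC peak areas relative to the internal standard added before extraction (2 μg of farnesol in this Example). The sketch below shows one way this calculation could be done; it is not taken from the source. The peak areas, the culture volume and the relative response factor of 1.0 are hypothetical placeholders, since none of these values is stated in the text.

```python
def titer_ug_per_l(analyte_area, standard_area, standard_ug,
                   culture_ml, response_factor=1.0):
    """Estimate a product titer from GC peak areas and an internal standard.

    analyte_area, standard_area: integrated GC peak areas (arbitrary units).
    standard_ug: mass of internal standard added to the culture (micrograms).
    culture_ml: extracted culture volume (millilitres); assumed, not from the text.
    response_factor: analyte mass per unit area relative to the standard;
        1.0 assumes equal detector response, which is an approximation.
    """
    analyte_ug = (analyte_area / standard_area) * standard_ug * response_factor
    return analyte_ug / (culture_ml / 1000.0)


if __name__ == "__main__":
    # Hypothetical peak areas and culture volume, for illustration only.
    estimate = titer_ug_per_l(analyte_area=500.0, standard_area=2000.0,
                              standard_ug=2.0, culture_ml=5.0)
    print(f"Estimated titer: {estimate:.0f} ug/L")
```

With an internal standard added before extraction, losses during extraction and concentration affect analyte and standard alike, so the area ratio remains a usable estimate of the amount originally present.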

In other experiments, the accumulation of diterpene diphosphate esters in the cells was evaluated. For this purpose, cells were collected by centrifugation after the culture and were extracted 3 times with methanol:water (2:1). The extract was concentrated in a rotary evaporator and diluted in 25 mM Na2CO3 (pH 10). After extraction with pentane to remove any free form of diterpene, the aqueous phase was treated with alkaline phosphatase and extracted with pentane. In these extracts trace amounts of labdenediol were observed, showing accumulation of trace amounts of LPP in the cells.