Title:
BIOSYNTHESIS OF O-METHYLATED PHENOLIC COMPOUNDS
Document Type and Number:
WIPO Patent Application WO/2014/128252
Kind Code:
A1
Abstract:
The invention provides for a process for biosynthesis of an O-methylated compound by biotransforming a compound of Formula I into an O-methylated compound, wherein R1 is H, CHO, COOH, C1-6alkyl-COOH, or C2-6alkenyl-COOH; R2 is H, OH, or OCH3; which process comprises the steps: a) providing a recombinant host cell comprising i) at least one first heterologous gene encoding a first enzyme which is a 4-O-methyltransferase operable to biotransform the compound into a 4-O-methylated compound; and ii) optionally, if R2 is OH, at least one second heterologous gene encoding a second enzyme which is a 3-O-methyltransferase, wherein said first and second enzymes are operable to biotransform the compound into a 3,4-O-dimethylated compound; b) cultivating said host cell to express said first and optionally said second enzyme; c) providing the compound and biotransforming the compound into a 4-O-methylated compound or if R2 is OH, into a 3,4-O-dimethylated compound.

Inventors:
RAMAEN ODILE (FR)
SEBAI SARRA (FR)
PANDJAITAN RUDY (FR)
Application Number:
PCT/EP2014/053412
Publication Date:
August 28, 2014
Filing Date:
February 21, 2014
Assignee:
EVIAGENICS S A (FR)
International Classes:
C12N9/10; C12N15/52; C12P7/22; C12P7/24; C12P7/42
Domestic Patent References:
WO2013022881A1    2013-02-14
WO2004036979A2    2004-05-06
Other References:
T. KOEDUKA ET AL: "Biosynthesis of t-Anethole in Anise: Characterization of t-Anol/Isoeugenol Synthase and an O-Methyltransferase Specific for a C7-C8 Propenyl Side Chain", PLANT PHYSIOLOGY, vol. 149, no. 1, 5 November 2008 (2008-11-05), pages 384 - 394, XP055114190, ISSN: 0032-0889, DOI: 10.1104/pp.108.128066
MUKAI N ET AL: "PAD1 and FDC1 are essential for the decarboxylation of phenylacrylic acids in Saccharomyces cerevisiae", JOURNAL OF BIOSCIENCE AND BIOENGINEERING, ELSEVIER, AMSTERDAM, NL, vol. 109, no. 6, 1 June 2010 (2010-06-01), pages 564 - 569, XP027051720, ISSN: 1389-1723, [retrieved on 20091216]
KOTA P ET AL: "O-Methylation of benzaldehyde derivatives by ''lignin specific'' caffeic acid 3-O-methyltransferase", PHYTOCHEMISTRY, PERGAMON PRESS, GB, vol. 65, no. 7, 1 April 2004 (2004-04-01), pages 837 - 846, XP004501784, ISSN: 0031-9422, DOI: 10.1016/J.PHYTOCHEM.2004.01.017
Attorney, Agent or Firm:
REDL, Gerda et al. (Donau-City-Straße 11, Wien, AT)
Claims:
CLAIMS

1. A process for biosynthesis of an O-methylated compound by biotransforming a compound of Formula I into an O-methylated compound, wherein

Formula I is

wherein

R1 is H, CHO, COOH, C1-6alkyl-COOH, or C2-6alkenyl-COOH;

R2 is H, OH, or OCH3;

which process comprises the steps:

a) providing a recombinant host cell comprising

i) at least one first heterologous gene encoding a first enzyme which is a 4-O-methyltransferase operable to biotransform the compound into a 4-O-methylated compound; and

ii) optionally, if R2 is OH, at least one second heterologous gene encoding a second enzyme which is a 3-O-methyltransferase, wherein said first and second enzymes are operable to biotransform the compound into a 3,4-O-dimethylated compound;

b) cultivating said host cell to express said first and optionally said second enzyme;

c) providing the compound and biotransforming the compound into a 4-O-methylated compound or if R2 is OH, into a 3,4-O-dimethylated compound.

2. The process of claim 1, wherein the compound is added to the host cell, or wherein the compound is produced by the host cell as a metabolite.

3. The process of claim 1 or 2, wherein the compound is biotransformed by the host cell in a host cell culture.

4. The process of claim 1, wherein the compound is biotransformed by an in vitro method, wherein

a) said first and/or said second enzyme is expressed by the host cell in a host cell culture, optionally followed by separation of said first and/or said second enzyme from the host cell culture; and

b) the compound is biotransformed ex vivo, preferably in the cell culture supernatant or separate from the host cell culture.

5. The process of any of claims 1 to 4, wherein said first enzyme is a 3,4-O-dimethyltransferase operable to biotransform the compound into a 3,4-O-dimethylated compound, if R2 is OH.

6. The process of any of claims 1 to 5, wherein said first and/or said second enzyme is

a) a native enzyme originating from a species different from the host cell;

b) a functionally active variant enzyme of the native enzyme of a) having at least 60% sequence identity; or

c) a recombined variant enzyme with altered methylation function, preferably any alteration in regioselectivity, activity, conversion rate, thermostability and/or pH stability, preferably a chimeric enzyme.

7. The process of claim 6, wherein the native enzyme is an O-methyltransferase originating from a eukaryotic species selected from the group consisting of Medicago sativa, Homo sapiens, Pimpinella anisum, Ocimum basilicum, and Yarrowia lipolytica.

8. The process of claim 7, wherein the enzyme is selected from the group consisting of caffeic acid 3-O-methyltransferase from Medicago sativa (COMTmsa), t-anol/isoeugenol synthase from Pimpinella anisum (AIMTpan), chavicol O-methyltransferase from Ocimum basilicum (CVOMToba), human catechol 3-O-methyltransferase (COMThs), and catechol O-methyltransferase from Yarrowia lipolytica (COMTyli).

9. The process of claim 6, wherein said recombined variant enzyme is obtainable by mutagenesis of an O-methyltransferase, thereby producing a repertoire of variants, and selecting a variant enzyme with altered enzyme activity.

10. The process of any of claims 1 to 9, wherein the compound is selected from the group consisting of caffeic acid, ferulic acid, p-hydroxybenzaldehyde, and vanillin.

11. The process of claim 10, wherein

a) caffeic acid is biotransformed into 3,4-dimethoxycinnamic acid;

b) ferulic acid is biotransformed into 3,4-dimethoxycinnamic acid;

c) p-hydroxybenzaldehyde is biotransformed into p-anisaldehyde; or

d) vanillin is biotransformed into 3,4-dimethoxybenzaldehyde.

12. The process of any of claims 1 to 11, wherein the host cell is a bacterial or yeast cell, preferably a yeast cell selected from the group consisting of a Saccharomyces cell, a Schizosaccharomyces cell, a Pichia cell, a Yarrowia cell, a Hansenula cell, a Candida cell.

13. The process of any of claims 1 to 12, wherein the host cell is S. cerevisiae, preferably which is mutated to delete at least one of the genes FDC1 or PAD1.

14. An expression construct incorporating genes operable to express at least one enzyme to biotransform a compound of Formula I into an O-methylated compound, which comprises

i) at least one first gene encoding a first enzyme which is a 4-O-methyltransferase, and one or more genes encoding enzymes other than O-methyltransferases, which are part of a metabolic pathway to produce the compound of Formula I; or

ii) at least one first gene encoding a first enzyme which is a 4-O-methyltransferase, and at least a second gene encoding a second enzyme which is a 3-O-methyltransferase, and optionally one or more genes encoding enzymes other than O-methyltransferases, which are part of a metabolic pathway to produce the compound of Formula I.

15. The construct of claim 14, wherein said first and second genes are of different origin, preferably wherein at least one of the genes is encoding

a) a native enzyme originating from a different species; or

b) a functionally active variant enzyme of the native enzyme of a) having at least 60% sequence identity; or

c) a recombined variant enzyme of a) with altered methylation function, preferably any alteration in regioselectivity, activity, conversion rate, thermostability and/or pH stability, preferably a chimeric enzyme.

16. The construct of claim 14 or 15, wherein said first and second genes are native enzymes originating from a different eukaryotic species.

17. The construct of any of claims 14 to 16, which further comprises regulatory sequences to express said genes, preferably wherein the construct is an expression cassette.

18. A vector comprising the construct of any of claims 14 to 16.

19. A recombinant host cell comprising the construct of any of claims 14 to 16, or the vector of claim 18.

20. The host cell of claim 19, which is a bacterial or yeast cell, preferably a yeast cell selected from the group consisting of a Saccharomyces cell, a Schizosaccharomyces cell, a Pichia cell, a Yarrowia cell, a Hansenula cell, a Candida cell.

21. The host cell of claim 20, which is S. cerevisiae, preferably which is mutated to delete at least one of the genes FDC1 or PAD1.

22. A process for enzymatic O-methylation of a compound of Formula I into an O-methylated compound, comprising the steps:

a) providing a first enzyme which is a 4-O-methyltransferase operable to biotransform the compound into a 4-O-methylated compound; and

b) optionally, if R2 is OH, providing a second enzyme which is a 3-O-methyltransferase, wherein said first and second enzymes are operable to biotransform the compound into a 3,4-O-dimethylated compound;

c) providing the compound and biotransforming the compound into a 4-O-methylated compound or if R2 is OH, into a 3,4-O-dimethylated compound.

23. Use of a 4-O-methyltransferase in a process of enzymatically producing a compound of Formula I into a 4-O-methylated compound, and if R2 is OH, into a 3,4-O-dimethylated compound.

24. Use of a 4-O-methyltransferase and a 3-O-methyltransferase in a process of enzymatically producing a compound of Formula I wherein R2 is OH, into a 3,4-O-dimethylated compound.

Description:
BIOSYNTHESIS OF O-METHYLATED PHENOLIC COMPOUNDS

The invention refers to a process for biosynthesis of an O-methylated phenolic compound in recombinant host cells.

BACKGROUND

Organisms have developed sophisticated mechanisms for the synthesis of a wide spectrum of low molecular weight organic compounds, collectively known as secondary metabolites. The structural diversity found in natural products is attributed to a number of substitution reactions, which are catalyzed by substrate-specific, position-oriented enzymes. Of these, enzymatic O-methylation, catalyzed by S-adenosyl-L-methionine (SAM)-dependent O-methyltransferases (OMTs), plays a major role in the biosynthesis of secondary metabolites. The methylated products play important roles in lignin biosynthesis, as pharmacologically active substances, as antimicrobial compounds (phytoalexins), and in interactions of plants with the surrounding environment.

OMTs have been classified according to methyl acceptor and structural basis. In general, OMTs are substrate-specific, position-oriented enzymes, as described for a number of distinct enzymes. They methylate a wide range of compounds with a high degree of selectivity. In some cases, however, OMTs have been shown to exhibit a broader substrate specificity, accepting substrates from different classes of secondary metabolites. They also have been shown to be multifunctional (promiscuous) enzymes that catalyze the methylation of structurally related compounds such as phenylpropanoids (caffeic acid, caffeoyl CoA, catechol, and dimethylfuranone O-methyltransferase).

For substrates that possess two vicinal phenolic (catecholic) hydroxyl groups, according to the literature, substrates are in most cases methylated at the meta-position of the phenyl ring only by native O-methyltransferases, and are precluded from substitution of the para-hydroxyl (4-OH) position. In some cases proteins are described to be specific for methylation at the 4-position only.

WO2013/022881 A1 describes the biosynthesis of vanillin and/or vanillin beta-D-glucoside in a recombinant host which harbors a heterologous nucleic acid encoding a mutant Arom Multifunctional Enzyme (AROM) polypeptide and/or a mutant Catechol-O-Methyl Transferase (COMT) polypeptide. Vanillin is O-methylated in the meta-position only. In some cases isovanillin is produced besides vanillin; isovanillin is O-methylated in the para-position only.

As methylated compounds are becoming more and more important for different industrial applications, it is of high interest to identify and develop improved methods of producing O-methylated compounds. Although substantial work has been done to elucidate the substrate specificity of the different OMTs, the structural properties still remain to be identified.

SUMMARY OF THE INVENTION

It is the objective of the present invention to provide for a new process for biosynthetic production of O-methylated products, in particular wherein products are fully O-methylated.

The object is solved by the subject matter as claimed.

According to the invention there is provided a process for biosynthesis of an O-methylated compound by biotransforming a compound of Formula I into an O-methylated compound, wherein

Formula I is

wherein

R1 is H, CHO, COOH, C1-6alkyl-COOH, or C2-6alkenyl-COOH;

R2 is H, OH, or OCH3;

which process comprises the steps:

a) providing a recombinant host cell comprising

i) at least one first heterologous gene encoding a first enzyme which is a 4-O-methyltransferase operable to biotransform the compound into a 4-O-methylated compound; and

ii) optionally, if R2 is OH, at least one second heterologous gene encoding a second enzyme which is a 3-O-methyltransferase, wherein said first and second enzymes are operable to biotransform the compound into a 3,4-O-dimethylated compound;

b) cultivating said host cell to express said first and optionally said second enzyme;

c) providing the compound and biotransforming the compound into a 4-O-methylated compound or if R2 is OH, into a 3,4-O-dimethylated compound.

Specifically, the enzyme is operable to biotransform the compound by catalytic activity, i.e. enzymatic activity to accelerate the biosynthesis of the O-methylated compound.

Said first and second enzymes do not necessarily provide for a specific order of the first and second enzymatic reaction. Thus, the 4-O-methylation may be performed before or after the 3-O-methylation. The first enzyme is herein specifically understood as the enzyme capable of catalyzing at least the 4-O-methylation reaction; and the second enzyme is herein specifically understood as the enzyme capable of catalyzing the 3-O-methylation reaction.
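The order independence described above can be illustrated with a minimal sketch. It models only the substituents at ring positions 3 and 4 of a substrate in which R2 is OH; the dictionary representation and the function names are illustrative assumptions, not part of the disclosure.

```python
from itertools import permutations

# Substituents at ring positions 3 and 4 of a catechol-type substrate (R2 = OH).
# This toy representation is an assumption made for illustration only.
CATECHOL_SUBSTRATE = {3: "OH", 4: "OH"}


def o_methylate(state, position):
    """Return a new state in which the hydroxyl at `position` is converted to OCH3."""
    if state[position] != "OH":
        raise ValueError(f"position {position} has no free hydroxyl group")
    new_state = dict(state)
    new_state[position] = "OCH3"
    return new_state


def run_route(start, order):
    """Apply the O-methylation steps in the given order and keep every intermediate."""
    states = [start]
    for position in order:
        states.append(o_methylate(states[-1], position))
    return states


# Both orders: 3-O-methylation first, or 4-O-methylation first.
routes = {order: run_route(CATECHOL_SUBSTRATE, order) for order in permutations((3, 4))}

for order, states in routes.items():
    print("order", order, "->", states)
```

The two routes pass through different mono-methylated intermediates but end in the same 3,4-O-dimethylated state.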

Specifically, said first enzyme is a 3,4-O-dimethyltransferase operable to biotransform the compound comprising OH as R2 into a 3,4-O-dimethylated compound. In such case, a single enzyme would provide for the dimethylation, avoiding the need for providing an additional enzyme with 3-O-methylation activity, if R2 is OH.

According to a specific aspect, the compound is added to the host cell, e.g. as an additive to the cell culture medium. Alternatively, the compound is produced by the host cell as a metabolite.

Specifically, the compound is biotransformed by the host cell in a host cell culture. Such production of the compound by the host cell e.g. in a host cell culture, is also called "in vivo production" or "in vivo biotransformation".

According to a specific aspect, the compound is biotransformed by an in vitro method, also called "in vitro biotransformation", wherein

a) said first and/or said second enzyme is expressed by the host cell in a host cell culture, optionally followed by separation of said first and/or said second enzyme from the host cell culture; and

b) the compound is biotransformed ex vivo, preferably in the cell culture supernatant or separate from the host cell culture.

Specifically, the process is a production method to obtain the O-methylated compound in an extracellular medium of a eukaryotic host cell culture comprising:

a) providing a eukaryotic production host cell line capable of expressing the first and optionally the second enzyme under suitable conditions;

b) incubating the cells in a culture medium; and

c) optionally, extracting or isolating the O-methylated compound from the culture medium.

Specifically the host cell is a bacterial or yeast cell, preferably a yeast cell selected from the group consisting of a Saccharomyces cell, a Schizosaccharomyces cell, a Pichia cell, a Yarrowia cell, a Hansenula cell, a Candida cell.

More specifically, the host cell is S. cerevisiae, preferably which is mutated to delete at least one of the genes FDC1 or PAD1.

Specifically, the compound is selected from the group consisting of caffeic acid, ferulic acid, p-hydroxybenzaldehyde, and vanillin.

According to specific embodiments,

a) caffeic acid is biotransformed into 3,4-dimethoxycinnamic acid;

b) ferulic acid is biotransformed into 3,4-dimethoxycinnamic acid;

c) p-hydroxybenzaldehyde is biotransformed into p-anisaldehyde; or

d) vanillin is biotransformed into 3,4-dimethoxybenzaldehyde.
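As a bookkeeping check on the embodiments listed above, each O-methylation converts a hydroxyl group (-OH) into a methoxy group (-OCH3), a net gain of CH2. The sketch below applies that rule to molecular formulas. The formulas are standard textbook values supplied here as assumptions (they are not stated in this document), and the helper names are illustrative.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple molecular formula such as 'C9H8O4' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def format_formula(counts):
    """Format element counts with C and H first, then the other elements alphabetically."""
    order = [e for e in ("C", "H") if e in counts]
    order += sorted(e for e in counts if e not in ("C", "H"))
    return "".join(f"{e}{counts[e] if counts[e] != 1 else ''}" for e in order)


def o_methylate(formula, times=1):
    """Each O-methylation (-OH to -OCH3) adds one carbon and two hydrogens."""
    counts = parse_formula(formula)
    counts["C"] += times
    counts["H"] += 2 * times
    return format_formula(counts)


# (substrate, assumed formula, number of free phenolic OH groups to methylate, product)
EMBODIMENTS = [
    ("caffeic acid", "C9H8O4", 2, "3,4-dimethoxycinnamic acid"),
    ("ferulic acid", "C10H10O4", 1, "3,4-dimethoxycinnamic acid"),
    ("p-hydroxybenzaldehyde", "C7H6O2", 1, "p-anisaldehyde"),
    ("vanillin", "C8H8O3", 1, "3,4-dimethoxybenzaldehyde"),
]

for substrate, formula, steps, product in EMBODIMENTS:
    print(f"{substrate} ({formula}) -> {product} ({o_methylate(formula, steps)})")
```

Caffeic acid and ferulic acid arrive at the same product formula, which is consistent with embodiments a) and b) naming the same product.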

Specifically, caffeic acid is biotransformed into 3,4-dimethoxycinnamic acid employing COMTyli as a first enzyme (4-O-methylation), and COMTmsa or COMThs as a second enzyme (3-O-methylation). The first enzyme provides for the biotransformation of ferulic acid into 3,4-dimethoxycinnamic acid, and the second enzyme provides for the biotransformation of the caffeic acid into ferulic acid, preferably by a cell expressing both the first and the second enzyme.

Alternatively, caffeic acid is biotransformed into 3,4-dimethoxycinnamic acid employing COMThs as a first enzyme (4-O-methylation), and COMTmsa or COMTyli as a second enzyme (3-O-methylation). The first enzyme provides for the biotransformation of caffeic acid into isoferulic acid, and the second enzyme provides for the biotransformation of the isoferulic acid into 3,4-dimethoxycinnamic acid, preferably by a cell expressing both the first and the second enzyme.

Specifically, isoeugenol is biotransformed into methyl-isoeugenol employing AIMTpan or CVOMT as a first enzyme, which provides for the 4-O-methylation.

Specifically, p-hydroxybenzaldehyde is biotransformed into p-anisaldehyde employing AIMTpan or CVOMT as a first enzyme, which provides for the 4-O-methylation.

Specifically, vanillin is biotransformed into 3,4-dimethoxybenzaldehyde employing COMTyli as a first enzyme, which provides for the 4-O-methylation.
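The enzyme assignments given above for caffeic acid, p-hydroxybenzaldehyde and vanillin can be collected in a small lookup structure, sketched below. The data layout and the function name find_routes are illustrative assumptions; the substrate, product and enzyme names are taken from the preceding paragraphs.

```python
# One entry per single O-methylation step, as assigned in the text above.
STEPS = [
    {"substrate": "caffeic acid", "product": "ferulic acid",
     "position": 3, "enzymes": ("COMTmsa", "COMThs")},
    {"substrate": "ferulic acid", "product": "3,4-dimethoxycinnamic acid",
     "position": 4, "enzymes": ("COMTyli",)},
    {"substrate": "caffeic acid", "product": "isoferulic acid",
     "position": 4, "enzymes": ("COMThs",)},
    {"substrate": "isoferulic acid", "product": "3,4-dimethoxycinnamic acid",
     "position": 3, "enzymes": ("COMTmsa", "COMTyli")},
    {"substrate": "p-hydroxybenzaldehyde", "product": "p-anisaldehyde",
     "position": 4, "enzymes": ("AIMTpan", "CVOMT")},
    {"substrate": "vanillin", "product": "3,4-dimethoxybenzaldehyde",
     "position": 4, "enzymes": ("COMTyli",)},
]


def find_routes(start, target, steps=STEPS):
    """Enumerate every chain of steps leading from `start` to `target`."""
    if start == target:
        return [[]]
    routes = []
    for step in steps:
        if step["substrate"] == start:
            for tail in find_routes(step["product"], target, steps):
                routes.append([step] + tail)
    return routes


caffeic_routes = find_routes("caffeic acid", "3,4-dimethoxycinnamic acid")
for route in caffeic_routes:
    print(" -> ".join(
        f"{s['product']} ({s['position']}-O via {' or '.join(s['enzymes'])})" for s in route
    ))
```

For caffeic acid the search returns two routes, one through ferulic acid and one through isoferulic acid, matching the two alternatives described in the text.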

It is well understood that any of the OMT enzymes listed above may be a native enzyme, i.e. as encoded by the genes which are present in a wild-type organism, or a functionally active enzyme derived from such native enzyme, e.g. a variant obtained by mutagenesis, such as a chimeric enzyme.

It is envisaged that any such biotransformation process as described herein may be used to produce the O-methylated compound as end products, or as intermediates, which may be subject to further biotransformation processes and/or chemical derivatization. Such end products may be e.g. produced as a fermentation product and separated from the host cell culture. Such intermediate products may be e.g. produced by the host cell and at the same time further consumed to produce a secondary metabolite from said intermediate product.

Specifically, any such biotransformation process as described herein may be used to produce the O-methylated compound by biotransforming the compound of Formula I, which is added to the cell culture medium during cultivating the cell as exogenous substrates, or else as substrates which are endogenously produced by the host cell through a metabolic pathway, e.g. employing a host cell which is engineered to produce the substrate by biosynthesis, e.g. starting from any precursor compound which is a natural amino acid, such as phenylalanine, tyrosine or tryptophan, preferably phenylalanine or tyrosine, or a monosaccharide, preferably selected from the group consisting of glucose, galactose or arabinose or ethanol.

According to a specific aspect, said first and/or said second enzyme is

a) a native enzyme originating from a species different from the host cell;

b) a functionally active variant enzyme of the native enzyme of a) having at least 60% sequence identity; or

c) a recombined variant enzyme with altered methylation function, preferably any alteration in regioselectivity, activity, conversion rate, thermostability and/or pH stability.
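The "at least 60% sequence identity" criterion can be checked computationally once two sequences have been aligned. The sketch below is a minimal illustration under stated assumptions: the sequences are already aligned to equal length with "-" as the gap character, identity is counted over all alignment columns that are not gap-only, and the peptide strings are hypothetical placeholders. This section does not prescribe a particular identity definition or alignment tool.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Assumption: a column with a gap in only one sequence counts as a mismatch,
    and a column with gaps in both sequences is ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=60.0):
    """True if the aligned pair reaches the identity threshold (60% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy peptides, not sequences from the application.
native = "MKT-AYIAKQ"
variant = "MKTGAYLAKQ"
print(percent_identity(native, variant), meets_identity_threshold(native, variant))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the convention used should be stated alongside any reported value.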

Specifically, the native enzyme is an O-methyltransferase originating from a eukaryotic species selected from the group consisting of Medicago sativa, Homo sapiens, Pimpinella anisum, Ocimum basilicum, and Yarrowia lipolytica.

More specifically, the enzyme is selected from the group consisting of caffeic acid 3-O-methyltransferase from Medicago sativa (COMTmsa), t-anol/isoeugenol synthase from Pimpinella anisum (AIMTpan), chavicol O-methyltransferase from Ocimum basilicum (CVOMToba, herein also referred to as CVOMT), human catechol 3-O-methyltransferase (COMThs), and catechol O-methyltransferase from Yarrowia lipolytica (COMTyli).

Specifically, said recombined variant enzyme is obtainable by mutagenesis of an O-methyltransferase, thereby producing a repertoire of variants, and selecting a variant enzyme with altered enzyme activity. For example, the recombined variant enzyme is obtainable by mutagenesis of a meta-specific methyltransferase, thereby producing a repertoire of variants, and selecting a variant enzyme with para-specific enzyme activity.

Specifically, said variant enzyme is encoded by a chimeric gene, e.g. including at least one fragment of a first gene and at least one fragment of a second gene, and optionally at least one fragment of a third gene, or fragments of even more different genes, which are ligated to engineer the chimeric gene, wherein said genes are preferably from one or more different species or synthetically produced. Any such fragment may consist of a nucleotide sequence encoding at least 1 amino acid, e.g. up to 95% of the complete gene. Specifically, said chimeric gene comprises one or more cross-overs, e.g. at least two or three cross-overs, i.e. where two or more fragments of a first gene are combined with one or more parts of a second gene. Specifically, the chimeric genes may comprise a gene mosaic. Not only odd but also an even number of cross-overs can be obtained in one single recombined chimeric gene.
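The idea of a chimeric gene assembled from fragments of different parent genes, with one or more cross-overs, can be sketched as follows. This is a toy model under stated assumptions: the parents are equal-length, position-matched strings (real parent genes would first need to be aligned), the block list format is hypothetical, and the sequences are placeholders chosen so that the origin of each block is visible.

```python
def assemble_chimera(parents, blocks):
    """Build a chimeric sequence from contiguous blocks of the parent sequences.

    parents: dict mapping a parent name to its sequence (all of equal length)
    blocks:  list of (parent_name, start, end) half-open intervals that tile
             the full length in order
    """
    length = len(next(iter(parents.values())))
    if any(len(seq) != length for seq in parents.values()):
        raise ValueError("parent sequences must have equal length in this toy model")
    position = 0
    pieces = []
    for name, start, end in blocks:
        if start != position or end <= start:
            raise ValueError("blocks must be contiguous, ordered and non-empty")
        pieces.append(parents[name][start:end])
        position = end
    if position != length:
        raise ValueError("blocks must cover the full sequence length")
    return "".join(pieces)


def count_crossovers(blocks):
    """A cross-over is a junction where the parent changes between adjacent blocks."""
    return sum(1 for prev, cur in zip(blocks, blocks[1:]) if prev[0] != cur[0])


# Placeholder parents: runs of a single letter make the mosaic easy to read.
parents = {"gene1": "A" * 12, "gene2": "C" * 12}
blocks = [("gene1", 0, 4), ("gene2", 4, 8), ("gene1", 8, 12)]

chimera = assemble_chimera(parents, blocks)
print(chimera, count_crossovers(blocks))
```

In this example two fragments of the first gene flank one fragment of the second gene, giving an even number of cross-overs (two) in a single chimeric sequence.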

Complex patterns of recombinant mosaicism can be used in a chimeric gene encoding an enzyme as described herein, reaching high numbers of recombined sequence blocks of different length within one single molecule. Moreover, point-like replacement of nucleotides corresponding to one of the strand templates can be obtained as an important source of diversity respecting the frame of the open reading frames. Mosaicism and point-like exchange are not necessarily conservative at the protein level. Indeed, new amino acids with different polar properties can be generated after recombination, giving novel potential and enzymatic protein properties to the recombinant proteins derived from the chimeric genes.

According to a specific aspect, the invention further provides for an isolated nucleic acid incorporating genes operable to express at least one enzyme to biotransform a compound of Formula I into an O-methylated compound, which comprises

i) at least one first heterologous gene encoding a first enzyme which is a 4-O-methyltransferase operable to biotransform the compound into a 4-O-methylated compound; and

ii) if R2 is OH, at least a second heterologous gene encoding a second enzyme which is a 3-O-methyltransferase, wherein said first and second enzymes are operable to biotransform the compound into a 3,4-O-dimethylated compound.

In this regard, the heterologous genes are understood to be included in a recombinant nucleic acid, specifically wherein the genes are recombined with non-coding sequences or regulatory elements, which are not present in the natural context.

According to a specific embodiment, the nucleic acid may be provided in the isolated form or integrated in an expression construct. It may encode a series of enzymes expressed from a single polycistronic operon, or encode a series of enzymes expressed from separate promoters. Specifically, the enzymes are integrated into the expression construct or the host cell as a multienzyme complex.

Specifically, at least one of said genes encodes a native enzyme, such as a wild-type enzyme, e.g. encoded by a wild-type nucleic acid sequence or a codon-optimised nucleotide sequence.

Specifically, at least one of said genes is a chimeric gene encoding a chimeric enzyme.

Preferably, the polynucleotides are stably integrated into the cell genome.

Specifically, the invention provides for an expression construct incorporating genes operable to express at least one enzyme to biotransform a compound of Formula I into an O-methylated compound, which comprises

i) at least one first gene encoding a first enzyme which is a 4-O-methyltransferase, and one or more genes encoding enzymes other than O-methyltransferases, which are part of a metabolic pathway to produce the compound of Formula I; or

ii) at least one first gene encoding a first enzyme which is a 4-O-methyltransferase, and at least a second gene encoding a second enzyme which is a 3-O-methyltransferase, and optionally one or more genes encoding enzymes other than O-methyltransferases, which are part of a metabolic pathway to produce the compound of Formula I.

The expression construct specifically contains at least two genes of a metabolic pathway. Specifically, the construct may contain polynucleotides encoding at least three, four or five enzymes, or preferably at least six enzymes, or preferably at least seven enzymes involved in the metabolic pathway of producing the O-methylated compound.

Specifically, the invention provides for an expression construct comprising at least two genes of a metabolic pathway to produce an O-methylated compound of Formula I, wherein

i) at least one first gene encodes a first enzyme which is a 4-O-methyltransferase;

ii) and optionally at least a second gene encodes a second enzyme which is a 3-O-methyltransferase.

Specifically, said first and second genes are of different origin, preferably wherein at least one of the genes is encoding

a) a native enzyme originating from a different species; or

b) a functionally active variant enzyme of the native enzyme of a) having at least 60% sequence identity; or

c) a recombined variant enzyme of a) with altered methylation function, preferably any alteration in regioselectivity, activity, conversion rate, thermostability and/or pH stability, preferably a chimeric enzyme.

Specifically, the variant enzyme is preferably

a) encoded by a nucleotide sequence that is composed of fragments of different polynucleotides, which fragments are assembled to a chimeric nucleotide sequence; and/or

b) encoded by a nucleotide sequence that is obtained by insertion, deletion and/or substitution of one or more nucleotides in a parent polynucleotide.

Specifically, the polynucleotide encoding the chimeric enzyme is composed of fragments of different polynucleotides, preferably with a sequence identity of at least 30%, or at least 40% or at least 50%, which fragments are assembled to a chimeric nucleotide sequence. In addition, the fragments may be optionally mutagenized to include mutated sequences derived from one or more polynucleotides, e.g. mutated by insertion, deletion and/or substitution of one or more nucleotides.

Alternatively, the polynucleotide encoding the chimeric enzyme is derived from only one parent polynucleotide, and a gene mosaic obtained by e.g. mutagenesis, or by insertion, deletion and/or substitution of one or more nucleotides.

Specifically, said first and second genes are native enzymes originating from a different eukaryotic species.

Specifically, said construct comprises regulatory sequences to express said genes, preferably wherein the construct is an expression cassette.

Synthetic nucleic acid sequences or cassettes may be provided in the form of linear polynucleotides, plasmids, megaplasmids, synthetic or artificial chromosomes, such as plant, bacterial, mammalian or yeast artificial chromosomes.

According to a specific aspect, the invention further provides for a vector comprising the nucleic acid of the invention.

According to a specific aspect, the invention further provides for a recombinant host cell comprising the nucleic acid of the invention and/or the vector of the invention.

Specifically, the host cell is a production host cell line.

According to the invention, there is further provided a method of engineering a host cell of the present invention, by introducing heterologous polynucleotides encoding at least two genes of a metabolic pathway to produce an O-methylated compound of Formula I, wherein

i) at least one first gene encodes a first enzyme which is a 4-O-methyltransferase;

ii) and optionally at least a second gene encodes a second enzyme which is a 3-O-methyltransferase.

For example, the prepared expression cassette may be engineered before introducing the polynucleotides into the host cell or during such introduction, i.e. in situ or in vivo. For example, upon engineering the appropriate expression cassette or metabolic pathway, it may be advantageous to introduce the cluster of polynucleotides into a production host cell.

According to a specific example, such engineering method may comprise the method steps

a) providing the polynucleotides encoding the individual enzymes, optionally wherein at least one of the polynucleotides is composed of fragments of different polynucleotides, which fragments are assembled to a chimeric nucleotide sequence;

b) assembling the polynucleotides into a cluster and integrating said cluster into the cell genome, preferably by in vivo recombination; and

c) optionally engineering a production cell, wherein said cluster is stably integrated in the production cell genome.

The cell may be a eukaryotic or prokaryotic cell, preferably selected from the group consisting of yeast, mammalian, insect, plant and bacterial cells.

Specifically, the host cell is a bacterial or yeast cell, preferably a Saccharomyces cell, a Schizosaccharomyces cell, a Pichia cell, a Yarrowia cell, a Hansenula cell, a Candida cell, preferably Saccharomyces cerevisiae.

Specifically, the host cell is a genetically engineered yeast cell which is deficient in expression and/or activity of an enzyme involved in endogenous degradation of the substrate or product of the biotransformation process as described herein.

Specifically, the host cell is an S. cerevisiae that is mutated to delete at least one of the genes FDC1 or PAD1.

FDC1: putative phenylacrylic acid decarboxylase of Saccharomyces cerevisiae; Gene ID: 852152; essential for the decarboxylation of aromatic carboxylic acids to the corresponding vinyl derivatives.

PAD1: phenylacrylic acid decarboxylase of Saccharomyces cerevisiae; Gene ID: 852150; confers resistance to cinnamic acid, decarboxylates aromatic carboxylic acids to the corresponding vinyl derivatives.

According to a specific aspect, the invention provides for a process for enzymatic O-methylation of a compound of Formula I into an O-methylated compound, comprising the steps:

a) providing a first enzyme which is a 4-O-methyltransferase operable to biotransform the compound into a 4-O-methylated compound; and

b) optionally, if R 2 is OH , providing a second enzyme which is a 3-O- methyltransferase, wherein said first and second enzymes are operable to biotransform the compound into a 3,4-O-dimethylated compound; c) providing the compound and biotransforming the compound into a 4-O- methylated compound or if R 2 is OH, into a 3,4-O-dimethylated compound.

Specifically, the enzymatic O-methylation of the compound is perfomed in vivo by a recombinant host cell as described herein, or in vitro, e.g. in the host cell culture or separate from a host cell culture.

Specifically, the first and/or the second enzyme is a recombinant enzyme, e.g. produced by genetically engineering an expression construct to express said enzyme in a recombinant host cell, a biosynthetic enzyme or otherwise synthetically produced, employing methods well known in the art, e.g. using the wild-type or recombinant enzyme as a template to synthesize a polypeptide sequence in vitro.

The invention specifically provides for the use of a 4-O-methyltransferase in a process of enzymatically converting a compound of Formula I into a 4-O-methylated compound and, if R2 is OH, into a 3,4-O-dimethylated compound. In particular, if R2 is OH, the fully O-methylated compound is produced using the 4-O-methyltransferase and optionally a further 3-O-methyltransferase.

The invention specifically provides for the use of a 4-O-methyltransferase and a 3-O-methyltransferase in a process of enzymatically converting a compound of Formula I, wherein R2 is OH, into a 3,4-O-dimethylated compound.

FIGURES

Figure 1:

A. Phylogenetic tree. B. Sequence alignment of protein sequences of caffeic acid O-methyltransferases. C. Sequence alignment of catechol O-methyltransferase. The SEQ ID numbers of the sequences shown herein are provided in Figure 8.

Figure 2:

Strategy for integration of a candidate gene into the yeast genome in order to study its functionality. IF1 contains the 5'-insertion site in the BUD31 region of the yeast chromosome and the 5'-end of the URA marker; IF2 contains the 3'-end of the URA marker and the pGAL promoter. IF4 contains the tCYC terminator and the 5'-end of the LEU marker, and IF5 contains the 3'-end of the LEU marker and the 3'-insertion site in the BUD31 region. The synthesized gene was amplified from a plasmid provided by GeneArt. The 5'-end of the upstream oligonucleotides used for amplifying the gene of interest contains a sequence of 40 nucleotides homologous with the 3'-end of the pGAL1 promoter. The downstream oligonucleotides contain a 40-nt sequence homologous with the 5'-end of the tCYC terminator. After assembly by homologous recombination in the yeast transformant, double selection permits isolation of the recombinants. After recombination, the gene possesses one promoter (pGAL) and one terminator (tCYC) sequence, permitting its expression in yeast cells.
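The primer layout described for Figure 2 can be sketched in Python as follows. This is a minimal illustration only: the function names, the 20-nt gene-specific annealing length and the toy sequences are assumptions made for the example and are not taken from the application; only the 40-nt homology tails come from the text.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def design_recombination_primers(gene, promoter, terminator, tail=40, anneal=20):
    """Build primers that add homology tails to a gene of interest.

    tail   -- length of the homology tail (40 nt in the text)
    anneal -- length of the gene-specific part (hypothetical value)
    """
    if len(promoter) < tail or len(terminator) < tail or len(gene) < anneal:
        raise ValueError("sequences are shorter than the requested primer parts")
    gene, promoter, terminator = gene.upper(), promoter.upper(), terminator.upper()
    # Upstream primer: 3'-end of the promoter followed by the start of the gene.
    forward = promoter[-tail:] + gene[:anneal]
    # Downstream primer: reverse complement of (end of gene + 5'-end of terminator).
    reverse = reverse_complement(gene[-anneal:] + terminator[:tail])
    return forward, reverse


# Toy sequences, invented for illustration (not real pGAL1, tCYC or OMT sequences).
PROMOTER_3P = "ACGT" * 15          # stands in for the promoter region
TERMINATOR_5P = "TTGCA" * 10       # stands in for the terminator region
GENE = "ATG" + "GCT" * 10 + "TAA"  # stands in for the gene of interest

FORWARD_PRIMER, REVERSE_PRIMER = design_recombination_primers(
    GENE, PROMOTER_3P, TERMINATOR_5P
)
```

With these defaults each primer is 60 nt long: a 40-nt tail that overlaps the flanking fragment and a 20-nt part that anneals to the gene.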

Figure 3:

O-methylation reaction. The figure illustrates O-methylation results obtained when cells overexpressing different COMTs were fed with the indicated substrates.

Figure 4:

Possible routes for biosynthesis of 3,4-dimethoxycinnamic acid from caffeic acid with both COMTmsa and COMTyli. In the first route, caffeic acid is converted via 4-O-methylation by COMTyli into isoferulic acid, which is then converted into 3,4-dimethoxycinnamic acid by COMTmsa via 3-O-methylation. In the second route, caffeic acid is converted by COMTmsa via 3-O-methylation into ferulic acid, which is then converted into 3,4-dimethoxycinnamic acid by COMTyli via 4-O-methylation.
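The two routes described for Figure 4 can be written down as a small reaction table and enumerated programmatically. The sketch below is illustrative only; the data structure and the function name are assumptions for the example, while the substrates, enzymes and products are those named in the caption.

```python
# Each step: (substrate, enzyme, methylated position, product), as described for Figure 4.
STEPS = [
    ("caffeic acid", "COMTyli", "4-O", "isoferulic acid"),
    ("caffeic acid", "COMTmsa", "3-O", "ferulic acid"),
    ("isoferulic acid", "COMTmsa", "3-O", "3,4-dimethoxycinnamic acid"),
    ("ferulic acid", "COMTyli", "4-O", "3,4-dimethoxycinnamic acid"),
]


def find_routes(start, target, steps=STEPS, path=()):
    """Enumerate every chain of steps leading from start to target."""
    if start == target:
        return [list(path)]
    routes = []
    for step in steps:
        substrate, _enzyme, _position, product = step
        # Follow a step only if it consumes the current compound and is unused so far.
        if substrate == start and step not in path:
            routes.extend(find_routes(product, target, steps, path + (step,)))
    return routes


ROUTES = find_routes("caffeic acid", "3,4-dimethoxycinnamic acid")
for route in ROUTES:
    print(" -> ".join(f"{enzyme} ({position})" for _, enzyme, position, _ in route))
```

Both routes use the same two enzymes and differ only in the order of the 3-O- and 4-O-methylation steps.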

Figure 5:

A: Caffeic acid conversion into ferulic and isoferulic acid by integrated COMTyli.

B: Isoferulic acid conversion into 3,4-dimethoxycinnamic acid by integrated COMTmsa.

C: Ferulic acid conversion into 3,4-dimethoxycinnamic acid by integrated COMTyli.

D: Caffeic acid conversion into ferulic acid, isoferulic acid and 3,4-dimethoxycinnamic acid by integrated COMTmsa and COMTyli.

Figure 6:

A. The synthesis pathway of the p-anisaldehyde production cell. The figure shows a schematic diagram of how phenylalanine is converted into p-anisaldehyde. Phenylalanine undergoes several reactions: deamination, hydroxylation at the 4-position of the phenyl ring, reduction chain reaction and O-methylation at the 4-position of the phenyl ring.

B. Assembly of the p-anisaldehyde pathway by fragments containing homologous gene sequences. This figure shows the co-transformation of 8 fragments comprising the 5 genes coding for proteins of the p-anisaldehyde pathway, starting from phenylalanine. HIS3 is the marker enabling the selection of the recombinant pathway. The CAR protein catalyzes the reduction of carboxylic acids to their corresponding aldehydes. The role of the PPTase is to transfer the phosphopantetheine from coenzyme A to its acceptor CAR protein. Organism sources of each gene are indicated with three letters following the name of the gene, also shown in three letters: pop: Populus deltoides; gly: Glycine max; pfl: Pseudomonas fluorescens; nio: Nocardia iowensis; oba: Ocimum basilicum.

Figure 7:

Strategy for homeologous recombination and assembly into the genome. IF1 contains the 5'-insertion site in the BUD31 region of the yeast chromosome and the 5'-end of the URA marker; IF2 contains the 3'-end of the URA marker and the pGAL promoter. IF4 contains the tCYC terminator and the 5'-end of the LEU marker, and IF5 contains the 3'-end of the LEU marker and the 3'-insertion site in the BUD31 region. Synthesized genes were amplified from plasmids provided by GeneArt. The 5'-end of the upstream oligonucleotides used for amplifying gene 1 contains a sequence of 40 nucleotides homologous with the 3'-end of the pGAL1 promoter. For gene 2 amplification, the downstream oligonucleotides contained a 40-nt sequence homologous with the 5'-end of the tCYC terminator. After assembly by homeologous recombination in the yeast transformant, double selection permits isolation of the recombinants. After recombination, the gene possesses one promoter (pGAL) and one terminator (tCYC) sequence, permitting its expression in yeast cells.

Figure 8:

Sequences described herein: Amino acid sequences of the individual enzymes.

Nucleic acid sequences encoding such enzymes: wild-type and codon-optimised sequences.

COMTyli: catechol O-methyltransferase from Yarrowia lipolytica;

COMThs: catechol 3-O-methyltransferase from Homo sapiens (human);

CVOMToba: chavicol O-methyltransferase from Ocimum basilicum;

AIMTpan: t-anol/isoeugenol synthase from Pimpinella anisum;

COMTmsa: caffeic acid-O-methyltransferase from Medicago sativa.

DETAILED DESCRIPTION OF THE INVENTION

Specific terms as used throughout the specification have the following meaning.

The term "biosynthesis" as used herein shall specifically refer to the cellular production of a product, specifically including enzymatic processes, e.g. for biotransformation of a substrate into a product. The biosynthesis is typically performed by in vivo production in host cells in cell culture, specifically microbial host cells, which cellular production may be optionally combined with further biosynthetic production steps (e.g. in a host cell different from the prior one) and/or with reactions of chemical synthesis, e.g. by in vitro reactions. The term "biosynthesis" is however also used with respect to enzymatic reactions in general. Contrary to chemical reactions, the enzymatic reactions follow a biological principle; thus, an enzymatic product is biosynthesized either by a cell ("in vivo biosynthesis") or in vitro / ex vivo by an enzymatic reaction in an acellular system, i.e. without a cellular production system ("in vitro biosynthesis"). Such enzymatic reaction may be by wild-type enzymes, recombinant enzymes or artificial enzymes, e.g. enzymes that are synthetically produced.

The term "product" as used herein specifically with respect to biosynthesis shall refer to any product of an enzymatic reaction or of a primary and/or secondary metabolism, in particular a compound that may be used as a precursor, intermediate, side-product or end-product of a metabolic pathway.

The biosynthesis may employ heterologous genes or polynucleotides. Such biosynthesis is also called "heterologous biosynthesis" and specifically refers to the biosynthesis of products by recombinant host cells, which comprise at least one heterologous element, such as a heterologous polynucleotide, which e.g. enables the biosynthesis of exogenous products, or endogenous products with improved properties or at an increased yield.

The term "biotransformation" as used herein shall mean the transformation of a substrate catalyzed by an enzyme and/or an organism, in particular by a eukaryotic cell, preferably a microbial cell.

The biotransformation is typically performed by in vivo production of a biotransformation product in host cells in cell culture, specifically microbial host cells, which cellular production may be optionally combined with further production steps (e.g. in a host cell different from the prior one) and/or with reactions of chemical synthesis, e.g. by in vitro reactions. The term "biotransformation" is however also used with respect to enzymatic reactions in general. Contrary to chemical reactions, the enzymatic reactions follow a biological principle; thus, an enzymatic product is biotransformed either by a cell ("in vivo biotransformation") or in vitro / ex vivo by an enzymatic reaction in an acellular system, i.e. without a cellular production system ("in vitro biotransformation"). Such enzymatic reaction may be by wild-type enzymes, recombinant enzymes or artificial enzymes, e.g. enzymes that are synthetically produced.

Preferably, such biotransformation is by an endogenous enzyme, i.e. an enzyme that is produced by the organism or cell, which is either encoded by an autologous gene or a heterologous gene. The subject OMT enzymes as described herein are specifically heterologous to the host/producer cell, preferably of different origin, e.g. selected from specific (different) species, i.e. not originating from the host/producing cell, and/or recombinantly engineered, e.g. chimeric enzymes encoded by chimeric genes, or recombinant variant enzymes obtained by mutagenesis of native enzymes.

The term "cell" as used herein in particular with reference to engineering and introducing one or more genes to be expressed in said cell, e.g. a metabolic pathway or an assembled cluster of genes, or a production cell is understood to refer to any prokaryotic or eukaryotic cell. Prokaryotic and eukaryotic host cells are both contemplated for use according to the invention, including bacterial host cells like E. coli or Bacillus sp., yeast host cells, such as S. cerevisiae, insect host cells, such as Spodoptera frugiperda, or human host cells, such as HeLa and Jurkat.

Preferred host cells are haploid cells, such as from Candida sp., Pichia sp. and Saccharomyces sp.

The term "cell" shall specifically include a single cell or cells cultivated in a cell culture, such as cell lines.

The term "production cell" as used herein shall specifically refer to a cell recombinantly engineered to produce a product of a production process or biosynthesis, e.g. a product of a metabolic pathway.

The term "cell line" as used herein refers to an established clone of a particular cell type that has acquired the ability to proliferate over a prolonged period of time. The term "host cell line" refers to a cell line as used for engineering and/or expressing an endogenous or recombinant gene or products of a metabolic pathway to produce polypeptides or cell metabolites mediated by such polypeptides. A "production host cell line" or "production cell line" is commonly understood to be a cell line ready-to-use for cultivation in a bioreactor to obtain the product of a production process or biosynthesis, such as a product of a metabolic pathway. Once clones are selected that produce the desired products of biosynthesis, the products are typically produced by a production host cell line on the large scale by suitable expression systems and fermentations, e.g. by microbial production in cell culture.

The enzymes encoded by the genes and the O-methylated compounds as described herein can be produced using the recombinant host cell line of the invention by culturing a transformant, e.g. in an appropriate culture medium, isolating the expressed product or metabolite from the culture, and optionally purifying it by a suitable method.

The preferred host cell line according to the invention maintains the genetic properties employed according to the invention, and the production level remains high, e.g. at least at a g level, even after about 20 generations of cultivation, preferably at least 30 generations, more preferably at least 40 generations, most preferably at least 50 generations. The stable recombinant host cell is considered a great advantage when used for industrial scale production.

The term "chimeric" as used herein with respect to a polypeptide, such as an enzyme, or a nucleotide sequence, such as a polynucleotide or gene encoding an enzyme, shall refer to those molecules which comprise at least two heterologous parts. In this context, heterologous signifies that the parts are not found in the same position in a single polypeptide or polynucleotide in vivo. Normally, this means that the parts are derived from at least two different polypeptides or polynucleotides, e.g. from different origin, such as analogs derived from different organism or species. The parts may also be obtained by mutagenesis of one source (parent) sequence.

Chimeric polypeptides having different combinations of polypeptide sequences may originate from one or more parent molecules, which may have undergone mutagenesis, thus may comprise mutations, such as insertions, deletions and/or substitutions of one or more amino acids.

Chimeric polynucleotides having different combinations of genes or sequences may originate from one or more parent genes, which may have undergone mutagenesis, thus may comprise mutations, such as insertions, deletions and/or substitutions of one or more nucleotides.

In this context, the term "originating", e.g. with respect to a species of origin, or "different origin" is understood in the following way. A molecule endogenous to a cell of a specific species is herein understood as originating from said species, either in the naturally-occurring form, e.g. as a wild-type molecule and its isomer, or fragments or mutants thereof. A molecule that is characterized by being of a different origin relative to another molecule is specifically understood to refer to a molecule of different sequence, e.g. obtained or derived from a different species, such as a naturally-occurring molecule, e.g. an analog, or provided as an artificial or recombinant molecule, such as a molecule not occurring as a wild-type molecule in nature.

Exemplary enzymes as described herein are of various prokaryotic or eukaryotic origin.

A chimeric enzyme as described herein specifically may comprise analogous sequences of different origin, e.g. from different species; thus, a partial sequence may be homologous to corresponding sequences in enzymes derived from a particular species, while other parts or segments may be homologous to corresponding sequences in another species. Typically the full-length molecules or parts of such molecules are recombined and optionally assembled to obtain a chimeric molecule.

In a specific embodiment, a chimeric enzyme may also be an enzyme in which the positioning, spacing or function of two endogenous partial sequences has been changed, e.g. by manipulation, with respect to the wild-type enzyme. For example, elements of a sequence may be repositioned by adding, shifting or removing nucleotides or amino acids. Alternatively, the amino acid or nucleotide sequence itself may be mutated, e.g. to introduce desired properties. Typically, such properties include the ability to increase the activity of the enzyme.

The term "methyltransferase" as used herein shall specifically refer to a methylase which is a type of transferase enzyme that transfers a methyl group from a donor to an acceptor. The term shall specifically refer to an O-methyltransferase (OMT), specifically a 3-O-methyltransferase or a 4-O-methyltransferase, or a 3,4-O-dimethyltransferase.

The term "gene" as used herein shall specifically refer to genes or DNA fragments of a gene, in particular those that are partial genes. A gene fragment can also contain several open reading frames (ORFs), either repeats of the same ORF or different ORFs. The term shall specifically refer to coding nucleotide sequences, but shall also include nucleotide sequences which are non-coding, e.g. untranscribed or untranslated sequences, or encoding polypeptides, in whole or in part.

The term shall particularly apply to the polynucleotide(s) as used herein, e.g. as full-length nucleotide sequence or fragments or parts thereof, which encodes a polypeptide with enzymatic activity, e.g. an enzyme of a metabolic pathway, or fragments or parts thereof, respectively. The term "polynucleotides" as used herein shall specifically refer to a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length, and include as non-limiting examples, coding and non-coding sequences of a gene, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers, fragments, genetic constructs, vectors and modified polynucleotides. Reference to nucleic acids, nucleic acid molecules, nucleotide sequences and polynucleotide sequences is to be similarly understood.

The genes as used herein, e.g. for assembly, diversification or recombination, can be non-coding sequences or sequences encoding polypeptides or protein encoding sequences or parts or fragments thereof having sufficient sequence length for successful recombination events. More specifically, said genes have a minimum length of 3 bp, preferably at least 100 bp, more preferred at least 300 bp.

The term "gene mosaic" as used herein means the combination of at least two different genes or partial genes with at least one cross-over event, preferably at least two, at least three, at least four, at least five, at least six, at least seven or even more cross-overs within a single polynucleotide encoding the same type of enzyme ("intragenic") or within a single molecule or nucleic acid strand, e.g. a cross-over at the nucleic acid section joining polynucleotides encoding different types of enzymes to obtain an assembly of the polynucleotides ("intergenic"). Specifically such a cross-over provides for the combination or mixing of DNA sequences. A gene mosaic may be created by intragenic mixing of gene(s), an intragenic gene mosaic, and/or gene assembly, e.g. with intergenic cross-over, with or without an overlapping section, or composite genes stringed together, optionally with an overlap, further optionally assembly of genes with both, intragenic and intergenic cross-over(s) or gene mosaic(s).

The gene mosaics specifically may be of at least 3, preferably up to 30.000 base pairs; a preferred range would be 300 - 25.000 bp; particularly preferred are large DNA sequences of at least 500 bp or at least 1.000 bp.

Specific gene mosaics are characterized by at least 3 cross-over events per 700 base pairs, preferably at least 4 cross-overs per 700 base pairs, more preferred at least 5, 6 or 7 cross-overs per 700 base pairs or per 500 base pairs, which include the crossing of single nucleotides, or segments of at least 1, preferably at least 2, 3, 4, 5, 10, 20 up to larger nucleotide sequences. The term "cross-over" refers to recombination between genes at a site where two DNA strands can exchange genetic information, i.e. at least one nucleotide. The cross-over process leads to offspring mosaic genes having different combinations of genes or sequences originating from one or more parent genes, which may have undergone mutagenesis, thus may comprise mutations, such as insertions, deletions and/or substitutions of one or more nucleotides.
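As an illustration of the cross-over density figures given above, the following Python sketch counts the cross-overs that are observable in a mosaic sequence relative to two aligned parent sequences and normalizes the count to a 700 bp window. It is a simplified sketch under stated assumptions: the three sequences are taken to be pre-aligned and of equal length, only positions at which the parents differ are treated as informative, and the function names and toy sequences are hypothetical. It does not represent the analysis method used in the application.

```python
def count_crossovers(parent_a: str, parent_b: str, mosaic: str) -> int:
    """Count observable switches of parental origin along an aligned mosaic."""
    if not (len(parent_a) == len(parent_b) == len(mosaic)):
        raise ValueError("sequences must be aligned to the same length")
    origins = []
    for a, b, m in zip(parent_a.upper(), parent_b.upper(), mosaic.upper()):
        if a == b:
            continue  # position is identical in both parents: not informative
        if m == a:
            origins.append("A")
        elif m == b:
            origins.append("B")
        # otherwise the position carries a new mutation and is ignored
    # A cross-over is inferred wherever consecutive informative positions
    # come from different parents.
    return sum(1 for prev, cur in zip(origins, origins[1:]) if prev != cur)


def crossovers_per_window(n_crossovers: int, length_bp: int, window_bp: int = 700) -> float:
    """Normalize a cross-over count to a window size (700 bp by default)."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return n_crossovers * window_bp / length_bp


# Toy example with invented sequences.
PARENT_A = "AAAAAAAAAAAA"
PARENT_B = "CCCCCCCCCCCC"
MOSAIC = "AAACCCAAACCC"
N_CROSSOVERS = count_crossovers(PARENT_A, PARENT_B, MOSAIC)
```

Because cross-overs between identical stretches leave no trace, a count obtained this way is a lower bound on the actual number of events.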

The term "heterologous" as used herein, specifically refers to a gene or a nucleic acid which is either foreign, i.e. "exogenous", such as not found in nature, to a given host microorganism or host cell; or that is naturally found in a given host microorganism or host cell, e.g., is "endogenous", however, in the context of a heterologous nucleic acid. The heterologous nucleotide sequence as found endogenously may also be produced in an unnatural, e.g. greater than expected or greater than naturally found, amount in the cell. The heterologous nucleotide sequence, or a nucleic acid comprising the heterologous nucleotide sequence, possibly differs in sequence from the endogenous nucleotide sequence but encodes the same protein as found endogenously. Specifically, heterologous nucleotide sequences are those not found in the same relationship to a host cell in nature. Any recombinant or artificial nucleotide sequence is understood to be heterologous. An example of a heterologous polynucleotide is a nucleotide sequence encoding an enzyme sequence as described herein, which originates from a species other than the host cell species. A further example is a chimeric polynucleotide. A further example is a nucleotide sequence encoding an enzyme sequence operably linked to a transcriptional control element, e.g., a promoter, to which an endogenous, naturally-occurring enzyme coding sequence is not normally operably linked.

The term "multienzyme complex" as used herein shall specifically refer to a number or series of enzymes of a metabolic pathway, either in the order of cascadic reactions or else without such order, e.g. by a random sequence. The multienzyme complex produced by a host cell of heterologous biosynthesis typically is encoded by an assembly or at least one cluster of (recombinant) polynucleotides each encoding an enzyme, which assembly or cluster(s) may be e.g. located at one or more different loci on one or more chromosomes, or located on one or more chromosomes in part and additionally located on plasmid(s). The multienzyme complex does not need to be provided as a complex of proteins, wherein the proteins are linked to each other. The term is rather understood as a multienzyme complex provided as individual enzymes involved in a specific metabolic pathway of a cell.

An exemplary multienzyme complex as described herein relates to at least two nucleotide sequences, wherein at least one encodes an O-methyltransferase, such as an enzyme with 4-O-methyltransferase activity and optionally 3-O-methyltransferase activity, or a first enzyme with 4-O-methyltransferase activity and a second enzyme with 3-O-methyltransferase activity.

Specific multienzyme complexes further comprise enzymes of a metabolic pathway to produce the substrate compound of Formula I, e.g. enzymes of the shikimate pathway, which is a seven-step metabolic route used by bacteria, fungi, and plants for the biosynthesis of aromatic amino acids, like phenylalanine, tyrosine and tryptophan, or enzymes of the cinnamic and p-coumaric acid biosynthesis. Typically, biosynthesis of all phenylpropanoids begins with the amino acids phenylalanine and tyrosine. Phenylalanine ammonia-lyase / tyrosine ammonia-lyase (PAL / TAL) is an enzyme responsible for the transformation of L-phenylalanine or tyrosine into trans-cinnamic acid or p-coumaric acid, respectively.

Preferred multienzyme complexes comprise a series of enzymes, e.g. a mixture of enzymes. The polynucleotides encoding the enzymes of a multienzyme complex may be assembled and provided as a cluster, wherein the nucleic acid encodes the enzymes, e.g. in the order of the enzymatic (catalyzed) reactions or irrespective of the order.

The term "metabolic pathway" refers to a series of two or more enzymatic reactions, in which the product of one enzymatic reaction becomes the substrate for the next enzymatic reaction. At each step of a metabolic pathway, intermediate compounds are formed and utilized as substrates for a subsequent step. These compounds may be called "metabolic intermediates." The products of each step or final products are also called "metabolites."

Enzymes of a metabolic pathway as described herein typically play an integral role in primary and/or secondary metabolism. In primary metabolism an enzyme is essential for viability, e.g. directly involved in the normal growth, development, or reproduction of an organism. In secondary metabolism an enzyme serves to produce secondary metabolites, which are understood as organic compounds that are - unlike primary metabolites - not essential for viability in the first instance. Absence of secondary metabolites does not result in immediate death, but rather in long-term impairment of the organism's survivability, fecundity, or aesthetics, or perhaps in no significant change at all. Metabolic pathways as described herein may particularly comprise at least two enzymes, preferably at least three, at least four, at least five, at least six, at least seven or even more enzymes, to obtain a product of biosynthesis.

O-methylated compounds of Formula I as described herein are specifically understood as secondary metabolites, which may find use as intermediates to produce final products by further (bio)synthetic measures or derivatization, or final products, e.g. for use as aroma, medicines, flavorings, fragrance agents or as food ingredient.

"Recombinant" as used herein, means that a particular nucleic acid (DNA or RNA) is the product of various combinations of cloning, restriction, ligation, and/or in vitro DNA synthesis steps resulting in a construct having a structural coding or non- coding sequence distinguishable from endogenous nucleic acids found in natural systems. Generally, DNA sequences encoding the structural coding sequence can be assembled from cDNA fragments and short oligonucleotide linkers, or from a series of synthetic oligonucleotides, to provide a synthetic nucleic acid which is capable of being expressed from a recombinant transcriptional unit contained in a cell or in a cell-free transcription and translation system. Such sequences can be provided in the form of an open reading frame uninterrupted by internal non-translated sequences, or introns, which are typically present in eukaryotic genes. Genomic DNA comprising the relevant sequences can also be used in the formation of a recombinant gene or transcriptional unit. Sequences of non-translated DNA may be present 5' or 3' from the open reading frame, where such sequences do not interfere with manipulation or expression of the coding regions, and may indeed act to modulate production of a desired product by various mechanisms.

Thus, e.g., the term "recombinant" polynucleotide or nucleic acid refers to one which is not naturally occurring, e.g., is made by the artificial combination of two otherwise separated segments of sequence through human intervention. This artificial combination is often accomplished by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. For example, it is performed to join together nucleic acid segments of desired functions to generate a desired combination of functions. The term "recombination" shall specifically apply to assembly of polynucleotides, joining together such polynucleotides or parts thereof, with or without recombination to achieve a cross-over or a gene mosaic. The term "recombinant" as used herein, specifically with respect to nucleic acid sequences shall refer to nucleic acids or polynucleotides produced by recombinant DNA techniques, e.g. a DNA construct comprising a polynucleotide heterologous to a host cell, which is optionally incorporated into the host cell. A chimeric nucleotide sequence may specifically be produced as recombinant molecule.

The term "recombinant" as used herein, specifically with respect to enzymes, shall refer to enzymes produced by recombinant DNA techniques. Such recombinant enzymes are produced by cells incorporating a heterologous gene or transformed by an exogenous DNA construct encoding the desired enzyme. Recombinant enzymes and wild-type enzymes, e.g. originating from a native organism, are also referred to as "biosynthetic" enzymes. "Synthetic" enzymes are those prepared by chemical synthesis. A chimeric enzyme may specifically be produced as a recombinant molecule.

The term "recombinant host", also referred to as a "genetically modified host cell" denotes a host cell that comprises a heterologous nucleic acid.

A "variant" as used herein refers to homologs, orthologs and paralogs.

Homologs of a protein encompass peptides, oligopeptides and polypeptides having amino acid substitutions, deletions and/or insertions, preferably by a conservative change, relative to the unmodified protein in question and having similar biological and functional activity as the unmodified protein from which they are derived (herein also referred to as "parent"); or in other words, without significant loss of function or activity. Orthologs and paralogs define subcategories of homologs and encompass evolutionary concepts used to describe the ancestral relationships of genes. Paralogs are genes within the same species that have originated through duplication of an ancestral gene; orthologs (herein also referred to as analogs) are genes from different organisms that have originated through speciation, and are also derived from a common ancestral gene. Several different methods are known by those of skill in the art for identifying and defining these functionally homologous sequences. Preferably, said homolog, ortholog or paralog has a sequence identity at protein level of at least 60%, preferably 70%, more preferably 80%, even more preferably 90%, most preferably 95% as measured in a BLASTp.
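The identity thresholds named above can be illustrated with a short Python sketch. It computes percent identity from two sequences that have already been aligned (gaps written as "-"), dividing the number of identical columns by the alignment length, and reports the highest of the listed thresholds that is met. This is only an illustration of the arithmetic, not a replacement for BLASTp; the function names and the toy peptide sequences are assumptions made for the example.

```python
# Identity thresholds (in percent) listed in the text.
IDENTITY_TIERS = (60, 70, 80, 90, 95)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    # A column counts as a match only if both residues are equal and not a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def highest_tier_met(identity: float):
    """Return the highest listed threshold reached, or None if below 60%."""
    met = [tier for tier in IDENTITY_TIERS if identity >= tier]
    return max(met) if met else None


# Toy aligned peptides, invented for illustration.
PARENT = "MKTAYIAKQR"
VARIANT = "MKTAYLAKQK"
IDENTITY = percent_identity(PARENT, VARIANT)
```

For the toy pair, 8 of 10 aligned positions are identical, so the variant would meet the 80% threshold but not the 90% one.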

A variety of polynucleotide sequences are capable of encoding the enzymes as described herein, e.g. native enzymes or variant enzymes.

Specific variant enzymes as described herein may be obtained by mutagenesis of parent genes, which parent genes are e.g. wild-type genes encoding native enzymes, or codon-optimized nucleotide sequences, or nucleotide sequences including point mutations or larger mutations, including insertions, deletions and/or substitutions. Specific variant enzymes are chimeric enzymes comprising fragments of one or more enzymes which are functionally similar, and in particular capable of transforming the same substrate into the same product, or have an altered selectivity or specificity, e.g. capable of transforming the same substrate into a different product, or transforming a different substrate into the same product.

Due to the degeneracy of the genetic code, many different polynucleotides can encode identical and/or substantially similar enzymes. Sequence alterations that do not change the amino acid sequence encoded by the polynucleotide are termed "silent" variations, e.g. obtained by codon-optimization. Any of the possible codons for the same amino acid can be substituted by a variety of techniques, e.g., site-directed mutagenesis. In addition to silent variations, other conservative variations that alter one, or a few amino acids in the encoded enzyme, can be made without altering the function of the enzyme, i.e. for improved secondary metabolite production. Conservative substitutions or variations are specifically those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place.
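The notion of a "silent" variation can be made concrete with a short Python sketch based on the standard genetic code. The example shows two different DNA sequences that translate to the same peptide, and lists the synonymous codons available for one amino acid. The function names and the toy sequences are hypothetical and are not taken from the sequences of Figure 8.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(parent: str, variant: str) -> bool:
    """True if the DNA differs but the encoded amino acid sequence does not."""
    return parent.upper() != variant.upper() and translate(parent) == translate(variant)


def synonymous_codons(codon: str) -> list:
    """All codons encoding the same amino acid as the given codon."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Toy sequences: both encode Met-Ala-Lys.
PARENT_DNA = "ATGGCTAAA"
SILENT_DNA = "ATGGCCAAG"
```

A codon-optimized sequence is one application of this: every codon may be exchanged for a synonymous one preferred by the host, leaving the encoded enzyme unchanged.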

Conservative substitutions are those that take place within a family of amino acids that are related in their side chains and chemical properties. Examples of such families are amino acids with basic side chains, with acidic side chains, with non-polar aliphatic side chains, with non-polar aromatic side chains, with uncharged polar side chains, with small side chains, with large side chains etc.

A point mutation is particularly understood as the engineering of a polynucleotide that results in the expression of an amino acid sequence that differs from the non-engineered amino acid sequence in the substitution or exchange, deletion or insertion of one or more single (non-consecutive) or doublets of amino acids for different amino acids.

Preferred point mutations refer to the exchange of amino acids of the same polarity and/or charge. In this regard, amino acids refer to twenty naturally occurring amino acids encoded by sixty-four triplet codons. These 20 amino acids can be split into those that have neutral charges, positive charges, and negative charges:

The "neutral" amino acids are shown below along with their respective three-letter and single-letter code and polarity:

Alanine: (Ala, A) nonpolar, neutral;

Asparagine: (Asn, N) polar, neutral;

Cysteine: (Cys, C) nonpolar, neutral;

Glutamine: (Gln, Q) polar, neutral;

Glycine: (Gly, G) nonpolar, neutral;

Isoleucine: (Ile, I) nonpolar, neutral;

Leucine: (Leu, L) nonpolar, neutral;

Methionine: (Met, M) nonpolar, neutral;

Phenylalanine: (Phe, F) nonpolar, neutral;

Proline: (Pro, P) nonpolar, neutral;

Serine: (Ser, S) polar, neutral;

Threonine: (Thr, T) polar, neutral;

Tryptophan: (Trp, W) nonpolar, neutral;

Tyrosine: (Tyr, Y) polar, neutral;

Valine: (Val, V) nonpolar, neutral; and

Histidine: (His, H) polar, positive (10%) neutral (90%).

The "positively" charged amino acids are:

Arginine: (Arg, R) polar, positive; and

Lysine: (Lys, K) polar, positive.

The "negatively" charged amino acids are:

Aspartic acid: (Asp, D) polar, negative; and

Glutamic acid: (Glu, E) polar, negative.
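The classification above can be captured in a small lookup table, for example to check whether a proposed exchange keeps polarity and/or charge. The following is a minimal sketch; the function name is hypothetical, and histidine is entered as "neutral" on the assumption that its majority state in the list is the relevant one.

```python
# (polarity, charge) per one-letter code, transcribed from the list above.
# Assumption: histidine is treated as neutral (its majority state in the list).
PROPERTIES = {
    "A": ("nonpolar", "neutral"), "N": ("polar", "neutral"),
    "C": ("nonpolar", "neutral"), "Q": ("polar", "neutral"),
    "G": ("nonpolar", "neutral"), "I": ("nonpolar", "neutral"),
    "L": ("nonpolar", "neutral"), "M": ("nonpolar", "neutral"),
    "F": ("nonpolar", "neutral"), "P": ("nonpolar", "neutral"),
    "S": ("polar", "neutral"),    "T": ("polar", "neutral"),
    "W": ("nonpolar", "neutral"), "Y": ("polar", "neutral"),
    "V": ("nonpolar", "neutral"), "H": ("polar", "neutral"),
    "R": ("polar", "positive"),   "K": ("polar", "positive"),
    "D": ("polar", "negative"),   "E": ("polar", "negative"),
}


def compare_exchange(original: str, replacement: str) -> dict:
    """Report whether an amino acid exchange keeps polarity and charge."""
    pol_a, chg_a = PROPERTIES[original.upper()]
    pol_b, chg_b = PROPERTIES[replacement.upper()]
    return {"same_polarity": pol_a == pol_b, "same_charge": chg_a == chg_b}


print(compare_exchange("L", "I"))  # both nonpolar and neutral
print(compare_exchange("K", "E"))  # both polar, but opposite charge
```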

A variant enzyme described herein is specifically understood to include homologs, analogs, fragments, modifications or variants with a specific enzymatic activity, e.g. produced by engineering or mutating the nucleotide sequence, which are functional and may serve as functional equivalents, e.g. catalytically supporting the conversion of a substrate into a product of an enzymatic reaction.

Any of the known mutagenesis methods may be employed, including point mutations at desired positions, e.g. obtained by randomisation techniques. In some cases positions are chosen randomly, e.g. with either any of the possible amino acids or a selection of preferred amino acids to randomise the amino acid sequences. The term "mutagenesis" refers to any art recognized technique for altering a polynucleotide or polypeptide sequence. Preferred types of mutagenesis include error prone PCR mutagenesis, saturation mutagenesis, or other site directed mutagenesis. The term "variant" shall specifically encompass functionally active variants.

The term "functionally active variant" of an enzyme as used herein, means a sequence resulting from modification of this sequence (a parent enzyme or a parent sequence), e.g. by insertion, deletion or substitution of one or more amino acids, such as by recombination techniques of one or more amino acid residues in the amino acid sequence, or nucleotides within the nucleotide sequence, or at either or both of the distal ends of the sequence, and which modification does not affect (in particular impair) the activity of this sequence. The functionally active variant of an enzyme would still have the predetermined enzymatic activity, though this could be changed, e.g. to change the regioselectivity (i.e. changing a 3-O-methyltransferase into a 4-O-methyltransferase or changing a 4-O-methyltransferase into a 3-O-methyltransferase), the activity, the conversion rate, the thermostability and/or the pH stability.

Functionally active variants may be obtained, e.g. by changing the sequence of a parent enzyme, e.g. an enzyme as listed in Figure 8, or any other native or wild-type enzyme of a metabolic pathway, but with modifications which do not impair the enzymatic activity of the expression product, and preferably would have a biological activity similar to the parent enzyme, including the ability to specifically or selectively support biotransformation and biosynthesis of a cell metabolite.

For example, the functionally active variant enzyme has substantially the same enzymatic activity as the parent enzyme, which is any as listed in Figure 8, or any other native or wild-type enzyme of a metabolic pathway.

The term "substantially the same enzymatic activity" as used herein refers to the activity as indicated by substantially the same activity being at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98% or even at least 100% or at least 110%, or at least 120%, or at least 130%, or at least 140%, or at least 150%, or at least 160%, or at least 170%, or at least 180%, or at least 190%, e.g. up to 200% of the activity as determined for the parent enzyme, e.g. as measured by the substrate-to-product conversion rate.

In a preferred embodiment the functionally active variant of a parent enzyme

a) is a biologically active fragment of the enzyme, the fragment comprising at least 50% of the sequence of the molecule, preferably at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% and most preferably at least 97%, 98% or 99%;

b) is derived from the enzyme by at least one amino acid substitution, addition and/or deletion, wherein the functionally active variant has a sequence identity to the molecule or part of it, such as an enzyme of at least 50% sequence identity, preferably at least 60%, more preferably at least 70%, more preferably at least 80%, still more preferably at least 90%, even more preferably at least 95% and most preferably at least 97%, 98% or 99%; and/or

c) consists of the enzyme or a functionally active variant thereof and additionally at least one amino acid or nucleotide heterologous to the polypeptide or the nucleotide sequence.

In one preferred embodiment of the invention, the functionally active variant of the enzyme is essentially identical to the variant described above, but differs from its polypeptide or the nucleotide sequence, respectively, in that it is derived from a homologous sequence of a different species. These are referred to as naturally occurring variants or analogs.

"Percent (%) identity" with respect to the nucleotide sequence of a gene is defined as the percentage of nucleotides in a candidate DNA sequence that is identical with the nucleotides in the DNA sequence or its corresponding or complementary sequence, after aligning the sequence and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent nucleotide sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software.

"Percent (%) amino acid sequence identity" with respect to enzyme sequences and homologs described herein is defined as the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the specific polypeptide sequence, after aligning the sequence and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity.

Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
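As an illustration of the identity definitions above, the following minimal sketch computes percent identity from two sequences that have already been aligned (gaps shown as "-"). It assumes that the denominator is the number of residues in the candidate sequence, which is one reading of the definition; other conventions (e.g. alignment length) exist. The example sequences are hypothetical.

```python
def percent_identity(candidate_aligned: str, reference_aligned: str, gap: str = "-") -> float:
    """Percent of candidate residues identical to the aligned reference residue.

    Both inputs must come from the same pairwise alignment (equal length).
    Conservative substitutions are not counted as identical.
    """
    if len(candidate_aligned) != len(reference_aligned):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(
        1
        for c, r in zip(candidate_aligned, reference_aligned)
        if c == r and c != gap
    )
    candidate_residues = sum(1 for c in candidate_aligned if c != gap)
    if candidate_residues == 0:
        raise ValueError("candidate sequence contains no residues")
    return 100.0 * identical / candidate_residues


# Hypothetical aligned fragments: 5 identical positions out of 6 candidate residues.
print(round(percent_identity("MKT-LLV", "MKTALIV"), 2))
```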

The term "isolated" or "isolation" as used herein with respect to a nucleic acid, a product of a biosynthesis or biotransformation or other compound shall refer to such compound that has been sufficiently separated from the environment with which it would naturally be associated, so as to exist in "substantially pure" form. "Isolated" does not necessarily mean the exclusion of artificial or synthetic mixtures with other compounds or materials, or the presence of impurities that do not interfere with the fundamental activity, and that may be present, for example, due to incomplete purification. In particular, isolated nucleic acid molecules of the present invention are also meant to include those chemically synthesized.

With reference to nucleic acids as described herein, the term "isolated nucleic acid" may be used. This term, when applied to DNA, refers to a DNA molecule that is separated from sequences with which it is immediately contiguous in the naturally occurring genome of the organism in which it originated. For example, an "isolated nucleic acid" may comprise a DNA molecule inserted into a vector, such as a plasmid or virus vector, or integrated into the genomic DNA of a prokaryotic or eukaryotic cell or host organism. When applied to RNA, the term "isolated nucleic acid" refers primarily to an RNA molecule encoded by an isolated DNA molecule as defined above. Alternatively, the term may refer to an RNA molecule that has been sufficiently separated from other nucleic acids with which it would be associated in its natural state (i.e., in cells or tissues). An "isolated nucleic acid" (either DNA or RNA) may further represent a molecule produced directly by biological or synthetic means and separated from other components present during its production.

The term "isolate" with respect to the product of a biosynthesis or biotransformation shall mean to separate or purify the substance from at least one impurity, specifically by preparative means, to obtain an isolated product. In particular the term specifically refers to compounds that are free or substantially free of material with which they are naturally associated such as other compounds with which they are found in their natural environment, or the environment in which they are prepared (e.g. cell culture) when such preparation is by recombinant DNA technology. Isolated compounds can be formulated with diluents or further additives and still for practical purposes be isolated - for example, the compounds can be mixed with other compounds or excipients when used for commercial applications or industrial use.

The term "expression" or "expression system" or "expression cassette" refers to nucleic acid molecules containing a desired coding sequence and control sequences in operable linkage, so that hosts transformed or transfected with these sequences are capable of producing the encoded proteins or host cell metabolites. In order to effect transformation, the expression system may be included in a vector; however, the relevant DNA may also be integrated into the host chromosome. Expression may refer to secreted or non-secreted expression products, including polypeptides or metabolites.

A cell may be transformed by exogenous or heterologous DNA when such DNA has been introduced inside the cell. The transforming DNA may or may not be integrated, i.e. covalently linked into the genome of the cell. In prokaryotes, yeast, and mammalian cells for example, the transforming DNA may be maintained on an episomal element such as a plasmid. With respect to eukaryotic cells, a stably transformed cell is one in which the transforming DNA has become integrated into a chromosome so that it is inherited by daughter cells through chromosome replication. This stability is demonstrated by the ability of the eukaryotic cell to establish cell lines or clones comprised of a population of daughter cells containing the transforming DNA.

The diverse genes or polynucleotides may be incorporated into plasmids. The plasmids are often standard cloning vectors, e.g., bacterial multicopy plasmids. The substrates can be incorporated into the same or different plasmids. Often at least two different types of plasmid having different types of selectable markers are used to allow selection for cells containing at least two types of vector.

Plasmids containing diverse gene substrates are initially introduced into cells by any method (e.g., chemical transformation, natural competence, electroporation, biolistics, packaging into phage or viral systems). Often, the plasmids are present at or near saturating concentration (with respect to maximum transfection capacity) to increase the probability of more than one plasmid entering the same cell. The plasmids containing the various substrates can be transfected simultaneously or in multiple rounds. For example, in the latter approach cells can be transfected with a first aliquot of plasmid, transfectants selected and propagated, and then infected with a second aliquot of plasmid. Preferred plasmids are, for example, pUC and pBluescript derivatives as pMIX-LAM or YAC derivatives as YCp50.

The term "operable" with respect to a gene encoding an enzyme as described herein shall mean the gene in the context of an expression system, so to obtain an expression product which is the enzyme encoded by the gene. For example, the gene may be incorporated in an expression cassette or vector which is operable to express the encoded enzyme. For such purpose the gene or suitable polynucleotides are typically operably linked to further nucleotide sequences, e.g. regulatory sequences, such as promoter sequences, which term "operably linked" as used herein refers to a linkage in which the regulatory sequence is contiguous with the gene of interest to control the gene of interest, as well as regulatory sequences that act in trans or at a distance to control the gene of interest. For example, a DNA sequence is operably linked to a promoter when it is ligated to the promoter downstream with respect to the transcription initiation site of the promoter and allows transcription elongation to proceed through the DNA sequence. A DNA for a signal sequence is operably linked to DNA coding for a polypeptide, if it is expressed as a pre-protein that participates in the transport of the polypeptide. Linkage of DNA sequences to regulatory sequences is typically accomplished by ligation at suitable restriction sites or adapters or linkers inserted in lieu thereof using restriction endonucleases known to one of skill in the art.

The term "regulatory sequence" as used herein refers to polynucleotide sequences which are necessary to affect the expression of coding sequences to which they are operably linked. Expression control sequences are sequences which control the transcription, post-transcriptional events and translation of nucleic acid sequences. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., ribosome binding sites); sequences that enhance protein stability; and when desired, sequences that enhance protein secretion. The nature of such control sequences differs depending upon the host organism. The term "control sequences" is intended to include, at a minimum, all components whose presence is essential for expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.

"Expression vectors" or "vectors" used herein are defined as DNA sequences that are required for the transcription of cloned recombinant nucleotide sequences, i.e. of recombinant genes and the translation of their mRNA in a suitable host organism. Such expression vectors usually comprise an origin for autonomous replication in the host cells, selectable markers (e.g. an amino acid synthesis gene or a gene conferring resistance to antibiotics such as erythromycin, chloramphenicol, zeocin, kanamycin, G418 or hygromycin), a number of restriction enzyme cleavage sites, a suitable promoter sequence and a transcription terminator, which components are operably linked together. The terms "plasmid" and "vector" as used herein include autonomously replicating nucleotide sequences as well as genome integrating nucleotide sequences. Appropriate expression vectors typically comprise regulatory sequences suitable for expression of DNA encoding a heterologous polypeptide or protein in a prokaryotic or eukaryotic host cell. Examples of regulatory sequences include promoters, operators, and enhancers, ribosomal binding sites, and sequences that control transcription and translation initiation and termination. The regulatory sequences may be operably linked to the DNA sequence to be expressed. For example, a promoter sequence is said to be operably linked to a coding sequence, if the promoter controls the transcription of the coding sequence.

The promoter may be any suitable DNA sequence which shows transcriptional activity in the host cell and may be derived from genes encoding proteins either homologous or heterologous to the host. The promoter is preferably derived from a gene encoding a protein homologous to the host cell. The promoter can be an endogenous promoter or heterologous to the host cell. Suitable promoter sequences for use with prokaryotic host cells may include but are not limited to

Further suitable promoter sequences for use with selected host cells may include but are not limited to promoters obtained from genes that code for metabolic enzymes which are known to be present at high concentration in the cell, e.g. glycolytic enzymes like glyceraldehyde-3-phosphate dehydrogenase.

In a preferred expression system the promoter is an inducible or a constitutive promoter.

According to a preferred embodiment of the present invention, a recombinant construct is obtained by ligating the relevant genes into a vector. These genes can be stably integrated into the host cell genome or replicated as an episomal plasmid. The vector may be transferred into the host cell by transformation. Preferred methods of transformation for the uptake of the recombinant DNA fragment by the microorganism include chemical transformation, electroporation or transformation by protoplastation. Transformants according to the present invention can be obtained by introducing such a vector DNA, e.g. plasmid DNA, into a host and selecting transformants which express the relevant protein or host cell metabolite with high yields.

In many cases it may be desirable simply to assemble, e.g. to string together and optionally mix such genes with gene variants, to diversify larger genes, e.g. members of an individual metabolic pathway or to assemble multiplicities of metabolic pathways according to this method. Metabolic pathways, which do not exist in nature, can be constructed in this manner. Thus, enzymes which are present in one organism that operate on a desired substrate produced by a different organism lacking such a downstream enzyme, can be encoded in the same organism by virtue of constructing the assembly of genes or partial genes to obtain recombined enzymes. Multiple enzymes can thus be included to construct complex metabolic pathways. This is advantageous if a cluster of polypeptides or partial polypeptides shall be arranged according to their biochemical function within the pathway. Exemplary gene pathways of interest are encoding enzymes for the synthesis of secondary metabolites of industrial interest, such as flavonols, macrolides, polyketides, etc.

Genetic pathways can be constructed in a combinatorial fashion such that each member in the combinatorial library has a different combination of gene variants. For example, a combinatorial library of variants can be constructed from individual DNA elements, where different fragments are recombined and assembled and wherein each of the different fragments has several variants. The recombination and assembly of a metabolic pathway may not need the presence of a marker sequence to prove the successful engineering. The expression of a metabolite in a desired way would already be indicative for the working example. The successful recombination and assembly of the metabolic pathway may, for example, be determined by the detection of the secondary metabolite in the cell culture medium.
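The combinatorial assembly described above can be sketched as a Cartesian product over the variants available for each pathway element. The element and variant names below are hypothetical placeholders, not constructs disclosed herein.

```python
from itertools import product

# Hypothetical pathway elements, each with several interchangeable variants.
variants = {
    "promoter": ["P1", "P2"],
    "first_OMT": ["A_v1", "A_v2", "A_v3"],
    "second_OMT": ["B_v1", "B_v2"],
}


def build_library(elements: dict) -> list:
    """Return every combination that takes one variant per pathway element."""
    slots = list(elements)
    return [dict(zip(slots, combo)) for combo in product(*elements.values())]


library = build_library(variants)
print(len(library))   # number of library members (2 x 3 x 2)
print(library[0])     # first member of the library
```

The library size grows multiplicatively with the number of variants per element, which is why screening by detection of the metabolite, rather than by a marker for each element, can be convenient.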

According to the invention a recombinant host cell is specifically cultivated under suitable conditions to produce the O-methylated compound as a fermentation product. According to a specific embodiment, a production host cell is cultivated in a batch mode, fed-batch mode or continuous mode.

Preferably, host cells are cultivated in a batch culture to accumulate biomass, followed by a fed-batch culture for the biotransforming reaction.

The cultivation and biotransforming reaction can take place concomitantly with the growth of the microorganisms or can be independent of the growth of the microorganisms. Therefore, the fermentation product can be accumulated in the medium while the microorganisms are growing or after the microorganisms ceased growing or both.

Inoculation of the microorganism may be into a fermentation medium, which is either a medium to provide for the bacterial growth, such as the growth medium, or directly into a production medium.

A growth medium allowing the accumulation of biomass typically comprises a carbon source, a nitrogen source, a source for sulphur and a source for phosphate. Typically, such a medium comprises furthermore trace elements, vitamins, amino acids.

A typical production medium comprises the compound for biotransformation, or any other starting compound of a metabolic pathway, as well as a sugar and optionally vitamins.

The production cell line is specifically used on a large scale or industrial scale. Production conditions in industrial scale refer to e.g. fed batch cultivation in reactor volumes of 100 L to 10 m³ or larger, employing typical process times of several days, or continuous processes in fermenter volumes of approx. 50 - 1000 L or larger, with dilution rates of approx. 0.05 - 0.15 h⁻¹.
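To relate the dilution rates mentioned above to practical operating figures, the following minimal sketch uses the usual continuous-culture relation D = F/V (dilution rate = feed rate / working volume) and the mean residence time 1/D. The function names are hypothetical; the numerical ranges are those stated above.

```python
def feed_rate_l_per_h(dilution_rate_per_h: float, working_volume_l: float) -> float:
    """Feed rate F (L/h) needed for a given dilution rate D (1/h) and volume V (L)."""
    return dilution_rate_per_h * working_volume_l


def mean_residence_time_h(dilution_rate_per_h: float) -> float:
    """Mean residence time (h) of the medium in the vessel, 1/D."""
    return 1.0 / dilution_rate_per_h


for volume_l in (50, 1000):
    for d in (0.05, 0.15):
        print(
            f"V = {volume_l} L, D = {d} 1/h: "
            f"F = {feed_rate_l_per_h(d, volume_l):.1f} L/h, "
            f"residence time = {mean_residence_time_h(d):.1f} h"
        )
```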

Specifically, the method of the invention provides for the high-yield production of said product, e.g. with a yield or concentration of at least 10 mg/L, preferably at least 20 mg/L, at least 30 mg/L, at least 40 mg/L, at least 50 mg/L, at least 100 mg/L, at least 200 mg/L, at least 1 g/L, at least 5 g/L or at least 50 g/L of the product obtained in the culture medium.

It is preferred that the fermentation product is produced as a high purity material, specifically with at least 95% purity, preferably with at least 96% purity, preferably at least 97% purity, preferably at least 98%, even more preferably with 99% purity and most preferably with 99.5% purity. The purified products may conveniently be obtained by purification of the cell culture supernatant.

The fermentation product produced according to the invention typically can be isolated and purified using state of the art techniques, including the increase of the concentration of the desired chemical and/or the decrease of the concentration of at least one impurity.

As isolation and purification methods the following are preferred, specifically for preparative isolation: distillation, chromatography, crystallization, filtration, centrifugation, decanting, reprecipitation. The chemical substance may e.g. be obtained from the fermented broth that is clarified using a centrifuge.

The isolated and purified chemical substance can be identified by conventional methods such as HPLC, GC, MS, NMR, infrared spectroscopy.

The foregoing description will be more fully understood with reference to the following examples. Such examples are, however, merely representative of methods of practicing one or more embodiments of the present invention and should not be read as limiting the scope of the invention.

EXAMPLES

1. Characterization of specificity of O-methylation activities

Several OMTs were characterized. COMTmsa (caffeic acid 3-O-methyltransferase from Medicago sativa, GenBank Accession n° AAB46623.1) was described as bifunctional, converting caffeic acid to ferulic acid and 5-hydroxyferulic acid to sinapic acid. AIMTpan (t-anol/isoeugenol O-methyltransferase from Pimpinella anisum, GenBank Accession n° ACL13527.1) was described to convert t-anol or isoeugenol to t-anethole or methylisoeugenol, respectively, via methylation of the para-OH group. CVOMT (chavicol O-methyltransferase from Ocimum basilicum, GenBank Accession n° AAL30423.1) was described with the same specificity for methylation of the para-OH group. It catalyzes methylation of chavicol and isoeugenol. COMThs (human catechol 3-O-methyltransferase, GenBank Accession n° CR456997.1) was described to degrade catecholamines such as dopamine, epinephrine, and norepinephrine. Finally, COMTyli (catechol O-methyltransferase from Yarrowia lipolytica, GenBank Accession n°500451) was described to be less specific toward the position on the phenyl ring, converting 3,4-dihydroxybenzaldehyde into vanillin and isovanillin. The sequence alignment and homology tree are shown in Figure 1. All OMT genes were expressed in yeast and the position specificity of O-methylation was characterized.

All of the Saccharomyces cerevisiae strains used in this work were isogenic haploids from BY4741 and were obtained from EUROSCARF (haploid a-mater BY00 or a-mater BY10). Yeast strain BY47 is derived from a strain collection that contains knock outs of auxotrophic (-ura3, -leu2, his3) marker genes. The different strains and relevant genotypes are listed in Table 1.

Table 1: S. cerevisiae strains

The FDC1 and/or PAD1 and/or ADH6 genes were then deleted in the yeast cells in order to avoid endogenous degradation of substrates or products. PAD1 and FDC1 encode two endogenous phenylacrylate decarboxylases. Deleting these genes avoids degradation of ferulic acid, isoferulic acid and cinnamic acid.

For each gene, recombinant clones were constructed using in vivo homologous recombination at the bud31 locus (Figure 2). Integration fragments were designed. T 5' and T 3' correspond to the bud31 target sequences on the yeast genome allowing homologous integration at the chromosomal locus. URA and LEU are the flanking markers for the double selection. Overlapping sequences correspond to the 5' part and the 3' part of the marker genes. All integration fragments IF1-IF2-IF4 and IF5 were amplified by PCR and amplicons were purified using the Wizard PCR Clean-up System (Promega). The synthesized ORF was amplified from the GeneArt plasmid. The 5' end of the upstream oligonucleotides used for amplifying the gene of interest contains a sequence of 40 nucleotides homologous with the 3' end of the pGAL1 promoter. The downstream oligonucleotides contain a 40-nt sequence homologous with the 5' end of the tCYC terminator. After assembly by homologous recombination in yeast, the double selection allows selection of the recombinants. All genes are optimized for yeast expression.
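The oligonucleotide design described above (40-nt homology tails joined to a gene-specific annealing region) can be sketched as follows. This is an illustration only: the 20-nt annealing length is an assumption, and the promoter, ORF and terminator sequences are randomly generated placeholders, not the actual pGAL1, OMT or tCYC sequences.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(promoter: str, orf: str, terminator: str,
                   homology: int = 40, anneal: int = 20) -> tuple:
    """Return (upstream, downstream) primers carrying homology tails.

    upstream:   last `homology` nt of the promoter + first `anneal` nt of the ORF
    downstream: reverse complement of (last `anneal` nt of the ORF
                + first `homology` nt of the terminator)
    """
    upstream = promoter[-homology:] + orf[:anneal]
    downstream = reverse_complement(orf[-anneal:] + terminator[:homology])
    return upstream, downstream


# Placeholder sequences (random, seeded for reproducibility); hypothetical only.
rng = random.Random(0)


def random_dna(n: int) -> str:
    return "".join(rng.choice("ACGT") for _ in range(n))


promoter = random_dna(200)
orf = "ATG" + random_dna(294) + "TAA"
terminator = random_dna(200)

upstream, downstream = design_primers(promoter, orf, terminator)
print(len(upstream), len(downstream))
```

The resulting amplicon carries promoter homology at its 5' end and terminator homology at its 3' end, which is what allows assembly with the flanking fragments by homologous recombination in yeast.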

For each transformation, five recombinant clones were randomly chosen and the correct integration of the cluster was analyzed by targeted PCRs using gDNA as template. Colony PCR was performed as follows. A minimal amount of cells (edge of a 10 µl tip) was re-suspended in a PCR tube containing 15 µl of lysis mix (100 mM Tris-HCl pH 7.5 + 5 µl zymolyase (10 mg/mL) from Sigma). The tubes were first incubated 20 min at 20°C, then 5 min at 37°C and finally 5 min at 95°C. 2 µl of each lysate mix were used in 25-100 µl DreamTaq PCR reactions as indicated by the supplier (Fermentas). Amplified DNAs for sequencing were separated from primers using the Wizard PCR Clean-up System (Promega).

Recombinant clones were then cultured in induction medium to allow synthesis of the proteins. Since gene expression in this construction is controlled by inducible GAL1 promoters, cells were grown on YPAGAL medium (YEP medium with galactose as the sole carbon source). After growth for 24 hours, cells were fed with 500 µM or 1 mM of the appropriate substrate (cinnamic acid, caffeic acid, ferulic acid, isoferulic acid, isoeugenol, 4-hydroxybenzaldehyde, 3,4-dihydroxybenzaldehyde, vanillin). Supernatants were then analyzed by high performance liquid chromatography (HPLC) to identify the appropriate product. Metabolites were analyzed using an Agilent 1260 or 1290 series HPLC system using an ACE5-C18 column (4.6 x 250 mm, 5 µm particle size) or a ZORBAX RRHD Eclipse Plus C18 column (3.0 x 100 mm, 1.8 µm particle size), respectively. An acetonitrile/water gradient was determined and a diode array detector was used to detect eluted compounds by their UV fluorescence at 260 nm, 280 nm and 320 nm. All standards were obtained from Sigma Aldrich. O-Methyltransferase enzymes were tested for their ability to catalyse 3-O-methylation and/or 4-O-methylation of catechol/phenol precursors.

The results of the feeding assays are presented in Figure 3, with a focus on methylation at the 4-position of the phenyl ring. Depending on the protein expressed and the structure of the substrate, methylation is detected at the 4-position, depending on the substituent of the phenyl ring at the 3-position (R2) and at the 1-position (R1). AIMT and CVOMToba, which share more sequence homology with COMTmsa than with catechol-OMTs, are very specific for the 4-position of the phenyl ring. They methylate the 4-position even if the 3-position carries a hydrogen or a methoxy group. Thus isoeugenol and 4-hydroxybenzaldehyde are methylated. COMTyli is able to methylate the 3- and 4-positions. It methylates the 4-position only if R2 is a methoxy group or a hydroxyl group, but not a hydrogen. Caffeic acid is methylated into isoferulic and ferulic acid, and isoferulic acid into 3,4-dimethoxycinnamic acid.

2. Cell producing 3,4-dimethoxycinnamic acid

3,4-Dimethoxycinnamic acid belongs to the family of phenylpropanoids, and is described as having antioxidant activities. It is also used as an intermediate in the synthesis of compounds of interest. Biosynthesis of 3,4-dimethoxycinnamic acid was evaluated using COMTs that are specific for 3- and 4-methylation of caffeic, ferulic and isoferulic acids (Figure 4). Methylation of the 4-O position of caffeic acid into isoferulic acid by COMTyli is described as the first step, before methylation of isoferulic acid into 3,4-dimethoxycinnamic acid by COMTmsa. Methylation of ferulic acid into 3,4-dimethoxycinnamic acid by COMTyli is also described.

a) Methylations of isoferulic and ferulic acid

• 3-O-Methylation of isoferulic acid

COMTyli is able to perform 4-O-methylation on the caffeic acid catechol moiety, leading to biosynthesis of isoferulic acid and ferulic acid. Since isoferulic acid is readily decarboxylated by endogenous phenylacrylate decarboxylases, PAD and FDC, COMTyli was expressed in Δpad, Δfdc and ΔpadΔfdc cells (EUROSCARF, double deletion produced according to methods well-known in the art), to stabilize the isoferulic acid produced in the medium. Cells were grown overnight in YPGAL medium and diluted the next day at OD600=0.1 in YPGAL. Caffeic acid was added at a final concentration of 500 µM. Yeast supernatant was analyzed as a function of time by UHPLC. Isoferulic acid is detected in all transformed cells, although the highest production is detected for the mutated cells Δpad and ΔpadΔfdc, with about 120 µM detected product (NB only 30 µM are measured in wt cells expressing COMTyli). Caffeic acid is steadily consumed over time. Ferulic acid is poorly detected in supernatants of all cells (i.e. less than 2 µM detected) (Figure 5A).

Isoferulic acid was added as a substrate to wild type and Δpad, Δfdc and ΔpadΔfdc cells expressing COMTmsa. Cells were grown overnight in YPGAL medium and diluted the next day at OD600=0.1 in YPGAL medium. Isoferulic acid was added at a final concentration of 1 mM. 3,4-Dimethoxycinnamic acid is detected in all transformed cells. The highest conversion is detected for COMTmsa expressed in Δpad, Δfdc and ΔpadΔfdc cells, where about 400 µM of 3,4-dimethoxycinnamic acid is measured in culture supernatants, whereas 120 µM was detected in wt transformed cells (Figure 5B).
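For comparison of the feeding assays above, the reported product concentrations can be expressed as an apparent molar conversion of the substrate fed. The sketch below uses only the approximate values stated in the text; it does not account for substrate or product lost to degradation or remaining inside the cells, so the figures are indicative only.

```python
def molar_conversion_percent(product_um: float, substrate_fed_um: float) -> float:
    """Apparent molar conversion: product detected / substrate fed, in percent."""
    return 100.0 * product_um / substrate_fed_um


# (enzyme, host, substrate fed in µM, product detected in µM), values as reported above.
reported = [
    ("COMTyli", "deletion strains", 500.0, 120.0),   # caffeic acid -> isoferulic acid
    ("COMTyli", "wt", 500.0, 30.0),
    ("COMTmsa", "deletion strains", 1000.0, 400.0),  # isoferulic acid -> 3,4-dimethoxycinnamic acid
    ("COMTmsa", "wt", 1000.0, 120.0),
]

for enzyme, host, fed, detected in reported:
    print(f"{enzyme} in {host}: {molar_conversion_percent(detected, fed):.0f}% apparent conversion")
```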

• 4-O-Methylation of ferulic acid

It was evaluated whether COMTyli could generate 3,4-dimethoxycinnamic acid from ferulic acid. Since ferulic acid is readily decarboxylated by endogenous phenylacrylate decarboxylases, PAD and FDC, conversion tests were performed in Δpad, Δfdc and ΔpadΔfdc cells. Cells were grown overnight in YPGAL medium and diluted the next day at OD600=0.1 in YPGAL. Ferulic acid was added at a final concentration of 500 µM. Yeast supernatant was analyzed as a function of time by UHPLC. 3,4-Dimethoxycinnamic acid was detected in all tested supernatants (Figure 5C), although the highest concentration is detected for COMTyli expressed in Δpad and ΔpadΔfdc cells (10 and 12 µM respectively). The level of detected 3,4-dimethoxycinnamic acid is about 10 times lower than that obtained by conversion with COMTmsa. Although COMTyli is more specific for 4-O-methylation, its substrate binding site might be hindered by the O-Met moiety of ferulic acid in position 3.

b) Combination of 3- and 4-O-methylations of caffeic acid

It was tested whether 3,4-dimethoxycinnamic acid could be generated from caffeic acid, using the COMTyli and COMTmsa constructs in one cell. To do so, cells of each mating type expressing both enzymes were grown overnight in YPGAL medium and diluted the next day to OD600 = 0.1 in the same YPGAL medium. Cells were allowed to conjugate over time. Wt and ΔpadΔfdc cells were tested. Caffeic acid was added at a final concentration of 500 μM. Culture supernatants were analysed as a function of time by UHPLC. 3,4-Dimethoxycinnamic acid is detected in all supernatants. Ferulic acid and isoferulic acid are also detected. This construct is able to convert caffeic acid into isoferulic acid, ferulic acid and 3,4-dimethoxycinnamic acid (figure 5D).

3. Cell producing natural anise flavor

p-Anisaldehyde is a component of Pimpinella anisum seed responsible for the anise flavour. p-Anisaldehyde is prepared commercially by oxidation of methoxytoluene (p-cresyl methyl ether) using manganese dioxide. It can also be produced by oxidation of anethole, a related fragrance that is found in some alcoholic beverages. Here, an alternative biotechnology-based approach to a natural source of p-anisaldehyde is proposed.

Since anisaldehyde is not an endogenous metabolite, it is necessary to recreate a synthetic production pathway in yeast. Like that of all phenylpropanoids, the synthesis of anisaldehyde starts with phenylalanine, which is produced endogenously by the cell. Five enzymes are required for the conversion of L-phenylalanine to p-anisaldehyde (Figure 6A). Phenylalanine is converted to coumarate by the successive action of the enzymes phenylalanine ammonia lyase (PAL) and cinnamate-4-hydroxylase (C4H). The following step is the chain-reduction reaction of coumaric acid leading to 4-hydroxybenzaldehyde. The reaction is initiated by the activation of coumaric acid to coumaroyl-CoA by the 4CL enzyme (4-coumarate coenzyme A ligase), followed by a β-oxidation performed by the ECH enzyme (enoyl-CoA hydratase/aldolase, crotonase family enzyme). The last step is the 4-methylation of the phenyl ring by CVOMToba. In order to preclude oxidation of aldehydes, a CAR protein was added to provide carboxylic acid reductase activity. The fragments hybridize and recombine with one another in the region of the entire ORF of each pair of genes and are then integrated into the chromosome (Figure 6B). Tg 5' and Tg 3' correspond to the target sequences on the yeast genome at the insertion site in the BUD31 locus of the yeast chromosome, triggering homologous integration into the desired chromosome site. HIS3 is the marker enabling selection of the recombinant pathway. Each gene is under the control of one promoter and one terminator sequence allowing its expression in yeast cells. All fragments were amplified by PCR, and competent yeast cells were transformed with 250 ng of each fragment (Figure 6B). Selection of recombinant clones was performed on media without histidine. After 3 days, clones transformed with the various fragments were observed on selection media. Correct integration of the pathway was verified by PCR.
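
The five-step route described above can be written down as an ordered list of enzyme, substrate and product, which makes it easy to check that each product feeds the next step. This is an illustrative sketch only: the names `Step`, `PATHWAY` and `is_linear` are hypothetical, and cinnamic acid as the product of PAL is supplied from general knowledge of the phenylpropanoid pathway rather than stated in the paragraph above. CAR is kept separate because the text describes it as an auxiliary activity, not as one of the five conversion steps.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    enzyme: str
    substrate: str
    product: str


# Five conversion steps from L-phenylalanine to p-anisaldehyde, in order.
PATHWAY: List[Step] = [
    Step("PAL", "L-phenylalanine", "cinnamic acid"),  # intermediate assumed (see lead-in)
    Step("C4H", "cinnamic acid", "coumaric acid"),
    Step("4CL", "coumaric acid", "coumaroyl-CoA"),
    Step("ECH", "coumaroyl-CoA", "4-hydroxybenzaldehyde"),
    Step("CVOMToba", "4-hydroxybenzaldehyde", "p-anisaldehyde"),
]

# Auxiliary activity added to counter oxidation of aldehydes to acids.
AUXILIARY = {"CAR": "carboxylic acid reductase"}


def is_linear(steps: List[Step]) -> bool:
    """True if every step consumes exactly what the previous step produced."""
    return all(prev.product == nxt.substrate for prev, nxt in zip(steps, steps[1:]))


for number, step in enumerate(PATHWAY, start=1):
    print(f"{number}. {step.enzyme}: {step.substrate} -> {step.product}")
print("chain is linear:", is_linear(PATHWAY))
```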

The expression of the pathway in recombinant yeast was analyzed. Recombinant yeast containing the p-anisaldehyde pathway were grown in rich medium (yeast extract 10 g/l, peptone 20 g/l, glucose 20 g/l or galactose 30 g/l). Cultures were grown in Erlenmeyer flasks with shaking at 30°C for three days. Cells were harvested by centrifugation and supernatants recovered. As controls, the Y00 wild-type strain (no p-anisaldehyde gene) and medium without yeast were cultured under the same conditions as the recombinant cells (clone expressing p-anisaldehyde pathway genes). HPLC was used to measure the production of p-anisaldehyde and pathway intermediates in S. cerevisiae cultures. Analysis showed that chromatograms from cells expressing p-anisaldehyde pathway genes contained additional peaks compared to the Y00 control. These peaks were identified by comparison to the library of molecules. Thus, cinnamic acid, coumaric acid, 4-hydroxybenzoic acid and p-anisic acid were identified. CVOMToba is able to methylate only the aldehyde form and does not convert 4-hydroxybenzoic acid into p-anisic acid. So the pathway is fully functional, but a diversion to the acid derivatives takes place immediately after the reduction step of the chain, and again from p-anisaldehyde to p-anisic acid. The p-anisic acid concentration was 52 μM.
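
For comparison with titres quoted in mass units, the molar concentration above can be converted to mg/l. The sketch below assumes the standard molecular formula of p-anisic acid (4-methoxybenzoic acid, C8H8O3) and rounded standard atomic masses. Neither is given in the text, and the helper names are illustrative.

```python
# Rounded standard atomic masses in g/mol (assumption: not taken from the text).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# p-Anisic acid (4-methoxybenzoic acid), C8H8O3 (assumption: standard formula).
P_ANISIC_ACID = {"C": 8, "H": 8, "O": 3}


def molar_mass(formula):
    """Molar mass in g/mol from an element -> atom count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def micromolar_to_mg_per_l(conc_um, mass_g_per_mol):
    """umol/l multiplied by g/mol gives ug/l; divide by 1000 for mg/l."""
    return conc_um * mass_g_per_mol / 1000.0


mm = molar_mass(P_ANISIC_ACID)
titre = micromolar_to_mg_per_l(52.0, mm)  # 52 uM as reported above
print(f"p-anisic acid: {mm:.2f} g/mol, {titre:.1f} mg/l")
```

Under these assumptions, 52 μM of p-anisic acid corresponds to about 7.9 mg/l.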