Title:
RECOMBINANT HOST CELLS TO PRODUCE ANTHRAQUINONE DERIVATIVES
Document Type and Number:
WIPO Patent Application WO/2020/161354
Kind Code:
A1
Abstract:
The present invention relates to a recombinant host cell comprising a heterologous nucleic acid sequence encoding a type III polyketide synthase (PKS) which is an octaketide synthase (OKS) and exhibiting post-PKS tailoring activities including aromatase, cyclase and ketoreductase activities. It further relates to the use of said recombinant host cell to produce anthraquinone derivatives, in particular 3,8-dihydroxy-1-methyl-anthraquinone-3-carboxylic acid and aloesaponarin II.

Inventors:
CREPIN LUCIE (FR)
SCHIAVON CAROLINE (FR)
BAYLAC AUDREY (FR)
CORDIER HÉLÈNE (FR)
BOISSONNAT GUILLAUME (FR)
Application Number:
PCT/EP2020/053309
Publication Date:
August 13, 2020
Filing Date:
February 10, 2020
Assignee:
PILI (FR)
International Classes:
C12N9/10; C09B1/16; C09B61/00; C12N9/02
Domestic Patent References:
WO2016198564A1 2016-12-15
WO2016198623A1 2016-12-15
WO1996040968A1 1996-12-19
WO1996000787A1 1996-01-11
WO2000056900A2 2000-09-28
Foreign References:
US20020045220A1 2002-04-18
US40929795A 1995-03-24
Other References:
KARPPINEN K ET AL: "Octaketide-producing type III polyketide synthase from Hypericum perforatum is expressed in dark glands accumulating hypericins", FEBS JOURNAL, WILEY-BLACKWELL PUBLISHING LTD, GB, vol. 275, no. 17, 1 September 2008 (2008-09-01), pages 4329 - 4342, XP002752481, ISSN: 1742-464X, [retrieved on 20080721], DOI: 10.1111/J.1742-4658.2008.06576.X
XUE GAO ET AL: "Engineered polyketide biosynthesis and biocatalysis in", APPLIED MICROBIOLOGY AND BIOTECHNOLOGY, SPRINGER, BERLIN, DE, vol. 88, no. 6, 19 September 2010 (2010-09-19), pages 1233 - 1242, XP019862132, ISSN: 1432-0614, DOI: 10.1007/S00253-010-2860-4
ABHIRUP DAS ET AL: "Biosynthesis of Aromatic Polyketides in Bacteria", ACCOUNTS OF CHEMICAL RESEARCH., vol. 42, no. 5, 19 May 2009 (2009-05-19), US, pages 631 - 639, XP055595851, ISSN: 0001-4842, DOI: 10.1021/ar8002249
AMES B D ET AL: "Structural and biochemical characterization of ZhuI aromatase/cyclase from the R1128 polyketide pathway", BIOCHEMISTRY, AMERICAN CHEMICAL SOCIETY, vol. 50, no. 39, 4 October 2011 (2011-10-04), pages 8392 - 8406, XP002747579, ISSN: 0006-2960, [retrieved on 20110908], DOI: 10.1021/BI200593M
GAO ET AL., APPL. MICROBIOL. BIOTECHNOL., vol. 88, no. 6, 2010, pages 1233 - 1242
NEEDLEMAN AND WUNSCH, J. MOL. BIOL., vol. 48, 1970, pages 443
SMITH AND WATERMAN, ADV. APPL. MATH., vol. 2, 1981, pages 482
ALTSCHUL ET AL., NUCLEIC ACIDS RES., vol. 25, 1997, pages 3389 - 3402
ALTSCHUL ET AL., FEBS J., vol. 272, 2005, pages 5101 - 5109
MCDANIEL ET AL., J. AM. CHEM. SOC., vol. 115, 1993, pages 11671
MCDANIEL ET AL., J. AM. CHEM. SOC., vol. 116, 1994, pages 11855
HAWKSWORTH ET AL.: "Ainsworth and Bisby's Dictionary of The Fungi", 1995, CAB INTERNATIONAL, UNIVERSITY PRESS
"Uniprot", Database accession no. A0A1B1MKH3
MIZUUCHI ET AL., FEBS JOURNAL, 2009, pages 2391 - 2401
ABE ET AL., J. AM. CHEM. SOC., vol. 127, no. 5, 2005, pages 1362 - 1363
"Genbank", Database accession no. AAS87170.1
DEBOER ET AL., PROC. NATL. ACAD. SCI. USA, vol. 80, 1983, pages 21 - 25
EGON ET AL., GENE, vol. 69, 1988, pages 301 - 315
VILLA-KAMAROFF ET AL., PROC. NATL. ACAD. SCI. USA, vol. 75, 1978, pages 3727 - 3731
GUZMAN ET AL., J BACTERIOL., vol. 177, no. 14, July 1995 (1995-07-01)
BRENT AND PTASHNE, PROC NATL ACAD SCI USA, vol. 78, no. 7, July 1981 (1981-07-01)
GILBERT ET AL., SCIENTIFIC AMERICAN, vol. 242, 1980, pages 74 - 94
SAMBROOK ET AL.: "Molecular cloning: a laboratory manual", 2001, COLD SPRING HARBOR
ROMANOS ET AL., YEAST, vol. 8, 1992, pages 423 - 488
ODELL ET AL., NATURE, vol. 313, 1985, pages 810 - 812
MCELROY ET AL., PLANT CELL, vol. 2, 1990, pages 163 - 171
CHRISTIAN ET AL., PLANT MOL. BIOL., vol. 18, 1989, pages 675 - 689
LAST ET AL., THEOR APPL. GENET., vol. 81, 1991, pages 581 - 588
VELTEN ET AL., EMBO J., vol. 3, 1984, pages 2723 - 2730
THOMAS AND FLAVELL, PLANT CELL, vol. 2, no. 12, December 1990 (1990-12-01), pages 1171 - 80
JEANNEAU ET AL., BIOCHIMIE, vol. 84, 2002, pages 1127 - 1135
KATAYAMA ET AL., PLANT MOL. BIOL., vol. 44, 2000, pages 99 - 106
BARTEL ET AL., J. BACTERIOL., vol. 172, no. 9, September 1990 (1990-09-01)
"NCBI", Database accession no. SCO5091
Attorney, Agent or Firm:
CABINET BECKER ET ASSOCIES (FR)
Claims:
CLAIMS

1. A recombinant microbial host cell comprising a heterologous nucleic acid sequence encoding a type III polyketide synthase (PKS) which is an octaketide synthase (OKS), a nucleic acid sequence encoding an enzyme exhibiting post-PKS aromatase activity, a nucleic acid sequence encoding an enzyme exhibiting post-PKS cyclase activity and a nucleic acid sequence encoding an enzyme exhibiting post-PKS ketoreductase activity.

2. The recombinant host cell of claim 1, wherein the host cell is selected from the group consisting of bacteria, yeasts and filamentous fungi.

3. The recombinant host cell of claim 1 or 2, wherein the host cell is a bacterium, preferably a Gram-negative bacterium.

4. The recombinant host cell of any of claims 1 to 3, wherein the host cell is Escherichia coli.

5. The recombinant host cell of any of claims 1 to 4, wherein the OKS is selected from plant OKS and variants thereof, said variants exhibiting OKS activity and at least 70% sequence identity to any of said plant OKS.

6. The recombinant host cell of any of claims 1 to 5, wherein the OKS is from a plant belonging to Aloe, Hypericum, Gerbera, Rheum, Medicago, Arachis, Phalaenopsis, Wachendorfia, Ipomoea, Drosophyllum, Plumbago, Ruta, Hydrangea, Humulus, Vitis, Pisum, Phaseolus, Pueraria, Pinus, Oryza, Zea, Petunia or Camellia genus, preferably from a plant belonging to Aloe or Hypericum genus.

7. The recombinant host cell of any of claims 1 to 5, wherein the OKS is a variant of a PKS producing polyketide consisting of less than eight ketide units wherein the active-site residue corresponding to Thr197 in chalcone synthase of Medicago sativa (SEQ ID NO: 39) is substituted by a glycine residue.

8. The recombinant host cell of any of claims 1 to 4, wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8 and 40, and variants thereof exhibiting OKS activity and having at least 70 % sequence identity to any of SEQ ID NO: 1 to 8 and 40, preferably an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6, 7 and 40, and variants thereof exhibiting OKS activity and having at least 70 % sequence identity to any of SEQ ID NO: 1, 6, 7 and 40.

9. The recombinant host cell of any of claims 1 to 4, wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8, preferably from SEQ ID NO: 1 to 7, and variants thereof exhibiting OKS activity and having at least 70 % sequence identity to any of SEQ ID NO: 1 to 8, preferably to any of SEQ ID NO: 1 to 7.

10. The recombinant host cell of any of claims 1 to 4, wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6 and 7, preferably of SEQ ID NO: 1 or 6, and variants thereof exhibiting OKS activity and having at least 70 % sequence identity to SEQ ID NO: 1, 6 or 7, preferably to SEQ ID NO: 1 or 6.

11. The recombinant host cell of any of claims 1 to 10, wherein said recombinant host cell comprises a heterologous nucleic acid encoding an enzyme exhibiting post-PKS aromatase activity.

12. The recombinant host cell of any of claims 1 to 11, wherein the enzyme exhibiting post-PKS aromatase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 to 18, and variants thereof exhibiting aromatase activity and having at least 70% sequence identity to any of SEQ ID NO: 9 to 18.

13. The recombinant host cell of any of claims 1 to 11, wherein the enzyme exhibiting post-PKS aromatase activity comprises, or consists of, an amino acid sequence selected from Streptomyces post-PKS aromatases and variants thereof exhibiting post-PKS aromatase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS aromatases.

14. The recombinant host cell of any of claims 1 to 13, wherein the enzyme exhibiting post-PKS aromatase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 to 15, and variants thereof exhibiting aromatase activity and having at least 70% sequence identity to any of SEQ ID NO: 9 to 15.

15. The recombinant host cell of any of claims 1 to 14, wherein the enzyme exhibiting post-PKS aromatase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 and variants thereof exhibiting aromatase activity and having at least 70% sequence identity to SEQ ID NO: 9.

16. The recombinant host cell of any of claims 1 to 15, wherein said recombinant host cell comprises a heterologous nucleic acid encoding an enzyme exhibiting post-PKS cyclase activity.

17. The recombinant host cell of any of claims 1 to 16, wherein the enzyme exhibiting post-PKS cyclase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19 to 28, and variants thereof exhibiting cyclase activity and having at least 70% identity to any of SEQ ID NO: 19 to 28.

18. The recombinant host cell of any of claims 1 to 16, wherein the enzyme exhibiting post-PKS cyclase activity comprises, or consists of, an amino acid sequence selected from Streptomyces post-PKS cyclases and variants thereof exhibiting post-PKS cyclase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS cyclases.

19. The recombinant host cell of any of claims 1 to 18, wherein the enzyme exhibiting post-PKS cyclase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19 to 25, and variants thereof exhibiting cyclase activity and having at least 70% identity to any of SEQ ID NO: 19 to 25.

20. The recombinant host cell of any of claims 1 to 19, wherein the enzyme exhibiting post-PKS cyclase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19, and variants thereof exhibiting cyclase activity and having at least 70% identity to SEQ ID NO: 19.

21. The recombinant host cell of any of claims 1 to 20, wherein said recombinant host cell comprises a heterologous nucleic acid sequence encoding an enzyme exhibiting post-PKS ketoreductase activity.

22. The recombinant host cell of any of claims 1 to 21, wherein the enzyme exhibiting post-PKS ketoreductase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 to 38 and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70% identity to any of SEQ ID NO: 29 to 38.

23. The recombinant host cell of any of claims 1 to 21, wherein the enzyme exhibiting post-PKS ketoreductase activity comprises, or consists of, an amino acid sequence selected from Streptomyces post-PKS ketoreductases and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS ketoreductases.

24. The recombinant host cell of any of claims 1 to 23, wherein the enzyme exhibiting post-PKS ketoreductase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 to 34 and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70% identity to any of SEQ ID NO: 29 to 34.

25. The recombinant host cell of any of claims 1 to 24, wherein the enzyme exhibiting post-PKS ketoreductase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70% identity to SEQ ID NO: 29.

26. The recombinant host cell of any of claims 1 to 25, wherein the host cell, preferably a bacterium, more preferably E. coli, comprises

- a heterologous nucleic acid sequence encoding a type III polyketide synthase (PKS) which is an octaketide synthase (OKS), preferably as defined in any of claims 5 to 10, and more preferably as defined in claim 9, and

- a heterologous nucleic acid encoding a post-PKS aromatase, preferably as defined in any of claims 12 to 15, and more preferably as defined in claim 14, and

- a heterologous nucleic acid encoding a post-PKS cyclase, preferably as defined in any of claims 17 to 20, and more preferably as defined in claim 19, and

- a heterologous nucleic acid encoding a post-PKS ketoreductase, preferably as defined in any of claims 22 to 25, and more preferably as defined in claim 24.

27. The recombinant host cell of any of claims 1 to 26, wherein the host cell, preferably a bacterium, more preferably E. coli, comprises

- a heterologous nucleic acid sequence encoding a type III polyketide synthase (PKS) which is an octaketide synthase (OKS) comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6 and 7, preferably of SEQ ID NO: 1 or 6, and variants thereof exhibiting OKS activity and having at least 70 % sequence identity to SEQ ID NO: 1, 6 or 7, preferably to SEQ ID NO: 1 or 6; and

- a heterologous nucleic acid encoding a post-PKS aromatase, comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 and variants thereof exhibiting aromatase activity and having at least 70% sequence identity to SEQ ID NO: 9, and

- a heterologous nucleic acid encoding a post-PKS cyclase, comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19, and variants thereof exhibiting cyclase activity and having at least 70% identity to SEQ ID NO: 19, and

- a heterologous nucleic acid encoding a post-PKS ketoreductase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70% identity to SEQ ID NO: 29.

28. A recombinant nucleic acid comprising a nucleic acid sequence encoding a type III PKS which is an octaketide synthase and a nucleic acid sequence encoding a post-PKS tailoring enzyme selected from the group consisting of an aromatase, a cyclase and a ketoreductase, operably linked to one or more control sequences that direct the expression of said nucleic acid sequences in a suitable host cell under conditions compatible with the control sequences.

29. The recombinant nucleic acid of claim 28, wherein said recombinant nucleic acid comprises a nucleic acid sequence encoding a type III PKS which is an octaketide synthase, preferably as defined in any of claims 5 to 10, and a nucleic acid sequence encoding a post-PKS ketoreductase, preferably as defined in any of claims 22 to 25, and optionally a nucleic acid sequence encoding a post-PKS aromatase, preferably as defined in any of claims 12 to 15, and/or a nucleic acid sequence encoding a post-PKS cyclase, preferably as defined in any of claims 17 to 20, operably linked to one or more control sequences that direct the expression of said nucleic acid sequences in a suitable host cell under conditions compatible with the control sequences.

30. An expression vector comprising a recombinant nucleic acid of claim 28 or 29.

31. A recombinant microbial host cell, preferably a bacterial host cell, comprising a recombinant nucleic acid of claim 28 or 29, or an expression vector of claim 30.

32. A method of producing a compound which is an anthraquinone derivative, comprising culturing a recombinant host cell of any of claims 1 to 27 and 31, under conditions suitable to produce said compound, and optionally recovering said compound.

33. Use of a recombinant host cell of any of claims 1 to 27 and 31 to produce a compound which is an anthraquinone derivative.

34. The method of claim 32 or the use of claim 33, wherein the anthraquinone derivative is selected from the group consisting of 3,8-dihydroxy-1-methyl-anthraquinone-3-carboxylic acid (DMAC) and aloesaponarin II (AL2), and mixture thereof.

Description:
RECOMBINANT HOST CELLS TO PRODUCE ANTHRAQUINONE DERIVATIVES

FIELD OF THE INVENTION

The present invention relates to the field of the production of biobased compounds, in particular the production of anthraquinone derivatives using recombinant host cells.

BACKGROUND OF THE INVENTION

Colouring substances are an integral part of our daily lives. Among other things, they are widely used for colouring textiles, decorative objects and home furnishings, inks and paper products, as well as food, cosmetic and pharmaceutical products.

99% of all dyes are made using petrochemistry, while colors from plants and animals are not scalable. Increased awareness of environmental problems and of public health safety has led to an evaluation of the environmental toxicity and impact of these colorants. It thus emerged that production processes of many of them are sources of environmental pollution.

Anthraquinones represent the most prevalent family of colouring agents after the azo compounds family. Anthraquinones are aromatic polycyclic hydrocarbons having the following backbone of formula (A):

Based on various substitutions on the backbone of formula (A), a wide range of compounds can be prepared, thereby providing a wide diversity of colors.

Until recently, the preparation of most anthraquinones was based on anthraquinonesulfonic acids and thus generated large volumes of dilute waste acids. Environmental considerations triggered the development of new chemical processes having less ecological impact. However, there is still a need to provide a sustainable and environment-friendly method for the production of these compounds.

Biosynthesis of anthraquinone derivatives involves the production of a polyketide modified by tailoring enzymes. For example, 3,8-dihydroxy-1-methyl-anthraquinone-3-carboxylic acid (DMAC) and aloesaponarin II (AL2) (Figure 1) are shunt products of the pathway leading to actinorhodin, a blue compound naturally produced by Streptomyces coelicolor. These two anthraquinones are produced from one molecule of acetyl-CoA and seven molecules of malonyl-ACP. First, a linear poly-β-keto chain is biosynthesized by the minimal type II polyketide synthase (PKS) consisting of the dimeric ketosynthase-chain length factor (KS-CLF) and an acyl carrier protein (ACP). The nascent chain is then modified by a ketoreductase (KR), an aromatase (ARO) and a cyclase (CYC). The cyclase, which has a thioesterase (TE) activity, releases the polyketide under construction from its ACP anchor. Once free, the molecule spontaneously evolves to form DMAC and its decarboxylated derivative AL2. In S. coelicolor, KS, CLF and ACP are encoded by the genes actI-1, actI-2 and actI-3, respectively, while the tailoring enzymes KR, ARO and CYC are encoded by the genes actIII, actVII and actIV, respectively.
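The carbon bookkeeping of this pathway can be checked with a short calculation. The sketch below is illustrative only. It assumes that each malonyl extender contributes two carbons to the chain (the third being lost as CO2 on condensation), and it uses the molecular formula C16H10O6 for DMAC, which is not stated in the text above. The function names are hypothetical.

```python
from collections import Counter

# Assumed molecular formulas; they are not stated in the text above.
DMAC = Counter({"C": 16, "H": 10, "O": 6})
CO2 = Counter({"C": 1, "O": 2})


def chain_carbons(acetyl_units: int = 1, malonyl_units: int = 7) -> int:
    """Carbons in the linear poly-beta-keto chain.

    Assumes 2 carbons from each acetyl starter and 2 from each malonyl
    extender (the third malonyl carbon leaves as CO2 on condensation).
    """
    return 2 * acetyl_units + 2 * malonyl_units


def decarboxylate(formula: Counter) -> Counter:
    """Return the formula left after loss of one CO2."""
    product = Counter(formula)
    product.subtract(CO2)
    return +product  # unary plus drops any zero or negative counts


AL2 = decarboxylate(DMAC)
print(chain_carbons())  # carbons in the octaketide chain
print(dict(AL2))        # formula of the decarboxylated derivative
```

Under these assumptions the octaketide chain carries 16 carbons, the same number as DMAC, and loss of one CO2 leaves the 15-carbon formula C15H10O4 for AL2.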

To date, while production of anthraquinone derivatives has been obtained in plant or fungal host cells (e.g. International patent application WO 2016/198564), no production of anthraquinones has been reported in non-natural producing bacterial host cells such as E. coli. Indeed, despite repeated attempts by numerous groups, expression of the core KS-CLF heterodimer in E. coli has always resulted in 100% of the proteins as inclusion bodies (Gao et al., 2010, Appl. Microbiol. Biotechnol., 88 (6), 1233-1242). An approach to overcome the barrier of reconstituting bacterial minimal PKS in E. coli was to dissect and reassemble G. fujikuroi PKS4 (fungal type I iterative PKS) to mimic the dissociated components of bacterial type II minimal PKS. However, in order to produce the polyketide of interest, the E. coli BAP1 strain had to be used. This strain possesses a chromosomal insertion of the phosphopantetheinyl transferase from Bacillus subtilis which ensures that polyketide synthase acyl carrier proteins (ACP) are post-translationally modified to ensure the effective functioning of the PKS. Using another approach, the oxytetracycline biosynthesis pathway from Streptomyces rimosus, consisting of a type II polyketide synthase gene cluster of 21 genes, was expressed in E. coli (Stevens et al., 2013, PLoS One, 8(5)). The strain BAP1 was used for the same reason as above, i.e. to efficiently activate the heterologous ACP group. Moreover, overexpression of a sigma factor, σ54, was required to express the biosynthetic gene cluster and produce oxytetracycline. However, modifying a bacterial strain to express a heterologous phosphopantetheinyl transferase or to overexpress a sigma factor increases the metabolic burden on the host cell and may negatively affect productivity, thereby impairing the economic viability of production processes using such bacteria.

SUMMARY OF THE INVENTION

The inventors aim at the production of anthraquinone derivatives in a recombinant bacterial host cell thereby opening the way toward the production of these compounds through fermentation and in particular to the production of biobased colorants.

Accordingly, in a first aspect, the present invention relates to a recombinant microbial host cell comprising a heterologous nucleic acid sequence encoding a type III polyketide synthase (PKS) which is an octaketide synthase (OKS), a nucleic acid sequence encoding an enzyme exhibiting post-PKS aromatase activity, a nucleic acid sequence encoding an enzyme exhibiting post-PKS cyclase activity and a nucleic acid sequence encoding an enzyme exhibiting post-PKS ketoreductase activity.

The host cell may be selected from the group consisting of bacteria, yeasts and filamentous fungi. Preferably, the host cell is a bacterium, more preferably a Gram-negative bacterium, even more preferably Escherichia coli.

The OKS may be selected from plant OKS and variants thereof, said variants exhibiting OKS activity and at least 70% sequence identity to any of said plant OKS. In particular, the OKS may be from a plant belonging to Aloe, Hypericum, Gerbera, Rheum, Medicago, Arachis, Phalaenopsis, Wachendorfia, Ipomoea, Drosophyllum, Plumbago, Ruta, Hydrangea, Humulus, Vitis, Pisum, Phaseolus, Pueraria, Pinus, Oryza, Zea, Petunia or Camellia genus, preferably from a plant belonging to Aloe or Hypericum genus.

The OKS may also be a variant of a PKS producing polyketide consisting of less than eight ketide units wherein the active-site residue corresponding to Thr197 in chalcone synthase of Medicago sativa (SEQ ID NO: 39) is substituted by a glycine residue.

The OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8 and 40 and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 1 to 8 and 40.
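As an illustration of how such an identity threshold might be checked, the sketch below computes percent identity over two pre-aligned sequences. It is a minimal example under stated assumptions: identity is taken as the number of matching non-gap positions divided by the alignment length, which may differ from the definition used elsewhere in this application, and the two short sequences are invented toy data, not any of the SEQ ID NO sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment ('-' marks a gap).

    Assumption: identity = matching non-gap columns / alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment (hypothetical sequences): one substitution and one gap.
reference = "MSSLSNASHL"
variant = "MSALSNA-HL"
print(percent_identity(reference, variant))
print(meets_threshold(reference, variant))
```

With this toy pair, 8 of 10 alignment columns match, so the variant would pass a 70% threshold but fail a 90% one. In practice the alignment itself would come from a global alignment tool rather than being written by hand.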

In particular, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8, preferably SEQ ID NO: 1 to 7, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 1 to 8, preferably to any of SEQ ID NO: 1 to 7.

More particularly, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6 and 7, preferably of SEQ ID NO: 1 or 6, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 1, 6 or 7, preferably to SEQ ID NO: 1 or 6.

Preferably, the recombinant host cell comprises a heterologous nucleic acid encoding an enzyme exhibiting post-PKS aromatase activity.

The enzyme exhibiting post-PKS aromatase activity may comprise, or consist of, an amino acid sequence selected from the group consisting of aromatases of SEQ ID NO: 9 to 18, and variants thereof exhibiting aromatase activity and having at least 70% identity to any of SEQ ID NO: 9 to 18.

The enzyme exhibiting post-PKS aromatase activity may comprise, or consist of, an amino acid sequence selected from Streptomyces post-PKS aromatases and variants thereof exhibiting post-PKS aromatase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS aromatases. In particular, the enzyme exhibiting post-PKS aromatase activity may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 to 15, preferably SEQ ID NO: 9, and variants thereof exhibiting aromatase activity and having at least 70% sequence identity to any of SEQ ID NO: 9 to 15, preferably SEQ ID NO: 9.

Preferably, the recombinant host cell comprises a heterologous nucleic acid encoding an enzyme exhibiting post-PKS cyclase activity.

The enzyme exhibiting post-PKS cyclase activity may comprise, or consist of, an amino acid sequence selected from the group consisting of cyclases of SEQ ID NO: 19 to 28, and variants thereof exhibiting cyclase activity and having at least 70% identity to any of SEQ ID NO: 19 to 28. The enzyme exhibiting post-PKS cyclase activity may comprise, or consist of, an amino acid sequence selected from Streptomyces post-PKS cyclases and variants thereof exhibiting post-PKS cyclase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS cyclases. In particular, the enzyme exhibiting post-PKS cyclase activity may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19 to 25, preferably SEQ ID NO: 19, and variants thereof exhibiting cyclase activity and having at least 70% identity to any of SEQ ID NO: 19 to 25, preferably SEQ ID NO: 19.

Preferably, the recombinant host cell comprises a heterologous nucleic acid sequence encoding an enzyme exhibiting post-PKS ketoreductase activity.

The enzyme exhibiting post-PKS ketoreductase activity may comprise, or consist of, an amino acid sequence selected from the group consisting of ketoreductases of SEQ ID NO: 29 to 38, and variants thereof exhibiting ketoreductase activity and having at least 70% identity to any of SEQ ID NO: 29 to 34.

The enzyme exhibiting post-PKS ketoreductase activity may comprise, or consist of, an amino acid sequence selected from Streptomyces post-PKS ketoreductases and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS ketoreductases. In particular, the enzyme exhibiting post-PKS ketoreductase activity may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 to 34, preferably SEQ ID NO: 29, and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70% identity to any of SEQ ID NO: 29 to 34, preferably SEQ ID NO: 29.

In a second aspect, the present invention also relates to a recombinant nucleic acid comprising a nucleic acid sequence encoding a type III PKS which is an octaketide synthase and a nucleic acid sequence encoding a post-PKS tailoring enzyme selected from the group consisting of an aromatase, a cyclase and a ketoreductase, preferably a nucleic acid sequence encoding a post-PKS ketoreductase and optionally a nucleic acid sequence encoding a post-PKS aromatase and/or a nucleic acid sequence encoding a post-PKS cyclase, operably linked to one or more control sequences that direct the expression of said nucleic acid sequences in a suitable host cell under conditions compatible with the control sequences.

In a third aspect, the present invention further relates to an expression vector comprising a recombinant nucleic acid of the invention. It also relates to a host cell, preferably a recombinant microbial host cell, more preferably a bacterial host cell, comprising a recombinant nucleic acid or an expression vector of the invention.

In a fourth aspect, the present invention relates to a method of producing a compound which is an anthraquinone derivative, comprising culturing a recombinant host cell of the invention, under conditions suitable to produce said compound, and optionally recovering said compound. The present invention also relates to the use of a recombinant host cell of the invention to produce a compound which is an anthraquinone derivative.

Preferably, the anthraquinone derivative is selected from the group consisting of 3,8-dihydroxy-1-methyl-anthraquinone-3-carboxylic acid (DMAC) and aloesaponarin II (AL2).

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1: Structures of SEK4, SEK4b, SEK34, mutactin, aloesaponarin II (AL2) and 3,8-dihydroxy-1-methylanthraquinone-2-carboxylic acid (DMAC).

DETAILED DESCRIPTION OF THE INVENTION

Polyketide synthases (PKSs) are a family of multi-domain enzymes or enzyme complexes that produce polyketides. PKSs can be classified into three groups according to their sequences, primary structures and catalytic mechanisms. Type I PKSs are large assemblies of multifunctional polypeptides including at least three essential domains for chain elongation, i.e. ketosynthase (KS), chain length factor (CLF) and acyl carrier protein (ACP) domains. Type II PKSs are aggregates of monofunctional enzymes including KS, CLF and ACP activities, that function repeatedly in the synthesis of a poly-β-ketone backbone. Type III PKSs are homodimeric KSs that act independently of the ACP domain. The single active site in each monomer catalyzes the priming and extension reactions iteratively to form polyketide products.

As shown in the experimental section of the present application, the inventors herein demonstrated for the first time that functional type III PKSs can be expressed in bacterial host cells, e.g. E. coli, and efficiently coupled with tailoring enzymes, in particular with ketoreductases (KRs) which are known to recognize substrates that are covalently bound to ACPs within the same PKS module. Indeed, the inventors herein demonstrated that polyketides produced by type III PKSs can be efficiently modified by tailoring enzymes despite the absence of an ACP anchor. They thus succeeded in producing two anthraquinones, namely DMAC and AL2, in a recombinant E. coli host cell using type III PKSs from plants coupled with heterologous bacterial tailoring enzymes, thereby paving the way to the bioproduction of anthraquinone derivatives and thus to a sustainable and environment-friendly method for producing colorants.

Definitions

In the context of the invention, the term “recombinant host cell” designates a cell that is not found in nature and which contains a modified genome as a result of either a deletion, insertion or modification of one or several genetic elements. The term “host cell” also encompasses any progeny of a parent host cell that is not identical to the parent host cell due to mutations that occur during replication. The host cell may be a microbial or plant host cell. Preferably, the host cell is a microbial host cell. As used herein, the term “microbial host cell” refers to a bacterium, a filamentous fungus or a yeast, preferably a bacterium or a yeast, more preferably a bacterium.

A “recombinant nucleic acid” or “recombinant nucleic acid molecule” designates a nucleic acid (such as, e.g., DNA, cDNA or RNA molecule) which has been engineered and is not found as such in nature. Typically, this term refers to a nucleic acid molecule comprising segments generated and/or joined together using recombinant DNA technology, such as for example molecular cloning and nucleic acid amplification. A recombinant nucleic acid molecule comprises one or more non-naturally occurring sequences, and/or contains joined nucleic acid molecules from different original sources and not naturally attached together.

The term“gene” designates any nucleic acid encoding a protein. This term encompasses DNA, such as cDNA or gDNA, as well as RNA. The gene may be first prepared by e.g., recombinant, enzymatic and/or chemical techniques, and subsequently replicated in a host cell or an in vitro system. The gene typically comprises an open reading frame encoding a desired protein. The gene may contain additional sequences such as a transcription terminator or a signal peptide.

The term “operably linked” means a configuration in which a control sequence is placed at an appropriate position relative to a coding sequence, in such a way that the control sequence directs expression of the coding sequence. The term “control sequence” means nucleic acid sequences necessary for expression of a gene. Control sequences may be native or heterologous. Well-known control sequences currently used by the person skilled in the art are preferred. Such control sequences include, but are not limited to, a leader, polyadenylation sequence, propeptide sequence, promoter, signal peptide sequence, ribosome binding site and transcription terminator. Preferably, the control sequences include a promoter and a transcription terminator.

The term“expression cassette” denotes a nucleic acid construct comprising a coding region, i.e. one or several genes, and a regulatory region, i.e. comprising one or more control sequences, operably linked. Optionally, the expression cassette may comprise several coding regions operably linked to several regulatory regions. In particular, the expression cassette may comprise several coding sequences, each of these sequences being operably linked to the same promoter or to a distinct promoter. Alternatively, the expression cassette may comprise one or several coding sequences, each of these sequences operably linked to a distinct promoter, and several other coding sequences operably linked to a common promoter.

As used herein, the term “expression vector” means a DNA or RNA molecule that comprises an expression cassette. Preferably, the expression vector is a linear or circular double stranded DNA molecule.

As used herein, the term “native” or “endogenous”, with respect to a host cell, refers to a genetic element or a protein naturally present in said host cell. The term “heterologous”, with respect to a host cell, refers to a genetic element or a protein that is not naturally present in said host cell. Preferably, this term refers to a genetic element or a protein provided from a cell of a different species or a different genus than the host cell, more preferably of a different genus than the host cell. In the present application, a protein encoded by a nucleic acid which is heterologous to the host cell is a protein which is heterologous to the host cell, i.e. a protein which is not naturally present in said host cell. Thus the expressions “a heterologous nucleic acid sequence encoding a protein”, “a heterologous nucleic acid sequence encoding a heterologous protein” and “a nucleic acid sequence encoding a heterologous protein” can be used interchangeably.

As used herein, the term “sequence identity” or “identity” refers to the percentage (%) of matches (identical amino acid residues) in positions from an alignment of two polypeptide sequences. The sequence identity is determined by comparing the sequences when aligned so as to maximize overlap and identity while minimizing sequence gaps. In particular, sequence identity may be determined using any of a number of mathematical global or local alignment algorithms, depending on the length of the two sequences. Sequences of similar lengths are preferably aligned using a global alignment algorithm (e.g. Needleman and Wunsch algorithm; Needleman and Wunsch, 1970, J. Mol. Biol 48:443) which aligns the sequences optimally over the entire length, while sequences of substantially different lengths are preferably aligned using a local alignment algorithm (e.g. Smith and Waterman algorithm (Smith & Waterman, Adv. Appl. Math. 2:482, 1981) or Altschul algorithm (Altschul et al. 1997, Nucleic Acids Res. 25:3389-3402; Altschul et al. 2005, FEBS J. 272:5101-5109)). Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software available on internet web sites such as http://blast.ncbi.nlm.nih.gov/ or http://www.ebi.ac.uk/Tools/emboss/. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared. For purposes herein, % amino acid sequence identity values refer to values generated using the pairwise sequence alignment program EMBOSS Needle, which creates an optimal global alignment of two sequences using the Needleman-Wunsch algorithm, wherein all search parameters are set to default values, i.e. Scoring matrix = BLOSUM62, Gap open = 10, Gap extend = 0.5, End gap penalty = false, End gap open = 10 and End gap extend = 0.5. In some particular embodiments, all sequence identities (in particular variant sequence identities) are identical and set to at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%. In some more particular embodiments, all sequence identities (in particular variant sequence identities) are identical and set to at least 80% sequence identity. In some other particular embodiments, all sequence identities (in particular variant sequence identities) are identical and set to at least 90% sequence identity.
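As an illustration of how a percent identity value is read off an alignment, the following minimal Python sketch computes identity from two sequences that have already been globally aligned (for example by EMBOSS Needle). It does not perform the alignment itself. The function names, the toy sequences, and the choice of the total number of alignment columns as denominator are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption for this sketch: the denominator is the total number of
    alignment columns, gap columns included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues are identical
    # and neither of them is a gap character.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 70.0
) -> bool:
    """True if the identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment, not taken from any SEQ ID NO.
reference = "MKT-AYIAKQ"
candidate = "MKTLAYLAKQ"
identity = percent_identity(reference, candidate)
```

With these toy sequences, 8 of the 10 alignment columns match, so the candidate would meet an “at least 80%” threshold but not an “at least 90%” threshold.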

As used in this specification, the term "about" refers to a range of values ± 10% of the specified value. For example, "about 20" includes ± 10 % of 20, or from 18 to 22. Preferably, the term“about” refers to a range of values ± 5 % of the specified value.
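The arithmetic behind this definition can be sketched in a few lines of Python. The function name and its parameters are illustrative assumptions and are not part of the specification.

```python
from typing import Tuple


def about_range(value: float, tolerance_percent: float = 10.0) -> Tuple[float, float]:
    """Return the (lower, upper) bounds covered by "about value".

    tolerance_percent is 10 for the general definition and 5 for the
    preferred, narrower definition.
    """
    delta = abs(value) * tolerance_percent / 100.0
    return (value - delta, value + delta)


general = about_range(20)          # general definition, ± 10 %
preferred = about_range(20, 5.0)   # preferred definition, ± 5 %
```

For the value 20, the general definition gives the range 18 to 22 and the preferred definition gives the range 19 to 21.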

The terms “peptide”, “oligopeptide”, “polypeptide” and “protein” are employed interchangeably and refer to a chain of amino acids linked by peptide bonds, regardless of the number of amino acids forming said chain. The term “wild-type protein”, as used herein, refers to the non-mutated version of a polypeptide as it appears naturally in a species.

The amino acids are herein represented by their one-letter or three-letter code according to the following nomenclature: A: alanine (Ala); C: cysteine (Cys); D: aspartic acid (Asp); E: glutamic acid (Glu); F: phenylalanine (Phe); G: glycine (Gly); H: histidine (His); I: isoleucine (Ile); K: lysine (Lys); L: leucine (Leu); M: methionine (Met); N: asparagine (Asn); P: proline (Pro); Q: glutamine (Gln); R: arginine (Arg); S: serine (Ser); T: threonine (Thr); V: valine (Val); W: tryptophan (Trp) and Y: tyrosine (Tyr).

The term “substitution”, as used herein in relation to a position or amino acid, means that the amino acid in the particular position has been replaced by another amino acid or that an amino acid different from the one of the wild-type protein is present. Preferably, the term “substitution” refers to the replacement of an amino acid residue by another selected from the naturally-occurring standard 20 amino acid residues, rare naturally occurring amino acid residues (e.g. hydroxyproline, hydroxylysine, allohydroxylysine, 6-N-methyllysine, N-ethylglycine, N-methylglycine, N-ethylasparagine, allo-isoleucine, N-methylisoleucine, N-methylvaline, pyroglutamine, aminobutyric acid, ornithine), and non-naturally occurring amino acids, often made synthetically (e.g. norleucine, norvaline and cyclohexyl-alanine). More preferably, the term “substitution” refers to the replacement of an amino acid residue by another selected from the naturally-occurring standard 20 amino acid residues (G, P, A, V, L, I, M, C, F, Y, W, H, K, R, Q, N, E, D, S and T). In the present document, the following terminology is used to designate a substitution: A207G denotes that the alanine (A) residue at position 207 is changed to a glycine (G).
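The substitution terminology can be made concrete with a short Python sketch that parses a code such as A207G and applies it to a sequence. The helper names and the toy sequence are hypothetical. The sketch only handles the standard 20 one-letter codes and numbers positions from 1, as in the terminology above.

```python
import re
from typing import NamedTuple

STANDARD_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"
_CODE = re.compile(
    rf"([{STANDARD_RESIDUES}])([1-9]\d*)([{STANDARD_RESIDUES}])"
)


class Substitution(NamedTuple):
    original: str      # residue expected in the starting sequence
    position: int      # 1-based position in the sequence
    replacement: str   # residue present after substitution


def parse_substitution(code: str) -> Substitution:
    """Parse a code such as 'A207G' into its three parts."""
    match = _CODE.fullmatch(code)
    if match is None:
        raise ValueError(f"not a valid substitution code: {code!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, code: str) -> str:
    """Return `sequence` with the substitution described by `code` applied."""
    sub = parse_substitution(code)
    index = sub.position - 1  # convert 1-based position to 0-based index
    if index >= len(sequence):
        raise ValueError("position lies beyond the end of the sequence")
    if sequence[index] != sub.original:
        raise ValueError(
            f"expected {sub.original} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.replacement + sequence[index + 1:]


# Hypothetical four-residue toy sequence, used only to show the mechanics.
mutated = apply_substitution("MKAL", "A3G")
```

Checking the original residue before replacing it guards against applying a substitution code to the wrong sequence or with an offset numbering.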

The term “variant”, as used herein, refers to a polypeptide which is derived from a wild-type protein and comprises an alteration, i.e., a substitution, insertion, and/or deletion, at one or more (e.g., several) positions. The term “deletion”, used in relation to a position or an amino acid, means that the amino acid in the particular position has been deleted or is absent. The term “insertion”, used in relation to a position or amino acid, means that one or more amino acids have been inserted or are present adjacent to and immediately following the amino acid occupying the particular position. The variant may be obtained by various techniques well known in the art. In particular, examples of techniques for altering the DNA sequence encoding the wild-type protein include, but are not limited to, site-directed mutagenesis, random mutagenesis and synthetic oligonucleotide construction.

As used herein, the terms “polyketide synthase” and “PKS” refer to a multi-domain enzyme or enzyme complex that catalyzes the sequential condensation of simple acetate units to produce polyketides. PKSs are classified into three types on the basis of their domain structures. The term “type III PKS” or “type III polyketide synthase” refers to self-contained enzymes that form homodimers wherein each monomer catalyzes the priming and extension reactions iteratively to form polyketide products. These enzymes utilize CoA thioesters as substrates without the involvement of an acyl carrier protein. Type III PKSs may be found in plants, bacteria and fungi and, based on general knowledge, a skilled person can determine whether a specific PKS is a type I, type II or type III PKS. In preferred embodiments, the terms “polyketide synthase” and “PKS” refer to an octaketide synthase.

As used herein, the term “octaketide” refers to a polyketide chain consisting of eight ketide units. In preferred embodiments, this term refers to a non-reduced octaketide, i.e. an octaketide wherein none of the ketones originating from the starter or extender units has been reduced (to an alcohol, alkene or alkane group).

As used herein, the terms “octaketide synthase” and “OKS” refer to a type III PKS catalyzing the sequential condensations of acyl-CoA precursors, e.g. eight molecules of malonyl-CoA, to produce octaketides. Preferably, this term refers to an enzyme which is able to carry out sequential condensations of eight molecules of malonyl-CoA to yield octaketides SEK4 and SEK4b (Figure 1). The enzyme exhibiting OKS activity used in the present invention has to be able to produce octaketides but may further produce polyketides of different chain lengths, in particular of less than eight ketide units. In preferred embodiments, the OKS used in the present invention produces octaketides as major products.

As used herein, the term“post-PKS tailoring enzyme” or“tailoring enzyme” or “post-PKS enzyme” refers to an enzyme involved in the modification of a polyketide produced by a PKS. Preferably, this term refers to an enzyme involved in the modification of an octaketide produced by an OKS as defined above, preferably to yield an anthraquinone.

As used herein, the term “ketoreductase” or “KR” or “post-PKS tailoring ketoreductase” or “post-PKS ketoreductase” refers to a post-PKS tailoring enzyme which is able to modify an octaketide produced by an OKS by converting a ketone, preferably the C-9 ketone, to a secondary alcohol. Ketoreductase activity can be assessed by any method known by the skilled person. For example, this activity can be detected by transforming a host cell expressing a functional OKS, e.g. an E. coli strain expressing PKS4 from Aloe arborescens (see the experimental section), with a plasmid encoding a ketoreductase or putative ketoreductase, and assessing the production of mutactin (Figure 1), said production being indicative of a post-PKS tailoring enzyme having ketoreductase activity. Alternatively, this activity can be assessed by transforming Streptomyces coelicolor CH999 (in which the entire act gene cluster has been deleted; McDaniel et al. J. Am. Chem. Soc. 1993, 115, 11671) with a plasmid comprising genes encoding the minimal PKS (actI-1, actI-2 and actI-3 genes) of Streptomyces coelicolor and a gene encoding a ketoreductase or putative ketoreductase, and assessing the production of mutactin (McDaniel et al. J. Am. Chem. Soc. 1994, 116, 11855).

As used herein, the term “aromatase” or “ARO” or “post-PKS tailoring aromatase” or “post-PKS aromatase” refers to an enzyme which is able to catalyze the aromatization of the first six-membered ring of an octaketide produced by an OKS and optionally modified by a post-PKS tailoring ketoreductase. Aromatase activity can be assessed by any method known by the skilled person. For example, this activity can be detected by transforming a host cell expressing functional OKS and KR, e.g. an E. coli strain expressing PKS4 from Aloe arborescens and KR (actIII) from S. coelicolor (see the experimental section), with a plasmid encoding an aromatase or putative aromatase, and assessing the production of SEK34 (Figure 1), said production being indicative of a post-PKS tailoring enzyme having aromatase activity. Alternatively, this activity can be assessed by transforming Streptomyces coelicolor CH999 (in which the entire act gene cluster has been deleted; McDaniel et al. J. Am. Chem. Soc. 1993, 115, 11671) with a plasmid comprising genes encoding the minimal PKS (actI-1, actI-2 and actI-3 genes) of Streptomyces coelicolor, the actIII (KR) gene of Streptomyces coelicolor and a gene encoding an aromatase or putative aromatase, and assessing the production of SEK34 (McDaniel et al. J. Am. Chem. Soc. 1994, 116, 11855).

As used herein, the term “cyclase” or “CYC” or “post-PKS tailoring cyclase” or “post-PKS cyclase” refers to an enzyme which is able to catalyze an intramolecular aldol condensation to form the second ring of an octaketide produced by an OKS and modified by a post-PKS aromatase and optionally a post-PKS ketoreductase. Cyclase activity can be assessed by any method known by the skilled person. For example, this activity can be detected by transforming a host cell expressing functional OKS, KR, and ARO, e.g. an E. coli strain expressing PKS4 from Aloe arborescens, KR (actIII) and ARO (actVII) from S. coelicolor (see the experimental section), with a plasmid encoding a cyclase or putative cyclase, and assessing the production of DMAC or AL2 (Figure 1), said production being indicative of a post-PKS tailoring enzyme having cyclase activity. Alternatively, this activity can be assessed by transforming Streptomyces coelicolor CH999 (in which the entire act gene cluster has been deleted; McDaniel et al. J. Am. Chem. Soc. 1993, 115, 11671) with a plasmid comprising genes encoding the minimal PKS (actI-1, actI-2 and actI-3 genes) of Streptomyces coelicolor, the actIII (KR) and actVII (ARO) genes of Streptomyces coelicolor and a gene encoding a cyclase or putative cyclase, and assessing the production of DMAC or AL2 (McDaniel et al. J. Am. Chem. Soc. 1994, 116, 11855).

As used herein, the term “anthraquinone”, “anthraquinone derivative” or “anthraquinone compound” refers to an aromatic polycyclic hydrocarbon having the following backbone of formula (A):

(A).

Preferably, this term refers to an aromatic polycyclic hydrocarbon having the backbone of formula (A) and wherein the carbon in position 6 is unsubstituted.

In particular, an anthraquinone may be a compound having the following formula

(I)

wherein R1, R2, R3, R4, R5, R6, R7 and R8 are independently selected and each R is a substituted or unsubstituted, saturated or unsaturated, cyclic or acyclic, aliphatic or aromatic hydrocarbon group, optionally comprising one or more heteroatoms (such as O, N, S), a hydroxyl group or hydrogen, wherein preferably R2 is methyl and R7 is hydrogen. In preferred embodiments, this term refers to DMAC and/or AL2 (Figure 1).

As used herein, the term“hydrocarbon group” refers to a chemical group comprising at least one carbon atom and at least one hydrogen atom.

In a first aspect, the present invention relates to a recombinant host cell comprising a nucleic acid sequence encoding a type III polyketide synthase (PKS) which is an octaketide synthase (OKS), and wherein said host cell exhibits post-PKS tailoring activities including ketoreductase, aromatase and cyclase activities.
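The activity assays given in the definitions each pair an expression background with a marker product: mutactin for ketoreductase activity, SEK34 for aromatase activity, and DMAC or AL2 for cyclase activity. The following Python sketch restates that pairing as a lookup table. The data structure, the function name and the way detected products are supplied (for instance as compound names identified in a culture extract) are assumptions for illustration and do not describe an analytical method.

```python
# Each assay: enzymes already expressed in the test strain (background)
# and the product(s) whose detection indicates the tested activity.
ASSAYS = {
    "KR": {"background": ("OKS",), "markers": {"mutactin"}},
    "ARO": {"background": ("OKS", "KR"), "markers": {"SEK34"}},
    "CYC": {"background": ("OKS", "KR", "ARO"), "markers": {"DMAC", "AL2"}},
}


def activity_indicated(activity: str, detected_products) -> bool:
    """True if any marker product of the given assay was detected.

    `detected_products` is an iterable of compound names found in the
    strain that expresses the assay background plus the candidate enzyme.
    """
    if activity not in ASSAYS:
        raise KeyError(f"unknown activity: {activity!r}")
    return bool(ASSAYS[activity]["markers"] & set(detected_products))


# Hypothetical result: the OKS strain carrying a candidate KR plasmid
# yields the shunt octaketide SEK4 together with mutactin.
kr_result = activity_indicated("KR", ["SEK4", "mutactin"])
```

Keeping the background alongside the marker makes explicit that each readout is only meaningful in a strain that already provides the upstream activities.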

The recombinant host cell may be a bacterium, a yeast, a filamentous fungus or a plant cell. Preferably, the recombinant host cell is a bacterium, a yeast or a filamentous fungus. More preferably, the recombinant host cell is a bacterium or a yeast. Even more preferably, the recombinant host cell is a bacterium.

In a preferred embodiment, the recombinant host cell is a bacterium. The recombinant host cell of the invention may be any Gram-positive or Gram-negative bacterium. Examples of suitable bacteria include, but are not limited to, bacteria of the genus Escherichia (e.g. Escherichia coli), Streptomyces, Bacillus, Cupriavidus, Mycobacterium, Kitasatospora, Luteipulveratus, Thermobifida, Thermomonospora, Frankia, Pseudonocardia, Saccharothrix, Kutzneria, Lentzea, Prauserella, Salinispora, Micromonospora, Actinoplanes, Catenulispora, Mycolicibacterium, Dietzia, Aeromicrobium, Nonomuraea, Blastococcus, Modestobacter, Saccharopolyspora, Amycolatopsis, Actinopolyspora, Acidimicrobium, Photorhabdus, Hoeflea, Azospirillum, Crinalium or Cylindrospermum. Preferably, the recombinant host cell of the invention is selected from bacteria of the genus Escherichia, Streptomyces, Bacillus and Cupriavidus. More preferably, the recombinant host cell of the invention is selected from Escherichia coli, Streptomyces coelicolor, Bacillus subtilis and Cupriavidus necator. In a particular embodiment, the recombinant host cell is a Gram-positive bacterium, preferably a bacterium of the genus Streptomyces, Bacillus, Mycobacterium, Kitasatospora, Luteipulveratus, Thermobifida, Thermomonospora, Frankia, Pseudonocardia, Saccharothrix, Kutzneria, Lentzea, Prauserella, Salinispora, Micromonospora, Actinoplanes, Catenulispora, Mycolicibacterium, Dietzia, Aeromicrobium, Nonomuraea, Blastococcus, Modestobacter, Saccharopolyspora, Amycolatopsis, Actinopolyspora or Acidimicrobium, more preferably a bacterium of the genus Streptomyces or Bacillus. In particular, the recombinant host cell may be selected from Streptomyces coelicolor and Bacillus subtilis, and is preferably Streptomyces coelicolor.

In another particular embodiment, the recombinant host cell is a Gram-negative bacterium, preferably a bacterium of the genus Escherichia, Cupriavidus, Photorhabdus, Hoeflea, Azospirillum, Crinalium or Cylindrospermum, more preferably a bacterium of the genus Escherichia, preferably Escherichia coli, or Cupriavidus. In particular, the recombinant bacterium may be selected from Escherichia coli and Cupriavidus necator.

In preferred embodiments, the recombinant host cell is Escherichia coli.

In another embodiment, the recombinant host cell is a yeast. The yeast may be selected from yeasts of the genus Saccharomyces, Yarrowia, Rhodotorula, Schizosaccharomyces, Hansenula, Kluyveromyces, Pichia, Lipomyces, Debaryomyces or Candida.

Preferably, the yeast is selected from yeasts of the genus Saccharomyces, Yarrowia or Rhodotorula, in particular from the group consisting of Saccharomyces cerevisiae, Yarrowia lipolytica and Rhodotorula glutinis, and more preferably from yeasts of the genus Saccharomyces, in particular Saccharomyces cerevisiae.

In a further embodiment, the recombinant host cell is a filamentous fungus. “Filamentous fungi” include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK). The filamentous fungi are generally characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. The filamentous fungus may be selected from fungi of the genus Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes or Trichoderma. Preferably, the fungus is selected from fungi of the genus Aspergillus.

In a further embodiment, the recombinant host cell is a plant cell. The plant cell may be a cell of a plant belonging to a genus selected from Nicotiana (preferably Nicotiana benthamiana), Hypericum (preferably Hypericum perforatum), Aloe (preferably Aloe arborescens), Rheum (preferably Rheum palmatum), Curcuma (preferably Curcuma longa), Gerbera, Medicago, Arachis, Phalaenopsis, Wachendorfia, Ipomoea, Drosophyllum, Plumbago, Ruta, Hydrangea, Humulus, Vitis, Pisum, Phaseolus, Pueraria, Pinus, Oryza, Zea, Petunia or Camellia.

Preferably, the plant cell is a cell from Aloe, Rheum or Hypericum species, more preferably from Aloe arborescens, Rheum palmatum or Hypericum perforatum, and even more preferably from Aloe arborescens or Rheum palmatum.

Preferably, the recombinant host cell does not comprise a heterologous gene encoding a phosphopantetheinyl transferase and/or does not overexpress a sigma factor. More preferably, the recombinant host cell is an Escherichia coli bacterium which does not comprise a heterologous gene encoding a phosphopantetheinyl transferase and/or does not overexpress a sigma factor.

The recombinant host cell of the invention comprises a nucleic acid sequence encoding a type III PKS which is an OKS. The recombinant host cell may naturally express an OKS, e.g. when the host cell is Aloe arborescens. Otherwise, the recombinant host cell comprises a heterologous nucleic acid encoding an OKS. In preferred embodiments, the recombinant host cell comprises a heterologous nucleic acid encoding an OKS. Said OKS may be any known OKS, in particular may be selected from known plant OKS and variants thereof. Preferably, said variants exhibit OKS activity and have at least 70%, preferably at least 80%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of said OKS. Preferably, the OKS is from a plant belonging to Aloe, Hypericum, Gerbera, Rheum, Medicago, Arachis, Phalaenopsis, Wachendorfia, Ipomoea, Drosophyllum, Plumbago, Ruta, Hydrangea, Humulus, Vitis, Pisum, Phaseolus, Pueraria, Pinus, Oryza, Zea, Petunia or Camellia genus. More preferably, the OKS is from a plant belonging to Aloe or Hypericum genus. Alternatively, the OKS may be a variant of a PKS producing polyketides consisting of fewer than eight ketide units. Indeed, it was previously shown that large-to-small substitutions of the active-site residue corresponding to Thr197 in chalcone synthase of Medicago sativa (Uniprot accession number: P30074.1) in such PKSs resulted in the formation of longer chain length polyketides. The active-site residue of the PKS of interest corresponding to Thr197 in chalcone synthase of Medicago sativa can be easily identified using any sequence alignment software. As an illustration, the A207G mutant of PKS3 (also named aloesone synthase, Uniprot accession number: C4MBZ5) loses the heptaketide-forming activity and produces octaketides as major products (Mizuuchi et al. 2009, FEBS Journal, 2391-2401). Similarly, the A198G mutant of the aloesone synthase from Rheum palmatum and the M207G mutant of the pentaketide chromone synthase from Aloe arborescens produce octaketides. Thus, the OKS may be a variant of a PKS producing polyketides consisting of fewer than eight ketide units wherein the active-site residue corresponding to Thr197 in chalcone synthase of Medicago sativa (Uniprot accession number: P30074; SEQ ID NO: 39) is substituted by a glycine residue.

In a particular embodiment, the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of PKS4 (also named OKS2, Uniprot accession number: C4NF90, SEQ ID NO: 1), PKS5 (also named OKS3, Uniprot accession number: C4NF91, SEQ ID NO: 2), PKS3 (also named aloesone synthase, Uniprot accession number: C4MBZ5, SEQ ID NO: 3) and the A207G (SEQ ID NO: 4) mutant thereof (Mizuuchi et al. 2009, FEBS Journal, 2391-2401), the M207G (SEQ ID NO: 5) mutant of the pentaketide chromone synthase (Abe et al., 2005 J. Am. Chem. Soc., 127 (5), pp 1362-1363) and OKS1 (Uniprot accession number: Q3L7F5, SEQ ID NO: 6) from Aloe arborescens, the aloesone synthase from Rheum palmatum (Genbank accession number: AAS87170.1, SEQ ID NO: 40) and the A198G mutant thereof (SEQ ID NO: 7), PKS2 (Uniprot accession number: A8QW47, SEQ ID NO: 8) from Hypericum perforatum, and variants thereof exhibiting OKS activity and having at least 70%, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 1 to 8. Preferably, the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8, and variants thereof exhibiting OKS activity and having at least 70%, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 1 to 8. In particular, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8 and 40, and variants thereof, said variants exhibiting OKS activity and comprising, or consisting of, a sequence that differs from a sequence set forth in any of SEQ ID NO: 1 to 8 and 40 by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions, insertions and/or deletions, preferably by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions. More particularly, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8, and variants thereof, said variants exhibiting OKS activity and comprising, or consisting of, a sequence that differs from a sequence set forth in any of SEQ ID NO: 1 to 8 by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions, insertions and/or deletions, preferably by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions.
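The paragraph above notes that the active-site residue corresponding to Thr197 of the Medicago sativa chalcone synthase can be located in a PKS of interest with any sequence alignment software. As a rough sketch of the bookkeeping involved, the Python function below maps a 1-based position in a reference sequence to the aligned position in a target sequence, assuming the pairwise alignment has already been computed and is supplied as two gapped strings of equal length. The function name and the toy sequences are hypothetical and are not real enzyme sequences.

```python
from typing import Optional


def corresponding_position(
    aligned_ref: str, aligned_target: str, ref_position: int
) -> Optional[int]:
    """Map a 1-based reference position to the aligned target position.

    Returns the 1-based position in the ungapped target sequence, or
    None if the reference residue is aligned to a gap in the target.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    if ref_position < 1:
        raise ValueError("positions are 1-based")
    ref_count = 0
    target_count = 0
    for ref_res, target_res in zip(aligned_ref, aligned_target):
        # Count residues (not gaps) seen so far in each sequence.
        if ref_res != "-":
            ref_count += 1
        if target_res != "-":
            target_count += 1
        if ref_res != "-" and ref_count == ref_position:
            return target_count if target_res != "-" else None
    raise ValueError("ref_position lies beyond the reference sequence")


# Hypothetical toy alignment: the target carries one extra residue (Q),
# so reference position 3 (T) corresponds to target position 4.
position = corresponding_position("MK-TAL", "MKQTGL", 3)
```

In practice the two gapped strings would come from aligning the PKS of interest against the chalcone synthase reference, and the returned position would identify the residue to substitute with glycine.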

Preferably, the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 7, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 1 to 7.

More preferably, the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of OKS from Aloe genus, preferably Aloe arborescens. In particular, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 6, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 1 to 6.

In a particular embodiment, the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6, 7 and 40, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 1, 6, 7 or 40. Preferably, the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6 and 7, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 1, 6 or 7.

In a more particular embodiment, the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, and variants thereof exhibiting OKS activity and having at least 70%, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 1. Alternatively, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 6, and variants thereof exhibiting OKS activity and having at least 70%, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 6. Alternatively, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 7, and variants thereof exhibiting OKS activity and having at least 70%, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 7. Alternatively, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 40, and variants thereof exhibiting OKS activity and having at least 70%, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 40. In particular, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6, 7 and 40. More particularly, the OKS may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6 and 7.

The recombinant host cell of the invention exhibits post-PKS tailoring activities including ketoreductase, aromatase and cyclase activities. Preferably, the recombinant host cell of the invention comprises a nucleic acid sequence encoding a post-PKS aromatase, a nucleic acid sequence encoding a post-PKS cyclase and a nucleic acid sequence encoding a post-PKS ketoreductase.

The recombinant bacterium of the invention may naturally express one or several of post-PKS aromatase, post-PKS cyclase, and post-PKS ketoreductase, e.g. when the bacterium is Streptomyces coelicolor. Otherwise, the recombinant bacterium of the invention comprises a heterologous nucleic acid encoding a post-PKS aromatase and/or a heterologous nucleic acid encoding a post-PKS cyclase and/or a heterologous nucleic acid encoding a post-PKS ketoreductase. In preferred embodiments, the recombinant bacterium of the invention comprises a heterologous nucleic acid encoding a post-PKS aromatase, a heterologous nucleic acid encoding a post-PKS cyclase and a heterologous nucleic acid encoding a post-PKS ketoreductase, i.e. expresses heterologous post-PKS aromatase, heterologous post-PKS cyclase and heterologous post-PKS ketoreductase.

The recombinant host cell of the invention comprises a nucleic acid sequence encoding an enzyme exhibiting post-PKS aromatase activity, preferably a heterologous nucleic acid sequence encoding an enzyme exhibiting post-PKS aromatase activity. The post-PKS aromatase may be any known post-PKS aromatase, in particular may be selected from known bacterial post-PKS aromatases and variants thereof. Preferably, said variants exhibit post-PKS aromatase activity and have at least 70%, preferably at least 80%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of said post-PKS aromatases. In an embodiment, the post-PKS aromatase comprises, or consists of, an amino acid sequence selected from the group consisting of actinorhodin polyketide synthase bifunctional cyclase/dehydratases from Streptomyces coelicolor (Uniprot accession number: Q02055, SEQ ID NO: 9), from Streptomyces lividans (Uniprot accession number: D6EWA4, SEQ ID NO: 10) and from Streptomyces albireticuli (Uniprot accession number: A0A1Z2KWC6, SEQ ID NO: 11), granaticin polyketide synthase bifunctional cyclase/dehydratase from Streptomyces leeuwenhoekii (Uniprot accession number: A0A0F7VNI8, SEQ ID NO: 12), OxyK from Streptomyces rimosus (Uniprot accession number: Q3S8Q5, SEQ ID NO: 13), SnoaE from Streptomyces nogalater (Uniprot accession number: Q54490, SEQ ID NO: 14), ZhuJ from Streptomyces sp. R1128 (Uniprot accession number: Q9F6D2, SEQ ID NO: 15), polyketide cyclase/dehydrase family protein from Hoeflea sp. IMCC20628 (Uniprot accession number: A0A0F7PM76, SEQ ID NO: 16), cyclase from Mycobacteroides saopaulense (Uniprot accession number: A0A1S4VRT7, SEQ ID NO: 17), aromatase from Photorhabdus luminescens subsp. laumondii (Uniprot accession number: Q7MZT7, SEQ ID NO: 18), and variants thereof exhibiting post-PKS aromatase activity and having at least 70%, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 9 to 18. In particular, the post-PKS aromatase may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 to 18, and variants thereof, said variants exhibiting post-PKS aromatase activity and comprising, or consisting of, a sequence that differs from a sequence set forth in any of SEQ ID NO: 9 to 18 by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions, insertions and/or deletions, preferably by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions. Preferably, the post-PKS aromatase comprises, or consists of, an amino acid sequence selected from Streptomyces post-PKS aromatases and variants thereof exhibiting post-PKS aromatase activity and having at least 70%, preferably at least 80%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of the Streptomyces post-PKS aromatases. In particular, the post-PKS aromatase may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 to 15, and variants thereof exhibiting post-PKS aromatase activity and having at least 70%, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 9 to 15.

More preferably, the post-PKS aromatase comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9, and variants thereof exhibiting post-PKS aromatase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 9. In a more particular embodiment, the post-PKS aromatase comprises, or consists of, the amino acid sequence of SEQ ID NO: 9.
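The identity thresholds recited above (at least 70 %, 80%, 90% and so on) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external tool and that identity is counted as identical columns over all alignment columns; the definition of sequence identity given elsewhere in the description governs, and this convention is only an assumption for illustration. The function names and the toy sequences are hypothetical and are not any of SEQ ID NO: 9 to 18.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical columns / alignment columns * 100.
    Columns where both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column counts as a match only if both residues are equal and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy alignment: one gap column out of ten.
    print(percent_identity("MKTA-IAKQR", "MKTAYIAKQR"))
    print(meets_identity_threshold("MKTA-IAKQR", "MKTAYIAKQR", 70.0))
```

Such a check covers only the sequence criterion; a variant must additionally exhibit post-PKS aromatase activity, which cannot be read off the sequence alone.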

The recombinant host cell of the invention comprises a nucleic acid sequence encoding an enzyme exhibiting post-PKS cyclase activity, preferably a heterologous nucleic acid sequence encoding an enzyme exhibiting post-PKS cyclase activity.

The post-PKS cyclase may be any known post-PKS cyclase, in particular may be selected from known bacterial post-PKS cyclases and variants thereof. Preferably, said variants exhibit post-PKS cyclase activity and have at least 70 %, preferably at least 80%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of said post-PKS cyclases.

In an embodiment, the post-PKS cyclase comprises, or consists of, an amino acid sequence selected from the group consisting of polyketide cyclases from Streptomyces coelicolor (Uniprot accession number: Q93IZ0, SEQ ID NO: 19), from Streptomyces peucetius (Uniprot accession number: O68500, SEQ ID NO: 20), from Streptomyces nogalater (Uniprot accession number: Q9RN54, SEQ ID NO: 21), from Streptomyces rimosus (Uniprot accession number: Q3S8Q2, SEQ ID NO: 22), from Streptomyces venezuelae (Uniprot accession number: Q9XCV6, SEQ ID NO: 23), from Streptomyces collinus (Uniprot accession number: S5VM78, SEQ ID NO: 24), ZhuI from Streptomyces sp. R1128 (Uniprot accession number: Q9F6D3, SEQ ID NO: 25), cyclase from Dietzia sp. JS16-p6b (Uniprot accession number: A0A2S0QEW5, SEQ ID NO: 26), Zn-dependent hydrolase from Nocardia nova SH22a (Uniprot accession number: W5TA90, SEQ ID NO: 27), Zn-dependent hydrolase glyoxylase from Mycobacterium rhodesiae (strain NBB3) (Uniprot accession number: G8RQN3, SEQ ID NO: 28) and variants thereof exhibiting post-PKS cyclase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 19 to 28. In particular, the post-PKS cyclase may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19 to 28, and variants thereof, said variants exhibiting post-PKS cyclase activity and comprising, or consisting of, a sequence that differs from a sequence set forth in any of SEQ ID NO: 19 to 28 by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions, insertions and/or deletions, preferably by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions.
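The criterion of differing from a reference sequence by 1 to 10 substitutions, insertions and/or deletions can be sketched as an edit-distance computation. The sketch below assumes that the number of differences is counted as the minimal number of single-residue edits (Levenshtein distance), which is one possible reading and not a definition taken from the description. The function names and toy sequences are hypothetical.

```python
def edit_distance(reference: str, variant: str) -> int:
    """Minimal number of single-residue substitutions, insertions and
    deletions turning `reference` into `variant` (Levenshtein distance)."""
    previous = list(range(len(variant) + 1))
    for i, ref_residue in enumerate(reference, start=1):
        current = [i]
        for j, var_residue in enumerate(variant, start=1):
            cost = 0 if ref_residue == var_residue else 1
            current.append(min(
                previous[j] + 1,         # residue deleted from the reference
                current[j - 1] + 1,      # residue inserted in the variant
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def within_variant_window(reference: str, variant: str,
                          max_changes: int = 10) -> bool:
    """True if the variant differs from the reference by 1..max_changes edits."""
    return 1 <= edit_distance(reference, variant) <= max_changes


if __name__ == "__main__":
    # Hypothetical toy peptides, not sequences from the sequence listing.
    print(edit_distance("MKTAYIAK", "MKTGYIAK"))  # one substitution
    print(edit_distance("MKTAYIAK", "MKTYIAK"))   # one deletion
    print(within_variant_window("MKTAYIAK", "MKTGYIAK"))
```

As with the identity thresholds, passing this check says nothing about whether the variant retains post-PKS cyclase activity.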

Preferably, the post-PKS cyclase comprises, or consists of, an amino acid sequence selected from Streptomyces post-PKS cyclases and variants thereof exhibiting post-PKS cyclase activity and having at least 70 %, preferably at least 80%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of Streptomyces post-PKS cyclases. In particular, the post-PKS cyclase may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19 to 25, and variants thereof exhibiting post-PKS cyclase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 19 to 25.

More preferably, the post-PKS cyclase comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19, and variants thereof exhibiting post-PKS cyclase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 19. In a more particular embodiment, the post-PKS cyclase comprises, or consists of, the amino acid sequence of SEQ ID NO: 19.

The recombinant host cell of the invention comprises a nucleic acid sequence encoding an enzyme exhibiting post-PKS ketoreductase activity, preferably a heterologous nucleic acid sequence encoding an enzyme exhibiting post-PKS ketoreductase activity.

The post-PKS ketoreductase may be any known post-PKS ketoreductase, in particular may be selected from known bacterial post-PKS ketoreductases and variants thereof. Preferably, said variants exhibit post-PKS ketoreductase activity and have at least 70 %, preferably at least 80%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of said post-PKS ketoreductases.

In an embodiment, the post-PKS ketoreductase comprises, or consists of, an amino acid sequence selected from the group consisting of ACTIII from Streptomyces coelicolor (Uniprot accession number: P16544, SEQ ID NO: 29), AknA from Streptomyces galilaeus (Uniprot accession number: Q9L553, SEQ ID NO: 30), OxyJ from Streptomyces rimosus (Uniprot accession number: Q3S8Q6, SEQ ID NO: 31), SnoaD from Streptomyces nogalater (Uniprot accession number: Q54491, SEQ ID NO: 32), ACTIII from Streptomyces lividans (Uniprot accession number: D6EWA8, SEQ ID NO: 33), ketoreductase from Streptomyces lincolnensis (Uniprot accession number: A0A1B1MKH3, SEQ ID NO: 34), ketoacyl reductase from Nocardia terpenica (Uniprot accession number: A0A291RGK2, SEQ ID NO: 35), ketoacyl reductase from Azospirillum brasilense (Uniprot accession number: A0A0P0FD24, SEQ ID NO: 36), ketoacyl reductase from Photorhabdus thracensis (Uniprot accession number: A0A0F7LPP9, SEQ ID NO: 37), 3-oxoacyl-[acyl-carrier-protein] reductase FabG from Escherichia coli (strain K12) (Uniprot accession number: P0AEK2, SEQ ID NO: 38) and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 29 to 38. In particular, the post-PKS ketoreductase may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 to 38, and variants thereof, said variants exhibiting post-PKS ketoreductase activity and comprising, or consisting of, a sequence that differs from a sequence set forth in any of SEQ ID NO: 29 to 38 by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions, insertions and/or deletions, preferably by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 substitutions.

Preferably, the post-PKS ketoreductase comprises, or consists of, an amino acid sequence selected from Streptomyces post-PKS ketoreductases and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 %, preferably at least 80%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of Streptomyces post-PKS ketoreductases. In particular, the post-PKS ketoreductase may comprise, or consist of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 to 34, and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 29 to 34.

More preferably, the post-PKS ketoreductase comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29, and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 29. In a more particular embodiment, the post-PKS ketoreductase comprises, or consists of, the amino acid sequence of SEQ ID NO: 29.

In most preferred embodiments, the recombinant host cell of the invention comprises a heterologous nucleic acid encoding a post-PKS aromatase and/or a heterologous nucleic acid encoding a post-PKS cyclase and/or a heterologous nucleic acid encoding a post-PKS ketoreductase, preferably a heterologous nucleic acid encoding a post-PKS aromatase, a heterologous nucleic acid encoding a post-PKS cyclase and a heterologous nucleic acid encoding a post-PKS ketoreductase.

In particular, the recombinant host cell may comprise

- a heterologous nucleic acid encoding a post-PKS aromatase comprising, or consisting of, an amino acid sequence selected from Streptomyces post-PKS aromatases and variants thereof exhibiting post-PKS aromatase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS aromatases, and/or

- a heterologous nucleic acid encoding a post-PKS cyclase comprising, or consisting of, an amino acid sequence selected from Streptomyces post-PKS cyclases and variants thereof exhibiting post-PKS cyclase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS cyclases, and/or

- a heterologous nucleic acid encoding a post-PKS ketoreductase comprising, or consisting of, an amino acid sequence selected from Streptomyces post-PKS ketoreductases and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS ketoreductases.

More particularly, the recombinant host cell may comprise

- a heterologous nucleic acid encoding a post-PKS aromatase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 to 15, and variants thereof exhibiting post-PKS aromatase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 9 to 15, and/or

- a heterologous nucleic acid encoding a post-PKS cyclase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19 to 25, and variants thereof exhibiting post-PKS cyclase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 19 to 25, and/or

- a heterologous nucleic acid encoding a post-PKS ketoreductase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 to 34, and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 29 to 34.

Preferably, said recombinant host cell further comprises a heterologous nucleic acid encoding an OKS wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8, preferably from SEQ ID NO: 1 to 7, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 1 to 8, preferably to any of SEQ ID NO: 1 to 7.

More particularly, the recombinant host cell may comprise

- a heterologous nucleic acid encoding a post-PKS aromatase comprising, or consisting of, an amino acid sequence selected from Streptomyces post-PKS aromatases and variants thereof exhibiting post-PKS aromatase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS aromatases, and

- a heterologous nucleic acid encoding a post-PKS cyclase comprising, or consisting of, an amino acid sequence selected from Streptomyces post-PKS cyclases and variants thereof exhibiting post-PKS cyclase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS cyclases, and

- a heterologous nucleic acid encoding a post-PKS ketoreductase comprising, or consisting of, an amino acid sequence selected from Streptomyces post-PKS ketoreductases and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 % sequence identity to any of Streptomyces post-PKS ketoreductases.

Even more particularly, the recombinant host cell may comprise

- a heterologous nucleic acid encoding a post-PKS aromatase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 to 15, and variants thereof exhibiting post-PKS aromatase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 9 to 15, and

- a heterologous nucleic acid encoding a post-PKS cyclase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19 to 25, and variants thereof exhibiting post-PKS cyclase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 19 to 25, and

- a heterologous nucleic acid encoding a post-PKS ketoreductase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 to 34, and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 29 to 34.

Preferably, said recombinant host cell further comprises a heterologous nucleic acid encoding an OKS wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8, preferably from SEQ ID NO: 1 to 7, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 1 to 8, preferably to any of SEQ ID NO: 1 to 7.

In preferred embodiments, the recombinant host cell comprises

- a heterologous nucleic acid encoding a post-PKS aromatase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9, and variants thereof exhibiting post-PKS aromatase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 9, and/or

- a heterologous nucleic acid encoding a post-PKS cyclase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19, and variants thereof exhibiting post-PKS cyclase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 19, and/or

- a heterologous nucleic acid encoding a post-PKS ketoreductase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29, and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 29.

Preferably, in these embodiments, the recombinant host cell further comprises a heterologous nucleic acid encoding an OKS wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6 and 7, preferably of SEQ ID NO: 1 or 6, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 1, 6 or 7, preferably to SEQ ID NO: 1 or 6.

In particular, the recombinant host cell may comprise

- a heterologous nucleic acid encoding a post-PKS aromatase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9, and variants thereof exhibiting post-PKS aromatase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 9, and

- a heterologous nucleic acid encoding a post-PKS cyclase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19, and variants thereof exhibiting post-PKS cyclase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 19, and

- a heterologous nucleic acid encoding a post-PKS ketoreductase comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29, and variants thereof exhibiting post-PKS ketoreductase activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 29.

Preferably, said recombinant host cell further comprises a heterologous nucleic acid encoding an OKS wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1, 6 and 7, preferably of SEQ ID NO: 1 or 6, and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, more preferably at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, and even more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 1, 6 or 7, preferably to SEQ ID NO: 1 or 6.

In a particular embodiment, the recombinant host cell is a bacterium, preferably E. coli, and comprises a heterologous nucleic acid encoding a post-PKS aromatase, a heterologous nucleic acid encoding a post-PKS cyclase and a heterologous nucleic acid encoding a post-PKS ketoreductase as defined above, and a heterologous nucleic acid encoding an OKS as defined above.

In a more particular embodiment, the recombinant host cell comprises a heterologous nucleic acid encoding a post-PKS aromatase comprising, or consisting of, the amino acid sequence of SEQ ID NO: 9, a heterologous nucleic acid encoding a post-PKS cyclase comprising, or consisting of, the amino acid sequence of SEQ ID NO: 19 and a heterologous nucleic acid encoding a post-PKS ketoreductase comprising, or consisting of, the amino acid sequence of SEQ ID NO: 29. Preferably said recombinant host cell is a bacterium, more preferably is E. coli. Preferably, in this embodiment, the recombinant host cell further comprises a heterologous nucleic acid encoding an OKS comprising, or consisting of, the amino acid sequence of SEQ ID NO: 1, 6 or 7.

OKS, aromatases, cyclases and/or ketoreductases used in the present invention may be used in the form of hybrid or fusion polypeptides in which the polypeptide exhibiting OKS, aromatase, cyclase or ketoreductase activity is fused at its N-terminus and/or C-terminus to another polypeptide. Techniques for producing fusion polypeptides are well known in the art, and include ligating the coding sequences encoding the polypeptide and the addition region of another polypeptide so that they are in frame and that expression of the fusion polypeptide is under control of the same promoter(s) and terminator. The addition region of the fusion polypeptide can be selected for example to enhance the stability of the enzyme or to promote the secretion of the fusion protein from a cell (such as an N-terminal hydrophobic signal peptide).

In embodiments wherein the recombinant host cell naturally expresses OKS, post-PKS aromatase, post-PKS cyclase and/or post-PKS ketoreductase, the endogenous gene(s) may be overexpressed. To increase the expression of a gene, the skilled person can use any known techniques such as increasing the copy number of the gene in the host cell, using a promoter inducing a high level of expression of the gene, i.e. a strong promoter, using elements stabilizing the corresponding messenger RNA or modifying Ribosome Binding Site (RBS) sequences and sequences surrounding them.

Nucleic acid sequences encoding heterologous polypeptides may be deduced from the sequence of the polypeptide and codon usage may be adapted according to the host cell in which the nucleic acids shall be transcribed. This step may be carried out according to methods well known to one of skill in the art. Nucleic acid sequences encoding OKS, ARO, CYC and/or KR may be comprised in one or several recombinant expression cassettes. Each expression cassette may comprise a gene encoding an OKS (OKS), a gene encoding a post-PKS aromatase (ARO), a gene encoding a post-PKS cyclase (CYC) and/or a gene encoding a post-PKS ketoreductase (KR). In particular, the recombinant host cell of the invention may comprise a recombinant expression cassette comprising (a) OKS, (b) ARO, (c) CYC, (d) KR, (e) OKS and ARO, (f) OKS and CYC, (g) OKS and KR, (h) ARO and CYC, (i) ARO and KR, (j) CYC and KR, (k) OKS, ARO and CYC, (l) OKS, ARO and KR, (m) OKS, CYC and KR, (n) ARO, CYC and KR, or (o) OKS, ARO, CYC and KR genes. In embodiments wherein the recombinant expression cassette comprises several genes, these genes may be expressed under the control of one or several promoters. In particular, each gene may be expressed under the control of a distinct promoter. In preferred embodiments, genes encoding ARO, CYC and KR are placed under the control of the same promoter and the gene encoding the OKS is placed under the control of a distinct promoter. Preferably, for operons comprising several genes, an RBS sequence is introduced before each gene.
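The cassette options (a) to (o) listed above correspond to every non-empty combination of the four genes OKS, ARO, CYC and KR, that is 2^4 - 1 = 15 combinations. The following sketch simply enumerates them in the same order as the list; the function names are hypothetical and the code is an illustration of the combinatorics, not part of the described method.

```python
from itertools import combinations
from string import ascii_lowercase

# Gene abbreviations as used in the description.
GENES = ("OKS", "ARO", "CYC", "KR")


def cassette_gene_sets(genes=GENES):
    """All non-empty gene combinations, ordered by size then by gene order."""
    return [combo
            for size in range(1, len(genes) + 1)
            for combo in combinations(genes, size)]


def labelled_cassettes(genes=GENES):
    """Map the labels (a), (b), ... onto the enumerated combinations."""
    return {f"({letter})": combo
            for letter, combo in zip(ascii_lowercase, cassette_gene_sets(genes))}


if __name__ == "__main__":
    for label, combo in labelled_cassettes().items():
        print(label, " and ".join(combo))
```

A host cell may carry several such cassettes, so the full set of four genes can be supplied by one cassette of type (o) or by any group of cassettes that together cover the genes not expressed endogenously.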

The(se) recombinant expression cassette(s) may be integrated into the genome of the host cell or may be maintained in an episomal form in an expression vector. In embodiments wherein the expression cassette(s) is(are) maintained in an episomal form, the expression vector may be present in the host cell in one or several copies, depending on the nature of the origin of replication.

Preferably, the recombinant expression cassette(s) is(are) integrated into the genome of the host cell. One or several copies of the genes may be introduced into the genome by methods of recombination, known to the expert in the field, including gene replacement.

The recombinant host cell may be further genetically modified in order to increase (by comparison to the wild-type host cell) the production of compounds used as starter units (e.g. malonyl-CoA) and/or extender units during the synthesis of the octaketide, for example by overexpression of some genes involved in the synthesis of starter and/or extender units or by deletion of some other genes in order to redirect the carbon flux toward the metabolic pathway of interest and/or to optimize the availability of the cofactor and other substrates of the metabolic pathway of interest, or may be genetically modified in order to produce non-naturally produced starter/extender units such as aromatic (e.g. 4-coumaroyl, cinnamoyl and benzoyl) and long-chain fatty acyl (e.g. n-hexanoyl, n-octanoyl, n-decanoyl, n-dodecanoyl, n-tetradecanoyl, n-hexadecanoyl, n-octadecanoyl and n-eicosanoyl) Coenzyme A or any other starter unit known to the skilled person (see e.g. US 2002/0045220).

The present invention further relates to a recombinant nucleic acid comprising a nucleic acid sequence encoding a type III PKS which is an OKS and a nucleic acid sequence encoding a post-PKS tailoring enzyme selected from the group consisting of an aromatase, a cyclase and a ketoreductase, preferably encoding a ketoreductase, more preferably encoding a ketoreductase and an aromatase or a cyclase, and even more preferably encoding a ketoreductase, an aromatase and a cyclase, operably linked to one or more control sequences that direct the expression of said nucleic acid sequences in a suitable host cell under conditions compatible with the control sequences.

In particular, the recombinant nucleic acid of the invention may comprise a nucleic acid sequence encoding an OKS and a nucleic acid sequence encoding a post-PKS aromatase; or a nucleic acid sequence encoding an OKS and a nucleic acid sequence encoding a post-PKS cyclase; or a nucleic acid sequence encoding an OKS and a nucleic acid sequence encoding a post-PKS ketoreductase; or a nucleic acid sequence encoding an OKS, a nucleic acid sequence encoding a post-PKS aromatase and a nucleic acid sequence encoding a post-PKS cyclase; or a nucleic acid sequence encoding an OKS, a nucleic acid sequence encoding a post-PKS aromatase and a nucleic acid sequence encoding a post-PKS ketoreductase; or a nucleic acid sequence encoding an OKS, a nucleic acid sequence encoding a post-PKS cyclase and a nucleic acid sequence encoding a post-PKS ketoreductase; or a nucleic acid sequence encoding an OKS, a nucleic acid sequence encoding a post-PKS aromatase, a nucleic acid sequence encoding a post-PKS cyclase and a nucleic acid sequence encoding a post-PKS ketoreductase.

OKS, ARO, CYC and KR may be as defined above, including preferred features and combinations.

The control sequence(s) may include one or several promoters that are recognized by a host cell or an in vitro expression system, preferably by a bacterial host cell. The promoter(s) may be any polynucleotide that shows transcriptional activity in the host cell including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either endogenous or heterologous to the host cell. The promoter may be a strong, weak, constitutive or inducible promoter.

Examples of suitable promoters in a bacterial host cell are the tac promoter (DeBoer et al., 1983, Proc. Natl. Acad. Sci. USA 80: 21-25) obtained from E. coli lac operon, the E. coli trc promoter (Egon et al., 1988, Gene 69: 301-315), the promoter of the prokaryotic beta-lactamase gene (Villa-Komaroff et al., 1978, Proc. Natl. Acad. Sci. USA 75: 3727-3731), the promoter of the L-arabinose operon of E. coli (Guzman et al., 1995, J Bacteriol. 1995 Jul;177(14)), and the constitutively active E. coli recA promoter lacking the LexA binding site (Brent and Ptashne, 1981, Proc Natl Acad Sci U S A. 1981 Jul;78(7)). Further promoters are described in "Useful proteins from recombinant bacteria" in Gilbert et al., 1980, Scientific American 242: 74-94, and in Sambrook et al., 2001, Molecular cloning: a laboratory manual, Third Edition, Cold Spring Harbor.

Examples of suitable promoters in a filamentous fungal host cell are promoters obtained from the genes for Aspergillus nidulans acetamidase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Aspergillus oryzae TAKA amylase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Fusarium oxysporum trypsin-like protease (WO 96/00787), Fusarium venenatum amyloglucosidase (WO 00/56900), Fusarium venenatum Daria (WO 00/56900), Fusarium venenatum Quinn (WO 00/56900), Rhizomucor miehei lipase, Rhizomucor miehei aspartic proteinase, Trichoderma reesei beta-glucosidase, Trichoderma reesei cellobiohydrolase I, Trichoderma reesei cellobiohydrolase II, Trichoderma reesei endoglucanase I, Trichoderma reesei endoglucanase II, Trichoderma reesei endoglucanase III, Trichoderma reesei endoglucanase IV, Trichoderma reesei endoglucanase V, Trichoderma reesei xylanase I, Trichoderma reesei xylanase II, Trichoderma reesei beta-xylosidase, and mutant, truncated, and hybrid promoters thereof.

Examples of suitable promoters in a yeast host cell are promoters obtained from the genes for Saccharomyces cerevisiae enolase (ENO-1), Saccharomyces cerevisiae galactokinase (GAL1), Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH1, ADH2/GAP), Saccharomyces cerevisiae triose phosphate isomerase (TPI), Saccharomyces cerevisiae metallothionein (CUP1), and Saccharomyces cerevisiae 3-phosphoglycerate kinase. Other useful promoters for yeast host cells are described in Romanos et al., 1992, Yeast 8: 423-488.

Examples of suitable promoters in a plant host cell are CaMV 35S promoter (Odell et al. (1985) Nature 313: 810-812), rice actin promoter (McElroy et al. (1990) Plant Cell 2: 163-171), ubiquitin promoter (Christian et al. (1989) Plant Mol. Biol. 18: 675-689), pEMU (Last et al. (1991) Theor Appl. Genet. 81: 581-588), MAS (Velten et al. (1984) EMBO J. 3: 2723-2730), ALS promoter (U.S. Application Serial No. 08/409,297), root specific promoters such as pPROl lO from rice, the kernel specific High Molecular Weight (HMW) promoter (Thomas and Flavell, 1990, Plant Cell. Dec;2(12):1171-80), leaf specific promoters such as pPEPc promoter (Jeanneau et al., 2002, Biochimie, 84, 1127-1135), or the Rubisco small subunit promoter (rbcS) (Katayama et al., 2000, Plant Mol. Biol. 44:99-106).

In the recombinant nucleic acid of the invention, the coding nucleic acid sequences may be expressed under the control of one or several promoters. In particular, each gene may be expressed under the control of a distinct promoter. In preferred embodiments, genes encoding ARO, CYC and KR are placed under the control of the same promoter and the gene encoding the OKS is placed under the control of a distinct promoter. Preferably, for operons comprising several genes, an RBS sequence is introduced before each gene.
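The preferred arrangement described above (ARO, CYC and KR under one promoter, OKS under a distinct promoter, and an RBS before each gene) can be written down as a simple data structure. This is a minimal sketch: the class name, the promoter labels `P_tailoring` and `P_oks`, and the generic `terminator` placeholder are hypothetical names chosen for illustration and do not designate specific promoters or terminators.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TranscriptionUnit:
    """One promoter driving one or several genes, each preceded by an RBS."""
    promoter: str
    genes: List[str]
    terminator: str = "terminator"  # placeholder label, an assumption

    def elements(self) -> List[str]:
        """Ordered list of genetic elements from 5' to 3'."""
        parts = [f"promoter:{self.promoter}"]
        for gene in self.genes:
            parts.append("RBS")            # an RBS is placed before each gene
            parts.append(f"gene:{gene}")
        parts.append(self.terminator)
        return parts


def preferred_layout(tailoring_promoter: str = "P_tailoring",
                     oks_promoter: str = "P_oks") -> List[TranscriptionUnit]:
    """ARO, CYC and KR share one promoter; OKS has a distinct promoter."""
    if tailoring_promoter == oks_promoter:
        raise ValueError("the OKS promoter must be distinct")
    return [
        TranscriptionUnit(tailoring_promoter, ["ARO", "CYC", "KR"]),
        TranscriptionUnit(oks_promoter, ["OKS"]),
    ]


if __name__ == "__main__":
    for unit in preferred_layout():
        print(" > ".join(unit.elements()))
```

The alternative in which each gene has its own promoter would correspond to four single-gene units in the same structure.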

The control sequences may also include a transcription terminator, which is recognized by a host cell, preferably a microbial host cell, more preferably a bacterial host cell, to terminate transcription. The terminator is operably linked to the 3'-terminus of the nucleic acid encoding the polypeptide. Any terminator that is functional in the host cell may be used in the present invention and can be easily chosen by the skilled person. Usually, the terminator is chosen in correlation with the promoter.
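The arrangement of promoters, RBS sequences and terminators described above can be illustrated with a minimal sketch. It is not part of the disclosed method: the class name `Operon` and the part labels `P_oks`, `P_tailoring` and `terminator` are hypothetical placeholders chosen for illustration, and the sketch assumes the preferred embodiment in which the OKS gene has its own promoter while ARO, CYC and KR share one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Operon:
    """Illustrative operon: one promoter, one or more genes, one terminator."""
    promoter: str            # hypothetical promoter label
    genes: List[str]         # coding sequences, in 5' to 3' order
    terminator: str = "terminator"  # hypothetical terminator label

    def parts(self) -> List[str]:
        """Return the ordered parts, with an RBS placed before each gene."""
        layout = [self.promoter]
        for gene in self.genes:
            layout.append("RBS")
            layout.append(gene)
        layout.append(self.terminator)
        return layout


# Assumed preferred embodiment: OKS under a distinct promoter,
# ARO, CYC and KR together under a second, shared promoter.
oks_operon = Operon(promoter="P_oks", genes=["OKS"])
tailoring_operon = Operon(promoter="P_tailoring", genes=["ARO", "CYC", "KR"])

if __name__ == "__main__":
    print(" - ".join(oks_operon.parts()))
    print(" - ".join(tailoring_operon.parts()))
```

Listing the parts in order makes it easy to check that every gene in a multi-gene operon is preceded by its own RBS and that the terminator sits at the 3' end.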

The control sequences may also include a signal peptide coding sequence that encodes a signal peptide linked to the N-terminus of an encoded polypeptide and directs the polypeptide into the cell's secretory pathway, i.e. for secretion into the extracellular (or periplasmic) space.

Optionally, the recombinant nucleic acid of the invention may also comprise a selectable marker that permits easy selection of recombinant host cells. Typically, the selectable marker is a gene encoding antibiotic resistance or conferring autotrophy.

The recombinant nucleic acid of the invention may be used directly to transform a host cell, preferably a microbial host cell, more preferably a bacterial host cell, and enable the expression of OKS, ARO, CYC and KR in said cell. Preferably, the recombinant nucleic acid of the invention is inserted in the genome of the host cell. The present invention also relates to an expression vector comprising a recombinant nucleic acid of the invention.

The expression vector of the invention may be used to transform a host cell, preferably a microbial host cell, more preferably a bacterial host cell, and enable the expression of the coding sequences of the recombinant nucleic acid of the invention in said cell. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vector may be an autonomously replicating vector, i.e., a vector that exists as an extra-chromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extra-chromosomal element, a mini-chromosome, or an artificial chromosome. The vector may contain any means for assuring self-replication. Alternatively, the vector may be one that, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. Preferably, the vector, or a part thereof comprising the recombinant nucleic acid of the invention, is inserted in the genome of the host cell.

The vector preferably comprises one or more selectable markers that permit easy selection of host cells comprising the vector. Typically, the selectable marker is a gene encoding antibiotic resistance or conferring autotrophy.

The vector preferably comprises an element that permits integration of the vector into the host cell's genome or autonomous replication of the vector in the cell independent of the genome. When integration into the host cell genome occurs, integration of the sequences into the genome may rely on homologous or non-homologous recombination. On the one hand, the vector may contain additional polynucleotides for directing integration by homologous recombination at a precise location into the genome of the host cell. These additional polynucleotides may be any sequence that is homologous with the target sequence in the genome of the host cell. On the other hand, the vector may be integrated into the genome of the host cell by non-homologous recombination.

For autonomous replication, the vector may further comprise an origin of replication enabling the vector to replicate autonomously in the host cell in question. The origin of replication may be any plasmid replicator mediating autonomous replication that functions in a cell. The methods for selecting these elements according to the host cell in which expression is desired, are well known to one of skill in the art. The vectors may be constructed by the classical techniques of molecular biology, well known to one of skill in the art.

The present invention further relates to the use of a recombinant nucleic acid or an expression vector of the invention to transform, transfect or transduce a cell. It also relates to a host cell comprising a recombinant nucleic acid or an expression vector of the invention. The host cell may be transformed, transfected or transduced in a transient or stable manner. A recombinant nucleic acid or vector of the invention is introduced into a host cell so that the recombinant nucleic acid or vector is maintained as a chromosomal integrant or as a self-replicating extra-chromosomal vector as described earlier.

Preferably, the host cell is a microbial host cell, more preferably a bacterial host cell, as defined above. The recombinant nucleic acid or expression vector according to the invention may be introduced into the host cell by any method known by the skilled person, such as electroporation, conjugation, transduction, competent cell transformation, protoplast transformation, protoplast fusion, biolistic "gene gun" transformation, PEG-mediated transformation, lipid-assisted transformation or transfection, chemically mediated transfection, lithium acetate-mediated transformation or liposome-mediated transformation. According to the nature of the host cell, the skilled person can easily choose a suitable method.

In a further aspect, the present invention relates to the use of a recombinant host cell of the invention, preferably a microbial host cell, more preferably a bacterial host cell to produce a compound which is an anthraquinone derivative.

The present invention also relates to a method of producing a compound which is an anthraquinone derivative, comprising culturing a recombinant host cell of the invention under conditions suitable to produce said compound, and optionally recovering said compound. The method may further comprise isolating or purifying said anthraquinone derivative. Anthraquinone derivatives may be isolated or purified using any method known by the skilled person such as liquid-liquid extraction, crystallization, precipitation, freeze-drying, drying, chromatography techniques, or the method described in Bartel et al., 1990, J. Bacteriol. Sep; 172(9).

All embodiments described above for the recombinant host cell of the invention are also contemplated in these aspects.

Preferably, the anthraquinone derivative is selected from the group consisting of 3,8-dihydroxy-1-methyl-anthraquinone-3-carboxylic acid (DMAC) and aloesaponarin II (AL2), and mixtures thereof.

Conditions suitable to produce an anthraquinone derivative may be easily determined by the skilled person according to the recombinant host cell used. In particular, the skilled person may easily choose a suitable culture medium and growth conditions according to the host cell. In a particular embodiment, the recombinant host cell is a microbial host cell and the culture medium is supplemented with one or several compounds that can be used as starter units by the OKS. Examples of such compounds include, but are not limited to, aromatic (e.g. 4-coumaroyl, cinnamoyl and benzoyl) and long-chain fatty acyl (e.g. n-hexanoyl, n-octanoyl, n-decanoyl, n-dodecanoyl, n-tetradecanoyl, n-hexadecanoyl, n-octadecanoyl and n-eicosanoyl) Coenzyme A or any other starter unit known by the skilled person (see e.g. US 2002/0045220). The use of diverse starter units for polyketide production can lead to the production of a wide variety of anthraquinones.

Notwithstanding the claims, the present invention is also defined by way of the following clauses.

Clause 1. A recombinant host cell which is a microbial or plant cell, comprising a nucleic acid sequence encoding a type III polyketide synthase (PKS) which is an octaketide synthase (OKS), a nucleic acid sequence encoding an enzyme exhibiting post-PKS aromatase activity, a nucleic acid sequence encoding an enzyme exhibiting post-PKS cyclase activity and a nucleic acid sequence encoding an enzyme exhibiting post-PKS ketoreductase activity, wherein at least one of these nucleic acid sequences is heterologous to the host cell.

Clause 2. The recombinant host cell of clause 1, wherein the host cell is a plant cell, preferably a cell from Aloe, Rheum or Hypericum species.

Clause 3. The recombinant host cell of clause 2, wherein the host cell is a cell from Aloe arborescens, Rheum palmatum or Hypericum perforatum, preferably from Aloe arborescens or Rheum palmatum.

Clause 4. The recombinant host cell of clause 1, wherein the host cell is a filamentous fungus, preferably selected from fungi of the genus Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis, Chrysosporium, Coprinus, Coriolus, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Phanerochaete, Phlebia, Piromyces, Pleurotus, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, Trametes or Trichoderma.

Clause 5. The recombinant host cell of clause 4, wherein the host cell is selected from fungi of the genus Aspergillus.

Clause 6. The recombinant host cell of clause 1, wherein the host cell is a yeast, preferably selected from yeasts of the genus Saccharomyces, Yarrowia, Rhodotorula, Schizosaccharomyces, Hansenula, Kluyveromyces, Pichia, Lipomyces, Debaryomyces or Candida.

Clause 7. The recombinant host cell of clause 6, wherein the host cell is selected from yeasts of the genus Saccharomyces, Yarrowia or Rhodotorula, preferably from the group consisting of Saccharomyces cerevisiae, Yarrowia lipolytica and Rhodotorula glutinis.

Clause 8. The recombinant host cell of clause 6, wherein the host cell is selected from yeasts of the genus Saccharomyces, preferably is Saccharomyces cerevisiae.

Clause 9. The recombinant host cell of clause 1, wherein the host cell is a bacterium.

Clause 10. The recombinant host cell of clause 9, wherein the bacterium is a Gram-positive bacterium, preferably a bacterium of the genus Streptomyces, Bacillus, Mycobacterium, Kitasatospora, Luteipulveratus, Thermobifida, Thermomonospora, Frankia, Pseudonocardia, Saccharothrix, Kutzneria, Lentzea, Prauserella, Salinispora, Micromonospora, Actinoplanes, Catenulispora, Mycolicibacterium, Dietzia, Aeromicrobium, Nonomuraea, Blastococcus, Modestobacter, Saccharopolyspora, Amycolatopsis, Actinopolyspora or Acidimicrobium.

Clause 11. The recombinant host cell of clause 9, wherein the bacterium is a bacterium of the genus Streptomyces or Bacillus, preferably selected from Streptomyces coelicolor and Bacillus subtilis.

Clause 12. The recombinant host cell of clause 9, wherein the bacterium is Streptomyces coelicolor.

Clause 13. The recombinant host cell of clause 9, wherein the bacterium is a Gram-negative bacterium, preferably a bacterium of the genus Escherichia, Cupriavidus, Photorhabdus, Hoeflea, Azospirillum, Crinalium or Cylindrospermum.

Clause 14. The recombinant host cell of clause 9, wherein the bacterium is a bacterium of the genus Escherichia or Cupriavidus, preferably Escherichia coli or Cupriavidus necator.

Clause 15. The recombinant host cell of clause 9, wherein the bacterium is Escherichia coli.

Clause 16. The recombinant host cell of any of clauses 1 to 15, wherein the nucleic acid sequence encoding a type III polyketide synthase (PKS) which is an octaketide synthase (OKS), is heterologous to the host cell.

Clause 17. The recombinant host cell of any of clauses 1 to 16, wherein the OKS is selected from known plant OKS and variants thereof.

Clause 18. The recombinant host cell of any of clauses 1 to 16, wherein the OKS is a variant of a PKS producing a polyketide consisting of less than eight ketide units, wherein the active-site residue corresponding to Thr197 in chalcone synthase of Medicago sativa (Uniprot accession number: P30074.1) is substituted by a glycine residue.

Clause 19. The recombinant host cell of any of clauses 1 to 16, wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 8 and 40, and variants thereof exhibiting OKS activity and having at least 70 % sequence identity to any of SEQ ID NO: 1 to 8 and 40, preferably selected from the group consisting of SEQ ID NO: 1 to 8, and variants thereof exhibiting OKS activity and having at least 70 % sequence identity to any of SEQ ID NO: 1 to 8.

Clause 20. The recombinant host cell of any of clauses 1 to 16, wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 to 7 and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 1 to 7.

Clause 21. The recombinant host cell of any of clauses 1 to 16, wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 1 and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 1.

Clause 22. The recombinant host cell of any of clauses 1 to 16, wherein the OKS comprises, or consists of, the amino acid sequence of SEQ ID NO: 1.

Clause 23. The recombinant host cell of any of clauses 1 to 16, wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 6 and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 6.

Clause 24. The recombinant host cell of any of clauses 1 to 16, wherein the OKS comprises, or consists of, the amino acid sequence of SEQ ID NO: 6.

Clause 25. The recombinant host cell of any of clauses 1 to 16, wherein the OKS comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 7 and variants thereof exhibiting OKS activity and having at least 70 %, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 7.

Clause 26. The recombinant host cell of any of clauses 1 to 16, wherein the OKS comprises, or consists of, the amino acid sequence of SEQ ID NO: 7.

Clause 27. The recombinant host cell of any of clauses 1 to 26, wherein the nucleic acid sequence encoding an enzyme exhibiting post-PKS aromatase activity is heterologous to the host cell.

Clause 27. The recombinant host cell of any of clauses 1 to 27, wherein the enzyme exhibiting post-PKS aromatase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 to 18, preferably SEQ ID NO: 9 to 15, and variants thereof exhibiting aromatase activity and having at least 70%, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 9 to 18, preferably to any of SEQ ID NO: 9 to 15.

Clause 28. The recombinant host cell of any of clauses 1 to 27, wherein the enzyme exhibiting post-PKS aromatase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 9 and variants thereof exhibiting aromatase activity and having at least 70%, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 9.

Clause 29. The recombinant host cell of any of clauses 1 to 28, wherein the enzyme exhibiting post-PKS aromatase activity comprises, or consists of, the amino acid sequence of SEQ ID NO: 9.

Clause 30. The recombinant host cell of any of clauses 1 to 29, wherein the nucleic acid sequence encoding an enzyme exhibiting post-PKS cyclase activity is heterologous to the host cell.

Clause 31. The recombinant host cell of any of clauses 1 to 30, wherein the enzyme exhibiting post-PKS cyclase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19 to 28, preferably SEQ ID NO: 19 to 25, and variants thereof exhibiting cyclase activity and having at least 70%, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 19 to 28, preferably to any of SEQ ID NO: 19 to 25.

Clause 32. The recombinant host cell of any of clauses 1 to 31, wherein the enzyme exhibiting post-PKS cyclase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 19 and variants thereof exhibiting cyclase activity and having at least 70%, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 19.

Clause 33. The recombinant host cell of any of clauses 1 to 32, wherein the enzyme exhibiting post-PKS cyclase activity comprises, or consists of, the amino acid sequence of SEQ ID NO: 19.

Clause 34. The recombinant host cell of any of clauses 1 to 33, wherein the nucleic acid sequence encoding an enzyme exhibiting post-PKS ketoreductase activity is heterologous to the host cell.

Clause 35. The recombinant host cell of any of clauses 1 to 34, wherein the enzyme exhibiting post-PKS ketoreductase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 to 38, preferably SEQ ID NO: 29 to 34, and variants thereof exhibiting ketoreductase activity and having at least 70%, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to any of SEQ ID NO: 29 to 38, preferably to any of SEQ ID NO: 29 to 34.

Clause 36. The recombinant host cell of any of clauses 1 to 35, wherein the enzyme exhibiting post-PKS ketoreductase activity comprises, or consists of, an amino acid sequence selected from the group consisting of SEQ ID NO: 29 and variants thereof exhibiting ketoreductase activity and having at least 70%, preferably at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%, more preferably at least 90%, 95%, 96%, 97%, 98% or 99%, sequence identity to SEQ ID NO: 29.

Clause 37. The recombinant host cell of any of clauses 1 to 36, wherein the enzyme exhibiting post-PKS ketoreductase activity comprises, or consists of, the amino acid sequence of SEQ ID NO: 29.

Clause 38. A recombinant nucleic acid comprising a nucleic acid sequence encoding a type III PKS which is an octaketide synthase and a nucleic acid sequence encoding a post-PKS tailoring enzyme selected from the group consisting of an aromatase, a cyclase and a ketoreductase, operably linked to one or more control sequences that direct the expression of said nucleic acid sequences in a suitable host cell under conditions compatible with the control sequences.

Clause 39. The recombinant nucleic acid of clause 38 which comprises a nucleic acid sequence encoding an OKS, a nucleic acid sequence encoding a post-PKS aromatase, a nucleic acid sequence encoding a post-PKS cyclase and a nucleic acid sequence encoding a post-PKS ketoreductase.

Clause 40. The recombinant nucleic acid of clause 38 or 39, wherein the OKS, the post-PKS aromatase, the post-PKS cyclase and the post-PKS ketoreductase are as defined in any of clauses 1 to 37.

Clause 41. An expression vector comprising a recombinant nucleic acid of any of clauses 38 to 40.

Clause 42. Use of a recombinant nucleic acid of any of clauses 38 to 40 or an expression vector of clause 41 to transform, transfect or transduce a host cell as defined in any of clauses 1 to 15.

Clause 43. A method of producing a compound which is an anthraquinone derivative, comprising culturing a recombinant host cell of any of clauses 1 to 37, under conditions suitable to produce said compound, and optionally recovering said compound.

Clause 44. Use of a recombinant host cell of any of clauses 1 to 37 to produce a compound which is an anthraquinone derivative.

Clause 45. Method of clause 43 or use of clause 44, wherein the anthraquinone derivative is selected from the group consisting of 3,8-dihydroxy-1-methyl-anthraquinone-3-carboxylic acid (DMAC) and aloesaponarin II (AL2), and mixtures thereof.
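Several of the clauses above set a threshold of at least 70 % sequence identity for variants. As a rough illustration only, the following sketch computes percent identity for two sequences that are assumed to be already globally aligned, with gaps written as "-", and counts identical positions over the full alignment length. The function names and the toy sequences are hypothetical, and the definition of sequence identity given in the description takes precedence over this simplified formula.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (gaps as '-').

    Assumption: identity = identical, non-gap positions divided by the
    total alignment length, times 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """Return True when percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy ten-residue sequences, not taken from any SEQ ID NO.
    reference = "MKTAYIAKQR"
    variant = "MKTAYLAKQR"
    print(percent_identity(reference, variant))
    print(meets_threshold(reference, variant))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only covers the final counting step.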

Further aspects and advantages of the present invention will be described in the following examples, which should be regarded as illustrative and not limiting.

EXAMPLES

Material and methods

Octaketide synthases

The following polyketide synthases (PKS) were used: octaketide synthase 1 (OKS, Uniprot accession number: Q3L7F5, SEQ ID NO: 6) and octaketide synthase 2 (PKS4, Uniprot accession number: C4NF90, SEQ ID NO: 1) from Aloe arborescens, the aloesone synthase from Rheum palmatum (ALSwt, SEQ ID NO: 40) and the A198G mutant of the aloesone synthase from Rheum palmatum (ALS, SEQ ID NO: 7). From the amino acid sequences, the nucleotide sequences were codon optimized for expression in Escherichia coli.

Tailoring enzymes

Three enzymes, a ketoreductase (KR), an aromatase (ARO) and a cyclase with a thioesterase activity (CYC-TE), were used as tailoring enzymes. The genes actIII, actVII and actIV from Streptomyces coelicolor (NCBI accession numbers: SCO5086, SCO5090 and SCO5091, respectively) code for the tailoring enzymes KR (SEQ ID NO: 29), ARO (SEQ ID NO: 9) and CYC-TE (SEQ ID NO: 19), respectively.

Construction of plasmids and bacterial strains

All plasmids and strains are presented in Table 1.

The genes coding for OKS, ALSwt, ALS and PKS4 were individually organized into an operon (PKS operon): for each PKS gene, an inducible promoter and a ribosome binding site (RBS) were operably linked to its 5'-terminus and a terminator sequence was operably linked to its 3'-terminus. Each PKS operon was introduced into the pBR322 plasmid by restriction ligation into the EcoRI and BamHI sites, resulting in the following plasmids: pILI051 (oks), pILI036 (pks4), pILI208 (als wt) and pILI039 (als).

The genes coding for the tailoring enzymes, actIII, actVII and actIV from Streptomyces coelicolor (S. coelicolor, SCO5086, SCO5090 and SCO5091, respectively), were regrouped on the same operon (tailoring enzyme operon): for each gene (i.e. actIII, actVII and actIV), an inducible promoter and a ribosome binding site (RBS) were operably linked to its 5'-terminus and a terminator sequence was operably linked to its 3'-terminus. The tailoring enzyme operon was introduced into the pILI036, pILI039 and pILI051 plasmids by restriction ligation with the SalI and EagI sites, resulting in the pILI040, pILI043 and pILI058 plasmids, respectively.

The pILI040, pILI043, pILI058 and pILI208 plasmids were introduced into the MG1655 strain resulting in the D46, D53, D68 and D105 strains, respectively.
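The construction steps described in the three paragraphs above can be traced with a small lookup sketch. It follows only the wording of those paragraphs (PKS plasmids, addition of the tailoring enzyme operon, transformation into MG1655) and does not attempt to model Table 1 where the table differs. The variable and function names are hypothetical, introduced for illustration.

```python
from typing import Dict, List

# PKS operon plasmids built in pBR322 (plasmid -> PKS gene), as stated in the text.
PKS_PLASMIDS: Dict[str, str] = {
    "pILI051": "oks",
    "pILI036": "pks4",
    "pILI208": "als wt",
    "pILI039": "als",
}

# Genes carried by the tailoring enzyme operon.
TAILORING_OPERON: List[str] = ["actIII", "actVII", "actIV"]

# Parent PKS plasmid -> plasmid obtained after adding the tailoring operon.
WITH_TAILORING: Dict[str, str] = {
    "pILI036": "pILI040",
    "pILI039": "pILI043",
    "pILI051": "pILI058",
}

# Plasmid introduced into MG1655 -> resulting strain.
STRAINS: Dict[str, str] = {
    "pILI040": "D46",
    "pILI043": "D53",
    "pILI058": "D68",
    "pILI208": "D105",
}


def plasmid_genes(plasmid: str) -> List[str]:
    """List the genes a plasmid carries according to the construction text."""
    if plasmid in PKS_PLASMIDS:
        return [PKS_PLASMIDS[plasmid]]
    for parent, child in WITH_TAILORING.items():
        if child == plasmid:
            return [PKS_PLASMIDS[parent]] + TAILORING_OPERON
    raise KeyError(f"unknown plasmid: {plasmid}")


def strain_genes() -> Dict[str, List[str]]:
    """Map each strain to the genes of the plasmid it received."""
    return {strain: plasmid_genes(plasmid) for plasmid, strain in STRAINS.items()}


if __name__ == "__main__":
    for strain, genes in strain_genes().items():
        print(strain, genes)
```

Keeping the three construction steps as separate mappings makes it straightforward to cross-check a strain's expected gene content against the plasmid it was derived from.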

Table 1: List of plasmids and strains

| Names | Description | Genes | References |
|---|---|---|---|
| Plasmids | | | |
| pBR322 | pBR322 ori, AmpR, Kan | | Bolivar et al., 1977 |
| pILI051 | pBR322 ori, AmpR | oks | This work |
| pILI039 | pBR322 ori, AmpR | als | This work |
| pILI036 | pBR322 ori, AmpR | pks4 | This work |
| pILI040 | pBR322 ori, AmpR | pks4, actIII, actVII, actIV | This work |
| pILI043 | pBR322 ori, AmpR | als, actIII, actVII, actIV | This work |
| pILI058 | pBR322 ori, AmpR | oks, actIII, actVII, actIV | This work |
| pILI208 | pBR322 ori, AmpR | als WT | This work |
| Strains | | | |
| MG1655 | Escherichia coli K-12 | | ATCC 47076 |
| D46 | MG1655-pILI040 | pks4, actIII, actVII, actIV | This work |
| D53 | MG1655-pILI043 | als, actIII, actVII, actIV | This work |
| D68 | MG1655-pILI057 | oks, actIII, actVII, actIV | This work |
| D105 | MG1655-pILI208 | als WT, actIII, actVII, actIV | This work |

Culture of strains

The D68, D53, D46 and D105 strains were characterized in flask culture in 25 mL of M9 minimal medium (Cold Spring Harb Protoc; 2010; doi: 10.1101/pdb.rec12295) at 28°C. Expression of genes encoding PKS and tailoring enzymes was induced when the cells reached an optical density at 600 nm of 2 (OD600 nm: 2). The culture was stopped after 48 h. During culture, growth was followed by monitoring OD600 nm. Samples of supernatants and pellets were taken at the end of the culture for analysis of 3,8-dihydroxy-1-methyl-anthraquinone-2-carboxylic acid (DMAC) and aloesaponarin II (AL2) production by the LC/TQMS method.

Results

As shown in Table 2, after 48 h of culture, substantial amounts of DMAC were produced in the pellet and supernatant of strains D68, D53, D46 and D105, while AL2 was only recovered from the pellet.

Table 2: Production of DMAC and AL2 by strains D68, D53, D46 and D105