

Title:
FLAVONOIDS
Document Type and Number:
WIPO Patent Application WO/2005/084305
Kind Code:
A2
Abstract:
The invention provides methods and materials related to producing flavonoids as well as other organic compounds. For example, the invention provides isolated nucleic acids, polypeptides, host cells, and methods and materials for producing flavonoids and other organic compounds.

Inventors:
SCHMIDT-DANNERT CLAUDIA (US)
WATTS KEVIN (US)
Application Number:
PCT/US2005/006587
Publication Date:
September 15, 2005
Filing Date:
March 01, 2005
Assignee:
UNIV MINNESOTA (US)
SCHMIDT-DANNERT CLAUDIA (US)
WATTS KEVIN (US)
International Classes:
C12N1/21; C12N9/00; C12N9/02; C12N9/10; C12N9/88; C12P7/26; C12P17/06
Other References:
HWANG ET AL., APPL. ENVIRON. MICROBIOL., vol. 69, 2003, pages 2699 - 2706
LEE ET AL., PLANT MOL. BIOL., vol. 28, 1995, pages 817 - 884
BECKER, FEMS YEAST RES., vol. 4, 2003, pages 79 - 85
SAMBROOK; RUSSELL: "Molecular Cloning - A Laboratory Manual", vol. 3, 2001, COLD SPRING HARBOR LABORATORY PRESS
KHLEBNIKOV ET AL., MICROBIOLOGY, vol. 147, 2001, pages 3241 - 3247
See also references of EP 1765977A4
Attorney, Agent or Firm:
FINN, J., Patrick, III (P.A., Suite 3300, 60 South Sixth Street, Minneapolis MN, US)
Claims:

WHAT IS CLAIMED IS:
1. A microorganism comprising phenol-type CoA-ligase activity and chalcone synthase or stilbene synthase activity, wherein said microorganism produces a flavonoid compound.
2. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide having said phenol-type CoA-ligase activity.
3. The microorganism of claim 1, wherein said phenol-type CoA-ligase activity is coumaroyl-CoA-ligase activity.
4. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 2.
5. The microorganism of claim 1, wherein said microorganism comprises said chalcone synthase activity.
6. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide having said chalcone synthase activity.
7. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 4.
8. The microorganism of claim 1, wherein said microorganism comprises said stilbene synthase activity.

9. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide having said stilbene synthase activity.
10. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 6.
11. The microorganism of claim 1, wherein said flavonoid compound is naringenin, eriodictyol, homoeriodictyol, pinocembrin, or phloretin.
12. The microorganism of claim 1, wherein said microorganism is a bacterium.
13. The microorganism of claim 1, wherein said microorganism is Escherichia coli, Pseudomonas species, Streptomyces species, or Bacillus subtilis.
14. The microorganism of claim 1, wherein said microorganism comprises tyrosine ammonia lyase activity.
15. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide having tyrosine ammonia lyase activity.
16. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 8.
17. The microorganism of claim 1, wherein said microorganism comprises phenylalanine ammonia lyase activity.

18. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide having phenylalanine ammonia lyase activity.
19. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 10.
20. The microorganism of claim 1, wherein said microorganism comprises cinnamate hydroxylase activity.
21. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide having cinnamate hydroxylase activity.
22. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 12.
23. The microorganism of claim 1, wherein said microorganism comprises cytochrome P450 reductase activity.
24. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide having cytochrome P450 reductase activity.
25. The microorganism of claim 1, wherein said microorganism comprises an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 14.

26. The microorganism of claim 1, wherein a culture of said microorganism produces at least about 10 mg of said flavonoid compound per liter of culture media.

27. A method for making a flavonoid compound, said method comprising culturing microorganisms under conditions wherein said microorganisms produce said flavonoid compound, said microorganisms comprising phenol-type CoA-ligase activity and chalcone synthase or stilbene synthase activity such that said flavonoid compound is produced.
28. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having said phenol-type CoA-ligase activity.
29. The method of claim 27, wherein said phenol-type CoA-ligase activity is coumaroyl-CoA-ligase activity.
30. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 2.
31. The method of claim 27, wherein said microorganisms comprise said chalcone synthase activity.
32. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having said chalcone synthase activity.
33. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 4.

34. The method of claim 27, wherein said microorganisms comprise said stilbene synthase activity.
35. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having said stilbene synthase activity.
36. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 6.
37. The method of claim 27, wherein said flavonoid compound is naringenin, eriodictyol, homoeriodictyol, pinocembrin, or phloretin.
38. The method of claim 27, wherein said microorganisms are bacteria.
39. The method of claim 27, wherein said microorganisms are Escherichia coli, Pseudomonas species, Streptomyces species, or Bacillus subtilis.
40. The method of claim 27, wherein said microorganisms comprise tyrosine ammonia lyase activity.
41. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having tyrosine ammonia lyase activity.
42. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 8.
43. The method of claim 27, wherein said microorganisms comprise phenylalanine ammonia lyase activity.

44. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having phenylalanine ammonia lyase activity.
45. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 10.
46. The method of claim 27, wherein said microorganisms comprise cinnamate hydroxylase activity.
47. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having cinnamate hydroxylase activity.
48. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 12.
49. The method of claim 27, wherein said microorganisms comprise cytochrome P450 reductase activity.
50. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having cytochrome P450 reductase activity.
51. The method of claim 27, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 14.
52. The method of claim 27, wherein said method comprises culturing said microorganisms in the presence of an aromatic acid.

53. The method of claim 27, wherein said aromatic acid is 4-coumaric acid, caffeic acid, ferulic acid, phenylpropionic acid, hydroxyphenyl propionic acid, 3-(4-hydroxyphenyl)propionic acid, sinapic acid, or muconic acid.
54. The method of claim 27, wherein said microorganisms produce at least about 10 mg of said flavonoid compound per liter.
55. The method of claim 27, wherein said microorganisms produce at least about 15 mg of said flavonoid compound per liter.
56. The method of claim 27, wherein said microorganisms produce at least about 20 mg of said flavonoid compound per liter.
57. A method for making a chalcone compound, said method comprising culturing microorganisms under conditions wherein said microorganisms produce said chalcone compound, said microorganisms comprising phenol-type CoA-ligase activity and chalcone synthase or stilbene synthase activity such that said chalcone compound is produced.
58. The method of claim 57, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having said phenol-type CoA-ligase activity.
59. The method of claim 57, wherein said phenol-type CoA-ligase activity is coumaroyl-CoA-ligase activity.
60. The method of claim 57, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 2.

61. The method of claim 57, wherein said microorganisms comprise said chalcone synthase activity.
62. The method of claim 57, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having said chalcone synthase activity.
63. The method of claim 57, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 4.
64. The method of claim 57, wherein said microorganisms comprise said stilbene synthase activity.
65. The method of claim 57, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide having said stilbene synthase activity.
66. The method of claim 57, wherein said microorganisms comprise an exogenous nucleic acid molecule that encodes a polypeptide comprising the sequence set forth in SEQ ID NO : 6.
67. The method of claim 57, wherein said chalcone compound is phloretin.
68. The method of claim 57, wherein said microorganisms are bacteria.
69. The method of claim 57, wherein said microorganisms are Escherichia coli, Pseudomonas species, Streptomyces species, or Bacillus subtilis.
70. The method of claim 57, wherein said method comprises culturing said microorganisms in the presence of an aromatic acid.

71. The method of claim 57, wherein said aromatic acid is 4-coumaric acid, caffeic acid, ferulic acid, phenylpropionic acid, hydroxyphenyl propionic acid, 3-(4-hydroxyphenyl)propionic acid, sinapic acid, or muconic acid.
72. The method of claim 57, wherein said microorganisms produce at least about 10 mg of said chalcone compound per liter.
73. An isolated nucleic acid comprising the sequence set forth in SEQ ID NO : 42, wherein said nucleic acid encodes a polypeptide having stilbene synthase activity.
74. An isolated nucleic acid encoding a polypeptide comprising the sequence set forth in SEQ ID NO : 43.
75. A composition comprising a compound selected from the group consisting of piceatannol, isorhapontigenin, dihydrokaempferol and dihydroquercetin.
76. The composition of claim 75, wherein greater than 10 percent of said composition is said compound.
77. The composition of claim 75, wherein greater than 50 percent of said composition is said compound.
78. The composition of claim 75, wherein greater than 80 percent of said composition is said compound.
79. The composition of claim 75, wherein greater than 90 percent of said composition is said compound.
80. The composition of claim 75, wherein greater than 95 percent of said composition is said compound.
Description:

FLAVONOIDS

CROSS-RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application Serial No. 60/549,077, filed March 1, 2004.

BACKGROUND

1. Technical Field

The invention relates to methods and materials involved in producing flavonoids and other organic compounds.

2. Background Information

Flavonoids are ubiquitous plant natural products that play a variety of roles in plants, including UV protection, defense against pathogens, and coloration. The uncovering of an increasing number of health benefits associated with flavonoids present in fruits, vegetables, red wine, and green tea resulted in an explosion of research on the medicinal properties of flavonoids during the last few years. Medicinal activities shown for flavonoid compounds range from scavenging of harmful oxygen species, enzyme inhibition, anti-inflammatory and estrogenic activities to cytotoxic antitumor activities.

The recognition of flavonoids as health-promoting nutraceuticals also spurred research on elucidating the complex metabolic networks of flavonoid biosynthesis with the idea of enhancing and altering flavonoid composition in dietary plants. Flavonoids are synthesized from an activated phenylpropanoid starter unit and three malonyl-CoA extender units. Phenylpropanoids are phenolic acids, such as 4-coumaric, caffeic, and ferulic acid, which are used in the formation of lignin, coumarins, and other plant natural products in addition to flavonoids.
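The starter-plus-extender assembly described above can be sketched as simple carbon bookkeeping. This is only an illustration: the per-unit carbon counts are general polyketide chemistry rather than values stated in this specification, and the function name `backbone_carbons` is hypothetical.

```python
# Illustrative carbon bookkeeping for a flavonoid backbone.
# Assumptions (not taken from this specification):
#   - the phenylpropanoid starter contributes a C6-C3 acyl unit (9 carbons),
#   - each malonyl-CoA extender contributes 2 carbons after decarboxylation.

STARTER_ACYL_CARBONS = 9   # e.g. the 4-coumaroyl group
CARBONS_PER_EXTENDER = 2   # malonyl (C3) minus one CO2
EXTENDER_UNITS = 3         # three malonyl-CoA units, as stated in the text


def backbone_carbons(starter: int = STARTER_ACYL_CARBONS,
                     extenders: int = EXTENDER_UNITS,
                     per_extender: int = CARBONS_PER_EXTENDER) -> int:
    """Return the carbon count of the assembled backbone."""
    return starter + extenders * per_extender


print(backbone_carbons())  # 15, matching the C15 skeleton of naringenin
```

The count covers only the carbon skeleton; substituents such as the methoxy group of a feruloyl starter would add to it.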

SUMMARY

The invention relates to methods and materials involved in producing flavonoids and other organic compounds. A flavonoid compound can be naringenin, eriodictyol, homoeriodictyol, a chalcone, a stilbene, a flavonol, a flavone, an isoflavonoid, a condensed tannin, an isoflavene (e.g., phenoxodiol), a pterocarpan, an anthocyanin pigment, a pyrone, daidzein, genistein, or phloretin. For example, the invention provides nucleic acid molecules, polypeptides, host cells, and methods that can be used to produce flavonoids and other organic compounds. The nucleic acid molecules described herein can be used to engineer host cells having the ability to produce one or more flavonoids or other organic compounds. The polypeptides described herein can be used in cell-free systems to make one or more flavonoids or other organic compounds. The host cells described herein can be used in culture systems to produce large quantities of, for example, flavonoids such as naringenin.

As described herein, exogenously supplied phenylpropionic acids can be readily taken up by cells (e.g., bacterial cells) and converted into other compounds by those cells. Thus, phenylpropionic acids, which can be abundantly available from agricultural waste products, can be used as inexpensive precursors for the production of higher valued flavonoid compounds, for example. In addition, in vivo feeding of exogenous precursor compounds can be used to determine catalytic functions (e.g., activity levels, substrate specificity, etc.) of enzymes such as CoA-ligases and type III polyketide synthases as well as isoenzymes and engineered variants of known enzymes.

In general, the invention features a microorganism having phenol-type CoA-ligase activity and chalcone synthase or stilbene synthase activity, where the microorganism produces a flavonoid compound. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide having the phenol-type CoA-ligase activity.

The phenol-type CoA-ligase activity can be coumaroyl-CoA-ligase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 2. The microorganism can have the chalcone synthase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide having the chalcone synthase activity.

The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 4. The microorganism can have the stilbene synthase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide having the stilbene synthase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 6. The flavonoid compound can be naringenin, eriodictyol, homoeriodictyol, pinocembrin, or phloretin. The microorganism can be a bacterium. The microorganism can be Escherichia coli, Pseudomonas species, Streptomyces species, or Bacillus subtilis. The microorganism can have tyrosine ammonia lyase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide having tyrosine ammonia lyase activity.

The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 8. The microorganism can have phenylalanine ammonia lyase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide having phenylalanine ammonia lyase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 10.

The microorganism can have cinnamate hydroxylase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide having cinnamate hydroxylase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 12.

The microorganism can have cytochrome P450 reductase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide having cytochrome P450 reductase activity. The microorganism can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 14. A culture of the microorganism can produce at least about 10 mg of the flavonoid compound per liter of culture media.

In another aspect, the invention features a method for making a flavonoid compound. The method includes culturing microorganisms under conditions wherein the microorganisms produce the flavonoid compound. The microorganisms have phenol-type CoA-ligase activity and chalcone synthase or stilbene synthase activity such that the flavonoid compound is produced. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having the phenol-type CoA-ligase activity.

The phenol-type CoA-ligase activity can be coumaroyl-CoA-ligase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 2. The microorganisms can have the chalcone synthase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having the chalcone synthase activity.

The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 4. The microorganisms can have the stilbene synthase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having the stilbene synthase activity.

The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 6. The flavonoid compound can be naringenin, eriodictyol, homoeriodictyol, pinocembrin, or phloretin. The microorganisms can be bacteria. The microorganisms can be Escherichia coli, Pseudomonas species, Streptomyces species, or Bacillus subtilis. The microorganisms can have tyrosine ammonia lyase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having tyrosine ammonia lyase activity.

The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 8. The microorganisms can have phenylalanine ammonia lyase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having phenylalanine ammonia lyase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 10.

The microorganisms can have cinnamate hydroxylase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having cinnamate hydroxylase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO: 12.

The microorganisms can have cytochrome P450 reductase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having cytochrome P450 reductase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 14. The method can include culturing the microorganisms in the presence of an aromatic acid. The aromatic acid can be 4-coumaric acid, caffeic acid, ferulic acid, phenylpropionic acid, hydroxyphenyl propionic acid, 3-(4-hydroxyphenyl)propionic acid, sinapic acid, or muconic acid. The microorganisms can produce at least about 10 mg of the flavonoid compound per liter. The microorganisms can produce at least about 15 mg of the flavonoid compound per liter. The microorganisms can produce at least about 20 mg of the flavonoid compound per liter.

In another embodiment, the invention features a method for making a chalcone compound. The method includes culturing microorganisms under conditions wherein the microorganisms produce the chalcone compound. The microorganisms have phenol-type CoA-ligase activity and chalcone synthase or stilbene synthase activity such that the chalcone compound is produced. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having the phenol-type CoA-ligase activity.

The phenol-type CoA-ligase activity can be coumaroyl-CoA-ligase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 2. The microorganisms can have the chalcone synthase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having the chalcone synthase activity.

The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 4. The microorganisms can have the stilbene synthase activity. The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide having the stilbene synthase activity.

The microorganisms can contain an exogenous nucleic acid molecule that encodes a polypeptide containing the sequence set forth in SEQ ID NO : 6. The chalcone compound can be phloretin. The microorganisms can be bacteria. The microorganisms can be Escherichia coli, Pseudomonas species, Streptomyces species, or Bacillus subtilis. The method can include culturing the microorganisms in the presence of an aromatic acid.

The aromatic acid can be 4-coumaric acid, caffeic acid, ferulic acid, phenylpropionic acid, hydroxyphenyl propionic acid, 3-(4-hydroxyphenyl)propionic acid, sinapic acid, or muconic acid. The microorganisms can produce at least about 10 mg of the chalcone compound per liter.

In another embodiment, the invention features an isolated nucleic acid containing the sequence set forth in SEQ ID NO : 42, wherein the nucleic acid encodes a polypeptide having stilbene synthase activity.

In another embodiment, the invention features an isolated nucleic acid encoding a polypeptide containing the sequence set forth in SEQ ID NO : 43.

In another embodiment, the invention features a composition containing a compound selected from the group consisting of piceatannol, isorhapontigenin, dihydrokaempferol and dihydroquercetin. Greater than 10 percent (e.g., greater than about 20, 30, 40, 50, 60, 70, 80, 90, 95, or 99 percent) of the composition can be the compound.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

DESCRIPTION OF DRAWINGS

Figure 1 is a diagram of a pathway for making flavonoids such as naringenin, eriodictyol, and homoeriodictyol.

Figure 2 is a diagram of a pathway for making various flavonoid compounds from naringenin.

Figure 3 is a listing of a nucleic acid sequence that encodes a polypeptide having coumaroyl-CoA-ligase activity (SEQ ID NO : 1). This nucleic acid sequence encodes an A. thaliana 4-coumaroyl:CoA ligase polypeptide (GenBank Accession Number U18675).

Figure 4 is a listing of an amino acid sequence of a polypeptide having coumaroyl-CoA-ligase activity (SEQ ID NO : 2). The nucleic acid set forth in SEQ ID NO : 1 encodes this amino acid sequence.

Figure 5 is a listing of a nucleic acid sequence that encodes a polypeptide having chalcone synthase activity (SEQ ID NO : 3). This nucleic acid sequence encodes an A. thaliana chalcone synthase polypeptide (GenBank Accession Number AF112086).

Figure 6 is a listing of an amino acid sequence of a polypeptide having chalcone synthase activity (SEQ ID NO : 4). The nucleic acid set forth in SEQ ID NO : 3 encodes this amino acid sequence.

Figure 7 is a listing of a nucleic acid sequence that encodes a polypeptide having stilbene synthase activity (SEQ ID NO : 5). This nucleic acid sequence encodes an Arachis hypogaea stilbene synthase polypeptide (GenBank Accession Number AB027606).

Figure 8 is a listing of an amino acid sequence of a polypeptide having stilbene synthase activity (SEQ ID NO : 6). The nucleic acid set forth in SEQ ID NO : 5 encodes this amino acid sequence.

Figure 9 is a listing of a nucleic acid sequence that encodes a polypeptide having tyrosine ammonia lyase activity (SEQ ID NO : 7). The start codon was changed from GTG to ATG for translation in E. coli. The GenBank sequence (Accession Number ZP_00005404) lists the start codon as GTG for this Rhodobacter sphaeroides tyrosine ammonia lyase polypeptide.
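The GTG-to-ATG start codon change described for Figure 9 can be sketched as a simple string operation. This is a minimal illustration under stated assumptions: the function name `replace_gtg_start` is hypothetical, and the example sequence is made up and is not SEQ ID NO : 7.

```python
def replace_gtg_start(cds: str) -> str:
    """Return the coding sequence with an initial GTG swapped for ATG.

    Sequences that do not begin with GTG are returned unchanged
    (apart from being upper-cased).
    """
    seq = cds.upper()
    if seq.startswith("GTG"):
        return "ATG" + seq[3:]
    return seq


# Made-up placeholder sequence for illustration only.
example = "GTGACCCTGCAGTAA"
print(replace_gtg_start(example))  # ATGACCCTGCAGTAA
```

Only the first codon is touched, so internal GTG (valine) codons are preserved.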

Figure 10 is a listing of an amino acid sequence of a polypeptide having tyrosine ammonia lyase activity (SEQ ID NO : 8). The nucleic acid set forth in SEQ ID NO : 7 encodes this amino acid sequence.

Figure 11 is a listing of a nucleic acid sequence that encodes a polypeptide having phenylalanine ammonia lyase activity (SEQ ID NO : 9). This nucleic acid sequence encodes an A. thaliana phenylalanine ammonia lyase polypeptide (GenBank Accession Number AY303128).

Figure 12 is a listing of an amino acid sequence of a polypeptide having phenylalanine ammonia lyase activity (SEQ ID NO : 10). The nucleic acid set forth in SEQ ID NO : 9 encodes this amino acid sequence.

Figure 13 is a listing of a nucleic acid sequence that encodes a polypeptide having cinnamate hydroxylase activity (SEQ ID NO : 11). This nucleic acid sequence encodes an A. thaliana cinnamate-4-hydroxylase polypeptide (GenBank Accession Number U71080).

Figure 14 is a listing of an amino acid sequence of a polypeptide having cinnamate hydroxylase activity (SEQ ID NO : 12). The nucleic acid set forth in SEQ ID NO : 11 encodes this amino acid sequence.

Figure 15 is a listing of a nucleic acid sequence that encodes a polypeptide having NADPH-cytochrome P450 reductase activity (SEQ ID NO : 13). This nucleic acid sequence encodes an A. thaliana NADPH-ferrihemoprotein reductase polypeptide (GenBank Accession Number NM_119167).

Figure 16 is a listing of an amino acid sequence of a polypeptide having NADPH-cytochrome P450 reductase activity (SEQ ID NO : 14). The nucleic acid set forth in SEQ ID NO : 13 encodes this amino acid sequence.

Figure 17 contains graphs generated from the HPLC analysis of extracts from culture supernatants of E. coli cells in modified M9 medium after 24 hours induction.

Panel A: Standard compounds, 4-coumaric acid (1), trans-cinnamic acid (2), and naringenin (3). Panel B: E. coli pAC-PAL/C4H + pBAD-4CL/CHS. Panel C: E. coli pAC-PAL/C4H + pBAD-4CL/CHS fed 4-coumaric acid. Panel D: E. coli pBAD-4CL/CHS fed 4-coumaric acid. Absorbance monitored at 290 nm. The insets contain graphs plotting the UV/Vis spectra of the indicated compound peaks. The absorbance maxima of 4-coumaric acid, trans-cinnamic acid, and naringenin are 310, 275, and 290 nm, respectively.
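The absorbance maxima given for the Figure 17 standards can be used to illustrate how an observed UV/Vis peak might be compared against them. This is a rough sketch only: the matching rule, the 5 nm tolerance, and the function name `closest_standard` are assumptions, and peak assignment in practice would also rely on retention time and, as in Figure 18, mass data.

```python
from typing import Optional

# Absorbance maxima (nm) of the three standards, as given for Figure 17.
STANDARD_LAMBDA_MAX_NM = {
    "4-coumaric acid": 310,
    "trans-cinnamic acid": 275,
    "naringenin": 290,
}


def closest_standard(observed_nm: float,
                     tolerance_nm: float = 5.0) -> Optional[str]:
    """Return the standard whose maximum is nearest to observed_nm.

    Returns None when the nearest standard is further away than
    tolerance_nm (an assumed cut-off, not a value from the source).
    """
    name, ref = min(STANDARD_LAMBDA_MAX_NM.items(),
                    key=lambda item: abs(item[1] - observed_nm))
    return name if abs(ref - observed_nm) <= tolerance_nm else None


print(closest_standard(291))  # naringenin
print(closest_standard(250))  # None
```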

Figure 18 contains graphs generated from the HPLC analysis of E. coli cells fed 3-(4-hydroxyphenyl)propionic acid. Panel A: HPLC chromatogram showing the accumulation of 4-coumaric acid (1) and production of both phloretin (2) and naringenin (3). Panel B: Selective ion chromatogram of the 3-(4-hydroxyphenyl)propionic acid fed culture confirming the masses of 4-coumaric acid, phloretin, and naringenin. Absorbance monitored at 290 nm. The insets contain graphs plotting the UV/Vis spectra of the indicated compound peaks. The absorbance maximum of phloretin is 287 nm.

Figure 19 contains graphs generated from the HPLC analysis of extracts from culture supernatants of E. coli transformants expressing Rhodobacter sphaeroides TAL alone and together with Arabidopsis 4CL and CHS in modified M9 medium after 24 hours induction. Panel A: standard compounds 4-coumaric acid (1), trans-cinnamic acid (2), and naringenin (3). Panel B: E. coli pAC-TAL + pBADMod2. Panel C: E. coli pAC-TAL + pBAD-4CL/CHS. Absorbance monitored at 290 nm. The insets contain graphs plotting the UV/Vis spectra of compound peaks.

Figure 20 contains graphs plotting growth and naringenin production of recombinant E. coli expressing Rba. sphaeroides TAL together with Arabidopsis 4CL and CHS in TB (A) and modified M9 (B) medium. Filled squares represent growth; circles and triangles represent naringenin production in the culture supernatant and cell pellet, respectively. Data points represent the mean of three independent cultures.

Figure 21 is an HPLC chromatogram of extracts from culture supernatants of E. coli transformants expressing PAL + C4H (dark black) or PAL + C4H + AtR2 (light grey). The peak under the arrow corresponds to 4-coumaric acid.

Figure 22 is a listing of (1) a nucleic acid sequence that encodes a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 15) and (2) an amino acid sequence of a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 16). The CHS1 polypeptide designation used herein refers to the Medicago truncatula polypeptide having the amino acid sequence set forth in SEQ ID NO: 16.

Figure 23 is a listing of (1) a nucleic acid sequence that encodes a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 17) and (2) an amino acid sequence of a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 18). The CHS2 polypeptide designation used herein refers to the Medicago truncatula polypeptide having the amino acid sequence set forth in SEQ ID NO: 18.

Figure 24 is a listing of (1) a nucleic acid sequence that encodes a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 19) and (2) an amino acid sequence of a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 20). The CHS3 polypeptide designation used herein refers to the Medicago truncatula polypeptide having the amino acid sequence set forth in SEQ ID NO: 20.

Figure 25 is a listing of (1) a nucleic acid sequence that encodes a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 21) and (2) an amino acid sequence of a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 22). The CHS4 polypeptide designation used herein refers to the Medicago truncatula polypeptide having the amino acid sequence set forth in SEQ ID NO: 22.

Figure 26 is a listing of (1) a nucleic acid sequence that encodes a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 23) and (2) an amino acid sequence of a Medicago truncatula polypeptide having chalcone synthase activity (SEQ ID NO: 24). The CHS5 polypeptide designation used herein refers to the Medicago truncatula polypeptide having the amino acid sequence set forth in SEQ ID NO: 24.

Figure 27 is a diagram of pathways for making flavonoids such as stilbenes, chalcones, and pyrones.

Figure 28 is a diagram of pathways for making flavonoids.

Figure 29 is a listing of (1) a nucleic acid sequence that encodes a Rheum tataricum polypeptide having stilbene synthase activity (SEQ ID NO: 25) and (2) an amino acid sequence of a Rheum tataricum polypeptide having stilbene synthase activity (SEQ ID NO: 26).

Figure 30 is a listing of (1) a nucleic acid sequence that encodes a Psilotum nudum polypeptide having stilbene synthase activity (SEQ ID NO: 27) and (2) an amino acid sequence of a Psilotum nudum polypeptide having stilbene synthase activity (SEQ ID NO: 28).

Figure 31 is a listing of (1) a nucleic acid sequence that encodes a Vitis vinifera polypeptide having stilbene synthase activity (SEQ ID NO: 29) and (2) an amino acid sequence of a Vitis vinifera polypeptide having stilbene synthase activity (SEQ ID NO: 30).

Figure 32 is a listing of (1) a nucleic acid sequence that encodes a Pseudomonas putida KT2440 polypeptide having feruloyl-CoA synthase activity (SEQ ID NO: 31) and (2) an amino acid sequence of a Pseudomonas putida KT2440 polypeptide having feruloyl-CoA synthase activity (SEQ ID NO: 32).

Figure 33 is a listing of (1) a nucleic acid sequence that encodes a Rhodobacter sphaeroides polypeptide having p-coumaroyl-CoA ligase activity (SEQ ID NO: 33) and (2) an amino acid sequence of a Rhodobacter sphaeroides polypeptide having p-coumaroyl-CoA ligase activity (SEQ ID NO: 34).

Figure 34 is a listing of (1) a nucleic acid sequence that encodes a Streptomyces coelicolor polypeptide having cinnamate-CoA ligase activity (SEQ ID NO: 35) and (2) an amino acid sequence of a Streptomyces coelicolor polypeptide having cinnamate-CoA ligase activity (SEQ ID NO: 36).

Figure 35 is a listing of (1) a nucleic acid sequence that encodes an Arachis hypogaea polypeptide having stilbene synthase activity (SEQ ID NO: 42) and (2) an amino acid sequence of an Arachis hypogaea polypeptide having stilbene synthase activity (SEQ ID NO: 43).

Figure 36 is a listing of (1) a nucleic acid sequence that encodes an A. thaliana polypeptide having flavanone-3β-hydroxylase activity (SEQ ID NO: 44) and (2) an amino acid sequence of an A. thaliana polypeptide having flavanone-3β-hydroxylase activity (SEQ ID NO: 45).

Figure 37 is a listing of (1) a nucleic acid sequence that encodes an A. thaliana polypeptide having flavonol synthase activity (SEQ ID NO: 46) and (2) an amino acid sequence of an A. thaliana polypeptide having flavonol synthase activity (SEQ ID NO: 47).

DETAILED DESCRIPTION

The invention provides methods and materials related to producing flavonoids (e.g., naringenin, eriodictyol, homoeriodictyol, chalcones, stilbenes, flavonols, flavones, isoflavonoids, condensed tannins, pterocarpans, anthocyanin pigments, pyrones, daidzein, genistein, or phloretin) and/or other organic compounds. For example, the invention provides isolated nucleic acids, polypeptides, host cells, and methods and materials for producing flavonoids such as naringenin or phloretin.

Flavonoids can be synthesized from an activated phenylpropanoid starter unit and three malonyl-CoA extender units. Phenylpropanoids are phenolic acids such as 4-coumaric, caffeic, and ferulic acid (Figure 1), which are used to form lignin, coumarins, and other plant natural products in addition to flavonoids (Winkel-Shirley, Plant Physiol., 126, 485-493 (2001); Weisshaar and Jenkins, Curr. Opin. Plant Biol., 1, 251-257 (1998); and Paiva, J. Plant Growth Regul., 19, 131-143 (2000)).

A first step in phenylpropanoid biosynthesis can be deamination of L-phenylalanine by a polypeptide having phenylalanine ammonia lyase (PAL) activity to produce trans-cinnamic acid. The trans-cinnamic acid can be hydroxylated in the para position of the benzyl ring by a polypeptide having cinnamate-4-hydroxylase (C4H) activity to make 4-coumaric acid, which then can be activated by a polypeptide having coumaroyl-CoA-ligase (4CL) activity to make 4-coumaroyl-CoA. Naringenin chalcone can be synthesized from a single activated 4-coumaroyl-CoA starter unit by sequential addition of three acetate extender units, derived from malonyl-CoA, via a polypeptide having type III polyketide synthase activity such as a polypeptide having chalcone synthase (CHS) activity (Austin and Noel, Nat. Prod. Rep., 20, 79-110 (2003)).

Naringenin chalcone then can be converted spontaneously in vitro to the three-ringed flavanone structure naringenin, or enzymatically in vivo by a polypeptide having chalcone isomerase (CHI) activity (Mol et al., Phytochemistry, 24, 2267-2269 (1985)).

1. Metabolic pathways

The invention provides several metabolic pathways that can be used to produce organic compounds (Figures 1, 2, 27, and 28). As depicted in Figure 1, phenylalanine can be converted into trans-cinnamic acid by a polypeptide having PAL activity (e.g., EC 4.3.1.-); the resulting trans-cinnamic acid can be converted into 4-coumaric acid by a polypeptide having C4H activity (e.g., EC 1.14.13.-); the resulting 4-coumaric acid (or added compounds such as 4-coumaric acid, caffeic acid, or ferulic acid) can be converted into 4-coumaroyl-CoA (or other compounds such as caffeoyl-CoA or feruloyl-CoA) by a polypeptide having 4CL activity (e.g., EC 6.2.1.-); and the resulting CoA product (e.g., 4-coumaroyl-CoA, caffeoyl-CoA, or feruloyl-CoA) can be converted into naringenin chalcone (or another product such as eriodictyol chalcone or homoeriodictyol chalcone) by a polypeptide having CHS activity (e.g., EC 2.3.1.- or EC 2.3.1.74). The final form of products such as naringenin (or eriodictyol or homoeriodictyol) can be formed from naringenin chalcone (or eriodictyol chalcone or homoeriodictyol chalcone) spontaneously or by a polypeptide having CHI activity (e.g., EC 5.5.1.6).

In some embodiments, tyrosine can be converted into 4-coumaric acid by a polypeptide having tyrosine ammonia lyase activity (TAL activity; e.g., EC 4.3.1.-). In other embodiments, 4-coumaroyl-CoA (or other compounds such as caffeoyl-CoA or feruloyl-CoA) can be converted into resveratrol (or other compounds such as piceatannol) by a polypeptide having stilbene synthase activity (STS activity; e.g., EC 2.3.1.-, EC 2.3.1.95, or EC 2.3.1.146). In some embodiments, a polypeptide having NADPH-cytochrome p450 reductase activity (e.g., EC 1.6.2.-) can be used. Such polypeptides can be co-expressed with other polypeptides such as polypeptides having C4H activity such that C4H activity is observed.
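The conversions described above can be read as a small directed graph of compounds connected by enzymatic activities. The following Python sketch is illustrative only: the substrate, product, and activity names come from the text, while the list layout and the `find_route` helper are hypothetical names introduced here, and alternative substrates such as caffeic or ferulic acid are left out for brevity.

```python
from collections import deque

# (substrate, product, activity) triples taken from the pathway description.
STEPS = [
    ("phenylalanine", "trans-cinnamic acid", "PAL"),
    ("trans-cinnamic acid", "4-coumaric acid", "C4H"),
    ("tyrosine", "4-coumaric acid", "TAL"),
    ("4-coumaric acid", "4-coumaroyl-CoA", "4CL"),
    ("4-coumaroyl-CoA", "naringenin chalcone", "CHS"),
    ("4-coumaroyl-CoA", "resveratrol", "STS"),
    ("naringenin chalcone", "naringenin", "CHI or spontaneous"),
]


def find_route(start, target):
    """Breadth-first search over STEPS.

    Returns the ordered list of activities that converts `start`
    into `target`, or None when no route exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, route = queue.popleft()
        if compound == target:
            return route
        for substrate, product, activity in STEPS:
            if substrate == compound and product not in seen:
                seen.add(product)
                queue.append((product, route + [activity]))
    return None


if __name__ == "__main__":
    print(find_route("phenylalanine", "naringenin"))
    print(find_route("tyrosine", "resveratrol"))
```

Under these assumptions, the tyrosine route needs one activity fewer than the phenylalanine route to reach 4-coumaric acid, because TAL replaces the PAL and C4H pair.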

Polypeptides having PAL activity as well as nucleic acid encoding such polypeptides can be obtained from various species including, without limitation, Arabidopsis thaliana, Medicago truncatula, and Arachis hypogaea. For example, nucleic acid that encodes a polypeptide having PAL activity can be obtained from Arabidopsis thaliana and can have a nucleic acid sequence as set forth in SEQ ID NO: 9 (Figure 11), which can encode the amino acid sequence set forth in SEQ ID NO: 10 (Figure 12). In addition, polypeptides having PAL activity as well as nucleic acid encoding such polypeptides can be obtained as described herein.

Polypeptides having C4H activity as well as nucleic acid encoding such polypeptides can be obtained from various species including, without limitation, Arabidopsis thaliana, Medicago truncatula, and Arachis hypogaea. For example, nucleic acid encoding a polypeptide having C4H activity can be obtained from Arabidopsis thaliana and can have a nucleic acid sequence as set forth in SEQ ID NO: 11 (Figure 13), which can encode the amino acid sequence set forth in SEQ ID NO: 12 (Figure 14). In addition, polypeptides having C4H activity as well as nucleic acid encoding such polypeptides can be obtained as described herein.

Polypeptides having 4CL activity as well as nucleic acid encoding such polypeptides can be obtained from various species including, without limitation, Arabidopsis thaliana, Medicago truncatula, and Arachis hypogaea. For example, nucleic acid that encodes a polypeptide having 4CL activity can be obtained from Arabidopsis thaliana and can have a nucleic acid sequence as set forth in SEQ ID NO: 1 (Figure 3), which can encode the amino acid sequence set forth in SEQ ID NO: 2 (Figure 4). In addition, polypeptides having 4CL activity as well as nucleic acid encoding such polypeptides can be obtained as described herein.

Polypeptides having other types of CoA ligase activity can be used to produce flavonoids or other organic compounds. For example, polypeptides having cinnamate-CoA ligase activity (which can be obtained from Streptomyces coelicolor or S. avermitilis), polypeptides having feruloyl-CoA ligase activity (which can be obtained from Pseudomonas and other genera of lignin degraders), and polypeptides having p-coumaroyl-CoA ligase activity (which can be obtained from Rhodobacter and other photoactive yellow protein forming genera) can be used (Figures 32-34).

Polypeptides having CHS activity as well as nucleic acid encoding such polypeptides can be obtained from various species including, without limitation, Arabidopsis thaliana, Medicago truncatula, and Arachis hypogaea. For example, nucleic acid that encodes a polypeptide having CHS activity can be obtained from Arabidopsis thaliana and can have a sequence as set forth in SEQ ID NO: 3 (Figure 5), which can encode the amino acid sequence set forth in SEQ ID NO: 4 (Figure 6). In addition, polypeptides having CHS activity as well as nucleic acid encoding such polypeptides can be obtained as described herein.

In some embodiments, polypeptides having CHS activity as well as nucleic acid encoding such polypeptides can be obtained from Medicago truncatula and can have the amino acid and nucleic acid sequences, respectively, set forth in Figures 22, 23, 24, 25, or 26. Other polypeptides having CHS activity (and nucleic acid encoding such polypeptides) that can be used as described herein include, without limitation, those homologous to the polypeptides (and nucleic acids) set forth in Figures 6 and 22-26. For example, the CHS1 polypeptide of Figure 22 is homologous to a polypeptide obtained from Medicago sativa (GenBank Accession Number L02904); the CHS2 polypeptide of Figure 23 is homologous to a polypeptide obtained from Medicago sativa (GenBank Accession Number L02902); the CHS3 polypeptide of Figure 24 is homologous to a polypeptide obtained from Vitis vinifera (GenBank Accession Number BAA31259); the CHS4 polypeptide of Figure 25 is homologous to a polypeptide obtained from Medicago sativa (GenBank Accession Number L02905); and the CHS5 polypeptide of Figure 26 is homologous to a polypeptide obtained from Pisum sativum (GenBank Accession Number X80007).

Polypeptides having CHI activity as well as nucleic acid encoding such polypeptides can be obtained from various species including, without limitation, Arabidopsis thaliana, Medicago truncatula, and Arachis hypogaea. For example, nucleic acid that encodes a polypeptide having CHI activity can be obtained from Arabidopsis thaliana and can have a sequence as set forth in GenBank accession number M86358, or can be obtained from Medicago truncatula and can have a sequence as set forth in GenBank accession number TC85633. In addition, polypeptides having CHI activity as well as nucleic acid encoding such polypeptides can be obtained as described herein.

Polypeptides having TAL activity as well as nucleic acid encoding such polypeptides can be obtained from various species including, without limitation, Rhodobacter sphaeroides, Rhodobacter capsulatus, and Halorhodospira halophila. For example, nucleic acid that encodes a polypeptide having TAL activity can be obtained from Rhodobacter sphaeroides and can have a nucleic acid sequence as set forth in SEQ ID NO: 7 (Figure 9), which can encode the amino acid sequence set forth in SEQ ID NO: 8 (Figure 10). In addition, polypeptides having TAL activity as well as nucleic acid encoding such polypeptides can be obtained as described herein.

Polypeptides having STS activity as well as nucleic acid encoding such polypeptides can be obtained from various species including, without limitation, Arachis hypogaea, Vitis vinifera, Rheum tataricum, Psilotum nudum, and Pinus sylvestris. For example, nucleic acid that encodes a polypeptide having STS activity can be obtained from Arachis hypogaea and can have a nucleic acid sequence as set forth in SEQ ID NO: 5 (Figure 7), which can encode the amino acid sequence set forth in SEQ ID NO: 6 (Figure 8). In addition, polypeptides having STS activity as well as nucleic acid encoding such polypeptides can be obtained as described herein.

Polypeptides having STS activity can catalyze the same reaction catalyzed by polypeptides having CHS activity. For example, polypeptides having STS activity can form a linear tetraketide that is cyclized in the active site of the enzyme to the final product. The reactions of STS and CHS polypeptides are identical up to the cyclization reaction, in which case an STS polypeptide can perform an aldol condensation and a CHS polypeptide can perform a Claisen condensation. The final products reflect this difference in cyclization: stilbenes produced by polypeptides having STS activity can have two rings, whereas chalcones produced by polypeptides having CHS activity can have three rings. As shown in Figure 27, polypeptides having STS or CHS activity can be used to produce organic compounds such as stilbenes, chalcones, and/or pyrones.

Polypeptides having NADPH-cytochrome p450 reductase activity as well as nucleic acid encoding such polypeptides can be obtained from various species including, without limitation, Arabidopsis thaliana, Medicago truncatula, and Arachis hypogaea. For example, nucleic acid that encodes a polypeptide having NADPH-cytochrome p450 reductase activity can be obtained from Arabidopsis thaliana and can have a sequence as set forth in SEQ ID NO: 13 (Figure 15), which can encode the amino acid sequence set forth in SEQ ID NO: 14 (Figure 16). In addition, polypeptides having NADPH-cytochrome p450 reductase activity as well as nucleic acid encoding such polypeptides can be obtained as described herein.

The term "polypeptide having enzymatic activity" as used herein refers to any polypeptide that catalyzes a chemical reaction of other substances without itself being destroyed or altered upon completion of the reaction. Typically, a polypeptide having enzymatic activity catalyzes the formation of one or more products from one or more substrates. Such polypeptides can have any type of enzymatic activity including, without limitation, the enzymatic activity or enzymatic activities associated with enzymes such as ligases (e.g., CoA-ligases, coumaroyl-CoA-ligases, benzoyl-CoA-ligases, and feruloyl-CoA-ligases), synthases (e.g., chalcone synthases and stilbene synthases), lyases (e.g., tyrosine ammonia lyases, histidine ammonia lyases, and phenylalanine ammonia lyases), hydroxylases (e.g., cinnamate hydroxylase, flavanone 3 hydroxylase, and flavonoid 3'5' hydroxylase), and reductases (e.g., NADPH-cytochrome p450 reductases).

As depicted in Figure 2, naringenin can be converted into various products by polypeptides having the indicated activities. Polypeptides having a particular activity as well as nucleic acid encoding such polypeptides can be obtained as described herein. For example, polypeptides having the indicated enzymatic activity can be obtained from the indicated species and can have a sequence as set forth in the indicated GenBank accession number (Table 1).

Table 1. List of enzymatic activities.

Abbreviation | Enzymatic activity | Source | Accession number
F3'H | Flavonoid 3'-hydroxylase | Arabidopsis thaliana | AH009204
F3'5'H | Flavonoid 3'5'-hydroxylase | Arabidopsis thaliana | AAM13084, AAL16143
FLS | Flavonol synthase | Arabidopsis thaliana | Q96330
FHT | Flavanone 3β hydroxylase | Arabidopsis thaliana | U33932
DFR | Dihydroflavonol-4-reductase | Arabidopsis thaliana | NM_123645
LDOX | Leucocyanidin dioxygenase (anthocyanidin synthase ANS) | Arabidopsis thaliana | Q96323
BAN | Leucoanthocyanidin reductase | Arabidopsis thaliana | Q9SEV0
LAR/IFR homologs | Putative IFR-like proteins | Arabidopsis thaliana | NP_565107, NP_195634
LAR/IFR homologs | Putative IFR-like protein | Medicago truncatula | TC77184, TC86142
CHR | Chalcone reductase | Medicago truncatula | X82366
IFS | Isoflavone synthase | Medicago truncatula | AY167424
IFR | Isoflavone reductase | Medicago truncatula | AF277052
VR | Vestitone reductase | Medicago truncatula | TC77308
3-O-UGT homolog | Putative UDP-glucose:flavonoid 3-O-glycosyltransferase | Arabidopsis thaliana | T51560 (1)
5-O-UGT homolog | Putative UDP-glucose:flavonoid 5-O-glycosyltransferase | Arabidopsis thaliana | AAM91686 (2)
ATR2 | NADPH-cytochrome P450 reductase | Arabidopsis thaliana | X66017

(1) Most homologous to anthocyanidin/flavonoid 3-O-GT from Perilla frutescens (GenBank accession number BAA19659; 46% identity, 62% similarity) and Vitis vinifera (GenBank accession number AAB81682; 55% identity, 69% similarity).

(2) Most homologous to anthocyanin 5-O-GT from Perilla frutescens (GenBank accession number AB013596; 47% identity, 62% similarity).

Each step provided in the pathways depicted in Figures 1, 2, 27, and 28 can be performed within a cell or outside a cell (e.g., in a container or column). For example, a microorganism provided herein can be used to perform the steps provided in Figure 1, or an extract containing polypeptides having the provided enzymatic activities can be used to perform the steps provided in Figure 1. In addition, chemical treatments can be used to perform the conversions provided in Figures 1, 2, 27, and 28. For example, naringenin can be converted into apigenin by oxidation.

The organic compounds produced from any of the steps provided in Figures 1 and 2 can be chemically converted into other organic compounds. For example, apigenin can be hydrogenated to form naringenin. Hydrogenating an organic acid can be performed using any method such as those used to hydrogenate acids. In another example, dihydrokaempferol can be dehydrated to form apigenin. Any method can be used to perform a dehydration reaction. For example, dihydrokaempferol can be heated in the presence of a catalyst (e.g., a metal or mineral acid catalyst) to form apigenin.

2. Nucleic acids

The term "nucleic acid" as used herein encompasses both RNA and DNA, including cDNA, genomic DNA, and synthetic (e.g., chemically synthesized) DNA. The nucleic acid can be double-stranded or single-stranded. Where single-stranded, the nucleic acid can be the sense strand or the antisense strand. In addition, nucleic acid can be circular or linear.

The term "isolated" as used herein with reference to nucleic acid refers to a naturally-occurring nucleic acid that is not immediately contiguous with both of the sequences with which it is immediately contiguous (one on the 5' end and one on the 3' end) in the naturally-occurring genome of the organism from which it is derived. For example, an isolated nucleic acid can be, without limitation, a recombinant DNA molecule of any length, provided one of the nucleic acid sequences normally found immediately flanking that recombinant DNA molecule in a naturally-occurring genome is removed or absent. Thus, an isolated nucleic acid includes, without limitation, a recombinant DNA that exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other sequences as well as recombinant DNA that is incorporated into a vector, an autonomously replicating plasmid, a virus (e.g., a retrovirus, adenovirus, or herpes virus), or into the genomic DNA of a prokaryote or eukaryote. In addition, an isolated nucleic acid can include a recombinant DNA molecule that is part of a hybrid or fusion nucleic acid sequence.

The term "isolated" as used herein with reference to nucleic acid also includes any non-naturally-occurring nucleic acid since non-naturally-occurring nucleic acid sequences are not found in nature and do not have immediately contiguous sequences in a naturally-occurring genome. For example, non-naturally-occurring nucleic acid such as an engineered nucleic acid is considered to be isolated nucleic acid. Engineered nucleic acid can be made using common molecular cloning or chemical nucleic acid synthesis techniques. Isolated non-naturally-occurring nucleic acid can be independent of other sequences, or incorporated into a vector, an autonomously replicating plasmid, a virus (e.g., a retrovirus, adenovirus, or herpes virus), or the genomic DNA of a prokaryote or eukaryote. In addition, a non-naturally-occurring nucleic acid can include a nucleic acid molecule that is part of a hybrid or fusion nucleic acid sequence.

It will be apparent to those of skill in the art that a nucleic acid existing among hundreds to millions of other nucleic acid molecules within, for example, cDNA or genomic libraries, or gel slices containing a genomic DNA restriction digest is not to be considered an isolated nucleic acid.

The term "exogenous" as used herein with reference to nucleic acid and a particular cell refers to any nucleic acid that does not originate from that particular cell as found in nature. Thus, non-naturally-occurring nucleic acid is considered to be exogenous to a cell once introduced into the cell. It is important to note that non-naturally-occurring nucleic acid can contain nucleic acid sequences or fragments of nucleic acid sequences that are found in nature provided the nucleic acid as a whole does not exist in nature. For example, a nucleic acid molecule containing a genomic DNA sequence within an expression vector is non-naturally-occurring nucleic acid, and thus is exogenous to a cell once introduced into the cell, since that nucleic acid molecule as a whole (genomic DNA plus vector DNA) does not exist in nature. Thus, any vector, autonomously replicating plasmid, or virus (e.g., retrovirus, adenovirus, or herpes virus) that as a whole does not exist in nature is considered to be non-naturally-occurring nucleic acid. It follows that genomic DNA fragments produced by PCR or restriction endonuclease treatment as well as cDNAs are considered to be non-naturally-occurring nucleic acid since they exist as separate molecules not found in nature. It also follows that any nucleic acid containing a promoter sequence and polypeptide-encoding sequence (e.g., cDNA or genomic DNA) in an arrangement not found in nature is non-naturally-occurring nucleic acid.

Nucleic acid that is naturally-occurring can be exogenous to a particular cell. For example, an entire chromosome isolated from a cell of person X is an exogenous nucleic acid with respect to a cell of person Y once that chromosome is introduced into Y's cell.

The invention provides isolated nucleic acids that encode at least two (e.g., at least two, three, four, five, six, seven, eight, nine, ten, or more) of the polypeptides described herein. For example, the invention provides an isolated nucleic acid containing a nucleic acid sequence that encodes the amino acid sequence set forth in SEQ ID NO: 2 and a nucleic acid sequence that encodes the amino acid sequence set forth in SEQ ID NO: 4. In some embodiments, a nucleic acid can contain nucleic acid sequences that encode between two and ten polypeptides (e.g., between two and five polypeptides, between two and four polypeptides, between three and six polypeptides, or between three and five polypeptides). Each polypeptide can have an activity described herein. For example, each polypeptide can have a ligase (e.g., CoA-ligase, coumaroyl-CoA-ligase, benzoyl-CoA-ligase, and feruloyl-CoA-ligase), synthase (e.g., chalcone synthase and stilbene synthase), lyase (e.g., tyrosine ammonia lyase, histidine ammonia lyase, and phenylalanine ammonia lyase), hydroxylase (e.g., cinnamate hydroxylase, flavanone 3 hydroxylase, and flavonoid 3'5' hydroxylase), or reductase (e.g., NADPH-cytochrome p450 reductase) activity. In one embodiment, a nucleic acid can contain nucleic acid sequences that encode a polypeptide having 4CL activity and a polypeptide having CHS activity. In another embodiment, a nucleic acid can contain nucleic acid sequences that encode a polypeptide having PAL activity and a polypeptide having C4H activity.

The nucleic acids provided herein can be in the form of an expression vector such that the encoded polypeptide sequences are expressed. For example, nucleic acid sequences having the sequences set forth in SEQ ID NOs: 1 and 3 can be inserted into an expression vector such that the polypeptides encoded by sequences set forth in SEQ ID NOs: 1 and 3 are expressed when the expression vector is introduced into a cell (e.g., a bacterial, fungal, plant, protozoan, animal, or mammalian cell).

The isolated nucleic acids provided herein can be obtained using any method including, without limitation, common molecular cloning and chemical nucleic acid synthesis techniques. For example, PCR can be used to obtain an isolated nucleic acid containing a nucleic acid sequence sharing similarity to the sequences set forth in SEQ ID NO: 1, 3, 5, 7, 9, 11, or 13. PCR refers to a procedure or technique in which target nucleic acid is amplified in a manner similar to that described in U.S. Patent No. 4,683,195, and subsequent modifications of the procedure described therein. Generally, sequence information from the ends of the region of interest or beyond is used to design oligonucleotide primers that are identical or similar in sequence to opposite strands of a potential template to be amplified. Using PCR, a nucleic acid sequence can be amplified from RNA or DNA. For example, a nucleic acid sequence can be isolated by PCR amplification from total cellular RNA, total genomic DNA, and cDNA as well as from bacteriophage sequences, plasmid sequences, viral sequences, and the like. When using RNA as a source of template, reverse transcriptase can be used to synthesize complementary DNA strands.
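The primer design principle described above (primers taken from the ends of the region of interest, matching opposite strands) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the default primer length of 20 nucleotides, and the example template are hypothetical, and practical primer design would also weigh melting temperature, GC content, and secondary structure.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(template, start, end, length=20):
    """Return (forward, reverse) primers flanking template[start:end].

    The forward primer is identical to the first `length` bases of the
    region (sense strand); the reverse primer is the reverse complement
    of the last `length` bases, so it matches the opposite strand.
    """
    region = template[start:end].upper()
    if len(region) < 2 * length:
        raise ValueError("region is too short for two non-overlapping primers")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical 20-nt template; a short primer length keeps the demo readable.
    demo_template = "AAATGGCCCGTTTAGCATGA"
    print(design_primers(demo_template, 2, 18, length=5))
```

Both primers are written 5' to 3', which is why the reverse primer is reverse-complemented rather than simply complemented.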

Isolated nucleic acids provided herein also can be obtained by mutagenesis. For example, an isolated nucleic acid containing a sequence set forth in SEQ ID NO: 1, 3, 5, 7, 9, 11, or 13 can be mutated using common molecular cloning techniques (e.g., site-directed mutagenesis). Possible mutations include, without limitation, deletions, insertions, and substitutions, as well as combinations of deletions, insertions, and substitutions.

In addition, nucleic acid and amino acid databases (e.g., GenBank®) can be used to obtain isolated nucleic acids. For example, any nucleic acid sequence having some homology to a sequence set forth in SEQ ID NO: 1, 3, 5, 7, 9, 11, or 13, or any amino acid sequence having some homology to a sequence set forth in SEQ ID NO: 2, 4, 6, 8, 10, 12, or 14 can be used as a query to search GenBank®.

Further, nucleic acid hybridization techniques can be used to obtain an isolated nucleic acid provided herein. Briefly, any nucleic acid having some homology to a sequence set forth in SEQ ID NO: 1, 3, 5, 7, 9, 11, or 13 can be used as a probe to identify a similar nucleic acid by hybridization under conditions of moderate to high stringency. Once identified, the nucleic acid then can be purified, sequenced, and analyzed to determine whether it encodes a polypeptide having an activity described herein.

For the purpose of this invention, moderately stringent hybridization conditions mean the hybridization is performed at about 42°C in a hybridization solution containing 25 mM KPO4 (pH 7.4), 5X SSC, 5X Denhardt's solution, 50 µg/mL denatured, sonicated salmon sperm DNA, 50% formamide, 10% dextran sulfate, and 1-15 ng/mL probe (about 5×10^7 cpm/µg), while the washes are performed at about 50°C with a wash solution containing 2X SSC and 0.1% sodium dodecyl sulfate.

Highly stringent hybridization conditions mean the hybridization is performed at about 42°C in a hybridization solution containing 25 mM KPO4 (pH 7.4), 5X SSC, 5X Denhardt's solution, 50 µg/mL denatured, sonicated salmon sperm DNA, 50% formamide, 10% dextran sulfate, and 1-15 ng/mL probe (about 5×10^7 cpm/µg), while the washes are performed at about 65°C with a wash solution containing 0.2X SSC and 0.1% sodium dodecyl sulfate.
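The two definitions above share the same hybridization step and differ only in the wash. The following Python sketch records the stated parameters and reports the differences. The dictionary keys and function names are hypothetical, and the NaCl conversion relies on the standard recipe in which 1X SSC contains 0.15 M NaCl (plus 0.015 M sodium citrate), which is an assumption not stated in this document.

```python
# Assumption: standard SSC recipe, 1X SSC = 150 mM NaCl + 15 mM sodium citrate.
NACL_MM_PER_1X_SSC = 150.0

# Parameters as stated in the text for each stringency level.
CONDITIONS = {
    "moderate": {
        "hyb_temp_c": 42,
        "wash_temp_c": 50,
        "wash_ssc_x": 2.0,
        "wash_sds_pct": 0.1,
    },
    "high": {
        "hyb_temp_c": 42,
        "wash_temp_c": 65,
        "wash_ssc_x": 0.2,
        "wash_sds_pct": 0.1,
    },
}


def wash_nacl_mm(level):
    """NaCl concentration (mM) of the wash solution for a stringency level."""
    return CONDITIONS[level]["wash_ssc_x"] * NACL_MM_PER_1X_SSC


def differing_parameters(a, b):
    """Map each parameter that differs between two levels to its (a, b) values."""
    return {
        key: (CONDITIONS[a][key], CONDITIONS[b][key])
        for key in CONDITIONS[a]
        if CONDITIONS[a][key] != CONDITIONS[b][key]
    }


if __name__ == "__main__":
    print(differing_parameters("moderate", "high"))
    for level in CONDITIONS:
        print(level, round(wash_nacl_mm(level), 1), "mM NaCl in wash")
```

Under these assumptions, the highly stringent wash is both hotter and tenfold lower in salt, and both changes disfavor imperfectly matched duplexes.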

Hybridization can be done by Southern or Northern analysis to identify a DNA or RNA sequence, respectively, that hybridizes to a probe. The probe can be labeled with a biotin, digoxigenin, an enzyme, or a radioisotope such as 32P. The DNA or RNA to be analyzed can be electrophoretically separated on an agarose or polyacrylamide gel, transferred to nitrocellulose, nylon, or other suitable membrane, and hybridized with the probe using standard techniques well known in the art such as those described in sections 7.39-7.52 of Sambrook et al., (1989) Molecular Cloning, second edition, Cold Spring Harbor Laboratory, Plainview, NY. Typically, a probe is at least about 20 nucleotides in length. For example, a probe corresponding to a 20 nucleotide sequence set forth in SEQ ID NO: 1, 3, 5, 7, 9, 11, or 13 can be used to identify an identical or similar nucleic acid. In addition, probes longer or shorter than 20 nucleotides can be used.

3. Polypeptides

The invention also provides substantially pure polypeptides. The term "substantially pure" as used herein with reference to a polypeptide means the polypeptide is substantially free of other polypeptides, lipids, carbohydrates, and nucleic acid with which it is associated in nature. A substantially pure polypeptide can be at least about 60, 65, 70, 75, 80, 85, 90, 95, or 99 percent pure. Typically, a substantially pure polypeptide will yield a single major band on a polyacrylamide gel.

In one embodiment, the invention provides a substantially pure polypeptide having an amino acid sequence encoded by a nucleic acid provided herein. Such polypeptides include, without limitation, substantially pure polypeptides having one or more of the following activities: a ligase (e.g., CoA-ligase, coumaroyl-CoA-ligase, benzoyl-CoA-ligase, and feruloyl-CoA-ligase), synthase (e.g., chalcone synthase and stilbene synthase), lyase (e.g., tyrosine ammonia lyase, histidine ammonia lyase, and phenylalanine ammonia lyase), hydroxylase (e.g., cinnamate hydroxylase, flavanone 3 hydroxylase, and flavonoid 3'5' hydroxylase), or reductase (e.g., NADPH-cytochrome P450 reductase) activity.

In another embodiment, the invention provides a composition that contains two or more (e.g., three, four, five, six, seven, eight, nine, ten, or more) substantially pure polypeptide preparations. For example, a composition can contain a substantially pure polypeptide preparation with the polypeptide having the sequence set forth in SEQ ID NO: 2 and a substantially pure polypeptide preparation with the polypeptide having the sequence set forth in SEQ ID NO: 4. Such compositions can be in the form of a container.

For example, two or more substantially pure polypeptide preparations can be located within a column. In some embodiments, the polypeptides can be immobilized on a substrate such as a resin.

Any method can be used to obtain a substantially pure polypeptide. For example, common polypeptide purification techniques such as affinity chromatography and HPLC as well as polypeptide synthesis techniques can be used. In addition, any material can be used as a source to obtain a substantially pure polypeptide. For example, tissue from wild-type or transgenic animals can be used as a source material. In addition, tissue culture cells engineered to over-express a particular polypeptide of interest can be used to obtain a substantially pure polypeptide. Further, a polypeptide within the scope of the invention can be "engineered" to contain an amino acid sequence that allows the polypeptide to be captured onto an affinity matrix. For example, a tag such as c-myc, hemagglutinin, polyhistidine, or Flag tag (Kodak) can be used to aid polypeptide purification. Such tags can be inserted anywhere within the polypeptide including at either the carboxyl or amino termini. Other fusions that can be used include enzymes such as alkaline phosphatase that can aid in the detection of the polypeptide.

4. Genetically modified cells

The invention provides genetically modified cells (e.g., cells containing an exogenous nucleic acid molecule). Such cells can be used to produce flavonoids (e.g., naringenin, eriodictyol, and homoeriodictyol) and other organic compounds. In addition, such cells can be from any species including those listed within the taxonomy web pages at the National Center for Biotechnology Information (e.g., at "www" dot "ncbi" dot "nlm" dot "nih" dot "gov"). The cells can be eukaryotic or prokaryotic. For example, genetically modified cells can be mammalian cells (e.g., human, murine, and bovine cells), plant cells (e.g., corn, wheat, rice, and soybean cells), fungal cells (e.g., Aspergillus and Rhizopus cells), or bacterial cells (e.g., Escherichia, Bacillus, Streptomyces, and Pseudomonas cells). A cell can be a microorganism. The term "microorganism" as used herein refers to any microscopic organism including, without limitation, bacteria, algae, fungi, and protozoa. Thus, Escherichia, Bacillus, Streptomyces, and Pseudomonas cells are considered microorganisms and can be used as described herein.

Typically, a cell of the invention is genetically modified such that a particular organic compound is produced. Such cells can contain one or more exogenous nucleic acid molecules that encode polypeptides having enzymatic activity. For example, a microorganism can contain exogenous nucleic acid that encodes a polypeptide having 4CL and CHS activity. In this case, 4-coumaric acid can be converted into 4-coumaroyl-CoA which can be converted into naringenin. It is noted that a cell can be given an exogenous nucleic acid molecule that encodes a polypeptide having an enzymatic activity that catalyzes the production of a compound not normally produced by that cell.

Alternatively, a cell can be given an exogenous nucleic acid molecule that encodes a polypeptide having an enzymatic activity that catalyzes the production of a compound that is normally produced by that cell. In this case, the genetically modified cell can produce more of the compound, or can produce the compound more efficiently, than a similar cell not having the genetic modification.

A polypeptide having a particular enzymatic activity can be a polypeptide that is either naturally-occurring or non-naturally-occurring. A naturally-occurring polypeptide is any polypeptide having an amino acid sequence as found in nature, including wild-type and polymorphic polypeptides. Such naturally-occurring polypeptides can be obtained from any species including, without limitation, animal (e.g., mammalian), plant, fungal, and bacterial species. A non-naturally-occurring polypeptide is any polypeptide having an amino acid sequence that is not found in nature. Thus, a non-naturally-occurring polypeptide can be a mutated version of a naturally-occurring polypeptide, or an engineered polypeptide. For example, a non-naturally-occurring polypeptide having CHS activity can be a mutated version of a naturally-occurring polypeptide having CHS activity that retains at least some CHS activity. A polypeptide can be mutated by, for example, sequence additions, deletions, substitutions, or combinations thereof.

The invention provides genetically modified cells that can be used to perform one or more steps of a metabolic pathway described herein. For example, an individual microorganism can contain exogenous nucleic acid such that each of the polypeptides necessary to perform the steps depicted in Figures 1, 2, 27, or 28 is expressed. It is important to note that such cells can contain any number of exogenous nucleic acid molecules. For example, a particular cell can contain three exogenous nucleic acid molecules with each one encoding one of the three polypeptides necessary to convert tyrosine into naringenin as depicted in Figure 1, or a particular cell can endogenously produce polypeptides necessary to convert 4-coumaroyl-CoA into naringenin while containing exogenous nucleic acids that encode polypeptides necessary to convert tyrosine into 4-coumaroyl-CoA.

In addition, a single exogenous nucleic acid molecule can encode one or more than one polypeptide. For example, a single exogenous nucleic acid molecule can contain sequences that encode three different polypeptides. Further, the cells described herein can contain a single copy, or multiple copies (e.g., about 5, 10, 20, 35, 50, 75, 100, or 150 copies), of a particular exogenous nucleic acid molecule. Again, the cells described herein can contain more than one particular exogenous nucleic acid molecule. For example, a particular cell can contain about 50 copies of exogenous nucleic acid molecule X as well as about 75 copies of exogenous nucleic acid molecule Y.

In one embodiment, the invention provides a cell containing an exogenous nucleic acid molecule that encodes a polypeptide having enzymatic activity that leads to the formation of naringenin. It is noted that the produced naringenin can be secreted from the cell, eliminating the need to disrupt cell membranes to retrieve the organic compound.

Typically, the cell of the invention produces naringenin with the concentration being at least about 1 mg per L (e.g., at least about 2.5 mg/L, 5 mg/L, 10 mg/L, 20 mg/L, 25 mg/L, 50 mg/L, 75 mg/L, 80 mg/L, 90 mg/L, 100 mg/L, or 120 mg/L). When determining the yield of an organic compound such as naringenin for a particular cell, any method can be used. See, e.g., Applied Environmental Microbiology, 59(12): 4261-4265 (1993).

A nucleic acid molecule encoding a polypeptide having enzymatic activity can be identified and obtained using any method such as those described herein. For example, nucleic acid molecules that encode a polypeptide having enzymatic activity can be identified and obtained using common molecular cloning or chemical nucleic acid synthesis procedures and techniques, including PCR. In addition, standard nucleic acid sequencing techniques and software programs that translate nucleic acid sequences into amino acid sequences based on the genetic code can be used to determine whether or not a particular nucleic acid has any sequence homology with known enzymatic polypeptides.

Sequence alignment software such as MEGALIGN® (DNASTAR, Madison, WI, 1997) can be used to compare various sequences. In addition, nucleic acid molecules encoding known enzymatic polypeptides can be mutated using common molecular cloning techniques (e.g., site-directed mutagenesis). Possible mutations include, without limitation, deletions, insertions, and base substitutions, as well as combinations of deletions, insertions, and base substitutions. Further, nucleic acid and amino acid databases (e.g., GenBank®) can be used to identify a nucleic acid sequence that encodes a polypeptide having enzymatic activity. Briefly, any amino acid sequence having some homology to a polypeptide having enzymatic activity, or any nucleic acid sequence having some homology to a sequence encoding a polypeptide having enzymatic activity can be used as a query to search GenBank®. The identified polypeptides then can be analyzed to determine whether or not they exhibit enzymatic activity.

In addition, nucleic acid hybridization techniques can be used to identify and obtain a nucleic acid molecule that encodes a polypeptide having enzymatic activity.

Briefly, any nucleic acid molecule that encodes a known enzymatic polypeptide, or fragment thereof, can be used as a probe to identify similar nucleic acid molecules by hybridization under conditions of moderate to high stringency. Such similar nucleic acid molecules then can be isolated, sequenced, and analyzed to determine whether the encoded polypeptide has enzymatic activity.

Expression cloning techniques also can be used to identify and obtain a nucleic acid molecule that encodes a polypeptide having enzymatic activity. For example, a substrate known to interact with a particular enzymatic polypeptide can be used to screen a phage display library containing that enzymatic polypeptide. Phage display libraries can be generated as described elsewhere (Burritt et al., Anal. Biochem. 238: 1-13 (1990)), or can be obtained from commercial suppliers such as Novagen (Madison, WI).

Further, polypeptide sequencing techniques can be used to identify and obtain a nucleic acid molecule that encodes a polypeptide having enzymatic activity. For example, a purified polypeptide can be separated by gel electrophoresis, and its amino acid sequence determined by, for example, amino acid microsequencing techniques.

Once determined, the amino acid sequence can be used to design degenerate oligonucleotide primers. Degenerate oligonucleotide primers can be used to obtain the nucleic acid encoding the polypeptide by PCR. Once obtained, the nucleic acid can be sequenced, cloned into an appropriate expression vector, and introduced into a microorganism.
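The design of degenerate primers from a determined amino acid sequence can be illustrated with a minimal Python sketch. It back-translates a short peptide using the standard genetic code and collapses the possible bases at each codon position into IUPAC ambiguity codes. The function name degenerate_primer and the example peptide are illustrative assumptions and are not part of the methods described herein; practical primer design would also weigh codon usage, melting temperature, and primer length.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC ambiguity codes keyed by the set of bases each code represents.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def degenerate_primer(peptide):
    """Back-translate a peptide into an IUPAC-coded oligonucleotide.

    Returns (sequence, degeneracy), where degeneracy is the number of distinct
    oligonucleotides that the IUPAC string stands for.
    """
    sequence = []
    degeneracy = 1
    for residue in peptide.upper():
        codons = [codon for codon, aa in CODON_TABLE.items() if aa == residue]
        if residue == "*" or not codons:
            raise ValueError(f"not a standard amino acid: {residue!r}")
        for position in range(3):
            # Collapse every base seen at this codon position into one IUPAC code.
            # For six-codon residues (Ser, Leu, Arg) this over-covers the true codons.
            bases = frozenset(codon[position] for codon in codons)
            sequence.append(IUPAC[bases])
            degeneracy *= len(bases)
    return "".join(sequence), degeneracy


if __name__ == "__main__":
    # Hypothetical tripeptide Met-Lys-Phe, chosen only for illustration.
    print(degenerate_primer("MKF"))
```

Peptide stretches rich in methionine and tryptophan give the least degenerate primers, because each of those residues is specified by a single codon.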

Any method can be used to introduce an exogenous nucleic acid molecule into a cell. In fact, many methods for introducing nucleic acid into microorganisms such as bacteria and yeast are well known to those skilled in the art. For example, heat shock, lipofection, electroporation, conjugation, fusion of protoplasts, and biolistic delivery are common methods for introducing nucleic acid into bacteria and yeast cells. See, e.g., Ito et al., J. Bacteriol. 153: 163-168 (1983); Durrens et al., Curr. Genet. 18: 7-12 (1990); and Becker and Guarente, Methods in Enzymology 194: 182-187 (1991).

An exogenous nucleic acid molecule contained within a particular cell can be maintained within that cell in any form. For example, exogenous nucleic acid molecules can be integrated into the genome of the cell or maintained in an episomal state. In other words, a cell of the invention can be a stable or transient transformant. Again, a microorganism described herein can contain a single copy, or multiple copies (e.g., about 5, 10, 20, 35, 50, 75, 100, or 150 copies), of a particular exogenous nucleic acid molecule as described herein.

Methods for expressing an amino acid sequence from an exogenous nucleic acid molecule are well known to those skilled in the art. Such methods include, without limitation, constructing a nucleic acid such that a regulatory element promotes the expression of a nucleic acid sequence that encodes a polypeptide. Typically, regulatory elements are DNA sequences that regulate the expression of other DNA sequences at the level of transcription. Thus, regulatory elements include, without limitation, promoters, enhancers, and the like. Any type of promoter can be used to express an amino acid sequence from an exogenous nucleic acid molecule. Examples of promoters include, without limitation, constitutive promoters, tissue-specific promoters, and promoters responsive or unresponsive to a particular stimulus (e.g., light, oxygen, chemical concentration, and the like). Moreover, methods for expressing a polypeptide from an exogenous nucleic acid molecule in cells such as bacterial cells and yeast cells are well known to those skilled in the art. For example, nucleic acid constructs that are capable of expressing exogenous polypeptides within E. coli are well known. See, e.g., Sambrook et al., Molecular cloning: a laboratory manual, Cold Spring Harbour Laboratory Press, New York, USA, second edition (1989).

As described herein, a cell can contain an exogenous nucleic acid molecule that encodes a polypeptide having enzymatic activity that leads to the formation of flavonoids (e.g., naringenin, eriodictyol, and homoeriodictyol) and other organic compounds.

Methods of identifying cells that contain exogenous nucleic acid are well known to those skilled in the art. Such methods include, without limitation, PCR and nucleic acid hybridization techniques such as Northern and Southern analysis. In some cases, immunohistochemistry and biochemical techniques can be used to determine if a cell contains a particular nucleic acid by detecting the expression of the enzymatic polypeptide encoded by that particular nucleic acid molecule. For example, an antibody having specificity for an encoded enzyme can be used to determine whether or not a particular cell contains that encoded enzyme. Further, biochemical techniques can be used to determine if a cell contains a particular nucleic acid molecule encoding an enzymatic polypeptide by detecting an organic product produced as a result of the expression of the enzymatic polypeptide. For example, detection of naringenin after introduction of exogenous nucleic acid that encodes a polypeptide having CHS activity into a cell that does not normally express such a polypeptide can indicate that the cell not only contains the introduced exogenous nucleic acid molecule but also expresses the encoded enzymatic polypeptide from that introduced exogenous nucleic acid molecule.

Methods for detecting specific enzymatic activities or the presence of particular organic products are well known to those skilled in the art. For example, the presence of a flavonoid such as naringenin can be determined as described elsewhere for other flavonoids (See, e.g., Chen et al., J. Chromatography A., 913: 387-395 (2001); Justesen et al., J. Chromatography A., 799: 101-110 (1998); and Hughes et al., Int. J. Mass Spectrom., 210/211: 371-385 (2001)).

5. Producing flavonoids and other organic compounds

The cells described herein can be used to produce flavonoids (e.g., naringenin, eriodictyol, and homoeriodictyol) and other organic compounds. For example, a microorganism can be transfected with nucleic acid that encodes a polypeptide having TAL activity, a polypeptide having 4CL activity, and a polypeptide having CHS activity.

Such a microorganism can produce more naringenin or other flavonoids than had the microorganism not been given that nucleic acid. Once transfected, the microorganism can be cultured under conditions optimal for flavonoid production.

In addition, substantially pure polypeptides having enzymatic activity can be used alone or in combination with cells to produce flavonoids or other organic compounds.

For example, a preparation containing a substantially pure polypeptide having 4CL activity can be used to catalyze the formation of 4-coumaroyl-CoA. Further, cell-free extracts containing a polypeptide having enzymatic activity can be used alone or in combination with substantially pure polypeptides and/or cells to produce flavonoids or other organic compounds. For example, a cell-free extract containing a polypeptide having 4CL activity can be used to form 4-coumaroyl-CoA, while a microorganism containing a polypeptide having CHS activity can be used to produce naringenin. Any method can be used to produce a cell-free extract. For example, osmotic shock, sonication, and/or a repeated freeze-thaw cycle followed by filtration and/or centrifugation can be used to produce a cell-free extract from intact cells.

It is noted that a cell, substantially pure polypeptide, and/or cell-free extract can be used to produce any flavonoid or other organic compound that is, in turn, treated chemically to produce another compound. For example, a microorganism can be used to produce naringenin, while a chemical process is used to modify naringenin into a derivative such as apigenin or phloretin. Likewise, a chemical process can be used to produce a particular compound that is, in turn, converted into a flavonoid or other organic compound using a cell, substantially pure polypeptide, and/or cell-free extract described herein. For example, a chemical process can be used to produce 4-coumaroyl-CoA, while a microorganism can be used to convert 4-coumaroyl-CoA into naringenin.

Typically, naringenin is produced by providing a microorganism and culturing the provided microorganism with culture medium such that naringenin is produced. In general, the culture media and/or culture conditions can be such that the microorganisms grow to an adequate density and produce naringenin efficiently. For large-scale production processes, any method can be used such as those described elsewhere (Manual of Industrial Microbiology and Biotechnology, 2nd Edition, Editors: A. L. Demain and J. E. Davies, ASM Press; and Principles of Fermentation Technology, P. F. Stanbury and A. Whitaker, Pergamon). Briefly, a large tank (e.g., a 100 gallon, 200 gallon, 500 gallon, or more tank) containing appropriate culture medium with, for example, a glucose carbon source is inoculated with a particular microorganism. After inoculation, the microorganisms are incubated to allow biomass to be produced. Once a desired biomass is reached, the broth containing the microorganisms can be transferred to a second tank.

This second tank can be any size. For example, the second tank can be larger, smaller, or the same size as the first tank. Typically, the second tank is larger than the first such that additional culture medium can be added to the broth from the first tank. In addition, the culture medium within this second tank can be the same as, or different from, that used in the first tank. For example, the first tank can contain medium with glucose, while the second tank contains medium with glycerol.

Once transferred, the microorganisms can be incubated to allow for the production of naringenin. Once produced, any method can be used to isolate the naringenin. For example, common separation techniques can be used to remove the biomass from the broth, and common isolation procedures (e.g., extraction, distillation, and ion-exchange procedures) can be used to obtain the naringenin from the microorganism-free broth. In addition, naringenin can be isolated while it is being produced, or it can be isolated from the broth after the product production phase has been terminated.

In some embodiments, naringenin can be converted into another flavonoid such as a flavonoid depicted in Figure 2 or 28. Once produced, the particular flavonoid can be isolated using common isolation procedures (e.g., extraction, distillation, and ion-exchange procedures).

The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims.

EXAMPLES

Example 1-Methods and materials

1. Chemicals

Caffeic acid, ferulic acid, and 3-(4-hydroxyphenyl)-propionic acid were obtained from Sigma Aldrich (St. Louis, MO). Naringenin, 4-coumaric acid, phloretin, and arabinose were obtained from ICN (Aurora, OH). trans-Cinnamic acid was obtained from Acros Organics (Morris Plains, NJ). All solvents were of HPLC grade and obtained from Fisher Scientific (Pittsburgh, PA). HPLC grade water was obtained from Mallinckrodt Chemicals (Phillipsburg, NJ). T4 DNA ligase and Vent DNA polymerase were obtained from New England Biolabs (Boston, MA). Restriction enzymes were obtained from NEB or Promega (Madison, WI), and restriction enzyme buffers (the SuRE/Cut buffers) were obtained from Roche (Indianapolis, IN).

2. Strains and culture conditions

All cloning and DNA manipulations were carried out in E. coli JM109 using standard techniques (Sambrook and Russell, Molecular Cloning-A Laboratory Manual, Vol. 3, Third ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 2001) and grown at 30°C with 300 rpm shaking. Following sequencing, plasmids were transformed into E. coli strain BW27784 provided by the E. coli Genetic Stock Center (New Haven, CT), for expression (Table 2; Khlebnikov et al., Microbiology, 147, 3241-3247 (2001)).

Table 2: Strains and plasmids used.

Strain or plasmid | Properties or genotype | Source
Strains | |
E. coli JM109 | recA1 supE44 endA1 hsdR17 (rK- mK+) gyrA96 relA1 thi Δ(lac-proAB) [F' traD36 proAB+ lacIq lacZΔM15] | 1
E. coli BW27784 | lacIq rrnB3 ΔlacZ4787 hsdR514 Δ(araBAD)567 Δ(rhaBAD)568 Δ(araFGH) Φ(ΔaraEp PCP18-araE) | 2
Rba. capsulatus 1710 | Type strain | 3
Rba. sphaeroides 158 | Type strain | 3
Plasmids | |
pUCMod | Cloning vector, constitutive lac promoter, Ampr | 4
pACMod | Cloning vector, Tetr, Cmr | 4
pBADMod1 | Cloning vector from pBAD-Thio/TOPO, Ampr |
pBADMod2 | Cloning vector, Ampr |
pBADMod1-PAL | Arabinose inducible PAL from A. thaliana |
pBADMod1-C4H | Arabinose inducible C4H from A. thaliana |
pBADMod1-4CL | Arabinose inducible 4CL from A. thaliana |
pBADMod1-CHS | Arabinose inducible CHS from A. thaliana |
pACMod-PAL/C4H | Arabinose inducible PAL and C4H, Tetr |
pBADMod2-4CL/CHS | Arabinose inducible 4CL and CHS, Ampr |
pUCMod-TAL | Constitutively expressed TAL from Rba. sphaeroides |
pACMod-TAL | Constitutively expressed TAL from Rba. sphaeroides, Cmr |

1: Yanisch-Perron et al., Gene, 33, 103-119 (1985).

2: Khlebnikov et al., Microbiology, 147, 3241-3247 (2001).

3: Obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ, Braunschweig, Germany).

4: Schmidt-Dannert et al., Nat. Biotechnol., 18, 750-753 (2000).

Rba. capsulatus (DSM No. 1710) and Rba. sphaeroides (DSM No. 158) were obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ, Braunschweig, Germany). Rba. capsulatus was grown anaerobically at 30°C under direct light in modified Van Niel's medium (ATCC medium 1676) for more than 5 days. Rba. sphaeroides 158 was grown aerobically at 30°C in Luria-Bertani (LB) medium for 3 days. Genomic DNA was prepared with Wizard Genomic DNA kit from Promega. E. coli harboring either the Arabidopsis pathway (pACMod-PAL/C4H + pBADMod2-4CL/CHS) or TAL pathway (pACMod-TAL + pBADMod2-4CL/CHS) was grown in a modified M9, LB, or Terrific broth (TB) medium, supplemented with tetracycline (12.5 mg mL-1) or chloramphenicol (50 mg mL-1) and carbenicillin or ampicillin (100 mg mL-1) to OD600 = 0.4-0.6 and induced with arabinose (0.2% m/v). M9 medium was modified by addition of yeast extract (1.25 g L-1) and glycerol (0.5% v/v) into standard M9 medium (Sambrook and Russell, Molecular Cloning-A Laboratory Manual, Vol. 3, Third ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 2001).

3. Plasmid construction and nucleic acid cloning

pBADMod1 was constructed from pBAD/Thio-TOPO (Invitrogen, Carlsbad, CA) by elimination of the NcoI/PmeI fragment using long-range PCR with primers (5'-GGCGCGCCTTAAACAAAATTATTTCTAG-3', SEQ ID NO: 37; and 5'-TAATTAAGGTCTCCAGCTTGGCTG-3', SEQ ID NO: 38) to introduce unique AscI and PacI sites downstream of the arabinose promoter. pBADMod2 was constructed in the same way by using primers (5'-GGTACCCTCGAGGTTTAAACAAGCTTCGCTTCTCTGAGTAGGAC-3', SEQ ID NO: 39; and 5'-CCATGGGCGGCCGCGAATTCGTCGACCTCTGAATGGCGGGAG-3', SEQ ID NO: 40) to eliminate the arabinose promoter and terminator and introduce a multiple cloning site. pUCMod and pACMod have been described elsewhere (Schmidt-Dannert et al., Nat. Biotechnol., 18, 750-753 (2000)).

Nucleic acid sequences encoding a polypeptide having PAL activity (GenBank Accession No. AY303128), a polypeptide having C4H activity (GenBank Accession No. U71080), a polypeptide having 4CL activity (GenBank Accession No. U18675), and a polypeptide having CHS activity (GenBank Accession No. AF112086) were cloned from a pFL61 Arabidopsis thaliana cDNA library obtained from the American Type Culture Collection (Manassas, VA, ATCC No. 77500) with forward primers containing a 5' AscI site followed by an optimized Shine-Dalgarno sequence (5'-AGGAGGATTACAAAATG-3', SEQ ID NO: 41) and the start codon for each gene, followed by an additional 10-15 nucleotides corresponding to the respective gene sequences. Reverse primers contained a PacI site for directional cloning into pBADMod1. PCR was carried out with Vent polymerase, and conditions were as follows: 94°C for 2 minutes; 30 cycles of 94°C for 30 seconds, 50°C for 30 seconds, and 72°C for 1 minute; followed by a final extension step at 72°C for 4 minutes. The nucleic acid sequences encoding a polypeptide having PAL activity and a polypeptide having C4H activity were subcloned, along with the arabinose promoter from pBADMod1, into pACMod using the NcoI and EcoRI sites, respectively, to create pACMod-PAL/C4H. The nucleic acid sequences encoding a polypeptide having 4CL activity and a polypeptide having CHS activity were subcloned in the same way into the NcoI and XhoI sites, respectively, of pBADMod2 to create pBADMod2-4CL/CHS.
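The primer layout described above can be sketched in Python as follows. The AscI (GGCGCGCC) and PacI (TTAATTAA) recognition sequences and the Shine-Dalgarno sequence of SEQ ID NO: 41 are taken as given; the function names, the toy coding sequence, and the length of the gene-specific part of the reverse primer are assumptions made only for illustration, and any extra 5' bases that may have been added ahead of the restriction sites are not modeled.

```python
ASCI_SITE = "GGCGCGCC"  # AscI recognition sequence
PACI_SITE = "TTAATTAA"  # PacI recognition sequence
# Shine-Dalgarno sequence of SEQ ID NO: 41; its last three bases are the ATG start codon.
SD_WITH_START = "AGGAGGATTACAAAATG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def forward_primer(cds, gene_nt=15):
    """AscI site + Shine-Dalgarno/ATG + the next 10-15 nt of the coding sequence."""
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with the ATG start codon")
    if not 10 <= gene_nt <= 15:
        raise ValueError("the text specifies 10-15 gene-specific nucleotides")
    return ASCI_SITE + SD_WITH_START + cds[3:3 + gene_nt]


def reverse_primer(cds, gene_nt=15):
    """PacI site + reverse complement of the 3' end of the coding sequence.

    The gene-specific length used here is an assumption; the text does not state it.
    """
    return PACI_SITE + reverse_complement(cds[-gene_nt:])


if __name__ == "__main__":
    # Hypothetical 30-nt coding sequence, not one of the genes named in the text.
    toy_cds = "ATGGCTAGCAAAGGTGAACTGTTCACCTAA"
    print(forward_primer(toy_cds))
    print(reverse_primer(toy_cds))
```

Because the two primers carry different restriction sites, the amplified fragment can be ligated into the AscI and PacI sites of pBADMod1 in only one orientation, which is what makes the cloning directional.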

Nucleic acid encoding a polypeptide having TAL activity (hypothetical protein listed as GenBank Accession No. ZP00005404) was cloned from Rba. sphaeroides 158 genomic DNA into XbaI/SmaI sites of pUCMod using primers designed as described above with the forward primer providing a Shine-Dalgarno sequence and start codon.

PCR conditions were the same as described above except for addition of DMSO (10% v/v) and betaine monohydrate (final concentration 1 M). The nucleic acid encoding a polypeptide having TAL activity was subcloned into the BamHI site of pACMod to create pACMod-TAL.

4. Feeding experiments

Overnight cultures (5 mL) of E. coli transformants harboring pACMod-PAL/C4H + pBADMod2-4CL/CHS, pBADMod2-4CL/CHS, or pBADMod2 alone were inoculated (1:100) into modified M9 medium (50 mL) supplemented with tetracycline and carbenicillin or carbenicillin alone. Cultures were induced with arabinose and supplemented with 4-coumaric acid, trans-cinnamic acid, caffeic acid, ferulic acid, or 3-(4-hydroxyphenyl)propionic acid (5 mg) and allowed to grow for an additional 24 hours before harvest. Additional E. coli controls carrying plasmids pBADMod1-4CL or pBADMod1-CHS were tested in the same way as above with 3-(4-hydroxyphenyl)propionic acid.

5. Growth curves

Overnight cultures (5 mL) of recombinant E. coli pACMod-TAL + pBADMod2-4CL/CHS were inoculated 1:200 into modified M9 and TB medium (250 mL) supplemented with chloramphenicol and carbenicillin. Cultures (10 mL) were harvested at induction for the initial production time point and samples (10 mL) were removed at 12, 24, 36, and 48 hours after induction. Samples were centrifuged for 25 minutes at 4000 rpm at 4°C to remove cells from culture media. Cell pellets were washed once with deionized water and frozen, along with the culture supernatants, at -20°C prior to extraction.

6. Extraction conditions

Methanol (5 mL) was added to thawed cell pellets and placed in a sonicating water bath for one hour at 4°C. Cell debris was removed by centrifugation, and methanol was decanted to a fresh conical tube. Water was added to give the final volume (15 mL). The pH of the water/methanol mixture was adjusted (to approximately 9.0) to spontaneously convert chalcones to the corresponding flavanones, which aids detection and quantification of products (Mol et al., Phytochemistry, 24, 2267-2269 (1985)). The mixture was allowed to sit for one hour at room temperature, followed by two extractions with an equal volume (15 mL) of ethyl acetate. The pooled organic phase was frozen at -80°C for more than 2 hours, then allowed to warm to room temperature, and residual water was removed. The ethyl acetate was dried under vacuum and resuspended in acetonitrile (100-200 µL).

Culture supernatants (10 mL) were pH adjusted the same as above and incubated at room temperature for one hour and then extracted twice with an equal volume (10 mL) of ethyl acetate. The pooled organic phase was frozen and dried in the same way as above and resuspended in acetonitrile (100 µL). All samples were stored at -20°C prior to HPLC and MS analysis. Extractions of 4-coumaric acid, 3-(4-hydroxyphenyl)propionic acid, and phloretin were conducted in the same way as above but without adjusting the pH of the culture medium prior to extraction.

7. HPLC analysis

Pellet and culture supernatant extracts (10 µL) were applied to a Zorbax SB-C18 column (4.6 x 250 mm, 5 µm; Agilent Technologies, Palo Alto, CA) and eluted with an isocratic mobile phase of water:acetonitrile:acetic acid (69.3:30:0.7; flow rate 1 mL min-1) using an Agilent 1100 HPLC system equipped with a photodiode array detector.

Compound peaks were identified by comparison to retention times and UV/Vis spectra of standard compounds. Compounds were quantified by comparing the peak areas of unknowns with the integrated peak areas of known amounts of standard.
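The quantification arithmetic can be sketched in Python as a single-point external standard calculation, followed by a back-calculation to a culture titer using the volumes given above for supernatant samples (10 mL culture, 100 µL resuspension, 10 µL injection). The function names and the peak areas are hypothetical, and the sketch assumes a linear detector response through the origin and complete extraction recovery, neither of which is stated in the text.

```python
def amount_from_peak_area(area_unknown, area_standard, amount_standard_ug):
    """Single-point external standard: scale the unknown's peak area by the
    response factor (amount per unit area) measured for the standard."""
    if area_standard <= 0:
        raise ValueError("standard peak area must be positive")
    response_factor = amount_standard_ug / area_standard
    return area_unknown * response_factor


def titer_mg_per_l(amount_injected_ug, resuspension_ul, injection_ul, culture_ml):
    """Back-calculate culture concentration from the amount found in one injection.

    Assumes all product from culture_ml of culture ends up in resuspension_ul
    of solvent (100% recovery). Note that ug/mL is numerically equal to mg/L.
    """
    total_in_extract_ug = amount_injected_ug * (resuspension_ul / injection_ul)
    return total_in_extract_ug / culture_ml


if __name__ == "__main__":
    # Hypothetical peak areas: 1.0 ug of standard gives area 4000; the unknown gives 2000.
    injected = amount_from_peak_area(2000.0, 4000.0, 1.0)
    print(injected, "ug in the injection")
    print(titer_mg_per_l(injected, 100.0, 10.0, 10.0), "mg/L in the culture")
```

A multi-point calibration curve would be the more robust choice where the response is not strictly proportional to the amount injected.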

8. LC/ESI-MS and LC/MS/MS

LC-Mass spectrometry was carried out with a LCQ mass spectrophotometer (Thermo Finnigan, USA) equipped with a Zorbax SB-C18 column under the same elution conditions as HPLC analysis. Mass fragmentation spectra of standard compounds and the extracted compounds were monitored in a mass range of m/z 60-400 with a negative electron spray ionization (ESI) interface (Lee et al., Chem. Biol., 10, 453-462 (2003)).

Parent molecular ions were further fragmented by MS/MS analysis using an ESI interface at optimal collision-induced dissociation energy (25-30%). Negative ion values for standard compounds were as follows: 4-coumaric acid (m/z 163.1), trans-cinnamic acid (m/z 146.9), naringenin (m/z 271.1), and phloretin (m/z 273.1).
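As a consistency check on the negative ion values listed above, the expected m/z of each singly deprotonated ion, [M-H]-, can be computed from the molecular formula. The Python sketch below uses monoisotopic atomic masses and subtracts one proton mass; the molecular formulas are standard values supplied here and are not given in the text, and the function and variable names are illustrative.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "O": 15.994915}
PROTON = 1.007276

# Molecular formula and the negative ion value reported in the text.
COMPOUNDS = {
    "4-coumaric acid": ({"C": 9, "H": 8, "O": 3}, 163.1),
    "trans-cinnamic acid": ({"C": 9, "H": 8, "O": 2}, 146.9),
    "naringenin": ({"C": 15, "H": 12, "O": 5}, 271.1),
    "phloretin": ({"C": 15, "H": 14, "O": 5}, 273.1),
}


def deprotonated_mz(formula):
    """m/z of the singly charged [M-H]- ion: neutral mass minus one proton."""
    neutral_mass = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral_mass - PROTON


if __name__ == "__main__":
    for name, (formula, reported) in COMPOUNDS.items():
        calculated = deprotonated_mz(formula)
        print(f"{name}: calculated {calculated:.2f}, reported {reported}")
```

The calculated values (about 163.04, 147.05, 271.06, and 273.08) each lie within roughly 0.15 m/z units of the reported values.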

Example 2-Cloning and assembly of naringenin pathway in E. coli

Nucleic acid sequences encoding a polypeptide having PAL activity and a polypeptide having C4H activity were cloned into a medium copy number plasmid pACMod (Table 2) under the control of the arabinose promoter (pACMod-PAL/C4H). Nucleic acid sequences encoding a polypeptide having 4CL activity and a polypeptide having CHS activity were cloned onto a high copy number plasmid pBADMod2 (pBADMod2-4CL/CHS) also with the arabinose promoter. This modified pBAD plasmid also contained the arabinose repressor, AraC, to control gene expression from the arabinose promoter (Guzman et al., J. Bacteriol., 177, 4121-4130 (1995)). These two plasmids (pACMod-PAL/C4H + pBADMod2-4CL/CHS) were co-transformed into E. coli BW27784, a strain that overexpresses a chromosomal low affinity, high-capacity arabinose permease, AraE (Khlebnikov et al., Microbiology, 147, 3241-3247 (2001)).

After 24 hours induction, culture supernatants and pellets of cultures grown in modified M9, LB, and TB medium were extracted and analyzed by HPLC. Only trans-cinnamic acid was detected (Figure 17; panel B) in both culture supernatants and cell pellets, with the majority found in the culture supernatants, indicating a blockage after the first enzymatic step catalyzed by a PAL activity (Figure 1). When protein expression levels were checked by SDS-PAGE, the recombinant polypeptides were found in both the soluble and insoluble fractions.

These results suggest that a cytochrome P450 monooxygenase is non-functional in E. coli since trans-cinnamic acid was not hydroxylated to 4-coumaric acid by the polypeptide having C4H activity. To investigate whether the subsequent polypeptides in the pathway were functional, exogenous 4-coumaric acid was fed at induction to recombinant E. coli expressing pACMod-PAL/C4H + pBADMod2-4CL/CHS grown in modified M9 medium. After 24 hours induction, the culture was harvested, and naringenin was detected by HPLC (Figure 17; panel C) in both the culture supernatant and cell pellet, with the majority found in the culture supernatant. Naringenin was identified by LC-MS/MS (m/z 271.1) and comparison of the obtained fragmentation pattern with that of an authentic standard and literature data (Hughes et al., Int. J. Mass Spectrom., 210-211, 371-385 (2001)). No residual 4-coumaric acid was detected, indicating that 4-coumaric acid can be efficiently transported and metabolized by E. coli expressing polypeptides having 4CL and CHS activities. High levels of trans-cinnamic acid were detected due to the functional PAL still present in the assembled four-gene pathway.

To confirm the function of the polypeptide having 4CL activity and the polypeptide having CHS activity in a background devoid of PAL and C4H activities, 4-coumaric acid was fed in the same way to E. coli transfected with only the pBADMod2-4CL/CHS plasmid. The transfected E. coli produced naringenin with no detectable trans-cinnamic acid (Figure 17; panel D) as determined by HPLC and LC-MS. No naringenin was detected in unfed control cultures harboring pBADMod2-4CL/CHS.

Example 3-Feeding of additional phenylpropanoid precursors Caffeic, ferulic, and 3-(4-hydroxyphenyl)propionic acids were fed to E. coli cultures harboring pBADMod2-4CL/CHS to examine the substrate specificities of the polypeptide having 4CL activity and the polypeptide having CHS activity in vivo.

Caffeic and ferulic acids were not converted to the corresponding chalcones or flavanones (eriodictyol and homoeriodictyol, respectively) in modified M9 or TB media as determined by HPLC. Cultures fed with 3-(4-hydroxyphenyl)propionic acid, however, produced both the expected product, phloretin (m/z 273.1), and the 4-coumaric acid product, naringenin (Figure 1), in equal amounts after 24 hours of cultivation as determined by HPLC and LC-MS analysis. In addition, 4-coumaric acid (m/z 163.0) accumulated to a large extent, with no detectable levels of 3-(4-hydroxyphenyl)propionic acid seen (Figure 18).

To determine whether phloretin was converted to naringenin by E. coli or during the extraction process, phloretin was fed to control cultures containing empty vector (pBADMod2) at induction. After 24 hours, the culture was extracted and found to contain phloretin with no detectable naringenin. Extraction at pH 9.0 and extraction without adjusting the pH were both tested and found to be identical. Next, it was tested whether E. coli metabolized 3-(4-hydroxyphenyl)propionic acid into 4-coumaric acid by feeding 3-(4-hydroxyphenyl)propionic acid to control E. coli cultures containing empty vector (pBADMod2). After 24 hours, no 4-coumaric acid was detected, and only 3-(4-hydroxyphenyl)propionic acid was found. E. coli cultures expressing either the polypeptide having 4CL activity or the polypeptide having CHS activity alone were individually fed with 3-(4-hydroxyphenyl)propionic acid. E. coli expressing the polypeptide having 4CL activity alone converted 3-(4-hydroxyphenyl)propionic acid to 4-coumaric acid, indicating that there may be an unknown E. coli enzyme that acts on the CoA ester of 3-(4-hydroxyphenyl)propionic acid. With the polypeptide having CHS activity alone, only 3-(4-hydroxyphenyl)propionic acid was detected without any conversion.

Example 4-Cloning and expression of Rba. sphaeroides TAL Cloning of a recently described polypeptide having TAL activity from Rhodobacter capsulatus was attempted (Kyndt et al., FEBS Lett., 512, 240-244 (2002)).

The Rhodobacter TAL can produce 4-coumaric acid from tyrosine required for the formation of the chromophore of a photoactive yellow protein (Cusanovich and Meyer, Biochemistry, 42, 4759-4770 (2003)). Following the procedures described, PCR repeatedly failed to amplify a product of the expected size from genomic DNA.

A BLAST search was conducted using the available Rba. capsulatus amino acid sequence of the polypeptide having TAL activity as query. The BLAST search revealed a hypothetical polypeptide (GenBank Accession No. ZP_00005404) from Rba. sphaeroides with 51 percent amino acid identity. The nucleic acid sequence encoding this polypeptide was amplified from genomic DNA and cloned into pUCMod to produce pUCMod-TAL for expression under control of a constitutive lac promoter. E. coli cells containing pUCMod-TAL were able to produce 4-coumaric acid but not trans-cinnamic acid (the deamination products of tyrosine and phenylalanine, respectively) as determined by HPLC and LC-MS. Production of 4-coumaric acid was highest in TB medium, followed by modified M9 and LB.
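
For readers unfamiliar with the identity figure quoted above, percent identity is conventionally the number of identical aligned residues divided by the alignment length, gap columns included. The sketch below illustrates that arithmetic on two short, entirely hypothetical aligned fragments; it does not use the actual TAL sequences and is not a substitute for a BLAST alignment.

```python
# Percent identity over a pairwise alignment (toy example, hypothetical sequences).

def percent_identity(aligned_a, aligned_b):
    """Identical non-gap columns divided by total alignment columns, as a percentage."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)

# Hypothetical aligned fragments; "-" marks a gap.
query = "MKT-LLVA"
subject = "MRTALLIA"

identity = percent_identity(query, subject)
print(f"{identity:.1f} percent identity")
```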

The nucleic acid encoding the polypeptide having TAL activity was subcloned into pACMod to allow co-expression in E. coli with pBADMod2-4CL/CHS.

Transformation of pACMod-TAL into E. coli resulted in the production of 4-coumaric acid (2.30 mg L-1) in the culture supernatant after 24 hours of cultivation in modified M9 medium (Figure 19; panel B).

Example 5-Production of naringenin in E. coli with a three-gene hybrid pathway To establish a functional hybrid pathway for naringenin production, pACMod-TAL and pBADMod2-4CL/CHS were co-transformed into E. coli BW27784. E. coli cells expressing this three-gene pathway (TAL + 4CL + CHS) were grown in modified M9, LB, and TB medium, and the culture media were extracted after 24 hours of induction. Naringenin was detected in all culture supernatants and cell pellets examined, with the majority found in the culture supernatants (Figure 19; panel C).

E. coli cells expressing the TAL-4CL-CHS hybrid pathway were cultured in modified M9 and TB medium to monitor naringenin production levels during growth.

Samples were removed from the cultures every 12 hours following induction with arabinose for quantification of naringenin by HPLC. Naringenin production was highest in TB and seen almost exclusively in the culture media, which accounted for more than 90 percent of the total production amount. In TB medium (Figure 20; panel A), naringenin was not detected at induction, but increased at 12 (1.45 mg L-1), 24 (7.65 mg L-1), 36 (13.5 mg L-1), and 48 hours (20.8 mg L-1) after induction. In modified M9 medium (Figure 20; panel B), naringenin was also not detected at induction, but increased at 12 (0.93 mg L-1), 24 (4.89 mg L-1), 36 (7.39 mg L-1), and 48 hours (7.53 mg L-1) after induction. Production in the cell pellet reached a maximum in modified M9 medium 36 hours after induction (0.43 mg L-1) and in TB 48 hours after induction (0.73 mg L-1), which account for 5.8 percent and 2.9 percent of total production at those times, respectively.
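
The time-course values above can be turned into average production rates per sampling interval, which makes the contrast between the two media easier to see. The sketch below uses only the titers reported in the text; treating "not detected" at induction as 0 mg L-1 is an assumption made for the calculation.

```python
# Average naringenin production rate per interval, from the reported titers.
# Assumption: "not detected" at induction is treated as 0.0 mg/L.

tb = [(0, 0.0), (12, 1.45), (24, 7.65), (36, 13.5), (48, 20.8)]   # (hours, mg/L)
m9 = [(0, 0.0), (12, 0.93), (24, 4.89), (36, 7.39), (48, 7.53)]   # (hours, mg/L)

def interval_rates(series):
    """Return ((t_start, t_end), mg/L/h) for each pair of consecutive time points."""
    return [
        ((t1, t2), (c2 - c1) / (t2 - t1))
        for (t1, c1), (t2, c2) in zip(series, series[1:])
    ]

tb_rates = interval_rates(tb)
m9_rates = interval_rates(m9)

for label, rates in (("TB", tb_rates), ("modified M9", m9_rates)):
    for (start, end), rate in rates:
        print(f"{label}: {start}-{end} h, {rate:.3f} mg/L/h")
```

On these numbers, production in TB continues at roughly 0.5 to 0.6 mg L-1 per hour through 48 hours, whereas production in modified M9 drops to nearly zero in the final interval.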

These results indicate that microorganisms transfected with nucleic acid encoding a polypeptide having TAL activity, a polypeptide having 4CL activity, and a polypeptide having CHS activity can produce high levels of naringenin. In addition, these results demonstrate that E. coli can produce greater than 20 mg of naringenin per liter, which is a 250-fold increase over another report when no tyrosine is fed into the culture media (Hwang et al., Appl. Environ. Microbiol., 69, 2699-2706 (2003)).

Example 6-Cloning and expression of nucleic acid encoding a polypeptide having NADPH-cytochrome p450 reductase activity Flavonoid pathways contain many cytochrome p450 monooxygenases including polypeptides having C4H activity. Polypeptides having C4H activity can convert trans-cinnamic acid, which can be produced by polypeptides having PAL activity, into 4-coumaric acid. As disclosed in Example 2, an Arabidopsis thaliana polypeptide having C4H activity was found to lack function when expressed in E. coli.

The following experiment was performed to determine whether expression of a polypeptide having NADPH-cytochrome p450 reductase activity could allow the A. thaliana polypeptide having C4H activity to be active in E. coli. Nucleic acid encoding an A. thaliana NADPH-cytochrome p450 reductase (AtR2) polypeptide was obtained using sequence specific PCR primers in a PCR reaction with an Arabidopsis cDNA library obtained from the ATCC. The nucleic acid and amino acid sequences for the AtR2 polypeptide are available on GenBank (GenBank Accession Number NM_119167).

The PCR product with the expected size (about 2.2 kb) was purified and digested with XbaI/NotI for cloning into a modified pUC19 plasmid, pUCMod. The nucleic acid was sequenced and found to match the sequence provided in GenBank Accession Number NM_119167.

E. coli expressing the polypeptide having PAL activity and the polypeptide having C4H activity (PAL + C4H) were transfected with the nucleic acid encoding the AtR2 polypeptide to produce E. coli expressing all three polypeptides (PAL + C4H + AtR2).

When cultured as described above, the E. coli expressing all three polypeptides (PAL + C4H + AtR2) exhibited C4H activity in vivo (conversion of trans-cinnamic acid into 4-coumaric acid) as determined by HPLC analysis, while E. coli lacking expression of the AtR2 polypeptide (PAL + C4H) exhibited no C4H activity (Figure 21). The large peak to the right of the 4-coumaric acid peak corresponds to trans-cinnamic acid.

To increase the activity and/or expression level of the AtR2 polypeptide, the nucleic acid encoding the AtR2 polypeptide is constructed to encode an AtR2 polypeptide having an N-terminal deletion as described elsewhere (Hull and Celenza, Prot. Expr. Purif., 18, 310-315 (2000)). In addition, other polypeptides can be used with or instead of the AtR2 polypeptide. For example, a polypeptide having isoflavone synthase activity (IFS) can be obtained from Medicago truncatula and used in conjunction with the AtR2 polypeptide to produce isoflavones in E. coli.

Example 7-Cloning and expression of nucleic acid encoding polypeptides having CHS activity Nucleic acid encoding Medicago truncatula polypeptides having chalcone synthase activity were provided by Dr. Deborah Samac's laboratory at the University of Minnesota. The nucleic acid and amino acid sequences are set forth in Figures 22-26.

Nucleic acid encoding the CHS5 polypeptide was subcloned into pUCMod behind a constitutive lac promoter for complementation to produce a pUC-CHS5 plasmid. E. coli transfected with the pUC-CHS5 plasmid were tested for the ability to use both 4-coumaroyl-CoA and additional CoA thioesters using a substrate feeding experiment.

Briefly, the pUC-CHS5 plasmid was introduced into E. coli cells containing the plasmid pAC-TAL/4CL or pAC-4CL. The pAC-TAL/4CL and pAC-4CL plasmids contain nucleic acid encoding TAL and 4CL polypeptides or 4CL polypeptide only behind a constitutive lac promoter so that induction with arabinose is not necessary.

With E. coli containing pAC-TAL/4CL + pUC-CHS5, the cells were grown for 24 hours, and the culture media was harvested after the cells were removed by centrifugation. The resulting media was extracted and analyzed. Naringenin was detected. With E. coli containing pAC-4CL + pUC-CHS5, the cells were grown to OD 0.4-0.6 and then fed 5.0 mg of either ferulic, caffeic, or 3-(4-hydroxyphenyl)propionic acid. After an additional 24 hour incubation, the cells were removed, and the media extracted and analyzed. Cells fed 3-(4-hydroxyphenyl)propionic acid produced phloretin, which is similar to the results obtained using the Arabidopsis CHS polypeptide. Cells fed caffeic acid produced detectable levels of eriodictyol. These results demonstrate that cells can be engineered to express polypeptides that allow the cells to produce new organic compounds such as flavonoids by feeding the cells particular substrates.

Example 8-Cloning and expression of nucleic acid encoding polypeptides having STS activity Nucleic acid encoding a polypeptide having STS activity was cloned from peanut (Arachis hypogaea). Once cloned, the nucleic acid was sequenced and found to be different from the sequence provided in GenBank accession number AB027606 (Figure 35). In particular, there were nine amino acid differences.

E. coli designed to express the nucleic acid encoding a polypeptide having STS activity as well as nucleic acid encoding Rhodobacter sphaeroides TAL and A. thaliana 4CL produced a stilbene compound, resveratrol. This compound was extracted from the E. coli growth media in the same manner as described herein for naringenin. Briefly, cells were removed by centrifugation after about 24 hours of growth. The liquid media was decanted to a fresh tube and extracted with ethyl acetate. The pH of the liquid media optionally can be adjusted with hydrochloric acid prior to extraction to increase yield.

In addition, an in vivo feeding technique was used to produce several flavonoid compounds. This technique was similar to those described herein except that instead of adding a 5 mg quantity of a substrate (e.g., 4-coumaric acid) directly to a growing E. coli culture, a quantity of substrate was added in a small volume of DMSO or any possible solvent (e.g., methanol, ethanol, water, etc.) to make a concentrate in the solvent. This concentrate was then diluted to a working concentration in the culture. For example, a 1 molar solution of 4-coumaric acid was made in DMSO and then diluted to 1 mM for the final concentration in the growing culture. In one experiment, resveratrol was produced by and obtained from E. coli cultures that (1) were designed to express a polypeptide having 4CL activity and a polypeptide having STS activity and (2) were fed 4-coumaric acid. The production of additional stilbene compounds, piceatannol and isorhapontigenin, was also observed via feeding the E. coli cultures caffeic and ferulic acids, respectively. Each of these stilbene compounds was extracted in a manner similar to those described herein.
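
The dilution from a 1 molar stock to a 1 mM working concentration follows the usual C1 x V1 = C2 x V2 relationship. The sketch below works through it for a hypothetical 50 mL culture; the culture volume is an assumption (none is stated for this step), the molar mass of 4-coumaric acid is taken as 164.16 g/mol, and the small volume added by the stock itself is ignored.

```python
# Dilution arithmetic for substrate feeding (C1 * V1 = C2 * V2).
# Assumptions: 50 mL culture volume (hypothetical), stock volume negligible.

COUMARIC_ACID_G_PER_MOL = 164.16  # molar mass of 4-coumaric acid (C9H8O3)

def stock_volume_ul(stock_molar, final_molar, culture_ml):
    """Volume of stock (µL) needed to reach final_molar in culture_ml of culture."""
    return final_molar * culture_ml * 1000.0 / stock_molar

def substrate_mass_mg(final_molar, culture_ml, g_per_mol):
    """Mass of substrate (mg) present at final_molar in culture_ml of culture."""
    return final_molar * (culture_ml / 1000.0) * g_per_mol * 1000.0

volume_ul = stock_volume_ul(stock_molar=1.0, final_molar=1e-3, culture_ml=50.0)
mass_mg = substrate_mass_mg(1e-3, 50.0, COUMARIC_ACID_G_PER_MOL)

print(f"add {volume_ul:.0f} µL of 1 M stock to 50 mL culture")
print(f"this delivers about {mass_mg:.1f} mg of 4-coumaric acid")
```

Under these assumptions the 1:1000 dilution corresponds to 50 µL of stock, delivering roughly 8 mg of substrate, which is the same order of magnitude as the 5 mg quantities added directly in the earlier examples.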

Example 9-Cloning and expression of nucleic acid encoding polypeptides having FHT and FLS activity Nucleic acid encoding a polypeptide having FHT activity was cloned from A. thaliana (Figure 36). In addition, nucleic acid encoding a polypeptide having FLS activity was cloned from A. thaliana (Figure 37). When the nucleic acid encoding a polypeptide having FHT activity was expressed in E. coli, the dihydroflavonol class of compounds was produced after using the in vivo feeding technique described herein to feed flavanones such as naringenin, eriodictyol, etc. as substrates. In particular, dihydrokaempferol was produced from E. coli expressing FHT that had been fed naringenin, while dihydroquercetin was produced when the E. coli were fed eriodictyol.

The dihydroflavonols were extracted from the liquid media as described herein for other flavonoid classes and were readily detected on HPLC.

Flavonols were produced by co-expressing FHT and FLS in conjunction with feeding of flavanone (e.g., naringenin, eriodictyol, etc.) substrates. In particular, kaempferol was produced by E. coli that had been fed naringenin and that expressed both FHT and FLS polypeptides. Quercetin was produced by E. coli that had been fed eriodictyol and that expressed both FHT and FLS polypeptides. Small quantities of these flavonols were purified by extraction from the liquid media, but the vast majority was purified from the materials that were pelleted with the cells since the flavonols appeared to be water insoluble. Briefly, after centrifugation and decanting the media, a small amount of water (e.g., 50-150 µL) was added, and the cell material removed. The steps of centrifugation, water addition, and cell material removal were repeated several times. The flavonols can be purified away from the cell pellet using other methods such as solid phase extraction or gel filtration chromatography.

In addition, both dihydroflavonols and flavonols can be produced by (1) co-expressing 4CL and CHS along with FHT or FHT and FLS, and (2) in vivo feeding of phenylpropionic acids (e.g., 4-coumaric acid, caffeic acid, etc.) to produce the corresponding dihydroflavonol or flavonol. For example, E. coli expressing 4CL, CHS, and FHT, that are fed 4-coumaric acid, can produce dihydrokaempferol. Inclusion of FLS to that pathway can produce kaempferol.

OTHER EMBODIMENTS It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims.

Other aspects, advantages, and modifications are within the scope of the following claims.