Title:
ALDEHYDES AND METHODS OF SYNTHESIS BY CATALYSIS WITH CAROTENOID CLEAVAGE DIOXYGENASE ENZYMES
Document Type and Number:
WIPO Patent Application WO/2015/168121
Kind Code:
A1
Abstract:
Methods of producing aldehydes from unsaturated fatty acids using an isolated carotenoid cleavage dioxygenase (CCD) enzyme are provided. Aldehydes and flavor additives produced by catalysis of an unsaturated fatty acid, or salt or ester derivative thereof, with an isolated CCD enzyme are also provided. Briefly described, embodiments of the present disclosure provide methods of producing aldehydes from unsaturated fatty acids and their derivatives using isolated carotenoid cleavage dioxygenase (CCD) enzymes, and aldehydes produced by these methods.

Inventors:
STEWART JON DALE (US)
Application Number:
PCT/US2015/027989
Publication Date:
November 05, 2015
Filing Date:
April 28, 2015
Assignee:
UNIV FLORIDA (US)
International Classes:
C12P7/24
Domestic Patent References:
WO2003003849A2 2003-01-16
WO2004053070A2 2004-06-24
Foreign References:
US20130149756A1 2013-06-13
US20130167263A1 2013-06-27
Other References:
MUTTI, F.: "Alkene Cleavage Catalysed by Heme and Nonheme Enzymes: Reaction Mechanisms and Biocatalytic Applications", BIOINORGANIC CHEMISTRY AND APPLICATIONS, vol. 2012, 2012, pages 1 - 14
EL HADI ET AL.: "Advances in Fruit Aroma Volatile Research", MOLECULES, vol. 18, 11 July 2013 (2013-07-11), pages 8200 - 8229, XP055235262, DOI: doi:10.3390/molecules18078200
Attorney, Agent or Firm:
GORMAN, Heather C. et al. (LLP, 400 Interstate North Parkway, SE, Suite 150, Atlanta, GA, US)
Claims:

1. A method of producing aldehydes from unsaturated fatty acids and their derivatives, the method comprising:

combining an isolated unsaturated fatty acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the CCD enzyme catalyzes alkene cleavage of the fatty acid to produce an aldehyde.

2. The method of claim 1, further comprising recovering the produced aldehyde.

3. The method of claim 1, wherein the aldehyde is a flavor aldehyde.

4. The method of claim 1, wherein the aldehyde is n-hexanal.

5. The method of claim 1, wherein the aldehyde is n-hexanal and the isolated unsaturated fatty acid or ester derivative thereof is linoleic acid or a salt or ester derivative thereof.

6. The method of claim 1, wherein the derivative of linoleic acid is selected from the group consisting of methyl linoleate and ethyl linoleate.

7. The method of claim 1, wherein the isolated CCD enzyme is a CCD enzyme isolated from a plant source.

8. The method of claim 1, wherein the isolated CCD enzyme is isolated from a plant selected from the group consisting of maize, tomato, Arabidopsis, and Novosphingobium aromaticivorans.

9. The method of claim 1, wherein the isolated CCD enzyme is a plant CCD enzyme produced by overexpression of a CCD gene in a protein expression system selected from the group consisting of: a bacterial expression system, a plant expression system, a yeast expression system, an animal expression system, and a viral expression system.

10. The method of claim 1, wherein the isolated CCD enzyme is a Z. mays or A. thaliana CCD enzyme produced by expression of a Z. mays or A. thaliana CCD gene selected from the group consisting of: ZmCCD1, ZmCCD7, ZmCCD8, and AtCCD1.

11. The method of claim 1, wherein the isolated CCD enzyme is an enzyme produced by expression of a CCD gene having a nucleotide sequence selected from: SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 8, and variants thereof that encode a functional variant of a CCD enzyme.

12. The method of claim 1, wherein the isolated CCD enzyme is an enzyme produced by expression of a CCD gene having a nucleotide sequence having about 90% or more sequence identity with a sequence selected from: SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, and SEQ ID NO: 8.

13. The method of claim 1, wherein the isolated CCD enzyme has a peptide sequence selected from: SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 9, and functional variants thereof.

14. The method of claim 1, wherein the isolated CCD enzyme has a peptide sequence having about 90% or more sequence identity with a sequence selected from: SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, and SEQ ID NO: 9.

15. The method of claim 1, wherein the isolated unsaturated fatty acid is a polyunsaturated fatty acid, or salt or ester derivative thereof, and wherein the isolated CCD enzyme catalyzes double cleavage of the polyunsaturated fatty acid to produce a dialdehyde.

16. A composition comprising:

an aldehyde being a product of combining an isolated unsaturated fatty acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the isolated CCD enzyme catalyzes alkene cleavage of the fatty acid.

17. The composition of claim 16, wherein the aldehyde is n-hexanal and the unsaturated fatty acid is linoleic acid or a salt or ester derivative thereof.

18. A method of producing an n-hexanal flavor additive from isolated linoleic acid, the method comprising:

combining linoleic acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the isolated CCD enzyme catalyzes alkene cleavage of the fatty acid to produce n-hexanal and wherein the linoleic acid or derivative thereof and the isolated CCD enzyme are each independently obtained, directly or indirectly, from a plant source.

19. The method of claim 18, wherein the linoleic acid, or salt or ester derivative thereof, is isolated from oranges.

20. The method of claim 18, wherein the isolated CCD enzyme is obtained directly or indirectly from a plant source selected from the group consisting of: maize, tomatoes, Arabidopsis, and Novosphingobium aromaticivorans.

21. The method of claim 18, wherein the isolated CCD enzyme is a Z. mays or A. thaliana CCD enzyme produced by expression of a Z. mays or A. thaliana CCD gene selected from the group consisting of: ZmCCD1, ZmCCD7, ZmCCD8, and AtCCD1.

22. An n-hexanal flavor additive comprising:

n-hexanal produced by a method comprising combining an isolated linoleic acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the isolated CCD enzyme catalyzes alkene cleavage of the fatty acid to n-hexanal and wherein the linoleic acid or derivative thereof and the isolated CCD enzyme are each independently obtained from a natural plant source.

23. A method for producing a natural flavor aldehyde, the method comprising:

combining an unsaturated fatty acid, or a salt or ester derivative thereof, with a carotenoid cleavage dioxygenase (CCD) enzyme such that the CCD enzyme catalyzes alkene cleavage of the fatty acid to produce a naturally-derived flavor aldehyde and wherein the unsaturated fatty acid or derivative thereof and the CCD enzyme are each independently obtained from a plant source.

24. A flavor aldehyde produced by the method comprising:

combining an unsaturated fatty acid, or a salt or ester derivative thereof, with a carotenoid cleavage dioxygenase (CCD) enzyme such that the CCD enzyme catalyzes alkene cleavage of the fatty acid to produce a naturally derived short-chain flavor aldehyde and wherein the unsaturated fatty acid or derivative thereof and the CCD enzyme are each independently obtained from a natural plant source.

25. A dialdehyde produced by the method comprising:

combining an isolated polyunsaturated fatty acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the isolated CCD enzyme catalyzes alkene cleavage of two or more carbon-carbon double bonds of the polyunsaturated fatty acid to produce the dialdehyde.

Description:
ALDEHYDES AND METHODS OF SYNTHESIS BY CATALYSIS WITH CAROTENOID CLEAVAGE DIOXYGENASE ENZYMES

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. provisional application entitled, "Aldehydes and Methods of Synthesis by Catalysis with Carotenoid Cleavage Dioxygenase Enzymes" having serial number 61/985,692, filed on April 29, 2014, which is entirely incorporated herein by reference.

SEQUENCE LISTING

This application contains a sequence listing filed in electronic form as an ASCII.txt file entitled 02192077.txt, created on April 27, 2015, and having a size of 41,683 bytes. The content of the sequence listing is incorporated herein in its entirety.

INTRODUCTION

Straight-chain aldehydes, such as n-hexanal, are important flavor and fragrance ingredients, and for some markets, these ingredients must be derived from natural sources or synthesized from natural isolates by steps that allow the final product to retain the "natural" designation. In order to be labeled as "natural" in the food and cosmetics industry, flavor and fragrances must be produced by a limited set of allowable conditions, which includes enzyme-catalyzed conversions of naturally-derived substances. Many "natural" aldehydes can be derived by conversion of fatty acids, or their derivatives, extracted from natural sources, such as fruits and other plant sources. Conversion of the fatty acids to aldehydes via an enzyme-catalyzed reaction allows retention of the "natural" designation.

For instance, n-hexanal represents an important flavor ingredient, which can be derived from conversion of substances, such as certain fatty acids extracted from orange fruit and other natural sources. In principle, ozonolysis would be a logical synthetic method to convert linoleic acid to n-hexanal; unfortunately, the use of ozone (O3) would preclude labeling the final product as "natural". Linoleic acid occurs at high levels within orange plant oil, and it is a logical precursor for natural n-hexanal. Plant-derived linoleic acid has been converted to n-hexanal by a two-enzyme process, but the two-step process presents problems that may require significant protein engineering efforts to overcome.

SUMMARY

Briefly described, embodiments of the present disclosure provide methods of producing aldehydes from unsaturated fatty acids and their derivatives using isolated carotenoid cleavage dioxygenase (CCD) enzymes, and aldehydes produced by these methods.

Embodiments of methods of the present disclosure for producing aldehydes from unsaturated fatty acids and their derivatives include combining an isolated unsaturated fatty acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the CCD enzyme catalyzes alkene cleavage of the fatty acid to produce an aldehyde.

In embodiments, the present disclosure also provides aldehydes produced by a method including combining an isolated unsaturated fatty acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the isolated CCD enzyme catalyzes alkene cleavage of the fatty acid to produce the aldehyde.

The present disclosure also provides methods of producing an n-hexanal flavor additive from isolated linoleic acid. Embodiments of such methods can include combining linoleic acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the isolated CCD enzyme catalyzes alkene cleavage of the fatty acid to produce n-hexanal where the linoleic acid or derivative thereof and the isolated CCD enzyme are each independently obtained from a plant source.

Embodiments of the present disclosure also include n-hexanal flavor additives including n-hexanal produced by a method involving combining an isolated linoleic acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the isolated CCD enzyme catalyzes alkene cleavage of the fatty acid to n-hexanal where the linoleic acid or derivative thereof and the isolated CCD enzyme are each independently obtained from a natural plant source.

Methods of the present disclosure also include methods for producing a natural flavor aldehyde. In embodiments, such methods can include combining an unsaturated fatty acid, or a salt or ester derivative thereof, with a carotenoid cleavage dioxygenase (CCD) enzyme such that the CCD enzyme catalyzes alkene cleavage of the fatty acid to produce a naturally-derived flavor aldehyde where the unsaturated fatty acid or derivative thereof and the CCD enzyme are each independently obtained from a plant source.

The present disclosure also includes flavor aldehydes produced by methods of the present disclosure involving combining an unsaturated fatty acid, or a salt or ester derivative thereof, with a carotenoid cleavage dioxygenase (CCD) enzyme such that the CCD enzyme catalyzes alkene cleavage of the fatty acid to produce a naturally derived short-chain flavor aldehyde and where the unsaturated fatty acid or derivative thereof and the CCD enzyme are each independently obtained from a natural plant source.

Embodiments of the present disclosure also include dialdehydes produced by methods including combining an isolated polyunsaturated fatty acid, or a salt or ester derivative thereof, with an isolated carotenoid cleavage dioxygenase (CCD) enzyme such that the isolated CCD enzyme catalyzes alkene cleavage of two or more carbon-carbon double bonds of the polyunsaturated fatty acid to produce the dialdehyde.
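The carbon bookkeeping behind these cleavage products can be sketched in a few lines of Python. The sketch assumes the standard structure of linoleic acid (18 carbons, double bonds at positions 9 and 12 counted from the carboxyl carbon), which is not stated in the text above, and it infers from the six-carbon product n-hexanal that the relevant cleavage is at the 12,13 double bond. The function and variable names are illustrative only. The code counts carbons and does not model enzyme selectivity.

```python
from typing import List, Tuple


def cleave(chain_length: int, double_bonds: List[int], cuts: List[int]) -> List[Tuple[int, int]]:
    """Split a fatty-acid chain at the chosen double bonds.

    Carbons are numbered from the carboxyl carbon (C1). A double bond at
    position n joins carbons n and n + 1, so cutting it ends one fragment
    at carbon n and starts the next at carbon n + 1.
    Returns (first_carbon, last_carbon) for each fragment.
    """
    for cut in cuts:
        if cut not in double_bonds:
            raise ValueError(f"no double bond at position {cut}")
    fragments = []
    start = 1
    for cut in sorted(cuts):
        fragments.append((start, cut))
        start = cut + 1
    fragments.append((start, chain_length))
    return fragments


def carbon_counts(fragments: List[Tuple[int, int]]) -> List[int]:
    """Number of carbons in each fragment."""
    return [last - first + 1 for first, last in fragments]


# Assumed structure of linoleic acid: C18, double bonds at positions 9 and 12.
LINOLEIC_CHAIN_LENGTH = 18
LINOLEIC_DOUBLE_BONDS = [9, 12]

single = cleave(LINOLEIC_CHAIN_LENGTH, LINOLEIC_DOUBLE_BONDS, [12])
double = cleave(LINOLEIC_CHAIN_LENGTH, LINOLEIC_DOUBLE_BONDS, [9, 12])

print(carbon_counts(single))  # carbons per fragment after one cleavage
print(carbon_counts(double))  # carbons per fragment after two cleavages
```

Under these assumptions, a single cleavage at position 12 leaves a 12-carbon acid fragment and a 6-carbon fragment, consistent with n-hexanal. Cleavage at both double bonds additionally releases a 3-carbon internal fragment (carbons 10 to 12) that carries a newly formed carbonyl at each end, which is the kind of dialdehyde product the double-cleavage embodiments describe.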

Other methods, compositions, plants, features, and advantages of the present disclosure will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional compositions, methods, features, and advantages be included within this description, and be within the scope of the present disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

Further aspects of the present disclosure will be more readily appreciated upon review of the detailed description of its various embodiments, described below, when taken in conjunction with the accompanying drawings.

FIG. 1 illustrates a 2-step, 2-enzyme reaction scheme for the synthesis of n-hexanal from linoleic acid.

FIGS. 2A-2B illustrate the active site of 9-cis-epoxycarotenoid dioxygenase (VP14) from Zea mays (PDB code 3NPE), whose sequence shows 38% identity and 55% similarity to the maize CCD1 protein. FIG. 2A shows that the non-heme iron (lower, larger sphere) is coordinated by four His residues. Dioxygen is modeled into its likely location (smaller spheres located above the non-heme iron). This constellation is located in the bottom of a large, open cavity that can accommodate hydrophobic carotenoids such as lycopene and β-carotene. FIG. 2B illustrates the active site cavity outlined as a semi-transparent, light-gray surface; the location of the iron/oxygen moiety at the bottom of the substrate binding cavity is apparent.

FIG. 3 illustrates a summary (Scheme 2) of several different reaction schemes involving CCD1 catalysis of oxidative cleavage reactions involving a variety of carotenoid substrates.

FIG. 4 illustrates a substrate alignment of linoleic acid/ester with other CCD1 substrates showing the alignment of the alkene cleavage positions (Scheme 3).

FIG. 5 illustrates a map of an embodiment of an E. coli overexpression plasmid for maize CCD1 protein. The CCD1 gene is fused to one that encodes glutathione S-transferase (GST) with a thrombin cleavage site between the fusion partners. The fusion protein can be purified in a single step by affinity chromatography on glutathione agarose. If desired, the CCD1 portion can be liberated by digesting with thrombin. Clones for maize CCD7 and CCD8 and Arabidopsis thaliana CCD1 were also constructed and have analogous structures.

FIG. 6 shows GC data from the negative control (an overnight reaction initially containing linoleic acid but lacking added CCD1 enzyme) with key peaks marked (the small amount of n-hexanal present corresponds with contamination of commercial linoleic acid starting material due to spontaneous oxidation).

FIG. 7 shows GC data from an overnight reaction mixture of linoleic acid with an isolated CCD1 from Z. mays, showing the presence of the reaction product n-hexanal.

FIGS. 8A-8B illustrate the GC/MS data from the reaction mixture of Z. mays CCD1 and linoleic acid as substrate. FIG. 8A shows the region of the GC trace where n-hexanal elutes (centered on 2.58 min), and FIG. 8B shows the experimentally-determined MS data for the peak at 2.58 min.

FIGS. 9A-9B illustrate a comparison between the MS data found for the 2.58 min. peak in the reaction (FIG. 9A) and the library data for n-hexanal (FIG. 9B).

FIG. 10 shows GC data from the negative control (an overnight reaction initially containing linoleic acid but lacking added AtCCD1 enzyme) with key peaks marked (the small amount of n-hexanal present is likely due to contamination; d12-n-hexanal was added to the reaction mixture as an internal standard).

FIGS. 11A-11B illustrate the GC/MS data from the negative control. FIG. 11A shows the region of the GC trace where the d12-n-hexanal internal standard (peak centered at 2.48 min, for reference) and the n-hexanal elute (centered on 2.58 min), and FIG. 11B shows the experimentally-determined MS data for the peaks in FIG. 11A.

FIG. 12 shows GC data from an overnight reaction mixture of linoleic acid with an isolated CCD1 from A. thaliana.

FIGS. 13A-13B illustrate the GC/MS data from the reaction mixture of A. thaliana CCD1 and linoleic acid as substrate. FIG. 13A shows the region of the GC trace where the d12-n-hexanal internal standard (peak centered at 2.49 min, for reference) and n-hexanal elute (centered on 2.58 min), and FIG. 13B shows the experimentally-determined MS data for the n-hexanal in FIG. 13A.

DETAILED DESCRIPTION

Before the present disclosure is described in greater detail, it is to be understood that this disclosure is not limited to particular embodiments described, and as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.

Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the disclosure. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present disclosure, the preferred methods and materials are now described.

Any publications and patents cited in this specification that are incorporated by reference, where noted in the application, are incorporated by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present disclosure is not entitled to antedate such publication by virtue of prior disclosure. Further, the dates of publication provided could be different from the actual publication dates that may need to be independently confirmed.

As will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete components and features which may be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present disclosure. Any recited method can be carried out in the order of events recited or in any other order that is logically possible.

Embodiments of the present disclosure will employ, unless otherwise indicated, techniques of molecular biology, microbiology, organic chemistry, biochemistry, food science, and the like, which are within the skill of the art. Such techniques are explained fully in the literature.

It must be noted that, as used in the specification and the appended embodiments, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a support" includes a plurality of supports. In this specification and in the embodiments that follow, reference will be made to a number of terms that shall be defined to have the following meanings unless a contrary intention is apparent.

As used herein, the following terms have the meanings ascribed to them unless specified otherwise. In this disclosure, "consisting essentially of" or "consists essentially" or the like, when applied to methods and compositions encompassed by the present disclosure refers to compositions like those disclosed herein, but which may contain additional structural groups, composition components or method steps (or analogs or derivatives thereof as discussed above). Such additional structural groups, composition components or method steps, etc., however, do not materially affect the basic and novel characteristic(s) of the compositions or methods, compared to those of the corresponding compositions or methods disclosed herein. "Consisting essentially of" or "consists essentially" or the like, when applied to methods and compositions encompassed by the present disclosure have the meaning ascribed in U.S. Patent law and the term is open-ended, allowing for the presence of more than that which is recited so long as basic or novel characteristics of that which is recited is not changed by the presence of more than that which is recited, but excludes prior art embodiments.

Prior to describing the various embodiments, the following definitions are provided and should be used unless otherwise indicated.

Definitions

In describing the disclosed subject matter, the following terminology will be used in accordance with the definitions set forth below.

The terms "nucleic acid" and "polynucleotide" are terms that generally refer to a string of at least two base-sugar-phosphate combinations. As used herein, the terms include deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) and generally refer to any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA. RNA may be in the form of a tRNA (transfer RNA), snRNA (small nuclear RNA), rRNA (ribosomal RNA), mRNA (messenger RNA), anti-sense RNA, RNAi (RNA interference construct), siRNA (short interfering RNA), or ribozymes. Thus, for instance, polynucleotide as used herein refers to, among others, single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, and RNA that is a mixture of single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. The terms "nucleic acid sequence" and "oligonucleotide" also encompass a nucleic acid and polynucleotide as defined above.

In addition, polynucleotide as used herein refers to triple-stranded regions comprising RNA or DNA or both RNA and DNA. The strands in such regions may be from the same molecule or from different molecules. The regions may include all of one or more of the molecules, but more typically involve only a region of some of the molecules. One of the molecules of a triple-helical region often is an oligonucleotide.

It will be appreciated that a great variety of modifications have been made to DNA and RNA that serve many useful purposes known to those of skill in the art. The term polynucleotide as it is employed herein embraces such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including simple and complex cells, inter alia. For instance, the term polynucleotide includes DNAs or RNAs as described above that contain one or more modified bases. Thus, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two examples, are polynucleotides as the term is used herein.

The term also includes PNAs (peptide nucleic acids), phosphorothioates, and other variants of the phosphate backbone of native nucleic acids. Natural nucleic acids have a phosphate backbone; artificial nucleic acids may contain other types of backbones, but contain the same bases. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are "nucleic acids" or "polynucleotides" as that term is intended herein.

A "gene" typically refers to a hereditary unit corresponding to a sequence of DNA that occupies a specific location on a chromosome and that contains the genetic instruction for a characteristic(s) or trait(s) in an organism.

As used herein, the term "transfection" refers to the introduction of an exogenous and/or recombinant nucleic acid sequence into the interior of a membrane enclosed space of a living cell, including introduction of the nucleic acid sequence into the cytosol of a cell as well as the interior space of a mitochondrion, nucleus, or chloroplast. The nucleic acid may be in the form of naked DNA or RNA, it may be associated with various proteins or regulatory elements (e.g., a promoter and/or signal element), or the nucleic acid may be incorporated into a vector or a chromosome. A "transformed" cell is thus a cell transfected with a nucleic acid sequence. The term "transformation" refers to the introduction of a nucleic acid (e.g., DNA or RNA) into cells in such a way as to allow expression of the coding portions of the introduced nucleic acid. The term "transgene" refers to an artificial gene which is used to transform a cell of an organism, such as a bacterium or a plant.

As used herein, "transformation" or "transformed" refers to the introduction of a nucleic acid (e.g., DNA or RNA) into cells in such a way as to allow expression of the coding portions of the introduced nucleic acid.

As used herein a "transformed cell" is a cell transfected with a nucleic acid sequence.

As used herein, a "transgene" refers to an artificial nucleic acid which is used to transform a cell of an organism, such as a bacterium or a plant.

As used herein, "transgenic" refers to a cell, tissue, or organism that contains a transgene.

As used herein, "isolated" means removed or separated from the native environment. Therefore, isolated DNA can contain both coding (exon) and noncoding regions (introns) of a nucleotide sequence corresponding to a particular gene. An isolated peptide or protein indicates the protein is separated from its natural environment. Isolated nucleotide sequences and/or proteins are not necessarily purified. For instance, an isolated nucleotide or peptide may be included in a crude cellular extract or they may be subjected to additional purification and separation steps.

With respect to nucleotides, "isolated nucleic acid" refers to a nucleic acid with a structure (a) not identical to that of any naturally occurring nucleic acid or (b) not identical to that of any fragment of a naturally occurring genomic nucleic acid spanning more than three separate genes, and includes DNA, RNA, or derivatives or variants thereof. The term covers, for example but not limited to, (a) a DNA which has the sequence of part of a naturally occurring genomic molecule but is not flanked by at least one of the coding sequences that flank that part of the molecule in the genome of the species in which it naturally occurs; (b) a nucleic acid incorporated into a vector or into the genomic nucleic acid of a prokaryote or eukaryote in a manner such that the resulting molecule is not identical to any vector or naturally occurring genomic DNA; (c) a separate molecule such as a cDNA, a genomic fragment, a fragment produced by polymerase chain reaction (PCR), ligase chain reaction (LCR) or chemical synthesis, or a restriction fragment; (d) a recombinant nucleotide sequence that is part of a hybrid gene, e.g., a gene encoding a fusion protein, and (e) a recombinant nucleotide sequence that is part of a hybrid sequence that is not naturally occurring. Isolated nucleic acid molecules of the present disclosure can include, for example, natural allelic variants as well as nucleic acid molecules modified by nucleotide deletions, insertions, inversions, or substitutions.

It is advantageous for some purposes that a nucleotide sequence or peptide is in purified form. The term "purified" in reference to nucleic acid and/or peptide sequence represents that the sequence has increased purity relative to the natural environment.

The terms "polypeptide" and "protein" include proteins and fragments thereof.

Polypeptides are disclosed herein as amino acid residue sequences. Those sequences are written left to right in the direction from the amino to the carboxy terminus. In accordance with standard nomenclature, amino acid residue sequences are denominated by either a three letter or a single letter code as indicated as follows: Alanine (Ala, A), Arginine (Arg, R), Asparagine (Asn, N), Aspartic Acid (Asp, D), Cysteine (Cys, C), Glutamine (Gln, Q), Glutamic Acid (Glu, E), Glycine (Gly, G), Histidine (His, H), Isoleucine (Ile, I), Leucine (Leu, L), Lysine (Lys, K), Methionine (Met, M), Phenylalanine (Phe, F), Proline (Pro, P), Serine (Ser, S), Threonine (Thr, T), Tryptophan (Trp, W), Tyrosine (Tyr, Y), and Valine (Val, V).
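As an illustration of the nomenclature, the standard three-letter and single-letter codes can be held in a lookup table and used to convert between the two notations. This is a minimal sketch; the dictionary and function names are hypothetical, and the example residues are arbitrary rather than taken from any sequence in the sequence listing.

```python
from typing import List

# Standard three-letter to single-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Reverse lookup, built from the table above.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(residues: List[str]) -> str:
    """Convert residues given in three-letter code (amino to carboxy terminus)."""
    return "".join(THREE_TO_ONE[residue] for residue in residues)


def to_three_letter(sequence: str) -> List[str]:
    """Convert a single-letter sequence to a list of three-letter codes."""
    return [ONE_TO_THREE[letter] for letter in sequence]


print(to_one_letter(["Met", "Ala", "Gly"]))
print(to_three_letter("MAG"))
```

Both functions raise a `KeyError` on any code outside the twenty listed residues, which is a deliberate choice here so that unexpected symbols are not silently dropped.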

"Variant" refers to a polypeptide that differs from a reference polypeptide, but retains essential properties. A typical variant of a polypeptide differs in amino acid sequence from another, reference polypeptide. Generally, differences are limited so that the sequences of the reference polypeptide and the variant are closely similar overall and, in many regions, identical. A variant and reference polypeptide may differ in amino acid sequence by one or more modifications (e.g., substitutions, additions, and/or deletions). A substituted or inserted amino acid residue may or may not be one encoded by the genetic code. A variant of a polypeptide may be naturally occurring such as an allelic variant, or it may be a variant that is not known to occur naturally.

Modifications and changes can be made in the structure of the polypeptides of the disclosure and still obtain a molecule having similar characteristics as the polypeptide (e.g., a conservative amino acid substitution). For example, certain amino acids can be substituted for other amino acids in a sequence without appreciable loss of activity.
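The claims refer to sequences having about 90% or more sequence identity with a reference sequence. One way such a figure could be computed for two sequences that have already been aligned is sketched below. The disclosure does not specify an alignment method or how gaps are counted, so the choices here (gaps written as "-", gap columns counted as mismatches, alignment length as the denominator) are assumptions, and the sequences are toy strings, not sequences from the listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: a column counts as identical only when both
    sequences carry the same residue (never a gap), and the denominator
    is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy example: one substitution in ten aligned positions.
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQK"
print(percent_identity(reference, variant))
```

Other conventions, such as dividing by the length of the shorter sequence or excluding gap columns, give different percentages for the same pair, so any threshold comparison should state which convention was used.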

Because it is the interactive capacity and nature of a polypeptide that defines that polypeptide's biological functional activity, certain amino acid sequence substitutions can be made in a polypeptide sequence and nevertheless obtain a polypeptide with like properties.

In making such changes, the hydropathic index of amino acids can be considered. The importance of the hydropathic amino acid index in conferring interactive biologic function on a polypeptide is generally understood in the art. It is known that certain amino acids can be substituted for other amino acids having a similar hydropathic index or score and still result in a polypeptide with similar biological activity. Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics. Those indices are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline (-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and arginine (-4.5).

It is believed that the relative hydropathic character of the amino acid determines the secondary structure of the resultant polypeptide, which in turn defines the interaction of the polypeptide with other molecules, such as enzymes, substrates, receptors, antibodies, antigens, and the like. It is known in the art that an amino acid can be substituted by another amino acid having a similar hydropathic index and still obtain a functionally equivalent polypeptide. In such changes, the substitution of amino acids whose hydropathic indices are within ± 2 is preferred, those within ± 1 are particularly preferred, and those within ± 0.5 are even more particularly preferred.
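The preference windows described above can be illustrated with a minimal sketch. It uses only the hydropathic indices listed in the text; the three-letter residue keys, the function name `substitution_tier`, and the small floating-point tolerance are assumptions introduced for illustration and are not part of the disclosure.

```python
# Hydropathic indices as listed in the text; three-letter residue codes
# are used as dictionary keys (an illustrative choice).
HYDROPATHIC_INDEX = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}

# Preference tiers from the text, ordered from tightest to loosest window.
TIERS = (
    (0.5, "even more particularly preferred"),
    (1.0, "particularly preferred"),
    (2.0, "preferred"),
)


def substitution_tier(original, replacement):
    """Return the tightest tier containing the index difference, else None."""
    diff = abs(HYDROPATHIC_INDEX[original] - HYDROPATHIC_INDEX[replacement])
    for window, label in TIERS:
        # Small tolerance (an assumption) so floating-point noise does not
        # push a boundary value outside its window.
        if diff <= window + 1e-9:
            return label
    return None


if __name__ == "__main__":
    for pair in [("Ile", "Val"), ("Lys", "Arg"), ("Leu", "Met"), ("Ile", "Arg")]:
        print(pair, substitution_tier(*pair))
```

A result of `None` only means the pair falls outside the ± 2 window; it says nothing about whether a particular substitution would actually preserve activity in a given enzyme.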

Substitution of like amino acids can also be made on the basis of hydrophilicity, particularly where the biological functional equivalent polypeptide or peptide thereby created is intended for use in immunological embodiments. The following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); proline (-0.5 ± 1); threonine (-0.4); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4). It is understood that an amino acid can be substituted for another having a similar hydrophilicity value and still obtain a biologically equivalent, and in particular, an immunologically equivalent polypeptide. In such changes, the substitution of amino acids whose hydrophilicity values are within ± 2 is preferred, those within ± 1 are particularly preferred, and those within ± 0.5 are even more particularly preferred.

As outlined above, amino acid substitutions are generally based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like. Exemplary substitutions that take various of the foregoing characteristics into consideration are well known to those of skill in the art and include (original residue: exemplary substitution): (Ala: Gly, Ser), (Arg: Lys), (Asn: Gln, His), (Asp: Glu, Cys, Ser), (Gln: Asn), (Glu: Asp), (Gly: Ala), (His: Asn, Gln), (Ile: Leu, Val), (Leu: Ile, Val), (Lys: Arg), (Met: Leu, Tyr), (Ser: Thr), (Thr: Ser), (Trp: Tyr), (Tyr: Trp, Phe), and (Val: Ile, Leu). Embodiments of this disclosure thus contemplate functional or biological equivalents of a polypeptide as set forth above. In particular, embodiments of the polypeptides can include variants having about 50%, 60%, 70%, 80%, 90%, and 95% sequence identity to the polypeptide of interest.

As used herein "functional variant" refers to a variant of a protein or polypeptide (e.g., a variant of a CCD enzyme) that can perform the same functions or activities as the original protein or polypeptide, although not necessarily at the same level (e.g., the variant may have enhanced, reduced or changed functionality, so long as it retains the basic function).

"Identity," as known in the art, is a relationship between two or more polypeptide sequences, as determined by comparing the sequences. In the art, "identity" also refers to the degree of sequence relatedness between polypeptides as determined by the match between strings of such sequences. "Identity" and "similarity" can be readily calculated by known methods, including, but not limited to, those described in Computational Molecular Biology, Lesk, A. M., Ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., Ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., Eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., Eds., M Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J Applied Math., 48: 1073 (1988).

Preferred methods to determine identity are designed to give the largest match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. The percent identity between two sequences can be determined by using analysis software (e.g., Sequence Analysis Software Package of the Genetics Computer Group, Madison Wis.) that incorporates the Needleman and Wunsch (J. Mol. Biol., 48: 443-453, 1970) algorithm (e.g., NBLAST, and XBLAST). The default parameters are used to determine the identity for the polypeptides of the present disclosure.
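For orientation, a global alignment of the Needleman and Wunsch type can be sketched in a few lines. This is a simplified illustration and not the referenced software package: the scoring values (match +1, mismatch -1, linear gap -1), the function names, and the definition of percent identity as identical aligned positions divided by alignment length are all assumptions chosen for the example. Production tools typically use substitution matrices and affine gap penalties with their own default parameters.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return both aligned strings ('-' marks a gap)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and first row correspond to leading gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the dynamic-programming matrix.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    x, y = needleman_wunsch(a, b)
    if not x:
        return 0.0
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)
```

Because the denominator here is the alignment length, the value can differ from figures reported by tools that divide by the length of the shorter sequence or of the reference sequence.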

By way of example, a polypeptide sequence may be identical to the reference sequence, that is be 100% identical, or it may include up to a certain integer number of amino acid alterations as compared to the reference sequence such that the % identity is less than 100%. Such alterations are selected from: at least one amino acid deletion, substitution, including conservative and non-conservative substitution, or insertion, and wherein said alterations may occur at the amino- or carboxy- terminal positions of the reference polypeptide sequence or anywhere between those terminal positions, interspersed either individually among the amino acids in the reference sequence or in one or more contiguous groups within the reference sequence. The number of amino acid alterations for a given % identity is determined by multiplying the total number of amino acids in the reference polypeptide by the numerical percent of the respective percent identity (divided by 100) and then subtracting that product from said total number of amino acids in the reference polypeptide.
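The arithmetic in the last sentence above can be written as n = x - x(p/100), where x is the number of amino acids in the reference polypeptide and p is the percent identity. The sketch below is illustrative only: the function name is hypothetical, and rounding the result down to a whole number of alterations is an assumption made here because the text speaks of an integer number of alterations without stating a rounding rule.

```python
import math


def max_alterations(reference_length, percent_identity):
    """Number of alterations allowed at a given percent identity.

    Implements n = x - x * (p / 100). Rounding down to an integer is an
    assumption of this sketch, not a rule stated in the text.
    """
    if reference_length < 0:
        raise ValueError("reference_length must be non-negative")
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    product = reference_length * percent_identity / 100
    # Tiny tolerance (an assumption) guards against floating-point results
    # that land just below a whole number.
    return math.floor(reference_length - product + 1e-9)


if __name__ == "__main__":
    # 540 is a hypothetical reference length chosen only for illustration.
    print(max_alterations(540, 95))
```

For a hypothetical 540-residue reference at 95% identity, the product is 513 and the result is 27 alterations.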

The term "expression" as used herein describes the process undergone by a structural gene to produce a polypeptide. It is a combination of transcription and translation. Expression generally refers to the "expression" of a nucleic acid to produce a polypeptide, but it is also generally acceptable to refer to "expression" of a polypeptide, indicating that the polypeptide is being produced via expression of the corresponding nucleic acid.

As used herein, the terms "over-expression" and "up-regulation" refer to the expression of a nucleic acid encoding a polypeptide (e.g., a gene) in a transformed plant cell at higher levels (therefore producing an increased amount of the polypeptide encoded by the gene) than the "wild type" plant cell (e.g., a substantially equivalent cell that is not transfected with the gene) under substantially similar conditions. Thus, to over-express or increase expression of a CCD nucleic acid refers to increasing or inducing the production of the CCD polypeptide encoded by the nucleic acid, which may be done by a variety of approaches, such as increasing the number of genes encoding for the polypeptide, increasing the transcription of the gene (such as by placing the gene under the control of a constitutive promoter), or increasing the translation of the gene, or a combination of these and/or other approaches. Conversely, "under-expression" and "down-regulation" refer to expression of a polynucleotide (e.g., a gene) at lower levels (producing a decreased amount of the polypeptide encoded by the polynucleotide) than in a "wild type" plant cell. As with over-expression, under-expression can occur at different points in the expression pathway, such as by decreasing the number of gene copies encoding for the polypeptide, inhibiting (e.g., decreasing or preventing) transcription and/or translation of the gene (e.g., by the use of antisense nucleotides, suppressors, knockouts, antagonists, etc.), or a combination of such approaches.

The term "plasmid" as used herein refers to a non-chromosomal double-stranded DNA sequence including an intact "replicon" such that the plasmid is replicated in a host cell.

As used herein, the term "vector" or "expression vector" is used in reference to a vehicle used to introduce an exogenous nucleic acid sequence into a cell. A vector may include a DNA molecule, linear or circular, which includes a segment encoding a polypeptide of interest operably linked to additional segments that provide for its transcription and translation upon introduction into a host cell or host cell organelles. Such additional segments may include promoter and terminator sequences, and may also include one or more origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, etc. Expression vectors are generally derived from yeast DNA, bacterial genomic or plasmid DNA, or viral DNA, or may contain elements of more than one of these.

As used herein, the term "expression system" includes a biologic system (e.g., a cell based system) used to express a polynucleotide to produce a protein. Such systems generally employ a plasmid or vector including the polynucleotide of interest, where the plasmid and/or expression vector is constructed with various elements (e.g., promoters, selectable markers, etc.) to enable expression of the protein product from the polynucleotide. Expression systems use the host system/host cell transcription and translation mechanisms to express the product protein. Common expression systems include, but are not limited to, bacterial expression systems (e.g., E. coli), yeast expression systems, viral expression systems, animal expression systems, and plant expression systems.

As used herein, the term "promoter" or "promoter region" includes all sequences capable of driving transcription of a coding sequence. In particular, the term "promoter" as used herein refers to a DNA sequence generally described as the 5' regulatory region of a gene, located proximal to the start codon. The transcription of an adjacent coding sequence(s) is initiated at the promoter region. The term "promoter" also includes fragments of a promoter that are functional in initiating transcription of the gene.

The term "operably linked" indicates that the regulatory sequences necessary for expression of the coding sequences of a nucleic acid are placed in the nucleic acid molecule in the appropriate positions relative to the coding sequence so as to effect expression of the coding sequence. This same definition is sometimes applied to the arrangement of coding sequences and transcription control elements (e.g. promoters, enhancers, and termination elements), and/or selectable markers in an expression vector.

As used herein, the term "selectable marker" refers to a gene whose expression allows one to identify cells that have been transformed or transfected with a vector containing the marker gene. For instance, a recombinant nucleic acid may include a selectable marker operably linked to a gene of interest and a promoter, such that expression of the selectable marker indicates the successful transformation of the cell with the gene of interest.

The terms "native," "wild type", or "unmodified" polypeptide, protein or enzyme, are used herein to provide a reference point for a variant/mutant of a polypeptide, protein, or enzyme prior to its mutation and/or modification (whether the mutation and/or modification occurred naturally or by human design). Typically, the unmodified, native, or wild type polypeptide, protein, or enzyme has an amino acid sequence that corresponds substantially or completely to the amino acid sequence of the polypeptide, protein, or enzyme as it generally occurs naturally.

An "enzyme," as used herein, is a polypeptide that acts as a catalyst, which facilitates and generally speeds the rate at which chemical reactions proceed but does not alter the direction or nature of the reaction.

As used herein, the term "natural flavor" has the meaning ascribed under the Code of Federal Regulations, 21 CFR 101.22: the essential oil, oleoresin, essence or extractive, protein hydrolysate, distillate, or any product of roasting, heating or enzymolysis, which contains the flavoring constituents derived from a spice, fruit or fruit juice, vegetable or vegetable juice, edible yeast, herb, bark, bud, root, leaf or similar plant material, meat, seafood, poultry, eggs, dairy products, or fermentation products thereof, whose significant function in food is flavoring rather than nutritional. Natural flavors also include the natural essence or extractives obtained from certain plants listed in title 21 of the Code of Federal Regulations. Thus, flavor aldehydes of the present disclosure include "natural flavors" in embodiments where the flavor aldehyde is produced as a product of enzymolysis where the substrate and enzyme are derived from a spice, fruit or fruit juice, vegetable or vegetable juice, edible yeast, herb, bark, etc., as set forth above.

Furthermore, the terms "natural" and "naturally-derived" in reference to a fatty acid substrate of the present disclosure indicate that the fatty acid is obtained from a source (e.g., extracted, distilled, etc.) considered a "natural" source as set forth above in the CFR definition of "natural flavor" (e.g., a spice, fruit, vegetable, edible yeast, and so forth). The terms "natural", "naturally-derived" and/or "obtained from a natural source" in reference to the CCD enzymes of the present disclosure indicate that the enzyme is obtained, directly (e.g., extracted) or indirectly (e.g., extracted and then cloned and expressed in an expression system), from a natural source as set forth above (e.g., spice, fruit, vegetable, edible yeast, herb, plant material, meat, seafood, etc.). The use of "natural" has the specific meanings set forth above with respect to the CFR or the derivation of a fatty acid substrate and/or an enzyme and does not indicate that the methods of the present disclosure or the aldehydes or flavoring products produced by these methods are products of nature or a natural process as those terms are used in 35 U.S.C. §101.

Discussion

Embodiments of the present disclosure encompass methods of producing aldehydes from unsaturated fatty acids and their derivatives via alkene cleavage catalyzed by carotenoid cleavage dioxygenase (CCD) enzymes, as well as aldehydes produced by the methods of the present disclosure.

In exemplary embodiments, the present disclosure provides methods to convert linoleate (either as a free acid or an ester or salt derivative, such as, but not limited to, sodium linoleate, methyl linoleate, ethyl linoleate, etc.) to n-hexanal, using a carotenoid cleavage dioxygenase enzyme. The linoleate can be derived from oranges ("From The Natural Fruit," FTNF), and the enzymes can be derived from plants, directly (e.g., by direct isolation from the plant source) or indirectly (e.g., by using a vector including a plant-derived nucleotide encoding a CCD enzyme and over-expressing it in an expression system and isolating the enzyme produced in the expression system), or other source suitable for use in the preparation of "natural flavorings" as set forth by the Code of Federal Regulations for FDA-regulated consumable products as described above. Embodiments of methods of producing n-hexanal of the present disclosure can use only a single enzyme and a single step with no need for cofactor supply or regeneration. The methods of the present disclosure can be used to produce other aldehydes by using CCD enzymes to catalyze cleavage of other precursor fatty acids (or salt or ester derivatives thereof). It is believed that this is the first report of the production of short-chain flavor aldehydes by dioxygenase-catalyzed cleavage of unsaturated acids (or ester derivatives).

Straight-chain aldehydes such as n-hexanal represent important flavor and fragrance ingredients, and for some markets, must be derived directly or indirectly from natural sources or synthesized from natural isolates by steps that fall within certain regulatory parameters to allow the final product to be designated as "natural" as defined by the FDA or other government regulatory agency for purposes of labeling, marketing, etc. Linoleic acid occurs at high levels within orange plant oil (e.g., within triglycerides in orange plant oil) and provides a logical natural precursor of n-hexanal that can be used as a "natural" ingredient/additive. While conversion of plant-derived linoleic acid to n-hexanal has been accomplished by a two-enzyme process (Scheme 1, FIG. 1), a one-step process has not been reported.

The two enzyme process involves catalysis with a 12,13 lipoxygenase followed by a hydroperoxide lyase. The first step has been optimized with soybean lipoxygenase, which is highly selective for the 13-position of linoleic acid. In addition, the enzyme is relatively abundant, inexpensive, stable, efficient, and simple to isolate. On the other hand, the known hydroperoxide lyase enzymes are unstable and have poor turnover numbers due to their tendency to lose activity after short reaction times. A good deal of current research is therefore devoted to improving the properties of hydroperoxide lyases. Even with the already high level of effort surrounding this route, the hydroperoxide lyase step remains problematic, and significant protein engineering efforts will likely be needed to solve this problem.

In principle, ozonolysis would be a logical synthetic method to convert linoleic acid to n-hexanal; unfortunately, the use of ozone (O3) would preclude labeling the final product as "natural", which can be important in the food and cosmetics industry. Carotenoid cleavage dioxygenases (CCDs) appear to offer a biological alternative to ozone, carrying out a similar type of conversion on highly hydrophobic substrates. These proteins contain a single non-heme iron in the active site that activates oxygen to cleave alkenes. This class of enzyme was discovered in conjunction with a role in abscisic acid biosynthesis in corn (Zea mays). The Z. mays genome encodes eight CCDs, and several have been cloned and expressed in Escherichia coli. Other groups have reported the E. coli expression of homologs from other plant species such as Arabidopsis thaliana, Crocus sativus and tomato. Z. mays VP14 (SEQ ID NO: 9) has been studied most thoroughly, and the crystal structure of the enzyme is available (PDB code 3NPE). The active site contains a large, hydrophobic pocket with the active-site iron coordinated by four His residues (FIGS. 2A-2B).

A CCD1 enzyme (SEQ ID NO: 2) is predicted to have a similar active site structure to VP14 (SEQ ID NO: ) based on percent sequence identity (38%) and sequence similarity (55.2%) as determined by sequence alignment and comparison. A few other structures have also been determined as described in Sui, X. et al., Arch. Biochem. Biophys. 2013. Interestingly, orange fruit was recently shown to express several CCDs (Rodrigo, M. J., et al., J. Exp. Botany 2013). Like VP14, the active site of Z. mays CCD1 is also believed to be large and hydrophobic, and the enzyme accepts a variety of bulky carotenoids as substrates. FIG. 3, Scheme 2 illustrates some exemplary reactions catalyzed by CCD1. Although CCD enzymes accept several carotenoids and carotenoid biosynthetic precursors, their ability to catalyze alkene cleavage in fatty acids has not been explored.

Despite the large number of potential cleavage sites, CCD1 shows regioselectivity in forming volatile products from carotenoids. Evidence supports a dioxygenase mechanism in which both atoms of O2 are incorporated into the carbonyl products (Mutti, F. G. Bioinorg. Chem. and Appl. 2012). It appears that the ability of CCD1 (or any other CCD enzyme) to oxidatively cleave a fatty acid such as linoleate has not been investigated. CCD1 appears to achieve its regioselectivity by "counting" carbons from the end of the substrate such that whichever alkene is positioned above the iron/oxygen intermediate is cleaved. One can crudely model this by aligning substrate structures (FIG. 4, Scheme 3). Interestingly, when linoleic acid is included in this substrate alignment, the desired 12,13-alkene is correctly placed with respect to the end of the chain. The present disclosure now provides methods whereby the desired n-hexanal product can be obtained by cleavage of linoleic acid by a CCD enzyme.

Since CCD1 was determined, as described below, to oxidatively cleave the 12,13 alkene of linoleic acid/ester to produce n-hexanal, CCD enzymes may also be capable of catalyzing alkene cleavage of other unsaturated fatty acids to produce aldehydes. Such CCD enzyme-catalyzed cleavage of unsaturated fatty acids provides for the production of flavor aldehydes that can be used in the food industry and other industries, such as cosmetics, where "natural" designations can be an important labeling, approval, and marketing factor. In embodiments, extracts of CCD enzymes can be obtained from plants and used directly; in other embodiments, nucleotides encoding a plant-derived CCD enzyme or derivative thereof can be overexpressed in known expression systems (e.g., bacterial, plant, animal, viral, etc.). Then, the enzyme can be isolated, and the isolated enzyme can be used to catalyze the cleavage of fatty acids to yield "natural" aldehydes. As discussed above, the isolated enzyme can include a crude extract or a more refined extract. This biological alternative to an ozonolysis reaction uses reaction conditions that allow the product to be labeled as "natural" according to the CFR, since an enzyme is used in the process. By employing a CCD enzyme, the reaction can be carried out in one step with no need for cofactors, providing a substantial benefit over the conventional two-step, two-enzyme "natural" process. However, use of the word "natural" in this context should be distinguished from the use of "natural" in the context of 35 U.S.C. 101, and use of "natural" in the present disclosure is distinguishable from the use of the same word in a 35 U.S.C. 101 context.

Moreover, in embodiments, the enzyme-bound oxidant can be derived from O2, and the reaction produces no waste products since both oxygen atoms ultimately reside in the two products. Linoleic acid and other fatty acids are readily available from natural sources (e.g., orange seeds for linoleic acid), providing ideal starting materials for aldehyde production (such as, but not limited to, n-hexanal) from a natural fruit source.

The present disclosure thus provides methods of producing aldehydes from unsaturated fatty acids, or salt or ester derivatives thereof, using isolated carotenoid cleavage dioxygenase (CCD) enzymes to catalyze alkene cleavage of the fatty acids or their derivatives to produce the aldehyde. The present disclosure also provides aldehydes produced by the methods of the present disclosure, and designated "natural" flavor aldehydes produced by the methods of the present disclosure using fatty acids, or derivatives, obtained from a designated "natural" source and CCD enzymes derived from a "natural source" (e.g., a plant source).

According to methods of the present disclosure, isolated unsaturated fatty acids, or derivatives thereof (such as, but not limited to, salt and ester derivatives thereof), are combined with an isolated CCD enzyme, such that the CCD enzyme catalyzes alkene cleavage of the fatty acid, or derivative, to produce an aldehyde. Additional description of the fatty acids and fatty acid derivatives, and of the CCD enzymes used in the methods of the present disclosure to produce the aldehydes and natural flavor aldehydes of the present disclosure, is provided below.

Fatty Acids and derivatives and aldehyde products

Fatty acids and derivatives used in the methods of the present disclosure to produce aldehydes include unsaturated fatty acids and derivatives of the acids, such as salts and esters of the fatty acids. For instance, in an embodiment, the fatty acid can be, but is not limited to, linoleic acid, or linoleate (an ester derivative of linoleic acid, such as, but not limited to, methyl linoleate and ethyl linoleate). In embodiments where the fatty acid is linoleic acid or a linoleate, the product aldehyde is n-hexanal, a useful flavor aldehyde.

The unsaturated fatty acids can be mono- or polyunsaturated acids. In embodiments where polyunsaturated acids are used, it is possible that the CCD enzyme can catalyze cleavage of two or more of the carbon-carbon double bonds of the fatty acid, thus producing dialdehydes. Thus, methods of the present disclosure also include producing dialdehydes by combining polyunsaturated acids with a CCD enzyme capable of cleaving two or more alkene double bonds to produce the dialdehyde.

The present disclosure includes the use of various unsaturated fatty acids that can be oxidatively cleaved to yield an aldehyde. Some exemplary fatty acids are those that are precursors to useful short-chain flavor and fragrance aldehydes, although the methods of the present disclosure are not limited to the production of such short-chain aldehydes. In embodiments, the product aldehyde is n-hexanal, a short-chain flavor aldehyde. In embodiments, the n-hexanal is a n-hexanal flavor additive, capable of obtaining a "natural" designation, produced from linoleic acid, or a salt or ester derivative thereof, that has been obtained from a "natural" source, such as a "natural" plant source. In embodiments, the linoleic acid or salt or ester derivative thereof is isolated from oranges. In embodiments the product aldehyde of the methods of the present disclosure is a flavor aldehyde produced from an unsaturated fatty acid, or salt or ester derivative of a fatty acid, that has been obtained from a "natural source", such as a natural plant source (e.g., fruit, vegetable, or other plant extract) and thus may be capable of obtaining a "natural" designation from a regulatory agency, such as the FDA. Other exemplary unsaturated fatty acids that could be oxidatively cleaved to yield aldehydes include, but are not limited to, some of the following. CCD cleavage of palmitoleic acid (16:1, Δ9) or a derivative thereof could yield n-heptanal. CCD cleavage of oleic acid (18:1, Δ9) or a derivative thereof could yield n-nonanal. CCD cleavage of the 9,10 alkene of linoleic acid (18:2, Δ9,12) or a derivative thereof could yield n-nonenal. CCD cleavage of linolenic acid (18:3, Δ9,12,15) or a derivative thereof could potentially yield n-propanal, n-hexenal or n-nonadienal, depending on the site of alkene cleavage.
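The carbon bookkeeping behind the examples above can be sketched as follows. Cleaving the double bond that starts at carbon Δ (counted from the carboxyl carbon) of an N-carbon fatty acid leaves a methyl-end fragment of N - Δ carbons, which retains any double bonds located beyond the cleavage site. The class, function, and lookup-table names are hypothetical, the name stems cover only the chain lengths mentioned in the text, and the sketch does not predict which alkene a given CCD enzyme would actually cleave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    name: str
    carbons: int          # total chain length N
    double_bonds: tuple   # delta positions counted from the carboxyl carbon


# Name stems for the chain lengths that appear in the text (illustrative only).
STEMS = {3: "prop", 6: "hex", 7: "hept", 9: "non"}
# Suffix by number of double bonds remaining in the aldehyde fragment.
SUFFIXES = {0: "anal", 1: "enal", 2: "adienal"}


def methyl_end_aldehyde(acid, cleaved_bond):
    """Name the methyl-end aldehyde formed by cleaving the given double bond."""
    if cleaved_bond not in acid.double_bonds:
        raise ValueError(f"{acid.name} has no double bond at position {cleaved_bond}")
    length = acid.carbons - cleaved_bond
    remaining = sum(1 for d in acid.double_bonds if d > cleaved_bond)
    return "n-" + STEMS[length] + SUFFIXES[remaining]


PALMITOLEIC = FattyAcid("palmitoleic acid", 16, (9,))
OLEIC = FattyAcid("oleic acid", 18, (9,))
LINOLEIC = FattyAcid("linoleic acid", 18, (9, 12))
LINOLENIC = FattyAcid("linolenic acid", 18, (9, 12, 15))

if __name__ == "__main__":
    for acid in (PALMITOLEIC, OLEIC, LINOLEIC, LINOLENIC):
        for bond in acid.double_bonds:
            print(acid.name, bond, methyl_end_aldehyde(acid, bond))
```

The other cleavage fragment, which carries the carboxyl group, is ignored here because the text names only the methyl-end aldehydes.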

CCD enzymes

CCD enzymes used in the methods of the present disclosure to catalyze production of aldehydes include CCD enzymes capable of oxidative cleavage of unsaturated fatty acids and/or derivatives of the acids. CCD enzymes are a family of non-heme iron-containing dioxygenase enzymes that catalyse the oxidative cleavage of carotenoid substrates. In embodiments, the CCD enzymes of the present disclosure include enzymes derived from a "natural" source. The CCD enzymes used in methods of the present disclosure are isolated enzymes, meaning that they are removed or separated from their native environment, or their environment of origin. However, to be "naturally-derived" as defined above, the CCD enzymes used in the present disclosure do not need to be directly extracted from a natural source, but may be first extracted and then cloned and expressed in an expression system in order to increase production of the CCD enzyme for use on an industrial scale. For instance, in an embodiment, a CCD enzyme or the CCD gene encoding the CCD enzyme may be extracted, e.g., in a crude extract, or otherwise obtained from an original source (e.g., a plant source). Then the gene encoding the CCD enzyme can be cloned and expressed in an expression system (e.g., a bacterial expression system, plant expression system, animal expression system, yeast expression system, viral expression system, etc.), and the expressed enzyme can then be isolated from the expression system and used in the methods of the present disclosure. In an embodiment, a gene for a plant CCD enzyme is inserted into an expression vector, cloned in an E. coli expression system, over-expressed in the E. coli expression system, and isolated for use.

Most members of the CCD family of enzymes are found in plants, but they are also found in mammals and micro-organisms. In embodiments, any CCD enzymes from a qualified "natural" source can be used in the methods of the present disclosure and to produce the aldehydes of the present disclosure. In embodiments, plant-derived CCD enzymes are used in methods of the present disclosure. In embodiments, the CCD enzyme is derived from a plant selected from maize (corn), tomato, Arabidopsis, and Novosphingobium aromaticivorans. In embodiments, synthetic copies of natural CCD enzymes or the genes that encode the enzymes can be synthesized for expression and use in the methods of the present disclosure. For instance, a gene encoding a maize CCD enzyme (e.g., CCD1) could be synthesized and then cloned and expressed in an expression system, isolated and used according to the methods of the present disclosure.

In some exemplary embodiments, the CCD enzyme is a Z. mays CCD enzyme. In embodiments, the CCD enzyme is a Z. mays CCD enzyme produced by expression of a Z. mays CCD gene including, but not limited to, ZmCCD1, ZmCCD7, and ZmCCD8. In embodiments, the CCD enzyme is produced by expression of a CCD gene having a nucleotide sequence selected from, but not limited to: SEQ ID NO: 1, SEQ ID NO: 3, and SEQ ID NO: 5, or having a nucleotide sequence with about 70% or greater, about 80% or greater, or about 90% or greater sequence identity with a sequence selected from: SEQ ID NO: 1, SEQ ID NO: 3, and SEQ ID NO: 5. In embodiments, the Z. mays CCD enzyme has a peptide sequence selected from, but not limited to, SEQ ID NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6. In embodiments, the CCD enzyme has a peptide sequence selected from a peptide sequence having about 80% or greater, about 90% or greater, or about 95% or greater sequence identity with a sequence selected from: SEQ ID NO: 2, SEQ ID NO: 4, and SEQ ID NO: 6. Modifications of known CCD enzymes (e.g., mutant or modified CCD enzymes) can also be used in the methods of the present disclosure. Several maize CCD mutants are also available that have changes at the active site entrance, and these may impact substrate binding and provide for a broader substrate base for producing additional aldehyde products.

The present disclosure also contemplates engineered, modified CCD enzymes containing mutations to improve use for oxidative cleavage of aldehydes according to methods of the present disclosure. Such modified enzymes may include those with modifications to increase substrate specificity, stability, efficiency, and the like.

In exemplary embodiments, E. coli expression clones for three maize CCD1 enzymes were obtained (see FIG. 5). In embodiments, crude enzyme extracts were mixed with linoleic acid, methyl linoleate, or ethyl linoleate and incubated as described in the Examples below. Organic products were extracted by solvent or by solid phase microextraction (SPME), and the reaction mixtures were analyzed by GC/MS. The results provided in Example 1 show that the desired product as well as both ester substrates can be easily observed by GC/MS. After successful production of n-hexanal in these reactions, the reaction conditions can be further adjusted with respect to temperature, pH, substrate concentration, etc. to optimize production.

In some exemplary embodiments, the CCD enzyme is an Arabidopsis thaliana CCD enzyme. In embodiments, the CCD enzyme is an A. thaliana CCD enzyme produced by expression of an A. thaliana CCD gene including, but not limited to, AtCCD1. In embodiments, the CCD enzyme is produced by expression of a CCD gene having a nucleotide sequence selected from, but not limited to: SEQ ID NO: 8, or a sequence having a nucleotide sequence with about 70% or greater, about 80% or greater, or about 90% or greater sequence identity with SEQ ID NO: 8. In embodiments, the A. thaliana CCD enzyme has a peptide sequence selected from, but not limited to, SEQ ID NO: 9. In embodiments, the CCD enzyme has a peptide sequence selected from a peptide sequence having about 80% or greater, about 90% or greater, or about 95% or greater sequence identity with SEQ ID NO: 9. Modifications of known CCD enzymes (e.g., mutant or modified CCD enzymes) can also be used in the methods of the present disclosure. Several A. thaliana CCD mutants are also available that have changes at the active site entrance, and these may impact substrate binding and provide for a broader substrate base for producing additional aldehyde products. The present disclosure also contemplates engineered, modified CCD enzymes containing mutations to improve use for oxidative cleavage of aldehydes according to methods of the present disclosure. Such modified enzymes may include those with modifications to increase substrate specificity, stability, efficiency, and the like.

In exemplary embodiments, E. coli expression clones for AtCCD1 were obtained (see Example 2). In embodiments, enzyme isolate was mixed with linoleic acid and incubated as described below. Organic products were extracted by solvent or by solid phase microextraction (SPME), and the reaction mixtures were analyzed by GC/MS. The results provided in Example 2 below have shown that the desired product as well as both ester substrates can be easily observed by GC/MS. After successful production of n-hexanal in these reactions, the reaction conditions can be further optimized with respect to temperature, pH, substrate concentration, etc. to optimize production.

In embodiments, a fusion peptide of a CCD enzyme and another peptide can be used where the second peptide can be used as a selective marker for further purification purposes. For instance, in embodiments, a fusion peptide of a CCD enzyme and a Glutathione S-Transferase (GST) peptide can be used such that a GST column, or other method, can be used to purify the fusion peptide from a crude extract. Example 2 describes an embodiment of a fusion peptide of AtCCD1 with E. coli GST. In such embodiments, after purification, the peptide can be used directly or cleaved first to provide the unfused CCD enzyme.

Designated "Natural" methods and products

Embodiments of the present disclosure also include methods of producing flavor aldehydes that can be labeled as "natural" according to guidelines and requirements of the food science industry. In such embodiments the unsaturated fatty acid or derivative thereof and the CCD enzyme are obtained from a "natural" source (as designated in relevant guidelines and regulations such as those set forth above), such as, but not limited to, a natural plant source.

In embodiments, such methods include using, as the substrate, an unsaturated fatty acid, or salt or ester derivative thereof, that has been obtained from a designated "natural" source, such as a plant source, as described above. For instance, the naturally-derived unsaturated fatty acid (or salt or ester derivative thereof) can be obtained from a biological source, or, in embodiments, specifically from a plant source. For instance, in embodiments, the fatty acid or derivative thereof, can be linoleic acid or a linoleate derivative obtained from oranges.

In embodiments, such methods also include using an isolated CCD enzyme obtained (directly or indirectly) from a "natural" source, such as, but not limited to, a plant source. In embodiments, such as described above, such methods include a CCD enzyme obtained from (e.g., directly or indirectly derived from) maize, tomato, Arabidopsis, or Novosphingobium aromaticivorans. In embodiments the CCD enzyme is a Z. mays CCD enzyme produced by expression of a Z. mays CCD gene selected from: CCD1, CCD7, and CCD8, or other Z. mays CCD gene or derivative thereof. In embodiments the CCD enzyme is an A. thaliana CCD enzyme produced by expression of an A. thaliana CCD gene such as, but not limited to, AtCCD1 or other A. thaliana CCD gene or derivative thereof. In embodiments, the CCD enzyme is produced by expression of a CCD gene having a nucleotide sequence selected from, but not limited to: SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, and SEQ ID NO: 8, or having a nucleotide sequence with about 70% or greater, about 80% or greater, or about 90% or greater sequence identity with a sequence selected from: SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, and SEQ ID NO: 8. In embodiments, the Z. mays or A. thaliana CCD enzyme has a peptide sequence selected from, but not limited to, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, and SEQ ID NO: 9. In embodiments, the CCD enzyme has a peptide sequence selected from a peptide sequence having about 80% or greater, about 90% or greater, or about 95% or greater sequence identity with a sequence selected from: SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, and SEQ ID NO: 9. The isolated CCD enzyme may be cloned and over-expressed in an expression system. In embodiments, the CCD enzyme is part of a fusion peptide including a GST peptide fused to the N-terminus of the CCD enzyme. In embodiments the fusion peptide has a peptide sequence of SEQ ID NO: 10 or a peptide sequence having about 80% or greater, about 90% or greater, or about 95% or greater sequence identity with SEQ ID NO: 10.
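The sequence identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch only: the disclosure does not specify an alignment algorithm or an identity convention, so the sketch assumes the two sequences have already been aligned (gaps written as "-") and that identity is counted over columns where neither sequence has a gap. The function names and the example variant are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes the inputs are pre-aligned and of equal length. Skipping gap
    columns is one of several conventions; others count gaps as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column: not counted under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MAEKLSDGSV"  # first ten residues of SEQ ID NO: 9
    variant = "MAEKLSDGAV"    # hypothetical single-substitution variant
    print(percent_identity(reference, variant))
    print(meets_threshold(reference, variant, 95.0))
```

In practice a full-length comparison would first require a pairwise alignment produced by an established tool; this sketch covers only the counting step.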

In an exemplary embodiment of the methods set forth above, a crude extract of Z. mays carotenoid cleavage dioxygenase 1 (CCD1) was overexpressed in E. coli and then isolated from E. coli by obtaining a second crude extract (isolate). The enzyme isolate can be used directly or further refined/purified. This isolated enzyme was used to catalyze the cleavage of linoleic acid derived from oranges to yield natural n-hexanal. Greater detail is provided in Example 1 below. In another embodiment a crude extract of Arabidopsis thaliana CCD1 was prepared as a fusion protein with GST and was overexpressed in E. coli and then isolated from E. coli by obtaining a second crude extract (isolate). The enzyme isolate can be used directly or further purified by use of a GST column. This isolated, purified enzyme was used to catalyze the cleavage of linoleic acid derived from oranges to yield natural n-hexanal. Greater detail is provided in Example 2 below.

Additional details regarding the methods and compositions of the present disclosure are provided in the Examples below. The specific examples below are to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever. Without further elaboration, it is believed that one skilled in the art can, based on the description herein, utilize the present disclosure to its fullest extent. All publications recited herein are hereby incorporated by reference in their entirety.

It should be emphasized that the embodiments of the present disclosure, particularly, any "preferred" embodiments, are merely possible examples of the implementations, merely set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) of the disclosure without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure, and protected by the following claims.

The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to perform the methods and use the compositions and compounds disclosed herein. Efforts have been made to ensure accuracy with respect to numbers (e.g., amounts, temperature, etc.), but some errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, temperature is in °C, and pressure is at or near atmospheric. Standard temperature and pressure are defined as 20 °C and 1 atmosphere.

It should be noted that ratios, concentrations, amounts, and other numerical data may be expressed herein in a range format. It is to be understood that such a range format is used for convenience and brevity, and thus, should be interpreted in a flexible manner to include not only the numerical values explicitly recited as the limits of the range, but also to include all the individual numerical values or sub-ranges encompassed within that range as if each numerical value and sub-range is explicitly recited. To illustrate, a concentration range of "about 0.1% to about 5%" should be interpreted to include not only the explicitly recited concentration of about 0.1 wt% to about 5 wt%, but also include individual concentrations (e.g., 1%, 2%, 3%, and 4%) and the sub-ranges (e.g., 0.5%, 1.1%, 2.2%, 3.3%, and 4.4%) within the indicated range. In an embodiment, the term "about" can include traditional rounding according to significant figures of the numerical value. In addition, the phrase "about 'x' to 'y'" includes "about 'x' to about 'y'".

EXAMPLES

Now having described the embodiments of the present disclosure, in general, the following Examples describe some additional embodiments of the present disclosure. While embodiments of the present disclosure are described in connection with the following examples and the corresponding text and figures, there is no intent to limit embodiments of the present disclosure to this description. On the contrary, the intent is to cover all alternatives, modifications, and equivalents included within the spirit and scope of embodiments of the present disclosure.

EXAMPLE 1— Z. mays CCD Enzyme

In this example, Z. mays CCD1 was expressed, isolated, and tested for conversion of linoleic acid to n-hexanal as described below.

Materials and Methods

CCD1 Protein Expression and Purification:

A derivative of pGEX2T containing the Z. mays CCD1 gene (SEQ ID NO: 1) (FIG. 5) was used to transform BL21(DE3) cells (described in Vogel, J.T.; Tan, B.; McCarty, D.R.; Klee, H.J. (2008) J. Biol. Chem. 283, 11364-11373, which is hereby incorporated by reference herein in pertinent part). A single colony grown on an LB plate containing 200 μg/mL ampicillin was used to inoculate 10 mL of LB supplemented with 100 μg/mL ampicillin, then the culture was shaken overnight at 37°C. A 0.5 mL aliquot was used to inoculate 50 mL of LB containing 100 μg/mL ampicillin, then this culture was shaken overnight at 37°C. A 40 mL portion was added to 4 L of LB containing 0.4% glucose (80 mL of a 20% stock), 200 μg/mL ampicillin and 1 mL Sigma Antifoam 204 in a New Brunswick fermenter. The culture was grown at 37°C with air flow of 1-1.5 vvm and stirring at 600 rpm until it reached O.D.600 = 0.6. CCD1 overexpression was induced by adding sterile L-arabinose to a final concentration of 0.10 mM and continued incubation overnight at 30°C. Cells were harvested by centrifugation (6,000 × g for 15 min at 4°C), then resuspended in cold 50 mM NaPi, pH 7.2 (1 mL buffer per gram of wet cells). Cells were lysed by two passages through a French pressure cell at ca. 15,000 psi, then debris was removed by centrifugation (30,000 × g for 1 h at 4°C). The supernatant was retained, glycerol was added to a final concentration of 20%, and the enzyme isolate was stored in aliquots at -80°C.

Linoleic acid oxidation: Linoleic Acid Solution: A 1% Tween-20 solution was made by adding 100 μL of Tween-20 (Fisher BP337-500) to 10 mL of Milli Q water. A 30 mM linoleic acid stock solution was then prepared by adding 93.48 μL of Linoleic Acid (Acros 60-33-3) to the 10 mL of the 1% Tween-20 solution. The mixture becomes cloudy white when the linoleic acid is sufficiently suspended.
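The volume of neat linoleic acid needed for the 30 mM stock can be checked with a short calculation. This is a sketch under stated assumptions: the molecular weight (280.45 g/mol) and density (0.90 g/mL) are typical literature values and are not given in the text, and the final volume is taken as the volume of the Tween-20 solution, ignoring the small volume of added fatty acid.

```python
# Assumed physical constants for linoleic acid (not stated in the text).
LINOLEIC_ACID_MW_G_PER_MOL = 280.45
LINOLEIC_ACID_DENSITY_G_PER_ML = 0.90


def neat_liquid_volume_ul(target_mM, final_volume_mL,
                          mw_g_per_mol=LINOLEIC_ACID_MW_G_PER_MOL,
                          density_g_per_mL=LINOLEIC_ACID_DENSITY_G_PER_ML):
    """Volume (in µL) of a neat liquid needed to reach target_mM in final_volume_mL."""
    mmol = target_mM * final_volume_mL / 1000.0  # mmol/L * L
    mass_mg = mmol * mw_g_per_mol                # mmol * g/mol = mg
    return mass_mg / density_g_per_mL            # mg / (g/mL) = µL


if __name__ == "__main__":
    print(round(neat_liquid_volume_ul(30.0, 10.0), 2))  # 10 mL preparation
    print(round(neat_liquid_volume_ul(30.0, 1.0), 2))   # 1 mL preparation
```

With these assumed constants the calculation reproduces the 93.48 μL figure for a 10 mL preparation, and about 9.35 μL for a 1 mL preparation.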

Reactions were performed in a total volume of 1 mL in 2.0 mL Eppendorf tubes. Each contained 5 mM linoleic acid (167 μL of the 30 mM stock solution described above), 100 μL of 50 mM NaPO4 (pH 7.2), 100 μL of 3 M NaCl, 10 μL of 500 μM FeSO4, and 10 μL of 500 mM ascorbate. Milli Q water was then added to bring the total volume to 0.98 mL, and the mixture was mixed and allowed to reach room temperature for approximately 5 minutes. Once at room temperature, 20 μL of the CCD1 enzyme isolate was added, mixed, and left on a rotisserie (Labquake) overnight at room temperature.
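The final concentration of each component in the 1 mL reaction follows from the stock concentrations and volumes listed above. The sketch below simply applies C1·V1 = C2·V2 to those values and computes the water volume needed before the 20 μL of enzyme is added; the dictionary layout and function names are illustrative choices, not part of the described procedure.

```python
# Stock concentration, unit, and volume (µL) for each component, as listed above.
COMPONENTS = {
    "linoleic acid": (30.0, "mM", 167.0),
    "sodium phosphate, pH 7.2": (50.0, "mM", 100.0),
    "NaCl": (3000.0, "mM", 100.0),  # 3 M stock expressed in mM
    "FeSO4": (500.0, "uM", 10.0),
    "ascorbate": (500.0, "mM", 10.0),
}
ENZYME_UL = 20.0
TOTAL_UL = 1000.0


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """Apply C1*V1 = C2*V2; the result is in the same unit as the stock."""
    return stock * volume_ul / total_ul


def water_volume_ul(components=COMPONENTS, enzyme_ul=ENZYME_UL, total_ul=TOTAL_UL):
    """Water needed so that components + water + enzyme equals the total volume."""
    used = sum(volume for _, _, volume in components.values())
    return total_ul - used - enzyme_ul


if __name__ == "__main__":
    for name, (stock, unit, volume) in COMPONENTS.items():
        print(f"{name}: {final_concentration(stock, volume):g} {unit}")
    print(f"water: {water_volume_ul():g} uL")
```

The 167 μL aliquot gives 5.01 mM linoleic acid, which the text rounds to 5 mM.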

A negative control lacking CCD1 was prepared in parallel. Solid-phase microextraction (SPME) was used to analyze volatiles in the reaction mixtures. Each sample was placed in a heat block set at 75°C for 30 minutes to allow the volatiles to collect in the headspace. After heating, the outer needle was used to penetrate the septum of the GC vial. A conditioned (270°C for 30 minutes), 50/30 μm, 24 Ga DVB/CAR/PDMS StableFlex fibre (Supelco) was exposed to the headspace of the sample for 30 minutes. The fibre was then immediately desorbed into the GC/MS injection port.

GC/MS Analysis:

The GC/MS method followed the report of Schmelz et al. (Schmelz, E.A.; Alborn, H.T.; and Tumlinson, J.H. (2001) Planta 214, 171-179, which is hereby incorporated by reference herein in pertinent part). Analyses were performed on a Hewlett-Packard (HP) 5890 Series II Plus gas chromatograph (He carrier gas; 0.7 mL/min; injector temp 220°C; injection volume 1 μL, detector temp 280°C) with a 30 m × 0.25 mm Restek RT-50 coupled with an HP 5971 mass selective detector. The oven temperature was programmed with an initial step at 60°C (2 min), followed by a 10°C/min increase to 250°C with a final hold at 250°C (10 min). Retention times for n-hexanal (2.57 min) and methyl benzoate (8.93 min) were determined by comparison with authentic standards and confirmed by comparing mass spectra to NIST data.
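The oven program described above can be written as a piecewise function of time, which gives the total run length and the nominal oven setpoint at any retention time. This is an illustrative sketch of the stated program only (60°C for 2 min, 10°C/min to 250°C, 10 min hold); the function names are hypothetical, and the setpoint is not a measured column temperature.

```python
START_C = 60.0
INITIAL_HOLD_MIN = 2.0
RAMP_C_PER_MIN = 10.0
FINAL_C = 250.0
FINAL_HOLD_MIN = 10.0


def total_run_time():
    """Total program length in minutes: initial hold + ramp + final hold."""
    ramp_min = (FINAL_C - START_C) / RAMP_C_PER_MIN
    return INITIAL_HOLD_MIN + ramp_min + FINAL_HOLD_MIN


def oven_temperature(t_min):
    """Nominal oven setpoint (°C) at t_min minutes after injection."""
    if not 0.0 <= t_min <= total_run_time():
        raise ValueError("time is outside the oven program")
    if t_min <= INITIAL_HOLD_MIN:
        return START_C
    ramp_end = INITIAL_HOLD_MIN + (FINAL_C - START_C) / RAMP_C_PER_MIN
    if t_min <= ramp_end:
        return START_C + RAMP_C_PER_MIN * (t_min - INITIAL_HOLD_MIN)
    return FINAL_C


if __name__ == "__main__":
    print(total_run_time())
    for name, rt in (("n-hexanal", 2.57), ("methyl benzoate", 8.93)):
        print(name, round(oven_temperature(rt), 1))
```

Under this program the run lasts 31 minutes, and n-hexanal elutes shortly after the ramp begins, at a setpoint of roughly 66°C.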

Results

GC/MS data from initial experiments described above with ZmCCD1 are presented in FIGS. 6-9. FIG. 6 shows GC/MS data from an overnight reaction initially containing 5 mM linoleic acid but lacking added CCD1 enzyme, with key peaks marked. A very small quantity of n-hexanal is present (the same level was found to contaminate the commercial linoleic acid starting material due to spontaneous air oxidation). FIG. 7 shows GC/MS data from the reaction in which CCD1 was added. The n-hexanal peak is significantly larger. Spectral library searches were conducted for the chromatographic peaks found in FIG. 7 and indicated that peak no. 2 at around 1.36 min corresponds to ethyl acetate, peak no. 3 at 2.58 min corresponds to hexanal, peak no. 5 at 5.50 min corresponds to furans (e.g., 2-pentylfuran, 2-ethylfuran, 2-(methoxymethyl)furan, etc.), and peak no. 10 at 8.94 min corresponds to the internal standard methyl benzoate.

FIG. 8A shows the region of the GC trace where n-hexanal elutes (centered on 2.58 min) along with the experimentally-determined MS data for the peak at 2.58 min (FIG. 8B). FIGS. 9A-9B show a comparison between the MS data found for the 2.58 min. peak in the reaction (FIG. 9A) and the library data for n-hexanal (FIG. 9B). It should be noted that the GC/MS software automatically eliminates any MS data for m/z values below 50. Based on this comparison, the 2.58 min peak in the GC trace was assigned as n-hexanal.
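Assigning a chromatographic peak by comparison with the retention time of an authentic standard can be sketched as a nearest-match lookup. The standard retention times below are those reported above for n-hexanal and methyl benzoate; the 0.05 min tolerance and the function name are assumptions made for illustration, and in the work described the assignments were additionally confirmed by mass spectral comparison.

```python
# Retention times (min) of authentic standards, as reported in the text.
STANDARDS = {"n-hexanal": 2.57, "methyl benzoate": 8.93}


def assign_peaks(observed_rts, standards=STANDARDS, tolerance_min=0.05):
    """Map each observed retention time to the nearest standard within tolerance.

    Peaks with no standard inside the tolerance window are mapped to None.
    The tolerance is an assumed value, not one given in the text.
    """
    assignments = {}
    for rt in observed_rts:
        name, ref_rt = min(standards.items(), key=lambda item: abs(item[1] - rt))
        assignments[rt] = name if abs(ref_rt - rt) <= tolerance_min else None
    return assignments


if __name__ == "__main__":
    # Peak positions reported for the reaction containing CCD1 (FIG. 7).
    for rt, name in assign_peaks([1.36, 2.58, 5.50, 8.94]).items():
        print(rt, name or "unassigned by retention time")
```

Retention time alone cannot identify the peaks at 1.36 and 5.50 min, which is why the library search of the mass spectra is needed for those.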

EXAMPLE 2— A. thaliana CCD Enzyme

In addition to the maize CCD enzymes tested in Example 1, a clone for Arabidopsis thaliana CCD1 was also prepared, expressed, isolated, and tested for conversion of linoleic acid to n-hexanal as described below.

Materials and Methods

AtCCD1 purification: The Arabidopsis thaliana CCD1 gene (SEQ ID NO: 8) was chemically synthesized and cloned as an NdeI, XhoI fragment between these sites in a pGEX plasmid. The resulting plasmid (pEA1) overexpresses a GST-CCD1 fusion protein (SEQ ID NO: 10) under control of a T7 promoter. Plasmid pEA1 was used to transform BL21(DE3) Gold cells and colonies were selected on LB + ampicillin plates. A single colony was used to make a starter culture (5 mL of LB containing 10 μL of 50 mg/mL ampicillin) that was shaken overnight at 37°C. The overnight culture was added to 1 L of sterile LB containing 2 mL of 50 mg/mL ampicillin. This culture was shaken at 250 rpm at 37°C until an OD600 of ~0.6 was reached. Fusion protein overexpression was induced by adding IPTG to a final concentration of 0.10 mM (200 μL of a 0.5 M stock solution). Once induced, the culture was incubated at 18°C for 24 hours at 220 rpm. The cells were collected by centrifugation at 6,000 rpm, 4°C for 15 minutes. They were resuspended in an equal amount of NaPO4 buffer and then lysed using a high pressure homogenizer (French Press). To the lysed cells, 0.2% Triton X-100 was added, mixed thoroughly, and then the resulting mixture was centrifuged at 18,000 rpm, 4°C for 60 minutes. The supernatant was retained and filtered through a 0.45 μm filter before being passed through the GST column.

GST column purification: The effluent from the filtered protein was added to the GST column and was washed with 1X PBS buffer. The flowthrough was discarded and AtCCD1 was eluted with 10 mL Tris buffer (50 mM Tris, 15 mM reduced glutathione, pH 8.0) at 4°C. The protein was concentrated using Amicon Ultra 10 kDa (Millipore) to about 2 mL. The presence of the desired protein was confirmed using a 10% polyacrylamide gel under denaturing conditions.

Linoleic Acid Solution: A 1% Tween-20 solution was made by adding 10 μL of Tween-20 (Fisher BP337-500) to 1 mL of Milli Q water. A 30 mM linoleic acid solution was then prepared by adding 9.35 μL of Linoleic Acid (Acros 60-33-3) to the 1 mL of 1% Tween-20 solution. The mixture became cloudy white.

Enzyme Assays: The assays were performed in a total volume of 1 mL according to McCarty, D.R.; et al.¹ To a 2.0 mL crimp top GC vial the following was added: 167 μL of the 30 mM substrate solution (final concentration 5 mM), 100 μL of 50 mM NaPO4 (pH 7.2), 100 μL of 3 M NaCl, 10 μL of 500 μM FeSO4, 10 μL of 500 mM ascorbate, and 20 μL of the purified AtCCD1. Milli Q water was then added to bring the total volume to 1 mL. The reactions were vortexed briefly and put on the rotisserie (LabQuake) to react overnight in an incubator set at 25°C. Immediately before analysis, 1.23 μL of a 10X dilution of d12-hexanal (1 mM) was added. The products were analyzed by Solid Phase Micro Extraction (SPME) and GC-MS procedures.

SPME: Each sample was placed in a heat block set at 75°C for 30 minutes to allow the volatiles to collect in the headspace. After heating, the outer needle was used to penetrate the septum of the GC vial. A conditioned (270°C for 30 minutes), 50/30 μm, 24 Ga DVB/CAR/PDMS StableFlex fibre (Supelco) was exposed to the headspace of the sample for 30 minutes. The fibre was then immediately desorbed for 3 minutes into the GC/MS injection port.

GC-MS Analysis: Gas chromatography-mass spectrometry was performed on a Hewlett-Packard (HP) 5890 Series II Plus gas chromatograph (He carrier gas; 1.0 mL/min; splitless injector 220°C; injection volume 1 μL) with a DB-17 (30 m long; 250 μm i.d.; 0.25 μm thickness) column. The temperature was programmed from 60°C (no solvent delay) at 10°C/minute to 250°C (hold for 10 minutes) with manual injections. The GC was coupled to an HP 5971 series mass selective detector. Hexanal (Aldrich 66-25-1), d12-hexanal (CDN Isotopes D-6265), and linoleic acid (ACROS 60-33-3) retention times were determined using standard solutions and comparison of the mass spectra with NIST.

Results

GC/MS data from initial experiments described above with AtCCD1 are presented in FIGS. 10-13B. FIG. 10 shows GC/MS data from an overnight reaction initially containing 5 mM linoleic acid but lacking added AtCCD1 enzyme, with key peaks marked. A very small quantity of n-hexanal is present (likely from contamination due to spontaneous air oxidation). FIGS. 11A-11B illustrate the GC/MS data from the negative control. FIG. 11A shows the region of the GC trace where d12-n-hexanal elutes (peak centered at 2.48 min), which was added to the reaction mixture for reference. The region of the GC trace where n-hexanal elutes (centered on 2.58 min) also appears on the GC trace in FIG. 11A due to the likely contamination. FIG. 11B shows the experimentally-determined MS data for the peaks in FIG. 11A.

FIG. 12 shows MS data from the overnight reaction in which AtCCD1 was added. FIGS. 13A-13B illustrate the GC/MS data from the reaction mixture of A. thaliana CCD1 and linoleic acid as substrate. The n-hexanal peak in FIGS. 13A and 13B is significantly larger than in the negative control (FIGS. 11A and 11B). FIG. 13A shows the region of the GC trace where d12-n-hexanal (peak centered at 2.49 min, for reference) and n-hexanal elute (centered on 2.58 min), and FIG. 13B shows the experimentally-determined MS data for the peaks in FIG. 13A. Spectral library searches were conducted for the chromatographic peaks found in the above figures and indicated that peak no. 1 at around 1.36 min corresponds to ethyl acetate, peak no. 3 at 2.58 min corresponds to n-hexanal, and peak no. 4 at 5.46 min corresponds to a heptenal (e.g., 2-heptenal, 1-heptadecanol, etc.). Peak no. 2 at 2.49 min was incorrectly identified as 2,3-dichloro-1-propanol because d12-n-hexanal is not present in the NIST database.

CCD enzymes from other plant sources

Other possible CCD enzymes for use in methods of the present disclosure include CCD enzymes that have been described from other organisms, such as tomato and Novosphingobium aromaticivorans. E. coli overexpression clones analogous to the one for maize CCD1 (FIG. 5) will be constructed and the proteins tested for cleavage of linoleic acid or an ester derivative. Such studies may employ synthetic genes for these clones rather than starting from the original mRNA samples.

Mutagenesis studies can be conducted to identify one or more CCD variants that will also be suitable for use in the methods of the present disclosure. Pools of mutants will be created and screened. As the level of linoleate cleavage activity in the variants increases, the pool sizes following mutagenesis may be decreased.

References

(1) Sz. Marczy, J.; Sz. Nemeth, A.; Samu, Z.; Hager-Veress, A.; Szajani, B. Biotechnol. Lett. 2002, 24, 1673. Production of Hexanal from Hydrolyzed Sunflower Oil by Lipoxygenase and Hydroperoxid Lyase Enzymes.

(2) Giuliano, G.; Al-Babili, S.; von Lintig, J. Trends Plant Sci. 2003, 8, 145. Carotenoid Oxygenases: Cleave It or Leave It.

(3) Camara, B.; Bouvier, F. Arch. Biochem. Biophys. 2004, 430, 16. Oxidative Remodeling of Plastid Carotenoids.

(4) Moise, A. R.; von Lintig, J.; Palczewski, K. Trends Plant Sci. 2005, 10, 178. Related Enzymes Solve Evolutionary Recurrent Problems in the Metabolism of Carotenoids.

(5) Bouvier, F.; Isner, J.-C.; Dogbo, O.; Camara, B. Trends Plant Sci. 2005, 10, 187. Oxidative Tailoring of Carotenoids: A Prospect Towards Novel Functions in Plants.

(6) Sui, X.; Kiser, P. D.; von Lintig, J.; Palczewski, K. Arch. Biochem. Biophys. 2013, 539, 203. Structural Basis of Carotenoid Cleavage: From Bacteria to Mammals.

(7) Rodrigo, M. J.; Alquezar, B.; Alos, E.; Medina, V.; Carmona, L.; Bruno, M.; Al-Babili, S.; Zacarias, L. J. Exp. Botany 2013, 64, 4461. A Novel Carotenoid Cleavage Activity Involved in the Biosynthesis of Citrus Fruit-Specific Apocarotenoid Pigments.

(8) Vogel, J. T.; Tan, B.-C.; McCarty, D. R.; Klee, H. J. J. Biol. Chem. 2008, 283, 11364. The Carotenoid Cleavage Dioxygenase 1 Enzyme Has Broad Substrate Specificity, Cleaving Carotenoids at Two Different Bond Positions.

(9) Mutti, F. G. Bioinorg. Chem. and Appl. 2012, 2012, 1. Alkene Cleavage Catalysed by Heme and Non-Heme Enzymes: Reaction Mechanisms and Biocatalytic Applications.

(10) Kim, Y.-S.; Seo, E.-S.; Oh, D.-K. Biotechnol. Lett. 2012, 34, 1851. Characterization of an apo-Carotenoid 13,14-Dioxygenase from Novosphingobium aromaticivorans that Converts 3-apo-8'-carotenal to 3-apo-13-carotenone.

SEQUENCES:

SEQ ID NO: 1, Z. mays CCD1 DNA coding region:

ATGGGGACGGAGGCGGAGCAGCCGGACATGGACAGCCACCGAAACGACGGCGTCGT

GGTGGTGCCAGCGCCGCGCCCGCGTAAGGGGCTCGCCTCCTGGGCGCTTGACCTGC

TCGAGTCCCTCGCCGTGCGCCTCGGCCACGACAAGACCAAGCCGCTCCACTGGCTCT

CCGGCAACTTCGCCCCCGTCGTCGAGGAGACCCCGCCGGCCCCAAACCTTAGCGTCC

GCGGACACCTCCCGGAGTGCTTGAATGGAGAGTTTGTCAGGGTTGGGCCTAATCCGA A

GTTTGCTCCTGTTGCGGGGTATCACTGGTTTGATGGAGACGGGATGATTCATGCCAT G

CGTATTAAGGATGGAAAAGCTACCTATGTATCAAGATATGTGAAGACTGCCCGCCTC AA

ACAAGAGGAGTATTTTGGTGGAGCAAAGTTTATGAAGATTGGAGACCTTAAGGGATTTT

TTGGATTGTTTATGGTCCAAATGCAGCAACTTCGGAAAAAATTCAAAGTCTTGGATT TTA

CCTATGGATTTGGGACAGCTAATACTGCACTTATATATCATCATGGTAAACTCATGG CCT

TGTCAGAAGCAGATAAGCCATATGTTGTTAAGGTCCTTGAAGATGGAGACTTGCAGA CT

CTTGGCTTGTTGGATTATGACAAAAGGTTGAAACATTCTTTTACTGCCCATCCAAAG GTT

GACCCTTTTACAGATGAAATGTTCACATTCGGATATTCACATGAACCTCCATACTGT ACA

TACCGTGTGATTAACAAAGAAGGAGCTATGCTTGATCCTGTGCCAATAACAATACCG GA

ATCTGTAATGATGCATGATTTTGCCATCACAGAGAATTACTCTATTTTTATGGACCT CCC

TTTATTGTTCCGACCAAAGGAAATGGTGAAGAACGGTGAGTTTATCTACAAGTTTGA TCC

TACAAAGAAAGGTCGTTTTGGTATTCTCCCCCGCTATGCAAAGGATGACAAACTCAT CA

GATGGTTTCAACTCCCTAATTGTTTCATATTCCATAATGCTAATGCTTGGGAAGAGG GTG

ATGAAGTTGTTCTCATTACCTGCCGCCTTGAGAATCCAGATTTGGACAAGGTGAATG GA TATCAAAGTGACAAGCTCGAAAACTTCGGGAATGAGCTGTACGAGATGAGATTCAACAT

GAAAACGGGTGCTGCTTCACAAAAGCAATTGTCTGTTTCTGCTGTGGATTTTCCTCG TG

TTAATGAGAGCTATACTGGCAGAAAGCAGCGGTATGTCTACTGCACTATACTTGACA GC

ATTGCGAAGGTGACTGGCATCATAAAGTTTGATCTGCATGCTGAACCGGAAAGTGGT GT

GAAAGAACTTGAAGTGGGAGGAAATGTACAAGGCATATATGACCTGGGACCTGGTAG A

TTTGGTTCAGAGGCGATTTTTGTTCCCAAGCATCCAGGTGTGTCCGGAGAAGAAGAT GA

CGGCTATTTGATATTCTTTGTACACGACGAGAATACAGGGAAATCTGAAGTAAATGT TAT

CGATGCAAAGACAATGTCTGCTGATCCAGTTGCGGTGGTTGAGCTTCCTAATAGGGT TC

CTTATGGATTCCATGCCTTCTTTGTAACTGAGGACCAACTGGCTCGACAGGCGGAGG G

GCAGTGA

SEQ ID NO: 2, Z. mays CCD1 protein sequence:

MGTEAEQPDMDSHRNDGWVVPAPRPRKGLASWALDLLESLAVRLGHDKTKPLHWLSGN

FAPVVEETPPAPNLSVRGHLPECLNGEFVRVGPNPKFAPVAGYHWFDGDGMIHAMRI KDG

KATYVSRYVKTARLKQEEYFGGAKFMKIGDLKGFFGLFMVQMQQLRKKFKVLDFTYG FGT

ANTALIYHHGKLMALSEADKPYWKVLEDGDLQTLGLLDYDKRLKHSFTAHPKVDPFT DEM

FTFGYSHEPPYCTYRVINKEGAMLDPVPITIPESVMMHDFAITENYSIFMDLPLLFR PKEMVK

NGEFIYKFDPTKKGRFGILPRYAKDDKLIRWFQLPNCFIFHNANAWEEGDEVVLITC RLENP

DLDKVNGYQSDKLENFGNELYEMRFNMKTGAASQKQLSVSAVDFPRVNESYTGRKQR YV

YCTILDSIAKVTGIIKFDLHAEPESGVKELEVGGNVQGIYDLGPGRFGSEAIFVPKH PGVSGE

EDDGYLIFFVHDENTGKSEVNVIDAKTMSADPVAVVELPNRVPYGFHAFFVTEDQLA RQAE

GQ

SEQ ID NO: 3, Z. mays CCD7 DNA coding region:

ATGGCCGCCATGCACGCCATCGTGCACCACCGCGCACCGCCGCCTGGCCGCTGCTGT

CGCGGCCACGGGCGCAGCGTCGTCGTCCGCGCGTCGGCCGCCACCGTCACCACCAG

CATCCCCGGGTCCGCGGCGACAGTGCCGGACTCGCCGTCCGCGGCGTTCTGGGACTA

CAACCTCCTATTCCGGTCGCAGCGCGCCGAGTCCCCCGACCCCGTGGTGCTCCGCGT

CACGGAGGGCGCGATCCCGCCGGACTTCCCGGCGGGCACCTACTACCTCGCCGGTCC

CGGGATGTTCACCGACGACCACGGGTCCACGGTCCACCCGCTCGACGGCCACGGCTA

CCTCCGCTCGTTCCGCTTCGACGCCAGCGGCGCGGCGCACTACTCCGCGCGGTACGT

GGAGACGGCGGCGAAGCGGGAGGAGCACGACGCGGGCGGCGCGTCGTGGCGGTTC

ACGCACCGGGGCCCCTTCTCGGTGCTGCAGGGTGGGAGCCGGGTGGGCAACGTGAA

GGTGATGAAGAACGTGGCCAACACCAGCGTGCTGCGCTGGGGCGGCCGCGTGCTCTG

CCTCTGGGAAGGCGGGGAGCCGTACGAGCTGGACCCGCGGACGCTGGAGACCATCG

GCCCGTTCGACATCCTCGGCAGCCTCTGCGGCGGTACCGACGAAGTGGCACGAGACG

CCAGCGGCGAGGCTGCGCATCACGGGCGCCGGCAGCCGTGGCTGCAGGAGGCAGGG

ATCGACGTGGCCGCGCGCCTGCTGCGACCGGTCCTCAGTGGTGTCTACAGCATGCCG

GCCAAGCGGCTCCTCGCGCACTACAAGATCGACCCCAAGAAGAACCGGCTGCTGATG

GTGGCCTGCAACGCCGAGGACATGCTCCTCCCGCGCTCCAACTTCACTTTCTACGAG T

TCGACGCCAACTTCGCGCTGGTGCAGAAGCGGGAGTTCGTCTTGCCGGACCACCTGA T

GATCCACGACTGGGCCTTCACGGATAGCCAATACGTCCTCCTCGGAAACAGAATCAG G

CTGGACATTCCCGGTTCGCTGCTGGCGCTCACGGGCACTCACCCAATGATCGCGGCC

CTCGCCGTGGACCCGAGCCGGCAGTCCACACCCGTCTACCTGCTGCCGCGCACCCCG

GAGGCCGAGGCCGAGGCCAGCGGCCGCGACTGGAGCGTGCCCATCGAGGCGCCGTC

GCAGATGTGGTCGATGCACGTCGGCAACGCCTTCGAGGAGCGCAACGCCCGGGGCG

GCATCAACATACAGCTCCACATGTCCGGCTGCTCCTACCAGTGGTTCAACTTCCACA GG ATGTTTGGTTACAACTGGCTGAACAAGAAGCTGGACCCGTCCTTCATGAATATAGCCAA

GGGAAGGGAATGGCTACCTCGTCTTGTTCAGGTGTCCATCGACCTCGACAAGAGAGG G

ACGCGCCGAGGATGCGCCGTCCGGAGACTGTCCGACCAGTGGACCAGGCCGGCGGA

CTTCCCGGCGATCAACCCAGGCTTCGCCAACCGGAGGAGCCGCTTCGTCTACGCCGG

TGCCGCCTCCGGCTCGCGCAGATTCTTGCCGTACTTCCCCTTTGACAGCGTTGTCAA G

GTAGACGTCTCCGATGGATCAGCGCGGCTGTGGTCTGTCGCTGGGCGCAAGTTCGTC

GGTGAGCCGGTCTTCGTCCCCACCGGCAGTGGCGAGGATGACGGCTATGTTCTACTT G

TAGAGTATGCAGTGTACGATCACAGGTGCCATCTGGTGCTGCTGGACGCAAGGAAGA T

CGGGGAAAGGAACGCAGTTGTGGCAAAACTTGAGGTGCCCAAGCACCTCACCTTCCC G

ATGGGATTCCATGGGTTCTGGGCAGATGAATGA

SEQ ID NO: 4, Z. mays CCD7 protein sequence

MAAMHAIVHHRAPPPGRCCRGHGRSVVVRASAATVTTSIPGSAATVPDSPSAAFWDYNLL

FRSQRAESPDPVVLRVTEGAIPPDFPAGTYYLAGPGMFTDDHGSTVHPLDGHGYLRS FRF

DASGAAHYSARYVETAAKREEHDAGGASWRFTHRGPFSVLQGGSRVGNVKVMKNVAN TS

VLRWGGRVLCLWEGGEPYELDPRTLETIGPFDILGSLCGGTDEVARDASGEAAHHGR RQP

WLQEAGIDVAARLLRPVLSGVYSMPAKRLLAHYKIDPKKNRLLMVACNAEDMLLPRS NFTF

YEFDANFALVQKREFVLPDHLMIHDWAFTDSQYVLLGNRIRLDIPGSLLALTGTHPM IAALAV

DPSRQSTPVYLLPRTPEAEAEASGRDWSVPIEAPSQMWSMHVGNAFEERNARGGINI QLH

MSGCSYQWFNFHRMFGYNWLNKKLDPSFMNIAKGREWLPRLVQVSIDLDKRGTRRGC AV

RRLSDQWTRPADFPAINPGFANRRSRFVYAGAASGSRRFLPYFPFDSWKVDVSDGSA RL

WSVAGRKFVGEPVFVPTGSGEDDGYVLLVEYAVYDHRCHLVLLDARKIGERNAVVAK LEV

PKHLTFPMGFHGFWADE

SEQ ID NO: 5, Z. mays CCD8 DNA coding region:

ATGTCTCCCACTATGGCTTCGTCGTTGTGCGTATTTGCAGCGATGTCTGGCGCCAGCG

GCAGGCCGTCGGCCGGTGGCTCGGCGGTACCGGGCCGTCTGTCCAGCAGCACACAG

GGGGGCAAGGGAAAGCGGGCCGTGGTGCAGCCGCTCGCGGCTAGCGTGGTGACGGA

GACGCCAGCGCCGGCCGTAGCTCCGGCTCGGCCCGTCGTCGACGCCCCGCCGCGCC

GCCGTGGGGGCCGCGGCACCGTCGAGCACGCGGCGTGGAAGAGCGTCCGGCAGGA

GAGGTGGGAGGGGGCGCTGGAGCTGGAGGGAGAGCTGCCGCTCTGGCTGGATGGCA

CCTACCTGAGGAACGGGCCGGGCCTGTGGAACCTGGGCGACTACGGCTTCCGGCACC

TGTTCGACGGCTACGCCACGCTGGTCCGCGTGTCGTTCCGCGACGGGCAGGCGGTGG

GCGCGCACCGGCAGATCGAGTCGGAGGCGTACAAGGCGGCGCGCGCGCACGGCAAG

GTGTGCTACCGCGAGTTCTCGGAGGTGCCCAAGGCGGAGGGGTTCCTCTCCCACGTG

GGCCAGCTCGCCACCCTCTTCTCGGGCTCCTCGCTCACCGACAACTCCAACACGGGC

GTCGTCAGGCTCGGCGACGGCCGCGTCCTCTGCCTGACAGAGACCATCAAGGGCTCC

ATCGTGGTCGACCCGGACACGCTCGACACCATCGGCAAGTTCGAGTACACGGACAGG

CTGGGCGGCCTCATCCACTCGGCGCACCCCATCGTGACGGACACCGAGTTCTGGACG

CTCATCCCGGACCTCATCCGCCCGGGCTACTCGGTGGTGAGGATGGACGCCGGGACC

AACGAGCGGCGGTTCGTCGGCAGGGTGGACTGTCGGGGCGGGCCGGCGCCCGGGTG

GGTGCACTCGTTCCCCATCACCGACCACTACGTGGTGGTGCCGGAGATGCCGCTCCG

GTACTGCGCCAGGAACCTCCTCCGCGCGGAGCCCACGCCGCTGTACAAGTTCGAGTG

GCACCTCGAGTCCGGCAGCTACATGCACGTCATGTGCAAGGCTAGCGGCAGGGTCGT

GGCGACCGTGGAGGTGCCGCCGTTCGTCACGTTCCACTTCATCAACGCGTACGAGGA

GAAGGACGACGAGGGCCGCGTCACCGCGATCATCGCCGACTGCTGCGAGCACAACGC

CAACACCTCCATCCTCGACAAGCTCCGGCTCCAGAACCTGCGCTCTTCCACCGGCCA G

GACGTCCTCCCCGACGCCAGGGTGGGCCGCTTCAGGATCCCGCTGGACGGGAGCCC GTTCGGCGAGCTGGAGCCGGCGCTGGACCCGGACCAGCACGGCCGCGGGATGGACA

TGTGCAGCATCAACCCGGCCCACGTCGGCAAGAAGTACCGGTACGCCTACGCCTGCG

GAGCCCACCGGCCGTGCAACTTCCCCAACACCCTCACCAAGATCGACCTGGTGGAGA A

GACGGCCAAGAACTGGTACGAGGAGGGCGCCGTGCCGTCAGAGCCCTTCTTCGTGCC

GCGCCCCGGCGCCGTGGAGGAGGACGACGGCGTCGCGATCTCGATGGTGAGCGCCA

AGGACGGATCGGCCTACGCGTTGGTGCTGGACGCCAAGACGTTCCACGAGATCGCGC

GGGCCAAGTTCCCGTACGCGATGCCCTACGGCTTGCACTGCTGCTGGGTGCCTAGGA

GTACCTCAGACGCGTAG

SEQ ID NO: 6, Z. mays CCD8 protein sequence:

MSPTMASSLCVFAAMSGASGRPSAGGSAVPGRLSSSTQGGKGKRAVVQPLAASWTETP

APAVAPARPVVDAPPRRRGGRGTVEHAAWKSVRQERWEGALELEGELPLWLDGTYLR NG

PGLWNLGDYGFRHLFDGYATLVRVSFRDGQAVGAHRQIESEAYKAARAHGKVCYREF SEV

PKAEGFLSHVGQLATLFSGSSLTDNSNTGWRLGDGRVLCLTETIKGSIVVDPDTLDT IGKFE

YTDRLGGLIHSAHPIVTDTEFWTLIPDLIRPGYSVVRMDAGTNERRFVGRVDCRGGP APGW

VHSFPITDHYVVVPEMPLRYCARNLLRAEPTPLYKFEWHLESGSYMHVMCKASGRVV ATV

EVPPFVTFHFINAYEEKDDEGRVTAIIADCCEHNANTSILDKLRLQNLRSSTGQDVL PDARV

GRFRIPLDGSPFGELEPALDPDQHGRGMDMCSINPAHVGKKYRYAYACGAHRPCNFP NTL

TKIDLVEKTAKNWYEEGAVPSEPFFVPRPGAVEEDDGVAISMVSAKDGSAYALVLDA KTFH

EIARAKFPYAMPYGLHCCWVPRSTSDA

SEQ ID NO: 7, Z. mays VP14 protein sequence

MQGLAPPTSVSIHRHLPARSRARASNSVRFSPRAVSSVPPAECLQAPFHKPVADLPAPSR K

PAAIAVPGHAAAPRKAEGGKKQLNLFQRAAAAALDAFEEGFVANVLERPHGLPSTAD PAVQ

IAGNFAPVGERPPVHELPVSGRIPPFIDGVYARNGANPCFDPVAGHHLFDGDGMVHA LRIR

NGAAESYACRFTETARLRQERAIGRPVFPKAIGELHGHSGIARLALFYARAACGLVD PSAGT

GVANAGLVYFNGRLLAMSEDDLPYHVRVADDGDLETVGRYDFDGQLGCAMIAHPKLD PAT

GELHALSYDVIKRPYLKYFYFRPDGTKSDDVEIPLEQPTMIHDFAITENLVWPDHQV VFKLQ

EMLRGGSPWLDKEKTSRFGVLPKHAADASEMAWVDVPDCFCFHLWNAWEDEATGEVV V

IGSCMTPADSIFNESDERLESVLTEIRLDARTGRSTRRAVLPPSQQVNLEVGMVNRN LLGRE

TRYAYLAVAEPWPKVSGFAKVDLSTGELTKFEYGEGRFGGEPCFVPMDPAAAHPRGE DD

GYVLTFVHDERAGTSELLVVNAADMRLEATVQLPSRVPFGFHGTFITGQELEAQAA

SEQ ID NO: 8, A. thaliana CCD1 DNA coding region

ATGGCGGAGAAACTCAGTGATGGCAGCGTCATCATCTCAGTCCATCCTAGACCCTCCA

AGGGTTTCTCCTCGAAGCTTCTCGATCTTCTCGAAAGACTTGTCGTCAAGCTCATGC AC

GATGCTTCTCTCCCTCTCCACTACCTCTCAGGCAACTTCGCTCCCATCCGTGATGAA AC

TCCTCCCGTCAAGGATCTCCCCGTCCATGGATTTCTTCCCGAATGCTTGAATGGTGA AT

TTGTGAGGGTTGGTCCAAACCCCAAGTTTGATGCTGTCGCTGGATATCACTGGTTTG AT

GGAGATGGGATGATTCATGGGGTACGCATCAAAGATGGGAAAGCTACTTATGTTTCT CG

ATATGTTAAGACATCACGTCTTAAGCAGGAAGAGTTCTTCGGAGCTGCCAAATTCAT GA

AGATTGGTGACCTTAAGGGGTTTTTCGGATTGCTAATGGTCAATATCCAACAGCTGA GA

ACGAAGCTCAAAATATTGGACAACACTTATGGAAATGGAACTGCCAATACAGCACTC GT

ATATCACCATGGAAAACTTCTAGCATTACAGGAGGCAGATAAGCCGTACGTCATCAA AG

TTTTGGAAGATGGAGACCTGCAAACTCTTGGTATAATAGATTATGACAAGAGATTGA CC

CACTCCTTCACTGCTCACCCAAAAGTTGACCCGGTTACGGGTGAAATGTTTACATTC GG

CTATTCGCATACGCCACCTTATCTCACATACAGAGTTATCTCGAAAGATGGCATTAT GCA TGACCCAGTCCCAATTACTATATCAGAGCCTATCATGATGCATGATTTTGCTATTACTGA

GACTTATGCAATCTTCATGGATCTTCCTATGCACTTCAGGCCAAAGGAAATGGTGAA AG

AGAAGAAAATGATATACTCATTTGATCCCACAAAAAAGGCTCGTTTTGGTGTTCTTC CGC

GCTATGCCAAGGATGAACTTATGATTAGATGGTTTGAGCTTCCCAACTGCTTTATTT TCC

ACAACGCCAATGCTTGGGAAGAAGAGGATGAAGTCGTCCTCATCACTTGTCGTCTTG A

GAATCCAGATCTTGACATGGTCAGTGGGAAAGTGAAAGAAAAACTCGAAAATTTTGG CA

ACGAACTGTACGAAATGAGATTCAACATGAAAACGGGCTCAGCTTCTCAAAAAAAAC TA

TCCGCATCTGCGGTTGATTTCCCCAGAATCAATGAGTGCTACACCGGAAAGAAACAG A

GATACGTATATGGAACAATTCTGGACAGTATCGCAAAGGTTACCGGAATCATCAAGT TT

GATCTGCATGCAGAAGCTGAGACAGGGAAAAGAATGCTGGAAGTAGGAGGTAATATCA

AAGGAATATATGACCTGGGAGAAGGCAGATATGGTTCAGAGGCTATCTATGTTCCGC GT

GAGACAGCAGAAGAAGACGACGGTTACTTGATATTCTTTGTTCATGATGAAAACACA GG

GAAATCATGCGTGACTGTGATAGACGCAAAAACAATGTCGGCTGAACCGGTGGCAGT G

GTGGAGCTGCCGCACAGGGTCCCATACGGCTTCCATGCCTTGTTTGTTACAGAGGAA C

AACTCCAGGAACAAACTCTTATATAA

SEQ ID NO: 9, A. thaliana CCD1 protein sequence

MAEKLSDGSVIISVHPRPSKGFSSKLLDLLERLVVKLMHDASLPLHYLSGNFAPIRDETP PVK

DLPVHGFLPECLNGEFVRVGPNPKFDAVAGYHWFDGDGMIHGVRIKDGKATYVSRYV KTS

RLKQEEFFGAAKFMKIGDLKGFFGLLMVNIQQLRTKLKILDNTYGNGTANTALVYHH GKLLA

LQEADKPYVIKVLEDGDLQTLGIIDYDKRLTHSFTAHPKVDPVTGEMFTFGYSHTPP YLTYR

VISKDGIMHDPVPITISEPIMMHDFAITETYAIFMDLPMHFRPKEMVKEKKMIYSFD PTKKARF

GVLPRYAKDELMIRWFELPNCFIFHNANAWEEEDEVVLITCRLENPDLDMVSGKVKE KLEN

FGNELYEMRFNMKTGSASQKKLSASAVDFPRINECYTGKKQRYVYGTILDSIAKVTG IIKFDL

HAEAETGKRMLEVGGNIKGIYDLGEGRYGSEAIYVPRETAEEDDGYLIFFVHDENTG KSCVT

VIDAKTMSAEPVAVVELPHRVPYGFHALFVTEEQLQEQTLI

SEQ ID NO: 10, fusion protein of A. thaliana CCD1 protein sequence fused at N-terminus to Glutathione S-Transferase (GST) protein sequence (underlined portion)

MTKLPILGYWKIKGLVQPTRLLLEYLEEKYEEHLYERDEGDKWRNKKFELGLEFPNLPYY ID

GDVKLTQSMAIIRYIADKHNMLGGCPKERAEISMLEGAVLDIRYGVSRIAYSKDFET LKVDFL

SKLPEMLKMFEDRLCHKTYLNGDHVTHPDFMLYDALDVVLYMDPMCLDAFPKLVCFK KRIE

AIPQIDKYLKSSKYIAWPLQGWQATFGGGDHPPKSDHLVPRHMAEKLSDGSVIISVH PRPSK

GFSSKLLDLLERLVVKLMHDASLPLHYLSGNFAPIRDETPPVKDLPVHGFLPECLNG EFVRV

GPNPKFDAVAGYHWFDGDGMIHGVRIKDGKATYVSRYVKTSRLKQEEFFGAAKFMKI GDL

KGFFGLLMVNIQQLRTKLKILDNTYGNGTANTALVYHHGKLLALQEADKPYVIKVLE DGDLQ

TLGIIDYDKRLTHSFTAHPKVDPVTGEMFTFGYSHTPPYLTYRVISKDGIMHDPVPI TISEPIM

MHDFAITETYAIFMDLPMHFRPKEMVKEKKMIYSFDPTKKARFGVLPRYAKDELMIR WFELP

NCFIFHNANAWEEEDEVVLITCRLENPDLDMVSGKVKEKLENFGNELYEMRFNMKTG SAS

QKKLSASAVDFPRINECYTGKKQRYVYGTILDSIAKVTGIIKFDLHAEAETGKRMLE VGGNIK

GIYDLGEGRYGSEAIYVPRETAEEDDGYLIFFVHDENTGKSCVTVIDAKTMSAEPVAVVELP

HRVPYGFHALFVTEEQLQEQTLI