Title:
HOST CELLS AND METHODS FOR PRODUCING DIACID COMPOUNDS
Document Type and Number:
WIPO Patent Application WO/2012/071439
Kind Code:
A1
Abstract:
The present invention provides for a method of producing one or more fatty acid derived dicarboxylic acids in a genetically modified host cell which does not naturally produce the one or more fatty acid derived dicarboxylic acids. The invention provides for the biosynthesis of dicarboxylic acids ranging in length from C3 to C26. The host cell can be further modified to increase fatty acid production or export of the desired fatty acid derived compound, and/or decrease fatty acid storage or metabolism.

Inventors:
STEEN ERIC J (US)
FORTMAN JEFFREY (US)
DIETRICH JEFFREY (US)
KEASLING JAY D (US)
Application Number:
PCT/US2011/061900
Publication Date:
May 31, 2012
Filing Date:
November 22, 2011
Assignee:
UNIV CALIFORNIA (US)
STEEN ERIC J (US)
FORTMAN JEFFREY (US)
DIETRICH JEFFREY (US)
KEASLING JAY D (US)
International Classes:
C12P7/44
Domestic Patent References:
WO2009121066A1    2009-10-01
WO2006017577A2    2006-02-16
Foreign References:
US20080233628A1    2008-09-25
US20100022766A1    2010-01-28
US7063972B2    2006-06-20
Attorney, Agent or Firm:
LOCKYER, Jean, M. et al. (Two Embarcadero Center 8th Floo, San Francisco California, US)
Claims:
WHAT IS CLAIMED IS:

1. A recombinant cell that produces an omega-hydroxylated fatty acid or a dicarboxylic acid or both from endogenous fatty acids, wherein said cell comprises (i) a Type I fatty acid biosynthesis pathway and a fatty acid omega hydroxylase; (ii) a Type II fatty acid biosynthesis pathway and a fatty acid omega hydroxylase; (iii) a Type III fatty acid biosynthesis pathway and a fatty acid omega hydroxylase; (iv) a Type I polyketide synthase (PKS) pathway and a fatty acid omega hydroxylase; (v) a 2-keto acid biosynthesis pathway and a fatty acid omega hydroxylase; and (vi) a biotin biosynthesis pathway and a cytochrome P450 oxidase; wherein one or more of said fatty acid omega hydroxylase, cytochrome P450 oxidase, and biosynthesis pathway enzymes is encoded by a recombinant nucleic acid in said cell.

2. The recombinant cell of claim 1, wherein at least two of said fatty acid omega hydroxylase, cytochrome P450 oxidase, and biosynthesis pathway enzymes are encoded by a recombinant nucleic acid in said cell.

3. The recombinant cell of claim 1 or 2, wherein said fatty acid is produced by a Type I, II, or III fatty acid biosynthesis pathway and said fatty acid omega hydroxylase is encoded by a recombinant nucleic acid in said cell.

4. The recombinant cell of claim 1 or 2, wherein said fatty acid is produced by a Type I PKS pathway and said Type I PKS is encoded by a recombinant nucleic acid in said cell.

5. The recombinant host cell of claim 1 or 2, wherein said fatty acid is produced by a 2-keto acid biosynthesis pathway that includes mutated LeuA and KIVD encoded by a recombinant nucleic acid in said cell.

6. The recombinant host cell of claim 1 or 2, wherein said fatty acid is produced by a biotin biosynthesis pathway, at least one enzyme of which is encoded by a recombinant nucleic acid in said cell, and said cell produces pimelic acid.

7. The recombinant cell of any of claims 1 to 5 that produces an alpha, omega-dicarboxylic acid by conversion of said omega-hydroxylated fatty acid with a fatty acid oxidase and aldehyde dehydrogenase enzymes.

8. The recombinant cell of any of claims 1 to 5, wherein the fatty acid omega hydroxylase is selected from the group consisting of P450 (3P2), P450 (PHP3), and P450 BM3 (F87A).

9. The recombinant cell of any of claims 1 to 8, wherein the host cell has been genetically modified to reduce β-oxidation.

10. The recombinant cell of any of claims 6 to 9, wherein the dicarboxylic acid has a chain length from C3 to C26.

11. The recombinant cell of any of claims 1 to 10, wherein the cell further comprises a genetic modification selected from the group consisting of (i) a genetic modification that increases the expression of one or more genes involved in the production of fatty acid compounds; (ii) a genetic modification that decreases the expression of one or more genes encoding proteins involved in the storage or metabolism of fatty acid compounds; and (iii) a genetic modification that increases the expression of a dicarboxylic acid transporter.

12. The recombinant cell of any of claims 1 to 11 that, relative to a wild-type cell of identical cell type, produces a dicarboxylic acid not produced by the wild-type cell.

13. The recombinant cell of any of claims 1 to 11 that is a yeast cell.

14. The recombinant cell of claim 13 selected from the group consisting of Debaryomyces, Candida, Pichia, Saccharomyces, Schizosaccharomyces, and Yarrowia cells.

15. The recombinant cell of claim 14 that is a Saccharomyces cell.

16. The recombinant cell of claim 15 that is S. cerevisiae.

17. The recombinant cell of claim 16 that contains a recombinant biotin biosynthesis pathway and a recombinant BioI gene and produces pimelic acid.

18. The recombinant cell of claim 16 that contains a recombinant Type I fatty acid biosynthesis pathway and a recombinant fatty acid omega hydroxylase gene and produces adipic acid.

19. The recombinant cell of claim 18 wherein the recombinant Type I fatty acid biosynthesis pathway includes HexA and HexB genes.

Description:
HOST CELLS AND METHODS FOR PRODUCING DIACID COMPOUNDS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims benefit of priority to U.S. provisional application no. 61/416,287, filed November 22, 2010, which application is herein incorporated by reference.

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT

[0002] The invention described and claimed herein was made utilizing funds supplied by the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. The government has certain rights in this invention.

FIELD OF THE INVENTION

[0003] The present invention is in the field of production of dicarboxylic acids or diacid compounds derived from fatty acids, and in particular, host cells that are genetically modified to produce fatty acid-derived diacids.

BACKGROUND OF THE INVENTION

[0004] Aliphatic dioic acids, alcohols and compounds having combinations of alcohols and acids are versatile chemical intermediates useful as raw materials for the preparation of adhesives, fragrances, polyamides, polyesters, and antimicrobials. While chemical routes for the synthesis of long-chain α,ω-dicarboxylic acids are available, the synthesis is complicated and results in mixtures containing dicarboxylic acids of shorter chain lengths. As a result, extensive purification steps are necessary. Chemical synthesis is the preferred route of synthesis for these compounds today.

[0005] Picataggio reports conversion of dodecane (a C12 linear alkane) and tetradecane (a C14 linear alkane) or their corresponding fatty acids (dodecanoate, tetradecanoate) into their corresponding α,ω-dicarboxylic acids using the yeast Candida tropicalis (see, e.g., Picataggio, et al., Biotechnology 10:894-898, 1992). The method described is greatly disadvantaged by its reliance upon exogenous addition of C12 or C14 alkane, or C12 or C14 fatty acid; moreover, the method is disadvantaged by the inability of Candida to convert other, non-C12 or C14, fatty acid and alkane substrates into corresponding diacids. Thus, a method for the endogenous production of fatty acid substrates of desired chain length and subsequent omega oxidation of the substrates, producing the corresponding ω-hydroxy fatty acid or α,ω-dicarboxylic acid, would provide an economical, competitive route to valuable α,ω-dicarboxylic acids, ω-hydroxy fatty acids, diamines, etc., that has no precedent.

[0006] Thus, there remains a need for methods and materials for biocatalytic conversion of feedstock chemicals into their corresponding ω-hydroxy fatty acids and α,ω-dicarboxylic acids, methods for producing the ω-hydroxy fatty acid and α,ω-dicarboxylic acid in a fermentation broth, methods for controlling the ω-hydroxy fatty acid or α,ω-dicarboxylic acid or fatty acid chain length, methods for secreting the product from or retaining it in the cells, and methods for purifying the product from the culture broth. The present invention meets these needs.

SUMMARY OF THE INVENTION

[0007] The present invention provides recombinant host cells and related methods and materials for the biocatalytic production of α,ω-dicarboxylic acids, ω-hydroxy fatty acids, fatty acids (FA), or other fatty acid-derived molecules from fermentable carbon sources and provides a source of diacids for the production of renewable chemicals for use in applications, including making polyesters, resins, polyamides, nylon, fuel additives and fuels, lubricants, paints, varnishes, engineering plastics and the like.

[0008] The invention provides host cells and methods for producing fatty acids, ω-hydroxy fatty acids, α,ω-dicarboxylic acids, and related compounds with controlled chain lengths from inexpensive feedstocks, including cornstarch, cane sugar, glycerol, and other carbon sources. The invention also provides methods for making specific short and long chain fatty acids, diacids, and diols that have not previously been made by biosynthetic methods in microbial host cells.

[0009] In nature there exist multiple routes for microbial production of fatty acids of different chain lengths. The most abundant in nature are the fatty acid pathways, of which there are three primary systems: the Type I, Type II, and Type III fatty acid systems. Type I and Type III fatty acid systems often contain multiple enzymatic activities on a single polypeptide chain and are referred to as elongases for the Type III system. Generally, Type I and Type III systems generate specific chain length acyl-CoA molecules, which are normally transferred directly into the production of membrane lipids (phospholipids, glycerolipids, etc.) but can be hydrolyzed by a thioesterase to release the free fatty acid in engineered systems. Type II fatty acid systems are composed of single polypeptides that individually encode the multiple enzymatic activities required for fatty acid biosynthesis to generate a range of fatty acyl-ACPs that are normally transferred directly into the production of membrane lipids, but can be hydrolyzed by a thioesterase that recognizes specific chain length fatty acids.

[0010] Certain cells also make molecules called polyketides that contain aliphatic backbones similar to fatty acids. Certain of these polyketides are made by Type I polyketide synthases (PKSs). Type I PKSs are composed of catalytic modules that minimally contain an acyl carrier protein (ACP), an acyl transferase (AT), and a ketosynthase (KS), and in some instances also contain a ketoreductase (KR); a KR and a dehydratase (DH); or a KR, a DH, and an enoyl reductase (ER). Type I PKSs generally contain a thioesterase (TE) to cleave the product from the acyl-ACP thioester, unlike natural fatty acid systems that directly incorporate acyl-ACPs or acyl-CoAs through transfer reactions. In Type I PKSs, the starter molecule and the total number of extension modules dictate the length of the final product. The modular nature of Type I PKSs has made them amenable to engineering a variety of products not made by naturally occurring PKSs.
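The relationship in paragraph [0010] between starter unit, number of extension modules, and product length can be illustrated with a short calculation. The following is a minimal sketch and is not part of the application: it assumes that every extension module condenses one malonyl-CoA unit, which adds two carbons to the growing chain, and the function names `pks_chain_length` and `modules_needed` are hypothetical.

```python
def pks_chain_length(starter_carbons: int, extension_modules: int,
                     carbons_per_extension: int = 2) -> int:
    """Carbon count of the acyl chain released by the thioesterase.

    Assumption: each extension module adds `carbons_per_extension` carbons
    (two for a malonyl-CoA extender unit after decarboxylation).
    """
    if starter_carbons < 1 or extension_modules < 0:
        raise ValueError("starter needs >= 1 carbon and module count must be >= 0")
    return starter_carbons + carbons_per_extension * extension_modules


def modules_needed(target_carbons: int, starter_carbons: int,
                   carbons_per_extension: int = 2) -> int:
    """Number of extension modules required to reach a target chain length."""
    remaining = target_carbons - starter_carbons
    if remaining < 0 or remaining % carbons_per_extension:
        raise ValueError("target length is not reachable from this starter")
    return remaining // carbons_per_extension


if __name__ == "__main__":
    # Acetyl starter (2 carbons) plus two extension modules gives a C6 chain.
    print(pks_chain_length(2, 2))   # 6
    # A C12 chain from an acetyl starter needs five extension modules.
    print(modules_needed(12, 2))    # 5
```

Under the same assumption, a three-carbon starter such as propionyl gives odd chain lengths, which is consistent with the statement elsewhere in the description that both even and odd carbon numbers are contemplated.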

[0011] The enzymatic decarboxylation of a 2-keto acid substrate results in the formation of the corresponding aliphatic aldehyde; subsequent oxidation of the aldehyde to the corresponding carboxylic acid produces the corresponding fatty acid. All cells make a variety of 2-keto acids as intermediates in amino acid biosynthesis. Cells engineered for overexpression of native or engineered enzymes encoded by genes in the LeuABCD operon (for example, and without limitation, in E. coli) extend by a single carbon the 2-keto acid substrate 2-ketobutyrate into longer chain length 2-keto acids.

[0012] In accordance with the methods of this invention, engineered cells and recombinant vectors are provided in which Type I, II, III fatty acid synthases, Type I PKSs, and 2-ketoacid biosynthesis pathways, decarboxylases, and oxidases are genetically engineered to make free fatty acids of a specific chain length.

[0013] In various embodiments of the host cells and recombinant DNA vectors of the present invention, Type I PKS systems are engineered to produce a specific chain length fatty acid by choosing the appropriate number of modules that terminate with a thioesterase that cleaves the thioester bond and releases a free carboxylate. This thioesterase can be covalently attached to the PKS polypeptide, or expressed independently. In other embodiments of the present invention, a recombinant Type I fatty acid system is employed to produce fatty acid. The FA biosynthesis enzymes produce a specific chain length acyl-thioester, and a thioesterase is used to produce a fatty acid for subsequent oxidation to the dicarboxylic acid. In other embodiments of the invention, a Type II fatty acid system is employed to produce fatty acid, and thioesterases specific for desired chain lengths are employed to produce the desired chain length fatty acid product. In other embodiments of the invention, a Type III fatty acid system is employed to produce specific chain length fatty acid coenzyme A (CoA) esters, and a thioesterase is employed to cleave the specific chain length fatty acid from CoA. In other embodiments of the present invention, a Type I hybrid PKS system is employed to produce a desired chain length fatty acid, where the C-terminal PKS domain is a thioesterase that cleaves the fatty acid from the acyl carrier protein. In yet other embodiments of the present invention, a 2-keto acid pathway, 2-keto acid decarboxylase, and aldehyde dehydrogenase are used to produce a desired chain length fatty acid.

[0014] Numerous microbes can be employed for the production of fatty acid-derived chemicals in accordance with the methods of the invention. In various embodiments, the microbes have characteristics that allow them to produce higher levels of product. For example, in one embodiment, the host organism provided by the invention lacks or has reduced expression levels of, or has been modified for decreased activity of, enzymes catalyzing the degradation of specific chain length fatty acids. These enzyme activities include CoA-ligases (for example, and without limitation, FadD (E. coli), FAA1, FAA2, FAA3, FAA4 (S. cerevisiae), etc., as provided later) and enzymes necessary for beta oxidation of fatty acids (for example, and without limitation, POX1, POX2, IDP3, TES1, FOX3 (S. cerevisiae), etc., as provided later). In some embodiments of the present invention, diols are produced from fatty acids. In these embodiments, enzymes necessary for beta oxidation will be reduced, but CoA-ligases may be retained.

[0015] Because malonyl-CoA is an essential precursor to fatty acid synthesis, it is advantageous to upregulate malonyl-CoA biosynthesis. In various embodiments of the invention, the host organism has been engineered for increased expression of enzymes catalyzing production of malonyl-CoA. For example, and without limitation, the expression level of acetyl-CoA carboxylase is increased (the gene ACC1 (FAS3) in S. cerevisiae is included herein for reference).

[0016] Thus, the invention provides a variety of different engineered host organisms that exhibit improved production of fatty acids and the corresponding diacid products. In various embodiments of the invention, the host organisms have reduced expression of genes and/or their corresponding enzyme products associated with beta-oxidation of fatty acids, α,ω-dicarboxylic acids, and related products, and have increased expression of genes and/or their corresponding enzyme products associated with transporters of α,ω-dicarboxylic acids and related products. In this manner, the organism is deficient in its ability to degrade the final fatty acid or α,ω-dicarboxylic acid product and/or secretes product into the fermentation broth. Furthermore, the organism has been engineered for increased expression of genes and/or their corresponding enzyme products associated with biosynthesis of malonyl-CoA. In some embodiments, the methods of the invention are practiced with host cells in which the genes/enzymes that promote storage of fatty acids, and so impede the ability to achieve high production levels of a given fatty acid derived product, have been inactivated or engineered to reduce expression level/activity.

[0017] In some embodiments, the host organism is yeast. Yeast host cells suitable for practice of the methods of the invention include, but are not limited to, Yarrowia, Candida, Debaryomyces, Saccharomyces, Schizosaccharomyces and Pichia, including engineered strains provided by the invention. In one embodiment, the yeast host cell is a species of Candida, including but not limited to C. tropicalis, C. maltosa, C. apicola, C. paratropicalis, C. albicans, C. cloacae, C. guilliermondii, C. intermedia, C. lipolytica, C. parapsilosis and C. zeylanoides. In one embodiment, Candida tropicalis is the host organism.

[0018] In some embodiments the host is a bacterium. Bacterial host cells suitable for practice of the methods of the invention include, but are not limited to, Escherichia and Bacillus, including engineered strains provided by the invention. In one embodiment, the bacterial host cell is a species of Bacillus, including but not limited to B. subtilis, B. brevis, B. megaterium, B. aminovorans, and B. fusiformis. In one embodiment, B. subtilis is the host organism.

[0019] In the methods of the present invention, once a fatty acid of a desired chain length is produced, it is hydroxylated at the omega carbon to produce an ω-hydroxy fatty acid. In many embodiments, the hydroxylation is achieved by expressing a cytochrome P450 or monooxygenase that is specific for hydroxylation at the terminal (omega, ω-) carbon of a fatty acid (EC 1.14.15.3). ω-hydroxy fatty acids are themselves valuable and are used in the production of fast drying paints and varnishes, etc. As such, the methods of the invention provide that, if desired, the ω-hydroxy fatty acid can be isolated. Alternatively, the omega-hydroxy fatty acid is then, in accordance with the methods of the invention, further oxidized to an α,ω-dicarboxylic acid. In various embodiments, the route of oxidation is through an aldehyde, mediated by either a fatty alcohol dehydrogenase (FAD) or fatty alcohol oxidase (FAO) (these two terms are used interchangeably; EC 1.1.3.20 or 1.1.3.13). The aldehyde intermediate is then converted into a diacid by an aldehyde dehydrogenase (ADH; EC 1.2.1.3 or 1.2.1.4). In some embodiments, the P450 monooxygenase carrying out hydroxylation of the omega carbon is P450 BM3 from B. megaterium, either wild type or engineered for altered regiospecificity; for example, without limitation, introduction of mutation of phenylalanine 87 to alanine (mutation F87A) in P450 BM3 alters enzyme regiospecificity toward increased hydroxylation of fatty acid substrates at the omega position (Oliver et al., Biochemistry, 36: 1567-1572, 1997).

[0020] Figure 1 shows various biosynthetic reactions provided by the method of the invention. Using the Type I, II, or III fatty acid synthase by the method of the invention, a desired fatty acid is produced intracellularly, hydroxylated by a cytochrome P450 at the omega carbon, and then enzymatically oxidized to the α,ω-dicarboxylic acid. Using the Type I PKS by the method of the invention, a desired fatty acid is produced from the PKS system by appropriate selection of the acyl-CoA loading module, extension modules, and thioesterase; the resulting fatty acid is subsequently hydroxylated by a fatty acid omega hydroxylase (EC 1.14.15.3), and oxidized to the corresponding α,ω-dicarboxylic acid by an alcohol oxidase (EC 1.1.3.20 or 1.1.3.13) and aldehyde dehydrogenase (EC 1.2.1.3 or 1.2.1.4).
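The three-step omega oxidation described in paragraphs [0019] and [0020] can be written out as a small data structure. The sketch below is illustrative only: the EC numbers are those quoted in the text, while the class `Step`, the tuple `OMEGA_OXIDATION`, and the function `formulas_along_pathway` are hypothetical names, and the atom counts assume a saturated, unbranched chain.

```python
from typing import NamedTuple


class Step(NamedTuple):
    enzyme: str
    ec_numbers: tuple
    product: str


# Enzyme classes and EC numbers as quoted in the description.
OMEGA_OXIDATION = (
    Step("fatty acid omega hydroxylase", ("1.14.15.3",), "omega-hydroxy fatty acid"),
    Step("fatty alcohol oxidase", ("1.1.3.20", "1.1.3.13"), "omega-oxo fatty acid"),
    Step("aldehyde dehydrogenase", ("1.2.1.3", "1.2.1.4"), "alpha,omega-dicarboxylic acid"),
)


def formulas_along_pathway(n_carbons: int) -> dict:
    """(C, H, O) atom counts for each species, assuming a saturated linear chain."""
    n = n_carbons
    return {
        "fatty acid": (n, 2 * n, 2),                         # CnH2nO2
        "omega-hydroxy fatty acid": (n, 2 * n, 3),           # one O inserted
        "omega-oxo fatty acid": (n, 2 * n - 2, 3),           # two H removed
        "alpha,omega-dicarboxylic acid": (n, 2 * n - 2, 4),  # one O added
    }


if __name__ == "__main__":
    for step in OMEGA_OXIDATION:
        print(f"{step.enzyme} (EC {' or '.join(step.ec_numbers)}) -> {step.product}")
    # C14 example: tetradecanoic acid through to tetradecanedioic acid.
    for name, (c, h, o) in formulas_along_pathway(14).items():
        print(f"{name}: C{c}H{h}O{o}")
```

For the C14 case the final entry is C14H26O4, the composition of tetradecanedioic acid.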

[0021] Figure 4 shows a 2-keto acid-based pathway to production of the α,ω-dicarboxylic acid adipic acid. By the method of the invention, 2-ketoheptanoate is produced from the naturally occurring substrate 2-ketobutyrate in the host organism through the activity of enzymes encoded by the LeuABCD operon genes. 2-ketoheptanoate is subsequently decarboxylated to 1-hexanal through the activity of the KivD decarboxylase, oxidized to the fatty acid by aldehyde dehydrogenase (EC 1.2.1.3), to the ω-hydroxy fatty acid by a fatty acid omega hydroxylase (EC 1.14.15.3), and to the α,ω-dicarboxylic acid adipate by an alcohol oxidase (EC 1.1.3.20 or 1.1.3.13) and aldehyde dehydrogenase (EC 1.2.1.3 or 1.2.1.4).

[0022] Thus, the invention provides new pathways for making α,ω-dicarboxylic acids in modified host cells. While all yeast, E. coli, and Bacillus hosts have endogenous routes to production of the α,ω-dicarboxylic acid succinate, no other α,ω-dicarboxylic acids are produced in unmodified host cells. In one aspect, the present invention provides a method for producing one or more fatty acid-derived dicarboxylic acid compounds in a genetically modified host cell that does not naturally produce the α,ω-dicarboxylic acid compounds. For example, and without limitation, yeast and E. coli hosts do not make α,ω-dicarboxylic acids by the methods of this invention. Bacillus is not known to naturally produce any α,ω-dicarboxylic acids, except pimelic acid, by the methods of the invention; furthermore, the methods of the invention provide additional routes to other diacids in a Bacillus host.
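The carbon bookkeeping of the 2-keto acid route in paragraph [0021] can be checked with a few lines of arithmetic. This is a minimal sketch under two assumptions taken from the description: each pass through the LeuABCD-encoded enzymes lengthens the 2-keto acid by exactly one carbon, and decarboxylation removes exactly one carbon. The function name `keto_acid_route` and the returned keys are hypothetical.

```python
def keto_acid_route(target_diacid_carbons: int,
                    start_keto_acid_carbons: int = 4) -> dict:
    """Carbon counts along the 2-keto acid route to a diacid.

    Default start is 2-ketobutyrate (4 carbons), the substrate named in
    the description.
    """
    # Decarboxylation removes one carbon, so the 2-keto acid must be one
    # carbon longer than the aldehyde, fatty acid, and diacid derived from it.
    keto_acid_carbons = target_diacid_carbons + 1
    cycles = keto_acid_carbons - start_keto_acid_carbons
    if cycles < 0:
        raise ValueError("target is shorter than the starting 2-keto acid allows")
    return {
        "elongation_cycles": cycles,
        "keto_acid_carbons": keto_acid_carbons,
        "aldehyde_carbons": target_diacid_carbons,
        "fatty_acid_carbons": target_diacid_carbons,
        "diacid_carbons": target_diacid_carbons,
    }


if __name__ == "__main__":
    # Adipate (C6): 2-ketobutyrate (C4) -> 2-ketoheptanoate (C7) -> hexanal (C6).
    print(keto_acid_route(6))
```

For adipate the sketch gives three elongation cycles, matching the C4 to C7 extension from 2-ketobutyrate to 2-ketoheptanoate named in the text.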

[0023] In one aspect, the present invention provides methods for the biosynthesis of fatty acid derived α,ω-dicarboxylic acid compounds ranging in carbon length from C3 to C26, including both even and odd numbers of carbons. Such α,ω-dicarboxylic acid compounds include, but are not limited to, C3 diacids, C4 diacids, C5 diacids, C6 diacids, C7 diacids, C8 diacids, C9 diacids, C10 diacids, C11 diacids, C12 diacids, C13 diacids, C14 diacids, C15 diacids, C16 diacids, C17 diacids, C18 diacids, C19 diacids, C20 diacids, C21 diacids, C22 diacids, C23 diacids, C24 diacids, C25 diacids, and C26 diacids.
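The C3 to C26 range in paragraph [0023] can be enumerated programmatically. The following sketch is offered only as an illustration: it assumes saturated, unbranched diacids of the form HOOC-(CH2)k-COOH, uses standard atomic masses, and attaches common names only for C3 through C10; the function names are hypothetical.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Common names of some well-known linear diacids (not exhaustive).
COMMON_NAMES = {
    3: "malonic", 4: "succinic", 5: "glutaric", 6: "adipic",
    7: "pimelic", 8: "suberic", 9: "azelaic", 10: "sebacic",
}


def diacid_formula(n_carbons: int) -> dict:
    """Atom counts of a saturated linear alpha,omega-dicarboxylic acid, CnH(2n-2)O4."""
    if not 3 <= n_carbons <= 26:
        raise ValueError("chain length outside the C3 to C26 range in the text")
    return {"C": n_carbons, "H": 2 * n_carbons - 2, "O": 4}


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol from atom counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


if __name__ == "__main__":
    for n in range(3, 27):
        f = diacid_formula(n)
        name = COMMON_NAMES.get(n, "")
        print(f"C{n:<2} C{f['C']}H{f['H']}O{f['O']}  {molar_mass(f):7.2f} g/mol  {name}")
```

The loop prints 24 entries, one for each chain length from C3 to C26.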

[0024] In other embodiments of the invention, methods for producing ω-hydroxy fatty acids are provided. In these embodiments of the invention, appropriate selection of the P450 enables hydroxylation of the free fatty acid at the ω-position. For example, and without limitation, expression of native P450 BM3 results in mixed hydroxylation of numerous fatty acid substrates at the ω-1, ω-2, and ω-3 positions; introduction of the point mutation F87A into the P450 BM3 amino acid sequence imparts ω-hydroxylation regioselectivity when using various fatty acid substrates. As described in the preceding paragraph, such ω-hydroxy fatty acid compounds include, but are not limited to, C3 to C26 ω-hydroxy fatty acids.

[0025] One can modify the expression of a gene by a variety of methods in accordance with the methods of the invention. Those skilled in the art would recognize that increasing gene copy number, ribosome binding site strength, promoter strength, and various transcriptional regulators can be employed to alter an enzyme expression level. The present invention provides a method of producing a fatty acid derived α,ω-dicarboxylic acid compound in a genetically modified host cell that is modified by the increased expression of one or more genes involved in the production of fatty acid compounds, such that the production of fatty acid compounds by the host cell is increased. The invention also provides such genetically modified host cells. Such genes include, without limitation, those that encode the following enzymatic activities: acetyl CoA carboxylase, ketosynthase, ketoreductase, dehydratase, enoyl reductase, cytosolic thioesterase, and acyl-carrier protein. Illustrative genes that encode these enzymatic functions include acpP, acpS, accA, accB, accC, accD, fabD, fabH, fabG, fabZ, fabA, fabI, fabB, fabF (suitable copies of these genes may be obtained from, and without limitation, E. coli, B. subtilis), tesA, tesB (E. coli), yneP, ysmA, ykhA, yvaM, ylpC (B. subtilis), FAS1, FAS2, FAS3, ELO1, ELO2, ELO3 (S. cerevisiae), ELO1, ELO2, ELO3 (T. brucei, T. cruzi, L. major), fasA, fasB (C. glutamicum, B. ammoniagenes, C. ammoniagenes), FAS1 (Mycobacterium tuberculosis, Mycobacterium smegmatis), and hexA, hexB (A. flavus, A. parasiticus). In other embodiments, one increases transcriptional regulation of these genes. Suitable transcriptional regulators include fadR (suitable copies of these genes may be obtained from, and without limitation, E. coli or B. subtilis) and RAP1, ABF1, REB1, INO2, INO4 (S. cerevisiae).

[0026] The present invention also provides a method of producing a fatty acid derived α,ω-dicarboxylic acid compound in a genetically modified host cell that is modified by the decreased or lack of expression of one or more genes encoding proteins involved in the storage and/or metabolism of fatty acid compounds, such that the storage and/or metabolism of fatty acid compounds by the host cell is decreased. Such genes include, without limitation, the following: the acyl-CoA sterol transferases ARE1 (S. cerevisiae) and ARE2 (S. cerevisiae), the diacylglycerol acyl transferases DGA1 (S. cerevisiae) and LRO1 (S. cerevisiae), plsB, plsX (E. coli), yhfL, lcfA, des, plsX, cypC, andj/z T (B. subtilis) genes.

[0027] The present invention also provides methods and host cells that have been engineered to be capable of secreting or excreting the product into the media. In one embodiment, engineered host cells and methods are provided to make fatty acids that are secreted or excreted into the fermentation broth. In particular embodiments, these genetically modified host cells are modified by expression of one or more genes encoding proteins involved in the export of α,ω-dicarboxylic acid, fatty acid, or ω-hydroxy fatty acid compounds such that the product is moved from the interior of the cell to the exterior. Such genes include the following: DAL5, DIP5, JEN1 (S. cerevisiae), MAE1 (Schizosaccharomyces pombe), atoE, citT (B. subtilis), dcuB, dcuC (B. subtilis, A. succinogenes, E. coli), and various multidrug resistance pumps.

[0028] Once in the fermentation broth, the diacids and hydroxy acids can be separated and purified in accordance with the invention. In various embodiments of the invention, the microbe is engineered to secrete fatty acids, α,ω-dicarboxylic acids, or ω-hydroxy fatty acids, which are subsequently purified from the broth. In various embodiments of the invention, the products are purified through precipitation as calcium salts, or reactive extraction with tertiary amines. In various embodiments of the invention, the tertiary amines employed include, without limitation, tripropylamine, trioctylamine, or tridecylamine. In some embodiments of the invention, ion exchange is employed for further purification of the fatty acid, α,ω-dicarboxylic acid, or ω-hydroxy fatty acid.

[0029] In other embodiments, the host cells are not engineered or modified to secrete the product into the growth medium and the product accumulates in the host cell. In these embodiments, the diacid product is separated from the host cell in accordance with the invention by centrifugation or settling of the cell material, cell lysis, and subsequent purification of the diacid product as described above.

[0030] Thus, the present invention further provides for a wide variety of genetically modified host cells useful in practice of the methods of the present invention. In various embodiments, the host cell is genetically modified in any one of and any combination of the genetic modifications described herein.

[0031] The present invention further provides for an isolated dicarboxylic acid compound produced from the methods of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0032] The foregoing aspects and others will be readily appreciated by the skilled artisan from the following description of illustrative embodiments when read in conjunction with the accompanying drawings.

[0033] Figure 1 illustrates five of the general α,ω-dicarboxylic acid production methods of the present invention. A summary of the engineered pathways for production of fatty acids from Type I, Type II, and Type III FAS, Type I PKS, and 2-ketoacid synthesis is provided. Fatty acids of the desired chain length are produced using Type I, Type II, or Type III fatty acid synthase systems or Type I hybrid PKS systems, or decarboxylation and oxidation of 2-ketoacids. The resulting fatty acid is subsequently oxidized to the corresponding α,ω-dicarboxylic acid by a fatty acid omega hydroxylase (EC 1.14.15.3), fatty alcohol oxidase (1.1.3.20 or 1.1.3.13), and aldehyde dehydrogenase (1.2.1.3 or 1.2.1.4). Figure 1 depicts a progression that shows the flow of carbon from a feedstock such as sugar, through a FA node to form specific chain length FAs, which are then oxidized at the ω carbon to produce ω-hydroxy-FAs, ω-oxo-FAs, and finally, α,ω-dicarboxylic acids. Enzymes are italicized and major products are indicated in bold.

[0034] Figures 2A and 2B illustrate FT-MS analysis of strains producing diacids from co-cultures expressing LtesA and one of two P450s. Strain BM3 is an engineered E. coli host DH1 ΔfadD expressing P450-Bm3 and P450-Bm3 (F87A); strain LtesA-Bm3 is an engineered E. coli host DH1 ΔfadD expressing LtesA, P450-Bm3, and P450-Bm3 (F87A). Strains were analyzed for production of fatty acid, ω-hydroxyacid, and the α,ω-dicarboxylic acid. (See Example 1 below.) Figure 2A is compiled data from MS analysis to identify tetradecanoic acid, 13- or 14-hydroxy-tetradecanoic acid, and tetradecanedioic acid from cultures expressing P450 Bm3 alone or cultures coexpressing a thioesterase, LtesA, and P450 Bm3s. There is no detectable product for cultures expressing Bm3 alone, but there is production of tetradecanoic acid, 13- or 14-hydroxy-tetradecanoic acid, and tetradecanedioic acid in cultures expressing the P450 Bm3 and LtesA. Figure 2B is data showing the identification of the molecular ion for tetradecanedioic acid from the MS data for cultures expressing both the P450 Bm3 and LtesA.

[0035] Figure 3 illustrates the plasmids that were used in Example 1, below. E. coli DH1 ΔfadD was employed; cultures were grown for 24 h in TB media with 1 mM IPTG at 30°C, sampled, and analyzed for production of fatty acid, ω-hydroxyacid, and the α,ω-dicarboxylic acid.

[0036] Figure 4 illustrates use of a 2-ketoacid pathway for the production of an α,ω-dicarboxylic acid (e.g., adipate) in accordance with an embodiment of the invention. Similar short-chain α,ω-dicarboxylic acids can be produced by varying the 2-ketoacid overproduced in the host cell. Following decarboxylation of the 2-ketoacid with substrate promiscuous KivD decarboxylase, or other related decarboxylases (EC 4.1.1.X), the resulting fatty acid is subsequently oxidized to the corresponding α,ω-dicarboxylic acid by a fatty acid omega hydroxylase (EC 1.14.15.3), alcohol oxidase (1.1.3.20 or 1.1.3.13), and aldehyde dehydrogenase (1.2.1.3 or 1.2.1.4).

DETAILED DESCRIPTION OF THE INVENTION

[0037] Before the invention is described in detail, it is to be understood that, unless otherwise indicated, this invention is not limited to particular nucleic acids, expression vectors, enzymes, host microorganisms, or processes, as such may vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting.

[0038] As used in the specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to an "expression vector" includes a single expression vector as well as a plurality of expression vectors, either the same (e.g., the same operon) or different; reference to "cell" includes a single cell as well as a plurality of cells; and the like.

[0039] In this specification and in the claims that follow, reference will be made to a number of terms that shall be defined to have the following meanings.

[0040] The terms "optional" or "optionally" as used herein mean that the subsequently described feature or structure may or may not be present, or that the subsequently described event or circumstance may or may not occur, and that the description includes instances where a particular feature or structure is present and instances where the feature or structure is absent, or instances where the event or circumstance occurs and instances where it does not.

[0041] The terms "host cell" and "host microorganism" are used interchangeably herein to refer to a living biological cell that can be (or has been) transformed via insertion of an expression vector. Thus, a host organism or cell as described herein may be a prokaryotic organism (e.g., an organism of the kingdom Eubacteria) or a eukaryotic cell. As will be appreciated by one of ordinary skill in the art, a prokaryotic cell lacks a membrane-bound nucleus, while a eukaryotic cell has a membrane-bound nucleus.

[0042] As used herein, a "recombinant cell" or "recombinant host cell" refers to a host cell that has been genetically altered to comprise a heterologous nucleic acid sequence. Such a heterologous sequence may be: (i) an exogenous nucleic acid that is not native to the cell, e.g., an exogenous gene, an exogenous promoter, an optimized coding sequence, a mutated coding sequence; (ii) extra copies of an endogenous gene or promoter; (iii) or nucleic acids, e.g., a promoter operably linked to a coding region, that are heterologous to one another. It is understood that such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

[0043] The term "heterologous nucleic acid" or "heterologous DNA" as used herein refers to a polymer of nucleic acids wherein at least one of the following is true: (a) the sequence of nucleic acids is exogenous to (i.e., not naturally found in) a given host microorganism (b) the sequence may be naturally found in a given host microorganism, but in an unnatural (e.g., greater than expected) amount; or (c) the sequence of nucleic acids comprises two or more subsequences that are not found in the same relationship to each other in nature. For example, regarding instance (c), a heterologous nucleic acid sequence that is recombinantly produced will have two or more sequences from unrelated genes arranged to make a new functional nucleic acid. For example and without limitation, the present invention describes the introduction of an expression vector into a host microorganism, wherein the expression vector contains a nucleic acid sequence coding, e.g., a promoter and/or coding region, that is not normally found in a host microorganism. With reference to the host microorganism's genome, then, the nucleic acid sequence is heterologous.

[0044] The terms "expression vector" or "vector" refer to a nucleic acid compound and/or composition that transduces, transforms, or infects a host microorganism, thereby causing the cell to express nucleic acids and/or proteins other than those native to the cell, or in a manner not native to the cell. An "expression vector" contains a sequence of nucleic acids (ordinarily RNA or DNA) to be expressed by the host microorganism. Optionally, the expression vector also comprises materials to aid in achieving entry of the nucleic acid into the host microorganism, such as a virus, liposome, protein coating, or the like. The expression vectors contemplated for use in the present invention include those into which a nucleic acid sequence can be inserted, along with any preferred or required operational elements. Further, the expression vector must be one that can be transferred into a host microorganism and replicated therein. Preferred expression vectors are plasmids, particularly those with restriction sites that have been well documented and that contain the operational elements preferred or required for transcription of the nucleic acid sequence. Such plasmids, as well as other expression vectors, are well known to those of ordinary skill in the art.

[0045] The term "transduce" as used herein refers to the transfer of a sequence of nucleic acids into a host microorganism or cell. Only when the sequence of nucleic acids becomes stably replicated by the cell does the host microorganism or cell become "transformed." As will be appreciated by those of ordinary skill in the art, "transformation" may take place either by incorporation of the sequence of nucleic acids into the cellular genome, i.e., chromosomal integration, or by extrachromosomal integration. In contrast, an expression vector, e.g., a virus, is "infective" when it transduces a host microorganism, replicates, and (without the benefit of any complementary virus or vector) spreads progeny expression vectors, e.g., viruses, of the same type as the original transducing expression vector to other microorganisms, wherein the progeny expression vectors possess the same ability to reproduce.

[0046] The terms "isolated" or "biologically pure" refer to material that is substantially or essentially free of components that normally accompany it in its native state.

[0047] As used herein, the terms "nucleic acid sequence," "sequence of nucleic acids," and variations thereof shall be generic to polydeoxyribonucleotides (containing 2-deoxy-D-ribose), to polyribonucleotides (containing D-ribose), to any other type of polynucleotide that is an N-glycoside of a purine or pyrimidine base, and to other polymers containing nonnucleotidic backbones, provided that the polymers contain nucleobases in a configuration that allows for base pairing and base stacking, as found in DNA and RNA. As used herein, the symbols for nucleotides and polynucleotides are those recommended by the IUPAC-IUB Commission of Biochemical Nomenclature (Biochem. 9:4022, 1970).

[0048] The term "operably linked" refers to a functional linkage between a nucleic acid expression control sequence (such as a promoter) and a second nucleic acid sequence, wherein the expression control sequence directs transcription of the nucleic acid corresponding to the second sequence.

[0049] In some embodiments, the invention provides for a method for producing an α,ω-dicarboxylic acid in a genetically modified host cell, the method comprising: culturing a genetically modified host cell under a suitable condition to produce enzymes in a system to oxidize fatty acids to ω-hydroxy fatty acids and then to α,ω-dicarboxylic acids. In some embodiments, such a genetically modified host cell comprises a first enzyme that produces a fatty acyl-CoA (or acyl-ACP), a second optional enzyme that is a thioesterase, a fatty acid omega oxidase (EC 1.14.15.3) that oxidizes the fatty acid at the ω-carbon to produce an ω-hydroxy fatty acid, a second oxidase, i.e., a fatty alcohol dehydrogenase (FAD) or fatty alcohol oxidase (FAO) (these two terms are used interchangeably; EC 1.1.3.20 or 1.1.3.13), to oxidize the ω-hydroxy fatty acid to an aldehyde, and an aldehyde dehydrogenase (ADH; 1.2.1.3 or 1.2.1.4) to convert the aldehyde into an α,ω-dicarboxylic acid. Again, the methods of the invention are generally illustrated in Figure 1 and involve either Type I, II, or III fatty acid synthase systems, or a Type I PKS system or a 2-ketoacid system, wherein the carbon length of the output fatty acid is controlled. Figure 4 illustrates the extension of endogenously produced 2-ketobutyrate substrate to longer chain 2-ketoacids through the activity of enzymes encoded by the LeuABCD operon, subsequent decarboxylation to the fatty aldehyde, and oxidation to the fatty acid. In various embodiments, appropriately selected oxidizing enzymes perform omega oxidation on said fatty acid.

[0050] In some embodiments, the genetically modified host cell comprises a first nucleic acid construct encoding the first enzyme (i.e., the elongase), optionally a second nucleic acid construct encoding the thioesterase, a third nucleic acid encoding the fatty acid omega hydroxylase, a fourth nucleic acid encoding the FAD or FAO and a fifth nucleic acid encoding the ADH, and the culturing results in the expression of the elongase, (optionally) the thioesterase, the fatty acid omega hydroxylase, the FAD or FAO and the ADH.
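The enzyme system of paragraphs [0049] and [0050] can be written down as an ordered list of steps, which makes it easy to see which activities are mandatory and which intermediate each one yields. The following Python sketch is illustrative only: the `Step` class, the field names, and the helper functions are hypothetical and are not part of the specification; the enzyme names and EC numbers are those recited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One enzymatic step (hypothetical container, for illustration only)."""
    enzyme: str
    ec_numbers: tuple   # EC numbers recited in the text; empty if none is given
    product: str
    optional: bool = False


# Order follows paragraphs [0049]-[0050].
DIACID_PATHWAY = (
    Step("elongase", (), "fatty acyl-CoA (or acyl-ACP)"),
    Step("thioesterase", (), "fatty acid", optional=True),
    Step("fatty acid omega hydroxylase", ("1.14.15.3",), "omega-hydroxy fatty acid"),
    Step("FAD or FAO", ("1.1.3.20", "1.1.3.13"), "omega-oxo fatty acid (aldehyde)"),
    Step("ADH", ("1.2.1.3", "1.2.1.4"), "alpha,omega-dicarboxylic acid"),
)


def required_enzymes(pathway):
    """Return the names of the steps that are not marked optional."""
    return [step.enzyme for step in pathway if not step.optional]


def products(pathway):
    """Return the product of each step, in pathway order."""
    return [step.product for step in pathway]


if __name__ == "__main__":
    for step in DIACID_PATHWAY:
        ec = ", ".join(step.ec_numbers) or "no EC number recited"
        flag = " (optional)" if step.optional else ""
        print(f"{step.enzyme}{flag} [{ec}] -> {step.product}")
```

The thioesterase is flagged optional because the text describes it as such; the remaining four activities are required in every embodiment of this enzyme system.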

[0051] In some embodiments, the method further comprises the step of recovering the diacid produced, wherein the recovering step is concurrent or subsequent to the culturing step. Kurzrock et al. report on multiple purification strategies used for isolation of the microbially produced diacid succinate from fermentation broth (Kurzrock et al. Biotechnology Letters, 32:331-339, 2010); these methods, including precipitation with calcium hydroxide, calcium oxide, or ammonia; electrodialysis; reactive extraction with long chain aliphatic primary, secondary, or tertiary amines (for example, and without limitation, tripropylamine, trioctylamine, or tridecylamine) in organic solvent; and ion exchange are generally applicable for purification of fatty acid, α,ω-dicarboxylic acid, and ω-hydroxy fatty acid products. In various embodiments of the invention, the products are purified through precipitation as calcium salts, or reactive extraction with tertiary amines. In various embodiments of the invention, the tertiary amines employed include, without limitation, tripropylamine, trioctylamine, or tridecylamine. In some embodiments of the invention, ion exchange is employed for further purification of the fatty acid, α,ω-dicarboxylic acid, or ω-hydroxy fatty acid.

[0052] In various embodiments, the method comprises a method of genetically modifying a cell, e.g., a bacterial or yeast cell, to increase expression of one or more genes involved in the production of fatty acid compounds, such that the production of fatty acid compounds by the cell is increased. Such genes encode proteins such as acetyl-CoA carboxylase (ACC), cytosolic thioesterase (LtesA), a fatty acid synthase, and acyl-carrier protein (AcpP). In some embodiments, the genetically modified cell may be modified to produce higher levels of cytosolic acetyl-CoA and malonyl-CoA. Thus, in some embodiments a genetically modified cell may comprise a modification to express, or increase expression of, proteins such as ATP citrate lyase.

[0053] In various embodiments, the genetically modified host cell expresses an enzyme system for producing α,ω-dicarboxylic acids from a simple sugar substrate (for example, but not limited to, glucose, sucrose, xylose, arabinose; such sugars might be obtained from cornstarch, sugar cane, cellulosics, and waste biomass), wherein the enzyme system comprises: an elongase to produce a fatty-acyl-CoA-thioester of a desired chain length; a fatty acid omega hydroxylase (EC 1.14.15.3) to hydroxylate the fatty acid at the omega carbon to produce an ω-hydroxy fatty acid; an oxidase to oxidize the ω-hydroxy fatty acid to an aldehyde (EC 1.1.3.20 or 1.1.3.13); and an aldehyde dehydrogenase to produce the α,ω-dicarboxylic acid (EC 1.2.1.3 or 1.2.1.4); and wherein at least one of the enzymes is a recombinant enzyme encoded by one or more expression cassettes.

[0054] In various embodiments, the genetically modified host cell expresses an enzyme system for producing α,ω-dicarboxylic acids from a simple sugar substrate (for example, but not limited to, glucose, sucrose, xylose, arabinose; such sugars might be obtained from cornstarch, sugar cane, cellulosics, and waste biomass) wherein the enzyme system comprises: an elongase to produce a fatty-acyl-CoA-thioester of a desired chain length; a thioesterase that produces a fatty acid from the acyl-thioester; an oxidase to hydroxylate the fatty acid at the omega carbon to produce an ω-hydroxy fatty acid (EC 1.14.15.3); an oxidase to oxidize the ω-hydroxy fatty acid to an ω-oxo fatty acid (EC 1.1.3.20 or 1.1.3.13); and an aldehyde dehydrogenase to produce the α,ω-dicarboxylic acid (EC 1.2.1.3 or EC 1.2.1.4), wherein at least two of the enzymes are recombinant enzymes encoded by one or more expression cassettes.

Enzymes and constructs encoding thereof

[0055] As noted above, one of the advantages of the present invention is that it does not rely upon exogenous alkanes, fatty acids (FA), or hydroxy-FA supplementation or the ability of the microbe to produce enough substrate for conversion into diacids (or other products). Instead, in the methods of the present invention, the fatty acid starting material is also microbially produced. Numerous methods for microbially producing fatty acids are known to those of skill in the art and include those methods described in PCT International Publication Nos. WO 2007/136762, WO 2008/100251 and WO 2010/075483 as well as those methods described in U.S. Patent Application Publication No. US 2010/0170148, the teachings of all of which are incorporated herein by reference. Both Type I, II, and III FASs and Type I PKS are employed in various embodiments of the methods of the present invention.

[0056] While the invention provides modified host cells, methods and enzymes for the production of fatty acid molecules via five distinct routes, including without limitation Type I, II and III fatty acid, Type I PKS and 2-ketoacid biosynthetic pathways, we focus first on enzymes involved in fatty acid biosynthesis. Table 1, below, provides Type I, II, and III fatty acid synthases and elongases and other enzymes involved in the biosynthesis of fatty acids suitable for use or alteration in accordance with the methods and in the host cells of the invention.

[0057] In Table 1 below we provide suitable enzymes, without limitation, for performing the methods in accordance with the invention that are used to produce fatty acids via Type I, II or III fatty acid biosynthesis. In detail, the "enzyme" column provides both the gene, enzyme name and its accession number either in NCBI, Genbank, UniProt, or associated catalytic activity, unless the gene name is unavailable in which case only enzyme function and accession numbers are provided. The "modification" column describes the genetic modification in accordance with the invention; "OE" means overexpress and in some embodiments of the invention, where the host cell does not have an endogenous copy of the gene it is taken to mean the enzyme is expressed heterologously. In other embodiments, in which the host cell has an endogenous copy of the gene the gene product is overexpressed. Express and overexpress mean that enzyme levels and activity are increased compared to the wild-type case and those skilled in the art appreciate that this can be achieved by increasing the strength or changing the type of the promoter, increasing the strength of the ribosome binding site or Kozak sequence, increasing the stability of the mRNA transcript, altering the codon usage, and increasing the stability of the enzyme, etc. In the modification column, "decrease" means that the enzyme activity is decreased compared to the wildtype. Those skilled in the art will appreciate that decreasing enzyme activity compared to wildtype is achieved in a variety of ways in accordance with the methods of the invention and not limited to completely removing an enzyme by gene knockout, addition of an inhibitor compound that reduces or eliminates enzyme's activity, expression level is modulated such that total enzyme activity is decreased by weakening a promoter, ribosome binding site or Kozak sequence, by decreasing mRNA transcript stability or by increasing protein degradation. 
The "use" column indicates a more specific use of the enzyme without limitation in accordance with the methods of the invention and in some cases indicates the fatty acid chain length product. For example, the "hexA" enzyme is involved in producing a fatty acid chain six carbons in length and this is indicated by "C6". The "organism" column indicates suitable sources for the genes and enzymes and does not necessarily indicate the choice of host cells. Finally, superscripted numbers indicate relevant citations.

Table I. ENZYMES INVOLVED IN FATTY ACID SYNTHESIS

OE=overexpress; Organism= an illustrative, non-limiting organism that is a source of the gene/enzyme

Enzyme: fabI (enoyl reductase); Modification: OE; Use: N/A; Organism: B. subtilis, E. coli

[0058] Non-engineered cells typically produce a range of fatty acid chain lengths with varying degrees of saturation to maintain membrane fluidity, etc., and usually rely on acyl-transferases to move the fatty acid from the FAS into products that compose cell membranes like diacylglycerols and phospholipids. Under typical conditions, they do not utilize thioesterases as part of the fatty acid biosynthetic machinery, and in the case where a fatty acid thioesterase may be present in a naturally occurring organism's genome, it often contains a signal peptide that targets its expression to areas where fatty acid biosynthesis is not occurring. Therefore, to use the thioesterases in accordance with the methods of the invention, one skilled in the art will appreciate the requirement to express the thioesterase in the same location as where the fatty acids are produced or located. In detail, TesA, a thioesterase native to E. coli, has a leader peptide sequence that targets its expression to the periplasm, and to use this thioesterase in accordance with the methods of the invention, the sequence must be removed to target its activity to the cytosol (indicated by LTesA) in the case of E. coli, as that is the site of fatty acid biosynthesis. However, in general, and in the methods in accordance with this invention, thioesterases can be used for cleaving fatty acid moieties whenever the fatty acid is covalently attached via a thioester bond to an acyl-carrier protein, and this occurs in most Type II FAS proteins as well as in Type I PKS proteins, while Type I and III FAS proteins typically generate a CoA bound fatty acyl thioester. The distinction here is emphasized because in some cases the CoA thioester is labile and the fatty acid can be released without a thioesterase, yet in the other cases a TE is required to efficiently cleave the thioester bond and release the fatty acid. Although natural hydrolysis may occur in the case of Type I and III FAS proteins (whose composition determines chain length), the rates of hydrolysis in some embodiments are increased by expressing thioesterases. Another distinction is made between PKS TEs and FAS TEs, because often PKS TEs are incorporated into the PKS polypeptide at the C terminal domain, whereas often FAS TEs are separate proteins, although in the case of FASs this is not a rule. Because PKS TEs are typically this final domain, they will be discussed in a different section, but in some embodiments, a suitable TE is the DEBS TE from the erythromycin PKS pathway. TEs that have selectivity for cleaving fatty acyl-CoA or fatty acyl-ACP thioester bonds exist in nature and have been found in a variety of hosts, including but not limited to plants, bacteria and eukaryotes. In accordance with the methods of the invention, for Type I, II, and III fatty acid biosynthesis, a TE with appropriate fatty acid carbon chain length selectivity is chosen for a particular free fatty acid product. While Type I, II, and III FASs and Type I PKS enzymes utilize TEs, Table II below provides illustrative thioesterases suitably used for the fatty acid synthase systems herein and in accordance with the methods of the invention. To reiterate, in general, many thioesterases are available for use in connection with the fatty acid syntheses; a thioesterase may be used, for example, either to produce a fatty acid from an ACP bound fatty acyl-thioester or to produce a fatty acid from a CoA bound fatty acyl-thioester.

Typically with Type II FAS, a thioesterase is employed to cleave the ACP-bound fatty acid. With Type I & III FAS, the fatty-acyl-CoA-thioester is naturally hydrolyzed by water, thereby providing the fatty acid, and does not necessarily require a thioesterase.

[0059] Thioesterases suitable for use in accordance with the methods of the present invention include those set forth in Table II below. The "thioesterase" column includes the enzyme name and accession number (in most cases) in various forms; the "modification" column is defined as previously, including the operative definition of "OE" or overexpress; substrate specificity refers to the fatty acid chain length recognized by the thioesterase; the "organism" column contains an illustrative organism that is suitable for obtaining the genetic element / enzyme, but is not meant to be limiting.

Table II. Thioesterases

[0060] In addition to use of thioesterases in the fatty acid pathways described above, the PKS pathways also use thioesterases that in most cases are a part of the PKS peptide located at the C-terminus in accordance with the methods of the invention and are provided in the following. An alternative way of producing the fatty acid of a specific chain length in accordance with the invention is to employ a hybrid PKS. Exemplary modules are listed herein. To make a fully reduced fatty acid of a given chain length, one constructs, in accordance with the invention, a hybrid that contains a loading module (KS, AT, ACP), an extension module (KS, KR, DH, ER, ACP), and a thioesterase (TE). The choice of loading module and choice of extension modules that condense precursors like malonyl-CoA or methylmalonyl-CoA determines whether the fatty acid chain is even or odd carbon number, and the number of extension modules preceding the TE determines the overall chain length. To construct an odd-chain fatty acid PKS in accordance with the invention, one selects a loading module that incorporates propionate via methylmalonyl-CoA, and to construct an even-chain fatty acid PKS, one selects a loading module that incorporates acetate via malonyl-CoA. Another method of the invention involves selection of modules that incorporate longer chain acyl-CoA molecules like butyryl-CoA. Illustrative loading, extension, and thioesterase modules suitable for use in the methods, PKS, and host cells of the invention are provided in the following, where a PKS known to produce a specific compound is named followed by parentheticals that identify the source organism for the genetic material.
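The chain-length rule in paragraph [0060] reduces to simple arithmetic: the loading module fixes the starter unit (and hence the parity of the chain), and each malonyl-CoA extension module adds two carbons. The Python sketch below works this out under stated assumptions: only fully reducing malonyl-CoA extension modules are used, and the starter units contribute 2 (acetate), 3 (propionate), or 4 (butyrate) carbons. The function and variable names are hypothetical and serve only to illustrate the calculation.

```python
# Carbons contributed by the starter unit selected by the loading module.
# Assumption for this sketch: acetate = C2, propionate = C3, butyrate = C4.
STARTER_CARBONS = {"acetate": 2, "propionate": 3, "butyrate": 4}

# Each malonyl-CoA extension module lengthens the chain by two carbons.
CARBONS_PER_EXTENSION = 2


def pks_chain_length(starter, n_extension_modules):
    """Carbon count of the fatty acid released by the thioesterase."""
    if starter not in STARTER_CARBONS:
        raise ValueError(f"unknown starter unit: {starter}")
    if n_extension_modules < 0:
        raise ValueError("number of extension modules must be non-negative")
    return STARTER_CARBONS[starter] + CARBONS_PER_EXTENSION * n_extension_modules


def extension_modules_needed(starter, target_length):
    """Number of extension modules needed to reach target_length carbons."""
    remaining = target_length - STARTER_CARBONS[starter]
    if remaining < 0 or remaining % CARBONS_PER_EXTENSION != 0:
        raise ValueError(
            f"a C{target_length} chain cannot be built from a {starter} starter"
        )
    return remaining // CARBONS_PER_EXTENSION


if __name__ == "__main__":
    print(pks_chain_length("acetate", 5))              # even-chain product
    print(extension_modules_needed("propionate", 13))  # odd-chain product
```

Under these assumptions an acetate starter can only yield even-chain products and a propionate starter only odd-chain products, which is why a request such as a C13 chain from an acetate starter raises an error.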

[0061] Non-limiting examples of loading modules for malonyl-CoA are provided as follows: Niddamycin PKS (S. caelestis), Amphotericin PKS (Streptomyces nodosus), Concanamycin A PKS (Streptomyces neyagawaensis), Epothilone PKS (Sorangium cellulosum), Mycolactone PKS (Mycobacterium ulcerans), Nanchangmycin PKS (Streptomyces nanchangensis), Nystatin PKS (Streptomyces noursei), Oleandomycin PKS (Streptomyces antibioticus), Oligomycin (Streptomyces avermitilis), Pimaricin PKS (Streptomyces natalensis), Pyoluteorin PKS (Pseudomonas fluorescens), Stigmatellin PKS (Stigmatella aurantiaca).

[0062] Non-limiting examples of loading modules for methylmalonyl-CoA are provided as follows and are used in accordance with the methods of the invention to load an odd-carbon number onto the PKS, but do not necessarily require the final fatty acid product to be an odd-carbon number as described previously: Megalomicin PKS (Micromonospora megalomicea), Methymycin PKS (Streptomyces venezuelae), Monensin PKS (Streptomyces cinnamonensis), Narbomycin PKS (Streptomyces venezuelae), Neomethymycin PKS (Streptomyces venezuelae), Pikromycin (Streptomyces venezuelae), Spinosad PKS (Saccharopolyspora spinosa), Tylactone PKS (Streptomyces fradiae).

[0063] An illustrative example of a loading module for propionyl-CoA is from the erythromycin PKS (Saccharopolyspora erythraea).

[0064] Non-limiting extension modules that incorporate malonyl-CoA via condensation, increase the chain length by two carbons, and fully reduce the acyl chain are provided as follows, where "M" and the number indicate the module number within the PKS post loading module, such that "M1" would directly follow a loading module in a given PKS sequence: Nystatin PKS M5, M15 (S. caelestis); Amphotericin PKS M5, M16 (Streptomyces nodosus); Mycolactone PKS M9 (Mycobacterium ulcerans); Nanchangmycin PKS M6, M8 (Streptomyces nanchangensis); Oleandomycin PKS M3 (Streptomyces antibioticus); Stigmatellin PKS M5 (Stigmatella aurantiaca); Soraphen PKS M2, M3, M5 (Sorangium cellulosum); Monensin PKS M6, M8 (Streptomyces cinnamonensis); Spinosad PKS M2 (Saccharopolyspora spinosa); Herbimycin A PKS M6 (S. hygroscopicus); FR008 PKS M19 (Streptomyces sp. FR-008).

[0065] An illustrative example of a thioesterase from the erythromycin PKS (Saccharopolyspora erythraea), which is sometimes referred to as the DEBS TE, is suitable for cleaving the fatty acyl ACP thioester bond produced via PKSs and results in production of a specific chain length fatty acid.

[0066] Four major routes to producing a desired chain length fatty acid via Types I, II, or III FASs or Type I PKSs have been described in detail. A fifth route, described below, is through 2-ketoacid intermediates as in a portion of the leucine biosynthetic pathway.

[0067] Thus, in another aspect, the present invention provides methods for producing fatty acids, ω-hydroxyacids, and α,ω-dicarboxylic acids (e.g., adipic acid), using elements of amino acid biosynthetic pathways (2-ketoacid system). Normally cells do not produce fatty acids, ω-hydroxyacids, or α,ω-dicarboxylic acids from amino acid pathways. In non-engineered cells, pyruvate in the tricarboxylic acid cycle is converted into oxaloacetate, which is then converted through multiple steps into L-threonine, and then into 2-ketobutyrate via a threonine deaminase. 2-ketobutyrate is normally a substrate in cells for producing isoleucine and leucine, but has also been demonstrated to be a suitable substrate for elongation in one-carbon increments with an engineered LeuA enzyme and native LeuBCD enzymes to produce the 2-ketovalerate (C5), 2-ketocaproate (C6), and 2-ketoheptanoate (C7) intermediates. Suitable mutations of the LeuA enzyme include, but are not limited to, G462D, S139G, H97A, N167A. These ketoacid intermediates are then decarboxylated by a promiscuous enzyme, Kivd (from Lactococcus lactis) to form an aldehyde (see, Zhang, et al., "Expanding metabolism for biosynthesis of non-natural alcohols," Proc Natl Acad Sci USA 105, 20653-20658 (2008), the teachings of which are incorporated herein by reference). Suitable mutations of the KIVD enzyme include, but are not limited to, V461A, F381L.
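The carbon bookkeeping of the 2-ketoacid route can be made explicit. Starting from 2-ketobutyrate (C4), each LeuABCD elongation cycle adds one carbon, and decarboxylation by Kivd removes one carbon as CO2, so the resulting aldehyde (and any acid or diacid derived from it by oxidation without further chain changes) has one carbon fewer than the 2-ketoacid. The Python sketch below is a worked example under those assumptions; the function names are hypothetical.

```python
# 2-ketoacid intermediates named in the text, keyed by carbon count.
KETOACID_NAMES = {
    4: "2-ketobutyrate",
    5: "2-ketovalerate",
    6: "2-ketocaproate",
    7: "2-ketoheptanoate",
}

START_CARBONS = 4  # 2-ketobutyrate is the starting substrate


def ketoacid_after(cycles):
    """Carbon count and name of the 2-ketoacid after `cycles` elongations."""
    if cycles < 0:
        raise ValueError("number of elongation cycles must be non-negative")
    carbons = START_CARBONS + cycles  # one carbon added per LeuABCD cycle
    return carbons, KETOACID_NAMES.get(carbons, f"C{carbons} 2-ketoacid")


def product_carbons(cycles):
    """Carbons in the aldehyde (and downstream acid) after decarboxylation."""
    carbons, _ = ketoacid_after(cycles)
    return carbons - 1  # decarboxylation releases one carbon as CO2


def cycles_for_product(target_carbons):
    """Elongation cycles needed for a product with `target_carbons` carbons."""
    cycles = target_carbons + 1 - START_CARBONS
    if cycles < 0:
        raise ValueError("target is shorter than the unelongated product (C3)")
    return cycles


if __name__ == "__main__":
    n = cycles_for_product(6)  # adipic acid is a C6 diacid
    print(n, ketoacid_after(n))
```

For adipic acid (C6), the calculation gives three elongation cycles, i.e., 2-ketoheptanoate (C7) is the intermediate that must be decarboxylated.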

[0068] In accordance with the teaching of the methods of this invention, we convert the fatty aldehyde into a fatty acid by oxidation or expression of an aldehyde dehydrogenase (EC 1.2.1.3). In other embodiments of the invention, we produce a fatty alcohol by expressing an alcohol dehydrogenase (ADH) that converts the aldehyde into an alcohol, which serves as a substrate for oxidation to the fatty acid by methods described elsewhere. In some embodiments, a suitable, non-limiting alcohol dehydrogenase is ADH6 (S. cerevisiae). The pathway utilized in this embodiment of the invention is illustrated in Figure 4. Examples of short chain aldehyde dehydrogenases (ALDs) have been described and suitable enzymes are listed below. Further oxidation of the omega-carbon is necessary, once the short chain fatty acid is produced, and is achieved through omega oxidation described in a following section.

[0069] To produce a fatty acid from the 2-ketoacid pathway, the supply of the keto acid (e.g., 2-ketobutyrate) is important as a precursor to the pathway; thus, in some embodiments the invention provides cells that are engineered to produce appropriate substrate levels by overexpressing genes that encode enzymes in the pathway. The present invention provides a number of ways to supply or increase the supply of 2-ketobutyrate, which ultimately increases the fatty acid, ω-hydroxy fatty acid, or α,ω-dicarboxylic acid product; these include, but are not limited to: threonine degradation pathways, isoleucine biosynthesis pathways (via citramalate synthase and 2-methylmalate), glutamate pathways (via 2-methylaspartate and 2-methyloxaloacetate) or γ-elimination of O-phosphohomoserine and O-acetyl-homoserine. Other host cells and methods of the invention exploit prevention of transamination of ketoacids by deletion or attenuation of various genes including, e.g., ilvE, tyrB, etc.

[0070] Here, we provide enzymes without limitation in accordance with the methods of the invention for producing fatty aldehydes, fatty acids, ω-hydroxy fatty acids, and diacids from 2-ketoacid precursors. First, we provide the enzymes involved in elongating the 2-ketoacid precursor, 2-ketobutanoate, to 2-ketovalerate, 2-ketocaproate, and 2-ketoheptanoate, and decarboxylating these precursors to fatty aldehydes as provided below in Table III, with column definitions as previously described, but with the "EC number" that provides the biochemical reaction associated with the provided enzymes.

Table III: 2-ketoacid enzymes

[0071] Having provided the enzymes for production of fatty aldehydes from 2-ketoacid pathways in accordance with the methods of the invention, we now provide without limitation enzymes for the conversion of the fatty aldehydes into fatty acids by expressing aldehyde dehydrogenase (ALD) enzymes performing biochemistry described by EC 1.2.1.3 and shown in Figure 4. In general, many ALDs exist, but here we provide suitable, non-limiting examples by enzyme name and, in parentheses, an illustrative source organism for the enzyme and any associated specificity as a carbon chain length range (e.g., C4-C14): Ald1 (Acinetobacter sp. M1; C4-C14); ScAld1 (Mus musculus; C6-C9); Psdr1 (Homo sapiens; C2-C12); ALD4 (S. cerevisiae; C2-C12), etc. In accordance with the methods of the invention, we have now provided five routes to produce fatty acids with specific carbon chain lengths via the Type I, II, and III FASs, Type I PKSs, and the 2-ketoacid biosynthetic enzymes. In the following, we provide enzymes for the production of ω-hydroxyacids and α,ω-dicarboxylic acids from these fatty acid precursors, but next we describe production of other valuable chemicals from the 2-ketoacid pathways.
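Because each ALD in paragraph [0071] is listed with a carbon chain length range, choosing candidates for a given aldehyde amounts to a range lookup. The Python sketch below transcribes the four listed enzymes (with the final character of "Ald1", "ScAld1", and "Psdr1" read as the digit 1, which is an assumption about the source text) and filters them by chain length. The data layout and the function name are hypothetical, and the ranges are taken as inclusive.

```python
# (enzyme, illustrative source organism, shortest chain, longest chain)
# Transcribed from paragraph [0071]; ranges are treated as inclusive.
ALDEHYDE_DEHYDROGENASES = [
    ("Ald1", "Acinetobacter sp.", 4, 14),
    ("ScAld1", "Mus musculus", 6, 9),
    ("Psdr1", "Homo sapiens", 2, 12),
    ("ALD4", "S. cerevisiae", 2, 12),
]


def ald_candidates(chain_length):
    """Names of listed ALDs whose stated range covers chain_length carbons."""
    return [
        name
        for name, _organism, shortest, longest in ALDEHYDE_DEHYDROGENASES
        if shortest <= chain_length <= longest
    ]


if __name__ == "__main__":
    # C6 aldehyde, e.g., the intermediate on the route to adipate
    print(ald_candidates(6))
    # C14 aldehyde
    print(ald_candidates(14))
```

A C6 aldehyde falls within the stated range of all four listed enzymes, whereas a C14 aldehyde is covered only by Ald1; the list is non-limiting, so an empty result means only that none of these four examples applies.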

[0072] In addition to the α,ω-dicarboxylic acids, ω-hydroxy acids, diols and shorter chain monoacids are synthesized using the 2-keto acid pathway in accordance with other embodiments of the invention. For instance, in one embodiment, production of a monoacid is achieved by eliminating reactions EC 1.14.15.3 and EC 1.1.3.20 from the pathway illustrated in Figure 4. In another embodiment, production of an ω-hydroxyacid is achieved by eliminating reaction EC 1.1.3.20 from the pathway illustrated in Figure 4. In yet another embodiment, production of a diol is achieved by replacing the ALD (aldehyde dehydrogenase) EC 1.2.1.3 with an aldehyde reductase EC 1.1.1.21 in the pathway illustrated in Figure 4.

[0073] The methods of the present invention involve the use of an oxidase to hydroxylate the fatty acid at the omega carbon to produce an ω-hydroxy fatty acid. As discussed in the background section above, the oxidation by Candida tropicalis of C12 and C14 fatty acids to the corresponding C12 and C14 ω-hydroxy fatty acids is described, for example, by Picataggio, et al. (Biotechnology 10:894-898, 1992) and in U.S. Patent Nos. 7,405,063; 7,160,708; 7,109,009; 7,063,972; 7,049,112; 6,790,640; and 6,331,420 as well as in PCT International Publication No. WO 2004/013336, the teachings of all of which are incorporated herein by reference. In one embodiment, productivity of the ω-oxidation is enhanced by amplification of both the cytochrome P450 monooxygenase and NADPH- or NADH-cytochrome reductase genes or by using highly active promoters with such genes.

[0074] Once the fatty acid of a desired chain length is produced by one of the five routes, it is hydroxylated in accordance with the invention at the omega carbon, producing an ω-hydroxyfatty acid. ω-Hydroxyfatty acids themselves are valuable and are used in the production of rapidly drying paints and varnishes, etc.

[0075] Here, we provide, without limitation and in accordance with the methods of the invention, enzymes that are suitable for overexpression in the provided host cells and that hydroxylate the omega carbon to produce ω-hydroxy fatty acids, i.e., omega hydroxy fatty acids. The EC number describing the biochemical reaction that converts a fatty acid into an ω-hydroxy fatty acid is EC 1.14.15.3. We provide non-limiting examples of suitable enzymes by their name and, in parentheses, an illustrative organism from which to source the genetic material, followed by fatty acid chain length specificity, where C3-C10 indicates activity on fatty acids ranging in chain length from three to ten carbons, e.g., "Enzyme Name" ("Source Organism"; Chain length specificity). Any superscripts indicate references that describe the enzyme. Suitable enzymes for performing omega hydroxylation are as follows: P450alk1 (C. tropicalis; C12-C16) 2,9,10; CPR (C. tropicalis); P450 (3P2) (chimeric enzyme; C6-C12) 11; P450 (pHP3) (Rabbit; C6-C12) 11; P450 (P. oleovorans; C8-C12); P450 BM3 (B. megaterium; C12-C18); CYP86A8 (A. thaliana; C12-C18); CYP703A1 (Petunia x hybrida; C12); CYP704B2 (O. sativa ssp. japonica; C18); CYP4V2 (H. sapiens; C12-C16); CYP4B (H. sapiens; C7-C10) 12; CYP4A (H. sapiens; C10-C16) 12; and CYP4F (H. sapiens; C16-C26) 12. Although here and in all embodiments for α,ω-dicarboxylic acid production we employ an omega hydroxylase, in some embodiments hydroxylating other carbons within the fatty acid or diacid backbone may be useful and can be accomplished by hydroxylases in general. In one embodiment, a P450 BM3 that has an F87A mutation is used to change the regiospecificity of hydroxylation; this mutant hydroxylates at the ω-1, ω-2, and ω-3 positions.

[0076] It will be readily apparent to those of skill in the art in view of this disclosure that ω-hydroxyfatty acids are themselves valuable and used in the production of rapidly drying paints and varnishes, etc. As such, if of interest, the ω-hydroxyfatty acids can be isolated or recovered in accordance with the invention.

[0077] The ω-hydroxyfatty acids can be, in other embodiments, further oxidized to an α,ω-dicarboxylic acid using methods described herein. We provide fatty alcohol oxidases (FAOs) or fatty aldehyde dehydrogenases (FADs) to convert an omega-hydroxy fatty acid into an omega-oxo fatty acid in accordance with the biochemical reaction EC 1.1.3.20. In one embodiment, the fatty alcohol oxidase provided is FAO1, FAO2a or FAO2b from Candida tropicalis. In one embodiment, the route of oxidation is through an aldehyde using an FAO.

[0078] The aldehyde intermediate is then converted, in accordance with the invention, into a diacid by an aldehyde dehydrogenase (ALD). In general, many ALDs exist. The following are non-limiting examples of suitable enzymes by enzyme name and, in parentheses, an illustrative source organism for the enzyme and any associated specificity as a carbon chain length range (e.g., C4-C14): Ald1 (Acinetobacter sp. M1; C4-C14); ScAld1 (Mus musculus; C6-C9); Psdr1 (Homo sapiens; C2-C12); ALD1, ALD2, ALD3, ALD4, ALD5, ALD6, HFD1 (S. cerevisiae; C2-C12).
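Because each enzyme above is listed with a chain-length range, choosing candidates for a target fatty acid reduces to a range lookup. The following is a minimal sketch using a subset of the omega hydroxylases and ranges listed in paragraph [0075]; the `Hydroxylase` type and `candidates` function are hypothetical names introduced for illustration, and the same approach would apply to the ALD list.

```python
from typing import NamedTuple


class Hydroxylase(NamedTuple):
    name: str
    source: str
    min_c: int  # shortest chain length listed in the text
    max_c: int  # longest chain length listed in the text


# Subset of the enzymes and ranges given in paragraph [0075].
HYDROXYLASES = [
    Hydroxylase("P450 BM3", "B. megaterium", 12, 18),
    Hydroxylase("CYP86A8", "A. thaliana", 12, 18),
    Hydroxylase("CYP4V2", "H. sapiens", 12, 16),
    Hydroxylase("CYP4B", "H. sapiens", 7, 10),
    Hydroxylase("CYP4A", "H. sapiens", 10, 16),
    Hydroxylase("CYP4F", "H. sapiens", 16, 26),
]


def candidates(chain_length: int) -> list[str]:
    """Names of listed hydroxylases whose stated range covers the chain length."""
    return [h.name for h in HYDROXYLASES if h.min_c <= chain_length <= h.max_c]


if __name__ == "__main__":
    for n in (7, 12, 16, 26):
        print(f"C{n}:", candidates(n))
```

A stated range indicates reported activity only; it says nothing about relative activity within the range, so the lookup narrows the choice rather than ranking it.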

[0079] Most cells naturally have the capacity to degrade fatty acids, hydroxyl fatty acids and diacids to some extent through enzymatic activities associated with the β-oxidation pathway. Briefly, the pathway functions in most cases by activating free fatty acid groups to CoA thioesters with acyl-CoA ligases; the thioesters are further oxidized and degraded, proceeding through a 2,3-enoyl-CoA, 3-hydroxyacyl-CoA, 3-ketoacyl-CoA, and then to a two-carbon-shortened acyl-CoA that repeats the cycle. The enzymatic activity required for this degradation is known. In accordance with the methods of this invention, we provide cells that have reduced or eliminated degradation pathways for fatty acids, hydroxyl fatty acids, and diacids compared to their wild-type counterparts. In some embodiments, the host organism is engineered in accordance with the invention to remove or attenuate genes encoding fatty acyl-CoA synthetase enzymes. In other embodiments, the host organism is engineered to remove or attenuate genes encoding acyl-CoA dehydrogenases. Methods for making host cells that are substantially β-oxidation pathway blocked are known to those of skill in the art. Here, and in accordance with the methods of the invention, we provide without limitation illustrative enzymes involved in fatty acid degradation that are removed or attenuated to increase fatty acid, hydroxyl fatty acid, or diacid production in an engineered host. In detail, we provide the enzyme name and, in parentheses, its function. Superscripts provide references for certain enzymes. We provide host cells with the following enzymes removed or attenuated in S. cerevisiae or related yeasts to increase fatty acid, diacid or hydroxyl fatty acid production: ANT1 (adenine nucleotide transporter); POX2 (3-hydroxyacyl-CoA dehydrogenase); IDP3 (isocitrate dehydrogenase); POX1 (acyl-CoA oxidase); FOX3 (oxoacyl thiolase); EHD3 (hydrolase); PAS1 and PAS2 (peroxisomal formation protein); FAA1, FAA2, FAA3, and FAA4 (acyl-CoA synthetase). We provide host cells with the following enzymes removed or attenuated in E. coli: FadD and FadK (acyl-CoA synthetase); FadE and YdiO (acyl-CoA dehydrogenase); FadB, FadJ, and PaaZ (enoyl-CoA hydratase / hydroxyacyl dehydrogenase); FadA (3-ketoacyl thiolase); FadI (acetyl-CoA acyltransferase). We provide host cells with the following enzymes removed or attenuated in B. subtilis or related bacteria to increase fatty acid, dicarboxylic acid or hydroxy fatty acid production: YhfT, YhfL, LcfA, YdaB, YtcL, and BioW (acyl-CoA synthetase); YdbM, YngJ, mmgC, acdA, and FadE (acyl-CoA dehydrogenase); YngF, YsiB, YhaR, and fadN (enoyl-CoA hydratase).
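The gene lists in paragraph [0079] are grouped by enzymatic function, so a strain design that blocks particular β-oxidation steps can be expressed as a lookup from function to genes. The sketch below covers only part of the E. coli list from the text; `ECOLI_TARGETS` and `knockout_plan` are hypothetical names, and the code makes no claim about which combination of deletions is sufficient in practice.

```python
# Subset of the E. coli enzymes named in the text, keyed by the stated function.
ECOLI_TARGETS = {
    "acyl-CoA synthetase": ["FadD", "FadK"],
    "acyl-CoA dehydrogenase": ["FadE", "YdiO"],
    "enoyl-CoA hydratase / hydroxyacyl dehydrogenase": ["FadB", "FadJ", "PaaZ"],
    "3-ketoacyl thiolase": ["FadA"],
}


def knockout_plan(blocked_steps: list[str]) -> list[str]:
    """Collect, in order, the genes to remove or attenuate for the chosen steps."""
    plan: list[str] = []
    for step in blocked_steps:
        # A KeyError here signals a step that is not in the table.
        plan.extend(ECOLI_TARGETS[step])
    return plan


if __name__ == "__main__":
    # The two embodiments called out in the text: synthetases and dehydrogenases.
    print(knockout_plan(["acyl-CoA synthetase", "acyl-CoA dehydrogenase"]))
```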

[0080] In addition, the host cell is, in some embodiments of the invention, genetically modified so that it has decreased or lacks expression of one or more genes encoding proteins involved in the storage and/or metabolism of fatty acid compounds, such that the storage and/or metabolism of fatty acid compounds by the host cell is decreased. Such genes include the following: the ARE1, ARE2, DGA1, and LRO1 genes. Other engineered host cells with genes that are modified are provided in accordance with the methods of the invention and include those set forth in Table IV. The "enzyme" column provides the name of the enzyme to be modified; the "manipulation" column provides the modification to the enzyme and is either "attenuate" or "OE". Here "attenuate" means either decreasing the enzyme activity or completely eliminating it; "OE" means overexpress. The superscripts refer to references.

Table IV. Genes Involved in Storage/Metabolism of FA Compounds

Enzyme        Manipulation   Organism
ACL1          OE             A. nidulans
ALD6          OE             S. cerevisiae
ACS I         OE             S. enterica
MAE1          OE             S. cerevisiae
GLC3          attenuate      S. cerevisiae
GLG1, GLG2    attenuate      S. cerevisiae

[0081] In general, cells do not naturally biosynthesize odd-chain α,ω-dicarboxylic acids. However, an odd-chain α,ω-dicarboxylic acid does appear in biotin biosynthetic pathways, usually as an intermediate bound to the acyl carrier protein of fatty acid biosynthesis. Biotin biosynthesis is found in some, but not all, organisms; organisms lacking it are auxotrophic for biotin. The bound intermediate is the C7 α,ω-dicarboxylic acid, pimelic acid, which remains bound as pimeloyl-ACP until it is condensed with alanine and thereby sequestered into the production of biotin. Until recently, the precise mechanism of pimeloyl-ACP formation remained elusive. The E. coli pathway to this intermediate has now been reported to proceed by methylation of malonyl-CoA by a methyltransferase, BioC, followed by condensation with malonyl-ACP by fabH to form 3-oxo-glutaryl-ACP methyl ester, followed by two full reduction-dehydration cycles and one extension by the fatty acid synthases (fabG, fabZ, fabI, fabB). The methyl ester is then converted into the free ω-carboxylic acid, pimeloyl-ACP, by activity conferred by bioH (see Lin S, et al. Nature Chemical Biology 6, 682-688 (2010)). A different pathway for biotin biosynthesis in B. subtilis has been suggested to employ a P450 (BioI, CYP107H1) that in vitro is reported to cleave a carbon-carbon bond in a C14 fatty acyl-ACP to form two C7 molecules (see, e.g., Cryle, et al., "Structural insights from a P450 Carrier Protein complex reveal how specificity is achieved in the P450(BioI) ACP complex," Proc Natl Acad Sci USA 105, 15696-701 (2008); and Cryle et al., "Products of cytochrome P450(BioI) (CYP107H1)-catalyzed oxidation of fatty acids," Org Lett 5, 3341-4 (2003), the teachings of both of which are incorporated by reference). Yet this work remains unclear, and no one has identified free pimelic acid from B. subtilis cultures, potentially due to the very low levels of biotin required to support cell growth.

[0082] In accordance with the methods of this invention, we provide engineered host cells capable of producing odd-chain ω-hydroxyl fatty acids and α,ω-dicarboxylic acids by three general strategies that employ hybrid Type I PKSs, engineered portions of the B. subtilis biotin pathway, or engineered portions of the E. coli biotin pathway in a variety of embodiments.

[0083] In one embodiment, a hybrid PKS system is used to produce heptanoic acid, which is then oxidized with the methods described above to heptanedioic (pimelic) acid. In one embodiment, a hybrid PKS is constructed that comprises a loading module for either propionyl-CoA or methylmalonyl-CoA such that the starting unit is odd-chain. While there exist numerous loading modules that perform this function, one suitable loading module is selected from the Erythromycin PKS to load propionyl-CoA. This loading module is operatively linked to two extension-condensation modules that condense malonyl-CoA into the growing acyl-ACP chain. One suitable choice for these modules is the Nystatin PKS M5 and M15. Finally, the hybrid PKS is terminated with a thioesterase that cleaves the thioester bond and releases heptanoic acid. One suitable choice for the thioesterase is that from the Erythromycin PKS (DEBS TE). This construct is cloned into an expression vector and transformed into cells that have phosphopantetheinylation activity to activate the hybrid PKS; the cells are grown in appropriate medium and the free fatty acid heptanoic acid is produced. In some embodiments, the cells co-express the hybrid PKS and a set of enzymes for omega hydroxylation as described previously to further oxidize heptanoic acid into pimelic acid. In some embodiments the cells have no detectable levels of pimelic acid or do not have any known pathways for its production. In some embodiments the cells are C. tropicalis, B. subtilis, E. coli or S. cerevisiae.
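The chain length delivered by the hybrid PKS of paragraph [0083] follows from simple carbon bookkeeping: a three-carbon propionyl starter plus two carbons for each malonyl-CoA extension (one carbon of each malonyl unit is lost as CO2 during condensation). The sketch below illustrates that arithmetic; the function name is hypothetical, and the acetyl-CoA entry is included only for comparison and is not part of the described embodiment.

```python
# Carbons contributed by the starter unit.
STARTER_CARBONS = {
    "propionyl-CoA": 3,  # starter loaded by the erythromycin PKS loading module
    "acetyl-CoA": 2,     # comparison only; not used in the described hybrid PKS
}


def pks_chain_length(starter: str, malonyl_extensions: int) -> int:
    """Carbon count of the released acid: starter plus two carbons per extension."""
    if malonyl_extensions < 0:
        raise ValueError("number of extensions cannot be negative")
    return STARTER_CARBONS[starter] + 2 * malonyl_extensions


if __name__ == "__main__":
    # Propionyl starter with two malonyl extensions gives the C7 acid (heptanoic acid).
    print(pks_chain_length("propionyl-CoA", 2))
```

An odd-carbon starter always yields an odd-chain product regardless of the number of extensions, which is why the loading module determines whether the pathway reaches pimelic acid precursors.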

[0084] In another embodiment, pimelic acid is produced by engineering a B. subtilis host. Specifically, engineered cells are provided that overexpress the gene encoding P450 BioI (bioI), and higher levels of pimelic acid are detected compared to wild-type. The enzyme cleaves the central carbon-carbon bond in a C14 fatty acyl-ACP by consecutive formation of alcohol and threo-diol intermediates to form pimelate. In one embodiment, BioI is overexpressed by cloning it behind a regulatable promoter for expression in B. subtilis, such as Pete or PgsiB, which results in pimelic acid production. In other embodiments, BioI is overexpressed by cloning it behind constitutive promoters derived from the sigma A or sigma B RNA polymerase promoter sequence. Additionally, biotin itself is a valuable chemical derived from pimelate, so overproduction of pimelate in accordance with the methods and host cells of the invention decreases the costs of microbial biotin production. In another embodiment, a thioesterase is expressed to increase pimelate production. In another embodiment, the fatty acid synthesis enzymes native to B. subtilis are overexpressed, which results in increased pimelate production. Suitable enzymes are provided in Table I.

[0085] In another embodiment, P450 BioI is overexpressed in E. coli, a non-native organism, with or without overexpression of the B. subtilis fatty acid enzymes (acpP, accABCD, KS, KR, ER, DH). In some embodiments, expression of a thioesterase is employed to increase release of the pimelate from the ACP. In another embodiment, the P450 BioI is overexpressed in S. cerevisiae with or without the B. subtilis fatty acid enzymes (KS, KR, ER, DH, acpP) and a thioesterase, and pimelate is produced.

[0086] In another embodiment, fatty acid biosynthetic genes from D. vulgaris or D. desulfuricans are expressed, including the 3-oxoacyl acyl carrier protein reductase (fabG) (Accession No. YP_011773.1), the acpP (Accession No. YP_011774.1), the beta-ketoacyl ACP synthase (fabF) (Accession No. YP_011775.1), and the beta-hydroxyacyl ACP dehydratase (FabA/Z like) (Accession No. YP_011772.1), and pimelate is produced. In some embodiments, additional expression of the NC_002937 gene or a thioesterase supports pimelate biosynthesis as an auxiliary protein, based upon close proximity within the natural D. vulgaris gene cluster.

[0087] In another embodiment, E. coli genes are expressed in the native host or a heterologous host to produce pimeloyl-ACP; these genes include bioC, fabF, fabG, fabA/Z, and fabI. In order to remove the methyl ester bonded to the omega carboxylate, some embodiments include the expression or overexpression of bioH. Once the omega carboxylate is exposed, a fatty acid synthase editing TE releases the pimelic acid from the ACP-thioester. This TE is naturally produced or can be overexpressed to increase pimelic acid production. However, in some cases the pimelate methyl ester is desired. In some embodiments, expression of a thioesterase increases the production of pimelate or methyl pimelate. In another embodiment, the genes required for pimelic acid production in E. coli are expressed in S. cerevisiae and result in production of pimelic acid.

[0088] The enzymes described herein can be readily replaced using a homologous enzyme thereof. A homologous enzyme is an enzyme that has a polypeptide sequence that is at least 70%, 75%, 80%, 85%, 90%, 95% or 99% identical to any one of the enzymes described in this specification or in an incorporated reference. The homologous enzyme retains amino acid residues that are recognized as conserved for the enzyme. The homologous enzyme may have non-conserved amino acid residues replaced or found to be of a different amino acid, or amino acid(s) inserted or deleted, provided that these changes do not affect, or have an insignificant effect on, the enzymatic activity of the homologous enzyme. The homologous enzyme has an enzymatic activity that is identical or essentially identical to the enzymatic activity of any one of the enzymes described in this specification or in an incorporated reference. The homologous enzyme may be found in nature or be an engineered mutant thereof.

[0089] Nucleic acid constructs of the present invention comprise nucleic acid sequences encoding one or more of the subject enzymes. The nucleic acids encoding the subject enzymes are operably linked to promoters and optionally control sequences such that the subject enzymes are expressed in a host cell cultured under suitable conditions. The promoters and control sequences are specific for each host cell species. In some embodiments, expression vectors comprise the nucleic acid constructs. Methods for designing and making nucleic acid constructs and expression vectors are well known to those skilled in the art.
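The identity thresholds in the definition of a homologous enzyme (paragraph [0088]) can be illustrated with a simple percent-identity calculation over two pre-aligned sequences. This is a minimal sketch under stated assumptions: the sequences are already aligned with "-" marking gaps, identity is computed over all aligned columns that are not gap-only (other conventions for the denominator exist), and the function names and example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns (excluding gap-only columns) with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no residue from either sequence
        compared += 1
        if a == b:
            matches += 1  # a residue opposite a gap never counts as a match
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when identity reaches the chosen threshold (70% is the lowest one listed)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Made-up ten-residue peptides differing at two positions.
    print(percent_identity("MKTAYIAKQR", "MKTAYLAKQK"))
```

Sequence identity alone does not establish that a candidate is a homologous enzyme in the sense of the text, which also requires retained conserved residues and essentially identical activity.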

[0090] Sequences of nucleic acids encoding the subject enzymes are prepared by any suitable method known to those of ordinary skill in the art, including, for example, direct chemical synthesis or cloning. Further, nucleic acid sequences for use in the invention can be obtained from commercial vendors that provide de novo synthesis of the nucleic acids.

[0091] Each nucleic acid sequence encoding the desired subject enzyme can be incorporated into an expression vector. Incorporation of the individual nucleic acid sequences may be accomplished through known methods that include, for example, the use of restriction enzymes (such as BamHI, EcoRI, HhaI, XhoI, XmaI, and so forth) to cleave specific sites in the expression vector, e.g., plasmid. The restriction enzyme produces single stranded ends that may be annealed to a nucleic acid sequence having, or synthesized to have, a terminus with a sequence complementary to the ends of the cleaved expression vector. Annealing is performed using an appropriate enzyme, e.g., DNA ligase. As will be appreciated by those of ordinary skill in the art, both the expression vector and the desired nucleic acid sequence are often cleaved with the same restriction enzyme, thereby assuring that the ends of the expression vector and the ends of the nucleic acid sequence are complementary to each other. In addition, DNA linkers may be used to facilitate linking of nucleic acid sequences into an expression vector.
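When planning the restriction-based cloning described in paragraph [0091], it is common to check where each enzyme's recognition site occurs in the insert and the vector. The following is a minimal sketch, assuming a single-stranded, 5'-to-3' DNA string as input; the recognition sequences are the standard ones for the named enzymes, while `find_sites` and the example sequence are hypothetical.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HhaI": "GCGC",
    "XhoI": "CTCGAG",
    "XmaI": "CCCGGG",
}


def find_sites(seq: str) -> dict[str, list[int]]:
    """Map each enzyme to the 0-based start positions of its site in the sequence."""
    seq = seq.upper()
    hits: dict[str, list[int]] = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping occurrences
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    print(find_sites("AAGGATCCTTGAATTCAA"))
```

All five sites are palindromic, so scanning one strand is sufficient; an enzyme that appears in the result for the coding sequence itself would cut the insert and is usually avoided for that construct.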

[0092] A series of individual nucleic acid sequences can also be combined by utilizing methods that are known to those having ordinary skill in the art (e.g., U.S. Pat. No. 4,683,195).

[0093] For example, each of the desired nucleic acid sequences can be initially generated in a separate PCR. Thereafter, specific primers are designed such that the ends of the PCR products contain complementary sequences. When the PCR products are mixed, denatured, and reannealed, the strands having the matching sequences at their 3' ends overlap and can act as primers for each other. Extension of this overlap by DNA polymerase produces a molecule in which the original sequences are "spliced" together. In this way, a series of individual nucleic acid sequences may be "spliced" together and subsequently transduced into a host cell simultaneously. Thus, expression of each of the plurality of nucleic acid sequences is effected.

[0094] Individual nucleic acid sequences, or "spliced" nucleic acid sequences, are then incorporated into an expression vector. The invention is not limited with respect to the process by which the nucleic acid sequence is incorporated into the expression vector. Those of ordinary skill in the art are familiar with the necessary steps for incorporating a nucleic acid sequence into an expression vector. A typical expression vector contains the desired nucleic acid sequence preceded by one or more regulatory regions, along with a ribosome binding site, e.g., a nucleotide sequence that is 3-9 nucleotides in length and located 3-11 nucleotides upstream of the initiation codon, and followed by a terminator in the case of E. coli or other prokaryotic hosts. See Shine et al. (1975) Nature 254:34 and Steitz, in Biological Regulation and Development: Gene Expression (ed. R. F. Goldberger), vol. 1, p. 349, 1979, Plenum Publishing, N.Y. In the case of eukaryotic hosts like yeast, a typical expression vector contains the desired nucleic acid sequence preceded by one or more regulatory regions, along with a Kozak sequence to initiate translation, and followed by a terminator. See Kozak M (1984). Nature 308 (5956): 241-246.
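The overlap-based "splicing" of PCR products described in paragraph [0093] can be sketched in code by joining two fragments at their shared end sequence. This is a simplified illustration only: both fragments are written as same-sense 5'-to-3' strings, so the annealing of complementary strands and polymerase extension are abstracted into a suffix/prefix match. The function name, the `min_overlap` default, and the example sequences are hypothetical.

```python
def splice_by_overlap(left: str, right: str, min_overlap: int = 8) -> str:
    """Join two fragments at the longest end overlap of at least min_overlap bases."""
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first so the join is unambiguous.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("fragments share no overlap of the required length")


if __name__ == "__main__":
    upstream = "ATGGCTAGCTTGACCGGTACC"
    downstream = "GACCGGTACCTTAAGGTGA"
    # The two made-up fragments share the 10-base sequence GACCGGTACC.
    print(splice_by_overlap(upstream, downstream))
```

Applying the function repeatedly to an ordered list of fragments mirrors the splicing of a series of sequences into one construct, provided each junction has its own distinct overlap.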

[0095] Regulatory regions include, for example, those regions that contain a promoter and an operator. A promoter is operably linked to the desired nucleic acid sequence, thereby initiating transcription of the nucleic acid sequence via an RNA polymerase enzyme. An operator is a sequence of nucleic acids adjacent to the promoter, which contains a protein-binding domain where a repressor protein can bind. In the absence of a repressor protein, transcription initiates through the promoter. When present, the repressor protein specific to the protein-binding domain of the operator binds to the operator, thereby inhibiting transcription. In this way, control of transcription is accomplished, based upon the particular regulatory regions used and the presence or absence of the corresponding repressor protein. Examples for prokaryotic expression include lactose promoters (LacI repressor protein changes conformation when contacted with lactose, thereby preventing the LacI repressor protein from binding to the operator) and tryptophan promoters (when complexed with tryptophan, TrpR repressor protein has a conformation that binds the operator; in the absence of tryptophan, the TrpR repressor protein has a conformation that does not bind to the operator). Another example is the tac promoter. (See deBoer et al. (1983) Proc. Natl. Acad. Sci. USA, 80:21-25.) Examples of promoters to use for eukaryotic expression include pTDH3, pTEF1, pTEF2, pRNR2, pRPL18B, pREV1, pGAL1, pGAL10, pGAPDH, pCUP1, pMET3, pPGK1, pPYK1, pHXT7, pPDC1, pFBA1, pTDH2, pPGI1, pTPI1, pENO2, pADH1, and pADH2. As will be appreciated by those of ordinary skill in the art, these and other expression vectors or elements may be used in the present invention, and the invention is not limited in this respect.
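The repressor logic described in paragraph [0095] amounts to a small truth table: the lactose system is switched on by its ligand, while the tryptophan system is switched off by its ligand. The sketch below encodes only that qualitative behavior as stated in the text; the function and argument names are hypothetical, and real promoters show graded rather than all-or-nothing expression.

```python
def transcription_on(system: str, repressor_present: bool, ligand_present: bool) -> bool:
    """Qualitative on/off state of a repressor-controlled promoter."""
    if not repressor_present:
        return True  # without repressor, transcription initiates through the promoter
    if system == "lac":
        # LacI binds the operator unless lactose changes its conformation.
        repressor_bound = not ligand_present
    elif system == "trp":
        # TrpR binds the operator only when complexed with tryptophan.
        repressor_bound = ligand_present
    else:
        raise ValueError(f"unknown system: {system}")
    return not repressor_bound


if __name__ == "__main__":
    for system in ("lac", "trp"):
        for ligand in (False, True):
            state = transcription_on(system, repressor_present=True, ligand_present=ligand)
            print(system, "ligand" if ligand else "no ligand", "->", "on" if state else "off")
```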

[0096] Although any suitable expression vector may be used to incorporate the desired sequences, readily available expression vectors include, without limitation: plasmids, such as pSC101, pBR322, pBBR1MCS-3, pUR, pEX, pMR100, pCR4, pBAD24, pUC19, and the pRS series; and bacteriophages, such as M13 phage and λ phage. Of course, such expression vectors may only be suitable for particular host cells. One of ordinary skill in the art, however, can readily determine through routine experimentation whether any particular expression vector is suited for any given host cell. For example, the expression vector can be introduced into the host cell, which is then monitored for viability and expression of the sequences contained in the vector. In addition, reference may be made to the relevant texts and literature, which describe expression vectors and their suitability to any particular host cell. In addition to the use of expression vectors, strains are built where expression cassettes are directly integrated into the host genome.

[0097] The expression vectors or integration cassettes of the invention must be introduced or transferred into the host cell. Such methods for transferring the expression vectors into host cells are well known to those of ordinary skill in the art. For example, one method for transforming E. coli with an expression vector involves a calcium chloride treatment wherein the expression vector is introduced via a calcium precipitate. Other salts, e.g., calcium phosphate, may also be used following a similar procedure. In addition, electroporation (i.e., the application of current to increase the permeability of cells to nucleic acid sequences) may be used to transfect the host microorganism. Also, microinjection of the nucleic acid sequence(s) provides the ability to transfect host microorganisms. Other means, such as lipid complexes, liposomes, and dendrimers, may also be employed. Those of ordinary skill in the art can transfect a host cell with a desired sequence using these or other methods.

[0098] For identifying a transfected host cell, a variety of methods are available. For example, a culture of potentially transfected host cells may be separated, using a suitable dilution, into individual cells and thereafter individually grown and tested for expression of the desired nucleic acid sequence. In addition, when plasmids are used, an often-used practice involves the selection of cells based upon antimicrobial resistance that has been conferred by genes intentionally contained within the expression vector, such as the amp, gpt, neo, and hyg genes.

[0099] The host cell is transformed with at least one expression vector. When only a single expression vector is used (without the addition of an intermediate), the vector will contain all of the nucleic acid sequences necessary.

[0100] Once the host cell has been transformed with the expression vector, the host cell is allowed to grow. For microbial hosts, this process entails culturing the cells in a suitable medium. It is important that the culture medium contain a carbon source, such as a sugar (e.g., glucose) when an intermediate is not introduced. In this way, cellular production of acetyl-CoA, the starting material for the production of the diacids, is ensured. When added, the intermediate is present in an excess amount in the culture medium or cells.

[0101] As the host cell grows and/or multiplies, expression of the enzymes necessary for producing the fatty acid, hydroxyl fatty acid, 1-oxo fatty acid, 1-ol fatty acid and the diacid is effected. Once expressed, the enzymes catalyze the enzymatic steps shown in Figures 1 and 4. If an intermediate has been introduced, the expressed enzymes catalyze those steps necessary to convert the intermediate into the respective fatty acid derived compounds. Any means for recovering the diacid from the host cell may be used. For example, the host cell may be harvested and subjected to hypotonic conditions, thereby lysing the cells. The lysate may then be centrifuged and the supernatant subjected to high performance liquid chromatography (HPLC) or gas chromatography (GC).

Host cells

[0102] The host cells of the present invention are genetically modified in that heterologous nucleic acids have been introduced into the host cells or naturally occurring cells have been engineered to produce higher levels of a given product, and as such the genetically modified host cells do not occur in nature. A suitable host cell is one capable of expressing a nucleic acid construct encoding an enzyme capable of catalyzing a desired biosynthetic reaction in order to produce the desired fatty acid or fatty acid derived molecule. Such enzymes are described herein. In some embodiments, the host cell naturally produces some of the precursors, as shown in Figures 1 and 4, for the production of the fatty acid derived compounds. The genes encoding the desired enzymes may be heterologous to the host cell, or these genes may be native to the host cell but operatively linked to heterologous promoters and/or control regions that result in higher expression of the gene(s) in the host cell. In other embodiments, the host cell does not naturally produce the fatty acid starting material and comprises heterologous nucleic acid constructs capable of expressing one or more genes necessary for producing the fatty acid.

[0103] Each of the desired enzymes capable of catalyzing the desired reaction can be native or heterologous to the host cell. Where the enzyme is native to the host cell, the host cell is optionally genetically modified to modulate expression of the enzyme. This modification can involve the modification of the chromosomal gene encoding the enzyme in the host cell or the introduction into the host cell of a nucleic acid construct encoding the gene of the enzyme. One of the effects of the modification is that the expression of the enzyme is modulated in the host cell, such as increased expression of the enzyme in the host cell as compared to the expression of the enzyme in an unmodified host cell.

[0104] The genetically modified host cell can further comprise a genetic modification whereby the host cell is modified by the increased expression of one or more genes involved in the production of fatty acid compounds by one of the five methods provided, such that the production of fatty acid compounds by the host cell is increased. Such genes encode enzymes related to either Type I, II, or III fatty acid biosynthesis, hybrid Type I polyketide synthesis, or 2-ketoacid biosynthesis and include: acetyl-CoA carboxylase (ACC), ketosynthase, ketoreductase, dehydratase, enoyl reductase, cytosolic thioesterase ('TesA, sometimes referred to as LTesA), and acyl-carrier protein (AcpP). In some embodiments, the genetically modified host cell is modified to produce higher levels of cytosolic acetyl-CoA or malonyl-CoA, or the pathway may be targeted to the mitochondria or to a compartment where there is a natural or engineered abundance of acetyl-CoA and other necessary precursors. Thus, in some embodiments, a host cell of the invention comprises a modification to express, or increase expression of, a protein such as ATP citrate lyase and, to increase levels of NADPH, malic enzyme. For example, Saccharomyces cerevisiae has little ATP citrate lyase and can be engineered in accordance with the invention to express ATP citrate lyase by introducing an expression vector encoding ATP citrate lyase into the yeast cells.

[0105] In some embodiments, a genetically modified host cell is modified to increase expression of a Type I (prokaryotic, eukaryotic) or Type II (prokaryotic) or Type III fatty acid synthase (FAS) gene or Type I polyketide synthase (PKS) or 2-ketoacid biosynthetic enzymes. For example, a yeast host cell is modified to express a FAS gene. Fatty acid synthase proteins are known in the art. FAS3 catalyzes the first committed step in fatty acid biosynthesis and in yeast is encoded by a 6.7 kb gene and contains two enzymatic domains: biotin carboxylase and biotin carboxyltransferase. FAS2 is encoded, in yeast, by a 5.7 kb gene and contains four domains: an acyl-carrier protein, beta-ketoacyl reductase, beta-ketoacyl synthase, and phosphopantetheinyl transferase (PPT). FAS1 is encoded, in yeast, by a 6.2 kb gene and contains five domains: acetyltransacylase, dehydratase, enoyl reductase, malonyl transacylase, and palmitoyl transacylase. FAS1 and FAS2 complex to form a heterododecamer, containing six each of FAS1 and FAS2 subunits (Lomakin et al., Cell 129:319-322, 2007, incorporated herein by reference). In some embodiments, a genetically modified host cell overexpresses or expresses native and/or non-native type II fatty acid synthase enzymes. Illustrative genes that encode the enzymes are provided: acpP, accA, accB, accC, accD, fabD, fabH, fabG, fabZ, fabA, fabI, fabB, fabF, fadR. In some embodiments a genetically modified host cell overexpresses or expresses hybrid engineered type I polyketide synthases.

[0106] The genetically modified host cell can further comprise a genetic modification whereby the host cell is modified by the decreased or lack of expression of one or more genes encoding proteins involved in the storage and/or metabolism of fatty acid compounds, such that the storage and/or metabolism of fatty acid compounds by the host cell is decreased. Such genes include the following: the ARE1, ARE2, DGA1, and/or LRO1 genes. In some embodiments, the host cell is modified by the decreased or lack of expression of genes that are involved in the β-oxidation of fatty acids. For example, in yeast, e.g., Saccharomyces cerevisiae, β-oxidation occurs in the peroxisome. Genes such as PAT1 and PEX11 encode peroxisomal proteins involved in degradation of long-chain and medium-chain fatty acids, respectively. Accordingly, a host cell may be modified in accordance with the invention to delete PAT1 and/or PEX11, or otherwise decrease expression of the PAT1 and/or PEX11 proteins.

[0107] The genetically modified host cell of the invention can further comprise a genetic modification whereby the host cell is modified to express or have increased expression of an ABC transporter that is capable of exporting or increasing the export of any of the fatty acid derived compounds from the host cell. Such an ABC transporter is, for example, the plant cer5 or dcuC.

[0108] The present invention provides a wide variety of prokaryotic or eukaryotic host cells suitable for use in the present method and methods for making such host cells. In some embodiments, the bacterium is a cyanobacterium. Examples of suitable bacterial host cells include, without limitation, those species assigned to the Escherichia, Enterobacter, Azotobacter, Erwinia, Bacillus, Pseudomonas, Klebsiella, Proteus, Salmonella, Serratia, Shigella, Rhizobia, Vitreoscilla, Synechococcus, Synechocystis, and Paracoccus taxonomical classes.

[0109] Suitable eukaryotic cells include, but are not limited to, fungal, insect or mammalian cells. Suitable fungal cells are yeast cells, such as yeast cells of the Saccharomyces genus. In some embodiments the eukaryotic cell is an alga, e.g., Chlamydomonas reinhardtii, Scenedesmus obliquus, Chlorella vulgaris or Dunaliella salina.

[0110] In some embodiments, the host organism is yeast. Suitable yeast host cells include, but are not limited to, Yarrowia, Candida, Debaryomyces, Saccharomyces, Schizosaccharomyces and Pichia. In one embodiment, the yeast host cell is a species of Candida selected from the group consisting of C. tropicalis, C. maltosa, C. apicola, C. paratropicalis, C. albicans, C. cloacae, C. guilliermondii, C. intermedia, C. lipolytica, C. parapsilosis and C. zeylanoides. In one embodiment, Candida tropicalis is employed as the host organism.

[0111] The present invention provides for an isolated fatty acid derived compound produced by the method of the present invention. Isolating the fatty acid derived compound involves separating at least part or all of the fermentation medium, host cells, and parts thereof, from which the fatty acid derived compound was produced, from the isolated fatty acid derived compound. The isolated fatty acid derived compound may be free or essentially free of impurities formed from at least part or all of the host cells, and parts thereof. The isolated fatty acid derived compound is essentially free of these impurities when the amount and properties of the impurities do not interfere in the subsequent use of the fatty acid derived compound. For example, if the subsequent use is as an industrial chemical, such as a chemical to be used in a polymerization reaction, then the compound is essentially free of impurities when any remaining impurities would not interfere with the use of the compound as an industrial chemical in a polymerization reaction or any other downstream industrial reaction. If the product is to be used as a fuel, such as a fuel to be used in a combustion reaction, then the compound is essentially free of impurities when any impurities remaining would not interfere with the use of the material as a fuel. In some instances, the host cells of the invention do not naturally produce the desired fatty acid derived compound.

[0112] The fatty acid derived compounds of the present invention are useful not only as fuels, that is, as a chemical source of energy, but also as industrial chemicals and precursors thereof that can be used as an alternative to petroleum derived fuels, ethanol and the like, and to petroleum derived industrial chemicals and their precursors. The fatty acid derived compounds of the present invention are also useful in the synthesis of various alkanes, alcohols, and esters for use as a renewable fuel or for industrial chemical production. In addition, the fatty acid derived compounds can also be used as precursors in the synthesis of therapeutics, high-value oils, such as a cocoa butter equivalent, and animal feeds. The fatty acid derived compounds are also useful in the production of the class of eicosanoids or related molecules, which have therapeutic related applications.

[0113] It is to be understood that, while the invention is described herein in conjunction with specific embodiments thereof, the foregoing description is intended to illustrate and not limit the scope of the invention. Other aspects, advantages, and modifications within the scope of the invention will be apparent to those skilled in the art to which the invention pertains in view of this disclosure.

[0114] All patents, patent applications, and publications mentioned herein are hereby incorporated by reference in their entireties.

[0115] The invention having been described, the following examples are offered to illustrate the subject invention by way of illustration, not by way of limitation.

EXAMPLES

Example 1: Production of tetradecanedioic acid.

[0116] In accordance with various embodiments of the present invention, tetradecanedioic acid is produced. In one embodiment, the heterologous type III fatty acid synthases ELO1 and ELO2 (T. brucei) are expressed in conjunction with a set of enzymes for the production of butyryl-CoA, which in some embodiments are encoded by phaA, phaB, phaJ, and ter, and with a thioesterase, which in some embodiments is encoded by LtesA, to produce tetradecanoic acid. The tetradecanoic acid is then converted through the alcohol and aldehyde intermediates into the diacid tetradecanedioic acid by expressing a P450 and FAO.

[0117] In another embodiment, a native type II FAS system and a thioesterase from E. coli that produces tetradecanoate were expressed in a host cell of the invention. Once produced, the tetradecanoate is further oxidized in accordance with the invention at the omega carbon by P450s. Although many thioesterases are suitable for these embodiments, in one embodiment the LtesA from E. coli is employed. Although many P450s or omega oxidases are suitable for these embodiments, in one embodiment the P450 BM3 from B. subtilis is employed. Briefly, P450 BM3 and a mutant P450 BM3 (F87A) were separately cloned using standard methods and inserted behind the TAC promoter on an E. coli expression plasmid that contained the p15A origin of replication and an ampicillin resistance gene. A thioesterase, LtesA, was cloned using standard methods and inserted behind the LacUV5 promoter on an E. coli expression plasmid that contained the pBBR origin of replication and a tetracycline resistance gene. Plasmid maps are included in Figure 3. The host cells harboring expression plasmids for the LtesA and P450 BM3 or LtesA and P450 BM3 (F87A) were grown in LB media at 37 degrees C and induced at OD = 0.5 with 1 mM IPTG. Cells were grown for 48 h and separated from the supernatant by centrifugation. Both supernatant and pellet fractions were separately dried and resuspended in MeOH:H2O (1:1, v/v). Chemical standards were purchased from Sigma and were made up to 20 μM in methanol and water (1:1, v/v). The separation of diacids was conducted on a ZIC-HILIC column (250 mm length, 2.1 mm internal diameter, and 3.5 μm particle size; from Merck SeQuant, and distributed via The Nest Group, Inc., MA, USA) using an Agilent Technologies 1200 Series HPLC system (Agilent Technologies, CA, USA). An injection volume of 4 μL was used throughout. The auto-sample tray was maintained at 4 °C by an Agilent FC/ALS Thermostat. The column compartment was set to 50 °C. Analytes were eluted isocratically with a mobile phase composition of 50 mM ammonium acetate, in water, and acetonitrile (3.6:6.4, v/v). A flow rate of 0.1 mL/min was used throughout.
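The chromatographic settings above imply a few derived quantities that can be checked with simple arithmetic. The following is a minimal sketch, not part of the described method: the function names are hypothetical, the mixed buffer concentration assumes that the aqueous and organic volumes are additive, and the column volume is the geometric volume of the empty tube (packing is ignored).

```python
import math

# Values quoted in paragraph [0117]
BUFFER_MM = 50.0          # ammonium acetate in the aqueous component (mM)
AQUEOUS_PARTS = 3.6       # aqueous buffer : acetonitrile = 3.6 : 6.4 (v/v)
ORGANIC_PARTS = 6.4
FLOW_ML_PER_MIN = 0.1
COLUMN_LENGTH_MM = 250.0
COLUMN_ID_MM = 2.1


def aqueous_fraction(aqueous_parts, organic_parts):
    """Volume fraction of the aqueous component in the mixed mobile phase."""
    return aqueous_parts / (aqueous_parts + organic_parts)


def mixed_buffer_mm(buffer_mm, aqueous_parts, organic_parts):
    """Buffer concentration after mixing (assumption: volumes are additive)."""
    return buffer_mm * aqueous_fraction(aqueous_parts, organic_parts)


def empty_column_volume_ml(length_mm, inner_diameter_mm):
    """Geometric volume of the empty column tube (1 mm^3 equals 1 uL)."""
    radius_mm = inner_diameter_mm / 2.0
    return math.pi * radius_mm ** 2 * length_mm / 1000.0


if __name__ == "__main__":
    frac = aqueous_fraction(AQUEOUS_PARTS, ORGANIC_PARTS)
    conc = mixed_buffer_mm(BUFFER_MM, AQUEOUS_PARTS, ORGANIC_PARTS)
    volume = empty_column_volume_ml(COLUMN_LENGTH_MM, COLUMN_ID_MM)
    print(f"aqueous fraction: {frac:.2f}")
    print(f"ammonium acetate after mixing: {conc:.1f} mM")
    print(f"empty column volume: {volume:.3f} mL")
    print(f"time per empty column volume: {volume / FLOW_ML_PER_MIN:.1f} min")
```

Under these assumptions the mobile phase is 36% aqueous, and the empty tube holds a little under 0.9 mL, so one tube volume passes in roughly nine minutes at the stated flow rate.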

[0118] The HPLC system was coupled to an Agilent Technologies 6210 time-of-flight mass spectrometer (LC-TOF MS) by a 1/3 post-column split. Contact between both instrument set-ups was established by a LAN card to trigger the MS into operation upon the initiation of a run cycle from the MassHunter workstation (Agilent Technologies, CA, USA). Electrospray ionization (ESI) was conducted in the negative ion mode and a capillary voltage of -3500 V was utilized. MS experiments were carried out in full scan mode, at 0.85 spectra/second and a cycle time of 1.176 seconds, for the detection of [M - H]- ions. The instrument was tuned for a range of 50 - 1700 m/z. Prior to LC-TOF MS analysis, the TOF MS was calibrated via an ESI-L-low concentration tuning mix (Agilent Technologies, CA, USA). Data acquisition and processing were performed by the MassHunter software package (Agilent Technologies, CA, USA).
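The acquisition rate and the cycle time quoted above are reciprocal, which the short sketch below confirms. It also estimates the flow reaching the mass spectrometer, under the assumption (not stated explicitly in the text) that "1/3 post-column split" means one third of the 0.1 mL/min column flow is directed to the ion source. The function names are hypothetical.

```python
def cycle_time_s(spectra_per_second):
    """Time per spectrum for a given acquisition rate."""
    return 1.0 / spectra_per_second


def flow_to_ms_ml_per_min(column_flow_ml_per_min, split_fraction):
    """Flow entering the MS, assuming split_fraction of the flow is sent to it."""
    return column_flow_ml_per_min * split_fraction


if __name__ == "__main__":
    # 0.85 spectra/second is the rate given in paragraph [0118]
    print(f"cycle time: {cycle_time_s(0.85):.3f} s")
    # Assumption: one third of the 0.1 mL/min column flow goes to the MS
    flow = flow_to_ms_ml_per_min(0.1, 1.0 / 3.0)
    print(f"flow to MS: {flow * 1000:.1f} uL/min")
```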

[0119] Figures 2 and 3 illustrate that the expression of both the wild-type P450 BM3 and the engineered P450 BM3 (F87A) resulted in diacid production when the native E. coli fatty acid pathway is overexpressed via LtesA. Data shown in Figure 2 demonstrated production and excretion of the C14 α,ω-dicarboxylic acid tetradecanedioic acid. Specifically, Figure 2A demonstrated production of both the C14 ω-hydroxytetradecanoic acid and tetradecanedioic acid. Figure 2B is MS data that show the expected molecular ion of tetradecanedioic acid.
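For orientation, the expected [M - H]- values for the two C14 products can be estimated from their molecular formulas. The sketch below is illustrative only: the formulas (C14H26O4 for tetradecanedioic acid and C14H28O3 for ω-hydroxytetradecanoic acid) and the standard monoisotopic masses are supplied here and are not taken from the figures, and the helper names are hypothetical.

```python
import re

# Standard monoisotopic masses (u); supplied for illustration
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
ELECTRON_MASS = 0.00054857990946


def parse_formula(formula):
    """Turn a simple formula such as 'C14H26O4' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Monoisotopic mass of the neutral molecule."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def deprotonated_mz(formula):
    """m/z of the singly charged [M - H]- ion (loss of H, gain of an electron)."""
    return monoisotopic_mass(formula) - MONOISOTOPIC["H"] + ELECTRON_MASS


if __name__ == "__main__":
    for name, formula in [
        ("tetradecanedioic acid", "C14H26O4"),
        ("omega-hydroxytetradecanoic acid", "C14H28O3"),
    ]:
        print(f"{name} ({formula}): [M - H]- = {deprotonated_mz(formula):.4f}")
```

With these inputs the diacid is expected near m/z 257.176 and the ω-hydroxy acid near m/z 243.197, both well inside the 50 - 1700 m/z tuning range.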

Example 2: Production of butanedioic acid.

[0120] In accordance with the methods of the invention, butanedioic acid is produced by a variety of embodiments. This example describes converting acetyl-CoA into butyryl-CoA, which is cleaved from the CoA by a thioesterase (TES), oxidized by a P450 monooxygenase (OX1) and further oxidized into its respective diacid by another oxidase (OX2). Importantly, a thioesterase can be engineered in accordance with the invention so that it cleaves hydroxybutyrate from its CoA, allowing for the production of 2-hydroxybutanedioic acid.

[0121] The above method is merely illustrative of the many embodiments provided by the invention. Another embodiment involves the use of a PKS composed of a loading module that incorporates malonyl-CoA, an extension module, and a TE that releases a butyrate product. In this embodiment, the fatty acid is then oxidized into omega hydroxybutyrate and finally into butanedioic acid. While many PKS modules can be used, in one embodiment, a malonyl-CoA loading module from the niddamycin PKS is functionally attached to an extension and full reduction module from the nystatin PKS (module 5) to provide a PKS of the invention. To release the product as a fatty acid, a TE, for example, the DEBS PKS TE, is functionally attached to the nystatin module. This sequence is placed into an expression vector for E. coli, expressed at 15-37 degrees C, and results in production of butyric acid (SEQ ID NO:1 below). Additional expression of a short chain oxidase (OX1 and OX2) further results in production of butanedioic acid in accordance with the invention.

Example 3: Production of hexanedioic acid.

[0122] In accordance with various embodiments of the methods of the invention, hexanedioic acid is produced. In one embodiment, genes in the aflatoxin biosynthesis pathway that encode fatty acid biosynthesis reactions resulting in a final C6 fatty acid product, hexAB, are used. Because others have reported that the hexanoic acid is bound to the hexAB enzyme (see reference 3), in some embodiments, the hexAB has been mutated to decrease or eliminate such binding, and in other embodiments, a thioesterase (TES) is expressed that cleaves the C6 fatty acid from its thioester. In another embodiment, the transacylase from the PKS is used to load an ACP and subsequently cleave it with a TES. In another alternative embodiment, a short-chain thioesterase is engineered to directly produce the C6 fatty acid from type II fatty acid biosynthesis (see, e.g., Dehesh et al., "Production of high levels of 8:0 and 10:0 fatty acids in transgenic canola by overexpression of Ch FatB2, a thioesterase cDNA from Cuphea hookeriana." Plant J 9, 167-72 (1996)). Once produced, the hexanoate is further oxidized in accordance with the invention at the omega carbon by P450s. Although many P450s or omega oxidases are suitable for these embodiments, in one embodiment the P450 BM3 from B. subtilis is employed, either native or engineered. In various embodiments, additional expression of a P450 monooxygenase and reductase (P4503P2 and CPR) further results in production of adipate, in accordance with the invention.

Scheme II

[0123] The above embodiments are merely illustrative. Another embodiment utilizes an engineered PKS of the invention composed of a loading module that incorporates malonyl-CoA, two extension modules, and a TE that releases a final hexanoate product. This fatty acid is then oxidized into omega-hydroxyhexanoate and finally into hexanedioic acid (adipate) by, in one embodiment, expression of P4503P2, CPR, and a FAO / ALD in the host cell. More specifically, a malonyl-CoA loading module from the niddamycin PKS is functionally attached to an extension and full reduction module from the nystatin PKS (module 5), followed by another extension and full reduction module from the nystatin PKS (module 15) to provide a PKS of the invention. To release the product as a fatty acid, a TE, optionally the DEBS PKS TE, is functionally attached to the last nystatin module. This sequence is placed into an expression vector for E. coli, expressed at 15-37 degrees C, resulting in production of hexanoic acid (sequence hex orf 1 & hex orf 2, i.e., SEQ ID NOS:2 and 3, respectively, below). Additional expression of a short chain oxidase (OX1 and OX2) in accordance with the invention further results in production of adipic acid.

Example 4: Production of octanedioic acid.

[0124] In accordance with various embodiments of the present invention, octanedioic acid is produced. In one embodiment, engineered host cells are provided that express a native type II FAS system and a heterologous thioesterase, ChFatB2 from C. hookeriana, to produce octanoate. The octanoate is further converted into the diacid octanedioic acid by expression of a hydroxylase (P4503P2) and an FAO/FAD, ADH combination.

Scheme III

Example 5: Production of decanedioic acid.

[0125] In accordance with various embodiments of the present invention, decanedioic acid is produced. The thioesterase, ChFatB2, from C. hookeriana, produces decanoate. In one embodiment, host cells are modified to produce decanedioic acid by expressing a native type II FAS system and a thioesterase, ChFatB2 from C. hookeriana, to produce decanoate. In another embodiment host cells are modified to produce decanedioic acid by expressing a type III fatty acid synthase, ELO1 (T. brucei), appropriate genes for the production of butyryl-CoA, which in some embodiments include phaA, phaB, phaJ, and ter, and a thioesterase, which in some embodiments is ChFatB2. Although many P450s could be employed, in one embodiment, the decanoate is then converted in accordance with the invention into the diacid decanedioic acid by expressing P4503P2 and an FAO/FAD, ADH combination.

Scheme IV

Example 6: Production of dodecanedioic acid.

[0126] In accordance with various embodiments of the present invention, dodecanedioic acid is produced. In one embodiment, a host cell containing a native type II FAS system and a heterologous thioesterase, UcFATB1 from U. californica, that produces dodecanoate is provided. The dodecanoate can be further oxidized in accordance with the invention at the omega carbon by P450s.

Scheme V

Example 7: Production of C14(n+2) α,ω-Dicarboxylic Acids.

[0127] The methods presented above can be used to produce longer chain diacids up to C26 and longer, with longer chain fatty acid biosynthesis systems existing in organisms such as mycoplasms, etc.

Example 8: Engineering thioesterase substrate specificity

[0128] Thioesterase substrate specificity can be engineered in accordance with the methods of the invention to produce specific fatty acid chain lengths (see, e.g., Yuan et al., Proc Natl Acad Sci USA 92, 10639-43 (1995); see also, references 4 and 6).

Example 9: Oxidation by P450 BM3.

[0129] Terminal oxidation can be carried out by the wild-type P450 BM3 monooxygenase using an ω-hydroxy fatty acid as a substrate (see, e.g., Schneider et al., "Production of alkanedioic acids by cytochrome P450 BM3 monooxygenase: oxidation of 16-hydroxyhexadecanoic acid to hexadecane-1,16-dioic acid," Biocatalysis and Biotransformation, 17: 163-178 (1999)). Thus, in addition to the previous examples that utilize fatty alcohol oxidation/fatty alcohol dehydrogenation and aldehyde dehydrogenation, the final oxidation is, in some embodiments, carried out by the P450 BM3 enzyme. Further, the hydroxylation position can be changed to the ω carbon by a point mutation, resulting in ω-hydroxylation of laurate (see, Oliver, et al., "A single mutation in cytochrome P450 BM3 changes substrate orientation in a catalytic intermediate and the regiospecificity of hydroxylation," Biochemistry 36, 1567-72 (1997)). Thus, expression of both the wild-type P450 BM3 and the engineered P450 BM3 (F87A) results in diacid production if fatty acids are supplied. Further, the substrate specificity can be changed to shorter chain length fatty acids by introduction of various point mutations, resulting in oxidation of short chain length substrates (see, Ost, et al. "Rational re-design of the substrate binding site of flavocytochrome P450 BM3." FEBS Letters 486, 173-177 (2000)).

Example 10: Controlling Saturation.

[0130] Fatty acid saturation can be controlled by expressing desaturases or, alternatively, by overexpressing fadR.

Example 11: Controlling Internal Hydroxylation.

[0131] Other P450s hydroxylate various ω-1, ω-2, and ω-3 positions and produce long chain molecules that resemble polyhydroxyalkanoates. Alternatively, one can cleave the thioester early in the fatty acid reduction/elongation cycle to produce molecules like 2-hydroxymyristate in accordance with embodiments of the invention.

Example 12: Biosynthetic Route to Omega Hydroxy Fatty Acids

[0132] Omega hydroxy fatty acids themselves are valuable as polymer substrates and can easily be produced with an embodiment of the invention in which Example 6 above is utilized after eliminating the FAO/FAD and ADH enzyme activities.

Example 13: Providing fatty acid substrate through Type I fatty acid biosynthesis.

[0133] Alternate methods of the invention utilize Type I fatty acid biosynthesis for controlling fatty acid chain length through short chain elongation systems. This results in production of specific acyl-CoA chain lengths, including C4, C10, C14, C18, C20, C22, and C26. The fatty acid substrates are cleaved from the CoA thioester by expressing a thioesterase that has a broad substrate range. The fatty acid of desired chain length can then be omega oxidized to form its respective diacid, as described in Examples 1-6.

Example 14: Production of odd chain α,ω-Dicarboxylic Acids.

[0134] Odd chain diacids are also valuable molecules that can be produced through decarbonylation of fatty acids to produce odd chain fatty acids in accordance with the invention. These odd chain fatty acids are then oxidized utilizing the oxidation methods described herein. Alternatively, odd chain fatty acids are produced when propionyl-CoA is used as a primer for fatty acid or polyketide synthases (instead of acetyl-CoA). Once the odd chain fatty acids are produced via these methods, they proceed through omega oxidation as described above.

Example 15: Production of C7 diacid (pimelic acid).

[0135] Pimelic acid is a precursor to the biotin biosynthesis pathway and is produced naturally in different organisms by different pathways relating to fatty acid like mechanisms. The present invention provides a variety of embodiments for the production of pimelic acid.

[0136] In one embodiment, the gene encoding native or engineered P450 BioI (bioI) native to B. subtilis is overexpressed in B. subtilis by cloning behind a sigma B RNA polymerase constitutive promoter and higher levels of pimelic acid are detected (compared to wild-type). The enzyme cleaves the central carbon-carbon bond in a C14 fatty acyl-ACP by consecutive formation of alcohol and threo-diol intermediates to form pimelate. Additionally, biotin itself is a valuable chemical derived from pimelate, so overproduction of pimelate in accordance with the methods and host cells of the invention decreases the costs of microbial biotin production.

[0137] In another embodiment, P450 BioI is overexpressed in E. coli, a non-native organism, with or without overexpression of the B. subtilis fatty acid enzymes (acpP, accABCD, KS, KR, ER, DH). In some embodiments, expression of a thioesterase is employed to increase release of the pimelate from the ACP.

[0138] In another embodiment, the P450 BioI is overexpressed in S. cerevisiae with or without the B. subtilis fatty acid enzymes (KS, KR, ER, DH).

[0139] In another embodiment, the orfs following bioI, ytbQ and ytcP and ytcQ (B. subtilis), are expressed to increase pimelate production. In another embodiment, a thioesterase is expressed to increase the pimelate production.

[0140] In another embodiment, a hybrid PKS system is used to produce heptanoic acid, which is then oxidized with the methods described above. Specifically, a PKS propionyl-CoA or methylmalonyl-CoA loading module is functionally linked to two malonyl-CoA extension and reduction modules and finally to a TE module and expressed in E. coli to produce heptanoic acid. Expression of the omega oxidizing enzymes results in oxidation of the heptanoic acid to heptanedioic (pimelic) acid.
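The chain lengths obtained from the PKS designs in Examples 2, 3, and 15 follow from simple carbon bookkeeping: the starter unit sets the parity, and each malonyl-CoA extension adds two carbons. The sketch below illustrates this under stated assumptions that are not spelled out in the text: a malonyl-CoA loading module is taken to contribute a two-carbon starter and a methylmalonyl-CoA loading module a three-carbon starter (each after decarboxylation), and every extension module is fully reducing. The names are hypothetical.

```python
# Carbons contributed by the starter unit (assumption: malonyl and
# methylmalonyl units are decarboxylated when they are loaded)
STARTER_CARBONS = {
    "acetyl-CoA": 2,
    "malonyl-CoA": 2,
    "propionyl-CoA": 3,
    "methylmalonyl-CoA": 3,
}
CARBONS_PER_MALONYL_EXTENSION = 2


def chain_length(starter, malonyl_extensions):
    """Carbon count of the released acid for a starter plus n extensions."""
    if malonyl_extensions < 0:
        raise ValueError("number of extension modules cannot be negative")
    return STARTER_CARBONS[starter] + CARBONS_PER_MALONYL_EXTENSION * malonyl_extensions


if __name__ == "__main__":
    designs = [
        ("Example 2 (butyrate)", "malonyl-CoA", 1),
        ("Example 3 (hexanoate)", "malonyl-CoA", 2),
        ("Example 15 (heptanoate)", "propionyl-CoA", 2),
    ]
    for label, starter, extensions in designs:
        print(f"{label}: C{chain_length(starter, extensions)}")
```

The same bookkeeping shows why a three-carbon starter is needed for odd chain products: with a two-carbon starter every extension keeps the chain even.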

[0141] In another embodiment, fatty acid biosynthetic genes are expressed from D. vulgaris or D. desulfuricans including the 3-oxoacyl acyl carrier protein reductase (fabG) (Accession No. YP_011773.1), the acpP (Accession No. YP_011774.1), the beta ketoacyl ACP synthase (fabF) (Accession No. YP_011775.1), the beta-hydroxyacyl ACP dehydratase (FabA/Z like) (Accession No. YP_011772.1) to produce pimelate. In some embodiments, additional expression of the NC_002937 gene or thioesterase supports pimelate biosynthesis as an auxiliary protein based upon close proximity within the natural D. vulgaris gene cluster.

[0142] In another embodiment, E. coli genes are expressed in the native host or a heterologous host to produce pimeloyl-ACP and include bioC, fabF, fabG, fabA/Z, and fabI. In order to remove the methyl ester bonded to the omega carboxylate, some embodiments include the expression/overexpression of bioH; however, in some cases, the pimelate methyl ester is desired. In some embodiments, expression of a thioesterase increases the production of pimelate or methyl pimelate.

Listing of References

1. Mobley, D. Biosynthesis of long-chain dicarboxylic acid monomers from renewable resources. US Department of Energy Report (1999).

2. Picataggio, S. et al. Metabolic engineering of Candida tropicalis for the production of long-chain dicarboxylic acids. Biotechnology (N Y) 10, 894-8 (1992).

3. Watanabe, C. M. & Townsend, C. A. Initial characterization of a type I fatty acid synthase and polyketide synthase multienzyme complex NorS in the biosynthesis of aflatoxin B(1). Chem Biol 9, 981-8 (2002).

4. Yuan, L., Voelker, T. A. & Hawkins, D. J. Modification of the substrate specificity of an acyl-acyl carrier protein thioesterase by protein engineering. Proc Natl Acad Sci USA 92, 10639-43 (1995).

5. Dehesh, K., Jones, A., Knutzon, D. S. & Voelker, T. A. Production of high levels of 8:0 and 10:0 fatty acids in transgenic canola by overexpression of Ch FatB2, a thioesterase cDNA from Cuphea hookeriana. Plant J 9, 167-72 (1996).

6. Yuan, L. (Calgene, 12/14/2005).

7. S Schneider, M. W., D Sanglard, B Witholt. Production of alkanedioic acids by cytochrome P450 BM-3 monooxygenase: oxidation of 16-hydroxyhexadecanoic acid to hexadecane-1,16-dioic acid. Biocatalysis and Biotransformation 17, 163-178 (1999).

8. Oliver, C. F. et al. A single mutation in cytochrome P450 BM3 changes substrate orientation in a catalytic intermediate and the regiospecificity of hydroxylation. Biochemistry 36, 1567-72 (1997).

9. Craft, D. L., Madduri, K. M., Eshoo, M. & Wilson, C. R. Identification and characterization of the CYP52 family of Candida tropicalis ATCC 20336, important for the conversion of fatty acids and alkanes to alpha,omega-dicarboxylic acids. Appl Environ Microbiol 69, 5983-91 (2003).

10. Seghezzi, W. et al. Identification and characterization of additional members of the cytochrome P450 multigene family CYP52 of Candida tropicalis. DNA Cell Biol 11, 767-80 (1992).

11. Imai, Y. Characterization of rabbit liver cytochrome P-450 (laurate omega-1 hydroxylase) synthesized in transformed yeast cells. J Biochem 103, 143-8 (1988).

12. Hardwick, J. P. Cytochrome P450 omega hydroxylase (CYP4) function in fatty acid metabolism and metabolic diseases. Biochem Pharmacol 75, 2263-75 (2008).

13. Lee, S. H., Stephens, J. L., Paul, K. S. & Englund, P. T. Fatty acid synthesis by elongases in trypanosomes. Cell 126, 691-9 (2006).

14. Erdmann, R., Veenhuis, M., Mertens, D. & Kunau, W. H. Isolation of peroxisome-deficient mutants of Saccharomyces cerevisiae. Proc Natl Acad Sci USA 86, 5419-23 (1989).

15. Scharnewski, M., Pongdontri, P., Mora, G., Hoppert, M. & Fulda, M. Mutants of Saccharomyces cerevisiae deficient in acyl-CoA synthetases secrete fatty acids due to interrupted fatty acid recycling. Febs J 275, 2765-78 (2008).

16. Kamisaka, Y. et al. Identification of genes affecting lipid content using transposon mutagenesis in Saccharomyces cerevisiae. Biosci Biotechnol Biochem 70, 646-53 (2006).

17. Sandager, L. et al. Storage lipid synthesis is non-essential in yeast. J Biol Chem 277, 6478-82 (2002).

18. Cryle, M. J. & Schlichting, I. Structural insights from a P450 Carrier Protein complex reveal how specificity is achieved in the P450(BioI) ACP complex. Proc Natl Acad Sci USA 105, 15696-701 (2008).

19. Cryle, M. J., Matovic, N. J. & De Voss, J. J. Products of cytochrome P450(BioI) (CYP107H1)-catalyzed oxidation of fatty acids. Org Lett 5, 3341-4 (2003).

[0143] While the present invention has been described with reference to the specific embodiments thereof, it should be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, process, process step or steps, to the objective, spirit and scope of the present invention. All such modifications are intended to be within the scope of the claims appended hereto.

SEQUENCES

SEQ ID NO:1

1 gacgtcttaa gacccacttt cacatttaag ttgtttttct aatccgcata tgatcaattc 61 aaggccgaat aagaaggctg gctctgcacc ttggtgatca aataattcga tagcttgtcg

121 taataatggc ggcatactat cagtagtagg tgtttccctt tcttctttag cgacttgatg 181 ctcttgatct tccaatacgc aacctaaagt aaaatgcccc acagcgctga gtgcatataa 241 tgcattctct agtgaaaaac cttgttggca taaaaaggct aattgatttt cgagagtttc 301 atactgtttt tctgtaggcc gtgtacctaa atgtactttt gctccatcgc gatgacttag 361 taaagcacat ctaaaacttt tagcgttatt acgtaaaaaa tcttgccagc tttccccttc

421 taaagggcaa aagtgagtat ggtgcctatc taacatctca atggctaagg cgtcgagcaa 481 agcccgctta ttttttacat gccaatacaa tgtaggctgc tctacaccta gcttctgggc 541 gagtttacgg gttgttaaac cttcgattcc gacctcatta agcagctcta atgcgctgtt 601 aatcacttta cttttatcta atctagacat cattaattcc taatttttgt tgacactcta 661 tcgttgatag agttatttta ccactcccta tcagtgatag agaaaagaat tcaaaagatc

721 ttttaagCTC TAGGAGGCAT atgCATCATC ACCATCACCA Cgcaggtcat ggtgatgcta 781 ctgcacaaaa agctcaagat gcagaaaaat ctgaagacgg ttctgacgct atcgcagtta 841 tcggtatgtc ctgccgtttt cctggcgccc cgggcactgc tgaattttgg cagctgctgt 901 cctctggcgc agatgcagta gtgaccgcag ctgatggccg tcgtcgtggc accatcgacg 961 caccggcgga cttcgacgct gctttcttcg gtatgtcccc tcgcgaagcc gccgccaccg

1021 acccgcaaca gcgcctggtg ctggaactgg gctgggaagc gctggaagat gctggtatcg 1081 ttccggaatc cctgcgtggt gaggctgcct ccgtgttcgt aggcgcgatg aacgatgact 1141 acgcaactct gctgcatcgt gcaggcgctc cgactgacac ttacaccgca actggcctgc 1201 agcacagcat gatcgcgaac cgtctgtcct actttctggg cctgcgcggc ccgtccctgg 1261 tagtagacac cggccagagc agcagcctgg ttgcggtggc cctggcggtt gaatctctgc

1321 gtggtggtac ctctggtatc gccctggccg gtggtgtgaa cctggttctg gcggaagaag 1381 gcagcgcggc tatggaacgt gtaggcgcac tgtccccaga cggtcgctgc catacgtttg 1441 atgcgcgtgc gaacggttat gtacgcggcg aaggcggtgc tattgtcgtt ctgaagccgc 1501 tggccgacgc actggctgac ggcgatcgtg tttactgtgt tgtgcgcggc gtagcgacgg 1561 gcaacgacgg tggcggtccg ggcctgaccg taccagatcg tgcgggtcag gaggcggttc

1621 tgcgtgcagc ttgtgaccaa gcaggtgtac gtccggcgga tgtacgtttc gtggaactgc 1681 atggcactgg cacccctgcc ggcgatccgg ttgaagcaga agctctgggt gcagtctacg 1741 gcactggtcg tccggcgaac gaaccactgc tggtgggctc tgttaagacc aacatcggtc 1801 atctggaggg cgcagctggt atcgcgggct ttgtcaaagc ggctctgtgc ctgcacgagc 1861 gtgcgctgcc ggcgagcctg aacttcgaaa ccccgaaccc ggcgatcccg ctggaacgtc

1921 tgcgcctgaa agttcagact gcgcacgccg ctctgcagcc gggtacgggc ggtggcccgc 1981 tgctggcggg cgtgtccgct ttcggtatgg gcggtaccaa ctgccacgtc gtgctggaag 2041 agactccagg tggtcgtcag ccggccgaaa ccggtcaagc tgatgcgtgt ctgttcagcg 2101 cgagcccgat gctgctgctg agcgcccgtt ctgaacaagc gctgcgtgcg caggctgctc 2161 gtctgcgtga acacctggaa gatagcggcg ctgacccgct ggacatcgcg tactctctgg

2221 cgaccactcg cacccgtttc gaacatcgcg cggctgtacc gtgcggtgac ccggaccgtc 2281 tgagctctgc gctggcagcg ctggcagcgg gtcagacccc gcgtggcgtg cgcatcggtt 2341 ctaccgacgc tgatggtcgc ctggcgctgc tgttcactgg ccagggtgcg cagcatccgg 2401 gtatgggcca agagctgtat accaccgacc cgcacttcgc ggcagctctg gacgaagttt 2461 gcgaagaact gcagcgttgc ggtactcaaa acctgcgtga ggtgatgttc actccggatc

2521 aaccggacct gctggaccgt accgaatata cccagccggc cctgtttgct ctgcaaaccg 2581 cgctgtaccg taccctgacc gcgcgtggta cccaggctca tctggtgctg ggtcactccg 2641 ttggcgaaat caccgcagct cacattgcag gcgtcctgga cctgccggac gcagctcgtc 2701 tgatcactgc tcgcgcacat gtaatgggtc agctgcctca tggtggtgca atgctgtctg 2761 ttcaggcagc tgaacacgat ctggatcagc tggcgcatac tcacggcgtg gagatcgcgg

2821 ctgtgaacgg tccgacgcac tgcgttctgt ctggtccgcg taccgcgctg gaagaaaccg 2881 cacaacacct gcgcgagcag aacgtgcgtc acacctggct gaaagtttct cacgcgtttc 2941 actccgcact gatggatccg atgctgggtg cgtttcgtga taccctgaac actctgaact 3001 accagccgcc aactatcccg ctgatttcca acctgaccgg ccagattgct gacccgaatc 3061 acctgtgcac tccggattat tggatcgacc acgcgcgtca cactgtgcgt tttgctgacg

3121 cggttcaaac cgctcatcat caaggcacta ccacttacct ggaaattggt cctcatccga 3181 ctctgaccac cctgctgcat cataccctgg ataaccctac tactattccg accctgcacc 3241 gtgaacgccc ggagccggaa accctgactc aggcaatcgc agctgttggt gtgcgtaccg 3301 acggtatcga ttgggcggtt ctgtgcggcg cgtctcgtcc gcgtcgcgta gaactgccga 3361 cctacgcttt ccagcgtcgt actcattggg cccctggtct gacgccgaac catgcaccgg

3421 ctgaccgtcc ggctgctgaa ccacagcgcg cgatggctgt cggtccggtc agccgtgagg 3481 ccctggtgcg cctggttggt gaaaccaccg cgtccgtgct gggcctggac ggcccagacg 3541 aggtggcgct ggaccgtccg ttcacttctc aaggtctgga ctctatgacc gctgttgaac 3601 tggccggtct gctgggcacc gcggcaggcg ttgctctgga cccaactctg gtttacgaac 3661 tgccgacgcc tcgcgctgtc gcagatcacc tggcaaaaac tctgctgggt gagagcgctg 3721 cggacgcaga tcaggaagtg aacggccgta cgggtgaggc cgaagcgaaa gcgggtgacc

3781 cgatcgctgt aatcggcatc ggttgtcgct tcccaggtgg tgtggcgacc ccggacgatc 3841 tgtgggaact ggtagcgtcc ggcaccgatg ctatttccac tttcccaacc gaccgtggct 3901 gggacctgga cggtctgtac gacccggatc cgtctactcc aggtaagtcc tatgttcgcc 3961 atggcggctt tctgcacgat gcggcccagt ttgatgctga attctttggt atctccccgc 4021 gtgaagctac tgcgatggac ccgcaacaac gcctgctgct ggaaactagc tgggaggcac

4081 tggaacgcgc tggcgtagtt ccggagtccc tgcgcggtgg ccgcactggt gtttttgtgg 4141 gtactactgc tccggaatat ggcccacgtc tgcacgaagg cactgacggt tatgaaggct 4201 tcctgctgac cggtaccact gcatctgtgg ccagcggtcg tatcgcgtat gccctgggta 4261 cccgtggccc ggcgctgacc gttgacactg cgtgcagcag ctccctggtg gccctgcatc 4321 tggcggtaca atccctgcgt cgtggtgaat gcgacctggc gctggcaggt ggcaccaccg

4381 ttatgtccgg tccgggcatg ttcgttgagt tctcccgcca gcgtggtctg gctcctgacg 4441 gtcgttgcaa ggcattctct gcagacgcgg atggcaccgc ctgggccgaa ggtgttggca 4501 tgctgctggt cgaacgtctg tctgacgcgg aacgtctggg tcaccgcgtc ctggcagtcg 4561 tgcgtggcac tgcggtaaat caggacggtg cgtctaacgg tctgacggca ccgtctggcc 4621 ctgcccagca gcaggtgatc cgcgacgcgc tgagcgacgc gggcctgtcc gctgatgaca

4681 ttgatgcggt tgaagctcat ggtaccggca ccgcgctggg tgatccgatc gaagcgggtg 4741 ccctgctggc aacctatggc cacccgaaac gccagacgcc ggtttggctg ggttccctga 4801 aaagcaacat tggccacacg caggcggcgg ctggtattgc tggtatcatc aaaatggttc 4861 aggctctgcg tcacgacacc ctgccgcgta cgctgcacgc cgaccatccg tctagcaaag 4921 tggactggga tgccggccca ctgcagctgc tgaccgacgc tcgtccgtgg ccagcggacc

4981 cggatcgtcc gcgccgtgca ggtatcagcg ctttcggcgt ctctggtacc aacgcgcacg 5041 ttatcgtaga acagcctgcg ctggtggaat ctccagctgc ggaacctagc ggtcgtgaac 5101 caggcgttgt tccgctgcca ctgtctggta aaagcccgga agctctgcgt gaccaggcag 5161 ctcgcctgct ggcgggtctg gccgaacgcc cggctctgcg tccgctggat ctgggctatt 5221 ctctggctac cacgcgtagc gcgttcgatc atcgtgcagt agtactggcg acggaccgtg

5281 ccgacgctgt ccgcgcgctg actgctctgg ctgcagcgga tgccgacctg tctgccgtgg 5341 tcggtgatac gcgtactggt cgccacgcgg tcctgttttc cggtcagggt tcccagcgtc 5401 tgggcatggg tcgtgagctg tacgaacgct ttcctgtgtt tgctgaagct ctggacgtgg 5461 cgatcgacca cctggatgcc gcactgccgg cgcaagcctc tctgcgcgaa gttatgtggg 5521 gtgatgatgt tgaactgctg gacgaaaccg gttggacgca gccggcgctg ttcgcggtgg

5581 aagtggctct gttccgtctg gttgaaagct ggggcgtccg tccggacttc gttgcgggcc 5641 actctattgg cgaaatcgca gctgcccacg tggttggtgt cttcagcctg gaagatgcgt 5701 gccgtctggt agccgcacgt gcgaccctga tgcaggcgct gccgaccggt ggtgctatga 5761 tcgcaattca ggcggctgaa gatgaagtta cccagcatct gactgacgat gtttccatcg 5821 cggccgttaa cggtccgact tctgttgtgg tttctggtgc tgaatctgct gcgcgcaccg

5881 tagcagaccg tctggctgaa aacggccgta aaaccactcg cctgcgtgtc tcccacgcat 5941 ttcactctcc gctgatggac ccgatgctgg ccgaattccg tgctgttgct gaaggtctgt 6001 cctatgctac cccgaccctg ccagtagtgt ctaacctgac tggccgtctg gcgactgcag 6061 atgatctgtg ttccgctgaa tactgggccc gtcacgtacg cgaggccgtg cgtttcgcgg 6121 atggcgtttc taccctggaa aatgaaggcg ttaccacctt cctggaactg ggtccggacg

6181 gcgtactgag cgcaatggca cagcaatccc tgaccggtga cgccgctacc gttccggcgc 6241 tgcgtaaaga ccgtgatgag gaaacctctg ctctgaccgc gctggcacat ctgcacaccg 6301 cgggtctgcg cgttgactgg gctgcgttct tcgcgggttc cggtgcaact cgcgtggatc 6361 tgccgacgta cgcttttcag cacgcgacct attggccgac cggcacgctg ccgactgccc 6421 acgccgctgc ggttggtctg acggctgctg agcatccgct gctgaatggt agcgttgagc

6481 tggctgaggg cgagggcgta ctgttcactg gtcgcctgtc cctgcaaagc catccatggc 6541 tggcagacca cgcggtcatg ggtcaggttc tgctgcctgg cactgctctg ctggagctgg 6601 ctttccgtgc gggcgacgaa gcgggctgcg accgtgtaga agaactgacc ctggcggctc 6661 cgctggtcct gcctgaacgt ggtgccgttc agacccaggt gcgtgttggc gttgcagacg 6721 acaccggccg tcgcaccgtg accgtacact ctcgcccaga gcatgcgacc gacgtaagct

6781 ggactcagca cgcaaccggt accctgacca tgggttctgc gccggcagac actggtttcg 6841 acgcgaccgc ttggccgccg gcagacgctg aaccgctggc taccgatgat tgctacgcgc 6901 gctttactac cctgggtttc gcgtatggcc cggtgttcca gggcctgcgc gcagcttggc 6961 gtgccggtga tgtcctgtat gccgaagttg ctctggcaga gtccaccggt gatgaagcga 7021 ctgcgtttgg tctgcacccg gccctgctgg acgccgcgct gcacgcgtcc ctggttgcgc

7081 acgaaggtga agagagcaac ggcggcctgc ctttttcttg ggaaggtgcc accctgtatg 7141 ctactggcgc aaccgcactg cgcgtgcgcc tgaccccaac cggtaccgat ggccgttccg 7201 ttgcaatcgc ggttgctgat accgcaggcc gtccagtagc ggctatcgac aacctggtgt 7261 cccgccgtgt gagcggtgac cagctgaccg gtgctgcagg tctggctcgt gacgcgctgt 7321 ttactctgga ctggaacccg gttccggaaa acctggttcc agaaaacccg gtcccagaaa 7381 acaccggtgg cggccacgca caagaccagg acggtcgtcc ggctgcggca actgttgcgc

7441 tggtcggcgc ggatggtacg gcaattgctg ctgatctgac cgctgcgggt atccacacta 7501 ccctgcaccc ggacctgacc actctggcca ctacggacgc ggatgtgcct aagaccgttc 7561 tgatcccgct gactggcacc ggcaccggca ctggtacggg tacggagagc actgacggta 7621 tcggtactgg tgctgcggaa tccgatgcgt ctgcgccgag ccctgccgaa gtcgctcaca 7681 ccctgagcac cgcagcgctg gcgctggtgc aggaatggac tgcccaagaa cgtttcgccg

7741 gcagccgcct ggcgtttgtc accactggcg cgaccgcagc gggtggtact gacgtaatgg 7801 acgtggctgc ggctgcggtc tggggtctgg tacgcagcgc tcagtctgaa gccccggata 7861 cctttgtcct gatcgatcgt gatccgggcc ctgctggcac tcatgaccgc actgcggcag 7921 ccgaacgtgg tcagctgctg ctgcgcgcac tgcacactga cgagccgcag ctggcgctgc 7981 gtgacggtgg cgtgctggct gcgcgtctgg cgcgcttcga taccgctgcc gcgctgaccc

8041 ctccggcgga ccgcgcttgg cgtctggatt ccacggctaa aggtagcctg aacggtctgg 8101 cactgacccc gtatccagcc gcgctggccc cgctgaccgg tcacgaagtt cgtgttgaag 8161 ttcgtgctgc gggcctgaac ttccgcgatg ttctgaacgc cctgggtatg taccctggtg 8221 atgacgtggg ttctttcggt tccgaagcag ctggcgtcgt tgtggaggtg ggcccggagg 8281 ttacgggcct ggccccaggt gatcaggtga tgggcatgat cactggtagc ttcggttctc

8341 tggccgtaga cgacgctcgc cgcctggcgc gtctgccgga agactggagc tgggaaaccg 8401 gcgcttccgt accgctggta ttcctgaccg cgtattacgc cctgaaagaa ctgggtggtc 8461 tgcgtgcagg tgaaaaagta ctggtacacg ctggtgctgg cggtgtgggc atggcggcga 8521 tccagattgc acgtcacgtc ggcgcagaag ttttcgccac cgccagcgaa ggtaaatggg 8581 acgttctgcg ctctctgggt gttgctgatg atcacatcgc ctcctctcgt actctggatt

8641 ttgaagccgc ttttgcagaa gtcgccggtg atcgtggcct ggatgttgtg ctgaactctc 8701 tggccggcga cttcgttgat gcctccatgc gcctgctggg tgacggtggc cgttttctgg 8761 agatgggtaa aactgacatc cgtgcggcag atagcgttcc ggacggtctg tcttaccaat 8821 ccttcgacct ggcttgggtc gtgccggaaa ccattggcac catgctggct gaactgatgg 8881 atctgttccg taccggtgcg ctgcgcccgc tgccggtacg cacgtgggac gtgcgccacg

8941 cgaaagatgc cttccgtttc atgtccatgg ccaaacatat cggtaaaatc gtactgactc 9001 tgccgcgttc ctggaagccg gaaggcaccg tactggtgac cggcggtact ggtggcctgg 9061 gtggcctggt ggcacgtcac ctggttcgca gctgcggcgt ccgtcacctg ctgctgacct 9121 ctcgtagcgg cgttggtgct gcgggtgccg cgggtctggt tgcagaactg gaatctctgg 9181 gcgcacgcgt agttgtcgcg gcgtgtgacg ttggcgacgg tagcgcagtt gcggagctgg

9241 tagctggtgt tagcgaaagc tatcctctgt ctgcagttgt acatgcggcg ggtgttctgg 9301 acgacggtgt tgtaggttcc ctgaccccgg aacgcctggc cgctgttctg cgcccgaagg 9361 tcgacggtgc atggaacctg cacgaagcta cccgtggcct ggacctggac gcattcgtgg 9421 tcttctcctc tgtggcaggc gtttttggcg gtgcaggtca ggcgaactac gccgctggta 9481 acgcttttct ggatgccctg atggttcacc gtgtagcggg tggtctgcca ggcgtatctc

9541 tggcttgggg cgcgtgggat caaggcgttg gcatgactgc aggtctgacc gagcgtgacg 9601 tacgccgcgc agcagagtcc ggcatgccgc tgctgactgt tgatcaaggt gtcgctctgt 9661 tcgacgccgc actggcgacg ggcagcgctg ccctggtgcc ggtacgtctg gatctggcgg 9721 ccctgcgtac ccgtggtgac atcgctccgc tgctgcgtgg tctggtaaag gcaccaatcc 9781 gtcgtgccgc tgccaccacc ccgggtgata ccggcctggc agaacagctg acccgtctgc

9841 agcgcgccga acgtcgtgac accctgctgg cactggtgcg cgaccaggca gcgatggtgc 9901 tgggccacac gagcggcgat ggcgtggacc cgtctcgcgc gtttcgtgac ctgggctttg 9961 attctctgac cgctgtggag ctgcgtaacc gtatcggtgc tgctaccggt ctgcgtctgc 10021 cggctaccgc agtattcgat tacccgaccg ctgacgcact ggcagcacac ctgctgaccg 10081 aactggacag cggtacccca gctcgcgaag cgtctagcgc actgcgtgat ggctaccgtc

10141 aggcaggtgt gtccggccgt gttcgttctt acctggacct gctggcaggt ctgtctgatt 10201 tccgcgaaca cttcgacggt tctgatggtt tctctctgga tctggttgat atggcagatg 10261 gtccgggtga agttactgta atctgttgtg cgggcaccgc tgcgatcagc ggtccacacg 10321 aatttactcg tctggcaggt gctctgcgtg gcattgctcc tgttcgtgcc gttccgcagc 10381 cgggctacga ggagggtgaa ccgctgccgt cctccatggc ggcggtagcc gccgttcagg

10441 ctgatgcggt gatccgtacc caaggcgata aaccttttgt cgttgcaggc cacagcgctg 10501 gtgctctgat ggcttacgcc ctggccaccg aactgctgga tcgtggtcac cctccgcgtg 10561 gtgttgtact gatcgatgtt tacccaccgg gtcatcagga cgccatgaac gcatggctgg 10621 aggaactgac cgcgactctg ttcgaccgtg agactgttcg catggatgat acgcgtctga 10681 ctgcactggg cgcatatgac cgcctgactg gtcagtggcg tccgcgtgaa accggtctgc

10741 cgaccctgct ggtgtctgct ggcgaaccga tgggtccgtg gccggacgac tcctggaaac 10801 ctacctggcc ttttgaacat gacactgtgg cagttccggg cgatcacttc actatggttc

10861 aggaacacgc agatgcgatc gcacgtcaca tcgacgcttg gctgggtggc ggtaactctt

10921 aatgagatcc ggctgctaac aaagcccgaa aggaagctga gttggctgct gccaccgctg

10981 agcaataacc ctagggtacg ggttttgctg cccgcaaacg ggctgttctg gtgttgctag

11041 tttgttatca gaatcgcaga tccggcttca gccggtttgc cggctgaaag cgctatttct

11101 tccagaattg ccatgatttt ttccccacgg gaggcgtcac tggctcccgt gttgtcggca

11161 gctttgattc gataagcagc atcgcctgtt tcaggctgtc tatgtgtgac tgttgagctg

11221 taacaagttg tctcaggtgt tcaatttcat gttctagttg ctttgtttta ctggtttcac

11281 ctgttctatt aggtgttaca tgctgttcat ctgttacatt gtcgatctgt tcatggtgaa

11341 cagctttgaa tgcaccaaaa actcgtaaaa gctctgatgt atctatcttt tttacaccgt

11401 tttcatctgt gcatatggac agttttccct ttgatatgta acggtgaaca gttgttctac

11461 ttttgtttgt tagtcttgat gcttcactga tagatacaag agccataaga acctcagatc

11521 cttccgtatt tagccagtat gttctctagt gtggttcgtt gtttttgcgt gagccatgag

11581 aacgaaccat tgagatcata cttactttgc atgtcactca aaaattttgc ctcaaaactg

11641 gtgagctgaa tttttgcagt taaagcatcg tgtagtgttt ttcttagtcc gttatgtagg

11701 taggaatctg atgtaatggt tgttggtatt ttgtcaccat tcatttttat ctggttgttc

11761 tcaagttcgg ttacgagatc catttgtcta tctagttcaa cttggaaaat caacgtatca

11821 gtcgggcggc ctcgcttatc aaccaccaat ttcatattgc tgtaagtgtt taaatcttta

11881 cttattggtt tcaaaaccca ttggttaagc cttttaaact catggtagtt attttcaagc

11941 attaacatga acttaaattc atcaaggcta atctctatat ttgccttgtg agttttcttt

12001 tgtgttagtt cttttaataa ccactcataa atcctcatag agtatttgtt ttcaaaagac

12061 ttaacatgtt ccagattata ttttatgaat ttttttaact ggaaaagata aggcaatatc

12121 tcttcactaa aaactaattc taatttttcg cttgagaact tggcatagtt tgtccactgg

12181 aaaatctcaa agcctttaac caaaggattc ctgatttcca cagttctcgt catcagctct

12241 ctggttgctt tagctaatac accataagca ttttccctac tgatgttcat catctgagcg

12301 tattggttat aagtgaacga taccgtccgt tctttccttg tagggttttc aatcgtgggg

12361 ttgagtagtg ccacacagca taaaattagc ttggtttcat gctccgttaa gtcatagcga

12421 ctaatcgcta gttcatttgc tttgaaaaca actaattcag acatacatct caattggtct

12481 aggtgatttt aatcactata ccaattgaga tgggctagtc aatgataatt actagtcctt

12541 ttcccgggtg atctgggtat ctgtaaattc tgctagacct ttgctggaaa acttgtaaat

12601 tctgctagac cctctgtaaa ttccgctaga cctttgtgtg ttttttttgt ttatattcaa

12661 gtggttataa tttatagaat aaagaaagaa taaaaaaaga taaaaagaat agatcccagc

12721 cctgtgtata actcactact ttagtcagtt ccgcagtatt acaaaaggat gtcgcaaacg

12781 ctgtttgctc ctctacaaaa cagaccttaa aaccctaaag gcttaagtag caccctcgca

12841 agctcgggca aatcgctgaa tattcctttt gtctccgacc atcaggcacc tgagtcgctg

12901 tctttttcgt gacattcagt tcgctgcgct cacggctctg gcagtgaatg ggggtaaatg

12961 gcactacagg cgccttttat ggattcatgc aaggaaacta cccataatac aagaaaagcc

13021 cgtcacgggc ttctcagggc gttttatggc gggtctgcta tgtggtgcta tctgactttt

13081 tgctgttcag cagttcctgc cctctgattt tccagtctga ccacttcgga ttatcccgtg

13141 acaggtcatt cagactggct aatgcaccca gtaaggcagc ggtatcatca acaggcttac

13201 ccgtcttact gtccctagtg cttggattct caccaataaa aaacgcccgg cggcaaccga

13261 gcgttctgaa caaatccaga tggagttctg aggtcattac tggatctatc aacaggagtc

13321 caagcgagct ctcgaacccc agagtcccgc tcagaagaac tcgtcaagaa ggcgatagaa

13381 ggcgatgcgc tgcgaatcgg gagcggcgat accgtaaagc acgaggaagc ggtcagccca

13441 ttcgccgcca agctcttcag caatatcacg ggtagccaac gctatgtcct gatagcggtc

13501 cgccacaccc agccggccac agtcgatgaa tccagaaaag cggccatttt ccaccatgat

13561 attcggcaag caggcatcgc catgggtcac gacgagatcc tcgccgtcgg gcatgcgcgc

13621 cttgagcctg gcgaacagtt cggctggcgc gagcccctga tgctcttcgt ccagatcatc

13681 ctgatcgaca agaccggctt ccatccgagt acgtgctcgc tcgatgcgat gtttcgcttg

13741 gtggtcgaat gggcaggtag ccggatcaag cgtatgcagc cgccgcattg catcagccat

13801 gatggatact ttctcggcag gagcaaggtg agatgacagg agatcctgcc ccggcacttc

13861 gcccaatagc agccagtccc ttcccgcttc agtgacaacg tcgagcacag ctgcgcaagg

13921 aacgcccgtc gtggccagcc acgatagccg cgctgcctcg tcctgcagtt cattcagggc

13981 accggacagg tcggtcttga caaaaagaac cgggcgcccc tgcgctgaca gccggaacac

14041 ggcggcatca gagcagccga ttgtctgttg tgcccagtca tagccgaata gcctctccac

14101 ccaagcggcc ggagaacctg cgtgcaatcc atcttgttca atcatgcgaa acgatcctca

14161 tcctgtctct tgatcagatc atgatcccct gcgccatcag atccttggcg gcaagaaagc

14221 catccagttt actttgcagg gcttcccaac cttaccagag ggcgccccag ctggcaattc

14281 c

SEQ ID NO: 2 LOCUS pBbA7c_HEX_ORF_l_h 12876 bp ds-DNA circular

1 gacgtcctca ctgcccgctt tccagtcggg aaacctgtcg tgccagctgc attaatgaat 61 cggccaacgc gcggggagag gcggtttgcg tattgggcgc cagggtggtt tttcttttca 121 ccagtgagac gggcaacagc tgattgccct tcaccgcctg gccctgagag agttgcagca 181 agcggtccac gctggtttgc cccagcaggc gaaaatcctg tttgatggtg gttaacggcg

241 ggatataaca tgagctgtct tcggtatcgt cgtatcccac taccgagatg tccgcaccaa 301 cgcgcagccc ggactcggta atggcgcgca ttgcgcccag cgccatctga tcgttggcaa 361 ccagcatcgc agtgggaacg atgccctcat tcagcatttg catggtttgt tgaaaaccgg 421 acatggcact ccagtcgcct tcccgttccg ctatcggctg aatttgattg cgagtgagat 481 atttatgcca gccagccaga cgcagacgcg ccgagacaga acttaatggg cccgctaaca

541 gcgcgatttg ctggtgaccc aatgcgacca gatgctccac gcccagtcgc gtaccgtctt 601 catgggagaa aataatactg ttgatgggtg tctggtcaga gacatcaaga aataacgccg 661 gaacattagt gcaggcagct tccacagcaa tggcatcctg gtcatccagc ggatagttaa 721 tgatcagccc actgacgcgt tgcgcgagaa gattgtgcac cgccgcttta caggcttcga 781 cgccgcttcg ttctaccatc gacaccacca cgctggcacc cagttgatcg gcgcgagatt

841 taatcgccgc gacaatttgc gacggcgcgt gcagggccag actggaggtg gcaacgccaa 901 tcagcaacga ctgtttgccc gccagttgtt gtgccacgcg gttgggaatg taattcagct 961 ccgccatcgc cgcttccact ttttcccgcg ttttcgcaga aacgtggctg gcctggttca 1021 ccacgcggga aacggtctga taagagacac cggcatactc tgcgacatcg tataacgtta 1081 ctggtttcac attcaccacc ctgaattgac tctcttccgg gcgctatcat gccataccgc

1141 gaaaggtttt gcgccattcg atggtgtccg ggatctcgac gctctccctt atgcgactcc 1201 tgcattagga agcagcccag tagtaggttg aggccgttga gcaccgccgc cgcaaggaat 1261 ggtgcatgca aggagatggc gcccaacagt cccccggcca cggggcctgc caccataccc 1321 acgccgaaac aagcgctcat gagcccgaag tggcgagccc gatcttcccc atcggtgatg 1381 tcggcgatat aggcgccagc aaccgcacct gtggcgccgg tgatgccggc cacgatgcgt

1441 ccggcgtaga ggatcgagat cgatctcgat cccgcgaaat taatacgact cactataggg 1501 gaattgtgag cggataacaa tttcagaatt caaaagatct aggaggcata tgCATCATCA 1561 CCATCACCAC gcaggtcatg gtgatgctac tgcacaaaaa gctcaagatg cagaaaaatc 1621 tgaagacggt tctgacgcta tcgcagttat cggtatgtcc tgccgttttc ctggcgcccc 1681 gggcactgct gaattctggc agctgctgag ctctggcgca gatgcagtag tgaccgcagc

1741 tgatggccgt cgtcgtggca ccatcgacgc accggcggac ttcgacgctg ctttcttcgg 1801 tatgtcccct cgcgaagccg ccgccaccga cccgcaacag cgcctggtgc tggaactggg 1861 ctgggaagcg ctggaagatg ctggtatcgt tccggaatcc ctgcgtggtg aggctgcctc 1921 cgtgttcgta ggcgcgatga acgatgacta cgcaactctg ctgcatcgtg caggcgctcc 1981 gactgacact tacaccgcaa ctggcctgca gcacagcatg atcgcgaacc gtctgtccta

2041 ctttctgggc ctgcgcggcc cgtccctggt agtagacacc ggcCAGagca gcagcctggt 2101 tgcggtggcc ctggcggttg aatctctgcg tggtggtacc tctggtatcg ccctggccgg 2161 tggtgtgaac ctggttctgg cggaagaagg cagcgcggct atggaacgtg taggcgcact 2221 gtccccagac ggtcgctgcc atacgtttga tgcgcgtgcg aacggttatg tacgcggcga 2281 aggcggtgct attgtcgttc tgaagccgct ggccgacgca ctggctgacg gcgatcgtgt

2341 ttactgtgtt gtgcgcggcg tagcgacggg caacgacggt ggcggtccgg gcctgaccgt 2401 accagatcgt gcgggtcagg aggcggttct gcgtgcagct tgtgaccaag caggtgtacg 2461 tccggcggat gtacgtttcg tggaactgCA Tggcactggc acccctgccg gcgatccggt 2521 tgaagcagaa gctctgggtg cagtctacgg cactggtcgt ccggcgaacg aaccactgct 2581 ggtgggctct gttaagacca acatcggtCA Tctggagggc gcagctggta tcgcgggctt

2641 tgtcaaagcg gctctgtgcc tgcacgagcg tgcgctgccg gcgagcctga acttcgaaac 2701 cccgaacccg gcgatcccgc tggaacgtct gcgcctgaaa gttcagactg cgcacgccgc 2761 tctgcagccg ggtacgggcg gtggcccgct gctggcgggc gtgtccgctt tcggtatggg 2821 cggtaccaac tgccacgtcg tgctggaaga gactccaggt ggtcgtcagc cggccgaaac 2881 cggtcaagct gatgcgtgtc tgttcagcgc gagcccgatg ctgctgctga gcgcccgttc

2941 tgaacaagcg ctgcgtgcgc aggctgctcg tctgcgtgaa cacctggaag atagcggcgc 3001 tgacccgctg gacatcgcgt actctctggc gaccactcgc acccgtttcg aacatcgcgc 3061 ggctgtaccg tgcggtgacc cggaccgtct gagctctgcg ctggcagcgc tggcagcggg 3121 tcagaccccg cgtggcgtgc gcatcggttc taccgacgct gatggtcgcc tggcgctgct 3181 gttcactggc cagggtgcgc agcatccggg tatgggccaa gagctgtata ccaccgaccc

3241 gcacttcgcg gcagctctgg acgaagtttg cgaagaactg cagcgttgcg gtactcaaaa 3301 cctgcgtgag gtgatgttca ctccggatca accggacctg ctggaccgta ccgaatatac 3361 ccagccggcc ctgtttgctc tgcaaaccgc gctgtaccgt accctgaccg cgcgtggtac 3421 ccaggctcat ctggtgctgg gtcactccgt tggcgaaatc accgcagctc acattgcagg 3481 cgtcctggac ctgccggacg cagctcgtct gatcactgct cgcgcacatg taatgggtca

3541 gctgcctcat ggtggtgcaa tgctgtctgt tcaggcagct gaacacgatc tggatcagct 3601 ggcgcatact cacggcgtgg agatcgcggc tgtgaacggt ccgacgcact gcgttctgtc 3661 tggtccgcgt accgcgctgg aagaaaccgc acaacacctg cgcgagcaga acgtgcgtca 3721 cacctggctg aaagtttctc acgcgtttca ctccgcactg atggacccga tgctgggtgc 3781 gtttcgtgat accctgaaca ctctgaacta ccagccgcca actatcccgc tgatttccaa 3841 cctgaccggc cagattgctg acccgaatca cctgtgcact ccggattatt ggatcgacca

3901 cgcgcgtcac actgtgcgtt ttgctgacgc ggttcaaacc gctcatcatc aaggcactac 3961 cacttacctg gaaattggtc ctcatccgac tctgaccacc ctgctgcatc ataccctgga 4021 taaccctact actattccga ccctgcaccg tgaacgcccg gagccggaaa ccctgactca 4081 ggcaatcgca gctgttggtg tgcgtaccga cggtatcgat tgggcggttc tgtgcggcgc 4141 gtctcgtccg cgtcgcgtag aactgccgac ctacgctttc cagcgtcgta ctcattgggc

4201 ccctggtctg acgccgaacc atgcaccggc tgaccgtccg gctgctgaac cacagcgcgc 4261 gatggctgtc ggtccggtca gccgtgaggc cctggtgcgc ctggttggtg aaaccaccgc 4321 gtccgtgctg ggcctggacg gcccagacga ggtggcgctg gaccgtccgt tcacttctca 4381 aggtctggac tctatgaccg ctgttgaact ggccggtctg ctgggcaccg cggcaggcgt 4441 tgctctggac ccaactctgg tttacgaact gccgacgcct cgcgctgtcg cagatcacct

4501 ggcaaaaact ctgctgggtg agagcgctgc ggacgcagat caggaagtga acggccgtac 4561 gggtgaggcc gaagcgaaag cgggtgaccc gatcgctgta atcggcatcg gttgtcgctt 4621 cccaggtggt gtggcgaccc cggacgatct gtgggaactg gtagcgtccg gcaccgatgc 4681 tatttccact ttcccaaccg accgtggctg ggatctggac ggtctgtacg acccggaccc 4741 gtctactcca ggtaagtcct atgttcgcca tggcggcttt ctgcacgatg cggcccagtt

4801 tgatgctgag ttctttggta tctccccgcg tgaagctact gcgatggacc cgcaacaacg 4861 cctgctgctg gaaactagct gggaggcact ggaacgcgct ggcgtagttc cggagtccct 4921 gcgcggtggc cgcactggtg tttttgtggg tactactgct ccggaatatg gcccacgtct 4981 gcacgaaggc actgacggtt atgaaggctt cctgctgacc ggtaccactg catctgtggc 5041 cagcggtcgt atcgcgtatg ccctgggtac ccgtggcccg gcgctgaccg ttgacactgc

5101 gtgcagcagc tccctggtgg ccctgcatct ggcggtacaa tccctgcgtc gtggtgaatg 5161 cgacctggcg ctggcaggtg gcaccaccgt tatgtccggt ccgggcatgt tcgttgagtt 5221 ctcccgccag cgtggtctgg ctcctgacgg tcgttgcaag gcattctctg cagacgcgga 5281 tggcaccgcc tgggccgaag gtgttggcat gctgctggtc gaacgtctgt ctgacgcgga 5341 acgtctgggt caccgcgtcc tggcagtcgt gcgtggcact gcggtaaatc aggacggtgc

5401 gtctaacggt ctgacggcac cgtctggccc tgcccagcag caggtgatcc gcgacgcgct 5461 gagcgacgcg ggcctgtccg ctgatgacat tgatgcggtt gaagctcatg gtaccggcac 5521 cgcgctgggt gatccgatcg aagcgggtgc cctgctggca acctatggcc acccgaaacg 5581 ccagacgccg gtttggctgg gttccctgaa aagcaacatt ggccacacgc aggcggcggc 5641 tggtattgct ggtatcatca aaatggttca ggctctgcgt cacgacaccc tgccgcgtac

5701 gctgcacgcc gaccatccgt ctagcaaagt ggactgggat gccggcccac tgcagctgct 5761 gaccgacgct cgtccgtggc cagcggaccc ggatcgtccg cgccgtgcag gtatcagcgc 5821 tttcggcgtc tctggcacca acgcgcatgt catcgtggaa gaggcaccgg agtcttctgc 5881 tgacgcggtc gcggaaagcg gtgtgcgtgt cccggttccg gtagttcctt gggtagtttc 5941 tgcgcgtagc gcagaaggtc tggctgctca ggctgaacgc ctggcacgtt tcgtgggcga

6001 acgttctgat caggacccgg ttgatatcgg tttctccctg gtgcgctccc gttctctgct 6061 ggaacatcgc gcagttgtgc tgggtaaggg ccgtgatgac ctggtggcag gtctggcgtc 6121 cctggcgtcc gacggtagcg ctactggtgt agtgagcggt gtggcacgtg gtcgcgcacg 6181 cgtggctttc ggcttttctg gccagggtgc acaacgtgtt ggtatgggcg ctgaactggc 6241 ttccgtttac ccagtattcg cagaagctct ggctgaggtg accggcgctc tgggcctgga

6301 cccggaagtt tttggcgacg ttgaccgcct gggccgtacg gaagtaaccc aggctgcgct 6361 gtttgcgttc gaggtcgctg tcgtacgtct gctggaatcc tttggtgtac gcccggatgt 6421 actgattggc cactccatcg gcgaaattgc ggccgcttac gtcgctggcg tgttctctct 6481 gggtgacgca gccgccctgg taggcgcccg tggtcgtctg atgcaagcac tgccggctgg 6541 cggcgttatg gttgcggtac aggccggtga agcggaagtt gttgctgctc tggaaggctt

6601 cgcggatcgt gttagcctgg cggccgttaa cggtccgtcc agcgtcgtgg tttccggtga 6661 ggcagaggcc gtagaacagg ttgttgcacg cctgggtaag gttaaatcca agcgtctgcg 6721 tgtgagccac gcgttccact ccccgctgat ggagccaatg ctggctgact tccgccaggt 6781 cgccgaacaa atcacctaca acgaaccgca gctgccggtt gtgagcaacg tctctggtcg 6841 cctggcagag ccaggcgagc tgacgacgcc agactactgg gtgcgccacg tacgtgaagc

6901 ggttcgtttc ggtgacggtg tacgcgctct ggctgcagat ggtgtgggcg tgctggttga 6961 ggttggcccg gattccgttc tgactgcact ggcacgtgaa agcctggatg gtgaggatgg 7021 cctgcgcgca gttccgctgc tgcgtaaaga tcgcccggag ccggaaaccc tgctgaccgg 7081 tgttgcgcag gcattcactc acggcgttca agtggattgg ccggcactgc tgccgggtgg 7141 tcgccgtgtg gaactgccga cttacgcgtt ccagcgtcgt cgctactggc tggaagatgc

7201 tgacccgacc ggtggtgacc cggctgccct gggcctgact gcagcagacc acccactgct 7261 gggtgcggcg gtgccgctgg ctgaagacca gggtatcgtt atcacttccc gcctgtctct 7321 gcgtactcat ccgtggctgg cagaccacga aatcggtggc actgttctgc tgccgggtgc 7381 tggcctggtt gaaatcgcgc tgcgtgcagg cgacgaagta ggctgcggcc gtgtagaaga 7441 actgaccctg gaaattccgc tggttgtgcc gcaggagggt ggtgtaacgg ttcagatccg 7501 tgtaggcgcg cctgatgaaa gcggttggcg tccaatgacc gtacactctc gcactgaccc

7561 tgaggaagaa tggacccgtc acgttagcgg cgtgctgtct ccggacgtgc ctaccgaacg 7621 ttacgacctg ggcgcgtggc cgcctgctgg cgcgaccccg gttgaactgg acggtttcta 7681 cgaagcgtat gctcgtctgg gttacgcgta tggcccgtct tttcagggcc tgcgtgcggc 7741 gtggcgtcgt ggcgatgagg ttttcgcaga agtatctctg ccagttgagg aacaggaaac 7801 cgcgggtcgc ttcaccctgc acccggctct gctggatgcg gcactgcaga gcgcgggtgc

7861 aggtgcattc ttcgactccg gcggtagcat gcgtctgccg ttcgcctggt ccggtgtttc 7921 tgtgtttgcg gccggtgcgt ctaccgtccg tgtgcgtctg tctccggctg gtccggatgc 7981 ggttactgta gcgctggcgg acccgacggg tgcgccagtg gcgctggttg aacgtctgct 8041 gatcccggaa atgagcccgg agcaactgga acgcgttcgt ggtgaagaaa aagaagcgcc 8101 gtatgttctg gactgggttc cggtggaagt tccggctgac gacctggttc gcccggaacg

8161 ctggaccctg ctgggtggtg ctgatgcagg cgtaggcctg gatgttgctg gtgcattcgc 8221 gagcctggag ccatccgacg gcgctccgga atttgttgtt ctgccgtgtg tgccgccgac 8281 tagcccaacc cgcgctgcgg acgttcgcca gtctaccctg caggcgctga ctgtcctgca 8341 aaattgggtg accgacgagc gccatgccga tagccgcctg gtgctggtta cccgtcgcgc 8401 ggttggcgtg ggtgcccacg acgatgttcc ggacctgacc catgcggctc tgtggggcct

8461 ggtgcgtagc gcgcagaccg aaaacccagg ccgtttcctg ctggtagacc tggatgaagg 8521 tgcggaactg gccgaggttc tgccaggtgc tctgggtagc ggcgaatctc aggttgctgt 8581 acgtgctggt cgtgtgctgg cggctcgtct ggcccgtagc ggtagcggtg gcgcagaact 8641 ggtgccgcct gcaggcgcac cgtggcgtct ggatactacc tctccgggta ccctggagaa 8701 cctggcgctg gtaccgagcg cagaagagcc tctgggtccg ctggacgttc gtgtttccgt

8761 gcgtgcggct ggcctgaact ttcgcgatgt tctgatcgcg ctgggcatgt acccaggtga 8821 cgctcgcatg ggtggcgaag gtgccggcgt tgtaaccgat gtaggttctg aagtaactac 8881 cctggcaccg ggtgaccgtg taatgggtat gctgtcttct gccttcggcc cgaccgccgt 8941 atctgatcat cgtgcgctgg ttcgcgttcc agacgactgg tcttttgaac aggctgcttc 9001 cgttccaacg gttttcgcaa ccgcatacta tggcctggtc gacctggccg agctgcgtgc

9061 gggccagtct gttctggttc acgccgctgc gggtggcgtt ggtatggctg cagtgcagct 9121 ggctcgccat ctgggtgccg aagtttttgg tactgcgagc accggcaaat gggactctct 9181 gcgcgcaggt ggtctggatg ccgaacacat cgcgtccagc cgcaccgttg aatttgaaga 9241 aaccttcctg gcagccaccg ctggccgtgg tgttgatgtg gtcctggatt ccctggcggg 9301 cgagttcgta gacgcctctc tgcgcctgct gccgcgtggc ggtcgttttg tcgagatggg

9361 caaagcagat attcgtgatg cagagcgcgt cgcagctgat catccgggtg tgacctaccg 9421 ttccttcgat ctgctggaag cgggtctgga tcgcttccaa gaaatcctga ctgaagttgt 9481 gcgtctgttc gaacgtggcg tactgcgtca cctgccggtg acggcgtggg atgttcgtcg 9541 tgccgcagaa gcgttccgtt tcgttagcca ggcccgtcac gttggcaaaa acgtactggt 9601 tatgccgcgt gtttgggatc gtgacggtac ggtgctgatc actggcggta ccggtgcgct

9661 gggcgctctg gttgcacgcc atctggttgc agaacacggc atgcgcaacg tgctgctggc 9721 cggtcgtcgt ggcgttgacg ctccgggtgc ccgtgagctg ctggcagaac tggaaaccgc 9781 cggtgcccag gtaagcgtag ttgcatgtga cgtggctgac cgcgatgctg tggccgagct 9841 gattgcaaag gtgccggtag aacatccact gaccgcagtc gttcataccg ccggtgtagt 9901 cgccgacgct acgctgactg ccctggacgc cgaacgtgtt gataccgtac tgcgtgccaa

9961 ggttgacgcg gtgctgcacc tgcacgaagc gacgcgtggc ctggacctgg cgggtttcgt 10021 cctgttctcc tctgcatctg gtatctttgg cagccctggt cagggtaact acgcggcagc 10081 aaactccttt atcgacgcgt ttgcacatca tcgtcgtgcg cagggtctgc cagcactgtc 10141 tctggcttgg ggcctgtggg cgcgcacgtc tggcatggcc ggtcagctgg gccacgacga 10201 cgttgctcgc atctcccgta ctggtctggc gccgattacc gacgatcagg gtatggcgct

10261 gctggacgct gcgctgggtg ctggtcgtcc actgctggtt cctgtgcgtc tggatcgtgc 10321 ggcgctgcgt agccaggcaa ccgcaggtac cctgccgccg atcctgcgtg gtctggttcg 10381 tgcgaccgtt cgtcgtgctg ctagcactgc ggctgcccaa ggtccatctc tggctgaacg 10441 tctggcgggt ctgccggtaa ccgaacacga acgcattgtg gttgagctgg tgcgcgcgga 10501 tctggcggct gttctgggcc acagcagctc cgctggtatc gatccgggcc gtgcgttcca

10561 agacatgggc attgactccc tgaccgcagt tgagctgcgt aaccgcctga acggtgcaac 10621 gggtctgcgt ctggcggcaa gcctggtttt cgattatccg accccgaacg cgctggcaac 10681 ccacatcctg gacgaactgg cgctggacac tgctggcgcg ggtgctgcag gtgaaccgga 10741 tggtcctgct ccagcaccgg cggacgaagc ccgcttccgt cgtgtaatta actccatccc 10801 tctggatcgc attcgccgtg cgggtctgct ggacgcgctg ctgggtctgg cgggtacctc

10861 tgcagatacc gcagcgtccg acgattttga tcaggaagaa gacggtccgg ccatcgcgag 10921 catggatgta gacgatctgg ttcgtatcgc gctgggtgaa tctgacacca ccgcggacat

10981 cactgagggc actgatcgca gctgataagg atccaaactc gagtaaggat ctccaggcat

11041 caaataaaac gaaaggctca gtcgaaagac tgggcctttc gttttatctg ttgtttgtcg

11101 gtgaacgctc tctactagag tcacactggc tcaccttcgg gtgggccttt ctgcgtttat

11161 acctagggat atattccgct tcctcgctca ctgactcgct acgctcggtc gttcgactgc

11221 ggcgagcgga aatggcttac gaacggggcg gagatttcct ggaagatgcc aggaagatac

11281 ttaacaggga agtgagaggg ccgcggcaaa gccgtttttc cataggctcc gcccccctga

11341 caagcatcac gaaatctgac gctcaaatca gtggtggcga aacccgacag gactataaag

11401 ataccaggcg tttccccctg gcggctccct cgtgcgctct cctgttcctg cctttcggtt

11461 taccggtgtc attccgctgt tatggccgcg tttgtctcat tccacgcctg acactcagtt

11521 ccgggtaggc agttcgctcc aagctggact gtatgcacga accccccgtt cagtccgacc

11581 gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggaaagacat gcaaaagcac

11641 cactggcagc agccactggt aattgattta gaggagttag tcttgaagtc atgcgccggt

11701 taaggctaaa ctgaaaggac aagttttggt gactgcgctc ctccaagcca gttacctcgg

11761 ttcaaagagt tggtagctca gagaaccttc gaaaaaccgc cctgcaaggc ggttttttcg

11821 ttttcagagc aagagattac gcgcagacca aaacgatctc aagaagatca tcttattaat

11881 cagataaaat atttctagat ttcagtgcaa tttatctctt caaatgtagc acctgaagtc

11941 agccccatac gatataagtt gttactagtg cttggattct caccaataaa aaacgcccgg

12001 cggcaaccga gcgttctgaa caaatccaga tggagttctg aggtcattac tggatctatc

12061 aacaggagtc caagcgagct cgatatcaaa ttacgccccg ccctgccact catcgcagta

12121 ctgttgtaat tcattaagca ttctgccgac atggaagcca tcacaaacgg catgatgaac

12181 ctgaatcgcc agcggcatca gcaccttgtc gccttgcgta taatatttgc ccatggtgaa

12241 aacgggggcg aagaagttgt ccatattggc cacgtttaaa tcaaaactgg tgaaactcac

12301 ccagggattg gctgagacga aaaacatatt ctcaataaac cctttaggga aataggccag

12361 gttttcaccg taacacgcca catcttgcga atatatgtgt agaaactgcc ggaaatcgtc

12421 gtggtattca ctccagagcg atgaaaacgt ttcagtttgc tcatggaaaa cggtgtaaca

12481 agggtgaaca ctatcccata tcaccagctc accgtctttc attgccatac gaaattccgg

12541 atgagcattc atcaggcggg caagaatgtg aataaaggcc ggataaaact tgtgcttatt

12601 tttctttacg gtctttaaaa aggccgtaat atccagctga acggtctggt tataggtaca

12661 ttgagcaact gactgaaatg cctcaaaatg ttctttacga tgccattggg atatatcaac

12721 ggtggtatat ccagtgattt ttttctccat tttagcttcc ttagctcctg aaaatctcga

12781 taactcaaaa aatacgcccg gtagtgatct tatttcatta tggtgaaagt tggaacctct

12841 tacgtgccga tcaacgtctc attttcgcca gatatc

SEQ ID NO:

1 gacgtcctca ctgcccgctt tccagtcggg aaacctgtcg tgccagctgc attaatgaat 61 cggccaacgc gcggggagag gcggtttgcg tattgggcgc cagggtggtt tttcttttca 121 ccagtgagac gggcaacagc tgattgccct tcaccgcctg gccctgagag agttgcagca 181 agcggtccac gctggtttgc cccagcaggc gaaaatcctg tttgatggtg gttaacggcg 241 ggatataaca tgagctgtct tcggtatcgt cgtatcccac taccgagatg tccgcaccaa 301 cgcgcagccc ggactcggta atggcgcgca ttgcgcccag cgccatctga tcgttggcaa 361 ccagcatcgc agtgggaacg atgccctcat tcagcatttg catggtttgt tgaaaaccgg 421 acatggcact ccagtcgcct tcccgttccg ctatcggctg aatttgattg cgagtgagat 481 atttatgcca gccagccaga cgcagacgcg ccgagacaga acttaatggg cccgctaaca 541 gcgcgatttg ctggtgaccc aatgcgacca gatgctccac gcccagtcgc gtaccgtctt 601 catgggagaa aataatactg ttgatgggtg tctggtcaga gacatcaaga aataacgccg 661 gaacattagt gcaggcagct tccacagcaa tggcatcctg gtcatccagc ggatagttaa 721 tgatcagccc actgacgcgt tgcgcgagaa gattgtgcac cgccgcttta caggcttcga 781 cgccgcttcg ttctaccatc gacaccacca cgctggcacc cagttgatcg gcgcgagatt 841 taatcgccgc gacaatttgc gacggcgcgt gcagggccag actggaggtg gcaacgccaa 901 tcagcaacga ctgtttgccc gccagttgtt gtgccacgcg gttgggaatg taattcagct 961 ccgccatcgc cgcttccact ttttcccgcg ttttcgcaga aacgtggctg gcctggttca 1021 ccacgcggga aacggtctga taagagacac cggcatactc tgcgacatcg tataacgtta 1081 ctggtttcac attcaccacc ctgaattgac tctcttccgg gcgctatcat gccataccgc 1141 gaaaggtttt gcgccattcg atggtgtccg ggatctcgac gctctccctt atgcgactcc 1201 tgcattagga agcagcccag tagtaggttg aggccgttga gcaccgccgc cgcaaggaat 1261 ggtgcatgca aggagatggc gcccaacagt cccccggcca cggggcctgc caccataccc 1321 acgccgaaac aagcgctcat gagcccgaag tggcgagccc gatcttcccc atcggtgatg 1381 tcggcgatat aggcgccagc aaccgcacct gtggcgccgg tgatgccggc cacgatgcgt 1441 ccggcgtaga ggatcgagat cgatctcgat cccgcgaaat taatacgact cactataggg 1501 gaattgtgag cggataacaa tttcagaatt caaaagatct aggaggcata tgtcttccgc 1561 atccagcgag aaaattgttg aggcactgcg cgcaagcctg actgagaacg aacgtctgcg 1621 tcgcctgaac caggaactgg cggctgcggc ccatgaaccg gtagcgatcg tgtctatggc 1681 gtgccgcttc ccaggcggcg ttgaaagccc ggaagatttc tgggacctga tttctgaagg 1741 ccgtgacgca gtatctggtc tgccggataa ccgtggctgg gatctggatg cactgtacga

1801 ccctgacccg gaagcacagg gcaaaaccta cgttcgcgaa ggcgctttcc tgtatgatgc 1861 agccgagttc gacgccgaac tgtttggcat ctccccgcgt gaagctctgg cgatggaccc 1921 gcagcagcgt ctgctgatgg aaaccagctg ggaagttctg gaacgtgcgg gcattcgtcc 1981 agactctctg cgcggcaaac ctgtgggcgt gttcaccggt ggcatcacca gcgactacgt 2041 cacccgccac tatgcatctg gcaccgctcc gcagctgccg tccggtgttg aaagccattt

2101 catgactggt tctgcaggtt ccgtcttctc tggccgtatc gcctatacct acggtttcga 2161 aggtccggca gttactgtcg atactgcgtg ttcttcttct ctggttgctc tgcacatggc 2221 tgcacagagc ctgcgtcagg gcgagtgttc cctggcgttc gccggtggcg ttgcggtcct 2281 gccaaacccg ggtactttcg ttggcttctc ccgtcaacgc gcgctgagcc cagacggtcg 2341 ttgcaaagct ttcagcgcag atgcagatgg taccggttgg ggcgagggcg ctggtctggt

2401 tctgctggag aaactgtctg acgctcgtcg taacggtcac ccggtactgg caattctgcg 2461 tggcagcgct gtgaaccaag atggcgcaag caatggcctg actgcaccta acggcccgtc 2521 tcagcagcgc gtcattcgtg ccgcgctggc gaacgctcgc ctgagcccgg atgatgtcga 2581 tgtggttgaa gcccacggca ccggtactcc gctgggtgac cctattgagg cgcaggcact 2641 gcaggctact tatggccgtt cccgcagcgc ggaacgtccg ctgtggctgg gttctgtgaa

2701 atctaacgtt gctcatgcac aggcagctgc tggcgttgca agcgtcatca aagttgtgat 2761 ggctctgcgc caccgtctgc tgccgaagac gctgcatgct gacgaacgtt ctccgcacat 2821 cgactggcac tccggtgcgg ttgaactgct gaccgaagca cgcgagtggt cccgtaccga 2881 aggccgtgct cgtcgcgcag gcgtttcttc ctttggtatc tctggcacta acgcccacgt 2941 ggtcattgtc gaacaggcac cggacacgcc tgcggaagca gccgacgata ctccgccgcg

3001 tacgccacgc actctgccgt ggctgctgtc tgcgcgcacc ggtgctgctc tgcgtgacca 3061 ggcaaccgcc ctgctggacc atctggatcg tccggacggt gatcgcggtc ctactgctct 3121 ggataccgct ttctctctgg ctaccacccg cgcggccctg gaacaccgtc tggctgtggt 3181 gaccggtact gatggcactg ccggtcgcga cgccctgacc gcgtggctgg cccatggtac 3241 tgcaccggac gcgcacgaag gtcacgcggc gggtcgcact cgttgtgcag ctctgttcag

3301 cggtcaaggt gcgcaacgtc tgggcatggg ccgcgaactg cacgcacgtt tcccagtatt 3361 cgcacgcgct ctggataccg cggttgacct gctggatgca gaactgggcg gtaccctgcg 3421 cgaagtgatt tggggcactg acgatgcgcc gctgaacgaa accggtttta ctcagccggc 3481 actgttcgcg gttgaggttg cactgtatcg cctgattgaa tcctggggtg ttgcaccgga 3541 tttcgtggca ggccactcta ttggcgaaat cgcggctgcg cacgttgccg gcgtgttctc

3601 cctggaggac gcatgtactc tggtggctgc tcgtgctggc ctgatgcagg cactgccgcg 3661 tggtggtgct atggtggccg tagaagccac tgaagacgaa gtatccccgc tgctgactga 3721 cggcgttgcc attgcggcga tcaacggccc gacttctctg gttgtttccg gcgacgagac 3781 cgcgaccctg gcggtggcag cgcgtctggc ggaacagggt cgtcgtacca cgcgcctgcg 3841 cgttagccac gccttccaca gcccgctgat ggacccgatg ctggctgaat ttcgcgctgt

3901 agcggaaggt ctgtcctacg gcgagccgca gatcccagtg gtgtccaacc tgaccggtgc 3961 agtagctgat ggcacgctgc tgggtacggc ggactactgg gtgcgtcacg ttcgtgaagc 4021 agtacgcttc gcagacggca ttcgcgcact gactgatgct ggtgtcggcg cattcctgga 4081 actgggtcca gacggtaccc tggcagcgct ggcacagcaa agcgcaccgg atgcggtaag 4141 cgtaccggtt ctgcgcaaag accgcgacga agaaccggcg gcagttgcgg cactggctcg

4201 tctgcacacc gcgggcgtgc cggttgactg gaccgctttt tacgctggca ccggtgctca 4261 ccgtacggat ctgcctacct acgcgtttca gtatgagcgc tactggccaa aagctactta 4321 ccgtccggcg gacgctaccg gtctgggtct gaccgcagct gaccacccgc tgctgggtgc 4381 ggctatgtct gtggcgggtt ccgacgaact gctgctgact ggtactctgt ctctggctac 4441 ccacccatgg ctggcggacc atgtcgtagg cggtatggtg ttcttcccgg gcaccggctt

4501 cctggagctg gcggtacgtg cagccgacca ggtgggctgt gatcgcgtag aagaactgat 4561 gctggccgca ccgctgattc tgccggcgac cggcactgtg cagatgcaga tcgccgttgg 4621 cgctgcagat gacgatggtg gccgtgatct gcgcttcttc acccgtccgg gcgacgatcc 4681 ggacgctgct tgggcgcagc atgcgacggg ccgtatcacc gaaggcgaac gcgtgctggc 4741 tctggatacc actacctggc cgcctcgtga cgcagaaccg gttgacatcg acggtctgta

4801 cgatcgttac cgtgctaacg gtctggatta cggcccggtg ttccgcggcc tgcgtgcggt 4861 ttggcgtcgt gacaccgaga tttatgcaga agtagcactg cctgaaggca cggcggatgc 4921 cgatgcattc ggtctgcacc cggccctgtt tgacgccgta ctgcactcta ccctgtttgc 4981 gagcgcagac ggtgatgatc gttctctgct gccttttgcc tggaacggcg tcagcctgca 5041 cgccgctggc gctgacgcac tgcgtgttcg tattacgtct tgcggtccgg atgctgttga

5101 aatcacggct gttgacccac agggtcgccc ggttgtgtcc gttgagtctc tgaccctgcg 5161 tgctgctggt ccagatgcag gcaccgctga ccaccgtgca gacgccggca gcctgtttcg 5221 tatggattgg acgccgcgca ctgtacacgc accggctacc ccggcgacct gggccgtact 5281 gggcactgac ccgatcggtc tgaccgaagc actgaccgct gcaggtccgg acacggtaac 5341 tggtctgcgc gacggcgtcg atgctctggg tgaactgacc gcaggcgacg atcgtccagt 5401 gccggatgtg gtagcggttc cactgcgcgg cgcgaccgac cacggtcctg ccggtgcgca

5461 cgacctgacc cgcacggttc tggccctgct gcaggaatgg ctggctgaag aacgtttcgc 5521 gcgctctcgc ctgctgctgg ttacccgcgg tgcggtggca gatggtgaac gtggcccgct 5581 ggacctggcg gctgcaccgg tgtggggtct ggtccgttcc gctcagtctg aaaacccggg 5641 tcgcctgctg ctggttgatc tggacgacac tgccgaatct gcggcgcaac tgccgctgct 5701 gcctgcgctg ctggacgctg acgaaccaca ggcggttgtc cgtgagggca ccgttcgtgt

5761 aggccgtctg gcacgtctgg actccggtcg tggtctggtc ccacctccag gtactccgtg 5821 gcgtctgggt tcccgtgcca agggttccct ggacggtctg gcgctgctgc cgcaccctga 5881 agcgcgtcgc cctctgaccg gccatgaagt tcgtgtaggc atccgcgctg caggtctgaa 5941 cttccgtgat gtcctgaacg ccctgggtat gtacccgggc gacgccggcc tgtttggcag 6001 cgaggcggca ggtgttgttg tggaagtagg tccggaggtt acgggtctgg ctccgggtga

6061 ccgtgtgatg ggcatgctgt tcggcggctt tggccctctg ggcatcgccg atgcgcgtct 6121 gctgactcca gttccggcag actggtcctg ggaaacgggt gcatctgtcc ctctggtttt 6181 tctgacggcg tactatgccc tgaaagaact gggcggtctg cgtgcaggtg aaaaagtact 6241 ggtccacgcc ggtgccggtg gtgtaggcat ggcggctatc cagatcgcgc gtcacgttgg 6301 tgcggaagtt ttcgcaactg caagcgaagg caagtgggat gtgctgcgtt ctctgggcgt

6361 tgctgacgac cacattgctt cctcccgtac tctggacttc gaagcagctt tcgctgaagt 6421 tgcgggtgac cgtggcctgg atgttgttct gaatgctctg tctggtgagt ttgtggacgc 6481 ctccatgcgt ctgctgggtg atggcggtcg ttttctggag atgggcaaaa cggacatccg 6541 tgcggcagac tctgtgccgg acggcctgag ctaccacagc ttcgacctgg gtatggttga 6601 tccggaacac atccagcgca tgctgctgga cctggtggaa ctgttcgacc gtggtgcgct

6661 ggcagccctg ccagttcgct cttgggatgt tcgtcgtgct ggcgaggcgt ttcgtttcat 6721 gtccctggcg cagcacattg gtaagattgt tctgactgtt ccgcagccgc tggacccgga 6781 cggtactgtg ctgctgaccg gcggcactgg cggtctggca ggtctgctgg cacgccacct 6841 ggtaaccgag catggtgctc gtcatctgct gctggctggt cgccgtggcc cggatgcgcc 6901 aggcgctgct gcactgcacg cggaactgac tgccctgggc gcggaggtaa ccgttgctgc

6961 ttgtgatgtt gctgaccgta ccgcgctggc ggcgctgctg gctactgtgc cggcagaaca 7021 tcctctgacc gcagttgtgc acactgctgg cgtcctggac gatggcaccc tgaccgcact 7081 gaacccggat cgtctggcga ccgttctgcg cccgaaagtg gacgccgctt ggcacctgca 7141 cgacctgacc cgtcacctgg atctggcagc gttcgttctg tattcctcta ctgcaggcgt 7201 catgggtggt ccgggtcagg ctaactacgc cgctggcaac accttcctgg atgctctggc

7261 agcacaccgt cacgccctgg gtctgccggc tacctccctg gcgtggggtg cttgggaaca 7321 gggcgctggt atgaccggcg ctctgaccga tcatgatctg cgtcgtgtgt ccgacgcggg 7381 tggtcagccg ctgctgactg cagaacgtgg tctggcgctg tacgatgcgg ccactgcagc 7441 agacgaacct ctgatcgtgc cgctgggtct gaccggcggt gctctgcctg ctggcgttgg 7501 cgtgccagcc gtactgcgcg gcctggtacg tactgcgggc cgtcgtgcgc gtgctggtac

7561 cgccggtgtc tcccgtgccg gcctggccga acgtctggca gcactgccgg aggaggaacg 7621 tactcctttt ctggttgaac tggtacgtac cgaagcagca accgtcctgg gccacggtag 7681 cactgacccg gttgacgcac gccgtgaatt tcgtcagctg ggcttcgact ccctgactgc 7741 tattgagctg cgtaatcgcc tgggcaaagc tacgggtctg accctgccag ccaccctgat 7801 ctttgattat ccgaccgtgc gtcgtctggc tgaccatatt ggtcagcagc tggactctgg

7861 taccccagcc cgcgaagcct cctctgctct gcgtgatggt tatcgtcagg ctggtgttag 7921 cggtcgcgtc cgttcttatc tggacctgct ggcgggcctg tctgactttc gtgaacactt 7981 tgatggtagc gatggtttct ccctggacct ggttgacatg gcggacggtc cgggtgaagt 8041 tactgtcatc tgctgtgccg gcactgcagc tatctctggc ccgcatgaat ttactcgcct 8101 ggcaggcgca ctgcgtggta tcgcaccggt ccgtgctgtt ccgcagccgg gctacgaaga

8161 aggtgagccg ctgcctagca gcatggctgc agttgcggca gttcaggcag acgccgttat 8221 ccgcacccag ggcgacaagc cgttcgtggt cgctggtcac tccgcaggtg cgctgatggc 8281 ttatgcactg gcgactgaac tgctggatcg tggtcatcca ccgcgtggtg ttgtactgat 8341 cgacgtgtac ccgccgggtc atcaggacgc tatgaatgct tggctggagg agctgacggc 8401 cactctgttc gaccgtgaga ctgtgcgtat ggatgacacc cgcctgaccg ctctgggcgc

8461 atacgaccgt ctgactggcc agtggcgtcc acgcgagact ggcctgccga ctctgctggt 8521 atctgcaggt gaacctatgg gcccgtggcc ggacgactct tggaagccga cctggccttt 8581 cgaacacgac accgttgcag taccgggcga ccactttact atggtccagg aacatgcgga 8641 cgctattgct cgtcacattg atgcttggct gggcggtggt aatagctgat aaggatccaa 8701 actcgagtaa ggatctccag gcatcaaata aaacgaaagg ctcagtcgaa agactgggcc

8761 tttcgtttta tctgttgttt gtcggtgaac gctctctact agagtcacac tggctcacct

8821 tcgggtgggc ctttctgcgt ttatacctag ggatatattc cgcttcctcg ctcactgact

8881 cgctacgctc ggtcgttcga ctgcggcgag cggaaatggc ttacgaacgg ggcggagatt

8941 tcctggaaga tgccaggaag atacttaaca gggaagtgag agggccgcgg caaagccgtt

9001 tttccatagg ctccgccccc ctgacaagca tcacgaaatc tgacgctcaa atcagtggtg

9061 gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggcggct ccctcgtgcg

9121 ctctcctgtt cctgcctttc ggtttaccgg tgtcattccg ctgttatggc cgcgtttgtc

9181 tcattccacg cctgacactc agttccgggt aggcagttcg ctccaagctg gactgtatgc

9241 acgaaccccc cgttcagtcc gaccgctgcg ccttatccgg taactatcgt cttgagtcca

9301 acccggaaag acatgcaaaa gcaccactgg cagcagccac tggtaattga tttagaggag

9361 ttagtcttga agtcatgcgc cggttaaggc taaactgaaa ggacaagttt tggtgactgc

9421 gctcctccaa gccagttacc tcggttcaaa gagttggtag ctcagagaac cttcgaaaaa

9481 ccgccctgca aggcggtttt ttcgttttca gagcaagaga ttacgcgcag accaaaacga

9541 tctcaagaag atcatcttat taatcagata aaatatttct agatttcagt gcaatttatc

9601 tcttcaaatg tagcacctga agtcagcccc atacgatata agttgttact agtgcttgga

9661 ttctcaccaa taaaaaacgc ccggcggcaa ccgagcgttc tgaacaaatc cagatggagt

9721 tctgaggtca ttactggatc tatcaacagg agtccaagcg agctcgatat caaattacgc

9781 cccgccctgc cactcatcgc agtactgttg taattcatta agcattctgc cgacatggaa

9841 gccatcacaa acggcatgat gaacctgaat cgccagcggc atcagcacct tgtcgccttg

9901 cgtataatat ttgcccatgg tgaaaacggg ggcgaagaag ttgtccatat tggccacgtt

9961 taaatcaaaa ctggtgaaac tcacccaggg attggctgag acgaaaaaca tattctcaat

10021 aaacccttta gggaaatagg ccaggttttc accgtaacac gccacatctt gcgaatatat

10081 gtgtagaaac tgccggaaat cgtcgtggta ttcactccag agcgatgaaa acgtttcagt

10141 ttgctcatgg aaaacggtgt aacaagggtg aacactatcc catatcacca gctcaccgtc

10201 tttcattgcc atacgaaatt ccggatgagc attcatcagg cgggcaagaa tgtgaataaa

10261 ggccggataa aacttgtgct tatttttctt tacggtcttt aaaaaggccg taatatccag

10321 ctgaacggtc tggttatagg tacattgagc aactgactga aatgcctcaa aatgttcttt

10381 acgatgccat tgggatatat caacggtggt atatccagtg atttttttct ccattttagc

10441 ttccttagct cctgaaaatc tcgataactc aaaaaatacg cccggtagtg atcttatttc

10501 attatggtga aagttggaac ctcttacgtg ccgatcaacg tctcattttc gccagatatc