Title:
CO-CONVERSION OF CARBOHYDRATES TO FERMENTATION PRODUCTS IN A SINGLE FERMENTATION STEP
Document Type and Number:
WIPO Patent Application WO/2014/160402
Kind Code:
A1
Abstract:
Provided are methods for the production of a fermentation product comprising co-fermenting a sugar stream with a pretreated lignocellulosic biomass. Lignocellulosic biomass feedstocks useful in the methods include grass, switch grass, cord grass, rye grass, reed canary grass, miscanthus, sugar-processing residues, sugarcane bagasse, agricultural wastes, rice straw, rice hulls, barley straw, corn cobs, cereal straw, wheat straw, canola straw, oat straw, oat hulls, corn fiber, stover, soybean stover, corn stover, forestry wastes, recycled wood pulp fiber, paper sludge, sawdust, hardwood, softwood, agave, and combinations thereof. Sugar streams useful in the methods include molasses, sugar cane, sugar beet, corn starch, wheat starch, and potato starch. Fermentation products include alcohols (including, e.g., ethanol), lactic acid, and acetic acid. The methods can be performed at existing fermentation facilities, which will increase yield of fermentation products and reduce the costs of production.

Inventors:
LYND LEE R (US)
VAN ROOYEN JUSTIN D (US)
Application Number:
PCT/US2014/026499
Publication Date:
October 02, 2014
Filing Date:
March 13, 2014
Assignee:
MASCOMA CORP (US)
International Classes:
C12P7/10; C12P7/54; C12P7/56; D21C5/00
Domestic Patent References:
WO2009015481A12009-02-05
WO2012066042A12012-05-24
WO2014074895A22014-05-15
WO2006009434A12006-01-26
WO1998045425A11998-10-15
Foreign References:
US6333181B12001-12-25
US7026152B22006-04-11
US5821093A1998-10-13
US5000000A1991-03-19
US7098009B22006-08-29
Other References:
BRETHAUER S ET AL: "Review: Continuous hydrolysis and fermentation for cellulosic ethanol production", BIORESOURCE TECHNOLOGY, ELSEVIER BV, GB, vol. 101, no. 13, 1 July 2010 (2010-07-01), pages 4862 - 4874, XP026986220, ISSN: 0960-8524, [retrieved on 20100327]
LEE J M ET AL: "Detoxification of woody hydrolyzates with activated carbon for bioconversion to ethanol by the thermophilic anaerobic bacterium Thermoanaerobacterium saccharolyticum", BIOMASS AND BIOENERGY, PERGAMON, AMSTERDAM, NL, vol. 35, no. 1, 1 January 2011 (2011-01-01), pages 626 - 636, XP027577878, ISSN: 0961-9534, [retrieved on 20101228], DOI: 10.1016/J.BIOMBIOE.2010.10.021
MCMILLAN J D ED - SHALABY SHALABY W: "CONVERSION OF HEMICELLULOSE HYDROLYZATES TO ETHANOL", WATER-SOLUBLE POLYMERS: SYNTHESIS, SOLUTION PROPERTIES AND APPLICATIONS, AMERICAN CHEMICAL SOCIETY, WASHINGTON, DC, US, vol. 566, 1 January 1994 (1994-01-01), pages 411 - 437, XP008045627, ISBN: 978-0-541-23408-9
LYND ET AL.: "Microbial cellulose utilization: Fundamentals and biotechnology", MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS, vol. 66, no. 3, 2002, pages 506 - 577, XP002551605, DOI: doi:10.1128/MMBR.66.3.506-577.2002
LYND ET AL.: "Consolidated bioprocessing of cellulosic biomass: An update", CURRENT OPINION IN BIOTECHNOLOGY, vol. 16, no. 5, 2005, pages 577 - 583, XP005097204, DOI: doi:10.1016/j.copbio.2005.08.009
BREAKING THE BIOLOGICAL BARRIERS TO CELLULOSIC ETHANOL: A JOINT RESEARCH AGENDA, December 2005 (2005-12-01)
"Ultimate low-cost configuration for cellulose hydrolysis and fermentation", DOE/USA JOINT RESEARCH AGENDA
DOE/SC-0095 JOINT RESEARCH AGENDA
SAMBROOK, J.; FRITSCH, E. F.; MANIATIS, T.: "MOLECULAR CLONING: A LABORATORY MANUAL", 1989, COLD SPRING HARBOR LABORATORY PRESS
VAN WALSUM; LYND, BIOTECH. BIOENG., vol. 58, 1998, pages 316
JEPPSSON ET AL., APPL. ENVIRON. MICROBIOL., vol. 68, no. 4, April 2002 (2002-04-01), pages 1604 - 9
KARHUMAA ET AL., MICROB. CELL. FACT., vol. 6, 5 February 2007 (2007-02-05), pages 5
KUYPER, M. ET AL., FEMS YEAST RES., vol. 4, 2004, pages 655 - 64
KUYPER, M. ET AL., FEMS YEAST RES., vol. 5, 2005, pages 399 - 409
KUYPER, M. ET AL., FEMS YEAST RES., vol. 5, 2005, pages 925 - 34
CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, pages 13.7.1 - 13.7.10
DAVIS, L. ET AL., BASIC METHODS IN MOLECULAR BIOLOGY, 1986
YOMANO ET AL., J IND. MICRO. & BIO., vol. 20, 1998, pages 132 - 138
"Current Protocols in Molecular Biology", 1992, JOHN WILEY & SONS
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual.", 1989, COLD SPRING HARBOR LABORATORY PRESS
KREIG ET AL.: "Bergey's Manual of Determinative Bacteriology", 1984, WILLIAMS AND WILKINS
MCLAUGHLIN ET AL., ENVIRON. SCI. TECHNOL., vol. 36, 2002, pages 2122
DESAI ET AL., APPL. MICROBIOL. BIOTECHNOL., vol. 65, 2004, pages 600
LYND ET AL., MICROBIOL. MOL. BIOL. REV., vol. 66, 2002, pages 506
"Genes II", 1985, JOHN WILEY & SONS
BOWIE ET AL.: "Deciphering the Message in Protein Sequences: Tolerance to Amino Acid Substitutions", SCIENCE, vol. 247, 1990, pages 1306 - 1310, XP002939052, DOI: doi:10.1126/science.2315699
CUNNINGHAM; WELLS, SCIENCE, vol. 244, 1989, pages 1081 - 1085
NAKAMURA, Y. ET AL.: "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", NUCL. ACIDS RES., vol. 28, 2000, pages 292, XP002941557, DOI: doi:10.1093/nar/28.1.292
Attorney, Agent or Firm:
JACKMAN, Peter A. et al. (Kessler Goldstein & Fox P.L.L.C.,1100 New York Avenue N.W.,8th and 9th Floor, Washington District of Columbia, US)
Claims:
WHAT IS CLAIMED IS:

1. A method for the production of a fermentation product, said method comprising: i) contacting a pretreated lignocellulosic biomass feedstock, a sugar stream, and a population of microorganisms capable of hydrolyzing the lignocellulosic biomass and fermenting sugars into a fermentation product; and ii) culturing said population of microorganisms under conditions for a period sufficient to allow hydrolysis of the lignocellulosic biomass and fermentation of sugars by said population of microorganisms into a fermentation product.

2. The method of claim 1, wherein said lignocellulosic biomass feedstock is selected from the group consisting of: grass, switch grass, cord grass, rye grass, reed canary grass, miscanthus, sugar-processing residues, sugarcane bagasse, agricultural wastes, rice straw, rice hulls, barley straw, corn cobs, cereal straw, wheat straw, canola straw, oat straw, oat hulls, corn fiber, stover, soybean stover, corn stover, forestry wastes, recycled wood pulp fiber, paper sludge, sawdust, hardwood, softwood, Agave, and combinations thereof.

3. The method of claim 1 or 2, wherein said sugar stream is selected from the group consisting of: molasses, sugar cane, sugar beet, corn starch, wheat starch, and potato starch.

4. The method of any of claims 1-3, wherein said sugar stream is provided by a dry milling process.

5. The method of any of claims 1-3, wherein said sugar stream is provided by a wet milling process.

6. The method of any of claims 1-5, wherein said population of microorganisms is selected from the group consisting of bacteria and yeast.

7. The method of claim 6, wherein said population of microorganisms comprises at least one genetically modified microorganism.

8. The method of claim 7, wherein said genetically modified microorganism is a bacterium.

9. The method of claim 8, wherein said bacterium is thermophilic or mesophilic.

10. The method of claim 7, wherein said genetically modified microorganism is a fungus.

11. The method of claim 10, wherein said fungus is a yeast.

12. The method of any of claims 1-11, wherein said fermentation product is selected from the group consisting of an alcohol, lactic acid, and acetic acid.

13. The method of claim 12, wherein said fermentation product is ethanol.

14. The method of any of claims 1-13, comprising adding exogenous enzymes to the culture.

Description:
CO-CONVERSION OF CARBOHYDRATES TO FERMENTATION PRODUCTS IN A SINGLE FERMENTATION STEP

BACKGROUND OF THE INVENTION

[0001] Currently, ethanol and other fermentation products are produced from various foodstuffs such as sugar (e.g., derived from either sugar cane or sugar beet) and starch (e.g., derived from sources such as corn, wheat, and potato). Numerous industrial facilities exist in which sugar streams are processed into biofuels, including, for example, industrial plants that process sugar cane, corn, or sugar beets into ethanol. While this process is simple and effective, there is growing competition between these processes and foodstuffs and, at times, it is more economically advantageous to sell the feed for such a fermentation on the open market. This can greatly increase the cost of production using traditional fermentation methods.

[0002] To alleviate this problem, a widespread effort has been underway to use, for example, lignocellulosic biomass ("biomass") as a source of carbohydrate for these fermentations. Biomass is particularly well-suited for energy applications because of its large-scale availability, low cost, and environmentally benign production. In particular, many energy production and utilization cycles based on cellulosic biomass have near-zero greenhouse gas emissions on a life-cycle basis. The primary obstacle impeding the more widespread production of energy from biomass feedstocks is the general absence of low-cost technology for overcoming the recalcitrance of these materials to conversion into useful products. Lignocellulosic biomass contains carbohydrate fractions (e.g., cellulose and hemicellulose) that can be converted into ethanol or other products such as lactic acid and acetic acid. In order to convert these fractions, the cellulose and hemicellulose must ultimately be converted or hydrolyzed into monosaccharides; it is the hydrolysis that has historically proven to be problematic. To aid in the effort, biocatalysts have been developed that can liberate carbohydrate from the cellulosic biomass and ferment the resulting carbohydrates into end products using little or no added enzymes.

[0003] Biologically mediated processes are promising for energy conversion. Biomass processing schemes involving enzymatic or microbial hydrolysis commonly involve four biologically mediated transformations: (1) the production of saccharolytic enzymes (cellulases and hemicellulases); (2) the hydrolysis of carbohydrate components present in pretreated biomass to sugars; (3) the fermentation of hexose sugars (e.g., glucose, mannose, and galactose); and (4) the fermentation of pentose sugars (e.g., xylose and arabinose). These four transformations occur in a single step in a process configuration called consolidated bioprocessing (CBP), which is distinguished from other less highly integrated configurations in that it does not involve a dedicated process step for cellulase and/or hemicellulase production.

[0004] Consolidated Bio-Processing (CBP) in essence describes a mode of operation where biocatalysts produce enzymes that can break down inexpensive cellulose into usable sugars and then simultaneously ferment them into value-added products in a single vessel. CBP, which reduces the number of unit processes, significantly lowers operating and capital costs associated with cellulosic biofuel production. Furthermore, CBP processes reduce or eliminate the need for externally-added, expensive cellulases. See Lynd et al., "Microbial cellulose utilization: Fundamentals and biotechnology," Microbiology and Molecular Biology Reviews 66(3):506-577 (2002); Lynd et al., "Consolidated bioprocessing of cellulosic biomass: An update," Current Opinion in Biotechnology 16(5):577-583 (2005); "Breaking the Biological Barriers to Cellulosic Ethanol: A Joint Research Agenda," December 2005, Rockville, Maryland Publication Date: June 2006; DOE/SC-0095. CBP is widely considered to be the "Ultimate low-cost configuration for cellulose hydrolysis and fermentation." DOE/USA Joint Research Agenda. See DOE/SC-0095 Joint Research Agenda. CBP on plant biomass, e.g., lignocellulosic biomass, also reduces the need to rely on petrochemical feedstocks to produce fermentable, value-added products, such as propanols, alcohols, polyols, and other industrial products.

[0005] CBP offers the potential for lower cost and higher efficiency than processes featuring dedicated cellulase production. The benefits result in part from avoided capital costs, substrate and other raw materials, and utilities associated with cellulase production. In addition, several factors support the realization of higher rates of hydrolysis, and hence reduced reactor volume and capital investment using CBP, including enzyme-microbe synergy and the use of thermophilic organisms and/or complexed cellulase systems. Moreover, cellulose-adherent cellulolytic microorganisms are likely to compete successfully for products of cellulose hydrolysis with non-adhered microbes, e.g., contaminants, which could increase the stability of industrial processes based on microbial cellulose utilization. Progress in developing CBP-enabling microorganisms is being made through two strategies: engineering naturally occurring cellulolytic microorganisms to improve product-related properties, such as yield and titer; and engineering non-cellulolytic organisms that exhibit high product yields and titers to express a heterologous cellulase and hemicellulase system enabling cellulose and hemicellulose utilization.

[0006] For CBP, the amount of cellulase present is dependent upon synthesis of cellulase within the bioreactor in which cellulose is hydrolyzed and sugars are fermented to products, such as biofuels. As a result, CBP performance is quite sensitive to inhibitors and, in particular, often more sensitive to inhibitors than processes in which cellulase is produced separately and added to the reactor in which hydrolysis and/or fermentation occurs.

[0007] Rates of biomass conversion in a CBP process tend to decrease with increasing biomass concentration, while final ethanol concentration is directly proportional to solids loading. It is desirable to process cellulosic biomass at high solids concentrations because this has the potential to increase the ethanol concentration produced and thus the product production per unit fermentor volume. At the same time, operation at high solids concentration entails challenges and in general the magnitude of these challenges increases disproportionately with increasing solids concentration. Particular challenges include inhibition of cell growth and cellulase synthesis, as well as inhibition of cellulase enzymes, likely both due to higher concentrations of inhibitors. Mixing also becomes more difficult, exacerbating other problems. At its current state of development, CBP performs quite well at 12 to 15 wt.% solids, is more difficult at 15 to 20 wt.% solids, and suffers appreciably in the 20 to 25% solids concentrations favored by economics in the absence of performance compromises related to high solids concentration. Rates of conversion can be increased by increasing exogenous enzyme or by boosting the initial yeast inoculum, both of which are costly.
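The proportionality between solids loading and final ethanol concentration noted in paragraph [0007] can be illustrated with a simple mass balance. The sketch below is not taken from the application: the cellulose fraction and conversion values are hypothetical placeholders, and only the two stoichiometric constants (the mass gain on hydrolysis of cellulose to glucose, and the maximum ethanol yield from glucose) are standard figures. It also ignores hemicellulose sugars and any added sugar stream.

```python
# Illustrative mass balance only; default parameter values are assumptions,
# not data from the application.
GLUCOSE_PER_CELLULOSE = 180.16 / 162.14  # g glucose released per g cellulose hydrolyzed
ETHANOL_PER_GLUCOSE = 0.511              # stoichiometric maximum, g ethanol per g glucose


def ethanol_g_per_kg_slurry(solids_wt_pct, cellulose_fraction=0.40, conversion=0.80):
    """Estimate ethanol produced per kg of slurry at a given solids loading.

    solids_wt_pct      -- pretreated solids in the slurry, in wt.%
    cellulose_fraction -- assumed g cellulose per g dry solids (hypothetical)
    conversion         -- assumed fraction of cellulose converted (hypothetical)
    """
    if not 0 <= solids_wt_pct <= 100:
        raise ValueError("solids_wt_pct must be between 0 and 100")
    solids_g = 10.0 * solids_wt_pct  # g solids per kg slurry
    glucose_g = solids_g * cellulose_fraction * conversion * GLUCOSE_PER_CELLULOSE
    return glucose_g * ETHANOL_PER_GLUCOSE


if __name__ == "__main__":
    for loading in (12, 15, 20, 25):
        print(loading, "wt.% solids ->", round(ethanol_g_per_kg_slurry(loading), 1), "g ethanol/kg")
```

With conversion held fixed, the estimate scales linearly with loading. In practice, as the paragraph explains, conversion tends to fall as solids concentration rises, so the `conversion` argument would itself depend on loading.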

[0008] The present invention addresses technical and economic challenges encountered in existing fermentation processes.

BRIEF SUMMARY OF THE INVENTION

[0009] The invention is generally directed to a method of co-fermenting a sugar stream(s) with a pretreated lignocellulosic biomass.

[0010] In a particular aspect, the invention is directed to a method for the production of a fermentation product, the method comprising: i) contacting a pretreated lignocellulosic biomass feedstock, a sugar stream, and a population of microorganisms capable of hydrolyzing the lignocellulosic biomass and fermenting sugars into a fermentation product; and ii) culturing said population of microorganisms under conditions for a period sufficient to allow hydrolysis of the lignocellulosic biomass and fermentation of sugars by said population of microorganisms into a fermentation product. In one embodiment, exogenous enzymes are added to the culture.

[0011] In one embodiment, the lignocellulosic biomass feedstock is selected from the group consisting of: grass, switch grass, cord grass, rye grass, reed canary grass, miscanthus, sugar-processing residues, sugarcane bagasse, agricultural wastes, rice straw, rice hulls, barley straw, corn cobs, cereal straw, wheat straw, canola straw, oat straw, oat hulls, corn fiber, stover, soybean stover, corn stover, forestry wastes, recycled wood pulp fiber, paper sludge, sawdust, hardwood, softwood, Agave, and combinations thereof.

[0012] In one embodiment, the sugar stream is selected from the group consisting of: molasses, sugar cane, sugar beet, corn starch, wheat starch, and potato starch.

[0013] In one embodiment, the sugar stream is provided by a dry milling process. In another embodiment, the sugar stream is provided by a wet milling process.

[0014] In one embodiment, the population of microorganisms is selected from the group consisting of bacteria and yeast. In one embodiment, the population of microorganisms comprises at least one genetically modified microorganism. In a specific embodiment, the genetically modified microorganism is a bacterium. In a more specific embodiment, the bacterium is thermophilic or mesophilic. In another embodiment, the genetically modified microorganism is a fungus. In a more specific embodiment, the fungus is a yeast.

[0015] In one embodiment, the fermentation product is selected from the group consisting of an alcohol, lactic acid, and acetic acid. In a specific embodiment, the fermentation product is ethanol.

DETAILED DESCRIPTION OF THE INVENTION

[0016] Aspects of the present invention relate to a method of co-fermenting a sugar stream with a pretreated lignocellulosic biomass. The fermentation method can be located at an existing fermentation facility, such as a corn ethanol mill or a sugarcane ethanol mill. On the technical front, this will allow for higher rates, lower yeast inocula, and operation at lower lignocellulosic solids concentrations. On the economic front, reduction in the quantity of foodstuff-derived carbohydrate will significantly boost profitability of existing fermentation facilities.

[0017] The advantages of the present invention over existing fermentation processes include:

• Consumption of less food-derived carbohydrate for the same quantity of fermentation product;

• Existing fermentation facilities can use low cost cellulosic materials such as grasses, corn stover, sugar cane bagasse, and woods to increase existing production;

• Allows for lower pretreated lignocellulosic solids concentration in the fermenters, thereby increasing the rate of reaction for the CBP microbes versus lignocellulosic fermentations with no sugar added;

• Allows for lower microorganism inoculum dose in fermentation versus lignocellulosic fermentations with no sugar added; and

• Allows for lower nutrient doses versus lignocellulosic fermentations with no sugar added.

Definitions

[0018] As used herein the term "recombinant host" is intended to include a cell suitable for genetic manipulation, e.g., which can incorporate heterologous polynucleotide sequences, e.g., which can be transfected. The cell can be a microorganism or a higher eukaryotic cell. The term is intended to include progeny of the cell originally transfected. In certain embodiments, the cell is a fungal cell (e.g., Saccharomyces cerevisiae) or a bacterial cell (e.g., a Gram-negative bacterial cell). In some embodiments, recombinant hosts are Saccharomyces cerevisiae. In other embodiments, recombinant hosts are mesophilic or thermophilic microorganisms, such as thermophilic bacteria including species of Thermoanaerobacterium (e.g., T. saccharolyticum) or Clostridium (e.g., C. thermocellum or C. propionicum). In other embodiments, recombinant hosts are Escherichia coli and/or Klebsiella oxytoca cells. Other recombinant host cells include any host cells known in the art or described herein.

[0019] A "vector," e.g., a "plasmid" or "YAC" (yeast artificial chromosome) refers to an extrachromosomal element often carrying one or more genes that are not part of the central metabolism of the cell, and is usually in the form of a circular double-stranded DNA molecule. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear, circular, or supercoiled, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3' untranslated sequence into a cell. Preferably, the plasmids or vectors of the present invention are stable and self-replicating.

[0020] An "expression vector" is a vector that is capable of directing the expression of genes to which it is operably associated.

[0021] The term "heterologous" as used herein refers to an element of a vector, plasmid or host cell that is derived from a source other than the endogenous source. Thus, for example, a heterologous sequence could be a sequence that is derived from a different gene or plasmid from the same host, from a different strain of host cell, or from an organism of a different taxonomic group (e.g., different kingdom, phylum, class, order, family, genus, or species, or any subgroup within one of these classifications). The term "heterologous" is also used synonymously herein with the term "exogenous."

[0022] The term "domain" as used herein refers to a part of a molecule or structure that shares common physical or chemical features, for example hydrophobic, polar, globular, helical domains or properties, e.g., a DNA binding domain or an ATP binding domain. Domains can be identified by their homology to conserved structural or functional motifs. Examples of cellobiohydrolase (CBH) domains include the catalytic domain (CD) and the cellulose binding domain (CBD).

[0023] A "nucleic acid," "polynucleotide," or "nucleic acid molecule" is a polymeric compound comprised of covalently linked subunits called nucleotides. Nucleic acid includes polyribonucleic acid (RNA) and polydeoxyribonucleic acid (DNA), both of which may be single-stranded or double-stranded. DNA includes cDNA, genomic DNA, synthetic DNA, and semi-synthetic DNA.

[0024] An "isolated nucleic acid molecule" or "isolated nucleic acid fragment" refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules") or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules"), or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the non-transcribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA).
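The strand convention described in paragraph [0024] (sequences written 5' to 3' along the non-transcribed strand, which matches the mRNA) can be made concrete with a short sketch. The function names here are hypothetical and introduced only for illustration; the code assumes an unambiguous DNA sequence containing only A, C, G, and T.

```python
# Hypothetical helper names; assumes plain A/C/G/T input with no ambiguity codes.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def template_strand(coding_5to3):
    """Return the transcribed (template) strand, also written 5'->3'."""
    # Complement each base, then reverse so the result reads 5'->3'.
    return coding_5to3.upper().translate(_COMPLEMENT)[::-1]


def mrna_from_coding(coding_5to3):
    """Return the mRNA sequence, which matches the coding strand with U in place of T."""
    return coding_5to3.upper().replace("T", "U")


if __name__ == "__main__":
    coding = "ATGGCC"
    print(template_strand(coding))
    print(mrna_from_coding(coding))
```

Because the mRNA differs from the non-transcribed strand only by the T-to-U substitution, quoting that single strand is enough to specify both the duplex and the transcript.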

[0025] A "gene" refers to an assembly of nucleotides that encode a polypeptide, and includes cDNA and genomic DNA nucleic acids. "Gene" also refers to a nucleic acid fragment that expresses a specific protein, including intervening sequences (introns) between individual coding segments (exons), as well as regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences.

[0026] A nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified, e.g., in Sambrook, J., Fritsch, E. F. and Maniatis, T. MOLECULAR CLONING: A LABORATORY MANUAL, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein (hereinafter "Maniatis", entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions. One set of conditions uses a series of washes starting with 6X SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2X SSC, 0.5% SDS at 45°C for 30 min, and then repeated twice with 0.2X SSC, 0.5% SDS at 50°C for 30 min. For more stringent conditions, washes are performed at higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2X SSC, 0.5% SDS is increased to 60°C. Another set of highly stringent conditions uses two final washes in 0.1X SSC, 0.1% SDS at 65°C. An additional set of highly stringent conditions is defined by hybridization at 0.1X SSC, 0.1% SDS, 65°C and washing with 2X SSC, 0.1% SDS followed by 0.1X SSC, 0.1% SDS.

[0027] Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see, e.g., Maniatis at 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see, e.g., Maniatis, at 11.7-11.8). In one embodiment the length for a hybridizable nucleic acid is at least about 10 nucleotides. Preferably a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
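As a rough illustration of how oligonucleotide length and base composition bear on Tm, the sketch below uses the Wallace rule, a common rule of thumb for short DNA oligonucleotides (Tm ≈ 2 °C per A/T plus 4 °C per G/C). This is a substituted approximation, not the equations cited from Maniatis in paragraph [0027], and it ignores salt concentration and mismatches. The tier labels returned by `length_tier` are hypothetical names for the length thresholds stated in the paragraph.

```python
def wallace_tm_celsius(oligo):
    """Approximate Tm of a short DNA oligonucleotide using the Wallace rule."""
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty DNA sequence of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def length_tier(n_nucleotides):
    """Map a probe length to the preference tiers stated in paragraph [0027].

    Tier labels are illustrative; the thresholds (10, 15, 20, 30) are from the text.
    """
    if n_nucleotides >= 30:
        return "most preferred"
    if n_nucleotides >= 20:
        return "more preferred"
    if n_nucleotides >= 15:
        return "preferred"
    if n_nucleotides >= 10:
        return "one embodiment"
    return "below stated minimum"


if __name__ == "__main__":
    probe = "ATGCATGCATGCATGC"
    print(len(probe), wallace_tm_celsius(probe), length_tier(len(probe)))
```

The estimate is only a starting point for choosing wash temperatures; as the paragraph notes, temperature and salt concentration would still be adjusted empirically for a given probe.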

[0028] The term "population of microorganisms" refers to one or more microorganisms, generally in culture. Microorganisms useful in a population of microorganisms include any host cell, including bacteria or yeast cells, and including recombinant host cells. Non-limiting examples of microorganisms (e.g., host cells) are provided herein below. The population of microorganisms can comprise cells of a single species or a co-culture of cells from different species.

[0029] The term "heterologous polynucleotide segment" is intended to include a polynucleotide segment that encodes one or more polypeptides or portions or fragments of polypeptides. A heterologous polynucleotide segment may be derived from any source, e.g., eukaryotes, prokaryotes, viruses, or synthetic polynucleotide fragments.

[0030] The terms "gene(s)" or "polynucleotide segment" or "polynucleotide sequence(s)" are intended to include nucleic acid molecules, e.g., polynucleotides which include an open reading frame encoding a polypeptide, and can further include non-coding regulatory sequences, and introns. In addition, the terms are intended to include one or more genes that map to a functional locus. In addition, the terms are intended to include a specific gene for a selected purpose. The gene may be endogenous to the host cell or may be recombinantly introduced into the host cell, e.g., as a plasmid maintained episomally or a plasmid (or fragment thereof) that is stably integrated into the genome. In addition to the plasmid form, a gene may, for example, be in the form of linear DNA. In certain embodiments, the gene or polynucleotide segment is involved in at least one step in the bioconversion of a carbohydrate to a fermentation product, such as ethanol, acetate, or lactate. Accordingly, the term is intended to include any gene encoding a polypeptide, such as the enzymes acetate kinase (ACK), phosphotransacetylase (PTA), lactate dehydrogenase (LDH), pyruvate formate lyase (PFL), aldehyde dehydrogenase (ADH) and/or alcohol dehydrogenase (ADH), enzymes in the D-xylose pathway, such as xylose isomerase and xylulokinase, enzymes in the L-arabinose pathway, such as L-arabinose isomerase and L-ribulose-5-phosphate 4-epimerase, a pyruvate decarboxylase, a secretory protein(s), or a polysaccharase, e.g., an endoglucanase, exoglucanase, endoxylanase, exoxylanase, endogalactanase, endoarabinase, cellobiohydrolase, exo-β-1,3-glucanase, endo-β-1,4-glucanase, endo-β-D-mannanase, endo-β-1,4-mannanase, β-mannanase, β-mannosidase, endo-β-xylanase, α-galactosidase, polygalacturonase, α-glucuronidase, cellodextrinase, xyloglucanase, xylose reductase, xylitol dehydrogenase, transaldolase, transketolase, β-glucosidase, endo-1,4-β-xylanase (EC-Number 3.2.1.8), xylan endo-β-1,3-xylosidase (EC-Number 3.2.1.32), α-xylosidase, β-xylosidase, oligoxyloglucan hydrolase, oligoxyloglucan reducing-end-specific cellobiohydrolase (EC-Number 3.2.1.150), endoxyloglucan transferase, xyloglucan endotransglycosylase, xyloglucan hydrolase, xyloglucan endohydrolase, xyloglucan-specific exo-β-1,4-glucanase (EC-Number 3.2.1.155), xyloglucan-specific endo-β-1,4-glucanase (EC-Number 3.2.1.151), glucuronoarabinoxylan endo-β-1,4-xylanase (EC-Number 3.2.1.136), α-L-arabinofuranosidase, acetylesterase, acetylxylanesterase, α-amylase, β-amylase, glucoamylase, pullulanase, β-glucanase, hemicellulase, arabinosidase, mannanase, pectin hydrolase, and pectate lyase, or combination(s) thereof. Said heterologous genes may be incorporated onto the chromosome of such cells. The term gene is also intended to cover all copies of a particular gene, e.g., all of the DNA sequences in a cell encoding a particular gene product.

[0031] The term "transcriptional control" is intended to include the ability to modulate gene expression at the level of transcription. In certain embodiments, transcription, and thus gene expression, is modulated by replacing or adding a surrogate promoter near the 5' end of the coding region of a gene-of-interest, thereby resulting in altered gene expression. In certain embodiments, the transcriptional control of one or more genes is engineered to result in the optimal expression of such genes, e.g., in a desired ratio. The term also includes inducible transcriptional control as recognized in the art.

[0032] The term "expression product" is intended to include the resultant product, e.g., a polypeptide, of an expressed gene.

[0033] The term "increased expression" is intended to include an alteration in gene expression at least at the level of increased mRNA production and, preferably, at the level of polypeptide expression. The term "increased production" is intended to include an increase in the amount of a polypeptide expressed, in the level of the enzymatic activity of the polypeptide, or a combination thereof.

[0034] A DNA or RNA "coding region" is a DNA or RNA molecule which is transcribed and/or translated into a polypeptide in a cell in vitro or in vivo when placed under the control of appropriate regulatory sequences. "Suitable regulatory regions" refer to nucleic acid regions located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding region, and which influence the transcription, RNA processing or stability, or translation of the associated coding region. Regulatory regions may include promoters, translation leader sequences, RNA processing sites, effector binding sites, and stem-loop structures. The boundaries of the coding region are determined by a start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxyl) terminus. A coding region can include, but is not limited to, prokaryotic regions, cDNA from mRNA, genomic DNA molecules, synthetic DNA molecules, or RNA molecules. If the coding region is intended for expression in a eukaryotic cell, a polyadenylation signal and transcription termination sequence will usually be located 3' to the coding region.

[0035] An "isoform" is a protein that has the same function as another protein but which is encoded by a different gene and may have small differences in its sequence.

[0036] A "paralogue" is a protein encoded by a gene related by duplication within a genome.

[0037] An "orthologue" is a gene from a different species that has evolved from a common ancestral gene by speciation. Normally, orthologues retain the same function as the ancestral gene in the course of evolution.

[0038] "Open reading frame" is abbreviated ORF and means a length of nucleic acid, either DNA, cDNA or RNA, that comprises a translation start signal or initiation codon, such as an ATG or AUG, and a termination codon and can be potentially translated into a polypeptide sequence.
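
In a non-limiting illustration of the definition above, the following Python sketch scans the forward strand of a DNA string for stretches that begin at an ATG initiation codon and end at the first in-frame termination codon. The function name find_orfs, the min_codons parameter, the use of the standard stop codons (TAA, TAG, TGA), and the example sequence are assumptions made for illustration only and are not part of the disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each forward-strand ORF.

    An ORF here starts at ATG and runs to the first in-frame stop codon
    (stop codon included). Positions are 0-based and `end` is exclusive.
    ATG codons nested inside a longer ORF are reported as separate ORFs.
    """
    dna = dna.upper()
    orfs = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the reading frame set by this ATG.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                break
    return orfs


example = "CCATGGCTTGAAA"  # hypothetical sequence, for illustration only
print(find_orfs(example))
```

The sketch considers only one strand and only ATG starts; an RNA input would first need U replaced by T.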

[0039] "Promoter" refers to a DNA fragment capable of controlling the expression of a coding sequence or functional RNA. In general, a coding region is located 3' to a promoter. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity. A promoter is generally bounded at its 3' terminus by the transcription initiation site and extends upstream (5' direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter will be found a transcription initiation site (conveniently defined, for example, by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase.

[0040] A coding region is "under the control" of transcriptional and translational control elements in a cell when RNA polymerase transcribes the coding region into mRNA, which is then trans-RNA spliced (if the coding region contains introns) and translated into the protein encoded by the coding region.

[0041] "Transcriptional and translational control regions" are DNA regulatory regions, such as promoters, enhancers, terminators, and the like, that provide for the expression of a coding region in a host cell. In eukaryotic cells, polyadenylation signals are control regions.

[0042] The term "operably associated" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably associated with a coding region when it is capable of affecting the expression of that coding region (i.e., that the coding region is under the transcriptional control of the promoter). Coding regions can be operably associated to regulatory regions in sense or antisense orientation.

[0043] The term "expression," as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide.

[0044] The terms "activity," "activities," "enzymatic activity," and "enzymatic activities" are used interchangeably and are intended to include any functional activity normally attributed to a selected polypeptide when produced under favorable conditions. Typically, the activity of a selected polypeptide encompasses the total enzymatic activity associated with the produced polypeptide. The polypeptide produced by a host cell and having enzymatic activity may be located in the intracellular space of the cell, cell-associated, secreted into the extracellular milieu, or a combination thereof. Techniques for determining total activity as compared to secreted activity are described herein and are known in the art.

[0045] The term "xylanolytic activity" is intended to include the ability to hydrolyze glycosidic linkages in oligopentoses and polypentoses.

[0046] The term "cellulolytic activity" is intended to include the ability to hydrolyze glycosidic linkages in oligohexoses and polyhexoses. Cellulolytic activity may also include the ability to depolymerize or debranch cellulose and hemicellulose.

[0047] As used herein, the term "lactate dehydrogenase" or "LDH" is intended to include the enzyme capable of converting pyruvate into lactate. It is understood that LDH can also catalyze the oxidation of hydroxybutyrate.

[0048] As used herein the term "alcohol dehydrogenase" or "ADH" is intended to include the enzyme capable of converting acetaldehyde into an alcohol, such as ethanol.

[0049] As used herein, the term "phosphotransacetylase" or "PTA" is intended to include the enzyme capable of converting acetyl-CoA into acetyl phosphate.

[0050] As used herein, the term "acetate kinase" or "ACK" is intended to include the enzyme capable of converting acetyl phosphate into acetate.

[0051] As used herein, the term "pyruvate formate lyase" or "PFL" is intended to include the enzyme capable of converting pyruvate into Acetyl CoA.

[0052] The term "pyruvate decarboxylase activity" is intended to include the ability of a polypeptide to enzymatically convert pyruvate into acetaldehyde (e.g., "pyruvate decarboxylase" or "PDC"). Typically, the activity of a selected polypeptide encompasses the total enzymatic activity associated with the produced polypeptide, comprising, e.g., the superior substrate affinity of the enzyme, thermostability, stability at different pHs, or a combination of these attributes.

[0053] The term "ethanologenic" is intended to include the ability of a microorganism to produce ethanol from a carbohydrate as a fermentation product. The term is intended to include, but is not limited to, naturally occurring ethanologenic organisms, ethanologenic organisms with naturally occurring or induced mutations, and ethanologenic organisms which have been genetically modified.

[0054] The terms "fermenting" and "fermentation" are intended to include the enzymatic process (e.g., cellular or acellular, e.g., a lysate or purified polypeptide mixture) by which ethanol is produced from a carbohydrate, in particular, as a product of fermentation.

[0055] The term "secreted" is intended to include the movement of polypeptides to the periplasmic space or extracellular milieu. The term "increased secretion" is intended to include situations in which a given polypeptide is secreted at an increased level (i.e., in excess of the naturally-occurring amount of secretion). In certain embodiments, the term "increased secretion" refers to an increase in secretion of a given polypeptide that is at least about 10%, or at least about 100%, 200%, 300%, 400%, 500%, 600%, 700%, 800%, 900%, 1000%, or more, as compared to the naturally-occurring level of secretion.
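
As a non-limiting illustration of how the percentages above may be computed, the following Python sketch expresses the secretion level of a modified strain as a percent increase over the naturally-occurring level. The function names, the threshold_pct parameter, and the example numbers are hypothetical; the measurement units are arbitrary as long as both levels share them, and the qualifier "about" is not modeled.

```python
def percent_increase(native_level, engineered_level):
    """Percent increase of engineered_level over native_level."""
    if native_level <= 0:
        raise ValueError("native_level must be positive")
    return (engineered_level - native_level) / native_level * 100.0


def is_increased_secretion(native_level, engineered_level, threshold_pct=10.0):
    """True if the increase meets the chosen threshold (default: 10%)."""
    return percent_increase(native_level, engineered_level) >= threshold_pct


# Hypothetical values: secretion doubles from 2.0 to 4.0 (arbitrary units).
print(percent_increase(2.0, 4.0))
print(is_increased_secretion(2.0, 4.0, threshold_pct=100.0))
```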

[0056] The term "secretory polypeptide" is intended to include any polypeptide(s), alone or in combination with other polypeptides, that facilitate the transport of another polypeptide from the intracellular space of a cell to the extracellular milieu. In certain embodiments, the secretory polypeptide(s) encompass all the necessary secretory polypeptides sufficient to impart secretory activity to a Gram-negative or Gram-positive host cell. Typically, secretory proteins are encoded in a single region or locus that may be isolated from one host cell and transferred to another host cell using genetic engineering. In certain embodiments, the secretory polypeptide(s) are derived from any bacterial cell having secretory activity. In certain embodiments, the secretory polypeptide(s) are derived from a host cell having Type II secretory activity. In certain embodiments, the host cell is a thermophilic bacterial cell.

[0057] The term "derived from" is intended to include the isolation (in whole or in part) of a polynucleotide segment from an indicated source or the purification of a polypeptide from an indicated source. The term is intended to include, for example, direct cloning, PCR amplification, or artificial synthesis from or based on a sequence associated with the indicated polynucleotide source.

[0058] By "thermophilic" is meant an organism that thrives at a temperature of about 45°C or higher.

[0059] By "mesophilic" is meant an organism that thrives at a temperature of about 20-45°C.
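
In a non-limiting illustration, the two temperature definitions above can be expressed as a simple classification. The Python sketch below is an assumption-laden reading: both definitions say "about" and share the 45°C boundary, so assigning exactly 45°C to the thermophilic class, and returning "unclassified" below 20°C, are choices made here for illustration. The function and constant names are hypothetical.

```python
THERMOPHILE_MIN_C = 45.0  # "about 45°C or higher"
MESOPHILE_MIN_C = 20.0    # lower end of "about 20-45°C"


def classify_by_growth_temperature(temp_c):
    """Classify an organism by the temperature (°C) at which it thrives."""
    if temp_c >= THERMOPHILE_MIN_C:
        return "thermophilic"  # assumption: the 45°C boundary goes here
    if temp_c >= MESOPHILE_MIN_C:
        return "mesophilic"
    return "unclassified"      # below the ranges defined in the text


for t in (30.0, 45.0, 60.0):
    print(t, classify_by_growth_temperature(t))
```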

[0060] The term "organic acid" is art-recognized. "Organic acid," as used herein, also includes certain organic solvents such as ethanol. The term "lactic acid" refers to the organic acid 2-hydroxypropionic acid in either the free acid or salt form. The salt form of lactic acid is referred to as "lactate" regardless of the neutralizing agent, i.e., calcium carbonate or ammonium hydroxide. The term "acetic acid" refers to the organic acid methanecarboxylic acid, also known as ethanoic acid, in either free acid or salt form. The salt form of acetic acid is referred to as "acetate."

[0061] Certain embodiments of the present invention provide for the "insertion," (e.g., the addition, integration, incorporation, or introduction) of certain genes or particular polynucleotide sequences within thermophilic or mesophilic microorganisms, which insertion of genes or particular polynucleotide sequences may be understood to encompass "genetic modification(s)" or "transformation(s)" such that the resulting strains of said thermophilic or mesophilic microorganisms may be understood to be "genetically modified" or "transformed." In certain embodiments, strains may be of bacterial, fungal, or yeast origin.

[0062] Certain embodiments provide for the "inactivation" or "deletion" of certain genes or particular polynucleotide sequences within thermophilic or mesophilic microorganisms, which "inactivation" or "deletion" of genes or particular polynucleotide sequences may be understood to encompass "genetic modification(s)" or "transformation(s)" such that the resulting strains of said thermophilic or mesophilic microorganisms may be understood to be "genetically modified" or "transformed." In certain embodiments, strains may be of bacterial, fungal, or yeast origin.

[0063] The term "CBP organism" is intended to include microorganisms of the invention, e.g., microorganisms that have properties suitable for CBP.

[0064] In one aspect of the invention, the genes or particular polynucleotide sequences are inserted to activate the activity for which they encode, such as the expression of an enzyme. In certain embodiments, genes encoding enzymes in the metabolic production of ethanol, e.g., enzymes that metabolize pentose and/or hexose sugars, may be added to a mesophilic or thermophilic organism. In certain embodiments of the invention, the enzyme may confer the ability to metabolize a pentose sugar and be involved, for example, in the D-xylose pathway and/or L-arabinose pathway.

[0065] In one aspect of the invention, the genes or particular polynucleotide sequences are partially, substantially, or completely deleted, silenced, inactivated, or down-regulated in order to inactivate the activity for which they encode, such as the expression of an enzyme. Deletions provide maximum stability because there is no opportunity for a reverse mutation to restore function. Alternatively, genes can be partially, substantially, or completely deleted, silenced, inactivated, or down-regulated by insertion of nucleic acid sequences that disrupt the function and/or expression of the gene (e.g., P1 transduction or other methods known in the art). The terms "eliminate," "elimination," and "knockout" are used interchangeably with the terms "deletion," "partial deletion," "substantial deletion," or "complete deletion." In certain embodiments, strains of thermophilic or mesophilic microorganisms of interest may be engineered by site-directed homologous recombination to knock out the production of organic acids. In still other embodiments, RNAi or antisense DNA (asDNA) may be used to partially, substantially, or completely silence, inactivate, or down-regulate a particular gene of interest.

[0066] In certain embodiments, the genes targeted for deletion or inactivation as described herein may be endogenous to the native strain of the microorganism, and may thus be understood to be referred to as "native gene(s)" or "endogenous gene(s)." An organism is in "a native state" if it has not been genetically engineered or otherwise manipulated by the hand of man in a manner that intentionally alters the genetic and/or phenotypic constitution of the organism. For example, wild-type organisms may be considered to be in a native state. In other embodiments, the gene(s) targeted for deletion or inactivation may be non-native to the organism.

Biomass

[0067] Biomass can include any type of biomass known in the art or described herein. The terms "lignocellulosic material," "lignocellulosic substrate," and "cellulosic biomass" mean any type of biomass comprising cellulose, hemicellulose, lignin, or combinations thereof, such as but not limited to woody biomass, forage grasses, herbaceous energy crops, non-woody-plant biomass, agricultural wastes and/or agricultural residues, forestry residues and/or forestry wastes, paper-production sludge and/or waste paper sludge, waste-water-treatment sludge, municipal solid waste, corn fiber from wet and dry mill corn ethanol plants, and sugar-processing residues. The terms "hemicellulosics," "hemicellulosic portions," and "hemicellulosic fractions" mean the non-lignin, non-cellulose elements of lignocellulosic material, such as but not limited to hemicellulose (i.e., comprising xyloglucan, xylan, glucuronoxylan, arabinoxylan, mannan, glucomannan, and galactoglucomannan, inter alia), pectins (e.g., homogalacturonans, rhamnogalacturonan I and II, and xylogalacturonan), and proteoglycans (e.g., arabinogalactan-protein, extensin, and proline-rich proteins).

[0068] In a non-limiting example, the lignocellulosic material can include, but is not limited to, woody biomass, such as recycled wood pulp fiber, sawdust, hardwood, softwood, and combinations thereof; grasses, such as switch grass, cord grass, rye grass, reed canary grass, miscanthus, or a combination thereof; sugar-processing residues, such as but not limited to sugar cane bagasse; agricultural wastes, such as but not limited to rice straw, rice hulls, barley straw, corn cobs, cereal straw, wheat straw, canola straw, oat straw, oat hulls, and corn fiber; stover, such as but not limited to soybean stover and corn stover; and forestry wastes, such as but not limited to recycled wood pulp fiber, sawdust, hardwood (e.g., poplar, oak, maple, birch, willow), softwood, or any combination thereof. Lignocellulosic material may comprise one species of fiber; alternatively, lignocellulosic material may comprise a mixture of fibers that originate from different lignocellulosic materials. Other lignocellulosic materials include agricultural wastes, such as cereal straws, including wheat straw, barley straw, canola straw and oat straw; corn fiber; stovers, such as corn stover and soybean stover; grasses, such as switch grass, reed canary grass, cord grass, and miscanthus; or combinations thereof.

[0069] Paper sludge is also a viable feedstock for lactate or acetate production. Paper sludge is solid residue arising from pulping and paper-making, and is typically removed from process wastewater in a primary clarifier. At a disposal cost of $30/wet ton, the cost of sludge disposal equates to $5/ton of paper that is produced for sale. The cost of disposing of wet sludge is a significant incentive to convert the material for other uses, such as conversion to ethanol. Processes provided by the present invention are widely applicable. Moreover, the saccharification and/or fermentation products may be used to produce ethanol or higher value added chemicals, such as organic acids, aromatics, esters, acetone and polymer intermediates.
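
The two disposal figures above imply a sludge generation rate, which the following Python sketch works out: $5 per ton of paper divided by $30 per wet ton of sludge gives roughly one-sixth of a wet ton of sludge per ton of paper sold. Only the two dollar figures come from the text; the function name and the 100,000 ton/year mill output are hypothetical values chosen for illustration.

```python
DISPOSAL_COST_PER_WET_TON = 30.0   # USD per wet ton of sludge (from the text)
DISPOSAL_COST_PER_TON_PAPER = 5.0  # USD per ton of paper sold (from the text)

# Implied generation rate: wet tons of sludge per ton of paper sold.
WET_SLUDGE_PER_TON_PAPER = DISPOSAL_COST_PER_TON_PAPER / DISPOSAL_COST_PER_WET_TON


def annual_disposal_cost(paper_tons_per_year):
    """Annual sludge disposal cost in USD for a given paper output."""
    sludge_wet_tons = paper_tons_per_year * WET_SLUDGE_PER_TON_PAPER
    return sludge_wet_tons * DISPOSAL_COST_PER_WET_TON


print(f"{WET_SLUDGE_PER_TON_PAPER:.3f} wet tons of sludge per ton of paper")
# Hypothetical mill producing 100,000 tons of paper per year.
print(f"${annual_disposal_cost(100_000):,.0f} per year")
```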

Pretreatment

[0070] Lignocellulosic materials generally require some method of pretreatment to increase the accessibility of lignocellulosics and other components to enzymes. Pretreatment can include any method or type of pretreatment known in the art. Among processes developed to pretreat lignocellulosic biomass, steam explosion has been identified as a low cost and high yield technology, along with low-pressure steam autohydrolysis. In certain manifestations, steam explosion heats wetted lignocellulose to high temperatures (e.g., about 160°C to about 230°C) and releases the pressure immediately. Due to rapid decompression, which flashes the water trapped in fibers, physical size reduction occurs. The high temperatures remove acetic acid from hemicellulose, so this process results in some autohydrolysis of the biomass. Additional chemical agents, such as sulfuric acid or ammonia (e.g., gaseous, anhydrous liquid, or ammonium hydroxide), may be added to aid in the hydrolysis. The pretreated cellulose can then be sterilized, if desired, to prevent growth of other microorganisms during the fermentation reaction. The optimum reaction conditions vary depending on the starting material.

[0071] In a non-limiting example, the lignocellulosic materials may be soaked in water or other suitable liquid(s) prior to the addition of steam or ammonia or both. The excess water may be drained off the lignocellulosic materials. The soaking may be done prior to conveying into a reactor, or subsequent to entry (i.e., inside a pretreatment reactor).

[0072] The terms "reactor" and "pretreatment reactor" used herein mean any vessel suitable for practicing a method of the present invention. The dimensions of the pretreatment reactor should be sufficient to accommodate the lignocellulose material conveyed into and out of the reactor, as well as additional headspace around the material. In a non-limiting example, the headspace extends about one foot around the space occupied by the materials. Furthermore, the pretreatment reactor should be constructed of a material capable of withstanding the pretreatment conditions. Specifically, the construction of the reactor should be such that the pH, temperature and pressure do not affect the integrity of the vessel.

[0073] The size range of the substrate material varies widely and depends upon the type of substrate material used as well as the requirements and needs of a given process. In a preferred embodiment of the invention, the lignocellulosic raw material may be prepared in such a way as to permit ease of handling in conveyors, hoppers and the like. In the case of wood, the chips obtained from commercial chippers are suitable; in the case of straw it is sometimes desirable to chop the stalks into uniform pieces about 1 to about 3 inches in length. Depending on the intended degree of pretreatment, the size of the substrate particles prior to pretreatment may range from less than a millimeter to inches in length. The particles need only be of a size that is reactive.

[0074] Ultrasound treatments may also be applied to processes of the present invention. See U.S. Patent No. 6,333,181, which is hereby incorporated by reference.

Consolidated Bioprocessing

[0075] Consolidated bioprocessing (CBP) is a processing strategy for cellulosic biomass that involves consolidating into a single process step four biologically-mediated events: enzyme production, hydrolysis, hexose fermentation, and pentose fermentation. Implementing this strategy requires development of microorganisms that both utilize cellulose, hemicellulosics, and other biomass components and produce a product of interest at sufficiently high yield and concentration. The feasibility of CBP is supported by kinetic and bioenergetic analysis. See van Walsum and Lynd (1998) Biotech. Bioeng. 58:316.

Xylose metabolism

[0076] Xylose is a five-carbon monosaccharide that can be metabolized into useful products by a variety of organisms. There are two main pathways of xylose metabolism, each distinguished by the characteristic enzymes it utilizes. One pathway is called the "Xylose Reductase-Xylitol Dehydrogenase" or XR-XDH pathway. Xylose reductase (XR) and xylitol dehydrogenase (XDH) are the two main enzymes used in this method of xylose degradation. XR, encoded by the XYL1 gene, is responsible for the reduction of xylose to xylitol and is aided by cofactors NADH or NADPH. Xylitol is then oxidized to xylulose by XDH, which is encoded by the XYL2 gene, in a reaction that uses exclusively the cofactor NAD+. Because of the varying cofactors needed in this pathway and the degree to which they are available for usage, an imbalance can result in an overproduction of xylitol byproduct and an inefficient production of desirable ethanol. Varying expression levels of the XR and XDH enzymes have been tested in the laboratory in an attempt to optimize the efficiency of the xylose metabolism pathway.

[0077] The other pathway for xylose metabolism is called the "Xylose Isomerase" (XI) pathway. The enzyme XI is responsible for direct conversion of xylose into xylulose, and does not proceed via a xylitol intermediate. Both pathways create xylulose, although the enzymes utilized are different. After production of xylulose, both the XR-XDH and XI pathways proceed through the enzyme xylulokinase (XK), encoded by the XKS1 gene, which phosphorylates xylulose to xylulose-5-P, which then enters the pentose phosphate pathway for further catabolism.
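
The cofactor imbalance described for the XR-XDH pathway, and its absence in the XI pathway, can be made concrete with a simple tally. The Python sketch below counts the net change in each cofactor pool per mole of xylose converted to xylulose, using only the cofactor usage stated above (XR uses NADH or NADPH; XDH uses only NAD+; XI uses neither). The function name and the xr_nadph_fraction parameter, which represents the share of XR turnover that draws on NADPH, are hypothetical and introduced only for illustration; protons and the downstream xylulokinase step are omitted.

```python
def cofactor_balance(pathway, xr_nadph_fraction=1.0):
    """Net cofactor change per mole of xylose converted to xylulose.

    Positive values mean net production, negative values net consumption.
    """
    if not 0.0 <= xr_nadph_fraction <= 1.0:
        raise ValueError("xr_nadph_fraction must be between 0 and 1")
    balance = {"NADPH": 0.0, "NADP+": 0.0, "NADH": 0.0, "NAD+": 0.0}
    if pathway == "XI":
        return balance  # direct isomerization, no redox cofactor involved
    if pathway != "XR-XDH":
        raise ValueError("pathway must be 'XR-XDH' or 'XI'")
    f = xr_nadph_fraction
    # XR: xylose + NAD(P)H -> xylitol + NAD(P)+
    balance["NADPH"] -= f
    balance["NADP+"] += f
    balance["NADH"] -= 1.0 - f
    balance["NAD+"] += 1.0 - f
    # XDH: xylitol + NAD+ -> xylulose + NADH
    balance["NAD+"] -= 1.0
    balance["NADH"] += 1.0
    return balance


print(cofactor_balance("XR-XDH", xr_nadph_fraction=1.0))
print(cofactor_balance("XR-XDH", xr_nadph_fraction=0.0))
print(cofactor_balance("XI"))
```

Under this tally, an XR that draws only on NADPH leaves one NADH produced and one NADPH consumed per xylose, whereas an XR that draws only on NADH is redox-neutral, as is the XI pathway.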

[0078] Studies on flux through the pentose phosphate pathway during xylose metabolism have revealed that limiting the speed of this step may be beneficial to the efficiency of fermentation to ethanol. Modifications to this flux that may improve ethanol production include a) lowering phosphoglucose isomerase activity, b) deleting the GND1 gene, and c) deleting the ZWF1 gene (Jeppsson et al., Appl. Environ. Microbiol. 2002 Apr; 68(4): 1604-9). Since the pentose phosphate pathway produces additional NADPH during metabolism, limiting this step will help to correct the already evident imbalance between NAD(P)H and NAD+ cofactors and reduce xylitol byproduct. Another experiment comparing the two xylose metabolizing pathways revealed that the XI pathway was best able to metabolize xylose to produce the greatest ethanol yield, while the XR-XDH pathway reached a much faster rate of ethanol production (Karhumaa et al., Microb. Cell. Fact. 2007 Feb 5;6:5). See also, Publication No. WO2006/009434, incorporated herein by reference in its entirety.

[0079] Microorganisms are particularly diverse in the fermentation products that are produced by different genera. These products include organic acids, such as lactic, acetic, succinic, and butyric, as well as neutral products, such as ethanol, butanol, acetone, and butanediol. Aspects of the present invention relate to microorganisms with the ability to produce enzymes, which are used to depolymerize the hemicellulosic portions of lignocellulosic biomass materials (e.g., hemicellulose, pectins, proteoglycans). Aspects of the present invention provide for a novel approach wherein microorganisms that produce such enzymes are implemented in the consolidated bioprocessing of lignocellulosic materials, more particularly in combination with a sugar stream such as, but not limited to, molasses, cane juice, or starch (e.g., corn, wheat, potato). Previously, costly enzyme had to be purchased, which was the economic driver that led researchers to focus on hydrolyzing hemicellulose during pretreatment. In certain embodiments, methods of the present invention provide for an approach wherein the enzyme specifically targets the hemicellulosic fractions of the biomass (i.e., non-lignin, non-cellulose elements such as but not limited to hemicellulose, pectins, and proteoglycans). Previously, all depolymerization of any undigested hemicellulosics exiting pretreatment depended upon cellulase enzyme, which specifically targets cellulose. In certain embodiments of the present invention, the organisms will be thermophilic (i.e., thrive at a temperature of about 45°C or higher), and thus able to thrive at the temperature of optimal enzymatic activity. Previously, all microorganisms implemented in alternate processes were not able to thrive at optimal temperatures for enzymatic hydrolysis.

[0080] With the introduction of recombinant DNA technology it has become possible to clone genes from one organism and transfer them to another organism, delete genes in the genome and also vary the expression levels of genes. It is thus possible to perform directed modifications of metabolic pathways. This new discipline is called metabolic engineering and has been defined as "Improvement of cellular activities by manipulation of enzymatic, transport, and regulatory functions of the cell with the use of recombinant DNA technology" and "Purposeful modification of intermediary metabolism using recombinant DNA techniques." Like all classic fields of engineering, metabolic engineering is characterized by an analysis step and a synthesis step. In the analysis step, the microorganism is physiologically characterized and evaluated using, for instance, MFA (metabolic flux analysis), enzymatic activity measurements or expression analysis. The analysis provides information on where genetic modifications may improve the performance of the microorganism. The synthesis step involves the construction of a strain, with genetic modifications based on the analysis, using recombinant DNA technology. The new recombinant strain is then analyzed using the same methodology as for its parental strain. If the analysis reveals that further improvement is required, new targets for genetic manipulation are identified followed by a new round of synthesis and analysis.
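
The alternating analysis and synthesis steps described above form an iterative loop, which the following Python sketch outlines in a purely schematic way. The function engineer_strain and its callable parameters (analyze, synthesize, is_good_enough) are hypothetical placeholders for laboratory work such as metabolic flux analysis and strain construction; the toy example represents a strain as an integer count of modifications and carries no experimental meaning.

```python
def engineer_strain(parent, analyze, synthesize, is_good_enough, max_rounds=5):
    """Iterate analysis and synthesis until the strain is good enough.

    analyze(strain)            -> list of targets for genetic modification
    synthesize(strain, targets)-> new recombinant strain
    is_good_enough(strain)     -> True when no further improvement is needed
    Returns the final strain and the (strain, targets) history per round.
    """
    strain = parent
    history = []
    for _ in range(max_rounds):
        targets = analyze(strain)          # analysis step
        history.append((strain, targets))
        if is_good_enough(strain) or not targets:
            break
        strain = synthesize(strain, targets)  # synthesis step
    return strain, history


# Toy stand-ins: a "strain" is just the number of modifications made so far.
final, rounds = engineer_strain(
    parent=0,
    analyze=lambda s: ["next_target"] if s < 3 else [],
    synthesize=lambda s, targets: s + 1,
    is_good_enough=lambda s: s >= 3,
)
print(final, len(rounds))
```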

[0081] Aspects of the present invention are related to microorganisms with the ability to produce enzymes, which are used to depolymerize the hemicellulosic portions of lignocellulosic biomass materials (i.e., the non-lignin, non-cellulose elements such as but not limited to hemicellulose, pectins, and proteoglycans).

[0082] Pectin is a heterosaccharide derived from the cell wall of plants. Pectins vary in their chain lengths, complexity and the order of each of the monosaccharide units. Pectin is composed of four main polysaccharide types: homogalacturonan, which is composed of repeated D-galacturonic acid monosaccharide subunits that are methyl-esterified to a varying degree; rhamnogalacturonan I, which is composed of alternating L-rhamnose and D-galacturonic acid subunits that contain α-1,5-L-arabinan and β-1,4-D-galactan side chains; rhamnogalacturonan II, which is a complex, highly branched polysaccharide; and xylogalacturonans. The gelling characteristics of different pectins are influenced greatly by the degree of esterification of the molecule. Pectin releases small amounts of methane in a non-enzymatic reaction and methanol in an enzymatic reaction.

[0083] Proteoglycans represent a special class of glycoproteins that are heavily glycosylated. They consist of a core protein with one or more covalently attached glycosaminoglycan chain(s). These glycosaminoglycan (GAG) chains are long, linear carbohydrate polymers. In certain embodiments, the present invention relates to the proteoglycans arabinogalactan-protein, extensin, and/or proline-rich proteins.

[0084] Hemicelluloses are heteropolysaccharides formed from a variety of monomers. The most common monomers are glucose, galactose, and mannose (hexoses) and xylose and arabinose (pentoses). Hemicellulase enzymes are broadly categorized (e.g., as a glucanase, xylanase, or mannanase) based on their ability to catalyze the hydrolysis of heteropolysaccharides composed of glucan, xylan, or mannan, respectively.

[0085] Aspects of the present invention relate to microorganisms that produce enzymes, which act to depolymerize hemicellulosic portions of lignocellulosic biomass. Such enzymes may also be active primarily on side-chains (e.g., arabinofuranosidase). Such enzymes may be active for de-branching. Such enzymes may be active for methylation and/or other chemical modifications of biomass polysaccharides. Said microorganisms can be cultured per se or can be used as sources of genetic information with which to engineer other microorganisms to produce the enzyme. In certain embodiments, enzymes useful in the present invention may have a pH profile for activity in catalyzing said degradation that ranges from about pH 4.5 to about pH 11 and may be active at a temperature of at least about 45°C to about 60°C. In certain embodiments, the present invention relates to enzymes selected from the group consisting of: endoglucanase, exoglucanase, endoxylanase, exoxylanase, endogalactanase, endoarabinase, cellobiohydrolase, exo-β-1,3-glucanase, endo-β-1,4-glucanase, endo-β-D-mannanase, endo-β-1,4-mannanase, β-mannanase, β-mannosidase, endo-β-xylanase, α-galactosidase, polygalacturonase, α-glucuronidase, cellodextrinase, xyloglucanase, xylose isomerase, xylose reductase, xylitol dehydrogenase, xylulokinase, transaldolase, transketolase, β-glucosidase, endo-1,4-β-xylanase (EC-Number 3.2.1.8), xylan endo-β-1,3-xylosidase (EC-Number 3.2.1.32), α-xylosidase, β-xylosidase, oligoxyloglucan hydrolase, oligoxyloglucan reducing-end-specific cellobiohydrolase (EC-Number 3.2.1.150), endoxyloglucan transferase, xyloglucan endotransglycosylase, xyloglucan hydrolase, xyloglucan endohydrolase, xyloglucan-specific exo-β-1,4-glucanase (EC-Number 3.2.1.155), xyloglucan-specific endo-β-1,4-glucanase (EC-Number 3.2.1.151), glucuronoarabinoxylan endo-β-1,4-xylanase (EC-Number 3.2.1.136), α-L-arabinofuranosidase, acetylesterase, acetylxylanesterase, α-amylase, β-amylase, glucoamylase, pullulanase, β-glucanase, hemicellulase, arabinosidase, mannanase, pectin hydrolase, pectate lyase, and combinations thereof.

[0086] In certain embodiments, some recombinant microorganisms useful in the present invention have been transformed with heterologous sequences that encode enzymes involved in the metabolism of hemicellulose to ethanol, including, but not limited to, xylose isomerase, xylose reductase, xylitol dehydrogenase, xylulokinase, transaldolase, and transketolase. The nucleotide sequences of representatives of said encoding genes are publicly available without restriction.

[0087] It is known that enzymes that effect hydrolysis of mannans, such as a galactan or a glucomannan, are produced by various microorganisms, including bacteria and fungi, and that they also occur in some animals and in numerous plants. Among the microorganisms that produce such mannanases are species of Aeromonas, Aspergillus, Streptomyces, Rhodococcus, and Bacillus (e.g., B. circulans).

[0088] A hemicellulase capable of catalyzing the degradation of mannan-containing hemicellulose can be produced using microorganisms that synthesize the enzyme. Microorganisms possessing the ability to produce a hemicellulase can be isolated by conventional methods from soil, where they comprise a reproducible, finite subsection of indigenous microflora, and they can also be produced by transforming another microorganism of choice, such as B. subtilis or B. brevis. Transformation via standard recombinant DNA techniques may proceed with hemicellulase-encoding DNA obtained from, e.g., soil microflora.

[0089] Mannanase can be produced by fungi or bacteria, for example, by microorganisms belonging to the following genera: Trichoderma (e.g., T. reesei), Aspergillus (e.g., A. niger, A. aculeatus), Phanerochaete (e.g., P. chrysosporium), Penicillium (e.g., P. janthinellum, P. digitatum), and Bacillus. As a host organism for mannanase production, a white-rot fungus belonging to the genus Phlebia, Ceriporiopsis, or Trametes can be used.

[0090] Xylanase can be obtained or derived from fungal and bacterial organisms, for example, Aspergillus, Disporotrichum, Penicillium, Neurospora, Fusarium, and Trichoderma. A Bacillus xylanase can be obtained from, for example, B. halodurans, B. pumilus, B. agaradhaerens, B. circulans, B. polymyxa, B. stearothermophilus, and B. subtilis.

[0091] Fungal xylanase can be obtained from yeast or filamentous fungi and can be derived, for example, from the following genera: Aspergillus, Aureobasidium, Emericella, Fusarium, Gaeumannomyces, Humicola, Lentinula, Magnaporthe, Neocallimastix, Nocardiopsis, Orpinomyces, Paecilomyces, Penicillium, Pichia, Schizophyllum, Talaromyces, Thermomyces, and Trichoderma.

[0092] Other approaches to xylose fermentation include the conversion of xylose to xylulose using xylose isomerase prior to fermentation by Saccharomyces cerevisiae and the development of genetically engineered strains of S. cerevisiae, which express xylose isomerase (supra).

[0093] Erwinia chrysanthemi is also known to produce at least two different endoglucanase activities, EGY and EGZ.

Host Cells

[0094] Host cells useful in the present invention include any prokaryotic or eukaryotic cells; for example, microorganisms selected from bacterial, algal, and yeast cells. Among host cells thus suitable for the present invention are microorganisms, for example, of the genera Aeromonas, Aspergillus, Bacillus, Escherichia, Kluyveromyces, Pichia, Rhodococcus, Saccharomyces and Streptomyces.

[0095] In some embodiments, the host cells are microorganisms. In one embodiment the microorganism is a yeast. According to the present invention the yeast host cell can be, for example, from the genera Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, and Yarrowia. Yeast species as host cells may include, for example, S. cerevisiae, S. bulderi, S. barnetti, S. exiguus, S. uvarum, S. diastaticus, K. lactis, K. marxianus, or K. fragilis. In some embodiments, the yeast is selected from the group consisting of Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans, Pichia pastoris, Pichia stipitis, Yarrowia lipolytica, Hansenula polymorpha, Phaffia rhodozyma, Candida utilis, Arxula adeninivorans, Debaryomyces hansenii, Debaryomyces polymorphus, and Schwanniomyces occidentalis. In one particular embodiment, the yeast is Saccharomyces cerevisiae. In another embodiment, the yeast is a thermotolerant Saccharomyces cerevisiae. The selection of an appropriate host is deemed to be within the scope of those skilled in the art from the teachings herein.

[0096] In some embodiments, the host cell is an oleaginous cell. The oleaginous host cell can be an oleaginous yeast cell. For example, the oleaginous yeast host cell can be from the genera Blakeslea, Candida, Cryptococcus, Cunninghamella, Lipomyces, Mortierella, Mucor, Phycomyces, Pythium, Rhodosporidium, Rhodotorula, Trichosporon or Yarrowia. According to the present invention, the oleaginous host cell can be an oleaginous microalgae host cell. For example, the oleaginous microalgae host cell can be from the genera Thraustochytrium or Schizochytrium. Biodiesel could then be produced from the triglyceride produced by the oleaginous organisms using conventional lipid transesterification processes. In some particular embodiments, the oleaginous host cells can be induced to secrete synthesized lipids. Embodiments using oleaginous host cells are advantageous because they can produce biodiesel from lignocellulosic feedstocks which, relative to oilseed substrates, are cheaper, can be grown more densely, show lower life cycle carbon dioxide emissions, and can be cultivated on marginal lands.

[0097] In some embodiments, the host cell is a thermotolerant host cell. Thermotolerant host cells can be particularly useful in simultaneous saccharification and fermentation processes by allowing externally produced cellulases and ethanol-producing host cells to perform optimally in similar temperature ranges.

[0098] Thermotolerant host cells can include, for example, Issatchenkia orientalis, Pichia mississippiensis, Pichia mexicana, Pichia farinosa, Clavispora opuntiae, Clavispora lusitaniae, Candida mexicana, Hansenula polymorpha and Kluyveromyces host cells. In some embodiments, the thermotolerant cell is an S. cerevisiae strain, or other yeast strain, that has been adapted to grow in high temperatures, for example, by selection for growth at high temperatures in a cytostat.

[0099] In some particular embodiments, the host cell is a Kluyveromyces host cell. For example, the Kluyveromyces host cell can be a K. lactis, K. marxianus, K. blattae, K. phaffii, K. yarrowii, K. aestuarii, K. dobzhanskii, K. wickerhamii, K. thermotolerans, or K. waltii host cell. In one embodiment, the host cell is a K. lactis or K. marxianus host cell. In another embodiment, the host cell is a K. marxianus host cell.

[0100] In some embodiments, the thermotolerant host cell can grow at temperatures above about 30° C, about 31° C, about 32° C, about 33° C, about 34° C, about 35° C, about 36° C, about 37° C, about 38° C, about 39° C, about 40° C, about 41° C or about 42° C. In some embodiments of the present invention the thermotolerant host cell can produce ethanol from cellulose at temperatures above about 30° C, about 31° C, about 32° C, about 33° C, about 34° C, about 35° C, about 36° C, about 37° C, about 38° C, about 39° C, about 40° C, about 41° C, about 42° C, about 43° C, about 44° C, about 45° C, or about 50° C.

[0101] In some embodiments of the present invention, the thermotolerant host cell can grow at temperatures from about 30° C to 60° C, about 30° C to 55° C, about 30° C to 50° C, about 40° C to 60° C, about 40° C to 55° C or about 40° C to 50° C. In some embodiments of the present invention, the thermotolerant host cell can produce ethanol from cellulose at temperatures from about 30° C to 60° C, about 30° C to 55° C, about 30° C to 50° C, about 40° C to 60° C, about 40° C to 55° C or about 40° C to 50° C.

[0102] In some embodiments, the host cell has the ability to metabolize xylose. Detailed information regarding the development of the xylose-utilizing technology can be found in the following publications: Kuyper, M. et al., FEMS Yeast Res. 4:655-64 (2004), Kuyper, M. et al., FEMS Yeast Res. 5:399-409 (2005), and Kuyper, M. et al., FEMS Yeast Res. 5:925-34 (2005), which are herein incorporated by reference in their entirety. For example, xylose utilization can be accomplished in S. cerevisiae by heterologously expressing the xylose isomerase gene, XylA, e.g., from the anaerobic fungus Piromyces sp. E2, overexpressing five S. cerevisiae enzymes involved in the conversion of xylulose to glycolytic intermediates (xylulokinase, ribulose 5-phosphate isomerase, ribulose 5-phosphate epimerase, transketolase and transaldolase) and deleting the GRE3 gene encoding aldose reductase to minimize xylitol production.

[0103] The host cells can contain antibiotic markers or can contain no antibiotic markers.

[0104] In certain embodiments, the host cell is a microorganism that is a species of the genera Thermoanaerobacterium, Thermoanaerobacter, Clostridium, Geobacillus, Saccharococcus, Paenibacillus, Bacillus, Caldicellulosiruptor, Anaerocellum, or Anoxybacillus. In certain embodiments, the host cell is a bacterium selected from the group consisting of: Thermoanaerobacterium thermosulfurigenes, Thermoanaerobacterium aotearoense, Thermoanaerobacterium polysaccharolyticum, Thermoanaerobacterium zeae, Thermoanaerobacterium xylanolyticum, Thermoanaerobacterium saccharolyticum, Thermoanaerobium brockii, Thermoanaerobacterium thermosaccharolyticum, Thermoanaerobacter thermohydrosulfuricus, Thermoanaerobacter ethanolicus, Thermoanaerobacter brockii, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium phytofermentans, Clostridium straminosolvens, Geobacillus thermoglucosidasius, Geobacillus stearothermophilus, Saccharococcus caldoxylosilyticus, Saccharococcus thermophilus, Paenibacillus campinasensis, Bacillus flavothermus, Anoxybacillus kamchatkensis, Anoxybacillus gonensis, Caldicellulosiruptor acetigenus, Caldicellulosiruptor saccharolyticus, Caldicellulosiruptor kristjanssonii, Caldicellulosiruptor owensensis, Caldicellulosiruptor lactoaceticus, and Anaerocellum thermophilum. In certain embodiments, the host cell is Clostridium thermocellum, Clostridium cellulolyticum, or Thermoanaerobacterium saccharolyticum.

[0105] In one aspect, host cells are genetically engineered (transduced or transformed or transfected) with the polynucleotides encoding cellulases which are described in more detail herein. The polynucleotides encoding cellulases can be introduced to the host cell on a vector, which may be, for example, a cloning vector or an expression vector comprising a sequence encoding a heterologous cellulase. The host cells can comprise polynucleotides as integrated copies or plasmid copies.

[0106] The host cells useful in the methods of the invention can express one or more heterologous cellulase polypeptides. In some embodiments, the host cell comprises a combination of polynucleotides that encode heterologous cellulases or fragments, variants or derivatives thereof. The host cell can, for example, comprise multiple copies of the same nucleic acid sequence, for example, to increase expression levels, or the host cell can comprise a combination of unique polynucleotides. In other embodiments, the host cell comprises a single polynucleotide that encodes a heterologous cellulase or a fragment, variant or derivative thereof. In particular, such host cells expressing a single heterologous cellulase can be used in co-culture with other host cells of the invention comprising a polynucleotide that encodes at least one other heterologous cellulase or fragment, variant or derivative thereof.

[0107] Introduction of a polynucleotide encoding a heterologous cellulase into a host cell can be done by methods known in the art. Introduction of polynucleotides encoding heterologous cellulases into, for example, yeast host cells can be effected by lithium acetate transformation, spheroplast transformation, or transformation by electroporation, as described in Current Protocols in Molecular Biology, 13.7.1-13.7.10. Introduction of the construct in other host cells can be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (Davis, L. et al., Basic Methods in Molecular Biology (1986)).

[0108] The transformed host cells or cell cultures, as described above, can be further analyzed for hydrolysis of cellulose (e.g., by a sugar detection assay), for a particular type of cellulase activity (e.g., by measuring the individual endoglucanase, cellobiohydrolase or β-glucosidase activity) or for total cellulase activity. Endoglucanase activity can be determined, for example, by measuring an increase of reducing ends in an endoglucanase-specific CMC substrate. Cellobiohydrolase activity can be measured, for example, by using insoluble cellulosic substrates such as the amorphous substrate phosphoric acid swollen cellulose (PASC) or microcrystalline cellulose (Avicel) and determining the extent of the substrate's hydrolysis. β-glucosidase activity can be measured by a variety of assays, e.g., using cellobiose.

[0109] A total cellulase activity, which includes the activity of endoglucanase, cellobiohydrolase and β-glucosidase, can hydrolyze crystalline cellulose synergistically. Total cellulase activity can thus be measured using insoluble substrates including pure cellulosic substrates such as Whatman No. 1 filter paper, cotton linter, microcrystalline cellulose, bacterial cellulose, algal cellulose, and cellulose-containing substrates such as dyed cellulose, alpha-cellulose or pretreated lignocellulose. Specific activity of cellulases can also be detected by methods known to one of ordinary skill in the art, such as by the Avicel assay (described supra) that would be normalized by protein (cellulase) concentration measured for the sample.
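The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the units (activity units per mL and mg protein per mL), and the sample values are hypothetical and are not taken from the application.

```python
def specific_activity(activity_units_per_ml, protein_mg_per_ml):
    """Return activity normalized by protein concentration (units per mg protein)."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return activity_units_per_ml / protein_mg_per_ml


# Hypothetical assay readings: (activity in units/mL, protein in mg/mL).
samples = {
    "strain_A": (12.0, 0.5),
    "strain_B": (9.0, 0.25),
}

for name, (activity, protein) in samples.items():
    print(name, specific_activity(activity, protein))
```

In this made-up data set, the sample with lower raw activity has the higher specific activity once the lower protein concentration is accounted for, which is the reason for normalizing.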

[0110] One aspect of the invention is thus related to the efficient production of cellulases to aid in the digestion of cellulose and generation of ethanol. A cellulase can be any enzyme involved in cellulose digestion, metabolism and/or hydrolysis, including an endoglucanase, exoglucanase, or β-glucosidase.

[0111] In additional embodiments, the transformed host cells or cell cultures are assayed for ethanol production. Ethanol production can be measured by techniques known to one of ordinary skill in the art, e.g., by a standard HPLC refractive index method.

[0112] Recombinant host cells useful in the invention may be engineered for optimal expression and secretion of hemicellulosic depolymerization activities. For example, recombinant enteric bacteria, such as Escherichia and Klebsiella, can be transformed to express an endoglucanase under the transcriptional control of a surrogate promoter for optimal expression. For example, a recombinant enteric bacterium can be produced that expresses two different endoglucanases celY and celZ, where each is under the transcriptional control of a surrogate promoter for optimal expression in a particular ratio. See U.S. Patent No. 7,026,152, which is hereby incorporated by reference.

[0113] In certain embodiments, the hosts are further modified to include secretory protein or proteins that allow for the increased production and/or secretion of enzymes from the cell. For example, the hosts may be further modified to include exogenous ethanologenic genes derived from an efficient ethanol producer (e.g., Zymomonas mobilis). Accordingly, these hosts are capable of expressing high levels of proteins that may be used alone or in combination with other enzymes or recombinant hosts for the efficient degradation of lignocellulosic biomass and the subsequent production of alcohol. In certain embodiments, such additional enzyme is a secretory enzyme.

[0114] In certain embodiments of the above aspects, the host cell may be ethanologenic, e.g., E. coli KO4 (ATCC 55123), E. coli KO11 (ATCC 55124), E. coli KO12 (ATCC 55125), E. coli LY01 (ATCC 11303), K. oxytoca M5A1, K. oxytoca P2 (ATCC 55307), K. oxytoca strain P2 (pCPP2006), K. oxytoca strain SZ6 (pCPP2006), K. oxytoca strain SZ21 (pCPP2006), or K. oxytoca strain SZ22 (pCPP2006).

[0115] In certain embodiments, a recombinant ethanologenic bacterium contains at least one heterologous polynucleotide segment encoding at least one enzyme with hemicellulosic depolymerization activities. In a preferred embodiment, the recombinant ethanologenic bacterium contains more than one heterologous polynucleotide segment, which segments encode enzymes of the present invention.

[0116] In certain embodiments, the recombinant host is a Gram-negative bacterium. In yet other embodiments, the recombinant host is from the family Enterobacteriaceae. The ethanologenic hosts of U.S. Pat. No. 5,821,093, which is hereby incorporated by reference, for example, are suitable hosts and include, in particular, E. coli strains KO4 (ATCC 55123), KO11 (ATCC 55124), and KO12 (ATCC 55125), and Klebsiella oxytoca strain P2 (ATCC 55307). Alternatively, a non-ethanologenic host of the present invention may be converted into an ethanologenic host (such as the above-mentioned strains) by introducing, for example, ethanologenic genes from an efficient ethanol producer, such as Zymomonas mobilis. This type of genetic engineering, using standard techniques, results in a recombinant host capable of efficiently fermenting sugar into ethanol. In addition, the LY01 ethanol tolerant strain (ATCC 11303) may be employed as described in published PCT international application WO 98/45425, and this published application is hereby incorporated by reference (see also, e.g., Yomano et al. (1998) J. Ind. Micro. & Bio. 20:132-138).

[0117] Certain embodiments relate to use of a non-ethanologenic recombinant host, e.g., E. coli strain B, E. coli strain DH5α, or Klebsiella oxytoca strain M5A1. These strains may be used to express at least one desired polypeptide, e.g., a hemicellulase, using techniques described herein. In addition, these recombinant hosts may be used in conjunction with another recombinant host that expresses yet another desirable polypeptide, e.g., a different enzyme. In addition, the non-ethanologenic host cell(s) may be used in conjunction with an ethanologenic host cell. For example, the use of a non-ethanologenic host(s) for carrying out the synergistic depolymerization of a complex hemicellulose material may be followed by the use of an ethanologenic host for fermenting the depolymerized sugar. Accordingly, it will be appreciated that these reactions may be carried out serially or contemporaneously using homogeneous or mixed cultures of non-ethanologenic and ethanologenic recombinant hosts.

[0118] In certain embodiments, one or more genes for fermenting a hemicellulosic substrate into ethanol are provided on a plasmid or integrated into the host chromosome. For example, genes for fermenting a sugar substrate into ethanol, e.g., pyruvate decarboxylase and/or alcohol dehydrogenase, may be introduced into the host of the invention using an artificial operon, such as the PET operon as described in U.S. Pat. No. 5,821,093, which is hereby incorporated by reference. Indeed, aspects of the present invention relate to techniques and vectors for introducing multiple genes into a suitable host (see, e.g., Current Protocols in Molecular Biology, eds. Ausubel et al., John Wiley & Sons (1992); Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989); and Bergey's Manual of Determinative Bacteriology, Krieg et al., Williams and Wilkins (1984), each of which is hereby incorporated by reference).

[0119] Accordingly, using the methods of the invention, a single genetic construct can encode all of the necessary gene products (e.g., a glucanase, an endoglucanase, an exoglucanase, a secretory protein(s), pyruvate decarboxylase, alcohol dehydrogenase) for performing simultaneous saccharification and fermentation (SSF) or consolidated bioprocessing (CBP).

[0120] Multiple strategies are encompassed for the development of microorganisms with the combination of substrate-utilization and product-formation properties required for CBP. The "native cellulolytic strategy" involves engineering naturally occurring cellulolytic microorganisms to improve product-related properties, such as yield and titer. The "recombinant cellulolytic strategy" involves engineering natively non-cellulolytic organisms that exhibit high product yields and titers to express a heterologous cellulase system that enables cellulose utilization or hemicellulose utilization or both.

Organism development via the native cellulolytic strategy

[0121] One approach to organism development for CBP begins with organisms that naturally utilize cellulose, hemicellulose and/or other biomass components, which are then genetically engineered to enhance product yield and tolerance. For example, Clostridium thermocellum is a thermophilic bacterium that has among the highest rates of cellulose utilization reported. Other organisms of interest are xylose-utilizing thermophiles such as Thermoanaerobacterium saccharolyticum and Thermoanaerobacterium thermosaccharolyticum. Organic acid production may be responsible for the low concentrations of produced ethanol generally associated with these organisms. Thus, one objective is to eliminate production of acetic and lactic acid in these organisms via metabolic engineering. Substantial efforts have been devoted to developing gene transfer systems for the above-described target organisms, and multiple C. thermocellum isolates from nature have been characterized. See McLaughlin et al. (2002) Environ. Sci. Technol. 36:2122. Metabolic engineering of thermophilic, saccharolytic bacteria is an active area of interest, and knockout of lactate dehydrogenase in T. saccharolyticum has recently been reported. See Desai et al. (2004) Appl. Microbiol. Biotechnol. 65:600. Knockout of acetate kinase and phosphotransacetylase in this organism is also possible.

Organism development via the recombinant cellulolytic strategy

[0122] An alternative approach to organism development for CBP involves conferring the ability to grow on lignocellulosic materials to microorganisms that naturally have high product yield and tolerance via expression of a heterologous cellulasic system and perhaps other features. For example, Saccharomyces cerevisiae has been engineered to express over two dozen different saccharolytic enzymes. See Lynd et al. (2002) Microbiol. Mol. Biol. Rev. 66:506.

[0123] Whereas cellulosic hydrolysis has been approached in the literature primarily in the context of an enzymatically-oriented intellectual paradigm, the CBP processing strategy requires that cellulosic hydrolysis be viewed in terms of a microbial paradigm. This microbial paradigm naturally leads to an emphasis on different fundamental issues, organisms, cellulasic systems, and applied milestones compared to those of the enzymatic paradigm. In this context, C. thermocellum has been a model organism because of its high growth rate on cellulose together with its potential utility for CBP.

[0124] In certain embodiments, organisms useful in the present invention may be applicable to the process known as simultaneous saccharification and fermentation (SSF), which is intended to include the use of said microorganisms and/or one or more recombinant hosts (or extracts thereof, including purified or unpurified extracts) for the contemporaneous degradation or depolymerization of a complex sugar (i.e., cellulosic biomass) and bioconversion of that sugar residue into ethanol by fermentation.

Heterologous Cellulases

[0125] DNA encoding an enzyme that acts to depolymerize hemicellulosic fractions can be isolated and used, via known procedures, to transform a suitable host organism such that the enzyme is produced by the recombinant host in commercially useful amounts. Said enzyme-encoding DNA can be isolated by screening nucleic acid libraries generated from microorganisms expressing a hemicellulase according to the present invention. Such libraries would be screened by means of oligonucleotide probes that are complementary to a polynucleotide encoding, for example, a portion of the N-terminus of an enzyme within the present invention.

[0126] Many of the nucleotide sequences of the genes encoding heterologous enzymes involved in production of ethanol from pentoses (e.g., xylose) are known and publicly available. Using well-known and widely practiced molecular biology techniques (e.g., restriction endonuclease cleavage/re-ligation, PCR, etc.), these sequences, or portions of these sequences, can be manipulated to provide microorganisms according to the present invention. Such genetic engineering techniques are well within the ordinarily skilled artisan's knowledge and abilities, and can be performed without undue or excessive experimentation.

[0127] Alternatively, other portions including or adjacent to the endogenous coding sequence of an enzyme according to the present invention can be used, when isolated using a probe, as a template for generating other probes useful for isolating an enzyme-encoding polynucleotide according to the present invention, e.g., based on the N-terminal sequence described above. Such a probe could be used in a known manner to screen a genomic or cDNA library as described above, or to synthesize polymerase chain reaction (PCR) probes for use in amplifying a cDNA generated from an isolated RNA, which codes for an enzyme of the present invention. Such a cDNA could then be cloned into a suitable expression vector and employed to transform a host organism.

[0128] A suitable polynucleotide in this regard would preferably comprise a nucleotide sequence, corresponding to the desired amino-acid sequence, that is optimized for the host of choice with regard to codon usage, the initiation of translation, and the expression of sufficient amounts of enzyme that acts to depolymerize hemicellulosics. A vector selected for transforming the chosen host organism with such a polynucleotide molecule should also allow for efficient maintenance and transcription of the sequence encoding the polypeptide. Such a vector is readily available or derivable from commercial sources, and is suited to a particular host cell employed for expressing a hemicellulase.
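Optimizing a coding sequence for host codon usage can be illustrated with a minimal back-translation sketch in Python. Everything here is an assumption made for illustration: the function name, the example peptide, and the small preferred-codon table are hypothetical placeholders, not values from the application, and a real design would use a full, host-specific codon usage table and also consider translation initiation and expression level as the paragraph above notes.

```python
# Hypothetical preferred-codon table for an illustrative host.
# It covers only a handful of residues and is NOT a measured usage table.
PREFERRED_CODON = {
    "M": "ATG",
    "A": "GCT",
    "G": "GGT",
    "K": "AAG",
    "L": "TTG",
    "*": "TAA",  # stop
}


def back_translate(protein, table):
    """Build a DNA sequence by choosing one preferred codon per residue."""
    codons = []
    for residue in protein:
        if residue not in table:
            raise ValueError(f"no codon listed for residue {residue!r}")
        codons.append(table[residue])
    return "".join(codons)


print(back_translate("MAGKL*", PREFERRED_CODON))
```

The sketch picks a single codon per amino acid; it does nothing about GC content, repeats, or restriction sites, which practical tools typically handle as well.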

[0129] The expression of heterologous cellulases in a host cell can be used advantageously to produce ethanol from cellulosic sources. Cellulases from a variety of sources can be heterologously expressed to successfully increase efficiency of ethanol production. For example, the cellulases can be from fungi, bacteria, plant, protozoan or termite sources. In some embodiments, the cellulase is a H. grisea, T. aurantiacus, T. emersonii, T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, S. fibuligera, C. lucknowense, R. speratus, Thermobifida fusca, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium josui, Bacillus pumilus, Cellulomonas fimi, Saccharophagus degradans, Piromyces equi, Neocallimastix patriciarum, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, Acremonium thermophilum, Neosartorya fischeri, Chaetomium globosum, Chaetomium thermophilum, Aspergillus fumigatus, Aspergillus terreus, Neurospora crassa, or Arabidopsis thaliana cellulase.

[0130] In some embodiments, multiple cellulases from a single organism are co-expressed in the same host cell. In some embodiments, multiple cellulases from different organisms are co-expressed in the same host cell. In particular, cellulases from two, three, four, five, six, seven, eight, nine or more organisms can be co-expressed in the same host cell. Similarly, the invention can encompass co-cultures of microorganisms, such as yeast strains, wherein the microorganisms express different cellulases. Co-cultures can include microorganisms, such as yeast strains, expressing heterologous cellulases from the same organisms or from different organisms. Co-cultures can include microorganisms, such as yeast strains, expressing cellulases from two, three, four, five, six, seven, eight, nine or more organisms.

[0131] Cellulases useful in the methods of the invention include both endoglucanases and exoglucanases. The cellulases can be, for example, endoglucanases, β-glucosidases or cellobiohydrolases. In certain embodiments of the invention, the endoglucanase(s) can be an endoglucanase I or an endoglucanase II isoform, paralogue or orthologue. In some embodiments, the endoglucanase expressed by the host cells of the present invention can be recombinant endo-1,4-β-glucanase. In particular embodiments, the endoglucanase is a T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, R. speratus, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, C. lucknowense, C. globosum, Aspergillus terreus, Aspergillus fumigatus, Neurospora crassa or Acremonium thermophilum endoglucanase.

[0132] Fragments of cellobiohydrolase, endoglucanase or beta-glucosidase polypeptides encompass domains, proteolytic fragments, deletion fragments and, in particular, fragments of H. grisea, T. aurantiacus, T. emersonii, T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, S. fibuligera, C. lucknowense, R. speratus, Thermobifida fusca, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium josui, Bacillus pumilus, Cellulomonas fimi, Saccharophagus degradans, Piromyces equi, Neocallimastix patriciarum, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, Acremonium thermophilum, Neosartorya fischeri, Chaetomium globosum, Chaetomium thermophilum, Aspergillus fumigatus, Aspergillus terreus, Neurospora crassa, R. flavipes, or Arabidopsis thaliana cellobiohydrolase, endoglucanase or beta-glucosidase polypeptides which retain any specific biological activity of the cellobiohydrolase, endoglucanase or beta-glucosidase proteins. Polypeptide fragments further include any portion of the polypeptide which retains a catalytic activity of cellobiohydrolase, endoglucanase or beta-glucosidase proteins.

[0133] The polypeptides useful in the present invention further include variants of the polypeptides. A "variant" of the polypeptide can be a conservative variant, or an allelic variant. As used herein, a conservative variant refers to alterations in the amino acid sequence that do not adversely affect the biological functions of the protein. A substitution, insertion or deletion is said to adversely affect the protein when the altered sequence prevents or disrupts a biological function associated with the protein. For example, the overall charge, structure or hydrophobic-hydrophilic properties of the protein can be altered without adversely affecting a biological activity. Accordingly, the amino acid sequence can be altered, for example to render the peptide more hydrophobic or hydrophilic, without adversely affecting the biological activities of the protein.

[0134] By an "allelic variant" is intended alternate forms of a gene occupying a given locus on a chromosome of an organism. Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985). Non-naturally occurring variants may be produced using art-known mutagenesis techniques. Allelic variants, though possessing a slightly different amino acid sequence than those recited above, will still have the same or similar biological functions associated with the H. grisea, T. aurantiacus, T. emersonii, T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, S. fibuligera, C. lucknowense, R. speratus, Thermobifida fusca, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium josui, Bacillus pumilus, Cellulomonas fimi, Saccharophagus degradans, Piromyces equi, Neocallimastix patriciarum, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, Acremonium thermophilum, Neosartorya fischeri, Chaetomium globosum, Chaetomium thermophilum, Aspergillus fumigatus, Aspergillus terreus, Neurospora crassa, R. flavipes, or Arabidopsis thaliana cellobiohydrolase, endoglucanase or beta-glucosidase protein.

[0135] Using known methods of protein engineering and recombinant DNA technology, variants may be generated to improve or alter the characteristics of the cellulase polypeptides. For instance, one or more amino acids can be deleted from the N-terminus or C-terminus of the secreted protein without substantial loss of biological function.

[0136] Further included are H. grisea, T. aurantiacus, T. emersonii, T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, S. fibuligera, C. lucknowense, R. speratus, Thermobifida fusca, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium josui, Bacillus pumilus, Cellulomonas fimi, Saccharophagus degradans, Piromyces equi, Neocallimastix patriciarum, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, Acremonium thermophilum, Neosartorya fischeri, Chaetomium globosum, Chaetomium thermophilum, Aspergillus fumigatus, Aspergillus terreus, Neurospora crassa, R. flavipes or Arabidopsis thaliana cellobiohydrolase, endoglucanase or beta-glucosidase polypeptide variants which show substantial biological activity. Such variants include deletions, insertions, inversions, repeats, and substitutions selected according to general rules known in the art so as to have little effect on activity.

[0137] The skilled artisan is fully aware of amino acid substitutions that are either less likely or not likely to significantly affect protein function (e.g., replacing one aliphatic amino acid with a second aliphatic amino acid), as further described below.

[0138] For example, guidance concerning how to make phenotypically silent amino acid substitutions is provided in Bowie et al., "Deciphering the Message in Protein Sequences: Tolerance to Amino Acid Substitutions," Science 247: 1306-1310 (1990), wherein the authors indicate that there are two main strategies for studying the tolerance of an amino acid sequence to change.

[0139] The first strategy exploits the tolerance of amino acid substitutions by natural selection during the process of evolution. By comparing amino acid sequences in different species, conserved amino acids can be identified. These conserved amino acids are likely important for protein function. In contrast, the amino acid positions where substitutions have been tolerated by natural selection indicate that these positions are not critical for protein function. Thus, positions tolerating amino acid substitution could be modified while still maintaining biological activity of the protein.
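By way of illustration only, the comparison described above can be sketched in a few lines of Python. The sketch assumes the sequences have already been aligned to equal length by some other tool; the function name `conserved_positions` and the three short sequences are hypothetical and are not cellulase sequences from this specification.

```python
def conserved_positions(aligned):
    """Return 0-based columns where every aligned sequence has the same residue."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        i for i in range(length)
        if aligned[0][i] != "-" and len({seq[i] for seq in aligned}) == 1
    ]


# Hypothetical toy alignment (assumption, for illustration only).
toy_alignment = ["MKTAYLE", "MKSAYLD", "MRTAYLE"]
conserved = conserved_positions(toy_alignment)
# Columns that differ between species are candidates for substitution.
variable = [i for i in range(len(toy_alignment[0])) if i not in conserved]
```

Columns reported as variable are those where natural selection has tolerated a change; a column that is identical in a small sample may of course still tolerate substitution.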

[0140] The second strategy uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene to identify regions critical for protein function. For example, site directed mutagenesis or alanine-scanning mutagenesis (introduction of single alanine mutations at every residue in the molecule) can be used. (Cunningham and Wells, Science 244: 1081-1085 (1989).) The resulting mutant molecules can then be tested for biological activity.
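As a non-limiting sketch, the set of single alanine mutants used in alanine-scanning mutagenesis can be enumerated as follows. The function name `alanine_scan`, the mutation labels of the form K2A, and the seven-residue peptide are assumptions made for illustration; residues that are already alanine are skipped because no substitution would result.

```python
def alanine_scan(protein):
    """Yield (label, mutant) for each single residue-to-alanine substitution."""
    for i, residue in enumerate(protein):
        if residue == "A":
            continue  # already alanine; nothing to substitute
        label = f"{residue}{i + 1}A"  # e.g. K2A: Lys at position 2 replaced by Ala
        yield label, protein[:i] + "A" + protein[i + 1:]


# Hypothetical peptide (assumption, for illustration only).
mutants = dict(alanine_scan("MKTAYLE"))
```

Each mutant would then be expressed and assayed for biological activity, as described above.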

[0141] As the authors state, these two strategies have revealed that proteins are often surprisingly tolerant of amino acid substitutions. The authors further indicate which amino acid changes are likely to be permissive at certain amino acid positions in the protein. For example, most buried (within the tertiary structure of the protein) amino acid residues require nonpolar side chains, whereas few features of surface side chains are generally conserved. Moreover, tolerated conservative amino acid substitutions involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile; replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp; and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly.
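The substitution groups listed above lend themselves to a simple lookup. The following Python sketch is illustrative only: it treats a substitution as conservative when both residues fall within at least one of the listed groups, which is an assumed reading of the paragraph rather than a rule stated in the cited reference, and the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical.

```python
# One-letter codes for the groups listed in the text.
CONSERVATIVE_GROUPS = [
    set("AVLI"),   # aliphatic or hydrophobic: Ala, Val, Leu, Ile
    set("ST"),     # hydroxyl: Ser, Thr
    set("DE"),     # acidic: Asp, Glu
    set("NQ"),     # amide: Asn, Gln
    set("KRH"),    # basic: Lys, Arg, His
    set("FYW"),    # aromatic: Phe, Tyr, Trp
    set("ASTMG"),  # small: Ala, Ser, Thr, Met, Gly
]


def is_conservative(original, replacement):
    """True if the two different residues share at least one listed group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)
```

For example, `is_conservative("L", "I")` and `is_conservative("D", "E")` hold, while `is_conservative("D", "K")` does not.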

[0142] The terms "derivative" and "analog" refer to a polypeptide differing from the H. grisea, T. aurantiacus, T. emersonii, T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, S. fibuligera, C. lucknowense, R. speratus, Thermobifida fusca, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium josui, Bacillus pumilus, Cellulomonas fimi, Saccharophagus degradans, Piromyces equi, Neocallimastix patriciarum, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, Acremonium thermophilum, Neosartorya fischeri, Chaetomium globosum, Chaetomium thermophilum, Aspergillus fumigatus, Aspergillus terreus, Neurospora crassa, R. flavipes, or Arabidopsis thaliana cellobiohydrolase, endoglucanase or beta-glucosidase polypeptide, but retaining essential properties thereof. Generally, derivatives and analogs are overall closely similar, and, in many regions, identical to the H. grisea, T. aurantiacus, T. emersonii, T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, S. fibuligera, C. lucknowense, R. speratus, Thermobifida fusca, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium josui, Bacillus pumilus, Cellulomonas fimi, Saccharophagus degradans, Piromyces equi, Neocallimastix patriciarum, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, Acremonium thermophilum, Neosartorya fischeri, Chaetomium globosum, Chaetomium thermophilum, Aspergillus fumigatus, Aspergillus terreus, Neurospora crassa, R. flavipes, or Arabidopsis thaliana cellobiohydrolase, endoglucanase or beta-glucosidase polypeptides. The terms "derivative" and "analog" when referring to H. grisea, T. aurantiacus, T. emersonii, T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, S. fibuligera, C. lucknowense, R. speratus, Thermobifida fusca, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium josui, Bacillus pumilus, Cellulomonas fimi, Saccharophagus degradans, Piromyces equi, Neocallimastix patriciarum, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, Acremonium thermophilum, Neosartorya fischeri, Chaetomium globosum, Chaetomium thermophilum, Aspergillus fumigatus, Aspergillus terreus, Neurospora crassa, R. flavipes, or Arabidopsis thaliana cellobiohydrolase, endoglucanase or beta-glucosidase polypeptides include any polypeptides which retain at least some of the activity of the corresponding native polypeptide, e.g., the exoglucanase activity, or the activity of the catalytic domain.

[0143] Derivatives of H. grisea, T. aurantiacus, T. emersonii, T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, S. fibuligera, C. lucknowense, R. speratus, Thermobifida fusca, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium josui, Bacillus pumilus, Cellulomonas fimi, Saccharophagus degradans, Piromyces equi, Neocallimastix patriciarum, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, Acremonium thermophilum, Neosartorya fischeri, Chaetomium globosum, Chaetomium thermophilum, Aspergillus fumigatus, Aspergillus terreus, Neurospora crassa, R. flavipes, or Arabidopsis thaliana cellobiohydrolase, endoglucanase or beta-glucosidase polypeptides are polypeptides which have been altered so as to exhibit additional features not found on the native polypeptide. Derivatives can be covalently modified by substitution, chemical, enzymatic, or other appropriate means with a moiety other than a naturally occurring amino acid (for example, a detectable moiety such as an enzyme or radioisotope). Examples of derivatives include fusion proteins.

[0144] An analog is another form of a H. grisea, T. aurantiacus, T. emersonii, T. reesei, C. lacteus, C. formosanus, N. takasagoensis, C. acinaciformis, M. darwinensis, N. walkeri, S. fibuligera, C. lucknowense, R. speratus, Thermobifida fusca, Clostridium thermocellum, Clostridium cellulolyticum, Clostridium josui, Bacillus pumilus, Cellulomonas fimi, Saccharophagus degradans, Piromyces equi, Neocallimastix patriciarum, Aspergillus kawachii, Heterodera schachtii, H. jecorina, Orpinomyces sp., Irpex lacteus, Acremonium thermophilum, Neosartorya fischeri, Chaetomium globosum, Chaetomium thermophilum, Aspergillus fumigatus, Aspergillus terreus, Neurospora crassa, R. flavipes, or Arabidopsis thaliana cellobiohydrolase, endoglucanase or beta-glucosidase polypeptide useful in the present invention. An "analog" also retains substantially the same biological function or activity as the polypeptide of interest, e.g., functions as a cellobiohydrolase. An analog includes a proprotein which can be activated by cleavage of the proprotein portion to produce an active mature polypeptide.

[0145] The polypeptide useful in the present invention may be a recombinant polypeptide, a natural polypeptide or a synthetic polypeptide. In some particular embodiments, the polypeptide is a recombinant polypeptide.

[0146] Also provided are allelic variants, orthologs, and/or species homologs. Procedures known in the art can be used to obtain full-length genes, allelic variants, splice variants, full-length coding portions, orthologs, and/or species homologs of known genes. For example, allelic variants and/or species homologs may be isolated and identified by making suitable probes or primers from the sequences provided herein and screening a suitable nucleic acid source for allelic variants and/or the desired homologue.

[0147] In some embodiments, the host cells express a combination of heterologous cellulases.

[0148] The cellulases may be either tethered or secreted. As used herein, a protein is "tethered" to an organism's cell surface if at least one terminus of the protein is bound, covalently and/or electrostatically for example, to the cell membrane or cell wall. It will be appreciated that a tethered protein may include one or more enzymatic regions that may be joined to one or more other types of regions at the nucleic acid and/or protein levels (e.g., a promoter, a terminator, an anchoring domain, a linker, a signaling region, etc.). While the one or more enzymatic regions may not be directly bound to the cell membrane or cell wall (e.g., such as when binding occurs via an anchoring domain), the protein is nonetheless considered a "tethered enzyme" according to the present specification.

[0149] Tethering may, for example, be accomplished by incorporation of an anchoring domain into a recombinant protein that is heterologously expressed by a cell, or by prenylation, fatty acyl linkage, glycosyl phosphatidyl inositol anchors or other suitable molecular anchors which may anchor the tethered protein to the cell membrane or cell wall of the host cell. A tethered protein may be tethered at its amino terminal end or optionally at its carboxy terminal end.

[0150] As used herein, "secreted" means released into the extracellular milieu, for example into the media. Although tethered proteins may have secretion signals as part of their immature amino acid sequence, they are maintained as attached to the cell surface, and do not fall within the scope of secreted proteins as used herein.

[0151] As used herein, "flexible linker sequence" refers to an amino acid sequence which links two amino acid sequences, for example, a cell wall anchoring amino acid sequence with an amino acid sequence that contains the desired enzymatic activity. The flexible linker sequence allows for necessary freedom for the amino acid sequence that contains the desired enzymatic activity to have reduced steric hindrance with respect to proximity to the cell and may also facilitate proper folding of the amino acid sequence that contains the desired enzymatic activity.

[0152] In some embodiments, the tethered cellulase enzymes are tethered by a flexible linker sequence linked to an anchoring domain. In some embodiments, the anchoring domain is of CWP2 (for carboxy terminal anchoring) or FLO1 (for amino terminal anchoring) from S. cerevisiae.

[0153] In some embodiments, heterologous secretion signals may be added to the expression vectors of the present invention to facilitate the extra-cellular expression of cellulase proteins. In some embodiments, the heterologous secretion signal is the secretion signal from T. reesei Xyn2.

[0154] Fusion proteins comprising cellulases are also encompassed. For example, the fusion proteins can be a fusion of a heterologous cellulase and a second peptide. The heterologous cellulase and the second peptide can be fused directly or indirectly, for example, through a linker sequence. The fusion protein can comprise for example, a second peptide that is N-terminal to the heterologous cellulase and/or a second peptide that is C-terminal to the heterologous cellulase. Thus, in certain embodiments, the polypeptide of the present invention comprises a first polypeptide and a second polypeptide, wherein the first polypeptide comprises a heterologous cellulase.

[0155] In one embodiment, the fusion protein comprises a first and second polypeptide wherein the first polypeptide comprises a heterologous cellulase and the second polypeptide comprises a signal sequence. According to another embodiment, the fusion protein comprises a first and second polypeptide, wherein the first polypeptide comprises a heterologous cellulase and the second polypeptide comprises a polypeptide used to facilitate purification or identification or a reporter peptide. The polypeptide used to facilitate purification or identification or the reporter peptide can be, for example, a HIS-tag, a GST-tag, an HA-tag, a FLAG-tag, a MYC-tag, or a fluorescent protein.

[0156] According to yet another embodiment, the fusion protein comprises a first and second polypeptide, wherein the first polypeptide comprises a heterologous cellulase and the second polypeptide comprises an anchoring peptide. In some embodiments, the anchoring domain is of CWP2 (for carboxy terminal anchoring) or FLO1 (for amino terminal anchoring) from S. cerevisiae.

[0157] According to yet another embodiment, the fusion protein comprises a first and second polypeptide, wherein the first polypeptide comprises a heterologous cellulase and the second polypeptide comprises a cellulose binding module (CBM). In some embodiments, the CBM is from, for example, T. reesei Cbh1 or Cbh2, from H. grisea Cbh1, or from C. lucknowense Cbh2b. In some particular embodiments, the CBM is fused to a cellobiohydrolase. In one particular embodiment, the fusion protein comprises a first and second polypeptide, wherein the first polypeptide comprises a heterologous cellobiohydrolase and the second polypeptide comprises a CBM.

Co-Cultures

[0158] In some embodiments, the invention relates to co-cultures of host cells (e.g., microorganisms). As used herein, "co-culture" refers to growing two different strains or species of host cells together in the same vessel. In some embodiments of the invention, at least one host cell of the co-culture comprises a heterologous polynucleotide comprising a nucleic acid which encodes an endoglucanase, at least one host cell of the co-culture comprises a heterologous polynucleotide comprising a nucleic acid which encodes a β-glucosidase and at least one host cell comprises a heterologous polynucleotide comprising a nucleic acid which encodes a cellobiohydrolase. In a further embodiment, the co-culture further comprises a host cell comprising a heterologous polynucleotide comprising a nucleic acid which encodes a second cellobiohydrolase.

[0159] In some embodiments, the co-culture comprises two or more strains of yeast host cells, two or more strains of bacterial host cells, or a combination of one or more yeast host cells and one or more bacterial host cells. The heterologous cellulases can be expressed in any combination in the two or more strains of host cells.

[0160] The co-cultures can include tethered cellulases, secreted cellulases or both tethered and secreted cellulases. For example, in some embodiments, the co-culture comprises at least one yeast host cell comprising a polynucleotide encoding a secreted heterologous cellulase. In another embodiment, the co-culture comprises at least one yeast host cell comprising a polynucleotide encoding a tethered heterologous cellulase. In one embodiment, all of the heterologous cellulases in the co-culture are secreted, and in another embodiment, all of the heterologous cellulases in the co-culture are tethered. In addition, other enzymes or cellulases, such as externally added cellulases may be present in the co-culture.

Codon Optimized Polynucleotides

[0161] The polynucleotides encoding heterologous cellulases can be codon-optimized. As used herein the term "codon-optimized coding region" means a nucleic acid coding region that has been adapted for expression in the cells of a given organism by replacing at least one, or more than one, or a significant number, of codons with one or more codons that are more frequently used in the genes of that organism.

[0162] In general, highly expressed genes in an organism are biased towards codons that are recognized by the most abundant tRNA species in that organism. One measure of this bias is the "codon adaptation index" or "CAI," which measures the extent to which the codons used to encode each amino acid in a particular gene are those which occur most frequently in a reference set of highly expressed genes from an organism.
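For illustration only, one common formulation of the CAI (the geometric mean of each codon's usage relative to the most used synonymous codon, after Sharp and Li) can be sketched in Python as follows. This formulation is an assumption; the specification does not prescribe a particular formula. The reference counts are the Phe, Leu and Ala rows of Table 2 and stand in for a reference set of highly expressed genes, which they are not, so the resulting values are illustrative only. The names `REFERENCE_COUNTS`, `relative_adaptiveness` and `cai` are hypothetical.

```python
import math

# Codon counts for three amino acids, taken from Table 2 (S. cerevisiae).
# Assumption: used here as a stand-in for a highly expressed reference set.
REFERENCE_COUNTS = {
    "Phe": {"UUU": 170666, "UUC": 120510},
    "Leu": {"UUA": 170884, "UUG": 177573, "CUU": 80076,
            "CUC": 35545, "CUA": 87619, "CUG": 68494},
    "Ala": {"GCU": 138358, "GCC": 82357, "GCA": 105910, "GCG": 40358},
}


def relative_adaptiveness(reference):
    """Map each codon to count / (count of the most used synonymous codon)."""
    weights = {}
    for codons in reference.values():
        top = max(codons.values())
        for codon, count in codons.items():
            weights[codon] = count / top
    return weights


def cai(rna, weights):
    """Geometric mean of the weights of the codons covered by the reference."""
    usable = len(rna) - len(rna) % 3
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no codons covered by the reference table")
    return math.exp(sum(math.log(w) for w in scored) / len(scored))


WEIGHTS = relative_adaptiveness(REFERENCE_COUNTS)
best = cai("UUUUUGGCU", WEIGHTS)   # each codon is the most used for its amino acid
mixed = cai("UUCCUCGCG", WEIGHTS)  # each codon is a less used synonym
```

A coding region built only from the most used codons scores 1.0 under this formulation, and any use of less frequent synonyms lowers the score.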

[0163] The CAI of codon optimized sequences of the present invention corresponds to between about 0.8 and 1.0, between about 0.8 and 0.9, or about 1.0. A codon optimized sequence may be further modified for expression in a particular organism, depending on that organism's biological constraints. For example, large runs of "As" or "Ts" (e.g., runs greater than 4, 5, 6, 7, 8, 9, or 10 consecutive bases) can be removed from the sequences if these are known to affect transcription negatively. Furthermore, specific restriction enzyme sites may be removed for molecular cloning purposes. Examples of such restriction enzyme sites include PacI, AscI, BamHI, BglII, EcoRI and XhoI. Additionally, the DNA sequence can be checked for direct repeats, inverted repeats and mirror repeats with lengths of ten bases or longer, which can be modified manually by replacing codons with "second best" codons, i.e., codons that occur at the second highest frequency within the particular organism for which the sequence is being optimized.
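By way of example, the checks for homopolymer runs and restriction sites described above can be sketched with regular expressions. The recognition sequences are the standard ones for the named enzymes; the function names, the default threshold of runs longer than four bases, and the example sequence are assumptions made for illustration. Because all six recognition sequences are palindromic, scanning one strand is sufficient for these enzymes.

```python
import re

# Recognition sequences (5'->3') for the enzymes named in the text.
RESTRICTION_SITES = {
    "PacI": "TTAATTAA", "AscI": "GGCGCGCC", "BamHI": "GGATCC",
    "BglII": "AGATCT", "EcoRI": "GAATTC", "XhoI": "CTCGAG",
}


def homopolymer_runs(dna, max_run=4):
    """Return (start, run) for each run of A or T longer than max_run bases."""
    n = max_run + 1
    pattern = re.compile(r"A{%d,}|T{%d,}" % (n, n))
    return [(m.start(), m.group()) for m in pattern.finditer(dna.upper())]


def restriction_hits(dna, sites=RESTRICTION_SITES):
    """Return {enzyme: [start, ...]} for every site found in the sequence."""
    dna = dna.upper()
    hits = {}
    for enzyme, site in sites.items():
        # Lookahead so that overlapping occurrences are all reported.
        starts = [m.start() for m in re.finditer("(?=%s)" % site, dna)]
        if starts:
            hits[enzyme] = starts
    return hits


# Hypothetical sequence (assumption, for illustration only).
example = "ATGGAATTCGCTAAAAAAGGATCCTTTTTGA"
runs = homopolymer_runs(example)
sites_found = restriction_hits(example)
```

Positions flagged by either function would then be recoded with synonymous codons, for example the "second best" codon for the affected amino acid.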

[0164] Deviations in the nucleotide sequence that comprise the codons encoding the amino acids of any polypeptide chain allow for variations in the sequence coding for the gene. Since each codon consists of three nucleotides, and the nucleotides comprising DNA are restricted to four specific bases, there are 64 possible combinations of nucleotides, 61 of which encode amino acids (the remaining three codons encode signals ending translation). The "genetic code" which shows which codons encode which amino acids is reproduced herein as Table 1. As a result, many amino acids are designated by more than one codon. For example, the amino acids alanine and proline are coded for by four triplets, serine and arginine by six, whereas tryptophan and methionine are coded by just one triplet. This degeneracy allows for DNA base composition to vary over a wide range without altering the amino acid sequence of the proteins encoded by the DNA.
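The codon counts stated above can be verified with a short Python sketch that builds the standard genetic code from its conventional compact form (bases ordered T, C, A, G, with the third position varying fastest). The variable names are hypothetical; the code uses DNA bases, so T appears in place of U.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# One-letter amino acids for the 64 codons TTT, TTC, TTA, TTG, TCT, ... GGG;
# "*" marks a translation stop signal.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid (and per stop signal).
DEGENERACY = Counter(GENETIC_CODE.values())
sense_codons = sum(n for aa, n in DEGENERACY.items() if aa != "*")
```

Under this construction `DEGENERACY` gives four codons each for alanine (A) and proline (P), six each for serine (S) and arginine (R), and one each for tryptophan (W) and methionine (M), consistent with the paragraph above.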

TABLE 1: The Standard Genetic Code

[0165] Many organisms display a bias for use of particular codons to code for insertion of a particular amino acid in a growing peptide chain. Codon preference or codon bias, differences in codon usage between organisms, is afforded by degeneracy of the genetic code, and is well documented among many organisms. Codon bias often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, inter alia, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization.

[0166] Given the large number of gene sequences available for a wide variety of animal, plant and microbial species, it is possible to calculate the relative frequencies of codon usage. Codon usage tables are readily available, for example, at http://phenotype.biosci.umbc.edu/codon/sgd/index.php (visited May 7, 2008) or at http://www.kazusa.or.jp/codon (visited March 20, 2008), and these tables can be adapted in a number of ways. See Nakamura, Y., et al. "Codon usage tabulated from the international DNA sequence databases: status for the year 2000," Nucl. Acids Res. 28:292 (2000). Codon usage tables for yeast, calculated from GenBank Release 128.0 [15 February 2002], are reproduced below as Table 2. This table uses mRNA nomenclature, and so instead of thymine (T) which is found in DNA, the tables use uracil (U) which is found in RNA. The Table has been adapted so that frequencies are calculated for each amino acid, rather than for all 64 codons.

TABLE 2: Codon Usage Table for Saccharomyces cerevisiae Genes

Amino Acid    Codon    Number    Frequency per hundred

Phe           UUU      170666    26.1
Phe           UUC      120510    18.4

Leu           UUA      170884    26.2
Leu           UUG      177573    27.2
Leu           CUU       80076    12.3
Leu           CUC       35545     5.4
Leu           CUA       87619    13.4
Leu           CUG       68494    10.5

Ile           AUU      196893    30.1
Ile           AUC      112176    17.2
Ile           AUA      116254    17.8

Met           AUG      136805    20.9

Val           GUU      144243    22.1
Val           GUC       76947    11.8
Val           GUA       76927    11.8
Val           GUG       70337    10.8

Ser           UCU      153557    23.5
Ser           UCC       92923    14.2
Ser           UCA      122028    18.7
Ser           UCG       55951     8.6
Ser           AGU       92466    14.2
Ser           AGC       63726     9.8

Pro           CCU       88263    13.5
Pro           CCC       44309     6.8
Pro           CCA      119641    18.3
Pro           CCG       34597     5.3

Thr           ACU      132522    20.3
Thr           ACC       83207    12.7
Thr           ACA      116084    17.8
Thr           ACG       52045     8.0

Ala           GCU      138358    21.2
Ala           GCC       82357    12.6
Ala           GCA      105910    16.2
Ala           GCG       40358     6.2

[0167] By utilizing this or similar tables, one of ordinary skill in the art can apply the frequencies to any given polypeptide sequence, and produce a nucleic acid fragment of a codon-optimized coding region which encodes the polypeptide, but which uses codons optimal for a given species. Codon-optimized coding regions can be designed by various different methods.

[0168] In one method, a codon usage table is used to find the single most frequent codon used for any given amino acid, and that codon is used each time that particular amino acid appears in the polypeptide sequence. For example, referring to Table 2 above, for leucine, the most frequent codon is UUG, which is used 27.2% of the time. Thus all the leucine residues in a given amino acid sequence would be assigned the codon UUG.
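A minimal sketch of this first method, using only the rows of Table 2 reproduced above, is given below for illustration. The dictionary and function names are hypothetical, and amino acids absent from the reproduced rows are reported as missing rather than guessed.

```python
# Rows of Table 2 as reproduced in the text: amino acid -> {codon: frequency}.
TABLE_2 = {
    "F": {"UUU": 26.1, "UUC": 18.4},
    "L": {"UUA": 26.2, "UUG": 27.2, "CUU": 12.3,
          "CUC": 5.4, "CUA": 13.4, "CUG": 10.5},
    "I": {"AUU": 30.1, "AUC": 17.2, "AUA": 17.8},
    "M": {"AUG": 20.9},
    "V": {"GUU": 22.1, "GUC": 11.8, "GUA": 11.8, "GUG": 10.8},
    "S": {"UCU": 23.5, "UCC": 14.2, "UCA": 18.7,
          "UCG": 8.6, "AGU": 14.2, "AGC": 9.8},
    "P": {"CCU": 13.5, "CCC": 6.8, "CCA": 18.3, "CCG": 5.3},
    "T": {"ACU": 20.3, "ACC": 12.7, "ACA": 17.8, "ACG": 8.0},
    "A": {"GCU": 21.2, "GCC": 12.6, "GCA": 16.2, "GCG": 6.2},
}


def most_frequent_codons(table):
    """Pick the single most frequent codon for each amino acid."""
    return {aa: max(codons, key=codons.get) for aa, codons in table.items()}


def back_translate(peptide, table=TABLE_2):
    """Encode a peptide (one-letter codes) using only the most frequent codons."""
    best = most_frequent_codons(table)
    missing = sorted(set(peptide) - set(best))
    if missing:
        raise KeyError("no codon data for: " + ", ".join(missing))
    return "".join(best[aa] for aa in peptide)


# Hypothetical peptide (assumption, for illustration only).
mrna = back_translate("MLSLA")
```

Every leucine in the hypothetical peptide is assigned UUG, as in the example above.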

[0169] In another method, the actual frequencies of the codons are distributed randomly throughout the coding sequence. Thus, using this method for optimization, if a hypothetical polypeptide sequence had 100 leucine residues, referring to Table 2 for frequency of usage in S. cerevisiae, about 5, or 5% of the leucine codons would be CUC, about 11, or 11% of the leucine codons would be CUG, about 12, or 12% of the leucine codons would be CUU, about 13, or 13% of the leucine codons would be CUA, about 26, or 26% of the leucine codons would be UUA, and about 27, or 27% of the leucine codons would be UUG.

[0170] These frequencies would be distributed randomly throughout the leucine codons in the coding region encoding the hypothetical polypeptide. As will be understood by those of ordinary skill in the art, the distribution of codons in the sequence can vary significantly using this method; however, the sequence always encodes the same polypeptide.

[0171] When using the methods above, the term "about" is used precisely to account for fractional percentages of codon frequencies for a given amino acid. As used herein, "about" is defined as one amino acid more or one amino acid less than the value given. The whole number value of amino acids is rounded up if the fractional frequency of usage is 0.50 or greater, and is rounded down if the fractional frequency of use is 0.49 or less. Using again the example of the frequency of usage of leucine in human genes for a hypothetical polypeptide having 62 leucine residues, the fractional frequency of codon usage would be calculated by multiplying 62 by the frequencies for the various codons. Thus, 7.28 percent of 62 equals 4.51 UUA codons, or "about 5," i.e., 4, 5, or 6 UUA codons, 12.66 percent of 62 equals 7.85 UUG codons or "about 8," i.e., 7, 8, or 9 UUG codons, 12.87 percent of 62 equals 7.98 CUU codons, or "about 8," i.e., 7, 8, or 9 CUU codons, 19.56 percent of 62 equals 12.13 CUC codons or "about 12," i.e., 11, 12, or 13 CUC codons, 7.00 percent of 62 equals 4.34 CUA codons or "about 4," i.e., 3, 4, or 5 CUA codons, and 40.62 percent of 62 equals 25.19 CUG codons, or "about 25," i.e., 24, 25, or 26 CUG codons.
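The arithmetic of the 62-leucine example, and the random distribution of the resulting codons, can be sketched as follows. The percentages are those quoted in the example above; the function names and the use of a seeded shuffle are assumptions made for illustration. Rounding uses floor(x + 0.5) so that a fraction of 0.50 or greater rounds up, as the text specifies.

```python
import math
import random

# Leucine codon percentages quoted in the 62-residue example in the text.
LEU_PERCENT = {"UUA": 7.28, "UUG": 12.66, "CUU": 12.87,
               "CUC": 19.56, "CUA": 7.00, "CUG": 40.62}


def round_half_up(x):
    """Round up at a fraction of 0.50 or greater, down otherwise."""
    return math.floor(x + 0.5)


def codon_counts(n_residues, percent):
    """Whole-number codon counts for n_residues of one amino acid."""
    return {c: round_half_up(n_residues * p / 100) for c, p in percent.items()}


def about(count):
    """The 'about N' range defined in the text: one fewer, N, or one more."""
    return (count - 1, count, count + 1)


def random_assignment(counts, seed=None):
    """Return the codons in random order, one entry per residue."""
    pool = [codon for codon, n in counts.items() for _ in range(n)]
    random.Random(seed).shuffle(pool)
    return pool


counts = codon_counts(62, LEU_PERCENT)
leucine_codons = random_assignment(counts, seed=0)
```

In this example the rounded counts happen to sum to exactly 62; in general rounding can leave the total one or two codons off, which is the situation the "about" range is intended to accommodate.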

[0172] Randomly assigning codons at an optimized frequency to encode a given polypeptide sequence can be done manually by calculating codon frequencies for each amino acid, and then assigning the codons to the polypeptide sequence randomly. Additionally, various algorithms and computer software programs are readily available to those of ordinary skill in the art, for example, the "EditSeq" function in the Lasergene Package, available from DNAstar, Inc., Madison, WI, the backtranslation function in the VectorNTI Suite, available from InforMax, Inc., Bethesda, MD, and the "backtranslate" function in the GCG-Wisconsin Package, available from Accelrys, Inc., San Diego, CA. In addition, various resources are publicly available to codon-optimize coding region sequences, e.g., the "backtranslation" function at http://www.entelechon.com/bioinformatics/backtranslation.php?lang=eng (visited April 15, 2008) and the "backtranseq" function available at http://bioinfo.pbi.nrc.ca:8090/EMBOSS/index.html (visited July 9, 2002). Constructing a rudimentary algorithm to assign codons based on a given frequency can also easily be accomplished with basic mathematical functions by one of ordinary skill in the art.

[0173] A number of options are available for synthesizing codon optimized coding regions designed by any of the methods described above, using standard and routine molecular biological manipulations well known to those of ordinary skill in the art. In one approach, a series of complementary oligonucleotide pairs of 80-90 nucleotides each in length and spanning the length of the desired sequence is synthesized by standard methods. These oligonucleotide pairs are synthesized such that upon annealing, they form double stranded fragments of 80-90 base pairs, containing cohesive ends, e.g., each oligonucleotide in the pair is synthesized to extend 3, 4, 5, 6, 7, 8, 9, 10, or more bases beyond the region that is complementary to the other oligonucleotide in the pair. The single-stranded ends of each pair of oligonucleotides are designed to anneal with the single-stranded end of another pair of oligonucleotides. The oligonucleotide pairs are allowed to anneal, and approximately five to six of these double-stranded fragments are then allowed to anneal together via the cohesive single stranded ends, and then they are ligated together and cloned into a standard bacterial cloning vector, for example, a TOPO® vector available from Invitrogen Corporation, Carlsbad, CA. The construct is then sequenced by standard methods. Several of these constructs, each consisting of 5 to 6 fragments of 80 to 90 base pairs ligated together, i.e., fragments of about 500 base pairs, are prepared, such that the entire desired sequence is represented in a series of plasmid constructs. The inserts of these plasmids are then cut with appropriate restriction enzymes and ligated together to form the final construct. The final construct is then cloned into a standard bacterial cloning vector, and sequenced. Additional methods would be immediately apparent to the skilled artisan. In addition, gene synthesis is readily available commercially.
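One possible way to lay out such oligonucleotide pairs is sketched below, for illustration only. The sketch assumes a simple staggered design in which each bottom-strand oligonucleotide is shifted a fixed number of bases downstream of its top-strand partner, so that each annealed pair carries single-stranded ends complementary to those of its neighbours. The function names, the default length of 80 and overhang of 4, and the handling of the two outermost ends (left unmodified here) are assumptions and are not taken from the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def design_oligo_pairs(sequence, length=80, overhang=4):
    """Split a sequence into (top, bottom) oligo pairs, both written 5'->3'.

    The top oligo covers sequence[start:start + length]; the bottom oligo is
    the reverse complement of the same window shifted `overhang` bases
    downstream, so adjacent annealed pairs share complementary cohesive ends.
    """
    if not 0 < overhang < length:
        raise ValueError("overhang must be positive and shorter than length")
    pairs = []
    for start in range(0, len(sequence), length):
        top = sequence[start:start + length]
        shifted = sequence[start + overhang:start + length + overhang]
        pairs.append((top, reverse_complement(shifted)))
    return pairs


# Hypothetical 200-base target (assumption, for illustration only).
target = "ACGT" * 50
pairs = design_oligo_pairs(target, length=80, overhang=4)
```

Concatenating the top-strand oligonucleotides regenerates the target sequence, which provides a simple consistency check before the oligonucleotides are ordered.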

[0174] In additional embodiments, a full-length polypeptide sequence is codon-optimized for a given species resulting in a codon-optimized coding region encoding the entire polypeptide, and then nucleic acid fragments of the codon-optimized coding region, which encode fragments, variants, and derivatives of the polypeptide are made from the original codon-optimized coding region. As would be well understood by those of ordinary skill in the art, if codons have been randomly assigned to the full-length coding region based on their frequency of use in a given species, nucleic acid fragments encoding fragments, variants, and derivatives would not necessarily be fully codon optimized for the given species. However, such sequences are still much closer to the codon usage of the desired species than the native codon usage. The advantage of this approach is that synthesizing codon-optimized nucleic acid fragments encoding each fragment, variant, and derivative of a given polypeptide, although routine, would be time consuming and would result in significant expense.

Vectors and Methods of Using Vectors in Host Cells

[0175] In one embodiment, host cells for use in the invention are genetically engineered (transduced or transformed or transfected) with the vectors of this invention which may be, for example, a cloning vector or an expression vector. The vector may be, for example, in the form of a plasmid, a viral particle, a phage, etc. The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the genes of the present invention. The culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to the ordinarily skilled artisan.

[0176] Polynucleotides can be employed for producing polypeptides by recombinant techniques. Thus, for example, the polynucleotide can be included in any one of a variety of expression vectors for expressing a polypeptide. Such vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; and yeast plasmids. However, any other vector can be used, as long as it is replicable and viable in the host.

[0177] The appropriate DNA sequence may be inserted into the vector by a variety of procedures. In general, the DNA sequence is inserted into an appropriate restriction endonuclease site(s) by procedures known in the art. Such procedures and others are deemed to be within the scope of those skilled in the art.

[0178] The DNA sequence in the expression vector is operatively associated with an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis. Representative examples of such promoters are as follows:

Gene    Organism         Systematic name    Reason for use/benefits
PGK1    S. cerevisiae    YCR012W            Strong constitutive promoter
ENO1    S. cerevisiae    YGR254W            Strong constitutive promoter
TDH3    S. cerevisiae    YGR192C            Strong constitutive promoter
TDH2    S. cerevisiae    YJR009C            Strong constitutive promoter
TDH1    S. cerevisiae    YJL052W            Strong constitutive promoter
ENO2    S. cerevisiae    YHR174W            Strong constitutive promoter
GPM1    S. cerevisiae    YKL152C            Strong constitutive promoter
TPI1    S. cerevisiae    YDR050C            Strong constitutive promoter

[0179] Additionally, promoter sequences from stress and starvation response genes are useful in the present invention. In some embodiments, promoter regions from the S. cerevisiae genes GAC1, GET3, GLC7, GSH1, GSH2, HSF1, HSP12, LCB5, LRE1, LSP1, NBP2, PIL1, PIM1, SGT2, SLG1, WHI2, WSC2, WSC3, WSC4, YAP1, YDC1, HSP104, HSP26, ENA1, MSN2, MSN4, SIP2, SIP4, SIP5, DPL1, IRS4, KOG1, PEP4, HAP4, PRB1, TAX4, ZPR1, ATG1, ATG2, ATG10, ATG11, ATG12, ATG13, ATG14, ATG15, ATG16, ATG17, ATG18, and ATG19 may be used. Any suitable promoter to drive gene expression in the host cells of the invention may be used. Additionally, the E. coli lac or trp promoters, and other promoters known to control expression of genes in prokaryotic or lower eukaryotic cells, can be used.

[0180] In addition, the expression vectors may contain one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells, such as URA3, HIS3, LEU2, TRP1, LYS2 or ADE2, dihydrofolate reductase, neomycin (G418) resistance or zeocin resistance for eukaryotic cell culture, or tetracycline or ampicillin resistance in E. coli.

[0181] The expression vector may also contain a ribosome binding site for translation initiation and/or a transcription terminator. The vector may also include appropriate sequences for amplifying expression, or may include additional regulatory regions.

[0182] The vector containing the appropriate DNA sequence as described herein, as well as an appropriate promoter or control sequence, may be employed to transform an appropriate host to permit the host to express the protein.

[0183] Thus, in certain aspects, host cells for use in the invention contain the above-described constructs. The host cell can be a host cell as described elsewhere in the application. The host cell can be, for example, a lower eukaryotic cell, such as a yeast cell, e.g., Saccharomyces cerevisiae or Kluyveromyces, or the host cell can be a prokaryotic cell, such as a bacterial cell.

[0184] As representative examples of appropriate hosts, there may be mentioned: bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium; thermophilic or mesophilic bacteria; fungal cells, such as yeast; and plant cells, etc. The selection of an appropriate host is deemed to be within the scope of those skilled in the art from the teachings herein.

[0185] Appropriate fungal hosts include yeast. In certain aspects of the invention the yeast is selected from the group consisting of Saccharomyces cerevisiae, Kluyveromyces lactis, Schizosaccharomyces pombe, Candida albicans, Pichia pastoris, Pichia stipitis, Yarrowia lipolytica, Hansenula polymorpha, Phaffia rhodozyma, Candida utilis, Arxula adeninivorans, Debaryomyces hansenii, Debaryomyces polymorphus, Schwanniomyces occidentalis, Issatchenkia orientalis, Kluyveromyces marxianus, Blakeslea, Candida, Cryptococcus, Cunninghamella, Lipomyces, Mortierella, Mucor, Phycomyces, Pythium, Rhodosporidium, Rhodotorula, Trichosporon, and Yarrowia.

Transposons

[0186] To select for foreign DNA that has entered a host, it is preferable that the DNA be stably maintained in the organism of interest. With regard to plasmids, there are two processes by which this can occur. One is through the use of replicative plasmids. These plasmids have origins of replication that are recognized by the host and allow the plasmids to replicate as stable, autonomous, extrachromosomal elements that are partitioned during cell division into daughter cells. The second process occurs through the integration of a plasmid onto the chromosome. This predominantly happens by homologous recombination and results in the insertion of the entire plasmid, or parts of the plasmid, into the host chromosome. Thus, the plasmid and selectable marker(s) are replicated as an integral piece of the chromosome and segregated into daughter cells. Therefore, using selectable markers to ascertain whether plasmid DNA is entering a cell during a transformation event requires either a replicative plasmid or the ability to recombine the plasmid onto the chromosome. These qualifiers cannot always be met, especially when handling organisms that do not have a suite of genetic tools.

[0187] One way to avoid issues regarding plasmid-associated markers is through the use of transposons. A transposon is a mobile DNA element, defined by mosaic DNA sequences that are recognized by enzymatic machinery referred to as a transposase. The function of the transposase is to randomly insert the transposon DNA into host or target DNA. A selectable marker can be cloned onto a transposon by standard genetic engineering. The resulting DNA fragment can be coupled to the transposase machinery in an in vitro reaction and the complex can be introduced into target cells by electroporation. Stable insertion of the marker onto the chromosome requires only the function of the transposase machinery and alleviates the need for homologous recombination or replicative plasmids.

[0188] The random nature associated with the integration of transposons has the added advantage of acting as a form of mutagenesis. Libraries can be created that comprise amalgamations of transposon mutants. These libraries can be used in screens or selections to produce mutants with desired phenotypes. For instance, a transposon library of a CBP organism could be screened for the ability to produce less ethanol, or more lactic acid and/or more acetate.

Native cellulolytic strategy

[0189] Naturally occurring cellulolytic microorganisms are starting points for CBP organism development via the native strategy. Anaerobes and facultative anaerobes are of particular interest. The primary objective is to engineer product yields and lactate or acetate titers to satisfy the requirements of an industrial process. Metabolic engineering of mixed-acid fermentations in relation to, for example, ethanol production, has been successful in the case of mesophilic, non-cellulolytic, enteric bacteria. Recent developments in suitable gene-transfer techniques allow for this type of work to be undertaken with cellulolytic bacteria.

Recombinant cellulolytic strategy

[0190] Non-cellulolytic microorganisms with desired product-formation properties (e.g., high lactate or acetate yield and titer) are starting points for CBP organism development by the recombinant cellulolytic strategy. The primary objective of such developments is to engineer a heterologous cellulase system that enables growth and fermentation on pretreated lignocellulose. The heterologous production of cellulases has been pursued primarily with bacterial hosts producing ethanol at high yield (engineered strains of E. coli, Klebsiella oxytoca, and Zymomonas mobilis) and the yeast Saccharomyces cerevisiae. Cellulase expression in strains of K. oxytoca resulted in increased hydrolysis yields - but not growth without added cellulase - for microcrystalline cellulose, and anaerobic growth on amorphous cellulose. Although dozens of saccharolytic enzymes have been functionally expressed in S. cerevisiae, anaerobic growth on cellulose as the result of such expression has not been definitively demonstrated. Aspects of the present invention relate to the use of thermophilic or mesophilic microorganisms as hosts for modification via the native cellulolytic strategy. Their potential in process applications in biotechnology stems from their ability to grow at relatively high temperatures with attendant high metabolic rates, production of physically and chemically stable enzymes, and elevated yields of end products. Major groups of thermophilic bacteria include eubacteria and archaebacteria. Thermophilic eubacteria include: phototrophic bacteria, such as cyanobacteria, purple bacteria, and green bacteria; Gram-positive bacteria, such as Bacillus, Clostridium, lactic acid bacteria, and Actinomyces; and other eubacteria, such as Thiobacillus, Spirochete, Desulfotomaculum, Gram-negative aerobes, Gram-negative anaerobes, and Thermotoga. Within archaebacteria are considered Methanogens, extreme thermophiles (an art-recognized term), and Thermoplasma.
In certain embodiments, the present invention relates to Gram-negative organotrophic thermophiles of the genus Thermus; Gram-positive eubacteria, such as the genus Clostridium, which comprise both rods and cocci; genera in the group of eubacteria, such as Thermosipho and Thermotoga; and genera of Archaebacteria, such as Thermococcus, Thermoproteus (rod-shaped), Thermofilum (rod-shaped), Pyrodictium, Acidianus, Sulfolobus, Pyrobaculum, Pyrococcus, Thermodiscus, Staphylothermus, Desulfurococcus, Archaeoglobus, and Methanopyrus. Some examples of thermophilic or mesophilic microorganisms (including bacteria, prokaryotic microorganisms, and fungi) which may be suitable for the present invention include, but are not limited to: Clostridium thermosulfurogenes, Clostridium cellulolyticum, Clostridium thermocellum, Clostridium thermohydrosulfuricum, Clostridium thermoaceticum, Clostridium thermosaccharolyticum, Clostridium tartarivorum, Clostridium thermocellulaseum, Clostridium phytofermentans, Clostridium straminosolvens, Thermoanaerobacterium thermosaccharolyticum, Thermoanaerobacterium saccharolyticum, Thermobacteroides acetoethylicus, Thermoanaerobium brockii, Methanobacterium thermoautotrophicum, Anaerocellum thermophilum, Pyrodictium occultum, Thermoproteus neutrophilus, Thermofilum librum, Thermothrix thioparus, Desulfovibrio thermophilus, Thermoplasma acidophilum, Hydrogenomonas thermophilus, Thermomicrobium roseum, Thermus flavus, Thermus ruber, Pyrococcus furiosus, Thermus aquaticus, Thermus thermophilus, Chloroflexus aurantiacus, Thermococcus litoralis, Pyrodictium abyssi, Bacillus stearothermophilus, Cyanidium caldarium, Mastigocladus laminosus, Chlamydothrix calidissima, Chlamydothrix penicillata, Thiothrix carnea, Phormidium tenuissimum, Phormidium geysericola, Phormidium subterraneum, Phormidium bijahensi, Oscillatoria filiformis, Synechococcus lividus, Chloroflexus aurantiacus, Pyrodictium brockii, Thiobacillus thiooxidans, Sulfolobus acidocaldarius, Thiobacillus thermophilica, Bacillus stearothermophilus, Cercosulcifer hamathensis, Vahlkampfia reichi, Cyclidium citrullus, Dactylaria gallopava, Synechococcus lividus, Synechococcus elongatus, Synechococcus minervae, Synechocystis aquatilus, Aphanocapsa thermalis, Oscillatoria terebriformis, Oscillatoria amphibia, Oscillatoria germinata, Oscillatoria okenii, Phormidium laminosum, Phormidium parparasiens, Symploca thermalis, Bacillus acidocaldarius, Bacillus coagulans, Bacillus thermocatenulatus, Bacillus licheniformis, Bacillus pumilus, Bacillus macerans, Bacillus circulans, Bacillus laterosporus, Bacillus brevis, Bacillus subtilis, Bacillus sphaericus, Desulfotomaculum nigrificans, Streptococcus thermophilus, Lactobacillus thermophilus, Lactobacillus bulgaricus, Bifidobacterium thermophilum, Streptomyces fragmentosporus, Streptomyces thermonitrificans, Streptomyces thermovulgaris, Pseudonocardia thermophila, Thermoactinomyces vulgaris, Thermoactinomyces sacchari, Thermoactinomyces candidus, Thermomonospora curvata, Thermomonospora viridis, Thermomonospora citrina, Microbispora thermodiastatica, Microbispora aerata, Microbispora bispora, Actinobifida dichotomica, Actinobifida chromogena, Micropolyspora caesia, Micropolyspora faeni, Micropolyspora cectivugida, Micropolyspora cabrobrunea, Micropolyspora thermovirida, Micropolyspora viridinigra, Methanobacterium thermoautotrophicum, Caldicellulosiruptor acetigenus, Caldicellulosiruptor saccharolyticus, Caldicellulosiruptor kristjanssonii, Caldicellulosiruptor owensensis, Caldicellulosiruptor lactoaceticus, variants thereof, and/or progeny thereof.

[0192] In particular embodiments, the present invention relates to thermophilic bacteria selected from the group consisting of Clostridium cellulolyticum, Clostridium thermocellum, and Thermoanaerobacterium saccharolyticum.

[0193] In certain embodiments, the present invention relates to thermophilic bacteria selected from the group consisting of Fervidobacterium gondwanense, Clostridium thermolacticum, Moorella sp., and Rhodothermus marinus.

[0194] In certain embodiments, the present invention relates to thermophilic bacteria of the genera Thermoanaerobacterium or Thermoanaerobacter, including, but not limited to, species selected from the group consisting of: Thermoanaerobacterium thermosulfurigenes, Thermoanaerobacterium aotearoense, Thermoanaerobacterium polysaccharolyticum, Thermoanaerobacterium zeae, Thermoanaerobacterium xylanolyticum, Thermoanaerobacterium saccharolyticum, Thermoanaerobium brockii, Thermoanaerobacterium thermosaccharolyticum, Thermoanaerobacter thermohydrosulfuricus, Thermoanaerobacter ethanolicus, Thermoanaerobacter brockii, variants thereof, and progeny thereof.

[0195] In certain embodiments, the present invention relates to microorganisms of the genera Geobacillus, Saccharococcus, Paenibacillus, Bacillus, and Anoxybacillus, including, but not limited to, species selected from the group consisting of: Geobacillus thermoglucosidasius, Geobacillus stearothermophilus, Saccharococcus caldoxylosilyticus, Saccharococcus thermophilus, Paenibacillus campinasensis, Bacillus flavothermus, Anoxybacillus kamchatkensis, Anoxybacillus gonensis, variants thereof, and progeny thereof.

[0196] In certain embodiments, the present invention relates to mesophilic bacteria selected from the group consisting of Saccharophagus degradans; Flavobacterium johnsoniae; Fibrobacter succinogenes; Clostridium hungatei; Clostridium phytofermentans; Clostridium cellulolyticum; Clostridium aldrichii; Clostridium termitidis; Acetivibrio cellulolyticus; Acetivibrio ethanolgignens; Acetivibrio multivorans; Bacteroides cellulosolvens; and Alkalibacter saccharofermentans, variants thereof and progeny thereof.

Ethanol Production

[0197] According to the present invention a population of microorganisms can be used to produce ethanol from cellulosic substrates and a sugar stream. Such methods can be accomplished, for example, by contacting a cellulosic substrate and a sugar stream with a host cell or a co-culture as known in the art and/or as described herein.

[0198] Sugar streams for use in the invention include sugars derived from a natural source including, but not limited to, molasses, cane juice, or starch (e.g., potato, wheat, or corn). Sugar streams may be available at an existing fermentation facility, to which cellulosic substrates are added (e.g., pretreated lignocellulosic biomass). The fermentation facility may use either a wet milling or a dry milling process.

[0199] Numerous cellulosic substrates can be used in accordance with the present invention. Substrates for cellulase activity assays can be divided into two categories, soluble and insoluble, based on their solubility in water. Soluble substrates include cellodextrins or derivatives, carboxymethyl cellulose (CMC), or hydroxyethyl cellulose (HEC). Insoluble substrates include crystalline cellulose, microcrystalline cellulose (Avicel), amorphous cellulose, such as phosphoric acid swollen cellulose (PASC), dyed or fluorescent cellulose, and pretreated lignocellulosic biomass. These substrates are generally highly ordered cellulosic material and thus only sparingly soluble.

[0200] It will be appreciated that suitable lignocellulosic material may be any feedstock that contains soluble and/or insoluble cellulose, where the insoluble cellulose may be in a crystalline or non-crystalline form. In various embodiments, the lignocellulosic biomass comprises, for example, wood, corn, corn stover, sawdust, bark, leaves, agricultural and forestry residues, grasses such as switchgrass, ruminant digestion products, municipal wastes, paper mill effluent, newspaper, cardboard or combinations thereof.

[0201] In some embodiments, the invention is directed to a method for hydrolyzing a cellulosic substrate, for example a cellulosic substrate as described above, by contacting the cellulosic substrate with a host cell of the invention. In some embodiments, the invention is directed to a method for hydrolyzing a cellulosic substrate, for example a cellulosic substrate as described above, by contacting the cellulosic substrate with a co-culture comprising yeast cells expressing heterologous cellulases.

[0202] In some embodiments, the invention is directed to a method for fermenting cellulose. Such methods can be accomplished, for example, by culturing a host cell or co-culture in a medium that contains insoluble cellulose to allow saccharification and fermentation of the cellulose.

[0203] The production of ethanol can, according to the present invention, be performed at temperatures of at least about 30°C, about 31°C, about 32°C, about 33°C, about 34°C, about 35°C, about 36°C, about 37°C, about 38°C, about 39°C, about 40°C, about 41°C, about 42°C, about 43°C, about 44°C, about 45°C, about 46°C, about 47°C, about 48°C, about 49°C, or about 50°C. In some embodiments of the present invention, the thermotolerant host cell can produce ethanol from cellulose at temperatures above about 30°C, about 31°C, about 32°C, about 33°C, about 34°C, about 35°C, about 36°C, about 37°C, about 38°C, about 39°C, about 40°C, about 41°C, about 42°C, about 43°C, about 44°C, about 45°C, or about 50°C. In some embodiments of the present invention, the thermotolerant host cell can produce ethanol from cellulose at temperatures from about 30°C to 60°C, about 30°C to 55°C, about 30°C to 50°C, about 40°C to 60°C, about 40°C to 55°C, or about 40°C to 50°C.

[0204] Given the abundance of pentose sugars, the fermentation of xylose and other hemicellulose constituents is an attractive option for the development of an economically viable process to produce ethanol from biomass. Hexose (C6) and pentose (C5) sugars are converted into pyruvate by modified glycolytic pathways. The pyruvate can then be redirected to ethanol. For example, the net reaction for a pentose sugar is typically such that three pentose sugars yield five molecules of ethanol and five molecules of carbon dioxide. Aspects of the present invention relate to the use of ethanologenic enzymes (i.e., pyruvate decarboxylase and/or alcohol dehydrogenase).
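The pentose stoichiometry stated above (three pentose sugars yielding five ethanol and five carbon dioxide) can be checked with a short mass balance. The following is a minimal sketch, assuming xylose (C5H10O5) as the pentose and glucose (C6H12O6) as the reference hexose; the function names and the atomic weights used are illustrative choices, not part of the specification.

```python
# Approximate standard atomic weights in g/mol (assumed values for illustration).
M_C, M_H, M_O = 12.011, 1.008, 15.999


def molar_mass(c, h, o):
    """Molar mass in g/mol of a compound with formula CcHhOo."""
    return c * M_C + h * M_H + o * M_O


XYLOSE = molar_mass(5, 10, 5)    # C5H10O5, assumed representative pentose
GLUCOSE = molar_mass(6, 12, 6)   # C6H12O6, assumed representative hexose
ETHANOL = molar_mass(2, 6, 1)    # C2H5OH
CO2 = molar_mass(1, 0, 2)


def ethanol_mass_yield(sugar_molar_mass, n_sugar, n_ethanol):
    """Mass of ethanol formed per mass of sugar consumed."""
    return n_ethanol * ETHANOL / (n_sugar * sugar_molar_mass)


# 3 pentose -> 5 ethanol + 5 CO2 (as stated in the text)
pentose_yield = ethanol_mass_yield(XYLOSE, 3, 5)
# 1 hexose -> 2 ethanol + 2 CO2 (standard hexose fermentation)
hexose_yield = ethanol_mass_yield(GLUCOSE, 1, 2)

# Mass on each side of the pentose reaction, to confirm that it balances.
pentose_reactant_mass = 3 * XYLOSE
pentose_product_mass = 5 * ETHANOL + 5 * CO2

print(f"pentose: {pentose_yield:.3f} g ethanol/g sugar")
print(f"hexose:  {hexose_yield:.3f} g ethanol/g sugar")
```

Under these assumptions both sugars give roughly 0.51 g ethanol per g sugar, which matches the 0.51 factor used in the theoretical yield calculation.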

[0205] A variety of microorganisms are known to be useful for the conversion of organic material to ethanol. Examples of microorganisms which may be used in practice are fermentation agents, such as Saccharomyces cerevisiae for producing ethanol. An alternative ethanol-producing organism which may be used is Zymomonas mobilis or a member selected from the Zymomonas, Erwinia, Klebsiella, Xanthomonas or Escherichia genera. Other microorganisms that convert sugars to ethanol include species of Schizosaccharomyces (such as S. pombe), Pichia (P. stipitis), Candida (C. shehatae) and Pachysolen (P. tannophilus).

[0206] For the production of ethanol, the microorganisms of the subject invention can also be engineered with nucleic acids, such as those disclosed in U.S. Pat. No. 5,000,000, which is hereby incorporated by reference. In certain embodiments, the d-ldh, l-ldh, ppc, ack, pfl genes of said microorganisms may optionally be inactivated. For example, genes coding for the alcohol dehydrogenase II and pyruvate decarboxylase activities together with appropriate regulatory sequences may be used to transform host cells; the regulatory sequences may consist of promoters, inducers, operators, ribosomal binding sites, terminators, and/or other regulatory sequences. See U.S. Patent No. 7,098,009, which is hereby incorporated by reference. A biocatalyst, such as a recombinant ethanologenic bacterium, can be engineered to express one or more enzymatic activities, such as those described above, in particular amounts sufficient for degrading complex sugars. Such a biocatalyst would be suitable for the efficient degradation of complex sugars and subsequent fermentation into alcohol in the processes of the present invention.

[0207] Ethanol production can be measured using any method known in the art. For example, the quantity of ethanol in fermentation samples can be assessed using HPLC analysis. Many ethanol assay kits are commercially available that use, for example, alcohol oxidase enzyme based assays. Methods of determining ethanol production are within the scope of those skilled in the art from the teachings herein. The U.S. Department of Energy (DOE) provides a method for calculating theoretical ethanol yield. Accordingly, if the weight percentages are known of C6 sugars (i.e., glucan, galactan, mannan), the theoretical yield of ethanol in gallons per dry ton of total C6 polymers can be determined by applying a conversion factor as follows:

(1.11 pounds of C6 sugar/pound of C6 polymeric sugar) x (0.51 pounds of ethanol/pound of sugar) x (2000 pounds of C6 polymeric sugar/ton of C6 polymeric sugar) x (1 gallon of ethanol/6.55 pounds of ethanol) x (1/100%), wherein the factor (1 gallon of ethanol/6.55 pounds of ethanol) is based on the specific gravity of ethanol at 20°C.

[0208] And if the weight percentages are known of C5 sugars (i.e., xylan, arabinan), the theoretical yield of ethanol in gallons per dry ton of total C5 polymers can be determined by applying a conversion factor as follows:

(1.136 pounds of C5 sugar/pound of C5 polymeric sugar) x (0.51 pounds of ethanol/pound of sugar) x (2000 pounds of C5 polymeric sugar/ton of C5 polymeric sugar) x (1 gallon of ethanol/6.55 pounds of ethanol) x (1/100%), wherein the factor (1 gallon of ethanol/6.55 pounds of ethanol) is based on the specific gravity of ethanol at 20°C.

[0209] It follows that adding the theoretical yield of ethanol in gallons per dry ton of the total C6 polymers to the theoretical yield of ethanol in gallons per dry ton of the total C5 polymers gives the total theoretical yield of ethanol in gallons per dry ton of feedstock.
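The C6 and C5 conversion factors and their sum can be sketched as a small calculation. This is an illustrative sketch only: it reads the 2000 factor as pounds per ton of feedstock, and the feedstock composition in the usage lines (35 wt.% C6 polymers, 25 wt.% C5 polymers) is a hypothetical input, not a value from the specification.

```python
LB_PER_TON = 2000.0           # pounds per (short) ton
LB_ETHANOL_PER_GAL = 6.55     # pounds of ethanol per gallon, as stated in the text
ETHANOL_PER_SUGAR = 0.51      # pounds of ethanol per pound of sugar
C6_SUGAR_PER_POLYMER = 1.11   # pounds of C6 sugar per pound of C6 polymer
C5_SUGAR_PER_POLYMER = 1.136  # pounds of C5 sugar per pound of C5 polymer


def theoretical_ethanol_yield(c6_wt_pct, c5_wt_pct):
    """Theoretical ethanol yield in gallons per dry ton of feedstock.

    c6_wt_pct: weight percent of C6 polymers (glucan, galactan, mannan)
    c5_wt_pct: weight percent of C5 polymers (xylan, arabinan)
    """
    # Gallons of ethanol per ton, per pound of monomeric sugar fraction.
    gal_per_ton_per_fraction = ETHANOL_PER_SUGAR * LB_PER_TON / LB_ETHANOL_PER_GAL
    c6_yield = (c6_wt_pct / 100.0) * C6_SUGAR_PER_POLYMER * gal_per_ton_per_fraction
    c5_yield = (c5_wt_pct / 100.0) * C5_SUGAR_PER_POLYMER * gal_per_ton_per_fraction
    return c6_yield + c5_yield


# Hypothetical feedstock composition, for illustration only.
example_yield = theoretical_ethanol_yield(c6_wt_pct=35.0, c5_wt_pct=25.0)
print(f"{example_yield:.1f} gallons of ethanol per dry ton")
```

Passing 100 wt.% for one polymer class and 0 for the other recovers the per-class conversion factor on its own.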

[0210] Applying this analysis, the DOE provides the following examples of theoretical yield of ethanol in gallons per dry ton of feedstock: corn grain, 124.4; corn stover, 113.0; rice straw, 109.9; cotton gin trash, 56.8; forest thinnings, 81.5; hardwood sawdust, 100.8; bagasse, 111.5; and mixed paper, 116.2. It is important to note that these are theoretical yields. The DOE warns that depending on the nature of the feedstock and the process employed, actual yield could be anywhere from 60% to 90% of theoretical, and further states that "achieving high yield may be costly, however, so lower yield processes may often be more cost effective." (Ibid.)
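As a rough illustration of the 60% to 90% range quoted above, the listed theoretical yields can be converted into expected actual-yield bounds. The sketch below only restates the figures given in the text; the dictionary and function names are hypothetical.

```python
# Theoretical yields in gallons of ethanol per dry ton, as listed in the text.
DOE_THEORETICAL_YIELD = {
    "corn grain": 124.4,
    "corn stover": 113.0,
    "rice straw": 109.9,
    "cotton gin trash": 56.8,
    "forest thinnings": 81.5,
    "hardwood sawdust": 100.8,
    "bagasse": 111.5,
    "mixed paper": 116.2,
}


def actual_yield_range(theoretical, low_fraction=0.60, high_fraction=0.90):
    """Return (low, high) actual-yield bounds for a given theoretical yield."""
    return theoretical * low_fraction, theoretical * high_fraction


for feedstock, theoretical in DOE_THEORETICAL_YIELD.items():
    low, high = actual_yield_range(theoretical)
    print(f"{feedstock}: {low:.1f} to {high:.1f} gal/dry ton")
```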

[0211] Remarkably, aspects of the present invention relate to improvements in process economics without sacrificing foreseeable ethanol yield. Because cheaper construction materials may be used, pretreatment capital costs are reduced considerably if severe conditions are not required. This approach does not reduce the ethanol yield because it achieves the same results associated with acidic and/or high-temperature pretreatment. It is recognized that without aggressive pretreatment conditions, fractional separation of the biomass may not be complete. Nevertheless, high enzyme loading at temperatures of optimal enzymatic activity may still enable complete hemicellulose hydrolysis and an ethanol yield comparable to or greater than what could be achieved with high-temperature or acidic pretreatment. Said high enzyme loading is feasible because the enzyme is not purchased.

Methods of the Invention

[0212] In a particular aspect, the invention is directed to a method for the production of a fermentation product, the method comprising: i) contacting a pretreated lignocellulosic biomass feedstock, a sugar stream, and a population of microorganisms capable of hydrolyzing the lignocellulosic biomass and fermenting sugars into a fermentation product; and ii) culturing said population of microorganisms under conditions for a period sufficient to allow hydrolysis of the lignocellulosic biomass and fermentation of sugars by said population of microorganisms into a fermentation product. In one embodiment, exogenous enzymes are added to the culture.

[0213] In one embodiment, the lignocellulosic biomass feedstock is selected from the group consisting of: grass, switch grass, cord grass, rye grass, reed canary grass, miscanthus, sugar-processing residues, sugarcane bagasse, agricultural wastes, rice straw, rice hulls, barley straw, corn cobs, cereal straw, wheat straw, canola straw, oat straw, oat hulls, corn fiber, stover, soybean stover, corn stover, forestry wastes, recycled wood pulp fiber, paper sludge, sawdust, hardwood, softwood, Agave, and combinations thereof.

[0214] In one embodiment, the sugar stream is selected from the group consisting of: molasses, sugar cane, sugar beet, corn starch, wheat starch, and potato starch.

[0215] In one embodiment, the sugar stream is provided by a dry milling process. In another embodiment, the sugar stream is provided by a wet milling process.

[0216] In one embodiment, the population of microorganisms is selected from the group consisting of bacteria and yeast. In one embodiment, the population of microorganisms comprises at least one genetically modified microorganism. In a specific embodiment, the genetically modified microorganism is a bacterium. In a more specific embodiment, the bacterium is thermophilic or mesophilic. In another embodiment, the genetically modified microorganism is a fungus. In a more specific embodiment, the fungus is a yeast.

[0217] In one embodiment, the fermentation product is selected from the group consisting of an alcohol, lactic acid, and acetic acid. In a particular embodiment, the fermentation product is ethanol.

EXAMPLES

[0218] The invention now being generally described, it will be more readily understood by reference to the following examples, which are included merely for purposes of illustration of certain aspects and embodiments of the present invention, and are not intended to limit the invention.

EXAMPLE 1

Sugar Dilution

[0219] Pretreated solids are diluted with at least one sugar stream, for example, cane juice, molasses, sugar beet juice, or hydrolyzed starch. Such sugar streams are available from corn processing in either dry mill or wet mill configurations. Performance is expected to improve markedly because of two compounding factors: 1) reduced concentrations of inhibitors of cellulase synthesis and function and 2) increased cellulase loading (g cellulase protein/g cellulose). The latter arises because microorganisms synthesize cellulases from added sugar-rich streams as well as from cellulose.

[0220] For example, reference process configuration "A," operating at 25 wt.% solids without sugar dilution, presents conditions under which rate and yield are compromised due to inhibition of cellulase synthesis and cellulase function, and under which mixing is challenging. At 25 wt.% pretreated solids, the cellulose concentration is about 15 wt.%. With a hydrolysis yield of 0.75 (mass sugars liberated per mass cellulose present) and a fermentation yield of 0.45 (mass ethanol per mass sugars liberated), the ethanol concentration is 5.06 wt.%. With a cell yield of 0.1 (mass cells/mass sugars fermented) and a cellulase:cell mass fraction of 0.1 (mass cellulase/mass cells), the cellulase synthesized per mass of sugars is 0.1 * 0.1 = 0.01, and the cellulase per kg pretreated slurry is: (150 g cellulose/kg pretreated slurry) * (0.75 g sugars liberated/g cellulose present) * (0.01 g cellulase synthesized/g sugars liberated) = 1.13 g cellulase/kg pretreated slurry. Dividing this by the initial cellulose concentration (150 g cellulose/kg pretreated slurry) gives a cellulase loading of 7.53 mg cellulase/g initial cellulose.
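The configuration "A" arithmetic can be reproduced with a few lines of code. This is a minimal sketch using only the parameter values stated above; the function and key names are hypothetical. Carried through without intermediate rounding, the cellulase concentration is 1.125 g/kg and the loading is 7.5 mg/g; the 7.53 figure in the text follows from first rounding the cellulase concentration to 1.13 g/kg.

```python
def configuration_metrics(cellulose_g_per_kg,
                          hydrolysis_yield=0.75,
                          fermentation_yield=0.45,
                          cell_yield=0.1,
                          cellulase_per_cell=0.1):
    """Ethanol concentration and cellulase loading for a pretreated slurry.

    All yields are mass ratios as defined in the text; concentrations are
    expressed per kg of pretreated slurry.
    """
    sugars = cellulose_g_per_kg * hydrolysis_yield        # g sugars/kg slurry
    ethanol = sugars * fermentation_yield                  # g ethanol/kg slurry
    cellulase = sugars * cell_yield * cellulase_per_cell   # g cellulase/kg slurry
    return {
        "sugars_g_per_kg": sugars,
        "ethanol_wt_pct": ethanol / 10.0,                  # g/kg -> wt.%
        "cellulase_g_per_kg": cellulase,
        "loading_mg_per_g": 1000.0 * cellulase / cellulose_g_per_kg,
    }


# Configuration "A": 25 wt.% solids, about 15 wt.% (150 g/kg) cellulose.
config_a = configuration_metrics(150.0)
for name, value in config_a.items():
    print(f"{name}: {value:.3f}")
```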

[0221] Reference process configuration "B" features dilution by a stream containing soluble sugar at a concentration of 15 wt.% (readily available in the existing industrial plants mentioned above), blended with pretreated solids at a ratio of, for example, 2:3 (for purposes of illustration; other ratios could be established). The blended sugar/pretreated solids stream has a carbohydrate concentration, including both cellulose and sugars, of 150 g/kg slurry (consisting of pretreated solids and the sugar stream), as in configuration "A," but has a cellulose concentration of 0.6 * 150 g cellulose/kg pretreated slurry = 90 g cellulose/kg. In addition, inhibitors of cellulase function and synthesis are present at a 40% lower concentration in configuration "B" than in configuration "A," leading to higher cellulase specific activity and improved cell growth manifested as higher specific growth rates and lower ATP requirements for maintenance. As well, the solids concentration is only 15 wt.%, at which mixing is comparatively easy to achieve. Lowering the ATP requirements for cell maintenance increases the ATP available for cell growth and cellulase synthesis. Thus, a higher cell yield is expected for configuration "B" than for configuration "A." The cellulase:cell ratio is expected to be similar in both configurations in the absence of catabolite repression of cellulase synthesis by sugars. Catabolite repression may not be operative in microorganisms in which heterologous cellulase expression occurs and can, in any case, be prevented by genetic engineering, reactor operating modes which do not feature sugar accumulation, or a combination thereof.
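The blend composition for configuration "B" follows from a simple mass balance. The sketch below assumes that the 2:3 ratio is sugar stream to pretreated slurry by mass (consistent with the 0.6 factor in the text) and that inhibitors originate only from the pretreated slurry; the function and key names are hypothetical.

```python
def blend_streams(sugar_parts, slurry_parts,
                  sugar_stream_sugar_g_per_kg=150.0,
                  slurry_cellulose_g_per_kg=150.0,
                  slurry_solids_wt_pct=25.0):
    """Mass balance for blending a sugar stream with pretreated slurry.

    sugar_parts, slurry_parts: mass ratio of sugar stream to pretreated slurry.
    Returns concentrations per kg of blended slurry.
    """
    slurry_fraction = slurry_parts / (sugar_parts + slurry_parts)
    sugar_fraction = 1.0 - slurry_fraction
    cellulose = slurry_fraction * slurry_cellulose_g_per_kg
    sugars = sugar_fraction * sugar_stream_sugar_g_per_kg
    return {
        "cellulose_g_per_kg": cellulose,
        "soluble_sugar_g_per_kg": sugars,
        "carbohydrate_g_per_kg": cellulose + sugars,
        "solids_wt_pct": slurry_fraction * slurry_solids_wt_pct,
        # Assumes inhibitors come only from the pretreated slurry.
        "inhibitor_reduction_fraction": sugar_fraction,
    }


config_b_blend = blend_streams(sugar_parts=2, slurry_parts=3)
for name, value in config_b_blend.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the blend reproduces the stated figures: 90 g cellulose/kg, 150 g total carbohydrate/kg, 15 wt.% solids, and a 40% reduction in inhibitor concentration.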

[0222] Assuming, conservatively, the same cell yield and hydrolysis yield for configuration "B" as for configuration "A," the cellulase concentration for configuration "B" would be the same as for configuration "A": 1.13 g cellulase/kg pretreated slurry. Dividing this by the cellulose present, 90 g/kg, gives a cellulase loading of 12.6 mg cellulase/g cellulose, which is about 67% higher than in configuration "A" (put another way, the loading in configuration "A" is about 40% lower than in configuration "B"). If the cell yield in configuration "B" is higher than in configuration "A," then the cellulase loading could be, for example, 80% higher in configuration "B" as compared to configuration "A." One of ordinary skill in the art is aware that cellulase loading increases of 40 to 80% materially improve the performance of processes for converting cellulosic biomass to biofuels. Considering that this cellulase loading effect is compounded with further significant advantages of configuration "B" - lower inhibition of cellulases due to lower concentrations of inhibitors present in pretreated slurry and further with improved mixing - very substantial improvement in performance may be expected for configuration "B" relative to configuration "A."
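The loading comparison between the two configurations can be sketched as follows. The code takes the text's rounded cellulase concentration of 1.13 g/kg as given for both configurations (that equality is the text's stated assumption, not something derived here); variable and function names are hypothetical. On this arithmetic, the ratio of the two loadings is 150/90, so configuration "B" works out to about 67% above configuration "A," or equivalently configuration "A" is 40% below configuration "B."

```python
# Cellulase concentration stated in the text, assumed equal in both configurations.
CELLULASE_G_PER_KG = 1.13


def cellulase_loading_mg_per_g(cellulase_g_per_kg, cellulose_g_per_kg):
    """Cellulase loading in mg cellulase per g cellulose."""
    return 1000.0 * cellulase_g_per_kg / cellulose_g_per_kg


loading_a = cellulase_loading_mg_per_g(CELLULASE_G_PER_KG, 150.0)  # configuration "A"
loading_b = cellulase_loading_mg_per_g(CELLULASE_G_PER_KG, 90.0)   # configuration "B"

# Relative difference expressed both ways.
increase_b_over_a = loading_b / loading_a - 1.0
decrease_a_below_b = 1.0 - loading_a / loading_b

print(f"A: {loading_a:.2f} mg/g, B: {loading_b:.2f} mg/g")
print(f"B relative to A: +{100 * increase_b_over_a:.0f}%")
print(f"A relative to B: -{100 * decrease_a_below_b:.0f}%")
```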

EXAMPLE 2

Propagation of Cellulase-Producing Microbes Via Fermentation of Soluble Sugar Streams

[0223] Industrial plants at which biofuel production occurs via fermentation of soluble sugars by microorganisms that do not produce cellulase (e.g., in cane, sugar beet, or corn processing as described above) routinely produce microorganism cell mass in excess of process requirements, and often recycle cells. If, instead, the fermentation were carried out by a cellulase-producing microorganism, these microorganisms would also be produced in excess of process requirements. Moreover, conditions for cell propagation and cellulase synthesis would be considerably more favorable during fermentation of soluble sugars as compared to fermentation of pretreated solids. Excess cellulase-producing microorganisms could be added to pretreated cellulosic biomass, with or without dilution as above. Such addition could allow higher loading of cellulase and cells per unit pretreated biomass than could be achieved via CBP in which cell and cellulase synthesis occurred primarily or entirely in the CBP reactor. One of ordinary skill in the art is aware that performance would improve markedly at high cellulase and cell loadings. Performance is particularly improved if cellulases are expressed on the cell surface, which fosters co-recovery of cellulase when cells are harvested following soluble sugar fermentation.

* * *

[0224] These examples illustrate possible embodiments of the present invention. While the invention has been particularly shown and described with reference to some embodiments thereof, it will be understood by those skilled in the art that they have been presented by way of example only, and not limitation, and various changes in form and details can be made therein without departing from the spirit and scope of the invention. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

[0225] All documents cited herein, including journal articles or abstracts, published or corresponding U.S. or foreign patent applications, issued or foreign patents, or any other documents, are each entirely incorporated by reference herein, including all data, tables, figures, and text presented in the cited documents.