

Title:
NOVEL CELL WALL DECONSTRUCTION ENZYMES OF CHAETOMIUM THERMOPHILUM, THERMOMYCES STELLATUS, AND CORYNASCUS SEPEDONIUM, AND USES THEREOF
Document Type and Number:
WIPO Patent Application WO/2015/109405
Kind Code:
A1
Abstract:
The present invention relates to novel polypeptides and enzymes (e.g., thermostable proteins and enzymes) having activities relating to biomass processing and/or degradation (e.g., cell wall deconstruction), as well as polynucleotides, vectors, cells, compositions and tools relating to same, or functional variants thereof. More particularly, the present invention relates to secreted enzymes that may be isolated from the fungi Chaetomium thermophilum strain ATCC 16451, Thermomyces stellatus strain CBS 241.64, and Corynascus sepedonium strain ATCC 9787. Uses thereof in various industrial processes such as in biofuels, food preparation, animal feed, pulp and paper, textiles, detergents, waste treatment and others are also disclosed.

Inventors:
TSANG, Adrian (3816 Draper Avenue, Montréal, Québec H4A 2P1, CA)
POWLOWSKI, Justin (4555 Montclair Avenue, Montréal, Québec H2B 2J8, CA)
BUTLER, Gregory (1700 René-Lévesque Blvd West, Apt. 30, Montréal, Québec H3H 2V1, CA)
Application Number:
CA2015/050042
Publication Date:
July 30, 2015
Filing Date:
January 22, 2015
Assignee:
CONCORDIA UNIVERSITY (1455 de Maisonneuve West, Suite GM 1000, Montréal, Québec H3G 1M8, CA)
International Classes:
C12N15/56; A21D2/26; A23K1/165; A23L1/305; C07K14/37; C12N1/15; C12N1/21; C12N9/00; C12N9/02; C12N9/14; C12N9/24; C12N9/42; C12N9/58; C12N9/88; C12N15/31; C12N15/52; C12N15/55; C12N15/57; C12N15/63; C12N15/80; C12P7/10; C12P19/00; C12P19/14; C12P21/02; D06M16/00; D21C9/00
Domestic Patent References:
WO2009033071A2, 2009-03-12
Other References:
AMLACHER, S ET AL.: "Insight into Structure and Assembly of the Nuclear Pore Complex by Utilizing the Genome of a Eukaryotic Thermophile.", CELL, vol. 146, no. 2, 22 July 2011 (2011-07-22), pages 277 - 289, XP028382610
DATABASE NCBI 6 June 2011 (2011-06-06), XP006695946, accession no. P_006695946
DATABASE NCBI 6 June 2011 (2011-06-06), accession no. M_006695883
COLEMAN, JJ ET AL.: "The Genome of Nectria haematococca: Contribution of Supernumerary Chromosomes to Gene Expansion.", PLOS GENETICS., vol. 5, no. ISSUE, 28 August 2009 (2009-08-28), pages 1 - 14, XP055214133, ISSN: 1553-7404, Retrieved from the Internet
DATABASE NCBI 14 July 2009 (2009-07-14), XP003054296, accession no. P_003054296
BERKA, RM ET AL.: "Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris.", NATURE BIOTECHNOLOGY, vol. 29, no. ISSUE, 2 October 2011 (2011-10-02), pages 922 - 927, XP055034838
DATABASE NCBI 9 August 2011 (2011-08-09), accession no. M_003656350
Attorney, Agent or Firm:
GOUDREAU GAGE DUBUC (2000, McGill College, Suite 220, Montréal, Québec H3A 3H3, CA)
Claims:
CLAIMS:

1. An isolated polypeptide which is:

(a) a polypeptide comprising the amino acid sequence of any one of SEQ ID NOs: 357-534, 1031-1278, and 1851-2136;

(b) a polypeptide comprising an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% amino acid sequence identity to the polypeptide defined in (a);

(c) a polypeptide comprising an amino acid sequence encoded by the nucleic acid sequence of any one of SEQ ID NOs: 179-356, 783-1030, and 1565-1850;

(d) a polypeptide comprising an amino acid sequence encoded by any one of the exonic nucleic acid sequences corresponding to the positions as defined in Tables 2A-2C;

(e) a polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule that hybridizes under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of a polynucleotide molecule comprising the nucleic acid sequence defined in (c) or (d);

(f) a polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to a polynucleotide comprising the nucleic acid sequence defined in (c) or (d);

(g) a functional variant of the polypeptide defined in (a) comprising a substitution, deletion, and/or insertion at one or more residues; or

(h) a functional fragment of the polypeptide of any one of (a) to (g).

2. The isolated polypeptide of claim 1, wherein said polypeptide has a corresponding function and/or protein activity according to Tables 1A-1C.

3. The isolated polypeptide of claim 1 or 2 comprising or consisting of the amino acid sequence of any one of SEQ ID NOs: 357-534, 1031-1278, and 1851-2136.

4. The isolated polypeptide of any one of claims 1 to 3, wherein said polypeptide is a recombinant polypeptide.

5. The isolated polypeptide of any one of claims 1 to 4 obtainable from a fungus.

6. The isolated polypeptide of any one of claims 1 to 5, wherein said fungus is from the genus Chaetomium, Thermomyces, or Corynascus.

7. The isolated polypeptide of any one of claims 1 to 6, wherein said fungus is Chaetomium thermophilum, Thermomyces stellatus, or Corynascus sepedonium.

8. An antibody that specifically binds to the isolated polypeptide of any one of claims 1 to 7.

9. An isolated polynucleotide molecule encoding the polypeptide of any one of claims 1 to 7.

10. An isolated polynucleotide molecule which is:

(a) a polynucleotide molecule comprising a nucleic acid sequence encoding the polypeptide of any one of SEQ ID NOs: 357-534, 1031-1278, and 1851-2136;

(b) a polynucleotide molecule comprising the nucleic acid sequence of any one of SEQ ID NOs: 1-160, 535-782, and 1279-1564;

(c) a polynucleotide molecule comprising the nucleic acid sequence of any one of SEQ ID NOs: 179-356, 783-1030, and 1565-1850;

(d) a polynucleotide molecule comprising any one of the exonic nucleic acid sequences corresponding to the positions as defined in Tables 2A-2C;

(e) a polynucleotide molecule comprising a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to any one of the polynucleotide molecules defined in (a) to (d); or

(f) a polynucleotide molecule that hybridizes under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of any one of the polynucleotide molecules defined in (a) to (e).

11. The isolated polynucleotide molecule of claim 9 or 10 obtainable from a fungus.

12. The isolated polynucleotide molecule of claim 11, wherein said fungus is from the genus Chaetomium, Thermomyces, or Corynascus.

13. The isolated polynucleotide molecule of claim 12, wherein said fungus is Chaetomium thermophilum, Thermomyces stellatus, or Corynascus sepedonium.

14. A vector comprising a polynucleotide molecule as defined in any one of claims 9 to 13.

15. The vector of claim 14 further comprising a regulatory sequence operatively linked to said polynucleotide molecule for expression of same in a suitable host cell.

16. The vector of claim 15, wherein said suitable host cell is a bacterial cell.

17. The vector of claim 15, wherein said suitable host cell is a fungal cell.

18. The vector of claim 17, wherein said fungal cell is a filamentous fungal cell.

19. A recombinant host cell comprising the polynucleotide molecule as defined in any one of claims 9 to 13, or the vector as defined in any one of claims 14 to 18.

20. The recombinant host cell of claim 19, wherein said cell is a bacterial cell.

21. The recombinant host cell of claim 19, wherein said cell is a fungal cell.

22. The recombinant host cell of claim 21, wherein said fungal cell is a filamentous fungal cell.

23. A polypeptide obtainable by expressing the polynucleotide molecule of any one of claims 9 to 13, or the vector of any one of claims 14 to 18, in a suitable host cell.

24. A composition comprising the polypeptide of any one of claims 1 to 7 or 23, or the recombinant host cell of any one of claims 19 to 22.

25. The composition of claim 24 further comprising a suitable carrier.

26. The composition of claim 24 or 25 further comprising a substrate of said polypeptide.

27. The composition of claim 26, wherein said substrate is biomass.

28. A method for producing the polypeptide of any one of claims 1 to 7 or 23, said method comprising:

(a) culturing a strain comprising the polynucleotide molecule of any one of claims 9 to 13 or the vector of any one of claims 14 to 18 under conditions conducive for the production of said polypeptide; and

(b) recovering said polypeptide.

29. The method of claim 28, wherein said strain is a bacterial strain.

30. The method of claim 28, wherein said strain is a fungal strain.

31. The method of claim 30, wherein said fungal strain is a filamentous fungal strain.

32. A method for producing the polypeptide of any one of claims 1 to 7 or 23, said method comprising:

(a) culturing the recombinant host cell of any one of claims 19 to 22 under conditions conducive for the production of said polypeptide; and

(b) recovering said polypeptide.

33. A method for preparing a food product, said method comprising incorporating the polypeptide of any one of claims 1 to 7 or 23 during preparation of said food product.

34. The method of claim 33, wherein said food product is a bakery product.

35. Use of the polypeptide of any one of claims 1 to 7 or 23 for the preparation or processing of a food product.

36. The use of claim 33, wherein said food product is a bakery product.

37. The polypeptide of any one of claims 1 to 7 or 23 for use in the preparation or processing of a food product.

38. The polypeptide of claim 37, wherein said food product is a bakery product.

39. Use of the polypeptide of any one of claims 1 to 7 or 23 for the preparation of animal feed.

40. Use of the polypeptide of any one of claims 1 to 7 or 23 for increasing digestion or absorption of animal feed.

41. The use of claim 39 or 40, wherein said animal feed is a cereal-based feed.

42. The polypeptide of any one of claims 1 to 7 or 23 for the preparation of animal feed, or for increasing digestion or absorption of animal feed.

43. The polypeptide of claim 42, wherein said animal feed is a cereal-based feed.

44. Use of the polypeptide of any one of claims 1 to 7 or 23 for the production or processing of kraft pulp or paper.

45. The use of claim 44, wherein said processing comprises prebleaching.

46. The use of claim 44, wherein said processing comprises de-inking.

47. The polypeptide of any one of claims 1 to 7 or 23 for the production or processing of kraft pulp or paper.

48. The polypeptide of claim 47, wherein said processing comprises prebleaching or de-inking.

49. Use of the polypeptide of any one of claims 1 to 7 or 23 for processing lignin.

50. The polypeptide of any one of claims 1 to 7 or 23 for processing lignin.

51. Use of the polypeptide of any one of claims 1 to 7 or 23 for producing ethanol.

52. The polypeptide of any one of claims 1 to 7 or 23 for producing ethanol.

53. The use of any one of claims 35, 36, 40, 41, 44 to 46, 49 and 51 in conjunction with cellulose or a cellulase.

54. Use of the polypeptide of any one of claims 1 to 7 or 23 for treating textiles or dyed textiles.

55. The polypeptide of any one of claims 1 to 7 or 23 for treating textiles or dyed textiles.

56. Use of the polypeptide of any one of claims 1 to 7 or 23 for degrading biomass or pretreated biomass.

57. The polypeptide of any one of claims 1 to 7 or 23 for degrading biomass or pretreated biomass.

Description:
TITLE OF THE INVENTION

NOVEL CELL WALL DECONSTRUCTION ENZYMES OF CHAETOMIUM THERMOPHILUM, THERMOMYCES STELLATUS, AND CORYNASCUS SEPEDONIUM, AND USES THEREOF

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] The present application claims the benefit of U.S. Provisional Applications Serial Nos. 61/930,129, 61/930,119 and 61/930,113, filed on January 22, 2014, which are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to novel polypeptides and enzymes having activities relating to biomass processing and/or degradation (e.g., cell wall deconstruction), as well as polynucleotides, vectors, cells, compositions and tools relating to same, or functional variants thereof. More particularly, the present invention relates to secreted enzymes that may be isolated from the fungi Chaetomium thermophilum strain ATCC 16451, Thermomyces stellatus strain CBS 241.64, and Corynascus sepedonium strain ATCC 9787. Uses thereof in various industrial processes such as in biofuels, food preparation, animal feed, pulp and paper, textiles, detergents, waste treatment and others are also disclosed.

SEQUENCE LISTING

[0003] This application contains a Sequence Listing in computer readable form entitled "Seq_Listing_CHATH_THEST_CORSE.txt", created January 14, 2015 and having a size of about 5.42 MB. The computer readable form is incorporated herein by reference.

BACKGROUND OF THE INVENTION

[0004] Biomass-processing enzymes have a number of industrial applications such as in: the biofuel industry (e.g., improving ethanol yield and/or increasing the efficiency and economy of ethanol production); the food industry (e.g., production of cereal-based food products); the feed-enzyme industry (e.g., increasing the digestibility/absorption of nutrients); the pulp and paper industry (e.g., enhancing bleachability of pulp); the textile industry (e.g., treatment of cellulose-based fabrics); the waste treatment industry (e.g., de-colorization of synthetic dyes); the detergent industry (e.g., providing eco-friendly cleaning products); and the rubber industry (e.g., catalyzing the conversion of latex into foam rubber).

[0005] In particular, driven by the limited availability of fossil fuels, there is a growing interest in the biofuel industry for improving the conversion of biomass into second-generation biofuels. This process is heavily dependent on inexpensive and effective enzymes for the conversion of lignocellulose to ethanol. Cellulase enzyme cocktails involve the concerted action of endoglucanases, cellobiohydrolases (also known as exoglucanases), and beta-glucosidases. The current cost of cellulose-degrading enzymes is too high for bioethanol to compete economically with fossil fuels. Cost reduction may result from the discovery of cellulase enzymes with, for example, higher specific activity, lower production costs, and/or greater compatibility with processing conditions including temperature, pH and the presence of inhibitors in the biomass, or produced as the result of biomass pre-treatment.

[0006] Conversion of plant biomass to glucose may also be enhanced by supplementing cellulase cocktails with enzymes that degrade the other components of biomass, including hemicelluloses, pectins and lignins, and their linkages, thereby improving the accessibility of cellulose to the cellulase enzymes. Such enzymes include, without being limited to: xylanases, mannanases, arabinanases, esterases, glucuronidases, xyloglucanases and arabinofuranosidases for hemicelluloses; lignin peroxidases, manganese-dependent peroxidases, versatile peroxidases, and laccases for lignin; and pectate lyase, pectin lyase, polygalacturonase, pectin acetyl esterase, alpha-arabinofuranosidase, beta-galactosidase, galactanase, arabinanase, rhamnogalacturonase, rhamnogalacturonan lyase, rhamnogalacturonan acetyl esterase, xylogalacturonosidase, and xylogalacturonase for pectins. Additionally, glycoside hydrolase family 61 (GH61) proteins have been shown to stimulate the activity of cellulase preparations.

[0007] These enzymes may also be useful for other purposes in processing biomass. For example, the lignin modifying enzymes may be used to alter the structure of lignin to produce novel materials, and hemicellulases may be employed to produce 5-carbon sugars from hemicelluloses, which may then be further converted to chemical products.

[0008] There is also a growing need for improved enzymes for food processing and feed applications. Cereal-based food products such as pasta, noodles and bread can be prepared from dough which is usually made from the basic ingredients (cereal) flour, water and optionally salt. As a result of a consumer-driven need to replace the chemical additives by more natural products, several enzymes have been developed with dough and/or cereal-based food product-improving properties, which are used in all possible combinations depending on the specific application conditions. Suitable enzymes include, for example, xylanase, starch degrading enzymes, oxidizing enzymes, fatty material splitting enzymes, protein degrading, and modifying or crosslinking enzymes. Many of these enzymes are also used for treating animal feed or animal feed additives, to make them more digestible or to improve their nutritional quality. Amylases are used for the conversion of plant starches to glucose. Pectin-active enzymes are used in fruit processing, for example to increase the yield of juices, and in fruit juice clarification, as well as in other food processing steps.

[0009] There is also a growing need for improved enzymes in other industries. In the pulp and paper industry, enzymes are used to make the bleaching process more effective and to reduce the use of oxidative chemicals. In the textile industry, enzymatic treatment is often used in place of (or in addition to) a bleaching treatment to achieve a "used" look of jeans, and can also improve the softness/feel of fabrics. When used in detergent compositions, enzymes can enhance cleaning ability or act as a softening agent. In the waste treatment industry, enzymes play an important role in changing the characteristics of the waste, for example, to become more amenable to further treatment and/or for bio-conversion to value-added products.

[0010] There is also a growing need for industrial enzymes and proteins that are "thermostable" in that they retain a level of their function or protein activity at temperatures of about 50°C or higher. These thermostable enzymes are highly desirable, for example, to be able to perform reactions at elevated temperatures to avoid or reduce contamination by microorganisms (e.g., bacteria).

[0011] There thus remains a need in the above-mentioned industries and others for biomass-processing enzymes, polynucleotides encoding same, and recombinant vectors and strains for expressing same.

[0012] The present description refers to a number of documents, the content of which is herein incorporated by reference in their entirety.

SUMMARY OF THE INVENTION

[0013] In general, the present invention relates to soluble, secreted proteins relating to biomass processing and/or degradation (e.g., cell wall deconstruction) that may be isolated from the fungi Chaetomium thermophilum strain ATCC 16451, Thermomyces stellatus strain CBS 241.64, and Corynascus sepedonium strain ATCC 9787, as well as polynucleotides, vectors, compositions, cells, antibodies, kits, products and uses associated with same. Briefly, these fungal strains were cultured in vitro and genomic DNA along with total RNA were isolated therefrom. These nucleic acids were then used to determine/assemble fungal genomic sequences and generate cDNA libraries. Bioinformatic tools were used to predict genes in the assembled genomic sequences, and those genes encoding proteins relating to biomass-degradation (e.g., cell wall deconstruction) were identified based on bioinformatics (e.g., the presence of conserved domains). Sequences predicted to encode proteins which are targeted to the mitochondria or bound to the cell wall were removed. cDNA clones comprising full-length sequences predicted to encode soluble, secreted proteins relating to biomass-degradation were fully sequenced and cloned into appropriate expression vectors for protein production and characterization. The full-length genomic, exonic, intronic, coding and polypeptide sequences are disclosed herein, along with corresponding putative (biological) functions and/or protein activities, where available.
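
The selection step described in paragraph [0013] can be illustrated with a minimal Python sketch. It is not the inventors' pipeline: the record type, the location labels, and the domain labels are hypothetical stand-ins for whatever the actual gene-prediction and localization tools reported. The sketch only shows the two filters named in the text, namely keeping predicted proteins that carry a biomass-related conserved domain and removing those predicted to be mitochondrial or cell-wall bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedProtein:
    """One predicted gene product (all field names are hypothetical)."""
    gene_id: str
    predicted_location: str   # e.g. "secreted", "mitochondrion", "cell_wall"
    domains: tuple            # conserved-domain labels found in the sequence


# Locations that paragraph [0013] says were removed from the candidate set.
EXCLUDED_LOCATIONS = {"mitochondrion", "cell_wall"}


def select_candidates(proteins, biomass_domains):
    """Keep proteins with a biomass-related domain and an allowed location."""
    kept = []
    for protein in proteins:
        if protein.predicted_location in EXCLUDED_LOCATIONS:
            continue  # targeted to the mitochondria or bound to the cell wall
        if not any(d in biomass_domains for d in protein.domains):
            continue  # no conserved domain linked to biomass degradation
        kept.append(protein)
    return kept


if __name__ == "__main__":
    # Illustrative records only; these are not sequences from the application.
    example = [
        PredictedProtein("g1", "secreted", ("GH10",)),
        PredictedProtein("g2", "mitochondrion", ("GH10",)),
        PredictedProtein("g3", "secreted", ("kinase",)),
    ]
    for p in select_candidates(example, {"GH10", "GH61"}):
        print(p.gene_id)
```

In practice the location and domain annotations would come from dedicated prediction tools; the sketch assumes they have already been reduced to simple labels.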

[0014] The soluble, secreted, biomass degradation proteins of the present invention comprise a proteome which is referred to herein as the SSBD proteome of Chaetomium thermophilum, Thermomyces stellatus, or Corynascus sepedonium.

[0015] Accordingly, in some aspects the present invention relates to an isolated polypeptide which is:

(a) a polypeptide comprising the amino acid sequence of any one of SEQ ID NOs: 357-534, 1031-1278, and 1851-2136;

(b) a polypeptide comprising an amino acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% amino acid sequence identity to the polypeptide defined in (a);

(c) a polypeptide comprising an amino acid sequence encoded by the nucleic acid sequence of any one of SEQ ID NOs: 179-356, 783-1030, and 1565-1850;

(d) a polypeptide comprising an amino acid sequence encoded by any one of the exonic nucleic acid sequences corresponding to the positions as defined in Tables 2A-2C;

(e) a polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule that hybridizes under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of a polynucleotide molecule comprising the nucleic acid sequence defined in (c) or (d);

(f) a polypeptide comprising an amino acid sequence encoded by a polynucleotide molecule having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to a polynucleotide molecule comprising the nucleic acid sequence defined in (c) or (d);

(g) a functional variant of the polypeptide defined in (a) comprising a substitution, deletion, and/or insertion at one or more residues; or

(h) a functional fragment of the polypeptide of any one of (a) to (g).
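
Items (b) and (f) above rely on percent sequence identity thresholds. As a minimal sketch, the following Python shows one way such a value could be computed from two sequences that have already been aligned. The convention used here (identical columns divided by all aligned columns, skipping columns that are gaps in both sequences) is an assumption for illustration; this section does not state which alignment program or identity formula applies, and other conventions give different numbers.

```python
# Identity levels recited in items (b) and (f), in percent.
THRESHOLDS = (60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1  # identical residues (never two gaps at this point)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def highest_threshold_met(identity: float):
    """Largest recited threshold that the identity value reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Toy alignment, not a sequence from the application.
    value = percent_identity("MKTAY-LV", "MKSAYGLV")
    print(value, highest_threshold_met(value))
```

Because the result depends on the alignment and on how gaps are counted, any real comparison should use the method defined in the specification itself.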

[0016] In some embodiments, the above mentioned polypeptide has a corresponding function and/or protein activity according to Tables 1A-1C.

[0017] In some embodiments, the above mentioned polypeptide comprises or consists of the amino acid sequence of any one of SEQ ID NOs: 357-534, 1031-1278, and 1851-2136.

[0018] In some embodiments, the above mentioned polypeptide is a recombinant polypeptide.

[0019] In some embodiments, the above mentioned polypeptide is obtainable from a fungus. In some embodiments, the fungus is from the genus Chaetomium, Thermomyces, or Corynascus. In some embodiments, the fungus is Chaetomium thermophilum, Thermomyces stellatus, or Corynascus sepedonium.

[0020] In some aspects, the present invention relates to an antibody that specifically binds to any one of the above mentioned polypeptides.

[0021] In some aspects, the present invention relates to an isolated polynucleotide molecule encoding any one of the above mentioned polypeptides.

[0022] In some aspects, the present invention relates to an isolated polynucleotide molecule which is:

(a) a polynucleotide molecule comprising a nucleic acid sequence encoding the polypeptide of any one of SEQ ID NOs: 357-534, 1031-1278, and 1851-2136;

(b) a polynucleotide molecule comprising the nucleic acid sequence of any one of SEQ ID NOs: 1-178, 535-782, and 1279-1564; in a further embodiment a polynucleotide molecule comprising the nucleic acid sequence of any one of SEQ ID NOs: 1-160, 535-782, and 1279-1564;

(c) a polynucleotide molecule comprising the nucleic acid sequence of any one of SEQ ID NOs: 179-356, 783-1030, and 1565-1850;

(d) a polynucleotide molecule comprising any one of the exonic nucleic acid sequences corresponding to the positions as defined in Tables 2A-2C;

(e) a polynucleotide molecule comprising a nucleic acid sequence having at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleic acid sequence identity to any one of the polynucleotide molecules defined in (a) to (d); or

(f) a polynucleotide molecule that hybridizes under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of any one of the polynucleotide molecules defined in (a) to (e).
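
The SEQ ID NO ranges recited in items (a) to (c) fall into three groups of equal-sized blocks (178, 248 and 286 entries). The Python sketch below encodes those ranges and, purely as an assumption, pairs entries by their position within a block. This section does not state that the sequences correspond one-to-one in this order, nor which group belongs to which organism, so the output should be read as a guess at the numbering scheme and not as the mapping of record.

```python
# SEQ ID NO ranges as recited in items (a), (b) and (c); range() excludes its end.
# Keys are descriptive labels chosen for this sketch, not terms from the source.
BLOCKS = [
    {"item_b_nucleic": range(1, 179),
     "item_c_nucleic": range(179, 357),
     "polypeptide": range(357, 535)},
    {"item_b_nucleic": range(535, 783),
     "item_c_nucleic": range(783, 1031),
     "polypeptide": range(1031, 1279)},
    {"item_b_nucleic": range(1279, 1565),
     "item_c_nucleic": range(1565, 1851),
     "polypeptide": range(1851, 2137)},
]


def related_seq_ids(seq_id: int) -> dict:
    """Return the same-position SEQ ID NOs in the block containing seq_id.

    ASSUMPTION: entries correspond by position within each block.
    """
    for block in BLOCKS:
        for ids in block.values():
            if seq_id in ids:
                offset = seq_id - ids.start
                return {name: r.start + offset for name, r in block.items()}
    raise ValueError(f"SEQ ID NO {seq_id} is outside the recited ranges")


if __name__ == "__main__":
    # Each group's three ranges have the same length, which is what makes
    # a positional pairing at least plausible.
    for block in BLOCKS:
        print({name: len(r) for name, r in block.items()})
    print(related_seq_ids(1031))
```

The equal block sizes are an arithmetic observation from the recited ranges; the positional pairing itself remains unverified here.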

[0023] In some embodiments, the above mentioned polynucleotide molecule is obtainable from a fungus. In some embodiments, the fungus is from the genus Chaetomium, Thermomyces, or Corynascus. In some embodiments, the fungus is Chaetomium thermophilum, Thermomyces stellatus, or Corynascus sepedonium.

[0024] In some embodiments, the above mentioned polynucleotide molecule is operably linked to a heterologous promoter.

[0025] In some aspects, the present invention relates to a vector comprising any one of the above mentioned polynucleotide molecules. In some embodiments, the vector comprises a regulatory sequence operatively linked to the polynucleotide molecule for expression of same in a suitable host cell. In some embodiments, the suitable host cell is a bacterial cell; a fungal cell; or a filamentous fungal cell.

[0026] In some embodiments, the present invention relates to a recombinant host cell comprising any one of the above mentioned polynucleotide molecules or vectors. In some embodiments, the present invention relates to a polypeptide obtainable by expressing the above mentioned polynucleotide or vector in a suitable host cell. In some embodiments, the suitable host cell is a bacterial cell; a fungal cell; or a filamentous fungal cell.

[0027] In some aspects, the present invention relates to a composition comprising any one of the above mentioned polypeptides or the recombinant host cells. In some embodiments, the composition further comprises a suitable carrier. In some embodiments, the composition further comprises a substrate of the polypeptide. In some embodiments, the substrate is biomass.

[0028] In some aspects, the present invention relates to a method for producing any one of the above mentioned polypeptides, the method comprising: (a) culturing a strain comprising the above mentioned polynucleotide molecule or vector under conditions conducive for the production of the polypeptide; and (b) recovering the polypeptide. In some embodiments, the strain is a bacterial strain; a fungal strain; or a filamentous fungal strain.

[0029] In some aspects, the present invention relates to a method for producing any one of the above mentioned polypeptides, the method comprising: (a) culturing the above mentioned recombinant host cell under conditions conducive for the production of the polypeptide; and (b) recovering the polypeptide.

[0030] In some aspects, the present invention relates to a method for preparing a food product, the method comprising incorporating any one of the above mentioned polypeptides during preparation of the food product. In some embodiments, the food product is a bakery product.

[0031] In some aspects, the present invention relates to the use of the above mentioned polypeptide for the preparation or processing of a food product. In some embodiments, the food product is a bakery product.

[0032] In some aspects, the present invention relates to the use of any one of the above mentioned polypeptides for the preparation or processing of a food product. In some embodiments, the food product is a bakery product.

[0033] In some aspects, the present invention relates to the above mentioned polypeptide for use in the preparation or processing of a food product. In some embodiments, the food product is a bakery product.

[0034] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for the preparation of animal feed. In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for increasing digestion or absorption of animal feed. In some aspects, the present invention relates to any one of the above mentioned polypeptides for use in the preparation of animal feed, or for increasing digestion or absorption of animal feed. In some embodiments, the animal feed is a cereal-based feed.

[0035] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for the production or processing of kraft pulp or paper. In some aspects the present invention relates to any one of the above mentioned polypeptides for the production or processing of kraft pulp or paper. In some embodiments, the processing comprises prebleaching and/or de-inking.

[0036] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for processing lignin. In some aspects the present invention relates to any one of the above mentioned polypeptides for processing lignin.

[0037] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for producing ethanol. In some aspects the present invention relates to any one of the above mentioned polypeptides for producing ethanol.

[0038] In some embodiments, the above mentioned uses are in conjunction with cellulose or a cellulase.

[0039] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for treating textiles or dyed textiles. In some aspects the present invention relates to any one of the above mentioned polypeptides for treating textiles or dyed textiles.

[0040] In some aspects the present invention relates to the use of any one of the above mentioned polypeptides for degrading biomass or pretreated biomass. In some aspects the present invention relates to any one of the above mentioned polypeptides for degrading biomass or pretreated biomass.

[0041] In some embodiments, the present invention relates to proteins and/or enzymes that are thermostable. In some embodiments, a polypeptide of the present invention retains a level of its function and/or protein activity at about 50°C, about 55°C, about 60°C, about 65°C, about 70°C, about 75°C, about 80°C, or about 95°C. In some embodiments, a polypeptide of the present invention retains a level of its function and/or protein activity between about 50°C and about 95°C, between about 50°C and about 90°C, between about 50°C and about 85°C, between about 50°C and about 80°C, between about 50°C and about 75°C, between about 50°C and about 70°C, or between about 50°C and about 65°C. In some embodiments, a polypeptide of the present invention has optimal or maximal function and/or protein activity greater than 50°C, greater than 55°C, greater than 60°C, greater than 65°C, or greater than 70°C. In some embodiments, a polypeptide of the present invention has optimal or maximal function and/or protein activity between about 50°C and about 95°C, between about 50°C and about 90°C, between about 50°C and about 85°C, between about 50°C and about 80°C, between about 50°C and about 75°C, between about 50°C and about 70°C, or between about 50°C and about 65°C.

[0042] Unless defined otherwise, the scientific and technological terms and nomenclature used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which this invention pertains. Commonly understood definitions of molecular biology terms can be found for example in Dictionary of Microbiology and Molecular Biology, 2nd ed. (Singleton et al., 1994, John Wiley & Sons, New York, NY) or The Harper Collins Dictionary of Biology (Hale & Marham, 1991, Harper Perennial, New York, NY), Rieger et al., Glossary of genetics: Classical and molecular, 5th edition, Springer-Verlag, New-York, 1991; Alberts et al., Molecular Biology of the Cell, 4th edition, Garland science, New-York, 2002; and, Lewin, Genes VII, Oxford University Press, New-York, 2000. Generally, the procedures of molecular biology methods and the like are common methods used in the art. Such standard techniques can be found in reference manuals such as for example Sambrook et al., (2000, Molecular Cloning - A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratories); and Ausubel et al., (1994, Current Protocols in Molecular Biology, John Wiley & Sons, New-York).

[0043] Further objects and advantages of the present invention will be clear from the description that follows.

Definitions

[0044] Headings, and other identifiers, e.g., (a), (b), (i), (ii), etc., are presented merely for ease of reading the specification and claims. The use of headings or other identifiers in the specification or claims does not necessarily require the steps or elements be performed in alphabetical or numerical order or the order in which they are presented.

[0045] In the present description, a number of terms are extensively utilized. In order to provide a clear and consistent understanding of the specification and claims, including the scope to be given such terms, the following definitions are provided.

[0046] Nucleotide sequences are presented herein by single strand, in the 5' to 3' direction, from left to right, using the one-letter nucleotide symbols as commonly used in the art and in accordance with the recommendations of the IUPAC-IUB Biochemical Nomenclature Commission.

[0047] The use of the word "a" or "an" when used in conjunction with the term "comprising" in the claims and/or the specification may mean "one" but it is also consistent with the meaning of "one or more", "at least one", and "one or more than one".

[0048] As used in the specification and claims, the words "comprising" (and any form of comprising, such as "comprise" and "comprises"), "having" (and any form of having, such as "have" and "has"), "including" (and any form of including, such as "includes" and "include") or "containing" (and any form of containing, such as "contains" and "contain") are inclusive or open-ended and do not exclude additional, un-recited elements or method steps.

[0049] The term "about" is used to indicate that a value includes the standard deviation of error for the device or method being employed to determine the value. In general, the terminology "about" is meant to designate a possible variation of up to 10%. Therefore, a variation of 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10% of a value is included in the term "about".
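By way of non-limiting illustration only, the "up to 10%" variation described above can be sketched as a simple numeric check. The function name `within_about` and its `tolerance` parameter are hypothetical and are introduced here solely for illustration; the sketch assumes that the variation is taken as a fraction of the nominal value.

```python
def within_about(value: float, nominal: float, tolerance: float = 0.10) -> bool:
    """Return True if `value` differs from `nominal` by no more than
    `tolerance`, expressed as a fraction of the nominal value.

    Assumption for illustration: "about" is read as a symmetric window
    of up to 10% around the nominal value.
    """
    return abs(value - nominal) <= tolerance * abs(nominal)


if __name__ == "__main__":
    # A measured value of 54 falls within "about 50" under a 10% window.
    print(within_about(54.0, 50.0))
    # A measured value of 56 does not.
    print(within_about(56.0, 50.0))
```

A narrower window (e.g., 5%) may be expressed by passing a smaller `tolerance`.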

[0050] The term "DNA" or "RNA" molecule or sequence (as well as sometimes the term "oligonucleotide") refers to a molecule comprised generally of the deoxyribonucleotides adenine (A), guanine (G), thymine (T) and/or cytosine (C). In "RNA", T is replaced by uracil (U).

[0051] The present description refers to a number of routinely used recombinant DNA (rDNA) technology terms. Nevertheless, definitions of selected examples of such rDNA terms are provided for clarity and consistency.

[0052] As used herein, "polynucleotide" or "nucleic acid molecule" refers to a polymer of nucleotides and includes DNA (e.g., genomic DNA, cDNA), RNA molecules (e.g., mRNA), and chimeras thereof. The nucleic acid molecule can be obtained by cloning techniques or synthesized. DNA can be double-stranded or single-stranded (coding strand or non-coding strand [antisense]). Conventional deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) are included in the terms "nucleic acid molecule" and "polynucleotide" as are analogs thereof (e.g., generated using nucleotide analogs, e.g., inosine or phosphorothioate nucleotides). Such nucleotide analogs can be used, for example, to prepare polynucleotides that have altered base-pairing abilities or increased resistance to nucleases. A nucleic acid backbone may comprise a variety of linkages known in the art, including one or more of sugar-phosphodiester linkages, peptide-nucleic acid bonds (referred to as "peptide nucleic acids" (PNA); Hyldig-Nielsen et al., PCT Int'l Pub. No. WO 95/32305), phosphorothioate linkages, methylphosphonate linkages or combinations thereof. Sugar moieties of the nucleic acid may be ribose or deoxyribose, or similar compounds having known substitutions, e.g., 2' methoxy substitutions (containing a 2'-O-methylribofuranosyl moiety; see PCT No. WO 98/02582) and/or 2' halide substitutions. Nitrogenous bases may be conventional bases (A, G, C, T, U), known analogs thereof (e.g., inosine or others; see "The Biochemistry of the Nucleic Acids 5-36", Adams et al., ed., 11th ed., 1992), or known derivatives of purine or pyrimidine bases (see, Cook, PCT Int'l Pub. No. WO 93/13121) or "abasic" residues in which the backbone includes no nitrogenous base for one or more residues (Arnold et al., U.S. Pat. No. 5,585,481).
A nucleic acid may comprise only conventional sugars, bases and linkages, as found in RNA and DNA, or may include both conventional components and substitutions (e.g., conventional bases linked via a methoxy backbone, or a nucleic acid including conventional bases and one or more base analogs).

[0053] An "isolated nucleic acid molecule", as is generally understood and used herein, refers to a polymer of nucleotides, and includes, but should not be limited to, DNA and RNA. The "isolated" nucleic acid molecule is purified from its natural in vivo state, obtained by cloning or chemically synthesized.

[0054] As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules which may be isolated from chromosomal DNA, and very often include an open reading frame encoding a protein, e.g., polypeptides of the present invention. A gene may include coding sequences, non-coding sequences, introns and regulatory sequences, as well known.

[0055] "Amplification" refers to any in vitro procedure for obtaining multiple copies ("amplicons") of a target nucleic acid sequence or its complement or fragments thereof. In vitro amplification refers to production of an amplified nucleic acid that may contain less than the complete target region sequence or its complement. In vitro amplification methods include, e.g., transcription-mediated amplification, replicase-mediated amplification, polymerase chain reaction (PCR) amplification, ligase chain reaction (LCR) amplification and strand-displacement amplification (SDA including multiple strand-displacement amplification method (MSDA)). Replicase-mediated amplification uses self-replicating RNA molecules, and a replicase such as Qβ-replicase (e.g., Kramer et al., U.S. Pat. No. 4,786,600). PCR amplification is well known and uses DNA polymerase, primers and thermal cycling to synthesize multiple copies of the two complementary strands of DNA or cDNA (e.g., Mullis et al., U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,800,159). LCR amplification uses at least four separate oligonucleotides to amplify a target and its complementary strand by using multiple cycles of hybridization, ligation, and denaturation (e.g., EP Pat. App. Pub. No. 0320308). SDA is a method in which a primer contains a recognition site for a restriction endonuclease that permits the endonuclease to nick one strand of a hemimodified DNA duplex that includes the target sequence, followed by amplification in a series of primer extension and strand displacement steps (e.g., Walker et al., U.S. Pat. No. 5,422,252). Two other known strand-displacement amplification methods do not require endonuclease nicking (Dattagupta et al., U.S. Patent No. 6,087,133 and U.S. Patent No. 6,124,120 (MSDA)).
Those skilled in the art will understand that the oligonucleotide primer sequences of the present invention may be readily used in any in vitro amplification method based on primer extension by a polymerase (e.g., see Kwoh et al., 1990, Am. Biotechnol. Lab. 8:14-25 and Kwoh et al., 1989, Proc. Natl. Acad. Sci. USA 86:1173-1177; Lizardi et al., 1988, BioTechnology 6:1197-1202; Malek et al., 1994, Methods Mol. Biol., 28:253-260; and Sambrook et al., 2000, Molecular Cloning - A Laboratory Manual, Third Edition, CSH Laboratories). As commonly known in the art, the oligos are designed to bind to a complementary sequence under selected conditions. The terminology "amplification pair" or "primer pair" refers herein to a pair of oligonucleotides (oligos) of the present invention, which are selected to be used together in amplifying a selected nucleic acid sequence by one of a number of types of amplification processes.

[0056] As used herein, the terms "hybridizing" and "hybridizes" are intended to describe conditions for hybridization and washing under which nucleotide sequences at least about 60%, at least about 70%, at least about 80%, more preferably at least about 85%, even more preferably at least about 90%, more preferably at least 95%, more preferably at least 98% or more preferably at least 99% homologous to each other typically remain hybridized to each other. A preferred, non-limiting example of such hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 1X SSC, 0.1% SDS at 50°C, preferably at 55°C, preferably at 60°C and even more preferably at 65°C. Highly stringent conditions include, for example, hybridizing at 68°C in 5x SSC/5x Denhardt's solution/1.0% SDS and washing in 0.2x SSC/0.1% SDS at room temperature. Alternatively, washing may be performed at 42°C. The skilled artisan will know which conditions to apply for stringent and highly stringent hybridization conditions. Additional guidance regarding such conditions is readily available in the art, for example, in Sambrook et al., supra; and Ausubel et al., supra (eds.), 1995, Current Protocols in Molecular Biology, (John Wiley & Sons, N.Y.). Of course, a polynucleotide which hybridizes only to a poly(A) sequence (such as the 3' terminal poly(A) tract of mRNAs), or to a complementary stretch of T (or U) residues, would not be included in a polynucleotide of the invention used to specifically hybridize to a portion of a nucleic acid of the invention, since such a polynucleotide would hybridize to any nucleic acid molecule containing a poly(A) stretch or the complement thereof (e.g., practically any double-stranded cDNA clone).

[0057] The terms "identity" and "percent identity" are used interchangeably herein. 
For the purpose of this invention, it is defined here that in order to determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino acid or nucleic acid sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = number of identical positions/total number of positions (i.e., overlapping positions) x 100). Preferably, the two sequences are the same length. Thus, in accordance with the present invention, the term "identical" or "percent identity" in the context of two or more nucleic acid or amino acid sequences, refers to two or more sequences or subsequences that are the same, or that have a specified percentage of amino acid residues or nucleotides that are the same (e.g., 60% or 65% identity, preferably, 70-95% identity, more preferably at least 95% identity), when compared and aligned for maximum correspondence over a window of comparison, or over a designated region as measured using a sequence comparison algorithm as known in the art, or by manual alignment and visual inspection. Sequences having, for example, 60% to 95% or greater sequence identity are considered to be substantially identical. Such a definition also applies to the complement of a test sequence. 
Preferably, the described identity exists over a region that is at least about 15 to 25 amino acids or nucleotides in length, more preferably, over a region that is about 50 to 100 amino acids or nucleotides in length. Those having skill in the art will know how to determine percent identity between/among sequences using, for example, algorithms such as those based on CLUSTALW computer program (Thompson Nucl. Acids Res. 22 (1994), 4673-4680) or FASTDB (Brutlag Comp. App. Biosci. 6 (1990), 237-245), as known in the art. Although the FASTDB algorithm typically does not consider internal non-matching deletions or additions in sequences, i.e., gaps, in its calculation, this can be corrected manually to avoid an overestimation of the % identity. CLUSTALW, however, does take sequence gaps into account in its identity calculations. Also available to those having skill in this art are the BLAST and BLAST 2.0 algorithms (Altschul Nucl. Acids Res. 25 (1997), 3389-3402). The BLASTN program for nucleic acid sequences uses as defaults a word length (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, and an expectation (E) of 10. The BLOSUM62 scoring matrix (Henikoff, Proc. Natl. Acad. Sci. USA, 89, (1989), 10915) uses alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands. Moreover, the present invention also relates to nucleic acid molecules the sequence of which is degenerate in comparison with the sequence of an above-described hybridizing molecule. When used in accordance with the present invention the term "being degenerate as a result of the genetic code" means that due to the redundancy of the genetic code different nucleotide sequences code for the same amino acid. 
The present invention also relates to nucleic acid molecules which comprise one or more mutations or deletions, and to nucleic acid molecules which hybridize to one of the herein described nucleic acid molecules, which show (a) mutation(s) or (a) deletion(s). The skilled person will appreciate that all these different algorithms or programs will yield slightly different results but that the overall percentage identity of two sequences is not significantly altered when using different algorithms.
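By way of non-limiting illustration only, the percent identity calculation described above (number of identical positions divided by the total number of overlapping positions, multiplied by 100) may be sketched as follows. The function name `percent_identity` is hypothetical. The sketch assumes that the two sequences have already been aligned, that gaps are written as "-", and that "overlapping positions" are the alignment columns in which neither sequence has a gap; other conventions for counting positions are possible and would yield somewhat different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption for illustration: only columns where both sequences carry
    a residue (no gap) are counted as overlapping positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Keep only the columns in which neither sequence has a gap.
    overlapping = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap
    ]
    if not overlapping:
        return 0.0

    # Count the columns in which both sequences carry the same residue.
    identical = sum(1 for x, y in overlapping if x == y)
    return 100.0 * identical / len(overlapping)


if __name__ == "__main__":
    # Five overlapping columns, four of which are identical.
    print(percent_identity("ACGT-A", "ACCTGA"))  # 80.0
```

In this example the fifth column contains a gap and is therefore excluded, leaving four identical positions out of five overlapping positions.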

[0058] In a related manner, the terms "homology" or "percent homology" refer to a similarity between two polypeptide sequences, but take into account changes between amino acids (whether conservative or not). As well known in the art, amino acids can be classified by charge, hydrophobicity, size, etc. It is also well known in the art that amino acid changes can be conservative (e.g., they do not significantly affect, or do not affect at all, the function of the protein). A multitude of conservative changes are known in the art, e.g., serine for threonine, isoleucine for leucine, arginine for lysine, etc. Thus the term homology introduces evolutionary notions (e.g., pressure from evolution to retain the function of essential or important regions of a sequence, while enabling a certain drift of less important regions).

[0059] The skilled person will be aware of the fact that several different computer programs are available to determine the homology between two sequences. For instance, a comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48): 444-453 (1970)) algorithm which has been incorporated into the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. The skilled person will appreciate that all these different parameters will yield slightly different results but that the overall percentage identity of two sequences is not significantly altered when using different algorithms.
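By way of non-limiting illustration only, a global alignment of the Needleman and Wunsch type may be sketched as follows. This sketch is not the GAP program and does not reproduce its parameters: it assumes a simple match/mismatch score in place of a BLOSUM62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. The function name `needleman_wunsch` and its parameter values are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2, gap_char: str = "-"):
    """Global alignment by dynamic programming with a linear gap penalty.

    Returns (aligned_a, aligned_b, score). Scoring values are
    illustrative assumptions, not those of any particular program.
    """
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(gap_char)
            i -= 1
        else:
            out_a.append(gap_char)
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))  # ('ACGT', 'A-GT', 1)
```

The aligned strings returned by such a routine can then be compared position by position to obtain a percent identity. As noted above, different scoring parameters may yield slightly different alignments.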

[0060] In yet another embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent identity between two amino acid or nucleotide sequences is determined using the algorithm of E. Meyers and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the ALIGN program (version 2.0) (available at the ALIGN Query using sequence data of the Genestream server IGH Montpellier France http://vega.igh.cnrs.fr/bin/align-guess.cgi) using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

[0061] The nucleic acid and protein sequences of the present invention can further be used as a "query sequence" to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al., (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12 to obtain nucleotide sequences homologous to nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score = 50, wordlength = 3 to obtain amino acid sequences homologous to protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17): 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See the homepage of the National Center for Biotechnology Information at http://www.ncbi.nlm.nih.gov/.

[0062] As used herein, the expressions "corresponding to", "corresponding to the positions", and "at a position or positions corresponding to", and grammatical variations thereof, refer to one or more nucleotide or amino acid positions that are determined to correspond to one another based on sequence and/or structural alignments with a specified reference gene sequence, coding sequence, or protein. For example, a position "corresponding to" an amino acid position of a given protein can be determined empirically by aligning the sequence of amino acids of that given protein with that of a polypeptide of interest that shares a level of sequence identity therewith. Corresponding positions can be determined by comparing and aligning sequences to maximize the number of matching nucleotides or residues, for example, such that identity between the sequences is greater than 95%, 96%, 97%, 98% or 99% or more. Corresponding positions also can be based on structural alignments, for example by using computer simulated alignments of protein structure. Recitation that amino acids of a polypeptide correspond to amino acids in a disclosed sequence refers to amino acids identified upon alignment of the polypeptide with the disclosed sequence to maximize identity or homology (where conserved amino acids are aligned) using a standard alignment algorithm, such as the GAP algorithm.
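By way of non-limiting illustration only, the determination of corresponding positions from a sequence alignment may be sketched as follows. The function name `corresponding_positions` is hypothetical. The sketch assumes that the reference sequence and the polypeptide of interest have already been aligned (for example, by a global alignment algorithm), that gaps are written as "-", and that positions are numbered from 1.

```python
def corresponding_positions(aligned_ref: str, aligned_query: str,
                            gap: str = "-") -> dict:
    """Map 1-based positions of a reference sequence to the 1-based
    positions of a query sequence that occupy the same alignment column.

    Reference positions aligned to a gap in the query have no
    corresponding position and are omitted from the mapping.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")

    mapping = {}
    ref_pos = 0
    query_pos = 0
    for r, q in zip(aligned_ref, aligned_query):
        # Advance each counter only when its sequence carries a residue.
        if r != gap:
            ref_pos += 1
        if q != gap:
            query_pos += 1
        if r != gap and q != gap:
            mapping[ref_pos] = query_pos
    return mapping


if __name__ == "__main__":
    # The query carries one extra residue (A) relative to the reference.
    print(corresponding_positions("MK-LV", "MKALV"))  # {1: 1, 2: 2, 3: 4, 4: 5}
```

In this example, position 3 of the reference corresponds to position 4 of the query because the query contains an inserted residue.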

[0063] By "sufficiently complementary" is meant a contiguous nucleic acid base sequence that is capable of hybridizing to another sequence by hydrogen bonding between a series of complementary bases. Complementary base sequences may be complementary at each position in sequence by using standard base pairing (e.g., G:C, A:T or A:U pairing) or may contain one or more residues (including abasic residues) that are not complementary by using standard base pairing, but which allow the entire sequence to specifically hybridize with another base sequence in appropriate hybridization conditions. Contiguous bases of an oligomer are preferably at least about 80% (81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100%), more preferably at least about 90% complementary to the sequence to which the oligomer specifically hybridizes. Appropriate hybridization conditions are well known to those skilled in the art, can be predicted readily based on sequence composition and conditions, or can be determined empirically by using routine testing (see Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd ed. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989) at §§ 1.90-1.91, 7.37-7.57, 9.47-9.51 and 11.47-11.57, particularly at §§ 9.50-9.51, 11.12-11.13, 11.45-11.47 and 11.55-11.57).

[0064] The present invention refers to a number of units or percentages that are often listed in sequences. For example, when referring to "at least 80%, at least 85%, at least 90%...", or "at least about 80%, at least about 85%, at least about 90%...", every single unit is not listed, for the sake of brevity. For example, some units (e.g., 81, 82, 83, 84, 85,... 91, 92%...) may not have been specifically recited but are considered encompassed by the present invention. Such unlisted units should thus be considered as within the scope of the present invention.

[0065] Nucleic acid sequences may be detected by using hybridization with a complementary sequence (e.g., oligonucleotide probes) (see U.S. Patent Nos. 5,503,980 (Cantor), 5,202,231 (Drmanac et al.), 5,149,625 (Church et al.), 5,112,736 (Caldwell et al.), 5,068,176 (Vijg et al.), and 5,002,867 (Macevicz)). Hybridization detection methods may use an array of probes (e.g., on a DNA chip) to provide sequence information about the target nucleic acid which selectively hybridizes to an exactly complementary probe sequence in a set of four related probe sequences that differ by one nucleotide (see U.S. Patent Nos. 5,837,832 and 5,861,242 (Chee et al.)).

[0066] A detection step may use any of a variety of known methods to detect the presence of nucleic acid by hybridization to an oligonucleotide probe. The types of detection methods in which probes can be used include Southern blots (DNA detection), dot or slot blots (DNA, RNA), and Northern blots (RNA detection). Labeled proteins could also be used to detect a particular nucleic acid sequence to which they bind (e.g., protein detection by far western technology: Guichet et al., 1997, Nature 385(6616): 548-552; and Schwartz et al., 2001, EMBO 20(3): 510-519). Other detection methods include kits containing reagents of the present invention on a dipstick setup and the like. Of course, it might be preferable to use a detection method which is amenable to automation. A non-limiting example thereof includes a chip or other support comprising one or more (e.g., an array) of different probes.

[0067] A "label" refers to a molecular moiety or compound that can be detected or can lead to a detectable signal. A label is joined, directly or indirectly, to a nucleic acid probe or the nucleic acid to be detected (e.g., an amplified sequence) or to a polypeptide to be detected. Direct labeling can occur through bonds or interactions that link the label to the polynucleotide or polypeptide (e.g., covalent bonds or non-covalent interactions), whereas indirect labeling can occur through the use of a "linker" or bridging moiety, such as additional nucleotides, amino acids or other chemical groups, which are either directly or indirectly labeled. Bridging moieties may amplify a detectable signal. Labels can include any detectable moiety (e.g., a radionuclide, ligand such as biotin or avidin, enzyme or enzyme substrate, reactive group, chromophore such as a dye or colored particle, luminescent compound including a bioluminescent, phosphorescent or chemiluminescent compound, and fluorescent compound).

[0068] As used herein, "expression" means the process by which a gene or other nucleic acid sequence eventually produces a polypeptide. It involves transcription of the gene into mRNA, and the translation of such mRNA into polypeptide(s).

[0069] The terms "peptide" and "oligopeptide" are considered synonymous (as is commonly recognized) and each term can be used interchangeably as the context requires to indicate a chain of at least two amino acids coupled by peptidyl linkages. The word "polypeptide" is used herein for chains containing more than seven amino acid residues. All oligopeptide and polypeptide formulas or sequences herein are written from left to right and in the direction from amino terminus to carboxyl terminus. The one-letter code of amino acids used herein is commonly known in the art and can be found in Sambrook, et al., supra. Sequence listing programs can easily convert an amino acid sequence in this one-letter code into a three-letter code.
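By way of non-limiting illustration only, the conversion of a one-letter amino acid sequence into a three-letter code may be sketched as follows. The names `THREE_LETTER` and `to_three_letter` are hypothetical; the sketch covers only the twenty standard amino acids and, as an assumption made for illustration, rejects any other symbol.

```python
# Standard one-letter to three-letter amino acid codes (twenty standard residues).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence: str, sep: str = "-") -> str:
    """Convert a one-letter amino acid sequence, written from amino
    terminus to carboxyl terminus, into three-letter codes joined by `sep`."""
    codes = []
    for residue in sequence.upper():
        if residue not in THREE_LETTER:
            raise ValueError(f"unsupported residue symbol: {residue!r}")
        codes.append(THREE_LETTER[residue])
    return sep.join(codes)


if __name__ == "__main__":
    print(to_three_letter("MKV"))  # Met-Lys-Val
```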

[0070] The phrase "mature polypeptide" is defined herein as a polypeptide of the present invention having biological activity that is in its final form, following translation and any post-translational modifications, such as N-terminal processing, C-terminal truncation, removal of signal sequences, glycosylation, phosphorylation, etc. In one embodiment, polypeptides of the present invention comprise mature forms of any one of the polypeptides disclosed herein. Mature polypeptides of the present invention can be predicted using programs such as SignalP. The phrase "mature polypeptide coding sequence" is defined herein as a nucleotide sequence that encodes a mature polypeptide as defined above. As well known, some nucleotide sequences are non-coding.

[0071] As used herein, the term "purified" or "isolated" refers to a molecule (e.g., polynucleotide or polypeptide) having been separated from a component of the composition in which it was originally present. Thus, for example, an "isolated polynucleotide" or "isolated polypeptide" has been purified to a level not found in nature. A "substantially pure" molecule is a molecule that is lacking in most other components (e.g., 30, 40, 50, 60, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100% free of contaminants). By opposition, the term "crude" means molecules that have not been separated from the components of the original composition in which they were present. For the sake of brevity, the units (e.g., 66, 67...81, 82, 83, 84, 85,...91, 92%...) have not been specifically recited but are considered nevertheless within the scope of the present invention.

[0072] An "isolated polynucleotide" or "isolated nucleic acid molecule" is a nucleic acid molecule (DNA or RNA) that is not immediately contiguous with both of the coding sequences with which it is immediately contiguous (one on the 5' end and one on the 3' end) in the naturally occurring genome of the organism from which it is derived. Thus, in one embodiment, an isolated nucleic acid includes some or all of the 5' non-coding (e.g., promoter) sequences that are immediately contiguous to the coding sequence. The term therefore includes, for example, a recombinant DNA that is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other sequences. It also includes a recombinant DNA that is part of a hybrid gene encoding an additional polypeptide that is substantially free of cellular material, viral material, or culture medium (when produced by recombinant DNA techniques), or chemical precursors or other chemicals (when chemically synthesized). Moreover, an "isolated nucleic acid fragment" is a nucleic acid fragment that is not naturally occurring as a fragment and would not be found in the natural state.

[0073] As used herein, an "isolated polypeptide" or "isolated protein" is intended to include a polypeptide or protein removed from its native environment. For example, recombinantly produced polypeptides and proteins expressed in host cells are considered isolated for the purpose of the invention, as are native or recombinant polypeptides which have been substantially purified by any suitable technique such as, for example, the single-step purification method disclosed in Smith and Johnson, Gene 67:31-40 (1988).

[0074] The term "variant" refers herein to a polypeptide, which is substantially similar in structure (e.g., amino acid sequence) to a polypeptide disclosed herein or encoded by a nucleic acid sequence disclosed herein without being identical thereto. Thus, two molecules can be considered as variants even though their primary, secondary, tertiary or quaternary structures are not identical. A variant can comprise an insertion, substitution, or deletion of one or more amino acids as compared to its corresponding native protein. A variant can comprise additional modifications (e.g., post-translational modifications such as acetylation, phosphorylation, glycosylation, sulfation, sumoylation, prenylation, ubiquitination, etc). As used herein, the term "functional variant" is intended to include a variant which is sufficiently similar in both structure and function to a polypeptide disclosed herein or encoded by a nucleic acid sequence disclosed herein, to maintain at least one of its native biological activities.

[0075] As used herein, the term "biomass" refers to any cellulosic or lignocellulosic material and includes materials comprising cellulose, and optionally further comprising hemicellulose, lignin, starch, oligosaccharides and/or monosaccharides. Biomass may also comprise additional components, such as protein and/or lipid. Biomass may be derived from a single source, or biomass can comprise a mixture derived from more than one source; for example, biomass could comprise a mixture of corn cobs and corn stover, or a mixture of grass and leaves. Biomass includes, but is not limited to, bioenergy crops, agricultural residues, municipal solid waste, industrial solid waste, sludge from paper manufacture, yard waste, wood and forestry waste or a combination thereof. 
Examples of biomass include, but are not limited to, corn grain, corn cobs, crop residues such as corn husks, corn stover, grasses, wheat, wheat straw, barley, barley straw, hay, rice straw, switchgrass, waste paper, sugar cane bagasse, sorghum, soy, components obtained from milling of grains, trees, branches, roots, leaves, wood chips, sawdust, shrubs and bushes, vegetables, fruits, flowers, and animal manure or a combination thereof. Biomass that is useful for the invention may include biomass that has a relatively high carbohydrate value, is relatively dense, and/or is relatively easy to collect, transport, store and/or handle. In one embodiment of the present invention, biomass that is useful includes corn cobs, corn stover, sawdust, and sugar cane bagasse.

[0076] As used herein, the terms "cellulosic" or "cellulose-containing material" refer to a composition comprising cellulose. As used herein, the term "lignocellulosic" refers to a composition comprising both lignin and cellulose. Lignocellulosic material may also comprise hemicellulose. The predominant polysaccharide in the primary cell wall of biomass is cellulose, the second most abundant is hemicellulose, and the third is pectin. The secondary cell wall, produced after the cell has stopped growing, also contains polysaccharides and is strengthened by polymeric lignin covalently cross-linked to hemicellulose. Cellulose is a homopolymer of anhydrocellobiose and thus a linear beta-(1-4)-D-glucan, while hemicelluloses include a variety of compounds, such as xylans, xyloglucans, arabinoxylans, and mannans in complex branched structures with a spectrum of substituents. Although generally polymorphous, cellulose is found in plant tissue primarily as an insoluble crystalline matrix of parallel glucan chains. Hemicelluloses usually hydrogen bond to cellulose, as well as to other hemicelluloses, which helps stabilize the cell wall matrix.

[0077] Cellulose is generally found, for example, in the stems, leaves, hulls, husks, and cobs of plants or leaves, branches, and wood of trees. The cellulose-containing material can be, but is not limited to, herbaceous material, agricultural residues, forestry residues, municipal solid wastes, waste paper, and pulp and paper mill residues. The cellulose-containing material can be any type of biomass including, but not limited to, wood resources, municipal solid waste, wastepaper, crops, and crop residues (e.g., see Wiselogel et al., 1995, in Handbook on Bioethanol (Charles E. Wyman, editor), pp. 105-118, Taylor & Francis, Washington D.C.; Wyman. 1994. Bioresource Technology 50: 3-16; Lynd. 1990. Applied Biochemistry and Biotechnology 24/25: 695-719; Mosier et al., 1999, Recent Progress in Bioconversion of Lignocellulosics, in Advances in Biochemical Engineering/Biotechnology, T. Scheper, managing editor, Volume 65, pp. 23-40, Springer-Verlag, New York). It is understood herein that the cellulose may be in the form of lignocellulose, a plant cell wall material containing lignin, cellulose, and hemicellulose in a mixed matrix.

[0078] The phrase "cellulolytic enhancing activity" or "cellulolysis-enhancing" is defined herein as a biological activity which enhances the hydrolysis of a cellulose-containing material by proteins having cellulolytic activity. The term "cellulolytic activity" is defined herein as a biological activity which hydrolyzes a cellulose-containing material.

[0079] The phrase "lignocellulolytic enhancing activity" or "lignocellulolysis-enhancing" is defined herein as a biological activity which enhances the hydrolysis of a lignocellulose-containing material by proteins having lignocellulolytic activity. The term "lignocellulolytic activity" is defined herein as a biological activity which hydrolyzes a lignocellulose-containing material.

[0080] The term "thermostable", as used herein, refers to an enzyme that retains its function or protein activity at a temperature greater than 50°C; thus, a thermostable cellulose-degrading or cellulase-enhancing enzyme/protein retains the ability to degrade or enhance the degradation of cellulose at this elevated temperature. A protein or enzyme may have more than one enzymatic activity. For example, some polypeptides of the present invention exhibit bifunctional activities such as xylosidase/arabinosidase activity. Such bifunctional enzymes may exhibit thermostability with regard to one activity, but not another, and still be considered as "thermostable".
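By way of illustration only, the definition above can be sketched in Python as a check on an activity-temperature profile. The 50°C cutoff comes from the definition; the data layout (a mapping from temperature to relative activity) and the `min_relative_activity` parameter are assumptions introduced for this sketch, since the text does not state how much activity must be retained.

```python
THERMOSTABLE_ABOVE_C = 50.0  # cutoff taken from the definition in the text


def activity_is_thermostable(profile, min_relative_activity=0.5):
    """Return True if one activity is retained above 50 degrees C.

    profile: mapping of temperature (degrees C) to relative activity (0..1).
    min_relative_activity: hypothetical cutoff for "retains"; the text
    gives no numeric value, so 0.5 is only a placeholder.
    """
    return any(
        temp > THERMOSTABLE_ABOVE_C and activity >= min_relative_activity
        for temp, activity in profile.items()
    )


def enzyme_is_thermostable(profiles, min_relative_activity=0.5):
    """A bifunctional enzyme counts as thermostable if ANY of its
    activities is thermostable, as stated in the definition.

    profiles: mapping of activity name to an activity-temperature profile.
    """
    return any(
        activity_is_thermostable(profile, min_relative_activity)
        for profile in profiles.values()
    )


# Hypothetical bifunctional enzyme: only one activity survives at 60 degrees C.
example = {
    "xylosidase": {40: 1.0, 60: 0.8},
    "arabinosidase": {40: 1.0, 60: 0.1},
}
print(enzyme_is_thermostable(example))
```

The numbers in `example` are invented for the sketch and are not measurements from the figures.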

BRIEF DESCRIPTION OF DRAWINGS

[0081] In the appended drawings:

[0082] Figure 1 is a schematic map of the pGBFIN-49 expression plasmid.

[0083] Figure 2 shows protein activity-temperature profiles of various secreted proteins from Chaetomium thermophilum (CHATH).

[0084] Figures 3-6 show protein activity-temperature profiles of various secreted proteins from Thermomyces stellatus (THEST).

[0085] Figures 7-10 show protein activity-temperature profiles of various secreted proteins from Corynascus sepedonium (CORSE).

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

POLYPEPTIDES OF THE INVENTION

[0086] In one aspect, the present invention relates to isolated polypeptides secreted by Chaetomium thermophilum, Thermomyces stellatus, or Corynascus sepedonium (e.g., Chaetomium thermophilum strain ATCC 16451, Thermomyces stellatus strain CBS 241.64, or Corynascus sepedonium strain ATCC 9787) having an activity relating to the processing or degradation of biomass (e.g., cell wall deconstruction).

[0087] In another aspect, the present invention relates to isolated polypeptides comprising the amino acid sequences shown in any one of SEQ ID NOs: 357-534, 1031-1278, and 1851-2136.

[0088] In another aspect, the present invention relates to isolated polypeptides sharing a minimum threshold of amino acid sequence identity with any one of the above-mentioned polypeptides. In specific embodiments, the present invention relates to isolated polypeptides having at least 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to any one of the above-mentioned polypeptides. Other specific percentage units that have not been specifically recited here for brevity are nevertheless considered within the scope of the present invention.
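For illustration, a percent-identity threshold of the kind recited above could be checked as in the following Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and counts identity over ungapped alignment columns only; the alignment method and the counting convention are assumptions of this sketch, and other conventions give different percentages.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: only columns where neither sequence has a gap are counted.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=60.0):
    """True if the pair reaches the given minimum identity (e.g., 60%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment (hypothetical sequences, not taken from the sequence listing).
print(percent_identity("MKTA-Y", "MKSAQY"))
```

In the toy alignment, four of the five ungapped columns match.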

[0089] In another aspect, the present invention relates to a polypeptide encoded by a polynucleotide of the present invention, which includes genomic (e.g., SEQ ID NOs: 1-178, 535-782, and 1279-1564; in a further embodiment SEQ ID NOs: 1-160, 535-782, and 1279-1564), and coding (e.g., SEQ ID NOs: 179-356, 783-1030, and 1565-1850) nucleic acid sequences disclosed herein, polynucleotides hybridizing under medium-high, high, or very high stringency conditions with a full-length complement thereof, as well as polynucleotides sharing a certain degree of nucleic acid sequence identity therewith.

[0090] In another aspect, the present invention relates to a polypeptide comprising an amino acid sequence encoded by at least one exonic nucleic acid sequence of any one of the genomic sequences corresponding to SEQ ID NOs: 1-178, 535-782, and 1279-1564; in a further embodiment, encoded by at least one exonic nucleic acid sequence of any one of the genomic sequences corresponding to SEQ ID NOs: 1-160, 535-782, and 1279-1564 (e.g., the intron or exon segments defined by the exon boundaries listed in Tables 2A-2C) or a functional part thereof.

[0091] In another aspect, the present invention relates to functional variants of any one of the above-mentioned polypeptides. In another embodiment, the term "functional" or "biologically active" relates to the native enzymatic (e.g., catalytic) activity of a polypeptide of the present invention. In some embodiments, the present invention relates to a polypeptide comprising a biological activity of any one of the enzymes described below, or a polynucleotide encoding same.

[0092] "Carbohydrase" refers to any protein that catalyzes the hydrolysis of carbohydrates. "Glycoside hydrolase", "glycosyl hydrolase" or "glycosidase" refers to a protein that catalyzes the hydrolysis of the glycosidic bonds between carbohydrates or between a carbohydrate and a non-carbohydrate residue. Endoglucanases, cellobiohydrolases, beta-glucosidases, alpha-glucosidases, xylanases, beta-xylosidases, alpha-xylosidases, galactanases, alpha-galactosidases, beta-galactosidases, alpha-amylases, glucoamylases, endo-arabinases, arabinofuranosidases, mannanases, beta-mannosidases, pectinases, acetyl xylan esterases, acetyl mannan esterases, ferulic acid esterases, coumaric acid esterases, pectin methyl esterases, and chitosanases are examples of glycosidases.

[0093] "Cellulase" refers to a protein that catalyzes the hydrolysis of 1,4-D-glycosidic linkages in cellulose (such as bacterial cellulose, cotton, filter paper, phosphoric acid swollen cellulose, Avicel®); cellulose derivatives (such as carboxymethylcellulose and hydroxyethylcellulose); plant lignocellulosic materials, beta-D-glucans or xyloglucans. Cellulose is a linear beta-(1-4) glucan consisting of anhydrocellobiose units. Endoglucanases, cellobiohydrolases, and beta-glucosidases are examples of cellulases.

[0094] "Endoglucanase" refers to a protein that catalyzes the hydrolysis of cellulose to oligosaccharide chains at random locations by means of an endoglucanase activity.

[0095] "Cellobiohydrolase" refers to a protein that catalyzes the hydrolysis of cellulose to cellobiose via an exoglucanase activity, sequentially releasing molecules of cellobiose from the reducing or non-reducing ends of cellulose or cello-oligosaccharides. "Beta-glucosidase" refers to an enzyme that catalyzes the conversion of cellobiose and oligosaccharides to glucose.

[0096] "Hemicellulase" refers to a protein that catalyzes the hydrolysis of hemicellulose, such as that found in lignocellulosic materials. Hemicelluloses are complex polymers, and their composition often varies widely from organism to organism, and from one tissue type to another. Hemicelluloses include a variety of compounds, such as xylans, arabinoxylans, xyloglucans, mannans, glucomannans, and galacto(gluco)mannans. Hemicellulose can also contain glucan, which is a general term for beta-linked glucose residues. In general, a main component of hemicellulose is beta-1,4-linked xylose, a five carbon sugar. However, this xylose is often branched as beta-1,3 linkages or beta-1,2 linkages, and can be substituted with linkages to arabinose, galactose, mannose, glucuronic acid, or by esterification to acetic acid. Hemicellulolytic enzymes, i.e., hemicellulases, include both endo-acting and exo-acting enzymes, such as xylanases, beta-xylosidases, alpha-xylosidases, galactanases, alpha-galactosidases, beta-galactosidases, endo-arabinases, arabinofuranosidases, mannanases, and beta-mannosidases. Hemicellulases also include the accessory enzymes, such as acetylesterases, ferulic acid esterases, and coumaric acid esterases. Among these, xylanases and acetyl xylan esterases cleave the xylan and acetyl side chains of xylan, and the remaining xylo-oligomers are unsubstituted and can thus be hydrolysed with beta-xylosidase only. In addition, several less known side activities have been found in enzyme preparations which hydrolyze hemicellulose. Accordingly, xylanases, acetylesterases and beta-xylosidases are examples of hemicellulases.

[0097] "Xylanase" specifically refers to an enzyme that hydrolyzes the beta-1,4 bond in the xylan backbone, producing short xylooligosaccharides.

[0098] "Beta-mannanase" or "endo-1,4-beta-mannosidase" refers to a protein that hydrolyzes mannan-based hemicelluloses (mannan, glucomannan, galacto(gluco)mannan) and produces short beta-1,4-mannooligosaccharides.

[0099] "Mannan endo-1,6-alpha-mannosidase" refers to a protein that hydrolyzes 1,6-alpha-mannosidic linkages in unbranched 1,6-mannans.

[00100] "Beta-mannosidase" (beta-1,4-mannoside mannohydrolase; EC 3.2.1.25) refers to a protein that catalyzes the removal of beta-D-mannose residues from the non-reducing ends of oligosaccharides.

[00101] "Galactanase", "endo-beta-1,6-galactanase" or "arabinogalactan endo-1,4-beta-galactosidase" refers to a protein that catalyzes the hydrolysis of endo-1,4-beta-D-galactosidic linkages in arabinogalactans.

[00102] "Glucoamylase" refers to a protein that catalyzes the hydrolysis of terminal 1,4-linked-D-glucose residues successively from non-reducing ends of the glycosyl chains in starch with the release of beta-D-glucose.

[00103] "Beta-hexosaminidase" or "beta-N-acetylglucosaminidase" refers to a protein that catalyzes the hydrolysis of terminal N-acetyl-D-hexosamine residues in N-acetyl-beta-D-hexosamines.

[00104] "Alpha-L-arabinofuranosidase", "alpha-N-arabinofuranosidase", "alpha-arabinofuranosidase", "arabinosidase" or "arabinofuranosidase" refers to a protein that hydrolyzes arabinofuranosyl-containing hemicelluloses or pectins. Some of these enzymes remove arabinofuranoside residues from O-2 or O-3 single substituted xylose residues, as well as from O-2 and/or O-3 double substituted xylose residues. Some of these enzymes remove arabinose residues from arabinan oligomers.

[00105] "Endo-arabinase" refers to a protein that catalyzes the hydrolysis of 1,5-alpha-arabinofuranosidic linkages in 1,5-arabinans.

[00106] "Exo-arabinase" refers to a protein that catalyzes the hydrolysis of 1,5-alpha-linkages in 1,5-arabinans or 1,5-alpha-L arabino-oligosaccharides, releasing mainly arabinobiose, although a small amount of arabinotriose can also be liberated.

[00107] "Beta-xylosidase" refers to a protein that hydrolyzes short 1,4-beta-D-xylooligomers into xylose.

[00108] "Cellobiose dehydrogenase" refers to a protein that oxidizes cellobiose to cellobionolactone.

[00109] "Chitosanase" refers to a protein that catalyzes the endohydrolysis of beta-1,4-linkages between D-glucosamine residues in acetylated chitosan (i.e., deacetylated chitin).

[00110] "Exo-polygalacturonase" refers to a protein that catalyzes the hydrolysis of terminal alpha 1,4-linked galacturonic acid residues from non-reducing ends thus converting polygalacturonides to galacturonic acid.

[00111] "Acetyl xylan esterase" refers to a protein that catalyzes the removal of the acetyl groups from xylose residues. "Acetyl mannan esterase" refers to a protein that catalyzes the removal of the acetyl groups from mannose residues. "Ferulic esterase" or "ferulic acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and ferulic acid. "Coumaric acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and coumaric acid. Acetyl xylan esterases, ferulic acid esterases and pectin methyl esterases are examples of carbohydrate esterases.

[00112] "Pectate lyase" and "pectin lyases" refer to proteins that catalyze the cleavage of 1,4-alpha-D-galacturonan by beta-elimination acting on polymeric and/or oligosaccharide substrates (pectates and pectins, respectively).

[00113] "Endo-1,3-beta-glucanase" or "laminarinase" refers to a protein that catalyzes the cleavage of 1,3-linkages in beta-D-glucans such as laminarin or lichenin. Laminarin is a linear polysaccharide made up of beta-1,3-glucan with beta-1,6-linkages.

[00114] "Lichenase" refers to a protein that catalyzes the hydrolysis of lichenan, a linear, 1,3-1,4-beta-D glucan.

[00115] Rhamnogalacturonan is composed of alternating alpha-1,4-rhamnose and alpha-1,2-linked galacturonic acid, with side chains linked 1,4 to rhamnose. The side chains include Type I galactan, which is beta-1,4-linked galactose with alpha-1,3-linked arabinose substituents; Type II galactan, which is beta-1,3-1,6-linked galactoses (very branched) with arabinose substituents; and arabinan, which is alpha-1,5-linked arabinose with alpha-1,3-linked arabinose branches. The galacturonic acid substituents may be acetylated and/or methylated.

[00116] "Exo-rhamnogalacturonanase" refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin from the non-reducing end.

[00117] "Rhamnogalacturonan acetylesterase" refers to a protein that catalyzes the removal of the acetyl groups ester-linked to the highly branched rhamnogalacturonan (hairy) regions of pectin.

[00118] "Rhamnogalacturonan lyase" refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin via a beta-elimination mechanism (e.g., see Pages et al., J. Bacteriol., 185:4727-4733 (2003)).

[00119] "Alpha-rhamnosidase" refers to a protein that catalyzes the hydrolysis of terminal non-reducing alpha-L-rhamnose residues in alpha-L-rhamnosides.

[00120] Certain proteins of the present invention may be classified as "Family 61 glycosidases" based on homology of the polypeptides to CAZy Family GH61. Family 61 glycosidases may exhibit cellulolytic enhancing activity or endoglucanase activity. Additional information on the properties of Family 61 glycosidases may be found in U.S. Patent Application Publication Nos. 2005/0191736, 2006/0005279, 2007/0077630, and in PCT Publication No. WO 2004/031378.

[00121] "Esterases" represent a category of various enzymes including lipases, phospholipases, cutinases, and phytases that catalyze the hydrolysis and synthesis of ester bonds in compounds.

[00122] The International Union of Biochemistry and Molecular Biology has developed a nomenclature for enzymes in which each enzyme is described by a sequence of four numbers preceded by "EC". The first number broadly classifies the enzyme based on its mechanism. According to the naming conventions, enzymes are generally classified into six main family classes and many sub-family classes: EC 1 Oxidoreductases: catalyze oxidation/reduction reactions; EC 2 Transferases: transfer a functional group (e.g., a methyl or phosphate group); EC 3 Hydrolases: catalyze the hydrolysis of various bonds; EC 4 Lyases: cleave various bonds by means other than hydrolysis and oxidation; EC 5 Isomerases: catalyze isomerization changes within a single molecule; and EC 6 Ligases: join two molecules with covalent bonds. A number of bioinformatic tools are available to the skilled person to predict which main family class and sub-family class an enzyme molecule belongs to according to its sequence information. In some instances, certain enzymes (or families of enzymes) can be re-classified, for example, to take into account newly discovered enzyme functions or properties. Accordingly, the polypeptides/enzymes of the present invention are not meant to be limited to specific enzyme classes as they currently exist. The skilled person would know how to appropriately reclassify the enzymes of the present invention (and assign the appropriate functions thereto) based on the amino acid sequence information provided herein. Such reclassifications are thus within the scope of the present invention.
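As a minimal sketch of the four-number scheme just described, the following Python code parses an EC number and reports its main class. The function names are hypothetical, only the six classes named in the paragraph above are covered, and a "-" field (as in "EC 3.2.1.-") is treated as an unassigned level.

```python
# The six main classes listed in the text, keyed by the first EC number.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def parse_ec(ec):
    """Parse 'EC 3.2.1.4' or '3.2.1.-' into a 4-tuple; '-' becomes None."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = [part.strip() for part in text.split(".")]
    if len(parts) != 4:
        raise ValueError("an EC number has four fields: %r" % ec)
    fields = []
    for part in parts:
        if part == "-":
            fields.append(None)  # level not yet assigned
        elif part.isdigit():
            fields.append(int(part))
        else:
            raise ValueError("unexpected EC field %r in %r" % (part, ec))
    if fields[0] not in EC_CLASSES:
        raise ValueError("unknown main class in %r" % ec)
    return tuple(fields)


def ec_class_name(ec):
    """Name of the main class given by the first number."""
    return EC_CLASSES[parse_ec(ec)[0]]


print(ec_class_name("EC 3.2.1.4"))  # endoglucanase number cited in this document
```

Because enzymes may be reclassified, a lookup of this kind would need to be kept in step with the current nomenclature rather than treated as fixed.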

[00123] In some embodiments, the present invention relates to a polypeptide comprising a biological activity of any one of the enzymes (or sub-classes thereof), or a polynucleotide encoding same.

• Cellulose-hydrolyzing enzymes, including: endoglucanases (EC 3.2.1.4), which hydrolyze the beta-1,4-linkages between glucose units; exoglucanases (also known as cellobiohydrolases 1 and 2) (EC 3.2.1.91), which hydrolyze cellobiose, a glucose disaccharide, from the reducing and non-reducing ends of cellulose; and beta-glucosidases (EC 3.2.1.21), which hydrolyze the beta-1,4 glycoside bond of cellobiose to glucose;

• Proteins that enhance or accelerate the action of cellulose-degrading enzymes, including: glycoside hydrolase family 61 (GH61) proteins, recently reclassified as AA9 (e.g., polysaccharide monooxygenases), which enhance the action of cellulase enzymes on lignocellulose substrates;

• Enzymes that degrade or modify xylan and/or xylan-lignin complexes, including: xylanases, such as endo-1,4-beta-xylanase (EC 3.2.1.8), which catalyze the endohydrolysis of 1,4-beta-D-xylosidic linkages in xylans (or xyloglucans); xylosidases, such as xylan 1,4-beta-xylosidases (EC 3.2.1.37), which catalyze hydrolysis of 1,4-beta-D-xylans to remove successive D-xylose residues from the non-reducing terminals, and also cleave xylobiose; arabinosidases, such as alpha-arabinofuranosidases (EC 3.2.1.55), which hydrolyze terminal non-reducing alpha-L-arabinofuranoside residues in alpha-L-arabinosides (including arabinoxylans and arabinogalactans); alpha-glucuronidases (EC 3.2.1.139), which hydrolyze an alpha-D-glucuronoside to the corresponding alcohol and D-glucuronate; feruloyl esterases (EC 3.1.1.73), which catalyze hydrolysis of the 4-hydroxy-3-methoxycinnamoyl (feruloyl) group from an esterified sugar (which is usually arabinose in natural substrates); and acetylxylan esterases (EC 3.1.1.72), which catalyze deacetylation of xylans and xylo-oligosaccharides;

• Enzymes that degrade or modify mannan, including: mannanases, such as mannan endo-1,4-beta-mannosidase (EC 3.2.1.78), which catalyze random hydrolysis of 1,4-beta-D-mannosidic linkages in mannans, galactomannans and glucomannans; mannosidases (EC 3.2.1.25), which hydrolyze terminal, non-reducing beta-D-mannose residues in beta-D-mannosides; alpha-galactosidases (EC 3.2.1.22), which hydrolyze terminal, non-reducing alpha-D-galactose residues in alpha-D-galactosides (including galactose oligosaccharides, galactomannans and galactolipids); and mannan acetyl esterases;

• Enzymes that degrade or modify xyloglucans, including: xyloglucanases such as xyloglucan-specific endo-beta-1,4-glucanase (EC 3.2.1.151), which catalyzes endohydrolysis of 1,4-beta-D-glucosidic linkages in xyloglucan; xyloglucan-specific exo-beta-1,4-glucanase (EC 3.2.1.155), which catalyzes exohydrolysis of 1,4-beta-D-glucosidic linkages in xyloglucan; and endoglucanases/cellulases;

• Enzymes that degrade or modify glucans, including: enzymes that degrade beta-1,4-glucan, such as endoglucanases, cellobiohydrolases, and beta-glucosidases; and enzymes that degrade beta-1,3-1,4-glucan, such as endo-beta-1,3(4)-glucanases (EC 3.2.1.6), which catalyze endohydrolysis of 1,3- or 1,4-linkages in beta-D-glucans when the glucose residue whose reducing group is involved in the linkage to be hydrolyzed is itself substituted at C-3; endoglucanases (beta-glucanase, cellulase); and beta-glucosidases;

• Enzymes that degrade or modify galactans, including: galactanases (EC 3.2.1.23), which hydrolyze terminal non-reducing beta-D-galactose residues in beta-D-galactosides;

• Enzymes that degrade or modify arabinans, including: arabinanases (EC 3.2.1.99), which catalyze endohydrolysis of 1,5-alpha-arabinofuranosidic linkages in 1,5-arabinans;

• Enzymes that degrade or modify starch, including: amylases, such as alpha-amylases (EC 3.2.1.1), which catalyze endohydrolysis of 1,4-alpha-D-glucosidic linkages in polysaccharides containing three or more 1,4-alpha-linked D-glucose units; and glucosidases, such as alpha-glucosidases (EC 3.2.1.20), which hydrolyze terminal, non-reducing 1,4-linked alpha-D-glucose residues with release of alpha-D-glucose;

• Enzymes that degrade or modify pectin, including: pectate lyases (EC 4.2.2.2), which carry out eliminative cleavage of pectate to give oligosaccharides with 4-deoxy-alpha-D-gluc-4-enuronosyl groups at their non-reducing ends; pectin lyases (EC 4.2.2.10), which catalyze eliminative cleavage of (1-4)-alpha-D-galacturonan methyl ester to give oligosaccharides with 4-deoxy-6-O-methyl-alpha-D-galact-4-enuronosyl groups at their non-reducing ends; polygalacturonases (EC 3.2.1.15), which carry out random hydrolysis of 1,4-alpha-D-galactosiduronic linkages in pectate and other galacturonans; pectin esterases, such as pectin acetyl esterase (EC 3.1.1.11), which hydrolyzes acetate from pectin acetyl esters; alpha-arabinofuranosidases; beta-galactosidases; galactanases; arabinanases; rhamnogalacturonases (EC 3.2.1.-), which hydrolyze alpha-D-galacturonopyranosyl-(1,2)-alpha-L-rhamnopyranosyl linkages in the backbone of the hairy regions of pectins; rhamnogalacturonan lyases (EC 4.2.2.-), which degrade type I rhamnogalacturonan from plant cell walls and release disaccharide products; rhamnogalacturonan acetyl esterases (EC 3.1.1.-), which hydrolyze acetate from rhamnogalacturonan; and xylogalacturonosidases and xylogalacturonases (EC 3.2.1.-), which hydrolyze xylogalacturonan (xga), a galacturonan backbone heavily substituted with xylose, and which is one important component of the hairy regions of pectin;

• Enzymes that degrade or modify lignin, including: lignin peroxidases (EC 1.11.1.14), which oxidize lignin and lignin model compounds using hydrogen peroxide; manganese-dependent peroxidases (EC 1.11.1.13), which oxidize lignin and lignin model compounds using Mn2+ and hydrogen peroxide; versatile peroxidases (EC 1.11.1.16), which oxidize lignin and lignin model compounds using an electron donor and hydrogen peroxide and combine the substrate-specificity characteristics of the two other ligninolytic peroxidases: manganese peroxidase (EC 1.11.1.13) and lignin peroxidase (EC 1.11.1.14); and laccases (EC 1.10.3.2), a group of multi-copper proteins of low specificity acting on both o- and p-quinols, and often acting also on lignin; and

• Enzymes acting on chitin, including: chitinases (EC 3.2.1.14), which catalyze random hydrolysis of N-acetyl-beta-D-glucosaminide 1,4-beta-linkages in chitin and chitodextrins; and hexosaminidases, such as beta-N-acetylhexosaminidase (EC 3.2.1.52), which hydrolyze terminal non-reducing N-acetyl-D-hexosamine residues in N-acetyl-beta-D-hexosaminides.

[00124] In another embodiment, the present invention includes the polypeptides and their corresponding activities as defined in Tables 1A-1C, as well as functional variants thereof.

[00125] As alluded to above, the term "functional variant" as used herein is intended to include a polypeptide which is sufficiently similar in structure and function to any one of the above-mentioned polypeptides (without being identical thereto) to maintain at least one of its native biological activities. In another embodiment, a functional variant can comprise an insertion, substitution, or deletion of one or more amino acids as compared to its corresponding native protein. In another embodiment, a functional variant can comprise additional modifications (e.g., post-translational modifications such as acetylation, phosphorylation, glycosylation, sulfatation, sumoylation, prenylation, ubiquitination, etc).

[00126] In another embodiment, functional variants of the present invention can contain one or more conservative substitutions of a polypeptide sequence disclosed herein. Such modifications can be carried out routinely using site-specific mutagenesis. The term "conservative substitution" is intended to indicate a substitution in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acids having similar side chains are known in the art and include amino acids with basic side chains (e.g., lysine, arginine and histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), non-polar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
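The side-chain families listed above lend themselves to a simple lookup. The Python sketch below encodes them with one-letter amino acid codes and tests whether a substitution stays within at least one shared family. Treating membership in any single shared family as sufficient is an assumption of this sketch, as are the function names.

```python
# Side-chain families as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": frozenset("KRH"),
    "acidic": frozenset("DE"),
    "uncharged polar": frozenset("GNQSTYC"),
    "non-polar": frozenset("AVLIPFMW"),
    "beta-branched": frozenset("TVI"),
    "aromatic": frozenset("YFWH"),
}


def shared_families(original, replacement):
    """Names of the families that contain both residues, sorted by name."""
    a, b = original.upper(), replacement.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(original, replacement):
    """True if the two residues differ but share at least one family."""
    if original.upper() == replacement.upper():
        return False  # identical residue: no substitution took place
    return bool(shared_families(original, replacement))


print(is_conservative("K", "R"))   # lysine to arginine, both basic
print(is_conservative("D", "K"))   # aspartic acid to lysine, no shared family
print(shared_families("T", "V"))   # threonine and valine
```

Some residues belong to more than one family (e.g., histidine is listed as both basic and aromatic), which is why the lookup returns a list rather than a single name.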

[00127] In another embodiment, functional variants of the present invention can contain one or more insertions, deletions or truncations of non-essential amino acids. As used herein, a "non-essential amino acid" is a residue that can be altered in a polypeptide of the present invention without substantially altering its (biological) function or protein activity. For example, amino acid residues that are conserved among the proteins of the present invention having similar biological activities (and their orthologs) are predicted to be particularly unamenable to alteration.

[00128] In another embodiment, functional variants can include functional fragments (i.e., biologically active fragments) of any one of the polypeptide sequences disclosed herein. Such fragments include fewer amino acids than the full length protein from which they are derived, but exhibit at least one biological activity of the corresponding full-length protein. Typically, biologically active fragments comprise a domain or motif with at least one activity of the full-length protein. A biologically active fragment of a protein of the invention can be a polypeptide which is, for example, 10, 25, 50, 100 or more amino acids in length. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the biological activities of the native form of a polypeptide of the present invention.

[00129] In another embodiment, the present invention includes other functional variants of the polypeptides disclosed herein, which can be identified by techniques known in the art. For example, functional variants can be identified by screening combinatorial libraries of mutants (e.g., truncation mutants) of polypeptides of the present invention for biological activity. In another embodiment, a variegated library of variants can be generated by combinatorial mutagenesis at the nucleic acid level. A variegated library of variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential protein sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display). There are a variety of methods that can be used to produce libraries of potential variants of the polypeptides of the present invention from a degenerate oligonucleotide sequence. Methods for synthesizing degenerate oligonucleotides are known in the art (e.g., see Narang (1983) Tetrahedron 39:3; Itakura et al., (1984) Annu. Rev. Biochem. 53:323; Itakura et al., (1984) Science 198:1056; Ike et al., (1983) Nucleic Acid Res. 11:477).

[00130] In addition, libraries of fragments of the coding sequence of a polypeptide of the present invention can be used to generate a variegated population of polypeptides for screening and subsequent selection of variants. For example, a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of the coding sequence of interest with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, an expression library can be derived which encodes N-terminal and internal fragments of various sizes of the protein of interest.

[00131] Several techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property. The most widely used techniques, which are amenable to high through-put analysis, for screening large gene libraries typically include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify variants of polypeptides of the present invention (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al., (1993) Protein Engineering 6(3): 327-331).

[00132] In another embodiment, functional variants of the present invention can encompass orthologs of the genes and polypeptides disclosed herein. Orthologs of the polypeptides disclosed herein include proteins that can be isolated from other strains or species and possess a similar or identical biological activity. Such orthologs can be identified as comprising an amino acid sequence that is substantially homologous (shares a certain degree of amino acid sequence identity) with the polypeptides disclosed herein. As used herein, the expression "substantially homologous" refers to a first amino acid or nucleotide sequence which contains a sufficient or minimum number of identical or equivalent (e.g., with similar side chain) amino acids or nucleotides to a second amino acid or nucleotide sequence such that the first and the second amino acid or nucleotide sequences have a common domain. For example, amino acid or nucleotide sequences which contain a common domain having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity are defined herein as sufficiently identical.

[00133] In another embodiment, the present invention includes improved proteins derived from the polypeptides of the present invention. Improved proteins are proteins wherein at least one biological activity is improved. Such proteins may be obtained by randomly introducing mutations along all or part of the coding sequences of the polypeptides of the present invention such as by saturation mutagenesis, and the resulting mutants can be expressed recombinantly and screened for biological activity. For instance, the art provides for standard assays for measuring the enzymatic activity of the resulting protein and thus improved proteins may be selected.

Recovery and purification

[00134] In another aspect, polypeptides of the present invention may be present alone (e.g., in an isolated or purified form), within a composition (e.g., an enzymatic composition for carrying out an industrial process), or in an appropriate host. In one embodiment, polypeptides of the present invention can be recovered and purified from cell cultures (e.g., recombinant cell cultures) by methods known in the art. In another embodiment, high performance liquid chromatography ("HPLC") can be employed for the purification.

[00135] In another aspect, polypeptides of the present invention include naturally purified products, products of chemical synthetic procedures, and products produced by recombinant techniques from a prokaryotic or eukaryotic host, including, for example, bacterial, yeast, higher plant, insect and mammalian cells. Depending on the host employed in a recombinant production procedure, the polypeptides of the present invention may be glycosylated or may be non-glycosylated. In addition, polypeptides of the invention may also include an initial modified methionine residue, in some cases as a result of host-mediated processes.

Fusion proteins

[00136] In another aspect, the present invention includes fusion proteins comprising a polypeptide of the present invention or a functional variant thereof, which is operatively linked to one or more unrelated polypeptides (e.g., heterologous amino acid sequences). "Unrelated polypeptides" or "heterologous polypeptides" or "heterologous sequences" refer to polypeptides or sequences which are usually not present close to or fused to one of the polypeptides of the present invention. Such "unrelated polypeptides" or "heterologous polypeptides" have amino acid sequences corresponding to proteins which are not substantially homologous to the polypeptide sequences disclosed herein. Such "unrelated polypeptides" can be derived from the same or a different organism. In one embodiment, a fusion protein of the present invention comprises at least two biologically active portions or domains of polypeptide sequences disclosed herein. In the context of fusion proteins, the term "operatively linked" is intended to indicate that all of the different polypeptides are fused in-frame to each other. In another embodiment, an unrelated polypeptide can be fused to the N terminus or C terminus of a polypeptide of the present invention.

[00137] In another embodiment, a polypeptide of the present invention can be fused to a protein which enables or facilitates recombinant protein purification and/or detection. For example, a polypeptide of the present invention can be fused to a protein such as glutathione S-transferase (GST), and the resulting fusion protein can then be purified/detected through the high affinity of GST for glutathione.

[00138] Fusion proteins of the present invention can be produced by standard recombinant DNA techniques. For example, DNA fragments encoding different polypeptide sequences can be ligated together in frame in accordance with conventional techniques, for example by employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers, which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (e.g., see Current Protocols in Molecular Biology, eds. Ausubel et al., John Wiley & Sons: 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the present invention can be cloned into such an expression vector so that the fusion moiety is linked in-frame to the polypeptide of interest.

Signal sequences

[00139] In another embodiment, a polypeptide of the present invention can be fused to a heterologous signal sequence (e.g., at its N terminus) to facilitate its isolation, expression and/or secretion from certain host cells (e.g., mammalian and yeast host cells). Signal sequences are typically characterized by a core of hydrophobic amino acids, which are generally cleaved from the mature protein during secretion in one or more cleavage events. Such signal peptides may contain processing sites that allow cleavage of the signal sequence from the mature proteins as they pass through the secretory pathway.

[00140] For example, the gp67 secretory sequence of the baculovirus envelope protein can be used as a heterologous signal sequence (Current Protocols in Molecular Biology, Ausubel et al., eds., John Wiley & Sons, 1992). Other examples of eukaryotic heterologous signal sequences include the secretory sequences of melittin and human placental alkaline phosphatase (Stratagene; La Jolla, California). In yet another example, useful prokaryotic heterologous signal sequences include the phoA secretory signal (Sambrook et al., supra) and the protein A secretory signal (Pharmacia Biotech; Piscataway, New Jersey).

[00141] The signal sequence can direct secretion of the protein, such as from a eukaryotic host into which the expression vector is transformed, and the signal sequence is subsequently or concurrently cleaved. The protein can then be readily purified from the extracellular medium by known methods. In another embodiment, a signal sequence can be linked to a fusion protein of the present invention to facilitate detection, purification, and/or recovery thereof. For example, the sequence encoding a fusion protein of the present invention may be fused to a marker sequence, such as a sequence encoding a peptide, which facilitates purification of the fused polypeptide. In another embodiment, the marker sequence can be a hexa-histidine peptide, such as the tag provided in a pQE vector (Qiagen, Inc.), among others, many of which are commercially available. As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for instance, hexa-histidine provides for convenient purification of the fusion protein. In another embodiment, the HA tag is another peptide useful for purification, which corresponds to an epitope derived from the influenza hemagglutinin protein, which has been described by Wilson et al., Cell 37:767 (1984), for instance.

POLYNUCLEOTIDES

[00142] The nucleic acid sequences of the genes disclosed herein were determined by sequencing cDNA clones, mRNA transcripts, or genomic DNA obtained from Chaetomium thermophilum strain ATCC 16451, Thermomyces stellatus strain CBS 241.64, and Corynascus sepedonium strain ATCC 9787.

[00143] In another aspect, the present invention relates to polynucleotides encoding a polypeptide of the present invention, including functional variants thereof. In one embodiment, polynucleotides of the present invention comprise the coding nucleic acid sequence of any one of SEQ ID NOs: 179-356, 783-1030, and 1565-1850, or as set forth in Tables 1A-1C.

[00144] In another aspect, the present invention relates to genomic DNA sequences corresponding to the above mentioned coding sequences. In one embodiment, polynucleotides of the present invention comprise the genomic nucleic acid sequence of any one of SEQ ID NOs: 1-178, 535-782, and 1279-1564; or as set forth in Tables 1A-1C. In a further embodiment, polynucleotides of the present invention comprise the genomic nucleic acid sequence of any one of SEQ ID NOs: 1-160, 535-782, and 1279-1564.

[00145] In another aspect, the present invention relates to a polynucleotide comprising at least one intronic or exonic nucleic acid sequence of any one of the genomic sequences corresponding to SEQ ID NOs: 1-178, 535-782, and 1279-1564 (e.g., the intron or exon segments defined by the exon boundaries listed in Tables 2A-2C). In a further embodiment, the present invention relates to a polynucleotide comprising at least one intronic or exonic nucleic acid sequence of any one of the genomic sequences corresponding to SEQ ID NOs: 1-160, 535-782, and 1279-1564. Although only the positions of the exons are defined in Tables 2A-2C, a person of skill in the art would readily be able to determine the positions of the corresponding introns in view of this information. In some embodiments, polynucleotides comprising at least one of these intronic segments are within the scope of the present invention.
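As a non-limiting sketch of how intron positions may be determined from exon boundaries alone, the following routine takes a list of exon intervals and returns the intervening intervals. It assumes 1-based, inclusive coordinates on a single genomic sequence; the coordinate convention, the function name, and the example values are assumptions made for illustration and are not taken from Tables 2A-2C.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive (assumed convention)


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive exons as intron intervals."""
    ordered = sorted(exons)
    introns: List[Interval] = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:
            raise ValueError("exon intervals overlap")
        if next_start - prev_end > 1:
            # Intron spans the bases strictly between the two exons.
            introns.append((prev_end + 1, next_start - 1))
    return introns


if __name__ == "__main__":
    # Hypothetical exon boundaries, invented for this example only.
    example_exons = [(1, 120), (181, 300), (355, 600)]
    print(introns_from_exons(example_exons))
```

Directly adjacent exon intervals yield no intron, and overlapping intervals are rejected rather than silently merged.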

[00146] In yet another aspect, the present invention relates to a polynucleotide comprising at least one exonic nucleic acid sequence comprised within SEQ ID NOs: 1-178, 535-782, and 1279-1564, or as set forth in Tables 2A-2C. In a further embodiment, the present invention relates to a polynucleotide comprising at least one exonic nucleic acid sequence comprised within SEQ ID NOs: 1-160, 535-782, and 1279-1564.

[00147] In another aspect, the present invention relates to isolated polynucleotides sharing a minimum threshold of nucleic acid sequence identity with any one of the above-mentioned polynucleotides. In specific embodiments, the present invention relates to isolated polynucleotides having at least 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% nucleic acid sequence identity to any one of the above-mentioned polynucleotides. Other specific percentage units that have not been specifically recited here for brevity are nevertheless considered within the scope of the present invention. Polynucleotides having the aforementioned thresholds of nucleic acid sequence identity can be created by introducing one or more nucleotide substitutions, additions or deletions into the coding nucleotide sequences of the present invention such that one or more amino acid substitutions, deletions or insertions are introduced into the encoded polypeptide. Such mutations may be introduced by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis.

[00148] In another aspect, the present invention relates to a polynucleotide that hybridizes (or is hybridizable) under medium-high stringency conditions, high stringency conditions, or very high stringency conditions with the full-length complement of any one of the polynucleotides defined above.

[00149] As used herein, "very low stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 25% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 45°C.

[00150] As used herein, "low stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 25% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 50°C.

[00151] As used herein, "medium stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 35% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 55°C.

[00152] As used herein, "medium-high stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 35% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 60°C.

[00153] As used herein, "high stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 50% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 65°C.

[00154] As used herein, "very high stringency conditions" means for probes of at least 100 nucleotides in length, prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS, 200 micrograms/mL sheared and denatured salmon sperm DNA, and 50% formamide, following standard Southern blotting procedures for 12 to 24 hours. The carrier material is finally washed three times each for 15 minutes using 2X SSC, 0.2% SDS at 70°C.
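The six stringency definitions above differ only in the formamide percentage and the wash temperature. For convenience, they can be collected in a small lookup such as the following sketch. The numeric values are those recited in the definitions above; the data structure and function names are hypothetical and introduced only for this example.

```python
# Conditions shared by all six levels, as recited in the definitions above.
COMMON_CONDITIONS = {
    "min_probe_length_nt": 100,
    "hybridization_temp_c": 42,
    "hybridization_buffer": "5X SSPE, 0.3% SDS, 200 micrograms/mL salmon sperm DNA",
    "hybridization_hours": (12, 24),
    "wash": "3 x 15 min in 2X SSC, 0.2% SDS",
}

# level -> (formamide percent, wash temperature in degrees C), lowest to highest.
STRINGENCY = {
    "very low": (25, 45),
    "low": (25, 50),
    "medium": (35, 55),
    "medium-high": (35, 60),
    "high": (50, 65),
    "very high": (50, 70),
}

_ORDER = list(STRINGENCY)  # dicts preserve insertion order in Python 3.7+


def is_at_least(level: str, minimum: str) -> bool:
    """True if `level` is the same as or more stringent than `minimum`."""
    return _ORDER.index(level) >= _ORDER.index(minimum)


if __name__ == "__main__":
    for name, (formamide, wash_c) in STRINGENCY.items():
        print(f"{name:12s} formamide {formamide}%  wash at {wash_c} C")
    print(is_at_least("high", "medium-high"))
```

Laid out this way, it is apparent that each step up in stringency raises the wash temperature by 5°C, while the formamide concentration rises in three tiers (25%, 35%, 50%).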

[00155] In one embodiment, a polynucleotide of the present invention (or a fragment thereof) can be isolated using the sequence information provided herein in conjunction with standard molecular biology techniques (e.g., as described in Sambrook et al., supra). For example, suitable hybridization oligonucleotides (e.g., probes or primers) can be designed using all or a portion of the nucleic acid sequences disclosed herein and prepared by standard synthetic techniques (e.g., using an automated DNA synthesizer). The oligonucleotides can be employed in hybridization and/or amplification reactions, for example, to amplify a template of cDNA, mRNA or genomic DNA, according to standard PCR techniques. A polynucleotide so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis.

[00156] In another aspect, the present invention relates to polynucleotides encoding functional variants of any one of the polypeptides of the present invention, including a biologically active fragment or domain thereof.

[00157] In another aspect, the present invention can include nucleic acid molecules (e.g., oligonucleotides) sufficient for use as primers and/or hybridization probes to amplify, sequence and/or identify nucleic acid molecules encoding a polypeptide of the present invention or fragments thereof. In some embodiments, the present invention relates to polynucleotides (e.g., oligonucleotides) that comprise, span, or hybridize specifically to exon-exon or exon-intron junctions of the genomic sequences identified herein, such as those defined in Tables 2A-2C. Designing such polynucleotides/oligonucleotides would be within the grasp of a person of skill in the art in view of the target sequence information disclosed herein and are thus encompassed by the present invention.

[00158] In another aspect, the present invention relates to polynucleotides comprising silent mutations or mutations that do not significantly alter the (biological) function or protein activity of the encoded polypeptide. Guidance concerning how to make phenotypically silent amino acid substitutions is provided for example in Bowie et al., Science 247:1306-1310 (1990) and in the references cited therein. Furthermore, it will be apparent for the skilled person that DNA sequence polymorphisms of the genes disclosed herein may exist within a given population, which may differ from the sequences disclosed herein. Such genetic polymorphisms may exist in cells from different populations or within a population due to natural allelic variation. Accordingly, in one embodiment, the present invention can include natural allelic variants and homologs of polynucleotides disclosed herein.

[00159] In another aspect, polynucleotides of the present invention can comprise only a portion or a fragment of the nucleic acid sequences disclosed herein. Although such polynucleotides may not encode a functional polypeptide of the present invention, they are useful for example as probes or primers in hybridization or amplification reactions. Exemplary uses of such polynucleotides include: (1) isolating a gene (or allelic variant thereof) from a cDNA library; (2) in situ hybridization (e.g., FISH) to metaphase chromosomal spreads to provide precise chromosomal location of the gene as described in Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988); (3) Northern blot analysis for detecting expression of mRNA corresponding to a polypeptide disclosed herein, or a homolog, ortholog or variant thereof, in specific tissues and/or cells; and (4) probes and primers that can be used as a diagnostic tool to analyze the presence of a nucleic acid hybridizable to a polynucleotide disclosed herein in a given biological (e.g., tissue) sample. It would be within the grasp of a skilled person to design specific oligonucleotides in view of the nucleic acid sequences disclosed herein. Oligonucleotides typically comprise a region of nucleotide sequence that hybridizes (preferably under highly stringent conditions) to at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 50, 60, 70, 80, 90 or 100 contiguous nucleotides of a polynucleotide of the present invention. In one embodiment, such oligonucleotides can be used for identifying and/or cloning other family members, as well as orthologs from other species. In another embodiment, the oligonucleotide can be attached to a detectable label (e.g., a radioisotope, a fluorescent compound, an enzyme, or an enzyme cofactor). Such oligonucleotides can also be used as part of a diagnostic method or kit for identifying cells which express a polypeptide of the present invention.

[00160] As would be understood by the skilled person, full-length complements of any one of the polynucleotides of the present invention are also encompassed. In one embodiment, the full-length complements are antisense molecules with respect to the coding strands of polynucleotides of the present invention, which hybridize (preferably under highly stringent conditions) to at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 50, 60, 70, 80, 90 or 100 contiguous nucleotides of a polynucleotide of the present invention.
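For illustration only, the full-length complement (antisense strand) of a coding-strand DNA sequence can be obtained as its reverse complement, as in the following sketch. The function name and the toy sequence are assumptions made for this example. The routine handles only the unambiguous bases A, C, G and T and says nothing about hybridization behavior under any particular stringency condition.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    unexpected = set(dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding = "ATGGCTGAA"  # toy sequence, invented for this example
    antisense = reverse_complement(coding)
    print(antisense)
    # Taking the reverse complement twice restores the original strand.
    print(reverse_complement(antisense) == coding)
```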

Sequencing errors

[00161] The sequence information as provided herein should not be so narrowly construed as to require inclusion of erroneously identified bases. The specific sequences disclosed herein can be readily used to isolate the corresponding complete genes from the organism sequenced herein, which in turn can easily be subjected to further sequence analyses thereby identifying sequencing errors.

[00162] Unless otherwise indicated, all nucleotide sequences disclosed herein were determined by sequencing using an automated DNA sequencer, and all amino acid sequences of polypeptides disclosed herein were predicted by translation based on the genetic code. Therefore, as is known in the art for any DNA sequence determined by this automated approach, any nucleotide sequence determined herein may contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be more precisely determined by other approaches including manual DNA sequencing methods well known in the art. As is also known in the art, a single insertion or deletion in a determined nucleotide sequence compared to the actual sequence will cause a frame shift in translation of the nucleotide sequence such that the predicted amino acid sequence encoded by a determined nucleotide sequence will be completely different from the amino acid sequence actually encoded by the sequenced DNA molecule, beginning at the point of such an insertion or deletion.
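The effect of a single-base insertion on the predicted amino acid sequence can be illustrated with a short translation routine. The following is a minimal sketch using the standard genetic code; the toy sequences and function name are invented for this example and do not correspond to any sequence disclosed herein.

```python
# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; '*' marks a stop codon; trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    actual = "ATGGCTGAACATCTGACCTAA"            # toy "actual" sequence
    determined = actual[:6] + "C" + actual[6:]  # one spurious base after position 6
    print(translate(actual))
    print(translate(determined))
```

In this toy case the two predicted sequences agree for the two codons upstream of the inserted base and differ at every position after it, and the stop codon of the actual sequence is no longer read in frame.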

[00163] The person skilled in the art is capable of identifying such erroneously identified bases and knows how to correct such errors.

VECTORS

[00164] Another aspect of the invention pertains to vectors (e.g., expression vectors), containing a polynucleotide encoding a polypeptide of the present invention.

[00165] As used herein, the term "vector" includes a nucleic acid molecule capable of transporting another nucleic acid molecule to which it has been linked. One type of vector is a "plasmid", which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "expression vectors". In general, expression vectors useful in recombinant DNA techniques are often in the form of plasmids. The terms "plasmid" and "vector" can be used interchangeably herein as the plasmid is the most commonly used form of vector. However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.

[00166] In one embodiment, recombinant expression vectors of the invention can comprise a polynucleotide of the present invention in a form suitable for expression of the polynucleotide in a host cell, which means that the recombinant expression vector includes one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which is operatively linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, "operatively linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term "regulatory sequence" is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signal). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cells and those which direct expression of the nucleotide sequence only in a certain host cell (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the present invention can be introduced into host cells to thereby produce proteins or peptides, encoded by polynucleotides as described herein (e.g., polypeptides of the present invention).

[00167] In another embodiment, recombinant expression vectors of the present invention can be designed for expression of polypeptides of the present invention in prokaryotic or eukaryotic cells. For example, these polypeptides can be expressed in bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel (supra). In another embodiment, recombinant expression vectors of the present invention can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

[00168] In another embodiment, expression vectors of the present invention can include chromosomal-, episomal- and virus-derived vectors, e.g., vectors derived from bacterial plasmids, bacteriophage, yeast episome, yeast chromosomal elements, viruses such as baculoviruses, papova viruses, vaccinia viruses, adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, such as cosmids and phagemids.

[00169] For expression, a DNA insert should be operatively linked to an appropriate promoter, such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to name a few. Other suitable promoters will be known to the skilled person. In a specific embodiment, promoters are preferred that are capable of directing a high expression level of biologically active polypeptides of the present invention (e.g., lignocellulose active proteins) from fungi. Such promoters are known in the art. The expression constructs may contain sites for transcription initiation, termination, and, in the transcribed region, a ribosome binding site for translation. The coding portion of the mature transcripts expressed by the constructs will include a translation initiating AUG at the beginning and a termination codon appropriately positioned at the end of the polypeptide to be translated.
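As a simple, non-limiting check of the last requirement above, the following sketch tests whether a candidate coding portion begins with an initiating AUG, ends with a termination codon, and contains no earlier in-frame stop. It assumes the standard stop codons (UAA, UAG, UGA) and accepts DNA or RNA letters; the function name and example sequences are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code (assumed)


def has_complete_reading_frame(coding_portion: str) -> bool:
    """True if the sequence is AUG ... stop, in frame, with no internal stop codon."""
    rna = coding_portion.upper().replace("T", "U")
    if len(rna) < 6 or len(rna) % 3 != 0:
        return False
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    if codons[0] != "AUG" or codons[-1] not in STOP_CODONS:
        return False
    # A stop codon before the final position would truncate the polypeptide.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


if __name__ == "__main__":
    print(has_complete_reading_frame("AUGGCUGAACAUUAA"))  # toy sequence
    print(has_complete_reading_frame("AUGUAAGCUUAA"))     # premature stop
```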

[00170] Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, transduction, infection, lipofection, cationic lipid-mediated transfection or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al., (Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989), Davis et al., Basic Methods in Molecular Biology (1986) and other laboratory manuals.

[00171] For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those which confer resistance to drugs, such as G418, hygromycin and methotrexate. A polynucleotide encoding a selectable marker can be introduced into a host cell on the same vector as that encoding a polypeptide of the present invention, or on a separate vector. Cells stably transfected with a polynucleotide of the present invention can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die).

[00172] Expression of proteins in prokaryotes is often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, e.g., to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: (1) to increase expression of recombinant protein; (2) to increase the solubility of the recombinant protein; and (3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein.

[00173] Vectors preferred for use in bacteria are for example disclosed in WO-A1-2004/074468. Other suitable vectors will be readily apparent to the skilled artisan. Known bacterial promoters suitable for use in the present invention include the promoters disclosed in WO-A1-2004/074468.

[00174] As indicated, the expression vectors will preferably contain selectable markers. Such markers include dihydrofolate reductase or neomycin resistance for eukaryotic cell culture and antibiotic resistance (e.g., tetracycline or ampicillin) for culturing in E. coli and other bacteria. Representative examples of appropriate hosts include bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium and certain Bacillus species; fungal cells such as Aspergillus species, for example A. niger, A. oryzae and A. nidulans; yeast cells such as Kluyveromyces, for example K. lactis, and/or Pichia, for example P. pastoris; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS and Bowes melanoma; and plant cells. Appropriate culture media and conditions for the above-described host cells are known in the art.

[00175] Transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes may be increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act to increase transcriptional activity of a promoter in a given host cell-type. Examples of enhancers include the SV40 enhancer, which is located on the late side of the replication origin at bp 100 to 270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

[00176] For secretion of the translated protein into the lumen of the endoplasmic reticulum, into the periplasmic space or into the extracellular environment, appropriate secretion signals may be incorporated into the expressed polypeptide. The signals may be endogenous to the polypeptide or they may be heterologous signals. In an embodiment, a polypeptide of the present invention may be expressed in a modified form, such as a fusion protein, and may include not only secretion signals but also additional heterologous functional regions. Thus, for instance, a region of additional amino acids, particularly charged amino acids, may be added to the N terminus of the polypeptide to improve stability and persistence in the host cell, during purification or during subsequent handling and storage. Also, peptide moieties may be added to the polypeptide to facilitate purification and/or detection.

HOST CELLS

[00177] In another aspect, the present invention features cells, e.g., transformed host cells or recombinant host cells that contain a polynucleotide or vector of the present invention. A "transformed cell" or "recombinant cell" is a cell into which (or into an ancestor of which) has been introduced a polynucleotide or vector of the invention by means of recombinant DNA techniques. Both prokaryotic and eukaryotic cells are included, e.g., bacteria, fungi, yeast, and the like. Especially preferred are cells from filamentous fungi, in particular the strain from which the polynucleotide and polypeptide sequences disclosed herein were derived.

[00178] In one embodiment, a cell of the present invention is typically not a wild-type strain or a naturally-occurring cell. Host cells of the present invention can include, but are not limited to: fungi (e.g., Aspergillus niger, Trichoderma reesei, Myceliophthora thermophila and Talaromyces emersonii); yeasts (e.g., Saccharomyces cerevisiae, Yarrowia lipolytica and Pichia pastoris); bacteria (e.g., Escherichia coli and Bacillus sp.); and plants (e.g., Nicotiana benthamiana, Nicotiana tabacum and Medicago sativa).

[00179] In another embodiment, a polynucleotide (or a polynucleotide which is comprised within a vector) may be homologous or heterologous with respect to the cell into which it is introduced. In this context, a polynucleotide is homologous to a cell if the polynucleotide naturally occurs in that cell. A polynucleotide is heterologous to a cell if the polynucleotide does not naturally occur in that cell. Accordingly, in an embodiment, the present invention relates to a cell which comprises a heterologous or a homologous sequence corresponding to any one of the polynucleotides or polypeptides disclosed herein.

[00180] In another embodiment, a host cell can be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in a specific, desired fashion. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may facilitate optimal functioning of the protein. Various host cells have characteristic and specific mechanisms for post-translational processing and modification of proteins and gene products. Appropriate cell lines or host systems familiar to those of skill in the art can be chosen to ensure the desired and correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product can be used. Such host cells are well known in the art.

[00181] In another embodiment, host cells can also include, but are not limited to, mammalian cell lines such as CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, and choroid plexus cell lines. If desired, a stably transfected cell line can produce the polypeptides of the present invention. A number of vectors suitable for stable transfection of mammalian cells are available to the public; methods for constructing such cell lines are also publicly known, e.g., in Ausubel et al., (supra).

[00182] In another embodiment, the present invention relates to methods of inhibiting the expression of a polypeptide of the present invention in a host cell, comprising administering to the cell or expressing in the cell a double-stranded RNA (dsRNA) molecule (or a molecule comprising a region of double-strandedness), wherein the dsRNA comprises a subsequence of a polynucleotide of the present invention. In a preferred aspect, the dsRNA is about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more duplex nucleotides in length. The dsRNA is preferably a small interfering RNA (siRNA) or a micro RNA (miRNA). In a preferred aspect, the dsRNA is a small interfering RNA (siRNA) for inhibiting transcription. In another preferred aspect, the dsRNA is a micro RNA (miRNA) for inhibiting translation. The present invention also relates to such double-stranded RNA (dsRNA) molecules, comprising a portion of the mature polypeptide coding sequence of any one of the coding sequences of the polypeptides disclosed herein, for inhibiting expression of that polypeptide in a cell. While the present invention is not limited by any particular mechanism of action, the dsRNA can enter a cell and cause the degradation of a single-stranded RNA (ssRNA) of similar or identical sequences, including endogenous mRNAs. When a cell is exposed to dsRNA, mRNA from the homologous gene is selectively degraded by a process called RNA interference (RNAi). The dsRNAs of the present invention can be used in gene-silencing methods. In one aspect, the invention relates to methods to selectively degrade RNA using the dsRNAs of the present invention. The process may be practiced in vitro, ex vivo or in vivo. In one aspect, the dsRNA molecules can be used to generate a loss-of-function mutation in a cell, an organ or an organism. Methods for making and using dsRNA molecules to selectively degrade RNA are well known in the art, see, for example, U.S. Patent No. 6,506,559; U.S. Patent No. 6,511,824; U.S. Patent No. 6,515,109; and U.S. Patent No. 6,489,127. In some instances, new phylogenic analyses of fungal species have resulted in taxonomic reclassifications. For example, following their phylogenic studies reported in van den Brink et al., ("Phylogeny of the industrial relevant, thermophilic genera Myceliophthora and Corynascus", Fungal Diversity (2012), 52:197-207), the authors proposed renaming all existing Corynascus species to Myceliophthora. Such changes in taxonomic classification are within the scope of the present invention and, regardless of future reclassifications, a person of skill in the art would be able to identify the organism used to determine the sequences disclosed herein, for example based on the strain's accession number (ATCC 16451).

[00183] It should be understood herein that the level of expression of polypeptides of the present invention could be modified by adapting the codon usage ratio of a sequence of the present invention to that of the host or hosts in which it is meant to be expressed. This adaptation and the concept of codon usage ratio are all well known in the art.
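As an illustration of the codon adaptation described in paragraph [00183], the following is a minimal sketch that recodes a coding sequence using the most frequent synonymous codon of a host. The table HOST_CODON_USAGE, its frequencies, and the function names are hypothetical placeholders introduced for this example only; they are not data for any real organism, and picking the single most frequent codon is just one simple strategy among several known in the art.

```python
# Hypothetical codon-usage table for an illustrative host. For each amino acid
# (one-letter code, "*" = stop) the synonymous codons are mapped to relative
# frequencies. All values are placeholders, not measured data.
HOST_CODON_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAG": 0.70, "AAA": 0.30},
    "L": {"CTC": 0.40, "CTG": 0.35, "TTG": 0.25},
    "S": {"TCC": 0.45, "AGC": 0.35, "TCT": 0.20},
    "*": {"TAA": 0.60, "TGA": 0.40},
}


def codon_to_amino_acid(usage):
    """Invert the usage table into a codon -> amino acid lookup."""
    return {codon: aa for aa, codons in usage.items() for codon in codons}


def translate(coding_dna, usage):
    """Translate a coding sequence using only codons present in the toy table."""
    lookup = codon_to_amino_acid(usage)
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(lookup[seq[i:i + 3]] for i in range(0, len(seq), 3))


def recode_for_host(coding_dna, usage):
    """Replace every codon by the host's most frequent synonymous codon."""
    protein = translate(coding_dna, usage)
    return "".join(max(usage[aa], key=usage[aa].get) for aa in protein)


original = "ATGAAATTGTCTTAA"
adapted = recode_for_host(original, HOST_CODON_USAGE)
# The encoded protein must be unchanged by the recoding.
assert translate(adapted, HOST_CODON_USAGE) == translate(original, HOST_CODON_USAGE)
```

Only the nucleotide sequence changes; the encoded polypeptide is identical before and after recoding, which is the property the final assertion checks.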

ANTIBODIES

[00184] In another aspect, the present invention relates to an isolated binding agent capable of selectively binding to a polypeptide of the present invention. Suitable binding agents may be selected from an antibody, an antigen binding fragment, or a binding partner. In one embodiment, the binding agent selectively binds to an amino acid sequence selected from Tables 1A-1C, including to any fragment of any of the above sequences comprising at least one antibody binding epitope.

[00185] According to the present invention, the phrase "selectively binds to" refers to the ability of an antibody, antigen binding fragment or binding partner of the present invention to preferentially bind to specified proteins. More specifically, the phrase "selectively binds" refers to the specific binding of one protein to another (e.g., an antibody, fragment thereof, or binding partner to an antigen), wherein the level of binding, as measured by any standard assay (e.g., an immunoassay), is statistically significantly higher than the background control for the assay. For example, when performing an immunoassay, controls typically include a reaction well/tube that contains antibody or antigen binding fragment alone (i.e., in the absence of antigen), wherein an amount of reactivity (e.g., non-specific binding to the well) by the antibody or antigen binding fragment thereof in the absence of the antigen is considered to be background. Binding can be measured using a variety of methods standard in the art including enzyme immunoassays (e.g., ELISA, immunoblot assays, etc.).
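The criterion in paragraph [00185], binding that is statistically significantly higher than the background control, can be sketched as follows. The text does not prescribe a particular statistical test; an exact one-sided permutation test on the difference in means is assumed here purely for illustration, and the absorbance readings, the significance level of 0.05, and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def one_sided_permutation_p(signal, background):
    """Exact one-sided permutation p-value for mean(signal) > mean(background)."""
    observed = mean(signal) - mean(background)
    pooled = list(signal) + list(background)
    n_signal = len(signal)
    n_background = len(pooled) - n_signal
    total = sum(pooled)
    at_least_as_extreme = 0
    n_splits = 0
    # Enumerate every way of relabelling the pooled readings as "signal".
    for indices in combinations(range(len(pooled)), n_signal):
        s = sum(pooled[i] for i in indices)
        diff = s / n_signal - (total - s) / n_background
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
        n_splits += 1
    return at_least_as_extreme / n_splits


def binds_selectively(signal, background, alpha=0.05):
    """Assumed decision rule: p-value below the chosen significance level."""
    return one_sided_permutation_p(signal, background) < alpha


# Hypothetical ELISA absorbance readings (arbitrary units).
antigen_wells = [1.21, 1.35, 1.18, 1.29]     # antibody plus antigen
background_wells = [0.11, 0.09, 0.14, 0.10]  # antibody alone, no antigen
p_value = one_sided_permutation_p(antigen_wells, background_wells)
```

With four replicates per group there are 70 possible relabellings, so the smallest attainable p-value is 1/70; fewer replicates would make a 0.05 threshold unreachable with this test.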

[00186] Antibodies are characterized in that they comprise immunoglobulin domains and as such, they are members of the immunoglobulin superfamily of proteins. An antibody of the invention includes polyclonal and monoclonal antibodies, divalent and monovalent antibodies, bi- or multi-specific antibodies, serum containing such antibodies, antibodies that have been purified to varying degrees, and any functional equivalents of whole antibodies. Isolated antibodies of the present invention can include serum containing such antibodies, or antibodies that have been purified to varying degrees. Whole antibodies of the present invention can be polyclonal or monoclonal. Alternatively, functional equivalents of whole antibodies, such as antigen binding fragments in which one or more antibody domains are truncated or absent (e.g., Fv, Fab, Fab', or F(ab')2 fragments), as well as genetically-engineered antibodies or antigen binding fragments thereof, including single chain antibodies or antibodies that can bind to more than one epitope (e.g., bi-specific antibodies), or antibodies that can bind to one or more different antigens (e.g., bi- or multi-specific antibodies), may also be employed in the invention. Methods for the generation and production of antibodies are well known in the art.

[00187] Monoclonal antibodies may be produced according to the methodology of Kohler and Milstein (Nature 256:495-497, 1975). Non-antibody polypeptides, sometimes referred to as binding partners, may be designed to bind specifically to a protein of the invention. Examples of the design of such polypeptides, which possess a prescribed ligand specificity, are given in Beste et al., (Proc. Nat'l Acad. Sci. 96:1898-1903, 1999). In one embodiment, a binding agent of the invention is immobilized on a substrate, such as artificial membranes, organic supports, biopolymer supports and inorganic supports, for example for use in a screening assay.

[00188] In some embodiments, antibodies and binding agents specifically binding to polypeptides of the present invention may be produced and used even in the absence of knowledge of the precise biological function and/or protein activity of the polypeptide. Such antibodies and binding agents may be useful, for example, as diagnostic, classification, and/or research tools.

COMPOSITIONS AND USES

[00189] In another aspect, the present invention relates to a composition comprising one or more polypeptides or polynucleotides of the present invention. In one embodiment, the compositions are enriched in such a polypeptide. The term "enriched" indicates that the biological activity (e.g., biomass degradation or processing) of the composition has been increased, e.g., with an enrichment factor of at least 1.1. The composition may comprise a polypeptide of the present invention as the major component, e.g., a mono-component composition. Alternatively, the composition may comprise multiple enzymatic activities (e.g., those described herein).

[00190] The polypeptide compositions may be prepared in accordance with methods known in the art and may be in the form of a liquid or a dry composition. For instance, the polypeptide composition may be in the form of a granulate or a microgranulate. The polypeptide to be included in the composition may be stabilized in accordance with methods known in the art. Examples are given below of preferred uses of the polypeptide compositions of the present invention. The dosage of the polypeptide composition of the invention and other conditions under which the composition is used may be determined on the basis of methods known in the art.
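The enrichment factor mentioned in paragraph [00189] is not given a formula in the text. One plausible reading, assumed here only for illustration, is the ratio of the specific activity (activity units per mg of total protein) of the enriched composition to that of a reference composition. The function names and the numerical values below are hypothetical.

```python
def specific_activity(activity_units, total_protein_mg):
    """Activity per mg of total protein (units/mg)."""
    if total_protein_mg <= 0:
        raise ValueError("total protein must be positive")
    return activity_units / total_protein_mg


def enrichment_factor(enriched, reference):
    """Ratio of specific activities; each argument is (activity_units, protein_mg)."""
    return specific_activity(*enriched) / specific_activity(*reference)


def is_enriched(factor, threshold=1.1):
    """Apply the 'at least 1.1' example threshold from paragraph [00189]."""
    return factor >= threshold


# Hypothetical measurements: 330 units in 10 mg protein versus 250 units in 10 mg.
factor = enrichment_factor(enriched=(330.0, 10.0), reference=(250.0, 10.0))
```

Under this assumed definition, a composition with the same total protein but 32% more activity has an enrichment factor of 1.32 and satisfies the example threshold.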

[00191] In another aspect, the present invention relates to the use of the polypeptides (e.g., enzymes) of the present invention in a number of industrial and other processes. Despite the long-term experience obtained with these processes, there remains a need for improved polypeptides and enzymes featuring one or more significant advantages over those presently used. Depending on the specific application, these advantages can include aspects such as lower production costs, higher specificity towards the substrate, greater synergies with existing enzymes, less antigenic effect, fewer undesirable side activities, higher yields when produced in a suitable microorganism, more suitable pH and temperature ranges, better properties of the final product, and food grade or kosher aspects. In various embodiments, the present invention seeks to provide one or more of these advantages, or others.

Biomass processing or degradation

[00192] In another aspect, the polypeptides of the present invention may be used in new or improved methods for enzymatically degrading or converting plant cell wall polysaccharides from biomass into various useful products. In addition to cellulose and hemicellulose, plant cell walls contain associated pectins and lignins, the removal of which by enzymes of the current invention can improve accessibility to cellulases and hemicellulases, or which can themselves be converted to useful products. Therefore the polypeptides of the present invention may be used to degrade biomass or pretreated biomass to sugars. These sugars may be used as such or may be, for example, fermented into ethanol.

[00193] Usually, biomass must be subjected to pre-treatment in order to make the cellulose more accessible. Accordingly, in one embodiment, polypeptides of the present invention may be used in improved methods for the processing of pretreated biomass. Pretreatment technologies may involve chemical, physical, or biological treatments. Examples of pre-treatment technologies include but are not limited to: steam explosion; ammonia; acid hydrolysis; alkaline hydrolysis; solvent extraction; crushing; milling; etc.

[00194] One example of a product produced from biomass is bioethanol. Bioethanol is usually produced by the fermentation of glucose to ethanol by yeasts such as Saccharomyces cerevisiae; in addition to ethanol, other chemicals may be synthesized starting from glucose. Ethanol, today, is produced mostly from sugars or starches, obtained from sugar cane, fruits and grains. In contrast, cellulosic ethanol is obtained from cellulose, the main component of wood, straw and much other plant material. Sources of biomass for cellulosic ethanol production comprise agricultural residues (e.g., leftover crop materials from stalks, leaves, and husks of corn plants), forestry wastes (e.g., chips and sawdust from lumber mills, dead trees, and tree branches), energy crops (e.g., dedicated fast-growing trees and grasses such as switch grass), municipal solid waste (e.g., household garbage and paper products), food processing and other industrial wastes (e.g., black liquor, paper manufacturing by-products, etc.).

[00195] Plant biomass is a mixture of plant polysaccharides, including cellulose, hemicelluloses, and pectin, together with the structural polymer, lignin. Glucose is released from cellulose by the action of mixtures of enzymes, including: endoglucanases, exoglucanases (cellobiohydrolases 1 and 2) and beta-glucosidases. Efficient large-scale conversion of cellulosic materials by such mixtures may require the full complement of enzymes, and can be enhanced by the addition of enzymes that attack the other plant cell wall components (e.g., hemicelluloses, pectins, and lignins), as well as chemical linkages between these components. Hence, polypeptides of the present invention that are highly expressed, or have high specific activity, stability, or resistance to inhibitors may improve the efficiency of the process, and lower enzyme costs. It would be an advantage to the art to improve the degradation and conversion of plant cell wall polysaccharides by composing cellulase mixtures using cellulase enzymes with such properties. Furthermore, polypeptides of the present invention that are able to function at extremes of pH and temperature are desirable, both because improved enzyme robustness decreases costs, and because enzymes that function at high temperature will allow high processing temperatures under high substrate consistency conditions that decrease viscosity and thus improve yields.

[00196] Glycoside hydrolases from the family GH61 are known to stimulate the activity of cellulase cocktails on lignocellulosic substrates and are thus considered to exhibit cellulase-enhancing activity (Harris et al., Biochemistry 49, 3305 (2010)). Enhancement of cellulase cocktail efficiency by GH61 proteins of the present invention may contribute to lowering the costs of cellulase enzymes used for the production of glucose from plant cell biomass, as described above. GH61 (glycoside hydrolase family 61, sometimes referred to as EGIV) proteins are oxygen-dependent polysaccharide monooxygenases (PMOs) according to the latest literature. Often in the literature, these proteins are mentioned as enhancing the action of cellulases on lignocellulose substrates. GH61 was originally classified as an endoglucanase, based on the measurement of very weak endo-1,4-β-D-glucanase activity in one family member. The term "GH61" as used herein, is to be understood as a family of enzymes, which share common conserved sequence portions and foldings, originally classified in family 61 (http://www.cazy.org/GH61.html) of the well-established CAZY GH classification system, and now re-classified by CAZY as family AA9 (http://www.cazy.org/AA9.html). GH61 is used herein as being part of the cellulolytic system elaborated by certain fungi to degrade cellulose.

[00197] Enzymatic hydrolysis of plant hemicellulose yields 5-carbon sugars that may either be fermented to ethanol by some species of yeast or converted to other types of chemical products. Enzymatic deconstruction of hemicellulose is also known to improve the accessibility of plant cell wall cellulose to cellulase enzymes for the production of glucose from lignocellulosic materials. Hemicellulase enzymes of the present invention that enhance glucose production from lignocellulose would find utility in the bioethanol industry and in other processes that rely on glucose or pentose streams from lignocellulose.

[00198] Lignin is composed of methoxylated phenyl-propane units linked by ether linkages and carbon-carbon bonds. The chemical composition of lignin may, depending on species, include guaiacyl, 4-hydroxyphenyl, and syringyl groups. Enzymatic modification of lignin by the polypeptides of the present invention can be used for the production of structural materials from plant biomass, or alternatively improve the accessibility of plant cellulose and hemicelluloses to cellulase enzymes for the release of glucose from biomass as described above. Enzymes that degrade the lignin component of lignocellulose include lignin peroxidases, manganese-dependent peroxidases, versatile peroxidases, and laccases (Vicuna et al., 2000, Molecular Biotechnology 14: 173-176; Broda et al., 1996, Molecular Microbiology 19: 923-932). In some embodiments, polypeptides of the present invention may also, in certain instances, be active in the decolorization of industrial dyes, and thus useful for the treatment and detoxification of chemical wastes.

[00199] In another embodiment, pectin-degrading polypeptides of the present invention can also enhance the action of cellulases on plant biomass by improving the accessibility of cellulase to the cellulose component of lignocellulose.

[00200] In another embodiment, polypeptides of the present invention may also be useful in other applications for hydrolyzing non-starch polysaccharide (NSP).

[00201] In another embodiment, esterases of the present invention can be useful in the bioenergy industry such as for the production of biodiesel and hydrolysis of hemicellulose.

[00202] In another embodiment, the present invention relates to methods for degrading or converting a cellulose-containing material, comprising: treating the cellulose-containing material with an effective amount of a cellulolytic enzyme composition in the presence of an effective amount of a polypeptide having cellulolytic enhancing activity of the present invention, wherein the presence of the polypeptide having cellulolytic enhancing activity increases the degradation of cellulose-containing material compared to the absence of the polypeptide having cellulolytic enhancing activity.

[00203] In another embodiment, the present invention relates to methods for producing a fermentation product, comprising: (a) saccharifying a cellulose-containing material with an effective amount of a cellulolytic enzyme composition in the presence of an effective amount of a polypeptide having cellulolytic enhancing activity of the present invention, wherein the presence of the polypeptide having cellulolytic enhancing activity increases the degradation of cellulose-containing material compared to the absence of the polypeptide having cellulolytic enhancing activity; (b) fermenting the saccharified cellulose-containing material of step (a) with one or more fermenting microorganisms to produce the fermentation product; and (c) recovering the fermentation product from the fermentation.
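For the saccharification and fermentation steps (a) and (b) above, the upper bound on product can be worked out from stoichiometry alone: hydrolysis adds one water per anhydroglucose unit of cellulose, and fermentation converts one glucose into two ethanol and two carbon dioxide. The sketch below applies those mass ratios; the efficiency parameters and the function name are hypothetical knobs introduced for illustration, not values taken from this document.

```python
# Standard molar masses in g/mol.
M_ANHYDROGLUCOSE = 162.14  # C6H10O5, the repeating unit of cellulose
M_GLUCOSE = 180.16         # C6H12O6
M_ETHANOL = 46.07          # C2H5OH


def theoretical_ethanol_g(cellulose_g, saccharification_eff=1.0, fermentation_eff=1.0):
    """Ethanol mass obtainable from a given cellulose mass.

    Efficiencies are fractions in [0, 1]; 1.0 gives the stoichiometric maximum.
    """
    for eff in (saccharification_eff, fermentation_eff):
        if not 0.0 <= eff <= 1.0:
            raise ValueError("efficiencies must lie between 0 and 1")
    # Step (a): cellulose -> glucose (each unit gains one water on hydrolysis).
    glucose_g = cellulose_g * (M_GLUCOSE / M_ANHYDROGLUCOSE) * saccharification_eff
    # Step (b): glucose -> 2 ethanol + 2 CO2.
    return glucose_g * (2 * M_ETHANOL / M_GLUCOSE) * fermentation_eff


maximum = theoretical_ethanol_g(100.0)
realistic = theoretical_ethanol_g(100.0, saccharification_eff=0.8, fermentation_eff=0.9)
```

At full efficiency, 100 g of cellulose corresponds to roughly 111 g of glucose and about 57 g of ethanol; any gain in saccharification efficiency from a cellulolytic enhancing polypeptide scales the final ethanol mass proportionally.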

Food product industry

[00204] In one embodiment, the present invention relates to methods for preparing a food product comprising incorporating into the food product an effective amount of a polypeptide of the present invention. This can improve one or more properties of the food product relative to a food product in which the polypeptide is not incorporated. The phrase "incorporated into the food product" is defined herein as adding a polypeptide of the present invention to the food product, to any ingredient from which the food product is to be made, and/or to any mixture of food ingredients from which the food product is to be made. In other words, a polypeptide of the present invention may be added in any step of the food product preparation and may be added in one, two or more steps. The polypeptide of the present invention is added to the ingredients of a food product which can then be treated by methods including cooking, boiling, drying, frying, steaming or baking as is known in the art.

[00205] At least in the context of food products, the term "effective amount" is defined herein as an amount of the polypeptide (e.g., enzyme) of the present invention that is sufficient for providing a measurable effect on at least one property of interest of the food product. The term "improved property" is defined herein as any property of a food product which is improved by the action of a polypeptide (e.g., enzyme) of the present invention relative to a food product in which the polypeptide is not incorporated. The improved property may be determined by comparison of a food product prepared with and without addition of a polypeptide of the present invention. Organoleptic qualities may be evaluated using procedures well established in the food industry, and may include, for example, the use of a panel of trained taste-testers.

[00206] The polypeptides of the present invention may be prepared in any form suitable for the use in question, e.g., in the form of a dry powder, agglomerated powder, or granulate, in particular a non-dusting granulate, liquid, in particular a stabilized liquid, or protected enzyme such as described in WO01/11974 and WO02/26044. Granulates and agglomerated powders may be prepared by conventional methods, e.g., by spraying the enzyme according to the invention onto a carrier in a fluid-bed granulator. The carrier may consist of particulate cores having a suitable particle size. The carrier may be soluble or insoluble, e.g., a salt (such as NaCl or sodium sulphate), sugar (such as sucrose or lactose), sugar alcohol (such as sorbitol), starch, rice, corn grits, or soy. In an embodiment, the polypeptide of the present invention (and/or additional polypeptides/enzymes) may be contained in slow-release formulations. Methods for preparing slow-release formulations are well known in the art. Adding nutritionally acceptable stabilizers such as sugar, sugar alcohol, or another polyol, and/or lactic acid or another organic acid, according to established methods, may, for instance, stabilize liquid enzyme preparations.

[00207] In another embodiment, polypeptides of the present invention may also be incorporated in yeast-comprising compositions such as disclosed in EP-A-0619947, EP-A-0659344 and WO02/49441.

[00208] In another embodiment, one or more additional polypeptides/enzymes may be incorporated into a food product of the present invention. The additional enzyme may be of any origin, including mammalian and plant, and preferably of microbial (bacterial, yeast or fungal) origin and may be obtained by techniques conventionally used in the art. Enzymes may conveniently be produced in microorganisms. Microbial enzymes are available from a variety of sources; Bacillus species are a common source of bacterial enzymes, whereas fungal enzymes are commonly produced in Aspergillus species.

[00209] In specific embodiments, additional polypeptides/enzymes include starch degrading enzymes, xylanases, oxidizing enzymes, fatty material splitting enzymes, or protein-degrading, modifying or crosslinking enzymes. Starch degrading enzymes include endo-acting enzymes such as alpha-amylase, maltogenic amylase, pullulanase or other debranching enzymes, and exo-acting enzymes that cleave off glucose (amyloglucosidase), maltose (beta-amylase), maltotriose, maltotetraose and higher oligosaccharides. Suitable xylanases are for instance xylanases, pentosanases, hemicellulase, arabinofuranosidase, glucanase, cellulase, cellobiohydrolase, beta-glucosidase, and others. Oxidizing enzymes are for instance glucose oxidase, hexose oxidase, pyranose oxidase, sulfhydryl oxidase, lipoxygenase, laccase, polyphenol oxidases and others. Fatty material splitting enzymes are for instance triacylglycerol lipases, phospholipases (such as A1, A2, B, C and D) and galactolipases. Protein degrading, modifying or crosslinking enzymes are for instance endo-acting proteases (serine proteases, metalloproteases, aspartyl proteases, thiol proteases), exo-acting peptidases that cleave off one amino acid, or a dipeptide, tripeptide, etc., from the N-terminal (aminopeptidases) or C-terminal (carboxypeptidases) ends of the polypeptide chain, asparagine- or glutamine-deamidating enzymes such as deamidase and peptidoglutaminase, or crosslinking enzymes such as transglutaminase.

[00210] In other embodiments, additional polypeptides/enzymes can include: amylases, such as alpha-amylase (which can be useful for providing sugars that are fermentable by yeast) or beta-amylase; cyclodextrin glucanotransferase; peptidase (e.g., an exopeptidase, which can be useful in flavour enhancement); transglutaminase; lipase (which can be useful for the modification of lipids present in the food or food constituents); phospholipase; cellulase; hemicellulase; protein disulfide isomerase; peroxidase; laccase; or an oxidase (e.g., glucose oxidase, hexose oxidase, aldose oxidase, pyranose oxidase, lipoxygenase or L-amino acid oxidase).

[00211] In another embodiment, esterases of the present invention have a number of applications in the food industry including, but not limited to, degumming vegetable oils; improving the production of bread (e.g., in situ production of emulsifiers); producing crackers, noodles, and pasta; enhancing flavor development of cheese, butter, and margarine; ripening cheese; removing wax; trans-esterification of flavors and cocoa butter substitutes; synthesizing structured lipids for infant formula and nutraceuticals; improving the polyunsaturated fatty acid content in fish oil; and aiding in digestion and releasing minerals in food processing.

[00212] When one or more additional enzyme activities are to be added in accordance with the methods of the present invention, these activities may be added separately or together with the polypeptide according to the invention.

Detergent industry

[00213] In another aspect, polypeptides of the present invention can be useful in the detergent industry, e.g., for removal of carbohydrate-based stains from soiled laundry. Enzymes are used in detergents in order to improve their efficacy in removing most types of dirt. In some embodiments, esterases such as lipases of the present invention are particularly useful for removing fats and lipids.

Feed industry

[00214] In another aspect, polypeptides of the present invention can be useful in the feed enzyme industry, e.g., for increasing nutritional quality, digestibility and/or absorption of animal feed.

[00215] Feed enzymes have an important role to play in current farming systems, as they can increase the digestibility of nutrients, leading to greater efficiency in the production of animal products such as meat and eggs. At the same time, they can play a role in minimizing the environmental impact of increased animal production.

[00216] Non-starch polysaccharides (NSP) can increase the viscosity of the digesta which can, in turn, decrease nutrient availability and animal performance.

[00217] Endoxylanases and phytases are the best-known feed-enzyme products. Phytase enzymes hydrolyse phytic acid and release inorganic phosphate, thereby avoiding the need to add inorganic phosphates to the diet and reducing phosphorus excretion. Addition of xylanases to feed has also been shown to have positive effects on animal growth. Adding specific nutrients to feed improves animal digestion and thereby reduces feed costs. Many feed additives are currently in use, and new concepts are continuously being developed. The use of specific enzymes, such as non-starch carbohydrate degrading enzymes, could break down fiber, releasing energy and increasing protein digestibility through better accessibility of the protein once the fiber is broken down. In this way, feed costs could come down and the protein levels in the feed could also be reduced.

[00218] Non-starch polysaccharides (NSPs) are also present in virtually all feed ingredients of plant origin. NSPs are poorly utilized and can, when solubilized, exert adverse effects on digestion. Exogenous enzymes can contribute to a better utilization of these NSPs and as a consequence reduce any anti-nutritional effects. Accordingly, in a particular embodiment, hemicellulases and other polysaccharide-active polypeptides/enzymes of the present invention can be used for this purpose in cereal-based diets for poultry and, to a lesser extent, for pigs and other species.

[00219] In some embodiments, esterases of the present invention are useful in the feed industry such as for reducing the amount of phosphate in feed.

Pulp and paper

[00220] In another embodiment, xylanases of the present invention can be useful in the pulp and paper industry, e.g., for prebleaching of kraft pulp. Xylanases have been found to be most effective for that purpose. Xylanases attract increasing scientific and commercial attention due to applications in the pulp and paper industry for removal of hemicellulose from dissolving pulps or for enhancement of the bleachability of pulp and, thus, reduction of the use of environmentally harmful bleaching chemicals. A similar application of xylanases for pulp prebleaching is an already well-established technology and has greatly stimulated research on hemicellulases in the past decade. Although lignin-active peroxidases of the present invention may also be active in modification of lignin and hence have bleaching properties, such enzymes are generally less attractive for bleaching due to the need to use and recycle expensive redox mediators.

[00221] In a related embodiment, polypeptides such as xylanases of the present invention can be used to pre-bleach pulp to reduce the amount of bleaching chemicals needed to obtain a given brightness. It is suggested that xylanase depolymerises xylan blocks and increases accessibility or helps liberation of residual lignin by releasing xylan-chromophore fragments. When added to brownstock prior to bleaching, polypeptides such as xylanases of the present invention can save on bleaching chemicals. The enzymes hydrolyze surface xylans and are able to break linkages between hemicellulose and lignin. Other polypeptides (e.g., hemicellulase active enzymes) of the present invention which can break these linkages can function effectively in bleaching or pre-bleaching of pulp, and thus such uses are also within the scope of the present invention.

[00222] In some embodiments, esterases of the present invention are useful for the removal of triglycerides, steryl esters, resin acids, free fatty acids, and sterols (e.g., lipophilic wood extractives).

Other uses

[00223] In another embodiment, polypeptides such as xylanases of the present invention can be used in antibacterial formulations, as well as in pharmaceutical products such as throat lozenges, toothpastes, and mouthwash.

[00224] Chitin is a beta-(1,4)-linked polymer of N-acetyl D-glucosamine (GlcNAc), found as a structural polysaccharide in fungal cell walls as well as in the exoskeleton of arthropods and the outer shell of crustaceans. Approximately 75% of the total weight of shellfish is considered waste, and a large proportion of the material making up the waste is chitin. Accordingly, in one embodiment, polypeptides such as chitin-degrading enzymes of the present invention are useful in the modification and degradation of chitin, allowing the production of chitin-derived material, such as chitooligosaccharides and N-acetyl D-glucosamine, from chitin waste. In another embodiment, polypeptides such as chitinase enzymes of the present invention can be useful as antifungal agents.

[00225] In another embodiment, polypeptides of the present invention can be used in the textile industry (e.g., for the treatment of textile substrates). More particularly, cellulases (e.g., endo- and exocellulases and cellobiohydrolases) have gained importance in the treatment of cellulose-containing fibers. During the washing of indigo-dyed denim textiles, enzymatic treatment by a polypeptide of the present invention can be used in place of (or in addition to) a bleaching treatment to achieve a "used" look of jeans or other suitable fabrics. Polypeptides of the present invention can also improve the softness/feel of such fabrics. When used in textile detergent compositions, enzymes of the present invention can enhance cleaning ability or act as a softening agent. In another embodiment, polypeptides such as cellulases of the present invention can be used in combination with polymeric agents in processes for providing a localized variation in the color density of fibers.

[00226] In another embodiment, polypeptides of the present invention can be used in the waste treatment industry (e.g., for changing the characteristics of the waste to become more amenable to further treatment and/or for bio-conversion to value-added products). Polypeptides such as lipases, cellulases, amylases, and proteases of the present invention can be used in addition to microorganisms to break down polymeric substances like proteins, polysaccharides and lipids, thereby facilitating this process.

[00227] In another embodiment, polypeptides of the present invention can be used in industries such as biocatalysis; sewage treatment; cleaning up oil pollution; the synthesis of fragrances; and enhancing the recovery of oil (e.g., during drilling).

[00228] Other uses of the polynucleotides and polypeptides of the present invention would be apparent to a person of skill in the art in view of the sequences and biological activities disclosed herein. These other uses, even though not explicitly mentioned here, are nevertheless within the scope of the present invention.

Diagnostic, classification and research tools

[00229] In another embodiment, the polynucleotides, polypeptides and antibodies of the present invention can be useful as diagnostic and classification tools. In this regard, it would be within the capacities of a person of skill in the art to search existing sequence databases and perform a phylogenetic analysis based on the nucleic acid and amino acid sequences disclosed herein. Furthermore, designing hybridization probes or primers that are specific for a particular genus, species or strain (e.g., the genus, species, or strain from which the sequences disclosed herein were derived) would be within the grasp of a skilled person, in view of the sequence information disclosed herein. Similarly, a skilled person would be able to select an epitope of a polypeptide of the present invention which is specific for a particular genus, species or strain (e.g., the genus, species, or strain from which the sequences disclosed herein were derived) and generate an antibody or binding agent that binds specifically thereto.

[00230] Such tools are useful, for example, in diagnostic methods for detecting the presence or absence of a particular organism (e.g., the organism from which the sequences disclosed herein were derived) in a sample; as research tools (e.g., for designing and producing microarrays for studying fungal gene expression); or for rapidly classifying an organism of interest based on the detection of a sequence or polypeptide specific for that organism. The skilled person would recognize that knowledge of the precise (biological) function or protein activity of a polypeptide of the present invention is not absolutely necessary for the aforementioned tools to be useful for diagnostic, research, or classification purposes. Sequences that are particularly useful in this regard are the genomic, coding and amino acid sequences corresponding to the polypeptides of the present invention annotated as "unknown" in Tables 1A-1C (as well as their corresponding exons and introns defined in Tables 2A-2C, where available). These sequences show little sequence identity with those in the art and thus can be useful as markers for identifying the organisms from which the sequences of the present invention were derived. The skilled person would know how to search various sequence databases to design specific hybridization oligonucleotides (e.g., probes and primers), as well as produce antibodies that specifically bind to the aforementioned sequences.

[00231] In some embodiments, the present invention relates to a method for identifying and/or classifying an organism (e.g., a fungal species) based on a biological sample, the method comprising detecting the presence or absence of any one of the polynucleotides or polypeptides of the present invention (e.g., those recited in the preceding paragraph) and determining that said organism is present or classifying said organism based on the presence of the polynucleotide or polypeptide. In some embodiments, the detecting step can be carried out using one or more oligonucleotides or antibodies of the present invention. In some embodiments, the detecting step can be carried out by performing an amplification and/or hybridization reaction.
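As a minimal in-silico sketch of the detecting step described above, the following Python snippet scans a sample sequence for an oligonucleotide probe on either strand. The function names and the sequences used in the usage note are hypothetical placeholders, not sequences from the Sequence Listing, and exact string matching is a simplifying assumption that merely stands in for an actual hybridization or amplification reaction.

```python
# Hypothetical illustration only: exact-match probe search on both strands.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_hits(sample, probe):
    """Return sorted (1-based position, strand) pairs for exact probe matches."""
    sample = sample.upper()
    probe = probe.upper()
    hits = []
    for strand, query in (("+", probe), ("-", reverse_complement(probe))):
        start = sample.find(query)
        while start != -1:
            hits.append((start + 1, strand))
            start = sample.find(query, start + 1)
    return sorted(hits)


def organism_detected(sample, probes):
    """Report presence if at least one marker probe matches the sample."""
    return any(probe_hits(sample, probe) for probe in probes)
```

For example, with the made-up sample `"TTGACCGTAAGGCTT"`, the probe `"ACCGTA"` is found on the forward strand and the probe `"AGCCTT"` on the reverse strand, so `organism_detected` would report presence for either probe.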

[00232] In Tables 1A-1C below, the skilled person will recognize that although the precise protein activity of a polypeptide of the present invention may not be known per se (e.g., in the case of proteins of the present invention labelled as "unknown" in Tables 1A-1C), the polypeptide may be nevertheless useful for carrying out an industrial process (e.g., cellulase-enhancing, cellulose-degrading, hemicellulose-degrading, cellulolysis-enhancing, lignocellulolysis-enhancing, and other biological functions listed in Tables 1A-1C). In this regard, proteins labelled herein as "unknown" comprise proteins whose precise enzymatic activities may not be deducible from sequence comparisons, but that are nevertheless identified as interesting targets for industrial applications for other reasons (e.g., their expression is induced by growth under certain conditions such as in the presence of cellulosic and/or lignocellulosic biomass).

In the above Table, an asterisk (*) refers to sequences (SEQ ID NOs: 161-178) for which, in the Sequence Listing, the nucleic acid sequences for these genomic sequences have been assigned an arbitrary number of bases "n" as placeholders.

Table 1B: Biomass degrading genes and polypeptides of Thermomyces stellatus

Table 1C: Biomass degrading genes and polypeptides of Corynascus sepedonium

Table 2A: List of genes of Chaetomium thermophilum with reference to exon boundaries

Chath2p7_003224 40 1008 1..1008

Chath2p7_003300 41 1370 1..796, 970..1048, 1226..1244, 1320..1370

Chath2p7_003533 42 1451 1..193, 247..1451

Chath2p7_003618 43 1674 1..1581, 1666..1674

Chath2p7_003689 44 1041 1..1041

Chath2p7_003933 45 989 1..173, 224..989

Chath2p7_003968 46 831 1..688, 755..831

Chath2p7_004674 47 1609 1..921, 1001..1274, 1332..1609

Chath2p7_004722 48 1053 1..1053

Chath2p7_004791 49 1381 1..297, 347..532, 602..1120, 1184..1381

Chath2p7_004815 50 1188 1..1188

Chath2p7_004828 51 1020 1..1020

Chath2p7_004831 52 2455 1..1844, 1909..2455

Chath2p7_004839 53 1395 1..113, 174..675, 731..1166, 1223..1395

Chath2p7_004844 54 2361 1..2361

Chath2p7_005032 55 1942 1..150, 274..1443, 1514..1942

Chath2p7_005180 56 1201 1..634, 696..1097, 1155..1201

Chath2p7_005456 57 819 1..55, 105..464, 533..819

Chath2p7_005508 58 2754 1..2754

Chath2p7_005599 59 1117 1..602, 667..1117

Chath2p7_005742 60 1490 1..891, 957..1104, 1177..1490

Chath2p7_005799 61 1867 1..862, 912..1867

Chath2p7_005820 62 1530 1..1530

Chath2p7_005825 63 988 1..301, 360..988

Chath2p7_006054 64 2820 1..559, 653..2820

Chath2p7_006385 65 1728 1..502, 566..1728

Chath2p7_006459 66 964 1..229, 292..794, 854..964

Chath2p7_006554 67 1951 1..341, 422..1718, 1778..1951

Chath2p7_006594 68 1938 1..123, 223..1938

Chath2p7_006622 69 843 1..509, 572..68Θ, 757..843

Chath2p7_006817 70 465 1..465

Chath2p7_006958 71 1304 1..99, 161..317, 370..1304

Chath2p7_007187 72 868 1..66, 129..433, 499..868

Chath2p7_007459 73 1605 1..1605

Chath2p7_007461 74 1197 1..374, 420..1197

Chath2p7_007522 75 1362 1..213, 269..319, 364..438, 4G2..555, 636..S44, 898..1025, 1302..1362

Chath2p7_007524 76 2824 1..49, 102..265, 323..813, 867..1206, 1261..1750, 1801..2066, 2123..2216, 2292..2524, 2579..2824

Chath2p7_007635 77 1413 1..183, 252..J93, 855..1413

Chath2p7_007992 78 1354 1..1236, 1301..1354

Chath2p7_008014 79 2081 1..159, 217..425, 477..878, 935..2081

Chath2p7_008143 80 1219 1..284, 344..417, 477..518, 592..809, 893..1219

Chath2p7_008265 81 1524 1..1524

Chath2p7_008316 82 1407 1..250, 307..392, 443..1025, 1079..1407

Chath2p7_008371 83 714 1..714

Chath2p7_008693 84 2425 1..292, 375..57Θ, 648..732, 794..920, 985..1437, 1523..2298, 2351..2425

Chath2p7_008868 85 1185 1..547, 614..1185

Chath2p7_008953 86 1420 1..126, 189..288, 347..673, 736..1270, 1336..1420

Chath2p7_008975 87 1291 1..1079, 1140..1215, 1271..1291

Chath2p7_009128 88 3064 1..78, 401..555, 636..1024, 1083..3064

Chath2p7_009141 89 1087 1..246, 302..970, 1034..1087

Chath2p7_009361 90 1061 1..31, 103..225, 427..1061

Chath2p7_009566 91 1663 1..372, 437..1663

Chath2p7_009583 92 1873 1..51, 119..1873

Chath2p7_009602 93 1567 1..573, 624..760, 808..1264, 1320..1450, 1510..1567

Chath2p7_009651 94 687 1..332, 384..687

Chath2p7_009773 95 1846 1..72, 156..709, 759..1029, 1076..1358, 1410..1644, 1696..1846

Chath2p7_009774 96 739 1..504, 560..739

Chath2p7_009786 97 2207 1..202, 254..286, 336..601, 653..779, 829..1296, 1346..2121, 2178..2207

Chath2p7_009888 98 749 1..397, 466..749

Chath2p7_009900 99 1401 1..787, 893..1401

Chath2p7_010016 100 1695 1..1421, 1485..1695

Chath2p7_010073 101 2004 1..258, 309..384, 445..526, 603..2004

Chath2p7_010168 102 1228 1..294, 360..579, 690..1228

Chath2p7_010170 103 1649 1..1437, 1500..1649

Chath2p7_010593 104 773 1..209, 272..773

Chath2p7_011143 105 774 1..774

Chath2p7_011156 106 1181 1..67, 140..302, 360..464, 521..1181

Chath2p7_011414 107 747 1..49, 104..440, 501..747

Chath2p7_011764 108 1172 1..381, 444..623, 680..943, 1001..1099, 1164..1172

Chath2p7_011785 109 844 1..418, 486..844

Chath2p7_011979 110 2532 1..351, 415..501, 559..867, 964..2532

Chath2p7_011987 111 825 1..317, 396..825

Chath2p7_012005 112 1489 1..210, 275..444, 503..675, 744..1166, 1269..1333, 1406..1489

Chath2p7_012009 113 1707 1..1587, 1675..1707

Chath2p7_012118 114 1341 1..1341

Chath2p7_012361 115 720 1..298, 353..720

Chath2p7_012793 116 2588 1..245, 300..456, 510..616, 669..949, 1008..1128, 1179..1396, 1452..2323, 2376..2482, 2540..2588

Chath2p7_012814 117 1139 1..1009, 1069..1139

Chath2p7_013173 118 641 1..159, 225..641

Chath2p7_013454 119 1239 1..252, 311..946, 1018..1239

Chath2p7_013485 120 1080 1..1080

Chath2p7_013764 121 1740 1..1740

Chath2p7_013990 122 2940 1..49, 110..294, 354..1169, 1221..1838, 1891..1984, 2037..2085, 2138..2229, 2289..2521, 2578..2940

Chath2p7_014000 123 1362 1..868, 932..1362

Chath2p7_014091 124 1123 1..99, 163..993, 1061..1123

Chath2p7_014309 125 2456 1..142, 199..2456

Chath2p7_014334 126 828 1..55, 121..465, 542..828

Chath2p7_014403 127 1212 1..95, 158..314, 802..1212

Chath2p7_014416 128 393 1..62, 132..393

Chath2p7_014439 129 2765 1..323, 390..1263, 1316..1682, 1735..2765

Chath2p7_014472 130 743 1..257, 326..743

Chath2p7_014590 131 430 1..210, 262..340, 399..430

Chath2p7_014688 132 1073 1..270, 320..913, 969..1073

Chath2p7_014993 133 1409 1..204, 269..463, 525..1070, 1131..1409

Chath2p7_014997 134 1834 1..405, 497..786, 871..932, 1017..1095, 1160..1269, 1338..1556, 1629..1834

Chath2p7_015193 135 630 1..630

Chath2p7_015220 136 1014 1..1014

Chath2p7_015251 137 1097 1..444, 506..949, 1002..1097

Chath2p7_015259 138 2845 1..49, 103..266, 319..809, 860..1199, 1255..1744, 1796..2061, 2117..2210, 2287..2519, 2600..2845

Chath2p7_015439 139 1317 1..1317

Chath2p7_015442 140 2176 1..226, 301..1139, 1212..1233, 1285..2060, 2120..2176

Chath2p7_015669 141 1452 1..118, 182..619, 671..1452

Chath2p7_015680 142 1328 1..151, 228..486, 545..1328

Chath2p7_015688 143 1412 1..392, 449..1412

Chath2p7_015690 144 767 1..49, 111..387, 443..767

Chath2p7_015986 145 1682 1..153, 220..678, 758..Θ62, 1048..1402, 1490..1682

Chath2p7_016024 146 1904 1..199, 259..434, 485..759, 818..1904

Chath2p7_017015 147 2559 1..2559

Chath2p7_017371 148 907 1..401, 457..907

Chath2p7_017495 149 426 1..426

Chath2p7_017500 150 1604 1..306, 366..1604

Chath2p7_017827 151 1372 1..189, 254..583, 650..1372

Chath2p7_018195 152 3291 1..221, 282..2506, 2588..3291

Chath2p7_018489 153 783 1..314, 509..626, 694..783

Chath2p7_018847 154 798 1..798

Chath2p7_018891 155 2616 1..79, 143..220, 278..351, 404..1890, 1950..2017, 2099..2616

Chath2p7_019218 156 1243 1..938, 1027..1243

Chath2p7_019465 157 1867 1..148, 208..841, 896..1187, 1244..1867

Chath2p7_019607 158 1088 1..269, 326..950, 1014..1088

Chath2p7_019694 159 1936 1..499, 644..758, 816..931, 987..1936

Chath2p7_020025 160 354 1..354

CHATH_1_00427 161* n/a n/a

CHATH_1_00626 162* n/a n/a

CHATH_1_01050 163* n/a n/a

CHATH_1_02120 164* n/a n/a

CHATH_1_02778 165* n/a n/a

CHATH_1_02779 166* n/a n/a

CHATH_1_03001 167* n/a n/a

CHATH_1_03122 168* n/a n/a

CHATH_1_04430 169* n/a n/a

CHATH_1_06969 170* n/a n/a

CHATH_1_06970 171* n/a n/a

CHATH_1_08154 172* n/a n/a

CHATH_1_08418 173* n/a n/a

CHATH_1_09728 174* n/a n/a

CHATH_1_10102 175* n/a n/a

CHATH_1_10326 176* n/a n/a

CHATH_1_10523 177* n/a n/a

CHATH_1_10763 178* n/a n/a

In the above Table, an asterisk (*) refers to sequences (SEQ ID NOs: 161-178) for which, in the Sequence Listing, the nucleic acid sequences for SEQ ID NOs: 161-178 have been assigned an arbitrary number of bases "n" as placeholders; "n/a": not available.
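The exon boundary lists in Table 2A lend themselves to simple processing. The sketch below is illustrative only: it assumes that each `start..end` pair denotes 1-based, inclusive positions within the genomic sequence, and the function names are hypothetical. It parses one boundary list, derives the intervening intron ranges, and sums the exon lengths, using the row for Chath2p7_003300 as input.

```python
def parse_exons(spec):
    """Parse 'start..end, start..end' into a list of (start, end) tuples."""
    exons = []
    for part in spec.split(","):
        start, end = part.strip().split("..")
        exons.append((int(start), int(end)))
    return exons


def intron_ranges(exons):
    """Return the gaps between consecutive exons (assumed to be introns)."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


def spliced_length(exons):
    """Total number of positions covered by exons (inclusive coordinates)."""
    return sum(end - start + 1 for start, end in exons)


# Exon boundaries listed in Table 2A for Chath2p7_003300 (SEQ ID NO: 41).
exons = parse_exons("1..796, 970..1048, 1226..1244, 1320..1370")
print(intron_ranges(exons))   # [(797, 969), (1049, 1225), (1245, 1319)]
print(spliced_length(exons))  # 945
```

Under these assumptions the four exons of this row span 945 positions, which is a multiple of three, as would be expected if the listed exons together form an uninterrupted coding sequence.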

Table 2B: List of genes of Thermomyces stellatus with reference to exon boundaries

Thest2p7_001570 552 771 1..630, 685..771

Thest2p7_001581 553 5368 1..327, 402..1257, 1339..1358, 1429..1478, 1547..1607, 1668..1774, 1854..1966, 2040..2055, 2121..2319, 2398..24Θ4, 2549..2930, 2994..3157, 3221..3644, 3783..3893, 3994..4783, 4859..5020, 5078..5368

Thest2p7_001623 554 2020 1..435, 487..605, 658..8Θ5, 920..1405, 1455..1793, 1850..2020

Thest2p7_001625 555 2210 1..497, 557..2210

Thest2p7_001649 556 740 1..165, 258..740

Thest2p7_001651 557 1743 1..1743

Thest2p7_001749 558 774 1..441, 526..774

Thest2p7_001829 559 889 1..365, 421..889

Thest2p7_001830 560 1784 1..474, 534..1784

Thest2p7_001836 561 954 1..954

Thest2p7_002092 562 1641 1..177, 247..345, 425..508, 591..629, 738..1491, 1601..1641

Thest2p7_002139 563 1963 1..173, 224..415, 475..1963

Thest2p7_002230 564 997 1..52, 127..200, 260..304, 355..531, 596..884, 945..997

Thest2p7_002586 565 1165 1..94, 150..1165

Thest2p7_002843 566 1449 1..1449

Thest2p7_002918 567 1074 1..1074

Thest2p7_002923 568 1041 1..619, 677..1041

Thest2p7_003148 569 922 1..383, 436..496, 561..813, 879..922

Thest2p7_003297 570 885 1..885

Thest2p7_003503 571 1140 1..1140

Thest2p7_003688 572 3850 1..315, 568..1001, 1083..2386, 2475..3850

Thest2p7_003868 573 878 1..92, 146..196, 248..613, 695..878

Thest2p7_003907 574 3464 1..233, 287..380, 4Θ8..875, 944..1085, 1146..1210, 1261..1388, 1466..1913, 1982..2423, 2518..2802, 2873..3107, 3182..3464

Thest2p7_003915 575 1076 1..270, 327..917, 972..1076

Thest2p7_004110 576 1572 1..791, 885..1246, 1301..1572

Thest2p7_004216 577 1461 1..43, 102..943, 997..1461

Thest2p7_004468 578 2032 1..165, 229..492, 545..605, 660..755, 807..855, 912..1508, 1576..2032

Thest2p7_004537 579 1484 1..1078, 1138..1484

Thest2p7_004666 580 1466 1..888, 944..1091, 1159..1466

Thest2p7_004749 581 1938 1..141, 211..1389, 1510..1938

Thest2p7_004778 582 1172 1..462, 521..758, 814..1016, 1077..1172

Thest2p7_004791 583 1281 1..1281

Thest2p7_005041 584 1685 1..565, 655..1685

Thest2p7_005042 585 2889 1..97, 230..255, 337..503, 565..650, 712..952, 1037..1084, 1146..1258, 1321..1453, 1503..2889

Thest2p7_005043 586 1644 1..1644

Thest2p7_005075 587 2068 1..352, 413..490, 542..652, 724..1109, 1862..2068

Thest2p7_005076 588 1787 1..1567, 1630..1787

Thest2p7_005223 589 1968 1..14, 75..154, 214..495, 564..584, 643..849, 908..1031, 1090..1348, 1433..1521, 1582..1697, 1752..1867, 1927..1968

Thest2p7_005229 590 1440 1..354, 419..729, 805..833, 895..1085, 1162..1440

Thest2p7_005230 591 1977 1..1977

Thest2p7_005266 592 892 1..101, 166..733, 791..892

Thest2p7_005270 593 1781 1..383, 464..1781

Thest2p7_005361 594 972 1..159, 250..972

Thest2p7_005404 595 2097 1..1735, 1793..1994, 2049..2097

Thest2p7_005699 596 864 1..160, 231..360, 423..864

Thest2p7_005700 597 2685 1..2685

Thest2p7_005709 598 1760 1..173, 249..852, 925..1080, 1158..1760

Thest2p7_005710 599 1580 1..102, 211..282, 356..541, 619..799, 873..1072, 1128..1580

Thest2p7_005722 600 2738 1..666, 740..921, 1381..1933, 2658..2738

Thest2p7_005879 601 3268 1..503, 565..1184, 1269..1420, 1489..1554, 1634..3268

Thest2p7_006119 602 1006 1..252, 357..910, 967..1006

Thest2p7_006186 603 1917 1..1917

Thest2p7_006261 604 1736 1..600, 678..1736

Thest2p7_006535 605 1542 1..294, 387..776, 838..1041, 1097..1257, 1311..1542

Thest2p7_006558 606 1199 1..101, 188..467, 564..852, 938..1031, 1160..1199

Thest2p7_006632 607 992 1..166, 231..361, 429..719, 834..992

Thest2p7_006683 608 2099 1..1757, 1812..1928, 1988..2099

Thest2p7_006748 609 1670 1..107, 166..470, 549..1257, 1313..1670

Thest2p7_006787 610 1449 1..451, 509..1449

Thest2p7_007047 611 1245 1..1245

Thest2p7_007121 612 1125 1..964, 1037..1125

Thest2p7_007280 613 1625 1..430, 519..646, 705..714, 782..1031, 1181..1398, 1462..1625

Thest2p7_007281 614 1725 1..71, 141..550, 623..776, 837..861, 923..1181, 1238..1260, 1404..1447, 1536..1725

Thest2p7_007287 615 1672 1..244, 322..1463, 1538..1672

Thest2p7_007530 616 1574 1..248, 331..503, 583..787, 906..1201, 1327..1574

Thest2p7_007631 617 2661 1..2661

Thest2p7_007655 618 1038 1..89, 160..580, 658..1038

Thest2p7_007691 619 2098 1..32, 103..182, 275..370, 441..736, 806..2098

Thest2p7_007771 620 800 1..66, 126..800

Thest2p7_007802 621 1945 1..256, 312..400, 463..817, 870..1024, 1077..1186, 1240..1327, 1387..1498, 1557..1945

Thest2p7_008006 622 2359 1..129, 180..758, 818..1007, 1062..1499, 1562..1707, 1775..1882, 1940..2359

Thest2p7_008162 623 1122 1..93, 147..296, 359..1015, 1072..1122

Thest2p7_008352 624 1436 1..127, 191..637, 751..1436

Thest2p7_008445 625 1641 1..328, 438..749, 839..1641

Thest2p7_008458 626 642 1..642

Thest2p7_008537 627 2793 1..2793

Thest2p7_008558 628 1374 1..285, 353..464, 536..637, 701..762, 835..1374

Thest2p7_008591 629 1547 1..359, 419..1547

Thest2p7_008646 630 3404 1..461, 516..614, 680..3404

Thest2p7_008870 631 1003 1..305, 430..798, 859..1003

Thest2p7_009204 632 784 1..49, 118..475, 538..784

Thest2p7_009266 633 1905 1..358, 471..595, 683..715, 779..869, 951..1161, 1299..1539, 1846..1905

Thest2p7_009285 634 1623 1..785, 847..1042, 1108..1623

Thest2p7_009297 635 1566 1..93, 155..198, 250..637, 690..854, 904..1566

Thest2p7_009308 636 1173 1..1173

Thest2p7_009315 637 890 1..366, 423..521, 577..630, 690..890

Thest2p7_009317 638 1161 1..100, 152..344, 471..518, 651..922, 986..1161

Thest2p7_009331 639 654 1..654

Thest2p7_009474 640 2030 1..750, 948..2030

Thest2p7_009527 641 1092 1..521, 588..1092

Thest2p7_009765 642 1579 1..626, 736..1579

Thest2p7_009768 643 1292 1..290, 375..727, 808..1292

Thest2p7_009881 644 1190 1..116, 175..505, 568..603, 666..1190

Thest2p7_009893 645 1363 1..49, 126..1363

Thest2p7_010132 646 2040 1..165, 213..406, 455..656, 714..910, 960..2040

Thest2p7_010134 647 824 1..114, 167..601, 663..824

Thest2p7_010183 648 1648 1..363, 434..1648

Thest2p7_010210 649 1875 1..1875

Thest2p7_010479 650 762 1..762

Thest2p7_010616 651 1443 1..1443

Thest2p7_010785 652 2331 1..2331

Thest2p7_010806 653 1235 1..219, 292..927, 1002..1235

Thest2p7_010820 654 702 1..702

Thest2p7_010821 655 1050 1..46, 103..336, 423..641, 713..771, 826..1050

Thest2p7_010886 656 945 1..945

Thest2p7_010929 657 1566 1..1566

Thest2p7_010947 658 730 1..165, 248..730

Thest2p7_011092 659 821 1..345, 422..577, 648..821

Thest2p7_011104 660 1835 1..505, 558..713, 769..1835

Thest2p7_011166 661 1587 1..123, 225..324, 447..773, 832..1342, 1413..1587

Thest2p7_011272 662 807 1..36, 127..807

Thest2p7_011337 663 1039 1..299, 366..439, 503..532, 591..688, 751..864, 928..975, 1028..1039

Thest2p7_011561 664 3988 1..49, 119..279, 342..885, 964..1558, 1860..1942, 2026..2393, 2453..3308, 3372..3607, 3668..3988

Thest2p7_011566 665 1164 1..100, 164..201, 259..388, 458..560, 670..793, 874..954, 1025..1088, 1142..1164

Thest2p7_011567 666 1340 1..154, 390..496, 551..579, 643..716, 784..828, 885..988, 1061..1174, 1232..1279, 1329..1340

Thest2p7_011649 667 2317 1..150, 222..371, 529..687, 745..941, 1004..1545, 1610..1726, 1793..1882, 1943..1954, 2067..2317

Thest2p7_011848 668 840 1..55, 142..480, 557..840

Thest2p7_012040 669 1483 1..763, 825..973, 1028..1483

Thest2p7_012157 670 897 1..897

Thest2p7_012185 671 1409 1..89, 273..518, 585..592, 658..732, 784..1409

Thest2p7_012221 672 831 1..266, 341..389, 466..831

Thest2p7_012411 673 786 1..786

Thest2p7_012412 674 3077 1..1286, 1356..1699, 1756..3077

Thest2p7_012645 675 751 1..43, 110..263, 331..751

Thest2p7_012757 676 1549 1..297, 360..573, 641..716, 774..1168, 1227..1549

Thest2p7_012758 677 1972 1..10, 74..352, 426..514, 5S8..942, 1068..1266, 1330..1417, 1483..1594, 1668..1972

Thest2p7_012760 678 2104 1..982, 1075..1317, 1422..2104

Thest2p7_012774 679 1605 1..1605

Thest2p7_012956 680 800 1..249, 325..497, 560..800

Thest2p7_012959 681 1646 1..3, 99..146, 502..656, 761..1646

Thest2p7_013220 682 1105 1..286, 334..451, 508..1105

Thest2p7_013232 683 871 1..650, 708..739, 822..871

Thest2p7_013244 684 1348 1..102, 182..344, 414..1348

Thest2p7_013279 685 1086 1..1086

Thest2p7_013283 686 2818 1..49, 102..280, 328..1642, 1690..1811, 1863..1956, 2003..2051, 2105..2196, 2249..2481, 2529..2571, 2625..2818

Thest2p7_013591 687 741 1..741

Thest2p7_013959 688 1292 1..516, 579..1292

Thest2p7_014002 689 1423 1..104, 189..342, 408..493, 567..882, 1010..1423

Thest2p7_014025 690 1795 1..641, 714..1137, 1319..1795

Thest2p7_014082 691 770 1..615, 684..770

Thest2p7_014236 692 726 1..726

Thest2p7_014244 693 852 1..52, 108..286, 352..852

Thest2p7_014403 694 380 1..147, 306..380

Thest2p7_014416 695 2425 1..124, 201..2425

Thest2p7_014498 696 1453 1..248, 305..371, 426..448, 503..537, 588..829, 879..953, 1006..1024, 1079..1318, 1374..1453

Thest2p7_014570 697 753 1..429, 490..753

Thest2p7_014679 698 1433 1..601, 667..1180, 1298..1433

Thest2p7_014727 699 580 1..245, 298..580

Thest2p7_014788 700 1907 1..1328, 1432..1658, 1729..1907

Thest2p7_014789 701 845 1..393, 492..845

Thest2p7_014832 702 1035 1..1035

Thest2p7_014851 703 723 1..723

Thest2p7_014938 704 1035 1..1035

Thest2p7_014992 705 2555 1..49, 131..401, 486..1709, 1815..1915, 2017..2555

Thest2p7_015009 706 360 1..251, 354..360

Thest2p7_015083 707 3673 1..470, 895..3673

Thest2p7_015128 708 1692 1..1692

Thest2p7_015136 709 1893 1..370, 477..710, 831..1037, 1104..1206, 1290..1666, 1775..1893

Thest2p7_015239 710 1233 1..414, 505..1233

Thest2p7_015325 711 1015 1..107, 164..220, 295..714, 835..1015

Thest2p7_015512 712 1345 1..678, 797..1345

Thest2p7_015642 713 1388 1..294, 382..732, 846..1388

Thest2p7_015683 714 1112 1..73, 141..172, 383..548, 617..747, 819..1112

Thest2p7_015981 715 1016 1..305, 430..798, 872..1016

Thest2p7_016051 716 1518 1..49, 118..770, 866..926, 989..1518

Thest2p7_016235 717 1800 1..1800

Thest2p7_016288 718 555 1..555

Thest2p7_016292 719 1382 1..321, 397..841, 937..1382

Thest2p7_016339 720 2187 1..888, 961..1018, 1074..1281, 1347..1636, 2093..2187

Thest2p7_016371 721 769 1..287, 370..769

Thest2p7_016435 722 2512 1..275, 331..1090, 1145..1255, 1307..2512

Thest2p7_016616 723 835 1..305, 394..835

Thest2p7_016778 724 1309 1..1039, 1122..1309

Thest2p7_016899 725 1396 1..342, 410..642, 721..1396

Thest2p7_017059 726 1920 1..1920

Thest2p7_017281 727 2346 1..2346

Thest2p7_017550 728 1245 1..1245

Thest2p7_017554 729 672 1..204, 288..521, 565..672

Thest2p7_017599 730 1197 1..1197

Thest2p7_017688 731 1674 1..460, 518..1674

Thest2p7_017753 732 868 1..70, 141..502, 608..868

Thest2p7_017761 733 925 1..443, 529..749, 84Θ..925

Thest2p7_017954 734 2744 1..105, 158..523, 585..2744

Thest2p7_018034 735 1535 1..285, 353..1079, 1201..1535

Thest2p7_018050 736 1951 1..590, 660..940, 1017..1951

Thest2p7_018059 737 1212 1..1212

Thest2p7_018180 738 1672 1..452, 529..1063, 1172..1672

Thest2p7_018185 739 1070 1..243, 325..851, 924..9S6, 1022..1070

Thest2p7_018230 740 1357 1..597, 683..1357

Thest2p7_018675 741 1262 1..88, 166..331, 413..517, 602..1262

Thest2p7_018953 742 1575 1..1575

Thest2p7_018954 743 1658 1..399, 468..1658

Thest2p7_019041 744 2147 1..130, 187..2147

Thest2p7_019045 745 1291 1..799, 859..951, 1083..1291

Thest2p7_019047 746 997 1..99, 329..646, 740..997

Thest2p7_019074 747 1854 1..1854

Thest2p7_019255 748 993 1..993

Thest2p7_019299 749 1309 1..1200, 1262..1309

Thest2p7_019414 750 1005 1..106, 173..413, 478..746, 866..1005

Thest2p7_019575 751 819 1..819

Thest2p7_019785 752 2371 1..218, 300..386, 433..2371

Thest2p7_019791 753 2879 1..135, 219..1390, 1451..2879

Thest2p7_020000 754 2596 1..337, 391..634, 695..767, 823..1131, 1182..1523, 1589..2596

Thest2p7_020087 755 987 1..987

Thest2p7_020132 756 1118 1..44, 134..597, 745..1118

Thest2p7_020304 757 1069 1..468, 563..1069

Thest2p7_020395 758 2184 1..248, 330..657, 728..1152, 1224..1320, 1370..1799, 1862..2184

Thest2p7_020517 759 573 1..154, 208..277, 330..573

Thest2p7_020601 760 2327 1..67, 126..1838, 1921..2327

Thest2p7_020633 761 841 1..204, 268..479, 556..841

Thest2p7_020634 762 1068 1..1068

Thest2p7_020681 763 1467 1..1467

Thest2p7_020751 764 1519 1..279, 345..533, 592..772, 820..1019, 1073..1519

Thest2p7_020761 765 1198 1..192, 269..480, 529..1198

Thest2p7_020795 766 1139 1..313, 383..Θ86, 1073..1139

Thest2p7_020803 767 501 1..71, 207..501

Thest2p7_020856 768 1917 1..1917

Thest2p7_021004 769 1232 1..438, 519..753, 861..1063, 1122..1232

Thest2p7_021008 770 1584 1..606, 670..974, 1045..1291, 1396..1584

Thest2p7_021291 771 1314 1..1314

Thest2p7_021312 772 420 1..420

Thest2p7_021342 773 818 1..285, 369..554, 642..818

Thest2p7_021707 774 2538 1..2538

Thest2p7_021720 775 1047 1..1047

Thest2p7_021739 776 1950 1..360, 420..817, 882..1435, 1529..1950

Thest2p7_021798 777 2077 1..264, 346..589, 645..1159, 1208..1834, 1952..2077

Thest2p7_021822 778 645 1..313, 374..645

Thest2p7_021871 779 1438 1..183, 236..1438

Thest2p7_021880 780 903 1..72, 128..327, 426..903

Thest2p7_021892 781 1307 1..296, 363..455, 518..1307

Thest2p7_021917 782 648 1..648
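Rows of Table 2B can be sanity-checked mechanically. The sketch below is a hypothetical helper, not part of the disclosed subject matter. It assumes that each row reads "gene identifier, SEQ ID NO, sequence length, exon boundaries", that exon coordinates are 1-based and inclusive, and that the first exon starts at position 1 and the last exon ends at the stated length, as is the case for the rows listed above. Rows that break these assumptions are reported rather than corrected.

```python
import re

# Assumed row layout: gene id, SEQ ID NO, length, exon boundary list.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(.+)$")
EXON = re.compile(r"(\d+)\.\.(\d+)")


def check_row(row):
    """Return a list of problems found in one table row (empty if none)."""
    match = ROW.match(row.strip())
    if match is None:
        return ["row does not match the assumed layout"]
    length = int(match.group(3))
    problems, exons = [], []
    for part in match.group(4).split(","):
        piece = EXON.fullmatch(part.strip())
        if piece is None:
            problems.append("unparseable exon %r" % part.strip())
        else:
            exons.append((int(piece.group(1)), int(piece.group(2))))
    if problems:
        return problems
    if exons[0][0] != 1:
        problems.append("first exon does not start at position 1")
    if any(start > end for start, end in exons):
        problems.append("an exon has start greater than end")
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if next_start <= prev_end:
            problems.append("exons overlap or are out of order")
    if exons[-1][1] != length:
        problems.append("last exon does not end at the stated length")
    return problems
```

For instance, `check_row("Thest2p7_001649 556 740 1..165, 258..740")` returns an empty list, whereas a row containing a non-numeric coordinate is flagged as unparseable.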

Table 2C: List of genes of Corynascus sepedonium with reference to exon boundaries

Corse1p7_000380 1286 1935 1..1935

Corse1p7_000412 1287 2061 1..2061

Corse1p7_000417 1288 1200 1..298, 365..1200

Corse1p7_000480 1289 1312 1..593, 678..759, 821..1185, 1240..1312

Corse1p7_000626 1290 1452 1..363, 443..1187, 1391..1452

Corse1p7_000682 1291 1004 1..468, 561..1004

Corse1p7_000779 1292 2878 1..46, 103..287, 341..1774, 1835..1977, 2030..2121, 2192..2424, 2519..2878

Corse1p7_000836 1293 2697 1..2697

Corse1p7_000848 1294 1135 1..243, 322..1008, 1082..1135

Corse1p7_000860 1295 3034 1..72, 382..53Θ, 595..983, 1038..3034

Corse1p7_000908 1296 978 1..978

Corse1p7_000911 1297 2412 1..2412

Corse1p7_000918 1298 1476 1..116, 186..1111, 1231..1309, 1380..1476

Corse1p7_001122 1299 2413 1..245, 326..1199, 1253..1622, 1674..2413

Corse1p7_001133 1300 1044 1..1044

Corse1p7_001135 1301 1776 1..941, 1031..1277, 1398..1531, 1586..1637, 1747..1776

Corse1p7_001152 1302 1226 1..360, 420..576, 626..657, 711..1226

Corse1p7_001156 1303 2398 1..293, 353..458, 519..661, 718..802, 860..920, 973..1086, 1142..1190, 1249..1428, 1515..2173, 2295..2398

Corse1p7_001188 1304 1226 1..88, 166..1226

Corse1p7_001215 1305 1703 1..165, 233..387, 443..63Θ, 710..817, 951..1160, 1227..1317, 1400..1703

Corse1p7_001216 1306 1234 1..99, 156..302, 401..1105, 1172..1234

Corse1p7_001250 1307 783 1..49, 108..381, 465..783

Corse1p7_001265 1308 908 1..460, 559..908

Corse1p7_001412 1309 1688 1..1053, 1161..1178, 1575..1608, 1666..1688

Corse1p7_001747 1310 963 1..284, 352..425, 487..528, 603..817, 907..963

Corse1p7_001750 1311 2531 1..527, 588..1162, 1261..1348, 1415..1652, 1727..1738, 1801..1931, 2082..2363, 2453..2531

Corse1p7_001812 1312 1666 1..209, 266..640, 695..852, 905..1083, 1149..1240, 1297..1539, 1621..1666

Corse1p7_002077 1313 1339 1..196, 258..1339

Corse1p7_002120 1314 1906 1..67, 192..1055, 1164..1906

Corse1p7_002197 1315 1534 1..340, 408..893, 993..1534

Corse1p7_002284 1316 1133 1..553, 625..821, 888..1133

Corse1p7_002327 1317 1181 1..145, 213..851, 922..1181

Corse1p7_002385 1318 1626 1..1035, 1105..1626

Corse1p7_002466 1319 1688 1..367, 425..533, 599..835, 966..1585, 1645..1688

Corse1p7_002587 1320 1161 1..1161

Corse1p7_002682 1321 885 1..98, 161..722, 784..885

Corse1p7_002713 1322 3085 1..618, 706..1690, 1758..3085

Corse1p7_002731 1323 856 1..571, 627..688, 770..856

Corse1p7_002749 1324 2120 1..962, 1015..1033, 1107..1297, 1360..1431, 1499..2120

Corse1p7_002768 1325 1241 1..349, 421..1241

Corse1p7_002770 1326 1029 1..212, 276..338, 407..457, 518..863, 922..1029

Corse1p7_002901 1327 1935 1..205, 267..442, 504..778, 846..1935

Corse1p7_002990 1328 1405 1..594, 674..827, 894..1405

Corse1p7_003021 1329 1307 1..110, 177..488, 551..1307

Corse1p7_003111 1330 1290 1..581, 636..1290

Corse1p7_003146 1331 1094 1..270, 331..921, 990..1094

Corse1p7_003147 1332 1210 1..284, 361..428, 489..912, 1024..1210

Corse1p7_003195 1333 2657 1..257, 304..460, 564..670, 721..1004, 1062..1182, 1233..1450, 1508..2349, 2442..2548, 2609..2657

Corse1p7_003201 1334 1725 1..1725

Corse1p7_003289 1335 842 1..67, 160..521, 582..842

Corse1p7_003312 1336 1185 1..1185

Corse1p7_003313 1337 1129 1..602, 679..1129

Corse1p7_003609 1338 1026 1..1026

Corse1p7_003705 1339 1287 1..1287

Corse1p7_003803 1340 1529 1..442, 505..1529

Corse1p7_003850 1341 2014 1..284, 340..412, 463..630, 691..786, 843..1716, 1812..2014

Corse1p7_003910 1342 2355 1..152, 215..685, 763..846, 923..1501, 1576..1768, 1879..2355

Corse1p7_003936 1343 2131 1..295, 361..491, 560..826, 875..2131

Corse1p7_004050 1344 1428 1..204, 264..310, 370..436, 491..548, 631..872, 939..1273, 1368..1428

Corse1p7_004080 1345 3105 1..3105

Corse1p7_004099 1346 1030 1..267, 350..879, 952..1030

Corse1p7_004114 1347 891 1..891

Corse1p7_004116 1348 1027 1..162, 225..338, 419..1027

Corse1p7_004172 1349 1249 1..83, 152..1013, 1094..1249

Corse1p7_004173 1350 812 1..60, 135..812

Corse1p7_004174 1351 861 1..123, 206..527, 605..861

Corse1p7_004312 1352 2523 1..154, 209..640, 698..2523

Corse1p7_004331 1353 875 1..440, 507..727, 796..875

Corse1p7_004332 1354 2578 1..218, 325..2578

Corse1p7_004405 1355 1274 1..119, 190..429, 504..819, 909..1274

Corse1p7_004480 1356 1180 1..199, 270..1180

Corse1p7_004481 1357 1031 1..155, 222..833, 947..1031

Corse1p7_004788 1358 1081 1..121, 186..310, 377..570, 772..835, 899..1081

Corse1p7_004856 1359 1087 1..272, 334..937, 1013..1087

Corse1p7_004859 1360 818 1..55, 115..471, 532..818

Corse1p7_004998 1361 1116 1..659, 819..1116

Corse1p7_005152 1362 1431 1..1431

Corse1p7_005154 1363 1113 1..1113

Corse1p7_005273 1364 1195 1..663, 716..1075, 1130..1195

Corse1p7_005281 1365 2606 1..724, 798..1009, 1085..1991, 2049..2230, 2292..2606

Corse1p7_005694 1366 1308 1..99, 158..314, 374..1308

Corse1p7_005785 1367 1597 1..505, 582..944, 1014..1597

Corse1p7_005787 1368 888 1..55, 124..465, 524..661, 740..888

Corse1p7_005789 1369 1582 1..393, 448..533, 592..847, 944..1582

Corse1p7_005798 1370 1587 1..1587

Corse1p7_005799 1371 1075 1..219, 275..1075

Corse1p7_005804 1372 1163 1..101, 320..894, 960..1009, 1067..1101, 1151..1163

Corse1p7_005806 1373 1194 1..1194

Corse1p7_006022 1374 1434 1..1434

Corse1p7_006116 1375 567 1..567

Corse1p7_006287 1376 1035 1..1035

Corse 1p7_006302 1377 753 1..753

Corse 1p7_006407 1378 1260 1..1260

Corse 1p7_006563 1379 1662 1..372, 430..1662

Corse1p7_006580 1380 2178 1..192, 263..318, 384..566, 624.794, 883..999, 1053..2178

Corse1p7_006600 1381 1108 1..444, 509..949, 1016..1108

Corse1p7_006797 1382 1441 1..1 150, 1251..1441

Corse 1p7_007007 1383 1582 1..318, 380..1582

Corse 1p7_007035 1384 1288 1..380, 487..1288

Corse1p7_007055 1385 808 1.748, 798..808

Corse1p7_007095 1386 1581 1..361 , 424..507, 636..1493, 1547..1581

Corse1p7_007108 1387 730 1.251 , 316.730

Corse1p7_007181 1388 1973 1..150, 283..1449, 1542..1973

Corse1p7_007197 1389 1473 1..106, 164..1086, 1255..1473

Corse1p7_007250 1390 1167 1.73, 133..295, 355.459, 519..1167

Corse1p7_007256 1391 2348 1..280, 349..2348

Corse1p7_007403 1392 1342 1..436, 523..847, 922..1342

Corse 1p7_007428 1.49, 118..284, 344.825, 887..1226, 1298..1787, 1852..2117,

1393 2833

2184..2277, 23S8..2833

Corse1p7_007431 1394 963 1.290, 375.490, 596..807, 907..963

Corse 1p7_007458 1395 1185 1.405, 486..511 , 587..645, 709.769, 855..1185

Corse1p7_007469 1396 1650 1.100, 169..263, 321.384, 490..1524, 1631.1650

Corse1p7_007559 1397 1396 1.73, 132..195, 286..542, 642..1396

Corse1p7_007830 1.140, 192.446, 551.676, 770.793, 850..1000, 1062..1257,

1398 5793 1325..1443, 1523..1728, 1820..2165, 2254.4326, 4404..5701 ,

5748..5793

Corse1p7_007937 1399 977 1.712, 895..Θ77

Corse1p7_007949 1.93, 156..164, 226..25S, 339..571 , Θ37..697, 830..1079,

1400 1247

1204.1247

Corse1p7_007974 1401 1325 1.251 , 332..1325

Corse1p7_008012 1402 768 1.768

Corse 1p7_008232 1403 1760 1.617, 690.761 , 844..1115, 1207..1760

Corse1p7_008334 1404 2017 1.151 , 206..836, 887..1178, 1246..1432, 1557..2017

Corse1p7_008374 1405 1035 1.218, 342..694, 818..1035

Corse1p7_008434 1406 885 1.432, 495..52S, 597.723, 798.-885 Corse 1p7_008634 1407 438 1..438

Corse1p7_008735 1408 1004 1..184, 245..324, 376..425, 48Θ..533, 608..1004

Corse1p7_008743 1409 600 1..600

Corse1p7_009030 1410 996 1..408, 471..613, 699..996

Corse1p7_009061 1411 2493 1..234, 295..430, 492..631, 706..1022, 1101..1276, 1356..1713, 1831..2044, 2309..2367, 2463..2493

Corse1p7_009076 1412 794 1..648, 723..794

Corse1p7_009217 1413 2634 1..2634

Corse1p7_009368 1414 1700 1..49, 131..532, 695..951, 1018..1090, 1147..1700

Corse1p7_009374 1415 1377 1..146, 207..382, 43Θ..545, 613..731, 792..968, 1040..1147, 1218..1377

Corse1p7_009379 1416 1530 1..303, 362..745, 818..1170, 1296..1530

Corse1p7_009448 1417 837 1..275, 420..837

Corse1p7_009495 1418 953 1..126, 194..287, 366..709, 789..953

Corse1p7_009545 1419 1131 1..468, 524..929, 995..1131

Corse1p7_009850 1420 1493 1..369, 427..656, 710..1493

Corse1p7_009899 1421 627 1..383, 510..627

Corse1p7_010258 1422 1443 1..179, 232..350, 407..447, 502..588, 642..832, 909..1443

Corse1p7_010341 1423 1281 1..1281

Corse1p7_010603 1424 1620 1..208, 273..1181, 1259..1620

Corse1p7_010738 1425 306 1..138, 232..306

Corse1p7_010890 1426 855 1..246, 367..855

Corse1p7_010975 1427 1476 1..1476

Corse1p7_011069 1428 739 1..335, 481..739

Corse1p7_011070 1429 1696 1..567, 639..775, 846..1058, 1128..1374, 1450..1580, 1639..1696

Corse1p7_011160 1430 4571 1..1036, 1119..2112, 2189..4571

Corse1p7_011211 1431 1357 1..168, 227..556, 617..1357

Corse1p7_011399 1432 966 1..406, 490..587, 653..766, 852..899, 955..966

Corse1p7_011692 1433 684 1..684

Corse1p7_011793 1434 822 1..312, 394..822

Corse1p7_011847 1435 1182 1..1182

Corse1p7_011892 1436 2355 1..2355

Corse1p7_012004 1437 2565 1..2565

Corse1p7_012028 1438 2261 1..1795, 1855..2261

Corse1p7_012060 1439 1917 1..1917

Corse1p7_012063 1440 1652 1..90, 171..523, 594..666, 734..982, 1062..1652

Corse1p7_012075 1441 1646 1..290, 338..436, 499..737, 844..1646

Corse1p7_012229 1442 1409 1..204, 270..470, 524..1069, 1140..1409

Corse1p7_012235 1443 1207 1..1120, 1188..1207

Corse1p7_012238 1444 1008 1..1008

Corse1p7_012360 1445 1545 1..560, 667..841, 907..1180, 1268..1545

Corse1p7_012405 1446 5018 1..639, 707..809, 87Θ..937, 987..1115, 1177..1339, 1395..1460, 1528..1613, 1675..4075, 4157..4266, 4395..5018

Corse1p7_012430 1447 1177 1..690, 752..1177

Corse1p7_012432 1448 2635 1..194, 288..591, 692..1066, 1136..2635

Corse1p7_012434 1449 2171 1..790, 871..932, 999..1891, 1964..2171

Corse1p7_012549 1450 1252 1..799, 855..944, 1041..1252

Corse1p7_012644 1451 1452 1..1452

Corse1p7_012727 1452 1732 1..502, 567..1732

Corse1p7_012769 1453 1382 1..781, 874..1382

Corse1p7_012881 1454 780 1..780

Corse1p7_013000 1455 1685 1..328, 434..666, 724..832, 987..1268, 1375..1685

Corse1p7_013120 1456 753 1..753

Corse1p7_013127 1457 756 1..756

Corse1p7_013247 1458 1849 1..824, 898..1106, 1165..1312, 1396..1849

Corse1p7_013297 1459 1277 1..293, 348..440, 491..1277

Corse1p7_013409 1460 1884 1..1341, 1412..1470, 1533..1884

Corse1p7_013522 1461 2014 1..52, 162..2014

Corse1p7_013524 1462 2608 1..473, 541..773, 855..2608

Corse1p7_013621 1463 2143 1..106, 158..548, 601..2143

Corse1p7_013876 1464 1908 1..927, 1006..1908

Corse1p7_013918 1465 1338 1..1338

Corse1p7_013975 1466 435 1..435

Corse1p7_014096 1467 1563 1..100, 175..519, 583..864, 961..1270, 1545..1563

Corse1p7_014148 1468 2074 1..622, 687..2074

Corse1p7_014214 1469 1131 1..1131

Corse1p7_014236 1470 875 1..145, 22Θ..875

Corse1p7_014325 1471 807 1..371, 438..807

Corse1p7_014548 1472 858 1..858

Corse1p7_014561 1473 775 1..198, 284..775

Corse1p7_014838 1474 1696 1..92, 183..691, 751..1335, 1434..1696

Corse1p7_014844 1475 5849 1..132, 191..251, 312..616, 832..1005, 1186..1280, 1357..1371, 1450..1564, 1622..2401, 2538..2715, 2777..2869, 3895..4521, 4594..5849

Corse1p7_014907 1476 4981 1..285, 3Θ2..578, 711..781, 869..1039, 1145..1521, 1587..1674, 29Θ9..3348, 3478..3515, 3558..4064, 4128..4981

Corse1p7_014925 1477 2052 1..243, 310..385, 45..S26, 585..2052

Corse1p7_015059 1478 1378 1..603, 710..1378

Corse1p7_015066 1479 1193 1..717, 783..1193

Corse1p7_015109 1480 1311 1..330, 391..1311

Corse1p7_015111 1481 1699 1..122, 182..612, 682..1273, 1354..1699

Corse1p7_015118 1482 1219 1..154, 222..345, 416..521, 609..788, 845..1219

Corse1p7_015177 1483 1670 1..494, 607..1138, 1203..1670

Corse1p7_015436 1484 1794 1..1794

Corse1p7_015457 1485 2694 1..2694

Corse1p7_015611 1486 1062 1..287, 355..536, 611..1062

Corse1p7_015615 1487 1206 1..923, 987..1206

Corse1p7_015795 1488 1422 1..606, 682..1422

Corse1p7_015808 1489 834 1..100, 160..521, 589..834

Corse1p7_015859 1490 2405 1..397, 469..2405

Corse1p7_015872 1491 1588 1..136, 179..303, 362..751, 865..1199, 1354..1588

Corse1p7_015931 1492 1162 1..111, 224..1162

Corse1p7_015938 1493 1811 1..401, 464..1811

Corse1p7_016018 1494 2679 1..79, 133..204, 265..338, 398..887, 944..1937, 2012..2079, 2162..2679

Corse1p7_016074 1495 1694 1..187, 249..507, 566..1310, 1384..1575, 1635..1694

Corse1p7_016210 1496 1573 1..300, 353..1121, 1203..1573

Corse1p7_016211 1497 2568 1..2568

Corse1p7_016216 1498 2195 1..454, 538..732, 805..1245, 1312..2195

Corse1p7_016294 1499 1630 1..409, 468..1630

Corse1p7_016298 1500 835 1..515, 682..835

Corse1p7_016342 1501 1617 1..108, 183..276, 343..586, 755..961, 1026..1617

Corse1p7_016374 1502 1434 1..1434

Corse1p7_016433 1503 1669 1..24, 116..552, 624..873, 965..1669

Corse1p7_016558 1504 1573 1..56, 308..932, 1014..1242, 1388..1421, 1498..1573

Corse1p7_016577 1505 1087 1..961, 1017..1087

Corse1p7_016616 1506 2081 1..135, 190..293, 369..1029, 1116..2081

Corse1p7_016629 1507 1239 1..1239

Corse1p7_016635 1508 748 1..353, 445..748

Corse1p7_016658 1509 2327 1..194, 245..2327

Corse1p7_016833 1510 3370 1..309, 365..798, 865..2165, 2235..3370

Corse1p7_016839 1511 966 1..966

Corse1p7_016841 1512 1136 1..86, 162..688, 772..1136

Corse1p7_016867 1513 1017 1..586, 686..1017

Corse1p7_016869 1514 963 1..963

Corse1p7_017085 1515 624 1..624

Corse1p7_017101 1516 1326 1..1224, 1279..1326

Corse1p7_017138 1517 858 1..858

Corse1p7_017290 1518 1910 1..51, 120..1910

Corse1p7_017304 1519 894 1..306, 376..615, 727..894

Corse1p7_017340 1520 1911 1..108, 222..1339, 1411..1835, 1889..1911

Corse1p7_017345 1521 849 1..849

Corse1p7_017437 1522 2470 1..101, 166..713, 783..834, 1567..1656, 1720..2470

Corse1p7_017469 1523 1072 1..61, 119..176, 250..634, 710..902, 966..1072

Corse1p7_017501 1524 1261 1..149, 214..383, 459..697, 755..897, 985..1261

Corse1p7_017503 1525 736 1..233, 319..736

Corse1p7_017513 1526 1346 1..742, 820..1346

Corse1p7_017620 1527 1484 1..323, 384..906, 966..1113, 1171..1484

Corse1p7_017635 1528 733 1..242, 310..733

Corse1p7_017662 1529 944 1..172, 260..381, 459..752, 840..944

Corse1p7_017740 1530 1317 1..206, 277..686, 779..1317

Corse1p7_017842 1531 1089 1..263, 328..820, 979..1089

Corse1p7_017943 1532 8380 1..126, 186..246, 361..545, 611..784, 864..1039, 1096..1694, 1795..2334, 2405..2579, 2652..4314, 4380..4528, 4607..5085, 5271..5485, 5649..6033, 6101..6342, 6398..6425, 6495..8380

Corse1p7_018035 1533 3213 1..1365, 1445..2667, 2742..3213

Corse1p7_018169 1534 1291 1..306, 359..1291

Corse1p7_018306 1535 927 1..927

Corse1p7_018335 1536 1507 1..285, 366..551, 654..826, 905..1250, 1313..1507

Corse1p7_018439 1537 1506 1..368, 474..64Θ, 765..1187, 1266..1330, 1423..1506

Corse1p7_018442 1538 800 1..49, 141..414, 485..800

Corse1p7_018453 1539 1019 1..159, 235..580, 649..1019

Corse1p7_018530 1540 1497 1..1497

Corse1p7_018598 1541 840 1..52, 140..493, 554..840

Corse1p7_018599 1542 735 1..394, 452..735

Corse1p7_018602 1543 2423 1..49, 122..961, 1061..1182, 1295..1388, 1458..1507, 1565..1643, 1719..1951, 2031..2073, 2206..2423

Corse1p7_018641 1544 1008 1..1008

Corse1p7_018643 1545 980 1..514, 578..819, 900..980

Corse1p7_018656 1546 1080 1..460, 530..1080

Corse1p7_018680 1547 817 1..292, 453..817

Corse1p7_018683 1548 2849 1..316, 433..619, 683..2516, 2598..2849

Corse1p7_018791 1549 1956 1..1956

Corse1p7_018828 1550 1955 1..207, 267..1955

Corse1p7_018842 1551 2323 1..246, 304..382, 449..460, 553..622, 671..1637, 1715..1783, 1907..2323

Corse1p7_018897 1552 1707 1..1707

Corse1p7_018979 1553 1718 1..585, 639..1443, 1501..1718

Corse1p7_019023 1554 1522 1..255, 331..516, 613..668, 744..1218, 1331..1522

Corse1p7_019097 1555 848 1..440, 500..596, 678..848

Corse1p7_019144 1556 1248 1..1248

Corse1p7_019154 1557 1689 1..1689

Corse1p7_019166 1558 1705 1..1590, 1673..1705

Corse1p7_019179 1559 1607 1..1413, 1470..1607

Corse1p7_019215 1560 657 1..657

Corse1p7_019267 1561 1318 1..301, 408..674, 847..1108, 1228..1318

Corse1p7_019341 1562 987 1..268, 360..889, 949..987

Corse1p7_019439 1563 825 1..825

CORSE_1_03965 1564 864 1..278, 459..864
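
The coordinate lists in the rows above can be checked mechanically. The following is a minimal Python sketch, not part of the disclosed methods. It assumes that each row gives a gene identifier, a SEQ ID NO, a sequence length, and a comma-separated list of `start..end` segments (the column headings are not shown in this part of the table), and the function names are hypothetical. It verifies that the segments are in ascending order and that the last segment ends at the stated length.

```python
def parse_segments(coords: str) -> list[tuple[int, int]]:
    """Parse a list such as '1..442, 505..1529' into [(1, 442), (505, 1529)]."""
    segments = []
    for part in coords.split(","):
        start, end = part.strip().split("..")
        segments.append((int(start), int(end)))
    return segments


def check_row(length: int, coords: str) -> dict:
    """Summarize one table row (assumed layout: length, then segment list)."""
    segs = parse_segments(coords)
    # Each segment must be well formed and must start after the previous one ends.
    well_formed = all(start <= end for start, end in segs)
    ascending = all(prev[1] < nxt[0] for prev, nxt in zip(segs, segs[1:]))
    return {
        "segments": len(segs),
        "covered": sum(end - start + 1 for start, end in segs),
        "ends_at_length": segs[-1][1] == length,
        "ordered": well_formed and ascending,
    }


# Row Corse1p7_003803: length 1529, segments 1..442 and 505..1529.
row = check_row(1529, "1..442, 505..1529")
print(row)
```

A row whose coordinates contain unrecoverable characters raises a `ValueError` in `int()`, which makes damaged rows easy to flag.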

[00233] The present invention is illustrated in further detail by the following non-limiting examples.

EXAMPLES

Example 1: Fermentation of the organism

Materials & Methods

[00234] In general, for each species, starter mycelium was grown in rich medium (either mycological broth or yeast malt broth (the latter being indicated with YM)) and then washed with water. The starter was then used to inoculate different liquid media or solid substrate and the resulting mycelium was used for RNA extraction and library construction.

[00235] Following are the medium recipes and the solid substrates with a referenced source (if available) as well as a table (Table 3) listing the media variations, since in some cases the basic recipes of the referenced source have been altered depending on the species grown. This is then followed by a summary of the specific species as grown in the examples.

A. Mycological broth

Per liter: 10 g soytone, 40 g D-glucose, 1 mL Trace Element solution, Double-distilled water;

Adjust pH to 5.0 with hydrochloric acid (HCl) and bring volume to 1 L with double-distilled water.

Trace Element Solution contains 2 mM Iron(II) sulphate heptahydrate (FeSO4·7H2O), 1 mM Copper(II) sulphate pentahydrate (CuSO4·5H2O), 5 mM Zinc sulphate heptahydrate (ZnSO4·7H2O), 10 mM Manganese sulphate monohydrate (MnSO4·H2O), 5 mM Cobalt(II) chloride hexahydrate (CoCl2·6H2O), 0.5 mM Ammonium molybdate tetrahydrate and 95 mM Hydrochloric acid (HCl) dissolved in double-distilled water.
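
Because 1 mL of Trace Element solution is added per liter of medium, each component is diluted 1000-fold in the final broth. The following Python sketch works through that arithmetic; the dictionary keys and the function name are illustrative only, and the stock concentrations are the ones listed in the recipe.

```python
# Stock concentrations of the Trace Element solution, in mM (from the recipe).
STOCK_MM = {
    "iron(II) sulphate heptahydrate": 2.0,
    "copper(II) sulphate pentahydrate": 1.0,
    "zinc sulphate heptahydrate": 5.0,
    "manganese sulphate monohydrate": 10.0,
    "cobalt(II) chloride hexahydrate": 5.0,
    "ammonium molybdate tetrahydrate": 0.5,
    "hydrochloric acid": 95.0,
}


def final_uM(stock_mM: float, added_mL: float = 1.0, final_mL: float = 1000.0) -> float:
    """Final concentration in micromolar after diluting added_mL of stock to final_mL."""
    return stock_mM * 1000.0 * added_mL / final_mL


for name, stock in STOCK_MM.items():
    print(f"{name}: {final_uM(stock):g} uM in the medium")
```

On this reading the medium contains, for example, about 1 µM copper(II) sulphate and 10 µM manganese sulphate from the trace elements alone.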

B. Yeast-Malt broth (YM)

(Reference: ATCC medium No. 200)

Per liter: 3 g yeast extract, 3 g malt extract, 5 g peptone, 10 g D-glucose, Double-distilled water to 1 L.

C. Trametes Defined Medium (TDM)

(Reference: Reid and Paice, "Effect of Residual lignin type and amount on biological bleaching of kraft pulp by Trametes versicolor". Applied and Environmental Microbiology 60: 1395-1400, 1994.)

Per liter: 10 g D-glucose, 0.75 g L-Asparagine monohydrate, 0.68 g Potassium phosphate monobasic (KH2PO4), 0.25 g Magnesium sulphate heptahydrate (MgSO4·7H2O), 15 mg Calcium chloride dihydrate (CaCl2·2H2O), 100 µg Thiamine hydrochloride, 1 mL Trace Element solution, 0.5 g Tween™ 80, Double-distilled water;

Adjust pH to 5.5 with 3 M potassium hydroxide and bring volume to 1 L with double-distilled water.

Table 3: Variations of TDM media used for library construction

Variation Description

TDM-1 Medium was prepared as in the basic recipe described above.

TDM-2 Quantity of asparagine monohydrate was reduced to 0.15 g.

TDM-3 Manganese sulphate monohydrate was omitted from the medium.

TDM-4 The quantity of manganese sulphate monohydrate was raised to 0.2 mM final concentration in the medium.

TDM-5 The quantity of copper (II) sulphate pentahydrate was raised to 20 µM.

TDM-6 Glucose was replaced with 10 g per liter of cellulose (Solka-Floc, 200FCC).

TDM-7 Glucose was replaced with 10 g per liter of xylan from birchwood (Sigma Cat. # X-0502).

TDM-8 Glucose was replaced with 10 g per liter of wheat bran¹.

TDM-9 Glucose was replaced with 10 g per liter of citrus pectin (Sigma Cat. # P-9135).

TDM-10 Tween™ 80 was omitted from the medium.

TDM-11 The double-distilled water was replaced with whitewater² collected from peroxide bleaching (which occurs during the manufacture of fine paper).

TDM-12 The double-distilled water was replaced with whitewater² collected from newsprint manufacture.

TDM-13 Glucose was replaced with 5 g per liter of ground hardwood kraft pulp³.

TDM-14 The medium's pH was raised to 7.5.

TDM-15 The strain was incubated at 5°C above its optimum growth temperature.

TDM-16 The strain was incubated at 10°C below its optimum growth temperature.

TDM-17 One half of the double-distilled water was replaced with whitewater from newsprint manufacture. Glucose was omitted.

TDM-18 Potassium phosphate monobasic was replaced with 5 mM phytic acid from rice (Sigma Cat. # P3168).

TDM-19 Asparagine monohydrate was increased to 4 g per liter.

TDM-20 Asparagine monohydrate was increased to 4 g per liter and glucose was replaced with 2% fructose.

TDM-21 Asparagine monohydrate was increased to 4 g per liter; 100 mL of double-distilled water was replaced with 100 mL kerosene⁴. Glucose was omitted.

TDM-22 Asparagine monohydrate was increased to 4 g per liter; 100 mL of double-distilled water was replaced with 100 mL hexadecane (Sigma cat. # H0255). Glucose was omitted.

TDM-23 Asparagine monohydrate was increased to 4 g per liter; one half of the double-distilled water was replaced with 25% whitewater from newsprint manufacture plus 25% whitewater from peroxide bleaching. Glucose was omitted.

TDM-24 Asparagine monohydrate was increased to 4 g per liter and the quantity of manganese sulphate monohydrate was raised to 0.2 mM final concentration in the medium.

TDM-25 Asparagine monohydrate was increased to 4 g per liter and manganese sulphate monohydrate was omitted from the medium.

TDM-26 Asparagine monohydrate was increased to 4 g per liter; and potassium phosphate monobasic was replaced with 5 mM phytic acid from rice (Sigma Cat. # P3168).

TDM-27 Glucose was replaced with 10 g per liter of olive oil (Sigma cat. # 01514).

TDM-28 One half of the double-distilled water was replaced with whitewater from peroxide bleaching. Glucose was omitted.

TDM-29 Glucose was replaced with 10 g per liter of tallow.

TDM-30 Glucose was replaced with 10 g per liter of yellow grease.

TDM-31 Glucose was replaced with 10 g per liter of defined lipid (Sigma cat. # L0288).

TDM-32 Glucose was replaced with 50 g per liter of D-xylose.

TDM-33 Glucose was replaced with 20 g per liter of glycerol and 20 mL per liter of ethanol.

TDM-34 Glucose was reduced to 1 g per liter and 10 g per liter of bran was added.

TDM-35 Glucose was reduced to 1 g per liter and 10 g per liter of pectin (Sigma Cat. # P-9135) was added.

TDM-36 Glucose was replaced with 10 g per liter of biodiesel.

TDM-37 Glucose was replaced with 10 g per liter of soy feedstock.

TDM-38 Glucose was replaced with 10 g per liter of locust bean gum (Sigma cat # G0753).

TDM-39 One half of double-distilled water was replaced with a 1:1 ratio of whitewater from newsprint manufacture and whitewater from peroxide bleaching. Glucose was omitted.

TDM-40 The medium's pH was raised to 8.5.

TDM-41 One half of double-distilled water was replaced with whitewater from peroxide bleaching; plus yeast extract was added to 1 g per liter. Glucose was omitted.

TDM-42 Glucose was replaced with 5 g per liter of yellow grease and 5 g per liter of soy feedstock.

TDM-43 Glucose was replaced with 20 g per liter of fructose.

TDM-44 Glucose was replaced with 10 g per liter of cellulose (Solka-Floc, 200FCC) plus 1 g per liter of sophorose.

TDM-45 The medium's pH was raised to 8.84.

¹ Food grade wheat bran sourced from the supermarket was used.

² All whitewaters were sourced from Quebec paper mills by PAPRICAN on the Applicant's behalf.

³ Hardwood kraft pulp was sourced from Quebec paper mills by PAPRICAN on the Applicant's behalf.

⁴ Kerosene was sourced from a general hardware store.

D. Asparagine Salts Medium (AS):

(Reference: Ikeda et al., "Laccase and Melanization in Clinically Important Cryptococcus Species Other Than Cryptococcus neoformans", Journal of Clinical Microbiology 40: 1214-1218, 2002)

Per liter: 3.0 g D-glucose, 1.0 g L-Asparagine monohydrate, 3.0 g KH2PO4, 0.5 g MgSO4·7H2O, 1 mg Thiamine.

Table 4: Variations of AS media used for library construction

E. Solid substrates used:

SS-1 5 g Wheat Bran.

SS-2 5 g Wheat bran plus 5 mL defined lipid.

SS-3 5 g Oat bran (food grade, sourced from supermarket).

[00236] The Chaetomium thermophilum, Thermomyces stellatus, and Corynascus sepedonium strains were grown according to the methods described above under the following growth conditions: TDM-1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -13, -14, -15, -39; YM, whereby the following optimal growth temperature was used: 25°C.

[00237] The strains carrying the recombinant genes were grown according to the methods described above under the following growth conditions: minimal medium as described in Kafer et al., (1977, Adv. Genet. 19:33-131) except that the salt concentrations were raised ten-fold and the glucose concentration was 150 grams per liter, at 30°C.

Example 2: Genome sequencing and assembly

[00238] Genomic DNA was isolated from mycelium when the growth culture had reached the mid log phase. Genomic DNA was sequenced using the Roche 454 Titanium technology (http://www.454.com) to a genome coverage of over 20-fold according to the instructions of the manufacturer. The sequences were assembled using the Newbler and Celera assemblers (http://sourceforge.net/apps/mediawiki/wgs-assembler).
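
Fold coverage is conventionally the total number of sequenced bases divided by the genome size. The Python sketch below illustrates the "over 20-fold" criterion under that definition; the read count, read length, and genome size are hypothetical values chosen for illustration and do not come from this disclosure.

```python
def fold_coverage(read_lengths_bp, genome_size_bp: int) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return sum(read_lengths_bp) / genome_size_bp


# Hypothetical run: 2,000,000 reads of 400 bp against a 35 Mb genome.
hypothetical_reads = [400] * 2_000_000
depth = fold_coverage(hypothetical_reads, 35_000_000)
print(f"{depth:.1f}-fold coverage, meets the >20-fold target: {depth > 20}")
```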

Example 3: Building the cDNA libraries

[00239] Total RNA was isolated from fungal cells or mycelia when the growth cultures had reached the late log phase. The mycelia were collected by filtration through Miracloth and washed with water by filtration. The mycelia were patted dry using paper towels, frozen in liquid nitrogen, and stored at -80°C. To extract total RNA, the frozen mycelia or cells were ground to a fine powder in liquid nitrogen using a pestle and mortar. Approximately 1-1.5 g of frozen fungal powder was dissolved in 10 mL of TRIzol® reagent and RNA was extracted according to the manufacturer's protocol (Invitrogen Life Sciences, Cat. #15596-018). Following extraction, the RNA was dissolved in DEPC-treated water at 1-1.5 mg/mL.

[00240] The PolyATtract® mRNA Isolation System (Promega, Cat. #Z5300) was used to isolate poly(A)+RNA. In general, equal amounts of total RNA extracted from up to ten culture conditions were pooled. One milligram of total RNA was used for isolation of poly(A)+RNA according to the protocol provided by the manufacturer. The purified poly(A)+RNA was dissolved in DEPC-treated water at 200-500 µg/mL.
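
Pooling equal amounts of total RNA from several culture conditions reduces to a small volume calculation. The sketch below is one way to work it out, assuming that the 1 mg of input RNA is split evenly across the pooled conditions; the sample concentrations and the function name are hypothetical.

```python
def pooling_volumes_uL(conc_mg_per_mL: dict, total_mg: float = 1.0) -> dict:
    """Volume (in microliters) to take from each sample for an equal-mass pool."""
    per_sample_mg = total_mg / len(conc_mg_per_mL)
    # mass (mg) / concentration (mg/mL) gives mL; multiply by 1000 for microliters.
    return {
        name: per_sample_mg / conc * 1000.0
        for name, conc in conc_mg_per_mL.items()
    }


# Hypothetical RNA preparations within the stated 1-1.5 mg/mL range.
volumes = pooling_volumes_uL({"TDM-1": 1.0, "TDM-6": 1.25})
for condition, microliters in volumes.items():
    print(f"{condition}: {microliters:.0f} uL")
```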

[00241] Five micrograms of poly(A)+RNA were used for the construction of the cDNA library. Double-stranded cDNA was synthesized using the ZAP-cDNA® Synthesis Kit (Stratagene, Cat. #200400) according to the manufacturer's protocol with the following modifications. An anchored oligo(dT) linker-primer was used in the first-strand synthesis reaction to force the primer to anneal to the beginning of the poly(A) tail of the mRNA. The anchored oligo(dT) linker-primer has the sequence:

5'-GAGAGAGAGAGAGAGAGAGAACTAGTCTCGAGTTTTTTTTTTTTTTTTTTVN-3' (SEQ ID NO: 2137)

where V is A, C, or G and N is A, C, G, or T. A second modification was made by adding trehalose at a final concentration of 0.6 M and betaine at a final concentration of 2 M in the buffer of the first-strand synthesis reaction to promote full-length synthesis. Following synthesis and size fractionation, fractions of double-stranded cDNA with sizes longer than 600 bp were pooled. The pooled cDNA was cloned directionally into the plasmid vector BlueScript KS+® (Stratagene) or a modified BlueScript KS+ vector that contained Gateway® (Invitrogen) recombination sites. The cDNA library was transformed into E. coli strain XL10-Gold ultracompetent cells (Stratagene, Cat. #Z00315) for propagation.

[00242] Bacterial cells carrying cDNA clones were grown on LB agar containing the antibiotic ampicillin for selection of plasmid-borne bacteria and X-gal and IPTG to use the blue/white system to screen for the presence of cDNA inserts. The white bacterial colonies, those carrying cDNA inserts, were transferred by a colony-picking robot to 384-well MTP for replication and storage. Clones that were to be analyzed by sequencing were transferred to 96-well deep blocks using liquid-handling robots. The bacteria were cultured at 37°C with shaking at 150 rpm. After 24 hours of growth, plasmid DNA from the cDNA clones was prepared by alkaline lysis and sequenced from the 5' end using ABI 3730xl DNA analyzers (Applied Biosystems). The chromatograms obtained following single-pass sequencing of the cDNA clones were processed using Phred (available at http://www.phrap.org) to assign sequence quality values, Lucy as described in Chou and Holmes (2001, Bioinformatics, 17(12) 1093-1104) to remove vector and low quality sequences, and Phrap (available at http://www.phrap.org/) to assemble overlapping sequences derived from the same gene into contigs.
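
Phred-style quality values relate to the estimated base-calling error probability P by Q = -10 log10(P). The short Python sketch below shows that conversion and the mean error probability of a read; the example quality values are made up, and the sketch does not reproduce the trimming algorithm used by Lucy.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Error probability for a Phred quality value: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality value for an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def mean_error_prob(qualities) -> float:
    """Average per-base error probability across a read."""
    return sum(phred_to_error_prob(q) for q in qualities) / len(qualities)


# Made-up quality values for a short read, for illustration only.
example_qualities = [10, 20, 30, 40]
print(f"Q20 corresponds to an error probability of {phred_to_error_prob(20):.2f}")
print(f"Mean error probability: {mean_error_prob(example_qualities):.5f}")
```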

Example 4: Annotations

[00243] An in-house automated annotation pipeline was used to predict genes in the assembled genome sequence. The analysis pipeline used in part the ab initio tool Genemark® (http://exon.biology.gatech.edu/) for prediction. It also used the predictor Augustus (http://augustus.gobics.de/) trained on de novo assembled sequences and orthologous sequences for gene finding. Sequence similarity searches against the mycoCLAP® (http://cubique.fungalgenomics.ca/mycoCLAP/) and NCBI non-redundant databases were performed with BLASTX as described in Altschul et al., (1997) (Nucleic Acids Res. 25(17): 3389-3402). Biomass-degrading enzymes possess conserved domains. We used the domains available at the European Bioinformatics Institute (www.ebi.ac.uk/Tools/InterProScan/) to assist in the identification of target enzymes.

[00244] Proteins targeted to the extracellular space by the classical secretory pathway possess an N-terminal signal peptide, composed of a central hydrophobic core surrounded by N- and C-terminal hydrophilic regions. We used Phobius (available at http://phobius.cgb.ki.se) and SignalP® version 3 (available at http://www.cbs.dtu.dk/services/SignalP) to recognize the presence of signal peptides encoded by the cDNA clones. The tools TargetP® (available at http://www.cbs.dtu.dk/services/TargetP) and Big-PI Fungal Predictor (available at http://mendel.imp.ac.at/gpi/fungi_server.html) were used to remove sequences that encode proteins which are targeted to the mitochondria or bound to the cell wall. Finally, sequences predicted to encode soluble secreted proteins by these automated tools were analyzed manually. Clones that comprise full-length cDNAs which are predicted to encode soluble secreted proteins were sequenced completely. For genes identified from the genome sequence, oligonucleotide primers specific to the target genes were designed and used to PCR-amplify the target genes from double-stranded cDNA or genomic DNA. The PCR-amplified products were cloned into an appropriate expression vector for protein production in host cells. The genomic, coding and polypeptide sequences were assigned SEQ ID NOs, annotations, general functions, protein activities, CAZy family classifications, as summarized in Tables 1A-1C. Where appropriate, carbohydrate-binding modules (CBMs) of particular interest for the degradation of biomass were also listed in Tables 1A-1C.
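
The selection logic described above (keep sequences with a predicted signal peptide, then discard those predicted to be mitochondrial or cell-wall bound) can be expressed as a simple filter. The Python sketch below is illustrative only: the `Prediction` record, its field names, and the example identifiers are hypothetical and do not correspond to the actual output formats of Phobius, SignalP, TargetP, or Big-PI.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical per-sequence summary of the prediction tools' calls."""
    seq_id: str
    has_signal_peptide: bool   # e.g. called by a signal peptide predictor
    mitochondrial: bool        # e.g. called by a subcellular targeting predictor
    gpi_anchored: bool         # e.g. called by a GPI-anchor predictor


def is_soluble_secreted_candidate(p: Prediction) -> bool:
    """Keep sequences with a signal peptide that are neither mitochondrial nor GPI-anchored."""
    return p.has_signal_peptide and not p.mitochondrial and not p.gpi_anchored


# Made-up records for illustration.
predictions = [
    Prediction("seq_A", True, False, False),
    Prediction("seq_B", True, False, True),
    Prediction("seq_C", False, True, False),
]
candidates = [p.seq_id for p in predictions if is_soluble_secreted_candidate(p)]
print(candidates)
```

In the workflow described above, candidates passing such a filter were still reviewed manually before full sequencing.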

Example 5: Assays for characterization of polypeptides

[00245] Polypeptides of the present invention may be additionally cloned into an expression vector, expressed and characterized (e.g., in sugar release assays) for activity relating to their ability to break down and/or process biomass as described in WO/2012/92676, WO/2012/130950, and WO/2012/130964 using appropriate substrates (e.g., acid pre-treated corn stover (aCS), hot water treated washed wheat straw, or hot water treated washed corn fiber substrate). Soluble sugars that are released can be analyzed for example by proton NMR.

[00246] A number of assays may be used to characterize the polypeptides of the present invention. Selected non-limiting examples of such assays are described and/or referenced below. Of course, other assays not explicitly mentioned or referenced here may also be used, and the expression "can be" used below is intended to reflect this possibility. Furthermore, a person of skill in the art would be able to modify or adapt these and other assays, as necessary, to characterize a particular polypeptide.

(1,4)-2-acetamido-2-deoxy-beta-D-glucan glucanohydrolase. Polypeptides of the present invention having this activity catalyze the random, internal hydrolysis of beta-1,4-linkages between glucosamine residues in chitin and chitosan, and can be characterized for example as described in Takaya et al., Microbiology (Reading, Engl.) 1998 Sep; 144 (Pt 9):2647-54.

1,3-beta-D-glucanase. Polypeptides of the present invention having this activity can be characterized for example as described in Erfle and Teather, Appl Environ Microbiol. (1991), 57(1): 122-129.

1,4-beta-D-glucan cellobiohydrolase. Polypeptides of the present invention having this activity can be characterized for example as described in Chanzy et al., FEBS Letters (1983), 153(1): 113-118; Gielkens et al., Appl. Environ. Microbiol. (1999), 65: 4340-4345; Hanif et al., Bioresour. Technol. (2004), 94:311-319; Liu et al., Biosci. Biotechnol. Biochem. (2009), 73(6): 1432-4; or Yoshida et al., Biosci. Biotechnol. Biochem. (2009), 73(1): 67-73.

11-beta-hydroxysteroid dehydrogenase 1 B. Polypeptides of the present invention having this activity can be characterized for example as described in Blum et al., Biochemistry (2003), 42(14):4108-17.

2-(R)-hydroxypropyl-CoM dehydrogenase (2-(2-(R)-hydroxypropylthio)ethanesulfonate dehydrogenase). Polypeptides of the present invention having this activity can be characterized for example as described in Sliwa et al., Biochemistry (2010), 49(16):3487-98.

2-alpha-D-(4-O-methyl)glucuronohydrolase. Polypeptides of the present invention having this activity can be characterized for example as described in Tenkanen and Siika-aho, J. Biotechnol. (2000), 78(2): 149-61; or Chong et al., Appl. Microbiol. Biotechnol. (2011), 90(4): 1323-32.

2-dehydro-3-deoxy-D-gluconate 5-dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Tubeleviciute et al., Appl Microbiol Biotechnol. (2014) 98(12): 5471-85.

2-hydroxyacid dehydrogenase (EC 1.1.99.6). Polypeptides of the present invention having this activity can be characterized for example as described in Cammack, Biochem. J. (1970), 118(3): 405-408.

3-hydroxybenzoate 6-hydroxylase. Polypeptides of the present invention having this activity can be characterized for example as described in Sucharitakul et al., J Biol Chem. (2013), 288(49): 35210-21; Montersino et al., J Biol Chem. (2013), 288(36): 26235-45; or Montersino and van Berkel, Biochim Biophys Acta. (2012), 1824(3): 433-42.

3-hydroxybutyryl-CoA dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Kim et al., Biochem Biophys Res Commun. (2014), 443(3): 783-8; or Madan et al., Eur. J. Biochem. (1973), 32(1): 51-6.

3-oxoacyl-[acyl-carrier-protein] reductase FabG. Polypeptides of the present invention having this activity can be characterized as described in Slabas et al., Biochem J. (1992), 283(Pt2): 321-6; Prescott et al. Adv. Enzymol. Relat. Areas. Mol. Biol. (1972), 36: 269-311; Shimakata et al., Arch. Biochem. Biophys. (1982), 218(1): 77-91; or Toomey and Wakil, Biochim. Biophys. Acta (1966), 116(2): 189-97.

4-beta-D-glucan cellobiohydrolase. Polypeptides of the present invention having this activity can be characterized as described in Liu et al., Biosci. Biotechnol. Biochem. (2009), 73(6): 1432-4; or Yoshida et al., Biosci. Biotechnol. Biochem. (2009), 73(1): 67-73.

4-O-methyl-glucuronoyl methylesterase. Polypeptides of the present invention having this activity can be characterized as described in Spanikova et al., FEBS Letters (2006), 580(19): 4597-4601.

4-O-methyl-glucuronoyl methylesterase. Polypeptides of the present invention having this activity can be characterized for example as described in Li et al., FEBS Lett. (2007), 581(21): 4029-35; or Spanikova and Biely, FEBS Lett. (2006), 580(19): 4597-601.

Acetylxylan esterase. Polypeptides of the present invention having this activity can be characterized as described in Water et al., Appl Environ Microbiol. (2012), 78(10): 3759-62; Yang et al., International Journal of Molecular Sciences (2010), 11(12): 5143-5151; or in US patent No. 8,129,590.

Adhesin protein Mad1. Polypeptides of the present invention having this activity can be characterized for example as described in Wang and St Leger, Eukaryot. Cell (2007), 6(5): 808-816.

Adhesin. Polypeptides of the present invention having this activity (reviewed in Dranginis et al., Microbiology and Molecular Biology Reviews (2007), 71(2): 282-294) can be characterized using techniques well known in the art (e.g. adhesion assays).

Alcohol dehydrogenase [acceptor]. Polypeptides of the present invention having this activity can be characterized for example as described in Krog et al., PLoS One (2013) 8(3):e59188.

Aldonolactonase. Polypeptides of the present invention having this activity can be characterized for example as described in Beeson et al., Appl Environ Microbiol. (2011), 77(2): 650-6; or Ishikawa et al., J Biol Chem. (2008), 283(45): 31133-41.

Aldose 1-epimerase (mutarotase, aldose mutarotase). Polypeptides of the present invention having this activity can be characterized for example as described in Timson and Reece, FEBS Letters (2003), 543(1-3):21-24; and Villalobo et al., Exp. Parasitol. (2005) 110(3): 298-302.

Alkaline protease 2. Polypeptides of the present invention having this activity can be characterized for example as described in Gomaa, Braz J Microbiol. (2013) 44(2):529-37; or Yao et al., J Food Sci Technol. (2012), 49(5):626-31.

Allergen Asp f 15. Polypeptides of the present invention having this activity can be characterized for example as described in Bowyer et al., Medical Mycology (2007), 45(1): 17-26.

Alpha-arabinofuranosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Poutanen et al. (Appl. Microbiol. Biotechnol. 1988, 28, 425-432) using 5 mM p-nitrophenyl alpha-L-arabinofuranoside as substrate. The reactions may be carried out in 50 mM citrate buffer at pH 6.0, 40°C with a total reaction time of 30 min. The reaction is stopped by adding 0.5 ml of 1 M sodium carbonate and the liberated p-nitrophenol is measured at 405 nm. Activity is expressed in U/ml. Furthermore, arabinofuranosidases may also be useful in animal feed compositions to increase digestibility. Corn arabinoxylan is heavily di-substituted with arabinose. To facilitate xylan degradation, it is advantageous to remove as many of the arabinose substituents as possible. The in vitro degradation of arabinoxylans in a corn-based diet supplemented with a polypeptide of the present invention having alpha-arabinofuranosidase activity and a commercial xylanase can be studied in an in vitro digestion system, as described in WO/2006/114094.
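As an illustration of how an endpoint reading from the p-nitrophenyl assay described above might be converted into U/ml, the following Python sketch applies the Beer-Lambert law. Only the 30 min reaction time and the 405 nm wavelength come from the text. The extinction coefficient, the enzyme and final volumes, the path length, and the definition of one unit as 1 µmol p-nitrophenol released per minute are assumptions for illustration, and the function name is hypothetical. In practice a p-nitrophenol standard curve measured under the same conditions would replace the assumed coefficient.

```python
def pnp_activity_u_per_ml(
    a405_sample: float,
    a405_blank: float,
    *,
    reaction_min: float = 30.0,        # total reaction time stated in the text
    enzyme_ml: float = 0.1,            # ASSUMPTION: volume of enzyme sample added
    final_ml: float = 1.0,             # ASSUMPTION: volume after adding sodium carbonate
    epsilon_per_mM_cm: float = 18.0,   # ASSUMPTION: p-nitrophenolate at 405 nm, alkaline pH
    path_cm: float = 1.0,              # ASSUMPTION: standard cuvette
    dilution: float = 1.0,             # dilution factor of the enzyme sample, if any
) -> float:
    """Return activity in U/ml, taking 1 U = 1 µmol p-nitrophenol per minute (assumed)."""
    net = a405_sample - a405_blank
    if net < 0:
        raise ValueError("sample absorbance is below the blank")
    # Beer-Lambert: concentration in mM, which is numerically µmol per ml
    pnp_umol_per_ml = net / (epsilon_per_mM_cm * path_cm)
    pnp_umol = pnp_umol_per_ml * final_ml
    return pnp_umol / reaction_min / enzyme_ml * dilution


if __name__ == "__main__":
    activity = pnp_activity_u_per_ml(0.59, 0.05)
    print(f"activity: {activity:.4f} U/ml")
```

The blank (substrate without enzyme, stopped identically) corrects for spontaneous substrate hydrolysis. All volumes should be replaced by the values actually used.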

Alpha-fucosidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,637,490; in Zielke et al., J. Lab. Clin. Med. (1972), 79:164; or using commercially available kits (e.g., Alpha-L-Fucosidase (AFU) Assay Kit, Cat. No. BQ082A-EALD, BioSupplyUK).

Alpha-galactosidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2010/0273235 A1. Briefly, a synthetic substrate, 4-nitrophenyl-α-D-galactoside, is used and the release of p-nitrophenol is followed at a wavelength of 405 nm in a reaction buffer containing 100 mM sodium phosphate, 50 mM sodium chloride, pH 6.8 at 26°C.
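Because the release of p-nitrophenol is followed over time in this assay, the rate can be estimated as the slope of absorbance against time. The Python sketch below fits that slope by ordinary least squares and converts it to µmol per minute per ml of enzyme. It is a minimal sketch, not the method of the cited publication: the function names, the volumes, and the extinction coefficient are hypothetical inputs. The coefficient in particular has to be supplied by the user, since at pH 6.8 p-nitrophenol is only partly ionized and its absorbance at 405 nm depends on the buffer.

```python
from typing import Sequence


def least_squares_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def rate_umol_per_min_per_ml(
    times_min: Sequence[float],
    a405: Sequence[float],
    *,
    epsilon_per_mM_cm: float,   # user-supplied calibration for the assay buffer
    reaction_ml: float,         # ASSUMPTION: total reaction volume
    enzyme_ml: float,           # ASSUMPTION: enzyme volume in the reaction
    path_cm: float = 1.0,
) -> float:
    """Convert the A405 slope (per minute) into µmol product per minute per ml enzyme."""
    slope = least_squares_slope(times_min, a405)        # change in A405 per minute
    mM_per_min = slope / (epsilon_per_mM_cm * path_cm)  # mM is µmol per ml
    return mM_per_min * reaction_ml / enzyme_ml


if __name__ == "__main__":
    t = [0.0, 1.0, 2.0, 3.0]
    a = [0.10, 0.15, 0.20, 0.25]
    print(rate_umol_per_min_per_ml(t, a, epsilon_per_mM_cm=10.0,
                                   reaction_ml=1.0, enzyme_ml=0.05))
```

Only readings from the linear part of the progress curve should be passed in, so that the slope reflects the initial rate.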

Alpha-glucosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Yamamoto et al., Eur. J. Biochem. (2004), 271(16): 3414-3420; or for example using commercial kits (e.g., available from Sigma-Aldrich).

Alpha-glucuronidase GH67. Polypeptides of the present invention having this activity can be characterized for example as described in Lee et al., J Ind Microbiol Biotechnol. (2012), 39(8): 1245-51, or Nagy et al., J. Bacteriol. (2002), 184: 4925-4929.

Alpha-L-arabinofuranosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Giacobbe et al., N Biotechnol. (2014), 31(3):230-6; Shi et al., Biotechnol Lett. (2014), 36(6): 1321-8; Maehara et al., J Biol Chem. (2014), 289(11):7962-72.

Alpha-L-rhamnosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Rosenfeld and Wiederschein, Bull. Soc. Chim. Biol. (1965), 47(7): 1433-1440, PMID 5855461; Fujimoto et al., J Biol Chem. (2013), 288(17):12376-85.

Alpha-N-arabinofuranosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Kaji and Tagawa, Biochim. Biophys. Acta (1970), 207: 456-464; Kaji and Yoshihara, Biochim. Biophys. Acta (1971), 250: 367-371; Tagawa and Kaji, Carbohydr. Res. (1969), 11: 293-301.

Alpha-rhamnosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Fujimoto et al., J Biol Chem. (2013), 288(17): 12376-85; Rodriguez et al., J Appl Microbiol. (2010), 109(6): 2206-13; Grandits et al., J Mol Catal B Enzym. (2013), 92(100): 34-43.

Aminopeptidase Y. Polypeptides of the present invention having this activity can be characterized for example as described in Yasuhara et al., J. Biol. Chem. (1994) 269(18): 13644-50.

Arabinan endo-1,5-alpha-L-arabinosidase A. Polypeptides of the present invention having this activity can be characterized for example as described in Flipphi et al., Appl. Microbiol. Biotechnol. (1993), 40: 318-326; and Leal and de Sa-Nogueira, FEMS Microbiol. Lett. (2004), 241: 41-48.

Arabinogalactan endo-1,4-beta-galactosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Emi and Yamamoto, Agric. Biol. Chem. (1972), 36: 1945-1954; Labavitch et al., J. Biol. Chem. (1976), 251: 5904-5910; or Shipkowski and Brenchley, Appl. Environ. Microbiol. (2006), 72: 7730-7738.

Arabinogalactanase. Polypeptides of the present invention having this activity can be characterized for example as described in Yamamoto and Emi, Methods in Enzymology (1988), 160: 719-725.

Arabinosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Pei and Shao, Appl Microbiol Biotechnol. (2008), 78(1): 115-21; Xiong et al., Journal of Experimental Botany (2007), 58(11): 2799-2810.

Arabinoxylan arabinofuranohydrolase (AXH) GH43. Polypeptides of the present invention having this activity can be characterized for example as described in Yoshida et al., Journal of Bacteriology (2010), 192(20): 5424-5436.

Arabinoxylan arabinofuranosidase GH62. Polypeptides of the present invention having this activity can be characterized for example as described in Sakamoto et al., Applied Microbiology and Biotechnology (2011), 90(1): 137-146.

Aryl-alcohol oxidase. Polypeptides of the present invention having this activity can be characterized for example as described in Hernandez-Ortega et al., Biochemistry (2012), 51(33): 6595-608; Hernandez-Ortega et al., Appl Microbiol Biotechnol. (2012), 93(4): 1395-410.

Aspartate-semialdehyde dehydrogenase (EC 1.2.1.11). Polypeptides of the present invention having this activity can be characterized for example as described in Black et al., J. Biol. Chem. (1955) 213:39-50; US Patent No. 7,723,097.

Aspartic protease. Polypeptides of the present invention having this activity can be characterized for example as described in Tacco et al., Med. Mycol. (2009), 47(8): 845-854; or in Hu et al., Journal of Biomedicine and Biotechnology (2012), 2012:728975.

Aspartic-type endopeptidase. Polypeptides of the present invention having this activity can be characterized for example as described in Tjalsma et al., J. Biol. Chem. (1999), 274: 28191-28197.

Aspergillopepsin-2. Polypeptides of the present invention having this activity can be characterized for example as described in Huang et al., Journal of Biological Chemistry (2000), 275(34): 26607-14.

Avenacinase. Polypeptides of the present invention having this activity can be characterized for example as described in Kwak et al., Phytopathology (2010), 100(5): 404-14; or in Bowyer et al., Science (1995), 267(5196): 371-4.

Beta-galactosidase. Polypeptides of the present invention having this activity can be characterized for example using commercially available kits (e.g., β-Galactosidase Enzyme Assay System with Reporter Lysis Buffer, Cat. No. E2000, Promega).

Beta-glucanase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication number US 2012/0023626 A1; or in US patent No. 8,309,338.

Beta-glucosidase. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO/2007/019442; or by using a commercially available kit (e.g., Beta-Glucosidase Assay Kit, Cat. No. KA1611, Abnova Corp).

Beta-glucuronidase. Polypeptides of the present invention having this activity can be characterized for example as described in Eudes et al., Plant Cell Physiology (2008), 49(9): 1331-41; or Michikawa et al., Journal of Biological Chemistry (2012), 287: 14069-14077.

Beta-hexosaminidase. Polypeptides of the present invention having this activity can be characterized for example as described in Wendeler and Sandhoff, Glycoconj J. (2009), 26(8): 945-52.

Beta-mannanase. Polypeptides of the present invention having this activity can be characterized for example as described in European patent application No. EP 2261359 A1; or in PCT application publication No. WO2008009673A2.

Beta-mannosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Park et al., N. Biotechnol. (2011), 28(6): 639-48; Duffaud et al., Appl Environ Microbiol. (1997), 63(1): 169-77; or in Fliedrova et al., Protein Expr Purif. (2012), 85(2): 159-64.

Beta-xylosidase; xylan 1,4-beta xylosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Wagschal et al., Applied and Environmental Microbiology (2005), 71(9): 5318-5323; or Shao et al., Appl Environ Microbiol. (2011), 77(3): 719-726.

Bifunctional alpha-arabinofuranosidase/beta-xylosidase GH43. Polypeptides of the present invention having this activity can be characterized for example as described in Viborg et al., AMB Express. (2013), 3(1):56; Shi et al., Biotechnol Biofuels (2013), 6(1):27; or Kim and Yoon, J Microbiol Biotechnol. (2010), (12): 1711-6.

Bifunctional solanapyrone synthase. Polypeptides of the present invention having this activity can be characterized for example as described in Kasahara et al., ChemBioChem (2010), 11: 1245-1252; Katayama et al., Biosci. Biotechnol. Biochem. (2008), 72: 604-607; or Katayama et al., Biochim. Biophys. Acta (1998), 1384: 387-395.

Bifunctional xylanase/deacetylase. Polypeptides of the present invention having this activity can be characterized for example as described in Cepeljnik et al., Folia Microbiol. (2006), 51(4): 263-267; US patent application publication No. US 2012/0028306 A1; US patent No. 7,759,102; or PCT application publication No. WO 2006/078256 A2; or Grozinger and Schreiber, Chem Biol. (2002), 9(1): 3-16.

Carbohydrate-binding cytochrome. Polypeptides of the present invention having this activity can be characterized for example as described in Yoshida et al., Appl Environ Microbiol. (2005) 71(8): 4548-4555.

Carboxylesterase. Polypeptides of the present invention having this activity can be characterized for example using a commercially available kit such as the Carboxylesterase 1 (CES1) Specific Activity Assay Kit (ab109717) (Abcam, Cambridge, MA, USA).

Carboxypeptidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2007/0160711 A1; or in PCT application publication No. WO 1998/014599A1.

Cellobiohydrolase GH6. Polypeptides of the present invention having this activity can be characterized for example as described in Takahashi et al., Applied and Environmental Microbiology (2010), 76(19): 6583-6590.

Cellobiohydrolase GH7. Polypeptides of the present invention having this activity can be characterized for example as described in Segato et al., Biotechnology for Biofuels (2012), 5:21; or Baumann et al., Biotechnol. for Biofuels (2011), 4:45; or Naran et al., Plant J. (2007), 50(1):95-107.

Cellobiose dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Schou et al., Biochem. J. (1998), 330: 565-571; or Baminger et al., J. Microbiol Methods. (1999), 35(3): 253-9.

Chitin deacetylase. Polypeptides of the present invention having this activity can be characterized for example as described in European patent application No. EP 0610320 B1.

Chitinase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 7,087,810.

Chitinase-3-like protein 1. Polypeptides of the present invention having this activity can be characterized for example as described in Dela Cruz et al., Cell Host Microbe (2012), 12(1):34-46.

Chitooligosaccharide deacetylase. Polypeptides of the present invention having this activity can be characterized for example as described in John et al., Proc Natl Acad Sci USA (1993), 90(2): 625-9.

Chitosanase (EC 3.2.1.132). Polypeptides of the present invention having this activity can be characterized for example as described in Boucher et al., J. Biol. Chem. (1995), 270(52): 31077-82; WO2012/100345.

Chitotriosidase-1. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 6,057,142.

Choline dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Gadda and McAllister-Wilkins, Appl. Environ. Microbiol. (2003) 69(4): 2126-32; or Takabe et al., J. Biol. Chem. (2003), 278(7): 4932-42.

Cholinesterase. Polypeptides of the present invention having this activity can be characterized for example as described in Abass Askar et al., Canadian Journal Veterinary Research (2011), 75(4): 261-270; or Catia et al., PLoS One (2012), 7(3): e33975.

Cis-2,3-dihydrobiphenyl-2,3-diol dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Sylvestre et al., Appl. Environ. Microbiol. (1996), 62(8): 2710-5; Takagi and Yano, Biochem. Biophys. Res. Commun. (1994), 202(2): 850-6; or Hofer et al., Gene (1993), 130(1): 47-55.

Cuticle-degrading protease. Polypeptides of the present invention having this activity can be characterized for example as described in Joshi et al., FEMS Microbiol Lett. (1995), 125(2-3): 211-7; Sheng et al., Curr Microbiol. (2006), 53(2): 124-8.

Cutinase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2012/0028318 A1; or in Chen et al., J. Biol Chem. (2008), 283(38): 25854-62.

Cys-Gly metallodipeptidase. Polypeptides of the present invention having this activity can be characterized for example as described in Ganguli et al., Genetics (2007), 175: 1137-1151; Kaur et al., J. Biol. Chem. (2009), 284: 14493-14502.

Cytochrome P450. Polypeptides of the present invention having this activity can be characterized for example using commercially available kits (e.g., P450-Glo™ Assays, Promega); or as described in Walsky and Obach, Drug Metab Dispos. (2004), 32(6): 647-60.

Dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Mayer and Arnold, J. Biomol. Screen. (2002), 7(2): 135-140.

Dextranase. Polypeptides of the present invention having this activity can be characterized for example as described in Ko and Khouw, Can. J. Biochem. (1970), 48(3): 225-227; or Zohra et al., Carbohydrate Polymers (2013), 92: 2149-2153.

Dipeptidyl peptidase. Polypeptides of the present invention having this activity can be characterized for example as described in Ohara-Nemoto et al., J Biol Chem. (2014), 289(9):5436-48.

Endo-1,3(4)-beta-glucanase (laminarinase). Polypeptides of the present invention having this activity can be characterized for example as described in Akiyama et al., J Plant Physiol. (2009), 166(16): 1814-25; or Hua et al., Biosci Biotechnol Biochem. (2011), 75(9): 1807-12.

Endo-1,4-beta-xylanase. Polypeptides of the present invention having this activity can be characterized for example as described in Song et al., Enzyme and Microbial Technology (2013), 52(3): 170-176.

Endo-1,5-alpha-arabinanase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent publication No. US 2012/0270263. More particularly, this assay of arabinase activity is based on colorimetric determination of the resulting increase in reducing groups using a 3,5-dinitrosalicylic acid reagent. Enzyme activity can be calculated from the relationship between the concentration of reducing groups, as arabinose equivalents, and absorbance at 540 nm. The assay is generally carried out at pH 3.5, but it can be performed at different pH values for the additional characterization and specification of enzymes. Polypeptides of the present invention having this activity can also be characterized for example as described in Hong et al., Biotechnol Lett. (2009), 31(9): 1439-43.
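The relationship between reducing-group concentration and absorbance at 540 nm mentioned above is typically established with an arabinose standard curve. The Python sketch below shows one way this could be done: a straight line is fitted to the standards by least squares and then inverted to read sample absorbances as arabinose equivalents. The function names, the units, and the example standards are hypothetical and are not taken from the cited publication. The sketch also assumes that the response is linear over the range of the standards.

```python
from typing import Sequence, Tuple


def fit_standard_curve(
    conc: Sequence[float], a540: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares line a540 = slope * conc + intercept for arabinose standards."""
    n = len(conc)
    if n != len(a540) or n < 2:
        raise ValueError("need at least two paired standards")
    mean_c = sum(conc) / n
    mean_a = sum(a540) / n
    scc = sum((c - mean_c) ** 2 for c in conc)
    if scc == 0:
        raise ValueError("standards must span more than one concentration")
    sca = sum((c - mean_c) * (a - mean_a) for c, a in zip(conc, a540))
    slope = sca / scc
    intercept = mean_a - slope * mean_c
    return slope, intercept


def arabinose_equivalents(a540_sample: float, slope: float, intercept: float) -> float:
    """Invert the standard curve; result is in the concentration units of the standards."""
    if slope == 0:
        raise ValueError("standard curve has zero slope")
    return (a540_sample - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standards (concentration units chosen by the user) and readings
    standards = [0.0, 0.5, 1.0, 2.0]
    readings = [0.02, 0.27, 0.52, 1.02]
    slope, intercept = fit_standard_curve(standards, readings)
    print(arabinose_equivalents(0.77, slope, intercept))
```

Dividing the arabinose equivalents released by the reaction time and the enzyme volume then gives an activity, with the unit definition left to the protocol in use.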

Endo-1,6-beta-glucanase. Polypeptides of the present invention having this activity can be characterized for example as described in Bryant et al., Fungal Genet Biol. (2007), 44(8): 808-17; or in Oyama et al., Biosci Biotechnol Biochem. (2006), 70(7): 1773-5.

Endo-arabinase. Polypeptides of the present invention having this activity can be characterized for example as described in Takao et al., Biosci Biotechnol Biochem. (2002), 66(2): 430-3; Takao et al., Appl Environ Microbiol. (2002), 68(4): 1639-46.

Endo-beta-1,3-galactanase. Polypeptides of the present invention having this activity can be characterized for example as described in Kotake et al., J Biol Chem. (2011), 286(31): 27848-54; Ichinose et al., Appl Environ Microbiol. (2006), 72(5): 3515-3523.

Endo-beta-1,3(4)-glucanase (EC 3.2.1.6). Polypeptides of the present invention having this activity can be characterized for example as described in WO 1995031533 and WO2013037933.

Endo-beta-1,4-glucanase celB. Polypeptides of the present invention having this activity can be characterized for example as described in Baird et al., J Bacteriol. (1990), 172(3): 1576-1586; Jorgensen and Hansen, Gene. (1990), 93(1):55-60; Jagtap et al., "Characterization of a novel endo-β-1,4-glucanase from Armillaria gemina and its application in biomass hydrolysis", Appl Microbiol Biotechnol. (2013).

Endo-beta-1,6-galactanase. Polypeptides of the present invention having this activity can be characterized for example as described in Ichinose et al., Appl Environ Microbiol. (2008), 74(8): 2379-83.

Endochitinase. Polypeptides of the present invention having this activity can be characterized for example as described in Wen et al., Biotechnol. Applied Biochem. (2002), 35: 213-219.

Endoglucanase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 8,063,267; Couturier et al., Microb Cell Fact. (2011), 10:103; Badieyan et al., Biotechnol Bioeng. (2012), 109(1): 31-44; Pereira et al., J Struct Biol. (2010), 172(3): 372-9; Poidevin et al., "Cloning, expression, and characterization of a thermostable GH7 endoglucanase from Myceliophthora thermophila capable of high-consistency enzymatic liquefaction", Appl Environ Microbiol. (2013), 79(14): 4220-9.

Endoglycoceramidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,795,765; or US patent application publication No. US 2009/0170155 A1.

Endo-inulinase. Polypeptides of the present invention having this activity can be characterized for example as described in Vandamme et al., FEBS Open Bio. (2013), 3: 467-472; US patent No. 8,309,079.

Endo-N-acetyl-beta-D-glucosaminidase. Polypeptides of the present invention having this activity can be characterized for example as described in Stals et al., PLoS One. (2012), 7(7): e40854; Stals et al., FEMS Microbiol Lett. (2010), 303(1): 9-17.

Endo-polygalacturonase. Polypeptides of the present invention having this activity can be characterized for example as described in European patent application publication Nos. EP1614748 A1 and EP1114165 A1.

Endo-polygalacturonase. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO 1994/014952 A1; or in European patent application publication No. EP1614748 A1.

Endo-rhamnogalacturonase GH28. Polypeptides of the present invention having this activity can be characterized for example as described in Sprockett et al., Gene (2011), 479(1-2): 29-36; or An et al., Carbohydrate Research (1994), 264(1): 83-96.

Endothiapepsin. Polypeptides of the present invention having this activity can be characterized for example as described in Choi et al., Gene (1993), 125(2): 35-41; Williams et al., Arch. Biochem. Biophys. (1972), 149: 52-61.

Exo-1,3-beta-galactanase GH43. Polypeptides of the present invention having this activity can be characterized for example as described in Ichinose et al., Appl Environ Microbiol. (2006), 72(5): 3515-3523.

Exo-1,3-beta-glucanase GH17. Polypeptides of the present invention having this activity can be characterized for example as described in Wojtkowiak et al., Acta Crystallogr D Biol Crystallogr. (2013) 69(Pt 1):52-62; Tao et al., Gene (2013), 527(1):154-60.

Exo-1,3-beta-glucanase. Polypeptides of the present invention having this activity can be characterized for example as described in O'Connell et al., Appl Microbiol Biotechnol. (2011), 89(3): 685-96; or Santos et al., J Bacteriol. (1979), 139(2): 333-338.

Exo-1,4-beta-xylosidase. Polypeptides of the present invention having this activity can be characterized for example as described in La Grange et al., Applied and Environmental Microbiology (2001), 67(12): 5512-5519.

Exo-arabinanase. Polypeptides of the present invention having this activity can be characterized for example as described in Tatsuji Sakamoto and Thibault, Appl Environ Microbiol. (2001), 67(7): 3319-3321.

Exo-beta-D-glucosaminidase. Polypeptides of the present invention having this activity can be characterized for example as described in Honda et al., Glycobiology. (2011), 21(4): 503-11; Li et al., Carbohydr Res. (2009), 344(8): 1046-9; Fukamizo et al., Glycobiology. (2006), 16(11): 1064-72; Nogawa et al., Appl Environ Microbiol. (1998), 64(3): 890-5; Jung et al., Protein Expr Purif. (2006), 45(1): 125-31.

Exoglucanase. Polypeptides of the present invention having this activity can be characterized for example as described in Creuzet et al., FEMS Microbiology Letters (1983), 20(3): 347-350; or Kruus et al., Journal of Bacteriology (1995), 177(6): 1641-1644.

Exo-glucosaminidase GH2. Polypeptides of the present invention having this activity can be characterized for example as described in Tanaka et al., Journal of Bacteriology (2003), 185(17): 5175-5181.

Exo-inulinase. Polypeptides of the present invention having this activity can be characterized for example as described in Kulminskaya et al., Biochimica et Biophysica Acta (2003), 1650(1-2):22-9; Pessoni et al., Mycologia. (2007), 99(4):493-503.

Exo-polygalacturonase. Polypeptides of the present invention having this activity can be characterized for example as described in Dong and Wang, BMC Biochem. (2011), 12: 51.

Exo-rhamnogalacturonase GH28. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,811,291.

Expansin. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO 2005/030965 A2; or in US patent No. 7,001,743.

Expansin-like protein 1. Polypeptides of the present invention having this activity can be characterized for example as described in Lee et al., Molecules and Cells (2010), 29(4): 379-85.

Extracellular endo-alpha-(1->5)-L-arabinanase 1. Polypeptides of the present invention having this activity can be characterized for example as described in Inacio and de Sa-Nogueira, J Bacteriol. (2008), 190(12):4272-80.

Extracellular exo-alpha-(1->5)-L-arabinofuranosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Matsuo et al., Biochem. J. (2000), 346:9-15.

Extracellular metalloproteinase 10. Polypeptides of the present invention having this activity can be characterized for example as described in Almeida et al., Parasitol Res. (2003), 89(4); Shibata et al., J Biol Chem. (2000), 275(12):8349-54; Kim and Kim, Can J Microbiol. (1994), 40(2):120-6.

Feruloyl esterase. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO 2009/076122 A1.

Festuclavine dehydrogenase (EC 1.5.1.44). Polypeptides of the present invention having this activity can be characterized for example as described in Wallwey et al., Org. Biomol. Chem. (2010), 8(15): 3500-3508.

Fumarate reductase. Polypeptides of the present invention having this activity can be characterized for example as described in Iverson et al., Science (1999), 284(5422): 1961-6; Lancaster et al., Nature (1999), 402(6760): 377-85; or Maklashina et al., J Biol Chem. (2006), 281(16): 11357-65.

Galactanase GH5. Polypeptides of the present invention having this activity can be characterized for example as described in Ichinose et al., Applied and Environmental Microbiology (2008), 74(8): 2379-2383.

Galacturan 1,4-alpha-galacturonidase C. Polypeptides of the present invention having this activity can be characterized for example as described in Dong and Wang, BMC Biochem. (2011), 12:51.

Gamma-glutamyltranspeptidase. Polypeptides of the present invention having this activity can be characterized for example using a commercially available kit such as the gamma-Glutamyltransferase (GGT) Activity Colorimetric Assay Kit (Sigma-Aldrich, Cat. No. MAK089).

Glucan 1,3-beta-glucosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Boonvitthya et al., Biotechnol Lett (2012), 34(10): 1937-43.

Glucan endo-1,3-beta-glucosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Sperisen et al., Proc Natl Acad Sci USA (1991), 88(5):1820-4.

Glucan endo-1,6-beta-glucosidase B. Polypeptides of the present invention having this activity can be characterized for example as described in Fayad et al., Appl Microbiol Biotechnol. (2001), 57(1-2):117-23.

Gluconolactonase. Polypeptides of the present invention having this activity can be characterized for example as described in Kondo et al., Proc Natl Acad Sci USA (2006), 103(15):5723-8; and Tarighi et al., Microbiology (2008), 154(Pt 10):2979-90.

Glucose oxidase. Polypeptides of the present invention having this activity can be characterized using a commercially available kit such as Amplex® Red Glucose/Glucose Oxidase Assay Kit (Cat. No. A22189, Life Technologies).

Glucose-6-phosphate 1-epimerase. Polypeptides of the present invention having this activity can be characterized for example as described in Wurster and Hess, Methods Enzymol. (1975), 41: 488-93.

Glucosylceramidase. Polypeptides of the present invention having this activity can be characterized for example as described in Vaccaro et al., Eur J Biochem. (1985), 146(2): 315-21; Vaccaro et al., Enzyme. (1989), 42(2): 87-97; Vaccaro et al., Clin Chim Acta. (1982), 118(1): 1-7.

Glycerol-3-phosphate dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Albertyn et al., FEBS Lett. (1992) 308: 130-132.

Glycosidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 8,119,383.

Glyoxylate reductase. Polypeptides of the present invention having this activity can be characterized for example as described in Duan et al., Enzyme Microb Technol. (2014), 60: 72-9.

Glyoxylate/succinic semialdehyde reductase. Polypeptides of the present invention having this activity can be characterized for example as described in Hoover et al., Biochim Biophys Acta. (2013), 1834(12): 2663-71.

Hephaestin-like protein 1. Polypeptides of the present invention having this activity can be characterized for example as described for oxidoreductases.

Hexosaminidase. Polypeptides of the present invention having this activity can be characterized for example as described in Wendeler and Sandhoff, Glycoconj J. (2009), 26(8):945-952.

Hydrophobin. Polypeptides of the present invention having this activity can be characterized for example as described in Bettini et al., Canadian Journal of Microbiology (2012), 58(8): 965-972; or Niu et al., Amino Acids. (2012), 43(2): 763-71.

Invertase. Polypeptides of the present invention having this activity can be characterized for example as described in Bacon, J.S.D., Methods in Enzymology (1955), Volume I, 258-262; Lever, M. Analytical Biochemistry (1972), Volume 47, 273-279; US patent No. 5,665,579.

Iron transport multicopper oxidase FET3. Polypeptides of the present invention having this activity can be characterized for example as described in Askwith et al., Cell (1994), 76: 403-10; or De Silva et al., J. Biol. Chem. (1995) 270: 1098-1101.

Laccase. Polypeptides of the present invention having this activity can be characterized for example as described in Dedeyan et al., Appl Environ Microbiol. (2000), 66(3): 925-929.

Lactonase. Polypeptides of the present invention having this activity can be characterized for example as described in Khersonsky and Tawfik, ChemBioChem (2006), 7(1): 49-53; or Chow et al., J Biol Chem. (2010) 285(52):40911-20.

Laminarinase. Polypeptides of the present invention having this activity can be characterized for example as described in Ishida et al., J Biol Chem. (2009), 284(15): 10100-10109; or Kawai et al., Biotechnol Lett. (2006), 28(6): 365-71.

L-Ascorbate oxidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent Nos. 5,612,208 and 5,180,672.

L-carnitine dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Aurich et al., Biochim Biophys Acta. (1967), 139(2): 505-7; or US patent No. 5,156,966.

Leucine aminopeptidase 1. Polypeptides of the present invention having this activity can be characterized for example as described in Beattie et al., Biochem. J. (1987), 242: 281-283.

Licheninase (beta-D-glucan 4-glucanohydrolase). Polypeptides of the present invention having this activity can be characterized for example as described in Tang et al., J Agric Food Chem. (2012), 60(9): 2354-61.

Lignin peroxidase. Polypeptides of the present invention having this activity can be characterized for example as described in Arora and Gill, Enzyme and Microbial Technology (2001), 28(7-8): 602-605.

Lipase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent Nos. 7,662,602 and 7,893,232.

L-lactate dehydrogenase A. Polypeptides of the present invention having this activity can be characterized for example as described in Yeswanth et al., Anaerobe (2013), 24:43-8; Xia et al., Mol Biol Rep. (2011), 38(3): 1853-60.

Loosenin. Polypeptides of the present invention having this activity can be characterized for example as described in Quiroz-Castaneda et al., Microbial Cell Factories (2011), 10:8.

L-sorbosone dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Shinjoh et al., Applied and Environmental Microbiology (1995), 61(2): 413-420.

Lysophospholipase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,965,422.

Lysozyme. Polypeptides of the present invention having this activity can be characterized for example using the EnzChek® Lysozyme Assay Kit (Cat. No. E22013, Life Technologies); or as described in Shugar, D. Biochimica et Biophysica Acta (1952), 8: 302-309.

annan endo-1,4-beta-mannosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Zhao et al., Bioresour Technol. (2011), 102(16): 7538-47;

Songsiriritthigul et al., Microb Cell Fact. (2010), 9:20; or Do et al., Microb Cell Fact (2009), 8: 59.

etallocarboxypeptidase. Polypeptides of the present invention having this activity can be characterized for example as described in Tayyab et al., J Biosci Bioeng. (2011), 1 11(3): 259-65; or Song et al., J Biol Chem. (1997), 272(16): 10543-50.

Methylenetetrahydrofolate dehydrogenase [NAD(+)]. Polypeptides of the present invention having this activity can be characterized for example as described in Wohlfarth et al., J Bacteriol. (1991), 173(4): 1414-1419.

Mixed-link glucanase. Polypeptides of the present invention having this activity can be characterized for example as described in Clark et al., Carbohydr Res. (1978), 61: 457-477.

Monooxygenase. Polypeptides of the present invention having this activity can be characterized for example as described in Haghbeen and Tan, Anal Biochem. (2003), 312(1): 23-32.

Mucorpepsin. Polypeptides of the present invention having this activity can be characterized for example as described in Baudy et al., FEBS Lett. (1988), 235: 271-274.

Multicopper oxidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2012/0094335 A1.

Mutanase. Polypeptides of the present invention having this activity can be characterized for example as described in Pleszczyńska, Biotechnol Lett. (2010), 32(11): 1699-1704; or WO 1998/000528 A1.

N-acetylglucosaminidase GH18. Polypeptides of the present invention having this activity can be characterized for example as described in Murakami et al., Glycobiology (2013), 23(6):736-44; or in US patent application publication No. US20120258089 A1.

N-acyl-phosphatidylethanolamine-hydrolyzing phospholipase D. Polypeptides of the present invention having this activity can be characterized for example as described in Guo et al., J Lipid Res. (2013), 54(11):3151-7.

NADP-dependent L-serine/L-allo-threonine dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Fujisawa et al., Biochim Biophys Acta. (2003), 1645(1): 89-94.

NADPH-cytochrome P450 reductase. Polypeptides of the present invention having this activity can be characterized for example as described in Guengerich et al., Nat Protoc. (2009), 4(9): 1245-51.

NADPH-dependent aldehyde reductase. Polypeptides of the present invention having this activity can be characterized for example as described in Colrat et al., Plant Physiol. (1999), 119(2): 621-6.

NADPH-dependent methylglyoxal reductase GRE2. Polypeptides of the present invention having this activity can be characterized for example as described in Murata et al., Eur. J. Biochem. (1985), 151(3): 631-636; Johnston et al., Yeast. (2003), 20(6): 545-554.

Non-hemolytic phospholipase C. Polypeptides of the present invention having this activity can be characterized for example as described in Weingart and Hooke, Curr Microbiol. (1999), 38(4): 233-8; Korbsrisate et al., J Clin Microbiol. (1999), 37(11): 3742-5.

O-methylsterigmatocystin oxidoreductase. Polypeptides of the present invention having this activity can be characterized for example as described in Gengan et al., Prep Biochem Biotechnol. (2006), 36(4): 297-306.

Oxidase. Polypeptides of the present invention having this activity can be characterized for example using a number of commercially available kits [e.g., Amplex® Red Galactose/Galactose Oxidase Kit (A22179) and Amplex® Red Glucose/Glucose Oxidase Assay Kit (Molecular Probes/Invitrogen); Cytochrome C Oxidase Assay Kit (Cat. No. CYTOCOX1-1KT; Sigma-Aldrich); Xanthine Oxidase Assay Kit (ab102522, Abcam); Lysyl Oxidase Activity Assay Kit (ab112139, Abcam); Glucose Oxidase Assay Kit (ab138884, Abcam); Monoamine oxidase B (MAOB) Specific Activity Assay Kit (ab109912, Abcam)].

Oxidoreductase. Polypeptides of the present invention having this activity can be characterized for example as described in Hommes et al., Anal Chem. (2013), 85(1): 283-291.

Oxygen-dependent choline dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Gadda et al., Appl Environ Microbiol. (2003), 69(4): 2126-32.

Para-nitrobenzyl esterase. Polypeptides of the present invention having this activity can be characterized for example as described in Moore and Arnold, Nat Biotechnol. (1996), 14(4): 458-67.

Pectate lyase. Polypeptides of the present invention having this activity can be characterized for example as described in Wang et al., BMC Biotechnology (2011), 11: 32.

Pectin lyase. Polypeptides of the present invention having this activity can be characterized for example as described in Yadav et al., J Basic Microbiol. (2014), 54 Suppl 1: S161-9; Perez-Fuentes et al., Fungal Biol. (2014), 118(5-6): 507-15.

Pectin methylesterase. Polypeptides of the present invention having this activity can be characterized for example as described in PCT application publication No. WO 1997/031102 A1.

Pectinesterase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 5,053,232.

Penicillopepsin. Polypeptides of the present invention having this activity can be characterized for example as described in Cao et al., Protein Sci. (2000), 9(5): 991-1001; or Hofmann et al., Biochemistry. (1984), 14;23(4): 635-43.

Peptidase. Polypeptides of the present invention having this activity can be characterized for example using a commercially available kit such as the EnzChek® Peptidase/Protease Assay Kit (Life Technologies, Cat. No. E33758).

Peroxidase. Polypeptides of the present invention having this activity can be characterized for example using a number of commercially available kits [e.g., Amplex® Red Hydrogen Peroxide/Peroxidase Assay Kit (Molecular Probes/Invitrogen); Peroxidase Activity Assay Kit (Cat. No. K772-100; BioVision); QuantiChrom™ Peroxidase Assay Kit (Cat. No. DPOD-100, BioAssay Systems)].

Peroxisomal hydratase-dehydrogenase-epimerase. Polypeptides of the present invention having this activity can be characterized for example as described in Nuttley et al., Gene. (1988), 69(2):171-80.

Phenol 2-monooxygenase (phenol hydroxylase). Polypeptides of the present invention having this activity can be characterized for example as described in Nakagawa and Takeda, Biochim. Biophys. Acta. (1962), 62(2): 423-6; Neujahr and Gaal, Eur. J. Biochem. (1973), 35(2): 386-400; Neujahr and Gaal, Eur. J. Biochem. (1975), 58(2): 351-7; Kirchner et al., J Biol Chem. (2003), 278(48): 47545-53.

Phospholipase C. Polypeptides of the present invention having this activity can be characterized for example using commercially available kits (Amplex® Red Phosphatidylcholine-Specific Phospholipase C Assay Kit, Molecular Probes/Invitrogen).

Podosporapepsin. Polypeptides of the present invention having this activity can be characterized for example as described in Paoletti et al., Gene (1998), 210(1): 45-52; Ortega et al., J Basic Microbiol. (2014), 54 Suppl 1: S170-7; or Chavez-Sanchez et al., J Food Sci Technol. (2013), 50(1):101-7.

Polygalacturonase. Polypeptides of the present invention having this activity can be characterized for example as described in Ortiz et al., Anal Biochem. (2014), 454: 33-5; or Chen et al., Int J Mol Sci. (2014), 15(4): 5717-29.

Polyphenol oxidase 1. Polypeptides of the present invention having this activity can be characterized for example as described in Tao et al., J Agric Food Chem. (2013), 61(51):12662-9; Dawson and Magee, Methods in Enzymology II (1955), 817-821; Marumo and Waite, Biochim. Biophys. Acta (1986), 872: 98-103.

Polysaccharide deacetylase. Polypeptides of the present invention having this activity can be characterized for example as described in Kobayashi et al., J. Biol. Chem. (2012) 287(13): 9765-76.

Polysaccharide lyase. Polypeptides of the present invention having this activity can be characterized for example as described in Macdonald and Berger, "A polysaccharide lyase from Stenotrophomonas maltophilia with unique, pH-regulated substrate specificity.", J. Biol Chem. (2013); Cordula et al., "On the catalytic mechanism of polysaccharide lyases: evidence of His and Tyr involvement in heparin lysis by heparinase I and the role of Ca2+", Mol Biosyst. (2013); or in PCT application publication No. WO 2013007706 A1.

Polysaccharide monooxygenase. Polypeptides of the present invention having this activity can be characterized for example as described in Kittl et al., Biotechnol Biofuels. (2012), 5(1):79; Phillips et al., ACS Chem Biol (2011), 6(12): 1399-1406; Wu et al., J. Biol. Chem (2013), 288(18): 12828-39. Polysaccharide monooxygenases, sometimes referred to functionally as "cellulase-enhancing proteins", generally belong to the enzyme class GH61 and are reported to cleave polysaccharides with the insertion of oxygen.

Protease or peptidase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2005/0010037 A1; or using the EnzChek® Peptidase/Protease Assay Kit (Cat. No. E33758, Life Technologies).

Putative exoglucanase type C (1,4-beta-cellobiohydrolase; beta-glucancellobiohydrolase; exocellobiohydrolase I). Polypeptides of the present invention having this activity can be characterized for example as described in Dai et al., Applied Biochemistry and Biotechnology (1999), 79, Issue 1-3: 689-699.

Pyranose dehydrogenase. Polypeptides of the present invention having this activity can be characterized for example as described in Staudigl et al., Biomolecules (2013), 3: 535-552; Tan et al., PLoS One. (2013), 8(1):e53567; Kujawa et al., FEBS J. (2007), 274(3): 879-94.

Rhamnogalacturonan acetylesterase. Polypeptides of the present invention having this activity can be characterized for example as described in Molgaard et al., Structure (2000), 8(4):373-83; or Kauppinen et al., J Biol Chem. (1995), 270(45):27172-8.

Rhamnogalacturonan endolyase. Polypeptides of the present invention having this activity can be characterized for example as described in Azadi et al., Glycobiology. (1995), 5(8): 783-9.

Rhamnogalacturonan lyase. Polypeptides of the present invention having this activity can be characterized for example as described in Mutter et al., Plant Physiol. (1998), 117: 153-163; or de Vries, Appl. Microbiol Biotechnol. (2003), 61: 10-20.

Rhamnogalacturonase. Polypeptides of the present invention having this activity can be characterized for example as described in Paynel et al., Plant Physiol Biochem. (2013), 62: 54-62; Normand et al., Appl Microbiol Biotechnol. (2012), 94(6): 1543-52.

Rhamno-galacturonate lyase. Polypeptides of the present invention having this activity can be characterized for example as described in Jensen et al., J Mol Biol. (2010), 404(1):100-11.

Rhizopuspepsin-3. Polypeptides of the present invention having this activity can be characterized for example as described in Chen et al., Agric Food Chem. (2009), 57(15):6742-7; Flentke et al., Protein Expr Purif. (1999), 16(2):213-20.

Rodlet protein. Polypeptides of the present invention having this activity can be characterized for example as described in Yang et al., Biopolymers (2013), 99(1): 84-94.

Saccharopine dehydrogenase [NADP(+), L-glutamate-forming]. Polypeptides of the present invention having this activity can be characterized for example as described in Kumar et al., Arch Biochem Biophys. (2012), 522(1):57-61 ; Ekanayake et al., Arch Biochem Biophys. (2011), 514(1-2):8-15.

Serine-type carboxypeptidase F. Polypeptides of the present invention having this activity can be characterized for example as described in US patent No. 6,379,913.

Short-chain dehydrogenase reductase 2a. Polypeptides of the present invention having this activity can be characterized for example as described in Bijtenhoorn et al., PLoS One. (2011), 6(10):e26278; Polizzi et al., Chem Commun (Camb). (2007), 18:1843-5.

Sterol-4-alpha-carboxylate 3-dehydrogenase, decarboxylating. Polypeptides of the present invention having this activity can be characterized for example as described in Bautz et al., G3 (Bethesda). (2013), 3(10): 1819-25.

Subtilisin-like protease. Polypeptides of the present invention having this activity can be characterized for example as described in Dang et al., J Invertebr Pathol. (2013), 112(2):166-74; Acevedo et al., J Appl Microbiol. (2013), 114(2):352-63.

Swollenin. Polypeptides of the present invention having this activity can be characterized for example as described in Jager et al., Biotechnol Biofuels. (2011), 4: 33; or Saloheimo et al., Eur J Biochem. (2002), 269(17): 4202-11.

Trihydrophobin. Polypeptides of the present invention having this activity can be characterized for example as described in Cheng et al., J. Biol. Chem. (2009), 284(13):8786-96.

Tripeptidyl-peptidase sed1. Polypeptides of the present invention having this activity can be characterized for example as described in Du et al., Biol Chem. (2001), 382(12):1715-25; Hilbi et al., Biochim Biophys Acta. (2002), 1601(2): 149-54; Renn et al., J Biol Chem. (1998), 273(30): 19173-82.

Tyrosinase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2011/0311693 A1; Duckworth and Coleman, J. Biol. Chem. (1970), 245: 1613-1625; Park et al., J Protein Chem. (2003), 22(5): 473-80.

Unsaturated rhamnogalacturonyl hydrolase. Polypeptides of the present invention having this activity can be characterized for example as described in Itoh et al., Biochem Biophys Res Commun. (2006), 347(4): 1021-9; or Itoh et al., J Mol Biol. (2006), 360(3): 573-85.

Versatile peroxidase. Polypeptides of the present invention having this activity can be characterized for example as described in Lankinen et al., Appl Microbiol Biotechnol. (2005), 66(4): 401-7; Banci et al., J Biol Inorg Chem. (2003), 8(7): 751-60; Perez-Boada et al., J Mol Biol. (2005), 354(2): 385-402.

Versicolorin B synthase. Polypeptides of the present invention having this activity can be characterized for example as described in Silva and Townsend, J Biol Chem. (1997), 272(2): 804-13; McGuire et al., Biochemistry (1996), 35(35): 11470-86.

Xylan alpha-1,2-glucuronidase. Polypeptides of the present invention having this activity can be characterized for example as described in Ishihara, M. and Shimizu, K., "alpha-(1->2)-Glucuronidase in the enzymatic saccharification of hardwood xylan: Screening of alpha-glucuronidase producing fungi." Journal Mokuzai Gakkaishi, (1988) 34: 58-64.

Xylanase. Polypeptides of the present invention having this activity can be characterized for example as described in US patent application publication No. US 2012/0028306 A1 ; US patent No. 7,759,102; PCT application publication No. WO 2006/078256 A2; Chen et al., Agric. Biol. Chem. (1986), 50: 1183-1194; Lever, M., Analytical Biochemistry (1972), 47: 273-279.

Xylogalacturonase. Polypeptides of the present invention having this activity can be characterized for example as described in van der Vlugt-Bergmans et al., Applied and Environmental Microbiology (2000), 66(1): 36-41.

Xyloglucanase. Polypeptides of the present invention having this activity can be characterized for example as described in Master et al., Biochem. (2008), 411(1): 161-170; Ariza et al., J Biol Chem. (2011), 286(39): 33890-900; Qi et al., Biochemistry (Mosc). (2013), 78(4): 424-30; US patent No. 6,815,192.

Xyloglucan-specific endo-beta-1,4-glucanase A. Polypeptides of the present invention having this activity can be characterized for example as described in European patent application publication No. EP0972016 B1; in US patent No. 6,077,702; Damasio et al., Biochim Biophys Acta. (2012), 1824(3): 461-7; or Wong et al., Appl Microbiol Biotechnol. (2010), 86(5): 1463-71.

Xylosidase/arabinosidase. Polypeptides of the present invention having this activity can be characterized for example as described in Whitehead and Cotta, Curr Microbiol. (2001), 43(4): 293-8; or Xiong et al., Journal of Experimental Botany (2007), 58(11): 2799-2810.

Example 6: General Molecular Biology Procedures

[00247] Standard molecular cloning techniques such as DNA isolation, gel electrophoresis, enzymatic restriction modifications of nucleic acids, E. coli transformation, etc., were performed as described by Sambrook et al., 1989 (Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York) and Innis et al., 1990 (PCR Protocols: A Guide to Methods and Applications, Academic Press, San Diego, edited by Michael A. Innis et al.). Primers were prepared by IDT (Integrated DNA Technologies). Sanger DNA sequencing was performed using an Applied Biosystems 3730xl DNA Analyzer at the Innovation Centre (Genome Quebec), McGill University in Montreal.

Example 7: Construction of pGBFIN49 expression plasmids

[00248] Genes of interest were cloned into the expression vector pGBFIN-49. This vector is a derivative of pGBFIN-41 that contains the A. niger glaA promoter, A. niger TrpC terminator, A. nidulans gpdA promoter, the phleomycin resistance gene, A. niger glaA terminator and an E. coli backbone. Figure 1 represents a schematic map of pGBFIN-49 and the complete nucleotide sequence is presented as SEQ ID NO: 2148. Details of the construction of pGBFIN-49 are as follows:

1. TtrpC terminator PCR amplification (0.7 kb):

[00249] TtrpC terminator was PCR amplified using purified pGBFIN33 plasmid as a template. The following primers and PCR program were used:

Primer-3: 5'-GTCCGTCGCCGTCCTTCAccgccggtccgacg-3' (SEQ ID NO: 2138)

Primer-4: 5'-GCGGCCGGCGTATTGGGTGttacggagc-3' (SEQ ID NO: 2139)

[00250] Primer-4 is entirely specific to the TtrpC 3' end. Primer-3 was designed to suit the LIC cloning strategy while keeping the TtrpC sequence as close as possible to the original sequence. To do so, five adenines were replaced by thymines (underlined).

PCR master mix:

pGBFIN33               1 μL (5-10 ng)
Primer-3 (10 mM)       1 μL
Primer-4 (10 mM)       1 μL
dNTPs (2 mM)           5 μL
HF Buffer (5x)         10 μL
Phusion DNA pol.       0.5 μL
Nuclease-free water    31.5 μL
Total                  50 μL
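When several reactions are prepared in parallel, the per-reaction volumes listed above are usually pooled into one master mix. The following is a minimal sketch of that scaling; the helper name `scale_master_mix` and the 10% pipetting overage are illustrative assumptions and are not part of the described procedure.

```python
# Per-reaction volumes (microliters) taken from the TtrpC master mix above.
TTRPC_MIX_UL = {
    "pGBFIN33 template": 1.0,
    "Primer-3": 1.0,
    "Primer-4": 1.0,
    "dNTPs (2 mM)": 5.0,
    "HF Buffer (5x)": 10.0,
    "Phusion DNA pol.": 0.5,
    "Nuclease-free water": 31.5,
}


def scale_master_mix(mix_ul, n_reactions, overage=0.10):
    """Scale per-reaction volumes to n reactions.

    The fractional overage is an assumed allowance for pipetting loss,
    not a value given in the source.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in mix_ul.items()}


# Sanity check: the components should add up to the stated 50 uL total.
total_per_reaction = sum(TTRPC_MIX_UL.values())

# Volumes for eight reactions with the assumed 10% overage.
scaled = scale_master_mix(TTRPC_MIX_UL, n_reactions=8)
```

The same function can be applied to the other master mixes by swapping in their component volumes.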

[00251] PCR program: 1 x 98°C, 2 min; 25 x (98°C, 30 sec; 68°C, 30 sec; 72°C, 1 min); 72°C, 7 min.

[00252] Reaction conditions: 5 μL of the PCR reaction was separated by electrophoresis on a 1.0% agarose gel and the remainder was purified using the QIAEX II™ Gel Extraction kit (QIAGEN) and resuspended in nuclease-free water.

2. pGBFIN41 vector PCR amplification (8.3kb):

[00253] Vector backbone was PCR amplified using pGBFIN41 as a template. Primers were designed outside of the ccdA region (not included in pGBFIN49). The following primers and PCR program were used:

Primer-2: 5'-CACCCAATACGCCGGCCGCgcttccagacagctc-3' (SEQ ID NO: 2140)

Primer-1C: 5'-GGTGTTTTGTTGCTGGGGAtgaagctcaggctctcagttgcgtc-3' (SEQ ID NO: 2141)

[00254] Primer-2 contains a PgpdA-specific region and an extra sequence specific to the TtrpC 3' end (also included in Primer-4). Primer-1C was designed to suit the LIC cloning strategy while keeping the PglaA region as close as possible to the original sequence. To do so, three thymines were replaced by adenines (underlined).

PCR master mix:

pGBFIN41               1 μL (50 ng)
Primer-2 (10 mM)       1 μL
Primer-1C (10 mM)      1 μL
dNTPs (2 mM)           5 μL
HF Buffer (5x)         10 μL
Phusion DNA pol.       0.5 μL
DMSO                   1 μL
Nuclease-free water    30.5 μL
Total                  50 μL

[00255] PCR program: 1 x 98°C, 3 min; 10x (98°C, 30 sec ; 68°C, 30 sec, 72°C, 5 min); 20 x (98°C, 30 sec, 68°C, 30 sec, 72°C, 5 min + 10 sec/cycle); 72°C, 10 min.
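To plan instrument time, the programmed hold times of this long range program can be summed. The sketch below assumes that the "+ 10 sec/cycle" increment means cycle k of the second block extends for 5 min plus k x 10 sec (k = 1 to 20); the source does not state where the increment starts, and thermocycler ramp times are ignored.

```python
def long_range_program_seconds(increment_s=10, first_block=10, second_block=20):
    """Sum the programmed hold times (seconds) of the long range PCR program."""
    initial = 3 * 60                 # 98 C, 3 min
    base_cycle = 30 + 30 + 5 * 60    # 98 C 30 s; 68 C 30 s; 72 C 5 min
    block1 = first_block * base_cycle
    # Assumption: the k-th cycle of the second block adds k * increment_s
    # to the 5 min extension step.
    block2 = sum(base_cycle + k * increment_s for k in range(1, second_block + 1))
    final = 10 * 60                  # 72 C, 10 min
    return initial + block1 + block2 + final


total_s = long_range_program_seconds()
total_min = total_s / 60
```

Under these assumptions the hold times alone come to just under four hours, so actual run time will be somewhat longer.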

[00256] Reaction conditions: 5 μL of the PCR reaction was separated on a 0.5% agarose gel and the remainder was purified using the QIAEX II™ Gel Extraction kit (QIAGEN) and resuspended in nuclease-free water.

3. pGBFIN41 + TtrpC overlap-extension PCR:

[00257] Overlap-extension / long range PCR was performed to: a) fuse the two PCR pieces together; b) add an SfoI restriction site to re-circularize the vector. No primers were used in the overlap-extension stage. Primer-11 and Primer-12 were used for the long range PCR reaction.

Primer-11: 5'-CACCGGCGCCGTCCGTCGCCGTCCTTC-3' (SEQ ID NO: 2142)

Primer-12: 5'-ACGGCGCCGGTGTTTTGTTGCTGGGGATG-3' (SEQ ID NO: 2143)

[00258] Primer-11 is specific to the LIC tag located on the TtrpC terminator, while Primer-12 is specific to the LIC tag located on the PglaA region. The SfoI restriction site sequence is underlined above.
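Because the underlining is not preserved in plain text, the position of the SfoI site in each primer can be located programmatically. This sketch assumes the standard SfoI recognition sequence GGCGCC; the function name `site_position` is illustrative only.

```python
SFOI_SITE = "GGCGCC"  # SfoI recognition sequence (blunt cut between GGC and GCC)

PRIMER_11 = "CACCGGCGCCGTCCGTCGCCGTCCTTC"      # SEQ ID NO: 2142
PRIMER_12 = "ACGGCGCCGGTGTTTTGTTGCTGGGGATG"    # SEQ ID NO: 2143

# LIC tag portions of the earlier primers, for comparison.
PRIMER_3_TAG = "GTCCGTCGCCGTCCTTC"     # first 17 nt of Primer-3
PRIMER_1C_TAG = "GGTGTTTTGTTGCTGGGGA"  # uppercase 5' portion of Primer-1C


def site_position(primer, site=SFOI_SITE):
    """Return the 0-based index of the first occurrence of site, or -1 if absent."""
    return primer.find(site)


pos_11 = site_position(PRIMER_11)
pos_12 = site_position(PRIMER_12)

# The sequence downstream of the site in each primer is the LIC tag it anneals to.
tag_11 = PRIMER_11[pos_11 + len(SFOI_SITE):]
tag_12 = PRIMER_12[pos_12 + len(SFOI_SITE):]
```

In both primers the site sits immediately upstream of the LIC tag, so SfoI digestion followed by self-ligation joins the two tags around a single reconstituted site.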

[00259] A standard PCR master mix was prepared to perform overlap-extension PCR using pGBFIN41 and TtrpC purified PCR products as templates. No primers were added.

Overlap-extension master mix:

TtrpC                  1 μL
pGBFIN41               9 μL
Buffer GC (5x)         10 μL
dNTPs (2 mM)           5 μL
Phusion DNA pol.       0.5 μL
Nuclease-free water    24.5 μL
Total                  50 μL

[00260] PCR program - overlap (no primers): 1x 98°C, 2 min; 5x (98°C, 15 sec; 58°C, 30 sec; 72°C, 5 min), 5x (98°C, 15 sec; 63°C, 30 sec; 72°C, 5 min), 5x (98°C, 15 sec; 68°C, 30 sec; 72°C, 5 min); 72°C, 10 min.

[00261] The overlap-extension PCR product was then purified on a QIAEX II™ column and 5 μL of the purified reaction was used as template DNA for the long range PCR step with Primer-11 and Primer-12.

PCR master mix:

Overlap product        5 μL
Primer-11 (10 mM)      1 μL
Primer-12 (10 mM)      1 μL
dNTPs (2 mM)           5 μL
HF Buffer (5x)         10 μL
Phusion DNA pol.       0.5 μL
DMSO                   1 μL
Nuclease-free water    26.5 μL
Total                  50 μL

[00262] PCR program - Long range: 1x 98°C, 3 min; 10x (98°C, 30 sec ; 68°C, 30 sec ; 72°C, 5 min); 20 x (98°C, 30 sec ; 68°C, 30 sec ; 72°C, 5 min + 10 sec/cycle); 72°C, 10 min.

[00263] Reaction conditions: 5 μL of the PCR reaction was separated on a 0.5% agarose gel and the remainder was purified using the QIAEX II™ Gel Extraction kit and resuspended in nuclease-free water. Then, SfoI digestion was performed and the digested product was purified using the QIAEX II™ Gel Extraction kit following the procedure described by the manufacturer.

4. Ligation:

[00264] 100 ng of the purified digested fragment was ligated to itself using 1 μL of T4 DNA Ligase (New England Biolabs, M0202), and incubated at 16°C overnight. Enzyme inactivation was performed at 65°C for 10 minutes. Then, 10 μL of ligation product was transformed into DH5α E. coli competent cells and plated on 2xYT agar containing 100 μg/mL ampicillin. DNA extraction was performed on single colonies the next day. Restriction analysis and sequencing were done to confirm the structure.

Example 8: Cloning of Chaetomium thermophilum, Thermomyces stellatus, and Corynascus sepedonium genes in E. coli

[00265] Cloning of genes of interest into the pGBFIN-49 expression vector was performed using the ligation-independent cloning (LIC) method according to Aslanidis, C., de Jong, P. (1990) Nucleic Acids Research Vol. 18 No. 20, 6069-6074.

[00266] Coding sequences from genes of interest were amplified by PCR using primers containing LIC tags, which are homologous to Pgla and TrpC sequences in the pGBFIN-49 cloning vector fused to sequences homologous to the coding sequences of the gene of interest, and either genomic DNA or cDNA as template. Primers have the following sequences:

Forward primer: 5'-CCCCAGCAACAAAACACCTCAGCAATG...15-20 nucleotides specific to each gene to be cloned (portion of primer for which sequence is shown is set forth in SEQ ID NO: 2144)

Reverse primer: 5'-GAAGGACGGCGACGGACTTCA...15-20 nucleotides specific to each gene to be cloned (portion of primer for which sequence is shown is set forth in SEQ ID NO: 2145)
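The primer layout described above (a fixed LIC tag followed by 15-20 gene-specific nucleotides) can be sketched as a small helper. Several points here are assumptions rather than statements from the source: the forward tag is written as the reverse complement of the 5' end of the pGBFIN-49 reverse primer followed by TCAGCAATG, the gene-specific part of the forward primer is taken to start right after the ATG already present in the tag, and the gene-specific part of the reverse primer is taken to end right before the stop codon, which the TCA in the tag replaces. The function name and the example coding sequence are hypothetical.

```python
FORWARD_LIC_TAG = "CCCCAGCAACAAAACACCTCAGCAATG"  # ends with the ATG start codon
REVERSE_LIC_TAG = "GAAGGACGGCGACGGACTTCA"        # ends with TCA (complement of a TGA stop)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_lic_primers(cds, specific_len=18):
    """Build forward and reverse LIC primers for a coding sequence.

    Assumes the CDS starts with ATG and ends with a stop codon, and that
    the tags supply the start codon and a replacement stop codon.
    """
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    if cds[-3:] not in _STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    body = cds[3:-3]
    if len(body) < specific_len:
        raise ValueError("coding sequence too short for the requested primer length")
    forward = FORWARD_LIC_TAG + body[:specific_len]
    reverse = REVERSE_LIC_TAG + reverse_complement(body[-specific_len:])
    return forward, reverse


# Hypothetical coding sequence, for illustration only.
example_cds = "ATG" + "GCTAAGCTTCGTACCGGTGCAGATCTGGAATTCCTG" + "TAA"
fwd, rev = design_lic_primers(example_cds)
```

In practice the gene-specific length would be tuned per gene (15-20 nt) to reach a suitable annealing temperature; that step is not modelled here.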

PCR mix consists of following components:

Template (gDNA or cDNA) 1-10 ng/μL       1 μL
5X Phusion HF Buffer (Finnzymes™)        10 μL
2 mM dNTPs                               5 μL
LIC primer (F+R) mix 10 mM               0.5 μL
Phusion DNA Polymerase (Finnzymes™)      0.5 μL
DMSO                                     1.5 μL
H2O                                      31.5 μL
TOTAL                                    50 μL

[00267] PCR amplification was carried out with following conditions:

[00268] Following PCR, 90 μL milliQ™ water was added to each sample and the mix was purified using a MultiScreen PCRμ96 Filter Plate (Millipore) according to the manufacturer's instructions. The PCR product was eluted from the filter in 25 μL 10 mM Tris-HCl pH 8.0.

[00269] Expression vector pGBFIN-49 was PCR amplified using primers with following sequences:

Forward primer: 5'-GTCCGTCGCCGTCCTTCACCG-3' (SEQ ID NO: 2146)

Reverse primer: 5'-GGTGTTTTGTTGCTGGGGATGAAGC-3' (SEQ ID NO: 2147)

(Primers are located on either side of the SfoI restriction site.)

PCR mix consists of the following components:

pGBFIN-49 plasmid DNA (10 ng/μL)         2 μL
5X Phusion HF Buffer (Finnzymes™)        20 μL
2 mM dNTPs                               10 μL
LIC Primer mix (F+R) 10 mM               2 μL
Phusion DNA Polymerase (Finnzymes™)      1.5 μL
TOTAL                                    100 μL

[00270] PCR amplification was carried out with following conditions:

[00271] Following PCR, 1 μL of DpnI was added to the PCR mix and digestion was performed overnight at 37°C. Digested PCR product was purified using the Qiaquick™ PCR purification kit (Qiagen) according to the manufacturer's instructions.

[00272] Obtained PCR fragments were treated with T4 DNA polymerase in the presence of dTTP to create single stranded tails at the ends of the PCR fragments. The single stranded tails of the PCR fragment are complementary to those of the vector, thus permitting non-covalent bi-molecular associations, e.g., circularization between molecules.
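The tail-forming step can be illustrated with a simplified model: the 3'-to-5' exonuclease activity of T4 DNA polymerase trims each 3' end until it reaches the first base matching the single dNTP supplied, where removal and re-addition balance. The sketch below applies that model to one vector end (treated with dTTP) and the matching insert end (treated with dATP). It is an idealized illustration, not a description of reaction kinetics, and the function names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def chew_back(strand, dntp):
    """Trim a strand (written 5'->3') from its 3' end up to the first base equal to dntp."""
    strand = strand.upper()
    end = len(strand)
    while end > 0 and strand[end - 1] != dntp:
        end -= 1
    return strand[:end]


def five_prime_overhang(top_5prime_end, dntp):
    """Single stranded 5' tail exposed at a duplex end after chew-back.

    top_5prime_end is the 5' terminal sequence of one strand; the opposite
    strand's 3' end is its reverse complement and is the one being trimmed.
    """
    bottom = reverse_complement(top_5prime_end)
    kept = chew_back(bottom, dntp)
    return top_5prime_end.upper()[: len(bottom) - len(kept)]


VECTOR_END = "GTCCGTCGCCGTCCTTCACCG"  # pGBFIN-49 forward primer (SEQ ID NO: 2146)
INSERT_END = "GAAGGACGGCGACGGACTTCA"  # LIC tag of the gene reverse primer (SEQ ID NO: 2145)

vector_overhang = five_prime_overhang(VECTOR_END, "T")  # vector treated with dTTP
insert_overhang = five_prime_overhang(INSERT_END, "A")  # insert treated with dATP

tails_anneal = reverse_complement(insert_overhang) == vector_overhang
```

Under this model both ends expose 17-nucleotide tails that are exact reverse complements, which is what allows the vector and insert to anneal without ligase.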

[00273] The reaction mix of the T4 DNA polymerase treatment of the pGBFIN-49 PCR fragment consisted of the following components:

Purified pGBFIN-49 PCR fragment    600 ng
10X NEB Buffer 2                   2 μL
25 mM dTTP                         2 μL
DTT 100 μM                         0.8 μL
T4 DNA Polymerase 3 U/μL           1 μL
H2O                                Up to 20 μL
TOTAL                              20 μL

[00274] The reaction mix of T4 DNA polymerase treatment of the Gene of Interest (GOI) PCR fragment consisted of the following components:

Purified GOI PCR                   5 μL
10X NEB Buffer 2                   2 μL
25 mM dATP                         2 μL
DTT 100 μM                         0.8 μL
T4 DNA Polymerase 3 U/μL           1 μL
H2O                                9.2 μL
TOTAL                              20 μL

[00275] Reaction conditions were as follows:

[00276] Following T4 DNA polymerase treatment, 2 μL of pGBFIN-49 vector and 4 μL of the GOI were mixed and incubated at room temperature, allowing annealing of the GOI fragment with the pGBFIN-49 vector fragment. The bi-molecular forms are used to transform E. coli. Plasmid DNA of resulting transformants was isolated and verified by sequence analyses for correct amplification and cloning of the gene of interest.

Example 9: Transformation of Chaetomium thermophilum, Thermomyces stellatus, and Corynascus sepedonium gene expression cassettes into A. niger

[00277] As host strain for enzyme production, A. niger GBA307 was used. Construction of A. niger GBA307 is described in WO 2011/009700.

[00278] Transformation of A. niger was performed essentially according to the method described by Tilburn, J. et al. (1983) Gene 26, 205-221 and Kelly, J & Hynes, M. (1985) EMBO J., 4, 475-479 with the following modifications:

- Spores were grown for 16-24 hours at 30°C in a rotary shaker at 250 rpm in Aspergillus minimal medium. Aspergillus minimal medium contains per liter: 6 g NaNO3; 0.52 g KCl; 1.52 g KH2PO4; 1.12 ml 4 M KOH; 0.52 g MgSO4·7H2O; 10 g glucose; 1 g casamino acids; 22 mg ZnSO4·7H2O; 11 mg H3BO3; 5 mg FeSO4·7H2O; 1.7 mg CoCl2·6H2O; 1.6 mg CuSO4·5H2O; 5 mg MnCl2·2H2O; 1.5 mg Na2MoO4·2H2O; 50 mg EDTA; 2 mg riboflavin; 2 mg thiamine-HCl; 2 mg nicotinamide; 1 mg pyridoxine-HCl; 0.2 mg pantothenic acid; 4 μg biotin; 10 ml Penicillin (5000 IU/mL)/Streptomycin (5000 μg/mL) solution (Invitrogen);

- Glucanex 200G (Novozymes) was used for the preparation of protoplasts;

- After protoplast formation (2-3 hours) 10 mL TB layer (per liter: 109.32 g Sorbitol; 100 mL 1 M Tris-HCl pH 7.5) was pipetted gently on top of the protoplast suspension. After centrifugation for 10 min at 4330 x g at 4°C in a swinging bucket rotor, the protoplasts on the interface were transferred to a fresh tube and washed with STC buffer (1.2 M Sorbitol, 10 mM Tris-HCl pH 7.5, 50 mM CaCl2). The protoplast suspension was centrifuged for 10 min at 1560 x g in a swinging bucket rotor and resuspended in STC buffer at a concentration of 10^8 protoplasts/mL;

- To 200 μL of the protoplast suspension, 20 μL ATA (0.4 M Aurintricarboxylic acid), the DNA dissolved in 10 μL TE buffer (10 mM Tris-HCl pH 7.5, 0.1 mM EDTA), and 100 μL of a PEG solution (20% PEG 4000 (Merck), 0.8 M sorbitol, 10 mM Tris-HCl pH 7.5, 50 mM CaCl2) were added;

- After incubation of the DNA-protoplast suspension for 10 min at room temperature, 1.5 ml PEG solution (60% PEG 4000 (Merck), 10 mM Tris-HCl pH 7.5, 50 mM CaCl2) was added slowly, with repeated mixing of the tubes. After incubation for 20 min at room temperature, suspensions were diluted with 5 ml 1.2 M sorbitol, mixed by inversion and centrifuged for 10 min at 2770 x g at room temperature. The protoplasts were resuspended gently in 1 mL 1.2 M sorbitol and plated onto selective regeneration medium consisting of Aspergillus minimal medium without riboflavin, thiamine-HCl, nicotinamide, pyridoxine, pantothenic acid, biotin, casamino acids and glucose, supplemented with 150 μg/mL Phleomycin (Invitrogen), 0.07 M NaNO3, 1 M sucrose, solidified with 2% bacteriological agar #1 (Oxoid, England). After incubation for 5-10 days at 30°C, single transformants were isolated on PDA (Potato Dextrose Agar (Difco)) supplemented with 150 μg/mL Phleomycin in 96-well MTP. After 5-7 days of growth at 30°C, single transformants were used for MTP fermentation.

Example 10: Aspergillus niger microtiter plate fermentation

[00279] 96-well microtiter plates (MTP) with sporulated Aspergillus niger strains were used to harvest spores for MTP fermentations. To do this, 100 μL water was added to each well and after resuspending the mixture, 40 μL of spore suspension was used to inoculate 2 mL A. niger medium (70 g/L glucose·H2O, 10 g/L yeast extract, 10 g/L (NH4)2SO4, 2 g/L K2SO4, 2 g/L KH2PO4, 0.5 g/L MgSO4·7H2O, 0.5 g/L ZnSO4·7H2O, 0.2 g/L CaCl2, 0.01 g/L MnSO4·7H2O, 0.05 g/L FeSO4·7H2O, 0.002 g/L Na2MoO4·2H2O, 0.25 g/L Tween™-80, 10 g/L citric acid, 30 g/L MES; pH 5.5 adjusted with 4 M NaOH) in a 24-well MTP. In the MTP fermentations for strains expressing GH61 proteins (e.g., polysaccharide monooxygenases), 30 μM CuSO4 was included in the media. The MTPs were incubated in a humidity shaker (Infors) at 34°C at 550 rpm and 80% humidity for 6 days. Plates were centrifuged and supernatants were harvested.

Example 11: Aspergillus niger shake flask fermentation

[00280] Approximately 1x10^6 - 1x10^7 spores were inoculated in 20 mL pre-culture medium containing Maltose 30 g/L; Peptone (aus casein) 10 g/L; Yeast extract 5 g/L; KH2PO4 1 g/L; MgSO4·7H2O 0.5 g/L; ZnCl2 0.03 g/L; CaCl2 0.02 g/L; MnSO4·4H2O 0.01 g/L; FeSO4·7H2O 0.3 g/L; Tween™-80 3 g/L; pH 5.5. After growing overnight at 34°C in a rotary shaker, 10-15 mL of the growing culture was inoculated in 100 mL main culture containing Glucose·H2O 70 g/L; Peptone (aus casein) 25 g/L; Yeast extract 12.5 g/L; K2SO4 2 g/L; KH2PO4 1 g/L; MgSO4·7H2O 0.5 g/L; ZnCl2 0.03 g/L; CaCl2 0.02 g/L; MnSO4·1H2O 0.009 g/L; FeSO4·7H2O 0.003 g/L; pH 5.6. Note: for GH61 (e.g., polysaccharide monooxygenase) enzymes the culture media were supplemented with 10 μM CuSO4.

[00281] Main cultures were grown until all glucose was consumed as measured with Combur Test N strips (Roche), which was mostly the case after 4-7 days of growth. Culture supernatants were harvested by centrifugation for 10 minutes at 5000 x g followed by germ-free filtration of the supernatant over 0.2 μm PES filters (Nalgene).

Example 12: Shake flask concentration and protein concentration determination with TCA-biuret method

[00282] In order to obtain greater amounts of material for further testing, the fermentation supernatants obtained as described above (volume between 75 and 100 mL) were concentrated using a 10 kDa spin filter to a volume of approximately 5 mL. Subsequently, the protein concentration in the concentrated supernatant was determined via a TCA-biuret method.

[00283] Concentrated protein samples (supernatants) were diluted with water to a concentration between 2 and 8 mg/mL. Bovine serum albumin (BSA) dilutions (0, 1, 2, 5, 8 and 10 mg/mL) were made and included as samples to generate a calibration curve. Of each diluted protein sample, 270 μL was transferred into a 10-mL tube containing 830 μL of a 12% (w/v) trichloroacetic acid solution in acetone and mixed thoroughly. Subsequently, the tubes were incubated on ice water for one hour and centrifuged for 30 minutes at 4°C and 6000 rpm. The supernatant was discarded and pellets were dried by inverting the tubes on a tissue and letting them stand for 30 minutes at room temperature. Next, 3 mL BioQuant™ Biuret reagent mix was added to the pellet in the tube and the pellet was solubilized upon mixing, followed by addition of 1 mL water. The tube was mixed thoroughly and incubated at room temperature for 30 minutes. The absorption of the mixture was measured at 546 nm with a water sample used as a blank measurement and the protein concentration was calculated via the BSA calibration line.
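The final calculation step, reading a sample concentration off the BSA calibration line, can be sketched as an ordinary least-squares fit. The absorbance values below are hypothetical placeholders chosen to be perfectly linear, and the function names are illustrative; only the BSA concentrations come from the procedure above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_mg_per_ml(a546, slope, intercept, dilution_factor=1.0):
    """Convert an A546 reading to mg/mL, correcting for the pre-assay dilution."""
    return (a546 - intercept) / slope * dilution_factor


BSA_MG_PER_ML = [0, 1, 2, 5, 8, 10]               # standards named in the procedure
BSA_A546 = [0.00, 0.05, 0.10, 0.25, 0.40, 0.50]   # hypothetical readings

slope, intercept = fit_line(BSA_MG_PER_ML, BSA_A546)

# Hypothetical sample: A546 of 0.20 measured after a 5-fold dilution.
sample_conc = protein_mg_per_ml(0.20, slope, intercept, dilution_factor=5)
```

Samples whose diluted concentration falls outside the 0-10 mg/mL standard range should be re-diluted rather than extrapolated.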

Example 13: Protein Activity Assays

13.1 Determination of pH optima

[00284] pH optima are determined by first determining the range of enzyme concentration that reproducibly displays initial velocity kinetics at standard pH and temperature for the appropriate assay. Enzyme is then diluted to an amount within this range, divided into aliquots, and each aliquot is assayed simultaneously at the different pHs.

13.2 Protein activity-temperature profiles

[00285] Temperature optima are determined by first determining the range of enzyme concentration that reproducibly displays initial velocity kinetics at 40°C and at the enzyme's optimal pH (see Example 13.1) in the appropriate assay. Enzyme is then diluted to an amount within this range, divided into aliquots, and, where possible, each aliquot is assayed simultaneously at the different temperatures (e.g., when reaction is incubated in a dry bath heater, then transferred to a plate reader for endpoint measurement). Where simultaneous measurements at different temperatures are impossible (e.g., when reaction is incubated in a plate reader for continuous measurement) activities are measured in sequence at different temperatures.

13.3 Assay Procedure CU1: Colorimetric assay for glycosidase or esterase activity, measuring release of 4-nitrophenol

[00286] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in water, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 10 µL of diluted enzyme sample is added to 30 µL of 50 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 3.42 mL 85% phosphoric acid, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) in a PCR plate and preheated to appropriate temperature in a dry bath heater, and reaction is started by addition of 10 µL of preheated 5 mM substrate in water (see Table CU-1.1) to buffer and sample. Standards contain 10 µL of 4-nitrophenol (from 0 to 3 mM; 3 mM solution is made by dissolving 139 mg 4-nitrophenol in isopropyl alcohol and diluting 300 µL of resulting 100 mM solution to 10 mL in water) and 40 µL of reaction buffer. Sample blank contains 10 µL of enzyme sample and 40 µL of reaction buffer. Substrate blank contains 10 µL of substrate (see Table CU-1.1) and 40 µL of reaction buffer. After appropriate incubation time, 50 µL of one of the following is added: [1] for 4-nitrophenyl acetate, 1 M HEPES buffer pH 8 in water; [2] for all other substrates, 1 M Na2CO3 in water. 80 µL is then transferred to a clear microtiter flat-bottomed plate, absorbance is read at 410 nm and compared to the standard curve. One unit is defined as the amount of enzyme that releases one micromole of 4-nitrophenol per minute at the specified pH and temperature. (Adapted from Holmsen et al. (1989) Methods in Enzymology, 169, 336-342.)
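
The unit definition above can be turned into a small calculation, sketched here under stated assumptions. The 10 µL standard and enzyme volumes come from the text; the standard-curve slope, the net absorbance, the incubation time and the dilution factor are hypothetical, and the sketch takes an already blank-corrected absorbance because the text does not spell out how the two blanks are combined.

```python
def standard_nmol(conc_mm, volume_ul=10.0):
    """Amount of 4-nitrophenol in a standard well (1 mM x 1 uL = 1 nmol)."""
    return conc_mm * volume_ul


def units_per_ml(net_a410, slope_per_nmol, minutes, enzyme_ul=10.0, dilution=1.0):
    """Activity in U/mL of undiluted enzyme (1 U = 1 umol 4-nitrophenol per min).

    net_a410       -- blank-corrected absorbance of the reaction well
    slope_per_nmol -- standard-curve slope, absorbance per nmol 4-nitrophenol
    minutes        -- incubation time
    enzyme_ul      -- volume of diluted enzyme in the reaction (10 uL in the text)
    dilution       -- fold dilution of the enzyme before the assay
    """
    nmol_released = net_a410 / slope_per_nmol
    umol_per_min = nmol_released / 1000.0 / minutes
    return umol_per_min / (enzyme_ul / 1000.0) * dilution


# The top standard (3 mM, 10 uL) holds 30 nmol of 4-nitrophenol.
top_standard_nmol = standard_nmol(3.0)

# Hypothetical example: slope 0.02 A/nmol, net A410 of 0.30 after 10 min,
# enzyme diluted 100-fold before the assay.
activity = units_per_ml(0.30, 0.02, minutes=10.0, dilution=100.0)
```

The same arithmetic applies to the other endpoint assays in this Example once the standard and the measured quantity are swapped accordingly.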

Table CU-1.1

13.4 Assay Procedure CU2: Colorimetric assay for endo-glycanase activity, measuring copper (I) reduced by polysaccharide reducing ends

[00287] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in water, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 10 µL of diluted sample is added to 30 µL of either [1] 50 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 3.42 mL 85% phosphoric acid, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) or [2] for enzymes that utilize calcium, 50 mM acetate-MOPS-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 10.45 g MOPS, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) in a PCR plate and preheated to appropriate temperature in a dry bath heater. The reaction is started by addition of 10 µL of preheated substrate in water (see Table CU-2.1) to buffer and sample. Standards contain 10 µL of 0 to 7.5 mM monosaccharide solution (see Table CU-2.1) in water and 40 µL of reaction buffer. Enzyme sample blank contains 10 µL of sample and 40 µL of reaction buffer. Substrate blank contains 10 µL of substrate (see Table CU-2.1) and 40 µL of reaction buffer. After appropriate incubation time, 10 µL is removed and added to another PCR plate containing 95 µL of BCA Reagent A (made by dissolving 0.543 g Na2CO3, 0.242 g NaHCO3 and 19 mg disodium 2,2'-bicinchoninate in water and diluting to 1 L) and 95 µL of BCA Reagent B (made by dissolving 12 mg CuSO4 and 13 mg L-serine in water and diluting to 1 L), sealed and incubated in a dry bath heater for 25 minutes at 80°C. PCR plate is put on ice for 5 minutes, then 160 µL is transferred to a clear microtiter flat-bottomed plate, absorbance is read at 562 nm and compared to the standard curve. One unit is defined as the amount of enzyme that releases one micromole of monosaccharide-equivalent reducing ends per minute at the specified pH and temperature. (Adapted from Fox et al. (1991) Anal. Biochem., 195, 93-96). Colloidal chitin is prepared by mixing 10 g chitin from crab shell in 100 mL concentrated hydrochloric acid, stirring overnight at room temperature, then adding 1 L cold distilled water, filtering resulting suspension through Whatman No. 1 paper, washing retentate with distilled water until pH is greater than 4, determining dry weight by gravimetry and diluting to 1% solution with distilled water. (Adapted from Shimahara et al. (1988) Methods in Enzymology 161, 417-423).

Table CU-2.1

Enzyme | Substrate | Substrate Concentration | Standard
Mannanase | Beta-mannan | 1% | Mannose
Endo-arabinanase | Arabinan from sugar beet | 1% | Arabinose
Endoglucanase | Carboxymethyl cellulose (1:1 mixture of 4M and 7M) | 1% | Glucose
Endo-arabinanase | Carboxymethyl linear arabinan | 1% | Arabinose
Endochitinase | Colloidal Chitin | 1% | N-acetyl glucosamine
Laminarinase | Laminarin | 1% | Glucose
Lichenanase | Lichenan | 1% | Glucose
Mannanase | Locust Bean Gum | 0.4% | Mannose
Arabinoxylan arabinofuranohydrolase | Low Viscosity Wheat Arabinoxylan | 1% | Arabinose
Endopolygalacturonase | Polygalacturonic acid | 0.6% | Galacturonic acid
Endopolygalacturonase | Polygalacturonic acid, sodium salt | 0.6% | Galacturonic acid
Glucan endo-1,6-beta-glucosidase | Pustulan | 1% | Glucose
Endoxylanase | Xylan from Beechwood | 1% | Xylose
Endoxylanase | Xylan from Birchwood | 1% | Xylose
Xyloglucanase | Xyloglucan from Tamarind | 0.4% | Glucose

13.5 Assay procedure CU3: UV assay for acetylesterase activity, measuring release of alpha-naphthol

[00288] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in dH2O, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 20 µL of diluted sample is added to 20 µL of 300 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 17.28 mL 99.7% glacial acetic acid, 20.52 mL 85% phosphoric acid, and 18.6 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) in a clear microtiter plate and preheated to appropriate temperature in the plate reader. The reaction is started by addition of 160 µL 0.5 mM alpha-naphthyl acetate substrate solution in water (prepared by dissolving 46.55 mg of alpha-naphthyl acetate in 1 mL of acetone and then transferring to 499 mL of water), preheated to assay temperature in a dry block heater, to the buffer and enzyme sample. Standards contain 180 µL of 0 to 0.1 mM alpha-naphthol in water and 20 µL of reaction buffer. Blank contains 20 µL of reaction buffer, 20 µL of water and 160 µL of substrate solution. Absorbance is continuously monitored at 303 nm and compared to that of the standards. One unit is the amount of enzyme that produces one micromole of alpha-naphthol per minute under the specified conditions. (Adapted from Yuorno et al. (1981), Anal. Biochem. 115, 188-193).

13.6 Assay procedure CU6: UV assay of lyase activity, measuring formation of unsaturated bonds

[00289] Enzyme sample is diluted in 50 mM acetate-MOPS-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 10.45 g MOPS, 3.10 g boric acid and 1.11 g calcium chloride in water, adjusting pH with 10 M NaOH and diluting to 1 L) and left to equilibrate for 30 minutes at room temperature. Reaction buffer is mixed in a 1:1 ratio with substrate solution (1% polygalacturonic acid in water or 0.75% Rhamnogalacturonan I from potato in water) and preheated to reaction temperature in a dry bath heater (if reaction temperature is greater than plate reader maximum temperature) or in a microtiter plate in plate reader. Reaction is started by addition of 10 µL of diluted enzyme sample to 240 µL of reaction buffer/substrate in UV-transparent microtiter flat-bottomed plate. Blank contains 10 µL of reaction buffer added to 240 µL of reaction buffer/substrate solution. Absorbance at 235 nm is continuously monitored, and the molar absorptivity coefficient of unsaturated galacturonic acid is used to determine activity. One unit is the amount of enzyme that releases one micromole of unsaturated galacturonic acid equivalents per minute under the specified conditions. (Adapted from Hansen et al. (2001) J. AOAC International, 84, 1851-1854).
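
A continuous reading of this kind is usually converted to units with the Beer-Lambert law; the sketch below shows one way to do that. The 250 µL reaction volume and 10 µL enzyme volume follow from the text, but the text gives no value for the molar absorptivity or the optical path length, so both are explicit inputs here and the numbers used in the example are hypothetical placeholders, as are the absorbance readings.

```python
def slope_per_min(times_min, absorbances):
    """Least-squares slope of absorbance against time (absorbance units per min)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    den = sum((t - mean_t) ** 2 for t in times_min)
    return num / den


def lyase_units_per_ml(da_per_min, epsilon_m_cm, path_cm,
                       reaction_ul=250.0, enzyme_ul=10.0, dilution=1.0):
    """Activity in U/mL of undiluted enzyme (1 U = 1 umol unsaturated product per min).

    epsilon_m_cm -- molar absorptivity at 235 nm in M^-1 cm^-1 (must be supplied)
    path_cm      -- optical path length of the filled well in cm (must be supplied)
    """
    molar_per_min = da_per_min / (epsilon_m_cm * path_cm)       # mol/L per min
    umol_per_min = molar_per_min * (reaction_ul * 1e-6) * 1e6   # umol per min in the well
    return umol_per_min / (enzyme_ul / 1000.0) * dilution


# Hypothetical blank-corrected A235 trace over the linear phase.
times = [0.0, 1.0, 2.0, 3.0]
a235 = [0.100, 0.135, 0.170, 0.205]
rate = slope_per_min(times, a235)

# Placeholder absorptivity and path length, chosen only to make the arithmetic concrete.
activity = lyase_units_per_ml(rate, epsilon_m_cm=5000.0, path_cm=0.7)
```

Only the initial, linear part of the trace should feed the slope, in line with the initial-velocity requirement of Examples 13.1 and 13.2.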

13.7 Assay procedure CU7: Fluorescence assay, measuring release of 4-methylumbelliferone

[00290] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in water, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 10 µL of diluted sample is added to 30 µL of 50 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 3.42 mL 85% phosphoric acid, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L) in a PCR plate and preheated to appropriate temperature in a dry bath heater. The reaction is started by addition of 10 µL of preheated 1 mM substrate in water (made by dissolving 5.0 mg of 4-methylumbelliferyl cellobioside or 4-methylumbelliferyl lactoside in 10 mL water) to buffer and sample. Standards contain 10 µL of 4-methylumbelliferone (from 0 to 50 µM; 19.8 mg of 4-methylumbelliferone sodium salt is dissolved in 100 mL methanol and resulting solution is diluted 20X in water) and 40 µL of reaction buffer. Enzyme sample blank contains 10 µL of enzyme sample and 40 µL of reaction buffer. Substrate blank contains 10 µL of substrate and 40 µL of reaction buffer. After appropriate incubation time, 20 µL is removed and added to a black microtiter plate containing 180 µL of glycine/carbonate buffer, pH 10.7 (made by dissolving 10 g glycine and 8.8 g sodium carbonate in water, adjusting pH with 10 M NaOH and diluting to 1 L). The fluorescence of the wells is measured at 355 nm excitation, 460 nm emission and compared to the standard curve. One unit is defined as the amount of enzyme that releases one micromole of 4-methylumbelliferone per minute. (Adapted from van Tilbeurgh et al. (1988), Methods in Enzymology 160: 45-59.)

13.8 Assay procedure CU8: Spectrophotometric assay of acetylxylanesterase activity, measuring release of acetic acid

[00291] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in dH2O, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. 40 µL of 1% acetylated xylan from birchwood (prepared by Swern oxidation in a fume hood according to Johnson et al. (1988), Methods in Enzymology 160, 551-560, taking appropriate precautions against toxic products released in the air and into water during dialysis) are added to 40 µL of 50 mM phosphate reaction buffer (prepared by dissolving 3.42 mL of 85% phosphoric acid in water, adjusting pH to 6.0 with 10 M NaOH and diluting to 1 L) in the wells of a 96-well PCR plate and preheated to the appropriate temperature in a dry block heater. The reaction is started by adding 20 µL of diluted sample to the wells containing substrate and reaction buffer. Standards contain 20 µL of 0 mg/mL to 1 mg/mL acetic acid in water, and 80 µL reaction buffer. Sample blank contains 20 µL of diluted enzyme sample, 40 µL of reaction buffer and 40 µL of water. Substrate blank contains 20 µL of substrate, 40 µL of reaction buffer and 40 µL of water. After appropriate incubation time, the plate is heated to 90°C for 5 minutes and centrifuged 10 minutes at 1500 x g. The amount of acetic acid in the supernatant is then determined with the K-ACETAK™ kit by Megazyme; one unit is defined as the amount of enzyme required to release one micromole of acetic acid per minute under the specified conditions. (Adapted from Johnson et al. (1988), Methods in Enzymology 160, 551-560 and K-ACETAK assay kit procedure by Megazyme (Ireland)).

13.9 Assay procedure CU12: UV assay of alpha-glucuronidase activity, measuring NADH

[00292] Enzyme sample is diluted in 10 mM citrate buffer, pH 5.0, made by dissolving 1.92 g of citric acid in water, adjusting pH to 5.0 with 10 M NaOH and diluting to 1 L. Directions for the Megazyme microplate assay kit are followed, except the reaction buffer used is 50 mM acetate-phosphate-borate reaction buffer at appropriate pH (made by dissolving 2.88 mL 99.7% glacial acetic acid, 3.42 mL 85% phosphoric acid, and 3.10 g boric acid in water, adjusting pH with 10 M NaOH and diluting to 1 L). (Adapted from K-AGLUA™ assay kit procedure by Megazyme, Ireland).

13.10 Determination of (hemi-)cellulase activity

[00293] A. niger strains expressing Chaetomium thermophilum, Thermomyces stellatus, and Corynascus sepedonium clones were grown in shake flask, as described above, in order to obtain greater amounts of material for further testing. The fermentation supernatants (volume between 40 and 80 ml) were concentrated using a 10 kDa spin filter to a volume of approximately 5 ml. Subsequently, the protein concentration in the concentrated supernatant was determined via a TCA-biuret method, as described above.

[00294] The (hemi-)cellulase activity of these protein samples was tested in an assay where the supernatants were spiked on top of an enzyme base mix in the presence of 10% (w/w) acid pretreated corn stover (aCS). 'To spike' or 'spiking' of a supernatant or an enzyme indicates in this context the addition of a supernatant or an enzyme to a (hemi-)cellulase base mix. The feedstock solution was prepared via the dilution of a concentrated feedstock solution with water. Subsequently the pH was adjusted to pH 4.5 with a 4 M NaOH solution. The proteins were spiked based on dosage in a total volume of 20 ml at a feedstock concentration of 10% aCS (w/w) in a 40-ml centrifuge bottle (Nalgene Oakridge). All experiments were performed at least in duplicate and were incubated for 72 hours at 65°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described below.

13.11 Soluble sugar analysis by HPLC

[00295] The sugar content of the samples after enzymatic hydrolysis was analyzed using a High-Performance Liquid Chromatography System (Agilent 1100) equipped with a refractive index detector (Agilent 1260 Infinity). The separation of the sugars was achieved by using a 300 x 7.8 mm Aminex HPX-87P (Bio-Rad cat. no. 125-0098) column; pre-column: Micro-Guard Carbo-P (Bio-Rad cat. no. 125-0119); mobile phase was HPLC grade water; flow rate of 0.6 ml/min and a column temperature of 85°C. The injection volume was 10 µl.

[00296] The samples were diluted with HPLC grade water to a maximum of 10 g/l glucose and filtered using a 0.2 µm filter (Acrodisc LC 25 mm syringe filter, PVDF membrane). The glucose was identified and quantified according to the retention time, which was compared to the external glucose standard (D-(+)-Glucose, Sigma cat. no. G7528) at 0.2, 0.4, 1.0 and 2.0 g/l.

13.12 α-arabino(furano)sidase activity assay

[00297] This assay measures the ability of α-arabino(furano)sidases to remove the α-L-arabinofuranosyl residues from substituted xylose residues.

[00298] Single and double substituted oligosaccharides are prepared by incubating wheat arabinoxylan (WAX medium viscosity; 2 mg/mL; Megazyme, Bray, Ireland) in 50 mM acetate buffer pH 4.5 with an appropriate amount of endo-xylanase (Aspergillus awamori; F.J.M. Kormelink, Carbohydrate Research, 249 (1993) 355-367) for 48 hours at 50°C to produce a sufficient amount of arabinoxylo-oligosaccharides. The reaction is stopped by heating the samples at 100°C for 10 minutes. The samples are centrifuged for 5 minutes at 10,000 x g. The supernatant is used for further experiments. Degradation of the arabinoxylan is followed by High Performance Anion Exchange Chromatography (HPAEC).

[00299] The enzyme is added to the single and double substituted arabinoxylo-oligosaccharides (endo-xylanase treated WAX) in a dosage of 10 mg protein/g substrate in 50 mM sodium acetate buffer, which is then incubated at 65°C for 24 hours. The reaction is stopped by heating the samples at 100°C for 10 minutes. The samples are centrifuged for 5 minutes at 10,000 x g and diluted 10 times. Release of arabinose from the arabinoxylo-oligosaccharides is analyzed by HPAEC analysis.

[00300] The analysis is performed using a Dionex HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (2 mm ID x 50 mm) and a Dionex PAD detector (Dionex Co., Sunnyvale). A flow rate of 0.3 mL/min is used with the following gradient of sodium acetate in 0.1 M NaOH: 0-40 min, 0-400 mM. Each elution is followed by a washing step of 5 min 1000 mM sodium acetate in 0.1 M NaOH and an equilibration step of 15 min 0.1 M NaOH. Arabinose release is quantified by an arabinose standard (Sigma) and compared to a sample where no enzyme was added.
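
The elution program described above can be written out as a simple function of run time. This is a sketch only: it assumes the 0-400 mM gradient is linear (the text does not say so explicitly) and that the wash and equilibration steps follow immediately at 40 and 45 minutes; the function name is illustrative.

```python
GRADIENT_MIN = 40.0       # 0-40 min: 0-400 mM sodium acetate in 0.1 M NaOH
WASH_MIN = 5.0            # 5 min at 1000 mM sodium acetate in 0.1 M NaOH
EQUILIBRATION_MIN = 15.0  # 15 min at 0.1 M NaOH only
FLOW_ML_PER_MIN = 0.3

CYCLE_MIN = GRADIENT_MIN + WASH_MIN + EQUILIBRATION_MIN


def acetate_mm(t_min):
    """Sodium acetate concentration (mM) at time t_min within one run cycle."""
    if t_min < 0 or t_min > CYCLE_MIN:
        raise ValueError("time outside one run cycle")
    if t_min <= GRADIENT_MIN:
        # Assumed linear ramp from 0 to 400 mM.
        return 400.0 * t_min / GRADIENT_MIN
    if t_min <= GRADIENT_MIN + WASH_MIN:
        return 1000.0
    return 0.0


# Eluent consumed per injection at the stated flow rate.
eluent_ml_per_cycle = FLOW_ML_PER_MIN * CYCLE_MIN
```

Under these assumptions one injection occupies 60 minutes of instrument time and about 18 mL of eluent.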

Example 14: Identification of genes that encode secreted proteins

[00301] Genes (and polypeptides) from the organisms Chaetomium thermophilum, Thermomyces stellatus, and Corynascus sepedonium were identified that, based on curation (described above, see Example 4), encoded a secreted protein. A list of these genes and polypeptides is shown in Tables 1A-1C.

Example 15: Characterization of expressed proteins of Chaetomium thermophilum

[00302] The Chaetomium thermophilum proteins Chath2p7_002951, CHATH_1_01050, Chath2p7_002198, Chath2p7_013454, and Chath2p7_017015 were characterized using the assay protocols and assay conditions indicated in Table 5. pH and temperature optima were determined for each protein as described in Examples 13.1 and 13.2. Results are shown in Table 5.

Table 5: Activities of expressed enzymes from Chaetomium thermophilum

* Control is an equal volume of supernatant from a vector-only transformant

* U, micromole product formed per minute under the indicated assay conditions

Example 16: Determination of activity-temperature profiles of Chaetomium thermophilum proteins

[00303] Activity-temperature profiles were determined according to the protocol in Example 13.2 for the Chaetomium thermophilum proteins Chath2p7_002951, CHATH_1_01050, Chath2p7_002198, and Chath2p7_013454, using the Assay Protocols and Assay Conditions indicated below in Table 6. Results are shown in Figure 2.

Table 6: Activity-temperature profiles for various Chaetomium thermophilum proteins

Example 17: Improvement of a thermophilic cellulase mixture composed of three enzymes by a Chaetomium thermophilum protein in an activity assay.

[00304] The cellulase enhancing activity of Chaetomium thermophilum cellobiohydrolase II (CBHII) protein (Chath2p7_001264) was further analysed. The Chath2p7_001264 gene was cloned and expressed in A. niger. The supernatant of an A. niger expressing Chath2p7_001264 shake flask fermentation was concentrated and spiked in a dosage of 1.5 mg/gDM on top of a base activity of a three enzyme base mix (3.5 mg/gDM composed of: beta-glucosidase (BG) at 0.45 mg/gDM, cellobiohydrolase I (CBHI) at 1.25 mg/gDM and GH61 at 1.8 mg/gDM) at a feedstock concentration of 10% (w/w) aCS. As a negative control, the 3 enzyme base mix was also tested. The experiments were performed at least in duplicate and were incubated for 72 hours at 62°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described above.
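
The dosing arithmetic in this experiment can be checked with a few lines of code. The sketch reads the BG dose as 0.45 mg/gDM, the value consistent with the stated 3.5 mg/gDM total, and it borrows the 20 ml reaction volume from Example 13.10 while assuming a slurry density of 1 g/ml; that density and the helper name are assumptions made only for illustration.

```python
# Base mix doses in mg protein per g dry matter (gDM).
base_mix_mg_per_gdm = {"BG": 0.45, "CBHI": 1.25, "GH61": 1.8}
spike_mg_per_gdm = 1.5  # Chath2p7_001264 (CBHII)

base_total = sum(base_mix_mg_per_gdm.values())      # stated as 3.5 mg/gDM
total_protein_load = base_total + spike_mg_per_gdm  # base mix plus spike


def protein_mass_mg(dose_mg_per_gdm, slurry_g=20.0, dm_fraction=0.10):
    """Protein mass (mg) for one bottle.

    slurry_g    -- reaction mass; 20 ml is taken as 20 g (assumed density 1 g/ml)
    dm_fraction -- feedstock dry matter as a fraction of the slurry (10% w/w aCS)
    """
    return dose_mg_per_gdm * slurry_g * dm_fraction


spike_mass_mg = protein_mass_mg(spike_mg_per_gdm)
base_mass_mg = protein_mass_mg(base_total)
```

With these inputs the base mix and the spike together come to 5.0 mg/gDM.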

[00305] Addition of this Chaetomium thermophilum CBHII protein showed increased sugar release as shown below in Table 7.

Table 7: Effect of CBHII Chath2p7_001264 protein spiked on top of a 3E mix using aCS substrate

Protein ID | SEQ ID NOs: | glucose (g/l)
Chath2p7_001264 | 14, 192, 370 | 19.3
3 enzyme mix | - | 17.2

[00306] In a further experiment, the cellulase enhancing activity of Chaetomium thermophilum CBHI protein (Chath2p7_003618) was further analysed. The Chath2p7_003618 gene was cloned and expressed in A. niger. The supernatant of an A. niger expressing Chath2p7_003618 shake flask fermentation was concentrated and spiked in a dosage of 1.25 mg/gDM on top of a base activity of a three enzyme base mix (3.75 mg/gDM composed of: BG at 0.45 mg/gDM, CBHII at 1.5 mg/gDM and GH61 at 1.8 mg/gDM) at a feedstock concentration of 10% (w/w) aCS. As a negative control, the 3 enzyme base mix was also tested. The experiments were performed at least in duplicate and were incubated for 72 hours at 62°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described above.

[00307] Addition of this Chaetomium thermophilum CBHI protein showed increased sugar release as shown below in Table 8.

Table 8: Effect of CBHI Chath2p7_003618 protein spiked on top of a 3E mix using aCS substrate

[00308] In a further experiment, the cellulase enhancing activity of Chaetomium thermophilum BG protein (Chath2p7_009128) was further analysed. The Chath2p7_009128 gene was cloned and expressed in A. niger. The supernatant of an A. niger expressing Chath2p7_009128 shake flask fermentation was concentrated and spiked in a dosage of 0.45 mg/gDM on top of a base activity of a three enzyme base mix (4.55 mg/gDM composed of: CBHI at 1.25 mg/gDM, CBHII at 1.5 mg/gDM and GH61 at 1.8 mg/gDM) at a feedstock concentration of 10% (w/w) aCS. As a negative control, the 3 enzyme base mix was also tested. The experiments were performed at least in duplicate and were incubated for 72 hours at 62°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described above.

[00309] Addition of this Chaetomium thermophilum BG protein showed increased sugar release as shown below in Table 9.

Table 9: Effect of BG Chath2p7_009128 protein spiked on top of a 3E mix using aCS substrate

Example 18: Characterization of expressed proteins of Thermomyces stellatus

[00310] The Thermomyces stellatus proteins Thest2p7_010806, Thest2p7_021008, Thest2p7_019255, Thest2p7_007121, Thest2p7_008006, Thest2p7_019045, Thest2p7_016616, Thest2p7_018954, Thest2p7_001564, Thest2p7_004749, Thest2p7_001651, Thest2p7_003915, Thest2p7_012157, Thest2p7_013959, and Thest2p7_014416 were characterized using the assay protocols and assay conditions indicated in Table 10. pH and temperature optima were determined for each protein as described in Examples 13.1 and 13.2. Results are shown in Table 10.

Table 10: Activities of expressed enzymes from Thermomyces stellatus

* Control is an equal volume of supernatant from a vector-only transformant; na, not applicable as control exhibited no detectable activity.

* U, micromole product formed per minute under the indicated assay conditions

Example 19: Determination of activity-temperature profiles of Thermomyces stellatus proteins

[00311] Activity-temperature profiles were determined according to the protocol in Example 13.2 for the Thermomyces stellatus proteins Thest2p7_007121, Thest2p7_019045, Thest2p7_004749, Thest2p7_019255, Thest2p7_018954, Thest2p7_001651, Thest2p7_003915, Thest2p7_021008, Thest2p7_014416, Thest2p7_001564, Thest2p7_010806, Thest2p7_012157, Thest2p7_013959, and Thest2p7_008006, using the Assay Protocols and Assay Conditions indicated below in Table 11. Results are shown in Figures 3-6.

Table 11: Activity-temperature profiles for various Thermomyces stellatus proteins

Example 20: Improvement of a thermophilic (hemi)cellulase mixture composed of five enzymes by a Thermomyces stellatus protein in an activity assay

[00312] The (hemi)cellulase enhancing activity of certain Thermomyces stellatus endo-xylanase proteins (Thest2p7_012645 and Thest2p7_010806) was analysed. Both genes were cloned and expressed in A. niger. The supernatant of an A. niger expressing Thest2p7_012645 or Thest2p7_010806 shake flask fermentation was concentrated and spiked in a dosage of 0.063 mg/gDM on top of a base activity of a five enzyme mix (2.5 mg/gDM composed of: CBHI at 0.7 mg/gDM, CBHII at 0.6 mg/gDM, GH61 at 0.8 mg/gDM, BG at 0.2 mg/gDM and beta-xylosidase (BX) at 0.2 mg/gDM) at a feedstock concentration of 5.5% (w/w) washed aCS. As a negative control, the 5 enzyme base mix was also tested. The samples were incubated for 72 hours at 62°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPAEC as described above.

[00313] Addition of both Thermomyces stellatus endo-xylanase proteins showed increased glucose and xylose release as shown below in Table 12.

Table 12: Effect of endo-xylanase Thest2p7_012645 and Thest2p7_010806 protein spiked on top of a 5 enzyme mix using washed aCS substrate

Example 21: Characterization of expressed proteins of Corynascus sepedonium

[00314] The Corynascus sepedonium proteins Corse1p7_019166, Corse1p7_015795, Corse1p7_000860, Corse1p7_015066, Corse1p7_018897, Corse1p7_000092, Corse1p7_000417, Corse1p7_016342, Corse1p7_015615, Corse1p7_014214, Corse1p7_008434, Corse1p7_012430, Corse1p7_008735, Corse1p7_001122, Corse1p7_002466, Corse1p7_007095, Corse1p7_012549, Corse1p7_014236, and Corse1p7_016074 were characterized using the assay protocols and assay conditions indicated in Table 13. pH and temperature optima were determined for each protein as described in Examples 13.1 and 13.2. Results are shown in Table 13.

Table 13: Activities of expressed enzymes from Corynascus sepedonium

* Control is an equal volume of supernatant from a vector-only transformant; na, not applicable as control exhibited no detectable activity.

+ U, micromole product formed per minute under the indicated assay conditions

Example 22: Determination of activity-temperature profiles of Corynascus sepedonium proteins

[00315] Activity-temperature profiles were determined according to the protocol in Example 13.2 for the Corynascus sepedonium proteins Corse1p7_008434, Corse1p7_012549, Corse1p7_000860, Corse1p7_019166, Corse1p7_018897, Corse1p7_001122, Corse1p7_015615, Corse1p7_016342, Corse1p7_015795, Corse1p7_014214, Corse1p7_002466, Corse1p7_000417, Corse1p7_007095, and Corse1p7_016074, using the Assay Protocols and Assay Conditions indicated below in Table 14. Results are shown in Figures 7-10.

Table 14: Activity-temperature profiles for various Corynascus sepedonium proteins

Example 23: Improvement of a thermophilic cellulase mixture composed of three enzymes by a Corynascus sepedonium protein in an activity assay.

[00316] The cellulase enhancing activity of Corynascus sepedonium BG protein (Corse1p7_000860) was further analysed. The Corse1p7_000860 gene was cloned and expressed in A. niger. The supernatant of an A. niger expressing Corse1p7_000860 shake flask fermentation was concentrated and spiked in a dosage of 0.45 mg/gDM on top of a base activity of a three enzyme base mix (4.55 mg/gDM composed of: CBHI at 1.25 mg/gDM, CBHII at 1.5 mg/gDM and GH61 at 1.8 mg/gDM) at a feedstock concentration of 10% (w/w) aCS. As a negative control, the 3 enzyme base mix was also tested. The experiments were performed at least in duplicate and were incubated for 72 hours at 62°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described above.

[00317] Addition of this Corynascus sepedonium BG protein showed increased sugar release as shown below in Table 15.

Table 15: Effect of BG Corse1p7_000860 protein spiked on top of a 3E mix using aCS substrate

Example 24: Identification of thermophilic Corynascus sepedonium arabino(furano)sidase

[00318] The arabino(furano)sidase activity of a Corynascus sepedonium enzyme was analysed as described above. The supernatant of an A. niger shake flask fermentation was concentrated and assayed for arabinose release from wheat arabinoxylan, which was pre-digested by an endo-xylanase, after incubation for 24 hours at pH 4.5 and 65°C. The Corse1p7_018897 enzyme showed increased arabinose release as shown below in Table 16.

Table 16: Effect of Corynascus sepedonium arabinofuranosidase on wheat arabinoxylan substrate pre-digested with an endo-xylanase

Example 25: Improvement of a thermophilic (hemi)cellulase mixture composed of five enzymes by a Corynascus sepedonium protein in an activity assay

[00319] The (hemi)cellulase enhancing activity of Corynascus sepedonium deacetylase protein (Corse1p7_000092) was analysed. The Corse1p7_000092 gene was cloned and expressed in A. niger. The supernatant of an A. niger expressing Corse1p7_000092 shake flask fermentation was concentrated and spiked in a dosage of 0.063 mg/gDM on top of a base activity of a five enzyme mix (2.5 mg/gDM composed of: CBHI at 0.7 mg/gDM, CBHII at 0.6 mg/gDM, GH61 at 0.8 mg/gDM, BG at 0.2 mg/gDM and BX at 0.2 mg/gDM) at a feedstock concentration of 5.5% (w/w) washed aCS. As a negative control, the 5 enzyme base mix was also tested. The samples were incubated for 72 hours at 62°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPAEC as described above.

[00320] Addition of this Corynascus sepedonium deacetylase protein showed increased glucose and xylose release as shown below in Table 17.

Table 17: Effect of deacetylase Corse1p7_000092 protein spiked on top of a 5 enzyme mix using washed aCS substrate

[00321] In the appended Sequence Listing, SEQ ID NOs: 1-534 relate to sequences from Chaetomium thermophilum (CHATH); SEQ ID NOs: 535-1278 relate to sequences from Thermomyces stellatus (THEST); and SEQ ID NOs: 1279-2136 relate to sequences from Corynascus sepedonium (CORSE). A summary of the sequences provided in the appended sequence listing is shown below in Table 18.

Table 18: Sequence listing summary

[00322] Although the present invention has been described hereinabove by way of specific embodiments thereof, it can be modified, without departing from the spirit and nature of the subject invention as defined in the appended claims.