Title:
XYLOSE ISOMERASES THAT CONFER EFFICIENT XYLOSE FERMENTATION CAPABILITY TO YEAST
Document Type and Number:
WIPO Patent Application WO/2018/115251
Kind Code:
A1
Abstract:
The present invention relates to novel nucleic acid sequences encoding bacterial xylose isomerases that, upon transformation of a eukaryotic microbial host cell, such as yeast, confer to the host cell the ability of isomerising xylose to xylulose. The nucleic acid sequences encode xylose isomerases that originate from bacteria such as Eubacterium sp., Clostridium cellulosi and others. The invention further relates to fermentation processes wherein the transformed host cells ferment a xylose-containing medium to produce ethanol or other fermentation products.

Inventors:
THEVELEIN JOHAN (BE)
DEMEKE MEKONNEN (BE)
FOULQUIÉ MORENO MARIA REMEDIOS (BE)
DE GRAEVE STIJN (BE)
VALDOMIRO CHARLES BELO EDGARD (BE)
Application Number:
PCT/EP2017/084034
Publication Date:
June 28, 2018
Filing Date:
December 21, 2017
Assignee:
VIB VZW (BE)
KATHOLIEKE UNIV LEUVEN K U LEUVEN R&D (BE)
GLOBALYEAST N V (BE)
International Classes:
C12N1/22; C12N9/92; C12P7/10
Domestic Patent References:
WO2013003219A1  2013-01-03
WO2013017644A1  2013-02-07
WO2014170330A2  2014-10-23
WO2015181169A1  2015-12-03
WO2016083397A1  2016-06-02
WO2015086805A1  2015-06-18
WO2012175552A1  2012-12-27
WO2014048863A1  2014-04-03
WO1993003159A1  1993-02-18
WO2003062430A1  2003-07-31
WO2006009434A1  2006-01-26
WO2014090930A1  2014-06-19
WO2016026954A1  2016-02-25
Other References:
DATABASE UniProt [online] 24 July 2013 (2013-07-24), "RecName: Full=Xylose isomerase {ECO:0000256|HAMAP-Rule:MF_00455, ECO:0000256|RuleBase:RU000609, ECO:0000256|SAAS:SAAS00909344}; EC=5.3.1.5 {ECO:0000256|HAMAP-Rule:MF_00455, ECO:0000256|RuleBase:RU000609, ECO:0000256|SAAS:SAAS00925975};", XP002778302, retrieved from EBI accession no. UNIPROT:R5MJJ5 Database accession no. R5MJJ5
DATABASE UniProt [online] 29 October 2014 (2014-10-29), "RecName: Full=Xylose isomerase {ECO:0000256|HAMAP-Rule:MF_00455, ECO:0000256|RuleBase:RU000609, ECO:0000256|SAAS:SAAS00909344}; EC=5.3.1.5 {ECO:0000256|HAMAP-Rule:MF_00455, ECO:0000256|RuleBase:RU000609, ECO:0000256|SAAS:SAAS00925975};", XP002778303, retrieved from EBI accession no. UNIPROT:A0A078KPR4 Database accession no. A0A078KPR4
DISCHE; BORENFREUND, J. BIOL. CHEM., vol. 192, 1951, pages 583 - 587
HENIKOFF; HENIKOFF, PNAS, vol. 89, 1992, pages 915 - 919
"GCG Wisconsin Package", ACCELRYS INC.
ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 10
ALTSCHUL ET AL., NUCLEIC ACIDS RES., vol. 25, no. 17, pages 3389 - 3402
MEADEN ET AL., GENE, vol. 141, 1994, pages 97 - 101
VANGRYSPERRE ET AL., BIOCHEM. J., vol. 265, pages 699 - 705
HENRICK ET AL., J. MOL. BIOL., vol. 208, pages 129 - 157
BHOSALE ET AL., MICROBIOL. REV., vol. 60, 1996, pages 280 - 300
SAMBROOK; RUSSELL: "Molecular Cloning: A Laboratory Manual", 2001, COLD SPRING HARBOR LABORATORY PRESS
SHARP AND LI, NUCLEIC ACIDS RESEARCH, vol. 15, 1987, pages 1281 - 1295
JANSEN ET AL., NUCLEIC ACIDS RES., vol. 31, no. 8, 2003, pages 2242 - 51
J.A. BARNETT; R.W. PAYNE; D. YARROW: "Yeasts: characteristics and identification", 2000, CAMBRIDGE UNIVERSITY PRESS
"The yeasts, a taxonomic study", 1998, ELSEVIER SCIENCE PUBL. B.V.
DENG; HO, APPL. BIOCHEM. BIOTECHNOL., vol. 24-25, 1990, pages 193 - 199
TRAFF ET AL., APPL. ENVIRONM. MICROBIOL., vol. 67, 2001, pages 5668 - 5674
ÇAKAR ET AL., FEMS YEAST RESEARCH, vol. 12, 2011, pages 171 - 182
MUMBERG: "Yeast vectors for controlled expression of heterologous protein in different genetic backgrounds", GENE, vol. 156, 1995, pages 119 - 122, XP004042399, DOI: doi:10.1016/0378-1119(95)00037-7
ZALDIVAR J; NIELSEN J; OLSSON L: "Fuel ethanol production from lignocellulose: a challenge for metabolic engineering and process integration", APPL MICROBIOL BIOTECHNOL., vol. 56, no. 1-2, July 2001 (2001-07-01), pages 17 - 34, XP002201309, DOI: doi:10.1007/s002530100624
HAHN-HAGERDAL B; KARHUMAA K; FONSECA C; SPENCER-MARTINS I; GORWA-GRAUSLUND MF: "Towards industrial pentose-fermenting yeast strains", APPL MICROBIOL BIOTECHNOL, vol. 74, no. 5, April 2007 (2007-04-01), pages 937 - 53, XP002620846, DOI: doi:10.1007/s00253-006-0827-2
LAU MW; GUNAWAN C; BALAN V; DALE BE: "Comparing the fermentation performance of Escherichia coli KO11, Saccharomyces cerevisiae 424A(LNH-ST) and Zymomonas mobilis AX101 for cellulosic ethanol production", BIOTECHNOL BIOFUELS, vol. 3, no. 1, 27 May 2010 (2010-05-27), page 11
BETTIGA M; HAHN-HAGERDAL B; GORWA-GRAUSLUND MF: "Comparing the xylose reductase/xylitol dehydrogenase and xylose isomerase pathways in arabinose and xylose fermenting Saccharomyces cerevisiae strains", BIOTECHNOL BIOFUELS, vol. 1, no. 1, 23 October 2008 (2008-10-23), pages 16, XP021045780, DOI: doi:10.1186/1754-6834-1-16
HECTOR RE; MERTENS JA; BOWMAN MJ; NICHOLS NN; COTTA MA; HUGHES SR: "Saccharomyces cerevisiae engineered for xylose metabolism requires gluconeogenesis and the oxidative branch of the pentose phosphate pathway for aerobic xylose assimilation", YEAST, vol. 28, no. 9, 1 September 2011 (2011-09-01), pages 645 - 60, XP055104985, DOI: doi:10.1002/yea.1893
HOU J; SUO F; WANG C; LI X, SHEN Y; BAO X: "Fine-tuning of NADH oxidase decreases byproduct accumulation in respiration deficient xylose metabolic Saccharomyces cerevisiae", BMC BIOTECHNOL, vol. 14, no. 1, 14 February 2014 (2014-02-14), pages 13, XP021176813, DOI: doi:10.1186/1472-6750-14-13
JEPPSSON M; BENGTSSON O; FRANKE K; LEE H; HAHN-HAGERDAL B; GORWA-GRAUSLUND MF: "The expression of a Pichia stipitis xylose reductase mutant with higher KM for NADPH increases ethanol production from xylose in recombinant Saccharomyces cerevisiae", BIOTECHNOL BIOENG, vol. 93, no. 4, 2006, pages 665 - 73, XP002504734, DOI: doi:10.1002/bit.20737
WALFRIDSSON M; BAO X; ANDERLUND M; LILIUS G; BULOW L; HAHN-HAGERDAL B: "Ethanolic fermentation of xylose with Saccharomyces cerevisiae harboring the Thermus thermophilus xylA gene, which expresses an active xylose (glucose) isomerase", APPL ENVIRON MICROBIOL., vol. 62, no. 12, December 1996 (1996-12-01), pages 4648 - 51, XP002117639
KUYPER M; HARHANGI HR; STAVE AK; WINKLER AA; JETTEN MSM; LAAT WTAM ET AL.: "High-level functional expression of a fungal xylose isomerase: the key to efficient ethanolic fermentation of xylose by Saccharomyces cerevisiae?", FEMS YEAST RES., vol. 4, no. 1, 2003, pages 69 - 78, XP002312913, DOI: doi:10.1016/S1567-1356(03)00141-7
PENG B; HUANG S; LIU T; GENG A: "Bacterial xylose isomerases from the mammal gut Bacteroidetes cluster function in Saccharomyces cerevisiae for effective xylose fermentation", MICROB CELL FACTORIES., vol. 14, no. 1, 17 May 2015 (2015-05-17), pages 70, XP021222919, DOI: doi:10.1186/s12934-015-0253-1
BRAT D; BOLES E; WIEDEMANN B: "Functional Expression of a Bacterial Xylose Isomerase in Saccharomyces cerevisiae", APPL ENVIRON MICROBIOL, vol. 75, no. 8, 13 February 2009 (2009-02-13), pages 2304 - 11, XP009121860, DOI: doi:10.1128/AEM.02522-08
DEMEKE MM; DIETZ H; LI Y; FOULQUIE-MORENO MR; MUTTURI S; DEPREZ S ET AL.: "Development of a D-xylose fermenting and inhibitor tolerant industrial Saccharomyces cerevisiae strain with high performance in lignocellulose hydrolysates using metabolic and evolutionary engineering", BIOTECHNOL BIOFUELS., vol. 6, no. 1, 21 June 2013 (2013-06-21), pages 89, XP055223225, DOI: doi:10.1186/1754-6834-6-89
GLANEMANN C; LOOS A; GORRET N; WILLIS LB; O'BRIEN XM; LESSARD PA ET AL.: "Disparity between changes in mRNA abundance and enzyme activity in Corynebacterium glutamicum: implications for DNA microarray analysis", APPL MICROBIOL BIOTECHNOL., vol. 61, no. 1, 21 December 2002 (2002-12-21), pages 61 - 8
DEMEKE MM; FOULQUIE-MORENO MR; DUMORTIER F; THEVELEIN JM.: "Rapid Evolution of Recombinant Saccharomyces cerevisiae for Xylose Fermentation through Formation of Extra-chromosomal Circular DNA", PLOS GENET., vol. 11, no. 3, 4 March 2015 (2015-03-04), pages e1005010
DEMEKE MM; DUMORTIER F; LI Y; BROECKX T; FOULQUIE-MORENO MR; THEVELEIN JM: "Combining inhibitor tolerance and D-xylose fermentation in industrial Saccharomyces cerevisiae for efficient lignocellulose-based bioethanol production", BIOTECHNOL BIOFUELS, vol. 6, no. 1, 26 August 2013 (2013-08-26), pages 120, XP055144407, DOI: doi:10.1186/1754-6834-6-120
AUSTIN MN; RABE LK; SRINIVASAN S; FREDRICKS DN; WIESENFELD HC; HILLIER SL: "Mageeibacillus indolicus gen. nov., sp. nov.: A novel bacterium isolated from the female genital tract", ANAEROBE, vol. 32, April 2015 (2015-04-01), pages 37 - 42
GIETZ RD; SCHIESTL RH; WILLEMS AR; WOODS RA: "Studies on the transformation of intact yeast cells by the LiAc/SS-DNA/PEG procedure", YEAST, vol. 11, no. 4, 15 April 1995 (1995-04-15), pages 355 - 60
Attorney, Agent or Firm:
NEDERLANDSCH OCTROOIBUREAU (NL)
Claims:
1. A eukaryotic microbial cell comprising a nucleotide sequence, the expression of which confers to, or increases in the cell the ability to directly isomerise xylose into xylulose, wherein the nucleotide sequence encodes a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 68% sequence identity with the amino acid sequence of SEQ ID NO. 7, and wherein preferably the nucleotide sequence encodes an amino acid sequence that is obtainable from a bacterium of the genus Eubacterium, more preferably a bacterium of the species Eubacterium sp. CAG_180.

2. A cell according to claim 1, wherein the cell further comprises a nucleotide sequence, the expression of which confers to, or increases in the cell the ability to directly isomerise xylose into xylulose, wherein the nucleotide sequence encodes a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 71 % sequence identity with the amino acid sequence of SEQ ID NO. 10, and wherein preferably the nucleotide sequence encodes an amino acid sequence that is obtainable from a bacterium of the genus Clostridium, more preferably a bacterium of the species Clostridium cellulosi.

3. A cell according to claim 1 or 2, wherein the cell is a yeast or a filamentous fungus of a genus selected from the group consisting of Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, Yarrowia, Kazachstania, Naumovia, Aspergillus, Trichoderma, Humicola, Acremonium, Fusarium, and Penicillium.

4. A cell according to claim 3, wherein the cell is a yeast that is capable of anaerobic alcoholic fermentation.

5. A cell according to claim 4, wherein the yeast belongs to a Saccharomyces species selected from the group consisting of S. cerevisiae, S. bayanus, S. bulderi, S. servazzii, S. cariocanus, S. castellii, S. dairenensis, S. exiguus, S. kluyveri, S. kudriavzevii, S. mikatae, S. paradoxus, S. pastorianus, S. turicensis and S. unisporus.

6. A cell according to any one of the preceding claims, wherein the nucleotide sequence encoding the polypeptide with xylose isomerase activity is operably linked to a promoter that is insensitive to catabolite repression and that does not require xylose for induction.

7. A cell according to any one of the preceding claims, whereby the cell comprises at least one genetic modification selected from:

a) a genetic modification that increases the specific xylulose kinase activity;

b) a genetic modification that increases the flux of the pentose phosphate pathway; and,

c) a genetic modification that reduces unspecific aldose reductase activity in the cell.

8. A cell according to any one of the preceding claims, wherein the cell further comprises at least one genetic modification that results in a characteristic selected from the group consisting of:

a) increased tolerance to ethanol;

b) increased tolerance to acetic acid;

c) reduced production of glycerol;

d) increased xylose to ethanol fermentation rate; and,

e) increased thermotolerance.

9. A cell according to claim 8, wherein, in:

a) the genetic modification is a modification that introduces an allele of one or more of the ADE1, KIN3, MKT1, VPS70, SWS2 and APJ1 genes that confers increased tolerance to ethanol as described in WO 2012/175552 and WO 2014/170330;

b) the genetic modification is a modification that introduces an allele of one or more of the GLO1, DOT5, CUP2 and HAA1 genes that confers increased tolerance to acetic acid as described in WO 2015/181169 and WO 2016/083397;

c) the genetic modification is a modification that introduces a mutant SSK1 gene encoding a truncated ssk1 protein as described in WO 2014/048863;

d) the genetic modification is a modification that introduces an allele of the NNK1 gene that confers an increased xylose to ethanol fermentation rate as described in WO 2015/086805; and,

e) the genetic modification is overexpression of at least one of a gene encoding the Prp42 protein and a gene encoding the Smd2 protein.

10. A cell according to any one of the preceding claims, wherein the nucleotide sequence encoding the polypeptide with xylose isomerase activity is integrated into the genome of the cell.

11. A cell according to any one of the preceding claims, wherein the cell is a cell of an industrial yeast strain or derived from an industrial yeast strain.

12. A cell according to any one of the preceding claims, wherein the cell is a diploid, aneuploid or polyploid cell.

13. A cell according to any one of the preceding claims, wherein the cell is improved in at least one industrially relevant phenotype by evolutionary engineering, wherein preferably the industrially relevant phenotype is xylose utilisation rate.

14. A cell according to any one of the preceding claims, wherein the cell has the ability to produce at least one fermentation product selected from the group consisting of ethanol, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, amino acids, 1,3-propane-diol, ethylene, glycerol, butyric acid, caproate, butanol, glyoxylate, muconic acid, fatty alcohols, fatty acids, β-lactam antibiotics and cephalosporins.

15. A process for producing a fermentation product selected from the group consisting of ethanol, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, amino acids, 1,3-propane-diol, ethylene, glycerol, butyric acid, caproate, butanol, glyoxylate, muconic acid, fatty alcohols, fatty acids, β-lactam antibiotics and cephalosporins, whereby the process comprises the steps of: (a) fermenting a medium containing a source of xylose, and optionally a source of glucose, with a cell as defined in any one of claims 1 - 14, whereby the cell ferments the xylose, and optionally the glucose, to the fermentation product, and optionally, (b) recovery of the fermentation product.

Description:
Xylose Isomerases that confer efficient xylose fermentation capability to yeast

Field of the invention

The present invention relates to the fields of microbiology and fermentation technology. In particular, the invention relates to nucleic acid sequences encoding xylose isomerases that, upon transformation of a eukaryotic microbial host cell, such as yeast, confer to the host cell the ability of isomerising xylose to xylulose. The invention further relates to fermentation processes wherein the transformed host cells ferment a pentose-containing medium to produce ethanol or other fermentation products.

Background art

The yeast Saccharomyces cerevisiae has been the primary organism of choice in industrial fermentation processes including alcoholic beverages and bioethanol production. The dominance of this organism in these industries is due to its superior properties such as high ethanol productivity and yield, high tolerance to ethanol and other inhibitors, and its excellent maintenance of viability during production, storage and transport. Additionally, since it is one of the most intensively studied microorganisms, numerous molecular tools are available for its genetic and physiological manipulation (1).

On the other hand, natural strains of S. cerevisiae are not useful in lignocellulose based ethanol industries. This is primarily due to their inability to metabolize pentose sugars, particularly xylose. Xylose is the second most abundant sugar in nature. It accounts for a third of the total sugar present in lignocellulosic biomass, such as agricultural and forest residues, and municipal solid waste. Hence, efficient utilization of xylose is crucial for lignocellulose based (second generation) bioethanol production (2).

There are several microorganisms that are able to naturally ferment xylose. However, unlike S. cerevisiae, those organisms do not have enough inherent robustness to cope with the harsh environments existing in industrial fermentations. Compared to S. cerevisiae, they are less tolerant to ethanol and to various growth and fermentation inhibitors such as organic acids, furan derivatives and phenolic compounds that are present in lignocellulosic hydrolysates (3). For that reason, much effort is being undertaken to engineer S. cerevisiae for efficient xylose fermentation, rather than endowing industrial robustness to naturally xylose utilizing micro-organisms.

Two different xylose utilization pathways have been engineered in yeast. The first pathway, called the fungal pathway or the redox pathway, works by a two-step enzymatic conversion of xylose to xylulose. In the first step the NADPH-dependent enzyme Xylose Reductase (XR) reduces xylose to xylitol. Xylitol is subsequently oxidized to xylulose by the NAD-dependent Xylitol Dehydrogenase (XDH). Xylulose can then be phosphorylated to Xylulose-5-Phosphate by the native Xylulokinase. Though yeast strains expressing the fungal redox pathway can efficiently ferment xylose, they generally produce less ethanol per gram biomass, due to accumulation of xylitol as a by-product (4). The low ethanol yield and high xylitol accumulation are due to cofactor imbalance generated by the heterologous enzymes XR/XDH. A number of strategies have been applied to resolve the problem of cofactor imbalance. These include modification of the cofactor specificity of XR and XDH, and expression of heterologous transhydrogenases that catalyse the transfer of H+ between NADPH and NAD+ (5-7). Balancing cofactor usage in yeast expressing XR/XDH has shown good potential but, until now, it could not eliminate production of xylitol as a by-product. The yield of ethanol per amount of sugar consumed by such strains remains too low.

The second pathway works with a one-step conversion of xylose to xylulose using Xylose Isomerase (XI). This pathway alleviates the cofactor imbalance associated with the fungal redox pathway. The XI pathway is predominantly found in bacteria but also in some fungi. Many earlier attempts to express bacterial XI in yeast failed, or resulted in very low expression. The first functionally active bacterial XI expressed in yeast was encoded by the XylA gene from the thermophilic bacterium Thermus thermophilus (8). However, the optimal enzymatic activity was observed at 85°C, which is far above the optimum temperature at which yeasts can grow. Nevertheless, the recombinant strain was able to grow very slowly with xylose as sole carbon source. Later on, expression of an enzymatically active fungal XI from Piromyces sp. became a great success story (9). Subsequently, other XIs from various species of bacteria or fungi have been actively expressed in S. cerevisiae (10). However, the activity of those enzymes in yeast remains lower compared to that of Piromyces sp. XI. The first bacterial XI that showed very good enzymatic activity when expressed in yeast was the XI from the bacterial species Clostridium phytofermentans. This enzyme was less inhibited by xylitol as opposed to xylose isomerases from other bacterial species (11). However, in spite of the high in vitro enzyme activities of the XIs reported so far, the recombinant strains expressing these enzymes exhibited only slow growth and fermentation capacity with xylose. Further improvement by mutagenesis or adaptive evolution of the recombinant yeast is required to obtain an acceptable xylose fermentation capacity (12).

To date, there are hundreds of XylA sequences available in NCBI sequence databases. These sequences are a great tool to search for functionally active XIs originating from various species. In spite of the vast sequence information, only a few XIs originating from several species of bacteria have been functionally expressed in yeast. Recently it was reported that most of the XIs actively expressed in yeast originate from the Bacteroidetes group living in the mammalian gut (10). A drawback of the XIs originating from the Bacteroidetes group is their strong inhibition by xylitol (11). However, many bacterial XIs other than those originating from the Bacteroidetes group cannot be functionally expressed in yeast and we still cannot predict beforehand whether a particular XI will be functionally expressed in yeast or not.

There is, therefore, still a need in the art for nucleotide sequences encoding other xylose isomerases that may be used to transform host cells like S. cerevisiae to confer to them the ability of isomerising xylose to xylulose, so as to enable the use of the thus transformed host cells in processes for the production of ethanol or other fermentation products by fermentation of pentose-containing feedstock.

Summary of the invention

In a first aspect the invention relates to a eukaryotic microbial cell comprising a nucleotide sequence, the expression of which confers to, or increases in the cell the ability to directly isomerise xylose into xylulose, wherein the nucleotide sequence encodes a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 68% sequence identity with the amino acid sequence of SEQ ID NO. 7. Preferably, the nucleotide sequence encodes an amino acid sequence that is obtainable from a bacterium of the genus Eubacterium, more preferably a bacterium of the species Eubacterium sp. CAG_180. A preferred cell according to the invention further comprises a second nucleotide sequence, the expression of which confers to, or increases in the cell the ability to directly isomerise xylose into xylulose, wherein the nucleotide sequence encodes a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 71 % sequence identity with the amino acid sequence of SEQ ID NO. 10. Preferably the second nucleotide sequence encodes an amino acid sequence that is obtainable from a bacterium of the genus Clostridium, more preferably a bacterium of the species Clostridium cellulosi.

The eukaryotic microbial cell according to the invention preferably is a yeast or a filamentous fungus of a genus selected from the group consisting of Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, Yarrowia, Kazachstania, Naumovia, Aspergillus, Trichoderma, Humicola, Acremonium, Fusarium, and Penicillium.

In one embodiment the eukaryotic microbial cell preferably is a yeast that is capable of anaerobic alcoholic fermentation. Preferably, the yeast belongs to a Saccharomyces species selected from the group consisting of S. cerevisiae, S. bayanus, S. bulderi, S. servazzii, S. cariocanus, S. castellii, S. dairenensis, S. exiguus, S. kluyveri, S. kudriavzevii, S. mikatae, S. paradoxus, S. pastorianus, S. turicensis and S. unisporus.

In a eukaryotic microbial cell according to the invention, the nucleotide sequence encoding the polypeptide with xylose isomerase activity preferably is operably linked to a promoter that is insensitive to catabolite repression and/or that does not require xylose for induction.

The eukaryotic microbial cell according to the invention preferably comprises at least one genetic modification selected from: a) a genetic modification that increases the specific xylulose kinase activity; b) a genetic modification that increases the flux of the pentose phosphate pathway; and, c) a genetic modification that reduces unspecific aldose reductase activity in the cell. The cell further preferably comprises at least one genetic modification that results in a characteristic selected from the group consisting of: a) increased tolerance to ethanol; b) increased tolerance to acetic acid; c) reduced production of glycerol; d) increased xylose to ethanol fermentation rate; and, e) increased thermotolerance. More preferably in the cell: a) the genetic modification is a modification that introduces an allele of one or more of the ADE1, KIN3, MKT1, VPS70, SWS2 and APJ1 genes that confers increased tolerance to ethanol as described in WO 2012/175552 and WO 2014/170330; b) the genetic modification is a modification that introduces an allele of one or more of the GLO1, DOT5, CUP2 and HAA1 genes that confers increased tolerance to acetic acid as described in WO 2015/181169 and WO 2016/083397; c) the genetic modification is a modification that introduces a mutant SSK1 gene encoding a truncated ssk1 protein as described in WO 2014/048863; d) the genetic modification is a modification that introduces an allele of the NNK1 gene that confers an increased xylose to ethanol fermentation rate as described in WO 2015/086805; and, e) the genetic modification is overexpression of at least one of a gene encoding the Prp42 protein and a gene encoding the Smd2 protein.

In a preferred eukaryotic microbial cell according to the invention, the nucleotide sequence encoding the polypeptide with xylose isomerase activity is integrated into the genome of the cell.

A eukaryotic microbial cell according to the invention preferably is a cell of an industrial yeast strain or derived from an industrial yeast strain. The cell can be a diploid, aneuploid or polyploid cell.

In one embodiment, a eukaryotic microbial cell according to the invention is a cell that is improved in at least one industrially relevant phenotype by evolutionary engineering, wherein preferably the industrially relevant phenotype is xylose utilisation rate.

A eukaryotic microbial cell according to the invention further preferably has the ability to produce at least one fermentation product selected from the group consisting of ethanol, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, amino acids, 1,3-propane-diol, ethylene, glycerol, butyric acid, caproate, butanol, glyoxylate, muconic acid, fatty alcohols, fatty acids, β-lactam antibiotics and cephalosporins.

In a second aspect the invention pertains to a process for producing a fermentation product selected from the group consisting of ethanol, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, amino acids, 1,3-propane-diol, ethylene, glycerol, butyric acid, caproate, butanol, glyoxylate, muconic acid, fatty alcohols, fatty acids, β-lactam antibiotics and cephalosporins. The process preferably comprises the steps of: (a) fermenting a medium containing a source of xylose, and optionally a source of glucose, with a eukaryotic microbial cell according to the invention, whereby the cell ferments the xylose, and optionally the glucose, to the fermentation product, and optionally, (b) recovery of the fermentation product.

In a third aspect, the invention relates to the use of a eukaryotic microbial cell according to the first aspect in a process according to the second aspect.

Description of the invention

Definitions

The enzyme "xylose isomerase" (EC 5.3.1.5) is herein defined as an enzyme that catalyses the direct isomerisation of D-xylose into D-xylulose and vice versa. The enzyme is also known as a D-xylose ketoisomerase. Some xylose isomerases are also capable of catalysing the conversion between D-glucose and D-fructose and are therefore sometimes referred to as glucose isomerase. Xylose isomerases require magnesium as cofactor. Xylose isomerases of the invention may be further defined by their amino acid sequence as herein described below. Likewise xylose isomerases may be defined by the nucleotide sequences encoding the enzyme as well as by nucleotide sequences hybridising to a reference nucleotide sequence encoding a xylose isomerase as herein described below. A unit (U) of xylose isomerase activity is herein defined as the amount of enzyme producing 1 nmol of xylulose per minute, in a reaction mixture containing 50 mM phosphate buffer (pH 7.0), 10 mM xylose and 10 mM MgCl2, at 37°C. The xylulose formed is determined by the method of Dische and Borenfreund (1951, J. Biol. Chem. 192: 583-587) or by HPLC as is known in the art.
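
The unit definition above amounts to simple arithmetic, which the following minimal Python sketch illustrates. The function names are hypothetical, and normalising the units to the amount of protein (U per mg) is an assumption added for illustration; the definition above only specifies nmol of xylulose per minute under the stated assay conditions.

```python
def xylose_isomerase_units(xylulose_nmol: float, minutes: float) -> float:
    """Units (U) of activity: nmol of xylulose formed per minute of assay."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return xylulose_nmol / minutes


def specific_activity(units: float, protein_mg: float) -> float:
    """Hypothetical helper: units per mg of protein (an assumed normalisation)."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return units / protein_mg


# Example: 300 nmol of xylulose formed in a 10 minute assay with 0.5 mg protein.
units = xylose_isomerase_units(300.0, 10.0)   # 30.0 U
per_mg = specific_activity(units, 0.5)        # 60.0 U per mg protein
```

The xylulose amount itself would come from the colorimetric or HPLC measurement named above; this sketch only covers the conversion to units.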

The terms "homology", "sequence identity" and the like are used interchangeably herein.

Sequence identity is herein defined as a relationship between two or more amino acid (polypeptide or protein) sequences or two or more nucleic acid (polynucleotide) sequences, as determined by comparing the sequences. In the art, "identity" also means the degree of sequence relatedness between amino acid or nucleic acid sequences, as the case may be, as determined by the match between strings of such sequences. "Similarity" between two amino acid sequences is determined by comparing the amino acid sequence and its conserved amino acid substitutes of one polypeptide to the sequence of a second polypeptide. "Identity" and "similarity" can be readily calculated by known methods.

"Sequence identity" and "sequence similarity" can be determined by alignment of two peptide or two nucleotide sequences using global or local alignment algorithms, depending on the length of the two sequences. Sequences of similar lengths are preferably aligned using a global alignment algorithm (e.g. Needleman Wunsch) which aligns the sequences optimally over the entire length, while sequences of substantially different lengths are preferably aligned using a local alignment algorithm (e.g. Smith Waterman). Sequences may then be referred to as "substantially identical" or "essentially similar" when they (when optimally aligned by for example the programs GAP or BESTFIT using default parameters) share at least a certain minimal percentage of sequence identity (as defined below). GAP uses the Needleman and Wunsch global alignment algorithm to align two sequences over their entire length (full length), maximizing the number of matches and minimizing the number of gaps. A global alignment is suitably used to determine sequence identity when the two sequences have similar lengths. Generally, the GAP default parameters are used, with a gap creation penalty = 50 (nucleotides) / 8 (proteins) and gap extension penalty = 3 (nucleotides) / 2 (proteins). For nucleotides the default scoring matrix used is nwsgapdna and for proteins the default scoring matrix is Blosum62 (Henikoff & Henikoff, 1992, PNAS 89, 915-919).
Sequence alignments and scores for percentage sequence identity may be determined using computer programs, such as the GCG Wisconsin Package, Version 10.3, available from Accelrys Inc., 9685 Scranton Road, San Diego, CA 92121-3752 USA, or using open source software, such as the program "needle" (using the global Needleman Wunsch algorithm) or "water" (using the local Smith Waterman algorithm) in EmbossWIN version 2.10.0, using the same parameters as for GAP above, or using the default settings (both for 'needle' and for 'water' and both for protein and for DNA alignments, the default gap opening penalty is 10.0 and the default gap extension penalty is 0.5; default scoring matrices are Blosum62 for proteins and DNAFull for DNA). When sequences have substantially different overall lengths, local alignments, such as those using the Smith Waterman algorithm, are preferred.
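
As an illustration of how a percentage identity can be derived from a global alignment, the following is a minimal Python sketch of the Needleman Wunsch algorithm. It is deliberately simplified and does not reproduce GAP or "needle": it assumes a flat match/mismatch score instead of Blosum62 and a single linear gap penalty instead of separate creation and extension penalties, and it assumes identity is counted over all alignment columns including gaps. All function names and default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; returns the two gapped strings.

    Simplified scoring (assumption): flat match/mismatch, linear gap penalty.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b):
    """Identical columns as a percentage of all alignment columns (gaps included)."""
    matches = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100.0 * matches / len(aln_a)


def meets_threshold(seq, reference, threshold=68.0):
    """Check a candidate against a minimum identity, e.g. the 68% used for SEQ ID NO. 7."""
    return percent_identity(*needleman_wunsch(seq, reference)) >= threshold


aligned = needleman_wunsch("ACGT", "ACT")   # ("ACGT", "AC-T")
identity = percent_identity(*aligned)        # 75.0
```

For real sequence comparisons the programs and parameters named above should be used; the sketch only shows the mechanics of aligning over the full length and counting identical positions.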

Alternatively, percentage similarity or identity may be determined by searching against public databases, using algorithms such as FASTA, BLAST, etc. Thus, the nucleic acid and protein sequences of the present invention can further be used as a "query sequence" to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the BLASTn and BLASTx programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12 to obtain nucleotide sequences homologous to oxidoreductase nucleic acid molecules of the invention. BLAST protein searches can be performed with the BLASTx program, score = 50, wordlength = 3 to obtain amino acid sequences homologous to protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17): 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., BLASTx and BLASTn) can be used. See the homepage of the National Center for Biotechnology Information at http://www.ncbi.nlm.nih.gov/.

Optionally, in determining the degree of amino acid similarity, the skilled person may also take into account so-called "conservative" amino acid substitutions, as will be clear to the skilled person. Conservative amino acid substitutions refer to the interchangeability of residues having similar side chains. Examples of classes of amino acid residues for conservative substitutions are given in the Tables below.

Alternative conservative amino acid residue substitution classes.

Alternative Physical and Functional Classifications of Amino Acid Residues.

Alcohol group-containing residues: S and T
Aliphatic residues: I, L, V, and M
Cycloalkenyl-associated residues: F, H, W, and Y
Hydrophobic residues: A, C, F, G, H, I, L, M, R, T, V, W, and Y
Negatively charged residues: D and E
Polar residues: C, D, E, H, K, N, Q, R, S, and T
Positively charged residues: H, K, and R
Small residues: A, C, D, G, N, P, S, T, and V
Very small residues: A, G, and S
Residues involved in turn formation: A, C, D, E, G, H, K, N, Q, R, S, P and T
Flexible residues: Q, T, K, S, G, P, D, E, and R
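The residue classes listed above can be held in a simple lookup structure, for example as in the following illustrative Python sketch. The sketch assumes, purely for illustration, that a substitution may be regarded as conservative with respect to a class when both residues belong to that class; the dictionary layout and the function name are hypothetical.

```python
# Residue classes as listed above (one-letter amino acid codes).
RESIDUE_CLASSES = {
    "alcohol group-containing": set("ST"),
    "aliphatic": set("ILVM"),
    "cycloalkenyl-associated": set("FHWY"),
    "hydrophobic": set("ACFGHILMRTVWY"),
    "negatively charged": set("DE"),
    "polar": set("CDEHKNQRST"),
    "positively charged": set("HKR"),
    "small": set("ACDGNPSTV"),
    "very small": set("AGS"),
    "turn formation": set("ACDEGHKNQRSPT"),
    "flexible": set("QTKSGPDER"),
}


def shared_classes(residue_a, residue_b):
    """Return the sorted names of all classes containing both residues."""
    return sorted(
        name
        for name, members in RESIDUE_CLASSES.items()
        if residue_a in members and residue_b in members
    )
```

Under this assumption, an exchange of D for E shares several classes, whereas an exchange of W for D shares none.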

Nucleotide sequences encoding xylose isomerases of the invention may also be defined by their capability to hybridise with the nucleotide sequences encoding xylose isomerases as exemplified herein, under moderate, or preferably under stringent hybridisation conditions. Stringent hybridisation conditions are herein defined as conditions that allow a nucleic acid sequence of at least about 25, preferably about 50, 75 or 100 nucleotides, and most preferably of about 200 or more nucleotides, to hybridise at a temperature of about 65 °C in a solution comprising about 1 M salt, preferably 6 x SSC or any other solution having a comparable ionic strength, and washing at 65 °C in a solution comprising about 0.1 M salt, or less, preferably 0.2 x SSC or any other solution having a comparable ionic strength. Preferably, the hybridisation is performed overnight, i.e. at least for 10 hours, and preferably washing is performed for at least one hour with at least two changes of the washing solution. These conditions will usually allow the specific hybridisation of sequences having about 90% or more sequence identity.

Moderate conditions are herein defined as conditions that allow a nucleic acid sequence of at least 50 nucleotides, preferably of about 200 or more nucleotides, to hybridise at a temperature of about 45 °C in a solution comprising about 1 M salt, preferably 6 x SSC or any other solution having a comparable ionic strength, and washing at room temperature in a solution comprising about 1 M salt, preferably 6 x SSC or any other solution having a comparable ionic strength. Preferably, the hybridisation is performed overnight, i.e. at least for 10 hours, and preferably washing is performed for at least one hour with at least two changes of the washing solution. These conditions will usually allow the specific hybridisation of sequences having up to 50% sequence identity. The person skilled in the art will be able to modify these hybridisation conditions in order to specifically identify sequences varying in identity between 50% and 90%.

A "nucleic acid construct" or "nucleic acid vector" is herein understood to mean a man-made nucleic acid molecule resulting from the use of recombinant DNA technology. The term "nucleic acid construct" therefore does not include naturally occurring nucleic acid molecules, although a nucleic acid construct may comprise (parts of) naturally occurring nucleic acid molecules. The terms "expression vector" or "expression construct" refer to nucleotide sequences that are capable of effecting expression of a gene in host cells or host organisms compatible with such sequences. These expression vectors typically include at least suitable transcription regulatory sequences and optionally, 3' transcription termination signals. Additional factors necessary or helpful in effecting expression may also be present, such as expression enhancer elements. The expression vector will be introduced into a suitable host cell and be able to effect expression of the coding sequence in an in vitro cell culture of the host cell. The expression vector will be suitable for replication in the host cell or organism of the invention.

As used herein, the term "promoter" or "transcription regulatory sequence" refers to a nucleic acid fragment that functions to control the transcription of one or more coding sequences, and is located upstream with respect to the direction of transcription of the transcription initiation site of the coding sequence, and is structurally identified by the presence of a binding site for DNA-dependent RNA polymerase, transcription initiation sites and any other DNA sequences, including, but not limited to, transcription factor binding sites, repressor and activator protein binding sites, and any other sequences of nucleotides known to one of skill in the art to act directly or indirectly to regulate the amount of transcription from the promoter. A "constitutive" promoter is a promoter that is active in most tissues under most physiological and developmental conditions. An "inducible" promoter is a promoter that is physiologically or developmentally regulated, e.g. by the application of a chemical inducer.

The term "selectable marker" is a term familiar to one of ordinary skill in the art and is used herein to describe any genetic entity which, when expressed, can be used to select for a cell or cells containing the selectable marker. The term "reporter" may be used interchangeably with marker, although it is mainly used to refer to visible markers, such as green fluorescent protein (GFP). Selectable markers may be dominant or recessive or bidirectional.

As used herein, the term "operably linked" refers to a linkage of polynucleotide elements in a functional relationship. A nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence. For instance, a transcription regulatory sequence is operably linked to a coding sequence if it affects the transcription of the coding sequence. Operably linked means that the DNA sequences being linked are typically contiguous and, where necessary to join two protein encoding regions, contiguous and in reading frame.

The terms "protein" or "polypeptide" are used interchangeably and refer to molecules consisting of a chain of amino acids, without reference to a specific mode of action, size, 3-dimensional structure or origin.

"Fungi" (singular fungus) are herein understood as heterotrophic eukaryotic microorganisms that digest their food externally, absorbing nutrient molecules into their cells. Fungi are a separate kingdom of eukaryotic organisms and include yeasts, molds, and mushrooms. The terms fungi, fungus and fungal as used herein thus expressly include yeasts as well as filamentous fungi.

The term "gene" means a DNA fragment comprising a region (transcribed region), which is transcribed into an RNA molecule (e.g. an mRNA) in a cell, operably linked to suitable regulatory regions (e.g. a promoter). A gene will usually comprise several operably linked fragments, such as a promoter, a 5' leader sequence, a coding region and a 3' nontranslated sequence (3' end) comprising a polyadenylation site. "Expression of a gene" refers to the process wherein a DNA region which is operably linked to appropriate regulatory regions, particularly a promoter, is transcribed into an RNA, which is biologically active, i.e. which is capable of being translated into a biologically active protein or peptide. The term "homologous" when used to indicate the relation between a given (recombinant) nucleic acid or polypeptide molecule and a given host organism or host cell, is understood to mean that in nature the nucleic acid or polypeptide molecule is produced by a host cell or organism of the same species, preferably of the same variety or strain. If homologous to a host cell, a nucleic acid sequence encoding a polypeptide will typically (but not necessarily) be operably linked to another (heterologous) promoter sequence and, if applicable, another (heterologous) secretory signal sequence and/or terminator sequence than in its natural environment. It is understood that the regulatory sequences, signal sequences, terminator sequences, etc. may also be homologous to the host cell. In this context, the use of only "homologous" sequence elements allows the construction of "self-cloned" genetically modified organisms (GMOs) (self-cloning is defined herein as in European Directive 98/81/EC Annex II). When used to indicate the relatedness of two nucleic acid sequences the term "homologous" means that one single-stranded nucleic acid sequence may hybridize to a complementary single-stranded nucleic acid sequence. The degree of hybridization may depend on a number of factors including the amount of identity between the sequences and the hybridization conditions such as temperature and salt concentration as discussed later.

The term "heterologous" when used with respect to a nucleic acid (DNA or RNA) or protein refers to a nucleic acid or protein that does not occur naturally as part of the organism, cell, genome or DNA or RNA sequence in which it is present, or that is found in a cell or location or locations in the genome or DNA or RNA sequence that differ from that in which it is found in nature. Heterologous nucleic acids or proteins are not endogenous to the cell into which they are introduced, but have been obtained from another cell or synthetically or recombinantly produced. Generally, though not necessarily, such nucleic acids encode proteins that are not normally produced by the cell in which the DNA is transcribed or expressed. Similarly, exogenous RNA encodes proteins not normally expressed in the cell in which the exogenous RNA is present. Heterologous nucleic acids and proteins may also be referred to as foreign nucleic acids or proteins. Any nucleic acid or protein that one of skill in the art would recognize as heterologous or foreign to the cell in which it is expressed is herein encompassed by the term heterologous nucleic acid or protein. The term heterologous also applies to non-natural combinations of nucleic acid or amino acid sequences, i.e. combinations where at least two of the combined sequences are foreign with respect to each other.

Description of embodiments

To date a vast number of xylose isomerase amino acid sequences are publicly available in Genbank and other sequence databases. Among them are few amino acid sequences of xylose isomerases that are known for the ability of functional expression in yeasts, including e.g. xylose isomerases from anaerobic fungi like Piromyces, from the Bacteroidetes group living in the mammalian gut, as well as a bacterial xylose isomerase from the species Clostridium phytofermentans. The present inventors have surprisingly found amino acid sequences of xylose isomerases that are not related to the Piromyces, Bacteroidetes and C. phytofermentans enzymes - in the sense that most of them share less than 70% amino acid sequence identity with the amino acid sequences of the Piromyces (PiXI; SEQ ID NO: 18) and C. phytofermentans (CpXI; SEQ ID NO: 17) enzymes (see Table 1), and that nonetheless have the ability of functional (i.e. active) expression in yeasts.

Functional expression of a xylose isomerase in a yeast is herein understood as expression of a codon-optimised coding sequence for the xylose isomerase from a glycolytic promoter on a 2μ-based plasmid in a S. cerevisiae host strain, which expression allows the detectable growth of the yeast on xylose as sole carbon source, preferably under anaerobic conditions with production of ethanol at the expense of xylose, more preferably with at least one of a growth rate, biomass and ethanol yield that is at least 10, 20, 50 or 80% of that achieved with a codon-optimised sequence coding for the Piromyces xylose isomerase (with amino acid sequence of SEQ ID NO: 18) under otherwise identical conditions. The S. cerevisiae host strain preferably is a host strain modified for growth on xylose by overexpression of xylulose kinase (XKS1) and all the genes of the pentose phosphate pathway (PPP), such as e.g. the strain Μ3150ρΧ/Δ/ΟρΧ/Δ (see Examples). Preferably, functional expression is expression that allows the detectable growth of the host strain on xylose as sole carbon source at a temperature which is lower than 35, 33, 30 or 28°C and at a temperature which is higher than 20, 22, or 25°C.

Table 1. Sequence identity of amino acid sequences of xylose isomerases in comparison to the amino acid sequence of Piromyces sp. XI (PiXI) and C. phytofermentans XI (CpXI).

Source of xylose isomerase | % identity to PiXI | % identity to CpXI | Functional expression in yeast | SEQ ID NO. | Code
Lachnoclostridium phytofermentans | 54.99 | 96.12 | + | 1 | Lp1XI
Clostridium algidicarnis | 53.83 | 72.60 | + | 2 | Ca2XI
Mageeibacillus indolicus | 53.02 | 69.35 | + | 3 | Mi3XI
Ruminococcus sp. NK3A76 | 52.19 | 68.64 | - | 4 | Rs4XI
Epulopiscium sp. 'N.t. morphotype B' | 52.94 | 67.28 | + | 5 | Es5XI
Alkaliphilus metalliredigens | 52.76 | 65.53 | + | 6 | Am6XI
Eubacterium sp. CAG_180 | 54.38 | 65.44 | + | 7 | Es7XI
Clostridium saccharoperbutylacetonicum | 53.23 | 64.61 | + | 8 | Cs8XI
Fusobacterium mortiferum | 51.96 | 65.67 | + | 9 | Fm9XI
[Clostridium] cellulosi | 50.69 | 64.84 | + | 10 | Cc10XI
Cellulosilyticum lentocellum | 53.35 | 64.53 | + | 11 | Cl11XI
Peptoclostridium difficile | 54.04 | 62.93 | + | 12 | Pcd12XI
(Pepto)clostridium difficile NAP08 | 54.50 | 62.70 | - | 13 | Cd13XI
Caldicellulosiruptor acetigenus | 50.35 | 61.75 | - | 14 | Ca14XI
Agrobacterium tumefaciens | 49.89 | 52.50 | - | 15 | At15XI
Burkholderia cenocepacia | 49.32 | 51.70 | - | 16 | Bc16XI

In a first aspect the invention relates to a transformed host cell that has the ability of isomerising xylose to xylulose. The ability of isomerising xylose to xylulose is conferred to the host cell by transformation of the host cell with a nucleic acid construct comprising a nucleotide sequence encoding a xylose isomerase. The transformed host cell's ability to isomerise xylose into xylulose is understood to mean the direct isomerisation of xylose, in a single reaction catalysed by a xylose isomerase, to xylulose, as opposed to the two step conversion of xylose into xylulose via a xylitol intermediate as catalysed by xylose reductase and xylitol dehydrogenase, respectively.

In one embodiment the nucleotide sequence encoding the xylose isomerase is selected from the group consisting of:

(a) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 65.5, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 7 (Eubacterium sp. CAG_180);

(b) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 64.9, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 10 ([Clostridium] cellulosi);

(c) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 64.7, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 8 (Clostridium saccharoperbutylacetonicum);

(d) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 64.6, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 11 (Cellulosilyticum lentocellum);

(e) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 67.3, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 5 (Epulopiscium sp. 'N.t. morphotype B');

(f) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 96.2, 96.5, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 1 (Lachnoclostridium phytofermentans);

(g) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 65.6, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 6 (Alkaliphilus metalliredigens);

(h) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 69.4, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 3 (Mageeibacillus indolicus);

(i) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 72.7, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 2 (Clostridium algidicarnis);

(j) a nucleotide sequence encoding a polypeptide with xylose isomerase activity, which polypeptide comprises an amino acid sequence that has at least 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% sequence identity with the amino acid sequence of SEQ ID NO. 12 (Peptoclostridium difficile);

(k) a nucleotide sequence the complementary strand of which hybridises to a nucleotide sequence of one of (a) - (j); and,

(l) a nucleotide sequence the sequence of which differs from the sequence of a nucleotide sequence of (k) due to the degeneracy of the genetic code.
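For illustration only, the lowest identity value recited in each of items (a) to (j) can be collected in a lookup table and compared with measured identities of a candidate sequence, as in the following Python sketch. The data layout and the function name are hypothetical; the numerical values are copied from items (a) to (j) and from Table 1. The sketch covers only the identity criterion and does not test for xylose isomerase activity.

```python
# Lowest recited % identity per reference sequence, keyed by SEQ ID NO.
MIN_IDENTITY = {
    7: 65.5, 10: 64.9, 8: 64.7, 11: 64.6, 5: 67.3,
    1: 96.2, 6: 65.6, 3: 69.4, 2: 72.7, 12: 63.0,
}

# % identity of the same reference sequences to CpXI, from Table 1.
IDENTITY_TO_CPXI = {
    1: 96.12, 2: 72.60, 3: 69.35, 5: 67.28, 6: 65.53,
    7: 65.44, 8: 64.61, 10: 64.84, 11: 64.53, 12: 62.93,
}


def matching_references(candidate_identities):
    """Return SEQ ID NOs whose lowest recited identity the candidate meets.

    candidate_identities maps SEQ ID NO -> measured % identity of the
    candidate amino acid sequence to that reference sequence.
    """
    return sorted(
        seq_id
        for seq_id, pct in candidate_identities.items()
        if seq_id in MIN_IDENTITY and pct >= MIN_IDENTITY[seq_id]
    )
```

In these data each lowest recited value lies just above the identity of the corresponding reference sequence to CpXI as given in Table 1.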

The nucleotide sequences of the invention encode a novel class of xylose isomerases that may be functionally expressed in eukaryotic microbial host cells of the invention as defined below. The nucleotide sequences of the invention preferably encode xylose isomerases that naturally occur in the source organism, e.g. the source bacterium.

A preferred nucleotide sequence of the invention thus encodes a xylose isomerase with an amino acid sequence that is identical to that of a xylose isomerase that is obtainable from (or naturally occurs in) a bacterium of the Family Clostridiaceae, more preferably a bacterium of the genus Clostridium, e.g. Clostridium algidicarnis, but more preferred is Clostridium saccharoperbutylacetonicum and most preferred is [Clostridium] cellulosi.

Another preferred nucleotide sequence of the invention encodes a xylose isomerase with an amino acid sequence that is identical to that of a xylose isomerase that is obtainable from (or naturally occurs in) a bacterium of the Family Eubacteriaceae, more preferably a bacterium of the genus Eubacterium, of which the species Eubacterium sp. CAG_180 is most preferred.

Alternatively, a nucleotide sequence of the invention encodes a xylose isomerase with an amino acid sequence that is identical to that of a xylose isomerase that is obtainable from (or naturally occurs in) a bacterium of a genus selected from the group consisting of Cellulosilyticum, Epulopiscium, Lachnoclostridium, Alkaliphilus, Mageeibacillus and Peptoclostridium, more preferably a bacterium of a species selected from the group consisting of Cellulosilyticum lentocellum, Epulopiscium sp. 'N.t. morphotype B', Lachnoclostridium phytofermentans, Alkaliphilus metalliredigens, Mageeibacillus indolicus and Peptoclostridium difficile.

It is however understood that nucleotide sequences encoding engineered forms of any of the xylose isomerases defined above and that comprise one or more amino acid substitutions, insertions and/or deletions as compared to the corresponding naturally occurring xylose isomerases but that are within the ranges of identity or similarity as defined herein are expressly included in the invention. Therefore, in one embodiment the nucleotide sequence of the invention encodes a xylose isomerase amino acid sequence comprising a xylose isomerase signature sequence as defined by Meaden et al. (1994, Gene, 141: 97-101): VXW[GP]GREG[YSTA] (present at positions 187-195) and [LIVM]EPKPX[EQ]P (present at positions 232-239), wherein "X" can be any amino acid and wherein amino acids in brackets indicate that one of the bracketed amino acids can be present at that position in the signature sequence. A xylose isomerase amino acid sequence of the invention further preferably comprises the conserved amino acid residues His-102, Asp-105 and Asp-340, which constitute a triad directly involved in catalysis, Lys-235, which plays a structural as well as a functional catalytic role, and Glu-233, which is involved in binding of the magnesium (Vangrysperre et al., 1990, Biochem. J. 265: 699-705; Henrick et al., J. Mol. Biol. 208: 129-157; Bhosale et al., 1996 Microbiol. Rev. 60: 280-300). Amino acid positions of the above signature sequences and conserved residues refer to positions in the reference amino acid sequence of the Piromyces xylose isomerase of SEQ ID NO: 18. In amino acid sequences of the invention other than SEQ ID NO: 18, preferably, the amino acid positions of the above signature sequences and conserved residues are present in amino acid positions corresponding to the positions of the signature sequences and conserved residues in SEQ ID NO: 18, preferably in a ClustalW (1.83 or 1.81) sequence alignment using default settings. The skilled person will know how to identify corresponding amino acid positions in xylose isomerase amino acid sequences other than SEQ ID NO: 18 using amino acid sequence alignment algorithms as defined hereinabove. An example of such an alignment is depicted in Table 2.
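The two signature sequences can, for example, be searched for with regular expressions in which "X" is rendered as a wildcard, as in the following illustrative Python sketch. The function name is hypothetical, and the positions it reports are 1-based positions in the sequence that is searched; they are not mapped onto the numbering of SEQ ID NO: 18, for which an alignment as described above would be needed.

```python
import re

# Signature sequences of Meaden et al. with "X" rendered as "." (any residue).
SIGNATURES = {
    "VXW[GP]GREG[YSTA]": re.compile(r"V.W[GP]GREG[YSTA]"),
    "[LIVM]EPKPX[EQ]P": re.compile(r"[LIVM]EPKP.[EQ]P"),
}


def find_signatures(protein):
    """Map each signature to the (start, end) of its first match, or None.

    Positions are 1-based and inclusive, counted in the searched sequence.
    """
    hits = {}
    for name, pattern in SIGNATURES.items():
        match = pattern.search(protein)
        hits[name] = (match.start() + 1, match.end()) if match else None
    return hits


# Toy sequence built only to exercise the patterns; it is not a real enzyme.
toy = "MM" + "VAWGGREGY" + "AAA" + "LEPKPKEP" + "K"
print(find_signatures(toy))
```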

In one embodiment therefore, the nucleotide sequence can encode engineered forms of any of the xylose isomerases defined above and that comprise one or more amino acid substitutions, insertions and/or deletions as compared to the corresponding naturally occurring xylose isomerase but that are within the ranges of identity or similarity as defined herein. The nucleotide sequence of the invention encodes a xylose isomerase, the amino acid sequence of which at least comprises in each of the invariable positions (that are indicated in Table 2 with a "*"), the amino acid present in an invariable position. Preferably, the amino acid sequence also comprises in the strongly conserved positions (that are indicated in Table 2 with a ":") one of the amino acids present in a strongly conserved position. More preferably, the amino acid sequence further also comprises in the less strongly conserved positions (that are indicated in Table 2 with a ".") one of the amino acids present in a less strongly conserved position. Amino acid substitutions outside of these invariable and conserved positions are less likely to affect xylose isomerase activity. In addition, to date a vast number of amino acid sequences of xylose isomerases are known in the art and new ones are continuously being added. Sequence alignments of SEQ ID NO: 18 and the xylose isomerase sequences of the invention with these known and new xylose isomerase amino acid sequences will indicate further conserved regions and amino acid positions, the conservation of which is important for structure and enzymatic activity.

The nucleotide sequence encodes a xylose isomerase that is preferably expressed in active form in the host cell. Thus, expression of the nucleotide sequence in the host cell produces a xylose isomerase with a specific activity of at least 10 U xylose isomerase activity per mg protein at 25°C, preferably at least 20, 25, 30, 50, 100, 200 or 300 U per mg at 25°C. The specific activity of the xylose isomerase expressed in the host cell is herein defined as the amount of xylose isomerase activity units per mg protein of cell free lysate of the host cell, e.g. a yeast cell free lysate. Determination of the xylose isomerase activity, amount of protein and preparation of the cell free lysate are as described in the Examples. Preferably, expression of the nucleotide sequence in the host cell produces a xylose isomerase with a Km for xylose that is less than 50, 40, 30 or 25 mM, more preferably, the Km for xylose is about 20 mM or less.

The nucleotide sequence encodes a xylose isomerase that preferably has reduced sensitivity to inhibition by xylitol. Preferably, the xylose isomerase shows less inhibition by xylitol than the Piromyces isomerase (SEQ ID NO: 18), more preferably the xylose isomerase shows less inhibition by xylitol than the C. phytofermentans isomerase (SEQ ID NO: 17). The nucleotide sequence thus preferably encodes a xylose isomerase that has an apparent inhibition constant Ki that is greater than 4.6, 5, 10, 14.51 or 15 mM xylitol. Sensitivity to inhibition by xylitol and the apparent inhibition constant Ki for xylitol can be determined as described in (11).
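To illustrate how the Km for xylose and the inhibition constant for xylitol affect the reaction rate, the following Python sketch evaluates a Michaelis-Menten rate law. It assumes a simple competitive inhibition model, which is an assumption made here for illustration; the method actually used to determine the apparent inhibition constant is the one described in the cited reference. The function and parameter names are hypothetical.

```python
def isomerisation_rate(xylose_mM, vmax, km_mM, xylitol_mM=0.0, ki_mM=None):
    """Michaelis-Menten rate, optionally with a competitive inhibitor.

    Assumed model: v = Vmax * S / (Km * (1 + I / Ki) + S).
    With no inhibitor (or no Ki given) this reduces to v = Vmax * S / (Km + S).
    """
    apparent_km = km_mM
    if ki_mM is not None and xylitol_mM > 0:
        apparent_km = km_mM * (1.0 + xylitol_mM / ki_mM)
    return vmax * xylose_mM / (apparent_km + xylose_mM)


# At a xylose concentration equal to Km the rate is half of Vmax.
print(isomerisation_rate(20.0, vmax=1.0, km_mM=20.0))

# Under the assumed model, a larger Ki gives a higher rate at the same
# xylitol concentration, i.e. less inhibition.
print(isomerisation_rate(20.0, 1.0, 20.0, xylitol_mM=4.6, ki_mM=4.6))
print(isomerisation_rate(20.0, 1.0, 20.0, xylitol_mM=4.6, ki_mM=14.51))
```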

The nucleotide sequences of the invention, encoding polypeptides with xylose isomerase activity, are obtainable from genomic and/or cDNA of a bacterium that belongs to a phylum, class, order, family or genus as described above, using methods for isolation of nucleotide sequences that are well known in the art per se (see e.g. Sambrook and Russell (2001) "Molecular Cloning: A Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, New York). The nucleotide sequences of the invention are e.g. obtainable in a process wherein a) degenerate PCR primers (such as those in SEQ ID NO.'s 19 and 20) are used on genomic and/or cDNA of a suitable organism (e.g. a bacterium as indicated above) to generate a PCR fragment comprising part of the nucleotide sequences encoding the polypeptides with xylose isomerase activity; b) the PCR fragment obtained in a) is used as probe to screen a cDNA and/or genomic library of the organism; and c) a cDNA or genomic DNA comprising the nucleotide sequence encoding a polypeptide with xylose isomerase activity is produced.

To increase the likelihood that the xylose isomerase is expressed at sufficient levels and in active form in the host cells of the invention, the nucleotide sequences encoding these enzymes, as well as other enzymes of the invention (see below), are preferably adapted to optimise their codon usage to that of the host cell in question. The adaptiveness of a nucleotide sequence encoding an enzyme to the codon usage of a host cell may be expressed as codon adaptation index (CAI). The codon adaptation index is herein defined as a measurement of the relative adaptiveness of the codon usage of a gene towards the codon usage of highly expressed genes in a particular host cell or organism. The relative adaptiveness (w) of each codon is the ratio of the usage of each codon to that of the most abundant codon for the same amino acid. The CAI is defined as the geometric mean of these relative adaptiveness values. Non-synonymous codons and termination codons (dependent on genetic code) are excluded. CAI values range from 0 to 1, with higher values indicating a higher proportion of the most abundant codons (see Sharp and Li, 1987, Nucleic Acids Research 15: 1281-1295; also see: Jansen et al., 2003, Nucleic Acids Res. 31(8):2242-51). An adapted nucleotide sequence preferably has a CAI of at least 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 or 0.9. Most preferred are the sequences which have been codon optimised for expression in S. cerevisiae cells, as listed in SEQ ID NO's: 21 - 34, of which SEQ ID NO's: 27, 28 and 30 are preferred, and SEQ ID NO: 28 is most preferred.
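The definition of the CAI given above can be illustrated with the following minimal Python sketch. The codon counts are invented toy values covering only two amino acids and do not represent the codon usage of any real organism; the function names are hypothetical. Codons that are absent from the weight table (here the start codon, the stop codon and all codons not in the toy table) are simply skipped, which is how the sketch excludes non-synonymous and termination codons.

```python
import math
from collections import defaultdict


def relative_adaptiveness(codon_counts, codon_to_aa):
    """w for each codon: its count divided by the count of the most
    abundant codon encoding the same amino acid in the reference set."""
    max_per_aa = defaultdict(int)
    for codon, count in codon_counts.items():
        aa = codon_to_aa[codon]
        max_per_aa[aa] = max(max_per_aa[aa], count)
    return {
        codon: count / max_per_aa[codon_to_aa[codon]]
        for codon, count in codon_counts.items()
        if max_per_aa[codon_to_aa[codon]] > 0
    }


def codon_adaptation_index(coding_sequence, weights):
    """Geometric mean of w over all codons that have a non-zero weight."""
    usable = len(coding_sequence) - len(coding_sequence) % 3
    codons = [coding_sequence[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    if not logs:
        raise ValueError("no codons with a defined weight")
    return math.exp(sum(logs) / len(logs))


# Hypothetical toy reference set: counts in highly expressed genes.
toy_codon_to_aa = {"GCT": "A", "GCC": "A", "AAA": "K", "AAG": "K"}
toy_counts = {"GCT": 80, "GCC": 20, "AAA": 30, "AAG": 60}
toy_weights = relative_adaptiveness(toy_counts, toy_codon_to_aa)

print(codon_adaptation_index("ATGGCTAAGTAA", toy_weights))  # preferred codons only
print(codon_adaptation_index("GCCAAA", toy_weights))        # rarer codons only
```

A sequence built only from the most abundant codons scores at the top of the range, whereas one built from rarer codons scores lower, consistent with the range of 0 to 1 stated above.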

The host cell to be transformed with a nucleic acid construct comprising a nucleotide sequence encoding a xylose isomerase of the invention preferably is a eukaryotic microbial host, more preferably a fungal host cell, such as a yeast or filamentous fungal host cell. Preferably the host cell is a cultured cell. The host cell of the invention preferably is a host capable of active or passive pentose (xylose and preferably also arabinose) transport into the cell. The host cell preferably contains active glycolysis. The host cell further preferably contains an endogenous pentose phosphate pathway and may contain endogenous xylulose kinase activity so that xylulose isomerised from xylose may be metabolised to pyruvate. The host further preferably contains enzymes for conversion of a pentose (preferably through pyruvate) to a desired fermentation product such as ethanol, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, amino acids, 1,3-propane-diol, ethylene, glycerol, β-lactam antibiotics and cephalosporins. A particularly preferred host cell is a host cell that is naturally capable of alcoholic fermentation, preferably, anaerobic alcoholic fermentation. The host cell further preferably has a high tolerance to ethanol, a high tolerance to low pH (i.e. capable of growth at a pH lower than 5, 4, or 3) and towards organic acids like lactic acid, acetic acid or formic acid and sugar degradation products such as furfural and hydroxy-methylfurfural, and a high tolerance to elevated temperatures. Any of these characteristics or activities of the host cell may be naturally present in the host cell or may be introduced or modified by genetic modification, preferably by self-cloning or by the methods of the invention described below. A suitable cell is a cultured cell, a cell that may be cultured in a fermentation process, e.g. in submerged or solid state fermentation. Particularly suitable cells are eukaryotic microorganisms such as fungi; most suitable for use in the present invention, however, are yeasts or filamentous fungi.

Yeasts are herein defined as eukaryotic microorganisms and include all species of the subdivision Eumycotina (Yeasts: characteristics and identification, J.A. Barnett, R.W. Payne, D. Yarrow, 2000, 3rd ed., Cambridge University Press, Cambridge UK; and, The yeasts, a taxonomic study, C.P. Kurtzman and J.W. Fell (eds) 1998, 4th ed., Elsevier Science Publ. B.V., Amsterdam, The Netherlands) that predominantly grow in unicellular form. Yeasts may either grow by budding of a unicellular thallus or may grow by fission of the organism. Preferred yeasts as host cells belong to the genera Saccharomyces, Kluyveromyces, Candida, Pichia, Schizosaccharomyces, Hansenula, Kloeckera, Schwanniomyces, Yarrowia, Kazachstania and Naumovia. Preferred yeast species as host cells include S. cerevisiae, S. exiguus, S. bayanus, K. lactis, K. marxianus and Schizosaccharomyces pombe.

Preferably the yeast cell of the invention is a yeast cell that is naturally capable of anoxic fermentation, more preferably alcoholic fermentation and most preferably anoxic alcoholic fermentation. Over the years suggestions have been made for the introduction of various organisms for the production of bio-ethanol from crop sugars. In practice, however, all major bio-ethanol production processes have continued to use yeasts of the genus Saccharomyces as ethanol producer. This is due to the many attractive features of Saccharomyces species for industrial processes, i.e., a high acid-, ethanol- and osmo-tolerance, capability of anaerobic growth, and of course their high alcoholic fermentative capacity. Most preferably therefore a yeast host cell of the invention belongs to a species selected from the group consisting of Saccharomyces cerevisiae, S. bayanus, S. bulderi, S. servazzii, S. cariocanus, S. castellii, S. dairenensis, S. exiguus, S. kluyveri, S. kudriavzevii, S. mikatae, S. paradoxus, S. pastorianus, S. turicensis and S. unisporus (Kurtzman, 2003, supra; and J.A. Barnett, R.W. Payne, D. Yarrow, 2000, supra). Preferably the yeast cell of the invention is an industrial yeast strain or a yeast strain derived from an industrial yeast strain. Industrial yeast strains are often diploid, polyploid or aneuploid and have proven capabilities for application in large scale industrial fermentation. Suitable industrial yeast strains include e.g. the commercial strains Gert Strand Turbo yeasts, Alltech SuperStart™, Fermiol Super HA™, Thermosacc™ and Ethanol Red™. Also suitable are yeast cells derived from any of these strains by modifications as described herein.

Filamentous fungi are herein defined as eukaryotic microorganisms that include all filamentous forms of the subdivision Eumycotina. These fungi are characterized by a vegetative mycelium composed of chitin, cellulose, and other complex polysaccharides. The filamentous fungi of the present invention are morphologically, physiologically, and genetically distinct from yeasts. Vegetative growth by filamentous fungi is by hyphal elongation and carbon catabolism of most filamentous fungi is obligately aerobic. Preferred filamentous fungi as host cells belong to the genera Aspergillus, Trichoderma, Humicola, Acremonium, Fusarium, and Penicillium.

In a host cell of the invention, the nucleotide sequence encoding the xylose isomerase as defined above is preferably operably linked to a promoter that causes sufficient expression of the nucleotide sequence in the cell to confer to the cell the ability to convert xylose into xylulose. More preferably the promoter causes sufficient expression of the nucleotide sequence to confer to the cell the ability to grow on xylose as sole carbon and/or energy source, most preferably under anaerobic conditions. Suitable promoters for expression of the nucleotide sequence as defined above include promoters that are insensitive to catabolite (glucose) repression and/or that do not require xylose for induction. Promoters having these characteristics are widely available and known to the skilled person. Suitable examples of such promoters include e.g. promoters from glycolytic genes such as the phosphofructokinase (PFK), triose phosphate isomerase (TPI), glyceraldehyde-3-phosphate dehydrogenase (GPD, TDH3 or GAPDH), pyruvate kinase (PYK), phosphoglycerate kinase (PGK) and glucose-6-phosphate isomerase (PGI1) promoters from yeasts or filamentous fungi; more details about such promoters from yeast may be found in WO 93/03159. Other useful promoters are ribosomal protein encoding gene promoters, the lactase gene promoter (LAC4), alcohol dehydrogenase promoters (ADH1, ADH4, and the like), the enolase promoter (ENO), the hexose (glucose) transporter promoter (HXT7), and the cytochrome c1 promoter (CYC1). Other promoters, both constitutive and inducible, and enhancers or upstream activating sequences will be known to those of skill in the art. Preferably the promoter that is operably linked to the nucleotide sequence as defined above is homologous to the host cell.

In a host cell of the invention, the nucleotide sequence encoding the xylose isomerase as defined above is preferably expressed from an expression construct wherein the coding sequence is operably linked to a promoter as defined above. An expression construct in a host cell of the invention may be present on a plasmid, preferably a multicopy plasmid. However, more preferably the expression construct is integrated into the genome of the host cell. Preferably, the host cell comprises multiple copies of the expression construct integrated into its genome. More preferably, the multiple copies (e.g. 2, 3, 4, 5, 6, 8, 10 or more copies) of the expression construct are integrated into more than one, e.g. at least two, different genomic or chromosomal locations in the host cell's genome. A preferred chromosomal location for integration of an expression construct into the genome of a host cell of the invention is an intergenic region, e.g. the intergenic region downstream of TYE7 and upstream of the tRNA gene tP(UGG)03 in chromosome XV. In one embodiment, the host cell is a diploid, polyploid or aneuploid host cell. Preferably in the diploid, polyploid or aneuploid host cell, the expression construct is present at a chromosomal location that is present in at least two copies in the cell's genome. Optionally more than one tandem copy, e.g. two copies, of the expression construct is integrated in a genomic or chromosomal location.

In one embodiment a host cell of the invention comprises more than one different type of nucleotide sequence, e.g. encoding at least two different xylose isomerases as defined above, or e.g. encoding a xylose isomerase as defined above in combination with any other xylose isomerase, e.g. a xylose isomerase already known in the art.

The host cell of the invention further preferably comprises xylulose kinase activity so that xylulose isomerised from xylose may be metabolised to pyruvate. Preferably, the cell contains endogenous xylulose kinase activity. More preferably, a cell of the invention comprises a genetic modification that increases the specific xylulose kinase activity. Preferably the genetic modification causes overexpression of a xylulose kinase, e.g. by overexpression of a nucleotide sequence encoding a xylulose kinase. The gene encoding the xylulose kinase may be endogenous to the cell or may be a xylulose kinase that is heterologous to the cell. A nucleotide sequence that may be used for overexpression of xylulose kinase in the cells of the invention is e.g. the xylulose kinase gene from S. cerevisiae (XKS1) as described by Deng and Ho (1990, Appl. Biochem. Biotechnol. 24-25: 193-199). Another preferred xylulose kinase is a xylulose kinase that is related to the xylulose kinase from Piromyces (xylB; see WO 03/062430). This Piromyces xylulose kinase is actually more related to prokaryotic kinases than to all of the known eukaryotic kinases such as the yeast kinase. The eukaryotic xylulose kinases have been indicated as non-specific sugar kinases, which have a broad substrate range that includes xylulose. In contrast, the prokaryotic xylulose kinases, to which the Piromyces kinase is most closely related, have been indicated to be more specific kinases for xylulose, i.e. having a narrower substrate range. In the cells of the invention, a xylulose kinase to be overexpressed is overexpressed by at least a factor of 1.1, 1.2, 1.5, 2, 5, 10 or 20 as compared to a strain which is genetically identical except for the genetic modification causing the overexpression. It is to be understood that these levels of overexpression may apply to the steady state level of the enzyme's activity, the steady state level of the enzyme's protein as well as to the steady state level of the transcript coding for the enzyme.

A cell of the invention further preferably comprises a genetic modification that increases the flux of the pentose phosphate pathway as described in WO 06/009434. In particular, the genetic modification causes an increased flux of the non-oxidative part of the pentose phosphate pathway. A genetic modification that causes an increased flux of the non-oxidative part of the pentose phosphate pathway is herein understood to mean a modification that increases the flux by at least a factor of 1.1, 1.2, 1.5, 2, 5, 10 or 20 as compared to the flux in a strain which is genetically identical except for the genetic modification causing the increased flux. The flux of the non-oxidative part of the pentose phosphate pathway may be measured as described in WO 06/009434.

Genetic modifications that increase the flux of the pentose phosphate pathway may be introduced in the cells of the invention in various ways. These include e.g. achieving higher steady state activity levels of xylulose kinase and/or one or more of the enzymes of the non-oxidative part of the pentose phosphate pathway and/or a reduced steady state level of unspecific aldose reductase activity. These changes in steady state activity levels may be effected by selection of mutants (spontaneous or induced by chemicals or radiation) and/or by recombinant DNA technology e.g. by overexpression or inactivation, respectively, of genes encoding the enzymes or factors regulating these genes.

In a preferred cell of the invention, the genetic modification comprises overexpression of at least one enzyme of the (non-oxidative part of the) pentose phosphate pathway. Preferably the enzyme is selected from the group consisting of ribulose-5-phosphate isomerase, ribulose-5-phosphate 3-epimerase, transketolase and transaldolase. Various combinations of enzymes of the (non-oxidative part of the) pentose phosphate pathway may be overexpressed. In one embodiment of the invention each of the enzymes ribulose-5-phosphate isomerase, ribulose-5-phosphate 3-epimerase, transketolase and transaldolase is overexpressed in the cell of the invention.

There are various means available in the art for overexpression of enzymes in the cells of the invention. In particular, an enzyme may be overexpressed by increasing the copy number of the gene coding for the enzyme in the cell, e.g. by integrating additional copies of the gene in the cell's genome, by expressing the gene from an episomal multicopy expression vector or by introducing an episomal expression vector that comprises multiple copies of the gene. The coding sequence used for overexpression of the enzymes preferably is homologous to the host cell of the invention. However, coding sequences that are heterologous to the host cell of the invention may likewise be applied. Alternatively overexpression of enzymes in the cells of the invention may be achieved by using a promoter that is not native to the sequence coding for the enzyme to be overexpressed, i.e. a promoter that is heterologous to the coding sequence to which it is operably linked. Although the promoter preferably is heterologous to the coding sequence to which it is operably linked, it is also preferred that the promoter is homologous, i.e. endogenous, to the cell of the invention. Preferably the heterologous promoter is capable of producing a higher steady state level of the transcript comprising the coding sequence (or is capable of producing more transcript molecules, i.e. mRNA molecules, per unit of time) than is the promoter that is native to the coding sequence, preferably under conditions where xylose or xylose and glucose are available as carbon sources, more preferably as major carbon sources (i.e. more than 50% of the available carbon source consists of xylose or xylose and glucose), most preferably as sole carbon sources. Suitable promoters in this context include promoters as described above for expression of the nucleotide sequences encoding xylose isomerases as defined above.

A further preferred cell of the invention comprises a genetic modification that reduces unspecific aldose reductase activity in the cell. Preferably, unspecific aldose reductase activity is reduced in the host cell by one or more genetic modifications that reduce the expression of or inactivate a gene encoding an unspecific aldose reductase. Preferably, the genetic modifications reduce or inactivate the expression of each endogenous copy of a gene encoding an unspecific aldose reductase that is capable of reducing an aldopentose, including xylose, xylulose and arabinose, in the cell's genome. A given cell may comprise multiple copies of genes encoding unspecific aldose reductases as a result of di-, poly- or aneuploidy, and/or a cell may contain several different (iso)enzymes with aldose reductase activity that differ in amino acid sequence and that are each encoded by a different gene. Also in such instances preferably the expression of each gene that encodes an unspecific aldose reductase is reduced or inactivated. Preferably, the gene is inactivated by deletion of at least part of the gene or by disruption of the gene, whereby in this context the term gene also includes any non-coding sequence up- or downstream of the coding sequence, the (partial) deletion or inactivation of which results in a reduction of expression of unspecific aldose reductase activity in the host cell. Nucleotide sequences encoding aldose reductases whose activity is to be reduced in the cell of the invention and amino acid sequences of such aldose reductases are described in WO 06/009434 and include e.g. the (unspecific) aldose reductase gene GRE3 of S. cerevisiae (Traff et al., 2001, Appl. Environm. Microbiol. 67: 5668-5674) and orthologues thereof in other species.

A host cell of the invention further preferably comprises at least one genetic modification that results in a characteristic selected from the group consisting of: a) increased tolerance to ethanol; b) increased tolerance to acetic acid; c) reduced production of glycerol; d) increased xylose to ethanol fermentation rate; and e) increased thermotolerance.

The genetic modification that results in increased tolerance to ethanol preferably is a modification as e.g. described in WO 2012/175552 and WO 2014/170330, such as e.g. a modification that introduces alleles of one or more of the ADE1, KIN3, MKT1 and VPS70 genes that confer increased tolerance to ethanol, and/or a modification that overexpresses a wild type SWS2 gene and/or that inactivates the APJ1 gene, which also confers increased tolerance to ethanol.

The genetic modification that results in increased tolerance to acetic acid preferably is a modification as e.g. described in WO 2015/181169 and WO 2016/083397, such as e.g. a modification that introduces an allele of one or more of the GLO1, DOT5, CUP2 and HAA1 genes that confers increased tolerance to acetic acid.

The genetic modification that results in reduced production of glycerol preferably is a modification as e.g. described in WO 2014/048863, such as e.g. a modification that introduces a mutant SSK1 gene encoding a truncated Ssk1 protein.

The genetic modification that results in increased xylose to ethanol fermentation rate preferably is a modification as e.g. described in WO 2015/086805, such as e.g. a modification that introduces an allele of the NNK1 gene that confers an increased xylose to ethanol fermentation rate.

The genetic modification that results in increased thermotolerance preferably is a modification as e.g. described in WO 2014/090930, such as e.g. a modification that introduces overexpression of at least one of a gene encoding the Prp42 protein and a gene encoding the Smd2 protein.

A preferred host cell of the invention is a host cell that is improved in at least one industrially relevant phenotype by evolutionary engineering. Evolutionary engineering is a process wherein industrially relevant phenotypes of a microorganism, herein the yeast, can be coupled to the specific growth rate and/or the affinity for a nutrient, by a process of rationally set-up natural selection. Evolutionary engineering is e.g. described in detail in Cakar et al. (2011, FEMS Yeast Research 12: 171-182). Preferably, the D-xylose utilization rate of the host cell is improved by evolutionary engineering. Improvement of the D-xylose utilization rate of yeast host cells by evolutionary engineering is described in detail by Demeke et al. (12, 15 and 16).

In a preferred host cell according to the invention, the nucleic acid construct confers to the host cell the ability to grow on xylose as carbon/energy source, preferably as sole carbon/energy source, and preferably under anaerobic conditions, i.e. conditions as defined herein below for an anaerobic fermentation process. Preferably, when grown on xylose as carbon/energy source the transformed host produces essentially no xylitol, e.g. the xylitol produced is below the detection limit or e.g. less than 5, 2, 1, 0.5, or 0.3% of the carbon consumed on a molar basis.

A host cell of the invention preferably has the ability to grow on xylose as sole carbon/energy source at a rate of at least 0.01, 0.02, 0.05, 0.1, 0.2, 0.25 or 0.3 h⁻¹ under aerobic conditions, or, more preferably, at a rate of at least 0.005, 0.01, 0.02, 0.05, 0.08, 0.1, 0.12, 0.15 or 0.2 h⁻¹ under anaerobic conditions. A cell of the invention preferably has the ability to grow on a mixture of glucose and xylose (in a 1:1 weight ratio) as sole carbon/energy source at a rate of at least 0.01, 0.02, 0.05, 0.1, 0.2, 0.25 or 0.3 h⁻¹ under aerobic conditions, or, more preferably, at a rate of at least 0.005, 0.01, 0.02, 0.05, 0.08, 0.1, 0.12, 0.15 or 0.2 h⁻¹ under anaerobic conditions. Thus, in a preferred host cell according to the invention, the nucleic acid construct confers to the host cell the ability to anaerobically ferment xylose as sole carbon source in a process wherein ultimately pyruvate is used as an electron (and hydrogen) acceptor and is reduced to fermentation products such as ethanol, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, amino acids, 1,3-propane-diol, ethylene, glycerol, butyric acid, caproate, butanol, glyoxylate, muconic acid, fatty alcohols, fatty acids, β-lactam antibiotics and cephalosporins.
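By way of illustration only, the growth rates referred to above (in h⁻¹) may be estimated from two biomass readings taken during exponential growth. The following is a minimal sketch under that assumption; the function names and the example readings are hypothetical and do not form part of the specification.

```python
import math


def specific_growth_rate(x_start, x_end, hours):
    """Estimate the specific growth rate mu (h^-1).

    Assumes purely exponential growth between the two readings, so that
    x_end = x_start * exp(mu * hours). Readings may be dry weight (g/L)
    or optical density, as long as both use the same unit.
    """
    if x_start <= 0 or x_end <= 0 or hours <= 0:
        raise ValueError("readings and time interval must be positive")
    return math.log(x_end / x_start) / hours


def doubling_time(mu):
    """Doubling time (h) corresponding to a specific growth rate mu (h^-1)."""
    if mu <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu


# Hypothetical readings: biomass rises from 0.10 to 0.40 g DW/L in 12 h.
mu = specific_growth_rate(0.10, 0.40, 12.0)
print(f"mu = {mu:.3f} h^-1, doubling time = {doubling_time(mu):.1f} h")
```

A rate estimated in this way can be compared directly with the thresholds listed above, provided the readings are taken under the stated aerobic or anaerobic conditions.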

Preferably, a cell of the invention has a specific xylose consumption rate of at least 200, 300, 400, 600, 700, 800, 900 or 1000 mg h⁻¹ (g dry weight)⁻¹. Preferably, a cell of the invention has a yield of fermentation product (such as ethanol) on xylose that is at least 20, 40, 50, 60, 80, 90, 95 or 98% of the cell's yield of fermentation product (such as ethanol) on glucose. More preferably, the modified host cell's yield of fermentation product (such as ethanol) on xylose is equal to the host cell's yield of fermentation product (such as ethanol) on glucose. Likewise, the modified host cell's biomass yield on xylose is preferably at least 55, 60, 70, 80, 85, 90, 95 or 98% of the host cell's biomass yield on glucose. More preferably, the modified host cell's biomass yield on xylose is equal to the host cell's biomass yield on glucose. It is understood that in the comparison of yields on glucose and xylose both yields are compared under aerobic conditions or both under anaerobic conditions.
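As a non-limiting illustration, a specific xylose consumption rate in mg h⁻¹ (g dry weight)⁻¹ and a yield on xylose relative to the yield on glucose may be computed as sketched below. The sketch assumes that the biomass concentration is approximately constant over the measurement interval; the function names and example figures are hypothetical.

```python
def specific_xylose_consumption_rate(xylose_start_g_l, xylose_end_g_l,
                                     hours, biomass_g_dw_l):
    """Return the consumption rate in mg xylose per hour per g dry weight.

    Assumes the biomass concentration (g DW/L) stays roughly constant
    during the interval; for growing cultures an average would be needed.
    """
    if hours <= 0 or biomass_g_dw_l <= 0:
        raise ValueError("time interval and biomass must be positive")
    consumed_mg_per_l = (xylose_start_g_l - xylose_end_g_l) * 1000.0
    return consumed_mg_per_l / hours / biomass_g_dw_l


def relative_yield_percent(yield_on_xylose, yield_on_glucose):
    """Yield on xylose as a percentage of the yield on glucose.

    Both yields must be in the same unit and measured under the same
    (aerobic or anaerobic) conditions.
    """
    if yield_on_glucose <= 0:
        raise ValueError("yield on glucose must be positive")
    return 100.0 * yield_on_xylose / yield_on_glucose


# Hypothetical run: xylose drops from 40 to 28 g/L in 12 h at 1 g DW/L.
rate = specific_xylose_consumption_rate(40.0, 28.0, 12.0, 1.0)
# Hypothetical yields: 0.40 g/g on xylose versus 0.45 g/g on glucose.
rel = relative_yield_percent(0.40, 0.45)
print(f"{rate:.0f} mg/h/g DW, relative yield {rel:.1f}%")
```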

In another aspect the invention relates to a process for producing a fermentation product selected from the group consisting of ethanol, lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, amino acids, 1,3-propane-diol, ethylene, glycerol, butyric acid, caproate, butanol, glyoxylate, muconic acid, fatty alcohols, fatty acids, β-lactam antibiotics and cephalosporins. The process preferably comprises the steps of: a) fermenting a medium containing a source of xylose with a cell as defined hereinabove, whereby the cell ferments xylose to the fermentation product, and optionally, b) recovery of the fermentation product.

In addition to a source of xylose the carbon source in the fermentation medium may also comprise a source of glucose. The skilled person will further appreciate that the fermentation medium may also comprise other types of carbohydrates such as, in particular, a source of arabinose. The sources of xylose and glucose may be xylose and glucose as such (i.e. as monomeric sugars) or they may be in the form of any carbohydrate oligo- or polymer comprising xylose and/or glucose units, such as e.g. lignocellulose, xylans, cellulose, starch and the like. For release of xylose and/or glucose units from such carbohydrates, appropriate carbohydrases (such as xylanases, glucanases, amylases, cellulases and the like) may be added to the fermentation medium or may be produced by the modified host cell. In the latter case the modified host cell may be genetically engineered to produce and excrete such carbohydrases. An additional advantage of using oligo- or polymeric sources of glucose is that it makes it possible to maintain a low(er) concentration of free glucose during the fermentation, e.g. by using rate-limiting amounts of the carbohydrases, preferably during the fermentation. This, in turn, will prevent repression of systems required for metabolism and transport of non-glucose sugars such as xylose. In a preferred process the modified host cell ferments both the xylose and glucose, preferably simultaneously, in which case preferably a modified host cell is used which is insensitive to glucose repression to prevent diauxic growth. In addition to a source of xylose (and glucose) as carbon source, the fermentation medium will further comprise the appropriate ingredients required for growth of the modified host cell. Compositions of fermentation media for growth of eukaryotic microorganisms such as yeasts and filamentous fungi are well known in the art.

The fermentation process may be an aerobic or an anaerobic fermentation process. An anaerobic fermentation process is herein defined as a fermentation process run in the absence of oxygen or in which substantially no oxygen is consumed, preferably less than 5, 2.5 or 1 mmol/L/h, more preferably 0 mmol/L/h is consumed (i.e. oxygen consumption is not detectable), and wherein organic molecules serve as both electron donor and electron acceptors. In the absence of oxygen, NADH produced in glycolysis and biomass formation cannot be oxidised by oxidative phosphorylation. To solve this problem many microorganisms use pyruvate or one of its derivatives as an electron and hydrogen acceptor, thereby regenerating NAD+. Thus, in a preferred anaerobic fermentation process pyruvate is used as an electron (and hydrogen) acceptor and is reduced to fermentation products such as ethanol, as well as non-ethanol fermentation products such as lactic acid, 3-hydroxy-propionic acid, acrylic acid, acetic acid, succinic acid, citric acid, amino acids, 1,3-propane-diol, ethylene, glycerol, butyric acid, caproate, butanol, glyoxylate, muconic acid, fatty alcohols, fatty acids, β-lactam antibiotics and cephalosporins. Anaerobic processes of the invention are preferred over aerobic processes because anaerobic processes do not require investments and energy for aeration and, in addition, anaerobic processes produce higher product yields than aerobic processes. Alternatively, the fermentation process of the invention may be run under aerobic oxygen-limited conditions. Preferably, in an aerobic process under oxygen-limited conditions, the rate of oxygen consumption is at least 5.5, more preferably at least 6 and even more preferably at least 7 mmol/L/h.

The fermentation process is preferably run at a temperature that is optimal for the modified cells of the invention. Thus, for most yeasts or fungal cells, the fermentation process is performed at a temperature which is less than 42°C, preferably less than 38°C. For yeast or filamentous fungal cells, the fermentation process is preferably performed at a temperature which is lower than 35, 33, 30 or 28°C and at a temperature which is higher than 20, 22, or 25°C. For some species, such as Kluyveromyces marxianus, and engineered Saccharomyces cerevisiae strains, the fermentation process may be run at considerably higher temperatures, i.e. at 42°C, 43°C, or preferably between 45 and 50°C, or in rare cases between 50 and 55°C.

Preferably in the fermentation processes of the invention, the cells stably maintain the nucleic acid constructs that confer to the cell the ability of isomerising xylose into xylulose, and optionally converting arabinose into D-xylulose 5-phosphate. Preferably in the process at least 10, 20, 50 or 75% of the cells retain the abilities of isomerising xylose into xylulose, and optionally converting arabinose into D-xylulose 5-phosphate after 50 generations of growth, preferably under industrial fermentation conditions.
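Purely as an illustrative aid, the retention figures above may be related to a loss rate per generation. The sketch below assumes, as a simplification that is not stated in the specification, that a constant fraction of cells loses the construct in each generation and that cells with and without the construct grow equally fast; the function name is hypothetical.

```python
def per_generation_loss(retained_fraction, generations=50):
    """Constant per-generation loss rate implied by a retained fraction.

    Model assumption: retained_fraction = (1 - loss) ** generations,
    i.e. the same fraction of construct-bearing cells loses the
    construct in every generation.
    """
    if not 0.0 < retained_fraction <= 1.0:
        raise ValueError("retained fraction must be in (0, 1]")
    if generations <= 0:
        raise ValueError("number of generations must be positive")
    return 1.0 - retained_fraction ** (1.0 / generations)


# Retention levels named in the text: 10, 20, 50 or 75% after 50 generations.
for retained in (0.10, 0.20, 0.50, 0.75):
    loss = per_generation_loss(retained, 50)
    print(f"{retained:.0%} retained -> {loss:.4%} lost per generation")
```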

A preferred fermentation process according to the invention is a process for the production of ethanol, whereby the process comprises the steps of: a) fermenting a medium containing a source of xylose with a cell as defined hereinabove, whereby the cell ferments xylose, and optionally, b) recovery of the ethanol. The fermentation process may further be performed as described above. In the process the volumetric ethanol productivity is preferably at least 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 5.0 or 10.0 g ethanol per litre per hour. The ethanol yield on xylose and/or glucose in the process preferably is at least 50, 60, 70, 80, 90, 95 or 98%. The ethanol yield is herein defined as a percentage of the theoretical maximum yield, which, for xylose and glucose, is 0.51 g ethanol per g xylose or glucose.
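For illustration, the ethanol yield as a percentage of the theoretical maximum and the volumetric productivity may be calculated as sketched below. The value 0.51 g/g is taken from the definition above; the stoichiometric check (3 xylose to 5 ethanol + 5 CO2, 1 glucose to 2 ethanol + 2 CO2) uses standard molar masses, and the function names and example figures are hypothetical.

```python
THEORETICAL_MAX_G_PER_G = 0.51  # g ethanol per g xylose or glucose (from the text)

# Standard molar masses (g/mol), used only to check the 0.51 g/g figure.
MW_ETHANOL = 46.07
MW_XYLOSE = 150.13
MW_GLUCOSE = 180.16

# 3 C5H10O5 -> 5 C2H5OH + 5 CO2 and C6H12O6 -> 2 C2H5OH + 2 CO2
max_on_xylose = 5 * MW_ETHANOL / (3 * MW_XYLOSE)
max_on_glucose = 2 * MW_ETHANOL / MW_GLUCOSE


def ethanol_yield_percent(ethanol_g, sugar_consumed_g):
    """Ethanol yield as a percentage of the theoretical maximum yield."""
    if sugar_consumed_g <= 0:
        raise ValueError("consumed sugar must be positive")
    return 100.0 * (ethanol_g / sugar_consumed_g) / THEORETICAL_MAX_G_PER_G


def volumetric_productivity(ethanol_g_per_l, hours):
    """Volumetric ethanol productivity in g ethanol per litre per hour."""
    if hours <= 0:
        raise ValueError("time must be positive")
    return ethanol_g_per_l / hours


# Hypothetical run: 18 g/L ethanol from 40 g/L xylose consumed in 12 h.
pct = ethanol_yield_percent(18.0, 40.0)
prod = volumetric_productivity(18.0, 12.0)
print(f"yield = {pct:.1f}% of theoretical, productivity = {prod:.2f} g/L/h")
```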

In this document and in its claims, the verb "to comprise" and its conjugations are used in their non-limiting sense to mean that items following the word are included, but items not specifically mentioned are not excluded. In addition, reference to an element by the indefinite article "a" or "an" does not exclude the possibility that more than one of the element is present, unless the context clearly requires that there be one and only one of the elements. The indefinite article "a" or "an" thus usually means "at least one".

All patent and literature references cited in the present specification are hereby incorporated by reference in their entirety.

The following examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way.

Short description of drawings

Figure 1. Xylose fermentation performance of the M315_CpXIΔ/CpXIΔ strain expressing a plasmid containing one of the first 7 XylA genes. The code indicating the bacterial origin of XylA genes is explained in Table 1. The fermentation was performed in duplicate using two independent transformants at a starting cell density of 1 g DW/L in 50 mL YP medium containing 4% xylose at 35°C. The average value is shown in the graph. The CO2 production was estimated by measuring the weight loss during the fermentation.

Figure 2. Xylose fermentation performance of the M315_CpXIΔ/CpXIΔ strain expressing a plasmid with one of the 11 XylA genes. The code indicating the bacterial origin of XylA genes is explained in Table 1. The fermentation was performed using 1 g DW/L initial cell density in 45 mL YP medium containing 4% xylose at 35°C. The CO2 production was estimated by measuring the weight loss during fermentation.

Figure 3. Integration of XylA genes in the genome.

A) Method of integration using CRISPR/Cas9 methodology in chromosome XV between TYE7 and tP(UGG)03. Arrows indicated by g1 and g2 are gRNA sites where Cas9 makes a double strand break in the chromosomes, guided by two gRNA target sequences carried on a single guide RNA plasmid. A plasmid-based donor DNA (pDonor) carried two XylA sequences, XI1 and XI2, flanked by sequences H1 and H2 that are homologous to the site of integration to support homologous recombination.

B) Gel electrophoresis picture of the PCR performed for checking proper insertion of the XylA genes in the genome using two primers flanking the homologous sequences H1 and H2 [shown as prFw(GY94) and prRv(GY95) at the bottom of panel A]. Insertion of a single XylA copy in both alleles of the chromosome produced a PCR product of about 3 kb (e.g. Lane CpXI 1, 2 and 5), while insertion of two copies in both alleles resulted in a 5 kb PCR product (e.g. Lane CpXI 3 and 6). Absence of a XylA insertion is expected to produce a PCR band of about 1.6 kb, which is the size of the PCR band obtained for the control strain T18.

Figure 4. Xylose fermentation performance of the GSE16-T18CpXIΔ/CpXIΔ-based strains after genomic integration of different XylA genes. The code indicating the bacterial origin of XylA genes is explained in Table 1. For each XylA gene, two strains carrying either 2 copies or 4 copies were selected. The fermentation was performed using 1 g DW/L initial cell density in 50 mL YP medium containing 4% xylose at 35°C. CO2 production was estimated by measuring the weight loss during fermentation.

Figure 5. A) Xylose fermentation performance of the MDS130-based strains with genomic integration of different XylA genes as indicated. The code indicating the bacterial origin of XylA genes is explained in Table 1. B) Direct comparison of xylose fermentation performance of the MDS130 strain with the MDC5 strain.

Examples

Example 1

Introduction

In spite of the vast sequence information on xylose isomerases in public sequence databases, only a few have been functionally expressed in yeast. One bottleneck might be due to differences in protein synthesis regulation mechanisms between prokaryotes and eukaryotes. The synthesis of bacterial proteins in yeast might not be properly regulated, which could be the reason for the frequent occurrence of inactive or insoluble proteins. Studies showed that proper expression of a gene does not necessarily correlate with proper enzymatic activity (13). In most XylA expressing strains, high enzymatic activity of XI and proper xylose fermentation capacity were observed only after further evolutionary adaptation of the host yeast strain, indicating that there are other regulatory mechanisms required for the proper functioning of the XI enzymes and for their proper connection with the other enzymes of the yeast fermentation pathway (11, 12). Although the regulatory mechanisms are not well understood, certain genetic changes are required by the recombinant host for proper XI activity. This lack of a proper yeast strain as a host for functional expression in turn hinders the screening of potential XIs that might be active in an appropriate host but not in a regular yeast strain.

To overcome the bottleneck of lacking a proper host strain for screening of active bacterial XI genes, we have developed two yeast strains that are able to directly grow on and efficiently ferment xylose upon expression of a bacterial XI. These strains have the same industrial yeast strain background as the strain that successfully expressed the Clostridium phytofermentans XylA gene (CpXI) (12, 15, 16). The first strain, M315_CpXIΔ/CpXIΔ, has been developed by deletion of the two copies of CpXI from the genome of strain M315. This parent strain M315 has been developed by random mutagenesis of a recombinant industrial strain Ethanol Red, which carried two copies of CpXI and xylulokinase (XKS1) and all the genes of the pentose phosphate pathway (PPP) overexpressed in the chromosome. The second platform strain, GSE16-T18CpXIΔ/CpXIΔ, has been developed by deletion of all the CpXI gene copies from the chromosome of the industrial xylose fermenting strain GSE16-T18, which carried 16 to 18 copies of CpXI. GSE16-T18 had been developed from the M315 strain through a series of evolutionary adaptation rounds in synthetic medium and in lignocellulose hydrolysate. Deletion of all the CpXI copies from the strain completely abolished the xylose fermentation performance. Reintroduction of xylose isomerase into these platform strains restored efficient xylose fermentation capacity. Therefore, these two strains provided us with a useful tool for screening of potential XylA genes from different sources for functional expression in yeast. Using these platform strains, we were able to screen several bacterial XylA genes for rapid xylose fermentation capacity, which resulted in the identification of several genes expressing xylose isomerase with superior performance.

Materials and Methods

Construction of multi-copy plasmids carrying XylA genes

Each of the 14 bacterial XylA genes was synthesized in two blocks of about 700 bp with a 30 bp overlap with each other. The two gBlock gene fragments were joined by PCR using a pair of primers, each having a 30 bp tail sequence to create overlap with the 5' and 3' ends of the linearized vector p426tef1. The vector p426tef1 (Mumberg et al., 1995, Yeast vectors for controlled expression of heterologous protein in different genetic backgrounds. Gene, 156: 119-122) was linearized using the PstI and HindIII restriction enzymes between the tef1 promoter and Cyc1 terminator. The PCR fragment and the linearized vector were assembled using the Gibson assembly cloning kit (New England BioLabs, USA) and transformed into chemically competent E. coli strain Top10 (Invitrogen). The plasmids were subsequently isolated from E. coli using the NucleoSpin® Plasmid EasyPure kit (MACHEREY-NAGEL GMBH & CO. KG, Germany). The isolated plasmids were transformed into the host yeast strain using the standard LiAc/PEG method (18).
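The primer design described above (a 30 bp vector-homologous tail followed by a gene-specific annealing region) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names, the 20 nt annealing length and all sequences are hypothetical placeholders and are not taken from the application.

```python
# Sketch: build overlap primers whose 5' tails match the ends of a linearized vector.
# All sequences below are made-up placeholders, not the actual vector or XylA sequences.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overlap_primers(insert: str, upstream_flank: str, downstream_flank: str,
                    tail: int = 30, anneal: int = 20) -> tuple:
    """Design forward and reverse primers (both written 5'->3').

    upstream_flank:   vector sequence ending at the cut site (promoter side)
    downstream_flank: vector sequence starting at the cut site (terminator side)
    tail:             length of the vector-homologous tail (30 bp in the text)
    anneal:           length of the insert-specific part (assumed value)
    """
    if len(upstream_flank) < tail or len(downstream_flank) < tail:
        raise ValueError("vector flanks must be at least as long as the tail")
    if len(insert) < anneal:
        raise ValueError("insert is shorter than the annealing region")
    insert = insert.upper()
    forward = upstream_flank[-tail:].upper() + insert[:anneal]
    # The reverse primer is the reverse complement of (insert end + vector start),
    # so its 5' tail matches the vector and its 3' end anneals to the insert.
    reverse = reverse_complement(insert[-anneal:] + downstream_flank[:tail])
    return forward, reverse


# Placeholder example
example_insert = "ATG" + "GCT" * 10 + "TAA"
fwd, rev = overlap_primers(example_insert, "A" * 10 + "C" * 30, "G" * 30 + "T" * 10)
print(fwd)
print(rev)
```

In practice the annealing length would be chosen from the melting temperature of each gene end rather than fixed at 20 nt.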

Deletion of CpXI from GSE16-T18

The strain GSE16-T18 carried between 16 and 18 copies of the CpXI gene that was originally inserted in two copies by replacing part of the PYK2 gene in chromosome XV. The CpXI gene was amplified in the chromosomal locus into multiple tandem repeats during an evolutionary engineering step (15).

The multiple copies of the XylA gene were deleted using a CRISPR/Cas9-based methodology. First, a single gRNA plasmid was constructed carrying two gRNA target sequences, one from either end of the amplified XylA genes, and a hygromycin resistance gene, hph. Next, two donor DNA fragments were made by PCR amplification of two selection marker genes, the kanamycin resistance gene kan and the nourseothricin resistance marker nat. Each marker gene was flanked by sequences homologous to the upstream and downstream gRNA target sequences in the genome. After that, the GSE16-T18 strain was transformed with a Cas9 plasmid carrying a ble selection marker. The resulting GSE16-T18-Cas9 strain, expressing Cas9, was subsequently transformed with the gRNA plasmid and the two donor DNA fragments. Transformants were selected only for the hph marker of the gRNA plasmid. Positive transformants expressing the hph resistance marker were evaluated, both phenotypically and by PCR, for effective replacement of the multiple copies of the XylA gene with the two markers kan and nat. A strain in which all the XylA copies had been replaced by a kan and a nat marker was selected, and the markers were subsequently removed in another CRISPR/Cas9 step using a gRNA plasmid that targets both the kan and nat markers. A full-length PYK2 sequence flanked by sequences upstream and downstream of the inserted markers in the genome was used as donor DNA to restore the partially deleted PYK2 gene. The final strain, devoid of any CpXI gene and carrying the full-length PYK2, was referred to as GSE16-T18CpXIΔ/CpXIΔ.

Optimization of the CRISPR/Cas9 method for genomic integration of the XylA genes

Genomic integration of 2 to 4 copies of each of the XylA genes was performed using an optimized CRISPR/Cas9 system. First, a donor DNA was constructed in a multicopy plasmid carrying two XylA sequences flanked by sequences homologous to the regions upstream and downstream of the integration site, to trigger homologous recombination. The donor plasmid DNA (pDonor) was transformed into the yeast strain GSE16-T18CpXIΔ/CpXIΔ, and transformants were selected directly on plates containing xylose as a carbon source. The strains carrying pDonor were then transformed with a gRNA plasmid carrying the hph marker and a Cas9 plasmid carrying the kan marker, and selected on YPD with geneticin and hygromycin. Transformants growing in the presence of both antibiotics were transferred to a new YPD plate and evaluated for proper integration of the donor DNA into the genome. This was done by PCR using a pair of primers annealing upstream and downstream of the insertion site. Once the insertion was confirmed, the strains were allowed to lose the plasmids by growing them in YPD medium for 5 days, with serial transfer of the strains to a new YPD plate every 24 h. After 5 days, a sample was spread for single colonies and several colonies were evaluated for loss of the gRNA and Cas9 plasmids carrying the hph and kan markers, respectively. Colonies that had lost both plasmids were checked by PCR for loss of the donor plasmid, since the donor plasmid carried no selection marker.

Small-scale fermentations

Small-scale fermentations were performed essentially according to the protocol described previously (12). Briefly, cells were pre-grown in 5 mL YPD for 24 h. Subsequently, 1 mL of culture was transferred to 50 mL YPD in a 300 mL Erlenmeyer flask. After 24 h of growth, cells were harvested and inoculated at 1 g DW/L into 50 mL YP medium containing 4% w/v xylose as a carbon source, in cylindrical tubes with a cotton-plugged rubber stopper and glass tubing. Cultures were continuously stirred with a magnetic rod at 120 rpm and incubated at 35°C. Progress of the fermentations was followed by measuring the weight loss due to CO2 release during the fermentation.

Results

Screening for XI sequences that support growth of S. cerevisiae on xylose as a sole carbon source

Expression in yeast of XIs originating from several species of bacteria has been reported in the last decade. Most of the enzymes failed to show reasonable enzymatic activity in S. cerevisiae, and only a limited number of XIs with good enzymatic activity is available to date. Since a large number of sequences exist in public databases such as NCBI, we explored sequence databases to search for XylA genes originating from diverse environments. We selected 16 sequences coding for XI from 16 bacterial species. At the amino acid level, the sequences showed 62% to 96% identity to the sequence of C. phytofermentans XylA (Table 1) and 50% to 55% identity to the sequence of Piromyces sp. E2 XylA.
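As an illustration of how a percent identity figure of this kind can be obtained from two rows of an alignment, the following minimal sketch counts identical residues over the positions where neither sequence has a gap. The function name and the toy sequences are hypothetical, and this is only one of several common identity definitions (others divide by the full alignment length or by the shorter sequence), so it is not necessarily the definition used for the values reported here.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of the same alignment.

    Columns where either row has a gap are skipped; identity is the number of
    identical residues divided by the number of compared columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap column, not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example with made-up fragments (not taken from Table 2)
print(round(percent_identity("MKEYF-PN", "MKNYFAPN"), 1))
```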

Each sequence has been codon optimized and synthesized by IDT (Integrated DNA Technologies, Heverlee, Belgium). The codon-optimized genes were subsequently cloned into the yeast expression vector p426-tef1, under control of the tef1 promoter and Cyc1 terminator. For comparison, we also constructed a plasmid with the CpXI gene under control of the same promoter and terminator. The constructed plasmids were subsequently transformed into the platform strain M315CpXIΔ/CpXIΔ.

Transformants were selected on synthetic medium containing xylose as a carbon source (SCX plate). After 5 days at 30°C, 7 of the 14 transformants were able to grow on the SCX plate. An additional 4 transformants grew into smaller colonies after 7 days, indicating that the genes in these 4 transformants supported only slow growth on xylose. Nevertheless, a total of 11 out of the 16 genes tested were able to support growth in medium with xylose as a sole carbon source.

Confirmation of correct expression of the genes

To confirm the presence of the expressed gene in the host strain, polymerase chain reaction (PCR) was performed using primers that specifically amplify each gene. As expected, a PCR product of the expected size of 1.2 kb was obtained for all the strains tested (data not shown). The negative control strains M315CpXIΔ/CpXIΔ and M315CpX/ failed to show a PCR band, confirming the specificity of the PCR product.

Fermentation in medium with xylose

Since growth and fermentation are different traits that frequently do not correlate well with one another, we evaluated all 11 XylA transformants for fermentation performance in YP medium containing xylose as a sole carbon source. The first 7 XylA transformants were tested in a first batch of fermentation assays. Interestingly, all 7 XylA transformants showed rapid xylose fermentation capacity in YP medium containing 4% xylose (Figure 1). A control strain carrying the CpXI gene was also evaluated for comparison. Two of the newly isolated genes (Es7XI and Cc10XI) sustained xylose fermentation performance similar to that of CpXI.

Subsequently, we repeated the fermentation test and included the four slow-growing XylA transformants. As shown in Figure 2, all 11 transformants were able to ferment xylose very well. The 7 strains showing rapid fermentation in the first fermentation test again showed the same rapid fermentation profile. In addition, two strains from the slowly growing second batch (Es5XI and Cl11XI) showed a fermentation profile similar to that of the first 7 rapidly fermenting strains. Therefore, 9 of the 11 transformants were able to support rapid xylose fermentation capacity in an industrial yeast strain background.
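Because the progress of these fermentations was followed as weight loss due to CO2, it can help to relate a measured weight loss to the amount of xylose fermented. The sketch below does this under idealized assumptions that are not stated in the application: the whole weight loss is CO2 (no evaporation), and xylose is converted only to ethanol and CO2 with the theoretical stoichiometry 3 xylose -> 5 ethanol + 5 CO2 (no biomass or by-products). Function names are hypothetical.

```python
# Idealized stoichiometry: 3 C5H10O5 -> 5 C2H5OH + 5 CO2
XYLOSE_G_PER_MOL = 150.13
ETHANOL_G_PER_MOL = 46.07
CO2_G_PER_MOL = 44.01
CO2_MOL_PER_XYLOSE_MOL = 5.0 / 3.0


def max_co2_loss_g(volume_ml: float, xylose_pct_wv: float) -> float:
    """Theoretical maximum CO2 weight loss (g) if all xylose is fermented."""
    xylose_g = volume_ml * xylose_pct_wv / 100.0  # % w/v = g per 100 mL
    xylose_mol = xylose_g / XYLOSE_G_PER_MOL
    return xylose_mol * CO2_MOL_PER_XYLOSE_MOL * CO2_G_PER_MOL


def xylose_fermented_g(weight_loss_g: float) -> float:
    """Xylose (g) corresponding to a given CO2 weight loss."""
    co2_mol = weight_loss_g / CO2_G_PER_MOL
    return co2_mol / CO2_MOL_PER_XYLOSE_MOL * XYLOSE_G_PER_MOL


def ethanol_formed_g(weight_loss_g: float) -> float:
    """Ethanol (g) formed alongside the CO2 (1 mol ethanol per mol CO2)."""
    return weight_loss_g / CO2_G_PER_MOL * ETHANOL_G_PER_MOL


# 50 mL of YP medium with 4% w/v xylose, as in the small-scale fermentations
full_loss = max_co2_loss_g(50.0, 4.0)
print(f"maximum CO2 weight loss: {full_loss:.3f} g")
print(f"ethanol at that point:   {ethanol_formed_g(full_loss):.3f} g")
```

Under these assumptions a 50 mL culture with 4% w/v xylose (2 g xylose) can lose a little under 1 g as CO2, so a measured weight loss can be read as a fraction of that ceiling; real cultures fall below it because part of the sugar goes to biomass and by-products.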

To confirm that the XylA transformants no longer carried CpXI, which by itself is able to support high xylose fermentation capacity, we tested all the cultures at the end of the fermentation by PCR using primers specific for the CpXI sequence. As expected, none of the cultures was positive for the CpXI gene, while the control strain carrying the CpXI plasmid showed a band of the expected size (data not shown).

Integration of XylA genes into the genome

Since plasmid copy number varies greatly in vivo during growth or fermentation, selecting the most active gene based on the fermentation performance of plasmid-carrying strains can create a strong bias. Moreover, plasmids are unstable and are not an ideal gene expression system for industrial application. Hence, we performed genomic integration of 3 of the 8 genes supporting the best xylose fermentation capacity, and also of the gene CpXI for comparison. The integration was carried out into the genome of the robust industrial platform yeast strain GSE16-T18CpXIΔ/CpXIΔ, using a modified CRISPR/Cas9 system that we optimized for single-step transformation and for efficient integration of foreign genes in 2 to 4 copies, as described in the Materials and Methods section. Using this methodology, we were able to stably integrate 2 to 4 copies of each of the genes in an intergenic region downstream of TYE7 and upstream of the tRNA gene tP(UGG)O3 in chromosome XV. Proper integration of the genes in the genome was confirmed by PCR (Figure 3).

Fermentation performance after XylA genomic integration

The fermentation performance of the strains carrying 2 to 4 copies of each XylA gene was evaluated in YP medium with xylose as the sole carbon source. As shown in Figure 4, strains carrying either the Es7XI, Cs8XI or Fm9XI gene in the genome showed high xylose fermentation capacity. From the beginning, strains carrying Cs8XI and Fm9XI showed a xylose fermentation rate comparable to that of the strain carrying CpXI, while the strain carrying Es7XI showed a delay at the beginning of the fermentation but later reached a high xylose fermentation rate. Although the strain with two copies of Cs8XI showed a slightly slower xylose fermentation rate than the strain with two copies of CpXI, it showed the highest rate of fermentation during the exponential phase of fermentation (Figure 4). Moreover, the strains with 4 copies of Cs8XI fermented at a higher rate than the strain with the same number of copies of CpXI.

Conclusion

Eleven of the 16 newly identified XylA genes confer very good xylose fermentation performance in an industrial yeast strain when expressed from a multi-copy plasmid under control of the Tef1 promoter and Cyc1 terminator. Except for the XylA gene obtained from L. phytofermentans, which has 96% sequence identity with CpXI, all the functionally expressed XylA genes lack significant sequence identity with any of the xylose isomerases that have been actively expressed to date. The bacterial species from which these XylA genes have been obtained are isolates from diverse environments. Though most of the species inhabit environments that are rich in plant matter, which explains their cellulolytic capacity, the bacterium M. indolicus is a non-cellulolytic organism that has been isolated from the female genital tract (17). From an evolutionary point of view, this would indicate that the habitat of the source organism does not correlate with the functionality of the xylose isomerase, since there is no need for the XI to remain active in an environment lacking hemicellulose. On the other hand, it cannot be excluded that M. indolicus also lives in environments where xylose utilization, but not lignocellulolytic capacity, is important for its survival.

Three of the 11 XylA genes were studied after their integration into the genome. The Cs8XI gene was among the best at conferring xylose fermentation capacity to the industrial platform strain when integrated in 2 or 4 copies. This gene is derived from the acetone-butanol-producing bacterial species C. saccharoperbutylacetonicum. Although the bacterium is known to utilize xylose, the XI gene from this organism has never been expressed in the yeast S. cerevisiae. On the other hand, the gene Fm9XI has previously been expressed in yeast (WO 2010/074577). Interestingly, the Cs8XI and Fm9XI XylA genes have only 68% sequence identity at the amino acid level. This low sequence identity is not surprising since the two source organisms are unrelated. Cs8XI is therefore a novel gene that confers excellent xylose fermentation capacity in yeast with chromosomal integration of only 2 to 4 copies. Integration of additional copies of the gene might further improve the xylose fermentation capacity. Furthermore, integration of the other genes identified in this work into the genome of the platform strain is important for stable expression of the genes and may also result in high xylose fermentation capacity.

Example 2

Performance of Es7XI and Cc10XI in strain MDS130

We further improved the xylose fermentation and inhibitor tolerance of strain GSE16-T18 by genome shuffling and evolutionary adaptation. Strain MDS130, which shows highly improved xylose fermentation capacity in inhibitor-rich hydrolysates, was selected in this way. Afterwards, we completely knocked out the CpXI genes from the genome of MDS130 using the CRISPR/Cas9 technique as described above in the section "Deletion of CpXI from GSE16-T18". As expected, the knockout strain MDS130CpXIΔ/CpXIΔ was not able to utilize xylose (Figure 5A).

Next, we introduced the two best-performing novel XI genes, Es7XI and Cc10XI, into the genome of MDS130CpXIΔ/CpXIΔ downstream of the TYE7 gene in chromosome XV. With only two copies of each gene introduced, the deletion strain was able to utilize xylose, but at a slower rate than the original MDS130 strain, which carried about 18 copies of CpXI. To evaluate whether combining the two genes improved xylose fermentation performance, we introduced 4 additional copies of Cc10XI into the strain carrying two copies of Es7XI. This resulted in a significant improvement of the fermentation rate, close to the performance of strain MDS130 (Figure 5A).

We have previously shown that a gene of interest adjacent to an ARS sequence is frequently amplified when cells are grown under a selective pressure requiring high expression of the gene of interest (WO2016026954). For that reason, we introduced Es7XI in two copies about 2000 nucleotides upstream of ARS1529 and evolved the strain in YP + 4% xylose to induce chromosomal amplification. After 3 weeks, single-cell isolates were evaluated, and strain MDC5, which performed best among the tested isolates, was selected. Gene copy number analysis by qPCR showed that this strain carried about 12 copies of Es7XI. The performance of strain MDC5 with 12 copies of Es7XI was similar to that of MDS130, which carried about 18 copies of CpXI (Figure 5B). This shows the superior performance of Es7XI over CpXI, at least in the strain background tested.
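The application does not describe how the qPCR data were converted into a copy number. One common way to do this is relative quantification against a reference gene and a calibrator strain with a known copy number (the 2^-ΔΔCt approach). The sketch below illustrates that calculation only as an assumption; the function name, the default amplification factor of 2 per cycle and all Ct values are hypothetical.

```python
def estimate_copy_number(ct_target: float, ct_reference: float,
                         ct_target_cal: float, ct_reference_cal: float,
                         calibrator_copies: float = 2.0,
                         amplification_factor: float = 2.0) -> float:
    """Estimate gene copy number by relative quantification (delta-delta Ct).

    ct_target / ct_reference:         Ct values measured in the strain of interest
    ct_target_cal / ct_reference_cal: Ct values measured in a calibrator strain
    calibrator_copies:                known copy number in the calibrator (assumed)
    amplification_factor:             fold increase per cycle (2.0 = 100% efficiency)
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    delta_delta = delta_sample - delta_calibrator
    return calibrator_copies * amplification_factor ** (-delta_delta)


# Hypothetical Ct values: the target amplifies 2 cycles earlier than in a
# 2-copy calibrator, which corresponds to a 4-fold higher copy number.
print(estimate_copy_number(18.0, 20.0, 20.0, 20.0))
```

A single-copy chromosomal gene would typically serve as the reference, and primer efficiencies would be checked before assuming a doubling per cycle.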

Table 2 CLUSTAL alignment of xylose isomerase amino acid sequences by MUSCLE (3.8)

PiXI     MAKEYFPQIQKIKFEGKDSKNPLAFHYYDAEKEVMGKKMKDWLRFAMAWWHTLCAEGADQ
Cc10XI   -MKEYFSNIPKVRYEGPDSKNPFAFKFYNPEEKIAGKTMREQLKFSLAYWHTLDAEGTDM
Am6XI    -MREHFLEINKIKFEGGDSTNPLAFKYYDANRIVAGKKMKDHLRFALSYWHTLTGNGTDP
Fm9XI    --MEFFKGIDKVKYEGVKTNNLLAFAHYNPEEVILGKKMKDHLKFAMSYWHTLTGEGTDP
Cs8XI    -MKEYFGNVSKINYEGPGSKNPYSFKYYNPDEVIGGKTMKEHLRFSLSYWHTLTANGADP
Cl11XI   -MAEFFKGIGVIPFEGADSVNPLAFKHYNKDEKVGDKTMAEHLRFAMSYWHTLCAEGGDP
Pcd12XI  -MSEIFKGIGQIKFEGVKSDNELAFRYYNPEQWGNKTMKEHLRFAMSYWHTLCGEGNDP
Es7XI    MYFNNIEKIKFEGVNSKNPLAFKYYDADRIIAGKKMSEHLKFAMSYWHTMCADGTDM
Es5XI    -MVNGLTNIPPVKFEGRDSKKALSFKYYNPDEMIQGKKMKDYLKFAMSYWHTLCGDGTDP
Mi3XI    --MKFFENVPKVKYEGSKSTNPFAFKYYNPEAVIAGKKMKDHLKFAMSWWHTMTATGQDQ
Ca2XI    -MKEYFKGIPEVKYEGKDSINPFAFKFYDAKRVIDGKSMEEHLKFAMSWWHTMTATGTDP
Lp1XI    -MKNYFPNVPEVKYEGPNSTNPFAFKYYDAERIVAGKTMKEHCRFALSWWHTLCAGGADP
CpXI     -MKNYFPNVPEVKYEGPNSTNPFAFKYYDANKWAGKTMKEHCRFALSWWHTLCAGGADP

PiXI     FGGGTKSFPWNEGTDAIEIAKQKVDAGFEIMQKLGIPYYCFHDVDLVSEGNSIEEYESNL
Cc10XI   FGRATMDKSFGETD-PMAIYKNKAYAAFELMDKLDIDYFCFHDRDIAPEGPTLSETNKNL
Am6XI    FGQPTMERDYNSLD-GIELSKARVDAAFELMTKLGIEFFCFHDLDIAPEGNSLQEKLDNL
Fm9XI    FGNATMDREWNEYT-PMEKAKARVKAGFEFMEKLGLEYFCFHDKDIAPEAETLEEYHRNL
Cs8XI    FGAGTMLRPWDDITNEMDLAKARMEAAFELMDKLNIEYFCFHDRDIAPEGKTLQETNENL
Cl11XI   FGSTTAARPWNQIANPIEMAKAKVDAGFEFMQKLGIEYFCFHDRDIAPEGKDLAETNQIL
Pcd12XI  FGVGTVERPWNNITDPIEIAKIKVDAGFEFMSKMGIEYFCFHDRDIAPEGRDLEETNKIL
Es7XI    FGRGTINKSFGGKT-AIEIYEHKVYAAFELMEKLGMQYFCFHDRDIAPEGATLKETNENL
Es5XI    FGSSTIDRDYSGQT-PMEKAKTKADVAFALMQILGIEYFCFHDLDIAPTGNSLKELKNNL
Mi3XI    FGSGTMSRIYDGQTEPLALAKARVDAAFDFMEKLNIEYFCFHDADLAPEGNSLQERNENL
Ca2XI    FGAGTIDRNYGQTE-SMEIARAKVDAAFELMKKLGIKYFCFHDVDIVPEGKDLKETKENL
Lp1XI    FGVTTMDRSYGNITDPMEFAKAKVDAGFELMTKLGIEYFCFHDADIAPEGENFEESKKNL
CpXI     FGVTTMDRTYGNITDPMELAKAKVDAGFELMTKLGIEFFCFHDADIAPEGDTFEESKKNL

PiXI     KAWAYLKEKQKETGIKLLWSTANVFGHKRYMNGASTNPDFDWARAIVQIKNAIDAGIE
Cc10XI   DEIVSLLKKLMAEHNKKLLWGTANTFSHPRYVHGAGTSCNASVFAFAAAQIKKAIEITKE
Am6XI    DTILERIEDKMKETGIKCLWGTTNAFSHPRFMHGAATSPNADVFAFAAAQVKKALEITHR
Fm9XI    DEIVDLIEEEMKRTGIKLLWGTSNMFSHPRFMHGAATSCNADVFAYAAAQTKKALEITKR
Cs8XI    DEIVAYCKELMKKYNKKLLWGTANCFTNPRYVHGAGTSCNADVFAYAAAQIKKALEVTKE
Cl11XI   DEWAYIKVKMQETGIKLLWGTANCFNNKRFMHGAGTTCNAEVFAYAAAQIKKAIEVTKE
Pcd12XI  DEIVEYIKVNMEKTGIKLLWGTANMFGNPRFVHGASTTCNADVYAYAAAQVKKAMEITKY
Es7XI    ERIVPIIKSEMKRTGIKLLWGTANCFNHPRYMCGAGTAPSADVFAYAAAQIKKAIEI VE
Es5XI    IEITDYIKGLMDKTGIKLLWGTANCFSHPRYMNGAGTSPQADIFACAAAQIKNAIDATIK
Mi3XI    QEMVSYLKQKMAGTSIKLLWGTSNCFSNPRFMHGAATSCEADVFAWTATQLKNAIDA IA
Ca2XI    SVIVDYIEEKMKGTDIKLLWGTANCFSSPRYMHGAGTSCNADSFSYAASQIKNAIDATIQ
Lp1XI    FVIVDYIKEKMDQTGIKLLWGTANNFGHPRFMHGASTSCNADVFAYAAAKIKNALDATIK
CpXI     FEIVDYIKEKMDQTGIKLLWGTANNFSHPRFMHGASTSCNADVFAYAAAKIKNALDATIK

PiXI     LGAENYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGTFLIEPKPMEPT
Cc10XI   LDGCGYVFWGGREGYETLLNTDMELELDNMARLLKMAVDYARSIGFKGEFFIEPKPKEPT
Am6XI    LRGENYVFWGGREGYETLLNTDIALENDNLAKFLKMAKDYARNIGFEGQFLIEPKPKEPT
Fm9XI    LNGTGYVFWGGREGYETLLNTDIGLELDNLARFLQMAVDYAKKIGFEGQFFIEPKPKEPT
Cs8XI    LGGENYVFWGGREGYETLLNTDMGLELDNFARLLQMAVDYAKEIGFTGQFLIEPKPKEPT
Cl11XI   LGGENYVFWGGREGYETLLNTDTGLELDNFARLLQMAVDYAKEIGFTGQFLIEPKPKEPT
Pcd12XI  LGGENFVFWGGREGYETLLNTNTELEMDNFARFLQMAVDYAKEIGFTGQFLIEPKPKEPT
Es7XI    LGGQGYVFWGGREGYD ILNTDMAKEQDNMAYLMRMAVDYGRSIGFTGDFYIEPKPKEPT
Es5XI    LGGTGYVFWGGREGYETLLNTNMEIELDNMAKLMHMAVDYARSKGFTGDFYIEPKPKEPT
Mi3XI    LGGKGYVFWGGREGYETLLNTDVGLEMDNYARMLKMAVAYARSKGYTGDFYIEPKPKEPT
Ca2XI    LGGSGYVFWGGREGYETLLNTDMGFELDNMARLMKMAVKYARKKGFNGDFYIEPKPKEPT
Lp1XI    LGGKGYVFWGGREGYETLLNTDLGLELDNMARLMKMAVEYGRANGFDGDFYIEPKPKEPT
CpXI     LGGKGYVFWGGREGYETLLNTDLGLELDNMARLMKMAVEYGRANGFDGDFYIEPKPKEPT

PiXI     KHQYDVDTETAIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELACAVDAGMLGSIDANR
Cc10XI   KHQYDYDVSTVLAFLRKYGLDKVFKVNIEANHATLAQHTFQHELRVARINGVLGSVDANQ
Am6XI    KHQYDFDTMTVLGFLRKYNLIDDFKLNIEANHATLAGHTFQHELAMARINGVLGSVDANQ
Fm9XI    KHQYDFDTTTVLEFLRKYNLDKYFKMNIEANHATLAGHTFQHELCTARINGVFGSIDANQ
Cs8XI    KHQYDFDTATVLGFLKKYNLDKYFKVNIEANHATLAQHTFQHELNFARINNFLGSIDANQ
Cl11XI   KHQYDFDTATVLAFLRKYNLDTYFKMNIEANHATLAGHTFQHELNMSRINNVLGSIDANQ
Pcd12XI  KHQYDFDTATVLGFLRKYNLDKYFKMNIEANHATLAGHTFQHELNIARINNVLGSIDANQ
Es7XI    KHQYDFDVSTVLAFLRKYDLDKDFKMNIEANHATLAGHTFQHELRVARDNGVFGSIDANQ
Es5XI    KHQYDFDVATWGFLRKYGLDKDFKMNIEANHATLAGHTFQHELNVARVNNVFGSIDANQ
Mi3XI    KHQYDFDVATCVAFLEKYDLMRDFKVNIEANHATLAGHTFQHELRMARTFGVFGSVDANQ
Ca2XI    KHQYDFDAATVIGFLRKYDLMDDFKLNIEANHATLAGHTFPHELAVARINGVFGSVDANQ
Lp1XI    KHQYDFDTATVLGFLRKYGLEKDFKMNIEANHATLAGHTFEHELALARVNGVFGSVDANQ
CpXI     KHQYDFDTATVLAFLRKYGLEKDFKMNIEANHATLAGHTFEHELAMARVNGAFGSVDANQ

PiXI     GDYQNGWDTDQFPIDQYELVQAWMEIIRGGGFVTGGTNFDAKTRRNSTDLEDIIIAHVSG
Cc10XI   GDVMLGWDTDQFPTNVYDTALAMYEILKNGGLPSGGLNFDSKNRRGSFEPEDIFHGFIAG
Am6XI    GDLLLGWDTDQFPTNIYDATLSMYEVLKNGGIAPGGLNFDAKVRRGSFKPDDLFIAYIVG
Fm9XI    GDMLLGWDTDQFPTNVYDAVLAMYETLLAGGFKEGGLNFDAKVRRGSFEPKDLFYAYISG
Cs8XI    GDPMLGWDTDQFPTNIYDATLAMYEILKNGGLAPGGVNFDAKVRRASFEKEDLFLAYIAG
Cl11XI   GDLMLGWDTDQFPTNIYDATMAMYEVLKAGGIAPGGFNFDSKVRRGSFEEADLFIAYIAG
Pcd12XI  GDLLLGWDTDQFPTNIYDATLAMYEVLKQGGIAPGGFNFDSKVRRASFEVEDLFLAYIAG
Es7XI    GDMLLGWDTDQFPTDLYSTTMCMYEVLKQGGFTNGGLNFDAKARRASNTYEDVFLSYIAG
Es5XI    GDLLLGWDTDQFPTNVYDTTLCMLEVIKAGGFTNGGLNFDAKVRRASYTMEDIILAYISG
Mi3XI    GDSNLGWDTDQFPGNIYDTTLAMYEILKAGGFTNGGLNFDAKVRRPSFTPEDIAYAYILG
Ca2XI    GDSLLGWDTDQFPTDVKEATLSMLEIIKAGGFTNGGLNFDAKVRRPSFTFEDIVYGYISG
Lp1XI    GDPNLGWDTDQFPTDVHSATLAMLEVLKAGGFTNGGLNFDAKVRRGSFEFDDIAYGYIAG
CpXI     GDPNLGWDTDQFPTDVHSATLAMLEVLKAGGFTNGGLNFDAKVRRGSFEFDDIAYGYIAG

PiXI     MDAMARALENAAKLLQESPYTKMKKERYASFDSGIGKDFEDGKLTLEQVYEYGKKNGEP-
Cc10XI   MDAFALGLRIADRIIRDGRLEQFVKDRYKSYQSGIGADIVSGRAKIEDLEKYALKLGEVN
Am6XI    MDTFAKGLLVADKLLTDGVLENFVTKRYESYTAGIGKKIIEDATSFEELAEYALKHDKI-
Fm9XI    MDTFAKGLKVAAKLIEDGTFEKIKVERYSSYTTGIGKQIVNGEVGFEELSKYALTNGVK-
Cs8XI    MDTFAKGLKVAHKLLENGELENFIKNKYASFSEGIGKEIVEGKVGLKELEAYALKNNEI-
Cl11XI   MDTFAKGLKVAYNLLKDGVLEDFVADRYASFNEGIGKDIVSGNVGFKELEAYALKQQPI-
Pcd12XI  MDTFAKGLLIAHKLLEDEVFENFTKERYASFSEGIGKDIVEGKVGFKELESYALQMPVI-
Es7XI    MDAFAYGLIVADKIISDGVMDKFVENRYSSYTEGIGKKIADKQTSLAELEQYTLTNGEP-
Es5XI    MDTFALGLKIANKIIEDGRIDEFVSRRYASYKTGIGADIIAGRTNLEELEKYALELPPV-
Mi3XI    MDTFALGLIKAQQLIEDGRIDRFVAEKYASYKSGIGAEILSGKTSLPELEAYALKKGEP-
Ca2XI    MDTFALGLIKAYEVIEDGRIDEFIEKRYASYESGIGKKILNNEVTLEELEAYTLENKER-
Lp1XI    MDTFALGLIKAAEIIEDGRIAKFVEDRYASYKTGIGKAIVDGTTSLEELEQYVLTHNEP-
CpXI     MDTFALGLIKAAEIIDDGRIAKFVDDRYASYKTGIGKAIVDGTTSLEELEQYVLTHSEP-

PiXI     KQTSGKQELYEAIVA--MYQ
Cc10XI   AIGSGRQEYLEDILNSIMFGK
Am6XI    VLESGRQEMLEDIVNRYIYK
Fm9XI    KNSSGRQEMLENILNRYIYE
Cs8XI    TNKSGRQELLEAIVNQYIFED
Cl11XI   VNKSGRQEWLETWNQYIYNNK
Pcd12XI  KNKSGRQEMLESILNRYIYEVD ISNK
Es7XI    TAESGKQEYLEALVNQYIISAGREL--
Es5XI    EPHPGKQEYLEAVFNNVMFTV
Mi3XI    KLYSGRQEYLESWNNVIFNGNL
Ca2XI    PMESGRQEYLETILNQILYK
Lp1XI    VMQSGRQEVLESIVNNILFR
CpXI     VMQSGRQEVLETIVNNILFR

References

1. Zaldivar J, Nielsen J, Olsson L. Fuel ethanol production from lignocellulose: a challenge for metabolic engineering and process integration. Appl Microbiol Biotechnol. 2001 Jul;56(1-2):17-34.

2. Hahn-Hagerdal B, Karhumaa K, Fonseca C, Spencer-Martins I, Gorwa-Grauslund MF. Towards industrial pentose-fermenting yeast strains. Appl Microbiol Biotechnol. 2007 Apr;74(5):937-53.

3. Lau MW, Gunawan C, Balan V, Dale BE. Comparing the fermentation performance of Escherichia coli KO11, Saccharomyces cerevisiae 424A(LNH-ST) and Zymomonas mobilis AX101 for cellulosic ethanol production. Biotechnol Biofuels. 2010 May 27;3(1):11.

4. Bettiga M, Hahn-Hagerdal B, Gorwa-Grauslund MF. Comparing the xylose reductase/xylitol dehydrogenase and xylose isomerase pathways in arabinose and xylose fermenting Saccharomyces cerevisiae strains. Biotechnol Biofuels. 2008 Oct 23;1(1):16.

5. Hector RE, Mertens JA, Bowman MJ, Nichols NN, Cotta MA, Hughes SR. Saccharomyces cerevisiae engineered for xylose metabolism requires gluconeogenesis and the oxidative branch of the pentose phosphate pathway for aerobic xylose assimilation. Yeast. 2011 Sep 1;28(9):645-60.

6. Hou J, Suo F, Wang C, Li X, Shen Y, Bao X. Fine-tuning of NADH oxidase decreases byproduct accumulation in respiration deficient xylose metabolic Saccharomyces cerevisiae. BMC Biotechnol. 2014 Feb 14;14(1):13.

7. Jeppsson M, Bengtsson O, Franke K, Lee H, Hahn-Hagerdal B, Gorwa-Grauslund MF. The expression of a Pichia stipitis xylose reductase mutant with higher KM for NADPH increases ethanol production from xylose in recombinant Saccharomyces cerevisiae. Biotechnol Bioeng. 2006;93(4):665-73.

8. Walfridsson M, Bao X, Anderlund M, Lilius G, Bülow L, Hahn-Hagerdal B. Ethanolic fermentation of xylose with Saccharomyces cerevisiae harboring the Thermus thermophilus xylA gene, which expresses an active xylose (glucose) isomerase. Appl Environ Microbiol. 1996 Dec;62(12):4648-51.

9. Kuyper M, Harhangi HR, Stave AK, Winkler AA, Jetten MSM, de Laat WTAM, et al. High-level functional expression of a fungal xylose isomerase: the key to efficient ethanolic fermentation of xylose by Saccharomyces cerevisiae? FEMS Yeast Res. 2003;4(1):69-78.

10. Peng B, Huang S, Liu T, Geng A. Bacterial xylose isomerases from the mammal gut Bacteroidetes cluster function in Saccharomyces cerevisiae for effective xylose fermentation. Microb Cell Factories. 2015 May 17;14(1):70.

11. Brat D, Boles E, Wiedemann B. Functional Expression of a Bacterial Xylose Isomerase in Saccharomyces cerevisiae. Appl Environ Microbiol. 2009 Feb 13;75(8):2304-11.

12. Demeke MM, Dietz H, Li Y, Foulquie-Moreno MR, Mutturi S, Deprez S, et al. Development of a D-xylose fermenting and inhibitor tolerant industrial Saccharomyces cerevisiae strain with high performance in lignocellulose hydrolysates using metabolic and evolutionary engineering. Biotechnol Biofuels. 2013 Jun 21;6(1):89.

13. Glanemann C, Loos A, Gorret N, Willis LB, O'Brien XM, Lessard PA, et al. Disparity between changes in mRNA abundance and enzyme activity in Corynebacterium glutamicum: implications for DNA microarray analysis. Appl Microbiol Biotechnol. 2002 Dec 21;61(1):61-8.

14. Glanemann03.pdf [Internet]. [cited 2016 Nov 6]. Available from: http://web.mit.edu/biology/sinskey/www/Glanemann03.pdf

15. Demeke MM, Foulquie-Moreno MR, Dumortier F, Thevelein JM. Rapid Evolution of Recombinant Saccharomyces cerevisiae for Xylose Fermentation through Formation of Extra-chromosomal Circular DNA. PLoS Genet. 2015 Mar 4;11(3):e1005010.

16. Demeke MM, Dumortier F, Li Y, Broeckx T, Foulquie-Moreno MR, Thevelein JM. Combining inhibitor tolerance and D-xylose fermentation in industrial Saccharomyces cerevisiae for efficient lignocellulose-based bioethanol production. Biotechnol Biofuels. 2013 Aug 26;6(1):120.

17. Austin MN, Rabe LK, Srinivasan S, Fredricks DN, Wiesenfeld HC, Hillier SL. Mageeibacillus indolicus gen. nov., sp. nov.: A novel bacterium isolated from the female genital tract. Anaerobe. 2015 Apr;32:37-42.

18. Gietz RD, Schiestl RH, Willems AR, Woods RA. Studies on the transformation of intact yeast cells by the LiAc/SS-DNA/PEG procedure. Yeast. 1995 Apr 15;11(4):355-60.