Title:
PRODUCTION OF ANTHOCYANIN FROM SIMPLE SUGARS
Document Type and Number:
WIPO Patent Application WO/2017/050853
Kind Code:
A1
Abstract:
Methods for producing anthocyanin by expression in a microorganism are disclosed including culturing of the microorganism under anthocyanin producing conditions, wherein the microorganism has an operative metabolic pathway including at least one heterologous enzyme activity, the pathway producing anthocyanin from simple sugars or other simple carbon sources.

Inventors:
NAESBY MICHAEL (FR)
ZOKOURI ZINA (CH)
FISCHER DAVID (CH)
EICHENBERGER MICHAEL (CH)
HANSSON ANDERS (CH)
Application Number:
PCT/EP2016/072474
Publication Date:
March 30, 2017
Filing Date:
September 21, 2016
Assignee:
EVOLVA SA (CH)
International Classes:
C12N15/52; C12P19/44; C12P17/06
Domestic Patent References:
WO2014096456A1 2014-06-26
WO2001007631A2 2001-02-01
WO2001040491A2 2001-06-07
Foreign References:
US20050208643A1 2005-09-22
US20060019334A1 2006-01-26
CN101948794A 2011-01-19
Other References:
LIMEM I ET AL: "Production of phenylpropanoid compounds by recombinant microorganisms expressing plant-specific biosynthesis genes", PROCESS BIOCHEMISTRY, ELSEVIER, NL, vol. 43, no. 5, 1 May 2008 (2008-05-01), pages 463 - 479, XP022588860, ISSN: 1359-5113, [retrieved on 20080212], DOI: 10.1016/J.PROCBIO.2008.02.001
EFFENDI LEONARD ET AL: "Strain Improvement of Recombinant Escherichia coli for Efficient Production of Plant Flavonoids", MOLECULAR PHARMACEUTICS, vol. 5, no. 2, 1 April 2008 (2008-04-01), pages 257 - 265, XP055003955, ISSN: 1543-8384, DOI: 10.1021/mp7001472
GREEN; SAMBROOK: "MOLECULAR CLONING: A LABORATORY MANUAL, Fourth Edition,", 2012, COLD SPRING HARBOR LABORATORY
AUSUBEL ET AL.: "CURRENT PROTOCOLS IN MOLECULAR BIOLOGY", 1989, GREENE PUBLISHING ASSOCIATES AND WILEY INTERSCIENCE
INNIS ET AL.: "PCR Protocols: A Guide to Methods and Applications", 1990, ACADEMIC PRESS
SASAKI ET AL.: "The Role of Acyl-Glucose in Anthocyanin Modifications", MOLECULES, vol. 19, 2014, pages 18747 - 66
PAQUETTE ET AL., PHYTOCHEMISTRY, vol. 62, 2003, pages 399 - 413
AREND ET AL., BIOTECH. & BIOENG, vol. 78, 2001, pages 126 - 131
YIN ET AL.: "Evolution of plant nucleotide-sugar interconversion enzymes", PLOS ONE., vol. 6, no. 11, 2011, pages E27995
SHAO ET AL., NUCL. ACIDS RES., vol. 37, no. 2, 2009, pages E16
GIETZ ET AL., NAT PROTOC., vol. 2, no. 1, 2007, pages 35 - 7
Attorney, Agent or Firm:
SMAGGASGALE, Gillian Helen (GB)
Claims:
WHAT IS CLAIMED IS:

1. A microorganism, comprising an operative metabolic pathway capable of producing an anthocyanin from a simple sugar, the operative metabolic pathway comprising:

a 4-coumaric acid-CoA ligase (4CL);

a chalcone synthase (CHS);

a flavanone 3-hydroxylase (F3H);

a dihydroflavonol-4-reductase (DFR);

an anthocyanidin synthase (ANS);

an anthocyanidin 3-O-glycosyltransferase (A3GT);

a chalcone isomerase (CHI); and

at least one of

a) a tyrosine ammonia lyase (TAL); or

b) a phenylalanine ammonia lyase (PAL) and a trans-cinnamate 4-monooxygenase (C4H),

wherein at least one enzyme of the operative metabolic pathway is encoded by a gene heterologous to the microorganism.

2. The microorganism of claim 1, wherein the metabolic pathway comprises:

a tyrosine ammonia lyase (TAL);

a phenylalanine ammonia lyase (PAL); and

a trans-cinnamate 4-monooxygenase (C4H).

3. The microorganism of any of the preceding claims, wherein the metabolic pathway further comprises one or more of:

a flavonoid 3'-hydroxylase (F3'H);

a flavonoid 3'-5'-hydroxylase (F3'5'H);

a leucoanthocyanidin reductase (LAR); or

a CYP450 reductase (CPR).

4. The microorganism of claim 3, wherein the anthocyanin is pelargonidin-3-O-glucoside (P3G), cyanidin-3-O-glucoside (C3G), or delphinidin-3-O-glucoside (D3G).

5. The microorganism of claim 1, wherein the microorganism is a yeast.

6. The microorganism of claim 5, wherein the yeast belongs to the genus Saccharomyces, Kluyveromyces, Candida, Pichia, Debaryomyces, Hansenula, Yarrowia, Zygosaccharomyces, or Schizosaccharomyces.

7. The microorganism of claim 6, wherein the yeast is Saccharomyces cerevisiae.

8. The microorganism of claim 1, wherein the microorganism is a bacterium.

9. The microorganism of claim 8, wherein the bacterium is Escherichia coli.

10. The microorganism of claim 1, wherein a plurality of enzymes comprising the operative metabolic pathway are encoded by genes that are heterologous to the microorganism.

11. The microorganism of claim 1, wherein the operative metabolic pathway is expressed in a microorganism further modified to provide an increased amount of a precursor for at least one enzyme of the operative metabolic pathway.

12. The microorganism of claim 1, wherein the microorganism is genetically modified to exhibit increased tolerance to at least one of a precursor, an intermediate, or a product molecule from the operative metabolic pathway.

13. A fermentation vessel comprising the microorganism of claim 1.

14. The microorganism of claim 1, wherein the operative metabolic pathway comprises:

a 4-coumaric acid-CoA ligase (4CL) encoded by the nucleic acid sequence set forth in SEQ ID NO: 1;

a chalcone synthase (CHS) encoded by the nucleic acid sequence set forth in SEQ ID NO: 21;

a flavanone 3-hydroxylase (F3H) encoded by the nucleic acid sequence set forth in SEQ ID NO: 3;

a dihydroflavonol-4-reductase (DFR) encoded by the nucleic acid sequence set forth in SEQ ID NO: 5 or SEQ ID NO: 7;

an anthocyanidin synthase (ANS) encoded by the nucleic acid sequence set forth in SEQ ID NO: 9;

an anthocyanidin 3-O-glycosyltransferase (A3GT) encoded by the nucleic acid sequence set forth in SEQ ID NO: 11;

a chalcone isomerase (CHI) encoded by the nucleic acid sequence set forth in SEQ ID NO: 13; and

at least one of

a) a tyrosine ammonia lyase (TAL) encoded by the nucleic acid sequence set forth in SEQ ID NO: 15, or

b) a phenylalanine ammonia lyase (PAL) encoded by the nucleic acid sequence set forth in SEQ ID NO: 17 and a trans-cinnamate 4-monooxygenase (C4H) encoded by the nucleic acid sequence set forth in SEQ ID NO: 19.

15. The microorganism of claim 14 further comprising a flavonoid 3'-5'-hydroxylase (F3'5'H) encoded by the nucleic acid sequence set forth in SEQ ID NO: 33.

16. A method of producing an anthocyanin, comprising the steps of:

a) culturing the microorganism of any one of claims 1-12, 14, or 15 in a culture medium, wherein the anthocyanin is produced by the microorganism; and

b) optionally isolating the anthocyanin.

17. The method of claim 16, wherein the anthocyanin is pelargonidin-3-O-glucoside (P3G), cyanidin-3-O-glucoside (C3G), and/or delphinidin-3-O-glucoside (D3G).

18. A method of producing an anthocyanin, comprising the steps of:

a) culturing a microorganism comprising an operative metabolic pathway producing an anthocyanin from a simple sugar, the operative metabolic pathway comprising:

a 4-coumaric acid-CoA ligase (4CL);

a chalcone synthase (CHS);

a flavanone 3-hydroxylase (F3H);

a dihydroflavonol-4-reductase (DFR);

an anthocyanidin synthase (ANS);

an anthocyanidin 3-O-glycosyltransferase (A3GT);

a chalcone isomerase (CHI); and

at least one of

a) a tyrosine ammonia lyase (TAL); or

b) a phenylalanine ammonia lyase (PAL) and a trans-cinnamate 4-monooxygenase (C4H),

wherein at least one enzyme of the operative metabolic pathway is encoded by a gene heterologous to the microorganism;

b) producing an anthocyanin by the microorganism; and

c) optionally isolating the anthocyanin.

19. The method of claim 18, wherein the metabolic pathway comprises:

a tyrosine ammonia lyase (TAL);

a phenylalanine ammonia lyase (PAL); and

a trans-cinnamate 4-monooxygenase (C4H).

20. The method of claim 18 or 19, wherein the metabolic pathway further comprises one or more of:

a flavonoid 3'-hydroxylase (F3'H);

a flavonoid 3'-5'-hydroxylase (F3'5'H);

a leucoanthocyanidin reductase (LAR); or

a CYP450 reductase (CPR).

21. The method of claim 18, wherein the anthocyanin is pelargonidin-3-O-glucoside (P3G), cyanidin-3-O-glucoside (C3G), or delphinidin-3-O-glucoside (D3G).

22. The method of claim 18, wherein the microorganism is a yeast.

23. The method of claim 22, wherein the yeast belongs to the genus Saccharomyces, Kluyveromyces, Candida, Pichia, Debaryomyces, Hansenula, Yarrowia, Zygosaccharomyces, or Schizosaccharomyces.

24. The method of claim 23, wherein the yeast is Saccharomyces cerevisiae.

25. The method of claim 22, wherein the microorganism is a bacterium.

26. The method of claim 25, wherein the bacterium is Escherichia coli.

27. The method of claim 18, wherein the operative metabolic pathway is expressed in a microorganism further modified to provide an increased amount of a precursor for at least one enzyme of the operative metabolic pathway.

28. The method of claim 18, wherein the microorganism is genetically modified to exhibit increased tolerance to at least one of a precursor, an intermediate, or a product molecule from the operative metabolic pathway.

29. The method of claim 18, wherein the simple sugar comprises glucose, glycerol, ethanol, or easily fermentable raw materials.

30. A microorganism, comprising an operative metabolic pathway capable of producing an anthocyanin from a simple sugar, the operative metabolic pathway comprising:

a 4-coumaric acid-CoA ligase (4CL);

a chalcone synthase (CHS);

a flavanone 3-hydroxylase (F3H);

a dihydroflavonol-4-reductase (DFR);

an anthocyanidin synthase (ANS);

an anthocyanidin 3-O-glycosyltransferase (A3GT);

a chalcone isomerase (CHI);

at least one of

a) a tyrosine ammonia lyase (TAL); or

b) a phenylalanine ammonia lyase (PAL) and a trans-cinnamate 4-monooxygenase (C4H); and

an anthocyanin-5-O-glycosyl transferase (A5GT), an anthocyanin-3-O-aromatic acyl transferase (A3AAT), or an anthocyanin-3-O-malonyl acyl transferase (A3MAT), wherein at least one enzyme of the operative metabolic pathway is encoded by a gene heterologous to the microorganism.

31. The microorganism of claim 30, wherein the anthocyanin is pelargonidin-3,5-O-diglucoside, cyanidin-3,5-O-diglucoside, delphinidin-3,5-O-diglucoside, pelargonidin-3-O-coumaroyl-glucoside, pelargonidin-3-O-coumaroyl glucoside-5-O-glucoside, pelargonidin-3-O-malonyl glucoside, or pelargonidin-3-O-malonyl glucoside-5-O-glucoside.

32. A method of producing an anthocyanin, comprising the steps of:

a) culturing a microorganism comprising an operative metabolic pathway producing an anthocyanin from a simple sugar, the operative metabolic pathway comprising:

a 4-coumaric acid-CoA ligase (4CL);

a chalcone synthase (CHS);

a flavanone 3-hydroxylase (F3H);

a dihydroflavonol-4-reductase (DFR);

an anthocyanidin synthase (ANS);

an anthocyanidin 3-O-glycosyltransferase (A3GT);

a chalcone isomerase (CHI);

at least one of

a) a tyrosine ammonia lyase (TAL); or

b) a phenylalanine ammonia lyase (PAL) and a trans-cinnamate 4-monooxygenase (C4H), and

an anthocyanin-5-O-glycosyl transferase (A5GT), an anthocyanin-3-O-aromatic acyl transferase (A3AAT), or an anthocyanin-3-O-malonyl acyl transferase (A3MAT), wherein at least one enzyme of the operative metabolic pathway is encoded by a gene heterologous to the microorganism;

b) producing an anthocyanin by the microorganism; and

c) optionally isolating the anthocyanin.

33. The method of claim 32, wherein the anthocyanin is pelargonidin-3,5-O-diglucoside, cyanidin-3,5-O-diglucoside, delphinidin-3,5-O-diglucoside, pelargonidin-3-O-coumaroyl-glucoside, pelargonidin-3-O-coumaroyl glucoside-5-O-glucoside, pelargonidin-3-O-malonyl glucoside, or pelargonidin-3-O-malonyl glucoside-5-O-glucoside.

Description:
PRODUCTION OF ANTHOCYANIN FROM SIMPLE SUGARS

BACKGROUND OF THE INVENTION

Field of the Invention

[0001] Provided are methods for producing anthocyanins in recombinant host cells.

Description of Related Art

[0002] Over the last decade there have been several reports of heterologous production of flavonoids, including anthocyanins, using unicellular hosts, particularly in the prokaryote Escherichia coli and the eukaryote Saccharomyces cerevisiae. Especially in E. coli there has been some success, predominantly after feeding intermediates of the flavonoid pathway to the bacteria. This has allowed several flavanones, flavones, and flavonols to be produced from phenylpropanoid precursors (see e.g., Yan 2005; Jiang 2005; Leonard 2007, respectively). In addition, several other flavonoids were made by intermediate feeding, such as isoflavonoids from liquiritigenin; flavan-3-ols and flavan-4-ols from flavanones; and anthocyanins from either flavanones or from (+)-catechin. However, there are no reports of anthocyanins being produced from basal medium components such as sugar or from the natural precursors phenylalanine or tyrosine.

[0003] The anthocyanin biosynthetic pathway is shown in FIG. 1. As shown, in this pathway the flavonoid intermediate coumaroyl-CoA is produced via the plant phenylpropanoid pathway. Phenylalanine is deaminated by the action of phenylalanine ammonia lyase (PAL), an enzyme of the ammonia lyase family, to form cinnamic acid. Cinnamic acid is then hydroxylated to p-coumaric acid (also called 4-coumaric acid) by cinnamate 4-hydroxylase (C4H), a CYP450 enzyme. Alternatively, p-coumaric acid is formed directly from tyrosine by the action of tyrosine ammonia lyase (TAL). Some enzymes have both PAL and TAL activity. The enzyme 4-coumarate-CoA-ligase (4CL) activates p-coumaric acid to p-coumaroyl CoA by attachment of a CoA group.

[0004] Chalcone synthase (CHS), a polyketide synthase, is the first committed enzyme in the flavonoid pathway, and catalyzes synthesis of naringenin chalcone from one molecule of p-coumaroyl CoA and three molecules of malonyl CoA. Naringenin chalcone is rapidly and stereospecifically isomerized to the colorless (2S)-naringenin by chalcone isomerase (CHI). (2S)-Naringenin is hydroxylated at the 3-position by flavanone 3-hydroxylase (F3H) to yield (2R,3R)-dihydrokaempferol, a dihydroflavonol. F3H belongs to the 2-oxoglutarate-dependent dioxygenase (2ODD) family. Flavonoid 3'-hydroxylase (F3'H) and flavonoid 3',5'-hydroxylase (F3'5'H), which are P450 enzymes, catalyze hydroxylation of dihydrokaempferol (DHK) to form (2R,3R)-dihydroquercetin and dihydromyricetin, respectively. F3'H and F3'5'H determine the hydroxylation pattern of the B-ring of flavonoids and anthocyanins and are necessary for cyanidin and delphinidin production, respectively. They are the key enzymes that determine the structures of anthocyanins and thus their color. Dihydroflavonols are reduced to corresponding 3,4-cis leucoanthocyanidins by the action of dihydroflavonol 4-reductase (DFR). Anthocyanidin synthase (ANS, also called leucoanthocyanidin dioxygenase or LDOX), which belongs to the 2ODD family, catalyzes synthesis of corresponding colored anthocyanidins. In contrast to the well-conserved main pathway of flavonoid biosynthesis described above, modification of anthocyanidins is family- or species-dependent and can be very diverse. Additionally, in order to form more stable anthocyanins, anthocyanidins can be 3-glucosylated by the action of UDP-glucose:flavonoid (or anthocyanidin) 3GT.
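The reaction sequence described in the two preceding paragraphs can be summarized as a small lookup structure. The following is a minimal Python sketch and not part of the disclosure: the names `STEPS` and `route` are hypothetical, only the unhydroxylated (pelargonidin) branch is encoded, and the intermediate name "leucopelargonidin" is standard nomenclature for the leucoanthocyanidin derived from dihydrokaempferol rather than a term used in the text.

```python
# Each entry is (substrate, enzyme, product), following the pathway text.
# Only the pelargonidin branch is modeled; F3'H and F3'5'H are omitted.
STEPS = [
    ("phenylalanine", "PAL", "cinnamic acid"),
    ("cinnamic acid", "C4H", "p-coumaric acid"),
    ("tyrosine", "TAL", "p-coumaric acid"),
    ("p-coumaric acid", "4CL", "p-coumaroyl-CoA"),
    ("p-coumaroyl-CoA", "CHS", "naringenin chalcone"),
    ("naringenin chalcone", "CHI", "(2S)-naringenin"),
    ("(2S)-naringenin", "F3H", "dihydrokaempferol"),
    ("dihydrokaempferol", "DFR", "leucopelargonidin"),  # name assumed, see lead-in
    ("leucopelargonidin", "ANS", "pelargonidin"),
    ("pelargonidin", "A3GT", "pelargonidin-3-O-glucoside"),
]


def route(start, target):
    """Return the ordered list of enzymes that convert `start` into `target`."""
    next_step = {substrate: (enzyme, product) for substrate, enzyme, product in STEPS}
    enzymes, current = [], start
    while current != target:
        if current not in next_step:
            raise ValueError(f"no step leads from {current!r} toward {target!r}")
        enzyme, current = next_step[current]
        enzymes.append(enzyme)
    return enzymes


if __name__ == "__main__":
    print(route("tyrosine", "pelargonidin-3-O-glucoside"))
    print(route("phenylalanine", "p-coumaric acid"))
```

Starting from tyrosine the route needs TAL only, whereas starting from phenylalanine it needs both PAL and C4H before the two entry points converge on p-coumaric acid.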

[0005] In yeast (e.g., S. cerevisiae), some of the same molecules (flavanones, flavones, and flavonols) have been made from phenylpropanoids. In addition, a few examples have been reported of production of flavonoids from sugar, e.g., naringenin (Koopman et al. 2012) and various flavanones and flavonols (Naesby 2009). However, production of anthocyanins has never been reported.

[0006] Therefore, new approaches are required for producing anthocyanins via heterologous biosynthetic pathways in microbes.

SUMMARY OF THE INVENTION

[0007] It is against the above background that the present invention provides certain advantages and advancements over the prior art. Set forth herein are methods developed by selection of highly active heterologous genes, and by balancing the expression thereof, that produce anthocyanins from glucose in a microorganism host cell. Specifically provided herein are operative metabolic pathways for producing anthocyanins from glucose or other simple sugars.

[0008] In a first aspect, the invention provides a microorganism including an operative metabolic pathway capable of producing an anthocyanin from glucose. The operative metabolic pathway includes at least a 4-coumaric acid-CoA ligase (4CL), a chalcone synthase (CHS), a flavanone 3-hydroxylase (F3H), a dihydroflavonol-4-reductase (DFR), an anthocyanidin synthase (ANS), an anthocyanidin 3-O-glycosyltransferase (A3GT), a chalcone isomerase (CHI), and at least one of a) a tyrosine ammonia lyase; or b) a phenylalanine ammonia lyase (PAL) and a trans-cinnamate 4-monooxygenase (C4H). At least one enzyme of the operative metabolic pathway is encoded by a gene heterologous to the microorganism. In particular embodiments, the anthocyanin is produced in a ratio of at least 1:1 to its anthocyanidin precursor by the operative metabolic pathway.
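The enzyme requirement of the first aspect is a simple logical condition: seven core enzymes plus either TAL, or PAL together with C4H. A minimal Python sketch of that condition follows; the names `CORE` and `has_minimal_pathway` are hypothetical, and the separate requirement that at least one enzyme be encoded by a heterologous gene is deliberately not modeled.

```python
# Core enzymes listed in the first aspect (abbreviations as used in the text).
CORE = {"4CL", "CHS", "F3H", "DFR", "ANS", "A3GT", "CHI"}


def has_minimal_pathway(enzymes):
    """True if `enzymes` covers the core set plus TAL, or PAL together with C4H.

    Illustrative only: heterologous origin of the genes is not checked.
    """
    present = set(enzymes)
    entry_ok = "TAL" in present or {"PAL", "C4H"} <= present
    return CORE <= present and entry_ok


if __name__ == "__main__":
    print(has_minimal_pathway(CORE | {"TAL"}))         # TAL entry route
    print(has_minimal_pathway(CORE | {"PAL", "C4H"}))  # PAL + C4H entry route
    print(has_minimal_pathway(CORE | {"PAL"}))         # PAL without C4H
```

A strain carrying PAL without C4H fails the check because cinnamic acid could not be hydroxylated to p-coumaric acid.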

[0009] In a second aspect, the invention provides a fermentation vessel including a microorganism having an operative metabolic pathway producing an anthocyanin from glucose. The operative metabolic pathway includes a 4-coumaric acid-CoA ligase (4CL), a chalcone synthase (CHS), a flavanone 3-hydroxylase (F3H), a dihydroflavonol-4-reductase (DFR), an anthocyanidin synthase (ANS), an anthocyanidin 3-O-glycosyltransferase (A3GT), a chalcone isomerase (CHI), and a tyrosine ammonia lyase or a phenylalanine ammonia lyase (PAL) and a trans-cinnamate 4-monooxygenase (C4H), wherein at least one enzyme of the operative metabolic pathway is encoded by a gene heterologous to the microorganism.

[0010] In a third aspect, the invention provides a microorganism including an operative metabolic pathway producing an anthocyanin from glucose. The operative metabolic pathway includes a 4-coumaric acid-CoA ligase (4CL) encoded by the nucleic acid sequence set forth in SEQ ID NO: 1, a chalcone synthase (CHS) encoded by the nucleic acid sequence set forth in SEQ ID NO: 21, a flavanone 3-hydroxylase (F3H) encoded by the nucleic acid sequence set forth in SEQ ID NO: 3, a dihydroflavonol-4-reductase (DFR) encoded by the nucleic acid sequence set forth in SEQ ID NO: 5 or SEQ ID NO: 7, an anthocyanidin synthase (ANS) encoded by the nucleic acid sequence set forth in SEQ ID NO: 9, an anthocyanidin 3-O-glycosyltransferase (A3GT) encoded by the nucleic acid sequence set forth in SEQ ID NO: 11, a chalcone isomerase (CHI) encoded by the nucleic acid sequence set forth in SEQ ID NO: 13, and at least one of a) a tyrosine ammonia lyase (TAL) encoded by the nucleic acid sequence set forth in SEQ ID NO: 15 or b) a phenylalanine ammonia lyase (PAL) encoded by the nucleic acid sequence set forth in SEQ ID NO: 17 and a trans-cinnamate 4-monooxygenase (C4H) encoded by the nucleic acid sequence set forth in SEQ ID NO: 19.

[0011] In a fourth aspect, a microorganism includes an operative metabolic pathway capable of producing an anthocyanin from a simple sugar. The operative metabolic pathway includes a 4-coumaric acid-CoA ligase (4CL), a chalcone synthase (CHS), a flavanone 3-hydroxylase (F3H), a dihydroflavonol-4-reductase (DFR), an anthocyanidin synthase (ANS), an anthocyanidin 3-O-glycosyltransferase (A3GT), a chalcone isomerase (CHI), at least one of a) a tyrosine ammonia lyase (TAL) or b) a phenylalanine ammonia lyase (PAL) and a trans-cinnamate 4-monooxygenase (C4H), and an anthocyanin-5-O-glycosyl transferase (A5GT), an anthocyanin-3-O-aromatic acyl transferase (A3AAT), or an anthocyanin-3-O-malonyl acyl transferase (A3MAT). At least one enzyme of the operative metabolic pathway is encoded by a gene heterologous to the microorganism. In one embodiment, the anthocyanin is pelargonidin-3,5-O-diglucoside, cyanidin-3,5-O-diglucoside, delphinidin-3,5-O-diglucoside, pelargonidin-3-O-coumaroyl-glucoside, pelargonidin-3-O-coumaroyl glucoside-5-O-glucoside, pelargonidin-3-O-malonyl glucoside, or pelargonidin-3-O-malonyl glucoside-5-O-glucoside.

[0012] In a fifth aspect, a method of producing an anthocyanin includes the steps of a) culturing a microorganism comprising an operative metabolic pathway producing an anthocyanin from a simple sugar, the operative metabolic pathway comprising: a 4-coumaric acid-CoA ligase (4CL); a chalcone synthase (CHS); a flavanone 3-hydroxylase (F3H); a dihydroflavonol-4-reductase (DFR); an anthocyanidin synthase (ANS); an anthocyanidin 3-O-glycosyltransferase (A3GT); a chalcone isomerase (CHI); at least one of a) a tyrosine ammonia lyase (TAL) or b) a phenylalanine ammonia lyase (PAL) and a trans-cinnamate 4-monooxygenase (C4H), and an anthocyanin-5-O-glycosyl transferase (A5GT), an anthocyanin-3-O-aromatic acyl transferase (A3AAT), or an anthocyanin-3-O-malonyl acyl transferase (A3MAT), wherein at least one enzyme of the operative metabolic pathway is encoded by a gene heterologous to the microorganism, b) producing an anthocyanin by the microorganism, and c) optionally isolating the anthocyanin. In one embodiment, the anthocyanin is pelargonidin-3,5-O-diglucoside, cyanidin-3,5-O-diglucoside, delphinidin-3,5-O-diglucoside, pelargonidin-3-O-coumaroyl-glucoside, pelargonidin-3-O-coumaroyl glucoside-5-O-glucoside, pelargonidin-3-O-malonyl glucoside, or pelargonidin-3-O-malonyl glucoside-5-O-glucoside.

[0013] These and other features and advantages of the present invention will be more fully understood from the following detailed description of the invention taken together with the accompanying claims. It is noted that the scope of the claims is defined by the recitations therein and not by the specific discussion of features and advantages set forth in the present description.

DESCRIPTION OF DRAWINGS

[0014] FIG. 1. Anthocyanin biosynthetic pathway overview.

[0015] FIGS. 2(a) and 2(b). FIG. 2(a) depicts DNA fragments used for assembling, by in vivo homologous recombination, the plasmid shown in FIG. 2(b). Each DNA fragment is amplified in a bacterial vector from which it is released by a restriction enzyme digest (only the released fragments are shown). The DNA fragments contain elements for stable maintenance and replication in yeast, or they contain a yeast expression cassette (promoter-gene coding sequence-terminator) for expressing one of the genes of the desired biosynthetic pathway. Finally, one fragment contains the tags necessary for closing the circle. All fragments have so-called HRTs (Homologous Recombination Tags) at the ends, where the 3'-end of one fragment is identical to the 5'-end of the next fragment, etc. When introduced into yeast, the repair mechanism of this host will assemble the fragments into the full plasmid shown in FIG. 2(b).
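The tag-matching rule in this paragraph (the 3' tag of one fragment is identical to the 5' tag of the next, and a final fragment closes the circle) can be illustrated with a short ordering routine. This is a minimal Python sketch under stated assumptions: fragments are reduced to a pair of tag labels, and the function name `assembly_order`, the fragment names, and the tag labels such as `HRT_A` are all hypothetical. It models only the ordering logic, not the recombination itself.

```python
def assembly_order(fragments):
    """Order fragments so each 3' tag matches the next fragment's 5' tag.

    `fragments` maps a fragment name to a (five_prime_tag, three_prime_tag) pair.
    Returns the circular order starting from the first key, or raises ValueError
    if the fragments cannot be joined into one closed circle.
    """
    if not fragments:
        raise ValueError("no fragments given")
    by_five_prime = {tags[0]: name for name, tags in fragments.items()}
    if len(by_five_prime) != len(fragments):
        raise ValueError("5' tags must be unique")
    start = next(iter(fragments))
    order, current = [], start
    while True:
        order.append(current)
        three_prime = fragments[current][1]
        if three_prime not in by_five_prime:
            raise ValueError(f"no fragment starts with tag {three_prime!r}")
        current = by_five_prime[three_prime]
        if current == start:
            break  # circle closed
        if current in order:
            raise ValueError("fragments do not form a single circle")
    if len(order) != len(fragments):
        raise ValueError("some fragments are left out of the circle")
    return order


if __name__ == "__main__":
    # Hypothetical fragments: names and tag labels are illustrative only.
    example = {
        "marker": ("HRT_A", "HRT_B"),
        "cassette_4CL": ("HRT_C", "HRT_D"),
        "origin": ("HRT_B", "HRT_C"),
        "closing": ("HRT_D", "HRT_A"),
    }
    print(assembly_order(example))
```

Because every junction is defined by a unique tag, the fragments can be supplied in any order and still resolve to a single circular arrangement.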

[0016] FIG. 3 depicts DNA fragments used for assembling and integrating, by in vivo homologous recombination, the expression cassettes (as described in FIGS. 2(a) and 2(b)) for assembly of a desired biosynthetic pathway. Instead of sequences for plasmid replication, the first and the last fragment have sequences (Integration Tags) which are homologous to the integration site in the host genome.

[0017] FIG. 4. Chromatogram of the anthocyanidin pelargonidin detected by LC/MS.

[0018] FIG. 5. Chromatogram of anthocyanin pelargonidin-3-O-glucoside (P3G) detected by LC/MS.

[0019] FIG. 6. Chromatogram of pelargonidin-3,5-O-diglucoside detected by LC/MS.

[0020] FIG. 7. Chromatogram of the cyanidin detected by LC/MS.

[0021] FIG. 8. Chromatogram of cyanidin-3-O-glucoside (C3G) detected by LC/MS.

[0022] FIG. 9. Chromatogram of cyanidin-3,5-O-diglucoside detected by LC/MS.

[0023] FIG. 10. Chromatogram of the delphinidin detected by LC/MS.

[0024] FIG. 11. Chromatogram of the delphinidin-3-O-glucoside detected by LC/MS.

[0025] FIG. 12. Chromatogram of delphinidin-3,5-O-diglucoside detected by LC/MS.

[0026] FIG. 13. Chromatogram of the pelargonidin-3-O-coumaroyl-glucoside detected by LC/MS.

[0027] FIG. 14. Chromatogram of the pelargonidin-3-O-coumaroyl-glucoside-5-O-glucoside detected by LC/MS.

[0028] FIG. 15. Chromatogram of the pelargonidin-3-O-malonyl-glucoside detected by LC/MS.

[0029] FIG. 16. Chromatogram of the pelargonidin-3-O-malonyl-glucoside-5-O-glucoside detected by LC/MS.

[0030] FIG. 17. A photograph of methanol extracted P3G producing cells. Cell samples were adjusted to pH 2 with HCl. Cells in the left tube contain the full P3G pathway, and as can be seen, express the P3G molecule. The cells in the right tube contain the full P3G pathway but lack DFR, and therefore, have no color.

[0031] FIG. 18. A photograph of methanol extracted P3G producing cells. Cell samples were pH adjusted with HCl to a pH of < 2 (left tube = a first shade), ~ 5 (center tube = no color), or about 10 (right tube = a second shade).

DETAILED DESCRIPTION

[0032] All publications, patents and patent applications cited herein are hereby expressly incorporated by reference in their entirety for all purposes.

[0033] Before describing the present invention in detail, a number of terms will be defined. As used herein, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. For example, reference to "a compound" means one or more compounds.

[0034] It is noted that terms like "preferably," "commonly," and "typically" are not utilized herein to limit the scope of the claimed invention or to imply that certain features are critical, essential, or even important to the structure or function of the claimed invention. Rather, these terms are merely intended to highlight alternative or additional features that can or cannot be utilized in a particular embodiment of the present invention.

[0035] For the purposes of describing and defining the present invention it is noted that the term "substantially" is utilized herein to represent the inherent degree of uncertainty that can be attributed to any quantitative comparison, value, measurement, or other representation. The term "substantially" is also utilized herein to represent the degree by which a quantitative representation can vary from a stated reference without resulting in a change in the basic function of the subject matter at issue.

[0036] As used herein, the term "about" refers to ±10% of a given value unless otherwise specified.

[0037] As used herein, the terms "or" and "and/or" are utilized to describe multiple components in combination or exclusive of one another. For example, "x, y, and/or z" can refer to "x" alone, "y" alone, "z" alone, "x, y, and z," "(x and y) or z," "x or (y and z)," or "x or y or z."
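As a worked illustration of the ±10% definition of "about" given in paragraph [0036], the range implied by a stated value can be computed directly. This is a minimal Python sketch; the function name `about_range` and its `tolerance` parameter are hypothetical and are not terms used in the disclosure.

```python
def about_range(value, tolerance=0.10):
    """Return the (low, high) bounds implied by 'about `value`' at ±10%."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


if __name__ == "__main__":
    low, high = about_range(10)  # e.g. the "about 10" pH of FIG. 18
    print(f"about 10 spans roughly {low:.1f} to {high:.1f}")
```

Using the absolute value for the margin keeps the lower bound below the upper bound even when the stated value is negative.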

[0038] Methods well known to those skilled in the art can be used to construct genetic expression constructs and recombinant cells according to this invention. These methods include in vitro recombinant DNA techniques, synthetic techniques, in vivo recombination techniques, and polymerase chain reaction (PCR) techniques. See, for example, techniques as described in Green & Sambrook, 2012, MOLECULAR CLONING: A LABORATORY MANUAL, Fourth Edition, Cold Spring Harbor Laboratory, New York; Ausubel et al., 1989, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, Greene Publishing Associates and Wiley Interscience, New York, and PCR Protocols: A Guide to Methods and Applications (Innis et al., 1990, Academic Press, San Diego, CA).

[0039] As used herein, the terms "polynucleotide," "nucleotide," "oligonucleotide," and "nucleic acid" can be used interchangeably to refer to nucleic acid comprising DNA, RNA, derivatives thereof, or combinations thereof.

[0040] As used herein, the terms "microorganism," "microorganism host," "microorganism host cell," "recombinant host," and "recombinant host cell" can be used interchangeably. As used herein, the term "recombinant host" is intended to refer to a host, the genome of which has been augmented by at least one DNA sequence. Such DNA sequences include but are not limited to genes that are not naturally present, DNA sequences that are not normally transcribed into RNA or translated into a protein ("expressed"), and other genes or DNA sequences which one desires to introduce into the non-recombinant host. It will be appreciated that typically the genome of a recombinant host described herein is augmented through stable introduction of one or more recombinant genes that may be inserted into the host genome and/or by way of an episomal vector (e.g., plasmid, YAC, etc.). Generally, introduced DNA is not originally resident in the host that is the recipient of the DNA, but it is within the scope of this disclosure to isolate a DNA segment from a given host, and to subsequently introduce one or more additional copies of that DNA into the same host, e.g., to enhance production of the product of a gene or alter the expression pattern of a gene. In some instances, the introduced DNA will modify or even replace an endogenous gene or DNA sequence by, e.g., homologous recombination or site-directed mutagenesis. Suitable recombinant hosts include microorganisms.

[0041] As used herein, the term "recombinant gene" refers to a gene or DNA sequence that is introduced into a recipient host, regardless of whether the same or a similar gene or DNA sequence may already be present in such a host. "Introduced," or "augmented" in this context, is known in the art to mean introduced or augmented by the hand of man. Thus, a recombinant gene can be a DNA sequence from another species, or can be a DNA sequence that originated from or is present in the same species, but has been incorporated into a host by recombinant methods to form a recombinant host. It will be appreciated that a recombinant gene that is introduced into a host can be identical to a DNA sequence that is normally present in the host being transformed. For any recombinant gene, one or more additional copies of the DNA can be introduced, to thereby permit overexpression or modified expression of the gene product of that DNA. Said recombinant genes are particularly encoded by cDNA.

[0042] As used herein, the terms "codon optimization" and "codon optimized" refer to a technique to maximize protein expression in fast-growing microorganisms such as E. coli or S. cerevisiae by increasing the translation efficiency of a particular gene. Codon optimization can be achieved, for example, by converting a nucleotide sequence of one species into a genetic sequence which better reflects the translation machinery of a different, host species. Optimal codons help to achieve faster translation rates and high accuracy.
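The idea of converting a coding sequence into one that better reflects the host's translation machinery can be illustrated by swapping each codon for a preferred synonymous codon. The following is a minimal Python sketch and not the method of the disclosure: the names `SYNONYMS`, `PREFERRED`, and `recode` are hypothetical, only five amino-acid/stop groups of the standard genetic code are included for brevity, and the "preferred" codons are placeholder choices rather than measured codon usage for any host.

```python
# Synonymous codon groups for a few entries of the standard genetic code.
SYNONYMS = {
    "M": ["ATG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "K": ["AAA", "AAG"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "*": ["TAA", "TAG", "TGA"],
}

# Placeholder host preferences (assumption): real values would be taken
# from a codon usage table for the chosen host organism.
PREFERRED = {"M": "ATG", "A": "GCT", "K": "AAG", "G": "GGT", "*": "TAA"}

CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}


def translate(cds):
    """Translate a coding sequence using the partial table above."""
    return "".join(CODON_TO_AA[cds[i:i + 3].upper()] for i in range(0, len(cds), 3))


def recode(cds):
    """Swap each codon for the preferred synonymous codon; the protein is unchanged."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(PREFERRED[aa] for aa in translate(cds))


if __name__ == "__main__":
    original = "ATGGCGAAAGGGTGA"
    optimized = recode(original)
    print(original, "->", optimized)
    print(translate(original) == translate(optimized))
```

The recoded sequence differs at the nucleotide level but translates to the same amino acid sequence, which is the property any codon optimization scheme has to preserve. Codons outside the partial table raise a `KeyError` in this sketch.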

[0043] As used herein, the term "engineered biosynthetic pathway" or "operative metabolic pathway" refers to a biosynthetic pathway that occurs in a recombinant host, as described herein, and does not naturally occur in the host. Further, an "engineered microorganism" refers to a recombinant host that contains an engineered biosynthetic pathway or operative metabolic pathway.

[0044] As used herein, the terms "heterologous sequence," "heterologous coding sequence," and "heterologous gene" are used to describe a sequence or gene derived from a species other than the recombinant host. For example, if the recombinant host is an S. cerevisiae cell, then the cell would include a heterologous sequence derived from an organism other than S. cerevisiae. A heterologous coding sequence or gene, for example, can be from a prokaryotic microorganism, a eukaryotic microorganism, a plant, an animal, an insect, or a fungus different than the recombinant host expressing the heterologous sequence.

[0045] As used herein, "highly efficient enzyme" refers to an enzyme that when expressed in a recombinant host exhibits a rate of enzymatic catalysis more efficient than a second enzyme (e.g., a functional homolog or another embodiment of the first enzyme) expressed in the same host under the same conditions and that catalyzes the same reaction as the highly efficient enzyme. For example, the highly efficient enzyme and second enzyme could both be glycosyltransferases but from different species. By way of illustration, said highly efficient enzyme would have an enzymatic activity that is two-fold, or four-fold, or ten-fold, or twenty-fold, or one hundred-fold, or one thousandfold higher than said second heterologous enzyme.

[0046] As used herein, "functional homolog" refers to a polypeptide that has sequence similarity to a reference polypeptide, and that carries out one or more of the biochemical or physiological function(s) of the reference polypeptide. A functional homolog and the reference polypeptide can be naturally occurring polypeptides, and the sequence similarity can be due to convergent or divergent evolutionary events. As such, functional homologs are sometimes designated in the literature as homologs, or orthologs, or paralogs. Variants of a naturally occurring functional homolog, such as polypeptides encoded by mutants of a wild type coding sequence, can themselves be functional homologs. Functional homologs can also be created via site-directed mutagenesis of the coding sequence for a polypeptide, or by combining domains from the coding sequences for different naturally-occurring polypeptides ("domain swapping"). Techniques for modifying genes encoding functional polypeptides described herein are known and include, inter alia, directed evolution techniques, site-directed mutagenesis techniques and random mutagenesis techniques, and can be useful to increase specific activity of a polypeptide, alter substrate specificity, alter expression levels, alter subcellular location, or modify polypeptide-polypeptide interactions in a desired manner. Such modified polypeptides are considered functional homologs. The term "functional homolog" is sometimes applied to the nucleic acid that encodes a functionally homologous polypeptide.

[0047] As used herein, "optimal conditions," in reference to an enzyme, refers to reaction conditions in which an expressed enzyme is able to operate at its maximum efficiency. For example, an enzyme of a biosynthetic pathway operating under optimal conditions would have a non-rate-limiting supply of substrate for its reaction step. Further, the enzyme would have little to no feedback inhibition caused by, for example, an overabundance of product accumulation downstream of the enzyme in the biosynthetic pathway.

[0048] Also, as used herein "optimal conditions," in reference to a biosynthetic pathway, refers to a biosynthetic pathway in which each enzyme is operating under optimal conditions for a given host taking into account side-reactions that sap initial substrates and intermediates between enzymes of the pathway.

[0049] In one embodiment, optimal conditions for a biosynthetic pathway may be achieved by balancing the rate of a single catalytic step or the rate of flow through a single step of the pathway. In another embodiment, optimal conditions for a biosynthetic pathway may be achieved by balancing the rate of two or more catalytic steps or the rates of flow through two or more steps of the pathway. For example, if substrate availability and intermediate accumulation are non-limiting, then pathway flow rate may be optimized by choosing highly efficient enzymes. Where less efficient enzymes are used, the resultant decreased flow rate may be compensated for by increasing their expression levels to provide a greater number of molecules of the less efficient enzyme and thereby increase overall flow volume. This may be achieved, for example, by pairing a gene promoter with a high rate of gene expression (e.g., 2X expression rate) with a relatively less efficient enzyme, and a gene promoter with a lower rate of gene expression (e.g., 1X expression rate) with a relatively more efficient enzyme. As a result, on average, the flow through the step catalyzed by the less efficient, but more abundant enzyme and that catalyzed by the more efficient, but less abundant enzyme can be balanced or made relatively equal. Such an approach may be used to "balance" biosynthetic pathways having multiple enzymes with varying levels of efficiency relative to one another by choosing the appropriate promoter/gene combination that results in an equivalent level of catalytic activity for each step. Another approach is to integrate multiple gene copies encoding a less efficient enzyme into the genome of the host cell to increase the expression levels of the less efficient enzyme.
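The balancing idea can be sketched numerically. The Python example below assumes a deliberately simple model that is not taken from this disclosure: with non-limiting substrate, the flow through a step is taken to be proportional to relative enzyme efficiency multiplied by relative promoter strength and by gene copy number. The class `Step` and the function `copies_needed` are hypothetical names used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    efficiency: float  # relative catalytic efficiency per enzyme molecule
    promoter: float    # relative expression rate (1.0 corresponds to "1X")
    copies: int = 1    # integrated gene copies

    def flux(self) -> float:
        # Assumed model: flow scales with enzyme abundance times efficiency.
        return self.efficiency * self.promoter * self.copies


def copies_needed(step: Step, target_flux: float) -> int:
    """Smallest copy number at which the step reaches the target flow."""
    return math.ceil(target_flux / (step.efficiency * step.promoter))


# A half-as-efficient enzyme on a 2X promoter matches
# a fully efficient enzyme on a 1X promoter.
slow = Step("less efficient enzyme", efficiency=0.5, promoter=2.0)
fast = Step("more efficient enzyme", efficiency=1.0, promoter=1.0)
print(slow.flux(), fast.flux())

# Alternative route: keep a 1X promoter and add gene copies instead.
weak = Step("quarter-efficiency enzyme", efficiency=0.25, promoter=1.0)
print(copies_needed(weak, target_flux=fast.flux()))
```

In this toy model the promoter route and the copy-number route are interchangeable. In a real host, expression does not necessarily scale linearly with either, so the numbers serve only as a starting point for experimental tuning.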

[0050] A recombinant gene encoding a polypeptide described herein comprises the coding sequence for that polypeptide, operably linked in sense orientation to one or more regulatory regions suitable for expressing the polypeptide. Because many microorganisms, particularly prokaryotes, are capable of expressing multiple gene products from a polycistronic mRNA, multiple polypeptides can be expressed under the control of a single regulatory region for those microorganisms, if desired. A coding sequence and a regulatory region are considered to be operably-linked when the regulatory region and coding sequence are positioned so that the regulatory region is effective for regulating transcription or translation of the sequence.

[0051] In many cases, the coding sequence for a polypeptide described herein is identified in a species other than the recombinant host, i.e., is a heterologous nucleic acid. Thus, if the recombinant host is a microorganism, the coding sequence can be from other prokaryotic or eukaryotic microorganisms, from plants or from animals. In some cases, however, the coding sequence is a sequence that is native to the host and is being reintroduced into that organism. A native sequence can often be distinguished from the naturally occurring sequence by the presence of non-natural sequences linked to the exogenous nucleic acid, e.g., non-native regulatory sequences flanking a native sequence in a recombinant nucleic acid construct. In addition, stably transformed exogenous nucleic acids typically are integrated at positions other than the position where the native sequence is found. "Regulatory region" refers to a nucleic acid having nucleotide sequences that influence transcription or translation initiation and rate, and stability and/or mobility of a transcription or translation product. Regulatory regions include, without limitation, promoter sequences, enhancer sequences, response elements, protein recognition sites, inducible elements, protein binding sequences, 5' and 3' untranslated regions (UTRs), transcriptional start sites, termination sequences, polyadenylation sequences, introns, and combinations thereof. A regulatory region typically comprises at least a core (basal) promoter. A regulatory region also can include at least one control element, such as an enhancer sequence, an upstream element or an upstream activation region (UAR). A regulatory region is operably linked to a coding sequence by positioning the regulatory region and the coding sequence so that the regulatory region is effective for regulating transcription or translation of the sequence. A regulatory region can, however, be positioned as much as about 5,000 nucleotides upstream of the translation initiation site or about 2,000 nucleotides upstream of the transcription start site.

[0052] The choice of regulatory regions to be included depends upon several factors, including, but not limited to, efficiency, selectability, inducibility, desired expression level, and preferential expression during certain culture stages. It is a routine matter for one of skill in the art to modulate the expression of a coding sequence by appropriately selecting and positioning regulatory regions relative to the coding sequence. It will be understood that more than one regulatory region can be present, e.g., introns, enhancers, upstream activation regions, transcription terminators, and inducible elements.

[0053] As used herein, the term "detectable concentration" refers to a level of anthocyanin measured in mg/L, nM, μM, or mM. Anthocyanin production can be detected and/or analyzed by techniques generally available to one skilled in the art, for example, but not limited to, thin layer chromatography (TLC), high-performance liquid chromatography (HPLC), ultraviolet-visible spectroscopy/spectrophotometry (UV-Vis), mass spectrometry (MS), and nuclear magnetic resonance spectroscopy (NMR).
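Since concentrations may be reported either by mass (mg/L) or by amount (nM, μM, mM), a conversion between the two is often needed when comparing results. The short Python sketch below shows that arithmetic. The function names are hypothetical, and the molar mass used in the example (about 449.4 g/mol, an approximate value assumed here for the cyanidin 3-O-glucoside cation) is an illustrative assumption rather than a figure from this disclosure.

```python
def mg_per_l_to_micromolar(mg_per_l: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass concentration (mg/L) to a molar concentration (uM)."""
    if molar_mass_g_per_mol <= 0:
        raise ValueError("molar mass must be positive")
    # (mg/L) / (g/mol) gives mmol/L; multiplying by 1000 gives umol/L.
    return mg_per_l * 1000.0 / molar_mass_g_per_mol


def micromolar_to_mg_per_l(micromolar: float, molar_mass_g_per_mol: float) -> float:
    """Convert a molar concentration (uM) back to mg/L."""
    return micromolar * molar_mass_g_per_mol / 1000.0


# Assumed approximate molar mass, for illustration only.
ASSUMED_MOLAR_MASS = 449.4  # g/mol

titer_um = mg_per_l_to_micromolar(10.0, ASSUMED_MOLAR_MASS)
print(f"10 mg/L is roughly {titer_um:.1f} uM at the assumed molar mass")
```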

[0054] Anthocyanins

[0055] Anthocyanins are multi-glycosylated anthocyanidins, which, in turn, are derived from flavonoids such as naringenin. The anthocyanins are often further acylated in a process where moieties from aromatic or non-aromatic acids are transferred to hydroxyl groups of the anthocyanin-resident sugars. The aromatic acylation of anthocyanins increases stability and shifts their color.

[0056] Anthocyanins are pigments, which naturally appear red, purple, or blue. Frequently, the color of anthocyanins is dependent on pH. Anthocyanins are naturally found in flowers, where they provide bright-red and -purple colors. Anthocyanins are also found in vegetables and fruits. Anthocyanins are useful as dyes or coloring agents, and furthermore, anthocyanins have caught attention for their antioxidant properties.

[0057] There could be any number of reasons for the observed lack of previous demonstration of anthocyanin production from sugar in unicellular organisms. For instance, in E. coli, one impediment could have been a lack of sufficient precursors such as UDP-sugar, and malonyl-CoA, as well as the amino acids phenylalanine and tyrosine. In addition, expression of plant monooxygenases (CYP450s) in bacteria is a recognized challenge, because these enzymes depend on cofactors such as NAD(P)H dependent reductases, as well as co-localization to the ER membrane. In yeast, however, precursors and co-factors are relatively abundant, and most plant enzymes can readily be expressed. Yet, the art contained a surprising lack of attempts or examples for producing anthocyanins in yeast.

[0058] In addition, some of the later intermediates in the anthocyanin biosynthetic pathway, in particular leucoanthocyanins and anthocyanidins, are relatively unstable at physiological pH. In plants, this instability is thought to be circumvented by channeling these intermediates between enzymes that form close associations or aggregates in the cytosol, possibly anchored on the ER surface. It is not known whether this channeling takes place between enzymes heterologously expressed in bacteria and yeast. An attempt at channeling was made by Yan 2005 with some success by fusing the anthocyanidin synthase (ANS) and anthocyanidin 3-O-glycosyltransferase (A3GT) enzymes, but it was later suggested that the more important factor is to have efficient expression of A3GT (Lim 2015).

[0059] Another issue that has hampered heterologous expression is the promiscuity of several enzymes regarding substrate specificity, and the ability of such enzymes to catalyze more than one reaction. This is particularly the case with a group of 2-oxoglutarate dependent dioxygenases (2ODDs) including flavanone 3-hydroxylase (F3H) and ANS. ANS has very high similarity to flavonol synthase (FLS) and has been shown to catalyze many of the same reactions normally associated with FLS and flavonol synthesis. Hence, after expression of biosynthetic pathways directed to anthocyanin production, the result has been high amounts of flavonols (both aglycones and their 3-O-glycosides). Several ANS enzymes have been tested with similar results, and this has hampered production of anthocyanins from their precursors, e.g., flavanones and dihydroflavonols. It is also likely to be one of the major reasons why anthocyanin production from glucose has not been previously demonstrated in bacteria and yeast.

[0060] Further, heterologous compound production via heterologous biosynthetic pathways often faces competition from host enzymes capable of degrading or modifying intermediates, or otherwise shunting them away from the main pathway. In yeast, this includes degradation of phenylpropanoids, as well as cleavage of the final glucoside to revert anthocyanins to the unstable anthocyanidins. Such issues are further exacerbated when the heterologous synthetic pathways compete with host metabolism for primary substrates, such as glucose.

[0061] Despite these previous challenges, this invention demonstrates that unexpectedly, it is possible to produce anthocyanins from simple sugars, such as glucose, or other simple carbon sources such as glycerol, ethanol, or easily fermentable raw materials in microorganisms such as yeast, by careful selection and expression of highly efficient heterologous enzymes.

[0062] In one embodiment, the invention discloses a recombinant host cell including an operative metabolic pathway capable of producing an anthocyanidin of the formula I:

wherein

R1 is selected from the group consisting of -H, -OH and -OCH3; and

R2 is selected from the group consisting of -H and -OH; and

R3 is selected from the group consisting of -H, -OH and -OCH3; and

R4 is selected from the group consisting of -H and -OH; and

R5 is selected from the group consisting of -OH and -OCH3; and

R6 is selected from the group consisting of -H and -OH; and

R7 is selected from the group consisting of -OH and -OCH3.

[0015] In certain aspects, the anthocyanidin is selected from the group consisting of aurantinidin, cyanidin, delphinidin, europinidin, luteolinidin, pelargonidin, malvidin, peonidin, petunidin and rosinidin.

[0063] In one embodiment, a recombinant host cell is provided that is genetically engineered to include an operative metabolic pathway for producing anthocyanins from glucose. In another embodiment, a microorganism is provided that is engineered to include an operative metabolic pathway for producing anthocyanins including only heterologous genes in the operative metabolic pathway. For example, in the case of a yeast host, the operative metabolic pathway may include genes from plants, archaea, bacteria, animals, and other fungi. In one embodiment, each of the heterologous genes in the operative metabolic pathway is from one or more plants.

[0064] In another embodiment, a recombinant host cell is provided that includes one or more heterologous nucleic acid molecules that encode enzymes of the aurantinidin, cyanidin, delphinidin, europinidin, luteolinidin, pelargonidin, malvidin, peonidin, petunidin and/or rosinidin biosynthesis pathways. In certain aspects, the host cells are capable of producing cyanidin. In other aspects, the host cells comprise one or more heterologous enzyme nucleic acid molecules each encoding an enzyme of the cyanidin biosynthesis pathway.

[0065] As will be understood by a person skilled in the art, any enzyme of the anthocyanin synthetic pathway can be a target for optimization by genetic modifications, such as specific deletions, insertions, alterations, e.g., by mutagenesis, to improve both the specificity and turn-over rate of that enzyme. Moreover, while specific enzymes are disclosed herein, the skilled worker will appreciate that each disclosed enzyme represents its enzymatic function rather than only the listed enzyme and should not be considered to be limited to the particular enzyme exemplified herein by name or sequence.

[0066] In certain embodiments, the heterologous enzymes can be selected from any one or a combination of organisms. For example, organisms from which heterologous enzymes for use herein may be selected include one or more of the following genera: Petunia, Malus, Anthurium, Zea, Arabidopsis, Ammi, Glycine, Hordeum, Medicago, Populus, Fragaria, Dianthus, Saccharomyces, and the like. Representative species from these genera that may be used include Petunia x hybrida, Malus domestica, Anthurium andraeanum, Arabidopsis thaliana, Ammi majus, Hordeum vulgare, Medicago sativa, Populus trichocarpa, Fragaria x ananassa, Dianthus caryophyllus, and Saccharomyces cerevisiae.

[0067] Orthogonal enzymes from other organisms may also be substituted. Hence, there may be many options for constructing anthocyanin or catechin pathways by identifying a set of enzymes that will work well together in a given microorganism.

[0068] Host optimization to improve expression of the heterologous pathways described is also possible. This may, for example, be done in such a way as to improve the ability of the host to provide higher levels of precursor molecules, tolerate higher levels of product, or to eliminate unwanted host enzyme activity which interferes with the heterologous anthocyanin-producing pathway.

[0069] In another embodiment, enzymes that may be used herein include any enzymes involved in anthocyanidin synthesis or anthocyanin synthesis. For example, enzymes contemplated for use herein include those listed in Table No. 1 below and homologs and variants thereof, including host-specific codon optimized variants.

[0070] Table No. 1. Enzymes.

Gene     Gene product
C4H      Trans-cinnamate 4-monooxygenase
4CL      4-coumaric acid-CoA ligase
CHS      Chalcone synthase
CHI      Chalcone isomerase
F3H      Flavanone 3-hydroxylase
F3'H     Flavonoid 3'-hydroxylase
F3'5'H   Flavonoid 3'-5'-hydroxylase
FLS      Flavonol synthase
LAR      Leucoanthocyanidin reductase
TAL      Tyrosine ammonia lyase
A5GT     Anthocyanin-5-O-glycosyl transferase
A3AAT    Anthocyanin-3-O-aromatic acyl transferase
A3MAT    Anthocyanin-3-O-malonyl acyl transferase

[0071] In another embodiment, the recombinant host cell may further include anthocyanidin synthase (ANS (LDOX)), flavonol synthase (FLS), leucoanthocyanidin reductase (LAR), and anthocyanidin reductase (ANR).

[0072] In other aspects, the invention provides a recombinant host cell that is capable of producing a compound selected from the group consisting of coumaroyl-CoA, benzoyl-CoA, sinapoyl-CoA, feruloyl-CoA, malonyl-CoA, cinnamoyl-CoA, and caffeoyl-CoA. In further aspects, the recombinant host comprises one or more heterologous enzyme nucleic acid molecules each encoding an enzyme of the coumaroyl-CoA biosynthesis pathway.

[0073] In one embodiment, a recombinant host cell is provided that is capable of producing one or more anthocyanins, wherein the host cell expresses at least one anthocyanidin, and wherein the host cell includes one or more heterologous GT nucleic acid molecules and one or more heterologous AT nucleic acid molecules.

[0074] In a further embodiment, a recombinant host cell is provided that includes a glycosyltransferase that is a UDP-glucose dependent glucosyltransferase. For example, the glycosyltransferase can be a UDP-glucose dependent glucosyltransferase of family 1.

[0075] In another embodiment, a recombinant host cell is provided that includes an acyltransferase, for example, a BAHD acyltransferase.

[0076] The term "anthocyanin" as used herein refers to any anthocyanidin that has been glycosylated and/or acylated at least once. However, an anthocyanin may also have been glycosylated and/or acylated several times. Thus, in principle, an anthocyanidin may also be an anthocyanin, which has been glycosylated and/or acylated at least once.

[0077] Thus, an anthocyanin may be any of the anthocyanidins described herein, wherein the anthocyanidin is substituted with one or more selected from the group consisting of glycosyl, acyl, substituents consisting of more than one glycosyl, substituents consisting of more than one acyl and substituents consisting of one or more glycosyl(s) and one or more acyl(s).

[0078] The anthocyanidin can be substituted at any useful position. Frequently, the anthocyanidin is substituted at one or more of the following positions: the 3 position on the C-ring, the 5 position on the A-ring, the 7 position on the A-ring, the 3' position of the B-ring, the 4' position of the B-ring or the 5' position of the B-ring.

[0079] Accordingly, in one embodiment of the invention the anthocyanin is a compound of the formula I:

wherein

R1 is selected from the group consisting of -H, -OH, -OCH3 and -O-R8; and

R2 is selected from the group consisting of -H, -OH and -O-R8; and

R3 is selected from the group consisting of -H, -OH, -OCH3 and -O-R8; and

R4 is selected from the group consisting of -H, -OH and -O-R8; and

R5 is selected from the group consisting of -OH, -OCH3 and -O-R8; and

R6 is selected from the group consisting of -H and -OH; and

R7 is selected from the group consisting of -OH, -OCH3 and -O-R8; and

R8 is selected from the group consisting of glycosyl, acyl, substituents consisting of more than one glycosyl, substituents consisting of more than one acyl and substituents consisting of one or more glycosyl(s) and one or more acyl(s); and wherein at least one of R1, R2, R3, R4, R5 and R7 is -O-R8.
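The substituent rules of formula I can be written as a small validity check. The Python sketch below assumes that the seven ring substituents are labelled R1 to R7 and that the glycosyl/acyl-bearing substituent is labelled R8. The dictionary `ALLOWED` and the function `is_anthocyanin` are hypothetical names, and the example substitution patterns are abstract and do not represent any particular named compound.

```python
# Allowed values per position, as listed for formula I (anthocyanin form).
ALLOWED = {
    "R1": {"-H", "-OH", "-OCH3", "-O-R8"},
    "R2": {"-H", "-OH", "-O-R8"},
    "R3": {"-H", "-OH", "-OCH3", "-O-R8"},
    "R4": {"-H", "-OH", "-O-R8"},
    "R5": {"-OH", "-OCH3", "-O-R8"},
    "R6": {"-H", "-OH"},
    "R7": {"-OH", "-OCH3", "-O-R8"},
}

# Positions that may carry the glycosyl/acyl substituent.
R8_POSITIONS = ("R1", "R2", "R3", "R4", "R5", "R7")


def is_anthocyanin(substituents: dict) -> bool:
    """Return True if the pattern is valid and carries at least one -O-R8."""
    for position, allowed in ALLOWED.items():
        value = substituents.get(position)
        if value not in allowed:
            raise ValueError(f"{position}={value!r} is not permitted by formula I")
    return any(substituents[p] == "-O-R8" for p in R8_POSITIONS)


# Abstract example patterns (not specific named compounds).
aglycone = {"R1": "-OH", "R2": "-OH", "R3": "-H", "R4": "-H",
            "R5": "-OH", "R6": "-H", "R7": "-OH"}
substituted = dict(aglycone, R5="-O-R8")

print(is_anthocyanin(aglycone))     # no -O-R8 present: an anthocyanidin
print(is_anthocyanin(substituted))  # one -O-R8 present: an anthocyanin
```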

[0080] The acyl may be any acyl. In one embodiment, one or more acyls are selected from the group consisting of the acyl moiety of a fatty acid. In another embodiment, one or more acyls are selected from the group consisting of coumaroyl, benzoyl, sinapoyl, feruloyl, caffeoyl, malonyl and hydroxybenzoyl.

[0081] The glycoside can be any sugar residue. For example, one or more glycosides may be selected from the group consisting of glucoside, rhamnoside, xyloside, galactoside and arabinoside.

[0082] The substituent consisting of one or more glycosides can be, for example, a monosaccharide, disaccharide, or a trisaccharide. The monosaccharide can be, for example, selected from the group consisting of glucoside, rhamnoside, xyloside, galactoside and arabinoside. The disaccharide and the trisaccharide can, for example, consist of glycosides selected from the group consisting of glucoside, rhamnoside, xyloside, galactoside and arabinoside.

[0083] The substituent consisting of one or more glycosides and one or more acyl can be, for example, a monosaccharide, disaccharide or a trisaccharide substituted at one or more positions with an acyl. The substituent consisting of one or more glycosides and one or more acyl can be, for example, a monosaccharide selected from the group consisting of glucoside, rhamnoside, xyloside, galactoside and arabinoside, wherein any of the aforementioned can be substituted at one or more positions with an acyl selected from the group consisting of coumaroyl, benzoyl, sinapoyl, feruloyl, caffeoyl, malonyl and hydroxybenzoyl. The substituent consisting of one or more glycosides and one or more acyl can also be, for example, a disaccharide or a trisaccharide consisting of glycosides selected from the group consisting of glucoside, rhamnoside, xyloside, galactoside and arabinoside, wherein any of the aforementioned can be substituted at one or more positions with an acyl selected from the group consisting of coumaroyl, benzoyl, sinapoyl, feruloyl, caffeoyl, malonyl and hydroxybenzoyl.

[0084] In one embodiment, an anthocyanin can have multiple glycosylations. Such anthocyanins exhibit improved systemic bioavailability (compared to the aglycon (a non-glycosylated molecule) alone or an anthocyanin with fewer glycosylations). The sugars can be removed in the GI tract. Such multiply glycosylated anthocyanins (one or more glycosylations) also have improved aqueous solubility. The anthocyanin with no sugars or fewer sugars than when ingested can then cross through the GI wall.

[0085] The improvement of bioavailability or solubility or a combination thereof can be 2, 5, 10, 50, 100, 200 or more fold.

[0086] Sugars can be added to the anthocyanin by an enzyme or by a metabolic process within a cell. The sugars can be any sugar, for example, glucose, galactose, lactose, fructose, maltose, and can be added to more than one site on the anthocyanin. There can be more than one sugar per site, or 2, 3, 4, 5, or more sugars per site. The anthocyanin can first be derivatized with a functional group (using e.g. a P450 or other enzyme) that the sugar is subsequently added to.

[0087] Co-pigmentation can affect stability, color, and hue. This can be an intramolecular interaction e.g. of the acyl group with the rest of the anthocyanin molecule or intermolecular interactions with other molecules in solution. The effect of acyl group variation protects intramolecular but not intermolecular co-pigmentation.

[0088] For processing, formulation and storage of products containing anthocyanins, stabilization of the intact anthocyanin is desired. However, in vivo therapeutic effects of anthocyanins can be due to one or more of native anthocyanin, degradation products, metabolites or anthocyanin derivatives. Notably, the amount of native anthocyanin in plasma has been quoted as less than 1% of the consumed quantities. This has been considered to be due to limited intestinal absorption, high rates of cellular uptake, metabolism and excretion.

[0089] Therefore, for therapeutic applications of anthocyanins, it can be advantageous to use anthocyanins with instability at the relevant stage of the digestive tract, or derivatization for maximum absorption at the relevant stage of the digestive tract. Colonic metabolism of anthocyanins can also be considered. Therefore, in some instances "improved stability" of an anthocyanin may actually be a decrease in stability for delivery to a specific stage of the digestive tract or colon. The chemical forms of anthocyanins ingested in the diet may not be the ones that reach microbiota but instead their respective metabolites that were excreted in the bile and/or from the enterohepatic circulation.

[0090] Glycosyl transferases

[0091] Glycosyltransferases that can be used with the present invention can be any enzymes that are capable of catalyzing transfer of one monosaccharide residue to an acceptor molecule. In particular, useful glycosyltransferases are any enzymes that can catalyze transfer of one monosaccharide residue from a sugar donor to an acceptor molecule. In particular, glycosyltransferases useful in the present invention are capable of catalyzing transfer of one monosaccharide residue selected from the group consisting of glucose, rhamnose, xylose, galactose and arabinose to an acceptor molecule selected from the group consisting of anthocyanins and anthocyanidins.

[0092] The sugar donor can be any moiety having a monosaccharide, such as any donor moiety covalently coupled to a glycoside, such as a glycoside selected from the group consisting of glucoside, rhamnoside, xyloside, galactoside and arabinoside. The donor moiety can be, for example, a nucleotide, such as a nucleoside diphosphate, for example, UDP. Thus, the sugar donor can be, for example, a UDP-glycoside, wherein the glycoside, for example, may be selected from the group consisting of glucoside, rhamnoside, xyloside, galactoside and arabinoside.

[0093] The sugar donor can also be a molecule consisting of a sugar moiety and an acyl moiety, e.g., an aromatic acyl moiety, such as a phenylpropanoid moiety. Such donors are described in, e.g., Sasaki et al. ("The Role of Acyl-Glucose in Anthocyanin Modifications," Molecules 19: 18747-66, 2014).

[0094] The art describes a number of glycosyltransferases that can glycosylate compounds of interest. Based on DNA sequence homology, the sequenced genome of the plant Arabidopsis thaliana is believed to contain around 100 different glycosyltransferases. These and numerous others have been analyzed in Paquette et al. (Phytochemistry 62: 399-413, 2003). WO2001/07631, WO2001/40491, and Arend et al. (Biotech. & Bioeng 78: 126-131, 2001) also describe useful glycosyltransferases, which may be employed with the present invention.

[0095] Furthermore, numerous suitable glycosyltransferases may be found in the Carbohydrate-Active enZYmes (CAZY) database (http://www.cazy.org/). In the CAZY database, suitable glycosyltransferase molecules from virtually all species, including animals, insects, plants and microorganisms, can be found. Furthermore, a type of glycosyl transferase of the glycoside hydrolase family 1 (GH1), as described, e.g., in Sasaki et al., that uses acyl-glucosides as donors, may be used in the present invention.

[0096] In one embodiment, at least 50% of the glycosyltransferases, such as at least 75% of the glycosyltransferases, to be used with the methods of the invention belong to the CAZy family GT1. The skilled person will be able to identify whether a given glycosyltransferase belongs to a particular CAZy family using conventional, computer-aided methods based mainly on sequence information. The GT1 family has at least 5217 genes coding for glycosyltransferases. They are referred to as UGTs and are numbered UGT<family number><group letter><enzyme number>.

[0097] Glycosyltransferases that are more than 40% identical to a given GT1 member in amino acid sequence are classified to the same UGT-family within GT1. Those that are 60% or more identical receive the same group letter, and the individual glycosyltransferase is then assigned an enzyme number.

[0098] In one embodiment, it may be advantageous to include nucleotide-sugar interconversion enzymes, such as RHM2, to improve availability of the desired sugar donor, by converting UDP-glucose to UDP-rhamnose. Several such enzymes are known in the art (see, e.g., Yin et al., "Evolution of plant nucleotide-sugar interconversion enzymes," PLoS One 6(11): e27995, 2011).
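The UGT naming scheme and identity thresholds described above for the GT1 family can be sketched in a few lines of Python. The function names are hypothetical. The name `UGT78D2` is used only as an example of the family number, group letter and enzyme number pattern. The sketch assumes that the pairwise amino acid identity has already been computed elsewhere.

```python
import re

# UGT<family number><group letter><enzyme number>, e.g. "UGT78D2".
_UGT_NAME = re.compile(r"^UGT(\d+)([A-Z]+)(\d+)$")


def parse_ugt_name(name: str) -> tuple:
    """Split a UGT name into (family number, group letter, enzyme number)."""
    match = _UGT_NAME.match(name)
    if match is None:
        raise ValueError(f"not a UGT-style name: {name!r}")
    return int(match.group(1)), match.group(2), int(match.group(3))


def ugt_relationship(identity_percent: float) -> str:
    """Classify two GT1 members by pairwise amino acid identity (percent)."""
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity must be between 0 and 100")
    if identity_percent >= 60.0:   # 60% or more: same group letter
        return "same family, same group letter"
    if identity_percent > 40.0:    # more than 40%: same UGT family
        return "same family"
    return "different family"


print(parse_ugt_name("UGT78D2"))
print(ugt_relationship(65.0))
print(ugt_relationship(45.0))
```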

[0099] Acyl transferases

[00100] Acyltransferases that can be used with the present invention can be any enzyme that is capable of catalyzing transfer of an acyl residue to an acceptor molecule. In particular, the acyltransferase to be used with the present invention can be any enzymes that are capable of catalyzing transfer of one acyl residue from an acyl donor to an acceptor molecule selected from the group consisting of anthocyanins and anthocyanidins.

[00101] Useful acyltransferases include those capable of catalyzing transfer of one acyl residue from a coenzyme A-derivative of an organic acid to an acceptor molecule selected from the group consisting of anthocyanins and anthocyanidins.

[00102] The acyltransferase can be any enzyme that is capable of catalysing transfer of one acyl residue from any of the acyl donors described herein below in the section "Acyl donor" to an anthocyanin and/or an anthocyanidin.

[00103] In one embodiment, the acyltransferase is of the BAHD type. Nucleic acid molecules encoding BAHD acyltransferases can be identified by screening gene transcripts present in anthocyanin-producing tissues of plants having a high level of anthocyanin production. The screening can use homology searching with known BAHD genes to identify additional nucleic acid molecules encoding BAHD acyltransferases. For these enzymes, certain protein motifs are conserved well enough to allow easy identification. The identified nucleic acid molecules can then be transferred to host cells or be used for in vitro production of acyltransferases to be used with the methods of the invention.

[00104] In another embodiment, the acyltransferase can belong to the EC 2.3.1.- class of enzymes, including EC 2.3.1.18; EC 2.3.1.153; EC 2.3.1.171; EC 2.3.1.172; EC 2.3.1.173; EC 2.3.1.213; EC 2.3.1.214; EC 2.3.1.215; and similar enzymes.

[00105] In yet another embodiment, the acyltransferase can belong to the class of AHCT (anthocyanin o-hydroxy cinnamoyl transferase) enzymes. An exemplary GenBank Accession Number for an AHCT nucleic acid molecule includes, but is not limited to, AY395719.1.

[00106] In yet another embodiment, the acyltransferase can be of the serine carboxypeptidase-like (SCPL) protein family type, which uses acyl-glycosides as donors to transfer the acyl to the target molecule. Such acyltransferases and their donor molecules are described, e.g., in Sasaki et al.

[00107] According to the invention, enzymes of any of the above mentioned classes can be used individually or as mixtures.

[00108] The acyl donor can be any useful acyl donor. In particular, the acyl donor may be any moiety including an acyl residue, such as any donor moiety covalently coupled to an acyl residue. The acyl residue can be the acyl part of an organic acid. The donor moiety can be coenzyme A, and thus, the acyl donor can be a coenzyme A-derivative of an organic acid including aromatic phenolic acids or phenylpropanoic acids. Further, the acyl donor can be a compound selected from the group consisting of acetyl-CoA, malyl-CoA, malonyl-CoA, coumaroyl-CoA, benzoyl-CoA, sinapoyl-CoA, feruloyl-CoA and caffeoyl-CoA. In particular, the acyl donor can be coumaroyl-CoA.

[00109] Further, the acyl donor can be an acyl-glucoside of the type described in Sasaki et al.

[00110] In certain embodiments of the invention, the acyl donor can be added directly to the fermentation broth. However, in a preferred embodiment of the invention, the recombinant host cell can be capable of producing the acyl donor. Many host cells are capable of producing one or more acyl donors. For example, yeast cells are capable of producing malonyl-CoA.

[00111] Frequently, however, host cells are not capable of producing all desired acyl donors, in which case the host cells can include one or more heterologous enzyme nucleic acid molecules each encoding enzymes of the biosynthetic pathway of the specific acyl donor.

[00112] Several biosynthesis pathways for conversion of a sugar into an acyl donor are known. Where the host cell is a yeast or bacterial cell, the cell can include a heterologous enzyme nucleic acid molecule encoding one or more enzymes of the biosynthetic pathway for conversion of a sugar into an acyl donor, even though some of the required enzymatic activities typically are present in the host cell. Thus, frequently the acyl donor can be prepared using phenylalanine or tyrosine as a substrate. Typically, host cells such as yeast or bacterial cells are capable of producing phenylalanine or tyrosine.

[00113] Thus, the host cell can include heterologous nucleic acid molecules encoding one or more enzymes of the biosynthesis pathway for conversion of phenylalanine or tyrosine to phenylpropanoyl-CoA. For example, the host cell can include heterologous nucleic acid molecules encoding all the enzymes of the biosynthesis pathway for conversion of phenylalanine or tyrosine to, e.g., feruloyl-CoA.

[00114] The host cell can also include heterologous nucleic acid molecules encoding one or more enzymes of the biosynthesis pathway for conversion of phenylalanine or tyrosine to p-hydroxybenzoyl-CoA. For example, the host cell can include heterologous nucleic acid molecules encoding all the enzymes of the biosynthesis pathway for conversion of phenylalanine or tyrosine to p-hydroxybenzoyl-CoA.

[00115] Host cells may include any suitable cell for expression of the biosynthetic pathway proteins disclosed herein, including, but not limited to, prokaryotic and eukaryotic species, such as yeast cells, plant cells, mammalian cells, insect cells, fungal cells, bacterial cells. If the cells are human cells, they are isolated or cultured.

[00116] Suitable host cells include yeast, such as those belonging to the genera Saccharomyces, Ashbya, Arxula, Kluyveromyces, Gibberella, Aspergillus, Candida, Pichia, Debaryomyces, Hansenula, Yarrowia, Zygosaccharomyces, Cyberlindnera, Xanthophyllomyces, or Schizosaccharomyces. For example, a suitable yeast species may be Saccharomyces cerevisiae, Schizosaccharomyces pombe, Yarrowia lipolytica, Candida glabrata, Ashbya gossypii, Gibberella fujikuroi, Aspergillus niger, Cyberlindnera jadinii, Pichia pastoris, Kluyveromyces lactis, Hansenula polymorpha, Candida boidinii, Arxula adeninivorans, Xanthophyllomyces dendrorhous, or Candida albicans.

[00117] Suitable bacterial cells include Escherichia bacteria cells, Lactobacillus bacteria cells, Lactococcus bacteria cells, Corynebacterium bacteria cells, Acetobacter bacteria cells, Acinetobacter bacteria cells, Pseudomonas bacteria cells, or Rhodobacter sphaeroides, Rhodobacter capsulatus, or Rhodotorula toruloides cells.

[00118] In some embodiments, a microorganism can be an algal cell such as Blakeslea trispora, Dunaliella salina, Haematococcus pluvialis, Chlorella sp., Undaria pinnatifida, Sargassum, Laminaria japonica, or Scenedesmus almeriensis species.

[00119] In some embodiments, a microorganism can be a cyanobacterial cell such as Blakeslea trispora, Dunaliella salina, Haematococcus pluvialis, Chlorella sp., Undaria pinnatifida, Sargassum, Laminaria japonica, or Scenedesmus almeriensis.

[00120] The genetically engineered microorganisms disclosed herein can be cultivated using conventional cell culture or fermentation processes, including, inter alia, chemostat, batch, fed-batch cultivations, continuous perfusion fermentation, and continuous perfusion cell culture.

[00121] After the microorganism has been grown in culture for a desired period of time, anthocyanin and/or one or more anthocyanin derivatives or anthocyanidin can then be recovered from the culture using various techniques known in the art.

[00122] Once isolated, anthocyanins produced according to the current disclosure may be used, as is known in the art, as colorants (such as dyes or pigments that may have a predetermined color and/or hue), pH indicators, food additives, antioxidants, for medicinal purposes, or for any other use, including food and nutritional supplements.

[00123] The invention will be further described in the following examples, which do not limit the scope of the invention described in the claims.

EXAMPLES

[00124] The Examples that follow are illustrative of specific embodiments of the invention, and various uses thereof. They are set forth for explanatory purposes only and are not to be taken as limiting the invention.

[00125] Overview

[00126] The following Examples demonstrate successful anthocyanin production in yeast via a heterologous full-length biosynthetic pathway. Successful production was achieved by combining highly efficient enzymes and expressing them under near optimal conditions to achieve sufficient flow through the pathway (and to overcome deleterious side-reactions) to produce useful amounts of anthocyanin products. As listed in the tables below, the gene sequences disclosed in SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 45, 47, 48, 51, and 52 encode the protein sequences of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 54, 55, 56, 57, and 58, respectively.
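The positional pairing of gene and protein sequence identifiers in paragraph [00126] can be written out as a lookup table. The following is a minimal sketch in Python; the names `GENE_TO_PROTEIN` and `protein_seq_id` are illustrative and do not appear in the specification.

```python
# SEQ ID NOs copied from paragraph [00126]; the pairing is positional
# ("respectively"). All identifiers below are illustrative names only.
NUCLEOTIDE_SEQ_IDS = list(range(1, 34, 2)) + [45, 47, 48, 51, 52]
PROTEIN_SEQ_IDS = list(range(2, 35, 2)) + [54, 55, 56, 57, 58]

# Both lists must have the same length for a one-to-one pairing.
assert len(NUCLEOTIDE_SEQ_IDS) == len(PROTEIN_SEQ_IDS) == 22

GENE_TO_PROTEIN = dict(zip(NUCLEOTIDE_SEQ_IDS, PROTEIN_SEQ_IDS))


def protein_seq_id(gene_seq_id: int) -> int:
    """Return the protein SEQ ID NO paired with a gene SEQ ID NO."""
    return GENE_TO_PROTEIN[gene_seq_id]
```

Note that the pairing is not simply "gene number plus one" for the later entries: for example, the gene of SEQ ID NO: 45 pairs with the protein of SEQ ID NO: 54.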

[00127] All flavonoids, anthocyanidins, anthocyanins, and their derivatives in the examples below were analyzed using the method set forth in Example No. 10.

Example No. 1: Production of naringenin in yeast.

[00128] Materials and Methods

[00129] The naringenin pathway was assembled by in vivo homologous recombination and simultaneous integration in a background S. cerevisiae strain to make a naringenin producing strain. The S. cerevisiae strains used were based on the S288c strain.

[00130] The naringenin pathway genes used in this example are listed in Table No. 2 below, though a tyrosine ammonia lyase (TAL), such as that encoded by SEQ ID NO: 15, may be used in place of or in addition to PAL2 and C4H (as illustrated in FIG. 1) to provide the intermediate, p-coumaric acid, in the pathway.

[00131] Table No. 2. Naringenin Pathway Genes used in Example No. 1.

Plasmid (pEVE) | Cassette | Content                  | SEQ ID NO | Species
4745           | ZA       | Integration tag for XI-3 | 35        |
3169           | AB       | URA3 and LoxP            | 36        |
               | BC       | PAL2 At                  | 17        | Arabidopsis thaliana
               | CD       | C4H Am                   | 19        | Ammi majus
               | DE       | 4CL2 At                  | 1         | Arabidopsis thaliana
               | EF       | CHS2 Hv                  | 21        | Hordeum vulgare
               | FG       | CHI Ms                   | 13        | Medicago sativa
               | GH       | CPR1 Sc                  | 23        | Saccharomyces cerevisiae
1919           | HZ       | 600 bp stuffer           | 37        |

[00132] All genes were manufactured based on sequences from public databases, except CPR1 Sc (SEQ ID NO: 23) and 4CL2 At (SEQ ID NO: 1), which were amplified from yeast genomic DNA and plant cDNA, respectively. Synthetic genes, codon-optimized for expression in yeast, were manufactured by DNA 2.0, Inc. (Menlo Park, CA, USA) or GeneArt AG (Regensburg, Germany). During synthesis, all genes except PAL2 At were provided, at the 5'-end, with the DNA sequence AAGCTTAAA (SEQ ID NO: 43), including a HindIII restriction recognition site and a Kozak sequence, and at the 3'-end with the DNA sequence CCGCGG (SEQ ID NO: 44), including a SacII recognition site. By PCR, PAL2 At was provided, at the 5'-end, with the DNA sequence AAGCTTAAA (SEQ ID NO: 43), including a HindIII restriction recognition site and a Kozak sequence, and at the 3'-end with the DNA sequence CCGCGG (SEQ ID NO: 44), including a SacII recognition site. The A. thaliana gene 4CL2 (SEQ ID NO: 1) was amplified by PCR from first strand cDNA. The 4CL2 sequence has one internal HindIII site and one internal SacII site, and was therefore cloned, using the In-Fusion® HD Cloning Plus kit (Clontech, Inc.), into HindIII and SacII, according to the manufacturer's instructions.

[00133] The S. cerevisiae gene CPR1 (SEQ ID NO: 23) was amplified from genomic DNA by PCR. During PCR, the gene was provided, at the 5'-end, with the DNA sequence AAGCTTAAA (SEQ ID NO: 43), including a HindIII restriction recognition site and a Kozak sequence, and at the 3'-end with the DNA sequence CCGCGG (SEQ ID NO: 44), including a SacII recognition site. An internal SacII site of SEQ ID NO: 23 was removed with a silent point mutation (C519T) by site-directed mutagenesis. Yeast CPR1 was overexpressed to allow efficient regeneration of the CYP450 enzyme C4H. All genes were cloned into HindIII and SacII of pUC18-based vectors containing yeast expression cassettes derived from native yeast promoters and terminators.
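The cloning scheme above depends on whether an open reading frame carries internal HindIII or SacII sites, as was the case for 4CL2 and CPR1. The following is a minimal sketch in Python of such a check, together with the addition of the flanking sequences of SEQ ID NOS: 43 and 44. The function names and the toy sequences used to exercise it are hypothetical; they are not sequences from the specification.

```python
# Flanking sequences as given in the text (SEQ ID NOS: 43 and 44).
HINDIII_KOZAK_5P = "AAGCTTAAA"  # HindIII site (AAGCTT) followed by Kozak bases
SACII_3P = "CCGCGG"             # SacII site

# Both recognition sites are palindromic, so scanning one strand is enough.
SITES = {"HindIII": "AAGCTT", "SacII": "CCGCGG"}


def internal_sites(orf: str) -> dict[str, list[int]]:
    """Return 1-based start positions of HindIII and SacII sites in an ORF."""
    orf = orf.upper()
    hits: dict[str, list[int]] = {}
    for enzyme, site in SITES.items():
        positions = []
        start = orf.find(site)
        while start != -1:
            positions.append(start + 1)          # report 1-based coordinates
            start = orf.find(site, start + 1)    # allow overlapping matches
        hits[enzyme] = positions
    return hits


def add_cloning_tags(orf: str) -> str:
    """Flank an ORF with the 5' HindIII/Kozak and 3' SacII sequences."""
    return HINDIII_KOZAK_5P + orf.upper() + SACII_3P
```

An ORF for which `internal_sites` reports any position would be cut internally by a HindIII/SacII digest, so it would need either a ligation-independent method (as used for 4CL2) or removal of the site by a silent mutation (as done for CPR1).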

[00134] Promoters and terminators, described by Shao et al. (Nucl. Acids Res. 2009, 37(2):e16), had been prepared by PCR from yeast genomic DNA. Each expression cassette was flanked by 60 bp homologous recombination tag (HRT) sequences, on both sides, and the cassettes including these HRTs were, in turn, flanked by AscI recognition sites (see FIGS. 2(a), 2(b), and 3). The HRTs were designed such that the 3'-end tag of the first expression cassette fragment is identical to the 5'-end tag of the second expression cassette fragment, and so forth. Three helper fragments were used to integrate multiple expression cassettes into the yeast genome by homologous recombination. One helper fragment (ZA in pEVE4745, SEQ ID NO: 35) included the two recombination tags for integration into the site XI-3, each of which was homologous to sequences in the yeast genome. These were both flanked by an HRT and separated by an AscI site. The second helper fragment (AB in pEVE3169, SEQ ID NO: 36) included a yeast auxotrophic marker (URA3) flanked by LoxP sites. This fragment also had flanking HRTs. The third helper fragment (HZ in pEVE1919, SEQ ID NO: 37) was designed only with HRTs separated by a short 600 bp spacer sequence. All helper fragments had been cloned in a pUC18-based backbone for amplification in E. coli. All fragments were cloned in AscI sites from where they could be excised. FIGS. 2(a) and (b) and FIG. 3 depict how the DNA assembler technology, based on Shao et al. 2009, can be used to assemble biosynthetic pathways by homologous recombination, for stable maintenance on a plasmid (FIGS. 2(a) and (b)) or after integration into the host genome (FIG. 3).

[00135] To integrate the naringenin pathway into the background strain, plasmid DNA from the three helper plasmids (pEVE4745, pEVE3169, and pEVE1919, SEQ ID NOS: 35-37, respectively) was mixed with plasmid DNA from each of the plasmids containing the expression cassettes. The mix of plasmid DNA was digested with AscI. This treatment released all fragments from the plasmid backbone and created fragments with HRTs at the ends, each HRT overlapping with the HRT of the next fragment in sequence. The background strain was transformed with the digested mix, and the naringenin pathway was integrated in vivo by homologous recombination essentially as described by Shao et al. 2009.
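The overlapping-tag design can be checked mechanically before a transformation is set up. The sketch below, in Python, assumes that each two-letter fragment name denotes its 5' tag followed by its 3' tag (so "BC" carries tag B at the 5'-end and tag C at the 3'-end); that reading is an interpretation of the fragment names used in this section, and the function `assembly_order` is hypothetical.

```python
def assembly_order(fragments: list[str], start: str = "ZA") -> list[str]:
    """Order fragments so each 3' tag matches the next fragment's 5' tag.

    Assumes two-letter names of the form <5' tag><3' tag>. Raises
    ValueError if a tag is missing, duplicated, or the chain does not
    return to the 5' tag of the starting fragment.
    """
    by_5p_tag: dict[str, str] = {}
    for name in fragments:
        if len(name) != 2:
            raise ValueError(f"expected a two-letter fragment name, got {name!r}")
        if name[0] in by_5p_tag:
            raise ValueError(f"two fragments share the 5' tag {name[0]!r}")
        by_5p_tag[name[0]] = name
    if start not in fragments:
        raise ValueError(f"start fragment {start!r} is not in the mix")

    order = [start]
    while len(order) < len(fragments):
        next_tag = order[-1][1]
        if next_tag not in by_5p_tag:
            raise ValueError(f"no fragment carries the 5' tag {next_tag!r}")
        nxt = by_5p_tag[next_tag]
        if nxt in order:
            raise ValueError("chain closes before all fragments are used")
        order.append(nxt)

    if order[-1][1] != order[0][0]:
        raise ValueError("last 3' tag does not match the first 5' tag")
    return order
```

For the fragments of Table No. 2, `assembly_order(["HZ", "BC", "ZA", "AB", "CD", "DE", "EF", "FG", "GH"])` returns the chain from ZA through HZ. Leaving out one cassette without substituting an empty fragment breaks the chain, which is consistent with the use of empty fragments in later examples.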

[00136] Following integration, the genes were transcribed and translated into the enzymes of the naringenin biosynthetic pathway, plus the additional yeast CPR1. Naringenin production was confirmed by LC/MS.

Example No. 2: Production of pelargonidin-3-O-glucoside (P3G) in yeast.

[00137] The pelargonidin-3-O-glucoside (P3G)-pathway from naringenin was assembled on HRT vectors according to Table No. 3 below. Each yeast expression cassette BC, CD, DE and EF contained a gene encoding one enzyme of the P3G pathway. The BC cassette encoded an anthocyanidin synthase (ANS) from Petunia x hybrida, the CD cassette contained an anthocyanidin-3-O-glycosyl transferase (A3GT) from Arabidopsis thaliana, the DE cassette encoded a flavanone-3-hydroxylase (F3H) from Malus domestica, and the EF cassette encoded a dihydroflavonol-4-reductase (DFR) from Anthurium andraeanum. See FIGS. 2(a) and 2(b) depicting pathway assembly on a plasmid, and FIG. 3 depicting assembly by genomic integration.
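The cassette positions listed above do not follow the order in which the enzymes act. The sketch below, in Python, pairs the cassette assignments from paragraph [00137] with the commonly described reaction order from naringenin to P3G. The intermediate names (dihydrokaempferol, leucopelargonidin) are not stated in this section and are included as textbook assumptions; all identifiers are illustrative.

```python
# Cassette assignments as stated in paragraph [00137].
P3G_CASSETTES = {"BC": "ANS", "CD": "A3GT", "DE": "F3H", "EF": "DFR"}

# Commonly described reaction order (assumption: the intermediates are
# textbook names, not taken from this section of the specification).
P3G_STEPS = [
    ("naringenin", "F3H", "dihydrokaempferol"),
    ("dihydrokaempferol", "DFR", "leucopelargonidin"),
    ("leucopelargonidin", "ANS", "pelargonidin"),
    ("pelargonidin", "A3GT", "pelargonidin-3-O-glucoside"),
]


def cassettes_in_pathway_order() -> list[str]:
    """List the cassette names in the order their enzymes act."""
    enzyme_to_cassette = {enzyme: cassette for cassette, enzyme in P3G_CASSETTES.items()}
    return [enzyme_to_cassette[enzyme] for _, enzyme, _ in P3G_STEPS]


def steps_are_contiguous() -> bool:
    """Check that each step's product is the next step's substrate."""
    return all(a[2] == b[0] for a, b in zip(P3G_STEPS, P3G_STEPS[1:]))
```

Because each cassette has its own promoter and terminator, the position of a gene on the assembled vector need not mirror its position in the pathway.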

[00138] The backbone of the HRT vectors was formed by the DNA fragments ZA, AB and FZ, which contained a yeast selection marker, an autonomously replicating sequence (ARS), a yeast centromere (CEN) and a 600 bp stuffer sequence (see Table No. 3 below). Expression of each cassette was driven by a yeast native promoter as described in Example No. 1 above. The DNA helper fragments, as well as the gene expression cassettes, were flanked by 60 bp homologous recombination tags (HRT), where each terminal tag was identical to the first tag of the following cassette. Each HRT cassette included terminal AscI restriction sites to allow excision from the vector backbone.

[00139] Table No. 3. P3G Pathway Gene Cassettes. *

* Summary of the plasmids containing the cassettes included in the final HRT vector for P3G production in yeast. Approximate sizes of the undigested donor plasmids are indicated, as well as the amounts of DNA that were mixed and digested with AscI before being used to transform the yeast.

[00140] Plasmids (from Table No. 3) containing the described helper fragments and gene expression cassettes were digested with AscI in a 20 μL reaction volume. The digest was performed for 2 h at 37°C.

[00141] For transformation of a naringenin producing yeast strain (described in Example No. 1) with the HRT reaction, a 5 mL pre-culture of the naringenin producing strain was inoculated the day before transformation. After transformation of the naringenin producing strain by the LiAc/SS carrier DNA/PEG method (see e.g., Gietz et al., Nat Protoc. 2007;2(1):35-7), cells were grown at 30°C for 72 h. Next, four clones were re-streaked onto fresh plates and grown for 72 h at 30°C.

[00142] The clones were then grown in 2 mL liquid cultures until the cultures turned red (96 h to 120 h). Subsequently, 1 volume of acidified methanol was added, and after ½ hour of shaking at 30°C, cell debris was spun down by centrifugation and the cleared supernatant was collected for analysis by LC/MS. Analysis demonstrated the presence of pelargonidin (FIG. 4) and pelargonidin-3-O-glucoside (FIG. 5).

Example No. 3: Production of pelargonidin-3,5-O-diglucoside (P35G) in yeast.

[00143] The pelargonidin-3,5-O-diglucoside pathway, starting from naringenin, was assembled in yeast by utilization of the HRT technique, described in Example No. 1 above and shown in FIGS. 2(a) and 2(b). Genes used for P35G production are summarized in Table No. 4 below. Each yeast expression cassette BC, CD, DE, EF and FG contained a gene encoding one enzyme of the P35G pathway. The BC cassette encoded an anthocyanidin synthase (ANS) from Petunia x hybrida, the CD cassette contained an anthocyanidin-3-O-glycosyl transferase (A3GT) from Arabidopsis thaliana, the DE cassette encoded a flavanone-3-hydroxylase (F3H) from Malus domestica, the EF cassette encoded a dihydroflavonol-4-reductase (DFR) from Anthurium andraeanum, and the FG cassette encoded an anthocyanin-5-O-glucosyltransferase from Vitis amurensis. All genes were based on sequences from public databases, codon-optimized for expression in yeast, and manufactured by DNA 2.0, Inc. (Menlo Park, CA, USA) or GeneArt AG (Regensburg, Germany).

[00144] The backbone of the P35G HRT vector was formed by the DNA fragments ZA, AB and GZ, which contained an auxotrophic yeast selection marker (HIS3), an autonomously replicating sequence (ARS), a yeast centromere (CEN) and a 600 bp stuffer sequence (see Table No. 4 below). Expression of each cassette was driven by a yeast native promoter as described in Example No. 1 above. The DNA backbone fragments, as well as the gene expression cassettes, were flanked by 60 bp homologous recombination tags (HRT), where each terminal tag was identical to the first tag of the following cassette. Each HRT cassette included terminal AscI restriction sites to allow excision from the vector backbone.

[00145] Table No. 4. P35G Pathway Gene Cassettes. *

* Summary of the plasmids containing the cassettes included in the final HRT vector for P35G production in yeast.

[00146] Plasmids (from Table No. 4) containing the described DNA helper fragments and gene expression cassettes were digested with AscI in a 20 μL reaction volume. The digest was performed for 2 h at 37°C.

[00147] For transformation of a naringenin producing yeast strain (described in Example No. 1) with the HRT reaction, a 3 mL pre-culture of the naringenin producing strain was inoculated the day before transformation and used to inoculate a fresh yeast culture the following day, which was transformed after 3-4 hours of growth. After transformation of the naringenin producing strain by the LiAc method (see e.g., Gietz et al., Nat Protoc. 2007;2(1):35-7), cells were grown at 30°C for 72 h.

[00148] Individual yeast clones were subsequently grown in 2 mL liquid cultures for 96 hours, after which the cultures were extracted with acidified methanol (1% HCl) at 30°C, 300 rpm for 30 min. Following extraction, the cell debris was precipitated by centrifugation, and the supernatants were collected for analysis by LC/MS. Analysis demonstrated the presence of pelargonidin-3,5-O-diglucoside (FIG. 6).

Example No. 4: Production of cyanidin-3-O-glucoside (C3G) in yeast.

[00149] The cyanidin-3-O-glucoside (C3G) pathway from naringenin was assembled in two steps, including assembly of two HRT plasmids, as described below in reference to Table Nos. 5 and 6. In a first step, a (+)-catechin (CAT)-producing strain was created by combining the genes listed in Table No. 5. The CAT pathway was assembled on an HRT vector containing the genes F3'H from Petunia x hybrida, F3H-1 from Malus domestica, and a CPR (ATR1) from Arabidopsis thaliana cloned into yeast expression cassettes CD, DE, and GH, respectively. In addition, the expression cassettes EF and FG containing a DFR variant and a LAR variant, respectively, were included. The DNA fragment BC was empty, meaning no expression cassette was inserted between the HRTs. The plasmid backbone was formed by the DNA fragments ZA, AB, and HZ (see Table No. 5). The HRT reaction was performed as described above, but in a 50 μL reaction volume.

[00150] The naringenin producing strain (Example No. 1) was transformed with the HRT reaction. After transformation and growth of the cells for 72 h, clones were cultured in 96-well plates and screened for CAT production. A clone with confirmed production of CAT was chosen for further engineering in a second step.

[00151] In the second step, a cyanidin-3-O-glucoside producing yeast strain was created from a combination of ANS and A3GT genes transformed into the CAT producing clone described above. The expression cassettes BC and CD of the second HRT vector contained one of eight tested ANS variants and one of eight tested A3GT variants, respectively. Note that, for the purpose of this example, only one specific ANS gene and one specific A3GT gene are listed in Table No. 6. HRT reaction, transformation, and cell culture were performed as above. Clones were isolated and grown as described above, and analyzed for anthocyanin production. Several clones were shown to produce cyanidin (FIG. 7) and cyanidin-3-O-glucoside (FIG. 8). The highest concentrations were seen with the specific ANS and A3GT listed in Table No. 6.
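With eight ANS variants and eight A3GT variants, the full design space is 64 pairings. The sketch below, in Python, enumerates that space; the variant labels are hypothetical placeholders, and the text does not state whether every pairing was actually constructed.

```python
from itertools import product

# Hypothetical placeholder labels; the actual variants are not named
# in this section of the specification.
ANS_VARIANTS = [f"ANS_{i}" for i in range(1, 9)]
A3GT_VARIANTS = [f"A3GT_{i}" for i in range(1, 9)]


def variant_pairs() -> list[dict[str, str]]:
    """Enumerate every ANS/A3GT pairing as a cassette assignment.

    BC holds the ANS variant and CD holds the A3GT variant, matching
    the cassette positions given in paragraph [00151].
    """
    return [
        {"BC": ans, "CD": a3gt}
        for ans, a3gt in product(ANS_VARIANTS, A3GT_VARIANTS)
    ]
```

Each entry corresponds to one candidate second-step HRT vector that could be transformed into the CAT producing clone and screened for cyanidin-3-O-glucoside.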

[00152] Table No. 5. Summary of a plasmid containing the cassettes included in an HRT vector which exhibited (+)-catechin production in yeast.

Plasmid (pEVE) | Cassette | Content | Pl size (kb) | SEQ ID NO | Pl amount (ng)

[00153] Table No. 6. Summary of one plasmid containing the cassettes included in the HRT vector for C3G production.

Example No. 5: Production of cyanidin-3,5-O-diglucoside (C35G) in yeast.

[00154] The cyanidin-3,5-O-diglucoside (C35G) pathway was assembled in two steps, including assembly of two HRT plasmids. In a first step, an eriodictyol strain was created from the naringenin strain (see Example No. 1 above) by the introduction and assembly of HRT expression fragments consisting of a flavonoid 3'-hydroxylase (F3'H) gene from Petunia x hybrida and a cytochrome P450 reductase (CPR-1) gene from Arabidopsis thaliana, cloned into yeast expression cassettes CD and DE, respectively. The DNA fragment BC was empty, meaning no expression cassette was inserted between the HRTs. The plasmid backbone was formed by the DNA fragments ZA, AB, and EZ (see Table No. 7).

[00155] Plasmids containing the described helper fragments and gene expression cassettes were digested with AscI in a 20 μL reaction volume. The digest was performed for 2 h at 37°C.

[00156] The naringenin producing strain was transformed with the HRT reaction using the LiAc method (see e.g., Gietz et al., Nat Protoc. 2007;2(1):35-7). After transformation, the cells were grown at 30°C for 72 h.

[00157] Individual yeast clones were then grown in 2 mL liquid cultures for 96 h. Subsequently, the cultures were extracted with acidified methanol (1% HCl) at 30°C, 300 rpm for 30 min. Following extraction, the cell debris was precipitated by centrifugation, and the cleared supernatants were collected for analysis by LC/MS. Analysis showed that introduction of the listed genes (Table No. 7) resulted in the production of eriodictyol.

[00158] Table No. 7. Eriodictyol Pathway Gene Cassettes. *

* Summary of the plasmids containing the cassettes included in the final HRT vector for eriodictyol production in yeast.

[00159] In the second step, a cyanidin-3,5-O-diglucoside producing yeast strain was created from a combination of ANS, DFR, F3H, A3GT and A5GT genes transformed into the eriodictyol producing strain described above. Each yeast expression cassette BC, CD, DE, EF and FG contained a gene encoding one enzyme of the C35G pathway. The BC cassette encoded an anthocyanidin synthase (ANS) from Petunia x hybrida, the CD cassette contained an anthocyanidin-3-O-glycosyl transferase (A3GT) from Arabidopsis thaliana, the DE cassette encoded a flavanone-3-hydroxylase (F3H) from Malus domestica, the EF cassette encoded a dihydroflavonol-4-reductase (DFR) from Anthurium andraeanum, and the FG cassette contained an anthocyanin-5-O-glycosyl transferase (A5GT) from Vitis amurensis.

[00160] The backbone of the HRT vector was formed by the DNA helper fragments ZA, AB and GZ, which contained an auxotrophic yeast selection marker (HIS3), an autonomously replicating sequence (ARS), a yeast centromere (CEN) and a 600 bp stuffer sequence (see Table No. 8 below). Expression of each cassette was driven by a yeast native promoter. The DNA helper fragments, as well as the gene expression cassettes, were flanked by 60 bp homologous recombination tags (HRT), where each terminal tag was identical to the first tag of the following cassette. Each HRT cassette included terminal AscI restriction sites to allow excision from the vector backbone.

[00161] Table No. 8. C35G Pathway Gene Cassettes. *

* Summary of the plasmids containing the cassettes included in the final HRT vector for C35G production in yeast.

[00162] Plasmids containing the described helper fragments and gene expression cassettes were digested with AscI in a 20 μL reaction volume. The digest was performed for 2 h at 37°C.

[00163] The eriodictyol producing yeast strain was transformed with the HRT digest reaction using the LiAc method (see e.g., Gietz et al., Nat Protoc. 2007;2(1):35-7). After transformation, the cells were grown at 30°C for 72 h.

[00164] Individual yeast clones were then grown in 2 mL liquid cultures for 96 h. Subsequently, the cultures were extracted with acidified methanol (1% HCl) at 30°C, 300 rpm for 30 min. Following extraction, the cell debris was precipitated by centrifugation, and the cleared supernatants were collected for analysis by LC/MS. The analysis demonstrated the presence of cyanidin-3,5-O-diglucoside (FIG. 9).

Example No. 6: Production of delphinidin and delphinidin-3-O-glucoside (D3G) in yeast.

[00165] The delphinidin-3-O-glucoside (D3G) pathway was assembled in two steps, including assembly of two HRT plasmids. In a first step, a 5,7,3',4',5'-pentahydroxyflavone (PHF) strain was created from the naringenin strain (see Example No. 1 above) by the introduction and assembly of HRT expression fragments consisting of a flavonoid-3'5'-hydroxylase gene (F3'5'H) from Solanum lycopersicum and a cytochrome P450 reductase (CPR-1) gene from Arabidopsis thaliana, cloned into HRT yeast expression cassettes CD and DE, respectively. The DNA fragment BC was empty, meaning no expression cassette was inserted between the HRTs. The plasmid backbone was formed by the DNA fragments ZA, AB, and EZ, which contained an auxotrophic yeast selection marker (LEU2), an autonomously replicating sequence (ARS), a yeast centromere (CEN) and a 600 bp stuffer sequence (see Table No. 9). Expression of each cassette was driven by a yeast native promoter as described in Example No. 1. The DNA backbone fragments, as well as the gene expression cassettes, were flanked by 60 bp homologous recombination tags (HRT). Each HRT cassette included terminal AscI restriction sites to allow excision from the vector backbone.

[00166] Table No. 9. PHF Pathway Gene Cassettes. *

* Summary of the plasmids containing the cassettes included in the final HRT vector for PHF production in yeast.

[00167] Plasmids containing the described helper fragments and gene expression cassettes were digested with AscI in a 20 μL reaction volume. The digest was performed for 2 h at 37°C.

[00168] The naringenin producing yeast strain was transformed with the HRT digest reaction using the LiAc method (see e.g., Gietz et al., Nat Protoc. 2007;2(1):35-7). After transformation, the cells were grown at 30°C for 72 h.

[00169] Individual yeast clones were then grown in 2 mL liquid cultures for 96 h. Subsequently, the cultures were extracted with acidified methanol (1% HCl) at 30°C, 300 rpm for 30 min. Following extraction, the cell debris was precipitated by centrifugation, and the cleared supernatants were collected for analysis by LC/MS. Production of PHF was confirmed.

[00170] In the second step, a delphinidin-3-O-glucoside producing yeast strain was created from a combination of ANS, DFR, F3H and A3GT genes transformed into the PHF producing strain described above. Each yeast expression cassette BC, CD, DE and EF contained a gene encoding one enzyme of the D3G pathway. The BC cassette encoded an anthocyanidin synthase (ANS) from Petunia x hybrida, the CD cassette contained an anthocyanidin-3-O-glycosyl transferase (A3GT) from Arabidopsis thaliana, the DE cassette encoded a flavanone-3-hydroxylase (F3H) from Malus domestica, and the EF cassette encoded a dihydroflavonol-4-reductase (DFR) from Anthurium andraeanum.

[00171] The backbone of the HRT vector was formed by the DNA helper fragments ZA, AB and FZ, which contained an auxotrophic yeast selection marker (HIS3), an autonomously replicating sequence (ARS), a yeast centromere (CEN) and a 600 bp stuffer sequence (see Table No. 10 below). Expression of each cassette was driven by a yeast native promoter. The DNA helper fragments, as well as the gene expression cassettes, were flanked by 60 bp homologous recombination tags (HRT), where each terminal tag was identical to the first tag of the following cassette. Each HRT cassette included terminal AscI restriction sites to allow excision from the vector backbone.

[00172] Table No. 10. D3G Pathway Gene Cassettes. *

* Summary of the plasmids containing the cassettes included in the final HRT vector for D3G production in yeast.

[00173] Plasmids containing the described helper fragments and gene expression cassettes were digested with AscI in a 20 μL reaction volume. The digest was performed for 2 h at 37°C.

[00174] Yeast was transformed with the HRT digest reaction using the LiAc method (see e.g., Gietz et al., Nat Protoc. 2007;2(1):35-7). After transformation, the cells were grown at 30°C for 72 h.

[00175] Individual yeast clones were then grown in 2 mL liquid cultures for 96 h. Subsequently, the cultures were extracted with acidified methanol (1% HCl) at 30°C, 300 rpm for 30 min. Following extraction, the cell debris was precipitated by centrifugation, and the cleared supernatants were collected for analysis by LC/MS. Analysis showed that introduction of the listed genes (Table No. 10) resulted in the production of delphinidin (see FIG. 10) and delphinidin-3-O-glucoside (see FIG. 11).

Example No. 7: Production of delphinidin-3,5-O-diglucoside (D35G) in yeast.

[00176] The delphinidin-3,5-O-diglucoside (D35G) pathway was assembled in the 5,7,3',4',5'-pentahydroxyflavone (PHF) strain described in Example No. 6 above. Specifically, a delphinidin-3,5-O-diglucoside producing yeast strain was created from a combination of ANS, DFR, F3H, A3GT, and A5GT genes transformed into the PHF producing strain. Each yeast expression cassette BC, CD, DE, EF and FG contained a gene encoding one enzyme of the D35G pathway. The BC cassette encoded an anthocyanidin synthase (ANS) from Petunia x hybrida, the CD cassette contained an anthocyanidin-3-O-glycosyl transferase (A3GT) from Arabidopsis thaliana, the DE cassette encoded a flavanone-3-hydroxylase (F3H) from Malus domestica, the EF cassette encoded a dihydroflavonol-4-reductase (DFR) from Anthurium andraeanum, and the FG cassette contained an anthocyanin-5-O-glycosyl transferase (A5GT) from Vitis amurensis.
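Across Examples 2 through 7, the same downstream gene set yields a different anthocyanin depending on the precursor strain it is introduced into, and adding A5GT turns the 3-O-glucoside into the 3,5-O-diglucoside. The sketch below, in Python, summarizes that relationship; the dictionary and function names are illustrative and are not taken from the specification.

```python
# Precursor strain -> anthocyanidin obtained when ANS, DFR, F3H and A3GT
# are added, as reported in the Examples of this section.
ANTHOCYANIDIN_BY_STRAIN = {
    "naringenin": "pelargonidin",   # Examples 2 and 3
    "eriodictyol": "cyanidin",      # Example 5
    "PHF": "delphinidin",           # Examples 6 and 7
}


def expected_product(strain: str, a5gt: bool = False) -> str:
    """Name the glucoside expected from a precursor strain.

    With a5gt=False only the 3-O-glucosyltransferase (A3GT) is present;
    with a5gt=True the 5-O-glucosyltransferase (A5GT) is added as well.
    Example 4 reaches cyanidin-3-O-glucoside by a different route (via a
    catechin producing strain), which this table does not model.
    """
    aglycone = ANTHOCYANIDIN_BY_STRAIN[strain]
    suffix = "3,5-O-diglucoside" if a5gt else "3-O-glucoside"
    return f"{aglycone}-{suffix}"
```

For instance, `expected_product("PHF", a5gt=True)` gives the D35G product of the present example, while `expected_product("naringenin")` gives the P3G product of Example No. 2.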

[00177] The backbone of the HRT vector was formed by the DNA helper fragments ZA, AB and GZ, which contained an auxotrophic yeast selection marker (HIS3), an autonomously replicating sequence (ARS), a yeast centromere (CEN) and a 600 bp stuffer sequence (see Table No. 11 below). Expression of each cassette was driven by a yeast native promoter. The DNA helper fragments, as well as the gene expression cassettes, were flanked by 60 bp homologous recombination tags (HRT), where each terminal tag was identical to the first tag of the following cassette. Each HRT cassette included terminal AscI restriction sites to allow excision from the vector backbone.

[00178] Table No. 11. D35G Pathway Gene Cassettes. *

Plasmid (pEVE) | Cassette | Content        | SEQ ID NO
4134           | BC       | ANS Ph         | 9
4005           | CD       | A3GT At        | 25
4015           | DE       | F3H-1 Md       | 3
4024           | EF       | DFR Aa         | 5
25163          | FG       | A5GT Va        | 45
1918           | GZ       | 600 bp stuffer | 53

* Summary of the plasmids containing the cassettes included in the final HRT vector for D35G production in yeast.

[00179] Plasmids containing the described helper fragments and gene expression cassettes were digested with AscI in a 20 μL reaction volume. The digest was performed for 2 h at 37°C.

[00180] The PHF producing yeast strain was transformed with the HRT digest reaction using the LiAc method (see e.g., Gietz et al., Nat Protoc. 2007;2(1):35-7). After transformation, cells were grown at 30°C for 72 h.

[00181] Individual yeast clones were then grown in 2 mL liquid cultures for 96 h. Subsequently, the cultures were extracted with acidified methanol (1% HCl) at 30°C, 300 rpm for 30 min. Following extraction, the cell debris was precipitated by centrifugation, and the cleared supernatants were collected for analysis by LC/MS. Analysis showed that introduction of the listed genes of Table No. 11 resulted in the production of delphinidin-3,5-O-diglucoside (FIG. 12).

Example No. 8: Production of pelargonidin-3-O-coumaroyl-glucoside (P3CG) and pelargonidin-3-O-coumaroyl glucoside-5-O-glucoside (P35CG) in yeast

[00182] The P3CG and P35CG pathways were assembled in the pelargonidin-3-O-glucoside and pelargonidin-3,5-O-diglucoside producing strains, respectively. The gene for an anthocyanin 3-O-glucoside:6"-O-p-coumaroyl transferase (A3AAT) from Arabidopsis thaliana, which had been codon-optimized for expression in yeast and manufactured by GeneArt AG (Regensburg, Germany), was introduced on a plasmid using the HRT technology. Table No. 12 lists the gene cassettes that were used for pathway assembly.

[00183] The DNA fragment CD was empty, meaning no expression cassette was inserted between the HRTs. The plasmid backbone was formed by the DNA fragments ZA, AB, and DZ which contained an auxotrophic yeast selection marker (LEU2), an autonomously replicating sequence (ARS), a yeast centromere (CEN) and a 600 bp stuffer sequence (see Table No. 12).

[00184] Table No. 12. P3CG and P35CG Pathway Gene Cassettes. *

* Summary of the plasmids containing the cassettes included in the final HRT vector for P3CG and P35CG production in yeast.

[00185] Plasmids containing the described helper fragments and gene expression cassettes were digested with AscI in a 20 μL reaction volume. The digest was performed for 2 h at 37°C.

[00186] The two yeast strains producing P3G and P35G, respectively, were transformed separately with the digested HRT fragments using the LiAc transformation method (see e.g., Gietz et al., Nat Protoc. 2007;2(1):35-7). After transformation, the cells were grown at 30°C for 72 h.

[00187] Individual yeast clones from both transformations were then grown in 2 mL liquid cultures for 96 h. Subsequently, the cultures were extracted with acidified methanol (1% HCl) at 30°C, 300 rpm for 30 min. Following extraction, the cell debris was precipitated by centrifugation, and the cleared supernatants were collected for analysis by LC/MS. Analysis showed that introduction of the gene encoding the anthocyanin 3-O-glucoside:6"-O-p-coumaroyl transferase resulted in the production of pelargonidin-3-O-coumaroyl glucoside (FIG. 13) and pelargonidin-3-O-coumaroyl glucoside-5-O-glucoside (FIG. 14).

Example No. 9: Production of pelargonidin-3-O-malonyl glucoside (P3MG) and pelargonidin-3-O-malonyl glucoside-5-O-glucoside (P35MG) in yeast

[00188] The assembly of the P3MG and P35MG pathways was done in the pelargonidin-3-O-glucoside and pelargonidin-3,5-O-diglucoside producing strains, respectively. The gene encoding an anthocyanin 3-O-glucoside:6"-O-malonyl transferase (A3MAT) from Dahlia variabilis, which had been codon-optimized for expression in yeast and manufactured by GeneArt AG (Regensburg, Germany), was introduced on a plasmid using the HRT technology. Table No. 13 lists the gene cassettes that were used for pathway assembly.

[00189] The DNA fragment CD was empty, meaning no expression cassette was inserted between the HRTs. The plasmid backbone was formed by the DNA fragments ZA, AB, and DZ which contained an auxotrophic yeast selection marker (LEU2), an autonomously replicating sequence (ARS), a yeast centromere (CEN) and a 600 bp stuffer sequence (see Table No. 13).

[00190] Table No. 13. P3MG and P35MG Pathway Gene Cassettes. *

[00191] Plasmids containing the described helper fragments and gene expression cassettes were digested with AscI in a 20 μL reaction volume. The digest was performed for 2 h at 37°C.

[00192] The two yeast strains producing P3G and P35G, respectively, were transformed separately with the digested HRT fragments using the LiAc transformation method (see e.g., Gietz et al., Nat Protoc. 2007;2(1):35-7). After transformation, the cells were grown at 30°C for 72 h.

[00193] Individual yeast clones from both transformations were then grown in 2 mL liquid cultures for 96 h. Subsequently, the cultures were extracted with acidified methanol (1% HCl) at 30°C, 300 rpm for 30 min. Following extraction, the cell debris was precipitated by centrifugation, and the cleared supernatants were collected for analysis by LC/MS. Analysis showed that introduction of the gene encoding the anthocyanin 3-O-glucoside:6"-O-malonyl transferase resulted in the production of pelargonidin-3-O-malonyl glucoside (see FIG. 15) and pelargonidin-3-O-malonyl glucoside-5-O-glucoside (see FIG. 16).

Example No. 10: Analysis of flavonoids and flavonoid derivatives

[00194] LC parameters

[00195] Flavonoids and derivatives were analyzed using liquid chromatography coupled to mass spectrometry (LC/MS). An HSS T3 column, 130 Å, 1.7 μm, 2.1 mm × 100 mm, was employed using the conditions indicated in Table No. 14 below. A = 0.1% formic acid, B = acetonitrile with 0.1% formic acid.

[00196] Table No. 14. Chromatographic gradient for LCMS analysis of flavonoids and flavonoid-derivatives.

Time (min) | Flow (mL/min) | %A | %B
9.00 | 0.400 | 55.0 | 45.0
11.00 | 0.400 | 0.0 | 100.0
13.00 | 0.400 | 0.0 | 100.0
13.01 | 0.400 | 95.0 | 5.0
15.00 | 0.400 | 95.0 | 5.0
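To read a solvent composition off the gradient at an arbitrary time, the rows of Table No. 14 can be interpolated. The sketch below assumes linear ramps between consecutive rows, which the table itself does not state, and it covers only the rows listed above (9.00 to 15.00 min). The names `GRADIENT` and `percent_b_at` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

# (time in min, flow in mL/min, %A, %B), copied from the rows of Table No. 14.
GRADIENT: List[Tuple[float, float, float, float]] = [
    (9.00, 0.400, 55.0, 45.0),
    (11.00, 0.400, 0.0, 100.0),
    (13.00, 0.400, 0.0, 100.0),
    (13.01, 0.400, 95.0, 5.0),
    (15.00, 0.400, 95.0, 5.0),
]


def percent_b_at(t: float, gradient: List[Tuple[float, float, float, float]] = GRADIENT) -> float:
    """Return %B at time t (min), assuming linear ramps between table rows."""
    times = [row[0] for row in gradient]
    if not times[0] <= t <= times[-1]:
        raise ValueError(f"time {t} min is outside the tabulated range")
    # Index of the last row whose time is <= t.
    i = bisect_right(times, t) - 1
    if i == len(gradient) - 1:
        return gradient[-1][3]
    t0, _, _, b0 = gradient[i]
    t1, _, _, b1 = gradient[i + 1]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (9.0, 10.0, 12.0, 14.0):
        print(f"{t:5.2f} min: {percent_b_at(t):6.1f} %B")
```

%A follows as 100 minus %B, since every row of the table sums to 100.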

[00197] MS parameters

[00198] For mass spectrum analysis, full scan spectrum data were recorded using a Xevo® G2-XS (Waters Corporation, Milford, USA) with the parameters indicated in Table No. 15 below.

[00199] Table No. 15. Mass spectrometry parameters.

[00200] Data processing and quantification

[00201] For each compound, an extracted ion chromatogram within a mass window of 0.01 Da was calculated. Peak areas and compound quantities were calculated according to the retention time and linear calibration curve of the respective standard compounds (Sigma-Aldrich, Switzerland) (see Table No. 16 below).

[00202] Table No. 16. Mass spectrometry standards

Compound | Retention Time [min]
Cyanidin | 3.7
Cyanidin-3-glucoside | 2.6
Cyanidin-3,5-diglucoside | 1.9
Pelargonidin | 4.2
Pelargonidin-3-glucoside | 2.9
Pelargonidin-3,5-diglucoside | 2.2
Delphinidin | 3.1
Delphinidin-3-glucoside | 2.3
Delphinidin-3,5-diglucoside | 1.6
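The data-processing steps of paragraph [00201] (an extracted ion chromatogram within a 0.01 Da window, a peak area, and a linear calibration curve) can be sketched as follows. This is a minimal illustration under stated assumptions, not the software actually used: the 0.01 Da window is treated as a total width centered on the target m/z, peak areas are integrated with the trapezoid rule, and all function names and the example numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

# One scan: (retention time in min, [(m/z, intensity), ...]).
Scan = Tuple[float, Sequence[Tuple[float, float]]]


def extracted_ion_chromatogram(scans: Sequence[Scan], target_mz: float,
                               window_da: float = 0.01) -> List[Tuple[float, float]]:
    """Sum intensities within target_mz +/- window_da/2 for each scan (assumed centering)."""
    half = window_da / 2.0
    return [
        (rt, sum(inten for mz, inten in peaks if abs(mz - target_mz) <= half))
        for rt, peaks in scans
    ]


def peak_area(xic: Sequence[Tuple[float, float]], rt_min: float, rt_max: float) -> float:
    """Trapezoid-rule area of the XIC between rt_min and rt_max (inclusive)."""
    pts = [(rt, inten) for rt, inten in xic if rt_min <= rt <= rt_max]
    return sum(0.5 * (i0 + i1) * (t1 - t0) for (t0, i0), (t1, i1) in zip(pts, pts[1:]))


def fit_linear_calibration(concs: Sequence[float], areas: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line area = slope * conc + intercept from standard injections."""
    n = len(concs)
    mean_x, mean_y = sum(concs) / n, sum(areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concs, areas))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(area: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to get a concentration from a peak area."""
    return (area - intercept) / slope


if __name__ == "__main__":
    # Made-up scans around a hypothetical target m/z of 500.000.
    scans = [
        (2.8, [(500.000, 0.0), (500.020, 50.0)]),
        (2.9, [(500.003, 100.0)]),
        (3.0, [(499.998, 0.0)]),
    ]
    xic = extracted_ion_chromatogram(scans, target_mz=500.000)
    area = peak_area(xic, 2.8, 3.0)
    slope, intercept = fit_linear_calibration([1.0, 2.0, 4.0], [5.0, 10.0, 20.0])
    print(xic, area, quantify(area, slope, intercept))
```

In practice the integration window for each compound would be chosen around the retention time of its standard in Table No. 16.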

Example No. 11: Characterization of Isolated Anthocyanins.

[00203] A yeast strain was constructed as described in Example No. 2, but leaving out the DFR gene. This strain was used as a negative control for P3G production. After culturing this strain and the strain from Example No. 2, the broth was acidified with HCl to pH<2 and visually inspected. As seen in FIG. 17, the development of color, corresponding to the presence of P3G, was only achieved when DFR was included in the strain. The control strain without DFR did not produce any color. This shows that the compound(s) giving rise to the color are downstream of the dihydroflavonols, in this case dihydrokaempferol, and is consistent with the detection of P3G in this strain.

[00204] Further, the P3G-producing strain from Example No. 2 was grown, as described, and the broth was adjusted to various pH values: pH<2, pH=5, and pH>10. As seen in FIG. 18, the color observed at each pH value corresponds to the expected pH-dependent color changes, as reported in the literature for P3G.

[00205] Having described the invention in detail and by reference to specific embodiments thereof, it will be apparent that modifications and variations are possible without departing from the scope of the invention defined in the appended claims. More specifically, although some aspects of the present invention are identified herein as particularly advantageous, it is contemplated that the present invention is not necessarily limited to these particular aspects of the invention.

[00206] Sequence IDs of genes/enzymes used in Examples.

vulgare

SEQ ID NO: 23 | DNA sequence encoding cytochrome p450 CPR1 (Ncp1) of Saccharomyces cerevisiae
SEQ ID NO: 24 | Protein sequence of CPR1 of Saccharomyces cerevisiae
SEQ ID NO: 25 | DNA sequence encoding A3GT of Arabidopsis thaliana (pEVE 4005)
SEQ ID NO: 26 | Protein sequence of A3GT of Arabidopsis thaliana
SEQ ID NO: 27 | DNA sequence encoding F3'H of Petunia x hybrida (pEVE 3999)
SEQ ID NO: 28 | Protein sequence of F3'H of Petunia x hybrida
SEQ ID NO: 29 | DNA sequence encoding LAR-1 of Fragaria x ananassa (pEVE 4028)
SEQ ID NO: 30 | Protein sequence of LAR-1 of Fragaria x ananassa
SEQ ID NO: 31 | DNA sequence encoding ATR-1 of Arabidopsis thaliana (pEVE 3975)
SEQ ID NO: 32 | Protein sequence of ATR-1 of Arabidopsis thaliana
SEQ ID NO: 33 | DNA sequence encoding F3'5'H of Viola tricolor
SEQ ID NO: 34 | Protein sequence of F3'5'H of Viola tricolor
SEQ ID NO: 35 | DNA sequence of pEVE 4745 -ZA for HRT integration into XI-3 site
SEQ ID NO: 36 | DNA sequence of pEVE3 69 -AB with URA3 marker flanked by LoxP sites
SEQ ID NO: 37 | DNA sequence of pEVE1919 - Closing linker HZ for 6 gene plasmid or integration
SEQ ID NO: 38 | DNA sequence of pEVE4729 - ZA with HIS3 marker and pSC101 ORI for HRT plasmids
SEQ ID NO: 39 | DNA sequence of pEVE1968 - AB with ARS/CEN origin and CmR marker for HRT plasmids
SEQ ID NO: 40 | DNA sequence of pEVE1917 - Closing linker FZ for 4 gene K 7" plasmid
SEQ ID NO: 41 | DNA sequence of pEVE1765 - ZA with LEU2 marker and pMB1 ORI for HRT plasmids
SEQ ID NO: 42 | DNA sequence of pEVE1915 - Closing

Dahlia variabilis SEQ ID NO: 1 ATGACGACACAAGATGTGATAGTCAATGATCAGAATGATCAGAAACAGT

GTAGTAATGACGTCA I I I I CCGATCGAGATTGCCTGATATATACATCCCT

AACCACCTCCCACTCCACGACTACATCTTCGAAAATATCTCAGAGTTCG

CCGCTAAGCCATGCTTGATCAACGGTCCCACCGGCGAAGTATACACCT

ACGCCGATGTCCACGTAACATCTCGGAAACTCGCCGCCGGTCTTCATAA

CCTCGGCGTGAAGCAACACGACGTTGTAATGATCCTCCTCCCGAACTCT

CCTGAAGTAGTCCTCAC I I I CCTTGCCGCCTCCTTCATCGGCGCAATCA

CCACCTCCGCGAACCCGTTCTTCACTCCGGCGGAGATTTCTAAACAAGC

CAAAGCCTCCGCGGCGAAACTCATCGTCACTCAATCCCGTTACGTCGAT

AAAATCAAGAACCTCCAAAACGACGGCG I I I I GATCGTCACCACCGACT

CCGACGCCATCCCCGAAAACTGCCTCCG I I I CTCCGAGTTAACTCAGTC

CGAAGAACCACGAGTGGACTCAATACCGGAGAAGATTTCGCCAGAAGA

CGTCGTGGCGCTTCCTTTCTCATCCGGCACGACGGGTCTCCCCAAAGG

AGTGATGCTAACACACAAAGGTCTAGTCACGAGCGTGGCGCAGCAAGT

CGACGGCGAGAATCCGAATCTTTACTTCAACAGAGACGACGTGATCCTC

TGTGTCTTGCCTATGTTCCATATATACGCTCTCAACTCCATCATGCTCTG

TAGTCTCAGAGTTGGTGCCACGATCTTGATAATGCCTAAGTTCGAAATC

ACTCTCTTGTTAGAGCAGATACAAAGGTGTAAAGTCACGGTGGCTATGG

TCGTGCCACCGATCG I I I I AGCTATCGCGAAGTCGCCGGAGACGGAGA

AGTATGATCTGAGCTCGGTTAGGATGGTTAAGTCTGGAGCAGCTCCTCT

TGGTAAGGAGCTTGAAGATGCTATTAGTGCTAAGTTTCCTAACGCCAAG

CTTGGTCAGGGCTATGGGATGACAGAAGCAGGTCCGGTGCTAGCAATG

TCGTTAGGGTTTGCTAAAGAGCCGTTTCCAGTGAAGTCAGGAGCATGTG

GTACGGTGGTGAGGAACGCCGAGATGAAGATACTTGATCCAGACACAG

GAGATTCTTTGCCTAGGAACAAACCCGGCGAAATATGCATCCGTGGCAA

CCAAATCATGAAAGGCTATCTCAATGACCCCTTGGCCACGGCATCGACG

ATCGATAAAGATGGTTGGCTTCACACTGGAGACGTCGGATTTATCGATG

ATGACGACGAGC I I I I CATTGTGGATAGATTGAAAGAACTCATCAAGTA

CAAAGGA I I I CAAGTGGCTCCAGCTGAGCTAGAGTCTCTCCTCATAGGT

CATCCAGAAATCAATGATGTTGCTGTCGTCGCCATGAAGGAAGAAGATG

CTGGTGAGGTTCCTGTTGCGTTTGTGGTGAGATCGAAAGATTCAAATAT

ATCCGAAGATGAAATCAAGCAATTCGTGTCAAAACAGGTTGTG I I I I ATA

AGAGAATCAACAAAGTGTTCTTCACTGACTCTATTCCTAAAGCTCCATCA

GGGAAGATATTGAGGAAGGATCTAAGAGCAAGACTAGCAAATGGATTAA

TGAACTAG

SEQ ID NO: 2 MTTQDVIVNDQNDQKQCSNDVIFRSRLPDIYIPNHLPLHDYIFENISEFAAKP

CLINGPTGEVYTYADVHVTSRKLAAGLHNLGVKQHDWMILLPNSPEWLTF

LAASFIGAITTSANPFFTPAEISKQAKASAAKLIVTQSRYVDKIKNLQNDGVLI

VTTDSDAIPENCLRFSELTQSEEPRVDSIPEKISPEDWALPFSSGTTGLPK

GVMLTHKGLVTSVAQQVDGENPNLYFNRDDVILCVLPMFHIYALNSIMLCSL

RVGATILIMPKFEITLLLEQIQRCKVTVA VVPPIVLAIAKSPETEKYDLSSVR

MVKSGAAPLGKELEDA!SAKFPNAKLGQGYG TEAGPVLAMSLGFAKEPF

PVKSGACGTWRNAE KILDPDTGDSLPRNKPGEICIRGNQI KGYLNDPL

ATASTIDKDGWLHTGDY3F1DDDDELFIVDRLKEUKYKGFQVAPAELESLU

GHPEINDVAWAMKEEDAGEVPVAFWRSKDSNISEDEIKQFVSKQWFYK

RINKVFFTDSIPKAPSGKILRKDLRARLANGLMN
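The listing pairs each DNA sequence with the protein it encodes (for example SEQ ID NO: 1 and SEQ ID NO: 2). A short sketch of how such a pair can be cross-checked by translating the DNA with the standard genetic code is given below. It is an illustration only: the helper name `translate` is hypothetical, and the example uses only the first printed line of SEQ ID NO: 1, because later lines of the listing contain character-recognition damage that this code would reject.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (T, C, A, G at each position).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence from its first base, stopping at a stop codon.

    Whitespace is ignored; any character other than A, C, G or T raises KeyError,
    and an incomplete trailing codon is dropped.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First printed line of SEQ ID NO: 1.
    first_line = "ATGACGACACAAGATGTGATAGTCAATGATCAGAATGATCAGAAACAGT"
    print(translate(first_line))  # compare with the start of SEQ ID NO: 2
```

The translated prefix can then be compared with the first residues of the paired protein sequence.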

SEQ ID NO: 3 ATGGCTCCAGCCACTACCTTAACCTCTATTGCACATGAAAAGACATTACA

GCAGAAGTTCGTTAGAGATGAGGATGAAAGGCCTAAGGTTGCCTATAAC GACTTTTCTAATGAAATTCCAATAATCTCTTTGGCTGGTATAGACGAAGT AGAAGGTAGAAGGGGAGAAATATGTAAGAAGATTGTTGCAGCTTGCGAA GATTGGGGCATTTTCCAGATCGTAGACCATGGTGTAGATGCCGAATTGA TATCAGAAATGACAGGTTTGGCTAGAGAATTCTTCGCATTGCCTTCAGA AGAGAAGTTAAGGTTTGATATGTCCGGTGGTAAGAAAGGTGG I 1 1 1 ATA GTCTCTAGTCATTTAC AG GGTG AAG CCGTTCAAG ATTG G AG AG AAATC G TAACATA 1 1 1 CTCATACCCAATTAGACACAGAGATTACTCCAGGTGGCCT GATAAGCCAGAAGCCTGGAGGGAAGTTACTAAGAAATACTCAGATGAGT TGATGGGATTAGCTTGTAAATTGTTGGGCGTGTTGTCAGAAGCCATGGG ATTGGATACAGAGGCCTTGACCAAAGCATGTGTTGATATGGACCAAAAG GTAGTTGTCAACTTCTACCCTAAATGCCCTCAACCAGACTTGACATTAG GCTTGAAAAGACATACCGACCCCGGCACTATCACTTTATTATTACAAGA CCAAGTCGGTGGTTTGCAGGCTACTAGAGACGACGGTAAAACCTGGAT CACTGTTCAACCCGTTGAAGGAGCATTCGTCGTTAA I t 1 GGGCGATCAT G G ACACTTATTGTCCAATG GTAG ATTTAAG AATG CTG ATCACCAAG CTG TG GT C AACTCTAAT AGT AGT AG ATT ATC C ATTG CT AC ATTTC AG AACCC A GCACAAGAAGCAATTG 1 1 1 ATCCTTTATCTGTGAGAGAAGGAGAGAAGC CTA I 1 1 1 AGAGGCACCAATTACATATACTGAGATGTATAAGAAGAAGATG TCTAAAGA 1 1 1 GGAGTTAGCAAGATTGAAGAAATTAGCTAAAGAGCAACA AAGTCAAGATTTAGAGAAGGCTAAAGTGGATACTAAACCAGTGGATGAT ATCTTCGCTTAA

SEQ ID NO: 4 APATTLTSIAHEKTLQQKFVRDEDERPKVAYNDFSNEIPIISLAGIDEVEGR

RGEICKKIVAACEDWGIFQIVDHGVDAELISEMTGLAREFFALPSEEKLRFD

MSGGKKGGFIVSSHLQGEAVQDWREIVTYFSYPIRHRDYSRWPDKPEAW

REVTKKYSDELMGLACKLLGVLSEAMGLDTEALTKACVD DQKVWNFYP

KCPQPDLTLGLKRHTDPGTITLLLQDQVGGLQATRDDGKTWITVQPVEGAF

WNLGDHGHLLSNGRFKNADHQAWNSNSSRLSIATFQNPAQEAIVYPLSV

REGEKPILEAPITYTEMYKKKMSKDLELARLKKLAKEQQSQDLEKAKVDTKP

VDDIFA

SEQ ID NO: 5 ATGATGCACAAAGGTACAGTTTGTGTTACTGGTGCTGCCGGCTTCGTAG

GTAGTTGGTTAATCATGAGGTTATTAGAACAAGGTTACTCCGTTAAGGCT

ACAGTGAGAG JCCTTCTAACATGAAGAAAGTTAAGCATTTGTTGGA1 TT

ACCCGGAGCAGCAAATAGGTTGACTTTGTGGAAGGCAGATTTAGTTGAT

GAAGGTTCCTTTGATGAACCTATTCAAGGTTGCACAGGTGTATTCCATG

TCGCAACTCCAATGGA 1 1 1 CGAGTCTAAAGATCCTGAGAGTGAGATGAT

TAAACCTACAATCGAGGGCATGTTAAACG 1 S S 1 GAGGTCATGTGCAAGA

GCATCCAGTACTGTCAGAAGGGTAG M i l CACTTCCTCTGCCGGTACTG

TTAGTATCCATGAAGGCAGAAGACACTTATACGATGAAACCAGTTGGTC

AGACGTCGATTTCTGCAGGGCCAAGAAGATGACAGGTTGGATGTATTTC

GTCTCTAAAACCTTAGCAGAAAAGGCCGCCTGGGA 1 1 1 CGCAGAAAAGA

ATAACATTGACTTCATTTCTATTATACCCACTTTAGTCAATGGTCCC I 1 I G

TTATGCCAACTATGCCACCATCAATGTTGTCAGCTTTGGCTTTAATTACC

AGAAATGAACCTCATTACTCAATTTTGAACCCTGTGCAATTTGTACATTT

G G ATG ATTT ATG C AATG CTC ATA 1 1 1 1 CTTGTTTGAATGTCCAGATGCTA

AGGGTAGATACATCTGTTCTTCACACGATGTAACAATCGCCGG 1 1 I AGC

TCAAATATTGAGACAAAGATATCCAGAGTTTGACGTGCCAACAGAATTTG

GAGAAATGGA GTGTTTGACATTATATCATATTCTTCTAAGAAGTTAA OT

GACTTGGGATTTGAATTTAAATATTCTTTAGAGGACATG 1 1 1 GACGGCGC

TATACAGTCTTGTAGAGAAAAGGGCTTGTTGCCTCCAGCTACAAAAGAA

CCATCCTATGCTACCGAACAATTGATAGCTACCGGACAGGACAATGGAC

ACTAA

SEQ ID NO: 6 MMHKGTVCVTGAAGFVGSWLIMRLLEQGYSVKATVRDPSNMKKVKHLLDL PGAANRLTLWKADLVDEGSFDEPIQGCTGVFHVATPMDFESKDPESEMIK

PTIEGMLNVLRSCARASSTVRRWFTSSAGTVSIHEGRRHLYDETSWSDVD

FCRAKKMTGW YFVSKTLAEKAAWDFAEKNNIDFISIiPTLVNGPFV PTM

PPS LSALALITRNEPHYSILNPVQFVHLDDLCNAHIFLFECPDAKGRYICSS

HDVTIAGLAQILRQRYPEFDVPTEFGEMEVFDIISYSSKKLTDLGFEFKYSLE

DMFDGAIQSCREKGLLPPATKEPSYATEQLIATGQDNGH

SEQ ID NO: 7 ATGGGTACTGAAGCTGAAACCGTTTGTGTTACTGGTGCTTCTGGTTTTAT

TGGTTCCTGGTTGATCATGAGATTATTGGAAAAAGGTTACGCTGTTAGA

GCCACTGTTAGAGATCCAGATAATATGAAGAAGGTCACCCACTTGTTGG TTGCCAAAGGCTTCTACTCATTTGACTTTGTGGAAAGCCGATTTGTCT

GTTGAAGGTTCTTACGATGAAGCTATTCAAGGTTGTACTGGTG M M CCA

TGTTGCTACTCCAATGGATTTCGAATCTAAGGATCCAGAAAACGAAGTTA

TCAAGCCAACCATTAACGGTG M M G G ATATTATG AG AG CTTG CG CTAA

CTCTAAGACCGTTAGAAAGATCG M M CACTTCTTCTGCTGGTACTGTTG

ATGTCGAAGAAAAAAGAAAGCCAGTCTACGATGAATCTTGCTGGTCTGA

TTTGGATTTCGTCCAATCTATTAAGATGACCGGTTGGATGTACTTCG 1 1 1

CTAAAAC 1 1 1 GGCTGAACAAGCTGCTTGGAAGTTCGCTAAAGAAAACAA

CTTGGACTTCATCTCCATTATCCCAACTTTGGTTGTTGGTCCATTCATCA

TGCAATCTATGCCACCATCTTTGTTGACTGCCTTGTCTTTGATTACTGGT

AACGAAGCTCATTACGGTATCTTGAAACAAGGTCATTACGTTCACTTGG

ATGACTTGTGTATGTCCCATATCTTCTTGTACGAAAACCCAAAAGCTGAA

GGTAGATATATCTGCAACTCTGATGATGCCAACATTCATGATTTGGCTAA

GTTGTTGAGAGAAAAGTACCCAGAATACAACGTTCCAGCTAAGTTCAAG

GATATCGACGAAAATTTGGCTTGCGTTGC 1 1 1 CTCATCTAAGAAGTTGAC

AGATTTGGGTTTCGAATTCAAGTACTCCTTGGAAGATATGTTTGCTGGTG

CAGTTGAAACCTGTAGAGAAAAGGGTTTGATTCCATTGTCCCACAGAAA

ACAAGTCGTCGAAGAATGCAAAGAAAATGAAGTTGTTCCAGCTTCTTAA

SEQ ID NO: 8 MGTEAETVCVTGASGFIGSWLIMRLLEKGYAVRATVRDPDNMKKVTHLLEL

PKASTHLTLWKADLSVEGSYDEAIQGCTGVFHVATP DFESKDPENEVIKP

TINGVLDIMRACANSKTVRKIVFTSSAGTVDVEEKRKPVYDESCwVSDLDFV

QSIKMTGWMYFVSKTLAEQAAWKFAKENNLDFISIIPTLWGPFIMQSMPPS

LLTALSLITGNEAHYGILKQGHYVHLDDLCMSHIFLYENPKAEGRYICNSDD

ANIHDLAKLLREKYPEYNVPAKFKDIDENLACVAFSSKKLTDLGFEFKYSLE

DMFAGAVETCREKGLIPLSHRKQWEECKENEVVPAS

SEQ ID NO: 9 ATGGTTAACGCCGTTGTTACTACCCCATCTAGAGTTGAATCTTTGGCTAA

GTCTGGTATTCAAGCCATCCCAAAAGAATACGTTAGACCACAAGAAGAA

TTGAACGGTATCGGTAACA M M CGAAGAAGAAAAGAAAGACGAAGGTC

CACAAGTTCCAACCATCGA 1 1 1 GAAAGAAATCGACTCCGAAGACAAAGA

AATCAGAGAAAAGTGCCACCAATTGAAAAAGGCTGCTATGGAATGGGGT

GTTATGCA 1 1 1 GGTTAATCACGGTATCTCCGACGAATTGATCAACAGAGT

TAAGGTTGCTGGTGAAACCTTTTTCGATCAACCAGTCGAAGAAAAAGAA

AAGTACGCTAACGATCAAGCCAACGGTAATGTTCAAGGTTACGGTTCTA

AATTGGCTAACTCTGCTTGTGGTCAATTGGAATGGGAAGATTACTTTTTC

CATTGCGCTTTCCCAGAAGATAAGAGAGA 1 1 1 GTCTATCTGGCCAAAGA

ACCC* ^CTGATTATACTCCAGCTACTTCTGAATACGCCAAG^ OTAGA

GCTTTGGCTACTAAGATTTTGACCGTCTTGTCTATTGGTTTGGGTTTGGA

AGAAGGTAGATTGGAAAAAGAAGTTGGTGGTATGGAAGATTTGTTGTTG

CAAATGAAGATCAACTACTACCCAAAGTGTCCACAACCAGAATTGGCTT

TGGGTGTTGAAGCTCATACTGATGTTTCTGCTTTGACCTTCATCTTGCAT

AATATGGTCCCAGGTTTACAATTATTCTACGAAGGTCAATGGGTTACCG CTAAGTGTGTTCCAAATTCCATTATCATGCATATCGGTGACACCATCGAA ATCTTGTCTAACGGTAAATACAAGTCCATCTTGCACAGAGGTGTTGTCAA C AA AG AAAAG GTT AG ATTCTCCTG G GCT A I I I I CTGTGAACCACCTAAA G AAAAG ATCATCTTG AAG CCATTG CCAG AAACTGTTACTG AAG CTGAAC CACCAAGATTTCCACCAAGAAC I I I I GCTCAACATATGGCCCATAAGTTG TTCAGAAAGGATGATAAGGATGCTGCCGTTGAACATAAGG I I I I CAACG AAGATGAATTGGATACTGCTGCTGAACACAAAGTCTTGAAGAAGGATAA TCAAGACGCTGTTGCTGAAAACAAGGACATCAAAGAAGATGAACAATGT GGTCCAGCAGAACACAAAGATATCAAAGAAGATGGTCAAGGTGCTGCT GCAGAAAACAAGG I I I I CAAAGAAAACAATCAAGATGTCGCCGCCGAAG AATCTAAGTAA

SEQ ID NO: 10 MVNAWTTPSRVESLAKSGIQAIPKEYVRPQEELNGIGNIFEEEKKDEGPQV

PTIDLKEIDSEDKEIREKCHQLKKAAMEWGVMHLVNHGISDELINRVKVAGE

TFFDQPVEEKEKYANDQANGNVQGYGSKLANSACGQLEWEDYFFHCAFP

EDKRDLSIWPKNPTDYTPATSEYAKQIRALATKILTVLSIGLGLEEGRLEKEV

GGMEDLLLQMKINYYP CPQPELALGVEAHTDVSALTFILHNMVPGLQLFY

EGQWVTAKCVPNSI IMHIGDTIEILSNGKYKS I LHRGWNKEKVRFSWAIFCE

PPKEKIILKPLPETVTEAEPPRFPPRTFAQHMAHKLFRKDDKDAAVEHKVFN

EDELDTAAEHKVLKKDNQDAVAENKDIKEDEQCGPAEHKDIKEDGQGAAA

ENKVFKENNQDVAAEESK*

SEQ ID NO: 11 ATGTCAGCAAATTCTAACTACATGAACAAAAGTCGTCTCCATGTCGCTGT

GTTTCCATTCCCTTTTGGAACACACGCGACTCCACTTTTCAACATAACCC

AAAAACTAGCATCATTTATGCCTGATGTCGTCTTCTCCTTCTTCAACATC

CCACAATCCAACGCTAAGATATCTTCTGA I I I I AAAAACGATACCATAAA

CATGTATGATGTGTGGGACGGGGTGCCGGAAGGATATGTCTTCAAGGG

TAAGCCTCAAGAAGACATCGAGCTCTTCATGCTGGCTGCACCTCCCACA

TTGACAGAGGCGTTGGCTAAAGCCGAGGTGGAAACAGGGACCAAGGTG

AGCTGCATACTTGGCGATGCCTTTTTATGGTTCCTGGAGGAACTCGCCC

AACAAAAACAAGTTCCCTGGATTACTACTTATATGTCTGAGGAGCATTCT

C I I I I GGCTCATATTTGCACTGATCTTATCAGACAAACTATTGGCATTCA

TGAGAAAGCAGAAGAGCGGAAAGATGAAGAGCTAGATTTCATTCCAGG

ATTGTCCAAGATTAGAGTCCAAGACTTACCAGAGGGAATCGTGATGGGA

AATTTGGATTCGTA I I I I GCGAGAATGCTTCACCAAATGGGGCGGGCAT

TACCGCGTGCATCAGCAGTTTGCATTAGTTCATGTCAAGAACTAGACCC

TGTTGCGACTAATGAGCTTAACAGAAAATTGAATAAATTGATTAATGTTG

GACCTCTAAGTCTAATTACGCAATCAAACTCATTACCTTCAGGCACAAAC

AAGAGTCTGGGTTGGCTTGATAAACAAGAATCTGAAAACAGTGTTGCGT

ACGTTAG I I I I GGGTCAGTTGCACGCCCTGATGCAACCGAGATTACAGC

CCTGGCTCAAGCATTGGAGGCAAGTCAGGTCAAATTTATCTGGTCGATT

AGAGACAATCTTAAGGTACA I I I GCCAGGTGGATTTATTGAGAATACAAA

GGATAAAGGGATGGTGGTGTCGTGGGTGCCACAGACAGCTGTGTTGGC

TCACAAGGCAGTTGGTG I I I I CATAACCCATTTCGGTCACAATTCCATCA

TGGAAAGTATTGCAAGTGAGGTTCCAATGATAGGGCGACCATTCATCGG

GGAACAAAAGTTGAACGGTAGAATAGTGGAAGCCAAATGGTGTATCGGT

TTGGTTGTGGAAGGTGGAG I I I I CACTAAAGATGG " > " TACTGAGAAGCT

TGAACAAAATACTAGGTAGCACACAAGGTGAAGAAATGAGGAGAAATAT

AAGAGACCTACGACTCATGGTTGACAAGGCACTCAGTCCTGACGGAAG

CTGCAATACAAACTTGAAACATTTGGTCGACATGATCGTCACTTCTAACT

AA

SEQ ID NO: 12 MSANSNYMNKSRLHVAVFPFPFGTHATPLFNITQKLASFMPDWFSFFNIP

QSNAKISSDFKNDTIN YDVWDGVPEGYVFKGKPQEDIELFMLAAPPTLTE

ALAKAEVETGTKVSCILGDAFLWFLEELAQQKQVPWITTYMSEEHSLLAHIC

TDLIRQTIGIHEKAEERKDEELDFIPGLSKIRVQDLPEGIVMGNLDSYFARML

HQMGRALPRASAVCISSCQELDPVATNELNRKLNKLINVGPLSLITQSNSLP

SGTNKSLGWLDKQESENSVAYVSFGSVARPDATEITALAQALEASQVKFIW

SIRDNLKVHLPGGFIENTKDKGMWSWVPQTAVLAHKAVGVFITHFGHNSI

MESIASEVPMIGRPFIGEQKLNGRIVEAKWCIGLWEGGVFTKDGVLRSLNK

ILGSTQGEEMRRNIRDLRLMVDKALSPDGSCNTNLKHLVD IVTSN

SEQ ID NO: 13 ATGGCTGCTTCCATTACCGCTATTACCGTTGAAAATTTGGAATACCCAG

CTGTTGTTACTTCTCCAGTTACTGGTAAGTCTTACTTTTTGGGTGGTGCT

GGTGAAAGAGGTTTGACTATTGAAGGTAACTTCATTAAGTTCACCGCCA

TCGGTGTTTACTTGGAAGATATTGCTGTTGCTTCTTTGGCTGCTAAATGG

AAGGGTAAATCCTCCGAAGAATTATTGGAAACCTTGGACTTCTACAGAG

ACATTATTTCTGGTCCATTCGAAAAGTTGATCAGAGGTTCCAAGATCAGA

GAATTGTCTGGTCCAGAATACTCCAGAAAGGTTATGGAAAATTGCGTTG

CCCATTTGAAGTCTGTTGGTACTTATGGTGATGCTGAAGCTGAAGCTAT

GCAAAAATTTGCTGAAGCCTTTAAGCCAGTTAATTTTCCACCAGGTGCTT

CCGTTTTTTACAGACAATCTCCAGATGGTATCTTGGGTTTGTCTTTTTCA

CCAGATACCTCCATCCCAGAAAAAGAAGCTGCTTTGATTGAAAACAAGG

CTGTTTCTTCTGCTGTCTTGGAAACTATGATTGGTGAACATGCTG I I I CC

CCAGATTTGAAAAGATG I I I AGCTGCTAGATTGCCTGCCTTGTTGAATGA

AGGTGC I I I I AAGATTGGTAACTAA

SEQ ID NO: 14 M AAS ITAITVEN LEYP AWTS PVTGKS YF LGG AGERGLTI EGN F I KFTAI G VY

LEDIAVASL-AAKWKGKSSEELLETLDFYRDIISGPFEKLIRGSKIRELSGPEYS

RKVMENCVAHLKSVGTYGDAEAEA QKFAEAFKPVNFPPGASVFYRQSP

DGILGLSFSPDTSIPEKEAALIENKAVSSAVLETMIGEHAVSPDLKRCLAARL

PALLNEGAFKIGN

SEQ ID NO: 15 ATGGCGGGCAACGGCGCCATCGTGGAGAGCGACCCGCTGAACTGGGG

CGCGGCGGCGGCGGAGCTGGCCGGGAGCCACCTGGACGAGGTGAAG

CGCATGGTGGCGCAGGCCCGGCAGCCCG i GGTCAAGATCGAGGGCTC

CACCCTCCGCGTCGGCCAGGTGGCCGCCGTCGCCTCCGCCAAGGACG

CGTCCGGCGTCGCCGTCGAGCTCGACGAGGAGGCCCGCCCCCGCGTC

AAGGCCAGCAGCGAGTGGATCCTCGACTGCATCGCCCACGGCGGCGA

CATCTACGGCGTCACCACCGGCTTCGGCGGCACCTCCCACCGCCGCA

CCAAGGACGGGCCCGCGCTCCAGGTCGAGCTGCTCAGGCATCTCAAC

GCCGGAATCTTCGGCACCGGCAGCGACGGGCACACGCTGCCGTCGGA

GGTCACCCGCGCGGCGATGCTGGTGCGCATCAACACCCTCCTCCAGG

GCTACTCCGGCATCCGCTTCGAGATCCTCGAGGCCATCACGAAGCTGC

TCAACACCGGTGTCAGCCCCTGCCTGCCGCTCCGGGGCACCATCACCG

CGTCGGGCGACCTGGTCCCGCTCTCCTACATCGCCGGCCTCATCACGG

GCCGCCCCAACGCGCAGGCCGTCACCGTCGACGGAAGGAAGGTGGAC

GCCGCCGAGGCGTTCAAGATCGCCGGCATCGAGGGCGGCTTCTTCAA

GCTCAACCCCAAGGAGGGCCTCGCCATCGTCAACGGCACGTCCGTGG

GCTCCGCGCTCGCGGCCACCGTGATGTACGACGCCAACGTCCTGGCC

GTCCTGTCGGAGGTCCTGTCCGCCGTCTT°TGCGAGGTCATGAACGGC

AAGCCCGAGTACACGGACCACCTGACCCACAAGCTGAAGCACCACCCG

GGGTCCATCGAGGCCGCGGCCATCATGGAGCACATCCTGGATGGCAG

CTCCTTCATGAAGCAGGCCAAGAAGGTGAACGAGCTGGACCCGCTGCT

GAAGCCCAAGCAGGACAGGTACGCGCTCCGCACGTCGCCGCAGTGGC

TGGGCCCCCAGATCGAGGTCATCCGCGCCGCCACCAAGTCCATCGAG CGCGAGGTCAACTCCGTGAACGACAACCCGGTCATCGACGTCCACCGC

GGCAAGGCGCTGCACGGCGGCAACTTCCAGGGCACCCCCATCGGCGT

GTCCATGGACAACGCCCGCCTCGCCATCGCCAACATCGGCAAGCTCAT

GTTCGCGCAGTTCTCCGAGCTCGTCAACGAGTTCTACAACAACGGGCT

CACCTCCAACCTGGCCGGCAGCCGCAACCCCAGCCTGGACTACGGCTT

CAAGGGCACCGAGATCGCCATGGCCTCCTACTGCTCCGAGCTCCAGTA

CCTGGGCAACCCCATCACCAACCACGTGCAGAGCGCGGACGAGCACA

ACCAGGACGTGAACTCCCTGGGCCTCGTCTCGGCCAGGAAGACCGCC

GAGGCGATCGACATCCTGAAGCTCATGTCGTCCACCTACATCGTGGCG

CTGTGCCAGGCCGTGGACCTGCGCCACCTCGAGGAGAACATCAAGGC

GTCGGTGAAGAACACCGTGACCCAGGTGGCCAAGAAGGTGCTGACCAT

GAACCCCTCGGGCGAGCTCTCCAGCGCCCGCTTCAGCGAGAAGGAGC

TGATCAGCGCCATCGACCGCGAGGCCGTGTTCACGTACGCGGAGGAC

GCGGCCAGCGCCAGCCTGCCGCTGATGCAGAAGCTGCGCGCCGTGCT

GGTGGACCACGCCCTCAGCAGCGGCGAGCGCGGAGCGGGAGCCCTC

CGTGTTCTCCAAGATCACCAGGTTCGAGGAGGAGCTCCGCGCGGTGCT

GCCCCAGGAGGTGGAGGCCGCCCGCGTGGCGTCGCCGAGGGCACCG

CCCCCGTGGCGAACCGGATCGCGGACAGCCGGTCGTTCCCGCTGTAC

CGCTTCGTGCGCGAGGAGCTCGGCTGCGTGTTCCTGACCGGCGAGAG

GCTCAAGTCCCCCGGCGAGGAGTGCAACAAGGTGTTCGTCGGCATCAG

CCAGGGCAAGCTCGTGGACCCCATGCTCGAGTGCCTCAAGGAGTGGG

ACGGCAAGCCGCTGCCCATCAACATCAAGTAA

SEQ ID NO: 16 AGNGAIVESDPLNWGAAAAELAGSHLDEVKRMVAQARQPWKIEGSTLR

VGQVAAVASAKDASGVAVELDEEARPRVKASSEWILDCIAHGGDIYGVTTG

FGGTSHRRTKDGPALQVELLRHLNAGIFGTGSDGHTLPSEVTRAAMLVRIN

TLLQGYSGIRFEILEAITKLLNTGVSPCLPLRGTITASGDLVPLSYIAGLITGRP

NAQAVTVDGRKVDAAEAFKIAGIEGGFFKLNPKEGLAIVNGTSVGSALAATV

MYDANVLAVLSEVLSAVFCEVMNGKPEYTDHLTHKLKHHPGSIEAAAIMEHI

LDGSSFMKQAKKVNELDPLLKPKQDRYALRTSPQWLGPQIEVIRAATKSIE

REVNSVNDNPVIDVHRGKALHGGNF QGTPIGVSMDNARLAIANIGKLMFAQ

FSELVNEFYNNGLTSNLAGSRNPSLDYGFKGTEIAMASYCSELQYLGNPIT

NHVQSADEHNQDVNS LG LVSARKTAEAID I LKL SSTYIVALCQAVDLRHLE

ENIKASVKNTVTQVAKKVLTMNPSGELSSARFSEKELISAIDREAVFTYAED

AASASLPLMQKLRAVLVDHALSSGERGAGALRVLQDHQVRGGAPRGAAP

GGGGRPRGVAEGTAPVANRIADSRSFPLYRFVREELGCVFLTGERLKSPG

EECNKVFVGISQGKLVDPMLECL EWDGKPLPINIK

SEQ ID NO: 17 ATGGACCAAATTGAAGCAATGCTATGCGGTGGTGGTGAAAAGACCAAG

GTGGCCGTAACGACAAAAACTCTTGCAGATCCTTTGAATTGGGGTCTGG

CAGCTGACCAGATGAAAGGTAGCCATCTGGATGAAGTTAAGAAGATGGT

TGAGGAATACAGAAGACCAGTCGTAAATCTAGGCGGCGAGACATTGAC

GATAGGACAGGTAGCTGCTATTTCGACCGTTGGCGGTTCAGTGAAGGT

AGAACTTGCAGAAACAAGTAGAGCCGGAGTTAAGGCTTCATCAGATTGG

GTCATGGAAAGTATGAACAAGGGCACAGATTCCTATGGCGTTACCACAG

GCTTTGGTGCTACCTCTCATAGAAGAACTAAAAATGGCACTGCTTTGCA

AACAGAACTGATCAGATTCCTTAA^GCCGGTA I I I I CGGTAATACAAAG

GAAACTTGCCATACATTACCCCAATCGGCAACAAGAGCTGCTATGCTTG

TTAGGGTGAACACTTTGTTGCAAGGTTACTCTGGAATAAGGTTTGAAATT

CTTGAGGCCATCACTTCACTATTGAACCACAACATTTCTCCTTCGTTGCC

CTTAAGAGGAACAATAACTGCCAGCGGTGATTTGGTTCCCCTTTCATAT

ATCGCAGGCTTATTAACGGGAAGACCTAATTCAAAGGCCACTGGTCCAG ACGGAGAATCCTTAACCGCTAAGGAAGCATTTGAGAAAGCTGGTATTTC

AACTGGTTTC I I I GATTTgCAACCCAAGGAAGGTTTAGCCCTGGTGAATG

GCACCGCTGTCGGCAGCGGTATGGCATCCATGGTGTTG I I i GAAGCTA

ACGTACAAGCAGTTTTGGCCGAAGTTTTGTCCGCAATTTTTGCCGAAGT

CATGAGTGGAAAACCTGAGTTTACTGATCACTTGACCCACAGGTTAAAA

CATCACCCAGGACAAATTGAAGCAGCAGCTATCATGGAGCACA I I I I GG

ACGGCTCTAGCTACATGAAGTTAGCCCAGAAGGTTCATGAAATGGACCC

I I I GCAAAAACCCAAACAAGATAGATATGCTTTAAGGACATCCCCACAAT

GGCTTGGCCCTCAAATTGAAGTAATTAGACAAGCTACAAAGTCTATAGA

AAGAGAGATCAACTCTGTTAACGATAATCCACTTATTGATGTGTCGAGG

AATAAGGCAATACATGGAGGCAATTTCCAGGGTACACCCATAGGAGTCA

GTATGGATAATACCAGGCTTGCCATAGCCGCAATTGGCAAATTAATGTT

TGCCCAA I I I I CTGAATTGGTCAATGACTTCTACAATAACGGTTTGCCTT

CGAATCTGACCGCATCTTCTAACCCTAGTCTTGATTATGG I I I CAAAGGT

GCTGAGATAGCAATGGCAAGCTATTGTTCAGAGCTGCAATATCTAGCCA

ACCCAGTAACCTCTCATGTACAATCAGCCGAACAACACAATCAGGATGT

TAATTCTTTGGGCCTGA I I I CATCAAGAAAAACAAGCGAGGCCGTTGAT

ATCCTTAAATTAATGTCCACAACATTTTTAGTGGGTATATGCCAGGCCGT

AGATTTgAGACACTTGGAAGAGAA I I I GAGACAGACAGTGAAAAATACC

GTATCACAGGTTGCAAAAAAGGTTCTAACTACAGGTATCAATGGTGAATT

GCACCCATCAAGATTCTGTGAAAAAGATTTATTAAAAGTTGTAGATAGAG

AACAAGTATTTACTTACGTTGACGATCCATGTAGCGCTACTTATCCATTG

ATGCAGAGATTGAGACAAGTTATTGTAGATCACGC I I I ATCCAATGGTG

AAACTGAGAAAAATGCCGTTACTTCAATATTCCAAAAGATAGGTGCCTTT

GAAGAAGAACTGAAGGCAG I I I I ACCAAAGGAAGTCGAAGCTGCTAGA

GCCGCATACGGAAATGGTACTGCCCCTATACCAAATAGAATCAAAGAGT

GTAGGTCGTACCCTTTGTACAGATTCGTTAGAGAAGAGTTGGGAACCAA

ATTACTAACTGGTGAAAAAGTCGTTAGCCCAGGTGAAGAATTTGACAAG

GTATTCACAGCTATGTGCGAGGGAAAGTTGATAGATCCACTTATGGATT

GCTTGAAAGAGTGGAATGGTGCACCTATTCCAATCTGCTAA

SEQ ID NO: 18 MDQIEAMLCGGGEKTKVAVTTKTLADPLNWGLAADQ KGSHLDEVKKMV

EEYRRPWNLGGETLTIGQVAAISTVGGSVKVELAETSRAGVKASSDWVME

SMNKGTDSYGVTTGFGATSHRRTKNGTALQTELIRFLNAGIFGNTKETCHT

LPQSATRAAMLVRVNTLLQGYSGIRFEILEAITSLLNHNISPSLPLRGTITASG

DLVPLSYIAGLLTGRPNSKATGPDGESLTAKEAFEKAGISTGFFDLQPKEGL

ALVNGTAVGSGMASMVLFEANVQAVLAEVLSAIFAEVMSGKPEFTDHLTHR

LKHHPGQIEAAAIMEHILDGSSYMKLAQKVHEMDPLQKPKQDRYALRTSPQ

WLGPQIEVIRQATKSIEREINSVNDNPLIDVSRNKAIHGGNFQGTPIGVSMD

NTRLAIAAIGKLMFAQFSELVNDFYNNGLPSNLTASSNPSLDYGFKGAEIAM

ASYCSELQYLANPVTSHVQSAEQHNQDVNSLGLISSR TSEAVDILKLMST

TFLVGICQAVDLRHLEENLRQTVKNTVSQVAKKVLTTGINGELHPSRFCEKD

LLKWDREQVFTYVDDPCSATYPLMQRLRQVIVDHALSNGETEKNAVTSIF

QKIGAFEEELKAVLPKEVEAARAAYGNGTAPIPNRIKECRSYPLYRFVREEL

GTKLLTGEKVVSPGEEFDKVFTAMCEGKLIDPL DCLKEWNGAPIPIC

SEQ ID NO: 19 ATGATGGATTTTGTTTTGTTAGAAAAAGCTCTTCTTGGTTTGTTCATTGCA

ACTAT AGTAG CC ATCACAA FCTCTAAG CTAAG G GG AAAG AAACTTAAGTT GCCTCCAGGCCCAATCCCTGTCCCAGTGTTTGGTAATTGGTTACAAGTT GGCGACGACTTAAACCAGAGGAATTTGGTAGAGTATGCTAAAAAGTTCG GCGACTTATTTCTACTTAGGATGGGTCAAAGAAACTTGGTCGTGG I I I C ATCCCCTGACTTAGCAAAAGACGTACTACATACCCAGGGTGTCGAGTTC GGAAGTAGAACTAGAAATGTTGTGTTTGATATTTTCACAGGCAAAGGTC

AAGATATGGTTTTTACCGTATACAGCGAGCACTGGAGGAAAATGAGAAG

AATAATG ACTGTCCCATTCTTTAC AAAC AAAGTG GTTC AAC AGT ATAG GT

TCGGATGGGAGGACGAAGCCGCTAGAGTAGTCGAGGATGTTAAGGCAA

ATCCTGAAGCCGCTACCAACGGTATTGTGTTGAGGAATAGATTACAACT

TTTGATGTACAACAATATGTATAGAATAATGTTTGACAGGAGATTTGAAT

CTGTTGATGATCCATTATTCCTAAAACTTAAGGCATTGAATGGCGAGAGA

TCAAGGTTAGCTCAATCCTTTGAATACAACTTCGGTGACTTCATTCCTAT

ATTGAGGCCATTCTTGAGAGGATATCTTAAGTTGTGTCAGGAAATCAAG

GACAAAAGGTTAAAGCTATTCAAGGACTACTTCGTCGACGAGAGAAAAA

AGTTGGAGAGTATCAAGAGCGTAGGTAATAACTCCTTAAAGTGCGCCAT

AGATCATATTATCGAGGCACAAGAAAAAGGCGAGATAAACGAGGATAAC

GTGTTATACATCGTCGAGAATATCAACGTGGCTGCCATTGAAACTACAC

TTTGGTCTATTGAATGGGGTATAGCAGAACTAGTGAATAACCCTGAAAT

CCAGAAAAAATTGAGACACGAATTAGACACCGTACTTGGAGCTGGTGTT

CAAATTTGTGAACCAGATGTTCAAAAATTGCCTTATCTACAGGCCGTGAT

AAAAGAGACTTTAAGGTACAGGATGGCAATTCCATTGTTAGTCCCACAT

ATGAATCTTCACGAAGCCAAATTGGCCGGCTATGATATCCCTGCAGAGA

GCAAAAT I I I GGTAAACGCTTGGTGGTTAGCCAATAATCCAGCACATTG

GAACAAACCTGATGAGTTTAGACCAGAAAGATTTTTGGAGGAAGAATCC

AAGGTCGAGGCTAATGGAAACGACTTTAAGTACATCCC I I I CGGTGTTG

GCAGAAGATCTTGCCCAGGTATAATTCTTGCTTTACCAATCCTTGGAATA

GTAATTGGTAGGTTGGTTCAAAACTTCGAGTTACTTCCACCTCCAGGCC

AAAGCAAAATAGATACAGCCGAAAAAGGTGGACAG I I I I CATTGCAAAT

CCTAAAGCATTCCACTATTGTGTGTAAACCTAGAAGTTCTTAA

SEQ ID NO: 20 MMDFVLLEKALLGLFIATIVAITISKLRGKKLKLPPGPIPVPVFGNWLQVGDD

LNQRNLVEYA KFGDLFLLR GQRNLWVSSPDLAKDVLHTQGVEFGSRT

RNWFDIFTGKGQDMVFTVYSEHWRKMRRIMTVPFFTNKVVQQYRFGWE

DEAARVVEDVKANPEAATNGIVLRNRLQLL YNN YRIMFDRRFESVDDPL

FLKLKALNGERSRLAQSFEYNFGDFIPILRPFLRGYLKLCQEIKDKRLKLFKD

YFVDERKKLESIKSVGNNSLKCAIDHIIEAQEKGEINEDNVLYIVENINVAAIET

TLWSIEWGIAELVNNPEIQKKLRHELDTVLGAGVQICEPDVQKLPYLQAVIK

ETLRYRMAIPLLVPHMNLHEAKLAGYDIPAESKILVNAWWLANNPAHWNKP

DEFRPERFLEEESKVEANGNDFKYIPFGVGRRSCPGIILALPILGIVIGRLVQ

NFELLPPPGQSKIDTAEKGGQFSLQILKHSTIVCKPRSS

SEQ ID NO: 21 ATGGCTGCAGTAAGATTGAAAGAAGTTAGAATGGCACAGAGGGCTGAA

GGTTTAGCTACAG I I I I AGCAATCGGTACTGCCGTTCCAGCTAATTGTG

TTTATCAAGCTACCTATCCAGATTATTA I I I I AGGGTTACTAAAAGTGAG

CACTTGGCAGA I I I AAAG G AG AAGTTTCAAAG AATGTGTG ACAAATCAAT

GATTAGAAAGAGACACATGCACTTGACCGAGGAAATATTGATCAAGAAC

CCAAAGATCTGTGCACACATGGAGACCTCATTGGATGCTAGACACGCCA

TCGCATTAGTTGAAGTTCCCAAATTGGGCCAAGGTGCAGCTGAGAAGG

CCATTAAGGAGTGGGGCCAACCCTTGTCTAAGATTACTCATTTGGTATTT

TGCACAACATCCGGCGTTGACATGCCCGGTGCTGATTACCAATTAACAA

AGTTGTTAGG I I I ^TCCCCTACAGTCAAAAGGTTAATGATGTACCAACAA

GGTTGCTTTGGTGGTGCAACTG I I I I GAGATTGGCAAAAGATATCGCTG

AAAATAATAGAGGTGCCAGAGTGTTAGTCGTTTGTTCCGAGATAACTGC

TATGGCCTTCAGAGGTCCATGCAAGAGTCATTTAGATTCCTTGGTAGGT

CATGCCTTGTTCGGTGATGGTGCCGCTGCTGCAATTATAGGCGCTGAC

CCAGACCAATTAGACGAACAACCAG I I I I CCAGTTGGTATCAGCTTCTC AGACTATATTACCAGAATCAGAAGGTGCCATAGATGGCCATTTAACAGA

AGCTGGTTTAACTATACATTTATTAAAAGATGTTCCTGGTTTAA 1 1 1 CAGA

GAACATTGAACAGGCTTTGGAGGATGCCTTTGAACC 1 1 1 AGGTATTCAT

AACTGGAATTCAATTTTCTGGATTGCACATCCTGGTGGCCCTGCCATTTT

AGACAGAGTTGAAGATAGAGTAGGATTGGATAAGAAGAGAATGAGGGC

TTCTAGGGAAGTGTTATCTGAATACGGAAATATGTCTAGTGCCTCTGTGT

TGTTTGTGTTAGATGTCATGAGGAAAAGTTCTGCTAAAGACGGATTGGC

AACCACAGGAGAAGGAAAAGATTGGGGAGTGTTG 1 1 1 GGATTCGGACC

AGGCTTGACTGTAGAAACCTTAGTGTTGCATAGTGTCCCAGTCCCTGTC

CCTACTGCAGCTTCTGCATGA

SEQ ID NO: 22 MAAVRLKEVRMAQRAEG LATVLAI GTAVPANCVYQATYPDYYFRVTKSEH L

ADLKEKFQRMCDKSMIRKRHMHLTEEILIKNPKICAHMETSLDARHAIALVE

VPKLGQGAAEKAIKEWGQPLSKITHLVFCTTSGVDMPGADYQLTKLLGLSP

TVKRLMMYQQGCFGGATVLRLAKDIAENNRGARVLVVCSEITA AFRGPC

KSHLDSLVGHALFGDGAAAAIIGADPDQLDEQPVFQLVSASQTILPESEGAI

DGHLTEAGLTIHLLKDVPGLISENIEQALEDAFEPLGIHNWNSIFWIAHPGGP

AILDRVEDRVGLDKKRMRASREVLSEYGNMSSASVLFVLDVMRKSSAKDG

LATTGEGKDWGVLFGFGPGLTVETLVLHSVPVPVPTAASA

SEQ ID NO: 23 ATGCCGTTTGGAATAGACAACACCGACTTCACTGTCCTGGCGGGGCTA

GTGCTTGCCGTGCTACTGTACGTAAAGAGAAACTCCATCAAGGAACTGC

TGATGTCCGATGACGGAGATATCACAGCTGTCAGCTCGGGCAACAGAG

ACATTGCTCAGGTGGTGACCGAAAACAACAAGAACTACTTGGTGTTGTA

TGCGTCGCAGACTGGGACTGCCGAGGATTACGCCAAAAAG 1 1 1 I CCAA

GGAGCTGGTGGCCAAGTTCAACCTAAACGTGATGTGCGCAGATGTTGA

GAACTACGAC 1 I 1 GAGTCGCTAAACGATGTGCCCGTCATAGTCTCGATT

TTTATCTCTACATATGGTGAAGGAGACTTCCCCGACGGGGCGGTCAACT

TTGAAGACTTTATTTGTAATGCGGAAGCGGGTGCACTATCGAACCTGAG

GTATAATATGTTTGGTCTGGGAAATTCTACTTATGAATTC I 1 1 AATGGTG

CCGCCAAGAAGGCCGAGAAGCATCTCTCCGCTGCGGGCGCTATCAGAC

TAGGCAAGCTCGGTGAAGCTGATGATGGTGCAGGAACTACAGACGAAG

ATTACATGGCCTGGAAGGACTCCATCCTGGAGG 1 1 1 1 GAAAGACGAACT

GCATTTGGACGAACAGGAAGCCAAGTTCACCTCTCAATTCCAGTACACT

GTGTTGAACGAAATCACTGACTCCATGTCGCTTGGTGAACCCTCTGCTC

ACTATTTGCCCTCGCATCAGTTGAACCGCAACGCAGACGGCATCCAATT

GGGTCCCTTCGATTTGTCTCAACCGTATATTGCACCCATCGTGAAATCT

CGCGAACTGTTCTCTTCCAATGACCGTAATTGCATCCACTCTGAATTTGA

CTTGTCCGGCTCTAACATCAAGTACTCCACTGGTGACCATCTTGCTGTT

TGGCCTTCCAACCCATTGGAAAAGGTCGAACAGTTCTTATCCATATTCAA

CCTGGACCCTGAAACCATTTTTGACTTGAAGCCCCTGGATCCCACCGTC

AAAGTGCCCTTCCCAACGCCAACTACTATTGGCGCTGCTATTAAACACT

ATTTGGAAATTACAGGACCTGTCTCCAGACAATTGTTTTCATCTTTGATT

CAGTTCGCCCCCAACGCTGACGTCAAGGAAAAATTGACTCTGCTTTCGA

AAGACAAGGACCAATTCGCCGTCGAGATAACCTCCAAATATTTCAACAT

CGCAGATGCTCTGAAATATTTGTCTGATGGCGCCAAATGGGACACCGTA

CCCATGC TTCTTGGTCGAATCAGTTCCCCAAATGACTCCTCPTTACTA

CTCTATC7 CTTCCTCTTCTCTGTCTGAAAAGCAAACCGTCCATG I CACCT

CCATTGTGGAAAACTTTCCTAACCCAGAATTGCCTGATGCTCCTCCAGT

TGTTGGTGTTACGACTAACTTGTTAAGAAACATTCAATTGGCTCAAAACA

ATGTTAACATTGCCGAAACTAACCTACCTGTTCACTACGATTTAAATGGC

CCACGTAAAC M i l CGCCAATTACAAATTGCCCGTCCACGTTCGTCGTT CTAACTTCAGATTGCCTTCCAACCCTTCCACCCCAGTTATCATGATCGGT

CCAGGTACCGGTGTTGCCCCATTCCGTGGGTTTATCAGAGAGCGTGTC

GCGTTCCTCGAATCACAAAAGAAGGGCGGTAACAACGTTTCGCTAGGTA

AGCATATACTGTTTTATGGATCCCGTAACACTGATGATTTCTTGTACCAG

GACGAATGGCCAGAATACGCCAAAAAATTGGATGGTTCGTTCGAAATGG

TCGTGGCCCATTCCAGGTTGCCAAACACCAAAAAAGTTTATGTTCAAGA

TAAATTAAAGGATTACGAAGACCAAGTATTTGAAATGATTAACAACGGTG

CATTTATCTACGTCTGTGGTGATGCAAAGGGTATGGCCAAGGGTGTGTC

AACCGCATTGGTTGGCATCTTATCCCGTGGTAAATCCATTACCACTGAT

GAAGCAACAGAGCTAATCAAGATGCTCAAGACTTCAGGTAGATACCAAG

AAGATGTCTGGTAA

SEQ ID NO: 24 MPFGIDNTDFTVLAGLVLAVLLYVKRNSIKELLMSDDGDITAVSSGNRDIAQ

VVTENNKNYLVLYASQTGTAEDYAKKFSKELVAKFNLNVMCADVENYDFES

LNDVPVIVSIFISTYGEGDFPDGAVNFEDFICNAEAGALSNLRYNMFGLGNS

TYEFFNGAAKKAEKHLSAAGAIRLGKLGEADDGAGTTDEDYMAWKDSILEV

LKDELHLDEQEAKFTSQFQYTVLNEITDSMSLGEPSAHYLPSHQLNRNADG

IQLGPFDLSQPYIAPIVKSRELFSSNDRNCIHSEFDLSGSNIKYSTGDHLAVW

PSNPLEKVEQFLSIFNLDPETIFDLKPLDPTVKVPFPTPTTIGAAIKHYLEITGP

VSRQLFSSLIQFAPNADVKEKLTLLSKDKDQFAVEITSKYFNIADALKYLSDG

AKWDTVPMQFLVESVPQMTPRYYSISSSSLSEKQTVHVTSIVENFPNPELP

DAPPVVGVTTNLLRNIQLAQNNVNIAETNLPVHYDLNGPRKLFANYKLPVHV

RRSNFRLPSNPSTPVIMIGPGTGVAPFRGFIRERVAFLESQKKGGNNVSLG

KHILFYGSRNTDDFLYQDEWPEYAKKLDGSFEMVVAHSRLPNTKKVYVQD

KLKDYEDQVFEMINNGAFIYVCGDAKGMAKGVSTALVGILSRGKSITTDEAT

ELIKMLKTSGRYQEDVW

SEQ ID NO: 25 ATGACCAAGCCATCTGATCCAACCAGAGATTCTCATGTTGCTGTTTTGG

CTTTTCCATTTGGTACTCATGCTGCTCCATTATTGACTGTTACTAGAAGA

TTGGCTTCTGCTTCTCCATCTACCGTTTTTTCTTTTTTCAACACCGCCCA

ATCCAACTCCTCTTTGTTTTCATCTGGTGATGAAGCTGATAGACCAGCCA

ATATTAGAGTTTACGATATTGCTGATGGTGTCCCAGAAGGTTACGTTTTT

TCAGGTAGACCACAAGAAGCCATCGAATTATTCTTGCAAGCTGCTCCAG

AAAACTTCAGAAGAGAAATTGCTAAGGCTGAAACCGAAGTTGGTACTGA

AGTTAAGTGTTTGATGACCGATGCTTTTTTTTGGTTCGCTGCTGATATGG

CTACTGAAATCAATGCTTCTTGGATTGCTTTTTGGACTGCTGGTGCTAAT

TCTTTGTCTGCTCACTTGTACACCGATTTGATTAGAGAAACCATCGGTGT

CAAAGAAGTCGGTGAAAGAATGGAAGAAACTATTGGTGTTATTTCCGGT

ATGGAAAAGATCAGAGTTAAGGATACTCCAGAAGGTGTTGTTTTCGGTA

ACTTGGATTCTGTTTTCTCCAAGATGTTGCACCAAATGGGTTTGGCTTTG

CCAAGAGCTACTGCTGTTTTTATCAACTCCTTCGAAGATTTGGATCCTAC

CTTGACTAACAACTTGAGATCCAGATTCAAGAGATACTTGAACATTGGTC

CATTGGGTTTGTTGTCCTCTACATTGCAACAATTGGTTCAAGATCCACAT

GGTTGTTTGGCTTGGATGGAAAAAAGATCATCTGGTTCCGTTGCCTACA

TTTCTTTTGGTACTGTTATGACTCCACCACCAGGTGAATTGGCTGCTATT

GCTGAAGGTTTGGAATCTTCTAAGGTTCCATTTGTTTGGTCCTTGAAAGA

AAAGTCCTTGGTCCAATTGCCAAAGGGTTTTTTGGATAGAACTAGAGAA

CAAGGTATCGTTGTTCCATGGGCTCCACAAGTTGAATTATTGAAACATG

AAGCTACCGGTGTTTTCGTTACTCATTGTGGTTGGAATTCTGTCTTGGAA

TCAGTTTCTGGTGGTGTTCCAATGATCTGTAGACCATTTTTTGGTGACCA

AAGATTGAACGGTAGAGCCGTTGAAGTTGTTTGGGAAATTGGTATGACC

ATCATCAATGGTGTTTTCACCAAGGATGGTTTCGAAAAGTGTTTGGATAA

GGTTTTGGTCCAAGACGACGGTAAAAAGATGAAGTGTAATGCCAAGAAG

TTGAAAGAATTGGCTTACGAAGCTGTCTCCTCTAAAGGTAGATCATCCG

AAAATTTCAGAGGTTTGTTGGATGCCGTTGTCAACATTATCTGA

SEQ ID NO: 26 MTKPSDPTRDSHVAVLAFPFGTHAAPLLTVTRRLASASPSTVFSFFNTAQS

NSSLFSSGDEADRPANIRVYDIADGVPEGYVFSGRPQEAIELFLQAAPENF

RREIAKAETEVGTEVKCLMTDAFFWFAADMATEINASWIAFWTAGANSLSA

HLYTDLIRETIGVKEVGERMEETIGVISGMEKIRVKDTPEGVVFGNLDSVFSK

MLHQMGLALPRATAVFINSFEDLDPTLTNNLRSRFKRYLNIGPLGLLSSTLQ

QLVQDPHGCLAWMEKRSSGSVAYISFGTVMTPPPGELAAIAEGLESSKVPF

VWSLKEKSLVQLPKGFLDRTREQGIVVPWAPQVELLKHEATGVFVTHCGW

NSVLESVSGGVPMICRPFFGDQRLNGRAVEVVWEIGMTIINGVFTKDGFEK

CLDKVLVQDDGKKMKCNAKKLKELAYEAVSSKGRSSENFRGLLDAVVNII

SEQ ID NO: 27 ATGGAGATTTTAAGTTTAATTTTGTATACAGTTATCTTCAGTTTCTTATTG

CAATTTATTTTGAGATCTTTCTTTAGGAAAAGATATCCATTACCATTACCT

CCAGGTCCAAAACCATGGCCAATAATAGGCAACTTAGTACACTTGGGAC

CCAAACCACACCAGTCTACCGCCGCTATGGCCCAAACATATGGTCCATT

GATGTACTTAAAGATGGGCTTCGTAGACGTCGTTGTCGCTGCATCTGCA

AGTGTTGCTGCACAATTCTTGAAGACTCACGATGCTAACTTCTCTTCTAG

ACCTCCAAATAGTGGCGCTGAGCATATGGCCTATAATTACCAAGACTTG

GTTTTCGCCCCATACGGCCCTAGGTGGAGAATGTTAAGGAAAATATGTT

CTGTGCACTTGTTCTCTACAAAAGCATTGGATGATTTCAGACATGTCAGA

CAAGACGAAGTAAAGACTTTAACCAGAGCATTAGCTTCAGCAGGTCAGA

AGCCCGTGAAGTTAGGCCAATTATTAAACGTCTGTACTACTAATGCTTTA

GCCAGAGTAATGTTAGGTAAAAGAGTCTTCGCTGACGGTTCAGGCGAT

GTTGACCCACAAGCCGCAGAATTCAAATCTATGGTAGTTGAGATGATGG

TCGTCGCCGGTGTATTTAACATAGGAGATTTCATTCCTCAATTAAATTGG

TTGGACATTCAAGGTGTGGCCGCTAAAATGAAGAAGTTACATGCTAGAT

TCGATGCTTTCTTGACAGACATATTGGAAGAACATAAAGGTAAAATCTTT

GGTGAAATGAAGGATTTATTAAGTACCTTAATCTCCTTGAAGAATGATGA

TGCCGACAATGATGGTGGAAAATTGACAGATACAGAGATTAAAGCATTA

TTATTAAACTTGTTTGTTGCAGGAACTGATACTTCATCCTCAACTGTTGA

ATGGGCAATTGCCGAATTGATCAGAAATCCAAAGATTTTGGCTCAGGCT

CAACAAGAGATCGACAAAGTGGTAGGTAGAGACAGGTTGGTGGGCGAA

TTAGATTTAGCACAATTAACCTACTTGGAAGCAATTGTTAAGGAAACCTT

TAGATTGCATCCCTCCACTCCATTATCATTGCCAAGAATAGCATCAGAAT

CATGTGAAATCAACGGTTACTTTATCCCAAAAGGATCCACTTTATTATTG

AATGTTTGGGCTATAGCCAGGGATCCTAATGCTTGGGCCGATCCTTTAG

AATTTAGACCTGAAAGATTCTTGCCTGGTGGTGAAAAGCCTAAGGTGGA

TGTAAGGGGAAATGATTTTGAGGTGATTCCCTTTGGAGCAGGTAGGAG

GATTTGCGCTGGAATGAATTTGGGTATTAGGATGGTTCAGTTAATGATC

GCAACATTGATACATGCATTTAACTGGGATTTGGTTTCCGGTCAGTTGC

CTGAAATGTTGAACATGGAAGAGGCTTATGGTTTGACATTGCAGAGAGC

TGATCCTTTGGTTGTTCATCCCAGACCCAGATTGGAAGCTCAGGCTTAT

ATCGGTTGA

SEQ ID NO: 28 MEILSLILYTVIFSFLLQFILRSFFRKRYPLPLPPGPKPWPIIGNLVHLGPKPH

QSTAAMAQTYGPLMYLKMGFVDVVVAASASVAAQFLKTHDANFSSRPPNS

GAEHMAYNYQDLVFAPYGPRWRMLRKICSVHLFSTKALDDFRHVRQDEVK

TLTRALASAGQKPVKLGQLLNVCTTNALARVMLGKRVFADGSGDVDPQAA

EFKSMVVEMMVVAGVFNIGDFIPQLNWLDIQGVAAKMKKLHARFDAFLTDIL

EEHKGKIFGEMKDLLSTLISLKNDDADNDGGKLTDTEIKALLLNLFVAGTDTS

SSTVEWAIAELIRNPKILAQAQQEIDKVVGRDRLVGELDLAQLTYLEAIVKET

FRLHPSTPLSLPRIASESCEINGYFIPKGSTLLLNVWAIARDPNAWADPLEFR

PERFLPGGEKPKVDVRGNDFEVIPFGAGRRICAGMNLGIRMVQLMIATLIHA

FNWDLVSGQLPEMLNMEEAYGLTLQRADPLVVHPRPRLEAQAYIG

SEQ ID NO: 29 ATGACTGTTAGTCCATCTATCGCTAGTGCAGCCAAATCTGGCAGAGTAT

TAATTATCGGTGCCACCGGCTTTATAGGTAAATTTGTTGCTGAAGCATCT

TTGGATAGTGGCTTGCCAACATATGTCTTAGTAAGACCAGGTCCTTCAA

GACCAAGTAAAAGTGATACAATTAAATCTTTAAAAGACAGGGGCGCAAT

AATTTTACACGGTGTCATGTCTGATAAACCATTGATGGAAAAATTGTTAA

AGGAGCATGAAATCGAGATTGTTATTTCAGCTGTGGGTGGTGCTACTAT

TTTAGATCAAATCACCTTGGTAGAAGCTATCACCTCAGTAGGAACAGTC

AAGAGATTTTTGCCCTCCGAATTTGGCCATGACGTAGATAGAGCCGACC

CTGTTGAACCCGGTTTGACCATGTATTTGGAAAAGAGAAAGGTCAGAAG

GGCCATAGAAAAGTCTGGTGTACCATACACTTACATATGCTGTAACTCA

ATCGCCTCATGGCCATACTATGATAATAAGCACCCTTCTGAAGTGGTGC

CACCTTTGGATCAATTCCAGATCTATGGCGATGGAACCGTTAAGGCATA

CTTTGTGGATGGACCTGATATTGGTAAATTTACTATGAAGACTGTCGATG

ATATCAGGACTATGAACAAAAACGTTCATTTCAGACCATCCTCCAATTTA

TATGATATTAATGGATTGGCCTCATTGTGGGAAAAGAAGATTGGAAGAA

CTTTGCCAAAGGTGACTATAACCGAGAATGACTTGTTAACAATGGCAGC

TGAAAACAGAATTCCTGAATCTATAGTTGCATCCTTCACACATGATATTT

TCATAAAAGGTTGCCAAACTAATTTTCCCATAGAAGGTCCTAATGACGTT

GACATTGGAACATTATATCCTGAGGAATCCTTTAGGACTTTAGACGAATG

TTTCAATGATTTCTTAGTTAAAGTTGGTGGTAAATTAGAGACAGACAAAT

TAGCAGCTAAAAACAAAGCAGCAGTTGGTGTCGAGCCCATGGCTATTAC

AGCTACATGTGCTTAA

SEQ ID NO: 30 MTVSPSIASAAKSGRVLIIGATGFIGKFVAEASLDSGLPTYVLVRPGPSRPSK

SDTIKSLKDRGAIILHGVMSDKPLMEKLLKEHEIEIVISAVGGATILDQITLVEAI

TSVGTVKRFLPSEFGHDVDRADPVEPGLTMYLEKRKVRRAIEKSGVPYTYI

CCNSIASWPYYDNKHPSEVVPPLDQFQIYGDGTVKAYFVDGPDIGKFTMKT

VDDIRTMNKNVHFRPSSNLYDINGLASLWEKKIGRTLPKVTITENDLLTMAA

ENRIPESIVASFTHDIFIKGCQTNFPIEGPNDVDIGTLYPEESFRTLDECFNDF

LVKVGGKLETDKLAAKNKAAVGVEPMAITATCA

SEQ ID NO: 31 ATGACTTCTGCACTTTATGCCTCCGATCTTTTCAAACAATTGAAAAGTAT

CATGGGAACGGATTCTTTGTCCGATGATGTTGTATTAGTTATTGCTACAA

CTTCTCTGGCACTGGTTGCTGGTTTCGTTGTCTTATTGTGGAAAAAGAC

CACGGCAGATCGTTCCGGCGAGCTAAAGCCACTAATGATCCCTAAGTCT

CTGATGGCGAAAGATGAGGATGATGACTTAGATCTAGGTTCTGGAAAAA

CGAGAGTCTCTATCTTCTTCGGCACACAAACCGGAACAGCCGAAGGATT

CGCTAAAGCACTTTCAGAAGAGATCAAAGCAAGATACGAAAAGGCGGCT

GTAAAAGTAATCGATTTGGATGATTACGCTGCCGATGATGACCAATATG

AGGAAAAGTTGAAAAAGGAAACATTGGCTTTCTTTTGTGTAGCCACGTAT

GGTGATGGTGAACCAACCGATAACGCCGCAAGATTCTACAAGTGGTTTA

CTGAAGAGAACGAAAGAGATATCAAGTTGCAGCAACTTGCTTACGGCGT

TTTTGCCTTAGGTAACAGACAATACG ^CACTTTAACAAGATAGGTATTG

TCTTAGATGAAGAGTTATGCAAAAAGGGTGCGAAGAGATTGATTGAAGT

CGGTTTAGGAGATGATGATCAATCTATCGAGGATGACTTTAATGCATGG

AAGGAATCTTTGTGGTCTGAATTAGATAAGTTACTTAAGGACGAAGATGA

TAAATCCGTTGCCACTCCATACACAGCCGTCATTCCAGAATATAGAGTA

GTTACTCATGATCCAAGATTCACAACACAGAAATCAATGGAAAGTAATGT

GGCTAATGGTAATACTACCATCGATATTCATCATCCATGTAGAGTAGAC

GTTGCAGTTCAAAAGGAATTGCACACTCATGAATCAGACAGATCTTGCA

TACATCTTGAATTTGATATATCACGTACTGGTATCACTTACGAAACAGGT

GATCACGTGGGTGTCTACGCTGAAAACCATGTTGAAATTGTAGAGGAAG

CTGGAAAGTTGTTGGGCCATAGTTTAGATCTTGTTTTCTCAATTCATGCC

GATAAAGAGGATGGCTCACCACTAGAAAGTGCAGTGCCTCCACCATTTC

CAGGACCATGCACCCTAGGTACCGGTTTAGCTCGTTACGCGGATCTGTT

AAATCCTCCACGTAAATCAGCTCTAGTGGCCTTGGCTGCGTACGCCACA

GAACCTTCTGAGGCAGAAAAACTGAAACATCTAACTTCACCAGATGGTA

AGGATGAATACTCACAATGGATAGTAGCTAGTCAACGTTCTTTACTAGAA

GTTATGGCTGCTTTCCCATCCGCTAAACCTCCTTTGGGTGTTTTCTTCGC

CGCAATAGCGCCTAGACTGCAACCAAGATACTATTCAATTTCATCCTCA

CCTAGACTGGCACCATCAAGAGTTCATGTCACATCCGCTTTAGTGTACG

GTCCAACTCCTACTGGTAGAATCCATAAGGGCGTTTGTTCAACATGGAT

GAAAAACGCGGTTCCAGCAGAGAAGTCTCACGAATGTTCTGGTGCTCC

AATCTTTATCAGAGCCTCCAACTTCAAACTGCCTTCCAATCCTTCTACTC

CTATTGTCATGGTCGGTCCTGGTACAGGTCTTGCTCCATTCAGAGGTTT

CTTACAAGAGAGAATGGCCTTAAAGGAGGATGGTGAAGAGTTGGGATC

TTCTTTGTTGTTTTTCGGCTGTAGAAACAGACAAATGGATTTCATCTACG

AAGATGAACTGAATAACTTTGTAGATCAAGGAGTTATTTCAGAGTTGATA

ATGGCTTTTTCTAGAGAAGGTGCTCAGAAGGAGTACGTCCAACACAAAA

TGATGGAAAAGGCCGCACAAGTTTGGGACTTAATCAAAGAGGAAGGCT

ATCTATATGTCTGTGGTGATGCAAAGGGTATGGCAAGAGATGTTCACAG

AACACTTCATACTATAGTCCAGGAACAGGAAGGCGTTAGTTCTTCTGAA

GCGGAAGCAATTGTGAAAAAGTTACAAACAGAGGGAAGATACTTGAGAG

ATGTGTGGTAA

SEQ ID NO: 32 MTSALYASDLFKQLKSIMGTDSLSDDVVLVIATTSLALVAGFVVLLWKKTTA

DRSGELKPLMIPKSLMAKDEDDDLDLGSGKTRVSIFFGTQTGTAEGFAKAL

SEEIKARYEKAAVKVIDLDDYAADDDQYEEKLKKETLAFFCVATYGDGEPTD

NAARFYKWFTEENERDIKLQQLAYGVFALGNRQYEHFNKIGIVLDEELCKK

GAKRLIEVGLGDDDQSIEDDFNAWKESLWSELDKLLKDEDDKSVATPYTAV

IPEYRVVTHDPRFTTQKSMESNVANGNTTIDIHHPCRVDVAVQKELHTHES

DRSCIHLEFDISRTGITYETGDHVGVYAENHVEIVEEAGKLLGHSLDLVFSIH

ADKEDGSPLESAVPPPFPGPCTLGTGLARYADLLNPPRKSALVALAAYATE

PSEAEKLKHLTSPDGKDEYSQWIVASQRSLLEVMAAFPSAKPPLGVFFAAI

APRLQPRYYSISSSPRLAPSRVHVTSALVYGPTPTGRIHKGVCSTWMKNAV

PAEKSHECSGAPIFIRASNFKLPSNPSTPIVMVGPGTGLAPFRGFLQERMAL

KEDGEELGSSLLFFGCRNRQMDFIYEDELNNFVDQGVISELIMAFSREGAQ

KEYVQHKMMEKAAQVWDLIKEEGYLYVCGDAKGMARDVHRTLHTIVQEQE

GVSSSEAEAIVKKLQTEGRYLRDVW

SEQ ID NO: 33 ATGGCAATTCTAGTCACCGACTTCGTTGTCGCGGCTATAATTTTCTTGAT

CACTCGGTTCTTAGTTCGTTCTCTTTTCAAGAAACCAACCCGACCGCTC

CCCCCGGGTCCTCTCGGTTGGCCCTTGGTGGGCGCCCTCCCTCTCCTA

GGCGCCATGCCTCACGTCGCACTAGCCAAACTCGCTAAGAAGTATGGT

CCGATCATGCACCTAAAAATGGGCACGTGCGACATGGTGGTCGCGTCC

ACCCCCGAGTCGGCTCGAGCCTTCCTCAAAACGCTAGACCTCAACTTCT

CCAACCGCCCACCCAACGCGGGCGCATCCCACCTAGCGTACGGCGCG

CAGGACTTAGTCTTCGCCAAGTACGGTCCGAGGTGGAAGACTTTAAGAA

AATTGAGCAACCTCCACATGCTAGGCGGGAAGGCGTTGGATGATTGGG

CAAATGTGAGGGTCACCGAGCTAGGCCACATGCTTAAAGCCATGTGCG

AGGCGAGCCGGTGCGGGGAGCCCGTGGTGCTGGCCGAGATGCTCACG

TACGCCATGGCGAACATGATCGGTCAAGTGATACTCAGCCGGCGCGTG

TTCGTGACCAAAGGGACCGAGTCTAACGAGTTCAAAGACATGGTGGTC

GAGTTGATGACGTCCGCCGGGTACTTCAACATCGGTGACTTCATACCCT

CGATCGCTTGGATGGATTTGCAAGGGATCGAGCGAGGGATGAAGAAGC

TGCACACGAAGTTTGATGTGTTATTGACGAAGATGGTGAAGGAGCATAG

AGCGACGAGTCATGAGCGCAAAGGGAAGGCAGATTTCCTCGACGTTCT

CTTGGAAGAATGCGACAATACAAATGGGGAGAAGCTTAGTATTACCAAT

ATCAAAGCTGTCCTTTTGAATCTATTCACGGCGGGCACGGACACATCTT

CGAGCATAATCGAATGGGCGTTAACGGAGATGATCAAGAATCCGACGA

TCTTAAAAAAGGCGCAAGAGGAGATGGATCGAGTCATCGGTCGTGATC

GGAGGCTGCTCGAATCGGACATATCGAGCCTCCCGTACCTACAAGCCA

TTGCTAAAGAAACGTATCGCAAACACCCGTCGACGCCTCTCAACTTGCC

GAGGATTGCGATCCAAGCATGTGAAGTTGATGGCTACTACATCCCTAAG

GACGCGAGGCTTAGCGTGAACATTTGGGCGATCGGTCGGGACCCGAAT

GTTTGGGAGAATCCGTTGGAGTTCTTGCCGGAAAGATTCTTGTCTGAAG

AGAATGGGAAGATCAATCCCGGTGGGAATGATTTTGAGCTGATTCCGTT

TGGAGCCGGGAGGAGAATTTGTGCGGGGACAAGGATGGGAATGGTCC

TTGTAAGTTATATTTTGGGCACTTTGGTCCATTCTTTTGATTGGAAATTAC

CAAATGGTGTCGCTGAGCTTAATATGGATGAAAGTTTTGGGCTTGCATT

GCAAAAGGCCGTGCCGCTCTCGGCCTTGGTCAGCCCACGGTTGGCCTC

AAACGCGTACGCAACCTGA

SEQ ID NO: 34 MAILVTDFVVAAIIFLITRFLVRSLFKKPTRPLPPGPLGWPLVGALPLLGAMP

HVALAKLAKKYGPIMHLKMGTCDMVVASTPESARAFLKTLDLNFSNRPPNA

GASHLAYGAQDLVFAKYGPRWKTLRKLSNLHMLGGKALDDWANVRVTEL

GHMLKAMCEASRCGEPVVLAEMLTYAMANMIGQVILSRRVFVTKGTESNE

FKDMVVELMTSAGYFNIGDFIPSIAWMDLQGIERGMKKLHTKFDVLLTKMV

KEHRATSHERKGKADFLDVLLEECDNTNGEKLSITNIKAVLLNLFTAGTDTS

SSIIEWALTEMIKNPTILKKAQEEMDRVIGRDRRLLESDISSLPYLQAIAKETY

RKHPSTPLNLPRIAIQACEVDGYYIPKDARLSVNIWAIGRDPNVWENPLEFL

PERFLSEENGKINPGGNDFELIPFGAGRRICAGTRMGMVLVSYILGTLVHSF

DWKLPNGVAELNMDESFGLALQKAVPLSALVSPRLASNAYAT

SEQ ID NO: 35 CTAAATTGTAAGCGTTAATATTTTGTTAAAATTCGCGTTAAATTTTT

GTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAAAT

CCCTTATAAATCAAAAGAATAGACCGAGATAGGGTTGAGTGGCCG

CTACAGGGCGCTCCCATTCGCCATTCAGGCTGCGCAACTGTTGGGA

AGGGCGTTTCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAA

AGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTT

TTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAGCGCGACGT

AATACGACTCACTATAGGGCGAATTGAAGGAAGGCCGTCAAGGCC

GCATGTCGACGGCGCGCCAGTTACTTGCTCTATGCGTTTGCGCATC

CTCTTTTTACTTTTTTTTTTTCAGTAAAGCCTAAGCATAAATCGTTT

TATACGTACGACACGTTCAACTTTTCTTGGTTAGTAGTGGCAATCT

CTGCAATACATACA GGAGTCATGGTCTATCATCTTGTCCAATCAA

AGAAGCATCGGTTCAGATCGAGCAAACTGTAGGGAGAAAGGAAA

GTAG AA ATGC AG AGTGTGCT AT ATGTCC A ATCTCGG Π " 1 1 GTAGTT

TGGATGTCATTAGAGATCTACCACCCAACCGGCTGCTTTCATGTGG

AACAGAAAAGAAATCGGGGCGCTTCCTCTTCTGTATTCC 1 1 1 AATT

AACGTTTTTATTCAGCCATCTAACCATCATACCCCCATACGGTAAC

AAAACCTCTTCTAAGAAAAGAAGTCTCTGCTCCTCCGCCATCT AT

TTTTATTCGCTGCGCGCGTTTATTGTCGCATCGCTAGCCAGCAAAA

AGTTGGTTGCCTTTTTTTACCTAAAAAAGACACATCTAACTGATTA

GTTTTCCGTTTTAGGATATTGACGCCAAGCGTGCGTCTGATTCCCG

GGTCATCGTCCACCTCCGGAGAACAGGCCACCATCACGCATCTGT

GTCTGAATTTCATCACGAGGCGCGCCTTTTCCCGTCTTTCAGTGCCT

TGTTCAGTTCTTCCTGACGGGCGGTATATTTCTCCAGCTTACTAGTT

TACGTGGATTGAGCCAGCAATACAGATCATTATTAAACTGTTTTGT

ACATGATGTTAGTATATAATCGTAAAGCTTTTCTAATATGTATACC

TTATACATGGAACTCCACAGAACTTGCAAACATACCAAAAATCCTT

TATTCTTGTTCACTCATTTTACATCAAAAAATAATATTTCAGTTATT

AAGGAAAATAAAAAAATAGATTAGAGAAGCATTTTGAAGAAATA

GTATATTCTTTTATTGAACCTAAGAGCGTGATATTTTTACTCGAAA

TAAAATACGAAAAATCTATACACTCATCTTTCCGACTACTATTGGC

TCCTGCTCAAAAAAAGAGGGAAAAAAAGCTCCAAAATTCTATCTT

TTCCTATCGCTCCTGTCCTATCCTTATTACGTTCATTACTATTTTAA

TACTATCCATTCTTTTATTTTCAGTCTAAAAAAAACATTTCTCATAA

CGGGAAAAGCAAAAAAATGTCAAGCTTATACATCAAAACACCACT

GCATGCATTATCTGCTGGTCCGGATTCTCAGGCGCGCCCCTGCAGG

CTGGGCCTCATGGGCCTTCCTTTCACTGCCCGCTTTCCAGTCGGGA

AACCTGTCGTGCCAGCTGCATTAACATGGTCATAGCTGTTTCCTTG

CGTATTGGGCGCTCTCCGCTTCCTCGCTCACTGACTCGCTGCGCTC

GGTCGTTCGGGTAAAGCCTGGGGTGCCTAATGAGCAAAAGGCCAG

CAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTC

CATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCA

AGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGC

GTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGC

CGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGC

GCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTC

GTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCG

ACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGT

AAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGAT

TAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTG

GTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGC

GCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTT

GATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTG

CAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATC

CTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTC

ACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACC

TAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTA

TATATGAGTAAACTTGGTCTGACAGTTATTAGAAAAATTCATCCAG

CAGACGATAAAACGCAATACGCTGGCTATCCGGTGCCGCAAI CC

ATACAGCACCAGAAAACGATCCGCCCATTCGCCGCCCAGTTCTTCC

GCAATATCACGGGTGGCCAGCGCAATATCCTGATAACGATCCGCC

ACGCCCAGACGGCCGCAATCAATAAAGCCGCTAAAACGGCCATTT

TCCACCATAATGTTCGGCAGGCACGCATCACCATGGGTCACCACC

AGATCTTCGCCATCCGGCATGCTCGCTTTCAGACGCGCAAACAGCT

CTGCCGGTGCCAGGCCCTGATGTTCTTCATCCAGATCATCCTGATC

CACCAGGCCCGCTTCCATACGGGTACGCGCACGTTCAATACGATGT

TTCGCCTGATGATCAAACGGACAGGTCGCCGGGTCCAGGGTATGC

AGACGACGCATGGCATCCGCCATAATGCTCACTTTTTCTGCCGGCG

CCAGATGGCTAGACAGCAGATCCTGACCCGGCACTTCGCCCAGCA

GCAGCCAATCACGGCCCGCTTCGGTCACCACATCCAGCACCGCCG

CACACGGAACACCGGTGGTGGCCAGCCAGCTCAGACGCGCCGCTT

C ATCCTGC AGCTCGTTCAGCGC ACCGCTC AGATCGGT Π " 1 'C AC AAA

CAGCACCGGACGACCCTGCGCGCTCAGACGAAACACCGCCGCATC

AGAGCAGCCAATGGTCTGCTGCGCCCAATCATAGCCAAACAGACG

TTCCACCCACGCTGCCGGGCTACCCGCATGCAGGCCATCCTGTTCA

ATCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTA

TTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAA

CAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCAC

SEQ ID NO: 36 CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGAT

GTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTC

ACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACTCACT

ATAGGGCGACCCTTAGGATCCTATGGCGCGCCTCATCGTCCACCTC

CGGAGAACAGGCCACCATCACGCATCTGTGTCTGAATTTCATCACG

ACGCGCCGCTGCAGGTCGACAACCCTTAATATAACTTCGTATAATG

TATGCTATACGAAGTTATTAGGTCTAGAGATCCCAATACAACAGAT

CACGTGATCTTTTGTAAGATGAAGTTGAAGTGAGTGTTGCACCGTG

CCAATGCAGGTGGCTATTAGATTAAATATGTGATTTGTTCTATTAA

GTTTCCTGTATAATTAATGGGGAGCGCTGATTCTCTTTTGGTACGC

TTCCCATCCAGCATTTCTGTATC r i CACCTTCAACCTTAGGATCTC

TACCGTTGGCGAAAAGTCCTCTGCCAACAATGATGATATCTGATCC

ACCACTTACAACTTCGTCGACGGTTCTGTACTGCTGACCCAATGCA

TCGCCITTGTCGTCTAAACCTACACCTGGGGTCATGATTAGCCAAT

CAAACCCTTCTTCTCTTCCTCCCATATCGTTCTGAGCAATGAACCC

AATAACGAAATC Π ATCACTCTTTGCAATATCAACGGTACCCTTA

GTATATTCACCGTGTGCTAGAGAACCCTTGGAAGACAATTCAGCA

AGCATCAATAATCCCCTTGGTTC Γ 1 1 GGTGACCTCTTGCGCACCTT

GTTTCAAGCCAGCAACAATACCAGCACCAGTAACCCCGTGGGCGT

TGGTGATATCAGACCATTCTGCGATACGGTAAACGCCCGATGTATA

TTGTAATTTGACTGTGTTACCGATATCGGCGAA l ΤΠ CTGTCCTCAA

ATATCAAGAACTTGTATTTCTCTGCCAATGC 1 1 1 C AATGG AACG AC

AGTACCCTCATAACTGAAATCATCCAAGATATCAACGTGTG 1 M I C

AAAAGGCAAATGTATGGACCCAACGTTTCAACAAGTTTCAATAGC

TCATCAGTCGAACGAACGTCAAGAGAAGCACACAAATTGGTCTTC

1 1 Γ I C ATCC ATTAAACGTAAAAGTTTCGATGCAACCGGACTTGC AT

GAG i CTCAGCTCTACTGGTATATGATTTTGTGGACATG aTGCAACT

AATTG ACGGG AGTGT ATTG ACGCTGGCGT ACTGGC 1 1 1 C AC AAAAT

GGCCC AATCAC AACC AC ATCTTAG AT AGTTG AAATGAC 1 1 1 AG ATA

ACATCAATTGAGATGAGCTTAATCATGTCAAAGCTAAAAGTGTCA

CCATGAACGACAATTCTTAAGCAAATCACGTGATATAGATCCACG

AATAACCACCATTTGATGCTCGAGGCAAGTAATGTGTGTAAAAAA

ATGCGTTACCACCATCCAATGCAGACCGATCTTCTACCCAGAATCA

CATATATTTATGTACCGAGTACCTTTTTTCTATCTTCCAATTGCTTC

TCCCATATGATTGTCTCCGTAAGCTCGAAATTTCTAAGTTGGATTTT

AATCTTCACGCAGGATGACAGTTCGATGAGCTTCTGAGGAGTGTTT

AGAACATAATCAGTTTATCCATGGTCTATCTCTTCTTGTCGCTTTTT

CTCCTCGATAGAACCTAAATAAAACGAGCTCTCGAGAACCCTTAA

TATAACTTCGTATAATGTATGCTATACGAAGTTATTAGGTGATATC

AGATCCGGCGCGTGGCACCCTTGCGGGCCATGTCATACACCGCCTT

CAGAGCAGCCGGACCTATCTGCCCGTTGGCGCGCCTATTGAAAGA

TCTTAAGGGGATATCCTCGAGGTTCCCTTTAGTGAGGGTTAATTGC

GAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGT

TATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAG

TGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTG

CGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCA

GCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCG

TATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCG

GTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTA

ATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATG

TGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGC

GTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCAC

AAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTA

TAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTC

CTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCT

TCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCA

GTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACC

CCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTT

GAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCC

ACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACA

GAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACA

GTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAA

GAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCG

GTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAG

GATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCA

GTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATC

AAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTT

AAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACC

AATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCG

TTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATA

CGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGA

GACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCA

GCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCC

TCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTT

CGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCAT

CGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGT

TCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAA

AAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGT

TGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTC

TCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAG

TACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTT

GCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCA

GAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAA

AACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACC

CACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGC

GTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAA

GGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCT

TTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGC

GGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTT

CCGCGCACATTTCCCCGAAAAGTGCCACCTGACGCGCCCTGTAGC

GGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACC

GCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCC

TTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATC

GGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGA

CCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATC

GCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTC

TTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTA

TCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCC

TATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATT

TTAACAAAATATTAACGCTTACAATTTGCCATTCGCCATTCAGGCT

GCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 37 CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGAT

GTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTC

ACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACTCACT

ATAGGGCGACCCTAGGATCCTATGGCGCGCCGCCACCAACAGCCC

CGCCAATGGCGCTGCCGATACTCCCGACAATCCCCACCATTGCCTG

ACGCGTCCAGTATCCCAGCAGATACGGGATATCGACATTTCTGCAC

CATTCCGGCGGGTATAGG 1 1 1 " ! ATTG ATGGCCTC ATCC AC ACGC AG

CAGCGTCTGTTCATCGTCGTGGCGGCCCATAATAATCTGCCGGTCA

ATCAGCCAG 1 1 1 CCTCACCCGGCCCCCATCCCCATACGCGCATTT

CGTAGCGGTCCAGCTGGGAGTCGATACCGGCGGTCAGGTAAGCCA

C ACGGTCAGGAACGGGCGCTGAATAATGCT 1 1 1 CCGCTCTGCC AT

CACTTCAGCATCCGGACGTTCGCCAA 1 1 1 1 CGCCTCCC ACGTCTC A

CCGAGCGTGGTGTITACGAAGGTTTTACGTTTTCCCGTATCCCCTTT

CGTTTTCATCCAGTCTTTGACAATCTGCACCCAGGTGGTGAACGGG

CTGTACGCTGTCCAGATGTGAAAGGTCACACTGTCAGGTGGCTCA

ATCTCTTCACCGGATGACGAAAACCAGAGAATGCCATCACGGGTC

CAGATCCCGGTCH I rCGCAGATATAACGGGCATCAGTAAAGTCCA

GCTCCTGCTGGCGGATGACGCAGGCAi i ATGCTCGCAGAGATAAA

ACACGCTGGAGACGCGTTTTCCCGTCTTTCAGTGCCTTGTTCAGTT

CTTCCTGACGGGCGGTATATTTCTCCAGCTTGGCGCGCCTAAGACT

TAGATCTTAAGGGGATATCCTCGAGGTTCCCTTTAGTGAGGGTTAA

TTGCGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAA

TTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATA

AAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTA

ATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGT

GCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTT

TGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCG

CTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGC

GGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAA

CATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGG

CCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCA

TCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGG

ACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGC

TCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCT

CCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTAT

CTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACG

AACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCG

TCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGC

AGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGC

TACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAG

AACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGA

AAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGT

AGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAA

AAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACG

CTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGAT

TATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAG

TTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGT

TACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTAT

TTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTAC

GATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACC

GCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCA

GCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATC

CGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGT

AGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAG

GCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTC

CGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGC

AAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTA

AGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAA

TTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTG

AGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGA

GTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAG

CAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCG

AAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAA

CCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCA

GCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAA

AAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTC

CTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAG

CGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGT

TCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGCGCCCTGTAGC

GGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACC

GCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCC

TTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATC

GGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGA

CCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATC

GCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTC

TTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTA

TCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCC

TATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATT

TTAACAAAATATTAACGCTTACAATTTGCCATTCGCCATTCAGGCT

GCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 38 CGGCCGCCTGCACGGTCCTGTTCCCTAGCATGTACGTGAGCGTATT

TCCTTTTAAACCACGACGCTTTGTCTTCATTCAACGTTTCCCATTGT

ί H TTCTACTATTGCTTTGCTGTGGGAAAAACTTATCGAAAGATG

ACGACTTTTTCTTAATTCTCGTTTTAAGAGCTTGGTGAGCGCTAGG

AGTCACTGCCAGGTATCGTTTGAACACGGCATTAGTCAGGGAAGT

CATAACACAGTCCTTTCCCGCAATTTTCTTTTTCTATTACTCTTGGC

CTCCTCTAGTACACTCTATATTTTTTTATGCCTCGGTAATGATTTTC

ATTTTTTTTTTTCCACCTAGCGGATGACTCTTTTTTTTTCTTAGCGAT

TGGCATTATCACATAATGAATTATACATTATATAAAGTAATGTGAT

TTCTTCGAAGAATATACTAAAAAATGAGCAGGCAAGATAAACGAA

GGCAAAGATGACAGAGCAGAAAGCCCTAGTAAAGCGTATTACAA

ATGAAACCAAGATTCAGATTGCGATCTCTTTAAAGGGTGGTCCCCT

AGCGATAGAGCACTCGATCTTCCCAGAAAAAGAGGCAGAAGCAGT

AGCAGAACAGGCCACACAATCGCAAGTGATTAACGTCCACACAGG

TATAGGGTTTCTGGACC TATGATACATGCTCTGGCCAAGCATTCC

GGCTGGTCGCTAATCGTTGAGTGCATTGGTGACTTACACATAGACG

ACCATCACACCACTGAAGACTGCGGGATTGCTCTCGGTCAAGCTTT

TAAAGAGGCCCTAGGGGCCGTGCGTGGAGTAAAAAGGTTTGGATC

AGGATTTGCGCCTTTGGATGAGGCACTTTCCAGAGCGGTGGTAGAT

C I I 1 CGAACAGGCCGTACGC AGTTGTCG AACTTGGTTTGC AAAGGG

AGAAAGTAGGAGATCTCTCTTGCGAGATGATCCCGCATTTTCTTGA

AAGC 1 Γ 1 GCAGAGGCTAGCAGA ATTACCCTCCACGTTG ATTGTCTG

CGAGGCAAGAATGATCATCACCGTAGTGAGAGTGCGTTCAAGGCT

CTTGCGGTTGCCATAAGAGAAGCCACCTCGCCCAATGGTACCAAC

GATGTTCCCTCCACCAAAGGTGTTCTTATGTAGTGACACCGATTAT

TTAAAGCTGCAGCATACGATATATATACATGTGTATATATGTATAC

CTATGAATGTCAGTAAGTATGTATACGAACAGTATGATACTGAAG

ATGACAAGGTAATGCATCATTCTATACGTGTCATTCTGAACGAGGC

GCGCTTTCCTTTTTTCTTTTTGCTTTTTCTTTTTTTTTCTCTTGAACTC

GATCGAGAAAAAAAA i ATAAAAGAGATGGAGGAACGGGAAAAAG

TTAGTTGTGGTGATAGGTGGCAAGTGGTATTCCGTAAGAACAACA

AGAAAAGCATTTCATATTATGGCTGAACTGAGCGAACAAGTGCAA

AATTTAAGCATCAACGACAACAACGAGAATGGTTATGTTCCTCCTC

ACTTAAGAGGAAAACCAAGAAGTGCCAGAAATAACAGTAGCAAC

TACAATAACAACAACGGCGGCTACAACGGTGGCCGTGGCGGTGGC

AGCTTCTTTAGCAACAACCGTCGTGGTGGTTACGGCAACGGTGGTT

TCTTCGGTGGAAACAACGGTGGCAGCAGATCTAACGGCCGTTCTG

GTGGTAGATGGATCGATGGCAAACATGTCCCAGCTCCAAGAAACG

AAAAGGCCGAGATCGCCATATTTGGTGTGGCGGCCGCACGCGTTC

ATCGTCCACCTCCGGAGAACAGGCCACCATCACGCATCTGTGTCTG

AATTTCATCACGGGCGCGCCCTGGGCCTCATGGGCCTTCCGCTCAC

TGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAACA

TGGTCATAGCTGTTTCCTTGCGTATTGGGCGCTCTCCGCTTCCTCGC

TCACTGACTCGCTGCGCTCGGTCGTTCGGGTAAAGCCTGGGGTGCC

TAATGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGC

CGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCAT

CACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGG

ACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGC

TCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCT

CCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTAT

CTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACG

AACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCG

TCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGC

AGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGC

TACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAG

AACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGA

AAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGT

AGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAA

AAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACG

CTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGAT

TATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAG

TTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGT

TACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTAT

TTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTAC

GATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACC

GCGAGAACCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCA

GCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATC

CGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGT

AGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAG

GCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTC

CGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGC

AAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTA

AGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAA

TTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTG

AGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGA

GTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAG

CAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCG

AAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAA

CCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCA

GCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAA

AAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTC

CTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAG

CGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGT

TCCGCGCACATTTCCCCGAAAAGTGCCACCTAAATTGTAAGCGTTA

ATATTTTGTTAAAATTCGCGTTAAATTTTTGTTAAATCAGCTCATTT

TTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAAATCAAAA

GAATAGACCGAGATAGGGTTGAGTGGCCGCTACAGGGCGCTCCCA

TTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGTTTCGGTGCG

GGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGC

AAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACG

TTGTAAAACGACGGCCAGTGAGCGCGACGTAATACGACTCACTAT

AGGGCGAATTGGCGGAAGGCCGTCAAGGCCGCATGGCGCGCCTTT

CCCGTCTTTCAGTGCCTTGTTCAGTTCTTCCTGACGGGCGGTATATT

TCTCCAGCTTGGCCTATGCGGCCCTGTCAGACCAAGTTTACGAGCT

CGCTTGGACTCCTGTTGATAGATCCAGTAATGACCTCAGAACTCCA

TCTGGATTTGTTCAGAACGCTCGGTTGCCGCCGGGCGTTTTTTATT

GGTGAGAATCCAAGCACTAGGGACAGTAAGACGGGTAAGCCTGTT

GATGATACCGCTGCCTTACTGGGTGCATTAGCCAGTCTGAATGACC

TGTCACGGGATAATCCGAAGTGGTCAGACTGGAAAATCAGAGGGC

AGGAACTGCTGAACAGCAAAAAGTCAGATAGCACCACATAGCAG

ACCCGCCATAAAACGCCCTGAGAAGCCCGTGACGGGCTTTTCTTGT

ATTATGGGTAGTTTCCTTGCATGAATCCATAAAAGGCGCCTGTAGT

GCCATTTACCCCCATTCACTGCCAGAGCCGTGAGCGCAGCGAACT

GAATGTCACGAAAAAGACAGCGACTCAGGTGCCTGATGGTCGGAG

ACAAAAGGAATATTCAGCGATTTGCCCGAGCTTGCGAGGGTGCTA

CTTAAGCCTTTAGGGTTTTAAGGTCTGTTTTGTAGAGGAGCAAACA

GCGTTTGCGACATCCTTTTGTAATACTGCGGAACTGACTAAAGTAG

TGAGTTATACACAGGGCTGGGATCTATTCTTTTTATCTTTT I TATT

CTTTCTTTATTCTATAAATTATAACCACTTGAATATAAACAAAAAA

AACACACAAAGGTCTAGCGGAATTTACAGAGGGTCTAGCAGAATT

TACAAGTTTTCCAGCAAAGGTCTAGCAGAATTTACAGATACCCAC

AACTCAAAGGAAAAGGACATGTAATTATCATTGACTAGCCCATCT

CAATTGGTATAGTGATTAAAATCACCTAGACCAATTGAGATGTATG

TCTGAATTAGTTGTTTTCAAAGCAAATGAACTAGCGATTAGTCGCT

ATGACTTAACGGAGCATGAAACCAAGCTAATTTTATGCTGTGTGGC

ACTACTCAACCCCACGATTGAAAACCCTACAAGGAAAGAACGGAC

GGTATCGTTCACTTATAACCAATACGCTCAGATGATGAACATCAGT

AGGGAAAATGCTTATGGTGTATTAGCTAAAGCAACCAGAGAGCTG

ATGACGAGAACTGTGGAAATCAGGAATCCTTTGGTTAAAGGCTTT

GAGATTTTCCAGTGGACAAACTATGCCAAGTTCTCAAGCGAAAAA

TTAGAATTAGTTTTTAGTGAAGAGATATTGCCTTATCTTTTCCAGTT

AAAAA - ^ATTC ATAAAATATAATCTGG AACATGTTAAGTC i fTTGAA

AACAAATACTCTATGAGGATTTATGAGTGGTTATTAAAAGAACTA

ACACAAAAGAAAACTCACAAGGCAAATATAGAGATTAGCCTTGAT

GAATTTAAGTTCATGTTAATGCTTGAAAATAACTACCATGAGTTTA

AAAGGCTTAACCAATGGGTTTTGAAACCAATAAGTAAAGATTTAA

ACACTTACAGCAATATGAAATTGGTGGTTGATAAGCGAGGCCGCC

CG ACTG AT ACGTTG A i l l ! CC AAGTTG A ACT AG AT AG AC AAATGG

ATCTCGTAACCGAACTTGAGAACAACCAGATAAAAATGAATGGTG

ACAAAATACCAACAACCATTACATCAGATTCCTACCTACATAACG

GACTAAGAAAAACACTACACGATGCTTTAACTGCAAAAATTCAGC

TCACCAGTTTTGAGGCAAAATTTTTGAGTGACATGCAAAGTAAGTA

TGATCTCAATGGTTCGTTCTCATGGCTCACGCAAAAACAACGAACC

ACACTAGAGAACATACTGGC AAATACGGAAGGATCTGAGGTTCT

TATGGCTCTTGTATCTATCAGTGAAGCATCAAGACTAACAAACAA

AAGTAGAACAACTGTTCACCGTTACATATCAAAGGGAAAACTGTC

CATATGCACAGATGAAAACGGTGTAAAAAAGATAGATACATCAGA

GCTTTTACGAGTTTTTGGTGCATTCAAAGCTGTTCACCATGAACAG

ATCGACAATGTAACG

SEQ ID NO: 39 CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGAT

GTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTC

ACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACTCACT

ATAGGGCGACCCTTAGGATCCTATGGCGCGCCTCATCGTCCACCTC

CGGAGAACAGGCCACCATCACGCATCTGTGTCTGAATTTCATCACG

ACGCGCCTTAAGGGCACCAATAACTGCCTTAAAAAAATTACGCCC

CGCCCTGCCACTCATCGCAGTACTGTTGTAATTCATTAAGCATTCT

GCCGACATGGAAGCCATCACAGACGGCATGATGAACCTGAATCGC

CAGCGGCATCAGCACCTTGTCGCCTTGCGTATAATATTTGCCCATG

GTGAAAACGGGGGCGAAGAAGTTGTCCATATTGGCCACGTTTAAA

TCAAAACTGGTGAAACTCACCCAGGGATTGGCTGAGACGAAAAAC

ATATTCTCAATAAACCCTTTAGGGAAATAGGCCAGGTTTTCACCGT

AACACGCCACATCTTGCGAATATATGTGTAGAAACTGCCGGAAAT

CGTCGTGGTATTCACTCCAGAGCGATGAAAACGTTTCAGTTTGCTC

ATGGAAAACGGTGTAACAAGGGTGAACACTATCCCATATCACCAG

CTCACCGTCTTTCATTGCCATACGGAATTCCGGATGAGCATTCATC

AGGCGGGCAAGAATGTGAATAAAGGCCGGATAAAACTTGTGCTTA

TTTTTCTTTACGGTCTTTAAAAAGGCCGTAATATCCAGCTGAACGG

TCTGGTTATAGGTACATTGAGCAACTGACTGAAATGCCTCAAAATG

TTCTTTACGATGCCATTGGGATATATCAACGGTGGTATATCCAGTG

ATTTTTTTCTCCATTTTAGCTTCCTTAGCTCCTGAAAATCTCGATAA

CTCAAAAAATACGCCCGGTAGTGATCTTATTTCATTATGGTGAAAG

TTGGAACCTCTTACGTGCCGATCAACGTCTCATTTTCGCCAAAAGT

TGGCCCAGGGCTTCCCGGTATCAACAGGGACACCAGGATTTATTTA

TTCTGCGAAGTGATCTTCCGTCACAGGTATTGGACCACCCTGTGGG

TTTATAAGCGCGCTGCTGGCGTGTAAGGCGGTGACGGCGAAGGAA

GGGTCCTTTTCATCACGTGCTATAAAAATAATTATAATTTAAATTT

TTTAATATAAATATATAAATTAAAAATAGAAAGTAAAAAAAGAAA

' ί rAAAGAAAAAATAGTTTTTGTTTTCCGAAGATG ι AAAAGACTCTA

GGGGGATCGCCAACAAATACTACCTTTTATCTTGCTCTTCCTGCTC

TCAGGTATTAATGCCGAATTGTTTCATCTTGTCTGTGTAGAAGACC

ACACACGAAAATCCTGTGATTTTACATTTTACTTATCGTTAATCGA

ATGTATATCTATTTAATCTGCTTTTCTTGTCTAATAAATATATATGT

AAAGTACGCTTTTTGTTGAAATTTTTTAAACCTTTGTTTATTTTTTTT

TCTTCATTCCGTAACTCTTCTACCTTCTTTATTTACTTTCTAAAATCC

AAATACAAAACATAAAAATAAATAAACACAGAGTAAATTCCCAAA

TTATTCCATCATTAAAAGATACGAGGCGCGTGTAAGTTACAGGCA

AGCGATCCGTCCTAAGAAACCATTATTATCATGACATTAACCTATA

AAAATAGGCGTATCACGAGGCCCTTTCGTCTCGCGCGTTTCGGTGA

TGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCAC

AGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGG

CGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATGC

GGCATCAGAGCAGATTGTACTGAGAGTGCACCACGGCGCGTGGCA

CCCTTGCGGGCCATGTCATACACCGCCTTCAGAGCAGCCGGACCTA

TCTGCCCGTTGGCGCGCCTATTGAAAGATCTTAAGGGGATATCCTC

GAGGTTCCCTTTAGTGAGGGTTAATTGCGAGCTTGGCGTAATCATG

GTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCA

CACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCC

TAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCG

CTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGG

CCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGC

TTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGA

GCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAA

TCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCA

AAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCA

TAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAG

TCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTT

TCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCG

CTTACCGGATACCTGTCCGCCTTTCTCCCTTCCiGGAAGCGTGGCGC

TTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGT

TCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGAC

CGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAA

GACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTA

GCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGT

GGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCG

CTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTG

ATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGC

AAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCT

TTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCAC

GTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTA

GATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATA

TATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGG

CACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGA

CTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTG

GCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTC

CAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCA

GAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTG

TTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGC

AACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGT

TTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGT

TACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGT

CCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCA

TGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGT

AAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGA

GAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATAC

GGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCA

TTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCT

GTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCT

TCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAG

GAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAA

TGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTA

TCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAG

AAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTG

CCACCTGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTG

GTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCG

CCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGG

CTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGA

TTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTG

ATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCC

TTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAA

ACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTAT

AAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGAT

TTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACA

ATTTGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 40 CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGAT

GTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTC

ACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACTCACT

ATAGGGCGACCCTTAGGATCCTATGGCGCGCCACCACGGTGAACA

ATCCCCGCTGGCTCATATTTGCCGCCGGTTCCCGTAAATCCTCCGG

TACGCGTCCAGTATCCCAGCAGATACGGGATATCGACATTTCTGCA

CCATTCCGGCGGGTATAGGTTTTATTGATGGCCTCATCCACACGCA

GCAGCGTCTGTTCATCGTCGTGGCGGCCCATAATAATCTGCCGGTC

AATCAGCCAGCTTTCCTCACCCGGCCCCCATCCCCATACGCGCATT

TCGTAGCGGTCCAGCTGGGAGTCGATACCGGCGGTCAGGTAAGCC

ACACGGTCAGGAACGGGCGCTGAATAATGCTCTTTCCGCTCTGCCA

TCACTTCAGCATCCGGACGTTCGCCAATTTTCGCCTCCCACGTCTC

ACCGAGCGTGGTGTTTACGAAGGTTTTACGTTTTCCCGTATCCCCT

TTCGTTTTCATCCAGTCTTTGACAATCTGCACCCAGGTGGTGAACG

GGCTGTACGCTGTCCAGATGTGAAAGGTCACACTGTCAGGTGGCT

CAATCTCTTCACCGGATGACGAAAACCAGAGAATGCCATCACGGG

TCCAGATCCCGGTCTTTTCGCAGATATAACGGGCATCAGTAAAGTC

CAGCTCCTGCTGGCGGATGACGCAGGCATTATGCTCGCAGAGATA

AAACACGCTGGAGACGCGTTTTCCCGTCTTTCAGTGCCTTGTTCAG

TTCTTCCTGACGGGCGGTATATTTCTCCAGCTTGGCGCGCCTAAGA

CTTAGATCTTAAGGGGATATCCTCGAGGTTCCCTTTAGTGAGGGTT

AATTGCGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGA

AATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGC

ATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACA

TTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGT

CGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCG

GTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTG

CGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAG

GCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAG

AACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAA

GGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAG

CATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACA

GGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGC

GCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTT

CTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGT

ATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCA

CGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTAT

CGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCA

GCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGT

GCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGA

AGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCG

GAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTG

GTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAA

AAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGAC

GCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGA

TTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAA

GTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAG

TTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTA

TTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTA

CGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATAC

CGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACC

AGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTAT

CCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAG

TAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACA

GGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCT

CCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTG

CAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGT

AAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATA

ATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGT

GAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCG

AGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATA

GCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGC

GAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTA

ACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACC

AGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAA

AAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTT

CCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGA

GCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGG

TTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGCGCCCTGTAG

CGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGAC

CGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCC

CTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAAT

CGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCG

ACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCAT

CGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTT

CTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCT

ATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGC

CTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAA

TTTTAACAAAATATTAACGCTTACAATTTGCCATTCGCCATTCAGG

CTGCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 41 CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGAT

GTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTC

ACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACTCACT

ATAGGGCGACCCTTAGGCGCGCCTTTCCCGTCTTTCAGTGCCTTGT

TCAGTTCTTCCTGACGGGCGGTATATTTCTCCAGCTTACGCGCCAT

GCAGGGATATCAGATCTTCGAGGAGAACTTCTAGTATATCCACAT

ACCTAATATTATTGCCTTATTAAAAATGGAATCCCAACAATTACAT

CAAAATCCACATTCTCTTCAAAATCAATTGTCCTGTACTTCCTTGTT

CATGTGTGTTCAAAAACGTTATATTTATAGGATAATTATACTCTAT

TTCTCAACAAGTAATTGGTTGTTTGGCCGAGCGGTCTAAGGCGCCT

GATTCAAGAAATATCTTGACCGCAGTTAACTGTGGGAATACTCAG

GTATCGTAAGATGCAAGAGTTCGAATCTCTTAGCAACCATTATTTT

TTTCCTCAACATAACGAGAACACACAGGGGCGCTATCGCACAGAA

TCAAATTCGATGACTGGAAATTTTTTGTTAATTTCAGAGGTCGCCT

GACGCATATACCTTTTTCAACTGAAAAATTGGGAGAAAAAGGAAA

GGTGAGAGGCCGGAACCGGCTTTTCATATAGAATAGAGAAGCGTT

CATGACTAAATGCTTGCATCACAATACTTGAAGTTGACAATATTAT

TTAAGGACCTATTGTTTTTTCCAATAGGTGGTTAGCAATCGTCTTA

CTTTCTAACTTTTCTTACCTTTTACATTTCAGCAATATATATATATA

TTTCAAGGATATACCATTCTAATGTCTGCCCCTATGTCTGCCCCTA

AGAAGATCGTCGTTTTGCCAGGTGACCACGTTGGTCAAGAAATCA

CAGCCGAAGCCATTAAGGTTCTTAAAGCTATTTCTGATGTTCGTTC

CAATGTCAAGTTCGATTTCGAAAATCATTTAATTGGTGGTGCTGCT

ATCGATGCTACAGGTGTCCCACTTCCAGATGAGGCGCTGGAAGCC

TCCAAGAAGGTTGATGCCGTTTTGTTAGGTGCTGTGGCTGGTCCTA

AATGGGGTACCGGTAGTGTTAGACCTGAACAAGGTTTACTAAAAA

TCCGTAAAGAACTTCAATTGTACGCCAACTTAAGACCATGTAACTT

TGCATCCGACTCTCTTTTAGACTTATCTCCAATCAAGCCACAATTT

GCTAAAGGTACTGACTTCGTTGTTGTCAGAGAATTAGTGGGAGGT

ATTTACTTTGGTAAGAGAAAGGAAGACGATGGTGATGGTGTCGCT

TGGGATAGTGAACAATACACCGTTCCAGAAGTGCAAAGAATCACA

AGAATGGCCGCTTTCATGGCCCTACAACATGAGCCACCATTGCCTA

TTTGGTCCTTGGATAAAGCTAATCTTTTGGCCTCTTCAAGATTATG

GAGAAAAACTGTGGAGGAAACCATCAAGAACGAATTCCCTACATT

GAAGGTTCAACATCAATTGATTGATTCTGCCGCCATGATCCTAGTT

AAGAACCCAACCCACCTAAATGGTATTATAATCACCAGCAACATG

TTTGGTGATATCATCTCCGATGAAGCCTCCGTTATCCCAGGTTCCTT

GGGTTTGTTGCCATCTGCGTCCTTGGCCTCTTTGCCAGACAAGAAC

ACCGCATTTGGTTTGTACGAACCATGCCACGGTTCTGCTCCAGATT

TGCCAAAGAATAAGGTTGACCCTATCGCCACTATCTTGTCTGCTGC

AATGATGTTGAAATTGTCATTGAACTTGCCTGAAGAAGGTAAGGC

CATTGAAGATGCAGTTAAAAAGGTTTTGGATGCAGGTATCAGAAC

TGGTGATTTAGGTGGTTCCAACAGTACCACCGAAGTCGGTGATGCT

GTCGCCGAAGAAGTTAAGAAAATCCTTGCTTAAAAAGATTCTCTTT

TTTTATGATATTTGTACATAAACTTTATAAATGAAATTCATAATAG

AAACGACACGAAATTACAAAATGGAATATGTTCATAGGGTAGACG

AAACTATATACGCAATCTACATACATTTATCAAGAAGGAGAAAAA

GGAGGATAGTAAAGGAATACAGGTAAGCAAATTGATACTAATGGC

TCAACGTGATAAGGAAAAAGAATTGCACTTTAACATTAATATTGA

CAAGGAGGAGGGCACCACACAAAAAGTTAGGTGTAACAGAAAAT

CATGAAACTACGATTCCTAATTTGATATTGGAGGATTTTCTCTAAA

AAAAAAAAAATACAACAAATAAAAAACACTCAATGACCTGACCAT

TTGATGGAGTTTAAGTCAATACCTTCTTGAAGCATTTCCCATAATG

GTGAAAGTTCCCTCAAGAATTTTACTCTGTCAGAAACGGCCTTACG

ACGTAGTCGAGCATGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCA

CTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGC

TCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAA

CGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGA

ACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCC

CCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCG

AAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAG

CTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATAC

CTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTC

ACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTG

GGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTAT

CCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATC

GCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTA

TGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGG

CTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCA

GTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAA

ACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTA

CGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTAC

GGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTT

GGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAAT

TAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTT

GGTCTGACAGTTAACGGCGCGTTCATCGTCCACCTCCGGAGAACA

GGCCACCATCACGCATCTGTGTCTGAATTTCATCACGGGCGCGCCT

AAGGGGATATCCTCGAGGTTCCCTTTAGTGAGGGTTAATTGCGAGC

TTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATC

CGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTA

AAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTT

GCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTG

CATTAACATCATACCGTATAGGCTATCCAATGCTTAATCAGTGAGG

CACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGA

CTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTG

GCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTC

CAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCA

GAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTG

TTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGC

AACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGT

TTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGT

TACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGT

CCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCA

TGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGT

AAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGA

GAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATAC

GGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCA

TTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCT

GTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCT

TCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAG

GAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAA

TGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTA

TCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAG

AAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTG

CCACCTGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTG

GTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCG

CCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGG

CTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGA

TTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTG

ATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCC

TTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAA

ACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTAT

AAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGAT

TTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACA

ATTTGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 42 CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGAT

GTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTC

ACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACTCACT

ATAGGGCGACCCTTAGGATCTAAGCATTGGCGCGCCCCGGCTGTCT

GCCATGCTGCCCGGTGTACCGACATAACCGCCGGTGGCATAGCCG

CGATACGCGTCTCCAGCGTGTTTTATCTCTGCGAGCATAATGCCT

GCGTCATCCGCCAGCAGGAGCTGGACTTTACTGATGCCCGTTATAT

CTGCGAAAAGACCGGGATCTGGACCCGTGATGGCATTCTCTGGTTT

TCGTCATCCGGTGAAGAGATTGAGCCACCTGACAGTGTGACCTTTC

ACATCTGGACAGCGTACAGCCCGTTCACCACCTGGGTGCAGATTGT

CAAAGACTGGATGAAAACGAAAGGGGATACGGGAAAACGTAAAA

CCTTCGTAAACACCACGCTCGGTGAGACGTGGGAGGCGAAAATTG

GCGAACGTCCGGATGCTGAAGTGATGGCAGAGCGGAAAGAGCATT

ATTCAGCGCCCGTTCCTGACCGTGTGGCTTACCTGACCGCCGGTAT

CGACTCCCAGCTGGACCGCTACGAAATGCGCGTATGGGGATGGGG

GCCGGGTGAGGAAAGCTGGCTGATTGACCGGCAGATTATTATGGG

CCGCCACGACGATGAACAGACGCTGCTGCGTGTGGATGAGGCCAT

CAATAAAACCTATACCCGCCGGAATGGTGCAGAAATGTCGATATC

CCGTATCTGCTGGGATACTGGACGCGTTTTCCCGTCTTTCAGTGCC

TTGTTCAGTTCTTCCTGACGGGCGGTATATTTCTCCAGCTTGGCGC

GCCTAAGACTTAGATCTTAAGGGGATATCCTCGAGGTTCCCTTTAG

TGAGGGTTAATTGCGAGCTTGGCGTAATCATGGTCATAGCTGTTTC

CTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGC

CGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTA

ACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGA

AACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGG

AGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTG

ACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCA

CTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGC

AGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACC

GTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCC

TGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAA

CCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTC

CCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTG

TCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCAC

GCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGG

CTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCC

GGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGC

CACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATG

TAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCT

ACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGT

TACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAAC

CACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACG

CGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACG

GGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTG

GTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATT

AAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTT

GGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGC

GATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGT

AGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTG

CAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAG

CAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTG

CAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGC

TAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCC

ATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTT

CATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCC

CATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTT

GTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAG

CACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCT

GTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGC

GGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCG

CGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTT

CTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAG

TTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTT

ACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAAT

GCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACT

CATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATT

GTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAAC

AAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACG

CGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGC

GCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTT

CGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTC

AAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTT

ACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACG

TAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTG

GAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAA

CACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTG

CCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAAT

TTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTGCCATTC

GCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 43 AAGCTTAAA

SEQ ID NO: 44 CCGCGG

SEQ ID NO: 45 CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGG

ATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCC

AGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACT

CACTATAGGGCGACCCTTAGGATCCTATGGCGCGCCACCACGGT

GAACAATCCCCGCTGGCTCATATTTGCCGCCGGTTCCCGTAAATC

CTCCGGTACGCGCCGGGCCGTATACTTACATATAGTAGATGTCAA

GCGTAGGCGCTTCCCCTGCCGGCTGTGAGGGCGCCATAACCAA

GGTATCTATAGACCGCCAATCAGCAAACTACCTCCGTACATTCAT

GTTGCACCCACACATTTATACACCCAGACCGCGACAAATTACCCA

TAAGGTTGTTTGTGACGGCGTCGTACAAGAGAACGTGGGAACTTT

TTAGGCTCACCAAAAAAGAAAGAAAAAATACGAGTTGCTGACAGA

AGCCTCAAGAAAAAAAAAATTCTTCTTCGACTATGCTGGAGGCAG

AGATGATCGAGCCGGTAGTTAACTATATATAGCTAAATTGGTTCC

ATCACCTTCTTTTCTGGTGTCGCTCCTTCTAGTGCTATTTCTGGCT

TTTCCTATTTTTTTTTTTCCATTTTTCTTTCTCTCTTTCTAATATATA

AATTCTCTTGCATTTTCTATTTTTCTCTCTATCTATTCTACTTGTTTA

TTCCCTTCAAGGTTTTTTTTTAAGGAGTACTTGTTTTTAGAATATAC

GGTCAACGAACTATAATTAACTAAACAAGCTTAAAATGGCTAACCC

ACACCCACATTTCTTGATTATTACTTTTCCAGCCCAAGGTCATATT

AACCCAGCTTTGGAATTGGCCAAAAGATTGATTGGTGTTGGTGCT

GATGTTACTTTCGCTACTACTATTCATGCCAAGTCCAGATTGGTTA

AGAACCCAACTGTTGATGGTTTGAGATTCTCTACTTTCTCCGATG

GTCAAGAAGAAGGTGTTAAGAGAGGTCCAAACGAATTGCCAGTTT

TTCAAAGATTGGCCTCCGAAAACTTGTCCGAATTGATTATGGCTT

CTGCTAATGAAGGTAGACCAATCTCTTGTTTGATCTACTCCATTTT

GATTCCAGGTGCTGCTGAATTGGCTAGATCATTCAATATTCCATCT

GCTTTCTTGTGGATTCAACCAGCTACTGTTTTGGACATCTATTACT

ACTACTTCAACGGTTTCGGTGACTTGATCAGATCCAAATCTTCTGA

TCCATCCTTCTCCATTGAATTACCAGGTTTGCCATCTTTGTCCAGA

CAAGATTTGCCATCCTTTTTCGTTGGTTCCGACCAAAATCAAGAAA

ACCATGCTTTGGCTGCCTTTCAAAAGCACTTGGAAATTTTGGAAC

AAGAAGAAAACCCAAAGGTCTTGGTTAACACTTTCGATGCTTTAG

AACCAGAAGCCTTGAGAGCTGTTGAAAAGTTGAAATTGACTGCTG

TTGGTCCATTGGTTCCATCTGGTTTTTCTGATGGTAAAGATGCTTC

TGATACACCATCTGGTGGTGATTTGTCTGATGGTTCTAGAGATTAT

ATGGAATGGTTGAAGTCCAAGCCAGAATCTACTGTTGTTTACGTT

TCCTTCGGTTCCATCAGTATGTTCTCTATGCAACAAATGGAAGAAA

TCGCCAGAGGTTTGTTGGAATCTGGTAGACCATTTTTGTGGGTTA

TCAGAGCTAAAGAAAACGGTGAAGAAAACAAAGAAGAAGATAAGT

TGTCCTGCCAAGAAGAATTGGAAAAGCAAGGTATGTTGATCCAAT

GGTGCTCTCAAATGGAAGTTTTGTCTCATCCATCTTTGGGTTGTTT

CGTTACTCATTGTGGTTGGAACTCCTCTATTGAATCTTTAGCTTCT

GGTGTTCCAATGATTGCATTTCCACAATGGGCTGATCAAGGTACT

AATACCAAGTTGATTAAGGACGTTTGGAAAACCGGTGTTAGATTG

ATGGTTAACGAAGAAGAAATTGTCACCTCCGACGAATTGAGAAGA

TGCTTGGAATTAGTTATGGGTGATGGTGAAAAGGGTCAAGAAATG

AGAAAGAATGCTAAGAAGTGGAAGATTTTGGCTAAAGAAGCCTTA

AAAGAAGGTGGTTCCTCTCACAAGAATTTGAAGAACTTCGTTGAC

GAAGTCATCCAAGGTTACTGACCGCGGACAAATCGCTCTTAAATA

TATACCTAAAGAACATTAAAGCTATATTATAAGCAAAGATACGTAA

ATTTTGCTTATATTATTATACACATATCATATTTCTATATTTTTAAGA

TTTGGTTATATAATGTACGTAATGCAAAGGAAATAAATTTTATACAT

TATTGAACAGCGTCCAAGTAACTACATTATGTGCACTAATAGTTTA

GCGTCGTGAAGACTTTATTGTGTCGCGAAAAGTAAAAATTTTAAAA

ATTAGAGCACCTTGAACTTGCGAAAAAGGTTCTCATCAACTGTTTA

AAAGGAGGATATCAGGTCCTATTTCTGACAAACAATATACAAATTT

AGTTTCAAAGGCGCGTTGCAAAATGGAATTTCGCCGCAGCGGCC

TGAATGGCTGTACCGCCTGACGCGGATGCGCCGGCGCGCCTATT

GAAAGATCTTAAGGGGATATCCTCGAGGTTCCCTTTAGTGAGGGT

TAATTGCGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGT

GAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAA

GCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCA

CATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACC

TGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGA

GGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGA

CTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCT

CACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAAC

GCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAA

CCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCC

CCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGG

CGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGG

AAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCG

GATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTC

ATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCT

CCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCG

CTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAG

ACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTA

GCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGG

TGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGC

GCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCT

TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTT

TGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGAT

CCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAAC

TCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCA

CCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGT

ATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTG

AGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTG

CCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTA

CCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTC

ACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGG

CCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAG

TCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTT

AATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTG

TCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAA

CGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCG

GTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCC

GCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTA

CTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT

CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCT

CTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGA

ACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAA

CTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCC

ACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGC

GTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAA

GGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCT

TTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGC

GGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTC

CGCGCACATTTCCCCGAAAAGTGCCACCTGACGCGCCCTGTAGC

GGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGA

CCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTC

TTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCT

CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGG

CACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGT

GGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGA

GTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAAC

ACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTG

CCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAAT

TTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTGCCATTC

GCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 46 CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGG

ATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCC

AGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACT

CACTATAGGGCGACCCTTAGGATCCTATGGCGCGCCGGCACCCT

TGCGGGCCATGTCATACACCGCCTTCAGAGCAGCCGGACCTATC

TGCCCGTTACGCGCCAGCTTGCAAATTAAAGCCTTCGAGCGTCC

CAAAACCTTCTCAAGCAAGGTTTTCAGTATAATGTTACATGCGTAC

ACGCGTCTGTACAGAAAAAAAAGAAAAATTTGAAATATAAATAACG

TTCTTAATACTAACATAACTATAAAAAAATAAATAGGGACCTAGAC

TTCAGGTTGTCTAACTCCTTCCTTTTCGGTTAGAGCGGATGTGGG

GGGAGGGCGTGAATGTAAGCGTGACATAACTAATTACATGATATC

GACAAAGGAAAAGGGGGACGGATCTCCGAGGCCTCGGACCCGT

CGGGCCGCCGTCGGACGTGCCGCGGATCCCCGGGTCGAGCCTG

AACGGCCTCGAGGCCTGAACGGCCTCGACGAATTCATTATTTGTA

GAGCTCATCCATGCCATGTGTAATCCCAGCAGCAGTTACAAACTC

AAGAAGGACCATGTGGTCACGCTTTTCGTTGGGATCTTTCGAAAG

GGCAGATTGTGTCGACAGGTAATGGTTGTCTGGTAAAAGGACAG

GGCCATCGCCAATTGGAGTATTTTGTTGATAATGGTCTGCTAGTT

GAACGGATCCATCTTCAATGTTGTGGCGAATTTTGAAGTTAGCTTT

GATTCCATTCTTTTGTTTGTCTGCCGTGATGTATACATTGTGTGAG

TTATAGTTGTACTCGAGTTTGTGTCCGAGAATGTTTCCATCTTCTT

TAAAATCAATACCTTTTAACTCGATACGATTAACAAGGGTATCACC

TTCAAACTTGACTTCAGCACGCGTCTTGTAGTTCCCGTCATCTTTG

AAAGATATAGTGCGTTCCTGTACATAACCTTCGGGCATGGCACTC

TTGAAAAAGTCATGCCGTTTCATATGATCCGGATAACGGGAAAAG

CATTGAACACCATAAGAGAAAGTAGTGACAAGTGTTGGCCATGGA

ACAGGTAGTTTTCCAGTAGTGCAAATAAATTTAAGGGTAAGCTGG

CCCTGCAGGCCAAGCTTTTTGTTTGTTTATGTGTGTTTATTCGAAA

CTAAGTTCTTGGTGTTTTAAAACTAAAAAAAAGACTAACTATAAAA

GTAGAATTTAAGAAGTTTAAGAAATAGATTTACAGAATTACAATCA

ATACCTACCGTCTTTATATACTTATTAGTCAAGTAGGGGAATAATT

TCAGGGAACTGGTTTCAACCTTTTTTTTCAGCTTTTTCCAAATCAG

AGAGAGCAGAAGGTAATAGAAGGTGTAAGAAAATGAGATAGATAC

ATGCGTGGGTCAATTGCCTTGTGTCATCATTTACTCCAGGCAGGT

TGCATCACTCCATTGAGGTTGTGTCCGTTTTTTGCCTGTTTGTGC

CCCTGTTCTCTGTAGTTGCGCTAAGAGAATGGACCTATGAACTGA

TGGTTGGTGAAGAAAACAATATTTTGGTGCTGGGATTCTTTTTTTT

TCTGGATGCCAGCTTAAAAAGCGGGCTCCATTATATTTAGTGGAT

GCCAGGAATAAACTGTTCACCCAGACACCTACGATGTTATATATT

CTGTGTAACCCGCCCCCTATTTTGGGCATGTACGGGTTACAGCA

GAATTAAAAGGCTAATTTTTTGACTAAATAAAGTTAGGAAAATCAC

TACTATTAATTATTTACGTATTCTTTGAAATGGCAGTATTGATAATG

ATAAACTCGAACTGGGCGCGTCGTGCCGTCGTTGTTAATCACCAC

ATGGTTATTCTGCTCAAACGTCCCGGACGCCTGCGAGGCGCGCC

TATTGAAAGATCTTAAGGGGATATCCTCGAGGTTCCCTTTAGTGA

GGGTTAATTGCGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCT

GTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCC

GGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTA

ACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGG

AAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGG

GGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTC

ACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATC

AGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGG

ATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCC

AGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCT

CCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGA

GGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCC

CCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCT

TACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGC

TTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCG

TTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCC

GACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCC

GGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACA

GGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTG

AAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGT

ATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGT

AGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTT

TTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCA

AGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAA

CGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAG

GATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAA

TCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTT

AATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATC

CATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGA

GGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACC

CACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCC

GGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTC

CATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTC

GCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCAT

CGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCG

GTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCA

AAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTA

AGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATA

ATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGG

TGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACC

GAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCAC

ATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGG

GGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGA

TGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTT

CACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCG

CAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATAC

TCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTC

ATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAG

GGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGCGCCC

TGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCA

GCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTC

GCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGT

CAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCT

TTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCA

CGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGAC

GTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGG

AACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGG

ATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAAC

AAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTGC

CATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 47 CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGG

ATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCC

AGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACT

CACTATAGGGCGACCCTTAAGATCTGTAATGGCGCGCCATGCGC

GGCTATGCCACCGGCGGTTATGTCGGTACACCGGGCAGCATGG

CAGACAGCCGGACGCGCCACGCACAGATATTATAACATCTGCAT

AATAGGCATTTGCAAGAATTACTCGTGAGTAAGGAAAGAGTGAGG

AACTATCGCATACCTGCATTTAAAGATGCCGATTTGGGCGCGAAT

CCTTTATTTTGGCTTCACCCTCATACTATTATCAGGGCCAGAAAAA

GGAAGTGTTTCCCTCCTTCTTGAATTGATGTTACCCTCATAAAGCA

CGTGGCCTCTTATCGAGAAAGAAATTACCGTCGCTCGTGATTTGT

TTGCAAAAAGAACAAAACTGAAAAAACCCAGACACGCTCGACTTC

CTGTCTTCCTATTGATTGCAGCTTCCAATTTCGTCACACAACAAGG

TCCTAGCGACGGCTCACAGGTTTTGTAACAAGCAATCGAAGGTTC

TGGAATGGCGGGAAAGGGTTTAGTACCACATGCTATGATGCCCA

CTGTGATCTCCAGAGCAAAGTTCGTTCGATCGTACTGTTACTCTC

TCTCTTTCAAACAGAATTGTCCGAATCGTGTGACAACAACAGCCT

GTTCTCACACACTCTTTTCTTCTAACCAAGGGGGTGGTTTAGTTTA

GTAGAACCTCGTGAAACTTACATTTACATATATATAAACTTGCATA

AATTGGTCAATGCAAGAAATACATATTTGGTCTTTTCTAATTCGTA

GTTTTTCAAGTTCTTAGATGCTTTCTTTTTCTCTTTTTTACAGATCA

TCAAGGAAGTAATTATCTACTTTTTACAACAAATATAAAACAAAGC

TTAAAATGGCCTTGAGAATCAACGAATTATTCGTCGCTGCCATCAT

CTACATCATCGTTCATATTATCATCTCCAAGTTGATCACCACCGTT

AGAGAAAGAGGTAGAAGATTGCCATTGCCACCAGGTCCAACTGG

TTGGCCAGTTATTGGTGCTTTGCCATTATTGGGTTCTATGCCACAT

GTTGCTTTGGCTAAAATGGCTAAGAAATACGGTCCAATCATGTAC

TTGAAGGTTGGTACTTGTGGTATGGTTGTTGCTTCTACTCCAAAT

GCTGCTAAGGCTTTCTTGAAAACCTTGGACATTAACTTCTCTAACA

GACCACCTAATGCTGGTGCTACTCATTTGGCTTATAATGCCCAAG

ATATGGTTTTTGCTCCATATGGTCCAAGATGGAAGTTGTTGAGAA

AGTTGTCTAACTTGCATATGTTGGGTGGTAAGGCTTTGGAAAATT

GGGCTAATGTTAGAGCTAACGAATTGGGTCATATGTTGAAGTCTA

TGTTCGATGCTTCTCAAGATGGTGAATGCGTTGTTATTGCTGATG

TTTTGACTTTCGCTATGGCTAACATGATCGGTCAAGTTATGTTGTC

CAAGAGAGTTTTCGTTGAAAAGGGTGTCGAAGTTAACGAATTCAA

GAACATGGTTGTCGAATTGATGACTGTTGCTGGTTACTTTAACATC

GGTGATTTCATTCCAAAGTTGGCCTGGATGGATATTCAAGGTATT

GAAAAAGGTATGAAGAACTTGCACAAGAAGTTCGACGATTTGTTG

ACCAAGATGTTTGATGAACATGAAGCCACCTCCAACGAAAGAAAA

GAAAATCCAGATTTCTTGGATGTCGTCATGGCCAATAGAGATAAT

TCTGAAGGTGAAAGATTGTCCACCACCAATATTAAGGCCTTGTTG

TTGAATTTGTTCACCGCTGGTACTGATACCTCCTCTTCTGTTATTG

AATGGGCTTTAGCTGAAATGATGAAGAACCCAAAAATCTTCAAAA

AGGCCCAACAAGAAATGGACCAAGTTATCGGTAAAAACAGAAGAT

TGATCGAATCCGACATTCCAAACTTGCCATATTTGAGAGCTATCT

GCAAAGAAACTTTCAGAAAGCACCCATCTACTCCATTGAATTTGC

CAAGAGTTTCTTCTGAACCATGTACCGTTGATGGTTACTACATCC

CAAAAAACACTAGATTGTCCGTTAACATTTGGGCCATTGGTAGAG

ATCCAGATGTTTGGGAAAATCCATTGGAATTCACTCCAGAAAGAT

TCTTGTCTGGTAAGAACGCTAAGATTGAACCTAGAGGTAACGACT

TTGAATTGATTCCATTTGGTGCCGGTAGAAGAATTTGTGCTGGTA

CTAGAATGGGTATCGTTGTCGTTGAATATATCTTAGGTACTTTGGT

CCACTCCTTCGATTGGAAATTGCCAAACAACGTTATCGACATCAA

CATGGAAGAATCATTTGGTTTGGCCTTGCAAAAAGCTGTTCCATT

AGAAGCTATGGTTACCCCAAGATTGTCTTTGGATGTTTACAGATG

CTAACCGCGGATCTCTTATGTCTTTACGATTTATAGTTTTCATTAT

CAAGTATGCCTATATTAGTATATAGCATCTTTAGATGACAGTGTTC

GAAGTTTCACGAATAAAAGATAATATTCTACTTTTTGCTCCCACCG

CGTTTGCTAGCACGAGTGAACACCATCCCTCGCCTGTGAGTTGTA

CCCATTCCTCTAAACTGTAGACATGGTAGCTTCAGCAGTGTTCGT

TATGTACGGCATCCTCCAACAAACAGTCGGTTATAGTTTGTCCTG

CTCCTCTGAATCGTCTCCCTCGATATTTCTCATTTTCCTTCGGCGC

GTTCGCAGGCGTCCGGGACGTTTGAGCAGAATAACCATGTGGTG

ATTAACAACGACGGCACGGGCGCGCCMATGCTTAGATCTTAAGG

GGATATCCTCGAGGTTCCCTTTAGTGAGGGTTAATTGCGAGCTTG

GCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCG

CTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAA

GCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTG

CGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCT

GCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTA

TTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCG

GTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGT

AATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACAT

GTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCG

CGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCAT

CACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGG

ACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGC

GCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCC

TTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGT

AGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTG

TGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCG

GTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGC

CACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTAT

GTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGG

CTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCC

AGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACA

AACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGAT

TACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCT

ACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGAT

TTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTA

AATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAAC

TTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTC

AGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGT

CGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCA

GTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGAT

TTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAG

TGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGC

CGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAAC

GTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTT

GGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTT

ACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGT

CCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTC

ATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCC

GTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTC

TGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTC

AATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCT

CATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTT

ACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAA

CTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGC

AAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGA

CACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGA

AGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAAT

GTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCC

GAAAAGTGCCACCTGACGCGCCCTGTAGCGGCGCATTAAGCGCG

GCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCA

GCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCG

CCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTC

CCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAA

AAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTG

ATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAAT

AGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCG

GTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATT

GGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAA

CAAAATATTAACGCTTACAATTTGCCATTCGCCATTCAGGCTGCG

CAACTGTTGGGAAGGGCGAT

SEQ ID NO: 48

CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGG

ATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCC

AGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACT

CACTATAGGGCGACCCTTAGGATCTAAGCATTGGCGCGCCCCGG

CTGTCTGCCATGCTGCCCGGTGTACCGACATAACCGCCGGTGGC

ATAGCCGCGCATACGCGCCATTTCCTTCCATCTTGTGATTCATGC

TATCCATCTTTTTTGAGTATCCAATTAACGAAGACGTTACCAGCTG

ATTGAAGGTTCTCAAAGTGACTGTACTCCATGTTTTCTTATCATCC

ATGTAGTTATTTTTCAAACTGCAAATTCAAGAAAAAGCCACGCGTG

TGCACCTTTTTTTTCCCCTTCCAGTGCATTATGCAATAGACAGCAC

GAGTCTTTGAAAAAGTAACTTATAAAACTGTATCAATTTTTAAACCT

AAATAGATTCATAAACTATTCGTTAATATAAAGTGTTCTAAACTATG

ATGAAAAAATAAGCAGAAAAGACTAATAATTCTTAGTTAAAAGCAC

TCCGCGGTTACCACACATCTCTCAAGTATCTTCCCTCTGTTTGTAA

CTTTTTCACAATTGCTTCCGCTTCAGAAGAACTAACGCCTTCCTGT

TCCTGGACTATAGTATGAAGTGTTCTGTGAACATCTCTTGCCATAC

CCTTTGCATCACCACAGACATATAGATAGCCTTCCTCTTTGATTAA

GTCCCAAACTTGTGCGGCCTTTTCCATCATTTTGTGTTGGACGTA

CTCCTTCTGAGCACCTTCTCTAGAAAAAGCCATTATCAACTCTGAA

ATAACTCCTTGATCTACAAAGTTATTCAGTTCATCTTCGTAGATGA

AATCCATTTGTCTGTTTCTACAGCCGAAAAACAACAAAGAAGATCC

CAACTCTTCACCATCCTCCTTTAAGGCCATTCTCTCTTGTAAGAAA

CCTCTGAATGGAGCAAGACCTGTACCAGGACCGACCATGACAAT

AGGAGTAGAAGGATTGGAAGGCAGTTTGAAGTTGGAGGCTCTGA

TAAAGATTGGAGCACCAGAACATTCGTGAGACTTCTCTGCTGGAA

CCGCGTTTTTCATCCATGTTGAACAAACGCCCTTATGGATTCTAC

CAGTAGGAGTTGGACCGTACACTAAAGCGGATGTGACATGAACT

CTTGATGGTGCCAGTCTAGGTGAGGATGAAATTGAATAGTATCTT

GGTTGCAGTCTAGGCGCTATTGCGGCGAAGAAAACACCCAAAGG

AGGTTTAGCGGATGGGAAAGCAGCCATAACTTCTAGTAAAGAACG

TTGACTAGCTACTATCC - TGTGAGTATTCATCCTTACCATCTGGT

GAAGTTAGATGTTTCAGTTTTTCTGCCTCAGAAGGTTCTGTGGCG

TACGCAGCCAAGGCCACTAGAGCTGATTTACGTGGAGGATTTAAC

AGATCCGCGTAACGAGCTAAACCGGTACCTAGGGTGCATGGTCC

TGGAAATGGTGGAGGCACTGCACTTTCTAGTGGTGAGCCATCCT

CTTTATCGGCATGAATTGAGAAAACAAGATCTAAACTATGGCCCA

ACAACTTTCCAGCTTCCTCTACAATTTCAACATGGTTTTCAGCGTA

GACACCCACGTGATCACCTGTTTCGTAAGTGATACCAGTACGTGA

TATATCAAATTCAAGATGTATGCAAGATCTGTCTGATTCATGAGTG

TGCAATTCCTTTTGAACTGCAACGTCTACTCTACATGGATGATGAA

TATCGATGGTAGTATTACCATTAGCCACATTACTTTCCATTGATTT

CTGTGTTGTGAATCTTGGATCATGAGTAACTACTCTATATTCTGGA

ATGACGGCTGTGTATGGAGTGGCAACGGATTTATCATCTTCGTCC

TTAAGTAACTTATCTAATTCAGACCACAAAGATTCCTTCCATGCAT

TAAAGTCATCCTCGATAGATTGATCATCATCTCCTAAACCGACTTC

AATCAATCTCTTCGCACCCTTTTTGCATAACTCTTCATCTAAGACA

ATACCTATCTTGTTAAAGTGCTCGTATTGTCTGTTACCTAAGGCAA

AAACGCCGTAAGCAAGTTGCTGCAACTTGATATCTCTTTCGTTCT

CTTCAGTAAACCACTTGTAGAATCTTGCGGCGTTATCGGTTGGTT

CACCATCACCATACGTGGCTACACAAAAGAAAGCCAATGTTTCCT

TTTTCAACTTTTCCTCATATTGGTCATCATCGGCAGCGTAATCATC

CAAATCGATTACTTTTACAGCCGCCTTTTCGTATCTTGCTTTGATC

TCTTCTGAAAGTGCTTTAGCGAATCCTTCGGCTGTTCCGGTTTGT

GTGCCGAAGAAGATAGAGACTCTCGTTTTTCCAGAACCTAGATCT

AAGTCATCATCCTCATCTTTCGCCATCAGAGACTTAGGGATCATTA

GTGGCTTTAGCTCGCCGGAACGATCTGCCGTGGTCTTTTTCCACA

ATAAGACAACGAAACCAGCAACCAGTGCCAGAGAAGTTGTAGCA

ATAACTAATACAACATCATCGGACAAAGAATCCGTTCCCATGATAC

TTTTCAATTGTTTGAAAAGATCGGAGGCATAAAGTGCAGAAGTCA

TTTTAAGCTTTTTGTAATTAAAACTTAGATTAGATTGCTATGCTTTC

TTTCTAATGAGCAAGAAGTAAAAAAAGTTGTAATAGAACAAGAAAA

ATGAAACTGAAACTTGAGAAATTGAAGACCGTTTATTAACTTAAAT

ATCAATGGGAGGTCATCGAAAGAGAAAAAAATCAAAAAAAAAAAT

TTTCAAGAAAAAGAAACGTGATAAAAATTTTTATTGCCTTTTTCGA

CGAAGAAAAAGAAACGAGGCGGTCTCTTTTTTCTTTTCCAAACCTT

TAGTACGGGTAATTAACGACACCCTAGAGGAAGAAAGAGGGGAA

ATTTAGTATGCTGTGCTTGGGTGTTTTGAAGTGGTACGGCGATGC

GCGGAGTCCGAGAAAATCTGGAAGAGTAAAAAAGGAGTAGAAAC

ATTTTGAAGCTAGGCGCGTCAGCCGGTAAAGATTCCCCACGCCA

ATCCGGCTGGTTGCCTCCTTCGTGAAGACAAACTCGGCGCGCCA

TTACAGATCTTAAGGGGATATCCTCGAGGTTCCCTTTAGTGAGGG

TTAATTGCGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTG

TGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGA

AGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTC

ACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAAC

CTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAG

AGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTG

ACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCT

CACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAAC

GCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAA

CCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCC

CCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGG

CGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGG

AAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCG

GATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTC

ATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCT

CCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCG

CTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAG

ACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTA

GCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGG

TGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGC

GCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCT

TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTT

TGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGAT

CCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAAC

TCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCA

CCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGT

ATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTG

AGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTG

CCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTA

CCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTC

ACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGG

CCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAG

TCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTT

AATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTG

TCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAA

CGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCG

GTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCC

GCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTA

CTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT

CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCT

CTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGA

ACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAA

CTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCC

ACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGC

GTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAA

GGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCT

TTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGC

GGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTC

CGCGCACATTTCCCCGAAAAGTGCCACCTGACGCGCCCTGTAGC

GGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGA

CCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTC

TTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCT

CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGG

CACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGT

GGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGA

GTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAAC

ACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTG

CCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAAT

TTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTGCCATTC

GCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 49

CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGG

ATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCC

AGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACT

CACTATAGGGCGACCCTTAGGATCCTATGGCGCGCCCAGCCGGT

AAAGATTCCCCACGCCAATCCGGCTGGTTGCCTCCTTCGTGAAG

ACAAACTCACGCGTCCAGTATCCCAGCAGATACGGGATATCGAC

ATTTCTGCACCATTCCGGCGGGTATAGGTTTTATTGATGGCCTCA

TCCACACGCAGCAGCGTCTGTTCATCGTCGTGGCGGCCCATAAT

AATCTGCCGGTCAATCAGCCAGCTTTCCTCACCCGGCCCCCATC

CCCATACGCGCATTTCGTAGCGGTCCAGCTGGGAGTCGATACCG

GCGGTCAGGTAAGCCACACGGTCAGGAACGGGCGCTGAATAATG

CTCTTTCCGCTCTGCCATCACTTCAGCATCCGGACGTTCGCCAAT

TTTCGCCTCCCACGTCTCACCGAGCGTGGTGTTTACGAAGGTTTT

ACGTTTTCCCGTATCCCCTTTCGTTTTCATCCAGTCTTTGACAATC

TGCACCCAGGTGGTGAACGGGCTGTACGCTGTCCAGATGTGAAA

GGTCACACTGTCAGGTGGCTCAATCTCTTCACCGGATGACGAAAA

CCAGAGAATGCCATCACGGGTCCAGATCCCGGTCTTTTCGCAGA

TATAACGGGCATCAGTAAAGTCCAGCTCCTGCTGGCGGATGACG

CAGGCATTATGCTCGCAGAGATAAAACACGCTGGAGACGCGTTTT

CCCGTCTTTCAGTGCCTTGTTCAGTTCTTCCTGACGGGCGGTATA

TTTCTCCAGCTTGGCGCGCCTAAGACTTAGATCTTAAGGGGATAT

CCTCGAGGTTCCCTTTAGTGAGGGTTAATTGCGAGCTTGGCGTAA

TCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAA

TTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGG

GGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCA

CTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTA

ATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGG

CGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGT

TCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATAC

GGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGA

GCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTT

GCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACA

AAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTA

TAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTC

TCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCT

CCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGT

ATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTG

CACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAA

CTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACT

GGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAG

GCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTAC

ACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTT

ACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACC

ACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACG

CGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACG

GGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTT

GGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAAT

TAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTG

GTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGC

GATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGT

GTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTG

CTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTA

TCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTG

GTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCC

GGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACG

TTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTG

GTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTA

CATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTC

CTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCA

TGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCG

TAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCT

GAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCA

ATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTC

ATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTA

CCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAAC

TGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAA

AAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACA

CGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAA

GCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATG

TATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCG

AAAAGTGCCACCTGACGCGCCCTGTAGCGGCGCATTAAGCGCGG

CGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGC

GCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCC

ACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCC

TTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAA

ACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATA

GACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAG

TGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGT

CTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGG

TTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACA

AAATATTAACGCTTACAATTTGCCATTCGCCATTCAGGCTGCGCA

ACTGTTGGGAAGGGCGAT

SEQ ID NO: 50

TTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAAT

ACGACTCACTATAGGGCGACCCTTAAGATCTGTAATGGCGCGCC

ATGCGCGGCTATGCCACCGGCGGTTATGTCGGTACACCGGGCA

GCATGGCAGACAGCCGGACGCGCCACGCACAGATATTATAACAT

CTGCATAATAGGCATTTGCAAGAATTACTCGTGAGTAAGGAAAGA

GTGAGGAACTATCGCATACCTGCATTTAAAGATGCCGATTTGGGC

GCGAATCCTTTATTTTGGCTTCACCCTCATACTATTATCAGGGCCA

GAAAAAGGAAGTGTTTCCCTCCTTCTTGAATTGATGTTACCCTCAT

AAAGCACGTGGCCTCTTATCGAGAAAGAAATTACCGTCGCTCGTG

ATTTGTTTGCAAAAAGAACAAAACTGAAAAAACCCAGACACGCTC

GACTTCCTGTCTTCCTATTGATTGCAGCTTCCAATTTCGTCACACA

ACAAGGTCCTAGCGACGGCTCACAGGTTTTGTAACAAGCAATCGA

AGGTTCTGGAATGGCGGGAAAGGGTTTAGTACCACATGCTATGAT

GCCCACTGTGATCTCCAGAGCAAAGTTCGTTCGATCGTACTGTTA

CTCTCTCTCTTTCAAACAGAATTGTCCGAATCGTGTGACAACAACA

GCCTGTTCTCACACACTCTTTTCTTCTAACCAAGGGGGTGGTTTA

GTTTAGTAGAACCTCGTGAAACTTACATTTACATATATATAAACTT

GCATAAATTGGTCAATGCAAGAAATACATATTTGGTCTTTTCTAAT

TCGTAGTTTTTCAAGTTCTTAGATGCTTTCTTTTTCTCTTTTTTACA

GATCATCAAGGAAGTAATTATCTACTTTTTACAACAAATATAAAAC

AAAGCTTGGCCTGCAGGGCCAGCTTACCCTTAAATTTATTTGCAC

TACTGGAAAACTACCTGTTCCATGGCCAACACTTGTCACTACTTTC

TCTTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCATATGAAAC

GGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGG

AACGCACTATATCTTTCAAAGATGACGGGAACTACAAGACGCGTG

CTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGT

TAAAAGGTATTGATTTTAAAGAAGATGGAAACATTCTCGGACACAA

ACTCGAGTACAACTATAACTCACACAATGTATACATCACGGCAGA

CAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAA

CATTGAAGATGGATCCGTTCAACTAGCAGACCATTATCAACAAAA

TACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTA

CCTGTCGACACAATCTGCCCTTTCGAAAGATCCCAACGAAAAGCG

TGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTAC

ACATGGCATGGATGAGCTCTACAAATAATGAATTCGTCGAGGCCG

TTCAGGCCTCGAGGCCGTTCAGGCTCGACCCGGGGATCCGCGG

ATCTCTTATGTCTTTACGATTTATAGTTTTCATTATCAAGTATGCCT

ATATTAGTATATAGCATCTTTAGATGACAGTGTTCGAAGTTTCACG

AATAAAAGATAATATTCTACTTTTTGCTCCCACCGCGTTTGCTAGC

ACGAGTGAACACCATCCCTCGCCTGTGAGTTGTACCCATTCCTCT

AAACTGTAGACATGGTAGCTTCAGCAGTGTTCGTTATGTACGGCA

TCCTCCAACAAACAGTCGGTTATAGTTTGTCCTGCTCCTCTGAAT

CGTCTCCCTCGATATTTCTCATTTTCCTTCGGCGCGTTCGCAGGC

GTCCGGGACGTTTGAGCAGAATAACCATGTGGTGATTAACAACGA

CGGCACGGGCGCGCCAATGCTTAGATCTTAAGGGGATATCCTCG

AGGTTCCCTTTAGTGAGGGTTAATTGCGAGCTTGGCGTAATCATG

GTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCA

CACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGC

CTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCC

CGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAAT

CGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCT

TCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCT

GCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTAT

CCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAA

GGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGG

CGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATC

GACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGA

TACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGT

TCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTC

GGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCA

GTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAA

CCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCG

TCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAG

CAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGT

GCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGA

AGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTC

GGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGC

TGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAG

AAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCT

GACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATG

AGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAAT

GAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGA

CAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTG

TCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATA

ACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAAT

GATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAA

TAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGC

AACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGC

TAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGC

CATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGG

CTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGAT

CCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGA

TCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTA

TGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGAT

GCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAAT

AGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGG

GATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATT

GGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCT

GTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATC

TTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACA

GGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAA

ATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTT

ATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTA

GAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGT

GCCACCTGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGT

GTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCT

AGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTT

CGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAG

GGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTG

ATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACG

GTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGA

CTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATT

CTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAA

AAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATA

TTAACGCTTACAATTTGCCATTCGCCATTCAGGCTGCGCAACTGT

TGGGAAGGGCGAT

SEQ ID NO: 51

CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGG

ATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCC

AGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACT

CACTATAGGGCGACCCTTAGGATCCTATGGCGCGCCGGCACCCT

TGCGGGCCATGTCATACACCGCCTTCAGAGCAGCCGGACCTATC

TGCCCGTTACGCGCCAGCTTGCAAATTAAAGCCTTCGAGCGTCC

CAAAACCTTCTCAAGCAAGGTTTTCAGTATAATGTTACATGCGTAC

ACGCGTCTGTACAGAAAAAAAAGAAAAATTTGAAATATAAATAACG

TTCTTAATACTAACATAACTATAAAAAAATAAATAGGGACCTAGAC

TTCAGGTTGTCTAACTCCTTCCTTTTCGGTTAGAGCGGATGTGGG

GGGAGGGCGTGAATGTAAGCGTGACATAACTAATTACATGATATC

GACAAAGGAAAAGGGGGACGGATCTCCGAGGCCTCGGACCCGT

CGGGCCGCCGTCGGACGTGCCGCGGTCAGGTGGCGAACTTCTT

AATACCTTGTTGCAAGATAGAGTCGAAAACGTCCATCTTTTTCTTT

TCCAAGGCAATACCAATTTCAACACCGTTAGAACCATCTCTAGATT

CAGAGAAGGCAATGGAACCACCAGTTTCAATATGAACGATTTCCA

TCTTGCATGGCTTACCCAAACCAAAATCCATATCGTACAAACCCA

ATTTTGGAGCACCAGCAATAGAGGTTGGGTAATGAGACATAACCC

ATTTTCTAACACCTTGACCCCATCTTGGAGCAGTTTTCAACAAATC

GGAGGACAACATATCCTTGATTCTAGCAGTAATAGCATCAGAAGC

AGCCAAAACGCACTTTTCACCCAACAAATCATGTTTTTTGACAGAG

ACTATACCTGGAGCCATACAGTTACCGAAGTAAGTTTGTGGAATA

GGTTGGGTGTACTTCAATCTGTTTCTACAGTCAACGTTAATCATCA

AGTGGAAAACTTCGTCCTTATCTTCTTCGTTAGCCTTAGTTTCAGA

ATCTTGGACCAAGGTCTTAATCAAGGAAACCCAGATAAAAGCCAA

GGTAACAACGAAGGTAGAAACTGGAGATTGATTTTCGGATTGTTC

GGTGACCCAAGACTTCAAGTTATCGATTTGCTTTCTGGACAAGGT

GAAAGTAGCTCTAACCATGTTTTCTGGAGTAACATGAGAAGAGTG

CTTGGCGGAATTTTGTGACCAAAATCTTTCCAAATGACCAGCACC

AACTTCACCTGGATCCTTGATCATGTTTCTGCAAGAATGAATTGG

CAAAGATGGCAACAAAACAGTAGCTGGATCTTTACCAGAAGATTT

GGTCAAGGACATCCAGTACTTCATGAAATGTGAGAAAGTAACACC

ATCAGCAACAACATGAGTAGCAGAGTTACCAATACAGATACCAGC

ACCTGGAAAAATAGTGACTTGCATAGCCATAATTGGTCTCATTTGA

ATACCTTCAGGTGAAACATGTGGTGGTGGCAATTTTGGCAAAACA

CCATGTAAAACGGAAATATCCTTTGGGGAATCGGACTTCAATTGA

TCGAAATCGGTTTCAGTAGATTCAGCAACGGTGAAAACCAAAGAG

TCTTGACCATCATTGTAATGCAAGTATGGTGGATCTGGTCTTGGT

GGAATAATCAACTTACCGGCGTATGGAAAAAAATGTTGCAAGGTA

ATAGACAAGGAGTGCTTCAAGTTTGGGACGAAATCTTGTAAGAAA

GATTCGGTGGAGTTTTGGTAGGAGAAGAAGAACAAAGAATCAGC

CAATGGTAAAGACAACCATGGGGCATCAAAAAAAGTCAATGGCAA

AGTAGTAGATGGAACAGTACCCTTTGGTGGAGAAATATGGCAGGT

TTCAATAATCTTTGGTGGTTGCAAGTGAGCAACCATTTTAAGCTTT

TTGTTTGTTTATGTGTGTTTATTCGAAACTAAGTTCTTGGTGTTTTA

AAACTAAAAAAAAGACTAACTATAAAAGTAGAATTTAAGAAGTTTA

AGAAATAGATTTACAGAATTACAATCAATACCTACCGTCTTTATAT

ACTTATTAGTCAAGTAGGGGAATAATTTCAGGGAACTGGTTTCAA

CCTTTTTTTTCAGCTTTTTCCAAATCAGAGAGAGCAGAAGGTAATA

GAAGGTGTAAGAAAATGAGATAGATACATGCGTGGGTCAATTGCC

TTGTGTCATCATTTACTCCAGGCAGGTTGCATCACTCCATTGAGG

TTGTGTCCGTTTTTTGCCTGTTTGTGCCCCTGTTCTCTGTAGTTGC

GCTAAGAGAATGGACCTATGAACTGATGGTTGGTGAAGAAAACAA

TATTTTGGTGCTGGGATTCTTTTTTTTTCTGGATGCCAGCTTAAAA

AGCGGGCTCCATTATATTTAGTGGATGCCAGGAATAAACTGTTCA

CCCAGACACCTACGATGTTATATATTCTGTGTAACCCGCCCCCTA

TTTTGGGCATGTACGGGTTACAGCAGAATTAAAAGGCTAATTTTTT

GACTAAATAAAGTTAGGAAAATCACTACTATTAATTATTTACGTATT

CTTTGAAATGGCAGTATTGATAATGATAAACTCGAACTGGGCGCG

TCGTGCCGTCGTTGTTAATCACCACATGGTTATTCTGCTCAAACG

TCCCGGACGCCTGCGAGGCGCGCCTATTGAAAGATCTTAAGGGG

ATATCCTCGAGGTTCCCTTTAGTGAGGGTTAATTGCGAGCTTGGC

GTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTC

ACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCC

TGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGC

TCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCA

TTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTG

GGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTC

GTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAAT

ACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGT

GAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGC

GTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATC

ACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGA

CTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCG

CTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTT

TCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTA

GGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGT

GTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGG

TAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCC

ACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATG

TAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGC

TACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCA

GTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAA

ACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATT

ACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCT

ACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGAT

TTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTA

AATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAAC

TTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTC

AGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGT

CGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCA

GTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGAT

TTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAG

TGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGC

CGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAAC

GTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTT

GGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTT

ACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGT

CCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTC

ATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCC

GTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTC

TGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTC

AATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCT

CATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTT

ACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAA

CTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGC

AAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGA

CACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGA

AGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAAT

GTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCC

GAAAAGTGCCACCTGACGCGCCCTGTAGCGGCGCATTAAGCGCG

GCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCA

GCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCG

CCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTC

CCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAA

AAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTG

ATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAAT

AGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCG

GTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATT

GGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAA

CAAAATATTAACGCTTACAATTTGCCATTCGCCATTCAGGCTGCG

CAACTGTTGGGAAGGGCGAT

SEQ ID NO: 52

CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGG

ATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCC

AGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACT

CACTATAGGGCGACCCTTAGGATCCTATGGCGCGCCGGCACCCT

TGCGGGCCATGTCATACACCGCCTTCAGAGCAGCCGGACCTATC

TGCCCGTTACGCGCCAGCTTGCAAATTAAAGCCTTCGAGCGTCC

CAAAACCTTCTCAAGCAAGGTTTTCAGTATAATGTTACATGCGTAC

ACGCGTCTGTACAGAAAAAAAAGAAAAATTTGAAATATAAATAACG

TTCTTAATACTAACATAACTATAAAAAAATAAATAGGGACCTAGAC

TTCAGGTTGTCTAACTCCTTCCTTTTCGGTTAGAGCGGATGTGGG

GGGAGGGCGTGAATGTAAGCGTGACATAACTAATTACATGATATC

GACAAAGGAAAAGGGGGACGGATCTCCGAGGCCTCGGACCCGT

CGGGCCGCCGTCGGACGTGCCGCGGTTAAGAAGCAATAGCGGA

TTCCAAACCGTCGTTAAAGATTTTACCAAAGGCTTCCATTTGCATG

GATGGGAAACAAACACCAATTTCAAAATCTTGGGCGGATTCTTTA

CAAGCTGACAAAGAAACAGAGGCGGAGTAGTCAATAGAAACAAC

TTCGTACTTCATAGCCTTACCCCAACCGAAATCAATATCGTAGAA

GTTCAACTTTGGAGTACCAGAAATACCCATCTTTCTAGCTGGAAT

CTTAAAACCATCGTACCATCTATCAGCGTATTCCAAAATACCACCC

TTCTTGTTAACCATCTTAGAGATACCTTCACCAATCAACTTAGCAG

CCATAACAAAACCGTTTTCACCCTTCAAGACACCGTTCTTAATAGT

GACAATACATGGAGCAGAACAGTTACCGAAGTAGTTTTCTGGTAA

TGGTGGATCTAATCTTGATCTGCAACCGACAGAAACGATGAATTG

TTCCAATTCATCTTCACCCTTTTTTTCACCCATGTTGACCAAGGAC

TTAACGATACAAGACCAAATGTAACCGCAGGTAACAGTGAAAGAA

GAAGTGTATTCCAACATTGGCAATTGAGTCAAGACTTGCTTCTTCA

AACCGGAAATATGAGTTCTGGCCAAAACGAAAGTAGCTCTAACTC

TATCAGATGAAGAACCAACCAAAGAAGGAGCTTGGTAGAAAGTAC

CCAATCTGGTTTGATTCAATCTGTTTTCGTATAATTGTGGGTTAAC

AACAACTCTATCGAAAACTGGTGGGGAACCATTTTTCAAGAATGG

TTGATCTTCACCAGTTTCACAAACAGAAGCCCAAGCCTTCAAAAA

ACCGAATCTAGTGTTAGCATCAGACAAAGAGTGATGGTTGGTCAA

ACCAATAGAAATACCGGAGTTTGGGAAGTAAGTAACTTGAACAGA

GAAAACTGGCAAGGTAACGTAATCAGATTCTTTTACAGCGTTACC

CAATGGTGGAACCAATGGATAGAAATTTTCGCACTTTCTTGGATG

GTTAGCAGACAAATCGTTGAAATCCAAGGTAGTTTCAGCGAAAGT

CAAAGCAACAGAATCACCTTCAACATGTCTGATTTCTGGCTTTCTG

GTAGAATCATGTGGATTTGGGTAAACGATCAACTTACCGACGAAT

GGAAAGTAATGTTGCAAGGTAATGGACAAGGAGTGCTTCAAATTT

GGGATAACAGTTTCGGTGAAATGGGACTTGGAGTATGGAAAATG

GTAGAAGTACAAGTGATGAACTGGTGGAAACAACAACCAGGCAAT

ATCGAAGAAAGTCAATGGCAATGATCTATGACCAATAGTAGATGG

TGGTGGAGAAATTCTAGAGTGTTCCAAGATGGTCAAGTTTGGGAT

GTTGTCCATTTTAAGCTTTTTGTTTGTTTATGTGTGTTTATTCGAAA

CTAAGTTCTTGGTGTTTTAAAACTAAAAAAAAGACTAACTATAAAA

GTAGAATTTAAGAAGTTTAAGAAATAGATTTACAGAATTACAATCA

ATACCTACCGTCTTTATATACTTATTAGTCAAGTAGGGGAATAATT

TCAGGGAACTGGTTTCAACCTTTTTTTTCAGCTTTTTCCAAATCAG

AGAGAGCAGAAGGTAATAGAAGGTGTAAGAAAATGAGATAGATAC

ATGCGTGGGTCAATTGCCTTGTGTCATCATTTACTCCAGGCAGGT

TGCATCACTCCATTGAGGTTGTGTCCGTTTTTTGCCTGTTTGTGC

CCCTGTTCTCTGTAGTTGCGCTAAGAGAATGGACCTATGAACTGA

TGGTTGGTGAAGAAAACAATATTTTGGTGCTGGGATTCTTTTTTTT

TCTGGATGCCAGCTTAAAAAGCGGGCTCCATTATATTTAGTGGAT

GCCAGGAATAAACTGTTCACCCAGACACCTACGATGTTATATATT

CTGTGTAACCCGCCCCCTATTTTGGGCATGTACGGGTTACAGCA

GAATTAAAAGGCTAATTTTTTGACTAAATAAAGTTAGGAAAATCAC

TACTATTAATTATTTACGTATTCTTTGAAATGGCAGTATTGATAATG

ATAAACTCGAACTGGGCGCGTCGTGCCGTCGTTGTTAATCACCAC

ATGGTTATTCTGCTCAAACGTCCCGGACGCCTGCGAGGCGCGCC

TATTGAAAGATCTTAAGGGGATATCCTCGAGGTTCCCTTTAGTGA

GGGTTAATTGCGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCT

GTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCC

GGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTA

ACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGG

AAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGG

GGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTC

ACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATC

AGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGG

ATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCC

AGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCT

CCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGA

GGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCC

CCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCT

TACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGC

TTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCG

TTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCC

GACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCC

GGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACA

GGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTG

AAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGT

ATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGT

AGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTT

TTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCA

AGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAA

CGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAG

GATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAA

TCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTT

AATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATC

CATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGA

GGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACC

CACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCC

GGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTC

CATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTC

GCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCAT

CGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCG

GTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCA

AAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTA

AGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATA

ATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGG

TGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACC

GAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCAC

ATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGG

GGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGA

TGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTT

CACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCG

CAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATAC

TCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTC

ATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAG

GGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGCGCCC

TGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCA

GCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTC

GCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGT

CAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCT

TTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCA

CGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGAC

GTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGG

AACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGG

ATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAAC

AAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTGC

CATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 53

CGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGG

ATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCC

AGTCACGACGTTGTAAAACGACGGCCAGTGAATTGTAATACGACT

CACTATAGGGCGACCCTTAAGATCTAAGTCTTAGGCGCGCCAAG

CTGGAGAAATATACCGCCCGTCAGGAAGAACTGAACAAGGCACT

GAAAGACGGGAAAACGCGTCCAGTATCCCAGCAGATACGGGATA

TCGACATTTCTGCACCATTCCGGCGGGTATAGGTTTTATTGATGG

CCTCATCCACACGCAGCAGCGTCTGTTCATCGTCGTGGCGGCCC

ATAATAATCTGCCGGTCAATCAGCCAGCTTTCCTCACCCGGCCCC

CATCCCCATACGCGCATTTCGTAGCGGTCCAGCTGGGAGTCGAT

ACCGGCGGTCAGGTAAGCCACACGGTCAGGAACGGGCGCTGAA

TAATGCTCTTTCCGCTCTGCCATCACTTCAGCATCCGGACGTTCG

CCAATTTTCGCCTCCCACGTCTCACCGAGCGTGGTGTTTACGAAG

GTTTTACGTTTTCCCGTATCCCCTTTCGTTTTCATCCAGTCTTTGA

CAATCTGCACCCAGGTGGTGAACGGGCTGTACGCTGTCCAGATG

TGAAAGGTCACACTGTCAGGTGGCTCAATCTCTTCACCGGATGAC

GAAAACCAGAGAATGCCATCACGGGTCCAGATCCCGGTCTTTTC

GCAGATATAACGGGCATCAGTAAAGTCCAGCTCCTGCTGGCGGA

TGACGCAGGCATTATGCTCGCAGAGATAAAACACGCTGGAGACG

CGTGGCGCATCCGCGTCAGGCGGTACAGCCATTCAGGCCGCTG

CGGCGAAATTCCATTTTGCAGGCGCGCCAATGCTTAGATCCTAAG

GGGATATCCTCGAGGTTCCCTTTAGTGAGGGTTAATTGCGAGCTT

GGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCC

GCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAA

AGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTT

GCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGC

TGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGT

ATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTC

GGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCG

GTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAA

CATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGG

CCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAG

CATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGAC

AGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGT

GCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCG

CCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCT

GTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGC

TGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATC

CGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATC

GCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGT

ATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTAC

GGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAG

CCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAA

CAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAG

ATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTT

CTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGG

ATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTT

TAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAA

ACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATC

TCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCC

GTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCC

CAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAG

ATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGA

AGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTT

GCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGC

AACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTC

GTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCG

AGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTT

CGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATC

ACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCC

ATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTC

ATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGG

CGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAG

TGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGA

TCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCAC

CCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTG

AGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGG

CGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTA

TTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTT

GAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTT

CCCCGAAAAGTGCCACCTGACGCGCCCTGTAGCGGCGCATTAAG

CGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTT

GCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTT

CTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGG

GCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCC

CAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCC

CTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTT

TAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTAT

CTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCC

TATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATT

TTAACAAAATATTAACGCTTACAATTTGCCATTCGCCATTCAGGCT

GCGCAACTGTTGGGAAGGGCGAT

SEQ ID NO: 54

MANPHPHFLIITFPAQGHINPALELAKRLIGVGADVTFATTIHAKSRLV

KNPTVDGLRFSTFSDGQEEGVKRGPNELPVFQRLASENLSELIMAS

ANEGRPISCLIYSILIPGAAELARSFNIPSAFLWIQPATVLDIYYYYFNG

FGDLIRSKSSDPSFSIELPGLPSLSRQDLPSFFVGSDQNQENHALAA

FQKHLEILEQEENPKVLVNTFDALEPEALRAVEKLKLTAVGPLVPSGF

SDGKDASDTPSGGDLSDGSRDYMEWLKSKPESTWYVSFGSISMF

SMQQMEEIARGLLESGRPFLWVIRAKENGEENKEEDKLSCQEELEK

QGMLIQWCSQ EVLSHPSLGCFVTHCGWNSSIESLASGVPMIAFPQ

WADQGTNTKLIKDVWKTGVRLMVNEEEIVTSDELRRCLELVMGDGE

KGQEMRKNAKKWKILAKEALKEGGSSHKNLKNFVDEVIQGY

SEQ ID NO: 55

MALRINELFVAAIIYIIVHIIISKLITTVRERGRRLPLPPGPTGWPVIGALP

LLGSMPHVALAK AKKYGPIMYLKVGTCGMWASTPNAAKAFLKTL

DINFSNRPPNAGATHLAYNAQDMVFAPYGPRWKLLRKLSNLHMLGG

KALENWANVRANELGHMLKSMFDASQDGECWIADVLTFAMAN IG

QVMLSKRVFVEKGVEVNEFKNMVVELMTVAGYFNIGDFIPKLAWMDI

QGIEKGMKNLHKKFDDLLTKMFDEHEATSNERKENPDFLDWMANR

DNSEGERLSTTNIKALLLNLFTAGTDTSSSVIEWALAEMMKNPKIFKK

AQQE DQVIGKNRRLIESDIPNLPYLRAICKETFRKHPSTPLNLPRVS

SEPCTVDGYYIPKNTRLSVNIWAIGRDPDVWENPLEFTPERFLSGKN

AKIEPRGNDFELIPFGAGRRICAGTRMGIWVEYILGTLVHSFDWKLP

NNVIDINMEESFGLALQKAVPLEAMVTPRLSLDVYRC

SEQ ID NO: 56

MTSALYASDLFKQLKSIMGTDSLSDDVVLVIATTSLALVAGFWLLWK

KTTADRSGELKPLMIPKSLMAKDEDDDLDLGSG TRVSIFFGTQTGT

AEGFAKALSEEIKARYEKAAVKVIDLDDYAADDDQYEEKLKKETLAFF

CVATYGDGEPTDNAARFYKWFTEENERDIKLQQLAYGVFALGNRQY

EHFNKIGIVLDEELCKKGAKRLIEVGLGDDDQSIEDDFNAWKESLWS

ELDKLLKDEDDKSVATPYTAVIPEYRWTHDPRFTTQKSMESNVANG

NTTIDIHHPCRVDVAVQKELHTHESDRSCIHLEFDISRTGITYETGDH

VGVYAENHVEIVEEAGKLLGHSLDLVFSIHADKEDGSPLESAVPPPF

PGPCTLGTGLARYADLLNPPRKSALVALAAYATEPSEAEKLKHLTSP

DGKDEYSQWIVASQRSLLEV AAFPSAKPPLGVFFAAIAPRLQPRYY

SISSSPRLAPSR^HVTSALVYGPTPTGRIHKGVCSTWMKNAVPAEKd

HECSGAPIFIRASNFKLPSNPSTPIVMVGPGTGLAPFRGFLQER AL

KEDGEELGSSLLFFGCRNRQMDFIYEDELNNFVDQGVISELIMAFSR

EGAQKEYVQHKMMEKAAQVWDLIKEEGYLYVCGDAKG ARDVHRT

LHTIVQEQEGVSSSEAEAIVKKLQTEGRYLRDVW

SEQ ID NO: 57

MVAHLQPPKIIETCHISPPKGTVPSTTLPLTFFDAPWLSLPLADSLFFF

SYQNSTESFLQDFVPNLKHSLSITLQHFFPYAGKLIIPPRPDPPYLHY

NDGQDSLVFTVAESTETDFDQLKSDSPKDISVLHGVLPKLPPPHVSP

EGIQMRPIMAMQVTIFPGAGICIGNSATHVVADGVTFSHFMKYWMSL

TKSSGKDPATVLLPSLPIHSCRNMIKDPGEVGAGHLERFWSQNSAK

HSSHVTPEN VRATFTLSRKQIDNLKSWVTEQSENQSPVSTFWTL

AFIWVSLIKTLVQDSETKANEEDKDEVFHLMINVDCRNRLKYTQPIPQ

TYFGNCMAPGIVSVKKHDLLGEKCVLAASDAITARIKDMLSSDLLKTA

PRWGQGVRKWVMSHYPTSIAGAPKLGLYDMDFGLGKPCKMEIVHIE

TGGSIAFSESRDGSNGVEIGIALEKKKMDVFDSILQQGIKKFAT

SEQ ID NO: 58

MDNIPNLTILEHSRISPPPSTIGHRSLPLTFFDIAWLLFPPVHHLYFYHF

PYSKSHFTETVIPNLKHSLSITLQHYFPFVGKLIVYPNPHDSTRKPEIR

HVEGDSVALTFAETTLDFNDLSANHPRKCENFYPLVPPLGNAVKESD

YVTLPVFSVQVTYFPNSGISIGLTNHHSLSDANTRFGFLKAWASVCE

TGEDQPFLKNGSPPVFDRVWNPQLYENRLNQTRLGTFYQAPSLVG

SSSDRVRATFVLARTHISGLKKQVLTQLPMLEYTSSFTVTCGYIWSCI

VKSLVNMGEKKGEDELEQFIVSVGCRSRLDPPLPENYFGNCSAPCIV

TIKNGVLKGENGFV AA LIGEGISKMVNKKGGILEYADRWYDGFKI

PARKMGISGTPKLNFYDIDFGWGKAMKYEVVSIDYSASVSLSACKES

AQDFEIGVCFPSMQMEAFGKIFNDGLESAIAS