

Title:
RECOMBINANT XYLANASES
Document Type and Number:
WIPO Patent Application WO/1993/025693
Kind Code:
A1
Abstract:
Recombinant xylanases are derived from anaerobic fungi, particularly Neocallimastix patriciarum. The enzymes are highly specific for xylans and have industrial value, particularly in the pulp and paper industries. Certain truncated forms of the enzymes, and enzymes encoded by truncated DNA sequences, are preferred for their high expression levels.

Inventors:
HAZLEWOOD GEOFFREY PETER (GB)
GILBERT HARRY JOHN (GB)
Application Number:
PCT/GB1993/001283
Publication Date:
December 23, 1993
Filing Date:
June 17, 1993
Assignee:
HAZLEWOOD GEOFFREY PETER (GB)
GILBERT HARRY JOHN (GB)
International Classes:
A23K1/165; A21D8/04; C12N1/15; C12N9/24; C12N15/09; C12N15/56; C12S3/08; D21C9/10; C12R1/645; (IPC1-7): C12N15/56; A21D8/04; A23K1/165; C12N9/24; D21C9/10
Domestic Patent References:
WO1991019782A1 (1991-12-26)
Foreign References:
EP0463706A1 (1992-01-02)
Other References:
EMBL Database entry Accession number X65526; 5 May 1992 GILBERT, H.J. ET AL.: 'Homologous catalytic domains in a rumen fungal xylanase: evidence for gene duplication and prokaryotic origin'
FEMS MICROBIOLOGY LETTERS vol. 77, no. 1, 1 January 1991, pages 107 - 112 PASCALE REYMOND ET AL. 'Molecular cloning of genes from the rumen anaerobic fungus Neocallimastix frontalis: expression during hydrolase induction' cited in the application
MOLECULAR MICROBIOLOGY vol. 6, no. 15, August 1992, pages 2065 - 2072 H.J. GILBERT ET AL. 'Homologous catalytic domains in a rumen fungal xylanase: evidence for gene duplication and prokaryotic origin'
Claims:
CLAIMS
1. A xylanase which has at least one catalytic domain which is substantially homologous with a xylanase of an anaerobic fungus and which is not a full length natural xylanase.
2. A xylanase as claimed in claim 1, wherein the or each catalytic domain is identical to a catalytic domain of a natural xylanase from an anaerobic fungus.
3. A xylanase as claimed in claim 1 or 2, wherein the anaerobic fungus is a rumen fungus.
4. A xylanase as claimed in claim 3, wherein the rumen fungus is of the genus Neocallimastix.
5. A xylanase as claimed in claim 4, wherein the fungus is Neocallimastix patriciarum.
6. A xylanase as claimed in any one of claims 1 to 5, which is derived from a xylanase having the structure (from the N-terminus to the C-terminus): CAT1-LINK1-CAT2-LINK2-CTR1-CTR2 wherein: CAT1 represents a first catalytic domain, CAT2 represents a second catalytic domain, LINK1 represents a first linker, LINK2 represents a second linker, CTR1 represents a first C-terminal repeat, and CTR2 represents a second C-terminal repeat.
7. A xylanase as claimed in claim 6, wherein CAT1 has a sequence which is identical or otherwise substantially homologous to the sequence: RLTVGNGQTQHKGVADGYSYEIWLDNTGGSGSMTLGSGATFKAEWNASVNRGNFLARRGLDFGSQKKATDYSYIGLDYTATYRQTGSASGNSRLCVYGWFQNRGVQGVPLVEYYIIEDWVDWVPDAQGRMVTIDGAQYKIFQMDHTGPTINGGSETFKQYFSVRQQKRTSGHITVSDHFKEWAKQGWGIGNLYEVALNAEGWQSSGIADVTKLDVYTTQKGSNPAP.
8. A xylanase as claimed in claim 6 or 7, wherein CAT2 has a sequence which is identical or otherwise substantially homologous to the sequence: KFTVGNGQNQHKGVNDGFSYEIWLDNTGGNGSMTLGSGATFKAEWNAAVNRGNFLARRGLDFGSQKKATDYDYIGLDYAATYKQTASASGNSRLCVYGWFQNRGLNGVPLVEYYIIEDWVDWVPDAQGKMVTIDGAQYKIFQMDHTGPTINGGSETFKQYFSVRQQKRTSGHITVSDHFKEWAKQGWGIGNLYEVALNAEGWQSSGVADVTLLDVYTTPKGSSPA.
9. A xylanase as claimed in claim 6, 7 or 8, wherein LINK1 has a sequence which is identical or otherwise substantially homologous to the sequence: TSTGTVPSSSAGGSTANGK.
10. A xylanase as claimed in any one of claims 6 to 9, wherein LINK2 has a sequence which is identical or otherwise substantially homologous to the sequence: TSAAPRTTTRTTTRTKSLPTNYNK.
11. A xylanase as claimed in any one of claims 6 to 10, wherein CTR1 has a sequence which is identical or otherwise substantially homologous to the sequence: CSARITAQGYKCCSDPNCVVYYTDEDGTWGVENNDWCGCG.
12. A xylanase as claimed in any one of claims 6 to 11, wherein CTR2 has a sequence which is identical or otherwise substantially homologous to the sequence: VEQCSSKITSQGYKCCSDPNCVVFYTDDDGKWGVENNDWCGCGF.
13. A xylanase as claimed in any one of claims 6 to 12 comprising a catalytic domain which is substantially homologous with at least one of CAT1 and CAT2 and is missing at least part of the amino acid sequence downstream (ie towards the C-terminus) of CAT2.
14. A xylanase as claimed in claim 13, wherein at least part of CTR2 is missing.
15. A xylanase as claimed in claim 13 or 14, wherein at least part of CTR1 is missing.
16. A xylanase as claimed in any one of claims 6 to 15, which has the structure: CAT1-LINK1-CAT2-LINK2-CTR1(truncated); CAT1-LINK1-CAT2-LINK2(truncated); LINK1(truncated)-CAT2-LINK2(truncated); CAT1-LINK1(truncated); CAT1(truncated); LINK1(truncated)-CAT2-LINK2-CTR1-CTR2; LINK1(truncated)-CAT2-LINK2-CTR1(truncated); or LINK1(truncated)-CAT2(truncated).
17. A xylanase as claimed in claim 15, which has the structure: LINK1(truncated)-CAT2-LINK2(truncated).
18. An isolated or recombinant DNA molecule encoding a xylanase which has a catalytic domain substantially homologous with a xylanase of an anaerobic fungus, provided that the DNA molecule does not comprise a full length copy of natural mRNA encoding the xylanase.
19. A DNA molecule as claimed in claim 18, wherein the absent portion, or one of the absent portions, of the DNA corresponds to the 3' and/or 5' untranslated region of the mRNA.
20. A DNA molecule as claimed in claim 18 or 19, which is derived from a DNA molecule having the following structure: 5'utr-sig-cat1-link1-cat2-link2-ctr1-ctr2-3'utr, wherein 5'utr represents a 5' untranslated region; sig encodes a signal peptide; cat1 encodes a first catalytic domain; link1 encodes a first linker sequence; cat2 encodes a second catalytic domain; link2 encodes a second linker sequence; ctr1 encodes a first C-terminal repeat; ctr2 encodes a second C-terminal repeat; and 3'utr represents a 3' untranslated region.
21. A DNA sequence as claimed in claim 20, wherein the 3'utr segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: TTTTATTATATCAATCTCTAATTTATTTTTTTAGGAAAAAAATAAAAAAATAAATATAAT AAATATTAGAGAGTAATATTTAAAAACAAAGAAATTTAAAAACGTTTATTTAGT ATTTT TTTTACTGGTTAAAAAAAAAATAAAAAACAAAATTAATAAAGATATTTTTGAAAAATAT GAATTAGAAAAAAA.
22. A DNA sequence as claimed in claim 20 or 21, wherein the sig segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: ATGAGAACTATTAAATTCTTTTTCGCAGTAGCTATTGCAACTGTTG CTAAGGCCCAATGGGGTGGAGGTGGTGCCTCTGCTGGTCAA.
23. A DNA sequence as claimed in claim 20, 21 or 22, wherein the cat1 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: AGATTAACCGTCGGTAATG GTCAAACCCAACATAAGGGTGTAGCTGATGGTTACAGTTATGAAATCTGGTTAGATAACA CCGGTGGTAGTGGTTCTATGACTCTCGGTAGTGGTGCAACCTTCAAGGCTGAATGGAATG CATCTGTTAACCGTGGTAACTTCCTTGCCCGTCGTGGTCTTGACTTCGGTTCTCAAAAGA AGGCAACCGATTACAGCTACATTGGATTGGATTATACTGCAACTTACAGACAAACTGGTA GCGCAAGTGGTAACTCCCGTCTCTGTGTATACGGTTGGTTCCAAAACCGTGGAGTTCAAG GTGTTCCATTGGTAGAATACTACATCATTGAAGATTGGGTTGACTGGGTTCCAGATGCAC AAGGTAGAATGGTAACCATTGATGGAGCTCAATATAAGATTTTCCAAATGGATCACACTG GTCCAACTATCAATGGTGGTAGTGAAACCTTTAAGCAATACTTCAGTGTCCGTCAACAAA AGAGAACTTCTGGTCATATTACTGTCTCAGATCACTTTAAGGAATGGGCCAAACAAGGTT GGGGTATTGGTAACCTTTATGAAGTTGCTTTGAACGCCGAAGGTTGGCAAAGTAGTGGTA TAGCTGATGTCACCAAGTTAGATGTTTACACAACCCAAAAAGGTTCTAATCCTGCCCCT.
24. A DNA sequence as claimed in any one of claims 20 to 23, wherein the link1 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: ACCTCCACTGGTACTGTTCCAAGCAGTTCTGCTGGTGGAAGTACTGCCAATGGTAAA.
25. A DNA sequence as claimed in any one of claims 20 to 24, wherein the cat2 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: AAGT TTACTGTCGGTAATGGACAAAACCAACATAAGGGTGTCAACGATGGTTTCAGTTATGAAA TCTGGTTAGATAACACTGGTGGTAACGGTTCTATGACTCTCGGTAGTGGTGCAACTTTCA AGGCTGAATGGAATGCAGCTGTTAACCGTGGTAACTTCCTTGCCCGTCGTGGTCTTGACT TCGGTTCTCAAAAGAAGGCAACCGATTACGACTACATTGGATTAGATTATGCTGCTACTT ACAAACAAACTGCCAGTGCAAGTGGTAACTCCCGTCTCTGTGTATACGGATGGTTCCAAA ACCGTGGACTTAATGGCGTTCCTTTAGTAGAATACTACATCATTGAAGATTGGGTTGACT GGGTTCCAGATGCACAAGGAAAAATGGTAACCATTGATGGAGCTCAATATAAGATTTTCC AAATGGATCACACTGGTCCAACTATCAATGGTGGTAGTGAAACCTTTAAGCAATACTTCA GTGTCCGTCAACAAAAGAGAACTTCTGGTCATATTACTGTCTCAGATCACTTTAAGGAAT GGGCCAAACAAGGTTGGGGTATTGGTAACCTTTATGAAGTTGCTTTGAACGCCGAAGGTT GGCAAAGTAGTGGTGTTGCTGATGTCACCTTATTAGATGTTTACACAACTCCAAAGGGTT CTAGTCCAGCC.
26. A DNA sequence as claimed in any one of claims 20 to 25, wherein the link2 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: ACCTCTGCCGCTCCTCGTACTACTACCCGTACTACTACTCGTACCAAGTCTCTTCCAACC AATTACAATAAG.
27. A DNA sequence as claimed in any one of claims 20 to 26, wherein the ctr1 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: TGTTCTGCTAGAATTACTGCTCAAGGTTACAAGTGTTGTAGCGATCCAAATTGTGTTGTT TACTACACTGATGAGGATGGTACCTGGGGTGTTGAAAACAACGACTGGTGTGGTTGTGGT.
28. A DNA sequence as claimed in any one of claims 20 to 27, wherein the ctr2 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: GTTGAACAATGTTCTTCCAAGATCACTTCTCAAGGTTACAAGTGTTGTAGCGATCCAAAT TGCGTTGTTTTCTACACTGATGACGATGGTAAATGGGGTGTTGAAAACAACGACTGGTGT GGTTGTGGTTTC.
29. A DNA sequence as claimed in any one of claims 20 to 28, wherein the 5'utr segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: TAAGCAGTAAAATACTAATTAATAA AAAATTAAAGAATTATGAAAAATTTAAATTTAAAAATTTAAAAGAATTATGAAAAATTTA AATTTAAAAATTTAAAAAAAACTAATTTAGTAAAAAATTAAAGAATTATTGAAAATTTTA AATGTAAAAATTTAAAAAATACAAATTTGTAAAAAAAAATGAAAGAATTATGAAAAATTA AAATGTAAAAGTTTAAAAAATACAAATTTGTAAGAAAAATAAAGAATTATAAAAAAAATA AAGAATTATGAAAAACCCAAATGTAAAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.
30. A DNA sequence as claimed in any one of claims 18 to 29 encoding a xylanase as claimed in any one of claims 1 to 17.
31. A DNA sequence as claimed in any one of claims 20 to 30 which comprises the following segments: 5'utr-sig-cat1-link1-cat2-link2-ctr1(truncated); 5'utr-sig-cat1-link1-cat2-link2(truncated); link1(truncated)-cat2-link2(truncated); 5'utr-sig-cat1-link1(truncated); 5'utr-sig-cat1(truncated); link1(truncated)-cat2-link2-ctr1-ctr2-5'utr; link1(truncated)-cat2-link2-ctr1(truncated); link1(truncated)-cat2(truncated).
32. A DNA molecule as claimed in any one of claims 18 to 31, which is in the form of a vector.
33. A DNA molecule as claimed in claim 32, wherein the vector is a plasmid.
34. A DNA molecule as claimed in claim 32 or 33, wherein the vector is an expression vector.
35. A DNA molecule which is, or comprises the insert of, plasmid pNX3, pNX4, pNX5, pNX6, pNX7, pNX8, pNX9 or pNX10, as defined herein.
36. A DNA molecule which is, or comprises the insert of, plasmid pNX5, pNX9 or pNX10, as defined herein.
37. A host cell transfected or transformed with a DNA molecule as claimed in any one of claims 18 to 36.
38. The use of a xylanase as claimed in any one of claims 1 to 17 in the modification of baked products.
39. The use of a xylanase as claimed in any one of claims 1 to 17 as an enzyme supplement for animal feed.
40. The use of a xylanase as claimed in any one of claims 1 to 17 as an impurity remover in pulp.
41. The use of a xylanase as claimed in any one of claims 1 to 17 in the prebleaching of kraft pulp.
42. A xylanase which has at least one catalytic domain which is substantially homologous with a xylanase of an anaerobic fungus.
43. An isolated or recombinant DNA molecule encoding a xylanase which has a catalytic domain substantially homologous with a xylanase of an anaerobic fungus, provided that if the DNA molecule is cDNA encoding a xylanase of Neocallimastix frontalis then the DNA molecule is operatively coupled to a promoter.
Description:
RECOMBINANT XYLANASES

This invention relates to recombinant xylanases derivable from an anaerobic fungus.

Xylan, a major component of plant hemicelluloses, consists of a polymer of 1,4-linked β-D-xylopyranose units substituted with mainly acetyl, arabinosyl and glucuronosyl residues. Hardwood xylan is typically O-acetyl-4-O-methylglucuronoxylan with approximately ten percent of xylose units α-1,2-linked to a 4-O-methylglucuronic acid side chain, and seventy percent of xylose residues acetylated at the C-2 or C-3 positions. Softwood xylans are commonly arabino-4-O-methyl-glucuronoxylans in which more than ten percent of xylose units are substituted with α-1,3-linked arabinofuranose residues. A repertoire of microbial enzymes acts co-operatively to convert xylan to its constituent simple sugars. These include endo-β-1,4-xylanases (EC 3.2.1.8), β-xylosidase (EC 3.2.1.37) and a series of enzymes which cleave side-chain sugars (glycosidases) or remove acetyl groups from the xylan backbone (Dekker, R.F.H., and Richards, G.N., Adv. Carbohydr. Chem. Biochem. 32: 277-352 (1976); Biely, Trends Biotechnol. 3: 286-290 (1985); Poutanen et al, "Accessory Enzymes Involved in the Hydrolysis of Xylans" In: Enzymes in Biomass Conversion. ACS Symposium Series 460. pp426-436. Ed. G.F. Letham. (1991)). Xylanolytic micro-organisms generally express isoenzymic forms of xylanases which are encoded by multiple genes (Hazlewood et al, FEMS Microbiol. Lett. 51: 231-236 (1988); Gilbert et al, J. Gen. Microbiol. 134: 3239-3247 (1988); Clarke et al, FEMS Microbiol. Lett. 83: 305-310 (1991)).

Some xylanases hydrolyse only xylan (Hall et al, Mol. Microbiol. 3: 1211-1219 (1989); Wong et al, Microbiol. Rev. 52: 305-317 (1988)). Many microorganisms that hydrolyse xylan also degrade cellulose. In view of the similarity of the bond cleaved (β-1,4-glycosidic linkages), and the cross-specificity sometimes observed between cellulases and xylanases, the phylogenetic relationship of these enzymes is an interesting question. Recently, sequence alignment and hydrophobic cluster analysis have been utilised to assign plant cell wall hydrolases to eight enzyme families (Henrissat et al, Gene 81: 83-95 (1989); Gilkes et al, Microbiol. Rev. 55: 303-315 (1991)). Xylanases showed no convincing sequence identity with cellulases, suggesting that the two enzyme species evolved from distinct ancestral genes.

Many plant cell wall hydrolases consist of two distinct domains: a catalytic domain (CD) linked by hydroxyamino acid/proline-rich linker sequences to a non-catalytic cellulose binding domain (CBD; Gilkes et al, Microbiol. Rev. 55: 303-315 (1991); Kellett et al, Biochem. J. 272: 369-376 (1990); Gilbert et al, Mol. Microbiol. 4: 759-767 (1990)). The precise role of the CBD is the subject of much debate; in aerobic fungal cellulases the CBD plays a critical role in the enzymes' hydrolysis of crystalline cellulose (Tomme et al, Eur. J. Biochem. 170: 575-581 (1988)). The role of this domain in prokaryotic cellulases and xylanases is less certain (Ferreira et al, Biochem. J. 269: 261-264 (1990)). In addition to their modular structure, cellulases often contain extended repeated sequences (Gilkes et al, Microbiol. Rev. 55: 303-315 (1991)). The precise role of these tandem repeats is largely unresolved.

Many cellulolytic and hemicellulolytic prokaryotes reside in the rumen of cows and sheep. Recently, anaerobic rumen fungi have also been shown to degrade both cellulose and xylan efficiently (Orpin and Letcher, Curr. Microbiol. 3: 121-124 (1979); Lowe et al, Appl. Environ. Microbiol. 53: 1216-1223 (1987)) and similar fungi reside in the alimentary tracts of large herbivores (Orpin and Joblin, "Anaerobic fungi". In: The Rumen Microbial Ecosystem, P.N. Hobson (Ed), pp129-150, Elsevier, London (1988)). The cellulase complex of the rumen fungus Neocallimastix frontalis has been characterised by Wood et al, Biochemistry and Genetics of Cellulose Degradation: FEMS Symp. 43: 31-52 (1988). The lower eukaryote synthesises a large multienzyme complex, of Mr 1-2 million, which rapidly hydrolyses crystalline cellulose. The complex contains substantial endoglucanase, and some β-glucosidase activity. The fungus also synthesises an Avicelase, presumably a cellobiohydrolase. Another rumen fungus, Neocallimastix patriciarum, produces extracellular enzymes which hydrolyse filter paper cellulose, AVICEL (a trade mark for microcrystalline cellulose) and xylan (Williams and Orpin, Can. J. Microbiol. 33: 418-426 (1987)). None of these enzymes has been characterised. Limited information on Neocallimastix genes encoding plant cell wall hydrolases has been described (Reymond et al, FEMS Microbiol. Lett. 77: 107-112 (1991)).

Xylans are found, in association with lignin, in the primary and secondary cell walls of most plants. The association between xylan and lignin is the key to the commercial potential of xylanases in, among other things, paper pulp processing. Sandoz Products Ltd in the USA have already conducted practical trials using a crude fungal xylanase to replace, at least partially, the amount of chlorine and chlorine-derived compounds normally used to bleach the objectionable brown lignin-derived residues in the treatment of wood pulp in the production of paper and other wood-derived products. The chlorine requirements of present day wood pulping plants are such that each plant may have its own chlorine dioxide production unit.

The advantages to the paper industry in avoiding the use of chlorine are clear: improvements in waste handling, operator safety and plant capital could be achieved if a suitable replacement for chlorine could be found. However, the paper industry is intensely competitive, and profit margins are slim, so any chlorine replacement must be capable of being produced reasonably economically and must also, of course, be sufficiently effective to persuade pulp and paper manufacturers of the benefits of its use.

The full length cDNA and protein sequence of a xylanase from Neocallimastix patriciarum were available from the EMBL databank in Heidelberg, Germany, as of 5 May 1992 under the accession number X65526. The xylanase was designated XYLA and the corresponding gene xynA.

It has now been found that modified xylanases derived from individual xylanases from anaerobic fungi, such as the XYLA enzyme from N. patriciarum, have properties which make them appropriate for industrial use, particularly in the manufacture of pulp and paper. It appears surprisingly that truncation can enhance the expression of the enzyme.

According to a first aspect of the present invention, there is provided a xylanase which has at least one catalytic domain which is substantially homologous with a xylanase of an anaerobic fungus and which is not a full length natural xylanase.

Preferred catalytic domains are identical to catalytic domains of natural xylanases from anaerobic fungi. However, for the purpose of the present invention, a first sequence is substantially homologous with a second sequence if, for example, it shares its biological activity and there is at least about 40% homology at the amino acid level; so a catalytic domain of a xylanase of this aspect of the invention has at least about 40% homology with a catalytic domain of a natural xylanase of an anaerobic fungus. In general, it may be preferred for there to be at least 50%, 60%, 70%, 80% or 90% homology (in increasing order of preference) between the two amino acid sequences being compared. Homology may alternatively or additionally be assessed at the nucleic acid level. DNA encoding a first amino acid sequence may be substantially homologous with and hybridise to DNA (which may be cDNA or genomic DNA) which encodes a second amino acid sequence or would so hybridise but for the degeneracy of the genetic code. Hybridisation conditions may be stringent, such as 65 °C in a salt solution of approximately 0.9 molar.
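As a rough illustration of the amino acid criterion above, the following Python sketch aligns two sequences globally and compares their percent identity with the 40% floor. It rests on assumptions that do not come from the specification: "homology" is taken to mean percent identity over a simple Needleman-Wunsch alignment, the scoring values (match +1, mismatch -1, gap -1) and all function names are illustrative, and the requirement for shared biological activity is not modelled.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned residues as a percentage of alignment columns."""
    al_a, al_b = global_align(a, b)
    if not al_a:
        return 0.0
    same = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * same / len(al_a)


def meets_homology_floor(a, b, floor=40.0):
    """Sequence part of the test only; activity must be checked separately."""
    return percent_identity(a, b) >= floor
```

For instance, `percent_identity("ACDE", "ACE")` aligns three identical residues over four columns and gives 75.0. The `floor` argument can be raised to 50, 60, 70, 80 or 90 to reflect the increasingly preferred levels.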

Examples of anaerobic fungi, which may be alimentary tract (particularly rumen) fungi, include: Neocallimastix spp., such as N. patriciarum, N. frontalis, N. hurleyensis and N. stanthorpensis; Sphaeromonas spp., such as S. communis; Caecomyces spp., such as C. equi; Piromyces spp., such as P. communis, P. equi, P. dumbonica, P. lethargicus and P. mai; Ruminomyces spp., such as R. elegans; Anaeromyces spp., such as A. mucronatus and Orpinomyces spp., such as O. bovis and O. joyonii.

Caecomyces equi, Piromyces equi, Piromyces dumbonica and Piromyces mai are found in horses and not in the rumen of cattle like the other fungi listed above.

Neocallimastix spp. are preferred, particularly N. patriciarum.

Xylanases in accordance with the invention may have a high specific activity. The specific activity may be significantly higher than that of bacterially derived xylanases and may for example be at least 1000, 2000, 3000, 4000, 4500, 5000 or even 5500 U/mg protein, in increasing order of preference. (A unit of xylanase activity is defined as the quantity of enzyme releasing 1 μmole of product, measured as xylose equivalents, in 1 minute at 37°C). More particularly, xylanases in accordance with this aspect of the invention may be significantly better expressed than natural XYLA is expressed by N. patriciarum; expression may be at least 10 fold improved or preferably at least 100 fold improved over the wild type enzyme.
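The unit definition above reduces to simple arithmetic, sketched below in Python. The function and parameter names are hypothetical, and the numbers in the example are made up purely to show the calculation; they are not measurements from the specification.

```python
def xylanase_units(xylose_equiv_umol, minutes):
    """Units of activity: micromoles of product (as xylose equivalents)
    released per minute, assuming the assay was run at 37 degrees C."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return xylose_equiv_umol / minutes


def specific_activity(xylose_equiv_umol, minutes, protein_mg):
    """Specific activity in U per mg of protein."""
    if protein_mg <= 0:
        raise ValueError("protein mass must be positive")
    return xylanase_units(xylose_equiv_umol, minutes) / protein_mg


def fold_improvement(recombinant_level, wild_type_level):
    """Ratio of recombinant to wild-type expression, in the same units."""
    if wild_type_level <= 0:
        raise ValueError("wild-type level must be positive")
    return recombinant_level / wild_type_level


if __name__ == "__main__":
    # Illustrative values only: 30 umol released in 10 min by 0.002 mg protein.
    print(f"{specific_activity(30.0, 10.0, 0.002):.0f} U/mg")
```

With these illustrative inputs the result is about 1500 U/mg, which would fall between the 1000 and 2000 U/mg levels listed above.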

Xylanases in accordance with the invention may have the ability to degrade xylan at high efficiency. At least 0.1, and preferably at least 0.5 or even 0.75 g reducing sugar may be produced per g xylan substrate.

Xylanases in accordance with the invention may have no significant residual activity against cellulose, in contrast to many known xylanases. This property is particularly useful in the application of the invention to the pulp and paper industry, as the enzyme can remove xylan and dissociate lignin from plant fibre without damaging cellulose fibre.

Xylanases in accordance with the invention may have at least two catalytic domains. The arrangement of the catalytic domains may be as in a wild type xylanase enzyme, or they may be arranged in an artificial configuration to increase or otherwise improve the xylanolytic activity of the enzyme.

A particularly preferred xylanase as a source of catalytic domains for use in the invention, is that derived from Neocallimastix patriciarum and designated XYLA; it has the following properties:

(i) a specific activity of 5980 U/mg protein for the purified enzyme when prepared by the following protocol:

Host cells (E. coli XL1-Blue harbouring a plasmid expressing the enzyme) are harvested by centrifugation and resuspended in 50 mM Tris-HCl buffer, pH 8.0, and the cytoplasmic fraction prepared as described by Clarke et al (FEMS Microbiol. Lett. 83: 305-310 (1991)). Xylanase, precipitated by the addition of ammonium sulphate (0.39 g/ml), is redissolved in 10 mM Tris-HCl buffer, pH 8.0. After dialysis against three changes of the same buffer, the xylanase is substantially purified by anion-exchange chromatography on DEAE-Triacryl M essentially as described by Hall et al (Mol. Microbiol. 3: 1211-1219 (1989)).

(ii) the ability to degrade xylan at high efficiency, releasing 0.9g of reducing sugar per g of the substrate;

(iii) no significant residual activity against cellulose (as determined by no detectable release of reducing sugar from carboxymethyl cellulose, barley β-glucan, laminarin or lichenan); and

(iv) two catalytic domains.

The structure of mature XYLA may be represented as follows (from the N-terminus to the C-terminus):

CAT1-LINK1-CAT2-LINK2-CTR1-CTR2 wherein:

CAT1 represents a first catalytic domain, having the sequence:

RLTVGNGQTQHKGVADGYSYEIWLDNTGGSGSMTLGSGATFKAEWNASVNRGNFLARRGLDFGSQKKATDYSYIGLDYTATYRQTGSASGNSRLCVYGWFQNRGVQGVPLVEYYIIEDWVDWVPDAQGRMVTIDGAQYKIFQMDHTGPTINGGSETFKQYFSVRQQKRTSGHITVSDHFKEWAKQGWGIGNLYEVALNAEGWQSSGIADVTKLDVYTTQKGSNPAP;

CAT2 represents a second catalytic domain having the sequence:

KFTVGNGQNQHKGVNDGFSYEIWLDNTGGNGSMTLGSGATFKAEWNAAVNRGNFLARRGLDFGSQKKATDYDYIGLDYAATYKQTASASGNSRLCVYGWFQNRGLNGVPLVEYYIIEDWVDWVPDAQGKMVTIDGAQYKIFQMDHTGPTINGGSETFKQYFSVRQQKRTSGHITVSDHFKEWAKQGWGIGNLYEVALNAEGWQSSGVADVTLLDVYTTPKGSSPA;

LINK1 represents a first linker having the sequence:

TSTGTVPSSSAGGSTANGK;

LINK2 represents a second linker having the sequence:

TSAAPRTTTRTTTRTKSLPTNYNK;

CTR1 represents a first C-terminal repeat having the sequence:

CSARITAQGYKCCSDPNCVVYYTDEDGTWGVENNDWCGCG; and

CTR2 represents a second C-terminal repeat having the sequence:

VEQCSSKITSQGYKCCSDPNCVVFYTDDDGKWGVENNDWCGCGF.

All these partial sequences can be seen in SEQ ID NO: 1 and SEQ ID NO: 2.

The structure of xylanases from other anaerobic fungi may be broadly similar, but of course the precise sequences of the components will generally be different, unless the source organism is very closely related to N. patriciarum. It may not be necessary for the entirety of the sequence of each region (particularly the catalytic domains) to be present for activity; in the present invention, although the entirety of a catalytic domain may be present, it is sufficient for the active portion of the catalytic domain to be present (that is to say, the catalytic domain must be functionally present).

The two catalytic domains can be seen to be very similar to each other but not identical. The difference between them gives an indication of the degree of homology to a natural sequence that is particularly preferred. The two C-terminal repeats can also be seen to be similar to each other (but less so than the two catalytic domains). The difference between them gives an indication of the degree of homology which is still highly preferred. The precise sequence of the two linker sequences may not be particularly important; all that is necessary is that the spatial arrangement of the catalytic domain(s) is such as to enable them to function effectively (and preferably optimally).

Preferred embodiments of the invention comprise a catalytic domain which is substantially homologous with at least one of CAT1 and CAT2 and are missing at least part of the amino acid sequence downstream (ie towards the C-terminus) of CAT2. At least part of CTR2 may be missing; alternatively or (preferably) additionally, at least part of CTR1 may be missing.

Particular embodiments of xylanases in accordance with the invention include those including (and preferably consisting essentially of) the following regions:

A. CAT1-LINK1-CAT2-LINK2-CTR1(truncated) (eg pNX3);

B. CAT1-LINK1-CAT2-LINK2(truncated) (eg pNX4);

C. LINK1(truncated)-CAT2-LINK2(truncated) (eg pNX5);

D. CAT1-LINK1(truncated) (eg pNX6);

E. CAT1(truncated) (eg pNX7);

F. LINK1(truncated)-CAT2-LINK2-CTR1-CTR2 (eg pNX8);

G. LINK1(truncated)-CAT2-LINK2-CTR1(truncated) (eg pNX9);

H. LINK1(truncated)-CAT2(truncated) (eg pNX10).

(The plasmid designations in brackets refer to plasmids in the examples whose expression products are the xylanases shown.) Signal sequences may initially be present but will preferably be absent in the final molecule. Structures C, F, G and H are preferred and structures C, G and H are particularly preferred.
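The domain layouts of structures A to H can be captured in a small lookup table, which makes it easy to compare the constructs. The Python sketch below is only one possible representation: the dictionary layout, the boolean "truncated" flag and the helper names are assumptions made for illustration, and the region names follow the CAT1-LINK1-CAT2-LINK2-CTR1-CTR2 notation used for mature XYLA.

```python
# Each construct is an ordered tuple of (region, truncated) pairs,
# keyed by the example plasmid named for structures A to H.
CONSTRUCTS = {
    "pNX3": (("CAT1", False), ("LINK1", False), ("CAT2", False),
             ("LINK2", False), ("CTR1", True)),
    "pNX4": (("CAT1", False), ("LINK1", False), ("CAT2", False),
             ("LINK2", True)),
    "pNX5": (("LINK1", True), ("CAT2", False), ("LINK2", True)),
    "pNX6": (("CAT1", False), ("LINK1", True)),
    "pNX7": (("CAT1", True),),
    "pNX8": (("LINK1", True), ("CAT2", False), ("LINK2", False),
             ("CTR1", False), ("CTR2", False)),
    "pNX9": (("LINK1", True), ("CAT2", False), ("LINK2", False),
             ("CTR1", True)),
    "pNX10": (("LINK1", True), ("CAT2", True)),
}

# Structures C, F, G and H, stated above to be preferred.
PREFERRED = ("pNX5", "pNX8", "pNX9", "pNX10")


def describe(plasmid):
    """Render a construct in the hyphenated notation used in the list."""
    return "-".join(f"{region}(truncated)" if truncated else region
                    for region, truncated in CONSTRUCTS[plasmid])


def regions(plasmid):
    """Names of the regions present, whether complete or truncated."""
    return [region for region, _ in CONSTRUCTS[plasmid]]


if __name__ == "__main__":
    for name in PREFERRED:
        print(name, describe(name))
```

Read this way, every preferred construct carries CAT2 (complete or truncated) and none carries CAT1, which can be checked with `all("CAT2" in regions(p) and "CAT1" not in regions(p) for p in PREFERRED)`.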

Enzymes in accordance with the invention may comprise a single CAT1 domain, a single CAT2 domain, or have two or more catalytic domains, each of which independently may be chosen from CAT1 and CAT2. It may be that substantially only catalytic domains are present; and as indicated above it may be that not all of the natural catalytic domain sequences are essential for adequate activity.

On the immature protein a signal peptide may be present; the sequence of the natural signal peptide is:

MRTIKFFFAVAIATVAKAQWGGGGASAGQ. This sequence again is shown in SEQ ID NO:1 and SEQ ID NO:2.

Xylanases in accordance with the invention may be prepared by any suitable means. While bulk fermentation of the source anaerobic fungus may be undertaken, and polypeptide synthesis by the techniques of organic chemistry may be attempted, the method of preparation of choice will generally involve recombinant DNA technology. A xylanase as described above will therefore for preference be the expression product of heterologous xylanase-encoding DNA in a host cell.

According to a second aspect of the invention, there is provided an isolated or recombinant DNA molecule encoding a xylanase which has a catalytic domain substantially homologous with a xylanase of an anaerobic fungus, provided that the DNA molecule does not comprise a full length copy of natural mRNA encoding the xylanase.

cDNA (apparently comprising a full length copy of mRNA) encoding a xylanase of Neocallimastix frontalis has been described by Reymond et al, FEMS Microbiol. Lett. 77: 107-112 (1991), but no expression was reported.

Although a full length copy of natural mRNA is not present in DNA in accordance with this aspect of the invention, it should be understood that the invention is not limited to truncated cDNAs. It is contemplated that some or all of the introns (if any) naturally present in the corresponding wild type gene may be present. However, at least some sequence that is present in the full length cDNA is absent in DNA in accordance with this aspect of the invention. It should also be understood that this aspect of the invention encompasses DNAs encoding full length xylanases; the absent portion of the DNA may be (and in some embodiments preferably is) in the 3' and/or 5' untranslated regions. Substantially full length or truncated xylanases may therefore be produced from DNA in accordance with this aspect of the invention which (a) is substantially missing the 3' untranslated region, or (b) is substantially missing the 5' untranslated region or (c) is substantially missing both the 3' and 5' untranslated regions.

A full length cDNA encoding a xylanase of an anaerobic fungus (taking the xynA gene of N. patriciarum as the prototype) may have the following structure:

5'utr-sig-cat1-link1-cat2-link2-ctr1-ctr2-3'utr, wherein 5'utr represents a 5' untranslated region; sig encodes a signal peptide; cat1 encodes a first catalytic domain; link1 encodes a first linker sequence; cat2 encodes a second catalytic domain; link2 encodes a second linker sequence; ctr1 encodes a first C-terminal repeat; ctr2 encodes a second C-terminal repeat; and 3'utr represents a 3' untranslated region.
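Options (a) to (c) in the preceding paragraph can be expressed against this segment order. The Python sketch below is a simplified model with hypothetical names: it treats a "substantially missing" untranslated region as wholly absent, which is an assumption made only to keep the example short.

```python
# Segment order of the full length cDNA, as given in the structure above.
SEGMENTS = ("5'utr", "sig", "cat1", "link1", "cat2",
            "link2", "ctr1", "ctr2", "3'utr")


def drop_segments(segments, to_drop):
    """Return the segment order with the named segments removed."""
    unknown = set(to_drop) - set(segments)
    if unknown:
        raise ValueError(f"unknown segment(s): {sorted(unknown)}")
    return tuple(s for s in segments if s not in to_drop)


def layout(segments):
    """Render a segment order in hyphenated form."""
    return "-".join(segments)


# (a) lacks the 3' UTR, (b) lacks the 5' UTR, (c) lacks both.
VARIANTS = {
    "a": drop_segments(SEGMENTS, {"3'utr"}),
    "b": drop_segments(SEGMENTS, {"5'utr"}),
    "c": drop_segments(SEGMENTS, {"5'utr", "3'utr"}),
}

if __name__ == "__main__":
    for key, segs in VARIANTS.items():
        print(f"({key}) {layout(segs)}")
```

All three variants keep the complete coding region from sig to ctr2, so each could still encode a full length xylanase.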

Genomic sequences may have one or more introns interspersed within the above structure. In the xynA gene encoding the XYLA enzyme of N. patriciarum, the various DNA segments have the following sequences:

3 'utr:

TTTTATTATATCAATCTCTAATTTAT TTTTTAGGAAAAAAATAAAAAAATAAATATAAT AAATATTAGAGAGTAATATTTAAAAACAAAGAAAT TAAAAACGTTTATTTAGT ATTTT TTTACTGGTTAAAAAAAAAATAAAAAAC^AAATTAATAAAGATA TTTTGAAAAATAT GAATTAGAAAAAA ;

ATGAGAACTATTAAATTCTTT TCGCAGTAGCTATTGCAAC G TG CTAAGGCCCAATGGGGTGGAGGTGGTGCCTCTGCTGGTCAA;

catl:

AGATTAACCGTCGGTAATG

GTO^ΛCCCAACΑTAAGGGTGTAGCTGATGGTTACAGTTATGAAATCTGGTTAGAT AACA CCGGTGGTAGTGGTTC ATGACTCTCGGTAGTGG GOACCT CAAGGCTGAATGGAATG C-ATCTGTTAACCGTGGTAACT CCTTGCCCGTCGTGGTC-TTGACTTCGGTTCTCAAAAGA AGGOACCGATTACAGC ACATTGGATTGGATTATACTGCAACTTAC-AGACAAACTGGTA

GCGCAAGTGGTAACTCCCGTCTCTGTGTATACGGTTGGTTCCAAAACCGTGGAGTTC AAG GTGTTCC-ATTGGTAGAATAC ACATCA TGAAGATTGGGTTGACrGGGTTCCAGATGC-AC AAGGTAGAATGG AACCATTGATGGAGCTC-AATATAAGATTT CCAAATGGATCACACTG GTCO CTATC-AATGGTGGTAGTGAAACC-TTTAAGaATAC TCAGTGTCCGTC-AACAAA AGAGAACTTCTGGTCATAT ACTGTCTCAGATC-ΛCTTTAAGG-AATGGGCCAAACAAGGTT

GGGGTATTGGTAACCTTTATGAAGTTGCTT GAACGCCGAAGGTTGGCAAAGTAGTGGTA TAGC GATGTCACC-^AGTTAGATGTTTACAa VCCCAAAAAGGTTCTAATCCTGCCCCT ;

linkl ACCTCC-ACTGGTACTGTTCCAACX^GTTCTGCTGGTGGAAGTACTGCCAATGGTAAA ;

cat2:

AAGT

TTACTGTCGGTAATGGACAAAACCAACATAAGGGTGTCAACGATGG TTCAGTTATGAAA TCTGGTTAGATAACACTGGTGGT-^CGGTTCTATGACTCTCGGTAGTGGTGCAACTTTCA

AGGCTGAATGGAATGCAGCTG AACCGTGGTAACTTCCTTGCCCGTCGTGGTCTTGACT TCGGTTCTOiAAAGAAGGCAACCGATTACGACTACATTGGATTAGATTATGCTGCTACTT AC-A--^CAAACTGCCAGTGCAAGTGGTAACTCCCGTCTCTGTGTATACGGATGGTTCC- AAA

ACCGTGGACTTAATGGCGTTCCT TAGTAGAATACTACATCATTGAAGAT GGGTTGACT GGGTTCCAGATGCAOVAGGAAAAATGGTAACCΛTTσATGGAGC C- TATAAGATrTTCC AAATGGATCACΑCTGGTCCAACTATC^TGGTGGTAGTGAAACCT^ GTG CCGTOΛC-AAAAGAGAAC TCTGGTCATATTACTGTCTCAGATCACTTTAAGGAAT GGGCCAAACAAGGTTGGGGTATTGGTAACC TTATGAAGTTGCTTTGAACGCCGAAGGTT

GGCAAAGTAGTGGTGTTGCTGATGTCACCTTAT AGATGTTTACACΛAC CCAAAGGGTT CTAGTCCAGCC;

link2: ACCTCTGCCG TCC CGTACTACTACCCGTACTACTAC CGTACCAAGTC C TCCAACC

AATTACAATAAG ;

ctr1:

TGTTCTGC AGAATTACTGC CAAGGTTACAAGTGTTGTAGCGATCCAAAT GTGT G T TACTACACTGATGAGGATGGTACCrGGGGTGT GAAAACAACGACTGGTGTGGTTGTGGT;

ctr2:

GT GAAαΛTGTTCTTCCAAGATCACTTCT.αAGGT ACAAGTGTTGTAGCGATCCAAAT TGCGTTGTTTTCTAC-ACTGATGACGATGGTAAATGGGGTGTTGAAAACAACGACTGGTG T GGTTGTGGTTTC; and

3'utr:

TAAGCAGTAAAATACTAATTAATAA AAAATTAAAGAAT ATGAAAAATTTAAAT T7UAAATTTAAAAGAATTATGAAAAA TTA AATTTAAAAATTTAAAAAAAACTAATTTAGTAAAAAATTAAAGAATTATTGAAAATTTTA

AATGTAAAAATTTAAAAAATACAAATTTGTAAAAAAAAATGAAAGAATTATGAAAAA TTA AAATGTAAAAGT AAAAAATAC^AATTTGTAAGAAAAATAAAGAATTATAAAAAAAATA AAGAATTATGAAAAACCCAAATGTAAAGAAAAΆAAAAAAAAAAAAAAAAAAAAAAAAA

(Note that the first three nucleotides of the 3'utr segment constitute a stop codon, which will generally be present.)
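The note above can be checked mechanically. The following is a minimal Python sketch, offered for illustration only: the function name `starts_with_stop_codon` is hypothetical and forms no part of the specification. It tests whether a DNA segment begins with one of the three standard stop codons, using the opening nucleotides of the untranslated segment listed immediately above the note and, for contrast, the opening nucleotides of the segment that begins with ATG.

```python
# Standard stop codons, written as DNA (sense strand).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def starts_with_stop_codon(segment: str) -> bool:
    """Return True if the first three nucleotides of `segment` form a stop codon."""
    # Drop whitespace, which the printed sequence listing uses only for layout.
    cleaned = "".join(segment.split()).upper()
    return cleaned[:3] in STOP_CODONS


# Opening nucleotides of the untranslated segment listed above the note.
print(starts_with_stop_codon("TAAGCAGTAAAATACTAATTAATAA"))  # True
# Opening nucleotides of the segment that begins with the ATG start codon.
print(starts_with_stop_codon("ATGAGAACTATTAAATTC"))  # False
```

Only the first three nucleotides are examined, so the check does not depend on the remainder of the printed sequence being free of transcription damage.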

The use of (less than the totality of) these DNA segments, or sequences substantially homologous with them, is preferred in this aspect of the invention. Preferred embodiments correspond generally to the preferred embodiments of the xylanases per se in accordance with the first aspect of the invention, but with the added considerations that (a) it may be preferred for a DNA sequence encoding a peptide signal sequence to be present and/or (b) it may be preferred for one or both of the untranslated regions to be truncated or absent. Particular embodiments of this aspect of the invention include those comprising (and preferably consisting essentially of, apart from vector-derived sequences) the following segments:

a. 5'utr-sig-cat1-link1-cat2-link2-ctr1(truncated) (eg pNX3);
b. 5'utr-sig-cat1-link1-cat2-link2(truncated) (eg pNX4);
c. link1(truncated)-cat2-link2(truncated) (eg pNX5);
d. 5'utr-sig-cat1-link1(truncated) (eg pNX6);
e. 5'utr-sig-cat1(truncated) (eg pNX7);
f. link1(truncated)-cat2-link2-ctr1-ctr2-3'utr (eg pNX8);
g. link1(truncated)-cat2-link2-ctr1(truncated) (eg pNX9);
h. link1(truncated)-cat2(truncated) (eg pNX10).

(The plasmid designations in brackets refer to plasmids in the examples including the DNA sequences shown.) Structures c, f, g and h are preferred and structures c, g and h are particularly preferred.
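The relationship between the full length structure and structures a to h can be represented as data. The following Python sketch is illustrative only: the names `FULL_LENGTH`, `CONSTRUCTS`, `segments` and `describe` are hypothetical, and the table encodes one reading of the list above rather than an authoritative record of the plasmids. Each structure is treated as a contiguous run of the full length segment order, with the truncated segments marked.

```python
# Segment order of the full length cDNA, as given in the structure above.
FULL_LENGTH = ["5'utr", "sig", "cat1", "link1", "cat2", "link2", "ctr1", "ctr2", "3'utr"]

# Structures a to h as (first segment, last segment, truncated segments).
# Assumption: this is one reading of the list above, for illustration only.
CONSTRUCTS = {
    "a": ("5'utr", "ctr1", {"ctr1"}),
    "b": ("5'utr", "link2", {"link2"}),
    "c": ("link1", "link2", {"link1", "link2"}),
    "d": ("5'utr", "link1", {"link1"}),
    "e": ("5'utr", "cat1", {"cat1"}),
    "f": ("link1", "3'utr", {"link1"}),
    "g": ("link1", "ctr1", {"link1", "ctr1"}),
    "h": ("link1", "cat2", {"link1", "cat2"}),
}

PREFERRED = ["c", "f", "g", "h"]


def segments(name: str) -> list[str]:
    """Expand a structure into its ordered list of segment labels."""
    first, last, truncated = CONSTRUCTS[name]
    start, end = FULL_LENGTH.index(first), FULL_LENGTH.index(last)
    return [
        f"{seg}(truncated)" if seg in truncated else seg
        for seg in FULL_LENGTH[start : end + 1]
    ]


def describe(name: str) -> str:
    """Join the segment labels in the hyphenated notation used in the text."""
    return "-".join(segments(name))


for name in CONSTRUCTS:
    print(name, describe(name))

# Every preferred structure starts within link1, so none carries 5'utr, sig or cat1.
print(all(segments(n)[0] == "link1(truncated)" for n in PREFERRED))  # True
```

On this reading, the preferred structures share the feature that they begin within the first linker and so omit the 5' untranslated region, the signal sequence and the first catalytic domain.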

Recombinant DNA in accordance with the invention may be in the form of a vector. The vector may for example be a plasmid, cosmid or phage. Vectors will frequently include one or more selectable markers to enable selection of cells transfected (or transformed: the terms are used interchangeably in this specification) with them and, preferably, to enable selection of cells harbouring vectors incorporating heterologous DNA. Appropriate start and stop signals will generally be present. Additionally, if the vector is intended for expression, sufficient regulatory sequences to drive expression will be present. Vectors not including regulatory sequences are useful as cloning vectors; and, of course, expression vectors may also be useful as cloning vectors.

Cloning vectors can be introduced into E. coli or another suitable host which facilitates their manipulation. According to another aspect of the invention, there is therefore provided a host cell transfected or transformed with DNA as described above.

DNA in accordance with the invention can be prepared by any convenient method involving coupling together successive nucleotides, and/or ligating oligo- and/or poly-nucleotides, including in vitro processes, but recombinant DNA technology forms the method of choice.

Xylanase-encoding DNA may be cloned from a DNA library, which may be prepared from one of the above fungi. The library may be genomic, but a cDNA library may be easier to prepare and work with, particularly if steps are taken to enhance the likelihood of the presence of xylanase-encoding cDNA in the cDNA library.

Cultivation of a chosen fungus, such as N. patriciarum, may proceed anaerobically in an appropriate culture medium containing rumen fluid; the sole or predominant carbon source may be xylan so as to promote xylanase expression and, hence, to cause an increase in the amount of xylanase-encoding RNA. However, cultivation in the presence of xylan is not essential, and the carbon source may instead be a cellulose, such as the microcrystalline cellulose sold under the trade mark AVICEL.

After cultivation of the fungus, total RNA may be extracted in any suitable manner. Fungal cells may be harvested by filtration and subsequently lysed in appropriate cell lysis buffer by mechanical disruption. A suitable RNA preserving compound, such as guanidinium thiocyanate, may also be added to the fungal cells to reduce or prevent RNase-mediated digestion. Total RNA may subsequently be isolated from the resulting homogenate by any suitable technique, such as by ultracentrifugation through a CsCl cushion or as described in Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989).

Another method for preparation of total fungal RNA, in addition to that described above, may be based on or adapted from the procedure described in Puissant and Houdebine, Bio-Techniques 148-149 (1990). In this method, total fungal RNA can be isolated from the above homogenate by extraction with phenol/chloroform at pH 4 to remove DNA and associated protein. The resulting crude RNA may be further purified by washing with lithium chloride-urea solution.

A suitable further technique for fungal RNA extraction is that of Teeri et al. (Anal. Biochem. 164 60-67 (1987)).

Once total RNA has been extracted, by whichever method, poly-A+ mRNA may then be isolated from the total RNA, for example by affinity chromatography on a compound containing multiple thymidine or uracil residues, to which the poly-A tail of the mRNA can bind. Examples of suitable compounds include oligo-dT cellulose and poly-U SEPHADEX™. Poly-A+ mRNA can then be eluted by a suitable buffer.
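The selection principle described above, namely retention of molecules carrying a poly-A tail, can be illustrated by a loose computational analogy. The following Python sketch is not a model of the chromatography itself: the function names, the example sequences and the minimum tail length of 10 are all hypothetical, and real binding to oligo-dT depends on hybridisation conditions rather than on a fixed cut-off.

```python
def poly_a_tail_length(sequence: str) -> int:
    """Count the consecutive A residues at the 3' end of a sequence."""
    # Remove layout whitespace and normalise case before counting.
    cleaned = "".join(sequence.split()).upper()
    return len(cleaned) - len(cleaned.rstrip("A"))


def select_poly_a(transcripts: list[str], min_tail: int = 10) -> list[str]:
    """Keep only transcripts whose 3' run of A residues is at least `min_tail` long."""
    return [t for t in transcripts if poly_a_tail_length(t) >= min_tail]


# Hypothetical transcripts: the first carries a 12-residue tail, the second does not.
transcripts = ["GGCUUCGA" + "A" * 12, "GGCUUCGAAAC"]
for t in select_poly_a(transcripts):
    print(t, poly_a_tail_length(t))
```

The tail length is measured only at the 3' end, mirroring the fact that it is the poly-A tail, and not internal A-rich stretches, that is relied upon for isolation.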

A cDNA expression library may then be constructed using a standard technique based on conversion of the poly-A+ mRNA to cDNA by reverse transcriptase. While it is possible to construct a genomic library, a cDNA library is preferred because it avoids any difficulties which may be caused by the presence of introns in the fungal genomic DNA. The first strand of cDNA may be synthesised using reverse transcriptase and the second strand may be synthesised using any suitable DNA-directed DNA polymerase such as Escherichia coli DNA polymerase I (E. coli pol I).

The cDNA may subsequently be fractionated to a suitable size and may be ligated to a suitable vector which is preferably a phage vector such as λZAP, λZAPII or λgt11. Suitable kits for the purpose are available from Stratagene. Further or alternative guidance may be had from Reymond et al (FEMS Microbiol. Lett. 77 107-112 (1991)) which details the preparation of a cDNA library from N. frontalis.

The resulting cDNA library may then be amplified after packaging in vitro, using any suitable host bacterial cell such as an appropriate strain of E. coli.

The screening of xylanase positive recombinant clones may be carried out by any suitable technique, which may be based on hydrolysis of xylan. In this procedure the clones may be grown on culture media incorporating xylan, and hydrolysis may be detected by the presence of xylanase-positive plaques, suitably with the aid of a colour indicator. Methods for selecting xylanase+ clones are described in the literature. Two examples are Clarke et al. (FEMS Microbiol. Lett. 83 305-310 (1991)) and Teather and Wood (Appl. Environ. Microbiol. 43 777-800 (1982)).

Xylanase positive recombinant clones may then be purified (that is to say a plaque may be converted to a bacterial colony) by well established procedures. Suitable techniques can be found in Sambrook et al (1989) (loc. cit.), but it would be usual simply to follow the manufacturer's instructions in whichever kit was being used. The cDNA insert in the clones may then be excised into a vector of choice, such as pBLUESCRIPT™ to name only one example. Other suitable plasmids can be used for subcloning; examples include the pUC plasmids and plasmids derived from them, as described in Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989). Expression vectors (particularly plasmids) in which the xylanase-encoding DNA is under the control of an appropriate promoter may also be formed by ligation and transformed or transfected into a suitable expression host. Examples of suitable expression vectors include the pUC series (which have the lacZp promoter), the pMTL series (which also have the lacZp promoter) and pBLUESCRIPT (which has both the lacZp promoter and the T7 promoter).

The nature of the promoter is not in general believed to be particularly critical and will depend on the expression host and the conditions under which expression is desired. As indicated above, a suitable example for a bacterial expression host such as E. coli is the lacZ promoter. Alternative promoters for bacterial hosts include the bacteriophage T7 promoter.

It may not be necessary to purify recombinant xylanases from their expression hosts. While E. coli as a host cell may be suitable for application of the xylanase of the invention in pulp manufacture, it will be appreciated that other host cells could be used such as gram positive bacteria inclusive of Bacillus subtilis, or lactic acid bacteria. Alternatively a eukaryotic expression host may be used; an example would be yeast (such as Saccharomyces cerevisiae).

Host cells expressing xylanases as described above and/or harbouring DNA sequences as described above (whether for expression or otherwise) themselves constitute a further aspect of the invention. Also included in the invention are methods of preparing a host cell, in which xylanase-encoding DNA is transformed or transfected into a cell, and methods of producing a xylanase, in which expression hosts are cultivated to express xylanase-encoding DNA.

Depending on the nature of the host cell, it may be preferred for recombinant DNA in accordance with the invention to include a signal sequence. Either a host-specific signal sequence may be included or, for expression in eukaryotes, the enzyme's own signal sequence may be used. A translational start site adapted for or preferred by the expression host may be provided; however, the protein's own translational start site may be adequate or even in some circumstances preferred.

Recombinant xylanase enzyme from an expression host may then be characterised. Principal features that have been ascertained for certain embodiments of the invention are as follows:

(i) the cloned xylanase has a very high specific activity (5980 U/mg protein of the purified enzyme); this is in contrast to many cloned xylanases from bacteria which have been reported so far;

(ii) the enzyme is able to degrade xylan at extraordinarily high efficiency, releasing 0.9g of reducing sugar per g of the substrate;

(iii) the enzyme has no residual activity against cellulose, while many other xylanases possess some cellulase activity; and

(iv) the enzyme contains two catalytic domains, which may have potential for construction of a highly efficient xylanase-producing clone by further genetic manipulation of the xylanase cDNA.

The high specific activity of the full length cloned xylanase (hereinafter referred to as xylanase A) (5980 U/mg protein of the purified enzyme) is an intrinsic property of this fungal xylanase. However, the expression level of the present construct of xynA cDNA in pBluescript vector (pNX1) is relatively low in E. coli, accounting for 0.3% of soluble protein synthesised by E. coli cells. Generally speaking, expression of a cloned gene at a level of >10% of total cellular E. coli protein is attainable.

Truncated forms of xynA cDNA may be prepared by the use of restriction enzymes. Some truncated forms, including that in the plasmid designated pNX5, produce several hundred-fold higher xylanase activity than pNX1. One explanation for this observation is that it is a result of the utilisation of the LacZ translation initiation sequence for the synthesis of the truncated xylanase A. Another explanation is that avoidance of AT-rich regions may result in higher expression levels; a theory is that the mRNA degrading activity of RNase E is the rate limiting step in protein synthesis, and that RNase E has a preference for AT-rich regions of mRNA. It is possible to further increase the expression level in E. coli by using a stronger promoter, such as the bacteriophage T7 promoter.
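The AT-richness referred to above can be quantified in a simple way. The following Python sketch is illustrative only: the function names `at_fraction` and `at_rich_windows`, the window length of 20 and the threshold of 0.8 are hypothetical choices, not values taken from the specification. The two example strings are short runs copied from the sequence listing earlier in this description; because the printed listing contains transcription damage, any figures obtained are indicative at best.

```python
def at_fraction(sequence: str) -> float:
    """Return the fraction of A and T among the recognisable bases of a DNA string."""
    # Ignore spaces and any characters that are not A, C, G or T.
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T characters")
    return sum(b in "AT" for b in bases) / len(bases)


def at_rich_windows(sequence: str, window: int = 20, threshold: float = 0.8):
    """List (start, fraction) for each sliding window whose AT fraction meets the threshold."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    hits = []
    for start in range(len(bases) - window + 1):
        fraction = sum(b in "AT" for b in bases[start : start + window]) / window
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits


# A run from the start of the first untranslated segment in the listing.
upstream = "TTTTATTATATCAATCTCTAATTTAT"
# A run from the end of the segment that begins with ATG.
coding = "GGTGGAGGTGGTGCCTCTGCTGGTCAA"

print(round(at_fraction(upstream), 2), round(at_fraction(coding), 2))
print(at_fraction(upstream) > at_fraction(coding))  # True
```

A comparison of this kind does not test the RNase E theory; it merely shows how the AT content of a segment retained in, or omitted from, a truncated construct might be tabulated.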

Recombinant xylanase A (XYLA), purified from Escherichia coli harbouring xynA, had an Mr of 53000 and hydrolysed oat spelt xylan to xylobiose and xylose. The enzyme did not hydrolyse any cellulosic substrates. The nucleotide sequence of xynA revealed a single open reading frame of 1821 bp coding for a protein of Mr 66192. The predicted primary structure of XYLA comprised an N-terminal signal peptide followed by a 225 amino acid repeated sequence, which was separated from a tandem 40 residue C-terminal repeat by a threonine/proline linker sequence. The large N-terminal reiterated regions consisted of distinct catalytic domains which displayed similar substrate specificities to the full length enzyme.
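The figures in the preceding paragraph can be related to one another by simple arithmetic. The following Python sketch is illustrative only; the function names are hypothetical. It assumes nothing about whether the stated 1821 bp includes the stop codon, so the codon count is an upper bound on the number of residues, and it offers no explanation for the difference between the predicted and observed Mr values, none being given in this passage.

```python
ORF_LENGTH_BP = 1821   # length of the open reading frame, from the text
PREDICTED_MR = 66192   # Mr predicted from the nucleotide sequence, from the text
PURIFIED_MR = 53000    # Mr of the purified recombinant enzyme, from the text


def codons_in_orf(length_bp: int) -> int:
    """Return the number of whole codons in an open reading frame."""
    codons, remainder = divmod(length_bp, 3)
    if remainder:
        raise ValueError("an open reading frame must be a multiple of 3 bp")
    return codons


def mean_residue_mass(mr: float, residues: int) -> float:
    """Return the average mass per residue implied by an Mr and a residue count."""
    return mr / residues


codons = codons_in_orf(ORF_LENGTH_BP)
print(codons)                                             # 607
print(round(mean_residue_mass(PREDICTED_MR, codons), 1))  # average mass per codon
print(PREDICTED_MR - PURIFIED_MR)                         # 13192
```

The average of roughly 109 per residue is in the range usually found for proteins, which is consistent with the stated open reading frame length and predicted Mr.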

Xylanases in accordance with the invention have a number of applications in the food, feed, and pulp and paper industries. The use of xylanases described herein in these industries is included within the scope of the invention.

Dealing first with the food industry, certain properties of dough and its resultant baked products are dependent on the pentosan and starch content of the flour used. These properties include the texture, volume and staling of bread. The use of xylanase could modify baked products to provide goods of potential commercial value. Among the properties that can be modified by xylanase treatment is the specific volume of bread. The increase in specific volume is enhanced further when amylase is added in combination with xylanase. One of the factors contributing to this effect is the water-binding capacity of carbohydrates. The invention provides dough including a xylanase as described herein.

In the animal feed industry, the use of enzyme supplementation to improve feed for chicks was reported as early as 1957. More recent results suggest that, in certain grains such as wheat, and particularly rye, it is the pentosans in the endosperm that are mainly responsible for poor nutrient uptake and sticky droppings from the chicks. Both problems appear to result from the high viscosity of the undigested pentosans. This hampers the diffusion of nutrients and binds water to make excreta watery. The problems can be alleviated using xylanase preparations. Xylanase action can improve both the weight gain of chicks and their feed conversion efficiency. It appears that xylanase supplementation could be used to improve the nutritional value of rye, so as to promote the use of this grain in chick feed. The effectiveness of this treatment may be dependent on the variety of rye. The invention provides the use of xylanase in chick feed and grain for these purposes.

In the pulp industry, dissolving pulps are purified celluloses used for making viscose rayons, cellulose esters and cellulose ethers. They are derived from prehydrolysed kraft pulps or acid sulphite pulps. Their processing is characterised by the derivatisation of the cellulose at one stage, the derivative being soluble in common solvents and thus permitting the formation of fibres, films and plastics. Impurities in the cellulose hamper derivatisation and thus lead to insolubles that block orifices in sprayers or form defects in the final product. Furthermore, certain xylan impurities can lead to colour, haze and thermal instability in acetate products. Xylanases may thus have a role to play in removing impurities, and the use of xylanases described herein for this purpose is comprehended within the invention.

The prebleaching of kraft pulp using cellulase-free xylanase has been identified as one of the biotechnologies most likely to be accepted in the pulp and paper industry in the near future, but only if suitable xylanases become available. The kraft (also known as alkaline or sulphate) process has become the predominant pulping technology in Canada because it produces strong wood fibres and because the chemicals used are recovered and recycled. Kraft pulps, particularly those derived from softwoods, are relatively difficult to bleach. A sequence of stages using elemental chlorine and chlorine-containing compounds is traditionally required to bleach these pulps effectively to the desired full brightness of ~90%. The bleaching process, particularly when using elemental chlorine, produces chloro-organics that have traditionally been discharged from the bleach plant with the waste water. However, both public demand and legislated regulations are presently pressurizing pulp mills to reduce or eliminate the emission of these pollutants. The pulp and paper industry is considering the implementation of various alternative technologies in order to reduce the environmental impact of its mills. These options include xylanase prebleaching of kraft pulp. Xylanases in accordance with the present invention are particularly well suited to this purpose.

It is believed that the xylanases of the present invention are particularly applicable to the paper and pulp industry. While it is appreciated that the use of enzymes will never replace chemicals completely, there is pressure being exerted by those concerned with the environment to reduce the use of chemicals. There are also practical reasons for reducing the use of chemicals in the paper and pulp industry.

Pulping plants usually generate their own supplies of chlorine and chlorine dioxide on site, and this can limit capacity as well as being potentially hazardous. Treating the paper pulp (eg kraft pulp) to remove lignin involves the use of chlorine, NaOH, H2O2 and chlorine dioxide. Sandoz in the USA have conducted practical trials using their CARTAZYME product, which is a fungal xylanase (crude), active at 30-55°C, pH 3 to 5, and contains 2 xylanases, and have found that a 25-33% reduction in chlorine is possible using 1 U xylanase/g pulp. Also the product is brighter than when chemicals alone are used. Another advantage of the xylanase is that it is specific, whereas chemicals can attack the cellulose at low lignin contents, leading to reduced fibre strength and other undesirable physical characteristics. It is therefore clear that xylanases could become more important in pulp bleaching, and recombinant ones particularly so because of their specificity and high yield. It is believed that lignin is bonded to hemicellulose, and if the hemicellulose (xylan) is depolymerised the lignin may be partially disassociated from cellulose and subsequently washed out. At present, however, some chemical treatment may still be necessary. The main points about the xylanase of the present invention, with respect to commercial use, are that (i) its very high specific activity and high level of expression would make it economical to produce on a large scale and (ii) its lack of cellulase activity makes it particularly useful where it is necessary to remove xylan specifically, as in the paper making and textile industries.
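The economic point made above can be illustrated by a rough calculation combining two figures from this description: the dose of 1 U xylanase per g of pulp reported for the trials, and the specific activity of 5980 U/mg stated for the purified enzyme of the invention. The following Python sketch is illustrative only. It assumes, without support from the text, that the activity units in the two contexts are directly comparable; the function names and the one tonne batch size are hypothetical.

```python
def enzyme_mass_mg(pulp_mass_g: float, dose_u_per_g: float,
                   specific_activity_u_per_mg: float) -> float:
    """Return the mass of enzyme (mg) needed to dose a given mass of pulp."""
    if specific_activity_u_per_mg <= 0:
        raise ValueError("specific activity must be positive")
    return pulp_mass_g * dose_u_per_g / specific_activity_u_per_mg


def chlorine_remaining(baseline: float, reduction_fraction: float) -> float:
    """Return the chlorine requirement left after a fractional reduction."""
    return baseline * (1.0 - reduction_fraction)


# One tonne of pulp (1,000,000 g) at 1 U/g, using the stated 5980 U/mg.
# Assumption: the units of activity in the two passages are comparable.
mass = enzyme_mass_mg(1_000_000, 1.0, 5980.0)
print(f"{mass:.0f} mg of purified enzyme per tonne of pulp")

# Chlorine requirement relative to a baseline of 100, across the reported range.
for reduction in (0.25, 0.33):
    print(reduction, chlorine_remaining(100.0, reduction))
```

On these assumptions, the quantity of purified enzyme required per tonne of pulp is of the order of a few hundred milligrams, which gives a sense of why a high specific activity bears on the cost of large scale use.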

It is also believed that the xylanase of the invention could find a valuable application in the sugar industry and in relation to the treatment of bagasse or other products containing xylan for more efficient disposal.

It was previously mentioned that the protein sequence of XYLA and the DNA sequence of xynA were made available on 5 May 1992 on the EMBL database under accession number X65526. This availability may not constitute effective prior art in the jurisdictions of all of the states designated in this application. For those jurisdictions where the EMBL database entry does not constitute effective prior art, notice is hereby given that the invention is and will be defined more broadly than as indicated above. In particular, the invention may then be seen to reside in the following further aspects:

a xylanase which has at least one catalytic domain which is substantially homologous with a xylanase of an anaerobic fungus; the xylanase may be a full length natural xylanase of an anaerobic fungus; and

an isolated or recombinant DNA molecule encoding a xylanase which has a catalytic domain substantially homologous with a xylanase of an anaerobic fungus, provided that if the DNA molecule is cDNA encoding a xylanase of Neocallimastix frontalis then the DNA molecule is operatively coupled to a promoter; the DNA molecule may comprise a full length copy of natural mRNA encoding the xylanase.

It will be apparent from the foregoing that the invention includes within its scope not only the recombinant xylanase described above but also xylanases derived from other anaerobic fungi as described above which may be prepared by the methods described herein. The invention also includes within its scope any mutant derived from N. patriciarum or strains derived from N. patriciarum by selection or gene transfer.

The invention also includes within its scope:

(i) DNA sequences derived from pNX1, pNX4, pNX5, pNX6, pNX8, pNX9 and pNX10 and DNA sequences capable of hybridising thereto;

(ii) a DNA construct containing a DNA sequence as in (i) operably linked to regulatory regions capable of directing the expression or over-expression of a polypeptide having xylanase activity in a suitable expression host;

(iii) a transformed microbial host capable of the expression or over-expression of a fungal xylanase containing an expression construct as in (ii);

(iv) a polypeptide having xylanase activity produced by expression using a microbial host as in (iii);

(v) the amino acid sequence as shown in Figure 4 including components A, B, C and D and amino acid sequences derived from this xylanase; and

(vi) plasmids described in Figure 1.

The invention also includes within its scope a method of preparation of a xylanase from E. coli harbouring the recombinant plasmids as shown in Figure 1.

Each preferred feature described above with reference to one aspect of the invention is equally preferred, mutatis mutandis, for each other aspect.

The invention will now be illustrated by the following examples. The examples refer to the accompanying drawings, in which:

FIGURE 1 is a restriction map of recombinant plasmids containing xynA. The positions of the cleavage sites of EcoRI (R), SstI (S), ScaI (Sc), HpaI (Hp), KpnI (K), XhoI (X), SmaI (Sm), PvuII (Pv), NaeI (Na), NruI (Nr), StuI (St) and HindIII (H) are shown. Restriction sites of multiple cloning regions of vectors shown in parentheses have been destroyed. Multiple cloning regions of vectors, designated by *, are derived from pSK (S), pMTL20 (20) and pMTL22 (2) respectively. The solid line with an arrow shows the extent and orientation of the xynA open reading frame. Construction of the deletion mutants of xynA is detailed below. The phenotypes of E. coli strains harbouring the recombinant plasmids are shown.

FIGURES 2A and 2B show the purification of XYLA. SDS/PAGE of XYLA purified from cell-free extract of E. coli XL1-Blue harbouring pNX1 (A) or pNX5 (B). Lane 1 contained XYLA purified by anion exchange chromatography, lane 2 contained cell-free extract from E. coli harbouring pNX1 or pNX5 and lane 3 (B only) contained cell-free extract from E. coli containing pBluescript SK. Gels depicted in A and B contained 10% (w/v) or 15% (w/v) polyacrylamide, respectively. Protein sizes are shown in kD, deduced from the marker proteins which are high (Figure 2A) or low (Figure 2B) molecular weight markers from Sigma.

FIGURE 3 shows the effect of purified XYLA on the specific viscosity of soluble xylan (0.5%) in PC buffer, pH 6.5 at 37°C. Specific viscosity (■) and reducing sugars (•) were measured as described below.

FIGURE 4 shows the primary structure of XYLA. The two homologous catalytic domains, designated A and B, together with the duplicated C-terminal sequences (C and D) are boxed.

FIGURE 5 shows the alignment of homologous regions of N. patriciarum XYLA and prokaryote xylanases. The enzymes compared were as follows: B. pumilus xylanase A (XYLAB; Fukusaki et al, FEBS Lett. 171: 197-201 (1984)), B. circulans xylanase (XYLBC; Yang et al, Nucl. Acids Res. 16: 7178 (1988)) and C. acetobutylicum xylanase B (XYLBCA; Zappe et al, Nucl. Acids Res. 18: 2179 (1990)). Residues which show identity or similarity in all primary sequences compared are boxed. The positions of the first and last residues of homologous regions, in their respective primary sequences, are shown.

FIGURE 6 shows the structure of plasmid pNX1.

FIGURE 7 shows the cloning and characterisation of Neocallimastix patriciarum xylanase A encoding cDNA.

EXAMPLE 1 - Preparation of pNX1

1.1 Microbial strains, vectors and culture media

The anaerobic fungus Neocallimastix patriciarum (type species) was isolated from a sheep rumen (Orpin, C.G., and Munn, E.A., Trans. Br. Mycol. Soc. 86: 178-181 (1986)). Host strains for cDNA cloning were E. coli PLK-F' and XL1-Blue. E. coli strain JM83 was used for characterisation of the xylanase+ cDNA clones. The vectors were λZAPII, pBLUESCRIPT™ (Stratagene), pMTL20, pMTL22 and pMTL23 (Chambers et al, Gene 68: 139-149 (1988)). N. patriciarum culture was maintained in a medium containing 10% rumen fluid as described by Kemp et al, J. Gen. Microbiol. 130: 27-37 (1984). E. coli strains were grown in L-broth (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd edition. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989)). The recombinant phage were grown in E. coli strains using NZY medium according to Stratagene's instructions.

1.2 General recombinant DNA techniques

Agarose-gel electrophoresis, transformation of E. coli and modification of DNA using restriction enzymes and T4 DNA ligase were as described by Gilbert et al, J. Gen. Microbiol. 134: 3239-3247 (1988). Large amounts of plasmid DNA were extracted from E. coli by 'Brij lysis' and subsequent CsCl density-gradient centrifugation (Clewell, D.B., and Helinski, D.R., Proc. Natl. Acad. Sci. USA 62: 1159-1166 (1969)). The rapid boiling method of Holmes, D.S., and Quigley, M., Anal. Biochem. 114: 193-197 (1981) and the alkaline lysis method of Birnboim, H.L. and Doly, J., Nucl. Acids Res. 7: 1513-1523 (1979) were employed to isolate plasmid for rapid restriction analysis and nucleotide sequencing, respectively. Northern hybridisation was as described by Gilbert et al, J. Bacteriol. 161: 314-320 (1985).

1.3 Cultivation of rumen anaerobic fungus, N. patriciarum

N. patriciarum was grown in a rumen fluid-containing medium (Kemp et al, J. Gen. Microbiol. 130: 27-37 (1984)) in the presence of 1% AVICEL at 39°C under anaerobic conditions for 48 hr. (Alternative culture media, such as those described by Philips, M.W., and Gordon, G.L.R., Appl. Environ. Microbiol. 55: 1695-1702 (1989) and Lowe et al, J. Gen. Microbiol. 131: 2225-2229 (1985), can be used.)

1.4 Total RNA isolation

The frozen mycelia were ground to a fine powder under liquid nitrogen with a mortar and pestle. 5-10 vol of guanidinium thiocyanate solution (4M guanidinium thiocyanate, 0.5% sodium laurylsarcosine, 25mM sodium citrate, pH 7.0, 1mM EDTA and 0.1M β-mercaptoethanol) was added to the frozen mycelial powder and the mixture was homogenised for 5 min with a mortar and pestle and for a further 2 min at full speed using a Polytron homogeniser. Total RNA was isolated from the homogenate by ultracentrifugation through a CsCl cushion (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd edition. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989)). (An alternative method for preparation of total fungal RNA, such as an adaptation of the procedure described by Puissant, C., and Houdebine, L.M., Bio-Techniques 148-149 (1990), can be used.)

1.5 Poly A+ mRNA purification

Poly A+ mRNA was purified from the total RNA by oligo (dT) cellulose chromatography (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd edition. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989)).

1.6 Construction of a cDNA expression library of N. patriciarum

The cDNA library was constructed, using Stratagene's λZAP cDNA synthesis kit, basically according to the manufacturer's instructions.

The procedure is described briefly as follows: Poly A+ mRNA was converted to the first strand cDNA by reverse transcriptase, using XhoI linker-oligo (dT) primer and 5-methyl dCTP. Double-stranded cDNA was synthesised from the first-strand cDNA by the action of RNase H and DNA polymerase I. After blunting cDNA ends, the cDNA was ligated with EcoRI adaptor, phosphorylated and digested with XhoI to create cDNA with an EcoRI site at the 5' region and an XhoI site at the 3' region. The cDNA was size-fractionated by 1% low-melting point agarose gel electrophoresis and 1.2-8 kb sizes of the cDNA were recovered by phenol extraction (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd edition. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989)). The size-fractionated cDNA was then ligated to the EcoRI/XhoI digested λZAPII vector (other expression vectors can be used).

The cDNA library was packaged in vitro and amplified using E. coli PLK-F' as plating cells.

1.7 Screening xylanase-positive recombinant bacteriophage clones

Recombinant phage were grown in E. coli XL1-Blue in 0.7% top agar containing 0.1% xylan and 10mM isopropyl-β-D-thiogalactopyranoside (IPTG, an inducer for LacZ promoter controlled gene expression). After overnight incubation at 37°C, 0.5% Congo red solution was added over the top agar. After incubation at RT for 15 min, the unbound dye was removed by washing with 1M NaCl. Xylanase-producing phage plaques were surrounded by yellow haloes against a red background.

The xylanase-positive recombinant phage were purified to homogeneity by replating and rescreening the phage as above 2-3 times.

The cDNA inserts in xylanase-positive phage were excised into pBLUESCRIPT SK using VCS-M13 helper phage.

1.8 Xylanase and related-enzyme assays

The enzyme extracts from E. coli harbouring xylanase-positive recombinant plasmids were prepared as described by Kellett et al, Biochem. J. 272: 369-376 (1990).

The enzymes were assayed for hydrolysis of xylan or other substrates at 37°C in 50mM potassium phosphate/12mM citric acid buffer, pH 6.5, and the reducing sugars released from xylan or other plant polysaccharides (carboxymethyl cellulose, barley β-glucan, laminarin, lichenan) were measured as described by Kellett et al, Biochem. J. 272: 369-376 (1990) and Hazlewood et al, J. Gen. Microbiol. 136: 2089-2097 (1990). Assays for activities against artificial substrates (methylumbelliferyl-β-D-cellobioside, methylumbelliferyl-β-D-glucoside, methylumbelliferyl-β-D-xyloside and p-nitrophenyl-β-D-xyloside) were described by Hazlewood et al, J. Gen. Microbiol. 136: 2089-2097 (1990).

1.9 DNA sequencing

Plasmid DNA, denatured by alkali, was neutralised and further purified by spin dialysis (Murphy, G., and Kavanagh, T., Nucl. Acids Res. 16: 5198 (1988)). Sequencing of the resultant DNA was based on the protocol recommended by the manufacturer of the Sequenase DNA sequencing kit (USA, Cleveland, OH). Overlapping sequences were generated by cloning appropriate restriction fragments into pMTL-based vectors. Sequences were compiled and ordered using the computer programs described by Staden, R., Nucl. Acids Res. 16: 3673-3694 (1980). The complete sequence of the cDNA contained in the plasmid designated pNX1 was determined in both strands. The xylanase-encoding gene contained in the plasmid was designated xynA and the gene product, the xylanase enzyme itself, was designated XYLA.

EXAMPLE 2 - Construction of pNX4, a Deletion Mutant of pNX1 (xynA)

pNX1 was linearised by XhoI and the 3' region of xynA cDNA was removed by Bal-31 digestion (Hall, J., and Gilbert, H.J., Mol. Gen. Genet. 213: 112-117 (1988)). After blunt ending, the truncated cDNA was excised from pNX1 by EcoRI digestion and cloned into EcoRI/SmaI-digested pMTL22 vector.

EXAMPLE 3 - Construction of pNX5, a Deletion Mutant of pNX1 (xynA)

A 720bp ScaI/NruI fragment was excised from pNX4 and cloned into pMTL20 vector. This resulted in a highly expressing clone, in which the enzyme expression levels were some hundreds higher than for pNX1.

EXAMPLE 4 - Construction of pNX6, a Deletion Mutant of pNX1 (xynA)

pNX6 was constructed by cleaving pNX1 with EcoRI/ScaI and cloning the resulting fragment into EcoRI/SmaI-cut pMTL22.

EXAMPLE 5 - Construction of pNX8, a Deletion Mutant of pNX1 (xynA)

pNX1 was digested with ScaI and XhoI to obtain a 1.3kb fragment which was cloned into pMTL20 so that the xynA sequence was in phase with the LacZ ATG contained in the vector. This resulted in a high expression clone in which the expression level was approximately fifteen times that of pNX1.

EXAMPLE 6 - Construction of pNX9, a Deletion Mutant of pNX1 (xynA)

pNX8 was cut with KpnI (1 site in vector polylinker) and the insert fragment, after electroelution, was digested with RsaI (cuts in the PT linker region of the gene) to produce a ~700bp fragment which was cloned into pMTL20 which had been cut with KpnI and StuI. This resulted in a highly-expressing clone (much better than the clone containing pNX8) with the second catalytic domain in frame with the vector LacZ N-terminus.

EXAMPLE 7 - Construction of pNX10, a Deletion Mutant of pNX1 (xynA)

pNX8 was digested with KpnI and the fragment (~850bp) was ligated into KpnI-cut pMTL20. This clone also expressed well but the protein expressed contains some residues at the carboxy end which, when removed, allow for the high level expression observed for pNX9.
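The sizes of the deletion-mutant inserts can be cross-checked against the misc_feature coordinates given for SEQ ID NO: 1 in the sequence listing. The following is a minimal sketch only: the coordinates are copied from that feature table, while the dictionary and function names are illustrative and are not part of the specification.

```python
# Insert coordinates (1-based, inclusive) as listed in the misc_feature
# entries of SEQ ID NO: 1. Names below are hypothetical helpers.
INSERTS = {
    "pNX4": (1, 1725),
    "pNX5": (1002, 1725),
    "pNX6": (1, 1001),
    "pNX8": (1002, 2338),
}


def insert_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive interval."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def insert_lengths(inserts: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Map each plasmid name to the length of its insert in bp."""
    return {name: insert_length(s, e) for name, (s, e) in inserts.items()}


if __name__ == "__main__":
    for name, bp in insert_lengths(INSERTS).items():
        print(f"{name}: {bp} bp")
```

On these coordinates the pNX5 insert spans 724 bp, in line with the 720bp ScaI/NruI fragment of Example 3, and the pNX8 insert spans 1337 bp, in line with the 1.3kb fragment of Example 5.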

EXAMPLE 8 - Purification and amino acid sequencing of the N-terminus of xylanase A

E. coli XL1-Blue harbouring pNX1 or pNX5 was cultured for 16 hours in LB broth containing ampicillin (100μg/ml). Cells, harvested by centrifugation, were resuspended in 50mM Tris/HCl buffer, pH 8.0 and the cytoplasmic fraction prepared as described previously (Clarke et al, FEMS Microbiol. Lett. 83: 305-310 (1991)). Xylanase, precipitated by the addition of ammonium sulphate (0.39g/ml), was redissolved in 10mM Tris/HCl buffer, pH 8.0. After dialysing against 3 changes of the same buffer, the xylanase was substantially purified by anion exchange chromatography on DEAE-Trisacryl M essentially as described by Poole et al, Mol. Gen. Genet. 223: 217-223 (1990).

The xylanase (designated XYLA) purified from cell-free extract of E. coli XL1-Blue harbouring pNX1 was fractionated by SDS/PAGE and electroblotted onto PROBLOT membrane (Applied Biosystems Inc). N-terminal sequence was determined by automated Edman sequencing using a 470 gas-phase sequenator equipped with a 120A on-line phenylthiohydantoin analyser (Applied Biosystems Inc; Hunkapiller et al, Methods Enzymol. 91: 399-413 (1983)).

EXAMPLE 9 - Summary of Isolation of xynA

A cDNA library consisting of 10⁶ clones was constructed using mRNA isolated from N. patriciarum cells grown with AVICEL as sole carbon source. Thirty-one recombinant bacteriophages which hydrolysed xylan were identified after screening 5 x 10⁴ clones from the library, and 16 strongly xylanase-positive phage were isolated for further characterisation. Restriction mapping and hybridisation data indicated that all the xylanase-positive recombinants contained cDNA sequences derived from the same mRNA species. A restriction map of the largest cDNA sequence encoding a functional xylanase, designated xynA, is shown in Figure 1. A nucleic acid probe consisting of 1.7kb of the 5' region of xynA hybridised to a single 2.5kb Neocallimastix RNA species. This suggests that the longest xynA cDNA isolated is almost full length.

EXAMPLE 10 - Characterisation of xylanase A

The cDNA sequences encoding Neocallimastix xylanases were excised from λZAPII and rescued in E. coli XL1-Blue as recombinants of pBLUESCRIPT SK. Xylanase activity expressed by the recombinant strain harbouring the plasmid pNX1, which contained the longest form of xynA, was found predominantly in the cell-free extract, indicating that the enzyme was not efficiently secreted by E. coli. The xylanase, designated xylanase A (XYLA), was purified to near homogeneity (>90% pure). Purified XYLA had a specific activity of 5980 U/mg protein, compared to the cell-free extract value of 16 U/mg protein. This indicates that XYLA constitutes approximately 0.3% of soluble protein synthesised by E. coli cells harbouring pNX1. The purified enzyme had an Mr of 53000 (Figure 2) and an N-terminal sequence of IATVAKAQWGGGGAS. XYLA hydrolysed xylan but exhibited no activity against carboxymethyl cellulose, barley β-glucan, laminarin, lichenan or the artificial substrates 4-methylumbelliferyl-β-D-xyloside and p-nitrophenyl-β-D-xylopyranoside (Table 1).

TABLE 1

The enzyme activity of purified xylanase A from E. coli harbouring pNXl (xynA cDNA) plasmid.

One unit of XYLA releases 1 μmole of product per minute.
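The estimate that XYLA makes up about 0.3% of soluble protein follows from the ratio of the two specific activities quoted in Example 10. A minimal sketch of that arithmetic is given below; the function names are illustrative, and the calculation assumes that the purified preparation is essentially pure and that activity is neither lost nor gained during purification.

```python
def percent_of_soluble_protein(extract_u_per_mg: float,
                               purified_u_per_mg: float) -> float:
    """Enzyme as a percentage of total soluble protein.

    Assumes the purified enzyme is essentially pure and that specific
    activity is unaffected by the purification procedure.
    """
    if extract_u_per_mg < 0 or purified_u_per_mg <= 0:
        raise ValueError("specific activities must be positive")
    return 100.0 * extract_u_per_mg / purified_u_per_mg


def fold_purification(extract_u_per_mg: float,
                      purified_u_per_mg: float) -> float:
    """Ratio of purified to crude specific activity."""
    if extract_u_per_mg <= 0:
        raise ValueError("extract specific activity must be positive")
    return purified_u_per_mg / extract_u_per_mg


if __name__ == "__main__":
    # Values from the text: 16 U/mg (cell-free extract), 5980 U/mg (purified)
    print(round(percent_of_soluble_protein(16, 5980), 2), "% of soluble protein")
    print(round(fold_purification(16, 5980), 1), "fold purification")
```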

The enzyme attacked soluble xylan in a manner typical of an endo-β-1,4-xylanase (EC 3.2.1.8), promoting a rapid decline in viscosity (Figure 3) and releasing 893mg of reducing sugar per g of substrate. Analysis of the hydrolysis products by HPLC revealed that XYLA liberated approximately equal amounts of xylobiose and xylose. No disaccharides containing arabinose, the major side-chain sugar of oat spelt xylan, were detected among the reaction products, suggesting that the enzyme does not hydrolyse glycosidic linkages involving xylose units linked to side chain sugars.

EXAMPLE 11 - Nucleotide sequence

The 2.3kb Neocallimastix cDNA derived from pNX1 was sequenced in both strands (Accession number X65526 in EMBL/GenBank/DDBJ Nucleotide Sequence Data Libraries). Translation of the nucleotide sequence revealed a single open reading frame (ORF) of 1821 bp encoding a polypeptide of Mr 66192. The deduced primary structure of the encoded protein is shown in Figure 4. The N-terminal 15 residues of recombinant XYLA, purified from E. coli, exhibited a perfect match with amino acids 12 to 26 of the translated sequence. The assignment of the proposed translation initiation codon was based on the following observations: (i) there are no ATG sequences upstream of the ORF; (ii) translational stop codons are in all 3 reading frames upstream of the putative translational start codon. Inspection of the nucleotide sequence in the vicinity of the putative ATG start codon did not reveal any alternative sequences which could act as translational start codon in E. coli. It is likely, therefore, that translational initiation of xynA occurs at the same codon in the enteric bacterium and anaerobic fungus. This is despite the fact that lower eukaryote mRNAs do not contain ribosome binding sequences which conform to the corresponding E. coli sequence. Presumably the sequence AGA, 7bp upstream of the ATG start codon, acts as a weak ribosome binding sequence in the bacterium. Transcription initiation of xynA in E. coli is presumably at the vector's lacZp, as subcloning of the xynA cDNA, on a 2.3 kb EcoRI-XhoI restriction fragment, into pMTL22 generated a recombinant plasmid (pNX2) which did not direct the synthesis of a functional xylanase. The vector's lacZp is at the 3' end of xynA in pNX2. Although XYLA is not secreted by E. coli, the deduced N-terminal region of the xylanase conforms to that of a signal peptide, comprising an N-terminal hydrophilic basic region followed by a sequence of 23 predominantly hydrophobic or neutral amino acids.
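Two of the figures above can be checked directly against the sequence listing: the ORF length, and the position of the Edman-sequenced peptide in the translated sequence. The sketch below is illustrative only. It takes the CDS location (195..2018) from the feature table of SEQ ID NO: 1 and assumes that this interval includes the stop codon; the one-letter string is a hand transcription of the first 32 residues of SEQ ID NO: 2, and all names are hypothetical.

```python
CDS_START, CDS_END = 195, 2018  # CDS location listed for SEQ ID NO: 1

# First 32 residues of the deduced precursor (SEQ ID NO: 2), one-letter code.
PRECURSOR_N_TERMINUS = "MRTIKFFFAVAIATVAKAQWGGGGASAGQRLT"
# N-terminal sequence determined for purified recombinant XYLA.
EDMAN_PEPTIDE = "IATVAKAQWGGGGAS"


def cds_summary(start: int, end: int) -> dict[str, int]:
    """Summarise a CDS interval that is assumed to include the stop codon."""
    total = end - start + 1
    if total % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return {
        "bp_with_stop": total,
        "bp_without_stop": total - 3,
        "residues": total // 3 - 1,
    }


def locate_peptide(protein: str, peptide: str) -> tuple[int, int]:
    """Return 1-based (first, last) positions of peptide within protein."""
    index = protein.find(peptide)
    if index < 0:
        raise ValueError("peptide not found in protein")
    return index + 1, index + len(peptide)


if __name__ == "__main__":
    print(cds_summary(CDS_START, CDS_END))
    print(locate_peptide(PRECURSOR_N_TERMINUS, EDMAN_PEPTIDE))
```

Under these assumptions the coding region is 1821 bp excluding the stop codon, corresponding to the 607 residues of SEQ ID NO: 2, and the sequenced peptide aligns with residues 12 to 26 when numbering from the initiating methionine.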

The G + C content of the xynA ORF was 43.4%, compared to 10.7% for the 5' and 3' non-coding regions (excluding the 3' poly A tail). The overall G + C content of Neocallimastix DNA is approximately 15% (Billon-Grand et al, FEMS Microbiol. Lett. 82: 267-270 (1991)), indicating that non-protein coding regions of the genome are generally very A + T-rich. The bias in codon utilisation in xynA is evident from the absence of 14 of the 61 amino acid codons. There is a marked preference for T in the third position (~50% of all codons end in T) and an exclusion of G in the wobble position. Apart from ATG and TGG, which are the sole codons for Met and Trp respectively, only 3 codons contain G in the third position: AAG, GAG and TTG.
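The composition statistics described above (G + C content, third-position base usage, and the number of unused sense codons) can be computed from any in-frame coding sequence. The following is a minimal sketch, not the method used by the applicants: the function names are illustrative, the standard genetic code (stop codons TAA, TAG, TGA) is assumed, and the example fragment is only the first 12 codons of the CDS in SEQ ID NO: 1, so it does not reproduce the whole-gene figures.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def split_codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def third_position_usage(cds: str) -> Counter:
    """Count the base found at the third position of each codon."""
    return Counter(codon[2] for codon in split_codons(cds))


def unused_sense_codons(cds: str) -> set[str]:
    """Sense codons (61 in the standard code) that never occur in cds."""
    all_sense = {"".join(p) for p in product("ACGT", repeat=3)} - STOP_CODONS
    return all_sense - set(split_codons(cds))


if __name__ == "__main__":
    # First 12 codons of the xynA CDS (SEQ ID NO: 1, positions 195-230).
    fragment = "ATGAGAACTATTAAATTCTTTTTCGCAGTAGCTATT"
    print(gc_content(fragment))
    print(third_position_usage(fragment))
    print(len(unused_sense_codons(fragment)))
```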

Inspection of the deduced primary structure of mature XYLA revealed several interesting features. Between residues 255-265 and 491-519 are regions rich in proline and hydroxy amino acids. Many cellulases and xylanases consist of multiple domains which are linked by sequences rich in proline/hydroxy amino acids (Gilkes et al, Microbiol. Rev. 55: 303-315 (1991)). The presence of 2 such "linker sequences" in XYLA suggests that the enzyme consists of at least 3 distinct domains. The Neocallimastix xylanase, in addition to comprising linker regions, also contains a 225 amino acid repeated sequence at the N-terminus, and a C-terminal 40 residue reiterated domain (Figure 4). There is no obvious sequence conservation between the large and small repeated regions. The two N-terminal repeated sequences exhibited 91.6% and 95.6% identity and similarity, respectively. The 40 amino acid reiterated region displayed 82.9% and 95.1% identity and similarity, respectively. DNA encoding the two repeated regions also showed sequence identity, with the 699 bp and 120 bp reiterated sequences exhibiting 92.7% and 90.8% identity, respectively.
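Percent identity between two repeats, as quoted above, is the fraction of aligned positions carrying the same residue. The sketch below illustrates the calculation for two pre-aligned, equal-length strings. It is an assumption-laden example: the two 30-residue windows were transcribed by hand from SEQ ID NO: 2 (mature residues 3-32 and 247-276), no gaps are handled, and a window this short is not expected to reproduce the full-length figures. Similarity scoring, which depends on a substitution scheme not stated in the text, is not attempted.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Illustrative windows from the two large N-terminal repeats (SEQ ID NO: 2).
REPEAT_1_WINDOW = "LTVGNGQTQHKGVADGYSYEIWLDNTGGSG"  # mature residues 3-32
REPEAT_2_WINDOW = "FTVGNGQNQHKGVNDGFSYEIWLDNTGGNG"  # mature residues 247-276

if __name__ == "__main__":
    print(round(percent_identity(REPEAT_1_WINDOW, REPEAT_2_WINDOW), 1))
```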

EXAMPLE 12 - Homology Studies

Hydrophobic cluster analysis has shown that cellulases and xylanases can be grouped into nine enzyme families. Proteins within a family are structurally related and have probably evolved from a common ancestral gene (Henrissat et al, Gene 81: 83-95 (1989)). Comparison of XYLA with sequences in the SWISS-PROT database revealed homology between the fungal enzyme and Bacillus pumilus xylanase A (Fukusaki et al, FEBS Lett. 171: 197-201 (1984)), Bacillus circulans xylanase (Yang et al, Nucl. Acids Res. 16: 7178 (1988)), Clostridium acetobutylicum xylanase B (Zappe et al, Nucl. Acids Res. 18: 2179 (1990)) and the N-terminal region of the Ruminococcus flavefaciens xylanase (Zhang & Flint, Mol. Microbiol. 6: 1013-1019 (1992)). The degree of homology between these enzymes and N. patriciarum XYLA is shown in Figure 5.

It is interesting to note that only the large repeated sequence of XYLA exhibited homology with other hemicellulases; the C-terminal reiterated region showed no identity with proteins in the database. This suggests that XYLA has a modular structure in which the N-terminal region constitutes the catalytic domain.

EXAMPLE 13 - Structure and function of XYLA

To investigate the assertion that the N-terminal repeated sequence constituted the catalytic domain of XYLA, 5' and 3' regions of xynA were deleted, or subcloned into appropriate vectors, and the capacity of the resultant xynA derivatives to express a functional xylanase was evaluated. A truncated form of xynA, in which 291 bp of the 3' region encoding the 40 amino acid C-terminal repeat had been deleted, still encoded a functional xylanase. The predicted Mr of the encoded enzyme was 53000. This is similar to the size of XYLA purified from E. coli harbouring pNX1. Thus, the recombinant xylanase synthesised from the full-length gene by the enteric bacterium could also lack the C-terminal repeated sequence. Support for this view is provided by the fact that several multidomain cellulases and xylanases are particularly sensitive to proteolytic cleavage within the linker sequences (Tomme et al, Eur. J. Biochem. 170: 575-581 (1988); Gilkes et al, J. Biol. Chem. 263: 10401-10407 (1988)), including a Pseudomonas xylanase, expressed by E. coli, which was substantially cleaved within the serine-rich linker sequences (Hall et al, Mol. Microbiol. 3: 1211-1219 (1989)). A more substantial 3' deletion (pNX6), extending for 1011 bp, did not affect the capacity of xynA to direct the synthesis of a functional xylanase. However, removal of 1324 bp from the 3' region of xynA resulted in the synthesis of an inactive derivative of XYLA.

These data suggest that the N-terminal 270 residues of the N. patriciarum xylanase fold into a catalytically active enzyme. To determine whether both N-terminal reiterated sequences fold into functional xylanases, the 720 bp ScaI/NruI restriction fragment (NruI cleaves in the multiple cloning region of pNX4) was cloned into pMTL20 to generate pNX5, in which truncated xynA was in phase with the vector's lacZ translation initiation codon (Figure 1). E. coli harbouring pNX5 expressed 15 times more XYLA than a clone harbouring full-length xynA. This elevation in the expression of the fungal enzyme is presumably a result of the utilisation of an E. coli translation initiation sequence in xynA encoded by pNX5. XYLA purified from cell-free extract of E. coli containing pNX5 had an Mr of 26000 (Figure 2B). These data confirm that the reiterated N-terminal 225 residues constitute distinct catalytic domains. Interestingly, a further increase in xylanase activity was achieved by deletion of a few amino acid residues from the C-terminus of the second catalytic domain to generate pNX9.

To investigate the substrate specificities of the N- and C-terminal catalytic domains, the capacity of the xylanases encoded by pNX6 and pNX5 to cleave plant structural polysaccharides was assessed. The enzymes cleaved only xylan, releasing xylobiose and xylose in similar proportions to full-length XYLA. Thus, both catalytic domains displayed the same substrate specificities as full-length XYLA.

Although many cellulases and xylanases consist of multiple domains, celB from Caldocellum saccharolyticum (Saul et al, Appl. Environ. Microbiol. 56: 3117-3124 (1990)) is the only previous example of an enzyme containing 2 distinct catalytic domains. This enzyme consists of an N-terminal exoglucanase and a C-terminal endoglucanase which belong to different enzyme families. Thus, the gene encoding celB probably arose through the fusion of two discrete cellulase genes. This invention provides evidence that fungal xylanases can also consist of multiple catalytic domains. In contrast to the celB gene, xynA is clearly a result of tandem duplication of an ancestral gene. It is not apparent what selective advantage the gene duplication confers on the anaerobic fungus. Is it simply a mechanism for increasing the expression of XYLA catalytic domains? As this is the first description of an anaerobic fungal xylanase, it is unclear whether multiple catalytic domains are a common feature of lower eukaryote hemicellulases.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: Harry John GILBERT

(B) STREET: 16 Kells Gardens, Low Fell,

(C) CITY: Gateshead

(D) STATE: Tyne and Wear

(E) COUNTRY: United Kingdom

(F) POSTAL CODE (ZIP) : NE9 5XS

(A) NAME: Geoffrey Peter HAZLEWOOD

(B) STREET: 109A Duchess Drive

(C) CITY: Newmarket

(D) STATE: Suffolk

(E) COUNTRY: United Kingdom

(F) POSTAL CODE (ZIP) : CB8 8AL

(ii) TITLE OF INVENTION: Recombinant Xylanases

(iii) NUMBER OF SEQUENCES: 18

(iv) COMPUTER READABLE FORM:

3.5" MS-DOS FLOPPY DISK CONTAINING ASCII FILE (93_01283.ASC)

(v) CURRENT APPLICATION DATA:

APPLICATION NUMBER: WO PCT/GB93/01283

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2338 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Neocallimastix patriciarum

(B) STRAIN: (type species)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 195..2018

(D) OTHER INFORMATION: /function= "Xylanolytic enzyme" /product= "XYLA" /standard_name= "Xylanase"

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 195..281

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 282..2018

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 282..959

(D) OTHER INFORMATION: /label= CAT1

/note= "1st catalytic domain"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1017..1691

(D) OTHER INFORMATION: /label= CAT2 /note= "2nd catalytic domain"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1764..1883

(D) OTHER INFORMATION: /label= CTR1

/note= "1st C-terminal repeat"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1884..2015

(D) OTHER INFORMATION: /label= CTR2

/note= "2nd C-terminal repeat"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..2338

(D) OTHER INFORMATION: /label= pNX1_insert

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..2338

(D) OTHER INFORMATION: /label= pNX2_insert

/note= "pNX2 insert is in reverse orientation to pNX1 insert"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..1847

(D) OTHER INFORMATION: /label= pNX3_insert

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..1725

(D) OTHER INFORMATION: /label= pNX4_insert

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1002..1725

(D) OTHER INFORMATION: /label= pNX5_insert

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..1001

(D) OTHER INFORMATION: /label= pNX6_insert

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..690

(D) OTHER INFORMATION: /label= pNX7_insert

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1002..2338

(D) OTHER INFORMATION: /label= pNX8_insert

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1002..1847

(D) OTHER INFORMATION: /label= pNX9_insert

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1002..1709

(D) OTHER INFORMATION: /label= pNX10_insert

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TTTTATTATA TCAATCTCTA ATTTATTTTT TTAGGAAAAA AATAAAAAAA TAAATATAAT 60

AAATATTAGA GAGTAATATT TAAAAACAAA GAAATTTAAA AACGTTTATT TAGTTATTTT 120

TTTTACTGGT TAAAAAAAAA ATAAAAAACA AAATTAATAA AGATATTTTT GAAAAATATT 180

GAATTAGAAA AAAA ATG AGA ACT ATT AAA TTC TTT TTC GCA GTA GCT ATT 230 Met Arg Thr He Lys Phe Phe Phe Ala Val Ala He -29 -25 -20

GCA ACT GTT GCT AAG GCC CAA TGG GGT GGA GGT GGT GCC TCT GCT GGT 278 Ala Thr Val Ala Lys Ala Gin Trp Gly Gly Gly Gly Ala Ser Ala Gly -15 -10 -5

CAA AGA TTA ACC GTC GGT AAT GGT CAA ACC CAA CAT AAG GGT GTA GCT 326 Gln Arg Leu Thr Val Gly Asn Gly Gln Thr Gln His Lys Gly Val Ala 1 5 10 15

GAT GGT TAC AGT TAT GAA ATC TGG TTA GAT AAC ACC GGT GGT AGT GGT 374 Asp Gly Tyr Ser Tyr Glu He Trp Leu Asp Asn Thr Gly Gly Ser Gly 20 25 30

TCT ATG ACT CTC GGT AGT GGT GCA ACC TTC AAG GCT GAA TGG AAT GCA 422 Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala 35 40 45

TCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT GGT CTT GAC TTC GGT 470 Ser Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly 50 55 60

TCT CAA AAG AAG GCA ACC GAT TAC AGC TAC ATT GGA TTG GAT TAT ACT 518 Ser Gin Lys Lys Ala Thr Asp Tyr Ser Tyr He Gly Leu Asp Tyr Thr 65 70 75

GCA ACT TAC AGA CAA ACT GGT AGC GCA AGT GGT AAC TCC CGT CTC TGT 566 Ala Thr Tyr Arg Gin Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys 80 85 90 95

GTA TAC GGT TGG TTC CAA AAC CGT GGA GTT CAA GGT GTT CCA TTG GTA 614 Val Tyr Gly Trp Phe Gin Asn Arg Gly Val Gin Gly Val Pro Leu Val 100 105 110

GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG GTT CCA GAT GCA CAA 662 Glu Tyr Tyr He He Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gin 115 120 125

GGT AGA ATG GTA ACC ATT GAT GGA GCT CAA TAT AAG ATT TTC CAA ATG 710 Gly Arg Met Val Thr He Asp Gly Ala Gin Tyr Lys He Phe Gin Met 130 135 140

GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT GAA ACC TTT AAG CAA 758 Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu Thr Phe Lys Gln 145 150 155

TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT GGT CAT ATT ACT GTC 806 Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly His Ile Thr Val 160 165 170 175

TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT TGG GGT ATT GGT AAC 854

Ser Asp His Phe Lys Glu Trp Ala Lys Gin Gly Trp Gly He Gly Asn 180 185 190

CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG CAA AGT AGT GGT ATA 902 Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln Ser Ser Gly Ile 195 200 205

GCT GAT GTC ACC AAG TTA GAT GTT TAC ACA ACC CAA AAA GGT TCT AAT 950 Ala Asp Val Thr Lys Leu Asp Val Tyr Thr Thr Gin Lys Gly Ser Asn 210 215 220

CCT GCC CCT ACC TCC ACT GGT ACT GTT CCA AGC AGT TCT GCT GGT GGA 998 Pro Ala Pro Thr Ser Thr Gly Thr Val Pro Ser Ser Ser Ala Gly Gly 225 230 235

AGT ACT GCC AAT GGT AAA AAG TTT ACT GTC GGT AAT GGA CAA AAC CAA 1046 Ser Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gin Asn Gin 240 245 250 255

CAT AAG GGT GTC AAC GAT GGT TTC AGT TAT GAA ATC TGG TTA GAT AAC 1094 His Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu He Trp Leu Asp Asn 260 265 270

ACT GGT GGT AAC GGT TCT ATG ACT CTC GGT AGT GGT GCA ACT TTC AAG 1142 Thr Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys 275 280 285

GCT GAA TGG AAT GCA GCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT 1190 Ala Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg 290 295 300

GGT CTT GAC TTC GGT TCT CAA AAG AAG GCA ACC GAT TAC GAC TAC ATT 1238 Gly Leu Asp Phe Gly Ser Gin Lys Lys Ala Thr Asp Tyr Asp Tyr He 305 310 315

GGA TTA GAT TAT GCT GCT ACT TAC AAA CAA ACT GCC AGT GCA AGT GGT 1286 Gly Leu Asp Tyr Ala Ala Thr Tyr Lys Gin Thr Ala Ser Ala Ser Gly 320 325 330 335

AAC TCC CGT CTC TGT GTA TAC GGA TGG TTC CAA AAC CGT GGA CTT AAT 1334 Asn Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn 340 345 350

GGC GTT CCT TTA GTA GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG 1382 Gly Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp 355 360 365

GTT CCA GAT GCA CAA GGA AAA ATG GTA ACC ATT GAT GGA GCT CAA TAT 1430 Val Pro Asp Ala Gin Gly Lys Met Val Thr He Asp Gly Ala Gin Tyr 370 375 380

AAG ATT TTC CAA ATG GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT 1478 Lys He Phe Gin Met Asp His Thr Gly Pro Thr He Asn Gly Gly Ser 385 390 395

GAA ACC TTT AAG CAA TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT 1526 Glu Thr Phe Lys Gin Tyr Phe Ser Val Arg Gin Gin Lys Arg Thr Ser 400 405 410 415

GGT CAT ATT ACT GTC TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT 1574 Gly His He Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gin Gly 420 425 430

TGG GGT ATT GGT AAC CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG 1622 Trp Gly He Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp 435 440 445

CAA AGT AGT GGT GTT GCT GAT GTC ACC TTA TTA GAT GTT TAC ACA ACT 1670 Gin Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr 450 455 460

CCA AAG GGT TCT AGT CCA GCC ACC TCT GCC GCT CCT CGT ACT ACT ACC 1718 Pro Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr 465 470 475

CGT ACT ACT ACT CGT ACC AAG TCT CTT CCA ACC AAT TAC AAT AAG TGT 1766 Arg Thr Thr Thr Arg Thr Lys Ser Leu Pro Thr Asn Tyr Asn Lys Cys 480 485 490 495

TCT GCT AGA ATT ACT GCT CAA GGT TAC AAG TGT TGT AGC GAT CCA AAT 1814 Ser Ala Arg He Thr Ala Gin Gly Tyr Lys Cys Cys Ser Asp Pro Asn 500 505 510

TGT GTT GTT TAC TAC ACT GAT GAG GAT GGT ACC TGG GGT GTT GAA AAC 1862 Cys Val Val Tyr Tyr Thr Asp Glu Asp Gly Thr Trp Gly Val Glu Asn 515 520 525

AAC GAC TGG TGT GGT TGT GGT GTT GAA CAA TGT TCT TCC AAG ATC ACT 1910 Asn Asp Trp Cys Gly Cys Gly Val Glu Gin Cys Ser Ser Lys He Thr 530 535 540

TCT CAA GGT TAC AAG TGT TGT AGC GAT CCA AAT TGC GTT GTT TTC TAC 1958 Ser Gin Gly Tyr Lys Cys Cys Ser Asp Pro Asn Cys Val Val Phe Tyr 545 550 555

ACT GAT GAC GAT GGT AAA TGG GGT GTT GAA AAC AAC GAC TGG TGT GGT 2006 Thr Asp Asp Asp Gly Lys Trp Gly Val Glu Asn Asn Asp Trp Cys Gly 560 565 570 575

TGT GGT TTC TAAGCAGTAA AATACTAATT AATAAAAAAT TAAAGAATTA 2055

Cys Gly Phe

TGAAAAATTT AAATTTAAAA ATTTAAAAGA ATTATGAAAA ATTTAAATTT AAAAATTTAA 2115

AAAAAACTAA TTTAGTAAAA AATTAAAGAA TTATTGAAAA TTTTAAATGT AAAAATTTAA 2175

AAAATACAAA TTTGTAAAAA AAAATGAAAG AATTATGAAA AATTAAAATG TAAAAGTTTA 2235

AAAAATACAA ATTTGTAAGA AAAATAAAGA ATTATAAAAA AAATAAAGAA TTATGAAAAA 2295

CCCAAATGTA AAGAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 2338

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 607 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Arg Thr He Lys Phe Phe Phe Ala Val Ala He Ala Thr Val Ala -29 -25 -20 -15

Lys Ala Gin Trp Gly Gly Gly Gly Ala Ser Ala Gly Gin Arg Leu Thr -10 -5 1

Val Gly Asn Gly Gin Thr Gin His Lys Gly Val Ala Asp Gly Tyr Ser 5 10 15

Tyr Glu He Trp Leu Asp Asn Thr Gly Gly Ser Gly Ser Met Thr Leu 20 25 30 35

Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala Ser Val Asn Arg 40 45 50

Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly Ser Gin Lys Lys 55 60 65

Ala Thr Asp Tyr Ser Tyr He Gly Leu Asp Tyr Thr Ala Thr Tyr Arg 70 75 80

Gin Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys Val Tyr Gly Trp 85 90 95

Phe Gin Asn Arg Gly Val Gin Gly Val Pro Leu Val Glu Tyr Tyr He 100 105 110 115

He Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gin Gly Arg Met Val 120 125 130

Thr He Asp Gly Ala Gin Tyr Lys He Phe Gin Met Asp His Thr Gly 135 140 145

Pro Thr He Asn Gly Gly Ser Glu Thr Phe Lys Gin Tyr Phe Ser Val 150 155 160

Arg Gin Gin Lys Arg Thr Ser Gly His He Thr Val Ser Asp His Phe 165 170 175

Lys Glu Trp Ala Lys Gin Gly Trp Gly He Gly Asn Leu Tyr Glu Val 180 185 190 195

Ala Leu Asn Ala Glu Gly Trp Gin Ser Ser Gly He Ala Asp Val Thr 200 205 210

Lys Leu Asp Val Tyr Thr Thr Gin Lys Gly Ser Asn Pro Ala Pro Thr 215 220 225

Ser Thr Gly Thr Val Pro Ser Ser Ser Ala Gly Gly Ser Thr Ala Asn 230 235 240

Gly Lys Lys Phe Thr Val Gly Asn Gly Gin Asn Gin His Lys Gly Val 245 250 255

Asn Asp Gly Phe Ser Tyr Glu He Trp Leu Asp Asn Thr Gly Gly Asn 260 265 270 275

Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn 280 285 290

Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe 295 300 305

Gly Ser Gin Lys Lys Ala Thr Asp Tyr Asp Tyr He Gly Leu Asp Tyr 310 315 320

Ala Ala Thr Tyr Lys Gin Thr Ala Ser Ala Ser Gly Asn Ser Arg Leu 325 330 335

Cys Val Tyr Gly Trp Phe Gin Asn Arg Gly Leu Asn Gly Val Pro Leu 340 345 350 355

Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala 360 365 370

Gln Gly Lys Met Val Thr He Asp Gly Ala Gin Tyr Lys He Phe Gin 375 380 385

Met Asp His Thr Gly Pro Thr He Asn Gly Gly Ser Glu Thr Phe Lys 390 395 400

Gin Tyr Phe Ser Val Arg Gin Gin Lys Arg Thr Ser Gly His He Thr 405 410 415

Val Ser Asp His Phe Lys Glu Trp Ala Lys Gin Gly Trp Gly He Gly 420 425 430 435

Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gin Ser Ser Gly 440 445 450

Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro Lys Gly Ser 455 460 465

Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg Thr Thr Thr 470 475 480

Arg Thr Lys Ser Leu Pro Thr Asn Tyr Asn Lys Cys Ser Ala Arg He 485 490 495

Thr Ala Gin Gly Tyr Lys Cys Cys Ser Asp Pro Asn Cys Val Val Tyr 500 505 510 515

Tyr Thr Asp Glu Asp Gly Thr Trp Gly Val Glu Asn Asn Asp Trp Cys 520 525 530

Gly Cys Gly Val Glu Gin Cys Ser Ser Lys He Thr Ser Gin Gly Tyr 535 540 545

Lys Cys Cys Ser Asp Pro Asn Cys Val Val Phe Tyr Thr Asp Asp Asp 550 555 560

Gly Lys Trp Gly Val Glu Asn Asn Asp Trp Cys Gly Cys Gly Phe 565 570 575

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1847 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 195..1847

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 195..281

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..1847

(D) OTHER INFORMATION: /label= pNX3_insert

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TTTTATTATA TCAATCTCTA ATTTATTTTT TTAGGAAAAA AATAAAAAAA TAAATATAAT 60

AAATATTAGA GAGTAATATT TAAAAACAAA GAAATTTAAA AACGTTTATT TAGTTATTTT 120

TTTTACTGGT TAAAAAAAAA ATAAAAAACA AAATTAATAA AGATATTTTT GAAAAATATT 180

GAATTAGAAA AAAA ATG AGA ACT ATT AAA TTC TTT TTC GCA GTA GCT ATT 230 Met Arg Thr He Lys Phe Phe Phe Ala Val Ala He 1 5 10

GCA ACT GTT GCT AAG GCC CAA TGG GGT GGA GGT GGT GCC TCT GCT GGT 278 Ala Thr Val Ala Lys Ala Gin Trp Gly Gly Gly Gly Ala Ser Ala Gly 15 20 25

CAA AGA TTA ACC GTC GGT AAT GGT CAA ACC CAA CAT AAG GGT GTA GCT 326 Gin Arg Leu Thr Val Gly Asn Gly Gin Thr Gin His Lys Gly Val Ala 30 35 40

GAT GGT TAC AGT TAT GAA ATC TGG TTA GAT AAC ACC GGT GGT AGT GGT 374 Asp Gly Tyr Ser Tyr Glu He Trp Leu Asp Asn Thr Gly Gly Ser Gly 45 50 55 60

TCT ATG ACT CTC GGT AGT GGT GCA ACC TTC AAG GCT GAA TGG AAT GCA 422 Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala 65 70 75

TCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT GGT CTT GAC TTC GGT 470 Ser Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly 80 85 90

TCT CAA AAG AAG GCA ACC GAT TAC AGC TAC ATT GGA TTG GAT TAT ACT 518 Ser Gin Lys Lys Ala Thr Asp Tyr Ser Tyr He Gly Leu Asp Tyr Thr 95 100 105

GCA ACT TAC AGA CAA ACT GGT AGC GCA AGT GGT AAC TCC CGT CTC TGT 566 Ala Thr Tyr Arg Gin Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys 110 115 120

GTA TAC GGT TGG TTC CAA AAC CGT GGA GTT CAA GGT GTT CCA TTG GTA 614 Val Tyr Gly Trp Phe Gin Asn Arg Gly Val Gin Gly Val Pro Leu Val 125 130 135 140

GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG GTT CCA GAT GCA CAA 662 Glu Tyr Tyr He He Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gin 145 150 155

GGT AGA ATG GTA ACC ATT GAT GGA GCT CAA TAT AAG ATT TTC CAA ATG 710 Gly Arg Met Val Thr He Asp Gly Ala Gin Tyr Lys He Phe Gin Met 160 165 170

GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT GAA ACC TTT AAG CAA 758 Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu Thr Phe Lys Gln 175 180 185

TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT GGT CAT ATT ACT GTC 806 Tyr Phe Ser Val Arg Gin Gin Lys Arg Thr Ser Gly His He Thr Val 190 195 200

TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT TGG GGT ATT GGT AAC 854 Ser Asp His Phe Lys Glu Trp Ala Lys Gin Gly Trp Gly He Gly Asn 205 210 215 220

CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG CAA AGT AGT GGT ATA 902 Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gin Ser Ser Gly He 225 230 235

GCT GAT GTC ACC AAG TTA GAT GTT TAC ACA ACC CAA AAA GGT TCT AAT 950 Ala Asp Val Thr Lys Leu Asp Val Tyr Thr Thr Gin Lys Gly Ser Asn 240 245 250

CCT GCC CCT ACC TCC ACT GGT ACT GTT CCA AGC AGT TCT GCT GGT GGA 998 Pro Ala Pro Thr Ser Thr Gly Thr Val Pro Ser Ser Ser Ala Gly Gly 255 260 265

AGT ACT GCC AAT GGT AAA AAG TTT ACT GTC GGT AAT GGA CAA AAC CAA 1046 Ser Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gin Asn Gin 270 275 280

CAT AAG GGT GTC AAC GAT GGT TTC AGT TAT GAA ATC TGG TTA GAT AAC 1094 His Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu He Trp Leu Asp Asn 285 290 295 300

ACT GGT GGT AAC GGT TCT ATG ACT CTC GGT AGT GGT GCA ACT TTC AAG 1142 Thr Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys 305 310 315

GCT GAA TGG AAT GCA GCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT 1190 Ala Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg 320 325 330

GGT CTT GAC TTC GGT TCT CAA AAG AAG GCA ACC GAT TAC GAC TAC ATT 1238 Gly Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile 335 340 345

GGA TTA GAT TAT GCT GCT ACT TAC AAA CAA ACT GCC AGT GCA AGT GGT 1286 Gly Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly 350 355 360

AAC TCC CGT CTC TGT GTA TAC GGA TGG TTC CAA AAC CGT GGA CTT AAT 1334 Asn Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn 365 370 375 380

GGC GTT CCT TTA GTA GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG 1382 Gly Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp 385 390 395

GTT CCA GAT GCA CAA GGA AAA ATG GTA ACC ATT GAT GGA GCT CAA TAT 1430 Val Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr 400 405 410

AAG ATT TTC CAA ATG GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT 1478 Lys Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser 415 420 425

GAA ACC TTT AAG CAA TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT 1526 Glu Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser 430 435 440

GGT CAT ATT ACT GTC TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT 1574 Gly His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly 445 450 455 460

TGG GGT ATT GGT AAC CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG 1622 Trp Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp 465 470 475

CAA AGT AGT GGT GTT GCT GAT GTC ACC TTA TTA GAT GTT TAC ACA ACT 1670 Gln Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr 480 485 490

CCA AAG GGT TCT AGT CCA GCC ACC TCT GCC GCT CCT CGT ACT ACT ACC 1718 Pro Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr 495 500 505

CGT ACT ACT ACT CGT ACC AAG TCT CTT CCA ACC AAT TAC AAT AAG TGT 1766 Arg Thr Thr Thr Arg Thr Lys Ser Leu Pro Thr Asn Tyr Asn Lys Cys 510 515 520

TCT GCT AGA ATT ACT GCT CAA GGT TAC AAG TGT TGT AGC GAT CCA AAT 1814 Ser Ala Arg Ile Thr Ala Gln Gly Tyr Lys Cys Cys Ser Asp Pro Asn 525 530 535 540

TGT GTT GTT TAC TAC ACT GAT GAG GAT GGT ACC 1847

Cys Val Val Tyr Tyr Thr Asp Glu Asp Gly Thr 545 550

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 551 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Arg Thr Ile Lys Phe Phe Phe Ala Val Ala Ile Ala Thr Val Ala 1 5 10 15

Lys Ala Gln Trp Gly Gly Gly Gly Ala Ser Ala Gly Gln Arg Leu Thr 20 25 30

Val Gly Asn Gly Gln Thr Gln His Lys Gly Val Ala Asp Gly Tyr Ser 35 40 45

Tyr Glu Ile Trp Leu Asp Asn Thr Gly Gly Ser Gly Ser Met Thr Leu 50 55 60

Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala Ser Val Asn Arg 65 70 75 80

Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly Ser Gln Lys Lys 85 90 95

Ala Thr Asp Tyr Ser Tyr Ile Gly Leu Asp Tyr Thr Ala Thr Tyr Arg 100 105 110

Gln Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys Val Tyr Gly Trp 115 120 125

Phe Gln Asn Arg Gly Val Gln Gly Val Pro Leu Val Glu Tyr Tyr Ile 130 135 140

Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gln Gly Arg Met Val 145 150 155 160

Thr Ile Asp Gly Ala Gln Tyr Lys Ile Phe Gln Met Asp His Thr Gly 165 170 175

Pro Thr Ile Asn Gly Gly Ser Glu Thr Phe Lys Gln Tyr Phe Ser Val 180 185 190

Arg Gln Gln Lys Arg Thr Ser Gly His Ile Thr Val Ser Asp His Phe 195 200 205

Lys Glu Trp Ala Lys Gln Gly Trp Gly Ile Gly Asn Leu Tyr Glu Val 210 215 220

Ala Leu Asn Ala Glu Gly Trp Gln Ser Ser Gly Ile Ala Asp Val Thr 225 230 235 240

Lys Leu Asp Val Tyr Thr Thr Gln Lys Gly Ser Asn Pro Ala Pro Thr 245 250 255

Ser Thr Gly Thr Val Pro Ser Ser Ser Ala Gly Gly Ser Thr Ala Asn 260 265 270

Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His Lys Gly Val 275 280 285

Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr Gly Gly Asn 290 295 300

Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn 305 310 315 320

Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe 325 330 335

Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly Leu Asp Tyr 340 345 350

Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn Ser Arg Leu 355 360 365

Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly Val Pro Leu 370 375 380

Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala 385 390 395 400

Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys Ile Phe Gln 405 410 415

Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu Thr Phe Lys 420 425 430

Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly His Ile Thr 435 440 445

Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp Gly Ile Gly 450 455 460

Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln Ser Ser Gly 465 470 475 480

Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro Lys Gly Ser 485 490 495

Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg Thr Thr Thr 500 505 510

Arg Thr Lys Ser Leu Pro Thr Asn Tyr Asn Lys Cys Ser Ala Arg Ile 515 520 525

Thr Ala Gln Gly Tyr Lys Cys Cys Ser Asp Pro Asn Cys Val Val Tyr 530 535 540

Tyr Thr Asp Glu Asp Gly Thr 545 550

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1725 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 195..1724

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 195..281

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..1725

(D) OTHER INFORMATION: /label= pNX4_insert

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TTTTATTATA TCAATCTCTA ATTTATTTTT TTAGGAAAAA AATAAAAAAA TAAATATAAT 60

AAATATTAGA GAGTAATATT TAAAAACAAA GAAATTTAAA AACGTTTATT TAGTTATTTT 120

TTTTACTGGT TAAAAAAAAA ATAAAAAACA AAATTAATAA AGATATTTTT GAAAAATATT 180

GAATTAGAAA AAAA ATG AGA ACT ATT AAA TTC TTT TTC GCA GTA GCT ATT 230

Met Arg Thr Ile Lys Phe Phe Phe Ala Val Ala Ile 1 5 10

GCA ACT GTT GCT AAG GCC CAA TGG GGT GGA GGT GGT GCC TCT GCT GGT 278 Ala Thr Val Ala Lys Ala Gln Trp Gly Gly Gly Gly Ala Ser Ala Gly 15 20 25

CAA AGA TTA ACC GTC GGT AAT GGT CAA ACC CAA CAT AAG GGT GTA GCT 326 Gln Arg Leu Thr Val Gly Asn Gly Gln Thr Gln His Lys Gly Val Ala 30 35 40

GAT GGT TAC AGT TAT GAA ATC TGG TTA GAT AAC ACC GGT GGT AGT GGT 374 Asp Gly Tyr Ser Tyr Glu Ile Trp Leu Asp Asn Thr Gly Gly Ser Gly 45 50 55 60

TCT ATG ACT CTC GGT AGT GGT GCA ACC TTC AAG GCT GAA TGG AAT GCA 422 Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala 65 70 75

TCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT GGT CTT GAC TTC GGT 470 Ser Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly 80 85 90

TCT CAA AAG AAG GCA ACC GAT TAC AGC TAC ATT GGA TTG GAT TAT ACT 518 Ser Gln Lys Lys Ala Thr Asp Tyr Ser Tyr Ile Gly Leu Asp Tyr Thr 95 100 105

GCA ACT TAC AGA CAA ACT GGT AGC GCA AGT GGT AAC TCC CGT CTC TGT 566 Ala Thr Tyr Arg Gln Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys 110 115 120

GTA TAC GGT TGG TTC CAA AAC CGT GGA GTT CAA GGT GTT CCA TTG GTA 614 Val Tyr Gly Trp Phe Gln Asn Arg Gly Val Gln Gly Val Pro Leu Val 125 130 135 140

GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG GTT CCA GAT GCA CAA 662 Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gln 145 150 155

GGT AGA ATG GTA ACC ATT GAT GGA GCT CAA TAT AAG ATT TTC CAA ATG 710 Gly Arg Met Val Thr Ile Asp Gly Ala Gln Tyr Lys Ile Phe Gln Met 160 165 170

GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT GAA ACC TTT AAG CAA 758 Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu Thr Phe Lys Gln 175 180 185

TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT GGT CAT ATT ACT GTC 806 Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly His Ile Thr Val 190 195 200

TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT TGG GGT ATT GGT AAC 854 Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp Gly Ile Gly Asn 205 210 215 220

CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG CAA AGT AGT GGT ATA 902 Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln Ser Ser Gly Ile 225 230 235

GCT GAT GTC ACC AAG TTA GAT GTT TAC ACA ACC CAA AAA GGT TCT AAT 950 Ala Asp Val Thr Lys Leu Asp Val Tyr Thr Thr Gln Lys Gly Ser Asn 240 245 250

CCT GCC CCT ACC TCC ACT GGT ACT GTT CCA AGC AGT TCT GCT GGT GGA 998 Pro Ala Pro Thr Ser Thr Gly Thr Val Pro Ser Ser Ser Ala Gly Gly 255 260 265

AGT ACT GCC AAT GGT AAA AAG TTT ACT GTC GGT AAT GGA CAA AAC CAA 1046 Ser Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln 270 275 280

CAT AAG GGT GTC AAC GAT GGT TTC AGT TAT GAA ATC TGG TTA GAT AAC 1094 His Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn 285 290 295 300

ACT GGT GGT AAC GGT TCT ATG ACT CTC GGT AGT GGT GCA ACT TTC AAG 1142 Thr Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys 305 310 315

GCT GAA TGG AAT GCA GCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT 1190 Ala Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg 320 325 330

GGT CTT GAC TTC GGT TCT CAA AAG AAG GCA ACC GAT TAC GAC TAC ATT 1238 Gly Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile 335 340 345

GGA TTA GAT TAT GCT GCT ACT TAC AAA CAA ACT GCC AGT GCA AGT GGT 1286 Gly Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly 350 355 360

AAC TCC CGT CTC TGT GTA TAC GGA TGG TTC CAA AAC CGT GGA CTT AAT 1334 Asn Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn 365 370 375 380

GGC GTT CCT TTA GTA GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG 1382 Gly Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp 385 390 395

GTT CCA GAT GCA CAA GGA AAA ATG GTA ACC ATT GAT GGA GCT CAA TAT 1430 Val Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr 400 405 410

AAG ATT TTC CAA ATG GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT 1478 Lys Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser 415 420 425

GAA ACC TTT AAG CAA TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT 1526 Glu Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser 430 435 440

GGT CAT ATT ACT GTC TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT 1574 Gly His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly 445 450 455 460

TGG GGT ATT GGT AAC CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG 1622 Trp Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp 465 470 475

CAA AGT AGT GGT GTT GCT GAT GTC ACC TTA TTA GAT GTT TAC ACA ACT 1670 Gln Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr 480 485 490

CCA AAG GGT TCT AGT CCA GCC ACC TCT GCC GCT CCT CGT ACT ACT ACC 1718 Pro Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr 495 500 505

CGT ACT A 1725

Arg Thr 510

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 510 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Arg Thr Ile Lys Phe Phe Phe Ala Val Ala Ile Ala Thr Val Ala 1 5 10 15

Lys Ala Gln Trp Gly Gly Gly Gly Ala Ser Ala Gly Gln Arg Leu Thr 20 25 30

Val Gly Asn Gly Gln Thr Gln His Lys Gly Val Ala Asp Gly Tyr Ser 35 40 45

Tyr Glu Ile Trp Leu Asp Asn Thr Gly Gly Ser Gly Ser Met Thr Leu 50 55 60

Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala Ser Val Asn Arg 65 70 75 80

Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly Ser Gln Lys Lys 85 90 95

Ala Thr Asp Tyr Ser Tyr Ile Gly Leu Asp Tyr Thr Ala Thr Tyr Arg 100 105 110

Gln Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys Val Tyr Gly Trp 115 120 125

Phe Gln Asn Arg Gly Val Gln Gly Val Pro Leu Val Glu Tyr Tyr Ile 130 135 140

Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gln Gly Arg Met Val 145 150 155 160

Thr Ile Asp Gly Ala Gln Tyr Lys Ile Phe Gln Met Asp His Thr Gly 165 170 175

Pro Thr Ile Asn Gly Gly Ser Glu Thr Phe Lys Gln Tyr Phe Ser Val 180 185 190

Arg Gln Gln Lys Arg Thr Ser Gly His Ile Thr Val Ser Asp His Phe 195 200 205

Lys Glu Trp Ala Lys Gln Gly Trp Gly Ile Gly Asn Leu Tyr Glu Val 210 215 220

Ala Leu Asn Ala Glu Gly Trp Gln Ser Ser Gly Ile Ala Asp Val Thr 225 230 235 240

Lys Leu Asp Val Tyr Thr Thr Gln Lys Gly Ser Asn Pro Ala Pro Thr 245 250 255

Ser Thr Gly Thr Val Pro Ser Ser Ser Ala Gly Gly Ser Thr Ala Asn 260 265 270

Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His Lys Gly Val 275 280 285

Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr Gly Gly Asn 290 295 300

Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn 305 310 315 320

Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe 325 330 335

Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly Leu Asp Tyr 340 345 350

Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn Ser Arg Leu 355 360 365

Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly Val Pro Leu 370 375 380

Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala 385 390 395 400

Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys Ile Phe Gln 405 410 415

Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu Thr Phe Lys 420 425 430

Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly His Ile Thr 435 440 445

Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp Gly Ile Gly 450 455 460

Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln Ser Ser Gly 465 470 475 480

Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro Lys Gly Ser 485 490 495

Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg Thr 500 505 510

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 724 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..723

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..724

(D) OTHER INFORMATION: /label= pNX5_insert

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ACT GCC AAT GGT AAA AAG TTT ACT GTC GGT AAT GGA CAA AAC CAA CAT 48 Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His 1 5 10 15

AAG GGT GTC AAC GAT GGT TTC AGT TAT GAA ATC TGG TTA GAT AAC ACT 96 Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr 20 25 30

GGT GGT AAC GGT TCT ATG ACT CTC GGT AGT GGT GCA ACT TTC AAG GCT 144 Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala 35 40 45

GAA TGG AAT GCA GCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT GGT 192 Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly 50 55 60

CTT GAC TTC GGT TCT CAA AAG AAG GCA ACC GAT TAC GAC TAC ATT GGA 240 Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly 65 70 75 80

TTA GAT TAT GCT GCT ACT TAC AAA CAA ACT GCC AGT GCA AGT GGT AAC 288 Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn 85 90 95

TCC CGT CTC TGT GTA TAC GGA TGG TTC CAA AAC CGT GGA CTT AAT GGC 336 Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly 100 105 110

GTT CCT TTA GTA GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG GTT 384 Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val 115 120 125

CCA GAT GCA CAA GGA AAA ATG GTA ACC ATT GAT GGA GCT CAA TAT AAG 432 Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys 130 135 140

ATT TTC CAA ATG GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT GAA 480 Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu 145 150 155 160

ACC TTT AAG CAA TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT GGT 528 Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly 165 170 175

CAT ATT ACT GTC TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT TGG 576 His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp 180 185 190

GGT ATT GGT AAC CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG CAA 624 Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln 195 200 205

AGT AGT GGT GTT GCT GAT GTC ACC TTA TTA GAT GTT TAC ACA ACT CCA 672 Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro 210 215 220

AAG GGT TCT AGT CCA GCC ACC TCT GCC GCT CCT CGT ACT ACT ACC CGT 720 Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg 225 230 235 240

ACT A 724 Thr

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 241 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His 1 5 10 15

Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr 20 25 30

Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala 35 40 45

Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly 50 55 60

Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly 65 70 75 80

Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn 85 90 95

Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly 100 105 110

Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val 115 120 125

Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys 130 135 140

Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu 145 150 155 160

Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly 165 170 175

His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp 180 185 190

Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln 195 200 205

Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro 210 215 220

Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg 225 230 235 240

Thr

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1001 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 195..1001

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 195..281

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..1001

(D) OTHER INFORMATION: /label= pNX6_insert

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TTTTATTATA TCAATCTCTA ATTTATTTTT TTAGGAAAAA AATAAAAAAA TAAATATAAT 60

AAATATTAGA GAGTAATATT TAAAAACAAA GAAATTTAAA AACGTTTATT TAGTTATTTT 120

TTTTACTGGT TAAAAAAAAA ATAAAAAACA AAATTAATAA AGATATTTTT GAAAAATATT 180

GAATTAGAAA AAAA ATG AGA ACT ATT AAA TTC TTT TTC GCA GTA GCT ATT 230

Met Arg Thr Ile Lys Phe Phe Phe Ala Val Ala Ile 1 5 10

GCA ACT GTT GCT AAG GCC CAA TGG GGT GGA GGT GGT GCC TCT GCT GGT 278 Ala Thr Val Ala Lys Ala Gln Trp Gly Gly Gly Gly Ala Ser Ala Gly 15 20 25

CAA AGA TTA ACC GTC GGT AAT GGT CAA ACC CAA CAT AAG GGT GTA GCT 326 Gln Arg Leu Thr Val Gly Asn Gly Gln Thr Gln His Lys Gly Val Ala 30 35 40

GAT GGT TAC AGT TAT GAA ATC TGG TTA GAT AAC ACC GGT GGT AGT GGT 374 Asp Gly Tyr Ser Tyr Glu Ile Trp Leu Asp Asn Thr Gly Gly Ser Gly 45 50 55 60

TCT ATG ACT CTC GGT AGT GGT GCA ACC TTC AAG GCT GAA TGG AAT GCA 422 Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala 65 70 75

TCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT GGT CTT GAC TTC GGT 470 Ser Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly 80 85 90

TCT CAA AAG AAG GCA ACC GAT TAC AGC TAC ATT GGA TTG GAT TAT ACT 518 Ser Gln Lys Lys Ala Thr Asp Tyr Ser Tyr Ile Gly Leu Asp Tyr Thr 95 100 105

GCA ACT TAC AGA CAA ACT GGT AGC GCA AGT GGT AAC TCC CGT CTC TGT 566 Ala Thr Tyr Arg Gln Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys 110 115 120

GTA TAC GGT TGG TTC CAA AAC CGT GGA GTT CAA GGT GTT CCA TTG GTA 614 Val Tyr Gly Trp Phe Gln Asn Arg Gly Val Gln Gly Val Pro Leu Val 125 130 135 140

GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG GTT CCA GAT GCA CAA 662 Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gln 145 150 155

GGT AGA ATG GTA ACC ATT GAT GGA GCT CAA TAT AAG ATT TTC CAA ATG 710 Gly Arg Met Val Thr Ile Asp Gly Ala Gln Tyr Lys Ile Phe Gln Met 160 165 170

GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT GAA ACC TTT AAG CAA 758 Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu Thr Phe Lys Gln 175 180 185

TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT GGT CAT ATT ACT GTC 806 Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly His Ile Thr Val 190 195 200

TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT TGG GGT ATT GGT AAC 854 Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp Gly Ile Gly Asn 205 210 215 220

CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG CAA AGT AGT GGT ATA 902 Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln Ser Ser Gly Ile 225 230 235

GCT GAT GTC ACC AAG TTA GAT GTT TAC ACA ACC CAA AAA GGT TCT AAT 950 Ala Asp Val Thr Lys Leu Asp Val Tyr Thr Thr Gln Lys Gly Ser Asn 240 245 250

CCT GCC CCT ACC TCC ACT GGT ACT GTT CCA AGC AGT TCT GCT GGT GGA 998 Pro Ala Pro Thr Ser Thr Gly Thr Val Pro Ser Ser Ser Ala Gly Gly 255 260 265

AGT 1001 Ser

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 269 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Arg Thr Ile Lys Phe Phe Phe Ala Val Ala Ile Ala Thr Val Ala 1 5 10 15

Lys Ala Gln Trp Gly Gly Gly Gly Ala Ser Ala Gly Gln Arg Leu Thr 20 25 30

Val Gly Asn Gly Gln Thr Gln His Lys Gly Val Ala Asp Gly Tyr Ser 35 40 45

Tyr Glu Ile Trp Leu Asp Asn Thr Gly Gly Ser Gly Ser Met Thr Leu 50 55 60

Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala Ser Val Asn Arg 65 70 75 80

Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly Ser Gln Lys Lys 85 90 95

Ala Thr Asp Tyr Ser Tyr Ile Gly Leu Asp Tyr Thr Ala Thr Tyr Arg 100 105 110

Gln Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys Val Tyr Gly Trp 115 120 125

Phe Gln Asn Arg Gly Val Gln Gly Val Pro Leu Val Glu Tyr Tyr Ile 130 135 140

Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gln Gly Arg Met Val 145 150 155 160

Thr Ile Asp Gly Ala Gln Tyr Lys Ile Phe Gln Met Asp His Thr Gly 165 170 175

Pro Thr Ile Asn Gly Gly Ser Glu Thr Phe Lys Gln Tyr Phe Ser Val 180 185 190

Arg Gln Gln Lys Arg Thr Ser Gly His Ile Thr Val Ser Asp His Phe 195 200 205

Lys Glu Trp Ala Lys Gln Gly Trp Gly Ile Gly Asn Leu Tyr Glu Val 210 215 220

Ala Leu Asn Ala Glu Gly Trp Gln Ser Ser Gly Ile Ala Asp Val Thr 225 230 235 240

Lys Leu Asp Val Tyr Thr Thr Gln Lys Gly Ser Asn Pro Ala Pro Thr 245 250 255

Ser Thr Gly Thr Val Pro Ser Ser Ser Ala Gly Gly Ser 260 265

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 690 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 195..689

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 195..281

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..690

(D) OTHER INFORMATION: /label= pNX7_insert

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TTTTATTATA TCAATCTCTA ATTTATTTTT TTAGGAAAAA AATAAAAAAA TAAATATAAT 60

AAATATTAGA GAGTAATATT TAAAAACAAA GAAATTTAAA AACGTTTATT TAGTTATTTT 120

TTTTACTGGT TAAAAAAAAA ATAAAAAACA AAATTAATAA AGATATTTTT GAAAAATATT 180

GAATTAGAAA AAAA ATG AGA ACT ATT AAA TTC TTT TTC GCA GTA GCT ATT 230 Met Arg Thr Ile Lys Phe Phe Phe Ala Val Ala Ile 1 5 10

GCA ACT GTT GCT AAG GCC CAA TGG GGT GGA GGT GGT GCC TCT GCT GGT 278 Ala Thr Val Ala Lys Ala Gln Trp Gly Gly Gly Gly Ala Ser Ala Gly 15 20 25

CAA AGA TTA ACC GTC GGT AAT GGT CAA ACC CAA CAT AAG GGT GTA GCT 326 Gln Arg Leu Thr Val Gly Asn Gly Gln Thr Gln His Lys Gly Val Ala 30 35 40

GAT GGT TAC AGT TAT GAA ATC TGG TTA GAT AAC ACC GGT GGT AGT GGT 374 Asp Gly Tyr Ser Tyr Glu Ile Trp Leu Asp Asn Thr Gly Gly Ser Gly 45 50 55 60

TCT ATG ACT CTC GGT AGT GGT GCA ACC TTC AAG GCT GAA TGG AAT GCA 422 Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala 65 70 75

TCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT GGT CTT GAC TTC GGT 470 Ser Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly 80 85 90

TCT CAA AAG AAG GCA ACC GAT TAC AGC TAC ATT GGA TTG GAT TAT ACT 518 Ser Gln Lys Lys Ala Thr Asp Tyr Ser Tyr Ile Gly Leu Asp Tyr Thr 95 100 105

GCA ACT TAC AGA CAA ACT GGT AGC GCA AGT GGT AAC TCC CGT CTC TGT 566 Ala Thr Tyr Arg Gln Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys 110 115 120

GTA TAC GGT TGG TTC CAA AAC CGT GGA GTT CAA GGT GTT CCA TTG GTA 614 Val Tyr Gly Trp Phe Gln Asn Arg Gly Val Gln Gly Val Pro Leu Val 125 130 135 140

GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG GTT CCA GAT GCA CAA 662 Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gln 145 150 155

GGT AGA ATG GTA ACC ATT GAT GGA GCT C 690

Gly Arg Met Val Thr Ile Asp Gly Ala 160 165

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 165 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Arg Thr Ile Lys Phe Phe Phe Ala Val Ala Ile Ala Thr Val Ala 1 5 10 15

Lys Ala Gln Trp Gly Gly Gly Gly Ala Ser Ala Gly Gln Arg Leu Thr 20 25 30

Val Gly Asn Gly Gln Thr Gln His Lys Gly Val Ala Asp Gly Tyr Ser 35 40 45

Tyr Glu Ile Trp Leu Asp Asn Thr Gly Gly Ser Gly Ser Met Thr Leu 50 55 60

Gly Ser Gly Ala Thr Phe Lys Ala Glu Trp Asn Ala Ser Val Asn Arg 65 70 75 80

Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly Ser Gln Lys Lys 85 90 95

Ala Thr Asp Tyr Ser Tyr Ile Gly Leu Asp Tyr Thr Ala Thr Tyr Arg 100 105 110

Gln Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys Val Tyr Gly Trp 115 120 125

Phe Gln Asn Arg Gly Val Gln Gly Val Pro Leu Val Glu Tyr Tyr Ile 130 135 140

Ile Glu Asp Trp Val Asp Trp Val Pro Asp Ala Gln Gly Arg Met Val 145 150 155 160

Thr Ile Asp Gly Ala 165

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1337 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1014

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..1337

(D) OTHER INFORMATION: /label= pNX8_insert

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ACT GCC AAT GGT AAA AAG TTT ACT GTC GGT AAT GGA CAA AAC CAA CAT 48 Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His 1 5 10 15

AAG GGT GTC AAC GAT GGT TTC AGT TAT GAA ATC TGG TTA GAT AAC ACT 96 Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr 20 25 30

GGT GGT AAC GGT TCT ATG ACT CTC GGT AGT GGT GCA ACT TTC AAG GCT 144 Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala 35 40 45

GAA TGG AAT GCA GCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT GGT 192 Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly 50 55 60

CTT GAC TTC GGT TCT CAA AAG AAG GCA ACC GAT TAC GAC TAC ATT GGA 240 Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly 65 70 75 80

TTA GAT TAT GCT GCT ACT TAC AAA CAA ACT GCC AGT GCA AGT GGT AAC 288 Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn 85 90 95

TCC CGT CTC TGT GTA TAC GGA TGG TTC CAA AAC CGT GGA CTT AAT GGC 336 Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly 100 105 110

GTT CCT TTA GTA GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG GTT 384 Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val 115 120 125

CCA GAT GCA CAA GGA AAA ATG GTA ACC ATT GAT GGA GCT CAA TAT AAG 432 Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys 130 135 140

ATT TTC CAA ATG GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT GAA 480 Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu 145 150 155 160

ACC TTT AAG CAA TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT GGT 528 Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly 165 170 175

CAT ATT ACT GTC TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT TGG 576 His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp 180 185 190

GGT ATT GGT AAC CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG CAA 624 Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln 195 200 205

AGT AGT GGT GTT GCT GAT GTC ACC TTA TTA GAT GTT TAC ACA ACT CCA 672 Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro 210 215 220

AAG GGT TCT AGT CCA GCC ACC TCT GCC GCT CCT CGT ACT ACT ACC CGT 720 Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg 225 230 235 240

ACT ACT ACT CGT ACC AAG TCT CTT CCA ACC AAT TAC AAT AAG TGT TCT 768 Thr Thr Thr Arg Thr Lys Ser Leu Pro Thr Asn Tyr Asn Lys Cys Ser 245 250 255

GCT AGA ATT ACT GCT CAA GGT TAC AAG TGT TGT AGC GAT CCA AAT TGT 816 Ala Arg Ile Thr Ala Gln Gly Tyr Lys Cys Cys Ser Asp Pro Asn Cys 260 265 270

GTT GTT TAC TAC ACT GAT GAG GAT GGT ACC TGG GGT GTT GAA AAC AAC 864 Val Val Tyr Tyr Thr Asp Glu Asp Gly Thr Trp Gly Val Glu Asn Asn 275 280 285

GAC TGG TGT GGT TGT GGT GTT GAA CAA TGT TCT TCC AAG ATC ACT TCT 912 Asp Trp Cys Gly Cys Gly Val Glu Gln Cys Ser Ser Lys Ile Thr Ser 290 295 300

CAA GGT TAC AAG TGT TGT AGC GAT CCA AAT TGC GTT GTT TTC TAC ACT 960 Gln Gly Tyr Lys Cys Cys Ser Asp Pro Asn Cys Val Val Phe Tyr Thr 305 310 315 320

GAT GAC GAT GGT AAA TGG GGT GTT GAA AAC AAC GAC TGG TGT GGT TGT 1008 Asp Asp Asp Gly Lys Trp Gly Val Glu Asn Asn Asp Trp Cys Gly Cys 325 330 335

GGT TTC TAAGCAGTAA AATACTAATT AATAAAAAAT TAAAGAATTA TGAAAAATTT 1064 Gly Phe

AAATTTAAAA ATTTAAAAGA ATTATGAAAA ATTTAAATTT AAAAATTTAA AAAAAACTAA 1124

TTTAGTAAAA AATTAAAGAA TTATTGAAAA TTTTAAATGT AAAAATTTAA AAAATACAAA 1184

TTTGTAAAAA AAAATGAAAG AATTATGAAA AATTAAAATG TAAAAGTTTA AAAAATACAA 1244

ATTTGTAAGA AAAATAAAGA ATTATAAAAA AAATAAAGAA TTATGAAAAA CCCAAATGTA 1304

AAGAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 1337

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 338 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His 1 5 10 15

Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr 20 25 30

Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala 35 40 45

Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly 50 55 60

Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly 65 70 75 80

Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn 85 90 95

Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly 100 105 110

Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val 115 120 125

Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys 130 135 140

Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu 145 150 155 160

Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly 165 170 175

His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp 180 185 190

Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln 195 200 205

Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro 210 215 220

Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg 225 230 235 240

Thr Thr Thr Arg Thr Lys Ser Leu Pro Thr Asn Tyr Asn Lys Cys Ser 245 250 255

Ala Arg Ile Thr Ala Gln Gly Tyr Lys Cys Cys Ser Asp Pro Asn Cys 260 265 270

Val Val Tyr Tyr Thr Asp Glu Asp Gly Thr Trp Gly Val Glu Asn Asn 275 280 285

Asp Trp Cys Gly Cys Gly Val Glu Gln Cys Ser Ser Lys Ile Thr Ser 290 295 300

Gln Gly Tyr Lys Cys Cys Ser Asp Pro Asn Cys Val Val Phe Tyr Thr 305 310 315 320

Asp Asp Asp Gly Lys Trp Gly Val Glu Asn Asn Asp Trp Cys Gly Cys 325 330 335

Gly Phe

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 846 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..846

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..846

(D) OTHER INFORMATION: /label= pNX9_insert

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ACT GCC AAT GGT AAA AAG TTT ACT GTC GGT AAT GGA CAA AAC CAA CAT 48 Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His 1 5 10 15

AAG GGT GTC AAC GAT GGT TTC AGT TAT GAA ATC TGG TTA GAT AAC ACT 96 Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr 20 25 30

GGT GGT AAC GGT TCT ATG ACT CTC GGT AGT GGT GCA ACT TTC AAG GCT 144 Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala 35 40 45

GAA TGG AAT GCA GCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT GGT 192 Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly 50 55 60

CTT GAC TTC GGT TCT CAA AAG AAG GCA ACC GAT TAC GAC TAC ATT GGA 240 Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly 65 70 75 80

TTA GAT TAT GCT GCT ACT TAC AAA CAA ACT GCC AGT GCA AGT GGT AAC 288 Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn 85 90 95

TCC CGT CTC TGT GTA TAC GGA TGG TTC CAA AAC CGT GGA CTT AAT GGC 336 Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly 100 105 110

GTT CCT TTA GTA GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG GTT 384 Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val 115 120 125

CCA GAT GCA CAA GGA AAA ATG GTA ACC ATT GAT GGA GCT CAA TAT AAG 432 Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys 130 135 140

ATT TTC CAA ATG GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT GAA 480 Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu 145 150 155 160

ACC TTT AAG CAA TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT GGT 528 Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly 165 170 175

CAT ATT ACT GTC TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT TGG 576 His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp 180 185 190

GGT ATT GGT AAC CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG CAA 624 Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln 195 200 205

AGT AGT GGT GTT GCT GAT GTC ACC TTA TTA GAT GTT TAC ACA ACT CCA 672 Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro 210 215 220

AAG GGT TCT AGT CCA GCC ACC TCT GCC GCT CCT CGT ACT ACT ACC CGT 720 Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg 225 230 235 240

ACT ACT ACT CGT ACC AAG TCT CTT CCA ACC AAT TAC AAT AAG TGT TCT 768 Thr Thr Thr Arg Thr Lys Ser Leu Pro Thr Asn Tyr Asn Lys Cys Ser 245 250 255

GCT AGA ATT ACT GCT CAA GGT TAC AAG TGT TGT AGC GAT CCA AAT TGT 816 Ala Arg Ile Thr Ala Gln Gly Tyr Lys Cys Cys Ser Asp Pro Asn Cys 260 265 270

GTT GTT TAC TAC ACT GAT GAG GAT GGT ACC 846

Val Val Tyr Tyr Thr Asp Glu Asp Gly Thr 275 280

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 282 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His 1 5 10 15

Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr 20 25 30

Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala 35 40 45

Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly 50 55 60

Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly 65 70 75 80

Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn 85 90 95

Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly 100 105 110

Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val 115 120 125

Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys 130 135 140

Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu 145 150 155 160

Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly 165 170 175

His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp 180 185 190

Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln 195 200 205

Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro 210 215 220

Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg 225 230 235 240

Thr Thr Thr Arg Thr Lys Ser Leu Pro Thr Asn Tyr Asn Lys Cys Ser 245 250 255

Ala Arg Ile Thr Ala Gln Gly Tyr Lys Cys Cys Ser Asp Pro Asn Cys 260 265 270

Val Val Tyr Tyr Thr Asp Glu Asp Gly Thr 275 280

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 708 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..708

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..708

(D) OTHER INFORMATION: /label= pNX10_insert

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

ACT GCC AAT GGT AAA AAG TTT ACT GTC GGT AAT GGA CAA AAC CAA CAT 48 Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His 1 5 10 15

AAG GGT GTC AAC GAT GGT TTC AGT TAT GAA ATC TGG TTA GAT AAC ACT 96 Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr 20 25 30

GGT GGT AAC GGT TCT ATG ACT CTC GGT AGT GGT GCA ACT TTC AAG GCT 144 Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala 35 40 45

GAA TGG AAT GCA GCT GTT AAC CGT GGT AAC TTC CTT GCC CGT CGT GGT 192 Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly 50 55 60

CTT GAC TTC GGT TCT CAA AAG AAG GCA ACC GAT TAC GAC TAC ATT GGA 240 Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly 65 70 75 80

TTA GAT TAT GCT GCT ACT TAC AAA CAA ACT GCC AGT GCA AGT GGT AAC 288 Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn 85 90 95

TCC CGT CTC TGT GTA TAC GGA TGG TTC CAA AAC CGT GGA CTT AAT GGC 336 Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly 100 105 110

GTT CCT TTA GTA GAA TAC TAC ATC ATT GAA GAT TGG GTT GAC TGG GTT 384 Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val 115 120 125

CCA GAT GCA CAA GGA AAA ATG GTA ACC ATT GAT GGA GCT CAA TAT AAG 432 Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys 130 135 140

ATT TTC CAA ATG GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT GAA 480 Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu 145 150 155 160

ACC TTT AAG CAA TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT GGT 528 Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly 165 170 175

CAT ATT ACT GTC TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT TGG 576 His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp 180 185 190

GGT ATT GGT AAC CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG CAA 624 Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln 195 200 205

AGT AGT GGT GTT GCT GAT GTC ACC TTA TTA GAT GTT TAC ACA ACT CCA 672 Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro 210 215 220

AAG GGT TCT AGT CCA GCC ACC TCT GCC GCT CCT CGT 708 Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg 225 230 235

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 236 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His 1 5 10 15

Lys Gly Val Asn Asp Gly Phe Ser Tyr Glu Ile Trp Leu Asp Asn Thr 20 25 30

Gly Gly Asn Gly Ser Met Thr Leu Gly Ser Gly Ala Thr Phe Lys Ala 35 40 45

Glu Trp Asn Ala Ala Val Asn Arg Gly Asn Phe Leu Ala Arg Arg Gly 50 55 60

Leu Asp Phe Gly Ser Gln Lys Lys Ala Thr Asp Tyr Asp Tyr Ile Gly 65 70 75 80

Leu Asp Tyr Ala Ala Thr Tyr Lys Gln Thr Ala Ser Ala Ser Gly Asn 85 90 95

Ser Arg Leu Cys Val Tyr Gly Trp Phe Gln Asn Arg Gly Leu Asn Gly 100 105 110

Val Pro Leu Val Glu Tyr Tyr Ile Ile Glu Asp Trp Val Asp Trp Val 115 120 125

Pro Asp Ala Gln Gly Lys Met Val Thr Ile Asp Gly Ala Gln Tyr Lys 130 135 140

Ile Phe Gln Met Asp His Thr Gly Pro Thr Ile Asn Gly Gly Ser Glu 145 150 155 160

Thr Phe Lys Gln Tyr Phe Ser Val Arg Gln Gln Lys Arg Thr Ser Gly 165 170 175

His Ile Thr Val Ser Asp His Phe Lys Glu Trp Ala Lys Gln Gly Trp 180 185 190

Gly Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Glu Gly Trp Gln 195 200 205

Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro 210 215 220

Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg 225 230 235

SUMMARY OF SEQUENCE LISTINGS

pNX1 DNA and coding region
Protein sequence of SEQ ID NO
pNX3 DNA and coding region
Protein sequence of SEQ ID NO
pNX4 DNA and coding region
Protein sequence of SEQ ID NO
pNX5 DNA and coding region
Protein sequence of SEQ ID NO
pNX6 DNA and coding region
Protein sequence of SEQ ID NO
pNX7 DNA and coding region
Protein sequence of SEQ ID NO 11
pNX8 DNA and coding region
Protein sequence of SEQ ID NO 13
pNX9 DNA and coding region
Protein sequence of SEQ ID NO 15
pNX10 DNA and coding region
Protein sequence of SEQ ID NO 17