

Title:
DSZ GENE EXPRESSION IN PSEUDOMONAS HOSTS
Document Type and Number:
WIPO Patent Application WO/1998/045447
Kind Code:
A1
Abstract:
The present invention provides novel recombinant pseudomonads which contain a heterologous nucleic acid molecule comprising a nucleotide sequence encoding one or more desulfurization enzymes which are components of a biodesulfurization catalyst. The invention also provides a method of desulfurizing a carbonaceous material, such as a fossil fuel, which comprises organosulfur compounds. The method comprises the steps of (1) contacting the fossil fuel with an aqueous phase containing a recombinant biocatalyst which is capable of cleaving carbon-sulfur bonds and, optionally, a flavoprotein, thereby forming a fossil fuel and aqueous phase mixture; (2) maintaining the mixture under conditions sufficient for cleavage of the carbon-sulfur bonds of the organosulfur molecules by the biocatalyst, thereby resulting in a fossil fuel having a reduced organic sulfur content; and (3) separating the fossil fuel having a reduced organic sulfur content from the resulting aqueous phase.

Inventors:
DARZINS ALDIS (US)
XI LEI (US)
CHILDS JOHN D (US)
MONTICELLO DANIEL J (US)
SQUIRES CHARLES H (US)
Application Number:
PCT/US1998/006691
Publication Date:
October 15, 1998
Filing Date:
April 03, 1998
Assignee:
ENERGY BIOSYSTEMS CORP (US)
DARZINS ALDIS (US)
XI LEI (US)
CHILDS JOHN D (US)
MONTICELLO DANIEL J (US)
SQUIRES CHARLES H (US)
International Classes:
C10G32/00; C12N1/21; C12N9/02; C12N15/52; C12S1/02; (IPC1-7): C12N15/52; C10G32/00; C12N1/21; C12N9/02; C12P11/00; C12S1/02
Domestic Patent References:
WO1994001563A1, 1994-01-20
WO1996017940A2, 1996-06-13
Foreign References:
EP0218734A1, 1987-04-22
Other References:
GRAY K.A. ET AL.: "Molecular mechanisms of biocatalytic desulfurization of fossil fuels.", NATURE BIOTECHNOLOGY, vol. 14, 14 December 1996 (1996-12-14), pages 1705 - 1709, XP002073201
CONSTANTI M. ET AL.: "Desulphurization of dibenzothiophene by bacteria", WORLD J. OF MICROBIOLOGY & BIOTECHNOLOGY, vol. 10, no. 5, 1994, pages 510 - 516, XP002072523
KLUBEK B. ET AL.: "Characterization of soil bacteria that desulfurize organic sulfur compounds. 1. Classification and growth studies.", MICROBIOS, vol. 88, no. 357, 1996, pages 223 - 236, XP002072524
LABES M. ET AL.: "A new family of RSF1010-derived expression and lac-fusion broad-host-range vectors for Gram-negative bacteria", GENE, vol. 89, no. 1, 30 April 1990 (1990-04-30), pages 37 - 46, XP002072525
GALLARDO M.E. ET AL.: "Designing recombinant Pseudomonas strains to enhance biodesulphurization", J. BACTERIOL., vol. 179, no. 22, November 1997 (1997-11-01), pages 7156 - 7160, XP002072526
DARZINS A. ET AL.: "Expression of the desulfurization (dsz) genes from Rhodococcus erythropolis IGTS8 in heterologous bacterial hosts", ABSTRACTS OF THE 97th ASM MEETING, Miami Beach, 4-8 May 1997, Abstr. O-105
Attorney, Agent or Firm:
Elmore, Carolyn S. (Brook Smith & Reynolds, P.C., Two Militia Drive, Lexington MA, US)
Claims:
CLAIMS We claim:
1. A recombinant pseudomonad comprising a heterologous nucleic acid molecule which encodes one or more desulfurization enzymes.
2. The recombinant pseudomonad of Claim 1, wherein the pseudomonad is selected from the group consisting of Pseudomonas fluorescens, Burkholderia cepacia, Comamonas testosteroni, Pseudomonas aeruginosa, Pseudomonas aureofaciens, Pseudomonas alcaligenes, Pseudomonas chlororaphis, Pseudomonas denitrificans, Pseudomonas fluorescens, Pseudomonas mendocina, Pseudomonas oleovorans, Pseudomonas putida, Pseudomonas stutzeri, and Sphingomonas paucimobilis.
3. The recombinant pseudomonad of Claim 2 wherein the host organism is Pseudomonas fluorescens ATCC 13525 or Pseudomonas fluorescens NCIB 11764.
4. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule encodes one or more enzymes presented in SEQ ID NO.: 2, SEQ ID NO.: 4 and SEQ ID NO.: 6, mutant, homologue or fragment thereof.
5. The recombinant pseudomonad of Claim 4 wherein the heterologous nucleic acid molecule comprises at least one of the nucleotide sequences set forth in SEQ ID NO.: 1, SEQ ID NO.: 3 and SEQ ID NO.: 5.
6. The recombinant pseudomonad of Claim 4 wherein the heterologous nucleic acid molecule comprises a nucleotide sequence which is mutated from the nucleotide sequence set forth in SEQ ID NO.: 1, SEQ ID NO.: 3, or SEQ ID NO.: 5 by the replacement of one or more codons with a degenerate codon.
7. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule encodes at least one of the enzymes having an amino acid sequence set forth in SEQ ID NO.: 8, SEQ ID NO.: 10 or SEQ ID NO.: 12; or a fragment thereof.
8. The recombinant pseudomonad of Claim 7 wherein the heterologous nucleic acid molecule includes one or more of the nucleotide sequences set forth in SEQ ID NO.: 7, SEQ ID NO.: 9, and SEQ ID NO.: 11.
9. The recombinant pseudomonad of Claim 7 wherein the heterologous nucleic acid molecule comprises a nucleotide sequence which is mutated from the nucleotide sequence set forth in SEQ ID NO.: 1, SEQ ID NO.: 3, or SEQ ID NO.: 5 by the replacement of one or more codons with a degenerate codon.
10. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule is of Rhodococcus origin.
11. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule is of Sphingomonas origin.
12. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule encodes a desulfurization enzyme which is a mutation of the enzyme set forth in SEQ ID NO.: 2, SEQ ID NO.: 4, or SEQ ID NO.: 6.
13. The recombinant pseudomonad of Claim 12 wherein the heterologous nucleic acid molecule encodes an enzyme having an amino acid sequence which is mutated from the amino acid sequence set forth in SEQ ID NO.: 2, SEQ ID NO.: 4 or SEQ ID NO.: 6 by conservative substitution of one or more amino acid residues.
14. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule encodes a desulfurization enzyme which is a mutation of an enzyme set forth in SEQ ID NO.: 8, SEQ ID NO.: 10 or SEQ ID NO.: 12.
15. The recombinant pseudomonad of Claim 14 wherein the heterologous nucleic acid molecule encodes an enzyme having an amino acid sequence which is mutated from the amino acid sequence set forth in SEQ ID NO.: 8, SEQ ID NO.: 10 or SEQ ID NO.: 12 by conservative substitution of one or more amino acid residues.
16. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule encodes an amino acid sequence comprising the amino acid residues which are conserved in SEQ ID NO.: 2 and SEQ ID NO.: 8 in an amino acid alignment.
17. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule encodes an amino acid sequence which comprises the amino acid residues which are conserved in SEQ ID NO.: 4 and SEQ ID NO.: 10 in an amino acid alignment.
18. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule encodes an amino acid sequence comprising the amino acid residues which are conserved in SEQ ID NO.: 6 and SEQ ID NO.: 12 in an amino acid alignment.
19. The recombinant pseudomonad of Claim 1 wherein the heterologous nucleic acid molecule further encodes a flavoprotein.
20. The recombinant pseudomonad of Claim 19 wherein the flavoprotein is a flavin reductase.
21. A method for desulfurizing a carbonaceous material which includes organosulfur compounds, comprising the steps of: (a) contacting the carbonaceous material with an aqueous phase containing a recombinant pseudomonad biocatalyst comprising at least one heterologous enzyme capable of catalyzing at least one step in the oxidative cleavage of carbon-sulfur bonds, thereby forming a carbonaceous material and aqueous phase mixture; (b) maintaining the mixture of step (a) under conditions sufficient for biocatalysis, thereby resulting in a carbonaceous material having a reduced organic sulfur content; and (c) separating the carbonaceous material having a reduced organic sulfur content from the resulting aqueous phase.
22. The method of Claim 21 wherein the recombinant pseudomonad biocatalyst is a recombinant pseudomonad containing a heterologous nucleic acid molecule which encodes one or more desulfurization enzymes; or an enzyme preparation derived therefrom.
23. The method of Claim 22 wherein the recombinant pseudomonad is selected from the group consisting of Pseudomonas fluorescens, Burkholderia cepacia, Comamonas testosteroni, Pseudomonas aeruginosa, Pseudomonas aureofaciens, Pseudomonas alcaligenes, Pseudomonas chlororaphis, Pseudomonas denitrificans, Pseudomonas fluorescens, Pseudomonas mendocina, Pseudomonas oleovorans, Pseudomonas putida, Pseudomonas stutzeri, and Sphingomonas paucimobilis.
24. The method of Claim 23 wherein the pseudomonad is Pseudomonas fluorescens NCIB 11764 or Pseudomonas fluorescens ATCC 13525.
25. The method of Claim 21 wherein the carbonaceous material is a fossil fuel.
26. The method of Claim 25 wherein the fossil fuel is petroleum or a petroleum distillate fraction.
27. The method of Claim 22 wherein the heterologous nucleic acid molecule encodes one or more of the enzymes presented in SEQ ID NO.: 2, SEQ ID NO.: 4 and SEQ ID NO.: 6.
28. The method of Claim 27 wherein the heterologous nucleic acid molecule includes one or more of the nucleotide sequences set forth in SEQ ID NO.: 1, SEQ ID NO.: 3, and SEQ ID NO.: 5.
29. The method of Claim 27 wherein the heterologous nucleic acid molecule comprises a nucleotide sequence which is mutated from the nucleotide sequence set forth in SEQ ID NO.: 1, SEQ ID NO.: 3, or the full coding region corresponding to SEQ ID NO.: 5 by the replacement of one or more codons with a degenerate codon.
30. The method of Claim 22 wherein the heterologous nucleic acid molecule encodes one or more of the enzymes set forth in SEQ ID NO.: 8, SEQ ID NO.: 10 and SEQ ID NO.: 12.
31. The method of Claim 30 wherein the heterologous nucleic acid molecule comprises at least one of the nucleotide sequences set forth in SEQ ID NO.: 7, SEQ ID NO.: 9 and SEQ ID NO.: 11.
32. The method of Claim 30 wherein the heterologous nucleic acid molecule comprises a nucleotide sequence which is mutated from the nucleotide sequence set forth in SEQ ID NO.: 7, SEQ ID NO.: 9, or SEQ ID NO.: 11 by the replacement of one or more codons with a degenerate codon.
33. The method of Claim 22 wherein the heterologous nucleic acid molecule encodes an enzyme which is a mutant of the enzyme set forth in SEQ ID NO.: 2, SEQ ID NO.: 4 or SEQ ID NO.: 6.
34. The method of Claim 33 wherein the heterologous nucleic acid molecule encodes an enzyme which is mutated from the enzyme set forth in SEQ ID NO.: 2, SEQ ID NO.: 4 or SEQ ID NO.: 6 by conservative substitution of one or more amino acid residues.
35. The method of Claim 22 wherein the heterologous nucleic acid molecule encodes an enzyme which is a mutant of the enzyme set forth in SEQ ID NO.: 8 or SEQ ID NO.: 10, or the enzyme having the N-terminal sequence set forth in SEQ ID NO.: 12.
36. The method of Claim 35 wherein the heterologous nucleic acid molecule encodes an enzyme which is mutated from the enzyme set forth in SEQ ID NO.: 8 or SEQ ID NO.: 10 or an enzyme having an N-terminal sequence set forth in SEQ ID NO.: 12 by conservative substitution of one or more amino acid residues.
37. The method of Claim 22 wherein the heterologous nucleic acid molecule encodes an enzyme having an amino acid sequence comprising the amino acid residues which are conserved in SEQ ID NO.: 2 and SEQ ID NO.: 8 in an amino acid alignment.
38. The method of Claim 22 wherein the heterologous nucleic acid molecule encodes an enzyme having an amino acid sequence comprising the amino acid residues which are conserved in SEQ ID NO.: 4 and SEQ ID NO.: 10 in an amino acid alignment.
39. The method of Claim 22 wherein the heterologous nucleic acid molecule encodes an enzyme having an amino acid sequence comprising the amino acid residues which are conserved in SEQ ID NO.: 6 and SEQ ID NO.: 12 in an amino acid alignment.
40. The method of Claim 22 wherein the heterologous nucleic acid molecule further encodes a flavoprotein.
41. The method of Claim 40 wherein the flavoprotein is a flavin reductase.
42. The method of Claim 22 wherein the heterologous nucleic acid molecule is of Rhodococcus origin.
43. The method of Claim 42 wherein the heterologous nucleic acid molecule is isolated from Rhodococcus sp. strain IGTS8.
44. The method of Claim 22 wherein the heterologous nucleic acid molecule is of Sphingomonas origin.
45. The method of Claim 44 wherein the heterologous nucleic acid molecule is derived from Sphingomonas sp. strain AD109.
46. A method of oxidizing an organic compound, comprising the steps of: (a) contacting the organic compound with an aqueous phase containing a recombinant pseudomonad biocatalyst capable of oxidizing organosulfur compounds, thereby forming an organosulfur compound and aqueous phase mixture; and (b) maintaining the mixture under conditions sufficient for oxidation of the organic compound by the biocatalyst, thereby forming an oxidized organosulfur compound.
47. The method of Claim 46 wherein the biocatalyst comprises the enzyme set forth in SEQ ID NO.: 2, a mutant or active fragment thereof; the enzyme set forth in SEQ ID NO.: 6, a mutant or active fragment thereof; or a combination thereof.
48. The method of Claim 46 wherein the biocatalyst comprises the enzyme set forth in SEQ ID NO.: 8, a mutant or active fragment thereof; an enzyme having the N-terminal sequence set forth in SEQ ID NO.: 12, a mutant or active fragment thereof; or a combination thereof.
49. The method of Claim 46 wherein the organic compound is an organosulfur compound which is a component of a fossil fuel.
50. The method of Claim 46 wherein the organosulfur compound is a substituted or unsubstituted dibenzothiophene.
51. The method of Claim 46 wherein the organosulfur compound is a substituted or unsubstituted dibenzothiophene-5,5-dioxide.
52. The method of Claim 46 wherein the recombinant pseudomonad contains a heterologous nucleic acid molecule which is derived from Sphingomonas sp. strain AD109.
53. The method of Claim 46 wherein the recombinant pseudomonad contains a heterologous nucleic acid molecule which is derived from Rhodococcus sp. strain IGTS8.
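By way of illustration only, the codon replacement recited in Claims 6, 9, 29 and 32 can be sketched in Python. The sketch assumes the standard genetic code. The function names (`translate`, `is_degenerate_variant`) and the nine-base example sequences are hypothetical and are not taken from any SEQ ID NO. of this application. The check treats a variant as a degenerate-codon variant when its nucleotide sequence differs from the original while the encoded amino acid sequence is unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (third base varying fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(original: str, variant: str) -> bool:
    """True if variant differs from original only by synonymous (degenerate) codons."""
    return (
        len(original) == len(variant)
        and original.upper() != variant.upper()
        and translate(original) == translate(variant)
    )


# Hypothetical toy sequences: GCT -> GCC (both Ala) and GGA -> GGT (both Gly).
original = "ATGGCTGGA"
synonymous = "ATGGCCGGT"
# GCT -> GAT changes Ala to Asp, so this is not a degenerate-codon variant.
nonsynonymous = "ATGGATGGA"
```

Here `is_degenerate_variant(original, synonymous)` is true, while `is_degenerate_variant(original, nonsynonymous)` is false because the encoded protein changes.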
Description:
DSZ GENE EXPRESSION IN PSEUDOMONAS HOSTS

RELATED APPLICATIONS

This is a continuation-in-part application of Serial No. 08/851,088, filed May 5, 1997, which is a continuation-in-part application of Serial No. 08/835,185, filed April 7, 1997, the contents of which are incorporated herein by reference in their entirety.

BACKGROUND OF THE INVENTION

The microbial desulfurization of fossil fuels has been an area of active investigation for over fifty years. The object of these investigations has been to develop biotechnology-based methods for the pre-combustion removal of sulfur from fossil fuels, such as coal, crude oil and petroleum distillates. The driving forces for the development of desulfurization methods are the increasing levels of sulfur in fossil fuel and the increasingly stringent regulation of sulfur emissions. Monticello et al., "Practical Considerations in Biodesulfurization of Petroleum," IGT's 3d Intl. Symp. on Gas, Oil, Coal and Env. Biotech., (Dec. 3-5, 1990) New Orleans, LA.

Many biocatalysts and processes have been developed to desulfurize fossil fuels, including those described in U.S. Patent Nos. 5,356,801, 5,358,870, 5,358,813, 5,198,341, 5,132,219, 5,344,778, 5,104,801 and 5,002,888, incorporated herein by reference. Economic analyses indicate that one limitation in the commercialization of the technology is the reaction rates and specific activities of the biocatalysts, such as the bacteria and enzymes that are involved in the desulfurization reactions. The reaction rates and specific activities (sulfur removed/hour/gram of biocatalyst) that have been reported in the literature are much lower than those necessary for optimal commercial technology. Therefore, improvements in the longevity and specific activity of the biocatalyst are desirable.
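As a minimal sketch of the specific activity measure referred to above (sulfur removed per hour per gram of biocatalyst), the calculation can be written as follows. The function name, the choice of milligrams for the sulfur mass and the example figures are assumptions made for illustration; they are not values reported in this application or in the literature.

```python
def specific_activity(sulfur_removed_mg: float, hours: float, biocatalyst_g: float) -> float:
    """Specific activity in mg sulfur removed per hour per gram of biocatalyst.

    The unit choice (mg, h, g) is an assumption for illustration only.
    """
    if hours <= 0 or biocatalyst_g <= 0:
        raise ValueError("hours and biocatalyst_g must be positive")
    return sulfur_removed_mg / hours / biocatalyst_g


# Hypothetical run: 12 mg of sulfur removed in 4 hours by 1.5 g of biocatalyst.
example = specific_activity(12.0, 4.0, 1.5)  # 2.0 mg S / h / g
```

Expressing activity per gram of biocatalyst allows runs with different catalyst loadings and durations to be compared on the same basis.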

SUMMARY OF THE INVENTION

The present invention provides novel recombinant pseudomonads which contain a heterologous nucleic acid molecule comprising a nucleotide sequence encoding one or more desulfurization enzymes which are components of a biodesulfurization catalyst. Such enzymes catalyze one or more steps of a biodesulfurization process, for example, the oxidative cleavage of the carbon-sulfur bonds of an organosulfur compound. The invention also includes a method for producing such recombinant organisms.

In one embodiment, the nucleotide sequence which encodes the desulfurization enzyme(s) is derived from a Rhodococcus organism, such as Rhodococcus sp. IGTS8. In another embodiment, the nucleotide sequence is derived from a Sphingomonas organism, such as Sphingomonas sp. strain AD109.

The recombinant organism can be derived from a host organism which does not contain native genes encoding a biodesulfurization catalyst. Such an organism can be, for example, a bacterium which is a species of Pseudomonas. The recombinant organism can also be derived from a host organism which contains native genes encoding a biodesulfurization catalyst. The invention is predicated upon the discovery that pseudomonad hosts possess advantages over other host cells in the desulfurization of fossil fuels.

In a further embodiment, the invention provides a method of desulfurizing a carbonaceous material, such as a fossil fuel, which comprises organosulfur compounds.

The method includes the steps of (1) contacting the fossil fuel with an aqueous phase containing a recombinant biocatalyst and, optionally, a flavoprotein, thereby forming a fossil fuel and aqueous phase mixture; (2) maintaining the mixture under conditions sufficient for biocatalysis, thereby resulting in a fossil fuel having a reduced organic sulfur content; and (3) separating the fossil fuel having a reduced organic sulfur content from the resulting aqueous phase.

The invention also provides a method of oxidizing an organic compound. The method comprises the steps of: (1) contacting the organic compound with an aqueous phase containing a recombinant pseudomonad biocatalyst comprising at least one enzyme capable of catalyzing at least one step in the oxidative cleavage of carbon-sulfur bonds, thereby forming an organic compound and aqueous phase mixture; (2) maintaining the mixture of step (1) under conditions sufficient for oxidation of the organic compound by the biocatalyst, thereby resulting in an oxidized organic compound, and, optionally, separating the oxidized organic compound from the aqueous phase.

BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A, 1B, 1C and 1D set forth the DNA sequence and predicted amino acid sequence of open reading frame 1 (ORF-1 (R)) of the nucleotide sequence required for desulfurization activity in Rhodococcus sp. strain IGTS8.

Figures 2A, 2B and 2C set forth the DNA sequence and predicted amino acid sequence of open reading frame 2 (ORF-2 (R)) of the nucleotide sequence required for desulfurization activity in Rhodococcus sp. strain IGTS8.

Figures 3A, 3B and 3C set forth the DNA sequence and predicted amino acid sequence of open reading frame 3 (ORF-3 (R)) of the nucleotide sequence required for desulfurization activity in Rhodococcus sp. strain IGTS8.

Figures 4A, 4B, 4C and 4D set forth the DNA sequence and predicted amino acid sequence of open reading frame 1 (ORF-1 (S)) of the nucleotide sequence required for desulfurization activity in Sphingomonas sp. strain AD109.

Figures 5A, 5B, and 5C set forth the DNA sequence and predicted amino acid sequence of open reading frame 2 (ORF-2 (S)) of the nucleotide sequence required for desulfurization activity in Sphingomonas sp. strain AD109.

Figures 6A, 6B and 6C set forth the DNA sequence and predicted amino acid sequence of open reading frame 3 (ORF-3 (S)) of the nucleotide sequence required for desulfurization activity in Sphingomonas sp. strain AD109.

Figure 7 is a physical map of plasmid pEX83.

Figure 8 is a physical map of plasmid pEXB5.

Figure 9 is a physical map of plasmid pEXB9.

Figure 10 is a physical map of plasmid pEX81.

Figure 11 is a physical map of plasmid pEX85.

Figure 12 is a physical map of plasmid pEX92.

Figure 13 is a physical map of plasmid pEXY10.

Figure 14 is a physical map of plasmid pEX1085.

Figure 15 is a physical map of plasmid pEX1087.

Figure 16 is a physical map of plasmid pEX1079.

Figure 17 is a physical map of plasmid pDA120.

Figure 18 sets forth the nucleotide sequence of the frdA linker.

Figure 19 presents the results of an in vitro desulfurization assay using a cell-free lysate prepared from a salicylate induced culture of P. fluorescens ATCC 13525/pEX1087.

Figure 20 is a physical map of plasmid pDA121.

DETAILED DESCRIPTION OF THE INVENTION

The present invention is based upon the successful expression of heterologous biodesulfurization genes in pseudomonad host organisms and the advantages thereof.

The resulting recombinant organisms can express a heterologous biocatalyst which catalyzes the oxidative cleavage of the carbon-sulfur bonds of an organosulfur compound. Such organisms, or enzyme preparations derived therefrom, can be utilized as biodesulfurization catalysts in the desulfurization of a carbonaceous material comprising one or more organosulfur compounds, such as a fossil fuel.

In the petroleum extraction and refining arts, the term "organic sulfur" is generally understood as referring to organic molecules having a hydrocarbon framework to which one or more sulfur atoms are covalently joined. These sulfur atoms can be directly bonded to the hydrocarbon framework, e.g., by one or more carbon-sulfur bonds, or can be present in a substituent bonded to the hydrocarbon framework of the molecule, e.g., a sulfate group. The general class of organic molecules having one or more sulfur heteroatoms is sometimes referred to as "organosulfur compounds". The hydrocarbon portion of these compounds can be aliphatic and/or aromatic.

Sulfur-bearing heterocycles, such as substituted and unsubstituted thiophene, benzothiophene, and dibenzothiophene, are known to be stable to conventional desulfurization treatments, such as hydrodesulfurization (HDS). Sulfur-bearing heterocycles can have relatively simple or relatively complex chemical structures. In complex heterocycles, multiple condensed aromatic rings, one or more of which can be heterocyclic, are present.

The difficulty of desulfurization generally increases with the structural complexity of the molecule. That is, refractory behavior is particularly accentuated in complex sulfur-bearing heterocycles, such as dibenzothiophene (DBT, C12H8S).

DBT is a sulfur-bearing heterocycle that has a condensed, multiple aromatic ring structure in which a five-membered thiophenic ring is flanked by two six-membered benzo rings. Much of the residual post-HDS organic sulfur in fossil fuel refining intermediates and combustible products is thiophenic sulfur. The majority of this residual thiophenic sulfur is present in DBT and derivatives thereof having one or more alkyl or aryl groups attached to one or more carbon atoms present in one or both flanking benzo rings. DBT itself is accepted as a model compound illustrative of the behavior of the class of compounds encompassing DBT and derivatives thereof in reactions involving thiophenic sulfur (Monticello and Finnerty, Ann. Rev. Microbiol., 39: 371-389 (1985)). DBT and derivatives thereof can account for a significant percentage of the total sulfur content of particular crude oils, coals and bitumen.

For example, these sulfur-bearing heterocycles have been reported to account for as much as 70 wt% of the total sulfur content of West Texas crude oil, and up to 40 wt% of the total sulfur content of some Middle East crude oils. Thus, DBT is considered to be particularly relevant as a model compound for the forms of thiophenic sulfur found in fossil fuels, such as crude oils, coals or bitumen of particular geographic origin, and various refining intermediates and fuel products manufactured therefrom (Monticello and Finnerty (1985), supra).

Another characteristic of DBT and derivatives thereof is that, following a release of fossil fuel into the environment, these sulfur-bearing heterocycles persist for long periods of time without significant biodegradation. Gundlach et al., Science 221: 122-129 (1983). Thus, the most prevalent naturally occurring microorganisms do not effectively metabolize and break down sulfur-bearing heterocycles.

A fossil fuel that is suitable for desulfurization treatment according to the present invention is one that contains organic sulfur. Such a fossil fuel is referred to as a "substrate fossil fuel". Substrate fossil fuels that are rich in thiophenic sulfur are particularly suitable for desulfurization according to the method described herein. Examples of such substrate fossil fuels include Cerro Negro or Orinoco heavy crude oils; Athabascan tar and other types of bitumen; petroleum refining fractions such as light cycle oil, heavy atmospheric gas oil, and No. 1 diesel oil; and coal-derived liquids manufactured from sources such as Pocahontas #3, Lewis-Stock, Australian Glencoe or Wyodak coal.

Biocatalytic desulfurization (biocatalysis or BDS) is the excision (liberation or removal) of sulfur from organosulfur compounds, including refractory organosulfur compounds such as sulfur-bearing heterocycles, as a result of the oxidative, preferably selective, cleavage of carbon-sulfur bonds in said compounds by a biocatalyst. BDS treatment yields the desulfurized combustible hydrocarbon framework of the former refractory organosulfur compound, along with inorganic sulfur substances which can be readily separated from each other by known techniques such as fractional distillation or water extraction. For example, DBT is converted into 2-hydroxybiphenyl (also referred to as "HBP") when subjected to BDS treatment.
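The conversion of DBT to HBP can be illustrated with a short molar-mass calculation. The sketch below is offered only as a worked example: the molecular formula of DBT (C12H8S) is given above, while the formula of 2-hydroxybiphenyl (C12H10O), the rounded standard atomic masses and all function names are supplied here for illustration and do not come from this application.

```python
# Rounded standard atomic masses in g/mol (assumed values for illustration).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "S": 32.06}

# Element counts per molecule.
DBT = {"C": 12, "H": 8, "S": 1}   # dibenzothiophene, C12H8S
HBP = {"C": 12, "H": 10, "O": 1}  # 2-hydroxybiphenyl, C12H10O


def molar_mass(formula: dict) -> float:
    """Molar mass in g/mol from a mapping of element symbol to atom count."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def sulfur_mass_fraction(formula: dict) -> float:
    """Fraction of the molecule's mass that is sulfur (0.0 if it has none)."""
    return ATOMIC_MASS["S"] * formula.get("S", 0) / molar_mass(formula)


dbt_mass = molar_mass(DBT)                    # about 184.26 g/mol
hbp_mass = molar_mass(HBP)                    # about 170.21 g/mol
dbt_sulfur = sulfur_mass_fraction(DBT)        # about 0.174
carbon_retained = HBP["C"] == DBT["C"]        # all twelve carbons remain
```

On these assumptions, sulfur makes up roughly 17% of the mass of DBT, and the product retains all twelve carbon atoms, which is consistent with the statement that the combustible hydrocarbon framework is preserved.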

The present invention provides a biocatalyst of use in a BDS process comprising one or more recombinant pseudomonads that functionally express one or more enzymes that direct, singly or in concert with each other, the removal of sulfur from organosulfur compounds, including sulfur-bearing heterocycles, by the selective cleavage of carbon-sulfur bonds in said compounds; one or more enzymes obtained from such microorganisms; or a mixture of such microorganisms and enzymes.

Suitable sources of the heterologous nucleic acid molecule of the invention are organisms which exhibit biocatalytic desulfurization activity, also referred to as Dsz+ organisms. Several such organisms are known in the art. For example, investigators have reported the genetic modification of naturally-occurring bacteria into mutant strains capable of catabolizing DBT. Kilbane, J. J., Resour. Cons. Recycl. 3: 69-79 (1990), Isbister, J. D., and R. C. Doyle, U.S. Patent No. 4,562,156 (1985), and Hartdegan, F. J. et al., Chem. Eng. Progress: 63-67 (1984). Kilbane has reported a mixed bacterial culture which appeared capable of selectively liberating sulfur from DBT by the oxidative pathway. This culture was composed of bacteria obtained from natural sources such as sewage sludge, petroleum refinery wastewater, garden soil, coal tar-contaminated soil, etc., and maintained in culture under conditions of continuous sulfur deprivation in the presence of DBT.

The culture was then exposed to the chemical mutagen 1-methyl-3-nitro-1-nitrosoguanidine. The major catabolic product of DBT metabolism by this mutant culture was 2-hydroxybiphenyl; sulfur was released as inorganic water-soluble sulfate, and the hydrocarbon portion of the molecule remained essentially intact as 2-hydroxybiphenyl. Kilbane, J. J., Resour. Cons. Recycl., 3: 69-79 (1990), the teachings of which are incorporated herein by reference.

Kilbane isolated a mutant strain of Rhodococcus from this mixed bacterial culture. This mutant, IGTS8 or ATCC No. 53968, is a particularly preferred source of the heterologous nucleic acid molecule for use with the instant invention. The isolation and characteristics of this mutant are described in detail in J. J. Kilbane, U.S. Patent No. 5,104,801, the teachings of which are incorporated herein by reference. This microorganism has been deposited at the American Type Culture Collection (ATCC), 12301 Park Lawn Drive, Rockville, Maryland, U.S.A. 20852 under the terms of the Budapest Treaty, and has been designated as ATCC Deposit No. 53968.

There are at least two possible types of pathways which result in the specific release of sulfur from DBT: oxidative and reductive. Preferably, an oxidative (aerobic) pathway can be followed. Examples of microorganisms that act by this oxidative pathway, preparations of which are suitable for use as the source of recombinant DNA in the present invention, include the microbial consortium (a mixture of several microorganisms) disclosed in Kilbane, Resour. Conserv. Recycl., 3: 69-79 (1990), the microorganisms disclosed by Kilbane in U.S. Patent Nos. 5,002,888 (issued Mar. 26, 1991), 5,104,801 (issued Apr. 14, 1992), 5,344,778, 5,132,219, 5,198,341, 5,356,813 and 5,358,870 [also described in Kilbane (1990), Biodesulfurization: Future Prospects in Coal Cleaning, in Proc. 7th Ann. Int'l. Pittsburgh Coal Conf.: 373-382]. Preferred sources of the heterologous nucleic acid molecules of the invention are Rhodococcus sp. IGTS8 (ATCC 53968) and Sphingomonas sp. strain AD109. Other desulfurizing microorganisms which are suitable nucleic acid molecule sources include Corynebacterium sp. strain SY1, as disclosed by Omori et al., Appl. Env. Microbiol., 58: 911-915 (1992); Rhodococcus erythropolis D-1, as disclosed by Izumi et al., Appl. Env. Microbiol., 60: 223-226 (1994); the Arthrobacter strain described by Lee et al., Appl. Environ. Microbiol. 61: 4362-4366 (1995) and the Rhodococcus strains (ATCC 55309 and ATCC 55310) disclosed by Grossman et al., U.S. Patent No. 5,607,857, each of which is incorporated herein by reference in its entirety. Each of these microorganisms is believed to produce one or more enzymes (protein biocatalysts) that catalyze one or more reactions in the desulfurization of DBT.

Each of the foregoing microorganisms can serve as the source of recombinant DNA in the present invention because each contains one or more genes encoding one or more enzymes (protein biocatalysts) that carry out the specific chemical reaction(s) by which sulfur is excised from refractory organosulfur compounds. Mutational or genetically engineered derivatives of any of the foregoing microorganisms, as exemplified by the U.S. patents listed above, can also be used as the DNA source, provided that appropriate biocatalytic function is retained.

The genes from Rhodococcus sp. strain IGTS8 which encode the biodesulfurization catalyst have been isolated and cloned, as described in U.S. Patent No. 5,356,801, the contents of which are incorporated herein by reference. The DNA sequence required for expression of desulfurization activity in Rhodococcus comprises three open reading frames, designated dszA, dszB and dszC. Each of these genes encodes an enzyme, denoted DszA, DszB and DszC, respectively, which catalyzes one or more steps in the desulfurization of DBT. These enzymes have been isolated and characterized (Gray et al., Nature Biotech. 14: 1705-1709 (1996)). The nucleotide sequences of the Rhodococcus desulfurization genes are set forth in Figures 1A-1D (ORF-1 (R), SEQ ID NO.: 1), Figures 2A-2C (ORF-2 (R), SEQ ID NO.: 3) and Figures 3A-3C (ORF-3 (R), SEQ ID NO.: 5). Each of these figures also sets forth the predicted amino acid sequences of the proteins encoded by these nucleotide sequences (ORF-1 (R), SEQ ID NO.: 2; ORF-2 (R), SEQ ID NO.: 4 and ORF-3 (R), SEQ ID NO.: 6).

Recent experiments, including confirming sequencing of these genes, have raised the possibility that amino acid 56 has been misidentified as an alanine rather than a glycine. Thus, SEQ ID NO.: 6 (and, thus, the corresponding codon) includes a 56 G "substitution" variant. In light of the fact that sequencing errors are common in the industry, references to specific sequences herein refer to the native sequences isolated from the host organism.

The isolation and characterization of the desulfurizing organism Sphingomonas sp. strain AD109 (ATCC Deposit No. 55954 on April 21, 1997) are described in copending U.S. Patent Application Serial No. 08/851,089, the contents of which are incorporated herein in their entirety. This organism has been isolated as a biologically pure culture by a soil enrichment procedure using 2-(2-hydroxyphenyl)benzenesulfinate (also referred to as "HPBS") as the sole source of sulfur. This organism expresses a collection of desulfurization enzymes which, together, catalyze the conversion of DBT to 2-hydroxybiphenyl and inorganic sulfur. The nucleotide sequence encoding the desulfurization catalyst of this organism includes three open reading frames which exhibit substantial homology with the corresponding nucleotide sequences of Rhodococcus IGTS8. The sequences of these open reading frames are set forth in Figures 4A-4D (ORF-1 (S), SEQ ID NO.: 7), Figures 5A-5C (ORF-2 (S), SEQ ID NO.: 9) and Figures 6A-6C (ORF-3 (S), SEQ ID NO.: 11). Figures 4-6 also present the predicted amino acid sequences of the proteins encoded by each of these nucleotide sequences (ORF-1 (S), SEQ ID NO.: 8; ORF-2 (S), SEQ ID NO.: 10 and ORF-3 (S), SEQ ID NO.: 12).

The heterologous nucleic acid molecule of the invention can comprise one or more nucleotide sequences encoding an enzyme which is a component of a biodesulfurization catalyst of a desulfurizing organism.

Such an enzyme, also referred to as a "desulfurization enzyme", catalyzes one or more steps in the oxidative cleavage of one or more carbon-sulfur bonds of an organosulfur compound. The heterologous nucleic acid molecule can, for example, comprise one or more nucleotide sequences which encode a desulfurization enzyme of a Dsz+ organism, such as one or more enzymes having an amino acid sequence as set forth in SEQ ID NO.: 2, SEQ ID NO.: 4, SEQ ID NO.: 6, SEQ ID NO.: 8, SEQ ID NO.: 10 or SEQ ID NO.: 12.

For example, the heterologous nucleic acid molecule can comprise a nucleotide sequence which is identical to a native desulfurization enzyme-encoding sequence of a Dsz+ organism. For example, the heterologous nucleic acid molecule can comprise one or more of the nucleotide sequences set forth in SEQ ID NO.: 1, SEQ ID NO.: 3, SEQ ID NO.: 5, SEQ ID NO.: 7, SEQ ID NO.: 9, and SEQ ID NO.: 11. The heterologous nucleic acid molecule can also comprise one or more nucleotide sequences which result from one or more silent mutations of a desulfurization enzyme-encoding sequence of a Dsz+ organism, such as Rhodococcus IGTS8 or Sphingomonas sp. strain AD109. Such a mutant sequence results from the substitution of one or more codons in the native sequence with a degenerate codon, i.e., a codon encoding the same amino acid residue.
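The notion of a silent mutation described above can be illustrated with a minimal Python sketch. It assumes the standard genetic code (the text does not name a translation table), and the short sequences in the usage note are hypothetical examples, not excerpts from any SEQ ID NO. The function names `translate` and `is_silent_variant` are likewise illustrative.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
# Assumption of this sketch: the source does not specify a codon table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(native: str, variant: str) -> bool:
    """True if the sequences differ in bases but encode the same residues."""
    return (
        native.upper() != variant.upper()
        and translate(native) == translate(variant)
    )
```

For instance, the hypothetical sequences `ATGGTTGAA` and `ATGGTCGAG` both translate to Met-Val-Glu, so the second would count as a silent variant of the first, whereas `ATGCTTGAA` (Met-Leu-Glu) would not.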

The heterologous nucleic acid molecule can also include a nucleotide sequence which is homologous to the native nucleotide sequence of a desulfurizing organism, for example, a nucleotide sequence which is homologous to one of the sequences set forth in SEQ ID NO.: 1, SEQ ID NO.: 3, SEQ ID NO.: 5, SEQ ID NO.: 7, SEQ ID NO.: 9, and SEQ ID NO.: 11. Preferably the nucleotide sequence exhibits at least about 50% sequence homology or sequence identity with a native sequence, preferably at least about 70% homology, and, more preferably, at least about 80% homology. It is particularly preferred that the nucleotide sequence exhibit at least about 95% sequence homology with, or be essentially or substantially the same as, a native sequence.

The heterologous nucleic acid molecule can comprise a nucleotide sequence which encodes an amino acid sequence variant of one or more desulfurization enzymes of a desulfurizing organism, such as an amino acid sequence variant of one or more of the desulfurization enzymes of Rhodococcus IGTS8 and Sphingomonas sp. strain AD109. Such amino acid variants can be substitution, deletion or insertion mutants. Preparation of mutant nucleotide sequences can be accomplished by methods known in the art as are described in Old, et al., Principles of Gene Manipulation, Fourth Edition, Blackwell Scientific Publications (1989), in Ausubel et al., Current Protocols in Molecular Biology, Wiley-Interscience, New York (1997) (hereinafter "Ausubel et al.") and in Sambrook et al., Molecular Cloning: A Laboratory Manual, third edition, Cold Spring Harbor Laboratory Press (1992) (hereinafter "Sambrook et al."), each of which is incorporated herein by reference.

The amino acid sequence variant preferably exhibits at least about 50% sequence homology to a native desulfurization enzyme of a Dsz+ organism, preferably at least about 70% homology, and more preferably at least about 80% sequence homology in an amino acid alignment. It is particularly preferred that the variant enzyme exhibits at least about 95% sequence homology with, or is essentially the same as, a native desulfurization enzyme. An amino acid alignment can be constructed by methods known in the art, for example, with the aid of a computer program, such as the BLAST program (Altschul et al., J. Mol. Biol. 215: 403-410 (1990)). For example, the amino acid sequence variant can have an amino acid sequence which is homologous to one of the sequences presented in SEQ ID NO.: 2, SEQ ID NO.: 4, SEQ ID NO.: 6, SEQ ID NO.: 8, SEQ ID NO.: 10 or SEQ ID NO.: 12.
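As a rough illustration of how a percentage threshold such as those above might be checked once an alignment is in hand, the following Python sketch computes percent identity over two already-aligned sequences. It is not BLAST and does not construct the alignment; counting only columns without gaps is one of several conventions and is an assumption of this sketch, as is the function name `percent_identity`.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the gap-free alignment columns.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length; gap characters mark insertions or deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)
```

With the hypothetical aligned fragments `MVLE` and `MVIE`, three of four columns match, giving 75% identity, which would satisfy the "at least about 70%" level but not the 80% level.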

It is preferred to substitute amino acids which are not conserved among the desulfurization enzymes of two or more desulfurizing organisms, such as Rhodococcus IGTS8 and Sphingomonas sp. strain AD109. For example, the amino acid sequence variant encoded by the heterologous nucleic acid molecule preferably retains the amino acid residues which are conserved in the Rhodococcus and Sphingomonas desulfurization enzymes.

The heterologous nucleic acid molecule can comprise the corresponding codons which are conserved in one or more of the Rhodococcus and Sphingomonas desulfurization genes, or one or more codons of these genes can be substituted with degenerate codons which encode the conserved amino acids.

The amino acid sequence variant encoded by the heterologous nucleic acid molecule can also result from conservative substitutions of one or more amino acid residues which are conserved in the desulfurization enzymes of two or more desulfurization organisms, such as Rhodococcus IGTS8 and Sphingomonas sp. strain AD109.

Conservative substitutions are those in which a first amino acid residue is substituted by a second residue having similar side chain properties. An example of such a conservative substitution is replacement of one hydrophobic residue, such as valine, with another hydrophobic residue, such as leucine.

The amino acid sequence variant can also result from the conservative or nonconservative substitution of one or more amino acid residues which are not conserved in the desulfurization enzymes of two or more desulfurization organisms, such as Rhodococcus IGTS8 and Sphingomonas sp. strain AD109. A non-conservative substitution involves replacing a first residue with a second residue having different side chain properties.

An example of a non-conservative substitution is the replacement of a hydrophobic residue, such as valine, with an acidic residue, such as glutamic acid.
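The distinction between conservative and non-conservative substitutions can be sketched in Python as a lookup of side-chain classes. The grouping below is one common textbook classification and is an assumption of this sketch (the text gives only the valine, leucine and glutamic acid examples); other groupings would classify some residue pairs differently.

```python
# Side-chain classes by one-letter code. This particular grouping is an
# assumption; the source names only valine/leucine (hydrophobic) and
# glutamic acid (acidic).
SIDE_CHAIN_CLASS = {
    **dict.fromkeys("AVLIMFWP", "hydrophobic"),
    **dict.fromkeys("STNQCYG", "polar"),
    **dict.fromkeys("DE", "acidic"),
    **dict.fromkeys("KRH", "basic"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if a substitution keeps the residue within its side-chain class."""
    if original == replacement:
        raise ValueError("not a substitution: residues are identical")
    return SIDE_CHAIN_CLASS[original] == SIDE_CHAIN_CLASS[replacement]
```

Under this grouping, `is_conservative("V", "L")` is true (valine to leucine) and `is_conservative("V", "E")` is false (valine to glutamic acid), matching the two examples given in the text.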

The heterologous nucleic acid molecule can also encode an active fragment of one or more desulfurization enzymes. Preferred nucleic acid molecules of this type encode a significant portion of the enzyme and include at least one region, e.g., a series of contiguous conserved amino acid residues, of at least one desulfurization enzyme, which is conserved among two or more Dsz+ organisms, such as Rhodococcus IGTS8 and Sphingomonas sp. strain AD109.

Additional Dsz+ microorganisms suitable as the source of the recombinant DNA can be derived from naturally occurring microorganisms by known techniques. As set forth above, these methods involve culturing preparations of microorganisms obtained from natural sources such as sewage sludge, petroleum refinery wastewater, garden soil, or coal tar-contaminated soil under selective culture conditions in which the microorganisms are grown in the presence of refractory organosulfur compounds such as sulfur-bearing heterocycles as the sole sulfur source; exposing the microbial preparation to chemical or physical mutagens; or a combination of these methods. Such techniques are recounted by Isbister and Doyle in U.S. Patent No. 4,562,156 (issued Dec. 31, 1985); by Kilbane in Resour. Conserv. Recycl., 3: 69-79 (1990) and U.S. Patent Nos. 5,002,888, 5,104,801 and 5,198,341; and by Omori and coworkers in Appl. Env. Microbiol., 58: 911-915 (1992), all incorporated herein by reference.

In another embodiment, the recombinant organism contains a heterologous nucleotide sequence which encodes one or more desulfurization enzymes as well as an oxidoreductase, such as a flavoprotein, for example, a flavin reductase. For example, the heterologous nucleotide sequence can encode an oxidoreductase which is native to Sphingomonas sp. strain AD109. The heterologous nucleic acid molecule can also encode the oxidoreductase denoted DszD described in copending U.S. Patent Application Serial No. 08/583,118; the flavin reductase from Vibrio harveyi described in copending U.S. Patent Application Serial No. 08/351,754; or the flavin reductase from Rhodococcus sp. IGTS8, described in copending U.S. Patent Application Serial No. 08/735,963. The contents of each of these references are incorporated herein by reference. The heterologous nucleotide sequence can also encode an amino acid variant or an active fragment of one of these oxidoreductases.

The recombinant pseudomonad of the invention can be derived from any pseudomonad which is capable of taking up and expressing heterologous desulfurization genes.

The term "pseudomonad", as used herein, refers to any bacterium classified as a Pseudomonas species in Krieg et al., ed., Bergey's Manual of Systematic Bacteriology, Williams and Wilkins, Baltimore (1984). In a preferred embodiment, the host organism is resistant to the conditions encountered in a biodesulfurization process. For example, particularly preferred are microorganisms which are resistant to the compounds present in a fossil fuel, such as petroleum, as well as to the high salt concentrations and elevated temperatures which can be encountered in a BDS system. Suitable pseudomonads include Pseudomonas fluorescens, Burkholderia cepacia, Comamonas testosteroni, Pseudomonas aeruginosa, Pseudomonas aureofaciens, Pseudomonas alcaligenes, Pseudomonas chlororaphis, Pseudomonas denitrificans, Pseudomonas mendocina, Pseudomonas oleovorans, Pseudomonas putida, Pseudomonas stutzeri, and Sphingomonas paucimobilis. A preferred host microorganism is a strain of P. fluorescens, such as P. fluorescens NCIB 11764 and P. fluorescens ATCC 13525.

The recombinant organisms of the present invention can be created by various methods known to those skilled in the art. Any method for introducing a recombinant plasmid into the organism of choice can be used, and a variety of such methods are described by Sambrook et al. and Ausubel et al. For example, the recombinant plasmid can be introduced via a suitable vector or by electroporation.

The present invention also provides a method for desulfurizing a carbonaceous material which includes organosulfur compounds. The carbonaceous material can be, for example, a fossil fuel, such as petroleum, a petroleum distillate fraction or coal. The method comprises the steps of (1) contacting the carbonaceous material with an aqueous phase containing a recombinant pseudomonad biocatalyst comprising at least one enzyme capable of catalyzing at least one step in the oxidative cleavage of carbon-sulfur bonds, thereby forming a carbonaceous material and aqueous phase mixture; (2) maintaining the mixture of step (1) under conditions sufficient for cleavage of the carbon-sulfur bonds of the organic sulfur molecules by the biocatalyst, thereby resulting in a carbonaceous material having a reduced organic sulfur content; and (3) separating the carbonaceous material having a reduced organic sulfur content from the resulting aqueous phase.

The term "recombinant pseudomonad biocatalyst", as used herein, refers to a recombinant pseudomonad organism which contains a heterologous nucleic acid molecule which encodes one or more desulfurization enzymes, or an enzyme preparation derived therefrom, such as a cell lysate. Preferably, the recombinant organism is as described above, such as a recombinant pseudomonad comprising recombinant DNA derived from a Dsz+ microorganism, such as Rhodococcus sp. IGTS8 or Sphingomonas sp. strain AD109.

Although living microorganisms (e.g., a culture) can be used as the biocatalyst herein, this is not required. Biocatalytic enzyme preparations that are useful in the present invention include microbial lysates, extracts, fractions, subfractions, or purified products obtained by conventional means and capable of carrying out the desired biocatalytic function. In a particularly preferred embodiment, the biocatalyst is overexpressed in the recombinant host cell (such as a cell which contains more than one copy of the gene or genes).

Enzyme biocatalyst preparations suitable for use herein can optionally be affixed to a solid support, e.g., a membrane, filter, polymeric resin, glass particles or beads, or ceramic particles or beads. The use of immobilized enzyme preparations facilitates the separation of the biocatalyst from the treated fossil fuel which has been depleted of refractory organosulfur compounds.

The specific activity of a given biocatalyst is a measure of its biocatalytic activity per unit mass. Thus, the specific activity of a particular biocatalyst depends on the nature or identity of the microorganism used, or used as a source of biocatalytic enzymes, as well as the procedures used for preparing and/or storing the biocatalyst preparation. The concentration of a particular biocatalyst can be adjusted as desired for use in particular circumstances. For example, where a culture of living microorganisms is used as the biocatalyst preparation, a suitable culture medium lacking a sulfur source other than sulfur-bearing heterocycles can be inoculated with suitable microorganisms and grown until a desired culture density is reached. The resulting culture can be diluted with additional medium or another suitable buffer, or microbial cells present in the culture can be retrieved, e.g., by centrifugation, and resuspended at a greater concentration than that of the original culture. The concentrations of microorganism and enzyme biocatalyst can be adjusted similarly. In this manner, appropriate volumes of biocatalyst preparations having predetermined specific activities and/or concentrations can be obtained.

In the biocatalytic desulfurization stage, the liquid fossil fuel containing sulfur-bearing heterocycles is combined with the biocatalyst and the flavin reductase. The relative amounts of biocatalyst and flavin reductase and liquid fossil fuel can be adjusted to suit particular conditions, or to produce a particular level of residual sulfur in the treated, deeply desulfurized fossil fuel. The amount of biocatalyst preparation to be combined with a given quantity of liquid fossil fuel will reflect the nature, concentration and specific activity of the particular biocatalyst used, as well as the nature and relative abundance of inorganic and organic sulfur compounds present in the substrate fossil fuel and the degree of deep desulfurization sought or considered acceptable.

The method of desulfurizing a fossil fuel of the present invention involves two aspects. First, a host organism or biocatalytic preparation obtained therefrom is contacted with a fossil fuel to be desulfurized.

This can be done in any appropriate container, optionally fitted with an agitation or mixing device.

The mixture is combined thoroughly and allowed to incubate for a sufficient time to allow for cleavage of a significant number of carbon-sulfur bonds in organosulfur compounds, thereby producing a desulfurized fossil fuel. In one embodiment, an aqueous emulsion or microemulsion is produced with an aqueous culture of the organism or enzyme fraction and the fossil fuel, allowing the organism to propagate in the emulsion while the expressed biocatalyst cleaves carbon-sulfur bonds.

Variables such as temperature, pH, oxidation level, concentration, mixing rate and rate of desulfurization will vary according to the biocatalyst used. Optimal parameters can generally be determined through no more than routine experimentation.

When the fossil fuel is a liquid hydrocarbon, such as petroleum, the desulfurized fossil fuel and the aqueous phase can form an emulsion. The components of such emulsions can be separated by a variety of methods, such as those described in U.S. Patent No. 5,358,870 and U.S. Patent Application Serial No. 08/640,129, which are incorporated herein by reference. For example, some emulsions reverse spontaneously when maintained under stationary conditions for a suitable period of time. Other emulsions can be reversed by adding an additional amount of an aqueous phase. Still other emulsions can be separated by the addition of a suitable chemical agent, such as a demulsifying agent, or by employing suitable physical conditions, such as a particular temperature range.

The biocatalyst can be recovered from the aqueous phase, for example, by centrifugation, filtration or lyophilization. When the biocatalyst is a microorganism, the biocatalyst can be resuspended in fresh sulfur-free nutrient medium and/or any fresh microorganism culture as necessary to reconstitute or replenish to the desired level of biocatalytic activity.

The biocatalyst can then be reintroduced into the reaction system.

Several suitable techniques for monitoring the rate and extent of desulfurization are well-known and readily available to those skilled in the art. Baseline and time course samples can be collected from the incubation mixture, and prepared for a determination of the residual organic sulfur in the fossil fuel.

The disappearance of sulfur from organosulfur compounds, such as DBT, in the sample being subjected to biocatalytic treatment can be monitored using, e.g., X-ray fluorescence (XRF) or atomic emission spectrometry (flame spectrometry). Preferably, the molecular components of the sample are first separated, e.g., by gas chromatography.

Without being limited to any particular mechanism or theory, it is believed that the pathway of the desulfurization reaction in Rhodococcus sp. IGTS8, and possibly in other Dsz+ organisms, is set forth below:

Here the flavin reductase provides an electron transport chain which delivers, via FMNH2, the reducing equivalents from NADH (or other electron donor) to the enzymes DszC (or Sox C) and/or DszA (or Sox A). The enzyme DszC is responsible for the biocatalysis of the oxidation reaction of DBT to DBTO2. The enzyme DszA is responsible for the reaction of DBTO2 to 2-(2-hydroxyphenyl)benzenesulfinate (HPBS). The enzyme DszB catalyzes the conversion of HPBS to 2-hydroxybiphenyl and inorganic sulfur.

As such, it is particularly preferred to add the cofactor, FMN, to the reaction medium as well as an electron donor, NADH.
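The sequence of steps just described can be captured in a small Python sketch that follows the pathway only as far as the available enzymes allow. The data structure and the function name `compounds_formed` are illustrative assumptions; the enzyme and compound assignments are those stated in the text, and cofactor requirements (FMN, NADH, flavin reductase) are deliberately not modeled.

```python
# Linear pathway as (substrate, enzyme, product), in the order given in the text.
PATHWAY = [
    ("DBT", "DszC", "DBTO2"),
    ("DBTO2", "DszA", "HPBS"),
    ("HPBS", "DszB", "2-hydroxybiphenyl"),
]


def compounds_formed(substrate: str, enzymes_present: set) -> list:
    """List the starting compound and each product reachable from it.

    The walk stops at the first step whose enzyme is absent. Cofactors
    and the flavin reductase are ignored in this simplified sketch.
    """
    formed = [substrate]
    current = substrate
    for step_substrate, enzyme, product in PATHWAY:
        if step_substrate == current and enzyme in enzymes_present:
            formed.append(product)
            current = product
    return formed
```

For example, `compounds_formed("DBT", {"DszC"})` stops at DBTO2, while supplying all three enzymes carries DBT through to 2-hydroxybiphenyl.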

Another method of use of the recombinant Pseudomonas organisms of the invention is as catalysts for the oxidation of organic compounds. The method comprises the steps of (1) contacting the organic compound with an aqueous phase containing a recombinant pseudomonad biocatalyst comprising at least one enzyme capable of catalyzing at least one step in the oxidative cleavage of carbon-sulfur bonds, thereby forming an organic compound and aqueous phase mixture; (2) maintaining the mixture of step (1) under conditions sufficient for biocatalytic oxidation of the organic compound by the biocatalyst, thereby resulting in an oxidized organic compound, and, optionally, (3) separating the oxidized organic compound from the aqueous phase. In one embodiment, the organic compound is a heteroorganic compound, such as an organonitrogen compound or an organosulfur compound. In one embodiment, the organic compound is an organosulfur compound which is a component of a fossil fuel, such as petroleum or a petroleum distillate fraction. In a second embodiment, the organic compound is a substituted or unsubstituted indole, as described in U.S. Provisional Patent Application Serial Number 60/020563, filed July 2, 1996, which is incorporated herein by reference.

The enzymes having the amino acid sequence set forth in SEQ ID NO.: 6 and SEQ ID NO.: 12 catalyze the oxidation of dibenzothiophene to dibenzothiophene-5,5-dioxide (dibenzothiophene sulfone), and the enzymes encoded by SEQ ID NO.: 1 and SEQ ID NO.: 7 catalyze the oxidation of dibenzothiophene-5,5-dioxide to 2-(2-hydroxyphenyl)benzenesulfinate. In one embodiment the biocatalyst comprises an enzyme of SEQ ID NO.: 6 or SEQ ID NO.: 12, or a mutant, homologue or active fragment thereof; the organosulfur compound is substituted or unsubstituted dibenzothiophene; and the oxidized organosulfur compound is a substituted or unsubstituted dibenzothiophene-5,5-dioxide. In another embodiment the biocatalyst comprises an enzyme of SEQ ID NO.: 6 or SEQ ID NO.: 12, and an enzyme encoded by SEQ ID NO.: 1 or SEQ ID NO.: 7; or a mutant, homologue or fragment thereof; the organosulfur compound is a substituted or unsubstituted dibenzothiophene; and the oxidized organosulfur compound is a substituted or unsubstituted 2-(2-hydroxyphenyl)benzenesulfinate. In yet another embodiment, the biocatalyst comprises the enzyme encoded by SEQ ID NO.: 1 or SEQ ID NO.: 7 or a mutant, homologue or active fragment thereof; the organosulfur compound is a substituted or unsubstituted dibenzothiophene-5,5-dioxide; and the oxidized organosulfur compound is a substituted or unsubstituted 2-(2-hydroxyphenyl)benzenesulfinate.

The oxidized organosulfur compound can, optionally, be further processed, for example, via a non-biological process or an enzyme-catalyzed reaction. In one embodiment, the oxidized organosulfur compound is desulfurized in a process employing suitable desulfurization enzymes from an organism other than a Sphingomonas.

The invention will now be further illustrated by way of the following examples.

General Materials and Methods

Bacterial strains and plasmids

E. coli DH10B (F- mcrA Δ(mrr-hsdRMS-mcrBC) φ80dlacZΔM15 ΔlacX74 deoR recA1 endA1 araD139 Δ(ara, leu)7697 galU galK λ- rpsL nupG; Gibco-BRL, Gaithersburg, MD) was used as the cloning host.

Pseudomonas fluorescens ATCC 13525 (American Type Culture Collection, Rockville, MD) and P. fluorescens NCIB 11764 (Harris and Knowles, J. Gen. Microbiol. 129: 1005-1011 (1983)) were used as typical expression hosts for the Rhodococcus IGTS8 dsz genes.

Plasmids pUC18 and pUC19 were used as cloning vectors (ApR; Vieira and Messing, Gene 19: 259-268 (1982)). The IncQ based, broad-host range expression plasmid pEXY10 containing the salicylate inducible PG promoter is derived from plasmid pRWF113 (Frazee et al., J. Bacteriol. 175: 6194-6202, (1993)), a derivative of plasmid pKMY319 (Yen, K.-W., J. Bacteriol. 173: 5328-5335, (1991)). The construction of pEXY10 is detailed in Example 1. The fre gene was obtained from plasmid pfFRI (Zenno, S. and Saigo, K., J. Bacteriol. 176: 3544-3551, (1994)). The source of the Rhodococcus erythropolis IGTS8 flavin reductase gene, frdA, was plasmid pEBC615. Plasmid pEBC443, which was used in the construction of pDA121, contains the dszA gene in the tac expression vector pT3X12 (Hale et al., Cytokine 7: 26-38 (1995)).

Media and Reagents

LB and 2YT media were routinely used to propagate E. coli and P. fluorescens. Luria broth (LB) is 1% tryptone (Difco), 0.5% yeast extract (Difco) and 0.5% NaCl. 2YT medium is 1.6% tryptone, 1% yeast extract and 0.5% NaCl. Basal salts minimal medium (BSM-glucose) contained the following (per liter): phosphate buffer, 100 mmol (pH 7.2); glucose, 20 g; NH4Cl, 2 g; MgCl2·6H2O, 644 mg; MnCl2·4H2O, 33 mg; (NH4)6Mo7O24·4H2O, 0.09 mg; and EDTA, 1.25 mg. When required, the sulfur source was 2 mM MgSO4. For solid media, agar or agarose was added to a concentration of 1.5%. Dibenzothiophene (DBT) and dibenzothiophene sulfone (DBTO2) were made up in acetonitrile as 50 mM stock solutions.
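The percentage compositions above can be scaled to a batch volume with a short Python sketch. It assumes the percentages are weight per volume (1% taken as 1 g per 100 ml), which the text does not state explicitly; the dictionary and function names are illustrative.

```python
# Compositions as given in the text, interpreted as percent (w/v).
# The w/v interpretation is an assumption of this sketch.
RECIPES_PERCENT_WV = {
    "LB": {"tryptone": 1.0, "yeast extract": 0.5, "NaCl": 0.5},
    "2YT": {"tryptone": 1.6, "yeast extract": 1.0, "NaCl": 0.5},
}


def grams_needed(medium: str, volume_liters: float) -> dict:
    """Grams of each component for the requested volume of medium."""
    if volume_liters <= 0:
        raise ValueError("volume must be positive")
    # 1% (w/v) corresponds to 10 g per liter.
    return {
        component: percent * 10.0 * volume_liters
        for component, percent in RECIPES_PERCENT_WV[medium].items()
    }
```

For a 250 ml shake-flask batch of 2YT, `grams_needed("2YT", 0.25)` works out to 4 g tryptone, 2.5 g yeast extract and 1.25 g NaCl.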

To screen for the presence of the dsz genes, tetracycline-resistant P. fluorescens transconjugants were tested for the ability to produce clearing zones on a BSM Glucose DBTO2 plate (final DBTO2 concentration = 400 µM).

The antibiotic concentrations for E. coli were as follows: ampicillin, 100 µg/ml; tetracycline, 15 µg/ml.

The antibiotic concentrations for Pseudomonas were as follows: tetracycline, 30 µg/ml. For selection of Pseudomonas drug-resistant transconjugants, plate mating mixtures were plated on either Pseudomonas Isolation Agar (PIA; Difco) supplemented with tetracycline (300 µg/ml) or on BSM Glucose plates containing tetracycline.

DNA Methods

Restriction enzymes and T4 DNA ligase were purchased from New England Biolabs, Inc. (Beverly, MA) and used as recommended by the supplier. Small scale plasmid preparations from E. coli were carried out as described by Birnboim and Doly (Nuc. Acids Res. 7: 1513-1523 (1979)). Larger scale DNA preparations were carried out with Midi-prep columns from Qiagen (Chatsworth, CA). DNA fragments were purified from agarose gels after electrophoretic separation by the method of Vogelstein and Gillespie (Proc. Natl. Acad. Sci. USA 76: 615-619 (1979)). DNA fragments were cloned into vectors by using techniques described by Sambrook, et al. (Sambrook, J., Fritsch, E. F., and Maniatis, T., eds., In Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, 1992).

DNA samples were sequenced by SeqWright (Houston, TX) using a dye-terminator cycling sequencing kit from Perkin Elmer and the 373A and 377 ABI automatic DNA sequencers. The sequence was extended by synthesizing overlapping oligonucleotides to previously read sequence. The synthesized oligonucleotides were used as primers for continuing sequence reactions. Sequencing reads were assembled and edited to 99.99% accuracy using Genecode's Sequencher, version 3.0 computer software.

DNA and protein sequence analysis was performed with the MacVector software program (Oxford Molecular Group, Campbell, CA).

Genetic Procedures

Plasmid DNA was introduced into E. coli DH10B by electroporation. Competent ElectroMAX DH10B cells (Gibco-BRL, Gaithersburg, MD) were used according to the manufacturer's suggestions. Recombinant broad-host range plasmids were introduced into Pseudomonas and other gram-negative strains either by electroporation using standard techniques or by triparental plate matings with pRK2013 (Figurski and Helinski, Proc. Natl. Acad. Sci. USA 76: 1648-1652 (1979)) as the mobilizing plasmid.

Preparation of cell-free extracts

Cells grown in the appropriate medium were concentrated to an optical density at 600 nm of 50 by centrifugation and resuspended in 25 mM phosphate buffer, pH 7.4, containing 100 mM NaCl, 0.5 mM dithiothreitol (DTT), and 1 mM phenylmethanesulfonyl fluoride (PMSF). Cells were disrupted in a French press and debris was removed by centrifugation at 32,000 x g for 20 min. Cell lysates were stored on ice at 4°C.

Desulfurization and enzyme assays

Cell-free desulfurization assays were performed by incubating a crude lysate with 200 µM dibenzothiophene (DBT), 10 µM FMN, and 4 hum NADH in 25 mM sodium phosphate buffer, pH 7.4, containing 100 mM NaCl and 0.5 mM DTT. 2-(2-hydroxyphenyl)benzenesulfinate (HPBS) desulfinase activity was determined by incubating a crude lysate with 200 µM HPBS in buffer. The reaction mixtures were shaken at 250 rpm at 30°C. At designated time points aliquots were removed and the reaction was quenched with acetonitrile. Substrate and product concentrations were quantitated by high-pressure liquid chromatography (HPLC) analysis.

Flavin reductase activity was measured by the FMN-dependent oxidation of NADH monitored at 340 nm as previously described (Gray et al., Nature Biotechnology, 14: 1705-1709 (1996)).
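A reading at 340 nm of this kind is commonly converted to a rate of NADH consumption with the Beer-Lambert law. The Python sketch below shows that conversion under stated assumptions: the extinction coefficient of 6.22 mM^-1 cm^-1 for NADH at 340 nm is the widely used literature value and is not given in the text, and the 1 cm path length and the function name are likewise assumptions.

```python
# Millimolar extinction coefficient of NADH at 340 nm (mM^-1 cm^-1).
# Widely used literature value; an assumption here, not stated in the source.
NADH_EXTINCTION_MM = 6.22


def nadh_oxidation_rate(delta_a340_per_min: float,
                        path_length_cm: float = 1.0,
                        epsilon_mm: float = NADH_EXTINCTION_MM) -> float:
    """NADH oxidized, in µmol per min per ml of reaction mixture.

    Beer-Lambert: A = epsilon * c * l, so a change in absorbance per
    minute divided by (epsilon * l) gives mM per minute, which is
    numerically equal to µmol per minute per ml.
    """
    if path_length_cm <= 0 or epsilon_mm <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    # Absorbance falls as NADH is oxidized, so take the magnitude.
    return abs(delta_a340_per_min) / (epsilon_mm * path_length_cm)
```

Under these assumptions, a decrease of 0.622 absorbance units per minute in a 1 cm cuvette corresponds to about 0.1 µmol NADH oxidized per minute per ml.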

Single phase, in vivo desulfurization assays were performed by first diluting the cells to be tested into buffer consisting of 50 mM potassium phosphate buffer (pH 7.2) + 1% glucose to an OD600 of 0.5. The cell suspension was equilibrated in a 30°C shaker bath (250 rpm) for several minutes before the addition of DBT (final concentration = 200 µM). Aliquots were removed during the 60 minute incubation period, quenched with an equal volume of acetonitrile and centrifuged to remove the cell debris. Substrate and product concentrations were quantitated by HPLC analysis.
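The volume of 50 mM DBT stock needed to reach a 200 µM final concentration follows from C1V1 = C2V2. The Python sketch below performs that calculation; the 10 ml assay volume in the usage note is a hypothetical figure (the text does not give one), and the small volume contributed by the stock itself is neglected.

```python
def stock_volume_ul(final_conc_um: float,
                    final_volume_ml: float,
                    stock_conc_mm: float) -> float:
    """Microliters of stock to add, from C1 * V1 = C2 * V2.

    final_conc_um * final_volume_ml gives nanomoles required, and a
    stock of X mM contains X nanomoles per microliter.
    """
    if stock_conc_mm <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc_um * final_volume_ml / stock_conc_mm
```

For a hypothetical 10 ml cell suspension, `stock_volume_ul(200, 10, 50)` gives 40 µl of the 50 mM stock, a 1:250 dilution.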

Whole cell, two phase desulfurization assays were performed by resuspending cell paste (the equivalent of 0.005 grams dry cell weight) in buffer consisting of 156 mM potassium phosphate (pH 7.2) + 1% glucose. Following equilibration in a 30°C shaker bath (250 rpm), one third volume of 0.6 wt% DBT in hexadecane was added. Samples of the aqueous phase were removed after 60 minutes, quenched with acetonitrile and analyzed as described above. The hexadecane phase was recovered after 24 hrs. and analyzed by gas chromatography (GC) for DBT and 2-HBP. Specific activity is reported as moles product/min/gram dry cell weight (gDCW).
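The specific activity figure defined above is a simple ratio, sketched here in Python. The function name is illustrative and the numbers in the usage note are hypothetical, chosen only to show the arithmetic; they are not results from the text.

```python
def specific_activity(product_amount: float,
                      minutes: float,
                      dry_cell_weight_g: float) -> float:
    """Product formed per minute per gram dry cell weight (gDCW).

    The result carries whatever mole unit product_amount is given in.
    """
    if minutes <= 0 or dry_cell_weight_g <= 0:
        raise ValueError("time and dry cell weight must be positive")
    return product_amount / minutes / dry_cell_weight_g
```

With hypothetical values of 0.3 units of product formed in 60 minutes by 0.005 gDCW, the specific activity is 1 unit/min/gDCW.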

Expression Studies

P. fluorescens strains harboring the dsz expression plasmids pEX1087 or pDA121 (see below) were inoculated into 250 ml shake flasks containing 2YT medium + tetracycline and allowed to grow with shaking at 30°C. At an OD600 of approximately 0.9, the culture was induced by the addition of sodium salicylate (final conc. 250 µM) and allowed to grow for an additional 3-5 hrs. Alternatively, the strains were grown in a 1-liter fermentor with a basal salts glucose medium containing tetracycline. The cultures were harvested by centrifugation and the resulting biomass was used to perform in vivo (i.e., whole cell) assays or to generate cell-free lysates for in vitro desulfurization assays.

SDS-PAGE and Western Blot Analysis

Protein separations were done with Novex (San Diego, CA) precast 10% polyacrylamide gels with Tris-glycine-sodium dodecyl sulfate (SDS) (Laemmli) running buffer. Western blot analysis was carried out by first transferring the proteins electrophoretically to nitrocellulose membranes as recommended by Bio-Rad (Hercules, CA). Blots were treated with antisera raised against the purified IGTS8 Dsz proteins (primary antibody) and then with goat anti-rabbit antisera conjugated to horseradish peroxidase as the second antibody. Finally, the proteins were detected with a horseradish peroxidase catalyzed chemiluminescence reaction.

Example 1

Construction of recombinant plasmids

A number of broad host range expression vectors have been described for use in gram-negative bacteria (Mermod et al., in R. Sokatch and L. N. Ornston, ed., The Bacteria vol. 10, p. 325-355, 1986); however, plasmids that contain the PG promoter, which is regulated by nahR, have proven to be very useful in the overexpression of several genes in Pseudomonas hosts (Frazee et al., J. Bacteriol. 175: 6194-6202 (1993); Yen, K.-W., J. Bacteriol. 173: 5328-5335 (1991)). This vector/promoter system was chosen to express the Rhodococcus IGTS8 desulfurization genes (dsz) in gram-negative hosts.

Enzyme kinetic analysis has shown that the rate of the second oxygenation step, which is catalyzed by DBTO2-MO (i.e., DszA), is about ten times faster than the rate of the first oxygenation step, which is carried out by DBT-MO (i.e., DszC) (Gray et al., Nature Biotechnology, 14: 1705-1709 (1996)). In the present studies, organization of the dsz genes on the plasmid in the order dszCAB gives improved expression of DszC.

A. pEX1087 The broad-host range plasmid pEX1087, which contains the dsz genes in the order dszCAB, was constructed in several cloning steps.

1. A synthetic duplex DNA oligonucleotide adaptor with the sequence shown below (SEQ ID NO.: 13) was ligated into the EcoRI and HindIII sites of pUC18 and the resulting plasmid was designated pEX82.

5'-AATTATCGATGAATTCCCGGGCCTGAGGAGATCTTCGAACTAGTA
       TAGCTACTTAAGGGCCCGGACTCCTCTAGAAGCTTGATCATTCGA-5'

2. The plasmid pEX14 was constructed as described in U.S. Patent Application Serial No. 08/662,810, incorporated herein by reference. An ApoI/Bsu36I restriction fragment from pEX14 (described in U.S. Patent Application Serial No. 08/351,754, incorporated herein by reference) containing a portion of the 3' end of dszB and the complete dszC was ligated with pEX82 that had been digested with EcoRI and Bsu36I, resulting in pEX83, which is shown in Figure 7.

3. A portion of the dszB gene was synthesized by PCR with two oligonucleotides (B1 and B2, shown below) as the primers and denatured pTOXI1, disclosed in U.S. Patent No. 5,356,801, as the template.

Primer B1: 5'-ATCGGAATTCTCTAGAAGATCTGATCGTGGAGGATGATTAAATGACAAGCCGCGTCGACCCCGCAAAC-3' (SEQ ID NO.: 14)

Primer B2: 5'-TAATAAGCTTACTAGTTTAGCGATGTCGGTTCAGAGAATTATTGAGGAACTCCGGAGCGTTGGGTACCGGGCAGTTGCTGTAG-3' (SEQ ID NO.: 15)

The 0.17 kb product was digested with EcoRI and HindIII and ligated into pUC19 that had been digested with EcoRI and HindIII, resulting in pEXB5, as presented in Figure 8. The insert was confirmed by DNA sequence analysis.

4. The larger KpnI-BspEI fragment from pEXB5 was ligated with a 0.97 kb fragment containing a portion of dszB isolated from pTOXI1 that had been digested with KpnI and BspEI. The resulting plasmid was designated pEXB9 and is shown in Figure 9.

5. A BglII-NotI fragment containing dszA and the 5' end of dszB isolated from pEX14 was ligated with the largest BglII-NotI fragment isolated from pEXB9 to form plasmid pEX81, as shown in Figure 10.

6. The 2.2 kb BglII-HindIII fragment containing dszA and dszB from pEX81 was isolated and inserted into the BglII and HindIII sites of pEX83, resulting in plasmid pEX85, as shown in Figure 11.

7. A synthetic duplex DNA oligonucleotide adaptor with the following sequence (SEQ ID NO.: 16)

5'-AATTCTAGAGAGGAACTCCATGCCAATCAATTGCAAAGCCCGGGACTAGTA
       GATCTCTCCTTGAGGTACGGTTAGTTAACGTTTCGGGCCCTGATCATTCGA-5'

was ligated into pUC18 that had been digested with EcoRI and HindIII, resulting in pEX91.

8. A 0.7 kb MunI/blunted-FokI restriction fragment containing the 3' end of fre, from plasmid pfFRI (Zenno, S. & Saigo, K., J. Bacteriol. 176: 3544-3551 (1994)), was ligated with MunI-SmaI-digested pEX91. The resulting plasmid was designated pEX92 (Figure 12).

9. The pcaHG genes were removed from pRWF113 (Frazee, R. W. J. et al., J. Bacteriol. 175: 6194-6202 (1993)), a derivative of plasmid pKMY319 (Yen, K.-W., J. Bacteriol. 173: 5328-5335 (1991)), by digesting with ClaI and XhoI. This fragment was replaced by ligation with a synthetic duplex DNA adaptor with the structure shown below (SEQ ID NO.: 17).

5'-CGATCTAGAGGAGGCTTCATATGTTTAAACTAGTC
     TAGATCTCCTCCGAAGTATACAAATTTGATCAGAGCT-5'

The resulting expression vector, pEXY10, is shown in Figure 13.

10. The ClaI-SpeI fragment containing the dszCAB genes from pEX85 was subcloned into the ClaI-SpeI sites of pEXY10. The resulting plasmid, pEX1085, is shown in Figure 14.

11. A SpeI-XbaI fragment containing the fre gene was isolated from pEX92 and ligated into the SpeI site of pEX1085. The resulting plasmid, designated pEX1087, is shown in Figure 15.
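The synthetic adaptors and PCR primers used in steps 1 to 11 can be sanity-checked with a short script. The sketch below is illustrative only and is not part of the disclosed method: it tests whether the two strands of each adaptor (SEQ ID NOs: 13, 16 and 17) pair as printed, reports the single-stranded ends left over, and lists which of a few restriction sites occur in primers B1 and B2. The function names are hypothetical, the recognition sequences are the standard ones for the named enzymes, and the primer sequences assume that the space and hyphen inside the printed primers are line-wrap artifacts.

```python
# Duplex adaptors as printed in the text: top strand reads 5'->3', bottom strand reads 3'->5'.
ADAPTORS = {
    "SEQ ID NO: 13": ("AATTATCGATGAATTCCCGGGCCTGAGGAGATCTTCGAACTAGTA",
                      "TAGCTACTTAAGGGCCCGGACTCCTCTAGAAGCTTGATCATTCGA"),
    "SEQ ID NO: 16": ("AATTCTAGAGAGGAACTCCATGCCAATCAATTGCAAAGCCCGGGACTAGTA",
                      "GATCTCTCCTTGAGGTACGGTTAGTTAACGTTTCGGGCCCTGATCATTCGA"),
    "SEQ ID NO: 17": ("CGATCTAGAGGAGGCTTCATATGTTTAAACTAGTC",
                      "TAGATCTCCTCCGAAGTATACAAATTTGATCAGAGCT"),
}
# Primers from step 3, joined across the printed line wraps (an assumption).
PRIMER_B1 = "ATCGGAATTCTCTAGAAGATCTGATCGTGGAGGATGATTAAATGACAAGCCGCGTCGACCCCGCAAAC"
PRIMER_B2 = ("TAATAAGCTTACTAGTTTAGCGATGTCGGTTCAGAGAATTATTGA"
             "GGAACTCCGGAGCGTTGGGTACCGGGCAGTTGCTGTAG")
# First nine codons of dszB as listed in SEQ ID NO: 3.
DSZB_START = "ATGACAAGCCGCGTCGACCCCGCAAAC"
# Standard palindromic recognition sequences (only the enzymes relevant to steps 3 and 4).
SITES = {"EcoRI": "GAATTC", "XbaI": "TCTAGA", "BglII": "AGATCT", "HindIII": "AAGCTT",
         "SpeI": "ACTAGT", "KpnI": "GGTACC", "BspEI": "TCCGGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def overhangs(top, bottom):
    """Return (left, right) unpaired 5' ends, each read 5'->3', or None if the strands do not pair."""
    paired = top.translate(COMPLEMENT)      # base-by-base complement, same direction as `bottom`
    for k in range(len(top)):               # k = length of the unpaired 5' end of the top strand
        core = paired[k:]
        if bottom.startswith(core):
            return top[:k], bottom[len(core):][::-1]
    return None


def sites_in(seq):
    """Alphabetical list of the enzymes in SITES whose recognition sequence occurs in seq."""
    return sorted(name for name, site in SITES.items() if site in seq)


for name, (top, bottom) in ADAPTORS.items():
    print(name, overhangs(top, bottom))
print("B1:", sites_in(PRIMER_B1), "ends with dszB start:", PRIMER_B1.endswith(DSZB_START))
print("B2:", sites_in(PRIMER_B2))
```

Under these assumptions the first two adaptors carry EcoRI- and HindIII-compatible ends and the third carries ClaI- and XhoI-compatible ends, which agrees with the digests named in steps 1, 7 and 9. Primer B2 also carries the KpnI and BspEI sites that are used in step 4.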

B. pDA121 Plasmid pDA121 (dszCA frdA), which is based on plasmid pEX1079 (dszCAfre; Figure 16), was constructed in two cloning steps. The first step involved replacing the Vibrio NADH-FMN oxidoreductase gene (fre) in pEX1079 with the Rhodococcus IGTS8 flavin reductase gene (frdA).

A SacI-SpeI fragment containing the 3' portion of the dszCA fre gene region was removed from plasmid pEX1079 (dszCA fre). This fragment was replaced with a 200 bp SacI-SpeI fragment containing the 3' portion of the dszA gene from pEBC443 (plasmid pEBC443 contains the dszA gene cloned into the tac promoter expression vector pT3XI2). The ligation of the 200 bp SacI-SpeI fragment from pEBC443 into pEX1079 thereby recreated the dszA gene. The resulting plasmid, designated pDA120 (Figure 17), contains intact dszC and dszA genes and a unique SpeI site immediately downstream of dszA.

In the second step, the Rhodococcus IGTS8 frdA gene was inserted downstream of the dszA gene in pDA120.

Before this was done, however, the native frdA gene was first modified to improve translation in a recombinant host. Specifically, the uncommon TTG initiation codon of the frdA gene was replaced with an ATG start codon and the native ribosome binding site (RBS) was replaced with the RBS of the thcE gene, as described in U.S. patent application No. 08/583,118, the contents of which are incorporated herein by reference. A DNA linker (Figure 18) containing these modifications and the coding region for the first 7 amino acid residues of the frdA gene, including the BsmI site, was cloned into the BamHI and HindIII sites of pUC19. Following DNA sequence verification of the linker, a BsmI-SnaBI fragment of frdA, which lacked an RBS, initiation codon and coding region for the first 7 amino acids, was cloned into the BsmI-SnaBI sites of the pUC19-linker clone. The resulting plasmid was designated pEBC615. The frdA gene was removed from pEBC615 as a SpeI fragment and inserted into the SpeI site of pDA120. Following transformation and characterization of the resulting transformants, one construct, designated pDA121 (dszCAfrdA; Figure 20), which contains frdA in the correct transcriptional orientation with respect to the promoter PG, was chosen for further analysis.
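The cloning steps of Example 1 form a dependency graph in which each plasmid is assembled from earlier plasmids or fragments. The sketch below is only an illustrative summary of the text, not part of the disclosed method: it records the immediate inputs of each construct and derives one valid build order. The dictionary and function names are hypothetical, and the fragment labels are shorthand for the fragments described above.

```python
# Immediate inputs of each construct, as described in Example 1 (labels are shorthand).
PARENTS = {
    "pEX82": ["pUC18", "adaptor SEQ ID NO: 13"],
    "pEX83": ["pEX82", "pEX14"],
    "pEXB5": ["pUC19", "PCR product (primers B1/B2 on pTOXI1)"],
    "pEXB9": ["pEXB5", "pTOXI1"],
    "pEX81": ["pEX14", "pEXB9"],
    "pEX85": ["pEX83", "pEX81"],
    "pEX91": ["pUC18", "adaptor SEQ ID NO: 16"],
    "pEX92": ["pEX91", "pfFRI"],
    "pEXY10": ["pRWF113", "adaptor SEQ ID NO: 17"],
    "pEX1085": ["pEX85", "pEXY10"],
    "pEX1087": ["pEX1085", "pEX92"],
    "pDA120": ["pEX1079", "pEBC443"],
    "pEBC615": ["pUC19", "frdA linker (Figure 18)", "BsmI-SnaBI frdA fragment"],
    "pDA121": ["pDA120", "pEBC615"],
}


def build_order(target, parents=PARENTS):
    """List the constructs needed for `target`, each after all of its inputs.

    Starting materials (names that are not keys of `parents`) are omitted.
    """
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for parent in parents.get(name, ()):
            visit(parent)
        if name in parents:          # only constructs built in this Example
            order.append(name)

    visit(target)
    return order


print(build_order("pEX1087"))
print(build_order("pDA121"))
```

For pEX1087 the function returns eleven constructs, one per numbered step in section A, although the order it produces is only one of several valid orderings and need not match the step numbering.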

Example 2 Expression of the dsz genes in P. fluorescens ATCC 13525 Plasmid pEX1087 was introduced into P. fluorescens ATCC 13525. A single tetracycline-resistant transconjugant from each mating was selected for further analysis.

The tetracycline-resistant P. fluorescens ATCC 13525/pEX1087 transconjugant was able to produce a zone of clearing on a BSM Glucose DBTO2 plate, indicating the presence of the dsz genes. The production of clearing zones was salicylate independent. The parent strain without pEX1087 was not capable of producing clearing zones. Induction of a P. fluorescens ATCC 13525/pEX1087 culture with salicylate resulted in significant whole cell desulfurization activity (14.9 µmoles HPBS/min/gdcw; gdcw = gram dry cell weight) as determined in single phase assays. Significantly more HPBS than 2-HBP was produced in this assay, which is consistent with previous data that the last step in the pathway (catalyzed by HPBS desulfinase) is rate limiting (Gray et al., Nature Biotechnology 14: 1705-1709 (1996)). In the absence of inducer very little desulfurization activity (0.7 µmoles HPBS/min/gdcw) was detected, indicating that the transcription of the dsz genes in this host was tightly regulated.
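The degree of regulation implied by these two activities can be expressed as a fold induction. The short calculation below is an illustration only; it uses the induced (14.9) and uninduced (0.7) whole cell activities reported above, and the ratio does not depend on the unit in which both values are expressed. The variable names are hypothetical.

```python
# Whole cell HPBS production rates reported for P. fluorescens ATCC 13525/pEX1087
# (both values in the same units, per min per gram dry cell weight).
induced_activity = 14.9      # with salicylate
uninduced_activity = 0.7     # without inducer

fold_induction = induced_activity / uninduced_activity
basal_fraction = uninduced_activity / induced_activity

print(f"Fold induction: about {fold_induction:.0f}-fold")
print(f"Uninduced activity is about {basal_fraction:.1%} of the induced level")
```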

Conversion of DBT to HPBS and 2-HBP was also demonstrated in cell-free extracts prepared from a salicylate-induced culture of P. fluorescens/pEX1087, as presented in Figure 19. The level of HPBS increased over the first 15 min. but then decreased to near baseline levels by 60 min., whereas the levels of 2-HBP steadily increased over the length of the assay. This product accumulation profile is consistent with that previously seen with Rhodococcus IGTS8 cell-free extracts (Gray et al., Nature Biotechnology 14: 1705-1709 (1996)). Western blot analysis of this cell-free lysate revealed the presence of all three Dsz proteins.

Example 3 Expression of the dsz genes in P. fluorescens NCIB 11764/pDA121 Plasmid pDA121 was introduced into P. fluorescens NCIB 11764. A single tetracycline-resistant transconjugant from each mating was selected for further analysis.

The tetracycline-resistant P. fluorescens NCIB 11764/pDA121 transconjugant was also able to produce clearing zones on a BSM Glucose DBTO2 plate, indicating the presence of the dsz genes. The production of clearing zones was salicylate independent. Cells obtained from several P. fluorescens NCIB 11764/pDA121 salicylate-induced cultures produced significant levels of HPBS in single phase, whole cell assays. This corresponded to an in vivo activity of 2.5-6.0 µmoles/min/gdcw. In the absence of the inducer, however, little or no activity was detected, indicating that the transcription of the dsz genes in this host was tightly regulated. There were also significant levels of HPBS produced in two-phase desulfurization assays. The amount of HPBS produced corresponded to a specific activity of 9.5 µmoles/min/gdcw.

Cell-free lysates prepared from salicylate-induced cultures of P. fluorescens NCIB 11764/pDA121 contained significant FMN reductase activity (9700 to 12,500 nmol NADH oxidized/min/mg). This is compared to FMN reductase activities of 600-1000 for P. fluorescens NCIB 11764 harboring pEX1079 (dszCAfre). SDS-PAGE analysis of this lysate also revealed the presence of prominent FrdA, DszC and DszA protein bands. When compared to a known amount of the purified Rhodococcus reductase, it was estimated that FrdA represents a significant portion of the total cytoplasmic protein. This was the first evidence that co-expression of the dsz genes with frdA in a non-Rhodococcus host results in measurable in vivo desulfurization activity.
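Because both FMN reductase activities are reported as ranges, the improvement of pDA121 over pEX1079 is itself a range. The sketch below bounds that ratio. It assumes that the 600-1000 figure is expressed in the same units (nmol NADH oxidized/min/mg) as the 9700 to 12,500 figure, which the text implies but does not state; the function and variable names are hypothetical.

```python
def fold_range(numerator, denominator):
    """Bounds on a ratio of two positive ranges, each given as (low, high).

    The smallest ratio pairs the low numerator with the high denominator,
    and the largest ratio pairs the high numerator with the low denominator.
    """
    low_n, high_n = numerator
    low_d, high_d = denominator
    return low_n / high_d, high_n / low_d


pda121_range = (9700.0, 12500.0)   # FMN reductase activity with pDA121 (dszCAfrdA)
pex1079_range = (600.0, 1000.0)    # FMN reductase activity with pEX1079 (dszCAfre), units assumed equal

low, high = fold_range(pda121_range, pex1079_range)
print(f"pDA121 lysates show roughly {low:.1f}- to {high:.1f}-fold higher FMN reductase activity")
```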

Plasmids pEX1087, pDA121, or their various derivatives (e.g., pEX1079) have been successfully introduced into a variety of Gram-negative species using the general methods outlined above. These include, but are not limited to, Burkholderia cepacia, Comamonas testosteroni, P. aeruginosa, P. aureofaciens, P. alcaligenes, P. chlororaphis, P. denitrificans, P. fluorescens, P. mendocina, P. oleovorans, P. putida, P. stutzeri, and Sphingomonas paucimobilis. The results presented using P. fluorescens as an expression host are provided as an example. Desulfurization activity was demonstrated for all of the listed recombinants.

The recombinant Pseudomonads of the present invention offer several advantages. For example, the growth characteristics of Pseudomonas species result in a relatively low biocatalyst manufacturing cost as well as rapid growth within the process system. This also allows Pseudomonas species to compete effectively with biological contaminants which can be present in the system. Further, Pseudomonas species can be used in a biodesulfurization system at high concentrations without complicating the recovery of the hydrocarbon product from the resulting emulsion.

EQUIVALENTS Those skilled in the art will know, or be able to ascertain, using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. These and all other equivalents are intended to be encompassed by the following claims.

SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: (A) NAME: Energy BioSystems Corporation (B) STREET: 4200 Research Forest Drive (C) CITY: The Woodlands (D) STATE/PROVINCE: Texas (E) COUNTRY: USA (F) POSTAL CODE/ZIP: 77381 (G) TELEPHONE: (281) 364-6100 (I) TELEFAX: (281) 364-6112
(ii) TITLE OF INVENTION: DSZ Gene Expression In Pseudomonas Hosts
(iii) NUMBER OF SEQUENCES: 23
(iv) CORRESPONDENCE ADDRESS: (A) ADDRESSEE: Hamilton, Brook, Smith & Reynolds, P.C. (B) STREET: Two Militia Drive (C) CITY: Lexington (D) STATE: Massachusetts (E) COUNTRY: USA (F) ZIP: 02173
(v) COMPUTER READABLE FORM: (A) MEDIUM TYPE: Floppy disk (B) COMPUTER: IBM PC compatible (C) OPERATING SYSTEM: PC-DOS/MS-DOS (D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) PRIOR APPLICATION DATA: (A) APPLICATION NUMBER: US 08/851,088 (B) FILING DATE: 05-MAY-1997
(vii) PRIOR APPLICATION DATA: (A) APPLICATION NUMBER: US 08/835,185 (B) FILING DATE: 07-APR-1997
(viii) ATTORNEY/AGENT INFORMATION: (A) NAME: Elmore, Carolyn S.

(B) REGISTRATION NUMBER: 37,567 (C) REFERENCE/DOCKET NUMBER : EBC96-06A2 (ix) TELECOMMUNICATION INFORMATION: (A) TELEPHONE: (781) 861-6240 (B) TELEFAX: (781) 861-9540 (2) INFORMATION FOR SEQ ID NO : 1 : (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1359 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 1.. 1359 (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : ATG ACT CAA CAA CGA CAA ATG CAT CTG GCC GGT TTC TTC TCG GCC GGC 48 Met Thr Gln Gln Arg Gln Met His Leu Ala Gly Phe Phe Ser Ala Gly 1 5 10 15 AAT GTG ACT CAT GCA CAT GGG GCG TGG CGG CAC ACG GAC GCG TCG AAT 96 Asn Val Thr His Ala His Gly Ala Trp Arg His Thr Asp Ala Ser Asn 20 25 30 GAC TTT CTG TCG GGG AAG TAC TAC CAA CAC ATC GCC CGT ACT CTG GAG 144 Asp Phe Leu Ser Gly Lys Tyr Tyr Gln His Ile Ala Arg Thr Leu Glu 35 40 45 CGC GGC AAG TTC GAT CTG TTG TTT CTG CCT GAC GGG TTG GCC GTC GAG 192 Arg Gly Lys Phe Asp Leu Leu Phe Leu Pro Asp Gly Leu Ala Val Glu 50 55 60 GAC AGC TAC GGG GAC AAC CTG GAC ACC GGT GTC GGC CTG GGC GGG CAG 240 Asp Ser Tyr Gly Asp Asn Leu Asp Thr Gly Val Gly Leu Gly Gly Gln 65 70 75 80 GGT GCA GTC GCC TTG GAG CCG GCC AGT GTG GTC GCA ACC ATG GCC GCG 288 Gly Ala Val Ala Leu Glu Pro Ala Ser Val Val Ala Thr Met Ala Ala 85 90 95 GTG ACC GAG CAC CTG GGT CTT GGG GCA ACC ATT TCG GCG ACC TAC TAT 336 Val Thr Glu His Leu Gly Leu Gly Ala Thr Ile Ser Ala Thr Tyr Tyr 100 105 110 CCC CCG TAT CAC GTT GCT CGG GTG TTC GCG ACG CTC GAT CAG TTG TCA 384 Pro Pro Tyr His Val Ala Arg Val Phe Ala Thr Leu Asp Gln Leu Ser 115 120 125 GGG GGT CGG GTG TCC TGG AAC GTC GTC ACC TCG CTC AAC GAC GCT GAA 432 Gly Gly Arg Val Ser Trp Asn Val Val Thr Ser Leu Asn Asp Ala Glu 130 135 140 GCG CGC AAC TTC GGC ATT AAT CAG CAT CTG GAA CAC GAC GCC CGC TAT 480 Ala Arg Asn Phe Gly Ile Asn Gln His Leu Glu His Asp Ala Arg Tyr 145 150 155 160 GAC CGC GCC GAT GAG TTC TTG GAA GCG GTC AAG AAA CTC TGG AAC AGC 528 Asp Arg Ala Asp Glu Phe Leu Glu Ala Val 
Lys Lys Leu Trp Asn Ser 165 170 175 TGG GAC GAG GAC GCC CTC GTG CTG GAC AAG GCG GCC GGC GTG TTC GCC 576 Trp Asp Glu Asp Ala Leu Val Leu Asp Lys Ala Ala Gly Val Phe Ala 180 185 190 GAT CCC GCG AAG GTG CAC TAC GTC GAT CAC CAC GGG GAG TGG CTG AAT 624 Asp Pro Ala Lys Val His Tyr Val Asp His His Gly Glu Trp Leu Asn 195 200 205 GTG CGC GGA CCT CTG CAG GTA CCG CGT TCA CCT CAG GGT GAG CCG GTG 672 Val Arg Gly Pro Leu Gln Val Pro Arg Ser Pro Gln Gly Glu Pro Val 210 215 220 ATC CTG CAG GCC GGC CTG TCG CCC CGG GGT CGG CGC TTC GCC GGG AAG 720 Ile Leu Gln Ala Gly Leu Ser Pro Arg Gly Arg Arg Phe Ala Gly Lys 225 230 235 240 TGG GCC GAG GCC GTC TTC AGT CTT GCA CCC AAC CTC GAG GTG ATG CAG 768 Trp Ala Glu Ala Val Phe Ser Leu Ala Pro Asn Leu Glu Val Met Gln 245 250 255 GCC ACC TAC CAG GGC ATC AAA GCC GAG GTC GAC GCT GCG GGG CGC GAT 816 Ala Thr Tyr Gln Gly Ile Lys Ala Glu Val Asp Ala Ala Gly Arg Asp 260 265 270 CCC GAT CAG ACG AAA ATC TTC ACC GCC GTG ATG CCG GTA CTC GGC GAA 864 Pro Asp Gln Thr Lys Ile Phe Thr Ala Val Met Pro Val Leu Gly Glu 275 280 285 AGC CAG GCG GTG GCA CAG GAA CGA CTG GAA TAT CTC AAC AGT CTG GTC 912 Ser Gln Ala Val Ala Gln Glu Arg Leu Glu Tyr Leu Asn Ser Leu Val 290 295 300 CAT CCG GAA GTG GGA CTG TCG ACG CTA TCC AGT CAC ACC GGC ATC AAC 960 His Pro Glu Val Gly Leu Ser Thr Leu Ser Ser His Thr Gly Ile Asn 305 310 315 320 CTG GCG GCG TAC CCT CTC GAC ACT CCG ATC AAG GAC ATC CTG CGG GAT 1008 Leu Ala Ala Tyr Pro Leu Asp Thr Pro Ile Lys Asp Ile Leu Arg Asp 325 330 335 CTG CAG GAT CGG AAT GTC CCG ACG CAA CTG CAC ATG TTC GCC GCC GCA 1056 Leu Gln Asp Arg Asn Val Pro Thr Gln Leu His Met Phe Ala Ala Ala 340 345 350 ACG CAC AGC GAA GAG CTC ACG CTG GCG GAA ATG GGT CGG CGC TAT GGA 1104 Thr His Ser Glu Glu Leu Thr Leu Ala Glu Met Gly Arg Arg Tyr Gly 355 360 365 ACC AAC GTG GGG TTC GTT CCT CAG TGG GCC GGT ACC GGG GAG CAG ATC 1152 Thr Asn Val Gly Phe Val Pro Gln Trp Ala Gly Thr Gly Glu Gln Ile 370 375 380 GCT GAC GAG CTG ATC CGC CAC TTC GAG GGC GGC GCC GCG GAT GGT TTC 1200 Ala Asp 
Glu Leu Ile Arg His Phe Glu Gly Gly Ala Ala Asp Gly Phe 385 390 395 400 ATC ATC TCT CCG GCC TTC CTG CCG GGC TCC TAC GAC GAG TTC GTC GAC 1248 Ile Ile Ser Pro Ala Phe Leu Pro Gly Ser Tyr Asp Glu Phe Val Asp 405 410 415 CAG GTG GTT CCG GTT CTG CAG GAT CGC GGC TAC TTC CGC ACC GAG TAC 1296 Gln Val Val Pro Val Leu Gln Asp Arg Gly Tyr Phe Arg Thr Glu Tyr 420 425 430 CAG GGC AAC ACT CTG CGC GAC CAC TTG GGT CTG CGC GTA CCA CAA CTG 1344 Gln Gly Asn Thr Leu Arg Asp His Leu Gly Leu Arg Val Pro Gln Leu 435 440 445 CAA GGA CAA CCT TCA 1359 Gln Gly Gln Pro Ser 450 (2) INFORMATION FOR SEQ ID NO : 2: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 453 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2: Met Thr Gln Gln Arg Gln Met His Leu Ala Gly Phe Phe Ser Ala Gly 1 5 10 15 Asn Val Thr His Ala His Gly Ala Trp Arg His Thr Asp Ala Ser Asn 20 25 30 Asp Phe Leu Ser Gly Lys Tyr Tyr Gln His Ile Ala Arg Thr Leu Glu 35 40 45 Arg Gly Lys Phe Asp Leu Leu Phe Leu Pro Asp Gly Leu Ala Val Glu 50 55 60 Asp Ser Tyr Gly Asp Asn Leu Asp Thr Gly Val Gly Leu Gly Gly Gln 65 70 75 80 Gly Ala Val Ala Leu Glu Pro Ala Ser Val Val Ala Thr Met Ala Ala 85 90 95 Val Thr Glu His Leu Gly Leu Gly Ala Thr Ile Ser Ala Thr Tyr Tyr 100 105 110 Pro Pro Tyr His Val Ala Arg Val Phe Ala Thr Leu Asp Gln Leu Ser 115 120 125 Gly Gly Arg Val Ser Trp Asn Val Val Thr Ser Leu Asn Asp Ala Glu 130 135 140 Ala Arg Asn Phe Gly Ile Asn Gln His Leu Glu His Asp Ala Arg Tyr 145 150 155 160 Asp Arg Ala Asp Glu Phe Leu Glu Ala Val Lys Lys Leu Trp Asn Ser 165 170 175 Trp Asp Glu Asp Ala Leu Val Leu Asp Lys Ala Ala Gly Val Phe Ala 180 185 190 Asp Pro Ala Lys Val His Tyr Val Asp His His Gly Glu Trp Leu Asn 195 200 205 Val Arg Gly Pro Leu Gln Val Pro Arg Ser Pro Gln Gly Glu Pro Val 210 215 220 Ile Leu Gln Ala Gly Leu Ser Pro Arg Gly Arg Arg Phe Ala Gly Lys 225 230 235 240 Trp Ala Glu Ala Val Phe Ser Leu Ala Pro Asn Leu Glu Val Met Gln 245 250 255 Ala Thr Tyr Gln Gly Ile Lys 
Ala Glu Val Asp Ala Ala Gly Arg Asp 260 265 270 Pro Asp Gln Thr Lys Ile Phe Thr Ala Val Met Pro Val Leu Gly Glu 275 280 285 Ser Gln Ala Val Ala Gln Glu Arg Leu Glu Tyr Leu Asn Ser Leu Val 290 295 300 His Pro Glu Val Gly Leu Ser Thr Leu Ser Ser His Thr Gly Ile Asn 305 310 315 320 Leu Ala Ala Tyr Pro Leu Asp Thr Pro Ile Lys Asp Ile Leu Arg Asp 325 330 335 Leu Gln Asp Arg Asn Val Pro Thr Gln Leu His Met Phe Ala Ala Ala 340 345 350 Thr His Ser Glu Glu Leu Thr Leu Ala Glu Met Gly Arg Arg Tyr Gly 355 360 365 Thr Asn Val Gly Phe Val Pro Gln Trp Ala Gly Thr Gly Glu Gln Ile 370 375 380 Ala Asp Glu Leu Ile Arg His Phe Glu Gly Gly Ala Ala Asp Gly Phe 385 390 395 400 Ile Ile Ser Pro Ala Phe Leu Pro Gly Ser Tyr Asp Glu Phe Val Asp 405 410 415 Gln Val Val Pro Val Leu Gln Asp Arg Gly Tyr Phe Arg Thr Glu Tyr 420 425 430 Gln Gly Asn Thr Leu Arg Asp His Leu Gly Leu Arg Val Pro Gln Leu 435 440 445 Gln Gly Gln Pro Ser 450 (2) INFORMATION FOR SEQ ID NO : 3: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1095 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 1.. 
1095 (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3: ATG ACA AGC CGC GTC GAC CCC GCA AAC CCC GGT TCA GAA CTC GAT TCC 48 Met Thr Ser Arg Val Asp Pro Ala Asn Pro Gly Ser Glu Leu Asp Ser 1 5 10 15 GCC ATC CGC GAC ACA CTG ACC TAC AGC AAC TGC CCG GTA CCC AAC GCT 96 Ala Ile Arg Asp Thr Leu Thr Tyr Ser Asn Cys Pro Val Pro Asn Ala 20 25 30 CTG CTC ACG GCA TCG GAA TCG GGC TTC CTC GAC GCC GCC GGC ATC GAA 144 Leu Leu Thr Ala Ser Glu Ser Gly Phe Leu Asp Ala Ala Gly Ile Glu 35 40 45 CTC GAC GTC CTC AGC GGC CAG CAG GGC ACG GTT CAT TTC ACC TAC GAC 192 Leu Asp Val Leu Ser Gly Gln Gln Gly Thr Val His Phe Thr Tyr Asp 50 55 60 CAG CCT GCC TAC ACC CGT TTT GGG GGT GAG ATC CCG CCA CTG CTC AGC 240 Gln Pro Ala Tyr Thr Arg Phe Gly Gly Glu Ile Pro Pro Leu Leu Ser 65 70 75 80 GAG GGG TTG CGG GCA CCT GGG CGC ACG CGT CTA CTC GGC ATC ACC CCG 288 Glu Gly Leu Arg Ala Pro Gly Arg Thr Arg Leu Leu Gly Ile Thr Pro 85 90 95 CTC TTG GGG CGC CAG GGC TTC TTT GTC CGC GAC GAC AGC CCG ATC ACA 336 Leu Leu Gly Arg Gln Gly Phe Phe Val Arg Asp Asp Ser Pro Ile Thr 100 105 110 GCG GCC GCC GAC CTT GCC GGA CGT CGA ATC GGC GTC TCG GCC TCG GCA 384 Ala Ala Ala Asp Leu Ala Gly Arg Arg Ile Gly Val Ser Ala Ser Ala 115 120 125 ATT CGC ATC CTG CGC GGC CAG CTG GGC GAC TAC CTC GAG TTG GAT CCC 432 Ile Arg Ile Leu Arg Gly Gln Leu Gly Asp Tyr Leu Glu Leu Asp Pro 130 135 140 TGG CGG CAA ACG CTG GTA GCG CTG GGC TCG TGG GAG GCG CGC GCC TTG 480 Trp Arg Gln Thr Leu Val Ala Leu Gly Ser Trp Glu Ala Arg Ala Leu 145 150 155 160 TTG CAC ACC CTT GAG CAC GGT GAA CTG GGT GTG GAC GAC GTC GAG CTG 528 Leu His Thr Leu Glu His Gly Glu Leu Gly Val Asp Asp Val Glu Leu 165 170 175 GTG CCG ATC AGC AGT CCT GGT GTC GAT GTT CCC GCT GAG CAG CTC GAA 576 Val Pro Ile Ser Ser Pro Gly Val Asp Val Pro Ala Glu Gln Leu Glu 180 185 190 GAA TCG GCG ACC GTC AAG GGT GCG GAC CTC TTT CCC GAT GTC GCC CGC 624 Glu Ser Ala Thr Val Lys Gly Ala Asp Leu Phe Pro Asp Val Ala Arg 195 200 205 GGT CAG GCC GCG GTG TTG GCC AGC GGA GAC GTT GAC GCC CTG TAC AGT 672 Gly Gln Ala Ala Val Leu 
Ala Ser Gly Asp Val Asp Ala Leu Tyr Ser 210 215 220 TGG CTG CCC TGG GCC GGG GAG TTG CAA GCC ACC GGG GCC CGC CCA GTG 720 Trp Leu Pro Trp Ala Gly Glu Leu Gln Ala Thr Gly Ala Arg Pro Val 225 230 235 240 GTG GAT CTC GGC CTC GAT GAG CGC AAT GCC TAC GCC AGT GTG TGG ACG 768 Val Asp Leu Gly Leu Asp Glu Arg Asn Ala Tyr Ala Ser Val Trp Thr 245 250 255 GTC AGC AGC GGG CTG GTT CGC CAG CGA CCT GGC CTT GTT CAA CGA CTG 816 Val Ser Ser Gly Leu Val Arg Gln Arg Pro Gly Leu Val Gln Arg Leu 260 265 270 GTC GAC GCG GCC GTC GAC GCC GGG CTG TGG GCA CGC GAT CAT TCC GAC 864 Val Asp Ala Ala Val Asp Ala Gly Leu Trp Ala Arg Asp His Ser Asp 275 280 285 GCG GTG ACC AGC CTG CAC GCC GCG AAC CTG GGC GTA TCG ACC GGA GCA 912 Ala Val Thr Ser Leu His Ala Ala Asn Leu Gly Val Ser Thr Gly Ala 290 295 300 GTA GGC CAG GGC TTC GGC GCC GAC TTC CAG CAG CGT CTG GTT CCA CGC 960 Val Gly Gln Gly Phe Gly Ala Asp Phe Gln Gln Arg Leu Val Pro Arg 305 310 315 320 CTG GAT CAC GAC GCC CTC GCC CTC CTG GAG CGC ACA CAG CAA TTC CTG 1008 Leu Asp His Asp Ala Leu Ala Leu Leu Glu Arg Thr Gln Gln Phe Leu 325 330 335 CTC ACC AAC AAC TTG CTG CAG GAA CCC GTC GCC CTC GAT CAG TGG GCG 1056 Leu Thr Asn Asn Leu Leu Gln Glu Pro Val Ala Leu Asp Gln Trp Ala 340 345 350 GCT CCG GAA TTT CTG AAC AAC AGC CTC AAT CGC CAC CGA 1095 Ala Pro Glu Phe Leu Asn Asn Ser Leu Asn Arg His Arg 355 360 365 (2) INFORMATION FOR SEQ ID NO : 4: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 365 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4: Met Thr Ser Arg Val Asp Pro Ala Asn Pro Gly Ser Glu Leu Asp Ser 1 5 10 15 Ala Ile Arg Asp Thr Leu Thr Tyr Ser Asn Cys Pro Val Pro Asn Ala 20 25 30 Leu Leu Thr Ala Ser Glu Ser Gly Phe Leu Asp Ala Ala Gly Ile Glu 35 40 45 Leu Asp Val Leu Ser Gly Gln Gln Gly Thr Val His Phe Thr Tyr Asp 50 55 60 Gln Pro Ala Tyr Thr Arg Phe Gly Gly Glu Ile Pro Pro Leu Leu Ser 65 70 75 80 Glu Gly Leu Arg Ala Pro Gly Arg Thr Arg Leu Leu Gly Ile Thr Pro 85 90 95 Leu Leu Gly Arg 
Gln Gly Phe Phe Val Arg Asp Asp Ser Pro Ile Thr 100 105 110 Ala Ala Ala Asp Leu Ala Gly Arg Arg Ile Gly Val Ser Ala Ser Ala 115 120 125 Ile Arg Ile Leu Arg Gly Gln Leu Gly Asp Tyr Leu Glu Leu Asp Pro 130 135 140 Trp Arg Gln Thr Leu Val Ala Leu Gly Ser Trp Glu Ala Arg Ala Leu 145 150 155 160 Leu His Thr Leu Glu His Gly Glu Leu Gly Val Asp Asp Val Glu Leu 165 170 175 Val Pro Ile Ser Ser Pro Gly Val Asp Val Pro Ala Glu Gln Leu Glu 180 185 190 Glu Ser Ala Thr Val Lys Gly Ala Asp Leu Phe Pro Asp Val Ala Arg 195 200 205 Gly Gln Ala Ala Val Leu Ala Ser Gly Asp Val Asp Ala Leu Tyr Ser 210 215 220 Trp Leu Pro Trp Ala Gly Glu Leu Gln Ala Thr Gly Ala Arg Pro Val 225 230 235 240 Val Asp Leu Gly Leu Asp Glu Arg Asn Ala Tyr Ala Ser Val Trp Thr 245 250 255 Val Ser Ser Gly Leu Val Arg Gln Arg Pro Gly Leu Val Gln Arg Leu 260 265 270 Val Asp Ala Ala Val Asp Ala Gly Leu Trp Ala Arg Asp His Ser Asp 275 280 285 Ala Val Thr Ser Leu His Ala Ala Asn Leu Gly Val Ser Thr Gly Ala 290 295 300 Val Gly Gln Gly Phe Gly Ala Asp Phe Gln Gln Arg Leu Val Pro Arg 305 310 315 320 Leu Asp His Asp Ala Leu Ala Leu Leu Glu Arg Thr Gln Gln Phe Leu 325 330 335 Leu Thr Asn Asn Leu Leu Gln Glu Pro Val Ala Leu Asp Gln Trp Ala 340 345 350 Ala Pro Glu Phe Leu Asn Asn Ser Leu Asn Arg His Arg 355 360 365 (2) INFORMATION FOR SEQ ID NO : 5: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1251 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 1.. 
1251 (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5: ATG ACA CTG TCA CCT GAA AAG CAG CAC GTT CGA CCA CGC GAC GCC GCC 48 Met Thr Leu Ser Pro Glu Lys Gln His Val Arg Pro Arg Asp Ala Ala 1 5 10 15 GAC AAC GAT CCC GTC GCG GTT GCC CGT GGG CTA GCC GAA AAG TGG CGA 96 Asp Asn Asp Pro Val Ala Val Ala Arg Gly Leu Ala Glu Lys Trp Arg 20 25 30 GCC ACC GCC GTC GAG CGT GAT CGC GCC GGG GGT TCG GCA ACA GCC GAG 144 Ala Thr Ala Val Glu Arg Asp Arg Ala Gly Gly Ser Ala Thr Ala Glu 35 40 45 CGC GAA GAC CTG CGC GCG AGC GCG CTG CTG TCG CTC CTC GTC CCG CGC 192 Arg Glu Asp Leu Arg Ala Ser Ala Leu Leu Ser Leu Leu Val Pro Arg 50 55 60 GAA TAC GGC GGC TGG GGC GCA GAC TGG CCC ACC GCC ATC GAG GTC GTC 240 Glu Tyr Gly Gly Trp Gly Ala Asp Trp Pro Thr Ala Ile Glu Val Val 65 70 75 80 CGC GAA ATC GCG GCA GCC GAT GGA TCT TTG GGA CAC CTG TTC GGA TAC 288 Arg Glu Ile Ala Ala Ala Asp Gly Ser Leu Gly His Leu Phe Gly Tyr 85 90 95 CAC CTC ACC AAC GCC CCG ATG ATC GAA CTG ATC GGC TCG CAG GAA CAA 336 His Leu Thr Asn Ala Pro Met Ile Glu Leu Ile Gly Ser Gln Glu Gln 100 105 110 GAA GAA CAC CTG TAC ACC CAG ATC GCG CAG AAC AAC TGG TGG ACC GGA 384 Glu Glu His Leu Tyr Thr Gln Ile Ala Gln Asn Asn Trp Trp Thr Gly 115 120 125 AAT GCC TCC AGC GAG AAC AAC AGC CAC GTG CTG GAC TGG AAG GTC AGC 432 Asn Ala Ser Ser Glu Asn Asn Ser His Val Leu Asp Trp Lys Val Ser 130 135 140 GCC ACC CCG ACC GAA GAC GGC GGC TAC GTG CTC AAT GGC ACG AAG CAC 480 Ala Thr Pro Thr Glu Asp Gly Gly Tyr Val Leu Asn Gly Thr Lys His 145 150 155 160 TTC TGC AGC GGC GCC AAG GGG TCG GAC CTG CTG TTC GTG TTC GGC GTC 528 Phe Cys Ser Gly Ala Lys Gly Ser Asp Leu Leu Phe Val Phe Gly Val 165 170 175 GTC CAG GAT GAT TCT CCG CAG CAG GGT GCG ATC ATT GCT GCC GCT ATC 576 Val Gln Asp Asp Ser Pro Gln Gln Gly Ala Ile Ile Ala Ala Ala Ile 180 185 190 CCG ACA TCG CGG GCT GGC GTT ACG CCC AAC GAC GAC TGG GCC GCC ATC 624 Pro Thr Ser Arg Ala Gly Val Thr Pro Asn Asp Asp Trp Ala Ala Ile 195 200 205 GGC ATG CGG CAG ACC GAC AGC GGT TCC ACG GAC TTC CAC AAC GTC AAG 672 Gly Met Arg Gln Thr Asp 
Ser Gly Ser Thr Asp Phe His Asn Val Lys 210 215 220 GTC GAG CCT GAC GAA GTG CTG GGC GCG CCC AAC GCC TTC GTT CTC GCC 720 Val Glu Pro Asp Glu Val Leu Gly Ala Pro Asn Ala Phe Val Leu Ala 225 230 235 240 TTC ATA CAA TCC GAG CGC GGC AGC CTC TTC GCG CCC ATA GCG CAA TTG 768 Phe Ile Gln Ser Glu Arg Gly Ser Leu Phe Ala Pro Ile Ala Gln Leu 245 250 255 ATC TTC GCC AAC GTC TAT CTG GGG ATC GCG CAC GGC GCA CTC GAT GCC 816 Ile Phe Ala Asn Val Tyr Leu Gly Ile Ala His Gly Ala Leu Asp Ala 260 265 270 GCC AGG GAG TAC ACC CGT ACC CAG GCG AGG CCC TGG ACA CCG GCC GGT 864 Ala Arg Glu Tyr Thr Arg Thr Gln Ala Arg Pro Trp Thr Pro Ala Gly 275 280 285 ATT CAA CAG GCA ACC GAG GAT CCC TAC ACC ATC CGC TCC TAC GGT GAG 912 Ile Gln Gln Ala Thr Glu Asp Pro Tyr Thr Ile Arg Ser Tyr Gly Glu 290 295 300 TTC ACC ATC GCA TTG CAG GGA GCT GAC GCC GCC GCC CGT GAA GCG GCC 960 Phe Thr Ile Ala Leu Gln Gly Ala Asp Ala Ala Ala Arg Glu Ala Ala 305 310 315 320 CAC CTG CTG CAG ACG GTG TGG GAC AAG GGC GAC GCG CTC ACC CCC GAG 1008 His Leu Leu Gln Thr Val Trp Asp Lys Gly Asp Ala Leu Thr Pro Glu 325 330 335 GAC CGC GGC GAA CTG ATG GTG AAG GTC TCG GGA GTC AAA GCG TTG GCC 1056 Asp Arg Gly Glu Leu Met Val Lys Val Ser Gly Val Lys Ala Leu Ala 340 345 350 ACC AAC GCC GCC CTC AAC ATC AGC AGC GGC GTC TTC GAG GTG ATC GGC 1104 Thr Asn Ala Ala Leu Asn Ile Ser Ser Gly Val Phe Glu Val Ile Gly 355 360 365 GCG CGC GGA ACA CAT CCC AGG TAC GGT TTC GAC CGC TTC TGG CGC AAC 1152 Ala Arg Gly Thr His Pro Arg Tyr Gly Phe Asp Arg Phe Trp Arg Asn 370 375 380 GTG CGC ACC CAC TCC CTG CAC GAC CCG GTG TCC TAC AAG ATC GCC GAC 1200 Val Arg Thr His Ser Leu His Asp Pro Val Ser Tyr Lys Ile Ala Asp 385 390 395 400 GTC GGC AAG CAC ACC TTG AAC GGT CAA TAC CCG ATT CCC GGC TTC ACC 1248 Val Gly Lys His Thr Leu Asn Gly Gln Tyr Pro Ile Pro Gly Phe Thr 405 410 415 TCC 1251 Ser (2) INFORMATION FOR SEQ ID NO : 6: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 417 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE 
DESCRIPTION: SEQ ID NO : 6: Met Thr Leu Ser Pro Glu Lys Gln His Val Arg Pro Arg Asp Ala Ala 1 5 10 15 Asp Asn Asp Pro Val Ala Val Ala Arg Gly Leu Ala Glu Lys Trp Arg 20 25 30 Ala Thr Ala Val Glu Arg Asp Arg Ala Gly Gly Ser Ala Thr Ala Glu 35 40 45 Arg Glu Asp Leu Arg Ala Ser Ala Leu Leu Ser Leu Leu Val Pro Arg 50 55 60 Glu Tyr Gly Gly Trp Gly Ala Asp Trp Pro Thr Ala Ile Glu Val Val 65 70 75 80 Arg Glu Ile Ala Ala Ala Asp Gly Ser Leu Gly His Leu Phe Gly Tyr 85 90 95 His Leu Thr Asn Ala Pro Met Ile Glu Leu Ile Gly Ser Gln Glu Gln 100 105 110 Glu Glu His Leu Tyr Thr Gln Ile Ala Gln Asn Asn Trp Trp Thr Gly 115 120 125 Asn Ala Ser Ser Glu Asn Asn Ser His Val Leu Asp Trp Lys Val Ser 130 135 140 Ala Thr Pro Thr Glu Asp Gly Gly Tyr Val Leu Asn Gly Thr Lys His 145 150 155 160 Phe Cys Ser Gly Ala Lys Gly Ser Asp Leu Leu Phe Val Phe Gly Val 165 170 175 Val Gln Asp Asp Ser Pro Gln Gln Gly Ala Ile Ile Ala Ala Ala Ile 180 185 190 Pro Thr Ser Arg Ala Gly Val Thr Pro Asn Asp Asp Trp Ala Ala Ile 195 200 205 Gly Met Arg Gln Thr Asp Ser Gly Ser Thr Asp Phe His Asn Val Lys 210 215 220 Val Glu Pro Asp Glu Val Leu Gly Ala Pro Asn Ala Phe Val Leu Ala 225 230 235 240 Phe Ile Gln Ser Glu Arg Gly Ser Leu Phe Ala Pro Ile Ala Gln Leu 245 250 255 Ile Phe Ala Asn Val Tyr Leu Gly Ile Ala His Gly Ala Leu Asp Ala 260 265 270 Ala Arg Glu Tyr Thr Arg Thr Gln Ala Arg Pro Trp Thr Pro Ala Gly 275 280 285 Ile Gln Gln Ala Thr Glu Asp Pro Tyr Thr Ile Arg Ser Tyr Gly Glu 290 295 300 Phe Thr Ile Ala Leu Gln Gly Ala Asp Ala Ala Ala Arg Glu Ala Ala 305 310 315 320 His Leu Leu Gln Thr Val Trp Asp Lys Gly Asp Ala Leu Thr Pro Glu 325 330 335 Asp Arg Gly Glu Leu Met Val Lys Val Ser Gly Val Lys Ala Leu Ala 340 345 350 Thr Asn Ala Ala Leu Asn Ile Ser Ser Gly Val Phe Glu Val Ile Gly 355 360 365 Ala Arg Gly Thr His Pro Arg Tyr Gly Phe Asp Arg Phe Trp Arg Asn 370 375 380 Val Arg Thr His Ser Leu His Asp Pro Val Ser Tyr Lys Ile Ala Asp 385 390 395 400 Val Gly Lys His Thr Leu Asn Gly Gln Tyr Pro Ile Pro Gly Phe Thr 405 
410 415 Ser (2) INFORMATION FOR SEQ ID NO : 7: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1362 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 1.. 1359 (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7: ATG ACC GAT CCA CGT CAG CTG CAC CTG GCC GGA TTC TTC TGT GCC GGC 48 Met Thr Asp Pro Arg Gln Leu His Leu Ala Gly Phe Phe Cys Ala Gly 1 5 10 15 AAC GTC ACG CAC GCC CAC GGA GCG TGG CGC CAC GCC GAC GAC TCC AAC 96 Asn Val Thr His Ala His Gly Ala Trp Arg His Ala Asp Asp Ser Asn 20 25 30 GGC TTC CTC ACC AAG GAG TAC TAC CAG CAG ATT GCC CGC ACG CTC GAG 144 Gly Phe Leu Thr Lys Glu Tyr Tyr Gln Gln Ile Ala Arg Thr Leu Glu 35 40 45 CGC GGC AAG TTC GAC CTG CTG TTC CTT CCC GAC GCG CTC GCC GTG TGG 192 Arg Gly Lys Phe Asp Leu Leu Phe Leu Pro Asp Ala Leu Ala Val Trp 50 55 60 GAC AGC TAC GGC GAC AAT CTG GAG ACC GGT CTG CGG TAT GGC GGG CAA 240 Asp Ser Tyr Gly Asp Asn Leu Glu Thr Gly Leu Arg Tyr Gly Gly Gln 65 70 75 80 GGC GCG GTG ATG CTG GAG CCC GGC GTA GTT ATC GCC GCG ATG GCC TCG 288 Gly Ala Val Met Leu Glu Pro Gly Val Val Ile Ala Ala Met Ala Ser 85 90 95 GTG ACC GAA CAT CTG GGG CTG GGC GCC ACC ATT TCC ACC ACC TAC TAC 336 Val Thr Glu His Leu Gly Leu Gly Ala Thr Ile Ser Thr Thr Tyr Tyr 100 105 110 CCG CCC TAC CAT GTA GCC CGG GTC GTC GCT TCG CTG GAC CAG CTG TCC 384 Pro Pro Tyr His Val Ala Arg Val Val Ala Ser Leu Asp Gln Leu Ser 115 120 125 TCC GGG CGA GTG TCG TGG AAC GTG GTC ACC TCG CTC AGC AAT GCA GAG 432 Ser Gly Arg Val Ser Trp Asn Val Val Thr Ser Leu Ser Asn Ala Glu 130 135 140 GCG CGC AAC TTC GGC TTC GAT GAA CAT CTC GAC CAC GAT GCC CGC TAC 480 Ala Arg Asn Phe Gly Phe Asp Glu His Leu Asp His Asp Ala Arg Tyr 145 150 155 160 GAT CGC GCC GAT GAA TTC CTC GAG GTC GTG CGC AAG CTC TGG AAC AGC 528 Asp Arg Ala Asp Glu Phe Leu Glu Val Val Arg Lys Leu Trp Asn Ser 165 170 175 TGG GAT CGC GAT GCG CTG ACA CTC GAC AAG GCA ACC GGC CAG TTC GCC 576 Trp Asp Arg Asp Ala Leu Thr Leu Asp Lys Ala Thr Gly 
Gln Phe Ala 180 185 190 GAT CCG GCT AAG GTG CGC TAC ATC GAC CAC CGC GGC GAA TGG CTC AAC 624 Asp Pro Ala Lys Val Arg Tyr Ile Asp His Arg Gly Glu Trp Leu Asn 195 200 205 GTA CGC GGG CCG CTT CAG GTG CCG CGC TCC CCC CAG GGC GAG CCT GTC 672 Val Arg Gly Pro Leu Gln Val Pro Arg Ser Pro Gln Gly Glu Pro Val 210 215 220 ATT CTG CAG GCC GGG CTT TCG GCG CGG GGC AAG CGC TTC GCC GGG CGC 720 Ile Leu Gln Ala Gly Leu Ser Ala Arg Gly Lys Arg Phe Ala Gly Arg 225 230 235 240 TGG GCG GAC GCG GTG TTC ACG ATT TCG CCC AAT CTG GAC ATC ATG CAG 768 Trp Ala Asp Ala Val Phe Thr Ile Ser Pro Asn Leu Asp Ile Met Gln 245 250 255 GCC ACG TAC CGC GAC ATA AAG GCG CAG GTC GAG GCC GCC GGA CGC GAT 816 Ala Thr Tyr Arg Asp Ile Lys Ala Gln Val Glu Ala Ala Gly Arg Asp 260 265 270 CCC GAG CAG GTC AAG GTG TTT GCC GCG GTG ATG CCG ATC CTC GGC GAG 864 Pro Glu Gln Val Lys Val Phe Ala Ala Val Met Pro Ile Leu Gly Glu 275 280 285 ACC GAG GCG ATC GCC AGG CAG CGT CTC GAA TAC ATA AAT TCG CTG GTG 912 Thr Glu Ala Ile Ala Arg Gln Arg Leu Glu Tyr Ile Asn Ser Leu Val 290 295 300 CAT CCC GAA GTC GGG CTT TCT ACG TTG TCC AGC CAT GTC GGG GTC AAC 960 His Pro Glu Val Gly Leu Ser Thr Leu Ser Ser His Val Gly Val Asn 305 310 315 320 CTT GCC GAC TAT TCG CTC GAT ACC CCG CTG ACC GAG GTC CTG GGC GAT 1008 Leu Ala Asp Tyr Ser Leu Asp Thr Pro Leu Thr Glu Val Leu Gly Asp 325 330 335 CTC GCC CAG CGC AAC GTG CCC ACC CAA CTG GGC ATG TTC GCC AGG ATG 1056 Leu Ala Gln Arg Asn Val Pro Thr Gln Leu Gly Met Phe Ala Arg Met 340 345 350 TTG CAG GCC GAG ACG CTG ACC GTG GGA GAA ATG GGC CGG CGT TAT GGC 1104 Leu Gln Ala Glu Thr Leu Thr Val Gly Glu Met Gly Arg Arg Tyr Gly 355 360 365 GCC AAC GTG GGC TTC GTC CCG CAG TGG GCG GGA ACC CGC GAG CAG ATC 1152 Ala Asn Val Gly Phe Val Pro Gln Trp Ala Gly Thr Arg Glu Gln Ile 370 375 380 GCG GAC CTG ATC GAG ATC CAT TTC AAG GCC GGC GGC GCC GAT GGC TTC 1200 Ala Asp Leu Ile Glu Ile His Phe Lys Ala Gly Gly Ala Asp Gly Phe 385 390 395 400 ATC ATC TCG CCG GCG TTC CTG CCC GGA TCT TAC GAG GAA TTC GTC GAT 1248 Ile Ile Ser Pro 
Ala Phe Leu Pro Gly Ser Tyr Glu Glu Phe Val Asp 405 410 415 CAG GTG GTG CCC ATC CTG CAG CAC CGC GGA CTG TTC CGC ACT GAT TAC 1296 Gln Val Val Pro Ile Leu Gln His Arg Gly Leu Phe Arg Thr Asp Tyr 420 425 430 GAA GGC CGC ACC CTG CGC AGC CAT CTG GGA CTG CGT GAA CCC GCA TAC 1344 Glu Gly Arg Thr Leu Arg Ser His Leu Gly Leu Arg Glu Pro Ala Tyr 435 440 445 CTG GGA GAG TAC GCA TGA 1362 Leu Gly Glu Tyr Ala 450 (2) INFORMATION FOR SEQ ID NO : 8: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 453 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8: Met Thr Asp Pro Arg Gln Leu His Leu Ala Gly Phe Phe Cys Ala Gly 1 5 10 15 Asn Val Thr His Ala His Gly Ala Trp Arg His Ala Asp Asp Ser Asn 20 25 30 Gly Phe Leu Thr Lys Glu Tyr Tyr Gln Gln Ile Ala Arg Thr Leu Glu 35 40 45 Arg Gly Lys Phe Asp Leu Leu Phe Leu Pro Asp Ala Leu Ala Val Trp 50 55 60 Asp Ser Tyr Gly Asp Asn Leu Glu Thr Gly Leu Arg Tyr Gly Gly Gln 65 70 75 80 Gly Ala Val Met Leu Glu Pro Gly Val Val Ile Ala Ala Met Ala Ser 85 90 95 Val Thr Glu His Leu Gly Leu Gly Ala Thr Ile Ser Thr Thr Tyr Tyr 100 105 110 Pro Pro Tyr His Val Ala Arg Val Val Ala Ser Leu Asp Gln Leu Ser 115 120 125 Ser Gly Arg Val Ser Trp Asn Val Val Thr Ser Leu Ser Asn Ala Glu 130 135 140 Ala Arg Asn Phe Gly Phe Asp Glu His Leu Asp His Asp Ala Arg Tyr 145 150 155 160 Asp Arg Ala Asp Glu Phe Leu Glu Val Val Arg Lys Leu Trp Asn Ser 165 170 175 Trp Asp Arg Asp Ala Leu Thr Leu Asp Lys Ala Thr Gly Gln Phe Ala 180 185 190 Asp Pro Ala Lys Val Arg Tyr Ile Asp His Arg Gly Glu Trp Leu Asn 195 200 205 Val Arg Gly Pro Leu Gln Val Pro Arg Ser Pro Gln Gly Glu Pro Val 210 215 220 Ile Leu Gln Ala Gly Leu Ser Ala Arg Gly Lys Arg Phe Ala Gly Arg 225 230 235 240 Trp Ala Asp Ala Val Phe Thr Ile Ser Pro Asn Leu Asp Ile Met Gln 245 250 255 Ala Thr Tyr Arg Asp Ile Lys Ala Gln Val Glu Ala Ala Gly Arg Asp 260 265 270 Pro Glu Gln Val Lys Val Phe Ala Ala Val Met Pro Ile Leu Gly Glu 275 280 285 Thr Glu Ala Ile Ala Arg Gln 
Arg Leu Glu Tyr Ile Asn Ser Leu Val 290 295 300 His Pro Glu Val Gly Leu Ser Thr Leu Ser Ser His Val Gly Val Asn 305 310 315 320 Leu Ala Asp Tyr Ser Leu Asp Thr Pro Leu Thr Glu Val Leu Gly Asp 325 330 335 Leu Ala Gln Arg Asn Val Pro Thr Gln Leu Gly Met Phe Ala Arg Met 340 345 350 Leu Gln Ala Glu Thr Leu Thr Val Gly Glu Met Gly Arg Arg Tyr Gly 355 360 365 Ala Asn Val Gly Phe Val Pro Gln Trp Ala Gly Thr Arg Glu Gln Ile 370 375 380 Ala Asp Leu Ile Glu Ile His Phe Lys Ala Gly Gly Ala Asp Gly Phe 385 390 395 400 Ile Ile Ser Pro Ala Phe Leu Pro Gly Ser Tyr Glu Glu Phe Val Asp 405 410 415 Gln Val Val Pro Ile Leu Gln His Arg Gly Leu Phe Arg Thr Asp Tyr 420 425 430 Glu Gly Arg Thr Leu Arg Ser His Leu Gly Leu Arg Glu Pro Ala Tyr 435 440 445 Leu Gly Glu Tyr Ala 450 (2) INFORMATION FOR SEQ ID NO : 9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1110 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 1.. 
1107 (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 9: ATG ACG ACA GAC ATC CAC CCG GCG AGC GCC GCA TCG TCG CCG GCG GCG 48 Met Thr Thr Asp Ile His Pro Ala Ser Ala Ala Ser Ser Pro Ala Ala 1 5 10 15 CGC GCG ACG ATC ACC TAC AGC AAC TGC CCC GTG CCT AAT GCC CTG CTC 96 Arg Ala Thr Ile Thr Tyr Ser Asn Cys Pro Val Pro Asn Ala Leu Leu 20 25 30 GCC GCG CTC GGC TCA GGT ATT CTG GAC AGT GCC GGG ATC ACA CTT GCC 144 Ala Ala Leu Gly Ser Gly Ile Leu Asp Ser Ala Gly Ile Thr Leu Ala 35 40 45 CTG CTG ACC GGA AAG CAG GGC GAG GTG CAC TTC ACC TAC GAC CGA GAT 192 Leu Leu Thr Gly Lys Gln Gly Glu Val His Phe Thr Tyr Asp Arg Asp 50 55 60 GAC TAC ACC CGC TTC GGC GGC GAG ATT CCG CCG CTG GTC AGC GAG GGA 240 Asp Tyr Thr Arg Phe Gly Gly Glu Ile Pro Pro Leu Val Ser Glu Gly 65 70 75 80 CTG CGT GCG CCG GGG CGG ACC CGC CTG CTG GGA CTG ACG CCG GTG CTG 288 Leu Arg Ala Pro Gly Arg Thr Arg Leu Leu Gly Leu Thr Pro Val Leu 85 90 95 GGC CGC TGG GGC TAC TTC GTC CGG GGC GAC AGC GCG ATC CGC ACC CCG 336 Gly Arg Trp Gly Tyr Phe Val Arg Gly Asp Ser Ala Ile Arg Thr Pro 100 105 110 GCC GAT CTT GCC GGC CGC CGC GTC GGA GTA TCC GAT TCG GCC AGG AGG 384 Ala Asp Leu Ala Gly Arg Arg Val Gly Val Ser Asp Ser Ala Arg Arg 115 120 125 ATA TTG ACC GGA AGG CTG GGC GAC TAC CGC GAA CTT GAT CCC TGG CGG 432 Ile Leu Thr Gly Arg Leu Gly Asp Tyr Arg Glu Leu Asp Pro Trp Arg 130 135 140 CAG ACC CTG GTC GCG CTG GGG ACA TGG GAG GCG CGT GCC TTG CTG AGC 480 Gln Thr Leu Val Ala Leu Gly Thr Trp Glu Ala Arg Ala Leu Leu Ser 145 150 155 160 ACG CTC GAG ACG GCG GGG CTT GGC GTC GGC GAC GTC GAG CTG ACG CGC 528 Thr Leu Glu Thr Ala Gly Leu Gly Val Gly Asp Val Glu Leu Thr Arg 165 170 175 ATC GAG AAC CCG TTC GTC GAC GTG CCG ACC GAA CGA CTG CAT GCC GCC 576 Ile Glu Asn Pro Phe Val Asp Val Pro Thr Glu Arg Leu His Ala Ala 180 185 190 GGC TCG CTC AAA GGA ACC GAC CTG TTC CCC GAC GTG ACC AGC CAG CAG 624 Gly Ser Leu Lys Gly Thr Asp Leu Phe Pro Asp Val Thr Ser Gln Gln 195 200 205 GCC GCA GTC CTT GAG GAT GAG CGC GCC GAC GCC CTG TTC GCG TGG CTT 672 Ala Ala Val Leu Glu Asp 
Glu Arg Ala Asp Ala Leu Phe Ala Trp Leu 210 215 220 CCC TGG GCG GCC GAG CTC GAG ACC CGC ATC GGT GCA CGG CCG GTC CTA 720 Pro Trp Ala Ala Glu Leu Glu Thr Arg Ile Gly Ala Arg Pro Val Leu 225 230 235 240 GAC CTC AGC GCA GAC GAC CGC AAT GCC TAT GCG AGC ACC TGG ACG GTG 768 Asp Leu Ser Ala Asp Asp Arg Asn Ala Tyr Ala Ser Thr Trp Thr Val 245 250 255 AGC GCC GAG CTG GTG GAC CGG CAG CCC GAA CTG GTG CAG CGG CTC GTC 816 Ser Ala Glu Leu Val Asp Arg Gln Pro Glu Leu Val Gln Arg Leu Val 260 265 270 GAT GCC GTG GTG GAT GCA GGG CGG TGG GCC GAG GCC AAT GGC GAT GTC 864 Asp Ala Val Val Asp Ala Gly Arg Trp Ala Glu Ala Asn Gly Asp Val 275 280 285 GTC TCC CGC CTG CAC GCC GAT AAC CTC GGT GTC AGT CCC GAA AGC GTC 912 Val Ser Arg Leu His Ala Asp Asn Leu Gly Val Ser Pro Glu Ser Val 290 295 300 CGC CAG GGA TTC GGA GCC GAT TTT CAC CGC CGC CTG ACG CCG CGG CTC 960 Arg Gln Gly Phe Gly Ala Asp Phe His Arg Arg Leu Thr Pro Arg Leu 305 310 315 320 GAC AGC GAT GCT ATC GCC ATC CTG GAG CGT ACT CAG CGG TTC CTG AAG 1008 Asp Ser Asp Ala Ile Ala Ile Leu Glu Arg Thr Gln Arg Phe Leu Lys 325 330 335 GAT GCG AAC CTG ATC GAT CGG TCG TTG GCG CTC GAT CGG TGG GCT GCA 1056 Asp Ala Asn Leu Ile Asp Arg Ser Leu Ala Leu Asp Arg Trp Ala Ala 340 345 350 CCT GAA TTC CTC GAA CAA AGT CTC TCA CGC CAG GTC GAA GGG CAG ATA 1104 Pro Glu Phe Leu Glu Gln Ser Leu Ser Arg Gln Val Glu Gly Gln Ile 355 360 365 GCA TGA 1110 Ala (2) INFORMATION FOR SEQ ID NO : 10: (i) SEQUENCE CHARACTERISTICS : (A) LENGTH: 369 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 10: Met Thr Thr Asp Ile His Pro Ala Ser Ala Ala Ser Ser Pro Ala Ala 1 5 10 15 Arg Ala Thr Ile Thr Tyr Ser Asn Cys Pro Val Pro Asn Ala Leu Leu 20 25 30 Ala Ala Leu Gly Ser Gly Ile Leu Asp Ser Ala Gly Ile Thr Leu Ala 35 40 45 Leu Leu Thr Gly Lys Gln Gly Glu Val His Phe Thr Tyr Asp Arg Asp 50 55 60 Asp Tyr Thr Arg Phe Gly Gly Glu Ile Pro Pro Leu Val Ser Glu Gly 65 70 75 80 Leu Arg Ala Pro Gly Arg Thr Arg Leu Leu Gly 
Leu Thr Pro Val Leu 85 90 95 Gly Arg Trp Gly Tyr Phe Val Arg Gly Asp Ser Ala Ile Arg Thr Pro 100 105 110 Ala Asp Leu Ala Gly Arg Arg Val Gly Val Ser Asp Ser Ala Arg Arg 115 120 125 Ile Leu Thr Gly Arg Leu Gly Asp Tyr Arg Glu Leu Asp Pro Trp Arg 130 135 140 Gln Thr Leu Val Ala Leu Gly Thr Trp Glu Ala Arg Ala Leu Leu Ser 145 150 155 160 Thr Leu Glu Thr Ala Gly Leu Gly Val Gly Asp Val Glu Leu Thr Arg 165 170 175 Ile Glu Asn Pro Phe Val Asp Val Pro Thr Glu Arg Leu His Ala Ala 180 185 190 Gly Ser Leu Lys Gly Thr Asp Leu Phe Pro Asp Val Thr Ser Gln Gln 195 200 205 Ala Ala Val Leu Glu Asp Glu Arg Ala Asp Ala Leu Phe Ala Trp Leu 210 215 220 Pro Trp Ala Ala Glu Leu Glu Thr Arg Ile Gly Ala Arg Pro Val Leu 225 230 235 240 Asp Leu Ser Ala Asp Asp Arg Asn Ala Tyr Ala Ser Thr Trp Thr Val 245 250 255 Ser Ala Glu Leu Val Asp Arg Gln Pro Glu Leu Val Gln Arg Leu Val 260 265 270 Asp Ala Val Val Asp Ala Gly Arg Trp Ala Glu Ala Asn Gly Asp Val 275 280 285 Val Ser Arg Leu His Ala Asp Asn Leu Gly Val Ser Pro Glu Ser Val 290 295 300 Arg Gln Gly Phe Gly Ala Asp Phe His Arg Arg Leu Thr Pro Arg Leu 305 310 315 320 Asp Ser Asp Ala Ile Ala Ile Leu Glu Arg Thr Gln Arg Phe Leu Lys 325 330 335 Asp Ala Asn Leu Ile Asp Arg Ser Leu Ala Leu Asp Arg Trp Ala Ala 340 345 350 Pro Glu Phe Leu Glu Gln Ser Leu Ser Arg Gln Val Glu Gly Gln Ile 355 360 365 Ala (2) INFORMATION FOR SEQ ID NO : 11: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1236 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 1.. 
1236 (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 11: ATG AAC GAA CTC GTC AAA GAT CTC GGC CTC AAT CGA TCC GAT CCG ATC 48 Met Asn Glu Leu Val Lys Asp Leu Gly Leu Asn Arg Ser Asp Pro Ile 1 5 10 15 GGC GCT GTG CGG CGA CTG GCC GCG CAG TGG GGG GCC ACC GCT GTT GAT 96 Gly Ala Val Arg Arg Leu Ala Ala Gln Trp Gly Ala Thr Ala Val Asp 20 25 30 CGG GAC CGG GCC GGC GGA TCG GCA ACC GCC GAA CTC GAT CAA CTG CGC 144 Arg Asp Arg Ala Gly Gly Ser Ala Thr Ala Glu Leu Asp Gln Leu Arg 35 40 45 GGC AGC GGC CTG CTC TCG CTG TCC ATT CCC GCC GCA TAT GGC GGC TGG 192 Gly Ser Gly Leu Leu Ser Leu Ser Ile Pro Ala Ala Tyr Gly Gly Trp 50 55 60 GGC GCC GAC TGG CCA ACG ACT CTG GAA GTT ATC CGC GAA GTC GCA ACG 240 Gly Ala Asp Trp Pro Thr Thr Leu Glu Val Ile Arg Glu Val Ala Thr 65 70 75 80 GTG GAC GGA TCG CTG GCG CAT CTA TTC GGC TAC CAC CTC GGC TGC GTA 288 Val Asp Gly Ser Leu Ala His Leu Phe Gly Tyr His Leu Gly Cys Val 85 90 95 CCG ATG ATC GAG CTG TTC GGC TCG GCG CCA CAA AAG GAA CGG CTG TAC 336 Pro Met Ile Glu Leu Phe Gly Ser Ala Pro Gln Lys Glu Arg Leu Tyr 100 105 110 CGC CAG ATC GCA AGC CAT GAT TGG CGG GTC GGG AAT GCG TCG AGC GAA 384 Arg Gln Ile Ala Ser His Asp Trp Arg Val Gly Asn Ala Ser Ser Glu 115 120 125 AAC AAC AGC CAC GTG CTC GAG TGG AAG CTT GCC GCC ACC GCC GTC GAT 432 Asn Asn Ser His Val Leu Glu Trp Lys Leu Ala Ala Thr Ala Val Asp 130 135 140 GAT GGC GGG TTC GTC CTC AAC GGC GCG AAG CAC TTC TGC AGC GGC GCC 480 Asp Gly Gly Phe Val Leu Asn Gly Ala Lys His Phe Cys Ser Gly Ala 145 150 155 160 AAA AGC TCC GAC CTG CTC ATC GTG TTC GGC GTG ATC CAG GAC GAA TCC 528 Lys Ser Ser Asp Leu Leu Ile Val Phe Gly Val Ile Gln Asp Glu Ser 165 170 175 CCC CTG CGC GGC GCG ATC ATC ACC GCG GTC ATT CCC ACC GAC CGG GCC 576 Pro Leu Arg Gly Ala Ile Ile Thr Ala Val Ile Pro Thr Asp Arg Ala 180 185 190 GGT GTT CAG ATC AAT GAC GAC TGG CGC GCA ATC GGG ATG CGC CAG ACC 624 Gly Val Gln Ile Asn Asp Asp Trp Arg Ala Ile Gly Met Arg Gln Thr 195 200 205 GAC AGC GGC AGC GCC GAA TTT CGC GAC GTC CGA GTC TAC CCA GAC GAG 672 Asp Ser Gly Ser Ala Glu 
Phe Arg Asp Val Arg Val Tyr Pro Asp Glu 210 215 220 ATC TTG GGG GCA CCA AAC TCA GTC GTT GAG GCG TTC GTG ACA AGC AAC 720 Ile Leu Gly Ala Pro Asn Ser Val Val Glu Ala Phe Val Thr Ser Asn 225 230 235 240 CGC GGC AGC CTG TGG ACG CCG GCG ATT CAG TCG ATC TTC TCG AAC GTT 768 Arg Gly Ser Leu Trp Thr Pro Ala Ile Gln Ser Ile Phe Ser Asn Val 245 250 255 TAT CTG GGG CTC GCG CGT GGC GCG CTC GAG GCG GCA GCG GAT TAC ACC 816 Tyr Leu Gly Leu Ala Arg Gly Ala Leu Glu Ala Ala Ala Asp Tyr Thr 260 265 270 CGG ACC CAG AGC CGC CCC TGG ACA CCC GCC GGC GTG GCG AAG GCG ACA 864 Arg Thr Gln Ser Arg Pro Trp Thr Pro Ala Gly Val Ala Lys Ala Thr 275 280 285 GAG GAT CCC CAC ATC ATC GCC ACC TAC GGT GAA CTG GCG ATC GCG CTC 912 Glu Asp Pro His Ile Ile Ala Thr Tyr Gly Glu Leu Ala Ile Ala Leu 290 295 300 CAG GGC GCC GAG GCG GCC GCG CGC GAG GTC GCG GCC CTG TTG CAA CAG 960 Gln Gly Ala Glu Ala Ala Ala Arg Glu Val Ala Ala Leu Leu Gln Gln 305 310 315 320 GCG TGG GAC AAG GGC GAT GCG GTG ACG CCC GAA GAG CGC GGC CAG CTG 1008 Ala Trp Asp Lys Gly Asp Ala Val Thr Pro Glu Glu Arg Gly Gln Leu 325 330 335 ATG GTG AAG GTT TCG GGT GTG AAG GCC CTC TCG ACG AAG GCC GCC CTC 1056 Met Val Lys Val Ser Gly Val Lys Ala Leu Ser Thr Lys Ala Ala Leu 340 345 350 GAC ATC ACC AGC CGT ATT TTC GAG ACA ACG GGC TCG CGA TCG ACG CAT 1104 Asp Ile Thr Ser Arg Ile Phe Glu Thr Thr Gly Ser Arg Ser Thr His 355 360 365 CCC AGA TAC GGA TTC GAT CGG TTC TGG CGT AAC ATC CGG ACT CAT ACG 1152 Pro Arg Tyr Gly Phe Asp Arg Phe Trp Arg Asn Ile Arg Thr His Thr 370 375 380 CTG CAC GAT CCG GTA TCG TAT AAA ATC GTC GAT GTG GGG AAC TAC ACG 1200 Leu His Asp Pro Val Ser Tyr Lys Ile Val Asp Val Gly Asn Tyr Thr 385 390 395 400 CTC AAC GGG ACA TTC CCG GTT CCC GGA TTT ACG TCA 1236 Leu Asn Gly Thr Phe Pro Val Pro Gly Phe Thr Ser 405 410 (2) INFORMATION FOR SEQ ID NO : 12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 412 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 12: Met Asn Glu Leu Val Lys Asp 
Leu Gly Leu Asn Arg Ser Asp Pro Ile 1 5 10 15 Gly Ala Val Arg Arg Leu Ala Ala Gln Trp Gly Ala Thr Ala Val Asp 20 25 30 Arg Asp Arg Ala Gly Gly Ser Ala Thr Ala Glu Leu Asp Gln Leu Arg 35 40 45 Gly Ser Gly Leu Leu Ser Leu Ser Ile Pro Ala Ala Tyr Gly Gly Trp 50 55 60 Gly Ala Asp Trp Pro Thr Thr Leu Glu Val Ile Arg Glu Val Ala Thr 65 70 75 80 Val Asp Gly Ser Leu Ala His Leu Phe Gly Tyr His Leu Gly Cys Val 85 90 95 Pro Met Ile Glu Leu Phe Gly Ser Ala Pro Gln Lys Glu Arg Leu Tyr 100 105 110 Arg Gln Ile Ala Ser His Asp Trp Arg Val Gly Asn Ala Ser Ser Glu 115 120 125 Asn Asn Ser His Val Leu Glu Trp Lys Leu Ala Ala Thr Ala Val Asp 130 135 140 Asp Gly Gly Phe Val Leu Asn Gly Ala Lys His Phe Cys Ser Gly Ala 145 150 155 160 Lys Ser Ser Asp Leu Leu Ile Val Phe Gly Val Ile Gln Asp Glu Ser 165 170 175 Pro Leu Arg Gly Ala Ile Ile Thr Ala Val Ile Pro Thr Asp Arg Ala 180 185 190 Gly Val Gln Ile Asn Asp Asp Trp Arg Ala Ile Gly Met Arg Gln Thr 195 200 205 Asp Ser Gly Ser Ala Glu Phe Arg Asp Val Arg Val Tyr Pro Asp Glu 210 215 220 Ile Leu Gly Ala Pro Asn Ser Val Val Glu Ala Phe Val Thr Ser Asn 225 230 235 240 Arg Gly Ser Leu Trp Thr Pro Ala Ile Gln Ser Ile Phe Ser Asn Val 245 250 255 Tyr Leu Gly Leu Ala Arg Gly Ala Leu Glu Ala Ala Ala Asp Tyr Thr 260 265 270 Arg Thr Gln Ser Arg Pro Trp Thr Pro Ala Gly Val Ala Lys Ala Thr 275 280 285 Glu Asp Pro His Ile Ile Ala Thr Tyr Gly Glu Leu Ala Ile Ala Leu 290 295 300 Gln Gly Ala Glu Ala Ala Ala Arg Glu Val Ala Ala Leu Leu Gln Gln 305 310 315 320 Ala Trp Asp Lys Gly Asp Ala Val Thr Pro Glu Glu Arg Gly Gln Leu 325 330 335 Met Val Lys Val Ser Gly Val Lys Ala Leu Ser Thr Lys Ala Ala Leu 340 345 350 Asp Ile Thr Ser Arg Ile Phe Glu Thr Thr Gly Ser Arg Ser Thr His 355 360 365 Pro Arg Tyr Gly Phe Asp Arg Phe Trp Arg Asn Ile Arg Thr His Thr 370 375 380 Leu His Asp Pro Val Ser Tyr Lys Ile Val Asp Val Gly Asn Tyr Thr 385 390 395 400 Leu Asn Gly Thr Phe Pro Val Pro Gly Phe Thr Ser 405 410 (2) INFORMATION FOR SEQ ID NO : 13: (i) SEQUENCE CHARACTERISTICS: (A) 
LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
AATTATCGAT GAATTCCCGG GCCTGAGGAG ATCTTCGAAC TAGTA 45

(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
AGCTTACTAG TTCGAAGATC TCCTCAGGCC CGGGAATTCA TCGAT 45

(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 68 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
ATCGGAATTC TCTAGAAGAT CTGATCGTGG AGGATGATTA AATGACAAGC CGCGTCGACC 60
CCGCAAAC 68

(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 83 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TAATAAGCTT ACTAGTTTAG CGATGTCGGT TCAGAGAATT ATTGAGGAAC TCCGGAGCGT 60
TGGGTACCGG GCAGTTGCTG TAG 83

(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
AATTCTAGAG AGGAACTCCA TGCCAATCAA TTGCAAAGCC CGGGACTAGT A 51

(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
AGCTTACTAG TCCCGGGCTT TGCAATTGAT TGGCATGGAG TTCCTCTCTA G 51

(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CGATCTAGAG GAGGCTTCAT ATGTTTAAAC TAGTC 35

(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TCGAGACTAG TTTAAACATA TGAAGCCTCC TCTAGAT 37

(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 87 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 40..60
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GATCCACTAG TCCTGAGGAC ATCCATGAGG AGATAACCG ATG TCT GAC AAG CCG 54
                                           Met Ser Asp Lys Pro
                                           1               5
AAT GCC GCATGCTACG TATTAATTAA ACTAGTA 87
Asn Ala

(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
Met Ser Asp Lys Pro Asn Ala
1               5

(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 87 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
AGCTTACTAG TTTAATTAAT ACGATGCATG CGGCATTCGG CTTGTCAGAC ATCGGTTATC 60
TCCTCATGGA TGTCCTCAGG ACTAGTG 87
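The short entries in this listing can be checked against one another mechanically. The following is a minimal Python sketch, not part of the application as filed, that (a) translates the CDS annotated at positions 40..60 of SEQ ID NO: 21 with the standard genetic code and (b) tests whether the paired oligonucleotides SEQ ID NO: 13 and SEQ ID NO: 14 anneal into a duplex with single-stranded 5' ends. The helper names (`translate`, `reverse_complement`, `annealed_ends`) and the `min_core` threshold are hypothetical choices made for illustration; the sequences are copied from the listing, and the use of the standard genetic code is an assumption.

```python
# Illustrative consistency checks on short entries of the sequence listing.
# Helper names are hypothetical; sequences are copied from the listing.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, start, end):
    """Translate the 1-based, inclusive range start..end into three-letter codes."""
    cds = dna[start - 1:end]
    if len(cds) % 3:
        raise ValueError("CDS length is not a multiple of three")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_ends(top, bottom, min_core=10):
    """Return (top 5' overhang, duplex length, bottom 5' overhang), or None.

    Assumes the duplex runs to the 3' end of `top`, as it would for a
    linker made of two oligonucleotides with 5' single-stranded ends.
    """
    rc = reverse_complement(bottom)
    for k in range(len(top) - min_core + 1):
        core = top[k:]
        if rc.startswith(core):
            return top[:k], len(core), bottom[:len(rc) - len(core)]
    return None


SEQ_13 = "AATTATCGAT GAATTCCCGG GCCTGAGGAG ATCTTCGAAC TAGTA".replace(" ", "")
SEQ_14 = "AGCTTACTAG TTCGAAGATC TCCTCAGGCC CGGGAATTCA TCGAT".replace(" ", "")
SEQ_21 = (
    "GATCCACTAG TCCTGAGGAC ATCCATGAGG AGATAACCG ATG TCT GAC AAG CCG "
    "AAT GCC GCATGCTACG TATTAATTAA ACTAGTA"
).replace(" ", "")

if __name__ == "__main__":
    print(translate(SEQ_21, 40, 60))
    print(annealed_ends(SEQ_13, SEQ_14))
```

Under these assumptions, the translated CDS reads Met Ser Asp Lys Pro Asn Ala, which agrees with SEQ ID NO: 22, and SEQ ID NO: 13 and SEQ ID NO: 14 form a 41-nucleotide duplex with 5' overhangs AATT and AGCT. Those overhangs would be compatible with EcoRI- and HindIII-cut ends, although this part of the listing does not state the intended use of the oligonucleotides.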