Title:
MOLECULAR CLONES OF HIV-1 AND USES THEREOF
Document Type and Number:
WIPO Patent Application WO/1992/006990
Kind Code:
A1
Abstract:
The present invention relates to the HIV-1 strains MN-ST1 and BA-L which are typical United States HIV-1 isotypes. The present invention relates to DNA segments encoding the envelope protein of MN-ST1 or BA-L, to DNA constructs containing such DNA segments and to host cells transformed with such constructs. The viral isolates and envelope proteins of the present invention are of value for use in vaccines and bioassays for the detection of HIV-1 infection in biological samples, such as blood bank samples.

Inventors:
REITZ MARVIN S JR (US)
FRANCHINI GENOVEFFA (US)
MARKHAM PHILLIP D (US)
GALLO ROBERT C (US)
LORI FRANCO C (US)
POPOVIC MIKULAS (US)
GARNTER SUZANNE (US)
Application Number:
PCT/US1991/007611
Publication Date:
April 30, 1992
Filing Date:
October 17, 1991
Assignee:
US HEALTH (US)
International Classes:
A61K39/00; A61K39/21; A61P31/12; C07K14/00; C07K14/155; C07K14/16; C07K14/705; C07K17/00; C12N5/10; C12N7/00; C12N15/09; C12N15/49; C12P21/02; C12Q1/70; (IPC1-7): A61K39/10; C07H15/12; C07K3/12; C07K13/00; C07K17/00; C12N5/10; C12N7/02; C12N7/04; C12N15/49; C12Q1/70; G01N33/53
Other References:
Science, Vol. 241, issued 22 July 1988, W.C. KOFF et al, "Development and Testing of AIDS Vaccines", pages 426-432. see entire article.
Nature, Vol. 312, issued 20/27 December 1984, P.A. LUCIW et al, "Molecular cloning of AIDS-associated retrovirus", pages 760-763. see entire article.
Science, Vol. 226, issued 07 December 1984, G.M. SHAW et al, "Molecular Characterization of Human T-Cell Leukemia (Lymphotropic) Virus Type III in the Acquired Immune Deficiency Syndrome", pages 1165-1171. see entire document.
Nature, Vol. 312, issued 20/27 December 1984, M. ALIZON et al, "Molecular cloning of lymphadenopathy-associated virus", pages 757-760. see entire article.
Journal of Medical Virology, Vol. 19, issued 1986, H. RUEBSAMEN-WAIGMANN et al, "Isolation of Variants of Lymphocytopathic Retroviruses From the Peripheral Blood and Cerebrospinal Fluid of Patients With ARC or AIDS", pages 335-344. see entire article.
Nature, Vol. 313, issued 24 January 1985, L. RATNER et al, "Complete nucleotide sequence of the AIDS virus, HTLV-III", pages 277-284. see entire article.
Cell, Vol. 40, issued January 1985, S. WAIN-HOBSON et al, "Nucleotide Sequence of the AIDS Virus, LAV", pages 9-17. see entire article.
Science, Vol. 227, issued 01 February 1985, R. SANCHEZ-PESCADOR et al, "Nucleotide Sequence and Expression of an AIDS-Associated Retrovirus (ARV-2)", pages 484-492. see entire article.
Nature, Vol. 313, issued 07 February 1985, M.A. MUESING et al, "Nucleic acid structure and expression of the human AIDS/lymphadenopathy retrovirus", pages 450-458. see entire article.
Nature, Vol. 320, issued 10 April 1986, S.-L. HU et al, "Expression of AIDS virus envelope gene in recombinant vaccinia viruses", pages 537-540. see entire article.
Nature, Vol. 320, issued 10 April 1986, S. CHAKRABARTI et al, "Expression of the HTLV-III envelope gene by a recombinant vaccinia virus", pages 535-537. see entire article.
Bio/Technology, Vol. 3, issued October 1985, T.W. CHANG et al, "Detection of Antibodies to Human T-Cell Lymphotropic Virus-III (HTLV-III) with an Immunoassay Employing a Recombinant Escherichia coli-Derived Viral Antigenic Peptide", pages 905-909. see entire article.
Proc. Natl. Acad. Sci. USA, Vol. 84, issued October 1987, J. R. RUSCHE et al, "Humoral immune response to the entire human immunodeficiency virus envelope glycoprotein made in insect cells", pages 6924-6928. see entire article.
J. Virology, Vol. 63, No. 3, issued March 1989, M. HADZOPOULOU-CLADARAS et al, "The rev (trs/art) Protein of Human Immunodeficiency Virus Type 1 Affects Viral mRNA and Protein Expression via a cis-Acting Sequence in the env Region", pages 1265-1274. see entire article.
J. Virology, Vol. 64, No. 9, issued September 1990, P.J. DILLON et al, "Function of the Human Immunodeficiency Virus Types 1 and 2 Rev Proteins Is Dependent on Their Ability To Interact with a Structured Region Present in env Gene mRNA", pages 4428-4437. see entire article.
Cell, Vol. 45, issued 06 June 1986, B.R. STARCICH et al, "Identification and Characterization of Conserved and Variable Regions in the Envelope Gene of HTLV-III/LAV, the Retrovirus of AIDS", pages 637-648. see entire article.
J. Virology, Vol. 61, No. 2, issued February 1987, S. MODROW et al, "Computer-Assisted Analysis of Envelope Protein Sequences of Seven Human Immunodeficiency Virus Isolates: Prediction of Antigenic Epitopes in Conserved and Variable Regions", pages 570-578. see entire article.
Analytical Biochemistry, Vol. 151, issued 1985, D. PAULETTI et al, "Application of a Modified Computer Algorithm in Determining Potential Antigenic Determinants Associated with the AIDS Virus Glycoprotein", pages 540-546. see entire article.
Virology, Vol. 164, issued 1988, C. GURGO et al, "Envelope Sequences of Two New United States HIV-1 Isolates", pages 531-536. see entire article.
J. Virology, Vol. 64, No. 5, issued May 1990, A. ALDOVINI et al, "Mutations of RNA and Protein Sequences Involved in Human Immunodeficiency Virus Type 1 Packaging Result in Production of Noninfectious Virus", pages 1920-1926. see entire article.
See also references of EP 0554389A4
Claims:
WHAT IS CLAIMED IS:
1. A substantially pure preparation of a molecular clone capable of yielding after transfection into recipient cells active cultures of the Human Immunodeficiency Virus Type 1 (HIV-1) virus strain MN-ST1, having the identifying characteristics of ATCC 40889.
2. A substantially pure preparation of DNA containing the envelope and rev coding sequences of the (HIV-1) virus strain BA-L, having the identifying characteristics of ATCC 40890.
3. A DNA segment encoding an envelope (env) protein of MN-ST1.
4. The DNA segment according to claim 3 having the sequence given in Table III.
5. A DNA segment encoding an env protein of BA-L.
6. A DNA segment according to claim 5 having the sequence given in Table III.
7. A purified MN-ST1 env protein.
8. The protein according to claim 7 having the sequence given in Table II.
9. A purified BA-L protein.
10. The protein according to claim 9 having the sequence given in Table III.
11. A DNA construct comprising: i) the DNA segment according to claim 3; and ii) a vector.
12. The DNA construct according to claim 11 further comprising a DNA segment encoding a rev protein and a rev-responsive region.
13. A DNA construct comprising: i) the DNA segment according to claim 5; and ii) a vector.
14. The DNA construct according to claim 13 further comprising a DNA segment encoding a rev protein and a rev-responsive region.
15. A recombinantly produced MN-ST1 env protein.
16. A recombinantly produced BA-L env protein.
17. A host cell stably transformed with said recombinant DNA construct according to claim 11 or claim 13, in a manner allowing expression of said viral protein encoded in said recombinant DNA molecule.
18. A method of producing a recombinant HIV-1 virus strain MN-ST1 protein comprising culturing said host cells according to claim 17, in a manner allowing expression of said viral protein and isolating said viral protein.
19. A vaccine for mammals against HIV-1 infection comprising a noninfectious antigenic portion of said MN-ST1 virus strain according to claim 1, in an amount sufficient to induce immunization against said infection, and a pharmaceutically acceptable carrier.
20. A vaccine for mammals against HIV infection comprising a noninfectious antigenic portion of said BA-L virus strain according to claim 2 in an amount sufficient to induce immunization against said infection, and a pharmaceutically acceptable carrier.
21. The vaccine according to claim 19 or claim 20 which further comprises an adjuvant.
22. A vaccine for mammals against HIV-1 infection comprising at least 5 amino acids of a MN-ST1 virus strain env protein, in an amount sufficient to induce immunization against said infection, and a pharmaceutically acceptable carrier.
23. A vaccine for mammals against HIV-1 infection comprising at least 5 amino acids of a BA-L virus strain env protein, in an amount sufficient to induce immunization against said infection, and a pharmaceutically acceptable carrier.
24. The vaccine according to claim 22 or 23 wherein said protein is a recombinantly produced protein.
25. A method of testing candidate vaccines against HIV-1 infection comprising administering said vaccine and the MN-ST1 virus strain according to claim 1, to a test mammal and detecting the presence or absence of said infection.
26. A method of screening drugs for their ability to effect HIV-1 activity comprising contacting host cells according to claim 17, with said drug under conditions such that said activity of said virus can be effected.
27. A bioassay for the detection of HIV-1 in a biological sample comprising the steps of: i) coating a surface with at least 5 amino acids of an env protein from MN-ST1 or BA-L virus; ii) contacting said coated surface with said sample; and iii) detecting the presence or absence of a complex formed between said protein and antibodies specific therefor present in said sample.
Description:
MOLECULAR CLONES OF HIV-1 AND USES THEREOF

BACKGROUND OF THE INVENTION

HIV-1 has been identified as the etiologic agent of the acquired immunodeficiency syndrome (AIDS) (Barre-Sinoussi et al., Science 220, 868-871, 1983; Popovic et al., Science 224, 497-500, 1984; Gallo et al., Science 224, 500-503, 1984). Infected individuals generally develop antibodies to the virus within several months of exposure (Sarngadharan et al., Science 224, 506-508, 1984), which has made possible the development of immunologically based tests which can identify most blood samples from infected individuals. This is a great advantage in diagnosis, and is vital to maintaining the maximum possible safety of samples from blood banks.

An important aspect of HIV-1 is its genetic variability (Hahn et al., Proc. Natl. Acad. Sci. U.S.A. 82, 4813-4817, 1985). This is particularly evident in the gene for the outer envelope glycoprotein (Starcich et al., Cell 45, 637-648, 1986; Alizon et al., Cell 46, 63-74, 1986; Gurgo et al., Virology 164, 531-536, 1988). Since the outer envelope glycoprotein is on the surface of the virus particle and the infected cell, it is potentially one of the primary targets of the immune system, including the target of neutralizing antibodies and cytotoxic T cells. This variability may also lead to differences in the ability of antigens from different strains of HIV-1 to be recognized by antibodies from a given individual, as well as to differences in the ability of proteins from different strains of virus to elicit an immune response which would be protective against the mixture of virus strains that exists in the at-risk populations.

Several biologically active complete molecular clones of various strains of HIV-1 have been obtained and sequenced. These clones, however, seem to represent viral genotypes which are relatively atypical of United States HIV-1 isolates. In addition, several of the translational reading frames for non-structural viral proteins are not complete. Further, viruses derived from these clones do not grow in macrophages, in contrast to many HIV-1 field isolates and, perhaps because of this lack of ability to infect macrophages efficiently, these clones do not replicate well in chimpanzees. This latter ability is important for testing candidate vaccines in animal systems. In addition, the ability to infect macrophages is critical in evaluating the possible protective efficacy of an elicited immune response, since neutralization of infectivity on macrophages may differ from the better studied neutralization on T cells.

Neutralizing antibodies (Robert-Guroff et al., Nature 316, 72-74, 1985; Weiss et al., Nature 316, 69-72, 1985) have been demonstrated in infected individuals, as have cytotoxic T cell responses (Walker et al., Nature 328, 345-348, 1988). Although these do not appear to be protective, it is likely that if they were present prior to infection, they would prevent infection, especially by related strains of virus. This is supported by the finding that macaques can be protected by immunization with inactivated simian immunodeficiency virus (SIV) from infection with the homologous live virus (Murphy-Corb et al., Science 246, 1293-1297, 1989). Chimps also have been passively protected against challenge by live virus by prior administration of neutralizing antibodies to the same virus (Emini et al., J. Virol. 64, 3674-3678, 1989). One problem, however, is that at least some of the neutralizing antibodies studied depend on recognition of a variable region on the envelope (Matsushita et al., J. Virol. 62, 2107-2114, 1988; Rusche et al., Proc. Natl. Acad. Sci. U.S.A. 85, 3198-3202, 1988; Skinner et al., AIDS Res. Hum. Retroviruses 4, 187-197, 1988) called the V3 region (Starcich et al., Cell 45, 637-648, 1986).

An at least partial solution to the problem of viral heterogeneity is to identify prototypical HIV-1 strains, that is, those that are most similar by DNA sequence data or serologic reactivity to strains present in the population at risk. The inclusion of a limited number of such prototype strains in a polyvalent vaccine cocktail might then result in elicitation of an immune response protective against most naturally occurring viruses within a given population. Such a mixture should also provide the maximum possible sensitivity in diagnostic tests for antibodies in infected individuals.

Components of highly representative isolates of a geographical area provide the maximum possible sensitivity in diagnostic tests and vaccines. Production of viral proteins from molecular clones by recombinant DNA techniques is the preferred and safest means to provide such proteins. Molecular clones of prototype HIV-1 strains can serve as the material from which such recombinant proteins can be made. The use of recombinant DNA avoids any possibility of the presence of live virus and affords the opportunity of genetically modifying viral gene products.

The use of biologically active clones ensures that the gene products are functional and hence, maximizes their potential relevance.

Infectious clones, that is, those which after transfection into recipient cells produce complete virus, are desirable for several reasons. One reason is that the gene products are by definition functional; this maximizes their potential relevance to what is occurring in vivo. A second reason is that genetically altered complete virus is easy to obtain. Consequently, the biological consequences of variability can be easily assessed. For example, the effect of changes in the envelope gene on the ability of the virus to be neutralized by antibody can be easily addressed. Using this technique, a single point mutation in the envelope gene has been shown to confer resistance to neutralizing antibody (Reitz et al., Cell 54, 57-63, 1988). A third reason is that a clonal virus population provides the greatest possible definition for challenge virus in animals receiving candidate vaccines, especially those including components of the same molecularly cloned virus.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide vaccine components for an anti-HIV-1 vaccine which would represent a typical United States isolate of HIV-1. It is another object of the present invention to provide diagnostic tests for the detection of HIV-1.

Various other objects and advantages of the present invention will become apparent from the drawings and the following description of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGURE 1 shows the structure and restriction map of the lambda MN-PH1 clone.

FIGURE 2 shows the restriction map of the MN-PH1 envelope plasmid clone.

FIGURE 3 shows the restriction map and structure of the lambda MN-ST1 clone.

FIGURE 4 shows the structure of the lambda BA-L clone.

FIGURE 5 shows the restriction map of the clone BA-L1.

Detailed Disclosure of the Invention

The present invention relates to the HIV-1 virus strains, MN-ST1 and BA-L, which are more typical of the HIV-1 isolates found in the United States than previously known HIV-1 strains. Local isolates provide better material for vaccine and for the detection of the virus in biological samples, such as blood bank samples.

The present invention relates to DNA segments encoding the env protein of MN-ST1 or BA-L (the DNA sequence given in Figures 5 and 8 being two such examples) and to nucleotide sequences complementary to the segments referenced above as well as to other genes and nucleotide sequences contained in these clones. The present invention also relates to DNA segments encoding a unique portion of the MN-ST1 env protein or the BA-L env protein. (A "unique portion" consists of at least five (or six) amino acids or the corresponding at least 15 (or 18) nucleotides.)
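The nucleotide counts quoted for a "unique portion" follow from the triplet genetic code: each amino acid is encoded by three nucleotides, so five residues correspond to 15 nucleotides and six to 18. A minimal sketch of that arithmetic is given here; the function name is illustrative and is not part of the disclosure.

```python
CODON_LENGTH = 3  # nucleotides per amino acid in the triplet genetic code


def coding_length(num_amino_acids: int) -> int:
    """Return the number of nucleotides needed to encode a peptide of the given length."""
    if num_amino_acids < 0:
        raise ValueError("peptide length cannot be negative")
    return num_amino_acids * CODON_LENGTH


# The two minimum peptide lengths named in the text.
for residues in (5, 6):
    print(residues, "amino acids ->", coding_length(residues), "nucleotides")
```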

The invention further relates to the HIV-1 virus strains MN-ST1 and BA-L themselves. The HIV-1 virus strains of the present invention are biologically active and can easily be isolated by one skilled in the art using known methodologies.

The above-described DNA segments of the present invention can be placed in DNA constructs which are then used in the transformation of host cells for the generation of recombinantly produced viral proteins. DNA constructs of the present invention comprise a DNA segment encoding the env protein and the flanking region of MN-ST1 (or BA-L) or a portion thereof and a vector. The constructs can further comprise a second DNA segment encoding both a rev protein and a rev-responsive region of the env gene operably linked to the first DNA segment encoding the env protein. The rev protein facilitates efficient expression of the env protein in eucaryotic cells. Suitable vectors for use in the present invention include, but are not limited to, pSP72, lambda EMBL3 and SP65gpt.

Host cells to which the present invention relates are stably transformed with the above-described DNA constructs. The cells are transformed under conditions such that the viral protein encoded in the transforming construct is expressed. The host cell can be procaryotic (such as bacterial), lower eucaryotic (such as fungal, including yeast) or higher eucaryotic (such as mammalian). The host cells can be used to generate recombinantly produced MN-ST1 (or BA-L) env protein by culturing the cells in a manner allowing expression of the viral protein encoded in the construct. The recombinantly produced protein is easily isolated from the host cells using standard protein isolation protocols.

Since HIV-1 strains MN-ST1 and BA-L represent relatively typical United States genotypes, non-infectious MN-ST1 or BA-L proteins (for example, the env protein), peptides or unique portions of MN-ST1 or BA-L proteins (for example, a unique portion of the env protein), and even whole inactivated MN-ST1 or BA-L can be used as an immunogen in mammals, such as primates, to generate antibodies capable of neutralization and T cells capable of killing infected cells. The protein can be isolated from the virus or made recombinantly from a cloned envelope gene. Accordingly, the virus and viral proteins of the present invention are of value as either a vaccine or a component thereof, or an agent in immunotherapeutic treatment of individuals already infected with HIV-1.

As is customary for vaccines, a non-infectious antigenic portion of MN-ST1 or BA-L, for example, the env protein, can be delivered to a mammal in a pharmacologically acceptable carrier. The present invention relates to vaccines comprising non-infectious antigenic portions of either MN-ST1 or BA-L and vaccines comprising non-infectious antigenic portions of both MN-ST1 and BA-L. Vaccines of the present invention can include effective amounts of immunological adjuvants known to enhance an immune response. The viral protein or polypeptide is present in the vaccine in an amount sufficient to induce an immune response against the antigenic protein and thus to protect against HIV-1 infection. Protective antibodies are usually best elicited by a series of 2-3 doses given about 2 to 3 weeks apart. The series can be repeated when circulating antibody concentration in the patient drops.

Virus derived from the infectious HIV-1(MN) clone, MN-ST1, may also be used for reproducible challenge experiments in chimpanzees treated with candidate HIV-1 vaccines or in vitro with human antiserum from individuals treated with candidate vaccines. A candidate vaccine can be administered to a test mammal, such as a chimpanzee, prior to or simultaneously with the infectious MN-ST1 virus of the present invention. Effectiveness of the vaccine can be determined by detecting the presence or absence of HIV-1 infection in the test mammals. Side-by-side comparative tests can be run by further administering to a second set of test mammals the virus alone and comparing the number of infections which develop in the two sets of test mammals. Alternatively, candidate vaccines can be evaluated in humans by administering the vaccine to a patient and then testing the ability of the MN-ST1 virus to infect blood cells from the patient.
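The side-by-side comparison described above comes down to comparing infection counts in a vaccinated challenge group and a virus-only group. The sketch below shows one possible way to tabulate such a comparison. The class, the function, the counts, and the relative-reduction summary are all hypothetical and chosen for illustration; the disclosure itself only states that the numbers of infections in the two sets are compared.

```python
from dataclasses import dataclass


@dataclass
class ChallengeGroup:
    """One set of test mammals in a challenge experiment (hypothetical structure)."""
    label: str
    animals: int
    infected: int

    @property
    def infection_rate(self) -> float:
        if self.animals <= 0:
            raise ValueError("a group needs at least one animal")
        return self.infected / self.animals


def relative_reduction(vaccinated: ChallengeGroup, control: ChallengeGroup) -> float:
    """Fractional drop in infection rate relative to the virus-only group."""
    control_rate = control.infection_rate
    if control_rate == 0:
        raise ValueError("no infections in the control group; reduction is undefined")
    return 1 - vaccinated.infection_rate / control_rate


# Hypothetical counts, not data from the disclosure.
vaccinated = ChallengeGroup("vaccine + MN-ST1 virus", animals=4, infected=1)
control = ChallengeGroup("MN-ST1 virus alone", animals=4, infected=4)
print(vaccinated.infection_rate, control.infection_rate)
print(relative_reduction(vaccinated, control))
```

With groups as small as those typical of primate studies, a summary like this is descriptive only and would not by itself establish a statistically significant effect.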

The present invention also relates to the detection of HIV-1 virus in a biological sample. For detection of an HIV-1 infection, the presence of the virus, proteins encoded in the viral genome, or antibodies to HIV-1 is determined. Many types of tests, as one skilled in the art will recognize, can be used for detection. Such tests include, but are not limited to, ELISA and RIA.

In one bioassay of the present invention all, or a unique portion, of the env protein is coated on a surface and contacted with the biological sample. The presence of a resulting complex formed between the protein and antibodies specific therefor in the serum can be detected by any of the known methods commonly used in the art, such as, for example, fluorescent antibody spectroscopy or colorimetry.

The following non-limiting examples are given to further demonstrate the present invention without being deemed limitative thereof.

EXAMPLES

MN-PH1 Clone

The permuted circular unintegrated viral DNA representing the complete HIV-1(MN) genome was cloned by standard techniques (Sambrook et al., 1989, Molecular Cloning. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press) into the Eco RI site of lambda gtWES.lambda B DNA from total DNA of H9 cells producing HIV-1(MN). This clone is designated lambda MN-PH1, and its structure and restriction map are shown in Figure 1. The clone was subcloned into M13mp18 and M13mp19, and the DNA sequence of the entire clone, given in Figure 2, was obtained by the dideoxy chain termination method (Sanger et al., Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467, 1977). The amino acid sequence of the envelope protein (see Table I) was inferred from the DNA sequence. A restriction map of the cloned unintegrated viral DNA (see Figure 1) was also obtained from the DNA sequence of lambda MN-PH1 and used in conjunction with the inferred amino acid sequence of the viral proteins to subclone the envelope (env) gene into the commercially available plasmid pSP72 (Promega Biological Research Products, Madison, WI), as shown in Figure 2. This plasmid (pMN-PH1env) contains, in addition to the coding regions for the envelope proteins, the coding region for the rev protein (Feinberg et al., Cell 46, 807-817, 1986) and the portion of the env gene which contains the rev-responsive region (Dayton et al., J. Acquir. Immune. Defic. Syndr. 1, 441-452, 1988), since both are necessary for efficient expression of the envelope protein in eucaryotic cells. This plasmid thus contains all the elements required for production of envelope protein following placement into appropriate expression vectors and introduction into recipient cells, all by standard techniques known to molecular biologists.

MN-ST1 Clone

The infectious molecular clone, lambda MN-ST1, was obtained by cloning integrated provirus from DNA purified from peripheral blood lymphocytes infected with HIV-1(MN) and maintained in culture for a short time (one month). The integrated proviral DNA was partially digested with the restriction enzyme Sau3A under conditions which gave a maximum yield of DNA fragments of from 15-20 kilobases (kb). This was cloned into the compatible BamHI site of lambda EMBL3, as shown in Figure 3. Figure 3 also shows the restriction map of clone lambda MN-ST1. The DNA sequence of the entire clone, given in Table II, was obtained by the dideoxy chain termination method (Sanger et al., Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467, 1977). The amino acid sequence was predicted from the DNA sequence (see Table II). This clone can be transfected into recipient cells by standard techniques.
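The reason Sau3A fragments can be ligated into a BamHI site is that both enzymes leave the same four-base 5' overhang. The sketch below illustrates this under the assumption of the standard recognition sites (Sau3A cutting before GATC, BamHI cutting G^GATCC); these are general enzyme facts and are not stated in the disclosure. The function names are illustrative, and the test sequence is the first 60 nucleotides of Table I.

```python
# Recognition site and top-strand cut offset for each enzyme (assumed standard values).
ENZYMES = {
    "Sau3A": ("GATC", 0),    # cuts before the G: ^GATC
    "BamHI": ("GGATCC", 1),  # cuts after the first G: G^GATCC
}


def overhang(enzyme):
    """Return the single-stranded 5' overhang left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """Two enzymes give ligatable ends when their overhangs are identical."""
    return overhang(enzyme_a) == overhang(enzyme_b)


def find_sites(sequence, enzyme):
    """Return 0-based start positions of every recognition site, overlaps included."""
    site, _ = ENZYMES[enzyme]
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


# First 60 nucleotides of Table I, with the spacing removed.
FIRST_ROW = "TGGAAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACA"

print(overhang("Sau3A"), overhang("BamHI"), compatible("Sau3A", "BamHI"))
print(find_sites(FIRST_ROW, "Sau3A"))
```

Because the four-base site occurs frequently, a complete digest would give very short fragments; a partial digest, as described in the text, cuts only a subset of the sites and so can yield fragments in the 15-20 kb range.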
After transfection, the cloned proviral DNA is expressed into biologically active virus particles, which can be used as a source for virus stocks. The proviral DNA, whose restriction map is shown in Figure 2, was removed from the lambda phage vector by digestion with BamHI and inserted into a plasmid, SP65gpt (Feinberg et al., Cell 46, 807-817, 1986). This plasmid, pMN-ST1, contains an SV40 origin of replication. Consequently, transfection into COS-1 cells (Gluzman, Y. Cell 23, 175-182, 1981), which produce an SV40 gene product which interacts with the cognate origin of replication, results in a transient high plasmid copy number with a concomitant production of large amounts of replication-competent, infectious virus (Feinberg et al., Cell 46, 807-817, 1986). This provides a convenient source of genetically homogeneous virus, as well as a way to introduce desired mutations using standard methods. The envelope gene was excised from the lambda phage clone and cloned into a plasmid as described above for lambda MN-PH1. This clone (pMN-ST1env) is similar to pMN-PH1env, described above, except that it derives from a biologically active cloned provirus. Like pMN-PH1env, it can be placed in a suitable vector and host to produce the envelope protein of HIV-1(MN) by well known techniques.

BA-L Clone

A HindIII fragment of unintegrated viral DNA representing the HIV-1(BA-L) genome was cloned by standard techniques into lambda phage Charon 28 DNA from total DNA of peripheral blood macrophages infected with and producing HIV-1(BA-L). A positive clone was selected by hybridization using a radiolabelled probe for the HIV-1 envelope. This clone, designated lambda BA-L1, was found to contain the entire gene for the envelope protein. Its structure is given in Figure 4. The insert was transferred into a plasmid (pBluescript, Stratagene, La Jolla, CA) and the DNA sequence of the env gene was determined (see Table III). This clone is designated pBA-L1. The amino acid sequence of the envelope protein, shown in Table III, was inferred from the DNA sequence. A restriction map was also obtained from the DNA sequence of BA-L1 (shown in Figure 5) in order to determine the appropriate restriction enzyme sites for cloning the env gene into suitable expression vectors. An Eco RI-HindIII fragment of 0.4 kb and a 2.8 kb HindIII-XbaI fragment when cloned together constitute the entire env gene. This plasmid contains, in addition to the coding regions for the envelope proteins, the coding region for the rev protein and the portion of the env gene which contains the rev-responsive region. Both are necessary for efficient expression of the envelope protein in eucaryotic cells (Feinberg et al., Cell 46, 807-817, 1986; Dayton et al., J. Acquir. Immune. Defic. Syndr. 1, 441-452, 1988). This plasmid thus contains all the HIV-1 genetic elements required for production of envelope protein following placement into appropriate expression vectors and introduction into recipient cells, all by standard techniques well known in the art.

Statement of Deposit

The lambda MN-ST1 clone and the BA-L plasmid clone were deposited at the American Type Culture Collection (ATCC), 12301 Parklawn Drive, Rockville, Maryland 20852, U.S.A., on September 13, 1990, under the terms of the Budapest Treaty. The lambda MN-ST1 clone has been assigned the ATCC accession number ATCC 40889 and the BA-L plasmid clone has been assigned the ATCC accession number ATCC 40890.

* * * * * *

All publications mentioned hereinabove are hereby incorporated by reference.

While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be appreciated by one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention.
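The sequence listing in Table I is printed as groups of ten nucleotides, six groups to a row, each row followed by the position of its last nucleotide. The sketch below shows one way such rows might be parsed and how an amino acid sequence can be inferred from the DNA using the standard genetic code. The function names are illustrative and not part of the disclosure. The excerpt is two consecutive rows copied from Table I, and translation starts at the first ATG in that excerpt purely as an illustration.

```python
import re
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One printed row: six whitespace-separated nucleotide groups, then the end position.
ROW_PATTERN = re.compile(r"((?:[ACGT]+\s+){6})(\d+)")


def parse_rows(text):
    """Return (end_position, sequence) pairs for each complete row found in text."""
    return [
        (int(end), "".join(groups.split()))
        for groups, end in ROW_PATTERN.findall(text)
    ]


def translate(dna):
    """Translate from the first base, stopping at a stop codon or an incomplete codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two consecutive rows copied from Table I (ending at positions 840 and 900).
EXCERPT = """
GAGAGAGATG GGTGCGAGAG CGTCGGTATT AAGCGGGGGA GAATTAGATC GATGGGAAAA 840
CATTCGGTTA AGGCCAGGGG GAAAGAAAAA ATATAAATTA AAACATGTAG TATGGGCAAG 900
"""

rows = parse_rows(EXCERPT)
sequence = "".join(seq for _, seq in rows)
start = sequence.find("ATG")  # first ATG in the excerpt
for end, seq in rows:
    print(end, len(seq))      # a complete row should hold 60 nucleotides
print(translate(sequence[start:]))
```

Checking that every parsed row holds exactly 60 nucleotides is a simple way to flag rows in which a character was lost during transcription.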

TABLE I

TGGAAGGGCT AATTCACTCC CAACGAAGAC AAGATATCCT TGATCTGTGG ATCTACCACA 60
CACAAGGCTA CTTCCCTGAT TAGCAGAACT ACACACCAGG GCCAGGGATC AGATATCCAC 120
TGACCTTTGG ATGGTGCTAC AAGCTAGTAC CAGTTGAGCC AGAGAAGTTA GAAGAAGCCA 180
ACAAAGGAGA GAACACCAGC TTGTTACACC CTGTGAGCCT GCATGGAATG GATGACCCGG 240
AGAGAGAAGT GTTAGAGTGG AGGTTTGACA GCCGCCTAGC ATTTCATCAC ATGGCCCGAG 300
AGCTGCATCC GGAGTACTTC AAGAACTGCT GACATCGAGC TTGCTACAAG GGACTTTCCG 360
CTGGGGACTT TCCAGGGAGG CGTGGCCTGG GCGGGACTGG GGAGTGGCGA GCCCTCAGAT 420
CCTGCATATA AGCAGCTGCT TTTTGCCTGT ACTGGGTCTC TCTGGTTAGA CCAGATCTGA 480
GCCTGGGAGC TCTCTGGCTA ACTAGGGAAC CCACTGCTTA AGCCTCAATA AAGCTTGCCT 540
TGAGTGCTTC AAGTAGTGTG TGCCCGTCTG TTATGTGACT CTGGTAGCTA GAGATCCCTC 600
AGATCCTTTT AGGCAGTGTG GAAAATCTCT AGCAGTGGCG CCCGAACAGG GACTTGAAAG 660
CGAAAGAAAA ACCAGAGCTC TCTCGACGCA GGACTCGGCT TGCTGAAGCG CGCACGGCAA 720
GAGGCGAGGG GCGGCGACTG GTGAGTACGC CAAAAATTCT TGACTAGCGG AGGCTAGAAG 780
GAGAGAGATG GGTGCGAGAG CGTCGGTATT AAGCGGGGGA GAATTAGATC GATGGGAAAA 840
CATTCGGTTA AGGCCAGGGG GAAAGAAAAA ATATAAATTA AAACATGTAG TATGGGCAAG 900
CAGGGAGCTA GAACGATTCG CAGTCAATCC TGGCCTGTTA GAAACATCAG AAGGCTGTAG 960
ACAAATACTG GGACAGCTAC AACCATCCCT TCAGACAGGA TCAGAAGAAC TTAAATCATT 1020
ATATAATACA GTAGCAACCC TCTATTGTGT GCATCAAAAG ATAGAGATAA AAGACACCAA 1080
GGAAGCTTTA GAGAAAATAG AGGAAGAGCA AAACAAAAGT AAGAAAAAAG CACAGCAAGC 1140
AGCAGCTGAC ACAGGAAACA GAGGAAACAG CAGCCAAGTC AGCCAAAATT ACCCCATAGT 1200
GCAGAACATC GAGGGGCAAA TGGTACATCA GGCCATATCA CCTAGAACTT TAAATGCATG 1260
GGTAAAAGTA GTAGAAGAGA AGGCTTTCAG CCCAGAAGTA ATACCCATGT TTTCAGCATT 1320
ATCAGAAGGA GCCACCCCAC AAGATTTAAA CACCATGCTA AACACAGTGG GGGGACATCA 1380
AGCAGCCATG CAAATGTTAA AAGAGACCAT CAATGAGGAA GCTGCAGAAT GGGATAGATT 1440
GCATCCAGTG CATGCAGGGC CTATTACACC AGGCCAGATG AGAGAACCAA GGGGAAGTG 1500
CATAGCAGGA ACTACTAGTA CCCTTCAGGA ACAAATAGGA TGGATGACAA ATAATCCACC 1560
TATCCCAGTA GGAGAAATCT ATAAAAGATG GATAATCCTG GGATTAAATA AAATAGTAAG 1620
GATGTATAGC CCTTCCAGCA TTCTGGACAT AAGACAAGGA CCAAAGGAAC CCTTTAGAGA 1680
CTATGTAGAC CGGTTCTATA AAACTCTAAG AGCCGAGCAA GCTTCACAGG AGGTAAAAAA 1740
CCGGACGACA GAAACCTTGT TGGTCCAAAA TGCGAACCCA GATTGTAAGA CTATTTTAAA 1800
AGCATTGGGA CCAGCAGCTA CACTAGAAGA AATGATGACA GCATGTCAGG GAGTGGGAGG 1860
ACCTGGTCAT AAAGCAAGAG TTTTGGCGGA AGCGATGAGC CAAGTAACAA ATTCAGCTAC 1920

CATAATGATG CAGAGAGGCA ATTTTAGGAA TCAAAGAAAG ATTATCAAGT GCTTCAATTG 1980
TGGCAAAGAA GGGCACATAG CCAAAAATTG CAGGGCCCCT AGGAAAAGGG GCTGTTGGAA 2040
ATGTGGAAAG GAAGGACACC AAATGAAAGA TTGTACTGAG AGACAGGCTA ATTTTTTAGG 2100
GAAGATCTGG CCTTCCTGCA AGGGAAGGCG GAATTTTCCT CAGAGCAGAA CAGAGCCAAC 2160
AGCCCCACCA GAAGAGAGCT TCAGGTTTGG GGAAGAGACA ACAACTCCCT ATCAGAAGCA 2220
GGAGAAGAAG CAGGAGACGA TAGACAAGGA CCTGTATCCT TTAGCTTCCC TCAAATCACT 2280
CTTTGGCAAC GACCCATTGT CACAATAAAG ATAGGGGGGC AACTAAAGGA AGCTCTATTA 2340
GATACAGGAG CAGATGATAC AGTATTAGGA GAAATGAATT TGCCAAGAAG ATGGAAACCA 2400
AAAATGATAG GGGGAATTGG AGGTTTTATC AAAGTAAGAC AGTATGATCA GATAACCATA 2460
GGAATCTGTG GACATAAAGC TATAGGTACA GTATTAGTAG GACCTACACC TGTCAACATA 2520
ATTGGAAGAA ATCTGTTGAC TCAGCTTGGG TGCACTTTAA ATTTTCCCAT TAGTCCTATT 2580
GAAACTGTAC CAGTAAAATT AAAGCCAGGA ATGGATGGCC CAAAAGTTAA ACAATGGCCA 2640
TTGACAGAAG AAAAAATAAA AGCATTAATA GAAATTTGTA CAGAAATGGA AAAGGAAGGG 2700
AAAATTTCAA AAATTGGGCC TGAAAATCCA TACAATACTC CAGTATTTGC CATAAAGAAA 2760
AAAGACAGTA CTAAATGGAG AAAATTAGTA GATTTCAGAG AACTTAATAA GAAAACTCAA 2820
GACTTCTGGG AAGTTCAATT AGGAATACCA CATCCTGCAG GGTTAAAAAA GAAAAAATCA 2880
GTAACAGTAC TGGATGTGGG TGATGCATAT TTTTCAGTTC CCTTAGATAA AGACTTCAGG 2940
AAGTATACTG CATTTACCAT ACCTAGTATA AACAATGAAA CACCAGGGAT TAGATATCAG 3000
TACAATGTGC TTCCACAGGG ATGGAAAGGA TCACCAGCAA TATTCCAAAG TAGCATGACA 3060
AAAATCTTAG AGCCTTTTAG AAAACAAAAT CCAGACATAG TTATCTATCA ATACATGGAT 3120
GATTTGTATG TAGGATCTGA CTTAGAAATA GGGCAGCATA GAGCAAAAAT AGAGGAACTG 3180
AGACGACATC TGTTGAGGTG GGGATTTACC ACACCAGACA AAAAACATCA GAAAGAACCT 3240
CCATTCCTTT GGATGGGTTA TGAACTCCAT CCTGATAAAT GGACAGTACA GCCTATAGTG 3300
CTACCAGAAA AAGACAGCTG GACTGTCAAT GACATACAGA AGTTAGTGGG AAAATTGAAT 3360
TGGGCAAGTC AGATTTACGC AGGGATTAAA GTAAAGCAAT TATGTAAACT CCTTAGAGGA 3420
ACCAAAGCAC TAACAGAAGT AATACCACTA ACAGAAGAAG CAGAGCTAGA ACTGGCAGAA 3480
AACAGGGAAA TTCTAAAAGA ACCAGTACAT GGAGTGTATT ATGACCCATC AAAAGACTTA 3540
ATAGCAGAAG TACAGAAGCA GGGGCAAGGC CAATGGACAT ATCAAATTTA TCAAGAGCCA 3600
TTTAAAAATC TGAAAACAGG CAAATATGCA AGAATGAGGG GTGCCCACAC TAATGATGTA 3660
AAACAATTAA CAGAGGCAGT GCAAAAAATA GCCACAGAAA GCATAGTAAT ATGGGGAAAG 3720
ACTCCTAAAT TTAGACTACC CATACAAAAA GAAACATGGG AAACATGGTG GACAGAGTAT 3780
ACGTAAGCCA CCTGGATTCC TGAGTGGGAG GTTGTCAATA CCCCTCCCTT AGTGAAATTA 3840
TGGTACCAGT TAGAGAAAGA ACCCATAGTA GGTGCAGAAA CTTTCTATGT AGATGGGGCA 3900
GCTAACAGGG AGACTAAAAA AGGAAAAGCA GGATATGTTA CTAACAGAGG AAGACAAAAG 3960

GTTGTCTCCC TAACTGACAC AACAAATCAG AAGACTGAGT TACAAGCAAT TCATCTAGCT 4020 TTGCAAGATT CAGGGTTAGA AGTAAACATA GTAACAGACT CACAATATGC ATTAGGAATC 4080 ATTCAAGCAC AACCAGATAA AAGTGAATCA GAGTTAGTCA GTCAAATAAT AGAGCAGTTA 4140 ATAAAAAAGG AAAAGGTCTA TCTGGCATGG GTACCAGCAC ACAAAGGAAT TGGAGGAAAT 4200 GAACAAGTAG ATAAATTAGT CAGTGCTGGA ATCAGGAAAG TACTATTTTT AGATGGAATA 4260 GATAAGGCCC AAGAAGACCA TGAGAAATAT CACAGTAATT GGAGAGCAAT GGCTAGTGAC 4320 TTTAACCTAC CACCTATAGT AGCAAAAGAA ATAGTAGCCA GCTGTGATAA ATGTCAGCTA 4380 AAAGGAGAAG CCATGCATGG ACAAGTAGAC TGTAGTCCAG GAATATGGCA ACTAGATTGT 4440 ACACATTTAG AAGGAAAAGT TATCCTGGTA GCAGTTCATG TAGCCAGTGG ATACATAGAA 4500 GCAGAAGTTA TTCCAGCAGA GACAGGGCAG GAGACAGCAT ACTTTCTCTT AAAATTAGCA 4560 GGAAGATGGC CAGTAAAAAC AATACATACA GACAATGGCC CCAATTTCAC CAGTACTACG 4620 GTTAAGGCCG CCTGTTGGTG GACGGGAATC AAGCAGGAAT TTGGCATTCC CTACAATCCC 4680 CAAAGTCAAG GAGTAATAGA ATCTATGAAT AAAGAATTAA AGAAAATTAT AGGACAGGTA 4740 AGAGATCAGG CTGAACATCT TAAGAGAGCA GTACAAATGG CAGTATTCAT CCACAATTTT 4800 AAAAGAAAAG GGGGGATTGG GGGGTACAGT GCAGGGGAAA GAATAGTAGG CATAATAGCA 4860 ACAGACATAC AAACTAAAGA ACTACAAAAA CAAATTACAA AAATTCAAAA TTTTCGGGTT 4920 TATTACAGGG ACAGCAGAGA TCCACTTTGG AAAGGACCAG CAAAGCTTCT CTGGAAAGGT 4980 GAAGGGGCAG TAGTAATACA AGATAATAAT GACATAAAAG TAGTGCCAAG AAGAAAAGCA 5040 AAGGTCATTA GGGATTATGG AAAACAGACG GCAGGTGATG ATTGTGTGGC AAGCAGACAG 5100 GATGAGGATT AGAACATGGA AAAGTTTAGT AAAACACCAT ATGTATATTT CAAAGAAAGC 5160 TAAAGGACGG TTTTATAGAC ATCACTATGA AAGCACTCAT CCAAGAATAA GTTCAGAAGT 5220 ACACATCCCA CTAGGGGATG CTAGATTGGT AATAACAACA TATTGGGGTC TGCATACAGG 5280 AGAAAGAGAC TGGCATTTAG GTCAGGGAGT CTCCATAGAA TGGAGGAAAA AGAGATATAG 5340 CACACAAGTA GACCCTGACC TAGCAGACCA CCTAATTCAT CTGCATTACT TTGATTGTTT 5400 TTCAGACTCT GCCATAAGAA AGGCCATATT AGGACATAGA GTTAGTCCTA TTTGTGAATT 5460 TCAAGCAGGA CATAACAAGG TAGGACCTCT ACAGTACTTG GCACTAACAG CATTAATAAC 5520 ACCAAAAAAG ATAAAGCCAC CTTTGCCTAG TGTTAAGAAA CTGACAGAGG ATAGATGGAA 5580 CAAGCCCCAG AAGACCAAGG GCCACAGAGG GAGCCATACA ATCAATGGGC ACTAGAGCTT 5640 TTAGAGGAGC 
TTAAGAATGA AGCTGTTAGA CATTTTCCTA GGATATGGCT CCATGGCTTA 5700 GGGCAACATA TCTATGAAAC TTATGGGGAT ACTTGGGCAG GAGTGGAAGC CATAATAAGA 5760 ATTCTACAAC AACTGCTGTT TATTCATTTC AGAATTGGGT GTCGACATAG CAGAATAGGC 5820 ATTATTCGAC AGAGGAGAGC AAGAAATGGA GCCAGTAGAT CCTAGACTAG AGCCCTGGAA 5880 GCATCCAGGA AGTCAGCCTA AGACTGCTTG TACCACTTGC TATTGTAAAA AGTGTTGCTT 5940

TCATTGCCAA GTTTGTTTCA CAAAAAAAGC CTTAGGCATC TCCTATGGCA GGAAGAAGCG 6000

GAGACAGCGA CGAAGAGCTC CTGAAGACAG TCAGACTCAT CAAGTTTCTC TACCAAAGCA 6060

GTAAGTAGTA CATGTAATGC AACCTTTAGT AATAGCAGCA ATAGTAGCAT TAGTAGTAGC 6120

AGGAATAATA GCAATAGTTG TGTGATCCAT AGTATTCATA GAATATAGGA AAATAAGAAG 6180

ACAAAGAAAA ATAGACAGGT TAATTGATAG AATAAGCGAA AGAGCAGAAG ACAGTGGCA 6239

ATG AGA GTG AAG GGG ATC AGG AGG AAT TAT CAG CAC TGG TGG GGA TGG 6287 Met Arg Val Lys Gly Ile Arg Arg Asn Tyr Gln His Trp Trp Gly Trp 1 5 10 15

GGC ACG ATG CTC CTT GGG TTA TTA ATG ATC TGT AGT GCT ACA GAA AAA 6335 Gly Thr Met Leu Leu Gly Leu Leu Met Ile Cys Ser Ala Thr Glu Lys 20 25 30

TTG TGG GTC ACA GTC TAT TAT GGG GTA CCT GTG TGG AAA GAA GCA ACC 6383 Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr 35 40 45

ACC ACT CTA TTT TGT GCA TCA GAT GCT AAA GCA TAT GAT ACA GAG GTA 6431 Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val 50 55 60

CAT AAT GTT TGG GCC ACA CAA GCC TGT GTA CCC ACA GAC CCC AAC CCA 6479 His Asn Val Trp Ala Thr Gln Ala Cys Val Pro Thr Asp Pro Asn Pro 65 70 75 80

CAA GAA GTA GAA TTG GTA AAT GTG ACA GAA AAT TTT AAC ATG TGG AAA 6527 Gln Glu Val Glu Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp Lys 85 90 95

AAT AAC ATG GTA GAA CAG ATG CAT GAG GAT ATA ATC AGT TTA TGG GAT 6575 Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp 100 105 110

CAA AGC CTA AAG CCA TGT GTA AAA TTA ACC CCA CTC TGT GTT ACT TTA 6623 Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu 115 120 125

AAT TGC ACT GAT TTG AGG AAT ACT ACT AAT ACC AAT AAT AGT ACT GCT 6671 Asn Cys Thr Asp Leu Arg Asn Thr Thr Asn Thr Asn Asn Ser Thr Ala 130 135 140

AAT AAC AAT AGT AAT AGC GAG GGA ACA ATA AAG GGA GGA GAA ATG AAA 6719 Asn Asn Asn Ser Asn Ser Glu Gly Thr Ile Lys Gly Gly Glu Met Lys 145 150 155 160

AAC TGC TCT TTC AAT ATC ACC ACA AGC ATA AGA GAT AAG ATG CAG AAA 6767 Asn Cys Ser Phe Asn Ile Thr Thr Ser Ile Arg Asp Lys Met Gln Lys 165 170 175

GAA TAT GCA CTT CTT TAT AAA CTT GAT ATA GTA TCA ATA GAT AAT GAT 6815 Glu Tyr Ala Leu Leu Tyr Lys Leu Asp Ile Val Ser Ile Asp Asn Asp 180 185 190

AGT ACC AGC TAT AGG TTG ATA AGT TGT AAT ACC TCA GTC ATT ACA CAA 6863 Ser Thr Ser Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln 195 200 205

GCT TGT CCA AAG ATA TCC TTT GAG CCA ATT CCC ATA CAC TAT TGT GCC 6911 Ala Cys Pro Lys Ile Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala 210 215 220

CCG GCT GGT TTT GCG ATT CTA AAA TGT AAC GAT AAA AAG TTC AGT GGA 6959 Pro Ala Gly Phe Ala Ile Leu Lys Cys Asn Asp Lys Lys Phe Ser Gly 225 230 235 240

AAA GGA TCA TGT AAA AAT GTC AGC ACA GTA CAA TGT ACA CAT GGA ATT 7007 Lys Gly Ser Cys Lys Asn Val Ser Thr Val Gln Cys Thr His Gly Ile 245 250 255

AGG CCA GTA GTA TCA ACT CAA CTG CTG TTA AAT GGC AGT CTA GCA GAA 7055 Arg Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu 260 265 270

GAA GAG GTA GTA ATT AGA TCT GAG AAT TTC ACT GAT AAT GCT AAA ACC 7103 Glu Glu Val Val Ile Arg Ser Glu Asn Phe Thr Asp Asn Ala Lys Thr 275 280 285

ATC ATA GTA CAT CTG AAT GAA TCT GTA CAA ATT AAT TGT ACA AGA CCC 7151 Ile Ile Val His Leu Asn Glu Ser Val Gln Ile Asn Cys Thr Arg Pro 290 295 300

AAC TAC AAT AAA AGA AAA AGG ATA CAT ATA GGA CCA GGG AGA GCA TTT 7199 Asn Tyr Asn Lys Arg Lys Arg Ile His Ile Gly Pro Gly Arg Ala Phe 305 310 315 320

TAT ACA ACA AAA AAT ATA ATA GGA ACT ATA AGA CAA GCA CAT TGT AAC 7247 Tyr Thr Thr Lys Asn Ile Ile Gly Thr Ile Arg Gln Ala His Cys Asn 325 330 335

ATT AGT AGA GCA AAA TGG AAT GAC ACT TTA AGA CAG ATA GTT AGC AAA 7295 Ile Ser Arg Ala Lys Trp Asn Asp Thr Leu Arg Gln Ile Val Ser Lys 340 345 350

TTA AAA GAA CAA TTT AAG AAT AAA ACA ATA GTC TTT AAT CAA TCC TCA 7343 Leu Lys Glu Gln Phe Lys Asn Lys Thr Ile Val Phe Asn Gln Ser Ser 355 360 365

GGA GGG GAC CCA GAA ATT GTA ATG CAC AGT TTT AAT TGT GGA GGG GAA 7391 Gly Gly Asp Pro Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu 370 375 380

TTT TTC TAC TGT AAT ACA TCA CCA CTG TTT AAT AGT ACT TGG AAT GGT 7439 Phe Phe Tyr Cys Asn Thr Ser Pro Leu Phe Asn Ser Thr Trp Asn Gly 385 390 395 400

AAT AAT ACT TGG AAT AAT ACT ACA GGG TCA AAT AAC AAT ATC ACA CTT 7487 Asn Asn Thr Trp Asn Asn Thr Thr Gly Ser Asn Asn Asn Ile Thr Leu 405 410 415

CAA TGC AAA ATA AAA CAA ATT ATA AAC ATG TGG CAG GAA GTA GGA AAA 7535 Gln Cys Lys Ile Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly Lys 420 425 430

GCA ATG TAT GCC CCT CCC ATT GAA GGA CAA ATT AGA TGT TCA TCA AAT 7583 Ala Met Tyr Ala Pro Pro Ile Glu Gly Gln Ile Arg Cys Ser Ser Asn 435 440 445

ATT ACA GGG CTA CTA TTA ACA AGA GAT GGT GGT AAG GAC ACG GAC ACG 7631 Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Lys Asp Thr Asp Thr 450 455 460

AAC GAC ACC GAG ATC TTC AGA CCT GGA GGA GGA GAT ATG AGG GAC AAT 7679 Asn Asp Thr Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn 465 470 475 480

TGG AGA AGT GAA TTA TAT AAA TAT AAA GTA GTA ACA ATT GAA CCA TTA 7727 Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val Thr Ile Glu Pro Leu 485 490 495

GGA GTA GCA CCC ACC AAG GCA AAG AGA AGA GTG GTG CAG AGA GAA AAA 7775 Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys 500 505 510

AGA GCA GCG ATA GGA GCT CTG TTC CTT GGG TTC TTA GGA GCA GCA GGA 7823 Arg Ala Ala Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala Gly 515 520 525

AGC ACT ATG GGC GCA GCG TCA GTG ACG CTG ACG GTA CAG GCC AGA CTA 7871 Ser Thr Met Gly Ala Ala Ser Val Thr Leu Thr Val Gln Ala Arg Leu 530 535 540

TTA TTG TCT GGT ATA GTG CAA CAG CAG AAC AAT TTG CTG AGG GCC ATT 7919 Leu Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile 545 550 555 560

GAG GCG CAA CAG CAT ATG TTG CAA CTC ACA GTC TGG GGC ATC AAG CAG 7967 Glu Ala Gln Gln His Met Leu Gln Leu Thr Val Trp Gly Ile Lys Gln 565 570 575

CTC CAG GCA AGA GTC CTG GCT GTG GAA AGA TAC CTA AAG GAT CAA CAG 8015 Leu Gln Ala Arg Val Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln 580 585 590

CTC CTG GGG TTT TGG GGT TGC TCT GGA AAA CTC ATT TGC ACC ACT ACT 8063 Leu Leu Gly Phe Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Thr 595 600 605

GTG CCT TGG AAT GCT AGT TGG AGT AAT AAA TCT CTG GAT GAT ATT TGG 8111 Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Asp Asp Ile Trp 610 615 620

AAT AAC ATG ACC TGG ATG CAG TGG GAA AGA GAA ATT GAC AAT TAC ACA 8159 Asn Asn Met Thr Trp Met Gln Trp Glu Arg Glu Ile Asp Asn Tyr Thr 625 630 635 640

AGC TTA ATA TAC TCA TTA CTA GAA AAA TCG CAA ACC CAA CAA GAA AAG 8207 Ser Leu Ile Tyr Ser Leu Leu Glu Lys Ser Gln Thr Gln Gln Glu Lys 645 650 655

AAT GAA CAA GAA TTA TTG GAA TTG GAT AAA TGG GCA AGT TTG TGG AAT 8255 Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn 660 665 670

TGG TTT GAC ATA ACA AAT TGG CTG TGG TAT ATA AAA ATA TTC ATA ATG 8303 Trp Phe Asp Ile Thr Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met 675 680 685

ATA GTA GGA GGC TTG GTA GGT TTA AGA ATA GTT TTT GCT GTA CTT TCT 8351 Ile Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser 690 695 700

ATA GTG AAT AGA GTT AGG CAG GGA TAC TCA CCA TTG TCG TTG CAG ACC 8399 Ile Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Leu Gln Thr 705 710 715 720

CGC CCC CCA GTT CCG AGG GGA CCC GAC AGG CCC GAA GGA ATC GAA GAA 8447 Arg Pro Pro Val Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu 725 730 735

GAA GGT GGA GAG AGA GAC AGA GAC ACA TCC GGT CGA TTA GTG CAT GGA 8495 Glu Gly Gly Glu Arg Asp Arg Asp Thr Ser Gly Arg Leu Val His Gly 740 745 750

TTC TTA GCA ATT ATC TGG GTC GAC CTG CGG AGC CTG TTC CTC TTC AGC 8543 Phe Leu Ala Ile Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe Ser 755 760 765

TAC CAC CAC AGA GAC TTA CTC TTG ATT GCA GCG AGG ATT GTG GAA CTT 8591 Tyr His His Arg Asp Leu Leu Leu Ile Ala Ala Arg Ile Val Glu Leu 770 775 780

CTG GGA CGC AGG GGG TGG GAA GTC CTC AAA TAT TGG TGG AAT CTC CTA 8639 Leu Gly Arg Arg Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu Leu 785 790 795 800

CAG TAT TGG AGT CAG GAA CTA AAG AGT AGT GCT GTT AGC TTG CTT AAT 8687 Gln Tyr Trp Ser Gln Glu Leu Lys Ser Ser Ala Val Ser Leu Leu Asn 805 810 815

GCC ACA GCT ATA GCA GTA GCT GAG GGG ACA GAT AGG GTT ATA GAA GTA 8735 Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val 820 825 830

CTG CAA AGA GCT GGT AGA GCT ATT CTC CAC ATA CCT ACA AGA ATA AGA 8783 Leu Gln Arg Ala Gly Arg Ala Ile Leu His Ile Pro Thr Arg Ile Arg 835 840 845

CAG GGC TTG GAA AGG GCT TTG CTA TAAGATGGGT GGCAAATGGT CAAAACGTGT 8837 Gln Gly Leu Glu Arg Ala Leu Leu 850 855

GACTGGATGG CCTACTGTAA GGGAAAGAAT GAGACGAGCT GAACCAGCTG AGCTAGCAGC 8897

AGATGGGGTG GGAGCAGCAT CCCGAGACCT GGAAAAACAT GGAGCACTCA CAAGTAGCAA 8957

TACAGCAGCT ACCAATGCTG ATTGTGCCTG GCTAGAAGCA CAAGAGGAGG AGGAAGTGGG 9017

TTTTCCAGTC AAACCTCAGG TACCTTTAAG ACCAATGACT TACAAAGCAG CTTTAGATCT 9077

TAGCCACTTT TTAAAAGAAA AGGGGGGACT GGATGGGTTA ATTTACTCCC AAAAGAGACA 9137

AGACATCCTT GATCTGTGGG TCTACCACAC ACAAGGCTAC TTCCCTGATT GGCAGAACTA 9197

CACACCAGGG CCAGGGATCA GATATCCACT GACCTTTGGA TGGTGCTTCA AGCTAGTACC 9257

AGTTGAGCCA GAGAAGATAG AAGAGGCCAA TAAAGGAGAG AACAACTGCT TGTTACACCC 9317

TATGAGCCAG CATGGATGGA TGACCCGGAG AGAGAAGTGT TAGTGTGGAA GTCTGACAGC 9377

CACCTAGCAT TTCAGCATTA TGCCCGAGAG CTGCATCCGG AGTACTACAA GAACTGCTGA 9437

CATCGAGCTA TCTACAAGGG ACTTTCCGCT GGGGACTTTC CAGGGAGGTG TGGCCTGGGC 9497

GGGACCGGGG AGTGGCGAGC CCTCAGATCG TGCATATAAG CAGCTGCTTT CTGCCTGTAC 9557

TGGGTCTCTC TGGTTAGACC AGATCTGAGC CTGGGAGCTC TCTGGCTAAC TAGGGAACCC 9617

ACTGCTTAAG CCTCAATAAA GCTTGCCTTG AGTGCTTCAA GTAGTGTGTG CCCGTCTGTT 9677

ATGTGACTCT GGTAGCTAGA GATCCCTCAG ATCCTTTTAG GCAGTGTGGA AAATCTCTAG 9737

CA 9739

Met Arg Val Lys Gly Ile Arg Arg Asn Tyr Gln His Trp Trp Gly Trp 1 5 10 15

Gly Thr Met Leu Leu Gly Leu Leu Met Ile Cys Ser Ala Thr Glu Lys 20 25 30

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr 35 40 45

Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val 50 55 60

His Asn Val Trp Ala Thr Gln Ala Cys Val Pro Thr Asp Pro Asn Pro 65 70 75 80

Gln Glu Val Glu Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp Lys 85 90 95

Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp 100 105 110

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu 115 120 125

Asn Cys Thr Asp Leu Arg Asn Thr Thr Asn Thr Asn Asn Ser Thr Ala 130 135 140

Asn Asn Asn Ser Asn Ser Glu Gly Thr Ile Lys Gly Gly Glu Met Lys 145 150 155 160

Asn Cys Ser Phe Asn Ile Thr Thr Ser Ile Arg Asp Lys Met Gln Lys 165 170 175

Glu Tyr Ala Leu Leu Tyr Lys Leu Asp Ile Val Ser Ile Asp Asn Asp 180 185 190

Ser Thr Ser Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln 195 200 205

Ala Cys Pro Lys Ile Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala 210 215 220

Pro Ala Gly Phe Ala Ile Leu Lys Cys Asn Asp Lys Lys Phe Ser Gly 225 230 235 240

Lys Gly Ser Cys Lys Asn Val Ser Thr Val Gln Cys Thr His Gly Ile 245 250 255

Arg Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu 260 265 270

Glu Glu Val Val Ile Arg Ser Glu Asn Phe Thr Asp Asn Ala Lys Thr 275 280 285

Ile Ile Val His Leu Asn Glu Ser Val Gln Ile Asn Cys Thr Arg Pro 290 295 300

Asn Tyr Asn Lys Arg Lys Arg Ile His Ile Gly Pro Gly Arg Ala Phe 305 310 315 320

Tyr Thr Thr Lys Asn Ile Ile Gly Thr Ile Arg Gln Ala His Cys Asn 325 330 335

Ile Ser Arg Ala Lys Trp Asn Asp Thr Leu Arg Gln Ile Val Ser Lys 340 345 350

Leu Lys Glu Gln Phe Lys Asn Lys Thr Ile Val Phe Asn Gln Ser Ser 355 360 365

Gly Gly Asp Pro Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu 370 375 380

Phe Phe Tyr Cys Asn Thr Ser Pro Leu Phe Asn Ser Thr Trp Asn Gly 385 390 395 400

Asn Asn Thr Trp Asn Asn Thr Thr Gly Ser Asn Asn Asn Ile Thr Leu 405 410 415

Gln Cys Lys Ile Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly Lys 420 425 430

Ala Met Tyr Ala Pro Pro Ile Glu Gly Gln Ile Arg Cys Ser Ser Asn 435 440 445

Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Lys Asp Thr Asp Thr 450 455 460

Asn Asp Thr Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn 465 470 475 480

Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val Thr Ile Glu Pro Leu 485 490 495

Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys 500 505 510

Arg Ala Ala Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala Gly 515 520 525

Ser Thr Met Gly Ala Ala Ser Val Thr Leu Thr Val Gln Ala Arg Leu 530 535 540

Leu Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile 545 550 555 560

Glu Ala Gln Gln His Met Leu Gln Leu Thr Val Trp Gly Ile Lys Gln 565 570 575

Leu Gln Ala Arg Val Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln 580 585 590

Leu Leu Gly Phe Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Thr 595 600 605

Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Asp Asp Ile Trp 610 615 620

Asn Asn Met Thr Trp Met Gln Trp Glu Arg Glu Ile Asp Asn Tyr Thr 625 630 635 640

Ser Leu Ile Tyr Ser Leu Leu Glu Lys Ser Gln Thr Gln Gln Glu Lys 645 650 655

Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn 660 665 670

Trp Phe Asp Ile Thr Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met 675 680 685

Ile Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser 690 695 700

Ile Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Leu Gln Thr 705 710 715 720

Arg Pro Pro Val Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu 725 730 735

Glu Gly Gly Glu Arg Asp Arg Asp Thr Ser Gly Arg Leu Val His Gly 740 745 750

Phe Leu Ala Ile Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe Ser 755 760 765

Tyr His His Arg Asp Leu Leu Leu Ile Ala Ala Arg Ile Val Glu Leu 770 775 780

Leu Gly Arg Arg Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu Leu 785 790 795 800

Gln Tyr Trp Ser Gln Glu Leu Lys Ser Ser Ala Val Ser Leu Leu Asn 805 810 815

Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val 820 825 830

Leu Gln Arg Ala Gly Arg Ala Ile Leu His Ile Pro Thr Arg Ile Arg 835 840 845

Gln Gly Leu Glu Arg Ala Leu Leu 850 855

TABLE II

TGGATGGGTT AATTTACTCC CAAAGAGACA AGACATCCTT GATCTGTGGG TCTACCACAC 60 ACAAGGCTAC TTCCCTGATT GGCAGAACTA CACACCAGGG CCAGGGATCA GATATCCACT 120 GACCTTTGGA TGGTGCTTCA AGCTAGTACC AGTTGAGCCA GAGAAGATAG AAGAGGCCAA 180 TAAAGGAGAG AACAACTGCT TGTTACACCC TATGAGCCAG CATGGGATGG ATGACCCGGA 240 GAGAGAAGTG TTAGTGTGGA AGTCTGACAG CCACCTAGCA TTTCAGCATT ATGCCCGAGA 300 GCTGCATCCG GAGTACTACA AGAACTGCTG ACATCGAGCT ATCTACAAGG GACTTTCCGC 360 TGGGGACTTT CCAGGGAGGT GTGGCCTGGG CGGGACCGGG GAGTGGCGAG CCCTCAGATG 420 CTGCATATAA GCAGCTGCTT TCTGCCTGTA CTGGGTCTCT CTGGTTAGAC CAGATCTGAG 480 CCTGGGAGCT CTCTGGCTAA CTAGGGAACC CACTGCTTAA GCCTCAATAA AGCTTGCCTT 540 GAGTGCTTCA AGTAGTGTGT GCCCGTCTGT TATGTGACTC TGGTAGCTAG AGATCCCTCA 600 GATCCTTTTA GGCAGTGTGG AAAATCTCTA GCAGTGGCGC CCGAACAGGG ACTTGAAAGC 660 GAAAGAGAAA CCAGAGGAGC TCTCTCGACG CAGGACTCGG CTTGCTGAAG CGCGCACGGC 720 AAGAGGCGAG GGGCGGCGAC TGGTGAGTAC GCCAAAATTC TTGACTAGCG GAGGCTAGAA 780 GGAGAGAGAT GGGTGCGAGA GCGTCGGTAT TAAGCGGGGG AGAATTAGAT CGATGGGAAA 840 AAATTCGGTT AAGGCCAGGG GGAAAGAAAA AATATAAATT AAAACATGTA GTATGGGCAA 900 GCAGGGAGCT AGAACGATTC GCAGTCAATC CTGGCCTGTT AGAAACATCA GAAGGCTGTA 960 GACAAATACT GGGACAGCTA CAACCATCCC TTCAGACAGG ATCAGAAGAA CTTAAATCAT 1020 TATATAATAC AGTAGCAACC CTCTATTGTG TGCATCAAAA GATAGAGATA AAAGACACCA 1080 AGGAAGCTTT AGAGAAAATA GAGGAAGAGC AAAACAAAAG TAAGAAAAAA GCACAGCAAG 1140 CAGTAGCTGA CACAGGAAAC AGAGGAAACA GCAGCCAAGT CAGCCAAAAT TACCCCATAG 1200 TGCAGAACAT CCAGGGGCAA ATGGTACATC AGGCCATATC ACCTAGAACT TTAAATGCAT 1260 GGGTAAAAGT AGTAGAAGAG AAGGCTTTCA GCCCAGAAGT AATACCCATG TTTTCAGCAT 1320 TATCAGAAGG AGCCACCCCA CAAGATTTAA ACACCATGCT AAACACAGTG GGGGGACATC 1380 AAGCAGCCAT GCAAATGTTA AAAGAGACCA TCAATGAGGA AGCTGCAGAA TGGGATAGAT 1440 TGCATCCAGT GCATGCAGGG CCTATTGCAC CAGGCCAGAT GAGAGAACCA AGGGGAAGTG 1500 ACATAGCAGG AACTACTAGT ACCCTTCAGG AACAAATAGG ATGGATGACA AATAATCCAC 1560 CTATCCCAGT AGGAGAAATC TATAAAAGAT GGATAATCCT GGGATTAAAT AAAATAGTAA 1620 GGATGTATAG CCCTTCCAGC ATTCTGGACA TAAGACAAGG ACCAAAGGAA CCCTTTAGAG 1680 ACTATGTAGA CCGGTTCTAT 
AAAACTCTAA GAGCCGAGCA AGCTTCACAG GAGGTAAAAA 1740 ATTGGATGAC AGAAACCTTG TTGGTCCAAA ATGCGAACCC AGATTGTAAG ACTATTTTAA 1800 AAGCATTGGG ACCAGCAGCT ACACTAGAAG AAATGATGAC AGCATGTCAG GGAGTGGGAG 1860 GACCTGGTCA TAAAGCAAGA GTTTTGGCGG AAGCGATGAG CCAAGTAACA AATTCAGCTA 1920

CCATAATGAT GCAGAGAGGC AATTTTAGGA ATCAAAGAAA GATTATCAAG TGCTTCAATT 1980 GTGGCAAAGA AGGGCACATA GCCAAAAATT GCAGGGCCCC TAGGAAAAGG GGCTGTTGGA 2040 AATGTGGAAA GGAAGGACAC CAAATGAAAG ATTGTACTGA GAGACAGGCT AATTTTTTAG 2100 GGAAGATCTG GCCTTCCTGC AAGGGAAGGC AGGGAATTTT CCTCAGAGCA GAACAGAGCC 2160 AACAGCCCCA CCAGAAGAGA GCTTCAGGTT TGGGGAAGAG ACAACAACTC CCTATCAGAA 2220 GCAGGAGAAG AAGCAGGAGA CGATAGACAA GGACCTGTAT CCTTTAGCTT CCCTCAAATC 2280 ACTCTTTGGC AACGACCCAT TGTCACAATA AAGATAGGGG GGCAACTAAA GGAAGCTCTA 2340 TTAGATACAG GAGCAGATGA TACAGTATTA GAAGAAATGA ATTTGCCAGG AAGATGGAAA 2400 CCAAAAATGA TAGGGGGAAT TGGAGGTTTT ATCAAAGTAA GACAGTATGA TCAGATAACC 2460 ATAGAAATCT GTGGACATAA AGCTATAGGT ACAGTATTAG TAGGACCTAC ACCTGTCAAC 2520 ATAATTGGAA GAAATCTGTT GACTCAGCTT GGGTGCACTT TAAATTTTCC CATTAGTCCT 2580 ATTGAAACTG TACCAGTAAA ATTAAAGCCA GGAATGGATG GCCCAAAAGT TAAACAATGG 2640 CCATTGACAG AAGAAAAAAT AAAAGCATTA ATAGAAATTT GTACAGAAAT GGAAAAGGAA 2700 GGGAAAATTT CAAAAATTGG GCCTGAAAAT CCATACAATA CTCCAGTATT TGCCATAAAG 2760 AAAAAAGACA GTACTAAATG GAGAAAATTA GTAGATTTCA GAGAACTTAA TAAGAAAACT 2820 CAAGACTTCT GGGAAGTTCA ATTAGGAATA CCACATCCTG CAGGGTTAAA AAAGAAAAAA 2880 TCAGTAACAG TACTGGATGT GGGTGATGCA TATTTTTCAG TTCCCTTAGA TAAAGACTTC 2940 AGGAAGTATA CTGCATTTAC CATACCTAGT ATAAACAATG AAACACCAGG GATTAGATAT 3000 CAGTACAATG TGCTTCCACA GGGATGGAAA GGATCACCAG CAATATTCCA AAGTAGCATG 3060 ACAAAAATCT TAGAGCCTTT TAGAAAACAA AATCCAGACA TAGTTATCTA TCAATACATG 3120 GATGATTTGT ATGTAGGATC TGACTTAGAA ATAGGGCAGC ATAGAGCAAA AATAGAGGAA 3180 CTGAGACGAC ATCTGTTGAG GTGGGGATTT ACCACACCAG ACAAAAAACA TCAGAAAGAA 3240 CCTCCATTCC TTTGGATGGG TTATGAACTC CATCCTGATA AATGGACAGT ACAGCCTATA 3300 GTGCTGCCAG AAAAAGACAG CTGGACTGTC AATGACATAC AGAAGTTAGT GGGAAAATTG 3360 AATTGGGCAA GTCAAATTTA CGCAGGGATT AAAGTAAAGC AATTATGTAA ACTCCTTAGA 3420 GGAACCAAAG CACTAACAGA AGTAATACCA CTAACAGAAG AAGCAGAGCT AGAACTGGCA 3480 GAAAACAGGG AAATTCTAAA AGAACCAGTA CATGGAGTGT ATTATGACCC ATCAAAAGAC 3540 TTAATAGCAG AAGTACAGAA GCAGGGGCAA GGCCAATGGA CATATCAAAT TTATCAAGAG 3600 CCATTTAAAA 
ATCTGAAAAC AGGCAAATAT GCAAGAATGA GGGGTGCCCA CACTAATGAT 3660 GTAAAACAAT TAACAGAGGC AGTGCAAAAA ATAGCCACAG AAAGCATAGT AATATGGGGA 3720 AAGACTCCTA AATTTAGACT ACCCATACAA AAAGAAACAT GGGAAACATG GTGGACAGAG 3780 TATTGGCAAG CCACCTGGAT TCCTGAGTGG GAGTTTGTCA ATACCCCTCC CTTAGTGAAA 3840 TTATGGTACC AGTTAGAGAA AGAACCCATA GTAGGAGCAG AAACTTTCTA TGTAGATGGG 3900 GCAGCTAACA GGGAGACTAA AAAAGGAAAA GCAGGATATG TTACTAACAG AGGAAGACAA 3960

AAGGTTGTCT CCCTAACTGA CACAACAAAT CAGAAGACTG AGTTACAAGC AATTCATCTA 4020 GCTTTGCAAG ATTCAGGGTT AGAAGTAAAC ATAGTAACAG ACTCACAATA TGCATTAGGA 4080 ATCATTCAAG CACAACCAGA TAAAAGTGAA TCAGAGTTAG TCAGTCAAAT AATAGAGCAG 4140 TTAATAAAAA AGGAAAAGGT CTATCTGGCA TGGGTACCAG CACACAAAGG AATTGGAGGA 4200 AATGAACAAG TAGATAAATT AGTCAGTGCT GGAATCAGGA AAGTACTATT TTTAGATGGA 4260 ATAGATAAGG CCCAAGAAGA CCATGAGAAA TATCACAGTA ATTGGAGAGC AATGGCTAGT 4320 GACTTTAACC TACCACCTAT AGTAGCAAAA GAAATAGTAG CCAGCTGTGA TAAATGTCAG 4380 CTAAAAGGAG AAGCCATGCA TGGACAAGTA GACTGTAGTC CAGGAATATG GCAACTAGAT 4440 TGTACACATT TAGAAGGAAA AGTTATCCTG GTAGCAGTTC ATGTAGCCAG TGGATACATA 4500 GAAGCAGAAG TTATTCCAGC AGAGACAGGG CAGGAGACAG CATACTTTCT CTTAAAATTA 4560 GCAGGAAGAT GGCCAGTAAA AACAATACAT ACAGACAATG GCCCCAATTT CACCAGTACT 4620 ACGGTTAAGG CCGCCTGTTG GTGGGCGGGG ATCAAGCAGG AATTTGGCAT TCCCTACAAT 4680 CCCCAAAGTC AAGGAGTAAT AGAATCTATG AATAAAGAAT TAAAGAAAAT TATAGGACAG 4740 GTAAGAGATC AGGCTGAACA TCTTAAGACA GCAGTACAAA TGGCAGTATT CATCCACAAT 4800 TTTAAAAGAA AAGGGGGGAT TGGGGGGTAC AGTGCAGGGG AAAGAATAGT AGACATAATA 4860 GCAACAGACA TACAAACTAA AGAACTACAA AAACAAATTA CAAAAATTCA AAATTTTCGG 4920 GTTTATTACA GGGACAGCAG AGATCCACTT TGGAAAGGAC CAGCAAAGCT TCTCTGGAAA 4980 GGTGAAGGGG CAGTAGTAAT ACAAGATAAT AGTGACATAA AAGTAGTGCC AAGAAGAAAA 5040 GCAAAGATCA TTAGGGATTA TGGAAAACAG ATGGCAGGTG ATGATTGTGT GGCAAGTAGA 5100 CAGGATGAGG ATTAGAACAT GGAAAAGTTT AGTAAAACAC CATATGTATA TTTCAAAGAA 5160 AGCTAAAGGA TGGTTTTATA GACATCACTA TGAAAGCACT CATCCAAGAA TAAGTTCAGA 5220 AGTACACATC CCACTAGGGG ATGCTAGATT GGTAATAACA ACATATTGGG GTCTGCATAC 5280 AGGAGAAAGA GACTGGCATT TAGGTCAGGG AGTCTCCATA GAATGGAGGA AAAAGAGATA 5340 TAGCACACAA GTAGACCCTG ACCTAGCAGA CCACCTAATT CATCTGCATT ACTTTGATTG 5400 TTTTTCAGAC TCTGCCATAA GAAAGGCCAT ATTAGGACAT AGAGTTAGTC CTATTTGTGA 5460 ATTTCAAGCA GGACATAACA AGGTAGGATC TCTACAGTAC TTGGCACTAA CAGCATTAAT 5520 AACACCAAAA AAGATAAAGC CACCTTTGCC TAGTGTTAAG AAACTGACAG AGGATAGATG 5580 GAACAAGCCC CAGAAGACCA AGGGCCACAG AGGGAGCCAT ACAATCAATG GGCATTAGAG 5640 CTTTTAGAGG 
AGCTTAAGAA TGAAGCTGTT AGACATTTTC CTAGGATATG GCTCCATGGC 5700 TTAGGGCAAC ATATCTATGA AACTTATGGG GATACTTGGG CAGGAGTGGA AGCCATAATA 5760 AGAATTCTAC AACAACTGCT GTTTATTCAT TTCAGAATTG GGTGTCGACA TAGCAGAATA 5820 GGCATTATTC GACAGAGGAG AGCAAGAAAT GGAGCCAGTA GATCCTAGAC TAGAGCCCTG 5880 GAAGCATCCA GGAAGTCAGC CTAAGACTGC TTGTACCACT TGCTATTGTA AAAAGTGTTG 5940

CTTTCATTGC CAAGTTTGTT TCACAAAAAA AGCCTTAGGC ATCTCCTATG GCAGGAAGAA 6000

GCGGAGACAG CGACGAAGAG CTCCTGAAGA CAGTCAGACT CATCAAGTTT CTCTACCAAA 6060

GCAGTAAGTA GTACATGTAA TGCAACCTTT AGTAATAGCA GCAATAGTAG CATTAGTAGT 6120

AGCAGGAATA ATAGCAATAG TTGTGTGATC CATAGTATTC ATAGAATATA GGAAAATAAG 6180

AAGACAAAGA AAAATAGACA GGGTAATTGA CAGAATAAGC GAAAGAGCAG AAGACAGTGG 6240

CA ATG AGA GTG AAG GGG ATC AGG AGG AAT TAT CAG CAC TGG TGG GGA 6287 Met Arg Val Lys Gly Ile Arg Arg Asn Tyr Gln His Trp Trp Gly 1 5 10 15

TGG GGC ACG ATG CTC CTT GGG TTA TTA ATG ATC TGT AGT GCT ACA GAA 6335 Trp Gly Thr Met Leu Leu Gly Leu Leu Met Ile Cys Ser Ala Thr Glu 20 25 30

AAA TTG TGG GTC ACA GTC TAT TAT GGG GTA CCT GTG TGG AAA GAA GCA 6383 Lys Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala 35 40 45

ACC ACC ACT CTA TTT TGT GCA TCA GAT GCT AAA GCA TAT GAT ACA GAG 6431 Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu 50 55 60

GTA CAT AAT GTT TGG GCC ACA CAT GCC TGT GTA CCC ACA GAC CCC AAC 6479 Val His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn 65 70 75

CCA CAA GAA GTA GAA TTG GTA AAT GTG ACA GAA AAT TTT AAC ATG TGG 6527 Pro Gln Glu Val Glu Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp 80 85 90 95

AAA AAT AAC ATG GTA GAA CAG ATG CAT GAG GAT ATA ATC AGT TTA TGG 6575 Lys Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp 100 105 110

GAT CAA AGC CTA AAG CCA TGT GTA AAA TTA ACC CCA CTC TGT GTT ACT 6623 Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr 115 120 125

TTA AAT TGC ACT GAT TTG AGG AAT ACT ACT AAT ACC AAT AAT AGT ACT 6671 Leu Asn Cys Thr Asp Leu Arg Asn Thr Thr Asn Thr Asn Asn Ser Thr 130 135 140

GCT AAT AAC AAT AGT AAT AGC GAG GGA ACA ATA AAG GGA GGA GAA ATG 6719 Ala Asn Asn Asn Ser Asn Ser Glu Gly Thr Ile Lys Gly Gly Glu Met 145 150 155

AAA AAC TGC TCT TTC AAT ATC ACC ACA AGC ATA AGA GAT AAG ATG CAG 6767 Lys Asn Cys Ser Phe Asn Ile Thr Thr Ser Ile Arg Asp Lys Met Gln 160 165 170 175

AAA GAA TAT GCA CTT CTT TAT AAA CTT GAT ATA GTA TCA ATA AAT AAT 6815 Lys Glu Tyr Ala Leu Leu Tyr Lys Leu Asp Ile Val Ser Ile Asn Asn 180 185 190

GAT AGT ACC AGC TAT AGG TTG ATA AGT TGT AAT ACC TCA GTC ATT ACA 6863 Asp Ser Thr Ser Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr 195 200 205

CAA GCT TGT CCA AAG ATA TCC TTT GAG CCA ATT CCC ATA CAC TAT TGT 6911 Gln Ala Cys Pro Lys Ile Ser Phe Glu Pro Ile Pro Ile His Tyr Cys 210 215 220

GCC CCG GCT GGT TTT GCG ATT CTA AAG TGT AAC GAT AAA AAG TTC AGT 6959 Ala Pro Ala Gly Phe Ala Ile Leu Lys Cys Asn Asp Lys Lys Phe Ser 225 230 235

GGA AAA GGA TCA TGT AAA AAT GTC AGC ACA GTA CAA TGT ACA CAT GGA 7007 Gly Lys Gly Ser Cys Lys Asn Val Ser Thr Val Gln Cys Thr His Gly 240 245 250 255

ATT AGG CCA GTA GTA TCA ACT CAA CTG CTG TTA AAT GGC AGT CTA GCA 7055 Ile Arg Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala 260 265 270

GAA GAA GAG GTA GTA ATT AGA TCT GAG AAT TTC AAT GAT AAT GCT AAA 7103 Glu Glu Glu Val Val Ile Arg Ser Glu Asn Phe Asn Asp Asn Ala Lys 275 280 285

ACC ATC ATA GTA CAT CTG AAT GAA TCT GTA CAA ATT AAT TGT ACA AGA 7151 Thr Ile Ile Val His Leu Asn Glu Ser Val Gln Ile Asn Cys Thr Arg 290 295 300

CCC AAC TAC AAT AAA AGA AAA AGG ATA CAT ATA GGA CCA GGG AGA GCA 7199 Pro Asn Tyr Asn Lys Arg Lys Arg Ile His Ile Gly Pro Gly Arg Ala 305 310 315

TTT TAT ACA ACA AAA AAT ATA ATA GGA ACT ATA AGA CAA GCA CAT TGT 7247 Phe Tyr Thr Thr Lys Asn Ile Ile Gly Thr Ile Arg Gln Ala His Cys 320 325 330 335

AAC ATT AGT AGA GCA AAA TGG AAT GAC ACT TTA AGA CAG ATA GTT AGC 7295 Asn Ile Ser Arg Ala Lys Trp Asn Asp Thr Leu Arg Gln Ile Val Ser 340 345 350

AAA TTA AAA GAA CAA TTT AAG AAT AAA ACA ATA GTC TTT AAT CAA TCC 7343 Lys Leu Lys Glu Gln Phe Lys Asn Lys Thr Ile Val Phe Asn Gln Ser 355 360 365

TCA GGA GGG GAC CCA GAA ATT GTA ATG CAC AGT TTT AAT TGT GGA GGG 7391 Ser Gly Gly Asp Pro Glu Ile Val Met His Ser Phe Asn Cys Gly Gly 370 375 380

GAA TTT TTC TAC TGT AAT ACA TCA CCA CTG TTT AAT AGT ACT TGG AAT 7439 Glu Phe Phe Tyr Cys Asn Thr Ser Pro Leu Phe Asn Ser Thr Trp Asn 385 390 395

GGT AAT AAT ACT TGG AAT AAT ACT ACA GGG TCA AAT AAC AAT ATC ACA 7487 Gly Asn Asn Thr Trp Asn Asn Thr Thr Gly Ser Asn Asn Asn Ile Thr 400 405 410 415

CTT CAA TGC AAA ATA AAA CAA ATT ATA AAC ATG TGG CAG GAA GTA GGA 7535 Leu Gln Cys Lys Ile Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly 420 425 430

AAA GCA ATA TAT GCC CCT CCC ATT GAA GGA CAA ATT AGA TGT TCA TCA 7583 Lys Ala Ile Tyr Ala Pro Pro Ile Glu Gly Gln Ile Arg Cys Ser Ser 435 440 445

AAT ATT ACA GGG CTA CTA TTA ACA AGA GAT GGT GGT AAG GAC ACG GAC 7631 Asn Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Lys Asp Thr Asp 450 455 460

ACG AAC GAC ACC GAG ATC TTC AGA CCT GGA GGA GGA GAT ATG AGG GAC 7679 Thr Asn Asp Thr Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp 465 470 475

AAT TGG AGA AGT GAA TTA TAT AAA TAT AAA GTA GTA ACA ATT GAA CCA 7727 Asn Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val Thr Ile Glu Pro 480 485 490 495

TTA GGA GTA GCA CCC ACC AAG GCA AAG AGA AGA GTG GTG CAG AGA GAA 7775 Leu Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu 500 505 510

AAA AGA GCA GCG ATA GGA GCT CTG TTC CTT GGG TTC TTA GGA GCA GCA 7823 Lys Arg Ala Ala Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala 515 520 525

GGA AGC ACT ATG GGC GCA GCG TCA GTG ACG CTG ACG GTA CAG GCC AGA 7871 Gly Ser Thr Met Gly Ala Ala Ser Val Thr Leu Thr Val Gln Ala Arg 530 535 540

CTA TTA TTG TCT GGT ATA GTG CAA CAG CAG AAC AAT TTG CTG AGG GCC 7919 Leu Leu Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala 545 550 555

ATT GAG GCG CAA CAG CAT ATG TTG CAA CTC ACA GTC TGG GGC ATC AAG 7967 Ile Glu Ala Gln Gln His Met Leu Gln Leu Thr Val Trp Gly Ile Lys 560 565 570 575

CAG CTC CAG GCA AGA ATC CTG GCT GTG GAA AGA TAC CTA AAG GAT CAA 8015 Gln Leu Gln Ala Arg Ile Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln 580 585 590

CAG CTC CTG GGG ATT TGG GGT TGC TCT GGA AAA CTC ATT TGC ACC ACT 8063 Gln Leu Leu Gly Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr 595 600 605

ACT GTG CCT TGG AAT GCT AGT TGG AGT AAT AAA TCT CTG GAT GAT ATT 8111 Thr Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Asp Asp Ile 610 615 620

TGG AAT AAC ATG ACC TGG ATG CAG TGG GAA AGA GAA ATT GAC AAT TAC 8159 Trp Asn Asn Met Thr Trp Met Gln Trp Glu Arg Glu Ile Asp Asn Tyr 625 630 635

ACA AGC TTA ATA TAC TCA TTA CTA GAA AAA TCG CAA ACC CAA CAA GAA 8207 Thr Ser Leu Ile Tyr Ser Leu Leu Glu Lys Ser Gln Thr Gln Gln Glu 640 645 650 655

ATG AAT GAA CAA GAA TTA TTG GAA TTG GAT AAA TGG GCA AGT TTG TGG 8255 Met Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp 660 665 670

AAT TGG TTT GAC ATA ACA AAT TGG CTG TGG TAT ATA AAA ATA TTC ATA 8303 Asn Trp Phe Asp Ile Thr Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile 675 680 685

ATG ATA GTA GGA GGC TTG GTA GGT TTA AGA ATA GTT TTT GCT GTA CTT 8351 Met Ile Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu 690 695 700

TCT ATA GTG AAT AGA GTT AGG CAG GGA TAC TCA CCA TTG TCG TTG CAG 8399 Ser Ile Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Leu Gln 705 710 715

ACC CGC CCC CCA GTT CCG AGG GGA CCC GAC AGG CCC GAA GGA ATC GAA 8447 Thr Arg Pro Pro Val Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu 720 725 730 735

GAA GAA GGT GGA GAG AGA GAC AGA GAC ACA TCC GGT CGA TTA GTG CAT 8495 Glu Glu Gly Gly Glu Arg Asp Arg Asp Thr Ser Gly Arg Leu Val His 740 745 750

GGA TTC TTA GCA ATT ATC TGG GTC GAC CTG CGG AGC CTG TTC CTC TTC 8543 Gly Phe Leu Ala Ile Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe 755 760 765

AGC TAC CAC CAC TTG AGA GAC TTA CTC TTG ATT GCA GCG AGG ATT GTG 8591 Ser Tyr His His Leu Arg Asp Leu Leu Leu Ile Ala Ala Arg Ile Val 770 775 780

GAA CTT CTG GGA CGC AGG GGG TGG GAA GTC CTC AAA TAT TGG TGG AAT 8639 Glu Leu Leu Gly Arg Arg Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn 785 790 795

CTC CTA CAG TAT TGG AGT CAG GAA CTA AAG AGT AGT GCT GTT AGC TTG 8687 Leu Leu Gln Tyr Trp Ser Gln Glu Leu Lys Ser Ser Ala Val Ser Leu 800 805 810 815

CTT AAT GCC ACA GAT ATA GCA GTA GCT GAG GGG ACA GAT AGG GTT ATA 8735 Leu Asn Ala Thr Asp Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile 820 825 830

GAA GTA CTG CAA AGA GCT GGT AGA GCT ATT CTC CAC ATA CCT ACA AGA 8783 Glu Val Leu Gln Arg Ala Gly Arg Ala Ile Leu His Ile Pro Thr Arg 835 840 845

ATA AGA CAG GGC TTG GAA AGG GCT TTG CTA TAAGATGGGT GGCAAATGGT 8833 Ile Arg Gln Gly Leu Glu Arg Ala Leu Leu 850 855

CAAAACGTGT GACTGGATGG CCTACTGTAA GGGAAAAAAT GAGACGAGCT GAACCAGCTG 8893

AGCCAGCAGC AGATGGGGTG GGAGCAGCAT CCCGAGACCT GGAAAAACAT GGAGCACTCA 8953

CAAGTAGCAA TACAGCAGCT ACCAATGCTG ATTGTGCCTG GCTAGAAGCA CAAGAGGAGG 9013

AGGAAGTGGG TTTTCCAGTC AGACCTCAGG TACCTTTAAG ACCAATGACT TACAAAGCAG 9073

CTTTAGATCT TAGCCACTTT TTAAAAGAAA AGGGGGGACT GGATGGGTTA ATTTACTCCC 9133

AAAAGAGACA AGACATCCTT GATCTGTGGG TCTACCACAC ACAAGGCTAC TTCCCTGATT 9193

GGCAGAACTA CACACCAGGG CCAGGGATCA GATATCCACT GACCTTTGGA TGGTGCTTCA 9253

AGCTAGTACC AGTTGAGCCA GAGAAGATAG AAGAGGCCAA TAAAGGAGAG AACAACTGCT 9313

TGTTACACCC TATGAGCCAG CATGGGATGG ATGACCCGGA GAGAGAAGTG TTAGTGTGGA 9373

AGTCTGACAG CCACCTAGCA TTTCAGCATT ATGCCCGAGA GCTGCATCCG GAGTACTACA 9433

AGAACTGCTG ACATCGAGCT ATCTACAAGG GACTTTCCGC TGGGGACTTT CCAGGGAGGT 9493

GTGGCCTGGG CGGGACCGGG GAGTGGCGAG CCCTCAGATG CTGCATATAA GCAGCTGCTT 9553

TCTGCCTGTA CTGGGTCTCT CTGGTTAGAC CAGATCTGAG CCTGGGAGCT CTCTGGCTAA 9613

CTAGGGAACC CACTGCTTAA GCCTCAATAA AGCTTGCCTT GAGTGCTTCA AGTAGTGTGT 9673

GCCCGTCTGT TATGTGACTC TGGTAGCTAG AGATCCCTCA GATCCTTTTA GGCAGTGTGG 9733

AAAATCTCTA GCA 9746

Met Arg Val Lys Gly Ile Arg Arg Asn Tyr Gln His Trp Trp Gly Trp 1 5 10 15

Gly Thr Met Leu Leu Gly Leu Leu Met Ile Cys Ser Ala Thr Glu Lys 20 25 30

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr 35 40 45

Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val 50 55 60

His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro 65 70 75 80

Gln Glu Val Glu Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp Lys 85 90 95

Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp 100 105 110

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu 115 120 125

Asn Cys Thr Asp Leu Arg Asn Thr Thr Asn Thr Asn Asn Ser Thr Ala 130 135 140

Asn Asn Asn Ser Asn Ser Glu Gly Thr Ile Lys Gly Gly Glu Met Lys 145 150 155 160

Asn Cys Ser Phe Asn Ile Thr Thr Ser Ile Arg Asp Lys Met Gln Lys 165 170 175

Glu Tyr Ala Leu Leu Tyr Lys Leu Asp Ile Val Ser Ile Asn Asn Asp 180 185 190

Ser Thr Ser Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln 195 200 205

Ala Cys Pro Lys Ile Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala 210 215 220

Pro Ala Gly Phe Ala Ile Leu Lys Cys Asn Asp Lys Lys Phe Ser Gly 225 230 235 240

Lys Gly Ser Cys Lys Asn Val Ser Thr Val Gln Cys Thr His Gly Ile 245 250 255

Arg Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu 260 265 270

Glu Glu Val Val Ile Arg Ser Glu Asn Phe Asn Asp Asn Ala Lys Thr 275 280 285

Ile Ile Val His Leu Asn Glu Ser Val Gln Ile Asn Cys Thr Arg Pro 290 295 300

Asn Tyr Asn Lys Arg Lys Arg Ile His Ile Gly Pro Gly Arg Ala Phe 305 310 315 320

Tyr Thr Thr Lys Asn Ile Ile Gly Thr Ile Arg Gln Ala His Cys Asn 325 330 335

Ile Ser Arg Ala Lys Trp Asn Asp Thr Leu Arg Gln Ile Val Ser Lys 340 345 350

Leu Lys Glu Gln Phe Lys Asn Lys Thr Ile Val Phe Asn Gln Ser Ser 355 360 365

Gly Gly Asp Pro Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu 370 375 380

Phe Phe Tyr Cys Asn Thr Ser Pro Leu Phe Asn Ser Thr Trp Asn Gly 385 390 395 400

Asn Asn Thr Trp Asn Asn Thr Thr Gly Ser Asn Asn Asn Ile Thr Leu 405 410 415

Gln Cys Lys Ile Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly Lys 420 425 430

Ala Ile Tyr Ala Pro Pro Ile Glu Gly Gln Ile Arg Cys Ser Ser Asn 435 440 445

Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Lys Asp Thr Asp Thr 450 455 460

Asn Asp Thr Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn 465 470 475 480

Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val Thr Ile Glu Pro Leu 485 490 495

Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys 500 505 510

Arg Ala Ala Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala Gly 515 520 525

Ser Thr Met Gly Ala Ala Ser Val Thr Leu Thr Val Gln Ala Arg Leu 530 535 540

Leu Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile 545 550 555 560

Glu Ala Gln Gln His Met Leu Gln Leu Thr Val Trp Gly Ile Lys Gln 565 570 575

Leu Gln Ala Arg Ile Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln 580 585 590

Leu Leu Gly Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Thr 595 600 605

Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Asp Asp Ile Trp 610 615 620

Asn Asn Met Thr Trp Met Gln Trp Glu Arg Glu Ile Asp Asn Tyr Thr 625 630 635 640

Ser Leu Ile Tyr Ser Leu Leu Glu Lys Ser Gln Thr Gln Gln Glu Met 645 650 655

Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn 660 665 670

Trp Phe Asp Ile Thr Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met 675 680 685

Ile Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser 690 695 700

Ile Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Leu Gln Thr 705 710 715 720

Arg Pro Pro Val Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu 725 730 735

Glu Gly Gly Glu Arg Asp Arg Asp Thr Ser Gly Arg Leu Val His Gly 740 745 750

Phe Leu Ala Ile Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe Ser 755 760 765

Tyr His His Leu Arg Asp Leu Leu Leu Ile Ala Ala Arg Ile Val Glu 770 775 780

Leu Leu Gly Arg Arg Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu 785 790 795 800

Leu Gln Tyr Trp Ser Gln Glu Leu Lys Ser Ser Ala Val Ser Leu Leu 805 810 815

Asn Ala Thr Asp Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu 820 825 830

Val Leu Gln Arg Ala Gly Arg Ala Ile Leu His Ile Pro Thr Arg Ile 835 840 845

Arg Gln Gly Leu Glu Arg Ala Leu Leu 850 855

TABLE III

GATCAAGGGC CACAGAGGGA GCCACACAAT GAATGGACAC TAGAGCTTTT AGAGGAGCTT 60

AAGAGTGAAG CTGTTAGACA CTTTCCTAGG ATATGGCTTC ATGGCTTAGG GCAACATATC 120

TATGAAACTT ATGGGGATAC TTGGGCAGGA GTGGAAGCCA TAATAAGAAT TCTGCAACAA 180

CTGCTGTTTA TCCATTTCAG GATTGGGTGC CAACATAGCA GAATAGGTAT TATTCAACAG 240

AGGAGAGCAA GAAATGGAGC CAGTAGATCC TAAACTAGAG CCCTGGAAGC ATCCAGGAAG 300

TCAGCCTAAG ACTGCTTGTA CCACTTGCTA TTGTAAAAAG TGTTGCTTTC ATTGCCAAGT 360

TTGCTTCATA ACAAAAGGCT TAGGCATCTC CTATGGCAGG AAGAAGCGGA GACAGCGACG 420

AAGAGCTCCT CAAGACAGTG AGACTCATCA AGTTTCTCTA TCAAAGCAGT AAGTAGTACA 480

TGTAATGCAA GCTTTACAAA TATCAGCTAT AGTAGGATTA GTAGTAGCAG CAATAATAGC 540

AATAGTTGTG TGGACCATAG TATTCATAGA ATATAGGAAA ATATTAAGGC AAAGAAAAAT 600

AGACAGGTTA ATTGATAGAA TAACAGAAAG AGCAGAAGAC AGTGGCA ATG AGA GTG 656

Met Arg Val 1

ACG GAG ATC AGG AAG AGT TAT CAG CAC TGG TGG AGA TGG GGC ATC ATG 704 Thr Glu Ile Arg Lys Ser Tyr Gln His Trp Trp Arg Trp Gly Ile Met 5 10 15

CTC CTT GGG ATA TTA ATG ATC TGT AAT GCT GAA GAA AAA TTG TGG GTC 752 Leu Leu Gly Ile Leu Met Ile Cys Asn Ala Glu Glu Lys Leu Trp Val 20 25 30 35

ACA GTC TAT TAT GGG GTA CCT GTG TGG AAA GAA GCA ACC ACC ACT CTA 800 Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr Thr Thr Leu 40 45 50

TTT TGT GCA TCA GAT CGT AAA GCA TAT GAT ACA GAG GTA CAT AAT GTT 848 Phe Cys Ala Ser Asp Arg Lys Ala Tyr Asp Thr Glu Val His Asn Val 55 60 65

TGG GCC ACA CAT GCC TGT GTA CCC ACA GAC CCC AAC CCA CAA GAA GTA 896 Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro Gln Glu Val 70 75 80

GAA TTG AAA AAT GTG ACA GAA AAT TTT AAC ATG TGG AAA AAT AAC ATG 944 Glu Leu Lys Asn Val Thr Glu Asn Phe Asn Met Trp Lys Asn Asn Met 85 90 95

GTA GAA CAA ATG CAT GAG GAT ATA ATC AGT TTA TGG GAT CAA AGC CTA 992 Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp Gln Ser Leu 100 105 110 115

AAG CCA TGT GTA AAA TTA ACC CCA CTC TGT GTT ACT TTA AAT TGC ACT 1040 Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu Asn Cys Thr 120 125 130

GAT TTG AGG AAT GCT ACT AAT GGG AAT GAC ACT AAT ACC ACT AGT AGT 1088 Asp Leu Arg Asn Ala Thr Asn Gly Asn Asp Thr Asn Thr Thr Ser Ser 135 140 145

AGC AGG GGA ATG GTG GGG GGA GGA GAA ATG AAA AAT TGC TCT TTC AAT 1136 Ser Arg Gly Met Val Gly Gly Gly Glu Met Lys Asn Cys Ser Phe Asn 150 155 160

ATC ACC ACA AAC ATA AGA GGT AAG GTG CAG AAA GAA TAT GCA CTT TTT 1184 Ile Thr Thr Asn Ile Arg Gly Lys Val Gln Lys Glu Tyr Ala Leu Phe 165 170 175

TAT AAA CTT GAT ATA GCA CCA ATA GAT AAT AAT AGT AAT AAT AGA TAT 1232 Tyr Lys Leu Asp Ile Ala Pro Ile Asp Asn Asn Ser Asn Asn Arg Tyr 180 185 190 195

AGG TTG ATA AGT TGT AAC ACC TCA GTC ATT ACA CAG GCC TGT CCA AAG 1280 Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys 200 205 210

GTA TCC TTT GAG CCA ATT CCC ATA CAT TAT TGT GCC CCG GCT GGT TTT 1328 Val Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Phe 215 220 225

GCG ATT CTA AAG TGT AAA GAT AAG AAG TTC AAT GGA AAA GGA CCA TGT 1376 Ala Ile Leu Lys Cys Lys Asp Lys Lys Phe Asn Gly Lys Gly Pro Cys 230 235 240

ACA AAT GTC AGC ACA GTA CAA TGT ACA CAT GGA ATT AGG CCA GTA GTA 1424 Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val 245 250 255

TCA ACT CAA CTG CTG TTA AAT GGC AGT CTA GCA GAA GAA GAG GTA GTA 1472 Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val 260 265 270 275

ATT AGA TCC GCC AAT TTC GCG GAC AAT GCT AAA GTC ATA ATA GTA CAG 1520 Ile Arg Ser Ala Asn Phe Ala Asp Asn Ala Lys Val Ile Ile Val Gln 280 285 290

CTG AAT GAA TCT GTA GAA ATT AAT TGT ACA AGA CCC AAC AAC AAT ACA 1568 Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr 295 300 305

AGA AAA AGT ATA CAT ATA GGA CCA GGC AGA GCA TTT TAT ACA ACA GGA 1616 Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly 310 315 320

GAA ATA ATA GGA GAT ATA AGA CAA GCA CAT TGT AAC CTT AGT AGA GCA 1664 Glu Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Leu Ser Arg Ala 325 330 335

AAA TGG AAT GAC ACT TTA AAT AAG ATA GTT ATA AAA TTA AGA GAA CAA 1712 Lys Trp Asn Asp Thr Leu Asn Lys Ile Val Ile Lys Leu Arg Glu Gln 340 345 350 355

TTT GGG AAT AAA ACA ATA GTC TTT AAG CAC TCC TCA GGA GGG GAC CCA 1760 Phe Gly Asn Lys Thr Ile Val Phe Lys His Ser Ser Gly Gly Asp Pro 360 365 370

GAA ATT GTG ACG CAC AGT TTT AAT TGT GGA GGG GAA TTT TTC TAC TGT 1808 Glu Ile Val Thr His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys 375 380 385

AAT TCA ACA CAA CTG TTT AAT AGT ACT TGG AAT GTT ACT GAA GAG TCA 1856 Asn Ser Thr Gln Leu Phe Asn Ser Thr Trp Asn Val Thr Glu Glu Ser 390 395 400

AAT AAC ACT GTA GAA AAT AAC ACA ATC ACA CTC CCA TGC AGA ATA AAA 1904 Asn Asn Thr Val Glu Asn Asn Thr Ile Thr Leu Pro Cys Arg Ile Lys 405 410 415

CAA ATT ATA AAC ATG TGG CAG GAA GTA GGA AGA GCA ATG TAT GCC CCT 1952 Gln Ile Ile Asn Met Trp Gln Glu Val Gly Arg Ala Met Tyr Ala Pro 420 425 430 435

CCC ATC AGA GGA CAA ATT AGA TGT TCA TCA AAT ATT ACA GGG CTG CTA 2000 Pro Ile Arg Gly Gln Ile Arg Cys Ser Ser Asn Ile Thr Gly Leu Leu 440 445 450

TTA ACA AGA GAT GGT GGT CCT GAG GAC AAC AAG ACC GAG GTC TTC AGA 2048 Leu Thr Arg Asp Gly Gly Pro Glu Asp Asn Lys Thr Glu Val Phe Arg 455 460 465

CCT GGA GGA GGA GAT ATG AGG GAT AAT TGG AGA AGT GAA TTA TAT AAA 2096 Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr Lys 470 475 480

TAT AAA GTA GTA AAA ATT GAA CCA TTA GGA GTA GCA CCC ACC AAG GCA 2144 Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Val Ala Pro Thr Lys Ala 485 490 495

AAG AGA AGA GTG GTG CAG AGA GAA AAA AGA GCA GTG GGA ATA GGA GCT 2192 Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala Val Gly Ile Gly Ala 500 505 510 515

GTG TTC CTT GGG TTC TTG GGA GCA GCA GGA AGC ACT ATG GGC GCA GCG 2240 Val Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala Ala 520 525 530

GCA ATG ACG CTG ACG GTA CAG GCC AGA CTA TTA TTG TCT GGT ATA GTG 2288 Ala Met Thr Leu Thr Val Gln Ala Arg Leu Leu Leu Ser Gly Ile Val 535 540 545

CAA CAG CAG AAC AAT CTG CTG AGG GCT ATT GAG GCG CAA CAG CAT CTG 2336 Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His Leu 550 555 560

TTG CAA CTC ACA GTC TGG GGC ATC AAG CAG CTC CAG GCA AGA GTC CTG 2384 Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu Gln Ala Arg Val Leu 565 570 575

GCT GTG GAA AGA TAC CTA AGG GAT CAA CAG CTC CTG GGG ATT TGG GGT 2432 Ala Val Glu Arg Tyr Leu Arg Asp Gln Gln Leu Leu Gly Ile Trp Gly 580 585 590 595

TGC TCT GGA AAA CTC ATC TGC ACC ACT GCT GTG CCT TGG AAT GCT AGT 2480 Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ala Ser 600 605 610

TGG AGT AAT AAA TCT CTG AAT AAG ATT TGG GAT AAC ATG ACC TGG ATA 2528 Trp Ser Asn Lys Ser Leu Asn Lys Ile Trp Asp Asn Met Thr Trp Ile 615 620 625

GAG TGG GAC AGA GAA ATT AAC AAT TAC ACA AGC ATA ATA TAC AGC TTA 2576 Glu Trp Asp Arg Glu Ile Asn Asn Tyr Thr Ser Ile Ile Tyr Ser Leu 630 635 640

ATT GAA GAA TCG CAG AAC CAA CAA GAA AAG AAT GAA CAA GAA TTA TTA 2624 Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn Glu Gln Glu Leu Leu 645 650 655

GAA TTA GAT AAA TGG GCA AGT TTG TGG AAT TGG TTT GAC ATA ACA AAA 2672 Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp Phe Asp Ile Thr Lys 660 665 670 675

TGG CTG TGG TAT ATA AAA ATA TTC ATA ATG ATA GTA GGA GGC TTG ATA 2720 Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu Ile 680 685 690

GGT TTA AGA ATA GTT TTT TCT GTA CTT TCT ATA GTG AAT AGA GTT AGG 2768 Gly Leu Arg Ile Val Phe Ser Val Leu Ser Ile Val Asn Arg Val Arg 695 700 705

CAG GGA TAC TCA CCA TTA TCG TTT CAG ACC CAC CTC CCA TCC TCG AGG 2816 Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr His Leu Pro Ser Ser Arg 710 715 720

GGA CCC GAC AGG CCC GGA GGA ATC GAA GAA GAA GGT GGA GAG AGA GAC 2864 Gly Pro Asp Arg Pro Gly Gly Ile Glu Glu Glu Gly Gly Glu Arg Asp 725 730 735

AGA GAC AGA TCC GGT CCA TTA GTG AAC GGA TTC TTG GCG CTT ATC TGG 2912 Arg Asp Arg Ser Gly Pro Leu Val Asn Gly Phe Leu Ala Leu Ile Trp 740 745 750 755

GTC GAT CTG CGG AGC CTG TTC CTC TTC AGC TAC CAC CGC TTG AGA GAC 2960 Val Asp Leu Arg Ser Leu Phe Leu Phe Ser Tyr His Arg Leu Arg Asp 760 765 770

TTA CTC TTG ATT GTG ATG AGG ATT GTG GAA CTT CTG GGA CTA GCA GGG 3008 Leu Leu Leu Ile Val Met Arg Ile Val Glu Leu Leu Gly Leu Ala Gly 775 780 785

GGG TGG GAA GTC CTC AAA TAT TGG TGG AAT CTC CTA CAG TAT TGG AGT 3056 Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu Leu Gln Tyr Trp Ser 790 795 800

CAG GAA CTA AAG AAT AGT GCT GTT AGC TTG CTC AAT GCC ACA GCT GTA 3104 Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn Ala Thr Ala Val 805 810 815

GCA GTA GCT GAA GGG ACA GAT AGG GTT ATA GAA GTA TTA CAG AGA GCT 3152 Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val Leu Gln Arg Ala 820 825 830 835

GTT AGA GCT ATT CTC CAC ATA CCT AGA AGA ATA AGA CAG GGC TTG GAA 3200 Val Arg Ala Ile Leu His Ile Pro Arg Arg Ile Arg Gln Gly Leu Glu 840 845 850

AGG GCT TTG CTA TAAGATGGGT GGCAAGTGGT CAAAAAGTAG TATAGTCGTA 3252 Arg Ala Leu Leu 855

TGGCCTGCTG TAAGGAAAAG AATGAGAAGA ACTGAGCCAG CAGCAGATGG AGTAGGAGCA 3312

GTATCTAGAG ACCTGGAAAA ACATGGAGCA ATCACAAGTA GCAATACAGC AGCTAACAAT 3372

GCTGATTGTG CCTGGCTAGA AGCACAAGAG GATGAAGAAG TGGGTTTTCC AGTCAGACCT 3432

CAGGTACCTT TAAGACCAAT GACTCGCAGT GCAGCTATAG ATCTTAGCCA CTTTTTTAAG 3492

AAAAAGGGGG GACTGGAAGG GCTAATTCAC TCCCAAAAAA GACAAGATAT CCTTGATTTG 3552

TGGGTCTACC ACACACAAGG CTACTTCCCT GATTGGCAGA ACTACACACC AGGGCCAGGG 3612

ACCAGATTTC CACTGACCTT TGGATGGTGC TTCAAGCTAG TACCAGTTGA GCCAGAGAAG 3672

GTAGAAGAGG CCAATGAAGG AGAGAACAAC TGCTTGTCAC ACCCTATGAG CCTGCATGGG 3732

ATGGATGACC CGGAGAAAGA AGTGTTAGCA TGGAAGTTTG ACAGCAGCCT AGCATTCCAT 3792

CACGTGGCCC GAGAA 3807

Met Arg Val Thr Glu Ile Arg Lys Ser Tyr Gln His Trp Trp Arg Trp 1 5 10 15

Gly Ile Met Leu Leu Gly Ile Leu Met Ile Cys Asn Ala Glu Glu Lys 20 25 30

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr 35 40 45

Thr Thr Leu Phe Cys Ala Ser Asp Arg Lys Ala Tyr Asp Thr Glu Val 50 55 60

His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro 65 70 75 80

Gln Glu Val Glu Leu Lys Asn Val Thr Glu Asn Phe Asn Met Trp Lys 85 90 95

Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp 100 105 110

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu 115 120 125

Asn Cys Thr Asp Leu Arg Asn Ala Thr Asn Gly Asn Asp Thr Asn Thr 130 135 140

Thr Ser Ser Ser Arg Gly Met Val Gly Gly Gly Glu Met Lys Asn Cys 145 150 155 160

Ser Phe Asn Ile Thr Thr Asn Ile Arg Gly Lys Val Gln Lys Glu Tyr 165 170 175

Ala Leu Phe Tyr Lys Leu Asp Ile Ala Pro Ile Asp Asn Asn Ser Asn 180 185 190

Asn Arg Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln Ala 195 200 205

Cys Pro Lys Val Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala Pro 210 215 220

Ala Gly Phe Ala Ile Leu Lys Cys Lys Asp Lys Lys Phe Asn Gly Lys 225 230 235 240

Gly Pro Cys Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg 245 250 255

Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu 260 265 270

Glu Val Val Ile Arg Ser Ala Asn Phe Ala Asp Asn Ala Lys Val Ile 275 280 285

Ile Val Gln Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn 290 295 300

Asn Asn Thr Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr 305 310 315 320

Thr Thr Gly Glu Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Leu 325 330 335

Ser Arg Ala Lys Trp Asn Asp Thr Leu Asn Lys Ile Val Ile Lys Leu 340 345 350

Arg Glu Gln Phe Gly Asn Lys Thr Ile Val Phe Lys His Ser Ser Gly 355 360 365

Gly Asp Pro Glu Ile Val Thr His Ser Phe Asn Cys Gly Gly Glu Phe 370 375 380

Phe Tyr Cys Asn Ser Thr Gln Leu Phe Asn Ser Thr Trp Asn Val Thr 385 390 395 400

Glu Glu Ser Asn Asn Thr Val Glu Asn Asn Thr Ile Thr Leu Pro Cys 405 410 415

Arg Ile Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly Arg Ala Met 420 425 430

Tyr Ala Pro Pro Ile Arg Gly Gln Ile Arg Cys Ser Ser Asn Ile Thr 435 440 445

Gly Leu Leu Leu Thr Arg Asp Gly Gly Pro Glu Asp Asn Lys Thr Glu 450 455 460

Val Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu 465 470 475 480

Leu Tyr Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Val Ala Pro 485 490 495

Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala Val Gly 500 505 510

Ile Gly Ala Val Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr Met 515 520 525

Gly Ala Ala Ala Met Thr Leu Thr Val Gln Ala Arg Leu Leu Leu Ser 530 535 540

Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu Ala Gln 545 550 555 560

Gln His Leu Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu Gln Ala 565 570 575

Arg Val Leu Ala Val Glu Arg Tyr Leu Arg Asp Gln Gln Leu Leu Gly 580 585 590

Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val Pro Trp 595 600 605

Asn Ala Ser Trp Ser Asn Lys Ser Leu Asn Lys Ile Trp Asp Asn Met 610 615 620

Thr Trp Ile Glu Trp Asp Arg Glu Ile Asn Asn Tyr Thr Ser Ile Ile 625 630 635 640

Tyr Ser Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn Glu Gln 645 650 655

Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp Phe Asp 660 665 670

Ile Thr Lys Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met Ile Val Gly 675 680 685

Gly Leu Ile Gly Leu Arg Ile Val Phe Ser Val Leu Ser Ile Val Asn 690 695 700

Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr His Leu Pro 705 710 715 720

Ser Ser Arg Gly Pro Asp Arg Pro Gly Gly Ile Glu Glu Glu Gly Gly 725 730 735

Glu Arg Asp Arg Asp Arg Ser Gly Pro Leu Val Asn Gly Phe Leu Ala 740 745 750

Leu Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe Ser Tyr His Arg 755 760 765

Leu Arg Asp Leu Leu Leu Ile Val Met Arg Ile Val Glu Leu Leu Gly 770 775 780

Leu Ala Gly Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu Leu Gln 785 790 795 800

Tyr Trp Ser Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn Ala 805 810 815

Thr Ala Val Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val Leu 820 825 830

Gln Arg Ala Val Arg Ala Ile Leu His Ile Pro Arg Arg Ile Arg Gln 835 840 845

Gly Leu Glu Arg Ala Leu Leu 850 855
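
The pairing of codon triplets with three-letter residue names in the tables above can be spot-checked mechanically. The following is a minimal sketch, not part of the original disclosure, assuming the standard genetic code; the names BASES, AMINO_1, THREE and translate are hypothetical helpers introduced only for illustration, and "Ter" is an assumed label for a stop codon.

```python
# Illustrative sketch only: translate space-separated codons into
# three-letter residue names using the standard genetic code.
# All identifiers here are hypothetical, not taken from the patent.

BASES = "TCAG"  # conventional base order for the compact codon table
# One-letter residues for the 64 codons, ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",  # assumed label for a stop codon
}


def translate(codons):
    """Return the three-letter residue name for each codon in the string."""
    residues = []
    for codon in codons.upper().split():
        # Position of the codon in the 64-entry table (base-4 index).
        index = (16 * BASES.index(codon[0])
                 + 4 * BASES.index(codon[1])
                 + BASES.index(codon[2]))
        residues.append(THREE[AMINO_1[index]])
    return residues


# First codons of the open reading frame printed in Table III.
print(" ".join(translate("ATG AGA GTG ACG GAG ATC AGG AAG AGT TAT CAG CAC")))
```

Comparing the printed residues against the amino acid row under the same codons is one way to catch transcription errors in either row.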