Title:
HUMAN HOMOLOG OF THE E-CADHERIN GENE AND METHODS BASED THEREON
Document Type and Number:
WIPO Patent Application WO/1994/011401
Kind Code:
A1
Abstract:
The present invention relates to nucleotide sequences of the human E-cadherin gene, and the amino acid sequences of the encoded E-cadherin protein. The invention further relates to fragments and other derivatives, and analogs, of the human E-cadherin protein, as well as antibodies thereto. Nucleic acids encoding such fragments or derivatives are also within the scope of the invention, as well as human E-cadherin antisense nucleic acids. Production of the foregoing proteins and derivatives, e.g., by recombinant methods, is provided. In specific embodiments, the invention relates to human E-cadherin protein derivatives and analogs of the invention which are functionally active, or which comprise one or more domains of a human E-cadherin protein, including but not limited to the amino-terminal processed region, the HAV homotypic binding domain, one or more of the three repeat domains, the transmembrane region, the extracellular region, the cytoplasmic domain, and any combination of the foregoing. The present invention further relates to therapeutic and diagnostic methods and compositions based on E-cadherin proteins and nucleic acids.

Inventors:
RIMM DAVID L
MORROW JON S
Application Number:
PCT/US1993/011097
Publication Date:
May 26, 1994
Filing Date:
November 16, 1993
Assignee:
UNIV YALE (US)
International Classes:
C07K14/705; C12P21/08; G01N33/574; G01N33/68; A61K38/00; A61K39/00; (IPC1-7): C07K13/00; C07K15/28; C07H21/04; C12N5/10; C12P21/00; A61K37/02; A61K39/395; A61K48/00; G01N33/53
Domestic Patent References:
WO1992017608A1 1992-10-15
WO1991004745A1 1991-04-18
Other References:
DIFFERENTIATION, Vol. 38, No. 1, issued June 1988, A. MANSOURI et al., "Characterization and Chromosomal Location of the Gene Encoding the Human Cell Adhesion Molecule Uvomorulin", pages 67-71.
THE JOURNAL OF CELL BIOLOGY, Vol. 108, No. 6, issued June 1989, J. BEHRENS et al., "Dissecting Tumor Cell Invasion: Epithelial Cells Acquire Invasive Properties after the Loss of Uvomorulin-Mediated Cell-Cell Adhesion", pages 2435-2447.
THE JOURNAL OF CELL BIOLOGY, Vol. 113, No. 1, issued April 1991, U.H. FRIXEN et al., "E-Cadherin-Mediated Cell-Cell Adhesion Prevents Adhesiveness of Human Carcinoma Cells", pages 173-185.
NATURE, Vol. 329, issued 24 September 1987, A. NAGAFUCHI et al., "Transformation of Cell Adhesion Properties by Exogenously Introduced E-Cadherin cDNA", pages 341-343.
CELL, Vol. 63, issued 20 November 1990, M. OZAWA et al., "Single Amino Acid Substitutions in One Ca2+ Binding Site of Uvomorulin Abolish the Adhesive Function", pages 1033-1038.
THE EMBO JOURNAL, Vol. 6, No. 12, issued 1987, M. RINGWALD et al., "The Structure of Cell Adhesion Molecule Uvomorulin. Insights into the Molecular Mechanism of Ca2+-Dependent Cell Adhesion", pages 3647-3653.
NUCLEIC ACIDS RESEARCH, Vol. 19, No. 23, issued 1991, M. RINGWALD et al., "The Structure of the Gene Encoding for the Mouse Cell Adhesion Molecule Uvomorulin", pages 6533-6539.
DEVELOPMENT, Vol. 102, issued April 1988, M. TAKEICHI, "The Cadherins: Cell-Cell Adhesion Molecules Controlling Animal Morphogenesis", pages 639-655.
SCIENCE, Vol. 251, issued 22 March 1991, M. TAKEICHI, "Cadherin Cell Adhesion Receptors as a Morphogenetic Regulator", pages 1451-1455.
CELL, Vol. 34, issued September 1983, C.H. DAMSKY et al., "Identification and Purification of a Cell Surface Glycoprotein Mediating Intercellular Adhesion in Embryonic and Adult Tissue", pages 455-466.
JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, Vol. 85, issued 1963, R.B. MERRIFIELD, "Solid Phase Peptide Synthesis. I. The Synthesis of a Tetrapeptide", pages 2149-2154.
E. HARLOW et al., "Antibodies a Laboratory Manual", published 1988, by COLD SPRING HARBOR LABORATORY (COLD SPRING HARBOR, NEW YORK), pages 72-77, 92-97, 128-135, and 141-157.
L.S. GOODMAN et al., "The Pharmacological Basis of Therapeutics", published 1975, by MACMILLAN PUBLISHING CO., INC. (NEW YORK), pages 1-46.
CELL, Vol. 61, issued 06 April 1990, A. NOSE et al., "Localization of Specificity Determining Sites in Cadherin Cell Adhesion Molecules", pages 147-155.
DEVELOPMENTAL BIOLOGY, Vol. 152, issued 1992, K. SHIMAMURA et al., "E-Cadherin Expression in a Particular Subset of Sensory Neurons", pages 242-254.
JOURNAL OF CELLULAR BIOCHEMISTRY, Vol. 34, issued 1987, M.J. WHEELOCK et al., "Soluble 80-kd Fragment of Cell-CAM 120/80 Disrupts Cell-Cell Adhesion", pages 187-202.
Claims:
WHAT IS CLAIMED IS:
1. A purified human E-cadherin protein having the amino acid sequence depicted in Figure 3 (SEQ ID NO:2) from amino acid numbers 1-878.
2. A purified human E-cadherin protein having the amino acid sequence depicted in Figure 3 (SEQ ID NO:2) from amino acid numbers 151-878, which is free of detergents.
3. The protein of claim 1 which is not glycosylated.
4. A purified human E-cadherin protein having the amino acid sequence depicted in Figure 3 (SEQ ID NO:2) from amino acid numbers 151-878, which is not glycosylated.
5. A purified protein comprising a fragment of a human E-cadherin protein consisting of at least 30 sequential amino acids of the human E-cadherin sequence shown in Figure 3 from amino acid numbers 308-878.
6. The protein of claim 5 which displays one or more functional activities associated with a full-length human E-cadherin protein.
7. The protein of claim 6 in which the protein is able to be bound by an antibody to a human E-cadherin protein.
8. A purified protein comprising amino acid numbers 1-150 as depicted in Figure 3 (SEQ ID NO:2), with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
9. A purified protein comprising amino acid numbers 178-289 as depicted in Figure 3 (SEQ ID NO:2), with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
10. A purified protein comprising amino acid numbers 290-401 as depicted in Figure 3 (SEQ ID NO:2), with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
11. A purified protein comprising amino acid numbers 402-513 as depicted in Figure 3 (SEQ ID NO:2), with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
12. A purified protein comprising amino acid numbers 178-513 as depicted in Figure 3 (SEQ ID NO:2), with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
13. A purified protein comprising amino acid numbers 151-703 as depicted in Figure 3 (SEQ ID NO:2), with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
14. A purified protein comprising amino acid numbers 1-703 as depicted in Figure 3 (SEQ ID NO:2), with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
15. A purified protein comprising amino acid numbers 728-878 as depicted in Figure 3 (SEQ ID NO:2), with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
16. A purified protein comprising amino acid numbers 704-878 as depicted in Figure 3 (SEQ ID NO:2), with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
17. A purified protein comprising the E-cadherin amino acid sequence depicted in Figure 3 (SEQ ID NO:2) encoded by nucleotide numbers 1-1053, 510-2686, 1332-3000, 540-1500, 348-906, 890-1648, 384-1208, 641-2046, 685-1336, 880-1661, 1199-1742, 1373-1742, 1705-2204, or 2458-2775, with the proviso that said protein is not a mature human E-cadherin protein comprising amino acids 151-878 of Figure 3.
18. A fragment of a human E-cadherin protein, said fragment consisting of at least ten sequential amino acids selected from the repeat region or cytoplasmic domain of a human E-cadherin protein, said human E-cadherin protein having the amino acid sequence shown in Figure 3 (SEQ ID NO:2), in which said fragment is able to be bound by antibody to said human E-cadherin protein.
19. A chimeric protein comprising a functionally active fragment of a human E-cadherin protein joined via a peptide bond to an amino acid sequence of a protein other than a human E-cadherin protein, in which the fragment of the human E-cadherin protein is selected from the group consisting of the extracellular domain, the cytoplasmic domain, the repeat region, and the conserved cysteine domain.
20. A chimeric protein comprising a functionally active fragment of a human E-cadherin protein joined via a peptide bond to an amino acid sequence of a protein other than a human E-cadherin protein, in which the fragment of the human E-cadherin protein consists of at least 30 sequential amino acids of the human E-cadherin sequence shown in Figure 3 from amino acid numbers 308-878.
21. The protein according to claim 20 in which the at least 30 sequential amino acids are from the cytoplasmic domain of the human E-cadherin protein.
22. A monoclonal antibody which binds to a human E-cadherin protein and which does not bind to a mouse or chicken E-cadherin protein.
23. An antibody which binds to the cytoplasmic domain of a human E-cadherin protein.
24. A purified nucleic acid encoding a human E-cadherin protein having the amino acid sequence shown in Figure 3 (SEQ ID NO:2) from amino acid numbers 1-878.
25. A purified nucleic acid encoding a human E-cadherin protein having the amino acid sequence shown in Figure 3 (SEQ ID NO:2) from amino acid numbers 151-878.
26. The nucleic acid of claim 25 which lacks introns.
27. A purified nucleic acid which encodes the protein of claim 8, 9, or 10.
28. A purified nucleic acid which encodes the protein of claim 11, 13 or 14.
29. A purified nucleic acid which encodes the protein of claim 15 or 16.
30. A purified nucleic acid having the nucleotide sequence depicted in Figure 3 (SEQ ID NO:1) from nucleotide numbers 116-2749.
31. A purified nucleic acid having the nucleotide sequence depicted in Figure 3 (SEQ ID NO:1) from nucleotide numbers 566-2749.
32. A purified nucleic acid comprising the nucleotide sequence depicted in Figure 3 (SEQ ID NO:1) from nucleotide numbers 1-1053, 510-2686, 1332-3000, 540-1500, 348-906, 890-1648, 384-1208, 641-2046, 685-1336, 880-1661, 1199-1742, 1373-1742, 1705-2204, or 2458-2775.
33. A purified nucleic acid comprising the human E-cadherin nucleotide sequence contained in plasmid bsFLEC, as deposited with the ATCC and assigned accession number 69123.
34. A purified nucleic acid comprising the human E-cadherin nucleotide sequence contained in plasmid bsL5.1, as deposited with the ATCC and assigned accession number 69122.
35. A purified cDNA encoding a protein comprising a fragment of a human E-cadherin protein consisting of at least 30 sequential amino acids of the human E-cadherin sequence depicted in Figure 3 (SEQ ID NO:3) selected from amino acid numbers 308-878.
36. A purified nucleic acid comprising a nucleotide sequence 100% complementary to at least 30 sequential nucleotides of the nucleotide sequence depicted in Figure 3 (SEQ ID NO:1), with the proviso that said nucleotides do not consist of a portion of the nucleotide sequence from nucleotide numbers 617-1036 depicted in Figure 3.
37. A nucleic acid encoding the chimeric protein of claim 20.
38. A nucleic acid vector comprising the nucleic acid of claim 24 or 25.
39. A nucleic acid vector comprising the nucleic acid of claim 27.
40. A cell containing the nucleic acid vector of claim 37.
41. A cell containing the nucleic acid vector of claim 38.
42. A cell containing the nucleic acid vector of claim 39.
43. A cell containing a nucleic acid comprising (a) a first nucleotide sequence encoding human E-cadherin or an at least 30 amino acid functional fragment thereof, with the proviso that said fragment does not consist of amino acid numbers 153-307 depicted in Figure 3 (SEQ ID NO:2); and (b) a promoter operatively linked to the first nucleotide sequence, with the proviso that said promoter is not a human E-cadherin gene promoter.
44. The cell of claim 43 which is a tumor cell.
45. A method for producing a human E-cadherin protein comprising growing the recombinant cell of claim 40, such that the human E-cadherin protein is expressed by the cell; and recovering the expressed human E-cadherin protein.
46. A method for producing a human E-cadherin protein comprising growing the recombinant cell of claim 41, such that the human E-cadherin protein is expressed by the cell; and recovering the expressed human E-cadherin protein.
47. A purified protein which is the product of the method of claim 45.
48. A purified protein which is the product of the method of claim 46.
49. A pharmaceutical composition comprising a therapeutically effective amount of the protein of claim 1; and a pharmaceutically acceptable carrier.
50. A pharmaceutical composition comprising a therapeutically effective amount of the protein of claim 2; and a pharmaceutically acceptable carrier.
51. A pharmaceutical composition comprising a therapeutically effective amount of a purified human E-cadherin protein having the amino acid sequence depicted in Figure 3 (SEQ ID NO:2) from amino acid numbers 151-878; and a pharmaceutically acceptable carrier.
52. A pharmaceutical composition comprising a therapeutically effective amount of the protein of claim 4; and a pharmaceutically acceptable carrier.
53. A pharmaceutical composition comprising a therapeutically effective amount of the protein of claim 12; and a pharmaceutically acceptable carrier.
54. A pharmaceutical composition comprising a therapeutically effective amount of the antibody of claim 22; and a pharmaceutically acceptable carrier.
55. A pharmaceutical composition comprising a therapeutically effective amount of the nucleic acid of claim 24, 25, or 31; and a pharmaceutically acceptable carrier.
56. A pharmaceutical composition comprising a therapeutically effective amount of the cDNA of claim 35; and a pharmaceutically acceptable carrier.
57. A method of treating or preventing a malignancy in a subject comprising administering to a subject in need of such treatment or prevention a therapeutically or prophylactically effective amount of the protein of claim 1.
58. A method of treating or preventing a malignancy in a subject comprising administering to a subject in need of such treatment or prevention a therapeutically or prophylactically effective amount of the protein of claim 2.
59. A method of treating or preventing a malignancy in a subject comprising administering to a subject in need of such treatment or prevention a therapeutically or prophylactically effective amount of the protein of claim 4.
60. A method of treating or preventing a malignancy in a subject comprising administering to a subject in need of such treatment or prevention a therapeutically or prophylactically effective amount of the protein of claim 17.
61. A method of treating or preventing a malignancy in a subject comprising administering to a subject in need of such treatment or prevention a therapeutically or prophylactically effective amount of the nucleic acid of claim 24 or 25.
62. A method of treating a benign dysproliferative disorder in a subject comprising administering to a subject in need of such treatment a therapeutically effective amount of the protein of claim 1.
63. A method of treating a benign dysproliferative disorder in a subject comprising administering to a subject in need of such treatment a therapeutically effective amount of the protein of claim 2.
64. A method of treating a benign dysproliferative disorder in a subject comprising administering to a subject in need of such treatment a therapeutically effective amount of the protein of claim 5.
65. A method of treating a benign dysproliferative disorder in a subject comprising administering to a subject in need of such treatment a therapeutically effective amount of the nucleic acid of claim 24 or 25.
66. A purified nucleic acid comprising a nucleotide sequence 100% complementary to at least a 30 sequential nucleotide portion of the nucleotide sequence as depicted in Figure 3 (SEQ ID NO:1).
67. The nucleic acid of claim 66 which is complementary to a portion of the nucleotide sequence depicted in Figure 3 (SEQ ID NO:1) selected from nucleotide numbers 116-2748.
68. A method of detecting metastatic potential in a cell comprising detecting or measuring the level of human E-cadherin in the cell, in which a change in localization or decreased level of human E-cadherin relative to the localization or level of human E-cadherin in a non-malignant cell indicates that the cell has metastatic potential; wherein the detection or measurement of human E-cadherin is carried out by a method comprising contacting the cell with an antibody to the cytoplasmic domain of human E-cadherin such that immunospecific binding can occur, and detecting or measuring any immunospecific binding to the antibody.
69. A method of treating or preventing a malignancy or a benign dysproliferative disorder in a subject comprising administering to a subject in need of such treatment or prevention a therapeutically or prophylactically effective amount of the cell of claim 43.
70. A method of promoting nerve or tissue regeneration in a subject comprising administering to a subject in need of such treatment a therapeutically effective amount of the antibody of claim 23.
71. A method of promoting nerve or tissue regeneration in a subject comprising administering to a subject in need of such treatment a therapeutically effective amount of the nucleic acid of claim 66.
72. A fragment of the antibody of claim 23 containing the antibody binding domain.
73. A method of promoting wound healing in a subject comprising delivering to the site of a wound in a subject a therapeutically effective amount of the protein of claim 2.
74. A method of promoting wound healing in a subject comprising delivering to the site of a wound in a subject a therapeutically effective amount of the nucleic acid of claim 24 or 25.
75. A method of treating an inflammatory disorder comprising delivering to the site of inflammation in a subject a therapeutically effective amount of the protein of claim 2.
76. A method of treating an inflammatory disorder comprising delivering to the site of inflammation in a subject a therapeutically effective amount of the nucleic acid of claim 24 or 25.
77. A method of treating or preventing gestational disease or fetal wastage comprising administering to a subject in need of such treatment or prevention a therapeutically or prophylactically effective amount of the protein of claim 2.
78. A method of treating or preventing gestational disease or fetal wastage comprising administering to a subject in need of such treatment or prevention a therapeutically or prophylactically effective amount of the nucleic acid of claim 24 or 25.
79. A kit comprising in one or more containers: (a) a first oligonucleotide of at least 15 nucleotides in size, consisting of a nucleotide sequence that is a first portion of the nucleotide sequence depicted in Figure 3 (SEQ ID NO:1), wherein said first portion is not identical to or contained within (i) nucleotide numbers 617-1036 depicted in Figure 3 (SEQ ID NO:1), or (ii) the mouse or chicken E-cadherin coding sequence; and (b) a second oligonucleotide of at least 15 nucleotides in size, consisting of a nucleotide sequence that (i) is complementary to a second portion of the nucleotide sequence depicted in Figure 3, said second portion situated 3' to said first portion.
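
The amino acid and nucleotide coordinates recited above can be cross-checked arithmetically. The following is a minimal sketch, assuming that the coding sequence of Figure 3 begins at nucleotide 116 (the first coordinate of the 116-2749 range given in Section 5.1) and that each amino acid corresponds to exactly three consecutive nucleotides of an intron-free cDNA. The helper name `aa_to_nt` and the constant `CDS_START` are illustrative and are not part of the disclosure.

```python
from typing import Tuple

# Assumed first coding nucleotide, taken from the 116-2749 range in Section 5.1.
CDS_START = 116


def aa_to_nt(aa_start: int, aa_end: int, cds_start: int = CDS_START) -> Tuple[int, int]:
    """Map a 1-based, inclusive amino acid range to 1-based, inclusive nucleotide numbers."""
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    nt_start = cds_start + 3 * (aa_start - 1)  # first base of the first codon
    nt_end = cds_start + 3 * aa_end - 1        # last base of the last codon
    return nt_start, nt_end


if __name__ == "__main__":
    print(aa_to_nt(1, 878))    # full-length protein
    print(aa_to_nt(151, 878))  # mature protein after removal of residues 1-150
```

Under these assumptions, amino acids 1-878 map to nucleotides 116-2749 and amino acids 151-878 map to nucleotides 566-2749, which agrees with the nucleotide ranges recited for the full-length and mature coding sequences.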
Description:
HUMAN HOMOLOG OF THE E-CADHERIN GENE AND METHODS BASED THEREON

This invention was made with government support under grant number 1RO1 DK43812-01 awarded by the National Institutes of Health. The government has certain rights in the invention.

1. INTRODUCTION

The present invention relates to the human epithelial-cadherin (E-cadherin) gene and its encoded protein product, as well as derivatives and analogs of the human E-cadherin protein. Production of human E-cadherin proteins, derivatives and antibodies is also provided. The invention further relates to therapeutic and diagnostic methods and compositions.

2. BACKGROUND OF THE INVENTION

Cell adhesion molecules (CAMs) are cell surface glycoproteins which mediate specific cell-cell adhesions involved in embryonic development and maintaining tissue form and function.

E-cadherin is a cell adhesion molecule that is also known as uvomorulin, L-CAM and Cell CAM 120/80. E-cadherin localizes to the lateral surfaces and is concentrated in the adherens junctions of intestinal epithelial cells. It is present in epithelial cells from all organs examined, and related "cadherin family" molecules have been identified in brain, muscle, placenta, and other organs. This molecule has been reported to play a role in initiation of the formation of the cortical cytoskeleton, establishment of polarity (Nelson, 1991, Sem. Cell. Bio. 2:375-385; Nelson and Hammerton, 1989, J. Cell Biol. 108:893-902; Nelson et al., 1990, J. Cell Biol. 110:349-357; McNeill et al., 1990, Cell 62:309-316; Pasdar et al., 1991, J. Cell Biol. 113:645-655; Ruggieri et al., 1992, Am. J. Pathol. 140:1179-1185; Avner et al., 1992, Proc. Natl. Acad. Sci. USA 89:7447-7451), and suppression of cell invasion (Behrens et al., 1989, J. Cell Biol. 108:2435-2447). It has recently gained increased attention since it has been proposed to be a tumor suppressor protein (Mareel et al., 1991, Int. J. Cancer 47:922-928). Although this claim has yet to be proven in actual human tumors, there is evidence of changes in levels and patterns of expression in breast, gastric, and esophageal squamous cell carcinomas (Shimoyama and Hirohashi, 1991, Cancer Res. 51:2185-2192; Shimoyama et al., 1989, Cancer Res. 49:2128-2133; Shiozaki et al., 1991, Am. J. Pathol. 139:17-23).

Studies, primarily on the mouse and chicken proteins, have shown that E-cadherin is a 120 kilodalton (kD) membrane spanning protein with a large, glycosylated, amino-terminal extracellular domain and a 150 amino acid C-terminal cytoplasmic tail. The extracellular domain is cleavable with trypsin in the presence of Ca++, resulting in an 80 kD peptide that contains three putative repeat structures (possibly involved in Ca++ binding) and a highly conserved amino-terminal 113 amino acid region. Within this region is a HAV (His-Ala-Val) motif that is conserved between species, and that is primarily responsible for homotypic interactions (Fig. 1). In mouse and chicken, an amino-terminal piece consisting of 150 amino acids is removed in a maturation process resulting in the truncated mature form. The exact mechanism of this processing event is unknown.
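
A conserved tripeptide such as HAV can be located in any protein sequence, written in one-letter code, with a simple scan. The sketch below is illustrative only: the function name `find_motif` and the short peptide used in the usage example are hypothetical and are not taken from Figure 3.

```python
from typing import List


def find_motif(protein: str, motif: str = "HAV") -> List[int]:
    """Return the 1-based start positions of every occurrence of `motif`."""
    protein = protein.upper()
    motif = motif.upper()
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based residue number
        start = protein.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    toy_sequence = "MGSLHAVDINGNQVHAVS"  # made-up peptide, for illustration only
    print(find_motif(toy_sequence))
```

Residue numbers are reported 1-based so that they can be compared directly with the amino acid numbering used for the sequence figures.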

The murine and avian homologs of E-cadherin have been cloned and sequenced (see, e.g., Ringwald et al., 1987, EMBO J. 6:3647-3653; Nagafuchi et al., 1987, Nature 329:341-343; Gallin et al., 1987, Proc. Natl. Acad. Sci. USA 84:2808-2812; Sorkin et al., 1988, Proc. Natl. Acad. Sci. USA 85:7617-7621). A partial sequence of human E-cadherin derived from a liver cDNA clone has been reported (Mansouri et al., 1988, Differentiation 38:67-71). A partial sequence derived from amino-terminal sequencing of the human protein has also been disclosed (Wheelock et al., 1987, J. Cell. Biochem. 34:187-202). However, the full-length nucleotide and amino acid sequences for human E-cadherin have not been available prior to the present invention. A knowledge of such sequences is of primary importance since proteins having human sequences and antibodies thereto are greatly preferred over those of other species for human therapeutic and diagnostic purposes. Knowledge of the complete E-cadherin sequences is also important for deriving appropriate strategies in the generation of derivatives and fragments, for example, in choosing an appropriate restriction enzyme for cleavage in the isolation of portions of the coding sequence.

Citation of a reference hereinabove shall not be construed as an admission that such reference is prior art to the present invention.

3. SUMMARY OF THE INVENTION

The present invention relates to nucleotide sequences of the human E-cadherin gene, and the amino acid sequences of the encoded E-cadherin protein. The invention further relates to fragments and other derivatives, and analogs, of the human E-cadherin protein, as well as antibodies thereto. Nucleic acids encoding such fragments or derivatives are also within the scope of the invention. Production of the foregoing proteins and derivatives, e.g., by recombinant methods, is provided. In specific embodiments, the invention relates to human E-cadherin protein derivatives and analogs of the invention which are functionally active, or which comprise one or more domains of a human E-cadherin protein, including but not limited to the amino-terminal processed region, the HAV homotypic binding domain, one or more of the three repeat domains, the conserved cysteine domain, the transmembrane region, the extracellular region, the cytoplasmic domain, and any combination of the foregoing.

The present invention further relates to therapeutic and diagnostic methods and compositions based on E-cadherin proteins and nucleic acids. The invention provides for treatment of disorders of cell fate or differentiation by administration of a therapeutic compound of the invention. Such therapeutic compounds (termed herein "Therapeutics") include: E-cadherin proteins and analogs and derivatives (including fragments) thereof; antibodies thereto; nucleic acids encoding the E-cadherin proteins, analogs, or derivatives; and E-cadherin antisense nucleic acids. In a preferred embodiment, a Therapeutic of the invention is administered to treat or prevent a cancerous condition, or to prevent progression from a pre-neoplastic or non-malignant state into a neoplastic or a malignant state, or to inhibit or ameliorate metastatic tumor development. Methods of promoting nerve or tissue regeneration, of promoting wound healing, of treating an inflammatory disorder, and of treating or preventing gestational disease or fetal wastage are also provided. In particular embodiments presented by way of examples sections infra, the invention provides the complete nucleotide sequence of human cDNA from liver and colon coding for E-cadherin, and the sequence of the encoded human E-cadherin protein. Also described are specific anti-human E-cadherin antibodies, and multiple human cloned fragments of both liver and colon E-cadherin cDNAs, some of which have been expressed in both eukaryotic and prokaryotic expression systems.

4. DESCRIPTION OF THE FIGURES

Figure 1. A schematic diagram of E-cadherin showing structural features including the amino (N)-terminal processed region, the HAV homotypic binding sequence, the three repeat domains, and the highly conserved carboxy (C)-terminal cytoplasmic domain. Also shown are some of the proteins thought to interact, either directly or indirectly, with the cytoplasmic domain.

Figure 2. A restriction map of the liver E-cadherin clone. The translation start is shown preceding the mature N terminus of the protein. Three proposed repeat domains are shown and the homotypic adhesion sequence (HAV) is located in the first repeat. The hashed portion of the sequence denotes the region originally published by Mansouri et al. (1988, Differentiation 38:67-71). The regions labeled e250 and cyto 20 are regions that have been produced as fusion proteins for in vitro binding studies and antibody production.

Figure 3. The complete nucleotide (SEQ ID NO:1) and protein (SEQ ID NO:2) sequences of human liver E-cadherin.

Figure 4. Series of E-cadherin clones shown by location and source. Leftmost column lists the clone name. Numbers beneath lines are the starting and ending nucleotide of each clone. Clones with hashmarks on one end are either clones whose ends are not yet sequenced or fusion clones with artefactual sequence on the end as shown. Part A: liver clones. Part B: colon clones. Solid bar: characterized E-cadherin sequence; cross-hatched area: concatenated unrelated sequence; dotted area: regions not yet sequenced.

Figure 5. Nucleotide (SEQ ID NO:1) and protein (SEQ ID NO:2) sequences of human liver E-cadherin, with restriction sites. Restriction enzyme cleavage sites, translation start site, mature amino (N) terminus, homotypic binding domain/recognition sequence ("Rec Seq"; containing the HAV site), and the transmembrane region are shown.

Figure 6. Schematic diagram of plasmid pCMV-NeoPoly 1. pCMV-NeoPoly 1 is a 6.7 kb plasmid that was constructed and kindly provided by Dr. Eric R. Fearon. Known unique restriction sites: XhoI, EcoRV, BamHI, StuI, NheI, HindIII, and SstI.

5. DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to nucleotide sequences of the human E-cadherin gene, and the amino acid sequence of the encoded E-cadherin protein. The invention further relates to fragments and other derivatives, and analogs, of the human E-cadherin protein. Nucleic acids encoding such fragments or derivatives are also within the scope of the invention. Production of the foregoing proteins and derivatives, e.g., by recombinant methods, is provided.

The invention also relates to human E-cadherin protein derivatives and analogs of the invention which are functionally active, i.e., they are capable of displaying one or more known functional activities associated with a full-length (wild-type) E-cadherin protein. Such functional activities include but are not limited to antigenicity [ability to bind (or compete with an E-cadherin protein for binding) to an anti-E-cadherin protein antibody], immunogenicity (ability to generate antibody which binds to an E-cadherin protein), ability to bind (or compete with an E-cadherin protein for binding) to a receptor or ligand for an E-cadherin protein, suppression of cell invasiveness, therapeutic activity, etc.

The invention further relates to fragments (and derivatives and analogs thereof) of a human E-cadherin protein which comprise one or more domains of a human E-cadherin protein (see Section 6), including but not limited to the amino-terminal processed region, the HAV homotypic binding domain, one or more of the three repeat regions, the conserved cysteine domain, the extracellular region, transmembrane region, cytoplasmic domain, and any combination of the foregoing.

Antibodies to the human E-cadherin protein and its derivatives and analogs are additionally provided. The present invention further relates to therapeutic and diagnostic methods and compositions based on E-cadherin proteins and nucleic acids. The invention provides for treatment by administration of a therapeutic compound of the invention. Such therapeutic compounds (termed herein "Therapeutics") include: human E-cadherin proteins and analogs and derivatives (including fragments) thereof; antibodies thereto; nucleic acids encoding the human E-cadherin proteins, analogs, or derivatives; and E-cadherin antisense nucleic acids. In a preferred embodiment, a Therapeutic of the invention is administered to treat a cancerous condition, or to prevent progression from a pre-neoplastic or non-malignant state (e.g., metaplastic condition) into a neoplastic or a malignant state, or to inhibit or ameliorate metastatic tumor development. In another specific embodiment, a nucleic acid encoding a human E-cadherin protein or fragment thereof is used in gene therapy. Methods of promoting nerve or tissue regeneration, of promoting wound healing, of treating an inflammatory disorder, and of treating or preventing gestational disease or fetal wastage are also provided.
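
At the sequence level, an antisense nucleic acid is one whose bases are complementary to a stretch of the target strand. As a minimal sketch of that relationship, the following computes the reverse complement of a DNA segment written 5' to 3'. The function name `reverse_complement` is hypothetical, and any input used with it here is an arbitrary example rather than a portion of SEQ ID NO:1.

```python
# Translation table pairing each base with its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5' to 3'."""
    invalid = set(dna) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original segment, which is a quick check that the result is fully complementary over its whole length.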

E-cadherin plays a role in developmental and other physiological processes. The nucleic acid and amino acid sequences and antibodies thereto of the invention can also be used for the detection and quantitation of human E-cadherin mRNA, to study expression thereof, to produce human E-cadherin proteins, fragments and other derivatives, and analogs thereof, in the study and manipulation of differentiation and other physiological processes.

The invention is illustrated by way of examples infra which disclose, inter alia, the cloning and sequencing of human E-cadherin cDNAs from liver and colon, and the construction and recombinant expression of human E-cadherin chimeric/fusion derivatives and production of antibodies thereto. For clarity of disclosure, and not by way of limitation, the detailed description of the invention will be divided into the following subsections:

(i) Isolation of the Human E-Cadherin Gene;
(ii) Expression of the Human E-Cadherin Gene;
(iii) Identification and Purification of the Expressed Gene Products;
(iv) Structure of the Human E-Cadherin Gene and Protein;
(v) Generation of Antibodies to the Human E-Cadherin Protein and Derivatives Thereof;
(vi) Human E-Cadherin Protein Derivatives and Analogs;
(vii) Assays of Human E-Cadherin Proteins, Derivatives and Analogs;
(viii) Therapeutic and Prophylactic Uses;
(ix) Gene Therapy;
(x) Antisense Regulation of Human E-cadherin Expression;
(xi) Demonstration of Therapeutic or Prophylactic Utility;
(xii) Therapeutic/Prophylactic Administration and Compositions;
(xiii) Diagnostic Utility.

5.1. ISOLATION OF THE HUMAN E-CADHERIN GENE

The invention relates to the nucleotide sequences of human E-cadherin nucleic acids. In a specific embodiment, the human E-cadherin nucleic acid comprises the nucleotide sequence (SEQ ID NO:1) shown in Figure 3; in particular, from nucleotide numbers 116-2749, or 566-2749, or 1037-2748, or fragments thereof. In other embodiments, a nucleic acid is provided which comprises the nucleotide sequence depicted in Figure 3 (SEQ ID NO:1) from nucleotide numbers 1-1053, 510-2686, 1332-3000, 540-1500, 348-906, 890-1648, 384-1208, 641-2046, 685-1336, 880-1661, 1199-1742, 1373-1742, 1705-2204, or 2458-2775 (see Fig. 4).

In another specific embodiment, the nucleotide sequence encodes all or a portion of the amino acid sequence (SEQ ID NO:2) shown in Figure 3. The invention provides nucleic acids consisting of at least 8 nucleotides (i.e., a hybridizable portion) of a human E-cadherin sequence; in other embodiments, the nucleic acids consist of at least 30 nucleotides, 50 nucleotides, 100 nucleotides, 150 nucleotides, or 200 nucleotides of a human E-cadherin sequence. The invention also relates to nucleic acids hybridizable to or complementary to the foregoing sequences. In specific aspects, nucleic acids are provided which comprise a sequence complementary to at least 10, 25, 30, 50, 100, or 200 nucleotides or the entire coding region of a human E-cadherin gene. The longest stretch of identity among the human E-cadherin sequence of Figure 3 and the published mouse and chicken E-cadherin sequences is 29 nucleotides. Nucleic acids comprising a portion of the noncoding sequence shown in Figure 3 are also provided, as are nucleic acids complementary thereto. The nucleic acids of the invention do not consist of the nucleotide sequence shown in Figure 3 (SEQ ID NO:1) from nucleotide numbers 617-1036; preferably, such nucleic acids also do not consist of a portion of such nucleotide sequence. Nucleic acids encoding fragments and derivatives of human E-cadherin proteins (see Section 5.6) are additionally provided.
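The nucleotide-number ranges, complementary sequences, and "longest stretch of identity" referred to above can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the function names (`subseq`, `reverse_complement`, `longest_shared_run`) and the toy sequences are hypothetical, and the sketch assumes that nucleotide numbers are 1-based and inclusive, as they are conventionally read from a sequence listing.

```python
# Illustrative helpers only; names and toy sequences are assumptions,
# not taken from Figure 3.

def subseq(seq, start, end):
    """Return nucleotides start..end, using 1-based inclusive numbering."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range lies outside the sequence")
    return seq[start - 1:end]

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def longest_shared_run(a, b):
    """Length of the longest contiguous stretch identical in a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run that ended at the previous position of both.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

toy = "ATGGGCCCTTGGAGCCGC"            # made-up sequence, not SEQ ID NO:1
fragment = subseq(toy, 4, 11)          # nucleotides 4-11 of the toy sequence
antisense = reverse_complement(fragment)
shared = longest_shared_run("ACGTACGT", "TTGTACAA")
```

With a real sequence loaded into `toy`, a call such as `subseq(toy, 116, 2749)` would return the corresponding numbered region, and `longest_shared_run` could be applied to two species' sequences to check how long a probe must be to exceed their longest identical stretch.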

In a preferred, but not limiting, aspect of the invention, a human E-cadherin DNA can be cloned and sequenced by the method described in Section 6, infra. Any human cell potentially can serve as the nucleic acid source for the molecular cloning of the E-cadherin gene. The DNA may be obtained by standard procedures known in the art from cloned DNA (e.g., a DNA "library"), by chemical synthesis, by cDNA cloning, or by the cloning of genomic DNA, or fragments thereof, purified from the desired cell. (See, for example, Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York; Glover, D.M. (ed.), 1985, DNA Cloning: A Practical Approach, MRL Press, Ltd., Oxford, U.K., Vol. I, II.) Clones derived from genomic DNA may contain regulatory and intron DNA regions in addition to coding regions; clones derived from cDNA will lack introns and will contain only exon sequences. Whatever the source, the gene should be molecularly cloned into a suitable vector for propagation of the gene.

In the molecular cloning of the gene from genomic DNA, DNA fragments are generated, some of which will encode the desired gene. The DNA may be cleaved at specific sites using various restriction enzymes. Alternatively, one may use DNAse in the presence of manganese to fragment the DNA, or the DNA can be physically sheared, as for example, by sonication. The linear DNA fragments can then be separated according to size by standard techniques, including but not limited to, agarose and polyacrylamide gel electrophoresis and column chromatography.
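The cleavage and size-separation steps just described can be mimicked in silico. The following Python sketch is an illustration under stated assumptions rather than a laboratory protocol: the function name `digest` and the toy sequence are hypothetical, the DNA is treated as a single linear strand, and the only enzyme fact relied on is that EcoRI recognizes GAATTC and cuts between the G and the first A.

```python
import re

def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting linear `seq` at every `site`.

    cut_offset is the number of bases of the recognition site that stay
    on the upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    # A lookahead pattern also finds overlapping occurrences of the site.
    pattern = "(?=" + re.escape(site) + ")"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Made-up 21-nucleotide sequence containing two EcoRI sites.
toy = "AAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
fragments = digest(toy, "GAATTC", 1)
# Largest first, the order in which bands would be read from the top of a gel.
by_size = sorted(fragments, reverse=True)
```

Comparing `by_size` with the band sizes observed after electrophoresis is one way to decide whether a cloned fragment matches a restriction map deduced from a known nucleotide sequence.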

Once the DNA fragments are generated, identification of the specific DNA fragment containing the desired gene may be accomplished in a number of ways. For example, if an amount of a portion of an E-cadherin (of any species) gene or its specific RNA, or a fragment thereof, e.g., an extracellular or cytoplasmic region (see Section 5.6), is available and can be purified, or synthesized, and labeled, the generated DNA fragments may be screened by nucleic acid hybridization to the labeled probe (Benton and Davis, 1977, Science 196:180; Grunstein and Hogness, 1975, Proc. Natl. Acad. Sci. U.S.A. 72:3961). Those DNA fragments with substantial homology to the probe will hybridize. It is also possible to identify the appropriate fragment by restriction enzyme digestion(s) and comparison of fragment sizes with those expected according to a known restriction map, either available or deduced from a known nucleotide sequence. Further selection can be carried out on the basis of the properties of the gene. Alternatively, the presence of the gene may be detected by assays based on the physical, chemical, or immunological properties of its expressed product. For example, cDNA clones, or DNA clones which hybrid-select the proper mRNAs, can be selected which produce a protein that, e.g., has similar or identical electrophoretic migration, isoelectric focusing behavior, proteolytic digestion maps, binding activity, or antigenic properties as known for an E-cadherin protein. By use of an antibody to an E-cadherin protein, the E-cadherin protein may be identified by binding of labeled antibody to the putatively E-cadherin protein synthesizing clones, in an ELISA (enzyme-linked immunosorbent assay)-type procedure.

The E-cadherin gene can also be identified by mRNA selection by nucleic acid hybridization followed by in vitro translation. In this procedure, fragments are used to isolate complementary mRNAs by hybridization. Such DNA fragments may represent available, purified E-cadherin DNA of human or of another species (e.g., mouse, chicken). Immunoprecipitation analysis or functional assays (e.g., binding to a receptor or ligand; see infra) of the in vitro translation products of the isolated mRNAs identifies the mRNA and, therefore, the complementary DNA fragments that contain the desired sequences. In addition, specific mRNAs may be selected by adsorption of polysomes isolated from cells to immobilized antibodies specifically directed against an E-cadherin protein. A radiolabelled E-cadherin cDNA can be synthesized using the selected mRNA (from the adsorbed polysomes) as a template. The radiolabelled mRNA or cDNA may then be used as a probe to identify the E-cadherin DNA fragments from among other genomic DNA fragments.

Alternatives to isolating the human E-cadherin genomic DNA include, but are not limited to, chemically synthesizing the gene sequence itself from a known sequence or making cDNA to the mRNA which encodes a human E-cadherin protein. For example, RNA for cDNA cloning of the human E-cadherin gene can be isolated from human cells (e.g., epithelial cells) which express an E-cadherin protein. Other methods are possible and within the scope of the invention.

The identified and isolated gene can then be inserted into an appropriate cloning vector. A large number of vector-host systems known in the art may be used. Possible vectors include, but are not limited to, plasmids or modified viruses, but the vector system must be compatible with the host cell used. Such vectors include, but are not limited to, bacteriophages such as lambda derivatives, or plasmids such as pBR322 or pUC plasmid derivatives. The insertion into a cloning vector can, for example, be accomplished by ligating the DNA fragment into a cloning vector which has complementary cohesive termini. However, if the complementary restriction sites used to fragment the DNA are not present in the cloning vector, the ends of the DNA molecules may be enzymatically modified. Alternatively, any site desired may be produced by ligating nucleotide sequences (linkers) onto the DNA termini; these ligated linkers may comprise specific chemically synthesized oligonucleotides encoding restriction endonuclease recognition sequences. In an alternative method, the cleaved vector and E-cadherin gene may be modified by homopolymeric tailing. Recombinant molecules can be introduced into host cells via transformation, transfection, infection, microinjection, electroporation, etc., so that many copies of the gene sequence are generated.

In an alternative method, the desired gene may be identified and isolated after insertion into a suitable cloning vector in a "shot gun" approach. Enrichment for the desired gene, for example, by size fractionation, can be done before insertion into the cloning vector.

In specific embodiments, transformation of host cells with recombinant DNA molecules that incorporate the isolated E-cadherin gene, cDNA, or synthesized DNA sequence enables generation of multiple copies of the gene. Thus, the gene may be obtained in large quantities by growing transformants, isolating the recombinant DNA molecules from the transformants and, when necessary, retrieving the inserted gene from the isolated recombinant DNA.

5.2. EXPRESSION OF THE HUMAN E-CADHERIN GENE

The nucleotide sequence coding for a human E-cadherin protein or a functionally active fragment or other derivative thereof (see Section 5.6), can be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence. The necessary transcriptional and translational signals can also be supplied by the native E-cadherin gene and/or its flanking regions. A variety of host-vector systems may be utilized to express the protein-coding sequence. These include but are not limited to mammalian cell systems infected with virus (e.g., vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g., baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed with bacteriophage DNA, plasmid DNA, or cosmid DNA. The expression elements of vectors vary in their strengths and specificities. Depending on the host-vector system utilized, any one of a number of suitable transcription and translation elements may be used. In a specific embodiment, a chimeric protein comprising the extracellular domain or repeat region or other domain of a human E-cadherin protein is expressed. In other specific embodiments, a full-length human E-cadherin cDNA is expressed, or a sequence encoding a functionally active portion of a human E-cadherin protein. In yet another embodiment, a fragment of a human E-cadherin protein comprising a domain of the protein, or other derivative, or analog of a human E-cadherin protein is expressed.

Any of the methods previously described for the insertion of DNA fragments into a vector may be used to construct expression vectors containing a chimeric gene consisting of appropriate transcriptional/translational control signals and the protein coding sequences. These methods may include in vitro recombinant DNA and synthetic techniques and in vivo recombinants (genetic recombination). Expression of a nucleic acid sequence encoding a human E-cadherin protein or peptide fragment may be regulated by a second nucleic acid sequence so that the E-cadherin protein or peptide is expressed in a host transformed with the recombinant DNA molecule. For example, expression of an E-cadherin protein may be controlled by any promoter/enhancer element known in the art. Promoters which may be used to control E-cadherin gene expression include, but are not limited to, the SV40 early promoter region (Benoist and Chambon, 1981, Nature 290:304-310), the promoter contained in the 3' long terminal repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell 22:787-797), the herpes thymidine kinase promoter (Wagner et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the metallothionein gene (Brinster et al., 1982, Nature 296:39-42), an adenovirus promoter, cytomegalovirus (CMV) promoter; prokaryotic promoters such as the β-lactamase (Villa-Komaroff et al., 1978, Proc. Natl. Acad. Sci. U.S.A. 75:3727-3731), tac (DeBoer et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:21-25), λPL, or trc promoters; see also "Useful proteins from recombinant bacteria" in Scientific American, 1980, 242:74-94; plant expression vectors comprising the nopaline synthetase promoter region or the cauliflower mosaic virus 35S RNA promoter (Gardner et al., 1981, Nucl. Acids Res. 9:2871), and the promoter of the photosynthetic enzyme ribulose biphosphate carboxylase (Herrera-Estrella et al., 1984, Nature 310:115-120); promoter elements from yeast or other fungi such as the Gal 4 promoter, the ADC (alcohol dehydrogenase) promoter, PGK (phosphoglycerol kinase) promoter, alkaline phosphatase promoter, and the following animal transcriptional control regions, which exhibit tissue specificity and have been utilized in transgenic animals: elastase I gene control region which is active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., 1986, Cold Spring Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987, Hepatology 7:425-515); insulin gene control region which is active in pancreatic beta cells (Hanahan, 1985, Nature 315:115-122), immunoglobulin gene control region which is active in lymphoid cells (Grosschedl et al., 1984, Cell 38:647-658; Adames et al., 1985, Nature 318:533-538; Alexander et al., 1987, Mol. Cell. Biol. 7:1436-1444), mouse mammary tumor virus control region which is active in testicular, breast, lymphoid and mast cells (Leder et al., 1986, Cell 45:485-495), albumin gene control region which is active in liver (Pinkert et al., 1987, Genes and Devel. 1:268-276), alpha-fetoprotein gene control region which is active in liver (Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-1648; Hammer et al., 1987, Science 235:53-58); alpha 1-antitrypsin gene control region which is active in the liver (Kelsey et al., 1987, Genes and Devel. 1:161-171), beta-globin gene control region which is active in myeloid cells (Mogram et al., 1985, Nature 315:338-340; Kollias et al., 1986, Cell 46:89-94); myelin basic protein gene control region which is active in oligodendrocyte cells in the brain (Readhead et al., 1987, Cell 48:703-712); myosin light chain-2 gene control region which is active in skeletal muscle (Sani, 1985, Nature 314:283-286), and gonadotropic releasing hormone gene control region which is active in the hypothalamus (Mason et al., 1986, Science 234:1372-1378).

In a specific embodiment, a pGEX vector (Pharmacia) is used for expression in bacteria. In another specific embodiment, a nucleotide sequence encoding a human E-cadherin or fragment or derivative thereof is operatively linked to a promoter, wherein the promoter is not a human E-cadherin gene promoter.

Expression vectors containing human E-cadherin gene inserts can be identified by three general approaches: (a) nucleic acid hybridization, (b) presence or absence of "marker" gene functions, and (c) expression of inserted sequences. In the first approach, the presence of a foreign gene inserted in an expression vector can be detected by nucleic acid hybridization using probes comprising sequences that are homologous to an inserted E-cadherin gene. In the second approach, the recombinant vector/host system can be identified and selected based upon the presence or absence of certain "marker" gene functions (e.g., thymidine kinase activity, resistance to antibiotics, transformation phenotype, occlusion body formation in baculovirus, etc.) caused by the insertion of foreign genes in the vector. For example, if the E-cadherin gene is inserted within the marker gene sequence of the vector, recombinants containing the E-cadherin insert can be identified by the absence of the marker gene function. In the third approach, recombinant expression vectors can be identified by assaying the foreign gene product expressed by the recombinant. Such assays can be based, for example, on the physical or functional properties of the E-cadherin gene product in in vitro assay systems, e.g., binding to a ligand or receptor, binding with antibody.

Once a particular recombinant DNA molecule is identified and isolated, several methods known in the art may be used to propagate it. Once a suitable host system and growth conditions are established, recombinant expression vectors can be propagated and prepared in quantity. As previously explained, the expression vectors which can be used include, but are not limited to, the following vectors or their derivatives: human or animal viruses such as vaccinia virus or adenovirus; insect viruses such as baculovirus; yeast vectors; bacteriophage vectors (e.g., lambda), and plasmid and cosmid DNA vectors, to name but a few.

In addition, a host cell strain may be chosen which modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Expression from certain promoters can be elevated in the presence of certain inducers; thus, expression of the genetically engineered E-cadherin protein may be controlled. Furthermore, different host cells have characteristic and specific mechanisms for the translational and post-translational processing and modification (e.g., glycosylation, cleavage) of proteins. For example, mammalian, yeast, and baculovirus host cells can glycosylate proteins. Appropriate cell lines or host systems can be chosen to ensure the desired modification and processing of the foreign protein expressed. Both cDNA and genomic sequences can be cloned and expressed.

5.3. IDENTIFICATION AND PURIFICATION OF THE EXPRESSED GENE PRODUCTS

Once a recombinant which expresses a human E-cadherin gene sequence is identified, the gene product can be analyzed. This is achieved by assays based on the physical or functional properties of the product, including radioactive labelling of the product followed by analysis by gel electrophoresis, immunoassay, etc. Once a human E-cadherin protein is identified, it may be isolated and purified by standard methods including chromatography (e.g., ion exchange, affinity, and sizing column chromatography), centrifugation, differential solubility, or by any other standard technique for the purification of proteins. The functional properties may be evaluated using any suitable assay (see Section 5.7).

Alternatively, the amino acid sequence of a human E-cadherin protein can be deduced from the nucleotide sequence of the chimeric gene contained in the recombinant. Once the amino acid sequence is thus known, the protein can be synthesized by standard chemical methods known in the art (e.g., see Hunkapiller et al., 1984, Nature 310:105-111).
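Deduction of an amino acid sequence from a nucleotide sequence can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the standard genetic code, a DNA (not RNA) coding strand containing only A, C, G and T, and a reading frame chosen by the caller; the function name `translate` and the toy sequences are hypothetical and are not drawn from Figure 3.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(cdna, start=1):
    """Translate from 1-based nucleotide `start` to the first stop codon.

    Translation also ends when fewer than three nucleotides remain.
    """
    protein = []
    for i in range(start - 1, len(cdna) - 2, 3):
        aa = CODON_TABLE[cdna[i:i + 3].upper()]
        if aa == "*":          # stop codon reached
            break
        protein.append(aa)
    return "".join(protein)

peptide = translate("ATGGCCTGGTAA")               # toy open reading frame
shifted = translate("CCATGGCCTGGTAA", start=3)    # same frame, offset start
```

The `start` argument plays the role of the first nucleotide number of a coding region, so a region given as a numbered range in a sequence listing can be translated by passing its first position.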

By way of example, the deduced amino acid sequence (SEQ ID NO:2) of a human E-cadherin protein is presented in Figure 3. In a specific embodiment of the present invention, a human E-cadherin protein, whether produced by recombinant DNA techniques or by chemical synthetic methods, includes but is not limited to one containing, as a primary amino acid sequence, all or part of the amino acid sequence substantially as depicted in Figure 3 (SEQ ID NO:2), as well as fragments and other derivatives, and analogs thereof. In specific embodiments, the invention relates to mature human E-cadherin proteins, e.g., those having an amino acid sequence substantially as depicted in Figure 3 from amino acid numbers 151-878. In another specific embodiment, a protein comprises the amino acid sequence as depicted in Figure 3 from amino acid numbers 153-878. In another specific embodiment, the invention relates to an E-cadherin protein having an amino acid sequence substantially as depicted in Figure 3 from amino acid numbers 1-878. Purified proteins comprising the foregoing sequences are also provided. In another specific embodiment, the invention provides purified human E-cadherin proteins and fragments thereof that are free of detergents, substantially non-denatured, and/or free of other human cell membrane components. In another embodiment, the human E-cadherin protein or fragment thereof (e.g., comprising the extracellular domain) is glycosylated (e.g., as obtained by expression in mammalian cells). In another embodiment, the human E-cadherin protein or fragment thereof is nonglycosylated (e.g., as obtained by expression in bacteria). Nonglycosylated mature E-cadherin proteins are believed to be capable of homotypic binding.

5.4. STRUCTURE OF THE HUMAN E-CADHERIN GENE AND PROTEIN

The structure of the human E-cadherin gene and protein can be analyzed by various methods known in the art.

5.4.1. GENETIC ANALYSIS

The cloned DNA or cDNA corresponding to the E-cadherin gene can be analyzed by methods including but not limited to Southern hybridization (Southern, 1975, J. Mol. Biol. 98:503-517), Northern hybridization (see e.g., Freeman et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:4094-4098), restriction endonuclease mapping (Maniatis, 1982, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York), and DNA sequence analysis (see infra). Polymerase chain reaction (PCR; U.S. Patent Nos. 4,683,202, 4,683,195 and 4,889,818; Gyllenstein et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:7652-7656; Ochman et al., 1988, Genetics 120:621-623; Loh et al., 1989, Science 243:217-220) followed by Southern hybridization with an E-cadherin-specific probe can allow the detection of the human E-cadherin gene in DNA from various cell types. Northern hybridization analysis can be used to determine the expression of the E-cadherin gene. Various cell types, at various states of development, differentiation, or activity can be tested for E-cadherin gene expression. The stringency of the hybridization conditions for both Southern and Northern hybridization can be manipulated to ensure detection of nucleic acids with the desired degree of relatedness to the specific E-cadherin probe used, whether it be human or other species.

Restriction endonuclease mapping can be used to roughly determine the genetic structure of the human E-cadherin gene. Restriction maps derived by restriction endonuclease cleavage can be confirmed by DNA sequence analysis. Alternatively, restriction maps can be deduced, once the nucleotide sequence is known.

DNA sequence analysis can be performed by any techniques known in the art, including but not limited to the method of Maxam and Gilbert (1980, Meth. Enzymol. 65:499-560), the Sanger dideoxy method (Sanger et al., 1977, Proc. Natl. Acad. Sci. U.S.A. 74:5463), the use of T7 DNA polymerase (Tabor and Richardson, U.S. Patent No. 4,795,699; Sequenase, U.S. Biochemical Corp.), or Taq polymerase, or use of an automated DNA sequenator (e.g., Applied Biosystems, Foster City, CA). The cDNA sequence of a human E-cadherin gene comprises the sequence substantially as depicted in Figure 3 (SEQ ID NO:1), and described in Section 6, infra.

5.4.2. PROTEIN ANALYSIS

The amino acid sequence of a human E-cadherin protein can be derived by deduction from the DNA sequence, or alternatively, by direct sequencing of the protein, e.g., with an automated amino acid sequencer. The amino acid sequence of a representative human E-cadherin protein comprises the amino acid sequence substantially as depicted in Figure 3 (SEQ ID NO:2). In a specific embodiment, the sequence of the mature E-cadherin protein is substantially as depicted in Figure 3 from amino acid numbers 151-878.

Comparison of the human sequence to other known sequences allows identification of functional domains within the molecule, including but not limited to the extracellular domain, transmembrane region, cytoplasmic domain, amino-terminal processed region, homotypic binding domain, repeat region, repeat #1, repeat #2, and repeat #3 (see also Section 7 infra).

The E-cadherin protein sequence can be further characterized by a hydrophilicity analysis (Hopp and Woods, 1981, Proc. Natl. Acad. Sci. U.S.A. 78:3824). A hydrophilicity profile can be used to identify the hydrophobic and hydrophilic regions of an E-cadherin protein and the corresponding regions of the gene sequence which encode such regions.
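A hydrophilicity profile of the kind referred to above is a moving average of per-residue values along the sequence. The Python sketch below illustrates the idea under several assumptions: the per-residue values are the Hopp and Woods scale as commonly reproduced and should be checked against the cited publication before any real use; the six-residue window is the averaging length usually associated with that method; and the function names and the toy peptide are hypothetical.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated (assumption:
# verify against the cited 1981 paper before relying on them).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}

def hydrophilicity_profile(protein, window=6):
    """Average hydrophilicity of each successive window of residues."""
    if len(protein) < window:
        raise ValueError("sequence is shorter than the window")
    scores = [HOPP_WOODS[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

def most_hydrophilic_window(protein, window=6):
    """Return (1-based start position, residues) of the top-scoring window."""
    profile = hydrophilicity_profile(protein, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best + 1, protein[best:best + window]

toy = "LLLLLLKKDEERLLLL"   # made-up peptide: charged stretch in a leucine run
start, segment = most_hydrophilic_window(toy)
```

Peaks in the profile mark candidate surface-exposed stretches, and the start position maps back to the gene sequence at nucleotide offset 3 × (start − 1) from the first codon of the translated region.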

Secondary structural analysis (Chou and Fasman, 1974, Biochemistry 13:222) can also be done, to identify regions of an E-cadherin protein that assume specific secondary structures. Manipulation, translation, and secondary structure prediction, as well as open reading frame prediction and plotting, can also be accomplished using computer software programs available in the art.

Other methods of structural analysis can also be employed. These include but are not limited to X-ray crystallography (Engstom, 1974, Biochem. Exp. Biol. 11:7-13) and computer modeling (Fletterick and Zoller (eds.), 1986, Computer Graphics and Molecular Modeling, in Current Communications in Molecular Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York).

5.5. GENERATION OF ANTIBODIES TO THE HUMAN E-CADHERIN PROTEIN AND DERIVATIVES THEREOF

According to the invention, a human E-cadherin protein, its fragments or other derivatives, or analogs thereof, may be used as an immunogen to generate antibodies which recognize such an immunogen. Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab expression library. In a preferred embodiment, antibodies which specifically bind to human E-cadherin proteins are produced. In one embodiment, such an antibody recognizes the human E-cadherin protein having the sequence shown in Figure 3 (SEQ ID NO:2), or a portion thereof. In another embodiment, such an antibody specifically binds to human, but not mouse or chicken, E-cadherin. In another embodiment, antibodies to a particular domain (e.g., the cytoplasmic domain) of a human E-cadherin protein are produced.

Various procedures known in the art may be used for the production of polyclonal antibodies to a human E-cadherin protein or derivative or analog. For the production of antibody, various host animals can be immunized by injection with a native human E-cadherin protein, or a synthetic version, or derivative (e.g., fragment) thereof, including but not limited to rabbits, mice, rats, etc. Various adjuvants may be used to increase the immunological response, depending on the host species, and including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

In a preferred embodiment, polyclonal or monoclonal antibodies are produced by use of a hydrophilic portion of a human E-cadherin peptide (e.g., identified by the procedure of Hopp and Woods (1981, Proc. Natl. Acad. Sci. U.S.A. 78:3824)).

For preparation of monoclonal antibodies directed toward an E-cadherin protein sequence or analog thereof, any technique which provides for the production of antibody molecules by continuous cell lines in culture may be used. For example, the hybridoma technique originally developed by Kohler and Milstein (1975, Nature 256:495-497), as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96) can be used. In an additional embodiment of the invention, monoclonal antibodies can be produced in germ-free animals (PCT Publication No. WO 89/12690 dated December 28, 1989). According to the invention, human antibodies may be used and can be obtained by using human hybridomas (Cote et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:2026-2030) or by transforming human B cells with EBV virus in vitro (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, pp. 77-96), or by other methods known in the art. In fact, according to the invention, techniques developed for the production of "chimeric antibodies" (Morrison et al., 1984, Proc. Natl. Acad. Sci. U.S.A. 81:6851-6855; Neuberger et al., 1984, Nature 312:604-608; Takeda et al., 1985, Nature 314:452-454) by splicing the genes from a mouse antibody molecule specific for an E-cadherin protein together with genes from a human antibody molecule of appropriate biological activity can be used; such antibodies are within the scope of this invention. Also within the scope of the invention are "humanized" antibodies (see, e.g., EP Publication 239,400 dated September 30, 1987 by Winter).

According to the invention, techniques described for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce E-cadherin protein-specific single chain antibodies. An additional embodiment of the invention utilizes the techniques described for the construction of Fab expression libraries (Huse et al., 1989, Science 246:1275-1281) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity for E-cadherin proteins, derivatives, or analogs.

Antibody fragments which contain the idiotype (binding domain) of the molecule can be generated by known techniques. For example, such fragments include but are not limited to: the F(ab')2 fragment which can be produced by pepsin digestion of the antibody molecule; the Fab' fragments which can be generated by reducing the disulfide bridges of the F(ab')2 fragment, and the Fab fragments which can be generated by treating the antibody molecule with papain and a reducing agent.

In the production of antibodies, screening for the desired antibody can be accomplished by techniques known in the art, e.g., ELISA (enzyme-linked immunosorbent assay). For example, to select antibodies which recognize a specific domain of an E-cadherin protein, one may assay generated hybridomas for a product which binds to an E-cadherin fragment containing such domain. For selection of an antibody specific to a human E-cadherin protein and not E-cadherin of another species (e.g., mouse, chicken), one can select on the basis of positive binding to a human E-cadherin protein and a lack of binding to the E-cadherin protein of the other species.

The foregoing antibodies can be used in methods known in the art relating to the localization and activity of the protein sequences of the invention (e.g., see Section 5.7, infra), e.g., for imaging these proteins, measuring levels thereof in appropriate physiological samples, etc., diagnostically, and therapeutically, e.g., for inhibiting E-cadherin function.

5.6. HUMAN E-CADHERIN PROTEIN DERIVATIVES AND ANALOGS

The invention further relates to derivatives (including but not limited to fragments) and analogs of human E-cadherin proteins.

The production and use of derivatives and analogs related to human E-cadherin proteins are within the scope of the present invention. In a specific embodiment, the derivative or analog is functionally active, i.e., capable of exhibiting one or more functional activities associated with a full-length, wild-type human E-cadherin protein. As one example, such derivatives or analogs which have the desired immunogenicity or antigenicity can be used, for example, in immunoassays, for immunization, for promotion or inhibition of E-cadherin protein activity, etc. Such molecules which retain, or alternatively inhibit, a desired human E-cadherin protein property, e.g., binding to a receptor or ligand, such as possibly Notch protein, can be used as inducers, or inhibitors, respectively, of such property and its physiological correlates. Derivatives or analogs of E-cadherin proteins can be tested for the desired activity by procedures known in the art, including but not limited to the assays described in Section 5.7.

In particular, E-cadherin derivatives can be made by altering E-cadherin sequences by substitutions, additions or deletions that provide for functionally equivalent molecules. Due to the degeneracy of nucleotide coding sequences, other DNA sequences which encode substantially the same amino acid sequence as a human E-cadherin gene may be used in the practice of the present invention. These include but are not limited to nucleotide sequences comprising all or portions of human E-cadherin genes which are altered by the substitution of different codons that encode a functionally equivalent amino acid residue within the sequence, thus producing a silent change. Likewise, the E-cadherin derivatives of the invention include, but are not limited to, those containing, as a primary amino acid sequence, all or part of the amino acid sequence of a human E-cadherin protein including altered sequences in which functionally equivalent amino acid residues are substituted for residues within the sequence. For example, one or more amino acid residues within the sequence can be substituted by another amino acid of a similar polarity which acts as a functional equivalent, resulting in a silent alteration. Substitutes for an amino acid within the sequence may be selected from other members of the class to which the amino acid belongs. For example, the nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan and methionine. The polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine. The positively charged (basic) amino acids include arginine, lysine and histidine. The negatively charged (acidic) amino acids include aspartic acid and glutamic acid.
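The class-based substitution rule described above can be illustrated with a short sketch. The four classes are taken directly from the text; the one-letter residue codes, the function names, and the idea of flagging positions that fall outside a class are illustrative assumptions and are not part of the specification.

```python
# Illustrative sketch only: the four classes follow the text above;
# function names and the one-letter encoding are assumptions.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the name of the class that a one-letter residue belongs to."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not a standard amino acid: {residue!r}")


def is_same_class(original: str, substitute: str) -> bool:
    """True if both residues belong to the same class listed in the text."""
    return residue_class(original) == residue_class(substitute)


def out_of_class_positions(reference: str, variant: str) -> list[int]:
    """1-based positions where the variant differs and leaves the class."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        i + 1
        for i, (a, b) in enumerate(zip(reference, variant))
        if a != b and not is_same_class(a, b)
    ]


if __name__ == "__main__":
    print(is_same_class("L", "I"))               # both nonpolar
    print(out_of_class_positions("HAV", "HLD"))  # A->L stays in class, V->D does not
```

Membership in the same class is only a first screen; whether a given substitution actually preserves function would still have to be confirmed by the assays of Section 5.7.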

In a specific embodiment of the invention, proteins consisting of or comprising a fragment of a human E-cadherin protein consisting of at least 30 amino acids of the E-cadherin protein are provided. In other embodiments, the fragment consists of at least 6, 10, 50, 75, or 100 amino acids of the E-cadherin protein. Another specific embodiment relates to a protein comprising a fragment of the amino acid sequence shown in Figure 3 (SEQ ID NO:2) from amino acid numbers 728-878, which can be bound by anti-E-cadherin antibody. In another specific embodiment of the invention, a purified protein is provided which comprises a derivative or fragment of a human E-cadherin protein, with the proviso that said purified protein is not a mature human E-cadherin protein comprising amino acids 151-878 as depicted in Figure 3 (SEQ ID NO:2).
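As a small worked illustration of the residue numbering used above, the sketch below extracts a fragment by 1-based, inclusive amino acid numbers and checks it against a minimum length. The placeholder string stands in for the Figure 3 sequence, which is not reproduced here; the function names are hypothetical.

```python
# Illustrative sketch only. PLACEHOLDER is NOT the Figure 3 (SEQ ID NO:2)
# sequence; it is a dummy string of the same length (878 residues).
PLACEHOLDER = "X" * 878


def fragment(sequence: str, first: int, last: int) -> str:
    """Return residues first..last using 1-based, inclusive numbering."""
    if not (1 <= first <= last <= len(sequence)):
        raise ValueError("residue numbers fall outside the sequence")
    return sequence[first - 1:last]


def meets_minimum(frag: str, minimum: int = 30) -> bool:
    """True if the fragment has at least `minimum` amino acids."""
    return len(frag) >= minimum


if __name__ == "__main__":
    frag = fragment(PLACEHOLDER, 728, 878)
    print(len(frag))            # 878 - 728 + 1 residues
    print(meets_minimum(frag))  # compared with the 30-residue embodiment
```

With inclusive numbering, the fragment spanning amino acids 728-878 contains 151 residues, and the mature protein spanning 151-878 contains 728.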

The human E-cadherin protein derivatives and analogs of the invention can be produced by various methods known in the art. The manipulations which result in their production can occur at the gene or protein level. For example, the cloned E-cadherin gene sequence can be modified by any of numerous strategies known in the art (Maniatis, 1990, Molecular Cloning, A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York). The sequence can be cleaved at appropriate sites with restriction endonuclease(s), followed by further enzymatic modification if desired, isolated, and ligated in vitro. In the production of the gene encoding a derivative or analog of a human E-cadherin protein, care should be taken to ensure that the modified gene remains within the same translational reading frame as the E-cadherin gene, uninterrupted by translational stop signals, in the gene region where the desired E-cadherin protein activity is encoded. Additionally, the E-cadherin-encoding nucleic acid sequence can be mutated in vitro or in vivo, to create and/or destroy translation, initiation, and/or termination sequences, or to create variations in coding regions and/or form new restriction endonuclease sites or destroy preexisting ones, to facilitate further in vitro modification. Any technique for mutagenesis known in the art can be used, including but not limited to, in vitro site-directed mutagenesis (Hutchinson et al., 1978, J. Biol. Chem 253:6551).
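The reading-frame requirement stated above can be checked mechanically on a candidate coding sequence. The following is a minimal sketch, assuming the sequence is supplied as a DNA string beginning at the first base of a codon and using the three stop codons of the standard genetic code; the function names and the toy sequences are hypothetical.

```python
# Illustrative sketch only: checks that a modified coding region is a whole
# number of codons and has no stop codon before its final codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def split_codons(dna: str) -> list[str]:
    """Split a DNA string into complete triplets, ignoring any remainder."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def in_frame_without_internal_stop(dna: str) -> bool:
    """True if length is a multiple of 3 and no stop precedes the last codon."""
    if not dna or len(dna) % 3 != 0:
        return False
    triplets = split_codons(dna)
    return not any(codon in STOP_CODONS for codon in triplets[:-1])


if __name__ == "__main__":
    original = "ATGGCTGCATAA"                 # toy sequence, not E-cadherin
    deleted = original[:4] + original[5:]     # single-base deletion shifts the frame
    print(in_frame_without_internal_stop(original))
    print(in_frame_without_internal_stop(deleted))
```

A stop codon is tolerated here only as the final codon; for a region that must be read through (for example, the upstream part of a fusion), the caller would pass the sequence without its terminal stop.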

Manipulations of the human E-cadherin sequence may also be made at the protein level. Included within the scope of the invention are human E-cadherin protein fragments or other derivatives or analogs which are differentially modified during or after translation, e.g., by acetylation, glycosylation or deglycosylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, linkage to an antibody molecule or other cellular ligand, etc. Any of numerous chemical modifications may be carried out by known techniques, including but not limited to specific chemical cleavage by cyanogen bromide, trypsin, chymotrypsin, papain, V8 protease, NaBH4, acetylation, formylation, oxidation, reduction, etc.

In addition, analogs and derivatives of human E-cadherin proteins can be chemically synthesized. For example, a peptide corresponding to a portion of an E-cadherin protein which comprises the desired domain (see Section 5.6.1), or which mediates the desired activity in vitro or in vivo, can be synthesized by use of a peptide synthesizer. Furthermore, if desired, nonclassical amino acids or chemical amino acid analogs can be introduced as a substitution or addition into the human E-cadherin protein sequence. Non-classical amino acids include but are not limited to the D-isomers of the common amino acids, α-amino isobutyric acid, 4-aminobutyric acid, hydroxyproline, sarcosine, citrulline, cysteic acid, t-butylglycine, t-butylalanine, phenylglycine, cyclohexylalanine, β-alanine, designer amino acids such as β-methyl amino acids, Cα-methyl amino acids, and Nα-methyl amino acids.

In a specific embodiment, the human E-cadherin derivative is a chimeric, or fusion, protein comprising a human E-cadherin protein or fragment thereof (preferably consisting of at least a domain or region of the E-cadherin protein, or at least 30 amino acids of the E-cadherin protein) joined at its amino or carboxy-terminus via a peptide bond to an amino acid sequence of a different protein. In one embodiment, such a chimeric protein is produced by recombinant expression of a nucleic acid encoding the protein (comprising a human E-cadherin-coding sequence joined in-frame to a coding sequence for a different protein). Such a chimeric product can be made by ligating the appropriate nucleic acid sequences encoding the desired amino acid sequences to each other by methods known in the art, in the proper coding frame, and expressing the chimeric product by methods commonly known in the art. Alternatively, such a chimeric product may be made by protein synthetic techniques, e.g., by use of a peptide synthesizer. A specific embodiment relates to a chimeric protein comprising a fragment of an E-cadherin protein which comprises a domain or motif of the E-cadherin protein, e.g., the extracellular domain, transmembrane region, cytoplasmic domain, amino-terminal processed region, homotypic binding domain, conserved cysteine domain, HAV sequence, repeat region, repeat #1, repeat #2, repeat #3, or any combination of the foregoing (see Section 7).

Another specific embodiment relates to a chimeric protein comprising a fragment of a human E-cadherin protein of at least six amino acids. In specific embodiments, the fusion protein comprises or consists of the extracellular portion of human E-cadherin or a functional fragment thereof joined via a peptide bond to a transmembrane domain joined via a peptide bond to the cytoplasmic (signalling) domain or functional fragment thereof of another receptor or adhesion molecule (e.g., DCC (Deleted in Colorectal Cancer), P-cadherin, N-cadherin; N-CAM (neural cell adhesion molecule); receptor tyrosine kinases such as growth factor receptors like epidermal growth factor receptor, fibroblast growth factor receptor, neu, etc.). Particular examples of human E-cadherin fusion proteins, consisting of a human E-cadherin fragment capable of generating anti-E-cadherin antibody fused to the carboxyl-terminus of glutathione-S-transferase, are described in Section 8 hereof. Another specific embodiment relates to a protein comprising portions of the human E-cadherin sequence which appear in different order or are missing amino acid sequence relative to native E-cadherin.

In another specific embodiment, a protein comprising a portion of the E-cadherin amino acid sequence shown in Figure 3 (SEQ ID NO:2) is provided, with the proviso that the protein does not contain amino acids numbers 153-307. Other specific embodiments of derivatives and analogs are described in the subsections below and examples sections infra.

5.6.1. DERIVATIVES OF THE HUMAN E-CADHERIN PROTEIN CONTAINING ONE OR MORE DOMAINS OF THE PROTEIN

In a specific embodiment, the invention relates to human E-cadherin protein derivatives and analogs, in particular human E-cadherin fragments and derivatives of such fragments, that comprise one or more domains of a human E-cadherin protein, including but not limited to the extracellular domain, transmembrane region, cytoplasmic domain, amino-terminal processed region, homotypic binding domain, HAV sequence, conserved cysteine domain, repeat domain, repeat #1, repeat #2, and repeat #3. The amino acid sequences representing the foregoing domains for the protein having the sequence shown in Figure 3 are described in Section 7 infra. In a specific embodiment, a protein comprises one or more of human E-cadherin repeats #1, #2, and #3. In another specific embodiment, a protein comprises the human E-cadherin transmembrane region and cytoplasmic domain. In another specific embodiment, the invention relates to a derivative or analog of human E-cadherin that lacks one or more domains of a human E-cadherin protein.

5.6.2. DERIVATIVES OF THE HUMAN E-CADHERIN PROTEIN THAT MEDIATE BINDING TO PROTEINS

The invention also provides human E-cadherin fragments, and analogs or derivatives of such fragments, which mediate binding to other proteins, and nucleic acid sequences encoding the foregoing. In specific embodiments, such fragments, analogs, and derivatives which bind to alpha, beta or gamma catenin, actin, spectrin/fodrin, and/or ankyrin are envisioned.

5.7. ASSAYS OF HUMAN E-CADHERIN PROTEINS, DERIVATIVES AND ANALOGS

The functional activity of E-cadherin proteins, derivatives and analogs can be assayed by various methods known in the art. For example, in one embodiment, where one is assaying for the ability to bind or compete with a wild-type human E-cadherin protein for binding to anti-E-cadherin protein antibody, various immunoassays known in the art can be used, including but not limited to competitive and non-competitive assay systems using techniques such as radioimmunoassays, ELISA (enzyme linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc. In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labelled. Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention.

The ability to bind to another protein (be it a second E-cadherin protein; alpha, beta, or gamma catenin; actin, spectrin/fodrin, ankyrin, 220 kD undercoat protein (Itoh et al., 1991, J. Cell Biol. 115(5):1449-1462), or otherwise) can be demonstrated by in vitro binding assays, noncompetitive or competitive, by methods known in the art. In another embodiment, the ability of an E-cadherin derivative or analog to bind to an identical molecule ("homotypic binding") or to native E-cadherin can be assayed by methods known in the art.

In another embodiment, physiological correlates of E-cadherin introduction into cells can be assayed. For example, the ability to suppress cell invasion can be assayed by known methods (see, e.g., Vleminckx et al., 1991, Cell 66:107-119).

Other methods will be known to the skilled artisan and are within the scope of the invention.

5.8. THERAPEUTIC AND PROPHYLACTIC USES

The human E-cadherin proteins, derivatives (including fragments) and analogs thereof; antibodies thereto; nucleic acids encoding the human E-cadherin proteins, derivatives, and analogs; and E-cadherin antisense nucleic acids have therapeutic utility in the modulation of functions mediated by human E-cadherin. Such therapeutically useful molecules provided by the invention are termed herein "Therapeutics." The Therapeutics have therapeutic value for various diseases and disorders.

One specific embodiment of the invention relates to Therapeutics which antagonize, or inhibit, an E-cadherin protein function. Such Therapeutics are most preferably identified by use of known convenient in vitro assays, e.g., based on their ability to inhibit binding of E-cadherin to other proteins, or inhibit any known E-cadherin function as assayed in vitro, although in vivo assays may also be employed. In a preferred embodiment, such a Therapeutic is a protein or derivative thereof comprising a functionally active fragment such as a fragment of a human E-cadherin protein which binds to another protein; such a Therapeutic can be used, e.g., in soluble form, to competitively inhibit the function mediated by such fragment. In specific embodiments, a Therapeutic is a protein comprising the homotypic binding domain, or an analog/competitive inhibitor of an E-cadherin signal-transducing function, a nucleic acid capable of expressing one of the foregoing proteins, a human E-cadherin antisense nucleic acid (see infra), or an anti-human E-cadherin antibody which neutralizes a functional activity of E-cadherin.

In another embodiment of the invention, a nucleic acid containing a portion of a dysfunctional E-cadherin gene is used to promote E-cadherin inactivation by homologous recombination (Koller and Smithies, 1989, Proc. Natl. Acad. Sci. USA 86:8932-8935; Zijlstra et al., 1989, Nature 342:435-438).

In another embodiment, Therapeutics can be used to promote E-cadherin function. Such Therapeutics include but are not limited to human E-cadherin proteins and derivatives and analogs of the invention which are functionally active, i.e., they are capable of displaying one or more known functional activities associated with a full-length (wild-type) E-cadherin protein. In a preferred aspect, such functional activity is the ability to suppress cell invasion or metastasis. In a specific embodiment, such a Therapeutic comprises one or more domains of the human E-cadherin protein, preferably the homotypic binding domain.

In a specific embodiment of the invention in which a Therapeutic which promotes E-cadherin function is introduced into or delivered into a cell which does not normally express E-cadherin or which is not an epithelial or human placental cell, the Therapeutic is a chimeric/fusion protein comprising (a) an extracellular domain or functional derivative thereof that is of human E-cadherin, (b) a transmembrane domain, and (c) a cytoplasmic signalling domain or functional derivative thereof of a protein normally expressed by a cell type representative of the cell to which the Therapeutic is delivered. For example, where the cell is a neural or mesenchymal cell, the cytoplasmic domain can be of N-CAM or N-cadherin. In a specific embodiment relating to gene therapy, a recombinant nucleic acid encoding and capable of expressing such a chimeric molecule is introduced into a host cell.

Further descriptions and sources of Therapeutics of the invention are found in Sections 5.8.1 through 5.10.1 herein.

In a specific embodiment, a Therapeutic which antagonizes E-cadherin function is administered to promote cell invasion. In another specific embodiment, a Therapeutic which promotes E-cadherin function is administered to inhibit cell invasion and metastasis. Thus, for example, introduction into cells of a nucleic acid encoding a human E-cadherin or fragment thereof which mediates homotypic binding is therapeutically useful for promotion of adhesion of such cells and prevention of their invasion and metastasis. Where such cells are tumor cells, direct delivery of the nucleic acid to such tumor cells in vivo is envisaged. In a different embodiment of the invention, such cells to be used for introduction of the E-cadherin nucleic acid can be any cells to be administered in vivo for therapeutic effect; introduction into such cells of the E-cadherin-encoding sequences prior to administration of the cell to a patient can prevent the cell from subsequently becoming or behaving like an invasive tumor cell (see also Section 5.9).

Thus, in a preferred aspect of the invention, a Therapeutic which exhibits E-cadherin homotypic binding ability is administered to treat or prevent malignancy, or metastasis of a malignancy. This is described further in Sections 5.8.1 and 5.8.2.

In another embodiment of the invention, a Therapeutic which promotes E-cadherin function (e.g., comprising the extracellular domain or homotypic binding domain, and thus capable of mediating homotypic binding and resultant adhesion of cells attached to or expressing the Therapeutic) is used for treatment of benign dysproliferative disorders. Specific embodiments are directed to treatment of cirrhosis of the liver (a condition in which scarring has overtaken normal liver regeneration processes), treatment of keloid (hypertrophic scar) formation (disfiguring of the skin in which the scarring process interferes with normal renewal), psoriasis (a common skin condition characterized by excessive proliferation of the skin and delay in proper cell fate determination), and baldness (a condition in which terminally differentiated hair follicles fail to function properly).

In another embodiment of the invention, a Therapeutic which promotes E-cadherin function (e.g., comprising the extracellular domain or homotypic binding domain, and thus capable of mediating homotypic binding) is used to promote wound healing, including the treatment of burns, and to promote the re-epithelialization of the skin, mucosal surfaces, or cornea. In a specific embodiment, fibroblasts obtained from a patient are transfected in vitro with a nucleic acid encoding human E-cadherin or a derivative thereof capable of homotypic binding, in order to form a synthetic skin graft that, when applied to the site of a patient's wound, provides a protective autologous barrier to promote wound healing. Incorporation of human E-cadherin, or cells expressing the same, in synthetic organs is also envisioned.

In yet another embodiment, a Therapeutic which promotes E-cadherin function is used to treat or prevent gestational disease, or fetal wastage, for example, spontaneous abortions, and developmental abnormalities of the fetus or neonate. In a preferred aspect, the Therapeutic is administered into the amniotic sac or intrauterinely. E-cadherin is normally expressed in human placenta.

In yet another embodiment, a Therapeutic which promotes E-cadherin function (in particular, its function in establishing an impermeability barrier in epithelial cells) can be used for the treatment or prevention of inflammatory disorders, e.g., Crohn's disease or sclerosing cholangitis. Crohn's disease and sclerosing cholangitis are associated with increased permeability of epithelial cells. It has been reported that E-cadherin is required to establish the impermeability barrier in epithelial cells.

5.8.1. MALIGNANCIES

Malignant and pre-neoplastic conditions which can be treated by administration of a Therapeutic, preferably one which promotes adhesiveness mediated by E-cadherin and thus exhibits E-cadherin homotypic binding ability, include but are not limited to those described below in this and the subsequent subsection.

Such malignancies and related disorders, include but are not limited to those listed in Table 1 (for a review of such disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co. , Philadelphia):

TABLE 1

MALIGNANCIES AND RELATED DISORDERS

Leukemia
    acute leukemia
        acute lymphocytic leukemia
        acute myelocytic leukemia
            myeloblastic
            promyelocytic
            myelomonocytic
            monocytic
            erythroleukemia
    chronic leukemia
        chronic myelocytic (granulocytic) leukemia
        chronic lymphocytic leukemia
Polycythemia vera
Lymphoma
    Hodgkin's disease
    non-Hodgkin's disease
Multiple myeloma
Waldenström's macroglobulinemia
Heavy chain disease
Solid tumors
    sarcomas and carcinomas
        fibrosarcoma
        myxosarcoma
        liposarcoma
        chondrosarcoma
        osteogenic sarcoma
        chordoma
        angiosarcoma
        endotheliosarcoma
        lymphangiosarcoma
        lymphangioendotheliosarcoma
        synovioma
        mesothelioma
        Ewing's tumor
        leiomyosarcoma
        rhabdomyosarcoma
        colon carcinoma
        stomach cancer
        pancreatic cancer
        breast cancer
        ovarian cancer
        prostate cancer
        squamous cell carcinoma
        basal cell carcinoma
        adenocarcinoma
        sweat gland carcinoma
        sebaceous gland carcinoma
        papillary carcinoma
        papillary adenocarcinomas
        cystadenocarcinoma
        medullary carcinoma
        bronchogenic carcinoma
        renal cell carcinoma
        hepatoma
        bile duct carcinoma
        seminoma
        Wilms' tumor
        cervical cancer
        testicular tumor
        lung carcinoma
        small cell lung carcinoma
        bladder carcinoma
        epithelial carcinoma
        glioma
        astrocytoma
        medulloblastoma
        craniopharyngioma
        ependymoma
        pinealoma
        hemangioblastoma
        acoustic neuroma
        oligodendroglioma
        meningioma
        melanoma
        neuroblastoma
        retinoblastoma
        germ cell neoplasm (teratocarcinoma, embryonal carcinoma, choriocarcinoma)
        other gestational proliferative disease (e.g., molar pregnancy)

In specific embodiments, malignancy or dysproliferative changes (such as metaplasias and dysplasias) are treated or prevented in epithelial tissues such as those in the cervix, esophagus, lung, breast, bladder, kidney, and colon.

5.8.2. PREVENTION OF MALIGNANCIES

The Therapeutics of the invention which exhibit adhesiveness to cells, or homotypic binding, can be administered to prevent progression to a neoplastic or malignant state, including but not limited to those disorders listed in Table 1. Such prophylactic use is indicated in conditions known or suspected of preceding progression to neoplasia or cancer, in particular, where non-neoplastic cell growth consisting of hyperplasia, metaplasia, or most particularly, dysplasia has occurred (for review of such abnormal growth conditions, see Robbins and Angell, 1976, Basic Pathology, 2d Ed., W.B. Saunders Co., Philadelphia, pp. 68-79). Hyperplasia is a form of controlled cell proliferation involving an increase in cell number in a tissue or organ, without significant alteration in structure or function. As but one example, endometrial hyperplasia often precedes endometrial cancer. Metaplasia is a form of controlled cell growth in which one type of adult or fully differentiated cell substitutes for another type of adult cell. Metaplasia can occur in epithelial or connective tissue cells. Atypical metaplasia involves a somewhat disorderly metaplastic epithelium. Dysplasia is frequently a forerunner of cancer, and is found mainly in the epithelia; it is the most disorderly form of non-neoplastic cell growth, involving a loss in individual cell uniformity and in the architectural orientation of cells. Dysplastic cells often have abnormally large, deeply stained nuclei, and exhibit pleomorphism. Dysplasia characteristically occurs where there exists chronic irritation or inflammation, and is often found in the cervix, respiratory passages, oral cavity, and gall bladder.

Alternatively or in addition to the presence of abnormal cell growth characterized as hyperplasia, metaplasia, or dysplasia, the presence of one or more characteristics of a transformed phenotype, or of a malignant phenotype, displayed in vivo or displayed in vitro by a cell sample from a patient, can indicate the desirability of prophylactic/therapeutic administration of a Therapeutic of the invention. Such characteristics of a transformed phenotype include morphology changes, looser substratum attachment, loss of contact inhibition, loss of anchorage dependence, protease release, increased sugar transport, decreased serum requirement, expression of fetal antigens, disappearance of the 250,000 dalton cell surface protein, etc. (see also id., at pp. 84-90 for characteristics associated with a transformed or malignant phenotype).

In a specific embodiment, leukoplakia, a benign-appearing hyperplastic or dysplastic lesion of the epithelium, or Bowen's disease, a carcinoma in situ, are pre-neoplastic lesions indicative of the desirability of prophylactic intervention.

In another embodiment, fibrocystic disease (cystic hyperplasia, mammary dysplasia, particularly adenosis (benign epithelial hyperplasia), or atypical papillomatosis) is indicative of the desirability of prophylactic intervention.

In another embodiment, a patient with a strong family history of breast cancer is treated with a Therapeutic, for prevention of breast cancer.

In other embodiments, a patient which exhibits one or more of the following predisposing factors for malignancy is treated by administration of an effective amount of a Therapeutic: a chromosomal translocation associated with a malignancy (e.g., the Philadelphia chromosome for chronic myelogenous leukemia, t(14;18) for follicular lymphoma, etc.), familial polyposis or Gardner's syndrome (possible forerunners of colon cancer), benign monoclonal gammopathy (a possible forerunner of multiple myeloma), and a first degree kinship with persons having a cancer or precancerous disease showing a Mendelian (genetic) inheritance pattern (e.g., familial polyposis of the colon, Gardner's syndrome, hereditary exostosis, polyendocrine adenomatosis, medullary thyroid carcinoma with amyloid production and pheochromocytoma, Peutz-Jeghers syndrome, neurofibromatosis of Von Recklinghausen, retinoblastoma, carotid body tumor, cutaneous melanocarcinoma, intraocular melanocarcinoma, xeroderma pigmentosum, ataxia telangiectasia, Chediak-Higashi syndrome, albinism, Fanconi's aplastic anemia, and Bloom's syndrome; see Robbins and Angell, 1976, Basic Pathology, 2d Ed., W.B. Saunders Co., Philadelphia, pp. 112-113); or one of the foregoing precancerous conditions, or neurofibromatosis (e.g., tuberous sclerosis, von Hippel-Lindau disease, multiple exostoses), genodermatosis (e.g., polydysplastic epidermolysis bullosa), immune deficiency syndrome (e.g., Wiskott-Aldrich syndrome, X-linked agammaglobulinemia), chromosome breakage or polyploidy (e.g., Down's syndrome) (see The Merck Manual of Diagnosis and Therapy, 1987, 15th Ed., Berkow et al. (eds.), Merck Sharp & Dohme Research Laboratories, NJ, p. 1207), etc.

5.8.3. NERVOUS SYSTEM DISORDERS

In another embodiment of the invention, Therapeutics which antagonize E-cadherin function, and thus promote cell invasiveness, can be used therapeutically, e.g., to promote tissue repair and regeneration. A particular embodiment is directed to the promotion of nerve regeneration. Such Therapeutics which antagonize E-cadherin function include but are not limited to human E-cadherin antisense nucleic acids, anti-human E-cadherin monoclonal antibodies (e.g., directed against the homotypic binding region, repeat region), and human E-cadherin peptide fragments or analogs thereof (e.g., of the E-cadherin extracellular domain, which when administered preferably in soluble form will competitively inhibit homotypic binding or adhesiveness of E-cadherin). Nervous system disorders which are thus envisioned for treatment include but are not limited to nervous system injuries, and diseases or disorders which result in either a disconnection of axons, a diminution or degeneration of neurons, or demyelination. Nervous system lesions which may be treated in a patient (including human and non-human mammalian patients) according to the invention include but are not limited to the following lesions of either the central (including spinal cord, brain) or peripheral nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with surgery, for example, lesions which sever a portion of the nervous system, or compression injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord infarction or ischemia;

(iii) malignant lesions, in which a portion of the nervous system is destroyed or injured by malignant tissue which is either a nervous system associated malignancy or a malignancy derived from non-nervous system tissue;

(iv) infectious lesions, in which a portion of the nervous system is destroyed or injured as a result of infection, for example, by an abscess or associated with infection by human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, tuberculosis, syphilis;

(v) degenerative lesions, in which a portion of the nervous system is destroyed or injured as a result of a degenerative process including but not limited to degeneration associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral sclerosis;

(vi) lesions associated with nutritional diseases or disorders, in which a portion of the nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism including but not limited to vitamin B12 deficiency, folic acid deficiency, Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus callosum), and alcoholic cerebellar degeneration;

(vii) neurological lesions associated with systemic diseases including but not limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or sarcoidosis;

(viii) lesions caused by toxic substances including alcohol, lead, or particular neurotoxins; and

(ix) demyelinated lesions in which a portion of the nervous system is destroyed or injured by a demyelinating disease including but not limited to multiple sclerosis, human immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous system disorder may be selected by testing for biological activity in promoting neurite extension or survival or differentiation of neurons. For example, and not by way of limitation, Therapeutics which elicit any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred, non-limiting embodiments, increased survival of neurons may be measured by the method set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., depending on the molecule to be measured; and motor neuron dysfunction may be measured by assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the invention include but are not limited to disorders such as infarction, infection, exposure to toxin, trauma, surgical damage, or degenerative disease that may affect motor neurons as well as other components of the nervous system, as well as disorders that selectively affect neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy (Charcot-Marie-Tooth Disease).

Other therapeutic and prophylactic methods provided by the invention are described in Sections 5.9 through 5.10.1 infra.

5.9. GENE THERAPY

Nucleic acids encoding human E-cadherin or functional derivatives thereof, and expression vectors comprising the same can be introduced into cells such that the E-cadherin nucleic acid sequences are stably incorporated in the cell and capable of expression by the cell and/or its progeny cells. Such introduction can occur in vivo, by or following in vivo administration of the E-cadherin encoding nucleic acid; or in vitro, after which the recombinant cell can be introduced into an animal, most preferably a human, for purposes of gene therapy (i.e., therapeutic benefit via expression of the protein encoded by the introduced, heterologous gene sequence) (for reviews relating to gene therapy, see, e.g., Karson et al., 1992, J. Reprod. Med. 37(6):508-514; Thompson, 1992, Science 258:744-746; Cline, 1985, Pharmac. Ther. 29:69-92). In a preferred embodiment, a nucleic acid encoding the complete human E-cadherin protein or a functional derivative thereof capable of homotypic binding, is introduced into a cell of an animal, either in vitro followed by introduction of the transformed cell or in vivo (e.g., by direct injection into the animal) to prevent progression to malignancy, to prevent cell invasion, or to prevent metastasis. In a specific embodiment, the nucleic acid is directly injected into or otherwise delivered to a tumor cell, to prevent metastasis. Such tumor cells include but are not limited to the solid tumors listed in Table 1, supra. In an alternative embodiment, recombinant cells engineered to secrete an E-cadherin protein or derivative can be used to provide soluble E-cadherin to competitively inhibit E-cadherin adhesive function on cells, thus promoting cell invasion.

Cells into which an E-cadherin-encoding nucleic acid can be introduced for purposes of gene therapy encompass any desired, available cell type, and include but are not limited to epithelial cells, endothelial cells, keratinocytes, fibroblasts, muscle cells, hepatocytes; blood cells such as T lymphocytes, B lymphocytes, monocytes, macrophages, neutrophils, eosinophils, megakaryocytes, granulocytes; various stem or progenitor cells, in particular hematopoietic stem or progenitor cells, e.g., as obtained from bone marrow, umbilical cord blood, peripheral blood, fetal liver, etc. In a specific embodiment in which a tumor is treated by gene therapy, the E-cadherin-encoding nucleic acid, if not directly introduced into the tumor cell in vivo, is preferably introduced into a cell of the same cell type as the tumor cell.

In a preferred embodiment, the cell used for gene therapy is autologous to the patient.

In an embodiment in which cells are obtained, a nucleic acid encoding E-cadherin or a derivative thereof is introduced into the cells such that it is expressible by the cells or their progeny, and the recombinant cells are then administered in vivo for therapeutic effect, stem cells are preferred for use. Any stem cells which can be isolated and maintained in vitro can potentially be used in accordance with this embodiment of the present invention. Such stem cells include but are not limited to hematopoietic stem cells (HSC), stem cells of epithelial tissues such as the skin and the lining of the gut, and embryonic heart muscle cells. Epithelial stem cells are preferred for use.

Epithelial stem cells (ESCs) or keratinocytes can be obtained from tissues such as the skin and the lining of the gut by known procedures (Rheinwald, 1980, Meth. Cell Bio. 21A:229). In stratified epithelial tissue such as the skin, renewal occurs by mitosis of stem cells within the germinal layer, the layer closest to the basal lamina. Stem cells within the lining of the gut provide for a rapid renewal rate of this tissue. ESCs or keratinocytes obtained from the skin or lining of the gut of a patient or donor can be grown in tissue culture (Rheinwald, 1980, Meth. Cell Bio. 21A:229; Pittelkow and Scott, 1986, Mayo Clinic Proc. 61:771). If the ESCs are provided by a donor, a method for suppression of host versus graft reactivity (e.g., irradiation, drug or antibody administration to promote moderate immunosuppression) can also be used.

With respect to hematopoietic stem cells (HSC), any technique which provides for the isolation, propagation, and maintenance in vitro of HSC can be used in this embodiment of the invention. Techniques by which this may be accomplished include (a) the isolation and establishment of HSC cultures from bone marrow cells isolated from the future host, or a donor, or (b) the use of previously established long-term HSC cultures, which may be allogeneic or xenogeneic. Non-autologous HSC are used preferably in conjunction with a method of suppressing transplantation immune reactions of the future host/patient. In a particular embodiment of the present invention, human bone marrow cells can be obtained from the posterior iliac crest by needle aspiration (see, e.g., Kodo et al., 1984, J. Clin. Invest. 73:1377-1384). In a preferred embodiment of the present invention, the HSCs can be made highly enriched or in substantially pure form. This enrichment can be accomplished before, during, or after long-term culturing, and can be done by any techniques known in the art. Long-term cultures of bone marrow cells can be established and maintained by using, for example, modified Dexter cell culture techniques (Dexter et al., 1977, J. Cell Physiol. 91:335) or Witlock-Witte culture techniques (Witlock and Witte, 1982, Proc. Natl. Acad. Sci. USA 79:3608-3612).

In one embodiment, the nucleic acid encoding E-cadherin or a derivative thereof is introduced into a cell prior to administration in vivo of the resulting recombinant cell. Such introduction can be carried out by any method known in the art, including but not limited to transfection, electroporation, microinjection, infection with a viral or bacteriophage vector containing the E-cadherin sequences, cell fusion, chromosome-mediated gene transfer, microcell-mediated gene transfer, spheroplast fusion, etc. Numerous techniques are known in the art for the introduction of foreign genes into cells (see, e.g., Cline, 1985, Pharmac. Ther. 29:69-92) and may be used in accordance with the present invention, provided that the necessary developmental and physiological functions of the recipient cells are not disrupted. The technique should provide for the stable transfer of the heterologous gene sequence to the cell, so that the heterologous gene sequence is expressible by the cell and preferably heritable and expressible by its cell progeny.

The resulting recombinant cells can be introduced by various methods known in the art (see Section 5.12 infra). In a preferred embodiment, epithelial cells are injected, e.g., subcutaneously. In another embodiment, recombinant skin cells may be applied as a skin graft onto the patient. Recombinant blood cells (e.g., hematopoietic stem or progenitor cells) are preferably administered intravenously. The amount of cells envisioned for use depends on the desired effect, patient state, etc., and can be determined by one skilled in the art.

In another specific embodiment, the nucleic acid encoding E-cadherin or a derivative thereof is directly administered in vivo for therapeutic effect, whereby it is expressed to produce E-cadherin or a derivative thereof. This can be accomplished by any of numerous methods known in the art, e.g., by constructing it as part of an appropriate nucleic acid expression vector and administering it so that it becomes intracellular, e.g., by infection using a defective or attenuated retroviral or other viral vector (see U.S. Patent No. 4,980,286), or by direct injection, or by use of microparticle bombardment (e.g., a gene gun; Biolistic, Dupont), or coating with lipids or cell-surface receptors or transfecting agents, or by administering it in linkage to a peptide which is known to enter the nucleus, etc. Alternatively, the nucleic acid Therapeutic can be introduced intracellularly and incorporated within host cell DNA for expression, by homologous recombination.

5.10. ANTISENSE REGULATION OF HUMAN E-CADHERIN EXPRESSION

The present invention also provides the therapeutic or prophylactic use of nucleic acids of at least six nucleotides that are antisense to a gene or cDNA encoding a human E-cadherin protein or a portion thereof. "Antisense" as used herein refers to a nucleic acid capable of hybridizing to a portion of a human E-cadherin RNA (preferably mRNA) by virtue of some sequence complementarity. Such antisense nucleic acids have utility as antagonists of E-cadherin function, and can be used where cell invasion is desired (e.g., to promote nerve or other tissue regeneration).

The antisense nucleic acids of the invention can be oligonucleotides that are double-stranded or single-stranded, RNA or DNA or a modification or derivative thereof, which can be directly administered to a cell, or which can be produced intracellularly by transcription of exogenous, introduced sequences. The invention further provides pharmaceutical compositions comprising an effective amount of the E-cadherin antisense nucleic acids of the invention in a pharmaceutically acceptable carrier, as described in Section 5.12. In another embodiment, the invention is directed to methods for inhibiting the expression of a human E-cadherin nucleic acid sequence in a prokaryotic or eukaryotic cell, comprising providing the cell with an effective amount of a composition comprising an antisense E-cadherin nucleic acid of the invention.

Human E-cadherin antisense nucleic acids and their uses are described in detail below.

5.10.1. HUMAN E-CADHERIN ANTISENSE NUCLEIC ACIDS

The human E-cadherin antisense nucleic acids are of at least six nucleotides and are preferably oligonucleotides (ranging from 6 to about 50 oligonucleotides). In specific aspects, the oligonucleotide is at least 10 nucleotides, at least 15 nucleotides, at least 30 nucleotides, at least 50 nucleotides, at least 100 nucleotides, or at least 200 nucleotides. In a specific embodiment, the oligonucleotide has a sequence that is not identical or 100% complementary to any same-size portion of the mouse or chicken E-cadherin nucleotide coding sequences or flanking sequences. In another specific embodiment, the antisense oligonucleotide is not 100% complementary to a same size nucleic acid having a sequence depicted in Figure 3 from nucleotide numbers 617-1036 or a portion thereof. In another embodiment, the oligonucleotide is complementary to the nucleotide sequence encoding a domain or portion thereof of the human E-cadherin protein. The oligonucleotides can be DNA or RNA or chimeric mixtures or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone. The oligonucleotide may include other appending groups such as peptides, or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO 88/09810, published December 15, 1988) or blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134, published April 25, 1988), hybridization-triggered cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 5:539-549). In a specific embodiment, the antisense nucleic acid is antisense to a sequence encoding one or more domains of the E-cadherin protein.

In a preferred aspect of the invention, a human E-cadherin antisense oligonucleotide is provided, preferably of single-stranded DNA. In a most preferred aspect, such an oligonucleotide comprises a sequence antisense to the sequence encoding the extracellular domain of a human E-cadherin protein, or the repeat domain thereof. The oligonucleotide may be modified at any position on its structure with substituents generally known in the art.

The E-cadherin antisense oligonucleotide may comprise at least one modified base moiety which is selected from the group including but not limited to 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.

In another embodiment, the oligonucleotide comprises at least one modified sugar moiety selected from the group including but not limited to arabinose, 2-fluoroarabinose, xylulose, and hexose.

In yet another embodiment, the oligonucleotide comprises at least one modified phosphate backbone selected from the group consisting of a phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphordiamidate, a methylphosphonate, an alkyl phosphotriester, and a formacetal or analog thereof.

In yet another embodiment, the oligonucleotide is an α-anomeric oligonucleotide. An α-anomeric oligonucleotide forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gautier et al., 1987, Nucl. Acids Res. 15:6625-6641).

The oligonucleotide may be conjugated to another molecule, e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent, etc.

Oligonucleotides of the invention may be synthesized by standard methods known in the art, e.g., by use of an automated DNA synthesizer (such as are commercially available from Biosearch, Applied Biosystems, etc.). As examples, phosphorothioate oligos may be synthesized by the method of Stein et al. (1988, Nucl. Acids Res. 16:3209), methylphosphonate oligos can be prepared by use of controlled pore glass polymer supports (Sarin et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:7448-7451), etc.

In a specific embodiment, the E-cadherin antisense oligonucleotide comprises catalytic RNA, or a ribozyme (see, e.g., PCT International Publication WO 90/11364, published October 4, 1990; Sarver et al., 1990, Science 247:1222-1225). In another embodiment, the oligonucleotide is a 2'-O-methylribonucleotide (Inoue et al., 1987, Nucl. Acids Res. 15:6131-6148), or a chimeric RNA-DNA analogue (Inoue et al., 1987, FEBS Lett. 215:327-330).

In an alternative embodiment, the E-cadherin antisense nucleic acid of the invention is produced intracellularly by transcription from an exogenous sequence. For example, a vector can be introduced in vivo such that it is taken up by a cell, within which cell the vector or a portion thereof is transcribed, producing an antisense nucleic acid (RNA) of the invention. Such a vector would contain a sequence encoding the E-cadherin antisense nucleic acid. Such a vector can remain episomal or become chromosomally integrated, as long as it can be transcribed to produce the desired antisense RNA. Such vectors can be constructed by recombinant DNA technology methods standard in the art. Vectors can be plasmid, viral, or others known in the art, used for replication and expression in mammalian cells. Expression of the sequence encoding the E-cadherin antisense RNA can be by any promoter known in the art to act in mammalian, preferably human, cells. Such promoters can be inducible or constitutive. Such promoters include but are not limited to: the SV40 early promoter region (Bernoist and Chambon, 1981, Nature 290:304-310), the promoter contained in the 3' long terminal repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell 22:787-797), the herpes thymidine kinase promoter (Wagner et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the metallothionein gene (Brinster et al., 1982, Nature 296:39-42), etc.

The antisense nucleic acids of the invention comprise a sequence complementary to at least a portion of an RNA transcript of a human E-cadherin gene. However, absolute complementarity, although preferred, is not required. A sequence "complementary to at least a portion of an RNA," as referred to herein, means a sequence having sufficient complementarity to be able to hybridize with the RNA, forming a stable duplex; in the case of double-stranded E-cadherin antisense nucleic acids, a single strand of the duplex DNA may thus be tested, or triplex formation may be assayed. The ability to hybridize will depend on both the degree of complementarity and the length of the antisense nucleic acid. Generally, the longer the hybridizing nucleic acid, the more base mismatches with an E-cadherin RNA it may contain and still form a stable duplex (or triplex, as the case may be). One skilled in the art can ascertain a tolerable degree of mismatch by use of standard procedures to determine the melting point of the hybridized complex. In another embodiment, 100% complementary sequences are envisioned.
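By way of illustration only, the relationship between complementarity, mismatches, and duplex stability can be sketched in a few lines of Python. The sketch builds the perfectly complementary single-stranded DNA oligonucleotide for a target RNA, counts mismatches in a candidate oligonucleotide, and gives a rough melting temperature estimate. The function names, the toy 12-nucleotide target, and the use of the Wallace rule (2 °C per A/T pair, 4 °C per G/C pair, a rule of thumb for short oligonucleotides) are assumptions made for this example; none of them is taken from the specification, and the target is not E-cadherin sequence.

```python
# Watson-Crick partners used to build a DNA antisense strand against an RNA
# (or cDNA) target. All sequences below are toy examples, not E-cadherin.
DNA_PARTNER = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_dna(target):
    """Return the fully complementary DNA oligo, written 5'->3'."""
    # Duplex strands pair antiparallel, so read the target backwards.
    return "".join(DNA_PARTNER[base] for base in reversed(target.upper()))


def count_mismatches(oligo, target):
    """Count positions where oligo departs from perfect complementarity."""
    if len(oligo) != len(target):
        raise ValueError("oligo and target must be the same length")
    perfect = antisense_dna(target)
    return sum(1 for got, want in zip(oligo.upper(), perfect) if got != want)


def wallace_tm_celsius(oligo):
    """Rough melting temperature for a short oligo (Wallace rule, assumed)."""
    seq = oligo.upper()
    at = sum(seq.count(base) for base in "ATU")
    gc = sum(seq.count(base) for base in "GC")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    target = "AUGGGCCCUUGG"  # hypothetical RNA fragment
    oligo = antisense_dna(target)
    print(oligo, count_mismatches(oligo, target), wallace_tm_celsius(oligo))
```

An estimate of this kind is no substitute for the experimental melting-point determination referred to above; it only shows why a longer oligonucleotide can carry more mismatches and still hybridize.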

The amount of E-cadherin antisense nucleic acid which will be effective in the treatment of a particular disorder or condition will depend on the nature of the disorder or condition, and can be determined by standard clinical techniques.

E-cadherin antisense nucleic acids can be administered by methods as described supra in Section 5.9.

5.11. DEMONSTRATION OF THERAPEUTIC OR PROPHYLACTIC UTILITY

The Therapeutics of the invention can be tested in vivo for the desired therapeutic or prophylactic activity. For example, such compounds can be tested in suitable animal model systems prior to testing in humans, including but not limited to rats, mice, chicken, cows, monkeys, rabbits, etc. For in vivo testing, prior to administration to humans, any animal model system known in the art may be used.

5.12. THERAPEUTIC/PROPHYLACTIC ADMINISTRATION AND COMPOSITIONS

The invention provides methods of treatment (and prophylaxis) by administration to a subject of an effective amount of a Therapeutic of the invention. In a preferred aspect, the Therapeutic is substantially purified. The subject is preferably an animal, including but not limited to animals such as cows, pigs, chickens, etc., and is preferably a mammal, and most preferably human.

Various delivery systems are known and can be used to administer a Therapeutic of the invention, e.g., encapsulation in liposomes, microparticles, microcapsules, expression by recombinant cells, receptor-mediated endocytosis (see, e.g., Wu and Wu, 1987, J. Biol. Chem. 262:4429-4432), construction of a Therapeutic nucleic acid as part of a retroviral or other vector, etc. Methods of introduction include but are not limited to intradermal, intramuscular, intraperitoneal, intravenous, subcutaneous, intranasal, epidural, and oral routes. The compounds may be administered by any convenient route, for example by infusion or bolus injection, by absorption through epithelial or mucocutaneous linings (e.g., oral mucosa, rectal and intestinal mucosa, etc.) and may be administered together with other biologically active agents. Administration can be systemic or local. In addition, it may be desirable to introduce the pharmaceutical compositions of the invention into the central nervous system by any suitable route, including intraventricular and intrathecal injection; intraventricular injection may be facilitated by an intraventricular catheter, for example, attached to a reservoir, such as an Ommaya reservoir. In a specific embodiment, it may be desirable to utilize liposomes targeted via antibodies to specific identifiable tumor antigens (Leonetti et al., 1990, Proc. Natl. Acad. Sci. U.S.A. 87:2448-2451; Renneisen et al., 1990, J. Biol. Chem. 265:16337-16342).

In a specific embodiment where the Therapeutic is a nucleic acid encoding a protein Therapeutic, the nucleic acid can be administered in vivo to promote expression of its encoded protein, by methods as described supra in Section 5.9.

In a specific embodiment, it may be desirable to administer the Therapeutics of the invention locally to the area in need of treatment; this may be achieved by, for example, and not by way of limitation, local infusion during surgery, topical application, e.g., in conjunction with a wound dressing after surgery, by injection, by means of a catheter, by means of a suppository, or by means of an implant, said implant being of a porous, non-porous, or gelatinous material, including membranes, such as sialastic membranes, or fibers. In one embodiment, administration can be by direct injection at the site (or former site) of a malignant tumor or neoplastic or pre-neoplastic tissue.

The present invention also provides pharmaceutical compositions. Such compositions comprise a therapeutically effective amount of a Therapeutic, and a pharmaceutically acceptable carrier or excipient. Such a carrier includes but is not limited to saline, buffered saline, dextrose, water, glycerol, ethanol, and combinations thereof. The carrier and composition can be sterile. The formulation should suit the mode of administration.

The composition, if desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents. The composition can be a liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation, or powder. The composition can be formulated as a suppository, with traditional binders and carriers such as triglycerides. Oral formulation can include standard carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, etc. In a preferred embodiment, the composition is formulated in accordance with routine procedures as a pharmaceutical composition adapted for intravenous administration to human beings. Typically, compositions for intravenous administration are solutions in sterile isotonic aqueous buffer. Where necessary, the composition may also include a solubilizing agent and a local anesthetic such as lignocaine to ease pain at the site of the injection. Generally, the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water free concentrate in a hermetically sealed container such as an ampoule or sachette indicating the quantity of active agent. Where the composition is to be administered by infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical grade water or saline. Where the composition is administered by injection, an ampoule of sterile water for injection or saline can be provided so that the ingredients may be mixed prior to administration.

The Therapeutics of the invention can be formulated as neutral or salt forms. Pharmaceutically acceptable salts include those formed with free amino groups such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc., and those formed with free carboxyl groups such as those derived from sodium, potassium, ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine, 2-ethylamino ethanol, histidine, procaine, etc.

The amount of the Therapeutic of the invention which will be effective in the treatment of a particular disorder or condition will depend on the nature of the disorder or condition, and can be determined by standard clinical techniques. In addition, in vitro assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed in the formulation will also depend on the route of administration, and the seriousness of the disease or disorder, and should be decided according to the judgment of the practitioner and each patient's circumstances. However, suitable dosage ranges for intravenous administration of a protein Therapeutic are generally about 20-500 micrograms of active compound per kilogram body weight. Suitable dosage ranges for intranasal administration are generally about 0.01 pg/kg body weight to 1 mg/kg body weight. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems.
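As a purely arithmetical illustration of the intravenous range stated above (about 20-500 micrograms of protein Therapeutic per kilogram body weight), the following Python sketch converts that per-kilogram range into a total range for a given body weight. The function and constant names and the 70 kg example weight are assumptions made for this example; the sketch does not select a dose, which remains a matter for the practitioner as stated above.

```python
# Approximate intravenous range for a protein Therapeutic as stated in the
# text: about 20-500 micrograms per kilogram body weight.
IV_PROTEIN_RANGE_UG_PER_KG = (20.0, 500.0)


def iv_dose_range_ug(body_weight_kg):
    """Return (low, high) total micrograms for the stated per-kg range."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = IV_PROTEIN_RANGE_UG_PER_KG
    return low * body_weight_kg, high * body_weight_kg


if __name__ == "__main__":
    # Hypothetical 70 kg subject, used only to show the unit conversion.
    low_ug, high_ug = iv_dose_range_ug(70)
    print(f"{low_ug / 1000:.1f} mg to {high_ug / 1000:.1f} mg")
```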

Suppositories generally contain active ingredient in the range of 0.5% to 10% by weight; oral formulations preferably contain 10% to 95% active ingredient. The invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Optionally associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration.

5.13. DIAGNOSTIC UTILITY

Detection and/or measurement of human E-cadherin expression has diagnostic and prognostic utility. The loss of expression or improper localization of expressed E-cadherin correlates with severity and degree of differentiation in some cancers (see Shimoyama and Hirohashi, 1991, Cancer Res. 51:2185-2192). Thus, decreased expression or change in localization of E-cadherin in human tumor cells relative to the level of expression or localization, respectively, in non-malignant cells (preferably of the same cell type), indicates a poor prognosis and the presence of an invasive malignancy. Monitoring of the course of disease and of treatment efficacy can also be performed; such a decreased expression or change in localization relative to the level of expression or localization, respectively, in comparable cells taken from the patient at an earlier time (e.g., a prior tissue biopsy sample taken, for example, prior to treatment) indicates the progression of malignancy or a poor response to treatment. In a specific embodiment, an anti-human E-cadherin antibody is used diagnostically in conventional immunoperoxidase staining of a surgical specimen to predict metastatic potential of an epithelial cell cancer such as breast, prostate, ovarian, gastric, or squamous cell cancer. Disorders of cell fate, in particular precancerous conditions such as metaplasia and dysplasia, and hyperproliferative (e.g., cancer) or hypoproliferative disorders, involving aberrant or undesirable levels of expression or activity of an E-cadherin protein can be diagnosed by detecting such levels. Thus, human E-cadherin proteins, analogues, derivatives, and subsequences thereof, E-cadherin nucleic acids (and sequences complementary thereto), anti-human-E-cadherin protein antibodies, and other proteins and derivatives and analogs thereof which interact with human E-cadherin proteins, and inhibitors of such E-cadherin-protein interactions, have uses in diagnostics. Such molecules can be used in assays, such as immunoassays, to detect, prognose, diagnose, or monitor various conditions, diseases, and disorders affecting E-cadherin expression, or monitor the treatment thereof.

In particular, such an immunoassay is carried out by a method comprising contacting a sample derived from a patient with an anti-human E-cadherin protein antibody under conditions such that immunospecific binding can occur, and detecting or measuring the amount of any immunospecific binding by the antibody. In a specific embodiment, antibody to E-cadherin can be used to assay in a patient tissue or serum sample for the presence of E-cadherin where an aberrant level of E-cadherin is an indication of a diseased condition.

The immunoassays which can be used include but are not limited to competitive and non-competitive assay systems using techniques such as western blots, radioimmunoassays, ELISA (enzyme linked immunosorbent assay), "sandwich" immunoassays, immunoprecipitation assays, precipitin reactions, gel diffusion precipitin reactions, immunodiffusion assays, agglutination assays, complement-fixation assays, immunoradiometric assays, fluorescent immunoassays, protein A immunoassays, to name but a few.

Human E-cadherin genes and related nucleic acid sequences and subsequences, including complementary sequences, can also be used in hybridization assays. Human E-cadherin nucleic acid sequences, or subsequences thereof comprising about at least 8 nucleotides, can be used as hybridization probes. Hybridization assays can be used to detect, prognose, diagnose, or monitor conditions, disorders, or disease states associated with changes in human E-cadherin expression and/or activity as described supra. In particular, such a hybridization assay is carried out by a method comprising contacting a sample containing nucleic acid with a nucleic acid probe capable of hybridizing to human E-cadherin DNA or RNA, under conditions such that hybridization can occur, and detecting or measuring any resulting hybridization.

In another embodiment, PCR primers based on the human E-cadherin nucleotide sequence are used in diagnostic PCR. Thus, a pair of purified oligonucleotide primers is provided: a first oligonucleotide having a sequence which is the same as a first portion of the nucleotide sequence depicted in Figure 3 (SEQ ID NO:1); and a second oligonucleotide having a sequence which is complementary to a second portion of the nucleotide sequence shown in Figure 3 (SEQ ID NO:1) and thus able to prime DNA synthesis off the opposite DNA strand from the first oligonucleotide; in which the second portion is situated 3' to the first portion in the sequence shown in Figure 3. These oligonucleotides are then used as primers in PCR to amplify DNA fragments spanning from the first portion to the second portion in the E-cadherin gene, thus allowing the detection of full-length E-cadherin nucleic acids or portions thereof (depending on the identity of the primers and thus their position within the E-cadherin gene) in a sample from a patient. The characteristics (e.g., size) of the amplified fragment, or the lack of any amplification, where differing from that achieved with a wild-type human E-cadherin gene, can indicate the presence of an abnormality associated with a change in human E-cadherin gene sequence. In a specific embodiment directed to the PCR amplification of the complete coding sequence, primers from the noncoding regions flanking the coding sequence are employed. In specific embodiments, the oligonucleotide primers are at least 15, and are preferably 18 or 24, nucleotides in length. In particular, the primers consist of nucleotide sequences that are not identical or complementary to any same-size fragment of the chicken or mouse E-cadherin cDNAs or the coding regions or flanking regions thereof.
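By way of illustration only, the primer-pair geometry described above can be sketched in Python. The function names `reverse_complement` and `design_primer_pair`, their parameters, and the use of the first 60 nucleotides of SEQ ID NO:1 as a toy template are assumptions made for this sketch and do not appear in the specification. Melting temperature, secondary structure, and the requirement that primers differ from the chicken and mouse sequences are not modeled.

```python
# Illustrative sketch only; helper names and the toy template are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primer_pair(template: str, first_start: int, second_end: int,
                       length: int = 18):
    """Return (forward primer, reverse primer, expected amplicon size).

    Coordinates are 1-based and inclusive, as in the sequence listing.
    The forward primer is identical to the first portion of the template.
    The reverse primer is complementary to the second portion, which lies
    3' of the first portion on the depicted strand.
    """
    if length < 15:
        raise ValueError("the text calls for primers of at least 15 nucleotides")
    if first_start < 1 or second_end > len(template):
        raise ValueError("coordinates fall outside the template")
    if second_end - first_start + 1 < 2 * length:
        raise ValueError("the two portions would overlap")
    forward = template[first_start - 1:first_start - 1 + length]
    second_portion = template[second_end - length:second_end]
    reverse = reverse_complement(second_portion)
    amplicon_size = second_end - first_start + 1
    return forward, reverse, amplicon_size


# Toy template: nucleotides 1-60 of SEQ ID NO:1 (5' noncoding region).
template = ("GAATTCCGGAAAGCACCTGTGAGCTTGGCA"
            "AGTCAGTTCAGAGCTCCAGCCCGCTCCAGC")
forward, reverse, size = design_primer_pair(template, 1, 60, length=18)
print(forward, reverse, size)
```

An amplified fragment whose size departs from `size`, or the absence of any product, would correspond to the abnormal results discussed in the text.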

6. ISOLATION OF HUMAN E-CADHERIN cDNA CLONES FROM LIVER AND COLON

We have isolated and characterized the first full-length cDNA coding for human liver and colonic E-cadherin. The sequence is unique, although fairly homologous to similar isolates from mouse and chicken. We have isolated 10 cDNA clones from normal human liver and hepatocellular carcinoma libraries and over 20 cDNAs from a colonic epithelial cell cDNA library.

Our clones were obtained by hybridization screening of cDNA libraries received as gifts. We screened a normal colon cDNA library, a hepatocellular carcinoma library, and a normal liver library. Our initial probes were made by polymerase chain reaction (PCR) from published human sequence (Mansouri et al., 1988, Differentiation 38:67-71), and from a fragment of human sequence (as published by Mansouri et al., supra) received as a gift from Dr. R. Kemler (see Fig. 2). These resulted in a single clone representing about 80% of the coding sequence. Further screening of the liver libraries and all screening of the colon library was done with restriction fragments from this or subsequent clones, until overlapping clones spanning the entire sequence were obtained.

7. THE COMPLETE NUCLEOTIDE AND AMINO ACID SEQUENCES OF HUMAN E-CADHERIN

Sequencing of human E-cadherin cDNA clones was carried out using Taq polymerase in dideoxy sequencing (TaqTrack, Promega, Madison, WI). The complete nucleotide sequence of the liver clone is shown schematically in Figure 2 with landmark regions mapped below the sequence. The 2756 bp human liver cDNA sequence predicts an 878 amino acid protein (Figure 3) which shows high levels of homology with the mouse (84% at the protein level and 80% at the DNA level) and chicken molecules. The sequence shows a 150 amino acid leader sequence (containing a signal sequence) that has 46% homology to (identity with) the mouse sequence, followed by the 26 amino acids representing the mature amino terminus that show 96% identity with the mouse sequence. The amino-terminal processed region (leader sequence) is cleaved post-translationally to produce the mature protein. The SHAVS sequence shown to be necessary for homotypic interaction is identical in sequence and location to that seen in mouse and chicken. Human E-cadherin shows the same three internal repeat (putative Ca++ binding) structures seen in mouse, with 79-86% similarity to the mouse sequence and 22-36% similarity between repeats. The four cysteine residues in the region (conserved cysteine domain) immediately external to the transmembrane binding domain are identically conserved, as are the three consensus sequences for N-glycosylation in that region, in spite of the fact that this region is less similar overall (69% homology). The other N-glycosylation sequence, in the middle of the second repeat, is conserved but not identical. The region with the highest similarity is the carboxy-terminal region of the molecule, with the 24 transmembrane amino acids showing 100% identity and the cytoplasmic domain showing 95% similarity.
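The specification does not state how the percent identity figures were calculated (alignment method, treatment of gaps). As a hedged illustration of one common definition, the following Python sketch counts identical residues over the ungapped columns of two pre-aligned sequences. The function name `percent_identity` and the second toy sequence are assumptions; only the first toy sequence (residues 228-237 of SEQ ID NO:2, in one-letter code) comes from the document.

```python
# Illustrative sketch only; not the method used to obtain the figures above.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns holding the same residue.

    Both inputs must already be aligned to the same length; columns with a
    gap character ('-') in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


human_segment = "SHAVSSNGNA"          # residues 228-237 of SEQ ID NO:2
hypothetical_segment = "SHAVSENGKA"   # invented comparison sequence
print(percent_identity(human_segment, hypothetical_segment))
```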

Thus, the human E-cadherin protein and domains thereof can be characterized as containing the following amino acids:

E-cadherin protein or portion thereof    Amino acids of Figure 3 (SEQ ID NO:2)

mature protein                           151-878
extracellular domain                     1-703; 151-703 of mature protein
transmembrane domain                     704-727
cytoplasmic domain                       728-878
amino-terminal processed region          1-150
homotypic binding domain                 228-232
HAV homotypic binding sequence           229-231
conserved cysteine domain                513-703
Repeat domain                            178-513
Repeat #1                                178-289
Repeat #2                                290-401
Repeat #3                                402-513
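For illustration, the domain boundaries tabulated above can be held in a small lookup structure and used to check lengths or to cut a domain out of a protein sequence. This Python sketch uses only the coordinates given in the table; the names `DOMAINS`, `domain_length`, and `extract_domain`, and the placeholder sequence, are assumptions made for the example.

```python
# Illustrative sketch only; coordinates are 1-based, inclusive, from the table.
DOMAINS = {
    "amino-terminal processed region": (1, 150),
    "mature protein": (151, 878),
    "transmembrane domain": (704, 727),
    "cytoplasmic domain": (728, 878),
    "homotypic binding domain": (228, 232),
    "HAV homotypic binding sequence": (229, 231),
    "conserved cysteine domain": (513, 703),
    "repeat #1": (178, 289),
    "repeat #2": (290, 401),
    "repeat #3": (402, 513),
}


def domain_length(name: str) -> int:
    """Number of amino acids in the named domain."""
    start, end = DOMAINS[name]
    return end - start + 1


def extract_domain(protein: str, name: str) -> str:
    """Slice the named domain out of a full-length protein sequence."""
    start, end = DOMAINS[name]
    if end > len(protein):
        raise ValueError("protein is shorter than the domain coordinates")
    return protein[start - 1:end]


# Placeholder sequence of the right length; substitute SEQ ID NO:2 in practice.
placeholder = "X" * 878
print(domain_length("transmembrane domain"))
print(len(extract_domain(placeholder, "cytoplasmic domain")))
```

The transmembrane length computed this way (24) agrees with the 24 transmembrane amino acids mentioned in the text, and the processed region (150) agrees with the 150 amino acid leader sequence.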

8. cDNA CLONE SERIES, PROKARYOTIC EXPRESSION, AND ANTIBODY PRODUCTION

We used three individual clones to construct a coding sequence clone (termed FLEC) which encodes a protein consisting of 5 amino-terminal amino acids not from the human E-cadherin sequence, followed by the human E-cadherin protein sequence from approximately amino acid 146 through amino acid 878. The FLEC-encoded protein thus is missing amino acids 1 to about 145 of the sequence shown in Figure 3. The FLEC clone thus encodes the entire mature E-cadherin protein sequence (amino acids 151-878 of Fig. 3), but differs in the protein sequence corresponding to the amino-terminal processed region of human E-cadherin.

A full-length human E-cadherin coding sequence can be generated from bsFLEC and bsL5.1 (both deposited with the ATCC; see Section 10). Both bsFLEC and bsL5.1 represent E-cadherin sequences cloned in "bluescript phagemid II" from Stratagene Inc. L5.1 represents the 5' end of the E-cadherin clone (see Fig. 4) and is cloned into the EcoRI site of the plasmid. bsFLEC is cloned between the EcoRI and XhoI sites. To produce a full-length human E-cadherin construct, one cuts both plasmids with HindIII and BamHI and purifies the larger (4.7 kbp) band from bsFLEC and the 840 bp band from bsL5.1. These fragments are then ligated, and the resultant product is the full-length clone. FLEC has been transferred into multiple eukaryotic cloning vectors.
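As a rough aid to reading the digest described above, the following Python sketch computes the fragment sizes produced when a circular plasmid is cut at given positions, and the size expected when two purified fragments are ligated. The function name `circular_digest`, the plasmid length of 5500 bp, and the cut positions are invented for illustration; the specification gives only the band sizes (4.7 kbp and 840 bp), not plasmid maps.

```python
# Illustrative sketch only; plasmid length and cut positions are hypothetical.
def circular_digest(plasmid_length: int, cut_positions):
    """Fragment sizes (largest first) from cutting a circular plasmid.

    A single cut linearizes the plasmid; n distinct cuts give n fragments.
    """
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [plasmid_length]
    if cuts[0] < 0 or cuts[-1] >= plasmid_length:
        raise ValueError("cut positions must lie within the plasmid")
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    # The last fragment wraps around the origin of the circle.
    sizes.append(plasmid_length - cuts[-1] + cuts[0])
    return sorted(sizes, reverse=True)


# Hypothetical double digest giving a large band of 4700 bp.
fragments = circular_digest(5500, [100, 900])
large_band = fragments[0]

# Band sizes from the text: 4.7 kbp from bsFLEC and 840 bp from bsL5.1.
expected_product = 4700 + 840
print(fragments, expected_product)
```

On the band sizes given in the text, the ligated full-length construct would be expected to be about 5.5 kbp.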

We have also used the FLEC clone as a template to produce fragments for cloning into prokaryotic expression vectors (pGEX series, Pharmacia; pQE series, Qiagen). Although detectable expression was not obtained with the pQE series, fusion proteins were obtained using the pGEX expression vector system in E. coli HB101. This vector system produces chimeric (fusion) proteins containing the carboxyl terminus of glutathione S-transferase as the amino-terminal sequence fused to the E-cadherin sequence. We have produced two fusion proteins: one, encoded by clone e250 (see Fig. 2), containing the extracellular domain including the homotypic adhesion sequence, and a second, encoded by clone cyto 20 (see Fig. 2), containing the entire cytoplasmic domain.

Clone e250 contains nucleotide numbers 589-832 of the human E-cadherin sequence (Fig. 2; SEQ ID NO:1). The e250-encoded protein thus contains the HAV binding domain, and is expected to competitively inhibit homotypic binding of E-cadherin. Clone e250 was made by subcloning a PvuII-BamHI fragment from clone H9.1 into pGEX-2T. Expression of clone e250 in E. coli HB101 yielded a 37 kd fusion protein as detected by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE).

Clone cyto 20 contains nucleotide numbers 2297-2750 of the human E-cadherin sequence (Fig. 2; SEQ ID NO:1). Clone cyto 20 was made by performing PCR with 16-mer primers just 5' and just 3' of nucleotide numbers 2297-2749 in Figure 3. The primers were designed so as to incorporate an in-frame BamHI site at nucleotide number 2296. The PCR-amplified product was then cleaved with BamHI and subcloned into pGEX-3X. Expression in E. coli HB101 yielded a 42 kd fusion protein as detected by SDS-PAGE. Preliminary evidence indicates that the cyto 20-encoded fusion protein binds to brain fodrin/spectrin, and co-precipitates proteins out of MDCK cell extracts in the size range of alpha and beta catenin (see Ozawa and Kemler, 1992, J. Cell Biol. 116:989-96, regarding mouse E-cadherin).
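The nucleotide ranges quoted for clones e250 and cyto 20 can be related to amino acid numbers by simple arithmetic on the coding sequence location given in the sequence listing (CDS 116..2749 of SEQ ID NO:1). The Python sketch below is illustrative only; the function names `nt_to_residue` and `residues_fully_encoded` are assumptions, and the cyto 20 range is taken as 2297-2749, the figure given for the PCR target.

```python
# Illustrative sketch only; CDS location taken from the sequence listing.
CDS_START = 116   # first nucleotide of the initiating ATG
CDS_END = 2749    # last nucleotide of the codon for residue 878


def nt_to_residue(nt: int):
    """Map a nucleotide number to (residue number, position in codon 1-3)."""
    if not CDS_START <= nt <= CDS_END:
        raise ValueError("nucleotide lies outside the coding sequence")
    offset = nt - CDS_START
    return offset // 3 + 1, offset % 3 + 1


def residues_fully_encoded(first_nt: int, last_nt: int):
    """First and last residues whose complete codon lies in the range."""
    first_res, first_pos = nt_to_residue(first_nt)
    last_res, last_pos = nt_to_residue(last_nt)
    if first_pos != 1:
        first_res += 1   # range starts mid-codon; skip the partial codon
    if last_pos != 3:
        last_res -= 1    # range ends mid-codon; drop the partial codon
    return first_res, last_res


e250 = residues_fully_encoded(589, 832)      # (159, 239)
cyto20 = residues_fully_encoded(2297, 2749)  # (728, 878)
print(e250, cyto20)
```

On this arithmetic, e250 spans the HAV sequence at residues 229-231 and cyto 20 coincides with the cytoplasmic domain at residues 728-878, consistent with the descriptions in the text.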

We have produced both of the above-described fusion proteins in large quantity and generated polyclonal antibodies to both in rabbits. Fusion proteins were produced and purified by affinity chromatography according to standard procedures (Smith and Johnson, 1988, Gene 67:31-40) and utilized for immunization of rabbits (for polyclonal antibody production) and of mice (for monoclonal antibody production). The polyclonal antibodies thus obtained were tested against native human E-cadherin in colonic extracts and in extracts from a number of cell lines and were shown to bind with high affinity. We are currently in the process of producing mouse monoclonal antibodies to each fusion protein as follows: Since the fusion proteins appear to be toxic to mice, lymph nodes or spleen were taken for fusion with myeloma cells after only one immunization and one boost. The initial injection of fusion protein into mice was in Freund's complete adjuvant; the boost was given 13 days later in Freund's incomplete adjuvant. Three days after boosting, the lymph nodes or spleen were taken, and fusions with myeloma cells were performed, generating hybridomas for screening. The hybridomas were screened for reactivity with the fusion protein used as immunogen, and ten candidate hybridomas were identified as reactive with the e250-encoded protein, and are subject to verification.

We also have a large series of cloned fragments (Fig. 4) representing various portions of the molecule which may be used later for deletion analysis type studies.

9. EUKARYOTIC EXPRESSION OF HUMAN E-CADHERIN

We have subcloned the clone H9.1 DNA carboxy-terminal-encoding fragments (including the complete cytoplasmic domain) into eukaryotic expression vectors. We are using a series of vectors including pCMV-NeoPoly 1 (kindly provided by E. Fearon, pers. comm.; Fig. 6), pLXSN (Weintraub et al., 1989, Proc. Natl. Acad. Sci. USA 86:5434-5438), pGALO (Kato et al., 1990, Mol. Cell Biol. 10:5914-5920), and pNLVP16 (Dang et al., 1991, Mol. Cell Biol. 11:954-962). Transfection experiments are underway with CHO (Chinese hamster ovary) cells, Jurkat (T lymphoma cell line) cells, and a number of breast cancer cell lines.

10. DEPOSIT OF MICROORGANISMS

The following bacterial strains, containing the listed plasmids, were deposited on November 12, 1992 with the American Type Culture Collection (ATCC), 12301 Parklawn Drive, Rockville, Maryland 20852, under the provisions of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure, and assigned the indicated accession numbers.

Bacteria          carrying Plasmid            ATCC Accession Number

E. coli HB101     bsFLEC (containing FLEC)    69123
E. coli HB101     bsL5.1 (containing L5.1)    69122

The present invention is not to be limited in scope by the microorganisms described or the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and accompanying figures. Such modifications are intended to fall within the scope of the appended claims.

Various publications are cited herein, the disclosures of which are incorporated by reference in their entireties.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Rimm, David L.; Morrow, Jon S.

(ii) TITLE OF INVENTION: Human Homolog of the E-Cadherin Gene and Methods Based Thereon.

(iii) NUMBER OF SEQUENCES: 2

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Pennie & Edmonds

(B) STREET: 1155 Avenue of the Americas

(C) CITY: New York

(D) STATE: New York

(E) COUNTRY: U.S.A.

(F) ZIP: 10036-2711

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: To be assigned

(B) FILING DATE: On even date herewith

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Misrock, S. Leslie

(B) REGISTRATION NUMBER: 18,872

(C) REFERENCE/DOCKET NUMBER: 7326-014

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (212)790-9090

(B) TELEFAX: (212)869-8864/9741

(C) TELEX: 66141 PENNIE

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2815 nucleotides

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 116..2749

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GAATTCCGGA AAGCACCTGT GAGCTTGGCA AGTCAGTTCA GAGCTCCAGC CCGCTCCAGC  60

CCGGCCCGAC CCGACCGCAC CCGGCGCCTG CCTCGCTCGG GCTCCCCGGC CAGCC ATG  118
                                                             Met
                                                             1

GGC CCT TGG AGC CGC AGC CTC TCG GGC CTG CTG CTG CTG CTG AGG TCT  166
Gly Pro Trp Ser Arg Ser Leu Ser Gly Leu Leu Leu Leu Leu Arg Ser
            5                   10                  15

CCT CTT GGC TCT CAG GAG CGG AGC CCT CCT CCC TGT TTG ACG CGA GAG  214
Pro Leu Gly Ser Gln Glu Arg Ser Pro Pro Pro Cys Leu Thr Arg Glu
        20                  25                  30

CTA CAC GTT CAC GGT GCC CCG GCG CCA CCT GAG AAG AGG CCG CGT CTG  262
Leu His Val His Gly Ala Pro Ala Pro Pro Glu Lys Arg Pro Arg Leu
    35                  40                  45

GGC AGA GTG AAT TTT GAA GAT TGC ACC GGT CGA CAA AGG ACA GCT ATT  310
Gly Arg Val Asn Phe Glu Asp Cys Thr Gly Arg Gln Arg Thr Ala Ile
50                  55                  60                  65

TTC CTG ACA CCG ATT CCG AAA GTG GGC ACA GAT GGT GTG ATT ACA GTC  358
Phe Leu Thr Pro Ile Pro Lys Val Gly Thr Asp Gly Val Ile Thr Val
                70                  75                  80

AAA AGG CCT CTA CGG TTT CAT AAC CCA ACA GAT CCA TTT CTT GGT CTA  406
Lys Arg Pro Leu Arg Phe His Asn Pro Thr Asp Pro Phe Leu Gly Leu
            85                  90                  95

CGC TGG GAC TCC ACC TAC AGA AAG TTT TCC ACC AAA GTC ACG CTG AAT  454
Arg Trp Asp Ser Thr Tyr Arg Lys Phe Ser Thr Lys Val Thr Leu Asn
        100                 105                 110

ACA GTG GGG CAC CAC CAC CGC CCC CCG CCC CAT CAG GCC TCC GTT TCT  502
Thr Val Gly His His His Arg Pro Pro Pro His Gln Ala Ser Val Ser
    115                 120                 125

GGA ATC CAA GCA GAA TTG CTC ACA TTT CCC AAC TCC TCT CCT GGC CTC  550
Gly Ile Gln Ala Glu Leu Leu Thr Phe Pro Asn Ser Ser Pro Gly Leu
130                 135                 140                 145

AGA AGA CAG AAG AGA GAC TGG GTT ATT CCT CCC ATC AGC TGC CCA GAA  598
Arg Arg Gln Lys Arg Asp Trp Val Ile Pro Pro Ile Ser Cys Pro Glu
                150                 155                 160

AAT GAA AAA GGC CCA TTT CCT AAA AAC CTG GTT CAG ATC AAA TCC AAC  646
Asn Glu Lys Gly Pro Phe Pro Lys Asn Leu Val Gln Ile Lys Ser Asn
            165                 170                 175

AAA GAC AAA GAA GGC AAG GTT TTC TAC AGC ATC ACT GGC CAA GGA GCT  694
Lys Asp Lys Glu Gly Lys Val Phe Tyr Ser Ile Thr Gly Gln Gly Ala
        180                 185                 190

GAC ACA CCC CCT GTT GGT GTC TTT ATT ATT GAA AGA GAA ACA GGA TGG  742
Asp Thr Pro Pro Val Gly Val Phe Ile Ile Glu Arg Glu Thr Gly Trp
    195                 200                 205

CTG AAG GTG ACA GAG CCT CTG GAT AGA GAA CGC ATT GCC ACA TAC ACT  790
Leu Lys Val Thr Glu Pro Leu Asp Arg Glu Arg Ile Ala Thr Tyr Thr
210                 215                 220                 225

CTC TTC TCT CAC GCT GTG TCA TCC AAC GGG AAT GCA GTT GAG GAT CCA  838
Leu Phe Ser His Ala Val Ser Ser Asn Gly Asn Ala Val Glu Asp Pro
                230                 235                 240

ATG GAG ATT TTG ATC ACG GTA ACC GAT CAG AAT GAC AAC AAG CCC GAA  886
Met Glu Ile Leu Ile Thr Val Thr Asp Gln Asn Asp Asn Lys Pro Glu
            245                 250                 255

TTC ACC CAG GAG GTC TTT AAG GGG TCT GTC ATG GAA GGT GCT CTT CCA  934
Phe Thr Gln Glu Val Phe Lys Gly Ser Val Met Glu Gly Ala Leu Pro
        260                 265                 270

GGA ACC TCT GTG ATG GAG GTC ACA GCC ACA GAC GCG GAC GAT GAT GTG  982
Gly Thr Ser Val Met Glu Val Thr Ala Thr Asp Ala Asp Asp Asp Val
    275                 280                 285

AAC ACC TAC AAT GCC GCC ATC GCT TAC ACC ATC CTC AGC CAA GAT CCT  1030
Asn Thr Tyr Asn Ala Ala Ile Ala Tyr Thr Ile Leu Ser Gln Asp Pro
290                 295                 300                 305

GAG CTC CCT GAC AAA AAT ATG TTC ACC ATT AAC AGG AAC ACA GGA GTC  1078
Glu Leu Pro Asp Lys Asn Met Phe Thr Ile Asn Arg Asn Thr Gly Val
                310                 315                 320

ATC AGT GTG GTC ACC ACT GGG CTG GAC CGA GAG AGT TTC CCT ACG TAT  1126
Ile Ser Val Val Thr Thr Gly Leu Asp Arg Glu Ser Phe Pro Thr Tyr
            325                 330                 335

ACC CTG GTG GTT CAA GCT GCT GAC CTT CAA GGT GAG GGG TTA AGC ACA  1174
Thr Leu Val Val Gln Ala Ala Asp Leu Gln Gly Glu Gly Leu Ser Thr
        340                 345                 350

ACA GCA ACA GCT GTG ATC ACA GTC ACT GAC ACC AAC GAT AAT CCT CCG  1222
Thr Ala Thr Ala Val Ile Thr Val Thr Asp Thr Asn Asp Asn Pro Pro
    355                 360                 365

ATC TTC AAT CCC ACC ACG TAC AAG GGT CAG GTG CCT GAG AAC GAG GCT  1270
Ile Phe Asn Pro Thr Thr Tyr Lys Gly Gln Val Pro Glu Asn Glu Ala
370                 375                 380                 385

AAC GTC GTA ATC ACC ACA CTG AAA GTG ACT GAT GCT GAT GCC CCC AAT  1318
Asn Val Val Ile Thr Thr Leu Lys Val Thr Asp Ala Asp Ala Pro Asn
                390                 395                 400

ACC CCA GCG TGG GAG GCT GTA TAC ACC ATA TTG AAT GAT GAT GGT GGA  1366
Thr Pro Ala Trp Glu Ala Val Tyr Thr Ile Leu Asn Asp Asp Gly Gly
            405                 410                 415

CAA TTT GTC GTC ACC ACA AAT CCA GTG AAC AAC GAT GGC ATT TTG AAA  1414
Gln Phe Val Val Thr Thr Asn Pro Val Asn Asn Asp Gly Ile Leu Lys
        420                 425                 430

ACA GCA AAG GGC TTG GAT TTT GAG GCC AAG CAG CAG TAC ATT CTA CAC  1462
Thr Ala Lys Gly Leu Asp Phe Glu Ala Lys Gln Gln Tyr Ile Leu His
    435                 440                 445

GTA GCA GTG ACG AAT GTG GTA CCT TTT GAG GTC TCT CTC ACC ACC TCC  1510
Val Ala Val Thr Asn Val Val Pro Phe Glu Val Ser Leu Thr Thr Ser
450                 455                 460                 465

ACA GCC ACC GTC ACC GTG GAT GTG CTG GAT GTG AAT GAA GGC CCC ATC  1558
Thr Ala Thr Val Thr Val Asp Val Leu Asp Val Asn Glu Gly Pro Ile
                470                 475                 480

TTT GTG CCT CCT GAA AAG AGA GTG GAA GTG TCC GAG GAC TTT GGC GTG  1606
Phe Val Pro Pro Glu Lys Arg Val Glu Val Ser Glu Asp Phe Gly Val
            485                 490                 495

GGC CAG GAA ATC ACA TCC TAC ACT GCC CAG GAG CCA GAC ACA TTT ATG  1654
Gly Gln Glu Ile Thr Ser Tyr Thr Ala Gln Glu Pro Asp Thr Phe Met
        500                 505                 510

GAA CAG AAA ATA ACA TAT CGG ATT TGG AGA GAC ACT CGC AAC TGG CTG  1702
Glu Gln Lys Ile Thr Tyr Arg Ile Trp Arg Asp Thr Arg Asn Trp Leu
    515                 520                 525

GAG ATT AAT CCG GAC ACT GGT GCC ATT TCC ACT CGG GCT GAG CTG GAC  1750
Glu Ile Asn Pro Asp Thr Gly Ala Ile Ser Thr Arg Ala Glu Leu Asp
530                 535                 540                 545

AGG GAG GAT TTT GAG CAC GTG AAG AAC AGC ACG TAC ACA GCC CTA ATC  1798
Arg Glu Asp Phe Glu His Val Lys Asn Ser Thr Tyr Thr Ala Leu Ile
                550                 555                 560

ATA GCT ACA GAC AAT GGT TCT CCA GTT GCT ACT GGA ACA GGG ACA CTT  1846
Ile Ala Thr Asp Asn Gly Ser Pro Val Ala Thr Gly Thr Gly Thr Leu
            565                 570                 575

CTG CTG ATC CTG TCT GAT GTG AAT GAC AAC GCC CCC ATA CCA GAA CCT  1894
Leu Leu Ile Leu Ser Asp Val Asn Asp Asn Ala Pro Ile Pro Glu Pro
        580                 585                 590

CGA ACT ATA TTC TTC TGT GAG AGG AAT CCA AAG CCT CAG GTC ATA AAC  1942
Arg Thr Ile Phe Phe Cys Glu Arg Asn Pro Lys Pro Gln Val Ile Asn
    595                 600                 605

ATT CAT GAT GCA GAC CTT CCT CCC AAT ACA TCT CCC TTC ACA GCA GAA  1990
Ile His Asp Ala Asp Leu Pro Pro Asn Thr Ser Pro Phe Thr Ala Glu
610                 615                 620                 625

CTA ACA CAC GGG CGA GTG CCC AAC TGG ACC ATT CAG TAC AAC GAC CCA  2038
Leu Thr His Gly Arg Val Pro Asn Trp Thr Ile Gln Tyr Asn Asp Pro
                630                 635                 640

ACC CAA GAA TCT ATC ATT TTG AAG CCA AAG ATG GCC TTA GAG GTG GGT  2086
Thr Gln Glu Ser Ile Ile Leu Lys Pro Lys Met Ala Leu Glu Val Gly
            645                 650                 655

GAC TAC AAA ATC AAT CTC AAG CTC ATG GAT AAC CAG AAT AAA GAC CAA  2134
Asp Tyr Lys Ile Asn Leu Lys Leu Met Asp Asn Gln Asn Lys Asp Gln
        660                 665                 670

GTG ACC ACC TTA GAG GTC AGC GTG TGT GAC TGT GAA GGG GCC GCC GGC  2182
Val Thr Thr Leu Glu Val Ser Val Cys Asp Cys Glu Gly Ala Ala Gly
    675                 680                 685

GTC TGT AGG AAG GCA CAG CCT GTC GAA GCA GGA TTG CAA ATT CCT GCC  2230
Val Cys Arg Lys Ala Gln Pro Val Glu Ala Gly Leu Gln Ile Pro Ala
690                 695                 700                 705

ATT CTG GGG ATT CTT GGA GGA ATT CTT GCT TTG CTA ATT CTG ATT CTG  2278
Ile Leu Gly Ile Leu Gly Gly Ile Leu Ala Leu Leu Ile Leu Ile Leu
                710                 715                 720

CTG CTC TTG CTG TTT CTT CGG AGG AGA GCG GTG GTC AAA GAG CCC TTA  2326
Leu Leu Leu Leu Phe Leu Arg Arg Arg Ala Val Val Lys Glu Pro Leu
            725                 730                 735

CTG CCC CCA GAG GAT GAC ACC CGG GAC AAC GTT TAT TAC TAT GAT GAA  2374
Leu Pro Pro Glu Asp Asp Thr Arg Asp Asn Val Tyr Tyr Tyr Asp Glu
        740                 745                 750

GAA GGA GGC GGA GAA GAG GAC CAG GAC TTT GAC TTG AGC CAG CTG CAC  2422
Glu Gly Gly Gly Glu Glu Asp Gln Asp Phe Asp Leu Ser Gln Leu His
    755                 760                 765

AGG GGC CTG GAC GCT CGG CCT GAA GTG ACT CGT AAC GAC GTT GCA CCA  2470
Arg Gly Leu Asp Ala Arg Pro Glu Val Thr Arg Asn Asp Val Ala Pro
770                 775                 780                 785

ACC CTC ATG AGT GTC CCC CGG TAT CTT CCC CGC CCT GCC AAT CCC GAT  2518
Thr Leu Met Ser Val Pro Arg Tyr Leu Pro Arg Pro Ala Asn Pro Asp
                790                 795                 800

GAA ATT GGA AAT TTT ATT GAT GAA AAT CTG AAA GCG GCT GAT ACT GAC  2566
Glu Ile Gly Asn Phe Ile Asp Glu Asn Leu Lys Ala Ala Asp Thr Asp
            805                 810                 815

CCC ACA GCC CCG CCT TAT GAT TCT CTG CTC GTG TTT GAC TAT GAA GGA  2614
Pro Thr Ala Pro Pro Tyr Asp Ser Leu Leu Val Phe Asp Tyr Glu Gly
        820                 825                 830

AGC GGT TCC GAA GCT GCT AGT CTG AGC TCC CTG AAC TCC TCA GAG TCA  2662
Ser Gly Ser Glu Ala Ala Ser Leu Ser Ser Leu Asn Ser Ser Glu Ser
    835                 840                 845

GAC AAA GAC CAG GAC TAT GAC TAC TTG AAC GAA TGG GGC AAT CCG TTC  2710
Asp Lys Asp Gln Asp Tyr Asp Tyr Leu Asn Glu Trp Gly Asn Pro Phe
850                 855                 860                 865

AAG AAG CTG GCT GAC ATG TAC GGA GGC GGC GAG GAC CAC TAGGGGACTC  2759
Lys Lys Leu Ala Asp Met Tyr Gly Gly Gly Glu Asp His
                870                 875

GAGAGAGGCG GCCCAGACCA TGTGCAGAAA TGCAGAAATC AGCGTTCTGT TGTTTT  2815

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 878 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Gly Pro Trp Ser Arg Ser Leu Ser Gly Leu Leu Leu Leu Leu Arg
1               5                   10                  15

Ser Pro Leu Gly Ser Gln Glu Arg Ser Pro Pro Pro Cys Leu Thr Arg
            20                  25                  30

Glu Leu His Val His Gly Ala Pro Ala Pro Pro Glu Lys Arg Pro Arg
        35                  40                  45

Leu Gly Arg Val Asn Phe Glu Asp Cys Thr Gly Arg Gln Arg Thr Ala
    50                  55                  60

Ile Phe Leu Thr Pro Ile Pro Lys Val Gly Thr Asp Gly Val Ile Thr
65                  70                  75                  80

Val Lys Arg Pro Leu Arg Phe His Asn Pro Thr Asp Pro Phe Leu Gly
                85                  90                  95

Leu Arg Trp Asp Ser Thr Tyr Arg Lys Phe Ser Thr Lys Val Thr Leu
            100                 105                 110

Asn Thr Val Gly His His His Arg Pro Pro Pro His Gln Ala Ser Val
        115                 120                 125

Ser Gly Ile Gln Ala Glu Leu Leu Thr Phe Pro Asn Ser Ser Pro Gly
    130                 135                 140

Leu Arg Arg Gln Lys Arg Asp Trp Val Ile Pro Pro Ile Ser Cys Pro
145                 150                 155                 160

Glu Asn Glu Lys Gly Pro Phe Pro Lys Asn Leu Val Gln Ile Lys Ser
                165                 170                 175

Asn Lys Asp Lys Glu Gly Lys Val Phe Tyr Ser Ile Thr Gly Gln Gly
            180                 185                 190

Ala Asp Thr Pro Pro Val Gly Val Phe Ile Ile Glu Arg Glu Thr Gly
        195                 200                 205

Trp Leu Lys Val Thr Glu Pro Leu Asp Arg Glu Arg Ile Ala Thr Tyr
    210                 215                 220

Thr Leu Phe Ser His Ala Val Ser Ser Asn Gly Asn Ala Val Glu Asp
225                 230                 235                 240

Pro Met Glu Ile Leu Ile Thr Val Thr Asp Gln Asn Asp Asn Lys Pro
                245                 250                 255

Glu Phe Thr Gln Glu Val Phe Lys Gly Ser Val Met Glu Gly Ala Leu
            260                 265                 270

Pro Gly Thr Ser Val Met Glu Val Thr Ala Thr Asp Ala Asp Asp Asp
        275                 280                 285

Val Asn Thr Tyr Asn Ala Ala Ile Ala Tyr Thr Ile Leu Ser Gln Asp
    290                 295                 300

Pro Glu Leu Pro Asp Lys Asn Met Phe Thr Ile Asn Arg Asn Thr Gly
305                 310                 315                 320

Val Ile Ser Val Val Thr Thr Gly Leu Asp Arg Glu Ser Phe Pro Thr
                325                 330                 335

Tyr Thr Leu Val Val Gln Ala Ala Asp Leu Gln Gly Glu Gly Leu Ser
            340                 345                 350

Thr Thr Ala Thr Ala Val Ile Thr Val Thr Asp Thr Asn Asp Asn Pro
        355                 360                 365

Pro Ile Phe Asn Pro Thr Thr Tyr Lys Gly Gln Val Pro Glu Asn Glu
    370                 375                 380

Ala Asn Val Val Ile Thr Thr Leu Lys Val Thr Asp Ala Asp Ala Pro
385                 390                 395                 400

Asn Thr Pro Ala Trp Glu Ala Val Tyr Thr Ile Leu Asn Asp Asp Gly
                405                 410                 415

Gly Gln Phe Val Val Thr Thr Asn Pro Val Asn Asn Asp Gly Ile Leu
            420                 425                 430

Lys Thr Ala Lys Gly Leu Asp Phe Glu Ala Lys Gln Gln Tyr Ile Leu
        435                 440                 445

His Val Ala Val Thr Asn Val Val Pro Phe Glu Val Ser Leu Thr Thr
    450                 455                 460

Ser Thr Ala Thr Val Thr Val Asp Val Leu Asp Val Asn Glu Gly Pro
465                 470                 475                 480

Ile Phe Val Pro Pro Glu Lys Arg Val Glu Val Ser Glu Asp Phe Gly
                485                 490                 495

Val Gly Gln Glu Ile Thr Ser Tyr Thr Ala Gln Glu Pro Asp Thr Phe
            500                 505                 510

Met Glu Gln Lys Ile Thr Tyr Arg Ile Trp Arg Asp Thr Arg Asn Trp
        515                 520                 525

Leu Glu Ile Asn Pro Asp Thr Gly Ala Ile Ser Thr Arg Ala Glu Leu
    530                 535                 540

Asp Arg Glu Asp Phe Glu His Val Lys Asn Ser Thr Tyr Thr Ala Leu
545                 550                 555                 560

Ile Ile Ala Thr Asp Asn Gly Ser Pro Val Ala Thr Gly Thr Gly Thr
                565                 570                 575

Leu Leu Leu Ile Leu Ser Asp Val Asn Asp Asn Ala Pro Ile Pro Glu
            580                 585                 590

Pro Arg Thr Ile Phe Phe Cys Glu Arg Asn Pro Lys Pro Gln Val Ile
        595                 600                 605

Asn Ile His Asp Ala Asp Leu Pro Pro Asn Thr Ser Pro Phe Thr Ala
    610                 615                 620

Glu Leu Thr His Gly Arg Val Pro Asn Trp Thr Ile Gln Tyr Asn Asp
625                 630                 635                 640

Pro Thr Gln Glu Ser Ile Ile Leu Lys Pro Lys Met Ala Leu Glu Val
                645                 650                 655

Gly Asp Tyr Lys Ile Asn Leu Lys Leu Met Asp Asn Gln Asn Lys Asp
            660                 665                 670

Gln Val Thr Thr Leu Glu Val Ser Val Cys Asp Cys Glu Gly Ala Ala
        675                 680                 685

Gly Val Cys Arg Lys Ala Gln Pro Val Glu Ala Gly Leu Gln Ile Pro
    690                 695                 700

Ala Ile Leu Gly Ile Leu Gly Gly Ile Leu Ala Leu Leu Ile Leu Ile
705                 710                 715                 720

Leu Leu Leu Leu Leu Phe Leu Arg Arg Arg Ala Val Val Lys Glu Pro
                725                 730                 735

Leu Leu Pro Pro Glu Asp Asp Thr Arg Asp Asn Val Tyr Tyr Tyr Asp
            740                 745                 750

Glu Glu Gly Gly Gly Glu Glu Asp Gln Asp Phe Asp Leu Ser Gln Leu
        755                 760                 765

His Arg Gly Leu Asp Ala Arg Pro Glu Val Thr Arg Asn Asp Val Ala
    770                 775                 780

Pro Thr Leu Met Ser Val Pro Arg Tyr Leu Pro Arg Pro Ala Asn Pro
785                 790                 795                 800

Asp Glu Ile Gly Asn Phe Ile Asp Glu Asn Leu Lys Ala Ala Asp Thr
                805                 810                 815

Asp Pro Thr Ala Pro Pro Tyr Asp Ser Leu Leu Val Phe Asp Tyr Glu
            820                 825                 830

Gly Ser Gly Ser Glu Ala Ala Ser Leu Ser Ser Leu Asn Ser Ser Glu
        835                 840                 845

Ser Asp Lys Asp Gln Asp Tyr Asp Tyr Leu Asn Glu Trp Gly Asn Pro
    850                 855                 860

Phe Lys Lys Leu Ala Asp Met Tyr Gly Gly Gly Glu Asp His
865                 870                 875

International Application No: PCT/ /

MICROORGANISMS

Optional Sheet in connection with the microorganism referred to on page 58, lines [illegible] of the description

A. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution: American Type Culture Collection

Address of depositary institution (including postal code and country): 12301 Parklawn Drive, Rockville, MD 20852, US

Date of deposit: November 12, 1992

Accession Number: 69122

B. ADDITIONAL INDICATIONS (leave blank if not applicable). This information is continued on a separate attached sheet

C. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE

D. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

E. This sheet was received with the International application when filed (to be checked by the receiving Office)

[illegible], INTERNATIONAL DIVISION (Authorized Officer)

The date of receipt (from the applicant) by the International Bureau

(Authorized Officer)

Form PCT/RO/134 (January 1981)

International Application No: PCT/ /

Form PCT/RO/134 (cont.)

American Type Culture Collection

12301 Parklawn Drive, Rockville, MD 20852, US

Accession No.    Date of Deposit

69123            November 12, 1992