Title:
TUMOR SUPPRESSOR GENE CALIBAN
Document Type and Number:
WIPO Patent Application WO/2006/133316
Kind Code:
A2
Abstract:
A tumor suppressor, referred to as Caliban, nucleic acid molecules encoding this protein, and methods of making and using these molecules are provided. Also provided are methods of ameliorating, treating, detecting, prognosing, and diagnosing diseases and conditions associated with abnormal Caliban expression, such as cancer.

Inventors:
MORTIN MARK ANDREW (US)
BI XIAOLIN (US)
Application Number:
PCT/US2006/022180
Publication Date:
December 14, 2006
Filing Date:
June 06, 2006
Assignee:
US GOV HEALTH & HUMAN SERV (US)
MORTIN MARK ANDREW (US)
BI XIAOLIN (US)
International Classes:
C07K14/435
Other References:
DATABASE EMBL 26 March 2002 (2002-03-26), "Drosophila melanogaster genomic polynucleotide SEQ ID NO 21052." XP002398813 retrieved from EBI Database accession no. ABL23193
DATABASE UniProt 1 May 2000 (2000-05-01), "CG11847-PA" XP002398814 retrieved from EBI Database accession no. Q9VBX1
DEMIDENKO ZOYA ET AL: "Regulated nuclear export of the homeodomain transcription factor Prospero" DEVELOPMENT (CAMBRIDGE), vol. 128, no. 8, April 2001 (2001-04), pages 1359-1367, XP002398807 ISSN: 0950-1991 cited in the application
BI XIAOLIN ET AL: "The carboxy terminus of Prospero regulates its subcellular localization." MOLECULAR AND CELLULAR BIOLOGY, vol. 23, no. 3, February 2003 (2003-02), pages 1014-1024, XP002398808 ISSN: 0270-7306 cited in the application
SCANLAN ET AL: "Characterization of human colon cancer antigens recognized by autologous antibodies" INTERNATIONAL JOURNAL OF CANCER, NEW YORK, NY, US, vol. 76, no. 5, 29 May 1998 (1998-05-29), pages 652-658, XP002103186 ISSN: 0020-7136 cited in the application
CARBONNELLE D ET AL: "Up-regulation of a novel mRNA (NY-CO-1) involved in the methyl 4-methoxy-3-(3-methyl-2-butenoyl) benzoate (VT1)-induced proliferation arrest of a non-small-cell lung carcinoma cell line (NSCLC-N6)" INTERNATIONAL JOURNAL OF CANCER, vol. 92, no. 3, 1 May 2001 (2001-05-01), pages 388-397, XP002398809 ISSN: 0020-7136 cited in the application
BI XIAOLIN ET AL: "Drosophila caliban, a nuclear export mediator, can function as a tumor suppressor in human lung cancer cells" ONCOGENE, vol. 24, no. 56, December 2005 (2005-12), pages 8229-8239, XP002398810 ISSN: 0950-9232
Attorney, Agent or Firm:
CALDWELL, John, W. et al. (ONE LIBERTY PLACE 46th Floo, Philadelphia PA, US)
Claims:

What is Claimed:

1. An isolated Caliban (Clbn) polynucleotide, wherein said polynucleotide is

(a) a polynucleotide that has the sequence as shown in Figure 5;

(b) a polynucleotide that hybridizes under stringent hybridization conditions to (a) and encodes a polypeptide having the sequence as shown in Figure 5; or

(c) a polynucleotide that hybridizes under stringent hybridization conditions to (a) and encodes a polypeptide with at least 25 contiguous residues of the polypeptide as shown in Figure 5; or

(d) a polynucleotide that hybridizes under stringent hybridization conditions to (a) and has at least 12 contiguous bases identical to or exactly complementary to as shown in Figure 5, wherein the polynucleotide encodes a polypeptide having nuclear export mediation activity.

2. The isolated Caliban polynucleotide of claim 1, wherein the nuclear export mediation activity results in tumor suppression.

3. An isolated Caliban polynucleotide encoding a polypeptide comprising a sequence at least 60% identical to the sequence as shown in Figure 5 and having nuclear export mediation activity.

4. The isolated Caliban polynucleotide of claim 1 encoding a polypeptide comprising the sequence as shown in Figure 5.

5. The isolated Caliban polynucleotide of claim 1 comprising the sequence as shown in Figure 5 or its complement.

6. An isolated polynucleotide comprising a nucleotide sequence having at least 60% identity to the sequence as shown in Figure 5, or a complement thereof and having nuclear export mediation activity.

7. An isolated polypeptide comprising a nucleotide sequence that has at least 90% sequence identity to the sequence as shown in Figure 5 and is immunologically cross-reactive with the amino acid sequence as shown in Figure 5 or shares a biological function with native Caliban.

8. A vector comprising the isolated Caliban polynucleotide of claim 1.

9. An expression vector comprising the Caliban polynucleotide of claim 1 operatively linked to a regulatory sequence that controls expression of the polynucleotide in a host cell.

10. The expression vector of claim 10 wherein the polynucleotide is operatively linked to the regulatory sequence in an antisense orientation.

11. The expression vector of claim 10 wherein the polynucleotide is operatively linked to the regulatory sequence in a sense orientation.

12. A host cell comprising the polynucleotide of claim 1, or progeny of the cell.

13. The host cell of claim 13 which is a eukaryote.

14. A host cell comprising the polynucleotide of claim 1 operatively linked with a regulatory sequence that controls expression of the polynucleotide in a host cell.

15. The host cell of claim 15 wherein the nucleic acid is operatively linked to the regulatory sequence in an antisense orientation.

16. The host cell of claim 15 wherein the nucleic acid is operatively linked to the regulatory sequence in a sense orientation.

17. An isolated DNA that encodes a Caliban protein as shown in Figure 5.

18. An antisense oligonucleotide complementary to a messenger RNA comprising the sequence as shown in Figure 5 and encoding Caliban, wherein the oligonucleotide inhibits the expression of Caliban.

19. The polynucleotide of claim 1 that is RNA.

20. A method of producing a polypeptide comprising:

(i) culturing the host cell of claim 13 under conditions such that the polypeptide is expressed; and

(ii) recovering the polypeptide from the cultured host cell or its culture medium.

21. An isolated polypeptide encoded by a polynucleotide of claim 1 (a) or (b).

22. The polypeptide of claim 22 that has the amino acid sequence as shown in Figure 5.

23. The isolated polypeptide of claim 22 that is fused with a heterologous peptide.

24. An isolated antibody that specifically binds to a polypeptide having the amino acid sequence as shown in Figure 5.

25. An isolated antibody composition that specifically binds to a polypeptide of claim 25.

26. The isolated antibody composition of claim 25 that is monoclonal.

27. The isolated antibody composition of claim 25 that is polyclonal.

28. The isolated antibody of claims 27 or 28 that is labeled.

29. The isolated antibody composition of claims 27 or 28 that is a neutralizing antibody.

30. A hybridoma capable of secreting the antibody that binds to a polypeptide of claim 26.

31. A method for identifying a compound or agent that binds to a Caliban polypeptide comprising:

(i) contacting a Caliban polypeptide of claim 22 with the compound or agent under conditions which allow binding of the compound to the Caliban polypeptide to form a complex and

(ii) detecting the presence of the complex.

32. A method of detecting a Caliban polypeptide in a sample, comprising:

(i) contacting the sample with an antibody of claim 27, and (ii) determining whether a hybridization complex has been formed between the antibody and the Caliban polypeptide.

33. A method of detecting a Caliban polypeptide in a sample, comprising:

(i) contacting the sample with a polynucleotide of claim 1 or a polynucleotide that comprises a sequence of at least 12 nucleotides and is complementary to a contiguous sequence of the polynucleotide of section (a) of claim 1; and

(ii) determining whether a hybridization complex has been formed.

34. The method of claim 34, wherein said method is used to diagnose a disease.

35. The method of claim 34, wherein said method is used to diagnose a disease or disorder associated with cancer.

36. A method of detecting a Caliban nucleotide in a sample, comprising:

(i) using a polynucleotide that comprises a sequence of at least 12 nucleotides and is complementary to a contiguous sequence of a polynucleotide of section (a) of claim 1, in an amplification process, and

(ii) determining whether a specific amplification product has been formed.

37. The method of claim 37, wherein said method is used to diagnose a disease or disorder.

38. The method of claim 38, wherein said method is used to diagnose a disease or disorder associated with cancer.

39. The method of claim 38, wherein said method is used to diagnose lung cancer.

40. A pharmaceutical composition comprising a polynucleotide of claim 1, or a polypeptide of claim 22 or an antibody of claim 25 and a pharmaceutically acceptable carrier.

41. A pharmaceutical composition comprising an antibody of claim 25.

42. A method of modulating Caliban activity in a subject, comprising administering to the subject a therapeutically effective amount of the pharmaceutical composition of claim 42.

43. The method of claim 43 wherein the Caliban activity is nuclear export mediation activity.

44. The method of claim 43 wherein the nuclear export mediation activity is tumor suppression.

45. A method of screening bioactive agents comprising: a) providing a cell that expresses a Caliban gene as set forth in Figure 5 or ortholog thereof, or fragment thereof; b) adding a bioactive agent candidate to the cell; and

c) determining the effect of the bioactive agent candidate on the expression of the Caliban gene.

46. The method according to claim 46 wherein the determining comprises comparing the level of expression in the absence of the bioactive agent candidate to the level of expression in the presence of the bioactive agent candidate.

47. A method of screening for a bioactive agent capable of binding to a Caliban protein, wherein the Caliban protein is encoded by a nucleic acid encoding a gene as set forth in Figure 5 or ortholog thereof, or fragment thereof, the method comprising: a) combining the Caliban protein and a candidate bioactive agent; and b) determining the binding of the bioactive agent to the Caliban protein.

48. A method for screening for a bioactive agent capable of modulating the activity of a Caliban protein, wherein the Caliban protein is encoded by a nucleic acid encoding a gene as set forth in Figure 5 or ortholog thereof, or fragment thereof, the method comprising: a) combining the Caliban protein and a candidate bioactive agent; and b) determining the effect of the bioactive agent on the bioactivity of the Caliban protein.

49. A method of evaluating the effect of a Caliban modulating drug comprising: a) administering the drug to a mammal; b) removing a cell sample from the mammal; and c) determining the expression of the Caliban gene set forth in Figure 5 or ortholog thereof.

50. A method for screening for a bioactive agent capable of interfering with the binding of a Caliban protein or a fragment thereof and an antibody which binds to the Caliban protein or fragment thereof, the method comprising: a) combining a Caliban or fragment thereof, a candidate bioactive agent and an antibody which binds to the Caliban protein or fragment thereof; and b) determining the binding of the Caliban protein or fragment thereof and the antibody.

51. A method for inhibiting the activity of a Caliban protein, wherein the Caliban protein is a gene product of the gene set forth in Figure 5 or ortholog thereof, or a fragment thereof, the method comprising binding an inhibitor to the Caliban protein.

52. A method of treating cancer comprising administering to a subject a modulator of a Caliban protein, wherein the Caliban protein is a gene product of the Caliban gene set forth in Figure 5 or ortholog thereof, or a fragment thereof.

53. A method of neutralizing the effect of a Caliban protein, or a functional fragment thereof, comprising contacting an agent specific for the protein, or a functional fragment thereof, with the protein in an amount sufficient to effect neutralization.

54. A method of treating cancer in a subject comprising administering to the subject a nucleic acid molecule that hybridizes under stringent conditions to the Caliban gene as shown in Figure 5 or ortholog thereof, or fragment thereof, and attenuates expression of the Caliban gene.

55. The method of claim 55 wherein the nucleic acid molecule is an antisense oligonucleotide.

56. The method of claim 55 wherein said nucleic acid molecule is a double stranded RNA molecule.

57. The method of claim 55 wherein said nucleic acid molecule is a DNA molecule comprising a nucleotide sequence encoding an shRNA molecule.

58. The method of claim 55 wherein said double stranded RNA molecule is short interfering RNA (siRNA) or short hairpin RNA (shRNA).

59. A method of inhibiting expression of a Caliban gene or its ortholog as shown in Figure 5 comprising the steps of (i) providing a biological system in which expression of a Caliban gene as shown in Figure 5 or ortholog thereof to be inhibited; and (ii) contacting the system with a double stranded RNA molecule that hybridizes to a transcript encoding the protein translated from the gene; and (iii) inhibiting expression of the gene encoding the protein.

60. A compound comprising a double stranded RNA having a nucleotide sequence that hybridizes under stringent conditions to a Caliban gene as shown in Figure 5 or an ortholog thereof, and attenuates expression of said Caliban gene.

61. The compound of claim 61 wherein said double stranded RNA hybridizes to an untranslated sequence of the Caliban gene.

62. The compound of claim 61 wherein said double stranded RNA hybridizes to an intron sequence of the Caliban gene.

63. A compound that inhibits a Caliban gene, the compound comprising an oligonucleotide that interacts with a Caliban gene ortholog having at least about 40% sequence similarity to the ortholog.

64. The compound of claim 64 wherein the oligonucleotide interacts with a gene product encoded by an ortholog of a Caliban gene as shown in Figure 5 having at least about 40% sequence similarity to the ortholog.

65. The compound of claim 64 wherein the oligonucleotide interacts with a gene product encoded by the ortholog of a Caliban gene having at least about 70% sequence similarity to the ortholog.

66. The compound of claim 64 wherein the compound is at least one of: a single-stranded DNA oligonucleotide, double-stranded DNA oligonucleotide, a single-stranded RNA oligonucleotide, double-stranded RNA oligonucleotide, and modified variants of these.

67. A method for treating cancer comprising: providing a subject at risk of or suffering from cancer and administering a compound that modulates activity or abundance of a Caliban gene as shown in Figure 5 or ortholog thereof.

68. A kit comprising the oligonucleotide of claim 64 and one or more items selected from the group consisting of: packaging and instructions for use, a buffer, nucleotides, a polymerase, an enzyme, a positive control sample, a negative control sample, and a negative control primer or probe.

69. A kit comprising the oligonucleotide of claim 64 and one or more items selected from the group consisting of: packaging and instructions for use, a buffer, nucleotides, a polymerase, an enzyme, a positive control sample, a negative control sample, and a negative control primer or probe.

70. A method of evaluating the effect of a Caliban bioactive agent comprising: a) administering the bioactive agent to a subject; b) removing a cell sample from the subject; and c) determining the expression profile of the cell sample.

71. A method according to claim 71 further comprising comparing the expression profile to an expression profile of a healthy subject.

72. A method according to claim 72 wherein the expression profile includes at least one CLBN gene, or ortholog thereof.

73. An array of probes, comprising a support bearing a plurality of nucleic acid probes complementary to a plurality of mRNAs fewer than 1000 in number, wherein the plurality of mRNA probes includes an mRNA expressed by at least one CLBN gene, or ortholog thereof.

74. The array of claim 74, wherein the probes are cDNA sequences.

75. The array of claim 74, comprising a plurality of sets of probes, each set of probes complementary to subsequences from a mRNA.

76. A pharmaceutical composition comprising a compound of claim 42 or claim 45; and a pharmaceutically acceptable carrier.

77. A transgenic non-human animal comprising a heterologous nucleic acid, wherein the nucleic acid comprises a loss-of-function allele of a Caliban gene, and the animal exhibits a phenotype, relative to a wild-type phenotype, comprising a characteristic of inhibition of Caliban activity or inhibition of Caliban induced nuclear export or both.

78. The transgenic non-human animal of claim 78 wherein the phenotype of the Caliban mutant animal is characteristic of cancer susceptibility, delayed development, or hypertrophy of the nervous system.

79. The transgenic non-human animal of claim 78, wherein the animal is a mouse or a rat.

80. A cell or cell line derived from a transgenic non-human animal according to claim 78.

81. An in vitro method of screening for a modulator of Caliban activity, the method comprising: contacting a cell or cell line according to claim 81 with a test compound; and detecting an increase or a decrease in the amount of Caliban production, or Caliban-induced nuclear export mediation activity, thereby identifying the test compound as a modulator of Caliban activity.

82. An in vivo method of screening for a modulator of Caliban activity, the method comprising: contacting a cell or cell line according to claim 81 with a test compound; and detecting an increase or a decrease in the amount of Caliban production, or Caliban-induced nuclear export mediation activity, thereby identifying the test compound as a modulator of Caliban activity.

83. An isolated nucleic acid molecule from the Prospero protein, used for a biomarker for Caliban activity comprising a nucleic acid sequence that encodes amino acid residues 4730 to 4744 and 4832 to 4900 in Figure 5, wherein the encoded amino acid residues 1215 to 1219 and 1249 to 1271 is a nuclear export signal.

84. A biomarker for measuring Caliban activity, wherein the biomarker is encoded by an amino acid sequence comprising the sequence as shown in Figure 5, or a conservatively modified variant of an amino acid sequence GMAPTSSTLTPMHLRKAKLMFFWVRYPS (SEQ ID NO: 9), wherein the conservatively modified variant is a biomarker for measuring Caliban function.

85. The biomarker of claim 85, wherein the measuring Caliban function includes determining the subcellular localization of the biomarker.

86. A fusion protein comprising the biomarker of claim 85.

87. The fusion protein of claim 87, further comprising a detectable label.

88. An isolated polynucleotide encoding a fusion protein according to claim 87.

89. The biomarker of claim 85, wherein the biomarker is the nuclear export signal from the homeodomain transcription factor Prospero.

90. A method for determining Caliban activity in a cell comprising:

(a) introducing a recombinant expression construct encoding the biomarker of claim 85 into a cell;

(b) assaying for the presence of the recombinant expression construct in the cytoplasm and nucleus of the cell; wherein the determining Caliban activity is based on Caliban's ability to alter the cellular localization of the biomarker.

91. The method of claim 91, wherein the cellular localization of the biomarker is in the cytoplasm of the cell, thereby indicating Caliban is active.

92. The method of claim 91, wherein the cellular localization of the biomarker is in the nucleus of the cell, thereby indicating Caliban is inactive.

93. The method of claim 91, wherein the cell is a mammalian cell.

94. The method of claim 94, wherein the mammalian cell is a human cell.

95. The method of claim 95, wherein the mammalian cell is a cancer cell.

96. The method of claim 96, wherein the cancer cell is a lung cancer cell.

97. A method of modulating cancer in a subject, the method comprising the step of administering to the subject a therapeutically effective amount of a nucleic acid encoding the recombinant polynucleotide of claim 1.

98. The method of claim 98, wherein the cancer is lung cancer.

99. A method of modulating cancer in a subject, the method comprising the step of administering to the subject a therapeutically effective amount of a nucleic acid encoding Caliban, wherein the nucleic acid hybridizes under stringent conditions to a nucleic acid encoding a polypeptide having an amino acid sequence as shown in Figure 5.

100. A method for modulating cancer in a subject comprising the steps of: (a) isolating cells to be implanted into said subject (b) introducing into the cells the recombinant expression system of claim 26; and (c) implanting the cells containing the recombinant expression system into said subject.

101. The method of claim 36, wherein the cells express wildtype Caliban.

102. The method of claim 36, wherein the cells do not express wildtype Caliban.

103. A method for modulating cancer in a subject in need thereof, comprising:

(a) selecting the patient in need thereof;

(b) isolating cells from the patient, wherein the cells do not express functional Caliban and introducing into the cells a first nucleotide sequence encoding the polynucleotide of claim 1, wherein the first nucleotide sequence is independently operatively linked to transcription controlling nucleotide sequences in the isolated cells; and

(c) readministering the cells to the patient.

104. A method for modulating cancer in a subject in need thereof, comprising:

(a) selecting the patient in need thereof;

(b) isolating cells from the patient, wherein the cells do not express Caliban; and introducing into the cells the recombinant expression system of claim 26; and

(c) readministering the cells to the patient.

105. An expression cassette comprising a polynucleotide encoding a polypeptide having the sequence as shown in Figure 5, wherein said polynucleotide is under the control of a promoter operable in eukaryotic cells.

106. The expression cassette of claim 106, wherein said promoter is heterologous to the coding sequence.

107. The expression cassette of claim 107, wherein said promoter is a tissue specific promoter.

108. The expression cassette of claim 108, wherein said promoter is an inducible promoter.

109. The expression cassette of claim 109, wherein said expression cassette is contained in a viral vector.

110. The expression cassette of claim 110, wherein said viral vector is selected from the group consisting of a retroviral vector, an adenoviral vector, and adeno-associated viral vector, a vaccinia viral vector, and a herpesviral vector.

111. The expression cassette of claim 106, wherein said expression cassette further comprises a polyadenylation signal.

112. A cell comprising an expression cassette comprising a polynucleotide encoding a polypeptide having the sequence as shown in Figure 5, wherein said polynucleotide is under the control of a promoter operable in eukaryotic cells, said promoter being heterologous to said polynucleotide.

Description:

TUMOR SUPPRESSOR GENE CALIBAN

FIELD

[0001] This invention relates to isolated nucleic acid and amino acid sequences of human tumor suppressors, antibodies to such tumor suppressors, methods of detecting such nucleic acids and proteins, methods of screening for modulators of tumor suppressors, and methods of diagnosing and treating tumors with such nucleic acids and proteins.

BACKGROUND

[0002] Certain tumors, benign, premalignant, and malignant, are known to have genetic components. Some of these tumors are caused by mutations or inactivation of "tumor suppressor" genes. In normal cells, the tumor suppressor genes are involved in the regulation of cell growth and proliferation and in the control of cellular aging, anchorage dependence and apoptosis. When the tumor suppressor genes are mutated or inactivated, cells are transformed and become immortalized or tumorigenic. These transformed cells can be reverted back to the normal phenotype (i.e., the cell growth rate is suppressed) by introducing the wildtype suppressor genes.

[0003] The first tumor suppressor gene identified was the nuclear phosphoprotein, retinoblastoma gene (Rb). Retinoblastoma is a malignant tumor of the sensory layer of the retina, and often occurs bilaterally during childhood. Retinoblastoma exhibits a familial tendency, but it can be acquired. Mutations in the Rb gene and inactivation of its product have been shown to be involved in other tumors, such as bladder, breast, small cell lung carcinomas, osteosarcomas, and soft tissue sarcomas. It was demonstrated that reconstitution of Rb-deficient tumor cells with the wildtype Rb leads to the suppression of growth rate or tumorigenicity (Huang et al, Science 242: 1563-1566, 1988). This result provides direct evidence that Rb protein is a tumor suppressor.

[0004] Another well-characterized tumor suppressor is the gene for the nuclear phosphoprotein, p53. More than half of all human cancers are associated with mutations in the tumor suppressor gene p53 (see, e.g., Hollstein et al, Science 253: 49-53, 1991; Caron et al, Genes Chromosome. Cancer 4: 1-15; Harris et al, N. Engl. J. Med. 329: 1318-1327, 1993; Greenblatt et al, Cancer Res. 54: 4855-4878, 1994). Mutations in p53 often appear to be a critical step in the pathogenesis and progression of tumors. For example, missense mutations of p53 occur in tumors of the colon, lung, breast, ovary, bladder, and several other organs. Alternatively, inactivation of the wildtype p53 proteins in cells can cause tumors. For example, certain strains of human papillomavirus (HPV) are known to interfere with the p53 protein function, because the virus produces a protein, E6, which promotes the degradation of the p53 protein.

[0005] Relatively few tumor suppressor genes have been identified, given the number of recessive mutations that have been associated with neoplasias. Since tumor suppressor genes may function in a cell-specific manner, the few already identified may not be useful in treating all neoplasias. There is therefore a continuing need to identify and isolate other tumor suppressor genes as diagnostic and therapeutic agents for identification and treatment of neoplasias and other diseases.

SUMMARY

[0006] The present invention thus provides for the first time nucleic acid and amino acid sequences of a new tumor suppressor gene called Caliban, methods of detecting such nucleic acids and proteins, methods of screening for modulators of Caliban, and methods of diagnosing and treating tumors. Caliban nucleic acids and proteins are tumor suppressors that play a key role in regulation of cell proliferation and tumor suppression.

[0007] In one aspect, the invention provides a polynucleotide, wherein said polynucleotide is (a) a polynucleotide that has the sequence as shown in Figure 5; (b) a polynucleotide that hybridizes under stringent hybridization conditions to (a) and encodes a polypeptide having the sequence as shown in Figure 5; or (c) a polynucleotide that hybridizes under stringent hybridization conditions to (a) and encodes a polypeptide with at least 25 contiguous residues of the polypeptide as shown in Figure 5; or (d) a polynucleotide that hybridizes under stringent hybridization conditions to (a) and has at least 12 contiguous bases identical to or exactly complementary to as shown in Figure 5, wherein the polynucleotide encodes a polypeptide having nuclear export mediation activity. In some such aspects, the nuclear export mediation activity results in tumor suppression. In other such aspects, the polynucleotide encodes a polypeptide comprising the sequence as shown in Figure 5. In some such aspects, the polynucleotide comprises the sequence as shown in Figure 5 or its complement.
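The "at least 12 contiguous bases identical to or exactly complementary to" criterion in item (d) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names, the default window of 12, and the reading of "exactly complementary" as a match on the reverse-complement strand are choices made here, not definitions from the application, and the sketch says nothing about hybridization stringency or nuclear export activity.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def shares_contiguous_run(candidate: str, reference: str, k: int = 12) -> bool:
    """True if `candidate` contains at least `k` contiguous bases that are
    identical to, or exactly complementary to, a stretch of `reference`.

    Hypothetical helper for illustration; `k` defaults to 12 to mirror the
    "at least 12 contiguous bases" wording.
    """
    candidate = candidate.upper()
    reference = reference.upper()
    # Index every k-length window of the reference for constant-time lookup.
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    # Check the candidate as written and as its reverse complement.
    for strand in (candidate, reverse_complement(candidate)):
        for i in range(len(strand) - k + 1):
            if strand[i:i + k] in ref_kmers:
                return True
    return False
```

A candidate shorter than `k` bases can never satisfy the check, so the function returns False for it without raising.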

[0008] In another aspect, the invention provides an isolated Caliban polynucleotide encoding a polypeptide comprising a sequence at least 40% identical to the sequence as shown in Figure 5 and having nuclear export mediation activity.

[0009] In another aspect, the invention provides an isolated polynucleotide comprising a nucleotide sequence having at least 60% identity to the sequence as shown in Figure 5, or a complement thereof and having nuclear export mediation activity.
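As a rough illustration of how a percent-identity threshold such as "at least 60% identity" might be evaluated, the sketch below counts matching columns in two sequences that have already been aligned. The assumptions are loud ones: the alignment step itself is not shown, gaps are written as "-", and identity is taken as matching columns divided by aligned columns. This passage does not state which alignment program or identity formula is intended, so the function name and the formula are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for illustration: gaps are '-' characters, columns where both
    sequences have a gap are ignored, and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

Under these assumptions, a candidate would meet the threshold in this paragraph when `percent_identity(...) >= 60.0` against the aligned reference sequence.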

[0010] In another aspect, the invention provides an isolated polypeptide comprising a nucleotide sequence that has at least 90% sequence identity to the sequence as shown in Figure 5 and is immunologically cross-reactive with the amino acid sequence as shown in Figure 5 or shares a biological function with native Caliban.

[0011] In another aspect, the invention provides vectors, such as expression vectors, comprising a polynucleotide sequence of the invention. In other aspects, the invention provides host cells or progeny of the host cells comprising a vector of the invention. In certain aspects, the host cell is a eukaryote. In other aspects, the expression vector comprises a Caliban polynucleotide in which the nucleotide sequence of the polynucleotide is operatively linked with a regulatory sequence that controls expression of the polynucleotide in a host cell. In certain aspects, the invention provides a host cell comprising a Caliban polynucleotide, wherein the nucleotide sequence of the polynucleotide is operatively linked with a regulatory sequence that controls expression of the polynucleotide in a host cell, or progeny of the cell. The nucleotide sequence of the polynucleotide can be operatively linked to the regulatory sequence in a sense or antisense orientation.

[0012] In another aspect, the invention provides an isolated DNA that encodes a Caliban protein as shown in Figure 5.

[0013] In another aspect, the invention provides a Caliban polynucleotide that is an antisense polynucleotide. In one such aspect, the antisense polynucleotide is less than about 200 bases in length. In other such aspects, the invention provides an antisense oligonucleotide complementary to a messenger RNA comprising the polypeptide sequences shown in Figure 5 and encoding Caliban, wherein the oligonucleotide inhibits the expression of Caliban.

[0014] In another aspect, the invention provides a method of producing a polypeptide comprising (i) culturing the host cell as described herein under conditions such that the polypeptide is expressed; and (ii) recovering the polypeptide from the cultured host cell or its culture medium.

[0015] In another aspect, the invention provides an isolated Caliban polypeptide encoded by a Caliban polynucleotide. In some aspects, the Caliban polypeptide has 40% sequence identity to the amino acid sequence as shown in Figure 5 and has Caliban activity. In some such aspects, the Caliban polypeptide comprises the amino acid sequence as shown in Figure 5. In other such aspects, the Caliban polypeptide is fused with a heterologous polypeptide.

[0016] In another aspect, the invention provides an isolated antibody or antibody composition that specifically binds to a polypeptide having the amino acid sequence as shown in Figure 5. In some such aspects, the antibody is monoclonal. In other such aspects, the antibody is polyclonal. In some aspects, the antibodies of the present invention are labeled. In some such aspects, the isolated antibodies of the present invention are conjugated to a toxic or non-toxic moiety. In one such aspect, the isolated antibodies of the present invention are neutralizing antibodies. In other aspects, the invention provides hybridomas capable of secreting the antibodies of the present invention.

[0017] In another aspect, the invention provides a method for identifying a compound or agent that binds to a Caliban polypeptide comprising: (i) contacting a Caliban polypeptide as described above with the compound or agent under conditions which allow binding of the compound to the Caliban polypeptide to form a complex and (ii) detecting the presence of the complex.

[0018] In another aspect, the invention provides a method of detecting a Caliban polypeptide in a sample, comprising: (i) contacting the sample with an antibody of the invention, and (ii) determining whether a hybridization complex has been formed between the antibody and the Caliban polypeptide.

[0019] In another aspect, the invention provides a method of detecting a Caliban polypeptide in a sample, comprising: (i) contacting the sample with a polynucleotide of the invention as described above or a polynucleotide that comprises a sequence of at least 12 nucleotides and is complementary to a contiguous sequence of the polynucleotide; and (ii) determining whether a hybridization complex has been formed. In some such aspects, the method is used to diagnose a disease. In other such aspects, the method is used to diagnose a disease or disorder associated with cancer.
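The probe criterion recited above (at least 12 nucleotides, complementary to a contiguous stretch of the target) can be checked computationally. The following Python sketch assumes DNA sequences written 5'->3' and requires perfect complementarity over the whole probe. That is a simplifying assumption, since hybridization in practice tolerates some mismatches depending on stringency. The function names are hypothetical.

```python
# Hypothetical helpers for illustration; not part of the application.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_DNA_COMPLEMENT)[::-1]


def is_complementary_probe(probe: str, target: str, min_len: int = 12) -> bool:
    """True if the probe is at least min_len nucleotides long and is
    perfectly complementary to some contiguous stretch of the target."""
    if len(probe) < min_len:
        return False
    return reverse_complement(probe) in target.upper()
```

A probe built as `reverse_complement(target[3:17])` would pass this check against `target`, while an 8-nucleotide probe would fail on length alone.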

[0020] In another aspect, the invention provides a method of detecting a Caliban nucleotide in a sample, comprising: (i) using a polynucleotide that comprises a sequence of at least 12 nucleotides and is complementary to a contiguous sequence of a polynucleotide as described above, in an amplification process, and (ii) determining whether a specific amplification product has been formed. In some such aspects, the method is used to diagnose a disease. In other such aspects, the method is used to diagnose a disease or disorder associated with cancer. In some such aspects, the method is used to diagnose lung cancer.

[0021] In another aspect, the invention provides a pharmaceutical composition comprising a Caliban polynucleotide, or a Caliban polypeptide or a Caliban antibody and a pharmaceutically acceptable carrier.

[0022] In another aspect, the invention provides a method of modulating Caliban activity in a subject, comprising administering to the subject a therapeutically effective amount of the pharmaceutical composition as described herein. In some methods, the Caliban activity is nuclear export mediation activity. In other methods, the nuclear export mediation activity is tumor suppression.

[0023] In another aspect, the invention provides a method of screening bioactive agents comprising: a) providing a cell that expresses a Caliban gene as set forth in Figure 5 or ortholog thereof, or fragment thereof; b) adding a bioactive agent candidate to the cell; and c) determining the effect of the bioactive agent candidate on the expression of the Caliban gene. In some methods, the determining comprises comparing the level of expression in the absence of the bioactive agent candidate to the level of expression in the presence of the bioactive agent candidate.
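The comparison step described above (expression level without the candidate versus with it) is often summarized as a log2 fold change. The sketch below is one such summary, offered as an assumption; the application does not prescribe a particular statistic, and the function name and input units are hypothetical.

```python
import math


def log2_fold_change(level_without_agent: float, level_with_agent: float) -> float:
    """log2 of (expression with candidate agent / expression without it).

    Positive values indicate increased expression in the presence of the
    candidate; negative values indicate decreased expression.
    """
    if level_without_agent <= 0 or level_with_agent <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(level_with_agent / level_without_agent)
```

With this convention, a fourfold increase in expression gives 2.0 and a fourfold decrease gives -2.0.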

[0024] In another aspect, the invention provides a method of screening for a bioactive agent capable of binding to a Caliban protein, wherein the Caliban protein is encoded by a nucleic acid encoding a gene as set forth in Figure 5 or ortholog thereof, or fragment thereof, the method comprising: a) combining the Caliban protein and a candidate bioactive agent; and b) determining the binding of the bioactive agent to the Caliban protein.

[0025] In another aspect, the invention provides a method for screening for a bioactive agent capable of modulating the activity of a Caliban protein, wherein the Caliban protein is encoded by a nucleic acid encoding a gene as set forth in Figure 5 or ortholog thereof, or fragment thereof, the method comprising: a) combining the Caliban protein and a candidate bioactive agent; and b) determining the effect of the bioactive agent on the bioactivity of the Caliban protein.

[0026] In another aspect, the invention provides a method of evaluating the effect of a Caliban modulating drug comprising: a) administering the drug to a mammal; b) removing a cell sample from the mammal; and c) determining the expression of the Caliban gene set forth in Figure 5 or ortholog thereof.

[0027] In another aspect the invention provides a method for screening for a bioactive agent capable of interfering with the binding of a Caliban protein or a fragment thereof and an antibody which binds to the Caliban protein or fragment thereof, the method comprising: a) combining a Caliban or fragment thereof, a candidate bioactive agent and an antibody which binds to the Caliban protein or fragment thereof; and b) determining the binding of the Caliban protein or fragment thereof and the antibody.

[0028] In another aspect the invention provides a method for inhibiting the activity of a Caliban protein, wherein the Caliban protein is a gene product of the gene set forth in Figure 5 or ortholog thereof, or a fragment thereof, the method comprising binding an inhibitor to the Caliban protein.

[0029] In another aspect the invention provides a method of treating cancer comprising administering to a subject a modulator of a Caliban protein, wherein the Caliban protein is a gene product of the Caliban gene set forth in Figure 5 or ortholog thereof, or a fragment thereof.

[0030] In another aspect the invention provides a method of neutralizing the effect of a Caliban protein, or a functional fragment thereof, comprising contacting an agent specific for the protein, or a functional fragment thereof, with the protein in an amount sufficient to effect neutralization.

[0031] In another aspect the invention provides a method of treating cancer in a subject comprising administering to the subject a nucleic acid molecule that hybridizes under stringent conditions to the Caliban gene as shown in Figure 5 or ortholog thereof, or fragment thereof, and attenuates expression of the Caliban gene. In some such methods, the nucleic acid molecule is an antisense oligonucleotide. In other such methods, the nucleic acid molecule is a double stranded RNA molecule. In some such methods, the nucleic acid molecule is a DNA molecule comprising a nucleotide sequence encoding an shRNA molecule. In other such methods, the double stranded RNA molecule is short interfering RNA (siRNA) or short hairpin RNA (shRNA).

[0032] In another aspect, the invention provides a method of inhibiting expression of a Caliban gene or its ortholog as shown in Figure 5 comprising the steps of (i) providing a biological system in which expression of a Caliban gene as shown in Figure 5 or ortholog thereof to be inhibited; and (ii) contacting the system with a double stranded RNA molecule that hybridizes to a transcript encoding the protein translated from the gene; and (iii) inhibiting expression of the gene encoding the protein.

[0033] In another aspect the invention provides a compound comprising a double stranded RNA having a nucleotide sequence that hybridizes under stringent conditions to a Caliban gene as shown in Figure 5 or an ortholog thereof, and attenuates expression of said Caliban gene. In some methods, the double stranded RNA hybridizes to an untranslated sequence of the Caliban gene. In other methods, the double stranded RNA hybridizes to an intron sequence of the Caliban gene.

[0034] In another aspect, the invention provides a compound that inhibits a Caliban gene, the compound comprising an oligonucleotide that interacts with a Caliban gene ortholog having at least about 40% sequence similarity to the ortholog. In some aspects, the oligonucleotide interacts with a gene product encoded by an ortholog of a Caliban gene as shown in Figure 5 having at least about 40% sequence similarity to the ortholog. In some aspects the oligonucleotide interacts with a gene product encoded by the ortholog of a Caliban gene having at least about 70% sequence similarity to the ortholog. In other aspects, the compound is at least one of: a single-stranded DNA oligonucleotide, double-stranded DNA oligonucleotide, a single-stranded RNA oligonucleotide, double-stranded RNA oligonucleotide, and modified variants of these.

[0035] In another aspect, the invention provides a method for treating cancer comprising: providing a subject at risk of or suffering from cancer and administering a compound that modulates activity or abundance of a Caliban gene as shown in Figure 5 or ortholog thereof.

[0036] In another aspect the invention provides a kit comprising the oligonucleotide as described herein and one or more items selected from the group consisting of: packaging and instructions for use, a buffer, nucleotides, a polymerase, an enzyme, a positive control sample, a negative control sample, and a negative control primer or probe.

[0037] In another aspect the invention provides a method of evaluating the effect of a Caliban bioactive agent comprising: a) administering the bioactive agent to a mammal; b) removing a cell sample from the mammal; and c) determining the expression profile of the cell sample. In some aspects, the methods further comprise comparing the expression profile to an expression profile of a healthy individual. In some methods, the expression profile includes at least one CLBN gene, or ortholog thereof.

[0038] In another aspect, the invention provides an array of probes, comprising a support bearing a plurality of nucleic acid probes complementary to a plurality of mRNAs fewer than 1000 in number, wherein the plurality of mRNA probes includes an mRNA expressed by at least one CLBN gene, or ortholog thereof. In some arrays, the probes are cDNA sequences. In some such arrays, the arrays comprise a plurality of sets of probes, each set of probes complementary to subsequences from a mRNA.

[0039] In another aspect, the invention provides a transgenic non-human animal comprising a heterologous nucleic acid, wherein the nucleic acid comprises a sequence having at least 90% sequence identity to the sequence as set forth in Figure 5, wherein said animal exhibits a phenotype, relative to a wild-type phenotype comprising a characteristic of susceptibility to cancer, developmental delay, hypertrophy of the nervous system, or a combination of any two or more thereof. Some such transgenic non-human animals are mice or rats. A cell or cell lines are derived from the transgenic non-human animals as described herein.

[0040] In another aspect, the invention provides an in vitro method of screening for a modulator of disease or disorder, said method comprising: contacting a cell or cell line as described herein with a test compound; and detecting an increase or a decrease in the amount of cytoplasmic/nuclear EYFP-HDA; thereby identifying the test compound as a modulator of Caliban gene activity.

[0041] In another aspect, the invention provides an in vivo method of screening for a modulator of disease or disorder, said method comprising: contacting a cell or cell line as described herein with a test compound; and detecting an increase or a decrease in the amount of cytoplasmic/nuclear EYFP-HDA; thereby identifying the test compound as a modulator of Caliban gene activity.

[0042] In another aspect, the invention provides an in vivo method for screening for a modulator of a disease or disorder, said method comprising: contacting a transgenic non-human animal as described herein with a test compound; and detecting an increase or a decrease in the amount or severity of the disease or disorder; wherein the increase or the decrease identifies the test compound as a modulator of the disease or disorder. In some such methods, the disease or disorder is cancer.

[0043] In another aspect, the invention provides an in vivo method to identify a genetic modulator of a disease or disorder, said method comprising: inserting a Caliban gene into one or more cells of a transgenic non-human animal as described herein; and detecting an increase or a decrease in the amount or severity of the disease or disorder; wherein the increase or decrease identifies the Caliban gene as a genetic modulator of a disease or disorder. In some such methods, the disease or disorder is cancer.

[0044] In another aspect, the invention provides an in vivo method to identify a genetic modulator of disease or disorder, said method comprising: mating a first transgenic non-human animal as described herein with a second non-human animal of a sex opposite of the first transgenic non-human animal, wherein the second non-human animal is selected from the group consisting of an inbred non-human animal strain, a randomly mutagenized non-human animal, a transgenic non-human animal, and a knockout non-human animal; and selecting an offspring of the mating that exhibits an increase or a decrease in the amount or severity of the disease or disorder, thereby identifying a genetic modulator of the disease or disorder.

[0045] In another aspect, the invention provides an in vivo method to identify a genetic modulator of a disease or disorder, said method comprising: (i) mating a first transgenic non-human animal as described herein with a second non-human animal of a sex opposite of the first transgenic non-human animal, wherein the second non-human animal is a randomly mutagenized non-human animal; (ii) mating two offspring of the mating of step (i); and (iii) identifying offspring of the mating of step (ii) that carry two mutated alleles of a nucleic acid having at least 90% identity with the sequence as set forth in Figure 5 and that exhibit an increase or a decrease in the amount or severity of the disease or disorder.

[0046] In another aspect, the invention provides an in vivo method to identify a genetic modulator of a disease or disorder, said method comprising: (i) mating a first transgenic non-human animal as described herein with a second non-human animal of a sex opposite of the first transgenic non-human animal, wherein the second non-human animal is a randomly mutagenized non-human animal; (ii) mating an offspring of the mating of step (i) with a transgenic non-human animal as described herein; and (iii) identifying offspring of the mating of step (ii) that carry two mutated alleles of a nucleic acid having at least 90% identity with the sequence as set forth in Figure 5 and that exhibit an increase or a decrease in the amount or severity of the disease or disorder.

[0047] In another aspect, the invention provides an in vivo method to identify a genetic modulator of a disease or disorder, said method comprising: (i) mating a first transgenic non-human animal as described herein with a second non-human animal of a sex opposite of the first transgenic non-human animal, wherein the second non-human animal is a randomly mutagenized non-human animal; (ii) mating an offspring of the mating of step (i) with a randomly mutagenized non-human animal; and (iii) identifying offspring of the mating of step (ii) that carry a mutated allele of a nucleic acid having at least 90% identity with the sequence as set forth in Figure 5 and that exhibit an increase or a decrease in nuclear transport activity of HDA, thereby identifying a genetic modulator of the disease or disorder.

[0048] In another aspect, the invention provides a knockout non-human animal, wherein an endogenous gene sequence comprising a nucleic acid sequence having at least 90% sequence identity to the amino acid sequence in Figure 5 is disrupted so as to produce a phenotype comprising a characteristic of susceptibility to cancer, delayed development, slight hypertrophy of the nervous system, or a combination of any two or more thereof. Some such knockout non-human animals are mice or rats. A cell or cell line is also derived from a knockout non-human animal as described herein.

[0049] In another aspect, the invention provides an in vivo method of screening for a modulator of Caliban activity, the method comprising: contacting a cell or cell line as described herein with a test compound; and detecting an increase or a decrease in the amount of Caliban production, or Caliban-induced nuclear export mediation activity, thereby identifying the test compound as a modulator of Caliban activity.

[0050] In another aspect, the invention provides an in vivo method for screening for a modulator of a disease or disorder, said method comprising: contacting a knockout non-human animal as described herein with a test compound; and detecting an increase or a decrease in the amount or severity of the disease or disorder; wherein the increase or the decrease identifies the test compound as a modulator of the disease or disorder, or the Caliban defect.

[0051] In another aspect, the invention provides an inbred mouse comprising a genome that is homozygous for a nucleic acid sequence encoding a polypeptide having at least 95% sequence identity to the sequence as shown in Figure 5. In some such inbred mice, the polypeptide comprises a sequence as set forth in Figure 5. In other such inbred mice, the mouse has a phenotype comprising a characteristic of cancer and lung cancer model. A cell or cell line is derived from the inbred mice as described herein.

[0052] In another aspect, the invention provides an in vitro method of screening for a modulator of Caliban activity, the method comprising: contacting a cell or cell line as described herein with a test compound; and detecting an increase or a decrease in the amount of Caliban production, or Caliban-induced nuclear export mediation activity, thereby identifying the test compound as a modulator of Caliban activity.

[0053] In another aspect, the invention provides a non-human transgenic animal having a knockout mutation in one or both alleles encoding a polypeptide substantially identical to a Caliban polypeptide.

[0054] In another aspect, the invention provides a transgenic knockout mouse whose genome comprises a homozygous disruption in its endogenous Caliban gene, wherein said homozygous disruption prevents the expression of a functional Caliban protein, resulting in a transgenic knockout mouse in which the nuclear transport activity is inhibited, as compared to the nuclear transport activity of a wild type mouse.

[0055] In another aspect, the invention provides a transgenic knockout mouse whose genome comprises a homozygous disruption in its endogenous Caliban gene, wherein said homozygous disruption prevents the expression of a functional Caliban protein, resulting in a transgenic knockout mouse which has a decreased production of Caliban as compared to Caliban production in a wild type mouse.

[0056] In other aspects for the transgenic knockout mouse as described herein, the homozygous disruption results from deletion or mutation of portions of the endogenous Caliban gene whereby a non-functional gene product or complete absence of the gene product is produced.

[0057] In another aspect, the invention provides a transgenic non-human animal comprising a heterologous nucleic acid, wherein the nucleic acid comprises a loss-of-function allele of a Caliban gene, and the animal exhibits a phenotype, relative to a wild-type phenotype, comprising a characteristic of inhibition of Caliban activity or inhibition of Caliban induced nuclear export or both. In some transgenic non-human animal, the phenotype of the Caliban mutant animal is characteristic of cancer susceptibility, delayed development, or hypertrophy of the nervous system. Some transgenic non-human animals are mice or rats. In some transgenic non-human animals, a cell or cell line is derived from the transgenic non-human animal.

[0058] In another aspect, the invention provides an in vitro method of screening for a modulator of Caliban activity, the method comprising: contacting a cell or cell line as described herein with a test compound; and detecting an increase or a decrease in the amount of Caliban production, or Caliban-induced nuclear export mediation activity, thereby identifying the test compound as a modulator of Caliban activity.

[0059] In another aspect, the invention provides an in vivo method of screening for a modulator of Caliban activity, the method comprising: contacting a cell or cell line as described herein with a test compound; and detecting an increase or a decrease in the amount of Caliban production, or Caliban-induced nuclear export mediation activity, thereby identifying the test compound as a modulator of Caliban activity.

[0060] In another aspect, the invention provides an isolated nucleic acid molecule for a biomarker for Caliban activity comprising a nucleic acid sequence that encodes amino acid residues 4730 to 4744 and 4832 to 4900 in Figure 5, wherein the encoded amino acid residues 1215 to 1219 and 1249 to 1271 is a nuclear export signal.

[0061] In another aspect, the invention provides a biomarker for measuring Caliban activity, wherein the biomarker is encoded by an amino acid sequence comprising the sequence as shown in Figure 5, or a conservatively modified variant of an amino acid sequence GMAPTSSTLTPMHLRKAKLMFFWVRYPS (SEQ ID NO: 9), wherein the conservatively modified variant is a biomarker for measuring Caliban function. In some such biomarkers, the measuring Caliban function includes determining the subcellular localization of the biomarker.

In another aspect, the invention provides a fusion protein comprising the biomarker as described herein. Some such fusion proteins further comprise a detectable label.
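The coordinates recited in paragraphs [0060] and [0061] can be cross-checked with simple arithmetic. The sketch below counts the residues in the two spans 1215 to 1219 and 1249 to 1271 and compares the total with the length of SEQ ID NO: 9 as printed above. It also tests one possible reading of the figures 4730 to 4744 and 4832 to 4900 in paragraph [0060] as nucleotide positions. That reading is an assumption made here for illustration, since the paragraph labels them amino acid residues.

```python
def span_length(start: int, end: int) -> int:
    """Number of positions in an inclusive, 1-based coordinate span."""
    return end - start + 1


# Sequence as printed in paragraph [0061].
SEQ_ID_NO_9 = "GMAPTSSTLTPMHLRKAKLMFFWVRYPS"

# Residue spans of the nuclear export signal given in paragraph [0060].
RESIDUE_SPANS = [(1215, 1219), (1249, 1271)]

# Larger coordinates from paragraph [0060], read here (as an assumption)
# as nucleotide positions.
NUCLEOTIDE_SPANS = [(4730, 4744), (4832, 4900)]

residue_counts = [span_length(s, e) for s, e in RESIDUE_SPANS]
nucleotide_counts = [span_length(s, e) for s, e in NUCLEOTIDE_SPANS]
# Three nucleotides per codon, one codon per residue.
codon_counts = [n // 3 for n in nucleotide_counts]
```

Under that reading, the two nucleotide spans are whole numbers of codons and encode exactly as many residues as the two residue spans contain, and the residue total equals the length of SEQ ID NO: 9.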

[0062] In some such biomarkers, the invention provides an isolated polynucleotide encoding a biomarker fusion protein. In some such biomarkers, the biomarker is the nuclear export signal from the homeodomain transcription factor Prospero.

[0063] In another aspect, the invention provides a method for determining Caliban activity in a cell comprising: (a) introducing a recombinant expression construct encoding the biomarker as described herein into a cell; (b) assaying for the presence of the recombinant expression construct in the cytoplasm and nucleus of the cell; wherein the determining Caliban activity is based on Caliban's ability to alter the cellular localization of the biomarker.

[0064] In some such methods, the cellular localization of the biomarker is in the cytoplasm of the cell, thereby indicating Caliban is active. In other such methods, the cellular localization of the biomarker is in the nucleus of the cell, thereby indicating Caliban is inactive. In some methods, the cell is a mammalian cell. In some such methods, the mammalian cell is a human cell. In other such methods, the mammalian cell is a cancer cell. In some such methods, the cancer cell is a lung cancer cell.
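The readout described above (cytoplasmic biomarker indicates active Caliban, nuclear biomarker indicates inactive Caliban) can be expressed as a simple decision rule over measured signal intensities. The following sketch is hypothetical: the use of a cytoplasmic-to-nuclear intensity ratio, the threshold of 1.0, and the function name are all assumptions for illustration, and the application does not specify a quantitative cutoff.

```python
def classify_caliban_activity(
    cytoplasmic_signal: float,
    nuclear_signal: float,
    ratio_threshold: float = 1.0,  # assumed cutoff, not from the application
) -> str:
    """Classify Caliban activity from biomarker localization.

    Predominantly cytoplasmic biomarker -> "active";
    predominantly nuclear biomarker -> "inactive".
    """
    if cytoplasmic_signal < 0 or nuclear_signal < 0:
        raise ValueError("signals must be non-negative")
    if cytoplasmic_signal == 0 and nuclear_signal == 0:
        raise ValueError("no biomarker signal detected")
    if nuclear_signal == 0:
        return "active"
    ratio = cytoplasmic_signal / nuclear_signal
    if ratio > ratio_threshold:
        return "active"
    if ratio < ratio_threshold:
        return "inactive"
    return "indeterminate"
```

In practice a cutoff would need to be calibrated against control cells, for example cells with known Caliban status.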

[0065] In another aspect, the invention provides a method of modulating cancer in a subject, the method comprising the step of administering to the subject a therapeutically effective amount of a nucleic acid encoding the recombinant polynucleotide as described herein. In some methods, the cancer is lung cancer.

[0066] In another aspect, the invention provides a method of modulating cancer in a subject, the method comprising the step of administering to the subject a therapeutically effective amount of a nucleic acid encoding Caliban, wherein the nucleic acid hybridizes under stringent conditions to a nucleic acid encoding a polypeptide having an amino acid sequence as shown in Figure 5.

[0067] In another aspect, the invention provides a method for modulating cancer in a subject comprising the steps of: (a) isolating cells to be implanted into said subject; (b) introducing into the cells the recombinant expression system as described herein; and (c) implanting the cells containing the recombinant expression system into said subject. In some such methods, the cells express wildtype Caliban. In other such methods, the cells do not express wildtype Caliban.

[0068] In another aspect, the invention provides a method for modulating cancer in a subject in need thereof, comprising: (a) selecting the patient in need thereof; (b) isolating cells from the patient, wherein the cells do not express functional Caliban and introducing into the cells a first nucleotide sequence encoding the polynucleotide as described herein, wherein the first nucleotide sequence is independently operatively linked to transcription controlling nucleotide sequences in the isolated cells; and (c) readministering the cells to the patient.

[0069] In another aspect, the invention provides a method for modulating cancer in a subject in need thereof, comprising: (a) selecting the patient in need thereof; (b) isolating cells from the patient, wherein the cells do not express Caliban; and introducing into the cells the recombinant expression system as described herein; and (c) readministering the cells to the patient.

[0070] In another aspect, the invention provides an expression cassette comprising a polynucleotide encoding a polypeptide having the sequence as shown in Figure 5, wherein said polynucleotide is under the control of a promoter operable in eukaryotic cells. In some expression cassettes, the promoter is heterologous to the coding sequence. In some such expression cassettes, the promoter is a tissue specific promoter. In other such expression cassettes, the promoter is an inducible promoter. Some such expression cassettes are contained in a viral vector. In some such expression cassettes, the viral vector is selected from the group consisting of a retroviral vector, an adenoviral vector, an adeno-associated viral vector, a vaccinia viral vector, and a herpesviral vector. Some such expression cassettes further comprise a polyadenylation signal.

[0071] In another aspect, the invention provides a cell comprising an expression cassette comprising a polynucleotide encoding a polypeptide having the sequence as shown in Figure 5, wherein said polynucleotide is under the control of a promoter operable in eukaryotic cells, said promoter being heterologous to said polynucleotide.

BRIEF DESCRIPTION OF THE DRAWINGS

[0072] Figure 1. Caliban binds a functional nuclear export signal. a, Yeast (AH109) were transformed with the indicated clones and streaked onto selective media as shown. The carboxy terminus of Clbn (pGBKT7-Clbn-C) interacts with HD (HD and all of its derivatives were cloned into pGADT7 resulting in the clones, pGADT7-HD, and the like) and HDA but not HDB. b, The interaction of Caliban and the NES is impaired by mutations that disrupt nuclear export. Yeast were co-transformed with Clbn-C and the indicated clones, and plated on selective media at 10-fold serial dilutions. Relative survival under stringent selective conditions indicates the interaction strengths of the expressed proteins. HDA-F4 and HDA-LLL express mutant versions of the NES; HP expresses a masked NES (Bi et al., 2003). c, The carboxy and amino termini of Caliban bind the HD nuclear export signal and Exportin, respectively. Clones were in vitro transcribed and translated in the presence of 35S-methionine, mixed and incubated as indicated, and immunoprecipitated with the indicated antibodies. HA-Clbn-C (lane 1), HA-Clbn-C and Myc-HD (lane 2), Myc-HD (lane 3), HA-Clbn-C (lane 4), HA-Clbn-C and Myc-HD (lane 5), Myc-HD (lane 6), HA-HD (lane 7), HA-HD and Myc-Clbn (lane 8), Myc-Clbn (lane 9), Myc-Clbn-N (lane 10), Myc-Clbn-N and HA-Exportin (lane 11), HA-Exportin (lane 12), Myc-HD and HA-Exportin (lane 13), Myc-HD (lane 14). RanQ69LGTP (Bischoff et al., 1994) was added to the samples used in lanes 11 and 13.

[0073] Figure 2. Caliban is required for the nuclear export of HDA and itself contains functional NESs. a-j, Cells were fixed, stained with Propidium Iodide and examined by confocal microscopy as described elsewhere (Demidenko et al., 2001); EYFP fusion protein is shown in green, DNA in red and overlap in yellow. a-d, Drosophila SL2 tissue culture cells were co-transfected with a clone expressing the green-fluorescent-tagged nuclear export signal, pAc5.1-EYFP-HDA, and RNAi synthesized from the vector pSP73 (a), pSP73-Clbn (b), or pSP73-Emb (c), or with the plasmid pCMV-Clbn-C (d). e-j, Mammalian CV1 cells were transfected with plasmids expressing a deletion series of the amino terminal 435 amino acids of Clbn fused to green fluorescent protein. e, Clbn1/435, f, Clbn51/435, g, Clbn91/435, h, Clbn131/435, i, Clbn181/435, j, Clbn311/435.

[0074] Figure 3. Sdccag1 is inactive in lung cancer cell lines. a, Human lung cell lines were transfected with pEYFP-HDA (HDA), pREV(1.4)-GFP-PKI (PKI) or pEYFP-Pros-NLS+ and fixed, stained and examined as in Figure 2. A primary normal human bronchial epithelial cell line (NHBE) and two immortal "normal" embryonic lung cell lines, WI-38 and IMR-90 (not shown) were compared to five lung cancer cell lines including A549, H23 and HOP92 (HOP62 and EKVX not shown). b, sdccag1 RNA is expressed in most tissues. c, sdccag1 RNAi abrogates nuclear export of EYFP-HDA in WI-38 cells. d, Expression of fly Caliban in A549 cells is sufficient to allow nuclear export of EYFP-HDA. Caliban is shown in blue. Caliban expression and export of EYFP-HDA are independent of the addition of tetracycline, indicating that the repressor is not working efficiently in any of the extant cell lines.

[0075] Figure 4. Fly Caliban is a tumor suppressor in human lung cancer cells. a, b, Expression of fly Caliban in A549 or EKVX lung cancer cells impedes their formation of colonies when grown on soft agar. Bars indicate 100 μm. A549, EKVX and the "non-cancerous" lung cell line WI-38 form numerous large colonies when grown on soft agar. Expression of fly Caliban greatly reduces the size (a) and number of colonies (a, b). c, Expression of fly Caliban reduces the invasiveness of A549 and EKVX cells and inactivation of endogenous Sdccag1 in NHBE cells increases invasiveness. d, Inactivation of Sdccag1 in NHBE cells and expression of fly Clbn in A549 and EKVX cells have minor effects on the cell cycle but do not result in cell cycle arrest or in significant lengthening or shortening of the cell cycle.

[0076] Figure 5. Sequences of Caliban (a-f) and the biomarker HDA (g-m). a, SEQ ID NO: 1, Predicted protein sequence from GenBank including an additional 20 amino acids previously thought to be an intron. b, SEQ ID NO: 2, Corrected nucleotide sequence including 60 nucleotides (145-204 shown in bold) previously thought to constitute intron 2. c, SEQ ID NO: 3, Amino acid sequence of the dominant negative. d, SEQ ID NO: 4, Predicted protein sequence from Clbn as disclosed herein including numerous polymorphisms. e, SEQ ID NO: 5, Nucleotide sequence of the protein described in (d), with polymorphisms and additional 60 nucleotides (see b) shown in bold. f, SEQ ID NO: 6, The dominant negative including polymorphisms shown in (d). g, SEQ ID NO: 7, Original Prospero 1407 amino acid sequence, accession number AAA28841. h, SEQ ID NO: 8, Nucleotide sequence of the long isoform of Prospero, accession number M81389, with the translation start and stop points shown in bold. i, SEQ ID NO: 9, A minimal nuclear export signal (NES) from the short isoform of Prospero, amino acids 1215-1219 and 1249-1271. j, SEQ ID NO: 10, A minimal masked NES from the short isoform of Prospero, amino acids 1215-1219 and 1249-1407. k, SEQ ID NO: 11, A masked NES from the long isoform of Prospero, amino acids 1215-1407. l, SEQ ID NO: 12, A more efficient biomarker results from using a larger piece of the short isoform of Prospero including its nuclear localization signal, from amino acids 987-1219 and 1249-1377. m, SEQ ID NO: 13, The masked version of the biomarker shown in (l), from amino acids 987-1219 and 1249-1407. This serves as a control for (m) and can function itself as a biomarker to identify "unmasking proteins," which can be predicted to be tumor suppressors.

[0077] Figure 6. Disrupting Caliban expression in flies increases their likelihood of developing tumors (arrows). a, Larvae lacking the Caliban gene form numerous melanotic tumors in response to DNA damage by irradiation. b, One line of flies expressing the dominant negative (oncogenic) form of Caliban develops tumors in adults.

DETAILED DESCRIPTION

1. INTRODUCTION

[0078] The present invention provides for the first time nucleic acids and polypeptides of a new tumor suppressor referred to herein as Caliban. The present invention also provides antibodies which selectively bind to a Caliban protein. These nucleic acids and the polypeptides they encode are tumor suppressors. These tumor suppressor nucleic acids and polypeptides are involved in the regulation of cell proliferation and in the control of cellular aging, anchorage dependence, and apoptosis.

[0079] The present invention also provides methods of screening for modulators (e.g., activators, inhibitors, stimulators, enhancers, agonists, and antagonists) of these novel Caliban proteins. Such modulators are useful for pharmacological and genetic modulation of cell growth and tumor suppression. The invention thus provides assays for tumor suppression and cell growth, where Caliban acts as a direct or indirect reporter molecule for measuring the effect of modulators on cell growth or tumor suppression. These assays can measure various parameters that are affected by the Caliban activity, e.g., cell growth on soft agar, contact inhibition and density limitation of growth, growth factor or serum dependence, tumor-specific marker levels, invasiveness into Matrigel, tumor growth in vivo, Caliban protein or mRNA levels, transcriptional activation or repression of a reporter gene, and the like.

[0080] The present invention also provides methods of inhibiting cell proliferation of a cell by transducing the cell with an expression vector containing Caliban nucleic acids. The transduced cell may have a missense or null endogenous Caliban phenotype or a mutation in another tumor suppressor gene. For example, the cell may contain Caliban having a sequence with a missense mutation. Expression of wildtype Caliban restores cell growth regulation and prevents the development of tumors. For example, Caliban nucleic acids can be used to treat cancer or other cell proliferative diseases, such as hyperplasia, in patients.

[0081] Finally, the invention provides for methods of detecting Caliban nucleic acid and protein expression, allowing investigation of cell growth regulation and tumor suppression. Furthermore, Caliban nucleic acid and protein expression can be used to diagnose cancer in patients who have a defect in one or more copies of Caliban in their genome.

[0082] Functionally, Caliban is involved in the regulation of cell proliferation and in the control of cellular aging, anchorage and apoptosis.

[0083] Structurally, the nucleotide sequence of Caliban (see, e.g., Figure 5, isolated from the fruit fly, Drosophila melanogaster) encodes a polypeptide of approximately 992 amino acids with a predicted molecular weight of approximately 112 kDa and a predicted range of 100-140 kDa (see Figure 5). Related Caliban genes from other species share at least about 40% amino acid identity over an amino acid region of at least about 25 amino acids in length, preferably 50 to 100 amino acids in length.

[0084] Specific regions of the Caliban nucleotide and amino acid sequences may be used to identify polymorphic variants, interspecies homologs, and alleles of Caliban. This identification can be made in vitro, e.g., under stringent hybridization conditions or with PCR and sequencing, or by using the sequence information in a computer system for comparison with other nucleotide or amino acid sequences. Typically, identification of polymorphic variants and alleles of Caliban is made by comparing an amino acid sequence of about 25 amino acids or more, preferably 50-100 amino acids. Amino acid identity of approximately at least 40% or above, preferably 80%, most preferably 90-95% or above typically demonstrates that a protein is a polymorphic variant, interspecies homolog, or allele of Caliban. Sequence comparison can be performed using any of the sequence comparison algorithms discussed below. Antibodies that bind specifically to Caliban or a conserved region thereof can also be used to identify alleles, interspecies homologs, and polymorphic variants.
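The window-based identity criterion described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned and trimmed to equal length with no gaps, which a real comparison algorithm would not require. The function names and the default 25-residue window and 40% threshold are illustrative choices taken from the figures in the text, not a prescribed implementation.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length, pre-aligned sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def best_window_identity(a: str, b: str, window: int = 25) -> float:
    """Highest percent identity found over any contiguous window of the given length."""
    if len(a) != len(b):
        raise ValueError("sequences must be of equal length")
    if len(a) < window:
        raise ValueError("sequences are shorter than the comparison window")
    return max(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )


def is_candidate_homolog(a: str, b: str, window: int = 25, threshold: float = 40.0) -> bool:
    """True when some window reaches the identity threshold (hypothetical screening rule)."""
    return best_window_identity(a, b, window) >= threshold
```

A candidate that passes such a screen would still need to be confirmed functionally, as described in the following paragraph.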

[0085] Polymorphic variants, interspecies homologs, and alleles of Caliban are confirmed by examining the effect of putative Caliban expression on cell growth and tumor suppression using the methods and assays described herein. Typically, Caliban having the amino acid sequence as shown in the Figures is used as a positive control. For example, immunoassays using antibodies directed against the amino acid sequence as shown in the Figures can be used to demonstrate the identification of a polymorphic variant or allele of Caliban. Alternatively, Caliban having the nucleic acid sequences as shown in the Figures is used as a positive control, e.g., hybridization on Southern blots with such sequences to demonstrate the identification of a polymorphic variant or allele of Caliban. The polymorphic variants, alleles and interspecies homologs of Caliban are expected to retain the ability to inhibit cell proliferation and to suppress tumors. These functional characteristics can be tested using various assays, such as soft agar assay, contact inhibition and density limitation of growth assay, growth factor or serum dependence assay, tumor-specific marker assay, invasiveness assay, apoptosis assay, G0/G1 cell cycle arrest assay, tumor growth assay, and the like.

[0086] Caliban nucleotide and amino acid sequence information may also be used to construct models of tumor suppressor polypeptides in a computer system. These models are subsequently used to identify compounds that can activate or inhibit Caliban. Such compounds that modulate the activity of Caliban can be used to investigate the role of Caliban in inhibition of cell proliferation and tumor suppression or can be used as therapeutics.

[0087] Isolation of Caliban provides a means for assaying for modulators of Caliban. Caliban is useful for testing modulators using in vivo and in vitro expression assays that measure various parameters, e.g., cell growth on soft agar, contact inhibition and density limitation of growth, growth factor or serum dependence, tumor-specific marker levels, invasiveness into Matrigel, apoptosis, G0/G1 cell cycle arrest, tumor growth in vivo, Caliban protein or mRNA levels, transcriptional activation or repression of a reporter gene, and the like. Such modulators identified using Caliban can be used to study cell growth regulation and tumor suppression, and further to treat cancer.

[0088] Methods of detecting Caliban nucleic acids and expression of Caliban are also useful for diagnosing various cancers or tumors by using assays such as northern blotting, dot blotting, in situ hybridization, RNase protection, and the like. Chromosome localization of the genes encoding human Caliban can also be used to identify diseases, mutations, and traits caused by and associated with Caliban. Techniques such as high density oligonucleotide arrays (GeneChip®) can also be used to screen for mutations, polymorphic variants, alleles and interspecies homologs of Caliban.

[0089] It is to be understood that this invention is not limited to particular methods, reagents, compounds, compositions or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms "a", "an" and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to "a cell" includes a combination of two or more cells, and the like.

[0090] "About" as used herein when referring to a measurable value such as an amount, a temporal duration, and the like, is meant to encompass variations of ±20% or ±10%, more preferably ±5%, even more preferably ±1%, and still more preferably ±0.1% from the specified value, as such variations are appropriate to perform the disclosed methods.

[0091] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred materials and methods are described herein. In describing and claiming the present invention, the following terminology will be used.

[0092] A "neoplasm" is an abnormal mass of tissue resulting from excessive cell division that is uncontrolled and progressive, also called a tumor. Neoplasms are either benign (neither infiltrative nor cancerous) or malignant (invasive). A neoplastic cell is a cell derived from a neoplasm.

[0093] A "tumor" as used herein refers to a neoplasm that may be either malignant or non-malignant. Tumors of the same tissue type are primary tumors originating in a particular organ (such as breast, prostate, bladder or lung). Tumors of the same tissue type may be divided into tumors of different sub-types (a classic example being bronchogenic carcinomas (lung tumors), which can be an adenocarcinoma, small cell, squamous cell, or large cell tumor). Breast cancers can be divided histologically into scirrhous, infiltrative, papillary, ductal, medullary and lobular. "Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor.

[0094] "Tumor suppressor" refers to a nucleic acid, or the polypeptide it encodes, that in its wildtype form has the ability to suppress, prevent, or decrease uncontrolled cell growth and/or cell division. Tumor suppressor genes encode proteins that inhibit progression through the cell cycle, thereby inhibiting cell growth and/or cell division. When DNA damage is detected in a cell, tumor suppressors prevent the cell from continuing to multiply until the damaged DNA is repaired. Alternatively, if the DNA cannot be repaired, they may signal the cell to undergo apoptosis (programmed cell death) in order to prevent the damaged DNA from being passed on to the daughter cells. Tumor suppressors therefore play an important role in preventing the onset of uncontrolled cell growth, or neoplasia. Examples of tumor suppressors are p53, Rb and members of the Caliban family described herein.

[0095] "Cancer" or "malignancy" are used as synonymous terms and refer to any of a number of diseases that are characterized by uncontrolled, abnormal proliferation of cells, the ability of affected cells to spread locally or through the bloodstream and lymphatic system to other parts of the body (i.e., metastasize) as well as any of a number of characteristic structural and/or molecular features. A "cancerous" or "malignant cell" is understood as a cell having specific structural properties, lacking differentiation and being capable of invasion and metastasis. Examples of cancers are breast, lung, brain, bone, liver, kidney, colon, and prostate cancer (see DeVita et al., Cancer Principles and Practice of Oncology, 2001; this reference is herein incorporated by reference in its entirety for all purposes).

[0096] "Cancer-associated" and "cancer-related" refer to the relationship of a nucleic acid and its expression, or lack thereof, or a protein and its level or activity, or lack thereof, to the onset of malignancy in a subject cell. For example, cancer can be associated with or related to the expression of a particular gene that is not expressed, or is expressed at a lower level, in a normal healthy cell. Conversely, a cancer-associated gene can be one that is not expressed in a malignant cell (or in a cell undergoing transformation), or is expressed at a lower level in the malignant cell than it is expressed in a normal healthy cell.

[0097] In the context of cancer, "transformation" refers to the change that a normal cell undergoes as it becomes malignant. In eukaryotes, the term "transformation" can be used to describe the conversion of normal cells to malignant cells in cell culture.

[0098] "Proliferating cells" are those which are actively undergoing cell division and growing exponentially. "Loss of cell proliferation control" refers to the property of cells that have lost the cell cycle controls that normally ensure appropriate restriction of cell division. Cells that have lost such controls proliferate at a faster than normal rate, without stimulatory signals, and do not respond to inhibitory signals.

[0099] "Advanced cancer" means cancer that is no longer localized to the primary tumor site, or a cancer that is Stage III or IV according to the American Joint Committee on Cancer (AJCC).

[0100] "Cell cycle" refers to the cyclic biochemical and structural events occurring during growth of cells. The cell cycle is divided into periods called G0, Gap 1 (G1), DNA synthesis (S), Gap 2 (G2), and mitosis (M).

[0101] As used herein, "Caliban", "CLBN", "Clbn", "human SDCCAG1", "SDCCAG1", "human NY-CO-1", and "NY-CO-1" refer to a family of tumor suppressor nucleic acids or polypeptides. Fly Clbn was cloned in an interaction screen using a noncanonical nuclear export signal (NES) as bait. As described herein, full length Clbn acts as both a bipartite mediator of nuclear export and a tumor suppressor in human lung cancer cells. Fly Caliban is homologous to a protein identified in humans as Sdccag1. Human Sdccag1 was first identified in a screen for mis-expressed proteins in human colon cancer patients; however, only the carboxy (or last) third of the gene was described, and those authors believed that this was the complete gene (Scanlan et al., Int. J. Cancer 76: 652-658, 1998). A second group then identified human Sdccag1 as a factor of unknown function that they associated with one human non-small-cell lung cell line; however, again the authors believed the carboxy terminal third of the gene was the complete gene (Carbonnelle et al., Int. J. Cancer 92: 388-397, 2001). As described herein, full length Clbn can function as a nuclear export mediator, and only expression of full length Caliban acts as a tumor suppressor in human non-small cell lung cancer cells.

[0102] The Drosophila melanogaster Caliban has a molecular weight of approximately 112 kDa while human Caliban has a molecular weight of approximately 123 kDa. This protein is involved in the mediation of nuclear export, regulation of cell growth and proliferation and in the control of cellular aging, anchorage and apoptosis. Human Caliban is mapped to human chromosome 14q22 (Carbonnelle et al., Cytogen. Cell Gen. 86:3-4, 1999). "Caliban protein" or "Clbn protein" or a fragment thereof, or a nucleic acid encoding "Caliban" protein or a fragment thereof, refers to nucleic acid and polypeptide polymorphic variants, alleles, mutants, and interspecies homologs that: (1) have an amino acid sequence that has greater than about 40% amino acid sequence identity, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence encoded by a Caliban nucleic acid or amino acid sequence of a Caliban protein, e.g., a Caliban protein as shown in Figure 5; (2) specifically bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen comprising an amino acid sequence of a Caliban protein, e.g., a Caliban protein as shown in Figure 5 or ortholog, immunogenic fragments thereof, and conservatively modified variants thereof; (3) specifically hybridize under stringent hybridization conditions to an anti-sense strand corresponding to a nucleic acid sequence encoding a Caliban protein, e.g., a Caliban protein as shown in the Figures or ortholog, and conservatively modified variants thereof; or (4) have a nucleic acid sequence that has greater than about 40% nucleotide sequence identity, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or higher nucleotide sequence identity, preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or more nucleotides, to a Caliban nucleic acid, e.g., a nucleic acid encoding a Caliban protein as shown in Figure 5 or ortholog thereof.

[0103] As stated above, the full length Clbn protein is required for it to function as a mediator of nuclear export and a tumor suppressor; however, we have also demonstrated that in isolation the carboxy terminus of the protein has the opposite effect, that is, it blocks nuclear export and has oncogenic or tumor promoting properties. This effect is defined as "dominant negative" because expression of the carboxy terminus, in the absence of the amino terminus, interferes with or blocks the activity of the full length protein.

[0104] A Caliban polynucleotide or polypeptide sequence is typically from a mammal including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or any mammal. Other Caliban polynucleotide or polypeptide sequences are from other organisms, including yeast (e.g., Saccharomyces cerevisiae; also referred to as S. cerevisiae), worms, and insects (e.g., Drosophila melanogaster). The nucleic acids and proteins of the invention include both naturally occurring and recombinant molecules.

[0105] The phrase "Caliban protein or a fragment thereof", or a nucleic acid encoding "Caliban protein or a fragment thereof", refers to nucleic acid and polypeptide polymorphic variants, alleles, mutants, and interspecies homologs that: (1) have an amino acid sequence that has greater than about 40% amino acid sequence identity, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence as shown in Figure 5; (2) specifically bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen comprising an amino acid sequence of a Caliban protein as shown in Figure 5 or ortholog, immunogenic fragments thereof, and conservatively modified variants thereof; (3) specifically hybridize under stringent hybridization conditions to an anti-sense strand corresponding to a nucleic acid sequence encoding a Caliban protein, e.g., a Caliban protein as shown in Figure 5 or ortholog, or their complements, and conservatively modified variants thereof; or (4) have a nucleic acid sequence that has greater than about 40% nucleotide sequence identity, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or higher nucleotide sequence identity, preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or more nucleotides, to the genes and their representative sequences encoding a Caliban protein as shown in Figure 5 or ortholog or their complements.

[0106] Full length "Caliban activity" is defined by its ability to facilitate the movement of target or cargo proteins, which include a specific amino acid sequence tag or NES, from the cell nucleus to the cytoplasm. This can be assayed by placing this NES tag on a readily detectable protein, such as green fluorescent protein, and following its subcellular localization (nuclear vs. cytoplasmic).

[0107] "Cytoplasmic localization activity" within the meaning of the invention means any activity resulting in a rise in a signal (e.g., fluorescence) in the cellular cytoplasm above background levels as may be assayed by methods known in the art, preferably by the method described herein. Preferred is a rise in a signal above background levels of at least 10%, preferably at least 20%, at least 50% or even at least 2-fold.
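As a minimal sketch of how the thresholds in this definition might be applied to a measured signal, the following assumes that the "rise above background" is computed as (signal - background) / background and that "2-fold" means a signal at least twice the background. Both readings are assumptions made for illustration, as are the function names.

```python
from typing import Optional


def signal_rise(signal: float, background: float) -> float:
    """Fractional rise of a cytoplasmic signal above background (0.10 means a 10% rise)."""
    if background <= 0:
        raise ValueError("background must be positive")
    return (signal - background) / background


def localization_tier(signal: float, background: float) -> Optional[str]:
    """Return the most stringent threshold from the definition that the signal meets, or None."""
    rise = signal_rise(signal, background)
    # Tiers are checked from most to least stringent; "2-fold" is read as signal >= 2 x background.
    tiers = [(1.0, "2-fold"), (0.5, "50%"), (0.2, "20%"), (0.1, "10%")]
    for minimum, label in tiers:
        if rise >= minimum:
            return label
    return None
```

For example, a cytoplasmic fluorescence reading of 150 against a background of 100 would fall in the "50%" tier under these assumptions.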

[0108] "Gene" refers to a unit of inheritable genetic material found in a chromosome, such as in a human chromosome. Each gene is composed of a linear chain of deoxyribonucleotides which can be referred to by the sequence of nucleotides forming the chain. Thus, "sequence" is used to indicate both the ordered listing of the nucleotides which form the chain, and the chain which has that sequence of nucleotides. The term "sequence" is used in the same way in referring to RNA chains, linear chains made of ribonucleotides. The gene includes regulatory and control sequences, sequences which can be transcribed into an RNA molecule, and can contain sequences with unknown function. Some of the RNA products (products of transcription from DNA) are messenger RNAs (mRNAs) which initially include ribonucleotide sequences (or sequence) which are translated into a polypeptide and ribonucleotide sequences which are not translated. The sequences which are not translated include control sequences, introns and sequences with unknown function. It can be recognized that small differences in nucleotide sequence for the same gene can exist between different persons, or between normal cells and cancerous cells, without altering the identity of the gene.

[0109] "Gene expression pattern" means the set of genes of a specific tissue or cell type that are transcribed or "expressed" to form RNA molecules. Which genes are expressed in a specific cell line or tissue can depend on factors such as tissue or cell type, stage of development of the cell, tissue, or target organism, and whether the cells are normal or transformed cells, such as cancerous cells. For example, a gene can be expressed at the embryonic or fetal stage in the development of a specific target organism and then become non-expressed as the target organism matures. Alternatively, a gene can be expressed in liver tissue but not in brain tissue of an adult human.

[0110] "Differential expression" refers to both quantitative as well as qualitative differences in the temporal and tissue expression patterns of a gene or a protein. For example, a differentially expressed gene can have its expression activated or completely inactivated in normal versus disease conditions. Such a qualitatively regulated gene can exhibit an expression pattern within a given tissue or cell type that is detectable in either control or disease conditions, but is not detectable in both. Differentially expressed genes can represent "profile genes," or "target genes" and the like.

[0111] Similarly, a differentially expressed protein can have its expression activated or completely inactivated in normal versus disease conditions. Such a qualitatively regulated protein can exhibit an expression pattern within a given tissue or cell type that is detectable in either control or disease conditions, but is not detectable in both. Moreover, differentially expressed proteins can represent "profile proteins", "target proteins" and the like.

[0112] Differentially expressed genes can represent "expression profile genes", which includes "target genes". "Expression profile gene," as used herein, refers to a differentially expressed gene whose expression pattern can be used in methods for identifying compounds useful in the modulation of Caliban activity, or the treatment of disorders, or alternatively, the gene can be used as part of a prognostic or diagnostic evaluation of diseases or disorders, e.g., diseases or disorders associated with cancer (e.g., lung cancer). For example, the effect of the compound on the expression profile gene normally displayed in connection with a particular state, for example, can be used to evaluate the efficacy of the compound to modulate that state, or preferably, to induce or maintain that state. Such assays are further described below. Alternatively, the gene can be used as a diagnostic or in the treatment of cancer as also further described below. In some instances, only a fragment of an expression profile gene is used, as further described below.

[0113] "Expression profile," as used herein, refers to the pattern of gene expression generated from two up to all of the expression profile genes which exist for a given state. As outlined above, an expression profile is in a sense a "fingerprint" or "blueprint" of a particular cellular state; while two or more states have genes that are similarly expressed, the total expression profile of the state will be unique to that state. A "fingerprint pattern", as used herein, refers to a pattern generated when the expression pattern of a series (which can range from two up to all the fingerprint genes that exist for a given state) of fingerprint genes is determined. A fingerprint pattern also can be referred to as an "expression profile". A fingerprint pattern or expression profile can be used in the same diagnostic, prognostic, and compound identification methods as the expression of a single fingerprint gene. The gene expression profile obtained for a given state can be useful for a variety of applications, including diagnosis of a particular disease or condition and evaluation of various treatment regimes. In addition, comparisons between the expression profiles of different Caliban-related diseases or disorders can be similarly informative. An expression profile can include genes which do not appreciably change between two states, so long as at least two genes which are differentially expressed are represented. The gene expression profile can also include at least one target gene, as defined below. Alternatively, the profile can include all of the genes which represent one or more states. Specific expression profiles are described below.

[0114] Gene expression profiles can be defined in several ways. For example, a gene expression profile can be the relative transcript level of any number of a particular set of genes. Alternatively, a gene expression profile can be defined by comparing the level of expression of a variety of genes in one state to the level of expression of the same genes in another state. For example, genes can be either upregulated, downregulated, or remain substantially at the same level in both states.
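The comparison of expression levels between two states can be sketched as follows. The gene names, the expression values in the usage example, and the two-fold cutoff (a log2 ratio of 1.0) for calling a gene upregulated or downregulated are hypothetical; the specification does not set a numerical cutoff.

```python
import math


def classify_expression(state_a: dict, state_b: dict, min_log2_change: float = 1.0) -> dict:
    """Label each gene measured in both states as upregulated, downregulated,
    or unchanged in state_b relative to state_a."""
    labels = {}
    for gene in state_a.keys() & state_b.keys():
        a, b = state_a[gene], state_b[gene]
        if a <= 0 or b <= 0:
            raise ValueError(f"expression level for {gene} must be positive")
        log2_ratio = math.log2(b / a)
        if log2_ratio >= min_log2_change:
            labels[gene] = "upregulated"
        elif log2_ratio <= -min_log2_change:
            labels[gene] = "downregulated"
        else:
            labels[gene] = "unchanged"
    return labels


# Hypothetical relative transcript levels in two states.
normal = {"geneA": 10.0, "geneB": 8.0, "geneC": 5.0}
disease = {"geneA": 40.0, "geneB": 2.0, "geneC": 5.5}
profile = classify_expression(normal, disease)
```

The resulting mapping from gene to label is one simple form the comparative profile described above could take.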

[0115] A "target gene" refers to a nucleic acid, often derived from a biological sample, to which an oligonucleotide probe is designed to specifically hybridize. It is either the presence or absence of the target nucleic acid that is to be detected, or the amount of the target nucleic acid that is to be quantified. The target nucleic acid has a sequence that is complementary to the nucleic acid sequence of the corresponding probe directed to the target. The target nucleic acid can also refer to the specific subsequence of a larger nucleic acid to which the probe is directed or to the overall sequence (e.g., gene or mRNA) whose expression level it is desired to detect. A "target gene", therefore, refers to a differentially expressed gene in which modulation of the level of gene expression or of gene product activity prevents and/or ameliorates a disease or disorder. Thus, compounds that modulate the expression of a target gene, the target gene, or the activity of a target gene product can be used in the diagnosis, treatment or prevention of diseases and disorders such as cancer (e.g., lung cancer).

[0116] A "target protein" refers to an amino acid or protein, often derived from a biological sample, to which a protein-capture agent specifically hybridizes or binds. It is either the presence or absence of the target protein that is to be detected, or the amount of the target protein that is to be quantified. The target protein has a structure that is recognized by the corresponding protein-capture agent directed to the target. The target protein or amino acid can also refer to the specific substructure of a larger protein to which the protein-capture agent is directed or to the overall structure (e.g., gene or mRNA) whose expression level it is desired to detect.

[0117] A "differentially expressed gene transcript", as used herein, refers to a gene transcript, including a Caliban gene transcript, that is found in different numbers of copies in different cell or tissue types of an organism having a disease or disorder or a disease or disorder associated with cancer, compared to the numbers of copies or state of the gene transcript found in the cells of the same tissue in a healthy organism, or in the cells of the same tissue in the same organism. Multiple copies of gene transcripts can be found in an organism having a disease or disorder or a disease or disorder associated with cancer, while fewer copies of the same gene transcript are found in a healthy organism or healthy cells of the same tissue in the same organism, or vice versa.

[0118] A "differentially expressed gene," can be a target, fingerprint, or pathway gene. For example, a "fingerprint gene", as used herein, refers to a differentially expressed gene whose expression pattern can be used as a prognostic or diagnostic marker for the evaluation of diseases or disorders, or which can be used to identify compounds useful for the treatment of such diseases or disorders or a disease or disorder associated with cancer. For example, the effect of a compound on the fingerprint gene expression pattern normally displayed in connection with diseases or disorders or diseases or disorders associated with cancer, can be used to evaluate the efficacy, such as potency, of the compound as treatment, or can be used to monitor patients undergoing clinical evaluation for the treatment of such a disease or disorder.

[0119] "Ortholog" refers to an evolutionarily conserved bio-molecule represented in a species other than the organism in which a reference sequence is identified, and contains a nucleic-acid or amino-acid sequence that is homologous to the reference sequence. To determine the degree of homology between a reference sequence and a sequence in question, two nucleic-acid sequences or two amino-acid sequences are compared. Homology can be defined by percentage identity or by percentage similarity. Percentage identity correlates with the proportion of identical amino-acid residues shared between two sequences compared in an alignment. Percentage similarity correlates with the proportion of amino-acid residues having similar structural properties that is shared between two sequences compared in an alignment. For example, amino-acid residues having similar structural properties can be substituted for one another, such as the substitutions of analogous hydrophilic amino-acid residues, and the substitution of analogous hydrophobic amino-acid residues. Percentages of similarity and identity can be calculated over a portion of the primary structure and not over the entire gene/protein sequence. For the present disclosure, an ortholog or an orthologous sequence is defined as a homologous molecule or a sequence having Caliban activity and a sequence identity of at least about 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%. Alternatively, an ortholog is defined as a homologous molecule or sequence having Caliban activity and a sequence similarity of at least about 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%.
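The distinction between percentage identity and percentage similarity can be illustrated with a short sketch. It assumes two pre-aligned, gap-free amino-acid sequences of equal length, and it uses a deliberately simplified two-class grouping (hydrophobic versus hydrophilic) chosen only for illustration; practical similarity scoring generally relies on substitution matrices rather than this grouping.

```python
# Simplified, illustrative residue classes (one-letter codes); not a standard scoring scheme.
HYDROPHOBIC = set("AVLIMFWC")
HYDROPHILIC = set("STNQDEKRH")


def identity_and_similarity(a: str, b: str) -> tuple:
    """Return (percent identity, percent similarity) for two aligned, equal-length sequences.

    A position counts as similar when the residues are identical or fall in the same class.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = 0
    similar = 0
    for x, y in zip(a, b):
        if x == y:
            identical += 1
            similar += 1
        elif (x in HYDROPHOBIC and y in HYDROPHOBIC) or (x in HYDROPHILIC and y in HYDROPHILIC):
            similar += 1
    n = len(a)
    return 100.0 * identical / n, 100.0 * similar / n
```

Because every identical position is also counted as similar, percentage similarity under this scheme is never lower than percentage identity, which is consistent with the higher minimum (about 40%) stated for similarity than for identity (about 20%).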

[0120] It is further contemplated that "ortholog" is a polypeptide or nucleic acid molecule of an organism that is highly related to a reference protein, or nucleic acid sequence, from another organism. An ortholog is functionally related to the reference gene, protein or nucleic acid sequence. In other words, the ortholog and its reference molecule would be expected to fulfill similar, if not equivalent, functional roles in their respective organisms. It is not required that an ortholog, when aligned with a reference sequence, have a particular degree of amino acid sequence identity to the reference sequence. A protein ortholog might share significant amino acid sequence identity over the entire length of the protein, for example, or, alternatively, might share significant amino acid sequence identity over only a single functionally important domain of the protein. Such functionally important domains may be defined by genetic mutations or by structure-function assays. Orthologs can be identified using methods provided herein. The functional role of an ortholog may be assayed using methods well known to the skilled artisan, and described herein. For example, function might be assayed in vivo or in vitro using a biochemical, immunological, or enzymatic assay; transformation rescue; or, for example, in a nematode bioassay for the effect of gene inactivation on nematode phenotype. Alternatively, bioassays may be carried out in tissue culture; function can also be assayed by gene inactivation (e.g., by RNAi, siRNA, or gene knockout), or gene over-expression, as well as by other methods.

[0121] "Paralogs" are distinct but structurally related proteins made by an organism. Paralogs are believed to arise through gene duplication.

[0122] "Variant" may refer to an organism with a particular genotype in singular form, a set of organisms with different genotypes in plural form, and also to alleles of any gene identifiable by methods of the present invention. For example, the term "variants" includes various alleles that may occur at high frequency at a polymorphic locus, and includes organisms containing such allelic variants. The term "variant" includes various "strains" and various "mutants."

[0123] "Polymorphic variant" and "allele" refer to forms of Caliban that occur in a population (or among populations) and that maintain wildtype Caliban activity as measured using one of the assays described herein.

[0124] A Caliban "mutant" refers to those mutants which are experimentally made or those which are found in tumor or cancer cells. Mutants of Caliban can be due to, e.g., truncation, elongation, substitution of amino acids, deletion, insertion, or lack of expression (e.g., due to promoter or splice site mutations, and the like). A mutant has activity that differs from the activity of wildtype Caliban by at least about 20% as measured using an assay described herein. For example, a mutant of Caliban can have a null mutation which results in absence of normal gene product at the molecular level or an absence of function at the phenotypic level. Another example is a missense mutation of Caliban, where a substitution of amino acid(s) results in a change in the activity of the protein.

[0125] The phrase "missense or null endogenous Caliban phenotype" of a cell therefore refers to a cell in which Caliban has a missense or null mutation, so that the cell has a phenotype (e.g., soft agar growth, contact inhibition and density limitation of growth, and the like) which differs from that of a cell having wildtype Caliban.

[0126] An "expression vector" is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be transcribed operably linked to a promoter.

[0127] "Transfect" or "transduce" refers to any way of getting a nucleic acid across a cell membrane, including electroporation, biolistics, injection, plasmid transfection, lipofection, viral transduction, lipid-nucleic acid complexes, naked DNA, and the like.

[0128] "Biological samples" include, but are not limited to, tissue isolated from humans, mice, and rats. In some embodiments, a sample of biological tissue or fluid contains nucleic acids or polypeptides of Caliban and/. Biological samples may also include sections of tissues, such as frozen sections taken for histological purposes. A biological sample is typically obtained from a eukaryotic organism, such as insects, protozoa, birds, fish, or reptiles, preferably from a mammal such as a rat, mouse, cow, dog, guinea pig, or rabbit, and most preferably from a primate such as a chimpanzee or human.

[0129] The phrase "functional effects" in the context of assays for testing compounds that modulate Caliban mediated tumor suppression includes the determination of any parameter that is indirectly or directly under the influence of the Caliban protein. Functional effects include, e.g., anchorage dependence, contact inhibition and density limitation of growth, growth factor or serum dependence, tumor specific marker levels, invasiveness, tumor growth, Caliban protein mRNA levels, apoptosis, G0/G1 cell cycle arrest, and the like, in vitro, in vivo, and ex vivo.

[0130] By "determining the functional effect" is meant assays for a compound that increases or decreases a parameter that is directly or indirectly under the influence of Caliban. Such functional effects can be measured by any means known to those skilled in the art, e.g., soft agar assay, contact inhibition and density limitation of growth assay, growth factor or serum dependence assay, tumor specific markers assay, invasiveness assay, apoptosis assay, G0/G1 cell cycle arrest assay, tumor growth assay, Caliban protein mRNA level assay, transcriptional activation or repression of a reporter gene assay, and the like, in vitro, in vivo, and ex vivo.

[0131] "Inhibitors," "activators," and "modulators" of Caliban genes and their gene products in cells are used to refer to inhibitory, activating, or modulating molecules, respectively, identified using in vitro and in vivo assays for binding or signaling, e.g., ligands, agonists, antagonists, and their homologs and mimetics. The term "modulator" includes inhibitors and activators. Inhibitors are agents that, e.g., bind to, partially or totally block stimulation, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the activity of Caliban genes, e.g., antagonists. Activators are agents that, e.g., bind to, stimulate, increase, open, activate, facilitate, enhance activation, sensitize or up regulate the activity of Caliban genes, e.g., agonists. Modulators include agents that, e.g., alter the interaction of Caliban gene or gene product with: proteins that bind activators or inhibitors, receptors, including proteins, peptides, lipids, carbohydrates, polysaccharides, or combinations of the above, e.g., lipoproteins, glycoproteins, and the like. Modulators can include genetically modified versions of naturally-occurring activated Caliban ligands, e.g., with altered activity, as well as naturally occurring and synthetic ligands, antagonists, agonists, small chemical molecules and the like. Such assays for inhibitors and activators include, e.g., applying putative modulator compounds to a cell expressing a Caliban receptor and then determining the functional effects on Caliban receptor signaling. Samples or assays comprising activated Caliban receptor that are treated with a potential activator, inhibitor, or modulator are compared to control samples without the inhibitor, activator, or modulator to examine the extent of inhibition. Control samples (untreated with inhibitors) can be assigned an activity value of 100%. Inhibition of activated samples is achieved when the activity value relative to the control is about 80%, optionally 50% or 25-0%. Activation of a sample is achieved when the activity value relative to the control is 110%, optionally 150%, optionally 200-500%, or 1000-3000% higher.

[0132] Samples or assays comprising Caliban that are treated with a potential modulator are compared to control samples without the inhibitor, activator, or modulator. Control samples (untreated with inhibitors) are assigned a relative Caliban activity value of 100%. Inhibition of Caliban is achieved when the Caliban activity value relative to the control is about 90% or less, optionally about 80% or less, 70% or less, 60% or less, 50% or less, 40% or less, 30% or less, or 25-0%. Activation of Caliban is achieved when the Caliban activity value relative to the control (untreated with activators) is 110% or more, optionally 120%, 130%, 140%, 150% or more, 200-500% or more, 1000-3000% or more.
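The comparison to an untreated control described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_modulation` and its parameters are hypothetical, and the default cutoffs simply take the broadest thresholds recited above (about 90% or less for inhibition, 110% or more for activation).

```python
def classify_modulation(treated_activity, control_activity,
                        inhibition_cutoff=90.0, activation_cutoff=110.0):
    """Express treated activity as a percentage of the control (control = 100%)
    and label it using the cutoffs recited in the text.

    The function name, argument names, and the "no call" label are
    illustrative assumptions, not part of the disclosure.
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    relative = 100.0 * treated_activity / control_activity
    if relative <= inhibition_cutoff:
        label = "inhibition"
    elif relative >= activation_cutoff:
        label = "activation"
    else:
        label = "no call"
    return relative, label


if __name__ == "__main__":
    # Hypothetical readings in arbitrary assay units.
    for treated in (45.0, 100.0, 150.0):
        print(treated, classify_modulation(treated, control_activity=100.0))
```

Stricter embodiments (for example, 50% or less for inhibition) can be expressed by passing different cutoff values.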

[0133] The phrase "changes in cell growth" refers to any change in cell growth and proliferation characteristics in vitro or in vivo, such as formation of foci, anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and density limitation of growth, loss of growth factor or serum requirements, changes in cell morphology, gaining or losing immortalization, gaining or losing tumor specific markers, ability to form or suppress tumors when injected into suitable animal hosts, and/or immortalization of the cell. See, e.g., Freshney, Culture of Animal Cells a Manual of Basic Technique 3: 231-241, 1994, herein incorporated by reference. The phrase "changes in cell growth" can also refer to changes in apoptosis or changes in cell cycle pattern.

[0134] "Recombinant" when used with reference, e.g., to a cell, or nucleic acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.

[0135] A "promoter" is defined as an array of nucleic acid control sequences that direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription.

[0136] A "constitutive" promoter is a promoter that is active under most environmental and developmental conditions. An "inducible" promoter is a promoter that is active under environmental or developmental regulation.

[0137] The term "operably linked" refers to a functional linkage between a nucleic acid expression control sequence (such as a promoter, or array of transcription factor binding sites) and a second nucleic acid sequence, wherein the expression control sequence directs transcription of the nucleic acid corresponding to the second sequence.

[0138] The term "heterologous" when used with reference to portions of a nucleic acid indicates that the nucleic acid comprises two or more subsequences that are not found in the same relationship to each other in nature. For instance, the nucleic acid is typically recombinantly produced, having two or more sequences from unrelated genes arranged to make a new functional nucleic acid, e.g., a promoter from one source and a coding region from another source. Similarly, a heterologous protein indicates that the protein comprises two or more subsequences that are not found in the same relationship to each other in nature (e.g., a fusion protein).

[0139] A "label" is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include ³²P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal antibodies are available (e.g., the polypeptide sequence as shown in the Figures can be made detectable, e.g., by incorporating a radiolabel into the peptide, and used to detect antibodies specifically reactive with the peptide).

[0140] "Isolated," "purified," or "biologically pure" refer to material that is substantially or essentially free from components which normally accompany it as found in its native state. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein that is the predominant species present in a preparation is substantially purified. In particular, an isolated Caliban nucleic acid is separated from open reading frames that flank the Caliban gene and encode proteins other than Caliban. The term "purified" denotes that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and most preferably at least 99% pure.

[0141] "Nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. The term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs).

[0142] Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. The term nucleic acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide, and polynucleotide.

[0143] "Identical" or percent "identity," in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., 70% identity, preferably 75%, 80%, 85%, 90%, or 95% identity or higher over a specified region), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences are then said to be "substantially identical." This definition also refers to the complement of a test sequence. Preferably, the identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length. In most preferred embodiments, the sequences are substantially identical over the entire length of, e.g., the coding region.
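As a minimal sketch of how a percent identity over a designated region might be computed from two sequences that have already been aligned, consider the following Python function. The name `percent_identity`, the use of "-" as the gap character, and the choice of denominator (all aligned columns in the region except columns where both sequences have a gap) are assumptions made for illustration; alignment programs differ in how they count gapped columns.

```python
def percent_identity(aligned_a, aligned_b, start=0, end=None, gap="-"):
    """Percent of identical residues between two pre-aligned sequences.

    aligned_a and aligned_b must have equal length (gaps already inserted).
    start and end designate the region (comparison window) in alignment
    coordinates. Columns where both sequences carry a gap are ignored;
    a residue opposite a gap counts as a non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    region = zip(aligned_a[start:end], aligned_b[start:end])
    columns = [(x, y) for x, y in region if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("designated region contains no residues")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned nucleotide sequences.
    print(percent_identity("ACGTACGT", "ACGTTCGT"))        # whole alignment
    print(percent_identity("ACGTACGT", "ACGTTCGT", 0, 4))  # first four columns
```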

[0144] For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.

[0145] A "comparison window", as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of (Smith et al., Adv. Appl. Math. 2: 482, 1981), by the homology alignment algorithm of (Needleman et al., J. Mol. Biol. 48: 443, 1970), by the search for similarity method of (Pearson et al., Proc. Nat'l. Acad. Sci. 85: 2444, 1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, 1995 supplement).
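For orientation, the local homology approach cited above can be illustrated with a short dynamic-programming sketch in Python. It computes only the best local alignment score, uses a simple linear gap penalty, and its scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions chosen for the example; it is not the implementation found in GAP, BESTFIT, or any other package named above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Classic Smith-Waterman recurrence with a linear gap penalty:
    each cell is the maximum of 0, the diagonal move (match/mismatch),
    and the two gap moves. Only two rows are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    # Hypothetical sequences sharing the local segment "ACGT".
    print(smith_waterman_score("TTTACGTTTT", "GGACGTGG"))
```

Because every cell is floored at zero, unrelated flanking sequence does not lower the score of a well-matched internal segment, which is what distinguishes local from global (Needleman-style) alignment.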

[0146] Preferred examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in (Altschul et al., J. Mol. Biol. 215: 403-410, 1990 and Altschul et al., Nuc. Acids Res. 25: 3389-3402, 1997), respectively. BLAST and BLAST 2.0 are used, with the parameters described herein, to determine percent sequence identity for the nucleic acids and proteins of the invention. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff et al., Proc. Natl. Acad. Sci. U.S.A. 89: 10915, 1992), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
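The three halting rules for word-hit extension can be made concrete with a simplified, ungapped Python sketch. It is not the BLAST source code: the function names are hypothetical, the seed is assumed to be an exact word match, and the drop-off value `x_drop=20` is an arbitrary assumption because the text gives no default for X. The defaults for W, M, and N follow the BLASTN values recited above.

```python
def _extend(query, subject, q, s, step, start_score, reward, penalty, x_drop):
    """Extend without gaps from (q, s) in direction step (+1 or -1).

    Returns the best cumulative score reached and how many positions
    of the extension contribute to that best score.
    """
    score = best = start_score
    moved = kept = 0
    while 0 <= q < len(query) and 0 <= s < len(subject):  # rule 3: sequence end
        score += reward if query[q] == subject[s] else penalty
        moved += 1
        if score > best:
            best, kept = score, moved
        elif best - score >= x_drop or score <= 0:  # rules 1 and 2
            break
        q += step
        s += step
    return best, kept


def extend_seed(query, subject, q_start, s_start,
                word_len=11, reward=5, penalty=-4, x_drop=20):
    """Extend an exact word hit in both directions.

    Returns (score, q_begin, q_end), where query[q_begin:q_end] is the
    query side of the resulting high scoring pair.
    """
    word = query[q_start:q_start + word_len]
    if len(word) != word_len or word != subject[s_start:s_start + word_len]:
        raise ValueError("seed is not an exact word match")
    seed_score = word_len * reward
    right_best, right_len = _extend(query, subject, q_start + word_len,
                                    s_start + word_len, +1, seed_score,
                                    reward, penalty, x_drop)
    total, left_len = _extend(query, subject, q_start - 1, s_start - 1, -1,
                              right_best, reward, penalty, x_drop)
    return total, q_start - left_len, q_start + word_len + right_len


if __name__ == "__main__":
    # Hypothetical sequences; a short word length keeps the example small.
    print(extend_seed("GGACGTACGTCC", "TTACGTACGTAA", 2, 2, word_len=4))
```

A larger `x_drop` lets the extension cross longer mismatched stretches at the cost of more computation, which is the sensitivity and speed trade-off attributed to the parameter X above.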

[0147] The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin et al., Proc. Nat'l. Acad. Sci. 90: 5873-5877, 1993). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.

[0148] Another example of a useful algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments to show relationship and percent sequence identity. It also plots a tree or dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of (Feng et al., J. Mol. Evol. 25: 351-360, 1987). The method used is similar to the method described by (Higgins et al., CABIOS 5: 151-153, 1989). The program can align up to 300 sequences, each of a maximum length of 5,000 nucleotides or amino acids. The multiple alignment procedure begins with the pairwise alignment of the two most similar sequences, producing a cluster of two aligned sequences. This cluster is then aligned to the next most related sequence or cluster of aligned sequences. Two clusters of sequences are aligned by a simple extension of the pairwise alignment of two individual sequences. The final alignment is achieved by a series of progressive, pairwise alignments. The program is run by designating specific sequences and their amino acid or nucleotide coordinates for regions of sequence comparison and by designating the program parameters. Using PILEUP, a reference sequence is compared to other test sequences to determine the percent sequence identity relationship using the following parameters: default gap weight (3.00), default gap length weight (0.10), and weighted end gaps. PILEUP can be obtained from the GCG sequence analysis software package, e.g., version 7.0 (Devereux et al., Nuc. Acids Res. 12: 387-395, 1984).

[0149] An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below. Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequence.

[0150] "Substantially identical," in the context of two nucleic acids or polypeptides refers to two or more sequences or subsequences that have at least about 80%, about 90%, about 95% or higher nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using the following sequence comparison method and/or by visual inspection. Such "substantially identical" sequences are typically considered to be homologous. The "substantial identity" can exist over a region of sequence that is at least about 50 residues in length, over a region of at least about 100 residues, or over a region at least about 150 residues, or over the full length of the two sequences to be compared. As described below, any two antibody sequences can only be aligned in one way, by using the numbering scheme in Kabat. Therefore, for antibodies, percent identity has a unique and well-defined meaning.

[0151] Amino acids from the variable regions of the mature heavy and light chains of immunoglobulins are designated Hx and Lx respectively, where x is a number designating the position of an amino acid according to the scheme of (Kabat, Sequences of Proteins of Immunological Interest, 1987 and 1991). Kabat lists many amino acid sequences for antibodies for each subgroup, and lists the most commonly occurring amino acid for each residue position in that subgroup to generate a consensus sequence. Kabat uses a method for assigning a residue number to each amino acid in a listed sequence, and this method for assigning residue numbers has become standard in the field. Kabat's scheme is extendible to other antibodies not included in his compendium by aligning the antibody in question with one of the consensus sequences in Kabat by reference to conserved amino acids. The use of the Kabat numbering system readily identifies amino acids at equivalent positions in different antibodies. For example, an amino acid at the L50 position of a human antibody occupies the equivalent position to an amino acid at position L50 of a mouse antibody. Likewise, nucleic acids encoding antibody chains are aligned when the amino acid sequences encoded by the respective nucleic acids are aligned according to the Kabat numbering convention. An alternative structural definition has been proposed by (Chothia et al., J. Mol. Biol. 196: 901-917, 1987; Chothia et al., Nature 342: 878-883, 1989; and Chothia et al., J. Mol. Biol. 186: 651-663, 1985), which are herein incorporated by reference for all purposes.

[0152] The nucleic acids of the invention can be present in whole cells, in a cell lysate, or in a partially purified or substantially pure form. A nucleic acid is "isolated" or "rendered substantially pure" when purified away from other cellular components or other contaminants, e.g., other cellular nucleic acids or proteins, by standard techniques, including alkaline/SDS treatment, CsCl banding, column chromatography, agarose gel electrophoresis and others well known in the art (see, e.g., Sambrook, Tijssen and Ausubel discussed herein and incorporated by reference for all purposes). The nucleic acid sequences of the invention and other nucleic acids used to practice this invention, whether RNA, cDNA, genomic DNA, or hybrids thereof, can be isolated from a variety of sources, genetically engineered, amplified, and/or expressed recombinantly. Any recombinant expression system can be used, including, in addition to bacterial, e.g., yeast, insect or mammalian systems. Alternatively, these nucleic acids can be chemically synthesized in vitro. Techniques for the manipulation of nucleic acids, such as, e.g., subcloning into expression vectors, labeling probes, sequencing, and hybridization are well described in the scientific and patent literature, see, e.g., Sambrook, Tijssen and Ausubel. Nucleic acids can be analyzed and quantified by any of a number of general means well known to those of skill in the art. These include, e.g., analytical biochemical methods such as NMR, spectrophotometry, radiography, electrophoresis, capillary electrophoresis, high performance liquid chromatography (HPLC), thin layer chromatography (TLC), and hyperdiffusion chromatography, various immunological methods, such as fluid or gel precipitin reactions, immunodiffusion (single or double), immunoelectrophoresis, radioimmunoassays (RIAs), enzyme-linked immunosorbent assays (ELISAs), immuno-fluorescent assays, Southern analysis, Northern analysis, dot-blot analysis, gel electrophoresis (e.g., SDS-PAGE), RT-PCR, quantitative PCR, other nucleic acid or target or signal amplification methods, radiolabeling, scintillation counting, and affinity chromatography.

[0153] The nucleic acid compositions of the present invention, while often in a native sequence (except for modified restriction sites and the like), from either cDNA, genomic or mixtures thereof, can be mutated in accordance with standard techniques to provide gene sequences. For coding sequences, these mutations can affect amino acid sequence as desired. In particular, DNA sequences substantially homologous to or derived from native V, D, J, constant, switches and other such sequences described herein are contemplated (where "derived" indicates that a sequence is identical or modified from another sequence).

[0154] "Recombinant host cell" (or simply "host cell") refers to a cell into which a recombinant expression vector has been introduced. It should be understood that such terms are intended to refer not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term "host cell" as used herein.

[0155] "Polypeptide," "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.

[0156] "Amino acid" refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid.

[0157] Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, can be referred to by their commonly accepted single-letter codes.

[0158] "Conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations," which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence with respect to the expression product, but not with respect to actual probe sequences.
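The degeneracy argument above can be checked with a short Python sketch that builds the standard genetic code and lists the codons that are interchangeable without changing the encoded residue. The function names are hypothetical, the sequence is written in RNA letters (so the tryptophan codon appears as UGG), and the example coding sequence is made up for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; "*" marks stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """All codons (including the input) that encode the same residue."""
    residue = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def count_silent_variants(rna):
    """Number of coding sequences (including the original) that encode
    the same polypeptide as rna, whose length must be a multiple of 3."""
    if len(rna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    total = 1
    for i in range(0, len(rna), 3):
        total *= len(synonymous_codons(rna[i:i + 3]))
    return total


if __name__ == "__main__":
    print(synonymous_codons("GCA"))            # the four alanine codons
    print(count_silent_variants("AUGGCAUGG"))  # Met-Ala-Trp
```

Only the alanine position contributes alternatives in the Met-Ala-Trp example, which mirrors the exception noted above for methionine and tryptophan.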

[0159] As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a "conservatively modified variant" where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.

[0160] The following eight groups each contain amino acids that are conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins 1984).
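A minimal Python sketch of how the eight groups listed above could be used to test whether a substitution is conservative follows. The function name `is_conservative` is hypothetical; the groups themselves are transcribed from the list above, including methionine appearing in both group 5 and group 8.

```python
# The eight conservative substitution groups, in one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(original, replacement):
    """True if the two different residues share at least one group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    for pair in (("I", "L"), ("M", "C"), ("A", "D"), ("C", "I")):
        print(pair, is_conservative(*pair))
```

Because methionine belongs to two groups, membership is not transitive: M to C and M to I are both conservative under this list, but C to I is not.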

[0161] Macromolecular structures such as polypeptide structures can be described in terms of various levels of organization. For a general discussion of this organization, see, e.g., Alberts et al., Molecular Biology of the Cell, 3rd ed., 1994, and Cantor et al., Biophysical Chemistry Part I: The Conformation of Biological Macromolecules, 1980. "Primary structure" refers to the amino acid sequence of a particular peptide. "Secondary structure" refers to locally ordered, three dimensional structures within a polypeptide. These structures are commonly known as domains, e.g., enzymatic domains, extracellular domains, transmembrane domains, pore domains, and cytoplasmic tail domains. Domains are portions of a polypeptide that form a compact unit of the polypeptide and are typically 15 to 350 amino acids long. Exemplary domains include domains with enzymatic activity, e.g., a kinase domain. Typical domains are made up of sections of lesser organization such as stretches of β-sheet and α-helices. "Tertiary structure" refers to the complete three dimensional structure of a polypeptide monomer. "Quaternary structure" refers to the three dimensional structure formed by the noncovalent association of independent tertiary units. Anisotropic terms are also known as energy terms.

[0162] A particular nucleic acid sequence also implicitly encompasses "splice variants." Similarly, a particular protein encoded by a nucleic acid implicitly encompasses any protein encoded by a splice variant of that nucleic acid. "Splice variants," as the name suggests, are products of alternative splicing of a gene. After transcription, an initial nucleic acid transcript can be spliced such that different (alternate) nucleic acid splice products encode different polypeptides. Mechanisms for the production of splice variants vary, but include alternate splicing of exons. Alternate polypeptides derived from the same nucleic acid by read-through transcription are also encompassed by this definition. Any products of a splicing reaction, including recombinant forms of the splice products, are contemplated here.

[0163] The phrase "selectively (or specifically) hybridizes to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent hybridization conditions when that sequence is present in a complex mixture (e.g., total cellular or library DNA or RNA).

[0164] The phrase "stringent hybridization conditions" refers to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acid, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in (Tijssen, Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic Probes, 1993). Generally, stringent conditions are selected to be about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal is at least two times background, preferably 10 times background hybridization. Exemplary stringent hybridization conditions can be as follows: 50% formamide, 5 x SSC, and 1% SDS, incubating at 42°C, or, 5 x SSC, 1% SDS, incubating at 65°C, with wash in 0.2 x SSC, and 0.1% SDS at 65°C.
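By way of illustration only, the numerical criteria of paragraph [0164] can be expressed as a short check. The following is a minimal sketch under stated assumptions: the approximate limits in the text ("about 1.0 M", "at least about 30°C") are treated as hard cutoffs, the effect of formamide is ignored, and the names `HybridizationConditions`, `meets_stringent_thresholds`, and `stringent_temperature_window` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HybridizationConditions:
    sodium_molar: float     # sodium ion concentration, mol/L
    ph: float
    temperature_c: float    # incubation temperature, degrees Celsius
    probe_length_nt: int    # probe length in nucleotides


def meets_stringent_thresholds(c: HybridizationConditions) -> bool:
    """Check salt, pH, and minimum temperature against paragraph [0164].

    Assumption: "about" values in the text are used as exact cutoffs, and
    probes of 50 nucleotides or fewer are treated as short probes.
    """
    salt_ok = c.sodium_molar < 1.0
    ph_ok = 7.0 <= c.ph <= 8.3
    min_temp_c = 30.0 if c.probe_length_nt <= 50 else 60.0
    return salt_ok and ph_ok and c.temperature_c >= min_temp_c


def stringent_temperature_window(tm_c: float) -> tuple[float, float]:
    """Return (low, high) for a temperature about 5-10 degrees C below Tm."""
    return (tm_c - 10.0, tm_c - 5.0)
```

For a probe with a Tm of 70°C, `stringent_temperature_window(70.0)` gives a window of 60°C to 65°C. The Tm itself must be determined separately for the defined ionic strength, pH, and nucleic acid concentration.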

[0165] Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This occurs, for example, when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize under moderately stringent hybridization conditions. Exemplary "moderately stringent hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 1 x SSC at 45°C. A positive hybridization is at least twice background. Those of ordinary skill will readily recognize that alternative hybridization and wash conditions can be utilized to provide conditions of similar stringency.
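By way of illustration only, the extent of codon degeneracy referred to above can be quantified by counting how many distinct coding sequences specify the same peptide. The following is a minimal sketch that assumes the standard genetic code and ignores stop codons; the names `CODONS_PER_RESIDUE` and `count_encoding_sequences` are hypothetical.

```python
# Number of sense codons per amino acid in the standard genetic code
# (one-letter codes; the 20 values sum to the 61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Count distinct nucleotide sequences that encode the given peptide.

    Assumption: standard genetic code, no stop codon included.
    Raises KeyError for characters that are not one of the 20 residues.
    """
    total = 1
    for residue in peptide.upper():
        total *= CODONS_PER_RESIDUE[residue]
    return total
```

Even a two-residue peptide such as leucine-serine can be encoded by 36 different hexanucleotides, which is why two nucleic acids encoding substantially identical polypeptides may fail to hybridize under stringent conditions.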

[0166] "Epitope" refers to a protein determinant capable of specific binding to an antibody. Epitopes usually consist of chemically active surface groupings of molecules such as amino acids or sugar side chains and usually have specific three dimensional structural characteristics, as well as specific charge characteristics. Conformational and nonconformational epitopes are distinguished in that the binding to the former but not the latter is lost in the presence of denaturing solvents.

[0167] An intact "antibody" comprises at least two heavy (H) chains and two light (L) chains inter-connected by disulfide bonds. Each heavy chain is comprised of a heavy chain variable region (abbreviated herein as HCVR or VH) and a heavy chain constant region. The heavy chain constant region is comprised of three domains, CH1, CH2 and CH3. Each light chain is comprised of a light chain variable region (abbreviated herein as LCVR or VL) and a light chain constant region. The light chain constant region is comprised of one domain, CL. The VH and VL regions can be further subdivided into regions of hypervariability, termed complementarity determining regions (CDR), interspersed with regions that are more conserved, termed framework regions (FR). Each VH and VL is composed of three CDRs and four FRs, arranged from amino-terminus to carboxyl-terminus in the following order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4. The variable regions of the heavy and light chains contain a binding domain that interacts with an antigen. The constant regions of the antibodies can mediate the binding of the immunoglobulin to host tissues or factors, including various cells of the immune system (e.g., effector cells) and the first component (C1q) of the classical complement system. The term antibody includes antigen-binding portions of an intact antibody that retain capacity to bind Caliban. Examples of binding portions include (i) a Fab fragment, a monovalent fragment consisting of the VL, VH, CL and CH1 domains; (ii) a F(ab')2 fragment, a bivalent fragment comprising two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH domains of a single arm of an antibody; (v) a dAb fragment (Ward et al., Nature 341: 544-546, 1989), which consists of a VH domain; and (vi) an isolated complementarity determining region (CDR).

[0168] "Single chain antibodies" or "single chain Fv (scFv)" refers to an antibody fusion molecule of the two domains of the Fv fragment, VL and VH. Although the two domains of the Fv fragment, VL and VH, are coded for by separate genes, they can be joined, using recombinant methods, by a synthetic linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules (known as single chain Fv (scFv); see, e.g., Bird et al., Science 242: 423-426, 1988; and Huston et al., Proc. Natl. Acad. Sci. 85: 5879-5883, 1988). Such single chain antibodies are included by reference to the term "antibody". Fragments can be prepared by recombinant techniques or enzymatic or chemical cleavage of intact antibodies.

[0169] "Human sequence antibody" includes antibodies having variable and constant regions (if present) derived from human germline immunoglobulin sequences. The human sequence antibodies of the invention can include amino acid residues not encoded by human germline immunoglobulin sequences (e.g., mutations introduced by random or site-specific mutagenesis in vitro or by somatic mutation in vivo). Such antibodies can be generated in non-human transgenic animals, e.g., as described in PCT Publication Nos. WO 01/14424 and WO 00/37504. However, the term "human sequence antibody", as used herein, is not intended to include antibodies in which CDR sequences derived from the germline of another mammalian species, such as a mouse, have been grafted onto human framework sequences (e.g., humanized antibodies).

[0170] Also, recombinant immunoglobulins may be produced (see Cabilly, U.S. Pat. No. 4,816,567, incorporated herein by reference in its entirety and for all purposes; and Queen et al., Proc. Nat'l Acad. Sci. 86: 10029-10033, 1989).

[0171] "Monoclonal antibody" refers to a preparation of antibody molecules of single molecular composition. A monoclonal antibody composition displays a single binding specificity and affinity for a particular epitope. Accordingly, the term "human monoclonal antibody" refers to antibodies displaying a single binding specificity which have variable and constant regions (if present) derived from human germline immunoglobulin sequences. In one embodiment, the human monoclonal antibodies are produced by a hybridoma which includes a B cell obtained from a transgenic non-human animal, e.g., a transgenic mouse, having a genome comprising a human heavy chain transgene and a light chain transgene fused to an immortalized cell.

[0172] "Polyclonal antibody" refers to a preparation of more than 1 (two or more) different antibodies to a Caliban protein. Such a preparation includes antibodies binding to a range of different epitopes. Antibodies to Caliban can bind to an epitope on human Caliban so as to inhibit Caliban. These and other antibodies suitable for use in the present invention can be prepared according to methods that are well known in the art and/or are described in the references cited here.

[0173] For preparation of monoclonal or polyclonal antibodies, any technique known in the art can be used (see, e.g., Kohler et al., Nature 256: 495-497, 1975; Kozbor et al., Immunology Today 4: 72, 1983; Cole et al., Monoclonal Antibodies and Cancer Therapy, 77-96, 1985). Techniques for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such as other mammals, may be used to express humanized antibodies. Alternatively, phage display technology can be used to identify antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348: 552-554, 1990; Marks et al., Biotechnology 10: 779-783, 1992).

[0174] "Recombinant human antibody" includes all human sequence antibodies of the invention that are prepared, expressed, created or isolated by recombinant means, such as antibodies isolated from an animal (e.g., a mouse) that is transgenic for human immunoglobulin genes (described further below); antibodies expressed using a recombinant expression vector transfected into a host cell, antibodies isolated from a recombinant, combinatorial human antibody library, or antibodies prepared, expressed, created or isolated by any other means that involves splicing of human immunoglobulin gene sequences to other DNA sequences. Such recombinant human antibodies have variable and constant regions (if present) derived from human germline immunoglobulin sequences. Such antibodies can, however, be subjected to in vitro mutagenesis (or, when an animal transgenic for human Ig sequences is used, in vivo somatic mutagenesis) and thus the amino acid sequences of the VH and VL regions of the recombinant antibodies are sequences that, while derived from and related to human germline VH and VL sequences, may not naturally exist within the human antibody germline repertoire in vivo.

[0175] A "heterologous antibody" is defined in relation to the transgenic non-human organism producing such an antibody. This term refers to an antibody having an amino acid sequence or an encoding nucleic acid sequence corresponding to that found in an organism not consisting of the transgenic non-human animal, and generally from a species other than that of the transgenic non-human animal.

[0176] A "chimeric antibody" is an antibody molecule in which (a) the constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable region) is linked to a constant region of a different or altered class, effector function and/or species, or an entirely different molecule which confers new properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, and the like; or (b) the variable region, or a portion thereof, is altered, replaced or exchanged with a variable region having a different or altered antigen specificity.

[0177] An "anti-Caliban" antibody is an antibody or antibody fragment that specifically binds a polypeptide encoded by the Caliban gene, cDNA, or a subsequence thereof.

[0178] An "immunoassay" is an assay that uses an antibody to specifically bind an antigen. The immunoassay is characterized by the use of specific binding properties of a particular antibody to isolate, target, and/or quantify the antigen.

[0179] "Substantially pure" or "isolated" means an object species (e.g., an antibody of the invention) has been identified and separated and/or recovered from a component of its natural environment such that the object species is the predominant species present (i.e., on a molar basis it is more abundant than any other individual species in the composition); a "substantially pure" or "isolated" composition also means where the object species comprises at least about 50 percent (on a molar basis) of all macromolecular species present. A substantially pure or isolated composition can also comprise more than about 80 to 90 percent by weight of all macromolecular species present in the composition. An isolated object species (e.g., antibodies of the invention) can also be purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods) wherein the composition consists essentially of derivatives of a single macromolecular species. For example, an isolated antibody to a Caliban gene product as shown in the Figures can be substantially free of other antibodies that lack binding to that particular gene product and bind to a different antigen. Further, an isolated antibody that specifically binds to an epitope, isoform or variant of a Caliban protein may, however, have cross-reactivity to other related antigens, e.g., from other species (e.g., Caliban species homologs). Moreover, an isolated antibody of the invention can be substantially free of other cellular material (e.g., non-immunoglobulin associated proteins) and/or chemicals.

[0180] "Specific binding" refers to preferential binding of an antibody to a specified antigen relative to other non-specified antigens. The phrase "specifically (or selectively) binds" to an antibody refers to a binding reaction that is determinative of the presence of the protein in a heterogeneous population of proteins and other biologics. Typically, the antibody binds with an association constant (Ka) of at least about 1 x 10⁶ M⁻¹ or 10⁷ M⁻¹, or about 10⁸ M⁻¹ to 10⁹ M⁻¹, or about 10¹⁰ M⁻¹ to 10¹¹ M⁻¹ or higher, and binds to the specified antigen with an affinity that is at least two-fold greater than its affinity for binding to a non-specific antigen (e.g., BSA, casein) other than the specified antigen or a closely-related antigen. The phrases "an antibody recognizing an antigen" and "an antibody specific for an antigen" are used interchangeably herein with the term "an antibody which binds specifically to an antigen". A predetermined antigen is an antigen that is chosen prior to the selection of an antibody that binds to that antigen.

[0181] "Specifically bind(s)" or "bind(s) specifically" when referring to a peptide refers to a peptide molecule which has intermediate or high binding affinity, exclusively or predominately, to a target molecule. The phrase "specifically binds to" refers to a binding reaction which is determinative of the presence of a target protein in the presence of a heterogeneous population of proteins and other biologics. Thus, under designated assay conditions, the specified binding moieties bind preferentially to a particular target protein and do not bind in a significant amount to other components present in a test sample. Specific binding to a target protein under such conditions can require a binding moiety that is selected for its specificity for a particular target antigen. A variety of assay formats can be used to select ligands that are specifically reactive with a particular protein. For example, solid-phase ELISA immunoassays, immunoprecipitation, Biacore and Western blot are used to identify peptides that specifically react with the antigen. Typically a specific or selective reaction will be at least twice background signal or noise and more typically more than 10 times background.

[0182] "High affinity" for an antibody refers to an equilibrium association constant (Ka) of at least about 10⁷ M⁻¹, at least about 10⁸ M⁻¹, at least about 10⁹ M⁻¹, at least about 10¹⁰ M⁻¹, at least about 10¹¹ M⁻¹, or at least about 10¹² M⁻¹ or greater, e.g., up to 10¹³ M⁻¹ or 10¹⁴ M⁻¹ or greater. However, "high affinity" binding can vary for other antibody isotypes.

[0183] "Ka", as used herein, is intended to refer to the equilibrium association constant of a particular antibody-antigen interaction. This constant has units of 1/M.

[0184] "Kd", as used herein, is intended to refer to the equilibrium dissociation constant of a particular antibody-antigen interaction. This constant has units of M.

[0185] The term "ka", as used herein, is intended to refer to the kinetic association constant of a particular antibody-antigen interaction. This constant has units of 1/Ms.

[0186] The term "kd", as used herein, is intended to refer to the kinetic dissociation constant of a particular antibody-antigen interaction. This constant has units of 1/s.
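By way of illustration only, the four constants defined above are related, for a simple one-to-one antibody-antigen interaction at equilibrium, by Ka = ka/kd and Kd = 1/Ka. The following is a minimal sketch assuming that simple binding model; the function names are hypothetical, and the default threshold of 10⁷ M⁻¹ is the lowest "high affinity" value recited herein.

```python
def equilibrium_constants(ka_per_molar_second: float, kd_per_second: float):
    """Derive (Ka in 1/M, Kd in M) from kinetic rate constants.

    Assumption: simple 1:1 binding, where Ka = ka / kd and Kd = kd / ka.
    """
    if ka_per_molar_second <= 0 or kd_per_second <= 0:
        raise ValueError("rate constants must be positive")
    ka_equilibrium = ka_per_molar_second / kd_per_second  # units: 1/M
    kd_equilibrium = kd_per_second / ka_per_molar_second  # units: M
    return ka_equilibrium, kd_equilibrium


def is_high_affinity(ka_equilibrium: float, threshold: float = 1e7) -> bool:
    """Compare an equilibrium association constant (1/M) with a threshold."""
    return ka_equilibrium >= threshold
```

For example, kinetic constants of ka = 1 x 10⁵ 1/Ms and kd = 1 x 10⁻³ 1/s correspond to Ka of about 10⁸ M⁻¹ and Kd of about 10⁻⁸ M. These input values are hypothetical and are not measurements reported herein.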

[0187] "Particular antibody-antigen interactions" refers to the experimental conditions under which the equilibrium and kinetic constants are measured.

[0188] "Isotype" refers to the antibody class that is encoded by heavy chain constant region genes. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, and define the antibody's isotype as IgG, IgM, IgA, IgD and IgE, respectively. Additional structural variations characterize distinct subtypes of IgG (e.g., IgG1, IgG2, IgG3 and IgG4) and IgA (e.g., IgA1 and IgA2).

[0189] The phrase "selectively associates with" refers to the ability of a nucleic acid to "selectively hybridize" with another as defined above, or the ability of an antibody to "selectively (or specifically) bind to" a protein, as defined above.

[0190] "Caliban-specific reagent" refers to any reagent which specifically associates with Caliban. For example, it can be a Caliban-specific antibody, a Caliban-specific primer, or a Caliban-specific nucleic acid probe.

[0191] "Naturally-occurring" as applied to an object refers to the fact that an object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally-occurring.

[0192] "Dosage unit" refers to physically discrete units suited as unitary dosages for the particular individual to be treated. Each unit can contain a predetermined quantity of active compound(s) calculated to produce the desired therapeutic effect(s) in association with the required pharmaceutical carrier. The specification for the dosage unit forms can be dictated by (a) the unique characteristics of the active compound(s) and the particular therapeutic effect(s) to be achieved, and (b) the limitations inherent in the art of compounding such active compound(s).

[0193] "Patient", "subject" or "mammal" are used interchangeably and refer to mammals such as human patients and non-human primates, as well as experimental animals such as rabbits, rats, and mice, and other animals. Animals include all vertebrates, e.g., mammals and non-mammals, such as sheep, dogs, cows, chickens, amphibians, and reptiles.

[0194] "Treating" refers to any indicia of success in the treatment or amelioration or prevention of a cancer (e.g., lung cancer), including any objective or subjective parameter such as abatement; remission; diminishing of symptoms or making the disease condition more tolerable to the patient; slowing in the rate of degeneration or decline; or making the final point of degeneration less debilitating. The treatment or amelioration of symptoms can be based on objective or subjective parameters, including the results of an examination by a physician. Accordingly, the term "treating" includes the administration of the compounds or agents of the present invention to prevent or delay, to alleviate, or to arrest or inhibit development of the symptoms or conditions associated with ocular disease. The term "therapeutic effect" refers to the reduction, elimination, or prevention of the disease, symptoms of the disease, or side effects of the disease in the subject.

[0195] "In combination with", "combination therapy" and "combination products" refer, in certain embodiments, to the concurrent administration to a patient of a first therapeutic and the compounds as used herein. When administered in combination, each component can be administered at the same time or sequentially in any order at different points in time. Thus, each component can be administered separately but sufficiently closely in time so as to provide the desired therapeutic effect.

[0196] "Concomitant administration" of a known drug with a compound of the present invention means administration of the drug and the compound at such time that both the known drug and the compound will have a therapeutic effect or diagnostic effect. Such concomitant administration can involve concurrent (i.e., at the same time), prior, or subsequent administration of the drug with respect to the administration of a compound of the present invention. A person of ordinary skill in the art would have no difficulty determining the appropriate timing, sequence and dosages of administration for particular drugs and compounds of the present invention.

[0197] This invention relies on routine techniques in the field of recombinant genetics. Basic texts disclosing the general methods of use in this invention include (Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd ed., 1989; Kriegler, Gene Transfer and Expression: A Laboratory Manual, 1990; and Ausubel et al., Current Protocols in Molecular Biology, 1994), all of which are herein incorporated by reference for all purposes.

[0198] Caliban genes, nucleic acids, polymorphic variants, orthologs, and alleles that are substantially identical to sequences provided herein, and referenced by their appropriate GenBank Accession number where applicable and in the Figures, can be isolated using Caliban nucleic acid probes and oligonucleotides under stringent hybridization conditions, by screening libraries. Alternatively, expression libraries can be used to clone Caliban protein, polymorphic variants, orthologs, and alleles by detecting expressed homologs immunologically with antisera or purified antibodies made against Caliban genes and their gene products or portions thereof.

2. CALIBAN - OVERVIEW

[0199] Caliban (Clbn) is homologous to human Sdccag1, which has been implicated in colon (Scanlan et al., Int. J. Cancer 76: 652-658, 1998) and lung (Carbonnelle et al., Int. J. Cancer 92: 388-397, 2001) cancer. Induction of sdccag1 in a non-small cell lung cancer cell line has been correlated with cell cycle arrest, but its mode of action was unknown (Carbonnelle et al., Int. J. Cancer 92: 388-397, 2001).

[0200] For the first time as disclosed herein, full length Clbn and sdccag1 are shown to be mediators of nuclear export in fly and human cells. Clbn is a bipartite protein with the carboxy terminus binding HDA, the nuclear export signal (NES) from the homeodomain transcription factor Prospero (Demidenko et al., Development 128: 1359-1367, 2001; Bi et al., Mol. Cell. Biol. 23: 1014-1024, 2003), and the amino terminus directly binding Exportin and acting as an NES. The reporter gene EYFP-HDA is exported in normal fly and human lung cells; however, clbn and sdccag1 interference RNA (RNAi) abrogate this nuclear export. As disclosed herein, EYFP-HDA is not exported in all five non-small cell lung cancer cell lines examined, demonstrating that its nuclear export can be used as a biomarker to distinguish these cancer cells from normal cells. Expression of fly Caliban in human non-small cell lung cancer cells restores nuclear export and acts as a potent tumor suppressor. These results suggest that Caliban/sdccag1 is a good candidate for use in gene therapy to treat lung cancers and that identifying additional targets exported by Caliban/sdccag1 might identify candidates for druggable targets.

[0201] Prospero's (Prox in mammals) 64 amino acid homeodomain (HD) and adjacent 100 amino acid Prospero domain are highly conserved from flies to man. Together they constitute one functional domain controlling both DNA binding and nuclear export, and resembling a heterodimeric homeodomain (Demidenko et al., Development 128: 1359-1367, 2001; Bi et al., Mol. Cell. Biol. 23: 1014-1024, 2003; Ryter et al., Structure 10: 1541-1549, 2002). We previously demonstrated that a 28-amino-acid region from the beginning of Prospero's homeodomain (HDA) functions as an Exportin dependent NES. The homeo/Prospero domain regulates nuclear export by inhibiting or masking the NES in most cells (Demidenko et al., Development 128: 1359-1367, 2001; Bi et al., Mol. Cell. Biol. 23: 1014-1024, 2003).

[0202] Exportin signals typically consist of specifically spaced leucine or hydrophobic amino acids, LX₁₋₃LX₂₋₃LXL (Henderson et al., Exp. Cell Res. 256: 213-224, 2000). The HDA export sequence also consists of conserved hydrophobic residues; however, the spacing, LX₄LX₄LXF (Bi et al., Mol. Cell. Biol. 23: 1014-1024, 2003), does not match that of a canonical Exportin signal. The tertiary structure of the homeo/Prospero domain places the nuclear export signal on the outside of the protein, exposing an unusually large hydrophobic surface (Bi et al., Mol. Cell. Biol. 23: 1014-1024, 2003; Ryter et al., Structure 10: 1541-1549, 2002). This suggests that the export signal might interact with other proteins to regulate its function. Here we use the homeodomain as bait in a yeast two-hybrid screen (Fields et al., Nature 340: 245-246, 1989) to identify a protein that regulates Prospero nuclear export. Clones of one gene, named Caliban (Clbn) as disclosed herein, were recovered 35 times; all include the carboxy terminal 200 amino acids (Clbn-C).
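By way of illustration only, the two spacing patterns discussed above (a leucine, one to three residues, a leucine, two to three residues, then L-X-L for the canonical signal; and a leucine, four residues, a leucine, four residues, then L-X-F for the HDA signal) can be written as regular expressions and used to scan a protein sequence. The following is a minimal sketch: it matches leucine literally rather than any hydrophobic residue, the names are hypothetical, and the two toy sequences are artificial strings constructed to fit each pattern, not Prospero or Caliban sequences.

```python
import re

# Literal-leucine renderings of the two spacing patterns in paragraph [0202].
CANONICAL_NES = re.compile(r"L.{1,3}L.{2,3}L.L")  # L, 1-3 X, L, 2-3 X, L, X, L
HDA_LIKE_NES = re.compile(r"L.{4}L.{4}L.F")       # L, 4 X, L, 4 X, L, X, F


def find_motifs(sequence: str, pattern: re.Pattern):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]


# Artificial sequences used only to exercise the patterns (assumptions).
toy_canonical = "GSLAALAALALGS"
toy_hda_like = "GSLAAAALAAAALAFGS"

canonical_hits = find_motifs(toy_canonical, CANONICAL_NES)
hda_hits = find_motifs(toy_hda_like, HDA_LIKE_NES)
```

Each toy sequence matches only its own pattern, which mirrors the observation above that the HDA spacing does not fit the canonical Exportin signal. A scan that also accepts other hydrophobic residues at the leucine positions would require a character class in place of each literal `L`.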

3. ISOLATION OF THE GENE ENCODING CALIBAN

A. General Recombinant DNA Methods

[0203] The nucleic acids used to practice this invention, whether RNA, iRNA, antisense nucleic acid, cDNA, genomic DNA, vectors, viruses or hybrids thereof, can be isolated from a variety of sources, genetically engineered, amplified, and/or expressed/generated recombinantly. Recombinant polypeptides generated from these nucleic acids can be individually isolated or cloned and tested for a desired activity. Any recombinant expression system can be used, including bacterial, mammalian, yeast, insect or plant cell expression systems.

[0204] Alternatively, these nucleic acids can be synthesized in vitro by well-known chemical synthesis techniques, as described in, e.g., (Adams, J. Am. Chem. Soc. 105: 661, 1983; Belousov, Nucleic Acids Res. 25: 3440-3444, 1997; Frenkel, Free Radic. Biol. Med. 19: 373-380, 1995; Blommers, Biochemistry 33: 7886-7896, 1994; Narang, Meth. Enzymol. 68: 90, 1979; Brown, Meth. Enzymol. 68: 109, 1979; Beaucage, Tetra. Lett. 22: 1859, 1981; U.S. Pat. No. 4,458,066).

[0205] The invention provides oligonucleotides comprising sequences of the invention, e.g., subsequences of the exemplary sequences of the invention. Oligonucleotides can include, e.g., single stranded poly-deoxynucleotides or two complementary polydeoxynucleotide strands which can be chemically synthesized.

[0206] Techniques for the manipulation of nucleic acids, such as, e.g., subcloning, labeling probes (e.g., random-primer labeling using Klenow polymerase, nick translation, amplification), sequencing, hybridization and the like are well described in the scientific and patent literature, (see, e.g., Sambrook, 1989 and Ausubel, 1994, supra).

[0207] Caliban polypeptides and nucleic acids are used in the assays described below. For example, recombinant Caliban can be used to make cells that constitutively express Caliban. Such polypeptides and nucleic acids can be made using routine techniques in the field of recombinant genetics.

[0208] For nucleic acids, sizes are given in either kilobases (kb) or base pairs (bp). These are estimates derived from agarose or acrylamide gel electrophoresis, from sequenced nucleic acids, or from published DNA sequences. For proteins, sizes are given in kilodaltons (kDa) or amino acid residue numbers. Protein sizes are estimated from gel electrophoresis, from sequenced proteins, from derived amino acid sequences, or from published protein sequences.

[0209] Oligonucleotides can be chemically synthesized according to the solid phase phosphoramidite triester method first described by (Beaucage et al., Tetrahedron Letts. 22: 1859-1862, 1981), using an automated synthesizer, as described in (Van Devanter et al., Nucleic Acids Res. 12: 6159-6168, 1984). Purification of oligonucleotides is typically by either native acrylamide gel electrophoresis or by anion-exchange HPLC as described in (Pearson et al., J. Chrom. 255: 137-149, 1983). The sequence of the cloned genes and synthetic oligonucleotides can be verified after cloning using, e.g., the chain termination method for sequencing double-stranded templates of (Wallace et al., Gene 16: 21-26, 1981). Again, as noted above, companies such as Operon Technologies, Inc. provide an inexpensive commercial source for essentially any oligonucleotide.

B. Cloning Methods for the Isolation of Nucleotide Sequences Encoding Caliban

[0210] In general, the nucleic acid sequences encoding genes of interest, such as Caliban and related nucleic acid sequence homologs, are cloned from cDNA and genomic DNA libraries by hybridization with a probe, or isolated using amplification techniques with oligonucleotide primers. Preferably mammalian, more preferably human sequences are used. For example, Caliban sequences are typically isolated from mammalian nucleic acid (genomic or cDNA) libraries by hybridizing with a nucleic acid probe, the sequence of which can be derived from the sequences shown in the Figures.

[0211] Amplification techniques using primers can also be used to amplify and isolate, e.g., a nucleic acid encoding Caliban, from DNA or RNA (see, e.g., Dieffenbach et al., PCR Primer: A Laboratory Manual, 1995). These primers can be used, e.g., to amplify either the full length sequence or a probe of one to several hundred nucleotides, which is then used to screen a mammalian library for the full-length nucleic acid of choice. For example, degenerate primer sets can be used to isolate Caliban nucleic acids. Nucleic acids can also be isolated from expression libraries using antibodies as probes. Such polyclonal or monoclonal antibodies can be raised, e.g., using the sequence of Caliban.

[0212] Polymorphic variants and alleles that are substantially identical to the gene of choice can be isolated using nucleic acid probes, and oligonucleotides under stringent hybridization conditions, by screening libraries. Alternatively, expression libraries can be used to clone, e.g., Caliban and Caliban polymorphic variants, interspecies homologs, and alleles, by detecting expressed homologs immunologically with antisera or purified antibodies made against Caliban, which also recognize and selectively bind to the Caliban homolog.

[0213] To make a cDNA library, one should choose a source that is rich in the mRNA of choice, e.g., for human Caliban mRNA, normal human bronchial epithelial cells, which we have demonstrated show a high level of Caliban activity. The mRNA is then made into cDNA using reverse transcriptase, ligated into a recombinant vector, and transfected into a recombinant host for propagation, screening and cloning. Methods for making and screening cDNA libraries are well known (see, e.g., Gubler et al., Gene 25: 263-269, 1983; Sambrook et al., supra; Ausubel et al., supra).

[0214] For a genomic library, the DNA is extracted from the tissue and either mechanically sheared or enzymatically digested to yield fragments of about 12-20 kb. The fragments are then separated by gradient centrifugation from undesired sizes and are constructed in non-lambda expression vectors. These vectors are packaged in vitro. Recombinant phage are analyzed by plaque hybridization as described in (Benton et al., Science 196: 180-182, 1977). Colony hybridization is carried out as generally described in (Grunstein et al, Proc. Natl. Acad. Sci. U.S.A. 72: 3961-3965, 1975).

[0215] An alternative method of isolating a nucleic acid and its homologs combines the use of synthetic oligonucleotide primers and amplification of an RNA or DNA template (see U.S. Pat. Nos. 4,683,195 and 4,683,202; Innis et al, PCR Protocols: A Guide to Methods and Applications, 1990). Methods such as polymerase chain reaction (PCR) and ligase chain reaction (LCR) can be used to amplify nucleic acid sequences of, e.g., Caliban directly from mRNA, from cDNA, from genomic libraries or cDNA libraries. Degenerate oligonucleotides can be designed to amplify Caliban homologs using the sequences provided herein. Restriction endonuclease sites can be incorporated into the primers. Polymerase chain reaction or other in vitro amplification methods may also be useful, for example, to clone nucleic acid sequences that code for proteins to be expressed, to make nucleic acids to use as probes for detecting the presence of Caliban encoding mRNA in physiological samples, for nucleic acid sequencing, or for other purposes. Genes amplified by the PCR reaction can be purified from agarose gels and cloned into an appropriate vector.
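
As an illustration of degenerate oligonucleotide design, the following minimal sketch computes the degeneracy of a primer written with the standard IUPAC ambiguity codes and lists the concrete sequences in the pool. The primer shown is hypothetical and is not derived from any Caliban sequence; the function names are likewise chosen only for this example.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences a degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Return every concrete sequence contained in a degenerate primer pool."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer for illustration only (not a Caliban-derived sequence).
example_primer = "ATGGARTAYN"
print(degeneracy(example_primer))
print(expand(example_primer)[:4])
```

For the hypothetical primer above, the pool contains 2 x 2 x 4 = 16 sequences. A lower degeneracy generally means that a larger fraction of the pool matches any given target.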

[0216] As described above, gene expression of Caliban can also be analyzed by techniques known in the art, e.g., reverse transcription and PCR amplification of mRNA, isolation of total RNA or poly A+ RNA, northern blotting, dot blotting, in situ hybridization, RNase protection, probing high density oligonucleotides, and the like. All of these techniques are standard in the art.

[0217] Synthetic oligonucleotides can be used to construct recombinant genes for use as probes or for expression of protein. This method is performed using a series of overlapping oligonucleotides usually 40-120 bp in length, representing both the sense and non-sense strands of the gene. These DNA fragments are then annealed, ligated and cloned. Alternatively, amplification techniques can be used with precise primers to amplify a specific subsequence of the Caliban nucleic acid. The specific subsequence is then ligated into an expression vector.
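
The overlapping-oligonucleotide approach can be sketched as follows. This is a minimal illustration only: the 60-nucleotide oligo length and 20-nucleotide overlap are assumptions chosen within the 40-120 bp range stated above, the input sequence is a placeholder, and the function names are hypothetical.

```python
# Translation table for complementing an unambiguous DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_oligos(gene: str, oligo_len: int = 60, overlap: int = 20) -> list[str]:
    """Split a gene into overlapping oligos that alternate between strands.

    Even-numbered oligos are taken from the given (sense) strand; odd-numbered
    oligos are reverse-complemented so that neighbours anneal over `overlap` bases.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    for index, start in enumerate(range(0, len(gene), step)):
        fragment = gene[start:start + oligo_len]
        if index % 2:
            fragment = reverse_complement(fragment)
        oligos.append(fragment)
        if start + oligo_len >= len(gene):
            break  # the last oligo reaches the end of the gene
    return oligos


# Placeholder sequence for illustration only (not a Caliban sequence).
placeholder_gene = "ACGT" * 25
for oligo in tile_oligos(placeholder_gene, oligo_len=40, overlap=20):
    print(oligo)
```

In practice the overlap length would be chosen so that all annealing regions have similar melting temperatures; that refinement is omitted here.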

[0218] The nucleic acid encoding the protein of choice is typically cloned into intermediate vectors before transformation into prokaryotic or eukaryotic cells for replication and/or expression. These intermediate vectors are typically prokaryote vectors, e.g., plasmids, or shuttle vectors. Optionally, cells can be transfected with recombinant Caliban operably linked to a constitutive promoter, to provide higher levels of Caliban expression in cultured cells.

C. Expression in Prokaryotes and Eukaryotes

[0219] To obtain high level expression of a cloned gene or nucleic acid, such as those cDNAs encoding Caliban, one typically subclones Caliban into an expression vector that contains a strong promoter to direct transcription, a transcription/translation terminator, and if for a nucleic acid encoding a protein, a ribosome binding site for translational initiation. Suitable bacterial promoters are well known in the art and described, e.g., in Sambrook et al. and Ausubel et al. Bacterial expression systems for expressing the Caliban protein are available in, e.g., E. coli, Bacillus sp., and Salmonella (Palva et al, Gene 22: 229-235, 1983). Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast, and insect cells are well known in the art and are also commercially available.

[0220] The promoter used to direct expression of a heterologous nucleic acid depends on the particular application. The promoter is preferably positioned about the same distance from the heterologous transcription start site as it is from the transcription start site in its natural setting. As is known in the art, however, some variation in this distance can be accommodated without loss of promoter function. The promoter typically can also include elements that are responsive to transactivation, e.g., hypoxia responsive elements, Gal4 responsive elements, lac repressor responsive elements, and the like. The promoter can be constitutive or inducible, heterologous or homologous.

[0221] In addition to the promoter, the expression vector typically contains a transcription unit or expression cassette that contains all the additional elements required for the expression of the nucleic acid in host cells. A typical expression cassette thus contains a promoter operably linked, e.g., to the nucleic acid sequence encoding Caliban, and signals required for efficient polyadenylation of the transcript, ribosome binding sites, and translation termination. The nucleic acid sequence may typically be linked to a cleavable signal peptide sequence to promote secretion of the encoded protein by the transformed cell. Such signal peptides would include, among others, the signal peptides from tissue plasminogen activator, insulin, and neuron growth factor, and juvenile hormone esterase of Heliothis virescens. Additional elements of the cassette may include enhancers and, if genomic DNA is used as the structural gene, introns with functional splice donor and acceptor sites.

[0222] In addition to a promoter sequence, the expression cassette should also contain a transcription termination region downstream of the structural gene to provide for efficient termination. The termination region may be obtained from the same gene as the promoter sequence or may be obtained from different genes.

[0223] The particular expression vector used to transport the genetic information into the cell is not particularly critical (one expression vector is described in Example I). Any of the conventional vectors used for expression in eukaryotic or prokaryotic cells may be used. Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and fusion expression systems such as GST and LacZ. Epitope tags can also be added to recombinant proteins to provide convenient methods of isolation, e.g., c-myc.

[0224] Expression vectors containing regulatory elements from eukaryotic viruses are typically used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells.

[0225] Some expression systems have markers that provide gene amplification such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. Alternatively, high yield expression systems not involving gene amplification are also suitable, such as using a baculovirus vector in insect cells, with a Caliban encoding sequence under the direction of the polyhedrin promoter or other strong baculovirus promoters.

[0226] The elements that are typically included in expression vectors also include a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of eukaryotic sequences. The particular antibiotic resistance gene chosen is not critical; any of the many resistance genes known in the art are suitable. The prokaryotic sequences are preferably chosen such that they do not interfere with the replication of the DNA in eukaryotic cells, if necessary.

[0227] Standard transfection methods are used to produce bacterial, mammalian, yeast or insect cell lines that express large quantities of protein, which are then purified using standard techniques (see, e.g., Colley et al, J. Biol. Chem. 264: 17619-17622, 1989; Deutscher, Guide to Protein Purification, in Methods in Enzymology, 182: 1990). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, J. Bact. 132: 349-351, 1977; Clark-Curtiss et al, Methods in Enzymology 101: 347-362, and Wu et al, 1983).

[0228] Any of the well known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, liposomes, microinjection, plasmid vectors, viral vectors and any of the other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g., Sambrook et al, supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the protein of choice.

[0229] After the expression vector is introduced into the cells, the transfected cells are cultured under conditions favoring expression of the Caliban protein, which is recovered from the culture using standard techniques identified below.

4. PURIFICATION OF CALIBAN

[0230] If necessary, naturally occurring or recombinant proteins can be purified for use in functional assays, e.g., to make antibodies to detect Caliban. Naturally occurring Caliban is purified, e.g., from mammalian tissue such as lung, colon, or liver cell lines, or any other source of a Caliban homolog. Recombinant Caliban is purified from any suitable expression system, e.g., by expressing Caliban in E. coli and then purifying the recombinant protein via affinity purification, e.g., by using antibodies that recognize a specific epitope on the protein or on part of the fusion protein, or by using glutathione affinity gel, which binds to GST. In some embodiments, the recombinant protein is a fusion protein, e.g., with GST or Gal4 at the N-terminus.

[0231] The protein of choice may be purified to substantial purity by standard techniques, including selective precipitation with such substances as ammonium sulfate, column chromatography, immunopurification methods, and others (see, e.g., Scopes, Protein Purification: Principles and Practice, 1982; U.S. Pat. No. 4,673,641; Ausubel et al, supra; and Sambrook et al, supra).

[0232] A number of procedures can be employed when recombinant protein is being purified. For example, proteins having established molecular adhesion properties can be reversibly fused to Caliban. With the appropriate ligand, Caliban can be selectively adsorbed to a purification column and then freed from the column in a relatively pure form. The fused protein is then removed by enzymatic activity. Finally, Caliban could be purified using immunoaffinity columns.

A. Purification of Caliban from Recombinant Bacteria

[0233] Recombinant proteins are expressed by transformed bacteria in large amounts, typically after promoter induction; but expression can be constitutive. Promoter induction with IPTG is one example of an inducible promoter system. Bacteria are grown according to standard procedures in the art. Fresh or frozen bacterial cells are used for isolation of protein.

[0234] Proteins expressed in bacteria may form insoluble aggregates ("inclusion bodies"). Several protocols are suitable for purification of inclusion bodies. For example, purification of inclusion bodies typically involves the extraction, separation and/or purification of inclusion bodies by disruption of bacterial cells, e.g., by incubation in a buffer of 50 mM Tris/HCl pH 7.5, 50 mM NaCl, 5 mM MgCl2, 1 mM DTT, 0.1 mM ATP, and 1 mM PMSF. The cell suspension can be lysed using 2-3 passages through a French press, homogenized using a Polytron (Brinkman Instruments) or sonicated on ice. Alternate methods of lysing bacteria are apparent to those of skill in the art (see, e.g., Sambrook et al., supra; Ausubel et al., supra).
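
By way of a worked example, the lysis buffer described above can be assembled from concentrated stock solutions using the dilution relation C1 x V1 = C2 x V2. The final concentrations below are those given in the text; the stock concentrations and the 100 mL batch size are assumptions made only for this sketch and are not specified herein.

```python
# Final concentrations (mM) of the lysis buffer as described in the text.
FINAL_MM = {
    "Tris/HCl pH 7.5": 50.0,
    "NaCl": 50.0,
    "MgCl2": 5.0,
    "DTT": 1.0,
    "ATP": 0.1,
    "PMSF": 1.0,
}

# Stock concentrations (mM). These values are ASSUMED for illustration.
STOCK_MM = {
    "Tris/HCl pH 7.5": 1000.0,
    "NaCl": 5000.0,
    "MgCl2": 1000.0,
    "DTT": 1000.0,
    "ATP": 100.0,
    "PMSF": 100.0,
}


def stock_volumes(final_ml: float) -> dict[str, float]:
    """Return the volume (mL) of each stock, plus water, for a batch of buffer.

    Uses C1 * V1 = C2 * V2, i.e. V1 = final_volume * C_final / C_stock.
    """
    volumes = {
        name: final_ml * FINAL_MM[name] / STOCK_MM[name] for name in FINAL_MM
    }
    volumes["water"] = final_ml - sum(volumes.values())
    return volumes


for component, ml in stock_volumes(100.0).items():
    print(f"{component:16s} {ml:7.2f} mL")
```

With the assumed stocks, a 100 mL batch uses 5 mL of the Tris stock, 1 mL of the NaCl stock, and water to volume. Labile components such as DTT and PMSF are commonly added shortly before use.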

[0235] If necessary, the inclusion bodies are solubilized, and the lysed cell suspension is typically centrifuged to remove unwanted insoluble matter. Proteins that formed the inclusion bodies may be renatured by dilution or dialysis with a compatible buffer. Suitable solvents include, but are not limited to urea (from about 4 M to about 8 M), formamide (at least about 80%, volume/volume basis), and guanidine hydrochloride (from about 4 M to about 8 M). Some solvents which are capable of solubilizing aggregate-forming proteins, for example SDS (sodium dodecyl sulfate), 70% formic acid, are inappropriate for use in this procedure due to the possibility of irreversible denaturation of the proteins, accompanied by a lack of immunogenicity and/or activity. Although guanidine hydrochloride and similar agents are denaturants, this denaturation is not irreversible and renaturation may occur upon removal (by dialysis, for example) or dilution of the denaturant, allowing re-formation of immunologically and/or biologically active protein. Other suitable buffers are known to those skilled in the art. The protein of choice is separated from other bacterial proteins by standard separation techniques, e.g., with Ni-NTA agarose resin.

[0236] Alternatively, it is possible to purify the recombinant Caliban protein from the bacterial periplasm. After lysis of the bacteria, when the protein is exported into the periplasm of the bacteria, the periplasmic fraction of the bacteria can be isolated by cold osmotic shock in addition to other methods known to those of skill in the art. To isolate recombinant proteins from the periplasm, the bacterial cells are centrifuged to form a pellet. The pellet is resuspended in a buffer containing 20% sucrose. To lyse the cells, the bacteria are centrifuged and the pellet is resuspended in ice-cold 5 mM MgSO4 and kept in an ice bath for approximately 10 minutes. The cell suspension is centrifuged and the supernatant decanted and saved. The recombinant proteins present in the supernatant can be separated from the host proteins by standard separation techniques well known to those of skill in the art.

B. Standard Protein Separation Techniques for Purifying Caliban

Solubility Fractionation

[0237] Often as an initial step, particularly if the protein mixture is complex, an initial salt fractionation can separate many of the unwanted host cell proteins (or proteins derived from the cell culture media) from the recombinant protein of interest. The preferred salt is ammonium sulfate. Ammonium sulfate precipitates proteins by effectively reducing the amount of water in the protein mixture. Proteins then precipitate on the basis of their solubility. The more hydrophobic a protein is, the more likely it is to precipitate at lower ammonium sulfate concentrations. A typical protocol includes adding saturated ammonium sulfate to a protein solution so that the resultant ammonium sulfate concentration is between 20-30%. This concentration will precipitate the most hydrophobic of proteins. The precipitate is then discarded (unless the protein of interest is hydrophobic) and ammonium sulfate is added to the supernatant to a concentration known to precipitate the protein of interest. The precipitate is then solubilized in buffer and the excess salt removed if necessary, either through dialysis or diafiltration. Other methods that rely on solubility of proteins, such as cold ethanol precipitation, are well known to those of skill in the art and can be used to fractionate complex protein mixtures.
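
The volume of saturated ammonium sulfate solution needed for each cut can be estimated by a simple mass balance. The sketch below assumes that volumes are additive and that the stock is a fully saturated (100%) solution; the 100 mL sample volume and the 50% second cut are hypothetical values chosen only to illustrate the arithmetic.

```python
def saturated_volume_to_add(volume_ml: float, start_pct: float, target_pct: float) -> float:
    """Volume (mL) of saturated ammonium sulfate solution needed to raise a
    sample from start_pct to target_pct saturation.

    Mass balance, assuming additive volumes:
        (V * S1 + x * 100) / (V + x) = S2   =>   x = V * (S2 - S1) / (100 - S2)
    """
    if not 0 <= start_pct < target_pct < 100:
        raise ValueError("require 0 <= start_pct < target_pct < 100")
    return volume_ml * (target_pct - start_pct) / (100.0 - target_pct)


# First cut: bring a hypothetical 100 mL sample to 25% saturation,
# which lies within the 20-30% range described in the text.
first_add = saturated_volume_to_add(100.0, 0.0, 25.0)

# Second cut: raise the supernatant to a hypothetical 50% saturation,
# treating the pellet volume as negligible (an assumption).
second_add = saturated_volume_to_add(100.0 + first_add, 25.0, 50.0)

print(f"first addition:  {first_add:.1f} mL")
print(f"second addition: {second_add:.1f} mL")
```

Under these assumptions the first cut requires about 33.3 mL and the second about 66.7 mL of saturated solution. When solid ammonium sulfate is added instead, published saturation tables are normally consulted, because the salt itself changes the solution volume.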

Size Differential Filtration

[0238] The molecular weight of the protein, e.g., Caliban, can be used to isolate the protein from proteins of greater and lesser size using ultrafiltration through membranes of different pore size (for example, Amicon or Millipore membranes). As a first step, the protein mixture is ultrafiltered through a membrane with a pore size that has a lower molecular weight cut-off than the molecular weight of the protein of interest. The retentate of the ultrafiltration is then ultrafiltered against a membrane with a molecular weight cut-off greater than the molecular weight of the protein of interest. The recombinant protein will pass through the membrane into the filtrate. The filtrate can then be chromatographed as described below.

Column Chromatography

[0239] The protein of choice can also be separated from other proteins on the basis of its size, net surface charge, hydrophobicity, and affinity for ligands. In addition, antibodies raised against proteins can be conjugated to column matrices and the proteins immunopurified. All of these methods are well known in the art. It will be apparent to one of skill that chromatographic techniques can be performed at any scale and using equipment from many different manufacturers (e.g., Pharmacia Biotech).

5. IMMUNOLOGICAL DETECTION OF CALIBAN

[0240] In addition to the detection of Caliban genes and gene expression using nucleic acid hybridization technology, one can also use immunoassays to detect Caliban, e.g., to identify alleles, mutants, polymorphic variants and interspecies homologs of Caliban. Immunoassays can be used to qualitatively or quantitatively analyze Caliban, e.g., to detect Caliban, to measure Caliban activity, or to identify modulators of Caliban activity. A general overview of the applicable technology can be found in Harlow et al, Antibodies: A Laboratory Manual, 1988.

A. Antibodies to Caliban

[0241] Methods of producing polyclonal and monoclonal antibodies that react specifically with Caliban are known to those of skill in the art (see, e.g., Coligan, Current Protocols in Immunology, 1991; Harlow et al., supra; Goding, Monoclonal Antibodies: Principles and Practice 2: 1986; and Kohler et al. Nature 256: 495-497, 1975). Such techniques include antibody preparation by selection of antibodies from libraries of recombinant antibodies in phage or similar vectors, as well as preparation of polyclonal and monoclonal antibodies by immunizing rabbits or mice (see, e.g., Huse et al, Science 246: 1275-1281, 1989; Ward et al, Nature 341: 544-546, 1989). In addition, as noted above, many companies, such as BMA Biomedicals, Ltd., HTI Bio-products, and the like, provide the commercial service of making an antibody to essentially any peptide.

[0242] A number of Caliban-comprising immunogens may be used to produce antibodies specifically reactive with Caliban. For example, recombinant Caliban, or antigenic fragments thereof, are isolated as described herein. Recombinant protein can be expressed in eukaryotic or prokaryotic cells as described above, and purified as generally described above. Recombinant protein is the preferred immunogen for the production of monoclonal or polyclonal antibodies. Alternatively, a synthetic peptide derived from the sequences disclosed herein and conjugated to a carrier protein can be used as an immunogen. Naturally occurring protein may also be used either in pure or impure form. The product is then injected into an animal capable of producing antibodies. Either monoclonal or polyclonal antibodies may be generated, for subsequent use in immunoassays to measure the protein.

[0243] Methods of production of polyclonal antibodies are known to those of skill in the art. To improve reproducibility, an inbred strain of mice (e.g., BALB/C mice) can be immunized to make the antibody; however, standard animals (mice, rabbits, and the like) used to make antibodies are immunized with the protein using a standard adjuvant, such as Freund's adjuvant, and a standard immunization protocol (see Harlow et al, supra). The animal's immune response to the immunogen preparation is monitored by taking test bleeds and determining the titer of reactivity to the protein of choice. When appropriately high titers of antibody to the immunogen are obtained, blood is collected from the animal and antisera are prepared. Further fractionation of the antisera to enrich for antibodies reactive to the protein can be done if desired (see Harlow et al, supra).

[0244] Monoclonal antibodies may be obtained by various techniques familiar to those skilled in the art. Briefly, spleen cells from an animal immunized with a desired antigen are immortalized, commonly by fusion with a myeloma cell (see Kohler et al. Eur. J. Immunol. 6: 511-519, 1976). Alternative methods of immortalization include transformation with Epstein Barr Virus, oncogenes, or retroviruses, or other methods well known in the art. Colonies arising from single immortalized cells are screened for production of antibodies of the desired specificity and affinity for the antigen, and yield of the monoclonal antibodies produced by such cells may be enhanced by various techniques, including injection into the peritoneal cavity of a vertebrate host. Alternatively, one may isolate DNA sequences which encode a monoclonal antibody or a binding fragment thereof by screening a DNA library from human B cells according to the general protocol outlined by (Huse et al, Science 246: 1275-1281, 1989).

[0245] Monoclonal antibodies and polyclonal sera are collected and titered against the immunogen protein in an immunoassay, for example, a solid phase immunoassay with the immunogen immobilized on a solid support. Typically, polyclonal antisera with a titer of 10^4 or greater are selected and tested for their cross reactivity against non-Caliban proteins or even other related proteins, e.g., from other organisms, using a competitive binding immunoassay. Specific polyclonal antisera and monoclonal antibodies will usually bind with KD of at least about 0.1 mM, more usually at least about 1 μM, preferably at least about 0.1 μM or better, and most preferably, 0.01 μM or better.

[0246] Once Caliban specific antibodies are available, these proteins can be detected by a variety of immunoassay methods. For a review of immunological and immunoassay procedures, see Basic and Clinical Immunology (Stites et al, 7: 1991). Moreover, the immunoassays of the present invention can be performed in any of several configurations, which are reviewed extensively in Enzyme Immunoassay (Maggio, 1980; and Harlow et al, supra).

B. Immunological Binding Assays

[0247] Caliban can be detected and/or quantified using any of a number of well recognized immunological binding assays (see, e.g., U.S. Pat. Nos. 4,366,241; 4,376,110; 4,517,288; and 4,837,168). For a review of the general immunoassays, see also Asai, Methods in Cell Biology: Antibodies in Cell Biology, 37: 1993; Stites et al, Basic and Clinical Immunology 7, 1991. Immunological binding assays (or immunoassays) typically use an antibody that specifically binds to a protein or antigen of choice (in this case Caliban, or antigenic fragments thereof). The antibody may be produced by any of a number of means well known to those of skill in the art and as described above.

[0248] Immunoassays also often use a labeling agent to specifically bind to and label the complex formed by the antibody and antigen. The labeling agent may itself be one of the moieties comprising the antibody/antigen complex. Thus, the labeling agent may be a labeled Caliban polypeptide or a labeled anti-Caliban. Alternatively, the labeling agent may be a third moiety, such as a secondary antibody, that specifically binds to the antibody/antigen complex (a secondary antibody is typically specific to antibodies of the species from which the first antibody is derived). Other proteins capable of specifically binding immunoglobulin constant regions, such as protein A or protein G, may also be used as the label agent. These proteins exhibit a strong non-immunogenic reactivity with immunoglobulin constant regions from a variety of species (see, e.g., Kronval et al, J. Immunol. 111: 1401-1406, 1973; Akerstrom et al, J. Immunol. 135: 2589-2592, 1985). The labeling agent can be modified with a detectable moiety, such as biotin, to which another molecule can specifically bind, such as streptavidin. A variety of detectable moieties are well known to those skilled in the art.

[0249] Throughout the assays, incubation and/or washing steps may be required after each combination of reagents. Incubation steps can vary from about 5 seconds to several hours, preferably from about 5 minutes to about 24 hours. However, the incubation time will depend upon the assay format, antigen, volume of solution, concentrations, and the like. Usually, the assays will be carried out at ambient temperature, although they can be conducted over a range of temperatures, such as 10°C to 40°C.

Non-Competitive Assay Formats

[0250] Immunoassays for detecting Caliban in samples may be either competitive or noncompetitive. Noncompetitive immunoassays are assays in which the amount of antigen is directly measured. In one preferred "sandwich" assay, for example, the anti-antigen antibodies can be bound directly to a solid substrate on which they are immobilized. These immobilized antibodies then capture antigen present in the test sample. Antigen thus immobilized is then bound by a labeling agent, such as a second antibody bearing a label. Alternatively, the second antibody may lack a label, but it may, in turn, be bound by a labeled third antibody specific to antibodies of the species from which the second antibody is derived. The second or third antibody is typically modified with a detectable moiety, such as biotin, to which another molecule specifically binds, e.g., streptavidin, to provide a detectable moiety.

Competitive Assay Formats

[0251] In competitive assays, the amount of Caliban present in the sample is measured indirectly by measuring the amount of a known, added (exogenous) antigen displaced (competed away) from an anti-antigen antibody by the unknown antigen present in a sample. In one competitive assay, a known amount of antigen is added to a sample and the sample is then contacted with an antibody that specifically binds to the antigen. The amount of exogenous antigen bound to the antibody is inversely proportional to the concentration of antigen present in the sample. In a particularly preferred embodiment, the antibody is immobilized on a solid substrate. The amount of antigen bound to the antibody may be determined either by measuring the amount of antigen present in an antigen/antibody complex, or alternatively by measuring the amount of remaining uncomplexed protein. The amount of antigen may be detected by providing a labeled antigen molecule.

[0252] A hapten inhibition assay is another preferred competitive assay. In this assay the known antigen is immobilized on a solid substrate. A known amount of anti-antigen antibody is added to the sample, and the sample is then contacted with the immobilized antigen. The amount of anti-antigen antibody bound to the known immobilized antigen is inversely proportional to the amount of antigen present in the sample. Again, the amount of immobilized antibody may be detected by detecting either the immobilized fraction of antibody or the fraction of the antibody that remains in solution. Detection may be direct where the antibody is labeled or indirect by the subsequent addition of a labeled moiety that specifically binds to the antibody as described above.

Cross-Reactivity Determinations

[0253] Immunoassays in the competitive binding format can also be used for cross-reactivity determinations. For example, Caliban proteins can be immobilized to a solid support. Proteins are added to the assay that compete for binding of the antisera to the immobilized antigen. The ability of the added protein to compete for binding of the antisera to the immobilized protein is compared to the ability of antigen to compete with itself. The percent cross-reactivity for the above proteins is calculated, using standard calculations. Those antisera with less than 10% cross-reactivity with the added proteins are selected and pooled. The cross-reacting antibodies are optionally removed from the pooled antisera by immunoabsorption with the added proteins.

[0254] The immunoabsorbed and pooled antisera are then used in a competitive binding immunoassay as described above to compare a second protein, thought to be perhaps an allele, interspecies homolog, or polymorphic variant of Caliban, to the immunogen protein. In order to make this comparison, the two proteins are each assayed at a wide range of concentrations and the amount of each protein required to inhibit 50% of the binding of the antisera to the immobilized protein is determined. If the amount of the second protein required to inhibit 50% of binding is less than 10 times the amount of the first protein that is required to inhibit 50% of binding, then the second protein is said to specifically bind to the polyclonal antibodies generated to the immunogen of choice.

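The 50% inhibition comparison described above can be sketched numerically as follows. This is an illustration under stated assumptions: the 50% point is estimated by log-linear interpolation between measured points, percent cross-reactivity is taken as the ratio of the two 50% inhibition amounts (one common convention, not one prescribed herein), and all data values and function names are hypothetical.

```python
import math


def ic50(concs: list[float], pct_bound: list[float]) -> float:
    """Estimate the competitor amount that leaves 50% of antisera binding.

    concs must be ascending; pct_bound is binding as a percentage of the
    uninhibited control. Uses log-linear interpolation between the two
    measured points that bracket 50%.
    """
    points = list(zip(concs, pct_bound))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= 50.0 >= b_hi and b_lo != b_hi:
            frac = (b_lo - 50.0) / (b_lo - b_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("50% inhibition is not bracketed by the data")


def percent_cross_reactivity(ic50_immunogen: float, ic50_test: float) -> float:
    """Cross-reactivity of the test protein relative to the immunogen (assumed convention)."""
    return 100.0 * ic50_immunogen / ic50_test


def binds_specifically(ic50_immunogen: float, ic50_test: float, fold: float = 10.0) -> bool:
    """Apply the criterion in the text: less than 10 times the immunogen amount."""
    return ic50_test < fold * ic50_immunogen


# Hypothetical dilution series (arbitrary concentration units, % of control binding).
concs = [1.0, 10.0, 100.0, 1000.0]
immunogen_ic50 = ic50(concs, [90.0, 50.0, 10.0, 2.0])
second_ic50 = ic50(concs, [92.0, 70.0, 30.0, 5.0])

print(f"immunogen IC50:      {immunogen_ic50:.1f}")
print(f"second protein IC50: {second_ic50:.1f}")
print(f"cross-reactivity:    {percent_cross_reactivity(immunogen_ic50, second_ic50):.1f}%")
print(f"specific binding:    {binds_specifically(immunogen_ic50, second_ic50)}")
```

With these hypothetical data the second protein requires roughly three times as much material as the immunogen to reach 50% inhibition, so it satisfies the less-than-10-fold criterion. A fitted four-parameter curve would normally replace the interpolation when enough data points are available.
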
Other Assay Formats

[0255] Western blot (immunoblot) analysis is used to detect and quantify the presence of Caliban in the sample. The technique generally comprises separating sample proteins by gel electrophoresis on the basis of molecular weight, transferring the separated proteins to a suitable solid support (such as a nitrocellulose filter, a nylon filter, or derivatized nylon filter), and incubating the sample with the antibodies that specifically bind Caliban. The anti-antigen antibodies specifically bind to the antigen on the solid support. These antibodies may be directly labeled or alternatively may be subsequently detected using labeled antibodies (e.g., labeled sheep anti-mouse antibodies) that specifically bind to the anti-antigen antibodies.

[0256] Other assay formats include liposome immunoassays (LIA), which use liposomes designed to bind specific molecules (e.g., antibodies) and release encapsulated reagents or markers. The released chemicals are then detected according to standard techniques (see Monroe et al., Amer. Clin. Prod. Rev. 5: 34-41, 1986).

Reduction of Non-Specific Binding

[0257] One of skill in the art will appreciate that it is often desirable to minimize nonspecific binding in immunoassays. Particularly where the assay involves an antigen or antibody immobilized on a solid substrate, it is desirable to minimize the amount of non-specific binding to the substrate. Means of reducing such non-specific binding are well known to those of skill in the art. Typically, this technique involves coating the substrate with a proteinaceous composition. In particular, protein compositions such as bovine serum albumin (BSA), nonfat powdered milk, and gelatin are widely used with powdered milk being most preferred.

Labels

[0258] The particular label or detectable group used in the assay is not a critical aspect of the invention, as long as it does not significantly interfere with the specific binding of the antibody used in the assay. The detectable group can be any material having a detectable physical or chemical property. Such detectable labels have been well-developed in the field of immunoassays and, in general, most any label useful in such methods can be applied to the present invention. Thus, a label is any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include magnetic beads (e.g., DYNABEADS®), fluorescent dyes (e.g., fluorescein isothiocyanate, Texas red, rhodamine, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic beads (e.g., polystyrene, polypropylene, latex, and the like).

[0259] The label may be coupled directly or indirectly to the desired component of the assay according to methods well known in the art. As indicated above, a wide variety of labels may be used, with the choice of label depending on sensitivity required, ease of conjugation with the compound, stability requirements, available instrumentation, and disposal provisions.

[0260] Non-radioactive labels are often attached by indirect means. Generally, a ligand molecule (e.g., biotin) is covalently bound to the molecule. The ligand then binds to another molecule (e.g., streptavidin), which is either inherently detectable or covalently bound to a signal system, such as a detectable enzyme, a fluorescent compound, or a chemiluminescent compound. The ligands and their targets can be used in any suitable combination with antibodies that recognize a specific protein, or secondary antibodies that recognize antibodies to the specific protein.

[0261] The molecules can also be conjugated directly to signal generating compounds, e.g., by conjugation with an enzyme or fluorophore. Enzymes of interest as labels will primarily be hydrolases, particularly phosphatases, esterases and glycosidases, or oxidoreductases, particularly peroxidases. Fluorescent compounds include fluorescein and its derivatives, rhodamine and its derivatives, dansyl, umbelliferone, and the like. Chemiluminescent compounds include luciferin and 2,3-dihydrophthalazinediones, e.g., luminol. For a review of various labeling or signal producing systems that may be used, see U.S. Pat. No. 4,391,904.

[0262] Means of detecting labels are well known to those of skill in the art. Thus, for example, where the label is a radioactive label, means for detection include a scintillation counter or photographic film as in autoradiography. Where the label is a fluorescent label, it may be detected by exciting the fluorochrome with the appropriate wavelength of light and detecting the resulting fluorescence. The fluorescence may be detected visually, by means of photographic film, by the use of electronic detectors such as charge coupled devices (CCDs) or photomultipliers and the like. Similarly, enzymatic labels may be detected by providing the appropriate substrates for the enzyme and detecting the resulting reaction product. Finally, simple colorimetric labels may be detected simply by observing the color associated with the label. Thus, in various dipstick assays, conjugated gold often appears pink, while various conjugated beads appear the color of the bead.

[0263] Some assay formats do not require the use of labeled components. For instance, agglutination assays can be used to detect the presence of the target antibodies. In this case, antigen-coated particles are agglutinated by samples comprising the target antibodies. In this format, none of the components need be labeled and the presence of the target antibody is detected by simple visual inspection.

6. ASSAYS FOR MEASURING CHANGES IN CALIBAN REGULATED CELL GROWTH

[0264] Caliban and its alleles, interspecies homologs, and polymorphic variants participate in regulation of cell proliferation and tumor suppression. Therefore, expression of Caliban and its alleles, interspecies homologs, and polymorphic variants in host cells would inhibit cell proliferation and suppress tumor formation. On the other hand, expression of Caliban mutants in a cell could lead to abnormal cell proliferation and loss of tumor suppressor phenotypes. Finally, compounds that activate or inhibit Caliban would indirectly affect regulation of cellular proliferation and tumor suppression. Any of these changes in cell growth can be assessed by using a variety of in vitro and in vivo assays, e.g., ability to grow on soft agar, changes in contact inhibition and density limitation of growth, changes in growth factor or serum dependence, changes in the level of tumor specific markers, changes in invasiveness into Matrigel, changes in apoptosis, changes in cell cycle pattern, changes in tumor growth in vivo, such as in transgenic mice, and the like. Furthermore, these assays can be used to screen for activators, inhibitors, and modulators of Caliban. Such activators, inhibitors, and modulators of Caliban can then be used to modulate Caliban expression in tumor cells or abnormal proliferative cells.

A. Assays for Changes in Cell Growth by Expression of Caliban Constructs

[0265] One or more of the following assays can be used to identify Caliban constructs which are capable of regulating cell proliferation and tumor suppression. The phrase "Caliban constructs" can refer to any of Caliban and its alleles, interspecies homologs, polymorphic variants and mutants. Functional Caliban constructs identified by the following assays can then be used in, e.g., gene therapy to inhibit abnormal cellular proliferation and transformation.

Soft Agar Growth or Colony Formation in Suspension

[0266] Normal cells require a solid substrate to attach and grow. When the cells are transformed, they lose this phenotype and grow detached from the substrate. For example, transformed cells can grow in stirred suspension culture or suspended in semi-solid media, such as semi-solid or soft agar. The transformed cells, when transfected with tumor suppressor genes, regenerate normal phenotype and require a solid substrate to attach and grow.

[0267] Soft agar growth or colony formation in suspension assays can be used to identify Caliban constructs which, when expressed in host cells, inhibit abnormal cellular proliferation and transformation. Typically, transformed host cells (e.g., cells that grow on soft agar) are used in this assay. For example, HeLa or NIH 3T3 cell lines can be used. Expression of a tumor suppressor gene in these transformed host cells would reduce or eliminate the host cells' ability to grow in stirred suspension culture or suspended in semi-solid media, such as semi-solid or soft agar. This is because the host cells would regenerate the anchorage dependence of normal cells, and therefore require a solid substrate to grow. Therefore, this assay can be used to identify Caliban constructs that encode a functional tumor suppressor. Once identified, such Caliban constructs can be used in a number of diagnostic or therapeutic methods, e.g., in gene therapy to inhibit abnormal cellular proliferation and transformation.

[0268] Techniques for soft agar growth or colony formation in suspension assays are described in Freshney, Culture of Animal Cells: A Manual of Basic Technique, 3rd ed., 1994, herein incorporated by reference. See also the methods section of Garkavtsev et al. (1996), supra, herein incorporated by reference.

Contact Inhibition and Density Limitation of Growth

[0269] Normal cells typically grow in a flat and organized pattern in a petri dish until they touch other cells. When the cells touch one another, they are contact inhibited and stop growing. When cells are transformed, however, the cells are not contact inhibited and continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a higher saturation density than normal cells. This can be detected morphologically by the formation of a disoriented monolayer of cells or rounded cells in foci within the regular pattern of normal surrounding cells. Alternatively, labeling index with [3H]-thymidine at saturation density can be used to measure density limitation of growth. See Freshney (1994), supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a normal phenotype, become contact inhibited, and grow to a lower density.

[0270] Contact inhibition and density limitation of growth assays can be used to identify Caliban constructs which are capable of inhibiting abnormal proliferation and transformation in host cells. Typically, transformed host cells (e.g., cells that are not contact inhibited) are used in this assay. For example, HeLa or NIH 3T3 cell lines can be used. Expression of a tumor suppressor gene in these transformed host cells would result in cells which are contact inhibited and grow to a lower saturation density than the transformed cells. Therefore, this assay can be used to identify Caliban constructs which function as a tumor suppressor. Once identified, such Caliban constructs can be used, e.g., in gene therapy to inhibit abnormal cellular proliferation and transformation.

[0271] In this assay, labeling index with [3H]-thymidine at saturation density is a preferred method of measuring density limitation of growth. Transformed host cells are transfected with a Caliban construct and are grown for 24 hours at saturation density in non-limiting medium conditions. The percentage of cells labeling with [3H]-thymidine is determined autoradiographically. See Freshney (1994), supra. The host cells expressing a functional Caliban construct would give rise to a lower labeling index compared to control (e.g., transformed host cells transfected with a vector lacking an insert).
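
The labeling index comparison described above reduces to a simple ratio. The following is a minimal sketch, assuming the labeled and total cell counts have already been scored from the autoradiographs; the function name and the example counts are hypothetical and are not taken from this disclosure.

```python
def labeling_index(labeled_cells: int, total_cells: int) -> float:
    """Return the percentage of scored cells that incorporated [3H]-thymidine."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must be between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical counts, for illustration only.
control_index = labeling_index(labeled_cells=180, total_cells=400)    # vector lacking an insert
construct_index = labeling_index(labeled_cells=60, total_cells=400)   # Caliban construct

# A functional construct is expected to give the lower labeling index.
construct_lowers_index = construct_index < control_index
```

In practice the comparison would be made across replicate cultures rather than a single pair of counts.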

Growth Factor or Serum Dependence

[0272] Growth factor or serum dependence can be used as an assay to identify functional Caliban constructs. Transformed cells have a lower serum dependence than their normal counterparts (see, e.g., Temin, J. Natl. Cancer Inst. 37: 167-175, 1966; Eagle et al., J. Exp. Med. 131: 836-879, 1970; Freshney, supra). This is in part due to release of various growth factors by the transformed cells. When a tumor suppressor gene is transfected and expressed in these transformed cells, the cells would reacquire serum dependence and would release growth factors at a lower level. Therefore, this assay can be used to identify Caliban constructs which encode a functional tumor suppressor. Growth factor or serum dependence of transformed host cells which are transfected with a Caliban construct can be compared with that of control (e.g., transformed host cells which are transfected with a vector without insert). Host cells expressing a functional Caliban would exhibit an increase in growth factor and serum dependence compared to control.

Tumor Specific Marker Levels

[0273] Tumor cells release greater amounts of certain factors (hereinafter "tumor specific markers") than their normal counterparts. For example, plasminogen activator (PA) is released from human glioma at a higher level than from normal brain cells (see, e.g., Gullino, Angiogenesis, 178-184, 1985). Similarly, tumor angiogenesis factor (TAF) is released at a higher level in tumor cells than in their normal counterparts (see, e.g., Folkman, Angiogenesis and cancer, Sem Cancer Biol., 1992).

[0274] Tumor specific markers can be assayed for to identify Caliban constructs which, when expressed, decrease the level of release of these markers from host cells. Typically, transformed or tumorigenic host cells are used. Expression of a tumor suppressor gene in these host cells would reduce or eliminate the release of tumor specific markers from these cells. Therefore, this assay can be used to identify Caliban constructs that encode a functional tumor suppressor.

[0275] Various techniques which measure the release of these factors are described in Freshney (1994), supra. See also Unkless et al., J. Biol. Chem. 249: 4295-4305, 1974; Strickland et al., J. Biol. Chem. 251: 5694-5702, 1976; Whur et al., Br. J. Cancer 42: 305-312, 1980; Gullino, Angiogenesis, 1985; Freshney, Anticancer Res. 5: 111-130, 1985.

Invasiveness into Matrigel

[0276] The degree of invasiveness into Matrigel or some other extracellular matrix constituent can be used as an assay to identify Caliban constructs which are capable of inhibiting abnormal cell proliferation and tumor growth. Tumor cells exhibit a good correlation between malignancy and invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor gene in these host cells would decrease invasiveness of the host cells. Therefore, functional Caliban constructs can be identified by measuring changes in the level of invasiveness between the host cells before and after the introduction of Caliban constructs. If a Caliban construct functions as a tumor suppressor, its expression in tumorigenic host cells would decrease invasiveness.

[0277] Techniques described in Freshney (1994), supra, can be used. Briefly, the level of invasion of host cells can be measured by using filters coated with Matrigel or some other extracellular matrix constituent. Penetration into the gel, or through to the distal side of the filter, is rated as invasiveness, and rated histologically by number of cells and distance moved, or by prelabeling the cells with 125I and counting the radioactivity on the distal side of the filter or bottom of the dish. See, e.g., Freshney (1984), supra.

Apoptosis Analysis

[0278] Apoptosis analysis can be used as an assay to identify functional Caliban constructs. Caliban expression or overexpression causes apoptosis. In this assay, cell lines, such as A549 or EKVX, can be used to screen Caliban constructs which encode a functional tumor suppressor. Cells are transfected with a putative Caliban construct. The cells can be co-transfected with a construct comprising a marker gene, such as a gene that encodes green fluorescent protein. Alternatively, a single construct comprising a putative Caliban gene and a marker gene can be transfected into cells. Overexpression of a Caliban gene that encodes a functional tumor suppressor would cause apoptosis. Not wishing to be bound by a theory, exogenous expression of a tumor suppressor can decrease cell proliferation by causing a cell cycle arrest and by increasing cell death. The apoptotic change can be determined using methods known in the art, such as DAPI staining and TUNEL assay using a fluorescence microscope. For the TUNEL assay, a commercially available kit can be used (e.g., Fluorescein FragEL DNA Fragmentation Detection Kit (Oncogene Research Products, Cat. # QIA39) + Tetramethylrhodamine-5-dUTP (Roche, Cat. # 1534 378)). Cells expressing a functional Caliban would exhibit increased apoptosis compared to control (e.g., a cell transfected with a vector without a Caliban gene insert).

G0/G1 Cell Cycle Arrest Analysis

[0279] G0/G1 cell cycle arrest can be used as an assay to identify functional Caliban constructs. In this assay, cell lines, such as A549 or EKVX, can be used to screen Caliban constructs which encode a functional tumor suppressor. Cells are transfected with a putative Caliban construct. The cells can be co-transfected with a construct comprising a marker gene, such as a gene that encodes green fluorescent protein. Alternatively, a single construct comprising a putative Caliban gene and a marker gene can be transfected into cells. Expression or overexpression of a Caliban gene that encodes a functional tumor suppressor would cause G0/G1 cell cycle arrest. Methods known in the art can be used to measure the degree of G1 cell cycle arrest. For example, the propidium iodide signal can be used as a measure of DNA content to determine cell cycle profiles on a flow cytometer. The percentage of cells in each cell cycle phase can be calculated. Cells expressing a functional Caliban would exhibit a higher number of cells that are arrested in G0/G1 phase compared to control (e.g., cells transfected with a vector without a Caliban gene insert).
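
The calculation of the percentage of cells in each phase can be sketched as follows. This is an illustrative simplification only: the fixed gates around the G1 peak, the `tolerance` parameter, the function name, and the example signals are all assumptions introduced here, and real flow cytometry analysis typically fits the DNA content histogram rather than applying hard thresholds.

```python
from collections import Counter


def cell_cycle_percentages(pi_signals, g1_peak, tolerance=0.15):
    """Assign each cell to a phase from its propidium iodide signal; return percentages.

    Assumed gates: G0/G1 lies within `tolerance` of the G1 peak, G2/M within
    `tolerance` of twice the G1 peak, and S phase falls between the two gates.
    """
    if not pi_signals:
        raise ValueError("pi_signals must not be empty")
    g1_low, g1_high = g1_peak * (1 - tolerance), g1_peak * (1 + tolerance)
    g2_low, g2_high = 2 * g1_peak * (1 - tolerance), 2 * g1_peak * (1 + tolerance)
    counts = Counter()
    for signal in pi_signals:
        if g1_low <= signal <= g1_high:
            counts["G0/G1"] += 1
        elif g1_high < signal < g2_low:
            counts["S"] += 1
        elif g2_low <= signal <= g2_high:
            counts["G2/M"] += 1
        else:
            counts["ungated"] += 1  # debris, sub-G1 events, or aggregates
    total = len(pi_signals)
    return {phase: 100.0 * counts[phase] / total
            for phase in ("G0/G1", "S", "G2/M", "ungated")}


# Hypothetical signals in arbitrary fluorescence units, for illustration only.
profile = cell_cycle_percentages([100, 105, 95, 150, 200, 210, 98, 102], g1_peak=100)
```

A higher G0/G1 percentage in construct-transfected cells than in vector-only controls would be consistent with the arrest described above.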

Tumor growth in vivo

[0280] Effects of Caliban on cell growth can be tested in transgenic or immune-suppressed mice. Knock-out transgenic mice can be made and used as described herein. Such knock-out mice can be used to study effects of Caliban, e.g., as a cancer model, as a means of assaying in vivo for compounds that modulate Caliban, and to test the effects of restoring a wildtype Caliban to knock-out mice.

[0281] Alternatively, various immune-suppressed or immune-deficient host animals can be used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella et al., J. Natl. Cancer Inst. 52: 921, 1974), a SCID mouse, a thymectomized mouse, or an irradiated mouse (see, e.g., Bradley et al., Br. J. Cancer 38: 263, 1978; Selby et al., Br. J. Cancer 41: 52, 1980) can be used as a host. Transplantable tumor cells (typically about 10^6 cells) injected into isogenic hosts will produce invasive tumors in a high proportion of cases, while normal cells of similar origin will not. In hosts which developed invasive tumors, cells expressing a Caliban construct are injected subcutaneously. After a suitable length of time, preferably 4-8 weeks, tumor growth is measured (e.g., by volume or by its two largest dimensions) and compared to the control. Tumors that have statistically significant reduction (using, e.g., Student's T test) are said to have inhibited growth. Using reduction of tumor size as an assay, functional Caliban constructs which are capable of inhibiting abnormal cell proliferation can be identified. This model can also be used to identify mutant versions of Caliban.

B. Assays for Compounds that Modulate Caliban

[0282] Caliban and its alleles, interspecies homologs, and polymorphic variants participate in regulation of cell proliferation and tumor suppression. Mutations in these genes, including null or missense mutations, can cause abnormal cell proliferation and tumor growth. The activity of Caliban polypeptides (wildtype or mutants) can be assessed using a variety of in vitro and in vivo assays measuring various parameters, e.g., cell growth on soft agar, contact inhibition and density limitation of growth, growth factor or serum dependence, tumor specific marker levels, invasiveness into Matrigel, tumor growth in vivo, transgenic mice, Caliban protein or mRNA levels, transcriptional activation or repression of a reporter gene, apoptosis analysis, G0/G1 cell cycle arrest, and the like. Such assays can also be used to screen for activators, inhibitors, and modulators of wildtype and mutant Caliban. Such activators, inhibitors, and modulators are useful in inhibiting tumor growth and modulating cell proliferation. Compounds identified using the assays of the invention are useful as therapeutics for treatment of cancer and other diseases involving cellular hyperproliferation.

[0283] Biologically active or inactivated Caliban polypeptides, either recombinants or naturally occurring, are used to screen activators, inhibitors, or modulators of tumor suppression and cell proliferation. The Caliban polypeptides can be recombinantly expressed in a cell, naturally expressed in a cell, recombinantly or naturally expressed in cells transplanted into an animal, or recombinantly or naturally expressed in a transgenic animal. Modulation is tested using one of the in vitro or in vivo assays described herein.

[0284] Cells that have wildtype Caliban, Caliban null mutations, Caliban missense mutations, or inactivation of Caliban are used in the assays of the invention, both in vitro and in vivo. Preferably, human cells are used. Cell lines can also be created or isolated from tumors that have mutant Caliban. Optionally, the cells can be transfected with an exogenous Caliban gene operably linked to a constitutive promoter, to provide higher levels of Caliban expression. Alternatively, endogenous Caliban levels can be examined. The cells can be treated to induce Caliban expression. The cells can be immobilized, be in solution, be injected into an animal, or be naturally occurring in a transgenic or non-transgenic animal.

[0285] Samples or assays that are treated with a test compound which potentially activates, inhibits, or modulates Caliban are compared to control samples that are not treated with the test compound, to examine the extent of modulation. Generally, the compounds to be tested are present in the range from 0.11 nM to 10 mM. Control samples (untreated with activators, inhibitors, or modulators) are assigned a relative Caliban activity value of 100%. Inhibition of Caliban is achieved when the Caliban activity value relative to the control is about 90% (e.g., 10% less than the control), optionally 80% or less, 70% or less, 60% or less, 50% or less, 40% or less, or 25-0%. Activation of Caliban is achieved when the Caliban activity value relative to the control is 110% or more (e.g., at least 10% more than the control), optionally 120%, 130%, 140%, 150% or more, 200-500% or more, 1000-3000% or more.
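
The thresholds in this paragraph can be expressed as a small classification rule. The sketch below is illustrative only: it reads "about 90%" as a cutoff of 90% or less, which is an interpretation rather than a stated limit, and the function name and return labels are hypothetical.

```python
def classify_modulation(treated_activity: float, control_activity: float) -> str:
    """Classify a test compound by Caliban activity relative to an untreated control.

    The control is assigned 100%. A relative value of 90% or less is treated as
    inhibition and 110% or more as activation (cutoffs assumed from the text).
    """
    if control_activity <= 0:
        raise ValueError("control_activity must be positive")
    relative = 100.0 * treated_activity / control_activity
    if relative <= 90.0:
        return "inhibitor"
    if relative >= 110.0:
        return "activator"
    return "no modulation"


# Hypothetical activity readings in arbitrary units, for illustration only.
calls = [classify_modulation(t, 100.0) for t in (45.0, 100.0, 150.0)]
```

Any assay readout described herein (e.g., reporter activity or protein level) could supply the activity values, provided treated and control samples are measured in the same units.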

[0286] The effects of the test compounds upon the function of the Caliban polypeptides can be measured by examining any of the parameters described above. For example, parameters such as ability to grow on soft agar, contact inhibition and density limitation of growth, growth factor or serum dependence, tumor specific marker levels, invasiveness into Matrigel, apoptosis, G0/G1 cell cycle arrest, tumor growth in vivo, transgenic mice and the like, can be measured. Furthermore, the effects of the test compounds on Caliban protein or mRNA levels, or on transcriptional activation or repression of a reporter gene, can be measured. In each assay, cells expressing Caliban are contacted with a test compound and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. Then, parameters such as those described above are compared to those produced by control cells untreated with the test compound.

[0287] In one embodiment, the effect of test compounds upon the function of Caliban can be determined by comparing the level of Caliban protein or mRNA in treated samples and control samples. The level of Caliban protein is measured using immunoassays such as western blotting, ELISA and the like with a Caliban specific antibody. For measurement of mRNA, amplification, e.g., using PCR, LCR, or hybridization assays, e.g., northern hybridization, RNase protection, dot blotting, are preferred. The level of protein or mRNA is detected using directly or indirectly labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids, radioactively or enzymatically labeled antibodies, and the like, as described herein.

[0288] Alternatively, a reporter gene system can be devised using the Caliban promoter operably linked to a reporter gene such as luciferase, green fluorescent protein, CAT, or β-gal. After treatment with a potential Caliban modulator, the amount of reporter gene transcription, translation, or activity is measured according to standard techniques known to those of skill in the art.

[0289] In another embodiment, the effect of test compounds on Caliban activity is assessed in vivo. In this assay, cultured cells that are expressing a wildtype or mutant Caliban (e.g., a null or missense mutation) are injected subcutaneously into an immune compromised mouse such as an athymic mouse, an irradiated mouse, or a SCID mouse. Caliban modulators, e.g., from a chemical ligand library, are administered to the mouse. After a suitable length of time, preferably 4-8 weeks, tumor growth is measured, e.g., by volume or by its two largest dimensions, and compared to the control. Tumors that have statistically significant reduction (using, e.g., Student's T test) are said to have inhibited growth. Alternatively, the extent of tumor neovascularization can also be measured. Immunoassays using endothelial cell specific antibodies are used to stain for vascularization of the tumor and the number of vessels in the tumor. Tumors that have a statistically significant reduction in the number of vessels (using, e.g., Student's T test) are said to have inhibited neovascularization.
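
The tumor growth comparison can be sketched as below. Two assumptions are made that do not come from this disclosure: tumor volume is estimated from the two largest dimensions with the commonly used approximation length × width² / 2, and the comparison uses the equal-variance (pooled) form of Student's t test. The function names and measurements are hypothetical.

```python
import math
from statistics import mean, variance


def tumor_volume(length_mm: float, width_mm: float) -> float:
    """Estimate volume (mm^3) from the two largest dimensions (assumed approximation)."""
    return length_mm * width_mm ** 2 / 2.0


def student_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance two-sample t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two measurements")
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0:
        raise ValueError("no variation in either group; t is undefined")
    return (mean(sample_a) - mean(sample_b)) / std_err, dof


# Hypothetical (length, width) measurements in mm, for illustration only.
control = [tumor_volume(l, w) for l, w in [(12, 10), (14, 11), (13, 10)]]
treated = [tumor_volume(l, w) for l, w in [(8, 6), (9, 7), (7, 6)]]
t_stat, dof = student_t(control, treated)
```

The resulting t statistic would then be compared against the critical value of the t distribution for `dof` degrees of freedom at the chosen significance level.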

[0290] Alternatively, transgenic mice with the endogenous Caliban gene knocked out can be used in an assay to screen for compounds which modulate the Caliban activity. As described in part A, knock-out transgenic mice can be made, in which the endogenous Caliban gene is disrupted, e.g., by replacing it with a marker gene. A transgenic mouse that is heterozygous or homozygous for integrated transgenes that have functionally disrupted the endogenous Caliban gene can be used as a sensitive in vivo screening assay for Caliban ligands and modulators of Caliban activity.

C. Candidate Bioactive Agents

[0291] Having identified Caliban genes and their homologs (see, e.g., the Figures), the information can be used in a wide variety of ways. In a preferred method, the genes can be used in conjunction with high throughput screening techniques as described herein, to allow monitoring for genes after treatment with a candidate agent (Zlokarnik et al., Science 279: 84-8, 1998; Heid et al., Genome Res. 6: 986, 1996). In a preferred method, the candidate agents are added to cells.

[0292] The term "modulator", "candidate substance", "candidate bioactive agent", "drug candidate", "agent" or grammatical equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic molecule, polysaccharide, polynucleotide, to be tested for bioactive agents that are capable of directly or indirectly altering the activity of a target gene, protein, or cell. In preferred methods, the bioactive agents modulate the expression profiles, or expression profile nucleic acids or proteins provided herein. In a particularly preferred method, the candidate agents induce a response, or maintain such a response as indicated, for example, by the effect of the agent on the expression profile, nucleic acids, proteins or activity as further described below. Generally a plurality of assay mixtures are run in parallel with different agent concentrations to obtain a differential response to the various concentrations. Typically, one of these concentrations serves as a negative control, i.e., at zero concentration or below the level of detection.

[0293] Candidate agents encompass numerous chemical classes, though typically they are organic molecules, preferably small organic compounds having a molecular weight of more than 100 and less than about 2,500 daltons. Candidate agents comprise functional groups necessary for structural interaction with proteins, particularly hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups. The candidate agents often comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups. Candidate agents are also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs or combinations thereof. Particularly preferred are peptides.

[0294] Candidate agents are obtained from a wide variety of sources including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural or synthetically produced libraries and compounds are readily modified through conventional chemical, physical and biochemical means. Known pharmacological agents can be subjected to directed or random chemical modifications, such as acylation, alkylation, esterification, amidification to produce structural analogs.

[0295] In some preferred embodiments, the candidate bioactive agents are proteins. By "protein" herein is meant at least two covalently attached amino acids, which includes proteins, polypeptides, oligopeptides and peptides. The protein can be made up of naturally occurring amino acids and peptide bonds, or synthetic peptidomimetic structures. Thus "amino acid", or "peptide residue", as used herein means both naturally occurring and synthetic amino acids. For example, homo-phenylalanine, citrulline and norleucine are considered amino acids for the purposes of the invention. "Amino acid" also includes imino acid residues such as proline and hydroxyproline. The side chains can be in either the (R) or the (S) configuration. In some preferred embodiments, the amino acids are in the (S) or L-configuration. If non-naturally occurring side chains are used, non-amino acid substituents can be used, for example to prevent or retard in vivo degradations.

[0296] In a preferred method, the candidate bioactive agents are naturally occurring proteins or fragments of naturally occurring proteins. Thus, for example, cellular extracts containing proteins, or random or directed digests of proteinaceous cellular extracts, can be used. In this way libraries of procaryotic and eucaryotic proteins can be made for screening in the methods of the invention. The libraries can be bacterial, fungal, viral, and mammalian proteins, with the latter being preferred, and human proteins being especially preferred.

[0297] In some methods, the candidate bioactive agents are peptides of from about 5 to about 30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to about 15 being particularly preferred. The peptides can be digests of naturally occurring proteins as is outlined above, random peptides, or "biased" random peptides. By "randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide consists of essentially random nucleotides and amino acids, respectively. Since generally these random peptides (or nucleic acids, discussed below) are chemically synthesized, they can incorporate any nucleotide or amino acid at any position. The synthetic process can be designed to generate randomized proteins or nucleic acids, to allow the formation of all or most of the possible combinations over the length of the sequence, thus forming a library of randomized candidate bioactive proteinaceous agents.

[0298] In some methods, the library can be fully randomized, with no sequence preferences or constants at any position. In other methods, the library can be biased. Some positions within the sequence are either held constant, or are selected from a limited number of possibilities. For example, in some methods, the nucleotides or amino acid residues are randomized within a defined class, for example, of hydrophobic amino acids, hydrophilic residues, sterically biased (either small or large) residues, towards the creation of nucleic acid binding domains, the creation of cysteines, for cross-linking, prolines for SH-3 domains, serines, threonines, tyrosines or histidines for phosphorylation sites, or to purines. In other methods, the candidate bioactive agents are nucleic acids, as defined above.
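
The distinction between a fully randomized and a biased peptide library can be illustrated in silico. The sketch below is only a conceptual illustration: the function name, the choice of hydrophobic residues, the pinned cysteine positions, and the library sizes are assumptions introduced here, and actual libraries are produced by chemical synthesis or expression rather than by software.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues, one-letter codes
HYDROPHOBIC = "AVILMFWC"                # one possible "defined class" (an assumption)


def random_peptide(length, rng, fixed=None, classes=None):
    """Build one peptide; `fixed` pins positions, `classes` limits positions to a residue set."""
    fixed = fixed or {}
    classes = classes or {}
    residues = []
    for pos in range(length):
        if pos in fixed:
            residues.append(fixed[pos])            # position held constant
        else:
            residues.append(rng.choice(classes.get(pos, AMINO_ACIDS)))
    return "".join(residues)


rng = random.Random(0)  # seeded only so the sketch is reproducible

# Fully randomized: any residue at any position, 7 to 15 residues long.
fully_random = [random_peptide(rng.randint(7, 15), rng) for _ in range(5)]

# Biased: cysteines fixed at both ends, position 4 restricted to hydrophobic residues.
biased = [random_peptide(10, rng, fixed={0: "C", 9: "C"}, classes={4: HYDROPHOBIC})
          for _ in range(5)]
```

The same scheme applies to nucleic acid libraries by substituting a nucleotide alphabet for `AMINO_ACIDS`.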

[0299] As described above generally for proteins, nucleic acid candidate bioactive agents can be naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids. For example, digests of procaryotic or eucaryotic genomes can be used as is outlined above for proteins.

[0300] In some methods, the candidate bioactive agents are organic chemical moieties.

D. Combinatorial Chemical Libraries

[0301] The invention provides methods for identifying/screening for modulators (e.g., inhibitors, activators) of Caliban activity. In practicing the screening methods of the invention, a test compound is provided. It can be contacted with a polypeptide of the invention in vitro or administered to a cell of the invention or an animal of the invention in vivo. Compounds are also screened using the compositions, cells, non-human animals and methods of the invention for their ability to ameliorate a Caliban-associated disease or disorder or a disease or disorder associated with cancer. Combinatorial chemical libraries are one means to assist in the generation of new chemical compound leads for, e.g., compounds that inhibit a Caliban activity or, using a transgenic or a knockout non-human animal of the invention, a compound that can be used to treat or ameliorate a Caliban-associated disease or disorder or a disease or disorder associated with various types of cancers. A combinatorial chemical library is a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis by combining a number of chemical "building blocks" such as reagents. For example, a linear combinatorial chemical library such as a polypeptide library is formed by combining a set of chemical building blocks called amino acids in every possible way for a given compound length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical compounds can be synthesized through such combinatorial mixing of chemical building blocks. For example, the systematic, combinatorial mixing of 100 interchangeable chemical building blocks results in the theoretical synthesis of 100 million tetrameric compounds or 10 billion pentameric compounds. (See, e.g., Gallop et al., J. Med. Chem. 37: 1233-1250, 1994). Preparation and screening of combinatorial chemical libraries are well known to those of skill in the art (see, e.g., U.S. Pat. Nos. 6,004,617; 5,985,356). Such combinatorial chemical libraries include, but are not limited to, peptide libraries (see, e.g., U.S. Pat. No. 5,010,175; Furka, Int. J. Pept. Prot. Res. 37: 487-493, 1991; Houghton et al., Nature 354: 84-88, 1991). Other chemistries for generating chemical diversity libraries include, but are not limited to: peptoids (see, e.g., WO 91/19735), encoded peptides (see, e.g., WO 93/20242), random bio-oligomers (see, e.g., WO 92/00091), benzodiazepines (see, e.g., U.S. Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides (see, e.g., Hobbs, Proc. Nat. Acad. Sci. USA 90: 6909-6913, 1993), vinylogous polypeptides (see, e.g., Hagihara, J. Amer. Chem. Soc. 114: 6568, 1992), non-peptidal peptidomimetics with a Beta-D-Glucose scaffolding (see, e.g., Hirschmann, J. Amer. Chem. Soc. 114: 9217-9218, 1992), analogous organic syntheses of small compound libraries (see, e.g., Chen, J. Amer. Chem. Soc. 116: 2661, 1994), oligocarbamates (see, e.g., Cho, Science 261: 1303, 1993), and/or peptidyl phosphonates (see, e.g., Campbell, J. Org. Chem. 59: 658, 1994). See also (Gordon, J. Med. Chem. 37: 1385, 1994); for nucleic acid libraries, peptide nucleic acid libraries (see, e.g., U.S. Pat. No. 5,539,083); for antibody libraries (see, e.g., Vaughn, Nature Biotechnology 14: 309-314, 1996); for carbohydrate libraries (see, e.g., Liang et al., Science 274: 1520-1522, 1996; U.S. Pat. No. 5,593,853); for small organic molecule libraries (see, e.g., for isoprenoids, U.S. Pat. No. 5,569,588); for thiazolidinones and metathiazanones (U.S. Pat. No. 5,549,974); for pyrrolidines (U.S. Pat. Nos. 5,525,735 and 5,519,134); for morpholino compounds (U.S. Pat. No. 5,506,337); for benzodiazepines (U.S. Pat. No. 5,288,514).
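The library-size figures quoted in paragraph [0301] follow from simple exponentiation: a linear compound of a given length, with each position filled independently from the same set of building blocks, has (number of building blocks) raised to the (length) possible sequences. The short sketch below only illustrates that arithmetic; the function name `library_size` is a hypothetical helper introduced here, not part of the disclosure.

```python
def library_size(building_blocks: int, length: int) -> int:
    """Theoretical number of distinct linear compounds.

    Assumes every position can hold any one of the interchangeable
    building blocks, independently of the other positions.
    """
    if building_blocks < 1 or length < 1:
        raise ValueError("both arguments must be positive integers")
    return building_blocks ** length


# The two cases mentioned in the text: 100 building blocks,
# tetrameric (length 4) and pentameric (length 5) compounds.
tetramers = library_size(100, 4)   # 100,000,000 (100 million)
pentamers = library_size(100, 5)   # 10,000,000,000 (10 billion)
print(f"{tetramers:,} tetramers; {pentamers:,} pentamers")
```

The same function applies to a peptide library built from the 20 standard amino acids, e.g., `library_size(20, 3)` gives 8,000 possible tripeptides.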

[0302] Devices for the preparation of combinatorial libraries are commercially available (see, e.g., U.S. Pat. Nos. 6,045,755; 5,792,431; 357 MPS, 390 MPS, Advanced Chem Tech, Louisville, Ky.; Symphony, Rainin, Woburn, Mass.; 433A, Applied Biosystems, Foster City, Calif.; 9050 Plus, Millipore, Bedford, Mass.). A number of robotic systems have also been developed for solution phase chemistries. These systems include automated workstations, e.g., the automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan), and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, CA) which mimic the manual synthetic operations performed by a chemist. Any of the above devices is suitable for use with the present invention. The nature and implementation of modifications to these devices (if any) so that they can operate as discussed herein will be apparent to persons skilled in the relevant art. In addition, numerous combinatorial libraries are themselves commercially available (see, e.g., ComGenex, Princeton, N.J.; Asinex, Moscow, RU; Tripos, Inc., St. Louis, Mo.; ChemStar, Ltd., Moscow, RU; 3D Pharmaceuticals, Exton, PA; Martek Biosciences, Columbia, Md.; and the like).

[0303] The compounds tested as modulators of Caliban genes or gene products can be any small organic molecule, or a biological entity, such as a protein, e.g., an antibody or peptide, a sugar, a nucleic acid, e.g., an antisense oligonucleotide or RNAi, or a ribozyme, or a lipid. Alternatively, modulators can be genetically altered versions of a Caliban protein. Typically, test compounds will be small organic molecules, peptides, lipids, and lipid analogs.

[0304] Essentially any chemical compound can be used as a potential modulator or ligand in the assays of the invention, although most often compounds that can be dissolved in aqueous or organic (especially DMSO-based) solutions are used. The assays are designed to screen large chemical libraries by automating the assay steps and providing compounds from any convenient source to assays, which are typically run in parallel (e.g., in microtiter formats on microtiter plates in robotic assays). It will be appreciated that there are many suppliers of chemical compounds, including Sigma (St. Louis, MO), Aldrich (St. Louis, MO), Sigma-Aldrich (St. Louis, MO), Fluka Chemika-Biochemica Analytika (Buchs, Switzerland) and the like.

[0305] In one embodiment, high throughput screening methods involve providing a combinatorial small organic molecule or peptide library containing a large number of potential therapeutic compounds (potential modulator or ligand compounds). Such "combinatorial chemical libraries" or "ligand libraries" (as described above) are then screened in one or more assays, as described herein, to identify those library members (particular chemical species or subclasses) that display a desired characteristic activity. The compounds thus identified can serve as conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

E. Arrays Or "Biochips"

[0306] The invention provides methods for identifying/screening for modulators (e.g. , inhibitors, activators) of a Caliban activity, using arrays. Potential modulators, including small molecules, nucleic acids, polypeptides (including antibodies) can be immobilized to arrays. Nucleic acids or polypeptides of the invention can be immobilized to or applied to an array. Arrays can be used to screen for or monitor libraries of compositions (e.g., small molecules, antibodies, nucleic acids, and the like) for their ability to bind to or modulate the activity of a nucleic acid or a polypeptide of the invention, e.g., a Caliban activity. For example, in one aspect of the invention, a monitored parameter is transcript expression of a gene comprising a nucleic acid of the invention. One or more, or, all the transcripts of a cell can be measured by hybridization of a sample comprising transcripts of the cell, or, nucleic acids representative of or complementary to transcripts of a cell, by hybridization to immobilized nucleic acids on an array, or "biochip." By using an "array" of nucleic acids on a microchip, some or all of the transcripts of a cell can be simultaneously quantified. Alternatively, arrays comprising genomic nucleic acid can also be used to determine the genotype of a newly engineered strain made by the methods of the invention. Polypeptide arrays can be used to simultaneously quantify a plurality of proteins. Small molecule arrays can be used to simultaneously analyze a plurality of Caliban modulating or binding activities.

[0307] The present invention can be practiced with any known "array," also referred to as a "microarray" or "nucleic acid array" or "polypeptide array" or "antibody array" or "biochip," or variation thereof. Arrays are generically a plurality of "spots" or "target elements," each target element comprising a defined amount of one or more biological molecules, e.g., oligonucleotides, immobilized onto a defined area of a substrate surface for specific binding to a sample molecule, e.g., mRNA transcripts. In practicing the methods of the invention, any known array and/or method of making and using arrays can be incorporated in whole or in part, or variations thereof, as described, for example, in U.S. Pat. Nos. 6,277,628; 6,277,489; 6,261,776; 6,258,606; 6,054,270; 6,048,695; 6,045,996; 6,022,963; 6,013,440; 5,965,452; 5,959,098; 5,856,174; 5,830,645; 5,770,456; 5,632,957; 5,556,752; 5,143,854; 5,807,522; 5,800,992; 5,744,305; 5,700,637; 5,556,752; 5,434,049 (see also, e.g., WO 99/51773; WO 99/09217; WO 97/46313; WO 96/17958; see also, e.g., Johnston, Curr. Biol. 8, 1998; Schummer, Biotechniques 23: 1087-1092, 1997; Kern, Biotechniques 23: 120-124, 1997; Solinas-Toldo, Genes, 20: 399-407, 1997; Bowtell, Nature Genetics Supp. 21: 25-32, 1999). See also published U.S. patent applications Nos. 20010018642; 20010019827; 20010016322; 20010014449; 20010014448; 20010012537; 20010008765.

[0308] The terms "array" or "microarray" or "biochip" or "chip" as used herein refer to a plurality of target elements, each target element comprising a defined amount of one or more polypeptides (including antibodies) or nucleic acids immobilized onto a defined area of a substrate surface.

F. Solid State and Soluble High Throughput Assays

[0309] In one embodiment, the invention provides soluble assays using molecules such as a domain (e.g., a ligand binding domain, an active site, and the like); a domain that is covalently linked to a heterologous protein to create a chimeric molecule; Caliban; or a cell or tissue expressing Caliban, either naturally occurring or recombinant. In another embodiment, the invention provides solid phase based in vitro assays in a high throughput format, where the domain, chimeric molecule, Caliban, or cell or tissue expressing Caliban is attached to a solid phase substrate.

[0310] In the high throughput assays of the invention, it is possible to screen up to several thousand different modulators or ligands in a single day. In particular, each well of a microtiter plate can be used to run a separate assay against a selected potential modulator, or, if concentration or incubation time effects are to be observed, every 5-10 wells can test a single modulator. Thus, a single standard microtiter plate can assay about 100 (e.g., 96) modulators. If 1536 well plates are used, then a single plate can easily assay from about 100-1500 different compounds. It is possible to assay several different plates per day; assay screens for up to about 6,000-20,000 different compounds are possible using the integrated systems of the invention.
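The plate capacities given in paragraph [0310] can be reproduced with integer division: the number of modulators per plate is the well count divided by the number of wells devoted to each modulator. The sketch below is illustrative only; the helper names and the plates-per-day figure in the usage lines are assumptions chosen for the example, not values taken from the disclosure.

```python
def modulators_per_plate(wells: int, wells_per_modulator: int = 1) -> int:
    """Number of modulators one plate can test.

    wells_per_modulator is 1 for a single-point assay, or 5-10 when
    concentration or incubation-time effects are to be observed.
    """
    if wells < 1 or wells_per_modulator < 1:
        raise ValueError("well counts must be positive integers")
    return wells // wells_per_modulator


def compounds_per_day(wells: int, plates_per_day: int,
                      wells_per_modulator: int = 1) -> int:
    """Daily throughput for a given plate format (plates_per_day is hypothetical)."""
    if plates_per_day < 0:
        raise ValueError("plates_per_day must be non-negative")
    return modulators_per_plate(wells, wells_per_modulator) * plates_per_day


print(modulators_per_plate(96))         # 96 single-point assays on a standard plate
print(modulators_per_plate(96, 10))     # 9 modulators at 10 wells each
print(modulators_per_plate(1536, 10))   # 153 modulators at 10 wells each
print(compounds_per_day(1536, 4))       # 6144, assuming four 1536-well plates per day
```

With 1536-well plates the result ranges from 153 modulators (10 wells each) to 1536 (one well each), consistent with the "about 100-1500" figure in the text.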

[0311] The molecule of interest can be bound to the solid state component, directly or indirectly, via covalent or non covalent linkage, e.g., via a tag. The tag can be any of a variety of components. In general, a molecule which binds the tag (a tag binder) is fixed to a solid support, and the tagged molecule of interest is attached to the solid support by interaction of the tag and the tag binder.

[0312] A number of tags and tag binders can be used, based upon known molecular interactions well described in the literature. For example, where a tag has a natural binder, for example, biotin, protein A, or protein G, it can be used in conjunction with appropriate tag binders (avidin, streptavidin, neutravidin, the Fc region of an immunoglobulin, and the like). Antibodies to molecules with natural binders such as biotin are also widely available and appropriate tag binders; see, SIGMA Immunochemicals 1998 catalogue, SIGMA, St. Louis, Mo.

[0313] Similarly, any haptenic or antigenic compound can be used in combination with an appropriate antibody to form a tag/tag binder pair. Thousands of specific antibodies are commercially available and many additional antibodies are described in the literature. For example, in one common configuration, the tag is a first antibody and the tag binder is a second antibody which recognizes the first antibody. In addition to antibody-antigen interactions, receptor-ligand interactions are also appropriate as tag and tag-binder pairs, for example, agonists and antagonists of cell membrane receptors (e.g., cell receptor-ligand interactions such as transferrin, c-kit, viral receptor ligands, cytokine receptors, chemokine receptors, interleukin receptors, immunoglobulin receptors and antibodies, the cadherin family, the integrin family, the selectin family, and the like; see, e.g., Pigott et al., The Adhesion Molecule Facts Book I, 1993). Similarly, toxins and venoms, viral epitopes, hormones (e.g., opiates, steroids, and the like), intracellular receptors (e.g., which mediate the effects of various small ligands, including steroids, thyroid hormone, retinoids and vitamin D; peptides), drugs, lectins, sugars, nucleic acids (both linear and cyclic polymer configurations), oligosaccharides, proteins, phospholipids and antibodies can all interact with various cell receptors.

[0314] Synthetic polymers, such as polyurethanes, polyesters, polycarbonates, polyureas, polyamides, polyethyleneimines, polyarylene sulfides, polysiloxanes, polyimides, and polyacetates can also form an appropriate tag or tag binder. Many other tag/tag binder pairs are also useful in assay systems described herein, as would be apparent to one of skill upon review of this disclosure.

[0315] Common linkers such as peptides, polyethers, and the like can also serve as tags, and include polypeptide sequences, such as poly-Gly sequences of between about 5 and 200 amino acids. Such flexible linkers are known to persons of skill in the art. For example, poly(ethylene glycol) linkers are available from Shearwater Polymers, Inc., Huntsville, Ala. These linkers optionally have amide linkages, sulfhydryl linkages, or heterofunctional linkages.

[0316] Tag binders are fixed to solid substrates using any of a variety of methods currently available. Solid substrates are commonly derivatized or functionalized by exposing all or a portion of the substrate to a chemical reagent which fixes a chemical group to the surface which is reactive with a portion of the tag binder. For example, groups which are suitable for attachment to a longer chain portion would include amines, hydroxyl, thiol, and carboxyl groups. Aminoalkylsilanes and hydroxyalkylsilanes can be used to functionalize a variety of surfaces, such as glass surfaces. The construction of such solid phase biopolymer arrays is well described in the literature. (See, e.g., Merrifield, J. Am. Chem. Soc. 85: 2149-2154, 1963 (describing solid phase synthesis of, e.g., peptides); Geysen et al., J. Immun. Meth. 102: 259-274, 1987 (describing synthesis of solid phase components on pins); Frank et al., Tetrahedron 44: 6031-6040, 1988 (describing synthesis of various peptide sequences on cellulose disks); Fodor et al., Science 251: 767-777, 1991; Sheldon et al., Clinical Chemistry 39: 718-719, 1993; and Kozal et al., Nature Medicine 7: 753-759, 1996 (all describing arrays of biopolymers fixed to solid substrates)). Non-chemical approaches for fixing tag binders to substrates include other common methods, such as heat, cross-linking by UV radiation, and the like.

G. Computer-Based Assays

[0317] Yet another assay for compounds that modulate Caliban activity involves computer assisted drug design, in which a computer system is used to generate a three-dimensional structure of Caliban based on the structural information encoded by the amino acid sequence. The input amino acid sequence interacts directly and actively with a preestablished algorithm in a computer program to yield secondary, tertiary, and quaternary structural models of the protein. The models of the protein structure are then examined to identify regions of the structure that have the ability to bind, e.g., ligands. These regions are then used to identify ligands that bind to the protein.

[0318] The three-dimensional structural model of the protein is generated by entering Caliban amino acid sequences of at least 10 amino acid residues or corresponding nucleic acid sequences encoding a Caliban polypeptide into the computer system. The amino acid sequence of the polypeptide or the nucleic acid encoding the polypeptide is selected from the group consisting of the sequences shown in the Figures, and conservatively modified versions thereof. The amino acid sequence represents the primary sequence or subsequence of the protein, which encodes the structural information of the protein. At least 10 residues of the amino acid sequence (or a nucleotide sequence encoding 10 amino acids) are entered into the computer system from computer keyboards, computer readable substrates that include, but are not limited to, electronic storage media (e.g., magnetic diskettes, tapes, cartridges, and chips), optical media (e.g., CD ROM), information distributed by internet sites, and by RAM. The three-dimensional structural model of the protein is then generated by the interaction of the amino acid sequence and the computer system, using software known to those of skill in the art. The three-dimensional structural model of the protein can be saved to a computer readable form and be used for further analysis (e.g., identifying potential ligand binding regions of the protein and screening for mutations, alleles and interspecies homologs of the gene).

[0319] The amino acid sequence represents a primary structure that encodes the information necessary to form the secondary, tertiary and quaternary structure of the protein of interest. The software looks at certain parameters encoded by the primary sequence to generate the structural model. These parameters are referred to as "energy terms," and primarily include electrostatic potentials, hydrophobic potentials, solvent accessible surfaces, and hydrogen bonding. Secondary energy terms include van der Waals potentials. Biological molecules form the structures that minimize the energy terms in a cumulative fashion. The computer program is therefore using these terms encoded by the primary structure or amino acid sequence to create the secondary structural model.

[0320] The tertiary structure of the protein encoded by the secondary structure is then formed on the basis of the energy terms of the secondary structure. The user at this point can enter additional variables such as whether the protein is membrane bound or soluble, its location in the body, and its cellular location, e.g., cytoplasmic, surface, or nuclear. These variables along with the energy terms of the secondary structure are used to form the model of the tertiary structure. In modeling the tertiary structure, the computer program matches hydrophobic faces of secondary structure with like, and hydrophilic faces of secondary structure with like.

[0321] Once the structure has been generated, potential ligand binding regions are identified by the computer system. Three-dimensional structures for potential ligands are generated by entering amino acid or nucleotide sequences or chemical formulas of compounds, as described above. The three-dimensional structure of the potential ligand is then compared to that of the Caliban protein to identify ligands that bind to Caliban. Binding affinity between the protein and ligands is determined using energy terms to determine which ligands have an enhanced probability of binding to the protein. The results, such as three-dimensional structures for potential ligands and binding affinity of ligands, can also be saved to a computer readable form and can be used for further analysis (e.g., generating a three-dimensional model of mutated proteins having an altered binding affinity for a ligand).
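The ranking step described in paragraphs [0319] to [0321], in which energy terms are combined cumulatively and ligands with more favorable totals are treated as more likely binders, can be illustrated with a toy scoring sketch. Everything specific below is an assumption made for illustration: the class and function names are hypothetical, the terms are summed with equal weight, lower totals are taken as more favorable, and the numeric values are invented placeholders rather than results for Caliban or any real ligand. Actual modeling software computes such terms from three-dimensional structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnergyTerms:
    """Per-ligand energy contributions (arbitrary units, hypothetical values)."""
    electrostatic: float
    hydrophobic: float
    solvent_accessible: float
    hydrogen_bonding: float
    van_der_waals: float  # the "secondary" term named in the text

    def total(self) -> float:
        # Cumulative score: a plain, equally weighted sum (an assumption).
        return (self.electrostatic + self.hydrophobic + self.solvent_accessible
                + self.hydrogen_bonding + self.van_der_waals)


def rank_ligands(scores: dict[str, EnergyTerms]) -> list[tuple[str, float]]:
    """Return (ligand name, total energy) pairs, most favorable (lowest) first."""
    return sorted(((name, terms.total()) for name, terms in scores.items()),
                  key=lambda pair: pair[1])


# Invented placeholder values, for illustration only.
candidates = {
    "ligand_A": EnergyTerms(-2.0, -1.5, -0.5, -3.0, -0.5),
    "ligand_B": EnergyTerms(-1.0, -0.5, 0.25, -1.0, 0.0),
    "ligand_C": EnergyTerms(0.5, -4.0, -0.5, -2.0, -1.0),
}
for name, energy in rank_ligands(candidates):
    print(f"{name}: {energy:.2f}")
```

The ranked list is the kind of result that, per the text, could be saved in computer readable form for further analysis.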

[0322] Computer systems are also used to screen for mutations, polymorphic variants, alleles and interspecies homologs of Caliban genes. Such mutations can be associated with disease states or genetic traits. As described above, high density oligonucleotide arrays (GeneChip®) and related technology can also be used to screen for mutations, polymorphic variants, alleles and interspecies homologs. Once the variants are identified, diagnostic assays can be used to identify patients having such mutated genes. Identification of the mutated Caliban genes involves receiving input of a first nucleic acid or amino acid sequence selected from the group consisting of the sequences shown in the Figures, and conservatively modified versions thereof. The sequence is entered into the computer system as described above and then saved to a computer readable form. The first nucleic acid or amino acid sequence is then compared to a second nucleic acid or amino acid sequence that has substantial identity to the first sequence. The second sequence is entered into the computer system in the manner described above. Once the first and second sequences are compared, nucleotide or amino acid differences between the sequences are identified. Such sequences can represent allelic differences in Caliban genes, and mutations associated with disease states and genetic traits.
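The comparison of a first and second sequence described in paragraph [0322] can be sketched as a position-by-position scan. The minimal example below assumes the two sequences have already been aligned and have equal length; the function name `sequence_differences` and the short example sequences are hypothetical and are not Caliban sequences.

```python
def sequence_differences(first: str, second: str) -> list[tuple[int, str, str]]:
    """List mismatches between two pre-aligned, equal-length sequences.

    Each entry is (1-based position, residue in first, residue in second).
    Works for nucleotide or amino acid sequences written as plain strings.
    """
    if len(first) != len(second):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(first.upper(), second.upper()), start=1)
        if a != b
    ]


# Hypothetical reference and variant sequences differing at one position.
reference = "ATGGCCTAA"
variant = "ATGGACTAA"
print(sequence_differences(reference, variant))  # [(5, 'C', 'A')]
```

An empty list indicates that the two sequences are identical over the compared length.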

H. Database Searching/Sequence Alignments, Computer Systems, Computer Program Products And Databases

[0323] Genomic databases for model organisms of various species can be employed for conducting multi-genome-wide sequence alignments in order to identify homologous sequences of interest. For each identified sequence (e.g., for those sequences listed in the Figures), related orthologous sequences can be determined by searching composite genomic databases. The breadth of a database search is limited by the scope of representative model organisms for which sequence data is available.

[0324] Homology can be determined by various methods, including alignments of open-reading-frames ("ORFs") contained in private and/or public databases. Any suitable mathematical algorithm may be used to determine percent identities and percent similarities between any two sequences being compared. For example, nucleic acid and protein sequences of the present invention can be used as a "query sequence" to perform a search against sequences deposited within various public databases to identify other family members or evolutionarily-related sequences. Genomic sequences for various organisms are currently available, including fungi, such as the budding yeast, Saccharomyces cerevisiae; invertebrates, such as Caenorhabditis elegans and Drosophila melanogaster; and mammals, such as the mouse, rat, and human. Exemplary databases for identifying orthologs of interest include GenBank, Swiss-Prot, EMBL, and National Center for Biotechnology Information ("NCBI"), and many others known in the art. These databases enable a user to set various parameters for a hypothetical search according to the user's preference, or to utilize default settings. As discussed above, a listing of identified sequences, including various exemplary mammalian orthologs of the invention, is described herein.
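As one simple instance of the "suitable mathematical algorithm" mentioned in paragraph [0324], percent identity can be computed from a pairwise alignment as the share of alignment columns in which both sequences carry the same residue. The sketch below assumes that convention (identical columns divided by total alignment length, with `-` as the gap character); other conventions exclude gapped columns or divide by the shorter sequence length and give different values. The function name and the example alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pairwise alignment.

    Counts columns where both sequences have the same non-gap residue and
    divides by the total number of alignment columns (one convention of several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 7-column protein alignment with one gap and one substitution.
print(round(percent_identity("MKT-LLV", "MKTALIV"), 2))  # 71.43
```

In practice the alignment itself would come from a database search or alignment tool; this helper only scores an alignment that has already been produced.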

[0325] To determine and identify sequence identities, structural homologies, motifs and the like in silico, the sequence of the invention can be stored, recorded, and manipulated on any medium which can be read and accessed by a computer. Accordingly, the invention provides computers, computer systems, computer readable media, computer program products and the like having recorded or stored thereon the nucleic acid and polypeptide sequences of the invention. As used herein, the words "recorded" and "stored" refer to a process for storing information on a computer medium. A skilled artisan can readily adopt any known methods for recording information on a computer readable medium to generate manufactures comprising one or more of the nucleic acid and/or polypeptide sequences of the invention.

[0326] Another aspect of the invention is a computer readable medium having recorded thereon at least one nucleic acid and/or polypeptide sequence of the invention. Computer readable media include magnetically readable media, optically readable media, electronically readable media and magnetic/optical media. For example, the computer readable media can be a hard disk, a floppy disk, a magnetic tape, CD-ROM, Digital Versatile Disk (DVD), Random Access Memory (RAM), or Read Only Memory (ROM), as well as other types of media known to those skilled in the art.

[0327] As used herein, the terms "computer," "computer program" and "processor" are used in their broadest general contexts and incorporate all such devices.

7. TRANSGENIC AND "KNOCKOUT" NON-HUMAN ANIMALS

[0328] The invention provides transgenic non-human animals comprising a nucleic acid, a polypeptide, an expression cassette or vector or a transfected or transformed cell of the invention. The transgenic non-human animals can be, e.g., goats, rabbits, sheep, pigs, cows, rats and mice, comprising the nucleic acids of the invention. A "transgenic animal" is an animal having cells that contain DNA which has been artificially inserted into a cell, which DNA becomes part of the genome of the animal which develops from that cell. Preferred transgenic animals are primates, mice, rats, cows, pigs, horses, goats, sheep, dogs and cats. The transgenic DNA can encode mammalian kinases. Native expression in an animal can be reduced by providing an amount of antisense RNA or DNA effective to reduce expression of the receptor.

[0329] These animals can be used, e.g., as in vivo models to study modulators of Caliban activity, or as models to screen for agents that change the Caliban activity in vivo.

[0330] In one aspect, the inserted transgenic sequence is a sequence of the invention designed such that it does not express a functional Caliban polypeptide. The defect can be designed to be on the transcriptional, translational and/or the protein level.

[0331] The coding sequences for the polypeptides, e.g., Caliban polypeptides, to be expressed in the transgenic non-human animals can be designed to be constitutive, or, under the control of tissue-specific, developmental-specific or inducible transcriptional regulatory factors. Transgenic non-human animals can be designed and generated using any method known in the art (see, e.g., U.S. Pat. Nos. 6,211,428; 6,187,992; 6,156,952; 6,118,044; 6,111,166; 6,107,541; 5,959,171; 5,922,854; 5,892,070; 5,880,327; 5,891,698; 5,639,940; 5,573,933; 5,387,742; 5,087,571, describing making and using transformed cells and eggs and transgenic mice, rats, rabbits, sheep, pigs and cows). See also, e.g., Pollock, J. Immunol. Methods 231: 147-157, 1999, describing the production of recombinant proteins in the milk of transgenic dairy animals; Baguisi, Nat. Biotechnol. 17: 456-461, 1999, demonstrating the production of transgenic goats. U.S. Pat. No. 6,211,428 describes making and using transgenic non-human mammals which express in their brains a nucleic acid construct comprising a DNA sequence. U.S. Pat. No. 5,387,742 describes injecting cloned recombinant or synthetic DNA sequences into fertilized mouse eggs, implanting the injected eggs in pseudo-pregnant females, and growing to term transgenic mice whose cells express proteins related to the pathology of Alzheimer's disease. U.S. Pat. No. 6,187,992 describes making and using a transgenic mouse whose genome comprises a disruption of the gene encoding amyloid precursor protein (APP). One exemplary method to produce genetically altered non-human animals is to genetically modify embryonic stem cells. The modified cells are injected into the blastocoel of a blastocyst. This is then grown in the uterus of a pseudopregnant female. In order to readily detect chimeric progeny, the blastocysts can be obtained from a different parental line than the embryonic stem cells. 
For example, the blastocysts and embryonic stem cells can be derived from parental lines with different hair color or other readily observable phenotype. The resulting chimeric animals can be bred in order to obtain non-chimeric animals which have received the modified genes through germ-line transmission. Techniques for the introduction of embryonic stem cells into blastocysts and the resulting generation of transgenic animals are well known.

[0332] Because cells contain more than one copy of a gene, the cell lines obtained from a first round of targeting are likely to be heterozygous for the targeted allele. Homozygosity, in which both alleles are modified, can be achieved in a number of ways. In one approach, a number of cells in which one copy has been modified are grown. They are then subjected to another round of targeting using a different selectable marker. Alternatively, homozygotes can be obtained by breeding animals heterozygous for the modified allele, according to traditional Mendelian genetics. In some situations, it may be desirable to have two different modified alleles. This can be achieved by successive rounds of gene targeting or by breeding heterozygotes, each of which carries one of the desired modified alleles. See, e.g., (U.S. Pat. No. 5,789,215).
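The breeding route to homozygosity in paragraph [0332] follows the standard Mendelian expectation for a single locus: crossing two heterozygotes is expected to give one quarter homozygous-modified offspring. The sketch below enumerates the Punnett square for such a cross. It assumes a single autosomal locus, equal transmission of each parental allele, and no effect of genotype on viability; the allele symbols ("W" for wild type, "m" for modified) and the function name `cross` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Expected offspring genotype frequencies for a one-locus cross.

    Each parent is a two-character genotype string, e.g. "Wm".
    Offspring genotypes are normalized by sorting their two alleles.
    """
    if len(parent1) != 2 or len(parent2) != 2:
        raise ValueError("each genotype must consist of exactly two alleles")
    # Enumerate every combination of one allele from each parent.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Heterozygote x heterozygote: 1/4 WW, 1/2 Wm, 1/4 mm expected.
for genotype, frequency in cross("Wm", "Wm").items():
    print(genotype, frequency)
```

The same function covers the case of two different modified alleles mentioned in the text: crossing "Wa" with "Wb" (hypothetical alleles a and b) gives the compound genotype "ab" at an expected frequency of 1/4.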

[0333] A variety of methods are available for the production of transgenic animals associated with this invention. DNA can be injected into the pronucleus of a fertilized egg before fusion of the male and female pronuclei, or injected into the nucleus of an embryonic cell (e.g., the nucleus of a two-cell embryo) following the initiation of cell division (Brinster et al., Proc. Nat. Acad. Sci. 82: 4438-4442, 1985). Embryos can be infected with viruses, especially retroviruses, modified to carry inorganic-ion receptor nucleotide sequences of the invention.

[0334] Pluripotent stem cells derived from the inner cell mass of the embryo and stabilized in culture can be manipulated in culture to incorporate nucleotide sequences of the invention. A transgenic animal can be produced from such cells through implantation into a blastocyst that is implanted into a foster mother and allowed to come to term. Animals suitable for transgenic experiments can be obtained from standard commercial sources such as Charles River (Wilmington, Mass.), Taconic (Germantown, N.Y.), Harlan Sprague Dawley (Indianapolis, Ind.), and the like.

[0335] The procedures for manipulation of the rodent embryo and for microinjection of DNA into the pronucleus of the zygote are well known to those of ordinary skill in the art (Hogan et al., supra). Microinjection procedures for fish, amphibian eggs and birds are detailed in (Houdebine et al., Experientia 47: 897-905, 1991). Other procedures for introduction of DNA into tissues of animals are described in (U.S. Pat. No. 4,945,050) (Sanford et al, 1990).

[0336] By way of example only, to prepare a transgenic mouse, female mice are induced to superovulate. Females are placed with males, and the mated females are sacrificed by CO2 asphyxiation or cervical dislocation and embryos are recovered from excised oviducts. Surrounding cumulus cells are removed. Pronuclear embryos are then washed and stored until the time of injection. Randomly cycling adult female mice are paired with vasectomized males. Recipient females are mated at the same time as donor females. Embryos then are transferred surgically. The procedure for generating transgenic rats is similar to that of mice (Hammer et al., Cell 63: 1099-1112, 1990).

[0337] Methods for the culturing of embryonic stem (ES) cells and the subsequent production of transgenic animals by the introduction of DNA into ES cells using methods such as electroporation, calcium phosphate/DNA precipitation and direct injection also are well known to those of ordinary skill in the art (E. J. Robertson, Teratocarcinomas and Embryonic Stem Cells, A Practical Approach, 1987).

[0338] In cases involving random gene integration, a clone containing the sequence(s) of the invention is co-transfected with a gene encoding neomycin resistance. Alternatively, the gene encoding neomycin resistance is physically linked to the sequence(s) of the invention. Transfection and isolation of desired clones are carried out by any one of several methods well known to those of ordinary skill in the art (E. J. Robertson, supra).

[0339] DNA molecules introduced into ES cells can also be integrated into the chromosome through the process of homologous recombination (Capecchi, Science 244: 1288-1292, 1989). Methods for positive selection of the recombination event (i.e., neo resistance) and dual positive-negative selection (i.e., neo resistance and gancyclovir resistance) and the subsequent identification of the desired clones by PCR have been described (Capecchi, supra; Joyner et al., Nature 338: 153-156, 1989), the teachings of which are incorporated herein in their entirety including any drawings. The final phase of the procedure is to inject targeted ES cells into blastocysts and to transfer the blastocysts into pseudopregnant females. The resulting chimeric animals are bred and the offspring are analyzed by Southern blotting to identify individuals that carry the transgene. Procedures for the production of non-rodent mammals and other animals have been discussed by others (Houdebine et al., supra; Pursel et al., Science 244: 1281-1288, 1989; and Simms et al., Bio/Technology 6: 179-183, 1988).

A. Caliban Functional Knockouts

[0340] The invention provides non-human animals that do not express their endogenous Caliban polypeptides, or, express their endogenous Caliban polypeptides at lower than wild type levels (thus, while not completely "knocked out" their Caliban activity is functionally "knocked out"). The invention also provides "knockout animals" and methods for making and using them. For example, in one aspect, the transgenic or modified animals of the invention comprise a "knockout animal," e.g., a "knockout mouse," engineered not to express an endogenous gene, e.g., an endogenous Caliban gene, which is replaced with a gene expressing a polypeptide of the invention, or, a fusion protein comprising a polypeptide of the invention. Thus, in one aspect, the inserted transgenic sequence is a sequence of the invention designed such that it does not express a functional Caliban polypeptide. The defect can be designed to be on the transcriptional, translational and/or the protein level. Because the endogenous Caliban gene has been "knocked out," only the inserted polypeptide of the invention is expressed.

[0341] A "knock-out animal" is a specific type of transgenic animal having cells that contain DNA containing an alteration in the nucleic acid sequence that reduces the biological activity of the polypeptide normally encoded therefrom by at least 80% compared to the unaltered gene. The alteration can be an insertion, deletion, frameshift mutation, missense mutation, introduction of stop codons, mutation of a critical amino acid residue, removal of an intron junction, and the like. Preferably, the alteration is an insertion or deletion, or is a frameshift mutation that creates a stop codon. Typically, the disruption of specific endogenous genes can be accomplished by deleting some portion of the gene or replacing it with other sequences to generate a null allele. Cross-breeding mammals having the null allele generates homozygous mammals lacking an active copy of the gene.

[0342] A number of such mammals have been developed, and are extremely helpful in medical development. For example, U.S. Pat. No. 5,616,491 describes knock-out mice having suppression of CD28 and CD45. Procedures for preparation and manipulation of cells and embryos are similar to those described above with respect to transgenic animals, and are well known to those of ordinary skill in the art.

[0343] A knock out construct refers to a uniquely configured fragment of nucleic acid which is introduced into a stem cell line and allowed to recombine with the genome at the chromosomal locus of the gene of interest to be mutated. Thus, a given knock out construct is specific for a given gene to be targeted for disruption. Nonetheless, many common elements exist among these constructs and these elements are well known in the art. A typical knock out construct contains nucleic acid fragments of about 0.5 kb to about 10.0 kb from both the 5' and the 3' ends of the genomic locus which encodes the gene to be mutated. These two fragments are typically separated by an intervening fragment of nucleic acid which encodes a positive selectable marker, such as the neomycin resistance gene. The resulting nucleic acid fragment, consisting of a nucleic acid from the extreme 5' end of the genomic locus linked to a nucleic acid encoding a positive selectable marker which is in turn linked to a nucleic acid from the extreme 3' end of the genomic locus of interest, omits most of the coding sequence for the gene of interest to be knocked out. When the resulting construct recombines homologously with the chromosome at this locus, it results in the loss of the omitted coding sequence, otherwise known as the structural gene, from the genomic locus. A stem cell in which such a rare homologous recombination event has taken place can be selected for by virtue of the stable integration into the genome of the nucleic acid of the gene encoding the positive selectable marker and subsequent selection for cells expressing this marker gene in the presence of an appropriate drug.
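By way of illustration only, the arrangement described in the preceding paragraph can be modeled in a few lines of Python. This is a minimal sketch, not part of the disclosed methods: the class name KnockoutConstruct, the helper omitted_bp, and the toy sequences are hypothetical, and only the 5' arm / positive selectable marker / 3' arm order and the approximate 0.5 kb to 10.0 kb arm range are taken from the text.

```python
from dataclasses import dataclass

MIN_ARM_BP = 500      # about 0.5 kb, per the text
MAX_ARM_BP = 10_000   # about 10.0 kb, per the text


@dataclass(frozen=True)
class KnockoutConstruct:
    """Hypothetical model: 5' homology arm, positive selectable marker, 3' homology arm."""
    five_prime_arm: str
    marker: str           # e.g., a neomycin resistance cassette
    three_prime_arm: str

    def __post_init__(self):
        # Reject homology arms outside the size range given in the text.
        for name, arm in (("5'", self.five_prime_arm), ("3'", self.three_prime_arm)):
            if not MIN_ARM_BP <= len(arm) <= MAX_ARM_BP:
                raise ValueError(
                    f"{name} arm is {len(arm)} bp; expected {MIN_ARM_BP}-{MAX_ARM_BP} bp"
                )

    def assemble(self) -> str:
        """Return the linear construct: 5' arm, then marker, then 3' arm."""
        return self.five_prime_arm + self.marker + self.three_prime_arm


def omitted_bp(locus_length: int, construct: KnockoutConstruct) -> int:
    """Base pairs of the locus lying between the two arms (lost on recombination).

    Assumes the arms are taken from the extreme 5' and 3' ends of the locus.
    """
    return locus_length - len(construct.five_prime_arm) - len(construct.three_prime_arm)


# Toy placeholder sequences, not real genomic or marker sequence.
construct = KnockoutConstruct("A" * 1000, "G" * 800, "T" * 1500)
linear = construct.assemble()
```

In this toy case a 6,000 bp locus would lose the 3,500 bp lying between the two arms, which corresponds to the omitted coding sequence described above.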

[0344] Variations on this basic technique also exist and are well known in the art. For example, a "knock-in" construct refers to the same basic arrangement of a nucleic acid encoding a 5' genomic locus fragment linked to nucleic acid encoding a positive selectable marker which in turn is linked to a nucleic acid encoding a 3' genomic locus fragment, but which differs in that none of the coding sequence is omitted and thus the 5' and the 3' genomic fragments used were initially contiguous before being disrupted by the introduction of the nucleic acid encoding the positive selectable marker gene. This "knock-in" type of construct is thus very useful for the construction of mutant transgenic animals when only a limited region of the genomic locus of the gene to be mutated, such as a single exon, is available for cloning and genetic manipulation. Alternatively, the "knock-in" construct can be used to specifically eliminate a single functional domain of the targeted gene, resulting in a transgenic animal which expresses a polypeptide of the targeted gene which is defective in one function, while retaining the function of other domains of the encoded polypeptide. This type of "knock-in" mutant frequently has the characteristic of a so-called "dominant negative" mutant because, especially in the case of proteins which homomultimerize, it can specifically block the action of the polypeptide product of the wild-type gene from which it was derived.

[0345] Each knockout construct to be inserted into the cell must first be in the linear form. Therefore, if the knockout construct has been inserted into a vector, linearization is accomplished by digesting the DNA with a suitable restriction endonuclease selected to cut only within the vector sequence and not within the knockout construct sequence. For insertion, the knockout construct is added to the ES cells under appropriate conditions for the insertion method chosen, as is known to the skilled artisan. Where more than one construct is to be introduced into the ES cell, each knockout construct can be introduced simultaneously or one at a time.
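By way of illustration only, the choice of a restriction endonuclease for linearization can be sketched as a simple sequence scan. The function names, the small enzyme table, and the toy sequences below are hypothetical. The recognition sites listed are the standard ones for these enzymes. The requirement that the enzyme cut the vector backbone exactly once is an added assumption; the text requires only that the enzyme cut within the vector and not within the knockout construct. The sketch also ignores sites spanning a vector/construct junction or the wrap-around point of a circular plasmid.

```python
# Recognition sites of a few common restriction endonucleases. All are
# palindromic, so scanning one strand is sufficient.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "NotI": "GCGGCCGC",
}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of site in seq, including overlapping occurrences."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i))


def linearizing_enzymes(backbone: str, insert: str) -> list[str]:
    """Enzymes that cut the vector backbone exactly once and never cut the insert."""
    return sorted(
        name
        for name, site in SITES.items()
        if count_sites(backbone, site) == 1 and count_sites(insert, site) == 0
    )


# Toy placeholder sequences, not a real vector or knockout construct.
backbone = "TTTTGCGGCCGCAAAAGAATTCCCCC"   # one NotI site, one EcoRI site
insert = "ACGTGAATTCACGTGGATCC"           # one EcoRI site, one BamHI site
candidates = linearizing_enzymes(backbone, insert)
```

Here EcoRI is rejected because it also cuts the toy insert, and BamHI and HindIII are rejected because they do not cut the toy backbone.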

[0346] After suitable ES cells containing the knockout construct in the proper location have been identified by the selection techniques outlined above, the cells can be inserted into an embryo. Insertion can be accomplished in a variety of ways known to the skilled artisan; however, a preferred method is by microinjection. For microinjection, about 10-30 cells are collected into a micropipette and injected into embryos that are at the proper stage of development to permit integration of the foreign ES cell containing the knockout construct into the developing embryo. For instance, the transformed ES cells can be microinjected into blastocysts. The suitable stage of development for the embryo used for insertion of ES cells is very species dependent; however, for mice it is about 3.5 days. The embryos are obtained by perfusing the uterus of pregnant females. Suitable methods for accomplishing this are known to the skilled artisan. After the ES cell has been introduced into the embryo, the embryo can be implanted into the uterus of a pseudopregnant foster mother for gestation as described above.

[0347] Yet other methods of making knock-out or disruption transgenic animals are also generally known. See, for example, Hogan B. et al., MANIPULATING THE MOUSE EMBRYO, 2ND EDITION, Cold Spring Harbor Press, 1994. Recombinase-dependent knockouts can also be generated, e.g., by homologous recombination to insert target sequences, such that tissue-specific and/or temporal inactivation of a target gene can be controlled by recombinase sequences (described infra).

[0348] Animals containing more than one knockout construct and/or more than one transgene expression construct are prepared in any of several ways. The preferred manner of preparation is to generate a series of mammals, each containing one of the desired transgenic phenotypes. Such animals are bred together through a series of crosses, backcrosses and selections, to ultimately generate a single animal containing all desired knockout constructs and/or expression constructs, where the animal is otherwise congenic (genetically identical) to the wild type except for the presence of the knockout construct(s) and/or transgene(s).

[0349] The functional Caliban "knockout" non-human animals of the invention are of several types. Some non-human animals of the invention that are functional Caliban "knockouts" express sufficient levels of a Caliban inhibitory nucleic acid, e.g., antisense sequences or ribozymes of the invention, to decrease the levels or knockout the expression of functional polypeptide. Some non-human animals of the invention that are functional Caliban "knockouts" express sufficient levels of a Caliban dominant negative polypeptide such that the effective amount of free endogenous active Caliban is decreased. Some non-human animals of the invention that are functional Caliban "knockouts" express sufficient levels of an antibody of the invention, e.g., a Caliban antibody, such that the effective amount of free endogenous active Caliban is decreased. Some non-human animals of the invention that are functional Caliban "knockouts" are "conventional" knockouts in that their endogenous Caliban gene has been disrupted or mutated.

[0350] Functional Caliban "knockout" non-human animals of the invention also include the inbred mouse strain of the invention and the cells and cell lines derived from these mice.

[0351] The invention provides a novel use for these non-human animals by discovering that animals that do not express sufficient levels of a Caliban polypeptide have increased susceptibility to cancer, developmental delay, hypertrophy of the nervous system, or a combination of any two or more thereof. Thus, by using the transgenic non-human animals or inbred strains, e.g., mouse strains, of the invention, the invention provides in vivo methods to identify modulators, e.g., chemical or genetic modulators, of susceptibility to cancer, developmental delay, hypertrophy of the nervous system, or a combination of any two or more thereof.

B. Inbred Mouse Strain

[0352] The invention provides an inbred mouse and an inbred mouse strain that can be generated as described herein and bred by standard techniques, see, e.g., (U.S. Pat. Nos. 6,040,495; 5,552,287).

[0353] In order to screen for mutations with recessive effects, a number of strategies can be used, all involving a further two generations. For example, male G1 mice can be bred to wild-type female mice. The resulting progeny (G2 mice) can be interbred or bred back to the G1 father. The G3 mice that result from these crosses will be homozygotes for mutations in a small number of genes (3-6) in the genome, but the identity of these genes is unknown. With enough G3 mice, a good sampling of the genome should be present.
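By way of illustration only, the expected frequency of homozygous G3 animals under the two crossing strategies can be worked through with exact fractions. The sketch below rests on simplifying assumptions that are not stated in the text: a single autosomal mutation carried in heterozygous form by the G1 male, Mendelian segregation, and full viability of all genotypes. All function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)      # chance a G2 animal inherits the mutation from the G1 male
QUARTER = Fraction(1, 4)   # chance a pup from a carrier x carrier cross is homozygous


def p_homozygous_backcross() -> Fraction:
    """Per-pup chance of homozygosity when a random G2 female is bred to her G1 father."""
    return HALF * QUARTER


def p_homozygous_intercross() -> Fraction:
    """Per-pup chance of homozygosity when two random G2 siblings are interbred."""
    return HALF * HALF * QUARTER


def p_litter_has_homozygote(n_pups: int) -> Fraction:
    """Chance that a carrier x carrier litter of n_pups contains at least one homozygote."""
    return 1 - (1 - QUARTER) ** n_pups
```

Under these assumptions the backcross to the G1 father yields homozygous G3 pups at 1/8 per pup, against 1/16 for a random G2 intercross, which is one reason a backcross can need fewer animals to sample a given mutation.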

8. PEPTIDES AND POLYPEPTIDES

[0354] The invention provides isolated or recombinant polypeptides comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99% or more sequence identity to the sequences disclosed herein over a region of at least about 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100 or more residues, or, the full length of the polypeptide, or, a polypeptide encoded by a nucleic acid of the invention. In one aspect, the polypeptide comprises a sequence as set forth in the Figures. The invention provides methods for inhibiting the activity of Caliban polypeptides, e.g., a polypeptide of the invention. The invention also provides methods for screening for compositions that inhibit the activity of, or bind to (e.g., bind to the active site of), Caliban polypeptides, e.g., a polypeptide of the invention.
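By way of illustration only, a sequence identity threshold over a minimum region can be checked on two pre-aligned sequences as sketched below. The function names are hypothetical, and the identity convention used (matches divided by the number of aligned columns in which neither sequence has a gap) is one of several in common use and is an assumption here, not a definition taken from this document. The alignment itself is assumed to have been produced beforehand by a separate tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped aligned positions")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def meets_threshold(aligned_a: str, aligned_b: str,
                    min_identity: float = 95.0, min_region: int = 100) -> bool:
    """True if identity is at least min_identity over at least min_region compared residues."""
    compared = sum(1 for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-")
    return compared >= min_region and percent_identity(aligned_a, aligned_b) >= min_identity


# Toy placeholder sequences: 100 residues with 4 mismatches at the end.
reference = "ACDEFGHIKL" * 10
variant = "ACDEFGHIKL" * 9 + "ACDEFGWWWW"
```

With the defaults shown, the toy variant passes the 95% threshold over a 100-residue region, whereas a sequence shorter than 100 compared residues would fail regardless of its identity.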

[0355] In one aspect, the invention provides Caliban polypeptides (and the nucleic acids encoding them) where one, some or all of the amino acids of the Caliban polypeptides are replaced with substituted amino acids. In one aspect, the invention provides methods to disrupt the interaction of Caliban polypeptides with other proteins.

[0356] The peptides and polypeptides of the invention can be expressed recombinantly in vivo after administration of nucleic acids, as described herein, or, they can be administered directly, e.g., as a pharmaceutical composition. They can be expressed in vitro or in vivo to screen for modulators of a Caliban activity and for agents that modulate the nuclear export mediation activity of Caliban. For example, nuclear export of the biomarker EYFP-HDA is one assay that can be used as a screen for modulators of a Caliban activity using the peptides and polypeptides of the invention. Growth on soft agar and any of the measurements of cancer traits as described herein can also be used in conjunction with the peptides and polypeptides of the invention.

[0357] Polypeptides and peptides of the invention can be isolated from natural sources, be synthetic, or be recombinantly generated polypeptides. Peptides and proteins can be recombinantly expressed in vitro or in vivo. The peptides and polypeptides of the invention can be made and isolated using any method known in the art. Polypeptides and peptides of the invention can also be synthesized, whole or in part, using chemical methods well known in the art. See, e.g., (Caruthers, Nucleic Acids Res. Symp. Ser. 215-223, 1980; Horn, Nucleic Acids Res. Symp. Ser. 225-232, 1990; Banga, Therapeutic Peptides and Proteins, Formulation, Processing and Delivery Systems, 1995). For example, peptide synthesis can be performed using various solid-phase techniques (see, e.g., Roberge, Science 269: 202, 1995; Merrifield, Methods Enzymol. 289: 3-13, 1997) and automated synthesis can be achieved, e.g., using the ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer.

[0358] The peptides and polypeptides of the invention, as defined above, include all "mimetic" and "peptidomimetic" forms. The terms "mimetic" and "peptidomimetic" refer to a synthetic chemical compound which has substantially the same structural and/or functional characteristics of the polypeptides of the invention. The mimetic can be either entirely composed of synthetic, non-natural analogues of amino acids, or a chimeric molecule of partly natural peptide amino acids and partly non-natural analogs of amino acids. The mimetic can also incorporate any amount of natural amino acid conservative substitutions as long as such substitutions also do not substantially alter the mimetic's structure and/or activity. As with polypeptides of the invention which are conservative variants, routine experimentation will determine whether a mimetic is within the scope of the invention, i.e., that its structure and/or function is not substantially altered. Thus, a mimetic composition is within the scope of the invention if, when administered to or expressed in a cell, it has a Caliban activity. A mimetic composition can also be within the scope of the invention if it can inhibit an activity of a Caliban polypeptide of the invention, e.g., be a dominant negative mutant or bind to an antibody of the invention.

[0359] Polypeptide mimetic compositions can contain any combination of non-natural structural components, which are typically from three structural groups: a) residue linkage groups other than the natural amide bond ("peptide bond") linkages; b) non-natural residues in place of naturally occurring amino acid residues; or c) residues which induce secondary structural mimicry, i.e., to induce or stabilize a secondary structure, e.g., a beta turn, gamma turn, beta sheet, alpha helix conformation, and the like. For example, a polypeptide can be characterized as a mimetic when all or some of its residues are joined by chemical means other than natural peptide bonds. Individual peptidomimetic residues can be joined by peptide bonds, other chemical bonds or coupling means, such as, e.g., glutaraldehyde, N-hydroxysuccinimide esters, bifunctional maleimides, N,N'-dicyclohexylcarbodiimide (DCC) or N,N'-diisopropylcarbodiimide (DIC). Linking groups that can be an alternative to the traditional amide bond ("peptide bond") linkages include, e.g., ketomethylene (e.g., -C(=O)-CH2- for -C(=O)-NH-), aminomethylene (CH2-NH), ethylene, olefin (CH=CH), ether (CH2-O), thioether (CH2-S), tetrazole (CN4-), thiazole, retroamide, thioamide, or ester (see, e.g., Spatola, Chemistry and Biochemistry of Amino Acids, Peptides and Proteins, 7: 267-357, 1983).

[0360] A polypeptide can also be characterized as a mimetic by containing all or some non-natural residues in place of naturally occurring amino acid residues. Non-natural residues are well described in the scientific and patent literature; a few exemplary non-natural compositions useful as mimetics of natural amino acid residues and guidelines are described below. Mimetics of aromatic amino acids can be generated by replacement with, e.g., D- or L-naphthylalanine; D- or L-phenylglycine; D- or L-2-thienylalanine; D- or L-1-, 2-, 3-, or 4-pyrenylalanine; D- or L-3-thienylalanine; D- or L-(2-pyridinyl)-alanine; D- or L-(3-pyridinyl)-alanine; D- or L-(2-pyrazinyl)-alanine; D- or L-(4-isopropyl)-phenylglycine; D-(trifluoromethyl)-phenylglycine; D-(trifluoromethyl)-phenylalanine; D-p-fluoro-phenylalanine; D- or L-p-biphenylphenylalanine; D- or L-p-methoxy-biphenylphenylalanine; D- or L-2-indole(alkyl)alanines; and D- or L-alkylalanines, where alkyl can be substituted or unsubstituted methyl, ethyl, propyl, hexyl, butyl, pentyl, isopropyl, iso-butyl, sec-isotyl, iso-pentyl, or non-acidic amino acids. Aromatic rings of a non-natural amino acid include, e.g., thiazolyl, thiophenyl, pyrazolyl, benzimidazolyl, naphthyl, furanyl, pyrrolyl, and pyridyl aromatic rings.

[0361] Mimetics of acidic amino acids can be generated by substitution by, e.g., non-carboxylate amino acids while maintaining a negative charge; (phosphono)alanine; sulfated threonine. Carboxyl side groups (e.g., aspartyl or glutamyl) can also be selectively modified by reaction with carbodiimides (R'-N=C=N-R') such as, e.g., 1-cyclohexyl-3-(2-morpholinyl-(4-ethyl)) carbodiimide or 1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Aspartyl or glutamyl can also be converted to asparaginyl and glutaminyl residues by reaction with ammonium ions.

[0362] Mimetics of basic amino acids can be generated by substitution with, e.g., (in addition to lysine and arginine) the amino acids ornithine, citrulline, or [...] (e.g., containing the CN- moiety in place of COOH) can be substituted for [...]. Arginine residue mimetics can be generated by reacting arginyl with, e.g., one or more conventional reagents, including, e.g., phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, or ninhydrin, preferably under alkaline conditions. Tyrosine residue mimetics can be generated by reacting tyrosyl with, e.g., aromatic diazonium compounds or tetranitromethane. N-acetylimidazole and tetranitromethane can be used to form O-acetyl tyrosyl species and 3-nitro derivatives, respectively. Cysteine residue mimetics can be generated by reacting cysteinyl residues with, e.g., alpha-haloacetates such as 2-chloroacetic acid or chloroacetamide and corresponding amines, to give carboxymethyl or carboxyamidomethyl derivatives. Cysteine residue mimetics can also be generated by reacting cysteinyl residues with, e.g., bromo-trifluoroacetone; alpha-bromo-beta-(5-imidazolyl) propionic acid; chloroacetyl phosphate; N-alkylmaleimides; 3-nitro-2-pyridyl disulfide; methyl 2-pyridyl disulfide; p-chloromercuribenzoate; 2-chloromercuri-4-nitrophenol; or chloro-7-nitrobenzo-oxa-1,3-diazole. Lysine mimetics can be generated (and amino terminal residues can be altered) by reacting lysinyl with, e.g., succinic or other carboxylic acid anhydrides. Lysine and other alpha-amino-containing residue mimetics can also be generated by reaction with imidoesters, such as methyl picolinimidate, pyridoxal phosphate, pyridoxal, chloroborohydride, trinitrobenzenesulfonic acid, O-methylisourea, 2,4-pentanedione, and transamidase-catalyzed reactions with glyoxylate. Mimetics of methionine can be generated by reaction with, e.g., methionine sulfoxide. Mimetics of [...] e.g., pipecolic acid, thiazolidine carboxylic acid, 3- or 4-hydroxy [...] e.g., diethylprocarbonate or para-bromophenacyl bromide. Other mimetics include, e.g., those generated by hydroxylation of [...]

[0363] A component of a polypeptide of the invention can also be replaced by an amino acid (or peptidomimetic residue) of the opposite chirality. Thus, any amino acid naturally occurring in the L-configuration (which can also be referred to as the R or S, depending upon the structure of the chemical entity) can be replaced with the amino acid of the same chemical structural type or a peptidomimetic, but of the opposite chirality, referred to as the D-amino acid, but which can additionally be referred to as the R- or S-form.

[0364] The invention also provides polypeptides that are "substantially identical" to an exemplary polypeptide of the invention. A "substantially identical" amino acid sequence is a sequence that differs from a reference sequence by one or more conservative or non-conservative amino acid substitutions, deletions, or insertions, particularly when such a substitution occurs at a site that is not the active site of the molecule, and provided that the polypeptide essentially retains its functional properties. A conservative amino acid substitution, for example, substitutes one amino acid for another of the same class (e.g., substitution of one hydrophobic amino acid, such as isoleucine, valine, leucine, or methionine, for another, or substitution of one polar amino acid for another, such as substitution of arginine for lysine, glutamic acid for aspartic acid or glutamine for asparagine). Amino- or carboxyl-terminal, or internal, amino acids which are not required for a Caliban activity can be removed.
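By way of illustration only, the notion of a conservative substitution as a replacement within the same residue class can be expressed as a small lookup. The class table below is a deliberately minimal, hypothetical grouping built around the examples given in the text (hydrophobic isoleucine, valine, leucine and methionine; arginine and lysine; glutamic acid and aspartic acid); the pairing of glutamine with asparagine and the class labels are illustrative assumptions. It is not a complete or authoritative classification, and residues outside the table are simply treated as unclassified.

```python
from typing import Optional

# Minimal illustrative residue classes (one-letter codes). Hypothetical grouping.
CLASSES = {
    "hydrophobic": set("IVLM"),
    "basic": set("RK"),
    "acidic": set("DE"),
    "amide": set("NQ"),
}


def residue_class(aa: str) -> Optional[str]:
    """Return the class name for a one-letter residue code, or None if unclassified."""
    aa = aa.upper()
    for name, members in CLASSES.items():
        if aa in members:
            return name
    return None


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues differ but belong to the same class in CLASSES."""
    if original.upper() == replacement.upper():
        return False  # not a substitution at all
    cls = residue_class(original)
    return cls is not None and cls == residue_class(replacement)
```

Under this table, isoleucine to valine and arginine to lysine are conservative, while isoleucine to aspartic acid is not. Whether a given substitution preserves Caliban activity would still have to be determined experimentally.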

[0365] The skilled artisan will recognize that individual synthetic residues and polypeptides incorporating these mimetics can be synthesized using a variety of procedures and methodologies, which are well described in the scientific and patent literature, e.g., (Gilman et al., Organic Syntheses Collective). Peptides and peptide mimetics of the invention can also be synthesized using combinatorial methodologies. Various techniques for generation of peptide and peptidomimetic libraries are well known, and include, e.g., multipin, tea bag, and split-couple-mix techniques; see, e.g., (al-Obeidi, Mol. Biotechnol. 9: 205-223, 1998; Hruby, Curr. Opin. Chem. Biol. 1: 114-119, 1997; Ostergaard, Mol. Divers. 3: 17-27, 1997; Ostresh, Methods Enzymol. 267: 220-234, 1996). Modified peptides of the invention can be further produced by chemical modification methods, see, e.g., (Belousov, Nucleic Acids Res. 25: 3440-3444, 1997; Frenkel, Free Radic. Biol. Med. 19: 373-380, 1995; Blommers, Biochemistry 33: 7886-7896, 1994).

[0366] Peptides and polypeptides of the invention can also be synthesized and expressed as fusion proteins with one or more additional domains linked thereto for, e.g., producing a more immunogenic peptide, to more readily isolate a recombinantly synthesized peptide, to identify and isolate antibodies and antibody-expressing B cells, and the like. Detection and purification facilitating domains include, e.g., metal chelating peptides such as polyhistidine tracts and histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex Corp, Seattle Wash.). The inclusion of cleavable linker sequences such as Factor Xa or enterokinase (Invitrogen, San Diego Calif.) between a purification domain and the motif-comprising peptide or polypeptide can facilitate purification. For example, an expression vector can include an epitope-encoding nucleic acid sequence linked to six histidine residues followed by a thioredoxin and an enterokinase cleavage site (see, e.g., Williams, Biochemistry 34: 1787-1797, 1995; Dobeli, Protein Expr. Purif. 12: 404-14, 1998). The histidine residues facilitate detection and purification while the enterokinase cleavage site provides a means for purifying the epitope from the remainder of the fusion protein. Technology pertaining to vectors encoding fusion proteins and application of fusion proteins are well described in the scientific and patent literature (see, e.g., Kroll, DNA Cell. Biol. 12: 441-53, 1993).
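By way of illustration only, the layout of such a tagged fusion and the release of the epitope by enterokinase can be sketched at the amino acid level. The function names, the domain order (six histidines, carrier, enterokinase site, epitope) and the toy sequences are assumptions made for this sketch; the carrier string is a placeholder and is not the thioredoxin sequence. The enterokinase recognition sequence Asp-Asp-Asp-Asp-Lys, with cleavage after the lysine, is the standard one for that protease.

```python
HIS6 = "H" * 6                # six histidine residues for metal-affinity capture
ENTEROKINASE_SITE = "DDDDK"   # enterokinase cleaves after the lysine


def build_fusion(epitope: str, carrier: str = "") -> str:
    """Assumed order: His6 tag, carrier domain, enterokinase site, then the epitope."""
    return HIS6 + carrier + ENTEROKINASE_SITE + epitope


def cleave_after_site(fusion: str, site: str = ENTEROKINASE_SITE) -> tuple[str, str]:
    """Split the fusion just after the first occurrence of the cleavage site.

    Returns (tag-containing fragment, released fragment). A carrier or epitope
    that itself contains the site would be cut at that earlier position.
    """
    idx = fusion.find(site)
    if idx == -1:
        raise ValueError("no cleavage site found")
    cut = idx + len(site)
    return fusion[:cut], fusion[cut:]


# Toy placeholder sequences, not real protein sequences.
fusion = build_fusion(epitope="ACDEFGHIK", carrier="MSGSGS")
tag_fragment, released = cleave_after_site(fusion)
```

Placing the site immediately before the epitope means that, in this sketch, the released fragment carries no residual tag residues, while the histidine-containing fragment can be retained on an immobilized metal support.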

[0367] The terms "polypeptide" and "protein" as used herein, refer to amino acids joined to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and can contain modified amino acids other than the 20 gene-encoded amino acids. The term "polypeptide" also includes peptides and polypeptide fragments, motifs and the like. The term also includes glycosylated polypeptides. The peptides and polypeptides of the invention also include all "mimetic" and "peptidomimetic" forms, as described in further detail, below.

[0368] As used herein, the term "isolated" means that the material is removed from its original environment (e.g., the natural environment if it is naturally occurring). For example, a naturally occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or polypeptide, separated from some or all of the coexisting materials in the natural system, is isolated. Such polynucleotides could be part of a vector and/or such polynucleotides or polypeptides could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment. As used herein, an isolated material or composition can also be a "purified" composition, i.e., it does not require absolute purity; rather, it is intended as a relative definition. Individual nucleic acids obtained from a library can be conventionally purified to electrophoretic homogeneity. In alternative aspects, the invention provides nucleic acids which have been purified from genomic DNA or from other sequences in a library or other environment by at least one, two, three, four, five or more orders of magnitude.

[0369] An exemplary Caliban is presented; the nucleic acid sequence and the amino acid translation thereof are shown in the Figures.

[0370] In practicing the methods of the invention, a variety of apparatus and methodologies can be used in conjunction with the polypeptides and nucleic acids of the invention, e.g., to screen polypeptides for Caliban activity, to screen compounds as potential modulators (e.g., inhibitors or activators) of a Caliban activity, for antibodies that bind to a polypeptide of the invention, for nucleic acids that hybridize to a nucleic acid of the invention, to screen for cells expressing a polypeptide of the invention and the like.

[0371] In one aspect, the peptides and polypeptides of the invention can be bound to a solid support. Solid supports can include, e.g., membranes (e.g., nitrocellulose or nylon), a microtiter dish (e.g., PVC, polypropylene, or polystyrene), a test tube (glass or plastic), a dip stick (e.g., glass, PVC, polypropylene, polystyrene, latex and the like), a microfuge tube, or a glass, silica, plastic, metallic or polymer bead or other substrate such as paper. One solid support uses a metal (e.g., cobalt or nickel)-comprising column which binds with specificity to a histidine tag engineered onto a peptide.

[0372] Adhesion of peptides to a solid support can be direct (i.e., the protein contacts the solid support) or indirect (a particular compound or compounds are bound to the support and the target protein binds to this compound rather than the solid support). Peptides can be immobilized either covalently (e.g., utilizing single reactive thiol groups of cysteine residues; see, e.g., Colliuod, Bioconjugate Chem. 4: 528-536, 1993) or non-covalently but specifically (e.g., via immobilized antibodies (see, e.g., Schuhmann, Adv. Mater. 3: 388-391, 1991; Lu, Anal. Chem. 67: 83-87, 1995); the biotin/streptavidin system (see, e.g., Iwane, Biophys. Biochem. Res. Comm. 230: 76-80, 1997); metal chelating, e.g., Langmuir-Blodgett films (see, e.g., Ng, Langmuir 11: 4048-55, 1995); or metal-chelating self-assembled monolayers (see, e.g., Sigal, Anal. Chem. 68: 490-497, 1996) for binding of polyhistidine fusions).

[0373] Indirect binding can be achieved using a variety of linkers which are commercially available. The reactive ends can be any of a variety of functionalities including, but not limited to: amino reacting ends such as N-hydroxysuccinimide (NHS) active esters, imidoesters, aldehydes, epoxides, sulfonyl halides, isocyanate, isothiocyanate, and nitroaryl halides; and thiol reacting ends such as pyridyl disulfides, maleimides, thiophthalimides, and active halogens. The heterobifunctional crosslinking reagents have two different reactive ends, e.g., an amino-reactive end and a thiol-reactive end, while homobifunctional reagents have two similar reactive ends, e.g., bismaleimidohexane (BMH) which permits the cross-linking of sulfhydryl-containing compounds. The spacer can be of varying length and be aliphatic or aromatic. Examples of commercially available homobifunctional cross-linking reagents include, but are not limited to, the imidoesters such as dimethyl adipimidate dihydrochloride (DMA); dimethyl pimelimidate dihydrochloride (DMP); and dimethyl suberimidate dihydrochloride (DMS). Heterobifunctional reagents include commercially available active halogen-NHS active ester coupling agents such as N-succinimidyl bromoacetate and N-succinimidyl (4-iodoacetyl)aminobenzoate (SIAB) and the sulfosuccinimidyl derivatives such as sulfosuccinimidyl(4-iodoacetyl)aminobenzoate (sulfo-SIAB) (Pierce). Another group of coupling agents is the heterobifunctional and thiol cleavable agents such as N-succinimidyl 3-(2-pyridyldithio)propionate (SPDP) (Pierce Chemicals, Rockford, Ill.).

[0374] Antibodies can be used for binding polypeptides and peptides of the invention to a solid support. This can be done directly by binding peptide-specific antibodies to the column or it can be done by creating fusion protein chimeras comprising motif-containing peptides linked to, e.g., a known epitope (e.g., a tag (e.g., FLAG, myc)) or an appropriate immunoglobulin constant domain sequence (an "immunoadhesin," see, e.g., Capon, Nature 337: 525-531, 1989).

9. FUSION PROTEINS

[0375] Antibodies to Caliban gene products (e.g., a Caliban protein) can be used to generate fusion proteins. For example, the antibodies of the present invention, when fused to a second protein, can be used as an antigenic tag. Antibodies raised against a Caliban gene product (e.g., a Caliban protein) can be used to indirectly detect the second protein by binding to the polypeptide.

[0376] Examples of domains that can be fused to polypeptides include not only heterologous signal sequences, but also other heterologous functional regions. The fusion does not necessarily need to be direct, but can occur through linker sequences.

[0377] Moreover, fusion proteins can also be engineered to improve characteristics of the polypeptide. For instance, a region of additional amino acids, particularly charged amino acids, can be added to the N-terminus of the polypeptide to improve stability and persistence during purification from the host cell or subsequent handling and storage. Also, peptide moieties can be added to the polypeptide to facilitate purification. Such regions can be removed prior to final preparation of the polypeptide. The addition of peptide moieties to facilitate handling of polypeptides is a familiar and routine technique in the art.

[0378] Moreover, antibody compositions to a Caliban protein, including fragments, and specifically epitopes, can be combined with parts of the constant domain of immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins facilitate purification and show an increased half-life in vivo. One reported example describes chimeric proteins consisting of the first two domains of the human CD4-polypeptide and various domains of the constant regions of the heavy or light chains of mammalian immunoglobulins (EP A 394,827; Traunecker et al., Nature 331: 84-86, 1988). Fusion proteins having disulfide-linked dimeric structures (due to the IgG) can also be more efficient in binding and neutralizing other molecules than the monomeric secreted protein or protein fragment alone (Fountoulakis et al., J. Biochem. 270: 3958-3964, 1995).

[0379] Similarly, EP-A-0 464 533 (Canadian counterpart 2045869) discloses fusion proteins comprising various portions of the constant region of immunoglobulin molecules together with another human protein or part thereof. In many cases, the Fc part in a fusion protein is beneficial in therapy and diagnosis, and thus can result in, for example, improved pharmacokinetic properties (EP-A 0232 262). Alternatively, deleting the Fc part after the fusion protein has been expressed, detected, and purified may be desired. For example, the Fc portion can hinder therapy and diagnosis if the fusion protein is used as an antigen for immunizations. In drug discovery, for example, human proteins, such as hIL-5, have been fused with Fc portions for the purpose of high throughput screening assays to identify antagonists of hIL-5 (Bennett et al., J. Molecular Recognition 8: 52-58, 1995; Johanson et al., J. Biol. Chem. 270: 9459-9471, 1995).

[0380] Moreover, the polypeptides can be fused to marker sequences, such as a peptide which facilitates purification of the fused polypeptide. In preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide, such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, Calif., 91311), among others, many of which are commercially available. As described in Gentz et al., Proc. Natl. Acad. Sci. U.S.A. 86: 821-824, 1989, for instance, hexa-histidine provides for convenient purification of the fusion protein. Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al., Cell 37: 767, 1984).

[0381] Thus, any of these above fusions can be engineered using the polynucleotides or the polypeptides of the present invention.

10. RNA AND DNA INTERFERENCE METHODS

A. Short Interfering RNAs (RNAi)

[0382] RNA interference (RNAi) is a mechanism of post-transcriptional gene silencing mediated by double-stranded RNA (dsRNA), which is distinct from antisense and ribozyme-based approaches (see Jain, K.K., Pharmacogenomics 5: 239-42, 2004, for a review of RNAi and siRNA). RNA interference is useful in a method for treating a neoplastic disease state in a mammal by administering to the mammal a nucleic acid molecule (e.g., dsRNA) that hybridizes under stringent conditions to a Caliban target gene and attenuates expression of said target gene. dsRNA molecules are believed to direct sequence-specific degradation of mRNA in cells of various types after first undergoing processing by an RNase III-like enzyme called DICER (Bernstein et al., Nature 409: 363, 2001) into smaller dsRNA molecules comprised of two 21 nt strands, each of which has a 5' phosphate group and a 3' hydroxyl, and includes a 19 nt region precisely complementary with the other strand, so that there is a 19 nt duplex region flanked by 2 nt 3' overhangs. RNAi is thus mediated by short interfering RNAs (siRNA), which typically comprise a double-stranded region approximately 19 nucleotides in length with 1-2 nucleotide 3' overhangs on each strand, resulting in a total length of between approximately 21 and 23 nucleotides. In mammalian cells, dsRNA longer than approximately 30 nucleotides typically induces nonspecific mRNA degradation via the interferon response. However, the presence of siRNA in mammalian cells, rather than inducing the interferon response, results in sequence-specific gene silencing.

[0383] In general, a short, interfering RNA (siRNA) comprises an RNA duplex that is preferably approximately 19 basepairs long and optionally further comprises one or two single-stranded overhangs or loops. An siRNA may comprise two RNA strands hybridized together, or may alternatively comprise a single RNA strand that includes a self-hybridizing portion. siRNAs may include one or more free strand ends, which may include phosphate and/or hydroxyl groups. siRNAs typically include a portion that hybridizes under stringent conditions with a target transcript. One strand of the siRNA (or the self-hybridizing portion of the siRNA) is typically precisely complementary with a region of the target transcript, meaning that the siRNA hybridizes to the target transcript without a single mismatch. In certain embodiments of the invention in which perfect complementarity is not achieved, it is generally preferred that any mismatches be located at or near the siRNA termini.
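By way of illustration only, the duplex geometry described above (a 19 nt paired region with a 2 nt 3' overhang on each strand) can be sketched in a few lines of Python. The function names, the choice of a UU overhang, and the example target site are assumptions made for this sketch; they are not taken from the Caliban sequence or from any particular design tool.

```python
from typing import Tuple

# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def build_sirna(target_site: str, overhang: str = "UU") -> Tuple[str, str]:
    """Build (sense, antisense) strands for a 19 nt target site.

    target_site is 19 nt of the target transcript, written 5'->3'.
    Each returned strand is the 19 nt duplex portion followed by a
    3' overhang (UU here, which is an assumption of this sketch).
    """
    site = target_site.upper().replace("T", "U")
    if len(site) != 19:
        raise ValueError("expected a 19 nt target site")
    if set(site) - set(RNA_COMPLEMENT):
        raise ValueError("target site contains non-ACGU characters")
    sense = site + overhang
    antisense = reverse_complement_rna(site) + overhang
    return sense, antisense


# Hypothetical 19 nt target site, used only to exercise the functions.
sense, antisense = build_sirna("GCAUGCAUGGAUCCAUAGC")
print("sense     5'-" + sense + "-3'")
print("antisense 5'-" + antisense + "-3'")
```

The first 19 nt of the antisense strand are precisely complementary to the target site, so each strand is 21 nt long, consistent with the 21 to 23 nt total length given above.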

[0384] siRNAs have been shown to downregulate gene expression when transferred into mammalian cells by such methods as transfection, electroporation, or microinjection, or when expressed in cells via any of a variety of plasmid-based approaches. RNA interference using siRNA is reviewed in, e.g., Tuschl, Nat. Biotechnol. 20: 446-448, 2002 (see also Yu et al., Proc. Natl. Acad. Sci. U.S.A. 99: 6047-6052, 2002; Sui et al., Proc. Natl. Acad. Sci. U.S.A. 99: 5515-5520, 2002; Paddison et al., Genes and Dev. 16: 948-958, 2002; Brummelkamp et al., Science 296: 550-553, 2002; Miyagishi et al., Nat. Biotech. 20: 497-500, 2002; Paul et al., Nat. Biotech. 20: 505-508, 2002). As described in these and other references, the siRNA may consist of two individual nucleic acid strands or of a single strand with a self-complementary region capable of forming a hairpin (stem-loop) structure. A number of variations in structure, length, number of mismatches, size of loop, identity of nucleotides in overhangs, and the like, are consistent with effective siRNA-triggered gene silencing. While not wishing to be bound by any theory, it is thought that intracellular processing (e.g., by DICER) of a variety of different precursors results in production of siRNA capable of effectively mediating gene silencing. Generally it is preferred to target exons rather than introns, and it may also be preferable to select sequences complementary to regions within the 3' portion of the target transcript. Generally it is preferred to select sequences that contain an approximately equimolar ratio of the different nucleotides and to avoid stretches in which a single residue is repeated multiple times.
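The two sequence-selection preferences stated above (approximately equimolar nucleotide composition, and no long stretch of a single residue) lend themselves to a simple screen. The following Python sketch is illustrative only: the numeric thresholds (a tolerance of 0.15 around a 25% share per base, and a maximum run of three identical residues) are assumptions chosen for the example and are not values given in this disclosure.

```python
from collections import Counter
from typing import Iterator, Tuple


def max_homopolymer_run(seq: str) -> int:
    """Length of the longest stretch of one repeated residue."""
    if not seq:
        return 0
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def is_balanced(seq: str, tolerance: float = 0.15) -> bool:
    """True if every base makes up roughly one quarter of the sequence."""
    counts = Counter(seq)
    return all(abs(counts[b] / len(seq) - 0.25) <= tolerance for b in "ACGU")


def candidate_sites(transcript: str, length: int = 19, max_run: int = 3,
                    tolerance: float = 0.15) -> Iterator[Tuple[int, str]]:
    """Yield (start, site) for windows that pass both screens.

    The thresholds are illustrative assumptions, not prescribed values.
    """
    rna = transcript.upper().replace("T", "U")
    for start in range(len(rna) - length + 1):
        site = rna[start:start + length]
        if set(site) - set("ACGU"):
            continue  # skip windows containing ambiguous characters
        if max_homopolymer_run(site) > max_run:
            continue
        if not is_balanced(site, tolerance):
            continue
        yield start, site
```

Because sites within the 3' portion of the transcript may be preferred, a caller could, for example, sort the yielded candidates by descending start position before choosing among them.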

[0385] siRNAs may thus comprise RNA molecules having a double-stranded region approximately 19 nucleotides in length with 1-2 nucleotide 3' overhangs on each strand, resulting in a total length of between approximately 21 and 23 nucleotides. As used herein, siRNAs also include various RNA structures that may be processed in vivo to generate such molecules. Such structures include RNA strands containing two complementary elements that hybridize to one another to form a stem, a loop, and optionally an overhang, preferably a 3' overhang. Preferably, the stem is approximately 19 bp long, the loop is about 1-20, more preferably about 4-10, and most preferably about 6-8 nt long and/or the overhang is about 1-20, and more preferably about 2-15 nt long. In certain embodiments of the invention the stem is minimally 19 nucleotides in length and may be up to approximately 29 nucleotides in length. Loops of 4 nucleotides or greater are less likely subject to steric constraints than are shorter loops and therefore may be preferred. The overhang may include a 5' phosphate and a 3' hydroxyl. The overhang may but need not comprise a plurality of U residues, e.g., between 1 and 5 U residues. Classical siRNAs as described above trigger degradation of mRNAs to which they are targeted, thereby also reducing the rate of protein synthesis. In addition to siRNAs that act via the classical pathway, certain siRNAs that bind to the 3' UTR of a template transcript may inhibit expression of a protein encoded by the template transcript by a mechanism related to but distinct from classic RNA interference, e.g., by reducing translation of the transcript rather than decreasing its stability. Such RNAs are referred to as microRNAs (miRNAs) and are typically between approximately 20 and 26 nucleotides in length, e.g., 22 nt in length. It is believed that they are derived from larger precursors known as small temporal RNAs (stRNAs) or miRNA precursors, which are typically approximately 70 nt long with an approximately 4-15 nt loop (see Grishok et al., Cell 106: 23-24, 2001; Hutvagner et al., Science 293: 834-838, 2001; Ketting et al., Genes Dev. 15: 2654-2659). Endogenous RNAs of this type have been identified in a number of organisms including mammals, suggesting that this mechanism of post-transcriptional gene silencing may be widespread (Lagos-Quintana et al., Science 294: 853-858, 2001; Pasquinelli, Trends in Genetics 18: 171-173, 2002, and references in the foregoing two articles). MicroRNAs have been shown to block translation of target transcripts containing target sites in mammalian cells (Zeng et al., Molecular Cell 9: 1-20, 2002).

[0386] siRNAs such as naturally occurring or artificial (i.e., designed by humans) miRNAs that bind within the 3' UTR (or elsewhere in a target transcript) and inhibit translation may tolerate a larger number of mismatches in the siRNA/template duplex, and particularly may tolerate mismatches within the central region of the duplex. In fact, there is evidence that some mismatches may be desirable or required, as naturally occurring stRNAs frequently exhibit such mismatches, as do miRNAs that have been shown to inhibit translation in vitro. For example, when hybridized with the target transcript such siRNAs frequently include two stretches of perfect complementarity separated by a region of mismatch. A variety of structures are possible. For example, the miRNA may include multiple areas of nonidentity (mismatch). The areas of nonidentity (mismatch) need not be symmetrical in the sense that both the target and the miRNA include nonpaired nucleotides. Typically the stretches of perfect complementarity are at least 5 nucleotides in length, e.g., 6, 7, or more nucleotides in length, while the regions of mismatch may be, for example, 1, 2, 3, or 4 nucleotides in length.
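As a non-limiting illustration, the pairing pattern described in this paragraph (stretches of perfect complementarity of at least 5 nt separated by mismatch regions of 1 to 4 nt) can be checked computationally. The Python sketch below makes several simplifying assumptions that are not part of this disclosure: the small RNA and the target site are the same length, the alignment is ungapped and therefore symmetrical, and G:U wobble pairs are counted as mismatches. All function names are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

# Watson-Crick pairs only; G:U wobble is treated as a mismatch here,
# which is a simplifying assumption of this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pairing_pattern(small_rna: str, target_site: str) -> str:
    """Return '|' for each paired position and 'x' for each mismatch.

    Both sequences are written 5'->3'; the duplex is antiparallel, so
    position i of the small RNA faces position -(i + 1) of the target.
    """
    if len(small_rna) != len(target_site):
        raise ValueError("this ungapped sketch needs equal-length sequences")
    return "".join(
        "|" if (a, b) in PAIRS else "x"
        for a, b in zip(small_rna, reversed(target_site))
    )


def run_lengths(pattern: str) -> List[Tuple[str, int]]:
    """Collapse a pattern such as '|||xx||' into [('|', 3), ('x', 2), ('|', 2)]."""
    return [(symbol, len(list(group))) for symbol, group in groupby(pattern)]


def matches_described_pattern(pattern: str, min_match: int = 5,
                              max_mismatch: int = 4) -> bool:
    """True if there are at least two paired stretches of >= min_match nt
    and every mismatch region is between 1 and max_mismatch nt long."""
    runs = run_lengths(pattern)
    paired = [n for symbol, n in runs if symbol == "|"]
    unpaired = [n for symbol, n in runs if symbol == "x"]
    return (
        len(paired) >= 2
        and all(n >= min_match for n in paired)
        and all(1 <= n <= max_mismatch for n in unpaired)
    )
```

A fully complementary duplex yields a single paired run and is therefore not classified as this pattern, which distinguishes it from the classical siRNA case discussed earlier.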

[0387] Hairpin structures designed to mimic siRNAs and miRNA precursors are processed intracellularly into molecules capable of reducing or inhibiting expression of target transcripts (McManus et al., RNA 8: 842-850, 2002). These hairpin structures, which are based on classical siRNAs consisting of two RNA strands forming a 19 bp duplex structure, are classified as class I or class II hairpins. Class I hairpins incorporate a loop at the 5' or 3' end of the antisense siRNA strand (i.e., the strand complementary to the target transcript whose inhibition is desired) but are otherwise identical to classical siRNAs. Class II hairpins resemble miRNA precursors in that they include a 19 nt duplex region and a loop at either the 3' or 5' end of the antisense strand of the duplex in addition to one or more nucleotide mismatches in the stem. These molecules are processed intracellularly into small RNA duplex structures capable of mediating silencing. They appear to exert their effects through degradation of the target mRNA rather than through translational repression as is thought to be the case for naturally occurring miRNAs and stRNAs.

[0388] Thus it is evident that a diverse set of RNA molecules containing duplex structures is able to mediate silencing through various mechanisms. For the purposes of the present invention, any such RNA, one portion of which binds to a target transcript and reduces its expression, whether by triggering degradation, by inhibiting translation, or by other means, is considered to be an siRNA, and any structure that generates such an siRNA (i.e., serves as a precursor to the RNA) is useful in the practice of the present invention.

[0389] In the context of the present invention, siRNAs are useful both for therapeutic purposes, e.g., to modulate the expression of a Caliban protein in a subject at risk of or suffering from a disease or disorder (e.g., cancer), and for various of the inventive methods for the identification of compounds for treatment of a disease or disorder that modulate the activity or level of the molecules described herein. A preferred embodiment is the therapeutic treatment of lung cancer with an antibody, antisense vector, or double-stranded RNA vector.

[0390] The invention therefore provides a method of inhibiting expression of a gene encoding a Caliban protein comprising the steps of (i) providing a biological system in which expression of a gene encoding a Caliban protein is to be inhibited; and (ii) contacting the system with an siRNA targeted to a transcript encoding the Caliban protein. According to certain embodiments of the invention the Caliban protein is encoded by a gene within or linked to a disease susceptibility locus, or within which a functional mutation causing or contributing to susceptibility or development of a disease (e.g., lung cancer) may exist. In other embodiments, Caliban proteins are inhibited. According to certain embodiments of the invention the biological system comprises a cell, and the contacting step comprises expressing the siRNA in the cell. According to certain embodiments of the invention the biological system comprises a subject, e.g., a mammalian subject such as a mouse or human, and the contacting step comprises administering the siRNA to the subject or comprises expressing the siRNA in the subject. According to certain embodiments of the invention the siRNA is expressed inducibly and/or in a cell-type or tissue-specific manner.

[0391] By "biological system" is meant any vessel, well, or container in which biomolecules (e.g., nucleic acids, polypeptides, polysaccharides, lipids, and the like) are placed; a cell or population of cells; a tissue; an organ; an organism, and the like. Typically the biological system is a cell or population of cells, but the method can also be performed in a vessel using purified or recombinant proteins.

[0392] The invention provides siRNA molecules targeted to a transcript encoding any Caliban protein. In particular, the invention provides siRNA molecules selectively or specifically targeted to a transcript encoding a polymorphic variant of such a transcript, wherein existence of the polymorphic variant in a subject is indicative of susceptibility to or presence of a chemotherapy-resistant neoplastic disease (e.g., malignant glioma). The terms "selectively" or "specifically targeted to", in this context, are intended to indicate that the siRNA causes greater reduction in expression of the variant than of other variants (i.e., variants whose existence in a subject is not indicative of susceptibility to or presence of a chemotherapy-resistant neoplastic disease). The siRNA, or collections of siRNAs, may be provided in the form of kits with additional components as appropriate.

B. Short hairpin RNAs (shRNA)

[0393] RNA interference (RNAi), a mechanism of post-transcriptional gene silencing mediated by double-stranded RNA (dsRNA), is useful in a method for treating a neoplastic disease state in a mammal by administering to the mammal a nucleic acid molecule (e.g., dsRNA) that hybridizes under stringent conditions to a Caliban target gene and attenuates expression of said target gene (see Jain, Pharmacogenomics 5: 239-242, 2004, for a review of RNAi and siRNA). A further method of RNA interference in the present invention is the use of short hairpin RNAs (shRNA). A plasmid containing a DNA sequence encoding a particular desired siRNA sequence is delivered into a target cell via transfection or virally-mediated infection. Once in the cell, the DNA sequence is continuously transcribed into RNA molecules that loop back on themselves and form hairpin structures through intramolecular base pairing. These hairpin structures, once processed by the cell, are equivalent to transfected siRNA molecules and are used by the cell to mediate RNAi of the desired protein. The use of shRNA has an advantage over siRNA transfection, as the former can lead to stable, long-term inhibition of protein expression. Inhibition of protein expression by transfected siRNAs is a transient phenomenon that does not persist for time periods longer than several days. In some cases, this may be preferable and desired. In cases where longer periods of protein inhibition are necessary, shRNA-mediated inhibition is preferable.
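Purely as a sketch, the hairpin layout described above (a transcript that loops back on itself so that a sense stem pairs with an antisense stem) can be assembled as follows. The 9 nt loop sequence, the UU overhang, the function names, and the example target site are assumptions made for illustration; the stem and loop length limits follow the 19 to 29 nt stem and 4 to 10 nt loop ranges given earlier in this section.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def build_shrna(target_site: str, loop: str = "UUCAAGAGA",
                overhang: str = "UU") -> str:
    """Return a hairpin RNA: sense stem + loop + antisense stem + 3' overhang.

    The default loop and overhang are illustrative assumptions only.
    """
    stem = target_site.upper().replace("T", "U")
    if set(stem) - set(RNA_COMPLEMENT):
        raise ValueError("target site contains non-ACGU characters")
    if not 19 <= len(stem) <= 29:
        raise ValueError("stem should be 19 to 29 nt long")
    if not 4 <= len(loop) <= 10:
        raise ValueError("loop should be 4 to 10 nt long")
    return stem + loop + reverse_complement_rna(stem) + overhang


def dna_template(rna: str) -> str:
    """DNA coding-strand sequence for a plasmid insert encoding the hairpin."""
    return rna.replace("U", "T")


# Hypothetical 19 nt target site, used only to exercise the functions.
hairpin = build_shrna("GCAUGCAUGGAUCCAUAGC")
print(hairpin)
print(dna_template(hairpin))
```

The sketch stops at the insert sequence itself; promoter, terminator, and cloning-site sequences depend on the chosen vector and are deliberately left out.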

C. Full and Partial Length Antisense RNA Transcript

[0394] Antisense RNA transcripts have a base sequence complementary to part or all of any other RNA transcript in the same cell. Such transcripts have been shown to modulate gene expression through a variety of mechanisms including the modulation of RNA splicing, the modulation of RNA transport, and the modulation of the translation of mRNA (Denhardt, Ann. N.Y. Acad. Sci. 660: 70, 1992; Nellen, Trends Biochem. Sci. 18: 419, 1993; Baker et al., Biochim. Biophys. Acta 1489: 3, 1999; Xu et al., Gene Therapy 7: 438, 2000; French et al., Curr. Opin. Microbiol. 3: 159, 2000; Terryn et al., Trends Plant Sci. 5: 1360, 2000).

D. Antisense RNA and DNA Oligonucleotides

[0395] Antisense nucleic acids are generally single-stranded nucleic acids (DNA, RNA, modified DNA, or modified RNA) complementary to a portion of a target nucleic acid (e.g., an mRNA transcript) and therefore able to bind to the target to form a duplex. Typically they are oligonucleotides that range from 15 to 35 nucleotides in length but may range from 10 up to approximately 50 nucleotides in length. Binding typically reduces or inhibits the function of the target nucleic acid. For example, antisense oligonucleotides may block transcription when bound to genomic DNA, inhibit translation when bound to mRNA, and/or lead to degradation of the nucleic acid. Reduction in expression of a Caliban polypeptide may be achieved by the administration of antisense nucleic acids or peptide nucleic acids comprising sequences complementary to those of the mRNA that encodes the polypeptide. Antisense technology and its applications are well known in the art and are described in Phillips, Antisense Technology, Methods Enzymol. 313 and 314, 2000, and references mentioned therein. (See also Crooke, "ANTISENSE DRUG TECHNOLOGY: PRINCIPLES, STRATEGIES, AND APPLICATIONS" 1; and references cited therein.)
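For illustration, an antisense DNA oligonucleotide complementary to a chosen portion of an mRNA can be derived as the reverse complement of that portion. The Python sketch below enforces the 10 to 50 nt length range stated above; the default length of 20 nt, the tiling step, the function names, and the example transcript are assumptions made for this sketch and do not correspond to any Caliban sequence.

```python
from typing import List, Tuple

# Watson-Crick pairing for DNA bases.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    if not 10 <= length <= 50:
        raise ValueError("length should be between 10 and 50 nt")
    target = mrna.upper().replace("U", "T")
    if start < 0 or start + length > len(target):
        raise ValueError("requested region falls outside the transcript")
    region = target[start:start + length]
    if set(region) - set(DNA_COMPLEMENT):
        raise ValueError("region contains non-ACGT/U characters")
    return "".join(DNA_COMPLEMENT[base] for base in reversed(region))


def tile_antisense(mrna: str, length: int = 20,
                   step: int = 10) -> List[Tuple[int, str]]:
    """Return (start, oligo) pairs for windows spaced `step` nt apart."""
    return [
        (start, antisense_oligo(mrna, start, length))
        for start in range(0, len(mrna) - length + 1, step)
    ]


# Hypothetical 32 nt transcript fragment, used only as an example input.
example_mrna = "AUGGCCAAGCUUGGAUCCGAAUUCGCGGCCGC"
for start, oligo in tile_antisense(example_mrna):
    print(start, oligo)
```

Tiling several overlapping oligonucleotides across a transcript is one common way to generate candidates for empirical testing; the sketch does not attempt to predict which candidates will be effective.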

[0396] Antisense oligonucleotides can be synthesized with a base sequence that is complementary to a portion of any RNA transcript in the cell. Antisense oligonucleotides may modulate gene expression through a variety of mechanisms including the modulation of RNA splicing, the modulation of RNA transport, and the modulation of the translation of mRNA (Denhardt, 1992). Various properties of antisense oligonucleotides including stability, toxicity, tissue distribution, and cellular uptake and binding affinity may be altered through chemical modifications including (i) replacement of the phosphodiester backbone (e.g., peptide nucleic acid, phosphorothioate oligonucleotides, and phosphoramidate oligonucleotides), (ii) modification of the sugar moiety (e.g., 2'-O-propylribose and 2'-methoxyethoxyribose), and (iii) modification of the nucleoside (e.g., C-5 propynyl U, C-5 thiazole U, and phenoxazine C) (Wagner, Nat. Medicine 1: 1116, 1995; Varga et al., Immun. 69: 217, 1999; Nielsen, Curr. Opin. Biotech. 10: 71, 1999; Woolf, Nucleic Acids Res. 18: 1763, 1990).

[0397] The invention provides a method of inhibiting expression of a gene encoding a Caliban protein comprising the steps of (i) providing a biological system in which expression of a gene encoding a Caliban protein is to be inhibited; and (ii) contacting the system with an antisense molecule that hybridizes to a transcript encoding the Caliban protein. According to certain embodiments of the invention the Caliban protein is encoded by a gene within or linked to a disease susceptibility locus, or within which a functional mutation causing or contributing to a disease or development of a disease (e.g., cancer or lung cancer) may exist. In other embodiments, Caliban proteins are inhibited. According to certain embodiments of the invention the biological system comprises a cell, and the contacting step comprises expressing the antisense molecule in the cell. According to certain embodiments of the invention the biological system comprises a subject, e.g., a mammalian subject such as a mouse or human, and the contacting step comprises administering the antisense molecule to the subject or comprises expressing the antisense molecule in the subject. The expression may be inducible and/or tissue or cell type-specific. The antisense molecule may be an oligonucleotide or a longer nucleic acid molecule. The invention provides such antisense molecules.

E. Ribozymes

[0398] Certain nucleic acid molecules referred to as ribozymes or deoxyribozymes have been shown to catalyze the sequence-specific cleavage of RNA molecules. The cleavage site is determined by complementary pairing of nucleotides in the RNA or DNA enzyme with nucleotides in the target RNA. Thus, RNA and DNA enzymes can be designed to cleave any RNA molecule, thereby increasing its rate of degradation (Cotten et al., EMBO J. 8: 3861-3866, 1989; Usman et al., Nucl. Acids Mol. Biol. 10: 243, 1996; Usman et al., Curr. Opin. Struct. Biol. 1: 527, 1996; Sun et al., Pharmacol. Rev. 52: 325, 2000).

[0399] The invention provides a method of inhibiting expression of a gene encoding a Caliban protein comprising the steps of (i) providing a biological system in which expression of a gene encoding a Caliban protein is to be inhibited; and (ii) contacting the system with a ribozyme that hybridizes to a transcript encoding the Caliban protein and directs cleavage of the transcript. According to certain embodiments of the invention the Caliban protein is encoded by a gene within or linked to a disease susceptibility locus, or within which a functional mutation causing or contributing to susceptibility or development of a disease (e.g., cancer or lung cancer) may exist. In other embodiments, Caliban proteins are inhibited. According to certain embodiments of the invention the biological system comprises a cell, and the contacting step comprises expressing the ribozyme in the cell. According to certain embodiments of the invention the biological system comprises a subject, e.g., a mammalian subject such as a mouse or human, and the contacting step comprises administering the ribozyme to the subject or comprises expressing the ribozyme in the subject. The expression may be inducible and/or tissue or cell-type specific according to certain embodiments of the invention. The invention provides ribozymes designed to cleave transcripts encoding Caliban proteins, or polymorphic variants thereof, as described above.

11. THERAPEUTIC APPLICATIONS

[0400] The compounds and modulators identified by the methods of the present invention can be used in a variety of methods of treatment. Thus, the present invention provides compositions and methods for treating Caliban-related diseases or disorders (e.g., cancer) or a disease or disorder associated with cancer.

[0401] Preferably, treatment using a polypeptide or polynucleotide of the present invention could either be by administering an effective amount of a polypeptide to the patient, or by removing cells from the patient, supplying the cells with a polynucleotide of the present invention, and returning the engineered cells to the patient (ex vivo therapy).

12. CELLULAR TRANSFECTION AND GENE THERAPY

[0402] The present invention provides the nucleic acids of Caliban for the transfection of cells in vitro and in vivo. These nucleic acids can be inserted into any of a number of well known vectors for the transfection of target cells and organisms as described below. The nucleic acids are transfected into cells, ex vivo or in vivo, through the interaction of the vector and the target cell. The nucleic acids encoding Caliban, under the control of a promoter, then express a Caliban protein of the present invention, thereby mitigating the effects of absent, partially inactivated, or abnormal expression of the Caliban gene.

[0403] Such gene therapy procedures have been used to correct acquired and inherited genetic defects, cancer, and viral infection in a number of contexts. The ability to express artificial genes in humans facilitates the prevention and/or cure of many important human diseases, including many diseases which are not amenable to treatment by other therapies (for a review of gene therapy procedures, see Anderson, Science 256: 808-813, 1992; Nabel et al., TIBTECH 11: 211-217, 1993; Mitani et al., TIBTECH 11: 162-166, 1993; Mulligan, Science 926-932, 1993; Dillon, TIBTECH 11: 167-175, 1993; Miller, Nature 357: 455-460, 1992; Van Brunt, Biotechnology 10: 1149-1154, 1988; Vigne, Restorative Neurology and Neuroscience 8: 35-36, 1995; Kremer et al., British Medical Bulletin 1: 31-44, 1995; Haddada et al., Current Topics in Microbiology and Immunology, 1995; and Yu et al., Gene Therapy 1: 13-26, 1994).

[0404] Delivery of the gene or genetic material into the cell is the first critical step in gene therapy treatment of disease. A large number of delivery methods are well known to those of skill in the art. Preferably, the nucleic acids are administered for in vivo or ex vivo gene therapy uses. Non-viral vector delivery systems include DNA plasmids, naked nucleic acid, and nucleic acid complexed with a delivery vehicle such as a liposome. Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell. For a review of gene therapy procedures, see Anderson, Science 256: 808-813, 1992; Nabel et al., TIBTECH 11: 211-217, 1993; Mitani et al., TIBTECH 11: 162-166, 1993; Dillon, TIBTECH 11: 167-175, 1993; Miller, Nature 357: 455-460, 1992; Van Brunt, Biotechnology 10: 1149-1154, 1988; Vigne, Restorative Neurology and Neuroscience 8: 35-36, 1995; Kremer et al., British Medical Bulletin 1: 31-44, 1995; Haddada et al., Current Topics in Microbiology and Immunology, Doerfler and Bohm (eds.), 1995; and Yu et al., Gene Therapy 1: 13-26, 1994.

[0405] Methods of non-viral delivery of nucleic acids include lipofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, artificial virions, and agent-enhanced uptake of DNA. Lipofection is described in, e.g., U.S. Pat. No. 5,049,386, U.S. Pat. No. 4,946,787, and U.S. Pat. No. 4,897,355, and lipofection reagents are sold commercially (e.g., Transfectam™ and Lipofectin™). Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424, WO 91/16024. Delivery can be to cells (ex vivo administration) or target tissues (in vivo administration).

[0406] The preparation of lipid:nucleic acid complexes, including targeted liposomes such as immunolipid complexes, is well known to one of skill in the art (see, e.g., Crystal, Science 270: 404-410, 1995; Blaese et al, Cancer Gene Ther. 2: 291-297, 1995; Behr et al, Bioconjugate Chem. 5: 382-389, 1994; Remy et al, Bioconjugate Chem. 5: 647-654, 1994; Gao et al, Gene Therapy 2: 710-722, 1995; Ahmad et al, Cancer Res. 52: 4817-4820, 1992; U.S. Pat. Nos. 4,186,183, 4,217,344, 4,235,871, 4,261,975, 4,485,054, 4,501,728, 4,774,085, 4,837,028, and 4,946,787).

[0407] The use of RNA or DNA viral based systems for the delivery of nucleic acids takes advantage of highly evolved processes for targeting a virus to specific cells in the body and trafficking the viral payload to the nucleus. Viral vectors can be administered directly to patients (in vivo), or they can be used to treat cells in vitro, with the modified cells then administered to patients (ex vivo). Conventional viral based systems for the delivery of nucleic acids could include retroviral, lentivirus, adenoviral, adeno-associated and herpes simplex virus vectors for gene transfer. Viral vectors are currently the most efficient and versatile method of gene transfer in target cells and tissues. Integration in the host genome is possible with the retrovirus, lentivirus, and adeno-associated virus gene transfer methods, often resulting in long term expression of the inserted transgene. Additionally, high transduction efficiencies have been observed in many different cell types and target tissues.

[0408] The tropism of a retrovirus can be altered by incorporating foreign envelope proteins, expanding the potential population of target cells. Lentiviral vectors are retroviral vectors that are able to transduce or infect non-dividing cells and typically produce high viral titers. Selection of a retroviral gene transfer system would therefore depend on the target tissue. Retroviral vectors are comprised of cis-acting long terminal repeats (LTRs) with packaging capacity for up to 6-10 kb of foreign sequence. The minimum cis-acting LTRs are sufficient for replication and packaging of the vectors, which are then used to integrate the therapeutic gene into the target cell to provide permanent transgene expression. Widely used retroviral vectors include those based upon murine leukemia virus (MuLV), gibbon ape leukemia virus (GaLV), simian immunodeficiency virus (SIV), human immunodeficiency virus (HIV), and combinations thereof (see, e.g., Buchscher et al., J. Virol. 66: 2731-2739, 1992; Johann et al., J. Virol. 66: 1635-1640, 1992; Sommerfelt et al., Virol. 176: 58-59, 1990; Wilson et al., J. Virol. 63: 2374-2378, 1989; Miller et al., J. Virol. 65: 2220-2224, 1991; PCT/US94/05700).

[0409] In applications where transient expression of the nucleic acid is preferred, adenoviral based systems are typically used. Adenoviral based vectors are capable of very high transduction efficiency in many cell types and do not require cell division. With such vectors, high titer and levels of expression have been obtained. This vector can be produced in large quantities in a relatively simple system. Adeno-associated virus ("AAV") vectors are also used to transduce cells with target nucleic acids, e.g., in the in vitro production of nucleic acids and peptides, and for in vivo and ex vivo gene therapy procedures (see, e.g., West et al., Virology 160: 38-47, 1987; U.S. Pat. No. 4,797,368; WO 93/24641; Kotin, Human Gene Therapy 5: 793-801, 1994; Muzyczka, J. Clin. Invest. 94: 1351, 1994). Construction of recombinant AAV vectors is described in a number of publications, including U.S. Pat. No. 5,173,414; Tratschin et al., Mol. Cell. Biol. 5: 3251-3260, 1985; Tratschin et al., Mol. Cell. Biol. 4: 2072-2081, 1984; Hermonat et al., Proc. Natl. Acad. Sci. U.S.A. 81: 6466-6470, 1984; and Samulski et al., J. Virol. 63: 3822-3828, 1989.

[0410] In particular, at least six viral vector approaches are currently available for gene transfer in clinical trials, with retroviral vectors by far the most frequently used system. All of these viral vectors utilize approaches that involve complementation of defective vectors by genes inserted into helper cell lines to generate the transducing agent.

[0411] pLASN and MFG-S are examples of retroviral vectors that have been used in clinical trials (Dunbar et al., Blood 85: 3048-305, 1995; Kohn et al., Nat. Med. 1: 1017-102, 1995; Malech et al., Proc. Natl. Acad. Sci. U.S.A. 94(22): 12133-12138, 1997). PA317/pLASN was the first therapeutic vector used in a gene therapy trial (Blaese et al., Science 270: 475-480, 1995). Transduction efficiencies of 50% or greater have been observed for MFG-S packaged vectors (Ellem et al., Immunol Immunother. 1: 10-20, 1997; Dranoff et al., Hum. Gene Ther. 1: 111-112, 1997).

[0412] Recombinant adeno-associated virus vectors (rAAV) are a promising alternative gene delivery system based on the defective and nonpathogenic parvovirus adeno-associated type 2 virus. All vectors are derived from a plasmid that retains only the AAV 145 bp inverted terminal repeats flanking the transgene expression cassette. Efficient gene transfer and stable transgene delivery due to integration into the genomes of the transduced cell are key features of this vector system (Wagner et al., Lancet 351: 9117, 1702-1703, 1998; Kearns et al., Gene Ther. 9: 748-55, 1996).

[0413] Replication-deficient recombinant adenoviral vectors (Ad) are predominantly used for transient expression gene therapy, because they can be produced at high titer and they readily infect a number of different cell types. Most adenovirus vectors are engineered such that a transgene replaces the Ad E1a, E1b, and E3 genes; subsequently the replication-defective vector is propagated in human 293 cells that supply the deleted gene function in trans. Ad vectors can transduce multiple types of tissues in vivo, including nondividing, differentiated cells such as those found in the liver, kidney and muscle system tissues. Conventional Ad vectors have a large carrying capacity. An example of the use of an Ad vector in a clinical trial involved polynucleotide therapy for antitumor immunization with intramuscular injection (Sterman et al., Hum. Gene Ther. 7: 1083-9, 1998). Additional examples of the use of adenovirus vectors for gene transfer in clinical trials include Rosenecker et al., Infection 24: 15-10, 1996; Sterman et al., Hum. Gene Ther. 9: 1083-1089, 1998; Welsh et al., Hum. Gene Ther. 2: 205-218, 1995; Alvarez et al., Hum. Gene Ther. 5: 597-613, 1997; Topf et al., Gene Ther. 5: 507-513, 1998; Sterman et al., Hum. Gene Ther. 7: 1083-1089, 1998.

[0414] Packaging cells are used to form virus particles that are capable of infecting a host cell. Such cells include 293 cells, which package adenovirus, and ψ2 cells or PA317 cells, which package retrovirus. Viral vectors used in gene therapy are usually generated by a producer cell line that packages a nucleic acid vector into a viral particle. The vectors typically contain the minimal viral sequences required for packaging and subsequent integration into a host, other viral sequences being replaced by an expression cassette for the protein to be expressed. The missing viral functions are supplied in trans by the packaging cell line. For example, AAV vectors used in gene therapy typically possess only the ITR sequences from the AAV genome, which are required for packaging and integration into the host genome. Viral DNA is packaged in a cell line that contains a helper plasmid encoding the other AAV genes, namely rep and cap, but lacking ITR sequences. The cell line is also infected with adenovirus as a helper. The helper virus promotes replication of the AAV vector and expression of AAV genes from the helper plasmid. The helper plasmid is not packaged in significant amounts due to a lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat treatment, to which adenovirus is more sensitive than AAV.

[0415] In many gene therapy applications, it is desirable that the gene therapy vector be delivered with a high degree of specificity to a particular tissue type. A viral vector is typically modified to have specificity for a given cell type by expressing a ligand as a fusion protein with a viral coat protein on the virus's outer surface. The ligand is chosen to have affinity for a receptor known to be present on the cell type of interest. For example, Han et al. (Proc. Natl. Acad. Sci. U.S.A. 92: 9747-9751, 1995) reported that Moloney murine leukemia virus can be modified to express human heregulin fused to gp70, and the recombinant virus infects certain human breast cancer cells expressing human epidermal growth factor receptor. This principle can be extended to other pairs of virus expressing a ligand fusion protein and target cell expressing a receptor. For example, filamentous phage can be engineered to display antibody fragments (e.g., Fab or Fv) having specific binding affinity for virtually any chosen cellular receptor. Although the above description applies primarily to viral vectors, the same principles can be applied to nonviral vectors. Such vectors can be engineered to contain specific uptake sequences thought to favor uptake by specific target cells.

[0416] Gene therapy vectors can be delivered in vivo by administration to an individual patient, typically by systemic administration (e.g., intravenous, intraperitoneal, intramuscular, subdermal, or intracranial infusion) or topical application, as described below. Alternatively, vectors can be delivered to cells ex vivo, such as cells explanted from an individual patient (e.g., lymphocytes, bone marrow aspirates, tissue biopsy) or universal donor hematopoietic stem cells, followed by reimplantation of the cells into a patient, usually after selection for cells which have incorporated the vector.

[0417] Ex vivo cell transfection for diagnostics, research, or for gene therapy (e.g., via re-infusion of the transfected cells into the host organism) is well known to those of skill in the art. In a preferred embodiment, cells are isolated from the subject organism, transfected with a nucleic acid (gene or cDNA), and re-infused into the subject organism (e.g., patient). Various cell types suitable for ex vivo transfection are well known to those of skill in the art (see, e.g., Freshney et al., Culture of Animal Cells, A Manual of Basic Technique, 3rd ed., 1994, and the references cited therein, for a discussion of how to isolate and culture cells from patients).

[0418] In one embodiment, stem cells are used in ex vivo procedures for cell transfection and gene therapy. The advantage to using stem cells is that they can be differentiated into other cell types in vitro, or can be introduced into a mammal (such as the donor of the cells) where they will engraft in the bone marrow. Methods for differentiating CD34+ cells in vitro into clinically important immune cell types using cytokines such as GM-CSF, IFN-γ and TNF-α are known (see Inaba et al., J. Exp. Med. 176: 1693-1702, 1992).

[0419] Stem cells are isolated for transduction and differentiation using known methods. For example, stem cells are isolated from bone marrow cells by panning the bone marrow cells with antibodies which bind unwanted cells, such as CD4+ and CD8+ (T cells), CD45+ (panB cells), GR-1 (granulocytes), and Iad (differentiated antigen presenting cells) (see Inaba et al., J. Exp. Med. 176: 1693-1702, 1992).

[0420] Vectors (e.g., retroviruses, adenoviruses, liposomes, and the like) containing therapeutic nucleic acids can also be administered directly to the organism for transduction of cells in vivo. Alternatively, naked DNA can be administered.

[0421] Administration is by any of the routes normally used for introducing a molecule into ultimate contact with blood or tissue cells, as described below. The nucleic acids are administered in any suitable manner, preferably with pharmaceutically acceptable carriers. Suitable methods of administering such nucleic acids are available and well known to those of skill in the art, and, although more than one route can be used to administer a particular composition, a particular route can often provide a more immediate and more effective reaction than another route (see Hermonat P.L. and Muzyczka N., Proc. Natl. Acad. Sci. U.S.A. 81: 6466-6470, 1984; and Samulski et al., J. Virol. 63: 3822-3828, 1989). In particular, at least six viral vector approaches are currently available for gene transfer in clinical trials, with retroviral vectors by far the most frequently used system. All of these viral vectors utilize approaches that involve complementation of defective vectors by genes inserted into helper cell lines to generate the transducing agent.

13. FORMULATION AND ADMINISTRATION OF PHARMACEUTICAL COMPOSITIONS

[0422] The invention provides pharmaceutical compositions comprising nucleic acids, peptides and polypeptides (including Abs) of the invention. As discussed above, the nucleic acids, peptides and polypeptides of the invention can be used to inhibit or activate expression of Caliban polypeptides. Such inhibition in a cell or a non-human animal can generate a screening modality for identifying compounds to treat or ameliorate a disease or disorder associated with cancer.

[0423] The nucleic acids, peptides and polypeptides of the invention can be combined with a pharmaceutically acceptable carrier (excipient) to form a pharmacological composition. Pharmaceutically acceptable carriers can contain a physiologically acceptable compound that acts to, e.g., stabilize, or increase or decrease the absorption or clearance rates of the pharmaceutical compositions of the invention. Physiologically acceptable compounds can include, e.g., carbohydrates, such as glucose, sucrose, or dextrans, antioxidants, such as ascorbic acid or glutathione, chelating agents, low molecular weight proteins, compositions that reduce the clearance or hydrolysis of the peptides or polypeptides, or excipients or other stabilizers and/or buffers. Detergents can also be used to stabilize or to increase or decrease the absorption of the pharmaceutical composition, including liposomal carriers. Pharmaceutically acceptable carriers and formulations for peptides and polypeptides are known to the skilled artisan and are described in detail in the scientific and patent literature, see e.g., the latest edition of Remington's Pharmaceutical Science, Mack Publishing Company, Easton, PA ("Remington's").

[0424] Other physiologically acceptable compounds include wetting agents, emulsifying agents, dispersing agents or preservatives which are particularly useful for preventing the growth or action of microorganisms. Various preservatives are well known and include, e.g., phenol and ascorbic acid. One skilled in the art would appreciate that the choice of a pharmaceutically acceptable carrier including a physiologically acceptable compound depends, for example, on the route of administration of the peptide or polypeptide of the invention and on its particular physicochemical characteristics.

[0425] In one aspect, nucleic acids, peptides or polypeptides of the invention are dissolved in a pharmaceutically acceptable carrier, e.g., an aqueous carrier if the composition is water-soluble. Examples of aqueous solutions that can be used in formulations for enteral, parenteral or transmucosal drug delivery include, e.g., water, saline, phosphate buffered saline, Hank's solution, Ringer's solution, dextrose/saline, glucose solutions and the like. The formulations can contain pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions, such as buffering agents, tonicity adjusting agents, wetting agents, detergents and the like. Additives can also include additional active ingredients such as bactericidal agents, or stabilizers. For example, the solution can contain sodium acetate, sodium lactate, sodium chloride, potassium chloride, calcium chloride, sorbitan monolaurate or triethanolamine oleate. These compositions can be sterilized by conventional, well-known sterilization techniques, or can be sterile filtered. The resulting aqueous solutions can be packaged for use as is, or lyophilized, the lyophilized preparation being combined with a sterile aqueous solution prior to administration. The concentration of peptide in these formulations can vary widely, and will be selected primarily based on fluid volumes, viscosities, body weight and the like in accordance with the particular mode of administration selected and the patient's needs.

[0426] Solid formulations can be used for enteral (oral) administration. They can be formulated as, e.g., pills, tablets, powders or capsules. For solid compositions, conventional nontoxic solid carriers can be used which include, e.g., pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharin, talcum, cellulose, glucose, sucrose, magnesium carbonate, and the like. For oral administration, a pharmaceutically acceptable nontoxic composition is formed by incorporating any of the normally employed excipients, such as those carriers previously listed, and generally 10% to 95% of active ingredient (e.g., peptide). A non-solid formulation can also be used for enteral administration. The carrier can be selected from various oils including those of petroleum, animal, vegetable or synthetic origin, e.g., peanut oil, soybean oil, mineral oil, sesame oil, and the like. Suitable pharmaceutical excipients include, e.g., starch, cellulose, talc, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica gel, magnesium stearate, sodium stearate, glycerol monostearate, sodium chloride, dried skim milk, glycerol, propylene glycol, water, and ethanol.

[0427] Nucleic acids, peptides or polypeptides of the invention, when administered orally, can be protected from digestion. This can be accomplished either by complexing the nucleic acid, peptide or polypeptide with a composition to render it resistant to acidic and enzymatic hydrolysis or by packaging the nucleic acid, peptide or polypeptide in an appropriately resistant carrier such as a liposome. Means of protecting compounds from digestion are well known in the art, (see, e.g., Fix, Pharm Res. 13: 1760-1764, 1996; Samanen, J. Pharm. Pharmacol. 48: 119-135, 1996; U.S. Pat. No. 5,391,377), describing lipid compositions for oral delivery of therapeutic agents (liposomal delivery is discussed in further detail, infra).

[0428] Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated can be used in the formulation. Such penetrants are generally known in the art, and include, e.g., for transmucosal administration, bile salts and fusidic acid derivatives. In addition, detergents can be used to facilitate permeation. Transmucosal administration can be through nasal sprays or using suppositories. (See, e.g., Sayani, Crit. Rev. Ther. Drug Carrier Syst. 13: 85-184, 1996.) For topical, transdermal administration, the agents are formulated into ointments, creams, salves, powders and gels. Transdermal delivery systems can also include, e.g., patches.

[0429] The nucleic acids, peptides or polypeptides of the invention can also be administered in sustained delivery or sustained release mechanisms, which can deliver the formulation internally. For example, biodegradable microspheres or capsules or other biodegradable polymer configurations capable of sustained delivery of a peptide can be included in the formulations of the invention. (See, e.g., Putney, Nat. Biotechnol. 16: 153-157, 1998).

[0430] For inhalation, the nucleic acids, peptides or polypeptides of the invention can be delivered using any system known in the art, including dry powder aerosols, liquid delivery systems, air jet nebulizers, propellant systems, and the like. See, e.g., Patton, Biotechniques 16: 141-143, 1998; product and inhalation delivery systems for polypeptide macromolecules by, e.g., Dura Pharmaceuticals (San Diego, Calif.), Aradigm (Hayward, Calif.), Aerogen (Santa Clara, Calif.), Inhale Therapeutic Systems (San Carlos, Calif.), and the like. For example, the pharmaceutical formulation can be administered in the form of an aerosol or mist. For aerosol administration, the formulation can be supplied in finely divided form along with a surfactant and propellant. In another aspect, the device for delivering the formulation to respiratory tissue is an inhaler in which the formulation vaporizes. Other liquid delivery systems include, e.g., air jet nebulizers.

[0431] In preparing pharmaceuticals of the present invention, a variety of formulation modifications can be used and manipulated to alter pharmacokinetics and biodistribution. A number of methods for altering pharmacokinetics and biodistribution are known to one of ordinary skill in the art. Examples of such methods include protection of the compositions of the invention in vesicles composed of substances such as proteins, lipids (for example, liposomes, see below), carbohydrates, or synthetic polymers (discussed above). For a general discussion of pharmacokinetics, see, e.g., Remington's, Chapters 37-39.

[0432] The nucleic acids, peptides or polypeptides of the invention can be delivered alone or as pharmaceutical compositions by any means known in the art, e.g., systemically, regionally, or locally (e.g., directly into, or directed to, a tumor); by intraarterial, intrathecal (IT), intravenous (IV), parenteral, intra-pleural cavity, topical, oral, or local administration, as subcutaneous, intra-tracheal (e.g., by aerosol) or transmucosal (e.g., buccal, bladder, vaginal, uterine, rectal, nasal mucosa). Actual methods for preparing administrable compositions will be known or apparent to those skilled in the art and are described in detail in the scientific and patent literature, see e.g., Remington's. For a "regional effect," e.g., to focus on a specific organ, one mode of administration includes intra-arterial or intrathecal (IT) injections, e.g., for the brain and CNS. (See e.g., Gurun, Anesth Analg. 85: 317-323, 1997). For example, intra-carotid artery injection is preferred where it is desired to deliver a nucleic acid, peptide or polypeptide of the invention directly to the brain. Parenteral administration is a preferred route of delivery if a high systemic dosage is needed. Actual methods for preparing parenterally administrable compositions will be known or apparent to those skilled in the art and are described in detail in, e.g., Remington's. (See also, Bai, J. Neuroimmunol. 80: 65-75, 1997; Warren, J. Neurol. Sci. 152: 31-38, 1997; Tonegawa, J. Exp. Med. 186: 507-515, 1997).

[0433] In one aspect, the pharmaceutical formulations comprising nucleic acids, peptides or polypeptides of the invention are incorporated in lipid monolayers or bilayers, e.g., liposomes (see, e.g., U.S. Pat. Nos. 6,110,490; 6,096,716; 5,283,185; 5,279,833). The invention also provides formulations in which water soluble nucleic acids, peptides or polypeptides of the invention have been attached to the surface of the monolayer or bilayer. For example, peptides can be attached to hydrazide-PEG-(distearoylphosphatidyl) ethanolamine-containing liposomes. (See, e.g., Zalipsky, Bioconjug. Chem. 6: 705-708, 1995). Liposomes or any form of lipid membrane, such as planar lipid membranes or the cell membrane of an intact cell, e.g., a red blood cell, can be used. Liposomal formulations can be administered by any means, including intravenously, transdermally (see, e.g., Vutla, J. Pharm. Sci. 85: 5-8, 1996), transmucosally, or orally. The invention also provides pharmaceutical preparations in which the nucleic acid, peptides and/or polypeptides of the invention are incorporated within micelles and/or liposomes. (See, e.g., Suntres, J. Pharm. Pharmacol. 46: 23-28, 1994; Woodle, Pharm. Res. 9: 260-265, 1992). Liposomes and liposomal formulations can be prepared according to standard methods and are also well known in the art. (See, e.g., Remington's; Akimaru, Cytokines Mol. Ther. 1: 197-210, 1995; Alving, Immunol. Rev. 145: 5-31, 1995; Szoka, Ann. Rev. Biophys. Bioeng. 9: 467, 1980; U.S. Pat. Nos. 4,235,871, 4,501,728 and 4,837,028.)

[0434] The pharmaceutical compositions are generally formulated as sterile, substantially isotonic and in full compliance with all Good Manufacturing Practice (GMP) regulations of the U.S. Food and Drug Administration.

14. TREATMENT REGIMENS AND PHARMACOKINETICS

[0435] The pharmaceutical compositions of the invention can be administered in a variety of unit dosage forms depending upon the method of administration. Dosages for typical nucleic acid, peptide and polypeptide pharmaceutical compositions are well known to those of skill in the art. Such dosages are typically advisorial in nature and are adjusted depending on the particular therapeutic context, patient tolerance, and the like. The amount of nucleic acid, peptide or polypeptide adequate to accomplish this is defined as a "therapeutically effective dose." The dosage schedule and amounts effective for this use, i.e., the "dosing regimen," will depend upon a variety of factors, including the stage of the disease or condition, the severity of the disease or condition, the general state of the patient's health, the patient's physical status, age, pharmaceutical formulation and concentration of active agent, and the like. In calculating the dosage regimen for a patient, the mode of administration also is taken into consideration. The dosage regimen must also take into consideration the pharmacokinetics, i.e., the pharmaceutical composition's rate of absorption, bioavailability, metabolism, clearance, and the like. (See, e.g., the latest Remington's; Egleton, Peptides 18: 1431-1439, 1997; Langer, Science 249: 1527-1533, 1990).

[0436] In therapeutic applications, compositions are administered to a patient suffering from a disease (e.g., cancer) or a disease or disorder associated with cancer to at least partially arrest the condition or a disease and/or its complications. For example, in one aspect, a soluble peptide pharmaceutical composition dosage for intravenous (IV) administration would be about 0.01 mg/hr to about 1.0 mg/hr administered over several hours (typically 1, 3, or 6 hours), which can be repeated for weeks with intermittent cycles. Considerably higher dosages (e.g., ranging up to about 10 mg/ml) can be used, particularly when the drug is administered to a secluded site and not into the blood stream, such as into a body cavity or into a lumen of an organ, e.g., the cerebrospinal fluid (CSF).
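The arithmetic implied by the exemplary IV rates and durations above can be sketched as follows. This is an illustrative calculation only, assuming a constant infusion rate over the whole period; the function name `total_infused_mg` and the tabulation are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch: total mass delivered by a constant-rate IV infusion.
# Assumption (not from the specification): the rate is held constant for
# the whole infusion, so total = rate x duration.

def total_infused_mg(rate_mg_per_hr: float, hours: float) -> float:
    """Return the total mass (mg) delivered at a constant rate."""
    if rate_mg_per_hr < 0 or hours < 0:
        raise ValueError("rate and duration must be non-negative")
    return rate_mg_per_hr * hours

# Exemplary endpoints of the stated rate range and the typical durations.
RATES_MG_PER_HR = (0.01, 1.0)
DURATIONS_HR = (1, 3, 6)

if __name__ == "__main__":
    for rate in RATES_MG_PER_HR:
        for hours in DURATIONS_HR:
            total = total_infused_mg(rate, hours)
            print(f"{rate} mg/hr for {hours} hr -> {total:.2f} mg")
```

Under these assumptions, a single infusion at the stated rates would deliver between roughly 0.01 mg and 6 mg of peptide.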

[0437] The invention provides pharmaceutical compositions comprising one or a combination of antibodies, e.g., antibodies to Caliban gene products (monoclonal, polyclonal or single chain Fv; intact or binding fragments thereof) or nucleic acid compositions, e.g., antisense oligonucleotides, double stranded RNA oligonucleotides (RNAi) or DNA oligonucleotides (vectors) containing nucleotide sequences encoding for the transcription of shRNA molecules, formulated together with a pharmaceutically acceptable carrier. Some compositions include a combination of multiple (e.g., two or more) monoclonal antibodies or antigen-binding portions thereof of the invention. In some compositions, each of the antibodies or antigen-binding portions thereof of the composition is a monoclonal antibody or a human sequence antibody that binds to a distinct, pre-selected epitope of an antigen.

[0438] In prophylactic applications, pharmaceutical compositions or medicaments are administered to a patient susceptible to, or otherwise at risk of, a disease or condition (e.g., a cancer or a disease or disorder related to cancer) in an amount sufficient to eliminate or reduce the risk, lessen the severity, or delay the outset of the disease, including biochemical, histologic and/or behavioral symptoms of the disease, its complications and intermediate pathological phenotypes presenting during development of the disease. In therapeutic applications, compositions or medicaments are administered to a patient suspected of, or already suffering from, such a disease in an amount sufficient to cure, or at least partially arrest, the symptoms of the disease (biochemical, histologic and/or behavioral), including its complications and intermediate pathological phenotypes in development of the disease. An amount adequate to accomplish therapeutic or prophylactic treatment is defined as a therapeutically- or prophylactically-effective dose. In both prophylactic and therapeutic regimes, agents are usually administered in several dosages until a sufficient immune response has been achieved. Typically, any response is monitored and repeated dosages are given if the response starts to wane.

15. EFFECTIVE DOSAGES

[0439] Effective doses of the antibody compositions of the present invention, e.g., antibodies to Caliban gene products (e.g., Caliban proteins), or nucleic acid compositions, e.g., antisense oligonucleotides, double stranded RNA oligonucleotides (RNAi), or DNA oligonucleotides (vectors) containing nucleotide sequences encoding for the transcription of shRNA molecules, for the treatment of diseases or disorders (e.g., cancer) or diseases or disorders associated with cancer described herein vary depending upon many different factors, including means of administration, target site, physiological state of the patient, whether the patient is human or an animal, other medications administered, and whether treatment is prophylactic or therapeutic. Usually, the patient is a human but nonhuman mammals including transgenic mammals can also be treated. Treatment dosages need to be titrated to optimize safety and efficacy.

[0440] For administration with an antibody or nucleic acid composition, the dosage ranges from about 0.0001 to 100 mg/kg, and more usually 0.01 to 5 mg/kg, of the host body weight. For example, dosages can be 1 mg/kg body weight or 10 mg/kg body weight or within the range of 1-10 mg/kg. An exemplary treatment regime entails administration once per every two weeks or once a month or once every 3 to 6 months. In some methods, two or more monoclonal antibodies with different binding specificities are administered simultaneously, in which case the dosage of each antibody administered falls within the ranges indicated. Antibody is usually administered on multiple occasions. Intervals between single dosages can be weekly, monthly or yearly. Intervals can also be irregular as indicated by measuring blood levels of antibody in the patient. In some methods, dosage is adjusted to achieve a plasma antibody concentration of 1-1000 μg/ml and in some methods 25-300 μg/ml. Alternatively, antibody can be administered as a sustained release formulation, in which case less frequent administration is required. Dosage and frequency vary depending on the half-life of the antibody in the patient. In general, human antibodies show the longest half life, followed by humanized antibodies, chimeric antibodies, and nonhuman antibodies. The dosage and frequency of administration can vary depending on whether the treatment is prophylactic or therapeutic. In prophylactic applications, a relatively low dosage is administered at relatively infrequent intervals over a long period of time. Some patients continue to receive treatment for the rest of their lives. In therapeutic applications, a relatively high dosage at relatively short intervals is sometimes required until progression of the disease is reduced or terminated, and preferably until the patient shows partial or complete amelioration of symptoms of disease. Thereafter, the patient can be administered a prophylactic regime.
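As a worked illustration of the weight-based ranges in the preceding paragraph, the following minimal sketch converts a mg/kg dosage into an absolute amount and checks it against the stated ranges. The function and constant names are hypothetical, the 70 kg body weight in the usage note is an assumed example value, and the sketch is not a dosing tool.

```python
# Illustrative sketch of weight-based dose arithmetic.
# The two ranges are the ones recited in the text; all names are hypothetical.

BROAD_RANGE_MG_PER_KG = (0.0001, 100.0)   # "about 0.0001 to 100 mg/kg"
USUAL_RANGE_MG_PER_KG = (0.01, 5.0)       # "more usually 0.01 to 5 mg/kg"

def in_range(value: float, bounds: tuple[float, float]) -> bool:
    """True if value lies within the inclusive bounds."""
    low, high = bounds
    return low <= value <= high

def dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Absolute dose (mg) for a dosage in mg/kg and a body weight in kg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not in_range(dose_mg_per_kg, BROAD_RANGE_MG_PER_KG):
        raise ValueError("dosage lies outside the broad range recited in the text")
    return dose_mg_per_kg * body_weight_kg

if __name__ == "__main__":
    weight_kg = 70.0  # assumed example body weight, not from the text
    for per_kg in (1.0, 10.0):
        usual = in_range(per_kg, USUAL_RANGE_MG_PER_KG)
        print(f"{per_kg} mg/kg x {weight_kg} kg = {dose_mg(per_kg, weight_kg)} mg "
              f"(within usual range: {usual})")
```

Note that the exemplary 10 mg/kg dosage falls inside the broad range but above the "more usual" upper bound of 5 mg/kg, which the second check makes explicit.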

[0441] Doses for nucleic acids range from about 10 ng to 1 g, 100 ng to 100 mg, 1 μg to 10 mg, or 30-300 μg DNA per patient. Doses for infectious viral vectors vary from 10-100, or more, virions per dose.

16. ROUTES OF ADMINISTRATION

[0442] Antibody compositions for inducing an immune response, e.g., antibodies to Caliban gene products (e.g., Caliban proteins), or nucleic acid compositions, e.g., antisense oligonucleotides, double stranded RNA oligonucleotides (RNAi), or DNA oligonucleotides (vectors) containing nucleotide sequences encoding for the transcription of shRNA molecules, for the treatment of diseases or disorders (e.g., cancer) or diseases or disorders associated with cancer described herein, can be administered by parenteral, topical, intravenous, oral, subcutaneous, intraarterial, intracranial, intraperitoneal, intranasal or intramuscular means, or as inhalants for antibody preparations, for prophylactic and/or therapeutic treatment. The most typical route of administration of an immunogenic agent is subcutaneous although other routes can be equally effective. The next most common route is intramuscular injection. This type of injection is most typically performed in the arm or leg muscles. In some methods, agents are injected directly into a particular tissue, for example intracranial injection or convection enhanced delivery. Intramuscular injection or intravenous infusion are preferred for administration of antibody. In some methods, particular therapeutic antibodies are delivered directly into the cranium. In some methods, antibodies are administered as a sustained release composition or device, such as a Medipad™ device.

[0443] Agents of the invention can optionally be administered in combination with other agents that are at least partly effective in treating various diseases or disorders (e.g., cancer) or diseases or disorders associated with cancer. In the case of targets in the brain, agents of the invention can also be administered in conjunction with other agents that increase passage of the agents of the invention across the blood-brain barrier (BBB).

17. FORMULATION

[0444] Antibody compositions for inducing an immune response, e.g., antibodies to Caliban gene products (e.g., Caliban proteins), or nucleic acid compositions, e.g., antisense oligonucleotides, double stranded RNA oligonucleotides (RNAi), or DNA oligonucleotides (vectors) containing nucleotide sequences encoding for the transcription of shRNA molecules, for the treatment of diseases or disorders (e.g., cancer) or diseases or disorders associated with cancer described herein, are often administered as pharmaceutical compositions comprising an active therapeutic agent and a variety of other pharmaceutically acceptable components. See the most recent edition of Remington's Pharmaceutical Science (e.g., 20th ed., Mack Publishing Company, Easton, PA, 2000). The preferred form depends on the intended mode of administration and therapeutic application. The compositions can also include, depending on the formulation desired, pharmaceutically-acceptable, non-toxic carriers or diluents, which are defined as vehicles commonly used to formulate pharmaceutical compositions for animal or human administration. The diluent is selected so as not to affect the biological activity of the combination. Examples of such diluents are distilled water, physiological phosphate-buffered saline, Ringer's solutions, dextrose solution, and Hank's solution. In addition, the pharmaceutical composition or formulation may also include other carriers, adjuvants, or nontoxic, nontherapeutic, nonimmunogenic stabilizers and the like.

[0445] Pharmaceutical compositions can also include large, slowly metabolized macromolecules such as proteins, polysaccharides such as chitosan, polylactic acids, polyglycolic acids and copolymers (such as latex functionalized Sepharose™, agarose, cellulose, and the like), polymeric amino acids, amino acid copolymers, and lipid aggregates (such as oil droplets or liposomes). Additionally, these carriers can function as immunostimulating agents (i.e., adjuvants).

[0446] For parenteral administration, compositions of the invention can be administered as injectable dosages of a solution or suspension of the substance in a physiologically acceptable diluent with a pharmaceutical carrier that can be a sterile liquid such as water, oils, saline, glycerol, or ethanol. Additionally, auxiliary substances, such as wetting or emulsifying agents, surfactants, pH buffering substances and the like can be present in compositions. Other components of pharmaceutical compositions are those of petroleum, animal, vegetable, or synthetic origin, for example, peanut oil, soybean oil, and mineral oil. In general, glycols such as propylene glycol or polyethylene glycol are preferred liquid carriers, particularly for injectable solutions. Antibodies can be administered in the form of a depot injection or implant preparation which can be formulated in such a manner as to permit a sustained release of the active ingredient. An exemplary composition comprises monoclonal antibody at 5 mg/mL, formulated in aqueous buffer consisting of 50 mM L-histidine, 150 mM NaCl, adjusted to pH 6.0 with HCl.

[0447] Typically, compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection can also be prepared. The preparation also can be emulsified or encapsulated in liposomes or micro particles such as polylactide, polyglycolide, or copolymer for enhanced adjuvant effect, as discussed above. (Langer, Science 249: 1527, 1990; Hanes, Advanced Drug Delivery Reviews 28: 97-119, 1997). The agents of this invention can be administered in the form of a depot injection or implant preparation which can be formulated in such a manner as to permit a sustained or pulsatile release of the active ingredient.

[0448] Additional formulations suitable for other modes of administration include oral, intranasal, and pulmonary formulations, suppositories, and transdermal applications.

[0449] For suppositories, binders and carriers include, for example, polyalkylene glycols or triglycerides; such suppositories can be formed from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1%-2%. Oral formulations include excipients, such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, and magnesium carbonate. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders and contain 10%-95% of active ingredient, preferably 25%-70%.

[0450] Topical application can result in transdermal or intradermal delivery. Topical administration can be facilitated by co-administration of the agent with cholera toxin or detoxified derivatives or subunits thereof or other similar bacterial toxins. (Glenn et al., Nature 391: 851, 1998). Co-administration can be achieved by using the components as a mixture or as linked molecules obtained by chemical crosslinking or expression as a fusion protein.

[0451] Alternatively, transdermal delivery can be achieved using a skin patch or using transferosomes. (Paul et al., Eur. J. Immunol. 25: 3521-24, 1995; Cevc et al., Biochem. Biophys. Acta 1368: 201-15, 1998).

[0452] The pharmaceutical compositions are generally formulated as sterile, substantially isotonic and in full compliance with all Good Manufacturing Practice (GMP) regulations of the U.S. Food and Drug Administration.

18. TOXICITY

[0453] Preferably, a therapeutically effective dose of the antibody compositions or nucleic acid compositions, e.g., antisense oligonucleotides, double stranded RNA oligonucleotides (RNAi), or DNA oligonucleotides (vectors) containing nucleotide sequences encoding for the transcription of shRNA molecules, described herein will provide therapeutic benefit without causing substantial toxicity.

[0454] Toxicity of the proteins described herein can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., by determining the LD50 (the dose lethal to 50% of the population) or the LD100 (the dose lethal to 100% of the population). The dose ratio between toxic and therapeutic effect is the therapeutic index. The data obtained from these cell culture assays and animal studies can be used in formulating a dosage range that is not toxic for use in humans. The dosage of the proteins described herein lies preferably within a range of circulating concentrations that include the effective dose with little or no toxicity. The dosage can vary within this range depending upon the dosage form employed and the route of administration utilized. The exact formulation, route of administration and dosage can be chosen by the individual physician in view of the patient's condition. (See, e.g., Fingl et al., In: The Pharmacological Basis of Therapeutics, 1975).
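
By way of illustration only, the dose ratio described above can be sketched in a few lines of Python. The sketch assumes the common convention of dividing the LD50 by a median effective dose (ED50); the ED50, the function name, and the numeric values are assumptions introduced for the example and are not data from this disclosure.

```python
def therapeutic_index(toxic_dose: float, effective_dose: float) -> float:
    """Return the ratio of a toxic dose to a therapeutically effective dose.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if toxic_dose <= 0 or effective_dose <= 0:
        raise ValueError("doses must be positive")
    return toxic_dose / effective_dose


# Hypothetical values for illustration only (mg/kg); not measured data.
ld50 = 400.0  # dose lethal to 50% of the population
ed50 = 20.0   # assumed dose effective in 50% of the population
index = therapeutic_index(ld50, ed50)
print(f"therapeutic index = {index:.1f}")
```

A larger ratio indicates a wider margin between the effective dose and the toxic dose.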

19. DIAGNOSTIC METHODS

A. Diagnosis Of Caliban-Related Diseases Or Disorders Or Caliban-Related Diseases Or Disorder Susceptibility

[0455] The invention provides a variety of methods for the diagnosis of a disease or disorder susceptibility (e.g., cancer or lung cancer) or disease or disorder susceptibility related to cancer. In particular, the invention provides a method for the diagnosis of a Caliban-related disease or disorder susceptibility comprising: (i) providing a sample obtained from a subject to be tested for a disease or disorder susceptibility; and (ii) detecting a polymorphic variant of a polymorphism in a coding or noncoding portion of a Caliban gene or fragment thereof, or detecting a polymorphic variant of a polymorphism in a genomic region linked to such a gene, in the sample. It is to be understood that "susceptibility to a Caliban disease or disorder" does not necessarily mean that the subject will develop a Caliban-related disease or disorder but rather that the subject is, in a statistical sense, more likely to develop Caliban-related disease or disorder than an average member of the population. As used herein, "susceptibility to a Caliban-related disease or disorder" can exist if the subject has one or more genetic determinants (e.g., polymorphic variants or alleles) that can, either alone or in combination with one or more other genetic determinants, contribute to an increased risk of developing a Caliban-related disease or disorder in some or all subjects. Ascertaining whether the subject has any such genetic determinants (i.e., genetic determinants that can increase the risk of developing a Caliban-related disease or disorder in the appropriate genetic background) is included in the concept of diagnosing susceptibility to a Caliban-related disease or disorder as used herein. Such determination is useful, for example, for purposes of genetic counseling. Thus providing diagnostic information regarding Caliban-related disease or disorder susceptibility includes providing information useful in genetic counseling, and the provision of such information is encompassed by the invention.

[0456] The sample itself will typically consist of cells (e.g., blood or brain cells), tissue, and the like, removed from the subject. The subject can be an adult, child, fetus, or embryo. According to certain embodiments of the invention the sample is obtained prenatally, either from the fetus or embryo or from the mother (e.g., from fetal or embryonic cells that enter the maternal circulation). The sample can be further processed before the detecting step. For example, DNA in the cell or tissue sample can be separated from other components of the sample, can be amplified, and the like. All samples obtained from a subject, including those subjected to any sort of further processing, are considered to be obtained from the subject.

[0457] In general, if the polymorphism is located in a gene, it can be located in a noncoding or coding region of the gene. If located in a coding region the polymorphism can, but frequently will not, result in an amino acid alteration. Such alteration may or may not have an effect on the function or activity of the encoded polypeptide. If the polymorphism is linked to, but not located within, a gene, it is preferred that the polymorphism is closely linked to the gene. For example, it is preferred that the recombination frequency between the polymorphism and the gene is less than approximately 20%, preferably less than approximately 10%, less than approximately 5%, less than approximately 1%, or still less.

[0458] According to certain preferred embodiments of any of the inventive methods described above, the gene can be coincident with a mapped or identified Caliban-related disease or disorder susceptibility locus or a related susceptibility locus. For example, according to various embodiments of the invention the gene can encode any of the molecules listed as shown herein. In a particular embodiment of the invention, discussed further below, the preferred genes encode the Caliban gene family (see the Figures). The inventive methods also encompass genes coincident with Caliban-related disease or disorder susceptibility loci that have yet to be mapped or identified. By "coincident with" is meant either that the gene or a portion thereof falls within the identified chromosomal location or is located in close proximity to that location. In general, the resolution of studies identifying genetic susceptibility loci can be on the order of tens of centimorgans. According to certain embodiments of the invention "close proximity" refers to within 20 centimorgans of either side of the susceptibility locus, more preferably within 10 centimorgans of either side of the susceptibility locus, yet more preferably within 5 centimorgans of either side of the susceptibility locus. In general, susceptibility loci are designated by the chromosomal band positions that they span (e.g., 8p21 refers to chromosome 8, arm p, band 21; 8p20-21 refers to chromosome 8, arm p, bands 20-21 inclusive) and can be defined at higher resolution (e.g., 8p21.1). In general, the terms "coincident with" and "close proximity" can be interpreted in light of the knowledge of one of ordinary skill in the art.

B. Methods And Reagents For Identification And Detection Of Polymorphisms

[0459] In general, polymorphisms of use in the practice of the invention can be initially identified using any of a number of methods well known in the art. For example, numerous polymorphisms are known to exist and are available in public databases, which can be searched as described herein. Alternately, polymorphisms can be identified by sequencing either genomic DNA or cDNA in the region in which it is desired to find a polymorphism. According to one approach, primers are designed to amplify such a region, and DNA from a subject suffering from a Caliban-related disease or disorder is obtained and amplified. The DNA is sequenced, and the sequence (referred to as a "subject sequence") is compared with a reference sequence, which is typically taken to represent the "normal" or "wild type" sequence. Such a sequence can be, for example, the human draft genome sequence, publicly available in various databases, or a sequence deposited in a database such as GenBank. In general, if sequencing reveals a difference between the sequenced region and the reference sequence, a polymorphism has been identified. Note that this analysis does not necessarily presuppose that either the subject sequence or the reference sequence is the "normal", most common, or wild type sequence. It is the fact that a difference in nucleotide sequence is identified at a particular site that determines that a polymorphism exists at that site. In most instances, particularly in the case of SNPs, only two polymorphic variants will exist at any location. However, in the case of SNPs, up to four variants can exist since there are four naturally occurring nucleotides in DNA. Other polymorphisms such as insertions can have more than four alleles.
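
The comparison of a subject sequence with a reference sequence can be sketched as follows. This is a minimal illustration that assumes the two sequences are already aligned, are of equal length, and differ only by substitutions; the function name and the example sequences are hypothetical and do not correspond to any Caliban sequence.

```python
def find_differences(reference: str, subject: str):
    """List (position, reference_base, subject_base) for each mismatch.

    Positions are 0-based. The sequences are assumed to be pre-aligned
    and of equal length (substitutions only, no insertions or deletions).
    """
    if len(reference) != len(subject):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), subject.upper())
    return [(i, r, s) for i, (r, s) in enumerate(pairs) if r != s]


# Hypothetical sequences for illustration only.
reference = "ATGGCCTTAGGC"
subject = "ATGGCGTTAGGC"
for position, ref_base, sub_base in find_differences(reference, subject):
    print(f"candidate polymorphic site at {position}: {ref_base} -> {sub_base}")
```

Each reported position is only a candidate polymorphic site; handling insertions and deletions would require a true alignment step, which this sketch omits.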

[0460] Once a polymorphic site is identified, any of a variety of methods can be employed to detect the existence of any particular polymorphic variant in a subject. In general, a subject can have either the reference sequence or an alternate sequence at the site. The phrase "detecting a polymorphism" or "detecting a polymorphic variant" as used herein generally refers to determining which of two or more polymorphic variants exists at a polymorphic site, although "detecting a polymorphism" can also refer to the process of initially determining that a polymorphic site exists in a population. The meaning to be given to these phrases will be clear from the context as interpreted in light of the knowledge of one of ordinary skill in the art. For purposes of description, if a subject has any sequence other than a defined reference sequence (e.g., the sequence present in the human draft genome) at a polymorphic site, the subject can be said to exhibit the polymorphism. In general, for a given polymorphism, any individual will exhibit either one or two possible variants at the polymorphic site (one on each chromosome). (This may not be the case, however, if the individual exhibits one or more chromosomal abnormalities such as deletions.)

[0461] Detection of a polymorphism or polymorphic variant in a subject (genotyping) can be performed by sequencing, similarly to the manner in which the existence of a polymorphism is initially established as described above. However, once the existence of a polymorphism is established a variety of more efficient methods can be employed. Many such methods are based on the design of oligonucleotide probes or primers that facilitate distinguishing between two or more polymorphic variants.

[0462] "Probes" or "primers", as used herein, typically refers to oligonucleotides that hybridize in a base-specific manner to a complementary nucleic acid molecule as described herein. Such probes and primers include polypeptide nucleic acids, as described in (Nielsen et al., Science 254: 1497-1500, 1991). The term "primer" in particular generally refers to a single-stranded oligonucleotide that can act as a point of initiation of template-directed DNA synthesis using methods such as PCR (polymerase chain reaction), LCR (ligase chain reaction), and the like. Typically, a probe or primer will comprise a region of nucleotide sequence that hybridizes to at least about 8, more often at least about 10 to 15, typically about 20-25, and frequently about 40, 50 or 75, consecutive nucleotides of a nucleic acid molecule. In certain embodiments of the invention, a probe or primer comprises 100 or fewer nucleotides, preferably from 6 to 50 nucleotides, preferably from 12 to 30 nucleotides. In certain embodiments of the invention, the probe or primer is at least 70% identical to the contiguous nucleotide sequence or to the complement of the contiguous nucleotide sequence, preferably at least 80% identical, more preferably at least 90% identical, even more preferably at least 95% identical, or having an even higher degree of identity. In certain embodiments of the invention a preferred probe or primer is capable of selectively hybridizing to a target contiguous nucleotide sequence or to the complement of the contiguous nucleotide sequence. According to certain embodiments of the invention a probe or primer further comprises a label, for example by incorporating a radioisotope, fluorescent compound, enzyme, or enzyme co-factor.

[0463] Oligonucleotides that exhibit differential or selective binding to polymorphic sites can readily be designed by one of ordinary skill in the art. For example, an oligonucleotide that is perfectly complementary to a sequence that encompasses a polymorphic site (i.e., a sequence that includes the polymorphic site within it or at one or the other end) will generally hybridize preferentially to a nucleic acid comprising that sequence as opposed to a nucleic acid comprising an alternate polymorphic variant.
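
The idea of an oligonucleotide that is perfectly complementary to one polymorphic variant, and therefore mismatched against the alternate variant, can be illustrated with a short sketch. It only counts base mismatches and does not model hybridization thermodynamics; the sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def mismatches(probe: str, target: str) -> int:
    """Count mismatched positions when probe anneals to target.

    A probe that is perfectly complementary to the target gives 0.
    """
    expected = reverse_complement(target)
    if len(probe) != len(expected):
        raise ValueError("probe and target must be the same length")
    return sum(p != e for p, e in zip(probe.upper(), expected))


# Hypothetical variants differing at a single site (index 7: G versus A).
variant_1 = "GATTACAGCATCGAT"
variant_2 = "GATTACAACATCGAT"
probe_1 = reverse_complement(variant_1)  # designed against variant 1

print(mismatches(probe_1, variant_1))  # perfect match
print(mismatches(probe_1, variant_2))  # single mismatch at the polymorphic site
```

In practice, whether a single mismatch is enough to discriminate the variants depends on probe length, position of the site within the probe, and hybridization stringency, none of which this sketch attempts to predict.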

[0464] In order to detect polymorphisms and/or polymorphic variants, it will frequently be desirable to amplify a portion of DNA encompassing the polymorphic site. Such regions can be amplified and isolated by PCR using oligonucleotide primers designed based on genomic and/or cDNA sequences that flank the site. (See e.g., Dieffenbach et al, PCR Primer: A Laboratory Manual, 2000; M. J. McPherson, et al; Mattila et al, Nucleic Acids Res. 19: 4967, 1991; Eckert et al, PCR Methods and Applications 1: 17, 1991; PCR eds. McPherson et al, IRL Press, Oxford; and U.S. Pat. No. 4,683,202). Other amplification methods that can be employed include the ligase chain reaction (LCR) (Wu et al, Genomics 4: 560, 1989, Landegren et al, Science 241: 1077, 1988), transcription amplification (Kwoh et al, Proc. Natl. Acad. Sci. U.S.A. 86: 1173, 1989), self-sustained sequence replication (Guatelli et al, Proc. Natl. Acad. Sci. 87: 1874, 1990), and nucleic acid based sequence amplification (NASBA). Guidelines for selecting primers for PCR amplification are well known in the art. (See, e.g., McPherson, M., et al, PCR 2000, cited supra). A variety of computer programs for designing primers are available, e.g., "Oligo" (National Biosciences, Inc, Plymouth MN), MacVector (Kodak/IBI), and the GCG suite of sequence analysis programs (Genetics Computer Group, Madison, WI 53711).

[0465] According to certain methods for diagnosing a Caliban-related disease or disorder or susceptibility to Caliban-related disease or disorder, hybridization methods, such as Southern analysis, Northern analysis, or in situ hybridizations, can be used (see Ausubel et al, supra). For example, a sample (e.g., a sample comprising genomic DNA, RNA, or cDNA), is obtained from a subject suspected of being susceptible to or having a Caliban-related disease or disorder. The DNA, RNA, or cDNA sample is then examined to determine whether a polymorphic variant in a coding or noncoding portion of a gene set forth in the Figures, or a polymorphic variant in a genomic region linked to a coding or noncoding portion of a gene set forth in the Figures, is present. The presence of the polymorphic variant can be indicated by hybridization of the gene in the genomic DNA, RNA, or cDNA to a nucleic acid probe, e.g., a DNA probe (which includes cDNA and oligonucleotide probes) or an RNA probe. The nucleic acid probe can be designed to specifically or preferentially hybridize with a particular polymorphic variant, e.g., a polymorphic variant indicative of susceptibility to Caliban-related disease or disorder.

[0466] In order to diagnose susceptibility to Caliban-related disease or disorder, a hybridization sample is formed by contacting the sample with at least one nucleic acid probe. The probe is typically a nucleic acid probe (which can be labeled, e.g., with a radioactive, fluorescent, or enzymatic label or tag) capable of hybridizing to mRNA, genomic DNA, and/or cDNA sequences encompassing a polymorphic variant in a coding or noncoding portion of a gene set forth in Figure 5, or a polymorphic variant in a genomic region linked to a coding or noncoding portion of a gene set forth in Figure 5. The nucleic acid probe can be, for example, a full-length nucleic acid molecule, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to appropriate mRNA, cDNA, or genomic DNA.

[0467] The hybridization sample is maintained under conditions selected to allow specific hybridization of the nucleic acid probe to a region encompassing the polymorphic site. Specific hybridization can be performed under high stringency conditions or moderate stringency conditions, for example, as described above. In a particularly preferred embodiment, the hybridization conditions for specific hybridization are high stringency. In general, the probe can be perfectly complementary to the region to which it hybridizes, i.e., perfectly complementary to a region encompassing the polymorphic site when the site contains any particular polymorphic sequence. Multiple nucleic acid probes (e.g., multiple probes differing only at the polymorphic site, or multiple probes designed to detect polymorphic variants at multiple polymorphic sites) can be used concurrently in this method. Specific hybridization of any one of the nucleic acid probes is indicative of a polymorphic variant in a genomic region linked to a coding or noncoding portion of an expression profile gene set forth in Figure 5 or fragment thereof, or of a polymorphic variant of a polymorphism in a genomic region linked to such a gene, and is thus diagnostic of susceptibility to a Caliban-related disease or disorder.

[0468] Northern analysis can be performed using similar nucleic acid probes in order to detect a polymorphic variant of a polymorphism in a coding or noncoding portion of a gene selected from the group consisting of a gene set forth in the Figures or fragment thereof, or detecting a polymorphic variant of a polymorphism in a genomic region linked to such a gene. (See, e.g., Ausubel et al, supra).

[0469] According to certain embodiments of the invention, a peptide nucleic acid (PNA) probe can be used instead of a nucleic acid probe in the hybridization methods described above. PNA is a DNA mimetic with a peptide-like, inorganic backbone, e.g., N-(2-aminoethyl)glycine units, with an organic base (A, G, C, T or U) attached to the glycine nitrogen via a methylene carbonyl linker (see, for example, Nielsen et al, Bioconjugate Chemistry 5, 1994). The PNA probe can be designed to specifically hybridize to a nucleic acid comprising a polymorphic variant conferring susceptibility to or indicative of the presence of a Caliban-related disease or disorder.

[0470] According to another method, restriction digest analysis can be used to detect the existence of a polymorphic variant of a polymorphism, if alternate polymorphic variants of the polymorphism result in the creation or elimination of a restriction site. A sample containing genomic DNA is obtained from the individual. Polymerase chain reaction (PCR) can be used to amplify a region comprising the polymorphic site, and restriction fragment length polymorphism analysis is conducted (see, e.g., Ausubel et al, supra). The digestion pattern of the relevant DNA fragment indicates the presence or absence of a particular polymorphic variant of the polymorphism and is therefore indicative of the presence or absence of susceptibility to a Caliban-related disease or disorder.
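
Whether alternate polymorphic variants create or eliminate a restriction site can be checked in silico before a digest is planned. The sketch below searches for the EcoRI recognition sequence (GAATTC) in two hypothetical amplicon sequences; the sequences and the function name are assumptions for illustration, and only the given strand is scanned (sufficient for a palindromic site such as EcoRI).

```python
def recognition_site_positions(sequence: str, recognition_site: str):
    """Return 0-based start positions of every occurrence of the site."""
    seq = sequence.upper()
    site = recognition_site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # allow overlapping occurrences
    return positions


ECORI_SITE = "GAATTC"

# Hypothetical amplicons differing at one base within the site (A versus G).
variant_1 = "CCGTAGAATTCGGATA"
variant_2 = "CCGTAGAGTTCGGATA"

print(recognition_site_positions(variant_1, ECORI_SITE))  # site present
print(recognition_site_positions(variant_2, ECORI_SITE))  # site eliminated
```

In this hypothetical case, digestion would be expected to cut the amplicon of the first variant but leave the second intact, giving distinguishable fragment patterns.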

[0471] Sequence analysis can also be used to detect specific polymorphic variants. A sample comprising DNA or RNA is obtained from the subject. PCR or other appropriate methods can be used to amplify a portion encompassing the polymorphic site, if desired. The sequence is then ascertained, using any standard method, and the presence of a polymorphic variant is determined.

[0472] Allele-specific oligonucleotides can also be used to detect the presence of a polymorphic variant, e.g., through the use of dot-blot hybridization of amplified oligonucleotides with allele-specific oligonucleotide (ASO) probes (see, for example, Saiki, et al, Nature 324: 163-166, 1986). An "allele-specific oligonucleotide" (also referred to herein as an "allele-specific oligonucleotide probe") is typically an oligonucleotide of approximately 10-50 base pairs, preferably approximately 15-30 base pairs, that specifically hybridizes to a nucleic acid region that contains a polymorphism, e.g., a polymorphism associated with a susceptibility to a Caliban-related disease or disorder. An allele-specific oligonucleotide probe that is specific for a particular polymorphism can be prepared using standard methods (see Ausubel et al, supra).

[0473] To determine which of multiple polymorphic variants is present in a subject, a sample comprising DNA is obtained from the individual. PCR can be used to amplify a portion encompassing the polymorphic site. DNA containing the amplified portion can be dot-blotted, using standard methods, and the blot contacted with the oligonucleotide probe. The presence of specific hybridization of the probe to the DNA is then detected. Specific hybridization of an allele-specific oligonucleotide probe (specific for a polymorphic variant indicative of susceptibility to a Caliban-related disease or disorder) to DNA from the subject is indicative of susceptibility to a Caliban-related disease or disorder.

[0474] According to another embodiment of the invention, arrays of oligonucleotide probes that are complementary to nucleic acid portions from a subject can be used to identify polymorphisms. Biochips as described herein can be used.

[0475] The array typically includes oligonucleotide probes capable of specifically hybridizing to different polymorphic variants. According to the method, a nucleic acid of interest, e.g., a nucleic acid encompassing a polymorphic site, (which is typically amplified) is hybridized with the array and scanned. Hybridization and scanning are generally carried out according to standard methods. (See, e.g., Published PCT Application Nos. WO 92/10092 and WO 95/11995, and U.S. Pat. No. 5,424,186). After hybridization and washing, the array is scanned to determine the position on the array to which the nucleic acid hybridizes. The hybridization data obtained from the scan is typically in the form of fluorescence intensities as a function of location on the array.

[0476] Arrays can include multiple detection blocks (i.e., multiple groups of probes designed for detection of particular polymorphisms). Such arrays can be used to analyze multiple different polymorphisms. Detection blocks can be grouped within a single array or in multiple, separate arrays so that varying conditions (e.g., conditions optimized for particular polymorphisms) can be used during the hybridization. For example, it can be desirable to provide for the detection of those polymorphisms that fall within G-C rich stretches of a genomic sequence, separately from those falling in A-T rich segments.

[0477] Additional description of use of oligonucleotide arrays for detection of polymorphisms can be found, for example, in U.S. Pat. Nos. 5,858,659 and 5,837,832. In addition to oligonucleotide arrays, cDNA arrays can be used similarly in certain embodiments of the invention.

[0478] Other methods of nucleic acid analysis can be used to detect polymorphisms and/or polymorphic variants. Such methods include, e.g., direct manual sequencing (Church et al, Proc. Natl. Acad. Sci. U.S.A. 81: 1991-1995, 1988; Sanger et al, Proc. Natl. Acad. Sci. U.S.A. 74: 5463-5467, 1977; Beavis et al, U.S. Pat. No. 5,288,644); automated fluorescent sequencing; single-stranded conformation polymorphism assays (SSCP); clamped denaturing gel electrophoresis (CDGE); denaturing gradient gel electrophoresis (DGGE) (Sheffield et al, Proc. Natl. Acad. Sci. U.S.A. 86: 232-236, 1991), mobility shift analysis (Orita et al, Proc. Natl. Acad. Sci. U.S.A. 86: 2766-2770, 1989), restriction enzyme analysis (Flavell et al, Cell 15: 25, 1978; Geever et al, Proc. Natl. Acad. Sci. U.S.A. 78: 5081, 1981); heteroduplex analysis; chemical mismatch cleavage (CMC) (Cotton et al, Proc. Natl. Acad. Sci. U.S.A. 85: 4397-4401, 1985); RNase protection assays (Myers et al, Science 230: 1242, 1985); use of polypeptides that recognize nucleotide mismatches, e.g., E. coli mutS protein; and allele-specific PCR.

[0479] In certain embodiments of the invention fluorescence polarization template-directed dye-terminator incorporation (FP-TDI) is used to determine which of multiple polymorphic variants of a polymorphism is present in a subject. This method is based on template-directed primer extension and detection by fluorescence polarization. According to this method, amplified genomic DNA containing a polymorphic site is incubated with oligonucleotide primers (designed to hybridize to the DNA template adjacent to the polymorphic site) in the presence of allele-specific dye-labeled dideoxyribonucleoside triphosphates and a commercially available modified Taq DNA polymerase. The primer is extended by the dye-terminator specific for the allele present on the template, increasing the molecular weight of the fluorophore 10-fold. At the end of the reaction, the fluorescence polarization of the two dye-terminators in the reaction mixture is analyzed directly without separation or purification. This homogeneous DNA diagnostic method has been shown to be highly sensitive and specific and is suitable for automated genotyping of large numbers of samples. (Chen et al, Genome Research 9: 492-498, 1999). Note that rather than involving use of allele-specific probes or primers, this method employs primers that terminate adjacent to a polymorphic site, so that extension of the primer by a single nucleotide results in incorporation of a nucleotide complementary to the polymorphic variant at the polymorphic site.
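
The single-nucleotide extension logic described above can be sketched as a simple lookup: the terminator incorporated is the one complementary to the template base at the polymorphic site. The sketch deliberately ignores the fluorescence polarization readout itself; the template sequences, site index, and function name are hypothetical.

```python
COMPLEMENTARY_BASE = {"A": "T", "C": "G", "G": "C", "T": "A"}


def incorporated_terminator(template: str, site_index: int) -> str:
    """Return the dideoxynucleotide base expected to be incorporated.

    site_index is the 0-based position of the polymorphic site on the
    template strand; the primer is assumed to end adjacent to that site.
    """
    template_base = template.upper()[site_index]
    return COMPLEMENTARY_BASE[template_base]


# Hypothetical template strands differing at index 4 (C versus T).
template_variant_1 = "TTGACCGTA"
template_variant_2 = "TTGATCGTA"

print(incorporated_terminator(template_variant_1, 4))  # ddG expected
print(incorporated_terminator(template_variant_2, 4))  # ddA expected
```

A heterozygous sample would be expected to show incorporation of both terminators, which is why the method reads the polarization of two dye-terminators in the same reaction.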

[0480] Real-time pyrophosphate DNA sequencing is yet another approach to detection of polymorphisms and polymorphic variants. (Alderborn et al, Genome Research 10: 1249-1258, 2000). Additional methods include, for example, PCR amplification in combination with denaturing high performance liquid chromatography (dHPLC) (Underhill et al, Genome Research 7: 996-1005, 1997).

[0481] In general, it will be of interest to determine the genotype of a subject with respect to both copies of the polymorphic site present in the genome. For example, the complete genotype can be characterized as -/-, as -/+, or as +/+, where a minus sign indicates the presence of the reference or wild type sequence at the polymorphic site, and the plus sign indicates the presence of a polymorphic variant other than the reference sequence. If multiple polymorphic variants exist at a site, this can be appropriately indicated by specifying which ones are present in the subject. Any of the detection means above can be used to determine the genotype of a subject with respect to one or both copies of the polymorphism present in the subject's genome.
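
The -/-, -/+, +/+ notation can be expressed as a small function that compares the base found on each chromosome copy with the reference base. This is a minimal sketch; the function name and the example bases are hypothetical, and it does not distinguish among multiple non-reference variants.

```python
def genotype_label(allele_1: str, allele_2: str, reference_allele: str) -> str:
    """Return "-/-", "-/+", or "+/+" for the two copies of a site.

    A minus sign denotes the reference sequence at the site; a plus sign
    denotes any variant other than the reference.
    """
    reference = reference_allele.upper()
    variant_count = sum(a.upper() != reference for a in (allele_1, allele_2))
    return ("-/-", "-/+", "+/+")[variant_count]


# Hypothetical calls at a site whose reference base is C.
print(genotype_label("C", "C", "C"))
print(genotype_label("G", "C", "C"))
print(genotype_label("G", "T", "C"))
```

Where several non-reference variants exist at a site, the alleles themselves (here G and T) would need to be reported alongside the label, as the paragraph above notes.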

[0482] According to certain embodiments of the invention it is preferable to employ methods that can detect the presence of multiple polymorphic variants (e.g., polymorphic variants at a plurality of polymorphic sites) in parallel or substantially simultaneously. Oligonucleotide arrays represent one suitable means for doing so. Other methods, including methods in which reactions (e.g., amplification, hybridization) are performed in individual vessels, e.g., within individual wells of a multi-well plate or other vessel can also be performed so as to detect the presence of multiple polymorphic variants (e.g., polymorphic variants at a plurality of polymorphic sites) in parallel or substantially simultaneously according to certain embodiments of the invention.

[0483] The invention provides a database comprising a list of polymorphic sequences stored on a computer-readable medium, wherein the polymorphic sequences occur in a coding or noncoding portion of a Caliban gene set forth in the Figures or fragment thereof, or in a genomic region linked to such a gene, and wherein the list is largely or entirely limited to polymorphisms that have been identified as useful in performing genetic diagnosis of a Caliban-related disease or disorder or susceptibility to a Caliban-related disease or disorder, or for performing genetic studies of a Caliban-related disease or disorder or susceptibility to a Caliban-related disease or disorder.

20. KITS

[0484] Kits are provided which contain the necessary reagents for determining Caliban gene copy number, for determining abnormal expression of Caliban mRNA or Caliban protein, or for detecting polymorphisms in Caliban alleles. Instructions provided in the diagnostic kits can include calibration curves, diagrams, illustrations, or charts or the like to compare with the determined (e.g., experimentally measured) values or other results.

[0485] Kits are also provided that contain cells that serve as either positive or negative controls. These control cells can be compared to experimental samples containing similar cells, for instance cells of unknown gene activity, mutational state, protein expression level, and so forth.

A. Kits for Detection of Caliban Genomic Sequences

[0486] The nucleotide sequences disclosed herein, and fragments thereof, can be supplied in the form of a kit for use in detection of Caliban genomic sequences, for instance in order to diagnose a predisposition for tumor growth or of the presence of a tumor. In one embodiment of such a kit, an appropriate amount of one or more of the Caliban-specific oligonucleotide primers is provided in one or more containers. In other embodiments, the oligonucleotide primers may be provided suspended in an aqueous solution or as a freeze-dried or lyophilized powder, for instance. The container(s) in which the oligonucleotide(s) are supplied can be any conventional container that is capable of holding the supplied form, for instance, microfuge tubes, ampoules, or bottles. In other embodiments, pairs of primers may be provided in pre-measured single use amounts in individual, typically disposable, tubes or equivalent containers. In one specific, non-limiting embodiment, the sample to be tested for the presence of Caliban genomic amplification can be added to the individual tubes and in vitro amplification carried out directly.

[0487] The amount of each oligonucleotide primer supplied in the kit can be any appropriate amount, depending for instance on the market to which the product is directed. In one embodiment, the kit is adapted for research or clinical use and the amount of each oligonucleotide primer provided is an amount sufficient to prime several in vitro amplification reactions. Those of ordinary skill in the art know the amount of oligonucleotide primer that is appropriate for use in a single amplification reaction. General guidelines may for instance be found in (Innis et al., A Guide to Methods and Applications, 1990; Sambrook et al, supra, and Ausubel et al., supra).

[0488] In one embodiment, a kit may include more than two primers, in order to facilitate the PCR in vitro amplification of Caliban sequences, for instance the Caliban gene, specific exon(s) or other portions of the gene, or the 5' or 3' flanking region thereof.

[0489] In some embodiments, kits may also include the reagents necessary to carry out PCR in vitro amplification reactions, including, for instance, DNA sample preparation reagents, appropriate buffers (e.g., polymerase buffer), salts (e.g., magnesium chloride), and deoxyribonucleotides (dNTPs). Instructions may also be included.

[0490] In other embodiments, kits may include either labeled or unlabeled oligonucleotide probes for use in detection of the in vitro amplified Caliban sequences. In one specific, non-limiting embodiment, the appropriate sequences for such a probe will be any sequence that falls between the annealing sites of the two provided oligonucleotide primers, such that the sequence the probe is complementary to is amplified during the in vitro amplification reaction.
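The probe-placement condition in the preceding paragraph can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions: the function names and the toy template, primer, and probe sequences are hypothetical and are not Caliban sequences, and exact string matching stands in for real annealing behavior.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def probe_between_primers(template: str, forward: str, reverse: str, probe: str) -> bool:
    """Check that a probe matches sequence lying between the two primer sites.

    Assumptions (hypothetical, for illustration): `template` is the top strand,
    `forward` matches the top strand exactly, and `reverse` is given 5'->3' so
    its reverse complement matches the top strand exactly.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    rev_start = template.find(reverse_complement(reverse))
    if fwd_start < 0 or rev_start < 0:
        return False  # one of the primers has no exact annealing site
    fwd_end = fwd_start + len(forward)
    if rev_start < fwd_end:
        return False  # primer sites overlap or are in the wrong order
    inner = template[fwd_end:rev_start]  # region strictly between the primers
    # The probe may be complementary to either strand of the amplified region.
    return probe.upper() in inner or reverse_complement(probe) in inner


# Toy example with made-up sequences.
toy_template = "ATGGCCAAGTTTCGATCGGACCTGAGTCAA"
print(probe_between_primers(toy_template, "ATGGCCAAGT", "TTGACTCAGG", "CGATCG"))
```

A probe that overlaps a primer site rather than the internal region is rejected by this check.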

[0491] In yet another embodiment, the kit provides one or more control sequences for use in the amplification reactions. The design of appropriate positive control sequences is well known to one of ordinary skill in the appropriate art.

B. Kits for Detection of Caliban mRNA Expression

[0492] Kits similar to those disclosed above for the detection of Caliban genomic sequences can be used to detect Caliban mRNA expression levels. One embodiment of such a kit may include an appropriate amount of one or more of the oligonucleotide primers for use in reverse transcription amplification reactions, similarly to those provided above, with art-obvious modifications for use with RNA.

[0493] In some embodiments, kits for detection of Caliban mRNA expression levels may also include the reagents necessary to carry out RT-PCR in vitro amplification reactions, including, for instance, RNA sample preparation reagents (including e.g., an RNAse inhibitor), appropriate buffers (e.g., polymerase buffer), salts (e.g., magnesium chloride), and deoxyribonucleotides (dNTPs). Instructions also may be included.

[0494] In other embodiments, kits may include either labeled or unlabeled oligonucleotide probes for use in detection of the in vitro amplified target sequences. In one specific, non-limiting embodiment, the appropriate sequences for such a probe will be any sequence that falls between the annealing sites of the two provided oligonucleotide primers, such that the sequence the probe is complementary to is amplified during the PCR reaction.

[0495] In another embodiment, the kit provides one or more control sequences for use in the RT-PCR reactions. The design of appropriate positive control sequences is well known to one of ordinary skill in the appropriate art.

[0496] In yet other embodiments, kits may be provided with the necessary reagents to carry out quantitative or semi-quantitative Northern analysis of Caliban mRNA. Such kits include, for instance, at least one Caliban-specific oligonucleotide for use as a probe. This oligonucleotide may be labeled in any conventional way, including with a selected radioactive isotope, enzyme substrate, co-factor, ligand, chemiluminescent or fluorescent agent, hapten, or enzyme.

C. Kits For Detection of Caliban Protein or Peptide Expression

[0497] In some embodiments, kits for the detection of Caliban protein expression include for instance at least one target protein specific binding agent (e.g., a polyclonal or monoclonal antibody or antibody fragment) and may include at least one control. In another embodiment, the Caliban protein specific binding agent and control may be contained in separate containers. In other embodiments, the kits may also include means for detecting Caliban:agent complexes, for instance the agent may be detectably labeled. If the detectable agent is not labeled, it may be detected by second antibodies or protein A for example which may also be provided in some kits in one or more separate containers. Such techniques are well known.

[0498] In another embodiment, the kits include instructions for carrying out the assay. Instructions will allow the tester to determine whether Caliban expression levels are altered, for instance in comparison to a control sample. In other embodiments, reaction vessels and auxiliary reagents such as cells, chromogens, buffers, media, enzymes, and the like also may be included in the kits.

[0499] In one embodiment, an assay kit could provide the recombinant protein as an antigen and enzyme-conjugated goat anti-human IgG as a second antibody as well as the enzymatic substrates. Such kits can be used to test if the sera from a subject contain antibodies against human Caliban.

D. Kits for Detection of Homozygous versus Heterozygous Allelism

[0500] Also provided are kits that allow differentiation between individuals who are homozygous versus heterozygous for a polymorphism of Caliban. In one embodiment such kits provide the materials necessary to perform oligonucleotide ligation assays (OLA), for instance as described elsewhere (Nickerson et al., Proc. Natl. Acad. Sci. U.S.A. 87: 8923-8927, 1990). In specific embodiments, these kits contain one or more microtiter plate assays, designed to detect allelism in the Caliban sequence of a subject, as described herein.

[0501] In one embodiment, additional components in some of these kits may include instructions for carrying out the assay. Instructions will allow the tester to determine whether a Caliban allele is homozygous or heterozygous. In other embodiments, reaction vessels and auxiliary reagents such as chromogens, buffers, enzymes, and the like, may also be included in the kits.

[0502] In another embodiment, the kit may provide one or more control sequences for use in the OLA reactions. The design of appropriate positive control sequences is well known to one of ordinary skill in the appropriate art.
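By way of illustration only, the determination of homozygous versus heterozygous allelism from an assay that reports one signal per allele might be sketched as follows. The function name, the two-signal input format, and the single detection threshold are assumptions made for this sketch; they are not taken from the disclosure or from the cited OLA protocol, and an actual kit would define its own cutoffs relative to its control sequences.

```python
def call_genotype(signal_allele_1: float, signal_allele_2: float, threshold: float) -> str:
    """Classify a sample from two allele-specific assay signals.

    `threshold` is a hypothetical detection cutoff: a signal at or above it
    is treated as evidence that the corresponding allele is present.
    """
    has_allele_1 = signal_allele_1 >= threshold
    has_allele_2 = signal_allele_2 >= threshold
    if has_allele_1 and has_allele_2:
        return "heterozygous"
    if has_allele_1:
        return "homozygous allele 1"
    if has_allele_2:
        return "homozygous allele 2"
    return "no call"  # neither signal reached the cutoff; the assay should be repeated


# Made-up readings for three wells of a microtiter plate.
for well, (s1, s2) in {"A1": (1.20, 0.05), "A2": (0.90, 1.10), "A3": (0.02, 0.04)}.items():
    print(well, call_genotype(s1, s2, threshold=0.30))
```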

E. Kits for Identifying Modulators of Caliban Activity

[0503] Also provided are kits that allow for the identification of modulators of Caliban activity. In one embodiment, such kits provide the materials necessary to assess the activity of Caliban in vitro. In one embodiment, this kit contains aliquots of isolated Caliban and cultured cells. In another embodiment, the kit contains cell lines that express either wildtype or mutant Caliban. In yet another embodiment, additional components in some of these kits may include instructions for carrying out the assay. In other embodiments, reaction vessels and auxiliary reagents such as chromogens, buffers, media, enzymes, and the like may also be included in the kits.

[0504] The invention will be further described with reference to the following exemplary embodiments; however, it is to be understood that the invention is not limited to such embodiments.

EXEMPLARY EMBODIMENTS

Materials and Methods

[0505] Yeast two-hybrid screen. The 64 amino acid homeodomain of Prospero was cloned into pGBKT7 (BD Biosciences) and used as bait to screen oligo dT primed Drosophila pooled 21 hour Canton S embryonic mRNA in pACT2 (BD Biosciences). Co-transformed yeast (AH109) were transferred onto SD/-Leu/-His/-Ade/-Trp/X-β-gal/5 mM 3-AT plates (high stringency selection) and incubated at 30°C for 4-6 days. Surviving colonies were re-streaked on low stringency SD/-Leu/-Trp/X-β-gal plates 2-3 times and reselected on high stringency plates. Plasmid DNA was isolated from surviving yeast and analyzed by polymerase chain reaction (PCR) and restriction digestion with HaeIII. Unique clones and frame shift mutant versions were re-transformed into AH109, mated to yeast transformed with pGBKT7-HD and again selected on high stringency plates.

[0506] Co-immunoprecipitation. Protein was expressed and labeled with 35S-methionine by in vitro transcription and translation using TNT T7 coupled reticulocyte lysate system (Promega) with the vectors pGADT7-Clbn-C, pGBKT7-HD, pGADT7-HD, pGADT7-Clbn, pGBKT7-Clbn-N, and pGADT7-hCrm1. pQE-RanQ69L was expressed in BL21 cells and RanQ69LGTP was purified by NTA-resin chromatography and dialyzed in the presence of 0.1 mM GTP. 5 μl of each in vitro synthesized product were mixed and incubated 1 h at 30°C, with bacterially expressed RanQ69LGTP added where indicated. Mixtures were then diluted in 470 μl co-IP buffer (20 mM Tris-HCl pH 7.5, 150 mM NaCl, 1 mM DTT, 5 μl/ml aprotinin, 0.5 mM PMSF, 0.1% Tween 20), 10 μl Protein-A agarose, and 10 μl c-Myc antibody, HA antibody (Roche), or anti-Clbn; then incubated at 4°C for 2 h or overnight. Samples were then centrifuged 2 min and washed in TBST (20 mM Tris-HCl pH 7.5, 150 mM NaCl, 0.1% Tween 20) four times. Pellets were re-suspended in 15 μl SDS-loading buffer, heated to 80°C for 5 min, briefly spun down and loaded on 4-12% gradient SDS-PAGE gels. Gels were fixed, washed in H2O, and NAMP100 was added for 20 min before drying and exposing to film.

[0507] RNA hybridization in situ. Whole mount embryo in situ hybridization is described elsewhere (Tautz and Pfeifle, Chromosoma 98: 81-85, 1989). Digoxigenin-labeled sense and antisense probes were transcribed by T7 and SP6 RNA polymerase, respectively, from XhoI or EcoRI digested pSP73-Clbn using the DIG RNA Labeling Kit (Boehringer Mannheim) according to the manufacturer's instructions. Signal was detected by alkaline phosphatase histochemical staining.

[0508] Synthesis of double stranded RNAi. Both strands of linearized templates (pSP73, pSP73-EYFP, pSP73-Clbn, pSP73-Emb, pSP73-hCrm1) were transcribed using T7 and SP6 RNA polymerase (Promega's RiboMax large scale RNA production system). Double stranded RNA was then prepared as described elsewhere (Misquitta and Paterson, Proc. Natl. Acad. Sci. U.S.A. 96: 1451-1456, 1999), visualized on 1% agarose gels and quantified by determining the optical density at 260 nm.
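Quantification from the optical density at 260 nm is conventionally a simple proportional conversion, sketched below. The function name and the example readings are hypothetical. The conversion factor is deliberately left as a required argument: 40 μg/ml per absorbance unit is the value conventionally used for single-stranded RNA, and the value appropriate for double stranded RNA should be taken from the protocol actually followed, which the text above does not state.

```python
def rna_concentration_ug_per_ml(a260: float, dilution_factor: float,
                                ug_per_ml_per_a260: float) -> float:
    """Estimate nucleic acid concentration from absorbance at 260 nm.

    concentration = A260 x dilution factor x conversion factor.
    The conversion factor is an assumption supplied by the caller.
    """
    if a260 < 0 or dilution_factor <= 0 or ug_per_ml_per_a260 <= 0:
        raise ValueError("absorbance must be >= 0; dilution and conversion factors must be > 0")
    return a260 * dilution_factor * ug_per_ml_per_a260


# Hypothetical reading: A260 of 0.5 measured on a 1:100 dilution.
print(rna_concentration_ug_per_ml(0.5, 100, 40.0))
```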

[0509] Human Sdccag1 Northern blot. Reverse transcription-PCR was used to amplify a 309 bp cDNA from the carboxy terminus of Sdccag1 (2978-3286 bps from sequence NM_004713.2). This cDNA was cloned into pPCR-Script-Amp (Stratagene), labeled with 32P and hybridized to a Human NorthernLight Blot (Panomics) according to the manufacturer's instructions.

[0510] Tissue culture and transfection. Drosophila SL2 cells were grown at 25°C in HyQ CCM3 (HyClone) supplemented with gentamicin sulfate. Cells were split, incubated two days to 80-90% confluence and transfected as follows. 6 μl Fugene6 (Roche) was added to 100 μl HyQ media, followed by 2 μg of DNA of the vector pAc5.1/V5-HisA-EYFP-HDA, incubated 20 min and then mixed with the aforementioned cells. Cells were grown 2-3 days, transferred to fresh media in six-well plates containing poly-lysine (0.1 mg/ml) coated glass cover slips, incubated 4-6 h, during which time cells attach to the coverslips, and then fixed, stained and examined by confocal microscopy as described elsewhere (Demidenko et al., Development 128: 1359-1367, 2001).

[0511] Mammalian CV1, WI-38 and IMR-90 cells were purchased from American Type Culture Collection. A primary normal human bronchial epithelial cell line (NHBE) was purchased from Cambrex. Cancer cell lines were a gift of S. Amundson and A. Fornace. Mammalian cells were cultured, transfected, fixed and analyzed as described elsewhere (Demidenko et al., Development 128: 1359-1367, 2001). pEYFP-HDA (Demidenko et al., Development 128: 1359-1367, 2001), pEYFP-Pros-NLS+ (Bi) and pREV(1.4)-GFP-PKI (Henderson and Eleftheriou, Exp. Cell Res. 256: 213-224, 2000) are described elsewhere.

[0512] Creation of stable transformed cell lines. Full length Clbn was cloned into pcDNA/TO/myc-His-C (Invitrogen) with and without the endogenous translation stop sequence; the latter tags the protein with the myc epitope. These were linearized by digestion with SspI and stable cell lines selected on zeocin (50 μg/ml). pcDNA6/TR (Invitrogen), which expresses the Tet repressor, was linearized by digestion with SapI and stable cell lines selected on blasticidin (1-5 μg/ml). Stable cell lines expressing a short hairpin RNA that functions as RNAi were generated by stably integrating pSM2c (Open Biosystems) containing either of two short sequences of sdccag1 (nucleotides 163-185 or 1181-1202 from NM_004713.2) linearized by digestion with ApaI, and stable cell lines were selected with puromycin (25-50 ng/ml).

[0513] Soft Agar Colony assay - Invasion assay. Invasion chambers (BD Biosciences) were used according to the manufacturer's instructions. In brief, equal numbers of cells in media without serum were placed in experimental (basement membrane coated) and control (uncoated) chambers on membranes with 8 μm pores. These were placed in 6- or 24-well plates containing media with serum. After incubation for 22 h (A549) or 44 h (EKVX), cells that moved through the pores were fixed in methanol, stained with Giemsa and counted. Data are presented as percent invasion, calculated by dividing the number of cells from experimental chambers by the number from control chambers.
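The percent invasion calculation described above can be written out as a short sketch. The function name and the cell counts are hypothetical and are shown only to make the arithmetic explicit.

```python
def percent_invasion(experimental_count: int, control_count: int) -> float:
    """Percent invasion = cells through the coated (experimental) membrane
    divided by cells through the uncoated (control) membrane, times 100."""
    if control_count <= 0:
        raise ValueError("control chamber count must be positive")
    if experimental_count < 0:
        raise ValueError("experimental chamber count cannot be negative")
    return 100.0 * experimental_count / control_count


# Made-up counts: 30 cells crossed the coated membrane, 120 crossed the uncoated one.
print(percent_invasion(30, 120))
```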

[0514] FACS analysis. Cells were grown to 80-90% confluence in T75 flasks, harvested with trypsin, pelleted and resuspended in 70% EtOH at -20°C and stored until needed. Cells were resuspended in PBT at room temperature and stained with propidium iodide (50 μg/ml) in the presence of 0.1 mM EDTA and RNase A + T (50 μg/ml and 0.5 U/ml, respectively) for 1 h. Cells were analyzed on a FACScan flow cytometer (Becton Dickinson).

[0515] Deletion of Caliban. The fly Caliban gene was deleted using homologous recombination according to standard protocols described elsewhere (Rong and Golic, Science 288: 2013-2018, 2000). In brief, an Arm-GFP clone was inserted into the genome of Drosophila in place of the Caliban gene. Three independent lines were recovered. Inducible full length and dominant negative Caliban. Both full length (Clbn) and the dominant negative carboxy terminal third (Clbn-DN) of a Caliban cDNA were cloned into a Drosophila P-element transformation vector, downstream of an inducible UAS. Multiple transgenic lines of each construct were recovered. These were crossed to flies ubiquitously expressing the transcriptional activator Gal4, with an actin or tubulin promoter, or to flies expressing Gal4 in a developmentally restricted pattern. The Gal4 protein binds the UAS, inducing transcription of UAS-Clbn and UAS-Clbn-DN, in a system described elsewhere (Brand and Perrimon, Development 118: 401-415, 1993).

Results and Discussion

[0516] To test whether Clbn specifically interacts with the HDA nuclear export signal, yeast were left untransformed (Blank) or transformed with the indicated clones (Fig. 1a). Co-transformation of Clbn-C and the full homeodomain (HD) or HDA allows yeast to survive on selective media, indicating a protein-protein interaction. The remainder of the homeodomain, HDB, does not interact with Clbn-C (Fig. 1a).

[0517] To examine the relative strength of the protein interactions we plated a dilution series of yeast co-transformed with the Clbn-C construct used in Fig. 1a and the indicated HD constructs (Fig. 1b). The interactions between Clbn-C and HD or HDA are indistinguishable. Two mutant forms of HDA, HDA-F4 and HDA-LLL, which abrogate much of its nuclear export function (Bi et al., Mol. Cell. Biol. 23: 1014-1024, 2003), show an approximately five-fold reduction in their interaction with Clbn-C (Fig. 1b). The homeo/Prospero domain (HP), which when intact masks nuclear export, does not interact with Clbn-C in yeast; in other words it blocks the interaction between Clbn-C and HDA. There is no interaction between Clbn-C and HDB. These interactions in yeast parallel our earlier analysis of HDA sequence requirements for in vivo NES function (Demidenko et al., Development 128: 1359-1367, 2001; Bi et al., Mol. Cell. Biol. 23: 1014-1024, 2003) and suggest that Caliban functions as a regulator of HDA-directed nuclear export.

[0518] The requirement for clbn gene expression during Drosophila development is unknown. As a first step in determining this, we performed RNA in situ hybridization on developing Drosophila embryos (Tautz et al., Chromosoma 98: 81-85, 1989). Digoxigenin-labeled sense and antisense probes were transcribed by T7 and SP6 RNA polymerase, respectively, from XhoI or EcoRI digested pSP73-Clbn. Clbn is expressed ubiquitously throughout embryonic development.

[0519] Caliban is conserved from yeast to man with the fly (NM_143084.1) and human (NM_004713.2) sequences being 49% identical. While the fly gene was predicted to contain five introns, using RT-PCR we were never able to see the predicted 60-nucleotide second intron spliced out of the mature mRNA (data not shown). Inclusion of the resulting additional 20 amino acids greatly improves the alignment of the fly and human protein sequences. There are three regions of higher homology (>50% identity with fewer gaps). These include the amino terminal 206 amino acids, which are 56% identical. This region is also rich in conserved leucine residues, with 19 of the 26 leucines found in the fly protein also in identical positions of the human protein and four others having conservative changes. A second region of increased similarity includes amino acids 277-679, which are 62% identical. The carboxy terminal 179 amino acids, included within the region that binds HDA, are 51% identical.
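For readers unfamiliar with how a percent identity figure is derived from a pairwise alignment, a minimal sketch follows. It assumes one common convention, namely identical columns divided by all aligned columns (gap columns included in the denominator); the text above does not state which convention or alignment program produced the reported values, and the short peptide strings are invented for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Convention assumed here: identical residue columns divided by the number
    of columns in which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # a column of two gaps carries no information
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


# Invented 7-column alignment: 5 identical columns, 1 mismatch, 1 gap.
print(round(percent_identity("MKLV-LA", "MKIVQLA"), 1))
```

Other conventions (for example, excluding gapped columns from the denominator) give higher figures for the same alignment, so the convention should be stated alongside any reported value.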

[0520] Human NY-CO-1 was first identified by a serological analysis of recombinant cDNA expression libraries (SEREX) where it was recognized by autologous serum from colon-cancer patients. The carboxy terminal 366 amino acids of this protein were believed to be the full-length protein (Scanlan et al., Int. J. Cancer 76: 652-658, 1998). Another group identified and characterized the protein as a tumor suppressor, which they called Sdccag1 (serologically defined colon cancer antigen gene). When induced in NSCLC-N6 cells, Sdccag1 caused them to drop out of the cell cycle. Blocking the synthesis of Sdccag1 protein with antisense RNA reverted the cells to cancer (Carbonnelle et al., Int. J. Cancer 92: 388-397, 2001; Carbonnelle et al., Cytogen. Cell. Gen. 86: 248-249, 1999). However, the biochemical function of Sdccag1 was unknown.

[0521] To determine the biological function of the fly homolog, Caliban, we transfected Drosophila SL2 tissue culture cells with a green fluorescent tagged NES, pAc5.1-EYFP-HDA, and used RNAi to determine whether Clbn and/or the Drosophila Exportin protein, encoded by embargoed (emb; Collier et al., Genetics 155: 1799-1807, 2000), are necessary for HDA-mediated nuclear export. While double stranded vector RNA had no effect on EYFP-HDA localization and EYFP RNAi eliminated green fluorescent protein expression (Fig. 2a and data not shown), both clbn and emb RNAi abrogated nuclear export (Fig. 2b, c). The effect of Exportin RNAi is specific to fly Emb, as human Exportin (hCrm1) RNAi did not alter the localization of EYFP-HDA, even following three successive rounds of RNAi treatment (data not shown). We conclude that both Clbn and Emb are required for HDA to function as a nuclear export signal in Drosophila.

[0522] It was enigmatic why Sdccag1 was cloned using SEREX (Scanlan et al., Int. J. Cancer 76: 652-658, 1998), which implies that it is over- or mis-expressed in colon cancer patients, if it is in fact a human tumor suppressor (Carbonnelle et al., Int. J. Cancer 92: 388-397, 2001). As the former paper described their clone of the carboxy terminal third of Sdccag1 as full-length, we reasoned that this truncated protein might function as a dominant negative. To test this, we co-transfected SL2 cells with pAc5.1-EYFP-HDA and either the amino or carboxy terminal third of Caliban, Clbn-N or Clbn-C, respectively. While expression of Clbn-N had no effect on the subcellular localization of EYFP-HDA (data not shown), expression of Clbn-C abrogated nuclear export, demonstrating that it acts as a dominant negative (Fig. 2d).

[0523] The amino terminus of Clbn is enriched in conserved leucine residues, suggesting that it might include one or more nuclear export signals. To test this we fused the amino terminal 435 amino acids of Clbn to green fluorescent protein and expressed the constructs in mammalian cells. While pEYFP-Clbn1/435 is ubiquitous, deletion of the amino terminal 50 amino acids, pEYFP-Clbn51/435, results in the protein accumulating in the cytoplasm (Fig. 2e, f). Thus like Prospero, Caliban has a nuclear export signal-masking domain. Further deletions to amino acid 91 or 131, pEYFP-Clbn91/435 and pEYFP-Clbn131/435, also resulted in the protein accumulating in the cytoplasm; however, deletion to amino acid 181 partially restored nuclear green fluorescent protein (Fig. 2g-i). Deletion to amino acid 311, pEYFP-Clbn311/435, results in a fusion protein that is not excluded from the nucleus (Fig. 2j). Thus Caliban has a minimum of two nuclear export signals, one included within amino acids 131-181 and the other 181-311.

[0524] As the carboxy terminus of Caliban binds HDA in yeast (Fig. 1), the amino terminus functions as an NES (Fig. 2e-i), and Caliban is required for the nuclear export of an HDA fusion protein (Fig. 2a-d), we reasoned that Caliban might directly mediate the nuclear export of HDA. Epitope tagged versions of the carboxy terminus of Clbn and HD were synthesized by in vitro transcription and translation, in the presence of 35S-methionine. When incubated together, immunoprecipitation of HD or Clbn-C resulted in the precipitation of the other protein (Fig. 1C, lanes 2 and 5). This interaction does not require RanGTP, a necessary component for Exportin binding to NES sequences (Fornerod et al., Cell 90: 1051-1060, 1997). Immunoprecipitation of in vitro synthesized full length Clbn with a Clbn peptide antibody was also able to precipitate HD (Fig. 1C, lane 8).

[0525] Epitope tagged versions of the amino terminus of Clbn (Clbn-N) and Exportin, encoded by the human crm1 gene (Fornerod et al., Cell 90: 1051-1060, 1997), were synthesized in vitro, incubated together in the presence of the nonhydrolyzable RanQ69LGTP (Bischoff et al., Proc. Natl. Acad. Sci. U.S.A. 91: 2587-2591, 1994) and immunoprecipitated with Myc tagged Clbn-N, resulting in the precipitation of Exportin (Fig. 1C, lane 11). A similar experiment conducted with HD and Exportin, in the presence of RanQ69LGTP, showed that immunoprecipitation of Exportin fails to or only weakly precipitates HD (Fig. 1C, lane 13). Thus, Caliban functions as a bipartite mediator of nuclear export, with the amino terminus binding the nuclear export machinery and the carboxy terminus binding the nuclear export signal HDA.

[0526] Mutation of sdccag1 had been implicated in colon and lung cancer. As a biomarker assay for Sdccag1 function, we transfected pEYFP-HDA into various human cancer cell lines and examined their ability to export green fluorescent fusion protein from the nucleus. While the colon cancer cell lines HCT-116 and SW620 were able to export pEYFP-HDA, all five lung-cancer cell lines that were tested failed to show export (Fig. 3a). We confirmed that Exportin still functions in these cells by transfecting them with a green fluorescent fusion protein, pREV(1.4)-GFP-PKI (Henderson et al., Exp. Cell Res. 256: 213-224, 2000). The nuclear export signal of protein kinase A inhibitor (PKI) directly binds Exportin and is a functionally strong nuclear export signal (Henderson et al., Exp. Cell Res. 256: 213-224, 2000; Fornerod et al., Cell 90: 1051-1060, 1997; Wen et al., Cell 82: 463-473, 1995). We observed nuclear export of pREV(1.4)-GFP-PKI in all of the cell lines (Fig. 3a). Furthermore, nuclear import also functions in these cells as demonstrated by the nuclear localization of pEYFP-Pros-NLS+, a fusion protein that includes the Prospero NLS and a masked version of the NES (Fig. 3a).

[0527] It was possible that sdccag1 is not expressed in lung tissue. To test this, we probed a Northern blot containing poly A+ RNA from different human tissues. A 4.4 kb full length transcript was detected in brain, heart, liver, lung, spleen, and skeletal muscle tissue; significantly less signal was seen in stomach and testis (Fig. 3b).

[0528] To determine whether Sdccag1 is functional in lung cells, we transfected a primary normal human bronchial epithelial cell line (NHBE) and the immortal normal lung cell lines WI-38 and IMR-90 with pEYFP-HDA, pREV(1.4)-GFP-PKI and pEYFP-Pros-NLS+. All three cell lines showed nuclear export of the green fluorescent proteins fused to HDA and PKI and nuclear localization of pEYFP-Pros-NLS+ (Fig. 3a). Thus, Sdccag1 functions in normal lung cell lines; however, it is inactive in all five lung-cancer cell lines investigated. Inactivation of sdccag1 may play a contributory role in transforming lung cells from normal to cancerous.

[0529] To determine the in vivo function of fly Caliban, we used homologous recombination to delete the gene; three lines were obtained. The lines are homozygous viable, occasionally display mild hypertrophy of the nervous system, have a dramatic delay in development, and are sensitive to radiation. The radiation sensitivity can be observed by exposing embryos and early larvae to 4,000 rads (radiation absorbed dose) of irradiation. Following this treatment, the number of larvae forming melanotic tumors was increased 5-10 fold over that of non-mutant control larvae. These tumors grow and eventually kill the larvae (Fig. 6a). These data provide the first demonstration that Caliban can function as a tumor suppressor in a multicellular organism.

[0530] To determine the effects of the dominant negative (oncogenic) form of Caliban on whole flies, we examined transgenic flies carrying inducible versions of either the full-length tumor suppressor or the truncated oncogenic Caliban. One line of the oncogenic Caliban forms tumors in homozygous adult flies (Fig. 6b). This suggests that truncated oncogenic Caliban can induce tumors in a multicellular organism.

References

1. Scanlan et al., Int. J. Cancer 76: 652-658, 1998.

2. Carbonnelle et al., Int. J. Cancer 92: 388-397, 2001.

3. Demidenko et al., Development 128: 1359-1367, 2001.

4. Bi et al., Mol. Cell. Biol. 23: 1014-1024, 2003.

5. Ryter et al., Structure 10: 1541-1549, 2002.

6. Henderson et al., Exp. Cell Res. 256: 213-224, 2000.

7. Fields et al., Nature 340: 245-246, 1989.

8. Tautz et al., Chromosoma 98: 81-85, 1989.

9. Carbonnelle et al., Cytogen. Cell Gen. 86: 248-249, 1999.

10. Collier et al., Genetics 155: 1799-1807, 2000.

11. Fornerod et al., Cell 90: 1051-1060, 1997.

12. Bischoff et al., Proc. Natl. Acad. Sci. U.S.A. 91: 2587-2591, 1994.

13. Wen et al., Cell 82: 463-473, 1995.

14. Misquitta et al., Proc. Natl. Acad. Sci. U.S.A. 96: 1451-1456, 1999.

[0531] Each recited range includes all combinations and sub-combinations of ranges, as well as specific numerals contained therein.

[0532] All publications and patent applications cited in this specification are herein incorporated by reference in their entirety for all purposes as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference for all purposes.

[0533] Although the foregoing invention has been described in detail by way of example for purposes of clarity of understanding, it will be apparent to the artisan that certain changes and modifications are comprehended by the disclosure and can be practiced without undue experimentation within the scope of the appended claims, which are presented by way of illustration not limitation.