Title:
REGULATION OF HUMAN MASP-LIKE SERINE PROTEASE
Document Type and Number:
WIPO Patent Application WO/2002/010404
Kind Code:
A2
Abstract:
Reagents which regulate human MASP-like serine protease activity and reagents which bind to human MASP-like serine protease gene products can be used to regulate extracellular matrix degradation. Such regulation is particularly useful for defending against pathogens, such as viruses, bacteria, mycoplasma, fungi, protozoa, helminths, rickettsia, chlamydiae, parasites, prions, and the like.

Inventors:
XIAO YONGHONG (US)
Application Number:
PCT/EP2001/008181
Publication Date:
February 07, 2002
Filing Date:
July 16, 2001
Assignee:
BAYER AG (DE)
XIAO YONGHONG (US)
International Classes:
A61K39/395; C12N9/64; C12N15/57; C12Q1/37; C12Q1/68; A61K38/00; (IPC1-7): C12N15/57; A61K31/711; A61K31/713; A61K38/43; A61K39/395; C12N5/10; C12N9/64; C12N15/62; C12Q1/68; G01N33/577; G01N33/68
Domestic Patent References:
WO2001038501A2    2001-05-31
WO2001040451A2    2001-06-07
WO2001071004A2    2001-09-27
Other References:
DATABASE EMBL SEQUENCE DATABASE [Online] Hinxton, UK; 23 May 2000 (2000-05-23) D.M. MUZNY ET AL.: "Homo sapiens 3 BAC RP11-460F1 (Roswell Park Cancer Institute Human BAC Library) complete Sequence" XP002195190
DAHL MADS R ET AL: "MASP-3 and its association with distinct complexes of the mannan-binding lectin complement activation pathway." IMMUNITY, vol. 15, no. 1, July 2001 (2001-07), pages 127-135, XP002195186 ISSN: 1074-7613
Attorney, Agent or Firm:
BAYER AKTIENGESELLSCHAFT (Leverkusen, DE)
Claims:
CLAIMS
1. An isolated polynucleotide encoding a MASP-like serine protease polypeptide and being selected from the group consisting of: a) a polynucleotide encoding a MASP-like serine protease polypeptide comprising an amino acid sequence selected from the group consisting of: amino acid sequences which are at least about 50% identical to the amino acid sequence shown in SEQ ID NO: 2; and the amino acid sequence shown in SEQ ID NO: 2; b) a polynucleotide comprising the sequence of SEQ ID NO: 1; c) a polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) and (b); d) a polynucleotide the sequence of which deviates from the polynucleotide sequences specified in (a) to (c) due to the degeneration of the genetic code; and e) a polynucleotide which represents a fragment, derivative or allelic variation of a polynucleotide sequence specified in (a) to (d).
2. An expression vector containing any polynucleotide of claim 1.
3. A host cell containing the expression vector of claim 2.
4. A substantially purified MASP-like serine protease polypeptide encoded by a polynucleotide of claim 1.
5. A method for producing a MASP-like serine protease polypeptide, wherein the method comprises the following steps: a) culturing the host cell of claim 3 under conditions suitable for the expression of the MASP-like serine protease polypeptide; and b) recovering the MASP-like serine protease polypeptide from the host cell culture.
6. A method for detection of a polynucleotide encoding a MASP-like serine protease polypeptide in a biological sample comprising the following steps: a) hybridizing any polynucleotide of claim 1 to a nucleic acid material of a biological sample, thereby forming a hybridization complex; and b) detecting said hybridization complex.
7. The method of claim 6, wherein before hybridization, the nucleic acid material of the biological sample is amplified.
8. A method for the detection of a polynucleotide of claim 1 or a MASP-like serine protease polypeptide of claim 4 comprising the steps of: contacting a biological sample with a reagent which specifically interacts with the polynucleotide or the MASP-like serine protease polypeptide.
9. A diagnostic kit for conducting the method of any one of claims 6 to 8.
10. A method of screening for agents which decrease the activity of a MASP-like serine protease, comprising the steps of: contacting a test compound with any MASP-like serine protease polypeptide encoded by any polynucleotide of claim 1; detecting binding of the test compound to the MASP-like serine protease polypeptide, wherein a test compound which binds to the polypeptide is identified as a potential therapeutic agent for decreasing the activity of a MASP-like serine protease.
11. A method of screening for agents which regulate the activity of a MASP-like serine protease, comprising the steps of: contacting a test compound with a MASP-like serine protease polypeptide encoded by any polynucleotide of claim 1; and detecting a MASP-like serine protease activity of the polypeptide, wherein a test compound which increases the MASP-like serine protease activity is identified as a potential therapeutic agent for increasing the activity of the MASP-like serine protease, and wherein a test compound which decreases the MASP-like serine protease activity of the polypeptide is identified as a potential therapeutic agent for decreasing the activity of the MASP-like serine protease.
12. A method of screening for agents which decrease the activity of a MASP-like serine protease, comprising the steps of: contacting a test compound with any polynucleotide of claim 1 and detecting binding of the test compound to the polynucleotide, wherein a test compound which binds to the polynucleotide is identified as a potential therapeutic agent for decreasing the activity of MASP-like serine protease.
13. A method of reducing the activity of MASP-like serine protease, comprising the steps of: contacting a cell with a reagent which specifically binds to any polynucleotide of claim 1 or any MASP-like serine protease polypeptide of claim 4, whereby the activity of MASP-like serine protease is reduced.
14. A reagent that modulates the activity of a MASP-like serine protease polypeptide or a polynucleotide wherein said reagent is identified by the method of any of claims 10 to 12.
15. A pharmaceutical composition, comprising: the expression vector of claim 2 or the reagent of claim 14 and a pharmaceutically acceptable carrier.
16. Use of the pharmaceutical composition of claim 15 for modulating the activity of a MASP-like serine protease in extracellular matrix degradation.
17. Use of the pharmaceutical composition of claim 15 for defending against pathogens.
18. A cDNA encoding a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2.
19. The cDNA of claim 18 which comprises SEQ ID NO: 1.
20. The cDNA of claim 18 which consists of SEQ ID NO: 1.
21. An expression vector comprising a polynucleotide which encodes a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2.
22. The expression vector of claim 21 wherein the polynucleotide consists of SEQ ID NO: 1.
23. A host cell comprising an expression vector which encodes a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2.
24. The host cell of claim 23 wherein the polynucleotide consists of SEQ ID NO: 1.
25. A purified polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2.
26. The purified polypeptide of claim 25 which consists of the amino acid sequence shown in SEQ ID NO: 2.
27. A fusion protein comprising a polypeptide having the amino acid sequence shown in SEQ ID NO: 2.
28. A method of producing a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2, comprising the steps of: culturing a host cell comprising an expression vector which encodes the polypeptide under conditions whereby the polypeptide is expressed; and isolating the polypeptide.
29. The method of claim 28 wherein the expression vector comprises SEQ ID NO: 1.
30. A method of detecting a coding sequence for a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2, comprising the steps of: hybridizing a polynucleotide comprising 11 contiguous nucleotides of SEQ ID NO: 1 to nucleic acid material of a biological sample, thereby forming a hybridization complex; and detecting the hybridization complex.
31. The method of claim 30 further comprising the step of amplifying the nucleic acid material before the step of hybridizing.
32. A kit for detecting a coding sequence for a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2, comprising: a polynucleotide comprising 11 contiguous nucleotides of SEQ ID NO: 1; and instructions for the method of claim 30.
33. A method of detecting a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2, comprising the steps of: contacting a biological sample with a reagent that specifically binds to the polypeptide to form a reagent-polypeptide complex; and detecting the reagent-polypeptide complex.
34. The method of claim 33 wherein the reagent is an antibody.
35. A kit for detecting a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2, comprising: an antibody which specifically binds to the polypeptide; and instructions for the method of claim 33.
36. A method of screening for agents which can modulate the activity of a human MASP-like serine protease, comprising the steps of: contacting a test compound with a polypeptide comprising an amino acid sequence selected from the group consisting of: (1) amino acid sequences which are at least about 50% identical to the amino acid sequence shown in SEQ ID NO: 2 and (2) the amino acid sequence shown in SEQ ID NO: 2; and detecting binding of the test compound to the polypeptide, wherein a test compound which binds to the polypeptide is identified as a potential agent for regulating activity of the human MASP-like serine protease.
37. The method of claim 36 wherein the step of contacting is in a cell.
38. The method of claim 36 wherein the cell is in vitro.
39. The method of claim 36 wherein the step of contacting is in a cell-free system.
40. The method of claim 36 wherein the polypeptide comprises a detectable label.
41. The method of claim 36 wherein the test compound comprises a detectable label.
42. The method of claim 36 wherein the test compound displaces a labeled ligand which is bound to the polypeptide.
43. The method of claim 36 wherein the polypeptide is bound to a solid support.
44. The method of claim 36 wherein the test compound is bound to a solid support.
45. A method of screening for agents which modulate an activity of a human MASP-like serine protease, comprising the steps of: contacting a test compound with a polypeptide comprising an amino acid sequence selected from the group consisting of: (1) amino acid sequences which are at least about 50% identical to the amino acid sequence shown in SEQ ID NO: 2 and (2) the amino acid sequence shown in SEQ ID NO: 2; and detecting an activity of the polypeptide, wherein a test compound which increases the activity of the polypeptide is identified as a potential agent for increasing the activity of the human MASP-like serine protease, and wherein a test compound which decreases the activity of the polypeptide is identified as a potential agent for decreasing the activity of the human MASP-like serine protease.
46. The method of claim 45 wherein the step of contacting is in a cell.
47. The method of claim 45 wherein the cell is in vitro.
48. The method of claim 45 wherein the step of contacting is in a cell-free system.
49. A method of screening for agents which modulate an activity of a human MASP-like serine protease, comprising the steps of: contacting a test compound with a product encoded by a polynucleotide which comprises the nucleotide sequence shown in SEQ ID NO: 1; and detecting binding of the test compound to the product, wherein a test compound which binds to the product is identified as a potential agent for regulating the activity of the human MASP-like serine protease.
50. The method of claim 49 wherein the product is a polypeptide.
51. The method of claim 49 wherein the product is RNA.
52. A method of reducing activity of a human MASP-like serine protease, comprising the step of: contacting a cell with a reagent which specifically binds to a product encoded by a polynucleotide comprising the nucleotide sequence shown in SEQ ID NO: 1, whereby the activity of a human MASP-like serine protease is reduced.
53. The method of claim 52 wherein the product is a polypeptide.
54. The method of claim 53 wherein the reagent is an antibody.
55. The method of claim 52 wherein the product is RNA.
56. The method of claim 55 wherein the reagent is an antisense oligonucleotide.
57. The method of claim 56 wherein the reagent is a ribozyme.
58. The method of claim 52 wherein the cell is in vitro.
59. The method of claim 52 wherein the cell is in vivo.
60. A pharmaceutical composition, comprising: a reagent which specifically binds to a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2; and a pharmaceutically acceptable carrier.
61. The pharmaceutical composition of claim 60 wherein the reagent is an antibody.
62. A pharmaceutical composition, comprising: a reagent which specifically binds to a product of a polynucleotide comprising the nucleotide sequence shown in SEQ ID NO: 1; and a pharmaceutically acceptable carrier.
63. The pharmaceutical composition of claim 62 wherein the reagent is a ribozyme.
64. The pharmaceutical composition of claim 62 wherein the reagent is an antisense oligonucleotide.
65. The pharmaceutical composition of claim 62 wherein the reagent is an antibody.
66. A pharmaceutical composition, comprising: an expression vector encoding a polypeptide comprising the amino acid sequence shown in SEQ ID NO: 2; and a pharmaceutically acceptable carrier.
67. The pharmaceutical composition of claim 66 wherein the expression vector comprises SEQ ID NO: 1.
68. A method of treating a MASP-like serine protease dysfunction related disease, wherein the disease is extracellular matrix degradation or a disease caused by pathogens, comprising the step of: administering to a patient in need thereof a therapeutically effective dose of a reagent that modulates a function of a human MASP-like serine protease, whereby symptoms of the MASP-like serine protease dysfunction related disease are ameliorated.
69. The method of claim 68 wherein the reagent is identified by the method of claim 36.
70. The method of claim 68 wherein the reagent is identified by the method of claim 45.
71. The method of claim 68 wherein the reagent is identified by the method of claim 49.
Description:
REGULATION OF HUMAN MASP-LIKE SERINE PROTEASE

TECHNICAL FIELD OF THE INVENTION

The invention relates to the area of regulation of MASP-like serine protease.

BACKGROUND OF THE INVENTION

Mannose-binding lectin-associated serine protease (MASP) is a member of the serine protease superfamily. Endo et al., J. Immunol. 161, 4924-30, 1998. In humans, MASP plays a role in the host defense against pathogens via the lectin pathway, a system of complement activation. Id. This pathway is initiated by the binding of MASP to carbohydrates. Takahashi et al., Int. Immunol. 11, 859-63, 1999. Because of the importance of defending the human host against a variety of pathogens, there is a need in the art to identify additional members of this protein family which can be regulated to provide therapeutic effects.

SUMMARY OF THE INVENTION

It is an object of the invention to provide reagents and methods of regulating extracellular matrix degradation. These and other objects of the invention are provided by one or more of the embodiments described below.

One embodiment of the invention is a MASP-like serine protease polypeptide comprising an amino acid sequence selected from the group consisting of : amino acid sequences which are at least about 50% identical to the amino acid sequence shown in SEQ ID NO: 2; and the amino acid sequence shown in SEQ ID NO: 2.

Yet another embodiment of the invention is a method of screening for agents which decrease extracellular matrix degradation. A test compound is contacted with a MASP-like serine protease polypeptide comprising an amino acid sequence selected from the group consisting of : amino acid sequences which are at least about 50% identical to the amino acid sequence shown in SEQ ID NO: 2; and the amino acid sequence shown in SEQ ID NO: 2.

Binding between the test compound and the MASP-like serine protease polypeptide is detected. A test compound which binds to the MASP-like serine protease polypeptide is thereby identified as a potential agent for decreasing extracellular matrix degradation. The agent can work by decreasing the activity of the MASP-like serine protease.

Another embodiment of the invention is a method of screening for agents which decrease extracellular matrix degradation. A test compound is contacted with a polynucleotide encoding a MASP-like serine protease polypeptide, wherein the polynucleotide comprises a nucleotide sequence selected from the group consisting of: nucleotide sequences which are at least about 50% identical to the nucleotide sequence shown in SEQ ID NO: 1; and the nucleotide sequence shown in SEQ ID NO: 1.

Binding of the test compound to the polynucleotide is detected. A test compound which binds to the polynucleotide is identified as a potential agent for decreasing extracellular matrix degradation. The agent can work by decreasing the amount of the MASP-like serine protease through interacting with the MASP-like serine protease mRNA.

Another embodiment of the invention is a method of screening for agents which regulate extracellular matrix degradation. A test compound is contacted with a MASP-like serine protease polypeptide comprising an amino acid sequence selected from the group consisting of : amino acid sequences which are at least about 50% identical to the amino acid sequence shown in SEQ ID NO: 2; and the amino acid sequence shown in SEQ ID NO: 2.

A MASP-like serine protease activity of the polypeptide is detected. A test compound which increases MASP-like serine protease activity of the polypeptide relative to MASP-like serine protease activity in the absence of the test compound is thereby identified as a potential agent for increasing extracellular matrix degradation.

A test compound which decreases MASP-like serine protease activity of the polypeptide relative to MASP-like serine protease activity in the absence of the test compound is thereby identified as a potential agent for decreasing extracellular matrix degradation.
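The comparison described above, activity measured with the test compound against activity measured without it, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the returned labels, and the `tolerance` parameter (a relative change below which no effect is called) are assumptions introduced here and are not part of the application.

```python
def classify_test_compound(activity_with: float,
                           activity_without: float,
                           tolerance: float = 0.0) -> str:
    """Compare protease activity with and without a test compound.

    `tolerance` is a hypothetical relative threshold (0.1 means 10%)
    below which a change is treated as no effect.
    """
    if activity_without <= 0:
        raise ValueError("control activity must be positive")
    # Relative change with respect to the no-compound control.
    change = (activity_with - activity_without) / activity_without
    if change > tolerance:
        return "potential agent for increasing activity"
    if change < -tolerance:
        return "potential agent for decreasing activity"
    return "no effect detected"
```

A call such as `classify_test_compound(40.0, 100.0)` would flag the compound as a candidate for decreasing activity; the units of the activity readout are whatever the chosen assay reports.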

Yet another embodiment of the invention is a method of screening for agents which decrease extracellular matrix degradation. A test compound is contacted with a MASP-like serine protease product of a polynucleotide which comprises a nucleotide sequence selected from the group consisting of: nucleotide sequences which are at least about 50% identical to the nucleotide sequence shown in SEQ ID NO: 1; and the nucleotide sequence shown in SEQ ID NO: 1.

Binding of the test compound to the MASP-like serine protease product is detected.

A test compound which binds to the MASP-like serine protease product is thereby identified as a potential agent for decreasing extracellular matrix degradation.

Still another embodiment of the invention is a method of reducing extracellular matrix degradation. A cell is contacted with a reagent which specifically binds to a polynucleotide encoding a MASP-like serine protease polypeptide or the product encoded by the polynucleotide, wherein the polynucleotide comprises a nucleotide sequence selected from the group consisting of : nucleotide sequences which are at least about 50% identical to the nucleotide sequence shown in SEQ ID NO: 1; and the nucleotide sequence shown in SEQ ID NO: 1.

MASP-like serine protease activity in the cell is thereby decreased.

The invention thus provides reagents and methods for regulating MASP-like serine protease activity which can be used, inter alia, to treat pathogenic infections.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 shows the DNA sequence encoding a MASP-like serine protease polypeptide (SEQ ID NO: 1).

Fig. 2 shows the amino acid sequence deduced from the DNA sequence of Fig. 1 (SEQ ID NO: 2).

Fig. 3 shows the amino acid sequence of a protein identified by EMBL Accession No. AB009074 (SEQ ID NO: 3).

Fig. 4 shows the amino acid sequence of pfam/hmm/trypsin (SEQ ID NO: 4).

Fig. 5 shows the BLASTP alignment of SEQ ID NO: 2 with SEQ ID NO: 3.

Fig. 6 shows the Prosite search results.

Fig. 7 shows the BLOCKS search results.

Fig. 8 shows the HMMPFAM alignment of 1422TR3 (SEQ ID NO: 2) against pfam/hmm/trypsin (SEQ ID NO: 4).

DETAILED DESCRIPTION OF THE INVENTION

The invention relates to an isolated polynucleotide encoding a MASP-like serine protease polypeptide and being selected from the group consisting of:

a) a polynucleotide encoding a MASP-like serine protease polypeptide comprising an amino acid sequence selected from the group consisting of: amino acid sequences which are at least about 50% identical to the amino acid sequence shown in SEQ ID NO: 2; and the amino acid sequence shown in SEQ ID NO: 2;
b) a polynucleotide comprising the sequence of SEQ ID NO: 1;
c) a polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) and (b);
d) a polynucleotide the sequence of which deviates from the polynucleotide sequences specified in (a) to (c) due to the degeneration of the genetic code; and
e) a polynucleotide which represents a fragment, derivative or allelic variation of a polynucleotide sequence specified in (a) to (d).

Furthermore, it has been discovered by the present applicant that regulators of a MASP-like serine protease, particularly a human MASP-like serine protease, can be used to regulate degradation of the extracellular matrix. Human MASP-like serine protease as shown in SEQ ID NO: 2 is 64% identical over 282 amino acids to the protein identified by EMBL Accession No. AB009074 (SEQ ID NO: 3) and annotated as Triakis scyllium mannose-binding lectin-associated serine protease (FIG. 1). Human MASP-like serine protease contains domains typical of a trypsin family serine protease (FIGS. 2-4).

A coding sequence for SEQ ID NO: 2 is shown in SEQ ID NO: 1. This coding sequence is found within genomic clones identified with GenBank Accession Nos. AC069069, AC007920, AC034190, and AC046154. An EST identified with EMBL Accession No. AW855182 (423 bp) is contained within SEQ ID NO: 1, indicating that human MASP-like serine protease is expressed.

Human MASP-like serine protease is expected to be useful for the same purposes as previously identified mannose-binding lectin-associated serine proteases (Endo et al., J. Immunol. 161, 4924-30, 1998; Takahashi et al., Int. Immunol. 11, 859-63, 1999; Matsushita et al., J. Immunol. 164, 2281-84, 2000). Human MASP-like serine protease is expected to be especially useful for treating diseases caused by pathogens, including but not limited to, viruses, bacteria, mycoplasma, fungi, protozoa, helminths, rickettsia, chlamydiae, parasites, prions, and the like.

Polypeptides

MASP-like serine protease polypeptides according to the invention comprise an amino acid sequence as shown in SEQ ID NO: 2, a portion of SEQ ID NO: 2 comprising at least 6, 15, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250, or 275 contiguous amino acids, or a biologically active variant of the amino acid sequence shown in SEQ ID NO: 2, as defined below. A MASP-like serine protease polypeptide of the invention therefore can be a portion of a MASP-like serine protease molecule, a full-length MASP-like serine protease molecule, or a fusion protein comprising all or a portion of a MASP-like serine protease molecule.

Biologically Active Variants

MASP-like serine protease variants which are biologically active, i.e., retain a MASP-like serine protease activity, also are MASP-like serine protease polypeptides.

Preferably, naturally or non-naturally occurring MASP-like serine protease variants have amino acid sequences which are at least about 50, 55, 60, 65, 70, preferably about 75, 90, 96, or 98% identical to an amino acid sequence shown in SEQ ID NO: 2. Percent identity between a putative MASP-like serine protease variant and an amino acid sequence of SEQ ID NO: 2 is determined using the Blast2 alignment program.
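As a rough illustration of how a percent identity figure relates to the thresholds listed above, the following Python sketch computes identity from two sequences that have already been aligned (for example, by an alignment program). It does not reproduce Blast2 itself; the function names and the choice of denominator (the full alignment length, gap columns included) are assumptions made for this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical columns / total alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and
    # neither position is a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 50.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The default threshold of 50.0 mirrors the lowest value recited in the text; any of the higher values (75, 90, 96, 98) can be passed instead.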

Variations in percent identity can be due, for example, to amino acid substitutions, insertions, or deletions. Amino acid substitutions are defined as one-for-one amino acid replacements. They are conservative in nature when the substituted amino acid has similar structural and/or chemical properties. Examples of conservative replacements are substitution of a leucine with an isoleucine or valine, an aspartate with a glutamate, or a threonine with a serine.
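The examples of conservative replacements given above can be encoded as a small lookup. The sketch below is illustrative only: it covers just the three example groups named in the text, and treating each example as a symmetric group (so that, say, isoleucine to valine also counts) is an assumption made here rather than a statement from the application.

```python
# One-letter codes for the example groups named in the text:
# leucine/isoleucine/valine, aspartate/glutamate, threonine/serine.
CONSERVATIVE_GROUPS = [frozenset("LIV"), frozenset("DE"), frozenset("TS")]


def is_conservative(original: str, replacement: str) -> bool:
    """True if a one-for-one replacement stays within an example group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)
```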

Amino acid insertions or deletions are changes to or within an amino acid sequence. They typically fall in the range of about 1 to 5 amino acids. Guidance in determining which amino acid residues can be substituted, inserted, or deleted without abolishing biological or immunological activity can be found using computer programs well known in the art, such as DNASTAR software. Whether an amino acid change results in a biologically active MASP-like serine protease polypeptide can readily be determined by assaying for MASP-like serine protease activity, as is known in the art and described, for example, in Matsushita et al., 2000.

Fusion Proteins

Fusion proteins are useful for generating antibodies against MASP-like serine protease amino acid sequences and for use in various assay systems. For example, fusion proteins can be used to identify proteins which interact with portions of a MASP-like serine protease polypeptide, including its active site and fibronectin domains. Methods such as protein affinity chromatography or library-based assays for protein-protein interactions, such as the yeast two-hybrid or phage display systems, can be used for this purpose. Such methods are well known in the art and also can be used as drug screens.

A MASP-like serine protease fusion protein comprises two protein segments fused together by means of a peptide bond. Contiguous amino acids for use in a fusion protein can be selected from the amino acid sequence shown in SEQ ID NO: 2 or from biologically active variants of that sequence, such as those described above.

For example, the first protein segment can comprise at least 6, 15, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250, or 275 or more contiguous amino acids of SEQ ID NO: 2 or a biologically active variant. Preferably, a fusion protein comprises the active site of the protease and/or one or both of the fibronectin domains. The first protein segment also can comprise full-length MASP-like serine protease.

The second protein segment can be a full-length protein or a protein fragment or polypeptide. Proteins commonly used in fusion protein construction include β-galactosidase, β-glucuronidase, green fluorescent protein (GFP), autofluorescent proteins, including blue fluorescent protein (BFP), glutathione-S-transferase (GST), luciferase, horseradish peroxidase (HRP), and chloramphenicol acetyltransferase (CAT). Additionally, epitope tags are used in fusion protein constructions, including histidine (His) tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags. Other fusion constructions can include maltose binding protein (MBP), S-tag, LexA DNA binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) VP16 protein fusions.

A fusion protein also can be engineered to contain a cleavage site located between the MASP-like serine protease polypeptide-encoding sequence and the heterologous protein sequence, so that the MASP-like serine protease polypeptide can be cleaved and purified away from the heterologous moiety.

A fusion protein can be synthesized chemically, as is known in the art. Preferably, a fusion protein is produced by covalently linking two protein segments or by standard procedures in the art of molecular biology. Recombinant DNA methods can be used to prepare fusion proteins, for example, by making a DNA construct which comprises MASP-like serine protease coding sequences disclosed herein in proper reading frame with nucleotides encoding the second protein segment and expressing the DNA construct in a host cell, as is known in the art. Many kits for constructing fusion proteins are available from companies such as Promega Corporation (Madison, WI), Stratagene (La Jolla, CA), CLONTECH (Mountain View, CA), Santa Cruz Biotechnology (Santa Cruz, CA), MBL International Corporation (MIC; Watertown, MA), and Quantum Biotechnologies (Montreal, Canada; 1-888-DNA-KITS).
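The requirement that the two coding sequences be joined "in proper reading frame" can be illustrated with a minimal in-silico check. The Python sketch below is an assumption-laden illustration, not a cloning protocol: the function name is hypothetical, the inputs are taken to be plain DNA coding sequences that each start in frame, and no linker or cleavage site is modeled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(first_cds: str, second_cds: str) -> str:
    """Join two coding sequences so the second stays in reading frame.

    Assumes both inputs start at codon position 1. A terminal stop
    codon on the first segment is removed; an internal stop is an error.
    """
    first, second = first_cds.upper(), second_cds.upper()
    for name, seq in (("first", first), ("second", second)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} segment is not a whole number of codons")
    # Drop a terminal stop codon so translation continues into the
    # second segment.
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    codons = [first[i:i + 3] for i in range(0, len(first), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("first segment contains an internal stop codon")
    return first + second
```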

Identification of Species Homologs

Species homologs of human MASP-like serine protease can be obtained using MASP-like serine protease polynucleotides (described below) to make suitable probes or primers for screening cDNA expression libraries from other species, such as mice, monkeys, or yeast, identifying cDNAs which encode homologs of MASP-like serine protease, and expressing the cDNAs as is known in the art.

Polynucleotides

A MASP-like serine protease polynucleotide can be single- or double-stranded and comprises a coding sequence or the complement of a coding sequence for a MASP-like serine protease polypeptide. A partial coding sequence of a MASP-like serine protease polynucleotide is shown in SEQ ID NO: 1.

Degenerate nucleotide sequences encoding human MASP-like serine protease polypeptides, as well as homologous nucleotide sequences which are at least about 50, 55, 60, 65, 70, preferably about 75, 90, 96, or 98% identical to the MASP-like serine protease nucleotide sequence shown in SEQ ID NO: 1 also are MASP-like serine protease polynucleotides. Percent sequence identity between the sequences of two polynucleotides is determined using computer programs such as ALIGN which employ the FASTA algorithm, using an affine gap search with a gap open penalty of -12 and a gap extension penalty of -2. Complementary DNA (cDNA) molecules, species homologs, and variants of MASP-like serine protease polynucleotides which encode biologically active MASP-like serine protease polypeptides also are MASP-like serine protease polynucleotides.
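To give a sense of how many degenerate nucleotide sequences can encode one polypeptide, the Python sketch below multiplies the number of synonymous codons for each residue under the standard genetic code. The function and table names are hypothetical, and the sketch assumes the standard code with one-letter amino acid symbols and no stop codon counted.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_degenerate_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences encoding `peptide`."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())
```

Because the count is a product over residues, it grows very quickly with length: a three-residue peptide of leucine, serine, and arginine already has 6 x 6 x 6 = 216 possible coding sequences.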

Identification of Variants and Homologs

Variants and homologs of the MASP-like serine protease polynucleotides disclosed above also are MASP-like serine protease polynucleotides. Typically, homologous MASP-like serine protease polynucleotide sequences can be identified by hybridization of candidate polynucleotides to known MASP-like serine protease polynucleotides under stringent conditions, as is known in the art. For example, using the following wash conditions: 2X SSC (0.3 M NaCl, 0.03 M sodium citrate, pH 7.0), 0.1% SDS, room temperature twice, 30 minutes each; then 2X SSC, 0.1% SDS, 50 °C once, 30 minutes; then 2X SSC, room temperature twice, 10 minutes each, homologous sequences can be identified which contain at most about 25-30% basepair mismatches. More preferably, homologous nucleic acid strands contain 15-25% basepair mismatches, even more preferably 5-15% basepair mismatches.
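The wash series above can be written down as data, which makes the salt arithmetic explicit. The Python sketch below is illustrative: the 1X SSC molarities are derived from the 2X composition given in the text (0.3 M NaCl, 0.03 M sodium citrate), while the `WashStep` type, the use of `None` for room temperature, and the function names are assumptions introduced here.

```python
from typing import NamedTuple, Optional

# 1X SSC, derived from the 2X composition stated in the text.
NACL_M_PER_1X = 0.15
CITRATE_M_PER_1X = 0.015


class WashStep(NamedTuple):
    ssc_fold: float                 # e.g. 2.0 for 2X SSC
    sds_percent: float              # 0.0 if no SDS
    temperature_c: Optional[float]  # None means room temperature
    repeats: int
    minutes_each: int


# The three-stage wash series described in the text.
WASH_PROTOCOL = [
    WashStep(2.0, 0.1, None, 2, 30),
    WashStep(2.0, 0.1, 50.0, 1, 30),
    WashStep(2.0, 0.0, None, 2, 10),
]


def ssc_molarity(fold: float) -> tuple:
    """(NaCl molarity, sodium citrate molarity) for a given SSC strength."""
    return (NACL_M_PER_1X * fold, CITRATE_M_PER_1X * fold)


def total_wash_minutes(protocol) -> int:
    """Total washing time, summed over all repeats of all steps."""
    return sum(step.repeats * step.minutes_each for step in protocol)
```

Summed over all repeats, the series amounts to 110 minutes of washing.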

Species homologs of the MASP-like serine protease polynucleotides disclosed herein can be identified by making suitable probes or primers and screening cDNA expression libraries from other species, such as mice, monkeys, or yeast. Human variants of MASP-like serine protease polynucleotides can be identified, for example, by screening human cDNA expression libraries. It is well known that the Tm of a double-stranded DNA decreases by 1-1.5 °C with every 1% decrease in homology (Bonner et al., J. Mol. Biol. 81, 123 (1973)). Variants of human MASP-like serine protease polynucleotides or MASP-like serine protease polynucleotides of other species can therefore be identified, for example, by hybridizing a putative homologous MASP-like serine protease polynucleotide with a polynucleotide having a nucleotide sequence of SEQ ID NO: 1 or an ephrin-like serine protease coding sequence of SEQ ID NO: 3 to form a test hybrid. The melting temperature of the test hybrid is compared with the melting temperature of a hybrid comprising MASP-like serine protease polynucleotides having perfectly complementary nucleotide sequences, and the number or percent of basepair mismatches within the test hybrid is calculated.
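The mismatch calculation described above follows directly from the stated rule of thumb (a 1-1.5 °C drop in Tm per 1% decrease in homology). The Python sketch below turns a measured Tm difference into a range of estimated percent mismatch; the function name and the decision to report a (low, high) range rather than a single value are assumptions made for this illustration.

```python
def mismatch_percent_range(tm_perfect_c: float, tm_test_c: float) -> tuple:
    """Estimated (low, high) percent basepair mismatch of a test hybrid.

    Uses the rule of thumb from the text: Tm falls by 1 to 1.5 degrees C
    for every 1% decrease in homology.
    """
    delta = tm_perfect_c - tm_test_c
    if delta < 0:
        raise ValueError("test hybrid Tm exceeds the perfect-match Tm")
    # 1.5 degrees per 1% gives the lower estimate, 1.0 the upper one.
    return (delta / 1.5, delta / 1.0)
```

Under this rule, a test hybrid melting 15 °C below the perfectly matched hybrid corresponds to roughly 10-15% mismatches.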

Nucleotide sequences which hybridize to MASP-like serine protease polynucleotides or their complements following stringent hybridization and/or wash conditions are also MASP-like serine protease polynucleotides. Stringent wash conditions are well known and understood in the art and are disclosed, for example, in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2d ed., 1989, at pages 9.50-9.51.

Typically, for stringent hybridization conditions a combination of temperature and salt concentration should be chosen that is approximately 12-20 °C below the calculated Tm of the hybrid under study. The Tm of a hybrid between a MASP-like serine protease polynucleotide having a coding sequence disclosed herein and a polynucleotide sequence which is at least about 50, preferably about 75, 90, 96, or 98% identical to that nucleotide sequence can be calculated, for example, using the equation of Bolton and McCarthy, Proc. Natl. Acad. Sci. U.S.A. 48, 1390 (1962): Tm = 81.5 °C + 16.6 (log10 [Na+]) + 0.41 (% G + C) - 0.63 (% formamide) - 600/l, where l = the length of the hybrid in basepairs.
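
As a worked illustration of this calculation, the following Python sketch evaluates the Tm equation and the 12-20 °C hybridization window. The function names and the example input values are assumptions chosen for illustration. The salt term is entered as + 16.6 log10 [Na+], the form in which this equation is commonly cited, so that Tm falls as the salt concentration is lowered.

```python
import math


def hybrid_tm(na_molar, gc_percent, formamide_percent, length_bp):
    """Return the calculated Tm (deg C) of a DNA hybrid.

    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.63*(%formamide) - 600/l
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and hybrid length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_bp)


def stringent_window(tm_c):
    """Return (lowest, highest) hybridization temperature, 12-20 deg C below Tm."""
    return (tm_c - 20.0, tm_c - 12.0)


# Hypothetical inputs: 1 M Na+, 50% G+C, no formamide, 600 bp hybrid.
tm = hybrid_tm(na_molar=1.0, gc_percent=50.0, formamide_percent=0.0, length_bp=600)
low, high = stringent_window(tm)
print(f"Tm = {tm:.1f} C; hybridize at about {low:.1f} to {high:.1f} C")
```

Adding formamide or shortening the hybrid lowers the calculated Tm, which is why the formamide-containing wash conditions are run at a lower temperature.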

Stringent wash conditions include, for example, 4X SSC at 65 °C, or 50% formamide, 4X SSC at 42 °C, or 0.5X SSC, 0.1% SDS at 65 °C. Highly stringent wash conditions include, for example, 0.2X SSC at 65 °C.

Preparation of Polynucleotides

A naturally occurring MASP-like serine protease polynucleotide can be isolated free of other cellular components such as membrane components, proteins, and lipids.

Polynucleotides can be made by a cell and isolated using standard nucleic acid purification techniques, synthesized using an amplification technique, such as the polymerase chain reaction (PCR), or synthesized using an automatic synthesizer.

Methods for isolating polynucleotides are routine and are known in the art. Any such technique for obtaining a polynucleotide can be used to obtain isolated MASP-like serine protease polynucleotides. For example, restriction enzymes and probes can be used to isolate polynucleotide fragments which comprise MASP-like serine protease nucleotide sequences. Isolated polynucleotides are in preparations which are free or at least 70, 80, or 90% free of other molecules.

MASP-like serine protease cDNA molecules can be made with standard molecular biology techniques, using MASP-like serine protease mRNA as a template. MASP-like serine protease cDNA molecules can thereafter be replicated using molecular biology techniques known in the art and disclosed in manuals such as Sambrook et al. (1989). An amplification technique, such as PCR, can be used to obtain additional copies of MASP-like serine protease polynucleotides, using either human genomic DNA or cDNA as a template.

Alternatively, synthetic chemistry techniques can be used to synthesize MASP-like serine protease polynucleotides. The degeneracy of the genetic code allows alternate nucleotide sequences to be synthesized which will encode a MASP-like serine protease polypeptide having, for example, the amino acid sequence shown in SEQ ID NO: 2 or a biologically active variant of that sequence.
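
The degeneracy mentioned above can be made concrete with a minimal Python sketch, which counts and lists the nucleotide sequences that encode a given peptide under the standard genetic code. The function names and the example tripeptide are assumptions for illustration and are not taken from SEQ ID NO: 2.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Invert the table: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_encodings(peptide):
    """Number of distinct coding sequences for a peptide (one-letter codes)."""
    total = 1
    for aa in peptide.upper():
        total *= len(SYNONYMS[aa])
    return total


def all_encodings(peptide):
    """List every coding sequence for a short peptide (grows very quickly)."""
    choices = [SYNONYMS[aa] for aa in peptide.upper()]
    return ["".join(codons) for codons in product(*choices)]


# Hypothetical tripeptide Met-Lys-Leu: 1 x 2 x 6 = 12 alternate sequences.
print(count_encodings("MKL"))
print(all_encodings("MKL")[:3])
```

The count is the product of the number of synonymous codons at each position, so it becomes very large for a full-length polypeptide; only the count, not the full list, is practical at that scale.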

Obtaining Full-Length Polynucleotides

The partial sequence of SEQ ID NO: 1 or its complement can be used to identify the corresponding full-length gene from which it was derived. The partial sequences can be nick-translated or end-labeled with 32P using polynucleotide kinase and labeling methods known to those with skill in the art (BASIC METHODS IN MOLECULAR BIOLOGY, Davis et al., eds., Elsevier Press, N.Y., 1986). A lambda library prepared from human tissue can be directly screened with the labeled sequences of interest or the library can be converted en masse to pBluescript (Stratagene Cloning Systems, La Jolla, Calif. 92037) to facilitate bacterial colony screening (see Sambrook et al., 1989, pg. 1.20).

Both methods are well known in the art. Briefly, filters with bacterial colonies containing the library in pBluescript or bacterial lawns containing lambda plaques are denatured, and the DNA is fixed to the filters. The filters are hybridized with the labeled probe using hybridization conditions described by Davis et al., 1986. The partial sequences, cloned into lambda or pBluescript, can be used as positive controls to assess background binding and to adjust the hybridization and washing stringencies necessary for accurate clone identification. The resulting autoradiograms are compared to duplicate plates of colonies or plaques; each exposed spot corresponds to a positive colony or plaque. The colonies or plaques are selected and expanded, and the DNA is isolated from the colonies for further analysis and sequencing.

Positive cDNA clones are analyzed to determine the amount of additional sequence they contain using PCR with one primer from the partial sequence and the other primer from the vector. Clones with a larger vector-insert PCR product than the original partial sequence are analyzed by restriction digestion and DNA sequencing to determine whether they contain an insert of the same or similar size as the mRNA size determined from Northern blot analysis.

Once one or more overlapping cDNA clones are identified, the complete sequence of the clones can be determined, for example after exonuclease III digestion (McCombie et al., Methods 3, 33-40, 1991). A series of deletion clones are generated, each of which is sequenced. The resulting overlapping sequences are assembled into a single contiguous sequence of high redundancy (usually three to five overlapping sequences at each nucleotide position), resulting in a highly accurate final sequence.
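
The redundancy criterion above (three to five overlapping sequences at each position) can be checked with a short Python sketch. It assumes the deletion-clone reads have already been placed on the assembled sequence; the function names and the example read coordinates are hypothetical.

```python
def coverage_depth(reads, contig_length):
    """Count how many reads cover each position of the assembled sequence.

    reads: iterable of (start, end) positions, 0-based, end exclusive.
    """
    depth = [0] * contig_length
    for start, end in reads:
        for pos in range(max(start, 0), min(end, contig_length)):
            depth[pos] += 1
    return depth


def undercovered(depth, minimum=3):
    """Return positions whose redundancy falls below the chosen minimum."""
    return [pos for pos, count in enumerate(depth) if count < minimum]


# Hypothetical placement of five reads on a 10-nucleotide sequence.
reads = [(0, 6), (0, 8), (2, 10), (4, 10), (5, 10)]
depth = coverage_depth(reads, 10)
print(depth)
print(undercovered(depth))
```

Positions reported by `undercovered` are those where additional deletion clones would need to be sequenced to reach the stated redundancy.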

Various PCR-based methods can be used to extend the nucleic acid sequences encoding the disclosed portions of human MASP-like serine protease to detect upstream sequences such as promoters and regulatory elements. For example, restriction-site PCR uses universal primers to retrieve unknown sequence adjacent to a known locus (Sarkar, PCR Methods Applic. 2, 318-322, 1993). Genomic DNA is first amplified in the presence of a primer to a linker sequence and a primer specific to the known region. The amplified sequences are then subjected to a second round of PCR with the same linker primer and another specific primer internal to the first one. Products of each round of PCR are transcribed with an appropriate RNA polymerase and sequenced using reverse transcriptase.

Inverse PCR also can be used to amplify or extend sequences using divergent primers based on a known region (Triglia et al., Nucleic Acids Res. 16, 8186, 1988). Primers can be designed using commercially available software, such as OLIGO 4.06 Primer Analysis software (National Biosciences Inc., Plymouth, Minn.), to be 22-30 nucleotides in length, to have a GC content of 50% or more, and to anneal to the target sequence at temperatures about 68-72 °C. The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template.
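
The length and GC-content criteria for primers stated above can be screened with a few lines of Python. This is a minimal sketch, not a substitute for primer analysis software: the function names and the example primer are hypothetical, and the annealing-temperature criterion is not modeled.

```python
def gc_percent(seq):
    """Percentage of G and C residues in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def meets_primer_criteria(primer, min_len=22, max_len=30, min_gc=50.0):
    """Check the two sequence-only criteria: 22-30 nt and GC content >= 50%."""
    return min_len <= len(primer) <= max_len and gc_percent(primer) >= min_gc


# Hypothetical 24-nucleotide primer, for illustration only.
primer = "GCGTACCTGGACGTTCAGGCTAGC"
print(len(primer), gc_percent(primer), meets_primer_criteria(primer))
```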

Another method which can be used is capture PCR, which involves PCR amplification of DNA fragments adjacent to a known sequence in human and yeast artificial chromosome DNA (Lagerstrom et al., PCR Methods Applic. 1, 111-119, 1991). In this method, multiple restriction enzyme digestions and ligations are used to place an engineered double-stranded sequence into an unknown fragment of the DNA molecule before performing PCR.

Another method which can be used to retrieve unknown sequences is that of Parker et al., Nucleic Acids Res. 19, 3055-3060, 1991. Additionally, PCR, nested primers, and PROMOTERFINDER libraries (CLONTECH, Palo Alto, Calif.) can be used to walk genomic DNA. This process avoids the need to screen libraries and is useful in finding intron/exon junctions.

When screening for full-length cDNAs, it is preferable to use libraries that have been size-selected to include larger cDNAs. Also, random-primed libraries are preferable, in that they will contain more sequences which contain the 5' regions of genes. Use of a randomly primed library may be especially preferable for situations in which an oligo d(T) library does not yield a full-length cDNA. Genomic libraries can be useful for extension of sequence into 5' non-transcribed regulatory regions.

Commercially available capillary electrophoresis systems can be used to analyze the size or confirm the nucleotide sequence of PCR or sequencing products. For example, capillary sequencing can employ flowable polymers for electrophoretic separation, four different fluorescent dyes (one for each nucleotide) which are laser activated, and detection of the emitted wavelengths by a charge coupled device camera. Output/light intensity can be converted to an electrical signal using appropriate software (e.g., GENOTYPER and Sequence NAVIGATOR, Perkin Elmer), and the entire process from loading of samples to computer analysis and electronic data display can be computer controlled. Capillary electrophoresis is especially preferable for the sequencing of small pieces of DNA which might be present in limited amounts in a particular sample.

Obtaining Polypeptides

MASP-like serine protease polypeptides can be obtained, for example, by purification from human cells, by expression of MASP-like serine protease polynucleotides, or by direct chemical synthesis.

Protein Purification

MASP-like serine protease polypeptides can be purified from human cells, such as primary tumor cells, metastatic cells, or cancer cell lines (e.g., colon cancer cell lines HCT116, DLD1, HT29, Caco2, SW837, SW480, and RKO, breast cancer cell lines 21-PT, 21-MT, MDA-468, SK-BR3, and BT-474, the A549 lung cancer cell line, or the H392 glioblastoma cell line). Carcinoma of the lung is an especially useful source of MASP-like serine protease polypeptides. A purified MASP-like serine protease polypeptide is separated from other compounds which normally associate with the MASP-like serine protease polypeptide in the cell, such as certain proteins, carbohydrates, or lipids, using methods well-known in the art. Such methods include, but are not limited to, size exclusion chromatography, ammonium sulfate fractionation, ion exchange chromatography, affinity chromatography, and preparative gel electrophoresis. A preparation of purified MASP-like serine protease polypeptides is at least 80% pure; preferably, the preparations are 90%, 95%, or 99% pure. Purity of the preparations can be assessed by any means known in the art, such as SDS-polyacrylamide gel electrophoresis. Enzymatic activity of the purified preparations can be assayed, for example, as described in Matsushita et al., 2000.

Expression of Polynucleotides

To express a MASP-like serine protease polypeptide, a MASP-like serine protease polynucleotide can be inserted into an expression vector which contains the necessary elements for the transcription and translation of the inserted coding sequence. Methods which are well known to those skilled in the art can be used to construct expression vectors containing sequences encoding MASP-like serine protease polypeptides and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are described, for example, in Sambrook et al. (1989) and Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, N.Y., 1989.

A variety of expression vector/host systems can be utilized to contain and express sequences encoding a MASP-like serine protease polypeptide. These include, but are not limited to, microorganisms, such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with virus expression vectors (e.g., baculovirus); plant cell systems transformed with virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems.

The control elements or regulatory sequences are those non-translated regions of the vector--enhancers, promoters, 5' and 3' untranslated regions--which interact with host cellular proteins to carry out transcription and translation. Such elements can vary in their strength and specificity. Depending on the vector system and host utilized, any number of suitable transcription and translation elements, including constitutive and inducible promoters, can be used. For example, when cloning in bacterial systems, inducible promoters such as the hybrid lacZ promoter of the BLUESCRIPT phagemid (Stratagene, La Jolla, Calif.) or pSPORT1 plasmid (Life Technologies) and the like can be used. The baculovirus polyhedrin promoter can be used in insect cells. Promoters or enhancers derived from the genomes of plant cells (e.g., heat shock, RUBISCO, and storage protein genes) or from plant viruses (e.g., viral promoters or leader sequences) can be cloned into the vector. In mammalian cell systems, promoters from mammalian genes or from mammalian viruses are preferable. If it is necessary to generate a cell line that contains multiple copies of a nucleotide sequence encoding a MASP-like serine protease polypeptide, vectors based on SV40 or EBV can be used with an appropriate selectable marker.

Bacterial and Yeast Expression Systems

In bacterial systems, a number of expression vectors can be selected depending upon the use intended for the MASP-like serine protease polypeptide. For example, when a large quantity of a MASP-like serine protease polypeptide is needed for the induction of antibodies, vectors which direct high level expression of fusion proteins that are readily purified can be used. Such vectors include, but are not limited to, multifunctional E. coli cloning and expression vectors such as BLUESCRIPT (Stratagene), in which the sequence encoding the MASP-like serine protease polypeptide can be ligated into the vector in frame with sequences for the amino-terminal Met and the subsequent 7 residues of β-galactosidase so that a hybrid protein is produced. pIN vectors (Van Heeke & Schuster, J. Biol. Chem. 264, 5503-5509, 1989) or pGEX vectors (Promega, Madison, Wis.) can be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. Proteins made in such systems can be designed to include heparin, thrombin, or Factor Xa protease cleavage sites so that the cloned polypeptide of interest can be released from the GST moiety at will.

In the yeast Saccharomyces cerevisiae, a number of vectors containing constitutive or inducible promoters such as alpha factor, alcohol oxidase, and PGH can be used. For reviews, see Ausubel et al. (1989) and Grant et al., Methods Enzymol. 153, 516-544, 1987.

Plant and Insect Expression Systems

If plant expression vectors are used, the expression of sequences encoding MASP-like serine protease polypeptides can be driven by any of a number of promoters. For example, viral promoters such as the 35S and 19S promoters of CaMV can be used alone or in combination with the omega leader sequence from TMV (Takamatsu, EMBO J. 6, 307-311, 1987). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters can be used (Coruzzi et al., EMBO J. 3, 1671-1680, 1984; Broglie et al., Science 224, 838-843, 1984; Winter et al., Results Probl. Cell Differ. 17, 85-105, 1991). These constructs can be introduced into plant cells by direct DNA transformation or by pathogen-mediated transfection. Such techniques are described in a number of generally available reviews (see, for example, Hobbs or Murray, in McGRAW HILL YEARBOOK OF SCIENCE AND TECHNOLOGY, McGraw Hill, New York, N.Y., pp. 191-196, 1992).

An insect system also can be used to express a MASP-like serine protease polypeptide. For example, in one such system Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes in Spodoptera frugiperda cells or in Trichoplusia larvae. Sequences encoding MASP-like serine protease polypeptides can be cloned into a non-essential region of the virus, such as the polyhedrin gene, and placed under control of the polyhedrin promoter. Successful insertion of MASP-like serine protease polypeptides will render the polyhedrin gene inactive and produce recombinant virus lacking coat protein. The recombinant viruses can then be used to infect, for example, S. frugiperda cells or Trichoplusia larvae in which MASP-like serine protease polypeptides can be expressed (Engelhard et al., Proc. Nat. Acad. Sci. 91, 3224-3227, 1994).

Mammalian Expression Systems

A number of viral-based expression systems can be utilized in mammalian host cells. For example, if an adenovirus is used as an expression vector, sequences encoding MASP-like serine protease polypeptides can be ligated into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome can be used to obtain a viable virus which is capable of expressing a MASP-like serine protease polypeptide in infected host cells (Logan & Shenk, Proc. Natl. Acad. Sci. 81, 3655-3659, 1984). In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, can be used to increase expression in mammalian host cells.

Human artificial chromosomes (HACs) also can be used to deliver larger fragments of DNA than can be contained and expressed in a plasmid. HACs of 6M to 10M are constructed and delivered to cells via conventional delivery methods (e.g., liposomes, polycationic amino polymers, or vesicles).

Specific initiation signals also can be used to achieve more efficient translation of sequences encoding MASP-like serine protease polypeptides. Such signals include the ATG initiation codon and adjacent sequences. In cases where sequences encoding a MASP-like serine protease polypeptide, its initiation codon, and upstream sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous translational control signals (including the ATG initiation codon) should be provided.

The initiation codon should be in the correct reading frame to ensure translation of the entire insert. Exogenous translational elements and initiation codons can be of various origins, both natural and synthetic. The efficiency of expression can be enhanced by the inclusion of enhancers which are appropriate for the particular cell system which is used (see Scharf et al., Results Probl. Cell Differ. 20, 125-162, 1994).
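
The reading-frame requirement for an inserted coding sequence can be checked before cloning with a short Python sketch. The function name, the three checks chosen, and the example inserts are assumptions for illustration; the sketch covers only the case in which the insert is expected to begin at its own ATG initiation codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_problems(insert):
    """Return a list of reading-frame problems found in a coding insert.

    An empty list means the insert starts with ATG, has a length that is a
    multiple of three, and has no in-frame stop codon before its last codon.
    """
    seq = insert.upper()
    problems = []
    if not seq.startswith("ATG"):
        problems.append("no ATG initiation codon at position 1")
    if len(seq) % 3:
        problems.append("length is not a multiple of three")
    # Split into complete codons, reading from the first nucleotide.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"in-frame stop codon {codon} at codon {index + 1}")
            break
    return problems


# Hypothetical inserts, for illustration only.
print(frame_problems("ATGGCCTGA"))
print(frame_problems("ATGTAAGCCTGA"))
```

An insert that fails any of these checks would not be translated in full from its own initiation codon, which is the situation the paragraph above warns against.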

Host Cells

A host cell strain can be chosen for its ability to modulate the expression of the inserted sequences or to process an expressed MASP-like serine protease polypeptide in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a "prepro" form of the polypeptide also can be used to facilitate correct insertion, folding and/or function.

Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture Collection (ATCC; 10801 University Boulevard, Manassas, VA 20110-2209) and can be chosen to ensure the correct modification and processing of the foreign protein.

Stable expression is preferred for long-term, high-yield production of recombinant proteins. For example, cell lines which stably express MASP-like serine protease polypeptides can be transformed using expression vectors which can contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Following the introduction of the vector, cells can be allowed to grow for 1-2 days in an enriched medium before they are switched to a selective medium. The purpose of the selectable marker is to confer resistance to selection, and its presence allows growth and recovery of cells which successfully express the introduced MASP-like serine protease sequences. Resistant clones of stably transformed cells can be proliferated using tissue culture techniques appropriate to the cell type.

Any number of selection systems can be used to recover transformed cell lines. These include, but are not limited to, the herpes simplex virus thymidine kinase (Wigler et al., Cell 11, 223-32, 1977) and adenine phosphoribosyltransferase (Lowy et al., Cell 22, 817-23, 1980) genes, which can be employed in tk- or aprt- cells, respectively. Also, antimetabolite, antibiotic, or herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to methotrexate (Wigler et al., Proc. Natl. Acad. Sci. 77, 3567-70, 1980); npt confers resistance to the aminoglycosides neomycin and G-418 (Colbere-Garapin et al., J. Mol. Biol. 150, 1-14, 1981); and als and pat confer resistance to chlorsulfuron and phosphinothricin acetyltransferase, respectively (Murray, 1992, supra). Additional selectable genes have been described, for example trpB, which allows cells to utilize indole in place of tryptophan, or hisD, which allows cells to utilize histidinol in place of histidine (Hartman & Mulligan, Proc. Natl. Acad. Sci. 85, 8047-51, 1988). Visible markers such as anthocyanins, β-glucuronidase and its substrate GUS, and luciferase and its substrate luciferin, can be used to identify transformants and to quantify the amount of transient or stable protein expression attributable to a specific vector system (Rhodes et al., Methods Mol. Biol. 55, 121-131, 1995).

Detecting Expression of Polypeptides

Although the presence of marker gene expression suggests that the MASP-like serine protease polynucleotide is also present, its presence and expression may need to be confirmed. For example, if a sequence encoding a MASP-like serine protease polypeptide is inserted within a marker gene sequence, transformed cells containing sequences which encode a MASP-like serine protease polypeptide can be identified by the absence of marker gene function. Alternatively, a marker gene can be placed in tandem with a sequence encoding a MASP-like serine protease polypeptide under the control of a single promoter. Expression of the marker gene in response to induction or selection usually indicates expression of the MASP-like serine protease polynucleotide.

Alternatively, host cells which contain a MASP-like serine protease polynucleotide and which express a MASP-like serine protease polypeptide can be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations and protein bioassay or immunoassay techniques which include membrane, solution, or chip-based technologies for the detection and/or quantification of nucleic acid or protein.

The presence of a polynucleotide sequence encoding a MASP-like serine protease polypeptide can be detected by DNA-DNA or DNA-RNA hybridization or amplification using probes or fragments of polynucleotides encoding a MASP-like serine protease polypeptide. Nucleic acid amplification-based assays involve the use of oligonucleotides selected from sequences encoding a MASP-like serine protease polypeptide to detect transformants which contain a MASP-like serine protease polynucleotide.

A variety of protocols for detecting and measuring the expression of a MASP-like serine protease polypeptide, using either polyclonal or monoclonal antibodies specific for the polypeptide, are known in the art. Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay using monoclonal antibodies reactive to two non-interfering epitopes on a MASP-like serine protease polypeptide can be used, or a competitive binding assay can be employed. These and other assays are described in Hampton et al., SEROLOGICAL METHODS: A LABORATORY MANUAL, APS Press, St. Paul, Minn., 1990, and Maddox et al., J. Exp. Med. 158, 1211-1216, 1983.

A wide variety of labels and conjugation techniques are known by those skilled in the art and can be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization or PCR probes for detecting sequences related to polynucleotides encoding MASP-like serine protease polypeptides include oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, sequences encoding a MASP-like serine protease polypeptide can be cloned into a vector for the production of an mRNA probe. Such vectors are known in the art, are commercially available, and can be used to synthesize RNA probes in vitro by addition of labeled nucleotides and an appropriate RNA polymerase, such as T7, T3, or SP6. These procedures can be conducted using a variety of commercially available kits (Amersham Pharmacia Biotech, Promega, and US Biochemical). Suitable reporter molecules or labels which can be used for ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Expression and Purification of Polypeptides

Host cells transformed with nucleotide sequences encoding a MASP-like serine protease polypeptide can be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The polypeptide produced by a transformed cell can be secreted or contained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode MASP-like serine protease polypeptides can be designed to contain signal sequences which direct secretion of MASP-like serine protease polypeptides through a prokaryotic or eukaryotic cell membrane.

Other constructions can be used to join a sequence encoding a MASP-like serine protease polypeptide to a nucleotide sequence encoding a polypeptide domain which will facilitate purification of soluble proteins. Such purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex Corp., Seattle, Wash.). The inclusion of cleavable linker sequences such as those specific for Factor Xa or enterokinase (Invitrogen, San Diego, CA) between the purification domain and the MASP-like serine protease polypeptide can be used to facilitate purification. One such expression vector provides for expression of a fusion protein containing a MASP-like serine protease polypeptide and 6 histidine residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues facilitate purification on IMAC (immobilized metal ion affinity chromatography, as described in Porath et al., Prot. Exp. Purif. 3, 263-281, 1992), while the enterokinase cleavage site provides a means for purifying the MASP-like serine protease polypeptide from the fusion protein. Vectors which contain fusion proteins are disclosed in Kroll et al., DNA Cell Biol. 12, 441-453, 1993.

Chemical Synthesis

Sequences encoding a MASP-like serine protease polypeptide can be synthesized, in whole or in part, using chemical methods well known in the art (see Caruthers et al., Nucl. Acids Res. Symp. Ser. 215-223, 1980; Horn et al., Nucl. Acids Res. Symp. Ser. 225-232, 1980). Alternatively, a MASP-like serine protease polypeptide itself can be produced using chemical methods to synthesize its amino acid sequence. For example, MASP-like serine protease polypeptides can be produced by direct peptide synthesis using solid-phase techniques (Merrifield, J. Am. Chem. Soc. 85, 2149-2154, 1963; Roberge et al., Science 269, 202-204, 1995). Protein synthesis can be performed using manual techniques or by automation. Automated synthesis can be achieved, for example, using Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer). Various fragments of MASP-like serine protease polypeptides can be separately synthesized and combined using chemical methods to produce a full-length molecule.

The newly synthesized peptide can be substantially purified by preparative high performance liquid chromatography (e.g., Creighton, PROTEINS: STRUCTURES AND MOLECULAR PRINCIPLES, WH Freeman and Co., New York, N.Y., 1983). The composition of a synthetic MASP-like serine protease polypeptide can be confirmed by amino acid analysis or sequencing (e.g., the Edman degradation procedure; see Creighton, supra). Additionally, any portion of the amino acid sequence of the MASP-like serine protease polypeptide can be altered during direct synthesis and/or combined using chemical methods with sequences from other proteins to produce a variant polypeptide or a fusion protein.

Production of Altered Polypeptides

As will be understood by those of skill in the art, it may be advantageous to produce MASP-like serine protease polypeptide-encoding nucleotide sequences possessing non-naturally occurring codons. For example, codons preferred by a particular prokaryotic or eukaryotic host can be selected to increase the rate of protein expression or to produce an RNA transcript having desirable properties, such as a half-life which is longer than that of a transcript generated from the naturally occurring sequence.
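
The selection of host-preferred codons can be sketched in Python as a codon-by-codon substitution that leaves the encoded amino acid sequence unchanged. The preference table below is a hypothetical placeholder covering three amino acids, and the example coding sequence is invented for illustration; a real table would be taken from codon usage data for the chosen host.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical host preferences (illustration only): amino acid -> codon.
PREFERRED = {"L": "CTG", "K": "AAA", "R": "CGT"}


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Replace each codon with the host-preferred synonym, where one is listed."""
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


original = "ATGTTAAAGAGATAA"  # hypothetical sequence encoding M-L-K-R-stop
recoded = recode(original, PREFERRED)
print(recoded, translate(original) == translate(recoded))
```

Comparing the translations before and after recoding confirms that only the nucleotide sequence, not the polypeptide, has been altered.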

The nucleotide sequences disclosed herein can be engineered using methods generally known in the art to alter MASP-like serine protease polypeptide-encoding sequences for a variety of reasons, including modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides can be used to engineer the nucleotide sequences. For example, site-directed mutagenesis can be used to insert new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, introduce mutations, and so forth.

Antibodies

Any type of antibody known in the art can be generated to bind specifically to an epitope of a MASP-like serine protease polypeptide. "Antibody" as used herein includes intact immunoglobulin molecules, as well as fragments thereof, such as Fab, F(ab')2, and Fv, which are capable of binding an epitope of a MASP-like serine protease polypeptide. Typically, at least 6, 8, 10, or 12 contiguous amino acids are required to form an epitope. However, epitopes which involve non-contiguous amino acids may require more, e.g., at least 15, 25, or 50 amino acids.

An antibody which specifically binds to an epitope of a MASP-like serine protease polypeptide can be used therapeutically, as well as in immunochemical assays, including but not limited to Western blots, ELISAs, radioimmunoassays, immunohistochemical assays, immunoprecipitations, or other immunochemical assays known in the art. Various immunoassays can be used to identify antibodies having the desired specificity. Numerous protocols for competitive binding or immunoradiometric assays are well known in the art. Such immunoassays typically involve the measurement of complex formation between an immunogen and an antibody which specifically binds to the immunogen.

Typically, an antibody which specifically binds to a MASP-like serine protease polypeptide provides a detection signal at least 5-, 10-, or 20-fold higher than a detection signal provided with other proteins when used in an immunochemical assay. Preferably, antibodies which specifically bind to MASP-like serine protease polypeptides do not detect other proteins in immunochemical assays and can immunoprecipitate a MASP-like serine protease polypeptide from solution.

MASP-like serine protease polypeptides can be used to immunize a mammal, such as a mouse, rat, rabbit, guinea pig, monkey, or human, to produce polyclonal antibodies.

If desired, a MASP-like serine protease polypeptide can be conjugated to a carrier protein, such as bovine serum albumin, thyroglobulin, or keyhole limpet hemocyanin. Depending on the host species, various adjuvants can be used to increase the immunological response. Such adjuvants include, but are not limited to, Freund's adjuvant, mineral gels (e.g., aluminum hydroxide), and surface active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol). Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are especially useful.

Monoclonal antibodies which specifically bind to a MASP-like serine protease polypeptide can be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These techniques include, but are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique (Kohler et al., Nature 256, 495-497, 1985; Kozbor et al., J. Immunol. Methods 81, 31-42, 1985; Cote et al., Proc. Natl. Acad. Sci. 80, 2026-2030, 1983; Cole et al., Mol. Cell Biol. 62, 109-120, 1984).

In addition, techniques developed for the production of "chimeric antibodies," the splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can be used (Morrison et al., Proc. Natl. Acad. Sci. 81, 6851-6855, 1984; Neuberger et al., Nature 312, 604-608, 1984; Takeda et al., Nature 314, 452-454, 1985). Monoclonal and other antibodies also can be "humanized" to prevent a patient from mounting an immune response against the antibody when it is used therapeutically. Such antibodies may be sufficiently similar in sequence to human antibodies to be used directly in therapy or may require alteration of a few key residues. Sequence differences between rodent antibodies and human sequences can be minimized by replacing residues which differ from those in the human sequences by site-directed mutagenesis of individual residues or by grafting of entire complementarity determining regions. Alternatively, one can produce humanized antibodies using recombinant methods, as described in GB2188638B. Antibodies which specifically bind to a MASP-like serine protease polypeptide can contain antigen binding sites which are either partially or fully humanized, as disclosed in U.S. 5,565,332.

Alternatively, techniques described for the production of single chain antibodies can be adapted using methods known in the art to produce single chain antibodies which specifically bind to MASP-like serine protease polypeptides. Antibodies with related specificity, but of distinct idiotypic composition, can be generated by chain shuffling from random combinatorial immunoglobulin libraries (Burton, Proc. Natl. Acad. Sci. 88, 11120-23, 1991).

Single-chain antibodies also can be constructed using a DNA amplification method, such as PCR, using hybridoma cDNA as a template (Thirion et al., 1996, Eur. J. Cancer Prev. 5, 507-11). Single-chain antibodies can be mono- or bispecific, and can be bivalent or tetravalent. Construction of tetravalent, bispecific single-chain antibodies is taught, for example, in Coloma & Morrison, 1997, Nat. Biotechnol. 15, 159-63. Construction of bivalent, bispecific single-chain antibodies is taught in Mallender & Voss, 1994, J. Biol. Chem. 269, 199-206.

A nucleotide sequence encoding a single-chain antibody can be constructed using manual or automated nucleotide synthesis, cloned into an expression construct using standard recombinant DNA methods, and introduced into a cell to express the coding sequence, as described below. Alternatively, single-chain antibodies can be produced directly using, for example, filamentous phage technology (Verhaar et al., 1995, Int. J. Cancer 61, 497-501; Nicholls et al., 1993, J. Immunol. Meth. 165, 81-91).

Antibodies which specifically bind to MASP-like serine protease polypeptides also can be produced by inducing in vivo production in the lymphocyte population or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in the literature (Orlandi et al., Proc. Natl. Acad. Sci. 86, 3833-3837, 1989; Winter et al., Nature 349, 293-299, 1991).

Other types of antibodies can be constructed and used therapeutically in methods of the invention. For example, chimeric antibodies can be constructed as disclosed in WO 93/03151. Binding proteins which are derived from immunoglobulins and which are multivalent and multispecific, such as the "diabodies" described in WO 94/13804, also can be prepared.

Antibodies of the invention can be purified by methods well known in the art. For example, antibodies can be affinity purified by passage over a column to which a MASP-like serine protease polypeptide is bound. The bound antibodies can then be eluted from the column using a buffer with a high salt concentration.

Antisense Oligonucleotides

Antisense oligonucleotides are nucleotide sequences which are complementary to a specific DNA or RNA sequence. Once introduced into a cell, the complementary nucleotides combine with natural sequences produced by the cell to form complexes and block either transcription or translation. Preferably, an antisense oligonucleotide is at least 11 nucleotides in length, but can be at least 12, 15, 20, 25, 30, 35, 40, 45, or 50 or more nucleotides long. Longer sequences also can be used. Antisense oligonucleotide molecules can be provided in a DNA construct and introduced into a cell as described above to decrease the level of MASP-like serine protease gene products in the cell.
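As a minimal sketch of the complementarity requirement just described, the following Python function derives a DNA antisense oligonucleotide from a chosen stretch of an mRNA. The function name, the example mRNA, and the default length of 20 are assumptions made for illustration; only the minimum length of 11 nucleotides comes from the text above.

```python
# Map each RNA base of the target to the complementary DNA base of the oligo.
RNA_TO_COMPLEMENTARY_DNA = str.maketrans("ACGU", "TGCA")


def antisense_dna(mrna, start, length=20):
    """Return a DNA oligo, written 5'->3', complementary to mrna[start:start+length]."""
    if length < 11:
        raise ValueError("the text prefers oligonucleotides of at least 11 nucleotides")
    target = mrna[start:start + length].upper()
    if len(target) < length:
        raise ValueError("target region runs past the end of the mRNA")
    # Complement every base, then reverse so the oligo reads 5'->3'.
    return target.translate(RNA_TO_COMPLEMENTARY_DNA)[::-1]


example_mrna = "AUGGCUAGCUUAGGCAUCGAUCG"        # made-up sequence, not SEQ ID NO: 1
oligo = antisense_dna(example_mrna, 0, 11)      # "AAGCTAGCCAT"
```

The reversal step matters because nucleic acid strands pair antiparallel: the 5' end of the oligonucleotide pairs with the 3' end of the targeted region.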

Antisense oligonucleotides can be deoxyribonucleotides, ribonucleotides, or a combination of both. Oligonucleotides can be synthesized manually or by an automated synthesizer, by covalently linking the 5' end of one nucleotide with the 3' end of another nucleotide with non-phosphodiester internucleotide linkages such as alkylphosphonates, phosphorothioates, phosphorodithioates, alkylphosphonothioates, phosphoramidates, phosphate esters, carbamates, acetamidate, carboxymethyl esters, carbonates, and phosphate triesters. See Brown, Meth. Mol. Biol. 20, 1-8, 1994; Sonveaux, Meth. Mol. Biol. 26, 1-72, 1994; Uhlmann et al., Chem. Rev. 90, 543-583, 1990.

Modifications of MASP-like serine protease gene expression can be obtained by designing antisense oligonucleotides which will form duplexes to the control, 5', or regulatory regions of the MASP-like serine protease gene. Oligonucleotides derived from the transcription initiation site, e.g., between positions -10 and +10 from the start site, are preferred. Similarly, inhibition can be achieved using "triple helix" base-pairing methodology. Triple helix pairing is useful because it inhibits the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or chaperons. Therapeutic advances using triplex DNA have been described in the literature (e.g., Gee et al., in Huber & Carr, MOLECULAR AND IMMUNOLOGIC APPROACHES, Futura Publishing Co., Mt. Kisco, N.Y., 1994). An antisense oligonucleotide also can be designed to block translation of mRNA by preventing the transcript from binding to ribosomes.

Precise complementarity is not required for successful duplex formation between an antisense oligonucleotide and the complementary sequence of a MASP-like serine protease polynucleotide. Antisense oligonucleotides which comprise, for example, 2, 3, 4, or 5 or more stretches of contiguous nucleotides which are precisely complementary to a MASP-like serine protease polynucleotide, each separated by a stretch of contiguous nucleotides which are not complementary to adjacent MASP-like serine protease nucleotides, can provide targeting specificity for MASP-like serine protease mRNA. Preferably, each stretch of complementary contiguous nucleotides is at least 4, 5, 6, 7, or 8 or more nucleotides in length. Non-complementary intervening sequences are preferably 1, 2, 3, or 4 nucleotides in length. One skilled in the art can easily use the calculated melting point of an antisense-sense pair to determine the degree of mismatching which will be tolerated between a particular antisense oligonucleotide and a particular MASP-like serine protease polynucleotide sequence.
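The pattern of complementary stretches and intervening mismatches described above can be checked mechanically. The sketch below is illustrative only: the function names and example sequences are hypothetical, only Watson-Crick pairs are counted as complementary (G-U wobble pairing is ignored for simplicity), and the default thresholds use the smallest preferred stretch length (4) and the largest preferred gap length (4) from the text. It does not compute a melting point.

```python
from itertools import groupby

# Watson-Crick pairs between an oligo base (DNA or RNA) and a target base.
PAIRS = {("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def pairing_runs(antisense, target):
    """Align the antisense oligo (5'->3') antiparallel to the target (5'->3').

    Returns a list of (is_complementary, run_length) tuples along the target.
    """
    if len(antisense) != len(target):
        raise ValueError("sequences must be the same length")
    flags = [(a, t) in PAIRS for a, t in zip(reversed(antisense), target)]
    return [(flag, len(list(group))) for flag, group in groupby(flags)]


def meets_preferences(runs, min_match=4, max_gap=4):
    """Check for at least two complementary stretches with short gaps between them."""
    matches = [n for ok, n in runs if ok]
    gaps = [n for ok, n in runs if not ok]
    return (len(matches) >= 2
            and all(n >= min_match for n in matches)
            and all(n <= max_gap for n in gaps))


target = "AUGGCUAGCUU"          # made-up target region, 5'->3'
oligo = "AAGCTCGCCAT"           # antisense DNA with one internal mismatch
runs = pairing_runs(oligo, target)
```

Here `runs` is `[(True, 5), (False, 1), (True, 5)]`: two five-nucleotide complementary stretches separated by a single mismatch, which satisfies `meets_preferences`.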

Antisense oligonucleotides can be modified without affecting their ability to hybridize to a MASP-like serine protease polynucleotide. These modifications can be internal or at one or both ends of the antisense molecule. For example, internucleoside phosphate linkages can be modified by adding cholesteryl or diamine moieties with varying numbers of carbon residues between the amino groups and terminal ribose. Modified bases and/or sugars, such as arabinose instead of ribose, or a 3', 5'-substituted oligonucleotide in which the 3' hydroxyl group or the 5' phosphate group are substituted, also can be employed in a modified antisense oligonucleotide. These modified oligonucleotides can be prepared by methods well known in the art. See, e.g., Agrawal et al., Trends Biotechnol. 10, 152-158, 1992; Uhlmann et al., Chem. Rev. 90, 543-584, 1990; Uhlmann et al., Tetrahedron Lett. 215, 3539-3542, 1987.

Ribozymes

Ribozymes are RNA molecules with catalytic activity. See, e.g., Cech, Science 236, 1532-1539, 1987; Cech, Ann. Rev. Biochem. 59, 543-568, 1990; Cech, Curr. Opin. Struct. Biol. 2, 605-609, 1992; Couture & Stinchcomb, Trends Genet. 12, 510-515, 1996. Ribozymes can be used to inhibit gene function by cleaving an RNA sequence, as is known in the art (e.g., Haseloff et al., U.S. Patent 5,641,673). The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. Examples include engineered hammerhead motif ribozyme molecules that can specifically and efficiently catalyze endonucleolytic cleavage of specific nucleotide sequences.

The coding sequence of a MASP-like serine protease polynucleotide can be used to generate ribozymes which will specifically bind to mRNA transcribed from the MASP-like serine protease polynucleotide. Methods of designing and constructing ribozymes which can cleave other RNA molecules in trans in a highly sequence specific manner have been developed and described in the art (see Haseloff et al., Nature 334, 585-591, 1988). For example, the cleavage activity of ribozymes can be targeted to specific RNAs by engineering a discrete "hybridization" region into the ribozyme. The hybridization region contains a sequence complementary to the target RNA and thus specifically hybridizes with the target (see, for example, Gerlach et al., EP 321,201).

Specific ribozyme cleavage sites within a MASP-like serine protease RNA target are initially identified by scanning the RNA molecule for ribozyme cleavage sites which include the following sequences: GUA, GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides corresponding to the region of the MASP-like serine protease target RNA containing the cleavage site can be evaluated for secondary structural features which may render the target inoperable.

The suitability of candidate targets also can be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays. Longer complementary sequences can be used to increase the affinity of the hybridization sequence for the target. The hybridizing and cleavage regions of the ribozyme can be integrally related; thus, upon hybridizing to the MASP-like serine protease target RNA through the complementary regions, the catalytic region of the ribozyme can cleave the target.
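The initial scan for candidate cleavage sites lends itself to a short script. The following Python sketch is one possible way to do it, under stated assumptions: the function name, the example sequence, and the rule for placing the 15 to 20 nucleotide window around each site are all introduced here for illustration. It finds every GUA, GUU, or GUC triplet and returns the surrounding window that would then be examined for secondary structure and accessibility.

```python
import re

# Lookahead pattern so that overlapping candidate triplets are all reported.
CLEAVAGE_TRIPLETS = re.compile(r"(?=(GU[ACU]))")


def candidate_sites(rna, window=15):
    """Return (position, triplet, surrounding_window) for each GUA/GUU/GUC site.

    position is the 0-based index of the G; the window is `window` nucleotides
    long and is shifted inward near the ends of the RNA so it keeps full length.
    """
    if not 15 <= window <= 20:
        raise ValueError("window should be between 15 and 20 nucleotides")
    rna = rna.upper().replace("T", "U")
    if len(rna) < window:
        raise ValueError("RNA is shorter than the requested window")
    sites = []
    for match in CLEAVAGE_TRIPLETS.finditer(rna):
        pos = match.start()
        start = min(max(0, pos + 1 - window // 2), len(rna) - window)
        sites.append((pos, match.group(1), rna[start:start + window]))
    return sites


sites = candidate_sites("AAGUCAAAGUGAAGUUAA")   # made-up 18-nt example
```

For the example sequence the scan reports GUC at position 2 and GUU at position 13, and skips the GUG at position 8 because it is not one of the three listed triplets.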

Ribozymes can be introduced into cells as part of a DNA construct. Mechanical methods, such as microinjection, liposome-mediated transfection, electroporation, or calcium phosphate precipitation, can be used to introduce a ribozyme-containing DNA construct into cells in which it is desired to decrease MASP-like serine protease expression. Alternatively, if it is desired that the cells stably retain the DNA construct, it can be supplied on a plasmid and maintained as a separate element or integrated into the genome of the cells, as is known in the art. The DNA construct can include transcriptional regulatory elements, such as a promoter element, an enhancer or UAS element, and a transcriptional terminator signal, for controlling transcription of ribozymes in the cells.

As taught in Haseloff et al., U.S. Patent 5,641,673, ribozymes can be engineered so that ribozyme expression will occur in response to factors which induce expression of a target gene. Ribozymes also can be engineered to provide an additional level of regulation, so that destruction of MASP-like serine protease mRNA occurs only when both a ribozyme and a target gene are induced in the cells.

Identification of Target and Pathway Genes and Proteins

Described herein are methods for the identification of genes whose products interact with human matriptase. Such genes may represent genes which are differentially expressed in pathogenic infections. Further, such genes may represent genes which are differentially regulated in response to manipulations relevant to the progression or treatment of such diseases. Such differentially expressed genes may represent "target" and/or "fingerprint" genes. Methods for the identification of such differentially expressed genes are described below. Methods for the further characterization of such differentially expressed genes, and for their identification as target and/or fingerprint genes also are described below.

In addition, methods are described for the identification of genes, termed "pathway genes," which are involved in pathogenic infections. "Pathway gene," as used herein, refers to a gene whose gene product exhibits the ability to interact with gene products involved in these disorders. A pathway gene may be differentially expressed and, therefore, may have the characteristics of a target and/or fingerprint gene.

"Differential expression" refers to both quantitative as well as qualitative differences in a gene's temporal and/or tissue expression pattern. Thus, a differentially expressed gene may qualitatively have its expression activated or completely inactivated in normal versus diseased states, or under control versus experimental conditions. Such a qualitatively regulated gene will exhibit an expression pattern within a given tissue or cell type which is detectable in either normal or diseased subjects, but is not detectable in both. Alternatively, such a qualitatively regulated gene will exhibit an expression pattern within a given tissue or cell type which is detectable in either control or experimental subjects, but is not detectable in both.

"Detectable" refers to an RNA expression pattern which is detectable via the standard techniques of differential display, RT-PCR and/or Northern analyses, which are well known to those of skill in the art.

A differentially expressed gene may have its expression modulated, i.e., quantitatively increased or decreased, in normal versus diseased states, or under control versus experimental conditions. The degree to which expression differs in normal versus body weight disorder or control versus experimental states need only be large enough to be visualized via standard characterization techniques, such as, for example, the differential display technique described below. Other such standard characterization techniques by which expression differences may be visualized include, but are not limited to, quantitative RT (reverse transcriptase) PCR and Northern analyses.
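The distinction between qualitative and quantitative differential expression can be made concrete with a small classifier. This is only a sketch: the text sets no numeric thresholds, so the `detection_limit` and `min_fold` defaults below are hypothetical values chosen for illustration, as are the function name and the arbitrary signal units.

```python
def classify_expression(control, experimental, detection_limit=1.0, min_fold=2.0):
    """Classify a gene from two expression signals (arbitrary units).

    detection_limit and min_fold are illustrative assumptions, not values
    taken from the specification.
    """
    if detection_limit <= 0:
        raise ValueError("detection_limit must be positive")
    control_on = control >= detection_limit
    experimental_on = experimental >= detection_limit
    if control_on != experimental_on:
        # Detectable in one state but not the other.
        return "qualitative"
    if not control_on:
        return "not detected"
    fold = max(control, experimental) / min(control, experimental)
    return "quantitative" if fold >= min_fold else "unchanged"
```

For example, signals of 0.0 and 12.0 would be called qualitative, 5.0 and 20.0 quantitative (a four-fold difference), and 5.0 and 6.0 unchanged under these assumed thresholds.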

Differentially expressed genes may be further described as target genes and/or fingerprint genes. "Fingerprint gene" refers to a differentially expressed gene whose expression pattern may be utilized as part of a prognostic or diagnostic evaluation, or which, alternatively, may be used in methods for identifying compounds useful for the treatment of various disorders. A fingerprint gene may also have the characteristics of a target gene or a pathway gene.

"Target gene" refers to a differentially expressed gene involved in pathogenic infections by which modulation of the level of target gene expression or of target gene product activity may act to ameliorate symptoms. A target gene may also have the characteristics of a fingerprint gene and/or a pathway gene.

Identification of Differentially Expressed Genes

A variety of methods may be utilized for the identification of genes which are involved in pathogenic infections. To identify differentially expressed genes, RNA, either total or mRNA, may be isolated from one or more tissues of the subjects utilized in paradigms such as those described above. RNA samples are obtained from tissues of experimental subjects and from corresponding tissues of control subjects. Any RNA isolation technique which does not select against the isolation of mRNA may be utilized for the purification of such RNA samples. See, for example, Ausubel et al., eds., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, Inc., New York, 1987-1993. Large numbers of tissue samples may readily be processed using techniques well known to those of skill in the art, such as, for example, the single-step RNA isolation process of Chomczynski, U.S. Patent 4,843,155.

Transcripts within the collected RNA samples which represent RNA produced by differentially expressed genes may be identified by utilizing a variety of methods which are well known to those of skill in the art. For example, differential screening (Tedder et al., Proc. Natl. Acad. Sci. U.S.A. 85, 208-12, 1988), subtractive hybridization (Hedrick et al., Nature 308, 149-53; Lee et al., Proc. Natl. Acad. Sci. U.S.A. 88, 2825, 1984), and, preferably, differential display (Liang & Pardee, Science 257, 967-71, 1992; U.S. Patent 5,262,311), may be utilized to identify nucleic acid sequences derived from genes that are differentially expressed.

Differential screening involves the duplicate screening of a cDNA library in which one copy of the library is screened with a total cell cDNA probe corresponding to the mRNA population of one cell type while a duplicate copy of the cDNA library is screened with a total cDNA probe corresponding to the mRNA population of a second cell type. For example, one cDNA probe may correspond to a total cell cDNA probe of a cell type or tissue derived from a control subject, while the second cDNA probe may correspond to a total cell cDNA probe of the same cell type or tissue derived from an experimental subject. Those clones which hybridize to one probe but not to the other potentially represent clones derived from genes differentially expressed in the cell type of interest in control versus experimental subjects.
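The comparison at the heart of differential screening reduces to a set operation on the clones that hybridize to each probe. The sketch below assumes the hybridization results have already been scored as lists of clone identifiers; the function name and the clone names are hypothetical.

```python
def differential_clones(hits_control, hits_experimental):
    """Return clones that hybridized to one probe but not the other."""
    control = set(hits_control)
    experimental = set(hits_experimental)
    return {
        "control_only": sorted(control - experimental),
        "experimental_only": sorted(experimental - control),
    }


# Made-up clone identifiers from duplicate copies of the same cDNA library.
result = differential_clones(
    hits_control=["c01", "c02", "c05"],
    hits_experimental=["c02", "c05", "c09"],
)
```

Clone `c01` hybridizes only to the control probe and `c09` only to the experimental probe, so both are candidates for further corroboration; `c02` and `c05` hybridize to both probes and are set aside.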

Subtractive hybridization techniques generally involve the isolation of mRNA taken from two different sources, e.g., control and experimental tissue or cell type, the hybridization of the mRNA or single-stranded cDNA reverse-transcribed from the isolated mRNA, and the removal of all hybridized, and therefore double-stranded, sequences. The remaining non-hybridized, single-stranded cDNAs potentially represent clones derived from genes that are differentially expressed in the two mRNA sources. Such single-stranded cDNAs are then used as the starting material for the construction of a library comprising clones derived from differentially expressed genes.

The differential display technique is a procedure, utilizing the well known polymerase chain reaction (PCR; the experimental embodiment set forth in Mullis, U.S. Patent 4,683,202), which allows for the identification of sequences derived from genes which are differentially expressed. First, isolated RNA is reverse-transcribed into single-stranded cDNA, utilizing standard techniques which are well known to those of skill in the art. Primers for the reverse transcriptase reaction may include, but are not limited to, oligo dT-containing primers.

Next, this technique uses pairs of PCR primers, as described below, which allow for the amplification of clones representing a random subset of the RNA transcripts present within any given cell. Utilizing different pairs of primers allows each of the mRNA transcripts present in a cell to be amplified. Among such amplified transcripts may be identified those which have been produced from differentially expressed genes.

The 3' oligonucleotide primer of the primer pairs may contain an oligo dT stretch of 10-13, preferably 11, dT nucleotides at its 5' end, which hybridizes to the poly(A) tail of mRNA or to the complement of a cDNA reverse transcribed from an mRNA poly(A) tail. In addition, in order to increase the specificity of the 3' primer, the primer may contain one or more, preferably two, additional nucleotides at its 3' end. Because, statistically, only a subset of the mRNA derived sequences present in the sample of interest will hybridize to such primers, the additional nucleotides allow the primers to amplify only a subset of the mRNA derived sequences present in the sample of interest. This is preferred in that it allows more accurate and complete visualization and characterization of each of the bands representing amplified sequences.
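The family of anchored 3' primers described above can be enumerated programmatically. The sketch below is a hedged illustration: the function name is hypothetical, and the rule that the first anchoring nucleotide is not T (so that the primer sits at the boundary of the poly(A) tail rather than sliding along it) is an assumption drawn from common practice, not a requirement stated in the text.

```python
from itertools import product


def anchored_primers(n_dt=11, anchor_len=2):
    """List oligo-dT primers (5'->3') carrying extra anchoring bases at the 3' end.

    Assumption: the first anchor base is A, C, or G (never T).
    """
    if not 10 <= n_dt <= 13:
        raise ValueError("the text describes an oligo dT stretch of 10-13 nucleotides")
    if anchor_len < 1:
        raise ValueError("at least one anchoring nucleotide is needed")
    primers = []
    for anchor in product("ACGT", repeat=anchor_len):
        if anchor[0] == "T":
            continue
        primers.append("T" * n_dt + "".join(anchor))
    return primers


primers = anchored_primers()   # 11 dT nucleotides plus two anchoring bases
```

With the preferred values this yields twelve 13-nucleotide primers (three choices for the first anchoring base, four for the second), each expected to prime only the subset of transcripts whose sequence immediately upstream of the poly(A) tail matches its anchor.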

The 5' primer may contain a nucleotide sequence expected, statistically, to have the ability to hybridize to cDNA sequences derived from the tissues of interest. The nucleotide sequence may be an arbitrary one, and the length of the 5' oligonucleotide primer may range from about 9 to about 15 nucleotides, with about 13 nucleotides being preferred. Arbitrary primer sequences cause the lengths of the amplified partial cDNAs produced to be variable, thus allowing different clones to be separated by using standard denaturing sequencing gel electrophoresis.

PCR reaction conditions should be chosen which optimize amplified product yield and specificity, and, additionally, produce amplified products of lengths which may be resolved utilizing standard gel electrophoresis techniques. Such reaction conditions are well known to those of skill in the art, and important reaction parameters include, for example, length and nucleotide sequence of oligonucleotide primers as discussed above, and annealing and elongation step temperatures and reaction times.

The pattern of clones resulting from the reverse transcription and amplification of the mRNA of two different cell types is displayed via sequencing gel electrophoresis and compared. Differentially expressed genes are indicated by differences in the two banding patterns.

Once potentially differentially expressed gene sequences have been identified via bulk techniques such as, for example, those described above, the differential expression of such putatively differentially expressed genes should be corroborated. Corroboration may be accomplished via, for example, such well known techniques as Northern analysis, quantitative RT PCR or RNase protection. Upon corroboration, the differentially expressed genes may be further characterized, and may be identified as target and/or fingerprint genes, as discussed below.

Amplified sequences of differentially expressed genes obtained through, for example, differential display may be used to isolate full length clones of the corresponding gene. The full length coding portion of the gene may readily be isolated, without undue experimentation, by molecular biological techniques well known in the art. For example, the isolated differentially expressed amplified fragment may be labeled and used to screen a cDNA library. Alternatively, the labeled fragment may be used to screen a genomic library.

PCR technology may also be utilized to isolate full length cDNA sequences. As described above, the isolated, amplified gene fragments obtained through differential display have 5' terminal ends at some random point within the gene and usually have 3' terminal ends at a position corresponding to the 3' end of the transcribed portion of the gene. Once nucleotide sequence information from an amplified fragment is obtained, the remainder of the gene (i.e., the 5' end of the gene, when utilizing differential display) may be obtained using, for example, RT-PCR.

In one embodiment of such a procedure for the identification and cloning of full length gene sequences, RNA may be isolated, following standard procedures, from an appropriate tissue or cellular source. A reverse transcription reaction may then be performed on the RNA using an oligonucleotide primer complementary to the mRNA that corresponds to the amplified fragment, for the priming of first strand synthesis. Because the primer is anti-parallel to the mRNA, extension will proceed toward the 5' end of the mRNA. The resulting RNA/DNA hybrid may then be "tailed" with guanines using a standard terminal transferase reaction, the hybrid may be digested with RNase H, and second strand synthesis may then be primed with a poly-C primer. Using the two primers, the 5' portion of the gene is amplified using PCR. Sequences obtained may then be isolated and recombined with previously isolated sequences to generate a full-length cDNA of the differentially expressed genes of the invention. For a review of cloning strategies and recombinant DNA techniques, see, e.g., Sambrook et al., 1989, and Ausubel et al., 1989.

Identification of Pathway Genes

Methods are described herein for the identification of pathway genes. "Pathway gene" refers to a gene whose gene product exhibits the ability to interact with gene products involved in pathogenic infections. A pathway gene may be differentially expressed and, therefore, may have the characteristics of a target and/or fingerprint gene.

Any method suitable for detecting protein-protein interactions may be employed for identifying pathway gene products by identifying interactions between gene products and gene products known to be involved in pathogenic infections. Such known gene products may be cellular or extracellular proteins. Those gene products which interact with such known gene products represent pathway gene products and the genes which encode them represent pathway genes.

Among the traditional methods which may be employed are co-immunoprecipitation, crosslinking and co-purification through gradients or chromatographic columns. Utilizing procedures such as these allows for the identification of pathway gene products. Once identified, a pathway gene product may be used, in conjunction with standard techniques, to identify its corresponding pathway gene. For example, at least a portion of the amino acid sequence of the pathway gene product may be ascertained using techniques well known to those of skill in the art, such as via the Edman degradation technique (see, e.g., Creighton, PROTEINS: STRUCTURES AND MOLECULAR PRINCIPLES, W. H. Freeman & Co., N.Y., pp. 34-49, 1983). The amino acid sequence obtained may be used as a guide for the generation of oligonucleotide mixtures that can be used to screen for pathway gene sequences. Screening may be accomplished, for example, by standard hybridization or PCR techniques. Techniques for the generation of oligonucleotide mixtures and the screening are well known (see, e.g., Ausubel, 1989, and Innis et al., eds., PCR PROTOCOLS: A GUIDE TO METHODS AND APPLICATIONS, 1990, Academic Press, Inc., New York).
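Generating an oligonucleotide mixture from a short stretch of amino acid sequence amounts to enumerating every codon combination that could encode it. The Python sketch below illustrates this with the standard genetic code; the function names and the example peptide are hypothetical, and in practice low-degeneracy stretches (rich in Met and Trp, poor in Leu, Ser, and Arg) are the ones usually chosen.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

AA_TO_CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    AA_TO_CODONS.setdefault(aa, []).append("".join(codon))


def degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    count = 1
    for aa in peptide:
        count *= len(AA_TO_CODONS[aa])   # KeyError for an unknown residue
    return count


def oligo_mixture(peptide):
    """Enumerate every DNA sequence (sense strand, 5'->3') encoding the peptide."""
    return ["".join(codons) for codons in product(*(AA_TO_CODONS[aa] for aa in peptide))]


mixture = oligo_mixture("MWK")   # made-up tripeptide: Met-Trp-Lys
```

Met and Trp each have a single codon and Lys has two, so the mixture for this example contains just two oligonucleotides, `ATGTGGAAA` and `ATGTGGAAG`. A single leucine, by contrast, contributes a six-fold degeneracy on its own.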

Methods may be employed which result in the simultaneous identification of pathway genes which encode the protein interacting with a protein involved in pathogenic infections. These methods include, for example, probing expression libraries with labeled protein known or suggested to be involved in such disorders, using this protein in a manner similar to the well known technique of antibody probing of λgt11 libraries.

One method which detects protein interactions in vivo, the two-hybrid system, is described in detail for illustration only and not by way of limitation. One version of this system has been described in Chien et al., Proc. Natl. Acad. Sci. U.S.A. 88, 9578-82, 1991, and is commercially available from Clontech (Palo Alto, Calif.).

Briefly, utilizing such a system, plasmids are constructed that encode two hybrid proteins: one consists of the DNA-binding domain of a transcription activator protein fused to a known protein, in this case, a protein known to be involved in body weight disorders and/or processes relevant to appetite and/or weight regulation, and the other consists of the transcription activator protein's activation domain fused to an unknown protein that is encoded by a cDNA which has been recombined into this plasmid as part of a cDNA library. The plasmids are transformed into a strain of the yeast Saccharomyces cerevisiae that contains a reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's binding sites. Neither hybrid protein alone can activate transcription of the reporter gene: the DNA-binding domain hybrid cannot because it does not provide activation function, and the activation domain hybrid cannot because it cannot localize to the activator's binding sites. Interaction of the two hybrid proteins reconstitutes the functional activator protein and results in expression of the reporter gene, which is detected by an assay for the reporter gene product.
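The logic of the two-hybrid readout can be summarized as a simple truth function: the reporter is expressed only when both hybrid proteins are present and their fused partners interact. The sketch below is a toy model for illustration only; the function names, the protein names, and the representation of known interactions as a set of pairs are all assumptions, and it says nothing about false positives or expression levels in a real screen.

```python
def reporter_active(bd_fusion_present, ad_fusion_present, proteins_interact):
    """True only if both hybrids are present and their fused proteins interact."""
    return bd_fusion_present and ad_fusion_present and proteins_interact


def positive_preys(bait, preys, interactions):
    """Return library proteins whose activation-domain fusion would switch on the reporter.

    interactions is a set of frozenset pairs of protein names (hypothetical data).
    """
    return [
        prey for prey in preys
        if reporter_active(True, True, frozenset((bait, prey)) in interactions)
    ]


# Made-up names: one bait and a three-member library.
interactions = {frozenset(("bait", "preyB"))}
hits = positive_preys("bait", ["preyA", "preyB", "preyC"], interactions)
```

Only `preyB` is returned. A strain carrying the DNA-binding domain hybrid alone, or the activation domain hybrid alone, gives no reporter signal in this model, matching the behavior described above.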

The two-hybrid system or related methodology may be used to screen activation domain libraries for proteins that interact with a known "bait" gene product. By way of example, and not by way of limitation, gene products known to be involved in body weight disorders and/or appetite or body weight regulation may be used as the bait gene products. These include but are not limited to the intracellular domain of receptors for such hormones as neuropeptide Y, galanin, interostatin, insulin, and CCK. Total genomic or cDNA sequences are fused to the DNA encoding an activation domain. This library and a plasmid encoding a hybrid of the bait gene product fused to the DNA-binding domain are cotransformed into a yeast reporter strain, and the resulting transformants are screened for those that express the reporter gene. For example, and not by way of limitation, the bait gene can be cloned into a vector such that it is translationally fused to the DNA encoding the DNA-binding domain of the GAL4 protein. These colonies are purified and the library plasmids responsible for reporter gene expression are isolated. DNA sequencing is then used to identify the proteins encoded by the library plasmids.

A cDNA library of the cell line from which proteins that interact with bait gene product are to be detected can be made using methods routinely practiced in the art.

According to the particular system described herein, for example, the cDNA fragments can be inserted into a vector such that they are translationally fused to the activation domain of GAL4. This library can be co-transformed along with the bait gene-GAL4 fusion plasmid into a yeast strain which contains a lacZ gene driven by a promoter which contains GAL4 activation sequence. A cDNA encoded protein, fused to GAL4 activation domain, that interacts with bait gene product will reconstitute an active GAL4 protein and thereby drive expression of the lacZ gene.

Colonies which express lacZ can be detected by their blue color in the presence of X-gal. The cDNA can then be purified from these strains, and used to produce and isolate the bait gene-interacting protein using techniques routinely practiced in the art. Once a pathway gene has been identified and isolated, it may be further characterized, as described below.
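The reporter logic of the two-hybrid screen described above can be sketched as a small boolean model. The following Python fragment is illustrative only: the class `Hybrid`, the function `reporter_expressed`, and the protein names are hypothetical and form no part of the described method. The sketch simply assumes that the reporter is expressed only when a DNA-binding domain hybrid and an activation domain hybrid are both present and their fused proteins interact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hybrid:
    """One hybrid protein: a fused partner plus the GAL4-type domain it carries."""
    partner: str
    has_dna_binding_domain: bool
    has_activation_domain: bool


def reporter_expressed(hybrids, interactions):
    """Return True if a functional activator is reconstituted.

    `interactions` is a set of frozensets, each naming two partners
    assumed (for this sketch) to bind one another.
    """
    baits = [h for h in hybrids if h.has_dna_binding_domain]
    preys = [h for h in hybrids if h.has_activation_domain]
    # Either hybrid alone yields an empty pairing and therefore no expression.
    return any(
        frozenset((bait.partner, prey.partner)) in interactions
        for bait in baits
        for prey in preys
    )


bait = Hybrid("bait", has_dna_binding_domain=True, has_activation_domain=False)
prey_x = Hybrid("preyX", has_dna_binding_domain=False, has_activation_domain=True)
prey_y = Hybrid("preyY", has_dna_binding_domain=False, has_activation_domain=True)
known = {frozenset(("bait", "preyX"))}
```

In this model only the colony carrying both `bait` and `prey_x` would turn blue on X-gal; a colony carrying `bait` with the non-interacting `prey_y`, or either hybrid alone, would not.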

Characterization of Differentially Expressed and Pathway Genes

Differentially expressed and pathway genes, such as those identified via the methods discussed above, as well as genes identified by alternative means, may be further characterized by utilizing, for example, methods such as those discussed herein.

Such genes will be referred to herein as "identified genes." Analyses such as those described herein yield information regarding the biological function of the identified genes. An assessment of the biological function of the differentially expressed genes, in addition, will allow for their designation as target and/or fingerprint genes.

Specifically, any of the differentially expressed genes whose further characterization indicates that a modulation of the gene's expression or a modulation of the gene product's activity may ameliorate any of the body weight disorders of interest will be designated "target genes," as defined above. Such target genes and target gene products, along with those discussed below, will constitute the focus of the compound discovery strategies discussed below. Further, such target genes, target gene products and/or modulating compounds can be used as part of the treatment methods described below.

Any of the differentially expressed genes whose further characterization indicates that such modulations may not positively affect body weight disorders of interest, but whose expression pattern contributes to a gene expression "fingerprint" pattern correlative of, for example, a malignant state will be designated a "fingerprint gene." It should be noted that each of the target genes may also function as a fingerprint gene, as may all or a portion of the pathway genes.

Pathway genes may also be characterized according to techniques such as those described herein. Those pathway genes which yield information indicating that they are differentially expressed and that modulation of the gene's expression or a modulation of the gene product's activity may ameliorate any of the disorders of interest will also be designated "target genes." Such target genes and target gene products, along with those discussed above, will constitute the focus of the compound discovery strategies discussed below and can be used as part of treatment methods.

Characterization of one or more of the pathway genes may reveal a lack of differential expression, but evidence that modulation of the gene's activity or expression may, nonetheless, ameliorate body weight disorder symptoms. In such cases, these genes and gene products would also be considered a focus of the compound discovery strategies. In instances wherein a pathway gene's characterization indicates that modulation of gene expression or gene product activity may not positively affect body weight disorders of interest, but the gene is differentially expressed and contributes to a gene expression fingerprint pattern correlative of, for example, a pathogenic infection, such pathway genes may additionally be designated as fingerprint genes.
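The designation rules set out in the preceding paragraphs can be summarized as a short decision function. The following Python sketch is a paraphrase under stated assumptions, not a definition: the function name `designate`, its three boolean inputs, and the label strings are hypothetical, and real characterization data would rarely reduce to clean yes/no answers.

```python
def designate(differentially_expressed, modulation_may_ameliorate,
              contributes_to_fingerprint):
    """Return the set of designations suggested by the text for one gene.

    All three arguments are booleans summarizing (hypothetically) the
    outcome of the characterization procedures.
    """
    labels = set()
    if modulation_may_ameliorate:
        if differentially_expressed:
            labels.add("target gene")
        else:
            # Pathway gene lacking differential expression: still a focus
            # of the compound discovery strategies.
            labels.add("compound discovery focus")
    if differentially_expressed and contributes_to_fingerprint:
        # A target gene may also function as a fingerprint gene.
        labels.add("fingerprint gene")
    return labels
```

For example, `designate(True, False, True)` yields only the fingerprint designation, while `designate(True, True, True)` yields both the target and the fingerprint designations.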

A variety of techniques can be utilized to further characterize the identified genes.

First, the nucleotide sequence of the identified genes, which may be obtained by utilizing standard techniques well known to those of skill in the art, may, for example, be used to reveal homologies to one or more known sequence motifs which may yield information regarding the biological function of the identified gene product.

Second, an analysis of the tissue and/or cell type distribution of the mRNA produced by the identified genes may be conducted, utilizing standard techniques well known to those of skill in the art. Such techniques may include, for example, Northern, RNase protection and RT-PCR analyses. Such analyses provide information as to, for example, whether the identified genes are expressed in tissues or cell types expected to contribute to the disorders of interest. Such analyses may also provide quantitative information regarding steady state mRNA regulation, yielding data concerning which of the identified genes exhibits a high level of regulation in, preferably, tissues which may be expected to contribute to the disorders of interest. Additionally, standard in situ hybridization techniques may be utilized to provide information regarding which cells within a given tissue express the identified gene. Such an analysis may provide information regarding the biological function of an identified gene relative to a given disorder in instances wherein only a subset of the cells within the tissue is thought to be relevant to the body weight disorder.

Third, the sequences of the identified genes may be used, utilizing standard techniques, to place the genes onto genetic maps, e.g., mouse (Copeland and Jenkins, Trends in Genetics 7, 113-18, 1991) and human genetic maps (Cohen et al., Nature 366, 698-701, 1993). Such mapping information may yield information regarding the genes' importance to human disease by, for example, identifying genes which map within genetic regions to which known genetic disorders map.

Fourth, the biological function of the identified genes may be more directly assessed by utilizing relevant in vivo and in vitro systems. In vivo systems may include, but are not limited to, animal systems which naturally exhibit body weight disorder-like symptoms, or ones which have been engineered to exhibit such symptoms. Further, such systems may include systems for the further characterization of body weight disorders, and/or appetite or body weight regulation, and may include, but are not limited to, naturally occurring and transgenic animal systems. In vitro systems may include, but are not limited to, cell-based systems comprising cell types known or suspected of contributing to the body weight disorder of interest. Such cells may be wild type cells, or may be non-wild type cells containing modifications known to, or suspected of, contributing to the body weight disorder of interest.

In further characterizing the biological function of the identified genes, the expression of these genes may be modulated within the in vivo and/or in vitro systems, i.e., either overexpressed or underexpressed in, for example, transgenic animals and/or cell lines, and its subsequent effect on the system then assayed.

Alternatively, the activity of the product of the identified gene may be modulated by either increasing or decreasing the level of activity in the in vivo and/or in vitro system of interest, and its subsequent effect then assayed.

The information obtained through such characterizations may suggest relevant methods for the treatment of disorders involving the gene of interest. Further, relevant methods for the treatment of such disorders involving the gene of interest may be suggested by information obtained from such characterizations. For example, treatment may include a modulation of gene expression and/or gene product activity. Characterization procedures such as those described herein may indicate whether such modulation should involve an increase or a decrease in the expression or activity of the gene or gene product of interest.

Screening Methods

The invention provides methods for identifying modulators, i.e., candidate or test compounds which bind to MASP-like serine protease polypeptides or polynucleotides and/or have a stimulatory or inhibitory effect on, for example, expression or activity of the MASP-like serine protease polypeptide or polynucleotide, so as to regulate degradation of the extracellular matrix. Decreased extracellular matrix degradation is useful for preventing or suppressing malignant cells from metastasizing. Increased extracellular matrix degradation may be desired, for example, in developmental disorders characterized by inappropriately low levels of extracellular matrix degradation or in regeneration.

The invention provides assays for screening test compounds which bind to or modulate the activity of a MASP-like serine protease polypeptide or a MASP-like serine protease polynucleotide. A test compound preferably binds to a MASP-like serine protease polypeptide or polynucleotide. More preferably, a test compound decreases a MASP-like serine protease activity of a MASP-like serine protease polypeptide or expression of a MASP-like serine protease polynucleotide by at least about 10%, preferably about 50%, more preferably about 75%, 90%, or 100% relative to the absence of the test compound.

Test Compounds

Test compounds can be pharmacologic agents already known in the art or can be compounds previously unknown to have any pharmacological activity. Such compounds also may include, but are not limited to, other cellular proteins; peptides such as, for example, soluble peptides, including but not limited to Ig-tailed fusion peptides comprising extracellular portions of target gene product transmembrane receptors, and members of random peptide libraries (Lam et al., Nature 354, 82-84, 1991; Houghten et al., Nature 354, 84-86, 1991) made of D- and/or L-configuration amino acids; phosphopeptides (including, but not limited to, members of random or partially degenerate, directed phosphopeptide libraries; Songyang et al., Cell 72, 767-78, 1993); antibodies (including, but not limited to, polyclonal, monoclonal, humanized, anti-idiotypic, chimeric or single chain antibodies, and Fab, F(ab')2 and Fab expression library fragments, and epitope-binding fragments thereof); and small organic or inorganic molecules.

The compounds can be naturally occurring or designed in the laboratory. They can be isolated from microorganisms, animals, or plants, and can be produced recombinantly, or synthesized by chemical methods known in the art. If desired, test compounds can be obtained using any of the numerous combinatorial library methods known in the art, including but not limited to, biological libraries, spatially addressable parallel solid phase or solution phase libraries, synthetic library methods requiring deconvolution, the "one-bead one-compound" library method, and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer, or small molecule libraries of compounds. See Lam, Anticancer Drug Des. 12, 145, 1997.

Methods for the synthesis of molecular libraries are well known in the art (see, for example, DeWitt et al., Proc. Natl. Acad. Sci. U.S.A. 90, 6909, 1993; Erb et al., Proc. Natl. Acad. Sci. U.S.A. 91, 11422, 1994; Zuckermann et al., J. Med. Chem. 37, 2678, 1994; Cho et al., Science 261, 1303, 1993; Carell et al., Angew. Chem. Int. Ed. Engl. 33, 2059, 1994; Carell et al., Angew. Chem. Int. Ed. Engl. 33, 2061; Gallop et al., J. Med. Chem. 37, 1233, 1994). Libraries of compounds can be presented in solution (see, e.g., Houghten, Biotechniques 13, 412-421, 1992), or on beads (Lam, Nature 354, 82-84, 1991), chips (Fodor, Nature 364, 555-556, 1993), bacteria or spores (Ladner, U.S. Patent 5,223,409), plasmids (Cull et al., Proc. Natl. Acad. Sci. U.S.A. 89, 1865-1869, 1992), or phage (Scott & Smith, Science 249, 386-390, 1990; Devlin, Science 249, 404-406, 1990; Cwirla et al., Proc. Natl. Acad. Sci. 87, 6378-6382, 1990; Felici, J. Mol. Biol. 222, 301-310, 1991; and Ladner, U.S. Patent 5,223,409).

High Throughput Screening

Test compounds can be screened for the ability to bind to MASP-like serine protease polypeptides or polynucleotides or to affect MASP-like serine protease activity or MASP-like serine protease gene expression using high throughput screening. Using high throughput screening, many discrete compounds can be tested in parallel so that large numbers of test compounds can be quickly screened. The most widely established techniques utilize 96-well microtiter plates. The wells of the microtiter plates typically require assay volumes that range from 50 to 500 µl. In addition to the plates, many instruments, materials, pipettors, robotics, plate washers, and plate readers are commercially available to fit the 96-well format.
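As an illustration of how discrete compounds might be assigned to the 96-well format mentioned above, the following Python sketch maps a list of compound names onto wells A1 through H12. It is a minimal sketch only: the function names `well_id` and `plate_map`, the row-major fill order, and the compound names are assumptions introduced here, and the conventional 8-row by 12-column layout is assumed rather than stated in the text.

```python
import string

ROWS = string.ascii_uppercase[:8]   # rows A-H (assumed standard layout)
N_COLS = 12                         # columns 1-12
WELLS_PER_PLATE = len(ROWS) * N_COLS


def well_id(index):
    """Convert a 0-based position on one plate to a well label such as 'B3'."""
    if not 0 <= index < WELLS_PER_PLATE:
        raise ValueError(f"index {index} is outside a 96-well plate")
    row, col = divmod(index, N_COLS)
    return f"{ROWS[row]}{col + 1}"


def plate_map(compounds):
    """Distribute compounds over as many plates as needed, filling row by row.

    Returns a list of dicts, one per plate, mapping well label to compound.
    """
    plates = []
    for start in range(0, len(compounds), WELLS_PER_PLATE):
        chunk = compounds[start:start + WELLS_PER_PLATE]
        plates.append({well_id(i): name for i, name in enumerate(chunk)})
    return plates


# Hypothetical library of 100 compounds: fills one plate and four wells of a second.
library = [f"cmpd-{n:03d}" for n in range(100)]
layout = plate_map(library)
```

Keeping the well label alongside each compound name is what allows a hit on the plate reader to be traced back to an individual test compound.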

Alternatively, "free format assays," or assays that have no physical barrier between samples, can be used. For example, an assay using pigment cells (melanocytes) in a simple homogeneous assay for combinatorial peptide libraries is described by Jayawickreme et al., Proc. Natl. Acad. Sci. U.S.A. 91, 1614-1 (1994). The cells are placed under agarose in petri dishes, then beads that carry combinatorial compounds are placed on the surface of the agarose. The combinatorial compounds are partially released from the beads. Active compounds can be visualized as dark pigment areas because, as the compounds diffuse locally into the gel matrix, the active compounds cause the cells to change colors.

Another example of a free format assay is described by Chelsky, "Strategies for Screening Combinatorial Libraries: Novel and Traditional Approaches," reported at the First Annual Conference of The Society for Biomolecular Screening in Philadelphia, Pa. (Nov. 7-10, 1995). Chelsky placed a simple homogeneous enzyme assay for carbonic anhydrase inside an agarose gel such that the enzyme in the gel would cause a color change throughout the gel. Thereafter, beads carrying combinatorial compounds via a photolinker were placed inside the gel and the compounds were partially released by UV light. Compounds that inhibited the enzyme were observed as local zones of inhibition having less color change.

Yet another example is described by Salmon et al., Molecular Diversity 2, 57-63 (1996). In this example, combinatorial libraries were screened for compounds that had cytotoxic effects on cancer cells growing in agar.

Another high throughput screening method is described in Beutel et al., U. S. Patent 5,976,813. In this method, test samples are placed in a porous matrix. One or more assay components are then placed within, on top of, or at the bottom of a matrix such as a gel, a plastic sheet, a filter, or other form of easily manipulated solid support.

When samples are introduced to the porous matrix they diffuse sufficiently slowly, such that the assays can be performed without the test samples running together.

Binding Assays

For binding assays, the test compound is preferably a small molecule which binds to and occupies the active site or a fibronectin domain of the MASP-like serine protease polypeptide, thereby making the active site or fibronectin domain inaccessible to substrate such that normal biological activity is prevented. Examples of such small molecules include, but are not limited to, small peptides or peptide-like molecules.

In binding assays, either the test compound or the MASP-like serine protease polypeptide can comprise a detectable label, such as a fluorescent, radioisotopic, chemiluminescent, or enzymatic label, such as horseradish peroxidase, alkaline phosphatase, or luciferase. Detection of a test compound which is bound to the MASP-like serine protease polypeptide can then be accomplished, for example, by direct counting of radioemission, by scintillation counting, or by determining conversion of an appropriate substrate to a detectable product.

Alternatively, binding of a test compound to a MASP-like serine protease polypeptide can be determined without labeling either of the interactants. For example, a microphysiometer can be used to detect binding of a test compound with a target polypeptide. A microphysiometer (e.g., Cytosensor™) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a test compound and a MASP-like serine protease polypeptide (McConnell et al., Science 257, 1906-1912, 1992).

Determining the ability of a test compound to bind to a MASP-like serine protease polypeptide also can be accomplished using a technology such as real-time Bimolecular Interaction Analysis (BIA) (Sjolander & Urbaniczky, Anal. Chem. 63, 2338-2345, 1991, and Szabo et al., Curr. Opin. Struct. Biol. 5, 699-705, 1995). BIA is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore™). Changes in the optical phenomenon surface plasmon resonance (SPR) can be used as an indication of real-time reactions between biological molecules.

In yet another aspect of the invention, a MASP-like serine protease polypeptide can be used as a "bait protein" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Patent 5,283,317; Zervos et al., Cell 72, 223-232, 1993; Madura et al., J. Biol. Chem. 268, 12046-12054, 1993; Bartel et al., Biotechniques 14, 920-924, 1993; Iwabuchi et al., Oncogene 8, 1693-1696, 1993; and Brent WO 94/10300), to identify other proteins which bind to or interact with the MASP-like serine protease polypeptide and modulate its activity.

The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. For example, in one construct a polynucleotide encoding a MASP-like serine protease polypeptide is fused to a polynucleotide encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other construct, a DNA sequence that encodes an unidentified protein ("prey" or "sample") is fused to a polynucleotide that codes for the activation domain of the known transcription factor. If the "bait" and the "prey" proteins are able to interact in vivo to form a protein-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ), which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected, and cell colonies containing the functional transcription factor can be isolated and used to obtain the DNA sequence encoding the protein which interacts with the MASP-like serine protease polypeptide.

It may be desirable to immobilize either the MASP-like serine protease polypeptide (or polynucleotide) or the test compound to facilitate separation of bound from unbound forms of one or both of the interactants, as well as to accommodate automation of the assay. Thus, either the MASP-like serine protease polypeptide (or polynucleotide) or the test compound can be bound to a solid support. Suitable solid supports include, but are not limited to, glass or plastic slides, tissue culture plates, microtiter wells, tubes, silicon chips, or particles such as beads (including, but not limited to, latex, polystyrene, or glass beads). Any method known in the art can be used to attach the MASP-like serine protease polypeptide (or polynucleotide) or test compound to a solid support, including use of covalent and non-covalent linkages, passive absorption, or pairs of binding moieties attached respectively to the polypeptide or test compound and the solid support. Test compounds are preferably bound to the solid support in an array, so that the location of individual test compounds can be tracked. Binding of a test compound to a MASP-like serine protease polypeptide (or polynucleotide) can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and microcentrifuge tubes.

In one embodiment, a MASP-like serine protease polypeptide is a fusion protein comprising a domain that allows the MASP-like serine protease polypeptide to be bound to a solid support. For example, glutathione-S-transferase fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and the non-adsorbed MASP-like serine protease polypeptide; the mixture is then incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components. Binding of the interactants can be determined either directly or indirectly, as described above. Alternatively, the complexes can be dissociated from the solid support before binding is determined.

Other techniques for immobilizing polypeptides or polynucleotides on a solid support also can be used in the screening assays of the invention. For example, either a MASP-like serine protease polypeptide (or polynucleotide) or a test compound can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated MASP-like serine protease polypeptides or test compounds can be prepared from biotin-NHS (N-hydroxysuccinimide) using techniques well known in the art (e.g., biotinylation kit, Pierce Chemical, Rockford, Ill.) and immobilized in the wells of streptavidin-coated 96-well plates (Pierce Chemical). Alternatively, antibodies which specifically bind to a MASP-like serine protease polypeptide, polynucleotide, or a test compound, but which do not interfere with a desired binding site, such as the active site or a fibronectin domain of the MASP-like serine protease polypeptide, can be derivatized to the wells of the plate. Unbound target or protein can be trapped in the wells by antibody conjugation.

Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies which specifically bind to the MASP-like serine protease polypeptide (or polynucleotides) or test compound, enzyme-linked assays which rely on detecting a MASP-like serine protease activity of the MASP-like serine protease polypeptide, and SDS gel electrophoresis under non-reducing conditions.

Screening for test compounds which bind to a MASP-like serine protease polypeptide or polynucleotide also can be carried out in an intact cell. Any cell which comprises a MASP-like serine protease polynucleotide or polypeptide can be used in a cell-based assay system. A MASP-like serine protease polynucleotide can be naturally occurring in the cell or can be introduced using techniques such as those described above. Either a primary culture or an established cell line, including neoplastic cell lines such as the colon cancer cell lines HCT116, DLD1, HT29, Caco2, SW837, SW480, and RKO, breast cancer cell lines 21-PT, 21-MT, MDA-468, SK-BR3, and BT-474, the A549 lung cancer cell line, and the H392 glioblastoma cell line, can be used. An intact cell is contacted with a test compound. Binding of the test compound to a MASP-like serine protease polypeptide or polynucleotide is determined as described above, after lysing the cell to release the MASP-like serine protease polypeptide-test compound complex.

Enzyme Assays

Test compounds can be tested for the ability to increase or decrease a MASP-like serine protease activity of a MASP-like serine protease polypeptide. MASP-like serine protease activity can be measured, for example, as described in Matsushita et al., 2000. MASP-like serine protease activity can be measured after contacting either a purified MASP-like serine protease polypeptide, a cell extract, or an intact cell with a test compound. A test compound which decreases MASP-like serine protease activity by at least about 10%, preferably about 50%, more preferably about 75%, 90%, or 100% is identified as a potential therapeutic agent for decreasing extracellular matrix degradation. A test compound which increases MASP-like serine protease activity by at least about 10%, preferably about 50%, more preferably about 75%, 90%, or 100% is identified as a potential therapeutic agent for increasing extracellular matrix degradation.
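The threshold comparison described above amounts to computing the percent change in activity relative to a control without test compound. The Python sketch below illustrates that arithmetic under stated assumptions: the function names, the use of arbitrary activity units, and the choice of the 10% figure as the default cut-off are illustrative and are not prescribed by the assay itself.

```python
def percent_change(activity_with, activity_without):
    """Percent change in activity caused by the test compound.

    Negative values indicate a decrease relative to the no-compound control.
    """
    if activity_without <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (activity_with - activity_without) / activity_without


def classify_compound(activity_with, activity_without, threshold=10.0):
    """Apply the 'at least about 10%' criterion (threshold is an assumption)."""
    change = percent_change(activity_with, activity_without)
    if change <= -threshold:
        return "candidate for decreasing extracellular matrix degradation"
    if change >= threshold:
        return "candidate for increasing extracellular matrix degradation"
    return "no call"
```

With a control reading of 100 units, a reading of 45 units in the presence of the compound corresponds to a 55% decrease and would meet both the 10% and the 50% criteria, whereas a reading of 95 units would not meet any of them.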

Gene Expression

In another embodiment, test compounds which increase or decrease MASP-like serine protease gene expression are identified. A MASP-like serine protease polynucleotide is contacted with a test compound, and the expression of an RNA or polypeptide product of the MASP-like serine protease polynucleotide is determined. The level of expression of MASP-like serine protease mRNA or polypeptide in the presence of the test compound is compared to the level of expression of MASP-like serine protease mRNA or polypeptide in the absence of the test compound. The test compound can then be identified as a modulator of expression based on this comparison. For example, when expression of MASP-like serine protease mRNA or polypeptide is greater in the presence of the test compound than in its absence, the test compound is identified as a stimulator or enhancer of MASP-like serine protease mRNA or polypeptide expression. Alternatively, when expression of the mRNA or protein is less in the presence of the test compound than in its absence, the test compound is identified as an inhibitor of MASP-like serine protease mRNA or polypeptide expression.
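The comparison just described can be illustrated with replicate expression measurements. In the following Python sketch, the function name `classify_expression`, the use of the arithmetic mean of replicates, and the optional `tolerance` band are assumptions added for illustration; the text itself specifies only a greater-than or less-than comparison between the two conditions.

```python
from statistics import mean


def classify_expression(with_compound, without_compound, tolerance=0.0):
    """Compare mean expression levels measured with and without a test compound.

    `tolerance` is a hypothetical fractional band around the control mean
    inside which no call is made (0.0 reproduces a plain comparison).
    """
    treated = mean(with_compound)
    control = mean(without_compound)
    if treated > control * (1 + tolerance):
        return "stimulator"
    if treated < control * (1 - tolerance):
        return "inhibitor"
    return "no effect detected"
```

For example, replicate mRNA signals of 2.0 and 2.2 with compound against 1.0 and 1.1 without would be called a stimulator, and the reverse pattern an inhibitor.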

The level of MASP-like serine protease mRNA or polypeptide expression in the cells can be determined by methods well known in the art for detecting mRNA or protein. Either qualitative or quantitative methods can be used. The presence of polypeptide products of a MASP-like serine protease polynucleotide can be determined, for example, using a variety of techniques known in the art, including immunochemical methods such as radioimmunoassay, Western blotting, and immunohistochemistry. Alternatively, polypeptide synthesis can be determined in vivo, in a cell culture, or in an in vitro translation system by detecting incorporation of labeled amino acids into a MASP-like serine protease polypeptide.

Such screening can be carried out either in a cell-free assay system or in an intact cell. Any cell which expresses a MASP-like serine protease polynucleotide can be used in a cell-based assay system. The MASP-like serine protease polynucleotide can be naturally occurring in the cell or can be introduced using techniques such as those described above. Either a primary culture or an established cell line, including neoplastic cell lines such as the colon cancer cell lines HCT116, DLD1, HT29, Caco2, SW837, SW480, and RKO, breast cancer cell lines 21-PT, 21-MT, MDA-468, SK-BR3, and BT-474, the A549 lung cancer cell line, and the H392 glioblastoma cell line, can be used.

Pharmaceutical Compositions

The invention also provides pharmaceutical compositions which can be administered to a patient to achieve a therapeutic effect. Pharmaceutical compositions of the invention can comprise a MASP-like serine protease polypeptide, MASP-like serine protease polynucleotide, antibodies which specifically bind to a MASP-like serine protease polypeptide, or mimetics, agonists, antagonists, or inhibitors of a MASP-like serine protease polypeptide. The compositions can be administered alone or in combination with at least one other agent, such as a stabilizing compound, which can be administered in any sterile, biocompatible pharmaceutical carrier, including, but not limited to, saline, buffered saline, dextrose, and water. The compositions can be administered to a patient alone, or in combination with other agents, drugs or hormones.

In addition to the active ingredients, these pharmaceutical compositions can contain suitable pharmaceutically-acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. Pharmaceutical compositions of the invention can be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal, parenteral, topical, sublingual, or rectal means. Pharmaceutical compositions for oral administration can be formulated using pharmaceutically acceptable carriers well known in the art in dosages suitable for oral administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by the patient.

Pharmaceutical preparations for oral use can be obtained through combination of active compounds with solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are carbohydrate or protein fillers, such as sugars, including lactose, sucrose, mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, such as methyl cellulose, hydroxypropylmethylcellulose, or sodium carboxymethylcellulose; gums including arabic and tragacanth; and proteins such as gelatin and collagen. If desired, disintegrating or solubilizing agents can be added, such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid, or a salt thereof, such as sodium alginate.

Dragee cores can be used in conjunction with suitable coatings, such as concentrated sugar solutions, which also can contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments can be added to the tablets or dragee coatings for product identification or to characterize the quantity of active compound, i.e., dosage.

Pharmaceutical preparations which can be used orally include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a coating, such as glycerol or sorbitol. Push-fit capsules can contain active ingredients mixed with a filler or binders, such as lactose or starches, lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In soft capsules, the active compounds can be dissolved or suspended in suitable liquids, such as fatty oils, liquid, or liquid polyethylene glycol with or without stabilizers.

Pharmaceutical formulations suitable for parenteral administration can be formulated in aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or physiologically buffered saline. Aqueous injection suspensions can contain substances which increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Additionally, suspensions of the active compounds can be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes.

Non-lipid polycationic amino polymers also can be used for delivery. Optionally, the suspension also can contain suitable stabilizers or agents which increase the solubility of the compounds to allow for the preparation of highly concentrated solutions. For topical or nasal administration, penetrants appropriate to the particular barrier to be permeated are used in the formulation. Such penetrants are generally known in the art.

The pharmaceutical compositions of the present invention can be manufactured in a manner that is known in the art, e.g., by means of conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing processes. The pharmaceutical composition can be provided as a salt and can be formed with many acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free base forms. In other cases, the preferred preparation can be a lyophilized powder which can contain any or all of the following: 1-50 mM histidine, 0.1%-2% sucrose, and 2-7% mannitol, at a pH range of 4.5 to 5.5, that is combined with buffer prior to use.
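The component ranges given for the lyophilized powder lend themselves to a simple range check. The Python sketch below is illustrative only: the dictionary keys, the function name `out_of_range`, and the example formulations are hypothetical, and components may be omitted because the text states that the powder can contain any or all of them.

```python
# Ranges quoted in the text for the lyophilized powder.
RANGES = {
    "histidine_mM": (1.0, 50.0),
    "sucrose_pct": (0.1, 2.0),
    "mannitol_pct": (2.0, 7.0),
    "pH": (4.5, 5.5),
}


def out_of_range(formulation):
    """Return the sorted names of components that fall outside the quoted ranges.

    `formulation` maps component names (keys of RANGES) to numeric values.
    """
    problems = []
    for name, value in formulation.items():
        if name not in RANGES:
            raise KeyError(f"no range is quoted for {name!r}")
        low, high = RANGES[name]
        if not low <= value <= high:
            problems.append(name)
    return sorted(problems)


# Hypothetical formulations used only to exercise the check.
within = {"histidine_mM": 20.0, "sucrose_pct": 1.0, "mannitol_pct": 5.0, "pH": 5.0}
outside = {"histidine_mM": 75.0, "pH": 6.0}
```

Here `out_of_range(within)` returns an empty list, while `out_of_range(outside)` flags both the histidine concentration and the pH.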

Further details on techniques for formulation and administration can be found in the latest edition of REMINGTON'S PHARMACEUTICAL SCIENCES (Mack Publishing Co., Easton, Pa.). After pharmaceutical compositions have been prepared, they can be placed in an appropriate container and labeled for treatment of an indicated condition. Such labeling would include amount, frequency, and method of administration.

Therapeutic Indications and Methods

In humans, binding of MASP to carbohydrates initiates the lectin pathway, which is involved in host defense against pathogens. Takahashi et al., 1999; Endo et al., 1998. Thus, MASP-like serine protease or reagents which increase its activity can be administered to patients to provide a defense against pathogens, such as viruses, bacteria, mycoplasma, fungi, protozoa, helminths, rickettsia, chlamydiae, parasites, prions, and the like. Alternatively, in diseases in which MASP-like serine protease is overexpressed, it may be useful to decrease expression levels using reagents which specifically bind to MASP-like serine protease or its gene, including antibodies, ribozymes, antisense oligonucleotides, or other molecules.

The invention further pertains to the use of novel agents identified by the screening assays described above. Accordingly, it is within the scope of this invention to use a test compound identified as described herein in an appropriate animal model. For example, an agent identified as described herein (e.g., a modulating agent, an antisense nucleic acid molecule, a specific antibody, ribozyme, or a polypeptide-binding partner) can be used in an animal model to determine the efficacy, toxicity, or side effects of treatment with such an agent. Alternatively, an agent identified as described herein can be used in an animal model to determine the mechanism of action of such an agent. Furthermore, this invention pertains to uses of novel agents identified by the above-described screening assays for treatments as described herein.

A reagent which affects MASP-like serine protease activity can be administered to a human cell, either in vitro or in vivo, to reduce MASP-like serine protease activity.

The reagent preferably binds to an expression product of a human MASP-like serine protease gene. If the expression product is a polypeptide, the reagent is preferably an antibody. For treatment of human cells ex vivo, an antibody can be added to a preparation of stem cells which have been removed from the body. The cells can then be replaced in the same or another human body, with or without clonal propagation, as is known in the art.

In one embodiment, the reagent is delivered using a liposome. Preferably, the liposome is stable in the animal into which it has been administered for at least about 30 minutes, more preferably for at least about 1 hour, and even more preferably for at least about 24 hours. A liposome comprises a lipid composition that is capable of targeting a reagent, particularly a polynucleotide, to a particular site in an animal, such as a human. Preferably, the lipid composition of the liposome is capable of targeting to a specific organ of an animal, such as the lung or liver.

A liposome useful in the present invention comprises a lipid composition that is capable of fusing with the plasma membrane of the targeted cell to deliver its contents to the cell. Preferably, the transfection efficiency of a liposome is about 0.5 µg of DNA per 16 nmol of liposome delivered to about 10⁶ cells, more preferably about 1.0 µg of DNA per 16 nmol of liposome delivered to about 10⁶ cells, and even more preferably about 2.0 µg of DNA per 16 nmol of liposome delivered to about 10⁶ cells. Preferably, a liposome is between about 100 and 500 nm, more preferably between about 150 and 450 nm, and even more preferably between about 200 and 400 nm in diameter.

Suitable liposomes for use in the present invention include those liposomes standardly used in, for example, gene delivery methods known to those of skill in the art. More preferred liposomes include liposomes having a polycationic lipid composition and/or liposomes having a cholesterol backbone conjugated to polyethylene glycol. Optionally, a liposome comprises a compound capable of targeting the liposome to a tumor cell, such as a tumor cell ligand exposed on the outer surface of the liposome.

Complexing a liposome with a reagent such as an antisense oligonucleotide or ribozyme can be achieved using methods which are standard in the art (see, for example, U.S. Patent 5,705,151). Preferably, from about 0.1 µg to about 10 µg of polynucleotide is combined with about 8 nmol of liposomes, more preferably from about 0.5 µg to about 5 µg of polynucleotide is combined with about 8 nmol of liposomes, and even more preferably about 1.0 µg of polynucleotide is combined with about 8 nmol of liposomes.

In another embodiment, antibodies can be delivered to specific tissues in vivo using receptor-mediated targeted delivery. Receptor-mediated DNA delivery techniques are taught in, for example, Findeis et al., Trends in Biotechnol. 11, 202-05 (1993); Chiou et al., GENE THERAPEUTICS: METHODS AND APPLICATIONS OF DIRECT GENE TRANSFER (J. A. Wolff, ed.) (1994); Wu & Wu, J. Biol. Chem. 263, 621-24 (1988); Wu et al., J. Biol. Chem. 269, 542-46 (1994); Zenke et al., Proc. Natl. Acad. Sci. U.S.A. 87, 3655-59 (1990); Wu et al., J. Biol. Chem. 266, 338-42 (1991).

If the reagent is a single-chain antibody, polynucleotides encoding the antibody can be constructed and introduced into a cell either ex vivo or in vivo using well-established techniques including, but not limited to, transferrin-polycation-mediated DNA transfer, transfection with naked or encapsulated nucleic acids, liposome-mediated cellular fusion, intracellular transportation of DNA-coated latex beads, protoplast fusion, viral infection, electroporation, "gene gun," and DEAE- or calcium phosphate-mediated transfection.

Determination of a Therapeutically Effective Dose

The determination of a therapeutically effective dose is well within the capability of those skilled in the art. A therapeutically effective dose refers to that amount of active ingredient which increases or decreases extracellular matrix degradation relative to that which occurs in the absence of the therapeutically effective dose.

For any compound, the therapeutically effective dose can be estimated initially either in cell culture assays or in animal models, usually mice, rabbits, dogs, or pigs. The animal model also can be used to determine the appropriate concentration range and route of administration. Such information can then be used to determine useful doses and routes for administration in humans.

Therapeutic efficacy and toxicity, e.g., ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population), can be determined by standard pharmaceutical procedures in cell cultures or experimental animals. The dose ratio of toxic to therapeutic effects is the therapeutic index, and it can be expressed as the ratio LD50/ED50.

Pharmaceutical compositions which exhibit large therapeutic indices are preferred.
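As a worked illustration of the ratio defined above, the following Python sketch computes a therapeutic index from an LD50 and an ED50. The function name and the numeric doses are hypothetical and are not taken from this disclosure; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50; both doses must share the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only to illustrate the arithmetic.
example_index = therapeutic_index(ld50=200.0, ed50=5.0)
print(example_index)
```

A larger value means a wider margin between the effective dose and the lethal dose, which is why compositions with large indices are preferred.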

The data obtained from cell culture assays and animal studies is used in formulating a range of dosage for human use. The dosage contained in such compositions is preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, sensitivity of the patient, and the route of administration.

The exact dosage will be determined by the practitioner, in light of factors related to the subject that requires treatment. Dosage and administration are adjusted to provide sufficient levels of the active ingredient or to maintain the desired effect.

Factors which can be taken into account include the severity of the disease state, general health of the subject, age, weight, and gender of the subject, diet, time and frequency of administration, drug combination(s), reaction sensitivities, and tolerance/response to therapy. Long-acting pharmaceutical compositions can be administered every 3 to 4 days, every week, or once every two weeks depending on the half-life and clearance rate of the particular formulation.

Normal dosage amounts can vary from 0.1 to 100,000 micrograms, up to a total dose of about 1 g, depending upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature and generally available to practitioners in the art. Those skilled in the art will employ different formulations for nucleotides than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, conditions, locations, etc.

Effective in vivo dosages of an antibody are in the range of about 5 µg to about 50 µg/kg, about 50 µg to about 5 mg/kg, about 100 µg to about 500 µg/kg of patient body weight, and about 200 to about 250 µg/kg of patient body weight. For administration of polynucleotides encoding single-chain antibodies, effective in vivo dosages are in the range of about 100 ng to about 200 ng, 500 ng to about 50 mg, about 1 µg to about 2 mg, about 5 µg to about 500 µg, and about 20 µg to about 100 µg of DNA.

If the expression product is mRNA, the reagent is preferably an antisense oligonucleotide or a ribozyme. Polynucleotides which express antisense oligonucleotides or ribozymes can be introduced into cells by a variety of methods, as described above.

Preferably, a reagent reduces expression of a MASP-like serine protease polynucleotide or activity of a MASP-like serine protease polypeptide by at least about 10, preferably about 50, more preferably about 75, 90, or 100% relative to the absence of the reagent. The effectiveness of the mechanism chosen to decrease the level of expression of a MASP-like serine protease polynucleotide or the activity of a MASP-like serine protease polypeptide can be assessed using methods well known in the art, such as hybridization of nucleotide probes to MASP-like serine protease-specific mRNA, quantitative RT-PCR, immunologic detection of a MASP-like serine protease polypeptide, or measurement of MASP-like serine protease activity.

In any of the embodiments described above, any of the pharmaceutical compositions of the invention can be administered in combination with other appropriate therapeutic agents. Selection of the appropriate agents for use in combination therapy can be made by one of ordinary skill in the art, according to conventional pharmaceutical principles. The combination of therapeutic agents can act synergistically to effect the treatment or prevention of the various disorders described above. Using this approach, one may be able to achieve therapeutic efficacy with lower dosages of each agent, thus reducing the potential for adverse side effects.

Any of the therapeutic methods described above can be applied to any subject in need of such therapy, including, for example, mammals such as dogs, cats, cows, horses, rabbits, monkeys, and most preferably, humans.

The above disclosure generally describes the present invention, and all patents and patent applications cited in this disclosure are expressly incorporated herein. A more complete understanding can be obtained by reference to the following specific examples which are provided for purposes of illustration only and are not intended to limit the scope of the invention.

EXAMPLE 1
Detection of MASP-like serine protease activity

The polynucleotide of SEQ ID NO: 1 is inserted into the expression vector pCEV4 and the expression vector pCEV4-MASP-like serine protease polypeptide obtained is transfected into human embryonic kidney 293 cells.

Protease activity of cellular extracts from the transfected cells is measured using thiobenzyl ester substrates, as described in U.S. Patent 5,500,344. For monitoring enzyme activities from granules and column fractions, assays are performed at room temperature using 0.5 mM 5,5'-dithiobis-(2-nitrobenzoic acid) (DTNB) (Sigma) to detect the HSBzl leaving group (ε410 = 13600 M⁻¹ cm⁻¹).

Furthermore, BLT-esterase activity is estimated using a microtiter assay (Green and Shaw, Anal. Biochem. 93, 223-226, 1979). Briefly, 50 µl of sample is added to 100 µl of 1 mM DTNB, made up in 10 mM HEPES, 1 mM CaCl2, 1 mM MgCl2, pH 7.2. The reaction is initiated by the addition of 50 µl of BLT (Sigma) to give a final concentration of 500 µM. For Met-ase determinations, 50 µl of dilutions of the sample in 0.1 M HEPES, 0.05 M CaCl2, pH 7.5, are added to 100 µl of 1 mM DTNB, and the reaction is initiated by the addition of 50 µl of Boc-Ala-Ala-Met-S-benzyl (Bzl) to give a final concentration of 150 µM. The duration of the assay depends on color development, the rate of which is measured (O.D. 410) on a Dynatech MR 5000 microplate reader. Controls of sample and DTNB alone or DTNB and substrate alone are run.
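The well composition in the BLT microtiter assay (50 µl sample, 100 µl of 1 mM DTNB, 50 µl BLT) implies a total volume of 200 µl. The Python sketch below works through the resulting dilution arithmetic. It assumes the volumes are additive; the function names are hypothetical, and the BLT stock concentration it derives is an inference from the stated final concentration, not a value given in the text.

```python
def final_concentration(stock_conc: float, added_volume: float, total_volume: float) -> float:
    """C_final = C_stock * V_added / V_total (volumes in the same units)."""
    return stock_conc * added_volume / total_volume


def required_stock(final_conc: float, added_volume: float, total_volume: float) -> float:
    """Stock concentration needed to reach final_conc after dilution."""
    return final_conc * total_volume / added_volume


# Volumes from the text, in microliters: sample + DTNB + BLT substrate.
total_ul = 50 + 100 + 50

# 100 ul of 1 mM DTNB diluted into the full well volume.
dtnb_final_mM = final_concentration(1.0, 100, total_ul)

# BLT stock (uM) that 50 ul must contain to give 500 uM in the well (inferred).
blt_stock_uM = required_stock(500, 50, total_ul)

print(dtnb_final_mM, blt_stock_uM)
```

Under these assumptions the DTNB in the well works out to 0.5 mM, which matches the DTNB concentration quoted for the cuvette-free assays earlier in this example.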

Additionally, peptide thiobenzyl ester substrates are used to measure protease activities. The chymase substrate Suc-Phe-Leu-Phe-SBzl is purchased from BACHEM Bioscience Inc., Philadelphia, Pa. Z-Arg-SBzl (the tryptase substrate; Kam et al., J. Biol. Chem. 262, 3444-3451, 1987), Boc-Ala-Ala-AA-SBzl (AA = Asp, Met, Leu, Nle, or Ser), and Suc-Ala-Ala-Met-SBzl (Odake et al., Biochemistry 30, 2217-2227, 1991; Harper et al., Biochemistry 23, 2995-3002, 1984) were synthesized previously. Boc-Ala-Ala-Asp-SBzl is the substrate for Asp-ase, and peptide thiobenzyl esters containing Met, Leu, or Nle are substrates for Met-ase SP. Assays are performed at room temperature in 0.1 M HEPES buffer, pH 7.5, containing 0.01 M CaCl2 and 8% Me2SO, using 0.34 mM 4,4'-dithiodipyridine (Aldrithiol-4, Aldrich Chemical Co., Milwaukee, Wis.) to detect the HSBzl leaving group, which reacts with 4,4'-dithiodipyridine to release thiopyridone (ε324 = 19800 M⁻¹ cm⁻¹; Grassetti and Murray, Arch. Biochem. Biophys. 119, 41-49, 1967). The initial rates are measured at 324 nm using a Beckman 35 spectrophotometer when 10-25 µl of an enzyme stock solution is added to a cuvette containing 2.0 ml of buffer, 150 µl of 4,4'-dithiodipyridine, and 25 µl of substrate. The same volumes of substrate and 4,4'-dithiodipyridine are added to the reference cell in order to compensate for the background hydrolysis rate of the substrates. Initial rates are measured in duplicate for each substrate concentration and are averaged in each case. Substrate concentrations are 100-133 µM. The MASP-like serine protease activity of the polypeptide of SEQ ID NO: 2 is shown.
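The conversion from a measured absorbance slope at 324 nm to a hydrolysis rate follows the Beer-Lambert law, rate = (dA/dt) / (ε × l). The Python sketch below averages duplicate slopes and applies the thiopyridone extinction coefficient quoted above. The 1 cm path length, the function name, and the example slopes are assumptions made for illustration; none of them is specified in the text.

```python
from statistics import mean

# Extinction coefficient of thiopyridone at 324 nm, from the text (M^-1 cm^-1).
EPSILON_324 = 19800.0


def hydrolysis_rate_uM_per_min(slopes_per_min, path_length_cm=1.0, epsilon=EPSILON_324):
    """Average replicate dA/dt values and convert to uM/min via Beer-Lambert.

    slopes_per_min: absorbance change per minute, already corrected for
    background hydrolysis by the reference cell.
    path_length_cm: cuvette path length; 1 cm is an assumption.
    """
    average_slope = mean(slopes_per_min)
    rate_molar_per_min = average_slope / (epsilon * path_length_cm)
    return rate_molar_per_min * 1e6  # M/min -> uM/min


# Hypothetical duplicate slopes (absorbance units per minute).
rate = hydrolysis_rate_uM_per_min([0.0190, 0.0206])
print(f"{rate:.2f} uM/min")
```

Because the reference cell already holds the same substrate and 4,4'-dithiodipyridine volumes, no separate blank subtraction is modeled here.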

EXAMPLE 2
Identification of a test compound which binds to a MASP-like serine protease polypeptide

Purified MASP-like serine protease polypeptides comprising a glutathione-S-transferase protein and adsorbed onto glutathione-derivatized wells of 96-well microtiter plates are contacted with test compounds from a small molecule library at pH 7.0 in a physiological buffer solution. MASP-like serine protease polypeptides comprise the amino acid sequence shown in SEQ ID NO: 2. The test compounds comprise a fluorescent tag. The samples are incubated for 5 minutes to one hour.

Control samples are incubated in the absence of a test compound.

The buffer solution containing the test compounds is washed from the wells.

Binding of a test compound to a MASP-like serine protease polypeptide is detected by fluorescence measurements of the contents of the wells. A test compound which increases the fluorescence in a well by at least 15% relative to fluorescence of a well in which a test compound was not incubated is identified as a compound which binds to a MASP-like serine protease polypeptide.
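The 15% criterion above can be expressed as a simple comparison against the control well. The following Python sketch is one way to apply it; the function name and the fluorescence readings are hypothetical, and the readings are assumed to be in the same arbitrary units for test and control wells.

```python
def is_binder(test_fluorescence: float, control_fluorescence: float,
              min_fractional_increase: float = 0.15) -> bool:
    """Return True if the test well exceeds the control well by >= 15%."""
    if control_fluorescence <= 0:
        raise ValueError("control fluorescence must be positive")
    increase = (test_fluorescence - control_fluorescence) / control_fluorescence
    return increase >= min_fractional_increase


# Hypothetical plate-reader values (arbitrary fluorescence units).
control = 1000.0
readings = {"compound_A": 1200.0, "compound_B": 1100.0}
binders = [name for name, value in readings.items() if is_binder(value, control)]
print(binders)
```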

EXAMPLE 3
Identification of a test compound which decreases MASP-like serine protease activity

Cellular extracts from the human colon cancer cell line HCT116 are contacted with test compounds from a small molecule library and assayed for MASP-like serine protease activity. Control extracts, in the absence of a test compound, also are assayed. MASP-like serine protease activity is measured as described in Matsushita et al., 2000. A test compound which decreases MASP-like serine protease activity of the extract relative to the control extract by at least 20% is identified as a MASP-like serine protease inhibitor.

EXAMPLE 4
Identification of a test compound which decreases MASP-like serine protease gene expression

A test compound is administered to a culture of the breast tumor cell line MDA-468 and incubated at 37 °C for 10 to 45 minutes. A culture of the same type of cells incubated for the same time without the test compound provides a negative control.

RNA is isolated from the two cultures as described in Chirgwin et al., Biochem. 18, 5294-99 (1979). Northern blots are prepared using 20 to 30 µg total RNA and hybridized with a 32P-labeled MASP-like serine protease-specific probe at 65 °C in Express-hyb (CLONTECH). The probe comprises at least 11 contiguous nucleotides selected from the complement of SEQ ID NO: 1. A test compound which decreases the MASP-like serine protease-specific signal relative to the signal obtained in the absence of the test compound is identified as an inhibitor of MASP-like serine protease gene expression.

EXAMPLE 5
Treatment of a bacterial infection with MASP-like serine protease

An expression construct which expresses MASP-like serine protease is administered to a patient with a bacterial infection. The patient is now capable of producing higher levels of MASP-like serine protease. The patient's bacterial infection is monitored over a period of days or weeks. Additional injections of the expression construct can be given during that time. The severity of the patient's bacterial infection is decreased due to increased MASP-like serine protease activity.