Title:
E2F-LIKE TRANSCRIPTION REPRESSOR AND DNA ENCODING IT
Document Type and Number:
WIPO Patent Application WO/1997/035975
Kind Code:
A1
Abstract:
A novel transcription factor is provided, along with DNA encoding it, polypeptides which bind it, peptide fragments, mimetics, oligonucleotides, and methods employing these. The transcription factor, for which both mouse and human sequences are provided, has some similarity to E2F but, in contrast with E2F, is a transcriptional repressor. It is involved in cellular proliferation and is indicated to be an oncogene.

Inventors:
KOUZARIDES TONY (GB)
HAGEMEIER CHRISTIAN (DE)
Application Number:
PCT/GB1997/000833
Publication Date:
October 02, 1997
Filing Date:
March 25, 1997
Assignee:
CANCER RES CAMPAIGN TECH (GB)
KOUZARIDES TONY (GB)
HAGEMEIER CHRISTIAN (DE)
International Classes:
C07K14/47; C12N15/12; A61K38/00; (IPC1-7): C12N15/12; A01K67/027; A61K38/17; C07K14/47; C12Q1/68; G01N33/53
Other References:
OUELLETTE, MICHEL M. ET AL: "Complexes containing the retinoblastoma gene product recognize different DNA motifs related to the E2F binding site", ONCOGENE (1992), 7(6), 1075-81 CODEN: ONCNES;ISSN: 0950-9232, XP002036017
WEINTRAUB SJ ET AL: "Mechanism of active transcriptional repression by the retinoblastoma protein.", NATURE, JUN 29 1995, 375 (6534) P812-5, ENGLAND, XP002036018
SARDET C ET AL: "E2F-4 and E2F-5, two members of the E2F family, are expressed in the early phases of the cell cycle.", PROC NATL ACAD SCI U S A, MAR 14 1995, 92 (6) P2403-7, UNITED STATES, XP002036019
Claims:
1. A nucleic acid molecule which has a nucleotide sequence encoding a polypeptide including the amino acid sequence shown in Figure 1.
2. A nucleic acid molecule according to claim 1 wherein the nucleotide sequence encoding said polypeptide is the coding sequence shown in Figure 1.
3. A nucleic acid molecule according to claim 1 wherein the nucleotide sequence encoding said polypeptide is a mutant, variant, derivative or allele of the coding sequence shown in Figure 1, by way of addition, deletion, substitution and/or insertion of one or more nucleotides.
4. A nucleic acid molecule which has a nucleotide sequence encoding a polypeptide which has an amino acid sequence which is a mutant, variant, derivative, allele or homologue of the amino acid sequence shown in Figure 1, by way of addition, deletion, substitution and/or insertion of one or more amino acids, the polypeptide having one or more property selected from the following: (i) having ability to bind nucleic acid at a sequence motif including TCCCGC and being unable to bind nucleic acid at a sequence motif including TCGCGC; (ii) having a region including the 62 amino acids shown at the N-terminus of the polypeptide in Figure 1, or a sequence of at least 2) contiguous amino acids with at least 50% homology with part of the 62 amino acids shown at the N-terminus of the polypeptide in Figure 1, the polypeptide having transcriptional repressor activity; (iii) having at least 70% homology over a region of about 50 amino acids with a region of the polypeptide for which the amino acid sequence is shown in Figure 1 and transcriptional repressor activity.
5. A nucleic acid molecule according to claim 4 wherein the polypeptide includes the amino acid sequence shown in Figure 3.
6. A nucleic acid molecule according to claim 5 wherein the nucleotide sequence encoding said polypeptide is the coding sequence shown in Figure 2.
7. A nucleic acid molecule according to claim 5 wherein the nucleotide sequence encoding said polypeptide is a mutant, variant, derivative or allele of the coding sequence shown in Figure 2, by way of addition, deletion, insertion and/or substitution of one or more nucleotides.
8. A nucleic acid molecule according to any preceding claim which is a replicable vector.
9. A nucleic acid molecule according to any preceding claim wherein the nucleotide sequence encoding said polypeptide is under the control of regulatory elements for expression of the polypeptide.
10. A nucleic acid molecule according to claim 9 wherein at least one said regulatory element is heterologous to the nucleotide sequence encoding said polypeptide.
11. A host cell containing a heterologous nucleic acid molecule according to any of claims 1 to 8.
12. A host cell containing a heterologous nucleic acid molecule according to claim 9 or claim 10.
13. A method including culturing a host cell according to claim 11 or claim 12 under conditions for expression of the polypeptide.
14. A method including causing or allowing expression from a nucleic acid molecule according to claim 9 or claim 10 in an in vitro expression system to produce said polypeptide.
15. A method which includes, following expression of a polypeptide encoded by a nucleic acid according to claim 9 or claim 10 in accordance with the method of claim 13 or claim 14, isolation and/or purification of the polypeptide.
16. A method which includes, following isolation and/or purification of a polypeptide encoded by nucleic acid according to claim 9 or claim 10 in accordance with the method of claim 15, formulation of the polypeptide into a composition containing at least one additional component.
17. A method including introducing a nucleic acid molecule according to any of claims 1 to 8 into a host cell.
18. A method according to claim 17 which takes place in vitro.
19. A method including introducing a nucleic acid molecule according to claim 9 or claim 10 into a host cell.
20. A method according to claim 19 which takes place in vitro.
21. A method including, following introduction of nucleic acid according to claim 9 or claim 10 into a host cell in accordance with the method of claim 19 or claim 20, causing or allowing expression from the nucleic acid molecule to produce the polypeptide.
22. A host cell according to claim 11 or claim 12 which is comprised in a mammal.
23. A mammal having a host cell according to claim 11 or claim 12 within its body.
24. A polypeptide encoded by nucleic acid according to any of claims 1 to 8.
25. A composition containing a polypeptide according to claim 24 and at least one additional component.
26. A composition according to claim 25 containing a pharmaceutically acceptable excipient.
27. A method of repressing transcription from a nucleic acid molecule containing a sequence of nucleotides under transcriptional control of a regulatory element containing a binding site for a polypeptide according to any of claims 1 to 8, the method including, within a host cell, causing or allowing a polypeptide according to claim 24 and heterologous to the host cell to bind said binding site to repress said transcription.
28. A method of repressing transcription from a nucleic acid molecule containing a sequence of nucleotides under transcriptional control of a regulatory element containing a binding site for a polypeptide according to any of claims 1 to 8, the method including, within an in vitro expression system, causing or allowing a polypeptide according to claim 24 and heterologous to the host cell to bind said binding site to repress said transcription.
29. A method according to claim 27 or claim 28 wherein said binding site includes the sequence TCCCGC.
30. A peptide fragment of a polypeptide according to claim 24 containing an epitope able to be bound by an antibody which does not bind any of E2F1-5 for which the amino acid sequences are given in Figure 4(a).
31. A substance containing an antibody antigen binding site specific for a polypeptide according to claim 24.
32. A substance according to claim 31 wherein the antibody antigen binding site is specific for a peptide fragment according to claim 30.
33. A substance according to claim 31 wherein the antibody antigen binding site is specific for a polypeptide which has the amino acid sequence shown in Figure 1.
34. A substance according to claim 31 wherein the antibody antigen binding site is specific for a polypeptide which has the amino acid sequence shown in Figure 3.
35. A substance according to claim 31 or claim 32 which is able to inhibit transcriptional repressor activity of a polypeptide according to claim 24.
36. A substance according to claim 35 able to inhibit transcriptional repressor activity of a polypeptide which has the amino acid sequence shown in Figure 1.
37. A substance according to claim 36 able to inhibit transcriptional repressor activity of a polypeptide which has the amino acid sequence shown in Figure 3.
38. A composition containing a substance according to any of claims 31 to 37 and at least one additional component.
39. A composition according to claim 38 containing a pharmaceutically acceptable excipient.
40. A method which includes causing or allowing a substance according to claim 31 or claim 32 to bind a polypeptide according to any of claims 1 to 8.
41. A method which includes causing or allowing a substance according to any of claims 35 to 37 to inhibit said transcriptional repressor activity.
42. A method according to claim 40 or claim 41 which takes place in vitro.
43. A peptide fragment of a polypeptide according to claim 24 able to inhibit transcriptional repressor activity of said polypeptide.
44. A peptide fragment according to claim 43 able to inhibit transcriptional repressor activity of a polypeptide which has the amino acid sequence shown in Figure 1.
45. A peptide fragment according to claim 43 able to inhibit transcriptional repressor activity of a polypeptide which has the amino acid sequence shown in Figure 3.
46. A composition containing a peptide fragment according to any of claims 43 to 45 and at least one additional component.
47. A composition according to claim 46 containing a pharmaceutically acceptable excipient.
48. A method which includes causing or allowing a peptide fragment according to any of claims 43 to 45 to inhibit said transcriptional repressor activity.
49. A method according to claim 48 which takes place in vitro.
50. A functional non-peptide mimetic of a peptide fragment according to any of claims 43 to 45.
51. A composition containing a mimetic according to claim 50 and a pharmaceutically acceptable excipient.
52. A method of obtaining a substance able to bind a polypeptide according to claim 24, the method including contacting said polypeptide with a test substance and determining binding between the polypeptide and the test substance.
53. A method of obtaining a substance able to modulate transcriptional repressor activity of a polypeptide according to claim 24, the method including providing in an expression system the polypeptide and a nucleic acid molecule including a reporter gene under transcriptional control of a promoter including a binding site for the polypeptide, supplying to the expression system test substances and determining promoter activity.
54. A method according to claim 52 or claim 53 wherein the substance obtained is able to bind and/or modulate the transcriptional repressor activity of a polypeptide with the amino acid sequence shown in Figure 1.
55. A method according to claim 52 or claim 53 wherein the substance obtained is able to bind and/or modulate the transcriptional repressor activity of a polypeptide with the amino acid sequence shown in Figure 3.
56. A method which includes, following obtaining of a substance in accordance with any of claims 52 to 53 able to bind and/or modulate the transcriptional repressor activity of said polypeptide, formulation of the substance into a composition containing at least one additional component.
57. A method according to claim 56 wherein the composition contains a pharmaceutically acceptable excipient.
58. A method which includes inhibiting production of a polypeptide according to claim 24.
59. A method according to claim 58 wherein the inhibition is of transcription from DNA of RNA encoding the polypeptide.
60. A method according to claim 58 or claim 59 which takes place in vitro.
61. An oligonucleotide fragment of the coding nucleotide sequence shown in Figure 1 which is (i) useful as a specific PCR primer for amplification of said coding sequence or a fragment thereof, (ii) useful in antisense downregulation of expression of said coding sequence, and/or (iii) at least 14 nucleotides in length.
62. An oligonucleotide fragment of the coding nucleotide sequence shown in Figure 2 which is (i) useful as a specific PCR primer for amplification of said coding sequence or a fragment thereof, (ii) useful in antisense downregulation of expression of said coding sequence, and/or (iii) at least 14 nucleotides in length.
63. The use of a nucleic acid molecule according to any of claims 1 to 8, a polypeptide according to claim 24, a peptide fragment according to claim 30, a substance according to any of claims 31 to 37, a peptide fragment according to any of claims 43 to 45, a functional non-peptide mimetic according to claim 50, or an oligonucleotide according to claim 61 or claim 62 in the manufacture of a medicament for inhibiting cell proliferation.
Description:
E2F-LIKE TRANSCRIPTION REPRESSOR AND DNA ENCODING IT

The present invention relates to a novel transcription factor, to DNA encoding it and to the application of these in diagnosis and therapy in particular of proliferative disorders such as cancer, indications being that the gene is able to stimulate cell proliferation and has the characteristics of an oncogene.

The E2F/DP family of transcription factors plays a key role in regulating the mammalian cell cycle. These factors activate genes required for S-phase and in doing so can ultimately promote cell proliferation. Both E2F and DP family members have been shown to be oncogenic and E2F1 has been demonstrated to be a potent inducer of S-phase (Lam, E.-F. et al. Current Opinion in Cell Biology 1994, 6: 859-866).

The transcription activation capacity (and hence the oncogenicity) of the E2F/DP family is kept in check by the Retinoblastoma tumour suppressor family of proteins (RB, p107, p130). Members of this family bind to a transcriptional activation domain within the E2F protein. By doing so, the RB protein family members can silence the transcriptional activation capacity of the E2F/DP proteins and thus cause arrest in the G1-phase of the cell cycle. Release of RB from E2F/DP results in S-phase induction. This release is mediated by phosphorylation events (on RB and E2F) carried out by cyclin/CDK complexes towards the end of the G1 phase (Whyte, P. The retinoblastoma protein and its relatives. Seminars in Cancer Biology 1995, 6: 83-90).

There are five identified members of the E2F family (E2F1-5) and three members of the DP family (DP1-3). All E2F members can form heterodimers with all DP members. These heterodimers can bind and transactivate the promoters of S-phase genes.

The E2F and DP proteins share a common class of DNA binding and dimerisation domain which allows them to form heterodimers and bind "E2F" binding sites co-operatively. Outside the DNA binding/dimerisation domain the E2F family members have other sequences in common. They all have a highly conserved "marked box", whose function is unknown, and a transcriptional activation domain at the C-terminus which contains the binding site for the RB family of proteins (Fig. 4B). The DP family of proteins does not possess any similarity to E2F proteins outside the DNA binding domain. However, its members do contain highly conserved sequences which define this family.

The activity of the various E2F/DP heterodimers comes from the transcriptional activity of the E2F partner. No activation functions have been attributed to DP proteins. The activation capacity of the E2F/DP complexes is negatively regulated by different members of the RB family: RB binds and represses E2F1-3 whereas p107 and p130 can bind and repress E2F4 and E2F5.

However, evidence has recently been reported for the existence of "E2F-like" sites within promoters, which act as negative regulatory elements. Mutagenesis of such sites leads to an increase in activity of the promoter (Lam E.W. and Watson R.J. (1993) EMBO J., 12.7: 2705-2713, and Zwicker J. et al. (1995) EMBO J., 14.18: 4514-4522). Indeed there is evidence that the E2F-RB complex may negatively regulate expression of certain promoters (Weintraub et al., Nature (1992) 358, 259-261). In fact RB can repress the basal activity of promoters even in the absence of E2F if it is directed to the promoter via a heterologous DNA binding domain (Weintraub S.J. et al., Nature, 375, 812-815).

The applicants have identified a new "E2F-like" protein which has been called EMA. This protein, EMA, does not represent another E2F family member (i.e. it is not "E2F-related") but is "E2F-like" because it shares some but not all the sequence characteristics of the E2F family.

In the following description, reference will be made to the accompanying diagrammatic drawings in which:

Figure 1 shows the DNA sequence and predicted amino acid sequence of the new "E2F-like" murine clone EMA;

Figure 2 shows the coding DNA sequence for human EMA.

Figure 3 shows the human EMA amino acid sequence encoded by the DNA sequence of Figure 2.

Figure 4a: Amino-acid sequence of mouse EMA.

Figure 4(a): Alignment of mouse EMA with dE2F (ref 10), E2F-1 (refs 11, 12), E2F-2 (ref 13), E2F-3 (ref 13), E2F-4 (ref 6) and E2F-5 (ref 15). Solid bars on top of the aligned sequences indicate the DNA-binding, dimerisation and activation domain including the RB-binding site, as well as the Marked Box. Amino-acid residues which are identical or similar (D=E, R=K and I=V=L) between all proteins are boxed. Two unique protein regions at the N-terminus (<+79 AA>) and C-terminus (<+166 AA>) of E2F have been deleted for convenience of the alignment.

Figure 4(b): A diagrammatic representation of the sequences within EMA which show similarity to E2F family members, E2F1-5. This represents a schematic of the alignment in (a) showing the percentage of identical amino-acid residues within the DNA-binding (DBD) and dimerisation domain (DD) and the Marked Box (MB) between mouse EMA and each E2F family member. The activation domain of the E2Fs is shown in black.

Figure 4(c): Northern blot analyses of EMA mRNA isolated from various tissues. A mouse tissue Northern blot (Clontech) was probed with a 683 bp EMA cDNA probe (nucleotides 1-683). Each lane contains poly(A)+ RNA from the indicated tissues. Size markers are given on the left in kilobases. Reprobing of the identical blot with a GAPDH probe (bottom panel) allowed quantitative evaluation of the signals shown as bar charts in the top panel.

Figure 4(d): EMA binds DP-1 in vitro. EMA (amino acids 62-272) and E2F-1 and human cytomegalovirus IE2 (ref 15) as positive and negative controls, respectively, were translated in reticulocyte lysate, radioactively labelled and subjected to a glutathione S-transferase (GST)-pulldown assay using GST-DP-1 and GST as indicated. 25% of the input protein is shown. Molecular weight markers are indicated on the left in kilodaltons.

Figure 5: EMA binds a subset of E2F sites.

Figure 5(a): Rationale of a modified binding site selection assay. Mixed bacterial lysates containing GST-EMA and GST-DP-1 were collected on glutathione-agarose beads prior to incubation with a pool of double stranded binding site selection oligonucleotides containing a randomised central portion (see Methods). Functional GST-EMA/GST-DP-1 heterodimers on beads were expected to bind specific oligonucleotides (SO) presenting a high affinity binding site but not to non-specific oligonucleotides (NSO). Specifically bound oligonucleotides were amplified via defined regions flanking the randomised core region of the double stranded oligonucleotides and re-incubated with the protein-loaded beads. This procedure was repeated four times before selected oligonucleotides were sequenced.

Figure 5(b): All sequences obtained through the binding site selection assay are shown. A statistical evaluation of each selected position is given. The core region of the consensus sequence is boxed.

Figures 5(c) and 5(d): Gel retardation assay using the E2F site (TTTCGCGC) of the adenoviral E2 promoter (ref 16) or the double stranded oligonucleotide number 12 from the binding site selection assay (TTTCCCGC). Probes were incubated with GST fusion proteins of E2F-1, DP-1 and/or EMA as indicated. 1000 fold excess of specific (SC) and non-specific competitor (NC) was used as shown.

Figure 6: EMA contains a transcriptional repressor activity. Figure 6(a): Constructs used in transient transfection assays.

Figure 6(b): Transient transfection assays in HeLa cells using 7 mg of G5TKCAT (ref 23) and 1 mg of control plasmid pSG5 (none) or 1 mg of GAL fusion plasmids as indicated. A typical result of a CAT assay is shown.

Figure 6(c): Average of three independent experiments (two for GALE2F284-359 and GALRB). Quantitative analysis of CAT activity was done using a phosphorimager (Bio-Rad).

Figure 7: EMA is a cell cycle regulatory protein. NIH3T3 cells were transfected with a CD20 expression plasmid (pSG5-CD20) in combination with CMV expression vectors pCMVBamNEO (control), pCMVEMA (EMA) and pCMVE2F-1 (E2F-1). 24 hours after transfection, cells were analysed by FACS. The DNA histograms show the cell cycle profiles of CD20-positive cells and contain data from at least 5000 cells each.

EMA was identified in a two-hybrid screen in yeast cells as described in detail hereinafter. In these experiments, a region of the DP1 protein (146-410), which includes the E2F interacting domain, was fused to a LexA DNA binding domain and used as a bait against a mouse cDNA library (made from 10 A day embryos) linked to the VP16 activation domain. The methods used are described in Vojtek, A.B., Hollenberg, S.M. and Cooper, J.A. (1993). Cell, 74: 205-214. This screen identified 60 clones which interacted specifically with the LexA DP1 146-410 fusion. Sequencing showed that 58 of these clones represented members of E2F1-5. However, one of the clones was a novel sequence which was related to but distinct from other characterised E2F proteins.

The cDNA clones of EMA isolated in the two-hybrid screen did not represent the full-length gene. The sequence of the clone isolated had high sequence similarity to the leucine zipper and marked box of the E2F family. Using this short cDNA as a probe, a full-length cDNA was isolated from a mouse cDNA library. This full-length cDNA was sequenced and the sequence is shown in Figure 1 herein.

Sequencing of the DNA revealed that murine EMA contains a number of features that make it E2F-like: it has a basic DNA binding motif, a HLH region, a leucine zipper motif and a marked box (Figures 1, 4a and 4b). The 819 base pair open reading frame translates into a protein of 272 amino-acids with a predicted molecular mass of 30 kDa. As shown in Figure 4a, sequence analysis indicates between 25% and 28% identity of EMA to E2F1-5 and Drosophila E2F. However, it also contains novel features which provide indication that its function is distinct from that of other E2F members. Firstly, it has a distinct N-terminus, unrelated to other E2Fs. Secondly, and most importantly, EMA does not contain a transcriptional activation domain. The end of the EMA protein sequence falls just short of the transcriptional activation domain present at the C-terminus of all E2F proteins.
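The figures quoted above (an 819 base pair open reading frame, 272 amino acids, about 30 kDa) can be checked against each other with simple arithmetic. The following is a minimal sketch only. It assumes that the 819 bp count includes the stop codon and that an average residue mass of about 110 Da is an acceptable approximation; neither assumption is stated in the text.

```python
# Consistency check of the ORF length, residue count and mass quoted in the text.
# Assumptions (not from the source): the ORF length includes the stop codon, and
# 110 Da is used as a rough average mass per amino-acid residue.

ORF_LENGTH_BP = 819               # from the text
AVERAGE_RESIDUE_MASS_DA = 110.0   # assumed approximation


def residues_from_orf(orf_length_bp: int) -> int:
    """Return the number of amino acids encoded by an ORF that ends in a stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return orf_length_bp // 3 - 1  # the final codon is the stop codon


def approx_mass_kda(n_residues: int) -> float:
    """Return an approximate protein mass in kDa from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


n_residues = residues_from_orf(ORF_LENGTH_BP)
print(n_residues, round(approx_mass_kda(n_residues), 1))  # 272 29.9
```

Under these assumptions the estimate agrees with the predicted molecular mass of 30 kDa given above.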

The sequence of EMA gives rise to several functional features of this novel protein. The functions of EMA for which experimental evidence is provided below are summarised here.

Summary of EMA functions

1. EMA can bind to the DP1 protein.

2. EMA/DP1 heterodimers bind a novel binding site with the consensus sequence ttTCCCGCCtttt (EMA-binding site).

3. The EMA protein can repress transcription. This is as a result of its N-terminal domain which has no sequence similarity to the E2F family.

4. The EMA-binding site represents a transcriptional repressor element.

5. EMA can induce cells to enter the S-phase of the cell cycle.

These characteristics of EMA indicate that EMA (I) is a repressor of transcription of genes which have EMA-binding sites, (II) has the characteristics of an oncogene, since it is sufficient to induce S-phase, and (III) is not a relative of the E2F family but has distinct characteristics that make it a member of a novel subgroup of "E2F-like" genes.

These results (presented below) provide indication that EMA is a regulator of cell proliferation. Its ability to repress transcription is likely to be a novel mechanism by which it can control the cell cycle. Controlling or reversing the function of EMA should therefore result in the suppression of the cell's ability to proliferate.
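The distinction drawn in the summary between the EMA-binding site (core TCCCGC) and the E2F site of the adenoviral E2 promoter (core TCGCGC) lends itself to a simple sequence scan. The sketch below is illustrative only: the function names and the example sequence are hypothetical, and an exact string match on the core motif is an assumption that ignores flanking bases and binding affinity.

```python
# Scan a DNA sequence for a core motif on both strands.
# Motif strings come from the text; function names and the example are hypothetical.

EMA_CORE = "TCCCGC"  # core of the EMA-binding site
E2F_CORE = "TCGCGC"  # core of the adenoviral E2 promoter E2F site

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str) -> list:
    """Return sorted (0-based start, strand) pairs for each exact motif match."""
    seq = seq.upper()
    patterns = [("+", motif.upper())]
    minus = reverse_complement(motif)
    if minus != motif.upper():  # avoid double counting palindromic motifs
        patterns.append(("-", minus))
    hits = []
    for strand, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical test sequence built around the consensus ttTCCCGCCtttt.
example = "AATTTCCCGCCTTTTGG"
print(find_sites(example, EMA_CORE))  # [(4, '+')]
print(find_sites(example, E2F_CORE))  # []
```

A position-weight approach would be a more realistic model of binding; exact matching is used here only to keep the sketch short.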

The availability of the murine sequence makes it possible to obtain corresponding proteins from other species such as humans. The murine sequence or a part thereof can be used as a probe to isolate genomic sequences from, for example, human cell sources (see for example Maniatis et al. 1982 Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press N.Y. Chapters 1-11). Alternatively, the sequence of Figure 1 could be used to design oligonucleotide primers in order to amplify cDNA of EMA from human cell sources using, for example, the polymerase chain reaction (PCR).

DNA identified in this way may be sequenced using conventional techniques and/or the amino acid sequence of the corresponding protein derived.

A human EMA encoding sequence is provided in Figure 2 with the encoded polypeptide sequence shown in Figure 3. The human sequence was isolated by standard hybridisation methods using the mouse EMA sequence as a probe. Over the first 62 amino acids, 54 of the residues of the human EMA are identical with the mouse, as discernible by comparison of the amino acid sequences of Figure 1 and Figure 3. Furthermore, the encoding nucleic acid sequences are highly homologous.
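The comparison above (54 identical residues out of the first 62) corresponds to about 87% identity. The sketch below shows how such a figure can be computed from two pre-aligned sequences of equal length. The function name and the short example sequences are hypothetical and are not taken from Figure 1 or Figure 3.

```python
# Percent identity between two pre-aligned sequences of equal length.
# The example sequences are invented for illustration; they are not EMA sequences.


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of aligned positions carrying the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # A gap character ('-') never counts as a match.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# Figure quoted in the text: 54 identical residues over 62 positions.
print(round(100.0 * 54 / 62, 1))  # 87.1

# Hypothetical toy alignment with 6 matches over 8 positions.
print(percent_identity("MKTAYIAK", "MKSAYLAK"))  # 75.0
```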

As EMA has the characteristics of an oncogene, its detection provides for a diagnostic test for proliferative disorders involving disruption of cell-cycle and/or cell growth regulation, such as cancer, in particular by detecting elevated levels of RNA, DNA or protein in tumour cells. Alternatively, if mutation in this gene provides a predisposition to cancer or other cell proliferation disorder, detection of the mutation in an individual could be indicative of a predisposition to, e.g., cancer. The detection could be carried out at the protein level, for example utilising antibodies which are specific for the EMA protein, or at the DNA or RNA level, for example by using probes which recognise the EMA gene sequence. The probes may be used as primers in an amplification process, for example using the polymerase chain reaction (PCR).

These diagnostic methods, together with antibodies, or probes or primers used therein form aspects of the invention. The antibodies or probes or primers may be formed into a diagnostic kit.

The probes or primers used may comprise a nucleotide sequence which comprises or hybridises with at least a part of the sequence of Figure 1 and/or the human equivalent of said sequence (Figure 2). Suitably the probe or primer will be of such a size that it specifically hybridises with the sequence of Figure 1 and/or Figure 2 under stringent hybridisation conditions. Such conditions are conventional in the art and include for example those defined in "Molecular Cloning, A Laboratory Manual", Sambrook, Fritsch and Maniatis, Cold Spring Harbor Laboratory Press and "Short Protocols in Molecular Biology" ed Ausubel et al., John Wiley and Sons.

In addition, the identification of the oncogenic properties of EMA gives rise to therapeutic treatments. For example, expression of EMA may be inhibited by administering an agent which inhibits expression of the EMA gene. An example of such an agent comprises an anti-sense DNA/RNA construct which is of sufficient length to hybridise to the sequence of Figure 1 and/or the human equivalent thereof (Figure 2) and prevent, wholly or partially, expression of the EMA protein.
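An anti-sense DNA oligonucleotide of the kind described above is the reverse complement of a stretch of the coding (sense) sequence. The sketch below illustrates that relationship only. The function name and the example fragment are hypothetical, the fragment is not taken from Figure 1 or Figure 2, and the 14-nucleotide minimum is borrowed from the oligonucleotide claims as an assumed lower bound rather than a statement of what length is effective.

```python
# Derive an anti-sense DNA oligonucleotide from a sense-strand fragment.
# MIN_LENGTH follows the "at least 14 nucleotides" option in the claims;
# the example fragment below is invented and is not an EMA sequence.

MIN_LENGTH = 14
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense_fragment: str) -> str:
    """Return the reverse complement (5' to 3') of a sense-strand DNA fragment."""
    seq = sense_fragment.upper()
    if len(seq) < MIN_LENGTH:
        raise ValueError(f"fragment must be at least {MIN_LENGTH} nucleotides")
    if set(seq) - set("ACGT"):
        raise ValueError("fragment must contain only A, C, G and T")
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical 15-nucleotide sense fragment.
print(antisense_oligo("ATGGCTAGCTTGACC"))  # GGTCAAGCTAGCCAT
```

Whether a given oligonucleotide hybridises specifically under stringent conditions depends on factors this sketch does not model, such as melting temperature and off-target matches.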

In addition, the effect of EMA may be countered at the protein level, for example by administration of a blocking agent such as an antibody or antibody fragment which inhibits EMA function. For instance, a dominant negative approach can be used to block the function of EMA. Using this approach, agents which prevent the dimerisation with a partner of EMA (DP) or which prevent the binding of EMA to the E2F-like site may be employed.

Reagents of this type suitably comprise a peptide or a mimetic thereof.

The designing of mimetics to a known pharmaceutically active compound is a known approach to the development of pharmaceuticals based on a "lead" compound. This might be desirable where the active compound is difficult or expensive to synthesise or where it is unsuitable for a particular method of administration, eg peptides are unsuitable active agents for oral compositions as they tend to be quickly degraded by proteases in the alimentary canal. Mimetic design, synthesis and testing is generally used to avoid randomly screening large numbers of molecules for a target property.

There are several steps commonly taken in the design of a mimetic from a compound having a given target property.

Firstly, the particular parts of the compound that are critical and/or important in determining the target property are determined. In the case of a peptide, this can be done by systematically varying the amino acid residues in the peptide, eg by substituting each residue in turn. These parts or residues constituting the active region of the compound are known as its "pharmacophore".

Once the pharmacophore has been found, its structure is modelled according to its physical properties, eg stereochemistry, bonding, size and/or charge, using data from a range of sources, eg spectroscopic techniques, X-ray diffraction data and NMR. Computational analysis, similarity mapping (which models the charge and/or volume of a pharmacophore, rather than the bonding between atoms) and other techniques can be used in this modelling process.

In a variant of this approach, the three-dimensional structure of the ligand and its binding partner are modelled. This can be especially useful where the ligand and/or binding partner change conformation on binding, allowing the model to take account of this in the design of the mimetic.

A template molecule is then selected onto which chemical groups which mimic the pharmacophore can be grafted. The template molecule and the chemical groups grafted on to it can conveniently be selected so that the mimetic is easy to synthesise, is likely to be pharmacologically acceptable, and does not degrade in vivo, while retaining the biological activity of the lead compound. The mimetic or mimetics found by this approach can then be screened to see whether they have the target property, or to what extent they exhibit it. Further optimisation or modification can then be carried out to arrive at one or more final mimetics for in vivo or clinical testing.

The agent may be administered to a patient in the form of a pharmaceutical composition, for example in which the agent is combined with a pharmaceutically acceptable carrier or excipient. Carriers may be solid or liquid such as water, saline or aqueous alcohol such as ethanol, as conventional in the art. Such compositions form a further aspect of the invention.

Various methods of administration of the therapeutic agent can be used, following known formulations and procedures. Dosages will be dependent upon a number of factors considered by medical practitioners. The administration may be systemic or targeted, the latter employing direct (eg topical) application of the therapeutic agent to the target cells or the use of targeting systems such as antibody or cell specific ligands. Targeting may be desirable for a variety of reasons; for example if the agent is unacceptably toxic, or if it would otherwise require too high a dosage, or if it would not otherwise be able to enter the target cells.

Instead of administering these agents directly, they could be produced in the target cells by expression from an encoding gene introduced into the cells, eg in a viral vector (a variant of the VDEPT technique - see below). The vector could be targeted to the specific cells to be treated, or it could contain regulatory elements which are switched on more or less selectively by the target cells.

Alternatively, the agent could be administered in a precursor form, for conversion to the active form by an activating agent produced in, or targeted to, the cells to be treated. This type of approach is sometimes known as ADEPT or VDEPT; the former involves targeting the activating agent to the cells by conjugation to a cell-specific antibody, while the latter involves producing the activating agent, eg an enzyme, in the cells by expression from encoding DNA in a viral vector (see for example, EP-A-415731 and WO 90/07936).

In accordance with a further aspect of the invention there is provided a method of identifying a predisposition to cancer or other disorder of cell proliferation which method comprises probing DNA from a subject with a probe which comprises part or all of the corresponding wild type sequence of a gene which encodes EMA as described herein.

Peptide reagents of the invention can be prepared using conventional techniques. For example, they may be prepared using chemical techniques such as solid phase techniques, after which they are cleaved from the solid phase support such as resin and purified using chromatographic techniques such as high performance liquid chromatography.

Alternatively they may be derived from natural sources, also using conventional methods. Suitably however, the peptides are obtained by expression of nucleic acid which encodes them. A nucleic acid in accordance with the invention is obtained either using chemical synthesis or obtained from natural sources and optionally subjected to amplification using for example PCR techniques. The nucleic acid is incorporated into an expression vector or plasmid using conventional techniques and a host cell transformed with said vector, or plasmid. Suitable host cells may be prokaryotic or eukaryotic. Transformed host cells are then cultured and EMA protein is recovered from the culture. Such vectors, host cells and the preparation methods form further aspects of the invention.

Various host/expression vector combinations may be employed in expressing DNA sequences encoding the peptides of this invention. Examples of useful expression vectors include segments of chromosomal, non-chromosomal and synthetic DNA sequences, such as various known derivatives of SV40 and known bacterial plasmids, in particular plasmids from E. coli, such as col E1, pCR1, pBR322, pMB9, pET-3A and their derivatives, wider host range plasmids, e.g., RP4, phage DNAs, e.g., the numerous derivatives of phage λ, e.g., NM989, and other DNA phages, e.g., M13 and filamentous single-stranded DNA phages, yeast plasmids, such as the 2μ plasmid or derivatives thereof, and vectors derived from combinations of plasmids and phage DNAs, such as plasmids which have been modified to employ phage DNA or other expression control sequences.

Vectors suitably contain expression control sequences operatively linked to the nucleic acid of the invention. Examples of such expression control sequences include promoters, enhancers, splicing signals, and polyadenylation signals, depending upon the nature of the cells and the host cell. Suitable promoters include the early and late promoters of SV40, adenovirus or cytomegalovirus immediate early promoter, the lac system, the trp system, the TAC or TRC system, T7 promoter whose expression is directed by T7 RNA polymerase, the major operator and promoter regions of phage λ, the control regions for fd coat protein, the promoter for 3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid phosphatase, e.g., Pho5, the promoters of the yeast α-mating factors, and the polyhedrin promoter of the baculovirus system. For animal cell expression, a particular example of expression control sequences would be the cytomegalovirus promoter or the adenovirus major late promoter augmented by the SV40 enhancer.

Suitable host cells include conventional eukaryotic and prokaryotic hosts, such as strains of E. coli, Pseudomonas, Bacillus, Streptomyces, Saccharomyces, Trichodermas and other fungi, animal cells, such as Chinese hamster ovary ("CHO") and mouse cells in culture, African green monkey cells, such as COS 1, COS 7, BSC 1, BSC 40, and BMT 10, insect cells in culture, human cells in culture and plant cells in culture.

The selection of a particular host cell/vector combination will be apparent to the skilled person.

According to one aspect of the present invention there is provided a nucleic acid molecule which has a nucleotide sequence encoding a polypeptide which includes an amino acid sequence shown for EMA in the figures. Mouse EMA amino acid sequence is shown in Figure 1, along with mouse encoding DNA sequence. Human EMA amino acid sequence is shown in Figure 3, the human encoding DNA sequence being shown in Figure 2.

The coding sequence may be that shown for EMA in the figures, or it may be a mutant, variant, derivative or allele of the sequence. The sequence may differ from that shown by a change which is one or more of addition, insertion, deletion and substitution of one or more nucleotides of the sequence shown. Changes to a nucleotide sequence may result in an amino acid change at the protein level, or not, as determined by the genetic code.

Thus, nucleic acid according to the present invention may include a sequence different from the sequence shown for EMA in the figures yet encode a polypeptide with the same amino acid sequence. The amino acid sequence of the complete mouse EMA polypeptide shown in the figures consists of 272 residues, while the human EMA polypeptide shown in Figure 3 consists of 281 residues.
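The point that a nucleotide change may or may not alter the encoded amino acid can be illustrated with a minimal Python sketch using the standard genetic code. The function names and the short test sequences are hypothetical illustrations and are not taken from the EMA sequences in the figures.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA coding sequence in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent(original, variant):
    """True if the two coding sequences differ but encode the same amino acid sequence."""
    return original != variant and translate(original) == translate(variant)


# Hypothetical sequences: GCT -> GCC keeps alanine, GCT -> GAT changes it to aspartate.
print(is_silent("ATGGCT", "ATGGCC"))  # silent substitution
print(is_silent("ATGGCT", "ATGGAT"))  # amino acid change
```

A variant for which `is_silent` returns true corresponds to a nucleic acid that differs from a reference coding sequence yet encodes the same polypeptide.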

On the other hand, the encoded polypeptide may comprise an amino acid sequence which differs by one or more amino acid residues from the amino acid sequence shown for EMA in the figures. Nucleic acid encoding a polypeptide which is an amino acid sequence mutant, variant, derivative or allele of the sequence shown for EMA in the figures is further provided by the present invention. Such polypeptides are discussed below. Nucleic acid encoding such a polypeptide may show in its coding sequence greater than about 60% homology with the coding sequence shown for EMA in the figures, greater than about 70% homology, greater than about 80% homology, greater than about 90% homology or greater than about 95% homology.

E2Fs 1-5 are excluded from the present invention. As noted elsewhere herein, EMA is functionally distinct from E2F family members. In particular, EMA is a transcriptional repressor.

Generally, nucleic acid according to the present invention is provided as an isolate, in isolated and/or purified form, or free or substantially free of material with which it is naturally associated, such as free or substantially free of nucleic acid flanking the gene in the human genome, except possibly one or more regulatory sequence(s) for expression. Nucleic acid may be wholly or partially synthetic and may include genomic DNA, cDNA or RNA. Where nucleic acid according to the invention includes RNA, reference to the sequence shown should be construed as reference to the RNA equivalent, with U substituted for T.
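The RNA equivalent of a DNA sequence, with U substituted for T, can be sketched in Python as follows. The function name and the example sequence are hypothetical.

```python
def rna_equivalent(dna):
    """Return the RNA equivalent of a DNA sequence: same bases, with U in place of T."""
    return dna.upper().replace("T", "U")


print(rna_equivalent("ATGGCT"))
```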

Nucleic acid sequences encoding all or part of the EMA gene and/or its regulatory elements can be readily prepared by the skilled person using the information and references contained herein and techniques known in the art (for example, see Sambrook, Fritsch and Maniatis, "Molecular Cloning, A Laboratory Manual", Cold Spring Harbor Laboratory Press, 1989, and Ausubel et al, Protocols in Molecular Biology, John Wiley and Sons, 1992). These techniques include (i) the use of the polymerase chain reaction (PCR) to amplify samples of such nucleic acid, e.g. from genomic sources, (ii) chemical synthesis, or (iii) preparing cDNA sequences. Modifications to the sequences can be made, e.g. using site directed mutagenesis, to lead to the expression of modified EMA polypeptide or to take account of codon preference in the host cells used to express the nucleic acid.

In order to obtain expression of the EMA nucleic acid sequences, the sequences can be incorporated in a vector having control sequences operably linked to the EMA nucleic acid to control its expression. The vectors may include other sequences such as promoters or enhancers to drive the expression of the inserted nucleic acid, nucleic acid sequences so that the polypeptide with EMA function is produced as a fusion and/or nucleic acid encoding secretion signals so that the polypeptide produced in the host cell is secreted from the cell. The polypeptide can then be obtained by transforming the vectors into host cells in which the vector is functional, culturing the host cells so that the polypeptide is produced and recovering the polypeptide from the host cells or the surrounding medium. Prokaryotic and eukaryotic cells are used for this purpose in the art, including strains of E. coli, yeast, and eukaryotic cells such as COS or CHO cells. The choice of host cell can be used to control the properties of the polypeptide expressed in those cells, e.g. controlling where the polypeptide is deposited in the host cells or affecting properties such as its glycosylation.

PCR techniques for the amplification of nucleic acid are described in US Patent No. 4,683,195. In general, such techniques require that sequence information from the ends of the target sequence is known to allow suitable forward and reverse oligonucleotide primers to be designed to be identical or similar to the polynucleotide sequence that is the target for the amplification. PCR comprises steps of denaturation of template nucleic acid (if double-stranded), annealing of primer to target, and polymerisation. The nucleic acid probed or used as template in the amplification reaction may be genomic DNA, cDNA or RNA. PCR can be used to amplify specific sequences from genomic DNA, specific RNA sequences and cDNA transcribed from mRNA, bacteriophage or plasmid sequences. The EMA nucleic acid sequences provided herein readily allow the skilled person to design PCR primers. References for the general use of PCR techniques include Mullis et al, Cold Spring Harbor Symp. Quant. Biol., 51:263, (1987), Ehrlich (ed), PCR Technology, Stockton Press, NY, 1989, Ehrlich et al, Science, 252:1643-1650, (1991), and "PCR Protocols: A Guide to Methods and Applications", Eds. Innis et al, Academic Press, New York, (1990).

Also included within the scope of the invention are antisense oligonucleotide sequences based on the EMA nucleic acid sequences described herein. Antisense oligonucleotides may be designed to hybridise to the complementary sequence of nucleic acid, pre-mRNA or mature mRNA, interfering with the production of polypeptide encoded by a given DNA sequence (e.g. either native polypeptide or a mutant form thereof), so that its expression is reduced or prevented altogether. In addition to the EMA coding sequence, antisense techniques can be used to target the control sequences of the EMA gene, e.g. in the 5' flanking sequence of the EMA coding sequence, whereby the antisense oligonucleotides can interfere with EMA control sequences. The construction of antisense sequences and their use is described in Peyman and Ulman, Chemical Reviews, 90:543-584, (1990), and Crooke, Ann. Rev. Pharmacol. Toxicol., 32:329-376, (1992).
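Because an antisense oligonucleotide hybridises to its target by base pairing, its sequence is the reverse complement of the targeted stretch. The following Python sketch shows this for a DNA oligonucleotide directed against a DNA or RNA target. The function name and the example target are hypothetical and are not derived from the EMA sequences.

```python
# Map each target base (DNA or RNA) to its complementary DNA base.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(target):
    """Return the DNA oligonucleotide complementary to `target`, written 5' to 3'."""
    return target.upper().translate(_COMPLEMENT)[::-1]


# Hypothetical six-base mRNA target, 5'-AUGGCU-3'.
print(antisense_oligo("AUGGCU"))
```

Reversing the complemented string keeps the result in the conventional 5' to 3' orientation, since the two strands of a duplex run antiparallel.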

For specificity to the EMA coding region, an oligonucleotide, e.g. for use in PCR or anti-sense regulation, may particularly be designed to be complementary to the N-terminal region of EMA, which is not found in members of the E2F family.

The nucleic acid sequences provided for EMA in the figures are useful for identifying nucleic acid of interest (and which may be according to the present invention) in a test sample. The present invention provides a method of obtaining nucleic acid of interest, the method including hybridisation of a probe, particularly an EMA-specific probe, having a sequence which is a fragment of the sequence shown in the figures or a complementary sequence, to target nucleic acid.

Hybridisation is generally followed by identification of successful hybridisation and isolation of nucleic acid which has hybridised to the probe, which may involve one or more steps of PCR.

Nucleic acid according to the present invention is obtainable using one or more oligonucleotide probes or primers designed to hybridise with one or more fragments of the nucleic acid sequence shown in the figures, particularly fragments of relatively rare sequence, based on codon usage or statistical analysis. A primer designed to hybridise with a fragment of the nucleic acid sequence shown in the above figures may be used in conjunction with one or more oligonucleotides designed to hybridise to a sequence in a cloning vector within which target nucleic acid has been cloned, or in so-called "RACE" (rapid amplification of cDNA ends) in which cDNAs in a library are ligated to an oligonucleotide linker and PCR is performed using a primer which hybridises with the sequence shown in the figures and a primer which hybridises to the oligonucleotide linker.

Such oligonucleotide probes or primers, as well as the full-length sequence (and mutants, alleles, variants and derivatives) are also useful in screening a test sample containing nucleic acid for the presence of alleles, mutants and variants, especially those that confer susceptibility or predisposition to proliferative disorders, including cancers, the probes hybridising with a target sequence from a sample obtained from the individual being tested. The conditions of the hybridisation can be controlled to minimise non-specific binding, and stringent to moderately stringent hybridisation conditions are preferred. The skilled person is readily able to design such probes, label them and devise suitable conditions for the hybridisation reactions, assisted by textbooks such as Sambrook et al (1989) and Ausubel et al (1992).

As well as determining the presence of polymorphisms or mutations in the EMA sequence, the probes may also be used to determine whether mRNA encoding EMA is present in a cell or tissue.

Nucleic acid isolated and/or purified from one or more cells (e.g. human) or a nucleic acid library derived from nucleic acid isolated and/or purified from cells (e.g. a cDNA library derived from mRNA isolated from the cells), may be probed under conditions for selective hybridisation and/or subjected to a specific nucleic acid amplification reaction such as the polymerase chain reaction (PCR).

In the context of cloning, it may be necessary for one or more gene fragments to be ligated to generate a full-length coding sequence. Also, where a full-length encoding nucleic acid molecule has not been obtained, a smaller molecule representing part of the full molecule may be used to obtain full-length clones. Inserts may be prepared from partial cDNA clones and used to screen cDNA libraries. The full-length clones isolated may be subcloned into expression vectors and activity assayed by transfection into suitable host cells, e.g. with a reporter plasmid.

A method may include hybridisation of one or more (e.g. two) probes or primers to target nucleic acid. Where the nucleic acid is double-stranded DNA, hybridisation will generally be preceded by denaturation to produce single-stranded DNA. The hybridisation may be as part of a PCR procedure, or as part of a probing procedure not involving PCR. An example procedure would be a combination of PCR and low stringency hybridisation. A screening procedure, chosen from the many available to those skilled in the art, is used to identify successful hybridisation events and isolate hybridised nucleic acid.

Binding of a probe to target nucleic acid (e.g. DNA) may be measured using any of a variety of techniques at the disposal of those skilled in the art. For instance, probes may be radioactively, fluorescently or enzymatically labelled. Other methods not employing labelling of probe include examination of restriction fragment length polymorphisms, amplification using PCR, RNAase cleavage and allele specific oligonucleotide probing.

Probing may employ the standard Southern blotting technique. For instance DNA may be extracted from cells and digested with different restriction enzymes. Restriction fragments may then be separated by electrophoresis on an agarose gel, before denaturation and transfer to a nitrocellulose filter. Labelled probe may be hybridised to the DNA fragments on the filter and binding determined. DNA for probing may be prepared from RNA preparations from cells.
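The restriction digestion step that precedes electrophoresis can be sketched in Python as below. The sketch assumes a linear DNA molecule and uses EcoRI, which recognises GAATTC and cuts after the first G, purely as an example enzyme; the text does not name particular enzymes, and the function name and test sequence are hypothetical.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear DNA sequence at every occurrence of `site`.

    `cut_offset` is the number of bases from the start of the site to the cut
    on the top strand (1 for EcoRI, G^AATTC). Returns fragments in 5' to 3' order.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical sequence containing two EcoRI sites.
fragments = digest("AAGAATTCTTGAATTCGG")
print([len(f) for f in fragments])
```

The list of fragment lengths corresponds to the band sizes that would be resolved on the gel; only fragments containing sequence complementary to the labelled probe would give a signal on the filter.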

Preliminary experiments may be performed by hybridising under low stringency conditions various probes to Southern blots of DNA digested with restriction enzymes. Suitable conditions would be achieved when a large number of hybridising fragments were obtained while the background hybridisation was low. Using these conditions nucleic acid libraries, e.g. cDNA libraries representative of expressed sequences, may be searched.

Those skilled in the art are well able to employ suitable conditions of the desired stringency for selective hybridisation, taking into account factors such as oligonucleotide length and base composition, temperature and so on.
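As an illustration of how oligonucleotide length and base composition bear on the choice of hybridisation temperature, the following Python sketch applies the widely used Wallace rule of thumb for short oligonucleotides, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). This rule is not part of the present description and is offered only as an assumption for the example; it is a rough estimate and ignores salt and probe concentration. The function name and the example oligonucleotide are hypothetical.

```python
def wallace_tm(oligo):
    """Estimate the melting temperature (deg C) of a short DNA oligonucleotide
    with the Wallace rule: 2 degrees per A or T, 4 degrees per G or C."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


# Hypothetical 18-mer with 10 A/T and 8 G/C bases.
print(wallace_tm("ATGCATGCATGCATGCAT"))
```

Hybridising a few degrees below the estimate favours selective binding, while lower temperatures reduce stringency and tolerate mismatches.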

On the basis of amino acid sequence information, oligonucleotide probes or primers may be designed, taking into account the degeneracy of the genetic code, and, where appropriate, codon usage of the organism from which the candidate nucleic acid is derived. An oligonucleotide for use in nucleic acid amplification may have about 10 or fewer codons (e.g. 6, 7 or 8), i.e. be about 30 or fewer nucleotides in length (e.g. 18, 21 or 24). Generally specific primers are upwards of 14 nucleotides in length, but not more than 18-20. Those skilled in the art are well versed in the design of primers for use in processes such as PCR.
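The relationship between codons and primer length, and the effect of the degeneracy of the genetic code, can be made concrete with a short Python sketch. It counts how many distinct nucleotide sequences could encode a given peptide under the standard genetic code, which is the number of sequences a fully degenerate primer would have to cover. The function names and the example peptides are hypothetical and are not taken from the EMA sequence.

```python
from collections import Counter
from math import prod

# Standard genetic code as a 64-letter string (codons in T, C, A, G order);
# counting each letter gives the number of codons per amino acid.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_AA = Counter(AMINO_ACIDS)


def primer_length(peptide):
    """Nucleotide length of a primer covering the peptide: three per codon."""
    return 3 * len(peptide)


def primer_degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())


# Two hypothetical six-residue stretches: 18 nucleotides each, very different degeneracy.
for peptide in ("MWFDEK", "MLSRLS"):
    print(peptide, primer_length(peptide), primer_degeneracy(peptide))
```

Stretches rich in methionine and tryptophan (one codon each) give low degeneracy, whereas leucine, serine and arginine (six codons each) inflate it rapidly, which is one reason for choosing the region of the amino acid sequence carefully.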

Similar length oligonucleotides may be used in anti-sense regulation, as discussed above, though longer polynucleotides may also be used.

A further aspect of the present invention provides an oligonucleotide or polynucleotide fragment of the nucleotide sequence shown for EMA in the figures or a complementary sequence, in particular for use in a method of obtaining and/or screening nucleic acid, and/or anti-sense regulation of gene expression. The sequences referred to above may be modified by addition, substitution, insertion or deletion of one or more nucleotides, but preferably without abolition of ability to hybridise selectively with nucleic acid with the sequence shown for EMA in the figures, that is wherein the degree of homology of the oligonucleotide or polynucleotide with the sequence given is sufficiently high.

It may be particularly important for the present invention to employ an oligonucleotide or polynucleotide which distinguishes EMA from E2F family members, it being remembered that a functional characteristic which distinguishes EMA polypeptides from E2Fs is the ability to repress transcription from an operably linked promoter. The N-terminal region of EMA, discussed below, does not appear in E2Fs 1-5.

In some preferred embodiments, oligonucleotides according to the present invention that are fragments of the sequences shown for EMA in the figures, or any allele associated with susceptibility to cancer or other disorder of cell proliferation, are at least about 10 nucleotides in length, more preferably at least about 15 nucleotides in length, more preferably at least about 20 nucleotides in length.

Such fragments themselves individually represent aspects of the present invention. Fragments and other oligonucleotides may be used as primers or probes as discussed but may also be generated (e.g. by PCR) in methods concerned with determining the presence in a test sample of a sequence indicative of susceptibility to cancer or other disorder of cell-cycle regulation.

Methods involving use of nucleic acid in diagnostic and/or prognostic contexts, for instance in determining susceptibility to, e.g., cancer, and other methods concerned with determining the presence of sequences indicative of, e.g., cancer susceptibility are discussed below.

Nucleic acid according to the present invention may in principle be used in methods of gene therapy, for instance in treatment of individuals with the aim of preventing or curing (wholly or partially) cancer or other disorder involving loss of proper regulation of the cell-cycle and/or cell growth.

Nucleic acid according to the present invention, such as a full-length coding sequence or oligonucleotide probe or primer, may be provided as part of a kit, e.g. in a suitable container such as a vial in which the contents are protected from the external environment. The kit may include instructions for use of the nucleic acid, e.g. in PCR and/or a method for determining the presence of nucleic acid of interest in a test sample. A kit wherein the nucleic acid is intended for use in PCR may include one or more other reagents required for the reaction, such as polymerase, nucleosides, buffer solution etc. The nucleic acid may be labelled. A kit for use in determining the presence or absence of nucleic acid of interest may include one or more articles and/or reagents for performance of the method, such as means for providing the test sample itself, e.g. a swab for removing cells from the buccal cavity or a syringe for removing a blood sample (such components generally being sterile). In a further aspect, the present invention provides an apparatus for screening for EMA nucleic acid, the apparatus comprising storage means including the EMA nucleic acid sequence as set out in the figures, or a fragment thereof, the stored sequence being used to compare the sequence of the test nucleic acid to determine the presence of mutations.
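The comparison of a stored reference sequence with a test sequence can be sketched in Python as follows. This is a minimal illustration that assumes the two sequences are of equal length and already in register, so that only substitutions are reported; detecting insertions or deletions would require an alignment step. The function name and the sequences are hypothetical and are not the EMA sequences of the figures.

```python
def find_substitutions(reference, test):
    """Compare a stored reference sequence with a test sequence of equal length.

    Returns a list of (position, reference_base, test_base) tuples,
    with positions numbered from 1.
    """
    if len(reference) != len(test):
        raise ValueError("sequences must be the same length (no alignment is performed)")
    pairs = zip(reference.upper(), test.upper())
    return [(i, r, t) for i, (r, t) in enumerate(pairs, start=1) if r != t]


# Hypothetical stored and test sequences differing at position 4.
print(find_substitutions("ATGGCT", "ATGACT"))
```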

A convenient way of producing a polypeptide according to the present invention is to express nucleic acid encoding it, by use of the nucleic acid in an expression system. The use of expression systems has reached an advanced degree of sophistication today.

Accordingly, the present invention also encompasses a method of making a polypeptide (as disclosed), the method including expression from nucleic acid encoding the polypeptide (generally nucleic acid according to the invention). This may conveniently be achieved by growing a host cell in culture, containing such a vector, under appropriate conditions which cause or allow expression of the polypeptide. Polypeptides may also be expressed in in vitro systems, such as reticulocyte lysate.

Systems for cloning and expression of a polypeptide in a variety of different host cells are well known. Suitable host cells include bacteria, eukaryotic cells such as mammalian and yeast, and baculovirus systems. Mammalian cell lines available in the art for expression of a heterologous polypeptide include Chinese hamster ovary cells, HeLa cells, baby hamster kidney cells, COS cells and many others. A common, preferred bacterial host is E. coli.

Suitable vectors can be chosen or constructed, containing appropriate regulatory sequences, including promoter sequences, terminator fragments, polyadenylation sequences, enhancer sequences, marker genes and other sequences as appropriate. Vectors may be plasmids, viral e.g. 'phage, or phagemid, as appropriate. For further details see, for example, Molecular Cloning: a Laboratory Manual: 2nd edition, Sambrook et al., 1989, Cold Spring Harbor Laboratory Press. Many known techniques and protocols for manipulation of nucleic acid, for example in preparation of nucleic acid constructs, mutagenesis, sequencing, introduction of DNA into cells and gene expression, and analysis of proteins, are described in detail in Current Protocols in Molecular Biology, Ausubel et al. eds., John Wiley & Sons, 1992.

Thus, a further aspect of the present invention provides a host cell containing nucleic acid as disclosed herein. The nucleic acid of the invention may be integrated into the genome (e.g. chromosome) of the host cell. Integration may be promoted by inclusion of sequences which promote recombination with the genome, in accordance with standard techniques. The nucleic acid may be on an extra-chromosomal vector within the cell.

A still further aspect provides a method which includes introducing the nucleic acid into a host cell. The introduction, which may (particularly for in vitro introduction) be generally referred to without limitation as "transformation", may employ any available technique. For eukaryotic cells, suitable techniques may include calcium phosphate transfection, DEAE-Dextran, electroporation, liposome-mediated transfection and transduction using retrovirus or other virus, e.g. vaccinia or, for insect cells, baculovirus. For bacterial cells, suitable techniques may include calcium chloride transformation, electroporation and transfection using bacteriophage. As an alternative, direct injection of the nucleic acid could be employed.

Marker genes such as antibiotic resistance or sensitivity genes may be used in identifying clones containing nucleic acid of interest, as is well known in the art.

The introduction may be followed by causing or allowing expression from the nucleic acid, e.g. by culturing host cells (which may include cells actually transformed although more likely the cells will be descendants of the transformed cells) under conditions for expression of the gene, so that the encoded polypeptide is produced. If the polypeptide is expressed coupled to an appropriate signal leader peptide it may be secreted from the cell into the culture medium. Following production by expression, a polypeptide may be isolated and/or purified from the host cell and/or culture medium, as the case may be, and subsequently used as desired, e.g. in the formulation of a composition which may include one or more additional components, such as a pharmaceutical composition which includes one or more pharmaceutically acceptable excipients, vehicles or carriers (e.g. see below).

Introduction of nucleic acid may take place in vivo by way of gene therapy, as discussed below.

A host cell containing nucleic acid according to the present invention, e.g. as a result of introduction of the nucleic acid into the cell or into an ancestor of the cell and/or genetic alteration of the sequence endogenous to the cell or ancestor (which introduction or alteration may take place in vivo or ex vivo), may be comprised (e.g. in the soma) within an organism which is an animal, particularly a mammal, which may be human or non-human, such as rabbit, guinea pig, rat, mouse or other rodent, cat, dog, pig, sheep, goat, cattle or horse, or which is a bird, such as a chicken. Genetically modified or transgenic animals or birds comprising such a cell are also provided as further aspects of the present invention.

This may have a therapeutic aim. The presence of a mutant, allele or variant sequence within cells of an organism, particularly when in place of a homologous endogenous sequence, may allow the organism to be used as a model in testing and/or studying the role of the EMA gene or substances which modulate activity of the encoded polypeptide in vitro, indicated to be of therapeutic potential.

Instead of or as well as being used for the production of a polypeptide encoded by a transgene, host cells may be used as a nucleic acid factory to replicate the nucleic acid of interest in order to generate large amounts of it. Multiple copies of nucleic acid of interest may be made within a cell when coupled to an amplifiable gene such as DHFR. Host cells transformed with nucleic acid of interest, or which are descended from host cells into which nucleic acid was introduced, may be cultured under suitable conditions, e.g. in a fermenter, taken from the culture and subjected to processing to purify the nucleic acid. Following purification, the nucleic acid or one or more fragments thereof may be used as desired, for instance in a diagnostic or prognostic assay as discussed elsewhere herein.

The skilled person can use the techniques described herein and others well known in the art to produce large amounts of EMA polypeptides, or fragments or active portions thereof, for use as pharmaceuticals, in the development of drugs and for further study into their properties and role in vivo.

Thus, a further aspect of the present invention provides a polypeptide which has the amino acid sequence shown for EMA in the figures, which may be in isolated and/or purified form, free or substantially free of material with which it is naturally associated, such as other polypeptides or such as human polypeptides other than EMA polypeptide or (for example if produced by expression in a prokaryotic cell) lacking in native glycosylation, e.g. unglycosylated.

Polypeptides which are amino acid sequence variants, alleles, derivatives or mutants are also provided by the present invention. A polypeptide which is a variant, allele, derivative or mutant may have an amino acid sequence which differs from that given for EMA in the figures by one or more of addition, substitution, deletion and insertion of one or more amino acids. Preferred such polypeptides have EMA function, as discussed herein. Preferred polypeptides may have immunological cross-reactivity with an antibody reactive with the EMA polypeptide for which the sequence is given in the figures, particularly an N-terminal portion such as that shown to have transcriptional repressor activity. Preferred polypeptides may share an epitope with the EMA polypeptide for which the amino acid sequence is shown in the figures (as determined for example by immunological cross-reactivity between the two polypeptides), particularly an N-terminal portion such as that shown to have transcriptional repressor activity. Thus, a polypeptide according to the invention may be immunologically distinct from E2F polypeptides.

A polypeptide which is an amino acid sequence variant, allele, derivative or mutant of the amino acid sequence shown for EMA in the figures may comprise an amino acid sequence which shares greater than about 35% sequence identity with the sequence shown, greater than about 40%, greater than about 50%, greater than about 60%, greater than about 70%, greater than about 80%, greater than about 90% or greater than about 95%. The sequence may share greater than about 60% similarity, greater than about 70% similarity, greater than about 80% similarity, greater than about 90% similarity or greater than about 95% similarity with the amino acid sequence shown for EMA in the figures. Particular amino acid sequence variants may differ from those shown for EMA in the figures by insertion, addition, substitution or deletion of 1 amino acid, 2, 3, 4, 5-10, 10-20, 20-30, 30-50, 50-100, 100-150, or more than 150 amino acids.

Homology and/or identity percentages may be considered over the full length of the protein, but may be considered over only a portion, such as a region which has transcriptional repressor activity. Thus, the above considerations of homologies may in certain embodiments be made in relation to an N-terminal portion of the EMA polypeptide for which the sequence is shown in the figures, particularly a portion of about 20 amino acids, or about 30 amino acids, or about 40 amino acids, or about 50 amino acids, or about 60 amino acids, or the 62 amino acid portion used in the experiments described below.
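Calculation of percentage identity over the full length or over only a portion can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned so that corresponding residues occupy the same positions; it does not itself perform an alignment or score conservative substitutions (similarity). The function name and the short example sequences are hypothetical and are not taken from the figures.

```python
def percent_identity(seq_a, seq_b, start=0, end=None):
    """Percentage of identical residues between two pre-aligned sequences.

    `start` and `end` (0-based, end exclusive) restrict the comparison to a
    portion, e.g. an N-terminal region; by default the full length is used.
    """
    a, b = seq_a[start:end].upper(), seq_b[start:end].upper()
    if not a or len(a) != len(b):
        raise ValueError("compared regions must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical five-residue sequences differing at the third residue.
print(percent_identity("MKLVA", "MKIVA"))         # over the full length
print(percent_identity("MKLVA", "MKIVA", end=2))  # over the first two residues only
```

The same pair of sequences can therefore give different percentages depending on the region considered, which is why the portion over which identity is assessed should be stated.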

Thus, the present invention also includes active portions, fragments, derivatives and functional mimetics of the EMA polypeptides of the invention.

An "active portion" of EMA polypeptide means a peptide which is less than said full length polypeptide, but which retains a biological activity of EMA, particularly transcriptional repressor function.

A "fragment" of the EMA polypeptide means a stretch of amino acid residues of at least about five to seven contiguous amino acids, often at least about seven to nine contiguous amino acids, typically at least about nine to 13 contiguous amino acids and, most preferably, at least about 20 to 30 or more contiguous amino acids, e.g. about 40 amino acids, about 50 amino acids, about 60 amino acids, or 62 amino acids. Fragments of the EMA polypeptide sequence may include antigenic determinants or epitopes useful for raising antibodies to a portion of the EMA amino acid sequence.
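
Purely as an illustrative sketch, contiguous fragments of a given length, and an N-terminal portion such as the 62 amino acid portion mentioned herein, can be enumerated from a sequence as follows. The function names are hypothetical and any sequence passed in is a placeholder, not the EMA sequence of the figures.

```python
def contiguous_fragments(sequence: str, length: int) -> list[str]:
    """All stretches of `length` contiguous residues, in order of start position."""
    if length <= 0:
        raise ValueError("length must be positive")
    # A sequence shorter than `length` yields an empty list.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def n_terminal_portion(sequence: str, length: int = 62) -> str:
    """The first `length` residues (the whole sequence if it is shorter)."""
    return sequence[:length]
```

A 62-residue portion contains, for example, 43 distinct start positions for fragments of 20 contiguous amino acids.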

A "derivative" of the EMA polypeptide or a fragment thereof means a polypeptide modified by varying the amino acid sequence of the protein, e.g. by manipulation of the nucleic acid encoding the protein or by altering the protein itself. Such derivatives of the natural amino acid sequence may involve insertion, addition, deletion or substitution of one or more amino acids, without fundamentally altering the essential activity of the wild type polypeptide.

"Functional mimetic" means a substance which may not contain an active portion of the EMA amino acid sequence, and probably is not a peptide at all, but which retains the
essential biological activity of natural EMA polypeptide. The design and screening of candidate mimetics is described in detail above.

A polypeptide according to the present invention may be isolated and/or purified (e.g. using an antibody) for instance after production by expression from encoding nucleic acid (for which see below). Polypeptides according to the present invention may also be generated wholly or partly by chemical synthesis. The isolated and/or purified polypeptide may be used in formulation of a composition, which may include at least one additional component, for example a pharmaceutical composition including a pharmaceutically acceptable excipient, vehicle or carrier. A composition including a polypeptide according to the invention may be used in prophylactic and/or therapeutic treatment as discussed below.

A polypeptide, peptide fragment, allele, mutant or variant according to the present invention may be used as an immunogen or otherwise in obtaining specific antibodies. Antibodies are useful in purification and other manipulation of polypeptides and peptides, diagnostic screening and therapeutic contexts. This is discussed further below.

A polypeptide according to the present invention may be used in screening for molecules which affect or modulate its activity or function. Such molecules may be useful in a therapeutic (possibly including prophylactic) context.

A further important use of the EMA polypeptides is in raising antibodies that have the property of specifically binding to the EMA polypeptides, or fragments or active portions thereof.

The production of monoclonal antibodies is well established in the art. Monoclonal antibodies can be subjected to the techniques of recombinant DNA technology to produce other antibodies or chimeric molecules which retain the specificity of the original antibody. Such techniques may involve introducing DNA encoding the immunoglobulin variable region, or the complementarity determining regions (CDRs), of an antibody to the constant regions, or constant regions plus framework regions, of a different immunoglobulin. See, for instance, EP-A-184187, GB-A-2188638 or EP-A-239400. A hybridoma producing a monoclonal antibody may be subject to genetic mutation or other changes, which may or may not alter the binding specificity of antibodies produced.

The provision of the novel EMA polypeptides enables for the first time the production of antibodies able to bind it specifically, particularly antibodies which do not bind E2F polypeptides such as those for which the sequences are shown in the figures. Accordingly, a further aspect of the present invention provides an antibody able to bind specifically to the EMA polypeptide whose sequence is given in the figures. Such an antibody may be specific in the sense of being able to distinguish between the polypeptide it is able to bind and other human polypeptides for which it has no or substantially no binding affinity (e.g. a binding affinity of about 1000x worse). Specific antibodies bind an epitope on the molecule which is either not present or is not accessible on other molecules. Antibodies according to the present invention may be specific for the wild-type polypeptide. Antibodies according to the invention may be specific for a particular mutant, variant, allele or derivative polypeptide as between that molecule and the wild-type polypeptide, so as to be useful in diagnostic and prognostic methods as discussed below. Antibodies are also useful in purifying the polypeptide or polypeptides to which they bind, e.g. following production by recombinant expression from encoding nucleic acid.
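
As a worked illustration of the "about 1000x worse" figure given above, relative binding can be expressed as the ratio of two dissociation constants. The sketch below assumes that affinity is measured as a dissociation constant (Kd) in nanomolar units, which is an assumption of this example only; the function names and any values used are hypothetical and are not measurements.

```python
def affinity_fold_difference(kd_target_nm: float, kd_other_nm: float) -> float:
    """How many times weaker binding to the other polypeptide is (ratio of Kd values)."""
    if kd_target_nm <= 0 or kd_other_nm <= 0:
        raise ValueError("dissociation constants must be positive")
    # A larger Kd means weaker binding, so the ratio grows as cross-reactivity falls.
    return kd_other_nm / kd_target_nm


def is_specific(kd_target_nm: float, kd_other_nm: float,
                min_fold: float = 1000.0) -> bool:
    """True if binding to the other polypeptide is at least `min_fold` times weaker."""
    return affinity_fold_difference(kd_target_nm, kd_other_nm) >= min_fold
```

On this measure, a hypothetical antibody binding its target at 1 nM and another polypeptide at 1000 nM would meet the 1000-fold criterion, whereas one binding the other polypeptide at 50 nM would not.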

Preferred antibodies according to the invention are isolated, in the sense of being free from contaminants such as antibodies able to bind other polypeptides and/or free of serum components. Monoclonal antibodies are preferred for some purposes, though polyclonal antibodies are within the scope of the present invention.

Antibodies may be obtained using techniques which are standard in the art. Methods of producing antibodies include immunising a mammal (e.g. mouse, rat, rabbit, horse, goat, sheep or monkey) with the protein or a fragment thereof. Antibodies may be obtained from immunised animals using any of a variety of techniques known in the art, and screened, preferably using binding of antibody to antigen of interest. For instance, Western blotting techniques or immunoprecipitation may be used (Armitage et al, Nature, 357:80-82, 1992). Isolation of antibodies and/or antibody-producing cells from an animal may be accompanied by a step of sacrificing the animal.

As an alternative or supplement to immunising a mammal with a peptide, an antibody specific for a protein may be obtained from a recombinantly produced library of expressed immunoglobulin variable domains, e.g. using lambda bacteriophage or filamentous bacteriophage which display functional immunoglobulin binding domains on their surfaces; for instance see WO92/01047. The library may be naive, that is constructed from sequences obtained from an organism which has not been immunised with any of the proteins (or fragments), or may be one constructed using sequences obtained from an organism which has been exposed to the antigen of interest.

Antibodies according to the present invention may be modified in a number of ways. Indeed the term "antibody" should be construed as covering any binding substance having a binding domain with the required specificity. Thus the invention covers antibody fragments, derivatives, functional equivalents and homologues of antibodies, including synthetic molecules and molecules whose shape mimics that of an antibody enabling it to bind an antigen or epitope.

Example antibody fragments capable of binding an antigen or other binding partner are the Fab fragment consisting of the VL, VH, CL and CH1 domains; the Fd fragment consisting of the VH and CH1 domains; the Fv fragment consisting of the VL and VH domains of a single arm of an antibody; the dAb fragment which consists of a VH domain; isolated CDR regions; and F(ab')2 fragments, a bivalent fragment including two Fab fragments linked by a disulphide bridge at the hinge region. Single chain Fv fragments are also included.

Humanised antibodies in which CDRs from a non-human source are grafted onto human framework regions, typically with the alteration of some of the framework amino acid residues, to provide antibodies which are less immunogenic than the parent non-human antibodies, are also included within the present invention.

A hybridoma producing a monoclonal antibody according to the present invention may be subject to genetic mutation or other changes. It will further be understood by those skilled in the art that a monoclonal antibody can be subjected to the techniques of recombinant DNA technology to produce other antibodies or chimeric molecules which retain the specificity of the original antibody. Such techniques may involve introducing DNA encoding the immunoglobulin variable region, or the complementarity determining regions (CDRs), of an antibody to the constant regions, or constant regions plus framework regions, of a different immunoglobulin. See, for instance, EP-A-184187, GB-A-2188638 or EP-A-0239400. Cloning and expression of chimeric antibodies are described in EP-A-0120694 and EP-A-0125023.

Hybridomas capable of producing antibody with desired binding characteristics are within the scope of the present invention, as are host cells, eukaryotic or prokaryotic, containing nucleic acid encoding antibodies (including antibody fragments) and capable of their expression. The invention also provides methods of production of the antibodies including growing a cell capable of producing the antibody under conditions in which the antibody is produced, and preferably secreted.

The reactivities of antibodies on a sample may be determined by any appropriate means. Tagging with individual reporter molecules is one possibility. The reporter molecules may directly or indirectly generate detectable, and preferably measurable, signals. The linkage of reporter molecules may be direct or indirect, and covalent (e.g. via a peptide bond) or non-covalent. Linkage via a peptide bond may be as a result of recombinant expression of a gene fusion encoding antibody and reporter molecule.

One favoured mode is by covalent linkage of each antibody with an individual fluorochrome, phosphor or laser dye with spectrally isolated absorption or emission characteristics. Suitable fluorochromes include fluorescein, rhodamine, phycoerythrin and Texas Red. Suitable chromogenic dyes include diaminobenzidine.

Other reporters include macromolecular colloidal particles or particulate material such as latex beads that are coloured, magnetic or paramagnetic, and biologically or chemically active agents that can directly or indirectly cause detectable signals to be visually observed, electronically detected or otherwise recorded. These molecules may be enzymes which catalyse reactions that develop or change colours or cause changes in electrical properties, for example. They may be molecularly excitable, such that electronic transitions between energy states result in characteristic spectral absorptions or emissions. They may include chemical entities used in conjunction with biosensors. Biotin/avidin or biotin/streptavidin and alkaline phosphatase detection systems may be employed.

The mode of determining binding is not a feature of the present invention and those skilled in the art are able to choose a suitable mode according to their preference and general knowledge.

Antibodies according to the present invention may be used in screening for the presence of a polypeptide, for example in a test sample containing cells or cell lysate as discussed, and may be used in purifying and/or isolating a polypeptide according to the present invention, for instance following production of the polypeptide by expression from encoding nucleic acid therefor. Antibodies may modulate the activity of the polypeptide to which they bind and so, if that polypeptide has a deleterious effect in an individual, may be useful in a therapeutic context (which may include prophylaxis).

An antibody may be provided in a kit, which may include instructions for use of the antibody, e.g. in determining the presence of a particular substance in a test sample. One or more other reagents may be included, such as labelling molecules, buffer solutions, elutants and so on. Reagents may be provided within containers which protect them from the external environment, such as a sealed vial.

A number of methods are known in the art for analysing biological samples from individuals to determine whether the individual carries an EMA allele predisposing them to disease. Such analysis may be used for diagnosis or prognosis, and may serve to detect the presence of, e.g., an existing cancer, to help identify the type of cancer, to assist a physician in determining the severity or likely course of the cancer and/or to optimise treatment of it. Alternatively, the methods can be used to detect alleles that are statistically associated with a susceptibility to cancer or other proliferative disorder in the future, e.g. early onset cancer, identifying individuals who would benefit from regular screening to provide early diagnosis of cancer.

Broadly, the methods divide into those screening for the presence of EMA nucleic acid sequences and those that rely on detecting the presence or absence of EMA polypeptide. The methods make use of biological samples from individuals that are suspected of containing the nucleic acid sequences or polypeptide. Examples of biological samples include blood, plasma, serum, tissue samples, tumour samples, saliva and urine.

Exemplary approaches for detecting EMA nucleic acid or polypeptides include:

(a) comparing the sequence of nucleic acid in the sample with an EMA nucleic acid sequence to determine whether the sample from the patient contains mutations; or,

(b) determining the presence in a sample from a patient of the polypeptide encoded by an EMA gene and, if present, determining whether the polypeptide is full length, and/or is mutated, and/or is expressed at the normal level; or,

(c) using DNA fingerprinting to compare the restriction pattern produced when a restriction enzyme cuts a sample of nucleic acid from the patient with the restriction pattern obtained from normal EMA gene or from known mutations thereof; or,

(d) using a specific binding member capable of binding to an EMA nucleic acid sequence (either a normal sequence or a known mutated sequence), the specific binding member comprising nucleic acid hybridisable with the EMA sequence, or substances comprising an antibody domain with specificity for a native or mutated EMA nucleic acid sequence or the polypeptide encoded by it, the specific binding member being labelled so that binding of the specific binding member to its binding partner is detectable; or,

(e) using PCR involving one or more primers based on normal or mutated EMA gene sequence to screen for normal or mutant EMA gene in a sample from a patient.

A "specific binding pair" comprises a specific binding member (sbm) and a binding partner (bp) which have a particular specificity for each other and which in normal conditions bind to each other in preference to other molecules. Examples of specific binding pairs are antigens and antibodies, molecules and receptors and complementary nucleotide sequences. The skilled person will be able to think of many other examples and they do not need to be listed here. Further, the term "specific binding pair" is also applicable where either or both of the specific binding member and the binding partner comprise a part of a larger molecule. In embodiments in which the specific binding pair are nucleic acid sequences, they will be of a length to hybridise to each other under the conditions of the assay, preferably greater than 10 nucleotides long, more preferably greater than 15 or 20 nucleotides long.
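
For illustration, the notion of complementary nucleotide sequences forming a specific binding pair can be sketched in a few lines. The example assumes DNA written 5' to 3' using the letters A, C, G and T only, and treats only perfectly complementary sequences as a pair; the function names and the default minimum length of 11 (that is, greater than 10 nucleotides) are choices made for this sketch.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def can_form_specific_pair(probe: str, target_region: str,
                           min_length: int = 11) -> bool:
    """True if the probe is the exact reverse complement of the target region
    and is at least `min_length` nucleotides long."""
    is_long_enough = len(probe) >= min_length
    is_complementary = probe.upper() == reverse_complement(target_region).upper()
    return is_long_enough and is_complementary
```

Raising `min_length` to 16 or 21 corresponds to the more preferred lengths of greater than 15 or 20 nucleotides.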

In most embodiments for screening for susceptibility alleles, the EMA nucleic acid in the sample will initially be amplified, e.g. using PCR, to increase the amount of the analyte as compared to other sequences present in the sample. This allows the target sequences to be detected with a high degree of sensitivity if they are present in the sample. This initial step may be avoided by using highly sensitive array techniques that are becoming increasingly important in the art.

The identification of the EMA gene and its implication in disorders of cell proliferation paves the way for aspects of the present invention to provide the use of materials and methods, such as are disclosed and discussed above, for establishing the presence or absence in a test sample of a variant form of the gene, in particular an allele or variant specifically associated with cancer. This may be for diagnosing a predisposition of an individual to cancer. It may be for diagnosing cancer of a patient with the disease as being associated with the gene.

This allows for planning of appropriate therapeutic and/or prophylactic treatment, permitting stream-lining of treatment by targeting those most likely to benefit.

A variant form of the gene may contain one or more insertions, deletions, substitutions and/or additions of one or more nucleotides compared with the wild-type sequence which may or may not disrupt the gene function. Differences at the nucleic acid level are not necessarily reflected by a difference in the amino acid sequence of the encoded polypeptide. However, a mutation or other difference in a gene may result in a frame-shift or stop codon, which could seriously affect the nature of the polypeptide produced, or a point mutation or gross mutational change to the encoded polypeptide, including insertion, deletion, substitution and/or addition of one or more amino acids or regions in the polypeptide. A mutation in a promoter sequence or other regulatory region may prevent or reduce expression from the gene or affect the processing or stability of the mRNA transcript.
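
The distinction drawn above between silent differences, amino acid changes, stop codons and frame-shifts can be illustrated with a short sketch. It assumes the standard genetic code and a coding sequence that starts in frame at its first nucleotide; the function names, the category labels and any sequences used with it are hypothetical and are not taken from the EMA sequence of the figures.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate a coding sequence up to, but not including, the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify_variant(reference_cds: str, variant_cds: str) -> str:
    """Coarse label for the effect of a variant on the encoded polypeptide."""
    length_change = len(variant_cds) - len(reference_cds)
    if length_change % 3 != 0:
        return "frame-shift"
    if length_change != 0:
        return "in-frame insertion or deletion"
    reference_protein = translate(reference_cds)
    variant_protein = translate(variant_cds)
    if variant_protein == reference_protein:
        return "silent"
    if len(variant_protein) < len(reference_protein):
        return "premature stop codon"
    return "amino acid change"
```

The sketch deliberately ignores promoter and other regulatory changes, which cannot be assessed from the coding sequence alone.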

There are various methods for determining the presence or absence in a test sample of a particular nucleic acid sequence, such as the sequence shown for EMA in the figures or a mutant, variant or allele thereof.

Tests may be carried out on preparations containing genomic DNA, cDNA and/or mRNA. Testing cDNA or mRNA has the advantage of the complexity of the nucleic acid being reduced by the absence of intron sequences, but the possible disadvantage of extra time and effort being required in making the preparations. RNA is more difficult to manipulate than DNA because of the widespread occurrence of RNases.

Nucleic acid in a test sample may be sequenced and the sequence compared with the sequence shown in the figures, to determine whether or not a difference is present. If so, the difference can be compared with known susceptibility alleles to determine whether the test nucleic acid contains one or more of the variations indicated, or the difference can be investigated for association with cancer.
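
A minimal sketch of such a comparison follows. It assumes that the test and reference sequences are of equal length and already in register, so that only substitutions are reported, and that known susceptibility alleles are supplied as a set of (position, reference base, variant base) entries; that data structure, the function names and any entries used are hypothetical.

```python
def substitutions(reference: str, test: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference base, test base) for every difference."""
    if len(reference) != len(test):
        raise ValueError("this sketch handles equal-length sequences only")
    return [(position, ref_base, test_base)
            for position, (ref_base, test_base)
            in enumerate(zip(reference.upper(), test.upper()), start=1)
            if ref_base != test_base]


def match_known_alleles(differences, known_alleles):
    """Split differences into those already known and those needing investigation."""
    known = [d for d in differences if d in known_alleles]
    novel = [d for d in differences if d not in known_alleles]
    return known, novel
```

Differences falling in the second list would be candidates for investigation of an association with cancer, as described above.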

Since it will not generally be time- or labour-efficient to sequence all nucleic acid in a test sample or even the whole EMA gene, a specific amplification reaction such as PCR using one or more pairs of primers may be employed to amplify the region of interest in the nucleic acid, for instance the EMA gene or a particular region in which mutations associated with a proliferative disorder susceptibility occur. The amplified nucleic acid may then be sequenced as above, and/or tested in any other way to determine the presence or absence of a particular feature. Nucleic acid for testing may be prepared from nucleic acid removed from cells or in a library using a variety of other techniques such as restriction enzyme digest and electrophoresis.
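
By way of example only, amplification of a region of interest by a primer pair can be modelled in silico as follows. The sketch assumes a single-stranded template written 5' to 3', exact primer matches, and a reverse primer given in the usual 5' to 3' orientation of the opposite strand; the function names and any primer or template sequences used with it are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def amplify(template: str, forward_primer: str, reverse_primer: str):
    """Return the region bounded by the two primers, or None if either fails to bind."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    # The reverse primer anneals where its reverse complement occurs on the template.
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]
```

The returned amplicon could then be passed to a sequence comparison or other test for the presence or absence of a particular feature.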

Nucleic acid may be screened using a variant- or allele-specific probe. Such a probe corresponds in sequence to a region of the EMA gene, or its complement, containing a sequence alteration known to be associated with susceptibility to cancer or other proliferative disorder. Under suitably stringent conditions, specific hybridisation of such a probe to test nucleic acid is indicative of the presence of the sequence alteration in the test nucleic acid. For efficient screening purposes, more than one probe may be used on the same test sample.
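
As a simplified illustration, hybridisation under suitably stringent conditions can be modelled as an exact match of the probe, or of its reverse complement, within the test sequence. This is an idealisation adopted for the sketch only; real hybridisation depends on temperature, salt concentration and probe length. The function names, probe names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def probe_hybridises(probe: str, test_sequence: str) -> bool:
    """True if the probe matches either strand of the test sequence exactly."""
    probe = probe.upper()
    test_sequence = test_sequence.upper()
    return probe in test_sequence or reverse_complement(probe) in test_sequence


def screen_with_probes(probes: dict[str, str], test_sequence: str) -> dict[str, bool]:
    """Apply several named probes to the same test sample."""
    return {name: probe_hybridises(sequence, test_sequence)
            for name, sequence in probes.items()}
```

Using more than one probe on the same sample, as in `screen_with_probes`, reflects the battery-of-probes approach mentioned above.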

Allele- or variant-specific oligonucleotides may similarly be used in PCR to specifically amplify particular sequences if present in a test sample. Assessment of whether a PCR band contains a gene variant may be carried out in a number of ways familiar to those skilled in the art. The PCR product may for instance be treated in a way that enables one to display the mutation or polymorphism on a denaturing polyacrylamide DNA sequencing gel, with specific bands that are linked to the gene variants being selected.

An alternative or supplement to looking for the presence of variant sequences in a test sample is to look for the presence of the normal sequence, e.g. using a suitably specific oligonucleotide probe or primer.

Use of oligonucleotide probes and primers has been discussed in more detail above.

Approaches which rely on hybridisation between a probe and test nucleic acid and subsequent detection of a mismatch may be employed. Under appropriate conditions (temperature, pH etc.), an oligonucleotide probe will hybridise with a sequence which is not entirely complementary. The degree of base-pairing between the two molecules will be sufficient for them to anneal despite a mis-match. Various approaches are well known in the art for detecting the presence of a mis-match between two annealing nucleic acid molecules.

For instance, RNase A cleaves at the site of a mis-match. Cleavage can be detected by electrophoresing test nucleic acid to which the relevant probe or probes have annealed and looking for smaller molecules (i.e. molecules with higher electrophoretic mobility) than the full length probe/test hybrid. Other approaches rely on the use of enzymes such as resolvases or endonucleases.
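
The reasoning behind detection of smaller molecules can be illustrated with an idealised model in which the probe/test hybrid is cut at every mis-matched position. For simplicity the sketch compares the probe with the test region written in the same sense and of equal length, and it cuts immediately before each mis-matched base, so a mis-match at the very first position yields no shorter fragment. These simplifications and the function names are assumptions of the example.

```python
def cleavage_fragment_lengths(probe: str, test_region: str) -> list[int]:
    """Lengths of the pieces obtained when the hybrid is cut at each mis-match."""
    if len(probe) != len(test_region):
        raise ValueError("probe and test region must be the same length")
    cut_sites = [i for i, (p, t) in enumerate(zip(probe.upper(), test_region.upper()))
                 if p != t]
    boundaries = [0] + cut_sites + [len(probe)]
    # Drop zero-length pieces that arise when a cut coincides with an end.
    return [end - start for start, end in zip(boundaries, boundaries[1:])
            if end > start]


def mismatch_detected(probe: str, test_region: str) -> bool:
    """True if any fragment is smaller than the full-length hybrid."""
    return any(length < len(probe)
               for length in cleavage_fragment_lengths(probe, test_region))
```

A single full-length fragment thus corresponds to no detectable mis-match, whereas two or more shorter fragments would run faster on a gel than the intact hybrid.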

Thus, an oligonucleotide probe that has the sequence of a region of the normal EMA gene (either sense or anti-sense strand) in which mutations associated with, e.g., cancer susceptibility are known to occur may be annealed to test nucleic acid and the presence or absence of a mis-match determined. Detection of the presence of a mis-match may indicate the presence in the test nucleic acid of a mutation associated with, e.g., cancer susceptibility. On the other hand, an oligonucleotide probe that has the sequence of a region of the EMA gene including a mutation associated with, e.g., cancer susceptibility may be annealed to test nucleic acid and the presence or absence of a mis-match determined.

The absence of a mis-match may indicate that the nucleic acid in the test sample has the normal sequence. In either case, a battery of probes to different regions of the gene may be employed.

The presence of differences in sequence of nucleic acid molecules may be detected by means of restriction enzyme digestion, such as in a method of DNA fingerprinting where the restriction pattern produced when one or more restriction enzymes are used to cut a sample of nucleic acid is compared with the pattern obtained when a sample containing the normal gene or a variant or allele is digested with the same enzyme or enzymes.
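
For illustration, comparison of restriction patterns can be sketched as a comparison of fragment lengths after digestion of a linear molecule. The example uses the EcoRI recognition site GAATTC, cut after the first G, purely as a familiar default; the choice of enzyme, the function names and any sequences used are assumptions of this sketch and are not taken from the EMA gene.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Fragment lengths (largest first) after cutting a linear sequence at every site."""
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        # The enzyme cuts `cut_offset` bases into the recognition site.
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return sorted((end - start for start, end in zip(boundaries, boundaries[1:])),
                  reverse=True)


def patterns_differ(test_sequence: str, reference_sequence: str) -> bool:
    """True if the two samples give different fragment-length patterns."""
    return digest(test_sequence) != digest(reference_sequence)
```

A sequence change that destroys or creates a recognition site alters the list of fragment lengths, which is the basis of the comparison described above.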

The presence or absence of a lesion in a promoter or other regulatory sequence may also be assessed by determining the level of mRNA production by transcription or the level of polypeptide production by translation from the mRNA.

A test sample of nucleic acid may be provided for example by extracting nucleic acid from cells, e.g. in saliva or preferably blood, or for pre-natal testing from the amnion, placenta or foetus itself.

There are various methods for determining the presence or absence in a test sample of a particular polypeptide, such as the polypeptide with the EMA amino acid sequence shown in the figures or an amino acid sequence mutant, variant or allele thereof.

A sample may be tested for the presence of a binding partner for a specific binding member such as an antibody (or mixture of antibodies) , specific for one or more particular variants of the EMA polypeptide shown in the figures, or a mutant, variant or allele thereof.

A sample may be tested for the presence of a binding partner for a specific binding member such as an antibody (or mixture of antibodies) , specific for the EMA polypeptide shown in the figures.

In such cases, the sample may be tested by being contacted with a specific binding member such as an antibody under appropriate conditions for specific binding, before binding is determined, for instance using a reporter system as discussed. Where a panel of antibodies is used, different reporting labels may be employed for each antibody so that binding of each can be determined.

A specific binding member such as an antibody may be used to isolate and/or purify its binding partner polypeptide from a test sample, to allow for sequence and/or biochemical analysis of the polypeptide to determine whether it has the sequence and/or properties of the EMA polypeptide whose sequence is shown in the figures, or if it is a mutant or variant form. Amino acid sequencing is routine in the art using automated sequencing machines.

The EMA polypeptides, antibodies, peptides and nucleic acid of the invention can be formulated in pharmaceutical compositions. These compositions may comprise, in addition to one of the above substances, a pharmaceutically acceptable excipient, carrier, buffer, stabiliser or other materials well known to those skilled in the art. Such materials should be non-toxic and should not interfere with the efficacy of the active ingredient. The precise nature of the carrier or other material may depend on the route of administration, e.g. oral, intravenous, cutaneous or subcutaneous, nasal, intramuscular, intraperitoneal routes.

Pharmaceutical compositions for oral administration may be in tablet, capsule, powder or liquid form. A tablet may include a solid carrier such as gelatin or an adjuvant. Liquid pharmaceutical compositions generally include a liquid carrier such as water, petroleum, animal or vegetable oils, mineral oil or synthetic oil. Physiological saline solution, dextrose or other saccharide solution or glycols such as ethylene glycol, propylene glycol or polyethylene glycol may be included.

For intravenous, cutaneous or subcutaneous injection, or injection at the site of affliction, the active ingredient will be in the form of a parenterally acceptable aqueous solution which is pyrogen-free and has suitable pH, isotonicity and stability. Those of relevant skill in the art are well able to prepare suitable solutions using, for example, isotonic vehicles such as Sodium Chloride Injection, Ringer's Injection, Lactated Ringer's Injection. Preservatives, stabilisers, buffers, antioxidants and/or other additives may be included, as required.

Whether it is a polypeptide, antibody, peptide, nucleic acid molecule, small molecule, mimetic or other pharmaceutically useful compound according to the present invention that is to be given to an individual, administration is preferably in a "prophylactically effective amount" or a "therapeutically effective amount" (as the case may be, although prophylaxis may be considered therapy), this being sufficient to show benefit to the individual. The actual amount administered, and rate and time-course of administration, will depend on the nature and severity of what is being treated. Prescription of treatment, e.g. decisions on dosage etc, is within the responsibility of general practitioners and other medical doctors, and typically takes account of the disorder to be treated, the condition of the individual patient, the site of delivery, the method of administration and other factors known to practitioners. Examples of the techniques and protocols mentioned above can be found in Remington's Pharmaceutical Sciences, 16th edition, Osol, A. (ed), 1980.

Alternatively, targeting therapies may be used to deliver the active agent more specifically to certain types of cell, by the use of targeting systems such as antibody or cell specific ligands. Targeting may be desirable for a variety of reasons; for example if the agent is unacceptably toxic, or if it would otherwise require too high a dosage, or if it would not otherwise be able to enter the target cells.

Instead of administering these agents directly, they could be produced in the target cells by expression from an encoding gene introduced into the cells, e.g. in a viral vector (a variant of the VDEPT technique, see below). The vector could be targeted to the specific cells to be treated, or it could contain regulatory elements which are switched on more or less selectively by the target cells.

Alternatively, the agent could be administered in a precursor form, for conversion to the active form by an activating agent produced in, or targeted to, the cells to be treated. This type of approach is sometimes known as ADEPT or VDEPT; the former involves targeting the activating agent to the cells by conjugation to a cell-specific antibody, while the latter involves producing the activating agent, e.g. an enzyme, by expression from encoding DNA in a viral vector (see for example, EP-A-415731 and WO 90/07936).

A composition may be administered alone or in combination with other treatments, either simultaneously or sequentially dependent upon the condition to be treated.

As a further alternative, the nucleic acid encoding an authentic biologically active EMA polypeptide could be used in a method of gene therapy, to treat a patient who is unable to synthesize the active polypeptide or unable to synthesize it at the normal level, thereby providing the effect provided by the wild-type polypeptide and suppressing the occurrence of cancer and/or reducing the size or extent of existing defects in cell-cycle and/or growth regulation in the target cells.

Vectors such as viral vectors have been used in the prior art to introduce genes into a wide variety of different target cells. Typically the vectors are exposed to the target cells so that transfection can take place in a sufficient proportion of the cells to provide a useful therapeutic or prophylactic effect from the expression of the desired polypeptide. The transfected nucleic acid may be permanently incorporated into the genome of each of the targeted tumour cells, providing long lasting effect, or alternatively the treatment may have to be repeated periodically.

A variety of vectors, both viral vectors and plasmid vectors, are known in the art, see US Patent No. 5,252,479 and WO 93/07282. In particular, a number of viruses have been used as gene transfer vectors, including papovaviruses, such as SV40, vaccinia virus, herpesviruses, including HSV and EBV, and retroviruses. Many gene therapy protocols in the prior art have used disabled murine retroviruses.

As an alternative to the use of viral vectors, other known methods of introducing nucleic acid into cells include electroporation, calcium phosphate co-precipitation, mechanical techniques such as microinjection, transfer mediated by liposomes, direct DNA uptake and receptor-mediated DNA transfer.

As mentioned above, the aim of gene therapy using nucleic acid encoding an EMA polypeptide, or an active portion thereof, is to increase the amount of the expression product of the nucleic acid in cells in which the wild-type polypeptide is absent or present only at reduced levels. Such treatment may be therapeutic in the treatment of cells which are already cancerous or pre-cancerous or prophylactic in the treatment of individuals known through screening to have an EMA susceptibility allele and hence a predisposition to cancer.

Receptor-mediated gene transfer, in which the nucleic acid is linked to a protein ligand via polylysine, with the ligand being specific for a receptor present on the surface of the target cells, is an example of a technique for specifically targeting nucleic acid to particular cells.

Antisense technology based on EMA nucleic acid sequences is discussed above.

A polypeptide according to the present invention may be used in screening for molecules which affect or modulate its activity or function. Such molecules may be useful in a therapeutic (possibly including prophylactic) context.

It is well known that pharmaceutical research leading to the identification of a new drug may involve the screening of very large numbers of candidate substances, both before and even after a lead compound has been found. This is one factor which makes pharmaceutical research very expensive and time-consuming. Means for assisting in the screening process can have considerable commercial importance and utility. Such means for screening for substances potentially useful in treating or preventing cancer or another disorder of cell proliferation are provided by polypeptides according to the present invention. Substances identified as modulators of the polypeptide represent an advance in the fight against cancer since they provide a basis for the design and investigation of therapeutics for in vivo use.

A method of screening for a substance which modulates activity of a polypeptide may include contacting one or more test substances with the polypeptide in a suitable reaction medium, testing the activity of the treated polypeptide and comparing that activity with the activity of the polypeptide in comparable reaction medium untreated with the test substance or substances. A difference in activity between the treated and untreated polypeptides is indicative of a modulating effect of the relevant test substance or substances.

Combinatorial library technology provides an efficient way of testing a potentially vast number of different substances for ability to modulate activity of a polypeptide. Such libraries and their use are known in the art. The use of peptide libraries is preferred.

Prior to or as well as being screened for modulation of activity, test substances may be screened for ability to interact with the polypeptide, e.g. in a yeast two-hybrid system (which requires that both the polypeptide and the test substance can be expressed in yeast from encoding nucleic acid). This may be used as a coarse screen prior to testing a substance for actual ability to modulate activity of the polypeptide. Alternatively, the screen could be used to screen test substances for binding to a specific binding partner, to find mimetics of the polypeptide, e.g. for testing as therapeutics.

Following identification of a substance which modulates or affects polypeptide activity, the substance may be investigated further. Furthermore, it may be manufactured and/or used in preparation, i.e. manufacture or formulation, of a composition such as a medicament, pharmaceutical composition or drug. These may be administered to individuals.

Thus, the present invention extends in various aspects not only to a substance identified using a nucleic acid molecule as a modulator of polypeptide activity, in accordance with what is disclosed herein, but also a pharmaceutical composition, medicament, drug or other composition comprising such a substance, a method comprising administration of such a composition to a patient, e.g. for treatment (which may include preventative treatment) of, e.g., cancer, use of such a substance in manufacture of a composition for administration, e.g. for treatment of a disorder of cell proliferation, and a method of making a pharmaceutical composition comprising admixing such a substance with a pharmaceutically acceptable excipient, vehicle or carrier, and optionally other ingredients.

A substance identified as a modulator of polypeptide function may be peptide or non-peptide in nature. Non-peptide "small molecules" are often preferred for many in vivo pharmaceutical uses. Accordingly, a mimetic or mimic of the substance (particularly if a peptide) may be designed for pharmaceutical use.

The designing of mimetics to a known pharmaceutically active compound is discussed above.

Nucleic acid constructs comprising a site recognised by EMA, at which EMA on binding represses transcription from an operably-linked promoter, may be used to assess the effect a test substance has on EMA function, by determination of promoter activity.

"Promoter activity" is used to refer to ability to initiate transcription. The level of promoter activity is quantifiable for instance by assessment of the amount of mRNA produced by transcription from the promoter or by assessment of the amount of protein product produced by translation of mRNA produced by transcription from the promoter. The amount of a specific mRNA present in an expression system may be determined for example using specific oligonucleotides which are able to hybridise with the mRNA and which are labelled or may be used in a specific amplification reaction such as the polymerase chain reaction. Use of a reporter gene facilitates determination of promoter activity by reference to protein production.

In such a construct, the promoter is operably linked to a gene, e.g. a coding sequence. Generally, the gene may be transcribed into mRNA which may be translated into a peptide or polypeptide product which may be detected and preferably quantitated following expression. A gene whose encoded product may be assayed following expression is termed a "reporter gene", i.e. a gene which "reports" on promoter activity.

The reporter gene preferably encodes an enzyme which catalyses a reaction which produces a detectable signal, preferably a visually detectable signal, such as a coloured product. Many examples are known, including β-galactosidase and luciferase. β-galactosidase activity may be assayed by production of blue colour on substrate, the assay being by eye or by use of a spectrophotometer to measure absorbance. Fluorescence, for example that produced as a result of luciferase activity, may be quantitated using a spectrophotometer. Radioactive assays may be used, for instance using chloramphenicol acetyltransferase, which may also be used in non-radioactive assays. The presence and/or amount of gene product resulting from expression from the reporter gene may be determined using a molecule able to bind the product, such as an antibody or fragment thereof. The binding molecule may be labelled directly or indirectly using any standard technique.

Those skilled in the art are well aware of a multitude of possible reporter genes and assay techniques which may be used to determine gene activity. Any suitable reporter/assay may be used and it should be appreciated that no particular choice is essential to or a limitation of the present invention.

Thus, nucleic acid constructs comprising a promoter and a reporter gene may be employed in screening for a substance able to modulate the repressor activity of EMA on the promoter. For therapeutic purposes, e.g. for treatment of cancer, a substance able to up-regulate expression of the promoter, i.e. antagonise the repressor function of EMA, may be sought. A method of screening for ability of a substance to modulate activity of EMA may comprise contacting an expression system, such as a host cell, containing a nucleic acid construct as discussed with a test or candidate substance and determining expression of the reporter gene.

The level of expression in the presence of the test substance may be compared with the level of expression in the absence of the test substance. A difference in expression in the presence of the test substance may indicate ability of the substance to modulate EMA function.
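By way of illustration only, the comparison of reporter expression in the presence and absence of a test substance may be sketched as a fold-change calculation. The function names, the replicate readings and the 2-fold cut-off below are hypothetical and are not taken from the experiments described herein.

```python
from statistics import mean


def fold_change(treated, untreated):
    """Ratio of mean reporter signal with test substance to mean signal without it."""
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated reporter signal must be positive")
    return mean(treated) / baseline


def classify(treated, untreated, threshold=2.0):
    """Label a test substance by its effect on reporter expression.

    threshold is an arbitrary illustrative cut-off, not a value from the source.
    """
    fc = fold_change(treated, untreated)
    if fc >= threshold:
        return "up-regulates reporter (candidate antagonist of EMA repression)"
    if fc <= 1.0 / threshold:
        return "down-regulates reporter"
    return "no clear effect"


# Hypothetical replicate readings (arbitrary units) with and without a test substance.
untreated_wells = [100.0, 110.0, 90.0]
treated_wells = [310.0, 290.0, 300.0]
print(fold_change(treated_wells, untreated_wells))  # 3.0
print(classify(treated_wells, untreated_wells))
```

In practice any suitable statistical treatment of replicates may be substituted for the simple ratio of means used in this sketch.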

A promoter construct may be introduced into a cell line using any technique previously described to produce a stable cell line containing the reporter construct integrated into the genome. The cells may be grown and incubated with test compounds for varying times. The cells may be grown in 96-well plates to facilitate the analysis of large numbers of compounds. The cells may then be washed and the reporter gene expression analysed. For some reporters, such as luciferase, the cells will be lysed and then analysed.

Following identification of a substance which modulates or affects promoter activity, the substance may be investigated further. Furthermore, it may be manufactured and/or used in preparation, i.e. manufacture or formulation, of a composition such as a medicament, pharmaceutical composition or drug. These may be administered to individuals.

Example 1

The Two Hybrid Library Screen Method used to identify the EMA Gene (E2F Modulating Activity)

The following method was used to isolate the E2F Modulating Activity gene (EMA). The yeast two hybrid system is based on a protein interaction assay in yeast (Fields and Song, 1989, Nature 340, 245-246). The following protocol contains several modifications of the original Fields strategy and facilitates large scale library screens. It has been designed and optimised, and was first used by Ann Vojtek to isolate c-raf and a-raf clones (Vojtek et al., 1993, Cell 74, 205-214). The method described below is essentially identical to the one outlined in Vojtek et al. It uses the same set of vectors (the different bait constructs are described below), yeast strains and, in particular, the same cDNA library.

The Two Hybrid System is based on an in vivo yeast protein interaction assay. In general, yeast are transformed with a reporter gene construct which expresses a selective marker protein. The promoter of that gene has been designed such that it contains binding sites for the LexA DNA-binding protein. Gene expression from that plasmid is usually very low. Two more expression vectors are transformed into the yeast containing the selectable marker expression plasmid. The first of those two vectors is based on pBTM116. It contains the coding sequence for the full length LexA gene linked to a multiple cloning site. This multiple cloning site is used to clone a gene of interest in frame on to the LexA coding region. The second yeast expression vector contains the activation domain of the herpes simplex transactivator VP16 fused to random sequences of a cDNA library. Those two plasmids should facilitate expression from the reporter construct containing the selectable marker only when the LexA fusion construct (bait) interacts with a peptide sequence derived from the cDNA library.

The Two Hybrid Plasmids

Amino acids 146-410 of a human DP-1 clone were amplified by PCR and cloned as an EcoRI-BamHI fragment in frame with the LexA gene in pBTM116. The resulting plasmid was called pBTMDP1(146-410). In a small scale yeast transformation the LexA DP-1 fusion protein induced a beta-galactosidase activity considerably below the level found to be induced by pBTM116. The BTM vector system contains the TRP1 gene which allows selection of transformed yeast on tryptophan negative plates.

The pVP16 cDNA library vector carries the LEU2 gene which allows selection on leucine negative plates. The mouse embryo cDNA library cloned next to the activation domain of VP16 was generated by random primed synthesis of day 9.5/10.5 CD1 mouse embryo poly A+ RNA. The vast majority of inserts had a length of 400 - 600 nucleotides.

Two reporter constructs are in use and both are provided by the yeast strain L40. The first construct has a selectable marker, the LYS2 gene, which allows growth on Lysine negative plates. It contains the coding region for the histidine gene under the control of a promoter containing four binding sites for the LexA operator. The second reporter gene has a URA3 gene as selectable marker which allows growth on uracil negative plates. It contains the coding region for the lacZ gene controlled by a promoter containing eight binding sites for the lexA protein.

Yeast Transformation

Small Scale Transformation

10 ml of YPAD were inoculated with a colony of L40 and incubated overnight at 30°C. Thereafter, the culture was diluted to an OD600 of around 0.4 in 50 ml YPAD and grown for an additional 2-4 hours. Cells were then pelleted at 2500 rpm at room temperature and resuspended in 40 ml TE. Yeast were then repelleted at 2500 rpm and resuspended in 2 ml of 100 mM LiAc in 0.5 x TE. This yeast suspension was incubated at room temperature for 10 minutes. 1 μg of plasmid DNA together with 100 μg of sonicated sheared salmon sperm DNA was mixed with 100 μl of the yeast suspension. After a further addition of 700 μl of a buffer containing 100 mM LiAc and 40% PEG-3350 in 1 x TE, the solution was mixed well and incubated at 30°C for 30 minutes. To stop the transformation process 88 μl DMSO was added and the mixture heat shocked at 40°C for 7 minutes. Cells were pelleted in a microfuge for 10 seconds and resuspended in 1 ml TE. Cells were then re-washed in 1 ml TE, resuspended in 50-100 μl TE and plated on selective plates. Plates were incubated at 30°C and colonies were picked after 2-3 days.

Small scale transformation was used to test the induction of the beta-galactosidase activity by the LexA DP-1 fusion plasmid. The beta-galactosidase filter assay is described below.

Large Scale Library Transformation

The LexA DP-1 fusion plasmid was first introduced into L40 by selecting for growth on tryptophan minus plates after a small scale transformation. The resulting strain was used to grow a 2 ml overnight culture in yeast selective medium minus tryptophan and minus uracil. Thereafter, the culture was diluted with 100 ml of the same medium. The next day the mid log phase culture was used to inoculate 1 litre of YPAD medium (pre-warmed to 30°C). The optical density at 600 nm should be about 0.3. This culture was grown at 30°C for 3 hours. During this time the cells should roughly double in number. Yeast were pelleted at 2500 rpm for 5 minutes at room temperature and resuspended in 500 ml of TE. After a re-spin the cells were taken up in 10 ml of 100 mM LiAc in 0.5 x TE. To this a mixture of 0.5 ml of 10 mg/ml denatured salmon sperm DNA and 200 μg of library plasmid was added. The suspension was mixed well. After this 70 ml of a solution containing 100 mM LiAc and 40% PEG-3350 in 1 x TBE was added and mixed well. This mixture was incubated for 30 minutes at 30°C. The transformation mixture was then transferred to a sterile 2 litre beaker and 8.8 ml of DMSO was added. After mixing, the suspension was heat shocked at 42°C in a water bath for 6 minutes. Thereafter, the suspension was diluted with 200 ml of YPA and rapidly cooled to room temperature in a water bath.
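By way of illustration only, the volumes stated for the small scale and large scale transformations can be used to estimate the final DMSO concentration at the heat shock step. The sketch below is not part of the protocol; the helper name is hypothetical, and the volumes of the plasmid DNA solutions, which are not stated above, are ignored.

```python
def percent_v_v(component_ul, other_volumes_ul):
    """Final concentration (% v/v) of a component added to the listed other volumes."""
    total = component_ul + sum(other_volumes_ul)
    return 100.0 * component_ul / total


# Small scale: 100 ul yeast suspension + 700 ul LiAc/PEG buffer, then 88 ul DMSO.
small = percent_v_v(88, [100, 700])

# Large scale: 10 ml yeast suspension + 0.5 ml salmon sperm DNA + 70 ml LiAc/PEG
# solution, then 8.8 ml DMSO (all volumes expressed in ul).
large = percent_v_v(8800, [10000, 500, 70000])

print(f"small scale: {small:.1f}% DMSO")
print(f"large scale: {large:.1f}% DMSO")
```

On these assumptions both procedures give a final DMSO concentration of a little under 10% (v/v), consistent with the large scale method being an approximately 100-fold scale-up of the small scale method.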

Cells were then pelleted at 2500 rpm for 5 minutes at room temperature, washed with 250 ml YPA medium and resuspended in 1 litre of pre-warmed YPAD medium. At this stage incubation at 30°C was allowed for 1 hour with gentle shaking. The culture was then pelleted at 2500 rpm for 5 minutes at room temperature and resuspended in 500 ml of selective medium omitting uracil, tryptophan and leucine (-UTL). After a further respin cells were resuspended in 1 litre of pre-warmed -UTL medium and incubated with shaking at 30°C for about 4 hours. Thereafter, cells were pelleted at 2500 rpm for 5 minutes at room temperature and washed twice with selective medium omitting tryptophan, histidine, uracil, leucine and lysine (-THULL). The final pellet was resuspended in 10 ml of -THULL medium and plated in aliquots of 100 μl on plates made from -THULL media. After 2-3 days colonies were picked to a grid. A nitrocellulose filter lift was used in a beta-galactosidase filter assay for analysis of lacZ induction.

Beta-galactosidase Filter Assay

Filters were removed from the plates and immersed for 3-5 seconds in liquid nitrogen. Filters were then placed, colony side up, at room temperature until thawed. The beta-galactosidase assay was performed in the lid of a petri dish, which held 3 ml of Z-buffer (60 mM Na2HPO4, 40 mM NaH2PO4, 10 mM KCl, 1 mM MgSO4, pH 7.0) containing 30 μl of 50 mg/ml X-gal. Circularised Whatman filters (#1) were placed into the buffer, followed by the nitrocellulose filters, colonies facing up. The lid was then covered with the bottom of the petri dish and placed at 30°C. Interactions were detectable by the appearance of a blue colour after 20 to 40 minutes.

Recovery of Plasmids from Yeast and Shuttling into E. coli

Viable cells were recovered from colonies and grown in a 50 ml overnight culture with the appropriate selection. The next morning cells were pelleted at 2500 rpm for 5 minutes at room temperature. Pellets were resuspended in 0.3 ml of lysis buffer (2.5 M LiCl, 50 mM Tris-Cl (pH 8.0), 4% Triton X-100, 62.5 mM EDTA). At this stage solutions were transferred to 1.5 ml tubes and 150 μl of glass beads (0.45 - 0.50 mm) together with 0.3 ml phenol/chloroform were added. After vigorous shaking for 1 minute samples were centrifuged for 1 minute and the aqueous phase transferred to a new tube. DNA was precipitated twice with ethanol and resuspended in 25 μl TE followed by electroporation of DNA into E. coli.

Verification of Interacting Partners

Recovered library plasmids from positive yeast colonies were retransformed into the L40 strain containing the LexA DP-1 bait vector using the small scale transformation procedure. Using the beta-galactosidase filter assay 68 positive colonies were identified with the LexA DP-1 bait. No induction of beta-galactosidase activity was detected in colonies transformed with a LexA instead of a LexA DP-1 clone. Therefore, the underlying protein interactions of those 68 positive colonies were considered significant and consequently further analysed.

Since the bait construct used in the screen contained the heterodimerisation domain of DP-1, most of the positive clones were expected to be one of the already identified E2Fs (E2F 1-5). Therefore, DNA from positive colonies was spotted to a grid and hybridised to radioactively labelled cDNA probes from full length E2F 1-5. The hybridisation was done using standard filter hybridisation techniques under stringent conditions. This screening for known DP-1 interacting proteins yielded six groups of cDNAs. Five of them were identified as E2F 1-5. However, one of our 68 clones did not hybridise to any of the probes under stringent conditions. Randomly selected clones from each of the positive groups 1-5 and the single DNA which gave no positive signal in the E2F hybridisation assay were sequenced. Sequencing of the group 1-5 cDNAs confirmed the result of the hybridisation assay. Sequencing of the single cDNA clone resulted in the identification of a protein fragment with a high degree of sequence homology to E2F family members, but clearly this clone was not one of the E2Fs identified so far (see description of EMA).

Identification of a Full Length Mouse cDNA of EMA

In order to isolate a full length cDNA clone of EMA a plasmid cDNA library of adult mouse liver tissue was plated on ampicillin-containing plates. Nitrocellulose filter lifts of those plates were hybridised to EMA DNA sequences obtained through the yeast two hybrid screen. Washes were done at high stringency and two positive signals were identified. One of those signals arose from hybridisation to a full length cDNA clone of EMA, the other positive signal identified a partial EMA clone. The full length sequence of a mouse EMA cDNA (SEQ ID No. 1) was verified by primer walking sequencing in both directions. The full length cDNA contained a fragment which was identical to the DNA sequence used as a probe for the library screen.

Example 2

Further Characterisation of EMA and its Biological Activity

As discussed, we isolated a 2.5 kb mouse cDNA. The 819 base pair open reading frame translates into a protein of 272 amino-acids with a predicted molecular mass of 30 kDa which we named EMA (Fig. 4a). Sequence analysis revealed high sequence similarity to the E2F family. Features found conserved in all E2F family members, such as the DNA-binding and dimerisation domains, are highly conserved in EMA (Fig. 4b). However, most strikingly, the EMA protein lacks the activation domain found at the C-terminus of all known E2Fs (Fig. 4a-b). Consequently, EMA also lacks the binding site for RB. This primary structure of EMA provides indication that, despite being E2F-like, EMA has functions fundamentally different from E2Fs.
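By way of illustration only, the relationship between the stated open reading frame length, polypeptide length and approximate molecular mass can be checked with a short calculation. The sketch assumes that the 819 base pair figure includes the stop codon and uses a rule-of-thumb average residue mass of 110 Da; both are assumptions and neither is stated in the text.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb value; an assumption, not from the source


def residues_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues):
    """Rough polypeptide mass estimate in kDa from residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


n = residues_from_orf(819)
print(n)                             # 272
print(round(approx_mass_kda(n), 1))  # 29.9
```

On these assumptions the figures of 819 base pairs, 272 amino acids and about 30 kDa are mutually consistent.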

EMA mRNA is expressed in a wide variety of tissues and the size of the EMA transcript is in good agreement with that of the EMA cDNA we isolated (Fig. 4c). Reprobing the blot in Figure 4c with a GAPDH probe demonstrated that EMA is expressed at similar levels in the different tissues tested. Also, as could be expected from the cloning strategy, EMA binds to DP-1 in a GST-pulldown experiment but not to a GST control (Fig. 4d). However, no significant binding of EMA to GST-EMA or GST-E2F-1 was detected (data not shown), suggesting that EMA preferentially heterodimerises with DP family members.

One characteristic feature of all E2F/DP protein complexes is their ability to recognise a consensus E2F-binding site. To investigate whether EMA could bind to an E2F-binding site we expressed EMA, DP-1 and E2F-1 as GST-fusion proteins and performed gel retardation assays using the prototype E2F-binding site of the adenovirus E2 promoter as a probe (core: T-C-G-C-G-C). We were unable to detect EMA binding to this site. In order to identify EMA/DP-1 binding sites we employed a binding site selection assay, the rationale of which is depicted in Figure 5a. Interestingly, all selected binding sites had an identical core region (T-C-C-C-G-C) without a single exception in all sequenced sites (Fig. 5b). The adjacent nucleotide 3' to the core region was always a cytosine or guanine followed by a poly-thymine or poly-adenine stretch. The 5' side of the core region was in most cases flanked by two further thymine residues.

The EMA/DP-1-specific core sequence falls within the consensus E2F-binding site identified by a binding site selection assay on a RB-bound E2F activity (T-C/G-C/G-C/G-G-C) (ref 17). Therefore, EMA/DP-1 appears to recognise only a subset of E2F-binding sites with high affinity. This explains the lack of binding to the viral E2 promoter E2F-binding site, which deviates from the core EMA/DP-1 site by a single residue. We next asked whether EMA/DP-1 could bind to the specifically selected binding site under gel-shift conditions. Figure 5d shows that EMA only recognises the selected binding site (core: T-C-C-C-G-C) in co-operation with DP-1 but does not recognise the E2F-binding site of the E2 promoter (T-C-G-C-G-C). This binding is specific since it can be competed for with a specific but not with a non-specific competitor. In contrast, E2F-1 can specifically bind to both sites and can do so in the absence of DP-1 (Fig. 5c). This experiment confirms the result obtained with the binding site selection assay and supports the view that EMA/DP-1 heterodimers recognise only a particular subset of E2F-binding sites.
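By way of illustration only, the relationship between the three core sequences discussed above can be expressed as simple pattern matching. The function names and the test sequence are hypothetical; the test sequence was constructed for this sketch to mimic the described flanking pattern (two thymines, the core, a cytosine and a poly-thymine stretch) and is not one of the selected binding sites.

```python
import re

EMA_DP1_CORE = "TCCCGC"      # core selected for EMA/DP-1
E2_PROMOTER_CORE = "TCGCGC"  # core of the adenovirus E2 promoter E2F site
E2F_CONSENSUS = re.compile(r"T[CG][CG][CG]GC")  # T-C/G-C/G-C/G-G-C


def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_core(sequence, core=EMA_DP1_CORE):
    """0-based start positions of exact matches to core on the given strand."""
    seq = sequence.upper()
    width = len(core)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == core]


print(bool(E2F_CONSENSUS.fullmatch(EMA_DP1_CORE)))      # True
print(bool(E2F_CONSENSUS.fullmatch(E2_PROMOTER_CORE)))  # True
print(mismatches(EMA_DP1_CORE, E2_PROMOTER_CORE))       # 1
print(find_core("AATTTCCCGCCTTTTT"))                    # [4]
```

Both cores satisfy the consensus, yet they differ at one position, which is the distinction drawn above between the E2 promoter site and the EMA/DP-1 site. An exact-match search of this kind considers the core only and takes no account of flanking residues.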

A search for the EMA/DP-1 core binding site (T-C-C-C-G-C) identified this sequence in the promoter region of the B-myb (ref 18), c-myc (ref 19), human thymidine kinase (ref 20) and human cyclin A (refs 21, 22) genes. However, we were not able to demonstrate binding of EMA/DP-1 to these sites, which highlights the importance of residues surrounding the core T-C-C-C-G-C motif (data not shown).

We next asked whether the sequence identified as a binding site for EMA/DP-1 was sufficient to confer regulation on a basal promoter. Therefore, we inserted two consensus EMA/DP-1-binding sites into the reporter plasmid pA10CAT immediately upstream of the minimal SV40 early gene promoter, resulting in pA10CAT-EMA (Fig. 6a). pA10CAT shows a readily detectable basal activity when transfected into HeLa cells (Fig. 6b). Surprisingly, pA10CAT-EMA showed a markedly reduced level of expression compared to pA10CAT (Fig. 6b). In contrast, the presence of the double E2F-binding site of the E2 promoter resulted in an increased expression level of pA10CAT-E2F compared to pA10CAT. The demonstration that an EMA/DP-1 binding site functions as a negative regulatory element in HeLa cells provides indication that EMA itself acts as a transcriptional repressor.

To assess whether EMA can function as a transcriptional repressor we linked the EMA cDNA onto the GAL4 DNA-binding domain and asked if the encoded fusion protein would repress the activity of the high basal level promoter 5GALTKCAT (Fig. 6c) in transient transfection assays. Most strikingly, the GALEMA fusion protein can reduce the "basal" level of the reporter plasmid by about six fold (Fig. 6d).

As expected, a GALRB construct which had recently been shown to confer direct repressor activity (ref 24) also reduced the reporter activity (Fig. 6d), whereas a GAL fusion protein containing the E2F-1 activation domain markedly activated the promoter (data not shown).

Since EMA lacks the complete C-terminus of the E2Fs, the EMA N-terminus was most likely to confer the repressor activity. When the 62 N-terminal amino-acids of EMA were fused onto GAL4 this construct repressed transcription to an even higher level than the full length GALEMA fusion (Fig. 6d). This repression was not due to steric hindrance since a transcriptionally inert fragment of E2F-1 containing amino acids 284-359 (ref 25) had no significant effect on the activity of the 5GTKCAT reporter (Fig. 6d). Thus EMA contains an independent repressor domain within its N-terminus, a region not conserved in any of the E2F family members.

To assess the biological functions of EMA we transiently transfected HeLa cells either with an expression vector for EMA (pCIEMA) or with control plasmids pCMVE2F-1 or pCINEO (Clontech). We then performed FACS analysis on transfected cells and asked whether EMA expression would alter cell cycle progression. In these experiments EMA transfected cells showed a cell cycle profile similar to that of cells expressing E2F-1. The G1 population was markedly decreased and the S/G2 population markedly increased, compared to cells transfected with the control plasmid (Fig. 7). These results indicate that EMA, like E2F-1, has the ability to mediate the induction of S phase.

Taken together these results demonstrate that EMA represents a member of a novel class of transcription factor with an E2F-binding site Modulating Activity: EMA can bind a subset of E2F sites along with DP-1. In addition, like E2F family members, EMA has the capacity to stimulate cell cycle progression. Our results indicate, however, that the mechanism by which EMA regulates the cell cycle is distinct from that of the E2F family. Cell cycle progression mediated by E2F-1 requires the activation domain which is missing in EMA. The unique transcriptional repression characteristics of EMA may be responsible for its effects on the cell cycle. Identification of cellular promoters repressed by EMA in vivo will shed light on the mechanism by which EMA is able to act as an S-phase inducing transcription factor.

Methods

Cloning of an EMA cDNA

The yeast two hybrid system was kindly provided by S. M. Hollenberg and used essentially as previously reported (ref 26). The bait construct expressing DP-1 amino acids 146 - 410 was constructed by amplifying the appropriate region of DP-1 by PCR using gene specific sense (5'AGTAGAATTCTTCAGCGCTGCCGACAACCAC3') and anti-sense primers (5'AGCGGGATCCTCACTGAGCCATTTCTGTCACGTATG3'). The amplified product was cloned into pBTM116 resulting in pBTMDP-1. The library used in the screen was a 9.5/10.5 day mouse embryonic cDNA library (kindly provided by S. M. Hollenberg). An EMA fragment identified in the screen containing DNA sequence corresponding to amino-acids 141-259 was used to isolate a full length cDNA clone of EMA from an adult murine liver λzap library (Stratagene). Two independent clones were isolated.
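By way of illustration only, the two DP-1 primers given above can be scanned for the recognition sequences of EcoRI (GAATTC) and BamHI (GGATCC), the enzymes named in Example 1 for cloning the amplified DP-1 fragment into pBTM116. The function name is hypothetical and the sketch checks only for the presence and position of each recognition sequence.

```python
# Standard recognition sequences of the two enzymes named in Example 1.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def sites_in(primer):
    """Map enzyme name to the 0-based position of its site, for sites present in primer."""
    p = primer.upper()
    return {name: p.find(site) for name, site in SITES.items() if site in p}


sense = "AGTAGAATTCTTCAGCGCTGCCGACAACCAC"
antisense = "AGCGGGATCCTCACTGAGCCATTTCTGTCACGTATG"

print(sites_in(sense))      # {'EcoRI': 4}
print(sites_in(antisense))  # {'BamHI': 4}
```

The sense primer carries an EcoRI site and the anti-sense primer a BamHI site, each preceded by four additional 5' nucleotides, which is consistent with directional cloning of the product as an EcoRI-BamHI fragment.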

Plasmid Construction

Plasmids for in vitro-transcription of IE2 (ref 15) and E2F-1 (ref 12) have been described. pING14-EMADN contains the coding region for amino acids 62-272. pGEX-2TK-EMA was constructed by amplifying the entire coding region of EMA using gene specific sense (5'AGGAGGATCCGAATTCATGAGTCAGCAGCGGACGGC3') and anti-sense primers (5'ATGCACTAGTACACTGGATGGGGCACATGATTC3'). The amplimer was cloned directionally into pGEX-2TK. pGEXDP-1 (ref 27) and pGEXE2F-1 (ref 11) were described before. pCMV-EMA was constructed by inserting a full length EMA cDNA into pRcCMV (Invitrogen). pCMVGT-EMA was made by amplifying the EMA coding sequence with the aforementioned primers and cloning of the amplimer into the Gal-fusion plasmid pCMVGT (T.K., unpublished). pCMVGT-EMA(N) was made by inserting amplified EMA DNA sequences corresponding to the first 62 amino acids of EMA (sense primer: 5'AGGAGGATCCGAATTCATGAGTCAGCAGCGGACGGC3' and anti-sense primer: 5'AGTCACTAGTGGGCCTCTTCACTTTCAGAGCTTTTC3') into pCMVGT. pGALRB was constructed by cloning the RB coding region corresponding to amino-acids 378-928 directionally into pHKG4 (ref 28).

Northern Blot Analysis

A murine multiple tissue Northern blot (Clontech) was probed with a random primed cDNA fragment of EMA according to the manufacturer's instructions. As a control the identical blot was hybridised with a PstI-XbaI fragment of the GAPDH gene (kindly provided by M. W. Hentze). Quantitative evaluation was done using a phosphoimager (Biorad).

Binding Site Selection Assay

Oligonucleotides used to produce and amplify the probe were as follows: BSS-1: 5'CGGGTCTAGATCTGTGAGATCAG-N16-GAGACTGAGCGTGAATTCCGTC3'; BSS-2: 5'CGGGTCTAGATCTGTGAGATCAG3'; BSS-3: 5'GACGGAATTCACGCTCAGTCTC3'. To obtain the double stranded probe, BSS-3 was annealed to BSS-1 and Klenow enzyme used to fill in the overhang. The probe can be amplified using BSS-2 and BSS-3 as primers. It contains restriction sites for subcloning and diagnostic purposes. Approximately 20 pg of double stranded probe was incubated with 500 ng of GST fusion proteins for 1 h at room temperature in 200 μl binding buffer (in mM: 20 HEPES, pH 7.4; 100 KCl; 1 MgCl2; 0.1 EDTA; 1 DTT; 8% glycerol; 30 μg BSA and 1 μg ssDNA). After washes in binding buffer specifically bound probe was eluted and amplified by PCR. 5 μl of the amplification product were used as probe for the next round of selection. After five rounds of selection the products were subcloned into Bluescript SK (Stratagene) and DNA from individual clones was subjected to sequence analysis.
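By way of illustration only, the design of the three oligonucleotides can be checked computationally: BSS-2 should be identical to the 5' arm of BSS-1, and BSS-3 should be the reverse complement of its 3' arm so that it can anneal and prime the Klenow fill-in across the random N16 region. The variable and function names below are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (N is kept as N)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


BSS1_5P_ARM = "CGGGTCTAGATCTGTGAGATCAG"
BSS1_3P_ARM = "GAGACTGAGCGTGAATTCCGTC"
BSS2 = "CGGGTCTAGATCTGTGAGATCAG"
BSS3 = "GACGGAATTCACGCTCAGTCTC"

# Full-length template with 16 randomised positions between the two fixed arms.
bss1 = BSS1_5P_ARM + "N" * 16 + BSS1_3P_ARM

print(BSS2 == BSS1_5P_ARM)                      # True
print(reverse_complement(BSS3) == BSS1_3P_ARM)  # True
print(len(bss1))                                # 61
```

The check confirms that BSS-2 and BSS-3 flank the randomised region in the orientation required for amplification of the selected probe by PCR.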

Gel Retardation Assays

Approximately 1 pg of probe was incubated with 5 ng of GST-E2F-1, 20 ng of GST-DP-1 and/or 30 ng of GST-EMA at conditions previously described (ref 13).

Flow Cytometry Analysis

Cells were trypsinized, fixed in 75% ethanol on ice for several hours, and stained with 50 μg/ml of propidium iodide containing 0.2 mg/ml of RNase. Flow cytometry analysis was performed on a Becton-Dickinson FACScan. Using the CellQuest software, gates were selected for single cells within a normal size range. The propidium iodide signal was used as a measure of DNA content. The DNA histograms each contain data from 50000 cells.

Example 3

Identification of a Clone Encoding the Human Homolog of Mouse EMA

Cloning and sequencing of the human homolog of mouse EMA has been carried out using the coding sequence of mouse EMA as a probe. The human coding and polypeptide sequences are shown in Figures 2 and 3 respectively.

References

1. Johnson, D.G. et al. Nature 365, 349-352 (1993).

2. Qin, X.-Q. et al. Proc. Natl. Acad. Sci. USA 91, 10918-10922 (1994).

3. Shan, B., and Lee, W.-H. Mol. Cell. Biol. 14, 8166-8173 (1994).

4. Lukas, J. et al. Mol. Cell. Biol. 16, 1047-1057 (1996).

5. Singh, P. et al. EMBO J. 13, 3329-38 (1994).

6. Beijersbergen, R.L. et al. Genes Dev. 8, 2680-90 (1994).

7. Xu, G. et al. Proc. Natl. Acad. Sci. USA 92, 1357-61 (1995).

8. Johnson, D.G. et al. Proc. Natl. Acad. Sci. USA 91, 12823-7 (1994).

9. Wu, X. and Levine, A.J. Proc. Natl. Acad. Sci. USA 91, 3602-3606 (1994).

10. Dynlacht, B.D. et al. Proc. Natl. Acad. Sci. USA 91, 6359-6363 (1994).

11. Helin, K. et al. Cell 70, 337-350 (1992).

12. Kaelin, W.G. et al. Cell 70, 351-364 (1992).

13. Lees, J.A. et al. Mol. Cell. Biol. 13, 7813-7825 (1993).

14. Hijmans, E.M. et al. Mol. Cell. Biol. 15, 3082-3089 (1995).

15. Hagemeier, C. et al. EMBO J. 13, 2897-2903 (1994).

16. Kovesdi, I. et al. Cell 45, 219-228 (1986).

17. Chittenden, T. et al. Cell 65, 1073-82 (1991).

18. Lam, E.W-F. and Watson, R.J. EMBO J. 12, 2705-2713 (1993).

19. Thalmeier, K. et al. Genes Dev. 3, 527-536 (1989).

20. Li, L-J. et al. Proc. Natl. Acad. Sci. USA 90, 3554-3558 (1993).

21. Zwicker, J. et al. EMBO J. 14, 4514-4522 (1995).

22. Schulze, A. et al. Proc. Natl. Acad. Sci. USA 92, 11264-11268 (1995).

23. Yew, P.R. et al. Genes Dev. 8, 190-202 (1994).

24. Weintraub, S.J. et al. Nature 375, 812-815 (1995).

25. Hagemeier, C. et al. Nucleic Acids Research 21, 4998-5004 (1993).

26. Vojtek, A.B. et al. Cell 74, 205-214 (1993).

27. Girling, R. et al. Nature 362, 83-87 (1993).

28. Hagemeier, C. et al. Proc. Natl. Acad. Sci. USA 90, 1580-1584 (1993).