Title:
FRAGILE X RELATED PROTEINS, COMPOSITIONS AND METHODS OF MAKING AND USING THE SAME
Document Type and Number:
WIPO Patent Application WO/1996/038467
Kind Code:
A1
Abstract:
The present invention relates to substantially pure FXR1 and FXR2 proteins and isolated nucleic acid molecules encoding the same. Recombinant expression vectors comprising nucleic acid sequences that encode FXR1 and FXR2 protein are also provided. Antibodies which bind to an epitope of FXR1, FXR2, or FMR1 protein are also provided. The present invention also relates to methods of screening individuals for FMR1 deficiency using antibodies or nucleic acid molecules of the invention and pharmaceutical kits for accomplishing the same.

Inventors:
DREYFUSS GIDEON
SIOMI MIKIKO C
ZHANG YAN
Application Number:
PCT/US1996/008853
Publication Date:
December 05, 1996
Filing Date:
May 31, 1996
Assignee:
UNIV PENNSYLVANIA (US)
International Classes:
C07K14/47; C07K16/18; C12Q1/68; (IPC1-7): C07K1/00; C07K16/00; C07H21/02; C07H21/04; C12N5/00; C12N15/00; C12Q1/68; G01N33/53
Other References:
CELL, 30 July 1993, Volume 74, SIOMI et al., "The Protein Product of the Fragile X Gene, FMR1, Has Characteristics of an RNA-Binding Protein", pages 291-298.
CELL, 31 May 1991, Volume 65, VERKERK et al., "Identification of a Gene (FMR-1) Containing a CGG Repeat Coincident with a Breakpoint Cluster Region Exhibiting Length Variation in Fragile X Syndrome", pages 905-914.
CELL, 25 July 1994, Volume 78, BAKKER et al., "Fmr1 Knockout Mice: A Model to Study Fragile X Mental Retardation", pages 23-33.
Claims:
CLAIMS
1. A substantially pure protein having the amino acid sequence of SEQ ID NO:4.
2. An isolated nucleic acid molecule consisting of SEQ ID NO:3 or a fragment thereof having at least 10 nucleotides.
3. The nucleic acid molecule of claim 2 consisting of SEQ ID NO:3.
4. A recombinant expression vector comprising a nucleic acid sequence that encodes the protein of claim 1.
5. A host cell comprising the recombinant expression vector of claim 4.
6. A recombinant expression vector comprising the nucleic acid molecule of claim 3.
7. A host cell comprising the recombinant expression vector of claim 6.
8. The nucleic acid molecule of claim 2 consisting of a fragment of SEQ ID NO:3 having at least 10 nucleotides.
9. The nucleic acid molecule of claim 2 consisting of a fragment of SEQ ID NO:3 having 12-150 nucleotides.
10. The nucleic acid molecule of claim 2 consisting of a fragment of SEQ ID NO:3 having 15-50 nucleotides.
11. The nucleic acid molecule of claim 2 consisting of a fragment of SEQ ID NO:3 having 18-30 nucleotides.
12. The nucleic acid molecule of claim 2 consisting of a fragment of SEQ ID NO:3 having 24 nucleotides.
13. An oligonucleotide molecule comprising a nucleotide sequence complementary to a nucleotide sequence of at least 10 nucleotides of SEQ ID NO:3.
14. The oligonucleotide molecule of claim 13 consisting of a nucleotide sequence complementary to a nucleotide sequence of at least 10-150 nucleotides of SEQ ID NO:3.
15. The oligonucleotide molecule of claim 14 consisting of a nucleotide sequence complementary to a nucleotide sequence of at least 18-28 nucleotides of SEQ ID NO:3.
16. An isolated antibody which binds to an epitope on SEQ ID NO:4.
17. The antibody of claim 16 which binds to an epitope within amino acids 380 to 621 of SEQ ID NO:4.
18. The antibody of claim 16 which binds to an epitope within amino acids 535 to 621 of SEQ ID NO:4.
19. The antibody of claim 16 wherein said antibody is a monoclonal antibody.
20. A substantially pure protein having the amino acid sequence of SEQ ID NO:6.
21. An isolated nucleic acid molecule consisting of SEQ ID NO:5 or a fragment thereof having at least 10 nucleotides.
22. The nucleic acid molecule of claim 21 consisting of SEQ ID NO:5.
23. A recombinant expression vector comprising a nucleic acid sequence that encodes the protein of claim 20.
24. A host cell comprising the recombinant expression vector of claim 23.
25. A recombinant expression vector comprising the nucleic acid molecule of claim 22.
26. A host cell comprising the recombinant expression vector of claim 25.
27. The nucleic acid molecule of claim 21 consisting of a fragment of SEQ ID NO:5 having at least 10 nucleotides.
28. The nucleic acid molecule of claim 21 consisting of a fragment of SEQ ID NO:5 having 12-150 nucleotides.
29. The nucleic acid molecule of claim 21 consisting of a fragment of SEQ ID NO:5 having 15-50 nucleotides.
30. The nucleic acid molecule of claim 21 consisting of a fragment of SEQ ID NO:5 having 18-30 nucleotides.
31. The nucleic acid molecule of claim 21 consisting of a fragment of SEQ ID NO:5 having 24 nucleotides.
32. An oligonucleotide molecule comprising a nucleotide sequence complementary to a nucleotide sequence of at least 10 nucleotides of SEQ ID NO:5.
33. The oligonucleotide molecule of claim 32 consisting of a nucleotide sequence complementary to a nucleotide sequence of at least 10-150 nucleotides of SEQ ID NO:5.
34. The oligonucleotide molecule of claim 33 consisting of a nucleotide sequence complementary to a nucleotide sequence of at least 18-28 nucleotides of SEQ ID NO:5.
35. An isolated antibody which binds to an epitope on SEQ ID NO:6.
36. The antibody of claim 35 which binds to an epitope within amino acids 390 to 673 of SEQ ID NO:6.
37. The antibody of claim 35 which binds to an epitope within amino acids 574 to 673 of SEQ ID NO:6.
38. The antibody of claim 35 wherein said antibody is a monoclonal antibody.
39. An isolated antibody which binds to an epitope within amino acids 331 to 375 of SEQ ID NO:2.
40. An isolated antibody which binds to an epitope within amino acids 520 to 610 of SEQ ID NO:2.
41. A method of screening individuals for FMR1 deficiency comprising the steps of: contacting a sample of tissue or body fluid from said individual with antibodies that bind to FMR1 but that do not bind to FXR1 or FXR2; and detecting said antibodies that are bound to FMR1 in said sample.
42. A method of screening individuals for FMR1 deficiency comprising the steps of: contacting a sample of nucleic acid molecules derived from tissue or body fluid from said individual with nucleic acid probes that hybridize to nucleotide sequences that encode FMR1 but that do not hybridize to nucleotide sequences that encode FXR1 or FXR2; and detecting said nucleic acid probes that are hybridized to nucleotide sequences that encode FMR1 in said sample.
43. A method of screening individuals for FMR1 deficiency comprising the steps of: amplifying by polymerase chain reaction, nucleic acid molecules derived from tissue or body fluid from said individual using primers that amplify nucleotide sequences that encode FMR1 but that do not amplify nucleotide sequences that encode FXR1 or FXR2; and detecting amplified nucleic acid molecules in said sample.
44. FXR1 knockout mice which lack normal FXR1 mRNA and are homozygous for a mutated, nonfunctional FXR1 gene.
45. FXR2 knockout mice which lack normal FXR2 mRNA and are homozygous for a mutated, nonfunctional FXR2 gene.
46. A pharmaceutical kit comprising: a container comprising antibodies which bind to FMR1 but not FXR1 or FXR2; positive and negative controls; and instructions.
47. The kit of claim 46 wherein the antibodies bind to an epitope within amino acids 331 to 375 of SEQ ID NO:2.
48. The kit of claim 46 wherein the antibodies bind to an epitope within amino acids 520 to 610 of SEQ ID NO:2.
49. A pharmaceutical kit comprising: a container comprising nucleic acid molecules which hybridize to nucleic acid molecules encoding FMR1 but not FXR1 or FXR2; positive and negative controls; and instructions.
Description:
FRAGILE X RELATED PROTEINS, COMPOSITIONS AND METHODS OF MAKING AND USING THE SAME

FIELD OF THE INVENTION

The invention relates to novel human proteins, FXR1 and FXR2, which are related to FMR1, the protein associated with Fragile X syndrome. The invention also relates to diagnostic assays for identifying individuals with Fragile X syndrome.

BACKGROUND OF THE INVENTION

Fragile X mental retardation syndrome is one of the most common human genetic diseases and the most common cause of hereditary mental retardation, affecting about 1 in 1200 males and 1 in 2500 females. Reviewed in Oostra, et al., (1993) in Genome analysis: Genome mapping and neurological disorder, Vol. 6, K.E. Davies and S.M. Tilghman (Eds.), Cold Spring Harbor Laboratory Press, New York, pp. 45-75 and Nussbaum, R.L. and Ledbetter, D.H., (1995) in Metabolic Basis of Inherited Disease, C.R. Scriver, et al. (Eds.), McGraw Hill, New York, 7th Ed., pp. 795-810. The syndrome is characterized by mental retardation (average I.Q. of 20-60) and varying degrees of autistic behavior, macroorchidism in adult males, characteristic facial features and hyperextensible joints. Hagerman, R.J., (1991) in Fragile X syndrome: Diagnosis, treatment, and research, R.J. Hagerman and A.C. Silverman (Eds.), Johns Hopkins University Press, Baltimore, pp. 1-68.

The gene directly responsible for fragile X syndrome, FMR1, is located at Xq27.3. Verkerk, et al., Cell, 1991, 65, 905-914. The nucleotide and amino acid sequences of FMR1 are set forth in SEQ ID NO:1 and SEQ ID NO:2, respectively. The 5' untranslated region of the FMR1 gene contains a polymorphic CGG trinucleotide repeat (6-60 repeats are found in normal individuals), which can be amplified to hundreds or thousands of copies in affected patients. Verkerk, et al., Cell, 1991, 65, 905-914; Oberle, et al., Science, 1991, 252, 1097-1102; Yu, et al., Science, 1991, 252, 1179-1181; Fu, et al., Cell, 1991, 67, 1047-1058. Fragile X syndrome usually results from expansion of the CGG repeats leading to hypermethylation of the CpG island adjacent to FMR1 and loss of transcription of the FMR1 gene. Pieretti, et al., Cell, 1991, 66, 817-822. Indeed, affected patients usually do not have detectable FMR1 protein. Siomi, et al., Cell, 1993, 74, 291-298 and Verheij, et al., Nature, 1993, 363, 722-724. The FMR1 mRNA and protein are expressed in many tissues, but particularly high levels are found in the brain and in tubules of the testes, which are two of the major organs affected in fragile X syndrome. Hinds, et al., Nature Genet., 1993, 3, 36-43; Devys, et al., Nature Genet., 1993, 4, 335-340; Abitbol, et al., Nature Genet., 1993, 4, 147-152; and Bachner, et al., Nature Genet., 1993, 4, 115-116.
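The repeat counts described above can be illustrated with a minimal sketch. It is not part of the disclosed methods: the function names and the toy sequence are hypothetical, and the only figure taken from the text is the 6-60 repeat range reported for normal individuals.

```python
import re


def longest_cgg_run(seq: str) -> int:
    """Return the largest number of consecutive CGG units found in seq."""
    runs = re.findall(r"(?:CGG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def in_normal_range(n_repeats: int, low: int = 6, high: int = 60) -> bool:
    """Check a repeat count against the 6-60 range quoted in the text.

    The bounds are defaults for illustration only, not a diagnostic rule.
    """
    return low <= n_repeats <= high


# Hypothetical toy sequence (not SEQ ID NO:1): 29 CGG units with flanking bases.
toy_sequence = "AT" + "CGG" * 29 + "TTGA"
repeat_count = longest_cgg_run(toy_sequence)
```

Counting contiguous units in a string is a simplification; it says nothing about how repeat length is measured experimentally.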

FMR1 knockout mice have been generated. These knockout mice lack normal FMR1 mRNA and protein expression and show enlarged testes, impaired cognitive function, and abnormal behavior. This animal model supports the central role of FMR1 in fragile X syndrome and it may serve as a valuable tool for the elucidation of the physiological role of FMR1. Bakker, et al., Cell, 1994, 78, 23-33.

The FMR1 protein contains motifs characteristic of RNA-binding proteins, namely two KH domains and an RGG box (Siomi, et al., Cell, 1993, 74, 291-298; Ashley, et al., Science, 1993, 262, 563-566; and Gibson, et al., Trends Biochem. Sci., 1993, 18, 331-333), and has been shown to bind RNA in vitro (Siomi, et al., Cell, 1993, 74, 291-298; and Ashley, et al., Science, 1993, 262, 563-566). Importantly, the RNA-binding activity of the FMR1 Ile-304→Asn mutant (which changes a highly conserved residue in the KH domain) that was found in a patient with severe fragile X syndrome (De Boulle, et al., Nature Genet., 1993, 3, 31-35), is impaired (Siomi, et al., Cell, 1994, 77, 33-39). Together, these findings suggest a strong connection between fragile X syndrome and the RNA-binding activity of FMR1. However, the cognate RNA target of FMR1 and its precise functions have not yet been elucidated.

PCT Publication WO 90/05194, which is incorporated herein by reference, describes a probe that is used to detect fragile X syndrome. The probe consists of a nucleic acid fragment that is hybridizable with the human X chromosome at the region Xq27.3 - DXS52.

PCT Publication WO 91/09140, which is incorporated herein by reference, describes a probe for the detection of fragile X syndrome. The probe comprises at least 17 contiguous nucleotide bases.

PCT Publication WO 92/20825, which is incorporated herein by reference, describes nucleotide sequences and cosmids used to detect fragile X syndrome. The nucleotide sequences correspond to the FMR-1 gene.

PCT Publication WO 86/05512, which is incorporated herein by reference, describes DNA probes that recognize the polymorphic locus of the q28 region of the X chromosome.

PCT Publication WO 93/15225, which is incorporated herein by reference, describes a method of amplifying and detecting specific GC-rich nucleic acid sequences by polymerase chain reaction (PCR), which may be used to detect individuals with fragile X syndrome. The method determines whether the number of CGG repeats in the test individual's X-chromosome is characteristic of a normal, carrier, or afflicted person. Such a method is used to amplify and detect the GC-rich region of the FMR1 gene.

PCT Publication WO 92/12262, which is incorporated herein by reference, describes DNA sequences that may be used to detect individuals with fragile X syndrome. The DNA sequences span the fragile X site on the human X chromosome.

PCT Publication WO 92/14840, which is incorporated herein by reference, describes nucleic acid fragments containing mutations associated with the fragile X syndrome that may be used to detect individuals with mental retardation.

There remains a need for reagents, kits and methods useful in the identification of individuals suffering from fragile X syndrome. There is a need for reagents, kits and methods useful in the identification of individuals suffering from FMR1 deficiency without misdiagnosing an individual's condition due to cross-reactivity with fragile X related proteins.

SUMMARY OF THE INVENTION

The present invention relates to substantially pure FXR1 protein.

The present invention relates to nucleic acid molecules that encode FXR1 protein.

The present invention relates to recombinant expression vectors that comprise a nucleic acid sequence that encodes FXR1 protein.

The present invention relates to host cells that comprise recombinant expression vectors that encode FXR1 protein.

The present invention relates to fragments of nucleic acid molecules with sequences encoding FXR1 protein that have at least 10 nucleotides.

The present invention relates to oligonucleotide molecules that comprise a nucleotide sequence complementary to a nucleotide sequence of at least 10 nucleotides of SEQ ID NO:3.

The present invention relates to substantially pure FXR2 protein.

The present invention relates to nucleic acid molecules that encode FXR2 protein.

The present invention relates to nucleic acid molecules encoding FXR2 protein that consist of SEQ ID NO:5.

The present invention relates to recombinant expression vectors that comprise a nucleic acid sequence that encodes FXR2 protein.

The present invention relates to host cells that comprise recombinant expression vectors that encode FXR2 protein.

The present invention relates to fragments of nucleic acid molecules with sequences encoding FXR2 protein that have at least 10 nucleotides.

The present invention relates to oligonucleotide molecules that comprise a nucleotide sequence complementary to a nucleotide sequence of at least 10 nucleotides of SEQ ID NO:5.

The present invention relates to isolated antibodies which bind to an epitope on SEQ ID NO:4.

The present invention relates to isolated antibodies which bind to an epitope on SEQ ID NO:6.

The present invention relates to isolated antibodies which bind to FMR1 protein but not FXR1 protein or FXR2 protein.

The present invention relates to methods of screening individuals for FMR1 deficiency comprising the steps of contacting a sample of tissue or body fluid from said individual with antibodies that bind to FMR1 but not to FXR1 or FXR2 and detecting the FMR1-specific antibodies that are bound to FMR1 in the sample.

The present invention relates to methods of screening individuals for FMR1 deficiency comprising the steps of contacting a sample of nucleic acid molecules derived from tissue or body fluid from the individual with nucleic acid molecules that hybridize to nucleotide sequences that encode FMR1 but not FXR1 or FXR2.

The present invention relates to methods of screening individuals for FMR1 deficiency comprising the steps of amplifying nucleic acid molecules derived from tissue or body fluid from the individual using polymerase chain reaction primers that amplify nucleotide sequences that encode FMR1 but not FXR1 or FXR2.

The present invention relates to FXR1 knockout mice which lack normal FXR1 and are homozygous for a mutated, non-functional FXR1 gene.

The present invention relates to FXR2 knockout mice which lack normal FXR2 and are homozygous for a mutated, non-functional FXR2 gene.

The present invention relates to a pharmaceutical kit comprising a container comprising antibodies which bind to FMR1 but not FXR1 or FXR2, positive and negative controls, and instructions.

The present invention relates to a pharmaceutical kit comprising a container comprising nucleic acid molecules which hybridize to nucleic acid molecules encoding FMR1 but not FXR1 or FXR2, positive and negative controls, and instructions.

DETAILED DESCRIPTION OF THE DRAWINGS

Figure 1 shows a comparison of the amino acid sequences for human FMR1 (SEQ ID NO:2), FXR1 (SEQ ID NO:4) and FXR2 (SEQ ID NO:6). Conserved amino acids among the three proteins are shaded. The dashes indicate the placement of gaps to maximize alignment between the sequences. The amino acid position is shown at the left.

DETAILED DESCRIPTION OF THE INVENTION

Two novel human genes, termed FXR1 and FXR2, that are highly homologous by amino acid sequence to FMR1 have been discovered. The FXR1 and FXR2 proteins are cytoplasmic RNA-binding proteins that are expressed in many tissues. However, unlike FMR1, neither FXR1 nor FXR2 is located on the X chromosome. FXR1 is an autosomal gene located at 12q13. FXR2 is an autosomal gene located at 17p13.1.

According to some methods of detecting individuals with mental retardation caused by conditions such as fragile X syndrome, large nucleotide sequences, such as cosmids, are used to detect various loci associated with fragile X syndrome.

After FMR1 was isolated, subsequent approaches used nucleotide probes to detect the GC-rich region in the 5'-UTR of the FMR1 gene. In addition, immunoassays have been proposed to identify individuals who are suffering from FMR1 deficiencies. Two new genes, FXR1 and FXR2, have now been discovered that have significant sequence homology with FMR1. Thus, any nucleotide probes or antibodies used to detect the FMR1 gene or FMR1 protein, respectively, may cross-react with FXR1 or FXR2 genes and proteins. The detection of FXR1 or FXR2 may lead to the belief that the FMR1 gene or protein has been detected, resulting in a misdiagnosis. The present invention overcomes this problem by providing the means to detect unique FMR1 nucleotide or amino acid sequences which do not cross-react with corresponding regions of the FXR1 and FXR2 genes and proteins.
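As a rough illustration of looking for sequence regions unique to one gene, the following minimal sketch lists fixed-length windows of a target sequence that have no exact match in related sequences on either strand. It rests on several assumptions: the function names and toy sequences are hypothetical (they are not SEQ ID NO:1, 3 or 5), and the absence of an exact match is only a crude stand-in for the absence of cross-hybridization, which also depends on near-matches and assay conditions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """Return every window of length k found in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_windows(target: str, others: list, k: int = 10) -> list:
    """List (position, window) pairs of target with no exact match,
    on either strand, in any of the other sequences."""
    excluded = set()
    for other in others:
        excluded |= kmers(other, k)
        excluded |= kmers(reverse_complement(other), k)
    target = target.upper()
    return [
        (i, target[i:i + k])
        for i in range(len(target) - k + 1)
        if target[i:i + k] not in excluded
    ]


# Hypothetical toy sequences sharing their first 17 bases.
TOY_TARGET = "ATGGAGGAGCTGACGGTGGAAGTTCGC"
TOY_RELATED = "ATGGAGGAGCTGACGGTCCCATTTAGA"
candidates = unique_windows(TOY_TARGET, [TOY_RELATED], k=10)
```

In this toy case every window lying wholly inside the shared region is excluded, so candidate windows begin only where the two sequences diverge.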

The present invention provides the cloned gene that encodes FXR1. The nucleotide sequence that encodes FXR1 and that is disclosed herein as SEQ ID NO:3 allows for the production of pure FXR1 and the design of probes which specifically hybridize to nucleic acid molecules that encode FXR1 and antisense compounds to inhibit transcription of the gene that encodes FXR1.

The present invention provides the cloned gene that encodes FXR2. The nucleotide sequence that encodes FXR2 and that is disclosed herein as SEQ ID NO:5 allows for the production of pure FXR2 and the design of probes which specifically hybridize to nucleic acid molecules that encode FXR2 and antisense compounds to inhibit transcription of the gene that encodes FXR2.

Antibodies that specifically bind to FXR1 are provided. Such antibodies may be used in methods of isolating pure FXR1. In some preferred embodiments, the antibodies do not cross-react with FMR1 and can be used to distinguish FXR1 from FMR1. Such antibodies may, for example, bind to an epitope within amino acids 380 to 621 or 535 to 621 of FXR1 (as found in SEQ ID NO:4). In some preferred embodiments, the antibodies do not cross-react with FXR2 and can be used to distinguish FXR1 from FXR2. In some preferred embodiments, the antibodies do not cross-react with FMR1 and FXR2 and can be used to distinguish FXR1 from FMR1 and FXR2.

Antibodies that specifically bind to FXR2 are provided. Such antibodies may be used in methods of isolating pure FXR2. In some preferred embodiments, the antibodies do not cross-react with FMR1 and can be used to distinguish FXR2 from FMR1. Such antibodies may, for example, bind to an epitope within amino acids 390 to 673 or 574 to 673 of FXR2 (as found in SEQ ID NO:6). In some preferred embodiments, the antibodies do not cross-react with FXR1 and can be used to distinguish FXR2 from FXR1. In some preferred embodiments, the antibodies do not cross-react with FMR1 and FXR1 and can be used to distinguish FXR2 from FMR1 and FXR1.

Antibodies that specifically bind to FMR1 are provided. Such antibodies may be used in methods of isolating pure FMR1. Such antibodies may be used in methods of identifying individuals who have fragile X syndrome. That is, by identifying individuals whose tissue shows an absence or deficiency in FMR1, a diagnosis of fragile X syndrome is indicated. In some preferred embodiments, the antibodies do not cross-react with FXR1 or FXR2 and can be used to distinguish FMR1 from FXR1 and FXR2, thereby eliminating false positives and misdiagnosis.

The present invention provides substantially purified FXR1 and FXR2 which have amino acid sequences consisting of SEQ ID NO:4 and SEQ ID NO:6, respectively. The amino acid and nucleotide sequences for FXR1 and FXR2 may vary according to the clone from which they were isolated. It is expected that amino acid substitutions or deletions may be found in proteins isolated from numerous clones. FXR1 and FXR2 can be isolated from natural sources, produced by recombinant DNA methods or synthesized by standard protein synthesis techniques.

Antibodies which specifically bind to a particular FXR1 or FXR2 may be used to purify the protein from natural sources using well known techniques and readily available starting materials. Such antibodies may also be used to purify the FXR1 or FXR2 from material present when producing the protein by recombinant DNA methodology.

The present invention relates to antibodies that bind to FXR1 protein (SEQ ID NO:4) or FXR2 protein (SEQ ID NO:6). As used herein, the term "antibody" is meant to refer to complete, intact antibodies, and Fab fragments and F(ab)2 fragments thereof. Complete, intact antibodies include monoclonal antibodies such as murine monoclonal antibodies, chimeric antibodies and humanized antibodies. Antibodies that bind to an epitope which is present on FXR1 or FXR2 are useful to isolate and purify the FXR1 or FXR2 from both natural sources and recombinant expression systems using well known techniques such as affinity chromatography. Such antibodies are useful to detect the presence of such protein in a sample and to determine if cells are expressing the protein.

The production of antibodies and the protein structures of complete, intact antibodies, Fab fragments and F(ab)2 fragments and the organization of the genetic sequences that encode such molecules are well known and are described, for example, in Harlow, E. and D. Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, which is incorporated herein by reference. Briefly, for example, FXR1 or FXR2, or an immunogenic fragment thereof, is injected into mice. The spleen of the mouse is removed, the spleen cells are isolated and fused with immortalized mouse cells. The hybrid cells, or hybridomas, are cultured and those cells which secrete antibodies are selected. The antibodies are analyzed and, if found to specifically bind to the protein, the hybridoma which produces them is cultured to produce a continuous supply of antibodies.

Using standard techniques and readily available starting materials, nucleic acid molecules that encode FXR1 and FXR2 may each be isolated from a cDNA library, using probes which are designed based upon the nucleotide sequence information disclosed in SEQ ID NO:3 and SEQ ID NO:5, respectively.

The present invention relates to an isolated nucleic acid molecule that comprises a nucleotide sequence that encodes FXR1 or FXR2 comprising the amino acid sequence of SEQ ID NO:4 or SEQ ID NO:6, respectively. In some embodiments, the nucleic acid molecules consist of a nucleotide sequence that encodes FXR1 or FXR2. In some embodiments, the nucleic acid molecules comprise the nucleotide sequence that consists of the coding sequence in SEQ ID NO:3 or SEQ ID NO:5. In some embodiments, the nucleic acid molecules consist of the nucleotide sequence set forth in SEQ ID NO:3 or SEQ ID NO:5. The isolated nucleic acid molecules of the invention are useful to prepare constructs and recombinant expression systems for preparing FXR1 and FXR2.

A cDNA library may be generated by well known techniques. A cDNA clone which contains one of the nucleotide sequences set out is identified using probes that comprise at least a portion of the nucleotide sequence disclosed in SEQ ID NO:3 or SEQ ID NO:5. The probes generally have at least 16 nucleotides, preferably 24 nucleotides. The probes are used to screen the cDNA library using standard hybridization techniques. Alternatively, genomic clones may be isolated using genomic DNA from any human cell as a starting material.

The present invention relates to isolated nucleic acid molecules that comprise a nucleotide sequence identical or complementary to a fragment of SEQ ID NO:3 or SEQ ID NO:5 which is at least 10 nucleotides. In some embodiments, the isolated nucleic acid molecules consist of a nucleotide sequence identical or complementary to a fragment of SEQ ID NO:3 or SEQ ID NO:5 which is at least 10 nucleotides. In some embodiments, the isolated nucleic acid molecules comprise or consist of a nucleotide sequence identical or complementary to a fragment of SEQ ID NO:3 or SEQ ID NO:5 which is 15-150 nucleotides. In some embodiments, the isolated nucleic acid molecules comprise or consist of a nucleotide sequence identical or complementary to a fragment of SEQ ID NO:3 or SEQ ID NO:5 which is 15-30 nucleotides. Isolated nucleic acid molecules that comprise or consist of a nucleotide sequence identical or complementary to a fragment of SEQ ID NO:3 or SEQ ID NO:5 which is at least 10 nucleotides are useful as probes for identifying genes and cDNA sequences having SEQ ID NO:3 or SEQ ID NO:5, respectively, and as PCR primers for amplifying genes and cDNA having SEQ ID NO:3 or SEQ ID NO:5, respectively.
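The notion of an oligonucleotide that is "identical or complementary to a fragment" of a reference sequence can be sketched as a simple string check. This is an illustration under stated assumptions only: the function names and the toy reference are hypothetical (the reference is not SEQ ID NO:3 or SEQ ID NO:5), and the 10-nucleotide minimum is the figure given in the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def matches_fragment(oligo: str, reference: str, min_len: int = 10) -> bool:
    """True if oligo is at least min_len nucleotides long and is identical
    or complementary to a contiguous fragment of reference."""
    oligo = oligo.upper()
    reference = reference.upper()
    if len(oligo) < min_len:
        return False
    return oligo in reference or reverse_complement(oligo) in reference


# Hypothetical 30-base toy reference sequence.
TOY_REFERENCE = "GGATCCATGGCGGAGCTGACCGTTGAATTC"
sense_oligo = TOY_REFERENCE[5:17]                 # identical to a fragment
antisense_oligo = reverse_complement(sense_oligo)  # complementary to it
```

Here `matches_fragment(sense_oligo, TOY_REFERENCE)` and `matches_fragment(antisense_oligo, TOY_REFERENCE)` both hold, while an oligonucleotide shorter than ten bases is rejected regardless of its sequence.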

The cDNA that encodes FXR1 and FXR2 may be used as a molecular marker in electrophoresis assays in which cDNA from a sample is separated on an electrophoresis gel and probes are used to identify bands which hybridize to such probes. Specifically, SEQ ID NO:3 or portions thereof, and SEQ ID NO:5 or portions thereof, may be used as a molecular marker in electrophoresis assays in which cDNA from a sample is separated on an electrophoresis gel and specific probes are used to identify bands which hybridize to them, indicating that the band has a nucleotide sequence complementary to the sequence of the probes. The isolated nucleic acid molecule provided as a size marker will show up as a positive band which is known to hybridize to the probes and thus can be used as a reference point to the size of cDNA that encodes FXR1 and FXR2, respectively. Electrophoresis gels useful in such an assay include standard polyacrylamide gels as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, (1989) Second Ed., Cold Spring Harbor Press, New York, which is incorporated herein by reference.

The nucleotide sequences in SEQ ID NO:3 and SEQ ID NO:5 may be used to design probes, primers and complementary molecules which specifically hybridize to the unique nucleotide sequences of FXR1 and FXR2, respectively. Probes, primers and complementary molecules which specifically hybridize to nucleotide sequences that encode FXR1 and FXR2 may be designed routinely by those having ordinary skill in the art.

The present invention also includes labelled oligonucleotides which are useful as probes for performing oligonucleotide hybridization methods to identify FXR1 and FXR2. Accordingly, the present invention includes probes that can be labelled and hybridized to unique nucleotide sequences that encode FXR1 and FXR2, respectively. The labelled probes of the present invention are labelled with radiolabelled nucleotides or are otherwise detectable by readily available nonradioactive detection systems. In some preferred embodiments, probes comprise oligonucleotides consisting of between 10 and 100 nucleotides. In some preferred embodiments, probes comprise oligonucleotides consisting of between 10 and 50 nucleotides. In some preferred embodiments, probes comprise oligonucleotides consisting of between 12 and 20 nucleotides. The probes preferably contain a nucleotide sequence completely identical or complementary to a fragment of a unique nucleotide sequence of FXR1 or FXR2.

PCR technology is practiced routinely by those having ordinary skill in the art and its uses in diagnostics are well known and accepted. Methods for practicing PCR technology are disclosed in "PCR Protocols: A Guide to Methods and Applications", Innis, M.A., et al. Eds. Academic Press, Inc. San Diego, CA (1990) which is incorporated herein by reference. Applications of PCR technology are disclosed in "Polymerase Chain Reaction" Erlich, H.A., et al., Eds. Cold Spring Harbor Press, Cold Spring Harbor, New York (1989) which is incorporated herein by reference. Some simple rules aid in the design of efficient primers. Typical primers are 18-28 nucleotides in length having 50% to 60% G+C composition. The entire primer is preferably complementary to the sequence it must hybridize to. Preferably, primers generate PCR products of 100 base pairs to 2000 base pairs. However, it is possible to generate products from 50 base pairs up to 10 kb and more.

PCR technology allows for the rapid generation of multiple copies of nucleotide sequences by providing 5' and 3' primers that hybridize to sequences present in a nucleic acid molecule, and further providing free nucleotides and an enzyme which fills in the complementary bases to the nucleotide sequence between the primers with the free nucleotides to produce a complementary strand of DNA. The enzyme will fill in the complementary sequences adjacent to the primers. If both the 5' primer and 3' primer hybridize to nucleotide sequences on the complementary strands of the same fragment of nucleic acid, exponential amplification of a specific double-stranded product results. If only a single primer hybridizes to the nucleic acid molecule, linear amplification produces single-stranded products of variable length.
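The primer rules of thumb and the contrast between exponential and linear amplification can be sketched as follows. This is a minimal illustration, not a primer design tool: the function names and the example primer are hypothetical, the length and G+C bounds are the "typical" values quoted above, and the yield calculation assumes perfect efficiency in every cycle.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a non-empty sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_rules(primer: str, min_len: int = 18, max_len: int = 28,
                       min_gc: float = 0.50, max_gc: float = 0.60) -> bool:
    """Check a primer against the typical length and G+C ranges in the text."""
    if not min_len <= len(primer) <= max_len:
        return False
    return min_gc <= gc_fraction(primer) <= max_gc


def ideal_yield(templates: int, cycles: int, both_primers: bool = True) -> int:
    """Idealized molecule count after a number of cycles.

    With both primers the product doubles each cycle (exponential).
    With a single primer each template yields one new strand per cycle,
    so the total grows linearly.
    """
    if both_primers:
        return templates * 2 ** cycles
    return templates * (1 + cycles)


# Hypothetical 20-mer with 11 G or C bases (55% G+C).
example_primer = "ATGGCGGAGCTGACCATTGA"
```

Under these assumptions, ten ideal cycles turn one template into 1024 copies with both primers but only 11 strands with a single primer, which is why the paragraph above distinguishes the two cases.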

One having ordinary skill in the art can isolate the nucleic acid molecule that encodes FXR1 or FXR2 and insert it into an expression vector using standard techniques and readily available starting materials.

The present invention relates to a recombinant expression vector that comprises a nucleotide sequence that encodes FXR1 or FXR2 that comprises the amino acid sequence of SEQ ID NO:4 or SEQ ID NO:6, respectively. As used herein, the term "recombinant expression vector" is meant to refer to a plasmid, phage, viral particle or other vector which, when introduced into an appropriate host, contains the necessary genetic elements to direct expression of the coding sequence that encodes FXR1 or FXR2. The coding sequence is operably linked to the necessary regulatory sequences. Expression vectors are well known and readily available. Examples of expression vectors include plasmids, phages, viral vectors and other nucleic acid molecules or nucleic acid molecule containing vehicles useful to transform host cells and facilitate expression of coding sequences. In some embodiments, the recombinant expression vector comprises the nucleotide sequence set forth in SEQ ID NO:3 or SEQ ID NO:5. The recombinant expression vectors of the invention are useful for transforming hosts to prepare recombinant expression systems for preparing FXR1 or FXR2.

The present invention relates to a host cell that comprises the recombinant expression vector that includes a nucleotide sequence that encodes FXR1 or FXR2 that comprises SEQ ID NO:4 or SEQ ID NO:6. In some embodiments, the host cell comprises a recombinant expression vector that comprises SEQ ID NO:3 or SEQ ID NO:5. Host cells for use in recombinant expression systems for production of proteins are well known and readily available. Examples of host cells include bacteria cells such as E. coli, yeast cells such as S. cerevisiae, insect cells such as S. frugiperda, non-human mammalian tissue culture cells such as Chinese hamster ovary (CHO) cells and human tissue culture cells such as HeLa cells.

The present invention relates to a transgenic non-human mammal that comprises the recombinant expression vector that comprises a nucleic acid sequence that encodes the FXR1 or FXR2 that comprises the amino acid sequence of SEQ ID NO:4 or SEQ ID NO:6. Transgenic non-human mammals useful to produce recombinant proteins are well known, as are the expression vectors necessary and the techniques for generating transgenic animals. Generally, the transgenic animal comprises a recombinant expression vector in which the nucleotide sequence that encodes the protein of the invention is operably linked to a mammary cell specific promoter whereby the coding sequence is only expressed in mammary cells and the recombinant protein so expressed is recovered from the animal's milk. In some embodiments, the coding sequence that encodes an FXR1 or FXR2 is SEQ ID NO:3 or SEQ ID NO:5.

In some embodiments, for example, one having ordinary skill in the art can, using well known techniques, insert such DNA molecules into a commercially available expression vector for use in well known expression systems. For example, the commercially available plasmid pSE420 (Invitrogen, San Diego, CA) may be used for production in E. coli. The commercially available plasmid pYES2 (Invitrogen, San Diego, CA) may, for example, be used for production in S. cerevisiae strains of yeast. The commercially available MAXBAC™ complete baculovirus expression system (Invitrogen, San Diego, CA) may, for example, be used for production in insect cells. The commercially available plasmid pcDNA I (Invitrogen, San Diego, CA) may, for example, be used for production in mammalian cells such as CHO cells. One having ordinary skill in the art can use these commercial expression vectors and systems or others to produce FXR1 and FXR2 using routine techniques and readily available starting materials. (See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Ed., Cold Spring Harbor Press (1989), which is incorporated herein by reference.) Thus, the desired proteins can be prepared in both prokaryotic and eukaryotic systems, resulting in a spectrum of processed forms of the protein.

One having ordinary skill in the art may use other commercially available expression vectors and systems or produce vectors using well known methods and readily available starting materials. Expression systems containing the requisite control sequences, such as promoters and polyadenylation signals, and preferably enhancers, are readily available and known in the art for a variety of hosts. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Ed., Cold Spring Harbor Press, New York (1989).

A wide variety of eukaryotic hosts are also now available for production of recombinant foreign proteins. As in bacteria, eukaryotic hosts may be transformed with expression systems which produce the desired protein directly, but more commonly signal sequences are provided to effect the secretion of the protein. Eukaryotic systems have the additional advantage that they are able to process introns which may occur in the genomic sequences encoding proteins of higher organisms. Eukaryotic systems also provide a variety of processing mechanisms which result in, for example, glycosylation, carboxy-terminal amidation, oxidation or derivatization of certain amino acid residues, conformational control, and so forth. Commonly used eukaryotic systems include, but are not limited to, yeast, fungal cells, insect cells, mammalian cells, avian cells, and cells of higher plants. Suitable promoters are available which are compatible and operable for use in each of these host types, as are termination sequences and enhancers, e.g., the baculovirus polyhedrin promoter. As above, promoters can be either constitutive or inducible. For example, in mammalian systems, the mouse metallothionein promoter can be induced by the addition of heavy metal ions.

The particulars for the construction of expression systems suitable for desired hosts are known to those in the art. Briefly, for recombinant production of the protein, the DNA encoding the polypeptide is suitably ligated into the expression vector of choice. The DNA is operably linked to all regulatory elements which are necessary for expression of the DNA in the selected host. One having ordinary skill in the art can, using well known techniques, prepare expression vectors for recombinant production of the polypeptide.

The expression vector including the DNA that encodes FXR1 or FXR2 is used to transform the compatible host, which is then cultured and maintained under conditions wherein expression of the foreign DNA takes place. The protein of the present invention thus produced is recovered from the culture, either by lysing the cells or from the culture medium as appropriate and known to those in the art. One having ordinary skill in the art can, using well known techniques, isolate FXR1 or FXR2 that is produced using such expression systems. The methods of purifying FXR1 or FXR2 from natural sources using antibodies which specifically bind to FXR1 or FXR2, as described above, may be equally applied to purifying FXR1 or FXR2 produced by recombinant DNA methodology.

Examples of genetic constructs include the FXR1 or FXR2 coding sequences operably linked to a promoter that is functional in the cell line into which the constructs are transfected. Examples of constitutive promoters include promoters from cytomegalovirus or SV40. Examples of inducible promoters include mouse mammary tumor virus or metallothionein promoters. Those having ordinary skill in the art can readily produce genetic constructs useful for transfecting cells with DNA that encodes FXR1 or FXR2 from readily available starting materials. Such gene constructs are useful for the production of FXR1 or FXR2.

In some embodiments of the invention, transgenic non-human animals are generated. The transgenic animals according to some embodiments of the invention contain SEQ ID NO:3 or SEQ ID NO:5 under the regulatory control of a mammary specific promoter. One having ordinary skill in the art using standard techniques, such as those taught in U.S. Patent No. 4,873,191 issued October 10, 1989 to Wagner and U.S. Patent No. 4,736,866 issued April 12, 1988 to Leder, both of which are incorporated herein by reference, can produce transgenic animals which produce FXR1 or FXR2. Preferred animals are goats and rodents, particularly rats and mice.

In addition to producing these proteins by recombinant techniques, automated peptide synthesizers may also be employed to produce FXR1 and FXR2. Such techniques are well known to those having ordinary skill in the art and are useful if derivatives are desired which have substitutions not provided for in DNA-encoded protein production.

Another aspect of the present invention relates to knockout mice and methods of using the same. In particular, transgenic mice may be generated which are homozygous for a mutated, non-functional FXR1 or FXR2 gene which is introduced into them using well known techniques. The mice produce no functional FXR1 or FXR2 and are useful to study the function of FXR1 or FXR2. Furthermore, the mice may be used in assays to study the effect of test compounds on FXR1 or FXR2 deficiency.

Methods of generating genetically deficient "knock out" mice are well known and disclosed in Capecchi, M.R., Science, 1989, 244, 1288-1292 and Li, P., et al., Cell, 1995, 80, 401-411, which are each incorporated herein by reference. The murine FXR1 or FXR2 genomic clone can be isolated using the homologous human sequences described herein. The genomic clone can be used to prepare an FXR1 or FXR2 targeting construct which can disrupt the FXR1 or FXR2 gene in the mouse by homologous recombination.

The targeting construct contains a non-functioning portion of the FXR1 or FXR2 gene which inserts in place of the functioning portion of the native mouse gene. The non-functioning insert generally contains an insertion in the exon that encodes the active region of FXR1 or FXR2. The targeting construct can contain markers for both positive and negative selection. The positive selection marker allows for the selective elimination of cells without it while the negative selection marker allows for the elimination of cells that carry it.

For example, a first selectable marker is a positive marker that will allow for the survival of cells carrying it. In some embodiments, the first selectable marker is an antibiotic resistance gene, such as the neomycin resistance gene, which can be placed within the coding sequences of the FXR1 or FXR2 gene to render it non-functional while additionally rendering the construct selectable. The antibiotic resistance gene is within the homologous region which can recombine with native sequences. Thus, upon homologous recombination, the non-functional and antibiotic resistance selectable gene sequences will be taken up.
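The logic of combined positive and negative selection may be illustrated by the following minimal sketch. It is a toy model only: the class and function names are hypothetical, and each cell is reduced to two flags indicating whether it carries the positive marker (placed within the homologous region) and the negative marker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    label: str
    has_positive_marker: bool  # e.g. an antibiotic resistance gene
    has_negative_marker: bool  # a marker whose presence is selected against


def survives_selection(cell: Cell) -> bool:
    """A cell survives only if it carries the positive marker and lacks the negative one."""
    return cell.has_positive_marker and not cell.has_negative_marker


cells = [
    Cell("no construct taken up", False, False),
    Cell("entire construct present", True, True),
    Cell("homologous recombinant", True, False),
]

# Only the homologous recombinant passes both selections.
survivors = [cell.label for cell in cells if survives_selection(cell)]
```

In this model, cells without the construct are removed by positive selection, cells carrying both markers are removed by negative selection, and only cells retaining the positive marker alone remain.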

The targeting construct also contains a second selectable marker which is a negative selectable marker. Cells with the negative selectable marker will be eliminated. The second selectable marker is outside the recombination region. Thus, if the entire construct is present in the cell, both markers will be present. If the construct has recombined with native sequences, the first selectable marker will be incorporated into the genome and the second will be lost. The herpes simplex virus thymidine kinase (HSV tk) gene is an example of a negative selectable marker which can be used as a second marker to eliminate cells that carry it. Cells with the HSV tk gene are selectively killed in the presence of ganciclovir. Cells are transfected with targeting constructs and then selected for the presence of the first selection marker and the absence of the second. Clones are then injected into blastocysts, which are implanted into pseudopregnant females. Chimeric offspring which are capable of transferring the recombinant genes in their germline are selected and mated, and their offspring are examined for heterozygous carriers of the recombined genes. Mating of the heterozygous offspring can then be used to generate fully homozygous offspring which are the FXR1-deficient or FXR2-deficient knockout mice.

The present invention provides the means and methodology for accurately identifying individuals who have FMR1 deficiency, also referred to as fragile X syndrome. The discovery of two significantly related proteins, FXR1 and FXR2, and the genes that encode these proteins allows for the more accurate detection of FMR1 protein and FMR1 gene sequences. In particular, reagents may be designed which do not cross react with the related proteins and mRNA sequences, thereby more accurately detecting the presence of FMR1 protein or mRNA.

In addition, the present invention provides the means and methodology for accurately identifying individuals who have FXR1 or FXR2 deficiencies. Reagents may be designed to detect FXR1 protein or mRNA which do not cross react with the FMR1 or FXR2 proteins and mRNA sequences, thereby allowing for more accurate detection of the presence of FXR1 protein or mRNA. Likewise, reagents may be designed to detect FXR2 protein or mRNA which do not cross react with the FMR1 or FXR1 proteins and mRNA sequences, thereby allowing for more accurate detection of the presence of FXR2 protein or mRNA.

According to some embodiments, diagnostic reagents and kits are provided for performing immunoassays to determine the presence or absence of FMR1 protein in a sample from an individual. Antibodies that bind to FMR1 but that do not bind to FXR1 and FXR2 are provided. Kits may additionally include one or more of the following: means for detecting antibodies bound to FMR1 present in a sample, instructions for performing the method, and diagrams or photographs that are representative of how positive and/or negative results appear. In addition, kits may comprise optional positive controls such as FMR1 protein. Further, optional negative controls may be provided.

Antibodies that bind to FMR1 but that do not cross react with FXR1 and FXR2 preferably bind to an epitope in the C-terminal region of FMR1. Figure 1 depicts a comparison of the amino acid sequences of human FMR1, FXR1, and FXR2. It is preferred that antibodies of the invention bind to epitopes not shared between FMR1 and FXR1 or FMR1 and FXR2. Preferably, the antibodies bind to an epitope within the C-terminal 200 amino acids of FMR1. More preferably, the antibodies bind to an epitope within the C-terminal 100 amino acids of FMR1. Most preferably, the antibodies bind to an epitope within the C-terminal 60 amino acids of FMR1. For example, antibodies which bind to amino acids 331 to 375 or 520 to 610 of FMR1 (as found in SEQ ID NO:2) are unlikely to cross-react with either FXR1 or FXR2. One skilled in the art will readily be able to produce antibodies to FMR1 that do not cross-react with either FXR1 or FXR2.
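One way to locate candidate epitopes not shared between related proteins is to scan an alignment for windows in which the target sequence differs from each related sequence. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the inputs are pre-aligned sequences of equal length without gaps in the target, and the window size and identity threshold are arbitrary illustrative parameters. Any sequences supplied in use would be toy or user-provided data; nothing here reproduces the sequences of the sequence listing.

```python
def identity(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length segments match."""
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return matches / len(a)


def divergent_windows(target: str, others: list[str], window: int,
                      max_identity: float) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) windows of the target whose identity
    to every other aligned sequence is at most max_identity."""
    hits = []
    for start in range(len(target) - window + 1):
        segment = target[start:start + window]
        if all(identity(segment, other[start:start + window]) <= max_identity
               for other in others):
            hits.append((start + 1, start + window))
    return hits
```

Windows reported by such a scan are only candidates; whether an antibody raised against a given region actually cross-reacts remains an experimental question.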

Immunoassay methods may be used to identify individuals with fragile X syndrome by detecting the absence or deficiency of FMR1 in a sample of tissue or body fluid using antibodies which bind to FMR1 but which are non-cross reactive to FXR1 and FXR2. The antibodies are preferably monoclonal antibodies. The antibodies are preferably raised against FMR1 made in human cells. The antibodies preferably bind to an epitope on FMR1 which is not present on FXR1 or FXR2. Immunoassays are well known and their design may be routinely undertaken by those having ordinary skill in the art. Those having ordinary skill in the art can produce monoclonal antibodies which specifically bind to FMR1 and which do not bind to FXR1 and FXR2 useful in methods and kits of the invention using standard techniques and readily available starting materials. The techniques for producing monoclonal antibodies are outlined in Harlow, E. and D. Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, which is incorporated herein by reference and provides detailed guidance for the production of hybridomas and monoclonal antibodies which specifically bind to FMR1.

According to some embodiments, immunoassays comprise allowing proteins in the sample to bind a solid phase support such as a plastic surface. Detectable antibodies which selectively bind to FMR1 are then added. Detection of the detectable antibody indicates the presence of FMR1. The detectable antibody may be a labelled or an unlabelled antibody. Unlabelled antibody may be detected using a second, labelled antibody that specifically binds to the first antibody or a second, unlabelled antibody which can be detected using labelled protein A, a protein that complexes with antibodies.

Various immunoassay procedures are described in Immunoassays for the 80's, Voller, et al., Ed., University Park, 1981, which is incorporated herein by reference.

Simple immunoassays may be performed in which a solid phase support is contacted with the test sample. Any proteins present in the test sample bind the solid phase support and can be detected by a specific, detectable antibody preparation. Such a technique is the essence of the dot blot, Western blot and other such similar assays. Other immunoassays may be more complicated but actually provide excellent results. Typical and preferred immunometric assays include "forward" assays for the detection of a protein in which a first anti-protein antibody bound to a solid phase support is contacted with the test sample. After a suitable incubation period, the solid phase support is washed to remove unbound protein. A second, distinct anti-protein antibody is then added which is specific for a portion of the specific protein not recognized by the first antibody. The second antibody is preferably detectable. After a second incubation period to permit the detectable antibody to complex with the specific protein bound to the solid phase support through the first antibody, the solid phase support is washed a second time to remove the unbound detectable antibody. Alternatively, the second antibody may not be detectable. In this case, a third detectable antibody, which binds the second antibody, is added to the system. This type of "forward sandwich" assay may be a simple yes/no assay to determine whether binding has occurred or may be made quantitative by comparing the amount of detectable antibody with that obtained in a control. Such "two-site" or "sandwich" assays are described by Wide, Radioimmune Assay Method, (1970) Kirkham, Ed., E. & S. Livingstone, Edinburgh, pp. 199-206, which is incorporated herein by reference.
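The quantitative form of the sandwich assay described above, in which the detected signal is compared with that obtained in controls, may be sketched as follows. This is a minimal illustration assuming a series of control samples with known protein amounts and piecewise-linear interpolation between them; the function name is hypothetical, and linear interpolation is one simple choice among several curve-fitting approaches.

```python
def interpolate_amount(signal: float,
                       standards: list[tuple[float, float]]) -> float:
    """Estimate the amount of protein from a measured signal.

    standards is a list of (known_amount, measured_signal) pairs from control
    samples; the estimate is a linear interpolation between neighbouring points.
    """
    if len(standards) < 2:
        raise ValueError("at least two standards are required")
    points = sorted(standards, key=lambda p: p[1])  # order by signal
    if not points[0][1] <= signal <= points[-1][1]:
        raise ValueError("signal lies outside the range of the standards")
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    raise ValueError("no bracketing standards found")
```

A signal outside the range of the controls is rejected rather than extrapolated, since the assay response is not assumed to be linear beyond the measured points.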

Other types of immunometric assays are the so-called "simultaneous" and "reverse" assays. A simultaneous assay involves a single incubation step wherein the first antibody bound to the solid phase support, the second, detectable antibody and the test sample are added at the same time. After the incubation is completed, the solid phase support is washed to remove unbound proteins. The presence of detectable antibody associated with the solid support is then determined as it would be in a conventional "forward sandwich" assay. The simultaneous assay may also be adapted in a similar manner for the detection of antibodies in a test sample.

The "reverse" assay comprises the stepwise addition of a solution of detectable antibody to the test sample followed by an incubation period and the addition of antibody bound to a solid phase support after an additional incubation period. The solid phase support is washed in conventional fashion to remove unbound protein/antibody complexes and unreacted detectable antibody. The presence of detectable antibody associated with the solid phase support is then determined as in the "simultaneous" and "forward" assays. The reverse assay may also be adapted in a similar manner for the detection of antibodies in a test sample.

The first component of the immunometric assay may be added to nitrocellulose or other solid phase support which is capable of immobilizing proteins. The first component for determining the presence of FMR1 in a test sample is anti-FMR1 antibody. By "solid phase support" or "support" is intended any material capable of binding proteins. Well-known solid phase supports include glass, polystyrene, polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses, polyacrylamides, agaroses, and magnetite. The nature of the support can be either soluble to some extent or insoluble for the purposes of the present invention. The support configuration may be spherical, as in a bead, or cylindrical, as in the inside surface of a test tube or the external surface of a rod. Alternatively, the surface may be flat such as a sheet, test strip, etc. Those skilled in the art will know many other suitable "solid phase supports" for binding proteins or will be able to ascertain the same by use of routine experimentation. A preferred solid phase support is a 96-well microtiter plate.

To detect the presence of FMR1, detectable anti-FMR1 antibodies are used. Several methods are well known for the detection of antibodies.

One method in which the antibodies can be detectably labelled is by linking the antibodies to an enzyme and subsequently using the antibodies in an enzyme immunoassay (EIA) or enzyme-linked immunosorbent assay (ELISA), such as a capture ELISA. The enzyme, when subsequently exposed to its substrate, reacts with the substrate and generates a chemical moiety which can be detected, for example, by spectrophotometric, fluorometric or visual means. Enzymes which can be used to detectably label antibodies include, but are not limited to, malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and acetylcholinesterase. One skilled in the art would readily recognize other enzymes which may also be used.

Another method in which antibodies can be detectably labelled is through radioactive isotopes and subsequent use in a radioimmunoassay (RIA) (see, for example, Work, et al., Laboratory Techniques and Biochemistry in Molecular Biology, North Holland Publishing Company, N.Y., 1978, which is incorporated herein by reference). The radioactive isotope can be detected by such means as the use of a gamma counter or a scintillation counter or by autoradiography. Isotopes which are particularly useful for the purpose of the present invention are 3H, 125I, 131I, 35S, and 14C. Preferably, 125I is the isotope. One skilled in the art would readily recognize other radioisotopes which may also be used.

It is also possible to label the antibody with a fluorescent compound. When the fluorescent-labelled antibody is exposed to light of the proper wavelength, its presence can be detected due to its fluorescence. Among the most commonly used fluorescent labeling compounds are fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthaldehyde and fluorescamine. One skilled in the art would readily recognize other fluorescent compounds which may also be used.

Antibodies can also be detectably labelled using fluorescence-emitting metals such as 152Eu, or others of the lanthanide series. These metals can be attached to the protein-specific antibody using such metal chelating groups as diethylenetriaminepentaacetic acid (DTPA) or ethylenediaminetetraacetic acid (EDTA). One skilled in the art would readily recognize other fluorescence-emitting metals as well as other metal chelating groups which may also be used.

Antibodies can also be detectably labelled by coupling to a chemiluminescent compound. The presence of the chemiluminescent-labelled antibody is determined by detecting the presence of luminescence that arises during the course of a chemical reaction. Examples of particularly useful chemiluminescent labeling compounds are luminol, isoluminol, theromatic acridinium ester, imidazole, acridinium salt and oxalate ester. One skilled in the art would readily recognize other chemiluminescent compounds which may also be used.

Likewise, a bioluminescent compound may be used to label antibodies. Bioluminescence is a type of chemiluminescence found in biological systems in which a catalytic protein increases the efficiency of the chemiluminescent reaction. The presence of a bioluminescent protein is determined by detecting the presence of luminescence. Important bioluminescent compounds for purposes of labeling are luciferin, luciferase and aequorin. One skilled in the art would readily recognize other bioluminescent compounds which may also be used.

Detection of the protein-specific antibody, fragment or derivative may be accomplished by a scintillation counter if, for example, the detectable label is a radioactive gamma emitter. Alternatively, detection may be accomplished by a fluorometer if, for example, the label is a fluorescent material. In the case of an enzyme label, the detection can be accomplished by colorimetric methods which employ a substrate for the enzyme. Detection may also be accomplished by visual comparison of the extent of enzymatic reaction of a substrate in comparison with similarly prepared standards. One skilled in the art would readily recognize other appropriate methods of detection which may also be used.

The binding activity of a given lot of antibodies may be determined according to well known methods. Those skilled in the art will be able to determine operative and optimal assay conditions for each determination by employing routine experimentation.

Positive and negative controls may be performed in which known amounts of FMR1 and no FMR1, respectively, are added to assays being performed in parallel with the test assay. One skilled in the art would have the necessary knowledge to perform the appropriate controls.
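The use of parallel positive and negative controls to interpret a test result may be sketched as follows. This is an illustration only: the function names are hypothetical, and the cutoff rule (mean of the negative controls plus three standard deviations) is an assumed convention, not a criterion stated herein.

```python
from statistics import mean, stdev


def detection_cutoff(negative_signals: list[float], n_sd: float = 3.0) -> float:
    """Assumed cutoff: mean of negative-control signals plus n_sd standard deviations."""
    return mean(negative_signals) + n_sd * stdev(negative_signals)


def interpret(sample_signal: float, negative_signals: list[float],
              positive_signals: list[float], n_sd: float = 3.0) -> str:
    """Classify a test sample relative to controls run in parallel."""
    cutoff = detection_cutoff(negative_signals, n_sd)
    # If a positive control does not exceed the cutoff, the run is not interpretable.
    if min(positive_signals) <= cutoff:
        return "invalid run"
    return "detected" if sample_signal > cutoff else "not detected"
```

At least two negative-control measurements are needed for the standard deviation to be defined; the number of replicates and the cutoff rule would be set by the practitioner.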

FMR1 may be produced as a reagent for positive controls routinely. One skilled in the art would appreciate the different manners in which the FMR1 may be produced and isolated.

An "antibody composition" refers to the antibody or antibodies required for the detection of the protein. For example, the antibody composition used for the detection of FMR1 in a test sample comprises a first antibody which binds FMR1, but not FXR1 or FXR2, as well as a second or third detectable antibody that binds the first or second antibody, respectively.

To examine a test sample for the presence or absence of FMR1, a standard immunometric assay such as the one described herein may be performed. A first anti-FMR1 antibody, which recognizes a specific portion of FMR1 but not FXR1 or FXR2, is added to a 96-well microtiter plate in a volume of buffer. The plate is incubated for a period of time sufficient for binding to occur and subsequently washed with PBS to remove unbound antibody. The plate is then blocked with a PBS/BSA solution to prevent sample proteins from non-specifically binding the microtiter plate. Test samples are subsequently added to the wells and the plate is incubated for a period of time sufficient for binding to occur. The wells are washed with PBS to remove unbound protein. Labelled anti-FMR1 antibodies, which recognize portions of FMR1 not recognized by the first antibody, are added to the wells. The plate is incubated for a period of time sufficient for binding to occur and subsequently washed with PBS to remove unbound, labelled anti-FMR1 antibody. The amount of labelled and bound anti-FMR1 antibody is subsequently determined by standard techniques.

Kits which are useful for the detection of FMR1 in a test sample comprise a container comprising anti-FMR1 antibodies and a container or containers comprising controls. Controls include one control sample which does not contain FMR1 and/or another control sample which contains FMR1. The anti-FMR1 antibodies used in the kit are detectable, such as being detectably labelled. If the detectable anti-FMR1 antibody is not labelled, it may be detected by second antibodies or protein A, for example, which may also be provided in some kits in separate containers. Additional components in some kits include solid support, buffer, and instructions for carrying out the assay.

The immunoassay is useful for detecting FMR1 in homogenized tissue samples and body fluid samples including the plasma portion or cells in the fluid sample. Western blots may be used in methods of identifying individuals suffering from fragile X syndrome by detecting the presence of FMR1 in samples of tissue, such as, for example, brain and testes. Western blots use detectable anti-FMR1 antibodies to bind to any FMR1 present in a sample and thus indicate the presence of the protein in the sample.

Western blot techniques, which are described in Sambrook, J. et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, which is incorporated herein by reference, are similar to immunoassays with the essential difference being that prior to exposing the sample to the antibodies, the proteins in the samples are separated by gel electrophoresis and the separated proteins are then probed with antibodies. In some preferred embodiments, the matrix is an SDS-PAGE gel matrix and the separated proteins in the matrix are transferred to a carrier such as filter paper prior to probing with antibodies. Anti-FMR1 antibodies described above are useful in Western blot methods.

Generally, samples are homogenized and cells are lysed using detergent such as Triton-X. The material is then separated by the standard techniques in Sambrook, J. et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

Kits which are useful for the detection of FMR1 in a test sample by Western blot comprise a container comprising anti-FMR1 antibodies and a container or containers comprising controls. Controls include one control sample which does not contain FMR1 and/or another control sample which contains FMR1. The anti-FMR1 antibodies used in the kit are detectable, such as being detectably labelled. If the detectable anti-FMR1 antibody is not labelled, it may be detected by second antibodies or protein A, for example, which may also be provided in some kits in separate containers. Additional components in some kits include instructions for carrying out the assay. The antibodies of the kit preferably bind to an epitope on the extracellular domain of FMR1. The means to detect anti-FMR1 antibodies that are bound to FMR1 include the immunoassays described above.

Aspects of the present invention also include various methods of determining whether a sample contains cells that express FMR1 by sequence-based molecular analysis. Several different methods are available for doing so, including those using Polymerase Chain Reaction (PCR) technology, Northern blot technology, oligonucleotide hybridization technology, and in situ hybridization technology. According to the invention, samples are screened to determine the presence or absence of mRNA that encodes FMR1. In particular, detection of FMR1 mRNA is performed using primers or probes which do not cross react with FXR1 mRNA or FXR2 mRNA.

The invention relates to oligonucleotide probes and primers used in the methods of identifying mRNA that encodes FMR1 and to diagnostic kits which comprise such components. The mRNA sequence-based methods for determining whether a sample contains mRNA encoding FMR1 include but are not limited to PCR technology, Northern and Southern blot technology, in situ hybridization technology and oligonucleotide hybridization technology.

Primers and probes that detect FMR1 mRNA but that do not cross react with FXR1 mRNA and FXR2 mRNA preferably hybridize to nucleotide sequences that encode the C-terminal region of FMR1. It is preferred that the FMR1 specific probes or primers hybridize to nucleotide sequences that encode all or a part of the C-terminal 200 amino acids of FMR1. More preferably, the FMR1 specific probes or primers hybridize to nucleotide sequences that encode all or a part of the C-terminal 100 amino acids of FMR1. Most preferably, the FMR1 specific probes or primers hybridize to nucleotide sequences that encode all or a part of the C-terminal 60 amino acids of FMR1. One skilled in the art will readily be able to design primers and probes that hybridize to FMR1 mRNA sequences and that do not cross-react with either FXR1 mRNA sequences or FXR2 mRNA sequences.
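The nucleotide positions that encode a given number of C-terminal amino acids follow from simple arithmetic at three nucleotides per residue. The following is a minimal sketch with a hypothetical function name; it assumes the coding sequence is supplied with its start codon at position 1 and includes a terminal stop codon, and any length used with it would be an illustrative value rather than the length of a sequence of the sequence listing.

```python
def c_terminal_coding_region(cds_length_nt: int, n_residues: int) -> tuple[int, int]:
    """Return 1-based inclusive nucleotide coordinates, within a coding sequence
    that ends in a stop codon, of the region encoding the last n_residues amino acids."""
    if cds_length_nt % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_encoded = cds_length_nt // 3 - 1  # codons excluding the stop codon
    if not 1 <= n_residues <= n_encoded:
        raise ValueError("n_residues must be between 1 and the protein length")
    end = cds_length_nt - 3             # last nucleotide before the stop codon
    start = end - 3 * n_residues + 1
    return start, end
```

For a hypothetical 1803-nucleotide coding sequence (600 residues plus a stop codon), the C-terminal 60 amino acids would be encoded by the 180 nucleotides immediately preceding the stop codon.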

The methods described herein are meant to exemplify how the present invention may be practiced and are not meant to limit the scope of the invention. It is contemplated that other sequence-based methodology for detecting the presence of specific mRNA that encodes FMR1 in tissue samples may be employed according to the invention.

A preferred method for detecting mRNA that encodes FMR1 in genetic material derived from tissue samples uses PCR technology. PCR technology is practiced routinely by those having ordinary skill in the art and its uses in diagnostics are well known and accepted. Methods for practicing PCR technology are disclosed in "PCR Protocols: A Guide to Methods and Applications", Innis, M.A., et al., Eds., Academic Press, Inc., San Diego, CA (1990), which is incorporated herein by reference. Applications of PCR technology are disclosed in "Polymerase Chain Reaction", Erlich, H.A., et al., Eds., Cold Spring Harbor Press, Cold Spring Harbor, New York (1989), which is incorporated herein by reference. U.S. Patent Number 4,683,202, U.S. Patent Number 4,683,195, U.S. Patent Number 4,965,188 and U.S. Patent Number 5,075,216, which are each incorporated herein by reference, describe methods of performing PCR. PCR may be routinely practiced using Perkin Elmer Cetus GENEAMP RNA PCR kit, Part No. N808-0017.

To perform this method, RNA is extracted from cells in a sample and tested or used to make cDNA using well known methods and readily available starting materials. The mRNA or cDNA is combined with the FMR1 specific primers, free nucleotides and enzyme following standard PCR protocols. The mixture undergoes a series of temperature changes. If the mRNA or cDNA encoding FMR1 is present, that is, if both primers hybridize to sequences on the same molecule, the molecule comprising the primers and the intervening complementary sequences will be exponentially amplified. The amplified DNA can be easily detected by a variety of well known means. If the FMR1 encoding mRNA is not present, no DNA molecule will be exponentially amplified. The PCR technology therefore provides an extremely easy, straightforward and reliable method of detecting mRNA encoding FMR1 protein in a sample.

PCR primers can be designed routinely by those having ordinary skill in the art using well known cDNA sequence information. Primers are generally 8-50 nucleotides, preferably 18-28 nucleotides. A set of primers contains two primers. When performing PCR on extracted mRNA or cDNA generated therefrom, if the mRNA or cDNA encoding FMR1 protein is present, multiple copies of the mRNA or cDNA will be made. If it is not present, PCR will not generate a discrete detectable product.
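The length ranges stated above can be expressed as a simple check. The following Python sketch is illustrative only: the function names and the three category labels are assumptions made for the example, and the sequence used is the FXR1 forward primer of SEQ ID NO:7 given in Example 1A, not an FMR1 specific primer.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def classify_primer_length(length):
    """Classify a primer length against the ranges stated in the text."""
    if 18 <= length <= 28:
        return "preferred"       # 18-28 nucleotides
    if 8 <= length <= 50:
        return "acceptable"      # 8-50 nucleotides
    return "out of range"


forward = "GATGACATTTCTAAGCTACAGC"   # SEQ ID NO:7, taken from Example 1A
print(len(forward), classify_primer_length(len(forward)))
print(reverse_complement(forward))   # strand to which the primer anneals
```

Length is only one design criterion; specificity against FXR1 and FXR2 still has to be assessed separately.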

PCR product, i.e. amplified DNA, may be detected by several well known means. The preferred method for detecting the presence of amplified DNA is to separate the PCR reaction material by gel electrophoresis and stain the gel with ethidium bromide in order to visualize the amplified DNA if present. A size standard of the expected size of the amplified DNA is preferably run on the gel as a control.

In some instances, such as when unusually small amounts of RNA are recovered and only small amounts of cDNA are generated therefrom, it is desirable or necessary to perform a PCR reaction on the first PCR reaction product. That is, if difficult to detect quantities of amplified DNA are produced by the first reaction, a second PCR can be performed to make multiple copies of DNA sequences of the first amplified DNA. A nested set of primers is used in the second PCR reaction. The nested set of primers hybridizes to sequences downstream of the 5' primer and upstream of the 3' primer used in the first reaction.
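The positional requirement for a nested primer set can be sketched as follows. This Python example is illustrative only; the `PrimerPair` type, the `is_nested` function and all coordinates are hypothetical, and each primer is represented by the 1-based start and end positions it covers on the sense strand of the template.

```python
from typing import NamedTuple


class PrimerPair(NamedTuple):
    """Sense-strand coordinates covered by the forward (5') and reverse (3') primers."""
    fwd_start: int
    fwd_end: int
    rev_start: int
    rev_end: int


def is_nested(outer, inner):
    """True if the inner pair lies downstream of the outer 5' primer
    and upstream of the outer 3' primer."""
    return (
        inner.fwd_start > outer.fwd_end        # downstream of the first 5' primer
        and inner.rev_end < outer.rev_start    # upstream of the first 3' primer
        and inner.fwd_end < inner.rev_start    # inner primers do not overlap each other
    )


first_round = PrimerPair(100, 121, 480, 501)     # hypothetical first-reaction primers
second_round = PrimerPair(150, 171, 400, 421)    # hypothetical nested primers
print(is_nested(first_round, second_round))
```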

The present invention includes oligonucleotides which are useful as primers for performing PCR methods to amplify mRNA or cDNA that encodes FMR1 protein. According to the invention, diagnostic kits can be assembled which are useful to practice methods of detecting the presence of mRNA or cDNA that encodes FMR1 in tissue samples. Such diagnostic kits comprise oligonucleotides which are useful as primers for performing PCR methods. It is preferred that diagnostic kits according to the present invention comprise a container comprising a size marker to be run as a standard on a gel used to detect the presence of amplified DNA. The size marker is the same size as the DNA generated by the primers in the presence of the mRNA or cDNA encoding FMR1.

Another method of determining whether a sample contains cells expressing FMR1 is by Northern blot analysis of mRNA extracted from a tissue sample. The techniques for performing Northern blot analyses are well known by those having ordinary skill in the art and are described in Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York. mRNA extraction, electrophoretic separation of the mRNA, blotting, probe preparation and hybridization are all well known techniques that can be routinely performed using readily available starting material.

One having ordinary skill in the art, performing routine techniques, could design probes to identify mRNA encoding FMR1 using the information in SEQ ID NO:1. Such probes preferentially do not cross-react with FXR1 or FXR2. The mRNA is extracted using poly dT columns and the material is separated by electrophoresis and, for example, transferred to nitrocellulose paper. Labelled probes made from an isolated specific fragment or fragments can be used to visualize the presence of a complementary fragment fixed to the paper. According to the invention, diagnostic kits can be assembled which are useful to practice methods of detecting the presence of mRNA that encodes FMR1 in tissue samples by Northern blot analysis. Such diagnostic kits comprise oligonucleotides which are useful as probes for hybridizing to the mRNA. The probes may be radiolabelled. It is preferred that diagnostic kits according to the present invention comprise a container comprising a size marker to be run as a standard on a gel. It is preferred that diagnostic kits according to the present invention comprise a container comprising a positive control which will hybridize to the probe.

Another method of detecting the presence of mRNA encoding FMR1 protein is by oligonucleotide hybridization technology. Oligonucleotide hybridization technology is well known to those having ordinary skill in the art. Briefly, detectable probes are used which contain a specific nucleotide sequence that will hybridize to the nucleotide sequence of mRNA encoding FMR1 protein. RNA or cDNA made from RNA from a sample is fixed, usually to filter paper or the like. The probes are added and maintained under conditions that permit hybridization only if the probes fully complement the fixed genetic material. The conditions are sufficiently stringent to wash off probes in which only a portion of the probe hybridizes to the fixed material. Detection of the probe on the washed filter indicates complementary sequences. One having ordinary skill in the art, using the sequence information disclosed in SEQ ID NO:1, can design probes which are fully complementary to mRNA sequences but not genomic DNA sequences. Such probes preferentially do not cross-react with FXR1 or FXR2. Hybridization conditions can be routinely optimized to minimize background signal by non-fully complementary hybridization. The present invention includes labelled oligonucleotides which are useful as probes for performing oligonucleotide hybridization. That is, they are fully complementary with mRNA sequences but not genomic sequences. For example, the mRNA sequence includes portions encoded by different exons. The labelled probes of the present invention are labelled with radiolabelled nucleotides or are otherwise detectable by readily available nonradioactive detection systems.

According to the invention, diagnostic kits can be assembled which are useful to practice oligonucleotide hybridization methods of the invention. Such diagnostic kits comprise a labelled oligonucleotide which encodes portions of FMR1 encoded by different exons. It is preferred that labelled probes of the oligonucleotide diagnostic kits according to the present invention are labelled with a radionucleotide. The oligonucleotide hybridization-based diagnostic kits according to the invention preferably comprise DNA samples that represent positive and negative controls. A positive control DNA sample is one that comprises a nucleic acid molecule which has a nucleotide sequence that is fully complementary to the probes of the kit such that the probes will hybridize to the molecule under assay conditions. A negative control DNA sample is one that comprises at least one nucleic acid molecule, the nucleotide sequence of which is partially complementary to the sequences of the probe of the kit. Under assay conditions, the probe will not hybridize to the negative control DNA sample.
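The reason a probe spanning portions encoded by different exons is fully complementary to mRNA (or cDNA) but not to genomic DNA can be shown with a toy model. In the Python sketch below, every sequence is invented for the illustration and does not come from FMR1, and stringent hybridization is approximated, as a simplifying assumption, by requiring a perfect full-length match.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def hybridizes(probe, target):
    """Model stringent conditions: the probe binds only if it is fully complementary."""
    return reverse_complement(probe) in target


# Invented toy sequences (not FMR1): two exons separated by one intron.
exon1 = "ATGGCTGAAGTT"
intron = "GTAAGTCTTTTCAG"
exon2 = "CCAGATTTGACC"

cdna = exon1 + exon2               # spliced message: the exons are contiguous
genomic = exon1 + intron + exon2   # unspliced gene: the intron interrupts the junction

# Probe complementary to 8 nt on each side of the exon-exon junction.
junction_probe = reverse_complement(exon1[-8:] + exon2[:8])

print(hybridizes(junction_probe, cdna))      # junction present in the spliced sequence
print(hybridizes(junction_probe, genomic))   # junction interrupted by the intron
```

In this model the genomic sequence behaves like the negative control described above: it is partially complementary to the probe, but never over the probe's full length.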

Another aspect of the invention relates to methods of analyzing tissue samples which are fixed sections routinely prepared by surgical pathologists to characterize and evaluate cells. In some embodiments, the cells are from brain tissue or testicular tissue and are analyzed to determine and evaluate the extent of FMR1 expression.

The present invention relates to in vitro kits for evaluating tissue samples to determine the level of FMR1 expression and to reagents and compositions useful to practice the same. The tissue is analyzed to identify the presence or absence of the FMR1 protein. Techniques such as FMR1/anti-FMR1 binding assays and immunohistochemistry assays may be performed to determine whether FMR1 is absent in cells in the tissue sample, which is indicative of fragile X syndrome.

Alternatively, in some embodiments of the invention, tissue samples are analyzed to identify whether FMR1 protein is being expressed in cells in the tissue sample, which indicates a lack of fragile X syndrome. The presence of mRNA that encodes the FMR1 protein or cDNA generated therefrom can be determined using techniques such as in situ hybridization, immunohistochemistry and in situ FMR1 binding assay.

In situ hybridization technology is well known by those having ordinary skill in the art. Briefly, cells are fixed and detectable probes which contain a specific nucleotide sequence are added to the fixed cells. If the cells contain complementary nucleotide sequences, the probes, which can be detected, will hybridize to them. One having ordinary skill in the art, using the sequence information in SEQ ID NO:1, can design probes useful in in situ hybridization technology to identify cells that express FMR1. Such probes are preferentially FMR1 specific probes, i.e. probes that do not cross-react with, i.e. hybridize to, FXR1-encoding nucleic acid molecules or FXR2-encoding nucleic acid molecules.

The probes are fully complementary and do not hybridize well to partially complementary sequences. For in situ hybridization according to the invention, it is preferred that the probes are detectable by fluorescence. A common procedure is to label the probe with biotin-modified nucleotide and then detect it with fluorescently-tagged avidin. Hence, the probe does not itself have to be labelled with a fluorescent marker but can be subsequently detected with a fluorescent marker.

Cells are fixed and the probes are added to the genetic material. Probes will hybridize to the complementary nucleic acid sequences present in the sample. Using a fluorescent microscope, the probes can be visualized by their fluorescent markers.

According to the invention, diagnostic kits can be assembled which are useful to practice in situ hybridization methods of the invention. The probes of such kits are fully complementary with mRNA sequences but not genomic sequences. For example, the mRNA sequence includes portions encoded by different exons. It is preferred that labelled probes of the in situ diagnostic kits according to the present invention are labelled with a fluorescent marker.

Immunohistochemistry techniques may be used to identify and essentially stain cells with FMR1. Anti-FMR1 antibodies, such as those described above, are contacted with fixed cells and the FMR1 present in the cells reacts with the antibodies. The antibodies are detectably labelled or detected using labelled second antibody or protein A to stain the cells.

FMR1 binding assays may be performed instead of immunohistochemistry except that the cell section is first frozen, then the FMR1 binding assay is performed and then the cells are fixed.

EXAMPLES

Example 1A: Materials and methods

1. Isolation of cDNA Clones and DNA Sequencing

10 s plaques of λZAPII Xenopus laevis ovary cDNA library were screened to obtain FMR1 and FXR1 cDNAs using the human FMR1 full length cDNA as a probe. The Xenopus FXR1 cDNA was used as a probe to screen a λgt11 cDNA library to obtain the human FXR1 cDNA. The probe was made using the FXR1-specific region to avoid isolating other FMR1-like clones. Since no initial clones contained the entire open reading frame encoding FXR1, the same library was rescreened using one of the cDNAs encoding a more amino terminal region of FXR1 as a probe. A composite transcript was determined from the overlapping clones. In vivo excision was done to create pXF43 and pXF45 for X. laevis FMR1 and FXR1 respectively. Phage inserts from positive clones of human FXR1 were amplified by PCR using the λgt11 phage DNA arms according to conditions suggested by the manufacturer and were cloned into pCRII vector (Invitrogen). To create a plasmid containing full length cDNA of human FXR1, part of the inserts were subcloned into pGEM7Z vector (Promega Biotech). The nucleotide sequences of all inserts were verified by DNA sequencing. Sanger, et al., Proc. Natl. Acad. Sci. USA, 1977, 74, 5463-5467. All plasmids and DNA fragments were manipulated using standard techniques. Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

2. In Vitro Transcription and Translation

The plasmids pXF43 for X. laevis FMR1, pXF45 for X. laevis FXR1 and pHHSI-F27X for human FMR1 (Siomi, et al., Cell, 1993, 74, 291-298) were linearized at appropriate restriction sites to generate templates for in vitro RNA synthesis with T3 RNA polymerase for X. laevis FMR1 and FXR1 and T7 RNA polymerase for human FMR1. The resultant RNAs were translated in rabbit reticulocyte lysate in the presence of (35S)methionine (Amersham) according to the conditions of the manufacturer (Promega Biotech).

3. RNA Binding Assays

Binding of in vitro produced proteins to ribohomopolymers was carried out essentially as described in Siomi, et al., Cell, 1993, 74, 291-298. In brief, ribonucleotide homopolymer (Sigma or Pharmacia) binding reactions were carried out with an equivalent of 100,000 cpm of trichloroacetic acid-precipitable protein in a total of 0.5 ml of binding buffer (10 mM Tris-HCl (pH 7.4), 2.5 mM MgCl2, 0.5% Triton X-100, 2 mg/ml pepstatin, 2 mg/ml leupeptin, 0.5% aprotinin) with 250 mM NaCl concentration for 10 minutes on a rocking platform at 4°C. The beads were pelleted with a brief spin in a microfuge and washed five times with binding buffer prior to resuspension in 30 μl of SDS-PAGE loading buffer. Bound protein was eluted from the nucleic acid by boiling, resolved on an SDS-polyacrylamide (12.5%) gel, and visualized by fluorography.

4. Production of antisera against X. laevis FXR1

To raise antisera specifically to X. laevis FXR1, pXF45 was digested with BamHI and the 600 bp fragment encoding just the carboxy terminal region of X. laevis FXR1 was inserted into pET15b to create the expression vector pEXFXR1. pEXFXR1 was introduced into BL21(DE3) bacteria and the His-FXR1 peptides were induced with isopropyl-β-thiogalactopyranoside as described. Studier, et al., Methods Enzym., 1990, 185, 60-89; and Rosenberg, et al., Gene, 1987, 56, 125-135. For purification of the fusion peptide, bacterial sonicates were applied to a 2 ml His-Bind resin (Novagen) column, washed and eluted as described by the manufacturer. Antisera were raised in BALB/c mice injected with the purified recombinant His-FXR1 fusion peptides produced in E. coli. The anti-human FMR1 antibodies were described previously. Siomi, et al., Cell, 1993, 74, 291-298.

5. Western Blot Analysis

The lymphoblastoid cell lines used in this work correspond to FX24 (normal sibling of FX25) and FX25 (a patient with fragile X syndrome) described in Siomi, et al., Cell, 1993, 74, 291-298. Cells were grown to subconfluence, lysed in SDS-PAGE sample buffer, sonicated and then heated at 95°C for 5 minutes. Proteins were resolved on an SDS-polyacrylamide (10%) gel and transferred to nitrocellulose. Filters were incubated in blotting solution (phosphate-buffered saline, 5% nonfat milk) for at least 30 minutes at room temperature and then incubated with primary antibody diluted at 1:400 for 1 hr at room temperature. Filters were washed three times in PBS, 0.05% Tween-20, and bound antibody was detected using the peroxidase-conjugated goat anti-mouse IgG + IgM (Jackson ImmunoResearch Laboratories). The protein bands were visualized by ECL (Amersham) after washing three times in PBS, 0.05% Tween-20.

6. Immunofluorescence microscopy

HeLa cells were grown on cover glasses to subconfluence, fixed with 2% formaldehyde in PBS and permeabilized with cold acetone. After washing with cold PBS, cells were incubated with polysera for either X. laevis FXR1 or human FMR1 diluted at 1:400 with 3% BSA in PBS for 1 hr at room temperature, followed by extensive washing with cold PBS. Fluorescein isothiocyanate-conjugated anti-mouse F(ab')2 secondary antibody (Cappel Laboratories) was diluted 1:500 with 3% BSA in PBS, applied to the cells and incubated for 1 hr at room temperature. The localization of FMR1 and FXR1 gene products in HeLa cells was detected and pictures were taken under the microscope.

7. Chromosome mapping of human FXR1

Somatic cell hybrid panel #2 was purchased from the Coriell Institute Cell Repository. This panel consists of DNA isolated from 24 human/rodent somatic cell hybrids. All but two of the hybrids retain a single intact chromosome. Primers were designed to generate a PCR product of 161 bp from a portion of the carboxy terminal end of the FXR1 open reading frame derived from a cDNA clone isolated from a HeLa cDNA library. The primer sequences are forward 5'-GATGACATTTCTAAGCTACAGC-3' (1777-1798) (SEQ ID NO:7) and reverse 5'-TGTACAAGCACTATTGTAAATG-3' (1916-1937) (SEQ ID NO:8). The numbers in parentheses after the primers above are based on the numbering in SEQ ID NO:3. PCR reactions were performed according to conditions suggested by the manufacturer (Perkin-Elmer Cetus) and analyzed by 6% PAGE.
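The stated product size follows directly from the primer coordinates. The short Python check below is illustrative only; the function name is an assumption, while the sequences and coordinates are those given above for SEQ ID NO:7 and SEQ ID NO:8.

```python
def amplicon_size(forward_span, reverse_span):
    """Length of the PCR product bounded by two primers, using 1-based
    inclusive coordinates on the same numbering (here SEQ ID NO:3)."""
    return reverse_span[1] - forward_span[0] + 1


forward_seq = "GATGACATTTCTAAGCTACAGC"   # SEQ ID NO:7
reverse_seq = "TGTACAAGCACTATTGTAAATG"   # SEQ ID NO:8
forward_span = (1777, 1798)
reverse_span = (1916, 1937)

# Each primer length should agree with the span it is reported to cover.
print(len(forward_seq), forward_span[1] - forward_span[0] + 1)
print(len(reverse_seq), reverse_span[1] - reverse_span[0] + 1)
print(amplicon_size(forward_span, reverse_span))
```

Both primers are 22 nucleotides long, matching their reported spans, and the outer coordinates give a product of 161 bp.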

8. Reverse transcription and PCR

Oligo(dT)-selected RNAs from human brain, testis, kidney, and heart were purchased from Clontech. Oligo(dT)-selected RNAs from HeLa cells and the lymphoblastoid cells (FX24 and FX25 (Siomi, et al., Cell, 1993, 74, 291-298)) were manually prepared using a Dynabeads mRNA direct kit (Dynal). RNA (100 ng) was reverse transcribed using the oligo(dT) primer according to conditions suggested by the manufacturer (Stratagene). PCR reactions were done on 5 μl of each cDNA solution with primers that bind specifically to FMR1 and FXR1, namely 27XM7 and 27X31 for FMR1 (Siomi, et al., Cell, 1993, 74, 291-298) and XF-E and XF-B1 (XF-E, 1225-1245; XF-B1, 1906-1930; the coordinates are based on the numbering used in SEQ ID NO:3) for FXR1. The samples were resolved on an agarose gel and visualized with EtBr.

Example 1B

Fragile X Mental Retardation Syndrome, the most common cause of hereditary mental retardation, is directly associated with the FMR1 gene at Xq27.3. FMR1 encodes an RNA-binding protein and the syndrome results from lack of expression of FMR1 or expression of a mutant protein that is impaired in RNA-binding. A novel gene was discovered, FXR1, that is highly homologous to FMR1 and located on chromosome 12 at 12q13. FXR1 encodes a protein which, like FMR1, contains two KH domains and is highly conserved in vertebrates. The 3' untranslated regions (3' UTRs) of the human and Xenopus laevis FXR1 mRNAs are strikingly conserved (~90% identity), suggesting conservation of an important function. The KH domains of FXR1 and FMR1 are almost identical, and the two proteins have similar RNA-binding properties in vitro. However, FXR1 and FMR1 have very different carboxyl termini. FXR1 and FMR1 are expressed in many tissues and both proteins, which are cytoplasmic, can be expressed in the same cells. Interestingly, cells from a fragile X patient that do not have any detectable FMR1 express normal levels of FXR1. These findings demonstrate that FMR1 and FXR1 are members of a gene family and suggest a biological role for FXR1 that is related to that of FMR1.

The human FMR1 cDNA was used to screen a Xenopus laevis ovary cDNA library by hybridization, and a clone that is a highly related homologue, designated FXR1 (for FMR1 unreacting relative), was isolated. The X. laevis FXR1 cDNA was used in hybridization screening to isolate the human FXR1 cDNA from a HeLa cell library. The nucleotide sequence and predicted amino acid sequence of the human FXR1 are shown in SEQ ID NO:3.

FXR1 has 86% amino acid sequence identity to FMR1 in the region containing the KH domains (Siomi, et al., Nuc. Acids Res., 1993, 21, 1193-1198; Siomi, et al., Cell, 1993, 74, 291-298; Siomi, et al., Cell, 1994, 77, 33-39; Gibson, et al., FEBS Lett., 1993, 324, 361-366; and Burd, et al., Science, 1994, 265, 615-621) and is very similar to FMR1 over the amino terminal domain (70% identity), but FXR1 and FMR1 have entirely different carboxyl domains (6% identity). The carboxyl portion of FXR1, beginning after the region containing the RGG box (Kiledjian, et al., EMBO J., 1992, 11, 2655-2664; Siomi, et al., Cell, 1993, 74, 291-298; and Burd, et al., Science, 1994, 265, 615-621) has several intriguing features. There is a nine amino acid sequence RRRRSRRRR (SEQ ID NO:11) beginning at amino acid 502 of human FXR1. This arginine-rich sequence is similar to arginine-rich motifs that constitute the RNA-binding element of several proteins including HIV Rev and Tat (reviewed in Burd, et al., Science, 1994, 265, 615-621). Computer database searches also picked up similar arginine/serine-rich sequences in several RNA-binding proteins, including snRNP U1 70 kDa protein and the D. melanogaster splicing regulator tra. In addition, the last four amino acids of both FMR1 and FXR1 contain the tripeptide NGV. The human FMR1 clone contained a sequence (amino acids 331-396; numbering according to Verkerk, et al., Cell, 1991, 65, 905-914) that was not found in the X. laevis FMR1 or in the human and X. laevis FXR1. This segment (amino acids 331-396) corresponds to exons 11 and 12 of the human FMR1 gene. Eichler, et al., Hum. Mol. Genet., 1993, 2, 1147-1153. The absence of these exons from FXR1 and from the X. laevis FMR1 cDNA is probably the result of alternative splicing. Also isolated from a HeLa library was a cDNA for a shorter form of FXR1. The shorter form diverges from the longer form beginning with amino acid 535, and contains instead the sequence GKRCD (SEQ ID NO:12) as its carboxyl terminus.

The longest FXR1 cDNA that was isolated from the library contained only 12 nucleotides upstream of the putative initiation codon for the human mRNA, and 99 nucleotides of upstream sequence for the X. laevis mRNA. This is insufficient to establish whether or not there are CGG (or other) repeats in the 5' UTR of FXR1 mRNAs, as there are in the FMR1 mRNAs. However, there is a striking sequence feature in the 3' UTRs of FXR1 mRNAs. The 3' UTRs of the human and X. laevis mRNAs are about 90% identical over the 238 nucleotides (88% overall; 93% excluding the small gaps). This is a higher degree of sequence identity than is found for the coding regions of the human and the X. laevis FXR1 mRNAs, and it strongly suggests conservation of an important function for the FXR1 3' UTR. By comparison, the 3' UTRs of the human and the X. laevis FMR1 mRNAs are only 42% identical over 280 nucleotides, in which a 73-nucleotide region is 73% identical (nucleotides 2020-2092; the coordinates are based on the numbering used in Verkerk, et al., Cell, 1991, 65, 905-914). Out of the 12 nucleotides of the human 5' UTR sequence available so far, 11 nucleotides are identical between the human and the X. laevis FXR1 mRNAs.
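The difference between an identity figure computed overall and one computed excluding gaps can be illustrated with a small calculation. The Python sketch below is illustrative only: the function name and the two aligned toy sequences are invented for the example, and the exact procedure used to obtain the 88% and 93% values above is not stated, so this shows only one plausible way of computing the two kinds of figure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (overall, excluding_gaps) percent identity for two aligned
    sequences of equal length, where gaps are written as '-'."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = list(zip(aligned_a, aligned_b))
    ungapped = [(a, b) for a, b in columns if a != gap and b != gap]
    matches = sum(1 for a, b in ungapped if a == b)
    overall = 100.0 * matches / len(columns)            # gap columns count as mismatches
    excluding_gaps = 100.0 * matches / len(ungapped)    # gap columns are ignored
    return overall, excluding_gaps


# Invented toy alignment: 10 columns, one gap, one mismatch.
seq_a = "ACGTACGT-A"
seq_b = "ACGTTCGTAA"
overall, excluding_gaps = percent_identity(seq_a, seq_b)
print(round(overall, 1), round(excluding_gaps, 1))
```

Excluding gap columns always gives a figure at least as high as the overall one, which is consistent with the ordering of the two values reported for the FXR1 3' UTRs.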

The deduced amino acid sequence of X. laevis FXR1 predicted a protein of molecular mass 73 kD. Consistent with this, in vitro transcription and translation of the FXR1 cDNA produced a protein that migrated by SDS-PAGE at this apparent molecular mass. FMR1 binds RNA in vitro. Siomi, et al., Cell, 1993, 74, 291-298; and Ashley, et al., Science, 1993, 262, 563-566. Since FXR1, like FMR1, contains sequence motifs characteristic of RNA-binding proteins, including KH domains, an RGG box and possibly also an arginine-rich motif, the binding of RNA by FXR1 in vitro was also examined. This property was determined by an immobilized ribohomopolymer binding assay. Briefly, human FMR1 (HFMR1), X. laevis FMR1 (XFMR1), and X. laevis FXR1 (XFXR1) were produced by in vitro transcription-translation of pHHSI-F27X (Siomi, et al., Cell, 1993, 74, 291-298), pXF43 and pXF45 truncated outside the coding regions. The in vitro transcribed RNA was translated in reticulocyte lysate in the presence of (35S)methionine. An amount equivalent to 20% of the material used for each binding reaction is shown in the lanes marked "total". In vitro produced proteins were bound to 30 μl of the indicated ribonucleotide homopolymers at 250 mM NaCl and analyzed by SDS-PAGE as described. Swanson, et al., Mol. Cell. Biol., 1988, 8, 2237-2241; and Siomi, et al., Cell, 1993, 74, 291-298. FXR1 showed a similar RNA-binding profile to FMR1, binding in a moderately (250 mM NaCl) salt-resistant manner to poly(G) and poly(U), but not to poly(A) or poly(C). It is concluded that FXR1, like FMR1, has characteristics of an RNA-binding protein.

The carboxyl portion of X. laevis FXR1 (amino acids 500-649) was produced in E. coli, purified, and used for immunizations of mice to produce specific antibodies to FXR1. An immunoblot with the serum of an immunized mouse indicated that the serum shows reactivity towards a 70 kD protein, which corresponds to the shorter form of FXR1 in HeLa cells. With longer exposure of the same gel, several additional bands near the 70 kD protein were observed. It appears likely that in HeLa cells the shorter form of FXR1 is much more abundant than other forms of FXR1. The serum of the mouse was specific to FXR1; it immunoprecipitated both isoforms of human FXR1 but not FMR1 produced in vitro by transcription/translation of the corresponding cDNAs. Crossreacting proteins of similar size were observed in monkey and chicken cells. A band of lower apparent molecular mass, about 47 kD, and much weaker bands at about 68-70 kD are detected in D. melanogaster cells.

The cellular localization of the FXR1 protein was studied by immunofluorescence microscopy in HeLa cells using the antibodies to FXR1 and showed cytoplasmic localization with no significant staining in the nucleus. It has been previously shown that human FMR1 has cytoplasmic localization (Devys, et al., Nature Genet., 1993, 4, 335-340) and the antibodies for human FMR1 which had been raised in our laboratory (Siomi, et al., Cell, 1993, 74, 291-298) also showed a cytoplasmic localization of FMR1 in HeLa cells.

Mapping of FXR1 was carried out to determine the chromosomal location of FXR1. Reaction conditions allowed specific amplification of the human gene in a background of either rodent or yeast DNA. In the mapping panel, the cell line containing chromosome 12 contained an amplified fragment of the correct size, 161 bp, whereas none of the other samples contained the amplified fragment of interest. Therefore, FXR1 was tentatively assigned to human chromosome 12. The faint signal in the chromosome 21 lane likely results from some contamination of this hybrid with chromosome 12 material.

The same set of primers was used to screen pools from the Washington University CGM YAC library (Green, et al., Proc. Natl. Acad. Sci. USA, 1990, 87, 1213-1217) using the same conditions described for the mapping panel. Two YAC clones containing FXR1 were identified: 1) A192D7; and 2) B105H7. Fluorescence in situ hybridization (Janne, et al., Cytogenet. Cell Genet., 1994, 66, 164-166) using both of these two YACs localized FXR1 to chromosome 12q13, thus confirming the somatic hybrid data.

To determine the expression of FXR1 mRNA in different tissues, reverse transcription-polymerase chain reaction (RT-PCR) was performed. Specific non-crossreacting primers were designed to amplify FXR1 and FMR1. FXR1 mRNAs were detected in all tissues tested, but different size bands were observed in various tissues. For example, while HeLa cells contain only one FXR1 mRNA, at least two forms are detected in brain and testis, and in heart there is an additional larger form. The major smaller HeLa band was cloned and sequenced and its sequence corresponded to the shorter cDNA FXR1 form described above. These findings suggest that there is considerable tissue-specific alternative splicing of FXR1 pre-mRNA at least for the carboxyl part and immediate 3' UTR of the mRNA. A similar complex pattern of expression has been reported for FMR1 (Verkerk, et al., Hum. Mol. Genet., 1993, 2, 399-404), although multiple forms of FMR1 were observed by RT-PCR under the conditions used in the experiment.

Most patients with fragile X mental retardation syndrome do not express FMR1 mRNA or protein (Pieretti, et al., Cell, 1991, 66, 817-822; Verheij, et al., Nature, 1993, 363, 722-724; and Siomi, et al., Cell, 1993, 74, 291-298). It was, therefore, of particular interest to determine if the expression of the related protein, FXR1, is also affected in these patients. RT-PCR and immunoblotting were carried out on lymphoblastoid cells of a fragile X patient and his normal sibling (Siomi, et al., Cell, 1993, 74, 291-298). By RT-PCR, both the normal sibling and the patient express FXR1 mRNA, while the patient, as expected, does not express FMR1 mRNA. The same is seen for the protein products of FXR1 and FMR1, respectively. Because of inherent limitations of RT-PCR it is not possible to draw quantitative conclusions from this experiment. It does, however, appear from the immunoblotting experiments that the amount of FXR1 produced in the patient cells is not reduced compared to normal. Thus, FXR1 expression is not drastically affected by the lack of expression of FMR1, and therefore, the FXR1 gene expression in lymphoblastoid cells does not appear to be linked to that of the FMR1 gene.

FXR1 has high amino acid sequence identity to FMR1 in the region containing the KH domains and is very similar to FMR1 within the amino terminal domain. The carboxyl portion of FXR1 is, however, quite different from that of FMR1. Using RT-PCR and primers specific to the carboxyl terminus of FXR1, different size bands were detected in tissues tested. In fact, two alternatively spliced forms of the human FXR1 mRNAs encoding different isoforms have been isolated. It has been shown that alternative splicing, which occurs in both human and mouse FMR1, results in the production of several isoforms that differ at the carboxyl ends but not in the amino terminus of FMR1 (Ashley, et al., Nature Genet., 1993, 4, 244-251; and Verkerk, et al., Hum. Mol. Genet., 1993, 2, 399-404). Taken together, considerable diversity is observed at the carboxyl terminus of both FMR1 and FXR1 proteins. Many genes are subject to alternative splicing which can introduce functional diversity to the products of a single gene.
In most cases, this gives rise to protein isoforms sharing extensive regions of identity and varying only in specific domains, thereby allowing for the fine regulation of protein function. The findings of FMR1 and FXR1 isoforms mentioned above suggest that the carboxyl terminus may be involved in the determination of the localization, or regulatory or catalytic specificities in the different members of the FMR1 family. It should be noted that taking into account the relative abundance of different transcripts in various tissues, there may be tissue-specific functions for the various isoforms of FXR1, which may be in contrast to FMR1. Ashley, et al., Nature Genet., 1993, 4, 244-251; and Verkerk, et al., Hum. Mol. Genet., 1993, 2, 399-404.

Both FMR1 and FXR1 have been well conserved through evolution, probably reflecting their essential roles in cells, although the functions of FXR1 and FMR1 have not yet been elucidated. FMR1 and FXR1 have strong structural similarity, are both cytoplasmic and have similar RNA-binding activities in vitro. It, therefore, appears that the biological function of FXR1 protein may be strongly related to that of FMR1. However, if redundancy of FMR1 and FXR1 functions exists, it must only be partial. This follows from the fact that an apparently normal FXR1 mRNA and protein are expressed in lymphoblastoid cells of a patient with fragile X syndrome, while FMR1 mRNA and protein are not expressed in these cells (Pieretti, et al., Cell, 1991, 66, 817-822; Verheij, et al., Nature, 1993, 363, 722-724; and Siomi, et al., Cell, 1993, 74, 291-298). It has been suggested that FMR1 has an important physiological function in neurological tissues as the intragenic mutation in FMR1 (Ile-304→Asn substitution) is directly responsible for the clinical abnormalities, particularly mental retardation, of the fragile X syndrome (De Boulle, et al., Nature Genet., 1993, 3, 31-35).

To produce FXR1 gene knockout mice, several mouse FXR1 genomic clones have been isolated. The intron-exon boundaries around exons 2 and 3 of mouse FXR1 are identical to those of human FMR1, which has been determined to comprise 17 exons spanning 38 kb at Xq27.3. Eichler, et al., Hum. Mol. Genet., 1993, 2, 1147-1153. The segment containing amino acids 331-396 of FMR1, which is absent from FXR1 and from the X. laevis FMR1 cDNA, corresponds exactly to exons 11 and 12 of FMR1.

Example 2A: Materials and Methods

The in vitro transcription-translation reaction was performed using the TNT coupled reticulocyte lysate system (Promega Biotech) in the presence of (35S)methionine (Amersham). Truncated FMR1 peptides were produced from pHHSI-F27X (Siomi, et al., Cell, 1993, 74, 291-298) digested at either KpnI (nucleotide 1324 (Verkerk, et al., Cell, 1991, 65, 905-914)) (FMR1-KpnI) or NdeI at nucleotide 740 (FMR1-NdeI). The full length FMR1 was produced from the same plasmid DNA but undigested. FXR2 is from the pET28a-FXR2s plasmid. An EcoRI fragment of pGAD-FXR2s isolated from the two hybrid system was cloned into the pET28a vector to generate pET28a-FXR2s. In vitro-produced proteins were analyzed by SDS-PAGE followed by fluorography. pGST-FXR2 was constructed by inserting an EcoRI fragment of FXR2s into pGST-1λT (Pharmacia). The bacterially expressed fusion protein (GST-FXR2) and GST were purified as described by the manufacturer. 2 μg of purified GST or GST-FXR2 was incubated with 10 μl of the in vitro translated proteins and 25 μl of glutathione-Sepharose (Pharmacia) in 500 μl of binding buffer (50 mM Tris-HCl pH 7.5, 500 mM NaCl, 2 mM EDTA, 0.1% NP40, 5 μg/ml leupeptin and 0.5% aprotinin). Following incubation for 60 minutes at 4°C, the resin was sedimented, washed with binding buffer, and the bound fraction was eluted and analyzed by SDS-PAGE followed by fluorography.

Monoclonal antibodies were raised in mice against a His6-FXR2s fusion protein from the pET28a-FXR2s plasmid. The fusion protein was expressed in the E. coli strain BL21(DE3)pLysS and purified by metal chelation chromatography as described by the manufacturer (Novagen). Immunofluorescence microscopy on HeLa cells was carried out as previously described (Choi, et al., J. Cell Biol., 1984, 99, 197-204) using undiluted hybridoma culture supernatant of monoclonal antibody A66. Immunoprecipitations were carried out in the presence of the nondenaturing zwitterionic detergent Empigen BB. Choi, et al., J. Cell Biol., 1984, 99, 197-204. The immunoprecipitates were analyzed by SDS-polyacrylamide gel electrophoresis followed by fluorography.

One μl of mouse ascites fluid was used for immunoprecipitation. Immunoblotting was carried out as described using undiluted hybridoma culture supernatant of monoclonal antibody A66. Full length FXR2 produced in vitro was from one of the library isolates cloned into the pGEM7Z vector. The FXR1 peptide was produced. The anti-FMR1 monoclonal antibody EF8 was used as control.

Somatic Cell Hybrid Panel #2 was purchased from the Coriell Institute Cell Repository. This panel consists of DNA isolated from 24 human/rodent somatic cell hybrids, each retaining a single intact human chromosome. Primers were designed to generate a PCR product of 175 bp from a portion of the 3' untranslated end of the FXR2 gene. The primer (Genosys, Inc.) sequences are: forward: 5'-CAGGGTCATACCCCCTCC-3' (SEQ ID NO:9) and reverse: 5'-CTGAACGGTCAAATCTGGGT-3' (SEQ ID NO:10).
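As a quick sanity check on the two primers listed above, the following minimal Python sketch computes their lengths, GC fractions and reverse complements. The helper names `gc_fraction` and `reverse_complement` are illustrative and not part of the original disclosure; only the two primer sequences are taken from the text.

```python
# Primer sequences as given in the text (SEQ ID NO:9 and SEQ ID NO:10).
FORWARD = "CAGGGTCATACCCCCTCC"
REVERSE = "CTGAACGGTCAAATCTGGGT"


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), "nt",
          f"GC={gc_fraction(primer):.2f}",
          "revcomp=" + reverse_complement(primer))
```

The reverse complement of the reverse primer is the sequence that would be expected on the sense strand of the FXR2 3' untranslated region at the downstream end of the amplicon.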

PCR reactions were performed in a Perkin-Elmer 480 thermal cycler. Reaction volume of the PCR amplifications was 50 μl containing 50 ng of genomic DNA, 200 mM dNTP, and 2.5 units AmpliTaq DNA polymerase in Perkin-Elmer Buffer 1. Samples were overlaid with light mineral oil and processed through one step of denaturation (95°C for 6 minutes), 26 cycles of denaturation (95°C for 1 minute), annealing (55°C for 2 minutes), and elongation (72°C for 2 minutes), followed by elongation for one cycle (7 minutes at 72°C). The PCR products were analyzed by electrophoresis in a 3% NuSieve/agarose gel.
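The thermal profile described above can be written down as data and summed to estimate the programmed hold time. This is only a sketch: the variable and function names are illustrative, and ramp times between temperatures (which depend on the instrument) are not included.

```python
# Thermal profile as stated in the text; all times are in minutes.
INITIAL_DENATURATION_MIN = 6      # one step at 95 °C
CYCLES = 26
CYCLE_STEPS_MIN = {
    "denaturation (95 °C)": 1,
    "annealing (55 °C)": 2,
    "elongation (72 °C)": 2,
}
FINAL_ELONGATION_MIN = 7          # one step at 72 °C


def total_hold_minutes() -> int:
    """Sum of programmed hold times, ignoring temperature ramps."""
    per_cycle = sum(CYCLE_STEPS_MIN.values())
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + FINAL_ELONGATION_MIN


print(total_hold_minutes(), "minutes of programmed hold time")
```

With the values in the text this comes to 6 + 26 × 5 + 7 = 143 minutes.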

For FISH analysis, an arrayed chromosome 17 cosmid library constructed from flow-sorted chromosome 17 material by L. Deavan (Los Alamos National Laboratory) was screened for cosmids containing FXR2 using the full-length FXR2 cDNA as probe. Three of the cosmids that hybridized to FXR2 were used as templates for PCR with the primers described above and their PCR products were sequenced as per the manufacturer's instructions (fmol sequencing kit, Promega). DNA was isolated from two of these cosmids, labeled with digoxigenin, hybridized to DAPI stained metaphase chromosomes and detected with rhodamine for FISH analysis as described. Pinkel, et al., Proc. Natl. Acad. Sci. USA, 1986, 83, 2934-2938. Simultaneously, a marker specific for the centromeric region of chromosome 17 was prepared by PCR amplification of alpha satellite DNA (Weier, et al., Hum. Genet., 1991, 87, 489-494) using DNA from a human chromosome 17-only somatic cell hybrid, labeled with biotin and detected with FITC using a triple band pass filter.

The human brain cDNA library, yeast strains and yeast plasmids pGBT9, pGAD424, pVA3, and pTD1 were from Clontech Incorporated. The manipulation of yeast and the library screening were according to the conditions suggested by the manufacturer. Plasmids pVA3 and pTD1 contain a murine p53/GAL4 DNA binding domain hybrid and an SV40 large T-antigen/GAL4 activation domain hybrid, respectively, which served as a positive control for interaction. FMR1 cDNA was from an EcoRI-NsiI fragment in pF27X (Siomi, et al., Cell, 1993, 74, 291-298), in which the EcoRI site was blunted, and the resulting fragment was inserted into pGBT9 or pGAD424 between the SmaI and PstI sites to create pGBT-FMR1 or pGAD-FMR1. As low basal levels of HIS3 expression in the HF7C host strain were observed, 15 mM 3-aminotriazole (3-AT) (Sigma) was added to the selection medium during the screen to suppress growth of transformants containing noninteracting hybrid proteins.

Example 2B

The yeast two-hybrid system was used to identify proteins that interact with FMR1. Fields, et al., Nature, 1989, 340, 245-246. FMR1 was fused to the DNA binding domain of the yeast transcription factor GAL4 (GAL4(1-147)-FMR1) as a bait. As a target, a human brain cDNA library fused to the GAL4 activation domain was used. The selection was performed in yeast strain HF7C, which carries both HIS3 and LacZ reporters under the control of GAL4-responsive elements.

Approximately 1×10^7 yeast transformants were screened. Eleven colonies showed both histidine prototrophy and β-galactosidase activity. From these 11 colonies, we recovered fusion plasmids that conferred His+ and blue color to reporter strains only in the presence of GAL4(1-147)-FMR1. Sequencing of the 11 clones revealed that they were all derived from the same cDNA. As the sequences showed significant homology with that of FMR1 and with another recently described FMR1 homologue, FXR1, the protein was named FXR2 (Fragile X related). The FXR2 clone, which turned out to be a partial cDNA (~1.2 kb) designated FXR2s (amino acids 14-426, see below), was retransformed into a host strain and the specificity of the interaction between FXR2s and FMR1 was further tested (Table 1). Visual inspection of activation of the HIS3 and LacZ reporters showed that FMR1 interacted specifically with FXR2s. Reciprocal exchange of the GAL4 peptides to which FMR1 and FXR2s were fused did not affect the interaction. FXR2s was also capable of associating with itself, whereas no FMR1-FMR1 interaction was detected in the assay. The interactions observed in the two-hybrid system of FXR2s with FMR1 and with itself were subsequently confirmed for the full-length FXR2 proteins.

To isolate full length clones, the partial FXR2 cDNA (FXR2s) described above was used as a probe to screen a human fetal brain cDNA library. The three largest overlapping clones contained an open reading frame of 673 amino acids with a predicted molecular mass of 74 kDa. Northern blot hybridization using labeled FXR2s cDNA as probe detected a transcript of approximately 3.0 kb in HeLa cells and in mouse brain, indicating that the 2.9 kb of cDNA whose sequence is shown contains full-length or near full-length mRNA of this protein. The sequence context of the putative first AUG conforms to the Kozak consensus sequence for preferred translation start sites. Kozak, J. Cell Biol., 1989, 108, 229. Unlike FMR1 mRNA, the 5' UTR of FXR2 does not contain CGG repeats, nor are there any other striking characteristics in the 5'- or 3'-untranslated regions. Homology searches with the predicted protein sequence identified significant similarity with the FMR1 protein (~60% identity) and with a recently described homologue, FXR1. In particular, like FMR1 and FXR1, FXR2 also contains two highly conserved KH domains, the sequence motifs characteristic of RNA binding proteins of this family. Siomi, et al., Nucl. Acids Res., 1993, 21, 1193-1198; and Burd, et al., Science, 1994, 265, 615-621. In this region, the similarity between the three proteins is as high as 90%. Additionally, the spacing between the two KH domains is identical in all three proteins. In the carboxyl terminal portion, the similarity between FXR2 and FMR1 decreases gradually. Nevertheless, the strong overall similarity indicates that these proteins belong to the same family. The region carboxyl terminal to the KH domains in FXR2 is very basic and is rich in serines, arginines, glycines and prolines. The last few amino acids of all three proteins contain the same sequence NGVS/P (SEQ ID NO:13).
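The predicted molecular mass quoted above for the 673-residue open reading frame can be checked to a first approximation with a back-of-the-envelope calculation. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that does not come from the text; an exact figure would require summing the individual residue masses of the actual sequence.

```python
# Assumed rule-of-thumb average mass per amino acid residue (not from the text).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int,
                    avg_residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Rough protein mass estimate in kDa from the residue count alone."""
    return n_residues * avg_residue_mass_da / 1000.0


# 673 residues is the open reading frame length stated in the text.
print(f"{approx_mass_kda(673):.1f} kDa")
```

Under this assumption the estimate is about 74 kDa, in line with the predicted mass stated in the text.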

To further confirm and characterize the FMR1-FXR2 interaction, in vitro binding assays were performed. The FXR2s peptide was expressed as a fusion protein with bacterial glutathione S-transferase (GST). The FMR1 protein was produced and labeled with (35S)methionine by in vitro transcription-translation in reticulocyte lysate. The purified GST protein or GST-FXR2 fusion protein immobilized on glutathione-Sepharose was incubated with either labeled FMR1 or, as a control, the hnRNP K protein (Siomi, et al., Nucl. Acids Res., 1993, 21, 1193-1198; and Burd, et al., Science, 1994, 265, 615-621), which also contains KH domains. Following washing, bound GST fusion protein and any associated proteins were dissociated by boiling in SDS-containing buffer and analyzed by SDS-polyacrylamide gel electrophoresis (PAGE). Full-length FMR1 bound specifically to the immobilized GST-FXR2, but not to GST alone. As the FXR2 sequence bears strong similarity to the amino-terminus of FMR1, we tested whether the protein-interaction domain of FMR1 was also located in this region. We produced peptides from truncated transcripts generated by digestion of the FMR1 cDNA with KpnI or NdeI. The peptides from KpnI-truncated transcripts were still capable of interacting with FXR2, indicating that, as in FXR2, the carboxyl terminus of FMR1 is not necessary for oligomerization. The region between NdeI and KpnI is required for the interaction with FXR2, since the peptides from NdeI-truncated transcripts, in which the KH domains were deleted, showed little or no binding. The in vitro translated FXR2 also bound to the GST-FXR2, confirming the observation in the two-hybrid system that FXR2 was able to homo-oligomerize. A weak interaction was also detected with a fragment of hnRNP K, the reasons for which are presently not understood. The interaction appears to be of high affinity, as the washing conditions were stringent, with 500 mM NaCl in the buffer. In addition, the interaction could also be observed by a far-Western blot assay, in which (35S)methionine-FMR1 specifically bound to FXR2 produced and purified in E. coli and immobilized on a nitrocellulose membrane. FXR2 is thus capable of interacting with FMR1 and with itself in vitro, with high affinity and specificity.

Monoclonal antibodies against FXR2 were produced in mice. Immunoprecipitations of FMR1 and FXR2 proteins produced in vitro by transcription/translation, using one of the monoclonal antibodies that we have characterized, A42, demonstrated that it reacted specifically with FXR2, which migrates as a ~95 kD band, and did not crossreact with FMR1. By immunoblotting, A66 (a monoclonal antibody to FXR2 with similar specificity to A42) detected a single band of ~95 kD in total HeLa cell material. The intracellular localization of FXR2 was investigated in HeLa cells by immunofluorescence microscopy. Immunostaining with the monoclonal antibody A66 showed that FXR2, like FMR1 (Devys, et al., Nature Genet., 1993, 4, 335-340), is also present in the cytoplasm. HeLa cells expressing FXR2 protein tagged with the Myc epitope at its amino-terminus, detected using a monoclonal antibody directed against the Myc epitope, also showed cytoplasmic staining. The localization of both proteins to the cytoplasm and their expression in the same cell type (e.g., HeLa) suggests that the FMR1-FXR2 interaction is likely to be biologically relevant.

The chromosomal localization of the FXR2 gene in humans was determined. Primers derived from the 3'-untranslated region of FXR2 allowed specific amplification of the human gene in a background of rodent DNA. PCR amplification of the hybrid cell line with chromosome 17 generated an amplified fragment of the correct size, 175 bp, as predicted from the cDNA sequence, whereas none of the other cell lines showed the amplified fragment of interest. Therefore, FXR2 was tentatively assigned to human chromosome 17. For fluorescent in situ hybridization (FISH) analysis, an arrayed chromosome 17 specific cosmid library was screened by hybridization using full length FXR2 cDNA as probe. Each of three cosmids, located in the array at positions 58G2, 98F2 and 148C9, was confirmed by PCR to contain the correctly sized 175 bp fragment using the same 3'-untranslated primers used in the somatic cell hybrid panel analysis. The sequence of these PCR products from the cosmids matched the sequence of FXR2 perfectly. The position of FXR2 at 17p13.1 is indicated by the red signal and that of the chromosome 17 centromeric marker by the green signal. Further FISH analysis with the FXR2 cosmids and a probe specific for the region (17p13.3) associated with Miller-Dieker Syndrome, a brain malformation manifested by a smooth cerebral surface and abnormal neuronal migration, demonstrated that FXR2 maps proximal to this locus.

Using anti-FXR2 antibodies, FXR2 was detected in the cytoplasm. This is a similar staining pattern to that observed with antibodies to FMR1. Verheij, et al., Nature, 1993, 363, 722-724; and Devys, et al., Nature Genet., 1993, 4, 335-340. Significantly, FXR2 shares strong similarity with FMR1, a further indication that the FMR1-FXR2 association is biologically meaningful.

Table 1: Interaction of FXR2 with related proteins

Fusion protein (pGBT-, pGAD-) | His | β-galactosidase

β-galactosidase: blue, white, white, blue, blue, blue, white, white, white, white, white

Table 1: Interaction of FMR1 and FXR2 in the yeast two-hybrid system. Individual colonies of HF7C yeast cells that contained pairs of the indicated plasmids were streaked with toothpicks onto duplicate Trp-Leu- plates, and two days later replica plated onto either a Trp-Leu-His- plate containing 15 mM 3-aminotriazole (Sigma) or a filter paper (Grade 413, VWR Scientific) that was then incubated for a β-galactosidase filter assay. Interaction results in activation of the HIS3 and LacZ reporters; growth in the absence of histidine and blue color in the β-galactosidase filter assay are indicated.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Gideon Dreyfuss

Mikiko C. Siomi

Yan Zhang

(ii) TITLE OF INVENTION: Fragile X Related Proteins, Compositions And Methods Of Making And Using The Same

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Woodcock, Washburn, Kurtz, Mackiewicz & Norris

(B) STREET: One Liberty Place, 46th floor

(C) CITY: Philadelphia

(D) STATE: PA

(E) COUNTRY: USA

(F) ZIP: 19103

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: WordPerfect 5.1

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/455,073

(B) FILING DATE: 31-MAY-1995

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: DeLuca, Mark

(B) REGISTRATION NUMBER: 33,229

(C) REFERENCE/DOCKET NUMBER: UPN-2816

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (215) 568-3100

(B) TELEFAX: (215) 568-3439

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 220..2118

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1 :

1 acςgcgagcg cgggcggcgg cggcgαcgga ggcgccgccg ccagggggcg tgcggcagcg αl cggcggcggc ggcggcggcg gcggcggcgg aggcggcggc ggcggcggcg gcggcggcgg

121 aggcggcggc ggcggcggcg gcggcggcgg ccgggcctcg agcgcccgca gcccaccccc

131 cgggggcggg cccccggcgc cagcagggct gaagagaaga tggaggagcc ggcggtggaa

241 CZ~C~ Q ~QCC ccaatgαcσc tttctacaacr gcatctσtaa aggatαctca tσaaqattca

301 acaacagtcg catttgaaaa caactggcag cccgacaggc agaccccatc tcatgacgcc 361 ≤gactcccac ctcctgtagg ttataataaa gatacaaatg aaagcgatga agccgaggtg 4.21 Caccccagag caaacgaaaa agagccctgc cgtcggcggc tagccaaagt gaggacgaca 481 aagggcgagc tctacgtgac agaacacgca gcacgcgacg caacttacaa cgaaatcgtc 541 acaaccgaac gcccaagatc tgccaacccc aacaaacccg ccacaaaaga caccttccac 601 aagaCcaagc cggacgcgcc agaagaccca cggcaaacgc gtgccaaaga ggcggcacac 561 aaggaCtcta aaaaggcagt CggtgccCCC Cctgtaaccc acgacccaga aaaccaccag 721 cr gccaccc cgcccaccaa cgaagccacc ccaaagcgag cacacacgcc gaccgacacg 781 cactcccgga gcccgcgcac taaςccgtct ccgataacga gaaacgaaga agccagcaag

3841 caagecagga aaca g agcaa aacacacagc aegcaagcag aaacaacaaa gcaagaacaa 3901 ccaccgccaa accgaagcag aagcaaacca aacaaacgac agcaactaaa aaaaaaagca

2961 aggaaaagaa atctaaagaa agagctctgt tacaaaaagt aactgtaacc aaaggaaaaa 21 acaegtcaaa ggaagaaagc cacaatcaac ccaacacaca agaaggatca aaaaaaaagc 4081 gaaaacccaa ccaaacggca cccccccaca gagaaccaaa aaaacacgea caagccac t a 4141 aaaacaaggc agaacccagc aaggg caaa caegcaaaaa gagaaactac aaacaacaaa 42C1 aacagaacaa agaaacacac gaaaccaaaa ggagaaggag accacacaaa caacaaaaaa 4261 a agagctaa acgagaaaga aaaacagaac aaagacaaga a acaaaaaa aacataacaa 4321 agaaacaaaa aacagaagaa agaaaaccta gaagtaaaaa ta

(2) INFORMATION FOR SEQ ID NO:2 :

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 610 amino acid residues

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Glu Glu Leu Val Val Glu Val Arg Gly Ser Asn Gly Ala Phe Tyr 1 5 10 15

Lys Ala Phe Val Lys Asp Val His Glu Asp Ser Ile Thr Val Ala Phe 20 25 30

Glu Asn Asn Trp Gln Pro Asp Arg Gln Ile Pro Phe His Asp Val Arg 35 40 45

Phe Pro Pro Pro Val Gly Tyr Asn Lys Asp Ile Asn Glu Ser Asp Glu 50 55 60

Val Glu Val Tyr Ser Arg Ala Asn Glu Lys Glu Pro Cys Cys Trp Trp 65 70 75 80

Leu Ala Lys Val Arg Met Ile Lys Gly Glu Phe Tyr Val Ile Glu Tyr 85 90 95

Ala Ala Cys Asp Ala Thr Tyr Asn Glu Ile Val Thr Ile Glu Arg Leu 100 105 110

Arg Ser Val Asn Pro Asn Lys Pro Ala Thr Lys Asp Thr Phe His Lys 115 120 125

Ile Lys Leu Asp Val Pro Glu Asp Leu Arg Gln Met Cys Ala Lys Glu 130 135 140

Ala Ala His Lys Asp Phe Lys Lys Ala Val Gly Ala Phe Ser Val Thr 145 150 155 160

Tyr Asp Pro Glu Asn Tyr Gln Leu Val Ile Leu Ser Ile Asn Glu Val 165 170 175

Thr Ser Lys Arg Ala His Met Leu Ile Asp Met His Phe Arg Ser Leu 180 185 190

Arg Thr Lys Leu Ser Leu Ile Met Arg Asn Glu Glu Ala Ser Lys Gln 195 200 205

Leu Glu Ser Ser Arg Gln Leu Ala Ser Arg Phe His Glu Gln Phe Ile 210 215 220

Val Arg Glu Asp Leu Met Gly Leu Ala Ile Gly Thr His Gly Ala Asn 225 230 235 240

Ile Gln Gln Ala Arg Lys Val Pro Gly Val Thr Ala Ile Asp Leu Asp 245 250 255

Glu Asp Thr Cys Thr Phe His Ile Tyr Gly Glu Asp Gln Asp Ala Val 260 265 270

Lys Lys Ala Arg Ser Phe Leu Glu Phe Ala Glu Asp Val Ile Gln Val 275 280 285

Pro Arg Asn Leu Val Gly Lys Val Ile Gly Lys Asn Gly Lys Leu Ile 290 295 300

Gln Glu Ile Val Asp Lys Ser Gly Val Val Arg Val Arg Ile Glu Ala 305 310 315 320

Glu Asn Glu Lys Asn Val Pro Gln Glu Glu Glu Ile Met Pro Pro Asn 325 330 335

Ser Leu Pro Ser Asn Asn Ser Arg Val Gly Pro Asn Ala Pro Glu Glu 340 345 350

Lys Lys His Leu Asp Ile Lys Glu Asn Ser Thr His Phe Ser Gln Pro 355 360 365

Asn Ser Thr Lys Val Gln Arg Gly Met Val Pro Phe Val Phe Val Gly 370 375 380

Thr Lys Asp Ser Ile Ala Asn Ala Thr Val Leu Leu Asp Tyr His Leu 385 390 395 400

Asn Tyr Leu Lys Glu Val Asp Gln Leu Arg Leu Glu Arg Leu Gln Ile 405 410 415

Asp Glu Gln Leu Arg Gln Ile Gly Ala Ser Ser Arg Pro Pro Pro Asn 420 425 430

Arg Thr Asp Lys Glu Lys Ser Tyr Val Thr Asp Asp Gly Gln Gly Met 435 440 445

Gly Arg Gly Ser Arg Pro Tyr Arg Asn Arg Gly His Gly Arg Arg Gly 450 455 460

Pro Gly Tyr Thr Ser Gly Thr Asn Ser Glu Ala Ser Asn Ala Ser Glu 465 470 475 480

Thr Glu Ser Asp His Arg Asp Glu Leu Ser Asp Trp Ser Leu Ala Pro 485 490 495

Thr Glu Glu Glu Arg Glu Ser Phe Leu Arg Arg Gly Asp Arg Arg Arg 500 505 510

Gly Gly Gly Gly Arg Gly Gln Gly Gly Arg Gly Arg Gly Gly Gly Phe 515 520 525

Lys Gly Asn Asp Asp His Ser Arg Thr Asp Asn Arg Pro Arg Asn Pro 530 535 540

Arg Glu Ala Lys Gly Arg Thr Thr Asp Gly Ser Leu Gln Ile Arg Val 545 550 555 560

Asn Cys Asn Asn Glu Arg Ser Val His Thr Lys Thr Leu Gln Asn Thr 565 570 575

Ser Ser Glu Gly Ser Arg Leu Arg Thr Gly Lys Asp Arg Asn Gln Lys 580 585 590

Lys Glu Lys Pro Asp Ser Val Asp Gly Gln Gln Pro Leu Val Asn Gly 595 600 605

Val Pro 610

(2) INFORMATION FOR SEQ ID NO:3 :

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1863 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3

ATG GCG GAC GTG ACG GTG GAG GTT CGC GGC TCT AAC GGG GCT TTC TAC 48 Met Ala Asp Val Thr Val Glu Val Arg Gly Ser Asn Gly Ala Phe Tyr 1 5 10 15

AAG GGA TTT ATC AAA GAT GTT CAT GAA GAC TCC CTT ACA GTT GTT TTT 96 Lys Gly Phe Ile Lys Asp Val His Glu Asp Ser Leu Thr Val Val Phe 20 25 30

GAA AAT AAT TGG CAA CCA GAA CGC CAG GTT CCA TTT AAT GAA GTT AGA 144 Glu Asn Asn Trp Gln Pro Glu Arg Gln Val Pro Phe Asn Glu Val Arg 35 40 45

TTA CCA CCA CCA CCT GAT ATA AAA AAA GAA ATT AGT GAA GGA GAT GAA 192 Leu Pro Pro Pro Pro Asp Ile Lys Lys Glu Ile Ser Glu Gly Asp Glu 50 55 60

GTA GAG GTA TAT TCA AGA GCA AAT GAC CAA GAG CCA TGT GGG TGG TGG 240 Val Glu Val Tyr Ser Arg Ala Asn Asp Gln Glu Pro Cys Gly Trp Trp 65 70 75 80

TTG GCT AAA GTT CGG ATG ATG AAA GGA GAA TTT TAT GTC ATT GAA TAT 288 Leu Ala Lys Val Arg Met Met Lys Gly Glu Phe Tyr Val Ile Glu Tyr 85 90 95

GCT GCT TGT GAC GCT ACT TAC AAT GAA ATA GTC ACA TTT GAA CGA CTT 336 Ala Ala Cys Asp Ala Thr Tyr Asn Glu Ile Val Thr Phe Glu Arg Leu 100 105 110

CGG CCT GTC AAT CAA AAT AAA ACT GTC AAA AAA AAT ACC TTC TTT AAA 384 Arg Pro Val Asn Gln Asn Lys Thr Val Lys Lys Asn Thr Phe Phe Lys 115 120 125

TGC ACA GTG GAT GTT CCT GAG GAT TTG AGA GAG GCG TGT GCT AAT GAA 432 Cys Thr Val Asp Val Pro Glu Asp Leu Arg Glu Ala Cys Ala Asn Glu 130 135 140

AAT GCA CAT AAA GAT TTT AAG AAA GCA GTA GGA GCA TGC AGA ATT TTT 480 Asn Ala His Lys Asp Phe Lys Lys Ala Val Gly Ala Cys Arg Ile Phe 145 150 155 160

TAC CAT CCA GAA ACA ACA CAG CTA ATG ATA CTG TCT GCC AGT GAA GCA 528 Tyr His Pro Glu Thr Thr Gln Leu Met Ile Leu Ser Ala Ser Glu Ala 165 170 175

ACT GTG AAG AGA GTA AAC ATC TTA AGT GAC ATG CAT TTG CGA AGT ATT 576 Thr Val Lys Arg Val Asn Ile Leu Ser Asp Met His Leu Arg Ser Ile 180 185 190

CGT ACG AAG TTG ATG CTT ATG TCC AGA AAT GAA GAG GCC ACT AAG CAT 624 Arg Thr Lys Leu Met Leu Met Ser Arg Asn Glu Glu Ala Thr Lys His 195 200 205

TTA GAA TGC ACA AAA CAA CTT GCA GCA GCT TTT CAT GAG GAA TTT GTT 672 Leu Glu Cys Thr Lys Gln Leu Ala Ala Ala Phe His Glu Glu Phe Val 210 215 220

GTG AGA GAA GAT TTA ATG GGC CTG GCA ATA GGA ACA CAT GGT AGT AAC 720 Val Arg Glu Asp Leu Met Gly Leu Ala Ile Gly Thr His Gly Ser Asn 225 230 235 240

ATC CAG CAA GCT AGG AAG GTT CCT GGA GTT ACC GCC ATT GAG CTA GAT 768 Ile Gln Gln Ala Arg Lys Val Pro Gly Val Thr Ala Ile Glu Leu Asp 245 250 255

GAA GAT ACT GGA ACA TTC AGA ATC TAC GGA GAG AGT GCT GAT GCT GTA 816 Glu Asp Thr Gly Thr Phe Arg Ile Tyr Gly Glu Ser Ala Asp Ala Val 260 265 270

AAA AAG GCT AGA GGT TTC TTG GAA TTT GTG GAG GAT TTT ATT CAG GTT 864 Lys Lys Ala Arg Gly Phe Leu Glu Phe Val Glu Asp Phe Ile Gln Val 275 280 285

CCT AGG AAT CTC GTT GGA AAA GTA ATT GGA AAA AAT GGC AAA GTT ATT 912 Pro Arg Asn Leu Val Gly Lys Val Ile Gly Lys Asn Gly Lys Val Ile 290 295 300

CAA GAA ATA GTG GAC AAA TCT GGT GTG GTT CGA GTG AGA ATT GAA GGG 960 Gln Glu Ile Val Asp Lys Ser Gly Val Val Arg Val Arg Ile Glu Gly 305 310 315 320

GAC AAT GAA AAT AAA TTA CCC AGA GAA GAC GGT ATG GTT CCA TTT GTA 1008 Asp Asn Glu Asn Lys Leu Pro Arg Glu Asp Gly Met Val Pro Phe Val 325 330 335

TTT GTT GGC ACT AAA GAA AGC ATT GGA AAT GTG CAG GTT CTT CTA GAG 1056 Phe Val Gly Thr Lys Glu Ser Ile Gly Asn Val Gln Val Leu Leu Glu 340 345 350

TAT CAT ATT GCC TAT CTA AAG GAA GTA GAA CAG CTA AGA ATG GAA CGC 1104 Tyr His Ile Ala Tyr Leu Lys Glu Val Glu Gln Leu Arg Met Glu Arg 355 360 365

CTA CAG ATT GAT GAA CAG CTG CGA CAG ATT GGT TCT AGG TCT TAT AGC 1152 Leu Gln Ile Asp Glu Gln Leu Arg Gln Ile Gly Ser Arg Ser Tyr Ser 370 375 380

GGA AGA GGC AGA GGT CGT CGG GGA CCT AAT TAC ACC TCC GGT TAT GGT 1200 Gly Arg Gly Arg Gly Arg Arg Gly Pro Asn Tyr Thr Ser Gly Tyr Gly 385 390 395 400

ACA AAT TCT GAG CTG TCT AAC CCC TCT GAA ACG GAA TCT GAG CGT AAA 1248 Thr Asn Ser Glu Leu Ser Asn Pro Ser Glu Thr Glu Ser Glu Arg Lys 405 410 415

GAC GAG CTG AGT GAT TGG TCA TTG GCA GGA GAA GAT AAT CGA GAC AGC 1296 Asp Glu Leu Ser Asp Trp Ser Leu Ala Gly Glu Asp Asn Arg Asp Ser 420 425 430

CGA CAT CAG CGT GAC AGC AGG AGA CGC CCA GGA GGA AGA GGC AGA AGT 1344 Arg His Gln Arg Asp Ser Arg Arg Arg Pro Gly Gly Arg Gly Arg Ser 435 440 445

GTT TCA GGG GGT CGA GGT CGT GGT GGA CCA CGT GGT GGC AAA TCC TCC 1392 Val Ser Gly Gly Arg Gly Arg Gly Gly Pro Arg Gly Gly Lys Ser Ser 450 455 460

ATC AGT TCT GTG CTC AAA GAT CCA GAC AGC AAT CCA TAC AGC TTA CTT 1440 Ile Ser Ser Val Leu Lys Asp Pro Asp Ser Asn Pro Tyr Ser Leu Leu 465 470 475 480

GAT AAT ACA GAA TCA GAT CAG ACT GCA GAC ACT GAT GCC AGC GAA TCT 1488 Asp Asn Thr Glu Ser Asp Gln Thr Ala Asp Thr Asp Ala Ser Glu Ser 485 490 495

CAT CAC AGT ACT AAC CGT CGT AGG CGG TCT CGT AGA CGA AGG ACT GAT 1536 His His Ser Thr Asn Arg Arg Arg Arg Ser Arg Arg Arg Arg Thr Asp 500 505 510

GAA GAT GCT GTT CTG ATG GAT GGA ATG ACT GAA TCT GAT ACA GCT TCA 1584 Glu Asp Ala Val Leu Met Asp Gly Met Thr Glu Ser Asp Thr Ala Ser 515 520 525

GTT AAT GAA AAT GGG CTA GTC ACA GTT GCA GAT TAT ATT TCT AGA GCT 1632 Val Asn Glu Asn Gly Leu Val Thr Val Ala Asp Tyr Ile Ser Arg Ala 530 535 540

GAG TCT CAG AGC AGA CAA AGA AAC CTC CCA AGG GAA ACT TTG GCT AAA 1680 Glu Ser Gln Ser Arg Gln Arg Asn Leu Pro Arg Glu Thr Leu Ala Lys 545 550 555 560

AAC AAG AAA GAA ATG GCA AAA GAT GTG ATT GAA GAG CAT GGT CCT TCA 1728 Asn Lys Lys Glu Met Ala Lys Asp Val Ile Glu Glu His Gly Pro Ser 565 570 575

GAA AAG GCA ATA AAC GGC CCA ACT AGT GCT TCT GGC GAT GAC ATT TCT 1776 Glu Lys Ala Ile Asn Gly Pro Thr Ser Ala Ser Gly Asp Asp Ile Ser 580 585 590

AAG CTA CAG CGT ACT CCA GGA GAA GAA AAG ATT AAT ACC TTA AAA GAA 1824 Lys Leu Gln Arg Thr Pro Gly Glu Glu Lys Ile Asn Thr Leu Lys Glu 595 600 605

GAA AAC ACT CAA GAA GCA GCA GTC CTG AAT GGT GTT TCA 1863 Glu Asn Thr Gln Glu Ala Ala Val Leu Asn Gly Val Ser 610 615 620
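The pairing of codons and residues in the listing above follows the standard genetic code, so any row can be checked mechanically. The sketch below is an illustration only: it builds the standard codon table, translates the first row of SEQ ID NO:3, and confirms that 1863 base pairs correspond to 621 codons. The function name `translate` and the one-letter output are choices made for this example, not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence into one-letter amino acid codes."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First row of SEQ ID NO:3 as printed in the listing (nucleotides 1-48).
first_row = "ATG GCG GAC GTG ACG GTG GAG GTT CGC GGC TCT AAC GGG GCT TTC TAC"
print(translate(first_row))   # Met Ala Asp Val Thr Val Glu Val Arg Gly Ser Asn Gly Ala Phe Tyr
print(1863 // 3, "codons in a 1863 bp coding sequence")
```

Running the same function over each row is a simple way to catch transcription errors in either the nucleotide or the amino acid line of a listing.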

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 621 amino acid residues

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Ala Asp Val Thr Val Glu Val Arg Gly Ser Asn Gly Ala Phe Tyr 1 5 10 15

Lys Gly Phe Ile Lys Asp Val His Glu Asp Ser Leu Thr Val Val Phe 20 25 30

Glu Asn Asn Trp Gln Pro Glu Arg Gln Val Pro Phe Asn Glu Val Arg 35 40 45

Leu Pro Pro Pro Pro Asp Ile Lys Lys Glu Ile Ser Glu Gly Asp Glu 50 55 60

Val Glu Val Tyr Ser Arg Ala Asn Asp Gln Glu Pro Cys Gly Trp Trp 65 70 75 80

Leu Ala Lys Val Arg Met Met Lys Gly Glu Phe Tyr Val Ile Glu Tyr 85 90 95

Ala Ala Cys Asp Ala Thr Tyr Asn Glu Ile Val Thr Phe Glu Arg Leu 100 105 110

Arg Pro Val Asn Gln Asn Lys Thr Val Lys Lys Asn Thr Phe Phe Lys 115 120 125

Cys Thr Val Asp Val Pro Glu Asp Leu Arg Glu Ala Cys Ala Asn Glu 130 135 140

Asn Ala His Lys Asp Phe Lys Lys Ala Val Gly Ala Cys Arg Ile Phe 145 150 155 160

Tyr His Pro Glu Thr Thr Gln Leu Met Ile Leu Ser Ala Ser Glu Ala 165 170 175

Thr Val Lys Arg Val Asn Ile Leu Ser Asp Met His Leu Arg Ser Ile 180 185 190

Arg Thr Lys Leu Met Leu Met Ser Arg Asn Glu Glu Ala Thr Lys His 195 200 205

Leu Glu Cys Thr Lys Gln Leu Ala Ala Ala Phe His Glu Glu Phe Val 210 215 220

Val Arg Glu Asp Leu Met Gly Leu Ala Ile Gly Thr His Gly Ser Asn 225 230 235 240

Ile Gln Gln Ala Arg Lys Val Pro Gly Val Thr Ala Ile Glu Leu Asp 245 250 255

Glu Asp Thr Gly Thr Phe Arg Ile Tyr Gly Glu Ser Ala Asp Ala Val 260 265 270

Lys Lys Ala Arg Gly Phe Leu Glu Phe Val Glu Asp Phe Ile Gln Val 275 280 285

Pro Arg Asn Leu Val Gly Lys Val Ile Gly Lys Asn Gly Lys Val Ile 290 295 300

Gln Glu Ile Val Asp Lys Ser Gly Val Val Arg Val Arg Ile Glu Gly 305 310 315 320

Asp Asn Glu Asn Lys Leu Pro Arg Glu Asp Gly Met Val Pro Phe Val 325 330 335

Phe Val Gly Thr Lys Glu Ser Ile Gly Asn Val Gln Val Leu Leu Glu 340 345 350

Tyr His Ile Ala Tyr Leu Lys Glu Val Glu Gln Leu Arg Met Glu Arg 355 360 365

Leu Gln Ile Asp Glu Gln Leu Arg Gln Ile Gly Ser Arg Ser Tyr Ser 370 375 380

Gly Arg Gly Arg Gly Arg Arg Gly Pro Asn Tyr Thr Ser Gly Tyr Gly 385 390 395 400

Thr Asn Ser Glu Leu Ser Asn Pro Ser Glu Thr Glu Ser Glu Arg Lys 405 410 415

Asp Glu Leu Ser Asp Trp Ser Leu Ala Gly Glu Asp Asn Arg Asp Ser 420 425 430

Arg His Gln Arg Asp Ser Arg Arg Arg Pro Gly Gly Arg Gly Arg Ser 435 440 445

Val Ser Gly Gly Arg Gly Arg Gly Gly Pro Arg Gly Gly Lys Ser Ser 450 455 460

Ile Ser Ser Val Leu Lys Asp Pro Asp Ser Asn Pro Tyr Ser Leu Leu 465 470 475 480

Asp Asn Thr Glu Ser Asp Gln Thr Ala Asp Thr Asp Ala Ser Glu Ser 485 490 495

His His Ser Thr Asn Arg Arg Arg Arg Ser Arg Arg Arg Arg Thr Asp 500 505 510

Glu Asp Ala Val Leu Met Asp Gly Met Thr Glu Ser Asp Thr Ala Ser 515 520 525

Val Asn Glu Asn Gly Leu Val Thr Val Ala Asp Tyr Ile Ser Arg Ala 530 535 540

Glu Ser Gln Ser Arg Gln Arg Asn Leu Pro Arg Glu Thr Leu Ala Lys 545 550 555 560

Asn Lys Lys Glu Met Ala Lys Asp Val Ile Glu Glu His Gly Pro Ser 565 570 575

Glu Lys Ala Ile Asn Gly Pro Thr Ser Ala Ser Gly Asp Asp Ile Ser 580 585 590

Lys Leu Gln Arg Thr Pro Gly Glu Glu Lys Ile Asn Thr Leu Lys Glu 595 600 605

Glu Asn Thr Gln Glu Ala Ala Val Leu Asn Gly Val Ser 610 615 620

(2) INFORMATION FOR SEQ ID NO:5 :

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2019 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: ATG GGC GGC CTG GCC TCT GGG GGG GAT GTG GAG CCG GGA CTG CCC GTC 48 Met Gly Gly Leu Ala Ser Gly Gly Asp Val Glu Pro Gly Leu Pro Val 1 5 10 15

GAG GTG CGC GGC TCC AAC GGG GCC TTC TAC AAG GGC TTT GTG AAG GAT 96 Glu Val Arg Gly Ser Asn Gly Ala Phe Tyr Lys Gly Phe Val Lys Asp 20 25 30

GTC CAT GAA GAC TCT GTC ACC ATC TTC TTT GAA AAC AAC TGG CAG AGT 144 Val His Glu Asp Ser Val Thr Ile Phe Phe Glu Asn Asn Trp Gln Ser 35 40 45

GAG AGA CAA ATT CCT TTT GGG GAT GTC CGG CTA CCA CCT CCA GCT GAC 192 Glu Arg Gln Ile Pro Phe Gly Asp Val Arg Thr Pro Pro Pro Ala Asp 50 55 60

TAT AAT AAG GAG ATC ACA GAA GGG GAT GAA GTG GAG GTT TAT TCT CGA 240 Tyr Asn Lys Glu Ile Thr Glu Gly Asp Glu Val Glu Val Tyr Ser Arg 65 70 75 80

GCC AAT GAA CAG GAA CCT TGT GGC TGG TGG CTG GCC CGG GTG CGG ATG 288 Ala Asn Glu Gln Glu Pro Cys Gly Trp Trp Thr Ala Arg Val Arg Met 85 90 95

ATG AAG GGA GAT TTC TAT GTC ATT GAA TAT GCT GCC TGT GAT GCC ACC 336 Met Lys Gly Asp Phe Tyr Val Ile Glu Tyr Ala Ala Cys Asp Ala Thr 100 105 110

TAC AAT GAA ATT GTT ACC CTG GAG CGA CTT CGG CCA GTT AAT CCC AAT 384 Tyr Asn Glu Ile Val Thr Leu Glu Arg Leu Arg Pro Val Asn Pro Asn 115 120 125

CCC CTT GCA ACC AAA GGC AGC TTC TTC AAG GTT ACC ATG GCT GTG CCC 432 Pro Leu Ala Thr Lys Gly Ser Phe Phe Lys Val Thr Met Ala Val Pro 130 135 140

GAG GAT CTG AGA GAA GCC TGC TCC AAT GAA AAC GTC CAT AAA GAG TTC 480 Glu Asp Leu Arg Glu Ala Cys Ser Asn Glu Asn Val His Lys Glu Phe 145 150 155 160

AAG AAA GCC CTG GGA GCC AAC TGC ATC TTT CTC AAC ATC ACA AAC AGT 528 Lys Lys Ala Thr Gly Ala Asn Cys Ile Phe Thr Asn Ile Thr Asn Ser 165 170 175

GAG CTC TTC ATT CTG TCA ACC ACA GAA GCC CCT GTG AAG CGA GCA TCT 576 Glu Leu Phe Ile Leu Ser Thr Thr Glu Ala Pro Val Lys Arg Ala Ser 180 185 190

CTG CTG GGT GAT ATG CAT TTC CGA AGC CTG CGC ACC AAA CTG CTA CTT 624 Leu Leu Gly Asp Met His Phe Arg Ser Leu Arg Thr Lys Leu Leu Leu 195 200 205

ATG TCC CGC AAT GAA GAA GCT ACC AAG CAC CTA GAG ACA AGC AAG CAG 672 Met Ser Arg Asn Glu Glu Ala Thr Lys His Leu Glu Thr Ser Lys Gln 210 215 220

TTG GCA GCA GCC TTC CAA GAG GAG TTC ACA GTG CGA GAG GAC CTG ATG 720 Leu Ala Ala Ala Phe Gln Glu Glu Phe Thr Val Arg Glu Asp Leu Met 225 230 235 240

GGA CTG GCA ATT GGG ACT CAC GGT GCC AAC ATC CAG CAG GCC CGA AAA 768 Gly Leu Ala Ile Gly Thr His Gly Ala Asn Ile Gln Gln Ala Arg Lys 245 250 255

GTA CCT GGG GTG ACC GCC ATT GAG TTG GGT GAA GAG ACC TGC ACT TTC 816 Val Pro Gly Val Thr Ala Ile Glu Leu Gly Glu Glu Thr Cys Thr Phe 260 265 270

CGC ATC TAT GGG GAG ACT CCC GAG GCT TGC CGA CAG GCC CGA AGC TAC 864 Arg Ile Tyr Gly Glu Thr Pro Glu Ala Cys Arg Gln Ala Arg Ser Tyr 275 280 285

CTT GAG TTT TCT GAG GAC TCA GTG CAA GTG CCC AGG AAC CTG GTT GGC 912 Leu Glu Phe Ser Glu Asp Ser Val Gln Val Pro Arg Asn Leu Val Gly 290 295 300

AAA GTG ATT GGA AAG AAC GGG AAA GTG ATC CAG GAG ATT GTG GAT AAA 960 Lys Val Ile Gly Lys Asn Gly Lys Val Ile Gln Glu Ile Val Asp Lys 305 310 315 320

TCT GGT GTG GTG AGG GTT CGA GTG GAA GGT GAT AAT GAC AAG AAG AAC 1008 Ser Gly Val Val Arg Val Arg Val Glu Gly Asp Asn Asp Lys Lys Asn 325 330 335

CCC AGG GAG GAG GGA ATG GTT CCC TTC ATT TTT GTT GGC ACC CGA GAG 1056 Pro Arg Glu Glu Gly Met Val Pro Phe Ile Phe Val Gly Thr Arg Glu 340 345 350

AAC ATC AGC AAT GCC CAG GCT TTG CTG GAG TAT CAC CTC TCC TAC CTG 1104 Asn Ile Ser Asn Ala Gln Ala Leu Leu Glu Tyr His Leu Ser Tyr Leu 355 360 365

CAG GAG GTA GAG CAG CTT CGC TTG GAG AGG CTA CAA ATT GAT GAG CAG 1152 Gln Glu Val Glu Gln Leu Arg Leu Glu Arg Leu Gln Ile Asp Glu Gln 370 375 380

CTT CGG CAG ATT GGG CTG GGC TTT CGC CCT CCT GGG AGT GGG CGG GGC 1200 Leu Arg Gln Ile Gly Leu Gly Phe Arg Pro Pro Gly Ser Gly Arg Gly 385 390 395 400

AGC GGT GGC AGC GAC AAG GCT GGA TAT AGC ACT GAT GAG AGC TCC TCC 1248 Ser Gly Gly Ser Asp Lys Ala Gly Tyr Ser Thr Asp Glu Ser Ser Ser 405 410 415

TCC TCC CTC CAT GCG ACT CGA ACC TAT GGG GGC AGC TAT GGG GGC CGT 1296 Ser Ser Leu His Ala Thr Arg Thr Tyr Gly Gly Ser Tyr Gly Gly Arg 420 425 430

GGC CGT GGC CGG AGG ACA GGC GGT CCT GCC TAT GGC CCC AGC TCA GAT 1344 Gly Arg Gly Arg Arg Thr Gly Gly Pro Ala Tyr Gly Pro Ser Ser Asp 435 440 445

GTG TCT ACA GCT TCA GAG ACT GAG TCA GAG AAG AGA GAG GAG CCC AAC 1392 Val Ser Thr Ala Ser Glu Thr Glu Ser Glu Lys Arg Glu Glu Pro Asn 450 455 460

CGA GCT GGG CCT GGC GAC AGG GAT CCC CCA ACC CGA GGG GAA GAA AGC 1440 Arg Ala Gly Pro Gly Asp Arg Asp Pro Pro Thr Arg Gly Glu Glu Ser 465 470 475 480

CGG AGG CGG CCG ACT GGG GGC CGG GGT AGG GGA CCC CCA CCT GCC CCC 1488 Arg Arg Arg Pro Thr Gly Gly Arg Gly Arg Gly Pro Pro Pro Ala Pro 485 490 495

CGG CCC ACT TCG AGA TAC AAT TCT TCA TCT ATT AGC TCA GTG CTG AAG 1536 Arg Pro Thr Ser Arg Tyr Asn Ser Ser Ser Ile Ser Ser Val Leu Lys 500 505 510

GAT CCA GAC AGT AAT CCC TAC AGC CTA TTG GAC ACG TCT GAA CCA GAG 1584 Asp Pro Asp Ser Asn Pro Tyr Ser Leu Leu Asp Thr Ser Glu Pro Glu 515 520 525

CCC CCG GTT GAT TCA GAA CCT GGG GAA CCC CCC CCA GCA AGT GCC AGG 1632 Pro Pro Val Asp Ser Glu Pro Gly Glu Pro Pro Pro Ala Ser Ala Arg 530 535 540

CGC CGC CGC TCC CGC CGC CGC CGC ACT GAT GAA GAC AGG ACC GTC ATG 1680 Arg Arg Arg Ser Arg Arg Arg Arg Thr Asp Glu Asp Arg Thr Val Met 545 550 555 560

GAT GGA GGC CTG GAA TCA GAT GGG CCC AAC ATG ACA GAG AAT GGC CTG 1728 Asp Gly Gly Leu Glu Ser Asp Gly Pro Asn Met Thr Glu Asn Gly Leu 565 570 575

GAA GAT GAA TCA AGA CCT CAA CGT CGT AAT CGC AGC CGC CGC CGC CGT 1776 Glu Asp Glu Ser Arg Pro Gln Arg Arg Asn Arg Ser Arg Arg Arg Arg 580 585 590

AAC CGT GGT AAT CGG ACT GAT GGC TCT ATC AGT GGA GAC CGC CAG CCA 1824 Asn Arg Gly Asn Arg Thr Asp Gly Ser Ile Ser Gly Asp Arg Gln Pro 595 600 605

GTG ACT GTG GCT GAC TAT ATC TCA CGA GCA GAG TCT CAG AGC CGC CAG 1872 Val Thr Val Ala Asp Tyr Ile Ser Arg Ala Glu Ser Gln Ser Arg Gln 610 615 620

AGC GCA CCC CTG GAA CGC ACT AAA CCC TCA GAA GAC TCT CTT TCA GGA 1920 Ser Ala Pro Leu Glu Arg Thr Lys Pro Ser Glu Asp Ser Leu Ser Gly 625 630 635 640

CAG AAG GGT GAC TCT GTC AGC AAG CTT CCT AAG GGC CCC TCG GAG AAT 1968 Gln Lys Gly Asp Ser Val Ser Lys Leu Pro Lys Gly Pro Ser Glu Asn 645 650 655

GGG GAG CTC TCC GCC CCC TTG GAG TTG GGT AGT ATG GTG AAT GGG GTT 2016 Gly Glu Leu Ser Ala Pro Leu Glu Leu Gly Ser Met Val Asn Gly Val 660 665 670

TCA 2019 Ser

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 673 amino acid residues

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Gly Gly Leu Ala Ser Gly Gly Asp Val Glu Pro Gly Leu Pro Val 1 5 10 15

Glu Val Arg Gly Ser Asn Gly Ala Phe Tyr Lys Gly Phe Val Lys Asp 20 25 30

Val His Glu Asp Ser Val Thr Ile Phe Phe Glu Asn Asn Trp Gln Ser 35 40 45

Glu Arg Gln Ile Pro Phe Gly Asp Val Arg Thr Pro Pro Pro Ala Asp 50 55 60

Tyr Asn Lys Glu Ile Thr Glu Gly Asp Glu Val Glu Val Tyr Ser Arg 65 70 75 80

Ala Asn Glu Gln Glu Pro Cys Gly Trp Trp Thr Ala Arg Val Arg Met 85 90 95

Met Lys Gly Asp Phe Tyr Val Ile Glu Tyr Ala Ala Cys Asp Ala Thr 100 105 110

Tyr Asn Glu Ile Val Thr Leu Glu Arg Leu Arg Pro Val Asn Pro Asn 115 120 125

Pro Leu Ala Thr Lys Gly Ser Phe Phe Lys Val Thr Met Ala Val Pro 130 135 140

Glu Asp Leu Arg Glu Ala Cys Ser Asn Glu Asn Val His Lys Glu Phe 145 150 155 160

Lys Lys Ala Thr Gly Ala Asn Cys Ile Phe Thr Asn Ile Thr Asn Ser 165 170 175

Glu Leu Phe Ile Leu Ser Thr Thr Glu Ala Pro Val Lys Arg Ala Ser 180 185 190

Leu Leu Gly Asp Met His Phe Arg Ser Leu Arg Thr Lys Leu Leu Leu 195 200 205

Met Ser Arg Asn Glu Glu Ala Thr Lys His Leu Glu Thr Ser Lys Gln 210 215 220

Leu Ala Ala Ala Phe Gln Glu Glu Phe Thr Val Arg Glu Asp Leu Met 225 230 235 240

Gly Leu Ala Ile Gly Thr His Gly Ala Asn Ile Gln Gln Ala Arg Lys 245 250 255

Val Pro Gly Val Thr Ala Ile Glu Leu Gly Glu Glu Thr Cys Thr Phe 260 265 270

Arg Ile Tyr Gly Glu Thr Pro Glu Ala Cys Arg Gln Ala Arg Ser Tyr 275 280 285

Leu Glu Phe Ser Glu Asp Ser Val Gln Val Pro Arg Asn Leu Val Gly 290 295 300

Lys Val Ile Gly Lys Asn Gly Lys Val Ile Gln Glu Ile Val Asp Lys 305 310 315 320

Ser Gly Val Val Arg Val Arg Val Glu Gly Asp Asn Asp Lys Lys Asn 325 330 335

Pro Arg Glu Glu Gly Met Val Pro Phe Ile Phe Val Gly Thr Arg Glu 340 345 350

Asn Ile Ser Asn Ala Gln Ala Leu Leu Glu Tyr His Leu Ser Tyr Leu 355 360 365

Gln Glu Val Glu Gln Leu Arg Leu Glu Arg Leu Gln Ile Asp Glu Gln 370 375 380

Leu Arg Gln Ile Gly Leu Gly Phe Arg Pro Pro Gly Ser Gly Arg Gly 385 390 395 400

Ser Gly Gly Ser Asp Lys Ala Gly Tyr Ser Thr Asp Glu Ser Ser Ser 405 410 415

Ser Ser Leu His Ala Thr Arg Thr Tyr Gly Gly Ser Tyr Gly Gly Arg 420 425 430

Gly Arg Gly Arg Arg Thr Gly Gly Pro Ala Tyr Gly Pro Ser Ser Asp 435 440 445

Val Ser Thr Ala Ser Glu Thr Glu Ser Glu Lys Arg Glu Glu Pro Asn 450 455 460

Arg Ala Gly Pro Gly Asp Arg Asp Pro Pro Thr Arg Gly Glu Glu Ser 465 470 475 480

Arg Arg Arg Pro Thr Gly Gly Arg Gly Arg Gly Pro Pro Pro Ala Pro 485 490 495

Arg Pro Thr Ser Arg Tyr Asn Ser Ser Ser Ile Ser Ser Val Leu Lys 500 505 510

Asp Pro Asp Ser Asn Pro Tyr Ser Leu Leu Asp Thr Ser Glu Pro Glu 515 520 525

Pro Pro Val Asp Ser Glu Pro Gly Glu Pro Pro Pro Ala Ser Ala Arg 530 535 540

Arg Arg Arg Ser Arg Arg Arg Arg Thr Asp Glu Asp Arg Thr Val Met 545 550 555 560

Asp Gly Gly Leu Glu Ser Asp Gly Pro Asn Met Thr Glu Asn Gly Leu 565 570 575

Glu Asp Glu Ser Arg Pro Gln Arg Arg Asn Arg Ser Arg Arg Arg Arg 580 585 590

Asn Arg Gly Asn Arg Thr Asp Gly Ser Ile Ser Gly Asp Arg Gln Pro 595 600 605

Val Thr Val Ala Asp Tyr Ile Ser Arg Ala Glu Ser Gln Ser Arg Gln 610 615 620

Ser Ala Pro Leu Glu Arg Thr Lys Pro Ser Glu Asp Ser Leu Ser Gly 625 630 635 640

Gln Lys Gly Asp Ser Val Ser Lys Leu Pro Lys Gly Pro Ser Glu Asn 645 650 655

Gly Glu Thr Ser Ala Pro Leu Glu Leu Gly Ser Met Val Asn Gly Val 660 665 670

Ser

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: GATGACATTT CTAAGCTACA GC 22

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: TGTACAAGCA CTATTGTAAA TG 22

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: CAGGGTCATA CCCCCTCC 18

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: CTGAACGGTC AAATCTGGGT 20

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: Arg Arg Arg Arg Ser Arg Arg Arg Arg 1 5

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: Gly Lys Arg Cys Asp 1 5

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 4

(D) OTHER INFORMATION: /note= "Xaa at 4 is Ser or Pro"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: Asn Gly Val Xaa 1