Title:
UDP-N-ACETYLGLUCOSAMINE: GALACTOSE-β1,3-N-ACETYLGALACTOSAMINE-α-R/(GlcNAc to GalNAc) β1,6-N-ACETYLGLUCOSAMINYLTRANSFERASE, C2GnT3
Document Type and Number:
WIPO Patent Application WO/2001/014535
Kind Code:
A2
Abstract:
A novel gene defining a novel human UDP-GlcNAc: Galβ1-3GalNAcα β1,6GlcNAc-transferase, termed C2GnT3, with unique enzymatic properties is disclosed. The invention discloses isolated DNA molecules and DNA constructs encoding C2GnT3, as well as cloning and expression vectors including such DNA. The enzyme C2GnT3 and C2GnT3-active derivatives thereof are disclosed, in particular soluble derivatives comprising the catalytically active domain of C2GnT3. Further, the invention discloses methods of obtaining β1,6-N-acetylglucosaminyl glycosylated saccharides, glycopeptides or glycoproteins by use of an enzymatically active C2GnT3 protein or fusion protein thereof or by using cells stably transfected with a vector including DNA encoding an enzymatically active C2GnT3 protein. Methods are disclosed for the identification of agents with the ability to inhibit or stimulate the biological activity of C2GnT3.

Inventors:
SCHWIENTEK TILO (DK)
CLAUSEN HENRIK (DK)
Application Number:
PCT/DK2000/000469
Publication Date:
March 01, 2001
Filing Date:
August 24, 2000
Assignee:
SCHWIENTEK TILO (DK)
CLAUSEN HENRIK (DK)
International Classes:
A61K31/7088; A61K31/7105; G01N33/53; A61K38/45; A61K39/395; A61K45/00; A61K48/00; A61P5/00; A61P31/18; A61P35/00; A61P35/02; A61P35/04; A61P43/00; C07K16/40; C12N1/21; C12N5/10; C12N9/10; C12N15/09; C12P19/00; C12Q1/02; C12Q1/48; C12Q1/68; A61K38/00; C12R1/91; (IPC1-7): C12N15/00
Domestic Patent References:
WO1995007020A1 1995-03-16
Foreign References:
US5766910A 1998-06-16
Other References:
DATABASE EMBL [Online] 9 April 2000 (2000-04-09) SCHWIENTEK T ET AL: "Control of O-glycan branch formation. Molecular cloning and characterization of a novel thymus-associated core 2 beta1,6-N-acetylglucosaminyltransferase" retrieved from EMBL, accession no. AF132035 XP002901428 & J BIOL CHEM, vol. 275, no. 15, 2000, pages 11106-11113
Attorney, Agent or Firm:
Hofman-bang, A/s (Hans Bekkevolds Allé 7 Hellerup, DK)
Claims:
CLAIMS
1. An isolated nucleic acid encoding UDP-N-acetylglucosamine: Galactose-β1,3-N-acetylgalactosamine-α-R β1,6-N-acetylglucosaminyltransferase (C2GnT3) or a fragment hereof.
2. An isolated nucleic acid as defined in claim 1, wherein said nucleic acid is DNA.
3. An isolated nucleic acid as defined in claim 2, wherein said DNA is cDNA.
4. An isolated nucleic acid as defined in claim 2, wherein said DNA is genomic DNA.
5. An isolated nucleic acid as defined in any one of claims 1-4, wherein said nucleic acid comprises the sequence of nucleotides 1-1362 in SEQ ID NO: 1 and in Figure 1 or sequence-conservative or function-conservative variants thereof.
6. A nucleic acid which hybridizes under conditions of high stringency with the nucleic acid having the sequence of nucleotides 1-1362 in SEQ ID NO: 1 and in Figure 1.
7. A nucleic acid vector comprising a nucleotide sequence encoding C2GnT3 or fragments thereof.
8. A vector as defined in claim 7, wherein said nucleotide sequence comprises the sequence of nucleotides 1-1362 in SEQ ID NO: 1 and in Figure 1 or sequence-conservative or function-conservative variants thereof.
9. A vector as defined in claim 7 or 8, wherein said sequence encoding C2GnT3 is operably linked to a transcriptional regulatory element.
10. A cell comprising a vector as defined in any one of claims 7-9.
11. A cell as defined in claim 10, wherein said cell is stably transfected with said vector.
12. A cell as defined in claim 10 or 11, wherein said cell produces enzymatically active C2GnT3.
13. A cell as defined in any one of claims 10-12, wherein said cell is selected from the group consisting of bacterial, yeast, insect, avian, and mammalian cells.
14. A cell as defined in any one of claims 10-12, wherein said cell is Sf9.
15. A cell as defined in any one of claims 10-12, wherein said cell is CHO.
16. A method for producing C2GnT3 polypeptides, which comprises: (i) introducing into a host cell an isolated DNA molecule encoding a human C2GnT3, or a DNA construct comprising a DNA sequence encoding C2GnT3; (ii) growing the host cell under conditions suitable for human C2GnT3 expression; and (iii) isolating C2GnT3 produced by the host cell.
17. A method as defined in claim 16, wherein said enzymatically active C2GnT3 is selected from the group consisting of: (i) a polypeptide having the sequence set forth in SEQ ID NO: 2 and in Figure 1; (ii) a polypeptide consisting of amino acids 39-453 as set forth in SEQ ID NO: 2 and in Figure 1; (iii) a fusion polypeptide consisting of at least amino acids 39-453 as set forth in SEQ ID NO: 2 and in Figure 1 fused in frame to a second sequence, wherein said second sequence comprises an affinity ligand or a reactive group; and (iv) function-conservative variants of any of the foregoing.
18. A method of screening one or more agents for the ability to inhibit or stimulate C2GnT3 enzymatic activity in a cell-free or cell-based assay, which comprises: (i) contacting a C2GnT3 polypeptide according to claim 17, or contacting a cell that recombinantly expresses C2GnT3 polypeptide according to claims 10 to 15, with agents under assay conditions suitable for the detection of said enzymatic activity; and (ii) measuring whether said enzymatic activity is thereby inhibited or stimulated by one or more agents, wherein agents are selected from compounds, compositions, antibodies or antibody fragments, antisense sequences and ribozyme nucleotide sequences for C2GnT3 polypeptide.
19. A method as defined in claim 18, wherein one or more agents are selected from a combinatorial chemical library.
20. A method as defined in claim 18, wherein one or more agents are generated by methods of C2GnT3 polypeptide structure-based design.
21. A method for the identification of DNA sequence variations in the C2GnT3 gene, comprising the steps of: (i) isolating DNA from a patient; (ii) amplifying C2GnT3 genomic regions by PCR; and (iii) detecting the presence of DNA sequence variation by DNA sequencing, single-strand conformational polymorphism (SSCP) or mismatch mutation.
22. An isolated C2GnT3 polypeptide comprising an amino acid sequence of SEQ ID NO: 2.
23. An isolated polypeptide having at least 45% amino acid sequence identity to an amino acid sequence of SEQ ID NO: 2.
24. A polypeptide prepared in accordance with the method of claim 16 or 17.
25. An antibody having specificity against an epitope of a polypeptide as claimed in claim 22, 23, or 24.
26. A probe comprising a sequence encoding a polypeptide as claimed in claim 22, 23, or 24 or a part thereof.
27. A method of diagnosing and monitoring conditions mediated by a C2GnT3 polypeptide comprising determining the presence of a nucleic acid molecule as claimed in any one of claims 1-9 or a polypeptide as claimed in any one of claims 22-24.
28. A method as claimed in claim 27 wherein the condition is a thymus-related disorder.
29. A method for identifying a substance which associates with a polypeptide as claimed in any one of claims 22-24 comprising: (i) reacting the polypeptide with at least one substance which potentially can associate with the polypeptide, under conditions which permit the association between the substance and polypeptide; and (ii) removing or detecting polypeptide associated with the substance, wherein detection of associated polypeptide and substance indicates the substance associates with the polypeptide.
30. A method for evaluating a compound for its ability to modulate the biological activity of a polypeptide as claimed in any one of claims 22-24 comprising providing a known concentration of the polypeptide with a substance which associates with the polypeptide, and a test compound under conditions which permit the formation of complexes between the substance and polypeptide, and removing and/or detecting complexes.
31. A method for detecting a nucleic acid molecule encoding a C2GnT3 polypeptide in a biological sample comprising the steps of: (i) hybridizing the nucleic acid molecule of any one of claims 1-9 to nucleic acid of the biological sample, thereby forming a hybridization complex; and (ii) detecting the hybridization complex wherein the presence of the hybridization complex correlates with the presence of a nucleic acid molecule encoding the polypeptide in the biological sample.
32. A method as claimed in claim 31 wherein nucleic acids of the biological sample are amplified by the polymerase chain reaction prior to the hybridizing step.
33. A method for treating a condition mediated by a C2GnT3 polypeptide as claimed in any one of claims 22-24 comprising administering an effective amount of an antibody as claimed in claim 25 or a substance or compound identified in accordance with a method claimed in claim 29 or claim 30.
34. A method as claimed in claim 33 wherein the condition is a thymus-related disorder.
35. A composition comprising one or more of a nucleic acid molecule as claimed in any one of claims 1-9 or a polypeptide as claimed in any one of claims 22-24 or a substance or compound identified using a method as claimed in any one of claims 18-20, 29 and 30 and a pharmaceutically acceptable carrier, excipient or diluent.
36. Use of one or more of a nucleic acid molecule as claimed in any one of claims 1-9 or a polypeptide as claimed in any one of claims 22-24 or a substance or compound identified using a method as claimed in any one of claims 18-20, 29 and 30 in the preparation of a pharmaceutical composition for treating a condition mediated by a C2GnT3 polypeptide.
37. A gene-based therapy directed at the thymus comprising a polynucleotide comprising all or a portion of a regulatory sequence of SEQ ID NO: 1.
38. A method for preparing an oligosaccharide comprising contacting a reaction mixture comprising an activated GlcNAc, and an acceptor in the presence of an enzymatically active polypeptide as claimed in any one of claims 22-24.
Description:
UDP-N-Acetylglucosamine: Galactose-β1,3-N-Acetylgalactosamine-α-R/(GlcNAc to GalNAc) β1,6-N-Acetylglucosaminyltransferase, C2GnT3

TECHNICAL FIELD

The present invention relates generally to the biosynthesis of glycans found as free oligosaccharides or covalently bound to proteins and glycolipids. In particular, this invention relates to a family of nucleic acids encoding UDP-N-acetylglucosamine: N-acetylgalactosamine-β1,6-N-acetylglucosaminyltransferases (Core-β1,6-N-acetylglucosaminyltransferases), which add N-acetylglucosamine to the hydroxy group at C6 of 2-acetamido-2-deoxy-D-galactosamine (GalNAc) in O-glycans of the core 1 and the core 3 type, thereby forming the core 2 and core 4 types.

Previously two members of this family have been identified and designated C2GnT1 and C2GnT2.

This invention is more particularly related to a gene encoding a third member of this family of O-glycan β1,6-N-acetylglucosaminyltransferases, termed C2GnT3, probes to the DNA encoding C2GnT3, DNA constructs comprising DNA encoding C2GnT3, recombinant plasmids and recombinant methods for producing C2GnT3, recombinant methods for stably transforming or transfecting cells for expression of C2GnT3, methods for identification of agents with the ability to inhibit or stimulate C2GnT3 biological activity, and methods for identification of DNA polymorphism in patients. In the U.S. Provisional Patent Application No. 60/150,488 filed on August 24, 1999, from which the present application claims priority, this novel Core 2 β6GlcNAc-transferase isoform was identified and designated C2GnTII. The designation C2GnTII has here been replaced by the designation C2GnT3 in accordance with its scientific publication (14).

BACKGROUND OF THE INVENTION

O-linked protein glycosylation involves an initiation stage in which a family of N-acetylgalactosaminyltransferases catalyzes the addition of N-acetylgalactosamine to Serine or Threonine residues (1). Further assembly of O-glycan chains involves several successive or alternative biosynthetic reactions: i) formation of simple mucin-type core 1 structures by UDP-Gal: GalNAcα-R β1,3Gal-transferase activity; ii) conversion of core 1 to complex-type core 2 structures by UDP-GlcNAc: Galβ1-3GalNAcα-R β1,6GlcNAc-transferase activities; iii) direct formation of complex mucin-type core 3 by UDP-GlcNAc: GalNAcα β1,3GlcNAc-transferase activities; and iv) conversion of core 3 to core 4 by UDP-GlcNAc: GlcNAcβ1-3GalNAcα-R β1,6GlcNAc-transferase activity. The formation of β1,6GlcNAc branches (reactions ii and iv) may be considered a key controlling event of O-linked protein glycosylation leading to structures produced upon differentiation and malignant transformation (2-6). For example, increased formation of GlcNAcβ1-6GalNAc branching in O-glycans has been demonstrated during T-cell activation, during the development of leukemia, and for immunodeficiencies like Wiskott-Aldrich syndrome and AIDS (7; 8). Core 2 branching may play a role in tumor progression and metastasis (9). In contrast, many carcinomas show changes from complex O-glycans found in normal cell types to immaturely processed simple mucin-type O-glycans such as T (Thomsen-Friedenreich antigen; Galβ1-3GalNAcα1-R), Tn (GalNAcα1-R), and sialosyl-Tn (NeuAcα2-6GalNAcα1-R) (10). The molecular basis for this has been extensively studied in breast cancer, where it was shown that specific downregulation of a core 2 β6GlcNAc-transferase was responsible for the observed lack of complex type O-glycans on the mucin MUC1 (6). O-glycan core assembly may therefore be controlled by inverse changes in the expression level of Core-β1,6-N-acetylglucosaminyltransferases and the sialyltransferases forming sialyl-T and sialyl-Tn.

Interestingly, the metastatic potential of tumors has been correlated with increased expression of core 2 β6GlcNAc-transferase activity (5). The increase in core 2 β6GlcNAc-transferase activity was associated with increased levels of poly N-acetyllactosamine chains carrying sialyl-LeX, which may contribute to tumor metastasis by altering selectin-mediated adhesion (4; 11). The control of O-glycan core assembly is regulated by the expression of key enzyme activities; however, epigenetic factors including posttranslational modification, topology, or competition for substrates may also play a role in this process (11).

Changes in surface carbohydrates of T-cells have been identified during development and activation. O-glycan branches of the core 2 type are restricted to immature thymocytes of the thymal cortex but are no longer exposed on the surface of mature medullary thymocytes (17). Core 2 structures on T-cell surface proteins are ligands for the S-type lectin galectin-1, which participates in thymocyte-thymic epithelia interaction (18). The elimination of Core 2 structures from the thymocyte cell surface was found to be essential for controlled apoptosis mediated by galectin-1 (19).

Core 2 β6GlcNAc-transferase activity is carried out by more than one enzyme isoform. The first Core 2 β6GlcNAc-transferase isoform was initially identified as a critical enzyme in blood cell development and differentiation and designated leukocyte form or L-Form (C2GnT-L) (12). The gene encoding C2GnT-L has been cloned by expression cloning from a cDNA library of the human promyelocytic leukemia cell line HL-60 (13). This gene has now been renamed as C2GnT1 (14). Using the C2GnT1 sequence as a probe for BLAST analysis of the human expressed sequence tag database, a homologous gene encoding a second Core 2 β6GlcNAc-transferase isoform has been identified and designated C2/4GnT (15) and C2GnT-M (16). This gene has now been renamed as C2GnT2 (14).

C2GnT1 was predicted to control synthesis of core 2 selectin ligands in leukocytes and lymphoid tissues; however, mice deficient in C2GnT1 exhibited only partial reduction in selectin ligand production and no significant changes in lymphocyte homing properties (Ellies, L. G., et al. 1998, Immunity 9: 881-890). One possible explanation for these results would be the expression of additional Core 2 β6GlcNAc-transferases. C2GnT2 does not appear to be a candidate, as its expression pattern is restricted to mucous secreting organs (15, 16).

Consequently, there exists a need in the art for detecting as yet unidentified UDP-N-acetylglucosamine: Galactose-β1,3-N-acetylgalactosamine-α-R (GlcNAc to GalNAc) β1-6 N-acetylglucosaminyltransferases and identifying the primary structures of the genes encoding such enzymes. The present invention meets this need, and further presents other related advantages.

SUMMARY OF THE INVENTION

The present invention provides isolated nucleic acids encoding human UDP-N-acetylglucosamine: N-acetylgalactosamine β1,6 N-acetylglucosaminyltransferase 3 (C2GnT3), including cDNA and genomic DNA. C2GnT3 has acceptor substrate specificities comparable to C2GnT1 (14). The complete nucleotide sequence encoding C2GnT3 is set forth in SEQ ID NO: 1 and in Figure 1.

Variations in one or more nucleotides may exist among individuals within a population due to natural allelic variation. Any and all such nucleic acid variations are within the scope of the invention. DNA sequence polymorphisms may also occur which lead to changes in the amino acid sequence of a C2GnT3 polypeptide. These amino acid polymorphisms are also within the scope of the present invention. In addition, species variations, i.e., variations in nucleotide sequence naturally occurring among different species, are within the scope of the invention.

Among Core 2 β6GlcNAc-transferases, C2GnT3 appears to be the dominant isoform in thymus (14). Thus, C2GnT3 is likely to have important functions during thymocyte development as well as T-cell maturation and homing (14).

The identification of agents with the ability to inhibit or stimulate C2GnT3 enzymatic activity therefore has the potential for both diagnostic and therapeutic purposes of related diseases.

Access to the gene encoding C2GnT3 allows production of a glycosyltransferase for use in formation of core 2-based O-glycan modifications on oligosaccharides, glycoproteins and glycosphingolipids. This enzyme can be used, for example, in pharmaceutical or other commercial applications that require synthetic addition of core 2-based O-glycans to these or other substrates, in order to produce appropriately glycosylated glycoconjugates having particular enzymatic, immunogenic, or other biological and/or physical properties.

In one aspect, the invention encompasses isolated nucleic acids comprising the nucleotide sequence of nucleotides 1-1362 as set forth in Figure 1 or sequence-conservative or function-conservative variants thereof. Also provided are isolated nucleic acids hybridizable with nucleic acids having the sequence as set forth in Figure 1 or fragments thereof or sequence-conservative or function-conservative variants thereof; preferably, the nucleic acids are hybridizable with C2GnT3 sequences under conditions of intermediate stringency, and, most preferably, under conditions of high stringency. In one embodiment, the DNA sequence encodes the amino acid sequence shown in Figure 1, from methionine (amino acid no. 1) to serine (amino acid no. 453). In another embodiment, the DNA sequence encodes an amino acid sequence comprising a sequence from proline (no. 39) to serine (no. 453) of the amino acid sequence set forth in Figure 1.
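The residue and nucleotide coordinates quoted above can be cross-checked with simple codon arithmetic. The following is a minimal Python sketch, assuming 1-based numbering that starts at the first base of the initiator methionine codon; the function name `codon_span` is illustrative and not part of the disclosure. Under that assumption, residues 39-453 map to bases 115-1359, which agrees with the probe coordinates given in the description of Figure 3, and the three bases 1360-1362 would correspond to a terminal stop codon (an inference, not a statement from the specification).

```python
from typing import Tuple


def codon_span(first_aa: int, last_aa: int) -> Tuple[int, int]:
    """Return the 1-based, inclusive nucleotide range that encodes
    residues first_aa..last_aa, assuming base 1 is the first base of codon 1."""
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    start = 3 * (first_aa - 1) + 1  # first base of the first codon
    end = 3 * last_aa               # last base of the last codon
    return start, end


# Full-length polypeptide (Met 1 to Ser 453) and the soluble fragment (Pro 39 to Ser 453).
full_length = codon_span(1, 453)
soluble = codon_span(39, 453)

# Bases of the 1-1362 sequence not accounted for by the 453 sense codons
# (assumed here to be the stop codon).
remaining_bases = 1362 - full_length[1]

print(full_length, soluble, remaining_bases)
```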

In a related aspect, the invention provides nucleic acid vectors comprising C2GnT3 DNA sequences, including but not limited to those vectors in which the C2GnT3 DNA sequence is operably linked to a transcriptional regulatory element, with or without a polyadenylation sequence. Cells comprising these vectors are also provided, including without limitation transiently and stably expressing cells. Viruses, including bacteriophages, comprising C2GnT3-derived DNA sequences are also provided. The invention also encompasses methods for producing C2GnT3 polypeptides. Cell-based methods include without limitation those comprising: introducing into a host cell an isolated DNA molecule encoding C2GnT3, or a DNA construct comprising a DNA sequence encoding C2GnT3; growing the host cell under conditions suitable for C2GnT3 expression; and isolating C2GnT3 produced by the host cell. A method for generating a host cell with de novo stable expression of C2GnT3 comprises: introducing into a host cell an isolated DNA molecule encoding C2GnT3 or an enzymatically active fragment thereof (such as, for example, a polypeptide comprising amino acids 39-453 of the sequence set forth in Figure 1), or a DNA construct comprising a DNA sequence encoding C2GnT3 or an enzymatically active fragment thereof; selecting and growing host cells in an appropriate medium; and identifying stably transfected cells expressing C2GnT3. The stably transfected cells may be used for the production of C2GnT3 enzyme for use as a catalyst and for recombinant production of peptides or proteins with appropriate glycosylation. For example, eukaryotic cells, whether normal or diseased cells, having their glycosylation pattern modified by stable transfection as above, or components of such cells, may be used to deliver specific glycoforms of glycopeptides and glycoproteins, such as, for example, as immunogens for vaccination.

In yet another aspect, the invention provides isolated C2GnT3 polypeptides, including without limitation polypeptides having the sequence set forth in Figure 1, polypeptides having the sequence of amino acids 39-453 as set forth in Figure 1, and a fusion polypeptide consisting of at least amino acids 39-453 as set forth in Figure 1 fused in frame to a second sequence, which may be any sequence that is compatible with retention of C2GnT3 enzymatic activity in the fusion polypeptide. Suitable second sequences include without limitation those comprising an affinity ligand or a reactive group.

In a related aspect, methods are disclosed for the identification of agents with the ability to inhibit or stimulate the enzymatic activity of C2GnT3. Assays utilizing C2GnT3 to screen for potential inhibitors or stimulators thereof are encompassed by the invention. Furthermore, methods of using C2GnT3 in the structure-based design of inhibitors or stimulators thereof are also an aspect of the invention. Such a design would comprise the steps of determining the three-dimensional structure of the C2GnT3 polypeptide, analyzing the three-dimensional structure for the likely binding sites of donor and/or acceptor substrates, synthesis of a molecule that incorporates a predictive reactive site, and determining the inhibiting or stimulating activity of the molecule.

In another aspect of the present invention, methods are disclosed for screening for mutations in the coding region of the C2GnT3 gene using genomic DNA isolated from, e.g., blood cells of patients. In one embodiment, the method comprises: isolation of DNA from a patient; PCR amplification of the coding exon; DNA sequencing of amplified exon DNA fragments and establishing therefrom potential structural defects of the C2GnT3 gene associated with disease.

In accordance with an aspect of the invention there is provided a method of, and products for (i.e., kits), diagnosing and monitoring conditions mediated by C2GnT3 by determining the presence of nucleic acid molecules and polypeptides of the invention.

Still further the invention provides a method for evaluating a test compound for its ability to modulate the biological activity of a C2GnT3 polypeptide of the invention. For example, a substance that inhibits or enhances the catalytic activity of a C2GnT3 polypeptide may be evaluated. "Modulate" refers to a change or an alteration in the biological activity of a polypeptide of the invention. Modulation may be an increase or a decrease in activity, a change in characteristics, or any other change in the biological, functional, or immunological properties of the polypeptide.

Compounds which modulate the biological activity of a polypeptide of the invention may also be identified using the methods of the invention by comparing the pattern and level of expression of a nucleic acid molecule or polypeptide of the invention in biological samples, tissues and cells, in the presence, and in the absence of the compounds.

In an embodiment of the invention a method is provided for screening a compound for effectiveness as an antagonist of a polypeptide of the invention, comprising the steps of a) contacting a sample containing said polypeptide with a compound, under conditions wherein antagonist activity of said polypeptide can be detected, and b) detecting antagonist activity in the sample.

Methods are also contemplated that identify compounds or substances (e.g., polypeptides), which interact with C2GnT3 nucleic acid regulatory sequences (e.g., promoter sequences, enhancer sequences, negative modulator sequences).

The nucleic acids, polypeptides, and substances and compounds identified using the methods of the invention, may be used to modulate the biological activity of a C2GnT3 polypeptide of the invention, and they may be used in the treatment of conditions mediated by C2GnT3 such as proliferative diseases including cancer, and thymus-related disorders. Accordingly, the nucleic acids, polypeptides, substances and compounds may be formulated into compositions for administration to individuals suffering from one or more of these conditions. Therefore, the present invention also relates to a composition comprising one or more of a polypeptide, nucleic acid molecule, or substance or compound identified using the methods of the invention, and a pharmaceutically acceptable carrier, excipient or diluent. A method for treating or preventing these conditions is also provided comprising administering to a patient in need thereof, a composition of the invention.

The present invention in another aspect provides means necessary for production of gene-based therapies directed at the thymus. These therapeutic agents may take the form of polynucleotides comprising all or a portion of a nucleic acid of the invention comprising a regulatory sequence of a C2GnT3 nucleic acid placed in appropriate vectors or delivered to target cells in more direct ways.

Having provided a novel C2GnT3, and nucleic acids encoding same, the invention accordingly further provides methods for preparing oligosaccharides. In specific embodiments, the invention relates to a method for preparing an oligosaccharide comprising contacting a reaction mixture comprising a donor substrate, and an acceptor substrate in the presence of a C2GnT3 polypeptide of the invention.

In accordance with a further aspect of the invention, there are provided processes for utilizing polypeptides or nucleic acid molecules, for in vitro purposes related to scientific research, synthesis of DNA, and manufacture of vectors.

These and other aspects of the present invention will become evident upon reference to the following detailed description and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 depicts the DNA sequence of the C2GnT3 gene (accession # AF132035) and the predicted amino acid sequence of C2GnT3. The amino acid sequence is shown in single letter code. The hydrophobic segment representing the putative transmembrane domain is double underlined. Four consensus motifs for N-glycosylation are indicated by asterisks. The location of the primers used for preparation of the expression constructs are indicated by single underlining.

Figure 2 is an illustration of a sequence comparison between human C2GnT3 (accession # AF132035), human C2GnT2 (formerly designated C2/4GnT; accession # AF038650), human C2GnT1 (formerly designated C2GnT-L; accession # M97347), and human IGnT (accession # Z19550). Introduced gaps are shown as hyphens, and aligned identical residues are box-shaded (black for all sequences, dark grey for three sequences, and light grey for two sequences). The putative transmembrane domains are boxed. The positions of conserved cysteines are indicated by asterisks. One conserved N-glycosylation site is indicated by an open circle.

Figure 3 depicts Northern blot analyses of healthy human adult and fetal tissues. Panel A: loading pattern for the human mRNA master blot (CLONTECH). Dots in row H contain 100 ng (H1-H7) or 500 ng (H8) of control DNA or RNA. Panel B: autoradiogram of master blot expression analysis using a 32P-labeled C2GnT3 probe corresponding to the soluble expression fragment of C2GnT3 (base pairs 115-1359). Panel C: A multiple human tissue northern blot (MTN II from Clontech) was probed as described for panel B.

Figure 4 shows a PCR analysis of C2GnT3 expression in human blood cell fractions. PCR amplifications with primers specific for human C2GnT3 (C2GnT3) or GAPDH (G3PDH) were performed on a normalized human blood cell cDNA panel (MTC from Clontech) for 31 cycles.

Figure 5 is a schematic representation of forward and reverse PCR primers that can be used to amplify the coding exon of the C2GnT3 gene. The sequences of the primers TSHC119 and TSHC123 are also shown.

DETAILED DESCRIPTION OF THE INVENTION

All patent applications, patents, and literature references cited in this specification are hereby incorporated by reference in their entirety. In the case of conflict, the present description, including definitions, is intended to control.

DEFINITIONS

1. "Nucleic acid" or "polynucleotide" as used herein refers to purine- and pyrimidine-containing polymers of any length, either polyribonucleotides or polydeoxyribonucleotides or mixed polyribo-polydeoxyribo nucleotides. This includes single- and double-stranded molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids, as well as "protein nucleic acids" (PNA) formed by conjugating bases to an amino acid backbone. This also includes nucleic acids containing modified bases (see below).

2. "Complementary DNA or cDNA" as used herein refers to a DNA molecule or sequence that has been enzymatically synthesized from the sequences present in a mRNA template, or a clone of such a DNA molecule. A "DNA Construct" is a DNA molecule or a clone of such a molecule, either single- or double-stranded, which has been modified to contain segments of DNA that are combined and juxtaposed in a manner that would not otherwise exist in nature. By way of non-limiting example, a cDNA or DNA which has no introns is inserted adjacent to, or within, exogenous DNA sequences.

3. A plasmid or, more generally, a vector, is a DNA construct containing genetic information that may provide for its replication when inserted into a host cell. A plasmid generally contains at least one gene sequence to be expressed in the host cell, as well as sequences that facilitate such gene expression, including promoters and transcription initiation sites. It may be a linear or closed circular molecule.

4. Nucleic acids are "hybridizable" to each other when at least one strand of one nucleic acid can anneal to another nucleic acid under defined stringency conditions. Stringency of hybridization is determined, e.g., by a) the temperature at which hybridization and/or washing is performed, and b) the ionic strength and polarity (e.g., formamide) of the hybridization and washing solutions, as well as other parameters. Hybridization requires that the two nucleic acids contain substantially complementary sequences; depending on the stringency of hybridization, however, mismatches may be tolerated. Typically, hybridization of two sequences at high stringency (such as, for example, in an aqueous solution of 0.5X SSC, at 65°C) requires that the sequences exhibit some high degree of complementarity over their entire sequence. Conditions of intermediate stringency (such as, for example, an aqueous solution of 2X SSC at 65°C) and low stringency (such as, for example, an aqueous solution of 2X SSC at 55°C), require correspondingly less overall complementarity between the hybridizing sequences. (1X SSC is 0.15 M NaCl, 0.015 M Na citrate).
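The example conditions in definition 4 can be restated as salt concentrations by scaling the 1X SSC composition given there. The following Python sketch is illustrative only: the dictionary and function names are assumptions introduced for this example, and the three entries simply restate the exemplary conditions in the text rather than define the stringency classes.

```python
# Composition of 1X SSC in mol/L, as stated in definition 4.
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}

# Exemplary conditions from definition 4: (SSC fold concentration, temperature in deg C).
EXAMPLE_CONDITIONS = {
    "high": (0.5, 65),
    "intermediate": (2.0, 65),
    "low": (2.0, 55),
}


def ssc_molarity(fold: float) -> dict:
    """Scale the 1X SSC composition to the given fold concentration."""
    return {salt: round(fold * conc, 4) for salt, conc in SSC_1X.items()}


for name, (fold, temp_c) in EXAMPLE_CONDITIONS.items():
    salts = ssc_molarity(fold)
    print(f"{name}: {fold}X SSC at {temp_c} C, "
          f"NaCl {salts['NaCl']} M, Na citrate {salts['Na citrate']} M")
```

In these examples the high-stringency condition differs from the intermediate one only by lower salt, and the low-stringency condition differs from the intermediate one only by lower temperature.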

5. An "isolated" nucleic acid or polypeptide as used herein refers to a component that is removed from its original environment (for example, its natural environment if it is naturally occurring). An isolated nucleic acid or polypeptide contains less than about 50%, preferably less than about 75%, and most preferably less than about 90%, of the cellular components with which it was originally associated.

6. A "probe" refers to a nucleic acid that forms a hybrid structure with a sequence in a target region due to complementarity of at least one sequence in the probe with a sequence in the target region.

7. A nucleic acid that is "derived from" a designated sequence refers to a nucleic acid sequence that corresponds to a region of the designated sequence. This encompasses sequences that are homologous or complementary to the sequence, as well as "sequence-conservative variants" and "function-conservative variants". Sequence-conservative variants are those in which a change of one or more nucleotides in a given codon position results in no alteration in the amino acid encoded at that position. Function-conservative variants of C2GnT3 are those in which a given amino acid residue in the polypeptide has been changed without altering the overall conformation and enzymatic activity (including substrate specificity) of the native polypeptide; these changes include, but are not limited to, replacement of an amino acid with one having similar physico-chemical properties (such as, for example, acidic, basic, hydrophobic, and the like).
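The definition of a sequence-conservative variant may be illustrated by a short check that two coding sequences of equal length differ at the nucleotide level yet encode the same amino acids. This is a minimal sketch assuming the standard genetic code; the function names and the 18-nucleotide example sequence are hypothetical and are not taken from SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence read in frame from its first base."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_sequence_conservative(reference, variant):
    """True if the variant differs in nucleotides but not in encoded residues."""
    reference, variant = reference.upper(), variant.upper()
    return (
        len(reference) == len(variant)
        and reference != variant
        and translate(reference) == translate(variant)
    )


if __name__ == "__main__":
    ref = "ATGAAGATATTCAAATGT"   # hypothetical example sequence
    silent = "ATGAAAATATTCAAATGT"  # AAG -> AAA, both encode lysine
    print(translate(ref), is_sequence_conservative(ref, silent))
```

A function-conservative variant, by contrast, changes the encoded residue and cannot be recognized from the codon table alone, since it depends on conformation and enzymatic activity.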

8. A "donor substrate" is a molecule recognized by, e.g., a core β1,6-N-acetylglucosaminyltransferase and that contributes an N-acetylglucosaminyl moiety for the transferase reaction. For C2GnT3, a donor substrate is UDP-N-acetylglucosamine. An "acceptor substrate" is a molecule, preferably a saccharide or oligosaccharide, that is recognized by, e.g., an N-acetylglucosaminyltransferase and that is the target for the modification catalyzed by the transferase, i.e., receives the N-acetylglucosaminyl moiety. For C2GnT3, acceptor substrates include without limitation oligosaccharides, glycoproteins, O-linked core 1-glycopeptides, and glycosphingolipids comprising the sequences Galβ1-3GalNAc, or GlcNAcβ1-3GalNAc.

9. In accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, for example, Sambrook, Fritsch, Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989) (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); DNA Cloning: A Practical Approach, Volumes I and II (D. N. Glover ed. 1985); Oligonucleotide Synthesis (M. J. Gait ed. 1984); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1985); Transcription and Translation (B. D. Hames & S. J. Higgins eds. 1984); Animal Cell Culture (R. I. Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); and B. Perbal, A Practical Guide to Molecular Cloning (1984).

10. The terms "sequence similarity" or "sequence identity" refer to the relationship between two or more amino acid or nucleic acid sequences, determined by comparing the sequences, which relationship is generally known as "homology". Identity in the art also means the degree of sequence relatedness between amino acid or nucleic acid sequences, as the case may be, as determined by the match between strings of such sequences. Both identity and similarity can be readily calculated (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, New York, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M. Stockton Press, New York, 1991). While there are a number of existing methods to measure identity and similarity between two amino acid sequences or two nucleic acid sequences, both terms are well known to the skilled artisan (Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, New York, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M. Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J. Applied Math., 48: 1073, 1988). Preferred methods for determining identity are designed to give the largest match between the sequences tested. Methods to determine identity are codified in computer programs. Preferred computer program methods for determining identity and similarity between two sequences include but are not limited to the GCG program package (20), BLASTP, BLASTN, and FASTA (21).

Identity or similarity may also be determined using the alignment algorithm of Dayhoff et al. (Methods in Enzymology 91: 524-545 (1983)).

Preferably the nucleic acids of the present invention have substantial sequence identity using the preferred computer programs cited herein, for example greater than 40%, 45%, 50%, 60%, 70%, 75%, 80%, 85%, or 90% identity; more preferably at least 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence shown in SEQ ID NO: 1 and Figure 1.
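For illustration only, percent identity over a pair of sequences that have already been aligned can be computed as sketched below. This is a simplified stand-in and not the method of the GCG, BLAST or FASTA programs cited above, each of which defines its own alignment and scoring. The function name, the gap symbol and the choice of denominator (all aligned columns except those where both sequences carry a gap) are assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Both inputs must already be aligned to the same length. Columns that are
    gaps in both sequences are skipped; a gap opposite a residue is a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 4 identical columns out of 6 compared.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```

Because the denominator convention varies between programs, a threshold such as 95% identity is meaningful only together with the program and parameters used to obtain it.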

11. The polypeptides of the invention also include homologs of a C2GnT3 polypeptide and/or truncations thereof as described herein. Such homologs include polypeptides whose amino acid sequences are comprised of the amino acid sequences of C2GnT3 polypeptide regions from other species that hybridize under selected hybridization conditions (see discussion of hybridization conditions, in particular stringent hybridization conditions, herein) with a probe used to obtain a C2GnT3 polypeptide. These homologs will generally have the same regions which are characteristic of a C2GnT3 polypeptide. It is anticipated that a polypeptide comprising an amino acid sequence which has at least 40% identity or at least 60% similarity, preferably at least 60-65% identity or at least 80-85% similarity, more preferably at least 70-80% identity or at least 90-95% similarity, most preferably at least 95% identity or at least 99% similarity with the amino acid sequence shown in SEQ ID NO: 2 and Figures 1 and 2, will be a homolog of a C2GnT3 polypeptide. A percent amino acid sequence similarity or identity is calculated using the methods described herein, preferably the computer programs described herein.

IDENTIFICATION AND CLONING OF C2GnT3

The present invention provides the isolated DNA molecules, including genomic DNA and cDNA, encoding the UDP-N-acetylglucosamine: N-acetylgalactosamine β1,6-N-acetylglucosaminyltransferase 3 (C2GnT3).

C2GnT3 was identified by analysis of genomic survey sequences (GSS), and cloned based on a genomic clone obtained from a human foreskin fibroblast library. The cloning strategy may be briefly summarized as follows: 1) isolation and sequencing of GSS clone CIT-HSP-2288B17.TF (GSS GenBank accession number AQ005888); 2) synthesis of oligonucleotides derived from GSS sequence information, designated TSHC96 and TSHC101; 3) identification, cloning and sequencing of genomic P1 clone GS22597 #844/B1; 4) identification of a novel cDNA sequence corresponding to C2GnT3; 5) confirmatory sequencing of a cDNA clone obtained by reverse-transcription-polymerase chain reaction (RT-PCR) using human thymus poly A-mRNA; 6) construction of expression constructs; 7) expression of the cDNA encoding C2GnT3 in Sf9 (Spodoptera frugiperda) cells. More specifically, the isolation of a representative DNA molecule encoding a novel third member of the mammalian UDP-N-acetylglucosamine: β-N-acetylgalactosamine β1,6-N-acetylglucosaminyltransferase family involved the procedures described below.

Identification of DNA Homologous to C2/4GnT (C2GnT2)

Database searches were performed with the coding sequence of the human C2/4GnT (C2GnT2) sequence (13) using the BLASTn and the tBLASTn algorithm with the GSS database at The National Center for Biotechnology Information, USA. The BLASTn algorithm was used to identify clones representing the query gene (identities of > 95%), whereas tBLASTn was used to identify non-identical, but similar GSS sequences. GSSs with 50-90% nucleotide sequence identity were regarded as different from the query sequence. Two GSS clones with several apparent short sequence motifs and cysteine residues arranged with similar spacing were selected for further sequence analysis.

Cloning of Human C2GnT3

GSS clone CIT-HSP-2288B17.TF (GSS GenBank accession number AQ005888), derived from a putative homologue to C2/4GnT (C2GnT2), was obtained from Research Genetics Inc., USA. Sequencing of this clone revealed a partial open reading frame with significant sequence similarity to C2/4GnT (C2GnT2). The coding region of human C2GnT-L (C2GnT1), C2/4GnT (C2GnT2) and a bovine homologue was previously found to be organized in one exon ((22), (15)). Since the 3' sequence available from the C2GnT3 GSS was incomplete but likely to be located in a single exon, the missing 3' portion of the open reading frame was obtained by sequencing a genomic P1 clone. The P1 clone was obtained from a human foreskin genomic P1 library (DuPont Merck Pharmaceutical Co. Human Foreskin Fibroblast P1 Library) by screening with the primer pair TSHC96 (5'-GGTTTCACCGTCTCCAACATA-3', SEQ ID NO: 3) and TSHC101 (5'-TCGTAAGGCACCTGATACTT-3', SEQ ID NO: 6).

One genomic clone for C2GnT3, GS22597 #844/B1, was obtained from Genome Systems Inc. DNA from P1 phage was prepared as recommended by Genome Systems Inc. The entire coding sequence of the C2GnT3 gene was represented in the clone and sequenced in full using automated sequencing (ABI377, Perkin-Elmer). Confirmatory sequencing was performed on a cDNA clone obtained by PCR (30 cycles at 95 °C for 10 sec; 55 °C for 15 sec and 68 °C for 2 min 30 sec) on cDNA from human thymus poly A-mRNA with the sense primer TSHC99 (5'-CGAGGATCCAGAATGAAGATATTCAAATGTTA-3', SEQ ID NO: 4) and the anti-sense primer TSHC121 (5'-AGCGAATTCTTACTATCATGATGTGGTAGTG-3', SEQ ID NO: 9).
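As a worked example of the cycling conditions stated above, the programmed hold time of the PCR can be totalled as follows. The sketch counts only the three stated steps per cycle; ramp times, any initial denaturation and any final extension are not given in the text and are therefore omitted. The variable names are hypothetical.

```python
# Cycling parameters as stated in the text.
CYCLES = 30
STEP_SECONDS = {
    "denaturation_95C": 10,
    "annealing_55C": 15,
    "extension_68C": 2 * 60 + 30,  # 2 min 30 sec
}

seconds_per_cycle = sum(STEP_SECONDS.values())
total_seconds = CYCLES * seconds_per_cycle
total_minutes = total_seconds / 60

if __name__ == "__main__":
    print(f"{seconds_per_cycle} s per cycle, {total_minutes} min in total")
```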

The composite sequence contained an open reading frame of 1359 base pairs encoding a putative protein of 453 amino acids with type II domain structure predicted by the TMpred algorithm at the Swiss Institute for Experimental Cancer Research (ISREC) (http://www.ch.embnet.org/software/TMPREDform.html).
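The relationship between the stated open reading frame length and the stated protein length can be verified arithmetically. In the sketch below the two constants are taken from the text; the function name is hypothetical. That 1359 = 453 x 3 suggests, as an inference rather than a statement of the source, that the 1359 base pair figure does not count a terminal stop codon.

```python
ORF_LENGTH_BP = 1359      # from the text
PROTEIN_LENGTH_AA = 453   # from the text


def codon_count(orf_length_bp):
    """Number of complete codons in an open reading frame of the given length."""
    if orf_length_bp % 3 != 0:
        raise ValueError("open reading frame length must be a multiple of 3")
    return orf_length_bp // 3


if __name__ == "__main__":
    print(codon_count(ORF_LENGTH_BP) == PROTEIN_LENGTH_AA)
```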

Expression of C2GnT3

An expression construct designed to encode amino acid residues 39-453 of C2GnT3 was prepared by PCR using P1 DNA, and the primer pair TSHC100 (5'-CGAGGATCCGCAAAAAGACATTTACTTGGTT-3', SEQ ID NO: 5) and TSHC121 (5'-AGCGAATTCTTACTATCATGATGTGGTAGTG-3', SEQ ID NO: 9) with BamHI and EcoRI restriction sites, respectively (Fig. 2). The PCR product was cloned between the BamHI and EcoRI sites of pAcGP67A (PharMingen), and the insert was fully sequenced. pAcGP67-C2GnT3-sol was co-transfected with Baculo-Gold DNA (PharMingen) as described previously (23). Recombinant Baculovirus was obtained after two successive amplifications in Sf9 cells grown in serum-containing medium, and titers of virus were estimated by titration in 24-well plates with monitoring of enzyme activities. Transfection of Sf9 cells with pAcGP67-C2GnT3-sol resulted in a marked increase in GlcNAc-transferase activity compared to uninfected cells or cells infected with a control construct. C2GnT3 showed significant activity with disaccharide derivatives of O-linked core 1 (Galβ1-3GalNAcα1-R). In contrast, no activity was found with core 3 structures (GlcNAcβ1-3GalNAcα1-R), lacto-N-neotetraose or GlcNAcβ1-3Gal-Me as acceptor substrates, indicating that C2GnT3 has no Core4GnT or IGnT activity. Additionally, no activity could be detected with α-D-GalNAc-1-para-nitrophenyl, indicating that C2GnT3 does not form core 6 (GlcNAcβ1-6GalNAcα1-R) (Table I). No substrate inhibition of enzyme activity was found at high acceptor concentrations up to 20 mM core 1-para-nitrophenyl. C2GnT3 shows strict donor substrate specificity for UDP-GlcNAc; no activity could be detected with UDP-Gal or UDP-GalNAc (data not shown).

Table I: Substrate specificities of C2GnT3 and C2GnT1

                                               C2GnT3 (a)          C2GnT1
Substrate                                     2 mM    10 mM      2 mM    10 mM
                                                mmol/h/mg          mmol/h/mg
β-D-Gal-(1-3)-α-D-GalNAc(1-3)-α-D-GalNAc       6.6     14.3       9.6     19.0
β-D-Gal-(1-3)-α-D-GalNAc-1-p-Nph              18.1     26.1      16.2     23.6
β-D-GlcNAc-(1-3)-α-D-GalNAc-1-p-Nph           <0.1     <0.1      <0.1     <0.1
α-D-GalNAc-1-p-Nph                            <0.1     <0.1      <0.1     <0.1
D-GalNAc                                      <0.1     <0.1      <0.1     <0.1
lacto-N-neo-tetraose                          <0.1     <0.1      <0.1     <0.1
β-D-GlcNAc-(1-3)-β-D-Gal-1-Me                 <0.1     <0.1      <0.1     <0.1

(a) Enzyme sources were partially purified media of infected High Five™ cells (see "Experimental Procedures"). Background values obtained with uninfected cells or cells infected with an irrelevant construct were subtracted.
(b) Me, methyl; Nph, nitrophenyl.

Controls included the pAcGP67-GalNAc-T3-sol (24). The kinetic properties were determined with partially purified enzymes expressed in High Five™ cells. Partial purification was performed by consecutive chromatography on Amberlite IRA-95, DEAE-Sephacryl and SP-Sepharose essentially as described (25; 25).

Northern Blot Analysis of Human Organs

A human RNA master blot containing mRNA from fifty healthy human adult and fetal organs (CLONTECH) and a human multiple tissue northern blot (MTNII from CLONTECH) were probed with a 32P-labeled probe corresponding to the soluble fragment of C2GnT3 (base pairs 115-1359). The autoradiographic analyses showed expression of C2GnT3 predominantly in lymphoid organs and in organs of the gastrointestinal tract, with high transcription levels observed in thymus, and lower levels in PBLs, lymph node, stomach, pancreas and small intestine (Fig. 3A and 3B). The size of the single transcript was approximately 5.5 kilobases, which corresponds closely to the transcript size of 5.4 kilobases of the largest of three transcripts of human C2GnT1 (Fig. 3C). Multiple transcripts of C2GnT1 have been suggested to be caused by differential usage of polyadenylation signals, which affects the length of the 3' UTR (13).
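The nucleotide coordinates of the probe (base pairs 115-1359) agree with the residue range of the soluble construct described above (amino acid residues 39-453), as the following sketch shows. It assumes 1-based numbering in which base 1 is the first base of the initiator codon; the function names are hypothetical.

```python
def residue_to_bases(residue):
    """Return the (first, last) 1-based ORF positions of the codon for a residue."""
    if residue < 1:
        raise ValueError("residue numbering is 1-based")
    first = 3 * (residue - 1) + 1
    return first, first + 2


def fragment_bases(first_residue, last_residue):
    """Return the 1-based ORF span that encodes an inclusive residue range."""
    return residue_to_bases(first_residue)[0], residue_to_bases(last_residue)[1]


if __name__ == "__main__":
    # Residues 39-453 of the soluble construct.
    print(fragment_bases(39, 453))
```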

The C2GnT3 enzyme of the present invention was shown to exhibit O-glycosylation capacity, implying that the C2GnT3 gene is vital for correct/full O-glycosylation in vivo as well. A structural defect in the C2GnT3 gene leading to a deficient enzyme or completely defective enzyme would therefore expose a cell or an organism to protein/peptide sequences which were not covered by O-glycosylation as seen in cells or organisms with an intact C2GnT3 gene. Described in Example 5 below is a method for scanning the coding exon for potential structural defects. Similar methods could be used for the characterization of defects in the non-coding region of the C2GnT3 gene, including the promoter region.

DNA, Vectors, and Host Cells

In practicing the present invention, many conventional techniques in molecular biology, microbiology, recombinant DNA, and immunology are used. Such techniques are well known and are explained fully in, for example, Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York; DNA Cloning: A Practical Approach, Volumes I and II, 1985 (D. N. Glover ed.); Oligonucleotide Synthesis, 1984 (M. J. Gait ed.); Nucleic Acid Hybridization, 1985 (Hames and Higgins eds.); Transcription and Translation, 1984 (Hames and Higgins eds.); Animal Cell Culture, 1986 (R. I. Freshney ed.); Immobilized Cells and Enzymes, 1986 (IRL Press); Perbal, 1984, A Practical Guide to Molecular Cloning; the series Methods in Enzymology (Academic Press, Inc.); Gene Transfer Vectors for Mammalian Cells, 1987 (J. H. Miller and M. P. Calos eds., Cold Spring Harbor Laboratory); Methods in Enzymology Vol. 154 and Vol. 155 (Wu and Grossman, and Wu, eds., respectively); Immunochemical Methods in Cell and Molecular Biology, 1987 (Mayer and Walker, eds.; Academic Press, London); Scopes, 1987, Protein Purification: Principles and Practice, Second Edition (Springer-Verlag, N.Y.); and Handbook of Experimental Immunology, 1986, Volumes I-IV (Weir and Blackwell eds.).

The invention encompasses isolated nucleic acid fragments comprising all or part of the nucleic acid sequence disclosed herein as set forth in Figure 1. The fragments are at least about 8 nucleotides in length, preferably at least about 12 nucleotides in length, and most preferably at least about 15-20 nucleotides in length. The invention further encompasses isolated nucleic acids comprising sequences that are hybridizable under stringency conditions of 2X SSC, 55 °C, to the sequence set forth in Figure 1; preferably, the nucleic acids are hybridizable at 2X SSC, 65 °C; and most preferably, are hybridizable at 0.5X SSC, 65 °C.

The nucleic acids may be isolated directly from cells. Alternatively, the polymerase chain reaction (PCR) method can be used to produce the nucleic acids of the invention, using either chemically synthesized strands or genomic material as templates. Primers used for PCR can be synthesized using the sequence information provided herein and can further be designed to introduce appropriate new restriction sites, if desirable, to facilitate incorporation into a given vector for recombinant expression.
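The primers TSHC100 and TSHC121 described earlier exemplify this design, carrying BamHI and EcoRI sites respectively. The sketch below locates those recognition sequences in the primer sequences as given in the text. The recognition sequences GGATCC (BamHI) and GAATTC (EcoRI) are the well-known sites for these enzymes; the function and variable names are hypothetical.

```python
# Recognition sequences of the two restriction enzymes named in the text.
RECOGNITION_SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

# Primer sequences (5' to 3') as given in the text.
PRIMERS = {
    "TSHC100": "CGAGGATCCGCAAAAAGACATTTACTTGGTT",
    "TSHC121": "AGCGAATTCTTACTATCATGATGTGGTAGTG",
}


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based offset."""
    primer = primer.upper()
    return {
        enzyme: primer.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in primer
    }


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(name, find_sites(sequence))
```

Both recognition sequences are palindromic, so searching a single strand suffices. In each primer the site is preceded by three 5' nucleotides, consistent with the common practice of adding a short overhang so that the enzyme can cut near the end of a PCR product; the text itself does not state this purpose.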

The nucleic acids of the present invention may be flanked by natural human regulatory sequences, or may be associated with heterologous sequences, including promoters, enhancers, response elements, signal sequences, polyadenylation sequences, introns, 5'- and 3'-noncoding regions, and the like. The nucleic acids may also be modified by many means known in the art. Non-limiting examples of such modifications include methylation, "caps", substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoroamidates, carbamates, etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.). Nucleic acids may contain one or more additional covalently linked moieties, such as, for example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalators (e.g., acridine, psoralen, etc.), chelators (e.g., metals, radioactive metals, iron, oxidative metals, etc.), and alkylators. The nucleic acid may be derivatized by formation of a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage. Furthermore, the nucleic acid sequences of the present invention may also be modified with a label capable of providing a detectable signal, either directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the like.

According to the present invention, useful probes comprise a probe sequence at least eight nucleotides in length that consists of all or part of the sequence from among the sequences as set forth in Figure 1 or sequence-conservative or function-conservative variants thereof, or a complement thereof, and that has been labelled as described above.

The invention also provides nucleic acid vectors comprising the disclosed sequence or derivatives or fragments thereof. A large number of vectors, including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple cloning or protein expression.

Recombinant cloning vectors will often include one or more replication systems for cloning or expression, one or more markers for selection in the host, e.g. antibiotic resistance, and one or more expression cassettes. The inserted coding sequences may be synthesized by standard methods, isolated from natural sources, or prepared as hybrids, etc. Ligation of the coding sequences to transcriptional regulatory elements and/or to other amino acid coding sequences may be achieved by known methods. Suitable host cells may be transformed/transfected/infected as appropriate by any suitable method including electroporation, CaCl2-mediated DNA uptake, fungal infection, microinjection, microprojectile, or other established methods.

Appropriate host cells include bacteria, archaebacteria, fungi, especially yeast, and plant and animal cells, especially mammalian cells. Of particular interest are Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, Hansenula polymorpha, Neurospora spec., Sf9 cells, C129 cells, 293 cells, CHO cells, COS cells, HeLa cells, and immortalized mammalian myeloid and lymphoid cell lines. Preferred replication systems include M13, ColE1, 2μ, ARS, SV40, baculovirus, lambda, adenovirus, and the like. A large number of transcription initiation and termination regulatory regions have been isolated and shown to be effective in the transcription and translation of heterologous proteins in the various hosts. Examples of these regions, methods of isolation, manner of manipulation, etc. are known in the art. Under appropriate expression conditions, host cells can be used as a source of recombinantly produced C2GnT3-derived peptides and polypeptides.

Advantageously, vectors may also include a transcription regulatory element (i.e., a promoter) operably linked to the C2GnT3 coding portion. The promoter may optionally contain operator portions and/or ribosome binding sites.

Non-limiting examples of bacterial promoters compatible with E. coli include: β-lactamase (penicillinase) promoter; lactose promoter; tryptophan (trp) promoter; arabinose BAD operon promoter; lambda-derived PL promoter and N gene ribosome binding site; and the hybrid tac promoter derived from sequences of the trp and lac UV5 promoters.

Non-limiting examples of yeast promoters include 3-phosphoglycerate kinase promoter, glyceraldehyde-3-phosphate dehydrogenase (GAPDH) promoter, galactokinase (GAL1) promoter, galactoepimerase (GAL10) promoter, metallothioneine (CUP) promoter and alcohol dehydrogenase (ADH) promoter.

Suitable promoters for mammalian cells include without limitation viral promoters such as that from Simian Virus 40 (SV40), Rous sarcoma virus (RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Mammalian cells may also require terminator sequences and poly A addition sequences; enhancer sequences which increase expression may also be included, and sequences which cause amplification of the gene may also be desirable. Furthermore, sequences that facilitate secretion of the recombinant product from cells, including, but not limited to, bacteria, yeast, and animal cells, such as secretory signal sequences and/or prohormone pro region sequences, may also be included. These sequences are known in the art.

Nucleic acids encoding wild type or variant polypeptides may also be introduced into cells by recombination events. For example, such a sequence can be introduced into a cell, and thereby effect homologous recombination at the site of an endogenous gene or a sequence with substantial identity to the gene. Other recombination-based methods such as nonhomologous recombinations or deletion of endogenous genes by homologous recombination may also be used.

The nucleic acids of the present invention find use, for example, as probes for the detection of C2GnT3 in other species or related organisms and as templates for the recombinant production of peptides or polypeptides. These and other embodiments of the present invention are described in more detail below.

Polypeptides and Antibodies

The present invention encompasses isolated peptides and polypeptides encoded by the disclosed cDNA sequence. Peptides are preferably at least five residues in length.

Nucleic acids comprising protein-coding sequences can be used to direct the recombinant expression of polypeptides in intact cells or in cell-free translation systems. The known genetic code, tailored if desired for more efficient expression in a given host organism, can be used to synthesize oligonucleotides encoding the desired amino acid sequences. The phosphoramidite solid support method of (26), the method of (27), or other well known methods can be used for such synthesis. The resulting oligonucleotides can be inserted into an appropriate vector and expressed in a compatible host organism.

The polypeptides of the present invention, including function-conservative variants of the disclosed sequence, may be isolated from native or from heterologous organisms or cells (including, but not limited to, bacteria, fungi, insect, plant, and mammalian cells) into which a protein-coding sequence has been introduced and expressed. Furthermore, the polypeptides may be part of recombinant fusion proteins.

Methods for polypeptide purification are well known in the art, including, without limitation, preparative discontinuous gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration, ion exchange and partition chromatography, and countercurrent distribution. For some purposes, it is preferable to produce the polypeptide in a recombinant system in which the protein contains an additional sequence tag that facilitates purification, such as, but not limited to, a polyhistidine sequence. The polypeptide can then be purified from a crude lysate of the host cell by chromatography on an appropriate solid-phase matrix. Alternatively, antibodies produced against a protein or against peptides derived therefrom can be used as purification reagents. Other purification methods are possible.

The present invention also encompasses derivatives and homologues of polypeptides. For some purposes, nucleic acid sequences encoding the peptides may be altered by substitutions, additions, or deletions that provide for functionally equivalent molecules, i.e., function-conservative variants. For example, one or more amino acid residues within the sequence can be substituted by another amino acid of similar properties, such as, for example, positively charged amino acids (arginine, lysine, and histidine); negatively charged amino acids (aspartate and glutamate); polar neutral amino acids; and non-polar amino acids.

The isolated polypeptides may be modified by, for example, phosphorylation, sulfation, acylation, or other protein modifications. They may also be modified with a label capable of providing a detectable signal, either directly or indirectly, including, but not limited to, radioisotopes and fluorescent compounds.

The present invention encompasses antibodies that specifically recognize immunogenic components derived from C2GnT3. Such antibodies can be used as reagents for detection and purification of C2GnT3.

C2GnT3-specific antibodies according to the present invention include polyclonal and monoclonal antibodies. The antibodies may be elicited in an animal host by immunization with C2GnT3 components or may be formed by in vitro immunization of immune cells. The immunogenic components used to elicit the antibodies may be isolated from human cells or produced in recombinant systems. The antibodies may also be produced in recombinant systems programmed with appropriate antibody-encoding DNA. Alternatively, the antibodies may be constructed by biochemical reconstitution of purified heavy and light chains. The antibodies include hybrid antibodies (i.e., containing two sets of heavy chain/light chain combinations, each of which recognizes a different antigen), chimeric antibodies (i.e., in which either the heavy chains, light chains, or both, are fusion proteins), and univalent antibodies (i.e., comprised of a heavy chain/light chain complex bound to the constant region of a second heavy chain). Also included are Fab fragments, including Fab' and F(ab)2 fragments of antibodies. Methods for the production of all of the above types of antibodies and derivatives are well known in the art. For example, techniques for producing and processing polyclonal antisera are disclosed in Mayer and Walker, 1987, Immunochemical Methods in Cell and Molecular Biology (Academic Press, London).

The antibodies of this invention can be purified by standard methods, including but not limited to preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration, ion exchange and partition chromatography, and countercurrent distribution. Purification methods for antibodies are disclosed, e.g., in The Art of Antibody Purification, 1989, Amicon Division, W. R. Grace & Co. General protein purification methods are described in Protein Purification: Principles and Practice, R. K. Scopes, Ed., 1987, Springer-Verlag, New York, NY.

Anti-C2GnT3 antibodies, whether unlabeled or labeled by standard methods, can be used as the basis for immunoassays. The particular label used will depend upon the type of immunoassay used. Examples of labels that can be used include, but are not limited to, radiolabels such as 32P, 125I, 3H and 14C; fluorescent labels such as fluorescein and its derivatives, rhodamine and its derivatives, dansyl and umbelliferone; chemiluminescers such as luciferin and 2,3-dihydrophthalazinediones; and enzymes such as horseradish peroxidase, alkaline phosphatase, lysozyme and glucose-6-phosphate dehydrogenase.

The antibodies can be tagged with such labels by known methods. For example, coupling agents such as aldehydes, carbodiimides, dimaleimide, imidates, succinimides, bisdiazotized benzidine and the like may be used to tag the antibodies with fluorescent, chemiluminescent or enzyme labels. The general methods involved are well known in the art and are described in, e.g., Chan (Ed.), 1987, Immunoassay: A Practical Guide, Academic Press, Inc., Orlando, FL.

APPLICATIONS OF THE NUCLEIC ACID MOLECULES, POLYPEPTIDES, AND ANTIBODIES OF THE INVENTION

The nucleic acid molecules, C2GnT3 polypeptide, and antibodies of the invention may be used in the prognostic and diagnostic evaluation of conditions associated with altered expression or activity of a polypeptide of the invention or conditions requiring modulation of a nucleic acid or polypeptide of the invention, including thymus-related disorders and proliferative disorders (e.g. cancer), and the identification of subjects with a predisposition to such conditions (see below). Methods for detecting nucleic acid molecules and polypeptides of the invention can be used to monitor such conditions by detecting and localizing the polypeptides and nucleic acids. It would also be apparent to one skilled in the art that the methods described herein may be used to study the developmental expression of the polypeptides of the invention and, accordingly, will provide further insight into the role of the polypeptides. The applications of the present invention also include methods for the identification of substances or compounds that modulate the biological activity of a polypeptide of the invention (see below). The substances, compounds, antibodies etc., may be used for the treatment of conditions requiring modulation of polypeptides of the invention (see below).

Diagnostic Methods

A variety of methods can be employed for the diagnostic and prognostic evaluation of conditions requiring modulation of a nucleic acid or polypeptide of the invention (e.g. thymus-related disorders, and cancer), and the identification of subjects with a predisposition to such conditions. Such methods may, for example, utilize nucleic acids of the invention, and fragments thereof, and antibodies directed against polypeptides of the invention, including peptide fragments. In particular, the nucleic acids and antibodies may be used, for example, for: (1) the detection of the presence of C2GnT3 mutations, or the detection of either over- or under-expression of C2GnT3 mRNA relative to a non-disorder state, or the qualitative or quantitative detection of alternatively spliced forms of C2GnT3 transcripts which may correlate with certain conditions or susceptibility toward such conditions; or (2) the detection of either an over- or an under-abundance of a polypeptide of the invention relative to a non-disorder state or the presence of a modified (e.g., less than full length) polypeptide of the invention which correlates with a disorder state, or a progression toward a disorder state.

The methods described herein may be performed by utilizing pre-packaged diagnostic kits comprising at least one specific nucleic acid or antibody described herein, which may be conveniently used, e.g., in clinical settings, to screen and diagnose patients and to screen and identify those individuals exhibiting a predisposition to developing a disorder.

Nucleic acid-based detection techniques and peptide detection techniques are described below. The samples that may be analyzed using the methods of the invention include those that are known or suspected to express C2GnT3 nucleic acids or contain a polypeptide of the invention.

The methods may be performed on biological samples including but not limited to cells, lysates of cells which have been incubated in cell culture, chromosomes isolated from a cell (e.g. a spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such as for Southern analysis), RNA (in solution or bound to a solid support such as for northern analysis), cDNA (in solution or bound to a solid support), an extract from cells or a tissue, and biological fluids such as serum, urine, blood, and CSF. The samples may be derived from a patient or a culture.

Methods for Detection of Nucleic Acid Molecules of the Invention

The nucleic acid molecules of the invention allow those skilled in the art to construct nucleotide probes for use in the detection of nucleic acid sequences of the invention in biological materials. Suitable probes include nucleic acid molecules based on nucleic acid sequences encoding at least 5 sequential amino acids from regions of the C2GnT3 polypeptide (see SEQ ID NO: 1); preferably they comprise 15 to 50 nucleotides, more preferably 15 to 40 nucleotides, most preferably 15 to 30 nucleotides. A nucleotide probe may be labelled with a detectable substance such as a radioactive label that provides for an adequate signal and has sufficient half-life, such as 32P, 3H, 14C or the like. Other detectable substances that may be used include antigens that are recognized by a specific labelled antibody, fluorescent compounds, enzymes, antibodies specific for a labelled antigen, and luminescent compounds. An appropriate label may be selected having regard to the rate of hybridization and binding of the probe to the nucleotide to be detected and the amount of nucleotide available for hybridization. Labelled probes may be hybridized to nucleic acids on solid supports such as nitrocellulose filters or nylon membranes as generally described in Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual (2nd ed.). The nucleic acid probes may be used to detect C2GnT3 genes, preferably in human cells. The nucleotide probes may also be used, for example, in the diagnosis or prognosis of conditions such as thymus-related disorders and cancer, and in monitoring the progression of these conditions, or monitoring a therapeutic treatment.

The probe may be used in hybridisation techniques to detect a C2GnT3 gene. The technique generally involves contacting and incubating nucleic acids (e.g. recombinant DNA molecules, cloned genes) obtained from a sample from a patient or other cellular source with a probe of the present invention under conditions favourable for the specific annealing of the probes to complementary sequences in the nucleic acids. After incubation, the non-annealed nucleic acids are removed, and the presence of nucleic acids that have hybridized to the probe, if any, is detected.

The detection of nucleic acid molecules of the invention may involve the amplification of specific gene sequences using an amplification method (e.g. PCR), followed by the analysis of the amplified molecules using techniques known to those skilled in the art. Suitable primers can be routinely designed by one of skill in the art. For example, primers may be designed using commercially available software, such as OLIGO 4.06 Primer Analysis software (National Biosciences, Plymouth, Minn.) or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the template at temperatures of about 60 °C to 72 °C.
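By way of illustration only, the length and GC-content criteria stated above can be checked with a few lines of code. The following is a minimal sketch, not the OLIGO software's method; the function names and the example sequences are hypothetical, and the annealing-temperature criterion is deliberately not modelled because the text does not specify how it is to be calculated.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_criteria(seq: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """Check a candidate primer against the stated length and GC criteria.

    Defaults follow the text: about 22 to 30 nucleotides and a GC content
    of about 50% or more. Annealing temperature is not evaluated here.
    """
    # The length test comes first so empty input never reaches gc_fraction.
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


# Hypothetical candidates (not taken from any C2GnT3 sequence).
candidates = ["GCGCATGCATGCGCATGCATGCGC", "ATATATATATATATATATATATAT"]
passing = [c for c in candidates if meets_primer_criteria(c)]
```

A candidate passing this screen would still need to be evaluated for annealing behaviour and specificity by an appropriate program.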

Genomic DNA may be used in hybridization or amplification assays of biological samples to detect abnormalities involving C2GnT3 nucleic acid structure, including point mutations, insertions, deletions, and chromosomal rearrangements. For example, direct sequencing, single stranded conformational polymorphism analyses, heteroduplex analysis, denaturing gradient gel electrophoresis, chemical mismatch cleavage, and oligonucleotide hybridization may be utilized.

Genotyping techniques known to one skilled in the art can be used to type polymorphisms that are in close proximity to the mutations in a C2GnT3 gene. The polymorphisms may be used to identify individuals in families that are likely to carry mutations. If a polymorphism exhibits linkage disequilibrium with mutations in the C2GnT3 gene, it can also be used to screen for individuals in the general population likely to carry mutations. Polymorphisms which may be used include restriction fragment length polymorphisms (RFLPs), single-nucleotide polymorphisms (SNPs), and simple sequence repeat polymorphisms (SSLPs).

A probe or primer of the invention may be used to directly identify RFLPs. A probe or primer of the invention can additionally be used to isolate genomic clones such as YACs, BACs, PACs, cosmids, phage or plasmids. The DNA in the clones can be screened for SSLPs using hybridization or sequencing procedures.

Hybridization and amplification techniques described herein may be used to assay qualitative and quantitative aspects of C2GnT3 expression. For example, RNA may be isolated from a cell type or tissue known to express C2GnT3 and tested utilizing the hybridization (e.g. standard Northern analyses) or PCR techniques referred to herein.

The techniques may be used to detect differences in transcript size that may be due to normal or abnormal alternative splicing. The techniques may be used to detect quantitative differences between levels of full length and/or alternatively spliced transcripts detected in normal individuals relative to those individuals exhibiting symptoms of a disease.

The primers and probes may be used in the above described methods in situ, i.e. directly on tissue sections (fixed and/or frozen) of patient tissue obtained from biopsies or resections.

Oligonucleotides or longer fragments derived from any of the nucleic acid molecules of the invention may be used as targets in a microarray. The microarray can be used to simultaneously monitor the expression levels of large numbers of genes and to identify genetic variants, mutations, and polymorphisms. The information from the microarray may be used to determine gene function, to understand the genetic basis of a disorder, to identify predisposition to a disorder, to treat a disorder, to diagnose a disorder, and to develop and monitor the activities of therapeutic agents.

The preparation, use, and analysis of microarrays are well known to a person skilled in the art (see, for example, Brennan, T. M., et al. (1995), U.S. Patent No. 5,474,796; Schena et al. (1996), Proc. Natl. Acad. Sci. 93: 10614-10619; Baldeschweiler et al. (1995), PCT Application WO95/251116; Shalon, D., et al. (1995), PCT Application WO95/35505; Heller, R. A., et al. (1997), Proc. Natl. Acad. Sci. 94: 2150-2155; and Heller, M. J., et al. (1997), U.S. Patent No. 5,605,662).

Methods for Detecting Polypeptides

Antibodies specifically reactive with a C2GnT3 polypeptide, or derivatives, such as enzyme conjugates or labeled derivatives, may be used to detect C2GnT3 polypeptides in various biological materials. They may be used as diagnostic or prognostic reagents and they may be used to detect abnormalities in the level of C2GnT3 polypeptide expression, or abnormalities in the structure, and/or temporal, tissue, cellular, or subcellular location of the polypeptides. Antibodies may also be used to screen potentially therapeutic compounds in vitro to determine their effects on a condition such as a thymus-related disorder or cancer. In vitro immunoassays may also be used to assess or monitor the efficacy of particular therapies.

The antibodies of the invention may also be used in vitro to determine the level of C2GnT3 polypeptide expression in cells genetically engineered to produce a C2GnT3 polypeptide. The antibodies may be used to detect and quantify polypeptides of the invention in a sample in order to determine their role in particular cellular events or pathological states, and to diagnose and treat such pathological states.

In particular, the antibodies of the invention may be used in immunohistochemical analyses, for example, at the cellular and subcellular level, to detect a polypeptide of the invention, to localize it to particular cells and tissues, and to specific subcellular locations, and to quantitate the level of expression.

The antibodies may be used in any known immunoassays that rely on the binding interaction between an antigenic determinant of a polypeptide of the invention and the antibodies. Examples of such assays are radioimmunoassays, enzyme immunoassays (e.g. ELISA), immunofluorescence, immunoprecipitation, latex agglutination, hemagglutination, and histochemical tests.

Cytochemical techniques known in the art for localizing antigens using light and electron microscopy may be used to detect a polypeptide of the invention. Generally, an antibody of the invention may be labelled with a detectable substance and a polypeptide may be localised in tissues and cells based upon the presence of the detectable substance. Various methods of labelling polypeptides are known in the art and may be used. Examples of detectable substances include, but are not limited to, the following: radioisotopes (e.g., 3H, 14C, 35S, 125I, 131I), fluorescent labels (e.g., FITC, rhodamine, lanthanide phosphors), luminescent labels such as luminol, enzymatic labels (e.g., horseradish peroxidase, β-galactosidase, luciferase, alkaline phosphatase, acetylcholinesterase), biotinyl groups (which can be detected by marked avidin, e.g., streptavidin containing a fluorescent marker or enzymatic activity that can be detected by optical or colorimetric methods), and predetermined polypeptide epitopes recognized by a secondary reporter (e.g., leucine zipper pair sequences, binding sites for secondary antibodies, metal binding domains, epitope tags). In some embodiments, labels are attached via spacer arms of various lengths to reduce potential steric hindrance. Antibodies may also be coupled to electron dense substances, such as ferritin or colloidal gold, which are readily visualised by electron microscopy.

The antibody or sample may be immobilized on a carrier or solid support which is capable of immobilizing cells, antibodies, etc. For example, the carrier or support may be nitrocellulose, or glass, polyacrylamides, gabbros, and magnetite. The support material may have any possible configuration including spherical (e.g. bead), cylindrical (e.g. inside surface of a test tube or well, or the external surface of a rod), or flat (e.g. sheet, test strip). Indirect methods may also be employed in which the primary antigen-antibody reaction is amplified by the introduction of a second antibody, having specificity for the antibody reactive against a polypeptide of the invention. By way of example, if the antibody having specificity against a polypeptide of the invention is a rabbit IgG antibody, the second antibody may be goat anti-rabbit gamma-globulin labelled with a detectable substance as described herein.

Where a radioactive label is used as a detectable substance, a polypeptide of the invention may be localized by radioautography. The results of radioautography may be quantitated by determining the density of particles in the radioautographs by various optical methods, or by counting the grains.

A polypeptide of the invention may also be detected by assaying for C2GnT3 activity as described herein. For example, a sample may be reacted with an acceptor substrate and a donor substrate under conditions where a C2GnT3 polypeptide is capable of transferring the donor substrate to the acceptor substrate to produce a donor substrate-acceptor substrate complex.

Methods for Identifying or Evaluating Substances/Compounds

The methods described herein are designed to identify substances and compounds that modulate the expression or biological activity of a C2GnT3 polypeptide, including substances that interfere with or enhance the expression or activity of a C2GnT3 polypeptide.

Substances and compounds identified using the methods of the invention include but are not limited to peptides such as soluble peptides including Ig-tailed fusion peptides, members of random peptide libraries and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids, phosphopeptides (including members of random or partially degenerate, directed phosphopeptide libraries), antibodies [e.g. polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, single chain antibodies, fragments (e.g. Fab, F(ab)2, and Fab expression library fragments, and epitope-binding fragments thereof)], polypeptides, nucleic acids, carbohydrates, and small organic or inorganic molecules. A substance or compound may be an endogenous physiological compound or it may be a natural or synthetic compound.

Substances which modulate a C2GnT3 polypeptide can be identified based on their ability to associate with a C2GnT3 polypeptide. Therefore, the invention also provides methods for identifying substances that associate with a C2GnT3 polypeptide. Substances identified using the methods of the invention may be isolated, cloned and sequenced using conventional techniques. A substance that associates with a polypeptide of the invention may be an agonist or antagonist of the biological or immunological activity of a polypeptide of the invention.

The term "agonist" refers to a molecule that increases the amount of, or prolongs the duration of, the activity of the polypeptide. The term "antagonist" refers to a molecule which decreases the biological or immunological activity of the polypeptide. Agonists and antagonists may include proteins, nucleic acids, carbohydrates, or any other molecules that associate with a polypeptide of the invention.

Substances which can associate with a C2GnT3 polypeptide may be identified by reacting a C2GnT3 polypeptide with a test substance which potentially associates with a C2GnT3 polypeptide, under conditions which permit the association, and removing and/or detecting the associated C2GnT3 polypeptide and substance. The substance-polypeptide complexes, free substance, or non-complexed polypeptides may be assayed. Conditions which permit the formation of substance-polypeptide complexes may be selected having regard to factors such as the nature and amounts of the substance and the polypeptide.

The substance-polypeptide complex, free substance or non-complexed polypeptides may be isolated by conventional isolation techniques, for example, salting out, chromatography, electrophoresis, gel filtration, fractionation, absorption, polyacrylamide gel electrophoresis, agglutination, or combinations thereof. To facilitate the assay of the components, antibody against a polypeptide of the invention or the substance, or a labelled polypeptide, or a labelled substance may be utilized. The antibodies, polypeptides, or substances may be labelled with a detectable substance as described above.

A C2GnT3 polypeptide, or the substance used in the method of the invention, may be insolubilized. For example, a polypeptide or substance may be bound to a suitable carrier such as agarose, cellulose, dextran, Sephadex, Sepharose, carboxymethyl cellulose, polystyrene, filter paper, ion-exchange resin, plastic film, plastic tube, glass beads, polyamine-methyl vinyl-ether-maleic acid copolymer, amino acid copolymer, ethylene-maleic acid copolymer, nylon, silk, etc. The carrier may be in the shape of, for example, a tube, test plate, beads, disc, sphere, etc. The insolubilized polypeptide or substance may be prepared by reacting the material with a suitable insoluble carrier using known chemical or physical methods, for example, cyanogen bromide coupling.

The invention also contemplates a method for evaluating a compound for its ability to modulate the biological activity of a polypeptide of the invention, by assaying for an agonist or antagonist (i.e. enhancer or inhibitor) of the association of the polypeptide with a substance which interacts with the polypeptide (e.g. donor or acceptor substrates or parts thereof). The basic method for evaluating if a compound is an agonist or antagonist of the association of a polypeptide of the invention and a substance that associates with the polypeptide is to prepare a reaction mixture containing the polypeptide and the substance under conditions which permit the formation of substance-polypeptide complexes, in the presence of a test compound. The test compound may be initially added to the mixture, or may be added subsequent to the addition of the polypeptide and substance. Control reaction mixtures without the test compound or with a placebo are also prepared. The formation of complexes is detected, and the formation of complexes in the control reaction but not in the reaction mixture indicates that the test compound interferes with the interaction of the polypeptide and substance. The reactions may be carried out in the liquid phase, or the polypeptide, substance, or test compound may be immobilized as described herein.

It will be understood that the agonists and antagonists, i.e. enhancers and inhibitors, that can be assayed using the methods of the invention may act on one or more of the interaction sites on the polypeptide or substance, including agonist binding sites, competitive antagonist binding sites, non-competitive antagonist binding sites or allosteric sites.

The invention also makes it possible to screen for antagonists that inhibit the effects of an agonist of the interaction of a polypeptide of the invention with a substance which is capable of associating with the polypeptide. Thus, the invention may be used to assay for a compound that competes for the same interacting site of a polypeptide of the invention.

Substances that modulate a C2GnT3 polypeptide of the invention can be identified based on their ability to interfere with or enhance the activity of a C2GnT3 polypeptide. Therefore, the invention provides a method for evaluating a compound for its ability to modulate the activity of a C2GnT3 polypeptide comprising (a) reacting an acceptor substrate and a donor substrate for a C2GnT3 polypeptide in the presence of a test substance; (b) measuring the amount of donor substrate transferred to acceptor substrate; and (c) carrying out steps (a) and (b) in the absence of the test substance to determine if the substance interferes with or enhances transfer of the sugar donor to the acceptor by the C2GnT3 polypeptide.
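The comparison in steps (a) to (c) can be illustrated numerically. The following is a minimal sketch under stated assumptions: the function names, the 20% decision threshold, and the example readings are hypothetical and are not specified by the method; any consistent measure of transferred donor (for example, incorporated radioactivity) could serve as input.

```python
def relative_transfer_change(with_substance: float,
                             without_substance: float) -> float:
    """Fractional change in donor transferred to acceptor.

    with_substance:    amount transferred in steps (a)-(b), test substance present
    without_substance: amount transferred in step (c), test substance absent
    A negative value suggests interference; a positive value suggests enhancement.
    """
    if without_substance <= 0:
        raise ValueError("control transfer must be positive")
    return (with_substance - without_substance) / without_substance


def classify(with_substance: float, without_substance: float,
             threshold: float = 0.20) -> str:
    """Label a test substance; the threshold is an illustrative assumption."""
    change = relative_transfer_change(with_substance, without_substance)
    if change <= -threshold:
        return "interferes"
    if change >= threshold:
        return "enhances"
    return "no clear effect"


# Hypothetical readings in arbitrary units.
result = classify(with_substance=250.0, without_substance=1000.0)
```

In practice the threshold would be set having regard to the variability of replicate control reactions.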

Suitable acceptor substrates for use in the methods of the invention are saccharides, oligosaccharides, polysaccharides, polypeptides, glycopolypeptides, or glycolipids which are either synthetic with linkers at the reducing end or naturally occurring structures, for example, asialo-agalacto-fetuin glycopeptide. Acceptors will generally comprise β-D-galactosyl-1,3-N-acetyl-D-galactosaminyl-.

The donor substrate may be a nucleotide sugar, dolichol-phosphate-sugar or dolichol-pyrophosphate-oligosaccharide, for example, uridine diphospho-N-acetylglucosamine (UDP-GlcNAc), or derivatives or analogs thereof. The C2GnT3 polypeptide may be obtained from natural sources or produced using recombinant methods as described herein.

The acceptor or donor substrates may be labeled with a detectable substance as described herein, and the interaction of the polypeptide of the invention with the acceptor and donor will give rise to a detectable change.

The detectable change may be colorimetric, photometric, radiometric, potentiometric, etc. The activity of a C2GnT3 polypeptide of the invention may also be determined using methods based on HPLC (Koenderman et al., FEBS Lett. 222: 42, 1987) or methods employing synthetic oligosaccharide acceptors attached to hydrophobic aglycones (Palcic et al., Glycoconjugate 5: 49, 1988; and Pierce et al., Biochem. Biophys. Res. Comm. 146: 679, 1987).

The C2GnT3 polypeptide is reacted with the acceptor and donor substrates at a pH and temperature effective for the polypeptide to transfer the donor to the acceptor and, where one of the components is labeled, to produce a detectable change. It is preferred to use a buffer with the acceptor and donor to maintain the pH within the pH range effective for the polypeptides. The buffer, acceptor and donor may be used as an assay composition. Other compounds such as EDTA and detergents may be added to the assay composition.

The reagents suitable for applying the methods of the invention to evaluate compounds that modulate a C2GnT3 polypeptide may be packaged into convenient kits providing the necessary materials packaged into suitable containers. The kits may also include suitable supports useful in performing the methods of the invention.

Substances that modulate a C2GnT3 polypeptide can also be identified by treating immortalized cells which express the polypeptide with a test substance, and comparing the morphology of the cells with the morphology of the cells in the absence of the substance and/or with immortalized cells which do not express the polypeptide. Examples of immortalized cells that can be used include lung epithelial cell lines such as Mv1Lu or HEK293 (human embryonal kidney) transfected with a vector containing a nucleic acid of the invention. In the absence of an inhibitor the cells show signs of morphologic transformation (e.g. fibroblastic morphology, spindle shape and pile up; the cells are less adhesive to substratum; there is less cell to cell contact in monolayer culture; there are reduced growth-factor requirements for survival and proliferation; the cells grow in soft agar or other semi-solid medium; there is a lack of contact inhibition and increased apoptosis in low-serum high density cultures; there is enhanced cell motility, and there is invasion into extracellular matrix and secretion of proteases). A substance that inhibits one or more of these phenotypes may be considered an inhibitor.

A substance that inhibits a C2GnT3 polypeptide may be identified by treating a cell which expresses the polypeptide with a test substance, and assaying for complex core 2-based O-linked structures (e.g. repeating Gal[β]1-4GlcNAc[β]) associated with the cell. The complex core 2-based O-linked structures can be assayed using a substance that binds to the structures (e.g. antibodies).

Cells that have not been treated with the substance or which do not express the polypeptide may be employed as controls.

Substances which inhibit transcription or translation of a C2GnT3 gene may be identified by transfecting a cell with an expression vector comprising a recombinant molecule of the invention, including a reporter gene, in the presence of a test substance and comparing the level of expression of the C2GnT3 polypeptide, or the expression of the polypeptide encoded by the reporter gene, with a control cell transfected with the nucleic acid molecule in the absence of the substance. The method can be used to identify transcription and translation inhibitors of a C2GnT3 gene.

Compositions and Treatments

The substances or compounds identified by the methods described herein, polypeptides, nucleic acid molecules, and antibodies of the invention may be used for modulating the biological activity of a C2GnT3 polypeptide, and they may be used in the treatment of conditions mediated by a C2GnT3 polypeptide. In particular, they may be used to modulate T-cell development and lymphocyte homing and they may be used in the prevention and treatment of thymus-related disorders.

Therefore, the present invention may be useful for diagnosis or treatment of various thymus-related disorders in mammals, preferably humans. Such disorders include the following: tumors and cancers, hypoactivity, hyperactivity, atrophy, enlargement of the thymus, and the like.

Other disorders include dysregulation of T-lymphocyte selection or activity and would include but not be limited to disorders involving autoimmunity, arthritis, leukemias, lymphomas, immunosuppression, sepsis, wound healing, acute and chronic infection, cell mediated immunity, humoral immunity, TH1/TH2 imbalance, and the like.

The substances or compounds identified by the methods described herein, antibodies, polypeptides, and nucleic acid molecules of the invention may be useful in the prevention and treatment of tumors. Tumor metastasis may be inhibited or prevented by inhibiting the adhesion of circulating cancer cells. The substances, compounds, etc. of the invention may be especially useful in the treatment of various forms of neoplasia such as leukemias, lymphomas, melanomas, adenomas, sarcomas, and carcinomas of solid tissues in patients. In particular the composition may be used for treating malignant melanoma, pancreatic cancer, cervico-uterine cancer, cancer of the liver, kidney, stomach, lung, rectum, breast, bowel, gastric, thyroid, neck, cervix, salivary gland, bile duct, pelvis, mediastinum, urethra, bronchogenic, bladder, esophagus and colon, and Kaposi's sarcoma, which is a form of cancer associated with HIV-infected patients with Acquired Immune Deficiency Syndrome (AIDS). The substances etc. are particularly useful in the prevention and treatment of tumors of the immune system and thymus and the metastases derived from these tumors.

A substance or compound identified in accordance with the methods described herein, antibodies, polypeptides, or nucleic acid molecules of the invention may be used to modulate T-cell activation and immunodeficiency due to the Wiskott-Aldrich syndrome or AIDS, or to stimulate hematopoietic progenitor cell growth, and/or confer protection against chemotherapy and radiation therapy in a subject.

Accordingly, the substances, antibodies, and compounds may be formulated into pharmaceutical compositions for administration to subjects in a biologically compatible form suitable for administration in vivo. By "biologically compatible form suitable for administration in vivo" is meant a form of the substance to be administered in which any toxic effects are outweighed by the therapeutic effects. The substances may be administered to living organisms including humans and animals. Administration of a therapeutically active amount of the pharmaceutical compositions of the present invention is defined as an amount effective, at dosages and for periods of time necessary, to achieve the desired result. For example, a therapeutically active amount of a substance may vary according to factors such as the disease state, age, sex, and weight of the individual, and the ability of the antibody to elicit a desired response in the individual. Dosage regimens may be adjusted to provide the optimum therapeutic response. For example, several divided doses may be administered daily or the dose may be proportionally reduced as indicated by the exigencies of the therapeutic situation.

The active substance may be administered in a convenient manner such as by injection (subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or rectal administration. Depending on the route of administration, the active substance may be coated in a material to protect the compound from the action of enzymes, acids and other natural conditions that may inactivate the compound.

The compositions described herein can be prepared by methods known per se for the preparation of pharmaceutically acceptable compositions which can be administered to subjects, such that an effective quantity of the active substance is combined in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are described, for example, in Remington's Pharmaceutical Sciences (Mack Publishing Company, Easton, Pa., USA 1985). On this basis, the compositions include, albeit not exclusively, solutions of the substances or compounds in association with one or more pharmaceutically acceptable vehicles or diluents, and contained in buffered solutions with a suitable pH and iso-osmotic with the physiological fluids.

After pharmaceutical compositions have been prepared, they can be placed in an appropriate container and labeled for treatment of an indicated condition. For administration of an inhibitor of a polypeptide of the invention, such labeling would include amount, frequency, and method of administration.

The nucleic acids encoding C2GnT3 polypeptides or any fragment thereof, or antisense sequences, may be used for therapeutic purposes. Antisense to a nucleic acid molecule encoding a polypeptide of the invention may be used in situations to block the synthesis of the polypeptide.

In particular, cells may be transformed with sequences complementary to nucleic acid molecules encoding C2GnT3 polypeptide. Thus, antisense sequences may be used to modulate C2GnT3 activity or to achieve regulation of gene function. Sense or antisense oligomers, or larger fragments, can be designed from various locations along the coding or regulatory regions of sequences encoding a polypeptide of the invention.

Expression vectors may be derived from retroviruses, adenoviruses, herpes or vaccinia viruses or from various bacterial plasmids for delivery of nucleic acid sequences to the target organ, tissue, or cells. Vectors that express antisense nucleic acid sequences of C2GnT3 polypeptide can be constructed using techniques well known to those skilled in the art (see for example, Sambrook, Fritsch, Maniatis, Molecular Cloning, A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

Genes encoding C2GnT3 polypeptide can be turned off by transforming a cell or tissue with expression vectors that express high levels of a nucleic acid molecule or fragment thereof which encodes a polypeptide of the invention. Such constructs may be used to introduce untranslatable sense or antisense sequences into a cell.

Even if they do not integrate into the DNA, the vectors may continue to transcribe RNA molecules until all copies are disabled by endogenous nucleases. Transient expression may last for extended periods of time (e.g. a month or more) with a non-replicating vector or if appropriate replication elements are part of the vector system.

Modification of gene expression may be achieved by designing antisense molecules, DNA, RNA, or PNA, to the control regions of a C2GnT3 polypeptide gene, i.e. the promoters, enhancers, and introns. Preferably the antisense molecules are oligonucleotides derived from the transcription initiation site (e.g. between positions -10 and +10 from the start site). Inhibition can also be achieved by using triple-helix base-pairing techniques.

Triple helix pairing causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or regulatory molecules (see Gee, J. E., et al. (1994) In: Huber, B. E. and B. I. Carr, Molecular and Immunologic Approaches, Futura Publishing Co., Mt. Kisco, N.Y.).
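The preferred antisense design mentioned above, an oligonucleotide spanning positions -10 to +10 around the transcription start site, may be illustrated as follows. This is a minimal sketch only: the function names and the toy sequence are hypothetical (the sequence is not taken from a C2GnT3 gene), and the sketch assumes the conventional numbering in which there is no position 0, so that the window covers 20 nucleotides.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_around_start(sense_strand: str, start_index: int,
                           upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligonucleotide covering positions -upstream..+downstream.

    start_index is the 0-based index of the +1 nucleotide in sense_strand.
    """
    lo = start_index - upstream
    hi = start_index + downstream
    if lo < 0 or hi > len(sense_strand):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(sense_strand[lo:hi])


# Toy sense strand: ten A residues upstream, ten C residues from +1 onward.
toy_sense = "A" * 10 + "C" * 10
oligo = antisense_around_start(toy_sense, start_index=10)
```

The resulting oligonucleotide is complementary to the sense strand over the chosen window and is written 5' to 3'.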

Ribozymes, enzymatic RNA molecules, may be used to catalyze the specific cleavage of RNA. Ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. For example, hammerhead motif ribozyme molecules may be engineered that can specifically and efficiently catalyze endonucleolytic cleavage of sequences encoding a polypeptide of the invention.

Specific ribozyme cleavage sites within any RNA target may be initially identified by scanning the target molecule for ribozyme cleavage sites which include the following sequences: GUA, GUU, and GUC. Short RNA sequences of between 15 and 20 ribonucleotides corresponding to the region of the cleavage site of the target gene may be evaluated for secondary structural features which may render the oligonucleotide inoperable. The suitability of candidate targets may be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays.
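The scanning step described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a plain-text RNA (or DNA) sequence as input; the function names, the window-centring rule, and the example sequence are hypothetical and are not taken from the disclosure.

```python
# Triplets named in the text as ribozyme cleavage sites.
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def to_rna(seq):
    """Normalise input to upper-case RNA (T is read as U)."""
    return seq.upper().replace("T", "U")


def find_cleavage_sites(seq):
    """Return (0-based position, triplet) for every GUA, GUU or GUC."""
    rna = to_rna(seq)
    return [(i, rna[i:i + 3])
            for i in range(len(rna) - 2)
            if rna[i:i + 3] in CLEAVAGE_TRIPLETS]


def candidate_window(seq, site, length=20):
    """Return up to `length` nt around the triplet starting at `site`.

    The text suggests 15 to 20 ribonucleotides; the centring used here
    is an assumption made for illustration.
    """
    rna = to_rna(seq)
    start = max(0, site + 1 - length // 2)
    start = min(start, max(0, len(rna) - length))
    return rna[start:start + length]


# Hypothetical target sequence, not derived from C2GnT3.
target = "AUGGUCAAGCUUGUAACCGGUUAGC"
for position, triplet in find_cleavage_sites(target):
    print(position, triplet, candidate_window(target, position))
```

The windows returned by such a scan would still need to be checked for secondary structure and tested experimentally, as the paragraph above describes.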

Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose lethal to 50% of the population) statistics. The therapeutic index is the dose ratio of therapeutic to toxic effects and it can be expressed as the ED50/LD50 ratio. Pharmaceutical compositions which exhibit large therapeutic indices are preferred.

The invention also provides methods for studying the function of a C2GnT3 polypeptide. Cells, tissues, and non-human animals lacking in C2GnT3 expression or partially lacking in C2GnT3 expression may be developed using recombinant expression vectors of the invention having specific deletion or insertion mutations in a C2GnT3 gene. A recombinant expression vector may be used to inactivate or alter the endogenous gene by homologous recombination, and thereby create a C2GnT3 deficient cell, tissue or animal.

Null alleles may be generated in cells, such as embryonic stem cells, by deletion mutation. A recombinant C2GnT3 gene may also be engineered to contain an insertion mutation which inactivates C2GnT3. Such a construct may then be introduced into a cell, such as an embryonic stem cell, by a technique such as transfection, electroporation, injection, etc. Cells lacking an intact C2GnT3 gene may then be identified, for example by Southern blotting, Northern blotting or by assaying for expression of a polypeptide of the invention using the methods described herein. Such cells may then be used to generate transgenic non-human animals deficient in C2GnT3. Germline transmission of the mutation may be achieved, for example, by aggregating the embryonic stem cells with early stage embryos, such as 8 cell embryos, in vitro; transferring the resulting blastocysts into recipient females; and generating germline transmission of the resulting aggregation chimeras. Such a mutant animal may be used to define specific cell populations, developmental patterns and in vivo processes normally dependent on C2GnT3 expression.

The invention thus provides a transgenic non-human mammal all of whose germ cells and somatic cells contain a recombinant expression vector that inactivates or alters a gene encoding a C2GnT3 polypeptide. Further, the invention provides a transgenic non-human mammal which does not express a C2GnT3 polypeptide of the invention.

A transgenic non-human animal includes but is not limited to mouse, rat, rabbit, sheep, hamster, guinea pig, micropig, pig, dog, cat, goat, and non-human primate, preferably mouse.

The invention also provides a transgenic non-human animal assay system which provides a model system for testing for an agent that reduces or inhibits a pathology associated with a C2GnT3 polypeptide comprising: (a) administering the agent to a transgenic non-human animal of the invention; and (b) determining whether said agent reduces or inhibits the pathology in the transgenic non-human animal relative to a transgenic non-human animal of step (a) to which the agent has not been administered.

The agent may be useful to treat the disorders and conditions discussed herein. The agents may also be incorporated in a pharmaceutical composition as described herein.

A polypeptide of the invention may be used to support the survival, growth, migration, and/or differentiation of cells expressing the polypeptide. Thus, a polypeptide of the invention may be used as a supplement to support, for example, cells in culture.

Methods to Prepare Oligosaccharides

The invention relates to a method for preparing an oligosaccharide comprising contacting a reaction mixture comprising an activated donor substrate, e.g. GlcNAc, and an acceptor substrate in the presence of a polypeptide of the invention.

Examples of acceptor substrates for use in the method for preparing an oligosaccharide are a saccharide, oligosaccharides, polysaccharides, glycopeptides, glycopolypeptides, or glycolipids which are either synthetic with linkers at the reducing end or naturally occurring structures, for example, asialo-agalacto-fetuin glycopeptide.

The activated donor substrate is preferably GlcNAc, which may be part of a nucleotide-sugar, a dolichol-phosphate-sugar, or dolichol-pyrophosphate-oligosaccharide.

In an embodiment of the invention, the oligosaccharides are prepared on a carrier that is non-toxic to a mammal, in particular a human, such as a lipid isoprenoid or polyisoprenoid alcohol. An example of a suitable carrier is dolichol phosphate. The oligosaccharide may be attached to a carrier via a labile bond allowing for chemical removal of the oligosaccharide from the lipid carrier. In the alternative, the oligosaccharide transferase may be used to transfer the oligosaccharide from a lipid carrier to a polypeptide.

The following examples are intended to further illustrate the invention without limiting its scope.

EXAMPLE 1

A: Identification of cDNA homologous to C2GnT3 by analysis of GSS database sequence information.

Database searches were performed with the coding sequence of the human C2/4GnT (C2GnT2) sequence using the BLASTn and tBLASTn algorithms against the GSS database at The National Center for Biotechnology Information, USA. The BLASTn algorithm was used to identify GSSs representing the query gene (identities of >95%), whereas tBLASTn was used to identify non-identical, but similar GSS sequences.

GSSs with 50-90% nucleotide sequence identity were regarded as different from the query sequence. Composites of the sequence information for two GSSs were compiled and analysed for sequence similarity to human C2/4GnT (C2GnT2).
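The identity thresholds stated above can be expressed as a simple classification rule. The Python sketch below is illustrative only: the function name, the labels, and the handling of values that fall outside the two stated ranges (for example between 90% and 95%) are assumptions, since the text does not say how such hits were treated.

```python
def classify_gss_hit(percent_identity):
    """Classify a GSS hit by nucleotide identity to the query sequence.

    Thresholds follow the text: >95% is taken to represent the query
    gene, 50-90% is regarded as a different but similar sequence.
    Anything else is left unclassified (an assumption of this sketch).
    """
    if percent_identity > 95:
        return "query gene"
    if 50 <= percent_identity <= 90:
        return "different from query"
    return "unclassified"


# Hypothetical identity values for illustration.
for identity in (99.2, 72.5, 93.0, 41.0):
    print(identity, classify_gss_hit(identity))
```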

B: Cloning and sequencing of C2GnT3

A GSS clone CIT-HSP-2288B17.TF (GSS GenBank accession number AQ005888), derived from a putative homologue to C2/4GnT (C2GnT2), was obtained from Research Genetics Inc., USA. Sequencing of this clone revealed a partial open reading frame with significant sequence similarity to C2/4GnT (C2GnT2). The coding region of human C2GnT-L (C2GnT1), C2/4GnT (C2GnT2) and a bovine homologue was previously found to be organized in one exon (22, 15).

Since the 3' sequence available from the C2GnT3 GSS was incomplete but likely to be located in the single exon, the missing 3' portion of the open reading frame was obtained by sequencing a genomic P1 clone. The P1 clone was obtained from a human foreskin genomic P1 library (DuPont Merck Pharmaceutical Co. Human Foreskin Fibroblast P1 Library) by screening with the primer pair TSHC96 (5'-GGTTTCACCGTCTCCAACATA-3', SEQ ID NO: 3) and TSHC101 (5'-TCGTAAGGCACCTGATACTT-3', SEQ ID NO: 6). One genomic clone for C2GnT3, GS22597 #844/B1, was obtained from Genome Systems Inc., USA. DNA from P1 phage was prepared as recommended by Genome Systems Inc. The entire coding sequence of the C2GnT3 gene was represented in the clone and sequenced in full using automated sequencing (ABI377, Perkin-Elmer). Confirmatory sequencing was performed on a cDNA clone obtained by PCR (30 cycles at 95°C for 10 sec; 55°C for 15 sec and 68°C for 2 min 30 sec) on cDNA from human thymus poly A-mRNA with the sense primer TSHC99 (5'-CGAGGATCCAGAATGAAGATATTCAAATGTTA-3', SEQ ID NO: 4) and the anti-sense primer TSHC121 (5'-AGCGAATTCTTACTATCATGATGTGGTAGTG-3', SEQ ID NO: 9).
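For planning purposes, the cycling conditions given above can be written down as a small data structure and the nominal hold time computed from it. This Python sketch is only an illustration: the step labels and variable names are assumptions, and ramp times, any initial denaturation, and any final extension are not stated in the text and are therefore not included.

```python
# Cycling conditions as stated in the text: (label, temperature in C, seconds).
# The step labels are conventional names added for readability.
CYCLES = 30
STEPS = [
    ("denaturation", 95, 10),
    ("annealing", 55, 15),
    ("extension", 68, 150),  # 2 min 30 sec
]


def program_seconds(cycles, steps):
    """Total hold time of the cycling block, excluding ramping."""
    return cycles * sum(seconds for _label, _temp, seconds in steps)


total = program_seconds(CYCLES, STEPS)
print(f"{total} s = {total / 60:.1f} min of hold time")
```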

The composite sequence contained an open reading frame of 1359 base pairs encoding a putative protein of 453 amino acids with type II domain structure predicted by the TMpred-algorithm at the Swiss Institute for Experimental Cancer Research (ISREC) (http://www.ch.embnet.org/software/TMPREDform.html).
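The stated figures are mutually consistent: 1359 base pairs correspond to exactly 453 codons. A short Python check is given below as a sketch; it assumes that the 1359 bp figure counts coding codons only, since the text does not say whether the stop codon is included.

```python
ORF_LENGTH_BP = 1359   # from the text
PROTEIN_LENGTH_AA = 453  # from the text

# Each amino acid is encoded by one codon of three nucleotides.
codons, remainder = divmod(ORF_LENGTH_BP, 3)
print(codons, remainder)
assert remainder == 0
assert codons == PROTEIN_LENGTH_AA
```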

EXAMPLE 2

A: Expression of C2GnT3 in Sf9 cells

An expression construct designed to encode amino acid residues 39-453 of C2GnT3 was prepared by PCR using P1 DNA, and the primer pair TSHC100 (5'-CGAGGATCCGCAAAAAGACATTTACTTGGTT-3', SEQ ID NO: 5) and TSHC121 (5'-AGCGAATTCTTACTATCATGATGTGGTAGTG-3', SEQ ID NO: 9) with BamHI and EcoRI restriction sites, respectively (Fig. 2). The PCR product was cloned between the BamHI and EcoRI sites of pAcGP67A (PharMingen), and the insert was fully sequenced. pAcGP67-C2GnT3-sol was co-transfected with Baculo-Gold DNA (PharMingen) as described previously (23). Recombinant Baculo-viruses were obtained after two successive amplifications in Sf9 cells grown in serum-containing medium, and titers of virus were estimated by titration in 24-well plates with monitoring of enzyme activities. Transfection of Sf9 cells with pAcGP67-C2GnT3-sol resulted in a marked increase in GlcNAc-transferase activity compared to uninfected cells or cells infected with a control construct.
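The restriction sites carried by the two cloning primers can be verified directly from the primer sequences. The Python sketch below uses the well-known recognition sequences of BamHI (GGATCC) and EcoRI (GAATTC); the function and variable names are assumptions made for illustration. It also computes the length of the encoded fragment, residues 39 to 453 inclusive.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "TSHC100": "CGAGGATCCGCAAAAAGACATTTACTTGGTT",
    "TSHC121": "AGCGAATTCTTACTATCATGATGTGGTAGTG",
}


def find_sites(primer):
    """Map enzyme name to the 0-based offset of its site in the primer."""
    return {name: primer.find(site)
            for name, site in SITES.items()
            if site in primer}


for name, sequence in PRIMERS.items():
    print(name, find_sites(sequence))

# Residues 39 through 453 inclusive.
construct_residues = 453 - 39 + 1
print(construct_residues)
```

In both primers the recognition site follows a three-nucleotide 5' overhang, a common primer design choice that leaves a few flanking bases for the enzyme to bind.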

B: Analysis of C2GnT3 activity

Standard assays were performed using culture supernatant from infected cells in 50 µl reaction mixtures containing 100 mM MES (pH 6.5), 0.1% Nonidet [...] M UDP-[14C]-GlcNAc (2,000 cpm/nmol) (Amersham Pharmacia Biotech), and the indicated concentrations of acceptor substrates (Sigma and Toronto Research Laboratories Ltd., see Table I for structures). Reaction products were quantified by chromatography on Dowex AG1-X8.
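With a donor substrate of known specific activity, the radioactivity recovered in the product fraction converts directly to the amount of GlcNAc transferred. The following Python sketch shows that conversion using the 2,000 cpm/nmol figure from the text; the function name, the optional background subtraction, and the example counts are assumptions made for illustration, and counting efficiency is assumed to be the same for sample and standard.

```python
SPECIFIC_ACTIVITY_CPM_PER_NMOL = 2000.0  # UDP-[14C]-GlcNAc, from the text


def nmol_transferred(sample_cpm, background_cpm=0.0):
    """Convert product counts to nmol of GlcNAc transferred.

    Background (e.g. a no-acceptor control) is subtracted first;
    negative net counts are reported as zero.
    """
    net_cpm = max(sample_cpm - background_cpm, 0.0)
    return net_cpm / SPECIFIC_ACTIVITY_CPM_PER_NMOL


# Hypothetical counts, for illustration only.
print(nmol_transferred(sample_cpm=4200, background_cpm=200))
```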

EXAMPLE 3

Restricted organ expression pattern of C2GnT3

A human RNA master blot (CLONTECH) was used for expression analysis. The cDNA fragment of soluble C2GnT3 was used as a probe for hybridization. The probe was random primer-labeled using [α-32P]dATP and the Strip-EZ DNA labeling kit (Ambion). The membrane was probed for 6 h at 65°C following the protocol of the manufacturer (CLONTECH) and washed five times for 20 min each at 65°C with 2 x SSC, 1% SDS and twice for 20 min each at 55°C with 0.1 x SSC, 0.5% SDS. A human multiple tissue Northern blot, MTN II (CLONTECH), was probed as described (24), and washed twice for 10 min each at room temperature with 2 x SSC, 0.1% SDS; twice for 10 min each at 55°C with 1 x SSC, 0.1% SDS; and once for 10 min with 0.1 x SSC, 0.1% SDS at 55°C.

EXAMPLE 4

Analysis of C2GnT3 gene expression in peripheral blood mononuclear cells

PCR analysis of C2GnT3 expression in resting and activated human blood cell fractions was performed using the primer pair TSHC118 (5'-GAGTCAGTGTGGAATTGAATAC-3', SEQ ID NO: 7) and TSHC126 (5'-CAACAGTCTCCTCAACCCTG-3', SEQ ID NO: 11). PCR amplifications with primers specific for human C2GnT3 (C2GnT3) or GAPDH (G3PDH, supplied by the manufacturer) were performed on a normalized human blood cell cDNA panel (MTC from CLONTECH) for 31 cycles. Expression of C2GnT3 transcript was detected in all peripheral blood mononuclear cell (PBMC) fractions with particularly high levels of expression in CD4 and CD8 positive T-lymphocytes (Figure 4).

EXAMPLE 5

Analysis of DNA polymorphism of the C2GnT3 gene

Primer pairs such as TSHC123 (5'-GGGCAGCATTTGCCTAGTATG-3', SEQ ID NO: 10) and TSHC119 (5'-GATCTCTGATTTGGCTCAGTG-3', SEQ ID NO: 8) as described in Figure 5 have been used for PCR amplification of individual sequences of the coding exon. Each PCR product was subcloned and the sequence of 10 clones containing the appropriate insert was determined, assuring that both alleles of each individual are characterized.
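The rationale for sequencing 10 clones per individual can be illustrated with a short probability calculation. The Python sketch below assumes a heterozygous individual whose two alleles are amplified and subcloned with equal probability and independently for each clone; these assumptions, and the function name, are introduced here for illustration and are not stated in the text. Under them, the chance that all n clones come from the same allele is 2 x (1/2)^n.

```python
def prob_both_alleles_sampled(n_clones):
    """Probability that n clones include both alleles of a heterozygote.

    Assumes each clone independently carries either allele with
    probability 1/2 (no amplification or cloning bias).
    """
    if n_clones < 1:
        raise ValueError("n_clones must be at least 1")
    return 1.0 - 2.0 * 0.5 ** n_clones


for n in (2, 5, 10):
    print(n, prob_both_alleles_sampled(n))
```

With 10 clones the probability of missing one allele under these assumptions is 1 in 512, which is consistent with the stated aim of characterizing both alleles.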

From the foregoing it will be evident that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention.

REFERENCES

1. Clausen, H. and Bennett, E. P. A family of UDP-GalNAc: polypeptide N-acetylgalactosaminyltransferases control the initiation of mucin-type O-linked glycosylation. Glycobiology 6: 635-646, 1996.

2. Piller, F., Piller, V., Fox, R. I., and Fukuda, M. Human T-lymphocyte activation is associated with changes in O-glycan biosynthesis. J. Biol. Chem. 263: 15146-15150, 1988.

3. Yang, J. M., Byrd, J. C., Siddiki, B. B., Chung, Y. S., Okuno, M., Sowa, M., Kim, Y. S., Matta, K. L., and Brockhausen, I. Alterations of O-glycan biosynthesis in human colon cancer tissues. Glycobiology 4: 873-884, 1994.

4. Yousefi, S., Higgins, E., Daoling, Z., Pollex-Kruger, A., Hindsgaul, O., and Dennis, J. W. Increased UDP-GlcNAc: Gal beta 1-3GalNAc-R (GlcNAc to GalNAc) beta-1,6-N-acetylglucosaminyltransferase activity in metastatic murine tumor cell lines. Control of polylactosamine synthesis. J. Biol. Chem. 266: 1772-1782, 1991.

5. Fukuda, M. Possible roles of tumor-associated carbohydrate antigens. Cancer Res. 56: 2237-2244, 1996.

6. Brockhausen, I., Yang, J. M., Burchell, J., Whitehouse, C., and Taylor-Papadimitriou, J. Mechanisms underlying aberrant glycosylation of MUC1 mucin in breast cancer cells. Eur. J. Biochem. 233: 607-617, 1995.

7. Brockhausen, I., Kuhns, W., Schachter, H., Matta, K. L., Sutherland, D. R., and Baker, M. A. Biosynthesis of O-glycans in leukocytes from normal donors and from patients with leukemia: increase in O-glycan core 2 UDP-GlcNAc: Gal beta 3 GalNAc alpha-R (GlcNAc to GalNAc) beta (1-6)-N-acetylglucosaminyltransferase in leukemic cells. Cancer Res. 51: 1257-1263, 1991.

8. Higgins, E. A., Siminovitch, K. A., Zhuang, D. L., Brockhausen, I., and Dennis, J. W. Aberrant O-linked oligosaccharide biosynthesis in lymphocytes and platelets from patients with the Wiskott-Aldrich syndrome. J. Biol. Chem. 266: 6280-6290, 1991.

9. Saitoh, O., Piller, F., Fox, R. I., and Fukuda, M. T-lymphocytic leukemia expresses complex, branched O-linked oligosaccharides on a major sialoglycoprotein, leukosialin. Blood 77: 1491-1499, 1991.

10. Springer, G. F. T and Tn, general carcinoma autoantigens. Science 224: 1198-1206, 1984.

11. Kumar, R., Camphausen, R. T., Sullivan, F. X., and Cumming, D. A. Core2 beta-1,6-N-acetylglucosaminyltransferase enzyme activity is critical for P-selectin glycoprotein ligand-1 binding to P-selectin. Blood 88: 3872-3879, 1996.

12. Williams, D. and Schachter, H. Mucin synthesis. I. Detection in canine submaxillary glands of an N-acetylglucosaminyltransferase which acts on mucin substrates. J. Biol. Chem. 255: 11247-11252, 1980.

13. Bierhuizen, M. F. and Fukuda, M. Expression cloning of a cDNA encoding UDP-GlcNAc: Gal beta 1-3GalNAc-R (GlcNAc to GalNAc) beta 1-6GlcNAc transferase by gene transfer into CHO cells expressing polyoma large tumor antigen. Proc. Natl. Acad. Sci. U.S.A. 89: 9326-9330, 1992.

14. Schwientek, T., Yeh, J. C., Levery, S. B., Keck, B., Merkx, G., van Kessel, A. G., Fukuda, M., and Clausen, H. Control of O-glycan branch formation. Molecular cloning and characterization of a novel thymus-associated core 2 beta1,6-N-acetylglucosaminyltransferase. J. Biol. Chem. 275: 11106-11113, 2000.

15. Schwientek, T., Nomoto, M., Levery, S. B., Merkx, G., van Kessel, A. G., Bennett, E. P., Hollingsworth, M. A., and Clausen, H. Control of O-glycan branch formation. Molecular cloning of human cDNA encoding a novel beta1,6-N-acetylglucosaminyltransferase forming core 2 and core 4. J. Biol. Chem. 274: 4504-4512, 1999.

16. Yeh, J. C., Ong, E., and Fukuda, M. Molecular cloning and expression of a novel beta-1,6-N-acetylglucosaminyltransferase that forms core 2, core 4, and I branches. J. Biol. Chem. 274: 3215-3221, 1999.

17. Baum, L. G., Pang, M., Perillo, N. L., Wu, T., Delegeane, A., Uittenbogaart, C. H., Fukuda, M., and Seilhamer, J. J. Human thymic epithelial cells express an endogenous lectin, galectin-1, which binds to core 2 O-glycans on thymocytes and T lymphoblastoid cells. J. Exp. Med. 181: 877-887, 1995.

18. Perillo, N. L., Marcus, M. E., and Baum, L. G. Galectins: versatile modulators of cell adhesion, cell proliferation, and cell death. J. Mol. Med. 76: 402-412, 1998.

19. Perillo, N. L., Pace, K. E., Seilhamer, J. J., and Baum, L. G. Apoptosis of T cells mediated by galectin-1. Nature 378: 736-739, 1995.

20. Devereux, J., Haeberli, P., and Smithies, O. A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12: 387-395, 1984.

21. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215: 403-410, 1990.

22. Bierhuizen, M. F., Maemura, K., Kudo, S., and Fukuda, M. Genomic organization of core 2 and I branching beta-1,6-N-acetylglucosaminyltransferases. Implication for evolution of the beta-1,6-N-acetylglucosaminyltransferase gene family. Glycobiology 5: 417-425, 1995.

23. Almeida, R., Amado, M., David, L., Levery, S. B., Holmes, E. H., Merkx, G., van Kessel, A. G., Rygaard, E., Hassan, H., Bennett, E., and Clausen, H. A family of human beta4-galactosyltransferases. Cloning and expression of two novel UDP-galactose: beta-N-acetylglucosamine beta1,4-galactosyltransferases, beta4Gal-T2 and beta4Gal-T3. J. Biol. Chem. 272: 31979-31991, 1997.

24. Bennett, E. P., Hassan, H., and Clausen, H. cDNA cloning and expression of a novel human UDP-N-acetyl-alpha-D-galactosamine: polypeptide N-acetylgalactosaminyltransferase, GalNAc-T3. J. Biol. Chem. 271: 17006-17012, 1996.

25. Wandall, H. H., Hassan, H., Mirgorodskaya, E., Kristensen, A. K., Roepstorff, P., Bennett, E. P., Nielsen, P. A., Hollingsworth, M. A., Burchell, J., Taylor-Papadimitriou, J., and Clausen, H. Substrate specificities of three members of the human UDP-N-acetyl-alpha-D-galactosamine: polypeptide N-acetylgalactosaminyltransferase family, GalNAc-T1, -T2, and -T3. J. Biol. Chem. 272: 23503-23514, 1997.

26. Matteucci, M. D. and Caruthers, M. H. J. Am. Chem. Soc. 103: 3185-3191, 1981.

27. Yoo, Y., Rote, K., and Rechsteiner, M. Synthesis of peptides as cloned ubiquitin extensions. J. Biol. Chem. 264: 17078-17083, 1989.