Title:
GLYCAN PANELS AS SPECIFIC TUMOR TISSUE BIOMARKERS
Document Type and Number:
WIPO Patent Application WO/2016/036705
Kind Code:
A1
Abstract:
The present invention provides methods and compositions for the profiling of glycans in a biological sample. In one embodiment, the method includes the generation of panels of multiple glycans associated with a disease state. The invention further relates to the use of glycan panels for the diagnosis and screening of disease states, particularly in the field of cancer biology.

Inventors:
DRAKE RICHARD R (US)
POWERS THOMAS W (US)
NEELY BENJAMIN A (US)
Application Number:
PCT/US2015/047889
Publication Date:
March 10, 2016
Filing Date:
September 01, 2015
Assignee:
MUSC FOUND FOR RES DEV (US)
International Classes:
G01N33/53; G01N27/62; G01N33/48
Domestic Patent References:
WO2013177385A1 2013-11-28
WO2010142860A1 2010-12-16
WO2008128220A1 2008-10-23
Foreign References:
US20120109530A1 2012-05-03
US20060171586A1 2006-08-03
US20140154710A1 2014-06-05
Attorney, Agent or Firm:
CHEN, Ming et al. (LLP, 300 Four Falls Corporate Center, Suite 710, 300 Conshohocken State Road, West Conshohocken PA, US)
Claims:
CLAIMS

What is claimed is:

1. A method of spatially profiling glycans on a tissue section, the method comprising the steps of:

a. heating the tissue section in antigen retrieval buffer and washing in xylene;

b. digesting in situ the proteins in the tissue section;

c. depositing a matrix solution onto the tissue section;

d. detecting released glycan ions by mass spectrometry;

e. deconvoluting the mass spectrometry data;

f. associating individual peaks with glycans based on mass accuracy; and

g. viewing identified glycans on imaging software to assess intensity distribution of ions.

2. The method of claim 1, wherein the glycans are N-linked glycans.

3. The method of claim 1, wherein the tissue section is a formalin fixed, paraffin embedded tissue section.

4. The method of claim 1, wherein the proteins are digested enzymatically with PNGaseF.

5. The method of claim 1, wherein the matrix solution is α-cyano-4-hydroxycinnamic acid.

6. The method of claim 1, wherein the mass spectrometry is Fourier transform ion cyclotron resonance (FTICR) mass spectrometry.

7. The method of claim 1, wherein the mass spectrometry is matrix-assisted laser desorption/ionization imaging mass spectroscopy (MALDI-IMS).

8. A method for processing data collected from a tissue microarray section comprising a plurality of tissue microarray cores, the method comprising the steps of:

a. heating the tissue microarray section in antigen retrieval buffer and washing in xylene;

b. digesting in situ the proteins in the tissue microarray section;

c. depositing a matrix solution onto the tissue microarray section;

d. detecting released glycan ions by mass spectrometry;

e. generating machine learning models using mass spectrometry data of a random selection of a first percent of tissue microarray cores; and

f. optimizing the machine learning models by cross-validation on the first percent of tissue microarray cores, and qualifying the performance using the mass spectrometry data of the second percent of tissue microarray cores to form panels of glycans with individual sensitivities and specificities.

9. The method of claim 8, wherein the glycans are N-linked glycans.

10. The method of claim 8, wherein the tissue microarray section is a formalin fixed, paraffin embedded tissue microarray section.

11. The method of claim 8, wherein the proteins are digested enzymatically with PNGaseF.

12. The method of claim 8, wherein the matrix solution is α-cyano-4-hydroxycinnamic acid.

13. The method of claim 8, wherein the mass spectrometry is Fourier transform ion cyclotron resonance (FTICR) mass spectrometry.

14. The method of claim 8, wherein the mass spectrometry is matrix-assisted laser desorption/ionization imaging mass spectroscopy (MALDI-IMS).

15. The method of claim 8, wherein the machine learning model is a supervised machine learning model.

16. The method of claim 15, wherein the supervised machine learning model is selected from the group consisting of: random forest, linear basis function kernel support vector machine, radial basis function kernel support vector machine, naive Bayes classifier, linear discriminant analysis, quadratic discriminant analysis, neural networks, artificial neural networks, genetic algorithm, k-nearest neighbors, and combinations thereof.

17. The method of claim 8, wherein the machine learning model is optimized by forward sequential feature selection.

18. The method of claim 8, wherein the tissue microarray section comprises tumor tissue cores and normal tissue cores.

19. A glycan panel comprising glycans associated with a type of cancer tissue identified by the method of claim 8.

20. A glycan panel associated with pancreatic cancer tissue, said panel comprising one or more of Hex6HexNAc2 glycan; Hex4dHex1HexNAc3 glycan; Hex3dHex1HexNAc4 glycan; Hex4HexNAc4 glycan; Hex7HexNAc2 glycan; Hex4dHex1HexNAc4 glycan; Hex3dHex1HexNAc5 glycan; Hex8HexNAc2 glycan; Hex5dHex1HexNAc4 glycan; Hex4dHex1HexNAc5 glycan; Hex5HexNAc4NeuAc1 glycan; Hex5dHex1HexNAc5 glycan; Hex6HexNAc5 glycan; Hex5dHex1HexNAc4NeuAc1 glycan; Hex5dHex2HexNAc5 glycan; Hex6dHex1HexNAc5 glycan; Hex6dHex2HexNAc5 glycan; Hex7HexNAc6 glycan; Hex9HexNAc3NeuAc1 glycan; Hex7dHex1HexNAc6 glycan; Hex7dHex1HexNAc7 glycan; and Hex9dHex1HexNAc8 glycan.

21. A kit for diagnosing or monitoring cancer in an individual wherein the glycan profile of a test sample from said individual is determined and comparing the measured profile with a profile of normal patient or profile of a patient with a family history of cancer, wherein said kit comprises an array of glycan molecules identified by the method of claim 8.

22. A method of diagnosing cancer in a tissue sample, said method comprising detecting in a tissue sample the presence of at least one glycan selected from a glycan panel.

23. The method of claim 22, wherein the glycan panel comprises glycans associated with a type of cancer tissue identified by the method of claim 8.

24. A method of identifying pancreatic cancer in a tissue sample, said method comprising detecting in a tissue sample the presence of one or more of Hex6HexNAc2 glycan; Hex4dHex1HexNAc3 glycan; Hex3dHex1HexNAc4 glycan; Hex4HexNAc4 glycan; Hex7HexNAc2 glycan; Hex4dHex1HexNAc4 glycan; Hex3dHex1HexNAc5 glycan; Hex8HexNAc2 glycan; Hex5dHex1HexNAc4 glycan; Hex4dHex1HexNAc5 glycan; Hex5HexNAc4NeuAc1 glycan; Hex5dHex1HexNAc5 glycan; Hex6HexNAc5 glycan; Hex5dHex1HexNAc4NeuAc1 glycan; Hex5dHex2HexNAc5 glycan; Hex6dHex1HexNAc5 glycan; Hex6dHex2HexNAc5 glycan; Hex7HexNAc6 glycan; Hex9HexNAc3NeuAc1 glycan; Hex7dHex1HexNAc6 glycan; Hex7dHex1HexNAc7 glycan; and Hex9dHex1HexNAc8 glycan.

Description:
TITLE OF THE INVENTION

GLYCAN PANELS AS SPECIFIC TUMOR TISSUE BIOMARKERS

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application No. 62/045,202 filed September 3, 2014, the contents of which are incorporated by reference herein in their entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with government support under R21CA137704 and R01CA135087 awarded by National Institutes of Health (NIH). This invention was also supported by grants R01 CA120206 and U01 CA168856 from the National Cancer Institute (NCI) and U01 CA168896 from the NIH/NCI. The government has certain rights in the invention.

BACKGROUND OF THE INVENTION

Tissues obtained from surgeries or diagnostic procedures are most commonly preserved by fixation in formalin and processed as paraffin-embedded tissue blocks. The embedding process preserves the cellular morphology and allows tissues to be stored at room temperature, causing formalin-fixed paraffin-embedded (FFPE) fixation to be used by many tissue banks and biorepositories (Thompson et al, Proteomics Clin Appl, 2013, 7(3-4):241-51; Craven et al, Proteomics Clin Appl, 2013, 7(3-4):273-82). For cancer biomarker discovery, FFPE tissues are particularly attractive because they are archived for years and are much more widely available than cryopreserved tissue. When combined with clinical outcomes, FFPE tissues are a rich source of samples for biomarker discovery and validation in retrospective studies. While the fixation method has many benefits, the formalin treatment results in the formation of methylene bridges between the amino acids of the proteins, complicating further analysis by mass spectrometry. There has been continued progress in improving extraction methods of trypsin digested peptides from FFPE tissues in recent years, in parallel with improved high resolution sequencing analysis of peptides by mass spectrometry (Magdeldin et al, Proteomics, 12: 1045-1058; Wisniewski et al, Proteomics Clin Appl, 2013, 7(3-4):225-33). Incorporation of multiple FFPE tumor tissue cores in a tissue microarray (TMA) format also has proven to be effective for immunohistochemistry analysis of potential biomarker candidates (Takikita et al, Curr Opin Biotechnol, 18: 318-325), and TMAs are increasingly being used for validation of alterations in protein expression associated with emerging genetic mutation phenotypes and transcriptional profiling studies (Franco et al, Expert Rev Anticancer Ther., 2011, 11: 859-869; Hewitt, Methods Enzymol, 2006, 410: 400-415).
The main advantages of experiments performed with TMAs are the ability to include multiple cores from the same subject tumors, improved sample throughput, statistical relevance and multiplexed analysis of diverse molecular targets (Takikita et al, Curr Opin Biotechnol, 18: 318-325; Camp et al, J. Clin. Oncol, 2008, 26: 5630-5637). Thus, it is possible to place up to 100 samples with duplicates and controls on a single slide. When correlated with associated clinical outcomes, this provides a powerful method for biomarker discovery and validation while minimizing reagent use and assuring that each core in the TMA is treated under identical conditions.

It is well documented that malignant transformation and cancer progression result in fundamental changes in the glycosylation patterns of cell surface and secreted glycoproteins. Glycosylation of proteins is a post-translational modification most commonly involving either N-linked addition to asparagine residues or O-linked addition to serine or threonine residues. Current approaches to evaluate glycosylation changes generally involve bulk extraction of glycans and glycoproteins from tumor tissues for analysis by mass spectrometry or antibody array platforms; however, this disrupts tissue architecture and distribution of the analytes. Broad affinity carbohydrate binding lectins and a small number of glycan antigen antibodies can be used to target glycan structural classes in tissues, but not individual glycan species. Additionally, these detection methods for global alterations in glycosylation require staining on many adjacent tissue sections, making large scale assessments on many samples difficult, expensive and time consuming. There are only a few reported studies examining glycosylation changes of proteins in FFPE cancer tissues (van Cruijsen et al, BMC Cancer, 2009, 9: 180; Chen et al, Proc Natl Acad Sci USA, 2013, 110(2):630-5). One potential approach to assess glycan changes in tissues is matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI-IMS). This technique has been used to directly profile multiple proteins (Chaurand et al, J Proteome Res, 2006, 5: 2889-2900; Cazares et al., Clin Cancer Res, 2009, 15: 5541-5551), lipids (Berry et al, Chem Rev, 111: 6491-6512; Chaurand et al, Mol Cell Proteomics, 2011, 10: O110.004259) and drug metabolites (Castellino et al, Bioanalysis, 2011, 3: 2427-2441; Cornett et al, Anal Chem, 2008, 80: 5648-5653; Nilsson et al, PLoS One, 2010, 5: e11411) in tissue, generating molecular maps of the relative abundance and spatial distribution of individual analytes linked to tissue histopathology.
MALDI-IMS analysis of peptides following trypsin digestion of FFPE TMAs has also been reported (Groseclose et al, Proteomics, 2008, 8: 3715-3724; Quaas et al., Histopathology, 2013, 63: 455-462; Casadonte et al, Nat Protoc, 2011, 6: 1695-1709).

Recently, a MALDI-IMS workflow to directly profile N-linked glycan species in fresh/frozen tissues was reported (Powers et al, Anal Chem, 2013, 85: 9799-9806). However, there is a need in the art to analyze N-glycans in FFPE tissues. The present invention satisfies this need.

SUMMARY OF THE INVENTION

The present invention provides methods and compositions for the profiling of glycans in a biological sample. In one embodiment, the method includes the generation of panels of multiple glycans associated with a disease state. The invention further relates to the use of glycan panels for the diagnosis and screening of disease states, particularly in the field of cancer biology.

In one aspect, the invention relates to a method of spatially profiling glycans on a tissue section. The method comprises the steps of: (a) heating the tissue section in antigen retrieval buffer; (b) digesting in situ the proteins in the tissue section; (c) depositing a matrix solution onto the tissue section; (d) detecting released glycan ions by mass spectrometry; (e) deconvoluting the mass spectrometry data; (f) associating individual peaks with glycans based on mass accuracy; and (g) viewing identified glycans on imaging software to assess intensity distribution of ions.
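By way of non-limiting illustration, step (f), associating individual peaks with glycans based on mass accuracy, can be sketched as follows. The sketch assumes native (underivatized) glycans observed as singly charged sodium adducts, uses standard monoisotopic residue masses, and applies a hypothetical 50 ppm tolerance; the function names, the candidate list and the tolerance are assumptions made for illustration and are not taken from the disclosure. The example peak values are those quoted for the pancreatic tissue in the description of Figure 3.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 4 decimals).
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "dHex": 146.0579, "NeuAc": 291.0954}
WATER = 18.0106       # one water for the free reducing end
SODIUM_ION = 22.9892  # sodium atom minus one electron


def sodiated_mz(composition):
    """Theoretical m/z of the singly charged [M+Na]+ ion of a native glycan."""
    residues = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return residues + WATER + SODIUM_ION


def assign_peaks(observed_mz, candidates, tol_ppm=50.0):
    """Map each observed peak to the closest candidate within tol_ppm, else None.

    tol_ppm is an illustrative assumption, not a value from the source.
    """
    assignments = {}
    for mz in observed_mz:
        best_name, best_err = None, tol_ppm
        for name, comp in candidates.items():
            theory = sodiated_mz(comp)
            err = abs(mz - theory) / theory * 1e6  # mass error in ppm
            if err <= best_err:
                best_name, best_err = name, err
        assignments[mz] = best_name
    return assignments


# Hypothetical candidate list; a real search would use a full glycan database.
CANDIDATES = {
    "Hex5HexNAc4": {"Hex": 5, "HexNAc": 4},
    "Hex8HexNAc2": {"Hex": 8, "HexNAc": 2},
    "Hex5dHex1HexNAc4": {"Hex": 5, "dHex": 1, "HexNAc": 4},
}

if __name__ == "__main__":
    peaks = [1663.64, 1743.64, 1809.69, 1500.00]
    for mz, name in assign_peaks(peaks, CANDIDATES).items():
        print(mz, "->", name)
```

A tighter tolerance may be appropriate for high-resolution instruments; the value here is chosen only so that the quoted example peaks fall inside the window.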

In one embodiment, the glycans are N-linked glycans. In one embodiment, the tissue section is a formalin fixed, paraffin embedded tissue section. In one embodiment, the proteins are digested enzymatically with PNGaseF. In one embodiment, the matrix solution is α-cyano-4-hydroxycinnamic acid. In one embodiment, the mass spectrometry is Fourier transform ion cyclotron resonance (FTICR) mass spectrometry. In one embodiment, the mass spectrometry is matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI-IMS).

In another aspect, the invention relates to a method of processing data collected from a tissue microarray section comprising a plurality of tissue microarray cores. The method comprises the steps of: (a) heating the tissue microarray section in antigen retrieval buffer and washing in xylene; (b) digesting in situ the proteins in the tissue microarray section; (c) depositing a matrix solution onto the tissue microarray section; (d) detecting released glycan ions by mass spectrometry; (e) generating machine learning models using mass spectrometry data of a random selection of a first percent of tissue microarray cores; and (f) optimizing the machine learning models by cross-validation on the first percent of tissue microarray cores, and qualifying the performance using the mass spectrometry data of the second percent of tissue microarray cores to form panels of glycans with individual sensitivities and specificities.

In one embodiment, the glycans are N-linked glycans. In one embodiment, the tissue microarray section is a formalin fixed, paraffin embedded tissue microarray section. In one embodiment, the proteins are digested enzymatically with PNGaseF. In one embodiment, the matrix solution is α-cyano-4-hydroxycinnamic acid. In one embodiment, the mass spectrometry is Fourier transform ion cyclotron resonance (FTICR) mass spectrometry. In one embodiment, the mass spectrometry is matrix-assisted laser desorption/ionization imaging mass spectroscopy (MALDI-IMS).

In one embodiment, the machine learning model is a supervised machine learning model. In one embodiment, the supervised machine learning model is selected from the group consisting of: random forest, linear basis function kernel support vector machine, radial basis function kernel support vector machine, naive Bayes classifier, linear discriminant analysis, quadratic discriminant analysis, neural networks, artificial neural networks, genetic algorithm, k-nearest neighbors, and combinations thereof. In one embodiment, the machine learning model is optimized by forward sequential feature selection. In one embodiment, the tissue microarray section comprises tumor tissue cores and normal tissue cores.
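By way of non-limiting illustration, forward sequential feature selection scored by cross-validation can be sketched as follows. A nearest-centroid classifier is swapped in purely as a simple stand-in for the supervised models listed above, the folds are interleaved by core position, and the data are synthetic; all function names and the toy data are assumptions made for illustration only.

```python
def nearest_centroid(train_X, train_y, x, features):
    """Stand-in classifier: predict the class whose mean is closest on `features`."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        dist = 0.0
        for f in features:
            mean = sum(r[f] for r in rows) / len(rows)
            dist += (x[f] - mean) ** 2
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def cv_accuracy(X, y, features, k=5):
    """Accuracy over k interleaved folds (core i belongs to fold i % k)."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        tX = [X[i] for i in train]
        ty = [y[i] for i in train]
        correct += sum(nearest_centroid(tX, ty, X[i], features) == y[i] for i in test)
    return correct / len(X)


def forward_select(X, y, max_features=2, k=5):
    """Greedily add the feature that most improves cross-validated accuracy."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(cv_accuracy(X, y, selected + [f], k), f) for f in remaining]
        score, feature = max(scored, key=lambda t: t[0])
        if score <= best_score:  # stop when no candidate improves the score
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


def make_toy_cores(n_per_class=10):
    """Synthetic data (not real measurements): only feature 1 separates classes."""
    tumor = [[(j * 3) % 4, 10 + 0.1 * (j % 3), (j * 5) % 7] for j in range(n_per_class)]
    normal = [[(j * 3) % 4, 0.1 * (j % 3), (j * 5) % 7] for j in range(n_per_class)]
    return tumor + normal, [1] * n_per_class + [0] * n_per_class


if __name__ == "__main__":
    X, y = make_toy_cores()
    print(forward_select(X, y))
```

In this sketch the selection runs only on the cores reserved for model building; the held-out cores would be used afterwards to qualify performance.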

In another aspect, the invention also relates to a glycan panel comprising glycans associated with a type of cancer tissue identified by the methods of the present invention. In another aspect, the invention also relates to a glycan panel associated with pancreatic cancer tissue, comprising one or more of Hex6HexNAc2 glycan; Hex4dHex1HexNAc3 glycan; Hex3dHex1HexNAc4 glycan; Hex4HexNAc4 glycan; Hex7HexNAc2 glycan; Hex4dHex1HexNAc4 glycan; Hex3dHex1HexNAc5 glycan; Hex8HexNAc2 glycan; Hex5dHex1HexNAc4 glycan; Hex4dHex1HexNAc5 glycan; Hex5HexNAc4NeuAc1 glycan; Hex5dHex1HexNAc5 glycan; Hex6HexNAc5 glycan; Hex5dHex1HexNAc4NeuAc1 glycan; Hex5dHex2HexNAc5 glycan; Hex6dHex1HexNAc5 glycan; Hex6dHex2HexNAc5 glycan; Hex7HexNAc6 glycan; Hex9HexNAc3NeuAc1 glycan; Hex7dHex1HexNAc6 glycan; Hex7dHex1HexNAc7 glycan; and Hex9dHex1HexNAc8 glycan.
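By way of non-limiting illustration, a composition name of the kind used in the panel above, for example Hex5dHex1HexNAc4NeuAc1, can be parsed into residue counts as sketched below. The function name and the strict validation behavior are assumptions made for illustration; the sketch recognizes only the four residue abbreviations that appear in the panel.

```python
import re

# Longer abbreviations are listed before "Hex" so that "HexNAc" is not
# misread as "Hex" followed by unparseable text.
_TOKEN = re.compile(r"(HexNAc|dHex|NeuAc|Hex)(\d+)")
_WHOLE = re.compile(r"(?:(?:HexNAc|dHex|NeuAc|Hex)\d+)+")


def parse_composition(name):
    """Return residue counts, e.g. 'Hex5HexNAc4' -> {'Hex': 5, 'HexNAc': 4}."""
    if not _WHOLE.fullmatch(name):
        raise ValueError(f"unrecognized glycan composition: {name!r}")
    counts = {}
    for residue, number in _TOKEN.findall(name):
        counts[residue] = counts.get(residue, 0) + int(number)
    return counts


if __name__ == "__main__":
    print(parse_composition("Hex5dHex1HexNAc4NeuAc1"))
```

Because the whole string must match, a name in which a count has been corrupted into a letter is rejected rather than silently misparsed.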

In another aspect, the invention also relates to a kit for diagnosing or monitoring cancer in an individual wherein the glycan profile of a test sample from said individual is determined and comparing the measured profile with a profile of normal patient or profile of a patient with a family history of cancer, wherein said kit comprises an array of glycan molecules identified by the methods of the present invention.

In another aspect, the invention also relates to a method of diagnosing cancer in a tissue sample, comprising detecting in a tissue sample the presence of at least one glycan selected from a glycan panel. In one embodiment, the glycan panel comprises glycans associated with a type of cancer identified by the methods of the present invention.

In another aspect, the invention also relates to a method of identifying pancreatic cancer in a tissue sample, comprising detecting in a tissue sample the presence of one or more of Hex6HexNAc2 glycan; Hex4dHex1HexNAc3 glycan; Hex3dHex1HexNAc4 glycan; Hex4HexNAc4 glycan; Hex7HexNAc2 glycan; Hex4dHex1HexNAc4 glycan; Hex3dHex1HexNAc5 glycan; Hex8HexNAc2 glycan; Hex5dHex1HexNAc4 glycan; Hex4dHex1HexNAc5 glycan; Hex5HexNAc4NeuAc1 glycan; Hex5dHex1HexNAc5 glycan; Hex6HexNAc5 glycan; Hex5dHex1HexNAc4NeuAc1 glycan; Hex5dHex2HexNAc5 glycan; Hex6dHex1HexNAc5 glycan; Hex6dHex2HexNAc5 glycan; Hex7HexNAc6 glycan; Hex9HexNAc3NeuAc1 glycan; Hex7dHex1HexNAc6 glycan; Hex7dHex1HexNAc7 glycan; and Hex9dHex1HexNAc8 glycan.

BRIEF DESCRIPTION OF THE DRAWINGS

For the purpose of illustrating the invention, there are depicted in the drawings certain embodiments of the invention. However, the invention is not limited to the precise arrangements and instrumentalities of the embodiments depicted in the drawings.

Figure 1 is a schematic of the methodology for imaging N-glycans from FFPE tissues. Prior to enzyme application, FFPE blocks are cut at 5 µm, incubated, deparaffinized and undergo antigen retrieval. PNGaseF is then applied and the slide is incubated before MALDI-IMS. The data is then linked with histopathology either on the same tissue slice or a serial tissue slice.

Figure 2, comprising Figures 2A through 2F, is a series of images depicting MALDI-IMS of N-glycans on mouse kidney tissue. Two mouse kidneys were sliced at 5 µm prior to proceeding with the MALDI-IMS workflow. One tissue was covered with a glass slide during PNGaseF application to serve as an undigested control tissue. An averaged, annotated spectrum from the tissue that received PNGaseF application is provided (Figure 2A). Tissue regions were assessed by H&E stain (Figure 2B). The labeled peaks correspond to native N-glycans that have been reported for the mouse kidney on the Consortium for Functional Glycomics mouse kidney database. Two of these ions were selected and their tissue localization was assessed. Hex4dHex2HexNAc5 at m/z = 1996.7 (Figure 2C) is located in the cortex and medulla while Hex5dHex2HexNAc5 at m/z = 2158.7 (Figure 2D) is more abundant in the cortex of the mouse kidney. An overlay image of these two masses is also shown (Figure 2E), as well as the corresponding image from control tissue not treated with PNGaseF (Figure 2F).

Figure 3, comprising Figures 3A through 3C, is a series of images depicting MALDI-IMS of a human pancreas FFPE tissue block. An FFPE block of pancreatic tissue from a human patient was cut at 5 µm and selected for MALDI-IMS. Histopathology found four unique regions in the H&E of this tissue block. The tissue block contained tumor tissue, non-tumor tissue, fibroconnective tissue representing desmoplasia surrounding the tumor tissue, and necrotic tissue (Figures 3A and 3B). MALDI-IMS was able to distinguish these four regions based on specific ions. M/z = 1891.80 (red) is found in the non-tumor (NT) region of the pancreas and corresponds to Hex3dHex1HexNAc6, while m/z = 1743.64 (blue) represents Hex8HexNAc2 and is predominant in the tumor region (T) of the tissue. Desmoplasia (DP) is represented by m/z = 1809.69 (green) corresponding to Hex5dHex1HexNAc4. In the region where necrosis was identified (TN), m/z = 1663.64 (orange) was elevated, corresponding to Hex5HexNAc4. Image spectra were acquired at a 200 µm raster. Representative individual glycan images for the pancreatic FFPE tissue slice are also shown (Figure 3C).

Figure 4, comprising Figures 4A through 4F, is a series of images depicting MALDI-IMS of a human prostate FFPE tissue block. An archived FFPE block of prostate tissue from a human patient was cut at 5 µm and prepared for MALDI-IMS glycan analysis. An H&E image is shown (Figure 4A). A global glycan imaging experiment performed with a raster of 225 µm demonstrated a heterogeneous expression of two glycan ions (Figure 4B) at m/z = 1663.56 and (Figure 4C) at m/z = 1850.65. Stromal versus gland distribution was further assessed in a high resolution experiment at a 50 µm raster (Figures 4D-4F). Column (Figure 4D) indicates a 2X amplification of the H&E, and distribution of the same two glycans is shown at this magnification for m/z = 1663.56 (red) and m/z = 1850.65 (green), together with an overlay image. Column (Figure 4E) (enlargement of upper region shown in Figure 4D) and (Figure 4F) (enlargement of lower region shown in Figure 4D) show two highlighted regions of stroma and glands enhanced at 10X resolution, with the same colors and glycans shown for column (Figure 4D).

Figure 5, comprising Figures 5A and 5B, is a series of images showing a comparison of the fragmentation pattern of a glycan standard with the same ion on tissue. (Figure 5A). A representative MALDI spectrum for native N-linked glycans from pancreatic cancer FFPE tissue. (Figure 5B). NA2 glycan standard (m/z = 1663.6) was fragmented using CID, revealing a variety of cleavages across glycosidic bonds as demonstrated in the spectrum (Figure 5A). When the same ion was fragmented on the pancreatic tissue, the fragmentation pattern was the same, verifying that Hex5HexNAc4 was detected in the human pancreas.

Figure 6, comprising Figures 6A through 6D, is a series of images depicting N-glycan imaging of a liver TMA. A liver TMA purchased from BioChain consisting of 2 tumor tissue cores and one normal tissue core from 16 patients was imaged. The H&E (Figure 6A) provides the TMA location (red letters and numbers) and classifies whether the row is tumor (green bar) or non-tumor (red bar). M/z = 2393.95 (Figure 6C) and m/z = 1743.64 (Figure 6D) were able to distinguish between hepatocellular carcinoma and uninvolved liver tissue. An overlay of these ions demonstrates that m/z = 2393.95 is elevated in tumor tissue and m/z = 1743.64 is elevated in normal tissue (Figure 6B). Statistical data for these two ions is provided in Table 1 (200 µm raster).

Figure 7 shows a panel of mouse kidney N-glycans. Ions detected in the kidney with enzyme application were compared to the control tissue. Ions that were only observed in the tissue following PNGaseF application were compared to the glycans found in the mouse kidney database on the Consortium for Functional Glycomics. The panel provides the glycan species, the projected mass for the sodium adduct, and the observed mass for the sodium adduct.

Figure 8 is a chart showing permethylation of mouse kidney N-glycans. Mouse kidney N-glycans were extracted from the imaging slide after PNGaseF application and digestion. Glycans were dried down and underwent permethylation as described elsewhere herein. The permethylated m/z values were then compared to the permethylation data from the Consortium for Functional Glycomics mouse kidney database (www.functionalglycomics.org).

Figure 9 shows a panel of mouse kidney N-glycans linked to a known glycan database.

Figure 10 is a series of images showing individual N-glycans from prostate cancer FFPE tissue.

Figure 11 is a series of images showing collision-induced dissociation (CID) of N-glycans from human pancreas tissue.

Figure 12 is a series of images showing CID of N-glycans from human pancreas tissue.

Figure 13 is a series of images of ions corresponding to N-glycans. The ions identified in Table 1 were viewed in Fleximaging software.

Figure 14, comprising Figures 14A through 14G, depicts the results of experiments demonstrating heterogeneous distribution of N-glycans in pancreatic cancer tissue sections. (14A) A pancreatic ductal adenocarcinoma (PDAC) tissue section of complex histology was profiled by MALDI-IMS. In the H&E, tumor/pre-cancerous lesions (green), intestine mucosa (yellow), fibroadipose connective tissue (blue), smooth muscle (orange), and non-tumor pancreas tissue (red) are all outlined. In the MALDI-IMS images, individual glycans correlate with defined histology regions. Hex6dHex1HexNAc5 (14B, tumor), Hex5dHex3HexNAc5 (14C, intestine mucosa), Hex5HexNAc4NeuAc1 (14D, fibroadipose connective tissue), Hex4dHex1HexNAc4 (14E, smooth muscle), and Hex6HexNAc2 (14F, non-tumor pancreas) are among these N-glycans.

Figure 15 depicts a panel of N-Glycan distribution in PDAC tissue section. Upwards of 90 ions corresponding to N-glycoforms were observed in an individual tissue section. In a representative panel of these glycoforms, many of these glycans exhibit a distribution pattern related to regions of complex histology.

Figure 16 depicts a workflow schematic for the identification of individual and panels of N-glycan disease markers. Six TMAs were imaged by MALDI-IMS to add an element of throughput to the disease marker discovery process. Many glycoforms were detected in all 6 of the TMAs and were selected for further analysis of individual discriminators and as panels of biomarkers. For panel identification, 2/3rds of the data was used to optimize the variables for a linear discriminant analysis (LDA) model, while the remaining 1/3rd of the data was used to test the performance of the model.
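By way of non-limiting illustration, the 2/3 training and 1/3 testing scheme with a two-feature LDA model can be sketched as follows. The sketch assumes equal class priors, a pooled within-class covariance, and a random split performed separately within each class so that both classes appear in both subsets; these choices, the function names and the synthetic intensities are assumptions made for illustration and do not reproduce the reported analysis.

```python
import random


def stratified_split(labels, train_fraction, seed=0):
    """Randomly assign train_fraction of each class to training, the rest to testing."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == label]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train += idx[:cut]
        test += idx[cut:]
    return train, test


def fit_lda(X, y):
    """Two-class, two-feature LDA. Returns weights w and threshold c."""
    means, scatter = {}, [[0.0, 0.0], [0.0, 0.0]]
    for label in (0, 1):
        rows = [r for r, lab in zip(X, y) if lab == label]
        means[label] = [sum(r[d] for r in rows) / len(rows) for d in (0, 1)]
        for r in rows:
            d0, d1 = r[0] - means[label][0], r[1] - means[label][1]
            scatter[0][0] += d0 * d0
            scatter[0][1] += d0 * d1
            scatter[1][0] += d1 * d0
            scatter[1][1] += d1 * d1
    dof = len(X) - 2  # pooled covariance uses n - (number of classes)
    s = [[v / dof for v in row] for row in scatter]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [means[1][d] - means[0][d] for d in (0, 1)]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(means[1][d] + means[0][d]) / 2 for d in (0, 1)]
    c = w[0] * mid[0] + w[1] * mid[1]  # midpoint threshold (equal priors assumed)
    return w, c


def predict(w, c, x):
    """Return 1 (tumor) when the discriminant score exceeds the threshold."""
    return 1 if w[0] * x[0] + w[1] * x[1] > c else 0


def sensitivity_specificity(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic intensities for two hypothetical glycan features, not real data.
    X = [[5 + rng.uniform(-0.5, 0.5), 1 + rng.uniform(-0.5, 0.5)] for _ in range(30)]
    X += [[1 + rng.uniform(-0.5, 0.5), 5 + rng.uniform(-0.5, 0.5)] for _ in range(30)]
    y = [1] * 30 + [0] * 30
    train, test = stratified_split(y, 2 / 3)
    w, c = fit_lda([X[i] for i in train], [y[i] for i in train])
    preds = [predict(w, c, X[i]) for i in test]
    sens, spec = sensitivity_specificity([y[i] for i in test], preds)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sensitivity and specificity are computed only on the held-out third, so they reflect performance on cores the model did not see during fitting.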

Figure 17, comprising Figures 17A through 17G, depicts the results of experiments demonstrating that combinations of individual discriminators reveal more robust differences in tumor and non-tumor samples. Representative images of 5 tissue sections (17D) of complex histology and 2 tissue microarrays, looking at the localization of Hex6dHex1HexNAc5 (17A, 17E), Hex6HexNAc2 (17B, 17F), and the overlay of the two glycans (17C, 17G). In the TMA, tumor cores are outlined in green, while non-tumor cores are outlined in red. Furthermore, in overlay images, Hex6dHex1HexNAc5 is presented in green, while Hex6HexNAc2 is presented in red. The overlay of the two glycans is able to distinguish tumor from non-tumor cores in both the whole tissue blocks and the tissue microarray.

Figure 18, comprising Figures 18A through 18D, depicts the results of experiments demonstrating that an LDA model of N-glycans discriminates tumor from non-tumor tissue cores. Supervised machine learning algorithms, specifically linear discriminant analysis (LDA), were used to identify important features to distinguish tumor from non-tumor tissue sections. Two glycans (Hex7dHex1HexNAc7, m/z = 2742.9820; Hex6dHex2HexNAc5, m/z = 2320.7524) were observed to be important, and the performance of an LDA model based on these two features was tested as a biomarker panel (18A). Representative images from three of the TMAs are presented with an overlay of Hex7dHex1HexNAc7 (green) and Hex6dHex2HexNAc5 (red) (18B-18D).

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to methods and compositions for cancer diagnosis, research and therapy, including but not limited to profiling glycans. In particular, the present invention relates to glycan panels as diagnostic markers and clinical targets for cancer. Preferably in one embodiment, the cancer is pancreatic cancer.

Accordingly, embodiments of the present invention provide methods and compositions for the profiling of glycans for detecting and screening cancer. In some embodiments, the profiled glycans are N-linked glycans. In other embodiments, the glycans are profiled directly on a tissue. In other embodiments, the method of profiling glycans is performed using MALDI-imaging mass spectrometry. In particular, the present invention relates to the spatial identification of N-glycan species in relation to their histopathology expression and tissue distribution.

In other embodiments, multiple profiled glycans are associated with a specific cancer. Accordingly, the invention provides a panel of multiple glycans for the detection and screening of cancer.

Accordingly, embodiments of the present invention provide glycan libraries or panels useful in the detection and screening of their associated underlying tissue features and morphologies. For example, in some embodiments, the features and morphologies detected and screened include healthy tissue, cancerous tissue, and diseased tissue. For example, in some embodiments, glycan types are increased relative to a control sample from a subject that does not have a disease or condition (e.g., a population average of samples, a control sample, a prior sample from the same patient, etc.). In other embodiments, glycan types are decreased relative to a control sample from a subject that does not have a disease or condition (e.g., a population average of samples, a control sample, a prior sample from the same patient, etc.). Accordingly, the invention in some instances provides a combination of markers for a disease or condition, wherein some of the glycan types are increased and other glycan types are decreased.

In one embodiment, the present invention describes glycans, which are specifically expressed by certain cancer cells, tumors and other malignant tissues. The present invention describes methods to detect cancer specific glycans as well as methods for the production of reagents binding to the glycans. The invention is also directed to the use of the glycans and reagents binding to them for the diagnostics of cancer and malignancies. In one embodiment, the invention is directed to the use of the glycans and reagents binding to them for the treatment of cancer and malignancies. In one embodiment, the present invention comprises efficient methods to differentiate between malignant and benign tumors by analyzing glycan structures.

Definitions

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred materials and methods are described herein. In describing and claiming the present invention, the following terminology will be used.

It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.

The articles "a" and "an" are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.

"About" as used herein when referring to a measurable value such as an amount, a temporal duration, and the like, is meant to encompass non-limiting variations of ±40% or ±20% or ±10%, ±5%, ±1%, or ±0.1% from the specified value, as such variations are appropriate.

The term "abnormal" when used in the context of organisms, tissues, cells or components thereof, refers to those organisms, tissues, cells or components thereof that differ in at least one observable or detectable characteristic (e.g., age, treatment, time of day, etc.) from those organisms, tissues, cells or components thereof that display the "normal" (expected) respective characteristic. Characteristics that are normal or expected for one cell or tissue type, might be abnormal for a different cell or tissue type.

The terms "biomarker" and "marker" are used herein interchangeably. They refer to a substance that is a distinctive indicator of a biological process, biological event and/or pathologic condition.

The phrase "body sample" or "biological sample" is used herein in its broadest sense. A sample may be of any biological tissue or fluid from which biomarkers of the present invention may be assayed. Examples of such samples include but are not limited to blood, saliva, buccal smear, feces, lymph, urine, gynecological fluids, biopsies, amniotic fluid and smears. Samples that are liquid in nature are referred to herein as "bodily fluids." Body samples may be obtained from a patient by a variety of techniques including, for example, by scraping or swabbing an area or by using a needle to aspirate bodily fluids. Methods for collecting various body samples are well known in the art. Frequently, a sample will be a "clinical sample," i.e., a sample derived from a patient. Such samples include, but are not limited to, bodily fluids which may or may not contain cells, e.g., blood (e.g., whole blood, serum or plasma), urine, saliva, tissue or fine needle biopsy samples, and archival samples with known diagnosis, treatment and/or outcome history. Biological or body samples may also include sections of tissues such as frozen sections taken for histological purposes. The sample also encompasses any material derived by processing a biological or body sample. Derived materials include, but are not limited to, cells (or their progeny) isolated from the sample, proteins or nucleic acid molecules extracted from the sample. Processing of a biological or body sample may involve one or more of: filtration, distillation, extraction, concentration, inactivation of interfering components, addition of reagents, and the like.

As used herein, the term "carbohydrate" is intended to include any of a class of aldehyde or ketone derivatives of polyhydric alcohols. Therefore, carbohydrates include starches, celluloses, gums and saccharides. Although, for illustration, the term "saccharide" or "glycan" is used elsewhere herein, this is not intended to be limiting. It is intended that the methods provided herein can be directed to any carbohydrate, and the use of a specific carbohydrate is not meant to be limiting to that carbohydrate only.

As used herein, the term "cell-surface glycoprotein" refers to a glycoprotein, at least a portion of which is present on the exterior surface of a cell. In some embodiments, a cell-surface glycoprotein is a protein that is positioned on the cell-surface such that at least one of the glycan structures is present on the exterior surface of the cell.

In the context of the present invention, the term "control," when used to characterize a subject, refers, by way of non-limiting examples, to a subject that is healthy or to a patient that otherwise has not been diagnosed with a disease. The term "control sample" refers to one, or more than one, sample that has been obtained from a healthy subject or from a non-disease tissue such as normal colon.

The term "control or reference standard" describes a material comprising none, or a normal, low, or high level of one or more of the marker (or biomarker) expression products of one or more of the markers (or biomarkers) of the invention, such that the control or reference standard may serve as a comparator against which a sample can be compared.

"Differentially increased levels" refers to biomarker levels which are at least 1%, 2%, 3%, 4%, 5%, 10% or more, for example, 5%, 10%, 20%, 30%, 40%, or 50%, 60%, 70%, 80%, 90% higher or more, and/or 0.5 fold, 1.1 fold, 1.2 fold, 1.4 fold, 1.6 fold, 1.8 fold higher or more, as compared with a control.

"Differentially decreased levels" refers to biomarker levels which are at least at least 1%, 2%, 3%, 4%, 5%, 10% or more, for example, 5%, 10%, 20%, 30%, 40%, or 50%, 60%, 70%, 80%, 90% lower or less, and/or 0.9 fold, 0.8 fold, 0.6 fold, 0.4 fold, 0.2 fold, 0.1 fold or less, as compared with a control.

A "disease" is a state of health of an animal wherein the animal cannot maintain homeostasis, and wherein if the disease is not ameliorated then the animal's health continues to deteriorate. In contrast, a "disorder" in an animal is a state of health in which the animal is able to maintain homeostasis, but in which the animal's state of health is less favorable than it would be in the absence of the disorder. Left untreated, a disorder does not necessarily cause a further decrease in the animal's state of health.

A disease or disorder is "alleviated" if the severity of a sign or symptom of the disease, or disorder, the frequency with which such a sign or symptom is experienced by a patient, or both, is reduced.

The terms "effective amount" and "pharmaceutically effective amount" refer to a sufficient amount of an agent to provide the desired biological result. That result can be reduction and/or alleviation of a sign, symptom, or cause of a disease or disorder, or any other desired alteration of a biological system. An appropriate effective amount in any individual case may be determined by one of ordinary skill in the art using routine experimentation.

As used herein "endogenous" refers to any material from or produced inside the organism, cell, tissue or system. The term "expression" as used herein is defined as the transcription and/or translation of a particular nucleotide sequence driven by its promoter.

The "level" of one or more biomarkers means the absolute or relative amount or concentration of the biomarker in the sample. The term "level" also refers to the absolute or relative amount of glycosylation of the biomarker in the sample.

As is known in the art and used herein "glycans" are sugars (e.g., oligosaccharides and polysaccharides). Glycans can be monomers or polymers of sugar residues typically joined by glycosidic bonds also referred to herein as linkages. In some embodiments, the terms "glycan", "oligosaccharide" and "polysaccharide" may be used to refer to the carbohydrate portion of a glycoconjugate (e.g., glycoprotein, glycolipid or proteoglycan). A glycan may include natural sugar residues (e.g., glucose, N-acetylglucosamine, N-acetyl neuraminic acid, galactose, mannose, fucose, hexose, arabinose, ribose, xylose, etc.) and/or modified sugars (e.g., 2'-fluororibose, 2'-deoxyribose, phosphomannose, 6'-sulfo N-acetylglucosamine, etc.). The term "glycan" includes homo and heteropolymers of sugar residues. The term "glycan" also encompasses a glycan component of a glycoconjugate (e.g., of a glycoprotein, glycolipid, proteoglycan, etc.). The term also encompasses free glycans, including glycans that have been cleaved or otherwise released from a glycoconjugate.

As used herein, the term "glycan array" refers to a tool used to identify agents that interact with any of a number of different glycans linked to the array substrate. In some embodiments, glycan arrays comprise a number of chemically synthesized glycans, referred to herein as "glycan probes". In some embodiments, glycan arrays comprise at least 2, at least 5, at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 150, at least 350, at least 1000 or at least 1500 glycan probes. In some embodiments, glycan arrays may be customized to present a desired set of glycan probes. In some embodiments, glycan probes may be attached to the array substrate by a linker molecule.

The term "glycan preparation" as used herein refers to a set of glycans obtained according to a particular production method. In some embodiments, glycan preparation refers to a set of glycans obtained from a glycoprotein preparation.

The term "glycoconjugate", as used herein, encompasses all molecules in which at least one sugar moiety is covalently linked to at least one other moiety. The term specifically encompasses all biomolecules with covalently attached sugar moieties, including for example N-linked glycoproteins, O-linked glycoproteins, glycolipids, proteoglycans, etc.

The term "glycoform", is used herein to refer to a particular form of a glycoconjugate. That is, when the same backbone moiety (e.g., polypeptide, lipid, etc) that is part of a glycoconjugate has the potential to be linked to different glycans or sets of glycans, then each different version of the glycoconjugate (i.e., where the backbone is linked to a particular set of glycans) is referred to as a "glycoform."

The term "glycosidase" as used herein refers to an agent that cleaves a covalent bond between sequential sugars in a glycan or between the sugar and the backbone moiety (e.g. between sugar and peptide backbone of glycoprotein). In some embodiments, a glycosidase is an enzyme. In certain embodiments, a glycosidase is a protein (e.g., a protein enzyme) comprising one or more polypeptide chains. In certain embodiments, a glycosidase is a chemical cleavage agent.

As used herein, the term "glycosylation pattern" refers to the set of glycan structures present on a particular sample. For example, a particular glycoconjugate (e.g., glycoprotein) or set of glycoconjugates (e.g., set of glycoproteins) will have a glycosylation pattern. In some embodiments, reference is made to the glycosylation pattern of cell-surface glycans. A glycosylation pattern can be characterized by, for example, the identities of glycans, amounts (absolute or relative) of individual glycans or glycans of particular types, degree of occupancy of glycosylation sites, etc., or combinations of such parameters.

As used herein, the term "glycoprofile" refers to one or more properties of the glycans of a glycoprotein; for example, the glycoprofile can include, but is not limited to, one or more of the following: number or placement of glycans; number or placement of N-linked glycans; number or placement of O-linked glycans; sequence of one or more attached glycans; tertiary structure of one or more glycans, e.g., branching pattern, e.g., biantennary, triantennary, tetraantennary, and so on; number or placement of Lewis antigens; number or placement of fucosyl or sialyl groups; molecular weight or mass of the intact glycoprotein; molecular weight or mass of the glycoprotein following the application of one or more experimental constraints, e.g., digestion (enzymatic or chemical); molecular weight or mass of some or all of the glycans after being released from the glycoprotein, e.g., enzymatically or chemically; molecular weight or mass of some or all of the glycans after being released from the glycoprotein and following the application of one or more experimental constraints; mass signature; or charge.

A "glycoprotein preparation", as that term is used herein, refers to a set of individual glycoprotein molecules, each of which comprises a polypeptide having a particular amino acid sequence (which amino acid sequence includes at least one glycosylation site) and at least one glycan covalently attached to the at least one glycosylation site. Individual molecules of a particular glycoprotein within a glycoprotein preparation typically have identical amino acid sequences but may differ in the occupancy of the at least one glycosylation sites and/or in the identity of the glycans linked to the at least one glycosylation sites. That is, a glycoprotein preparation may contain only a single glycoform of a particular glycoprotein, but more typically contains a plurality of glycoforms. Different preparations of the same glycoprotein may differ in the identity of glycoforms present (e.g., a glycoform that is present in one preparation may be absent from another) and/or in the relative amounts of different glycoforms.

The term "lectin" as used herein encompasses any amino acid and peptide bond-based compound having specific binding affinity to carbohydrates. Typically it relates to non-antibody polypeptides found in nature featuring specific carbohydrate binding. The term "lectin" includes functional fragments and derivatives thereof, the latter terms being defined in analogy to the same terms used in the context of antibodies.

"Measuring" or "measurement," or alternatively "detecting" or "detection," means assessing the presence, absence, quantity or amount (which can be an effective amount) of either a given substance within a clinical or subject-derived sample, including the derivation of qualitative or quantitative concentration levels of such substances, or otherwise evaluating the values or categorization of a subject's clinical parameters.

The term "N-glycan", as used herein, refers to a polymer of sugars that has been released from a glycoconjugate but was formerly linked to the glycoconjugate via a nitrogen linkage. N-linked glycans are glycans that are linked to a glycoconjugate via a nitrogen linkage at asparagine residues within the conserved protein structural motif N-X-S/T, where X is any amino acid except proline and S/T is serine or threonine. A diverse assortment of N-linked glycans exists, but is typically based on the common core pentasaccharide (Man)3(GlcNAc)(GlcNAc).

"Naturally-occurring" as applied to an object refers to the fact that the object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by man is a naturally occurring sequence.
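
By way of illustration only, the N-X-S/T motif given in the definition of "N-glycan" can be located in a one-letter protein sequence with a regular expression. The sketch below is an assumption-laden example: the function name is hypothetical, and a motif match indicates only a potential N-glycosylation site, not actual occupancy.

```python
import re

# N-X-S/T motif, where X is any residue except proline (P).
# The lookahead does not consume X or S/T, so overlapping motifs
# (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(protein_seq):
    """Return 1-based positions of asparagines in N-X-S/T motifs."""
    return [m.start() + 1 for m in SEQUON.finditer(protein_seq.upper())]
```

For example, in the made-up sequence "MNGTANPSQNLT" the asparagines at positions 2 and 10 sit in such a motif, while the asparagine at position 6 does not because it is followed by proline.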

By "nucleic acid" is meant any nucleic acid, whether composed of deoxyribonucleosides or ribonucleosides, and whether composed of phosphodiester linkages or modified linkages such as phosphotriester, phosphoramidate, siloxane, carbonate, carboxymethylester, acetamidate, carbamate, thioether, bridged phosphoramidate, bridged methylene phosphonate, phosphorothioate, methylphosphonate, phosphorodithioate, bridged phosphorothioate or sulfone linkages, and combinations of such linkages. The term nucleic acid also specifically includes nucleic acids composed of bases other than the five biologically occurring bases (adenine, guanine, thymine, cytosine and uracil). The term "nucleic acid" typically refers to large polynucleotides.

Conventional notation is used herein to describe polynucleotide sequences: the left-hand end of a single-stranded polynucleotide sequence is the 5'-end; the left-hand direction of a double-stranded polynucleotide sequence is referred to as the 5'-direction.

The direction of 5' to 3' addition of nucleotides to nascent RNA transcripts is referred to as the transcription direction. The DNA strand having the same sequence as an mRNA is referred to as the "coding strand"; sequences on the DNA strand that are located 5' to a reference point on the DNA are referred to as "upstream sequences"; sequences on the DNA strand which are 3' to a reference point on the DNA are referred to as "downstream sequences."

The term "O-glycan", as used herein, refers to a polymer of sugars that has been released from a glycoconjugate but was formerly linked to the glycoconjugate via an oxygen linkage. O-linked glycans are glycans that are linked to a glycoconjugate via an oxygen linkage. O-linked glycans are typically attached to glycoproteins via N-acetyl-D-galactosamine (GalNAc) or via N-acetyl-D-glucosamine (GlcNAc) to the hydroxyl group of L-serine (Ser) or L-threonine (Thr). Some O-linked glycans also have modifications such as acetylation and sulfation. In some instances O-linked glycans are attached to glycoproteins via fucose or mannose to the hydroxyl group of L-serine (Ser) or L-threonine (Thr).

The term "pre-cancerous" or "pre-neoplastic" and equivalents thereof shall be taken to mean any cellular proliferative disorder that is undergoing malignant transformation. Examples of such conditions include, in the context of colorectal cellular proliferative disorders, cellular proliferative disorders with a high degree of dysplasia and the following classes of adenomas: Level 1: penetration of malignant glands through the muscularis mucosa into the submucosa, within the polyp head; Level 2: the same submucosal invasion, but present at the junction of the head to the stalk; Level 3: invasion of the stalk; and Level 4: invasion of the stalk's base at the connection to the colonic wall. In some instances, pre-neoplastic is used to describe a normal tissue that will form tumors.

As used herein, "predisposition" refers to the property of being susceptible to a cellular proliferative disorder. A subject having a predisposition to a cellular proliferative disorder has no cellular proliferative disorder, but is a subject having an increased likelihood of having a cellular proliferative disorder.

A "polynucleotide" means a single strand or parallel and anti-parallel strands of a nucleic acid. Thus, a polynucleotide may be either a single-stranded or a double-stranded nucleic acid. In the context of the present invention, the following abbreviations for the commonly occurring nucleic acid bases are used. "A" refers to adenosine, "C" refers to cytidine, "G" refers to guanosine, "T" refers to thymidine, and "U" refers to uridine.

The term "oligonucleotide" typically refers to short polynucleotides, generally no greater than about 60 nucleotides. It will be understood that when a nucleotide sequence is represented by a DNA sequence (i.e., A, T, G, C), this also includes an RNA sequence (i.e., A, U, G, C) in which "U" replaces "T."

As used herein, the term "providing a prognosis" refers to providing a prediction of the probable course and outcome of colorectal cancer, including prediction of severity, duration, chances of recovery, etc. The methods can also be used to devise a suitable therapeutic plan, e.g., by indicating whether or not the condition is still at an early stage or if the condition has advanced to a stage where aggressive therapy would be ineffective.

A "reference level" of a biomarker means a level of the biomarker, for example level of a type of glycan that is indicative of a particular disease state, phenotype, or lack thereof, as well as combinations of disease states, phenotypes, or lack thereof. A "positive" reference level of a biomarker means a level that is indicative of a particular disease state or phenotype. A "negative" reference level of a biomarker means a level that is indicative of a lack of a particular disease state or phenotype.

As used herein, the term "saccharide" refers to a polymer comprising one or more monosaccharide groups. Saccharides, therefore, include mono-, di-, tri- and polysaccharides (or glycans). Glycans can be branched or unbranched. Glycans can be found covalently linked to non-saccharide moieties, such as lipids or proteins (as a glycoconjugate). These covalent conjugates include glycoproteins, glycopeptides, peptidoglycans, proteoglycans, glycolipids and lipopolysaccharides. The use of any one of these terms also is not intended to be limiting as the description is provided for illustrative purposes. In addition to the glycans being found as part of a glycoconjugate, the glycans can also be in free form (i.e., separate from and not associated with another moiety).

By the term "specifically binds," as used herein, is meant a molecule, such as an antibody, which recognizes and binds to another molecule or feature, but does not substantially recognize or bind other molecules or features in a sample.

"Standard control value" as used herein refers to a predetermined glycan level. The standard control value is suitable for the use of a method of the present invention, in order for comparing the amount of glycan of interest that is present in a sample. An established sample serving as a standard control provides an average amount of glycan of interest that is typical for an average, healthy person of reasonably matched background, e.g., gender, age, ethnicity, and medical history. A standard control value may vary depending on the biomarker of interest and the nature of the sample.

As used herein, the term "subject" refers to a human or another mammal (e.g., primate, dog, cat, goat, horse, pig, mouse, rat, rabbit, and the like). In many embodiments of the present invention, the subject is a human being. In such embodiments, the subject is often referred to as an "individual" or a "patient." The terms "individual" and "patient" do not denote a particular age.

Ranges: throughout this disclosure, various aspects of the invention can be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 2.7, 3, 4, 5, 5.3, and 6. This applies regardless of the breadth of the range.

Description

The present invention is based partly on the profiling of multiple glycans on a biological sample. The invention is also based on the generation of glycan panels from the data obtained from profiling biological samples in various states. For example, the results presented herein demonstrate the application of MALDI-IMS glycan imaging to various formalin-fixed tissues. As a non-limiting example, formalin-fixed mouse kidney tissues were used to optimize antigen retrieval, PNGaseF digestion and glycan detection conditions for MALDI-IMS. This was followed by N-glycan analysis of clinical FFPE tissue blocks from prostate and pancreatic cancers, as well as a commercial tissue microarray of hepatocellular carcinoma (HCC). Glycan identity was confirmed by on-tissue collision-induced dissociation (CID) and off-tissue permethylation analysis. An optimized MALDI-IMS workflow is presented that allows routine simultaneous analysis of thirty or more glycans per FFPE tissue, including TMA formats. The approach is amenable to any FFPE tissue, and represents an additional molecular correlate assay for use with the TMA format.

Accordingly, the invention provides compositions and methods for identifying novel glycan biomarker panels for cancer detection and prognosis.

In one embodiment, the glycans are N-glycans and the biological samples are tissue samples. In another embodiment, the tissue samples are formalin-fixed, paraffin-embedded tissue samples.

In another embodiment, the biological samples are profiled using mass spectroscopy. In another embodiment, the biological samples are profiled using MALDI. In another embodiment, the spatial distribution of glycans on the biological samples is visualized using MALDI-IMS.

In another embodiment, the profiled glycans are characterized using mass spectroscopy. In another embodiment, the profiled glycans are characterized using Fourier transform ion cyclotron resonance (FTICR). In another embodiment, profiled glycans are characterized and associated with spatial distribution and biological sample state, such as a cancer presence and progression.

Accordingly, the invention provides a disease-specific glycan panel. In one embodiment, the glycan panel is specific to pancreatic cancer and can be effectively used to diagnose pancreatic cancer at an early stage, as well as to monitor the progression of pancreatic cancer.

Profiling Glycans

The present invention relates to the analysis of glycans for assessing the presence of cancer. In some instances, the analysis is qualitative. The invention provides profiles of glycans, including but not limited to type of glycan and expression of the type of glycan, which can be used for the profiling of glycans. The present invention is directed, in a preferred embodiment, to quantitative mass spectrometric profiling of human cancers according to the invention and analysis of alterations in cancer in comparison with corresponding normal tissues. The analysis can be performed based on signals corresponding to glycan structures; these signals are translated to likely monosaccharide compositions and further analyzed to reveal structures and correlations between the signals. The invention is especially directed to analysis of N-glycans and/or O-glycans derived from cancer proteins. The glycans can be analyzed as neutral and/or acidic signals and glycan mixtures; multiple analysis methods are preferred to obtain the maximal amount of data.
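
As a non-limiting sketch of how a mass spectrometric signal might be translated to a likely monosaccharide composition, the following computes a theoretical m/z for a candidate composition and matches observed peaks within a mass tolerance. The assumptions are illustrative and not taken from this disclosure: released, underivatized glycans detected as singly charged sodium adducts, a hypothetical 10 ppm tolerance, and hypothetical function and candidate names.

```python
# Monoisotopic residue masses (Da) for common monosaccharide classes.
RESIDUE_MASS = {
    "Hex": 162.05282,     # hexose, e.g. mannose or galactose
    "HexNAc": 203.07937,  # N-acetylhexosamine, e.g. GlcNAc
    "dHex": 146.05791,    # deoxyhexose, e.g. fucose
    "NeuAc": 291.09542,   # N-acetylneuraminic acid
}
WATER = 18.01056          # added once for the free (released) glycan
SODIUM_ADDUCT = 22.98922  # Na+ (assumed ion form: [M+Na]+)

def sodiated_mz(composition):
    """Theoretical m/z of [M+Na]+ for a composition such as {"Hex": 5, "HexNAc": 2}."""
    residues = sum(RESIDUE_MASS[name] * n for name, n in composition.items())
    return residues + WATER + SODIUM_ADDUCT

def match_peaks(observed_mz, candidates, tol_ppm=10.0):
    """Return (observed m/z, candidate name, ppm error) for peaks within tolerance."""
    matches = []
    for mz in observed_mz:
        for name, composition in candidates.items():
            theoretical = sodiated_mz(composition)
            ppm = (mz - theoretical) / theoretical * 1e6
            if abs(ppm) <= tol_ppm:
                matches.append((mz, name, round(ppm, 1)))
    return matches
```

Under these assumptions a composition of five hexoses and two N-acetylhexosamines has a theoretical m/z of about 1257.42. A composition match alone does not establish structure, which is why further analysis of the signals is described.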

In one embodiment, the present invention provides a method for high-throughput profiling of multiple glycans. Accordingly, the methods may be applied in multiple contexts to simultaneously profile glycans on samples in a global manner.

Glycans play multi-faceted roles in many biological processes and aberrant glycosylation is associated with many diseases. Glycans are post-translational modifications of proteins that are involved in cell growth, cytokinesis, differentiation, transcription regulation, signal transduction, ligand-receptor binding, and interactions of cells with other cells, extracellular matrix, and bacterial and viral infection, among other functions. Glycan misregulations and structural changes occur in most of the diseases that affect humans.

In some embodiments, the methods of the present invention can include determining the glycoprofile of a glycoprotein. The properties can be determined by analyzing the glycans of the intact glycoprotein, by releasing the glycans from the glycoprotein before analysis, or by digesting the intact glycoprotein and analyzing the glycans attached to one or more of the resulting glycopeptide fragments. Properties of the glycans which can be determined include: the mass of part or all of the saccharide structure, the charges of the chemical units of the saccharide, identities of the chemical units of the saccharide, conformations of the chemical units of the saccharide, total charge of the saccharide, total number of sulfates of the saccharide, total number of acetates, total number of phosphates, presence and number of carboxylates, presence and number of aldehydes or ketones, dye-binding of the saccharide, compositional ratios of substituents of the saccharide, compositional ratios of anionic to neutral sugars, presence of uronic acid, enzymatic sensitivity, linkages between chemical units of the saccharide, charge, branch points, number of branches, number of chemical units in each branch, core structure of a branched or unbranched saccharide, the hydrophobicity and/or charge/charge density of each branch, absence or presence of GlcNAc and/or fucose in the core of a branched saccharide, number of mannose in an extended core of a branched saccharide, presence or absence of sialic acid on a branched chain of a saccharide, and the presence or absence of galactose on a branched chain of a saccharide.

A property of a glycan can be identified by any means known in the art. For example, molecular weight can be determined by several methods including mass spectrometry. The use of mass spectrometry for determining the molecular weight of glycans is well known in the art. Mass spectrometry has been used as a powerful tool to characterize polymers such as glycans because of its accuracy in reporting the masses of fragments generated (e.g., by enzymatic cleavage), and also because only minute sample concentrations are required.

Any analytic method for analyzing the glycans so as to characterize them can be performed on any sample of glycans; such analytic methods include those described herein. As used herein, to "characterize" a glycan or other molecule means to obtain data that can be used to determine its identity, structure, composition or quantity. When the term is used in reference to a glycoconjugate, it can also include determining the glycosylation sites, the glycosylation site occupancy, the identity, structure, composition or quantity of the glycan and/or non-saccharide moiety of the glycoconjugate as well as the identity and quantity of the specific glycoform. These methods include, for example, mass spectrometry, nuclear magnetic resonance (NMR) (e.g., 2D-NMR), electrophoresis and chromatographic methods. Examples of mass spectrometric methods include fast atom bombardment mass spectrometry (FAB-MS), liquid chromatography mass spectrometry (LC-MS), liquid chromatography tandem mass spectrometry (LC-MS/MS), matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS), matrix-assisted laser desorption/ionization tandem mass spectrometry (MALDI-MS/MS), etc. NMR methods can include, for example, correlation spectroscopy (COSY), total correlation spectroscopy (TOCSY), and nuclear Overhauser effect spectroscopy (NOESY). Electrophoresis can include, for example, capillary electrophoresis with laser induced fluorescence (CE-LIF), capillary gel electrophoresis (CGE), and capillary zone electrophoresis (CZE).

Mass spectrometry imaging is a powerful tool that has been used to correlate various peptides, proteins, lipids and metabolites with their underlying histopathology in tissue sections. Taking advantage of the rapid advances in mass spectrometry, mass spectrometry imaging can push the limits of glycomics studies. Mass spectrometry imaging offers some advantages over the conventional methods that support its use as a complementary technique to lectin histochemistry. One significant advantage is that matrix-assisted laser desorption/ionization (MALDI) imaging combined with tandem mass spectrometry reveals detailed structural information about the glycans in a sample. A wide range of molecular weights can be detected by mass spectrometry imaging. Also, the high mass resolution allows distinguishing two peaks with close molecular weights, which subsequently improves the detection specificity. In addition, tens or even hundreds of glycans can be detected at femtomole levels in one single image, allowing detection of low concentrations of molecules. Therefore, MALDI imaging facilitates high-throughput analysis of tissue glycans. MALDI imaging can also be used for performing quantitative assays.
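
The spatial aspect of mass spectrometry imaging can be illustrated with a minimal sketch that builds an intensity map for a single m/z value from per-pixel peak lists. The data layout (tuples of x, y and a list of (m/z, intensity) pairs), the function name and the 10 ppm window are assumptions made for illustration; they do not describe any particular instrument format or imaging software.

```python
def ion_image(pixels, target_mz, tol_ppm=10.0):
    """Build a 2D intensity grid (rows = y, columns = x) for one m/z value.

    pixels: iterable of (x, y, peaks), where peaks is a list of
    (mz, intensity) pairs acquired at that tissue position.
    Intensities of all peaks within the ppm window are summed per pixel.
    """
    pixels = list(pixels)
    if not pixels:
        return []
    tol = target_mz * tol_ppm / 1e6          # window half-width in m/z units
    width = max(x for x, _, _ in pixels) + 1
    height = max(y for _, y, _ in pixels) + 1
    image = [[0.0] * width for _ in range(height)]
    for x, y, peaks in pixels:
        image[y][x] = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
    return image
```

The resulting grid can be passed to any plotting tool to view the intensity distribution of the selected ion across the tissue section.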

Another significant advantage of MALDI imaging is that it has the capability of detecting an unknown compound without any a priori knowledge of the analytes. Therefore, this technique is particularly suitable for biomarker discovery research.

MALDI is a soft ionization mass spectrometric technique that is suitable for use in the analysis of biomolecules, such as proteins, peptides, sugars, and the like, which tend to be fragile and fragment when ionized by conventional ionization methods. Generally, MALDI comprises a two-step process. In the first step, desorption is triggered by an ultraviolet (UV) laser beam. The matrix material absorbs the UV laser radiation, which leads to the ablation of an upper layer of the matrix material, thereby producing a hot plume. The hot plume contains many species: neutral and ionized matrix molecules, protonated and deprotonated matrix molecules, matrix clusters, and nanodroplets. In the second step, the analyte molecules are ionized, e.g., protonated or deprotonated, in the hot plume.

The matrix material comprises a crystallized molecule capable of absorbing the UV laser radiation. Common matrix materials include, but are not limited to, α-cyano-4-hydroxycinnamic acid, 2,5-dihydroxybenzoic acid, 2,5-dihydroxybenzoic acid/2-hydroxy-5-methoxybenzoic acid, 2,4,6-trihydroxyacetophenone, 6-aza-2-thiothymine, 3-hydroxypicolinic acid, 3-aminoquinoline, anthranilic acid, 5-chloro-2-mercaptobenzothiazole, 2,5-dihydroxyacetophenone, ferulic acid, and 2-(4-hydroxyphenylazo)benzoic acid. A solution of the matrix material is made in highly purified water and an organic solvent, such as acetonitrile or ethanol. In some embodiments, a small amount of trifluoroacetic acid (TFA) also can be added to the solution.

The matrix solution can then be mixed with the analyte, e.g., a protein sample. This solution is then deposited onto a MALDI plate, wherein the solvents vaporize leaving only the recrystallized matrix comprising the analyte molecules embedded in the MALDI crystals.

The property of the glycan that is detected by this method can also be any structural property of a glycan or unit. For instance, the property of the glycan can be the molecular mass or length of the glycan. In other embodiments the property can be the compositional ratios of substituents or units, type of basic building block of a polysaccharide, hydrophobicity, enzymatic sensitivity, hydrophilicity, secondary structure and conformation (i.e., position of helices), spatial distribution of substituents, linkages between chemical units, number of branch points, core structure of a branched polysaccharide, ratio of one set of modifications to another set of modifications (i.e., relative amounts of sulfation, acetylation or phosphorylation at the position for each), and binding sites for proteins.

Methods of identifying other types of properties are readily identifiable by those of skill in the art and generally depend on the type of property and the type of glycan; such methods include, but are not limited to, capillary electrophoresis (CE), NMR, mass spectrometry (both MALDI and ESI), and high performance liquid chromatography (HPLC) with fluorescence detection. For example, hydrophobicity can be determined using reverse-phase high-pressure liquid chromatography (RP-HPLC). Enzymatic sensitivity can be identified by exposing the glycan to an enzyme and determining the number of fragments present after such exposure. The chirality can be determined using circular dichroism. Protein binding sites can be determined by mass spectrometry, isothermal calorimetry and NMR. Linkages can be determined using NMR and/or capillary electrophoresis. Enzymatic modification (not degradation) can be determined in a similar manner as enzymatic degradation, i.e., by exposing a substrate to the enzyme and using MALDI-MS to determine if the substrate is modified. For example, a sulfotransferase can transfer a sulfate group to an oligosaccharide chain, with a concomitant mass increase of 80 Da. Conformation can be determined by modeling and nuclear magnetic resonance (NMR). The relative amounts of sulfation can be determined by compositional analysis or approximately determined by Raman spectroscopy.

In accordance with an embodiment, the present invention provides a mass spectrometry imaging technique that has been developed for profiling of glycans from a biological sample. In another embodiment, the glycans are N-linked glycans. In yet another embodiment, the biological sample is a tissue section from a formalin fixed, paraffin embedded (FFPE) tissue block. FFPE tissues are sectioned on indium tin oxide coated glass slides for matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). Deparaffinization and rehydration of the tissue sections are followed by antigen retrieval and denaturing of the proteins. A releasing agent can be sprayed over the tissue sections to release glycans from the proteins, while preserving their spatial distribution. Glycans can be released enzymatically, using agents that include, but are not limited to, trypsin, Endoglycosidase H (Endo H), Endoglycosidase F (EndoF), N-Glycanase F (PNGaseF), PNGaseA, O-glycanase, and/or one or more proteases (e.g., trypsin or LysC), or chemically (e.g., using anhydrous hydrazine (N) or reductive or non-reductive beta-elimination (O)).

When PNGaseF is used for glycan release the proteins may, for example, first be unfolded prior to the use of the enzyme. The unfolding of the protein can be accomplished with any of the denaturing agents provided above. In accordance with an embodiment of the above methods of the present invention the denaturing of the sample in a) comprises: i) heating the sample for a sufficient period of time; ii) incubating the sample from i) with a proteolytic enzyme for a period of time; and iii) adding a sufficient amount of PNGaseF to the sample of ii) to release the glycans from the peptide fragments. Common chemical releasing methods for cleaving glycans from glycoconjugates include hydrazinolysis or treatment with alkaline borohydride. After glycan release, samples can then be spray-coated with matrix and analyzed by MALDI-MS.

In accordance with another embodiment, the denaturation of the glycoprotein(s) occurs by heating the biological sample and/or incubating the biological sample with a proteolytic enzyme for a sufficient period of time.

In other embodiments, the glycans can be modified to improve ionization of the glycans, particularly when MALDI-MS is used for analysis. Such modifications include permethylation. Another method to increase glycan ionization is to conjugate the glycan to a hydrophobic chemical (such as AA, AB labeling) for MS or liquid chromatographic detection. In other embodiments, spot methods can be employed to improve signal intensity.

It will be understood by those of skill in the art that for spatial distribution of the glycans in the biological sample to be maintained, a solid support, such as a glass plate or slide, or similar support, can be used with sectioning. In some embodiments, a biological sample, such as a tissue, is raster scanned by a laser in the x and y directions and mass spectra are acquired for each pixel on the tissue.
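The raster acquisition described above can be illustrated with a minimal sketch that turns per-pixel spectra into an ion image for one glycan. The data layout (a dictionary keyed by pixel coordinates), the function name `ion_image`, the 10 ppm window, and all intensities are assumptions made for illustration; they are not taken from the disclosure or from any instrument software.

```python
from typing import Dict, List, Tuple

Spectrum = List[Tuple[float, float]]  # (m/z, intensity) pairs for one pixel


def ion_image(pixels: Dict[Tuple[int, int], Spectrum],
              target_mz: float,
              tol_ppm: float = 10.0) -> List[List[float]]:
    """Return a grid (rows = y, columns = x) of summed intensity near target_mz."""
    half_width = target_mz * tol_ppm / 1e6  # window half-width in m/z units
    width = max(x for x, _ in pixels) + 1
    height = max(y for _, y in pixels) + 1
    image = [[0.0] * width for _ in range(height)]
    for (x, y), spectrum in pixels.items():
        # Sum every peak of this pixel that falls inside the m/z window.
        image[y][x] = sum(i for mz, i in spectrum
                          if abs(mz - target_mz) <= half_width)
    return image


# Hypothetical 2 x 2 raster; intensities are invented for illustration.
pixels = {
    (0, 0): [(1419.4755, 120.0), (1688.6130, 5.0)],
    (1, 0): [(1419.4760, 80.0)],
    (0, 1): [(1688.6130, 40.0)],
    (1, 1): [(1419.5500, 999.0)],  # outside the 10 ppm window, so ignored
}
image = ion_image(pixels, target_mz=1419.4755)
```

Each cell of `image` holds the signal for one laser position, so plotting the grid as a heat map gives the intensity distribution of that ion across the tissue.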

Practical m/z ranges comprising most of the important signals, as observed by the present invention, may be more limited than these. Preferred practical ranges include a lower limit of about m/z 400, more preferably about m/z 500, even more preferably about m/z 600, and most preferably about m/z 700, and an upper limit of about m/z 4000, more preferably about m/z 3500 (especially for negative ion mode), even more preferably about m/z 3000 (especially for negative ion mode), and in particular at least about m/z 2500 (negative or positive ion mode), and for positive ion mode analysis up to about m/z 2000. The preferred range depends on the sizes of the sample glycans; samples with high branching or polysaccharide content or high sialylation levels are preferably analyzed in ranges with higher upper limits, as described for negative ion mode. The limits are preferably combined to form ranges of maximum and minimum sizes or lowest lower limit with lowest higher limit, and the other limits analogously in order of increasing size.

In accordance with another embodiment, the present invention provides a method for direct profiling of N-linked glycans in a biological sample wherein spatial distribution of at least one glycan is maintained, the method comprising: (a) obtaining a biological sample comprising at least one glycoprotein; (b) denaturing the at least one glycoprotein in the biological sample; (c) releasing at least one glycan from the at least one glycoprotein; (d) coating the biological sample with matrix; and (e) analyzing the at least one glycan using mass spectrometry.

In another embodiment, the profiling of N-glycans on FFPE tissue comprises several steps, although not all steps have to be carried out in this order. In step (a), a tissue section is heated in antigen retrieval buffer and then washed in xylene to remove the paraffin. In step (b), digestion of the proteins on a tissue section is achieved by incubation with an enzyme. In step (c), a matrix is deposited on the tissue section. In step (d), released glycan ions are detected by mass spectrometry. In step (e), the mass spectrometry data are deconvoluted and individual peaks are associated with glycans based on mass accuracy. In step (f), identified glycans are viewed in imaging software to assess intensity distribution of ions.

In another embodiment, the profiling of N-glycans on FFPE tissue comprises several steps, although not all steps have to be carried out in this order. In step (a), a thin FFPE tissue section is heated in antigen retrieval buffer and then washed in xylene to remove the paraffin. In step (b), enzymatic digestion of the proteins on the thin FFPE tissue section is achieved by incubation with PNGaseF. In step (c), an α-cyano-4-hydroxycinnamic acid matrix is deposited on the thin FFPE tissue section. In step (d), released glycan ions are detected by FTICR mass spectrometry. In step (e), the FTICR data are deconvoluted and individual peaks are associated with glycans based on mass accuracy. In step (f), identified glycans are viewed in imaging software to assess intensity distribution of ions.
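Associating a peak with a glycan based on mass accuracy, as in step (e), can be sketched as a parts-per-million comparison against a list of expected m/z values. The reference values below are sodium-adduct m/z values recited elsewhere in this disclosure; the 5 ppm tolerance and the function names are assumptions chosen for illustration.

```python
from typing import Optional, Tuple

# Sodium-adduct m/z values recited in this disclosure for selected compositions.
REFERENCE = [
    ("Hex6HexNAc2", 1419.4755),
    ("Hex3dHex1HexNAc5", 1688.6130),
    ("Hex5dHex1HexNAc5", 2012.7187),
    ("Hex5dHex1HexNAc4NeuAc1", 2100.7347),
]


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def assign_peak(observed_mz: float,
                max_ppm: float = 5.0) -> Optional[Tuple[str, float]]:
    """Return (composition, ppm error) of the closest match, or None."""
    best = None
    for name, theoretical in REFERENCE:
        err = ppm_error(observed_mz, theoretical)
        # Keep the candidate with the smallest absolute error within tolerance.
        if abs(err) <= max_ppm and (best is None or abs(err) < abs(best[1])):
            best = (name, err)
    return best


match = assign_peak(1419.4790)     # hypothetical observed peak
no_match = assign_peak(1500.0000)  # no reference within tolerance
```

A tight tolerance is only meaningful on an instrument with comparable mass accuracy; the appropriate value depends on the analyzer and calibration and is not specified here.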

Methods of the present disclosure can be applied to glycan mixtures obtained from a wide variety of sources including, but not limited to, therapeutic formulations and biological samples. A biological sample may undergo one or more analysis and/or purification steps prior to or after being analyzed according to the present disclosure. For example, in some embodiments, a biological sample is treated with one or more proteases and/or glycosidases (e.g., so that glycans are released); in some embodiments, glycans in a biological sample are labeled with one or more detectable markers or other agents that may facilitate analysis by, for example, mass spectrometry or NMR. Any of a variety of separation and/or isolation steps may be applied to a biological sample in accordance with the present disclosure.

In various embodiments the methods can be used to detect biomarkers indicative of, e.g., a disease state, prior to the appearance of symptoms and/or progression of the disease state to an untreatable or less treatable condition, by detecting one or more specific glycans whose presence or level (whether absolute or relative) may be correlated with a particular disease state (including susceptibility to a particular disease) and/or the change in the concentration of such glycans over time.

Glycan Panels

The invention provides libraries of glycans (also referred to as glycan panels) that are useful for detecting and preventing cancer. These glycan libraries can include numerous different types of carbohydrates and oligosaccharides. In general, the major structural attributes and composition of the separate glycans within the libraries have been identified. In some embodiments, the libraries consist of separate, substantially pure pools of glycans, carbohydrates and/or oligosaccharides.

The glycans of the invention include straight chain and branched oligosaccharides as well as naturally occurring and synthetic glycans. For example, the glycan can be a glycoaminoacid, a glycopeptide, a glycolipid, a glycosaminoglycan (GAG), a glycoprotein, a whole cell, a cellular component, a glycoconjugate, a glycomimetic, a glycophospholipid anchor, glycosylphosphatidylinositol (GPI)-linked glycoconjugates, bacterial lipopolysaccharides and endotoxins. The glycans can also include N-glycans, O-glycans, glycolipids and glycoproteins.

In some instances, the glycans of the invention include two or more sugar units. Any type of sugar unit can be present in the glycans of the invention, including, for example, allose, altrose, arabinose, glucose, galactose, gulose, fucose, fructose, idose, lyxose, mannose, ribose, talose, xylose, or other sugar units. Such sugar units can have a variety of modifications and substituents. For example, sugar units can have a variety of substituents in place of the hydroxy, carboxylate, and methylenehydroxy substituents. Thus, lower alkyl moieties can replace any of the hydrogen atoms from the hydroxy, carboxylic acid and methylenehydroxy substituents of the sugar units in the glycans of the invention. For example, amino acetyl can replace any of the hydroxy or hydrogen atoms from the hydroxy, carboxylic acid and methylenehydroxy substituents of the sugar units in the glycans of the invention.

Libraries and panels of glycans can be embodiments of the implementations of the methods disclosed herein. The libraries and panels of glycans can find uses directed to detecting, treating and/or preventing a variety of early stage diseases and/or cancers. In some embodiments, the presence of such glycans is indicative of the presence of cancer and can provide information on the prognosis of such a disease, for example, whether the disease is in remission or is becoming more aggressive. Patients with familial history of cancer, and hence a heightened risk of developing the disease, can be tested regularly to monitor their propensity for disease.

In some embodiments, the methods of the present invention allow for direct imaging of glycans on tissues to determine disease-specific glycosylation changes. Therefore, in some embodiments, these methods provide a method of diagnosing a disease or condition in a subject comprising: (a) comparing the N-linked glycan profile from a subject to an N-linked glycan profile from a normal sample or diseased sample; (b) determining whether the subject has the disease or condition; and wherein the glycan profile is determined using the presently disclosed methods.

In some embodiments, the present invention provides a panel for the analysis of a plurality of glycan structures. The panel allows for the simultaneous analysis of multiple glycan structures correlating with carcinogenesis and/or metastasis. For example, a panel may include two or more glycan structures identified as correlating with cancerous tissue, metastatic cancer, localized cancer that is likely to metastasize, pre-cancerous tissue that is likely to become cancerous, chronic pancreatitis, and pre-cancerous tissue that is not likely to become cancerous.

Depending on the subject, panels may be analyzed alone or in combination in order to provide the best possible diagnosis and prognosis. Any of the glycan structures described herein may be used in combination with each other or with other known or later identified cancer glycan structures. In other embodiments, the present invention provides an expression profile map comprising expression profiles of cancers of various stages or prognoses (e.g., likelihood of future metastasis). Such maps can be used for comparison with patient samples. Any suitable method may be utilized, including but not limited to, by computer comparison of digitized data. The comparison data is used to provide diagnoses and/or prognoses to patients. The diagnosis can be carried out in a person with or thought to have a disease or condition. The diagnosis can also be carried out in a person thought to be at risk for a disease or condition. "A person at risk" is one that has either a genetic predisposition to have the disease or condition or is one that has been exposed to a factor that could increase his/her risk of developing the disease or condition.
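The computer comparison of digitized data mentioned above could take many forms. The following minimal sketch assumes each profile is a mapping from glycan composition to relative intensity and uses cosine similarity to find the closest reference profile. The similarity measure, the function names, and every number are illustrative assumptions; the disclosure does not prescribe a particular comparison method.

```python
import math
from typing import Dict

Profile = Dict[str, float]  # glycan composition -> relative intensity


def cosine_similarity(a: Profile, b: Profile) -> float:
    """Cosine of the angle between two profiles; absent glycans count as 0."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_reference(sample: Profile, references: Dict[str, Profile]) -> str:
    """Label of the reference profile most similar to the sample."""
    return max(references,
               key=lambda label: cosine_similarity(sample, references[label]))


# Invented reference and patient profiles, for illustration only.
references = {
    "non-tumor": {"Hex6HexNAc2": 0.7, "Hex5dHex1HexNAc5": 0.1, "Hex7dHex1HexNAc6": 0.2},
    "tumor": {"Hex6HexNAc2": 0.2, "Hex5dHex1HexNAc5": 0.5, "Hex7dHex1HexNAc6": 0.3},
}
sample = {"Hex6HexNAc2": 0.25, "Hex5dHex1HexNAc5": 0.45, "Hex7dHex1HexNAc6": 0.30}
label = closest_reference(sample, references)
```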

Detection of cancers at an early stage is crucial for their efficient treatment. Despite advances in diagnostic technologies, many cases of cancer are not diagnosed and treated until the malignant cells have invaded the surrounding tissue or metastasized throughout the body. Although current diagnostic approaches have significantly contributed to the detection of cancer, they still present problems in sensitivity and specificity.

In accordance with one or more embodiments of the present invention, it will be understood that the types of cancer diagnosis which may be made, using the methods provided herein, are not necessarily limited. For purposes herein, the cancer can be any cancer. As used herein, by the term "cancer" is meant any malignant growth or tumor caused by abnormal and uncontrolled cell division that may spread to other parts of the body through the lymphatic system or the blood stream.

The cancer can be a metastatic cancer or a non-metastatic (e.g., localized) cancer. As used herein, the term "metastatic cancer" refers to a cancer in which cells of the cancer have metastasized, e.g., the cancer is characterized by metastasis of cancer cells. The metastasis can be regional metastasis or distant metastasis, as described herein.

In accordance with an embodiment, the present invention provides a use of a glycan profile prepared using the method disclosed herein to diagnose a disease or condition in a subject, comprising comparing the glycan profile from a subject to a glycan profile from a normal sample, or diseased sample, and determining whether the sample of the subject has the disease or condition.

In accordance with the inventive methods, the terms "cancers" or "tumors" also include but are not limited to adrenal gland cancer, biliary tract cancer; bladder cancer, brain cancer; breast cancer; cervical cancer; choriocarcinoma; colon cancer; endometrial cancer; esophageal cancer; extrahepatic bile duct cancer; gastric cancer; head and neck cancer; intraepithelial neoplasms; kidney cancer; leukemia; lymphomas; liver cancer; lung cancer (e.g. small cell and non-small cell); melanoma; multiple myeloma; neuroblastomas; oral cancer; ovarian cancer; pancreas cancer; prostate cancer; rectal cancer; sarcomas; skin cancer; small intestine cancer; testicular cancer; thyroid cancer; uterine cancer; urethral cancer and renal cancer, as well as other carcinomas and sarcomas.

An extensive listing of cancer types includes but is not limited to acute lymphoblastic leukemia (adult), acute lymphoblastic leukemia (childhood), acute myeloid leukemia (adult), acute myeloid leukemia (childhood), adrenocortical carcinoma, adrenocortical carcinoma (childhood), AIDS-related cancers, AIDS-related lymphoma, anal cancer, astrocytoma (childhood cerebellar), astrocytoma (childhood cerebral), basal cell carcinoma, bile duct cancer (extrahepatic), bladder cancer, bladder cancer (childhood), bone cancer (osteosarcoma/malignant fibrous histiocytoma), brain stem glioma (childhood), brain tumor (adult), brain tumor: brain stem glioma (childhood), brain tumor: cerebellar astrocytoma (childhood), brain tumor: cerebral astrocytoma/malignant glioma (childhood), brain tumor: ependymoma (childhood), brain tumor: medulloblastoma (childhood), brain tumor: supratentorial primitive neuroectodermal tumors (childhood), brain tumor: visual pathway and hypothalamic glioma (childhood), breast cancer (female, male, childhood), bronchial adenomas/carcinoids (childhood), Burkitt's lymphoma, carcinoid tumor (childhood), carcinoid tumor (gastrointestinal), carcinoma of unknown primary site (adult and childhood), central nervous system lymphoma (primary), cerebellar astrocytoma (childhood), cerebral astrocytoma/malignant glioma (childhood), cervical cancer, chronic lymphocytic leukemia, chronic myelogenous leukemia, chronic myeloproliferative disorders, colon cancer, colorectal cancer (childhood), cutaneous T-cell lymphoma, endometrial cancer, ependymoma (childhood), esophageal cancer, esophageal cancer (childhood), Ewing's family of tumors, extracranial germ cell tumor (childhood), extragonadal germ cell tumor, extrahepatic bile duct cancer, eye cancer (intraocular melanoma and retinoblastoma), gallbladder cancer, gastric (stomach) cancer, gastric (stomach) cancer (childhood), gastrointestinal carcinoid tumor, gastrointestinal stromal tumor (GIST), germ cell tumor (extracranial (childhood), extragonadal, ovarian), gestational trophoblastic tumor, glioma (adult), glioma (childhood: brain stem, cerebral astrocytoma, visual pathway and hypothalamic), hairy cell leukemia, head and neck cancer, hepatocellular (liver) cancer (adult primary and childhood primary), Hodgkin's lymphoma (adult and childhood), Hodgkin's lymphoma during pregnancy, hypopharyngeal cancer, hypothalamic and visual pathway glioma (childhood), intraocular melanoma, islet cell carcinoma (endocrine pancreas), Kaposi's sarcoma, kidney (renal cell) cancer, kidney cancer (childhood), laryngeal cancer, laryngeal cancer (childhood), leukemia: acute lymphoblastic (adult and childhood), leukemia: acute myeloid (adult and childhood), leukemia: chronic lymphocytic, leukemia: chronic myelogenous, leukemia: hairy cell, lip and oral cavity cancer, liver cancer (adult primary and childhood primary), lung cancer: non-small cell, lung cancer: small cell, lymphoma: AIDS-related, lymphoma: Burkitt's, lymphoma: cutaneous T-cell, lymphoma: Hodgkin's (adult, childhood and during pregnancy), lymphoma: non-Hodgkin's (adult, childhood and during pregnancy), lymphoma: primary central nervous system, macroglobulinemia: Waldenstrom's, malignant fibrous histiocytoma of bone/osteosarcoma, medulloblastoma (childhood), melanoma, melanoma: intraocular (eye), Merkel cell carcinoma, mesothelioma (adult) malignant, mesothelioma (childhood), metastatic squamous neck cancer with occult primary, multiple endocrine neoplasia syndrome (childhood), multiple myeloma/plasma cell neoplasm, mycosis fungoides, myelodysplastic syndromes, myelodysplastic/myeloproliferative diseases, myelogenous leukemia, chronic, myeloid leukemia (adult and childhood) acute, myeloma: multiple, myeloproliferative disorders: chronic, nasal cavity and paranasal sinus cancer, nasopharyngeal cancer, nasopharyngeal cancer (childhood), neuroblastoma, non-small cell lung cancer, oral cancer (childhood), oral cavity and lip cancer, oropharyngeal cancer, osteosarcoma/malignant fibrous histiocytoma of bone, ovarian cancer (childhood), ovarian epithelial cancer, ovarian germ cell tumor, ovarian low malignant potential tumor, pancreatic cancer, pancreatic cancer (childhood), pancreatic cancer: islet cell, paranasal sinus and nasal cavity cancer, parathyroid cancer, penile cancer, pheochromocytoma, pineoblastoma and supratentorial primitive neuroectodermal tumors (childhood), pituitary tumor, plasma cell neoplasm/multiple myeloma, pleuropulmonary blastoma, pregnancy and breast cancer, primary central nervous system lymphoma, prostate cancer, rectal cancer, renal cell (kidney) cancer, renal cell (kidney) cancer (childhood), renal pelvis and ureter: transitional cell cancer, retinoblastoma, rhabdomyosarcoma (childhood), salivary gland cancer, salivary gland cancer (childhood), sarcoma: Ewing's family of tumors, sarcoma: Kaposi's, sarcoma: soft tissue (adult and childhood), sarcoma: uterine, Sezary syndrome, skin cancer (non-melanoma), skin cancer (childhood), skin cancer (melanoma), skin carcinoma: Merkel cell, small cell lung cancer, small intestine cancer, soft tissue sarcoma (adult and childhood), squamous cell carcinoma, squamous neck cancer with occult primary: metastatic, stomach (gastric) cancer, stomach (gastric) cancer (childhood), supratentorial primitive neuroectodermal tumors (childhood), testicular cancer, thymoma (childhood), thymoma and thymic carcinoma, thyroid cancer, thyroid cancer (childhood), transitional cell cancer of the renal pelvis and ureter, trophoblastic tumor, gestational, ureter and renal pelvis: transitional cell cancer, urethral cancer, uterine cancer: endometrial, uterine sarcoma, vaginal cancer, visual pathway and hypothalamic glioma (childhood), vulvar cancer, Waldenstrom's macroglobulinemia, and Wilms' tumor.

Pancreatic cancer is a devastating cancer with uniformly poor prognosis and treatment options. For localized disease the 5-year survival is approximately 15% and the median survival for locally advanced or metastatic disease is 10 months or less. New approaches to identify more effective biomarkers for early disease, and identify potential therapeutic targets are needed. The glycan panels identified by the present method fit in both of these areas.

Pancreatic cancer is the 4th leading cause of cancer-related death and one of the most highly aggressive and lethal of all solid malignancies. Worldwide, over 200,000 individuals are diagnosed with pancreatic cancer each year, and due to the asymptomatic nature of its early stages, coupled with inadequate methods for early detection, the majority of patients (>75%) present with locally advanced and inoperable forms of the cancer at the time of diagnosis. At these advanced stages, available chemotherapy, radiation and combinatorial therapies are largely anecdotal, and less than 5% of patients survive up to five years post diagnosis.

One way to aid in the clinical management of cancer patients is through the use of serum biomarkers. Biomarkers are measurable indicators of a biological state or condition, and in the context of cancer, serum biomarkers present a non-invasive and relatively cost effective means to aid in detection, monitor tumor progression and response to therapy, and for other measurable outcomes of disease. The most widely used biomarker in the clinic for pancreatic cancer is carbohydrate antigen 19.9 (CA19.9), a sialylated Lewis A antigen found on the surface of proteins. While CA19.9 is elevated in late stage disease, it is also elevated in benign and inflammatory diseases of the pancreas and in other malignancies of the gastrointestinal tract. As well, for early-stage pancreatic cancer detection, CA19.9 has a reported sensitivity of approximately 55% and is often undetectable in many asymptomatic individuals. Other tumor markers such as members of the carcinoembryonic antigen (CEA) and mucin (MUC) families have also been associated with pancreatic cancer. When used in combination, with or without CA19.9, some of these markers have shown enhanced sensitivity and specificity; however, none have become a constant fixture in the clinic. The lack of a single highly specific and sensitive marker has led to a growing consensus in the field towards the development of multiparametric panels of biomarkers, whereby the combinatorial assessment of multiple molecules can likely achieve increased sensitivity and specificity for disease detection and management.

Accordingly, the present methods of profiling glycans and generating glycan panels are particularly useful because they can detect and monitor pancreatic cancer, for example, in the early stages of the disease, thereby increasing survival rates.

In one embodiment, the generation of glycan biomarker panels comprises several steps, although not all steps have to be carried out in this order. In step (a), a tissue section is heated in antigen retrieval buffer and then washed in xylene to remove the paraffin. In step (b), digestion of proteins on a tissue microarray section is achieved by incubation with an enzyme. In step (c), a matrix is deposited on the tissue microarray section. In step (d), released glycan ions are detected by mass spectrometry. Data from a random selection of a first percent of the tissue cores is utilized for a model training set while the remaining second percent is used for external model validation. In step (e), data from the first percent is used to generate any of a variety of machine learning models. In step (f), the models are optimized by cross-validation on the first percent, and the performance is qualified using the second percent to generate panels of glycans with individual sensitivities and specificities. The method is amenable to modification of the tissue by degradative enzymes to enhance and alter the types of analytes detected by the method.

In one embodiment, the generation of glycan biomarker panels comprises several steps, although not all steps have to be carried out in this order. In step (a), a tissue section is heated in antigen retrieval buffer and then washed in xylene to remove the paraffin. In step (b), enzymatic digestion of proteins on a thin FFPE TMA section composed of normal and tumor cores is achieved by incubation with PNGaseF. In step (c), an α-cyano-4-hydroxycinnamic acid matrix is deposited on the thin FFPE TMA tissue section. In step (d), released glycan ions are detected by FTICR mass spectrometry. Data from a random selection of a first percent of the tissue cores is utilized for a model training set while the remaining second percent is used for external model validation. In step (e), data from the first percent is used to generate any of a variety of machine learning models. In step (f), the models are optimized by cross-validation on the first percent, and the performance is validated using the second percent to generate panels of glycans with individual sensitivities and specificities.
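The random division of tissue cores into a training set and an external validation set, and the reporting of sensitivity and specificity, can be sketched as follows. The function names, the fixed seed, and the 0.68 training fraction are assumptions; the fraction is chosen only because it yields a 101/48 split of 149 cores, and the disclosure itself speaks only of a "first percent" and a "second percent".

```python
import random
from typing import List, Sequence, Tuple


def split_cores(core_ids: Sequence[str], train_fraction: float,
                seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly split core identifiers into (training, validation) lists."""
    shuffled = list(core_ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def sensitivity_specificity(truth: Sequence[bool],
                            predicted: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    Assumes at least one tumor and one non-tumor case in `truth`.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)
    fn = sum(t and not p for t, p in pairs)
    tn = sum(not t and not p for t, p in pairs)
    fp = sum(not t and p for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


cores = [f"core{i}" for i in range(149)]  # hypothetical identifiers
training, validation = split_cores(cores, train_fraction=0.68)

# Invented validation outcome: True = tumor.
truth = [True, True, True, False, False]
predicted = [True, True, False, False, True]
sens, spec = sensitivity_specificity(truth, predicted)
```

Keeping the validation cores out of model fitting and cross-validation is what makes the reported sensitivity and specificity an external check rather than a fit statistic.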

In accordance with one or more embodiments of the present invention, it will be understood that the types of models used for machine learning, using the methods provided herein, are not necessarily limited. For example, the machine learning models may include random forest, support vector machines, discriminant analysis, neural networks, artificial neural networks, naive Bayes classifier, genetic algorithm, and k-nearest neighbors.
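As one concrete instance of the model types listed above, a k-nearest neighbors classifier can be written in a few lines. This is a minimal sketch assuming each core is represented by a vector of glycan intensities; the feature values, `k = 3`, and the Euclidean distance are illustrative choices, not parameters taken from the disclosure.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (glycan intensity vector, class label)


def knn_predict(training: List[Sample], query: Sequence[float],
                k: int = 3) -> str:
    """Majority label among the k training samples closest to the query."""
    by_distance = sorted(training, key=lambda s: math.dist(s[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Invented two-feature training data, for illustration only.
training_data = [
    ([0.70, 0.10], "non-tumor"),
    ([0.60, 0.20], "non-tumor"),
    ([0.65, 0.15], "non-tumor"),
    ([0.20, 0.50], "tumor"),
    ([0.30, 0.60], "tumor"),
    ([0.25, 0.55], "tumor"),
]
prediction_a = knn_predict(training_data, [0.28, 0.50])
prediction_b = knn_predict(training_data, [0.66, 0.12])
```

In practice `k` would be one of the parameters tuned by cross-validation on the training set.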

In one embodiment, the invention provides a glycan panel for diagnosing cancer. In one embodiment, the glycan panel comprises one or more of a Hex6HexNAc2 glycan, a Hex4dHex1HexNAc3 glycan, a Hex3dHex1HexNAc4 glycan, a Hex4HexNAc4 glycan, a Hex7HexNAc2 glycan, a Hex4dHex1HexNAc4 glycan, a Hex3dHex1HexNAc5 glycan, a Hex8HexNAc2 glycan, a Hex5dHex1HexNAc4 glycan, a Hex4dHex1HexNAc5 glycan, a Hex5HexNAc4NeuAc1 glycan, a Hex5dHex1HexNAc5 glycan, a Hex6HexNAc5 glycan, a Hex5dHex1HexNAc4NeuAc1 glycan, a Hex5dHex2HexNAc5 glycan, a Hex6dHex1HexNAc5 glycan, a Hex6dHex2HexNAc5 glycan, a Hex7HexNAc6 glycan, a Hex9HexNAc3NeuAc1 glycan, a Hex7dHex1HexNAc6 glycan, a Hex7dHex1HexNAc7 glycan, and a Hex9dHex1HexNAc8 glycan. However, the invention should not be limited to these glycans. Rather, any glycan and combination of glycans identified from the methods of profiling glycans of the invention can be a component of the glycan panel of the invention.

As a non-limiting example, 290 pancreatic tissue cores (150 tumor and 140 non-tumor sections from 149 unique samples from 76 patients) in tissue microarray (TMA) format were analyzed using the present methods of profiling glycans and generating glycan panels. Of the 149 unique tissue cores, a training set of 52 tumor sections and 49 non-tumor sections was assessed for differential detection of glycan species, followed by qualification using a validation set of 23 tumor sections and 25 non-tumor sections. The study conducted has provided a number of specific glycans that can be used in the diagnosis of pancreatic diseases such as cancer. These glycans include, for example, a Hex6HexNAc2 glycan, a Hex4dHex1HexNAc3 glycan, a Hex3dHex1HexNAc4 glycan, a Hex4HexNAc4 glycan, a Hex7HexNAc2 glycan, a Hex4dHex1HexNAc4 glycan, a Hex3dHex1HexNAc5 glycan, a Hex8HexNAc2 glycan, a Hex5dHex1HexNAc4 glycan, a Hex4dHex1HexNAc5 glycan, a Hex5HexNAc4NeuAc1 glycan, a Hex5dHex1HexNAc5 glycan, a Hex6HexNAc5 glycan, a Hex5dHex1HexNAc4NeuAc1 glycan, a Hex5dHex2HexNAc5 glycan, a Hex6dHex1HexNAc5 glycan, a Hex6dHex2HexNAc5 glycan, a Hex7HexNAc6 glycan, a Hex9HexNAc3NeuAc1 glycan, a Hex7dHex1HexNAc6 glycan, a Hex7dHex1HexNAc7 glycan, and a Hex9dHex1HexNAc8 glycan. The composition of a glycan recited here is meant to refer to any glycan with the particular types and numbers of saccharides represented by the composition notation. For example, a "Hex5dHex1HexNAc4NeuAc1" glycan encompasses any glycan that contains 5 hexoses, 1 deoxyhexose, 4 N-acetyl hexosamines, and 1 N-acetyl neuraminic acid. These saccharides can be present in any order in the glycan and can be linked to each other with any of a number of types of linkages, including for example, α1-2, α1-3, α1-6, α2-3, α2-6, β1-2, or β1-4 linkages. It will be recognized by one of ordinary skill in the art that the glycans provided may exist in a modified form, for example, derivatives, enzymatically modified versions, a precursor form in a sample, or modified as part of an analytic method used for its detection.
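The composition notation explained above (residue abbreviation followed by a count) lends itself to mechanical parsing. The sketch below is a hypothetical helper, not part of the disclosed method, and it assumes only the four residue abbreviations used in this document: Hex, dHex, HexNAc and NeuAc.

```python
import re
from typing import Dict

# "HexNAc" is listed before "Hex" so the longer abbreviation is tried first.
_RESIDUES = r"(?:dHex|HexNAc|NeuAc|Hex)"
_TOKEN = re.compile(r"(dHex|HexNAc|NeuAc|Hex)(\d+)")
_WHOLE = re.compile(rf"(?:{_RESIDUES}\d+)+")


def parse_composition(text: str) -> Dict[str, int]:
    """Parse e.g. 'Hex5dHex1HexNAc4NeuAc1' into residue counts."""
    if not _WHOLE.fullmatch(text):
        raise ValueError(f"unrecognized composition: {text!r}")
    counts: Dict[str, int] = {}
    for residue, number in _TOKEN.findall(text):
        counts[residue] = counts.get(residue, 0) + int(number)
    return counts


counts = parse_composition("Hex5dHex1HexNAc4NeuAc1")
```

Because the notation records only residue types and counts, the parsed result says nothing about residue order or linkage, consistent with the definition given in the text.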

In some embodiments, the glycans above are (with sodium adduct) a Hex6HexNAc2 glycan with m/z = 1419.4755, a Hex3dHex1HexNAc5 glycan with m/z = 1688.6130, a Hex5dHex1HexNAc5 glycan with m/z = 2012.7187, a Hex5dHex1HexNAc4NeuAc1 glycan with m/z = 2100.7347, a Hex3dHex4HexNAc5 glycan with m/z = 2126.7868, a Hex5dHex2HexNAc5 glycan with m/z = 2158.7766, a Hex6dHex1HexNAc5 glycan with m/z = 2174.7715, a Hex6dHex2HexNAc5 glycan with m/z = 2320.8294, a Hex7dHex1HexNAc6 glycan with m/z = 2539.9037, and a Hex7dHex1HexNAc7 glycan with m/z = 2742.9831. As used herein, "a glycan with m/z = 1419.4755" with sodium adduct is meant to refer to a glycan that can be determined to have the recited mass with MALDI-FTICR. It will be understood by one of ordinary skill in the art that the mass recited is approximate and includes the mass of a sodium adduct ion. The definition is meant to identify the particular glycan and is not intended to be limited by a specific method of analysis.

Another aspect of the invention is a composition of glycans that can be used for treating or preventing ovarian cancer. The compositions include glycans used to elicit a protective immune response in patients with a high risk of developing cancer. The compositions can also be used to enhance the immune response of patients that have cancer. The compositions can also be used to prepare isolated antibody preparations useful for passive immunization of patients who have developed or may develop ovarian cancer.

The present invention describes glycans, which are specifically expressed by certain cancer cells, tumors and other malignant tissues. The present invention describes methods to detect cancer specific glycans as well as methods for the production of reagents binding to the glycans. The invention is also directed to the use of the glycans and reagents binding to them for the diagnostics of cancer and malignancies. Furthermore, the invention is directed to the use of the glycans and reagents binding to them for the treatment of cancer and malignancies. Moreover, the present invention comprises efficient methods to differentiate between malignant and benign tumors by analyzing glycan structures.

Glycan Carrier Proteins Attached to Specific Glycans

The invention is further directed to methods of identifying glycoproteins attached to any of the glycans according to the invention, preferably from integral (cell-bound/transmembrane) cancer tissue proteins or cell-released proteins, and of assigning the glycan structures to specific carrier proteins. The assignment is preferably made by specific purification of the protein, e.g., by affinity methods such as immunoprecipitation, or by sequencing, preferably mass spectrometric sequencing of glycopeptides, which includes sequencing and recognizing the peptides, and thus the proteins, linked to the glycans.

In some embodiments, the determined glycosylation marker of cancer can be used for identifying and isolating one or more glycoprotein biomarkers, i.e., glycoproteins that are specific for a particular type of cancer. The glycoprotein biomarker of the disease carries the glycosylation marker of cancer. The isolation of the glycoprotein biomarkers of the cancer can be carried out using lectins or monoclonal antibodies.

The glycosylation of a protein may be indicative of a normal or a disease state. Therefore, methods are provided for diagnostic purposes based on the analysis of the glycosylation of a protein or set of proteins, such as the total glycome. The methods provided herein can be used for the diagnosis of any disease or condition that is caused by or results in changes in a particular protein glycosylation or pattern of glycosylation. These patterns can then be compared to "normal" and/or "diseased" patterns to develop a diagnosis and treatment for a subject. For example, the methods provided can be used in the diagnosis of cancer, inflammatory disease, benign prostatic hyperplasia (BPH), etc.

The diagnosis can be carried out in a person with or thought to have a disease or condition. The diagnosis can also be carried out in a person thought to be at risk for a disease or condition. "A person at risk" is one that has either a genetic predisposition to have the disease or condition or is one that has been exposed to a factor that could increase his/her risk of developing the disease or condition.

The present invention provides glycosylation markers associated with cancer. In one embodiment, the glycosylation marker is an organic biomolecule which is differentially present in a sample taken from an individual of one phenotypic status (e.g., having a disease) as compared with an individual of another phenotypic status (e.g., not having the disease). A biomarker is differentially present between the two individuals if the difference in the mean or median expression level, including glycosylation level, of the biomarker between the individuals is calculated to be statistically significant. Biomarkers, alone or in combination, provide measures of relative risk that an individual belongs to one phenotypic status or another. Therefore, they are useful as markers for diagnosis of disease, the severity of disease, therapeutic effectiveness of a drug, and drug toxicity.

In one embodiment, the method of the invention is carried out by obtaining a set of measured values for a plurality of biomarkers from a biological sample derived from a test individual, obtaining a set of measured values for a plurality of biomarkers from a biological sample derived from a control individual, comparing the measured values for each biomarker between the test and control sample, and identifying biomarkers which are significantly different between the test value and the control value, also referred to as a reference value.
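The comparison of test and control values can be sketched in code. The description does not name a statistical test, so the exact permutation test on the difference in means used below is an assumption chosen for illustration. The marker names, the intensity values, the 0.05 threshold, and the function names are likewise hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(test_values, control_values):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(test_values) + list(control_values)
    n_test = len(test_values)
    n_control = len(pooled) - n_test
    observed = abs(mean(test_values) - mean(control_values))
    pooled_sum = sum(pooled)
    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_test of the pooled values as "test".
    for indices in combinations(range(len(pooled)), n_test):
        group_sum = sum(pooled[i] for i in indices)
        diff = abs(group_sum / n_test - (pooled_sum - group_sum) / n_control)
        if diff >= observed - 1e-12:  # small slack for floating-point ties
            extreme += 1
        total += 1
    return extreme / total


def differential_markers(test, control, alpha=0.05):
    """Return the markers whose test and control values differ at level alpha."""
    return [
        marker
        for marker in test
        if marker in control
        and permutation_p_value(test[marker], control[marker]) < alpha
    ]


if __name__ == "__main__":
    # Hypothetical measured values for two markers in four samples per group.
    test = {"glycanA": [10, 11, 12, 13], "glycanB": [5, 6, 5, 6]}
    control = {"glycanA": [1, 2, 3, 4], "glycanB": [6, 5, 6, 5]}
    print(differential_markers(test, control))
```

Exhaustive enumeration is practical only for small groups. For larger sample sets a randomized permutation test or a standard parametric or rank-based test would be the usual substitute, and testing many markers at once generally calls for a multiple-comparison correction.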

The process of comparing a measured value and a reference value can be carried out in any convenient manner appropriate to the type of measured value and reference value for the biomarker of the invention. For example, "measuring" can be performed using quantitative or qualitative measurement techniques, and the mode of comparing a measured value and a reference value can vary depending on the measurement technology employed. For example, when a qualitative colorimetric assay is used to measure biomarker levels, the levels may be compared by visually comparing the intensity of the colored reaction product, or by comparing data from densitometric or spectrometric measurements of the colored reaction product (e.g., comparing numerical data or graphical data, such as bar charts, derived from the measuring device). However, it is expected that the measured values used in the methods of the invention will most commonly be quantitative values (e.g., quantitative measurements of concentration). In other examples, measured values are qualitative. As with qualitative measurements, the comparison can be made by inspecting the numerical data, or by inspecting representations of the data (e.g., inspecting graphical representations such as bar or line graphs).

A measured value is generally considered to be substantially equal to or greater than a reference value if it is at least about 95% of the reference value. A measured value is considered less than a reference value if it is less than about 95% of the reference value. A measured value is considered more than a reference value if it is more than about 5% greater than the reference value.
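The thresholds in the preceding paragraph can be expressed as a small comparison routine. This is a minimal sketch: the function name, the three category labels, and the treatment of "about 95%" and "about 5%" as exact cut-offs are assumptions made for illustration.

```python
def compare_to_reference(measured, reference, tolerance=0.05):
    """Classify a measured value against a positive reference value.

    Returns "less" when the measured value is below (1 - tolerance) of the
    reference, "more" when it exceeds the reference by more than tolerance,
    and "substantially equal" otherwise.
    """
    if reference <= 0:
        raise ValueError("reference value must be positive")
    ratio = measured / reference
    if ratio < 1 - tolerance:
        return "less"
    if ratio > 1 + tolerance:
        return "more"
    return "substantially equal"


if __name__ == "__main__":
    for value in (90, 98, 104, 110):
        print(value, compare_to_reference(value, 100))
```

A value classified as "more" also satisfies the broader "substantially equal to or greater than" condition of the text, since it is at least about 95% of the reference; the sketch reports only the narrowest category.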

The process of comparing may be manual (such as visual inspection by the practitioner of the method) or it may be automated. For example, an assay device (such as a luminometer for measuring chemiluminescent signals) may include circuitry and software enabling it to compare a measured value with a reference value for a desired biomarker. Alternately, a separate device (e.g., a digital computer) may be used to compare the measured value(s) and the reference value(s). Automated devices for comparison may include stored reference values for the biomarker(s) being measured, or they may compare the measured value(s) with reference values that are derived from contemporaneously measured reference samples.

The above method for screening biomarkers can find biomarkers that are differentially glycosylated in cancer as well as at various dysplastic stages of tissue that progresses to cancer. The screened biomarker can be used for cancer screening, risk assessment, prognosis, disease identification, the diagnosis of disease stages, and the selection of therapeutic targets.

According to the method of the present invention, the progression of cancer at various stages or phases can be diagnosed by determining the glycosylation stage of one or more biomarkers obtained from a sample. By comparing the glycosylation stage of a biomarker from a sample at each stage of cancer with the glycosylation stage of one or more biomarkers isolated from a sample in which there is no cell proliferative disorder of tissue, a specific stage of cancer in the sample can be detected. In one embodiment, the glycosylation stage may be hyperglycosylation. In another embodiment, the glycosylation stage may be hypoglycosylation.

The Target Cancer Tissue

The present invention is directed to the analysis of abnormally transformed tissues, where the transformation is a benign and/or malignant cancer-type transformation, referred to as cancer (or tumor). Benign transformation may be a step towards malignant transformation, and thus benign cancers are also useful to analyze and to differentiate from normal tissue, which may also have noncancerous or non-transformation-related alterations such as swelling or trauma, whether physical or, e.g., infectious in origin. It is also useful and preferred to differentiate between benign and malignant cancers.

In one embodiment, the tissue is human tissue or a tissue part such as liquid tissue, cells and/or solid polycellular tumors, and in another embodiment preferably a solid human tissue. Solid tissues are preferred for the analysis and/or targeting of specific glycan marker structures from the tissues, including intracellularly and extracellularly localized markers, preferably cell surface associated markers. In a preferred embodiment the invention is specifically directed to the recognition of cell surface localized and/or mostly cell surface localized marker structures from solid tumor tissues or parts thereof. It is realized that the contacts between cells, and the glycomes mediating these contacts, are affected by whether the cells are present as a solid tumor or as more individual cells. The preferred individual cell type cancers or tumors include blood derived tumors such as leukemias and lymphomas, while the preferred solid tumors include those derived from solid tissues such as gastrointestinal tract tissues and other internal organs such as liver, kidneys, spleen, pancreas, lungs, and gonads and associated organs, preferably including ovary, testicle, and prostate. The invention further reveals markers from individually or multicellularly presented cancer cells in contrast to solid tumors. The preferred cancer cells to be analyzed include metastatic cells released from tumors/cancer and blood cell derived cancers, such as leukemias and/or lymphomas. Metastasis from solid tissue tumors forms a separately preferred class of cancer samples with specific characteristics.

The cancer tissue materials to be analyzed according to the invention are also referred to herein as tissue materials or simply as cells, because all tissues comprise cells; however, the invention is preferably directed to unicellularly and/or multicellularly expressed cancer cells and/or solid tumors as separate preferred characteristics. The invention further reveals normal tissue materials to be compared with the cancer materials. The invention is specifically directed to methods for revealing the status of a transformed tissue or suspected cancer sample, in which the expression of a specific structure, or of a signal correlated with it, is compared to an expression level estimated to correspond to expression in normal tissue, or compared with the expression level in a standard sample from the same tissue, preferably a tissue sample from a healthy part of the same tissue from the same patient.

The invention is in a preferred embodiment directed to analysis of the marker structures and/or glycan profiles from both cancer tissue and corresponding normal tissue of the same patient, because part of the glycosylation includes individual changes, for example those related to rare glycosylation related diseases such as congenital disorders of glycosylation (of glycoproteins/carbohydrates) and/or glycan storage diseases. The invention is furthermore directed to a method of verifying and analyzing the importance and/or change of a specific structure, structure group, or glycan group in the glycome in a specific cancer and/or a subtype of a cancer, optionally with a specific status (e.g., primary cancer, metastatic, benign transformation related to a cancer), by methods according to the present invention.

Diagnostic

In one embodiment, diagnostic tests that use the biomarkers of the invention (e.g., glycans) exhibit a sensitivity and specificity of at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or about 100%. In some instances, screening tools of the present invention exhibit a high sensitivity of at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or about 100%.

In one embodiment, the sensitivity is from about 75% to about 99%, or from about 80% to about 90%, or from about 80% to about 85%. Preferably, the specificity is from about 75% to about 99%, or from about 80% to about 90%, or from about 80% to about 85%.
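For reference, sensitivity and specificity follow from the counts of true and false calls in a labelled sample set. The short sketch below uses the standard definitions; the function name and the example labels are hypothetical and do not represent data from the study.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for boolean labels (True = tumor)."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    tp = fn = tn = fp = 0
    for truth, call in zip(true_labels, predicted_labels):
        if truth and call:
            tp += 1  # tumor correctly called tumor
        elif truth and not call:
            fn += 1  # tumor missed
        elif not truth and not call:
            tn += 1  # non-tumor correctly called non-tumor
        else:
            fp += 1  # non-tumor called tumor
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one tumor and one non-tumor sample")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical calls for four tumor and four non-tumor sections.
    truth = [True, True, True, True, False, False, False, False]
    calls = [True, True, True, False, False, False, False, False]
    print(sensitivity_specificity(truth, calls))  # (0.75, 1.0)
```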

In another embodiment, the present invention enables the screening of at-risk populations for the early detection of cancers, for example pancreatic cancer. Furthermore, in certain aspects, the present invention enables the differentiation of neoplastic (e.g. malignant) from benign (i.e. non-cancerous) cellular proliferative disorders.

Although diagnostic and prognostic accuracy and sensitivity may be achieved by using a combination of markers, such as 2 or more biomarkers of the invention, practical considerations may dictate use of one or more biomarkers and smaller combinations thereof. Any combination of markers for a specific cancer may be used which comprises 1, 2, 3, 4, 5, 6, 7 or more markers. Combinations of 1, 2, 3, 4, 5, 6, 7 or more markers can be readily envisioned given the specific disclosures of individual markers provided herein.

The prognostic methods can be used to identify patients with cancer or at risk of cancer. Such patients can be offered additional appropriate therapeutic or preventative options, including endoscopic polypectomy or resection, and when indicated, surgical procedures, chemotherapy, radiation, biological response modifiers, or other therapies. Such patients may also receive recommendations for further diagnostic or monitoring procedures, including but not limited to increased frequency of colonoscopy, virtual colonoscopy, video capsule endoscopy, PET-CT, molecular imaging, or other imaging techniques.

Following the diagnosis of a subject according to the methods of the invention, the subject diagnosed with cancer or at risk for having a proliferative disease, such as cancer can be treated against the disease. Accordingly, the method comprises administering to the subject a therapeutically effective amount of a therapeutic agent, thereby treating a subject having or at risk for having a proliferative disease.

Anti-cancer drugs may be used in the various embodiments of the invention, including in pharmaceutical compositions, dosage forms, and kits of the invention. One type of anti-cancer drug includes cytotoxic agents (i.e., drugs that kill cancer cells in different ways). These include the alkylating agents, antimetabolites, antitumor antibiotics, and plant drugs. Another type of anti-cancer drug includes hormones and hormone antagonists. Some tumors require the presence of hormones to grow. Many of these drugs block the effects of hormones at their tissue receptors or prevent the manufacture of hormones by the body.

Another type of anti-cancer drug includes biological response modifiers. These drugs enhance the ability of the body's immune system to detect and destroy the cancer.

Non-limiting examples of anti-cancer drugs include: acivicin; aclarubicin; acodazole hydrochloride; acronine; adozelesin; aldesleukin; altretamine; ambomycin; ametantrone acetate; aminoglutethimide; amsacrine; anastrozole; anthramycin; asparaginase; asperlin; azacitidine; azetepa; azotomycin; batimastat; benzodepa; bicalutamide; bisantrene hydrochloride; bisnafide dimesylate; bizelesin; bleomycin sulfate; brequinar sodium; bropirimine; busulfan; cactinomycin; calusterone; caracemide; carbetimer; carboplatin; carmustine; carubicin hydrochloride; carzelesin; cedefingol; chlorambucil; cirolemycin; cisplatin; cladribine; crisnatol mesylate; cyclophosphamide; cytarabine; dacarbazine; dactinomycin; daunorubicin hydrochloride; decitabine; dexormaplatin; dezaguanine; dezaguanine mesylate; diaziquone; docetaxel; doxorubicin; doxorubicin hydrochloride; droloxifene; droloxifene citrate; dromostanolone propionate; duazomycin; edatrexate; eflornithine hydrochloride; elsamitrucin; enloplatin; enpromate; epipropidine; epirubicin hydrochloride; erbulozole; esorubicin hydrochloride; estramustine; estramustine phosphate sodium; etanidazole; etoposide; etoposide phosphate; etoprine; fadrozole hydrochloride; fazarabine; fenretinide; floxuridine; fludarabine phosphate; fluorouracil; flurocitabine; fosquidone; fostriecin sodium; gemcitabine; gemcitabine hydrochloride; hydroxyurea; idarubicin hydrochloride; ifosfamide; ilmofosine; interleukin II (including recombinant interleukin II, or rIL2); interferon alfa-2a; interferon alfa-2b; interferon alfa-n1; interferon alfa-n3; interferon beta-1a; interferon gamma-1b; iproplatin; irinotecan hydrochloride; lanreotide acetate; letrozole; leuprolide acetate; liarozole hydrochloride; lometrexol sodium; lomustine; losoxantrone hydrochloride; masoprocol; maytansine; mechlorethamine; mechlorethamine oxide hydrochloride; mechlorethamine hydrochloride; megestrol acetate; melengestrol acetate; melphalan;
menogaril; mercaptopurine; methotrexate; methotrexate sodium; metoprine; meturedepa; mitindomide; mitocarcin; mitocromin; mitogillin; mitomalcin; mitomycin; mitosper; mitotane; mitoxantrone hydrochloride; mycophenolic acid; nocodazole; nogalamycin; ormaplatin; oxisuran; paclitaxel; pegaspargase; peliomycin; pentamustine; peplomycin sulfate; perfosfamide; pipobroman; piposulfan; piroxantrone hydrochloride; plicamycin; plomestane; porfimer sodium; porfiromycin; prednimustine; procarbazine hydrochloride; puromycin; puromycin hydrochloride; pyrazofurin; riboprine; rogletimide; safingol; safingol hydrochloride; semustine; simtrazene; sparfosate sodium; sparsomycin; spirogermanium hydrochloride; spiromustine; spiroplatin; streptonigrin; streptozocin; sulofenur; talisomycin; tecogalan sodium; tegafur; teloxantrone hydrochloride; temoporfin; teniposide; teroxirone; testolactone; thiamiprine; thioguanine; thiotepa; tiazofurin; tirapazamine; toremifene citrate; trestolone acetate; triciribine phosphate; trimetrexate; trimetrexate glucuronate; triptorelin; tubulozole hydrochloride; uracil mustard; uredepa; vapreotide; verteporfin; vinblastine sulfate; vincristine sulfate; vindesine; vindesine sulfate; vinepidine sulfate; vinglycinate sulfate; vinleurosine sulfate; vinorelbine tartrate; vinrosidine sulfate; vinzolidine sulfate; vorozole;
zeniplatin; zinostatin; zorubicin hydrochloride, improsulfan, benzodepa, carboquone, triethylenemelamine, triethylenephosphoramide, triethylenethiophosphoramide, trimethylolomelamine, chlornaphazine, novembichin, phenesterine, trofosfamide, estermustine, chlorozotocin, gemzar, nimustine, ranimustine, dacarbazine, mannomustine, mitobronitol, aclacinomycins, actinomycin F(1), azaserine, bleomycin, carubicin, carzinophilin, chromomycin, daunorubicin, daunomycin, 6-diazo-5-oxo-L-norleucine, doxorubicin, olivomycin, plicamycin, porfiromycin, puromycin, tubercidin, zorubicin, denopterin, pteropterin, 6-mercaptopurine, ancitabine, 6-azauridine, carmofur, cytarabine, dideoxyuridine, enocitabine, pulmozyme, aceglatone, aldophosphamide glycoside, bestrabucil, defofamide, demecolcine, eflornithine, elliptinium acetate, etoglucid, flutamide, hydroxyurea, lentinan, phenamet, podophyllinic acid, 2-ethylhydrazide, razoxane, spirogermanium, tamoxifen, taxotere, tenuazonic acid, triaziquone, 2,2',2"-trichlorotriethylamine, urethan, vinblastine, vincristine, vindesine and related agents. Other examples include: 20-epi-1,25 dihydroxyvitamin D3; 5-ethynyluracil; abiraterone; aclarubicin; acylfulvene; adecypenol; adozelesin; aldesleukin; ALL-TK antagonists; altretamine; ambamustine; amidox; amifostine; aminolevulinic acid; amrubicin; amsacrine; anagrelide; anastrozole; andrographolide; angiogenesis inhibitors; antagonist D; antagonist G; antarelix; anti-dorsalizing morphogenetic protein-1; antiandrogen, prostatic carcinoma; antiestrogen; antineoplaston; antisense oligonucleotides; aphidicolin glycinate; apoptosis gene modulators; apoptosis regulators; apurinic acid; ara-CDP-DL-PTBA; arginine deaminase; asulacrine; atamestane; atrimustine; axinastatin 1; axinastatin 2; axinastatin 3; azasetron; azatoxin; azatyrosine; baccatin III derivatives; balanol; batimastat; BCR/ABL antagonists; benzochlorins; benzoylstaurosporine; beta lactam derivatives; beta-alethine; betaclamycin B; betulinic acid; bFGF inhibitor; bicalutamide; bisantrene; bisaziridinylspermine; bisnafide; bistratene A; bizelesin; breflate; bropirimine; budotitane; buthionine sulfoximine; calcipotriol; calphostin C; camptothecin derivatives; canarypox IL-2; capecitabine; carboxamide-amino-triazole; carboxyamidotriazole; CaRest M3; CARN 700; cartilage derived inhibitor; carzelesin; casein kinase inhibitors (ICOS); castanospermine; cecropin B; cetrorelix; chlorins; chloroquinoxaline sulfonamide; cicaprost; cisporphyrin; cladribine; clomifene analogues; clotrimazole; collismycin A; collismycin B; combretastatin A4;
combretastatin analogue; conagenin; crambescidin 816; crisnatol; cryptophycin 8; cryptophycin A derivatives; curacin A; cyclopentanthraquinones; cycloplatam; cypemycin; cytarabine ocfosfate; cytolytic factor; cytostatin; dacliximab; decitabine; dehydrodidemnin B; deslorelin; dexamethasone; dexifosfamide; dexrazoxane; dexverapamil; diaziquone; didemnin B; didox; diethylnorspermine; dihydro-5-azacytidine; dihydrotaxol, 9-; dioxamycin; diphenyl spiromustine; docetaxel; docosanol; dolasetron; doxifluridine; droloxifene; dronabinol; duocarmycin SA; ebselen; ecomustine; edelfosine; edrecolomab; eflornithine; elemene; emitefur; epirubicin; epristeride; estramustine analogue; estrogen agonists; estrogen antagonists; etanidazole; etoposide phosphate; exemestane; fadrozole; fazarabine; fenretinide; filgrastim; finasteride; flavopiridol; flezelastine; fluasterone; fludarabine; fluorodaunorunicin hydrochloride; forfenimex; formestane; fostriecin; fotemustine; gadolinium texaphyrin; gallium nitrate; galocitabine; ganirelix; gelatinase inhibitors; gemcitabine; glutathione inhibitors; hepsulfam; heregulin; hexamethylene bisacetamide; hypericin; ibandronic acid; idarubicin; idoxifene; idramantone; ilmofosine; ilomastat; imidazoacridones; imiquimod; immunostimulant peptides; insulin-like growth factor-1 receptor inhibitor; interferon agonists; interferons; interleukins; iobenguane; iododoxorubicin; ipomeanol, 4-; iroplact; irsogladine; isobengazole; isohomohalicondrin B; itasetron; jasplakinolide; kahalalide F; lamellarin-N triacetate; lanreotide; leinamycin; lenograstim; lentinan sulfate; leptolstatin; letrozole; leukemia inhibiting factor; leukocyte alpha interferon; leuprolide+estrogen+progesterone; leuprorelin; levamisole; liarozole; linear polyamine analogue; lipophilic disaccharide peptide; lipophilic platinum compounds; lissoclinamide 7; lobaplatin; lombricine; lometrexol; lonidamine; losoxantrone;
lovastatin; loxoribine; lurtotecan; lutetium texaphyrin; lysofylline; lytic peptides; maitansine; mannostatin A; marimastat; masoprocol; maspin; matrilysin inhibitors; matrix metalloproteinase inhibitors; menogaril; merbarone; meterelin; methioninase; metoclopramide; MIF inhibitor; mifepristone; miltefosine; mirimostim; mismatched double stranded RNA; mitoguazone; mitolactol; mitomycin analogues; mitonafide; mitotoxin fibroblast growth factor-saporin; mitoxantrone; mofarotene; molgramostim; monoclonal antibody, human chorionic gonadotrophin; monophosphoryl lipid A+mycobacterium cell wall sk; mopidamol; multiple drug resistance gene inhibitor; multiple tumor suppressor 1-based therapy; mustard anticancer agent; mycaperoxide B; mycobacterial cell wall extract; myriaporone; N-acetyldinaline; N-substituted benzamides; nafarelin; nagrestip; naloxone+pentazocine; napavin; naphterpin; nartograstim; nedaplatin; nemorubicin; nemoronic acid; neutral endopeptidase; nilutamide; nisamycin; nitric oxide modulators; nitroxide antioxidant; nitrullyn; O6-benzylguanine; octreotide; okicenone; oligonucleotides; onapristone; ondansetron; oracin; oral cytokine inducer; ormaplatin; osaterone; oxaliplatin; oxaunomycin; paclitaxel; paclitaxel analogues; paclitaxel derivatives; palauamine; palmitoylrhizoxin; pamidronic acid; panaxytriol; panomifene; parabactin; pazelliptine; pegaspargase; peldesine; pentosan polysulfate sodium; pentostatin; pentrozole; perflubron; perfosfamide; perillyl alcohol; phenazinomycin; phenylacetate; phosphatase inhibitors; picibanil; pilocarpine hydrochloride; pirarubicin; piritrexim; placetin A; placetin B; plasminogen activator inhibitor;
platinum complex; platinum compounds; platinum-triamine complex; porfimer sodium; porfiromycin; prednisone; propyl bis-acridone; prostaglandin J2; proteasome inhibitors; protein A-based immune modulator; protein kinase C inhibitor; protein kinase C inhibitors, microalgal; protein tyrosine phosphatase inhibitors; purine nucleoside phosphorylase inhibitors; purpurins; pyrazoloacridine; pyridoxylated hemoglobin polyoxyethylene conjugate; raf antagonists; raltitrexed; ramosetron; ras farnesyl protein transferase inhibitors; ras inhibitors; ras-GAP inhibitor; retelliptine demethylated; rhenium Re 186 etidronate; rhizoxin; ribozymes; RII retinamide; rogletimide; rohitukine; romurtide; roquinimex; rubiginone B1; ruboxyl; safingol; saintopin; SarCNU; sarcophytol A; sargramostim; Sdi 1 mimetics; semustine; senescence derived inhibitor 1; sense oligonucleotides; signal transduction inhibitors; signal transduction modulators; single chain antigen binding protein; sizofiran; sobuzoxane; sodium borocaptate; sodium phenylacetate; solverol; somatomedin binding protein; sonermin; sparfosic acid; spicamycin D; spiromustine; splenopentin; spongistatin 1; squalamine; stem cell inhibitor; stem-cell division inhibitors; stipiamide; stromelysin inhibitors; sulfinosine; superactive vasoactive intestinal peptide antagonist; suradista; suramin; swainsonine; synthetic glycosaminoglycans; tallimustine; tamoxifen methiodide; tauromustine; tazarotene; tecogalan sodium; tegafur; tellurapyrylium; telomerase inhibitors; temoporfin; temozolomide; teniposide; tetrachlorodecaoxide; tetrazomine; thaliblastine; thiocoraline; thrombopoietin; thrombopoietin mimetic; thymalfasin; thymopoietin receptor agonist; thymotrinan; thyroid stimulating hormone; tin ethyl etiopurpurin; tirapazamine; titanocene bichloride; topsentin; toremifene; totipotent stem cell factor; translation inhibitors; tretinoin; triacetyluridine; triciribine; trimetrexate; triptorelin; tropisetron; turosteride; tyrosine kinase inhibitors; tyrphostins; UBC inhibitors; ubenimex; urogenital sinus-derived growth inhibitory factor; urokinase receptor antagonists; vapreotide; variolin B; vector system, erythrocyte gene therapy; velaresol; veramine; verdins; verteporfin; vinorelbine; vinxaltine; vitaxin; vorozole; zanoterone; zeniplatin; zilascorb; and zinostatin stimalamer. Preferred additional anti-cancer drugs are 5-fluorouracil and leucovorin.

Additional cancer therapeutics include monoclonal antibodies such as rituximab, trastuzumab and cetuximab.

Cancer Vaccines

In one embodiment, the cancer specific glycans (e.g., oligosaccharide sequences or analogs or derivatives thereof) can be used as cancer or tumor vaccines in man to stimulate immune response to inhibit or eliminate cancer or tumor cells. The treatment may not necessarily cure cancer or tumor but it can reduce tumor burden or stabilize a cancer condition and lower the metastatic potential of cancers. For the use as vaccines the oligosaccharides or analogs or derivatives thereof can be conjugated, for example, to proteins such as bovine serum albumin or keyhole limpet hemocyanin, lipids or lipopeptides, bacterial toxins such as cholera toxin or heat labile toxin, peptidoglycans, immunoreactive polysaccharides, or to other molecules activating immune reactions against a vaccine molecule. A cancer or tumor vaccine may also comprise a pharmaceutically acceptable carrier and optionally an adjuvant. Suitable carriers or adjuvants are, e.g., lipids known to stimulate the immune response. The saccharides or derivatives or analogs thereof, preferably conjugates of the saccharides, can be injected or administered mucosally, such as orally or nasally, to a cancer patient with tolerated adjuvant molecule or adjuvant molecules. The cancer or tumor vaccine can be used as a medicine in a method of treatment against cancer or tumor. Preferably the method is used for the treatment of a human patient. Preferably the method of treatment is used for the treatment of cancer or tumor of a patient, who is under immunosuppressive medication or the patient is suffering from immunodeficiency.

Furthermore, it is possible to produce a pharmaceutical composition comprising the cancer specific oligosaccharide sequences or analogs or derivatives thereof for the treatment of cancer or tumor. Preferably the pharmaceutical composition is used for the treatment of a human patient. Preferably the pharmaceutical composition is used for the treatment of cancer or tumor when the patient is under immunosuppressive medication or is suffering from immunodeficiency. The methods of treatment or the pharmaceutical compositions described above are especially preferred for the treatment of cancer or tumor diagnosed to express the cancer specific oligosaccharide sequences of the invention. The methods of treatment or the pharmaceutical compositions can be used together with other methods of treatment or pharmaceutical compositions for the treatment of cancer or tumor. Preferably the other methods or pharmaceutical compositions comprise cytostatics, anti-angiogenic pharmaceuticals, anti-cancer proteins, such as interferons or interleukins, or use of radioactivity.

EXPERIMENTAL EXAMPLES

The invention is further described in detail by reference to the following experimental examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Thus, the invention should in no way be construed as being limited to the following examples, but rather, should be construed to encompass any and all variations which become evident as a result of the teaching provided herein.

Without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, make and utilize the compounds of the present invention and practice the claimed methods. The following working examples therefore, specifically point out the preferred embodiments of the present invention, and are not to be construed as limiting in any way the remainder of the disclosure.

Example 1: MALDI Imaging Mass Spectrometry Profiling of N-Glycans in Formalin-Fixed Paraffin Embedded Clinical Tissue Blocks and Tissue Microarrays

The results presented herein are based on the application of a Matrix Assisted Laser Desorption Ionization Imaging Mass Spectrometry (MALDI-IMS) method to spatially profile the location and distribution of multiple N-linked glycan species in clinically derived formalin-fixed paraffin-embedded (FFPE) tissues.

Formalin-fixed tissues from normal mouse kidney, human pancreatic and prostate cancers and a human hepatocellular carcinoma tissue microarray were processed by antigen retrieval followed by on-tissue digestion with peptide N-glycosidase F (PNGaseF). The released N-glycans were detected by MALDI-IMS analysis, and the structural composition of a subset of glycans was verified directly by on-tissue collision-induced fragmentation. Other structural assignments were confirmed by off-tissue permethylation analysis combined with multiple database comparisons.

Imaging of mouse kidney tissue sections demonstrated specific tissue distributions of major cellular N-linked glycoforms in the cortex and medulla. Differential tissue distribution of N-linked glycoforms was also observed in the other tissue types. The efficacy of using MALDI-IMS glycan profiling to distinguish tumor from normal tissues in a tumor microarray format is also demonstrated. This MALDI-IMS workflow can be applied to any FFPE tissue block or tissue microarray to enable higher throughput analysis of the global changes in N-glycosylation associated with cancers.

The materials and methods employed in these experiments are now described.

Materials

The glycan standard NA2 was obtained from ProZyme (Hayward, CA). Trifluoroacetic acid, sodium hydroxide, dimethyl sulfoxide, iodomethane and α-cyano-4-hydroxycinnamic acid (CHCA) were obtained from Sigma-Aldrich (St. Louis, MO). HPLC grade methanol, ethanol, acetonitrile, xylene and water were obtained from Fisher Scientific (Pittsburgh, PA). ITO slides were purchased from Bruker Daltonics (Billerica, MA). Citraconic anhydride for antigen retrieval was from Thermo Scientific (Bellefonte, PA). Recombinant Peptide N-Glycosidase F from Flavobacterium meningosepticum was expressed and purified as previously described (Powers et al, Anal Chem, 2013 85: 9799-9806).

FFPE Tissues and TMA

Mouse kidneys were excised from euthanized C57BL/6 mice and immediately placed in 10% formalin prior to processing for routine histology and paraffin embedding. Mice were housed in an Institutional Animal Care and Use Committee-approved small animal facility at MUSC, and kidneys were harvested as part of approved projects. A liver TMA was purchased from BioChain consisting of 16 cases of liver cancer in duplicate, and one adjacent normal tissue for each case. Tissues were from 14 male and two female patients with an average age of 47.5 years and a range of 33 to 68 years, with additional information provided in Table 2. A de-identified prostate tumor FFPE block, stored for 10 years and representing a Gleason grade 6 (3+3)/stage T2c adenocarcinoma from a 62 year old Caucasian male, was obtained from the Hollings Cancer Center Biorepository at the Medical University of South Carolina. A pathologist confirmed the presence of approximately 10% prostate cancer gland content in the sample. A de-identified large-cell undifferentiated pancreatic carcinoma FFPE tissue section with low CA19-9 staining was obtained from the Van Andel Institute Biospecimen Repository. For each section analyzed, histological analysis and staining with hematoxylin and eosin (H&E) were performed.

Washes for Deparaffinization and Rehydration

Tissue and TMA blocks were sectioned at 5 μm and mounted on MALDI-IMS ITO slides. The slides were heated at 60°C for 1 hr. After cooling, tissue sections were deparaffinized by washing twice in xylene (3 minutes each).

Tissue sections were then rehydrated by submerging the slide twice in 100% ethanol (1 minute each), once in 95% ethanol (1 minute), once in 70% ethanol (1 minute), and twice in water (3 minutes each). Following the wash, the slide was transferred to a coplin jar containing the citraconic anhydride buffer for antigen retrieval and the jar was placed in a vegetable steamer for 25 minutes. Citraconic anhydride (Thermo) buffer was prepared by adding 25 μL citraconic anhydride to 50 mL water and adjusting to pH 3 with HCl. After allowing the buffer to cool, the buffer was exchanged with water five times by pouring out 1/2 of the buffer and replacing it with water, prior to replacing completely with water on the last time. The slide was then desiccated prior to enzymatic digestion. Tris buffer pH 9-10 was also effective, but the citraconic anhydride buffer was used for all experiments in this study.
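As a rough illustration of the buffer exchange described above, the following sketch estimates how much of the original antigen retrieval buffer remains after repeated half-volume exchanges. It assumes ideal mixing and that exactly half of the volume is replaced each time; the function name is illustrative and not part of the described method.

```python
def residual_fraction(n_exchanges, fraction_removed=0.5):
    """Fraction of the original buffer left after repeated partial exchanges.

    Assumes ideal mixing between exchanges (an idealization).
    """
    if not 0.0 <= fraction_removed <= 1.0:
        raise ValueError("fraction_removed must be between 0 and 1")
    return (1.0 - fraction_removed) ** n_exchanges


# Starting concentration: 25 uL citraconic anhydride in 50 mL water (v/v).
start_percent = 25e-3 / 50.0 * 100.0  # 25 uL expressed in mL

for n in range(1, 6):
    left = residual_fraction(n)
    print(f"after {n} half-exchange(s): {left:.4%} of buffer remains, "
          f"about {start_percent * left:.5f}% v/v citraconic anhydride")
```

Under these assumptions, five half-volume exchanges leave roughly 3% of the original buffer before the final complete replacement with water.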

N-glycan MALDI-IMS

An ImagePrep spray station (Bruker Daltonics) was used to coat the slide with 0.2 mL of a 0.1 mg/mL solution of PNGaseF in water as previously described (Powers et al, Anal Chem, 2013 85: 9799-9806). Control tissue slices were blocked with a glass slide during the spraying process. Following application of PNGaseF, slides were incubated at 37°C for 2 hrs in a humidified chamber, then dried in a desiccator prior to matrix application. α-Cyano-4-hydroxycinnamic acid matrix (0.021 g CHCA in 3 mL 50% acetonitrile/50% water and 12 μL 25% TFA) was applied using the ImagePrep sprayer. Released glycan ions were detected using a Solarix dual source 7T FTICR mass spectrometer (Bruker Daltonics) (m/z = 690-5000) with a SmartBeam II laser operating at 1000 Hz and a laser spot size of 25 μm. Images of differentially expressed glycans were generated to view the expression pattern of each analyte of interest using FlexImaging 4.0 software (Bruker Daltonics). Following MS analysis, data were loaded into FlexImaging software focusing on the range m/z = 1000-4000 and reduced to 0.95 ICR Reduction Noise Threshold. Observed glycans were searched against the glycan database provided by the Consortium for Functional Glycomics (www.functionalglycomics.org). Glycan structures were generated by GlycoWorkbench (Ceroni et al, J Proteome Res, 2013, 7: 1650-1659) and represent putative structures determined by combinations of accurate m/z, collision induced dissociation (CID) fragmentation patterns and glycan database structures.

Permethylation of Tissue Extracted N-glycans

PNGaseF sprayed mouse kidney tissue slides were incubated for 2 hr at 37°C; 50 μL water was applied on top of the tissue and incubated for 20 minutes to extract the released native N-glycans. The water was removed from the tissue, then concentrated under vacuum by centrifugation. Permethylation was performed as described (Powers et al., Anal Chem, 2013 85: 9799-9806), and glycans were analyzed by MALDI. Masses detected in the permethylation experiments were searched against the permethylated glycan database provided by the Consortium for Functional Glycomics (www.functionalglycomics.org).

Collision-Induced Dissociation of N-linked Glycans

Glycan standards were spotted on a stainless steel MALDI plate using CHCA matrix and desiccated to yield a homogeneous layer. Tissues were prepared as previously described for MALDI imaging of FFPE tissues. Ten spectra of 1000 laser shots each, acquired at a laser frequency of 1000 Hz, were averaged for each spectrum provided. The collision energy varied between 60 and 70 V.

TMA Statistics

Mass spectra from TMA tissue Regions of Interest (ROIs) representing each tissue core were exported directly from FlexImaging and analyzed using an in-house workflow. The peak lists were first deconvoluted, and the mean peak intensity of the points in each ROI was then calculated, resulting in a monoisotopic peak list corresponding to signal intensity in each region. Comparison of tumor versus non-tumor was accomplished with a Wilcoxon rank sum test followed by Benjamini-Hochberg correction.
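The statistical comparison named above can be sketched as follows. This is not the in-house workflow itself, which is not disclosed; it is a minimal standard-library illustration of a two-sided Wilcoxon rank sum test (normal approximation, midranks for ties, no tie or continuity correction) followed by Benjamini-Hochberg adjustment. The ion names and intensities are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test using the normal approximation.

    Ties receive midranks; no tie or continuity correction is applied.
    Returns (z, p_value).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0  # average of 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    return z, math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical mean ROI intensities per core (illustrative values only).
tumor = {"ion_1": [9.1, 8.7, 9.8, 10.2, 9.5], "ion_2": [5.0, 5.4, 4.9, 5.2, 5.1]}
non_tumor = {"ion_1": [7.2, 6.9, 7.5, 7.0, 7.3], "ion_2": [5.1, 5.0, 5.3, 4.8, 5.2]}

ions = sorted(tumor)
raw_p = [rank_sum_test(tumor[ion], non_tumor[ion])[1] for ion in ions]
for ion, p, q in zip(ions, raw_p, benjamini_hochberg(raw_p)):
    print(f"{ion}: p = {p:.4f}, BH-adjusted p = {q:.4f}")
```

For the small group sizes typical of a single TMA, an exact test would usually be preferable to the normal approximation used here.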

The results of the experiments are now described.

Analysis of Formalin-Fixed Mouse Kidneys

Mouse kidney tissues were fixed in formalin and used as an initial model system to develop MALDI-IMS glycan imaging workflows for FFPE tissues. These tissues were chosen due to the availability of reference glycan structures and spectra (Consortium for Functional Glycomics; www.functionalglycomics.org), and previous MALDI-IMS glycan imaging data for fresh frozen mouse kidney tissue analysis (Powers et al, Anal Chem, 2013 85: 9799-9806). A summary workflow schematic is provided (Figure 1). Tissues were cut at 5 microns, deparaffinized and rehydrated in sequential xylene/ethanol/water rinses, followed by antigen retrieval in citraconic anhydride pH 3. The rehydrated tissues were sprayed with PNGaseF, incubated for glycan release, and then analyzed by MALDI-IMS. As shown in Figure 2, there were multiple ions detectable only in the tissue incubated with PNGaseF that were not present in the control tissue with no PNGaseF application. Different glycans were distributed across the cortex or medulla regions. For example, a Hex4dHex2HexNAc5 ion (m/z = 1996.74) is present in the cortex and medulla (Figure 2C), while a Hex5dHex2HexNAc5 glycan (m/z = 2158.76) is more specific to the cortex (Figure 2D). An overlay of the MALDI-IMS images for these two ions from the PNGaseF treated sections (Figure 2E) and the control tissue (Figure 2F) demonstrated that these two ions were released by PNGaseF. A summary glycan image panel of 28 glycan ions detected in these kidneys, with sodium adducts and observed/expected m/z values, is provided in Figure 7. Additionally, N-glycans were extracted from the tissue following on-tissue PNGaseF digestion, permethylated and analyzed by MALDI. A representative spectrum from this analysis is provided in Figure 8. These permethylated values were also compared with MALDI reference spectra for mouse kidney glycans from the Consortium for Functional Glycomics. The imaged glycan ions were correlated to the reference spectra glycans, illustrated in Figure 9, and could be matched to all 28 glycan species highlighted in the reference spectra.
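The association of an observed peak with a glycan composition by mass accuracy can be illustrated with a short calculation. The sketch below computes the theoretical monoisotopic m/z of the sodium adduct of a native (underivatized) N-glycan from standard residue masses and reports the error against the observed values quoted above. The function names are illustrative; the mass constants are standard monoisotopic values, not values taken from this disclosure.

```python
# Monoisotopic residue masses (Da) of the glycosidically linked monosaccharides.
RESIDUE_MASS = {
    "Hex": 162.05282,     # hexose, C6H10O5
    "HexNAc": 203.07937,  # N-acetylhexosamine, C8H13NO5
    "dHex": 146.05791,    # deoxyhexose (e.g. fucose), C6H10O4
}
WATER = 18.01056       # added once for the free reducing-end glycan
SODIUM_ION = 22.98922  # Na+ (sodium atom minus one electron)


def sodium_adduct_mz(composition):
    """Theoretical m/z of the singly charged [M+Na]+ ion of a native glycan."""
    mass = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return mass + WATER + SODIUM_ION


def ppm_error(observed, theoretical):
    """Mass error of an observed peak in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Observed values are those quoted in the text for the mouse kidney sections.
examples = [
    ("Hex4dHex2HexNAc5", {"Hex": 4, "dHex": 2, "HexNAc": 5}, 1996.74),
    ("Hex5dHex2HexNAc5", {"Hex": 5, "dHex": 2, "HexNAc": 5}, 2158.76),
]
for name, composition, observed in examples:
    theoretical = sodium_adduct_mz(composition)
    print(f"{name}: theoretical {theoretical:.4f}, observed {observed:.2f}, "
          f"error {ppm_error(observed, theoretical):+.1f} ppm")
```

A composition match of this kind does not by itself establish linkage or branching, which is why the assignments are additionally supported by permethylation, database comparison and CID.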

Experiments were designed to assess whether the method was compatible with two representative archived pathology FFPE tissue blocks, one for pancreatic cancer and one for prostate cancer. A section of human pancreatic cancer tissue of complex histology was processed and incubated with PNGaseF, and glycans were detected by MALDI-IMS (Figure 3). Different N-glycans were detected that could distinguish between non-tumor, tumor, tumor necrotic and fibroconnective tissue regions. A representative glycan image overlay of four m/z values that correspond to the sodium adducts of potential N-glycan species is shown in Figure 3A, each representing a specific region of the tissue (Figure 3B). A glycan of m/z = 1891.8 (red)/Hex3dHex1HexNAc6 was detected primarily in the non-tumor region of the pancreas, while a glycan of m/z = 1743.64 (blue)/Hex8HexNAc2 was predominant in the tumor region of the tissue. A region of desmoplasia surrounding the tumor region, an area of increased extracellular matrix proteins and myofibroblast-like cells resulting in a dense fibrous connective tissue (Shi et al, Lab Invest. 2014, in press), is represented by a glycan of m/z = 1809.69 (green)/Hex5dHex1HexNAc4. A region of tumor necrosis is represented by a different glycan of m/z = 1663.64 (orange)/Hex5HexNAc4. Additional examples of tissue distributions of other individual glycan species are shown in Figure 3C.

A human prostate tissue block containing both tumor and non-tumor gland regions was also analyzed by MALDI-IMS. A heterogeneous N-glycan distribution reflective of the tissue histology was observed, and as an example of stroma and gland distributions, two glycan ions and two sub-regions within the tissue are highlighted in Figure 4. Distributions of glycans of m/z = 1663.56 (Hex5HexNAc4) and m/z = 1850.65 (Hex4dHex1HexNAc5) are shown in Figure 4B and 4C. A higher resolution tissue imaging analysis was done for selected regions as marked in the panel, with the H&E images (Figure 4D-4F) highlighting stroma and gland substructures. In both instances, m/z = 1850.65 is present in both the stroma and glands, while m/z = 1663.56 is predominantly located in the stroma. An overlay of these two ions depicts the stroma as an orange color, demonstrating the presence of both red and green, while the glands are predominantly green. The distribution of other representative individual glycan ions is provided in Figure 10.

On-tissue Glycan Fragmentation and Structural Composition

The glycan structures identified by imaging of the FFPE tissue blocks were assigned based on the comparison to permethylated species, glycan reference databases and previous studies (Powers et al, Anal Chem, 2013 85: 9799-9806). An on-tissue approach to further verify N-glycan structures was done using collision-induced dissociation (CID) directly on the human pancreatic tissue. Released native glycans from pancreatic cancer FFPE tissues were used as a source for on-tissue CID analysis, and a representative MALDI spectrum of these glycans is shown in Figure 5A. For comparison, a Hex5HexNAc4 (m/z = 1663.6) purified standard (also termed NA2) was spotted on a stainless steel MALDI target plate and fragmented by CID, generating a robust fragmentation pattern of glycans for this ion as previously reported by Harvey et al (Harvey et al, Proteomics, 2005, 5: 1774-1786). The same glycan ion was abundant in pancreatic tissue after PNGaseF release of N-glycans (Figure 3) and was selected for CID. As shown in Figure 5B, the CID fragmentation pattern of m/z = 1663.6 in pancreatic tissue was the same as that of the N-glycan standard, confirming detection of NA2 directly in pancreatic tissue (Figure 3). Mass shifts due to loss of individual sugar ions were detected, such as Hex (resulting in m/z = 1502.5), HexNAc (resulting in m/z = 1460.5), and Hex + HexNAc (resulting in m/z = 1298.5) (Figure 5B). An ion at m/z = 712.2, which has been previously characterized (Harvey et al, Proteomics, 2005, 5: 1774-1786) as the sodium adduct of Hex3HexNAc1, was also detected. The structures of 13 other glycan ions were confirmed using this CID approach, and additional fragmentation data and spectra are provided in Figure 12.

Glycan MALDI-IMS of a hepatocellular carcinoma tissue microarray

The ability to perform N-glycan analysis on FFPE tissues enables the analysis of multiple FFPE tissue cores in a TMA format. Initial experiments were performed using a commercially available hepatocellular carcinoma (HCC) TMA (BioChain) consisting of samples from 16 individual patients, with two tumor tissue cores and one non-tumor tissue core per patient (Figure 7). Additional patient data are provided in Table 2. Glycan MALDI-IMS was done as described for the other FFPE tissues, and imaging data for two representative glycan ions at m/z = 2393.95 (Hex7HexNAc6) and m/z = 1743.64 (Hex8HexNAc2) are shown in Figure 7.

The cumulative MALDI spectra and detected ions for each tissue core were processed and compared using an in-house bioinformatic workflow followed by statistical analysis. Of the 176 identified ions from the HCC TMA, 132 were increased in tumor cores, and 83 ions had a p-value < 0.05. Interestingly, 78 (94%) of the significantly different ions were elevated in tumor cores. After cross-referencing this list of 176 ions with glycans presented herein and reported previously (Powers et al, Anal Chem, 2013 85: 9799-9806), 26 N-glycans of high-confidence structure determinations were selected, listed in Table 1. Of these 26 glycans, ion intensities of 13 species were significantly different in tumor and normal tissue (p < 0.05), and 21 were increased in tumor relative to normal. FlexImaging was then used to demonstrate the distribution and relative ion intensities of each glycan across the TMA (images provided in Figure 13). Additionally, ROC curves were used to evaluate how well each of the glycan ion intensities discriminates tumor versus non-tumor. Of the 176 identified ions, 61 had an area under the ROC curve (AuROC) > 0.80, indicating they are strong classifiers. The two glycans at m/z = 2393.95 (Hex7HexNAc6) and m/z = 1743.64 (Hex8HexNAc2) both had an AuROC > 0.80 and a p-value < 0.05, with m/z = 2393.95 being elevated in tumor tissue and m/z = 1743.64 being elevated in non-tumor tissue, as demonstrated by the log2-fold change value (tumor/non-tumor) (Figure 7). In the overlay (Figure 7B) tumor tissue is predominantly green and non-tumor tissue is predominantly red, confirming results from the statistical analysis. These data demonstrate the ability of a panel of glycans to be used to accurately discriminate cell types or outcomes on a TMA by MALDI-IMS.
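The two per-ion summary measures used above, AuROC and log2-fold change (tumor/non-tumor), can be computed from core-level intensities as in the following sketch. The intensities shown are hypothetical, and the pairwise-comparison form of the AuROC is one standard way to compute it, not necessarily the one used in the in-house workflow.

```python
import math


def auroc(tumor, non_tumor):
    """Area under the ROC curve for 'higher intensity means tumor'.

    Computed as the fraction of (tumor, non-tumor) pairs in which the tumor
    value is larger, counting ties as one half.
    """
    wins = 0.0
    for t in tumor:
        for n in non_tumor:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(non_tumor))


def log2_fold_change(tumor, non_tumor):
    """log2 of the ratio of mean intensities (tumor / non-tumor)."""
    mean_tumor = sum(tumor) / len(tumor)
    mean_non_tumor = sum(non_tumor) / len(non_tumor)
    return math.log2(mean_tumor / mean_non_tumor)


# Hypothetical core intensities for a single ion (illustrative values only).
tumor_cores = [5.0, 6.0, 7.0]
non_tumor_cores = [1.0, 2.0, 6.0]

auc = auroc(tumor_cores, non_tumor_cores)
print(f"AuROC = {auc:.3f}")
print(f"direction-agnostic AuROC = {max(auc, 1.0 - auc):.3f}")
print(f"log2 fold change = {log2_fold_change(tumor_cores, non_tumor_cores):.2f}")
```

An ion elevated in non-tumor tissue gives a value below 0.5 with this orientation, so reporting max(AuROC, 1 - AuROC) alongside the sign of the log2-fold change is one way to describe both strength and direction; whether the reported values were oriented this way is an assumption.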

Table 1 : Cross-Reference of Identified Ions From Liver TMA With Known Glycans

A list of monoisotopic ions that were identified in the Liver TMA was cross-referenced against a library of known glycan m/z values. Presented is a list of 26 ions that were detected and present in the glycan library. This table describes how important the ions are in distinguishing tumor tissue from non-tumor tissue using AuROC, log2-fold changes (tumor/non-tumor), and p-values when comparing the difference between tumor and non-tumor tissues for each selected ion. Localization of m/z = 2393.95 and m/z = 1743.62 is depicted in Figure 7.

Table 2: Patient Characteristics Table of HCC TMA

N-glycans from FFPE tissues

Multiple N-linked glycans can be directly profiled from FFPE tissue blocks and TMAs while maintaining intact architecture. The basic methodology, which mirrors that of MALDI-IMS analysis of peptides in FFPE tissues and TMAs (Groseclose et al, Proteomics, 2008, 8: 3715-3724; Quaas et al, Histopathology, 2013, 63: 455-462; Casadonte et al., Nat Protoc, 2011, 6: 1695-1709), requires deparaffinization and antigen retrieval prior to PNGaseF application. The ability to adapt the N-glycan imaging method originally designed for fresh/frozen tissues (Powers et al, Anal Chem, 2013 85: 9799-9806) to encompass FFPE tissue and TMA blocks increases the scope and speed of glycan-based studies that can be performed in tissues. In initial studies of formalin-fixed mouse kidney slices, the MALDI-IMS workflow successfully identified all 28 of the glycans in the mouse kidney database provided by the Consortium for Functional Glycomics. Many of the structures of these glycans were verified by permethylation (Figure 8) and CID experiments (Figures 10-11). As observed with the mouse brain (Chaurand et al, Mol Cell Proteomics, 2011, 10: O110.004259), these glycans were not homogeneously present across the entire mouse kidney slice, and were either predominantly located in the cortex, or distributed across the cortex and medulla (Figure 7). This unique distribution of N-glycans associated with tissue sub-structure or disease status was also observed in human pancreas and prostate tissue slices. In the pancreas, an overlay of four different glycans was able to map the normal pancreas tissue, tumor pancreas tissue, a region of desmoplasia, and a necrotic region (Figure 3B and 3C). Similarly, an overlay of two glycans could distinguish between prostate stroma and glands (Figure 4).

In general, the peak intensities of PNGaseF-released glycans in the FFPE tissues seem to be higher than those obtained with fresh/frozen tissue sections. This may be a result of the more extensive heating and washing required in the deparaffinization and rehydration steps. It is this increased detection sensitivity that facilitated CID fragmentation of N-glycans directly from the tissue (Figure 5). Under the conditions used, CID generated mainly fragments across the glycosidic bonds, which were useful in establishing that each lost residue was an intact hexose or HexNAc. This did not provide any information regarding anomeric linkages between sugar residues. The amount of fragmentation observed was directly related to the relative intensity of each parent N-glycan ion, and inversely related to the mass of the parent ion observed. This is typified by the extensive fragmentation of two glycans of m/z = 1663.50 and m/z = 1809.64 (Figure 5B, Figures 11 and 12).

One drawback to using FFPE tissues is residual polymer from the paraffin block adjacent to the tissue. Detection of this polymer is more predominant in the lower mass range of the imaging runs, and can overlap with potential glycan masses, complicating detection and further statistical analysis. This polymer can be observed in the average spectra of the mouse kidney tissue after PNGaseF application (Figure 2A) from m/z = 1250-1300, 1450-1500, and 1650-1700.
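The overlap problem described above can be made concrete with a simple check of which glycan ions fall inside the reported polymer windows. This is only an illustrative sketch: the windows are those reported for the mouse kidney average spectrum, the glycan m/z values are those quoted for the pancreatic tissue, and applying one tissue's windows to another tissue's ions is an assumption made purely for demonstration.

```python
# Polymer-affected m/z windows reported for the mouse kidney average spectrum.
POLYMER_WINDOWS = [(1250.0, 1300.0), (1450.0, 1500.0), (1650.0, 1700.0)]


def in_polymer_window(mz, windows=POLYMER_WINDOWS):
    """Return True if the m/z value lies inside any polymer-affected window."""
    return any(low <= mz <= high for low, high in windows)


# Glycan ions and m/z values quoted in the text for the pancreatic section.
glycan_ions = {
    "Hex5HexNAc4": 1663.64,
    "Hex8HexNAc2": 1743.64,
    "Hex5dHex1HexNAc4": 1809.69,
    "Hex3dHex1HexNAc6": 1891.8,
}

for name, mz in glycan_ions.items():
    flag = "check against no-PNGaseF control" if in_polymer_window(mz) else "clear"
    print(f"{name} (m/z {mz}): {flag}")
```

An ion flagged this way is not necessarily a contaminant; it indicates where comparison with the non-PNGaseF control spectrum is most needed.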

An additional key to distinguishing polymer peaks is the analysis of spectra from the non-PNGaseF treated control tissues. Particularly for the TMA format, the ion selection program can detect and account for polymer peaks relative to glycan ions. These polymer peaks seem to vary in intensity compared to N-glycan ions depending on what tissue is being used. It is possible that this variation is a function of different formalin formulations, variations in tissue processing (i.e., amount of time in formalin), storage time or variations in the tissue itself (Thompson et al, Proteomics Clin Appl, 2013, 7(3-4):241-51; Craven et al, Proteomics Clin Appl, 2013, 7(3-4):273-82). These considerations can be further monitored and evaluated as more glycans from FFPE tissues are analyzed.

In relation to potential cancer diagnostic applications, the most significant aspect of developing a method to image N-glycans on FFPE tissue blocks could be the ability to use TMAs for high-throughput glycan-based experiments. Not only does the method increase the number of tumor samples that can be analyzed in one experiment, but it could also be used to compare the glycans detected in a TMA core versus the larger source FFPE tissue. N-glycan MALDI-IMS of the HCC TMA (Figure 7) is provided as an example, but initial glycan profiling data have already been obtained from TMAs representing prostate, lung, breast, colon, and pancreas cancers. In the HCC TMA, a statistically significant increase in tetra-antennary N-glycan (m/z = 2393.95) and decrease in Man-8 glycan (m/z = 1743.64) was detected in HCC cores compared to adjacent non-tumor tissue (Figure 7 and Table 1). The tetra-antennary N-glycan has been previously demonstrated to be elevated in HCC compared to matched adjacent non-tumor tissue by Mehta et al. (Mehta et al, Cancer Epidemiol Biomarkers Prev, 2012 21: 925-933). Continued investigations will be performed on whether these two ions can distinguish between matched HCC and non-tumor tissues in other HCC TMAs. The data analysis identified a total of 176 ions in the tissue, with the majority of significantly different ions being increased in HCC relative to non-tumor tissue, including 21 known or previously identified glycans. It is unclear how this trend of increased glycan levels relates specifically to tumor related biochemical changes, though the role of glycosylation in tumor development is well documented. Future work will also focus on determining the identity of the remaining ions to distinguish other glycan species from the aforementioned polymer peak contaminants. Evaluation of methods to stabilize larger branched chain sialic acid containing glycans is also ongoing.
Overall, the ability to effectively profile N-glycans on FFPE tissue blocks and TMAs provides new opportunities to evaluate glycan profiles associated with disease status.

Example 2: A Novel Glycomic Approach to Identify Pancreatic Cancer Disease Markers in FFPE Tissue Blocks and Microarrays

Pancreatic cancer represents one of the most deadly cancers, both in terms of cancer-related deaths per year and 5-year survival rates. The dismal 5-year survival rate following detection, which is around 6%, can be attributed to poor understanding of disease etiology, rapid disease progression, late diagnosis, limited effective treatment options, and resistance to therapeutic intervention (Ma J et al, Future Oncol Lond Engl, 2013 Jul; 9(7):917-9). Patient outcome is significantly improved when the cancer is detected at an early stage and before it has spread to regional lymph nodes or metastasized to distant sites. This is largely because these patients are eligible for surgical resection, which represents the only curative treatment for pancreatic cancer (Wolfgang CL et al, AJR Am J Roentgenol, 2011 Dec; 197(6):1343-50; SEER Stat Fact Sheets: Pancreatic Cancer [Internet]. National Cancer Institute: Surveillance, Epidemiology, and End Results Program, [cited 2015 Apr 14]. Available from: http://seer.cancer.gov/statfacts/html/pancreas.html; Al Haddad AHI et al, Expert Opin Investig Drugs, 2014 Nov; 23(11):1499-515). Unfortunately, it is estimated that only 9% of patients are diagnosed with localized pancreatic cancer (SEER Stat Fact Sheets: Pancreatic Cancer [Internet]. National Cancer Institute: Surveillance, Epidemiology, and End Results Program, [cited 2015 Apr 14]. Available from: http://seer.cancer.gov/statfacts/html/pancreas.html).

Therefore, efforts directed at the early diagnosis of pancreatic cancer and further understanding of biological pathways implicated in disease progression are extremely important.

An in-depth analysis of aberrant glycosylation, a hallmark of all cancers, offers the potential to advance the understanding of pancreatic cancer and identify new disease markers. Furthermore, glycoproteins and glycan antigens are ideal targets for initial discovery phase disease marker studies, as they [1] are prevalent in biological fluids, [2] play a direct role in cancer progression and metastasis, and [3] represent a majority of the protein-based FDA approved biomarkers. Both individual glycan antigens and alterations in the glycosylation biosynthetic machinery have been implicated in pancreatic cancer. The carbohydrate antigen 19-9 (SLeA, CA19-9) is the most well defined disease marker for pancreatic cancer and the target of the only FDA approved blood test for the management of pancreatic cancer (Koprowski H et al, Somatic Cell Genet, 1979 Nov; 5(6):957-71; Fong ZV et al., Cancer J Sudbury Mass, 2012 Dec; 18(6):530-8). Functionally, CA19-9 has been implicated in disease metastasis through the interaction of the antigen with endothelium E-selectin (Ugorski M et al, Acta Biochim Pol, 2002; 49(2):303-11). However, CA19-9 is not used as a clinical disease marker for pancreatic cancer, as elevated levels are detected in benign diseases and other sites of cancer (Marrelli D et al, Am J Surg, 2009 Sep; 198(3):333-9; Parra JL et al, Dig Dis Sci, 2005 Apr; 50(4):694-5; Nie S et al, J Proteome Res, 2014 Apr 4; 13(4):1873-84). Additionally, between 5 and 15% of the population do not express fucosyltransferase 3 and are therefore incapable of expressing CA19-9, even with advanced pancreatic cancer (Fong ZV et al, Cancer J Sudbury Mass, 2012 Dec; 18(6):530-8; Nie S et al, J Proteome Res, 2014 Apr 4; 13(4):1873-84). While elevated CA19-9 is the most pronounced glycosylation-related change in pancreatic cancer, global alterations in glycosylation have been detected.

At a fundamental level, the malignant processes resulting in the formation of pancreatic cancer involve several common mutations in oncogenes and tumor suppressors. These mutations ultimately affect the glycan biosynthesis pathway in pancreatic cancer. Constitutive KRAS activation increases the expression of GLUT1, HK1, HK2, PFK1 and LDHA, all of which impact glycosylation by directing glycolytic intermediates to the hexosamine biosynthetic pathway (HBP) (Bryant KL et al, Trends Biochem Sci, 2014 Feb; 39(2):91-100; Ying H et al, Cell, 2012 Apr; 149(3):656-70). Similarly, the dense stroma and ECM accumulation surrounding pancreatic cancer results in a hypoxic microenvironment and initiates anaerobic glucose metabolism through glycolysis (Upadhyay M et al, Pharmacol Ther, 2013 Mar; 137(3):318-30; Guillaumond F et al, Arch Biochem Biophys, 2014 Mar; 545:69-73). The shift to glycolysis is accompanied by an upregulation of GLUT1, HK2, PFK1, PFK2, PKM2, GFPT1 and GFPT2 (Upadhyay M et al, Pharmacol Ther, 2013 Mar; 137(3):318-30; de Matos LL et al, Biomark Insights, 2010; 5:9-20). Importantly, in vitro knockdown of GFPT1, a key modulator of entry into the HBP, exhibited the same reduction in tumor growth as the extinction of mutated KRAS, directly implicating glycosylation in tumor growth (Bryant KL et al, Trends Biochem Sci, 2014 Feb; 39(2):91-100). In addition to enzymatic regulation of the HBP, glycosyltransferases have been reported to be alternatively expressed in pancreatic cancer (Perez-Garay M et al., Int J Biochem Cell Biol, 2013 Aug; 45(8):1748-57; Taniuchi K et al., Oncogene, 2011 Dec 8; 30(49):4843-54; Mas E et al, Glycobiology, 1998 Jun; 8(6):605-13; Bassaganas S et al, Cytokine, 2015 Apr 28; 75(1):197-206). Alterations in the HBP and glycosyltransferase expression are apparent in global glycoform analysis, where increases in fucosylation, sialylation and glycan branching have been observed (Nie S et al, J Proteome Res, 2014 Apr 4; 13(4):1873-84; Stumpo KA et al., J Proteome Res, 2010 Sep 3; 9(9):4823-30). However, the pancreatic cancer glycome has been analyzed on a relatively small subset of patients, often ignoring individual glycoforms in favor of structural motifs.

Recent analytical advances have improved the ability to monitor changes in the glycome in relation to cancer status. Traditional techniques characterized fluorescently labeled glycans based on their retention time in chromatographic separations or by sequential enzymatic digestion (Campbell MP et al, Bioinformatics, 2008 May 1; 24(9):1214-6). As chromatographic separation is required for each individual sample, this method is not ideal for high-throughput disease marker experiments. Other studies have utilized mass spectrometers to characterize disease-related changes in glycan expression levels, although lengthy desalting and derivatization steps are frequently employed (Biskup K et al, J Proteome Res, 2013 Sep 6; 12(9):4056-63). When applied to the analysis of biological tissues, these methods result in a complete loss of glycan localization. As up to 90% of the tumor volume is occupied by surrounding stromal cells, this analysis may differentiate the non-diseased tissue from the stroma, as opposed to the tumor (Michaud DS et al, JAMA J Am Med Assoc, 2001 Aug 22; 286(8):921-9).

Conversely, immunostaining with lectins or monoclonal antibodies directed at specific glycosylation structures or antigens can probe glycan localization within tissue sections, but does not target or identify individual glycoforms. To this end, a novel approach is introduced to study specific N-linked glycan localization using MALDI-IMS on fresh frozen and FFPE tissue sections (Powers TW et al, Anal Chem, 2013 Oct 15; 85(20):9799-806; Powers TW et al, PloS One, 2014; 9(9):e106255). This technique has been further explored and validated by other laboratories (Gustafsson OJR et al, Anal Bioanal Chem, 2014 Dec 2; 407:2127-2139; Toghi Eshghi S et al, ACS Chem Biol, 2014 Sep 19; 9(9):2149-56). In addition to simply localizing and identifying N-glycans in tissue sections, the study demonstrated the potential of the method for the identification of individual glycans that discriminate tumor from non-tumor tissue specimens in a high-throughput TMA-based approach.

The results presented herein demonstrate how MALDI-IMS of N-linked glycans can be adapted to the identification of disease markers for pancreatic cancer. The study reports MALDI-IMS of N-glycans in TMAs and whole tissue blocks, followed by subsequent glycan characterization by mass accuracy and derivatization. As the spatial localization of N-glycans is retained during MALDI-IMS, analysis of tissue blocks can be further interrogated to look at glycosylation patterns in histological regions other than the tumor and non-tumor regions. Finally, N-glycan distribution was correlated with CA19-9 staining on serial slides, to determine any correlations between CA19-9 and released N-glycans.

The materials and methods employed in these experiments are now described.

Materials

Trifluoroacetic acid, α-cyano-4-hydroxycinnamic acid (CHCA), sodium hydroxide (NaOH), and 1-hydroxybenzotriazole hydrate (HOBt) were obtained from Sigma-Aldrich (St. Louis, MO). HPLC grade methanol, ethanol, acetonitrile, xylene and water were obtained from Fisher Scientific (Pittsburgh, PA). 1-(3-Dimethylaminopropyl)-3-ethylcarbodiimide hydrochloride (EDC) was obtained from Oakwood Chemical (West Columbia, SC). Tissue Tack microscope slides were purchased from Polysciences, Inc (Warrington, PA). Citraconic anhydride for antigen retrieval was from Thermo Scientific (Bellefonte, PA). Recombinant Peptide N-Glycosidase F (PNGaseF) from Flavobacterium meningosepticum was expressed and purified as previously described, and is available commercially as PNGase F Prime™ from Bulldog Bio (Portsmouth, NH) (Powers TW et al, Anal Chem, 2013 Oct 15; 85(20):9799-806). Cotton tips for HILIC enrichment of N-glycans were produced using 100% cotton swabs from Assured (Rio Ranch, NM).

FFPE Tissues, TMAs, and Plasma

All human tissues and TMAs used were de-identified and determined to be not human research classifications by the respective Institutional Review Boards at MUSC. For each section analyzed, histological analysis and staining with hematoxylin and eosin (H&E) was performed and regions of diverse pathology were annotated by a pathologist. Plasma from patients with pancreatitis and pancreatic ductal adenocarcinoma (PDAC) were de-identified and determined to be not human research classifications by the respective Institutional Review Boards at MUSC.

Sample Preparation for MALDI Imaging

Tissues and TMA blocks were sectioned at 5 μm and mounted on slides (25 x 75 mm) compatible with the Bruker slide adaptor. After mounting, sections were dewaxed and antigen retrieval proceeded as described (Powers TW et al., PloS One, 2014; 9(9):e106255).

PNGaseF (20 μg/slide) was applied to slide-mounted tissue blocks and TMAs using the ImagePrep spray station (Bruker Daltonics) as previously described (Powers TW et al, PloS One, 2014; 9(9):e106255). N-glycan release occurred during a 2 hr incubation at 37°C in a humidified chamber, followed by desiccation and matrix application. For the TMAs, α-cyano-4-hydroxycinnamic acid matrix (CHCA), consisting of 0.021 g CHCA in 3 mL 50% acetonitrile/50% water and 12 μL 25% TFA, was applied using the ImagePrep sprayer. The TM-Sprayer (HTX Imaging) was used to coat slides containing whole tissue blocks with CHCA. CHCA was prepared at a concentration of 5 mg/mL in 50% ACN/50% H2O at 0.1% TFA. CHCA was applied at 70°C at 0.0017 mg/mm².

Glycan Derivatization

N-glycans were extracted from slides as described previously and dried by vacuum centrifugation (Powers TW et al, Anal Chem, 2013 Oct 15; 85(20):9799-806). The ethyl esterification protocol, including the modification and enrichment, was adapted from Reiding et al. (Reiding KR et al, Anal Chem, 2014 May 28; 86(12):5784-5793).

CA19-9 and SLeX Staining

Serial sections of five of the TMAs were stained with CA19-9 and SLeX. Individual cores were manually scored for the presence or absence of stain and annotated for the presence of tumor or non-tumor tissue. Sensitivity, specificity, positive predictive value and negative predictive value were calculated for both stains individually across the five TMAs.

Data Processing

Imaging data were loaded into FlexImaging 4.1 (Bruker Daltonics) for visual analysis of tissue blocks and TMAs. TMAs were searched against an N-glycan library to identify N-glycans present across all six TMAs. For computational peak picking of the TMAs, Regions of Interest (ROIs) representing each tissue core were exported from data analysis using the hierarchical clustering option and further processed using an in-house workflow.

To assess the mass accuracy of N-glycans, N-glycans present in all six TMAs were manually tabulated along with the accurate mass of the glycans. In addition to the accurate mass, m/z values were reported from the imaging peak picking tool as well as individual spectra from a tissue imaging experiment. To generate the m/z values from individual spectra, one spectrum localized to the tumor tissue and one spectrum localized to the non-tumor tissue were loaded into FTMS processing (Bruker Daltonics) and recalibrated. PPM errors were calculated for both the picked peaks and the individual spectra.
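The mass-accuracy check described above can be illustrated with a minimal sketch. It assumes standard monoisotopic residue masses (these constants are not given in this document), covers only the singly sodiated [M + Na]+ species, and uses a hypothetical observed value; the function names are illustrative and do not correspond to any Bruker software.

```python
# Monoisotopic residue masses in Da (standard values, assumed here;
# they are not listed in the source document).
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "dHex": 146.057909,
    "NeuAc": 291.095417,
}
WATER = 18.010565        # added once for the free reducing-end glycan
SODIUM_ION = 22.989221   # sodium atom minus one electron


def sodiated_mz(composition):
    """Theoretical m/z of the singly charged [M + Na]+ ion."""
    residues = sum(RESIDUE_MASS[name] * n for name, n in composition.items())
    return residues + WATER + SODIUM_ION


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


theoretical = sodiated_mz({"Hex": 6, "HexNAc": 2})
observed = 1419.4790  # hypothetical picked-peak value, for illustration only
print(f"theoretical m/z = {theoretical:.4f}")
print(f"ppm error = {ppm_error(observed, theoretical):+.2f}")
```

Sialylated glycans observed as doubly sodiated ions would need an additional adduct term, which this sketch deliberately omits.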

Intensity values were normalized within each array by using a membership function. Using just signal from known glycans (n=24), supervised machine learning based classifiers were trained on a 2/3 sample set and qualified using a 1/3 hold-out sample set.
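One way to obtain a 2/3 training and 1/3 hold-out partition that stays balanced across arrays is a stratified split. The sketch below is an assumption about how such a split could be done, not the in-house workflow itself; the core records, field names, and toy data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(cores, train_fraction=2 / 3, seed=0):
    """Split cores into train/test sets, stratifying by (array, label)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for core in cores:
        strata[(core["array"], core["label"])].append(core)

    train, test = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)
        n_train = round(len(members) * train_fraction)
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test


# Hypothetical toy data: 6 arrays with 12 tumor and 12 non-tumor cores each.
cores = [
    {"id": f"TMA{a}-{label}-{i}", "array": a, "label": label}
    for a in range(1, 7)
    for label in ("tumor", "non-tumor")
    for i in range(12)
]
train, test = stratified_split(cores)
print(len(train), len(test))
```

Splitting within each array and class keeps the tumor to non-tumor ratio and the array representation similar in both sets, so a classifier is not qualified on an array it never saw during training.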

Classifier Training

Supervised machine learning was used to train classifiers of pancreatic cancer using a 2/3 training set (52 tumor and 49 non-tumor) equally distributed across arrays. The following approaches were evaluated: random forest, support vector machine (linear and radial basis function kernel), Naive Bayes classifier, discriminant analysis (linear and quadratic), and artificial neural networks. Classifiers were trained using interarray normalized data for the known 24 glycans as well as smaller groups of glycans determined using forward sequential feature selection with 10-fold cross-validation. Final models were qualified using an independent 1/3 test set (23 tumor and 25 non-tumor). The best performing classifier was a linear discriminant analysis (LDA) model. Using all 24 peaks, the LDA had an error rate of 0.2083, a sensitivity of 0.7826, and a specificity of 0.8000. Following sequential feature selection, an LDA comprising two masses, 2320.75 (Hex6dHex2HexNAc5 + 1Na) and 2742.93 (Hex7dHex1HexNAc7 + 1Na) m/z, had an error rate of 0.1458, a sensitivity of 0.8696, and a specificity of 0.8400.
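The reported rates follow from the standard confusion-matrix definitions. In the sketch below, the per-class counts are an assumption: they are inferred as the only whole-number counts consistent with the reported rates and the 23 tumor / 25 non-tumor test set, and are not stated in this document. The function name is illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "error_rate": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Inferred counts (assumption) for the 48-core hold-out set.
all_24 = classification_metrics(tp=18, fn=5, tn=20, fp=5)
two_glycan = classification_metrics(tp=20, fn=3, tn=21, fp=4)

for name, m in (("24 glycans", all_24), ("2 glycans", two_glycan)):
    print(name, {k: round(v, 4) for k, v in m.items()
                 if k in ("error_rate", "sensitivity", "specificity")})
```

The predictive values are included in the function because they are used for the stain comparison, but they are not printed here, since the inferred counts have not been checked against the tabulated PPV and NPV.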

Glycomic Analysis of Plasma Pools

Plasma from patients with pancreatitis and PDAC were pooled together for glycomic analysis. 10 μL of plasma from each pool was diluted in 90 μL H2O prior to digestion with ^g PNGaseF overnight at 37°C. N-glycans were extracted from the sample following the addition of 400 μL MeOH, 100 μL CHCl3, and 300 μL H2O and centrifugation at 14,000 x g for 2 minutes. The aqueous phase was collected, dried by vacuum centrifugation, and ethyl esterified as described (Reiding KR et al, Anal Chem, 2014 May 28; 86(12):5784-5793).

The results of the experiments are now described.

N-Glycan Variation in Complex Histopathology Regions

FFPE tissue blocks containing both pancreatic cancer and matched non-cancer tissues, as well as other complex histopathology regions, were selected for N-glycome analysis by MALDI-IMS. N-glycans in these tissue sections displayed patterns of regionalized distribution that correlated with histopathology analysis by H&E staining. In the example provided, the regions of the tissue section are outlined in different colors, with each color corresponding to a different histological region, as annotated by a pathologist (Figure 14A). These regions are pancreas tumor/precancerous lesions (green), intestine mucosa (yellow), fibroadipose connective tissue (blue), smooth muscle (orange), and non-tumor pancreas tissue (red). N-glycans were identified that were either highly elevated or nearly exclusively detected in each region. Hex6dHex1HexNAc5 (m/z = 2174.771) is elevated in the tumor region of the pancreas, which contrasts with Hex6HexNAc2 (m/z = 1419.476), which is elevated in non-tumor pancreas (Figure 14B, 14F). Similarly, Hex5HexNAc4NeuAc1 (m/z = 1976.677) and Hex4dHex1HexNAc4 (m/z = 1647.586) are elevated in the fibroadipose connective tissue and smooth muscle regions, respectively (Figure 14D-14E). Unlike the other N-glycans above, which are elevated in defined tissue regions but expressed at lower levels throughout the tissue section, Hex5dHex2HexNAc5 (m/z = 2304.834) is almost exclusively observed in the intestine mucosa (Figure 14C). Overall, close to 90 N-glycans were identified in the tissue section that correspond to the glycan library, most having a distribution that correlates with a certain histology-derived region of the pancreas. The most abundant N-glycans are annotated in the overall average spectrum from the imaging experiment (Figure 14G). A selected panel of N-glycans demonstrates the differential distribution of N-glycans across the tissue section (Figure 15).

Individual N-Glycan Discriminators for Pancreatic Cancer

As N-glycan distribution varies between tumor and non-tumor regions of the pancreas in whole tissue blocks, the study examined whether individual glycans were consistently altered between tumor and non-tumor tissue sections across different patients. To this end, a higher-throughput study based on the analysis of six TMAs was performed to allow a larger number of samples to be surveyed in a reduced amount of time. Each TMA consisted of patient matched tumor and non-tumor tissue cores. Figure 16 provides a detailed workflow schematic of the data processing steps utilized to generate individual or panels of N-glycans that can distinguish tumor from non-tumor tissue cores. Initially, the data were loaded into an in-house peak picking algorithm, which identified 54 monoisotopic peaks present across all six TMAs. A total of 24 peaks that corresponded to N-glycans were identified from the 54 peaks (Table 3). The expression level of the glycans was assessed by their p-value and Log2FC (Table 3). All N-glycans were detected as the [M + Na]+ adduct, except for sialylated glycans, which were observed as singly or doubly sodiated ions. Of the peaks corresponding to glycans, 17 were elevated in the tumor tissues at a statistically significant level (p < 0.05), while none were elevated in the non-tumor cores at a statistically significant level. Initially peaks were characterized as N-glycans by comparing the observed m/z values to a library of N-glycan structures, but peaks were further validated by glycan release and derivatization using ethyl esterification of sialic acids. For this approach, glycans were extracted from a tissue section, underwent derivatization and enrichment, and were spotted on a MALDI plate prior to analysis. Of the 24 peaks corresponding to N-glycans in the structural library, all 24 were observed in the glycan derivatization experiment. However, additional N-glycans were detected in the analysis following derivatization. The five N-glycans with the highest Log2FC values were all complex, fucosylated N-glycans, and all had a Log2FC > 1.11. While the levels of these glycans were significantly different between tumor and non-tumor (p < 0.05), individual discriminators displayed poor sensitivity and specificity for distinguishing the conditions. For example, Hex7dHex1HexNAc7 (m/z = 2742.983) (data not shown) was only elevated in a small subset of tumor tissues, while Hex6dHex1HexNAc5 (m/z = 2174.771) was elevated in a majority of tumor cores but also some non-tumor cores (Figure 17E). In representative images from five pancreas tissue blocks, Hex6dHex1HexNAc5 was primarily localized to the tumor tissue, but was frequently detected in regions not characterized as tumor tissue by a pathologist (Figure 17A). This general trend is reflected in the statistical data, in which Hex6dHex1HexNAc5 is elevated in the tumor over the non-tumor tissue cores, with a log2FC = 1.11 (Figure 17A). In contrast, m/z = 1419.476 (Hex6HexNAc2) is more ubiquitously expressed throughout the tissue blocks (Figure 17B) and TMA (Figure 17F), which is reflective of the low log2FC of the Hex6HexNAc2 glycan, where log2FC = -0.08. Of the three high-mannose N-glycans detected across all six TMAs, all had a low or negative Log2FC value and were not significantly different between the tumor and non-tumor cores.
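For readers unfamiliar with the Log2FC statistic, a minimal sketch follows. It assumes the fold change is the ratio of mean normalized intensities (tumor over non-tumor); the document does not state the exact formula or the statistical test used for the p-values, and the intensity values below are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(tumor, non_tumor):
    """log2 of the ratio of mean intensities, tumor over non-tumor."""
    return math.log2(mean(tumor) / mean(non_tumor))


# Hypothetical normalized intensities for one glycan across tissue cores.
tumor_cores = [0.82, 0.64, 0.91, 0.70]
non_tumor_cores = [0.35, 0.41, 0.30, 0.38]

print(f"Log2FC = {log2_fold_change(tumor_cores, non_tumor_cores):.2f}")
```

Under this definition a positive value means higher signal in tumor cores, a value near zero means similar signal in both groups, and a Log2FC of 1.11 corresponds to a ratio of about 2.16.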

Table 3: Individual Glycan Discriminators for Pancreatic Cancer.

N-Glycan Biomarker Panels

The combination of N-glycans improved the ability of the platform to distinguish tumor from non-tumor tissues, both visually in the MALDI-IMS data and empirically in the statistical data. In comparison to the individual discriminators, the visual image overlay of Hex6dHex1HexNAc5 (green) and Hex6HexNAc2 (red) in both the tissue blocks (Figure 17C) and TMA (Figure 17G) yields more pronounced differences between the tumor and non-tumor tissue regions, as outlined by the pathologist (Figure 17D). To generate an N-glycan biomarker panel, a supervised machine learning algorithm was applied across the glycans identified in all six TMAs, focusing on the ability of multiple N-glycans to distinguish tumor from non-tumor tissue cores (Table 4). With this approach, two-thirds of the data was subjected to a sequential feature selection process to reduce the number of variables used while minimizing the misclassification error of the LDA model, and the remaining one-third was used as a validation set. In the LDA model, Hex7dHex1HexNAc7 (m/z = 2742.9280) and Hex6dHex2HexNAc5 (m/z = 2320.7524) displayed the lowest error rate of 0.1458 (Figure 18A). Additionally, the image overlay of Hex7dHex1HexNAc7 (green) and Hex6dHex2HexNAc5 (red) in three TMAs aligns with the statistics (Figure 18B-18D). The predictive characteristics of the model were compared against CA19-9 and SLeX staining of the same TMAs. The LDA model consisting of two glycans was able to outperform both CA19-9 and SLeX staining in negative predictive value, and CA19-9 in positive predictive value (Table 4).

Table 4: Comparison of LDA Model to Carbohydrate Antigen Staining

The sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated based on each marker's ability to distinguish tumor from non-tumor tissue sections.

MALDI-IMS Analysis of N-Glycans can be Utilized as a Disease Marker Identification Platform

The present study used TMAs for initial disease marker identification, but results were validated in both TMAs and whole tissue blocks. A secondary goal was to compare the classification metrics of individual discriminators to panels of N-glycans. Due to the enormous biological heterogeneity between individuals with cancer, single markers rarely have the sensitivity and specificity needed for screening asymptomatic patients. Biomarker panels, even when some of the markers are not adequate predictors of disease on their own, can improve disease classification (Pinsky PF et al, Biomark Insights, 2011 Aug; 6:83-93). While aberrant glycosylation is known to be a hallmark of cancer, including pancreatic cancer, few studies have assessed alterations of N-linked glycosylation in pancreatic tissue sections. Instead, most studies monitor CA19-9 serum expression, which is primarily present on O-glycans and glycolipids (Magnani JL et al, J Biol Chem, 1982 Dec 10; 257(23):14365-9; Ho JJ et al, Cancer Res, 1995 Aug 15; 55(16):3659-63). However, CA19-9 expression has well defined limitations for its use as a diagnostic disease marker. In the initial analysis of N-glycan distribution of pancreatic cancer tissue blocks, the expression levels of various N-glycans seem to correlate to complex histological regions present in the tissue blocks (Figure 14). The need for diagnostic markers for pancreatic cancer, limitations in currently available disease markers, and the observation of N-glycans that correlate to tumor and non-tumor tissue regions make pancreatic cancer an ideal disease model for this manuscript.

Given the lack of software available to process data from the SolariX FTICR mass spectrometer, mainly due to the enormous size of the high-resolution data, an in-house approach was developed for peak picking and generation of biomarker panels (Figure 17). For the peak picking function, a bucket table consisting of m/z values and the corresponding relative intensity was generated by using the hierarchical clustering function present in the FlexImaging software. In the peak picking process, a total of 54 peaks were detected in all six TMAs, 24 of which correspond to masses present in the N-glycan structural library (Table 3). Of the identified N-glycans, 20 are elevated in the tumor over the non-tumor tissue cores, 17 of which have a p-value < 0.05. The high-mannose glycans observed have a negative (Hex6HexNAc2, Hex8HexNAc2) or low (Hex7HexNAc2) log2FC value, with p-values > 0.05.

Following the peak picking process, combinations of N-glycans were evaluated for their potential to distinguish tumor from non-tumor tissue sections, both visually and computationally. An overlay of two identified N-glycans, Hex6dHex1HexNAc5 and Hex6HexNAc2, is able to visually differentiate tumor from non-tumor tissues in whole tissue blocks and TMAs (Figure 17). For panel identification, the 24 peaks corresponding to N-glycans were subjected to supervised machine learning algorithms and sequential feature selection to identify N-glycan panels that are capable of distinguishing tumor from non-tumor tissue cores. The LDA model highlights the importance of Hex7dHex1HexNAc7 and Hex6dHex2HexNAc5 in distinguishing tumor from non-tumor sections (Figure 18). This model outperforms CA19-9 staining metrics on serial TMA sections (Table 4). Both comparisons mentioned above, Hex6dHex1HexNAc5 vs. Hex6HexNAc2 and Hex7dHex1HexNAc7 vs. Hex6dHex2HexNAc5, contrast N-glycans with a large Log2FC and a p-value < 0.05 with N-glycans with a negative Log2FC and p-value > 0.05 (Table 3). Additionally, as little correlation was found between N-glycans and the CA19-9 antigen, it is possible to incorporate CA19-9 expression metrics into the biomarker panel to potentially improve the predictive value (Pinsky PF et al, Biomark Insights, 2011 Aug; 6:83-93). By validating the glycan distribution on whole tissue blocks, glycosylation trends could be evaluated in relation to complex histopathology. Among the findings, two important observations warrant further discussion. As previously mentioned, pancreatic cancer is only curable if detected at an early stage prior to becoming locally invasive or metastatic. As such, markers capable of detecting pre-cancerous lesions are particularly useful. The overlay of Hex6dHex1HexNAc5 and Hex6HexNAc2 (Figure 17C, top tissue) is able to differentiate both low and high grade IPMN lesions from the non-tumor pancreas tissue. Interestingly, Hex6dHex1HexNAc5 (Figure 17A, top tissue) has a higher relative intensity in the tumor regions than the pre-cancerous lesions, demonstrating a signature gradient that coincides with cancer stage. However, not all of these lesions ultimately become cancerous, so additional tests are required to determine if these glycans can distinguish lesions that will become cancerous from lesions that will remain dormant. Secondly, one of the most significant limitations of cancer biomarkers is their poor ability to differentiate tumor from benign diseases. In the examples of whole tissue blocks provided, the overlay of Hex6dHex1HexNAc5 and Hex6HexNAc2 is also able to differentiate tumor from chronic pancreatitis (Figure 17C). As CA19-9 is limited from a clinical perspective by its poor specificity in distinguishing malignant from benign conditions, this finding is particularly promising.

The present study demonstrates a novel disease marker discovery platform that utilizes MALDI-IMS of N-linked glycans in a high-throughput TMA study. Expression levels of individual N-glycans and of N-glycan panels were shown to be capable of differentiating tumor from non-tumor tissue, as well as other complex pathologies. Furthermore, the developed platform described herein can be applied to any disease for initial screening purposes.

The disclosures of each and every patent, patent application, and publication cited herein are hereby incorporated herein by reference in their entirety. While this invention has been disclosed with reference to specific embodiments, it is apparent that other embodiments and variations of this invention may be devised by others skilled in the art without departing from the true spirit and scope of the invention. The appended claims are intended to be construed to include all such embodiments and equivalent variations.