Title:
SYSTEMS AND METHODS FOR DIAGNOSIS AND PROGNOSIS OF COLORECTAL CANCER
Document Type and Number:
WIPO Patent Application WO/2008/116178
Kind Code:
A8
Abstract:
Methods and systems are described for use in making a diagnosis of colorectal cancer in a subject. A method can include providing a biological sample from the subject; determining an amount in the sample of at least one peptide marker, selected from a group of markers as described herein; and comparing the amount of the at least one peptide marker in the sample, if present, to a control level of the at least one peptide marker, wherein the subject is diagnosed as having colorectal cancer or a risk thereof if there is a measurable difference in the amount of the at least one peptide marker in the sample as compared to the control level.

Inventors:
COFFEY ROBERT J (US)
ARONOW BRUCE J (US)
FRANKLIN JEFFREY L (US)
YEATMAN TIMOTHY (US)
OLESEN SANNE K H (US)
BLOOM GREGORY (US)
Application Number:
PCT/US2008/057884
Publication Date:
September 17, 2009
Filing Date:
March 21, 2008
Assignee:
UNIV VANDERBILT (US)
H LEE MOFFITT CANCER CT AND RE (US)
CINCINNATI CHILDREN S HOSPITAL (US)
COFFEY ROBERT J (US)
ARONOW BRUCE J (US)
FRANKLIN JEFFREY L (US)
YEATMAN TIMOTHY (US)
OLESEN SANNE K H (US)
BLOOM GREGORY (US)
International Classes:
G01N33/53; C12M1/00
Attorney, Agent or Firm:
MYERS JR., Richard S et al. (401 Commerce Street, Nashville, Tennessee, US)
Claims:

CLAIMS

What is claimed is:

1. A method for making a diagnosis of a colorectal cancer in a subject, comprising:

(a) providing a biological sample from the subject;

(b) determining an amount in the sample of at least one peptide selected from the peptides set forth in Table A; and

(c) comparing the amount of the at least one peptide in the sample, if present, to a control level of the at least one peptide, wherein the subject is diagnosed as having colorectal cancer or a risk thereof if there is a measurable difference in the amount of the at least one peptide in the sample as compared to the control level.

2. The method of claim 1, further comprising determining an amount in the sample of at least two peptides selected from the peptides set forth in Table A.

3. The method of claim 1, further comprising determining an amount in the sample of at least five peptides selected from the peptides set forth in Table A.

4. The method of claim 1, further comprising determining an amount in the sample of at least ten peptides selected from the peptides set forth in Table A.

5. The method of claim 1, wherein the biological sample comprises blood, plasma, or serum.

6. The method of claim 1, wherein the subject is human.

7. The method of claim 1, wherein determining the amount in the sample of the at least one peptide comprises determining the amount in the sample of the at least one peptide using mass spectrometry (MS) analysis, immunoassay analysis, or both.

8. The method of claim 7, wherein the immunoassay analysis comprises an enzyme-linked immunosorbent assay (ELISA).

9. The method of claim 1, further comprising selecting a treatment or modifying a treatment for the colorectal cancer based on the determined amount of the at least one peptide.

10. The method of claim 1, further comprising determining an amount in the sample of at least one additional peptide selected from the peptides set forth in Table B.

11. The method of claim 10, wherein the biological sample comprises blood, plasma, or serum.

12. The method of claim 10, wherein the subject is human.

13. The method of claim 10, wherein determining the amount in the sample of the at least one peptide comprises determining the amount in the sample of the at least one peptide using mass spectrometry (MS) analysis, immunoassay analysis, or both.

14. The method of claim 10, wherein the immunoassay analysis comprises an enzyme-linked immunosorbent assay (ELISA).

15. The method of claim 10, further comprising selecting a treatment or modifying a treatment for the colorectal cancer based on the determined amount of the at least one peptide.

16. The method of claim 1, further comprising determining an amount in the sample of at least one additional peptide selected from the peptides set forth in Table C.

17. The method of claim 16, wherein the biological sample comprises blood, plasma, or serum.

18. The method of claim 16, wherein the subject is human.

19. The method of claim 16, wherein determining the amount in the sample of the at least one peptide comprises determining the amount in the sample of the at least one peptide using mass spectrometry (MS) analysis, immunoassay analysis, or both.

20. The method of claim 16, wherein the immunoassay analysis comprises an enzyme-linked immunosorbent assay (ELISA).

21. The method of claim 16, further comprising selecting a treatment or modifying a treatment for the colorectal cancer based on the determined amount of the at least one peptide.

22. A system for making a diagnosis of a colorectal cancer in a subject, comprising: probes for selectively binding each of one or more peptide markers, wherein the peptide markers are selected from the peptides set forth in Table A; and means for detecting the binding of said probes to said one or more peptide markers.

23. The system of claim 22, further comprising probes for selectively binding each of one or more peptide markers selected from the peptides set forth in Table B.

24. The system of claim 22, further comprising probes for selectively binding each of one or more peptide markers selected from the peptides set forth in Table C.

25. A method for making a diagnosis of a colorectal cancer in a subject, comprising:

(a) providing a biological sample from the subject;

(b) determining an amount in the sample of at least one peptide selected from the peptides set forth in Table B; and

(c) comparing the amount of the at least one peptide in the sample, if present, to a control level of the at least one peptide, wherein the subject is diagnosed as having colorectal cancer or a risk thereof if there is a measurable difference in the amount of the at least one peptide in the sample as compared to the control level.

26. The method of claim 25, further comprising determining an amount in the sample of at least two peptides selected from the peptides set forth in Table B.

27. The method of claim 25, further comprising determining an amount in the sample of at least five peptides selected from the peptides set forth in Table B.

28. The method of claim 25, further comprising determining an amount in the sample of at least ten peptides selected from the peptides set forth in Table B.

29. The method of claim 25, wherein the biological sample comprises blood, plasma, or serum.

30. The method of claim 25, wherein the subject is human.

31. The method of claim 25, wherein determining the amount in the sample of the at least one peptide comprises determining the amount in the sample of the at least one peptide using mass spectrometry (MS) analysis, immunoassay analysis, or both.

32. The method of claim 31, wherein the immunoassay analysis comprises an enzyme-linked immunosorbent assay (ELISA).

33. The method of claim 25, further comprising selecting a treatment or modifying a treatment for the colorectal cancer based on the determined amount of the at least one peptide.

34. A system for making a diagnosis of a colorectal cancer in a subject, comprising: probes for selectively binding each of one or more peptide markers, wherein the peptide markers are selected from the peptides set forth in Table B; and means for detecting the binding of said probes to said one or more peptide markers.

35. A method for making a diagnosis of a colorectal cancer in a subject, comprising: (a) providing a biological sample from the subject;

(b) determining an amount in the sample of at least one peptide marker from at least one group of markers, selected from

(i) the group of markers set forth in Tables H and I, which are associated with a first type of colorectal cancer;

(ii) the group of markers set forth in Tables J and K, which are associated with a second type of colorectal cancer;

(iii) the group of markers set forth in Tables L and M, which are associated with a third type of colorectal cancer; and

(iv) the group of markers set forth in Tables N and O, which are associated with a fourth type of colorectal cancer; and

(c) comparing the amount of the at least one peptide marker in the sample, if present, to a control level of the at least one peptide marker; wherein the subject is diagnosed as having the type of colorectal cancer associated with the group of markers of which the at least one peptide marker is a member if there is a measurable difference in the amount of the at least one peptide in the sample as compared to the control level.

36. The method of claim 35, further comprising determining the amount in the sample of at least one peptide marker from at least two groups of markers.

37. The method of claim 35, further comprising determining the amount in the sample of at least one peptide marker from at least three groups of markers.

38. The method of claim 35, further comprising determining the amount in the sample of at least one peptide marker from four of the groups of markers.

39. The method of claim 35, wherein the at least one group of markers is selected from

(i) the group of markers set forth in Table I;

(ii) the group of markers set forth in Table K;

(iii) the group of markers set forth in Table M; and

(iv) the group of markers set forth in Table O.

40. The method of claim 35, further comprising determining the amount in the sample of at least one peptide marker from at least two groups of markers.

41. The method of claim 35, further comprising determining the amount in the sample of at least one peptide marker from at least three groups of markers.

42. The method of claim 35, further comprising determining the amount in the sample of at least one peptide marker from each of the groups of markers.

43. A system for making a diagnosis of a colorectal cancer in a subject, comprising: probes for selectively binding each of one or more peptide markers, wherein the peptide markers are selected from at least one group of markers, selected from

(i) the group of markers set forth in Tables H and I, which are associated with a first type of colorectal cancer;

(ii) the group of markers set forth in Tables J and K, which are associated with a second type of colorectal cancer;

(iii) the group of markers set forth in Tables L and M, which are associated with a third type of colorectal cancer; and

(iv) the group of markers set forth in Tables N and O, which are associated with a fourth type of colorectal cancer; and means for detecting the binding of said probes to said one or more peptide markers.

44. The system of claim 43, wherein the at least one group of markers is selected from

(i) the group of markers set forth in Table I;

(ii) the group of markers set forth in Table K;

(iii) the group of markers set forth in Table M; and

(iv) the group of markers set forth in Table O.

45. A method for making a diagnosis of a colorectal cancer in a subject, comprising: (a) providing a biological sample from the subject;

(b) determining an amount in the sample of at least one peptide selected from the peptides set forth in Table C; and

(c) comparing the amount of the at least one peptide in the sample, if present, to a control level of the at least one peptide, wherein the subject is diagnosed as having colorectal cancer or a risk thereof if there is a measurable difference in the amount of the at least one peptide in the sample as compared to the control level.

46. The method of claim 45, further comprising determining an amount in the sample of at least two peptides selected from the peptides set forth in Table C.

47. The method of claim 45, further comprising determining an amount in the sample of at least five peptides selected from the peptides set forth in Table C.

48. The method of claim 45, further comprising determining an amount in the sample of at least ten peptides selected from the peptides set forth in Table C.

49. The method of claim 45, comprising determining an amount in the sample of at least one peptide selected from the peptides set forth in Table D.

50. The method of claim 45, comprising determining an amount in the sample of at least one peptide selected from the peptides set forth in Table E.

51. The method of claim 45, comprising determining an amount in the sample of at least one peptide selected from the peptides set forth in Table F.

52. The method of claim 45, comprising determining an amount in the sample of at least one peptide selected from the peptides set forth in Table G.

53. The method of claim 45, wherein the biological sample comprises blood, plasma, or serum.

54. The method of claim 45, wherein the subject is human.

55. The method of claim 45, wherein determining the amount in the sample of the at least one peptide comprises determining the amount in the sample of the at least one peptide using mass spectrometry (MS) analysis, immunoassay analysis, or both.

56. The method of claim 55, wherein the immunoassay analysis comprises an enzyme-linked immunosorbent assay (ELISA).

57. The method of claim 45, further comprising selecting a treatment or modifying a treatment for the colorectal cancer based on the determined amount of the at least one peptide.

58. A system for making a diagnosis of a colorectal cancer in a subject, comprising: probes for selectively binding each of one or more peptide markers, wherein the peptide markers are selected from the peptides set forth in Table C; and means for detecting the binding of said probes to said one or more peptide markers.

59. The system of claim 58, wherein the peptide markers are selected from the peptides set forth in Table D.

60. The system of claim 58, wherein the peptide markers are selected from the peptides set forth in Table E.

61. The system of claim 58, wherein the peptide markers are selected from the peptides set forth in Table F.

62. The system of claim 58, wherein the peptide markers are selected from the peptides set forth in Table G.

Description:

SYSTEMS AND METHODS FOR DIAGNOSIS AND PROGNOSIS OF COLORECTAL CANCER

Attorney Docket No. : 11672N/7117 W0(VC/VCM)

RELATED APPLICATIONS

[0001] This application claims priority from U.S. Provisional Application Serial Nos. 60/896,020 filed on March 21, 2007; 60/938,307 filed on May 16, 2007; 60/968,988 filed on August 30, 2007; and 60/971,839 filed on September 12, 2007; the entire disclosures of which are incorporated herein by this reference.

GOVERNMENT INTEREST

[0002] Subject matter described herein was made with U.S. Government support under Grant Numbers 5P50 CA095103-03, 2U01 CA084239-06, and U01 CA085052 awarded by the National Cancer Institute (NCI). The government has certain rights in the described subject matter.

TECHNICAL FIELD

[0003] The presently-disclosed subject matter relates to methods for diagnosis and prognosis of colorectal cancer. In particular, the presently-disclosed subject matter relates to diagnostic and prognostic methods based on determining an amount of protein markers in a biological sample from a subject.

INTRODUCTION AND GENERAL CONSIDERATIONS

[0001] Colorectal cancer is the second leading cause of cancer-related deaths in the western world, and accounts for about 655,000 deaths/year worldwide. Surveillance by colonoscopy is currently considered to be the most effective screening method for colorectal cancer; however, the process is invasive, unpleasant, and time-consuming.

[0002] The identification of colorectal cancer biomarkers suitable for the early detection and diagnosis of colorectal cancer holds great promise for improving the clinical outcome of subjects, and would allow for more comprehensive population screening because biomarker-based testing is less invasive, less expensive, and less time-consuming than methods currently in use. The identification of such biomarkers would be especially important for subjects presenting with vague or no symptoms, as well as subjects who have tumors that are relatively inaccessible to physical examination. Despite considerable effort directed at early detection, few reliable and cost-effective screening tests have been developed.

[0003] In addition to improving early-stage cancer detection screening procedures, a set of markers that could, for an individual, quantitatively indicate the presence of residual or recurrent disease would be particularly useful for assessing effectiveness of treatment or a change in clinical status management.

[0004] Thus, there is an unmet need for biomarkers that individually, or in combination with other biomarkers or diagnostic modalities, could deliver the required sensitivity and specificity for early detection and prognosis of colorectal cancer, as well as quantitative detection following diagnosis and successive stages of therapeutic management. In particular, simple tests that can be performed on readily-accessible biological fluids are needed.

SUMMARY

[0005] The presently-disclosed subject matter meets some or all of the above-identified needs, as will become evident to those of ordinary skill in the art after a study of information provided in this document.

[0006] This Summary describes several embodiments of the presently-disclosed subject matter, and in many cases lists variations and permutations of these embodiments. This Summary is merely exemplary of the numerous and varied embodiments. Mention of one or more representative features of a given embodiment is likewise exemplary. Such an embodiment can typically exist with or without the feature(s) mentioned; likewise, those features can be applied to other embodiments of the presently-disclosed subject matter, whether listed in this Summary or not. To avoid excessive repetition, this Summary does not list or suggest all possible combinations of such features.

[0007] The presently-disclosed subject matter includes methods and systems for making a diagnosis of colorectal cancer in a subject. Various peptides are disclosed herein for use as markers for colorectal cancer. Markers are identified in Tables A-O, set forth below.

[0008] In some embodiments, a method for making a diagnosis of a colorectal cancer in a subject includes: providing a biological sample from the subject; determining an amount in the sample of at least one peptide selected from the peptides set forth in Table A, Table B, and/or Table C; and comparing the amount of the at least one peptide in the sample, if present, to a control level of the at least one peptide, wherein the subject is diagnosed as having colorectal cancer or a risk thereof if there is a measurable difference in the amount of the at least one peptide in the sample as compared to the control level.

[0009] In some embodiments, the method includes determining an amount in the sample of at least two peptides, at least five peptides, or at least ten peptides selected from the peptides set forth in Table A, Table B, and/or Table C.

[0010] In some embodiments, the method includes determining an amount in the sample of at least one peptide selected from the peptides set forth in Table D, Table E, Table F, or Table G.

[0011] In some embodiments, the method includes determining an amount in the sample of at least one peptide selected from the peptides set forth in Table A, and at least one peptide selected from the peptides set forth in Table B. In some embodiments, the method includes determining an amount in the sample of at least one peptide selected from the peptides set forth in Table A, and at least one additional peptide selected from the peptides set forth in Table C. In some embodiments, the method includes determining an amount in the sample of at least one peptide selected from the peptides set forth in Table B, and at least one additional peptide selected from the peptides set forth in Table C.

[0012] In some embodiments, determining the amount in the sample of the at least one peptide is conducted using mass spectrometry (MS) analysis, immunoassay analysis, or both. In some embodiments where immunoassay analysis is conducted, an enzyme-linked immunosorbent assay (ELISA) can be used.

[0013] In some embodiments, the method also includes selecting a treatment or modifying a treatment for the colorectal cancer based on the determined amount of the at least one peptide.

[0014] In some embodiments, a method for making a diagnosis of a colorectal cancer in a subject includes: providing a biological sample from the subject; determining an amount in the sample of at least one peptide marker from at least one group of markers, selected from (i) the group of markers set forth in Tables H and/or I, which are associated with a first type of colorectal cancer; (ii) the group of markers set forth in Tables J and/or K, which are associated with a second type of colorectal cancer; (iii) the group of markers set forth in Tables L and/or M, which are associated with a third type of colorectal cancer; and (iv) the group of markers set forth in Tables N and/or O, which are associated with a fourth type of colorectal cancer; and comparing the amount of the at least one peptide marker in the sample, if present, to a control level of the at least one peptide marker; wherein the subject is diagnosed as having the type of colorectal cancer associated with the group of markers of which the at least one peptide marker is a member if there is a measurable difference in the amount of the at least one peptide in the sample as compared to the control level.

[0015] In some embodiments, the method includes determining the amount in the sample of at least one peptide marker from at least two groups of markers. In some embodiments, the method includes determining the amount in the sample of at least one peptide marker from at least three groups of markers. In some embodiments, the method includes determining the amount in the sample of at least one peptide marker from four of the groups of markers.
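The association between the table groups and the four types of colorectal cancer described above can be illustrated with a short sketch. It is offered only as an illustration: the marker names, the function name, and the data layout are hypothetical, and the sketch assumes that the determination of a measurable difference has already been made by other means.

```python
# Illustrative sketch only. Table letters and type labels follow the text;
# marker names and the function name are hypothetical.
TYPE_BY_TABLE = {
    "H": "first", "I": "first",
    "J": "second", "K": "second",
    "L": "third", "M": "third",
    "N": "fourth", "O": "fourth",
}
TYPE_ORDER = ["first", "second", "third", "fourth"]


def associated_types(differing_markers):
    """Return the colorectal cancer types associated with the given markers.

    differing_markers maps a marker name to the letter of the table in which
    the marker is listed. Only markers already found to differ measurably
    from their control level are expected here.
    """
    types = {TYPE_BY_TABLE[table] for table in differing_markers.values()}
    return sorted(types, key=TYPE_ORDER.index)


if __name__ == "__main__":
    # Hypothetical markers drawn from two of the four groups.
    print(associated_types({"marker_1": "H", "marker_2": "K"}))
```

Measuring markers from two, three, or all four groups, as in the embodiments above, corresponds to supplying markers whose tables span that many groups.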

[0016] In some embodiments, the biological sample includes blood, plasma, or serum. In some embodiments, the subject is human.

[0017] In some embodiments, a system for making a diagnosis of a colorectal cancer in a subject includes: probes for selectively binding each of one or more peptide markers, wherein the peptide markers are selected from the peptides set forth in Table A, Table B, and/or Table C; and means for detecting the binding of said probes to said one or more peptide markers. In some embodiments, the system includes at least one probe for selectively binding each of one or more peptide markers selected from the peptides set forth in Table A, and at least one probe for selectively binding each of one or more peptide markers selected from the peptides set forth in Table B. In some embodiments, the system includes at least one probe for selectively binding each of one or more peptide markers selected from the peptides set forth in Table B, and at least one probe for selectively binding each of one or more peptide markers selected from the peptides set forth in Table C. In some embodiments, the system includes at least one probe for selectively binding each of one or more peptide markers selected from the peptides set forth in Table A, and at least one probe for selectively binding each of one or more peptide markers selected from the peptides set forth in Table C.

[0018] In some embodiments, the system includes at least one probe for selectively binding each of one or more peptide markers selected from the peptides set forth in Table D, Table E, Table F, or Table G.

[0019] In some embodiments, a system for making a diagnosis of a colorectal cancer in a subject includes: probes for selectively binding each of one or more peptide markers, wherein the peptide markers are selected from at least one group of markers, selected from (i) the group of markers set forth in Tables H and/or I, which are associated with a first type of colorectal cancer; (ii) the group of markers set forth in Tables J and/or K, which are associated with a second type of colorectal cancer; (iii) the group of markers set forth in Tables L and/or M, which are associated with a third type of colorectal cancer; and (iv) the group of markers set forth in Tables N and/or O, which are associated with a fourth type of colorectal cancer; and means for detecting the binding of said probes to said one or more peptide markers.

BRIEF DESCRIPTION OF THE DRAWINGS

[0004] FIG. 1 is a flow chart illustrating the steps involved in an embodiment of a method of the presently-disclosed subject matter.

[0005] FIG. 2 depicts a stratification of murine colon tumor models by localization of β-catenin and a plan for analysis.

[0006] FIG. 3 is a graph depicting the results of a study to determine the differential expression of transcripts identified by microarray analyses using quantitative real-time PCR (qRT-PCR).

[0007] FIG. 4 includes an analysis of genes over-expressed and under-expressed in embryonic colon and in tumors.

[0008] FIG. 5 includes an analysis of gene lists with criteria of over-expression or under-expression in development, or over-expression or under-expression in human CRCs.

[0009] FIG. 6A is a schematic diagram of the canonical WNT signaling pathway showing elements present in C6 (gene symbols with gray background).

[0010] FIG. 6B depicts relative gene expression for MYC and SOX4 for individual murine and human tumors.

[0011] FIG. 7A is a diagram showing that tumors exhibit large-scale activation of developmental patterns, where nuclear β-catenin-positive (Apc Min/+ and AOM) tumors map more strongly to early development stages (more proliferative, less differentiated), whereas nuclear β-catenin-negative (Tgfb1-/-; Rag2-/- and Smad3-/-) tumors map more strongly to later stages consistent with increased epithelial differentiation.

[0012] FIG. 7B illustrates an overall representation of the relationship of mouse colon tumor models and human CRC to development and non-developmental expression patterns.

DESCRIPTION OF EXEMPLARY EMBODIMENTS

[0020] The details of one or more embodiments of the presently-disclosed subject matter are set forth in this document. Modifications to embodiments described in this document, and other embodiments, will be evident to those of ordinary skill in the art after a study of the information provided. The information provided in this document, and particularly the specific details of the described exemplary embodiments, is provided primarily for clearness of understanding and no unnecessary limitations are to be understood therefrom. In case of conflict, the specification of this document, including definitions, will control.

[0021] While the terms defined herein are believed to be well understood by one of ordinary skill in the art, the definitions are set forth to facilitate explanation of the presently-disclosed subject matter.

[0022] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the presently-disclosed subject matter belongs. Although any methods, systems, and materials similar or equivalent to those described herein can be used in the practice or testing of the presently-disclosed subject matter, representative methods, devices, and materials are now described.

[0023] Following long-standing patent law convention, the terms "a", "an", and "the" refer to "one or more" when used in this application, including the claims. Thus, for example, reference to "a cell" includes a plurality of such cells, and so forth.

[0024] Unless otherwise indicated, all numbers expressing quantities of ingredients, properties such as reaction conditions, and so forth used in the specification and claims are to be understood as being modified in all instances by the term "about". Accordingly, unless indicated to the contrary, the numerical parameters set forth in this specification and claims are approximations that can vary depending upon the desired properties sought to be obtained by the presently-disclosed subject matter.

[0025] As used herein, the term "about," when referring to a value or to an amount of mass, weight, time, volume, concentration or percentage is meant to encompass variations of in some embodiments ±20%, in some embodiments ±10%, in some embodiments ±5%, in some embodiments ±1%, in some embodiments ±0.5%, and in some embodiments ±0.1% from the specified amount, as such variations are appropriate to perform the disclosed method.
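The arithmetic implied by this definition of "about" can be sketched as follows. The function names are hypothetical and are used only for illustration; the tolerance values are those recited in the preceding paragraph.

```python
# Illustrative sketch only. The percentages are the ones recited in the text;
# the function names are hypothetical.
ABOUT_TOLERANCES_PCT = (20.0, 10.0, 5.0, 1.0, 0.5, 0.1)


def about_range(specified, tolerance_pct):
    """Return the (low, high) bounds of 'about specified' at the given tolerance."""
    delta = abs(specified) * tolerance_pct / 100.0
    return (specified - delta, specified + delta)


def is_about(value, specified, tolerance_pct):
    """Return True if value lies within +/- tolerance_pct percent of specified."""
    low, high = about_range(specified, tolerance_pct)
    return low <= value <= high


if __name__ == "__main__":
    # A measured value of 108 against a specified amount of 100.
    for pct in ABOUT_TOLERANCES_PCT:
        print(pct, is_about(108.0, 100.0, pct))
```

In this example a value of 108 falls within "about 100" in the ±20% and ±10% embodiments, but not in the ±5% or narrower embodiments.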

[0026] The presently-disclosed subject matter includes methods and systems for diagnosis and prognosis of colorectal cancer. Colorectal cancer can include, for example, cancer of the colon, rectum, and/or appendix. The presently-disclosed subject matter includes methods and systems for diagnosing colorectal cancer in a subject, and for determining whether to initiate or continue treatment of colorectal cancer in a subject. In some embodiments, the method includes identifying at least one marker in a biological sample from a subject. The at least one marker can be a secreted protein. In some embodiments, the at least one marker is a polypeptide selected from the following, or a polypeptide fragment thereof:

[0027] In some embodiments, the at least one marker is a polypeptide selected from the following, or a polypeptide fragment thereof:

[0028] In some embodiments, the at least one marker is a polypeptide selected from the following, or a polypeptide fragment thereof:

[0029] In some embodiments, the at least one marker is a polypeptide selected from the following, or a polypeptide fragment thereof:

[0030] The markers set forth in Table D are each associated with one of four Dukes stages.

[0031] In some embodiments, the at least one marker is a polypeptide selected from the following, or a polypeptide fragment thereof:

[0032] The markers set forth in Table E are each associated with two of four Dukes stages.

[0033] In some embodiments, the at least one marker is a polypeptide selected from the following, or a polypeptide fragment thereof:

[0034] The markers set forth in Table F are each associated with three of four Dukes stages.

[0035] In some embodiments, the at least one marker is a polypeptide selected from the following, or a polypeptide fragment thereof:

[0036] The markers set forth in Table G are each associated with four of four Dukes stages. Stages of colorectal cancer can be defined according to different systems, as will be understood by those of ordinary skill in the art. For example, stages can be defined using the Dukes system, which stages cancer as Dukes A, B, C, or D. Dukes A can refer to a cancer affecting the innermost lining of the colon or rectum. Dukes B can refer to a cancer that has spread into the muscle layer of the colon or rectum. Dukes C can refer to a cancer that has spread to a lymph node in the region of the colon or rectum. Dukes D can refer to a primary colorectal cancer that has metastasized to a location elsewhere in the body.
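The relationship between the Dukes stages and Tables D through G described above can be summarized in a short sketch. The class and function names are hypothetical; the stage descriptions and the table assignments are taken from the preceding paragraphs, and no actual marker data is assumed.

```python
# Illustrative sketch only. Names are hypothetical; the stage descriptions
# and the stage-count-to-table mapping follow the text.
from enum import Enum


class DukesStage(Enum):
    A = "affects the innermost lining of the colon or rectum"
    B = "has spread into the muscle layer of the colon or rectum"
    C = "has spread to a lymph node in the region of the colon or rectum"
    D = "primary cancer has metastasized to a location elsewhere in the body"


# Tables D, E, F, and G list markers associated with one, two, three,
# and four of the four Dukes stages, respectively.
TABLE_BY_STAGE_COUNT = {1: "D", 2: "E", 3: "F", 4: "G"}


def table_for_marker(associated_stages):
    """Return the table letter for a marker, given its associated Dukes stages."""
    stages = set(associated_stages)
    if not stages:
        raise ValueError("a marker must be associated with at least one stage")
    return TABLE_BY_STAGE_COUNT[len(stages)]


if __name__ == "__main__":
    # A hypothetical marker associated with Dukes A and Dukes C.
    print(table_for_marker({DukesStage.A, DukesStage.C}))
```

Note that Table D and Dukes D are unrelated uses of the same letter: the former names a list of markers, the latter a stage.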

[0037] The terms "polypeptide", "protein", and "peptide", which are used interchangeably herein, refer to a polymer of the 20 protein amino acids, including modified amino acids (e.g., phosphorylated, glycated, etc.) and amino acid analogs, regardless of size or function. Thus, exemplary polypeptides include gene products, naturally occurring proteins, homologs, orthologs, paralogs, fragments and other equivalents, variants, and analogs of the foregoing.

[0038] The term "fragment", when used with reference to a polypeptide, refers to a polypeptide in which amino acid residues are absent as compared to the full-length polypeptide itself, but where the remaining amino acid sequence is usually identical to the corresponding positions in the reference polypeptide. Such deletions can occur at the amino-terminus or carboxy-terminus of the reference polypeptide, or alternatively both. A fragment can retain one or more of the biological features of the reference polypeptide. When the term "peptide" is used herein, it is intended to include the full-length peptide as well as fragments of the peptide. Thus, an identified fragment of a peptide (e.g., by mass spectrometry) is intended to encompass the fragment as well as the full-length peptide.

[0039] In some embodiments of the presently-disclosed subject matter, a method for diagnosing colorectal cancer in a subject is provided. The terms "diagnosing" and "diagnosis" as used herein refer to methods by which one of ordinary skill in the art can estimate and/or even determine whether or not a subject has a colorectal cancer or a risk thereof. One of ordinary skill in the art can make a diagnosis on the basis of one or more diagnostic indicators, such as for example a marker, the amount (including presence or absence) of which is indicative of the presence, severity, or absence of the condition.

[0040] With reference to FIG. 1, in some embodiments, a method for diagnosing colorectal cancer in a subject 100 comprises providing a biological sample from the subject 102; determining an amount of at least one peptide marker 104; and identifying the subject as having colorectal cancer or a risk thereof if there is a measurable difference in the amount of the at least one peptide in the sample as compared to a control level 106.

[0041] With regard to providing a biological sample from the subject 102, the term "biological sample" as used herein refers to any body fluid or tissue potentially comprising secreted peptide markers, such as the peptides set forth in Tables A, B, and C. In some embodiments, for example, the biological sample can be a blood sample, a serum sample, a plasma sample, or sub-fractions thereof. In some embodiments, the biological sample can be derived from a normal stool specimen or from a stool specimen obtained following a treatment with a GI tract prokinetic and/or secretory stimulant.

[0042] Turning now to identifying one or more markers in the biological sample 104, various methods known to those of ordinary skill in the art can be used to identify the one or more markers in the provided biological sample. For example, mass spectrometry and/or immunoassay devices and methods can be used, although other methods are well known to those of ordinary skill in the art (for example, the measurement of marker RNA levels). See, e.g., U.S. Pat. Nos. 6,143,576; 6,113,855; 6,019,944; 5,985,579; 5,947,124; 5,939,272; 5,922,615; 5,885,527; 5,851,776; 5,824,799; 5,679,526; 5,525,524; and 5,480,792, each of which is hereby incorporated by reference in its entirety. Immunoassay devices and methods can utilize labeled molecules in various sandwich, competitive, or non-competitive assay formats, to generate a signal that is related to the presence or amount of an analyte of interest. Additionally, certain methods and devices, such as biosensors and optical immunoassays, can be employed to determine the presence or amount of analytes without the need for a labeled molecule. See, e.g., U.S. Pat. Nos. 5,631,171; and 5,955,377, each of which is hereby incorporated by reference in its entirety.

[0043] Thus, in some embodiments of the presently-disclosed subject matter, the marker peptides are analyzed using an immunoassay. The presence or amount of a marker can be determined using antibodies or fragments thereof specific for each marker, and detecting specific binding. For example, in some embodiments, the antibody specifically binds SPARC, which is inclusive of antibodies that bind the full-length peptide or a fragment thereof. In some embodiments, antibodies are provided, wherein each antibody specifically binds a full-length peptide or a fragment thereof that is selected from: a peptide of Tables A, B, and C. In some embodiments, antibodies are provided, wherein each antibody specifically binds a particular isotype of a peptide or a fragment thereof. In some embodiments, the antibody or antibodies can be monoclonal.

[0044] Any suitable immunoassay can be utilized, for example, enzyme-linked immunoassays (ELISA), radioimmunoassays (RIAs), competitive binding assays, and the like. Specific immunological binding of the antibody to the marker can be detected directly or indirectly. Direct labels include fluorescent or luminescent tags, metals, dyes, radionuclides, and the like, attached to the antibody. Indirect labels include various enzymes well known in the art, such as alkaline phosphatase, horseradish peroxidase and the like.

[0045] The use of immobilized antibodies or fragments thereof specific for the markers is also contemplated by the presently-disclosed subject matter. The antibodies can be immobilized onto a variety of solid supports, such as magnetic or chromatographic matrix particles, the surface of an assay plate (such as microtiter wells), pieces of a solid substrate material (such as plastic, nylon, paper), and the like. An assay strip can be prepared by coating the antibody or a plurality of antibodies in an array on a solid support. This strip can then be dipped into the test biological sample and then processed quickly through washes and detection steps to generate a measurable signal, such as, for example, a colored spot.

[0046] In some embodiments, mass spectrometry (MS) analysis can be used alone or in combination with other methods (e.g., immunoassays) to determine the presence and/or quantity of the one or more markers of interest in a biological sample. In some embodiments, the MS analysis comprises matrix-assisted laser desorption/ionization (MALDI) time-of-flight (TOF) MS analysis, such as for example direct-spot MALDI-TOF or liquid chromatography MALDI-TOF mass spectrometry analysis. In some embodiments, the MS analysis comprises electrospray ionization (ESI) MS, such as for example liquid chromatography (LC) ESI-MS. Mass analysis can be accomplished using commercially-available spectrometers, such as for example triple quadrupole mass spectrometers. Methods for utilizing MS analysis, including MALDI-TOF MS and ESI-MS, to detect the presence and quantity of biomarker peptides in biological samples are known in the art. See, e.g., U.S. Patents 6,925,389; 6,989,100; and 6,890,763 for further guidance, each of which is incorporated herein by this reference.

[0047] Although some embodiments of the method only call for a qualitative assessment of the presence or absence of the one or more markers in the biological sample, other embodiments of the method call for a quantitative assessment of the amount of each of the one or more markers in the biological sample. Such quantitative assessments can be made, for example, using one of the above-mentioned methods, as will be understood by those of ordinary skill in the art.

[0048] In some embodiments of the method, a subject is identified as having colorectal cancer upon identifying in a biological sample obtained from the subject one or more markers selected from: a peptide of Table A. In some embodiments of the method, a subject is identified as having colorectal cancer upon identifying in a biological sample obtained from the subject one or more markers selected from: a peptide of Table B. In some embodiments of the method, a subject is identified as having colorectal cancer upon identifying in a biological sample obtained from the subject one or more markers selected from: a peptide of Table C. In some embodiments of the method, the identification of one or more of such markers in a biological sample obtained from the subject results in the subject being identified as having a risk of colorectal cancer.

[0049] In some embodiments of the method, it can be desirable to include a control sample that is analyzed concurrently with the biological sample, such that the results obtained from the biological sample can be compared to the results obtained from the control sample. Additionally, it is contemplated that standard curves can be provided, with which assay results for the biological sample can be compared. Such standard curves present levels of protein marker as a function of assay units, i.e., fluorescent signal intensity, if a fluorescent signal is used. Using samples taken from multiple donors, standard curves can be provided for control levels of the one or more markers in normal tissue.
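By way of a non-limiting illustration, reading a marker level off a standard curve can be sketched as a linear interpolation between calibration points. The function name `signal_to_concentration`, the calibration values, and the units (fluorescence units and ng/mL) are hypothetical and are not taken from the disclosure; an actual assay may call for a different curve model.

```python
from bisect import bisect_left

def signal_to_concentration(signal, curve):
    """Interpolate a marker concentration from an assay signal.

    `curve` is a list of (signal, concentration) calibration points sorted
    by strictly increasing signal. Signals outside the calibrated range
    raise ValueError rather than being extrapolated.
    """
    signals = [s for s, _ in curve]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside calibrated range")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return curve[i][1]
    # Linear interpolation between the two bracketing calibration points.
    (s0, c0), (s1, c1) = curve[i - 1], curve[i]
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)

# Hypothetical calibration points: (fluorescence units, ng/mL).
CURVE = [(100.0, 0.0), (300.0, 5.0), (700.0, 20.0)]
```

Refusing to extrapolate is a design choice of this sketch: a reading outside the calibrated range is reported as an error instead of being assigned a concentration.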

[0050] It is contemplated that the efficacy, accuracy, sensitivity, and/or specificity of the method can be enhanced by probing for multiple markers in the biological sample. For example, in some embodiments of the method, the biological sample can be probed for at least one marker selected from: the peptides of Table A, Table B, or Table C. For another example, in some embodiments, the biological sample can be probed for 2-5 markers selected from: the peptides of Table A, Table B, or Table C. For another example, in some embodiments, the biological sample can be probed for 6-10 markers selected from: the peptides of Table A, Table B, or Table C.

[0051] The analysis of markers can be carried out separately or simultaneously with additional markers within one test sample. For example, several markers can be combined into one test for efficient processing of a multiple of samples and for potentially providing greater diagnostic and/or prognostic accuracy. In addition, one of ordinary skill in the art would recognize the value of testing multiple samples (for example, at successive time points) from the same subject. Such testing of serial samples can allow the identification of changes in marker levels over time. Increases or decreases in marker levels, as well as the absence of change in marker levels, can provide useful information about the disease status that includes, but is not limited to: identifying the approximate time from onset of the event; identifying the presence and amount of salvageable tissue; identifying the appropriateness of drug therapies, including starting, stopping, and/or changing drug therapies; identifying the relative effectiveness of various therapies; and identifying the prediction and optimization of the subject's outcome, including risk of future events.

[0052] The analysis of markers can be carried out in a variety of physical formats as well. For example, microtiter plates or automation can be used to facilitate the processing of large numbers of test samples. Alternatively, single sample formats could be developed to facilitate immediate treatment and diagnosis in a timely fashion, for example, in ambulatory transport or emergency room settings.

[0053] Referring again to FIG. 1, in some embodiments the subject is identified as having colorectal cancer or a risk thereof if there is a measurable difference in the amount of the at least one peptide in the sample as compared to a control level 106. Conversely, when no probed marker is identified in the biological sample, the subject can be identified as not having colorectal cancer or a risk thereof, or as having a low risk of colorectal cancer.

[0054] As mentioned above, depending on the embodiment of the method, identification of the one or more markers can be a qualitative determination of the presence or absence of the markers, or it can be a quantitative determination of the concentration of the markers. In this regard, in some embodiments, the step of identifying the subject as having colorectal cancer or a risk thereof requires that certain threshold measurements are made, i.e., the levels of the one or more markers in the biological sample exceed a control level. In some embodiments of the method, the control level is any detectable level of the marker. In some embodiments of the method where a control sample is tested concurrently with the biological sample, the control level is the level of detection in the control sample. In some embodiments of the method, the control level is based upon and/or identified by a standard curve. In some embodiments of the method, the control level is a specifically identified concentration, or concentration range. As such, the control level can be chosen, within acceptable limits that will be apparent to those of ordinary skill in the art, based in part on the embodiment of the method being practiced and the desired specificity, etc.
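As a minimal sketch of the threshold comparison described above, each marker amount can be checked against its control level, with a control level of zero modelling the embodiment in which any detectable level counts. The function names and the numeric amounts are hypothetical; only the marker symbols SPARC, MMP7, and BGN appear in the tables herein.

```python
from typing import Dict, List, Optional

def exceeds_control(amount: Optional[float], control_level: float = 0.0) -> bool:
    """True when a detected marker amount exceeds the control level.

    `amount` is None when the marker was not detected in the sample.
    A control level of 0.0 models "any detectable level of the marker".
    """
    if amount is None:
        return False
    return amount > control_level

def flag_markers(sample: Dict[str, Optional[float]],
                 controls: Dict[str, float]) -> List[str]:
    """Return, in sorted order, the markers whose amount exceeds its control.

    Markers without an entry in `controls` fall back to a control of 0.0.
    """
    return sorted(
        marker for marker, amount in sample.items()
        if exceeds_control(amount, controls.get(marker, 0.0))
    )

# Hypothetical amounts in arbitrary assay units.
sample = {"SPARC": 12.0, "MMP7": 3.0, "BGN": None}
controls = {"SPARC": 8.0, "MMP7": 5.0}
flagged = flag_markers(sample, controls)
```

With these illustrative values only SPARC is flagged, since MMP7 falls below its control level and BGN was not detected.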

[0055] In some cases it can be desirable not only to identify a subject as having a colorectal cancer, but to identify a subject as having a particular type of colorectal cancer. As will be understood by those of ordinary skill in the art, information about the type of cancer can be useful for making a prognosis and choosing appropriate treatment, and in some instances less severe treatment can be chosen for the subject.

[0056] It is contemplated that particular markers can be associated with particular types of colorectal cancer that differ with respect to their activation of cell cycle proliferation, tissue disruption, and angiogenesis gene programs. In this regard, the biomarkers set forth in Tables A, B, and C, above, can be divided into groups or modules, as set forth in Tables H-O, below. The relative activity of genes in each of these modules indicates tumor state and predicts behavior and can be used to guide the selection of therapies and to better predict outcome than simple single measures. For example, Tables H and I include markers that are the products of genes associated with tissue disruption and angiogenesis.

Table H Table I

COL12A1 TDGF1

COL1A2 WISP1

[0057] For another example, Tables J and K include markers that are the products of genes associated with cell cycle progression, replication, cancer, tumor morphology, and cellular movement, and genes highly correlated with transformation.

Table J Table K

COL1A1 MMP9

ENC1 S100A9

FN1 SERPINE2

PTGS2 SPARC

TGFBI TGIF2

[0058] For another example, Tables L and M include markers that are the products of genes highly associated with the disruption of basement membranes, invasion and cell cycle progression, as well as altered transcriptional control.

Table L Table M

BGN CXCL6

MMP7 KIAA0963 (SBNO2)

LYZ

[0059] For another example, Tables N and O include markers that are associated with advanced stages of colorectal cancer.

Table N Table O

AZGP1 COL8A1

CHI3L1 CTHRC1

COL10A1 CXCL5

COL11A1 MMP11

COL1A2, alternate isotype (identified using isotype-specific antibody) SULF1

COL5A2

COMP

CXCL2

CXCL3

FN1, alternate isotype (identified using isotype-specific antibody)

GDF15

IL8

INHBA

MMP12

PTGS2, alternate isotype (identified using isotype-specific antibody)

SFRP4

SPP1

THBS2

[0013] In some embodiments, a method for diagnosing a type of colorectal cancer in a subject comprises providing a biological sample from the subject; determining an amount in the sample of at least one peptide marker from at least one group of markers, wherein each group of markers is associated with a different type of colorectal cancer; comparing the amount of the at least one peptide marker in the sample, if present, to a control level of the at least one peptide marker, wherein the subject is diagnosed as having the type of colorectal cancer associated with a group of markers of which the at least one peptide marker is a member if there is a measurable difference in the amount of the at least one peptide in the sample as compared to the control level.

[0014] In some embodiments, the at least one group of markers is selected from the group of markers set forth in Tables H and I; the group of markers set forth in Tables J and K; the group of markers set forth in Tables L and M; or the group of markers set forth in Tables N and O. In some embodiments, the at least one group of markers is selected from the group of markers set forth in Table I; the group of markers set forth in Table K; the group of markers set forth in Table M; or the group of markers set forth in Table O. In some embodiments, the at least one group of markers is selected from the group of markers set forth in Table H; the group of markers set forth in Table J; the group of markers set forth in Table L; or the group of markers set forth in Table N.

[0015] In some embodiments, the method includes determining the amount in the sample of at least one peptide marker from at least two groups of markers. In some embodiments, the method includes determining the amount in the sample of at least one peptide marker from at least three groups of markers. In some embodiments, the method includes determining the amount in the sample of at least one peptide marker from four groups of markers. In some embodiments, the groups can be selected from: the group of Tables H and I; the group of Tables J and K; the group of Tables L and M; and the group of Tables N and O. In some embodiments, the groups can be selected from: the group of Table H; the group of Table J; the group of Table L; and the group of Table N. In some embodiments, the groups can be selected from: the group of Table I; the group of Table K; the group of Table M; and the group of Table O.

[0016] In some embodiments, an amount of at least one peptide marker selected from a group of markers set forth in Table H, Table I, Table J, Table K, Table L, Table M, Table N, or Table O is determined, and if there is a measurable difference in the amount of the at least one peptide marker from the group in the sample as compared to a control level, then the subject is identified as having a type of colorectal cancer associated with the group. For example, in some embodiments, an amount of at least one peptide marker selected from a group of markers set forth in Tables F and G is determined, and if there is a measurable difference in the amount of the at least one peptide marker from the group in the sample as compared to a control level, then the subject is identified as having a particularly malignant form of colorectal cancer that will require particular modalities of treatment to be designed based on knowledge of the implicated activated pathways, as will be understood by one of ordinary skill in the art.

[0017] In some embodiments, an amount of at least one peptide marker selected from a group of markers set forth in Tables H and I; the group of Tables J and K; the group of Tables L and M; and the group of Tables N and O is determined, and if there is a measurable difference in the amount of the at least one peptide marker from the group in the sample as compared to a control level, then the subject is identified as having a type of colorectal cancer associated with the group.

[0018] In some embodiments, an amount of at least one peptide marker selected from the group of markers set forth in Table H, Table J, Table L, or Table N is determined, and if there is a measurable difference in the amount of the at least one peptide marker from the group in the sample as compared to a control level, then the subject is identified as having a type of colorectal cancer associated with the group.

[0019] In some embodiments, an amount of at least one peptide marker selected from a group of markers set forth in Table I, Table K, Table M, or Table O is determined, and if there is a measurable difference in the amount of the at least one peptide marker from the group in the sample as compared to a control level, then the subject is identified as having a type of colorectal cancer associated with the group.
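The group-based typing described in the preceding paragraphs can be illustrated, without limitation, as a lookup from markers showing a measurable difference to the marker groups that contain them. The gene symbols are those of Tables H through O as read from this text; the group labels, the function name, and the omission of the alternate-isotype entries of Table N are simplifying assumptions of this sketch.

```python
from typing import Dict, Iterable, List, Set

# Paired tables treated as one group each, as in the "Tables H and I" embodiments.
GROUPS: Dict[str, Set[str]] = {
    "H/I": {"COL12A1", "COL1A2", "TDGF1", "WISP1"},
    "J/K": {"COL1A1", "ENC1", "FN1", "PTGS2", "TGFBI",
            "MMP9", "S100A9", "SERPINE2", "SPARC", "TGIF2"},
    "L/M": {"BGN", "MMP7", "LYZ", "CXCL6", "SBNO2"},
    # Alternate-isotype entries are left out of this illustration.
    "N/O": {"AZGP1", "CHI3L1", "COL10A1", "COL11A1", "COL5A2", "COMP",
            "CXCL2", "CXCL3", "GDF15", "IL8", "INHBA", "MMP12", "SFRP4",
            "SPP1", "THBS2", "COL8A1", "CTHRC1", "CXCL5", "MMP11", "SULF1"},
}

def groups_with_difference(differing_markers: Iterable[str]) -> List[str]:
    """Return the groups containing at least one marker that differs from control."""
    differing = set(differing_markers)
    return sorted(name for name, members in GROUPS.items() if members & differing)

implicated = groups_with_difference(["SPARC", "CXCL6"])
```

Here SPARC implicates the group of Tables J and K and CXCL6 the group of Tables L and M; a marker outside all groups implicates none.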

[0020] Clinical cancer prognosis is also an area of great concern and interest. It is useful to know the aggressiveness, stage, and/or type of the cancer cells and the likelihood of tumor recurrence in order to plan the most effective treatment. If a more accurate prognosis can be made, appropriate treatment, and in some instances less severe treatment, can be chosen for the subject.

[0021] As such, "making a diagnosis", as used herein, is further inclusive of making a prognosis, which can provide for selecting an appropriate treatment, or monitoring a current treatment and potentially changing the treatment, based on the measure or identity of marker peptides. In some embodiments a method for determining whether to initiate or continue treatment of a colorectal cancer in a subject comprises: providing a series of biological samples over a time period from the subject; analyzing the series of biological samples to determine an amount in each of the biological samples of at least one peptide marker; and comparing any measurable change in the amounts of the at least one peptide marker in each of the biological samples to thereby determine whether to initiate or continue the treatment.

[0022] As used herein, the terms treatment or treating relate to any treatment of a condition of interest, including but not limited to prophylactic treatment, i.e., prophylaxis, and therapeutic treatment. As such, the terms treatment or treating include, but are not limited to: preventing or arresting the development of a cancer; inhibiting the progression of a cancer; reducing the severity of a cancer; ameliorating or relieving symptoms associated with a cancer; and causing a regression of cancer or one or more of the symptoms associated with the cancer.

[0023] In some embodiments of the presently-disclosed subject matter, multiple determinations of the peptide markers over time can be made to facilitate diagnosis and/or prognosis. A temporal change in the marker can be used to monitor the progression of the cancer and/or efficacy of appropriate therapies directed against the cancer. In such an embodiment, for example, one might expect to see a decrease in the amount of one or more of the marker peptides in a biological sample over time during the course of effective treatment.
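A temporal change of this kind can be sketched, for illustration only, by comparing the first and last amounts in a time-ordered series of samples. The function name `marker_trend`, the `tolerance` parameter, and the example values are hypothetical; the disclosure does not prescribe how a measurable change is to be computed.

```python
from typing import Sequence

def marker_trend(amounts: Sequence[float], tolerance: float = 0.0) -> str:
    """Classify the change in a marker between the first and last sample.

    `amounts` must be ordered by sampling time. A change whose magnitude
    does not exceed `tolerance` is reported as "no change".
    """
    if len(amounts) < 2:
        raise ValueError("at least two serial samples are needed")
    delta = amounts[-1] - amounts[0]
    if delta > tolerance:
        return "increase"
    if delta < -tolerance:
        return "decrease"
    return "no change"

# Hypothetical marker amounts from three successive samples during treatment.
trend = marker_trend([10.0, 7.5, 4.0])
```

A "decrease" result for this series is the pattern that the paragraph above associates with effective treatment; the tolerance lets a user of the sketch ignore changes smaller than assay variability.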

[0024] In some embodiments of the presently-disclosed subject matter, a system for diagnosing colorectal cancer in a subject is provided, or a system for determining whether to initiate or continue treatment of colorectal cancer in a subject is provided. Such systems can be provided, for example, as commercial kits that can be used to test a biological sample, or series of biological samples, from a subject. In some embodiments, the system includes probes for selectively binding each of one or more peptide markers; and means for detecting the binding of said probes to said one or more markers. The system can also include certain samples for use as controls. The system can further include one or more standard curves providing levels of markers as a function of assay units.

[0025] In some embodiments, a system for the analysis of biomarkers is provided that comprises antibodies having specificity for one or more markers associated with colorectal cancer, including a marker as set forth in Table A, B, or C. Such a system can comprise devices and reagents for the analysis of at least one test sample. The system can further comprise instructions for using the system and conducting the analysis. Optionally the systems can contain one or more reagents or devices for converting a marker level to a diagnosis or prognosis of the subject.

[0026] In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers selected from the peptides of Table A. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers selected from the peptides of Table B. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers selected from the peptides of Table C.

[0027] In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers selected from a group of peptides set forth in Tables H and I, Tables J and K, Tables L and M, or Tables N and O. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers, wherein at least one peptide marker is selected from at least two of the following groups of peptides: Tables H and I, Tables J and K, Tables L and M, or Tables N and O. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers, wherein at least one peptide marker is selected from at least three of the following groups of peptides: Tables H and I, Tables J and K, Tables L and M, or Tables N and O. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers, wherein at least one peptide marker is selected from four of the following groups of peptides: Tables H and I, Tables J and K, Tables L and M, or Tables N and O.

[0028] In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers selected from a group of peptides set forth in Table H, Table J, Table L, or Table N. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers, wherein at least one peptide marker is selected from at least two of the following groups of peptides: Table H, Table J, Table L, or Table N. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers, wherein at least one peptide marker is selected from at least three of the following groups of peptides: Table H, Table J, Table L, or Table N. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers, wherein at least one peptide marker is selected from four of the following groups of peptides: Table H, Table J, Table L, or Table N.

[0029] In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers selected from a group of peptides set forth in Table I, Table K, Table M, and Table O. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers, wherein at least one peptide marker is selected from at least two of the following groups of peptides: Table I, Table K, Table M, and Table O. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers, wherein at least one peptide marker is selected from at least three of the following groups of peptides: Table I, Table K, Table M, and Table O. In some embodiments, a system is provided including probes for selectively binding each of one or more peptide markers, wherein at least one peptide marker is selected from four of the following groups of peptides: Table I, Table K, Table M, and Table O.

[0030] As used herein, the term "subject" includes both human and animal subjects. Thus, veterinary therapeutic uses are provided in accordance with the presently-disclosed subject matter. As such, the presently-disclosed subject matter provides for the diagnosis of mammals such as humans, as well as those mammals of importance due to being endangered, such as Siberian tigers; of economic importance, such as animals raised on farms for consumption by humans; and/or animals of social importance to humans, such as animals kept as pets or in zoos. Examples of such animals include but are not limited to: carnivores such as cats and dogs; swine, including pigs, hogs, and wild boars; ruminants and/or ungulates such as cattle, oxen, sheep, giraffes, deer, goats, bison, and camels; and horses. Also provided is the treatment of birds, including the treatment of those kinds of birds that are endangered and/or kept in zoos, as well as fowl, and more particularly domesticated fowl, i.e., poultry, such as turkeys, chickens, ducks, geese, guinea fowl, and the like, as they are also of economic importance to humans. Thus, also provided is the treatment of livestock, including, but not limited to, domesticated swine, ruminants, ungulates, horses (including race horses), poultry, and the like.

[0031] The presently-disclosed subject matter is further illustrated by the following specific but non-limiting examples. The following examples may include prophetic examples, and may also include compilations of data that are representative of data gathered at various times during the course of development and experimentation related to the presently-disclosed subject matter.

EXAMPLES

[0060] Example 1

[0061] Identification of Markers for Colorectal Cancer.

[0062] To identify new markers, the present inventors developed a gene expression database of genome-wide expression derived from about 350 human colon cancer and normal specimens. By evaluating about 54,000 genetic elements for each tumor or normal specimen, the present inventors were able to develop a list of genes that were substantially over-expressed in tumor samples versus normal controls. A genetic approach, described below, was then used to identify the genes that might produce secreted proteins. This approach ultimately identified potential biomarker proteins that could be used as serum or plasma markers. These markers could have utility for the diagnosis or prognosis of a cancer, or potentially for predicting its response to a particular therapy. The markers could also be useful to develop diagnostic imaging agents to image a cancer's location, size, or extent of spread.

[0063] Secreted protein predictions.

[0064] Representative Public IDs (GenBank Nucleotide IDs) matching Affymetrix probe IDs from a U133 Plus 2.0 GeneChip (Affymetrix, Santa Clara, CA) were retrieved from the Affymetrix website. The GenBank Nucleotide IDs were then translated into GenBank Protein IDs using the Batch Entrez function at http://www.ncbi.nlm.nih.gov/entrez/batchentrez.cgi. The Protein IDs were translated into batches of individual FASTA sequences. These FASTA sequences were run through the publicly available SIGCLEAVE database at http://bioweb.pasteur.fr/seqanal/interfaces/sigcleave.html. The output from SIGCLEAVE was then sorted, giving two lists of proteins: 1) proteins with one or more predicted signal cleavage sites, and 2) proteins with no predicted signal cleavage sites. The sorting was carried out using a Perl script. The GenBank Protein IDs were translated back to GenBank Nucleotide IDs and then Affymetrix probe IDs. This provided a list of potentially secreted proteins with genes represented on the Affymetrix U133 Plus 2.0 GeneChip.
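The sorting step described above can be sketched as a partition of protein IDs by their number of predicted signal cleavage sites. The sketch is written in Python rather than Perl, assumes the SIGCLEAVE output has already been parsed into a mapping from protein ID to site count (the output format itself is not reproduced here), and uses placeholder protein IDs that are hypothetical.

```python
from typing import Dict, List, Tuple

def partition_by_signal_site(site_counts: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Split protein IDs into (with predicted site, without predicted site).

    `site_counts` maps a protein ID to the number of signal cleavage sites
    predicted for it. Both returned lists are sorted for reproducibility.
    """
    with_site = sorted(pid for pid, n in site_counts.items() if n >= 1)
    without_site = sorted(pid for pid, n in site_counts.items() if n == 0)
    return with_site, without_site

# Placeholder protein IDs and site counts, for illustration only.
counts = {"PROT_A": 2, "PROT_B": 0, "PROT_C": 1}
potentially_secreted, not_secreted = partition_by_signal_site(counts)
```

The first list corresponds to the potentially secreted proteins that are carried forward to the comparison with the expression data.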

[0065] This list of Affymetrix probe IDs was then compared to two different lists.

[0066] The first list (Table P) was composed of genes 4 times or more increased in expression in colon tumors relative to normal colonic mucosa stratified by the four Dukes' stages individually. The second list (Table Q) was composed of genes 4 times or more increased between the mean expression of normal colon tissue and the mean expression of colon tumors of at least one of the four Dukes' stages individually. The first comparison produced a list of genes that are 4 times or more increased from normal in all four Dukes stages, which genes have products that are potentially secreted. The second comparison produced a list of genes that is 4 times or more increased from normal to at least one of the four Dukes stages and potentially secreted.

Table P

[0067] Table R includes a list of genes, the expression of which is 4 times or more increased from normal in all four Dukes stages, which genes have products that are potentially secreted:

[0068] Table S includes a list of genes, the expression of which is 4 times or more increased from normal in three of four Dukes stages, which genes have products that are potentially secreted:

Table S

[0069] Table T includes a list of genes, the expression of which is 4 times or more increased from normal in two of four Dukes stages, which genes have products that are potentially secreted:

[0070] Table U includes a list of genes, the expression of which is 4 times or more increased from normal in one of four Dukes stages, which genes have products that are potentially secreted:
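The binning of genes by the number of Dukes stages showing a 4-fold or greater increase can be sketched as follows. The sketch assumes linear-scale mean expression values per stage and a precomputed set of potentially secreted genes; the gene names, the expression values, and the function names are hypothetical, and the correspondence of four, three, two, and one stages to Tables R, S, T, and U follows the paragraphs above.

```python
from typing import Dict, List, Set, Tuple

STAGES = ("A", "B", "C", "D")

def stages_overexpressed(normal_mean: float,
                         stage_means: Dict[str, float],
                         fold: float = 4.0) -> List[str]:
    """Return the Dukes stages whose mean expression is >= fold x normal."""
    return [s for s in STAGES if stage_means[s] >= fold * normal_mean]

def bin_by_stage_count(expression: Dict[str, Tuple[float, Dict[str, float]]],
                       secreted: Set[str],
                       fold: float = 4.0) -> Dict[int, List[str]]:
    """Group potentially secreted genes by how many stages pass the threshold.

    `expression` maps gene -> (normal mean, {stage: tumor mean}).
    Genes passing in no stage, or not in `secreted`, are dropped.
    """
    bins: Dict[int, List[str]] = {n: [] for n in range(1, 5)}
    for gene in sorted(expression):
        if gene not in secreted:
            continue
        normal_mean, stage_means = expression[gene]
        n = len(stages_overexpressed(normal_mean, stage_means, fold))
        if n:
            bins[n].append(gene)
    return bins

# Hypothetical expression values (arbitrary linear units).
expression = {
    "GENE1": (10.0, {"A": 40.0, "B": 50.0, "C": 45.0, "D": 60.0}),
    "GENE2": (10.0, {"A": 10.0, "B": 41.0, "C": 12.0, "D": 9.0}),
    "GENE3": (10.0, {"A": 80.0, "B": 80.0, "C": 80.0, "D": 80.0}),
}
bins = bin_by_stage_count(expression, secreted={"GENE1", "GENE2"})
```

In this illustration GENE1 falls in the four-stage bin, GENE2 in the one-stage bin, and GENE3 is excluded because it is not in the potentially secreted set.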

[0071] Example 2

[0072] The colon is composed of a dynamic and self-renewing epithelium that turns over every 3-5 days. Without wishing to be bound by theory, it is thought that at the base of the crypt, variable numbers (between one and 16) of slowly dividing, stationary, pluripotent stem cells give rise to more rapidly proliferating, transient amplifying cells. These cells differentiate chiefly into post-mitotic columnar colonocytes, mucin-secreting goblet cells and enteroendocrine cells as they migrate from the crypt base to the surface where they are sloughed into the lumen [1].

[0073] Several signaling pathways, including Wnt, Tgfβ, Bmp, Hedgehog and Notch, play roles in the control of proliferation and differentiation of the developing and adult colon [2]. Their perturbation via mutation or epigenetic modification occurs in human colorectal cancer (CRC) and doing so via genetic engineering confers risk for neoplasia in mouse models. Moreover, tumor cell de-differentiation correlates with key tumor features such as tumor progression rates, invasiveness, drug resistance and metastatic potential [3-5].

[0074] A variety of scientific and organizational obstacles tend to interfere with the ability to integrate and compare detailed characterizations of human cancer with corresponding molecular and clinical characterizations from genetically engineered mouse models of normal, cancer, and other disease processes. Because of this, integrated views of the molecular bases of cancer risk, development, and progression are lacking. To approach these problems and learn more about the opportunities potentially afforded by large-scale integration and cross-model comparisons, a consortium-based combined analysis was undertaken of a variety of mouse models of colorectal cancer (CRC) that have been developed and used individually (reviewed in [6, 7]), and the results were compared with human CRC, in an effort to develop an increased understanding of fundamental aspects of the pathogenesis of CRC.

[0075] In the studies described herein, gene expression patterns of human colorectal cancer (CRC) and mouse models of colon cancer were compared to those of embryonic mouse colon development using colon samples from embryonic days 13.5-18.5.

[0076] Mouse Models of Colorectal Cancer (CRC)

[0077] Four mouse models of CRC were used in the studies described herein: (1) Apc Min/+ mouse model (multiple intestinal neoplasia), (2) azoxymethane (AOM) carcinogen model, (3) Tgfb1-/-; Rag2-/- mouse model, and (4) Smad3-/- mouse model.

[0078] The Apc Min/+ (multiple intestinal neoplasia) mouse model harbors a germline mutation in the Apc tumor suppressor gene and exhibits multiple tumors in the small intestine and colon [8]. A major function of APC is to regulate the canonical WNT signaling pathway as part of a β-catenin degradation complex. Loss of APC results in a failure to degrade β-catenin, which instead enters the nucleus to act as a transcriptional co-activator with the lymphoid enhancer factor/T-cell factor (LEF/TCF) family of transcription factors [9]. The localization of β-catenin within the nucleus indicates activated canonical WNT signaling. In addition to germline APC mutations that occur in persons with familial adenomatous polyposis coli (FAP) and Apc Min/+ mice, loss of functional APC and activation of canonical WNT signaling occurs in more than 80% of human sporadic CRCs [10].

[0079] Similar to the Apc Min/+ model, tumors in the azoxymethane (AOM) carcinogen model, which occur predominantly in the colon [11], have signaling alterations marked by activated canonical WNT signaling.

[0080] Two other mouse models that carry different genetic alterations leading to colon tumor formation are based on the observation that TGFβ type II receptor (TGFBR2) gene mutations are present in up to 30% of sporadic CRC and in more than 90% of tumors that occur in patients with the DNA mismatch repair deficiency associated with hereditary non-polyposis colon cancer (HNPCC) [12].

[0081] In the mouse, a deficiency of TGFβ1 combined with an absence of T-cells (Tgfb1-/-; Rag2-/-) results in a high occurrence of colon cancer [13]. These mice develop adenomas by two months of age, and adenocarcinomas, often mucinous, by 3-6 months of age. Immunohistochemical analyses of these tumors are negative for nuclear β-catenin, suggesting that TGFβ1 does not suppress tumors via a canonical WNT signaling-dependent pathway.

[0082] The SMAD-family proteins are critical downstream transcription regulators activated by TGFB signaling, in part through the TGFB type II receptor. Smad3-/- mice also develop intestinal lesions that include colon adenomas and adenocarcinomas by six months of age [14].

[0083] To identify transcriptional programs that are activated or repressed in different colon tumor models, gene expression profiles of (1) 100 human CRCs and 39 colonic tumors from the four mouse models of colon cancer were compared to those of (2) mouse embryonic colon and mouse and human adult colon. The results of these analyses demonstrate that tumors from the mouse models extensively adopt embryonic gene expression patterns, irrespective of the initiating mutation.

[0084] Although two of the mouse tumor subtypes were distinguishable by their relative shifts towards early or later stages of embryonic gene expression (driven principally by localization of β-catenin to the nucleus versus the plasma membrane), Myc was over-expressed in tumors from all four tumor models. Further, by mapping mouse genes to their corresponding human orthologs, it was shown that human CRCs share in the broad over-expression of genes characteristic of colon embryogenesis and the up-regulation of MYC, consistent with a fundamental relationship between embryogenesis and tumorigenesis. Additional large-scale similarities could also be found at the level of developmental genes that were not activated in either mouse or human tumors. In addition, there were transcriptional modules consistently activated and repressed in human CRC that were not found in the mouse models. Taken together, this cross-species, cross-model analytical approach, filtered through the lens of embryonic colon development, provides an integrated view of gene expression that implicates the adoption of a broad program that encompasses embryonic activation, developmental arrest, and failed differentiation as fundamental characteristics of human CRC.

[0085] Example: Methods - Mouse models, human CRC patients, and tumor collection.

[0086] Mouse tumors.

[0087] All tumors were isolated as spontaneously occurring lesions. Tumors from Apc Min/+ [56], Smad3-/- [57], and Tgfb1-/-; Rag2-/- mice were collected at three to nine months of age, depending on the model (for a review, see [6]). The only exceptions were two Apc Min/+ tumors (UW_3_2778 and UW_6_2748) that were collected at 13 and 14 months and the three Tgfb1-/-; Rag2-/- tumors, all five of which had histological features of locally invasive carcinoma [7].

[0088] Three-to-four-month-old mice from various AXB recombinant inbred lines were treated with AOM doses chosen for enhancement of inter-strain differences in susceptibility [11]. Mice were given four weekly i.p. injections of 10 mg AOM per kg body weight, and tumors were collected six months after the first injection. Animals were euthanized with CO2, and colons were removed, flushed with 1X PBS, and laid out on Whatman 3MM paper. A summary of the mouse strains, mutant alleles and source laboratories is presented in Table V.

[0089] All tumors were obtained from the colon only; the particular segment is indicated in the sample information deposited in the GEO database (www.ncbi.nlm.nih.gov/geo/, GSE5261). The majority of Tgfb1-/-; Rag2-/- and Smad3-/- tumors occur in the cecum and proximal colon, and all samples isolated for characterization were obtained from there. In contrast, tumors isolated from Apc Min/+ and AOM mice occurred predominantly in the mid- and distal colon. A small portion of the tumor was placed in formalin for histology, with the remainder finely dissected into RNAlater (Ambion) and stored at -20 °C. Normal adult colon RNA for reference was obtained from whole colon samples harvested from ten 8-week-old C57BL/6 male mice. The tissue was lysed in Trizol Reagent and homogenized. Total RNA was purified using a Qiagen kit.

[0090] Human samples - collection/biopsies, regulatory aspects, compliance and informed consents.

[0091] Sample collection protocol and analyses were performed as described in Kwong, et al, Genomics (2005) [36]. Information collected with the samples for this study includes TNM and Dukes staging/presentation criteria, pathological diagnosis and differentiation criteria.

[0092] RNA isolation.

[0093] All RNA samples were purified using Trizol Reagent (Invitrogen Systems, Inc.) from finely dissected tumors and were subjected to quality control screening using the Agilent Bio Analyzer 2100.

[0094] Example: Methods - Microarray procedures and data analysis

[0095] Mouse cDNA arrays.

[0096] Mouse tumors were analyzed on Vanderbilt University Microarray Core (VUMC)-printed 20K mouse cDNA arrays, composed principally of PCR products derived from three sources: the 15K National Institute of Aging mouse cDNA library; the Research Genetics mouse 5K set; and an additional set of cDNAs mapped to RefSeq transcripts. Labeling, hybridization, scanning, and quantitative evaluation of these two-color channel arrays were performed according to VUMC protocols (http://array.mc.vanderbilt.edu/microarray/spotted.vmsr) using a Universal Reference standard (embryonic day 17.5 (E17.5) whole fetal mouse RNA). Arrays were analyzed by GenePix version 3.0, flagged and filtered for unreliable measurements, with dye channel ratios corrected using Lowess and dye-specific correction normalization as described in Park, et al. Genesis (2005) [15].

[0097] Human Affymetrix oligonucleotide arrays.

[0098] Human RNA samples were labeled for hybridization to Affymetrix HG-U133plus2 microarrays using the Affymetrix-recommended standard labeling protocol (Small-scale labeling protocol version 2.0 with 0.5 μg of total RNA; Affymetrix Technical Bulletin). Microarrays were scanned with MicroarraySuite version 5.0 to generate "CEL" files that were processed using the RMA algorithm as implemented by Bioconductor [15].

[0099] Analysis strategy.

[00100] The four different mouse models of CRC were compared for model-specific differences, then compared to mouse colon development stages, and then to human CRC samples (FIG. 2). Colon tumors from four etiologically distinct mouse models of CRC were subjected to microarray gene expression profiling. The gene expression profiles from the different mouse model tumors were compared and contrasted to each other, as well as to those from embryonic mouse colon development and 100 human CRCs. The mouse tumor sample array data are comprised of Lowess-normalized Cy3:Cy5 labeling ratios of each individual tumor sample versus a Universal E17.5 whole fetal mouse reference RNA (See description found in the NCBI GEO database under series accession numbers [GSE5261], which is incorporated herein in its entirety by this reference; See also supplementary information, which is available at http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE5204, which is incorporated herein in its entirety by this reference, Query DataSets for GSE5204, Title-Mouse models of human colon cancer and mouse colon development, Organism(s) http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=10090, Type-

[00101] Tumors from four murine models of colorectal cancer and normal mouse colon samples at different developmental stages, Summary - experiments to identify transcriptional patterns across tumors from colorectal cancer murine models and normal mouse colon samples at different developmental stages, Experiment description - Colorectal cancer (CRC) results from multiple genetic and epigenetic events that produce variable histologies and clinical outcomes. To identify gene regulatory programs that underlie colon tumorigenesis, gene expression was profiled in 39 mouse colon tumors from four independent mouse models and compared to mouse colon embryonic development, as well as with 100 human colon carcinomas. There was a striking recapitulation of embryonic patterns of gene expression in both mouse and human colon tumors. All four of the mouse colon tumor models exhibited large-scale activation of embryonic gene expression signatures. The two nuclear beta-catenin-positive mouse tumors (azoxymethane-treated [AOM] and ApcMin/+), exhibited strong activation of genes characteristic of those expressed in the earliest embryonic stages, while tumors from two other models (Smad3-/- and Tgfb1-/- x Rag2-/-) exhibited lower activation of early stage-specific genes but substantial expression of general embryonic colon genes. Human colon cancer cases over-expressed genes characteristic of both early and late embryonic stages. Examining tumor gene expression through the lens of development has revealed an extensive network of therapeutic targets for cancer control. Overall design - Four colorectal tumor models were compared using the approach of finding model-specific differences with gene expression levels referenced to the median gene expression value across all models. A second complementary strategy compared the gene expression levels of the murine tumors to those of mouse normal adult colon).

[00102] The first approach to referencing was to compare normalized ratios across the tumor series. To do this, for each gene, the Lowess-corrected ratio for each probe element (sample vs. E17.5 whole fetal mouse reference) was divided by the median ratio for that probe across the entire tumor sample series. This is termed the median-per-tumor expression ratio and was useful for identifying, clustering and visualizing differences that occur between the different tumor samples. Mouse expression data had been collected for normal E13.5-E18.5 colon samples from inbred C57BL/6J and outbred CD-1 mice [15] using the identical E17.5 whole fetal mouse reference, allowing the data to be combined directly.
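The median referencing described above can be sketched as follows. This is a minimal illustration, assuming that each probe has a complete set of positive Lowess-corrected ratios (no missing or flagged values); the function name and the dictionary layout are hypothetical and are not taken from the software used in the study.

```python
from statistics import median

def median_referenced_ratios(ratios_by_probe):
    """Divide each probe's ratios by that probe's median across all tumor samples.

    ratios_by_probe: {probe_id: [ratio_sample_1, ratio_sample_2, ...]}
    (Lowess-corrected sample vs. reference ratios, assumed positive).
    """
    referenced = {}
    for probe, ratios in ratios_by_probe.items():
        probe_median = median(ratios)
        referenced[probe] = [r / probe_median for r in ratios]
    return referenced
```

After this step, a value of 1.0 marks a sample at the series median for that probe, so values above or below 1.0 indicate relative over- or under-expression within the tumor series.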

[00103] Differential expression profiles in the tumors were combined with relative developmental gene expression levels by direct comparisons of ratios determined within each experimental series. Initial comparisons were made between median-normalized tumor data and gene expression levels observed in the E13.5-E18.5 and adult (8 week post-natal) colon samples that were referenced to either E13.5 samples or to the adult colon. The latter approach subsequently allowed for the broadest comparison of mouse and human data using gene ortholog mapping. Correlated phenomena could be observed from any of the different referencing strategies.

[00104] Inter-organism gene ortholog and inter-platform comparison strategy.

[00105] Pairs of human and mouse ortholog genes (12,693) were curated using Mouse Genome Informatics (MGI - The Jackson Laboratory - http://www.informatics.jax.org/) and the National Center for Biotechnology Information - Homologene (NCBI - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=homologene) databases. Individual microarray elements or features were mapped to these. The concatenated human and mouse RefSeq IDs were used as the composite ID for the orthologous gene pair in the ortholog genome definition. NIA/Research Genetics mouse cDNAs were mapped to human orthologs using a variety of resources, usually via the Stanford Online Universal Reference resource (http://source.stanford.edu/). Gene-transcript assignments were made unique by choosing the longest corresponding transcript.

[00106] To map the Affymetrix human and mouse array data into the ortholog genome, a sequence matching approach was used. First, human and mouse transcript sequences were obtained from RefSeq (ftp://ftp.ncbi.nih.gov/refseq/) and probe sequences from the manufacturer's website (http://www.affymetrix.com/support/technical/byproduct.affx?cat=arrays). Next, all perfect probe-transcript pairs were computed. Probes that matched multiple gene symbols were excluded, but probes that matched multiple transcripts were accepted. Probe sets were assigned to represent a given transcript if at least 50% of the perfect match probes of the probe set matched to that transcript. The newly assigned transcript identifiers were then used to map probe sets to ortholog genes.
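The probe-set assignment rule can be illustrated with a short sketch. It assumes that the perfect probe-transcript matches have already been computed, and it takes the 50% threshold relative to all perfect match probes in the probe set (one possible reading of the rule). The function name, argument names and data layout are hypothetical.

```python
def assign_probe_sets(probe_sets, probe_hits, transcript_gene, min_fraction=0.5):
    """Assign each probe set to the transcripts matched by enough of its probes.

    probe_sets:      {probe_set_id: [probe_id, ...]} perfect match probes per set
    probe_hits:      {probe_id: set of transcript ids matched perfectly}
    transcript_gene: {transcript_id: gene_symbol}
    """
    assignments = {}
    for probe_set_id, probes in probe_sets.items():
        counts = {}
        for probe in probes:
            transcripts = probe_hits.get(probe, set())
            genes = {transcript_gene[t] for t in transcripts}
            if len(genes) > 1:
                continue  # probe matches multiple gene symbols: excluded
            for t in transcripts:  # multiple transcripts of one gene are accepted
                counts[t] = counts.get(t, 0) + 1
        threshold = min_fraction * len(probes)
        assignments[probe_set_id] = sorted(
            t for t, n in counts.items() if n >= threshold)
    return assignments
```

In this sketch a probe set may be assigned to more than one transcript of the same gene, which is consistent with accepting probes that match multiple transcripts.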

[00107] Since some transcripts have multiple probe-set representations on both the Affymetrix and cDNA microarrays to one ortholog identifier, an ad hoc strategy was employed to use the average of those probe sets or cDNAs that exhibited consistent regulation across a sample series. In such cases, the signals of the regulated probe sets that were interpreted as being in agreement were averaged and assigned to the corresponding ortholog. Probe sets or cDNAs were excluded if it was known that they corresponded to non-transcript genomic sequence as tested using BLAT at http://genome.ucsc.edu/.

[00108] Mouse-human RefSeq gene ortholog assignments can be found at http://genometrafac.cchmc.org/. All ortholog assignments and cross-species mapping annotations were incorporated into annotations associated with the Affymetrix HG-U133 plus2.0 genome. Gene expression ratios obtained for the mouse samples were then represented as expression values within the human platform for all of the probe sets that mapped to the corresponding mouse gene ortholog. Data for the primary human sample series, as well as the combined mouse-human data sets, are available at http://genet.cchmc.org/ in the HG-U133 genome under the KaiserEtAl_2006 folders ("guest" login, all cross-platform ortholog gene identifiers are contained as annotation fields within the HG-U133 genome table).

[00109] Statistical and data visualization approaches.

[00110] Most normalization, expression-level referencing, statistical comparisons, and data visualization were performed using GeneSpring v7.0 (Silicon Genetics-Agilent). Fisher's exact test was performed online at http://www.matforsk.no/ola/fisher.htm. To identify differentially expressed features, GeneSpring's Wilcoxon-Mann-Whitney test was applied for comparisons between two classes and the Kruskal-Wallis test for comparisons among three or more classes. For three or more classes, the initial non-parametric test was followed by the Student-Newman-Keuls post-hoc test. Results from the primary analyses were corrected for multiple testing effects by applying Benjamini and Hochberg false discovery rate correction (FDR, [58]). In general, due to the referencing strategies, good platform technical performances, and moderately low within-group biological variation of gene expression, stringent cutoffs could be used, i.e., the FDR level of significance was set between FDR < 5×10^-5 and FDR < 5×10^-4. K-means clustering was performed using the GeneSpring K-means tool and the Pearson correlation similarity measure.
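For illustration, the Benjamini and Hochberg correction can be written in a few lines. The sketch below is a generic implementation of the standard step-up procedure and is not the GeneSpring implementation used in the analyses; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A feature would then be called differentially expressed when its adjusted value falls below the chosen cutoff, for example 5e-5.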

[00111] Example: Methods - Ontology-based analysis of gene cluster-associated functional correlates.

[00112] Gene expression clusters were analyzed for the occurrence of multiple genes involved in related gene function categories by comparing each list of coordinately regulated clustered genes to categories within GeneOntology, pathways, or literature-based gene associations using GATACA (http://gataca.cchmc.org/), Ontoexpress (http://vortex.cs.wayne.edu/projects.htm/), and Ingenuity Pathway Analysis, version 3 (IPA; http://www.ingenuity.com/, Ingenuity Systems, Redwood City, CA). To do this, each cluster indicated in FIGS. 3, 4, 6, 7, and 9 was converted to a list of gene identifiers, uploaded to the application, and examined for over-representation of multiple genes from one or more molecular networks, or functional or disease associations as developed from literature mining. Networks of these focus genes were algorithmically generated based on the relationships of individual genes as derived from literature review and used to identify the biological functions and/or associated pathological processes most significant for each gene cluster. Fisher's exact test was used to calculate a p-value estimating the probability that a particular functional classification or category of genes is associated with a particular pattern or cluster of gene expression more than would be expected by chance.
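The over-representation p-value can be sketched as a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The one-sided form is an assumption made for illustration, since the sidedness of the test is not stated above, and the function and parameter names are hypothetical.

```python
from math import comb

def enrichment_p_value(cluster_hits, cluster_size, category_size, universe_size):
    """Probability of at least cluster_hits category genes in a random cluster.

    cluster_hits:  genes in the cluster that belong to the category
    cluster_size:  genes in the cluster
    category_size: genes in the category among all measured genes
    universe_size: all measured genes
    """
    max_hits = min(cluster_size, category_size)
    total = comb(universe_size, cluster_size)
    tail = sum(
        comb(category_size, k) * comb(universe_size - category_size, cluster_size - k)
        for k in range(cluster_hits, max_hits + 1))
    return tail / total
```

A small value indicates that the category is represented in the cluster more often than would be expected from a random draw of the same size.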

[00113] For each cluster, only the top significant functional classes and canonical pathways are shown. FIG. 6A shows a diagram of the canonical WNT signaling pathway and an associated gene network that was a top-ranked association of the clusters that exhibited significant overexpression in AOM and Apc Min/+ versus Smad3-/- and Tgfb1-/-; Rag2-/- mouse models. Genes or gene products are represented as nodes, and biological relationships between nodes are represented as edges (lines). All edges are supported by at least one literature reference from a manuscript, or from canonical information stored in the Ingenuity Pathways Knowledge Base.

[00114] With reference to FIGS. 8A and 8B, the up-regulated signature in tumors from Apc Min/+ (M) and AOM (A) models (cluster C6) is enriched with genes associated with the activation of the canonical WNT signaling pathway, as determined by nuclear β-catenin positivity. FIG. 6A depicts a schematic diagram of the canonical WNT signaling pathway showing elements present in C6 (gene symbols with gray background). Key elements of this pathway (Ctnnb1, Lef1, Tcf and Myc) are outlined in blue. FIG. 6B depicts relative gene expression for MYC and SOX4 plotted for individual murine and human tumors. The relative expression level of MYC and SOX4 is normalized to adult colon. Note that whereas Sox4, a canonical WNT target gene, is expressed at high levels in all human CRCs, A/M tumors and during embryonic mouse colon development, it is not expressed in S and T tumors (black). In contrast, MYC is over-expressed in all human and murine tumors and during colonic embryonic development (red), irrespective of the activation of canonical WNT signaling, as determined by nuclear β-catenin positivity.

[00115] Example: Methods - Real-time quantitative PCR (RTQ-PCR)

[00116] To confirm the validity of data normalization and referencing procedures as well as the cDNA gene assignments of the printed arrays used in the microarray analyses, quantitative real-time polymerase chain reaction (RTQ-PCR) was used to measure relative levels of nine genes found by microarray data analysis to be differentially expressed (FDR < 5×10^-5) in tumors from Apc Min/+ and Smad3-/- mice.

[00117] Total RNAs from C57BL/6 Apc Min/+ and 129 Smad3-/- tumor samples (20 μg) were reverse-transcribed to cDNA using the High Capacity cDNA Archive Kit (oligo-dT primed; Applied Biosystems). RTQ-PCR reactions (20 μl) were set up in 96-well MicroAmp Reaction Plates (Applied Biosystems) using 10 ng of cDNA template in Taqman Universal PCR Master Mix and 6-FAM-labeled Assays-on-Demand primer-probe sets (Applied Biosystems). Reactions were run on an MX3000P (Stratagene) with integrated analysis software. Threshold cycle numbers (Ct) were determined for each target gene using an algorithm that assigns a fluorescence baseline based on measurements prior to exponential amplification. Relative gene expression levels were calculated using the ΔΔCt method [59], with the Gusb gene as a control. Fold-change was determined relative to expression in normal adult colon from two C57BL/6J mice.
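The fold-change arithmetic of the ΔΔCt method can be sketched as follows. The sketch assumes approximately 100% amplification efficiency (a doubling of product per cycle), which is the usual assumption of the method; the function and parameter names are hypothetical, with the control gene and the normal adult colon calibrator supplied as plain Ct values.

```python
def fold_change_ddct(ct_target_sample, ct_control_sample,
                     ct_target_calibrator, ct_control_calibrator):
    """Relative expression of a target gene by the delta-delta-Ct method.

    Each Ct is first normalized to the control gene in the same sample,
    then the tumor sample is compared to the calibrator (normal colon).
    """
    delta_sample = ct_target_sample - ct_control_sample
    delta_calibrator = ct_target_calibrator - ct_control_calibrator
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)
```

For example, a target that crosses threshold 2 cycles after the control gene in a tumor and 5 cycles after it in normal colon has a ΔΔCt of -3, corresponding to an 8-fold increase.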

[00118] Example: Methods - Immunohistochemistry

[00119] Immunohistochemical procedures were performed as described in Park, et al., Genesis (2005) [15]. Apc Min/+ and Smad3-/- colon tumors were rapidly dissected, fixed in 4% paraformaldehyde, and embedded in paraffin before cutting 10 μm thick sections. Antigen retrieval was performed by boiling for 20 min in citrate buffer, pH 6.0. Sections were treated with 0.3% hydrogen peroxide in PBS for 30 min, washed in PBS, blocked in PBS plus 3% goat serum and 0.1% Triton X-100, and then incubated with primary antibodies and HRP-conjugated goat anti-rabbit secondary antibody (Sigma, St. Louis, MO). Antigen-antibody complexes were detected with a DAB peroxidase substrate kit (Vector Laboratories, Burlingame, CA) according to the manufacturer's protocol.

[00120] Example: Results

[00121] Thirty-nine colon tumors from four independent mouse models and 100 human CRCs all share recapitulation of embryonic colon gene expression. A large-scale expression module based on increased expression by embryonic colon, all mouse tumor models, and human CRCs was composed of genes responsible for control of cell cycle progression, proliferation, and migration such as MYC, AKT2, ANLN, BIRC5, CSE1L, ITGB5, MAD2L1, MIF, MSF, PLK1 and SPARC. Nuclear β-catenin status subdivided mouse tumor models into nuclear positive (azoxymethane-treated [AOM] and Apc Min/+) tumors exhibiting greater expression of early colon development genes versus nuclear negative (Smad3-/- and Tgfb1-/-; Rag2-/-) tumors with a later developmental profile. Both human and mouse tumors differed from embryonic colon by loss of a shared tumor suppressor-containing module (EDNRB, HSPE, KIT and LSP1). Human CRC adenocarcinomas lost an additional suppressor module (BCL2, IGFBP4, MAP4K1, PDGFRA, STAB1 and WNT4) and in many of the tumor samples, gained a module associated with advanced malignancy (ABCC1, FOXO3A, LIF, PIK3R1, PRNP, TNC, TIMP3 and VEGF).

[00122] Example: Results - Strategy for cross-species analysis

[00123] With reference to FIG. 2, colon tumors from four etiologically distinct mouse models of CRC were subjected to microarray gene expression profiling. The gene expression profiles from the different mouse model tumors were compared and contrasted to each other, as well as to those from embryonic mouse colon development and 100 human CRCs. The strategy for the characterization of mouse models of human CRC relies on gene expression differences and relative patterning across a range of mouse CRC models, normal mouse colon developmental stages, and human CRCs. This comparison was facilitated by the use of whole-mouse and normal adult colon reference RNAs for both mouse and human measurements. Mouse tumor samples were profiled on cDNA microarrays using the embryonic day (E)17.5 whole mouse reference RNA identical to that described in Park, et al., Genesis (2005) [15] to examine embryonic mouse colon gene expression dynamics from E13.5 to E18.5, during which time the primitive, undifferentiated, pseudo-stratified colonic endoderm becomes a differentiated, single-layered epithelium. This strategy allowed us to construct a gene expression database of mouse colon tumors in which gene expression levels of the tumors could be referenced, ranked, and statistically compared to an average value among the tumors or to embryonic or adult colon gene expression levels on a per-gene basis. First, the four models were compared with each other, then to mouse colon development, and finally to human CRCs using gene ortholog mapping (FIG. 2).

[00124] Example: Results - Mouse colon tumors partition into classes reflecting differential canonical WNT signaling activity.

[00125] To discover gene expression programs underlying differences between etiologically distinct mouse models of CRC, the gene expression level value for each transcript in each tumor sample was set to its ratio relative to its median across the series of tumor models. Using non-parametric statistical analyses, 1798 cDNA transcripts were identified as differentially expressed among the four mouse models of CRC. Five major gene patterns were identified using K-means clustering (clusters C1-C5). Genes belonging to these clusters were strongly associated with annotated gene function categories (see Table W for detailed biological descriptions and associations). For example, cluster C1, composed of transcripts that exhibited lower expression in Smad3-/- tumors and higher expression in AOM, Apc Min/+ and Tgfb1-/-; Rag2-/- tumors, contains 391 transcripts including Cdk4, Ctnnb1, Myc, Ezh2, Mcm2 and Tcf3. Gene list over-representation analysis using DAVID and Ingenuity Pathway Analysis applications demonstrated highly significant associations to cell cycle progression, replication, post-transcriptional control and cancer. Similarly, cluster C2, composed of 663 transcripts that exhibited high expression in AOM and Apc Min/+ tumors, but low in Smad3-/- and Tgfb1-/-; Rag2-/- tumors, included transcripts for contact growth inhibition (Metap1, Pcyox1), mitosis (Mif, Plk1), cell cycle progression and checkpoint control (Id2, Ptp4a2, Tp53).

[00126] Active canonical WNT signaling (as determined by nuclear β-catenin) stratifies the four murine colon tumor models into two groups. 1798 gene transcripts are identified as differentially expressed among any of the four mouse tumor models (Kruskal-Wallis test + Student-Newman-Keuls test + FDR < 5×10^-5) (Data not shown). Results demonstrate that AOM (A) and Apc Min/+ (M) tumors are transcriptionally more similar to each other than to tumors from Smad3-/- (S) and Tgfb1-/-; Rag2-/- (T) mice. Five clusters have been identified (C1-C5) that correspond to the K-means functional clusters listed in Table W. Please refer to Table W for a description of the functional classification of the genes found in these clusters. The lower panel illustrates the extent of the similarity between A/M and S/T tumors by identifying the top-ranked 1265 transcripts of the 1798 that were higher or lower in the two tumor super-groups (rank based on Wilcoxon-Mann-Whitney test for between-group differences with a FDR < 5×10^-5 cutoff). Up-regulated transcripts in A/M tumors are highly enriched for genes associated with canonical WNT signaling activity, cell proliferation, chromatin remodeling, cell cycle progression and mitosis; transcripts over-expressed in S/T tumors are highly enriched for genes related to immune and defense responses, endocytosis, transport, oxidoreductase activity, signal transduction and metabolism. Representative histologies for each of the four tumor models were obtained (data not shown). Model-dependent localization of β-catenin was shown. Tumors from M and A (not shown) mice exhibited prominent nuclear β-catenin accumulation and reduced cell surface staining. Conversely, tumors from S and T (not shown) mice exhibited retention of plasma membrane β-catenin immunoreactivity.

[00127] From the 1798 transcripts differentially expressed among the four mouse models of CRC, more than 70% (n=1265) distinguished Apc Min/+ and AOM tumors versus Smad3-/- and Tgfb1-/-; Rag2-/- tumors. If a random or equivalent degree of variance occurred among all classes, there would be far less overlap. The majority of this signature (~75%, n=904 features) derived from genes over-expressed in Apc Min/+ and AOM tumors relative to the Smad3-/- and Tgfb1-/-; Rag2-/- tumors (cluster C6). Cluster C6 was functionally enriched for genes linked to canonical WNT signaling (Table W). These included genes previously identified to be part of this pathway (Cd44, Myc, Stra6, Tcf1, Tcf4 [16], Id2, Lef1, Nkd1, Nlk, Twist [http://www.stanford.edu/~rnusse/], Catnb, Csnk1a1, Csnk1d, Csnk1e, Plat, Wif1) as well as genes that appear to be novel canonical WNT signaling targets (e.g. Cryl1, Expi, Ifitm31, Pacsin2, Sox4 [16], Ets2, Hnrnpg, Hnrpa1, Id3, Kpnb3, Pais, Pcna, Ranbp11, Rbbp4, Yes [17], Hdac2 [18]). Moreover, consistent with the over-expression of Myc in tumors from the Apc Min/+ and AOM models, enrichment of Myc targets such as Apex, Eef1d, Eif2a, Eif4e, Hsp90, Mif, Mitf, Npm1 [19] and the repression of Nibam [19] were detected.

[00128] Example: Results - Nuclear β-catenin expression distinguishes murine models

[00129] To establish a molecular basis for over-expression of canonical WNT target genes in Apc Min/+ and AOM tumors, immunohistochemistry was used to characterize the relative cellular distribution of β-catenin. Tumors from Apc Min/+ and AOM mice exhibited strong nuclear β-catenin immunoreactivity and reduced membrane staining, whereas tumors from Smad3-/- and Tgfb1-/-; Rag2-/- mice showed strong plasma membrane β-catenin staining with no nuclear accumulation (see inset). Additional tests to confirm the microarray results were also carried out using an independent set of C57BL/6 Apc Min/+ colon tumor samples analyzed by qRT-PCR (FIG. 3) and immunohistochemistry. Colon tumors from five Apc Min/+ (M; nuclear β-catenin-positive) mice and four Smad3-/- (S; nuclear β-catenin-negative) mice were harvested, and qRT-PCR was performed on nine genes that exhibited representative strong or subtle patterns in the microarray analyses. All nine patterns detected in the microarray set were validated by the qRT-PCR results. Alox12=Arachidonate 12-lipoxygenase; Casp6=Caspase 6; Matn2=Matrilin 2; Ptplb=Protein tyrosine phosphatase-like B; Sox21=SRY (sex determining region Y)-box 21; Spock2=Sparc/osteonectin 2; Tesc=Tescalcin; Tpm2=Tropomyosin 2; Wif1=WNT inhibitory factor; Stmn1=stathmin 1; Ptp4a2=phosphatase 4a2. In (a), * p<0.05 and ** p<0.01. All expression patterns identified via microarray analysis were consistent with the qRT-PCR results (n=9 transcripts, chosen for their demonstration of a range of differential expression characteristics). In situ hybridization analyses using C57BL/6 Apc Min/+ colon tumor samples also validated that Wif1, Tesc, Spock2 and Casp6 were strongly expressed in dysplastic cells of the tumors (data not shown). At the protein level, immunohistochemical analyses confirmed relatively greater expression of the oncoprotein stathmin 1 in Apc Min/+ mice and tyrosine phosphatase 4a2 in Smad3-/- mice.

[00130] Overall, cluster C6 genes (i.e. genes with greater up-regulation in tumors from the Apc Min/+ and AOM models than in tumors from the Smad3-/- and Tgfb1-/-; Rag2-/- models) were consistent with increased tumor cell proliferation (e.g. Myc, Pcna), cytokinesis (e.g. Amot, Cxcl5), chromatin remodeling (e.g. Ets2, Hdac2, Set) as well as cell cycle progression and mitosis (e.g. Cdk1, Cdk4, Cul1, Plk1). Myc is up-regulated in all four mouse tumor models relative to normal colon tissue (see FIG. 7B). Biological processes showing increased transcription in tumors from the Smad3-/- and Tgfb1-/-; Rag2-/- models (cluster C7) included immune and defense responses (e.g. Il18, Irf1, Myd88), endocytosis (e.g. Lrp1, Ldlr, Rac1), transport (e.g. Abca3, Slc22a5, Slc30a4), and oxidoreductase activity (e.g. Gcdh, Prdx6, Xdh) (Table W). Taken together, these transcriptional observations are both consistent with and extend the understanding of the histological features of the CRC models [7]. For example, while Apc Min/+ and AOM tumors are characterized by cytologic atypia (i.e. nuclear crowding, hyperchromasia, increased nucleus-to-cytoplasm ratios and minimal inflammation), tumors from Smad3-/- and Tgfb1-/-; Rag2-/- mice show less overt dysplastic changes but exhibit a significant inflammatory component.

[00131] Multiperspective views of cancer sample transcriptional programs provide integrative insight into large-scale transcriptional patterns adopted by colon cancer and adenomatous tumor models. Murine colon tumor adenomas and human CRCs both show adoption and dysregulation of signatures tightly controlled during embryonic mouse colon development. The use of etiologically distinct mouse models of colon cancer allows for the identification of models that resemble different stages of embryonic mouse colon development and that are recapitulated by specific tumor types. With reference to FIG. 7A, all tumors exhibit large-scale activation of developmental patterns. Nuclear β-catenin-positive (Apc Min/+ and AOM) tumors map more strongly to early developmental stages (more proliferative, less differentiated), whereas nuclear β-catenin-negative (Tgfb1-/-; Rag2-/- and Smad3-/-) tumors map more strongly to later stages consistent with increased epithelial differentiation. With reference to FIG. 7B, an overall representation of the relationship of mouse colon tumor models and human CRC to developmental and non-developmental expression patterns is shown. Gene expression clusters mapped to the progression of adenomatous and carcinomatous transformation are shown. Both mouse colon tumor models and human CRC share in the activation of embryonic colon expression, the repression of adult differentiation, and the loss of shared tumor suppressor genes. Many human CRCs also lack the expression of additional tumor suppressor programs and gain the expression of oncogenes that are not overexpressed during normal developmental morphogenesis.

[00132] Example: Results - Large-scale activation of the embryonic colon transcriptome in mouse tumor models.

[00133] The studies described herein were used to compare genes over-expressed in both colon tumors and embryonic mouse colon in order to provide insights into tumor programs important for fundamental aspects of tumor growth and regulation of differentiation. To identify genes and observe regulatory patterns that were shared or differed between colon tumors and embryonic development, a global quantitative referencing strategy was applied to both tumor and embryonic samples: the relative expression of each gene was calculated as the ratio of its expression in a given sample to its mean level in adult colon. From this adult baseline reference, genes over-expressed in the four mouse tumor models appeared strikingly similar. Moreover, the vast majority of genes over-expressed in tumors were also over-expressed in embryonic colon. If the fraction of fetal over-expressed genes on the entire microarray (5796 of 20393 features; 28.4%) were maintained at a similar frequency in the tumor over-expressed fraction (8804 of 20393), one would expect an overlap of 2502 transcripts ((5796/20393)*8804, i.e. 28.4% of 8804). Rather, 4693 of the 5796 fetal over-expressed transcripts were observed among the 8804 tumor over-expressed genes (FIG. 4). The probability calculated by Fisher's exact test is p < 1e-300, and thus represents highly significant over-representation of fetal genes among the tumor over-expressed genes. Similarly, genes under-expressed in developing colon were disproportionately under-expressed in tumors relative to normal adult colon (3282 of 3541; p < 1e-300). Combining these results, approximately 85% of the developmentally regulated transcripts (7975 out of 9337 features) were recapitulated in tumor expression patterns relative to adult colon.
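The overlap arithmetic in this paragraph can be checked with a short sketch. It is illustrative only and is not the analysis software used in the studies: the function names are hypothetical, and the p-value is computed as the upper tail of the hypergeometric distribution, which is the one-sided form of Fisher's exact test for a 2x2 table. Log-space arithmetic is used because the probabilities are far below the smallest representable floating-point number.

```python
from math import exp, lgamma, log


def expected_overlap(n_total, n_list_a, n_list_b):
    """Overlap expected by chance between two lists drawn from n_total features."""
    return n_list_a * n_list_b / n_total


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log10_enrichment_p(n_total, n_list_a, n_list_b, n_overlap):
    """log10 of P(overlap >= n_overlap) under the hypergeometric null.

    Hypothetical helper: one-sided Fisher's exact test, evaluated in log
    space so that very small probabilities do not underflow to zero.
    """
    log_denominator = _log_comb(n_total, n_list_b)
    # Smallest overlap that is combinatorially possible for these list sizes.
    first = max(n_overlap, n_list_a + n_list_b - n_total)
    log_terms = [
        _log_comb(n_list_a, k)
        + _log_comb(n_total - n_list_a, n_list_b - k)
        - log_denominator
        for k in range(first, min(n_list_a, n_list_b) + 1)
    ]
    largest = max(log_terms)
    # Log-sum-exp: factor out the largest term before summing.
    log_sum = largest + log(sum(exp(t - largest) for t in log_terms))
    return log_sum / log(10)


# Figures quoted in the text: 20393 features, 5796 fetal over-expressed,
# 8804 tumor over-expressed, 4693 observed in both lists.
chance = expected_overlap(20393, 5796, 8804)
log10_p = log10_enrichment_p(20393, 5796, 8804, 4693)
print(round(chance), log10_p < -300)
```

With the figures quoted above, the chance expectation rounds to 2502 and the log10 p-value falls below -300, in line with the reported p < 1e-300.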

[00134] To explore the potential biological significance of genes over-expressed in both embryonic colon development and mouse tumors, K-means clustering was used to generate the C8-C10 cluster patterns shown in a hierarchical tree heatmap (Table X). Several sub-patterns were evident, some of which clearly separated Apc Min/+ and AOM tumors from Smad3-/- and Tgfb1-/-; Rag2-/- tumors. One strong cluster, cluster C8, consisted of genes more strongly expressed in Apc Min/+ and AOM than in Smad3-/- and Tgfb1-/-; Rag2-/- tumors. This group of genes represented a large fraction of all differences found between nuclear β-catenin-positive (Apc Min/+ and AOM) and nuclear β-catenin-negative (Smad3-/- and Tgfb1-/-; Rag2-/-) tumors (~45%; 1636 out of 3592 features), as well as differences detected between early (i.e. E13.5-E15.5, ED) and late (E16.5-E18.5, LD) embryonic colon developmental stages. Thus, the developmentally regulated genes that are more characteristic of the earlier stages of normal colon development (E13.5-E15.5) are clearly expressed at higher levels in nuclear β-catenin-positive tumors. This observation is illustrated by 750 transcripts selected solely for stronger expression in ED versus LD. The majority of these transcripts overlap with cluster C6 containing 230 features and illustrate the tendency of the earlier-expressed developmental genes to be more strongly expressed in Apc Min/+ and AOM mice. In addition, transcripts associated with increased differentiation and maturation, observed at later stages of colon development (E16.5-E18.5; e.g. Klf4 [20], Crohn's disease-related Slc22a5/Octn2 [21], Slc30a4/Znt4 [22], Sst [23]), were expressed at higher levels by tumors from Smad3-/- and Tgfb1-/-; Rag2-/- mice.

[00135] All four murine tumor models exhibit reactivation of embryonic gene expression. The expression level of each gene in each sample was calculated relative to that in adult colon. Genes and samples were subjected to unsupervised hierarchical tree clustering for similarities among genes and tumors. A heatmap showed the relative behaviors of 20393 transcripts that passed basic signal quality filters, with gene transcripts shown as separate rows and samples as separate columns. The majority of genes over-expressed in tumors are also over-expressed in embryonic colon; similarly, the genes under-expressed in tumors are under-expressed in embryonic colon. In addition, there are genes over-expressed in embryonic colon that are under-expressed in tumors and vice versa. The genes represented in FIG. 3 were divided into those over-expressed and under-expressed in embryonic colon and in the tumors, respectively. Fisher's exact test was used to calculate expected overlaps between lists and confirmed significant over-representation of development-regulated signatures among the tumors (*p < 1e-300, **p < 1.3e-19, ***p < 4e-296, ****p < 1e-300). A heatmap was generated showing the behavior of a subset of the transcripts in 4a (n=4693 features) that were over-expressed in both embryonic colon and tumor samples (not shown). Table W includes a description of the genes associated with these clusters. Embryonic gene expression can be further refined into genes expressed differentially during early (ED; E13.5-E15.5) and late (LD; E16.5-E18.5) embryonic development. A heatmap was generated showing the relative behaviors of 750 transcripts that are highest-ranked for early versus late embryonic regulation (not shown). Overall, transcripts with the highest early embryonic expression were expressed at higher levels in nuclear β-catenin-positive tumors (A and M), whereas nuclear β-catenin-negative tumors (S and T) were representative of later stages of embryonic development. Sample groups: ED, early development (E13.5-E15.5); LD, late development (E16.5-E18.5); A, AOM-induced; M, Apc Min/+; T, Tgfb1-/-; Rag2-/-; S, Smad3-/-.
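The adult-baseline referencing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce the reported figures: the function names, the toy gene and sample labels, and the 2-fold calling threshold (`min_abs_log2=1.0`) are all hypothetical, since the text does not give a cutoff here.

```python
from math import log2
from statistics import mean


def reference_to_adult(expression, adult_samples):
    """Return log2(sample / mean adult level) for every gene and sample.

    expression: gene -> {sample name -> expression value}
    adult_samples: names of the adult colon samples used as the baseline
    """
    referenced = {}
    for gene, by_sample in expression.items():
        baseline = mean(by_sample[s] for s in adult_samples)
        referenced[gene] = {s: log2(v / baseline) for s, v in by_sample.items()}
    return referenced


def call_direction(referenced, group, min_abs_log2=1.0):
    """Label each gene over/under/unchanged in a sample group.

    The threshold is an assumed value chosen for illustration only.
    """
    calls = {}
    for gene, by_sample in referenced.items():
        group_mean = mean(by_sample[s] for s in group)
        if group_mean >= min_abs_log2:
            calls[gene] = "over"
        elif group_mean <= -min_abs_log2:
            calls[gene] = "under"
        else:
            calls[gene] = "unchanged"
    return calls


# Toy data with made-up gene and sample names.
toy = {
    "GeneA": {"adult1": 100.0, "adult2": 100.0, "tumor1": 400.0, "embryo1": 800.0},
    "GeneB": {"adult1": 200.0, "adult2": 200.0, "tumor1": 50.0, "embryo1": 25.0},
    "GeneC": {"adult1": 100.0, "adult2": 100.0, "tumor1": 100.0, "embryo1": 100.0},
}
ref = reference_to_adult(toy, ["adult1", "adult2"])
tumor_calls = call_direction(ref, ["tumor1"])
embryo_calls = call_direction(ref, ["embryo1"])
```

Genes called "over" in both the tumor and embryonic groups would then populate the overlap lists that are tested for over-representation.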

[00136] Example: Results - Human CRCs reactivate an embryonic gene signature

[00137] Since mouse tumors recapitulated developmental signatures irrespective of their etiology, it was asked whether a similar commitment to embryonic gene programming was shared by sporadic human CRCs. Tumor classification by microarray profiling is usually accomplished by referencing relative gene expression levels to the median value for each gene across a series of tumor samples. Using this "between-tumors median normalization" approach, together with a gene filtering strategy that detects significantly regulated genes in at least 10% of the cases, led to the identification of a set of 3285 probe sets corresponding to transcripts whose expression varied widely between independent human tumor cases. There was striking heterogeneity of gene expression among 100 human CRCs. For example, cluster C15 contained a set of genes (principally metallothionein genes) recently identified to be predictive of microsatellite instability [24, 25]. This analysis indicates that human CRCs have a greater level of complexity than the mouse colon tumors studied here. There was no correlation between these distinguishing clusters and the stage of the tumor (note the broad overlapping distributions of Dukes stages A-D across these different clusters). However, as shown in Table Y, gene ontology and network analysis of the individual gene clusters (C11-C17) that were differentially active in subgroups of the tumors mapped to genes highly associated with a diverse set of biological functions, including lipid metabolism, digestive tract development and function, immune response and cancer.

[00138] Human CRCs exhibit gene expression profile complexity consistent with significant tumor subclasses. Genes potentially able to distinguish cancer subtypes were identified from Affymetrix HG-U133 plus2 Genechip expression profiles by filtering for 3285 probe sets that were top-ranked by raw expression and by their differential regulation in at least 10 out of 100 human colorectal cancer tumors. Coordinately regulated transcripts and similarly behaving samples were identified via hierarchical tree clustering. Seven different gene clusters (C11-C17) were identified that distinguished 10 or more tumors from the other tumors. Gene clusters were found to be highly enriched for gene functions listed in Table Y. Data were processed using robust multi-array average (RMA) analysis, with expression value ratios depicted as the expression per probe set in each sample relative to the median of its expression across the 100 CRCs. A striking heterogeneity of gene expression was observed, including metallothionein genes in cluster C15 previously shown to be predictive of microsatellite instability, and cluster C17, represented by 734 probe sets rich in genes associated with extracellular matrix and connective tissue, tumor invasion and malignancy.
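The between-tumors median normalization and the "at least 10 out of 100 tumors" filter can be sketched as below. This is a simplified illustration, not the filtering actually applied to the Genechip data: the function names are hypothetical, the 2-fold threshold is an assumed value, and the additional ranking by raw expression mentioned in the text is omitted.

```python
from math import log2
from statistics import median


def median_normalize(profiles):
    """Express each value as log2(value / median across all tumors).

    profiles: probe set -> list of expression values, one per tumor
    """
    normalized = {}
    for probe, values in profiles.items():
        med = median(values)
        normalized[probe] = [log2(v / med) for v in values]
    return normalized


def variable_probes(normalized, min_abs_log2=1.0, min_fraction=0.10):
    """Keep probe sets regulated in at least min_fraction of the tumors.

    Both thresholds are assumptions chosen for illustration.
    """
    kept = []
    for probe, ratios in normalized.items():
        n_regulated = sum(abs(r) >= min_abs_log2 for r in ratios)
        if n_regulated / len(ratios) >= min_fraction:
            kept.append(probe)
    return kept


# Toy example with ten tumors and made-up probe set names.
toy_profiles = {
    "probe_flat": [100.0] * 10,
    "probe_variable": [100.0] * 9 + [400.0],
    "probe_subtle": [100.0] * 9 + [150.0],
}
toy_normalized = median_normalize(toy_profiles)
selected = variable_probes(toy_normalized)
```

Because every value is referenced to the across-tumor median, a change shared by all tumors yields ratios near zero and is removed by the filter, which is why this referencing highlights differences between tumors rather than features common to them.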

[00139] To evaluate whether similar sets of genes are systematically activated or repressed in human CRC, as in the mouse colon tumors, two procedures were undertaken to align the data. First, gene expression values for the mouse and human tumors were separately normalized and referenced relative to their respective normal adult colon controls; second, mouse and human gene identifiers were reduced to a single ortholog gene identifier. The latter is a somewhat complex procedure that requires identifying microarray probes from each platform that can be mapped to a single gene ortholog and aggregating redundant probes within a platform (see Methods). This approach allowed the identification of 8621 gene transcripts on the HG-U133 plus2 and Vanderbilt NIA 20K cDNA arrays for which relative expression values could be mapped for nearly all mouse and human samples. A clustering-based assessment of expression across the whole mouse-human ortholog gene set identified a large number of transcripts behaving similarly across colon tumors, many irrespective of species and some species-specific. Notably, the great majority of genes over-expressed in all tumors were also over-expressed during colon development. To evaluate the statistical significance of this pattern, a Venn overlap filtering strategy and Fisher's exact test analysis were used. Approximately 50% of the 2212 ortholog genes over-expressed in at least 10% of the human cancers relative to adult colon were also over-expressed in developing colon. If there were no selection for developmental genes among the tumor over-expressed genes, the expected overlap would be (2718/8621)*2212=697 transcripts. By Fisher's exact test, the significance of the increased overlap of 1080 versus 697 transcripts is p<1e-300. Similarly, genes under-expressed in mouse colon development and human CRCs also strongly overlapped (FIG. 5; 431 of 737, p<1e-76). This result is significantly greater than the 8-19% of genes that were estimated to be over-expressed in human colon tumors and fetal gut morphogenesis based upon a computational extrapolation of SAGE data [26]. Thus, the findings not only confirm but also significantly expand and experimentally validate the previously suggested recapitulation of embryonic signatures by human CRCs.
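The reduction of platform-specific probes to a single ortholog identifier can be sketched as follows. The actual mapping and aggregation procedure is described in the Methods and is not reproduced here; in this sketch the use of the median to aggregate redundant probes, the function names, and all probe and ortholog identifiers are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def collapse_to_orthologs(probe_values, probe_to_ortholog):
    """Aggregate redundant probes on one platform to one value per ortholog.

    probe_values: probe ID -> relative expression (already referenced to adult colon)
    probe_to_ortholog: probe ID -> ortholog ID; unmapped probes are dropped
    The median is an assumed aggregation rule, not necessarily the one used.
    """
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        ortholog = probe_to_ortholog.get(probe)
        if ortholog is not None:
            grouped[ortholog].append(value)
    return {ortholog: median(values) for ortholog, values in grouped.items()}


def shared_orthologs(human, mouse):
    """Orthologs measured on both platforms, in sorted order."""
    return sorted(set(human) & set(mouse))


# Toy example; every identifier below is made up.
human_map = {"h_probe1": "ORTH_1", "h_probe2": "ORTH_1", "h_probe3": "ORTH_2"}
mouse_map = {"m_clone1": "ORTH_1", "m_clone2": "ORTH_3"}
human_values = {"h_probe1": 2.0, "h_probe2": 4.0, "h_probe3": -1.0, "h_probe4": 0.5}
mouse_values = {"m_clone1": 1.5, "m_clone2": 0.2}

human_by_ortholog = collapse_to_orthologs(human_values, human_map)
mouse_by_ortholog = collapse_to_orthologs(mouse_values, mouse_map)
common = shared_orthologs(human_by_ortholog, mouse_by_ortholog)
```

Only orthologs present in `common` would enter a cross-species table of the kind used for the overlap analysis.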

[00140] All overlaps between tumor expression and development were pooled to form a set of 2116 ortholog gene transcripts. This was subjected to hierarchical tree and K-means clustering to define six expression clusters, C18-C23 (Table Z).

[00141] Both human CRCs and mouse colon tumors reactivate an embryonic gene signature. When human and murine tumors are compared, they both broadly re-express an embryonic gene expression pattern. Gene expression profiles from the mouse tumor models and human CRC samples were combined into a single non-redundant gene ortholog genome table structure and subjected to comparative profile analysis. Informative probe sets from human and mouse platforms were selected, mapped to corresponding ortholog genes, and used to populate a table in which normalized expression for each gene is relative to normal adult colon. A heatmap plot for all cross-species gene orthologs both present and successfully measured on both the Affymetrix HG-U133 and Vanderbilt Mouse NIA 20K microarrays (n=8621 features) suggested that a large number of human CRC signatures exhibit similar behaviors in the mouse tumors and during embryonic mouse colon development. Based on these results, four separate gene lists were generated with criteria of over- or under-expression in development or over- or under-expression in human CRCs (2718, 2365, 2212, and 737, respectively). The overlaps between genes over-expressed or under-expressed in embryonic mouse colon and in human CRCs were found to be over-represented as determined by Fisher's exact test analysis (*p < 7 x 10^-88, **p < 1 x 10^-76, ***p < 5 x 10^-4, ****p < 1 x 10^-76). A heatmap plot was generated of all genes co-regulated in human CRCs and during early (ED) and late (LD) mouse embryonic colon development (n=2216 features). Six predominant clusters (C18-C23) characterize the transcriptional relationship between human CRC and mouse colon tumor models and embryonic development. Two clusters (C20 and C21) primarily distinguish human CRCs from murine tumors (A, M, S and T). For example, CRC up-regulated transcripts that are either developmentally up- or down-regulated are represented by cluster C22 (n=860 features) and clusters C21/C23 (n=142 features), respectively. Conversely, CRC down-regulated transcripts that are either down- or up-regulated during development are shown in clusters C18/C19 (n=258 features) and cluster C20 (n=42 features), respectively. Interestingly, while ~80% and ~60% of genes up- and down-regulated in both human CRCs and mouse development were also up- and down-regulated in tumors from the various mouse models, several clusters provide notable exceptions: cluster C20 comprises genes down-regulated in human CRCs that are routinely over-expressed in mouse tumors and development; cluster C21 comprises genes robustly expressed in human CRC that are rarely expressed in embryonic colon or murine tumors.

Table Z: Detailed cluster analysis; Differential and statistically significant biological functions in clusters C18-C23.

[00142] These clusters provide an impressive partitioning of groups of genes associated with different biological functions critical for colon development, maturation and oncogenesis. Cluster C22 (860 transcripts of genes strongly expressed both developmentally and across all tumors) is highly enriched with genes associated with cell cycle progression, replication, cancer, tumor morphology and cellular movement. Cluster C18 (258 transcripts down-regulated in mouse and human tumors, as well as in development) is highly enriched in genes associated with digestive tract function, biochemical and lipid metabolism. This cluster is clearly composed of genes associated with the mature GI tract. Thus, as opposed to recapitulating developmental gene activation, the cluster C18 pattern indicates a corresponding arrest of differentiation in both mouse and human tumors. Cluster C23 (142 transcripts over-expressed in all mouse models and human CRC, but with low expression in development) maps to genes highly associated with the disruption of basement membranes, invasion and cell cycle progression, as well as altered transcriptional control. Cluster C21 (313 transcripts in which human tumors somewhat variably express a set of genes that are rarely expressed by the mouse tumors) is remarkable for its composition of genes associated with cell cycle proliferation, tissue disruption and angiogenesis. Thus, while categorically quite similar to cluster C23, the genes in cluster C21 represent a separately regulated module that is enriched for genes associated with invasion. Clusters C21 and C23 reveal sets of genes likely involved in tumor progression. Cluster C22 (with genes over-expressed in all mouse and human tumors and strongly expressed in embryonic colon) represents a group of genes highly correlated with transformation. The top-ranked transcription factor present in this cluster, with regulation independent of β-catenin localization, is Myc/MYC (FIG. 6B).
Although Myc was lower in expression in the Smad3-/- tumors compared to tumors from the other three models, it was elevated in all four models relative to normal adult colon. Myc/MYC was over-expressed in all mouse and human tumors as well as in development. This contrasts with Sox4, which is unaltered in expression in the Smad3-/- and Tgfb1-/-; Rag2-/- tumors but is up-regulated in AOM and Apc Min/+ tumors relative to normal adult colon (FIG. 6B). Myc/MYC over-expression may be independent of nuclear β-catenin status. Increased Myc/MYC expression may reflect both activation of canonical Wnt signaling, as it is a target of nuclear β-catenin/TCF [27], and deregulation of TGFB signaling, as TGFB1 is known to repress Myc/MYC [28-30]. These observations suggest a fundamental role for Myc/MYC in colonic neoplasia.

[00143] Example: Discussion

[00144] Numerous mouse models of intestinal neoplasia have been developed, each with unique characteristics. The models constructed to date, however, do not fully represent the complexity of human CRCs, principally because most are unigenic in origin and produce primarily adenomas and early stage cancers. Although models like Apc Min/+ show molecular similarities to human CRCs, such as initiation of adenoma formation by inactivation of Apc, little is known about the molecular similarities of tumors from the different mouse models. It is also unknown how such common and perhaps large-scale molecular changes in mouse models relate to the molecular programming of human CRC. To shed light on the underlying molecular changes in tumors from mouse models and human CRC, the relationship among four widely used, but genetically distinct, mouse models that develop colon tumors was assessed at the molecular level. A subsequent analysis of the models in the context of embryonic mouse colon development was also undertaken. Finally, to identify consensus species-independent cancer signatures that may define gene expression changes common to all CRCs, relevant mouse model signatures were projected onto a large set of human primary CRCs of varied histopathology and stage.

[00145] Example: Discussion - Differential canonical WNT signaling activity discriminates two major classes of mouse models of CRC with distinct molecular characteristics

[00146] Tumors from mouse models of CRC exhibit significant phenotypic diversity [6], and therefore were expected to exhibit differential gene expression patterns. Using a combination of inter-model and normal adult gene expression level referencing, the analysis of tumors from mouse models of CRC has revealed a low complexity between models and strains, and has identified common and unique transcriptional patterns associated with a variety of biological processes and pathway-associated activities. The results demonstrate an imbalance between proliferation and differentiation, with nuclear β-catenin-positive tumors being more proliferative, less differentiated and with lower immunogenic characteristics than nuclear β-catenin-negative tumors. Mouse tumors characterized by signatures of relative up-regulation of genes associated with cell cycle progression also showed increased canonical WNT signaling activity (Apc Min/+ and AOM). Tumors from mouse models not showing canonical WNT signaling pathway activation (Smad3-/- and Tgfb1-/-; Rag2-/-) were characterized by up-regulation of genes associated with inflammatory and innate immunological responses, and intestinal epithelial cell differentiation. Recent studies have indicated that chronic inflammation caused either by infection with Helicobacter pylori [31] or Helicobacter hepaticus [13] is a prerequisite for intestinal tumor development in Smad3-/- and Tgfb1-/-; Rag2-/- mice, respectively.

[00147] The activation of canonical WNT signaling in AOM tumors was identified by applying a between-tumor global median normalization to the gene expression data. However, when tumor sample expression was referenced to that of normal adult intestinal tissue, many more genes were up-regulated, including developmental genes that are not dependent on nuclear β-catenin. That canonical WNT signaling-related genes are altered similarly in both AOM and Apc Min/+ tumors suggests biological similarities between the two models. In addition, the relatively consistent programming within the AOM model also emphasizes its value for examining the more complicated genetics that result in strain-specific sensitivity to environmental agents that induce cancer.

[00148] Activation of canonical WNT signaling leads to nuclear translocation of β-catenin and, through its interaction with LEF/TCF, the regulation of genes relevant to embryonic development and proliferation [16], as well as stem cell self-renewal [32]. Consequently, the activated canonical WNT signaling observed in Apc Min/+ and AOM models suggests that tumors may arise as a consequence of proliferation of the stem cell or "transient amplifying" compartment. In the colonic crypt, loss of TCF4 [33] or DKK1 over-expression [34] promotes loss of stem cells, suggesting that canonical WNT signaling is required for the maintenance of the intestinal stem cell compartment [33-35]. Conversely, increased nuclear β-catenin/TCF4 activity imposes a crypt progenitor phenotype on tumor cells [17]. In this study, transcriptional activation of the canonical WNT signaling pathway was identified in tumors from Apc Min/+ and AOM mice. This was confirmed by immunohistochemistry.

[00149] In colon tumors and perhaps intestinal stem cells, activation of canonical WNT signaling promotes a hyperproliferative state. Proliferation-related characteristics of nuclear β-catenin-positive tumors include increased expression of CCND1, MYC, PCNA [17], and Sox4 [16]. These genes were also identified as a component of the nuclear β-catenin-positive signatures. In turn, increased MYC decreases intestinal cell differentiation by binding to and repressing the Cdkn1a (coding for p21 CIP1/WAF1) promoter [36], the gene encoding Wnt-inhibitory factor Wif1, the gene encoding the negative regulator of WNT Naked1 [37], and the gene encoding the Tak1/Nemo-like kinase, Nlk [38]. Wif1 displays a graded expression in colonic tissue, with higher expression in the stem cell compartments and lower expression in the more differentiated cells at the luminal surface, suggesting that Wif1 may contribute to stem cell pool maintenance independent of WNT signaling inhibition [39].

[00150] Canonical WNT signaling not only governs intestinal cell proliferation, but also cell differentiation and cell positioning along the crypt-lumen axis of epithelial differentiation. Increased canonical WNT signaling activity enhances MATH1-mediated amplification of the gut secretory lineages [40]. Canonical WNT signaling also influences cell positioning by regulating the gradient of EPHB2/EPHB3 and EPHB1 ligand expression [41, 42]. Together, the data suggest a complex imbalance of crypt homeostasis due to enhanced canonical WNT activity.

[00151] The results indicate that tumors arising in response to abnormal TGFB1/SMAD signaling [14, 43] are similar to one another in their specific gene signatures and broadly distinct from those with activated canonical WNT signaling by their absence of nuclear β-catenin. Unique to the dysregulated TGFB1/SMAD4 signaling models is the strong signature of an immunologically altered state, with up-regulation of genes determining immune and defense responses such as Il18, Irf1 and mucin pathway-associated genes. Again, these tumors are usually characterized by a strong inflammatory component when evaluated histopathologically, even in the absence of T- and B-cells such as in the Tgfb1-/-; Rag2-/- background.

[00152] The microarray patterns of gene expression for AOM and Apc Min/+ tumors are mirror images of those for Tgfb1-/-; Rag2-/- tumors. It is perhaps not surprising that combining these two transcriptional programs results in increased number and invasiveness of colonic tumors, as recently reported for Apc Min/+ mice crossed to Smad3-/- mice [44]. Moreover, combined activation of canonical WNT signaling and inhibition of TGFB signaling also results in more advanced intestinal tumors (Apc delta716/+; Smad4+/- mice [45] and intestine-specific deletion of the type II TGFB receptor in Apc 1638N/wt mice [46]).

[00153] The findings that shared over-expressed signatures are identifiable in all four mouse models of CRC, that these signatures are also representative of the majority of embryonic colonic over-expressed signatures, and that they are also present in all human CRCs suggest that colon tumors may arise independently of canonical WNT signaling status. A likely candidate to impart this oncogenic signaling is Myc, which is an embryonic up-regulated transcript that is also up-regulated in all human CRCs and mouse tumor models independently of nuclear β-catenin status.

[00154] Example: Discussion - Embryology provides insight into the biology of mouse and human colon tumors

[00155] It has long been suggested that cancer represents a reversion to an embryonic state, partly based upon the observation that several oncofetal antigens are diagnostic for some tumors [47, 48]. To assess the embryology-related aspects of tumorigenesis and tumor progression in CRC, the transcriptomes of normal mouse colon development and models of CRC were analyzed and compared. The data show that developmentally regulated genes represent ~56% of mouse tumor signatures, and that the tumor signatures from the four mouse models recapitulate ~85% of developmentally regulated genes.

[00156] There are at least two regulatory programs that determine the expression of developmental genes by mouse tumors (FIG. 7). The simpler program is evident in the over-expression of the earliest genes of colon development by the nuclear β-catenin-positive models. The more subtle program is one that is also shared by the nuclear β-catenin-negative models, which are likewise highly enriched in developmentally expressed genes despite lacking activation of canonical WNT signaling. Genes found within this signature are consistent with those present in the colon at later developmental stages (E16.5-E18.5).

[00157] How do genes tightly regulated during mouse colon development become activated in colon tumors? While activated canonical WNT signaling imparts a strong influence, its absence in Tgfb1-/-; Rag2-/- and Smad3-/- tumors, as determined by the absence of nuclear β-catenin, did not prevent the large-scale activation of developmental/embryonic gene expression. One mechanism may be through epigenetic alterations. In human CRCs, these types of alterations in gene expression programs [49] suggest a link between cellular homeostasis and tumorigenesis. The recruitment of histone acetyltransferases and histone deacetylases (HDACs) is a key step in the regulation of cell proliferation and differentiation during normal development and carcinogenesis [50]. Induction of Hdac2 expression occurs in 82% of human CRCs as well as in tumors from Apc Min/+ mice [18]. Alternatively, common regulatory controls may operate in parallel growth and differentiation/anti-differentiation pathways such that a single regulator or a small subset of regulators, such as MYC or one or more miRNAs, may be responsible for the control of multiple pathways. Indeed, consistent with the observation of nuclear β-catenin-independent activation of Myc in all mouse models and across the board for human CRC, deletion of Myc has recently been demonstrated to completely abrogate nuclear β-catenin-driven small bowel oncogenesis in mouse models [51].

[00158] Example: Discussion - Comparative analysis reveals underlying development-related signatures in human CRCs

[00159] As shown in FIG. 6, considerable and intriguing heterogeneity of human CRC is observed among genes highly relevant for differential malignant behavior. However, employing the between-tumors normalization and referencing strategies prevents the detection of gene expression patterns that are shared between tumors. Using the adult normal colon as a reference, as shown in FIG. 5, a large fraction of differential gene expression relative to adult colon could be demonstrated that recapitulated developmental gene expression by virtue of both activating embryonic colon gene expression and failing to express genes associated with normal colon maturation. Within these developmentally regulated gene sets, the analyses revealed little evidence of CRC subsets, including those suggestive of nuclear β-catenin-negative tumors that might approximate the Smad3-/- and Tgfb1-/-; Rag2-/- signature. The inability to identify distinct subclasses with respect to developmental genes in the human CRCs is perhaps not surprising in that over 80% of MSI+ CRCs from HNPCC families exhibit nuclear β-catenin [52]. In addition, within the developmental genes, little evidence was apparent for signatures related to microsatellite-unstable tumors (MSI+), often associated with HNPCC, although perhaps some of this type of signature was apparent in the median normalized depiction of the tumors as highlighted in FIG. 6.

[00160] This report constitutes a comprehensive molecular evaluation and comparison of mouse and human colon tumor gene expression profiles. The ability to compare tumor gene expression profiles between mouse and human tumors has been improved by using a referencing strategy in which gene expression levels in the tumor samples are analyzed in relation to gene expression in corresponding normal colon epithelium. This approach has revealed that gene expression patterns are both shared and distinct between mouse models and human CRCs. The present study actually demonstrates the magnitude of the similarity between tumors and embryonic gene expression.
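The difference between the two referencing strategies discussed above can be illustrated with a minimal sketch. It is not the analysis pipeline used in the study: the gene names, intensity values, and function names below are hypothetical, and a simple per-gene log2 ratio is assumed as the measure of relative expression. The sketch shows that a gene elevated in every tumor is invisible after between-tumors (median) normalization but is detected when each tumor is referenced to normal colon.

```python
import math
from statistics import mean, median


def log2_ratio_to_reference(tumor, normal):
    """Per-gene log2(tumor / mean of normal samples) for each tumor sample."""
    out = {}
    for gene, values in tumor.items():
        ref = mean(normal[gene])
        out[gene] = [math.log2(v / ref) for v in values]
    return out


def median_centered(tumor):
    """Per-gene log2(tumor / median across tumors), i.e. between-tumors referencing."""
    out = {}
    for gene, values in tumor.items():
        med = median(values)
        out[gene] = [math.log2(v / med) for v in values]
    return out


# Hypothetical intensities (assumed for illustration only).
# GENE_A is uniformly elevated in all tumors; GENE_B varies between tumors.
tumor = {"GENE_A": [800.0, 800.0, 800.0], "GENE_B": [100.0, 400.0, 1600.0]}
normal = {"GENE_A": [200.0, 200.0], "GENE_B": [400.0, 400.0]}

vs_normal = log2_ratio_to_reference(tumor, normal)
between_tumors = median_centered(tumor)

print("GENE_A vs normal:        ", vs_normal["GENE_A"])
print("GENE_A between tumors:   ", between_tumors["GENE_A"])
print("GENE_B vs normal:        ", vs_normal["GENE_B"])
```

In this toy example the shared elevation of GENE_A appears only in the normal-referenced ratios, whereas heterogeneity between tumors (GENE_B) is visible under either scheme.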

[00161] Finally, the results suggest that comparisons made between mouse tumor models, developing embryonic tissues, and human CRCs provide a powerful biological framework from which to observe shared and unique genetic programs associated with human cancer. While ortholog-gene-based analyses have been used previously to obtain direct comparison of the molecular features of mouse and human hepatocellular carcinomas [55], the results provide striking support for the hypothesis that cancer represents a subversion of normal embryonic development. By inclusion of detailed mouse embryonic and developmental profile information, the results have revealed critical similarities and differences between the mouse and human tumors that are particularly revealing of oncogenic and tumor suppressor programs, some genes from which should be useful for development of diagnostic biomarkers and identification of therapeutic targets and pathways.

[00162] Example: Conclusions

[00163] Cross-species and multi-model comparisons of gene expression in colon cancer versus that of normal embryonic development provide an integrated and versatile framework from which to recognize transcriptional patterns associated with oncogenesis and a general method for pattern-specific identification of biomarkers and therapeutic targets. A pronounced feature of all CRCs is a simultaneous exploitation and subversion of normal organogenesis. Irrespective of nuclear β-catenin status, all mouse and human colonic tumors exhibited an embryonic expression pattern and Myc/MYC overexpression, suggesting its fundamental oncogenic role in colonic neoplasia. Compared to the mouse model tumors, human CRC adenocarcinomas both gained and lost additional non-developmental gene expression modules whose specific targeting may lead to both improved mouse models and novel combinatorial therapeutics.

[00164] Throughout this document, various references are mentioned. All such references are incorporated herein by reference, including the references set forth in the following list:

REFERENCES

1. Heath JP: Epithelial cell migration in the intestine. Cell Biol Int 1996, 20(2):139-146.

2. Sancho E, Batlle E, Clevers H: Signaling pathways in intestinal development and cancer. Annu Rev Cell Dev Biol 2004, 20:695-723.

3. Brabletz T, Hlubek F, Spaderna S, Schmalhofer O, Hiendlmeyer E, Jung A, Kirchner T: Invasion and metastasis in colorectal cancer: epithelial-mesenchymal transition, mesenchymal-epithelial transition, stem cells and beta-catenin. Cells Tissues Organs 2005, 179(1-2):56-65.

4. Helm J, Enkemann SA, Coppola D, Barthel JS, Kelley ST, Yeatman TJ: Dedifferentiation precedes invasion in the progression from Barrett's metaplasia to esophageal adenocarcinoma. Clin Cancer Res 2005, 11(7):2478-2485.

5. Kaihara T, Kusaka T, Nishi M, Kawamata H, Imura J, Kitajima K, Itoh-Minami R, Aoyama N, Kasuga M, Oda Y et al.: Dedifferentiation and decreased expression of adhesion molecules, E-cadherin and ZO-1, in colorectal cancer are closely related to liver metastasis. J Exp Clin Cancer Res 2003, 22(1):117-123.

6. Boivin GP, Groden J: Mouse Models of Intestinal Cancer. Comp Med 2004, 54(1):15-18.

7. Boivin GP, Washington K, Yang K, Ward JM, Pretlow TP, Russell R, Besselsen DG, Godfrey VL, Doetschman T, Dove WF et al.: Pathology of mouse models of intestinal cancer: consensus report and recommendations. Gastroenterology 2003, 124(3):762-777.

8. Moser AR, Pitot HC, Dove WF: A dominant mutation that predisposes to multiple intestinal neoplasia in the mouse. Science 1990, 247(4940):322-324.

9. Logan CY, Nusse R: The Wnt signaling pathway in development and disease. Annu Rev Cell Dev Biol 2004, 20:781-810.

10. Oving IM, Clevers HC: Molecular causes of colon cancer. Eur J Clin Invest 2002, 32(6):448-457.

11. Bissahoyo A, Pearsall RS, Hanlon K, Amann V, Hicks D, Godfrey VL, Threadgill DW: Azoxymethane Is a Genetic Background-Dependent Colorectal Tumor Initiator and Promoter in Mice: Effects of Dose, Route, and Diet. Toxicol Sci 2005.

12. Grady WM, Myeroff LL, Swinler SE, Rajput A, Thiagalingam S, Lutterbaugh JD, Neumann A, Brattain MG, Chang J, Kim SJ et al.: Mutational inactivation of transforming growth factor beta receptor type II in microsatellite stable colon cancers. Cancer Res 1999, 59(2):320-324.

13. Engle SJ, Ormsby I, Pawlowski S, Boivin GP, Croft J, Balish E, Doetschman T: Elimination of colon cancer in germ-free transforming growth factor beta 1-deficient mice. Cancer Res 2002, 62(22):6362-6366.

14. Zhu Y, Richardson JA, Parada LF, Graff JM: Smad3 mutant mice develop metastatic colorectal cancer. Cell 1998, 94(6):703-714.

15. Park YK, Franklin JL, Settle SH, Levy SE, Chung E, Jeyakumar LH, Shyr Y, Washington MK, Whitehead RH, Aronow BJ et al.: Gene expression profile analysis of mouse colon embryonic development. Genesis 2005, 41(1):1-12.

16. Reichling T, Goss KH, Carson DJ, Holdcraft RW, Ley-Ebert C, Witte D, Aronow BJ, Groden J: Transcriptional profiles of intestinal tumors in Apc(Min) mice are unique from those of embryonic intestine and identify novel gene targets dysregulated in human colorectal tumors. Cancer Res 2005, 65(1):166-176.

17. van de Wetering M, Sancho E, Verweij C, de Lau W, Oving I, Hurlstone A, van der Horn K, Batlle E, Coudreuse D, Haramis AP et al.: The beta-catenin/TCF-4 complex imposes a crypt progenitor phenotype on colorectal cancer cells. Cell 2002, 111(2):241-250.

18. Zhu P, Martin E, Mengwasser J, Schlag P, Janssen KP, Gottlicher M: Induction of HDAC2 expression upon loss of APC in colorectal tumorigenesis. Cancer Cell 2004, 5(5):455-463.

19. Watson JD, Oster SK, Shago M, Khosravi F, Penn LZ: Identifying genes regulated in a Myc-dependent manner. J Biol Chem 2002, 277(40):36921-36930.

20. Katz JP, Perreault N, Goldstein BG, Actman L, McNally SR, Silberg DG, Furth EE, Kaestner KH: Loss of Klf4 in mice causes altered proliferation and differentiation and precancerous changes in the adult stomach. Gastroenterology 2005, 128(4):935-945.

21. Lamhonwah AM, Ackerley C, Onizuka R, Tilups A, Lamhonwah D, Chung C, Tao KS, Tellier R, Tein I: Epitope shared by functional variant of organic cation/carnitine transporter, OCTN1, Campylobacter jejuni and Mycobacterium paratuberculosis may underlie susceptibility to Crohn's disease at 5q31. Biochem Biophys Res Commun 2005, 337(4):1165-1175.

22. Cragg RA, Phillips SR, Piper JM, Varma JS, Campbell FC, Mathers JC, Ford D: Homeostatic regulation of zinc transporters in the human small intestine by dietary zinc supplementation. Gut 2005, 54(4):469-478.

23. Jenny M, Uhl C, Roche C, Duluc I, Guillermin V, Guillemot F, Jensen J, Kedinger M, Gradwohl G: Neurogenin3 is differentially required for endocrine cell fate specification in the intestinal and gastric epithelium. Embo J 2002, 21(23):6338-6347.

24. Kruhoffer M, Jensen JL, Laiho P, Dyrskjot L, Salovaara R, Arango D, Birkenkamp-Demtroder K, Sorensen FB, Christensen LL, Buhl L et al.: Gene expression signatures for colorectal cancer microsatellite status and HNPCC. Br J Cancer 2005, 92(12):2240-2248.

25. Giacomini CP, Leung SY, Chen X, Yuen ST, Kim YH, Bair E, Pollack JR: A gene expression signature of genetic instability in colon cancer. Cancer Res 2005, 65(20):9200-9205.

26. Hu M, Shivdasani RA: Overlapping gene expression in fetal mouse intestine development and human colorectal cancer. Cancer Res 2005, 65(19):8715-8722.

27. He TC, Sparks AB, Rago C, Hermeking H, Zawel L, da Costa LT, Morin PJ, Vogelstein B, Kinzler KW: Identification of c-MYC as a target of the APC pathway. Science 1998, 281(5382):1509-1512.

28. Feng XH, Liang YY, Liang M, Zhai W, Lin X: Direct interaction of c-Myc with Smad2 and Smad3 to inhibit TGF-beta-mediated induction of the CDK inhibitor p15(Ink4B). Mol Cell 2002, 9(1):133-143.

29. Frederick JP, Liberati NT, Waddell DS, Shi Y, Wang XF: Transforming growth factor beta-mediated transcriptional repression of c-myc is dependent on direct binding of Smad3 to a novel repressive Smad binding element. Mol Cell Biol 2004, 24(6):2546-2559.

30. Warner BJ, Blain SW, Seoane J, Massague J: Myc downregulation by transforming growth factor beta required for activation of the p15(Ink4b) G(1) arrest pathway. Mol Cell Biol 1999, 19(9):5913-5922.

31. Maggio-Price L, Treuting P, Zeng W, Tsang M, Bielefeldt-Ohmann H, Iritani BM: Helicobacter infection is required for inflammation and colon cancer in SMAD3-deficient mice. Cancer Res 2006, 66(2):828-838.

32. Reya T, Clevers H: Wnt signalling in stem cells and cancer. Nature 2005, 434(7035):843-850.

33. Korinek V, Barker N, Moerer P, van Donselaar E, Huls G, Peters PJ, Clevers H: Depletion of epithelial stem-cell compartments in the small intestine of mice lacking Tcf-4. Nat Genet 1998, 19(4):379-383.

34. Pinto D, Gregorieff A, Begthel H, Clevers H: Canonical Wnt signals are essential for homeostasis of the intestinal epithelium. Genes Dev 2003, 17(14):1709-1713.

35. Kuhnert F, Davis CR, Wang HT, Chu P, Lee M, Yuan J, Nusse R, Kuo CJ: Essential requirement for Wnt signaling in proliferation of adult small intestine and colon revealed by adenoviral expression of Dickkopf-1. Proc Natl Acad Sci USA 2004, 101(1):266-271.

36. Kwong KY, Bloom GC, Yang I, Boulware D, Coppola D, Haseman J, Chen E, McGrath A, Makusky AJ, Taylor J et al.: Synchronous global assessment of gene and protein expression in colorectal cancer progression. Genomics 2005, 86(2):142-158.

37. Zeng W, Wharton KA, Jr., Mack JA, Wang K, Gadbaw M, Suyama K, Klein PS, Scott MP: naked cuticle encodes an inducible antagonist of Wnt signalling. Nature 2000, 403(6771):789-795.

38. Smit L, Baas A, Kuipers J, Korswagen H, van de Wetering M, Clevers H: Wnt activates the Tak1/Nemo-like kinase pathway. J Biol Chem 2004, 279(17):17232-17240.

39. Byun T, Karimi M, Marsh JL, Milovanovic T, Lin F, Holcombe RF: Expression of secreted Wnt antagonists in gastrointestinal tissues: potential role in stem cell homeostasis. J Clin Pathol 2005, 58(5):515-519.

40. Yang Q, Bermingham NA, Finegold MJ, Zoghbi HY: Requirement of Math1 for secretory cell lineage commitment in the mouse intestine. Science 2001, 294(5549):2155-2158.

41. Batlle E, Bacani J, Begthel H, Jonkheer S, Gregorieff A, van de Born M, Malats N, Sancho E, Boon E, Pawson T et al.: EphB receptor activity suppresses colorectal cancer progression. Nature 2005, 435(7045):1126-1130.

42. Batlle E, Henderson JT, Beghtel H, van den Born MM, Sancho E, Huls G, Meeldijk J, Robertson J, van de Wetering M, Pawson T et al.: Beta-catenin and TCF mediate cell positioning in the intestinal epithelium by controlling the expression of EphB/ephrinB. Cell 2002, 111(2):251-263.

43. Engle SJ, Hoying JB, Boivin GP, Ormsby I, Gartside PS, Doetschman T: Transforming growth factor beta1 suppresses nonmetastatic colon cancer at an early stage of tumorigenesis. Cancer Res 1999, 59(14):3379-3386.

44. Sodir NM, Chen X, Park R, Nickel AE, Conti PS, Moats R, Bading JR, Shibata D, Laird PW: Smad3 Deficiency Promotes Tumorigenesis in the Distal Colon of ApcMin/+ Mice. Cancer Res 2006, 66(17):8430-8438.

45. Taketo MM, Takaku K: Gastro-intestinal tumorigenesis in Smad4 mutant mice. Cytokine Growth Factor Rev 2000, 11(1-2):147-157.

46. Munoz NM, Upton M, Rojas A, Washington MK, Lin L, Chytil A, Sozmen EG, Madison BB, Pozzi A, Moon RT et al.: Transforming growth factor beta receptor type II inactivation induces the malignant transformation of intestinal neoplasms initiated by Apc mutation. Cancer Res 2006, 66(20):9837-9844.

47. Plebani M, Basso D, Panozzo MP, Fogar P, Del Favero G, Naccarato R: Tumor markers in the diagnosis, monitoring and therapy of pancreatic cancer: state of the art. Int J Biol Markers 1995, 10(4):189-199.

48. Zbar AP: The immunology of colorectal cancer. Surg Oncol 2004, 13(2-3):45-53.

49. Valk-Lingbeek ME, Bruggeman SW, van Lohuizen M: Stem cells and cancer; the polycomb connection. Cell 2004, 118(4):409-418.

50. Kim DH, Kim M, Kwon HJ: Histone deacetylase in carcinogenesis and its inhibitors as anti-cancer agents. J Biochem Mol Biol 2003, 36(1):110-119.

51. Sansom OJ, Meniel VS, Muncan V, Phesse TJ, Wilkins JA, Reed KR, Vass JK, Athineos D, Clevers H, Clarke AR: Myc deletion rescues Apc deficiency in the small intestine. Nature 2007, 446(7136):676-679.

52. Abdel-Rahman WM, Ollikainen M, Kariola R, Jarvinen HJ, Mecklin JP, Nystrom-Lahti M, Knuutila S, Peltomaki P: Comprehensive characterization of HNPCC-related colorectal cancers reveals striking molecular features in families with no germline mismatch repair gene mutations. Oncogene 2005, 24(9):1542-1551.

53. Kho AT, Zhao Q, Cai Z, Butte AJ, Kim JY, Pomeroy SL, Rowitch DH, Kohane IS: Conserved mechanisms across development and tumorigenesis revealed by a mouse development perspective of human cancers. Genes Dev 2004, 18(6):629-640.

54. Lepourcelet M, Tou L, Cai L, Sawada J, Lazar AJ, Glickman JN, Williamson JA, Everett AD, Redston M, Fox EA et al.: Insights into developmental mechanisms and cancers in the mammalian intestine derived from serial analysis of gene expression and study of the hepatoma-derived growth factor (HDGF). Development 2005, 132(2):415-427.

55. Lee JS, Chu IS, Mikaelyan A, Calvisi DF, Heo J, Reddy JK, Thorgeirsson SS: Application of comparative functional genomics to identify best-fit mouse models to study human cancer. Nat Genet 2004, 36(12):1306-1311.

56. Haigis KM, Hoff PD, White A, Shoemaker AR, Halberg RB, Dove WF: Tumor regionality in the mouse intestine reflects the mechanism of loss of Apc function. Proc Natl Acad Sci USA 2004, 101(26):9769-9773.

57. Mithani SK, Balch GC, Shiou SR, Whitehead RH, Datta PK, Beauchamp RD: Smad3 has a critical role in TGF-beta-mediated growth inhibition and apoptosis in colonic epithelial cells. J Surg Res 2004, 117(2):296-305.

58. Benjamini Y, Drai D, Elmer G, Kafkafi N, Golani I: Controlling the false discovery rate in behavior genetics research. Behav Brain Res 2001, 125(1-2):279-284.

59. Livak KJ, Schmittgen TD: Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 2001, 25(4):402-408.

60. Kaiser S et al.: Transcriptional recapitulation and subversion of embryonic colon development by mouse colon tumor models and human colon cancer. Genome Biol 2007, 8(7):R131.