


Title:
METHODS AND COMPOSITIONS FOR TREATING AND PROGNOSING COLORECTAL CANCER
Document Type and Number:
WIPO Patent Application WO/2019/178283
Kind Code:
A1
Abstract:
Embodiments concern evaluating, prognosing, diagnosing, and treating colorectal cancer patients. In some embodiments, there are methods and kits relating to expression of one or more of the following biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, EWSR1, FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, FSTL1, miR-210, miR-425*, and/or miR-141.

Inventors:
GOEL, Ajay (2001 Bryan St Suite 220, Dallas Texas, 75201, US)
ROY, Roshni (2001 Bryan St Suite 220, Dallas Texas, 75201, US)
MATSUYAMA, Takatoshi (2001 Bryan Street Suite 220, Dallas Texas, 75201, US)
Application Number:
US2019/022130
Publication Date:
September 19, 2019
Filing Date:
March 13, 2019
Assignee:
BAYLOR RESEARCH INSTITUTE (2001 Bryan St, Suite 2200, Dallas, Texas, 75201, US)
International Classes:
A61K45/00; A61P35/00; C12Q1/68
Domestic Patent References:
WO2014191559A1    2014-12-04
Foreign References:
US20140314662A1    2014-10-23
Attorney, Agent or Firm:
STELLMAN, Laurie B.F. (98 San Jacinto Blvd, Suite 1100, Austin, Texas, 78701, US)
Claims:
CLAIMS

1. A method for evaluating a colorectal cancer patient comprising:

measuring a level of expression in a biological sample from the patient of one or more of the listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, or EWSR1;

measuring a level of expression in a biological sample from the patient of one or more of the listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1; and/or,

measuring a level of expression of miR-210, miR-425*, and/or miR-141 in a blood sample from the patient.

2. The method of claim 1, comprising measuring the level of expression in a biological sample from the patient of one or more of the listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, or EWSR1.

3. The method of claim 2, wherein the colorectal cancer patient was determined to have stage II or III cancer.

4. The method of any of claims 2-3, wherein at least KLF7 is measured.

5. The method of claim 4, wherein KLF7 expression is upregulated.

6. The method of any of claims 2-3, wherein at least PDLIM4 is measured.

7. The method of claim 6, wherein PDLIM4 expression is upregulated.

8. The method of any of claims 2-3, wherein at least MECP2 is measured.

9. The method of claim 8, wherein MECP2 expression is upregulated.

10. The method of any of claims 2-3, wherein at least RARB is measured.

11. The method of claim 10, wherein RARB expression is upregulated.

12. The method of any of claims 2-3, wherein at least TCF4 is measured.

13. The method of claim 12, wherein TCF4 expression is upregulated.

14. The method of any of claims 2-3, wherein at least ZNF354C is measured.

15. The method of claim 14, wherein ZNF354C expression is upregulated.

16. The method of any of claims 2-3, wherein at least TCEA2 is measured.

17. The method of claim 16, wherein TCEA2 expression is upregulated.

18. The method of any of claims 2-3, wherein at least SSBP2 is measured.

19. The method of claim 18, wherein SSBP2 expression is upregulated.

20. The method of any of claims 2-3, wherein at least EHF is measured.

21. The method of claim 20, wherein EHF expression is downregulated.

22. The method of any of claims 2-3, wherein at least CHAF1A is measured.

23. The method of claim 22, wherein CHAF1A expression is downregulated.

24. The method of any of claims 2-3, wherein at least PURA is measured.

25. The method of claim 24, wherein PURA expression is downregulated.

26. The method of any of claims 2-3, wherein at least HDAC1 is measured.

27. The method of claim 26, wherein HDAC1 expression is downregulated.

28. The method of any of claims 2-3, wherein at least EWSR1 is measured.

29. The method of claim 28, wherein EWSR1 expression is downregulated.

30. The method of any of claims 2-3, wherein at least SSPB4 is measured.

31. The method of claim 30, wherein SSPB4 expression is downregulated.

32. The method of any of claims 2-31, wherein the levels of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or all 14 listed biomarkers are measured.

33. The method of any of claims 2-32, wherein the expression level of no other biomarker in the biological sample is measured.

34. The method of any of claims 2-33, wherein one of the listed biomarkers is excluded from being measured.

35. The method of claim 34, wherein two of the listed biomarkers are excluded from being measured.

36. The method of claim 35, wherein three of the listed biomarkers are excluded from being measured.

37. The method of claim 36, wherein four of the listed biomarkers are excluded from being measured.

38. The method of claim 37, wherein five of the listed biomarkers are excluded from being measured.

39. The method of claim 38, wherein six of the listed biomarkers are excluded from being measured.

40. The method of claim 39, wherein seven of the listed biomarkers are excluded from being measured.

41. The method of any of claims 2-40, further comprising comparing the level(s) of expression to a control sample(s) or control level(s) of expression.

42. The method of claim 41, wherein the control sample(s) have expression levels that are representative of normal colorectal cells, colorectal cancer cells from patients surviving 5 years disease-free, colorectal cancer cells from a cohort of patients who did not have liver metastasis, colorectal cancer cells from patients not surviving 5 years disease-free, colorectal cancer cells from a cohort of patients who had liver metastasis.

43. The method of claim 42, wherein the control level(s) of expression are representative of expression levels in samples from colorectal cancer patients surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis.

44. The method of claim 42, wherein the control sample(s) have expression levels that are representative of samples from colorectal cancer patients surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis.

45. The method of claim 42, wherein the control level(s) of expression are representative of expression levels in samples from colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

46. The method of claim 42, wherein the control sample(s) have expression levels that are representative of colorectal cancer patients not surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

47. The method of any of claims 41-46, wherein control samples or control levels are from a cohort of at least 100, 200, 300, 400, 500 or more patients.

48. The method of any of claims 2-47, wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 measured expression levels of the listed biomarkers in the biological sample from the patient are a) not differentially expressed as compared to the levels of expression in colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis b) differentially expressed as compared to the levels of expression in colorectal cancer patients not surviving more than 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

49. The method of claim 48, wherein either

a) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are not upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are not downregulated as compared to colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from patients who did not have liver metastasis; or

b) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are downregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are upregulated as compared to the levels of expression in colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who did have liver metastasis.

50. The method of claim 48 or 49, wherein the patient is identified as in the low-risk survivor cohort or as likely not to have liver metastasis.

51. The method of any of claims 2-46, wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 measured expression levels of the listed biomarkers in the biological sample from the patient are a) differentially expressed as compared to the levels of expression in colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or b) are not differentially expressed as compared to the levels of expression in colorectal cancer patients not surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

52. The method of claim 51, wherein either

a) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are downregulated as compared to colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or,

b) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are not upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are not downregulated compared to the levels of expression in colorectal cancer patients not surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

53. The method of claim 51 or 52, wherein the patient is identified as in the high-risk survivor cohort or as likely to have metastasis to the liver.

54. The method of any of claims 2-53, wherein the biological sample is a blood sample, a tissue sample, a tumor sample, fecal sample, or a colorectal sample.

55. The method of any of claims 2-54, further comprising treating the patient for colorectal cancer after measuring the level of expression of one or more listed biomarkers.

56. The method of claim 55, wherein the treatment comprises chemotherapy, radiation, and/or surgery.

57. The method of any of claims 2-56, wherein expression is measured using one or more hybridization and/or amplification assays.

58. The method of claim 57, wherein the assay comprises polymerase chain reaction.

59. The method of any of claims 2-58, creating an expression profile for the patient based on the expression levels of the measured listed biomarkers.

60. The method of any of claims 2-59, further comprising determining a risk score based on the expression profile for the patient.

61. The method of claim 60, wherein lymph node metastasis and/or CEA levels factor into the risk score.

62. The method of claim 61, wherein the cohort comprises at least 50, 100, 200, 300, 400, 500 or more patients.

63. A method comprising measuring in a biological sample from a colorectal cancer patient the levels of expression of the following biomarkers KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, and EWSR1.

64. The method of claim 63, wherein the level of expression of no additional biomarkers is measured.

65. A method comprising measuring in a biological sample from a colorectal cancer patient increased levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and SSBP2 and reduced levels of expression of 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and EWSR1 as compared to colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who have not had liver metastasis.

66. The method of claim 65, wherein the cohort comprises at least 50, 100, 200, 300, 400, 500 or more patients.

67. A method of treating a patient with colorectal cancer comprising administering a chemotherapy and/or radiation to the patient after a biological sample from the patient has been measured for the level of expression of one or more of the following listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, or EWSR1.

68. The method of claim 67, wherein the colorectal cancer patient was determined to have stage II or III cancer.

69. The method of any of claims 67-68, wherein at least KLF7 is measured.

70. The method of claim 69, wherein KLF7 expression is upregulated.

71. The method of any of claims 67-68, wherein at least PDLIM4 is measured.

72. The method of claim 71, wherein PDLIM4 expression is upregulated.

73. The method of any of claims 67-68, wherein at least MECP2 is measured.

74. The method of claim 73, wherein MECP2 expression is upregulated.

75. The method of any of claims 67-68, wherein at least RARB is measured.

76. The method of claim 75, wherein RARB expression is upregulated.

77. The method of any of claims 67-68, wherein at least TCF4 is measured.

78. The method of claim 77, wherein TCF4 expression is upregulated.

79. The method of any of claims 67-68, wherein at least ZNF354C is measured.

80. The method of claim 79, wherein ZNF354C expression is upregulated.

81. The method of any of claims 67-68, wherein at least TCEA2 is measured.

82. The method of claim 81, wherein TCEA2 expression is upregulated.

83. The method of any of claims 67-68, wherein at least SSBP2 is measured.

84. The method of claim 83, wherein SSBP2 expression is upregulated.

85. The method of any of claims 67-68, wherein at least EHF is measured.

86. The method of claim 85, wherein EHF expression is downregulated.

87. The method of any of claims 67-68, wherein at least CHAF1A is measured.

88. The method of claim 87, wherein CHAF1A expression is downregulated.

89. The method of any of claims 67-68, wherein at least PURA is measured.

90. The method of claim 89, wherein PURA expression is downregulated.

91. The method of any of claims 67-68, wherein at least HDAC1 is measured.

92. The method of claim 91, wherein HDAC1 expression is downregulated.

93. The method of any of claims 67-68, wherein at least EWSR1 is measured.

94. The method of claim 93, wherein EWSR1 expression is downregulated.

95. The method of any of claims 67-68, wherein at least PURA is measured.

96. The method of claim 95, wherein PURA expression is downregulated.

97. The method of any of claims 67-96, wherein the levels of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or all 14 listed biomarkers are measured.

98. The method of any of claims 67-97, wherein the expression level of no other biomarker in the biological sample is measured.

99. The method of any of claims 67-98, wherein one of the listed biomarkers is excluded from being measured.

100. The method of claim 99, wherein two of the listed biomarkers are excluded from being measured.

101. The method of claim 100, wherein three of the listed biomarkers are excluded from being measured.

102. The method of claim 101, wherein four of the listed biomarkers are excluded from being measured.

103. The method of claim 102, wherein five of the listed biomarkers are excluded from being measured.

104. The method of claim 103, wherein six of the listed biomarkers are excluded from being measured.

105. The method of claim 104, wherein seven of the listed biomarkers are excluded from being measured.

106. The method of any of claims 67-105, further comprising comparing the level(s) of expression to a control sample(s) or control level(s) of expression.

107. The method of claim 106, wherein the control sample(s) have expression levels that are representative of normal colorectal cells, colorectal cancer cells from patients surviving 5 years disease-free, colorectal cancer cells from a cohort of patients who did not have liver metastasis, colorectal cancer cells from patients not surviving at least 5 years disease-free, colorectal cancer cells from a cohort of patients who had liver metastasis.

108. The method of claim 107, wherein the control level(s) of expression are representative of expression levels in samples from colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis.

109. The method of claim 107, wherein the control sample(s) have expression levels that are representative of samples from colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis.

110. The method of claim 107, wherein the control level(s) of expression are representative of expression levels in samples from colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from patients who had liver metastasis.

111. The method of claim 107, wherein the control sample(s) have expression levels that are representative of colorectal cancer patients with a risk of surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis.

112. The method of any of claims 106-111, wherein control samples or control levels are from a cohort of at least 100, 200, 300, 400, 500 or more patients.

113. The method of any of claims 67-112, wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 measured expression levels of the listed biomarkers in the biological sample from the patient are a) not differentially expressed as compared to the levels of expression in colorectal cancer patients with a risk of surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis b) differentially expressed as compared to the levels of expression in colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

114. The method of claim 113, wherein either

a) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are not upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are not downregulated as compared to colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or

b) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are downregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are upregulated as compared to the levels of expression in colorectal cancer patients not surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis.

115. The method of claim 113 or 114, wherein the patient is identified as in the low-risk survivor cohort or as likely not to have liver metastasis.

116. The method of any of claims 67-111, wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 measured expression levels of the listed biomarkers in the biological sample from the patient are a) differentially expressed as compared to the levels of expression in colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or b) are not differentially expressed as compared to the levels of expression in colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

117. The method of claim 116, wherein either

a) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are downregulated as compared to colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or,

b) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are not upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are not downregulated compared to the levels of expression in colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

118. The method of claim 116 or 117, wherein the patient is identified as in the high-risk survivor cohort or as likely to have metastasis to the liver.

119. The method of any of claims 67-118, wherein the biological sample is a blood sample, a tissue sample, a tumor sample, fecal sample, or a colorectal sample.

120. The method of any of claims 67-119, wherein the treatment comprises chemotherapy, radiation, and/or surgery.

121. The method of any of claims 67-120, wherein expression is measured using one or more hybridization and/or amplification assays.

122. The method of claim 121, wherein the assay comprises polymerase chain reaction.

123. The method of any of claims 2-122, creating an expression profile for the patient based on the expression levels of the measured listed biomarkers.

124. The method of any of claims 2-123, further comprising determining a risk score based on the expression profile for the patient.

125. The method of claim 124, wherein lymph node metastasis and/or CEA levels factor into the risk score.

126. The method of any of claims 67-125, wherein the cohort comprises at least 50, 100, 200, 300, 400, 500 or more patients.

127. A method of prognosing a patient with colorectal cancer and/or evaluating treatment for the patient comprising:

a) measuring the level of expression of one or more of the listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 in a blood sample from the patient;

b) comparing the level(s) of expression to a control sample(s) or control level(s) of expression; and,

c) prognosing the patient and/or evaluating treatment for the patient based on the levels of measured expression.

128. The method of claim 127, wherein the colorectal cancer patient was determined to have stage II or III cancer.

129. The method of any of claims 127-128, wherein at least KLF7 is measured.

130. The method of claim 129, wherein KLF7 expression is upregulated.

131. The method of any of claims 127-128, wherein at least PDLIM4 is measured.

132. The method of claim 131, wherein PDLIM4 expression is upregulated.

133. The method of any of claims 127-128, wherein at least MECP2 is measured.

134. The method of claim 133, wherein MECP2 expression is upregulated.

135. The method of any of claims 127-128, wherein at least RARB is measured.

136. The method of claim 135, wherein RARB expression is upregulated.

137. The method of any of claims 127-128, wherein at least TCF4 is measured.

138. The method of claim 137, wherein TCF4 expression is upregulated.

139. The method of any of claims 127-128, wherein at least ZNF354C is measured.

140. The method of claim 139, wherein ZNF354C expression is upregulated.

141. The method of any of claims 127-128, wherein at least TCEA2 is measured.

142. The method of claim 141, wherein TCEA2 expression is upregulated.

143. The method of any of claims 127-128, wherein at least SSBP2 is measured.

144. The method of claim 143, wherein SSBP2 expression is upregulated.

145. The method of any of claims 127-128, wherein at least EHF is measured.

146. The method of claim 145, wherein EHF expression is downregulated.

147. The method of any of claims 127-128, wherein at least CHAF1A is measured.

148. The method of claim 147, wherein CHAF1A expression is downregulated.

149. The method of any of claims 127-128, wherein at least PURA is measured.

150. The method of claim 149, wherein PURA expression is downregulated.

151. The method of any of claims 127-128, wherein at least HDAC1 is measured.

152. The method of claim 151, wherein HDAC1 expression is downregulated.

153. The method of any of claims 127-128, wherein at least EWSR1 is measured.

154. The method of claim 153, wherein EWSR1 expression is downregulated.

155. The method of any of claims 127-128, wherein at least PURA is measured.

156. The method of claim 155, wherein PURA expression is downregulated.

157. The method of any of claims 127-156, wherein the levels of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or all 14 listed biomarkers are measured.

158. The method of any of claims 127-157, wherein the expression level of no other biomarker in the biological sample is measured.

159. The method of any of claims 127-158, wherein one of the listed biomarkers is excluded from being measured.

160. The method of claim 159, wherein two of the listed biomarkers are excluded from being measured.

161. The method of claim 160, wherein three of the listed biomarkers are excluded from being measured.

162. The method of claim 161, wherein four of the listed biomarkers are excluded from being measured.

163. The method of claim 162, wherein five of the listed biomarkers are excluded from being measured.

164. The method of claim 163, wherein six of the listed biomarkers are excluded from being measured.

165. The method of claim 164, wherein seven of the listed biomarkers are excluded from being measured.

166. The method of any of claims 127-165, further comprising comparing the level(s) of expression to a control sample(s) or control level(s) of expression.

167. The method of claim 166, wherein the control sample(s) have expression levels that are representative of normal colorectal cells, colorectal cancer cells from patients surviving at least 5 years disease-free, colorectal cancer cells from a cohort of patients who did not have liver metastasis, colorectal cancer cells from patients not surviving 5 years disease-free, colorectal cancer cells from a cohort of patients who had liver metastasis.

168. The method of claim 167, wherein the control level(s) of expression are representative of expression levels in samples from colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis.

169. The method of claim 167, wherein the control sample(s) have expression levels that are representative of samples from colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis.

170. The method of claim 167, wherein the control level(s) of expression are representative of expression levels in samples from colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

171. The method of claim 167, wherein the control sample(s) have expression levels that are representative of colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

172. The method of any of claims 166-171, wherein control samples or control levels are from a cohort of at least 100, 200, 300, 400, 500 or more patients.

173. The method of any of claims 127-172, wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 measured expression levels of the listed biomarkers in the biological sample from the patient are a) not differentially expressed as compared to the levels of expression in colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis b) differentially expressed as compared to the levels of expression in colorectal cancer patients with a risk of not surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

174. The method of claim 173, wherein either

a) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are not upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are not downregulated as compared to colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or

b) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are downregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are upregulated as compared to the levels of expression in colorectal cancer patients not surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis.

175. The method of claim 173 or 174, wherein the patient is identified as in the low-risk survivor cohort or as likely not to have liver metastasis.

176. The method of any of claims 127-171, wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 measured expression levels of the listed biomarkers in the biological sample from the patient are a) differentially expressed as compared to the levels of expression in colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or b) are not differentially expressed as compared to the levels of expression in colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

177. The method of claim 176, wherein either

a) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are downregulated as compared to colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or,

b) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are not upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are not downregulated compared to the levels of expression in colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

178. The method of claim 176 or 177, wherein the patient is identified as in the high-risk survivor cohort or as likely to have metastasis to the liver.

179. The method of any of claims 127-178, wherein the biological sample is a blood sample, a tissue sample, a tumor sample, fecal sample, or a colorectal sample.

180. The method of any of claims 127-179, further comprising treating the patient for colorectal cancer.

181. The method of claim 180, wherein the treatment comprises chemotherapy, radiation, and/or surgery.

182. The method of any of claims 127-181, wherein expression is measured using one or more hybridization and/or amplification assays.

183. The method of claim 182, wherein the assay comprises polymerase chain reaction.

184. The method of any of claims 127-183, creating an expression profile for the patient based on the expression levels of the measured listed biomarkers.

185. The method of any of claims 127-184, further comprising determining a risk score based on the expression profile for the patient.

186. The method of claim 185, wherein lymph node metastasis and/or CEA levels factor into the risk score.

187. The method of claim 186, wherein the cohort comprises at least 50, 100, 200, 300, 400, 500 or more patients.

188. A kit comprising, in suitable container means, at least one probe or one primer set to detect KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1.

189. The kit of claim 188, wherein the kit comprises at least one probe or one primer set to detect KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, and EWSR1.

190. The method of claim 1 further defined as a method for evaluating a stage II or stage III colorectal cancer patient comprising measuring the level of expression in a biological sample from the patient of one or more of the listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1.

191. The method of claim 190, wherein the colorectal cancer patient was determined to have stage II or III cancer.

192. The method of any of claims 190-191, wherein at least FN1 is measured.

193. The method of any of claims 190-191, wherein at least COL3A1 is measured.

194. The method of any of claims 190-191, wherein at least PRR16 is measured.

195. The method of any of claims 190-191, wherein at least POSTN is measured.

196. The method of any of claims 190-191, wherein at least BCAT1 is measured.

197. The method of any of claims 190-191, wherein at least COL1A2 is measured.

198. The method of any of claims 190-191, wherein at least DKK3 is measured.

199. The method of any of claims 190-191, wherein at least FSTL1 is measured.

200. The method of any of claims 190-199, further comprising measuring the level of expression of a gene identified in eTable 3.

201. The method of claim 190, wherein the levels of expression of at least two listed biomarkers are measured.

202. The method of claim 201, wherein the levels of expression of at least three listed biomarkers are measured.

203. The method of claim 202, wherein the levels of expression of at least four listed biomarkers are measured.

204. The method of claim 203, wherein the levels of expression of at least five listed biomarkers are measured.

205. The method of claim 204, wherein the levels of expression of at least six listed biomarkers are measured.

206. The method of claim 205, wherein the levels of expression of at least seven listed biomarkers are measured.

207. The method of claim 206, wherein the levels of expression of at least eight listed biomarkers are measured.

208. The method of any of claims 190-207, wherein the expression level of no other biomarker in the biological sample is measured.

209. The method of any of claims 190-208, wherein at least one of the listed biomarkers is excluded from being measured.

210. The method of claim 209, wherein at least two of the listed biomarkers are excluded from being measured.

211. The method of any of claims 190-210, wherein all of the genes in eTable 3 except the listed biomarkers are excluded from being measured.

212. The method of any of claims 190-211, further comprising comparing the level(s) of expression to a control sample(s) or control level(s) of expression.

213. The method of claim 212, wherein the control sample(s) have expression levels that are representative of normal colorectal cells, stage II or III colorectal cancer cells from patients with a risk of surviving 5 years disease-free that is greater than 50% (low-risk survivor cohort), or stage II or III colorectal cancer cells from patients not surviving at least 5 years disease-free, stage II or III colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy, or stage II or III colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy, wherein responsiveness or non-responsiveness to fluoropyrimidine is determined by survival benefit.

214. The method of claim 213, wherein the control level(s) of expression are representative of expression levels in samples from stage II or III colorectal cancer patients surviving at least 5 years disease-free, or colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy.

215. The method of claim 213, wherein the control sample(s) have expression levels that are representative of samples from stage II or III colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy.

216. The method of claim 214 or 215, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are increased as compared to the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy.

217. The method of claim 216, wherein the expression profile of the patient indicates the patient is in the high-risk survivor cohort and/or has a greater than 50% chance of being non-responsive to fluoropyrimidine-based adjuvant therapy.

218. The method of claim 214 or 215, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are within the levels representative of the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy.

219. The method of claim 216, wherein the expression profile of the patient indicates the patient is in the low risk survivor cohort and/or has a greater than 50% chance of being responsive to fluoropyrimidine-based adjuvant therapy.

220. The method of claim 213, wherein the control level(s) of expression are representative of expression levels in samples from stage II or III colorectal cancer patients not surviving at least 5 years disease-free or colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy.

221. The method of claim 213, wherein the control sample(s) have expression levels that are representative of stage II or III colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy.

222. The method of claim 220 or 221, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are within the levels representative of the expression levels of the high-risk survivor cohort and/or a cohort of patients who are non-responsive to fluoropyrimidine-based adjuvant therapy.

223. The method of claim 222, wherein the expression profile of the patient indicates the patient is in the high-risk survivor cohort and/or has a greater than 50% chance of being non-responsive to fluoropyrimidine-based adjuvant therapy.

224. The method of claim 220 or 221, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are decreased as compared to the levels representative of the expression levels of the high-risk survivor cohort and/or a cohort of patients who are non-responsive to fluoropyrimidine-based adjuvant therapy.

225. The method of claim 224, wherein the expression profile of the patient indicates the patient is in the low-risk survivor cohort and/or has a greater than 50% chance of being responsive to fluoropyrimidine-based adjuvant therapy.

226. The method of any of claims 190-221, wherein 1, 2, 3, 4, 5, 6, 7, or 8 measured expression levels of the listed biomarkers in the biological sample from the patient are differentially expressed compared to the levels of expression in stage II or stage III colorectal cancer cells from patients in a low-risk survivor cohort or patients responsive to a fluoropyrimidine-based adjuvant therapy.

227. The method of claim 226, wherein the expression of the listed biomarkers is increased.

228. The method of claim 226, wherein the patient is identified as in the high-risk survivor cohort or as likely not to respond to a fluoropyrimidine-based adjuvant therapy.

229. The method of claim 228, wherein the patient is administered an oxaliplatin-based therapy.

230. The method of any of claims 190-229, wherein 1, 2, 3, 4, 5, 6, 7, or 8 measured expression levels of the listed biomarkers in the biological sample from the patient are differentially expressed compared to the levels of expression in stage II or stage III colorectal cancer cells from patients in a high-risk survivor cohort or patients non-responsive to a fluoropyrimidine-based adjuvant therapy.

231. The method of claim 230, wherein the expression of the listed biomarkers is decreased.

232. The method of claim 230, wherein the patient is identified as in the low-risk survivor cohort or as likely to respond to a fluoropyrimidine-based adjuvant therapy.

233. The method of claim 232, wherein the patient is administered a fluoropyrimidine-based therapy.

234. The method of any of claims 190-233, wherein the biological sample is a blood sample, a tissue sample, a tumor sample, or a colorectal sample.

235. The method of any of claims 190-234, further comprising treating the patient for colorectal cancer after measuring the level of expression of one or more listed biomarkers.

236. The method of claim 235, wherein the treatment comprises a fluoropyrimidine-based therapy or an oxaliplatin-based therapy.

237. The method of any of claims 190-236, wherein expression is measured using one or more hybridization and/or amplification assays.

238. The method of claim 237, wherein the assay comprises polymerase chain reaction.

239. The method of any of claims 190-238, wherein the cohort comprises at least 50, 100, 200, 300, 400, 500 or more patients.

240. A method of treating a patient with stage II or III colorectal cancer comprising administering a fluoropyrimidine-based compound or an oxaliplatin-based compound to the patient after a biological sample from the patient has been measured for the level of expression of one or more of the following listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1.

241. The method of claim 240, wherein at least FN1 is measured.

242. The method of claim 240, wherein at least COL3A1 is measured.

243. The method of claim 240, wherein at least PRR16 is measured.

244. The method of claim 240, wherein at least POSTN is measured.

245. The method of claim 240, wherein at least BCAT1 is measured.

246. The method of claim 240, wherein at least COL1A2 is measured.

247. The method of claim 240, wherein at least DKK3 is measured.

248. The method of claim 240, wherein at least FSTL1 is measured.

249. The method of any of claims 240-248, further comprising measuring the level of expression of a gene identified in eTable 3.

250. The method of claim 240, wherein the levels of expression of at least two listed biomarkers are measured.

251. The method of claim 250, wherein the levels of expression of at least three listed biomarkers are measured.

252. The method of claim 251, wherein the levels of expression of at least four listed biomarkers are measured.

253. The method of claim 252, wherein the levels of expression of at least five listed biomarkers are measured.

254. The method of claim 253, wherein the levels of expression of at least six listed biomarkers are measured.

255. The method of claim 254, wherein the levels of expression of at least seven listed biomarkers are measured.

256. The method of claim 255, wherein the levels of expression of all eight listed biomarkers are measured.

257. The method of any of claims 240-256, wherein the expression level of no other biomarker in the biological sample is measured.

258. The method of any of claims 240-257, wherein at least one of the listed biomarkers is excluded from being measured.

259. The method of claim 258, wherein at least two of the listed biomarkers are excluded from being measured.

260. The method of any of claims 240-259, wherein all of the genes in eTable 3 except the listed biomarkers are excluded from being measured.

261. The method of any of claims 240-260, further comprising comparing the level(s) of expression to a control sample(s) or control level(s) of expression.

262. The method of claim 261, wherein the control sample(s) have expression levels that are representative of normal colorectal cells, stage II or III colorectal cancer cells from patients surviving at least 5 years disease-free, or stage II or III colorectal cancer cells from patients not surviving at least 5 years disease-free, stage II or III colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy, or stage II or III colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy, wherein responsiveness or non-responsiveness to fluoropyrimidine is determined by survival benefit.

263. The method of claim 262, wherein the control level(s) of expression are representative of expression levels in samples from stage II or III colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy.

264. The method of claim 262, wherein the control sample(s) have expression levels that are representative of samples from stage II or III colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy.

265. The method of claim 263 or 264, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are increased as compared to the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy.

266. The method of claim 265, wherein the expression profile of the patient indicates the patient is in the high-risk survivor cohort and/or has a greater than 50% chance of being non-responsive to fluoropyrimidine-based adjuvant therapy.

267. The method of claim 263 or 264, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are within the levels representative of the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy.

268. The method of claim 265, wherein the expression profile of the patient indicates the patient is in the low risk survivor cohort and/or has a greater than 50% chance of being responsive to fluoropyrimidine-based adjuvant therapy.

269. The method of claim 262, wherein the control level(s) of expression are representative of expression levels in samples from stage II or III colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells in patients who are responsive to fluoropyrimidine-based adjuvant therapy.

270. The method of claim 262, wherein the control sample(s) have expression levels that are representative of stage II or III colorectal cancer patients not surviving at least 5 years disease-free or colorectal cancer cells in patients who are non-responsive to fluoropyrimidine-based adjuvant therapy.

271. The method of claim 269 or 270, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are within the levels representative of the expression levels of the high-risk survivor cohort and/or a cohort of patients who are non-responsive to fluoropyrimidine-based adjuvant therapy.

272. The method of claim 271, wherein the expression profile of the patient indicates the patient is in the high-risk survivor cohort and/or has a greater than 50% chance of being non-responsive to fluoropyrimidine-based adjuvant therapy.

273. The method of claim 269 or 270, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are decreased as compared to the levels representative of the expression levels of the high-risk survivor cohort and/or a cohort of patients who are non-responsive to fluoropyrimidine-based adjuvant therapy.

274. The method of claim 273, wherein the expression profile of the patient indicates the patient is in the low-risk survivor cohort and/or has a greater than 50% chance of being responsive to fluoropyrimidine-based adjuvant therapy.

275. The method of any of claims 240-270, wherein 1, 2, 3, 4, 5, 6, 7, or 8 measured expression levels of the listed biomarkers in the biological sample from the patient are differentially expressed compared to the levels of expression in stage II or stage III colorectal cancer cells from patients in a low-risk survivor cohort or patients responsive to a fluoropyrimidine-based adjuvant therapy.

276. The method of claim 275, wherein the expression of the listed biomarkers is increased.

277. The method of claim 275, wherein the patient is identified as in the high-risk survivor cohort or as likely not to respond to a fluoropyrimidine-based adjuvant therapy.

278. The method of claim 277, wherein the patient is administered an oxaliplatin-based therapy.

279. The method of any of claims 240-278, wherein 1, 2, 3, 4, 5, 6, 7, or 8 measured expression levels of the listed biomarkers in the biological sample from the patient are differentially expressed compared to the levels of expression in stage II or stage III colorectal cancer cells from patients in a high-risk survivor cohort or patients non-responsive to a fluoropyrimidine-based adjuvant therapy.

280. The method of claim 279, wherein the expression of the listed biomarkers is decreased.

281. The method of claim 279, wherein the patient is identified as in the low-risk survivor cohort or as likely to respond to a fluoropyrimidine-based adjuvant therapy.

282. The method of claim 281, wherein the patient is administered a fluoropyrimidine-based therapy.

283. The method of any of claims 240-282, wherein the biological sample is a blood sample, a tissue sample, a tumor sample, or a colorectal sample.

284. The method of any of claims 240-283, further comprising treating the patient for colorectal cancer after measuring the level of expression of one or more listed biomarkers.

285. The method of claim 284, wherein the treatment comprises a fluoropyrimidine-based therapy or an oxaliplatin-based therapy.

286. The method of any of claims 240-285, wherein expression is measured using one or more hybridization and/or amplification assays.

287. The method of claim 286, wherein the assay comprises polymerase chain reaction.

288. A method of prognosing a patient with stage II or III colorectal cancer and/or evaluating treatment for the patient comprising:

a) measuring the level of expression of one or more of the listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 in a blood sample from the patient;

b) comparing the level(s) of expression to a control sample(s) or control level(s) of expression; and,

c) prognosing the patient and/or evaluating treatment for the patient based on the levels of measured expression.

289. The method of claim 288, wherein at least FN1 is measured.

290. The method of claim 288, wherein at least COL3A1 is measured.

291. The method of claim 288, wherein at least PRR16 is measured.

292. The method of claim 288, wherein at least POSTN is measured.

293. The method of claim 288, wherein at least BCAT1 is measured.

294. The method of claim 288, wherein at least COL1A2 is measured.

295. The method of claim 288, wherein at least DKK3 is measured.

296. The method of claim 288, wherein at least FSTL1 is measured.

297. The method of any of claims 288-296, further comprising measuring the level of expression of a gene identified in eTable 3.

298. The method of claim 288, wherein the levels of expression of at least two listed biomarkers are measured.

299. The method of claim 298, wherein the levels of expression of at least three listed biomarkers are measured.

300. The method of claim 299, wherein the levels of expression of at least four listed biomarkers are measured.

301. The method of claim 300, wherein the levels of expression of at least five listed biomarkers are measured.

302. The method of claim 301, wherein the levels of expression of at least six listed biomarkers are measured.

303. The method of claim 302, wherein the levels of expression of at least seven listed biomarkers are measured.

304. The method of claim 202, wherein the levels of expression of all eight listed biomarkers are measured.

305. The method of any of claims 288-304, wherein the expression level of no other biomarker in the biological sample is measured.

306. The method of any of claims 288-305, wherein at least one of the listed biomarkers is excluded from being measured.

307. The method of claim 306, wherein at least two of the listed biomarkers are excluded from being measured.

308. The method of any of claims 288-307, wherein all of the genes in eTable 3 except the listed biomarkers are excluded from being measured.

309. The method of any of claims 288-308, further comprising comparing the level(s) of expression to a control sample(s) or control level(s) of expression.

310. The method of claim 309, wherein the control sample(s) have expression levels that are representative of normal colorectal cells, stage II or III colorectal cancer cells from patients surviving at least 5 years disease-free, or stage II or III colorectal cancer cells from patients not surviving at least 5 years disease-free, stage II or III colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy, or stage II or III colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy, wherein responsiveness or non-responsiveness to fluoropyrimidine is determined by survival benefit.

311. The method of claim 310, wherein the control level(s) of expression are representative of expression levels in samples from stage II or III colorectal cancer patients with a risk of surviving at least 5 years disease-free or colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy.

312. The method of claim 310 or 311, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are increased as compared to the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy.

313. The method of claim 312, wherein the expression profile of the patient indicates the patient is in the high-risk survivor cohort and/or has a greater than 50% chance of being non-responsive to fluoropyrimidine-based adjuvant therapy.

314. The method of claim 310 or 311, wherein the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are within the levels representative of the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy.

315. The method of claim 312, wherein the expression profile of the patient indicates the patient is in the low risk survivor cohort and/or has a greater than 50% chance of being responsive to fluoropyrimidine-based adjuvant therapy.

316. The method of claim 310, wherein the control sample(s) have expression levels that are representative of samples from stage II or III colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy.

317. The method of claim 310, wherein the control level(s) of expression are representative of expression levels in samples from stage II or III colorectal cancer patients not surviving at least 5 years disease-free or colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy.

318. The method of claim 309, wherein the control sample(s) have expression levels that are representative of stage II or III colorectal cancer patients not surviving at least 5 years disease-free or colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy.

319. The method of any of claims 288-318, wherein 1, 2, 3, 4, 5, 6, 7, or 8 measured expression levels of the listed biomarkers in the biological sample from the patient are differentially expressed compared to the levels of expression in stage II or stage III colorectal cancer cells from patients in a low-risk survivor cohort or patients responsive to a fluoropyrimidine-based adjuvant therapy.

320. The method of claim 319, wherein the expression of the listed biomarkers is increased.

321. The method of claim 320, wherein the patient is identified as in the high-risk survivor cohort or as likely not to respond to a fluoropyrimidine-based adjuvant therapy.

322. The method of claim 321, wherein the patient is administered an oxaliplatin-based therapy.

323. The method of any of claims 288-322, wherein 1, 2, 3, 4, 5, 6, 7, or 8 measured expression levels of the listed biomarkers in the biological sample from the patient are differentially expressed compared to the levels of expression in stage II or stage III colorectal cancer cells from patients in a high-risk survivor cohort or patients non-responsive to a fluoropyrimidine-based adjuvant therapy.

324. The method of claim 323, wherein the expression of the listed biomarkers is decreased.

325. The method of claim 324, wherein the patient is identified as in the low-risk survivor cohort or as likely to respond to a fluoropyrimidine-based adjuvant therapy.

326. The method of claim 325, wherein the patient is administered a fluoropyrimidine-based therapy.

327. The method of any of claims 288-326, wherein the biological sample is a blood sample, a tissue sample, a tumor sample, or a colorectal sample.

328. The method of any of claims 288-327, further comprising treating the patient for colorectal cancer after measuring the level of expression of one or more listed biomarkers.

329. The method of claim 328, wherein the treatment comprises a fluoropyrimidine-based therapy or an oxaliplatin-based therapy.

330. The method of any of claims 288-329, wherein expression is measured using one or more hybridization and/or amplification assays.

331. The method of claim 330, wherein the assay comprises polymerase chain reaction.

332. The method of any of claims 288-331, further comprising treating the patient for stage II or stage III colorectal cancer after measuring the level of expression of one or more listed biomarkers.

333. The method of claim 332, wherein the patient is treated with a fluoropyrimidine-based compound after being determined to likely respond to a fluoropyrimidine-based compound.

334. The method of claim 332, wherein the patient is treated with an oxaliplatin-based compound after being determined to likely not respond to a fluoropyrimidine-based compound.

335. The method of any of claims 288-334, wherein the patient is prognosed as having a greater than 50% chance of being disease free and surviving cancer for a certain period of time.

336. The method of any of claims 288-334, wherein the patient is prognosed as having a greater than 50% chance of not being disease free and surviving cancer for a certain period of time.

337. The method of claim 335 or 336, wherein the certain period of time is 5 years.

338. The method of any of claims 288-337, wherein the cohort is at least 50, 100, 200, 300, 400, 500 or more patients.

339. A kit comprising 1, 2, or 3 probes or primer sets for determining expression levels of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, and/or FSTL1.

340. The kit of claim 339, wherein the kit further comprises one or more negative or positive control samples.

341. The method of claim 1 further defined as a method for evaluating a colorectal cancer patient comprising measuring the level of expression of miR-210, miR-425*, and/or miR-141 in a blood sample from the patient.

342. The method of claim 341, wherein the colorectal cancer patient was determined to have stage II or III cancer.

343. The method of any of claims 341-342, wherein the colorectal cancer patient was determined to have stage IV cancer.

344. The method of any of claims 341-343, wherein at least miR-210 is measured.

345. The method of any of claims 341-343, wherein at least miR-425* is measured.

346. The method of any of claims 341-343, wherein at least miR-141 is measured.

347. The method of any of claims 341-346, further comprising measuring the level of expression of miR-135b, miR-182, miR-183, and/or miR-224.

348. The method of claim 347, wherein at least the listed biomarker of miR-135b is measured.

349. The method of claim 347, wherein at least the listed biomarker of miR-182 is measured.

350. The method of claim 347, wherein at least the listed biomarker of miR-183 is measured.

351. The method of claim 347, wherein at least the listed biomarker of miR-224 is measured.

352. The method of any of claims 341-351, wherein at least miR-210 and miR-425* are measured.

353. The method of claim 352, wherein the levels of expression of miR-210, miR-425*, and miR-141 are measured.

354. The method of claim 347, wherein the levels of expression of at least two of miR-135b, miR-182, miR-183, and/or miR-224 are measured.

355. The method of claim 347, wherein the levels of expression of at least three of miR-135b, miR-182, miR-183, and/or miR-224 are measured.

356. The method of any of claims 341-355, wherein the expression level of no other miRNA biomarker in the biological sample is measured.

357. The method of any of claims 341-356, wherein miR-210 is excluded from being measured.

358. The method of any of claims 341-357, wherein miR-425* is excluded from being measured.

359. The method of any of claims 341-358, wherein miR-141 is excluded from being measured.

360. The method of claim 347, wherein the expression level of at least one of miR-135b, miR-182, miR-183, and/or miR-224 is excluded from being measured.

361. The method of claim 360, wherein the expression levels of at least two of miR-135b, miR-182, miR-183, and/or miR-224 are excluded from being measured.

362. The method of claim 360, wherein the expression levels of at least three of miR-135b, miR-182, miR-183, and/or miR-224 are excluded from being measured.

363. The method of claim 360, wherein the expression levels of miR-135b, miR-182, miR-183, and miR-224 are not measured.

364. The method of any of claims 341-363, further comprising comparing the level(s) of expression to a control sample(s) or control level(s) of expression.

365. The method of claim 364, wherein the control sample(s) have expression levels that are representative of non-metastatic colorectal cancer, normal colorectal cells, or colorectal cancer cells from patients surviving at least 3 years disease-free.

366. The method of claim 364, wherein the control level(s) of expression are representative of expression levels in samples negative for colorectal cancer or metastatic colorectal cancer or samples from colorectal cancer patients surviving at least 3 years disease-free.

367. The method of claim 364, wherein the control sample(s) have expression levels that are representative of samples positive for colorectal cancer or metastatic colorectal cancer or samples from colorectal cancer patients not surviving at least 3 years disease-free.

368. The method of claim 367, wherein the colorectal cancer is stage II or stage III.

369. The method of claim 367, wherein the colorectal cancer is stage IV.

370. The method of claim 364, wherein the control sample(s) have expression levels that are representative of colorectal cancer, metastatic colorectal cancer, colorectal cancer in patients surviving at least 3 years disease-free, or colorectal cancer in patients not surviving at least 3 years disease-free.

371. The method of claim 370, wherein the colorectal cancer is stage II or stage III.

372. The method of claim 370, wherein the colorectal cancer is stage IV.

373. The method of any of claims 341-372, wherein at least one measured expression level of the listed biomarkers in the biological sample from the patient is i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

374. The method of claim 373, wherein at least two measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

375. The method of claim 374, wherein at least three measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

376. The method of claim 375, wherein at least four measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

377. The method of claim 376, wherein at least five measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

378. The method of claim 377, wherein at least six measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

379. The method of claim 378, wherein all seven measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

380. The method of any of claims 341-372, wherein at least one measured expression level of the listed biomarkers in the biological sample from the patient is i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

381. The method of any of claims 341-380, wherein at least two measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

382. The method of any of claims 341-381, wherein at least three measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

383. The method of any of claims 341-382, wherein at least four measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

384. The method of any of claims 341-383, wherein at least five measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

385. The method of any of claims 341-384, wherein at least six measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

386. The method of any of claims 341-385, wherein at least seven measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

387. The method of any of claims 341-386, wherein all eight measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

388. The method of any of claims 341-387, wherein the biological sample is a blood sample, a serum sample, or a plasma sample.

389. The method of any of claims 341-388, wherein the patient has been diagnosed with colorectal cancer.

390. The method of any of claims 341-389, further comprising treating the patient for colorectal cancer after measuring the level of expression of one or more listed biomarkers.

391. The method of claim 390, wherein the patient is treated after measuring an increased level of expression of miR-210 and miR-425* as compared to the levels of expression in the low-risk survival cohort.

392. The method of claim 390, wherein the patient is treated after measuring a decreased level of expression of miR-210 and miR-425* as compared to the levels of expression in the high-risk survival cohort.

393. The method of any of claims 341-392, wherein the patient has undergone surgery to resect all or part of the colorectal cancer.

394. The method of any of claims 391-393, wherein the level of expression of miR-210 is measured pre-operative and/or post-operative.

395. The method of any of claims 391-393, wherein the level of expression of miR-425* is measured pre-operative and/or post-operative.

396. A method for evaluating a stage II or stage III colorectal cancer patient comprising measuring the level of expression of miR-210, miR-425*, and/or miR-141 in blood samples from the patient obtained before and/or after surgery to remove all or part of the colorectal cancer.

397. The method of claim 396, wherein the levels of expression of preoperative miR-210 and postoperative miR-425* are measured.

398. The method of claim 396 or 397, further comprising comparing the levels of expression to control samples or control levels.

399. The method of claim 398, wherein the control samples or control levels are representative of levels of expression of preoperative miR-210 and postoperative miR-425* in a low-risk patient cohort or a high-risk patient cohort.

400. The method of claim 399, wherein the level of expression of preoperative miR-210 and postoperative miR-425* are higher than the levels of expression for preoperative miR-210 and postoperative miR-425* in the low-risk patient cohort.

401. The method of claim 399, wherein the level of expression of preoperative miR-210 and postoperative miR-425* are within the range of levels of expression for preoperative miR-210 and postoperative miR-425* in the high-risk patient cohort.

402. The method of claim 399, wherein the level of expression of preoperative miR-210 and postoperative miR-425* are lower than the levels of expression for preoperative miR-210 and postoperative miR-425* in the high-risk patient cohort.

403. The method of claim 399, wherein the level of expression of preoperative miR-210 and postoperative miR-425* are within the range of levels of expression for preoperative miR-210 and postoperative miR-425* in the low-risk patient cohort.

404. The method of any of claims 400-404, further comprising determining a risk score for disease-free survival over a specified length of time based on the levels of expression for preoperative miR-210 and postoperative miR-425*.

405. The method of claim 404, wherein the risk score takes into account whether the patient is venous invasion positive and/or has any lymph node metastasis.

406. A method of prognosing a patient with colorectal cancer comprising

a) measuring the level of expression of miR-210, miR-425*, and/or miR-141 in a blood sample from the patient;

b) comparing the level(s) of expression to a control sample(s) or control level(s) of expression; and,

c) prognosing the patient based on the levels of measured expression.

407. The method of claim 406, wherein the colorectal cancer patient was determined to have stage II or III cancer.

408. The method of any of claims 406-408, wherein the colorectal cancer patient was determined to have stage IV cancer.

409. The method of any of claims 406-408, wherein at least miR-210 is measured.

410. The method of any of claims 406-408, wherein at least miR-425* is measured.

411. The method of any of claims 406-408, wherein at least miR-141 is measured.

412. The method of any of claims 406-411, further comprising measuring the level of expression of miR-135b, miR-182, miR-183, and/or miR-224.

413. The method of claim 412, wherein at least the listed biomarker of miR-135b is measured.

414. The method of claim 412, wherein at least the listed biomarker of miR-182 is measured.

415. The method of claim 412, wherein at least the listed biomarker of miR-183 is measured.

416. The method of claim 412, wherein at least the listed biomarker of miR-224 is measured.

417. The method of any of claims 406-416, wherein at least miR-210 and miR-425* are measured.

418. The method of claim 417, wherein the levels of expression of miR-210, miR-425*, and miR-141 are measured.

419. The method of claim 412, wherein the levels of expression of at least two of miR-135b, miR-182, miR-183, and/or miR-224 are measured.

420. The method of claim 412, wherein the levels of expression of at least three of miR-135b, miR-182, miR-183, and/or miR-224 are measured.

421. The method of any of claims 406-420, wherein the expression level of no other miRNA biomarker in the biological sample is measured.

422. The method of any of claims 406-421, wherein miR-210 is excluded from being measured.

423. The method of any of claims 406-422, wherein miR-425* is excluded from being measured.

424. The method of any of claims 406-423, wherein miR-141 is excluded from being measured.

425. The method of claim 412, wherein the expression level of at least one of miR-135b, miR-182, miR-183, and/or miR-224 is excluded from being measured.

426. The method of claim 425, wherein the expression levels of at least two of miR-135b, miR-182, miR-183, and/or miR-224 are excluded from being measured.

427. The method of claim 425, wherein the expression levels of at least three of miR-135b, miR-182, miR-183, and/or miR-224 are excluded from being measured.

428. The method of claim 425, wherein the expression levels of miR-135b, miR-182, miR-183, and miR-224 are not measured.

429. The method of claim 364, wherein the control sample(s) have expression levels that are representative of non-metastatic colorectal cancer, normal colorectal cells, or colorectal cancer cells from patients with a risk of surviving 3 years disease-free that is greater than 50% (low-risk survivor cohort).

430. The method of claim 364, wherein the control level(s) of expression are representative of expression levels in samples negative for colorectal cancer or metastatic colorectal cancer or samples from colorectal cancer patients with a risk of surviving 3 years disease-free that is greater than 50% (low-risk survivor cohort).

431. The method of claim 364, wherein the control sample(s) have expression levels that are representative of samples positive for colorectal cancer or metastatic colorectal cancer or samples from colorectal cancer patients with a risk of surviving 3 years disease-free that is less than 50% (high-risk survivor cohort).

432. The method of claim 431, wherein the colorectal cancer is stage II or stage III.

433. The method of claim 431, wherein the colorectal cancer is stage IV.

434. The method of claim 364, wherein the control sample(s) have expression levels that are representative of colorectal cancer, metastatic colorectal cancer, colorectal cancer with a risk of surviving 3 years disease-free that is greater than 50% (low-risk survivor cohort), or colorectal cancer with a risk of surviving 3 years disease-free that is less than 50% (high-risk survivor cohort).

435. The method of claim 434, wherein the colorectal cancer is stage II or stage III.

436. The method of claim 434, wherein the colorectal cancer is stage IV.

437. The method of any of claims 406-436, wherein at least one measured expression level of the listed biomarkers in the biological sample from the patient is i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

438. The method of claim 437, wherein at least two measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

439. The method of claim 438, wherein at least three measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

440. The method of claim 439, wherein at least four measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

441. The method of claim 440, wherein at least five measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

442. The method of claim 441, wherein at least six measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

443. The method of claim 442, wherein all seven measured expression levels of the listed biomarkers in the biological sample from the patient are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) is within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort.

444. The method of any of claims 406-436, wherein at least one measured expression level of the listed biomarkers in the biological sample from the patient is i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

445. The method of any of claims 406-444, wherein at least two measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

446. The method of any of claims 406-445, wherein at least three measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

447. The method of any of claims 406-446, wherein at least four measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

448. The method of any of claims 406-447, wherein at least five measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

449. The method of any of claims 406-448, wherein at least six measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

450. The method of any of claims 406-449, wherein at least seven measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

451. The method of any of claims 406-450, wherein all eight measured expression levels of the listed biomarkers in the biological sample from the patient are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) is within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort.

452. The method of any of claims 406-451, wherein the biological sample is a blood sample, a serum sample, or a plasma sample.

453. The method of any of claims 406-452, wherein the patient has been diagnosed with colorectal cancer.

454. The method of any of claims 406-453, further comprising treating the patient for colorectal cancer after measuring the level of expression of one or more listed biomarkers.

455. The method of claim 454, wherein the patient is treated after measuring an increased level of expression of miR-210 and miR-425* as compared to the levels of expression in the low-risk survival cohort.

456. The method of claim 454, wherein the patient is treated after measuring a decreased level of expression of miR-210 and miR-425* as compared to the levels of expression in the high-risk survival cohort.

457. The method of any of claims 406-456, wherein the patient has undergone surgery to resect all or part of the colorectal cancer.

458. The method of any of claims 455-457, wherein the level of expression of miR-210 is measured pre-operative and/or post-operative.

459. The method of any of claims 455-457, wherein the level of expression of miR-425* is measured pre-operative and/or post-operative.

460. The method of any of claims 406-459, wherein the patient is prognosed to be more likely than not to survive disease free for a specified amount of time.

461. The method of any of claims 406-459, wherein the patient is prognosed to be less likely than not to survive disease free for a specified amount of time.

462. The method of claim 460 or 461, wherein the specified amount of time is 3 years.

463. A kit comprising 1, 2, or 3 probes or primer sets for determining expression levels of miR-210, miR-425*, and/or miR-141.

464. The kit of claim 463, wherein the kit further comprises one or more negative or positive control samples.

Description:
METHODS AND COMPOSITIONS FOR TREATING AND PROGNOSING COLORECTAL CANCER

DESCRIPTION

[0001] This application claims the benefit of priority to U.S. Provisional Patent Application Serial Nos. 62/642,227, 62/642,409, and 62/642,414, all filed March 13, 2018, the contents of each of which are hereby incorporated by reference in their entirety.

STATEMENT OF GOVERNMENT SUPPORT

[0002] This invention was made with Government support under grants CA181572, CA072851, and CA202797, awarded by the National Institutes of Health. The Government has certain rights in the invention.

BACKGROUND OF THE INVENTION

1. Field of the Invention

[0003] The present invention relates generally to the fields of molecular biology and oncology. More particularly, it concerns methods and compositions involving biomarkers and cancer prognosis, diagnosis, and treatment.

2. Description of Related Art

[0004] Distant metastasis is the most frequent cause of mortality in patients with colorectal cancer (CRC). About 30% of new cases of CRC have distant metastatic disease (stage IV) at the time of diagnosis, and 50-60% of patients with Stage III and 25% with Stage II disease develop metastatic disease after curative resection (1). For stage II and III CRC patients, managing micro-metastatic disease after curative resection is the primary purpose of adjuvant therapy. Identifying patients who have micro-metastasis at the time of curative resection is therefore essential to clinical decision making about adjuvant chemotherapy, and it remains the most pressing challenge in the management of stage II and III CRC patients. Accordingly, there is a need in the art for prognostic markers that can predict recurrence in colorectal cancer patients.

SUMMARY OF THE INVENTION

[0005] Methods, compositions, and kits are provided related to patients who may or may not have signs of colorectal cancer or who have been diagnosed with colorectal cancer.

[0006] A number of methods are provided, including but not limited to methods for evaluating a patient, for evaluating a colorectal cancer patient, for measuring a biological sample from a patient or from a colorectal cancer patient, for treating a colorectal cancer patient, for prognosing a patient with colorectal cancer and/or evaluating treatment for the patient, for evaluating a stage II or stage III colorectal cancer patient, for measuring expression of one or more biomarkers, for diagnosing colorectal cancer in a patient, for determining the stage of colorectal cancer in a patient, for evaluating the chances of survival of a colorectal cancer patient, for evaluating a risk score of a colorectal cancer patient, for detecting or quantifying one or more biomarkers from a patient, for assaying for one or more biomarkers from a patient, for evaluating likelihood of recurrence in a cancer patient, and for employing a classifier to provide information about a colorectal cancer patient.

[0007] Steps for methods include but are not limited to the following: measuring the level of expression of one or more biomarkers; measuring a level of expression in a biological sample from the patient of one or more of the listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, or EWSR1; measuring a level of expression in a biological sample from the patient of one or more of the listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1; measuring a level of expression of miR-210, miR-425*, and/or miR-141; measuring, detecting, assaying, quantifying or determining the expression of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or all 25 (or any range therein) of the following biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, EWSR1, FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, FSTL1, miR-210, miR-425*, or miR-141; comparing expression level(s) to the level of expression of a control sample or a reference level of expression or to a threshold level of expression; identifying one or more biomarkers as differentially expressed compared to a particular cohort of patients; measuring a differential level of expression of one or more biomarkers as compared to a particular cohort of patients; identifying or diagnosing a patient as having a high risk of not surviving (high-risk survivor); identifying or diagnosing a patient as having a low risk of not surviving (low-risk survivor); calculating a risk score for the patient based on the level of expression of one or more biomarkers; prognosing a patient based on determined or measured expression levels; treating the patient for colorectal cancer; administering to the patient a fluoropyrimidine-based therapy; administering to the patient an oxaliplatin-based therapy; collecting a biological sample from the patient; processing the biological sample from the patient to measure expression of one or more biomarkers; preserving the biological sample from the patient; evaluating lymph nodes from the patient; evaluating tumor size in the patient; performing a hybridization assay on a biological sample; amplifying a nucleic acid in a biological sample; identifying the patient as likely to respond to a cancer therapy; and identifying the patient as unlikely to respond to a cancer therapy.

Any of these steps or steps disclosed elsewhere in the disclosure may be used in any method described herein.
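
By way of illustration only, the steps of comparing expression levels to a reference level and calculating a risk score might be sketched as follows. This sketch is not part of the disclosed methods: the function names, the reference values, and the rule of scoring one point per biomarker found to be increased are all hypothetical assumptions chosen to make the two steps concrete.

```python
from typing import Dict


def compare_to_reference(measured: Dict[str, float],
                         reference: Dict[str, float]) -> Dict[str, str]:
    """Label each measured biomarker relative to its reference level.

    Both dictionaries map a biomarker name to an expression level in the
    same (arbitrary) units. The reference values are hypothetical inputs,
    e.g. levels representative of a low-risk cohort.
    """
    calls = {}
    for name, level in measured.items():
        ref = reference[name]
        if level > ref:
            calls[name] = "increased"
        elif level < ref:
            calls[name] = "reduced"
        else:
            calls[name] = "unchanged"
    return calls


def count_based_risk_score(calls: Dict[str, str]) -> int:
    """Hypothetical score: one point per biomarker called 'increased'."""
    return sum(1 for call in calls.values() if call == "increased")


if __name__ == "__main__":
    # Illustrative numbers only; they are not data from the disclosure.
    measured = {"miR-210": 2.5, "miR-425*": 0.8, "miR-141": 1.0}
    reference = {"miR-210": 1.0, "miR-425*": 1.0, "miR-141": 1.0}
    calls = compare_to_reference(measured, reference)
    print(calls, count_based_risk_score(calls))
```

A weighted model, or one that also accounts for clinical covariates, could be substituted for the simple count without changing the comparison step.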

[0008] In some embodiments, there are methods for evaluating a patient comprising: measuring a level of expression in a biological sample from the patient of one or more of the listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, or EWSR1; measuring a level of expression in a biological sample from the patient of one or more of the listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1; and/or, measuring a level of expression of miR-210, miR-425*, and/or miR-141 in a blood sample from the patient. In some embodiments, the patient has one or more symptoms of, is suspected of having, is at risk for, has a family history of, or has been diagnosed with colorectal cancer. In certain embodiments, methods comprise measuring, detecting, assaying, quantifying or determining the expression of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or all 25 (or any range therein) of the following biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, EWSR1, FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, FSTL1, miR-210, miR-425*, or miR-141. It is specifically contemplated that 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 of the biomarkers (or any range therein) may be excluded from an embodiment. Expression may be of a transcript (or RNA) or a protein (if translated).

[0009] In some embodiments, there are methods for evaluating a colorectal cancer patient comprising measuring a level of expression in a biological sample from the patient of one or more of the listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, or EWSR1. In some embodiments, there are methods for evaluating a colorectal cancer patient comprising measuring a level of expression in a biological sample from the patient of one or more of the listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1. In some embodiments, there are methods for evaluating a colorectal cancer patient comprising measuring a level of expression of miR-210, miR-425*, and/or miR-141.

[0010] In some embodiments, there are methods for evaluating a colorectal cancer patient comprising measuring a level of expression in a biological sample from the patient of one or more of the listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, or EWSR1. In some embodiments, one or more biomarkers is differentially expressed compared to the biomarker in a low-risk survivor cohort; a low-risk survivor cohort refers to a cohort of patients who survived disease-free for five years or more, which means expression levels consistent with those in this cohort are indicative of a low risk of not surviving the cancer (i.e., a greater than 50% chance of survival). In some embodiments, one or more biomarkers is differentially expressed compared to the biomarker in a high-risk survivor cohort; a high-risk survivor cohort refers to a cohort of patients who did not survive disease-free for at least five years, which means expression levels consistent with those in this cohort are indicative of a high risk (greater than 50%) of not surviving the cancer. In some embodiments, 1, 2, 3, 4, 5, 6, 7, or all 8 of KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2 are measured and found to have differential expression as compared to the control or reference. In some embodiments, 1, 2, 3, 4, 5, 6, 7, or all 8 of KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2 are measured and found to have upregulated expression as compared to the control or reference. In some embodiments, 1, 2, 3, 4, 5, 6, 7, or all 8 of KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2 are measured and found to have expression within the representative level as compared to the control or reference.
In some embodiments, 1, 2, 3, 4, 5, 6, or 7 of the biomarkers in this paragraph may be excluded in an embodiment. In some embodiments, there are methods comprising measuring in a biological sample from a colorectal cancer patient increased levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and SSBP2 and/or reduced levels of expression of 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and EWSR1 as compared to colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; the biological sample could then identify the patient as having a high risk of not surviving 5 years disease-free or a high risk of liver metastasis. In some embodiments, methods involve measuring in a biological sample from a colorectal cancer patient levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and SSBP2 and/or levels of expression of 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and EWSR1 that are within the ranges of expression as compared to colorectal cancer patients surviving at least 5 years disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; the biological sample could then identify the patient as having a low risk of not surviving 5 years disease-free or a low risk of liver metastasis. In some embodiments, there are methods of treating a patient with colorectal cancer comprising administering a chemotherapy and/or radiation to the patient after a biological sample from the patient has been measured for the level of expression of one or more of the following listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, or EWSR1. In some embodiments, the patient has been determined to have stage II or stage III cancer.
In some embodiments, measured levels of expression of one or more of the following biomarkers indicates that treatment with the chemotherapy and/or radiation is warranted: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2; in some embodiments, the measured level of expression of one or more of the biomarkers is upregulated compared to the relevant control. In some embodiments, the patient has been determined to have stage II or stage III cancer. In some embodiments, measured levels of expression of one or more of the following biomarkers indicates that treatment with the chemotherapy and/or radiation is warranted: EHF, CHAF1A, PURA, HDAC1, SSPB4, or EWSR1; in some embodiments, the measured level of expression of one or more of the biomarkers is downregulated compared to the relevant control. It is specifically contemplated that 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13 of the biomarkers in this paragraph may be excluded in an embodiment. In some embodiments, there are methods of prognosing a patient with colorectal cancer and/or evaluating treatment for the patient comprising: a) measuring the level of expression of one or more of the listed biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 in a blood sample from the patient; b) comparing the level(s) of expression to a control sample(s) or control level(s) of expression; and, c) prognosing the patient and/or evaluating treatment for the patient based on the levels of measured expression. In some embodiments, the measured level of expression of one or more of KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 is upregulated compared to the relevant control. In some embodiments, the patient has been determined to have stage II or stage III cancer. In some embodiments, the measured level of expression of one or more of EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 is downregulated compared to the relevant control. 
It is specifically contemplated that 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13 of the biomarkers in this paragraph may be excluded in an embodiment.

[0011] In some embodiments, there are methods of evaluating a stage II or stage III colorectal cancer patient comprising measuring the level of expression in a biological sample from the patient of one or more of the listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, and/or FSTL1. In some embodiments, there are methods of treating a patient with stage II or III colorectal cancer comprising administering a fluoropyrimidine-based compound or an oxaliplatin-based compound to the patient after a biological sample from the patient has been measured for the level of expression of at least one or more of the following listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1. In some embodiments, there are methods of prognosing a patient with stage II or III colorectal cancer and/or evaluating treatment for the patient comprising: a) measuring the level of expression of one or more of the listed biomarkers: FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 in a blood sample from the patient; b) comparing the level(s) of expression to a control sample(s) or control level(s) of expression; and, c) prognosing the patient and/or evaluating treatment for the patient based on the levels of measured expression. In some embodiments, 1, 2, 3, 4, 5, 6, 7, or 8 measured expression levels of the listed biomarkers in the biological sample from the patient are differentially expressed compared to the levels of expression in stage II or stage III colorectal cancer cells from patients in a low-risk survivor cohort or patients responsive to a fluoropyrimidine-based adjuvant therapy. In some embodiments, the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are increased as compared to the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy.
In some embodiments, the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are within the levels representative of the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy. It will be understood that "within the levels" means that the measured level is within a normalized or standardized range of expression for that cohort. In some embodiments, 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are decreased as compared to the levels representative of the expression levels of the high-risk survivor cohort and/or a cohort of patients who are non-responsive to fluoropyrimidine-based adjuvant therapy. It is specifically contemplated that 1, 2, 3, 4, 5, 6, 7, or 8 of the biomarkers in this paragraph may be excluded in an embodiment.

[0012] In some embodiments, there are methods of prognosing a patient with colorectal cancer comprising a) measuring the level of expression of miR-210, miR-425*, and/or miR-141 in a blood sample from the patient; b) comparing the level(s) of expression to a control sample(s) or control level(s) of expression; and, c) prognosing the patient based on the levels of measured expression. In some embodiments, the colorectal cancer patient was determined to have stage II or III cancer, while in other embodiments, the colorectal cancer patient was determined to have stage IV cancer. In some embodiments, methods comprise measuring the level of expression of one or more of miR-135b, miR-182, miR-183, and/or miR-224, which may be in addition to measuring the level of expression of miR-210, miR-425*, and/or miR-141. In some embodiments, 1, 2, 3, 4, 5, or 6 of the miRNAs may be excluded in an embodiment.

[0013] A person of ordinary skill in the art understands that control or representative levels that are used for comparison purposes identify a sample as having the relevant characteristic of the control or representative level or as NOT having the relevant characteristic of the control or representative level if the measured level differs from the control or representative level; in other words, if the sample or control level is indicative of disease-free survival and the measured level differs from the range expected for that sample or control, then the sample does not have the characteristic indicative of disease-free survival. If, however, the sample or control level is indicative of disease-free survival and the measured level is within the range expected for that sample or control, then the sample does have the characteristic indicative of disease-free survival.
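
The comparison described in paragraph [0013] can be illustrated with a short sketch. It is an illustration only: the `ReferenceRange` type, the function name, and the numeric bounds are hypothetical and are not taken from this disclosure, which does not specify how the expected range for a control is derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceRange:
    """Hypothetical range of expression expected for a control cohort."""
    low: float
    high: float

    def contains(self, level: float) -> bool:
        # Inclusive bounds are an assumption made for this sketch.
        return self.low <= level <= self.high


def has_control_characteristic(measured: float, reference: ReferenceRange) -> bool:
    """Return True when the measured level lies within the control range.

    Within the range: the sample shares the characteristic the control
    represents (for example, disease-free survival). Outside the range:
    the sample does not share that characteristic.
    """
    return reference.contains(measured)


# Illustrative values only, in arbitrary normalized units.
disease_free_range = ReferenceRange(low=0.8, high=1.2)
within = has_control_characteristic(1.0, disease_free_range)
outside = has_control_characteristic(1.9, disease_free_range)
print(within, outside)
```

With these made-up bounds, the first sample would be treated as having the characteristic of the disease-free control and the second as lacking it.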

[0014] In some embodiments, the expression of at least KLF7 is measured. In some embodiments, KLF7 expression is upregulated. In some embodiments, the expression of at least PDLIM4 is measured. In some embodiments, PDLIM4 expression is upregulated. In some embodiments, the expression of at least MECP2 is measured. In some embodiments, MECP2 expression is upregulated. In some embodiments, the expression of at least RARB is measured. In some embodiments, RARB expression is upregulated. In some embodiments, the expression of at least TCF4 is measured. In some embodiments, TCF4 expression is upregulated. In some embodiments, the expression of at least ZNF354C is measured. In some embodiments, ZNF354C expression is upregulated. In some embodiments, the expression of at least TCEA2 is measured. In some embodiments, TCEA2 expression is upregulated. In some embodiments, the expression of at least SSBP2 is measured. In some embodiments, SSBP2 expression is upregulated. In some embodiments, 1, 2, 3, 4, 5, 6, 7, or all 8 of KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2 are measured and found to have differential expression as compared to the control or reference. In some embodiments, 1, 2, 3, 4, 5, 6, 7, or all 8 of KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2 are measured and found to have upregulated expression as compared to the control or reference. In some embodiments, 1, 2, 3, 4, 5, 6, 7, or all 8 of KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2 are measured and found to have expression within the representative level as compared to the control or reference. In some embodiments, 1, 2, 3, 4, 5, 6, or 7 of the biomarkers in this paragraph may be excluded in an embodiment.

[0015] In some embodiments, the expression of at least EHF is measured. In some embodiments, EHF expression is downregulated. In some embodiments, the expression of at least CHAF1A is measured. In some embodiments, CHAF1A expression is downregulated. In some embodiments, the expression of at least PURA is measured. In some embodiments, PURA expression is upregulated. In some embodiments, the expression of at least HDAC1 is measured. In some embodiments, HDAC1 expression is downregulated. In some embodiments, the expression of at least EWSR1 is measured. In some embodiments, EWSR1 expression is downregulated. In some embodiments, the expression of at least SSBP4 is measured. In some embodiments, SSBP4 expression is downregulated. In some embodiments, 1, 2, 3, 4, 5, or all 6 of EHF, CHAF1A, PURA, HDAC1, EWSR1, SSBP4 are measured and found to have differential expression as compared to the control or reference. In some embodiments, 1, 2, 3, 4, 5, or all 6 of EHF, CHAF1A, PURA, HDAC1, EWSR1, SSBP4 are measured and found to have downregulated expression as compared to the control or reference. In some embodiments, 1, 2, 3, 4, 5, or all 6 of EHF, CHAF1A, PURA, HDAC1, EWSR1, SSBP4 are measured and found to have expression within the representative level as compared to the control or reference. In some embodiments, the levels of expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or all 14 listed biomarkers are measured. In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or 13 of the biomarkers in this paragraph are excluded. In some embodiments, no other biomarker except KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, EWSR1, and/or SSBP4 is measured.

[0016] In some embodiments, methods comprise comparing the level(s) of expression to a control sample(s) or control level(s) of expression.

[0017] In some embodiments, a colorectal cancer patient was previously determined to have stage II or III cancer. In some embodiments, a colorectal cancer patient was previously determined to have stage IV cancer.

[0018] In some embodiments, the control sample(s), the control expression level, or representative expression level has an expression level(s) that is representative of either normal colorectal cells, colorectal cancer cells from a cohort of patients surviving 5 years or more disease-free, colorectal cancer cells from a cohort of patients who did not have liver metastasis, colorectal cancer cells from a cohort of patients not surviving 5 years disease-free, or colorectal cancer cells from a cohort of patients who had liver metastasis. It is contemplated that in other embodiments, a low survival risk cohort may be defined as a group of patients that did not survive beyond 2 or 3 years after initial diagnosis or completion of primary treatment. It is contemplated that in other embodiments, a high survival risk cohort may be defined as a group of patients that did survive beyond 2 or 3 years after initial diagnosis or completion of primary treatment. A patient may be deemed as having a low risk of survival if they have less than a 50, 40, 30, 20, or 10% chance or lower (or any range derivable therein) of surviving beyond 2, 3, or 5 years after diagnosis or completion of primary treatment. A patient may be deemed as having a high risk of survival if they have greater than a 50, 60, 70, 80, or 90% chance or more (or any range derivable therein) of surviving beyond 2, 3, or 5 years after diagnosis or completion of primary treatment. Any of the methods described herein may include a step of identifying a patient as having a low risk of survival or a high risk of survival based on the measured levels of expression of one or more biomarkers described herein. In some embodiments, a patient is determined to have less than a 50, 40, 30, 20, or 10% chance or lower (or any range derivable therein) of having metastasis, such as liver metastasis.
In some embodiments, a patient is determined to have more than a 50, 60, 70, 80, 90% or greater chance (or any range derivable therein) of having a metastasis, such as a liver metastasis.

[0019] In some embodiments, the measured expression of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or all 25 (or any range therein) of the following biomarkers: KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, EWSR1, FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, FSTL1, miR-210, miR-425*, or miR-141 in the biological sample from the patient are a) differentially expressed as compared to the levels of expression in colorectal cancer patients surviving 5 years or more disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or b) are not differentially expressed as compared to the levels of expression in colorectal cancer patients not surviving at least 5 years or colorectal cancer cells from a cohort of patients who did have liver metastasis.

[0020] In some embodiments, a cohort of patients is at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000 or more patients (or any range derivable therein).

[0021] In some embodiments, measuring, detecting, quantifying, or assaying shows a) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are not upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are not downregulated as compared to colorectal cancer patients surviving 5 years or more disease-free or colorectal cancer cells from patients who did not have liver metastasis; or b) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are downregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are upregulated as compared to the levels of expression in colorectal cancer patients surviving 5 years or more disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis. In some embodiments, the patient is identified as having a low survival risk or as likely not to have a liver metastasis. In some embodiments, the patient is identified as having a high survival risk or as likely not to have a liver metastasis.

[0022] In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 measured expression levels of the listed biomarkers of KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 in the biological sample from the patient are a) differentially expressed as compared to the levels of expression in colorectal cancer patients surviving 5 years or more disease-free or colorectal cancer cells from a cohort of patients who did not have liver metastasis; or b) are not differentially expressed as compared to the levels of expression in colorectal cancer patients surviving less than 5 years disease-free or colorectal cancer cells from a cohort of patients who had liver metastasis.

[0023] In some embodiments, methods involve the following: either a) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are downregulated as compared to expression levels in colorectal cancer cells from a cohort of colorectal cancer patients surviving 5 years or greater disease-free or from colorectal cancer cells from a cohort of patients who did not have liver metastasis; or, b) the levels of expression of 1) KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, and/or SSBP2 are not upregulated and/or 2) EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1 are not downregulated compared to expression levels in colorectal cancer cells from a cohort of colorectal cancer patients not surviving 5 years or greater disease-free or from colorectal cancer cells from a cohort of patients who did have liver metastasis. In some embodiments, a patient is identified as having a high survival risk or as not likely to have metastasis to the liver. In some embodiments, a patient is identified as having a low survival risk or as likely to have metastasis to the liver.
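
As a hedged illustration of option a) in paragraph [0023], the sketch below flags, for each measured marker, whether it shows the stated direction of change relative to the reference cohort (upregulated for the first group, downregulated for the second). The function name, the use of log2 fold change as the input, and the threshold of 1.0 are assumptions made for this example; the disclosure does not fix a numeric cut-off. Gene symbols are spelled as they appear in paragraph [0023].

```python
from typing import Dict, Mapping

# Group 1: markers described as upregulated in option a).
UPREGULATED_GROUP = (
    "KLF7", "PDLIM4", "MECP2", "RARB", "TCF4", "ZNF354C", "TCEA2", "SSBP2",
)
# Group 2: markers described as downregulated in option a).
DOWNREGULATED_GROUP = ("EHF", "CHAF1A", "PURA", "HDAC1", "SSPB4", "EWSR1")


def marker_calls(
    log2_fold_change: Mapping[str, float],
    threshold: float = 1.0,  # hypothetical cut-off, not from the disclosure
) -> Dict[str, bool]:
    """Flag each measured marker that matches the option a) direction.

    Input values are log2(sample / reference cohort). Markers outside the
    two groups are ignored.
    """
    calls: Dict[str, bool] = {}
    for gene, lfc in log2_fold_change.items():
        if gene in UPREGULATED_GROUP:
            calls[gene] = lfc >= threshold
        elif gene in DOWNREGULATED_GROUP:
            calls[gene] = lfc <= -threshold
    return calls


# Made-up measurements for three markers.
sample = {"KLF7": 1.6, "EHF": -1.3, "RARB": 0.2}
calls = marker_calls(sample)
print(calls)
```

Because any subset of the markers may be measured, the function only reports on the markers actually supplied rather than requiring all 14.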

[0024] In some embodiments, the biological sample is a blood sample, a tissue sample, a tumor sample, a fecal sample, or a colorectal sample.

[0025] In some embodiments, methods further comprise treating the patient for colorectal cancer after measuring the level of expression of one or more biomarkers discussed herein. In some embodiments, treatment comprises chemotherapy, radiation, and/or surgery.

[0026] In some embodiments, expression is measured using one or more hybridization and/or amplification assays. In some embodiments, an assay comprises polymerase chain reaction.

[0027] In some embodiments, methods involve creating an expression profile for the patient based on the expression levels of the measured biomarkers. In some embodiments, methods include determining a risk score or evaluating survival chances or evaluating chances of metastasis based on the expression profile for the patient. In some embodiments, lymph node metastasis and/or CEA levels factor into the risk score.
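
One way to picture a risk score computed from an expression profile, as mentioned in paragraph [0027], is a weighted sum over the measured features. The sketch below is an assumption for illustration only: the disclosure does not state the form of the score here, and the weights, feature names (`lymph_node_metastasis`, `CEA`), and values are placeholders rather than fitted or disclosed coefficients.

```python
from typing import Mapping


def risk_score(
    profile: Mapping[str, float],
    weights: Mapping[str, float],
    intercept: float = 0.0,
) -> float:
    """Return intercept + sum(weight * value) over the weighted features.

    Raises KeyError if the profile lacks a feature that has a weight, so a
    missing measurement is not silently treated as zero.
    """
    missing = sorted(set(weights) - set(profile))
    if missing:
        raise KeyError(f"profile is missing features: {missing}")
    return intercept + sum(weights[name] * profile[name] for name in weights)


# Placeholder weights: hypothetical, NOT taken from the disclosure.
example_weights = {
    "KLF7": 0.5,
    "EHF": -0.4,
    "lymph_node_metastasis": 1.0,  # 1.0 if present, 0.0 if absent
    "CEA": 0.1,
}
# Made-up expression profile plus clinical covariates for one patient.
example_profile = {
    "KLF7": 2.0,
    "EHF": 1.0,
    "lymph_node_metastasis": 1.0,
    "CEA": 3.0,
}
score = risk_score(example_profile, example_weights)
print(round(score, 3))
```

In practice a score like this would be compared against a cut-off chosen from a reference cohort; that cut-off is likewise not specified in this paragraph.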

[0028] In some methods, expression level is measured in or in at least or in at most 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1. In some embodiments, at least FN1 is measured. In some embodiments, at least COL3A1 is measured. In some embodiments, at least PRR16 is measured. In some embodiments, at least POSTN is measured. In some embodiments, at least BCAT1 is measured. In some embodiments, at least COL1A2 is measured. In some embodiments, at least DKK3 is measured. In some embodiments, at least FSTL1 is measured. In some embodiments, methods also comprise measuring the level of expression of another gene identified in eTable 3 or any other table identified herein. In some embodiments, the expression level of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 (or any range derivable therein) is measured or is taken into consideration. Any of the biomarkers in eTable 3 or any other table herein may be excluded, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 (or any range derivable therein) of the other biomarkers on the tables. In some embodiments, the expression level of no other biomarker in the biological sample is measured. In some embodiments, no other biomarker besides those specifically enumerated in this paragraph is evaluated or measured for expression. In some embodiments, methods comprise comparing the level(s) of expression to a control sample(s) or control level(s) of expression.
In some embodiments, the control sample(s) have expression levels that are representative of normal colorectal cells, stage II or III colorectal cancer cells from patients surviving 5 years disease-free (and are indicative of a greater than 50% chance of survival, making this a low-risk survivor cohort), or stage II or III colorectal cancer cells from patients not surviving at least 5 years disease-free (and are indicative of a less than 50% chance of surviving 5 years, making this a high-risk survivor cohort), stage II or III colorectal cancer cells from patients who are responsive to fluoropyrimidine-based adjuvant therapy, or stage II or III colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy, wherein responsiveness or non-responsiveness to fluoropyrimidine is determined by survival benefit. In some embodiments, the control level(s) of expression are representative of expression levels in samples from stage II or III colorectal cancer patients who survived 5 years or more disease-free or from colorectal cancer cells in patients who were responsive to fluoropyrimidine-based adjuvant therapy. In some embodiments, the control sample(s) have expression levels that are representative of samples from stage II or III colorectal cancer patients not surviving 5 years disease-free or colorectal cancer cells from patients who are not responsive or have low responsiveness to fluoropyrimidine-based adjuvant therapy. In some embodiments, the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are increased as compared to the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy.
In some embodiments, the expression profile of the patient indicates the patient is in the high-risk survivor cohort and/or has a greater than 50% chance of being non-responsive to fluoropyrimidine-based adjuvant therapy. In some embodiments, the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are within the levels representative of the expression levels of the low-risk survivor cohort and/or a cohort of patients who are responsive to fluoropyrimidine-based adjuvant therapy. In some embodiments, the expression profile of the patient indicates the patient is in the low-risk survivor cohort and/or has a greater than 50% chance of being responsive to fluoropyrimidine-based adjuvant therapy. In some embodiments, the control level(s) of expression are representative of expression levels in samples from stage II or III colorectal cancer patients with a chance of surviving 5 years disease-free that is less than 50% (high-risk survivor cohort) or colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy. In some embodiments, the control sample(s) have expression levels that are representative of stage II or III colorectal cancer patients with a chance of surviving 5 years disease-free that is less than 50% (high-risk survivor cohort) or colorectal cancer cells from patients who are non-responsive to fluoropyrimidine-based adjuvant therapy. In some embodiments, the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are within the levels representative of the expression levels of the high-risk survivor cohort and/or a cohort of patients who are non-responsive to fluoropyrimidine-based adjuvant therapy. In some embodiments, the expression profile of the patient indicates the patient is in the high-risk survivor cohort and/or has a greater than 50% chance of being non-responsive to fluoropyrimidine-based adjuvant therapy.
In some embodiments, the expression levels of 1, 2, 3, 4, 5, 6, 7, or all 8 of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, or FSTL1 are decreased as compared to the levels representative of the expression levels of the high-risk survivor cohort and/or a cohort of patients who are non-responsive to fluoropyrimidine-based adjuvant therapy. In some embodiments, the expression profile of the patient indicates the patient is in the low-risk survivor cohort and/or has a greater than 50% chance of being responsive to fluoropyrimidine-based adjuvant therapy.

[0029] In some embodiments, 1, 2, 3, 4, 5, 6, 7, or 8 measured expression levels of the listed biomarkers in the biological sample from the patient are differentially expressed compared to the levels of expression in stage II or stage III colorectal cancer cells from patients in a low-risk survivor cohort or patients responsive to a fluoropyrimidine-based adjuvant therapy. In some embodiments, the expression of the listed biomarkers is increased. In some embodiments, the patient is identified as in the high-risk survivor cohort or as likely not to respond to a fluoropyrimidine-based adjuvant therapy. In some embodiments, the patient is administered an oxaliplatin-based therapy. In some embodiments, 1, 2, 3, 4, 5, 6, 7, or 8 measured expression levels of the listed biomarkers in the biological sample from the patient are differentially expressed compared to the levels of expression in stage II or stage III colorectal cancer cells from patients in a high-risk survivor cohort or patients non-responsive to a fluoropyrimidine-based adjuvant therapy. In some embodiments, the expression of the listed biomarkers is decreased. In some embodiments, the patient is identified as in the low-risk survivor cohort or as likely to respond to a fluoropyrimidine-based adjuvant therapy. In some embodiments, the patient is administered a fluoropyrimidine-based therapy.

[0030] In some embodiments, methods include treating the patient for colorectal cancer after measuring the level of expression of one or more listed biomarkers. In some embodiments, treatment comprises a fluoropyrimidine-based therapy or an oxaliplatin-based therapy.

[0031] In some embodiments, methods involve either prognosing a patient or wherein a patient is prognosed as having a greater than 50% chance of being disease free and surviving cancer for a certain period of time. In some embodiments, methods involve either prognosing a patient or wherein a patient is prognosed as having a greater than 50% chance of not being disease free and surviving cancer for a certain period of time.

[0032] In some embodiments, methods concern evaluating a colorectal cancer patient comprising measuring the level of expression of miR-210, miR-425*, and/or miR-141 in a blood sample from the patient. In some embodiments, the colorectal cancer patient was determined to have stage II or III cancer, while in other embodiments, the colorectal cancer patient was determined to have stage IV cancer. In some embodiments, at least miR-210 is measured. In some embodiments, at least miR-425* is measured. In some embodiments, at least miR-141 is measured. In some embodiments, expression of 2 or 3 of the following is measured: miR-210, miR-425*, and miR-141. In some embodiments, expression of 1, 2, 3, or 4 of the following biomarkers is measured: miR-135b, miR-182, miR-183, and/or miR-224. In some embodiments, at least miR-135b is measured. In some embodiments, at least miR-182 is measured. In some embodiments, at least miR-183 is measured. In some embodiments, at least miR-224 is measured. In some embodiments, at least miR-210 and miR-425* are measured. In some embodiments, the levels of expression of miR-210, miR-425*, and miR-141 are measured. In some embodiments, the levels of expression of at least two, three, or all four of miR-135b, miR-182, miR-183, and/or miR-224 are measured. In certain embodiments, the expression level of no other biomarker (not only miRNA) in the biological sample is measured. In certain embodiments, miR-210, miR-425*, and/or miR-141 is excluded. In some embodiments, the expression level of 1, 2, 3, or 4 of miR-135b, miR-182, miR-183, and/or miR-224 is excluded from being measured. In some embodiments, methods also include comparing the level(s) of expression to a control sample(s) or control level(s) of expression.
In some embodiments, the control sample(s) have expression levels that are representative of non-metastatic colorectal cancer, normal colorectal cells, or colorectal cancer cells from patients surviving at least 3 years (or 5 years in some embodiments) disease-free, which can be indicative of a chance of survival beyond 3 years (or 5 years in some embodiments) that is greater than 50% (referred to as low-risk survivor cohort). In some embodiments, control level(s) of expression are representative of expression levels in samples negative for colorectal cancer or metastatic colorectal cancer or samples from colorectal cancer patients surviving 3 years (or 5 years in some embodiments) disease-free, which is indicative of a greater than 50% chance of survival in 3 years (or 5 years in some embodiments) (low-risk survivor cohort). In some embodiments, control sample(s) have expression levels that are representative of samples positive for colorectal cancer or metastatic colorectal cancer or samples from colorectal cancer patients not surviving 3 years disease-free, which is indicative of a chance of survival in 3 years that is less than 50% (high-risk survivor cohort). In some embodiments, the control sample(s) have expression levels that are representative of colorectal cancer, metastatic colorectal cancer, colorectal cancer in patients surviving 3 years disease-free (or 5 years in some embodiments), which is indicative of a chance of surviving that is greater than 50% (low-risk survivor cohort), or colorectal cancer in patients not surviving 3 years disease-free, which is indicative of a chance of survival that is less than 50% in a 3 year period (or 5 years in some embodiments) (high-risk survivor cohort).
In some embodiments, there are methods of prognosing a patient with colorectal cancer comprising a) measuring the level of expression of miR-210, miR-425*, and/or miR-141 in a blood sample from the patient; b) comparing the level(s) of expression to a control sample(s) or control level(s) of expression; and c) prognosing the patient based on the levels of measured expression.

[0033] It will be understood throughout this disclosure that controls or evaluation of survival risk are evaluated based on certain characteristics such as survival length or disease-free length or both. The length of time can be or be at least 6 months, 12 months, 1 year, 18 months, 2 years, 30 months, 3 years, 42 months, 4 years, 54 months, 5 years, 6 years, 7 years, 8 years, 9 years, 10 years or more (or any range derivable therein). Embodiments described in the context of 3 or 5 years may be alternatively set at a different time period. It will be understood by someone of ordinary skill in the art that methods and kits disclosed herein can be used to assess risk over varying times depending on the relevant control used. Therefore, controls may be survival or non-survival for a length of time that is or is at least 6 months, 12 months, 1 year, 18 months, 2 years, 30 months, 3 years, 42 months, 4 years, 54 months, 5 years, 6 years, 7 years, 8 years, 9 years, 10 years or more (or any range derivable therein). Risk assessments can be based on these time periods and may be expressed as chances in terms of percentages or percentiles in 10% or decile increments between 0 and 100, such as 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100 (and any range derivable therein). Therefore, any percent chance, for instance, may be described as, or as at least, a 10, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90 or 100% chance of occurring. Any embodiment discussed herein in the context of chances or risk can be described as such.

[0034] In some embodiments, methods concern measured expression levels of at least 1, 2, 3, 4, 5, 6, or 7 miRNA biomarkers (miR-210, miR-425*, miR-141, miR-135b, miR-182, miR-183, and/or miR-224) in the biological sample from the patient. In some embodiments, the measured expression levels of 1, 2, 3, 4, 5, 6 and/or 7 of these miRNAs are i) reduced compared to the levels of expression in stage IV colorectal cancer cells or to the levels of expression in a high-risk survivor cohort or ii) within the range of expression representative of stage II or stage III colorectal cancer cells or within the range of expression representative of patients in the low-risk survivor cohort. In some embodiments, the measured expression levels of 1, 2, 3, 4, 5, 6 and/or 7 of these miRNAs are i) increased compared to the levels of expression in stage II or III colorectal cancer cells or to the levels of expression in a low-risk survivor cohort or ii) within the range of expression representative of stage IV colorectal cancer cells or within the range of expression representative of patients in the high-risk survivor cohort. In some embodiments, methods comprise or further comprise treating the patient for colorectal cancer after measuring the level of expression of one or more miRNA biomarkers. In some embodiments, the patient is treated after measuring an increased level of expression of miR-210 and miR-425* as compared to the levels of expression in the low-risk survival cohort or after measuring a decreased level of expression of miR-210 and miR-425* as compared to the levels of expression in the high-risk survival cohort.

[0035] In some embodiments, a patient has undergone surgery to resect all or part of the colorectal cancer. In some embodiments, the level of one or more biomarkers is measured before an operation (pre-operative) or after an operation (post-operative). In specific embodiments, the biomarker is an miRNA, such as miR-210 and/or miR-425*. In some embodiments, there are methods for evaluating a stage II or stage III colorectal cancer patient comprising measuring the level of expression of miR-210, miR-425*, and/or miR-141 in blood samples from the patient obtained before and/or after surgery to remove all or part of the colorectal cancer. In some embodiments, the levels of expression of preoperative miR-210 and postoperative miR-425* are measured. In some embodiments, methods further comprise comparing the levels of expression to control samples or control levels. In some embodiments, the control samples or control levels are representative of levels of expression of preoperative miR-210 and postoperative miR-425* in a low-risk patient cohort or a high-risk patient cohort. In some embodiments, the levels of expression of preoperative miR-210 and postoperative miR-425* are higher than the levels of expression for preoperative miR-210 and postoperative miR-425* in the low-risk patient cohort. In some embodiments, the levels of expression of preoperative miR-210 and postoperative miR-425* are within the range of levels of expression for preoperative miR-210 and postoperative miR-425* in the high-risk patient cohort. In some embodiments, the levels of expression of preoperative miR-210 and postoperative miR-425* are lower than the levels of expression for preoperative miR-210 and postoperative miR-425* in the high-risk patient cohort. In some embodiments, the levels of expression of preoperative miR-210 and postoperative miR-425* are within the range of levels of expression for preoperative miR-210 and postoperative miR-425* in the low-risk patient cohort. 
In some embodiments, methods comprise or further comprise determining a risk score for disease-free survival over a specified length of time based on the levels of expression for preoperative miR-210 and postoperative miR-425*. In some embodiments, a risk score takes into account whether the patient is venous invasion positive and/or has any lymph node metastasis.
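
As an illustration only, a risk score of the kind described in the preceding paragraph could be sketched as a weighted sum of the preoperative miR-210 level, the postoperative miR-425* level, and binary clinical indicators. The weights, the cutoff, and every class, function, and parameter name below are hypothetical placeholders; this passage does not specify a formula, so the linear form itself is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PatientMeasurements:
    pre_op_mir210: float         # normalized preoperative miR-210 expression
    post_op_mir425_star: float   # normalized postoperative miR-425* expression
    venous_invasion: bool        # True if venous invasion positive
    lymph_node_metastasis: bool  # True if any lymph node metastasis


# Hypothetical weights, for illustration only (not taken from the disclosure).
WEIGHTS = {
    "pre_op_mir210": 1.0,
    "post_op_mir425_star": 1.0,
    "venous_invasion": 0.5,
    "lymph_node_metastasis": 0.5,
}


def risk_score(p: PatientMeasurements) -> float:
    """Linear combination of expression levels and clinical indicators."""
    return (
        WEIGHTS["pre_op_mir210"] * p.pre_op_mir210
        + WEIGHTS["post_op_mir425_star"] * p.post_op_mir425_star
        + WEIGHTS["venous_invasion"] * float(p.venous_invasion)
        + WEIGHTS["lymph_node_metastasis"] * float(p.lymph_node_metastasis)
    )


def risk_group(score: float, cutoff: float = 2.0) -> str:
    """Assign a risk group; the cutoff is a hypothetical value."""
    return "high-risk" if score >= cutoff else "low-risk"
```

In practice the weights and cutoff would have to be fitted on a training cohort (for example with a Cox proportional hazards model) rather than chosen by hand as they are here.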

[0036] In some embodiments, there are kits. Components of kits can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50 or more probes or primer sets for detecting, measuring, assaying, quantifying, or determining the level of expression of, of at least, or of at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 biomarkers disclosed herein. In some embodiments, there is a kit comprising, in suitable container means, at least one probe or one primer set to detect KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, and/or EWSR1. In some embodiments, a kit comprises at least one probe or one primer set to detect KLF7, PDLIM4, MECP2, RARB, TCF4, ZNF354C, TCEA2, SSBP2, EHF, CHAF1A, PURA, HDAC1, SSPB4, and EWSR1. In some embodiments, a kit comprises 1, 2, or 3 probes or primer sets for determining expression levels of FN1, COL3A1, PRR16, POSTN, BCAT1, COL1A2, DKK3, and/or FSTL1. In some embodiments, a kit comprises 1, 2, or 3 probes or primer sets for determining expression levels of miR-210, miR-425*, and/or miR-141. In some embodiments, a kit further comprises one or more negative or positive control samples. In some embodiments, the kit further comprises one or more agents for detecting one or more controls. In some embodiments, the kit further comprises reagents for isolating nucleic acids from a biological sample. In some embodiments, the reagents are for isolating nucleic acids from a serum sample. In some embodiments, the reagents are for isolating nucleic acids from a sample described herein. It is specifically contemplated that probes or primers for detecting, measuring, or assaying 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or more biomarkers disclosed herein may be excluded.

[0037] The term subject or patient may refer to an animal (for example a mammal), including but not limited to humans, non-human primates, rodents, dogs, or pigs. The methods of obtaining provided herein include methods of biopsy such as fine needle aspiration, core needle biopsy, vacuum assisted biopsy, incisional biopsy, excisional biopsy, punch biopsy, shave biopsy or skin biopsy.

[0038] In certain embodiments the sample is obtained from a biopsy from colorectal tissue, mucosa or submucosa thereof. In other embodiments the sample may be obtained from any of the tissues provided herein that include but are not limited to gall bladder, skin, heart, lung, breast, pancreas, liver, muscle, kidney, smooth muscle, bladder, intestine, brain, prostate, or thyroid tissue.

[0039] Alternatively, the sample may include but not be limited to blood, serum, sweat, hair follicle, buccal tissue, tears, menses, urine, feces, or saliva. In particular embodiments, the sample may be a tissue sample, a tumor sample, a whole blood sample, a urine sample, a saliva sample, a serum sample, a plasma sample or a fecal sample.

[0040] In certain aspects the sample is obtained from cystic fluid or fluid derived from a tumor or neoplasm. In yet other embodiments the cyst, tumor or neoplasm is in the digestive system. In certain aspects of the current methods, any medical professional such as a doctor, nurse or medical technician may obtain a biological sample for testing. In further aspects of the current methods, the patient or subject may obtain a biological sample for testing without the assistance of a medical professional, such as obtaining a whole blood sample, a urine sample, a fecal sample, a buccal sample, or a saliva sample.

[0041] In further embodiments, the sample may be a fresh, frozen or preserved sample or a fine needle aspirate. In particular embodiments, the sample is a formalin-fixed, paraffin-embedded (FFPE) sample. An acquired sample may be placed in short term or long term storage by placing in a suitable medium, excipient, solution, or container. In certain cases storage may require keeping the sample in a refrigerated or frozen environment. The sample may be quickly frozen prior to storage in a frozen environment. In certain instances the frozen sample may be contacted with a suitable cryopreservation medium or compound. Examples of cryopreservation mediums or compounds include but are not limited to: glycerol, ethylene glycol, sucrose, or glucose.

[0042] Some embodiments further involve isolating nucleic acids such as ribonucleic acid (RNA) from a biological sample or in a sample of the patient. Other steps may or may not include amplifying a nucleic acid in a sample and/or hybridizing one or more probes to an amplified or non-amplified nucleic acid. The methods may further comprise assaying nucleic acids in a sample. In certain embodiments, a microarray may be used to measure or assay the level of biomarker expression in a sample. The methods may further comprise recording the biomarker expression level in a tangible medium or reporting the expression level to the patient, a health care payer, a physician, an insurance agent, or an electronic system.

[0043] A difference between or among weighted coefficients or expression levels or between or among the weighted comparisons may be, be at least or be at most about 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0, 14.5, 15.0, 15.5, 16.0, 16.5, 17.0, 17.5, 18.0, 18.5, 19.0, 19.5, 20.0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300, 305, 310, 315, 320, 325, 330, 335, 340, 345, 350, 355, 360, 365, 370, 375, 380, 385, 390, 395, 400, 410, 420, 425, 430, 440, 441, 450, 460, 470, 475, 480, 490, 500, 510, 520, 525, 530, 540, 550, 560, 570, 575, 580, 590, 600, 610, 620, 625, 630, 640, 650, 660, 670, 675, 680, 690, 700, 710, 720, 725, 730, 740, 750, 760, 770, 775, 780, 790, 800, 810, 820, 825, 830, 840, 850, 860, 870, 875, 880, 890, 900, 910, 920, 925, 930, 940, 950, 960, 970, 975, 980, 990, 1000 times or -fold (or any range derivable therein).

[0044] In some embodiments, determination or calculation of a diagnostic, prognostic, or risk score is performed by applying classification algorithms based on the expression values of biomarkers with differential expression p values of about, between about, or at most about 0.005, 0.006, 0.007, 0.008, 0.009, 0.01, 0.011, 0.012, 0.013, 0.014, 0.015, 0.016, 0.017, 0.018, 0.019, 0.020, 0.021, 0.022, 0.023, 0.024, 0.025, 0.026, 0.027, 0.028, 0.029, 0.03, 0.031, 0.032, 0.033, 0.034, 0.035, 0.036, 0.037, 0.038, 0.039, 0.040, 0.041, 0.042, 0.043, 0.044, 0.045, 0.046, 0.047, 0.048, 0.049, 0.050, 0.051, 0.052, 0.053, 0.054, 0.055, 0.056, 0.057, 0.058, 0.059, 0.060, 0.061, 0.062, 0.063, 0.064, 0.065, 0.066, 0.067, 0.068, 0.069, 0.070, 0.071, 0.072, 0.073, 0.074, 0.075, 0.076, 0.077, 0.078, 0.079, 0.080, 0.081, 0.082, 0.083, 0.084, 0.085, 0.086, 0.087, 0.088, 0.089, 0.090, 0.091, 0.092, 0.093, 0.094, 0.095, 0.096, 0.097, 0.098, 0.099, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or higher (or any range derivable therein). In certain embodiments, the prognosis score is calculated using one or more statistically significantly differentially expressed biomarkers (either individually or as difference pairs).
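
The selection step described above, keeping only biomarkers whose differential-expression p value is at or below a chosen cutoff before a classifier is applied, can be sketched as follows. The function name, the default cutoff of 0.05, and the example p values are assumptions made for illustration; they are not results from the disclosure.

```python
from typing import Dict, List


def select_biomarkers(p_values: Dict[str, float], max_p: float = 0.05) -> List[str]:
    """Return, in sorted order, the biomarkers whose p value is at most max_p."""
    if not 0.0 < max_p <= 1.0:
        raise ValueError("max_p must be in the interval (0, 1]")
    return sorted(name for name, p in p_values.items() if p <= max_p)


# Made-up p values, used only to show the call.
example_p_values = {"KLF7": 0.01, "PDLIM4": 0.2, "MECP2": 0.05}
selected = select_biomarkers(example_p_values)
```

The selected names would then be the inputs to whatever classification algorithm is used to compute the score.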

[0045] Any of the methods described herein may be implemented on a tangible computer-readable medium comprising computer-readable code that, when executed by a computer, causes the computer to perform one or more operations. In some embodiments, there is a tangible computer-readable medium comprising computer-readable code that, when executed by a computer, causes the computer to perform operations comprising: a) receiving information corresponding to an expression level of a biomarker in a sample from a patient; and b) determining a difference value in the expression levels using the information corresponding to the expression levels in the sample compared to a control or reference expression level for the gene.
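
Operations a) and b) can be sketched in a few lines. The paragraph above does not say how the difference value is defined, so this sketch reports three common choices (absolute difference, fold change, and log2 fold change); the function name and the returned keys are hypothetical.

```python
import math
from typing import Dict


def difference_value(sample_level: float, control_level: float) -> Dict[str, float]:
    """Compare a sample expression level with a control or reference level.

    Both levels are assumed to be positive, linear-scale, normalized values.
    """
    if sample_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    fold_change = sample_level / control_level
    return {
        "difference": sample_level - control_level,
        "fold_change": fold_change,
        "log2_fold_change": math.log2(fold_change),
    }
```

For instance, `difference_value(8.0, 2.0)` reports a difference of 6.0, a 4-fold change, and a log2 fold change of 2.0.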

[0046] In other aspects, the tangible computer-readable medium further comprises computer-readable code that, when executed by a computer, causes the computer to perform one or more additional operations comprising making recommendations comprising, wherein the patient in step a) is under or after a first treatment for cancer: administering the same treatment as the first treatment to the patient if the patient does not have an increased expression level; or administering a different treatment from the first treatment to the patient if the patient has an increased expression level.
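
The recommendation logic in the preceding paragraph reduces to a single branch on whether expression is increased. In the sketch below, "increased" is decided by a fold-change threshold against a control level; that threshold, its default of 2.0, and the function name are assumptions, since the paragraph does not state how an increase is determined.

```python
def recommend_treatment(
    sample_level: float,
    control_level: float,
    first_treatment: str,
    min_fold_increase: float = 2.0,  # hypothetical threshold for "increased"
) -> str:
    """Recommend continuing or changing treatment based on expression."""
    if sample_level <= 0 or control_level <= 0:
        raise ValueError("expression levels must be positive")
    increased = sample_level / control_level >= min_fold_increase
    if increased:
        return f"administer a treatment different from {first_treatment}"
    return f"continue {first_treatment}"
```

The output is a recommendation string only; any treatment decision remains with the treating physician.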

[0047] In some embodiments, receiving information comprises receiving, from a tangible data storage device, information corresponding to the expression levels. In additional embodiments the medium further comprises computer-readable code that, when executed by a computer, causes the computer to perform one or more additional operations comprising: sending information corresponding to the difference value to a tangible data storage device, calculating a prognosis score for the patient, treating the patient with a traditional therapy if the patient does not have increased expression levels, and/or treating the patient with an alternative esophageal therapy if the patient has increased expression levels.

[0048] The tangible computer-readable medium may further comprise computer-readable code that, when executed by a computer, causes the computer to perform one or more additional operations comprising calculating a prognosis score for the patient. The operations may further comprise making recommendations comprising: administering a treatment to a patient that is determined to have a decreased expression level.

[0049] As used herein, the terms “or” and “and/or” are utilized to describe multiple components in combination or exclusive of one another. For example, “x, y, and/or z” can refer to “x” alone, “y” alone, “z” alone, “x, y, and z,” “(x and y) or z,” “x or (y and z),” or “x or y or z.” It is specifically contemplated that x, y, or z may be specifically excluded from an

[0050] Throughout this application, the term “about” is used according to its plain and ordinary meaning in the area of cell biology to indicate that a value includes the standard deviation of error for the device or method being employed to determine the value.

[0051] The term “comprising,” which is synonymous with “including,” “containing,” or “characterized by,” is inclusive or open-ended and does not exclude additional, unrecited elements or method steps. The phrase “consisting of” excludes any element, step, or ingredient not specified. The phrase “consisting essentially of” limits the scope of described subject matter to the specified materials or steps and those that do not materially affect its basic and novel characteristics. It is contemplated that embodiments described in the context of the term “comprising” may also be implemented in the context of the term “consisting of” or “consisting essentially of.”

[0052] It is specifically contemplated that any limitation discussed with respect to one embodiment of the invention may apply to any other embodiment of the invention. Furthermore, any composition of the invention may be used in any method of the invention, and any method of the invention may be used to produce or to utilize any composition of the invention. Aspects of an embodiment set forth in the Examples are also embodiments that may be implemented in the context of embodiments discussed elsewhere in a different Example or elsewhere in the application, such as in the Summary of Invention, Detailed Description of the Embodiments, Claims, and description of Figure Legends.

BRIEF DESCRIPTION OF THE DRAWINGS

[0053] The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.

[0054] FIG. 1A-D AUC and associated risk scores of TF associated signature. A) AUC of TF signature in clinical test cohort 1. B) Risk score distribution plot of signature in clinical test cohort 1. C) AUC of TF signature in clinical test cohort 2. D) Risk score distribution plot of signature in clinical test cohort 2.

[0055] FIG. 2A-D Kaplan Meier survival and Cox hazard ratio analysis of TF signature. A) Kaplan Meier plot for overall survival (OS) associated with TF signature in test cohort 1. B) Kaplan Meier plot for disease-free survival associated with TF signature in stages 2-3 in test cohort 1. C) Kaplan Meier plot for overall survival (OS) associated with TF signature in test cohort 2. D) Kaplan Meier plot for disease-free survival associated with TF signature in stages 2-3 in test cohort 2.

[0056] FIG. 3A-B In silico analysis of TF signature motif cooperativity and super-enhancer binding. A) Heatmap to measure similarity in binding motifs of seven transcription factors present in the TF signature. Shades of green indicate similarity in binding motifs of two factors while red implies mutual exclusivity in binding motifs of two factors. B) Enrichment score for preferential super-enhancer binding for each of the seven TFs in the signature in all the 21 colorectal cancer cell lines. A green bar in positive axis implies enrichment of super-enhancers for motifs for a particular transcription factor.

[0057] FIG. 4A-C Identification of TF signature for identifying liver metastasis in colorectal cancer patients. A) Flowchart illustrating the in silico discovery, testing and clinical validation steps. B) The AUROC of the 14 gene TF associated gene signature in in silico validation cohort GSE39582. C) Waterfall plot to show the TF signature risk score distribution in GSE39582.

[0058] FIG. 5A-D Study design and in silico exploratory analysis. (A) The flow chart of the study design. (B) Kaplan-Meier survival curve derived from the LASSO Cox regression model of MATS in the GSE39582 exploratory cohort. (C) Comparison of the prognostic accuracy of the eight-gene classifier (mesenchymal associated transcriptomic signature; MATS), microarray based OncoDx and CMS4 in two independent cohorts. (D) The ROC curves for predicting CMS4 using MATS in six independent CRC cohorts.

[0059] FIG. 6A-F Constructing and validating mesenchymal associated transcriptomic signature (MATS) by qRT-PCR in independent in-house clinical cohorts. Univariate Cox proportional hazard model derived hazard ratios (HR) and 95% CIs for MATS genes in predicting relapse-free survival; training cohort (A) and the validation cohort (B). Time-dependent ROC curve (C) and Kaplan-Meier survival curve (D) of MATS for relapse-free survival in the training cohort. Time-dependent ROC curve (E) and Kaplan-Meier survival curve (F) of MATS for relapse-free survival in the validation cohort.

[0060] FIG. 7A-D Univariate and multivariate analyses of MATS and clinicopathological parameters in the in-house training (A, B) and validation cohorts (C, D).

[0061] FIG. 8A-D Kaplan-Meier survival analyses in the subgroup of the validation cohort according to the MATS risk score stratified by TNM stage (stage II or stage III) (A) and T stage (<T4 or T4) (B). (C) Time-dependent ROC curves for comparisons of the prognostic accuracy by the MATS (high risk vs low risk), tumor location (rectum vs colon), lymphatic invasion status (present vs absent), mismatch repair status (instable vs stable) and the combination model of the MATS and these 3 factors in the validation cohort with stage II, T3 CRC. This combination model showed the highest area under the curve (AUC) of 0.79. (D) The association of RFS with a combination model of MATS, tumor location, lymphatic invasion status and MSI status in stage II, T3 CRC patients. High risk group of combination model showed poorer RFS than those with low risk group (P< 0.001, HR: 4.74).

[0062] FIG. 9A-F Chemotherapy predictive ability of MATS in both adjuvant and palliative settings. (A) Kaplan-Meier survival curve for stage III patients in MATS low group, which were stratified by the receipt of fluoropyrimidine based chemotherapy alone in the in-house training cohort. (B) Kaplan-Meier survival curves for stage III patients in MATS high group in the in-house training cohort. (C) Kaplan-Meier survival curves for stage III patients in MATS low group in the in-house validation cohort. (D) Kaplan-Meier survival curves for stage III patients in MATS high group in the in-house validation cohort. (E) ROC curve of MATS in predicting FOLFOX response in mCRC patients analyzed using GSE28702 external validation cohort. (F) ROC curve of MATS, KRAS as well as MATS+KRAS in predicting Cetuximab response in mCRC patients analyzed using GSE5851 external validation cohort.

[0063] FIG. 10 The association between each MATS gene and CMS status in GSE39582 dataset.

[0064] FIG. 11 The association between each MATS gene and CMS status in GSE17536 dataset.

[0065] FIG. 12 The association between each MATS gene and CMS status in GSE33113 dataset.

[0066] FIG. 13 The association between each MATS gene and CMS status in TCGA RNA seq dataset

[0067] FIG. 14 The association between each MATS gene and CMS status in TCGA microarray dataset

[0068] FIG. 15 The association between each MATS gene and CMS status in GSE104645 microarray dataset

[0069] FIG. 16A-C (A) miRNA microarray revealed differentially expressed miRNAs between liver metastasis and normal liver tissues, as well as liver metastasis and normal colonic mucosa from the GSE54088 dataset. The heatmaps illustrate the Z-score of each candidate miRNA. (B) Circulating miR-141, miR-210, and miR-425* expression levels in the testing cohort using blood samples collected at the time of pre- and matched post-primary CRC resection (n = 136). Dot plots depicting expression levels of miRNAs in stage IV patients (IV: n=40), and stage I-III patients (I-III: n=96) (miR-141, miR-210, and miR-425*, respectively). Statistically significant differences were determined using Mann-Whitney tests. (C) Circulating miR-141, miR-210, and miR-425* expression levels in the validation cohort using blood samples collected at the time of pre- and matched post-primary CRC resection (n = 180). Dot plots depicting expression levels of miRNAs in stage IV patients (IV: n=48), and stage I-III patients (I-III: n=132) (miR-141, miR-210, and miR-425*, respectively). Statistically significant differences were determined using Mann-Whitney tests.

[0070] FIG. 17A-F Kaplan-Meier survival analyses revealed high expression levels of circulating preoperative miR-210 and post-operative miR-425* were consistently associated with poor DFS in stage II and III patients in the testing and validation cohorts. (A) testing cohort (B) validation cohort. (C) Time-dependent ROC analysis comparing the accuracy of predicting DFS at 3 years for patients with stage II and III CRC. Expression of pre-operative miR-210, post-operative miR-425*, several clinicopathological factors, and combination model of miRNAs, and combination model of miRNAs with T stage and venous invasion were investigated. Combination model showed the highest AUC of 0.859. (C) The association of DFS with a combination model of miRNAs, T stage, and venous invasion in serum samples from stage II and stage III CRC patients. (D, E, F) Kaplan-Meier survival analyses revealed high levels of combination model showed poorer DFS than those with low expression in stage II and III, stage II, and stage III patients respectively.

[0071] FIG. 18 Study Design

[0072] FIG. 19A-B RT-qPCR based independent tissue validation using FFPE samples. (A) Comparison of liver or lung metastasis to matched normal tissues of the metastatic site. (B) Comparison of liver or lung metastasis to matched normal colonic mucosa.

[0073] FIG. 20A-D (A-B) Circulating miR-182, miR-183, and miR-224 expression levels in the testing cohort using blood samples collected at the time of pre- and matched post-primary CRC resection (n = 136). Dot plots depicting expression levels of miRNAs in stage IV patients (IV: n=40), and stage I-III patients (I-III: n=96) (miR-141, miR-210, and miR-425*, respectively). Statistically significant differences were determined using Mann-Whitney tests. (C-D) Circulating miR-183 expression levels in the validation cohort using blood samples collected at the time of pre- and matched post-primary CRC resection (n = 180). Dot plots depicting expression levels of miRNAs in stage IV patients (IV: n=48), and stage I-III patients (I-III: n=132). Statistically significant differences were determined using Mann-Whitney tests.

[0074] FIG. 21A-D The association of OS with circulating miR-141, miR-210, and miR-425* expression in stage I-IV CRC patients. (A) pre-operative blood in the testing cohort, (B) post-operative blood in the testing cohort, (C) pre-operative blood in the validation cohort, and (D) post-operative blood in the validation cohort, respectively.

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

[0075] Certain aspects of the invention provide a test that could assist physicians to select the optimal therapy for a patient from several alternative treatment options. A major clinical challenge in cancer treatment is to identify the subset of patients who will benefit from a therapeutic regimen, both in metastatic and adjuvant settings. The number of anti-cancer drugs and multi-drug combinations has increased substantially in the past decade; however, treatments continue to be applied empirically using a trial-and-error approach. Here, methods and compositions are provided to diagnose patients to determine the optimal treatment option for cancer patients.

I. Definitions

[0076] The term “substantially the same”, “not significantly different”, or “within the range” refers to a level of expression that is not significantly different than what it is compared to. Alternatively, or in conjunction, the term substantially the same refers to a level of expression that is less than 2, 1.5, or 1.25 fold different than the expression level it is compared to or less than 20, 15, 10, or 5% difference in expression.
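
The two numeric criteria in this definition (a fold difference below 2, 1.5, or 1.25, or a percent difference below 20, 15, 10, or 5%) can be checked as sketched below. The function names are hypothetical, and measuring the percent difference relative to the reference level is an assumption, since the definition does not name the denominator.

```python
def fold_difference(level: float, reference: float) -> float:
    """Symmetric fold difference (always >= 1) between two positive levels."""
    if level <= 0 or reference <= 0:
        raise ValueError("expression levels must be positive")
    return max(level, reference) / min(level, reference)


def substantially_same_by_fold(level: float, reference: float, max_fold: float = 2.0) -> bool:
    """True if the levels are less than max_fold different (2, 1.5, or 1.25)."""
    return fold_difference(level, reference) < max_fold


def substantially_same_by_percent(level: float, reference: float, max_percent: float = 20.0) -> bool:
    """True if the difference is less than max_percent of the reference level."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return abs(level - reference) * 100.0 / reference < max_percent
```

A level of 3.0 against a reference of 2.0 is a 1.5-fold difference, so it passes the 2-fold criterion but fails the 1.25-fold criterion.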

[0077] By “subject” or “patient” is meant any single subject for which therapy is desired, including humans, cattle, dogs, guinea pigs, rabbits, chickens, and so on. Also intended to be included as a subject are any subjects involved in clinical research trials not showing any clinical sign of disease, or subjects involved in epidemiological studies, or subjects used as controls.

[0078] The term “primer” or “probe” as used herein, is meant to encompass any nucleic acid that is capable of priming the synthesis of a nascent nucleic acid in a template-dependent process. Typically, primers are oligonucleotides from ten to twenty and/or thirty base pairs in length, but longer sequences can be employed. Primers may be provided in double-stranded and/or single-stranded form, although the single-stranded form is preferred. A probe may also refer to a nucleic acid that is capable of hybridizing by base complementarity to a nucleic acid of a gene of interest or a fragment thereof.

[0079] As used herein, “increased expression” or “elevated expression” or “decreased expression” refers to an expression level of a biomarker in the subject’s sample as compared to a reference level representing the same biomarker or a different biomarker. In certain aspects, the reference level may be a reference level of expression from a non-cancerous tissue from the same subject. Alternatively, the reference level may be a reference level of expression from a different subject or group of subjects. For example, the reference level of expression may be an expression level obtained from a sample (e.g., a tissue, fluid or cell sample) of a subject or group of subjects without cancer, with colorectal cancer, or an expression level obtained from a non-cancerous tissue of a subject or group of subjects with cancer. The reference level may be a single value or may be a range of values. The reference level of expression can be determined using any method known to those of ordinary skill in the art. The reference level may also be depicted graphically as an area on a graph. In certain embodiments, a reference level is a normalized level.

[0080] The term “determining” or “evaluating” as used herein may refer to measuring, quantitating, or quantifying (either qualitatively or quantitatively).

II. MiRNAs

[0081] Certain aspects are based, in part, on the systematic discovery and validation of miRNA biomarkers of cancer. In certain embodiments, microRNAs (abbreviated miRNAs) may be used in methods and compositions for determining the prognosis of a particular patient, for diagnosing subjects, for determining a response to a particular cancer treatment, and for treating individuals with cancer.

[0082] MiRNAs may be naturally occurring, small non-coding RNAs that are about 17 to about 25 nucleotide bases (nt) in length in their biologically active form. miRNAs post-transcriptionally regulate gene expression by repressing target mRNA translation. It is thought that miRNAs function as negative regulators, i.e. greater amounts of a specific miRNA will correlate with lower levels of target gene expression.

[0083] There may be three forms of miRNAs existing in vivo: primary miRNAs (pri-miRNAs), premature miRNAs (pre-miRNAs), and mature miRNAs. Primary miRNAs (pri-miRNAs) are expressed as stem-loop structured transcripts of about a few hundred bases to over 1 kb. The pri-miRNA transcripts are cleaved in the nucleus by an RNase III endonuclease called Drosha that cleaves both strands of the stem near the base of the stem loop. Drosha cleaves the RNA duplex with staggered cuts, leaving a 5' phosphate and 2 nt overhang at the 3' end.

[0084] The cleavage product, the premature miRNA (pre-miRNA), may be about 60 to about 110 nt long with a hairpin structure formed in a fold-back manner. Pre-miRNA is transported from the nucleus to the cytoplasm by Ran-GTP and Exportin-5. Pre-miRNAs are processed further in the cytoplasm by another RNase III endonuclease called Dicer. Dicer recognizes the 5' phosphate and 3' overhang, and cleaves the loop off at the stem-loop junction to form miRNA duplexes. The miRNA duplex binds to the RNA-induced silencing complex (RISC), where the antisense strand is preferentially degraded and the sense strand mature miRNA directs RISC to its target site. It is the mature miRNA that is the biologically active form of the miRNA and is about 17 to about 25 nt in length.

[0085] MicroRNAs function by engaging in base pairing (perfect or imperfect) with specific sequences in their target genes' messages (mRNA). The miRNA degrades or represses translation of the mRNA, causing the target genes' expression to be post-transcriptionally down-regulated, repressed, or silenced. In animals, miRNAs do not necessarily have perfect homologies to their target sites, and partial homologies lead to translational repression, whereas in plants, where miRNAs tend to show complete homologies to the target sites, degradation of the message (mRNA) prevails.

[0086] MicroRNAs are widely distributed in the genome, dominate gene regulation, and actively participate in many physiological and pathological processes. For example, the regulatory modality of certain miRNAs is found to control cell proliferation, differentiation, and apoptosis; and abnormal miRNA profiles are associated with oncogenesis. Additionally, it is suggested that viral infection causes an increase in miRNAs targeted to silence “pro-cell survival” genes, and a decrease in miRNAs repressing genes associated with apoptosis (programmed cell death), thus tilting the balance toward gaining apoptosis signaling.

[0087] In other embodiments of the invention, there are synthetic nucleic acids that are miRNA inhibitors or antagonists. In some embodiments, the miRNA inhibitor or antagonist is an antagomir. A miRNA inhibitor is between about 17 and 25 nucleotides in length and comprises a 5’ to 3’ sequence that is at least 90% complementary to the 5’ to 3’ sequence of a mature miRNA. In certain embodiments, a miRNA inhibitor molecule is 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length, or any range derivable therein. Moreover, a miRNA inhibitor has a sequence (from 5’ to 3’) that is or is at least 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.1, 99.2, 99.3, 99.4, 99.5, 99.6, 99.7, 99.8, 99.9 or 100% complementary, or any range derivable therein, to the 5’ to 3’ sequence of a mature miRNA, particularly a mature, naturally occurring miRNA. One of skill in the art could use a portion of the probe sequence that is complementary to the sequence of a mature miRNA as the sequence for a miRNA inhibitor. Moreover, that portion of the probe sequence can be altered so that it is still at least 90% complementary to the sequence of a mature miRNA.
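As a non-limiting illustration, the length and percent-complementarity criteria described above may be checked computationally. The sketch below assumes equal-length RNA sequences written 5’ to 3’, Watson-Crick pairing only, a strict 17 to 25 nt length window, and a placeholder sequence that is not a real miRNA; all function names are hypothetical.

```python
# Watson-Crick pairing for RNA; G:U wobble pairs are deliberately ignored.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the perfectly complementary strand, written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def percent_complementarity(inhibitor: str, mature_mirna: str) -> float:
    """Percentage of inhibitor positions that pair with the mature miRNA."""
    inhibitor = inhibitor.upper()
    if len(inhibitor) != len(mature_mirna):
        raise ValueError("this sketch only compares equal-length sequences")
    perfect = reverse_complement(mature_mirna)
    matches = sum(a == b for a, b in zip(inhibitor, perfect))
    return 100.0 * matches / len(perfect)


def meets_inhibitor_criteria(inhibitor: str, mature_mirna: str,
                             min_percent: float = 90.0) -> bool:
    """Check the 17-25 nt length window and the complementarity threshold."""
    if not 17 <= len(inhibitor) <= 25:
        return False
    return percent_complementarity(inhibitor, mature_mirna) >= min_percent


# Placeholder 22-nt sequence, not a real miRNA.
mirna = "ACGUACGUACGUACGUACGUAC"
candidate = reverse_complement(mirna)
print(percent_complementarity(candidate, mirna))
print(meets_inhibitor_criteria(candidate, mirna))
```

For a 22-nt inhibitor, two mismatched positions still satisfy a 90% threshold (20/22), whereas three do not (19/22).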

[0088] In certain embodiments, a synthetic miRNA has one or more modified nucleic acid residues. In certain embodiments, the sugar modification is a 2’O-Me modification, a 2’F modification, a 2’H modification, a 2’amino modification, a 4’ribose modification, or a phosphorothioate modification on the carboxy group linked to the carbon at position 6. In further embodiments, there are one or more sugar modifications in the first or last 2 to 4 residues of the complementary region or the first or last 4 to 6 residues of the complementary region.

[0089] Yet further, the nucleic acid structure of the miRNA can also be modified into a locked nucleic acid (LNA) with a methylene bridge between the 2' oxygen and the 4' carbon to lock the ribose in the 3'-endo (North) conformation in the A-type conformation of nucleic acids (Lennox et al., 2011; Bader et al., 2011). This modification significantly increases both target specificity and hybridization properties of the molecules.

[0090] The miRNA region and the complementary region may be on the same or separate polynucleotides. In cases in which they are contained on or in the same polynucleotide, the miRNA molecule will be considered a single polynucleotide. In embodiments in which the different regions are on separate polynucleotides, the synthetic miRNA will be considered to be comprised of two polynucleotides.

[0091] When the RNA molecule is a single polynucleotide, there is a linker region between the miRNA region and the complementary region. In some embodiments, the single polynucleotide is capable of forming a hairpin loop structure as a result of bonding between the miRNA region and the complementary region. The linker constitutes the hairpin loop. It is contemplated that in some embodiments, the linker region is, is at least, or is at most 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 residues in length, or any range derivable therein. In certain embodiments, the linker is between 3 and 30 residues (inclusive) in length.
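For illustration only, the single-polynucleotide arrangement described above (miRNA region, linker, complementary region) may be sketched as a simple string assembly. The 5’ to 3’ ordering of the regions, the fully complementary second region, the default 3 to 30 residue linker window, and the placeholder sequences are assumptions of this sketch; `build_hairpin` is a hypothetical helper.

```python
# Watson-Crick pairing for RNA; used to derive the complementary region.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs with `rna`, written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def build_hairpin(mirna_region: str, linker: str,
                  min_linker: int = 3, max_linker: int = 30) -> str:
    """Assemble miRNA region + linker + complementary region (5' to 3').

    When the two flanking regions base-pair, the linker forms the loop.
    """
    if not min_linker <= len(linker) <= max_linker:
        raise ValueError(
            f"linker must be {min_linker} to {max_linker} residues long"
        )
    complementary_region = reverse_complement(mirna_region)
    return mirna_region.upper() + linker.upper() + complementary_region


# Placeholder sequences chosen for illustration, not taken from the disclosure.
hairpin = build_hairpin("ACGUACGUACGUACGUACGUAC", "UUCAAGAGA")
print(len(hairpin))
```

The linker window is exposed as a parameter because the text contemplates linkers anywhere from 2 to 40 residues as well as the narrower 3 to 30 residue range.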

[0092] In addition to having a miRNA region and a complementary region, there may be flanking sequences as well at either the 5’ or 3’ end of the region. In some embodiments, there is or is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 nucleotides or more, or any range derivable therein, flanking one or both sides of these regions.

[0093] Other miRNA-based therapies that negatively manipulate oncogenic miRNAs may further include miRNA sponges, miRNA masks, or locked nucleic acids (LNAs). As used herein, the term "miRNA sponge" refers to a synthetic nucleic acid (e.g., an mRNA transcript) that contains multiple tandem binding sites for a miRNA of interest, and that serves to titrate out the endogenous miRNA of interest, thus inhibiting the binding of the miRNA of interest to its endogenous targets.

[0094] Methods in certain aspects include reducing, eliminating, or inhibiting activity and/or expression of one or more miRNAs in a cell comprising introducing into a cell a miRNA inhibitor, antagonist, or antagomir; or supplying or enhancing the activity of one or more miRNAs in a cell. Certain embodiments also concern inducing certain cellular characteristics by providing to a cell a particular nucleic acid, such as a specific synthetic miRNA molecule or a synthetic miRNA inhibitor molecule. However, in methods of the invention, the miRNA molecule or miRNA inhibitor need not be synthetic. They may have a sequence that is identical to a naturally occurring miRNA or they may not have any design modifications. In certain embodiments, the miRNA molecule and/or a miRNA inhibitor are synthetic, as discussed above.

III. Colorectal Cancer Staging and Treatments

[0095] Methods and compositions may be provided for treating, prognosing, and/or diagnosing colorectal cancer. Based on a biomarker profile, different treatments may be prescribed or recommended for different cancer patients.

A. Cancer staging

[0096] Colorectal cancer, also known as colon cancer, rectal cancer, or bowel cancer, is a cancer from uncontrolled cell growth in the colon or rectum (parts of the large intestine), or in the appendix. Certain aspects of the methods are provided for patients that are stage I-IV colorectal cancer patients. In particular aspects, the patient is a stage II or III patient. In a further embodiment, the patient is a stage I or II patient. In a further embodiment, the patient is a stage I, II, or III patient. In some embodiments, the patient is diagnosed as having and/or determined to have Tis, N0, and/or M0; T1, N0, and/or M0; T2, N0, and/or M0; T3, N0, and/or M0; T4, N0, and/or M0; T1-2, N1, and/or M0; T3-4, N1, and/or M0; any T, N2, and/or M0; or any T, any N, and/or M1.
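The T, N, and M combinations enumerated above correspond to the stage groups of the AJCC TNM system set out in this section. A minimal sketch of that grouping, offered as an illustration only, is given below; the function name and input format are assumptions, and only the combinations recited in this disclosure are handled.

```python
def ajcc_stage(t: str, n: str, m: str) -> str:
    """Map T, N, and M categories to the stage groups listed in this section.

    Hypothetical helper; accepts strings such as "Tis", "T3", "N1", "M0".
    """
    t, n, m = t.strip().lower(), n.strip().lower(), m.strip().lower()
    if t not in {"tis", "t1", "t2", "t3", "t4"}:
        raise ValueError(f"unknown T category: {t!r}")
    if n not in {"n0", "n1", "n2"}:
        raise ValueError(f"unknown N category: {n!r}")
    if m not in {"m0", "m1"}:
        raise ValueError(f"unknown M category: {m!r}")

    if m == "m1":
        return "IV"      # any T, any N, M1
    if n == "n2":
        return "III-C"   # any T, N2, M0
    if n == "n1":
        if t in ("t1", "t2"):
            return "III-A"
        if t in ("t3", "t4"):
            return "III-B"
        # Tis with N1 is not among the recited combinations.
        raise ValueError("combination not covered by the listed groupings")
    # N0, M0: the stage follows depth of invasion.
    return {"tis": "0", "t1": "I", "t2": "I", "t3": "II-A", "t4": "II-B"}[t]


print(ajcc_stage("T3", "N0", "M0"))
print(ajcc_stage("T2", "N1", "M0"))
```

Distant metastasis is checked first and nodal status second, mirroring the "any T, any N" and "any T" wording of the stage IV and stage III-C groupings.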

[0097] The most common staging system is the TNM (for tumors/nodes/metastases) system, from the American Joint Committee on Cancer (AJCC). The TNM system assigns a number based on three categories. “T” denotes the degree of invasion of the intestinal wall, “N” the degree of lymphatic node involvement, and “M” the degree of metastasis. The broader stage of a cancer is usually quoted as a number I, II, III, IV derived from the TNM value grouped by prognosis; a higher number indicates a more advanced cancer and likely a worse outcome. Details of this system are in the table below:

AJCC stage     TNM stage             TNM stage criteria for colorectal cancer
Stage 0        Tis N0 M0             Tis: Tumor confined to mucosa; cancer-in-situ
Stage I        T1 N0 M0              T1: Tumor invades submucosa
Stage I        T2 N0 M0              T2: Tumor invades muscularis propria
Stage II-A     T3 N0 M0              T3: Tumor invades subserosa or beyond (without other organs involved)
Stage II-B     T4 N0 M0              T4: Tumor invades adjacent organs or perforates the visceral peritoneum
Stage III-A    T1-2 N1 M0            N1: Metastasis to 1 to 3 regional lymph nodes. T1 or T2.
Stage III-B    T3-4 N1 M0            N1: Metastasis to 1 to 3 regional lymph nodes. T3 or T4.
Stage III-C    any T, N2 M0          N2: Metastasis to 4 or more regional lymph nodes. Any T.
Stage IV       any T, any N, M1      M1: Distant metastases present. Any T, any N.

B. Therapy

[0098] Methods of the disclosure may include a cancer therapy as described herein. In some embodiments, the cancer therapy comprises surgical removal of a tumor. This can be done either by an open laparotomy or sometimes laparoscopically. In some embodiments, the cancer therapy comprises chemotherapy. In some embodiments, the chemotherapy is used before surgery to shrink the cancer before attempting to remove it (neoadjuvant therapy). The two most common sites of recurrence of colorectal cancer are the liver and lungs. In some embodiments, the treatment of early colorectal cancer excludes chemotherapy. In further embodiments, the treatment of early colorectal cancer includes neoadjuvant therapy (chemotherapy or radiotherapy before the surgical removal of the primary tumor), but excludes adjuvant therapy (chemotherapy and/or radiotherapy after surgical removal of the primary tumor).

[0099] In cancer of both the colon and the rectum, chemotherapy may be used in addition to surgery in certain cases. In rectal cancer, chemotherapy may be used in the neoadjuvant setting.

[00100] In certain embodiments, there may be a decision regarding the therapeutic treatment based on microbiome profile. In some embodiments, the methods include the administration of a chemotherapeutic. In some embodiments, the chemotherapeutic comprises antimetabolites or thymidylate synthase inhibitors such as fluorouracil (5-FU). In some embodiments, the chemotherapeutic comprises cytotoxic drugs, such as irinotecan or oxaliplatin. In some embodiments, the chemotherapeutic comprises combinations such as irinotecan, fluorouracil, and leucovorin (FOLFIRI); and oxaliplatin, fluorouracil, and leucovorin (FOLFOX).

[00101] In some embodiments, the cancer therapy comprises an antibody. In some embodiments, the cancer therapy comprises Avastin® (bevacizumab) (Genentech Inc., South San Francisco CA) and/or the epidermal growth factor receptor inhibitor Erbitux® (cetuximab) (Imclone Inc. New York City). In some embodiments, the cancer therapy may include one or more of the chemical therapeutic agents including thymidylate synthase inhibitors or antimetabolites such as fluorouracil (5-FU), alone or in combination with other therapeutic agents. For example, in some embodiments, the first treatment to be tested for response to therapy may be antimetabolites or thymidylate synthase inhibitors, prodrugs, or salts thereof.

[00102] Antimetabolites can be used in cancer treatment, as they interfere with DNA production and therefore cell division and the growth of tumors. Because cancer cells spend more time dividing than other cells, inhibiting cell division harms tumor cells more than other cells. Antimetabolites masquerade as a purine (azathioprine, mercaptopurine) or a pyrimidine, chemicals that become the building blocks of DNA. They prevent these substances from becoming incorporated into DNA during the S phase (of the cell cycle), stopping normal development and division. They also affect RNA synthesis. However, because thymidine is used in DNA but not in RNA (where uracil is used instead), inhibition of thymidine synthesis via thymidylate synthase selectively inhibits DNA synthesis over RNA synthesis. Due to their efficiency, these drugs are the most widely used cytostatics. In the ATC system, they are classified under L01B. In some embodiments, this treatment regimen is for advanced cancer. In some embodiments, this treatment regimen is excluded for early cancer.

[00103] Thymidylate synthase inhibitors are chemical agents which inhibit the enzyme thymidylate synthase and have potential as an anticancer chemotherapy. As an anti-cancer chemotherapy target, thymidylate synthetase can be inhibited by the thymidylate synthase inhibitors such as fluorinated pyrimidine fluorouracil, or certain folate analogues, the most notable one being raltitrexed (trade name Tomudex). Five agents were in clinical trials in 2002: raltitrexed, pemetrexed, nolatrexed, ZD9331, and GS7904L. Additional non-limiting examples include: Raltitrexed, used for colorectal cancer since 1998; Fluorouracil, used for colorectal cancer; BGC 945; OST7904L.

[00104] In further embodiments, there may be involved prodrugs that can be converted to thymidylate synthase inhibitors in the body, such as Capecitabine (INN), an orally administered chemotherapeutic agent used in the treatment of numerous cancers. Capecitabine is a prodrug that is enzymatically converted to 5-fluorouracil in the body. In some embodiments, this treatment regimen is for advanced cancer. In some embodiments, this treatment regimen is excluded for early cancer.

[00105] Further chemotherapeutic agents that may be used include capecitabine, fluorouracil, irinotecan, leucovorin, oxaliplatin and UFT. Another type of agent that is sometimes used is the epidermal growth factor receptor inhibitor.

[00106] In certain embodiments, alternative treatments may be prescribed or recommended based on the biomarker profile. In addition to traditional chemotherapy for colorectal cancer patients, cancer therapies also include a variety of combination therapies with both chemical and radiation based treatments. Combination chemotherapies include, for example, cisplatin (CDDP), carboplatin, procarbazine, mechlorethamine, cyclophosphamide, camptothecin, ifosfamide, melphalan, chlorambucil, busulfan, nitrosourea, dactinomycin, daunorubicin, doxorubicin, bleomycin, plicamycin, mitomycin, etoposide (VP16), tamoxifen, raloxifene, estrogen receptor binding agents, taxol, gemcitabine, navelbine, farnesyl-protein transferase inhibitors, transplatinum, 5-fluorouracil, vincristine, vinblastine and methotrexate, or any analog or derivative variant of the foregoing.

[00107] In people with incurable colorectal cancer, treatment options including palliative care can be considered for improving quality of life. Surgical options may include non-curative surgical removal of some of the cancer tissue, bypassing part of the intestines, or stent placement. These procedures can be considered to improve symptoms and reduce complications such as bleeding from the tumor, abdominal pain and intestinal obstruction. Non-operative methods of symptomatic treatment include radiation therapy to decrease tumor size as well as pain medications. In some embodiments, this treatment regimen is for advanced cancer. In some embodiments, this treatment regimen is excluded for early cancer.

[00108] Immunotherapies that are designed to boost the body’s natural defenses to fight the cancer may also be used. Immunotherapeutics, generally, rely on the use of immune effector cells and molecules to target and destroy cancer cells. The immune effector may be, for example, an antibody specific for some marker on the surface of a tumor cell. The antibody alone may serve as an effector of therapy or it may recruit other cells to actually effect cell killing. The antibody also may be conjugated to a drug or toxin (chemotherapeutic, radionuclide, ricin A chain, cholera toxin, pertussis toxin, etc.) and serve merely as a targeting agent. Alternatively, the effector may be a lymphocyte carrying a surface molecule that interacts, either directly or indirectly, with a tumor cell target. Various effector cells include cytotoxic T cells and NK cells. Immune therapy methods are further described below:

1. Checkpoint Inhibitors and Combination Treatment

[00109] Embodiments of the disclosure may include administration of immune checkpoint inhibitors, which are further described below.

a. PD-1, PDL1, and PDL2 inhibitors

[00110] PD-1 can act in the tumor microenvironment where T cells encounter an infection or tumor. Activated T cells upregulate PD-1 and continue to express it in the peripheral tissues. Cytokines such as IFN-gamma induce the expression of PDL1 on epithelial cells and tumor cells. PDL2 is expressed on macrophages and dendritic cells. The main role of PD-1 is to limit the activity of effector T cells in the periphery and prevent excessive damage to the tissues during an immune response. Inhibitors of the disclosure may block one or more functions of PD-1 and/or PDL1 activity.

[00111] Alternative names for “PD-1” include CD279 and SLEB2. Alternative names for “PDL1” include B7-H1, B7-4, CD274, and B7-H. Alternative names for “PDL2” include B7-DC, Btdc, and CD273. In some embodiments, PD-1, PDL1, and PDL2 are human PD-1, PDL1 and PDL2.

[00112] In some embodiments, the PD-1 inhibitor is a molecule that inhibits the binding of PD-1 to its ligand binding partners. In a specific aspect, the PD-1 ligand binding partners are PDL1 and/or PDL2. In another embodiment, a PDL1 inhibitor is a molecule that inhibits the binding of PDL1 to its binding partners. In a specific aspect, PDL1 binding partners are PD-1 and/or B7-1. In another embodiment, the PDL2 inhibitor is a molecule that inhibits the binding of PDL2 to its binding partners. In a specific aspect, a PDL2 binding partner is PD-1. The inhibitor may be an antibody, an antigen binding fragment thereof, an immunoadhesin, a fusion protein, or an oligopeptide. Exemplary antibodies are described in U.S. Patent Nos. 8,735,553, 8,354,509, and 8,008,449, all incorporated herein by reference. Other PD-1 inhibitors for use in the methods and compositions provided herein are known in the art such as described in U.S. Patent Application Nos. US2014/0294898, US 2014/022021, and US2011/0008369, all incorporated herein by reference.

[00113] In some embodiments, the PD-1 inhibitor is an anti-PD-1 antibody (e.g., a human antibody, a humanized antibody, or a chimeric antibody). In some embodiments, the anti-PD-1 antibody is selected from the group consisting of nivolumab, pembrolizumab, and pidilizumab. In some embodiments, the PD-1 inhibitor is an immunoadhesin (e.g., an immunoadhesin comprising an extracellular or PD-1 binding portion of PDL1 or PDL2 fused to a constant region (e.g., an Fc region of an immunoglobulin sequence)). In some embodiments, the PDL1 inhibitor comprises AMP-224. Nivolumab, also known as MDX-1106-04, MDX-1106, ONO-4538, BMS-936558, and OPDIVO®, is an anti-PD-1 antibody described in WO2006/121168. Pembrolizumab, also known as MK-3475, Merck 3475, lambrolizumab, KEYTRUDA®, and SCH-900475, is an anti-PD-1 antibody described in WO2009/114335. Pidilizumab, also known as CT-011, hBAT, or hBAT-1, is an anti-PD-1 antibody described in WO2009/101611. AMP-224, also known as B7-DCIg, is a PDL2-Fc fusion soluble receptor described in WO2010/027827 and WO2011/066342. Additional PD-1 inhibitors include MEDI0680, also known as AMP-514, and REGN2810.

[00114] In some embodiments, the immune checkpoint inhibitor is a PDL1 inhibitor such as Durvalumab, also known as MEDI4736, atezolizumab, also known as MPDL3280A, avelumab, also known as MSB00010118C, MDX-1105, BMS-936559, or combinations thereof. In certain aspects, the immune checkpoint inhibitor is a PDL2 inhibitor such as rHIgM12B7.

[00115] In some embodiments, the inhibitor comprises the heavy and light chain CDRs or VRs of nivolumab, pembrolizumab, or pidilizumab. Accordingly, in one embodiment, the inhibitor comprises the CDR1, CDR2, and CDR3 domains of the VH region of nivolumab, pembrolizumab, or pidilizumab, and the CDR1, CDR2 and CDR3 domains of the VL region of nivolumab, pembrolizumab, or pidilizumab. In another embodiment, the antibody competes for binding with and/or binds to the same epitope on PD-1, PDL1, or PDL2 as the above-mentioned antibodies. In another embodiment, the antibody has at least about 70, 75, 80, 85, 90, 95, 97, or 99% (or any derivable range therein) variable region amino acid sequence identity with the above-mentioned antibodies.

b. CTLA-4, B7-1, and B7-2

[00116] Another immune checkpoint that can be targeted in the methods provided herein is the cytotoxic T-lymphocyte-associated protein 4 (CTLA-4), also known as CD152. The complete cDNA sequence of human CTLA-4 has the Genbank accession number L15006. CTLA-4 is found on the surface of T cells and acts as an “off” switch when bound to B7-1 (CD80) or B7-2 (CD86) on the surface of antigen-presenting cells. CTLA-4 is a member of the immunoglobulin superfamily that is expressed on the surface of helper T cells and transmits an inhibitory signal to T cells. CTLA-4 is similar to the T-cell co-stimulatory protein, CD28, and both molecules bind to B7-1 and B7-2 on antigen-presenting cells. CTLA-4 transmits an inhibitory signal to T cells, whereas CD28 transmits a stimulatory signal. Intracellular CTLA-4 is also found in regulatory T cells and may be important to their function. T cell activation through the T cell receptor and CD28 leads to increased expression of CTLA-4, an inhibitory receptor for B7 molecules. Inhibitors of the disclosure may block one or more functions of CTLA-4, B7-1, and/or B7-2 activity. In some embodiments, the inhibitor blocks the CTLA-4 and B7-1 interaction. In some embodiments, the inhibitor blocks the CTLA-4 and B7-2 interaction.

[00117] In some embodiments, the immune checkpoint inhibitor is an anti-CTLA-4 antibody (e.g., a human antibody, a humanized antibody, or a chimeric antibody), an antigen binding fragment thereof, an immunoadhesin, a fusion protein, or an oligopeptide.

[00118] Anti-human-CTLA-4 antibodies (or VH and/or VL domains derived therefrom) suitable for use in the present methods can be generated using methods well known in the art. Alternatively, art-recognized anti-CTLA-4 antibodies can be used. For example, the anti-CTLA-4 antibodies disclosed in US 8,119,129; WO 01/14424; WO 98/42752; WO 00/37504 (CP675,206, also known as tremelimumab; formerly ticilimumab); U.S. Patent No. 6,207,156; and Hurwitz et al., 1998, can be used in the methods disclosed herein. The teachings of each of the aforementioned publications are hereby incorporated by reference. Antibodies that compete with any of these art-recognized antibodies for binding to CTLA-4 also can be used. For example, a humanized CTLA-4 antibody is described in International Patent Application Nos. WO2001/014424 and WO2000/037504, and U.S. Patent No. 8,017,114; all incorporated herein by reference.

[00119] A further anti-CTLA-4 antibody useful as a checkpoint inhibitor in the methods and compositions of the disclosure is ipilimumab (also known as 10D1, MDX-010, MDX-101, and Yervoy®) or antigen binding fragments and variants thereof (see, e.g., WO 01/14424).

[00120] In some embodiments, the inhibitor comprises the heavy and light chain CDRs or VRs of tremelimumab or ipilimumab. Accordingly, in one embodiment, the inhibitor comprises the CDR1, CDR2, and CDR3 domains of the VH region of tremelimumab or ipilimumab, and the CDR1, CDR2 and CDR3 domains of the VL region of tremelimumab or ipilimumab. In another embodiment, the antibody competes for binding with and/or binds to the same epitope on PD-1, B7-1, or B7-2 as the above-mentioned antibodies. In another embodiment, the antibody has at least about 70, 75, 80, 85, 90, 95, 97, or 99% (or any derivable range therein) variable region amino acid sequence identity with the above-mentioned antibodies.

2. Other immunotherapies

[00121] In some embodiments, the methods comprise administration of a cancer immunotherapy. Cancer immunotherapy (sometimes called immuno-oncology, abbreviated IO) is the use of the immune system to treat cancer. Immunotherapies can be categorized as active, passive or hybrid (active and passive). These approaches exploit the fact that cancer cells often have molecules on their surface that can be detected by the immune system, known as tumor-associated antigens (TAAs); they are often proteins or other macromolecules (e.g., carbohydrates). Active immunotherapy directs the immune system to attack tumor cells by targeting TAAs. Passive immunotherapies enhance existing anti-tumor responses and include the use of monoclonal antibodies, lymphocytes and cytokines. Immunotherapies are known in the art, and some are described below.

a. Inhibition of co-stimulatory molecules

[00122] In some embodiments, the immunotherapy comprises an inhibitor of a co-stimulatory molecule. In some embodiments, the inhibitor comprises an inhibitor of B7-1 (CD80), B7-2 (CD86), CD28, ICOS, OX40 (TNFRSF4), 4-1BB (CD137; TNFRSF9), CD40L (CD40LG), GITR (TNFRSF18), and combinations thereof. Inhibitors include inhibitory antibodies, polypeptides, compounds, and nucleic acids.

b. Dendritic cell therapy

[00123] Dendritic cell therapy provokes anti-tumor responses by causing dendritic cells to present tumor antigens to lymphocytes, which activates them, priming them to kill other cells that present the antigen. Dendritic cells are antigen presenting cells (APCs) in the mammalian immune system. In cancer treatment they aid cancer antigen targeting. One example of cellular cancer therapy based on dendritic cells is sipuleucel-T.

[00124] One method of inducing dendritic cells to present tumor antigens is by vaccination with autologous tumor lysates or short peptides (small parts of protein that correspond to the protein antigens on cancer cells). These peptides are often given in combination with adjuvants (highly immunogenic substances) to increase the immune and anti-tumor responses. Other adjuvants include proteins or other chemicals that attract and/or activate dendritic cells, such as granulocyte macrophage colony-stimulating factor (GM-CSF).

[00125] Dendritic cells can also be activated in vivo by making tumor cells express GM-CSF. This can be achieved by either genetically engineering tumor cells to produce GM-CSF or by infecting tumor cells with an oncolytic virus that expresses GM-CSF.

[00126] Another strategy is to remove dendritic cells from the blood of a patient and activate them outside the body. The dendritic cells are activated in the presence of tumor antigens, which may be a single tumor-specific peptide/protein or a tumor cell lysate (a solution of broken down tumor cells). These cells (with optional adjuvants) are infused and provoke an immune response.

[00127] Dendritic cell therapies include the use of antibodies that bind to receptors on the surface of dendritic cells. Antigens can be added to the antibody and can induce the dendritic cells to mature and provide immunity to the tumor. Dendritic cell receptors such as TLR3, TLR7, TLR8 or CD40 have been used as antibody targets.

c. CAR-T cell therapy

[00128] Chimeric antigen receptors (CARs, also known as chimeric immunoreceptors, chimeric T cell receptors or artificial T cell receptors) are engineered receptors that combine a new specificity with an immune cell to target cancer cells. Typically, these receptors graft the specificity of a monoclonal antibody onto a T cell. The receptors are called chimeric because they are fused of parts from different sources. CAR-T cell therapy refers to a treatment that uses such transformed cells for cancer therapy.

[00129] The basic principle of CAR-T cell design involves recombinant receptors that combine antigen-binding and T-cell activating functions. The general premise of CAR-T cells is to artificially generate T-cells targeted to markers found on cancer cells. Scientists can remove T-cells from a person, genetically alter them, and put them back into the patient for them to attack the cancer cells. Once the T cell has been engineered to become a CAR-T cell, it acts as a “living drug”. CAR-T cells create a link between an extracellular ligand recognition domain and an intracellular signalling molecule, which in turn activates T cells. The extracellular ligand recognition domain is usually a single-chain variable fragment (scFv). An important aspect of the safety of CAR-T cell therapy is how to ensure that only cancerous tumor cells are targeted, and not normal cells. The specificity of CAR-T cells is determined by the choice of molecule that is targeted.

[00130] Exemplary CAR-T therapies include Tisagenlecleucel (Kymriah) and Axicabtagene ciloleucel (Yescarta). In some embodiments, the CAR-T therapy targets CD19.

d. Cytokine therapy

[00131] Cytokines are proteins produced by many types of cells present within a tumor. They can modulate immune responses. The tumor often employs them to allow it to grow and reduce the immune response. These immune-modulating effects allow them to be used as drugs to provoke an immune response. Two commonly used cytokines are interferons and interleukins.

[00132] Interferons are produced by the immune system. They are usually involved in antiviral response, but also have use for cancer. They fall in three groups: type I (IFNα and IFNβ), type II (IFNγ) and type III (IFNλ).

[00133] Interleukins have an array of immune system effects. IL-2 is an exemplary interleukin cytokine therapy.

e. Adoptive T-cell therapy

[00134] Adoptive T cell therapy is a form of passive immunization by the transfusion of T cells (adoptive cell transfer). T cells are found in blood and tissue and usually activate when they find foreign pathogens. Specifically, they activate when the T cell's surface receptors encounter cells that display parts of foreign proteins on their surface antigens. These can be either infected cells or antigen presenting cells (APCs). T cells are found in normal tissue and in tumor tissue, where they are known as tumor infiltrating lymphocytes (TILs). They are activated by the presence of APCs such as dendritic cells that present tumor antigens. Although these cells can attack the tumor, the environment within the tumor is highly immunosuppressive, preventing immune-mediated tumor death.

[00135] Multiple ways of producing and obtaining tumor-targeted T cells have been developed. T cells specific to a tumor antigen can be removed from a tumor sample (TILs) or filtered from blood. Subsequent activation and culturing is performed ex vivo, with the results reinfused. Activation can take place through gene therapy, or by exposing the T cells to tumor antigens.

[00136] It is contemplated that a cancer treatment may exclude any of the cancer treatments described herein. Furthermore, embodiments of the disclosure include patients that have been previously treated with a therapy described herein, are currently being treated with a therapy described herein, or have not been treated with a therapy described herein. In some embodiments, the patient is one that has been determined to be resistant to a therapy described herein. In some embodiments, the patient is one that has been determined to be sensitive to a therapy described herein.

C. Monitoring

[00137] In certain aspects, the methods of the disclosure may be combined with one or more other colon cancer diagnosis or screening tests at increased frequency if the patient is determined to be at high risk for recurrence or have a poor prognosis based on the biomarker described above.

[00138] The colon monitoring may include any methods known in the art. In particular, the monitoring may include obtaining a sample and testing the sample for diagnosis. For example, the colon monitoring may include colonoscopy or coloscopy, which is the endoscopic examination of the large bowel and the distal part of the small bowel with a CCD camera or a fiber optic camera on a flexible tube passed through the anus. It can provide a visual diagnosis (e.g., ulceration, polyps) and grants the opportunity for biopsy or removal of suspected colorectal cancer lesions. Thus, colonoscopy or coloscopy can be used for treatment.

[00139] In further aspects, the monitoring diagnosis may include sigmoidoscopy, which is similar to colonoscopy, the difference being which parts of the colon each can examine. A colonoscopy allows an examination of the entire colon (1200-1500 mm in length). A sigmoidoscopy allows an examination of the distal portion (about 600 mm) of the colon, which may be sufficient because benefits to cancer survival of colonoscopy have been limited to the detection of lesions in the distal portion of the colon. A sigmoidoscopy is often used as a screening procedure for a full colonoscopy, often done in conjunction with a fecal occult blood test (FOBT). About 5% of these screened patients are referred to colonoscopy.

[00140] In additional aspects, the monitoring diagnosis may include virtual colonoscopy, which uses 2D and 3D imagery reconstructed from computed tomography (CT) scans or from nuclear magnetic resonance (MR) scans, as a totally non-invasive medical test.

[00141] The monitoring may include the use of one or more screening tests for colon cancer including, but not limited to, fecal occult blood testing, flexible sigmoidoscopy, and colonoscopy. Of the three, only sigmoidoscopy cannot screen the right side of the colon where 42% of malignancies are found. Virtual colonoscopy via a CT scan appears as good as standard colonoscopy for detecting cancers and large adenomas but is expensive, associated with radiation exposure, and cannot remove any detected abnormal growths like standard colonoscopy can. Fecal occult blood testing (FOBT) of the stool is typically recommended every two years and can be either guaiac based or immunochemical. Annual FOBT screening results in a 16% relative risk reduction in colorectal cancer mortality, but no difference in all cause mortality. The M2-PK test identifies an enzyme in colorectal cancers and polyps rather than blood in the stool. It does not require any special preparation prior to testing. M2-PK is sensitive for colorectal cancer and polyps and is able to detect bleeding and non-bleeding colorectal cancer and polyps. In the event of a positive result, people would be asked to undergo further examination, e.g., colonoscopy.

IV. ROC analysis, Biomarkers and Sample Preparation

[00142] In statistics, a receiver operating characteristic (ROC), or ROC curve, is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied. The curve is created by plotting the true positive rate against the false positive rate at various threshold settings. (The true-positive rate is also known as sensitivity in biomedical informatics, or recall in machine learning. The false-positive rate is also known as the fall-out and can be calculated as 1 - specificity). The ROC curve is thus the sensitivity as a function of fall-out. In general, if the probability distributions for both detection and false alarm are known, the ROC curve can be generated by plotting the cumulative distribution function (area under the probability distribution from -infinity to the discrimination threshold) of the detection probability on the y-axis versus the cumulative distribution function of the false-alarm probability on the x-axis.
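As an illustration only, the construction of an ROC curve from a finite set of scored samples can be sketched as follows. The function names `roc_points` and `auc`, the convention that a sample is called positive when its score is at or above the threshold, and the example scores and labels are assumptions made for this sketch and are not taken from the disclosure.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    One pair is produced per candidate threshold, from the strictest
    threshold to the most permissive. A sample is called positive when
    its score is >= the threshold (an assumed convention).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores; label 1 = event (for example recurrence), 0 = no event.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(curve)
print(auc(curve))
```

Each point corresponds to one threshold setting, so sweeping the threshold traces sensitivity as a function of fall-out.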

[00143] ROC analysis provides tools to select possibly optimal models and to discard suboptimal ones independently from (and prior to specifying) the cost context or the class distribution. ROC analysis is related in a direct and natural way to cost/benefit analysis of diagnostic decision making. ROC analysis provides a tool for creating cut-off values to partition patient populations into high expression and low expression of certain biomarkers.

[00144] The ROC is also known as a relative operating characteristic curve, because it is a comparison of two operating characteristics (TPR and FPR) as the criterion changes. ROC analysis curves are known in the art and described in Metz CE (1978) Basic principles of ROC analysis. Seminars in Nuclear Medicine 8:283-298; Youden WJ (1950) An index for rating diagnostic tests. Cancer 3:32-35; Zweig MH, Campbell G (1993) Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clinical Chemistry 39:561-577; and Greiner M, Pfeiffer D, Smith RD (2000) Principles and practical application of the receiver-operating characteristic analysis for diagnostic tests. Preventive Veterinary Medicine 45:23-41, which are herein incorporated by reference in their entirety.
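One common way to derive a cut-off value from ROC analysis is to maximize the Youden index, J = sensitivity + specificity - 1. The sketch below assumes that criterion; the disclosure does not state which criterion is used, and the function name `youden_cutoff` and the example values are hypothetical.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Samples with value >= cutoff are treated as "high expression"
    (an assumed convention for this sketch).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= cutoff and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical expression values; label 1 = event, 0 = no event.
values = [2.1, 3.4, 1.2, 4.8, 0.9, 3.9]
labels = [0, 1, 0, 1, 0, 1]
cutoff, j = youden_cutoff(values, labels)

# Partition the patients into high and low expression groups.
groups = ["high" if v >= cutoff else "low" for v in values]
print(cutoff, j, groups)
```

Other criteria, such as a fixed specificity or a cost-weighted index, could be substituted in the loop without changing the rest of the partitioning step.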

[00145] Biomarkers for identifying effective treatment for colorectal cancer patients are provided. It is contemplated that these biomarkers may be evaluated based on their gene products. In some embodiments, the gene product is the RNA transcript. In other embodiments, the gene product is the protein expressed by the RNA transcript.

[00146] In certain aspects a meta-analysis of expression can be performed. In statistics, a meta-analysis combines the results of several studies that address a set of related research hypotheses. This is normally done by identification of a common measure of effect size, which is modeled using a form of meta-regression. Generally, three types of models can be distinguished in the literature on meta-analysis: simple regression, fixed effects meta-regression and random effects meta-regression. Resulting overall averages when controlling for study characteristics can be considered meta-effect sizes, which are more powerful estimates of the true effect size than those derived in a single study under a given single set of assumptions and conditions. A meta-gene expression value, in this context, is to be understood as being the median of the normalized expression of a marker gene or activity. Normalization of the expression of a marker gene is preferably achieved by dividing the expression level of the individual marker gene to be normalized by the respective individual median expression of this marker gene, wherein said median expression is preferably calculated from multiple measurements of the respective gene in a sufficiently large cohort of test individuals. The test cohort may comprise at least 3, 10, 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 individuals or more, including all values and ranges thereof. Dataset-specific bias can be removed or minimized, allowing multiple datasets to be combined for meta-analyses (see Sims et al. BMC Medical Genomics (1:42), 1-14, 2008, which is incorporated herein by reference in its entirety).

[00147] The calculation of a meta-gene expression value is performed by: (i) determining the gene expression value of at least two, preferably more, genes; (ii) "normalizing" the gene expression value of each individual gene by dividing the expression value by a coefficient which is approximately the median expression value of the respective gene in a representative breast cancer cohort; and (iii) calculating the median of the group of normalized gene expression values.
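Steps (i) to (iii) can be sketched as follows. This is a minimal illustration, not the applicants' implementation: the function name `meta_gene_value`, the dictionary layout, and all numeric values are assumptions; the gene symbols are taken from the biomarker list of this application.

```python
from statistics import median


def meta_gene_value(sample_expression, cohort_medians):
    """Median of per-gene expression values normalized by cohort medians.

    sample_expression: gene -> expression value measured in one sample.
    cohort_medians:    gene -> median expression of that gene in the cohort.
    """
    if len(sample_expression) < 2:
        raise ValueError("at least two genes are required")
    # Step (ii): divide each gene's value by its cohort median.
    normalized = [
        sample_expression[gene] / cohort_medians[gene]
        for gene in sample_expression
    ]
    # Step (iii): take the median of the normalized values.
    return median(normalized)


# Hypothetical values for illustration only.
cohort_medians = {"FN1": 10.0, "COL3A1": 20.0, "POSTN": 5.0}
sample = {"FN1": 15.0, "COL3A1": 10.0, "POSTN": 10.0}
value = meta_gene_value(sample, cohort_medians)
print(value)
```

Using the median rather than the mean in step (iii) keeps a single strongly deviating gene from dominating the meta-gene value.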

[00148] A gene shall be understood to be specifically expressed in a certain cell type if the expression level of the gene in the cell type is at least about 2-fold, 5-fold, 10-fold, 100-fold, 1000-fold, or 10000-fold higher (or any range derivable therein) than in a reference cell type, or in a mixture of reference cell types. Reference cell types include non-cancerous breast tissue cells or a heterogeneous population of breast cancers.

[00149] In certain algorithms a suitable threshold level is first determined for a marker gene. The suitable threshold level can be determined from measurements of the marker gene expression in multiple individuals from a test cohort. The median expression of the marker gene in said multiple expression measurements is taken as the suitable threshold value.

[00150] Comparison of multiple marker genes with a threshold level can be performed as follows:

[00151] 1. The individual marker genes are compared to their respective threshold levels.

[00152] 2. The number of marker genes, the expression level of which is above their respective threshold level, is determined.

[00153] 3. If a marker gene is expressed above its respective threshold level, then the expression level of the marker gene is taken to be "above the threshold level".

[00154] "A sufficiently large number", in this context, means preferably 30%, 50%, 80%, 90%, or 95% of the marker genes used.
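The threshold determination and comparison steps above can be sketched as follows. The function names, the data layout, and the example values are hypothetical. Applying the "sufficiently large number" fraction to decide whether the marker panel as a whole counts as above threshold is an interpretation assumed for this sketch.

```python
from statistics import median


def median_thresholds(cohort):
    """Per-gene threshold = median expression across the test cohort."""
    genes = list(cohort[0].keys())
    return {g: median([sample[g] for sample in cohort]) for g in genes}


def fraction_above(sample, thresholds):
    """Fraction of marker genes expressed above their own threshold."""
    above = [g for g in thresholds if sample[g] > thresholds[g]]
    return len(above) / len(thresholds)


def panel_above_threshold(sample, thresholds, required_fraction=0.5):
    """True if a sufficiently large fraction of markers is above threshold.

    required_fraction is an assumed parameter, e.g. 0.3, 0.5, 0.8, 0.9, 0.95.
    """
    return fraction_above(sample, thresholds) >= required_fraction


# Hypothetical test cohort and patient sample.
cohort = [
    {"KLF7": 1.0, "RARB": 4.0, "TCF4": 2.0},
    {"KLF7": 2.0, "RARB": 5.0, "TCF4": 3.0},
    {"KLF7": 3.0, "RARB": 6.0, "TCF4": 4.0},
]
thresholds = median_thresholds(cohort)
patient = {"KLF7": 2.5, "RARB": 5.5, "TCF4": 1.0}
print(thresholds)
print(panel_above_threshold(patient, thresholds, required_fraction=0.5))
```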

[00155] In certain aspects, the determination of expression levels is on a gene chip, such as an Affymetrix™ gene chip. In other embodiments, RNA sequencing is employed.

[00156] In another aspect, the determination of expression levels is done by kinetic real time PCR.

[00157] In certain aspects, the methods can relate to a system for performing such methods, the system comprising (a) apparatus or device for storing data regarding expression levels of one or more biomarkers; (b) apparatus or device for determining the expression level of at least one biomarker; (c) apparatus or device for comparing the expression level of the first biomarker with a predetermined first threshold value; (d) apparatus or device for determining the expression level of at least one second biomarker; and (e) computing apparatus or device programmed to provide information about colorectal cancer, including treatment and prognosis, if the data indicates altered expression levels of said first biomarker as compared to the predetermined first threshold value and, alternatively, the expression level of said second biomarker is above or below a predetermined second threshold level, wherein the predetermined threshold values are based on expression levels for biomarkers that provide information about prognosis and treatment. The person skilled in the art readily appreciates that an unfavorable or poor prognosis can be given if the comparison of the expression level of one or more biomarkers with the predetermined threshold value indicates a high risk of not surviving or not responding well to standard therapies or likelihood of metastasis.

[00158] The expression levels of biomarkers can be compared to reference expression levels using various methods. These reference levels can be determined using expression levels of a reference based on one or more cohorts of colorectal cancer patients. Any comparison can be performed using the fold change or the absolute difference between the expression levels to be compared. One or more cancer biomarkers can be used in the comparison. It is contemplated that 1, 2, 3, 4, 5, 6, 7, 8, and/or 9 or more biomarkers may be compared to each other and/or to a reference that is internal or external. A person of ordinary skill in the art would know how to do such comparisons.

[00159] Comparisons or results from comparisons may reveal or be expressed as x-fold increase or decrease in expression relative to a standard or relative to another biomarker in a different class of prognosis or treatment. In some embodiments, patients with a poor prognosis have a relatively high level of expression (increased expression or overexpression) or relatively low level of expression (reduced expression or underexpression) when compared to patients with a better or favorable prognosis, or vice versa.

[00160] Fold increases or decreases may be, be at least, or be at most 1-, 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 11-, 12-, 13-, 14-, 15-, 16-, 17-, 18-, 19-, 20-, 25-, 30-, 35-, 40-, 45-, 50-, 55-, 60-, 65-, 70-, 75-, 80-, 85-, 90-, 95-, 100- or more, or any range derivable therein. Alternatively, differences in expression may be expressed as a percent decrease or increase, such as at least or at most 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900, 1000% difference, or any range derivable therein.
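For illustration, the three ways of expressing a comparison mentioned above (fold change, absolute difference, and percent difference) can be computed as in the following sketch. The function names and the example levels are hypothetical, and the levels are assumed to be on a linear (not log-transformed) scale.

```python
def fold_change(sample_level, reference_level):
    """Ratio of the sample level to the reference level (linear scale)."""
    if reference_level == 0:
        raise ValueError("reference level must be non-zero")
    return sample_level / reference_level


def absolute_difference(sample_level, reference_level):
    """Signed difference between the sample and reference levels."""
    return sample_level - reference_level


def percent_difference(sample_level, reference_level):
    """Signed percent increase (+) or decrease (-) relative to the reference."""
    if reference_level == 0:
        raise ValueError("reference level must be non-zero")
    return (sample_level - reference_level) / reference_level * 100.0


# Hypothetical expression levels.
sample_level, reference_level = 6.0, 2.0
print(fold_change(sample_level, reference_level))          # 3-fold increase
print(absolute_difference(sample_level, reference_level))  # 4.0
print(percent_difference(sample_level, reference_level))   # 200% increase
```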

[00161] Other ways to express relative expression levels are by normalized or relative numbers such as 0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 8.0, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, 9.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 10.0, or any range derivable therein.

[00162] Algorithms, such as the weighted voting programs, can be used to facilitate the evaluation of biomarker levels. In addition, other clinical evidence can be combined with the biomarker-based test to reduce the risk of false evaluations. Other cytogenetic evaluations may be considered in some embodiments of the invention. In some embodiments, the expression levels of one or more biomarkers are within a predetermined amount of the mean expression levels of the one or more biomarkers, on a biomarker-by-biomarker basis, in the biological samples from a cohort of patients having colorectal cancer with a poor survival outcome or a cohort of patients having a good survival outcome within a certain time period, such as 6 months, 1 year, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more years. The mean levels may be determined by measuring the expression levels of biomarkers in samples from patients in the cohort and calculating a mean expression level for each biomarker. In some embodiments, the patients are patients having colorectal cancer or a particular subtype of patient. Classification of a patient may be done by comparing the measured expression levels of biomarkers to reference expression levels of the same biomarkers. The reference expression levels may be identified as the mean expression levels in a cohort of patients with low risk survival or high risk survival, for example. The reference expression levels of such cohorts, and of any patient cohorts described herein, may be established by measuring the expression levels in biological samples of at least, at most, or exactly 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, or 1000 subjects in the cohort, or any range derivable therein. In some embodiments, the cohort of patients comprises a representative sample of colorectal cancer patients who survive or do not survive (or have metastasis or do not have metastasis) within a certain time period such as within 1, 2, 3, 4, or 5 years of initial diagnosis or completion of primary treatment. If the expression levels of the biomarkers measured in a sample are sufficiently close to the reference expression levels of a particular cohort, then the sample in question can be classified as being of that characteristic. The degree of closeness in expression levels required to be classified as a match may be predetermined using a statistical analysis. In some embodiments, the predetermined amount of closeness is within one standard deviation of the mean expression level of the reference cohort. In some embodiments, the predetermined amount is within 0.1, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0, 10, 15, or 20% of the reference expression level, or any range derivable therein. In some embodiments, a sample may be classified as belonging to a low risk survival cohort or a high risk survival cohort despite the expression levels of one or more biomarkers deviating from a reference expression level by a substantial amount. For instance, if a substantial number of other biomarker expression levels sufficiently match the reference expression, then the sample may be classified as belonging to the subtype. A computer-based classifier programmed to perform a statistical analysis may be used to determine whether expression levels of a sufficient number of biomarkers in a sample are sufficiently close to the reference expression levels of a particular molecular subtype to classify the sample as belonging to that subtype.
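A minimal sketch of such a computer-based classifier is given below, assuming the "within one standard deviation of the reference cohort mean" criterion and a required fraction of matching biomarkers. The function names, the `min_fraction` parameter, the cohort labels, and all numeric values are hypothetical.

```python
from statistics import mean, stdev


def reference_profile(cohort):
    """Per-biomarker (mean, standard deviation) across a reference cohort."""
    genes = list(cohort[0].keys())
    profile = {}
    for g in genes:
        levels = [sample[g] for sample in cohort]
        profile[g] = (mean(levels), stdev(levels))
    return profile


def matches(sample, profile, min_fraction=0.8):
    """True if enough biomarkers lie within one SD of the reference mean."""
    close = sum(
        1 for g, (m, sd) in profile.items() if abs(sample[g] - m) <= sd
    )
    return close / len(profile) >= min_fraction


def classify(sample, profiles, min_fraction=0.8):
    """Names of all reference cohorts whose profile the sample matches."""
    return [
        name for name, profile in profiles.items()
        if matches(sample, profile, min_fraction)
    ]


# Hypothetical reference cohorts and patient sample.
low_risk = [{"FN1": 1.0, "POSTN": 2.0}, {"FN1": 2.0, "POSTN": 3.0},
            {"FN1": 3.0, "POSTN": 4.0}]
high_risk = [{"FN1": 8.0, "POSTN": 9.0}, {"FN1": 9.0, "POSTN": 10.0},
             {"FN1": 10.0, "POSTN": 11.0}]
profiles = {"low_risk": reference_profile(low_risk),
            "high_risk": reference_profile(high_risk)}
patient = {"FN1": 8.5, "POSTN": 10.5}
print(classify(patient, profiles))
```

Lowering `min_fraction` lets a sample match a cohort even when some biomarkers deviate substantially, which corresponds to the tolerance described in the paragraph above.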

[00163] It is contemplated that the methods described herein may involve a comparison between expression levels measured for a sample and reference expression levels that are indicative of different prognostic and/or treatment outcomes. Thus, in some embodiments, the measured expression level for a biomarker is lower than, higher than, close to, higher by a predetermined amount than, lower by a predetermined amount than, or within a predetermined amount of the expression level of the biomarker from a cohort of specific patients.

[00164] A unique collection of biomarkers as a genetic classifier with respect to expression states in a cancer tissue is provided that is useful in determining prognosis and treatment options. The panel also provides relevant information about prognosis and/or treatment with other cancer treatment such as chemotherapeutics, radiation, and/or immunotherapeutics. Such a collection may be termed a “biomarker panel,” “expression classifier,” or “classifier.”

[00165] In some embodiments, a score is calculated based on the expression profile of a patient. In certain embodiments, the value assigned to represent the expression of one or more genes may be adjusted. In some cases, a weight is attached to one or more values. The term “weight” refers to the relative importance of an item in a statistical calculation. The weight of each biomarker in an expression level classifier may be determined on a data set of patient samples using analytical methods known in the art.
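One simple form such a score could take is a weighted linear combination of expression values. The sketch below assumes that form; the disclosure does not specify the scoring formula, and the function name `risk_score`, the optional intercept, and the weights and expression values are hypothetical (in practice the weights would be fitted on a data set of patient samples).

```python
def risk_score(expression, weights, intercept=0.0):
    """Weighted sum of biomarker expression values plus an optional intercept.

    expression: biomarker -> measured (possibly normalized) expression value.
    weights:    biomarker -> weight reflecting its relative importance.
    """
    missing = [g for g in weights if g not in expression]
    if missing:
        raise KeyError(f"expression values missing for: {missing}")
    return intercept + sum(weights[g] * expression[g] for g in weights)


# Hypothetical weights and expression values for illustration only.
weights = {"FN1": 0.5, "DKK3": 0.25, "BCAT1": -0.5}
expression = {"FN1": 4.0, "DKK3": 2.0, "BCAT1": 1.0}
score = risk_score(expression, weights)
print(score)
```

A negative weight lets a biomarker whose higher expression is associated with a better outcome lower the score.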

[00166] In certain aspects, methods involve obtaining a sample from a subject. The methods of obtaining provided herein may include methods of biopsy such as fine needle aspiration, core needle biopsy, vacuum assisted biopsy, incisional biopsy, excisional biopsy, punch biopsy, shave biopsy or skin biopsy. In certain embodiments the sample is obtained from a biopsy from esophageal tissue by any of the biopsy methods previously mentioned. In other embodiments the sample may be obtained from any of the tissues provided herein that include but are not limited to non-cancerous or cancerous tissue and non-cancerous or cancerous tissue from the serum, gall bladder, mucosal, skin, heart, lung, breast, pancreas, blood, liver, muscle, kidney, smooth muscle, bladder, colon, intestine, brain, prostate, esophagus, or thyroid tissue. Alternatively, the sample may be obtained from any other source including but not limited to blood, sweat, hair follicle, buccal tissue, tears, menses, feces, or saliva. In certain aspects of the current methods, any medical professional such as a doctor, nurse or medical technician may obtain a biological sample for testing. Yet further, the biological sample can be obtained without the assistance of a medical professional.

[00167] A sample may include but is not limited to, tissue, cells, or biological material from cells or derived from cells of a subject. The biological sample may be a heterogeneous or homogeneous population of cells or tissues. The biological sample may be obtained using any method known to the art that can provide a sample suitable for the analytical methods described herein. The sample may be obtained by non-invasive methods including but not limited to: scraping of the skin or cervix, swabbing of the cheek, saliva collection, urine collection, feces collection, collection of menses, tears, or semen.

[00168] The sample may be obtained by methods known in the art. In certain embodiments the samples are obtained by biopsy. In other embodiments the sample is obtained by swabbing, endoscopy, scraping, phlebotomy, or any other methods known in the art. In some cases, the sample may be obtained, stored, or transported using components of a kit of the present methods. In some cases, multiple samples, such as multiple esophageal samples, may be obtained for diagnosis by the methods described herein. In other cases, multiple samples, such as one or more samples from one tissue type (for example esophagus) and one or more samples from another specimen (for example serum), may be obtained for diagnosis by the methods. In some cases, multiple samples, such as one or more samples from one tissue type (e.g., esophagus) and one or more samples from another specimen (e.g., serum), may be obtained at the same or different times. Samples obtained at different times may be stored and/or analyzed by different methods. For example, a sample may be obtained and analyzed by routine staining methods or any other cytological analysis methods.

[00169] In some embodiments the biological sample may be obtained by a physician, nurse, or other medical professional such as a medical technician, endocrinologist, cytologist, phlebotomist, radiologist, or a pulmonologist. The medical professional may indicate the appropriate test or assay to perform on the sample. In certain aspects a molecular profiling business may consult on which assays or tests are most appropriately indicated. In further aspects of the current methods, the patient or subject may obtain a biological sample for testing without the assistance of a medical professional, such as obtaining a whole blood sample, a urine sample, a fecal sample, a buccal sample, or a saliva sample.

[00170] In other cases, the sample is obtained by an invasive procedure including but not limited to: biopsy, needle aspiration, endoscopy, or phlebotomy. The method of needle aspiration may further include fine needle aspiration, core needle biopsy, vacuum assisted biopsy, or large core biopsy. In some embodiments, multiple samples may be obtained by the methods herein to ensure a sufficient amount of biological material.

[00171] General methods for obtaining biological samples are also known in the art. Publications such as Ramzy, Ibrahim, Clinical Cytopathology and Aspiration Biopsy 2001, which is herein incorporated by reference in its entirety, describe general methods for biopsy and cytological methods. In one embodiment, the sample is a fine needle aspirate of an esophageal or a suspected esophageal tumor or neoplasm. In some cases, the fine needle aspirate sampling procedure may be guided by the use of an ultrasound, X-ray, or other imaging device.

[00172] In some embodiments of the present methods, the molecular profiling business may obtain the biological sample from a subject directly, from a medical professional, from a third party, or from a kit provided by a molecular profiling business or a third party. In some cases, the biological sample may be obtained by the molecular profiling business after the subject, a medical professional, or a third party acquires and sends the biological sample to the molecular profiling business. In some cases, the molecular profiling business may provide suitable containers, and excipients for storage and transport of the biological sample to the molecular profiling business.

[00173] In some embodiments of the methods described herein, a medical professional need not be involved in the initial diagnosis or sample acquisition. An individual may alternatively obtain a sample through the use of an over the counter (OTC) kit. An OTC kit may contain a means for obtaining said sample as described herein, a means for storing said sample for inspection, and instructions for proper use of the kit. In some cases, molecular profiling services are included in the price for purchase of the kit. In other cases, the molecular profiling services are billed separately. A sample suitable for use by the molecular profiling business may be any material containing tissues, cells, nucleic acids, genes, gene fragments, expression products, gene expression products, or gene expression product fragments of an individual to be tested. Methods for determining sample suitability and/or adequacy are provided.

[00174] In some embodiments, the subject may be referred to a specialist such as an oncologist, surgeon, or endocrinologist. The specialist may likewise obtain a biological sample for testing or refer the individual to a testing center or laboratory for submission of the biological sample. In some cases the medical professional may refer the subject to a testing center or laboratory for submission of the biological sample. In other cases, the subject may provide the sample. In some cases, a molecular profiling business may obtain the sample.

V. Nucleic Acid Assays

[00175] Aspects of the methods include assaying nucleic acids to determine expression levels. Arrays can be used to detect differences between two samples. Specifically contemplated applications include identifying and/or quantifying differences between miRNA from a sample that is normal and from a sample that is not normal, between a cancerous condition and a non-cancerous condition, or between two differently treated samples. Also, miRNA may be compared between a sample believed to be susceptible to a particular disease or condition and one believed to be not susceptible or resistant to that disease or condition. A sample that is not normal is one exhibiting phenotypic trait(s) of a disease or condition or one believed to be not normal with respect to that disease or condition. It may be compared to a cell that is normal with respect to that disease or condition. Phenotypic traits include symptoms of, or susceptibility to, a disease or condition of which a component is or may or may not be genetic or caused by a hyperproliferative or neoplastic cell or cells.

[00176] An array comprises a solid support with nucleic acid probes attached to the support. Arrays typically comprise a plurality of different nucleic acid probes that are coupled to a surface of a substrate in different, known locations. These arrays, also described as "microarrays" or colloquially "chips," have been generally described in the art, for example, U.S. Pat. Nos. 5,143,854, 5,445,934, 5,744,305, 5,677,195, 6,040,193, 5,424,186 and Fodor et al., 1991, each of which is incorporated by reference in its entirety for all purposes. Techniques for the synthesis of these arrays using mechanical synthesis methods are described in, e.g., U.S. Pat. No. 5,384,261, incorporated herein by reference in its entirety for all purposes. Although a planar array surface is used in certain aspects, the array may be fabricated on a surface of virtually any shape or even a multiplicity of surfaces. Arrays may be nucleic acids on beads, gels, polymeric surfaces, fibers such as fiber optics, glass or any other appropriate substrate; see U.S. Pat. Nos. 5,770,358, 5,789,162, 5,708,153, 6,040,193 and 5,800,992, which are hereby incorporated in their entirety for all purposes.

[00177] In addition to the use of arrays and microarrays, it is contemplated that a number of different assays could be employed to analyze miRNAs, their activities, and their effects. Such assays include, but are not limited to, nucleic acid amplification, polymerase chain reaction, quantitative PCR, RT-PCR, in situ hybridization, Northern hybridization, hybridization protection assay (HPA) (GenProbe), branched DNA (bDNA) assay (Chiron), rolling circle amplification (RCA), single molecule hybridization detection (US Genomics), Invader assay (ThirdWave Technologies), and/or Bridge Litigation Assay (Genaco).

VI. Administration of Therapeutic Compositions

[00178] The therapy provided herein may comprise administration of a combination of therapeutic agents, such as a first cancer therapy and a second cancer therapy. The therapies may be administered in any suitable manner known in the art. For example, the first and second cancer treatment may be administered sequentially (at different times) or concurrently (at the same time). In some embodiments, the first and second cancer treatments are administered in a separate composition. In some embodiments, the first and second cancer treatments are in the same composition.

[00179] Embodiments of the disclosure relate to compositions and methods comprising therapeutic compositions. The different therapies may be administered in one composition or in more than one composition, such as 2 compositions, 3 compositions, or 4 compositions. Various combinations of the agents may be employed, for example, a first cancer treatment is “A” and a second cancer treatment is “B”:

A/B/A B/A/B B/B/A A/A/B A/B/B B/A/A A/B/B/B B/A/B/B

B/B/B/A B/B/A/B A/A/B/B A/B/A/B A/B/B/A B/B/A/A

B/A/B/A B/A/A/B A/A/A/B B/A/A/A A/B/A/A A/A/B/A

[00180] The therapeutic agents of the disclosure may be administered by the same route of administration or by different routes of administration. In some embodiments, the cancer therapy is administered intravenously, intramuscularly, subcutaneously, topically, orally, transdermally, intraperitoneally, intraorbitally, by implantation, by inhalation, intrathecally, intraventricularly, or intranasally. In some embodiments, the antibiotic is administered intravenously, intramuscularly, subcutaneously, topically, orally, transdermally, intraperitoneally, intraorbitally, by implantation, by inhalation, intrathecally, intraventricularly, or intranasally. The appropriate dosage may be determined based on the type of disease to be treated, severity and course of the disease, the clinical condition of the individual, the individual's clinical history and response to the treatment, and the discretion of the attending physician.

[00181] The treatments may include various “unit doses.” A unit dose is defined as containing a predetermined quantity of the therapeutic composition. The quantity to be administered, and the particular route and formulation, is within the skill of determination of those in the clinical arts. A unit dose need not be administered as a single injection but may comprise continuous infusion over a set period of time. In some embodiments, a unit dose comprises a single administrable dose.

[00182] The quantity to be administered, both according to number of treatments and unit dose, depends on the treatment effect desired. An effective dose is understood to refer to an amount necessary to achieve a particular effect. In the practice in certain embodiments, it is contemplated that doses in the range from 10 mg/kg to 200 mg/kg can affect the protective capability of these agents. Thus, it is contemplated that doses include doses of about 0.1, 0.5, 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, and 200, 300, 400, 500, 1000 µg/kg, mg/kg, µg/day, or mg/day or any range derivable therein. Furthermore, such doses can be administered at multiple times during a day, and/or on multiple days, weeks, or months.

[00183] In certain embodiments, the effective dose of the pharmaceutical composition is one which can provide a blood level of about 1 mM to 150 mM. In another embodiment, the effective dose provides a blood level of about 4 mM to 100 mM; or about 1 mM to 100 mM; or about 1 mM to 50 mM; or about 1 mM to 40 mM; or about 1 mM to 30 mM; or about 1 mM to 20 mM; or about 1 mM to 10 mM; or about 10 mM to 150 mM; or about 10 mM to 100 mM; or about 10 mM to 50 mM; or about 25 mM to 150 mM; or about 25 mM to 100 mM; or about 25 mM to 50 mM; or about 50 mM to 150 mM; or about 50 mM to 100 mM (or any range derivable therein). In other embodiments, the dose can provide the following blood level of the agent that results from a therapeutic agent being administered to a subject: about, at least about, or at most about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 mM or any range derivable therein. In certain embodiments, the therapeutic agent that is administered to a subject is metabolized in the body to a metabolized therapeutic agent, in which case the blood levels may refer to the amount of that agent. Alternatively, to the extent the therapeutic agent is not metabolized by a subject, the blood levels discussed herein may refer to the unmetabolized therapeutic agent.

[00184] Precise amounts of the therapeutic composition also depend on the judgment of the practitioner and are peculiar to each individual. Factors affecting dose include the physical and clinical state of the patient, the route of administration, the intended goal of treatment (alleviation of symptoms versus cure) and the potency, stability and toxicity of the particular therapeutic substance or other therapies a subject may be undergoing.

[00185] It will be understood by those skilled in the art that dosage units of µg/kg or mg/kg of body weight can be converted and expressed in comparable concentration units of µg/ml or mM (blood levels), such as 4 µM to 100 µM. It is also understood that uptake is species and organ/tissue dependent. The applicable conversion factors and physiological assumptions to be made concerning uptake and concentration measurement are well-known and would permit those of skill in the art to convert one concentration measurement to another and make reasonable comparisons and conclusions regarding the doses, efficacies and results described herein.

VII. Kits

[00186] Certain aspects of the present invention also concern kits containing compositions of the invention or compositions to implement methods of the invention. In some embodiments, kits can be used to evaluate one or more miRNA molecules or biomarkers. In certain embodiments, a kit contains, contains at least or contains at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 100, 500, 1,000 or more probes, synthetic molecules or inhibitors, or any value or range and combination derivable therein. In some embodiments, there are kits for evaluating biomarker activity in a cell.

[00187] Kits may comprise components, which may be individually packaged or placed in a container, such as a tube, bottle, vial, syringe, or other suitable container means.

[00188] Individual components may also be provided in a kit in concentrated amounts; in some embodiments, a component is provided individually in the same concentration as it would be in a solution with other components. Concentrations of components may be provided as 1x, 2x, 5x, 10x, or 20x or more.

[00189] Kits for using probes, synthetic nucleic acids, nonsynthetic nucleic acids, and/or inhibitors of the disclosure for prognostic or diagnostic applications are included as part of the disclosure. Specifically contemplated are any such molecules corresponding to any biomarker identified herein.

[00190] In certain aspects, negative and/or positive control nucleic acids, probes, and inhibitors are included in some kit embodiments. The control molecules can be used to verify transfection efficiency and/or control for transfection-induced changes in cells.

[00191] It is contemplated that any method or composition described herein can be implemented with respect to any other method or composition described herein and that different embodiments may be combined. The claims originally filed are contemplated to cover claims that are multiply dependent on any filed claim or combination of filed claims.

[00192] Any embodiment of the invention involving a specific biomarker by name is contemplated also to cover embodiments involving biomarkers whose sequences are at least 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical to the mature sequence of the specified miRNA or biomarker.

[00193] Embodiments of the disclosure include kits for analysis of a pathological sample by assessing a biomarker profile for a sample comprising, in suitable container means, two or more biomarker probes, wherein the biomarker probes detect one or more of the biomarkers identified herein. The kit can further comprise reagents for labeling nucleic acids in the sample. The kit may also include labeling reagents, including at least one of amine-modified nucleotide, poly(A) polymerase, and poly(A) polymerase buffer. Labeling reagents can include an amine-reactive dye.

VIII. Examples

[00194] The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.

Example 1 - A super-enhancer associated transcription factor signature for the identification of liver metastasis from primary colorectal cancer tissues

[00195] More than 50% of patients with recurrent colorectal cancer (CRC) develop metastatic disease to the liver, which ultimately results in death in more than two thirds of the cases. Hence, early detection, together with the identification of patients at high risk for tumor recurrence to the liver, could significantly improve patient outcomes. The current paradigm asserts that metastasis reflects the mutational and transcriptional landscape present in the majority of cells constituting the primary tumor mass. In this study, the inventors undertook an effort to develop a transcription factor (TF) based gene expression signature in primary CRC tissues that can identify patients with liver metastasis. A comprehensive list of TFs was established from key relevant databases including TFdb, FANTOM and TFCAT. An in silico discovery for differentially expressed TFs was performed on multiple, publicly available datasets. The final TF signature was examined in two independent cohorts of CRC patients (N=104 and 151, respectively). The AUROC values for liver metastasis-positive cases were 0.79 and 0.86 in the two cohorts, respectively. The AUROC values improved to 0.93 in both cohorts on combining the signature with lymph node metastasis (LNM) status and the CEA levels. This signature was also associated with worse overall survival in the training (HR (95% CI) = 4.8 (2.1-11.2), p<0.001) and validation cohorts (HR (95% CI) = 8.9 (3.1-25.6), p<0.001). In silico analyses on cell line ChIP-Seq data indicated transcriptional cooperativity between the TFs in the signature and preferential binding to super-enhancers controlling metastasis. In conclusion, the inventors present a super-enhancer binding TF associated signature that can identify liver metastasis and can predict poor survival by analyzing primary tumor tissues from CRC patients.

A. INTRODUCTION

[00196] More than 70% of colorectal cancer patients develop distant metastasis, especially to the liver, in the course of the disease, which leads to mortality in two thirds of these patients. The primary reason behind the higher proportion of liver metastasis cases is the hepatic portal system, which drains from the intestine into the liver and thereby transports tumor cells to the new site for progression [1, 2].

[00197] Surgical resection is currently the primary treatment for colorectal cancer but patients with liver metastasis are not suitable for surgery alone. Chemotherapy is then the main course of treatment used to improve patient survival and make them more responsive to surgery. Thus, prior information on patient’s metastasis status can influence treatment regime. Adjuvant chemotherapy significantly reduces the recurrence rate and improves survival after surgery for colorectal cancer. However, adverse events occur in some patients who receive adjuvant chemotherapy. As liver metastasis is one of the major forms of recurrence after surgery for colorectal cancer, early identification of patients who may or may not benefit from adjuvant chemotherapy is crucial. Prior knowledge of patient’s metastasis status or identifying patients at high risk of metastasis would help decide between intensive adjuvant chemotherapy and curative surgical resection of the disease. Also, hepatic resection despite increasing survival rate is reported to be highly invasive. Hence molecular profiling of patients might help in identifying poor prognostic group and thereby avoiding surgery associated toxicities in individuals who might not have a survival benefit [3].

[00198] There is a growing consensus that metastasis in part relies on mutations and/or gene regulation events that are present in many of the cells constituting the primary tumor mass and are believed to initiate progression of the disease. Research has identified a subset of tumors as being predisposed to metastasis through molecular profiling, even when no clinical evidence of metastatic spread was apparent at the time of tumor resection [4, 5, 6, 7]. As transcription factors regulate the levels of all RNA, including protein-coding mRNA and regulatory non-coding RNA such as miRNA and lncRNA, dysregulation of transcription factor expression can be the underlying mechanism that triggers cell growth and metastasis. In their study, the inventors have developed a gene signature based on the transcription factors and their associated genes in primary colorectal cancer tissue to identify liver metastasis cases.

B. MATERIALS AND METHODS

1. Patient cohorts and sample selection

[00199] This study used 255 colorectal cancer tissue specimens, comprising 207 primary CRC without synchronous liver metastasis and 48 with liver metastasis at the time of diagnosis, from two different CRC patient cohorts. The first test cohort consisted of 104 patients who underwent surgical resection at the Mie University Hospital, Japan, between February 2001 and February 2015; the second consisted of patients who underwent surgical resection at the National Cancer Center Hospital, Japan, between January 2004 and January 2006. The tumor stage was evaluated according to the American Joint Committee on Cancer (AJCC) Tumor-Node-Metastasis (TNM) grading system, 7th edition, and clinicopathological profiles of the patients were analyzed according to the classification of colorectal cancer proposed by the Japanese Society for Cancer of the Colon and Rectum (JSCCR) Guidelines [8]. The LNM status was determined from histopathologic examination of resected LNs. The patients from the Mie University Hospital were treated as test cohort 1, while the patients from the National Cancer Center Hospital comprised test cohort 2. Written informed consent was obtained from all patients and the study was approved by the institutional review boards of all participating institutions.

2. The transcription signature discovery for in silico datasets

[00200] Diverse arrays of proteins are crucial for successful transcription by RNA polymerase in eukaryotic cells. These proteins include general transcription factors, co-factors, histones and chromatin remodeling proteins [9]. A comprehensive list of transcription factors and their associated genes was obtained from three key relevant databases (TFdb, FANTOM and TFCAT) and from one of the largest manually curated censuses of human transcription factors, covering 1391 factors, published by Vacquerizas et al [10]. For in silico training, 3 independent expression microarray CRC data sets (GSE6988, GSE22834 and GSE72718) were downloaded from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/); these comprised T2-T4 stage patients with synchronous liver metastasis and patients who did not develop recurrence or liver metastasis within 3 years of follow-up. For in silico validation, the inventors used another independent microarray CRC dataset (GSE39582), from which they specifically selected patients using the same criteria as for the in silico training cohorts.

3. RNA isolation and qPCR

[00201] RNA was extracted from fresh frozen primary tissues using AllPrep DNA/RNA/miRNA Universal (Qiagen, Hilden, Germany) as per the manufacturer’s instructions. Following RNA quantification using the Nanodrop system (ThermoFisher Scientific, Massachusetts, USA), 100 ng of RNA was used for cDNA preparation with random hexamer primers using the High capacity cDNA reverse transcription kit (ThermoFisher Scientific, Massachusetts, USA). Real time quantitative PCR was performed using the SensiFAST™ SYBR® Hi-ROX Kit (Bioline, Taunton, MA, USA) following the manufacturer’s protocol. Gene expression was normalized to the ACTB endogenous control (Supplementary Table 1).

4. Statistical analysis

[00202] An unpaired Wilcoxon’s rank sum test was carried out to compare expression levels of the genes between patients with and without liver metastasis in each of the GEO datasets, followed by Benjamini-Hochberg multiple testing correction. An adjusted p-value of less than 0.05 was considered significant. The area under the receiver operating characteristic curve (AUROC) and binary logistic regression analyses were performed using IBM SPSS version 23 (IBM, Armonk, NY, USA). Sensitivity, specificity, positive predictive values (PPV), negative predictive values (NPV), false discovery rate (FDR = 1 - PPV) and false omission rate (FOR = 1 - NPV) were calculated using the median cut-off of the risk scores in both the cohorts. The Kaplan-Meier analysis was performed for overall and disease-free survival, and the log-rank test was used for comparing survival differences between groups, using the MedCalc Statistical Software version 16.4.3 (MedCalc Software bvba, Ostend, Belgium; https://www.medcalc.org; 2016). The hazard ratio of the signature as well as of the other clinical variables was calculated with a Cox proportional hazards model. Univariate binary logistic regression analysis was performed on age, sex, T-stage, differentiation, lymphatic vessel invasion (LVI), venous invasion, and LNM status along with the proposed TF associated signature. Only the significant variables in the univariate model were used to perform the multivariate logistic regression analysis.

5. In silico analysis for transcriptional cooperativity along with enhancer and super-enhancer binding

[00203] To investigate the binding tendency of the transcription factors in colorectal tissue, the inventors downloaded public high-resolution H3K27ac ChIP-seq profiles of 21 colorectal cell lines (GSE77737). In each of these cell lines, the inventors called enhancers and super-enhancers. Enhancers were defined as the 400bp signal-depleted region within the H3K27ac enriched region situated at least 5000bp away from any transcription start site. To analyze transcription factor cooperativity, the binding sites of each transcription factor with a motif [11] were predicted by scoring each motif in each 400bp enhancer region in each cell type. The enhancers with the top 10% strongest motifs were used as a high confidence set, and the degree of overlap between predicted binding sites for each transcription factor pair was calculated using the Jaccard similarity coefficient. To estimate the background, the inventors performed 1000 permutations of shuffling the predicted binding sites between all enhancers and recalculated the Jaccard similarity coefficient for all pairs. To analyze the association between the transcription factors and super-enhancers, the inventors called super-enhancers in each of the 21 cell lines by the standard method of stitching together the 400bp enhancers if they are separated by less than 12,500 bp and defining a super-enhancer threshold based on the slope of the curve of signal within these stitched regions (ref?). Using these super-enhancers, the inventors defined each of the original 400bp enhancers as either a regular enhancer or a constituent of a super-enhancer. Again, the inventors predicted binding sites of each transcription factor by scoring each motif in each 400bp enhancer region (either regular enhancer or super-enhancer constituent). The enhancers with the top 10% strongest motifs were used as a high confidence set, and the degree of overlap between predicted binding sites for each transcription factor and super-enhancers was calculated using the Jaccard similarity coefficient. To estimate the background, the inventors performed 1000 permutations of shuffling the predicted binding sites between all enhancers and recalculated the Jaccard similarity coefficient.
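The stitching step described in this paragraph can be illustrated with a minimal Python sketch. It is not the inventors' pipeline: the function name `stitch_enhancers`, the tuple representation of an enhancer, and the toy coordinates are all assumptions made for illustration, and the sketch handles a single chromosome only. The subsequent thresholding on the slope of the signal curve is not covered.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) of one enhancer on a single chromosome

def stitch_enhancers(enhancers: List[Interval],
                     max_gap: int = 12_500) -> List[List[Interval]]:
    """Group enhancers separated by less than max_gap bp into stitched regions."""
    groups: List[List[Interval]] = []
    group_end = 0  # right-most coordinate of the current stitched region
    for start, end in sorted(enhancers):
        if groups and start - group_end < max_gap:
            # Close enough to the current region: extend it.
            groups[-1].append((start, end))
            group_end = max(group_end, end)
        else:
            # Too far away (or first enhancer): open a new region.
            groups.append([(start, end)])
            group_end = end
    return groups

# Hypothetical 400bp enhancers (coordinates invented for illustration).
toy = [(1000, 1400), (5000, 5400), (30000, 30400), (41000, 41400), (100000, 100400)]
stitched = stitch_enhancers(toy)
print(len(stitched))  # 3 stitched regions
```

Each stitched region would then be ranked by its H3K27ac signal to decide which regions qualify as super-enhancers; that ranking depends on signal data not shown here.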

6. Identification of pathways that are depleted or enriched with binding motifs in the TF signature

[00204] The inventors identified genes with enrichment or depletion of super-enhancer-associated binding of the signature transcription factors by calculating the average strength within each super-enhancer for each motif in each cell type and ranking them. All the super-enhancers were assigned to the nearest gene. For each gene, the inventors calculated the average motif rank in the associated super-enhancers, as well as the number of cell types with a super-enhancer associated with that gene. To identify enrichment or depletion of the entire transcription factor signature, the inventors calculated an empirical P-value of the average rank across the signature by performing 10,000 permutations of randomly shuffling the ranks of each motif between all genes. The inventors extracted all genes that are associated with either significant enrichment or depletion of motifs and have a super-enhancer in at least 2 cell types. Using those two gene lists, the inventors performed GO enrichment analysis. For each pathway, the inventors calculated the log2 fold change between the two lists as the log2 fold difference between the Jaccard similarity index of each pathway in the enriched and the depleted super-enhancer-associated genes, and show the 5 most significant pathways with a log2 fold change higher than 0.25 (enriched) and the top 5 with a log2 fold change less than -0.25 (depleted). The -log10 Q-value is the -log10 of the FDR-corrected P-value.
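As an illustration of how a -log10 Q-value can be obtained from raw P-values, the following Python sketch applies the Benjamini-Hochberg procedure. The text does not state which FDR method was used for the pathway analysis, so the choice of Benjamini-Hochberg here is an assumption (it is the correction named in the statistical analysis section), and the P-values are invented.

```python
import math
from typing import List

def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values (Q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values never increase as the raw P-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Hypothetical raw P-values for four pathways.
raw_p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(raw_p)
neg_log10_q = [-math.log10(x) for x in q]
print([round(x, 3) for x in q])  # [0.02, 0.04, 0.04, 0.02]
```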

C. RESULTS

1. In silico discovery of transcription factor associated signature for liver metastasis

[00205] Using the various online databases, a comprehensive list of transcription factors and associated genes was prepared. Three online sources, TFdb, FANTOM and TFCAT, along with a detailed list provided by Vacquerizas [10], were used to get an extensive list of 2813 genes (data not shown). 1604 of the 2813 transcription factors and associated genes (57%) were present in at least two of the four databases.
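The "present in at least two of the four databases" count can be sketched in Python as follows. The gene identifiers below are placeholders and do not reflect the actual content of TFdb, FANTOM, TFCAT or the census; the function name `genes_in_at_least` is likewise hypothetical.

```python
from collections import Counter
from typing import Dict, Set

def genes_in_at_least(sources: Dict[str, Set[str]], min_sources: int = 2) -> Set[str]:
    """Return genes that appear in at least min_sources of the given gene lists."""
    counts = Counter(gene for genes in sources.values() for gene in genes)
    return {gene for gene, n in counts.items() if n >= min_sources}

# Placeholder gene lists (invented; not real database content).
toy_sources = {
    "TFdb": {"GENE_A", "GENE_B", "GENE_C"},
    "FANTOM": {"GENE_A", "GENE_D"},
    "TFCAT": {"GENE_A", "GENE_B"},
    "census": {"GENE_E"},
}
all_genes = set().union(*toy_sources.values())
shared = genes_in_at_least(toy_sources)
print(len(shared), "of", len(all_genes))  # 2 of 5
```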

2. 14 gene transcription factor associated signature predicts liver metastasis in colorectal cancer patients

[00206] In silico training of the transcription factor-associated signature was performed on three GEO datasets (GSE6988, GSE22834 and GSE72718), where expression of all the 2813 genes was compared between primary colorectal tumor tissue of patients who had liver metastasis and primary tumor tissue from patients who did not develop hepatic recurrence or liver metastasis in a follow-up span of a minimum of 3 years. On selecting genes that were significantly deregulated by 1.5-fold or more in at least two of the three datasets, the inventors got a list of 315 genes. For increased stringency, these 315 genes were further examined in a larger in silico cohort which had data on 294 T2-T4 stage colorectal cancer patients, including 40 liver metastasis positive cases (GSE39582). On considering genes whose direction of dysregulation matched in all the datasets, the inventors developed a 14 gene signature that comprised 7 transcription factors (EHF, KLF7, MECP2, PURA, RARB, TCF4 and ZNF354C) and 7 chromatin-associated genes (PDLIM4, CHAF1A, TCEA2, HDAC1, SSBP2, SSBP4 and EWSR1). On plotting the risk scores developed through binary logistic regression of the 14 genes, 70% of the liver metastasis positive cases had a high score while 79.5% of patients with no liver metastasis had a low score (AUC (95% CI) = 0.85 (0.81-0.89), p<0.001).
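The gene filter described in this paragraph can be sketched in Python under simplifying assumptions. The sketch collapses the two stages (selection in the training datasets, then the direction check across all datasets) into a single pass, expresses fold change as the ratio of metastasis-positive to metastasis-negative expression, and uses invented dataset names, gene names and numbers throughout.

```python
import math
from typing import Dict, List, Tuple

# results[dataset][gene] = (fold_change, adjusted_p); fold_change is a ratio,
# so values below 1 mean down-regulation in metastasis-positive tumors.
Results = Dict[str, Dict[str, Tuple[float, float]]]

def select_genes(results: Results, min_fold: float = 1.5,
                 alpha: float = 0.05, min_datasets: int = 2) -> List[str]:
    """Keep genes deregulated in enough datasets with a consistent direction."""
    cutoff = math.log2(min_fold)
    genes = set().union(*results.values())
    selected = []
    for gene in sorted(genes):
        log_fcs, n_hits = [], 0
        for table in results.values():
            if gene not in table:
                continue
            fold, adj_p = table[gene]
            log_fc = math.log2(fold)
            log_fcs.append(log_fc)
            if adj_p < alpha and abs(log_fc) >= cutoff:
                n_hits += 1
        same_direction = all(x > 0 for x in log_fcs) or all(x < 0 for x in log_fcs)
        if n_hits >= min_datasets and same_direction:
            selected.append(gene)
    return selected

toy_results = {
    "set1": {"GENE_A": (2.0, 0.01), "GENE_B": (0.5, 0.02), "GENE_C": (1.2, 0.30)},
    "set2": {"GENE_A": (1.8, 0.03), "GENE_B": (1.7, 0.01), "GENE_C": (1.6, 0.04)},
    "set3": {"GENE_A": (1.1, 0.40), "GENE_B": (0.4, 0.01), "GENE_C": (1.3, 0.20)},
}
print(select_genes(toy_results))  # ['GENE_A']
```

In this toy input GENE_B is rejected because its direction flips between datasets, and GENE_C because it passes the thresholds in only one dataset.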

[00207] The inventors then tested the performance of the signature on independent clinical cohorts. On applying a fixed cut-off value on the risk scores, 96% of liver metastasized patients had a high risk score while 71% of non-metastasized patients had a risk score below 0.1 in test cohort 1 (AUC=0.79 (0.69-0.86), p<0.001) (FIG. 1A & B). Similarly, in test cohort 2, 80% of the metastasis cases had a high risk score and 79.4% of negative cases had a low risk score (AUC=0.86 (0.76-0.96), p<0.001) (FIG. 1C & D). The specificity was 71.1 and 74.8 percent while sensitivity was 78.6 and 90 percent in the two test cohorts, respectively, on using the median risk score as a cut-off to define high and low risk groups (Supplementary Tables 2 & 3).
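The performance measures at a median cut-off can be illustrated with a short Python sketch. The scores and labels are invented, and treating a score strictly above the median as "high risk" is an assumption, since the text does not say how ties at the cut-off were handled. The sketch also assumes each group contains at least one patient so that no denominator is zero.

```python
from statistics import median
from typing import Dict, Sequence

def classification_metrics(scores: Sequence[float],
                           labels: Sequence[int]) -> Dict[str, float]:
    """Dichotomize risk scores at their median and score against true labels."""
    cutoff = median(scores)
    tp = fp = tn = fn = 0
    for score, has_metastasis in zip(scores, labels):
        high_risk = score > cutoff  # assumption: strictly above the median
        if high_risk and has_metastasis:
            tp += 1
        elif high_risk:
            fp += 1
        elif has_metastasis:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        "npv": npv,
        "false_discovery_rate": 1 - ppv,   # FDR = 1 - PPV
        "false_omission_rate": 1 - npv,    # FOR = 1 - NPV
    }

# Hypothetical risk scores; label 1 = liver metastasis present.
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.60, 0.70, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
metrics = classification_metrics(toy_scores, toy_labels)
print(metrics["sensitivity"], metrics["specificity"])  # 0.75 0.75
```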

3. Comparative AUROC of TF signature in the clinical cohorts improved when combined with lymph node metastasis status and CEA levels

[00208] To evaluate the efficiency of their transcription factor-associated genes, the inventors compared the AUC of their signature with that of the variables that are used in clinics for detecting distant metastasis, including venous invasion, lymph node metastasis (LNM) and CEA value. The AUC of LNM and CEA levels was higher than venous invasion in both cohorts while the TF signature had an improved AUC over all the clinical variables (Table 1). The inventors observed that the signature had better AUROC values when combined with clinical variables, the highest when combined with LNM and CEA values (AUC=0.93 (0.88-0.98), p<0.001 in test cohort 1 and 0.93 (0.86-0.99), p<0.001 in test cohort 2).
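For readers unfamiliar with the AUC, the following Python sketch computes it through its rank interpretation: the probability that a randomly chosen metastasis-positive patient has a higher score than a randomly chosen negative patient, with ties counted as one half. The inventors report using SPSS; this sketch is only an illustration on invented data and does not cover confidence intervals or the combination of the signature with LNM and CEA.

```python
from typing import Sequence

def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))

# Hypothetical risk scores; label 1 = liver metastasis present.
example_scores = [0.05, 0.10, 0.20, 0.30, 0.60, 0.70, 0.80, 0.90]
example_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(auroc(example_scores, example_labels))  # 0.9375
```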

4. Risk assessment of clinicopathological variables along with TF signature for liver metastasis in CRC patients

[00209] The association of various clinicopathological factors, including the variables used in the comparative AUROC analysis, and of the TF signature with the risk of liver metastasis was calculated by univariate and multivariate logistic regression analysis. In univariate analysis, tumor size, venous invasion, lymphatic invasion, LNM and CEA levels were the clinical variables that significantly contributed to liver metastasis risk, with LNM status being the highest contributor of all (Table 2). The TF signature was an independent risk contributor [OR (95% CI) = 6.11 (2.44-15.33), p<0.01] which remained significant after adjusting for all the clinicopathological variables [OR (95% CI) = 7.17 (1.55-33.22), p=0.01] in test cohort 1 (Table 2).

[00210] In test cohort 2, along with the TF signature, only three variables, namely venous invasion, LNM status and CEA levels, were independent significant risk contributors to liver metastasis. The signature was the biggest risk contributor in univariate [OR (95% CI) = 26.73 (5.89-121.37), p < 0.01] as well as multivariate analyses after considering the other clinical parameters [OR (95% CI) = 28.15 (5.65-140.21), p < 0.01] (Table 3).
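To make the reported odds ratios easier to read, the Python sketch below computes an odds ratio and a 95% confidence interval from a 2x2 table. For a single binary predictor this corresponds to what a univariate logistic regression with a Wald interval yields; the multivariate estimates in the tables require a fitted regression model and are not reproduced here. The counts are invented for illustration.

```python
import math
from typing import Tuple

def odds_ratio_ci(a: int, b: int, c: int, d: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald-type CI from a 2x2 table.

    a: high-risk patients with metastasis    b: high-risk patients without
    c: low-risk patients with metastasis     d: low-risk patients without
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); assumes no cell count is zero.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the cohorts described here.
or_value, ci_low, ci_high = odds_ratio_ci(20, 30, 5, 45)
print(or_value)  # 6.0
```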

5. Association of the TF signature with survival in the clinical cohorts

[00211] To determine the prognostic potential of the TF signature, the inventors investigated the survival data from all patients for overall survival (OS) (stages 1-4) and the association of TF risk scores with recurrence data for disease-free survival (DFS) in stage 2 and 3 patients in both test cohorts. Patients with a high score for the TF signature had reduced OS compared with those with a low risk score in both the clinical cohorts [HR (95% CI) = 4.8 (2.1-11.2), p < 0.001 in test cohort 1; HR (95% CI) = 8.9 (3.1-25.6), p = 0.003 in validation cohort] (FIG. 2A & 2B). Interestingly, with respect to recurrence, a high TF signature score was also associated with worse DFS in both cohorts [HR (95% CI) = 3.9 (1.9-8.1), p < 0.001 in test cohort 1 and HR (95% CI) = 19.2 (7.7-47.7), p < 0.001 in test cohort 2] (FIG. 2C & 2D).

6. The transcription factors in the TF signature showed transcriptional cooperativity in binding sites bound preferentially to super-enhancers

[00212] To study the binding cooperativity between the transcription factors in the signature, the inventors downloaded the H3K27ac and control IP data for the 21 colorectal cancer cell lines from the GEO database (GSE77737) along with the publicly available transcription factor motifs for EHF, KLF7, MECP2, PURA, RARB, TCF4 and ZNF354C.

[00213] Transcriptional cooperativity was calculated as the Jaccard similarity coefficient (JSC), which quantifies the extent of overlap on a scale from 0 (no overlap) to 1 (complete overlap) for each motif pair (e.g. motif A: EHF and motif B: KLF7). For instance, for 2250 EHF motifs and 2750 KLF7 motifs with 2000 overlaps, JSC = 2000 / (2250 + 2750 - 2000) = 0.66. To control for random overlap, the inventors randomly selected as many enhancers as motif A (in the example, 2250), and separately as many as motif B (in the example, 2750). The inventors calculated the Jaccard similarity coefficient for these random groups (Jrandom) (e.g. 1250 overlaps; Jrandom = 1250 / (2250 + 2750 - 1250) = 0.33) and calculated the fold-over-background (Jtrue / Jrandom = 0.66 / 0.33 = 2). The inventors did this 1000 times to have a robust sample size. This whole procedure was repeated in all 21 cell lines, and the values in the heatmap represent the mean fold-over-background across all these cell lines on a log2 scale (FIG. 3A). The results indicate that there is co-occurrence between many of the motif pairs, as many pairs occur together more often than expected at random, and that all motifs co-occur with at least one other of the tested motifs.
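The worked example above can be reproduced with a minimal Python sketch. Several details are assumptions rather than statements from the text: enhancers are represented by integer identifiers, the total number of enhancers (10,000) is invented, and the fold-over-background is computed against the mean Jrandom over all permutations.

```python
import random
from typing import Set

def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity coefficient: size of intersection over size of union."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def fold_over_background(hits_a: Set[int], hits_b: Set[int],
                         all_enhancers: Set[int],
                         n_perm: int = 1000, seed: int = 0) -> float:
    """Observed Jaccard divided by the mean Jaccard of size-matched random sets."""
    rng = random.Random(seed)
    pool = sorted(all_enhancers)
    j_true = jaccard(hits_a, hits_b)
    j_random = []
    for _ in range(n_perm):
        # Draw as many random enhancers as each motif has high-confidence hits.
        rand_a = set(rng.sample(pool, len(hits_a)))
        rand_b = set(rng.sample(pool, len(hits_b)))
        j_random.append(jaccard(rand_a, rand_b))
    return j_true / (sum(j_random) / n_perm)

# Toy data matching the counts in the text: 2250 and 2750 hits, 2000 shared.
enhancers = set(range(10_000))      # assumed total number of enhancers
ehf_hits = set(range(0, 2250))
klf7_hits = set(range(250, 3000))
print(round(jaccard(ehf_hits, klf7_hits), 2))  # 0.67 (2000 / 3000; the text truncates to 0.66)
fold = fold_over_background(ehf_hits, klf7_hits, enhancers, n_perm=100)
```

The resulting fold value depends on the assumed size of the enhancer pool, so it will not match the illustrative factor of 2 quoted in the text.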

7. The transcription factors in the TF signature bound preferentially to super-enhancers

[00214] Finally, the inventors did an in silico analysis of preferential binding of the transcription factors to enhancers or super-enhancers using the dataset on 21 colorectal cancer cell lines. Enhancers were defined as the 400bp signal-depleted region within the H3K27ac-enriched region. This is also the region where one would expect to see, for example, MED1 occupancy or DHS signal. The inventors limited the enhancers to only those separated from the nearest transcription start site by at least 5000bp to avoid analyzing promoter regions.

[00215] Similar to the cooperativity analysis, the inventors calculated the Jaccard similarity coefficient (Jtrue) between each motif and super-enhancer constituents (for example, 3000 motifs, 5000 super-enhancer constituents, 2000 constituents with motifs; Jtrue = 2000 / (5000 + 3000 - 2000) = 0.33). To control for random overlap, the inventors randomly selected as many enhancers as there are super-enhancer constituents, and separately randomly selected as many enhancers as there are enhancers with a high confidence motif hit, recalculated the Jaccard similarity coefficient (Jrandom), and calculated the fold-over-background (Jtrue / Jrandom). Again, the inventors repeated this 1000 times for each cell line; the bars represent the mean fold-over-background on a log2 scale and the error bars the standard deviation (FIG. 3B). Green colored bars indicate a p-value less than or equal to 0.05. The inventors ranked the bars according to mean fold-over-background in each cell line. The results indicate that these 7 motifs are in fact significantly more often found in super-enhancers than expected at random in many of the cell lines and almost always trend towards enrichment rather than depletion.

8. Pathways regulated by super-enhancers with enriched binding by transcription factors in the signature

[00216] Having identified the super-enhancers where the transcription factors in the signature might bind preferentially as compared to general enhancers, the inventors then wanted to identify pathways enriched in or depleted of motifs where the transcription factors in the signature bind with respect to super-enhancers. Specifically, enriched pathways are pathways where the genes are often associated with super-enhancers with strong motifs for the inventors’ TF signature, whereas depleted pathways are pathways where the genes are associated with super-enhancers without motifs. The inventors found that the former set of genes is highly enriched for pathways associated with cell migration and positive regulation of locomotion, while the genes associated with super-enhancers that are depleted of these motifs are highly enriched for cell adhesion (Supplementary Table 4).

D. DISCUSSION

[00217] Liver metastasis is a major cause of mortality in colorectal cancer patients. Early detection of patients with metastasis or individuals who are at high risk of developing recurrence is critical for determining the course of treatment: surgery alone, surgery combined with chemotherapy before and/or after, or alternative therapies [1]. Large liver metastases (greater than about 1-2 cm in size) are detected by standard CT or MRI techniques with a high level of accuracy, but the smaller ones are generally missed [12]. As it would be ideal to detect the process of metastasis at the onset, it is important to identify even small liver metastases. Mechanistically, as the multistep process of cancer cell dissemination to distant organs begins with molecular changes in the primary site of the tumor, pathogenetic modifications within the primary tumor can contribute to its potential to spread and invade other tissues [13]. Molecular markers exist that detect cancer cells released into circulation from the metastasized site, but a comprehensive study exploring the potential of molecular changes occurring in the primary tumor site for predicting spread of cancer to distant organs has been lacking. In this study, the inventors have stringently analyzed multiple in silico datasets that have gene expression data on colorectal cancer patients with or without synchronous liver metastasis to develop a signature based on a specialized family of genes, namely the transcription factors, that detects changes in the primary tumor tissue to estimate the chances of liver metastasis along with prediction of prognosis in colorectal cancer.

[00218] In the present clinical setting, clinicians suspect occurrence of metastasis through pathophysiological features like LNM status and venous invasion, among others [15, 16]. In this regard, the transcription factor signature alone outperformed these clinical variables and, when combined with LNM status and CEA values, achieved an AUROC >0.90 in identifying liver metastasis cases in both clinical cohorts. Hence the inventors’ marker has the potential to be incorporated into the existing methods for detecting liver metastasis and to aid in early detection of progressive cases, although larger validation studies are warranted.

[00219] Metastatic colon cancer patients have a median 5-year survival rate of 38% [17]. The association of the signature with liver metastasis was further reinforced by the finding that a high risk score of the TF signature was associated with worse overall survival in all patients from both clinical cohorts. Interestingly, when the inventors looked specifically into stage 2 and 3 colorectal cancer patients for recurrence with follow-up of close to 10 years, a high risk score was associated with worse disease-free survival too. As liver metastasis is the leading cause of mortality in colorectal cancer patients and a high score of the TF signature is associated with liver metastasis, this could explain the prognostic value of the signature.

[00220] Transcription factors and their associated genes drive cellular processes because they regulate gene expression, which eventually affects protein synthesis. The inventors’ signature, comprising seven transcription factors and seven cofactors, showed high accuracy in identifying patients with liver metastasis. In their study, the inventors used H3K27ac ChIP-Seq data from colorectal cell lines as input for their in silico analyses of transcription factor activity and target enhancers. The inventors found that the transcription factors in their signature are predicted to bind preferentially to super-enhancers rather than to general enhancer sites. Furthermore, they display significant overlap in their predicted target enhancers, indicating transcriptional cooperativity.

E. TABLES

[00221] Table 1: Comparative AUROC of TF associated signature along with other clinical variables in test cohorts 1 and 2

                              Test Cohort 1                     Test Cohort 2
Variable                      AUC    95%CI       p Value        AUC    95%CI       p Value
Venous Invasion               0.70   0.59-0.81   0.002          0.68   0.56-0.81   0.007
Lymph node metastasis (LNM)   0.78   0.59-0.87   0.001          0.67   0.55-0.79   0.012
CEA value                     0.74   0.78-0.94   <0.001         0.77   0.74-0.94   <0.001
TF signature                  0.79   0.69-0.88   <0.001         0.84   0.74-0.94   <0.001
LNM + CEA                     0.87   0.79-0.94   <0.001         0.84   0.76-0.91   <0.001
TF signature + LNM            0.90   0.81-0.94   <0.001         0.88   0.81-1.00   <0.001
TF signature + CEA            0.87   0.84-0.93   <0.001         0.90   0.81-1.00   0.001
TF signature + LNM + CEA      0.93   0.88-0.98   <0.001         0.93   0.86-0.99   <0.001

Bold values indicate the combination of TF signature with two strongest clinical predictors of liver metastasis

[00222] Table 2: Multivariate logistic regression of clinicopathological factors and TF associated gene signature in Test cohort 1

                                            Univariate                      Multivariate
Characteristics                             OR      95%CI         p Value   OR      95%CI         p Value
Sex (male vs female)                        0.68    0.27-1.67     0.40
Tumor stage (T2 vs T3&T4)                   2.85    0.77-10.47    0.12
Tumor size (≤median vs >median) *           3.90    1.35-11.24    0.01      1.19    0.21-6.67     0.85
Venous Invasion (Absent vs Present)         5.67    2.17-14.82    <0.01     4.92    1.09-22.26    0.04
Lymphatic Invasion (Absent vs Present)      13.58   1.75-105.51   0.01      1.44    0.12-17.34    0.77
Lymph node metastasis (Absent vs Present)   26.85   5.95-121.24   <0.01     31.09   4.34-222.9    <0.01
CEA (≤5 vs >5)                              25.20   6.89-92.14    <0.01     24.58   4.35-138.8    <0.01

* Median = 37 mm

[00223] Table 3: Multivariate logistic regression of clinicopathological factors and TF associated gene signature in Test cohort 2.

                                            Univariate                      Multivariate
Characteristics                             OR      95%CI         p Value   OR      95%CI         p Value
Sex (male vs female)                        1.12    0.44-2.87     0.81
Tumor stage (T2 vs T3&T4)                   3.30    0.73-14.9     0.12
Tumor size (≤median vs >median) *           2.02    0.73-5.56     0.17
Venous Invasion (Absent vs Present)         2.44    1.3-4.58      0.01      1.66    0.75-3.68     0.21
Lymphatic Invasion (Absent vs Present)      1.68    0.69-4.09     0.25
Lymph node metastasis (Absent vs Present)   4.90    1.56-15.42    0.01      3.35    0.83-13.4     0.09
CEA (≤5 vs >5)                              11.36   2.54-50.86    <0.01     12.20   2.31-64.35    <0.01

[00224] Supplementary Table 4: Top five enriched and depleted pathways regulated by super-enhancers and enriched for the TF associated signature

F. REFERENCES

[00225] The following references and the publications referred to throughout the specification, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.

1. Dhir M, Sasson AR. Surgical Management of Liver Metastases From Colorectal Cancer. J Oncol Pract 2016;12:33-9.

2. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2016. CA Cancer J Clin 2016;66:7-30.

3. Zarour LR, Anand S, Billingsley KG, Bisson WH, Cercek A, Clarke MF, et al. Colorectal Cancer Liver Metastasis: Evolving Paradigms and Future Directions. Cell Mol Gastroenterol Hepatol 2017;3:163-73.

4. Ramaswamy S, Ross KN, Lander ES, Golub TR. A molecular signature of metastasis in primary solid tumors. Nat Genet 2003;33:49-54.

5. van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Bernards R, et al. Expression profiling predicts outcome in breast cancer. Breast Cancer Res 2003;5:57-8.

6. van de Vijver MJ, He YD, van 't Veer LJ, Dai H, Hart AA, Voskuil DW, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 2002;347:1999-2009.

7. van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002;415:530-6.

8. Watanabe T, Itabashi M, Shimada Y, Tanaka S, Ito Y, Ajioka Y, et al. Japanese Society for Cancer of the Colon and Rectum (JSCCR) Guidelines 2014 for treatment of colorectal cancer. Int J Clin Oncol 2015;20:207-39.

9. Lemon B, Tjian R. Orchestrated response: a symphony of transcription factors for gene control. Genes Dev 2000;14:2551-69.

10. Vaquerizas JM, Kummerfeld SK, Teichmann SA, Luscombe NM. A census of human transcription factors: function, expression and evolution. Nat Rev Genet 2009;10:252-63.

11. Madsen JGS, Rauch A, Van Hauwaert EL, Schmidt SF, Winnefeld M, Mandrup S. Integrated analysis of motif activity and gene expression changes of transcription factors. Genome Res 2018;28:243-55.

12. Misiakos EP, Karidis NP, Kouraklis G. Current treatment for colorectal liver metastases. World J Gastroenterol 2011;17:4067-75.

13. Lambert AW, Pattabiraman DR, Weinberg RA. Emerging Biological Principles of Metastasis. Cell 2017;168:670-91.

14. Liu Y, Beyer A, Aebersold R. On the Dependency of Cellular Protein Levels on mRNA Abundance. Cell 2016;165:535-50.

15. Christophi C, Nguyen L, Muralidharan V, Nikfarjam M, Banting J. Lymphatics and colorectal liver metastases: the case for sentinel node mapping. HPB (Oxford) 2014;16:124-30.

16. Lee JH, Lee SW. The Roles of Carcinoembryonic Antigen in Liver Metastasis and Therapeutic Approaches. Gastroenterol Res Pract 2017;2017:7521987.

17. Kanas GP, Taylor A, Primrose JN, Langeberg WJ, Kelsh MA, Mowat FS, et al. Survival after liver resection in metastatic colorectal cancer: review and meta-analysis of prognostic factors. Clin Epidemiol 2012;4:283-301.

18. Sengupta S, George RE. Super-Enhancer-Driven Transcriptional Dependencies in Cancer. Trends Cancer 2017;3:269-81.

19. Wang J, Liu Q, Sun J, Shyr Y. Disrupted cooperation between transcription factors across diverse cancer types. BMC Genomics 2016;17:560.

20. Lu M, Jolly MK, Onuchic J, Ben-Jacob E. Toward decoding the principles of cancer metastasis circuits. Cancer Res 2014;74:4574-87.

Example 2 - A mesenchymal associated transcriptomic signature for prognosis and adjuvant therapy prediction in stage II and III colorectal cancer

[00226] Selecting high-risk stage II and III patients for adjuvant therapy, as well as identifying predictive biomarkers for the adjuvant and palliative settings, remains a herculean task in colorectal cancer (CRC). Recent gene expression-based subtypes have shown great promise, but their clinical translation is cumbersome.

[00227] Initially, microarray data from 152 laser capture microdissected (LCM) CRC samples and 709 CRC tissue samples were analyzed to develop a mesenchymal associated transcriptomic signature (MATS). Besides analyzing the accuracy of this signature in identifying the poor-prognosis consensus molecular subtype of CRC (CMS4), the signature was trained (N=142) and validated (N=286) using RT-PCR in two independent clinical cohorts of stage II and III CRC patients to predict recurrence using Cox proportional hazards models. Furthermore, the inventors investigated the chemotherapy predictive potential of MATS in both the adjuvant and the palliative setting using either recurrence or RECIST-derived response criteria.

[00228] MATS achieved an AUC of 0.92 to 0.99 in identifying the CMS4 subtype in six independent CRC cohorts. Furthermore, MATS showed significantly higher predictive ability in identifying high-risk CRC patients than the CMS4 subtype or OncotypeDx. RT-PCR based training and validation of MATS in two independent clinical cohorts stratified patients into low- and high-risk groups with five-year relapse-free survival (RFS) rates of 87% and 54% in the training cohort (HR: 4.11 (CI: 2.72-15.43)) and 82% and 56% in the validation cohort (HR: 2.66 (CI: 1.66-3.98)), respectively. Excitingly, MATS was found to be highly accurate in predicting response to fluoropyrimidine-based adjuvant chemotherapy as well as FOLFOX and Cetuximab response in metastatic CRC patients.

[00229] MATS potentially offers clinical value in prognosis, in identifying the poor-prognosis CMS4 subtype, and in predicting response to fluoropyrimidine-based adjuvant chemotherapy and Cetuximab response in metastatic CRC patients.

A. INTRODUCTION

[00230] Clinical decision making for adjuvant chemotherapy, which includes selecting appropriate patients and the optimal treatment regimen, remains the most pressing challenge in the management of stage II and III colorectal cancer (CRC) patients (1). Although professional clinical organizations have identified several risk factors that should be considered in adjuvant therapy decision making for stage II patients, there are no solid clinicopathological risk factors for selecting stage II patients who could benefit from adjuvant therapy. For stage III patients, six months of oxaliplatin-based adjuvant chemotherapy has been the standard treatment after radical surgery. However, oxaliplatin causes sustained, severe neuropathy in many patients. Recently, the International Duration Evaluation of Adjuvant Chemotherapy (IDEA) trial, consisting of six international studies involving almost 13,000 stage III CRC patients, examined the efficacy of a shorter duration of oxaliplatin-based adjuvant chemotherapy to prevent severe neuropathy. In this trial, patients at low risk of recurrence (T1-3, N1) treated with a shorter oxaliplatin-based regimen showed non-inferior disease-free survival with much less neurotoxicity compared to the standard six-month treatment. This result showed the importance of selecting low-risk stage III patients who can be treated with less toxic adjuvant therapy. Collectively, accurate prognostic and predictive biomarkers are needed to select the optimal treatment and thereby improve survival and quality of life in stage II and III CRC patients.

[00231] Likewise, oxaliplatin combined with various schedules of the antimetabolite 5-fluorouracil (5-FU) and leucovorin (LV) is currently administered as first-line treatment for unresectable colorectal cancer (CRC). However, only a few patients truly benefit from oxaliplatin-based chemotherapy, while others undergo ineffective chemotherapy and suffer severely from detrimental side effects. Furthermore, in spite of significant efforts to identify predictive biomarkers for the various therapies used in the metastatic CRC (mCRC) setting, RAS mutational status for anti-EGFR antibodies and microsatellite instability (MSI) status for anti-PD-1 drugs are the only markers currently used in the clinic (the inventors’ JCO review cite). However, most patients develop resistance, and not all RAS wild-type patients benefit from anti-EGFR treatment; predictive biomarkers are therefore urgently needed in mCRC.

[00232] Recently, the colorectal cancer subtyping consortium (CRCSC), consisting of six large international groups, proposed the consensus molecular subtypes (CMS) classification, which divides CRC into four molecular subtypes. Among these subtypes, CMS4, characterized by increased expression of EMT genes, showed worse overall survival (OS) and relapse-free survival (RFS) than the other subtypes. Although the CMS classification is a promising system for patient stratification in future clinical trials and studies, it is currently hard to apply in clinical settings because it requires analyzing a very large number of genes. Early exploratory studies published recently showed the value of CMS4 subtyping in predicting response to chemotherapy in both the adjuvant and the palliative setting. Therefore, in this study the inventors developed a clinically applicable mesenchymal-associated transcriptomic signature (MATS) through comprehensive gene expression profiling of multiple colorectal cancer cohorts. They then evaluated whether MATS can accurately predict prognosis, identify CRC patients who could realistically benefit from fluoropyrimidine-based adjuvant chemotherapy, and predict response to FOLFOX and Cetuximab therapy in metastatic CRC patients.

B. MATERIALS AND METHODS

1. Identification and independent validation of MATS recurrence signature using in-house laser capture micro-dissected (LCM) as well as other public CRC gene expression datasets

[00233] To identify a mesenchymal associated transcriptomic signature, the inventors aimed to identify genes positively correlated with Vimentin (VIM), which is a representative mesenchymal marker in cancer cells. For this purpose, the inventors analyzed the microarray data of 152 laser capture microdissected (LCM) CRC samples, which they published previously. The microarray data are publicly available in the National Center for Biotechnology Information’s Gene Expression Omnibus (GSE71222). Subsequently, the inventors narrowed down the list of candidate genes by selecting genes upregulated in CRC using the GSE41258 dataset comprising 186 CRC and 54 normal colon mucosa samples. In addition, to construct a multi-gene-based RFS classifier in stage II and III CRC, the inventors performed least absolute shrinkage and selection operator (LASSO) Cox regression analysis on the initially selected mesenchymal-associated genes in a large public dataset, which included a total of 461 stage II and III CRC patients from the GSE39582 dataset. The inventors used R software version 3.3.1 (R Foundation for Statistical Computing, Vienna, Austria) and the “glmnet” package to perform the LASSO Cox regression analysis. Next, the inventors evaluated the MATS signature in RFS prediction using two independent publicly available datasets (GSE17536; N=114 and GSE33113; N=90), for which clinical information on recurrence status and RFS time as well as microarray-derived OncotypeDX prediction data was available. Furthermore, to evaluate the MATS signature in predicting adjuvant and palliative therapy benefit, the inventors analyzed their own in-house clinical cohorts as well as preprocessed data from GSE5851 and GSE28702. The flow chart of the study design is shown in FIG. 5A.
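The first filtering step described above (keeping genes whose expression correlates positively with VIM) can be illustrated with a minimal Python sketch. This is not the inventors’ implementation: the expression values, the gene "GENE_X", the function names, and the 0.5 correlation cutoff are all hypothetical, since the specification does not state the threshold that was used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def vim_correlated_genes(expression, reference="VIM", min_r=0.5):
    """Return genes positively correlated with the reference gene, strongest first.

    expression: dict mapping gene symbol -> list of per-sample expression values.
    min_r: hypothetical cutoff; not taken from the specification.
    """
    ref = expression[reference]
    hits = {}
    for gene, values in expression.items():
        if gene == reference:
            continue
        r = pearson(ref, values)
        if r >= min_r:
            hits[gene] = r
    return sorted(hits, key=hits.get, reverse=True)

# Toy data, made up for illustration: five samples per gene.
toy = {
    "VIM":    [1.0, 2.0, 3.0, 4.0, 5.0],
    "FN1":    [1.1, 2.1, 2.9, 4.2, 5.1],
    "COL3A1": [2.0, 2.5, 3.5, 3.9, 5.5],
    "GENE_X": [5.0, 4.0, 3.0, 2.0, 1.0],  # hypothetical anti-correlated gene
}
selected = vim_correlated_genes(toy)
```

In this toy example the anti-correlated gene is dropped and the two positively correlated genes are retained; the subsequent LASSO Cox step performed with glmnet is not reproduced here.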

2. Consensus molecular subtype (CMS) classification

[00234] To evaluate the association between the selected mesenchymal associated genes and the CMS4 subtype, the inventors analyzed six publicly available datasets comprising GSE39582 (N=566), GSE17536 (N=177), GSE33113 (N=90), TCGA microarray (N=209), TCGA RNA seq (N=323) and GSE104645 (N=193). The expression data of the publicly available datasets were obtained via GEO2R and cBioPortal. The CMS status of datasets GSE39582, GSE17536, GSE33113, and TCGA was obtained from the CRCSC, while the CMS status of the GSE104645 dataset was obtained from the associated publication.

3. Patient cohorts for in-house independent training and validation of MATS signature

[00235] To confirm the inventors’ initial exploratory results obtained with genome-wide expression data, the mesenchymal associated transcriptomic signature (MATS) was analyzed in 428 stage II and III CRC samples from two independent institutes. For the training set, the inventors assessed 142 fresh frozen samples collected at the National Cancer Center Hospital, Tokyo, Japan, between October 2004 and May 2006 (Training cohort). For the independent validation set, MATS was validated in 286 formalin-fixed paraffin-embedded (FFPE) CRC samples collected at Tokyo Medical and Dental University Hospital, Tokyo, between January 2007 and December 2011 (Validation cohort). Clinicopathological characteristics of the in-house training and validation cohorts are shown in Supplementary Table S1. Patients who received radiotherapy or chemotherapy before surgery were excluded from this study. Fluoropyrimidine-based drugs (5FU+leucovorin, capecitabine, S-1) were used for adjuvant therapy. No patients in either of the inventors’ in-house clinical cohorts were treated with an oxaliplatin-based regimen as adjuvant therapy. Written informed consent for participating in this study was obtained from all patients, and the inventors obtained the approval of the institutional review boards of all participating institutions for this study. Overall survival (OS) and RFS times were calculated from the date of surgery. The event of RFS was defined in a previous study. Tumor Node Metastasis (TNM) staging was performed according to the American Joint Committee on Cancer (AJCC) standards. The benefit from fluoropyrimidine-based adjuvant chemotherapy was investigated by comparing RFS rates of patients with and without chemotherapy. As most stage II patients in the inventors’ in-house cohorts (93% in the training cohort and 89% in the validation cohort) were not treated with adjuvant therapy, this analysis was limited to the stage III patients.

4. RNA extraction from fresh frozen and FFPE samples and Quantitative Reverse Transcription Polymerase Chain Reaction (qRT-PCR)

[00236] Total RNA extraction from fresh frozen specimens was performed using the RNeasy Mini kit (QIAGEN, Hilden, Germany) according to the manufacturer’s instructions. For FFPE specimens, total RNA and genomic DNA were extracted using the Allprep FFPE kit (QIAGEN) according to the manufacturer’s instructions. Then, cDNA was synthesized from 2 µg of total RNA using the High Capacity cDNA Reverse Transcription Kit according to the manufacturer’s recommendations (Thermo Fisher Scientific, Waltham, MA). The RT-qPCR assays were performed using the QuantStudio 6 Flex and QuantStudio 7 Real-Time PCR Systems (Applied Biosystems, Foster City, CA). The inventors used 5 ng of cDNA per well with the SensiFast Low-rox probe Master Mix (Bioline, London, UK). The following PCR cycling conditions were used: 2 min at 95°C for enzyme activation, followed by 50 cycles of 95°C for 10 s (denaturation) and 60°C for 50 s (annealing and extension). The expression level of target genes was normalized against ACTB using the 2^(-ΔCt) method. Information on all primers used in this study is shown in Supplementary Table S2.
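The normalization against ACTB described above can be sketched in a few lines of Python, assuming the usual definition ΔCt = Ct(target) − Ct(ACTB). The Ct values and the function name below are hypothetical and serve only to show the arithmetic.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression of a target gene by the 2^(-dCt) method.

    dCt = Ct(target) - Ct(reference); a lower Ct means more template,
    so a target with a higher Ct than the reference yields a value below 1.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)

# Hypothetical Ct values for one sample (not taken from the specification).
ct_values = {"ACTB": 18.0, "FN1": 21.0, "POSTN": 24.0}

# Normalize every target gene against the ACTB reference.
normalized = {
    gene: relative_expression(ct, ct_values["ACTB"])
    for gene, ct in ct_values.items()
    if gene != "ACTB"
}
```

With these made-up values, a target that crosses threshold three cycles after ACTB is reported at one eighth of the ACTB level.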

5. Tumor microsatellite instability analysis

[00237] Microsatellite instability (MSI) analysis was conducted using five mononucleotide repeat microsatellite markers (BAT-25, BAT-26, NR-21, NR-24, and NR-27) in a pentaplex PCR system. Primer sequences and MSI calling were described previously.

6. Statistical analysis

[00238] Associations between the gene classifier and various clinicopathological factors were assessed by the χ² test. The inventors selected the optimum cut-point for the expression of every gene using X-tile plots based on RFS (X-tile software version 3.6.1 (Yale University School of Medicine, New Haven, CT, USA)). Kaplan-Meier analysis and the log-rank test were used to estimate and compare the survival rates of CRC patients with MATS high and low risk. The Cox proportional hazards model was applied to identify independent prognostic factors for patient survival. The inventors performed statistical analyses using GraphPad Prism Ver. 6.0 (GraphPad Software, San Diego, CA), Medcalc version 16.1 (MedCalc Software, Ostend, Belgium) and R software version 3.3.1. All P values were 2-sided, and those less than 0.05 were considered statistically significant. The inventors investigated the prognostic or predictive accuracy of each feature and of the gene classifier using time-dependent receiver operating characteristic (ROC) analysis via the R package “survivalROC”. To analyze the FOLFOX and Cetuximab predictive ability of MATS in external public validation cohorts, the inventors used RECIST-derived response criteria and plotted ROC curves using binary logistic regression.
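As a reminder of what the Kaplan-Meier analysis mentioned above computes, the following Python sketch implements the product-limit estimator on made-up follow-up data. It is an illustration only; the inventors report using GraphPad Prism, Medcalc and R, and the function name and toy values here are assumptions.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) survival estimate.

    times: follow-up time for each patient.
    events: 1 if the event (e.g. recurrence) was observed, 0 if censored.
    Returns a list of (time, survival probability) pairs, one per event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients who share the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= n_removed
    return curve

# Toy cohort of six patients (hypothetical): times in years, 0 = censored.
curve = kaplan_meier([1, 2, 2, 3, 4, 5], [1, 1, 0, 1, 0, 1])
```

Running the estimator separately on the high-risk and low-risk groups gives the two curves that a log-rank test would then compare.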

C. RESULTS

1. Identification of candidate mesenchymal associated genes using in-house laser capture microdissected (LCM) as well as other public CRC gene expression datasets

[00239] Primarily, this study aimed to identify promising mesenchymal associated biomarkers that can predict RFS in CRC. For this purpose, the inventors initially searched for candidate genes positively correlated with VIM in the inventors’ microarray data from 152 laser capture microdissected CRC tissues. This search led to the identification of 87 candidate genes (Supplementary Table S3). Subsequently, the inventors narrowed down the candidate genes to 34 that were upregulated in CRC, based on microarray data from 186 CRC and 54 normal colon samples (Supplementary Table S4). Then the LASSO Cox regression model was applied to construct a clinically applicable multi-gene-based classifier for predicting RFS in 466 patients with stage II and III CRC. This analysis led to the identification of eight candidate genes (10 probes) predicting RFS, which the inventors termed the mesenchymal associated transcriptomic signature (MATS) genes. In this exploratory cohort, MATS low- and high-risk patients showed five-year relapse-free survival (RFS) rates of 69% and 52%, respectively (HR: 1.79 (1.32-2.44), p<0.001) (FIG. 5B).

2. MATS is superior in RFS prediction as well as robust in identifying CMS4 subtype CRC in multiple CRC datasets

[00240] To compare the RFS prediction performance of the MATS classifier with that of CMS4 and the microarray-based OncotypeDX risk score, the inventors evaluated the AUCs of the three prediction models at five years after surgery using time-dependent ROC analysis in two publicly available datasets. The MATS classifier yielded a better AUC than both CMS4 and the microarray-based OncotypeDX risk score in the two datasets, GSE17536 and GSE33113, for which all three data types were available (FIG. 5C). Next, the inventors evaluated the relationship between the MATS genes and the CMS4 subtype, which is a known mesenchymal CRC subtype with poor prognosis. Indeed, all eight genes were significantly upregulated in CMS4 subtype CRC in six large CRC cohorts (FIGS. 10-15). When the inventors applied an independent logistic regression model to the six datasets to distinguish the CMS4 subtype from the others, the receiver operating characteristic curve of MATS showed high area under the curve (AUC) values of 0.94 (GSE39582), 0.92 (GSE17536), 0.99 (GSE33113), 0.95 (TCGA microarray), 0.97 (TCGA RNA seq) and 0.92 (GSE104645), respectively (FIG. 5D). The AUC ranges of the individual genes across the datasets were 0.86-0.93 (COL1A2), 0.8-0.94 (COL3A1), 0.81-0.89 (FN1), 0.77-0.91 (POSTN), 0.76-0.97 (FSTL1), 0.48-0.92 (BCAT1), 0.76-0.92 (DKK3) and 0.62-0.89 (PRR16), with each gene highly significant across all the datasets for identification of the CMS4 subtype (Supplementary Table S5).
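For readers less familiar with the AUC values quoted above, the following Python sketch computes the area under an ROC curve for a binary outcome (for example CMS4 versus other subtypes) using the rank-sum identity. The scores and labels are invented for illustration, and the function is a generic calculation, not the time-dependent ROC method of the survivalROC package.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical signature scores; label 1 = CMS4, label 0 = other subtype.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to a score that ranks cases no better than chance, and 1.0 to perfect separation of the two groups.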

3. Prognostic ability of MATS in the in-house training cohort analyzed by qRT-PCR

[00241] To further evaluate the prognostic ability of MATS, the inventors performed qRT-PCR based training and validation in two independent in-house clinical cohorts. Since one of the eight genes (PRR16) could not be amplified by qRT-PCR, the inventors reduced MATS to the remaining seven genes. FIG. 6A shows the univariate analysis between each of the seven genes and RFS. Hazard ratios of the individual genes are in the range of 1.95 to 3.11. Subsequently, the inventors combined all seven MATS genes using Cox regression to build a prognostic classifier predicting RFS. The coefficients of each gene are listed in Supplementary Table S6. Using the coefficients derived from the Cox regression model, the inventors calculated the risk score for each patient from the expression of all seven MATS genes. Subsequently, to plot the Kaplan-Meier curves, the inventors used X-tile software to stratify patients with a risk score of 0.57 or higher as being at high risk of disease recurrence (high-risk group) and those with a risk score lower than 0.57 as being at low risk of disease recurrence (low-risk group). Time-dependent ROC analysis at five years after surgery (FIG. 6C) for RFS prediction reached an AUC of 0.67. When the inventors assessed the association of risk group and survival status, they found that patients with low risk scores generally had better RFS than those with high risk scores. The 5-year RFS was 54% for the high-risk group and 87% for the low-risk group (Hazard Ratio (HR) 4.11, 95% CI 2.72-15.43; p<0.001) (FIG. 6D).
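The risk-score calculation and stratification described above can be sketched as follows, assuming the score is the usual Cox linear predictor (the sum of coefficient times expression over the seven genes). The coefficient values and patient expression values below are placeholders: the real coefficients are those of Supplementary Table S6, which is not reproduced here. Only the seven gene names and the 0.57 cut-point come from the text.

```python
# Placeholder coefficients for illustration only; the actual values are
# listed in Supplementary Table S6 and are not reproduced here.
COEFFICIENTS = {
    "FN1": 0.10, "COL3A1": 0.10, "POSTN": 0.10, "BCAT1": 0.10,
    "COL1A2": 0.10, "DKK3": 0.10, "FSTL1": 0.10,
}
CUTOFF = 0.57  # risk-score cut-point reported for the training cohort

def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * normalized expression per gene."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())

def risk_group(score, cutoff=CUTOFF):
    """Scores at or above the cutoff are high risk, otherwise low risk."""
    return "high" if score >= cutoff else "low"

# Two hypothetical patients with made-up normalized expression values.
patient_a = {gene: 1.0 for gene in COEFFICIENTS}
patient_b = {gene: 0.5 for gene in COEFFICIENTS}
group_a = risk_group(risk_score(patient_a))
group_b = risk_group(risk_score(patient_b))
```

Because the coefficients and cutoff are fixed after training, the same two functions can be applied unchanged to a new cohort, which is how the validation cohort is described as being scored.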

4. Validation of the prognostic ability of MATS classifier in the in-house independent validation cohort

[00242] To further confirm the prognostic ability of MATS, the inventors applied the same Cox model coefficients and cutoff scores derived from the training cohort to an independent validation cohort (N=286). FIG. 6B shows the univariate analysis between each of the seven genes and RFS in the validation cohort, with HRs ranging from 1.68 to 5.44. Time-dependent ROC analysis at five years after surgery (FIG. 6E) for RFS prediction reached an AUC of 0.69. Consistent with the training cohort results, patients with lower risk scores had better survival than those with higher risk scores (5-year RFS was 56% for the high-risk group and 82% for the low-risk group; HR 2.66, 95% CI 1.66-3.98, p<0.001; FIG. 6F). In multivariable analysis with clinicopathological variables, MATS remained an independent prognostic factor for RFS in both the training (FIG. 7A and 7B) and validation cohorts (FIG. 7C and 7D). The associations between MATS risk score and clinicopathological factors in the training and validation cohorts are shown in Table 1. When patients were stratified by clinicopathological risk factors (stage II and stage III, <T4 and T4), MATS remained a clinically and statistically significant prognostic marker of RFS except for T4 tumors (FIG. 8A, 8B). In stage II, T3 patients, MATS showed significantly higher prognostic accuracy for RFS than any other clinicopathological risk factor, including mismatch repair status, in a time-dependent ROC analysis (FIG. 8C). In addition, the combination of MATS, mismatch repair status, tumor location and lymphatic invasion status in a multivariate Cox proportional hazards model resulted in an AUC of 0.79 for recurrence prediction (FIG. 8C). This combined RFS prediction model consisting of these four factors significantly separated patients with good and poor RFS, with the highest HR of 4.74 (95% CI 1.79-12.57, p=0.001; FIG. 8C).

5. MATS is a strong predictor of response to fluoropyrimidine-based adjuvant chemotherapy in stage III CRC patients as well as FOLFOX or Cetuximab based palliative chemotherapy in metastatic CRC patients

[00243] To evaluate whether MATS could predict the benefit of fluoropyrimidine-based adjuvant chemotherapy, the inventors investigated the association between fluoropyrimidine-based adjuvant chemotherapy and RFS in the inventors’ in-house training and validation cohorts. In the analysis of 88 stage III patients in the training cohort, stratified into high- and low-risk groups based on MATS, fluoropyrimidine-based adjuvant chemotherapy was associated with a higher rate of RFS (5-year survival rate 89% with chemotherapy vs. 69% with no chemotherapy, P=0.05, HR: 2.96) in the stage III subgroup of the MATS low-risk patient population (FIG. 9A). On the other hand, there was no difference between patients who did and did not receive fluoropyrimidine-based adjuvant chemotherapy in the stage III subgroup of the MATS high-risk patient population (FIG. 9B). Consistently, in the validation cohort of 125 stage III patients, fluoropyrimidine-based adjuvant chemotherapy was associated with a higher rate of RFS (5-year survival rate 82% with chemotherapy vs. 56% with no chemotherapy, P=0.04, HR: 2.88) in the MATS low-risk group, while there was no significant difference in the MATS high-risk group (FIG. 9C, D). These results indicate that the MATS low-risk population could be treated with the less toxic adjuvant therapy of fluoropyrimidine alone, avoiding the sustained painful adverse effects of Oxaliplatin, whereas the MATS high-risk population would need to be treated with a more aggressive regimen (an Oxaliplatin-based regimen with or without targeted agents) for adjuvant chemotherapy. Having noted the chemotherapy predictive power of MATS in the adjuvant setting as well as the association of MATS with the CMS4 subtype, the inventors further analyzed two independent, previously published clinical cohorts of metastatic CRC patients. In the first cohort of 83 unresectable CRC patients with available RECIST criteria, MATS achieved an AUC of 0.74 (p=0.0001, FIG. 9E) in predicting response to FOLFOX-based first-line chemotherapy. Excitingly, MATS achieved an AUC of 0.76 (p=0.001, FIG. 9F) in predicting response to Cetuximab therapy in metastatic CRC patients from the Khambata-Ford cohort (N=68 with RECIST response status). While KRAS alone showed an AUC of 0.70 (p=0.012, FIG. 9F), the combination of MATS with KRAS mutation status improved the AUC to 0.85 (p=0.0001, FIG. 9F) for predicting Cetuximab therapy response. These results further emphasize the clinical utility of MATS both in prognosis and in predicting response to chemotherapy as well as targeted therapies in colorectal cancer.

D. DISCUSSION

[00244] In this study, the inventors developed and validated a mesenchymal associated transcriptomic signature from LCM CRC samples to improve current prognosis and adjuvant treatment prediction in stage II and III CRC patients, using a comprehensive approach and multiple CRC patient cohorts. To identify EMT subtypes, previous studies analyzed gene expression in cell lines or in stroma-containing CRC tissues. However, 2D cell line models often do not represent the natural tumor environment. In addition, as mesenchymal markers are highly expressed in stromal cells, it is preferable to analyze cancer epithelial cells without stroma when selecting significant mesenchymal markers. For these reasons, the inventors applied LCM to CRC samples to separate the tumor epithelial cell population from the stroma and thereby reliably identify the epithelial-mesenchymal transition transcriptome in pure tumor epithelial cells. This will also facilitate the clinical translation of predictive markers, as it allows testing in pre-clinical 2D/3D cell culture models, where the tumor stroma is absent. In addition, the use of the LASSO Cox regression model allowed the inventors to integrate multiple mesenchymal genes into one gene panel, which improved prognostic accuracy significantly compared to the single-gene models.

[00245] In this study, the inventors successfully developed and validated MATS as an RFS prediction panel in stage II and III CRC patients. In particular, MATS could stratify recurrence risk in all subgroups except T4 tumors. Among high-risk clinicopathological factors such as number of dissected lymph nodes (<12), poorly differentiated histology, lymphatic/venous invasion, bowel obstruction and localized perforation, the NCCN guidelines give weight to T4 as the most reliable high-risk clinicopathological feature. In the NCCN guidelines for stage II CRC, patients with T4 tumors are considered for adjuvant therapy regardless of any other feature, including mismatch repair status. Thus, the risk stratification of patients with stage II, T3 tumors is crucial. In this respect, MATS has great importance in clinical settings for stratifying recurrence risk in stage II, T3 patients.

[00246] When the inventors analyzed the benefit of fluoropyrimidine-based adjuvant chemotherapy in this study, stage III patients with MATS low risk benefited significantly from fluoropyrimidine-based adjuvant chemotherapy alone, with an excellent prognosis. Conversely, those with MATS high risk did not benefit from fluoropyrimidine-based adjuvant chemotherapy alone. This result may reflect chemosensitivity based on EMT status. Therefore, MATS low-risk stage III patients might be treatable with a fluoropyrimidine-based drug alone, sparing them the potentially toxic and expensive oxaliplatin-based regimen. Excitingly, the inventors’ exploratory analysis in mCRC patients revealed that MATS is an excellent predictive marker for FOLFOX therapy in first-line treatment of unresectable CRC patients. In addition, MATS was a better as well as an independent predictor of Cetuximab response in mCRC patients.

[00247] These are very important findings of the inventors’ study, as MATS not only predicts RFS but also helps guide treatment decisions. The lack of such treatment guidance is one of the major concerns with the prognostic markers published thus far in CRC, and providing it is one of the biggest strengths of the inventors’ MATS classifier. In addition, MATS is probably one of the most robust and clinically translatable 7-gene signatures published so far that can identify CMS4 subtype patients with excellent accuracy.

[00248] In conclusion, the inventors’ findings indicate that MATS can effectively classify patients with stage II and III CRC into low- and high-risk groups, thereby adding prognostic value to the traditional clinicopathological risk factors and mismatch repair status used to assess the prognosis of these patients. Moreover, the inventors’ study showed that MATS could help to identify low-risk stage III patients who can benefit from fluoropyrimidine-based adjuvant chemotherapy alone, with a favorable prognosis even better than that of stage II patients. MATS might facilitate a reduction of the unnecessary oxaliplatin-based adjuvant therapy currently given to patients with stage III CRC. Thus, MATS potentially offers clinical value in directing personalized medicine and tailored decision making in stage II and III CRC patients. Furthermore, MATS robustly identified the poor-prognosis mesenchymal colorectal cancer subtype, in addition to accurately identifying response to FOLFOX and Cetuximab in metastatic colorectal cancer patients. Since the inventors developed an RT-PCR based ‘risk prediction model’ using the inventors’ 7-gene signature, this score can be readily applied to independent, future prospective cohorts to evaluate the potential of this new classifier for decision making in CRC patients, and thereby implement precision medicine.
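By way of non-limiting illustration only, a score of the kind produced by a Cox-regression-based gene panel can be sketched in Python as a coefficient-weighted sum of normalized expression values, followed by assignment to a risk group by a cut-off. The gene labels, coefficients, and cut-off below are hypothetical placeholders; they are not the values of Supplementary Table S6, and the sketch is not presented as the inventors’ implementation.

```python
# Hypothetical illustration only: gene labels, coefficients, and the cut-off
# are placeholders, not the MATS values reported in Supplementary Table S6.
HYPOTHETICAL_COEFFICIENTS = {
    "GENE_1": 0.30,
    "GENE_2": 0.25,
    "GENE_3": -0.10,
}
HYPOTHETICAL_CUTOFF = 0.5


def risk_score(expression, coefficients=HYPOTHETICAL_COEFFICIENTS):
    """Return the coefficient-weighted sum of expression values."""
    missing = set(coefficients) - set(expression)
    if missing:
        raise ValueError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def risk_group(score, cutoff=HYPOTHETICAL_CUTOFF):
    """Label a sample high-risk when its score exceeds the cut-off."""
    return "high" if score > cutoff else "low"


# Hypothetical normalized expression values for one sample.
sample = {"GENE_1": 2.0, "GENE_2": 1.0, "GENE_3": 1.5}
score = risk_score(sample)
print(round(score, 3), risk_group(score))
```

In practice the coefficients would come from the fitted model and the cut-off from a pre-specified rule, so that the same score can be applied unchanged to an independent cohort.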

E. TABLES

[00249] Table 1: Association between MATS risk score and clinicopathological factors

                           Training cohort            Validation cohort
                           MATS risk score            MATS risk score
Variables                  Low     High    P value    Low     High    P value
                           N=107   N=35               N=144   N=142
Gender
  Male                     58      26      0.03       73      89      0.04
  Female                   49      9                  71      53
Age
  <65                      71      24      0.80       42      54      0.11
  >65                      36      11                 102     88
Location
  Colon                    57      14      0.17       104     80      0.005
  Rectum                   50      21                 40      62
Histology
  Differentiated           99      30      0.22       135     126     0.13
  Undifferentiated         8       5                  9       16
Tumor size (median)
  <45mm                    58      13      0.07       69      43      0.002
  >45mm                    49      22                 70      94
  not available                                       5       5
T stage
  T1-3                     79      22      0.21       108     90      0.03
  T4                       28      13                 36      52
Lymphatic invasion
  Absent                   73      19      0.13       70      61      0.36
  Present                  34      16                 74      80
  not available            0       0                  0       1
Venous invasion
  Absent                   43      10      0.21       18      10      0.12
  Present                  64      25                 126     131
  not available            0       0                  0       1
Lymph node metastasis
  Absent                   40      14      0.78       95      65      <0.001
  Present                  67      21                 49      77
Preoperative CEA
  <5                       75      21      0.26       85      77      0.53
  >5                       32      14                 58      61
  not available            0       0                  1       4

[00250] Supplementary Table S1: Clinicopathological characteristics of in-house training and validation cohorts

                                   Training cohort    Validation cohort
                                   N (%)              N (%)
                                   (N=142)            (N=286)
Gender
  Male                             84 (59)            162 (57)
  Female                           58 (41)            124 (43)
Age
  Mean±SD                          59±10              68±11
Location
  Rectum                           71 (50)            102 (36)
  Left colon                       47 (33)            89 (31)
  Right colon                      24 (17)            95 (33)
Size
  Mean±SD (mm)                     49±17              53±37
Tumor grade
  Low                              132 (93)           261 (91)
  High                             10 (7)             25 (9)
T stage
  T4                               41 (29)            88 (31)
  T1-3                             101 (71)           198 (69)
Lymphovascular invasion
  Absent                           39 (27)            22 (8)
  Present                          103 (73)           263 (92)
  unknown                          0                  1 (0)
No. of lymph nodes examined
  <12                              13 (9)             50 (17)
  12 or more                       129 (91)           234 (82)
  unavailable                      0                  2 (1)
TNM stage
  II                               54 (38)            161 (56)
  III                              88 (62)            125 (44)
Adjuvant therapy
  No                               73 (51)            187 (65)
  Yes                              69 (49)            99 (35)
MSI status
  MSI-H                                               25 (9)
  MSS, MSI-L                                          251 (88)
  unavailable                                         10 (3)
Median follow up period (Month)    71                 47

[00251] Supplementary Table S2: Primer sequences

[00252] Supplementary Table S3: The list of candidate genes that have a positive correlation with VIM in the inventors’ laser capture microdissected transcriptomic data

[00253] Supplementary Table S4: The list of candidate genes that were used for the LASSO Cox regression model

[00254] Supplementary Table S5: The AUC of each candidate gene for identifying CMS4 subtype in five publicly available datasets

[00255] Supplementary Table S6: The coefficients of the seven-gene MATS classifier in the Cox regression model

F. REFERENCES

[00256] The following references and the publications referred to throughout the specification, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.

1. Dienstmann R, Salazar R, Tabernero J. Personalizing colon cancer adjuvant therapy: selecting optimal treatments for individual patients. J Clin Oncol 2015;33(16):1787-96 doi 10.1200/JCO.2014.60.0213.

2. Benson AB, 3rd, Schrag D, Somerfield MR, Cohen AM, Figueredo AT, Flynn PJ, et al. American Society of Clinical Oncology recommendations on adjuvant chemotherapy for stage II colon cancer. J Clin Oncol 2004;22(16):3408-19 doi 10.1200/JCO.2004.05.063.

3. Schmoll HJ, Van Cutsem E, Stein A, Valentini V, Glimelius B, Haustermans K, et al. ESMO Consensus Guidelines for management of patients with colon and rectal cancer. A personalized approach to clinical decision making. Ann Oncol 2012;23(10):2479-516 doi 10.1093/annonc/mds236.

4. Network. NCC. Colon Cancer (Version 1.2017). .

5. Andre T, Boni C, Navarro M, Tabernero J, Hickish T, Topham C, et al. Improved overall survival with oxaliplatin, fluorouracil, and leucovorin as adjuvant treatment in stage II or III colon cancer in the MOSAIC trial. J Clin Oncol 2009;27(19):3109-16 doi 10.1200/JCO.2008.20.6771.

6. O'Connell MJ, Lavery I, Yothers G, Paik S, Clark-Langone KM, Lopatin M, et al. Relationship between tumor gene expression and recurrence in four independent studies of patients with stage II/III colon cancer treated with surgery alone or surgery plus adjuvant fluorouracil plus leucovorin. J Clin Oncol 2010;28(25):3937-44 doi 10.1200/JCO.2010.28.9538.

7. Kopetz S, Tabernero J, Rosenberg R, Jiang ZQ, Moreno V, Bachleitner-Hofmann T, et al. Genomic classifier ColoPrint predicts recurrence in stage II colorectal cancer patients more accurately than clinical factors. Oncologist 2015;20(2):127-33 doi 10.1634/theoncologist.2014-0325.

8. Agesen TH, Sveen A, Merok MA, Lind GE, Nesbakken A, Skotheim RI, et al. ColoGuideEx: a robust gene classifier specific for stage II colorectal cancer prognosis. Gut 2012;61(11):1560-7 doi 10.1136/gutjnl-2011-301179.

9. Gao S, Tibiche C, Zou J, Zaman N, Trifiro M, O'Connor-McCourt M, et al. Identification and Construction of Combinatory Cancer Hallmark-Based Gene Signature Sets to Predict Recurrence and Chemotherapy Benefit in Stage II Colorectal Cancer. JAMA Oncol 2016;2(1):37-45 doi 10.1001/jamaoncol.2015.3413.

10. Beijers AJ, Mols F, Tjan-Heijnen VC, Faber CG, van de Poll-Franse LV, Vreugdenhil G. Peripheral neuropathy in colorectal cancer survivors: the influence of oxaliplatin administration. Results from the population-based PROFILES registry. Acta Oncol 2015;54(4):463-9 doi 10.3109/0284186X.2014.980912.

11. Douillard JY, Cunningham D, Roth AD, Navarro M, James RD, Karasek P, et al. Irinotecan combined with fluorouracil compared with fluorouracil alone as first-line treatment for metastatic colorectal cancer: a multicentre randomised trial. Lancet 2000;355(9209):1041-7.

12. Denlinger CS, Barsevick AM. The challenges of colorectal cancer survivorship. J Natl Compr Canc Netw 2009;7(8):883-93; quiz 94.

13. Clarke SJ, Karapetis CS, Gibbs P, Pavlakis N, Desai J, Michael M, et al. Overview of biomarkers in metastatic colorectal cancer: tumour, blood and patient-related factors. Crit Rev Oncol Hematol 2013;85(2):121-35 doi 10.1016/j.critrevonc.2012.06.001.

14. Overman MJ, Lonardi S, Wong KYM, Lenz HJ, Gelsomino F, Aglietta M, et al. Durable Clinical Benefit With Nivolumab Plus Ipilimumab in DNA Mismatch Repair-Deficient/Microsatellite Instability-High Metastatic Colorectal Cancer. J Clin Oncol 2018;36(8):773-9 doi 10.1200/JCO.2017.76.9901.

15. Van Emburgh BO, Arena S, Siravegna G, Lazzari L, Crisafulli G, Corti G, et al. Acquired RAS or EGFR mutations and duration of response to EGFR blockade in colorectal cancer. Nat Commun 2016;7:13665 doi 10.1038/ncomms13665.

16. Guinney J, Dienstmann R, Wang X, de Reynies A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med 2015;21(11):1350-6 doi 10.1038/nm.3967.

17. Mooi JK, Wirapati P, Asher R, Lee CK, Savas PS, Price TJ, et al. The prognostic impact of Consensus Molecular Subtypes (CMS) and its predictive effects for bevacizumab benefit in metastatic colorectal cancer: molecular analysis of the AGITG MAX clinical trial. Ann Oncol 2018 doi 10.1093/annonc/mdy410.

18. Okita A, Takahashi S, Ouchi K, Inoue M, Watanabe M, Endo M, et al. Consensus molecular subtypes classification of colorectal cancer as a predictive factor for chemotherapeutic efficacy against metastatic colorectal cancer. Oncotarget 2018;9(27):18698-711 doi 10.18632/oncotarget.24617.

19. Linnekamp JF, Hooff SRV, Prasetyanti PR, Kandimalla R, Buikhuisen JY, Fessler E, et al. Consensus molecular subtypes of colorectal cancer are recapitulated in in vitro and in vivo models. Cell Death Differ 2018;25(3):616-33 doi 10.1038/s41418-017-0011-5.

20. Sveen A, Bruun J, Eide PW, Eilertsen IA, Ramirez L, Murumagi A, et al. Colorectal Cancer Consensus Molecular Subtypes Translated to Preclinical Models Uncover Potentially Targetable Cancer Cell Dependencies. Clin Cancer Res 2018;24(4):794-806 doi 10.1158/1078-0432.CCR-17-1234.

21. Thanki K, Nicholls ME, Gajjar A, Senagore AJ, Qiu S, Szabo C, et al. Consensus Molecular Subtypes of Colorectal Cancer and their Clinical Implications. Int Biol Biomed J 2017;3(3):105-11.

22. Dienstmann R, Vermeulen L, Guinney J, Kopetz S, Tejpar S, Tabernero J. Consensus molecular subtypes and the evolution of precision medicine in colorectal cancer. Nat Rev Cancer 2017;17(4):268 doi 10.1038/nrc.2017.24.

23. Takahashi H, Ishikawa T, Ishiguro M, Okazaki S, Mogushi K, Kobayashi H, et al. Prognostic significance of Traf2- and Nck-interacting kinase (TNIK) in colorectal cancer. BMC Cancer 2015;15:794 doi 10.1186/s12885-015-1783-y.

24. Sheffer M, Bacolod MD, Zuk O, Giardina SF, Pincas H, Barany F, et al. Association of survival and disease progression with chromosomal instability: a genomic exploration of colorectal cancer. Proc Natl Acad Sci U S A 2009;106(17):7131-6 doi 10.1073/pnas.0902232106.

25. Marisa L, de Reynies A, Duval A, Selves J, Gaub MP, Vescovo L, et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med 2013;10(5):e1001453 doi 10.1371/journal.pmed.1001453.

26. Freeman TJ, Smith JJ, Chen X, Washington MK, Roland JT, Means AL, et al. Smad4-mediated signaling inhibits intestinal neoplasia by inhibiting expression of beta-catenin. Gastroenterology 2012;142(3):562-71 e2 doi 10.1053/j.gastro.2011.11.026.

27. Kemper K, Versloot M, Cameron K, Colak S, de Sousa e Melo F, de Jong JH, et al. Mutations in the Ras-Raf Axis underlie the prognostic value of CD133 in colorectal cancer. Clin Cancer Res 2012;18(11):3132-41 doi 10.1158/1078-0432.CCR-11-3066.

28. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov 2012;2(5):401-4 doi 10.1158/2159-8290.CD-12-0095.

29. Goel A, Nagasaka T, Hamelin R, Boland CR. An optimized pentaplex PCR for detecting DNA mismatch repair-deficient colorectal cancers. PLoS One 2010;5(2):e9393 doi 10.1371/journal.pone.0009393.

30. Camp RL, Dolled-Filhart M, Rimm DL. X-tile: a new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization. Clin Cancer Res 2004;10(21):7252-9 doi 10.1158/1078-0432.CCR-04-0713.

31. Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 2000;56(2):337-44.

32. Park YY, Lee SS, Lim JY, Kim SC, Kim SB, Sohn BH, et al. Comparison of prognostic genomic predictors in colorectal cancer. PLoS One 2013;8(4):e60778 doi 10.1371/journal.pone.0060778.

33. Tsuji S, Midorikawa Y, Takahashi T, Yagi K, Takayama T, Yoshida K, et al. Potential responders to FOLFOX therapy for colorectal cancer by Random Forests analysis. Br J Cancer 2012;106(1):126-32 doi 10.1038/bjc.2011.505.

34. Khambata-Ford S, Garrett CR, Meropol NJ, Basik M, Harbison CT, Wu S, et al. Expression of epiregulin and amphiregulin and K-ras mutation status predict disease control in metastatic colorectal cancer patients treated with cetuximab. J Clin Oncol 2007;25(22):3230-7 doi 10.1200/JCO.2006.10.5437.

35. Loboda A, Nebozhyn MV, Watters JW, Buser CA, Shaw PM, Huang PS, et al. EMT is the dominant program in human colon cancer. BMC Med Genomics 2011;4:9 doi 10.1186/1755-8794-4-9.

36. Tan TZ, Miow QH, Miki Y, Noda T, Mori S, Huang RY, et al. Epithelial-mesenchymal transition spectrum quantification and its efficacy in deciphering survival and drug responses of cancer patients. EMBO Mol Med 2014;6(10):1279-93 doi 10.15252/emmm.201404208.

37. Roepman P, Schlicker A, Tabernero J, Majewski I, Tian S, Moreno V, et al. Colorectal cancer intrinsic subtypes predict chemotherapy benefit, deficient mismatch repair and epithelial-to-mesenchymal transition. Int J Cancer 2014;134(3):552-62 doi 10.1002/ijc.28387.

38. Lorsch JR, Collins FS, Lippincott-Schwartz J. Cell Biology. Fixing problems with cell lines. Science 2014;346(6216):1452-3 doi 10.1126/science.1259110.

39. Saito S, Okabe H, Watanabe M, Ishimoto T, Iwatsuki M, Baba Y, et al. CD44v6 expression is related to mesenchymal phenotype and poor prognosis in patients with colorectal cancer. Oncol Rep 2013;29(4):1570-8 doi 10.3892/or.2013.2273.

40. Nieto MA, Huang RY, Jackson RA, Thiery JP. EMT: 2016. Cell 2016;166(1):21-45 doi 10.1016/j.cell.2016.06.028.

41. van Gestel YR, de Hingh IH, van Herk-Sukel MP, van Erning FN, Beerepoot LV, Wijsman JH, et al. Patterns of metachronous metastases after curative treatment of colorectal cancer. Cancer Epidemiol 2014;38(4):448-54 doi 10.1016/j.canep.2014.04.004.

42. van Gestel YR, Thomassen I, Lemmens VE, Pruijt JF, van Herk-Sukel MP, Rutten HJ, et al. Metachronous peritoneal carcinomatosis after curative treatment of colorectal cancer. Eur J Surg Oncol 2014;40(8):963-9 doi 10.1016/j.ejso.2013.10.001.

43. Cao H, Xu E, Liu H, Wan L, Lai M. Epithelial-mesenchymal transition in colorectal cancer metastasis: A system review. Pathol Res Pract 2015;211(8):557-69 doi 10.1016/j.prp.2015.05.010.

Example 3 - Metastasis-associated miRNAs as a blood-based prognostic biomarker in colorectal cancer.

[00257] CRC cells continually shed subcellular components including microRNAs (miRNAs) into the bloodstream. Herein, the inventors performed comprehensive miRNA profiling to identify a translatable circulating metastasis-associated miRNA signature in CRC patients.

[00258] During the initial screening phase, the inventors analyzed miRNA expression profiles from a miRNA microarray dataset comprising 6 liver metastases, 8 surrounding normal liver tissues, and 10 normal colon mucosa samples. In the second phase, candidate miRNAs were validated in an independent FFPE clinical cohort. The final phase evaluated the non-invasive biomarker potential of these miRNAs in two independent CRC patient cohorts using matched pre- and early post-operative blood samples (N=136 and N=180, respectively).

[00259] In the initial screening phase, the inventors identified a panel of 7 miRNAs that had elevated expression in metastatic liver tissues. Among these, three metastasis-associated miRNAs (miR-141, miR-210, and miR-425*) were upregulated in blood specimens from stage IV vs. stage I-III patients in two independent clinical cohorts. High levels of miR-210 in pre-operative blood and of miR-425* in post-operative blood were associated with poor disease-free survival (DFS) in stage II and III patients in both cohorts. The combination of these two miRNAs yielded the highest AUROC value, 0.795, for DFS prediction. The corresponding 3-year DFS rates were 38.5% for the high-risk group and 90.6% for the low-risk group (HR 9.10, p<0.001).

[00260] High miR-210 and miR-425* were associated with poor DFS in stage II and III CRC patients. These two miRNAs can therefore serve as important non-invasive biomarkers for predicting recurrence and for personalized clinical management of CRC patients.

[00261] Distant metastasis is the most frequent cause of mortality in patients with colorectal cancer (CRC). About 30% of new cases of CRC have distant metastatic disease (stage IV) at the time of diagnosis, and 50-60% of patients with stage III and 25% with stage II disease develop metastatic disease after curative resection (1). For stage II and III CRC patients, managing micro-metastatic disease after curative resection is the primary purpose of adjuvant therapy (2). Therefore, identifying patients who have micro-metastasis at the time of curative resection, in order to guide decisions on adjuvant chemotherapy, remains the most pressing challenge in the management of stage II and III CRC patients.

[00262] miRNAs are small noncoding RNA molecules that function primarily to down-regulate gene expression by specifically binding to the 3′-untranslated region of mRNAs, thereby preventing their translation and promoting their degradation (7). Recent evidence has demonstrated that miRNAs function as oncogenes or tumor suppressors to modulate multiple oncogenic cellular processes, including cell proliferation, invasion, angiogenesis, and metastasis (8).

[00263] In this study, the inventors performed an unbiased, systematic, and comprehensive discovery and selected metastasis-associated miRNAs that are highly expressed in metastases, because such miRNAs could be shed into the bloodstream from CRC metastases. The inventors then investigated whether these miRNAs were upregulated in the blood of stage IV patients, who had metastatic disease, and evaluated the utility of these miRNAs in pre-operative and early post-operative blood for identifying high-risk stage II and III disease. As a result, the inventors developed a metastasis-associated miRNA combination, measured in pre- and early post-operative blood, that identifies high-risk stage II and III patients more accurately than current risk factors.

A. MATERIALS AND METHODS

1. Study design

[00264] This was a three-phase study comprising screening, tissue validation, and evaluation of the potential contribution of circulating miRNAs in CRC patients. During the initial screening phase, the inventors analyzed the GSE54088 dataset, comprising 6 liver metastases, 8 surrounding normal liver tissues, and 10 normal colon mucosa samples (9). In the second phase, candidate miRNAs that were overexpressed in metastases compared with both the surrounding noncancerous tissues and normal colon mucosa in the initial screening step were validated in an independent clinical cohort. The final phase aimed to evaluate the potential source of miRNAs in the blood. Candidate metastasis-associated miRNAs were selected by comparing blood samples from stage IV patients with those from stage I-III patients, who do not have distant metastasis, in two independent clinical cohorts. The candidate miRNAs were then evaluated as blood-based prognostic biomarkers using preoperative and postoperative blood samples in two independent clinical cohorts (FIG. 18).

2. Bioinformatics Analysis of miRNA microarray Data.

[00265] To identify metastasis-associated miRNAs, the inventors initially selected miRNAs that were upregulated in liver metastasis compared with surrounding normal liver tissues and normal colorectal mucosa. The inventors analyzed the GSE54088 dataset, comprising 6 liver metastases, 8 surrounding normal liver tissues, and 10 normal colon mucosa samples. Candidate miRNAs were selected according to the criteria of a P value less than 0.05 and a fold change of more than 1.8. The expression data of the GSE54088 dataset were obtained via GEO2R.

3. Patients and Specimen Collection

[00266] This study included examination of 685 serum and tissue specimens, including 25 formalin-fixed, paraffin-embedded (FFPE) metastatic CRC tissues (liver or lung), 22 matched normal liver or lung tissues, and 6 matched normal colorectal mucosa samples collected at Tokyo Medical and Dental University Medical Hospital, as described in online Supplementary Table S1. In addition, 136 pre- and matched post-operative plasma samples, collected from CRC patients before and at one month after primary CRC resection, were obtained from Tokyo Medical and Dental University Medical Hospital (testing cohort). Furthermore, 180 pre- and matched post-operative serum samples, collected before and at one week after primary CRC resection, were obtained from Mie University Medical Hospital (validation cohort). Careful microdissection was performed to collect and enrich tumor cells from the FFPE tissue specimens. None of the patients included in this study were treated with radiotherapy or chemotherapy before surgery. 5-fluorouracil-based adjuvant therapy (5-fluorouracil + leucovorin, capecitabine, or S-1) was used for patients with stage III disease. The inventors obtained written informed consent from all patients who participated in this study. The institutional review boards of all participating institutions approved this study. Tumor Node Metastasis (TNM) staging was performed according to American Joint Committee on Cancer (AJCC) standards. Details of the clinicopathological features of the patients involved in this study are shown in Supplementary Table S1.

4. RNA extraction

[00267] For tissue samples, total RNA extraction from FFPE was performed using the AllPrep DNA/RNA FFPE Kit (Qiagen, Valencia, California, USA). For blood samples, total RNA (including miRNAs) was extracted from 200 µL of serum or plasma with the miRNeasy Serum/Plasma Kit (Qiagen, Valencia, California, USA). Both RNA extraction kits were used according to the manufacturer’s protocol. 3.5 µL of synthetic cel-miR-39 (1.6 × 10^8 copies/µL; Qiagen, Germany) was spiked in. RNA was eluted in 30 µL of RNase-free water.

5. Quantitative Reverse Transcription Polymerase Chain Reaction (qRT-PCR)

[00268] TaqMan miRNA qRT-PCR assays were performed on the QuantStudio 6 Flex and QuantStudio 7 Real-Time PCR Systems (Applied Biosystems, Foster City, CA) using the TaqMan microRNA reverse transcription kit (Applied Biosystems) and the SensiFAST probe Lo-ROX kit (Bioline, Memphis, TN) according to the manufacturers’ recommendations. The relative expression of miRNAs was quantified by the 2^(-ΔCt) method using miR-16 as an endogenous reference control.
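For illustration only, the 2^(-ΔCt) calculation referred to above can be sketched in Python as follows, where ΔCt is the Ct of the target miRNA minus the Ct of the reference (here miR-16). The function name and the Ct values are hypothetical and are not taken from the study data.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression by the 2^(-dCt) method.

    dCt = Ct(target miRNA) - Ct(endogenous reference, e.g. miR-16).
    A lower target Ct (earlier amplification) gives a higher value.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for one sample.
ct_mir210 = 30.0
ct_mir16 = 25.0
print(relative_expression(ct_mir210, ct_mir16))  # 2^-5 = 0.03125
```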

6. Statistical analysis

[00269] Statistical differences between circulating metastasis-associated miRNAs and various clinicopathological factors were determined by the χ² test. Kaplan-Meier analysis with the log-rank test was used to estimate and compare the survival rates of CRC patients. For survival analysis, the optimum cut-points were selected by X-tile plots (10) for each miRNA as it related to disease-free survival (DFS) and overall survival (OS). The inventors used the X-tile software version 3.6.1 (Yale University School of Medicine, New Haven, CT, USA). Cox’s proportional hazards regression models were used to identify independent prognostic factors dictating patient survival. The inventors used GraphPad Prism Ver. 6.0 (GraphPad Software, San Diego, CA), MedCalc version 16.1 (MedCalc Software, Ostend, Belgium), and R software version 3.3.1 for statistical analyses. All statistical tests were two-sided, and P values of less than 0.05 were considered statistically significant. The inventors investigated the prognostic or predictive accuracy of each feature and of the circulating metastasis-associated miRNAs using time-dependent receiver operating characteristic (ROC) analysis (11). The inventors used the “survivalROC” package for the time-dependent ROC curve analysis.
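As a non-limiting sketch of the χ² test mentioned above, the following Python example computes the Pearson χ² statistic and its P value for a 2×2 table such as high/low miRNA expression against the presence or absence of a clinicopathological factor. The counts are hypothetical, and the absence of a continuity correction is an assumption, since the text does not state which variant was used.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (chi2, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one count")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows = high / low expression, columns = factor present / absent.
chi2, p = chi_square_2x2(30, 10, 20, 40)
print(f"chi2 = {chi2:.3f}, P = {p:.2g}")
```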

B. RESULTS

1. Identification of candidate metastasis associated miRNAs

[00270] The primary aim of the current work was to identify clinically important circulating miRNAs that originate from CRC metastases. For this purpose, the inventors initially searched for candidate miRNAs that were upregulated in liver metastasis compared with normal liver and normal colorectal mucosa using the GSE54088 dataset. The heat maps and clustering of differentially expressed miRNAs are shown in FIG. 16A. This search led to the identification of seven candidate miRNAs (miR-135b, miR-141, miR-182, miR-183, miR-210, miR-224, and miR-425*).
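The selection rule stated in the methods (P value less than 0.05 and fold change of more than 1.8, against both normal liver and normal colorectal mucosa) can be sketched in Python as below. This is an illustrative assumption about how such a filter might be written, not the inventors’ GEO2R workflow; the miRNA labels and values are hypothetical, and the fold change is assumed to be a linear-scale ratio.

```python
from dataclasses import dataclass

P_VALUE_CUTOFF = 0.05      # "P value less than 0.05" (from the text)
FOLD_CHANGE_CUTOFF = 1.8   # "fold change more than 1.8" (from the text)


@dataclass
class Comparison:
    """One metastasis-versus-control comparison for a single miRNA."""
    fold_change: float  # assumed linear-scale ratio, metastasis / control
    p_value: float


def is_upregulated(comparison):
    """True when the comparison passes both cut-offs."""
    return (comparison.p_value < P_VALUE_CUTOFF
            and comparison.fold_change > FOLD_CHANGE_CUTOFF)


def select_candidates(results):
    """Keep miRNAs that pass both cut-offs against every control tissue."""
    return sorted(
        name for name, comparisons in results.items()
        if all(is_upregulated(c) for c in comparisons)
    )


# Hypothetical values: (vs. normal liver, vs. normal colon mucosa).
example_results = {
    "miR-X": (Comparison(2.4, 0.001), Comparison(3.1, 0.004)),
    "miR-Y": (Comparison(2.0, 0.020), Comparison(1.2, 0.300)),
    "miR-Z": (Comparison(1.9, 0.080), Comparison(2.5, 0.010)),
}
print(select_candidates(example_results))
```

Here only the hypothetical miR-X is retained, because miR-Y fails against mucosa and miR-Z fails the P value cut-off against liver.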

2. Tissue validation of candidate metastasis associated miRNAs.

[00271] The candidate metastasis-associated miRNAs were validated by qRT-PCR in an independent clinical cohort of FFPE tissue samples. All seven miRNAs were upregulated in liver or lung metastases compared with matched normal liver or lung tissues as well as normal colorectal mucosa (FIG. 19).

3. Plasma/serum miRNAs which were upregulated in stage IV patients compared to stage I-III patients in two independent clinical cohorts.

[00272] Next, the inventors performed perioperative sequential blood collection (pre- and post-primary CRC resection) in two independent clinical cohorts to evaluate the utility of candidate miRNAs as blood-based biomarkers. In the testing cohort comprised of 136 CRC patients, four circulating miRNAs were highly expressed in both pre- and post-operative blood from CRC patients with distant metastasis compared to patients without distant metastasis (FIG. 16B, FIG. 20A-B).

[00273] Then, the four miRNAs were evaluated in the validation cohort, comprising 180 serum samples, to validate the results of the testing cohort. Three out of four miRNAs (miR-210, miR-425*, miR-141) were successfully validated as circulating metastasis-associated miRNAs (FIG. 18C, FIG. 20C-D).

4. The association with clinicopathological variables and survival analysis of metastasis-associated miRNAs

[00274] The expression levels of metastasis-associated circulating miRNAs were analyzed in the context of various clinicopathological characteristics and prognosis of the patients using matched pre- and post-operative blood samples. The detailed associations between clinicopathological variables and expression of each miRNA in the two independent clinical cohorts are shown in Supplementary Tables 2 and 3. To further determine whether circulating metastasis-associated miRNA levels can predict prognosis in patients with CRC, the inventors next performed OS analysis in patients with stage I-IV CRC and DFS analysis in patients with stage II and III CRC. As shown in Supplementary Figure 4, Kaplan-Meier analysis showed that the high pre- and post-operative miR-210 groups, the high pre- and post-operative miR-425* groups, and the high post-operative miR-141 group had significantly decreased OS rates compared with the corresponding low groups in the two clinical cohorts. In stage II and III patients, the high pre-operative miR-210 group and the high post-operative miR-425* group had significantly decreased DFS rates compared with the corresponding low groups, consistently in the two independent clinical cohorts (FIG. 17A: testing cohort, FIG. 17B: validation cohort).
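For readers unfamiliar with the Kaplan-Meier estimate used in the comparisons above, a minimal pure-Python sketch is given below. It is an illustration under stated assumptions, not the software used in the study: the follow-up times and event indicators are hypothetical, and in practice the estimator would be applied separately to the high- and low-expression groups defined by the chosen cut-point.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up durations (e.g. months)
    events : 1 if the event (recurrence or death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up data for one group of five patients.
example_curve = kaplan_meier([6, 12, 12, 20, 30], [1, 0, 1, 1, 0])
for time, prob in example_curve:
    print(time, round(prob, 3))
```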

5. Pre-operative miR-210 and post-operative miR-425* expression were independent predictors of poor DFS in stage II and III CRC patients

[00275] The inventors next performed univariate and multivariate analyses using the Cox proportional hazards model in the testing and validation cohorts. In the testing cohort, the univariate analysis revealed that T classification (T4) (hazard ratio (HR): 2.69, 95% CI: 1.25-5.80, P=0.01), high pre-operative miR-210 expression (HR: 2.70, 95% CI: 1.18-6.15, P=0.01), and high post-operative miR-425* expression (HR: 2.25, 95% CI: 1.05-4.80, P=0.03) were significantly associated with poor DFS (Supplementary Table 4). In the validation cohort, the univariate analysis revealed that T classification (T4) (HR: 5.14, 95% CI: 1.99-13.27, P<0.001), positive venous invasion (HR: 4.36, 95% CI: 1.44-13.18, P=0.009), high pre-operative miR-210 expression (HR: 4.73, 95% CI: 1.84-12.18, P=0.001), and high post-operative miR-425* expression (HR: 5.21, 95% CI: 2.02-13.44, P<0.001) were significantly associated with poor DFS, while the presence of lymph node metastasis tended to be associated with poor DFS (HR: 2.32, 95% CI: 0.90-5.99, P=0.08) (Supplementary Table 5). Multivariate analysis revealed that the expression levels of pre-operative miR-210 and post-operative miR-425* were independent factors for predicting poor DFS in both the testing cohort (HR: 2.33, 95% CI: 1.01-5.37, P=0.04 and HR: 2.48, 95% CI: 1.13-5.43, P=0.02, respectively) (Supplementary Table 4) and the validation cohort (HR: 3.03, 95% CI: 1.04-8.87, P=0.04 and HR: 5.82, 95% CI: 2.10-16.09, P<0.001, respectively) (Supplementary Table 5). Taken together, the inventors successfully validated the prognostic significance of both pre-operative miR-210 and early post-operative miR-425* expression as important prognostic biomarkers in multiple clinical cohorts of stage II and III CRC patients.

6. The utility of DFS prediction model using circulating metastasis-associated miRNAs and clinicopathological variables

[00276] To construct a DFS prediction model in stage II and III patients, the inventors first evaluated combinations of the circulating metastasis-associated miRNAs using the Cox proportional hazards model in the validation cohort. The combination of pre-operative miR-210 and post-operative miR-425* expression yielded the highest area under the curve (AUC), 0.795 (FIG. 17C). A Kaplan-Meier survival curve showed that the three-year DFS rate was 38.5% for the high-risk group and 90.6% for the low-risk group (HR: 9.10, 95% CI 2.47-33.48; p<0.001). Thereafter, the inventors constructed a new DFS prediction model combining the circulating metastasis-associated miRNAs with clinicopathological variables (T4 and positive venous invasion). The time-dependent ROC curve showed an AUC of 0.859 (FIG. 17C). A Kaplan-Meier survival curve showed that the three-year DFS rate was 37.2% for the high-risk group and 93.4% for the low-risk group (HR: 12.02, 95% CI 3.59-40.22; p<0.001; FIG. 17C). Furthermore, the inventors evaluated DFS using this model in stage II and stage III CRC patients separately. In Kaplan-Meier curve analysis, this model efficiently distinguished DFS in both stage II and stage III CRC patients (P<0.001 and P<0.001, respectively; FIG. 17D).

C. DISCUSSION
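To show how a hazard ratio and its confidence interval relate to a Cox regression coefficient, the following Python sketch assumes a Wald-type interval on the log scale, that is, HR = exp(β) and 95% CI = exp(β ± 1.96·SE). The text does not state how the intervals were computed, so this is an assumption; the function names are hypothetical. The example back-calculates the standard error from the reported univariate result for pre-operative miR-210 in the testing cohort (HR 2.70, 95% CI 1.18-6.15) and checks that the interval is reproduced to within rounding.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hazard_ratio_ci(beta, se):
    """Return (HR, lower, upper) for a Cox coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


def se_from_ci(lower, upper):
    """Back-calculate the standard error of log(HR) from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_95)


# Reported values from the text: HR 2.70, 95% CI 1.18-6.15.
se = se_from_ci(1.18, 6.15)
hr, lower, upper = hazard_ratio_ci(math.log(2.70), se)
print(f"SE = {se:.3f}, HR = {hr:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

Under this assumption the reported HR should lie close to the geometric mean of the interval bounds, which offers a quick consistency check on published values.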

[00277] In the current study, the inventors performed systemic and comprehensive profiling on the clinical significance of circulating metastasis-associated miRNAs based on the hypothesis that highly expressed miRNA in metastasis are shed into the bloodstream and promising circulating biomarker in CRC patients. The results of the current study supported the inventors’ hypothesis as their three candidate circulating miRNAs were significantly associated with poor OS in CRC patients in two independent clinical cohorts. More importantly, for the inventors’ research purpose of this study, identifying novel and clinically translatable circulating CRC biomarkers for recurrence prediction, the inventors have successfully identified two miRNAs associated with poor DFS after curative surgery in stage II and III CRC patients, which were validated in two independent clinical cohorts. In addition, this novel circulating miRNA combination achieved excellent predictive values for DFS in stage II and III CRC patients.

[00278] Risk stratification of recurrence is intensely investigated in the field of cancer treatment as it is essential in treatment decision-making for adjuvant chemotherapy. To address this problem, researchers have studied the possibility of stratifying patients with CRC according to development of gene-expression signatures using their tumor tissues 12 13. Although gene-expression signatures hold promise, thus far, they have not been able to use in clinical practice and are often not predictive of benefit from adjuvant chemotherapy. Intra- tumoral heterogeneity is a major explainable drawback of tissue based biomarker 14. Contrarily, the blood-based biomarker can assess intra-tumor heterogeneity 14. In addition, a significant utility of blood-based biomarker is that it can be measured repeatedly and sequentially. As NCCN guideline recommends adjuvant therapy should be started within 6-8 weeks after curative resection 15, early postoperative blood could serve important biomarkers for decision making of adjuvant therapy. In the current study, the invnetors revealed, for the first time, circulating metastasis-associated miRNAs stratified stage II and III patients into high- and low-risk groups effectively using pre- and early post-operative blood samples. Thus, the inventors’ findings would highlight the novel approach for identifying high-risk stage II and III CRC patients.

[00279] In conclusion, the inventors provide novel evidence that the inventors’ circulating metastasis-associated miRNAs, measured in pre- and early post-operative blood, can effectively stratify stage II and III CRC patients into high- and low-risk groups based upon DFS. These circulating miRNAs potentially offer tremendous clinical value in directing personalized treatment regimens and clinical management of patients with stage II and III CRC.
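The stratification into high- and low-risk groups based upon DFS can be illustrated with a minimal sketch. It is not the inventors' analysis pipeline: the risk scores, the cutoff of 0.5, the patient records, and the helper names `stratify` and `kaplan_meier` are all hypothetical and chosen only for illustration. The sketch dichotomizes patients by a score and computes a Kaplan-Meier estimate of disease-free survival for each group.

```python
from typing import List, Tuple


def stratify(scores: List[float], cutoff: float) -> List[str]:
    """Label each patient 'high' or 'low' risk by a (hypothetical) score cutoff."""
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate as (time, survival probability) steps.

    events[i] is 1 if recurrence/death was observed at times[i], 0 if censored.
    Only time points with at least one observed event produce a step.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events observed at time t
        leaving = 0   # all patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical records: (risk score, DFS time in months, event indicator).
patients = [
    (0.90, 6, 1), (0.80, 12, 1), (0.70, 30, 0), (0.75, 18, 1),
    (0.20, 40, 0), (0.10, 24, 1), (0.30, 50, 0), (0.25, 36, 0),
]
CUTOFF = 0.5  # assumed value, not taken from the disclosure

groups = stratify([p[0] for p in patients], CUTOFF)
curves = {}
for label in ("high", "low"):
    subset = [p for p, g in zip(patients, groups) if g == label]
    curves[label] = kaplan_meier([p[1] for p in subset], [p[2] for p in subset])
    print(label, curves[label])
```

In practice the cutoff would be derived from the data (for example with a cut-point tool such as X-tile, reference 10) and the group difference would be assessed with a log-rank test or Cox regression, neither of which is shown here.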

D. TABLES

Supplementary Table 1: Clinicopathological features of patients in this study

Variables                              Testing cohort    Validation cohort
                                       N=136             N=180
                                       N (%)             N (%)

Gender
  Male                                 65 (48)           100 (56)
  Female                               71 (52)           80 (44)

Age (Years)
  Mean±SEM                             66±1.15           68±0.77

Tumor Location
  Left sided colon                     94 (69)           104 (28)
  Rectum                               42 (31)           76 (42)

Tumor Depth
  T1                                   4 (3)
  T2                                   9 (7)
  T3                                   82 (60)           155 (86)
  T4                                   41 (30)           24 (13)
  unavailable                          0                 1 (1)

Histology (Differentiation)
  Differentiated                       121 (89)          164 (91)
  Undifferentiated                     14 (10)           15 (8)
  unavailable                          1 (1)             1 (1)

Lymphatic Invasion
  Negative                             82 (60)           42 (23)
  Positive                             54 (40)           137 (76)
  unavailable                          0                 1 (1)

Venous Invasion
  Negative                             18 (13)           99 (55)
  Positive                             118 (87)          80 (44)
  unavailable                          0                 1 (1)

Lymph Node Metastasis
  Negative                             47 (35)           99 (55)
  Positive                             89 (65)           80 (44)
  unavailable                          0                 1 (1)

Tumor Stage

Preoperative Serum CEA (ng/mL)
  < 5                                  81 (60)           72 (40)
  > 5                                  55 (40)           103 (57)
  unavailable                          0                 5 (3)

Median follow up period (Months)       53                36.2

Supplementary Table 2: Association between circulating miRNAs expression and clinicopathological factors in the testing cohort

Supplementary Table 3: Association between circulating miRNAs expression and clinicopathological factors in the validation cohort

Supplementary Table 4: Univariate and multivariate analysis of DFS in stage II and III patients of the testing cohort

Supplementary Table 5: Univariate and multivariate analysis of DFS in stage II and III patients of the validation cohort

E. REFERENCES

[00280] The following references and the publications referred to throughout the specification, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.

1. Kosmider S, Lipton L. Adjuvant therapies for colorectal cancer. World J Gastroenterol. 2007;13(28):3799-3805.

2. Pantel K, Cote RJ, Fodstad O. Detection and clinical importance of micrometastatic disease. J Natl Cancer Inst. 1999;91(13):1113-1124.

3. Shigeyasu K, Toden S, Zumwalt TJ, Okugawa Y, Goel A. Emerging Role of MicroRNAs as Liquid Biopsy Biomarkers in Gastrointestinal Cancers. Clin Cancer Res. 2017;23(10):2391-2399.

4. Toiyama Y, Okugawa Y, Fleshman J, Richard Boland C, Goel A. MicroRNAs as potential liquid biopsy biomarkers in colorectal cancer: A systematic review. Biochim Biophys Acta Rev Cancer. 2018;1870(2):274-282.

5. Li J, Liu Y, Wang C, et al. Serum miRNA expression profile as a prognostic biomarker of stage II/III colorectal adenocarcinoma. Sci Rep. 2015;5:12921.

6. Chen J, Wang W, Zhang Y, Chen Y, Hu T. Predicting distant metastasis and chemoresistance using plasma miRNAs. Med Oncol. 2014;31(1):799.

7. Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell. 2009;136(2):215-233.

8. Slaby O, Svoboda M, Michalek J, Vyzula R. MicroRNAs in colorectal cancer: translation of molecular biology into clinical application. Mol Cancer. 2009;8:102.

9. Mudduluru G, Abba M, Batliner J, et al. A Systematic Approach to Defining the microRNA Landscape in Metastasis. Cancer Res. 2015;75(15):3010-3019.

10. Camp RL, Dolled-Filhart M, Rimm DL. X-tile: a new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization. Clin Cancer Res. 2004;10(21):7252-7259.

11. Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. 2000;56(2):337-344.

12. O'Connell MJ, Lavery I, Yothers G, et al. Relationship between tumor gene expression and recurrence in four independent studies of patients with stage II/III colon cancer treated with surgery alone or surgery plus adjuvant fluorouracil plus leucovorin. J Clin Oncol. 2010;28(25):3937-3944.

13. Salazar R, Roepman P, Capella G, et al. Gene expression signature to improve prognosis prediction of stage II and III colorectal cancer. J Clin Oncol. 2011;29(1):17-24.

14. Yamada T, Matsuda A, Koizumi M, et al. Liquid Biopsy for the Management of Patients with Colorectal Cancer. Digestion. 2019;99(1):39-45.

15. National Comprehensive Cancer Network. Colon Cancer (Version 1.2017).

16. Nadal C, Maurel J, Gascon P. Is there a genetic signature for liver metastasis in colorectal cancer? World J Gastroenterol. 2007;13(44):5832-5844.

17. Hur K, Toiyama Y, Schetter AJ, et al. Identification of a metastasis-specific MicroRNA signature in human colorectal cancer. J Natl Cancer Inst. 2015;107(3).

18. Dang K, Myers KA. The role of hypoxia-induced miR-210 in cancer progression. Int J Mol Sci. 2015;16(3):6353-6372.

19. Qin Q, Furong W, Baosheng L. Multiple functions of hypoxia-regulated miR-210 in cancer. J Exp Clin Cancer Res. 2014;33:50.

20. Wang W, Qu A, Liu W, et al. Circulating miR-210 as a diagnostic and prognostic biomarker for colorectal cancer. Eur J Cancer Care (Engl). 2017;26(4).

21. Yuwen D, Ma Y, Wang D, et al. Prognostic Role of Circulating Exosomal miR-425-3p for the Response of NSCLC to Platinum-Based Chemotherapy. Cancer Epidemiol Biomarkers Prev. 2019;28(1):163-173.

22. Vaira V, Roncalli M, Carnaghi C, et al. MicroRNA-425-3p predicts response to sorafenib therapy in patients with hepatocellular carcinoma. Liver Int. 2015;35(3):1077-1086.

* * *

[00281] All of the methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims. All references and publications referred to throughout the disclosure are incorporated by reference for all purposes.