Title:
METHOD FOR ALL CANCER CATEGORY DETERMINATION BY MEANS OF METHYLATION PROFILING
Document Type and Number:
WIPO Patent Application WO/2015/181330
Kind Code:
A1
Abstract:
A method for determining whether a patient belongs to a specific cancer category is disclosed, which method comprises, in a sample from said patient, performing an analysis to determine the DNA methylation level in a group of predetermined CpG sites; said group of predetermined CpG sites comprising in the range of 7-50 CpG sites, and said group of predetermined CpG sites being associated with a specific cancer category, wherein the output of said analysis is a patient methylation profile comprising a patient-specific methylation level for each CpG site in said group of predetermined CpG sites, and wherein said specific cancer category is characterised by a cancer category methylation profile comprising a cancer category specific methylation level for each CpG site in said group of predetermined CpG sites, said method further comprising the step of comparing said patient methylation profile with said cancer category methylation profile, wherein said patient belongs to said specific cancer category if said patient methylation profile corresponds to said cancer category methylation profile.

Inventors:
SYVÄNEN ANN-CHRISTINE (SE)
NORDLUND JESSICA (SE)
BÄCKLIN CHRISTOFER (SE)
GUSTAFSSON MATS (SE)
LÖNNERHOLM GUDMAR (SE)
Application Number:
PCT/EP2015/061906
Publication Date:
December 03, 2015
Filing Date:
May 28, 2015
Assignee:
SYVÄNEN ANN-CHRISTINE (SE)
NORDLUND JESSICA (SE)
BÄCKLIN CHRISTOFER (SE)
GUSTAFSSON MATS (SE)
LÖNNERHOLM GUDMAR (SE)
International Classes:
C12Q1/68
Other References:
BUSCHE STEPHAN ET AL: "Integration of High-Resolution Methylome and Transcriptome Analyses to Dissect Epigenomic Changes in Childhood Acute Lymphoblastic Leukemia", CANCER RESEARCH, vol. 73, no. 14, July 2013 (2013-07-01), pages 4323 - 4336, XP002731850
JESSICA NORDLUND ET AL: "Genome-wide signatures of differential DNA methylation in pediatric acute lymphoblastic leukemia", GENOME BIOLOGY, BIOMED CENTRAL LTD., LONDON, GB, vol. 14, no. 9, R105, 24 September 2013 (2013-09-24), pages 1 - 15, XP021165720, ISSN: 1465-6906, DOI: 10.1186/GB-2013-14-9-R105
MILANI LILI ET AL: "DNA methylation for subtype classification and prediction of treatment outcome in patients with childhood acute lymphoblastic leukemia.", BLOOD 11 FEB 2010, vol. 115, no. 6, 11 February 2010 (2010-02-11), pages 1214 - 1225, XP002731851, ISSN: 1528-0020
GENG HUIMIN ET AL: "Integrative Epigenomic Analysis Identifies Biomarkers and Therapeutic Targets in Adult B-Acute Lymphoblastic Leukemia", CANCER DISCOVERY, vol. 2, no. 11, November 2012 (2012-11-01), pages 1004 - 1023, XP002731852
FIGUEROA MARIA E ET AL: "Integrated genetic and epigenetic analysis of childhood acute lymphoblastic leukemia", JOURNAL OF CLINICAL INVESTIGATION, vol. 123, no. 7, July 2013 (2013-07-01), pages 3099 - 3111, XP002731853
D. J. P. M. STUMPEL ET AL: "Specific promoter methylation identifies different subgroups of MLL-rearranged infant acute lymphoblastic leukemia, influences clinical outcome, and provides therapeutic options", BLOOD, vol. 114, no. 27, 24 December 2009 (2009-12-24), pages 5490 - 5498, XP055031056, ISSN: 0006-4971, DOI: 10.1182/blood-2009-06-227660
BURKE MICHAEL J ET AL: "Epigenetic modifications in pediatric acute lymphoblastic leukemia.", FRONTIERS IN PEDIATRICS 2014, vol. 2, 42, 14 May 2014 (2014-05-14), pages 1 - 7, XP002731854, ISSN: 2296-2360
Attorney, Agent or Firm:
AWAPATENT AB (Box, 104 30 Stockholm, SE)
Claims:
CLAIMS

1. A method for determining whether a patient belongs to a specific cancer category, wherein said specific cancer category is a cancer type or a cancer subtype, said cancer type being leukemia, in particular pediatric acute lymphoblastic leukemia (ALL), and said specific cancer subtype(s) being T-ALL, HeH, t(12;21), 11q23/MLL, t(1;19), dic(9;20), t(9;22), and/or iAMP21, said method comprising: in a sample from said patient, performing an analysis to determine the DNA methylation level in a group of predetermined CpG sites; said group of predetermined CpG sites being associated with one of said specific cancer categories, wherein the output of said analysis is a patient methylation profile comprising a patient-specific methylation level for each CpG site in said group of predetermined CpG sites, and wherein said specific cancer category is characterised by a cancer category methylation profile comprising a cancer category specific methylation level for each CpG site in said group of predetermined CpG sites, said method further comprising the step of comparing said patient methylation profile with said cancer category methylation profile, wherein said patient belongs to said specific cancer category if said patient methylation profile corresponds to said cancer category methylation profile, and wherein the number of CpG sites in each group of predetermined CpG sites associated with each specific cancer category is as follows:

• ALL: 13-17 CpG sites, preferably 17 CpG sites;

• T-ALL: 7-14 CpG sites, such as 11-14 CpG sites, preferably 14 CpG sites;

• HeH: 23-34 CpG sites, such as 27-34 CpG sites, preferably 34 CpG sites;

• t(12;21): 28-42 CpG sites, such as 33-42 CpG sites, preferably 42 CpG sites;

• 11q23/MLL: 16-28 CpG sites, such as 22-28 CpG sites, preferably 28 CpG sites;

• t(1;19): 9-21 CpG sites, such as 16-21 CpG sites, preferably 21 CpG sites;

• dic(9;20): 24-37 CpG sites, such as 29-37 CpG sites, preferably 37 CpG sites;

• t(9;22): 17-23 CpG sites, such as 18-23 CpG sites, preferably 23 CpG sites;

• iAMP21: 11-16 CpG sites, such as 12-16 CpG sites, preferably 16 CpG sites;

and wherein the CpG sites in each group of predetermined CpG sites associated with each specific cancer category are selected from Table A, where each CpG site is expressed as a range of nucleotide coordinates, referring to the UCSC Feb. 2009 (GRCh37/hg19) assembly.

2. A method according to claim 1, wherein the CpG sites in each group of predetermined CpG sites associated with each specific cancer category are selected from Table B, where each CpG site is expressed as a range of nucleotide coordinates, referring to the UCSC Feb. 2009 (GRCh37/hg19) assembly.

3. A method according to claim 1, wherein the CpG sites in each group of predetermined CpG sites associated with each specific cancer category are selected from Table C, where each CpG site is expressed as a specific nucleotide coordinate, referring to the UCSC Feb. 2009 (GRCh37/hg19) assembly.

4. A method according to any one of the previous claims, wherein said patient methylation profile corresponds to said cancer category methylation profile if the mean absolute difference between the patient-specific methylation levels and the cancer category specific methylation levels is at most 25%.

5. A method according to any one of the previous claims, wherein said DNA methylation level is determined by establishing the beta value for each CpG site, preferably using a bisulfite reaction based method.

6. A method according to any one of the previous claims, wherein said cancer category methylation profile comprises a cancer category specific mean beta value number between 0 and 1 for each CpG site in said group of predetermined CpG sites.

7. A method according to any one of the previous claims, wherein said patient methylation profile comprises a patient-specific beta value number between 0 and 1 for each CpG site in said group of predetermined CpG sites.

8. A method according to any one of the previous claims, wherein said analysis comprises determining the DNA methylation level in at least two groups of predetermined CpG sites.

9. A method according to any one of the previous claims, wherein the cancer category specific methylation level for each CpG site is as defined in Tables A, B, or C, where the cancer category specific methylation level for each CpG site is expressed as a range of mean beta values (Tables A and B), or as a specific mean beta value (Table C).

10. Kit for determining whether a patient belongs to a specific cancer category, said kit comprising means for performing the steps of the method according to any one of the claims 1-9.

11. Analysis chip comprising means for analysing the methylation level of one or more of the CpG sites defined in Table A, B, or C, where each CpG site is expressed as a range of nucleotide coordinates (Tables A and B), or as a specific nucleotide coordinate (Table C), referring to the UCSC Feb. 2009 (GRCh37/hg19) assembly.

12. Computer program product including instructions that, when executed, cause one or more processors to perform the steps of the method according to any one of the claims 1-9.

Description:
METHOD FOR ALL CANCER CATEGORY DETERMINATION BY MEANS OF METHYLATION PROFILING

Technical field

The present invention relates to a method for determining whether a patient belongs to a specific cancer category, in particular pediatric acute lymphoblastic leukemia (ALL), or a subtype thereof.

Technical background

Pediatric acute lymphoblastic leukemia (ALL) is a clinically, genetically, and epigenetically heterogeneous disease. The genetic subtypes of ALL are characterized by large-scale chromosomal aberrations, such as aneuploidies and translocations. Karyotyping, fluorescent in situ hybridization (FISH), reverse transcriptase PCR (RT-PCR), and array-based methods are routinely used to detect cytogenetic subtypes which are recurrent in patients with ALL.

The Nordic Society of Pediatric Hematology and Oncology (NOPHO) ALL-1992, ALL-2000 and ALL-2008 protocols all contain three levels of treatment intensity. Patients diagnosed between 1996 and 2008 have been treated according to the NOPHO ALL 1992 and 2000 protocols, which were highly similar. Although the backbone of the protocol was unchanged for the lower risk-groups, the current Nordic treatment protocol for ALL, NOPHO ALL 2008, introduced altered stratification of patients into treatment groups based on detection of minimal residual disease (MRD) by PCR, and intensified high-risk therapy as well as intensified treatment with asparaginase, compared to previous protocols.

The allocation of patients to the different protocols is based, inter alia, on the result of the cytogenetic subtype analysis. Precise patient allocation is crucial for achieving the best possible treatment results.

The accuracy of detecting chromosomal abnormalities by karyotyping, FISH, and PCR is generally high. However, these methods do not allow detection of all the aberrations that may occur in ALL cells. Moreover, 15% of ALL patients harbor complex, non-recurrent genomic aberrations. These shortcomings of conventional cytogenetic analyses obstruct a correct allocation of the patients into the right treatment protocols, and as a consequence, the patients may not receive the optimal treatment.

Methylation of cytosine (5mC) residues in CpG dinucleotides is an epigenetic modification that plays a pivotal role in the establishment of cellular identity by influencing gene expression in a tissue-specific manner. There are approximately 28 million CpG sites in the human genome that are targets for DNA methylation. The pathogenesis and phenotypic characteristics of leukemic cells are partially explained by alterations in DNA methylation: pediatric ALL cells with different rearrangements display subtype-specific gene expression and DNA methylation patterns.

However, the genome-wide DNA methylation patterns have not yet been comprehensively described for all subtypes of ALL, and thus the relationship between DNA methylation and cytogenetic background is poorly understood.

Summary of the invention

In the research work leading to the present invention, it was surprisingly found that DNA methylation profiling may be applied for predicting, with remarkably high accuracy, cytogenetic abnormalities indicative of various cancer categories, in particular ALL and various cytogenetic subtypes thereof. It has been demonstrated that measuring DNA methylation at only 232 CpG sites makes it possible to accurately detect ALL and 8 different ALL subtypes in primary ALL samples.

It has been shown that nearly 50% of patients that could not be categorized by conventional methods displayed DNA methylation patterns that are similar to those in recurrent subtypes of ALL. By verification analyses, it could be established that at least 20% of the newly categorized patients harbor a canonical aberration that was not detected by conventional techniques. Methylation-based classification in accordance with the present invention may therefore be a useful tool for samples missing comprehensive cytogenetic analyses.

In contrast to traditional methods used for diagnostics, which require >1 μg of RNA or intact dividing cells, analysis of DNA methylation requires only 250 ng of input DNA and can therefore be performed on samples with little material. DNA methylation classification may also be useful for newly diagnosed ALL cases, where traditional cytogenetic assays yield inconclusive results or where lack of material precludes RNA-based analyses.

The present invention thus provides for a significant improvement in specificity and sensitivity over conventional cytogenetic analyses. This in turn leads to a better possibility to correctly allocate patients into the right treatment protocols, and as a consequence, more patients will receive the optimal treatment.

In its broadest sense, the present invention relates to a method for determining whether a patient belongs to a specific cancer category, said method comprising: in a sample from said patient, performing an analysis to determine the DNA methylation level in a group of predetermined CpG sites; said group of predetermined CpG sites comprising in the range of 7-50 CpG sites, and said group of predetermined CpG sites being associated with a specific cancer category, wherein the output of said analysis is a patient methylation profile comprising a patient-specific methylation level for each CpG site in said group of predetermined CpG sites, and wherein said specific cancer category is characterised by a cancer category methylation profile comprising a cancer category specific methylation level for each CpG site in said group of predetermined CpG sites, said method further comprising the step of comparing said patient methylation profile with said cancer category methylation profile, wherein said patient belongs to said specific cancer category if said patient methylation profile corresponds to said cancer category methylation profile.

More particularly, the present invention relates to a method for determining whether a patient belongs to a specific cancer category, wherein said specific cancer category is a cancer type or a cancer subtype, said cancer type being leukemia, in particular pediatric acute lymphoblastic leukemia (ALL), and said specific cancer subtype(s) being T-ALL, HeH, t(12;21), 11q23/MLL, t(1;19), dic(9;20), t(9;22), and/or iAMP21, said method comprising: in a sample from said patient, performing an analysis to determine the DNA methylation level in a group of predetermined CpG sites; said group of predetermined CpG sites being associated with one of said specific cancer categories, wherein the output of said analysis is a patient methylation profile comprising a patient-specific methylation level for each CpG site in said group of predetermined CpG sites, and wherein said specific cancer category is characterised by a cancer category methylation profile comprising a cancer category specific methylation level for each CpG site in said group of predetermined CpG sites, said method further comprising the step of comparing said patient methylation profile with said cancer category methylation profile, wherein said patient belongs to said specific cancer category if said patient methylation profile corresponds to said cancer category methylation profile, and wherein the number of CpG sites in each group of predetermined CpG sites associated with each specific cancer category is as follows:

• ALL: 13-17 CpG sites, preferably 17 CpG sites;

• T-ALL: 7-14 CpG sites, such as 11-14 CpG sites, preferably 14 CpG sites;

• HeH: 23-34 CpG sites, such as 27-34 CpG sites, preferably 34 CpG sites;

• t(12;21): 28-42 CpG sites, such as 33-42 CpG sites, preferably 42 CpG sites;

• 11q23/MLL: 16-28 CpG sites, such as 22-28 CpG sites, preferably 28 CpG sites;

• t(1;19): 9-21 CpG sites, such as 16-21 CpG sites, preferably 21 CpG sites;

• dic(9;20): 24-37 CpG sites, such as 29-37 CpG sites, preferably 37 CpG sites;

• t(9;22): 17-23 CpG sites, such as 18-23 CpG sites, preferably 23 CpG sites;

• iAMP21: 11-16 CpG sites, such as 12-16 CpG sites, preferably 16 CpG sites;

and wherein the CpG sites in each group of predetermined CpG sites associated with each specific cancer category are selected from Table A, where each CpG site is expressed as a range of nucleotide coordinates, referring to the UCSC Feb. 2009 (GRCh37/hg19) assembly.

It is unexpected and advantageous that a cancer category can be identified by determining the DNA methylation level in as few as 7-50 CpG sites. This in turn provides for cheap, quick and easy-to-handle analysis setups.

In accordance with an embodiment of the invention, the patient methylation profile corresponds to said cancer category methylation profile if the mean absolute difference between the patient-specific methylation levels and the cancer category specific methylation levels is at most 25%. The DNA methylation level is suitably determined by establishing the beta value for each CpG site, preferably using a bisulfite reaction based method.

More particularly, the DNA methylation level for each CpG site is preferably determined in bisulfite treated DNA, where unmethylated C-nucleotides have been converted to T-nucleotides, while methylated C-nucleotides are protected from the conversion. The proportion of methylated and unmethylated C-nucleotides, called beta value (a value between 0 and 1), is suitably determined using a genotyping method for single nucleotide variants (SNVs).
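As an illustration of the beta value described above, the following minimal Python sketch assumes that the beta value is the methylated fraction, i.e. the methylated signal divided by the sum of the methylated and unmethylated signals. The function name `beta_value` and the example signal counts are hypothetical; actual measurement platforms may apply additional corrections that are not modelled here.

```python
def beta_value(methylated: float, unmethylated: float) -> float:
    """Return the methylated fraction M / (M + U), a number between 0 and 1.

    Assumption: 'methylated' counts C-nucleotides that survived bisulfite
    conversion and 'unmethylated' counts those read as T after conversion.
    """
    total = methylated + unmethylated
    if methylated < 0 or unmethylated < 0 or total == 0:
        raise ValueError("signals must be non-negative and not both zero")
    return methylated / total


# Hypothetical read-outs for one CpG site: 75 methylated, 25 unmethylated.
example_beta = beta_value(75, 25)
print(example_beta)
```

A fully unmethylated site gives 0 and a fully methylated site gives 1, which matches the 0 to 1 range stated in the text.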

This analysis method for determining the DNA methylation level only requires a small amount of input DNA, and is a quick and simple method as compared to conventional cytogenetic and molecular biological analysis methods, e.g. PCR.

The cancer category methylation profile may comprise a cancer category specific mean beta value number between 0 and 1 for each CpG site in said group of predetermined CpG sites, and the patient methylation profile may comprise a patient-specific beta value number between 0 and 1 for each CpG site in said group of predetermined CpG sites.
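The correspondence criterion described above (a mean absolute difference of at most 25% between the patient-specific and the cancer category specific methylation levels) can be sketched in Python as follows. The sketch assumes that "25%" means 0.25 on the 0 to 1 beta value scale. The category values are midpoints of the first three T-ALL ranges of Table A, used purely for illustration; the patient values and the function names are hypothetical.

```python
from typing import Sequence


def mean_absolute_difference(patient: Sequence[float],
                             category: Sequence[float]) -> float:
    """Mean of |patient beta - category beta| over the CpG sites of one group."""
    if len(patient) != len(category) or len(patient) == 0:
        raise ValueError("profiles must be non-empty and cover the same CpG sites")
    return sum(abs(p - c) for p, c in zip(patient, category)) / len(patient)


def corresponds(patient: Sequence[float],
                category: Sequence[float],
                threshold: float = 0.25) -> bool:
    """True if the profiles differ by at most `threshold` on average.

    Assumption: the 25% limit is interpreted as 0.25 beta value units.
    """
    return mean_absolute_difference(patient, category) <= threshold


# Midpoints of the first three T-ALL ranges in Table A (illustrative only).
category_profile = [0.87, 0.87, 0.93]
# Hypothetical patient beta values at the same three sites.
patient_profile = [0.80, 0.90, 0.95]

print(corresponds(patient_profile, category_profile))
```

In practice the two profiles would cover all CpG sites of the group, in the same order.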

The analysis may comprise the determination of the DNA methylation level in at least two groups of predetermined CpG sites; each group of predetermined CpG sites comprising in the range of 7-50 CpG sites; and each group of predetermined CpG sites being associated with a specific cancer category. Thereby, several cancer categories can be determined simultaneously.
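Testing several groups of CpG sites at once, as described above, could look like the following Python sketch. It assumes each cancer category is stored as a mapping from a site identifier to its category specific mean beta value and that the 25% limit corresponds to 0.25 beta value units. The site identifiers (`site_a` etc.), the beta values and the function name `matching_categories` are hypothetical.

```python
from typing import Dict, List


def matching_categories(patient_betas: Dict[str, float],
                        category_profiles: Dict[str, Dict[str, float]],
                        threshold: float = 0.25) -> List[str]:
    """Return every category whose group of sites matches the patient profile."""
    matches = []
    for name, profile in category_profiles.items():
        # Compare only the CpG sites that belong to this category's group.
        diffs = [abs(patient_betas[site] - beta) for site, beta in profile.items()]
        if sum(diffs) / len(diffs) <= threshold:
            matches.append(name)
    return matches


# Hypothetical two-site groups; real groups contain 7-50 sites each.
profiles = {
    "T-ALL": {"site_a": 0.87, "site_b": 0.93},
    "HeH": {"site_c": 0.24, "site_d": 0.15},
}
patient = {"site_a": 0.85, "site_b": 0.90, "site_c": 0.90, "site_d": 0.80}

print(matching_categories(patient, profiles))
```

Because each category is evaluated independently, a patient may match none, one, or several categories.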

By a specific cancer category is typically meant a cancer type or a cancer subtype. A preferred example of a cancer type to be determined in accordance with the present invention is leukemia, in particular pediatric acute lymphoblastic leukemia (ALL), and preferred examples of specific cancer subtype(s) to be determined in accordance with the present invention is/are typically T-ALL, HeH, t(12;21), 11q23/MLL, t(1;19), dic(9;20), t(9;22), and/or iAMP21. In a preferred embodiment of the invention, the number of CpG sites in each group of predetermined CpG sites associated with each specific cancer category is as follows:

• ALL: 13-17 CpG sites, preferably 17 CpG sites;

• T-ALL: 7-14 CpG sites, such as 11-14 CpG sites, preferably 14 CpG sites;

• HeH: 23-34 CpG sites, such as 27-34 CpG sites, preferably 34 CpG sites;

• t(12;21): 28-42 CpG sites, such as 33-42 CpG sites, preferably 42 CpG sites;

• 11q23/MLL: 16-28 CpG sites, such as 22-28 CpG sites, preferably 28 CpG sites;

• t(1;19): 9-21 CpG sites, such as 16-21 CpG sites, preferably 21 CpG sites;

• dic(9;20): 24-37 CpG sites, such as 29-37 CpG sites, preferably 37 CpG sites;

• t(9;22): 17-23 CpG sites, such as 18-23 CpG sites, preferably 23 CpG sites;

• iAMP21: 11-16 CpG sites, such as 12-16 CpG sites, preferably 16 CpG sites.

The CpG sites in each group of predetermined CpG sites associated with each specific cancer category may be selected from Table A, B, or C, where each CpG site is expressed as a range of nucleotide coordinates (Tables A and B), or as a specific nucleotide coordinate (Table C), referring to the UCSC Feb. 2009 (GRCh37/hg19) assembly.

The knowledge that the methylation level in these particular sites is of relevance for cancer category determination is very useful, and may be applied in different ways depending on the method used for measuring the methylation level.

The cancer category specific methylation level for each CpG site may be as defined in Tables A, B, or C, where the cancer category specific methylation level for each CpG site is expressed as a range of mean beta values (Tables A and B), or as a specific mean beta value (Table C).

The present invention also relates to a kit for determining whether a patient belongs to a specific cancer category, said kit comprising: means for performing, in a sample drawn from said patient, an analysis to determine the DNA methylation level in a group of predetermined CpG sites; said group of predetermined CpG sites comprising in the range of 7-50 CpG sites, and said group of predetermined CpG sites being associated with a specific cancer category, wherein the output of said analysis is a patient methylation profile comprising a patient-specific methylation level for each CpG site in said group of predetermined CpG sites, and wherein said specific cancer category is characterised by a cancer category methylation profile comprising a cancer category specific level of methylation for each CpG site in said group of predetermined CpG sites, wherein said kit further comprises means for comparing said patient methylation profile with said cancer category methylation profile, whereby said patient belongs to said specific cancer category if said patient methylation profile corresponds to said cancer category methylation profile.

The present invention also relates to a kit for determining whether a patient belongs to a specific cancer category, said kit comprising means for performing the steps of any of the methods described above.

Another embodiment of the present invention relates to an analysis chip comprising means for analysing the methylation level of one or more of the CpG sites defined in Table A, B, or C, where each CpG site is expressed as a range of nucleotide coordinates (Tables A and B), or as a specific nucleotide coordinate (Table C), referring to the UCSC Feb. 2009 (GRCh37/hg19) assembly.

In a further embodiment, the present invention relates to a computer program product including instructions that, when executed, cause one or more processors to perform the steps of the method described in the foregoing. An embodiment of the invention relates to a computer-readable medium having stored thereon said computer program product. Another embodiment of the invention relates to a computer program product stored on a non-transitory computer-readable medium including instructions that, when executed, cause one or more processors to perform the steps of the method described in the foregoing.

A further aspect of the present invention relates to a method for allocating a cancer patient to a treatment protocol, said method comprising: determining whether said patient belongs to a specific cancer category by using the above-described method for determining whether a patient belongs to a specific cancer category; and

if said patient belongs to a specific cancer category, allocating said patient to a treatment protocol depending on the specific cancer category.

In a further embodiment, the present invention relates to a computer program product including instructions that, when executed, cause one or more processors to perform the steps of said method for allocating a cancer patient to a treatment protocol. An embodiment of the invention relates to a computer-readable medium having stored thereon said computer program product. Another embodiment of the invention relates to a computer program product stored on a non-transitory computer-readable medium including instructions that, when executed, cause one or more processors to perform the steps of said method for allocating a cancer patient to a treatment protocol.

A further aspect of the present invention relates to a kit for allocating a cancer patient to a treatment protocol, said kit comprising:

means for determining whether said patient belongs to a specific cancer category by using the method described in the foregoing; and

if said patient belongs to a specific cancer category, means for allocating said patient to a treatment protocol depending on the specific cancer category.

The present invention involves a significant contribution to the art, as it for the first time provides for a reliable and robust method for identifying various cancer categories, in particular ALL cancer categories, by methylation level analysis. Considering that there are 28 million CpG sites in the human genome that are targets for DNA methylation, it is ground-breaking that analysing the methylation level at as few as 7-50 CpG sites, 10-50 CpG sites, or even 14-42 CpG sites, suffices in order to make a determination of whether a patient belongs to a certain cancer category or not. The knowledge of which CpG sites to analyse for methylation level in order to make the cancer category determination provides for completely new opportunities in the diagnosis and treatment of cancer patients, in particular ALL patients.

Brief description of the drawings

Figure 1 shows the prediction of ALL subtypes by consensus CpG sites defined using ALL samples of known subtype. The subtype prediction scores for each of the original ALL samples used to define the classifier (n=546) are plotted by subtype.

Figure 2 shows the prediction of ALL subtypes for the independent ALL validation samples. Each sample in the validation set (n=39) is represented as a vertical bar positioned into its corresponding subtype as indicated below on the x-axis. The color key to the right of the panel shows the subtype prediction scores. A value >0.5 indicates high probability of correct classification. Subtype prediction scores <0.5 are not shown.

Figure 3 shows the classification of ALL samples with undefined cytogenetic subtypes. (A) Each sample (n=210) is represented as a vertical bar positioned in its corresponding subtype "track" according to its allocation by the classifier. The color key to the right of panel A shows the estimated subtype prediction scores. All subtype probability scores <0.5 are not shown. (B) The distribution of the number of subtypes in which newly classified patients receive subtype probability scores >0.5. Eighty-seven patients were not classified, 106 patients were unequivocally assigned to one subtype, and 17 patients were classified into multiple subtype groups. (C) The distribution of the number of patients with "normal", "no result", and "non-recurrent" karyotypes into subtype-groups. The subtype distribution in the known sample group is also shown.

Figure 4 shows a profile comparison between the patient methylation profile of an unknown patient sample (thin black line with empty circles), and the specific methylation profiles for each cancer category (bold grey line with filled circles).
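The 0.5 cutoff on subtype prediction scores used in Figures 2 and 3 can be sketched in Python as follows. The sketch assumes each sample carries one score per subtype; the example scores and function names are hypothetical. The three counts at the end are those reported for panel B of Figure 3.

```python
from typing import Dict, List


def assigned_subtypes(scores: Dict[str, float], cutoff: float = 0.5) -> List[str]:
    """Subtypes whose prediction score exceeds the cutoff."""
    return [subtype for subtype, score in scores.items() if score > cutoff]


def outcome(scores: Dict[str, float], cutoff: float = 0.5) -> str:
    """Group a sample by the number of subtypes it is assigned to."""
    n = len(assigned_subtypes(scores, cutoff))
    if n == 0:
        return "not classified"
    if n == 1:
        return "single subtype"
    return "multiple subtypes"


# Hypothetical scores for one sample.
sample_scores = {"HeH": 0.91, "t(12;21)": 0.08, "T-ALL": 0.02}
print(outcome(sample_scores))

# Counts reported for Figure 3B: unclassified, one subtype, several subtypes.
figure_3b_counts = {"not classified": 87, "single subtype": 106, "multiple subtypes": 17}
print(sum(figure_3b_counts.values()))
```

The three groups add up to the 210 samples with undefined cytogenetic subtype.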

Detailed description of the invention

The gist of the present invention is to apply DNA methylation profiling for prediction of pediatric acute lymphoblastic leukemia (ALL) and/or the cytogenetic subtypes of ALL cells.

In accordance with the invention, it has been shown that measuring the methylation level at a very limited number of CpG sites is sufficient in order to be able to predict whether a patient belongs to a certain cancer category. It has surprisingly been established that measuring the methylation at as few as 232 CpG sites is sufficient in order to classify patients into 9 different cancer categories. None of the 9 different cancer categories requires measurement at more than 42 sites.
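The figures of 232 CpG sites and at most 42 sites per category follow from the preferred group sizes stated in the summary of the invention, as the following short Python check illustrates. The dictionary name `preferred_sites` is chosen for this sketch only.

```python
# Preferred number of CpG sites per cancer category, as listed in the summary.
preferred_sites = {
    "ALL": 17,
    "T-ALL": 14,
    "HeH": 34,
    "t(12;21)": 42,
    "11q23/MLL": 28,
    "t(1;19)": 21,
    "dic(9;20)": 37,
    "t(9;22)": 23,
    "iAMP21": 16,
}

total_sites = sum(preferred_sites.values())    # all nine groups together
largest_group = max(preferred_sites.values())  # biggest single group

print(len(preferred_sites), total_sites, largest_group)
```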

The present invention therefore provides for a convenient and quick procedure for classifying patients into different cancer categories. This classification is very valuable when establishing a suitable treatment for the patients. In addition, the present invention makes it possible to correctly classify a greater percentage of patients into the right cancer category compared to the conventionally used methods.

The present invention is primarily based on the data presented in Tables A-C, which show:

1. The number of CpG sites to be measured for each cancer category: "Serial number".

2. The cancer category to be identified: "Cancer type/cancer subtype".

3. The chromosomal position of each CpG site: "Chromosome".

4. The position of the CpG site in the genome: "Coordinate (min)", "Coordinate (max)" (Tables A and B), and "Coordinate (specific)" (Table C).

5. The mean beta value indicative of each cancer category for each CpG site: "Mean Beta value (min)", "Mean Beta value (max)" (Tables A and B), and "Mean beta value (specific)" (Table C).

The coordinates in Tables A-C refer to the February 2009 human reference sequence (GRCh37), Feb. 2009 (GRCh37/hg19) assembly, available at the UCSC Human Genome Browser Gateway, http://genome-euro.ucsc.edu/cgi-bin/hg. Thus the CpG sites are unambiguously and completely identified by the coordinates referred to in Tables A-C.
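A row of Table A can be represented as a small record for lookup purposes, as in the following Python sketch. It assumes the seven whitespace-separated fields described above, decimal commas in the beta values, and inclusive coordinate and beta value ranges; the class and function names are hypothetical. The example row is the first T-ALL row of Table A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableARow:
    serial: int
    category: str
    chromosome: str
    coord_min: int
    coord_max: int
    beta_min: float
    beta_max: float

    def contains_position(self, chromosome: str, position: int) -> bool:
        """True if a GRCh37/hg19 position lies within this row's coordinate range."""
        return chromosome == self.chromosome and self.coord_min <= position <= self.coord_max

    def beta_in_range(self, beta: float) -> bool:
        """True if a beta value lies within this row's mean beta value range."""
        return self.beta_min <= beta <= self.beta_max


def parse_row(line: str) -> TableARow:
    """Parse one Table A row; beta values use a decimal comma."""
    serial, category, chrom, cmin, cmax, bmin, bmax = line.split()
    return TableARow(int(serial), category, chrom, int(cmin), int(cmax),
                     float(bmin.replace(",", ".")), float(bmax.replace(",", ".")))


row = parse_row("1 T-ALL 1 9714343 9715343 0,82 0,92")
print(row)
```

Keeping the chromosome as a string avoids special handling should non-numeric chromosome names occur.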

The relevant CpG sites, "the DNA methylation classifiers", were identified by using the genome-wide DNA methylation profiles of 546 patients with T-ALL or B-cell precursor ALL with the cytogenetic subtypes defined by high hyperdiploidy, t(12;21), t(1;19), t(9;22), MLL-rearrangements, dic(9;20), and iAMP21. Methylation classification was then applied to 210 patients with unknown karyotype or non-recurrent cytogenetic aberrations, of which 106 displayed methylation profiles highly similar (subtype-like) to those of the known recurrent groups.

The subtypes of the newly classified patients were verified by examination of diagnostic karyotypes, copy number analysis, and detection of fusion genes. True subtype membership was established in 20% of the newly classified patients.

Thus, the present invention demonstrates that methylation-based classification may be a useful tool for samples lacking comprehensive cytogenetic analyses.

TABLES A-C: Groups of predetermined CpG sites associated with specific cancer categories (cancer types / cancer subtypes), and their respective mean beta values

TABLE A

Serial number¹ | Cancer type / Cancer subtype | Chromosome² | Coordinate (min)³ | Coordinate (max)⁴ | Mean Beta value (min)⁵ | Mean Beta value (max)⁶

1 T-ALL 1 9714343 9715343 0,82 0,92

2 T-ALL 1 16292246 16293246 0,82 0,92

3 T-ALL 1 61377618 61378618 0,88 0,98

4 T-ALL 1 111414732 111415732 0,79 0,89

5 T-ALL 2 20776120 20777120 0,83 0,93

6 T-ALL 3 32463213 32464213 0,81 0,91

7 T-ALL 5 149792340 149793340 0,79 0,89

8 T-ALL 6 157881707 157882707 0,83 0,93

9 T-ALL 7 2339277 2340277 0,83 0,93

10 T-ALL 8 129061165 129062165 0,84 0,94

11 T-ALL 11 66103389 66104389 0,81 0,91

12 T-ALL 13 46943643 46944643 0,85 0,95

13 T-ALL 17 61773674 61774674 0,80 0,90

14 T-ALL 17 78698322 78699322 0,90 1,00

1 HeH 1 4712264 4713264 0,63 0,73

2 HeH 1 153581155 153582155 0,24 0,34

3 HeH 1 153581250 153582250 0,19 0,29

4 HeH 2 38830666 38831666 0,18 0,28

5 HeH 2 220147653 220148653 0,22 0,32

6 HeH 3 15358838 15359838 0,27 0,37

7 HeH 3 45557593 45558593 0,16 0,26

8 HeH 4 3298014 3299014 0,27 0,37

9 HeH 4 26274589 26275589 0,10 0,20

10 HeH 4 101106972 101107972 0,34 0,44

11 HeH 5 10631556 10632556 0,10 0,20

12 HeH 6 22896579 22897579 0,22 0,32

13 HeH 6 32077627 32078627 0,21 0,31

14 HeH 6 32077633 32078633 0,20 0,30

15 HeH 7 141646190 141647190 0,21 0,31

16 HeH 8 79513461 79514461 0,25 0,35

17 HeH 8 141598685 141599685 0,12 0,22

18 HeH 9 139639825 139640825 0,20 0,30

19 HeH 9 139925250 139926250 0,32 0,42

HeH 9 139925356 139926356 0,24 0,34

HeH 10 7138526 7139526 0,34 0,44

HeH 10 45958271 45959271 0,29 0,39

HeH 10 80501931 80502931 0,25 0,35

HeH 11 44976660 44977660 0,21 0,31

HeH 11 67447895 67448895 0,18 0,28

HeH 14 68797865 68798865 0,24 0,34

HeH 14 95941549 95942549 0,20 0,30

HeH 16 85880875 85881875 0,21 0,31

HeH 17 27308613 27309613 0,16 0,26

HeH 19 2090906 2091906 0,18 0,28

HeH 19 17487276 17488276 0,36 0,46

HeH 21 37948739 37949739 0,33 0,43

HeH 21 45231732 45232732 0,13 0,23

HeH 22 46770144 46771144 0,12 0,22

t(12;21) 1 24647703 24648703 0,61 0,71

t(12;21) 1 28593232 28594232 0,18 0,28

t(12;21) 2 11606988 11607988 0,09 0,19

t(12;21) 2 97191766 97192766 0,25 0,35

t(12;21) 2 242904238 242905238 0,12 0,22

t(12;21) 2 242904268 242905268 0,12 0,22

t(12;21) 2 242904293 242905293 0,13 0,23

t(12;21) 3 39234501 39235501 0,28 0,38

t(12;21) 3 45003329 45004329 0,20 0,30

t(12;21) 3 75441304 75442304 0,22 0,32

t(12;21) 3 113249314 113250314 0,14 0,24

t(12;21) 3 126009951 126010951 0,21 0,31

t(12;21) 4 40920671 40921671 0,19 0,29

t(12;21) 5 61702151 61703151 0,20 0,30

t(12;21) 5 123090470 123091470 0,13 0,23

t(12;21) 5 172195653 172196653 0,12 0,22

t(12;21) 6 4482383 4483383 0,16 0,26

t(12;21) 6 4482490 4483490 0,35 0,45

t(12;21) 6 13295649 13296649 0,17 0,27

t(12;21) 6 28859083 28860083 0,25 0,35

t(12;21) 7 101559108 101560108 0,30 0,40

t(12;21) 8 604600 605600 0,32 0,42

t(12;21) 8 604923 605923 0,28 0,38

t(12;21) 8 1053548 1054548 0,20 0,30

t(12;21) 8 81789491 81790491 0,25 0,35

t(12;21) 11 70243970 70244970 0,55 0,65

t(12;21) 13 98959270 98960270 0,18 0,28

t(12;21) 13 98959299 98960299 0,23 0,33

t(12;21) 13 98961575 98962575 0,24 0,34

t(12;21) 16 69386347 69387347 0,21 0,31

t(12;21) 16 88993145 88994145 0,26 0,36

t(12;21) 16 88994240 88995240 0,27 0,37

t(12;21) 16 88999725 89000725 0,24 0,34

t(12;21) 17 47079459 47080459 0,11 0,21

t(12;21) 17 47092989 47093989 0,18 0,28

t(12;21) 17 63624067 63625067 0,16 0,26

t(12;21) 17 63624119 63625119 0,13 0,23

t(12;21) 18 19322833 19323833 0,28 0,38

t(12;21) 18 19324259 19325259 0,25 0,35

t(12;21) 18 28622540 28623540 0,09 0,19

t(12;21) 19 11494761 11495761 0,25 0,35

t(12;21) 20 61489095 61490095 0,13 0,23

11q23/MLL 1 26644015 26645015 0,66 0,76

11q23/MLL 1 32668650 32669650 0,20 0,30

11q23/MLL 1 44879817 44880817 0,13 0,23

11q23/MLL 1 86952550 86953550 0,13 0,23

11q23/MLL 2 28021723 28022723 0,31 0,41

11q23/MLL 2 65571391 65572391 0,23 0,33

11q23/MLL 2 65571448 65572448 0,15 0,25

11q23/MLL 2 98328837 98329837 0,67 0,77

11q23/MLL 2 145283759 145284759 0,19 0,29

11q23/MLL 3 18003061 18004061 0,20 0,30

11q23/MLL 3 18003512 18004512 0,20 0,30

11q23/MLL 3 70432164 70433164 0,21 0,31

11q23/MLL 3 151952374 151953374 0,19 0,29

11q23/MLL 3 151984322 151985322 0,05 0,15

11q23/MLL 3 170077045 170078045 0,30 0,40

11q23/MLL 3 177554061 177555061 0,31 0,41

11q23/MLL 5 67779761 67780761 0,18 0,28

11q23/MLL 5 99930937 99931937 0,26 0,36

11q23/MLL 6 106551123 106552123 0,82 0,92

11q23/MLL 9 82278327 82279327 0,20 0,30

11q23/MLL 10 22722451 22723451 0,28 0,38

11q23/MLL 14 91865825 91866825 0,72 0,82

11q23/MLL 14 91865873 91866873 0,64 0,74

11q23/MLL 15 33162376 33163376 0,22 0,32

11q23/MLL 15 78504551 78505551 0,30 0,40

11q23/MLL 17 29621443 29622443 0,15 0,25

11q23/MLL 17 48970501 48971501 0,34 0,44

11q23/MLL 19 11073928 11074928 0,83 0,93

t(1;19) 2 1885586 1886586 0,23 0,33

t(1;19) 2 1885605 1886605 0,28 0,38

t(1;19) 2 27804834 27805834 0,19 0,29

t(1;19) 2 240043889 240044889 0,14 0,24

t(1;19) 3 124768999 124769999 0,15 0,25

t(1;19) 7 148801655 148802655 0,28 0,38

t(1;19) 8 29848079 29849079 0,14 0,24

t(1;19) 9 137649595 137650595 0,14 0,24

t(1;19) 10 28571348 28572348 0,21 0,31

t(1;19) 11 1012437 1013437 0,16 0,26

t(1;19) 12 32589645 32590645 0,15 0,25

t(1;19) 12 131518018 131519018 0,22 0,32

t(1;19) 13 30074171 30075171 0,10 0,20

t(1;19) 14 96369300 96370300 0,18 0,28

t(1;19) 15 58723157 58724157 0,14 0,24

t(1;19) 16 569688 570688 0,28 0,38

t(1;19) 16 87753519 87754519 0,10 0,20

t(1;19) 16 87753576 87754576 0,14 0,24

t(1;19) 17 55993019 55994019 0,12 0,22

t(1;19) 17 80768504 80769504 0,09 0,19

t(1;19) 22 38377133 38378133 0,24 0,34

dic(9;20) 1 162530643 162531643 0,50 0,60

dic(9;20) 1 162530667 162531667 0,56 0,66

dic(9;20) 1 162530673 162531673 0,52 0,62

dic(9;20) 1 162531019 162532019 0,45 0,55

dic(9;20) 1 162531203 162532203 0,54 0,64

dic(9;20) 1 162531226 162532226 0,55 0,65

dic(9;20) 1 174843897 174844897 0,34 0,44

dic(9;20) 1 202776795 202777795 0,43 0,53

dic(9;20) 2 9171728 9172728 0,36 0,46

dic(9;20) 3 34055300 34056300 0,24 0,34

dic(9;20) 5 114915130 114916130 0,25 0,35

dic(9;20) 6 28543193 28544193 0,28 0,38

dic(9;20) 6 99391978 99392978 0,43 0,53

dic(9;20) 6 130544332 130545332 0,15 0,25

dic(9;20) 6 147524341 147525341 0,67 0,77

dic(9;20) 6 147524962 147525962 0,66 0,76

dic(9;20) 6 147524977 147525977 0,64 0,74

dic(9;20) 7 86084370 86085370 0,27 0,37

dic(9;20) 7 134549930 134550930 0,25 0,35

dic(9;20) 7 158808518 158809518 0,37 0,47

dic(9;20) 7 158828103 158829103 0,31 0,41

dic(9;20) 7 158885598 158886598 0,21 0,31

dic(9;20) 7 158885620 158886620 0,25 0,35

dic(9;20) 7 158885769 158886769 0,20 0,30

dic(9;20) 8 106604531 106605531 0,33 0,43

dic(9;20) 9 90112211 90113211 0,45 0,55

dic(9;20) 11 7869738 7870738 0,29 0,39

dic(9;20) 11 20043968 20044968 0,27 0,37

dic(9;20) 12 10150405 10151405 0,42 0,52

dic(9;20) 12 93100681 93101681 0,35 0,45

dic(9;20) 13 33543514 33544514 0,27 0,37

dic(9;20) 13 34208394 34209394 0,27 0,37

dic(9;20) 15 57404943 57405943 0,37 0,47

dic(9;20) 16 72818754 72819754 0,28 0,38

dic(9;20) 16 73084592 73085592 0,29 0,39

dic(9;20) 16 89888317 89889317 0,41 0,51

dic(9;20) 17 33865573 33866573 0,26 0,36

t(9;22) 1 12072945 12073945 0,47 0,57

t(9;22) 1 214421886 214422886 0,40 0,50

t(9;22) 1 214785023 214786023 0,23 0,33

t(9;22) 2 169658682 169659682 0,17 0,27

t(9;22) 2 201169926 201170926 0,41 0,51

t(9;22) 3 4458858 4459858 0,31 0,41

t(9;22) 3 8910924 8911924 0,31 0,41

t(9;22) 3 55555697 55556697 0,64 0,74

t(9;22) 4 110226000 110227000 0,25 0,35

t(9;22) 6 167274895 167275895 0,25 0,35

t(9;22) 7 95059847 95060847 0,37 0,47

t(9;22) 8 142140955 142141955 0,40 0,50

t(9;22) 8 142142372 142143372 0,38 0,48

t(9;22) 9 124059655 124060655 0,32 0,42

t(9;22) 10 104915051 104916051 0,43 0,53

t(9;22) 11 58383774 58384774 0,28 0,38

t(9;22) 12 45818807 45819807 0,50 0,60

t(9;22) 12 94082970 94083970 0,33 0,43

t(9;22) 12 118501838 118502838 0,19 0,29

t(9;22) 15 52176189 52177189 0,31 0,41

t(9;22) 15 67417928 67418928 0,43 0,53

t(9;22) 15 77568477 77569477 0,25 0,35

t(9;22) 19 54645604 54646604 0,46 0,56

1 iAMP21 1 3289298 3290298 0,11 0,21

2 iAMP21 1 161279787 161280787 0,09 0,19

3 iAMP21 1 162136213 162137213 0,04 0,14

4 iAMP21 2 85875814 85876814 0,19 0,29

5 iAMP21 2 103233233 103234233 0,03 0,13

6 iAMP21 2 175627551 175628551 0,21 0,31

7 iAMP21 3 8570402 8571402 0,00 0,09

8 iAMP21 5 134607544 134608544 0,27 0,37

9 iAMP21 6 130537470 130538470 0,10 0,20

10 iAMP21 6 149744330 149745330 0,02 0,12

11 iAMP21 9 16310037 16311037 0,09 0,19

12 iAMP21 10 675437 676437 0,01 0,11

13 iAMP21 11 48190568 48191568 0,11 0,21

14 iAMP21 12 118312058 118313058 0,12 0,22

15 iAMP21 13 53770911 53771911 0,03 0,13

16 iAMP21 13 102360143 102361143 0,16 0,26

1 ALL 1 65991096 65992096 0,78 0,88

2 ALL 1 170629570 170630570 0,81 0,91

3 ALL 2 150186989 150187989 0,82 0,92

4 ALL 4 1398735 1399735 0,81 0,91

5 ALL 4 188952626 188953626 0,83 0,93

6 ALL 5 146613798 146614798 0,77 0,87

7 ALL 6 26987677 26988677 0,78 0,88

8 ALL 8 11058676 11059676 0,76 0,86

9 ALL 8 110703388 110704388 0,77 0,87

10 ALL 10 118607638 118608638 0,78 0,88

11 ALL 10 129535393 129536393 0,81 0,91

12 ALL 11 40314478 40315478 0,81 0,91

13 ALL 11 43569055 43570055 0,79 0,89

14 ALL 13 96204478 96205478 0,80 0,90

15 ALL 13 96704623 96705623 0,86 0,96

16 ALL 18 74961633 74962633 0,79 0,89

17 ALL 20 13975690 13976690 0,76 0,86

1 "Serial number" is an identifier for an individual CpG site of a specific cancer category

2 "Chromosome" defines the chromosome on which the CpG site in question is located

3-4 "Coordinate (min)" and "Coordinate (max)" refer to the UCSC Feb. 2009 (GRCh37/hg19) assembly, and define the end coordinates for the position of the CpG site in question such that the CpG site lies within the range of "Coordinate (min)" and "Coordinate (max)"

5-6 "Mean beta value (min)" and "Mean beta value (max)" define the end values for the cancer category specific methylation level for each CpG site such that the methylation level of the CpG site in question lies within the range of "Mean beta value (min)" and "Mean beta value (max)"

TABLE B

Columns (left to right): Serial number 1; Cancer type / Cancer subtype; Chromosome 2; Coordinate (min) 3; Coordinate (max) 4; Mean Beta value (min) 5; Mean Beta value (max) 6

1 T-ALL 1 9714593 9715093 0,85 0,89

2 T-ALL 1 16292496 16292996 0,85 0,89

3 T-ALL 1 61377868 61378368 0,91 0,95

4 T-ALL 1 111414982 111415482 0,82 0,86

5 T-ALL 2 20776370 20776870 0,86 0,90

6 T-ALL 3 32463463 32463963 0,84 0,88

7 T-ALL 5 149792590 149793090 0,82 0,86

8 T-ALL 6 157881957 157882457 0,86 0,90

9 T-ALL 7 2339527 2340027 0,86 0,90

10 T-ALL 8 129061415 129061915 0,87 0,91

11 T-ALL 11 66103639 66104139 0,84 0,88

12 T-ALL 13 46943893 46944393 0,88 0,92

13 T-ALL 17 61773924 61774424 0,83 0,87

14 T-ALL 17 78698572 78699072 0,93 0,97

1 HeH 1 4712514 4713014 0,66 0,70

2 HeH 1 153581405 153581905 0,27 0,31

3 HeH 1 153581500 153582000 0,22 0,26

4 HeH 2 38830916 38831416 0,21 0,25

5 HeH 2 220147903 220148403 0,25 0,29

6 HeH 3 15359088 15359588 0,30 0,34

7 HeH 3 45557843 45558343 0,19 0,23

8 HeH 4 3298264 3298764 0,30 0,34

9 HeH 4 26274839 26275339 0,13 0,17

10 HeH 4 101107222 101107722 0,37 0,41

11 HeH 5 10631806 10632306 0,13 0,17

12 HeH 6 22896829 22897329 0,25 0,29

13 HeH 6 32077877 32078377 0,24 0,28

14 HeH 6 32077883 32078383 0,23 0,27

15 HeH 7 141646440 141646940 0,24 0,28

16 HeH 8 79513711 79514211 0,28 0,32

17 HeH 8 141598935 141599435 0,15 0,19

HeH 9 139640075 139640575 0,23 0,27

HeH 9 139925500 139926000 0,35 0,39

HeH 9 139925606 139926106 0,27 0,31

HeH 10 7138776 7139276 0,37 0,41

HeH 10 45958521 45959021 0,32 0,36

HeH 10 80502181 80502681 0,28 0,32

HeH 11 44976910 44977410 0,24 0,28

HeH 11 67448145 67448645 0,21 0,25

HeH 14 68798115 68798615 0,27 0,31

HeH 14 95941799 95942299 0,23 0,27

HeH 16 85881125 85881625 0,24 0,28

HeH 17 27308863 27309363 0,19 0,23

HeH 19 2091156 2091656 0,21 0,25

HeH 19 17487526 17488026 0,39 0,43

HeH 21 37948989 37949489 0,36 0,40

HeH 21 45231982 45232482 0,16 0,20

HeH 22 46770394 46770894 0,15 0,19

t(12;21) 1 24647953 24648453 0,64 0,68

t(12;21) 1 28593482 28593982 0,21 0,25

t(12;21) 2 11607238 11607738 0,12 0,16

t(12;21) 2 97192016 97192516 0,28 0,32

t(12;21) 2 242904488 242904988 0,15 0,19

t(12;21) 2 242904518 242905018 0,15 0,19

t(12;21) 2 242904543 242905043 0,16 0,20

t(12;21) 3 39234751 39235251 0,31 0,35

t(12;21) 3 45003579 45004079 0,23 0,27

t(12;21) 3 75441554 75442054 0,25 0,29

t(12;21) 3 113249564 113250064 0,17 0,21

t(12;21) 3 126010201 126010701 0,24 0,28

t(12;21) 4 40920921 40921421 0,22 0,26

t(12;21) 5 61702401 61702901 0,23 0,27

t(12;21) 5 123090720 123091220 0,16 0,20

t(12;21) 5 172195903 172196403 0,15 0,19

t(12;21) 6 4482633 4483133 0,19 0,23

t(12;21) 6 4482740 4483240 0,38 0,42

t(12;21) 6 13295899 13296399 0,20 0,24

t(12;21) 6 28859333 28859833 0,28 0,32

t(12;21) 7 101559358 101559858 0,33 0,37

t(12;21) 8 604850 605350 0,35 0,39

t(12;21) 8 605173 605673 0,31 0,35

t(12;21) 8 1053798 1054298 0,23 0,27

t(12;21) 8 81789741 81790241 0,28 0,32

t(12;21) 11 70244220 70244720 0,58 0,62

t(12;21) 13 98959520 98960020 0,21 0,25

t(12;21) 13 98959549 98960049 0,26 0,30

t(12;21) 13 98961825 98962325 0,27 0,31

t(12;21) 16 69386597 69387097 0,24 0,28

t(12;21) 16 88993395 88993895 0,29 0,33

t(12;21) 16 88994490 88994990 0,30 0,34

t(12;21) 16 88999975 89000475 0,27 0,31

t(12;21) 17 47079709 47080209 0,14 0,18

t(12;21) 17 47093239 47093739 0,21 0,25

t(12;21) 17 63624317 63624817 0,19 0,23

t(12;21) 17 63624369 63624869 0,16 0,20

t(12;21) 18 19323083 19323583 0,31 0,35

t(12;21) 18 19324509 19325009 0,28 0,32

t(12;21) 18 28622790 28623290 0,12 0,16

t(12;21) 19 11495011 11495511 0,28 0,32

t(12;21) 20 61489345 61489845 0,16 0,20

11q23/MLL 1 26644265 26644765 0,69 0,73

11q23/MLL 1 32668900 32669400 0,23 0,27

11q23/MLL 1 44880067 44880567 0,16 0,20

11q23/MLL 1 86952800 86953300 0,16 0,20

11q23/MLL 2 28021973 28022473 0,34 0,38

11q23/MLL 2 65571641 65572141 0,26 0,30

11q23/MLL 2 65571698 65572198 0,18 0,22

11q23/MLL 2 98329087 98329587 0,70 0,74

11q23/MLL 2 145284009 145284509 0,22 0,26

11q23/MLL 3 18003311 18003811 0,23 0,27

11q23/MLL 3 18003762 18004262 0,23 0,27

11q23/MLL 3 70432414 70432914 0,24 0,28

11q23/MLL 3 151952624 151953124 0,22 0,26

11q23/MLL 3 151984572 151985072 0,08 0,12

11q23/MLL 3 170077295 170077795 0,33 0,37

11q23/MLL 3 177554311 177554811 0,34 0,38

11q23/MLL 5 67780011 67780511 0,21 0,25

11q23/MLL 5 99931187 99931687 0,29 0,33

11q23/MLL 6 106551373 106551873 0,85 0,89

11q23/MLL 9 82278577 82279077 0,23 0,27

11q23/MLL 10 22722701 22723201 0,31 0,35

11q23/MLL 14 91866075 91866575 0,75 0,79

11q23/MLL 14 91866123 91866623 0,67 0,71

11q23/MLL 15 33162626 33163126 0,25 0,29

11q23/MLL 15 78504801 78505301 0,33 0,37

11q23/MLL 17 29621693 29622193 0,18 0,22

11q23/MLL 17 48970751 48971251 0,37 0,41

11q23/MLL 19 11074178 11074678 0,86 0,90

t(1;19) 2 1885836 1886336 0,26 0,30

t(1;19) 2 1885855 1886355 0,31 0,35

t(1;19) 2 27805084 27805584 0,22 0,26

t(1;19) 2 240044139 240044639 0,17 0,21

t(1;19) 3 124769249 124769749 0,18 0,22

t(1;19) 7 148801905 148802405 0,31 0,35

t(1;19) 8 29848329 29848829 0,17 0,21

t(1;19) 9 137649845 137650345 0,17 0,21

t(1;19) 10 28571598 28572098 0,24 0,28

t(1;19) 11 1012687 1013187 0,19 0,23

t(1;19) 12 32589895 32590395 0,18 0,22

t(1;19) 12 131518268 131518768 0,25 0,29

t(1;19) 13 30074421 30074921 0,13 0,17

t(1;19) 14 96369550 96370050 0,21 0,25

t(1;19) 15 58723407 58723907 0,17 0,21

t(1;19) 16 569938 570438 0,31 0,35

t(1;19) 16 87753769 87754269 0,13 0,17

t(1;19) 16 87753826 87754326 0,17 0,21

t(1;19) 17 55993269 55993769 0,15 0,19

t(1;19) 17 80768754 80769254 0,12 0,16

t(1;19) 22 38377383 38377883 0,27 0,31

dic(9;20) 1 162530893 162531393 0,53 0,57

dic(9;20) 1 162530917 162531417 0,59 0,63

dic(9;20) 1 162530923 162531423 0,55 0,59

dic(9;20) 1 162531269 162531769 0,48 0,52

dic(9;20) 1 162531453 162531953 0,57 0,61

dic(9;20) 1 162531476 162531976 0,58 0,62

dic(9;20) 1 174844147 174844647 0,37 0,41

dic(9;20) 1 202777045 202777545 0,46 0,50

dic(9;20) 2 9171978 9172478 0,39 0,43

dic(9;20) 3 34055550 34056050 0,27 0,31

dic(9;20) 5 114915380 114915880 0,28 0,32

dic(9;20) 6 28543443 28543943 0,31 0,35

dic(9;20) 6 99392228 99392728 0,46 0,50

dic(9;20) 6 130544582 130545082 0,18 0,22

dic(9;20) 6 147524591 147525091 0,70 0,74

dic(9;20) 6 147525212 147525712 0,69 0,73

dic(9;20) 6 147525227 147525727 0,67 0,71

dic(9;20) 7 86084620 86085120 0,30 0,34

dic(9;20) 7 134550180 134550680 0,28 0,32

dic(9;20) 7 158808768 158809268 0,40 0,44

dic(9;20) 7 158828353 158828853 0,34 0,38

dic(9;20) 7 158885848 158886348 0,24 0,28

dic(9;20) 7 158885870 158886370 0,28 0,32

dic(9;20) 7 158886019 158886519 0,23 0,27

dic(9;20) 8 106604781 106605281 0,36 0,40

dic(9;20) 9 90112461 90112961 0,48 0,52

dic(9;20) 11 7869988 7870488 0,32 0,36

dic(9;20) 11 20044218 20044718 0,30 0,34

dic(9;20) 12 10150655 10151155 0,45 0,49

dic(9;20) 12 93100931 93101431 0,38 0,42

dic(9;20) 13 33543764 33544264 0,30 0,34

dic(9;20) 13 34208644 34209144 0,30 0,34

dic(9;20) 15 57405193 57405693 0,40 0,44

dic(9;20) 16 72819004 72819504 0,31 0,35

dic(9;20) 16 73084842 73085342 0,32 0,36

dic(9;20) 16 89888567 89889067 0,44 0,48

dic(9;20) 17 33865823 33866323 0,29 0,33

t(9;22) 1 12073195 12073695 0,50 0,54

t(9;22) 1 214422136 214422636 0,43 0,47

t(9;22) 1 214785273 214785773 0,26 0,30

t(9;22) 2 169658932 169659432 0,20 0,24

t(9;22) 2 201170176 201170676 0,44 0,48

t(9;22) 3 4459108 4459608 0,34 0,38

t(9;22) 3 8911174 8911674 0,34 0,38

t(9;22) 3 55555947 55556447 0,67 0,71

t(9;22) 4 110226250 110226750 0,28 0,32

t(9;22) 6 167275145 167275645 0,28 0,32

t(9;22) 7 95060097 95060597 0,40 0,44

t(9;22) 8 142141205 142141705 0,43 0,47

t(9;22) 8 142142622 142143122 0,41 0,45

t(9;22) 9 124059905 124060405 0,35 0,39

t(9;22) 10 104915301 104915801 0,46 0,50

t(9;22) 11 58384024 58384524 0,31 0,35

t(9;22) 12 45819057 45819557 0,53 0,57

t(9;22) 12 94083220 94083720 0,36 0,40

t(9;22) 12 118502088 118502588 0,22 0,26

t(9;22) 15 52176439 52176939 0,34 0,38

t(9;22) 15 67418178 67418678 0,46 0,50

t(9;22) 15 77568727 77569227 0,28 0,32

t(9;22) 19 54645854 54646354 0,49 0,53

1 iAMP21 1 3289548 3290048 0,14 0,18

2 iAMP21 1 161280037 161280537 0,12 0,16

3 iAMP21 1 162136463 162136963 0,07 0,11

4 iAMP21 2 85876064 85876564 0,22 0,26

5 iAMP21 2 103233483 103233983 0,06 0,10

6 iAMP21 2 175627801 175628301 0,24 0,28

7 iAMP21 3 8570652 8571152 0,02 0,06

8 iAMP21 5 134607794 134608294 0,30 0,34

9 iAMP21 6 130537720 130538220 0,13 0,17

10 iAMP21 6 149744580 149745080 0,05 0,09

11 iAMP21 9 16310287 16310787 0,12 0,16

12 iAMP21 10 675687 676187 0,04 0,08

13 iAMP21 11 48190818 48191318 0,14 0,18

14 iAMP21 12 118312308 118312808 0,15 0,19

15 iAMP21 13 53771161 53771661 0,06 0,10

16 iAMP21 13 102360393 102360893 0,19 0,23

1 ALL 1 65991346 65991846 0,81 0,85

2 ALL 1 170629820 170630320 0,84 0,88

3 ALL 2 150187239 150187739 0,85 0,89

4 ALL 4 1398985 1399485 0,84 0,88

5 ALL 4 188952876 188953376 0,86 0,90

6 ALL 5 146614048 146614548 0,80 0,84

7 ALL 6 26987927 26988427 0,81 0,85

8 ALL 8 11058926 11059426 0,79 0,83

9 ALL 8 110703638 110704138 0,80 0,84

10 ALL 10 118607888 118608388 0,81 0,85

11 ALL 10 129535643 129536143 0,84 0,88

12 ALL 11 40314728 40315228 0,84 0,88

13 ALL 11 43569305 43569805 0,82 0,86

14 ALL 13 96204728 96205228 0,83 0,87

15 ALL 13 96704873 96705373 0,89 0,93

16 ALL 18 74961883 74962383 0,82 0,86

17 ALL 20 13975940 13976440 0,79 0,83

1 "Serial number" is an identifier for an individual CpG site of a specific cancer category

2 "Chromosome" defines the chromosome on which the CpG site in question is located

3-4 "Coordinate (min)" and "Coordinate (max)" refer to the UCSC Feb. 2009 (GRCh37/hg19) assembly, and define the end coordinates for the position of the CpG site in question such that the CpG site lies within the range of "Coordinate (min)" and "Coordinate (max)"

5-6 "Mean beta value (min)" and "Mean beta value (max)" define the end values for the cancer category specific methylation level for each CpG site such that the methylation level of the CpG site in question lies within the range of "Mean beta value (min)" and "Mean beta value (max)"

TABLE C:

1 "Serial number" is an identifier for an individual CpG site of a specific cancer category

2 "Chromosome" defines the chromosome on which the CpG site in question is located

3 "Coordinate (specific)" refers to the UCSC Feb. 2009 (GRCh37/hg19) assembly, and defines the exact coordinate for the position of the CpG site in question

4 "Mean beta value (specific)" defines a preferred cancer category specific methylation level for each CpG site

In performing the method according to the present invention, the patient sample is suitably DNA extracted from a blood sample or a bone marrow sample.

The level of DNA methylation may be measured in several different ways, and the present invention is not limited to a specific method for establishing the DNA methylation level. However, in accordance with the present invention, the level of DNA methylation is suitably determined by establishing the beta value for each CpG site, preferably using a bisulfite reaction based method.

A preferred example of a bisulfite reaction based method within the context of the present invention is "The Infinium Methylation Assay" (Illumina Inc.), which detects cytosine methylation at CpG sites based on highly multiplexed genotyping of bisulfite-converted genomic DNA (gDNA). Upon treatment with bisulfite, unmethylated cytosine bases are converted to uracil, while methylated cytosine bases remain unchanged.

The assay interrogates these chemically differentiated loci using two site-specific probes, one designed for the methylated locus (M bead type) and another for the unmethylated locus (U bead type).

Single-base extension of the probes incorporates a labeled ddNTP, which is subsequently stained with a fluorescence reagent. The level of methylation for the interrogated locus can be determined by calculating the ratio of the fluorescent signals from the methylated vs unmethylated sites.
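As a minimal sketch of the ratio described above, the beta value is commonly computed as the methylated signal divided by the total signal. The function name, the example intensities, and the stabilising offset of 100 are assumptions made for illustration; the description itself does not specify them.

```python
def beta_value(meth_signal: float, unmeth_signal: float, offset: float = 100.0) -> float:
    """Return a beta value in [0, 1] from methylated (M) and unmethylated (U) intensities.

    The offset is an assumed small constant that keeps the ratio stable
    when both intensities are close to zero.
    """
    if meth_signal < 0 or unmeth_signal < 0:
        raise ValueError("signal intensities must be non-negative")
    return meth_signal / (meth_signal + unmeth_signal + offset)


# Hypothetical intensities for one CpG site: 8400 / (8400 + 1500 + 100)
print(round(beta_value(8400.0, 1500.0), 2))  # 0.84
```

A value near 1 indicates that most DNA copies in the sample carry a methylated cytosine at the site, and a value near 0 indicates that most do not.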

Specifically, a bisulfite reaction based method within the context of the present invention may include treating DNA with sodium bisulfite and analysing the bisulfite treated DNA using the Infinium HumanMethylation450 BeadChip assay (Illumina Inc.). In the bisulfite treatment step, about 250-500 ng DNA may be used, and in the analysis step, about 200 ng of bisulfite treated DNA may be used. The patient-specific methylation level is preferably expressed as a beta value number and the cancer category specific methylation level is preferably expressed as a mean beta value number.

The output of a beta value analysis, i.e. a beta value number, is the methylation level of a particular base in the cells representing the DNA sample.

The patient-specific methylation level is generally based on one single DNA sample, and thus, the patient-specific methylation level is expressed as "a" patient-specific beta value number. It may however also be contemplated that the patient-specific beta value number is based on two or more DNA samples from the patient.

The cancer category specific methylation level, on the other hand, is expressed as the "mean" beta value number since the values listed in Tables A-C are based on beta values from DNA samples obtained from several patients having defined subtypes. In this regard, reference is made to Table 1, where it can be seen that the number of individual samples for each subtype ranges from 8 to 189.

The mean absolute difference between the patient-specific methylation levels and the cancer category specific methylation levels is established by:

1. for each CpG site in said group of predetermined CpG sites, calculating the difference in % between the patient-specific methylation level (preferably expressed as a beta value number) and the cancer category specific methylation level (preferably expressed as a mean beta value number), and

2. calculating the mean difference in %.

In a preferred embodiment of the invention, the mean absolute difference is 25 % or lower. In other embodiments of the invention, the mean absolute difference is 20 % or lower, 15 % or lower, or 10 % or lower. By "absolute" difference is meant the magnitude of the difference, irrespective of whether the underlying difference is positive or negative.

With reference to Figure 4, each cancer category (i.e. cancer type or cancer subtype) is characterized by a profile of 14-42 CpG sites (bold grey lines with filled circles, denoted with the name of each cancer category). When classifying an unknown patient sample (thin black line with empty circles), the patient methylation profile is compared with the cancer category specific profile.

If the mean absolute difference between the patient methylation profile and the cancer category specific methylation profile is at most 25%, it is considered a match. In the example shown in Figure 4, the sample matches the ALL profile (top), and the t(12;21) cancer subtype (third row left). Note that the CpG sites are typically located at various genomic locations on different chromosomes, even though they appear next to each other in the figure.
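The two-step calculation and the 25 % match criterion can be sketched as follows. This is an illustration only: it assumes that "difference in %" means the absolute difference between two beta values multiplied by 100 (percentage points of methylation), and the function names and the three-site example profiles are hypothetical.

```python
from statistics import mean


def mean_abs_diff_pct(patient: list[float], category: list[float]) -> float:
    """Mean absolute difference, in percentage points, between two profiles.

    Both lists hold beta values (0..1) for the same CpG sites in the same order.
    """
    if len(patient) != len(category) or not patient:
        raise ValueError("profiles must be non-empty and of equal length")
    # Step 1: per-site absolute difference; step 2: mean over all sites.
    return 100.0 * mean(abs(p - c) for p, c in zip(patient, category))


def is_match(patient: list[float], category: list[float],
             threshold_pct: float = 25.0) -> bool:
    """True if the patient profile corresponds to the category profile."""
    return mean_abs_diff_pct(patient, category) <= threshold_pct


# Hypothetical three-site profiles (real profiles hold 14-42 sites).
category_profile = [0.84, 0.30, 0.15]
patient_profile = [0.80, 0.50, 0.10]
print(is_match(patient_profile, category_profile))
```

The stricter embodiments (20 %, 15 %, or 10 %) correspond to passing a lower `threshold_pct`.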

By the feature: "wherein the CpG sites in each group of predetermined CpG sites associated with each specific cancer category are selected from table A, B, and/or C, where each CpG site is expressed as a range of nucleotide coordinates (Tables A and B) or as a specific nucleotide coordinate (Table C), referring to the UCSC Feb. 2009 (GRCh37/hg19) assembly or a subgroup thereof" is meant the following:

The CpG sites associated with each cancer category are defined by the nucleotide coordinates listed in the Tables A, B, and C. The listed nucleotide coordinates refer to the UCSC Feb. 2009 (GRCh37/hg19) assembly, publicly available at the UCSC Human Genome Browser Gateway, http://genome-euro.ucsc.edu/cgi-bin/hgGateway.

Since a CpG locus contains two nucleotides, there are two genomic coordinates for a given site: one for C and the other for G. The lesser of the two coordinates is used as the coordinate of the CpG locus. Thus, the stated coordinates in Tables A-C refer to the C base in each CpG site: 5'...CpG...3'.

Tables A and B define a nucleotide coordinate interval for the position of the CpG site in question. Thus, Tables A and B define that the CpG site in question is located at a coordinate within the interval defined by a minimum coordinate "Coordinate (min)" and a maximum coordinate "Coordinate (max)".

Table C defines a specific nucleotide coordinate "Coordinate (specific)" for the position of the CpG site in question. As an example, if seeking to define the CpG sites in the group of predetermined CpG sites associated with the specific cancer category T-ALL (which is a cancer subtype), it can be seen from Tables A, B, and C that the group of predetermined CpG sites contains 14 CpG sites. Information on the location of each CpG site is unambiguously disclosed in the Tables A, B, and C.

Choosing a random CpG site within this group of predetermined CpG sites, e.g. the CpG site having serial number 7, it can be seen from Table A that this CpG site is located on chromosome 5 at a coordinate within the range of from 149792340 to 149793340 (endpoints included), where the listed nucleotide coordinates refer to the UCSC Feb. 2009 (GRCh37/hg19) assembly. In Table B, the nucleotide interval is narrowed, defining the CpG site in question (T-ALL, serial number 7) to be located on chromosome 5 at a coordinate within the range of from 149792590 to 149793090 (endpoints included). Table C specifically defines the CpG site in question (T-ALL, serial number 7) to be located on chromosome 5 at coordinate 149792840.

The corresponding reading of the Tables A, B, and C can be made for all the listed CpG sites.

In the research work leading to the present invention, a number of specific CpG sites were identified as decisive for identifying a specific cancer category. However, as closely located CpG sites usually have very similar methylation statuses, it can be expected that a methylated CpG site lying in close vicinity to the identified CpG sites, such as within a range of ±500 nucleotides or ±250 nucleotides (i.e. 500 or 250 base pairs upstream or downstream from the specific CpG sites), will impart the corresponding cytogenetic profile.
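The relationship between the three tables can be checked with a short sketch: the Table A and Table B intervals for the T-ALL site with serial number 7 are the Table C coordinate widened by 500 and 250 nucleotides, respectively. The function names are hypothetical; the coordinates are those quoted in the description.

```python
def coordinate_window(specific: int, half_width: int) -> tuple[int, int]:
    """Return (min, max) coordinates centred on a specific CpG coordinate."""
    return specific - half_width, specific + half_width


def in_window(coordinate: int, window: tuple[int, int]) -> bool:
    """True if the coordinate lies within the window, endpoints included."""
    low, high = window
    return low <= coordinate <= high


table_c_coordinate = 149792840  # T-ALL, serial number 7, chromosome 5 (hg19)

print(coordinate_window(table_c_coordinate, 500))  # (149792340, 149793340), Table A
print(coordinate_window(table_c_coordinate, 250))  # (149792590, 149793090), Table B
```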

By the feature: "wherein the cancer category specific methylation level for each CpG site is as defined in Tables A, B, or C, where the cancer category specific methylation level for each CpG site is expressed as a range of mean beta values (Tables A and B), or as a specific mean beta value (Table C)" is meant the following:

The cancer category specific methylation level for each CpG site is defined by the mean beta values listed in the Tables A, B, and C. The mean beta value is a number between 0 and 1, wherein a mean beta value number of 0 corresponds to 0% methylation of the CpG site in question, and a mean beta value number of 1 corresponds to 100% methylation of the CpG site in question. Methylation occurs at the cytosine residue in CpG dinucleotides, such that 5-methylcytosine (5mC) is formed, and the beta value is established by determining the methylation of a particular base in the cells representing the DNA sample.

The cancer category specific methylation level is expressed as the "mean" beta level since the listed values are based on beta values from DNA samples obtained from several patients having defined subtypes. In this regard, reference is made to Table 1, where it can be seen that the number of individual samples for each subtype ranges from 8 to 189.

Tables A and B define that the mean beta value for each CpG site lies within the interval defined by a minimum mean beta value "Mean Beta value (min)" and a maximum mean beta value "Mean Beta value (max)".

Table C defines a specific mean beta value "Mean Beta value (specific)" for the CpG site in question.

As an example, if seeking to define the mean beta values for the group of predetermined CpG sites associated with the specific cancer category T-ALL (which is a cancer subtype), it can be seen from Tables A, B, and C that the group of predetermined CpG sites contains 14 CpG sites, and information on the methylation level, i.e. the mean beta value, of each CpG site is unambiguously disclosed in the Tables A, B, and C.

Choosing a random CpG site within this group of predetermined CpG sites, e.g. the CpG site having serial number 7, it can be seen from Table A that this CpG site has a mean beta value in the range of from 0,79 to 0,89 (end points included). In Table B, the mean beta value interval is narrowed, defining the CpG site in question (T-ALL, serial number 7) to have a mean beta value within the range of from 0,82 to 0,86 (end points included). Table C specifically defines the CpG site in question (T-ALL, serial number 7) to have a mean beta value of 0,84.

The corresponding reading of the Tables A, B, and C can be made for all the listed CpG sites. In the research work leading to the present invention, a specific methylation level of each identified CpG site was identified as decisive for identifying a specific cancer category. However, it can be expected that minor variations, such as ±0,05 or ±0,02 units, of the methylation levels of the identified CpG sites may be contemplated while still imparting the corresponding cytogenetic profile.
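The beta value ranges can be derived from a specific mean beta value in the same way as the coordinate ranges. The sketch below assumes that ranges are clipped to the interval from 0 to 1, which is suggested by rows such as the T-ALL site with serial number 14 (0,90 to 1,00 in Table A); the function names are hypothetical, and the decimal comma of the tables is converted to a decimal point before calculation.

```python
def parse_beta(text: str) -> float:
    """Convert a beta value written with a decimal comma, e.g. '0,84'."""
    return float(text.replace(",", "."))


def beta_range(specific: float, tolerance: float) -> tuple[float, float]:
    """Return (min, max) mean beta values around a specific mean beta value.

    Values are rounded to two decimals, as in the tables, and clipped to 0..1
    (the clipping is an assumption of this sketch).
    """
    low = max(0.0, round(specific - tolerance, 2))
    high = min(1.0, round(specific + tolerance, 2))
    return low, high


table_c_beta = parse_beta("0,84")  # T-ALL, serial number 7

print(beta_range(table_c_beta, 0.05))  # (0.79, 0.89), the Table A row
print(beta_range(table_c_beta, 0.02))  # (0.82, 0.86), the Table B row
```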

In embodiments of the invention, the specific coordinates of the CpG sites and/or the specific beta value levels of Table C may independently be combined with the coordinate ranges of the CpG sites and/or the beta value ranges of Tables A and B.

Other aspects of the invention relate to the allocation of cancer patients to a specific treatment protocol based on the result of cancer category determination.

The invention will now be further explained by reference to the following examples, which in no way are to be considered as limiting for the scope of the appended claims.

Examples

MATERIALS & METHODS

Clinical diagnostic analysis of ALL samples

Bone marrow aspirates or peripheral blood samples were collected at diagnosis from 756 pediatric ALL patients enrolled between 1995 and 2008 on the Nordic Society of Pediatric Hematology and Oncology (NOPHO) ALL-92, ALL-2000, or ALL-2008 protocols. ALL diagnoses were established by analysis of leukemic cells with respect to morphology, immunophenotype, and cytogenetics. HeH was defined as having 51-67 chromosomes per cell. FISH or RT-PCR analyses were used to screen for the translocations t(12;21)(p13;q22)[ETV6/RUNX1], t(9;22)(q34;q11)[BCR/ABL1], and t(1;19)(q23;p13.3)[TCF3/PBX1]. FISH or Southern blot analyses were used to identify MLL rearrangements, >3x RUNX1 FISH probes defined iAMP21, and high resolution SNP arrays and/or FISH were used to detect dic(9;20) aberrations.

Design and test sample sets

Criteria for selecting patients with established subtypes included abnormal karyotypes from chromosome banding and/or positive results from targeted analyses. In total, 546 patients fulfilled these criteria and were included in the design of a DNA methylation classifier (Table 1).

Table 1. Summary of ALL samples with known subtype used to design methylation-based classifiers.

Immunophenotype   Cytogenetic abnormality   Fusion gene   N
T-ALL             -                         -             101
BCP ALL           HeH                       -             189
BCP ALL           t(12;21)                  ETV6/RUNX1    161
BCP ALL           11q23/MLL                 MLL-r         27
BCP ALL           t(1;19)                   TCF3/PBX1     21
BCP ALL           dic(9;20)                 -             20
BCP ALL           t(9;22)                   BCR/ABL1      19
BCP ALL           iAMP21                    -             8

Thirty-nine blinded DNA samples were obtained from newly diagnosed ALL cases as an independent validation set. A reference panel of non-leukemic blood cells was used for detection of ALL samples with low blast counts. The reference panel consisted of bone marrow aspirates from pediatric ALL patients during remission (n=86) and fractionated blood cells from healthy donors (n=51).

Of the ALL patients in our cohort, 210 patients did not display any of the recurrent ALL subtypes according to chromosomal banding or targeted assays for a chromosomal rearrangement. These patients with unassigned subtypes were divided into three groups denoted "non-recurrent", "normal" or "no result". Samples denoted as "non-recurrent" harbored non-recurrent complex aberrations (n=105). Samples designated as "normal" (n=87) displayed normal karyotypes and were negative in the targeted assays. Samples designated as "no result" failed in the cytogenetic analysis (n=18).

DNA extraction and DNA methylation assay

Lymphocytes isolated from diagnostic bone marrow or peripheral blood samples were enriched for lymphoblasts by Ficoll-isopaque centrifugation (Pharmacia). All leukemic cell samples included in the study contained >80% leukemic blasts (average 91%) according to light microscopy. DNA was extracted from cell pellets or viably frozen cells using the AllPrep DNA/RNA kit, Blood Mini Kit, or Gentra Puregene Tissue Kit (Qiagen). DNA concentration was measured on a Qubit 2.0 Fluorometer (Life Technologies). 250-500 ng of DNA was treated with sodium bisulfite (Zymo Research) and 200 ng of bisulfite-treated DNA was analyzed on the Human DNA Methylation 450k Array (Illumina Inc). The methylation levels were normalized using peak-based correction. After applying a previously described probe-filtering method, data from 435 941 autosomal and 10 891 X-chromosomal probes were available for DNA methylation analysis. Data are available at the Gene Expression Omnibus under series GSE49031.

Predictive modeling of ALL subtype using DNA methylation

The modeling procedure consisted of a feature selection step and a final training step. In the feature selection step, sub-classifiers were designed to distinguish between 9 pairs of groups: ALL against healthy reference samples, and each of the eight subtypes T-ALL, HeH, t(12;21), 11q23/MLL, t(1;19), dic(9;20), t(9;22), and iAMP21 against a background of the other ALL subtypes. The sub-classifiers were trained on all CpG sites on autosomes present on the array. The sub-classifiers were created in R using Nearest Shrunken Centroid (NSC) classification (Tibshirani et al., Proc Natl Acad Sci U S A 2002, May 14; 99(10):6567-6572). The R code is available at GitHub (https://github.com/Molmed/Nordlund-2013).
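As a purely illustrative sketch (written in Python rather than the R code referred to above), the nine pairs of groups may be expressed as nine binary labelings of the same samples. The function name, the task names, and the use of the label "reference" for non-leukemic samples are assumptions of this sketch.

```python
SUBTYPES = ["T-ALL", "HeH", "t(12;21)", "11q23/MLL", "t(1;19)",
            "dic(9;20)", "t(9;22)", "iAMP21"]


def binary_tasks(sample_labels):
    """Return {task name: list of class labels}, one list per sub-classifier.

    sample_labels holds one label per sample: a subtype name or "reference".
    Class 1 is the group of interest and class 2 the background; samples that
    take no part in a task are marked None.
    """
    tasks = {
        "ALL vs reference": [2 if lab == "reference" else 1 for lab in sample_labels]
    }
    for subtype in SUBTYPES:
        # Each subtype is contrasted with the other ALL subtypes only,
        # so reference samples are left out of these tasks.
        tasks[f"{subtype} vs other ALL"] = [
            None if lab == "reference" else (1 if lab == subtype else 2)
            for lab in sample_labels
        ]
    return tasks


tasks = binary_tasks(["T-ALL", "HeH", "reference"])
print(len(tasks))
print(tasks["T-ALL vs other ALL"])
```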

Sub-classifier training was performed as follows. Consider the pair of groups "T-ALL against not T-ALL" where class 1 is "T-ALL" and class 2 is "not T-ALL". First the mean profile of all training samples is defined, called the overall centroid

$$\bar{x}_i = \frac{1}{N} \sum_{j} x_{ij}$$

where $x_{ij}$ is the methylation level of CpG site i = 1, 2, ..., p in training sample j = 1, 2, ..., J, and N is the total number of training samples. Then the mean profiles of the two classes are defined, called class centroids

$$\bar{x}_{ik} = \frac{1}{n_k} \sum_{j \in C_k} x_{ij}$$

where k = 1 or 2 represents the class, $C_k$ is the set of training samples in class k, and $n_k$ is the number of training samples in class k. Since the ability of the CpG sites to separate the classes varies, they are weighted differently by defining the so-called shrunken centroids

$$\bar{x}'_{ik} = \bar{x}_i + m_k (s_i + s_0)\, d'_{ik}$$

where

$$d'_{ik} = \operatorname{sign}(d_{ik}) \left( |d_{ik}| - \Delta \right)_+ , \qquad d_{ik} = \frac{\bar{x}_{ik} - \bar{x}_i}{m_k (s_i + s_0)}$$

The interpretation of the above quantities is:

• $\bar{x}_{ik} - \bar{x}_i$ is the difference between class centroid and overall centroid,

• $s_i$ is the pooled within-class standard deviation,

• $s_0$ is a positive constant guarding against division by 0,

• $m_k s_i$ is the estimated standard error of the difference,

• $d_{ik}$ is the standardized difference,

• $d'_{ik}$ is the shrunken standardized difference, where $d_{ik}$ is moved towards 0 by the parameter Δ. Negative values of $|d_{ik}| - \Delta$ are truncated to 0. The selection of the shrinkage parameter Δ was done by internal cross-validation.

The decision on class membership for a test sample $x^*$ is then taken by calculating the standardized squared distance between the new observation and the shrunken centroids

$$\delta_k(x^*) = \sum_{i=1}^{p} \frac{(x_i^* - \bar{x}'_{ik})^2}{(s_i + s_0)^2} - 2 \log \pi_k$$

and choosing the nearest one, where $\pi_k$ is the frequency of class k in the population.
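The training and decision steps described above may be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the R implementation referred to in the text: the function names are hypothetical, the constant s_0 is passed in as a fixed value, the class frequencies default to the training proportions, and m_k is taken as sqrt(1/n_k - 1/N), which is the form for which m_k·s_i equals the standard error of the difference between a class mean and the overall mean. The toy data are made up.

```python
import math


def train_nsc(X, y, delta, s0=0.1, priors=None):
    """Fit shrunken centroids.

    X: list of samples, each a list of p methylation levels.
    y: list of class labels, one per sample.
    delta: shrinkage parameter; s0: positive constant (assumed value).
    """
    N, p = len(X), len(X[0])
    members = {k: [x for x, lab in zip(X, y) if lab == k] for k in sorted(set(y))}
    overall = [sum(x[i] for x in X) / N for i in range(p)]
    centroid = {k: [sum(x[i] for x in rows) / len(rows) for i in range(p)]
                for k, rows in members.items()}
    # Pooled within-class standard deviation s_i for each CpG site.
    s = []
    for i in range(p):
        ss = sum((x[i] - centroid[k][i]) ** 2
                 for k, rows in members.items() for x in rows)
        s.append(math.sqrt(ss / (N - len(members))))
    shrunken = {}
    for k, rows in members.items():
        m_k = math.sqrt(1 / len(rows) - 1 / N)  # assumed form of m_k
        shrunken[k] = []
        for i in range(p):
            scale = m_k * (s[i] + s0)
            d = (centroid[k][i] - overall[i]) / scale
            # Soft thresholding: move d towards 0 by delta, truncating at 0.
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)
            shrunken[k].append(overall[i] + scale * d_shrunk)
    if priors is None:
        priors = {k: len(rows) / N for k, rows in members.items()}
    return {"shrunken": shrunken, "overall": overall, "s": s, "s0": s0,
            "priors": priors}


def predict_nsc(model, x_new):
    """Return the class whose shrunken centroid is nearest to x_new."""
    def score(k):
        dist = sum((xi - ci) ** 2 / (si + model["s0"]) ** 2
                   for xi, ci, si in zip(x_new, model["shrunken"][k], model["s"]))
        return dist - 2 * math.log(model["priors"][k])
    return min(model["shrunken"], key=score)


# Toy data: site 0 separates the classes, site 1 does not.
X = [[0.90, 0.50], [0.85, 0.52], [0.88, 0.48],
     [0.10, 0.51], [0.15, 0.49], [0.12, 0.50]]
y = ["T-ALL"] * 3 + ["not T-ALL"] * 3
model = train_nsc(X, y, delta=1.0)
print(predict_nsc(model, [0.80, 0.50]))
```

In this toy example the uninformative site is shrunk all the way to the overall centroid, so it no longer contributes to the choice between the classes; CpG sites that survive the shrinkage are the ones regarded as selected by the sub-classifier.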

Using 5-fold cross validation repeated five times, 25 sub-classifiers were trained on different subsets of the samples. CpG sites selected in 17/25 cross validation folds were selected as "consensus CpG sites". These were used for training a consensus classifier, i.e. establishing the profiles of Figure 4 and Tables A, B, and C and formulating the final decision rule. The performance of the consensus classifier was evaluated using an external round of cross validation.
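The selection of consensus CpG sites can be illustrated with the following sketch. It assumes that "selected in 17/25" means selected in at least 17 of the 25 sub-classifiers; the function name and the CpG identifiers are hypothetical.

```python
from collections import Counter


def consensus_sites(selected_per_fold, min_folds=17):
    """Return the sorted CpG sites selected in at least min_folds sub-classifiers."""
    counts = Counter(site for fold in selected_per_fold for site in set(fold))
    return sorted(site for site, n in counts.items() if n >= min_folds)


# 25 sub-classifiers (5-fold cross-validation repeated five times).
# "cg_A" is selected in all 25, "cg_B" in 17 and "cg_C" in 16 of them.
folds = [
    {"cg_A"} | ({"cg_B"} if f < 17 else set()) | ({"cg_C"} if f < 16 else set())
    for f in range(25)
]
print(consensus_sites(folds))
```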

Analysis of copy number alterations (CNAs)

For each ALL sample, CNA data were generated from raw signal intensities extracted from GenomeStudio. For each probe, the intensities were summed (methylated + unmethylated signals) and subjected to quantile normalization using the preprocessCore package in R. Log2 ratios were calculated by dividing the normalized intensity by the mean intensity across the non-leukemic reference cells. CNAs were detected by plotting the log2 ratios in the Integrative Genomics Viewer.

Analysis of fusion gene expression

One microgram of total RNA was converted to cDNA and subjected to RT-PCR for the fusion transcript ETV6/RUNX1 in triplicate using reagents from Life Technologies. Strand-specific RNA-sequencing (RNA-seq) libraries were generated from one microgram of total RNA with the ScriptSeq v1.2 kits (Epicentre) followed by sequencing on a HiSeq2000 or MiSeq machine (Illumina Inc). Gene fusions were detected using the FusionCatcher software.

RESULTS

Prediction of ALL subtypes using DNA methylation classifiers

We used 546 ALL samples with the following eight unequivocally defined subtypes to design DNA methylation classifiers: T-ALL and the B-cell precursor ALL subtypes HeH, t(12;21), 11q23/MLL, t(1;19), dic(9;20), t(9;22), and iAMP21.

We evaluated the performance of the classifier design procedure by cross-validation for different sizes of the consensus set. The best performance in terms of sensitivity and specificity was obtained using a set of 246 consensus CpG sites that contained 14-42 CpG sites per subtype (Table 2).

Table 2. Performance of classifiers designed using ALL samples of known cancer category.

Cancer category  Mean sensitivity  Mean specificity  Min CpGs (N)  Max CpGs (N)  Consensus CpGs (N)
ALL              1.00 ± 0.01       0.99 ± 0.01       13            21            17
T-ALL            0.99 ± 0.02       1.00 ± 0.00       7             17            14
HeH              0.94 ± 0.04       0.95 ± 0.02       23            55            34
t(12;21)         0.97 ± 0.04       0.99 ± 0.01       28            2 263         42
11q23/MLL        0.95 ± 0.11       1.00 ± 0.00       16            27            28
t(1;19)          0.91 ± 0.13       1.00 ± 0.00       9             30            21
dic(9;20)        0.78 ± 0.16       0.99 ± 0.01       24            35            37
t(9;22)          0.70 ± 0.25       0.99 ± 0.01       17            41            23
iAMP21           0.81 ± 0.35       1.00 ± 0.00       11            19            16

This corresponded to 91.0% of the ALL samples being assigned to a single correct subtype, 3.4% being assigned to multiple subtypes including the correct one, and 5.6% being assigned to an incorrect or no subtype.

The consensus classifier trained on the entire data set correctly classified 526 out of the 546 training samples (95% CI = 515-532). The classifier failed to predict a subtype for as few as 17 patients in the design set (Figure 1). Only three of the patients were assigned an unexpected subtype.

All of the iAMP21 patients displayed high scores according to both the iAMP21 and HeH classifiers, while none of the patients with HeH obtained a high score in the iAMP21 classifier.

Blinded validation of ALL subtype classifiers

For independent validation of our DNA methylation consensus classifier, 39 ALL samples with known subtype that had not been used for the design of the classifiers were analyzed using the 450k BeadChip and subjected to blinded classification. In total, 36 of the 39 samples were accurately classified (Figure 2).

Classification of ALL samples with unknown cytogenetic risk group

We performed DNA methylation-based subtype classification of 210 ALL patient samples with no result (n=18), normal (n=87), or non-recurrent karyotype (n=105) (Figure 3A). In total, 106 of the 210 samples were assigned to one of the major cytogenetic groups with an estimated class probability of >0.50 for one single subtype (Figure 3B). Because all the iAMP21 patients obtained high scores from both the iAMP21 and HeH classifiers in the design set, we counted all patients with this pattern as iAMP21 only.

We assigned 50/105 patients in the non-recurrent group, 50/87 in the normal group, and 13/18 in the no result group to one of the subtypes. The distribution of the patients newly assigned from the normal and no result groups was similar to what could be expected in a pediatric ALL population (Figure 3C). The methylation profiles of the newly classified samples closely matched those of the group of original samples used to design the classifier, and these samples are therefore referred to hereafter as subtype-like.

A small group of 17 patients were classified into two or more groups and were defined as "multi-class". The most common was double classification as dic(9;20)-like and t(9;22)-like. Eighty-three patients (~10% of the entire cohort) did not have methylation patterns similar to any of the subtype groups and were labeled as "non-class".
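The grouping of samples of unknown subtype may be illustrated by the following sketch. It assumes that a sample is counted for a subtype when its estimated class probability exceeds 0.50, that "multi-class" means more than one subtype exceeding that threshold, and that "non-class" means none; the function name and the example probabilities are hypothetical.

```python
def assign_group(probabilities, threshold=0.50):
    """Map per-subtype class probabilities to a reported group.

    probabilities: dict of subtype name -> estimated class probability.
    """
    above = sorted(k for k, p in probabilities.items() if p > threshold)
    # iAMP21 samples also scored high in the HeH classifier in the design
    # set, so the combined pattern is counted as iAMP21 only.
    if "iAMP21" in above and "HeH" in above:
        above.remove("HeH")
    if not above:
        return "non-class"
    if len(above) == 1:
        return above[0] + "-like"
    return "multi-class"


print(assign_group({"HeH": 0.90, "iAMP21": 0.80, "T-ALL": 0.10}))
print(assign_group({"dic(9;20)": 0.70, "t(9;22)": 0.60}))
print(assign_group({"HeH": 0.30, "t(12;21)": 0.20}))
```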

Subtype verification

With the combination of karyotyping, fusion gene detection, and copy number analysis, evidence was found to support true subtype membership of 20% of the subtype-like patients (Table 3). As expected, this proportion was highest in the "normal" and "no result" groups. In 17% of the patients, evidence was lacking to either qualify or disqualify them as having the canonical aberration that defines their subtype. The remaining 64% of subtype-like patients were negative for the canonical rearrangements. In this group, different types of aberrations involving the same chromosomes and genes that convey similar effects on the DNA methylation levels of consensus CpG sites were found.

Table 3. Verification of DNA methylation-based classification by karyotype, copy number alterations, and expressed fusion genes.

              Confirmed subtype a)                          Subtype-like b)                                                        Unconfirmed subtype c)
Methylation   N            Karyotype d) RT-PCR e)  CNA f)   N            Karyotype d) Negative for       Novel fusion     CNA f)   N
subtype                                                                               fusion genes g)    gene h)
T-ALL*        0/1 (0%)     0            ND         ND       0/1 (0%)     0            NA                 ND               ND       1/1 (100%)
HeH*          7/25 (28%)   0            ND         7        18/25 (72%)  8            NA                 ND               13       0/25 (0%)
t(12;21)*     4/33 (12%)   1            4          ND       22/33 (67%)  0            19                 3                ND       7/33 (21%)
11q23/MLL*    3/4 (75%)    1            ND         3        1/4 (25%)    0            1                  ND               ND       0/4 (0%)
t(1;19)*      0/15 (0%)    0            ND         ND       9/15 (60%)   0            9                  0                1        6/15 (40%)
dic(9;20)*    3/19 (16%)   0            ND         3        16/19 (84%)  5            NA                 0                10       0/19 (0%)
t(9;22)*      0/5 (0%)     0            ND         ND       3/5 (60%)    0            3                  ND               ND       2/5 (40%)
iAMP21*       4/4 (100%)   4            ND         4        0/4 (0%)     0            NA                 ND               ND       0/4 (0%)

a) Newly classified patients with confirmed verification of their subtype group by at least one verification method.

b) Newly classified patients negative for the canonical event that defines their subtype and/or non-canonical event.

c) Newly classified patients without positive class membership for the recurrent subtype or subtype-like.

d) Number of patients with a chromosomal aberration observed in karyotyping results determined at diagnosis in NOPHO centers.

e) Number of patients positive for respective fusion genes by re-analysis by RT-PCR.

f) The number of patients with chromosomal aberrations discovered by array-based copy number alteration (CNA) that support subtype classification.

g) The number of subtype-like patients negative for canonical aberrations by targeted analyses (FISH/RT-PCR) at diagnosis.

h) The number of patients with novel (non-canonical) fusion genes detected.

NA, not applicable; ND, not determined; *, Methylation-based subtype group.

Clinical outcome of the newly classified ALL patients

The clinical features of the newly classified patients, including age, white blood cell count at diagnosis, and outcome were similar to those of the original patients with known subtypes (Table 4).

Table 4. Comparison of clinical observations for ALL patients of known subtype and newly classified ALL patients.

                                                                         % on treatment protocol
Subtype       N    Freq a)  Sex M:F  Age, mean ± sd WBC, mean ± sd RFS   Infant  SR   IR   HR
T-ALL         101  0.18     2.9      9.1 ± 4.6      152 ± 171      0.80  -       -    4    96
T-ALL*        1    <0.01    1.0      11.6           [illegible]    1.00  -       -    -    100
HeH           189  0.35     1.2      5.2 ± 3.7      16 ± 23        0.84  -       44   44   12
HeH*          25   0.12     2.1      5.7 ± 3.8      16 ± 21        0.72  -       44   44   12
t(12;21)      161  0.29     1.1      5.2 ± 2.8      19 ± 30        0.78  -       44   41   15
t(12;21)*     33   0.16     1.1      5 ± 3.1        16 ± 19        0.76  -       36   39   21
11q23/MLL     27   0.05     0.7      2.3 ± 3.8      230 ± 254      0.59  52      -    4    44
11q23/MLL*    4    0.02     4.0      1 ± 0.6        121 ± 113      0.50  25      -    25   50
t(1;19)       21   0.04     0.8      9.2 ± 4.6      30 ± 37        0.86  -       -    14   86
t(1;19)*      15   0.07     0.9      8.2 ± 4.2      18 ± 36        0.93  -       7    67   27
dic(9;20)     20   0.04     0.4      4 ± 3.9        63 ± 76        0.75  -       20   25   55
dic(9;20)*    19   0.09     1.7      7.8 ± 5.7      19 ± 38        0.79  -       21   63   16
iAMP21*       4    0.02     0.3      10.4 ± 3.7     9 ± 4          0.50  -       25   50   25
Multi-class   17   0.03     0.89     11.1 ± 5.1     22 ± 31        0.93  0       6    65   29
Non-class     83   0.15     1.52     8.9 ± 4.7      41 ± 62        0.79  4       17   42   36

a) Proportion of total number of patients.

WBC, white blood cell count; RFS, relapse free survival; SR, Standard Risk; IR, Intermediate Risk; HR, High Risk; *, Patients with subtype determined by DNA methylation-based classification.

The relapse free survival (RFS) of the newly classified patients and of the patients of established subtype was evaluated with Kaplan-Meier estimates and Gray's test; no significant differences were detected between the newly classified and previously established patient groups.

While the invention has been described in detail and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope thereof.