Title:
NOVEL BLOOD-DERIVED MARKERS FOR THE DETECTION OF CANCER
Document Type and Number:
WIPO Patent Application WO/2019/081507
Kind Code:
A1
Abstract:
The present invention relates to the field of pharmacogenomics and in particular to detecting the presence of CpG methylation and miRNAs in blood for the detection of cancer. This detection is useful for a minimally invasive diagnosis of cancer as well as for monitoring cancer treatment and assessing treatment response. The invention provides methods suitable for this purpose.

Inventors:
BURWINKEL BARBARA (DE)
CAO XUE (CN)
TANG QUIQIONG (DE)
Application Number:
PCT/EP2018/079035
Publication Date:
May 02, 2019
Filing Date:
October 23, 2018
Assignee:
UNIV HEIDELBERG (DE)
International Classes:
C12Q1/6886
Domestic Patent References:
WO2008134596A22008-11-06
WO2016135168A12016-09-01
Foreign References:
EP2210954A12010-07-28
Other References:
SO YEON PARK ET AL: "Alu and LINE-1 Hypomethylation Is Associated with HER2 Enriched Subtype of Breast Cancer", PLOS ONE, vol. 9, no. 6, 27 June 2014 (2014-06-27), pages e100429, XP055461522, DOI: 10.1371/journal.pone.0100429
JIMIN MIN ET AL: "Methylation Levels of LINE-1 As a Useful Marker for Venous Invasion in Both FFPE and Frozen Tumor Tissues of Gastric Cancer", MOLECULES AND CELLS, vol. 40, no. 5, 1 January 2017 (2017-01-01), KR, pages 346 - 354, XP055461923, ISSN: 1016-8478, DOI: 10.14348/molcells.2017.0013
YOON HEE CHO ET AL: "Aberrant promoter hypermethylation and genomic hypomethylation in tumor, adjacent normal tissues and blood from breast cancer patients", ANTICANCER RESEARCH, vol. 30, no. 7, 1 July 2010 (2010-07-01), pages 2489 - 2496, XP055461516
"Helvetica Chimica Acta", article "A multilingual glossary of biotechnological terms: (IUPAC Recommendations)"
FRIGOLA ET AL., NATURE GENETICS, vol. 38, no. 5, 2006
TANG ET AL., ONCOTARGET, vol. 7, no. 39, 2016, pages 64191 - 64202
YANG ET AL., INT J CANCER, vol. 136, no. 8, 2015, pages 1845 - 55
MILLER ET AL., CANCER, vol. 47, no. 1, 1981, pages 207 - 14
SOBIN ET AL.: "International Union Against Cancer (UICC), TNM Classification of Malignant tumors", 2002, SPRINGER, pages: 191 - 203
DOWDY; WEARDEN: "Statistics for Research", 1983, JOHN WILEY & SONS
BARON; VALIN, REC. MED. VET, SPECIAL CANE, vol. 11, no. 166, 1990, pages 999 - 1007
TANG ET AL., ONCOTARGET, vol. 7, 2016, pages 64191 - 64202
BRENNAN ET AL., CANCER RES, vol. 72, no. 9, 2012, pages 2304 - 2313
XU ET AL., FASEB J, vol. 26, no. 6, 2012, pages 2657 - 2666
CHOI ET AL., CARCINOGENESIS, vol. 30, no. 11, 2009, pages 1889 - 1897
KITKUMTHORN ET AL., CLIN CHIM ACTA, vol. 413, no. 9-10, 2012, pages 869 - 874
CHO ET AL., ANTICANCER RES, vol. 30, no. 7, 2010, pages 2489 - 2496
WU ET AL., CARCINOGENESIS, vol. 33, no. 10, 2012, pages 1946 - 1952
Attorney, Agent or Firm:
ZWICKER, Jörk (DE)
Claims:
CLAIMS

1. A method for detecting the presence or absence of cancer in a subject, comprising the step of determining the level of cytosine methylation of at least one CpG dinucleotide selected from the group consisting of LINE1 CpG dinucleotides 1, 3, 4, 5, 9, 12, and 14 of SEQ ID NO: 1, CpG dinucleotides within SEQ ID NO: 3 that are co-methylated with LINE1 CpG dinucleotide 1 of SEQ ID NO: 1, and Alu CpG dinucleotides 13 and 14 of SEQ ID NO: 2 in genomic DNA from a sample of the subject, wherein hypomethylation is indicative of the presence of the cancer and the lack thereof is indicative of the absence of the cancer.

2. The method of claim 1, wherein the cancer is breast cancer, ovarian cancer or pancreatic cancer, preferably breast cancer.

3. The method of claim 1 or 2, wherein the level of cytosine methylation of LINE1 CpG dinucleotide 1 of SEQ ID NO: 1 is determined.

4. The method of any one of claims 1 to 3, wherein the level of cytosine methylation of Alu CpG dinucleotide 13 of SEQ ID NO: 2 is determined.

5. The method of any one of claims 1 to 4, further comprising detecting the level of cytosine methylation of at least one further CpG dinucleotide selected from the group consisting of LINE1 CpG sites 2, 15, 16 and 17 of SEQ ID NO: 1, and/or of at least one further CpG dinucleotide selected from the group consisting of Alu CpG dinucleotides 1, 2, 11 and 12 of SEQ ID NO: 2.

6. The method of any one of claims 1 to 5, wherein the level of cytosine methylation of LINE1 CpG dinucleotides 1, 2, 14, 16 and 17 of SEQ ID NO: 1 and of Alu CpG dinucleotides 1, 2, 11, 12 and 14 of SEQ ID NO: 2 is determined.

7. The method of any one of claims 1 to 6, further comprising determining the amount of at least one miRNA selected from the group consisting of miR-328, miR-320, miR-145, miR-339-3p, and miR-193a-3p in the sample, wherein an increased amount of miR-328, miR-145 and/or miR-339-3p, and/or a decreased amount of miR-320 and/or miR-193a-3p is indicative of the presence of the cancer, and wherein a decreased or normal amount of miR-328, miR-145 and/or miR-339-3p, and/or an increased or normal amount of miR-320 and/or miR-193a-3p is indicative of the absence of the cancer.

8. The method of any one of claims 1 to 7, further comprising (i) determining the level of cytosine methylation of at least one CpG dinucleotide within and/or gene expression of at least one gene selected from the group consisting of HYAL2, MGRN1, RPTOR, SLC22A18, FUT7, RAPSN, S100P, and DYRK4 in the genomic DNA, wherein hypomethylation and/or increased expression is indicative of the presence of the cancer and the lack of hypomethylation and/or a decreased or normal expression is indicative of the absence of the cancer; and/or

(ii) determining the amount of at least one RNA selected from the group consisting of miR-652, spliceosomal RNA U11 or a fragment thereof, miR-376c, miR-376a, miR-127-3p, miR-409-3p and miR-148b in the sample, wherein an increased amount of the at least one RNA is indicative of the presence of the cancer, and/or a decreased or normal amount is indicative of the absence of the cancer.

9. The method of any one of claims 1 to 8, wherein the subject has an increased risk of having or developing the cancer.

10. The method of any one of claims 1 to 9, wherein the sample is a body fluid sample or a tissue sample, wherein the body fluid sample is preferably selected from the group consisting of blood, serum, plasma, synovial fluid, urine, saliva, lymphatic fluid, lacrimal fluid, and fluid obtainable from the glands, and more preferably is peripheral blood.

11. A method for diagnosing cancer or for screening for cancer, comprising detecting the cancer according to any one of claims 1-10.

12. A method for monitoring a subject having an increased risk of developing cancer, comprising detecting the cancer according to any one of claims 1-10 repeatedly.

13. A method for monitoring cancer treatment of a subject, comprising detecting the cancer according to any one of claims 1-10 repeatedly across the treatment period.

14. A method for assessing the response of a subject to a cancer treatment, comprising detecting the cancer according to any one of claims 1-10 during and/or after the treatment.

15. A method for treating a subject having cancer detected according to the method according to any one of claims 1-10, comprising administering a cancer therapy to the subject.

Description:
NOVEL BLOOD-DERIVED MARKERS FOR THE DETECTION OF CANCER

FIELD OF THE INVENTION

The present invention relates to the field of pharmacogenomics and in particular to detecting the presence of CpG methylation and miRNAs in blood for the detection of cancer. This detection is useful for a minimally invasive diagnosis of cancer as well as for monitoring cancer treatment and assessing treatment response. The invention provides methods suitable for this purpose.

BACKGROUND OF THE INVENTION

In 2015, about 90.5 million people had cancer. About 14.1 million new cases occur a year (not including skin cancer other than melanoma). Cancer caused about 8.8 million deaths (15.7% of human deaths). In 2015, ovarian cancer was present in 1.2 million women and resulted in 161,100 deaths worldwide. Among women it is the seventh-most common cancer and the eighth-most common cause of death from cancer. Also in 2015, pancreatic cancers of all types resulted in 411,600 deaths globally. Pancreatic cancer is the fifth most common cause of death from cancer in the United Kingdom, and the fourth most common in the United States. Breast cancer is one of the most common cancers and the leading cause of cancer death among women worldwide. While these cancers can be diagnosed, further diagnostic approaches are needed that are less invasive and more reliable. For example, although screening mammography has been critical to the decline in breast cancer mortality, the limitations of mammography are well recognized, especially for young women with dense breast tissue. Therefore, other approaches are urgently needed for breast cancer detection and screening.

DNA methylation is a type of epigenetic alteration which plays an important role in cancer development. Promoter hypermethylation of tumor suppressor genes and global hypomethylation leading to malignancy have been studied extensively in different cancer types. Global DNA hypomethylation is a hallmark of most cancers, including breast cancer. This DNA hypomethylation has been proposed to activate oncogenes, induce genomic instability and promote chromosome instability. Genome-wide DNA hypomethylation originates from the decrease of 5-methyldeoxycytosine (5-mdC) in dinucleotide CpG sites throughout the genome. Most 5-mdC sites are located in repetitive sequences, which account for approximately half of the human genome, and those repetitive DNA sequences are highly methylated in normal cells. There are several different categories of repetitive sequences dispersed throughout the genome, such as long interspersed nuclear elements, short interspersed nuclear elements and satellite repeats. LINE1, a long interspersed nuclear element, is scattered throughout the genome and accounts for about 17% of it. Alu is a short interspersed repetitive sequence that contributes almost 11% of the human genome. The DNA methylation of repetitive elements has been associated with global DNA methylation and used as a marker for global methylation status by some investigators. Furthermore, global hypomethylation in peripheral blood DNA has been considered as a risk factor for many tumors, such as colorectal, bladder as well as head and neck cancer.

MiRNAs are small, non-coding RNAs (~18-25 nucleotides in length) that regulate gene expression on a post-transcriptional level by degrading mRNA molecules or blocking their translation. Hence, they play an essential role in the regulation of a large number of biological processes, including cancer. MiRNAs can be found in body fluids like blood. Such circulating miRNAs have been reported as aberrantly expressed in blood plasma or serum in different types of cancer, e.g. prostate, colorectal or esophageal carcinoma. They are relatively stable and can be measured repeatedly in a minimally invasive manner.

The inventors found that LINE1 and Alu methylation as well as several miRNAs can be used as blood markers for the detection of cancer, in particular of breast, ovarian and pancreatic cancer.

SUMMARY OF THE INVENTION

In a first aspect, the present invention relates to a method for detecting the presence or absence of cancer in a subject, comprising the step of determining the level of cytosine methylation of at least one CpG dinucleotide selected from the group consisting of LINE1 CpG dinucleotides 1, 3, 4, 5, 9, 12, and 14 of SEQ ID NO: 1, CpG dinucleotides within SEQ ID NO: 3 that are co-methylated with LINE1 CpG dinucleotide 1 of SEQ ID NO: 1, and Alu CpG dinucleotides 13 and 14 of SEQ ID NO: 2 in genomic DNA from a sample of the subject, wherein hypomethylation is indicative of the presence of the cancer and the lack thereof is indicative of the absence of the cancer.

In a second aspect, the present invention relates to a method for diagnosing cancer or for screening for cancer, comprising detecting the cancer according to the first aspect.

In a third aspect, the present invention relates to a method for monitoring a subject having an increased risk of developing cancer, comprising detecting the cancer according to the first aspect repeatedly.

In a fourth aspect, the present invention relates to a method for monitoring cancer treatment of a subject, comprising detecting the cancer according to the first aspect repeatedly across the treatment period.

In a fifth aspect, the present invention relates to a method for assessing the response of a subject to a cancer treatment, comprising detecting the cancer according to the first aspect during and/or after the treatment.

In a sixth aspect, the present invention relates to a method for treating a subject having cancer detected according to the method of the first aspect, comprising administering a cancer therapy to the subject.

LEGENDS TO THE FIGURES

Figure 1: Box plot of LINE1.

Figure 2: ROC curve for LINE1_CpG_1 methylation (A) and LINE1 mean methylation (B).

Figure 3: Box plot of Alu_CpG_13 and Alu_CpG_14.

Figure 4: ROC curve for Alu_CpG_13 methylation (A) and Alu_CpG_14 methylation (B). ROC curve was calculated by logistic regression.

Figure 5: ROC curve for the CpG dinucleotides of Table 11.

Figure 6: ROC curve for LINE1_CpG_1 and Alu_CpG_13.

Figure 7: ROC curve for CpG dinucleotides of Table 13.

DETAILED DESCRIPTION OF THE INVENTION

Before the present invention is described in detail below, it is to be understood that this invention is not limited to the particular methodology, protocols and reagents described herein as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention, which will be limited only by the appended claims. Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art.

Preferably, the terms used herein are defined as described in "A multilingual glossary of biotechnological terms: (IUPAC Recommendations)", Leuenberger, H.G.W., Nagel, B. and Kölbl, H. eds. (1995), Helvetica Chimica Acta, CH-4010 Basel, Switzerland.

Several documents are cited throughout the text of this specification. Each of the documents cited herein (including all patents, patent applications, scientific publications, manufacturers' specifications, instructions etc.), whether supra or infra, is hereby incorporated by reference in its entirety. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

In the following, the elements of the present invention will be described. These elements are listed with specific embodiments; however, it should be understood that they may be combined in any manner and in any number to create additional embodiments. The variously described examples and preferred embodiments should not be construed to limit the present invention to only the explicitly described embodiments. This description should be understood to support and encompass embodiments which combine the explicitly described embodiments with any number of the disclosed and/or preferred elements. Furthermore, any permutations and combinations of all described elements in this application should be considered disclosed by the description of the present application unless the context indicates otherwise.

Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", are to be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps. As used in this specification and the appended claims, the singular forms "a", "an", and "the" include plural referents, unless the content clearly dictates otherwise.

Aspects of the invention and particular embodiments thereof

In a first aspect, the present invention relates to a method for detecting the presence or absence of cancer in a subject, comprising the step of determining the level of cytosine methylation of at least one CpG dinucleotide selected from the group consisting of LINE1 CpG dinucleotides 1, 3, 4, 5, 9, 12, and 14 of SEQ ID NO: 1, CpG dinucleotides within SEQ ID NO: 3 that are co-methylated with LINE1 CpG dinucleotide 1 of SEQ ID NO: 1, and Alu CpG dinucleotides 13 and 14 of SEQ ID NO: 2 in genomic DNA from a sample of the subject, wherein hypomethylation is indicative of the presence of the cancer and the lack thereof is indicative of the absence of the cancer.

All of the above CpG dinucleotides can be used as univariate markers or as multivariate markers.

In a preferred embodiment of the method, the level of cytosine methylation of Alu CpG dinucleotide 13 of SEQ ID NO: 2 is determined. The inventors have shown that this is one of the most distinctive CpG sites and useful in particular as a univariate marker. In the most preferred embodiment of the method, the level of cytosine methylation of LINE1 CpG dinucleotide 1 of SEQ ID NO: 1 is determined. The inventors have shown that this is the most distinctive CpG site and useful as a univariate marker and a multivariate marker irrespective of other markers used in combination.

Other embodiments include determining the level of cytosine methylation of at least one (including at least two, at least three, at least four, at least five, at least six or seven) CpG dinucleotide selected from the group consisting of LINE1 CpG dinucleotides 1, 3, 4, 5, 9, 12, and 14 of SEQ ID NO: 1 and/or of at least one (including two) CpG dinucleotide selected from the group consisting of Alu CpG dinucleotides 13 and 14 of SEQ ID NO: 2.

In a further preferred embodiment, the above methods of the first aspect determining the methylation level of one or more LINE1 CpG dinucleotides further comprise determining the level of cytosine methylation of at least one (including at least two, at least three or of four) further CpG dinucleotide selected from the group consisting of LINE1 CpG sites 2, 15, 16 and 17 of SEQ ID NO: 1.

In a further preferred embodiment, the above methods of the first aspect determining the methylation level of one or more Alu CpG dinucleotides further comprise determining the level of cytosine methylation of at least one (including at least two, at least three or of four) further CpG dinucleotide selected from the group consisting of Alu CpG dinucleotides 1, 2, 11 and 12 of SEQ ID NO: 2.

In a further embodiment, the above methods of the first aspect comprising determining the methylation level of both one or more LINE1 and one or more Alu CpG dinucleotides comprise determining the level of cytosine methylation of LINE1 CpG dinucleotides 1, 2, 14, 16 and 17 of SEQ ID NO: 1 and of Alu CpG dinucleotides 1, 2, 11, 12 and 14 of SEQ ID NO: 2.

Generally, it is preferred that if

- the methylation level of one of LINE1 CpG dinucleotides 3, 4 or 5 is determined, the methylation level of one or two of the remaining two is also determined, i.e. of 3 and 4, 3 and 5, 4 and 5 or 3 to 5;

- the methylation level of one of LINE1 CpG dinucleotides 16 or 17 is determined, the methylation level of the other one is also determined, i.e. of 16 and 17;

- the methylation level of one of Alu CpG dinucleotides 11 or 12 is determined, the methylation level of the other one is also determined, i.e. of 11 and 12; and/or

- the methylation level of one of Alu CpG dinucleotides 15, 16 or 17 is determined, the methylation level of one or two of the remaining two is also determined, i.e. of 15 and 16, 15 and 17, 16 and 17 or 15 to 17.

The methylation levels of the CpG dinucleotides of these groups are preferably summarized, e.g. as a mean, for each group. As commonly known in the art, the methylation profiles of CpG sites within a certain genomic distance are not independent. CpG sites usually have a correlated methylation state, i.e., they are co-methylated. Mainly two factors influence the methylation behavior within a short genomic distance: the association of CpGs to the same biologically functional unit, e.g., gene, and mechanisms for methylation and demethylation addressing whole stretches, e.g., SssI methylase binding the DNA at a certain position, wandering along the strand, processing a part and then detaching. As a result, co-methylation can occur over short and long distances. Blocks of co-methylated CpGs reach sizes of about 2000 bp in healthy cells, but in cancer much longer regions have been observed to be co-methylated (see Frigola et al., Nature Genetics, vol. 38, no. 5, 2006). See also Tang et al., Oncotarget. 2016;7(39):64191-64202 and Yang et al., Int J Cancer. 2015;136(8):1845-55. Accordingly, the method of the first aspect also includes co-methylated CpG dinucleotides in particular of LINE1 CpG dinucleotide 1 of SEQ ID NO: 1. In a further embodiment thereof, the CpG dinucleotides within SEQ ID NO: 3 that are co-methylated with LINE1 CpG dinucleotide 1 of SEQ ID NO: 1 (which has the positions 1140/1141 in SEQ ID NO: 3) are limited to CpG dinucleotides within a range of 1000 nucleotides downstream and upstream of that specific CpG dinucleotide (i.e. positions 140 to 2141), a range of 500 nucleotides downstream and upstream of that specific CpG dinucleotide (i.e. positions 640 to 1641), or a range of 100 nucleotides downstream and upstream of that specific CpG dinucleotide (i.e. positions 1040 to 1241) of SEQ ID NO: 3.

"Co-methylated" generally refers to having the same methylation status (i.e. not methylated or methylated) and with respect to the invention having the same methylation status in the genomic DNA according to the first aspect in either subjects having the cancer or not having the cancer. With respect to the invention, the term "co-hypomethylated" can also be used to define such CpG dinucleotides. In a preferred embodiment, co-hypomethylated refers to a methylation level that is substantially the same as the LINE1 CpG dinucleotide 1 of SEQ ID NO: 1 methylation level. "Substantially the same" may comprise a variation of +/- 0.1, preferably +/- 0.05 of methylation levels between 0 and 1. Preferably, "substantially the same" refers to the absence of a statistically significant difference. For a description of statistical significance and suitable confidence intervals and p values, see below. Co-methylation can be determined without undue burden by the skilled person, e.g. by NGS sequencing.
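
The window arithmetic and the "substantially the same" tolerance described above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical and not part of the application, and the sketch assumes the anchor CpG occupies positions 1140/1141 of SEQ ID NO: 3 as stated in the text. For flanks of 1000, 500 and 100 nucleotides it yields positions 140 to 2141, 640 to 1641 and 1040 to 1241, respectively.

```python
# Illustrative sketch only; names are hypothetical and not taken from the application.
ANCHOR_START, ANCHOR_END = 1140, 1141  # LINE1 CpG dinucleotide 1 within SEQ ID NO: 3


def comethylation_window(flank: int) -> tuple[int, int]:
    """Return (first, last) position within `flank` nucleotides of the anchor CpG."""
    return ANCHOR_START - flank, ANCHOR_END + flank


def substantially_same(level: float, anchor_level: float, tolerance: float = 0.1) -> bool:
    """True if two methylation levels (scale 0 to 1) differ by at most `tolerance`."""
    # Rounding guards against floating-point noise at the tolerance boundary.
    return round(abs(level - anchor_level), 9) <= tolerance


if __name__ == "__main__":
    for flank in (1000, 500, 100):
        print(flank, comethylation_window(flank))
    print(substantially_same(0.78, 0.72))        # within +/- 0.1
    print(substantially_same(0.78, 0.72, 0.05))  # outside +/- 0.05
```

The tolerance check is only the simple variant; the preferred definition in the text (absence of a statistically significant difference) would need a statistical test on replicate measurements instead.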

The cancer preferably is breast cancer, ovarian cancer or pancreatic cancer, most preferably breast cancer.

The subject is preferably selected from the group consisting of laboratory animals (e.g. mouse or rat), domestic animals (including e.g. guinea pig, rabbit, horse, donkey, cow, sheep, goat, pig, chicken, camel, cat, dog, turtle, tortoise, snake, or lizard), or primates including chimpanzees, bonobos, gorillas, and humans. Humans are particularly preferred. In the case of breast cancer or ovarian cancer, it is preferred that the subject is female. In a preferred embodiment, the subject has an increased risk of having or developing the cancer. In other words, the subject has at least one risk factor for having or developing the cancer.

In this embodiment with respect to breast cancer, the subject has for example at least one risk factor selected from the group consisting of female gender, age of 45 or older (in particular 50 or 55 or older), having a risk-increasing inherited syndrome or predisposition (such as mutation of the BRCA1, BRCA2, ATM, TP53, CHEK2, PTEN, CDH1, STK11 and/or PALB2 gene), family history of breast cancer (in particular first degree relatives), personal history of breast cancer in another part of the breast or in the other breast, ethnicity (in particular White), dense breast tissue (in particular a dense breast on mammogram 1.2 to 2 times the average breast density of women), benign breast conditions (such as (i) non-proliferative lesions including fibrosis and/or simple cysts, mild hyperplasia, adenosis (non-sclerosing), ductal ectasia, Phyllodes tumor (benign), single papilloma, fat necrosis, periductal fibrosis, squamous and apocrine metaplasia, and epithelial-related calcifications, or (ii) proliferative lesions without atypia including usual ductal hyperplasia (without atypia), fibroadenoma, sclerosing adenosis, papillomas (in particular papillomatosis), and radial scar, or (iii) proliferative lesions with atypia including atypical ductal hyperplasia (ADH) and atypical lobular hyperplasia (ALH)), lobular carcinoma in situ, above average number of menstrual periods (e.g. due to menstruation before age of 12 and/or menopause after age of 55), previous chest radiation, diethylstilbestrol exposure, not having children or having the first child after the age of 30 (in particular combined with the age risk factor defined above), use of oral contraceptives or depot-medroxyprogesterone acetate, hormone therapy after menopause, heavy alcohol use (more than 3 or 4 alcohol units a day for men, or more than 2 or 3 alcohol units a day for women; an alcohol unit is defined as 10 ml (8 g) of pure alcohol), being overweight or obese, and physical inactivity.

With respect to ovarian cancer, the subject has for example at least one risk factor selected from the group consisting of age of 40 or older (in particular of 63 and older and more particular after menopause), female gender, obesity, first full-term pregnancy after age 35 or no pregnancy, no use of birth control pills (in particular never or not in the last 3, 4 or 6 months), exposure to fertility drugs (in particular clomiphene citrate), exposure to androgens, estrogen therapy and/or hormone therapy, family history of ovarian cancer, breast cancer, or colorectal cancer (in particular in first degree relatives), having an inherited cancer syndrome or predisposition (such as hereditary breast and ovarian cancer syndrome, PTEN tumor hamartoma syndrome, hereditary nonpolyposis colon cancer, Peutz-Jeghers syndrome, or MUTYH-associated polyposis), personal history of breast cancer, use of talcum powder applied directly to the genital area, high fat diet, use of analgesics, and tobacco consumption (in particular smoking).

With respect to pancreatic cancer, the subject has for example at least one risk factor selected from the group consisting of male gender, age of 45 or older (in particular 55 or older or 65 or older), Asian, Hispanic or White origin, tobacco consumption (in particular smoking), heavy alcohol use (as defined above), being overweight or obese, family history of pancreatic cancer (in particular at least two or at least three first degree relatives, e.g. parent, child or sibling, with pancreatic cancer), an inherited condition selected from the group consisting of Hereditary pancreatitis (HP), Peutz-Jeghers syndrome (PJS), Familial malignant melanoma and pancreatic cancer (FAMM-PC), Hereditary breast and ovarian cancer (HBOC) syndrome, Lynch syndrome, Li-Fraumeni syndrome (LFS), and Familial adenomatous polyposis (FAP); chronic pancreatitis, infection with H. pylori or hepatitis B, and liver cirrhosis.

The sample can be a body fluid sample or a tissue sample. The body fluid sample can be selected from the group consisting of blood, serum, plasma, synovial fluid, urine, saliva, lymphatic fluid, lacrimal fluid, and fluid obtainable from the glands such as e.g. breast. Preferably, the sample comprises blood cells, so it can be blood (e.g. whole blood) or a blood-derived sample, but is not limited thereto. Preferred is peripheral blood (e.g. whole peripheral blood) or a sample derived therefrom. Accordingly, it is preferred that the genomic DNA is DNA from blood cells, more preferably peripheral blood cells.

Methods for determining the level of cytosine methylation are well-known in the art and the method of the first aspect is not limited to any particular method for determining the level of cytosine methylation. Exemplary methods are COBRA, restriction ligation-mediated PCR, Ms-SNuPE, ion-pair reverse-phase high performance liquid chromatography, denaturing high performance liquid chromatography, any bisulfite sequencing method, e.g. direct bisulfite sequencing with the Sanger method or sequencing methods of the 2nd or 3rd generation (NexGen sequencing, NGS), or any pyrosequencing method, DNA sequencing methods that can per se distinguish between methylated and unmethylated cytosines (e.g. using Nanopores and/or enzymes used as sensors), digital sequencing, mass spectrometry (e.g. MALDI-TOF), QM™ or quantitative real-time PCR, preferably MethyLight™ or HeavyMethyl™, or methylation sensitive restriction enzyme analysis, and any other quantitative methylation determination technique including nanotechnology approaches.

In a preferred embodiment, detecting cytosine methylation comprises converting, in the genomic DNA, cytosine unmethylated in the 5-position to uracil or another base that does not hybridize to guanine, preferably by bisulfite conversion.
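
The effect of the conversion step can be illustrated in silico. The following Python sketch is a toy model under stated assumptions, not a laboratory protocol and not the applicants' implementation: the function name and the 0-based position convention are hypothetical, and the model simply treats every cytosine that is not listed as methylated as converted to uracil.

```python
# Toy in-silico model of bisulfite conversion; names and conventions are assumptions.
def bisulfite_convert(sequence: str, methylated_positions: set[int]) -> str:
    """Replace each cytosine not listed as methylated with uracil (0-based positions)."""
    return "".join(
        "U" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(sequence.upper())
    )


if __name__ == "__main__":
    # The C at index 1 is methylated and is retained; the C at index 4 is converted.
    print(bisulfite_convert("ACGTCG", {1}))
```

After conversion, a retained C at a CpG position marks a methylated cytosine, which is what the downstream quantification methods listed above read out.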

The skilled person will understand that LINE1 and Alu are repetitive elements, so the genomic DNA contains multiple copies, and that each CpG dinucleotide is not necessarily methylated uniformly in all copies. Therefore, the method of the first aspect determines methylation levels rather than simply an absence or presence of methylation. Accordingly, the cytosine methylation level of each of the at least one CpG dinucleotides represents the cytosine methylation of at least a representative large part (depending on the method) or all copies of each of the at least one CpG dinucleotides in the sample. In a preferred embodiment, the cytosine methylation level of each of the at least one CpG dinucleotides is the mean cytosine methylation of these copies of each of the at least one CpG dinucleotides in the sample. In specific embodiments thereof, hypomethylation is a mean methylation that is within the three lower quartiles, preferably the two lower quartiles and more preferably the lowest quartile of the mean methylation of a cancer-positive control.
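
A minimal Python sketch of this calculation follows. It assumes that per-copy methylation calls are available as numbers between 0 and 1 and that the cancer-positive control is given as a list of mean methylation values, one per control subject; the function names and the choice of the "inclusive" quantile method are assumptions made for illustration, since the text does not prescribe how quartiles are computed.

```python
from statistics import mean, quantiles


def mean_methylation(copy_levels: list[float]) -> float:
    """Mean methylation of one CpG dinucleotide over the measured copies (0 to 1)."""
    return mean(copy_levels)


def within_lower_quartiles(
    sample_mean: float, control_means: list[float], lower_quartiles: int = 1
) -> bool:
    """True if `sample_mean` is at or below the chosen quartile boundary of the
    cancer-positive control values (1 = lowest quartile, 2 = two lower, 3 = three lower)."""
    if lower_quartiles not in (1, 2, 3):
        raise ValueError("lower_quartiles must be 1, 2 or 3")
    cuts = quantiles(control_means, n=4, method="inclusive")  # [Q1, Q2, Q3]
    return sample_mean <= cuts[lower_quartiles - 1]


if __name__ == "__main__":
    positive_controls = [0.70, 0.74, 0.78, 0.82, 0.86]  # made-up illustration values
    print(within_lower_quartiles(0.72, positive_controls))      # lowest quartile
    print(within_lower_quartiles(0.80, positive_controls, 3))   # three lower quartiles
```

The control values in the usage example are invented for demonstration and do not come from the application.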

In a particular embodiment, in which the methylation levels of several CpG dinucleotides are summarized, the method of the first aspect can comprise determining the mean cytosine methylation for each of at least two CpG dinucleotides (or at least three, four, etc.), and further comprise determining the overall mean of the mean cytosine methylation of each of the at least two CpG dinucleotides. Examples of this embodiment are shown in Tables 5 and 8.
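
The two-step summary (a mean per CpG dinucleotide, then an overall mean of those means) can be sketched as follows. The function name, the dictionary layout and the example values are hypothetical and serve only to make the arithmetic concrete.

```python
from statistics import mean


def overall_mean_methylation(levels_by_cpg: dict[str, list[float]]) -> float:
    """Average the per-copy levels of each CpG, then average those per-CpG means."""
    per_cpg_means = [mean(levels) for levels in levels_by_cpg.values()]
    return mean(per_cpg_means)


if __name__ == "__main__":
    example = {
        "LINE1_CpG_1": [0.8, 0.6],  # per-CpG mean 0.7
        "LINE1_CpG_2": [0.9, 0.9],  # per-CpG mean 0.9
    }
    print(overall_mean_methylation(example))
```

Taking the mean of the per-CpG means gives each CpG dinucleotide equal weight even when the number of measured copies differs between sites, which a single pooled mean would not.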

In another embodiment, the method of the first aspect comprises the step of comparing the cytosine methylation level to a cancer-negative control methylation level of the same at least one CpG dinucleotide. The cancer-negative control methylation preferably is that of one or more control subjects not having the cancer. Accordingly, the cancer-negative control methylation level is the base-line methylation level and all lower methylation levels are considered as hypomethylation. Preferably, hypomethylation is a lower mean methylation. A lack of hypomethylation preferably is at least the level of methylation of a subject not having the cancer. In addition or as an alternative to a cancer-negative control, the method of the first aspect may comprise the step of comparing the cytosine methylation level to a cancer-positive control methylation level of the same at least one CpG dinucleotide, wherein a lower or equal methylation level is indicative of the presence of the cancer. Also, a higher methylation level can be taken as indicative of the absence of the cancer. The cancer-positive control methylation is preferably that of a plurality of control subjects having the cancer.

In preferred embodiments, the step of comparing comprises an age adjustment (i.e. a subject is compared only to controls of the same or a similar age, e.g. +/- 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 years), a cell type adjustment (i.e. only the same cell type of several comprised in the sample is compared), or both.
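
As a hedged sketch of the age adjustment, the comparison may be restricted to cancer-negative controls within a chosen age window, and a level below the mean of those controls may be treated as hypomethylation. The record layout, the function names, the default window of 5 years and all values below are assumptions made for illustration only.

```python
from statistics import mean


def age_matched(controls, subject_age, window_years=5):
    """Keep controls whose age lies within +/- window_years of the subject."""
    return [c for c in controls if abs(c["age"] - subject_age) <= window_years]


def is_hypomethylated(subject_level, subject_age, controls, window_years=5):
    """Compare a subject only to cancer-negative controls of similar age."""
    matched = age_matched(controls, subject_age, window_years)
    if not matched:
        raise ValueError("no control within the age window")
    # The mean of the matched controls serves as the base-line level.
    baseline = mean(c["level"] for c in matched)
    return subject_level < baseline


# Hypothetical cancer-negative controls (age in years, mean methylation).
controls = [
    {"age": 45, "level": 0.86},
    {"age": 52, "level": 0.84},
    {"age": 70, "level": 0.78},
]
```

With these hypothetical controls, a level of 0.80 counts as hypomethylation for a 48-year-old subject but not for a 70-year-old subject, which shows why the age window matters.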

Preferably, the method of the first aspect comprises a normalisation of the methylation level, e.g. using the methylation level of one or more CpG dinucleotides that are not differentially methylated between subjects having and not having the cancer. An exemplary gene comprising such CpG sites is actin-B. The skilled person will appreciate that absolute mean methylation values for the CpG dinucleotides may vary depending on factors such as the detection method or the normalisation used. Therefore, it is difficult to set general absolute methylation level thresholds that are valid for all conditions. Nevertheless, it is preferred that if the cytosine methylation level of LINE1 CpG dinucleotide 1 of SEQ ID NO: 1 is determined, a mean methylation of LINE1 CpG dinucleotide 1 of SEQ ID NO: 1 of 0.82 or lower, preferably 0.80 or lower or more preferably 0.76 or lower, as would be determined by MALDI-TOF with age normalisation, is indicative of the presence of the cancer. Further, it is preferred that if the cytosine methylation level of LINE1 CpG dinucleotides 1, 2, 3, 4, 5, 9, 12, 14, 15, 16 and 17 of SEQ ID NO: 1 is determined, a mean methylation of LINE1 CpG dinucleotides 1, 2, 3, 4, 5, 9, 12, 14, 15, 16 and 17 of SEQ ID NO: 1 of 0.877 or lower, preferably 0.860 or lower or more preferably 0.845 or lower, as would be determined by MALDI-TOF with age normalisation, is indicative of the presence of the cancer. Further, it is preferred that if the cytosine methylation level of Alu CpG dinucleotide 13 of SEQ ID NO: 2 is determined, a mean methylation of Alu CpG dinucleotide 13 of SEQ ID NO: 2 of 0.65 or lower, preferably 0.62 or lower, as would be determined by MALDI-TOF with age normalisation, is indicative of the presence of the cancer. It is to be understood that the setting of these threshold values does not require that the method of the first aspect comprises determining methylation levels by MALDI-TOF and/or an age normalisation; it merely defines thresholds and may require determining concordance values for a different detection method and normalisation that is to be used. The above thresholds for LINE1 CpG dinucleotide 1 of SEQ ID NO: 1 may also be used if the methylation level of a co-methylated CpG dinucleotide (as described herein) is determined.
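
The preferred thresholds stated above may, purely for illustration, be applied as in the following Python sketch. The cut-off values are copied from the preceding paragraph and presuppose MALDI-TOF measurement with age normalisation; the dictionary keys, the function name and the `tier` argument are hypothetical names introduced here.

```python
# Cut-offs copied from the paragraph above (broadest first).
# They presuppose MALDI-TOF measurement with age normalisation.
THRESHOLDS = {
    "LINE1_CpG1": (0.82, 0.80, 0.76),
    # Mean over LINE1 CpGs 1, 2, 3, 4, 5, 9, 12, 14, 15, 16 and 17.
    "LINE1_panel": (0.877, 0.860, 0.845),
    "Alu_CpG13": (0.65, 0.62),
}


def indicates_cancer(marker, mean_methylation, tier=0):
    """True if the mean methylation is at or below the chosen cut-off.

    tier 0 selects the broadest cut-off; higher tiers select the
    'preferably' and 'more preferably' cut-offs where they exist.
    """
    cutoffs = THRESHOLDS[marker]
    return mean_methylation <= cutoffs[tier]
```

For a different detection method or normalisation, the values in `THRESHOLDS` would first have to be replaced by concordance values, as noted above.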

The method of the first aspect may further comprise the detection of additional markers of the cancer. Such additional markers can be used in addition to one or more of the CpG dinucleotides described above. Preferably, they are used in addition to LINE1 CpG dinucleotide 1 of SEQ ID NO: 1 (in combination with or without one or more of the remaining LINE1 or Alu CpG dinucleotides described). Accordingly, in one embodiment, the method of the first aspect further comprises determining the level of cytosine methylation of at least one CpG dinucleotide within and/or gene expression of at least one gene selected from the group consisting of HYAL2, MGRN1, RPTOR, SLC22A18, FUT7, RAPSN, S100P, and DYRK4 in the genomic DNA, wherein hypomethylation and/or increased expression is indicative of the presence of the cancer and the lack of hypomethylation and/or decreased or normal expression is indicative of the absence of the cancer. The determination of cytosine methylation and/or gene expression of these genes with respect to the cancer is described in WO 2016/135168, including CpG dinucleotides that are differentially methylated in subjects having the cancer and not having the cancer. Alternatively or in addition, the method of the first aspect further comprises determining the amount of at least one RNA marker selected from the group consisting of miR-652, spliceosomal RNA U11 or a fragment thereof (previously described as miR-801), miR-376c, miR-376a, miR-127-3p (previously described as miR-127), miR-409-3p (previously described as miR-409) and miR-148b in the sample, wherein an increased amount of the at least one RNA marker is indicative of the presence of the cancer and/or a decreased or normal amount of the at least one RNA marker is indicative of the absence of the cancer. This determination is also described in WO 2016/135168.

Furthermore, in a preferred embodiment, the method of the first aspect further comprises (next to or without the additional markers described above) determining the amount of at least one miRNA marker selected from the group consisting of miR-328, miR-320, miR-145, miR-339-3p, and miR-193a-3p in the sample, wherein an increased amount of miR-328, miR-145 and/or miR-339-3p, and/or a decreased amount of miR-320 and/or miR-193a-3p is indicative of the presence of the cancer, and wherein a decreased or normal amount of miR-328, miR-145 and/or miR-339-3p, and/or an increased or normal amount of miR-320 and/or miR-193a-3p is indicative of the absence of the cancer. The utility of the markers is shown in Example 2. Decreased amount/expression or increased amount/expression is to be understood as a comparison to one or more subjects not having the cancer (cancer-negative control). Normal is to be understood as not being statistically significantly different from one or more subjects not having the cancer (cancer-negative control). For a description of statistical significance and suitable confidence intervals and p values, see below.
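
The direction of change that is indicative of the cancer differs between these five miRNA markers. A minimal Python sketch of that rule is given below; the function name and the string labels for the change and for the result are assumptions made for illustration, and the classification of an amount as increased, decreased or normal relative to the cancer-negative control is taken as given.

```python
# Markers whose increased amount is indicative of the cancer.
INCREASED_IN_CANCER = {"miR-328", "miR-145", "miR-339-3p"}
# Markers whose decreased amount is indicative of the cancer.
DECREASED_IN_CANCER = {"miR-320", "miR-193a-3p"}


def interpret(marker, change):
    """Map a marker and its change versus cancer-negative controls
    ('increased', 'decreased' or 'normal') to 'presence' or 'absence'."""
    if change not in {"increased", "decreased", "normal"}:
        raise ValueError("change must be increased, decreased or normal")
    if marker in INCREASED_IN_CANCER:
        return "presence" if change == "increased" else "absence"
    if marker in DECREASED_IN_CANCER:
        return "presence" if change == "decreased" else "absence"
    raise KeyError(marker)
```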

Any additional determination as described above preferably also comprises one or more comparisons (e.g. to cancer-negative or cancer-positive), adjustments and/or normalisation as described above. For determining an amount of RNA, it is preferred that an RNA that is not differentially expressed in subjects having and not having the cancer is used for normalisation. An example is cel-miR-39.
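
One common way of normalising to a reference RNA such as cel-miR-39 is the 2^(-ΔCt) calculation used with quantitative PCR. The sketch below assumes, without this being stated in the present description, that amounts are read out as qPCR threshold cycles (Ct) and that amplification efficiency is 100%; the function name and the Ct values used with it are hypothetical.

```python
def relative_amount(ct_target, ct_reference):
    """Amount of a target RNA relative to a reference RNA (2^-dCt).

    Assumes qPCR threshold cycles and a doubling of product per cycle;
    both are assumptions of this sketch, not of the description.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)
```

Under these assumptions, a target that crosses the threshold two cycles later than the reference is present at one quarter of the amount of the reference.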

The determination of the amount of RNA described herein usually can use the same sample as the determination of methylation levels, but instead of using cellular nucleic acids, the determination of RNA is directed to cell-free circulating RNA. This means free-floating RNA in the sample rather than RNA isolated from cells. For example, if peripheral blood is the sample, the amount of RNA in the cell-free fraction of the blood, e.g. in serum or plasma, is determined. However, a different sample can also be used to determine the amount of RNA (independent of the sample used for determining the methylation level). Such a different sample is usually a body fluid sample as described above, in particular blood plasma, blood serum, fluid obtainable from the breast glands, or saliva.

The amount of an RNA can be determined by techniques well known in the art. Depending on the nature of the sample, the amount may be determined by PCR-based techniques for quantifying the amount of a polynucleotide or by other methods like mass spectrometry or (next generation) sequencing. The term "determining the amount of at least one RNA/miRNA", as used herein, preferably relates to determining the amount of each of the RNAs separately in order to be able to compare the amount of each RNA to a cancer-positive or -negative control.

Definitions and embodiments described below, in particular under the header 'Definitions and further embodiments of the invention' apply to the method of the first aspect.

In a second aspect, the present invention relates to a method for diagnosing cancer or for screening for cancer, comprising detecting the cancer according to the first aspect.

"Diagnosis" refers to a determination whether a particular subject does or does not have cancer. A diagnosis by the methods described herein may be supplemented with a further diagnostic means to detect or confirm the presence of the cancer. As will be understood by persons skilled in the art, the diagnosis normally may not be correct for 100% of the subjects, although it preferably is correct. The term, however, requires that a correct diagnosis can be made for a statistically significant part of the subjects.

"Screening for cancer" refers to the use of the method of the first aspect with samples of a population of subjects. Preferably, the subjects have an increased risk for or are suspected of having the cancer. In particular, one or more of the risk factors recited herein can be attributed to the subjects of the population. In a specific embodiment, the same one or more risk factors can be attributed to all subjects of the population. For example, the population may be characterized by a certain minimal age (e.g. 50 or older). It is to be understood that the term "screening" does not necessarily indicate a definite diagnosis, but is intended to indicate an increased possibility of the presence or of the absence of the cancer. An indicated increased possibility is preferably confirmed using a further diagnostic means. As will be understood by persons skilled in the art, the screening result normally may not be correct for 100% of the subjects, although it preferably is correct. The term, however, requires that a correct screening result can be achieved for a statistically significant part of the subjects.

For a description of statistical significance and suitable confidence intervals and p values, see below.

In a third aspect, the present invention relates to a method for monitoring a subject having an increased risk of developing cancer, comprising detecting the cancer according to the first aspect repeatedly.

Preferably, one or more of the risk factors recited herein can be attributed to the subject. It is envisaged that the method according to the first aspect is carried out periodically over an extended amount of time, e.g. once per year, per two years, per three years, per four years, per five years or per ten years for at least two times, preferably at least three times, at least four times, at least five times or at least ten times. This preferably occurs until the subject no longer has the one or more risk factors, until the cancer is detected (in that case the method of the first aspect can still be carried out for different purposes as described herein, though, e.g. according to the fourth or fifth aspect), until the subject dies or until the monitoring is terminated for any other reason.

In a fourth aspect, the present invention relates to a method for monitoring cancer treatment of a subject, comprising detecting the cancer according to the first aspect repeatedly across the treatment period.

This relates to the accompaniment of a diagnosed cancer during a treatment procedure or during a certain period of time, typically during at least 1 week, 2 weeks, 3 weeks, 4 weeks, 1 month, 2 months, 3 months, 4 months, 5 months, 6 months, 1 year, 2 years, 3 years, 5 years, 10 years, or any other period of time. The term "accompaniment" means that states of and, in particular, changes of these states of a cancer may be detected, in particular based on changes in the amount of methylation/RNA/miRNA in any type of periodical time segment, determined e.g. daily or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 times per month (no more than one determination per day) over the course of the treatment, which may be up to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15 or 24 months. Amounts or changes in the amounts can also be determined at treatment-specific events, e.g. before and/or after every treatment cycle or drug/therapy administration. A cycle is the time from the start of one round of treatment until the start of the next round. Cancer treatment is usually not a single treatment, but a course of treatments. A course usually takes between 3 and 6 months, but can be more or less than that. During a course of treatment, there are usually between 4 and 8 cycles of treatment. Usually a cycle of treatment includes a treatment break to allow the body to recover. As will be understood by persons skilled in the art, the result of the monitoring normally may not be correct for 100% of the subjects, although it preferably is correct. The term, however, requires that a correct result of the monitoring can be achieved for a statistically significant part of the subjects. For a description of statistical significance and suitable confidence intervals and p values, see below.

In a fifth aspect, the present invention relates to a method for assessing the response of a subject to a cancer treatment, comprising detecting the cancer according to the first aspect during and/or after the treatment.

"Response to treatment" refers to the response of a subject suffering from cancer to a therapy for treating said disease. Standard criteria (Miller, et al., Cancer, 1981; 47(1): 207-14) can be used to evaluate the response to therapy include response, stabilization and progression. The response can be a complete response (or complete remission) which is the disappearance of all detectable malignant disease or a partial response which is defined as approximately >50% decrease in the sum of products of the largest perpendicular diameters of one or more lesions (tumor lesions), no new lesions and no progression of any lesion. Subjects achieving complete or partial response are considered "responders", and all other subjects are considered "non- responders". The term "stabilization", as used herein, is defined as a < 50% decrease or a < 25% increase in tumor size. The term "progression", as used herein, is defined as an increase in the size of tumor lesions by > 25% or appearance of new lesions.

In a sixth aspect, the present invention relates to a method for treating a subject having cancer detected according to the method of the first aspect, comprising administering a cancer therapy to the subject. Suitable cancer treatments are described below.

Definitions given and embodiments described with respect to the first aspect apply also to all other aspects, in as far as they are applicable. Also, definitions and embodiments described below apply to all methods described above.

Definitions and further embodiments of the invention

The specification uses a variety of terms and phrases, which have certain meanings as defined below. Preferred meanings are to be construed as preferred embodiments of the aspects of the invention described herein. As such, they and also further embodiments described in the following can be combined with any embodiment of the aspects of the invention and in particular any preferred embodiment of the aspects of the invention described above.

The term "univariate marker" refers to a marker that is determined independently, i.e. not in combination with another marker. The term "multivariate marker" refers to a marker that is determined in combination with another marker. "Another marker" with respect to a CpG dinucleotide refers herein to another CpG dinucleotide (preferably of LINE1 or Alu). In a broader sense it can also include any other marker, including markers not disclosed herein.

The term "level of cytosine methylation" or "methylation level" refers to the methylation of multiple copies of the same CpG dinucleotide. It is preferably expressed as the mean methylation with a value of 0 to 1, corresponding to 0% to 100% of the multiple copies being methylated.

The term "expression level" refers to the amount of gene product present in the body or a sample. The expression level can e.g. be measured/quantified/detected by means of the protein or mRNA expressed from the gene. The expression level can for example be quantified by normalizing the amount of gene product of interest present in a sample with the total amount of gene product of the same category (total protein or mRNA) in the same sample or a reference sample (e.g. a sample taken at the same time from the same individual or a part of identical size (weight, volume) of the same sample) or by identifying the amount of gene product of interest per defined sample size (weight, volume, etc.). The expression level can be measured or detected by means of any method as known in the art, e.g. methods for the direct detection and quantification of the gene product of interest (such as mass spectrometry) or methods for the indirect detection and measurement of the gene product of interest that usually work via binding of the gene product of interest with one or more different molecules or detection means (e.g. primer(s), probes, antibodies, protein scaffolds) specific for the gene product of interest. The determination of the level of gene copies, comprising also the determination of the absence or presence of one or more fragments (e.g. via nucleic acid probes or primers, e.g. quantitative PCR, Multiplex ligation-dependent probe amplification (MLPA) PCR), is also within the knowledge of the skilled artisan.

The term "miRNA" refers to at least a mature miRNA, optionally comprising the complete stem-loop. While in Example 2, mature miRNAs are detected, the stem-loop miRNAs can be detected instead.

The term "breast cancer" is used in the broadest sense and refers to all cancers that start in the breast. It includes the subtypes ductal carcinoma in situ, invasive ductal carcinoma (including tubular carcinoma of the breast, medullary carcinoma of the breast, mucinous carcinoma of the breast, papillary carcinoma of the breast, and cribriform carcinoma of the breast), invasive lobular carcinoma, inflammatory breast cancer, lobular carcinoma in situ, male breast cancer, Paget's disease of the nipple, and Phyllodes tumors of the breast. It also includes the following stages (as defined by the corresponding TNM classification(s) in brackets): stage 0 (Tis, N0, M0), stage IA (T1, N0, M0), stage IB (T0 or T1, N1mi, M0), stage IIA (T0 or T1, N1 (but not N1mi), M0; or T2, N0, M0), stage IIB (T2, N1, M0; or T3, N0, M0), stage IIIA (T0 to T2, N2, M0; or T3, N1 or N2, M0), stage IIIB (T4, N0 to N2, M0), stage IIIC (any T, N3, M0), and stage IV (any T, any N, M1).

The term "ovary cancer" or "ovarian cancer" is used in the broadest sense and refers to all cancers that start in the ovaries. It includes the subtypes benign epithelial ovarian tumors, tumors of low malignant potential, and malignant epithelial ovarian tumors. It also includes the following stages (as defined by the corresponding TNM classification(s) in brackets): stage IA (T1a, N0, M0), stage IB (T1b, N0, M0), stage IC (T1c, N0, M0), stage IIA (T2a, N0, M0), stage IIB (T2b, N0, M0), stage IIIA1 (T1 or T2, N1, M0), stage IIIA2 (T3a2, N0 or N1, M0), stage IIIB (T3b, N0 or N1, M0), stage IIIC (T3c, N0 or N1, M0), and stage IV (any T, any N, M1).

The term "pancreatic cancer" is used in the broadest sense and refers to all cancers that start in the pancreas. It includes the subtypes exocrine cancers, endocrine cancers, pancreatoblastoma, sarcomas of the pancreas, and lymphoma. Exocrine cancers include adenocarcinomas, in particular ductal adenocarcinomas, as well as cystic tumours and cancer of the acinar cells. Endocrine cancers include gastrinomas, insulinomas, somatostatinomas, VIPomas, and glucagonomas. It also includes the following stages (as defined by the corresponding TNM classification(s) in brackets): stage 0 (Tis, N0, M0), stage IA (T1, N0, M0), stage IB (T2, N0, M0), stage IIA (T3, N0, M0), stage IIB (T1-3, N1, M0), stage III (T4, any N, M0), and stage IV (any T, any N, M1).

The TNM classification is a staging system for malignant cancer. As used herein the term "TNM classification" refers to the 6th edition of the TNM stage grouping as defined in Sobin et al. (International Union Against Cancer (UICC), TNM Classification of Malignant Tumours, 6th ed. New York; Springer, 2002, pp. 191-203).

The term "is indicative for" or "indicates" as used herein refers to an act of identifying or specifying the thing to be indicated. As will be understood by persons skilled in the art, such assessment normally may not be correct for 100% of the subjects, although it preferably is correct. The term, however, requires that a correct indication can be made for a statistically significant part of the subjects. Whether a part is statistically significant can be determined easily by the person skilled in the art using several well known statistical evaluation tools, for example, determination of confidence intervals, determination of p values, Student's t-test, Mann-Whitney test, etc. Details are provided in Dowdy and Wearden, Statistics for Research, John Wiley & Sons, New York 1983. The preferred confidence intervals are at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%. The p values are preferably 0.05, 0.01, or 0.005.

The term "risk" with respect to the method for detecting the presence or absence of cancer in a subject refers to the detection of an increased risk of developing the cancer or an increased probability of having it. If the subject already has an increased risk in view of one or more risk factors that can be attributed to it (as defined herein), the 'risk thereof' refers to a risk that is increased further, i.e. that is in addition to the risk due to those risk factors.

The term "treatment" or "treating" with respect to cancer as used herein refers to a therapeutic treatment, wherein the goal is to reduce progression of cancer. Beneficial or desired clinical results include, but are not limited to, relief of symptoms, reduction of the length of the disease, stabilized pathological state (specifically not deteriorated), slowing down of the disease's progression, improving the pathological state and/or remission (both partial and total), preferably detectable. A successful treatment does not necessarily mean cure, but it can also mean a prolonged survival, compared to the expected survival if the treatment is not applied. In a preferred embodiment, the treatment is a first line treatment, i.e. the cancer was not treated previously. Cancer treatment involves a treatment regimen.

The term "treatment regimen" as used herein refers to how the subject is treated in view of the disease and available procedures and medication. Non-limiting examples of cancer treatment regimens are chemotherapy, surgery and/or irradiation or combinations thereof. The early detection of cancer enabled by the present invention allows in particular for a surgical treatment, especially for a curative resection. In particular, the term "treatment regimen" refers to administering one or more anti-cancer agents or therapies as defined below. The term "anti-cancer agent or therapy" as used herein refers to chemical, physical or biological agents or therapies, or surgery, including combinations thereof, with antiproliferative, antioncogenic and/or carcinostatic properties.

A chemical anti-cancer agent or therapy may be selected from the group consisting of alkylating agents, antimetabolites, plant alkaloids and terpenoids, and topoisomerase inhibitors. Preferably, the alkylating agents are platinum-based compounds. In one embodiment, the platinum-based compounds are selected from the group consisting of cisplatin, oxaliplatin, eptaplatin, lobaplatin, nedaplatin, carboplatin, iproplatin, tetraplatin, DCP, PLD-147, JM118, JM216, JM335, and satraplatin.

A physical anti-cancer agent or therapy may be selected from the group consisting of radiation therapy (e.g. curative radiotherapy, adjuvant radiotherapy, palliative radiotherapy, teleradiotherapy, brachytherapy or metabolic radiotherapy), phototherapy (using, e.g. hematoporphyrin or photofrin II), and hyperthermia.

Surgery may be a curative resection, palliative surgery, preventive surgery or cytoreductive surgery. Typically, it involves an excision, e.g. intracapsular excision, marginal excision, extensive excision or radical excision as described in Baron and Valin (Rec. Med. Vet., Special Canc. 1990; 11(166):999-1007).

A biological anti-cancer agent or therapy may be selected from the group consisting of antibodies (e.g. antibodies stimulating an immune response destroying cancer cells such as rituximab or alemtuzumab, antibodies stimulating an immune response by binding to receptors of immune cells and inhibiting signals that prevent the immune cell from attacking "own" cells, such as ipilimumab, antibodies interfering with the action of proteins necessary for tumor growth such as bevacizumab, cetuximab or panitumumab, or antibodies conjugated to a drug, preferably a cell-killing substance like a toxin, chemotherapeutic or radioactive molecule, such as 90Y-ibritumomab tiuxetan, 131I-tositumomab or ado-trastuzumab emtansine), cytokines (e.g. interferons or interleukins such as IFN-alpha and IL-2), vaccines (e.g. vaccines comprising cancer-associated antigens, such as sipuleucel-T), oncolytic viruses (e.g. naturally oncolytic viruses such as reovirus, Newcastle disease virus or mumps virus, or genetically engineered viruses such as measles virus, adenovirus, vaccinia virus or herpes virus preferentially targeting cells carrying cancer-associated antigens such as EGFR or HER-2), gene therapy agents (e.g. DNA or RNA replacing an altered tumor suppressor, blocking the expression of an oncogene, improving a subject's immune system, making cancer cells more sensitive to chemotherapy, radiotherapy or other treatments, inducing cellular suicide or conferring an anti-angiogenic effect) and adoptive T cells (e.g. subject-harvested tumor-invading T-cells selected for antitumor activity, or subject-harvested T-cells genetically modified to recognize a cancer-associated antigen).

In one embodiment, the one or more anti-cancer drugs is/are selected from the group consisting of Abiraterone Acetate, ABVD, ABVE, ABVE-PC, AC, AC-T, ADE, Ado-Trastuzumab Emtansine, Afatinib Dimaleate, Aldesleukin, Alemtuzumab, Aminolevulinic Acid, Anastrozole, Aprepitant, Arsenic Trioxide, Asparaginase Erwinia chrysanthemi, Axitinib, Azacitidine, BEACOPP, Belinostat, Bendamustine Hydrochloride, BEP, Bevacizumab, Bexarotene, Bicalutamide, Bleomycin, Bortezomib, Bosutinib, Brentuximab Vedotin, Busulfan, Cabazitaxel, Cabozantinib-S-Malate, CAF, Capecitabine, CAPOX, Carboplatin, CARBOPLATIN-TAXOL, Carfilzomib, Carmustine, Carmustine Implant, Ceritinib, Cetuximab, Chlorambucil, CHLORAMBUCIL-PREDNISONE, CHOP, Cisplatin, Clofarabine, CMF, COPP, COPP-ABV, Crizotinib, CVP, Cyclophosphamide, Cytarabine, Cytarabine, Liposomal, Dabrafenib, Dacarbazine, Dactinomycin, Dasatinib, Daunorubicin Hydrochloride, Decitabine, Degarelix, Denileukin Diftitox, Denosumab, Dexrazoxane Hydrochloride, Docetaxel, Doxorubicin Hydrochloride, Doxorubicin Hydrochloride Liposome, Eltrombopag Olamine, Enzalutamide, Epirubicin Hydrochloride, EPOCH, Eribulin Mesylate, Erlotinib Hydrochloride, Etoposide Phosphate, Everolimus, Exemestane, FEC, Filgrastim, Fludarabine Phosphate, Fluorouracil, FU-LV, Fulvestrant, Gefitinib, Gemcitabine Hydrochloride, GEMCITABINE-CISPLATIN, GEMCITABINE-OXALIPLATIN, Gemtuzumab Ozogamicin, Glucarpidase, Goserelin Acetate, HPV Bivalent Vaccine, Recombinant HPV Quadrivalent Vaccine, Hyper-CVAD, Ibritumomab Tiuxetan, Ibrutinib, ICE, Idelalisib, Ifosfamide, Imatinib Mesylate, Imiquimod, Iodine 131 Tositumomab and Tositumomab, Ipilimumab, Irinotecan Hydrochloride, Ixabepilone, Lapatinib Ditosylate, Lenalidomide, Letrozole, Leucovorin Calcium, Leuprolide Acetate, Liposomal Cytarabine, Lomustine, Mechlorethamine Hydrochloride, Megestrol Acetate, Mercaptopurine, Mesna, Methotrexate, Mitomycin C, Mitoxantrone Hydrochloride, MOPP, Nelarabine, Nilotinib, Obinutuzumab, Ofatumumab, Omacetaxine Mepesuccinate, OEPA, OFF, OPPA, Oxaliplatin, Paclitaxel, Paclitaxel Albumin-stabilized Nanoparticle Formulation, PAD, Palifermin, Palonosetron Hydrochloride, Pamidronate Disodium, Panitumumab, Pazopanib Hydrochloride, Pegaspargase, Peginterferon Alfa-2b, Pembrolizumab, Pemetrexed Disodium, Pertuzumab, Plerixafor, Pomalidomide, Ponatinib Hydrochloride, Pralatrexate, Prednisone, Procarbazine Hydrochloride, Radium 223 Dichloride, Raloxifene Hydrochloride, Ramucirumab, Rasburicase, R-CHOP, R-CVP, Recombinant HPV Bivalent Vaccine, Recombinant HPV Quadrivalent Vaccine, Recombinant Interferon Alfa-2b, Regorafenib, Rituximab, Romidepsin, Romiplostim, Ruxolitinib Phosphate, Siltuximab, Sipuleucel-T, Sorafenib Tosylate, STANFORD V, Sunitinib Malate, TAC, Talc, Tamoxifen Citrate, Temozolomide, Temsirolimus, Thalidomide, Topotecan Hydrochloride, Toremifene, Tositumomab and I 131 Iodine Tositumomab, TPF, Trametinib, Trastuzumab, Vandetanib, VAMP, VeIP, Vemurafenib, Vinblastine Sulfate, Vincristine Sulfate, Vincristine Sulfate Liposome, Vinorelbine Tartrate, Vismodegib, Vorinostat, XELOX, Ziv-Aflibercept, and Zoledronic Acid.

SEQ IDs referred to in the application

The present application refers to SEQ ID NOs 1-7. An overview of these SEQ ID NOs is given in the following:

SEQ ID NO: 1 represents a LINE1 sequence investigated by the inventors.

SEQ ID NO: 2 represents an Alu sequence investigated by the inventors.

The specific CpG dinucleotides referred to herein have the following positions within SEQ ID NOs 1 and 2: LINE1 (SEQ ID NO: 1): dinucleotide 1 - positions 26/27, dinucleotide 2 - positions 44/45, dinucleotide 3 - positions 49/50, dinucleotide 4 - positions 51/52, dinucleotide 5 - positions 53/54, dinucleotide 6 - positions 60/61, dinucleotide 7 - positions 70/71, dinucleotide 8 - positions 94/95, dinucleotide 9 - positions 120/121, dinucleotide 10 - positions 140/141, dinucleotide 11 - positions 144/145, dinucleotide 12 - positions 158/159, dinucleotide 13 - positions 173/174, dinucleotide 14 - positions 182/183, dinucleotide 15 - positions 194/195, dinucleotide 16 - positions 206/207, dinucleotide 17 - positions 209/210, dinucleotide 18 - positions 216/217; Alu (SEQ ID NO: 2): dinucleotide 1 - positions 22/23, dinucleotide 2 - positions 24/25, dinucleotide 3 - positions 30/31, dinucleotide 4 - positions 47/48, dinucleotide 5 - positions 54/55, dinucleotide 6 - positions 64/65, dinucleotide 7 - positions 86/87, dinucleotide 8 - positions 106/107, dinucleotide 9 - positions 108/109, dinucleotide 10 - positions 110/111, dinucleotide 11 - positions 118/119, dinucleotide 12 - positions 122/123, dinucleotide 13 - positions 150/151, dinucleotide 14 - positions 181/182, dinucleotide 15 - positions 204/205, dinucleotide 16 - positions 208/209. Compare Table 4, which prevails in case of any discrepancy.
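
For convenience, the positions listed above can be held in a small lookup structure, as in the following Python sketch. The start positions are copied from the preceding paragraph (Table 4 prevails in case of any discrepancy); the variable and function names are hypothetical. Since every listed pair consists of two consecutive positions, only the first position of each pair is stored.

```python
# First position of each CpG dinucleotide, in dinucleotide order
# (dinucleotide 1 first), copied from the listing above.
CPG_START = {
    "LINE1": [26, 44, 49, 51, 53, 60, 70, 94, 120, 140, 144,
              158, 173, 182, 194, 206, 209, 216],
    "Alu": [22, 24, 30, 47, 54, 64, 86, 106, 108, 110, 118,
            122, 150, 181, 204, 208],
}


def cpg_positions(element, dinucleotide):
    """Return the position pair of a numbered CpG dinucleotide
    within SEQ ID NO: 1 ('LINE1') or SEQ ID NO: 2 ('Alu')."""
    starts = CPG_START[element]
    if not 1 <= dinucleotide <= len(starts):
        raise ValueError("unknown CpG dinucleotide number")
    start = starts[dinucleotide - 1]
    return (start, start + 1)
```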

SEQ ID NO: 3 represents the human genomic consensus sequence of LINE1. LINE1 CpG dinucleotide 1 of SEQ ID NO: 1 has the positions 1140/1141 in SEQ ID NO: 3. The other LINE1 CpG dinucleotides have corresponding positions.

SEQ ID NO: 4 represents LINE1 sense primer of Table 3.

SEQ ID NO: 5 represents LINE1 antisense primer of Table 3.

SEQ ID NO: 6 represents Alu sense primer of Table 3.

SEQ ID NO: 7 represents Alu antisense primer of Table 3.

SEQ ID NOs 8-12 represent the mature miRNAs miR-328, miR-320, miR-145, miR-339-3p, and miR-193a-3p, respectively.

***

The invention is described by way of the following examples which are to be construed as merely illustrative and not limitative of the scope of the invention.

Example 1: Investigation of LINE1 and Alu methylation in peripheral blood from Breast Cancer Patients

Methods

Study population

The study was approved by the Ethics Committee of the University Hospital of Heidelberg. All breast cancer (BC) cases and healthy controls were Caucasian, and all samples were obtained from the same region in southwest Germany at the University Hospital of Heidelberg. All enrolled patients and controls provided written informed consent. Blood samples of sporadic BC patients were obtained from the genome study of the inventors and were collected at the time of diagnosis prior to any treatment. Clinical parameters of BC patients were confirmed according to the American Joint Committee on Cancer staging manual. Detailed characteristics of BC cases are shown in Table 1. Additionally, control blood samples originated from the inventors' biomarker study and were obtained from blood donors by the University Hospital of Heidelberg. All the controls were healthy and without a family history of BC. From the blood samples collected during the period from 2011 to 2014, a total of 229 sporadic BC patients and 151 healthy controls were randomly selected for this study (see Table 2).

Table 1. Characteristics of sporadic BC patients

Characteristics      BC patients, n (%)

TNM stage
  stage 0            1 (0.4%)
  stage I            69 (30.1%)
  stage II           72 (31.4%)
  stage III          15 (6.6%)
  stage IV           4 (1.7%)
  neoadj.*           50 (21.8%)
  unknown            18 (7.9%)

Type of BC
  Ductal             179 (78.2%)
  Lobular            13 (5.7%)
  Ductal-Lobular     3 (1.3%)
  DCIS               4 (1.7%)
  Others             10 (4.4%)
  unknown            24 (10.5%)

ER status a
  negative           21 (9.2%)
  positive           160 (69.9%)
  unknown            48 (21.0%)

PR status a
  negative           36 (15.7%)
  positive           145 (63.3%)
  unknown            48 (21.0%)

HER2 status b
  negative           165 (72.1%)
  positive           16 (7.0%)
  unknown            48 (21.0%)

a Immunoreactive score (IRS), ER/PR negative: IRS 0-2; ER/PR positive: IRS 3-12

b HER2 negative: IHC-score 0-1; HER2 positive: IHC-score 3. If IHC-score = 2, FISH/CISH was further analyzed and HER2 was considered positive if it was amplified

*patients were treated with neoadjuvant chemotherapy, so no stage is given here

Table 2. Sample Information

Gene    Sample type            Group               Number   Age (y), mean ± SD
LINE1   Peripheral blood DNA   Sporadic BC cases   229      48.31 ± 8.09
                               Controls            151      41.47 ± 10.91
Alu     Peripheral blood DNA   Sporadic BC cases   229      48.31 ± 8.09
                               Controls            151      41.47 ± 10.91

DNA isolation and Bisulfite conversion

DNA from whole blood samples (200 μl per column) was extracted using the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's recommendations. DNA quality and quantity were measured with the NanoDrop ND-1000 UV/Vis spectrophotometer 3.3 (peqLab, Erlangen, Germany). Aliquots of DNA (500 ng) were bisulfite-treated with the EZ-96 DNA Methylation-Gold kit (Zymo Research Corporation, Orange, US) according to the manufacturer's instructions.

Primer design and PCR amplification

The online tool "epidesigner" (http://www.epidesigner.com/start3.html) was used for the primer design. The PCR primers for LINE1 and Alu and their amplicon sequences are shown in Tables 3 and 4. The PCR reaction was performed in a total volume of 6 μl. The PCR reaction components included 10 ng/μl bisulfite-treated DNA, 10x CoralLoad Buffer (Qiagen), 10 mM dNTPs, 1 μM of each (forward and reverse) primer (Sigma), and 5 U HotStar Taq DNA polymerase (Qiagen, Valencia CA). The touch-down PCR profile was 95°C for 5 minutes; denaturation at 94°C for 30 seconds, primer annealing from 59°C to 53°C for 30 seconds and extension at 72°C for 1 minute; then 72°C for 5 minutes and a hold at 4°C. PCR products were inspected by electrophoresis on a 1% agarose gel and visualized under ultraviolet light.

Table 3. Bisulfite-specific primers for the target amplicons

Amplicon   Primer      Sequence                                                    Amplicon size   No. of CpG   No. of analyzed CpG
LINE1      sense       aggaagagagTTATTAGGGAGTGTTAGATAGTGGG                         250             18           11
           antisense   cagtaatacgactcactatagggagaaggctAAAACCCTCTAAACCAAATATAAAAT
Alu        sense       aggaagagagGTTTAGGTTGGAGTGTAGTGG                             240             17           11
           antisense   cagtaatacgactcactatagggagaaggctCCTATAATCCCAACACTTTAAAAAA

Table 4. Sequences of the target amplicons

Amplicon: LINE1 (SEQ ID NO: 1)
Sequence:
TCACTAGGGAGTGCCAGACAGTGGGCGCAGGCCAGTGTGTGTGCGCACC
GTGCGCGAGCCGAAGCAGGGCGAGGCATTGCCTCACCTGGGAAGCGCAA
GGGGTCAGGGAGTTCCCTTTCCGAGTCAAAGAAAGGGGTGACGGACGCA
TTAAGAAACGGCGCACCACGAGACTATATCCCACACCTGGCTCAGAGGGT
CCT

Amplicon: Alu (SEQ ID NO: 2)
Sequence:
GCCCAGGCTGGAGTGCAGTGGCGCGATCTCGGCTCACTGCAACCTCCGCC
TCCCGGGTTCAAGCGATTCTCCTGCCTCAGCCTCCCGAGTAGCTGGGATT
GGGGTTTCACCATGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCAGGTG
ATCCGCCCGCCTCGGCCTCCCAAAGTGCTGGGATTACAGG

Methylation analysis

The Sequenom MassARRAY EpiTyper assay was used for methylation analysis as described previously (Tang et al., Oncotarget 2016; 7:64191-64202). The PCR products were processed according to the protocol of the Sequenom MassARRAY EpiTyper assay and cleaned with resin. Then 8-15 nl of the cleavage reaction was transferred to a 384 SpectroCHIP using the Nanodispenser (SEQUENOM). The chip was analyzed with the MassARRAY (SEQUENOM). The mass spectra were collected with a MassARRAY Compact MALDI-TOF (SEQUENOM), and methylation ratios were generated from the spectra by the MassARRAY EpiTyper v1.0 software. Results were expressed as "beta" values between 0 and 1.

Statistical analysis

Statistical analysis was performed with SPSS statistics 24.0 and R 3.4.0. The nonparametric Mann-Whitney U test and Kruskal-Wallis H test were used for all univariable comparisons. Multivariable logistic regression models were calculated to evaluate the association between LINE1 and Alu methylation and BC. For this, the methylation level of one or more CpGs was entered either as a raw beta value, or categorized into four quartile groups based on the distribution among controls. In all cases age was included as a covariable. Receiver operating characteristic (ROC) curves were used to display sensitivity and specificity and the corresponding area under the curve (AUC) was calculated using the R package "pROC". To account for overfitting, additional corrected estimates of the AUC were calculated using the "0.632+" technique (R package "Daim"). The significance level for all analyses was set as p<0.05.

Results

Hypomethylation of repetitive elements in BC patients

DNA methylation levels in two repetitive elements, LINE1 and Alu, were compared in 229 BC patients and 151 healthy controls.

LINE1

Eleven CpG sites were measured in the amplicon of LINE1 (Table 5). The inventors observed significant hypomethylation of seven CpG sites (LINE1_CpG_1, LINE1_CpG_3,4,5, LINE1_CpG_9, LINE1_CpG_12 and LINE1_CpG_14) in peripheral blood of BC patients. Their median methylation levels, interquartile ranges (IQRs) and p values are shown in Table 5. The mean methylation level of all investigated CpG sites of LINE1 was also significantly lower in peripheral blood of BC cases than in healthy controls [median of mean methylation level in BC cases = 0.85 (IQR = 0.84-0.86) compared to controls 0.86 (IQR = 0.85-0.88), with adjusted P = 8.78E-06].

Table 5. Comparison of DNA methylation in LINE1 amplicon between breast cancer (BC) patients and controls in peripheral blood

CpG site          BC cases median (IQR), n=227   Controls median (IQR), n=151   p value a   Adjusted p value b
LINE1_CpG_1       0.76(0.74-0.79)                0.80(0.76-0.82)                4.04E-11    3.64E-10
LINE1_CpG_2       0.92(0.91-0.94)                0.93(0.91-0.94)                0.143       0.057
LINE1_CpG_3,4,5   0.88(0.86-0.89)                0.88(0.87-0.90)                0.001       0.002
LINE1_CpG_9       0.87(0.86-0.88)                0.88(0.87-0.90)                4.71E-04    0.001
LINE1_CpG_12      0.71(0.69-0.74)                0.74(0.71-0.77)                1.21E-07    4.09E-07
LINE1_CpG_14      0.90(0.89-0.91)                0.91(0.90-0.92)                3.94E-09    8.19E-09
LINE1_CpG_15      0.94(0.78-0.98)                0.90(0.80-0.98)                0.416       0.927
LINE1_CpG_16,17   0.88(0.90-0.91)                0.88(0.90-0.91)                0.056       0.199
MEAN              0.85(0.84-0.86)                0.86(0.85-0.88)                2.14E-06    8.78E-06

Abbreviations: IQR: interquartile range

a p value for the difference between controls and patients was analyzed by Mann-Whitney U test and adjusted by Bonferroni-Holm method, α=0.00556

b Adjusted p value was calculated by logistic regression and adjusted for age. Significant p values are in bold (P < 0.05).
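The threshold quoted in footnote a is consistent with the first step of the Bonferroni-Holm procedure, in which the smallest p value is compared with α/m (0.05/9 ≈ 0.00556 for the nine rows of Table 5). A minimal sketch of the step-down procedure follows; the function name `holm_reject` and the example p values are hypothetical and are not taken from the study.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where Holm's step-down test rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one comparison fails, all larger p values are retained
    return reject


# Hypothetical p values, for illustration only.
example = [0.001, 0.04, 0.012, 0.30]
print(holm_reject(example))   # [True, False, True, False]
print(round(0.05 / 9, 5))     # 0.00556, the first-step threshold for nine tests
```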

In Figure 1, the methylation levels of the seven significant CpG sites of LINE1 are plotted for the BC cases and controls. Although all of these CpG sites showed significant results, the methylation levels of cases and controls were quite similar. The largest differences between cases and controls in median methylation level were observed for LINE1_CpG_1 [median methylation level in BC cases = 0.76 (IQR = 0.74-0.79) compared to controls 0.80 (IQR = 0.76-0.82), with adjusted P = 3.64E-10] and LINE1_CpG_12 [median methylation level in BC cases = 0.71 (IQR = 0.69-0.74) compared to controls 0.74 (IQR = 0.71-0.77), with adjusted P = 4.09E-07]. The lowest P value was observed for LINE1_CpG_1 (adjusted P = 3.64E-10, see Table 5). As LINE1_CpG_1 in particular showed a strong association with BC, the inventors also calculated the diagnostic AUC for this single CpG site through a ROC analysis.

ROC curve analysis of LINE1_CpG_1 methylation and LINE1 mean methylation (Figure 2) was used to estimate the potential clinical utility of LINE1 methylation; the AUC was 0.73 (95% CI 0.68-0.79) and 0.68 (95% CI 0.62-0.74), respectively.
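For a single marker, the AUC equals the probability that a randomly chosen control has a higher methylation value than a randomly chosen case, with ties counted as one half (the Mann-Whitney relationship). The sketch below illustrates this in plain Python; it is not the pROC computation used in the study, the function name `auc_hypomethylation` is hypothetical, and the beta values are invented for illustration.

```python
def auc_hypomethylation(case_values, control_values):
    """AUC for a marker that is lower in cases: P(control > case) + 0.5 * P(tie)."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if control > case:
                wins += 1.0
            elif control == case:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical beta values, for illustration only.
cases = [0.74, 0.76, 0.79, 0.80]
controls = [0.76, 0.80, 0.82]
print(auc_hypomethylation(cases, controls))  # 0.75
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases and controls.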

Moreover, in the quartile analysis the inventors observed that patients in the lowest methylation quartile of LINE1_CpG_1 had the highest OR, 38.47 (95% CI: 8.77-168.64), compared to the highest quartile. P for trend was 1.42E-07 for LINE1_CpG_1 methylation (Table 6).

Table 6. Associations between quartile of LINE1_CpG_1 methylation and the risk of breast cancer (BC)

Quartile       LINE1_CpG_1 methylation   BC case, n=229   control, n=151   OR (95% CI)           P* value
1              0.64 - 0.76               122              44               38.47(8.77-168.64)    1.30E-06
2              0.77 - 0.80               81               44               24.84(5.64-109.37)    2.16E-05
3              0.81 - 0.82               22               32               10.27(2.21-47.68)     2.93E-03
4              0.83 - 0.96               2                31               1.00(reference)
P for trend*                                                                                     1.42E-07

* p value and p for trend were calculated by logistic regression and adjusted for age. Significant p values are in bold. α = 0.05
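As a rough plausibility check, an unadjusted odds ratio can be computed directly from the case and control counts in Table 6. The sketch below does this for quartile 1 against the reference quartile 4, with a Woolf (log-based) confidence interval. It is an illustration only: the function name is hypothetical, and because the values in Table 6 come from an age-adjusted logistic regression, the crude figure is expected to differ from them.

```python
import math


def crude_odds_ratio(cases_exposed, controls_exposed, cases_ref, controls_ref):
    """Unadjusted odds ratio with a Woolf (log-based) 95% confidence interval."""
    odds_ratio = (cases_exposed * controls_ref) / (controls_exposed * cases_ref)
    se_log_or = math.sqrt(
        1 / cases_exposed + 1 / controls_exposed + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Quartile 1 versus quartile 4 (reference); counts taken from Table 6.
or_q1, lower_q1, upper_q1 = crude_odds_ratio(122, 44, 2, 31)
print(round(or_q1, 2))  # 42.98 (crude), versus 38.47 after age adjustment in Table 6
```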

Similarly, as shown in Table 7, an increased risk was found in the lower quartiles compared to the highest quartile of LINE1 mean methylation (OR = 5.94, 4.44 and 3.90; 95% confidence interval (CI) = 2.94-11.98, 2.13-9.28 and 1.84-8.24, respectively; p for trend = 1.33E-05).

Table 7. Associations between quartile of LINE1 mean methylation and the risk of breast cancer

Quartile       LINE1 mean methylation   case, n=229   control, n=151   OR (95% CI)         P* value
1              0.798 - 0.845            100           41               5.94(2.94-11.98)    6.54E-07
2              0.846 - 0.860            60            35               4.44(2.13-9.28)     7.35E-05
3              0.861 - 0.877            52            32               3.90(1.84-8.24)     3.66E-04
4              0.878 - 0.921            15            43               1.00(reference)
P for trend*                                                                               1.33E-05

* p value and p for trend were calculated by logistic regression and adjusted for age. Significant p values are in bold. α = 0.05

Alu

For Alu, the inventors investigated 11 CpG sites in the amplicon and found a significant difference in methylation of Alu_CpG_13 and Alu_CpG_14 in peripheral blood between cases and controls (see Table 8), with adjusted P = 0.002 and 0.006, respectively [median methylation level for Alu_CpG_13 = 0.63 (IQR = 0.62-0.65) in cases compared to controls 0.65 (IQR = 0.62-0.67), and median methylation level for Alu_CpG_14 = 0.50 (IQR = 0.48-0.55) in cases compared to controls 0.54 (IQR = 0.49-0.56)]. The other investigated CpG sites did not show significant differences between BC cases and controls. The box plots and ROC curves of Alu_CpG_13 and Alu_CpG_14 are shown in Figures 3 and 4, respectively, with AUC = 0.67 (95% CI 0.61-0.74) and 0.67 (95% CI 0.61-0.73).

Table 8. Comparison of DNA methylation in Alu amplicon between breast cancer patients and controls in peripheral blood

CpG site           BC case, n   Control, n   BC cases median (IQR)   Controls median (IQR)   P value a   Adjusted P value b
Alu_CpG_1,2        229          150          0.21(0.19-0.23)         0.21(0.20-0.23)         0.348       0.208
Alu_CpG_3          229          150          0.63(0.62-0.64)         0.63(0.62-0.64)         0.193       0.131
Alu_CpG_7          212          125          0.46(0.44-0.47)         0.46(0.44-0.47)         1.0         0.252
Alu_CpG_11,12      229          148          0.53(0.52-0.55)         0.53(0.52-0.55)         1.0         0.345
Alu_CpG_13         229          149          0.63(0.62-0.65)         0.65(0.62-0.67)         0.002       0.002
Alu_CpG_14         229          147          0.50(0.48-0.55)         0.54(0.49-0.56)         0.035       0.006
Alu_CpG_15,16,17   157          88           0.65(0.63-0.67)         0.66(0.63-0.68)         0.348       0.128
MEAN               229          150          0.51(0.50-0.53)         0.52(0.50-0.53)         0.348       0.343

Abbreviations: IQR: interquartile range OR: odds ratio

a P value for the difference between controls and patients was analysed by Mann-Whitney U test and adjusted by Bonferroni-Holm method, α=0.00625

b Adjusted P value was calculated by logistic regression and adjusted for age. Significant P values are in bold (P < 0.05).

In addition, as shown in Table 9, the inventors found an increased risk in the two lowest quartiles compared with the highest quartile of Alu_CpG_13 methylation (OR = 2.50 and 2.10; 95% confidence interval (CI) = 1.25-5.02 and 1.05-4.19, respectively; P for trend = 0.002). In contrast, as shown in Table 10, no significant trend in risk across the quartiles of Alu_CpG_14 methylation was found (P for trend = 0.08).

Table 9. Associations between quartile of Alu_CpG_13 methylation and the risk of breast cancer

Quartile       Alu_CpG_13 methylation   case, n=229   control, n=151   OR (95% CI)*       P* value
1              0.57 - 0.62              92            41               2.50(1.25-5.02)    0.01
2              0.63 - 0.65              87            45               2.10(1.05-4.19)    0.04
3              0.66 - 0.67              27            37               0.87(0.40-1.88)    0.72
4              0.68 - 0.73              23            26               1.00(reference)
P for trend*                                                                              0.002

* p value and p for trend were calculated by logistic regression and adjusted for age. Significant p values are in bold. α = 0.05

Table 10. Associations between quartile of Alu_CpG_14 methylation and the risk of breast cancer

Quartile       Alu_CpG_14 methylation   case, n=229   control, n=147   OR (95% CI)*       P* value
1              0.43 - 0.49              98            45               2.23(1.21-4.13)    0.01
2              0.50 - 0.54              62            41               1.62(0.86-3.07)    0.14
3              0.55 - 0.56              35            27               1.49(0.73-3.06)    0.27
4              0.57 - 0.62              34            34               1.00(reference)
P for trend*                                                                              0.08

* p value and p for trend were calculated by logistic regression and adjusted for age. Significant p values are in bold. α = 0.05

Combination of LINE1 and Alu

Finally, the inventors performed a multiple logistic regression analysis including all variables from both repetitive elements plus age (see Table 11). For this calculation, Alu_CpG_7 and Alu_CpG_15,16,17 were excluded because of the large number of missing values. Neither of these CpGs was significant in the univariate analysis. Again, LINE1_CpG_1 was dominant, but several other CpGs contributed additional information. Furthermore, the effect of Alu_CpG_13 was no longer significant. A ROC analysis of this model gave an AUC of 0.78 (95% CI: 0.72-0.83) (Figure 5). The ".632+" corrected AUC was 0.74.

Table 11. Combination analysis of LINE1 and Alu

CpG site          BC cases median (IQR), n=220   Controls median (IQR), n=145   OR (95% CI)          P value*
Alu_CpG_1,2       0.21(0.19-0.23)                0.21(0.20-0.23)                0.18(0.05-0.61)      0.006
Alu_CpG_3         0.63(0.62-0.64)                0.63(0.62-0.64)                1.82(0.23-14.76)     0.574
Alu_CpG_11,12     0.53(0.52-0.55)                0.53(0.52-0.55)                13.39(2.17-82.84)    0.005
Alu_CpG_13        0.63(0.62-0.65)                0.65(0.62-0.67)                0.60(0.13-2.71)      0.503
Alu_CpG_14        0.50(0.48-0.55)                0.54(0.49-0.56)                5.13(1.69-15.55)     0.004
LINE1_CpG_1       0.76(0.74-0.79)                0.80(0.76-0.82)                0.05(0.01-0.21)      <0.001
LINE1_CpG_2       0.92(0.91-0.94)                0.93(0.91-0.94)                5.04(0.85-30.03)     0.076
LINE1_CpG_3,4,5   0.88(0.86-0.89)                0.88(0.87-0.90)                1.27(0.23-6.96)      0.783
LINE1_CpG_9       0.87(0.86-0.88)                0.88(0.87-0.90)                0.98(0.14-6.70)      0.984
LINE1_CpG_12      0.71(0.69-0.74)                0.74(0.71-0.77)                0.84(0.35-2.02)      0.696
LINE1_CpG_14      0.90(0.89-0.91)                0.91(0.90-0.92)                0.01(0.001-0.22)     0.002
LINE1_CpG_15      0.94(0.78-0.98)                0.90(0.80-0.98)                1.03(0.80-1.34)      0.807
LINE1_CpG_16,17   0.88(0.90-0.91)                0.88(0.90-0.91)                1.52(0.91-2.53)      0.114

*p values are calculated by multiple logistic regression and adjusted for age and the other CpGs of the table. Significant p values are in bold.

Abbreviations: IQR: interquartile range OR: odds ratio

If only the two most promising univariate candidates, LINE1_CpG_1 and Alu_CpG_13, plus age are included (Table 12), Alu_CpG_13 still loses significance and an AUC of 0.73 (95% CI: 0.68-0.79, corrected AUC = 0.73) is observed (Figure 6).

Table 12. Combination analysis of LINE1_CpG_1 and Alu_CpG_13

CpG site      BC cases median (IQR), n=220   Controls median (IQR), n=147   OR (95% CI)        P value*
LINE1_CpG_1   0.76(0.74-0.79)                0.80(0.76-0.82)                0.09(0.04-0.20)    <0.001
Alu_CpG_13    0.63(0.62-0.65)                0.65(0.62-0.67)                1.76(0.64-4.86)    0.274

*p values are calculated by multiple logistic regression and adjusted for age and the other CpGs of the table. Significant p values are in bold.

Abbreviations: IQR: interquartile range OR: odds ratio

The inventors also calculated a model with the 7 most important variables plus age (see Table 13). The AUC is 0.77 (95% CI = 0.72-0.83, cross-validation AUC = 0.75) (Figure 7).

Table 13. Combination analysis of LINE1 and Alu

CpG site          BC cases median (IQR)   Controls median (IQR)   OR (95% CI)          P value*
Alu_CpG_1,2       0.21(0.19-0.23)         0.21(0.20-0.23)         0.19(0.06-0.61)      0.005
Alu_CpG_11,12     0.53(0.52-0.55)         0.53(0.52-0.55)         13.91(2.76-70.00)    0.001
Alu_CpG_14        0.50(0.48-0.55)         0.54(0.49-0.56)         4.18(1.60-10.96)     0.004
LINE1_CpG_1       0.76(0.74-0.79)         0.80(0.76-0.82)         0.04(0.01-0.15)      <0.001
LINE1_CpG_2       0.92(0.91-0.94)         0.93(0.91-0.94)         4.96(1.12-21.87)     0.035
LINE1_CpG_14      0.90(0.89-0.91)         0.91(0.90-0.92)         0.01(0.001-0.17)     0.001
LINE1_CpG_16,17   0.88(0.90-0.91)         0.88(0.90-0.91)         1.60(1.0-1.05)       0.051

*p values are calculated by multiple logistic regression and adjusted for age and the other CpGs of the table. Significant p values are in bold.

Abbreviations: IQR: interquartile range OR: odds ratio

Correlation of LINE1 and Alu methylation with clinical characteristics of BC patients

The inventors evaluated the association of the methylation levels of these two repetitive elements with clinical features of BC patients. Overall, there were no significant associations between the methylation levels of most CpG sites in LINE1 and Alu and the different clinical characteristics. Further studies are needed to discover the association between blood DNA methylation and clinicopathological parameters (see Tables 14 and 15).

Table 14. The LINE1 methylation in sporadic BC patients with different clinical characteristics

Columns CpG 1 to MEAN give the median of methylation levels.

Clinical characteristics (N)   Group (N)                    Median of Age   CpG 1   CpG 2   CpG 3,4,5   CpG 9   CpG 12   CpG 14   CpG 15   CpG 16,17   MEAN
Her-2 status (181)             Her-2 negative (165)         48              0.76    0.92    0.87        0.87    0.71     0.90     0.93     0.88        0.85
                               Her-2 positive (16)          46              0.75    0.92    0.88        0.86    0.70     0.89     0.96     0.88        0.84
                               P value (Mann-Whitney U)     0               0.402   0.645   0.608       0.203   0.325    0.211    0.359    0.506       0.564
triple-negative BC (181)       non-triple neg. BC (161)     48              0.76    0.92    0.87        0.87    0.71     0.90     0.93     0.88        0.85
                               triple neg. BC (20)          48              0.76    0.93    0.88        0.87    0.70     0.89     0.95     0.89        0.85
                               P value (Mann-Whitney U)     0               0.288   0.871   0.815       0.705   0.206    0.604    0.376    0.978       0.968
Tumor Nr. (221)                1 tumor (217)                48              0.76    0.92    0.87        0.87    0.71     0.90     0.94     0.88        0.85
                               2 tumor (4)                  51              0.78    0.94    0.90        0.89    0.73     0.91     0.88     0.89        0.87
                               P value (Mann-Whitney U)     0               0.780   0.940   0.900       0.890   0.730    0.905    0.875    0.890       0.867

* patients underwent neoadjuvant chemotherapy


Table 15. The Alu methylation in sporadic BC patients with different clinical characteristics

Table 15 (continued)

Literature

Gene    Author, year [ref]          BC cases No./Controls No.   Meth (cases) (%)   Meth (controls) (%)   P value
LINE1   Kitkumthorn N, 2012 [40]    36/144                      40                 42                    > 0.05
        Xu X, 2012 [38]             1064/1100                   78.8               78.8                  0.94
        pyrosequencing (mean)       279/340                     74.5 ± 3.0         74.5 ± 2.6            > 0.05
        Cho YH, 2010 [41]           40/40                       70                 78                    > 0.05
        Choi JY, 2009 [39]          19/18                       74.7               73.9                  0.176
Alu     Wu HC, 2012 [42]            266/334                     95.5 ± 36.6        98.7 ± 51.5           > 0.05
        Cho YH, 2010 [41]           40/40                       58                 61                    > 0.05

450k results

Measure   BC cases No./Controls No.   cases Mean ± SD   controls Mean ± SD   P value
MEAN      48/48                       0.523 ± 0.28      0.524 ± 0.24         0.09

Sequenom

Measure      BC cases No./Controls No.   cases Median (IQR)   controls Median (IQR)   P value
LINE1 MEAN   229/151                     0.85(0.84-0.86)      0.86(0.85-0.88)         2.14E-06
Alu MEAN     229/151                     0.51(0.50-0.53)      0.52(0.50-0.53)         0.348

Comparison of the results from this study with the results of Infinium HumanMethylation450 BeadChip array and with literature

As the methylation level of repetitive elements is thought to reflect the average methylation level of genomic DNA, the inventors compared the mean methylation of LINE1 and Alu in the Sequenom data with the mean of all the CpG sites in the 450K methylation array. In the epigenome-wide 450K methylation data, the inventors observed that the mean methylation level was lower in peripheral blood of patients compared to controls, but the difference was not statistically significant (0.523 ± 0.28 and 0.524 ± 0.24 for cases and controls, respectively, P = 0.09, Table 15). In this study, the mean methylation level of LINE1 was significantly lower in peripheral blood DNA of BC cases than in controls [median of mean methylation level in BC cases = 0.85 (IQR = 0.84-0.86) compared to controls 0.86 (IQR = 0.85-0.88), with P = 2.14E-06]. However, the mean methylation of Alu did not show a significant difference between BC patients and healthy controls (see Table 16).

Discussion

In their study, the inventors found statistically significant LINE1 hypomethylation in the peripheral blood DNA of BC patients compared with healthy controls, especially for LINE1_CpG_1. They identified inter alia LINE1_CpG_1 methylation to be strongly associated with BC. Also for Alu, they observed that single CpG sites were significantly hypomethylated in the peripheral blood DNA from BC cases compared to controls. Furthermore, the results show that the decreased methylation level of inter alia LINE1_CpG_1 is associated with an increased BC risk. Also, quartiles of LINE1 methylation levels are associated with BC, with an increased risk observed in particular in the lowest quartile compared with the highest quartile.

However, other studies found that there were no significant differences in the methylation levels of LINE1 and Alu in peripheral blood DNA between BC cases and healthy controls, measured by different detection methods including pyrosequencing, MethyLight and COBRA (Brennan et al., Cancer Res 2012; 72(9):2304-2313; Xu et al., FASEB J 2012; 26(6):2657-2666; Choi et al., Carcinogenesis 2009; 30(11):1889-1897; Kitkumthorn et al., Clin Chim Acta 2012; 413(9-10):869-874; Cho et al., Anticancer Res 2010; 30(7):2489-2496; Wu et al., Carcinogenesis 2012; 33(10):1946-1952). In contrast to the inventors' observations, Wu et al. reported that there was no association between BC and LINE-1 and Alu methylation (Wu et al., Carcinogenesis 2012; 33(10):1946-1952).

In summary, the inventors' study indicates that hypomethylation of CpG sites in Alu and especially in LINE1 elements in peripheral blood DNA is a biomarker for cancer.

Example 2: Investigation of miRNA expression in peripheral blood from breast cancer patients

Methods

miRNA extraction from liquid samples (plasma and serum)

For total RNA extraction (including miRNAs) from liquid samples, a combination of phenol-based sample lysis and silica-membrane column extraction was applied. TRIzol LS was added to the liquid samples (volume ratio 3:1), and the samples were homogenized by brief vortexing and incubated for 5 minutes at room temperature to permit the nucleoprotein complexes to dissociate. During the incubation, 10 fmol of a synthetic cel-miR-39 oligo was spiked in and 10 μg of glycogen was added. The spiked-in cel-miR-39 was used later on for normalization, and glycogen served as an RNA carrier to facilitate the extraction because low RNA yields were expected. Chloroform was added to the lysed samples (volume ratio 1:5), which were immediately shaken. Vigorous, simultaneous vortexing of all samples followed before they were incubated at room temperature for 5 minutes. By centrifugation at 12000g for 15 minutes at 8°C the samples separated into three phases. The upper, aqueous phase containing the RNA was transferred to microcentrifugation tubes. The total RNA extraction continued with the components of the miRNeasy Mini kit as per the manufacturer's recommendations. The addition of 1.5 volumes of absolute ethanol to the aqueous phase established the conditions for binding of RNA molecules >18 nucleotides in length. To obtain higher RNA concentrations, the eluates were re-applied to the same columns and the elution was repeated. The eluates were subsequently stored at -80°C until use.

Overview of circulating miRNA profiling and validation studies

TaqMan® Low Density Arrays (TLDA) are 384-well microfluidic cards pre-loaded with dried miRNA-specific TaqMan primers and probes for miRNA quantification. The quantification of miRNAs is based on the two-step RT-PCR described below. After sample preparation, hundreds of miRNAs from one sample are simultaneously reverse transcribed to cDNA in a Megaplex reverse transcription (RT). For samples with low miRNA concentrations, a pre-amplification of cDNA is performed to improve the sensitivity of subsequent miRNA detection. After the cDNA product of one sample is loaded onto the card, the real-time PCR based profiling of miRNAs begins.

Profiling of plasma miRNAs with TaqMan® Low Density Arrays (TLDA)

Plasma samples from 10 early stage breast cancer patients and 10 healthy controls were profiled. All patients had a stage I or II invasive ductal carcinoma (IDC), which was ER/PR-positive and HER2-negative. Patients and healthy controls were age-matched. The mean (median) age of the patients was 54 (51) years, while it was 53 (54.5) years for the controls. Profiling of plasma samples was carried out utilizing TLDA array Human microRNA Cards A v2.1 and B v2.0 from Applied Biosystems. These arrays measure the expression of 667 human, mature miRNAs from miRBase version v.10. To obtain a full miRNA profile, two Megaplex reverse transcription (RT) reactions (Pool A and B), two pre-amplification reactions (Pool A and B) and two TaqMan MicroRNA Arrays (Array Cards A and B) were run per sample. In the first step, single-stranded cDNA was synthesized using the TaqMan MicroRNA Reverse Transcription Kit and TaqMan MicroRNA Megaplex RT Human Pool Sets A & B. The RT reaction had a final volume of 7.5 μl and contained a fixed volume of miRNA template (3 μl) and 4.5 μl of the Megaplex RT reaction mix (see below). The reverse transcription was carried out in a STORM GS2 PCR cycler.

Components of the Megaplex reverse transcription (RT) reaction mix: 0.8 μl Megaplex RT primers (10x), 0.2 μl 100 mM dNTPs with dTTP, 1.5 μl MultiScribe RTase (50 U/μl), 0.8 μl 25 mM RT buffer, 0.9 μl 25 mM MgCl2, 0.1 μl RNase inhibitor (20 U/μl). Thermal-cycling conditions for the Megaplex reverse transcription (RT): 40 cycles of 16°C for 2 min, 42°C for 1 min and 50°C for 1 sec. Hold at 85°C for 5 min and then hold at 4°C.

Due to low miRNA quantities in plasma, cDNA was pre-amplified in a 25 μl reaction comprising 2.5 μl of the Megaplex RT product and 22.5 μl PreAmp reaction mix described below. The reaction was carried out in a STORM GS2 PCR cycler under thermal-cycling conditions described below.

Components of the pre-amplification (PreAmp) reaction mix: 12.5 μl TaqMan PreAmp Master Mix (2x), 2.5 μl Megaplex PreAmp primers (10x), 7.5 μl nuclease-free water. Thermal-cycling conditions for the pre-amplification (PreAmp): 95°C for 10 min, 55°C for 2 min, 72°C for 2 min, 12 cycles of 95°C for 15 sec and 60°C for 4 min, hold at 99.9°C for 10 min, then hold at 4°C.
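When several samples are processed together, the per-reaction PreAmp volumes listed above are typically scaled to a batch. The following sketch shows one way to do so; the function name, the batch size of 20 reactions and the 10% pipetting overage are assumptions made for illustration and are not stated in the source.

```python
def scale_master_mix(per_reaction_ul, n_reactions, overage=0.10):
    """Scale per-reaction volumes (in microlitres) to a batch with fractional overage."""
    factor = n_reactions * (1 + overage)
    return {name: round(volume * factor, 2) for name, volume in per_reaction_ul.items()}


# Per-reaction PreAmp mix as listed in the text (22.5 microlitres in total).
preamp_mix = {
    "TaqMan PreAmp Master Mix (2x)": 12.5,
    "Megaplex PreAmp primers (10x)": 2.5,
    "nuclease-free water": 7.5,
}
per_reaction_total = sum(preamp_mix.values())

# Assumed batch: 20 reactions with 10% overage (illustrative values only).
batch = scale_master_mix(preamp_mix, n_reactions=20)
for component, volume in batch.items():
    print(f"{component}: {volume} ul")
```

The per-reaction total of 22.5 μl plus 2.5 μl of Megaplex RT product gives the 25 μl reaction volume stated above.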

Prior to loading on the TLDA microfluidic cards, the PreAmp product was diluted with 75 μl of nuclease-free water and stored at -20°C until use (within 2 days of pre-amplification). The TLDA array real-time PCR reaction mix components are listed below. In each of the 8 microfluidic card ports, 100 μl of the TLDA array real-time PCR reaction mix was loaded as described by the manufacturer, and the arrays were run in an ABI PRISM 7900HT thermal cycler as specified below.

Components of the TLDA array real-time PCR reaction mix: 450 μl TaqMan Universal PCR Master Mix, No AmpErase UNG (2x), 9 μl diluted PreAmp product, 441 μl nuclease-free water. Thermal-cycling conditions for the TLDA array real-time PCR reaction: 50°C for 2 min, 94.5°C for 10 min, 40 cycles of 97°C for 30 sec and 59.7°C for 1 min, hold at 4°C.

Statistical analysis of circulating miRNA profiling data (TLDA)

Raw Ct values from TLDA runs were exported using SDS Relative Quantification Software version 2.2.2 (Applied Biosystems) with automatic baseline and threshold settings. For filtering, normalization and quality assessment all 20 samples (10 early stage breast cancer patients and 10 healthy controls) were processed together. The analysis was performed utilizing the statistical computational environment R version 2.11 and the R package HTqPCR version 1.2.0. First, Ct values larger than 35 were classified as "undetermined" and miRNAs with Ct>35 across all the samples were filtered out from further analysis. Second, following quantile normalization of the data, quality control plots, i.e. Pearson's correlations of Ct values across samples and principal components analysis (PCA) plots, were constructed to identify potential outliers. Overall, after filtering and averaging of few duplicate miRNA measurements, a total of 402 miRNAs remained for a statistical analysis performed by fitting linear models to the array data in order to identify miRNAs, which are deregulated between the plasma samples of the cases and controls with significant two-tailed P values (P<0.05). The results of this limma (linear models for microarrays) analysis were additionally adjusted for multiple testing by computing the false discovery rate (FDR).
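The filtering and quantile-normalization steps described above can be illustrated with a small plain-Python sketch. It is not the HTqPCR implementation used in the study; the function names and the Ct values are hypothetical, and ties within a sample are resolved by input order rather than by averaging.

```python
def filter_undetected(ct_rows, cutoff=35.0):
    """Drop miRNAs whose Ct exceeds the cutoff in every sample."""
    return {name: cts for name, cts in ct_rows.items() if any(ct <= cutoff for ct in cts)}


def quantile_normalize(ct_rows):
    """Give every sample (column) the same distribution: the mean of the sorted columns."""
    names = list(ct_rows)
    n_samples = len(next(iter(ct_rows.values())))
    columns = [[ct_rows[name][j] for name in names] for j in range(n_samples)]
    sorted_columns = [sorted(column) for column in columns]
    # Mean Ct at each rank across all samples.
    rank_means = [
        sum(column[r] for column in sorted_columns) / n_samples for r in range(len(names))
    ]
    normalized = {name: [0.0] * n_samples for name in names}
    for j, column in enumerate(columns):
        order = sorted(range(len(names)), key=lambda i: column[i])
        for rank, i in enumerate(order):
            normalized[names[i]][j] = rank_means[rank]
    return normalized


# Hypothetical Ct values for three miRNAs in two samples.
raw = {"miR-a": [24.0, 25.0], "miR-b": [28.0, 27.0], "miR-c": [40.0, 40.0]}
filtered = filter_undetected(raw)           # miR-c is undetected in all samples
normalized = quantile_normalize(filtered)
print(normalized)  # {'miR-a': [24.5, 24.5], 'miR-b': [27.5, 27.5]}
```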

Results

Array-based profiling of circulating miRNAs from plasma (TLDA arrays) and normalization of TLDA data

As the main goal was to develop a miRNA panel sensitive enough to detect even early stage cancers, the plasma samples selected for the initial miRNA profiling step intentionally comprised only early stage, i.e. stage I and II, breast cancer patients. The investigated samples were from patients with invasive ductal carcinomas of the luminal subtype (representing the most common type of breast cancer). Furthermore, the patient and control samples selected for the arrays were age-, gender- and ethnicity-matched. To obtain a full miRNA profile, two TLDA microfluidic cards were used for each sample, measuring the levels of 667 human miRNAs. The array run quality check was conducted by analyzing a heat map of Pearson's correlations of Ct values across samples and principal components analysis (PCA) plots for dates of miRNA extraction, pre-amplification and TLDA runs. One sample (B024) that behaved differently regarding the real-time PCR data was flagged as an outlier and removed from further analysis steps. After the removal of sample B024, there were no other noticeable outliers regarding the array real-time PCR data. Circulating miRNAs classified as undetermined (Ct>35 in all investigated samples) were filtered out, and raw Ct values were quantile normalized so that the values from different runs would have the same distribution and could be easily compared. As could be seen from density plots of Ct values across the arrays, the Ct values formed two peaks before and after normalization. On microfluidic card A most of the miRNAs displayed Ct values around 30. The other peak around a Ct value of 40 represents the fraction of undetected miRNAs. On card B most miRNAs were present at lower levels compared to the miRNAs from card A, which resulted in a shift of the first peak to approximately a Ct value of 32. Finally, a relatively large proportion of miRNAs present on array card B was not detected, resulting in a second peak around a Ct value of 40.

Statistical analysis of TLDA data

After filtering of undetermined miRNAs and normalization, 402 miRNAs remained for subsequent statistical analysis. It was necessary to define a subset of circulating miRNAs with the capability to truly differentiate between samples derived from healthy women and those with breast cancer. The inventors looked for circulating miRNAs that show statistically significant differences between these two groups of samples by performing a limma test, a differential expression analysis for data arising from microarray experiments. This analysis resulted in 38 circulating miRNAs that were significantly deregulated in the plasma of early stage breast cancer patients.

21 miRNAs were found at different levels in the plasma of breast cancer patients vs. controls, among them:

miRNA             P value   adj. P value (FDR)   mean Ct (controls)   mean Ct (cases)   ΔCt*
hsa-miR-328       0.0004    0.07                 28.6                 27.9              1.2
hsa-miR-320       0.001     0.11                 24.0                 24.8              -0.8
hsa-miR-145       0.004     0.23                 28.1                 27.1              1.0
hsa-miR-339-3p    0.004     0.23                 30.9                 30.2              0.7
hsa-miR-193a-3p   0.007     0.28                 38.3                 40.0              -1.7

FDR: false discovery rate, *: ΔCt = mean Ct controls - mean Ct cases.
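The ΔCt definition in the footnote can be applied directly to the tabulated mean Ct values. The sketch below does this for three of the listed miRNAs. Because a lower Ct means that the target was detected earlier, a positive ΔCt indicates a higher level in cases. The approximate fold change of 2^ΔCt assumes ideal PCR efficiency, which is an assumption of this sketch and is not stated in the source; the function name is hypothetical.

```python
# Mean Ct values (controls, cases) as listed in the table above.
mean_ct = {
    "hsa-miR-320": (24.0, 24.8),
    "hsa-miR-145": (28.1, 27.1),
    "hsa-miR-193a-3p": (38.3, 40.0),
}


def delta_ct(ct_controls, ct_cases):
    """Delta Ct as defined in the footnote: mean Ct controls minus mean Ct cases."""
    return round(ct_controls - ct_cases, 1)


for name, (controls, cases) in mean_ct.items():
    d = delta_ct(controls, cases)
    direction = "higher in cases" if d > 0 else "lower in cases"
    # Approximate fold change, assuming the amount doubles per cycle (assumption).
    fold_change = 2 ** d
    print(f"{name}: delta Ct = {d}, {direction}, approx. fold change = {fold_change:.2f}")
```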