
Title:
OLFACTOMEDIN-4, NEUDESIN AND DESMOPLAKIN AS BIOMARKERS OF BREAST CANCER
Document Type and Number:
WIPO Patent Application WO/2015/075242
Kind Code:
A1
Abstract:
The present invention is in the technical field of breast cancer management, and more particularly relates to the diagnosis of breast cancer. The invention is more particularly based on the finding that specific biomarkers (olfactomedin-4, neudesin and desmoplakin) are aberrantly expressed in the blood of breast cancer patients.

Inventors:
GUETTE CATHERINE (FR)
RARO PEDRO (FR)
COQUERET OLIVIER (FR)
BARRE BENJAMIN (FR)
CAMPONE MARIO (FR)
Application Number:
PCT/EP2014/075427
Publication Date:
May 28, 2015
Filing Date:
November 24, 2014
Assignee:
INST CANCEROLOGIE DE L OUEST (FR)
UNIV ANGERS (FR)
INST NAT SANTE RECH MED (FR)
International Classes:
G01N33/50; G01N33/574
Domestic Patent References:
WO2013033609A2 (2013-03-07)
WO2011113047A2 (2011-09-15)
WO2012117267A1 (2012-09-07)
Foreign References:
US20070037228A1 (2007-02-15)
US20110085982A1 (2011-04-14)
Other References:
SAORI KOSHIDA ET AL: "Specific overexpression of OLFM4/GW112/hGC-1 mRNA in colon, breast and lung cancer tissues detected using quantitative analysis", CANCER SCIENCE, JAPANESE CANCER ASSOCIATION, TOKYO, JP, vol. 98, no. 3, 1 March 2007 (2007-03-01), pages 315 - 320, XP002663354, ISSN: 1347-9032, [retrieved on 20070110], DOI: 10.1111/J.1349-7006.2006.00383.X
Attorney, Agent or Firm:
REGIMBEAU (20 rue de Chazelles, Paris Cedex 17, FR)
Claims:
CLAIMS

1. An in vitro method for diagnosing a breast cancer in a subject, comprising the steps of:

a) determining from a biological fluid sample of a subject the protein expression level of at least one biomarker selected from the group consisting of Olfactomedin-4, Neudesin, Desmoplakin, and any combination thereof; and

b) comparing said expression level with a reference expression level of said biomarker.

2. The method according to claim 1, wherein a protein expression level of said at least one biomarker dysregulated by comparison to said reference expression level obtained from a biological fluid sample of at least one healthy subject, is indicative that said subject is suffering from breast cancer.

3. The method according to claim 2, wherein said expression level of:

Olfactomedin-4 is superior to said reference expression level, and/or

Neudesin is superior to said reference expression level, and/or

Desmoplakin is inferior or superior to said reference expression level,

in the biological fluid sample of said subject suffering from breast cancer.

4. The method according to claim 3, wherein said expression level of Desmoplakin inferior to said reference expression level is indicative that said breast cancer is an early breast cancer.

5. The method according to claim 3, wherein said expression level of Desmoplakin superior to said reference expression level is indicative that said breast cancer is a recurring breast cancer.

6. The method according to any one of claims 1 to 5, wherein the protein expression levels of at least two, preferably three, of said biomarkers are determined in step a).

7. The method according to any one of claims 1 to 6, further comprising the step of determining the protein expression level of at least one standard biomarker associated with breast cancer, said standard biomarker being preferably selected from the group consisting of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2).

8. An in vitro method for determining a drug-responding or non-responding phenotype in a subject suffering from breast cancer, comprising the steps of:

a) determining from a biological fluid sample of said subject the protein expression level of at least one biomarker as defined in claim 1;

b) comparing the protein expression level in step a) to a reference expression level of said biomarker; and

c) determining from said comparison the drug-responding or non-responding phenotype.

9. A method for designing or adapting a treatment regimen for a subject suffering from breast cancer, comprising the steps of:

a) determining from a biological sample of said subject a drug-responding or non-responding phenotype according to the method of claim 8; and

b) designing or adapting a treatment regimen for said subject based upon said responding or non-responding phenotype.

10. A screening method for identifying a drug or combination of drugs suitable for treating breast cancer, comprising the steps of:

a) contacting isolated breast cancer cells or cell line displaying a breast cancer phenotype with a candidate drug or combination of candidate drugs;

b) determining, from said cells or cell line contacted with said drug or combination of drugs, the protein expression level of at least one biomarker as defined in claim 1; and

c) comparing the protein expression level of said biomarker in step b) to its expression level in the absence of said drug or combination of drugs.

11. The method according to any one of claims 1 to 9, wherein said biological fluid is selected from the group consisting of blood, serum, plasma, lymph, tumor interstitial fluid, saliva, mucus, sputum, sweat, and urine.

12. The method according to claim 11, wherein said biological fluid is serum.

13. The method according to any one of claims 1 to 12, wherein the protein expression level is determined by a method selected from the group consisting of Western blot, immunoblot, enzyme-linked immunosorbent assay (ELISA), enzyme-linked immunospot (ELISPOT), radioimmunoassay (RIA), immunohistochemistry, immunoprecipitation, fluorescence activated cell sorting (FACS), microscopy, flow cytometry, microcytometry, protein binding assay, ligand binding assay, microarray, polyacrylamide gel electrophoresis such as SDS-PAGE, surface plasmon resonance (SPR), Förster resonance energy transfer (FRET), Bioluminescence resonance energy transfer (BRET), chemiluminescence, fluorescent polarization, phosphorescence, mass spectrometry, magnetic resonance imaging (MRI), and any combination thereof.

14. A kit for use in a method according to any one of claims 1 to 13, comprising:

a) at least one reagent capable of specifically determining the protein expression level of at least one biomarker as defined in claim 1; and

b) instructions for performing said method.

15. A protein microarray for use in a method according to any one of claims 1 to 13, comprising:

a) at least one reagent capable of specifically determining the protein expression level of at least one biomarker as defined in claim 1.

Description:
OLFACTOMEDIN-4, NEUDESIN AND DESMOPLAKIN AS BIOMARKERS OF BREAST CANCER

INTRODUCTION

The present invention is in the technical field of breast cancer management, and more particularly relates to the diagnosis of breast cancer. The invention is more particularly based on the finding that specific biomarkers are aberrantly expressed in the blood of breast cancer patients.

With over 1.3 million cases of invasive breast cancers diagnosed annually, and more than 450,000 deaths reported per year, breast cancer is the most common malignancy diagnosed in women and one of the leading causes of cancer-related death in females.

The early detection of breast cancer is the cornerstone for reducing mortality rates in this cancer that affects one in nine women. Currently, breast cancer screening campaigns are delivered through mammography and although there is no doubt of their efficacy, this approach does have limitations in terms of sensitivity in women who have very dense breast tissue and in young women considered "at risk" (family history or genetic predisposition) for whom the regular use of ionising radiation is not recommended. Furthermore, according to recent work published in the Lancet (Independent UK Panel on Breast Cancer Screening, 2012), mammography screening leads to overdiagnosis in 19% of women. In other words, about one in five diagnoses is said to be an overdiagnosis. Other imaging techniques such as sonography and nuclear magnetic resonance imaging are available, but they are not generally used for detection, being instead used as a further examination after mammography.

Besides, despite improvement in breast cancer therapies, local, contralateral breast or distant recurrence (also known as metastasis) occurs in 10 to 20% of patients in the three to ten years following initial adjuvant treatment. However, such recurrence is often either missed or identified as false positive by mammography, and unnecessary biopsies are performed on patients suspected of relapse.

It has thus become critical to identify reliable biomarkers allowing, without routine recourse to imaging techniques or invasive biopsies, not only the early detection of a breast tumour but also the monitoring of cancer progression. Alongside imaging techniques, a great deal of work examining the expression of genes or proteins in breast tumour tissue has been carried out, but the number of biomarkers that could potentially be used for reliably detecting breast cancer was very limited, mainly because they lacked sensitivity in the clinical context. In this regard, the serum biomarkers prostate-specific antigen (PSA), CA 15-3, and carcinoembryonic antigen (CEA), which have demonstrated some value in the diagnosis and treatment of other cancers, did not prove to be useful in the detection and monitoring of breast cancer as they lacked the desired sensitivity and specificity.

There is thus an urgent need to identify breast cancer biomarkers that are easily detectable, sensitive enough to detect the presence of a tumour in breast cancer patients, and specific enough to not detect such tumour in those who do not have cancer.

The above discussed needs are addressed by the present invention, which reports herein the results of an investigation conducted by comparative proteome mapping of breast tumours and by validation of dysregulated secreted proteomic biomarkers on a large cohort of breast cancer patients. By contrast to genomic biomarkers, proteomic biomarkers are indeed particularly advantageous as they are more reflective of a tumour microenvironment and can undergo cancer specific posttranslational modifications.

In particular, the inventors have demonstrated that dysregulation in protein expression level of Olfactomedin-4 (OLFM4), Neudesin (NENF) and/or Desmoplakin (DSP) correlates with breast cancer, and that such biomarkers are detectable in blood samples of patients. It has notably been discovered that the expression levels of Olfactomedin-4 and Neudesin are higher in breast cancer patients throughout progression of the disease, while the expression level of Desmoplakin is lower at an early stage and higher in the case of recurrence, by comparison to healthy subjects.

Olfactomedin-4 is a secreted N-glycosylated protein belonging to the olfactomedin domain-containing protein family, which is characterized by a coiled-coil N-terminal domain and a well-conserved C-terminal olfactomedin domain. The OLFM4 protein has been described in the literature as mediating cell adhesion through binding to cadherins and lectins (Liu et al., 2006), and as being involved in the regulation of cellular apoptosis and in the proliferation of cancer cells (Zhang et al., 2004; Kobayashi et al., 2007).

Neudesin on the other hand is an extracellular heme-binding protein which has been described as displaying neurotrophic activity in neurons via the mitogen-activated protein kinase (MAPK) and phosphatidylinositol 3-kinase (PI3K) pathways (Kimura et al., 2013).

Desmoplakin is a founding member of the plakin family, and is known as the principal plaque protein of desmosomes (Leung et al., 2002). It is therefore specialized in adhesion junctions found in various tissues and plays a critical role in the maintenance of epithelial tissue integrity. Recently, studies suggested that desmosomes participate in the regulation of cell motility, growth, differentiation and apoptosis (Allen et al., 1996; Wan et al., 2007; and Rickelt et al., 2009). Two isoforms of Desmoplakin have been reported so far, Desmoplakin I (322 kDa) and Desmoplakin II (259 kDa), both encoded by the Desmoplakin gene on human chromosome 6p24.3. Desmoplakin proteins interact with plakoglobin (γ-catenin), plakophilins and intermediate filaments, providing the intimate link between desmosomal cadherins and the cytoskeleton (Junkman et al., 2005; and Kowalczyk et al., 1997).

The above biomarkers can be used herein to detect breast cancer, from a mere blood sample, to monitor disease progression, to assess response to breast cancer treatment, but also to develop and adapt a breast cancer treatment. They can also be used as therapeutic targets to design novel drugs.

Therefore, based on the findings disclosed herein, the present invention provides for the first time a reliable and easy to perform diagnostic method for breast cancer, which is based on determination of the expression level of the above-mentioned biomarker(s). The invention further provides a screening method for identifying drugs, a method for determining a drug-responding or non-responding phenotype, as well as a method for designing or adapting a treatment regimen. Kits and protein microarrays for carrying out the methods of the invention are also provided herein.

DETAILED DESCRIPTION OF THE INVENTION

Unless stated otherwise, scientific and technical terms used in connection with the present invention shall have the meanings that are commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, nomenclatures used herein, and techniques of molecular biology and cell culture are those well-known and commonly used in the art.

Nevertheless, with respect to the use of different terms throughout the current specification, the following definitions more particularly apply.

According to the different aspects and embodiments of the invention, the term "comprising" or "containing" means the inclusion of the referent and does not exclude the presence of any other element. By contrast to the term "comprising", the term "consisting of" means the sole inclusion of the referent and thus excludes the presence of any other element.

The term "subject" or "patient" is used herein to describe any member of the animal kingdom, preferably a human being, more preferably a woman.

The term "diagnosing" or "diagnosis", as used in the context of the present invention, means assessing whether a subject suffers or not from a disease. As will be understood by those skilled in the art, such an assessment, although preferred to be, may usually not be correct for 100% of the investigated subjects. This term requires however that a statistically significant portion of subjects can be correctly assessed and, thus, diagnosed. Whether a portion is statistically significant can be easily determined by the skilled person in the art using various well-known statistical evaluation tools, such as determination of confidence intervals, p-value determination, Student's t-test, Mann-Whitney test, etc. Details of such statistical methods can be found in Dowdy and Wearden (1983). Statistical methods may notably allow the determination of the sensitivity and the specificity of a diagnostic test. The sensitivity of a diagnostic test can be defined as the proportion of subjects suffering from a disease who will have a positive result, while the specificity of a diagnostic test can be defined as the proportion of subjects without the disease who will have a negative result.
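The definitions of sensitivity and specificity given above can be illustrated with a minimal Python sketch. The function name and the example data are hypothetical and are not taken from the specification; the sketch simply counts test results against the known disease status of each subject.

```python
def sensitivity_specificity(has_disease, test_positive):
    """Return (sensitivity, specificity) from paired boolean observations.

    has_disease[i]   -- True if subject i actually suffers from the disease
    test_positive[i] -- True if the diagnostic test is positive for subject i
    """
    if len(has_disease) != len(test_positive):
        raise ValueError("both sequences must have the same length")
    pairs = list(zip(has_disease, test_positive))
    tp = sum(1 for d, t in pairs if d and t)          # diseased, test positive
    fn = sum(1 for d, t in pairs if d and not t)      # diseased, test negative
    tn = sum(1 for d, t in pairs if not d and not t)  # healthy, test negative
    fp = sum(1 for d, t in pairs if not d and t)      # healthy, test positive
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: 4 diseased subjects (3 detected), 4 healthy subjects (3 negative)
disease = [True, True, True, True, False, False, False, False]
result = [True, True, True, False, False, False, False, True]
sens, spec = sensitivity_specificity(disease, result)
```

With these made-up observations, both proportions are 3 out of 4.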

By "breast cancer", it is meant herein a cancer that forms in tissues of the breast, as defined by the National Cancer Institute. Types of breast cancer include, without limitation, ductal carcinoma, which begins in the lining of the milk ducts (thin tubes that carry milk from the lobules of the breast to the nipple); lobular carcinoma, which begins in the lobules (milk glands) of the breast; and invasive breast cancer (breast cancer that has spread from where it began in the breast ducts or lobules to surrounding normal tissue).

By "early breast cancer", it is meant herein a breast cancer that has not spread beyond the breast or the axillary lymph nodes. According to the TNM (Tumor, Nodes, Metastasis) international classification of breast cancer, this includes ductal or lobular carcinoma in situ (pTis N0 M0) and stage I (T1 N0 M0) breast cancers. More particularly, a pTis N0 M0 breast cancer refers to a breast cancer, wherein cancer cells can only be found inside the breast ducts or lobules (T0), without the tumour crossing the basal membrane. A stage I (T1 N0 M0) breast cancer refers to a breast cancer, wherein cancer cells have infiltrated the tissue surrounding the breast ducts and lobules, forming a tumour whose diameter is inferior or equal to 2 cm. The abbreviation N0 means that the cancer has not spread to lymph nodes, while the abbreviation M0 means that there is no distant metastasis. An early breast cancer is generally characterized by a 100% survival rate, within five years from the initial diagnosis.
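As a minimal illustration of this definition, the following Python sketch checks whether a TNM triplet falls within the two categories named above (pTis N0 M0 and T1 N0 M0, written with the digit zero). The function name and the string encoding of the categories are assumptions made for the example only.

```python
def is_early_breast_cancer(t, n, m):
    """True only for the two TNM categories listed in the definition:
    pTis N0 M0 (carcinoma in situ) and T1 N0 M0 (stage I)."""
    return t in {"pTis", "T1"} and n == "N0" and m == "M0"


in_situ = is_early_breast_cancer("pTis", "N0", "M0")
stage_one = is_early_breast_cancer("T1", "N0", "M0")
larger_tumour = is_early_breast_cancer("T2", "N0", "M0")
```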

By "cancer recurrence", "recurring cancer", "cancer relapse" or "relapsing cancer", it is meant herein, in the context of potential clinical outcomes of cancer and as defined by the National Cancer Institute, that the cancer has recurred (come back), usually after a period of time during which the cancer could not be detected. A recurring cancer may refer to a cancer that comes back to the same place as the original (primary) tumour or to another place in the body (also known as metastasis).

A "biological fluid sample" according to the invention can be any fluid sample that may be isolated from a subject, including, without limitation, blood or a fractional component thereof (serum, plasma, cellular extract), lymph, tumor interstitial fluid, saliva, mucus, sputum, sweat, or urine. Furthermore, it should be noted that, in the case of a local or a distant cancer recurrence, circulating tumoral cells (CTCs) may be isolated from a biological fluid as defined above, preferably from blood, by techniques well-known in the art. An example of a technique allowing the isolation of circulating tumoral cells (CTCs) is Dean Flow Fractionation (DFF), as established by Hou et al. (2013). In the context of the present invention, the biological fluid sample is preferably a blood sample, such as a serum or plasma sample, and even more preferably a serum sample.

The term "biomarker" according to the invention refers to a polypeptide or protein, fragment thereof, or epitope that is differentially present in a subject as compared to healthy subjects, including differentially modified (e.g. differentially glycosylated) and/or expressed biomarkers. It should be noted that the term "biomarker" includes soluble biomarkers, i.e. biomarkers which are differentially cleaved, secreted, released or shed from a tumor cell in a subject, and are thus detectable in a biological fluid as defined above.

Particularly preferred biomarkers associated with breast cancer according to the invention are listed in the following Table 1.

Table 1. Biomarkers of breast cancer

Symbol | Full name | Accession number, UniProtKB/Swiss-Prot (SEQ ID number)
OLM4 or OLFM4 | Olfactomedin-4 (OLFM4). Alternative name(s): Antiapoptotic protein GW112; G-CSF-stimulated clone 1 protein (hGC-1); hOLfD | Q6UX06 (SEQ ID NO: 1)
NENF | Neudesin. Alternative name(s): Cell immortalization-related protein 2; Neuron-derived neurotrophic factor; Secreted protein of unknown function (SPUF protein) | Q9UMX5 (SEQ ID NO: 2)
DSP or DP | Desmoplakin: Isoforms 1 and 2. Alternative name(s): 250/210 kDa paraneoplastic pemphigus antigen | P15924-1 (Isoform 1: SEQ ID NO: 3); P15924-2 (Isoform 2: SEQ ID NO: 4)

The term "expression level", as applied to a biomarker, refers herein to the amount or level of a biomarker of interest expressed in a cell, tissue, biological fluid, or organ(s). The term "level" as used herein refers to an amount (e.g. relative amount or concentration) of a biomarker that is detectable or measurable in a sample. For example, the level can be a concentration such as μg/L or a relative amount by comparison to a reference expression level. The act of actually "determining the expression level" of a biomarker in a biological sample refers to the act of actively detecting whether a biomarker is expressed in said sample or not, and notably makes it possible to detect whether the biomarker expression is upregulated, downregulated or substantially unchanged when compared to a reference expression level. A "dysregulated expression level" of a given biomarker is, according to the invention, a downregulated or upregulated expression level when compared to a reference expression level.

By "reference expression level" or "control expression level" of a biomarker, it is meant a predetermined expression level of said biomarker, which can be used as a reference in any method of the invention. For example, a reference expression level can be the expression level of a biomarker in a biological sample of a healthy subject, or the average or median, preferably median, expression level in a biological sample of a population of healthy subjects.
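A reference expression level computed as the median over a population of healthy subjects could be obtained as in the following minimal Python sketch. The function name and the concentration values are hypothetical and serve only to illustrate the definition.

```python
from statistics import median


def reference_expression_level(healthy_levels):
    """Median biomarker level over one or more healthy subjects.

    With a single healthy subject, the median is that subject's level.
    """
    if not healthy_levels:
        raise ValueError("at least one healthy subject is required")
    return median(healthy_levels)


# Made-up serum concentrations (ng/mL), for illustration only
healthy = [12.0, 18.5, 15.0, 40.0, 16.5]
reference = reference_expression_level(healthy)
```

The median is less affected than the average by an isolated high value such as the 40.0 ng/mL entry in this made-up list, which is consistent with the stated preference for the median.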

Additional definitions are provided throughout the specification.

The present invention may be understood more readily by reference to the following detailed description, including preferred embodiments of the invention, and examples included herein.

The inventors have demonstrated that the expression level of Olfactomedin-4 (OLFM4), Neudesin (NENF) and/or Desmoplakin (DSP) circulating in the blood is dysregulated in subjects suffering from breast cancer. The present invention thus proposes to easily and rapidly diagnose breast cancer in a subject based on the above discovery, by determining the expression level of said biomarker(s), from a mere biological fluid sample such as blood. Such a diagnostic method thereby makes it possible to avoid conventional, burdensome, or even invasive diagnostic methods such as biopsy, magnetic resonance imaging (MRI), computed tomography (CT), or intrathecal contrast-enhanced CT scan.

Accordingly, in a first aspect, the present invention relates to an in vitro method for diagnosing a breast cancer in a subject, comprising the steps of:

a) determining from a biological fluid sample of a subject the protein expression level of at least one biomarker selected from the group consisting of Olfactomedin-4, Neudesin, Desmoplakin, and any combination thereof; and

b) comparing said expression level with a reference expression level of said biomarker.

The above method may optionally further comprise the step c) of determining whether said subject is suffering from breast cancer, based upon the comparison in step b).

Each of the above biomarkers is sufficient to perform a diagnosis according to the invention. Nevertheless, the skilled person in the art will readily understand that the above biomarkers may be combined as a panel of biomarkers, each of which contributes to the final diagnosis of the invention.

In a preferred embodiment, a protein expression level of said at least one biomarker dysregulated by comparison to a reference expression level of said biomarker obtained from a biological fluid sample of at least one healthy subject, is indicative that said subject is suffering from breast cancer.

Preferably, said protein expression level of:

Olfactomedin-4 is superior to said reference expression level; and/or

Neudesin is superior to said reference expression level; and/or

Desmoplakin is inferior or superior to said reference expression level;

in the biological fluid sample of said subject suffering from breast cancer.

In other words, a protein expression level of Olfactomedin-4 in step a) superior to the reference expression level of Olfactomedin-4 obtained from a biological fluid sample of at least one healthy subject, is indicative that the tested subject is suffering from breast cancer.

Similarly, a protein expression level of Neudesin in step a) superior to the reference expression level of Neudesin obtained from a biological fluid sample of at least one healthy subject, is indicative that the tested subject is suffering from breast cancer.

This means as well that a protein expression level of Desmoplakin in step a) inferior to the reference expression level of Desmoplakin obtained from a biological fluid sample of at least one healthy subject, is indicative that the tested subject is suffering from breast cancer.

This also means that a protein expression level of Desmoplakin in step a) superior to the reference expression level of Desmoplakin obtained from a biological fluid sample of at least one healthy subject, is indicative that the tested subject is suffering from breast cancer.

It shall be further understood that the present method, as well as other methods of the invention, encompass the use of any combination of the above biomarkers. The protein expression level of any one of Olfactomedin-4, Neudesin and Desmoplakin, or of any combination thereof, may further indicate the stage of breast cancer.

Accordingly, in a preferred embodiment, said expression level of Olfactomedin-4 superior to said reference expression level is indicative that the breast cancer is an early breast cancer. In another preferred embodiment, said expression level of Neudesin superior to said reference expression level is indicative that the breast cancer is an early breast cancer.

Yet, in another preferred embodiment, said expression level of Desmoplakin inferior to said reference expression level is indicative that the breast cancer is an early breast cancer.

Such a diagnostic test of early breast cancer is particularly useful for patients at risk (e.g. having a family history of breast cancer), and for patients for whom small breast tumours (e.g. tumour size below 1 cm) cannot be accurately detected by conventional diagnostic methods such as ultrasound.

Still, in another preferred embodiment, said expression level of Desmoplakin superior to said reference expression level is indicative that the breast cancer is a recurring breast cancer. Such diagnosis test of a recurring breast cancer is particularly useful for the monitoring of a patient previously suffering from breast cancer. Notably if performed early on, such diagnostic test can help to improve the prognosis and survival of the patient.

By superior to a reference expression level, it is meant that the ratio between the expression level of said biomarker and the reference expression level is above 1.

By inferior to a reference expression level, it is meant that the ratio between the expression level of said biomarker and the reference expression level is below 1.
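The ratio-based definitions of "superior" and "inferior" can be sketched in Python as follows. The function name is hypothetical, and treating a ratio of exactly 1 as "unchanged" is an assumption of this sketch: the specification speaks of a "substantially unchanged" level without defining a tolerance.

```python
def compare_to_reference(level, reference_level):
    """Classify a measured level from the ratio level / reference_level.

    ratio > 1 -> "superior", ratio < 1 -> "inferior".
    A ratio of exactly 1 is reported as "unchanged" (assumption of this sketch).
    """
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    ratio = level / reference_level
    if ratio > 1:
        return "superior"
    if ratio < 1:
        return "inferior"
    return "unchanged"


# Made-up values, both levels expressed in the same unit
above = compare_to_reference(45.0, 30.0)
below = compare_to_reference(300.0, 600.0)
```

Both levels must be expressed in the same unit for the ratio to be meaningful.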

Alternatively, said expression level may be indicated as the concentration of biomarker in the tested biological fluid.

Accordingly, said protein expression level of Olfactomedin-4 is preferably superior to 30, 31, 32, 34, 35, 36, 37, 38, 39, 40, 50, 60, 70, 80, 90, 100, 150, 200 ng/mL, more preferably superior to 30, 31, 32, 34, 35, 36, 37, 38, 39, 40 ng/mL, most preferably superior to 31 ng/mL, in the biological fluid sample of said subject suffering from breast cancer, preferably from early breast cancer. Indeed, as illustrated in the experimental results, the inventors have demonstrated in a first cohort of 335 breast cancer patients and 65 healthy subjects that the sensitivity of a breast cancer diagnostic test based on Olfactomedin-4 is 67%, while the specificity of such test is 88%, for an Olfactomedin-4 serum concentration above 40 ng/mL. They further demonstrated in a second cohort of 766 breast cancer patients and 195 healthy subjects that the sensitivity of a breast cancer diagnostic test based on Olfactomedin-4 ranges from 64 to 78%, with a specificity of about 80-90% for an Olfactomedin-4 serum concentration above 31 ng/mL. Similar values were observed in early breast cancer patients.

Still, said protein expression level of Neudesin is preferably superior to 15, 16, 17, 18, 19, 20, 25, 30, 35, 40 ng/mL, more preferably superior to 15, 16, 17, 18, 19, 20 ng/mL, most preferably superior to 16 ng/mL, in the biological fluid sample of said subject suffering from breast cancer, preferably early breast cancer. Indeed, as illustrated in the experimental results, the inventors have demonstrated in a first cohort of 335 breast cancer patients and 65 healthy subjects, that the sensitivity of a breast cancer diagnostic test based on Neudesin is 47%, while the specificity of such test reached 91%, for a Neudesin serum concentration above 20 ng/mL. They further demonstrated in a second cohort of 766 breast cancer patients and 195 healthy subjects that the sensitivity of a breast cancer diagnostic test based on Neudesin ranges from 52 to 60%, with a specificity reaching about 75-80% for a Neudesin serum concentration above 16 ng/mL. Similar values were observed in early breast cancer patients.

Still, said protein expression level of Desmoplakin is preferably inferior to 600, 500, 400, 300 pg/mL, more preferably inferior to 600 pg/mL, in the biological fluid sample of said subject suffering from early breast cancer.

Yet, still, said protein expression level of Desmoplakin is preferably superior to 1800, 1900, 2000 pg/mL, more preferably superior to 1800 pg/mL, in the biological fluid sample of said subject suffering from a recurring breast cancer.
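The concentration cut-offs quoted above can be applied to measured values as in the following minimal Python sketch. It uses the cut-offs given for each biomarker (31 ng/mL for Olfactomedin-4, 16 ng/mL for Neudesin, 600 and 1800 pg/mL for Desmoplakin); the function name, constant names and returned wording are hypothetical, and the sketch is an illustration of the comparison logic only, not a validated diagnostic tool.

```python
# Cut-offs quoted in the text above (units as stated there)
OLFM4_CUTOFF_NG_ML = 31.0
NENF_CUTOFF_NG_ML = 16.0
DSP_EARLY_CUTOFF_PG_ML = 600.0
DSP_RECURRING_CUTOFF_PG_ML = 1800.0


def flag_biomarkers(olfm4_ng_ml=None, nenf_ng_ml=None, dsp_pg_ml=None):
    """Return the list of cut-off findings for the biomarkers that were measured.

    Any biomarker left as None is treated as not measured and is skipped.
    """
    findings = []
    if olfm4_ng_ml is not None and olfm4_ng_ml > OLFM4_CUTOFF_NG_ML:
        findings.append("OLFM4 superior to cut-off")
    if nenf_ng_ml is not None and nenf_ng_ml > NENF_CUTOFF_NG_ML:
        findings.append("NENF superior to cut-off")
    if dsp_pg_ml is not None:
        if dsp_pg_ml < DSP_EARLY_CUTOFF_PG_ML:
            findings.append("DSP inferior to cut-off (early breast cancer)")
        elif dsp_pg_ml > DSP_RECURRING_CUTOFF_PG_ML:
            findings.append("DSP superior to cut-off (recurring breast cancer)")
    return findings


# Made-up measurements, for illustration only
example = flag_biomarkers(olfm4_ng_ml=45.0, nenf_ng_ml=10.0, dsp_pg_ml=400.0)
```

A Desmoplakin concentration between the two cut-offs yields no finding in this sketch, since the text only characterizes values below 600 pg/mL or above 1800 pg/mL.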

As indicated above, the above biomarkers can be combined as a panel of biomarkers, each of which contributes to the final diagnosis of the invention. Indeed, the inventors have demonstrated that such combination increases the sensitivity and/or the specificity of the diagnostic test of the invention.

Accordingly, in a preferred embodiment, the protein expression levels of at least two, preferably three, of said biomarkers are determined in step a).

Preferably, the protein expression level of Olfactomedin-4 and Neudesin are determined in step a) of the above method. The combination of these biomarkers is particularly useful for diagnosing breast cancer, such as for diagnosing an early breast cancer. That is to say that, in a preferred embodiment, said protein expression level of:

Olfactomedin-4 is superior to said reference expression level, and

Neudesin is superior to said reference expression level,

in the biological fluid sample of said subject suffering from breast cancer, such as an early breast cancer.

In other words, a protein expression level of:

Olfactomedin-4 superior to the reference expression level of Olfactomedin-4 obtained from a biological fluid sample of at least one healthy subject, and

Neudesin superior to the reference expression level of Neudesin obtained from a biological fluid sample of at least one healthy subject,

is indicative that the tested subject is suffering from breast cancer, such as an early breast cancer.

Indeed, as illustrated in the experimental results, the inventors have demonstrated in a first cohort of 335 breast cancer patients and 65 healthy subjects that the sensitivity of a breast cancer diagnostic test based on the combination of Olfactomedin-4 and Neudesin is 74%, while the specificity of such test is 78%, for a serum concentration of those biomarkers above 44 ng/mL. They further demonstrated in a second cohort of 766 breast cancer patients and 195 healthy subjects that the sensitivity of a breast cancer diagnostic test based on the combination of Olfactomedin-4 and Neudesin ranges from 75 to 85%, with a specificity reaching 87% for a serum concentration of those biomarkers above 38 ng/mL. Similar values were observed in early breast cancer patients. The combination of Olfactomedin-4 and Neudesin thus increases the sensitivity and/or specificity values of the diagnostic test of the invention, by comparison to a test based on only one of these biomarkers.

Alternatively, the protein expression levels of Olfactomedin-4 and Desmoplakin are preferably determined in step a) of the above method. The combination of these biomarkers is particularly useful for diagnosing breast cancer, and more particularly an early breast cancer. That is to say that, in a preferred embodiment, said protein expression level of:

· Olfactomedin-4 is superior to said reference expression level, and

· Desmoplakin is inferior to said reference expression level,

in the biological fluid sample of said subject suffering from breast cancer, and preferably from an early breast cancer.

In other words, a protein expression level of:

· Olfactomedin-4 superior to the reference expression level of Olfactomedin-4 obtained from a biological fluid sample of at least one healthy subject, and

· Desmoplakin inferior to the reference expression level of Desmoplakin obtained from a biological fluid sample of at least one healthy subject,

is indicative that the tested subject is suffering from breast cancer, and preferably from an early breast cancer. Indeed, as illustrated in the experimental results, the inventors have demonstrated that the sensitivity of an early breast cancer diagnostic test based on the combination of Olfactomedin-4 and Desmoplakin is 87%, while the specificity of such test is 84%.

Alternatively, the protein expression levels of Neudesin and Desmoplakin are preferably determined in step a) of the above method. The combination of these biomarkers is particularly useful for diagnosing breast cancer, and more particularly an early breast cancer. That is to say that, in a preferred embodiment, said protein expression level of:

· Neudesin is superior to said reference expression level, and

· Desmoplakin is inferior to said reference expression level,

in the biological fluid sample of said subject suffering from breast cancer, and preferably from an early breast cancer.

In other words, a protein expression level of:

· Neudesin superior to the reference expression level of Neudesin obtained from a biological fluid sample of at least one healthy subject, and

· Desmoplakin inferior to the reference expression level of Desmoplakin obtained from a biological fluid sample of at least one healthy subject,

is indicative that the tested subject is suffering from breast cancer, and preferably from an early breast cancer.

Still preferably, the protein expression levels of Olfactomedin-4, Neudesin and Desmoplakin are determined in step a) of the above method. The combination of these three biomarkers is particularly useful for diagnosing breast cancer, more particularly an early breast cancer. That is to say that, in a preferred embodiment, said protein expression level of:

· Olfactomedin-4 is superior to said reference expression level,

· Neudesin is superior to said reference expression level, and

· Desmoplakin is inferior to said reference expression level,

in the biological fluid sample of said subject suffering from breast cancer, and preferably from an early breast cancer.

In other words, a protein expression level of:

· Olfactomedin-4 superior to the reference expression level of Olfactomedin-4 obtained from a biological fluid sample of at least one healthy subject,

· Neudesin superior to the reference expression level of Neudesin obtained from a biological fluid sample of at least one healthy subject, and

· Desmoplakin inferior to the reference expression level of Desmoplakin obtained from a biological fluid sample of at least one healthy subject,

is indicative that the tested subject is suffering from a breast cancer, and preferably from an early breast cancer.
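The three-biomarker pattern described above (Olfactomedin-4 and Neudesin superior to their reference levels, Desmoplakin inferior to its reference level) can be written as a simple comparison. The sketch below is illustrative only: the `Levels` container, the function name and the numeric values are hypothetical, and all levels are assumed to be expressed in the same units and normalized in the same way as the reference levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Levels:
    """Protein expression levels of the three biomarkers (same units for all samples)."""
    olfm4: float  # Olfactomedin-4
    nenf: float   # Neudesin
    dsp: float    # Desmoplakin


def matches_three_marker_pattern(sample: Levels, reference: Levels) -> bool:
    """True when OLFM4 and NENF are above, and DSP is below, the reference levels."""
    return (
        sample.olfm4 > reference.olfm4
        and sample.nenf > reference.nenf
        and sample.dsp < reference.dsp
    )


# Hypothetical values, for illustration only.
reference = Levels(olfm4=30.0, nenf=15.0, dsp=600.0)
tested = Levels(olfm4=48.0, nenf=22.0, dsp=410.0)
print(matches_three_marker_pattern(tested, reference))
```

The two-biomarker combinations described earlier follow the same logic with one of the three comparisons removed.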

It shall be further understood that the information obtained using the methods of the invention as described herein may be used in combination with other information, such as, but not limited to, expression levels of additional biomarkers which may be standard biomarkers, clinical chemical parameters, histopathological parameters, or age, gender and/or weight of the subject.

Accordingly, in a further preferred embodiment, the in vitro diagnostic method of the invention further comprises the step of determining the protein expression level of at least one standard biomarker associated with breast cancer, such as estrogen receptor (ER), progesterone receptor (PR) or human epidermal growth factor receptor 2 (HER2).

As indicated above, in the context of the present invention, the expression level is measured at the protein level. Methods for measuring protein expression levels are well-known in the art and are notably reviewed by Reeves et al. (2000) and Schena (2005). Those methods generally involve contacting a biological sample of interest with one or more detectable reagents that is or are suitable for measuring protein expression level, such as an antibody, and subsequently determining protein expression level based on the level of detected reagent, preferably after normalization. Examples of methods which generally involve the use of an antibody include, without limitation, Western blot, immunoblot, enzyme-linked immunosorbent assay (ELISA), enzyme-linked immunospot (ELISPOT), radioimmunoassay (RIA), immunohistochemistry and immunoprecipitation. Other methods suitable for measuring a protein expression level, which do not necessarily involve the use of an antibody, may be used, including, without limitation, fluorescence activated cell sorting (FACS), microscopy such as atomic force microscopy, flow cytometry, microcytometry, protein binding assay, ligand binding assay, microarray, polyacrylamide gel electrophoresis such as SDS-PAGE, surface plasmon resonance (SPR), Förster resonance energy transfer (FRET), bioluminescence resonance energy transfer (BRET), chemiluminescence, fluorescence polarization, phosphorescence, mass spectrometry such as liquid chromatography mass spectrometry (LC-MS) or liquid chromatography/mass spectrometry/mass spectrometry (LC-MS-MS), matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF), surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF), and magnetic resonance imaging (MRI).

According to the different aspects and preferred embodiments of the present invention, the step of determining the expression level of a biomarker of interest preferably further comprises a substep of normalizing the expression level of said biomarker. The method for normalizing expression level can be selected based upon the method used for measuring expression level. For example, if a Western blot is performed, the expression level of a biomarker of interest in a biological sample may be normalized by assessing in parallel in said sample the expression level of a protein which is usually constitutively expressed in any cell of a living organism, preferably at the same expression level whether the cell is healthy or not (e.g. cancerous or not). An example of constitutively expressed protein is a housekeeping protein, which may be selected, without limitation, among actin, beta-tubulin, and Glyceraldehyde 3-phosphate dehydrogenase (GAPDH), to name a few. Alternatively, if an ELISA is performed, involving for example a colorimetric detection method, protein expression level can be normalized by total cell number. Yet, still alternatively, if a microarray is performed, protein expression level can be normalized, for example, by loess-regression. For a detailed review of normalization methods of protein expression level in an antibody microarray, one skilled in the art may refer to Hamelinck et al. (2005).
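By way of illustration, normalization against a housekeeping protein, followed by the ratio-based comparison with a reference sample mentioned in this description, can be sketched as follows. This is only a sketch under stated assumptions: the function names and the signal intensities are hypothetical, signals are assumed to be positive band intensities measured in the same experiment, and normalization is taken to be a simple division by the housekeeping signal.

```python
def normalize(biomarker_signal: float, housekeeping_signal: float) -> float:
    """Divide the biomarker signal by the housekeeping protein signal (e.g. GAPDH)."""
    if housekeeping_signal <= 0:
        raise ValueError("housekeeping signal must be positive")
    return biomarker_signal / housekeeping_signal


def expression_ratio(tested_normalized: float, reference_normalized: float) -> float:
    """Ratio of the normalized level in the tested sample to that in the reference."""
    if reference_normalized <= 0:
        raise ValueError("reference level must be positive")
    return tested_normalized / reference_normalized


def interpret(ratio: float) -> str:
    """Ratio above 1: overexpressed; below 1: underexpressed; exactly 1: unchanged."""
    if ratio > 1:
        return "overexpressed"
    if ratio < 1:
        return "underexpressed"
    return "unchanged"


# Hypothetical band intensities (arbitrary units), for illustration only.
tested = normalize(biomarker_signal=1200.0, housekeeping_signal=800.0)
reference = normalize(biomarker_signal=500.0, housekeeping_signal=1000.0)
ratio = expression_ratio(tested, reference)
print(ratio, interpret(ratio))
```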

All these methods for measuring and normalizing protein expression level are well-known to the skilled person, and thus do not need to be further detailed herein. Should the skilled person wish to use any of the above methods involving the use of an antibody to measure a biomarker protein expression level, one may use any appropriate commercial antibody specific for said biomarker. Alternatively, based on the knowledge of the amino-acid sequence of a biomarker of interest, it is easy for the skilled person to design suitable reagent(s) to measure expression level in any biological sample. For example, an antibody directed against a specific biomarker may be prepared by any conventional method, e.g. by immunizing an animal, such as a mouse, with an immunogenic form of said biomarker which elicits an antibody response in said animal. Methods for producing polyclonal and monoclonal antibodies are well described in the literature (see notably Kohler and Milstein, 1975; Kozbor et al., 1983; Roder et al., 1986; and Huse et al., 1986), and therefore need not be further detailed herein.

As indicated above, the comparison of a determined or tested expression level with a reference expression level can be done by merely calculating the ratio between the expression level of a biomarker of interest in the tested biological sample and in at least one reference sample, preferably after normalization as described above. Accordingly, a ratio above 1 is indicative that the biomarker is overexpressed, while a ratio below 1 is indicative that the biomarker is underexpressed (i.e. downregulated).

In another aspect of the present invention, the biomarkers disclosed herein can be used to determine if a patient will respond or not to a cancer therapy. Associating a patient's response to treatment with such biomarker(s) can indeed elucidate new opportunities for treatment in non-responding patients or indicate one treatment over other treatment choices.

Therefore, the present invention further provides an in vitro method for determining a drug-responding or non-responding phenotype in a subject suffering from breast cancer, comprising the steps of:

a) determining from a biological fluid sample of said subject the protein expression level of at least one biomarker selected from the group consisting of Olfactomedin-4, Neudesin, Desmoplakin, and any combination thereof;

b) comparing the protein expression level in step a) to a reference expression level of said biomarker; and

c) determining from said comparison the drug-responding or non-responding phenotype.

According to the present invention, a "drug-responding phenotype" refers to a response state of a subject to the administration of a drug. A "response state" means that said subject responds to the treatment, i.e. that said treatment is efficacious in said subject. A responding phenotype is thus characterized by an improvement in clinical signs, i.e. in the context of the present invention, a responding phenotype is characterized for example by a regression or disappearance of breast cancer cells and metastases thereof, if any. By contrast, a "drug-non-responding phenotype" refers to the absence in said subject of a response state, meaning that said subject is refractory to the treatment.

Protein expression levels of the above-mentioned biomarkers in a subject suffering from breast cancer are as described above.

In a further aspect of the present invention, the biomarkers disclosed herein can be used to design or adapt a breast cancer treatment. In particular, such treatment may be designed or adapted once a subject has been diagnosed as suffering from breast cancer, according to the method of the invention.

Accordingly, the present invention provides herein a method for designing or adapting a treatment regimen for a subject suffering from breast cancer, comprising the steps of:

a) determining from a biological sample of said subject a drug-responding or non-responding phenotype, according to the in vitro method described above; and

b) designing or adapting a treatment regimen for said subject based upon said responding or non-responding phenotype.

The present method is particularly useful for offering a therapy tailored to each patient affected by breast cancer.

The term "treatment regimen" refers herein to a treatment plan that specifies the type of treatment (i.e. type of drug or combination of drugs, and mode of administration of said drug(s)), dosage, schedule and/or duration of a treatment provided to a subject in need thereof. A dosage, schedule and/or duration of treatment can vary, depending on the progression of disease and the selected type of treatment. In this regard, in addition to the drugs that can be identified according to the screening method of the invention, therapeutic agents that may be used in the treatment regimen according to the invention include, without limitation, chemotherapeutic agents; hormone therapeutic agents such as tamoxifen or aromatase inhibitors (e.g. Raloxifene, Toremifene, Fulvestrant, Anastrozole, Exemestane, Letrozole); human epidermal growth factor receptor 2 (HER2) inhibitors such as trastuzumab (Herceptin), pertuzumab, or lapatinib; vascular endothelial growth factor receptor (VEGFR) inhibitors such as bevacizumab; epidermal growth factor receptor (EGFR) inhibitors such as cetuximab and panitumumab; and any combination thereof.

Standard chemotherapeutic drugs for treating breast cancer include, without limitation, platinum-based agents such as oxaliplatin, cisplatin, carboplatin, spiroplatin, iproplatin, and satraplatin; alkylating agents such as cyclophosphamide, ifosfamide, chlorambucil, busulfan, melphalan, mechlorethamine, uramustine, thiotepa, and nitrosoureas; anti-metabolites such as 5-fluorouracil, azathioprine, 6-mercaptopurine, methotrexate, leucovorin, capecitabine, cytarabine, floxuridine, fludarabine, gemcitabine, pemetrexed, or raltitrexed; plant alkaloids such as vincristine, vinblastine, vinorelbine, vindesine, podophyllotoxin, or taxanes such as paclitaxel and docetaxel; topoisomerase inhibitors such as irinotecan, topotecan, amsacrine, etoposide (VP16), etoposide phosphate, or teniposide; antitumor antibiotics such as anthracyclines (e.g. doxorubicin, daunorubicin, epirubicin, mitoxantrone), actinomycin, bleomycin, mitomycin, or plicamycin; and any combination thereof.

In the above method, the treatment regimen that is designed or adapted and optionally administered to the subject depends on the responding or non-responding phenotype. In particular, a treatment regimen may be selected for the first time, continued, adjusted or stopped based upon said phenotype. For example, a treatment regimen may be adjusted by increasing the dose to be administered, or stopped and switched to an alternative treatment regimen, if the subject is non-responding. Still, alternatively, a treatment regimen may be selected for the first time or continued if a subject is responding. One skilled in the art would nevertheless easily design or adjust the type of treatment with the dosage, schedule and duration of treatment, depending upon the phenotype of the subject.

Furthermore, based upon said phenotype, the selected treatment regimen can be an aggressive one which is expected to result in the best clinical outcome (e.g., regression and/or disappearance of breast cancer) and which may be associated with some discomfort to the subject or adverse side effects (e.g., damage to healthy cells or tissue), or a more moderate one which may only slow the progression of the disease. An example of an aggressive treatment regimen includes a treatment regimen as described above combined with surgical intervention to remove tumoral cells, tissue or organs and/or an exposure to radiation therapy. An aggressive treatment regimen may also include a higher dosage of the therapeutic agent(s), a more frequent administration of said agent(s), and/or a longer duration of treatment.

Thus, once a treatment regimen has been determined in accordance with the teachings of the invention, the subject may receive the appropriate treatment.

Therefore, in another aspect, the invention relates to a method for treating breast cancer in a subject in need thereof, comprising the steps of:

a) determining from a biological sample of said subject a drug-responding or non-responding phenotype, according to the method described above; and

b) administering to said subject said drug if the phenotype is a responding phenotype.

The term "administering" as used herein means that the drug(s) of interest is delivered or dispensed to a subject orally, or parenterally such as by subcutaneous, intravenous, intramuscular, intrathecal or intraperitoneal injection.

In another aspect of the present invention, the biomarkers disclosed herein may be used for drug screening purposes. In particular, novel drug assays may be provided, which identify therapeutics efficiently interfering with the proliferation of breast cancer cells that aberrantly express those biomarkers. Current treatment of breast cancer mainly relies on chemotherapy and/or antiangiogenic drugs, which may be combined, if need be, with surgery.

Accordingly, in the present aspect, the invention relates to a screening method for identifying a drug or combination of drugs suitable for treating breast cancer, comprising the steps of:

a) contacting isolated breast cancer cells or cell line displaying a breast cancer phenotype with a candidate drug or combination of candidate drugs;

b) determining, from said cells or cell line contacted with said drug or combination of drugs, the protein expression level of at least one biomarker selected from the group consisting of Olfactomedin-4, Neudesin, Desmoplakin, and any combination thereof; and

c) comparing the protein expression level of said biomarker in step b) to its expression level in the absence of said drug or combination of drugs.

By "drug" or "agent", it is meant herein a compound such as chemical or a biological molecule that can be administered or tested according to the invention. A chemical can be of any composition such as inorganic or organic. A biological molecule can be a molecule of any biological origin that can be found in or produced by, at least in part, a cell, such as, without limitation, peptides or proteins such as antibodies or affibodies, lipids, nucleic acids such as RNAi or aptamers, carbohydrates, and any combination thereof.

By "drug suitable for treating breast cancer", it is meant herein a drug that can slow or stop the growth of breast cancer cells and metastases thereof, if any, either by killing said cells, or by slowing or stopping their uncontrolled division.

Furthermore, it shall be understood that by "breast cancer cells or cell line" according to the invention, it is preferably meant breast cancer cells or cell line wherein the protein expression level of Olfactomedin-4, Neudesin, and/or Desmoplakin is dysregulated by comparison to a reference expression level of said biomarker(s) in the breast cells of at least one healthy subject. Preferably, the cells or cell line used in the present screening method are breast cancer cells isolated from a subject diagnosed as suffering from breast cancer according to the method of the invention.

The screening method described above is preferably an in vitro screening method. For example, the cells or cell line used in the present method can be cultured in a three-dimensional (3D) culture system, so as to mimic a breast tumour micro-environment. To do so, said cells can be embedded in an extracellular matrix (ECM) as described by Weigelt et al. (2008), Kenny et al. (2007) and Li et al. (2010).

In order to assess the efficacy of the candidate anti-cancer agent, said cells or cell line may, as an alternative or as a validation test, be grafted to an animal, such as a mouse. Should such xenograft be carried out, the screening method described above preferably further comprises the step of killing said animal.

In a preferred embodiment of the above method, a protein expression level of Olfactomedin-4 in step b) inferior to the protein expression level of said biomarker in the absence of said drug or combination of drugs is indicative that said drug or combination of drugs is suitable for treating breast cancer.

In a preferred embodiment, a protein expression level of Neudesin in step b) inferior to the protein expression level of said biomarker in the absence of said drug or combination of drugs is indicative that said drug or combination of drugs is suitable for treating breast cancer.

In a further preferred embodiment, a protein expression level of Desmoplakin in step b) superior to the protein expression level of said biomarker in the absence of said drug or combination of drugs is indicative that said drug or combination of drugs is suitable for treating early breast cancer.

Yet, in another preferred embodiment, a protein expression level of Desmoplakin in step b) inferior to the protein expression level of said biomarker in the absence of said drug or combination of drugs is indicative that said drug or combination of drugs is suitable for treating a recurring breast cancer.

One skilled in the art would readily understand from the data provided herein that the above-mentioned biomarkers may be combined to aid in the identification of a drug or combination of drugs. It is within the skill of the person in the art to select the appropriate biomarker to be combined.

Preferably, a protein expression level of Olfactomedin-4 and Neudesin in step b) inferior to the protein expression level of said biomarkers in the absence of said drug or combination of drugs is indicative that said drug or combination of drugs is suitable for treating breast cancer such as an early breast cancer.

Preferably, a protein expression level of:

· Desmoplakin in step b) superior to the protein expression level of said biomarker in the absence of said drug or combination of drugs, and

· Olfactomedin-4 in step b) inferior to the protein expression level of said biomarker in the absence of said drug or combination of drugs,

is indicative that said drug or combination of drugs is suitable for treating breast cancer, and preferably an early breast cancer.

Preferably, a protein expression level of:

· Desmoplakin in step b) superior to the protein expression level of said biomarker in the absence of said drug or combination of drugs, and

· Neudesin in step b) inferior to the protein expression level of said biomarker in the absence of said drug or combination of drugs,

is indicative that said drug or combination of drugs is suitable for treating breast cancer, and preferably an early breast cancer.

Still preferably, a protein expression level of:

• Olfactomedin-4 and Neudesin in step b) inferior to the protein expression level of said biomarkers in the absence of said drug or combination of drugs, and

• Desmoplakin in step b) superior to the expression level of said biomarker in the absence of said drug or combination of drugs,

is indicative that said drug or combination of drugs is suitable for treating breast cancer, and preferably an early breast cancer.
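The three-biomarker screening criterion above compares each level measured in step b) with the level measured in the absence of the candidate drug. A minimal sketch of that comparison is given below; the function name, the dictionary keys and the numeric values are hypothetical, and both measurements are assumed to have been made on the same cells or cell line under otherwise identical conditions.

```python
from typing import Mapping


def meets_screening_criterion(
    treated: Mapping[str, float],
    untreated: Mapping[str, float],
) -> bool:
    """True when OLFM4 and NENF decrease and DSP increases upon treatment.

    Both mappings hold protein expression levels keyed by 'OLFM4', 'NENF'
    and 'DSP', expressed in the same units.
    """
    return (
        treated["OLFM4"] < untreated["OLFM4"]
        and treated["NENF"] < untreated["NENF"]
        and treated["DSP"] > untreated["DSP"]
    )


# Hypothetical expression levels (arbitrary units), for illustration only.
untreated = {"OLFM4": 50.0, "NENF": 24.0, "DSP": 300.0}
treated = {"OLFM4": 31.0, "NENF": 16.0, "DSP": 520.0}
print(meets_screening_criterion(treated, untreated))
```

The single-biomarker and two-biomarker criteria listed above correspond to subsets of these three comparisons.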

In another aspect, the present invention provides kits that can be employed in the methods described herein. In this regard, the invention relates to a kit for use in any method described above, comprising or consisting of:

a) at least one reagent capable of specifically determining the protein expression level of at least one biomarker selected from the group consisting of Olfactomedin-4, Neudesin, Desmoplakin, and any combination thereof; and

b) instructions for performing said method.

As used herein, the term "instructions" refers to a publication, a recording, a diagram, or any other medium which can be used to communicate how to perform a method of the invention. Said instructions can, for example, be affixed to a container which contains said kit. Preferably, the instructions for using said kit include a reference expression level of said biomarker(s).

The term "reagent capable of specifically determining the protein expression level [of a given biomarker]" designates a reagent or a set of reagents which specifically recognizes said biomarker and allows for the quantification of its protein expression level. These reagents can be for example antibodies, aptamers or affibodies specifically recognizing a biomarker. In the context of the present invention, such reagent is said to be "specific" for its target (i.e. biomarker) or "recognizes specifically" its target if it 1) exhibits a threshold level of binding activity, and/or 2) does not significantly cross-react with target molecules known to be related to the biomarker of interest. The binding affinity of such reagent can be easily determined by one skilled in the art, for example, by Scatchard analysis. Cross-reactivity of a reagent can as well be easily determined by one skilled in the art, and thus need not be further detailed herein.

In a preferred embodiment, the kit of the invention may further comprise:

c) at least one reagent capable of specifically determining the protein expression level of at least one standard breast cancer biomarker, such as estrogen receptor (ER), progesterone receptor (PR) or human epidermal growth factor receptor 2 (HER2).

In order to normalize protein expression level, the kit of the invention may also optionally comprise at least one reagent capable of specifically determining the protein expression level of a housekeeping protein, such as actin, beta-tubulin, or Glyceraldehyde 3-phosphate dehydrogenase (GAPDH).

In yet another aspect, the methods of the invention can be practiced using a microarray, so as to notably determine the expression level of biomarkers of interest in the present invention.

The term "microarray" refers herein to a spatially defined and separated collection of individual biological molecules which are immobilized on a solid surface, and to which one or several biomarkers of interest specifically bind(s). Those biological molecules allow for the determination of the expression level of said biomarker(s), and may be antibodies, affibodies or aptamers if the microarray is a protein microarray, which is a preferred type of microarray according to the invention. Protein microarray technologies are well-known to the skilled person, and are notably described in Mitchell (2002), Haab (2005), and Eckel-Passow et al. (2005), and in US patent Nos. 6,087,102, 6,139,831, and 6,087,103.

For determination of protein expression level of one or several biomarkers by using such array, two technologies can typically be used: 1) direct labeling, and 2) indirect labeling, as described for example by Kingsmore et al. (2006). In the "direct labeling" method, the protein of interest (i.e. biomarker of the invention, or target) obtained from a sample, such as a biological sample, is labeled with a specific marker (e.g. a fluorescent or a radioisotope marker), and subsequently hybridized to the microarray by specifically binding to a reagent recognizing said biomarker, said reagent being conjugated to the surface of the protein microarray. If the expression level of several biomarkers is to be assessed, each biomarker is labeled with a distinct marker. In the "indirect labeling" method, the sample containing the biomarker of interest is hybridized to the microarray by specifically binding to an unlabeled reagent recognizing said biomarker, said reagent being conjugated to the surface of the protein microarray, and a secondary labeled reagent, specifically recognizing as well said biomarker, is then added. The specificity and sensitivity of such indirect labeling method can further be enhanced by using a third labeled reagent, recognizing the secondary reagent (sandwich assay). Similarly, if the expression level of several biomarkers is to be assessed in the indirect labeling method, each secondary or third reagent is labeled with a distinct marker.

Label-free systems may also be used to determine the expression level of a biomarker on a protein microarray; in such system, detection of the biomarker, and hence of its expression level, may be done by surface plasmon resonance (SPR), microcantilever biosensing, SELDI-TOF-MS, or atomic force microscopy (Chandra et al., 2011).

Therefore, the invention further relates herein to a protein microarray for use in any method described above, comprising or consisting of:

a) at least one reagent capable of specifically determining the protein expression level of at least one biomarker selected from the group consisting of Olfactomedin-4, Neudesin, Desmoplakin, and any combination thereof.

In a preferred embodiment, said protein microarray may further comprise:

b) at least one reagent capable of specifically determining the protein expression level of at least one standard breast cancer biomarker, such as estrogen receptor (ER), progesterone receptor (PR) or human epidermal growth factor receptor 2 (HER2).

In order to normalize protein expression level, the microarray of the invention may also optionally comprise at least one reagent capable of specifically determining the expression level of a housekeeping protein, such as actin, beta-tubulin, or Glyceraldehyde 3-phosphate dehydrogenase (GAPDH).

The present invention will be better understood in the light of the following detailed description of experiments, including examples. Nevertheless, the skilled artisan will appreciate that this detailed description is not limitative and that various modifications, substitutions, omissions, and changes may be made without departing from the scope of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1: Scoring system for selection of candidate breast cancer biomarkers

Flow diagram showing the filtering algorithm used to identify 5 prospective candidate biomarkers, namely Versican (VCAN), Tenascin (TNC), Olfactomedin-4 (OLFM4), Neudesin (NENF) and Desmoplakin (DSP).

Figure 2: Preliminary verification of Versican (A), Tenascin (B), Olfactomedin-4 (C), Neudesin (D) and Desmoplakin (E) expression in breast cancer patients versus healthy controls

Serum concentrations of these five candidate biomarkers in patients diagnosed with breast cancer and in healthy controls were measured using ELISA. The corresponding concentration medians are represented by a horizontal line. The most promising candidates are OLFM4, NENF and DSP.

Figure 3: OLFM4 and NENF expression levels in serum samples of breast cancer patients (first study)

A: Expression level of circulating Olfactomedin-4 in the serum of 335 breast cancer patients and in that of 65 healthy controls; the Mann-Whitney test for independent samples was used to calculate the significance of the results. Olfactomedin-4 was significantly over-expressed (p<0.0001) in the patients with breast cancer and in each of the sub-groups by grade.

B: Expression level of circulating Neudesin in the serum of 335 breast cancer patients and in that of 65 healthy controls; the Mann-Whitney test for independent samples was used to calculate the significance of the results. Neudesin was significantly over-expressed (p<0.0001) in the patients with breast cancer and in each of the sub-groups by grade (p=0.008 for grade 1 and p<0.0001 for grade 2 and 3).

C: Expression levels of circulating Olfactomedin-4 and Neudesin in the serum of 335 patients with breast cancer and in that of 65 healthy controls; the combination of these two proteins is significantly over-expressed in the breast cancer patients group (p<0.0001).

D: ROC curve of the Olfactomedin-4 analysis to distinguish breast cancers (n=335) from healthy controls (n=65); the area under the curve (AUC) for patients of all stages combined compared to the population without cancer was 0.78 (CI 95: 0.74 - 0.85); the sensitivity of the test was 67%, while specificity was 88% for a concentration > 40 ng/ml.

E: ROC curve of the Neudesin analysis to distinguish breast cancers (n=241) from healthy controls (n=65); the area under the curve (AUC) for patients of all stages combined compared to the population without cancer was 0.73 (CI 95: 0.70 - 0.80); the sensitivity of the test was 47%, while specificity was 91% for a concentration > 20 ng/ml.

F: ROC curve of the Olfactomedin-4 + Neudesin analysis to distinguish breast cancers (n=241) from healthy controls (n=65); the area under the curve (AUC) for patients of all stages combined compared to the population without cancer was 0.81 (CI 95: 0.74 - 0.86); the sensitivity of the test was 74%, while specificity was 78% for a concentration > 44 ng/ml.
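For readers unfamiliar with the AUC values quoted in panels D to F, the area under a ROC curve equals the probability that a randomly chosen patient value exceeds a randomly chosen control value (ties counting one half), which is the Mann-Whitney U statistic divided by the product of the two group sizes. The sketch below illustrates this relationship only; the function name and the concentrations are hypothetical, and it is not presented as the software or procedure used to produce the figures.

```python
from typing import Sequence


def roc_auc(patients: Sequence[float], controls: Sequence[float]) -> float:
    """AUC as the fraction of (patient, control) pairs ranked correctly.

    A pair counts 1 when the patient value is higher than the control value,
    0.5 when the two values are tied, and 0 otherwise.
    """
    if not patients or not controls:
        raise ValueError("both groups must contain at least one subject")
    score = 0.0
    for patient_value in patients:
        for control_value in controls:
            if patient_value > control_value:
                score += 1.0
            elif patient_value == control_value:
                score += 0.5
    return score / (len(patients) * len(controls))


# Hypothetical concentrations (ng/mL), for illustration only.
patients = [52.0, 61.5, 39.0, 47.2]
controls = [30.1, 45.0, 28.4, 41.0, 36.6]
auc = roc_auc(patients, controls)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to a test that does not separate the two groups, and an AUC of 1.0 to perfect separation.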

Figure 4: DSP expression levels in serum samples of breast cancer patients (first study)

Expression level of circulating DSP in the serum of 241 breast cancer patients and in that of 65 healthy controls; the Mann-Whitney test for independent samples was used to calculate the significance of the results. Desmoplakin was significantly under-expressed (p=0.0037) in the pT1a-pT1b breast cancer group and significantly over-expressed (p=0.0069) in the recurrence group.

Figure 5: OLFM4, NENF and DSP dysregulation in breast cancer patients (first study)

Proportion of breast cancer patients tested positive for the elevation of OLFM4 and NENF and the decrease of DSP.

Figure 6: OLFM4 and Low-DSP are early breast cancer biomarkers (first study)

A: Proportion of early breast cancer patients tested positive for high OLFM4 and NENF expression and low-DSP expression.

B: Expression level of circulating OLFM4 in the serum of 335 breast cancer patients, of 81 early breast cancer patients and of 65 healthy controls; the Mann-Whitney test for independent samples was used to calculate the significance of the results. OLFM4 was significantly over-expressed (p<0.0001) in the patients with early breast cancer (tumour size <1 cm).

C: ROC curve of the OLFM4 analysis to distinguish early breast cancers (n=81) from healthy controls (n=65); the AUC for patients compared to the population without cancer was 0.83 (CI 95: 0.75 - 0.89); the sensitivity of the test was 67%, while specificity was 87% for a concentration > 40 ng/ml.

D: ROC curve obtained with the predictor combining OLFM4 and DSP. The AUC for patients was 0.92; the sensitivity of the test was 87% and the specificity was 84%.

Figure 7: Serum OLFM4 and NENF levels in breast cancer samples (second study)

A: Level of circulating olfactomedin-4 in the sera of BC-1 (test cohort, n=277), BC-2 (validation cohort-1, n=171) and BC-3 (validation cohort-2, n=318) breast cancer patients and in that of 195 healthy controls; the Mann-Whitney test for independent samples was used to calculate the significance of the results. Olfactomedin-4 was significantly over-expressed (p<0.0001) in the test cohort and in both validation cohorts.

B: ROC curves of the olfactomedin-4 analysis to distinguish BC-1, BC-2 and BC-3 from healthy controls; the area under the curve (AUC) for patients compared to the population without cancer was 0.89 (CI 95: 0.87 - 0.92) for BC-1; 0.85 (CI 95: 0.81 - 0.89) for BC-2; 0.89 (CI 95: 0.86 - 0.91) for BC-3 and 0.88 (CI 95: 0.85 - 0.90) for the 3 cohorts.

C: Level of circulating neudesin in the serum of BC-1 (test cohort, n=277), BC-2 (validation cohort-1, n=171) and BC-3 (validation cohort-2, n=318) breast cancer patients and in that of 195 healthy controls; the Mann-Whitney test for independent samples was used to calculate the significance of the results. Neudesin was significantly over-expressed (p<0.0001) in the test cohort and in both validation cohorts.

D: ROC curves of neudesin for BC-1, BC-2 and BC-3; the AUC for patients compared to the population without cancer was 0.74 (CI 95: 0.68 - 0.79) for BC-1; 0.72 (CI 95: 0.66 - 0.77) for BC-2; 0.74 (CI 95: 0.70 - 0.79) for BC-3 and 0.73 (CI 95: 0.69 - 0.77) for the 3 cohorts.

E: ROC curves of olfactomedin-4 + neudesin for BC-1, BC-2 and BC-3; the AUC for patients compared to the population without cancer was 0.92 (CI 95: 0.89 - 0.95) for BC-1; 0.88 (CI 95: 0.84 - 0.92) for BC-2; 0.91 (CI 95: 0.88 - 0.94) for BC-3 and 0.91 (CI 95: 0.88 - 0.93) for the 3 cohorts.

F: ROC curves of olfactomedin-4 alone, neudesin alone and the combination of both proteins; the AUC was 0.88 (CI 95: 0.85 - 0.90) for olfactomedin-4 alone, 0.73 (CI 95: 0.69 - 0.77) for neudesin alone and 0.91 (CI 95: 0.88 - 0.93) for both proteins.

Figure 8: OLFM4, NENF and OLFM4+NENF positive patients (second study)

Proportion of patients tested positive for increased OLFM4 (A), NENF (B) and OLFM4 + NENF (C) levels.

Figure 9: Serum OLFM4 and NENF levels in sera of patients with a small tumor (<1cm) (second study)

A: Level of circulating olfactomedin-4 in the sera of T1a-T1b-1 (test cohort, n=105), T1a-T1b-2 (validation cohort-1, n=123) and T1a-T1b-3 (validation cohort-2, n=108) patients and in that of 195 healthy controls; the Mann-Whitney test for independent samples was used to calculate the significance of the results. Olfactomedin-4 was significantly over-expressed (p<0.0001) in the test cohort and in both validation cohorts.

B: ROC curves of OLFM4 in the BC and T1a-T1b cohorts; the AUC was 0.88 (CI 95: 0.85 - 0.90) in the BC cohort and 0.89 (CI 95: 0.86 - 0.92) in the T1a-T1b cohort.

C: Level of circulating neudesin in the same cohorts. Neudesin was significantly over-expressed (p<0.0001) in the test cohort and in both validation cohorts.

D: ROC curves of NENF in the BC and T1a-T1b cohorts; the AUC was 0.73 (CI 95: 0.69 - 0.77) in the BC cohort and 0.72 (CI 95: 0.68 - 0.77) in the T1a-T1b cohort.

Figure 10: OLFM4 and NENF are early breast cancer biomarkers (second study)

A: Levels of OLFM4 in sera of the 3 T1a-T1b cohorts (n=336), in the 3 BC cohorts (n=766) and in healthy controls (n=195).

B: Levels of NENF in sera of the 3 T1a-T1b cohorts (n=336), in the 3 BC cohorts (n=766) and in healthy controls (n=195).

C: Levels of OLFM4 + NENF in sera of the 3 T1a-T1b cohorts (n=336), in the 3 BC cohorts (n=766) and in healthy controls (n=195).

D: ROC curves of OLFM4, NENF and OLFM4 + NENF in the 3 BC cohorts and in the 3 T1a-T1b cohorts.

EXAMPLES

1. MATERIAL AND METHODS

1.1. Patient selection

• A preliminary study was conducted on 20 healthy controls and (20-50) breast cancer serum samples of female breast cancer patients. This study aimed to evaluate the diagnostic value of potential biomarkers of breast cancer identified via proteomic mapping of a transformed breast cancer cell line and of several breast tumours.

• A first study was subsequently conducted on 65 healthy controls and 335 serum samples of female breast cancer patients. This study aimed to evaluate in more detail the diagnostic value of three specific potential breast cancer biomarkers.

Table 2 - clinical/pathological characteristics of patients with breast carcinoma (first study)

Patient characteristics          BC-1 (n=335)
Age (years) median [min-max]     60 [31-90]
  < 50 (%)                       72 (21.5)
  >= 50 (%)                      263 (78.5)
HR
  negative (%)                   28 (8.3)
  positive (%)                   307 (91.7)
Her2 overexpression (%)          40 (11.9)
Lymph node status
  positive (%)                   41 (12.2)

• In a second study aimed at further validating two biomarkers of the first study, sera from female breast cancer patients were collected at Institut de Cancérologie de l'Ouest (ICO) Paul Papin in Angers for the first cohort, at ICO René Gauducheau in Nantes for the second cohort and at ICO Paul Papin in Angers for the third cohort. The first population set consisted of 277 subjects (BC-1), among which 105 women had a small tumour <1 cm (T1ab-1); the second population set consisted of 171 individuals (BC-2), among which 123 women had a small tumour <1 cm (T1ab-2); the third cohort consisted of 318 women (BC-3), among which 108 had a small tumour <1 cm (T1ab-3) (see their characteristics in Table 3 below). A total of 195 healthy controls from Etablissement Français du Sang were also evaluated.

Table 3 - clinical/pathological characteristics of patients with breast carcinoma (second study)

Patient characteristics          BC-1 (n=277)     BC-2 (n=171)     BC-3 (n=318)
Age (years) median [min-max]     60 [31-87]       61 [33-91]       58 [19-90]
  < 50 (%)                       60 (21.6)        31 (18.1)        81 (25.5)
  >= 50 (%)                      217 (78.4)       140 (81.9)       237 (74.5)
HR
  negative (%)                   23 (8.3)         7 (4.1)          29 (9.1)
  positive (%)                   254 (91.7)       164 (95.9)       289 (90.9)
Her2 overexpression (%)          33 (11.9)        19 (11.1)        41 (12.9)
Lymph node status
  positive (%)                   33 (11.9)        19 (11.1)        41 (12.9)
T1a-T1b (<1cm)                   105              123              108

Patient characteristics          T1ab-1 (n=105)   T1ab-2 (n=123)   T1ab-3 (n=108)
Age (years) median [min-max]     61 [34-84]       63 [36-90]       60 [37-84]
  < 50 (%)                       19 (18.1)        17 (13.8)        18 (16.6)
  >= 50 (%)                      86 (81.9)        106 (86.2)       90 (83.4)
HR
  negative (%)                   7 (6.7)          5 (4.0)          9 (8.3)
  positive (%)                   98 (93.3)        118 (96.0)       99 (91.7)
Her2 overexpression (%)          12 (11.4)        7 (5.7)          12 (11.1)
Lymph node status
  positive (%)                   12 (11.4)        6 (4.9)          15 (13.9)

In each study, all sera were collected after obtaining written informed consent. The study protocol was approved by the Institutional Review Board. All samples were collected, processed and stored in a similar fashion. Briefly, blood sample was centrifuged at 3500 rpm for 10 minutes, and the serum was stored at -80°C. All sera were obtained prior to surgery or neoadjuvant treatment.

1.2. Cell culture

The human breast epithelial cell lines MCF10A LXSN (MCF10A) (non-tumorigenic breast epithelial cell line, expressing a control empty vector) and MCF10A KRASV12 (MCF10A-RAS, which is a tumorigenic breast epithelial cell line) were obtained by retroviral infection as previously described (Konishi et al., 2007). They were kindly provided by Dr Ben Ho Park. MCF10A LXSN and MCF10A KRASV12 cell lines were grown in DMEM/F12 (1:1) (Life Technologies) supplemented with 50 mM Hepes, 5% donor horse serum (DHS; Eurobio), 1% L-glutamine, 20 ng/mL EGF (Peprotech), 10 μg/mL insulin, 0.5 μg/mL hydrocortisone, and 0.1 μg/mL cholera toxin. All supplements were purchased from Sigma-Aldrich unless otherwise noted. Cells were harvested using trypsin-EDTA, the enzyme reaction was stopped with 2 volumes of supplemented DMEM/F12 medium, the cells were washed twice with PBS, and the dried cell pellets were frozen.

1.3. Protein extraction from the MCF10A and MCF10A-RAS breast epithelial cell lines

Approximately 5x10 s cells were lysed in 0.6 ml of 4% SDS and 0.1 M DTT in 0.1 M Tris-HCl, pH 7.6, at room temperature for 30 min and briefly sonicated to reduce the viscosity of the lysate. Detergent was removed from the lysates and the proteins were digested with trypsin according to the FASP protocol (Wisniewski et al., 2009) using spin ultrafiltration units with a nominal molecular weight cut-off of 30 000 Daltons. Briefly, the protein lysate was applied to a YM-30 Microcon filter unit (Cat No. MRCF0R030, Millipore), spun down and washed three times with 200 μL of 8 M urea in 0.1 M Tris/HCl, pH 8.5. Then 6 μL of 200 mM MMTS in 8 M urea was added to the filters and the samples were incubated for 20 min. Filters were washed thrice with 200 μL of 8 M urea in 0.1 M Tris/HCl, pH 8.5, followed by six washes with 100 μL of 0.5 M TEAB. Finally, trypsin (AB Sciex) was added in 100 μL of 0.5 M TEAB to each filter. The protein to enzyme ratio was 100:1. Samples were incubated overnight at 37°C and released peptides were collected by centrifugation. Samples were then dried completely using a Speed-Vac, re-suspended in 100 μL of 0.5% trifluoroacetic acid (TFA) in 5% acetonitrile, and desalted via PepClean C-18 spin columns (Pierce Biotechnology, Rockford, IL). Peptide content was determined using a Micro BCA Protein Assay Kit (Pierce - Thermo Scientific, Rockford, IL).

1.4. Protein extraction from frozen tissues (breast tumors and healthy breast tissues)

Frozen sections (12 μm thick) of breast tumors or normal breast area were cut on a cryostat (Bright Instrument Co Ltd, St Margarets Way, UK). Specific sections were stained with toluidine blue for visual reference. To take into account tumor heterogeneity, ten frozen sections per tumor of luminal A, Her-2 overexpressed and triple-negative breast tumors were lysed in a buffer consisting of 0.1 M Tris-HCl, pH 8.0, 0.1 M DTT, and 4% SDS at 95°C for 90 min. Detergent was removed from the lysates and the proteins were digested with trypsin according to the FASP protocol (Wisniewski et al., 2009) using spin ultrafiltration units with a nominal molecular weight cut-off of 30 000 Daltons. To YM-30 Microcon filter units (Cat No. MRCF0R030, Millipore) containing protein concentrates, 200 μL of 8 M urea in 0.1 M Tris/HCl, pH 8.5 (UA), was added and samples were centrifuged at 14 000 g at 20°C for 8 min. This step was performed thrice. Then 6 μL of 200 mM MMTS in 8 M urea was added to the filters and the samples were incubated for 20 min. Filters were washed thrice with 200 μL of UA followed by six washes with 100 μL of 0.5 M TEAB. Finally, trypsin (AB Sciex) was added in 100 μL of 0.5 M TEAB to each filter. The protein to enzyme ratio was 100:1. Samples were incubated overnight at 37°C and released peptides were collected by centrifugation. Samples were then dried completely using a Speed-Vac, re-suspended in 100 μL of 0.5% trifluoroacetic acid (TFA) in 5% acetonitrile, and desalted via PepClean C-18 spin columns (Pierce Biotechnology, Rockford, IL). Peptide content was determined using a Micro BCA Protein Assay Kit (Pierce - Thermo Scientific, Rockford, IL).

1.5. Peptide labelling with iTRAQ reagents

For the iTRAQ labelling, 100 μg of each peptide solution was labelled at room temperature for 2 h with one iTRAQ reagent vial previously reconstituted with 70 μL of ethanol for 4plex iTRAQ reagent. Labelled peptides were subsequently mixed in a 1:1:1:1 ratio and dried completely using a Speed-Vac.

1.6. Peptide OFFGEL fractionation

For pI-based peptide separation, the 3100 OFFGEL Fractionator (Agilent Technologies, Böblingen, Germany) was used with a 12 or 24-well set-up using the following protocol. First, samples were desalted onto a Sep-Pak C18 cartridge (Waters). For the 24-well set-up, peptide samples were diluted in the OFFGEL peptide sample solution to a final volume of 3.6 mL. Then, the 24 cm-long IPG gel strip (GE Healthcare, München, Germany) with a 3-10 linear pH range was rehydrated with the Peptide IPG Strip Rehydration Solution, according to the protocol of the manufacturer, for 15 min. 150 μL of sample was loaded in each well. Electrofocusing of the peptides was performed at 20°C and 50 μA until the 50 kVh level was reached. After focusing, the 24 peptide fractions were withdrawn and the wells were washed with 200 μL of a solution of water/methanol/formic acid (49/50/1). After 15 min, each washing solution was pooled with its corresponding peptide fraction. All fractions were evaporated by centrifugation under vacuum and maintained at -20°C. For the 2D-OFFGEL approach, the peptides were first fractionated in 12 fractions in the pH range 3-10. Then, fractions F1-F2, fractions F3 to F8 and fractions F9 to F12 were pooled and refractionated in 24 fractions in the pH range 3.5-4.5, 4-7 and 6-9, respectively. A total of 72 fractions were obtained, which were subsequently analysed by nanoLC-MS/MS.

1.7. Capillary LC separation

Just before nano-LC analysis, each fraction was resuspended in 20 μL of H2O with 0.1% (v/v) TFA. The samples were separated on an Ultimate 3000 nano-LC system (Dionex, Sunnyvale, USA) using a C18 column (PepMap100, 3 μm, 100 Å, 75 μm id x 15 cm, Dionex) at a flow rate of 300 nL/min. Buffer A was 2% ACN in water with 0.05% TFA and buffer B was 80% ACN in water with 0.04% TFA. Peptides were desalted for 3 min using only buffer A on the precolumn, followed by a separation for 105 min using the following gradient: 0 to 20% B in 10 min, 20% to 45% B in 85 min and 45% to 100% B in 10 min. Chromatograms were recorded at the wavelength of 214 nm. Peptide fractions were collected using a Probot microfraction collector (Dionex). CHCA (LaserBioLabs, Sophia-Antipolis, France) was used as MALDI matrix. The matrix (concentration of 2 mg/mL in 70% ACN in water with 0.1% TFA) was continuously added to the column effluent via a micro "T" mixing piece at a flow rate of 1.2 μL/min. After a 12 min run, a start signal was sent to the Probot to initiate fractionation. Fractions were collected for 10 sec and spotted on a MALDI sample plate (1,664 spots per plate, ABsciex, Foster City, CA).
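As an illustration only, the three-step gradient described above can be written as a piecewise-linear function of time. The sketch below assumes linear ramps between the stated breakpoints and takes t = 0 at the start of the 105 min separation (after the 3 min desalting step); the function name `percent_b` is hypothetical.

```python
# Breakpoints (minutes, % buffer B) taken from the gradient in the text:
# 0-20% B in 10 min, 20-45% B in 85 min, 45-100% B in 10 min.
GRADIENT = [(0.0, 0.0), (10.0, 20.0), (95.0, 45.0), (105.0, 100.0)]


def percent_b(t_min):
    """Return the percentage of buffer B at time t_min (assumes linear ramps)."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the 0-105 min separation window")
    # Find the segment containing t_min and interpolate linearly within it.
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For example, `percent_b(5)` gives the buffer B percentage halfway through the first ramp.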

1.8. MALDI-MS/MS

MS and MS/MS analyses of off-line spotted peptide samples were performed using the 5800 MALDI-TOF/TOF Analyser (ABsciex) and 4000 Series Explorer software, version 4.0. The instrument was operated in positive ion mode and externally calibrated using a mass calibration standard kit (ABsciex). The laser power was set between 2800 and 3400 for MS and between 3600 and 4200 for MS/MS acquisition. After screening all LC-MALDI sample positions in MS-positive reflector mode using 2000 laser shots, the fragmentation of automatically-selected precursors was performed at a collision energy of 1 kV using air as collision gas (pressure of ~ 2 x 10^-6 Torr) with an accumulation of 3000 shots for each spectrum. MS spectra were acquired between m/z 1000 and 4000. The parent ion of Glu1-fibrinopeptide at m/z 1570.677, diluted in the matrix (30 femtomoles per spot), was used for internal calibration. Up to 12 of the most intense ion signals per spot position having a S/N > 20 were selected as precursors for MS/MS acquisition. The identification of peptides and proteins was performed by the ProteinPilot™ Software V 4.0 (AB Sciex) using the Paragon algorithm as the search engine (Shilov et al., 2007). Each MS/MS spectrum was searched for Homo sapiens species against the Uniprot/swissprot database (UniProtKB/Sprot 20110208 release 01, with 525997 sequence entries). The searches were run with the fixed modification of methylmethanethiosulfate-labeled cysteine enabled. Other parameters, such as tryptic cleavage specificity, precursor ion mass accuracy and fragment ion mass accuracy, were MALDI 5800 built-in functions of the ProteinPilot software. The detected protein threshold (unused ProtScore (confidence)) in the software was set to 1.3 to achieve 95% confidence, and identified proteins were grouped by the ProGroup algorithm (ABsciex) to minimize redundancy. The bias correction option was executed.
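The correspondence between an unused ProtScore threshold of 1.3 and 95% confidence can be checked numerically. The sketch below assumes the relation ProtScore = -log10(1 - confidence), which is consistent with the 1.3 / 95% pair quoted above but is not stated in this document; the function names are hypothetical.

```python
import math


def protscore_from_confidence(confidence):
    """Convert a confidence (0 <= confidence < 1) to a ProtScore.

    Assumes ProtScore = -log10(1 - confidence); this relation is an
    assumption, not something stated in the text.
    """
    if not 0.0 <= confidence < 1.0:
        raise ValueError("confidence must be in [0, 1)")
    return -math.log10(1.0 - confidence)


def confidence_from_protscore(score):
    """Inverse of protscore_from_confidence under the same assumption."""
    return 1.0 - 10.0 ** (-score)
```

Under this assumption, a confidence of 0.95 maps to a score of about 1.30, and a score of 1.3 maps back to a confidence of about 0.95.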

To estimate the false discovery rate (FDR), a decoy database search strategy was used. The FDR is defined as the percentage of decoy proteins identified relative to the total number of protein identifications. The FDR was calculated by searching the spectra against the Uniprot Homo sapiens decoy database.

1.9. ELISA tests

Commercially available ELISA kits from USCN Life Science Inc. or R&D were used to assay concentrations of OLFM4, NENF and DSP. The kits consisted of 96-well microtiter plates coated with an antibody specific to each type of molecule, detection antibodies for identifying the antibody-protein complex in the plate by streptavidin-biotin labeling, and TMB substrate, which generated a colored product. The sample was added and the assay was conducted according to the manufacturer's instructions. The absorbance of the colored product developed at the end of the assay was quantified at a wavelength of 450 nm on an ELISA reader (Tecan Magellan Sunrise).

1.10. Statistical quantification of relative protein expression

For the quantification of the relative protein expression, the customized software package iQuantitator (Schwacke et al., 2009; Grant et al., 2009) as well as the software packages TANAGRA (V1.4) and GraphPad Prism 5 were used to infer the magnitude of change in protein expression. These tools infer treatment-dependent changes in expression using Bayesian statistical methods and, more specifically, the Mann-Whitney test for independent samples and receiver-operating-characteristic (ROC) curves. Basically, this approach was used to generate means, medians, the area under the ROC curve (AUC), and 95% confidence intervals (upper and lower limits), and to test the hypothesis that the AUC was greater than 0.5 for each treatment-dependent change in protein expression by using peptide-level data for each component peptide.
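The Mann-Whitney statistic and the area under the ROC curve are directly related: the AUC equals U divided by the product of the two group sizes. A minimal sketch of that relation follows; it is not the implementation used by TANAGRA or GraphPad Prism, the function names are hypothetical, the concentrations are invented for illustration, and the p-value calculation is omitted.

```python
def mann_whitney_u(cases, controls):
    """U statistic: number of (case, control) pairs where the case is higher.

    Ties count for one half, following the usual convention.
    """
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def roc_auc(cases, controls):
    """Area under the ROC curve, computed as U / (n_cases * n_controls)."""
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))


# Hypothetical serum concentrations (ng/ml), for illustration only.
cases = [45.0, 30.0, 52.0, 12.0]
controls = [10.0, 8.0, 30.0, 5.0]
auc = roc_auc(cases, controls)
```

An AUC of 0.5 corresponds to no discrimination between the two groups, which is why the hypothesis tested is that the AUC exceeds 0.5.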

For proteins whose iTRAQ ratios were decreased, the extent of down-regulation was considered further if the upper limit of the credible interval had a value lower than 1. Conversely, for proteins whose iTRAQ ratios were increased, the extent of up-regulation was considered further if the lower limit of the credible interval had a value greater than 1. The width of these credible intervals depended on the data available for a given protein. The iQuantitator software took into consideration all the peptides observed and the number of spectra used to quantify the change in expression for a given protein. In these conditions, it was possible to detect small but significant changes in up- or down-regulation when many peptides were available. The peptide selection criteria for relative quantification were as follows. Only peptides unique to a given protein were considered for relative quantification, excluding those common to other isoforms or proteins of the same family. Proteins were identified on the basis of having at least two peptides with an ion score above 95% confidence.
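The interval rule described above can be sketched as a small classifier: a ratio counts as down-regulated only if the whole interval lies below 1, and as up-regulated only if the whole interval lies above 1. The function name and the labels are hypothetical, and the example limits in the usage note are invented.

```python
def classify_ratio(lower, upper):
    """Classify an iTRAQ ratio from the lower and upper limits of its interval."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if upper < 1.0:
        return "down"       # entire interval below 1
    if lower > 1.0:
        return "up"         # entire interval above 1
    return "unchanged"      # interval includes 1: not considered further
```

For instance, an interval of 0.5 to 0.8 would be classified as "down", whereas an interval of 0.9 to 1.3 includes 1 and would not be considered further.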

2. RESULTS

The strategy used to identify new candidate biomarkers for breast tumours is summarised in Figure 1. Firstly, the inventors drew up a proteomic map of a transformed breast cell line and of several breast tumours with different statuses in order to establish the most exhaustive list possible of the proteins that characterise breast tumours. This list was compared against the list of secreted proteins that have already been identified in human blood (Schenk et al., 2008; Cao et al., 2013; HUPO Plasma Proteome Project website http://www.ccmb.med.umich.edu/PPP) and was then examined against the "Breast cancer database" (http://www.itb.cnr.it/breastcancer/) so that only the secreted proteins that are novel in breast cancer pathology were selected.

2.1. Proteomic analysis of the MCF10A and MCF10A-RAS cell lines

(Imbalzano et al., 2013)

Using the two-step OFFGEL approach, the inventors mapped the proteome of the non-transformed breast cell line (MCF10A), as well as that of this same cell line transformed with the KRAS oncogene in order to mimic oncogenic activation and abnormal survival. Although mutations in the RAS gene are not common in breast cancers, the RAS pathway is activated in this disease by overexpression of growth factor receptor signaling such as the ErbB2 receptor, which is activated in 30% of breast cancers. RAS-induced breast tumors are characterized by activation of mitogen-activated protein kinase signaling, which is well known to be associated with early neoplasia and poor prognosis. Using the proteomic approach described herein, 2152 proteins with at least two peptides were identified. Out of these proteins, 262 were found in the secreted proteins databases (data not shown).

2.2. Proteomic analysis of luminal A breast tumours

Through a similar proteomic mapping of luminal A breast tumours, the inventors were able to identify 1093 proteins with at least two peptides. Out of these proteins, 246 were secreted proteins (data not shown).

2.3. Proteomic analysis of breast tumours expressing HER2

Through a similar proteomic mapping of breast tumours with HER2 receptor over-expression, the inventors were able to identify 624 proteins with at least two peptides. Out of these proteins, 225 were secreted proteins (data not shown).

2.4. Proteomic analysis of triple-negative breast tumours

Through a similar proteomic mapping of triple negative breast tumours, the inventors identified 2407 proteins with at least two peptides. Out of these proteins, 266 were secreted proteins (data not shown).

2.5. Comparative proteomic analysis

A comparison of all the identified secreted proteins in the MCF10A-RAS cell line, as well as in luminal A, HER2 positive and triple-negative breast tumors, against the "Breast cancer database" (http://www.itb.cnr.it/breastcancer/) allowed the selection of 125 novel proteins in breast cancer pathology in the first study, of which 121 were validated in the second study (data not shown).
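The comparison described above is, in essence, a sequence of set operations. The sketch below is one possible reading, assuming that the identified proteins of all samples are pooled (union), restricted to proteins already reported as secreted, and then stripped of proteins already listed in the breast cancer database. The function name and the placeholder protein identifiers are hypothetical.

```python
def novel_secreted(identified_by_sample, known_secreted, breast_cancer_db):
    """Return secreted proteins identified in any sample but absent from the database.

    identified_by_sample: dict mapping a sample name to a set of protein symbols.
    known_secreted: set of proteins already reported as secreted in human blood.
    breast_cancer_db: set of proteins already listed in the breast cancer database.
    """
    identified = set().union(*identified_by_sample.values())  # pool all samples
    return (identified & known_secreted) - breast_cancer_db


# Placeholder identifiers, for illustration only.
identified_by_sample = {
    "MCF10A-RAS": {"P1", "P2", "P3"},
    "luminal A": {"P2", "P4"},
    "HER2+": {"P5"},
    "triple-negative": {"P1", "P6"},
}
known_secreted = {"P1", "P2", "P4", "P6", "P7"}
breast_cancer_db = {"P2", "P7"}
novel = novel_secreted(identified_by_sample, known_secreted, breast_cancer_db)
```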

2.6. Dysregulated new secreted proteins in the MCF10A cell line transformed by the KRAS oncogene (MCF10A-RAS)

Comparing the list of the 121 secreted proteins against the list of significantly dysregulated proteins in the MCF10A cell line transformed by the KRAS oncogene, 12 proteins were identified (Table 4). Seven were under-expressed (DSP, JUP, ACTN1, CTNNA1, METTL13, HSPD1 and GSTP1) and five were over-expressed (COPA, TLN1, PYGB, HPCAL1 and IGF2R).

Table 4. Dysregulated secreted proteins in the MCF10A cell line transformed by the KRAS oncogene

2.7. Dysregulated new secreted proteins in triple-negative tumours

Comparing the list of the 121 secreted proteins against the list of significantly dysregulated proteins in the triple-negative tumours, 15 proteins were identified. Two were down-regulated (APOH and CFH) and 13 were over-expressed (CMPK1, ALDOA, COPA, DDT, CFL1, GSTO1, ARF1, COTL1, FTL, DSTN, DSP, ACTN1, TNC) (Table 5).

Table 5. Dysregulated secreted proteins in triple-negative tumours

Symbol   Full name of the biomarker            iTRAQ ratio
APOH     Beta-2-glycoprotein 1                 0.602
CFH      Complement factor H                   0.693
CMPK1    UMP-CMP kinase                        1.332
ALDOA    Fructose-bisphosphate aldolase A      1.37
COPA     Coatomer subunit alpha                1.496
DDT      D-dopachrome decarboxylase            1.504
CFL1     Cofilin-1                             1.569
GSTO1    Glutathione S-transferase omega-1     1.586
ARF1     ADP-ribosylation factor 1             1.624
COTL1    Coactosin-like protein                1.703
FTL      Ferritin light chain                  1.743
DSTN     Destrin                               1.978
DSP      Desmoplakin                           2.119
ACTN1    Alpha-actinin-1                       2.196
TNC      Tenascin                              1.616

2.8. Dysregulated new secreted proteins in a HER2+ tumour compared to the healthy tissue

Comparing the proteome of dysregulated proteins against the list of the 125 proteins, five proteins were characterised; one under-expressed protein (CFH) and four over-expressed proteins (ANXA2, FTL, TAGLN2, TNC) (Table 6).

Table 6. Dysregulated secreted proteins in HER2+ tumour

2.9. Glycoproteomic analysis

The purpose of this analysis was to complete the sub-proteome of secreted proteins. Using three breast tumours, the inventors established a glycoproteome. Using this glycoproteome, 5 secreted proteins (HPX, OLFM4, OLFML3, TNC, VCAN) that have never been studied as breast cancer biomarkers were characterized.

2.10. Selection of candidate biomarkers

The inventors employed a systematic scoring system to select 5 candidates: DSP, NENF, OLFM4, TNC and VCAN.

2.11. Candidate validation in breast cancer patients

A preliminary verification was performed on 20 healthy controls and (20-50) breast cancer serum samples. The concentration medians for TNC and VCAN in cancer samples were not significantly (p>0.05) different from those of healthy controls. The concentration medians for OLFM4 and NENF were 2.2 and 3.1-fold higher than in healthy control sera, respectively, with a p-value <0.005. The concentration median for DSP in breast cancer samples was not significantly different from that of healthy controls, but a significant difference (p-value<0.032) between healthy control samples and the small tumor group (size<2 cm, pT1) was identified (Figure 2). In this case, the DSP concentration was lower than in the control group. Conversely, among the 50 breast cancer samples, 3 samples overexpressing DSP were found, which corresponded to 3 recurrent breast tumors.

2.12. OLFM4 and NENF elevation in breast cancer sera

• To further evaluate the potential of OLFM4 and NENF as serum breast cancer biomarkers, their serum concentrations were determined in a first study based on 65 healthy subjects and 335 subjects with breast cancer. OLFM4 and NENF were found to be significantly elevated (p<0.0001) in breast cancer sera (regardless of the grade of the tumor) compared to healthy sera (Figure 3A and B). The OLFM4 and NENF serum concentrations were then combined for each patient (n=335) and this value was compared to that obtained in healthy control sera: the OLFM4+NENF concentration was found to be significantly elevated (p<0.0001) in breast cancer sera (Figure 3C).

• In a second validation study, 766 participants were divided into 3 independent cohorts: BC-1 recruited in the Angers ICO Cancer Center, BC-2 recruited in the Nantes ICO Cancer Center and BC-3 recruited in the Angers ICO Cancer Center. The concentrations of both markers were also determined in 195 healthy subjects. For the control cohort, the OLFM4 median concentration was 9.96 ng/ml (IQR 1.00-21.94) and the NENF median concentration was 6.77 ng/ml (IQR 1.46-13.14). The median concentrations for OLFM4 and NENF were found to be significantly elevated (p<0.0001) in breast cancer sera as compared to healthy samples (Figure 7A and 7C); the values did not significantly differ between the three cohorts. When the 3 independent breast cancer sera cohorts were combined, an OLFM4 median concentration of 47.00 ng/ml (IQR 25.00-75.00) and a NENF median concentration of 16.82 ng/ml (IQR 8.05-31.69) were determined.

ROC curves showed that the optimum diagnostic cutoff for OLFM4 was 29.8, 30.0 and 29.4 ng/ml for BC-1, BC-2 and BC-3, respectively. The optimum cutoff value for NENF was 13.8, 15.6 and 13.8 ng/ml for BC-1, BC-2 and BC-3, respectively. When the 3 cohorts were combined (Figure 7B and 7D), the optimum diagnostic values were 30.6 and 15.6 ng/ml for OLFM4 and NENF, respectively. As these values were very similar to those obtained for each independent cohort, the cutoff values in this study were chosen to be 31 ng/ml for OLFM4 and 16 ng/ml for NENF (Table 7). With these cutoff values, the sensitivity ranged from 64 to 78% for OLFM4 and from 52% to 60% for NENF.
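The text does not state which criterion defined the "optimum" cutoff. One common choice, assumed here purely for illustration, is the cutoff that maximises Youden's index (sensitivity + specificity - 1). In the sketch below, values strictly above the cutoff are called positive, the function names are hypothetical, and the concentrations are invented.

```python
def sens_spec(cases, controls, cutoff):
    """Sensitivity and specificity when values > cutoff are called positive."""
    sens = sum(x > cutoff for x in cases) / len(cases)
    spec = sum(y <= cutoff for y in controls) / len(controls)
    return sens, spec


def youden_cutoff(cases, controls):
    """Observed value maximising Youden's index (assumed optimality criterion).

    On ties, the lowest candidate cutoff is kept.
    """
    best_cutoff, best_j = None, None
    for c in sorted(set(cases) | set(controls)):
        sens, spec = sens_spec(cases, controls, c)
        j = sens + spec - 1.0
        if best_j is None or j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff


# Hypothetical serum concentrations (ng/ml), for illustration only.
cases = [35.0, 50.0, 20.0, 60.0, 40.0]
controls = [10.0, 25.0, 5.0, 31.0, 15.0]
cutoff = youden_cutoff(cases, controls)
```

Other criteria (for example, the point closest to the top-left corner of the ROC plot) can give a different cutoff on the same data.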

In order to develop a specific test, the cutoff values at 90 and 95% specificity were calculated:

o for OLFM4, with 90% specificity, the cutoff value reached 33.9 ng/ml in the three independent cohorts and in the total cohort with a sensitivity ranging from 57 to 71%; with 95% specificity, the cutoff value rose to 41 or 42 ng/ml with a sensitivity from 49 to 63% (Table 7);

o for NENF, with 90% specificity, the cutoff value reached 13.8 or 15.6 ng/ml for the different cohorts with a sensitivity ranging from 53 to 60%; with 95% specificity, the cutoff value rose to 39 ng/ml with a sensitivity from 14 to 22% (Table 7).

To test whether these markers were complementary, the combined markers OLFM4 and NENF were estimated by binary logistic regression and the values of this function were used as one marker and subjected to ROC analysis. The values corresponding to the addition of the ELISA concentrations of OLFM4 and NENF in the same serum were also tested and the resulting ROC curves were equivalent (data not shown). So, when combining the OLFM4 and NENF ELISA concentrations, ROC curves showed that the optimum diagnostic cutoff was 38.3 or 38.4 ng/ml in the three independent cohorts or in the combined cohort, with a specificity of 87% and a sensitivity ranging from 75 to 85% (Figure 7E).
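A cutoff at a fixed specificity depends only on the distribution of the marker in the healthy controls, and the simpler of the two combinations described above is just the sum of the two concentrations per serum. The sketch below illustrates both steps under the assumption that values strictly above the cutoff are called positive; the function names are hypothetical and the numbers in the usage note are invented.

```python
def cutoff_at_specificity(controls, target_spec):
    """Smallest observed control value c with (fraction of controls <= c) >= target_spec."""
    if not 0.0 < target_spec <= 1.0:
        raise ValueError("target_spec must be in (0, 1]")
    ordered = sorted(controls)
    n = len(ordered)
    for i, c in enumerate(ordered, start=1):
        if i / n >= target_spec:
            return c


def combined_marker(olfm4, nenf):
    """Per-serum sum of the OLFM4 and NENF concentrations (simple additive combination)."""
    if len(olfm4) != len(nenf):
        raise ValueError("both markers must be measured in the same sera")
    return [a + b for a, b in zip(olfm4, nenf)]
```

With ten hypothetical control values 2, 4, ..., 20 ng/ml, `cutoff_at_specificity(controls, 0.9)` returns 18, since nine of the ten controls lie at or below that value. The logistic regression combination mentioned in the text is not reproduced here.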

When the ROC curves were compared, the AUC for OLFM4+NENF and OLFM4 alone appeared to be very close (Figure 7F). The proportions of patients who were positive for OLFM4, NENF and OLFM4+NENF were compared in the different specificity conditions (Figure 8). For OLFM4, the proportion of positive patients appeared to be higher in the test cohort (BC-1). Across the 3 cohorts, the proportion of positive patients was 70% at the optimum cutoff, and 66% and 56% at 90% and 95% specificity, respectively (Figure 8A). For NENF, the proportion of positive patients was quite similar in the 3 cohorts: an average of 53% of positive patients was reached at the optimum cutoff, and 32% and 19% at 90% and 95% specificity, respectively (Figure 8B). When the two markers were combined, more than 80% of patients were positive at the optimum cutoff value in the test cohort and in the validation cohort 2 (BC-3), compared with 75% for the BC-2 cohort. At 90% specificity, between 70 and 81% of patients were still positive. At 95% specificity, between 62 and 73% of patients were OLFM4+NENF positive (Figure 8C). The proportion of patients who were positive for OLFM4 at the optimum cutoff (31 ng/ml) was 70% and this proportion reached 81% when the two markers were combined (at the optimum cutoff of 39 ng/ml, Figure 8A and 8C). At each specificity value, the proportion of OLFM4+NENF positive patients exceeded the proportion of patients positive for OLFM4 alone by at least 10%.

Table 7. Area under the ROC curve, sensitivity and specificity values for diagnostic tests based on OLFM4, NENF or OLFM4+NENF (second study)

2.13. DSP is decreased in sera of patients with early tumors and increased in sera of patients with recurrent tumors

To evaluate the potential of DSP to discriminate small tumors (low DSP) and recurrent tumors (high DSP) from controls, the serum concentrations in 65 healthy subjects and in 384 patients with breast cancer were determined (first study). DSP was found to be significantly elevated (p=0.0069) in recurrent tumor sera, in agreement with our tumor proteomic approach. Conversely, DSP was found to be significantly decreased (p=0.0037) in small tumors with a size <1 cm (pT1a and pT1b tumors) (Figure 4). This low serum DSP concentration was consistent with the mammary cell line proteomic results.

2.13. OLFM4, NENF and DSP are biomarkers of breast cancer

The proportions of patients who tested positive for OLFM4, NENF and DSP are shown in Figure 5 (first study). OLFM4 showed significant elevation in 208 breast cancer sera (335 sera in the total cohort). Among the 127 patients who were negative for OLFM4, 58 were positive for NENF. Among the remaining 69 patients who were negative for both OLFM4 and NENF, 39 were positive for low-DSP. Finally, 30 patients were negative for the three biomarkers (Figure 5). It should be noted that, within this cohort of 335 patients, 32 patients had a high DSP level (>1800 pg/ml). Among these 32 patients, 12 were in a recurrence state (38%).

2.14. OLFM4, low-DSP and NENF are biomarkers for breast cancer in the early phase

• To further evaluate the potential of OLFM4, NENF and low-DSP as serum biomarkers of early-phase breast cancer, their serum concentrations were determined in a first study based on 65 healthy subjects and 81 patients with a small tumor (<1 cm), which represents early breast cancer. The proportion of subjects who tested positive for OLFM4 was 65%, and this proportion did not increase when the patients who tested positive for NENF were added, suggesting that NENF might not be a biomarker of early breast tumors. Nevertheless, when the subjects positive for low-DSP were added, 91% of patients were detected as positive (Figure 6A). A predictor combining OLFM4 and low-DSP information was built using a logistic regression model. This predictor showed an AUC of 0.92 for early breast cancer patients compared to a population without cancer. The sensitivity of the test was 87%, and the specificity was 84% (Figure 6D).

• In the second validation study, which was carried out on a much larger cohort of patients, the potential of OLFM4 and NENF as serum biomarkers in the early phase of breast cancer was further evaluated. To this end, their serum concentrations were determined in 336 patients with a small tumor (<1 cm) divided into 3 independent cohorts (105 patients for T1ab-1, 123 patients for T1ab-2 and 108 patients for T1ab-3).

As for the BC cohorts, the median concentrations of OLFM4 and NENF were found to be significantly elevated (p<0.0001) in T1a-T1b breast tumor sera as compared to healthy samples (Figure 9A and 9C). The values did not differ significantly between the three cohorts, and were similar to those of the BC cohorts. When the 3 independent breast cancer sera cohorts were combined, the OLFM4 median concentration was 51.89 ng/ml (IQR 32.69-83.52) and the NENF median concentration was 17.40 ng/ml (IQR 7.56-30.99).

ROC curves showed that the AUC for OLFM4 in the T1a-T1b cohort was slightly higher than in the BC cohort (0.89 and 0.88, respectively), while the AUC values for NENF were identical (Table 7 and Figure 9D). When both markers were combined, the median concentrations were identical in the T1a-T1b cohorts and in the BC cohorts (Figure 10). In the same manner, the ROC curves were superimposable. When the proportions of patients who were positive for OLFM4, NENF and OLFM4+NENF were compared under the different specificity conditions, the results appeared to be very close to those obtained for the BC cohorts. In addition, the ROC curves for OLFM4 in the BC and T1a-T1b cohorts were superimposed.
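For readers less familiar with the AUC values compared above, the following Python sketch shows one standard way to compute an area under the ROC curve: as the fraction of (patient, control) pairs in which the patient serum has the higher concentration, with ties counted as one half. This is a generic illustration rather than the software used in the studies, and the concentrations are hypothetical.

```python
def roc_auc(case_values, control_values):
    """AUC as the probability that a random case scores higher than a random
    control (ties count one half). This equals the Mann-Whitney U statistic
    divided by the number of case/control pairs."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical serum concentrations in ng/ml, for illustration only.
healthy = [12.0, 18.5, 22.0, 25.4, 30.5]
early_tumor = [28.0, 41.2, 51.9, 72.3]

auc = roc_auc(early_tumor, healthy)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to a marker that does not discriminate at all, and 1.0 to perfect separation of the two groups.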

The analysis of patients positive for OLFM4, NENF and OLFM4 plus NENF showed results comparable to those obtained in the BC cohorts. The proportion of patients who were positive for OLFM4 or NENF at the optimum cutoff value of the ROC curve was 86%. At 90% specificity, this proportion was 78%, and at 95% specificity, 67%. When OLFM4 and NENF were combined, the proportion of patients who were positive for OLFM4+NENF, for OLFM4 alone or for NENF alone was also 86% at the optimum cutoff value; this value was 79% at 90% specificity and 69% at 95% specificity.

3. CONCLUSION

The aim of the present studies was to use an innovative proteomic approach to identify novel breast cancer biomarkers. Since no serious candidate had been highlighted with the usual approaches, the present work focused on secreted proteins that had never been described as potential biomarkers of breast cancer. The first step was to create a protein database from a transformed breast cell line, and from luminal A, Her2-overexpressed and triple negative breast tumors, using a global proteomic approach. The identified proteins were compared with the HUPO Plasma project database and Mann's work to ensure that these proteins could be detected in the blood of patients. The identified proteins were then compared with the Breast Cancer database, and over a hundred secreted proteins that had never been tested in breast pathology were identified.

Among these proteins, the inventors have more specifically identified OLFM4, NENF and DSP as novel breast cancer biomarkers, easily detectable in sera from patients suffering from such cancer.

Indeed, based on the first study conducted by the inventors, elevation of OLFM4 and NENF, as well as dysregulation of DSP, allows the detection of breast cancer. In the case of breast cancer prevention (small tumor detection), it appeared that an elevation of OLFM4 and a decrease of DSP (<600 pg/ml) should preferably be sought. In the case of breast cancer monitoring, an elevation of DSP (>1800 pg/ml) should be sought to identify a breast cancer recurrence.
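The DSP thresholds stated above can be written as a simple decision rule. The Python sketch below is illustrative only: the function names are hypothetical, the OLFM4 cutoff is passed in as a parameter rather than fixed, and combining elevated OLFM4 with low DSP through a logical OR is an assumption modelled on the stepwise counting of positive patients in section 2.14, not a rule stated by the inventors.

```python
# DSP thresholds quoted in the text, in pg/ml.
LOW_DSP_PG_ML = 600.0
HIGH_DSP_PG_ML = 1800.0


def dsp_status(dsp_pg_ml):
    """Classify a serum DSP concentration as 'low', 'high' or 'intermediate'."""
    if dsp_pg_ml < LOW_DSP_PG_ML:
        return "low"
    if dsp_pg_ml > HIGH_DSP_PG_ML:
        return "high"
    return "intermediate"


def interpret(olfm4_ng_ml, dsp_pg_ml, olfm4_cutoff_ng_ml):
    """Return illustrative flags for one serum sample.

    Assumption: a sample is flagged for early detection when OLFM4 is above
    the supplied cutoff OR DSP is low; it is flagged for possible recurrence
    when DSP is high.
    """
    status = dsp_status(dsp_pg_ml)
    return {
        "dsp_status": status,
        "early_detection_flag": olfm4_ng_ml > olfm4_cutoff_ng_ml or status == "low",
        "recurrence_flag": status == "high",
    }


# Hypothetical patient values; 31 ng/ml is used here only as an example cutoff.
result = interpret(olfm4_ng_ml=25.0, dsp_pg_ml=450.0, olfm4_cutoff_ng_ml=31.0)
print(result)
```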

The second study was carried out by the inventors on a larger cohort of patients with three independent sera cohorts, in order to further evaluate the breast cancer biomarkers NENF and OLFM4. When the diagnostic accuracy of the combination of both biomarkers was evaluated to distinguish breast cancer patients from healthy controls using a generalized ROC criterion, a very significant overall diagnostic accuracy was observed. By bringing together the 3 sera sets, an AUC of 91% with a sensitivity of 82% and a specificity of 87% was obtained. Very interestingly, these values remained elevated in early-stage breast cancer patients (tumor size below 1 cm), and the sensitivity and specificity differed only slightly between the different sera cohorts. The diagnostic capabilities of serum OLFM4 and NENF were similar in the overall breast cancer cohorts and in the early-stage breast cancer cohorts. Moreover, the association of the two markers was clearly beneficial for diagnosis prediction. In the breast cancer set, 421 patients (70%) were positive for OLFM4 alone and 342 patients (53%) were positive for NENF alone. When both markers were combined, 556 patients (85%) were detected. In the same way, for the early-stage breast cancer cohort, 78% of patients were positive for OLFM4 alone and 53% for NENF alone. This proportion rose to 82% when OLFM4 and NENF concentrations were combined.

The present data therefore indicate that the serum biomarkers OLFM4, NENF and DSP, and more preferably OLFM4 and NENF, can be used to detect breast cancer, especially for early-stage diagnosis. As those proteins are secreted, their expression can be measured simply and reliably, without the need for invasive techniques, in order to detect breast cancer early on, either in combination with mammography to increase the rate of detection of occult cancers, or by allowing mammograms to be spaced further apart during patient monitoring.

The potential benefit of a detection methodology designed to identify early-stage breast cancer is clear. Mammography has been shown to be the most effective screening tool for detecting breast cancer early and for saving lives. However, mammography has intrinsic limitations that may be difficult to overcome, and its sensitivity ranges between 63 and 87%, depending on age, breast density, and tumor characteristics. Therefore, complementary tests are needed to detect women with breast cancer and to increase the diagnostic sensitivity of screening approaches. Serum biomarkers may help to increase the positive predictive value of mammographic lesions, thereby decreasing the number of women who undergo unnecessary biopsies. In addition, biomarkers may also be used to select cases for more sensitive diagnostic techniques, such as magnetic resonance imaging.

Another significant application will be the monitoring of young women "at risk". Women at a high risk of developing breast cancer are essentially those carrying BRCA1 and BRCA2 gene mutations or with a high likelihood of a hereditary predisposition to breast cancer. Consequently, screening is reinforced in women carrying these mutated genes: they should undergo twice-yearly clinical examinations and imaging tests as soon as they reach the age of 30.

Since 2004, several prospective trials have compared breast imaging techniques in women carrying these mutations or at a high risk of breast cancer. All the trials found MRI superior to the other techniques for the early detection of breast cancer. Sardanelli et al. (2007) analysed the results of 5 prospective studies assessing the performance of mammography, ultrasound and breast MRI: the sensitivity of breast MRI is high (80% versus 40% for mammography), but its positive predictive value is low, i.e. only 53%, indicating that the number of biopsies performed for false-positive results increased with this test. Prophylactic bilateral mastectomy may reduce the risk of breast cancer onset by 85 to 100%, but no data have shown a benefit of such a procedure in terms of overall survival compared to monitoring and early screening. The studies also found no survival benefit when prophylactic mastectomy was performed at an early stage, as early as 25 years of age.
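Because positive predictive value depends on how common the disease is in the tested population, as well as on sensitivity and specificity, a short calculation can make the link explicit. The Python sketch below applies the standard Bayes formula; the sensitivity and specificity are the combined OLFM4+NENF values reported in this conclusion (82% and 87%), whereas the prevalence values are purely hypothetical and chosen only to show the trend.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = true positives / (true positives + false positives),
    expressed per unit of tested population."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity as reported for the combined sera sets;
# the prevalence values below are hypothetical.
SENSITIVITY = 0.82
SPECIFICITY = 0.87

for prevalence in (0.01, 0.05, 0.20, 0.50):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.2f} -> PPV {ppv:.3f}")
```

The output illustrates why predictive value is usually discussed for a defined population, such as women with a suspicious mammographic lesion or women at high hereditary risk, rather than as a fixed property of a test.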

Simple monitoring involving twice-yearly clinical examinations, breast MRI and a mammogram is an alternative associated with hardly any complications. The procedure does not reduce the risk of cancer, but aims to detect and treat precancerous and cancerous lesions as early as possible.

The data submitted herein clearly demonstrate that the use of the serum biomarkers OLFM4 and/or NENF, potentially along with DSP, would enhance this positive predictive value while maintaining a good sensitivity.

REFERENCES

Independent UK Panel on Breast Cancer Screening (2012). The Lancet; 380(9855): 1778 - 1786.

Liu W., Chen L., Zhu J., and Rodgers G.P. (2006). Exp. Cell Res.; 312: 1785-1797.

Zhang X., Huang Q., Yang Z., Li Y., and Li C.Y. (2004). Cancer Res.; 64: 2474-2481.

Kobayashi D., Koshida S., Moriai R., Tsuji N., and Watanabe N. (2007). Cancer Sci.; 98(3): 334-340.

Kimura I., Nakayama Y., Zhao Y., Konishi M., and Itoh N. (2013). Front Neurosci.; 7: 111.

Leung C.L., Green K.J., and Liem R.K. (2002). Trends Cell Biol.; 12: 37-45.

Allen E., Yu Q.C., and Fuchs E. (1996). J. Cell Biol.; 133: 1367-1382.

Wan H., South A.P., and Hart I.R. (2007). Exp. Cell Res.; 313: 2336-2344.

Rickelt S., Winter-Simanowski S., Noffz E., Kuhn C., and Franke W.W. (2009). Int. J. Cancer; 125: 2036-2048.

Jonkman M.F., Pasmooij A.M., Pasmans S.G., Van Den Berg M.P., Ter Horst H.J., Timmer A., and Pas H.H. (2005). Am. J. Hum. Genet.; 77: 653-660.

Kowalczyk A.P., Bornslaeger E.A., Borgwardt J.E., Palka H.L., Dhaliwal A.S., Corcoran C.M., Denning M.F., and Green K.J. (1997). J. Cell Biol.; 139: 773-784.

Dowdy S.M., and Wearden S. (1983). Statistics for Research, John Wiley & Sons, New York.

Hou H.W., Warkiani M.E., Khoo B.L., Li Z.R., Soo R.A., Tan D.S., Lim W.T., Han J., Bhagat A.A., and Lim C.T. (2013). Sci. Rep.; 3: 1259.

Reeves J.R. and Bartlett J.M.S. (2000). Methods in Molecular Medicine; vol. 39, chapter 51, p. 471-483.

Schena M. (2005). Protein microarrays; Jones and Bartlett Learning.

Hamelinck D., Zhou H., Li L., Verweij C., Dillon D., Feng Z., Costa J., and Haab B.B. (2005). Mol. Cell Proteomics; 4(6): 773-84.

Kohler G. and Milstein C. (1975). Nature; 256(5517): 495-7.

Kozbor D., Roder J.C. (1983). Immunology Today; vol. 4: p. 72-79.

Roder J.C., Cole S.P., and Kozbor D. (1986). Methods Enzymol.; 121: 140-167.

Huse W.D., Sastry L., Iverson S.A., Kang A.S., Alting-Mees M., Burton D.R., Benkovic S.J., and Lerner R.A. (1989). Science; 246: 1275-1281.

Weigelt B. and Bissell M.J. (2008). Semin Cancer Biol.; 18(5): 311-321.

Kenny P.A., Lee G.Y., Myers C.A., Neve R.M., Semeiks J.R., Spellman P.T., Lorenz K., Lee E.H., Barcellos-Hoff M.H., Petersen O.W., Gray J.W., and Bissell M.J. (2007). Mol Oncol.; 1(1): 84-96.

Li Q., Chow A.B., and Mattingly R.R. (2010). J Pharmacol Exp Ther ; 332(3): 821-828.

Mitchell P. (2002). Nature Biotech; 20: 225-229.

Haab B.B. (2005). Mol Cell Proteomics; 4(4): 377-83.

Eckel-Passow J.E., Hoering A., Therneau T.M., and Ghobrial I. (2005). Cancer Res.; 65(8): 2985-9.

Kingsmore S.F. (2006). Nat Rev Drug Discov.; 5(4): 310-20.

Chandra H., Reddy P.J., and Srivastava S. (2011). Expert Rev Proteomics; 8(1): 61-79.

Schenk S., Schoenhals G.J., de Souza G., and Mann M. (2008). BMC Med Genomics; 1: 41.

Cao Z., Yende S., Kellum J.A., and Robinson R.A.S. (2013). Int J Proteomics; 2013: 654356.

Imbalzano K.M., Tatarkova I., Imbalzano A.N., and Nickerson J.A. (2009). Cancer Cell Int.; 9: 7. doi: 10.1186/1475-2867-9-7.

Wisniewski J.R., Zougman A., Nagaraj N., and Mann M. (2009). Nat Methods; 6: 359-362.

Shilov I.V., Seymour S.L., Patel A.A., Loboda A., Tang W.H., Keating S.P., Hunter C.L., Nuwaysir L.M., and Schaeffer D.A. (2007). Mol Cell Proteomics; 6: 1638-1655.

Schwacke J.H., Hill E.G., Krug E.L., Comte-Walters S., and Schey K.L. (2009). BMC Bioinformatics; 10: 342.

Grant J.E., Bradshaw A.D., Schwacke, J.H, Baicu, C.F., Zile, M.R., and Schey, K.L. (2009). J Proteome Res.; 9, 4252-63.

Konishi H., Karakas B., and Abukhdeir A.M. (2007). Cancer Res.; 67: 8460-8467.

Sardanelli F., and Podo F. (2007). Eur Radiol.; 17: 873-87.