

Title:
A PROFILE OF BLOOD PROTEIN MARKERS AS A TEST FOR THE DETECTION OF LUNG CANCER
Document Type and Number:
WIPO Patent Application WO/2015/115922
Kind Code:
A1
Abstract:
The object of the invention is a predictive method for the prediction and/or exclusion of lung cancer which involves the measurement of serum concentrations of the following protein markers: DRG-cyfra21-1, IBL-hCRP, DRG-NSE, NT-CEA, NT-CA125, USCN-SCCA1, h_tPA, S-100, NT-CA19-9. A further object of the invention is a test kit for the detection of lung cancer using the ELISA method. The kit includes, for each protein marker, a reference value given as the upper limit of reference values in the population of healthy individuals, with which the marker's serum concentration is compared, and it enables the determination of the following protein markers: DRG-cyfra21-1, IBL-hCRP, DRG-NSE, NT-CEA, NT-CA125, USCN-SCCA1, h_tPA, S-100, NT-CA19-9.

Inventors:
DZIADZIUSZKO RAFAŁ (PL)
RZYMAN WITOLD (PL)
SZUTOWICZ-ZIELIŃSKA EWA (PL)
JASSEM JACEK (PL)
POLAŃSKA JOANNA (PL)
WIDŁAK PIOTR (PL)
Application Number:
PCT/PL2015/000009
Publication Date:
August 06, 2015
Filing Date:
January 29, 2015
Assignee:
GDAŃSKI UNIV MEDYCZNY (PL)
International Classes:
G01N33/574
Domestic Patent References:
WO2007065463A12007-06-14
WO2011035433A12011-03-31
WO2009074275A12009-06-18
Foreign References:
US20120021929A12012-01-26
Other References:
H ZHANG ET AL: "Selective expression of S100A7 in lung squamous cell carcinomas and large cell carcinomas but not in adenocarcinomas and small cell carcinomas", THORAX, vol. 63, no. 4, 1 April 2008 (2008-04-01), pages 352 - 359, XP055187886, ISSN: 0040-6376, DOI: 10.1136/thx.2007.087015
ZHANG YONG ET AL: "[Value of low-dose spiral computed tomography in lung cancer screening]", ZHONGHUA YIXUE ZAZHI - NATIONAL MEDICAL JOURNAL OF CHINA, BEIJING, CH, vol. 93, no. 38, 15 October 2013 (2013-10-15), pages 3011 - 3014, XP008176239, ISSN: 0376-2491
VICTORIA DOSEEVA ET AL: "Performance of a multiplexed dual analyte immunoassay for the early detection of non-small cell lung cancer", JOURNAL OF TRANSLATIONAL MEDICINE, BIOMED CENTRAL, LONDON, GB, vol. 13, no. 1, 12 February 2015 (2015-02-12), pages 55, XP021211741, ISSN: 1479-5876, DOI: 10.1186/S12967-015-0419-Y
ABERLE DR; ADAMS AM; BERG CD: "National Lung Screening Trial Research Team. Reduced lung-cancer mortality with low-dose computed tomographic screening", N ENGL J MED, vol. 365, 2011, pages 395 - 409
KRZAKOWSKI M.; JASSEM J.; RZYMAN W.: "Nowotwory płuca i opłucnej oraz śródpiersia. W: Zalecenia postępowania diagnostyczno-terapeutycznego w nowotworach złośliwych - 2013 r.", POD REDAKCJĄ KRZAKOWSKI M. I WARZOCHA K., VIA MEDICA, 2013
LAPRUS I; ADAMEK M.; KOZIELSKI J.: "Potrzeba badań przesiewowych w kierunku badań wczesnego wykrywania raka płuca - nowe dowody, nowe nadzieje", PNEUMOL I ALERGOL POL, vol. 79, 2011, pages 419 - 427
ADRIANO MP; SANDRO MP; MATTEO GIAJ-LEVRA ET AL.: "Clinical implications and added costs of incidental findings in an early detection study of lung cancer by using low-dose spiral computed tomography", CLINICAL LUNG CANCER, vol. 14, 2012, pages 139 - 48
TUFMAN A.; RUDOLPH M.H.: "Biological markers in lung cancer: A clinician's perspective", CANCER BIOMARKERS, vol. 6, 2009, pages 123 - 135
KULPA J.; WOJCIK E.; REINFUSS M.: "Carcinoembryonic Antigen, Squamous Cell Carcinoma Antigen, CYFRA 21-1, and Neuron-specific Enolase in Squamous Cell Lung Cancer Patients", CLINICAL CHEMISTRY, vol. 48, 2002, pages 1931 - 7
HENSING T.; SALGIA R: "Molecular biomarkers for future screening of lung cancer", J SURG ONCOL, vol. 108, 2013, pages 327 - 33
CASASPINA T.; ZAPATA T.I.; LOPEZ BJ.: "Tumor markers in lung cancer: does the method of obtaining the cut-off point and reference population influence diagnostic yield?", CLINICAL BIOCHEMISTRY, vol. 32, 1999, pages 467 - 72
ZHANG H.; ZHAO Q.; CHEN Y.: "Selective expression of S100A7 in lung squamous cell carcinomas and large cell carcinomas but not in adenocarcinomas and small cell carcinomas", THORAX, vol. 63, 2008, pages 352 - 9
PATZ E.F.; CAMPA M.J.; GOTTLIN: "Panel serum biomarkers for the diagnosis of lung cancer", JCO, vol. 25, 2007, pages 5578 - 83
"Diagnostic value of SCC, CEA and CYFRA 21.1 in lung cancer: a Bayesian analysis", EUR RESP J, vol. 10, 1997, pages 603 - 9
BUCCHERI G.; TORCHIO P.; FERRIGNO D.: "Clinical equivalence of two cytokeratin markers in non-small cell lung cancer", CHEST, vol. 124, 2003, pages 622 - 32
VOURLEKIS J.S.; SZABO E.: "Use of markers for the detection and treatment of lung cancer", DISEASE MARKERS, vol. 20, 2004, pages 71 - 85
KELOFF G.J.; BOONE C.W.; CROWELL J.A.: "Risk biomarkers and current strategies for cancer chemoprevention", J CELL BIOCHEM, vol. 255, 1996, pages 1 - 14
"Lung cancer biomarkers. Present status and future developments", ARCH PATHOL LAB MED., vol. 137, 2013, pages 1191 - 8
TROYANSKAYA O; CANTOR M; SHERLOCK G; BROWN P; HASTIE T; TIBSHIRANI R; BOTSTEIN D; ALTMAN RB: "Missing value estimation methods for DNA microarrays", BIOINFORMATICS, vol. 17, 2001, pages 520 - 5
Attorney, Agent or Firm:
KANCELARIA PRAWNO-PATENTOWA (ul. Kurierów AK 4A/7, Gdańsk, PL)
Claims:
Patent claims

1. A method for the detection and/or exclusion of lung cancer comprising the determination of serum concentrations of protein markers.

2. A method according to Claim 1, characterised in that it involves the measurement of serum concentrations of the following protein markers: DRG-cyfra21-1, IBL-hCRP, DRG-NSE, NT-CEA, NT-CA125, USCN-SCCA1, h_tPA, S-100, NT-CA19-9.

3. A method according to Claims 1 and 2, characterised in that it involves the measurement of serum concentrations of the following protein markers: S-100, USCN-SCCA1, IBL-hCRP.

4. A method according to Claim 1, characterised in that the lung cancer is an early lung cancer.

5. A method according to Claim 1, characterised in that the biological samples are collected from peripheral blood.

6. A method according to Claims 1 to 5, characterised in that the method is used prior to LDCT.

7. A method according to Claims 1 to 5, characterised in that the method is used following LDCT.

8. The use of the method according to Claims 1 to 5, characterised in that the method is employed for the detection of lung cancer in individuals at high risk of lung cancer.

9. A test kit for the detection of lung cancer using the ELISA method, characterised in that it includes a reference value given as the upper limit of reference values for a given protein marker in the population of healthy individuals, to which reference value the serum concentration of the said protein marker is compared, enabling the determination of the following protein markers: DRG-cyfra21-1, IBL-hCRP, DRG-NSE, NT-CEA, NT-CA125, USCN-SCCA1, h_tPA, S-100, NT-CA19-9.

10. A test kit according to Claim 9, characterised in that it involves the measurement of concentrations of the following protein markers: S-100, USCN-SCCA1, IBL-hCRP.

Description:
A profile of blood protein markers as a test for the detection of lung cancer

Background of the invention

The object of the invention is a signature of protein markers whose concentrations are higher than those defined by relevant normal ranges or higher than the relevant upper cut-off levels, which signature is tested in peripheral blood and serves to predict which individual among individuals at high risk of lung cancer is significantly more likely to have this disease and, at the same time, serves to rule out the presence of lung cancer in individuals with normal concentrations of the protein markers selected to be members of the signature.

Context of the invention

Treatment of cancer is the greatest challenge for public healthcare in industrialised countries worldwide. Lung cancer is the leading cause of death due to cancer and accounted for 18.2% of deaths due to cancer in 2008. This is due to late diagnosis: in 75% of patients the disease is diagnosed at a very advanced stage, when treatment options are limited. As a result, the 5-year survival rate in highly developed countries is only about 15% of all lung cancer patients. According to the National Cancer Registry (Krajowy Rejestr Nowotworów), a total of 20,232 new cases of lung cancer were recorded in Poland in 2006, including 15,157 cases in men and 5,075 cases in women (which accounted for 23.6% and 8.2% of all cancers, respectively). The standardised rates per 100,000 persons in 2003 were 58.5 in men and 15.1 in women. In 2006, a total of 21,731 deaths were recorded, including 16,623 deaths in men and 5,108 deaths in women (respective percentages: 32.1% and 12.8% of all cancers). The standardised mortality rates per 100,000 persons in 2006 were 63.6 in men and 14.5 in women. The constantly increasing number of new cases and the unsatisfactory treatment outcomes are a significant societal problem. Primary prevention, which involves a complete elimination of exposure to the components of tobacco smoke, is critical to reducing mortality due to lung cancer. The results of attempts to implement various forms of primary prevention have, however, been very limited. Secondary prevention, which involves the introduction of screening to identify affected individuals at an early stage of the disease, is the second most effective tool in the fight against lung cancer. So far, no attempt to introduce a method meeting all the requirements of a commonly used screening tool has been successful.
In 2011, the results of the National Lung Screening Trial were published. The study was conducted in more than 53,000 volunteers at high risk of lung cancer and showed a reduction in mortality of more than 20% among those who had undergone screening with low-dose computed tomography (LDCT) compared to the subjects monitored with conventional chest X-ray. The analysis of this study showed that the detectability of lung cancer using LDCT was 2.4% within 3 years after performing three scans in each of the subjects. The study showed that the positive predictive value (PPV) was 1.2% and the negative predictive value (NPV) was 100%. The study was conducted in a group of individuals selected according to the risks defined as age 55-79 years and total exposure of more than 30 pack-years. The relatively high percentage of false positive results renders the introduction of this method as a routine tool for population screening rather questionable, both due to the high cost of detecting each case of cancer and due to the risks associated with performing invasive diagnostic tests in the group of patients diagnosed with a lung tumour. In view of the above, LDCT cannot be used at present as a screening tool for the entire population.

There is therefore a need to develop an effective, minimally invasive molecular test that would allow the detection of early, preclinical forms of lung cancer. Measurement of a serum biomarker meets these criteria and could considerably contribute to the improvement of treatment outcomes in lung cancer, as such a test could be used as a screening test in the high-risk group defined above. The biomarkers that are currently being investigated most extensively as potential diagnostic tests for the detection of early lung cancer include circulating protein antibodies, microRNA (miRNA), and the proteomic profile as a separate or a multi-component peptide panel.

Summary of the invention

The object of the invention is a new test that allows one to determine who in a group of individuals at high risk of lung cancer is most likely to have the disease. The proposed test is employed as an independent method used before the commencement of screening with LDCT. This method involves a combined measurement of levels of selected protein markers used as a multi-component profile. Thanks to its high NPV, the test allows one to rule out lung cancer, with an accuracy of at least 90%, in an individual with a negative result. Conversely, thanks to its relatively high PPV, a positive result indicates the presence of lung cancer with a probability of 32%.

This profile can also be used in high-risk individuals as a predictor indicating individuals who should undergo imaging studies to facilitate the decision regarding their further diagnostic work-up and treatment.

Currently, LDCT is the only lung cancer screening test of proven clinical usefulness. This test is characterised by a very high NPV (100%), which allows one to rule out the cancer in an individual with a negative test result, and by a very low PPV (approx. 1%). This means that the cancer will be confirmed in only 1 in 100 individuals with a positive result. This generates very high costs of detection of one case of the cancer, significant psychological problems for individuals diagnosed with a tumour which subsequently turns out not to be cancerous (false positive results) and the necessity for further follow-up or invasive diagnostic evaluation in individuals with a positive test result. These limitations are a considerable obstacle to the widespread use of LDCT. The method we have invented allows one to considerably narrow down the group of individuals with potential lung cancer.

Detection of a biomarker identifying a lung cancer patient has always been a key task in many areas of laboratory medicine. Proteomic markers may be assessed directly in the serum or plasma. Modern laboratory diagnostics offers several dozen laboratory tests that show various degrees of association with lung cancer in terms of test potency and test specificity. They are easy to perform and do not require expensive apparatus. Also, their costs are low compared to the costs of imaging and endoscopic studies. Their drawbacks as independent markers are, however, their low sensitivity and specificity. As a result, when a single marker is measured, there is always a wide margin of diagnostic uncertainty, which is reflected by the diagnostic efficacy index not exceeding 0.70-0.75 for advanced stages of the disease.

Measurement of nearly all markers is currently possible with the use of ready-made kits that utilise various immunochemical methods. They include specific primary antibodies, mono- and/or polyclonal, that are mainly bound to the solid phase (test tube wall, glass microspheres) and enzyme-labelled detection antibodies that react with substrates yielding reaction products that can be determined using colorimetric, fluorometric or luminometric methods. These assays are available in various formats intended for automatic or manual determinations on measurement platforms (including ELISA). This allows one to adapt these methods to measure a wide range of concentrations of these analytes in the plasma/serum, from 10^-8 to 10^-18 mol/l.

Based on the available literature, 16 protein markers were selected and investigated whose levels, through various independent mechanisms, could be increased in as large a fraction of individuals with lung cancer as possible. Based on a study of 100 patients with lung cancer and 300 healthy individuals, a signature composed of 3 markers was identified for use in detecting lung cancer, with a particular emphasis on early forms of the disease. Of the various groups of tumour markers, the following were selected as potentially useful for the development of a diagnostic algorithm:

"Glycoprotein" antigens: Carcinoembryonic antigen (CEA) and/or CA 125 and/or CA 19-9.

"Cytokeratin" and secretory antigens: CYFRA 21-1 or tissue polypeptide antigen (TPA). Both markers are equally valuable indicators of proliferation rate. Dickkopf-1 (DKK1) is a secretory protein significantly correlated with lung tumours.

"Neuronal" antigens: Neuron-specific enolase (NSE) and/or squamous-cell carcinoma antigen (SCC-Ag), and S100B protein, an antigen specific for squamous-cell and giant-cell lung carcinomas but not for lung adenocarcinomas or small-cell lung carcinomas.

Specific antigens: Progastrin-releasing peptide (ProGRP), a protein specific for small-cell lung carcinoma.

Accompanying antigens: C-reactive protein (CRP), alpha-1 antitrypsin, retinol-binding protein 4 (RBP 4).

In the present invention, we present a signature composed of 9 protein markers whose serum concentrations are elevated in lung cancer patients relative to the defined norms. The signature guarantees a mean NPV value measured by traditional methods in the population of no less than 95%. The serum we tested originated from the participants of the Pomeranian Pilot Programme for Lung Cancer Screening (Pomorski Pilotażowy Program Badań Przesiewowych Raka Płuca). The serum samples had been collected in compliance with a strict protocol for the collection, preparation and storage of samples. The analysis was based on the measurements of serum protein marker levels in 100 individuals diagnosed with early lung cancer and 300 healthy individuals, who comprised the control group. The measurements were performed using sandwich ELISA. The control group was sex- and age-matched and its members had been selected from among 3500 individuals considered healthy when the test was being performed.

The object of the invention is a predictive method for the prediction and/or exclusion of lung cancer, which involves the measurement of serum concentrations of protein markers.

The method mentioned above involves the measurement of serum concentrations of the following protein markers: DRG-cyfra21-1, IBL-hCRP, DRG-NSE, NT-CEA, NT-CA125, USCN-SCCA1, h_tPA, S-100, NT-CA19-9.

A method which involves the measurement of serum concentrations of the following protein markers: S-100, USCN-SCCA1, IBL-hCRP.

A method where lung cancer is early lung cancer.

A method where the biological samples are collected from peripheral blood.

This method is used prior to an LDCT scan.

This method is used following an LDCT scan.

The use of the method mentioned above for the detection of lung cancer in individuals at high risk of lung cancer.

A test kit for the detection of lung cancer using the ELISA method, which includes a reference value given as the upper limit of reference values for a given protein marker in the population of healthy individuals, and to which its serum concentration is compared, enabling the determination of the following protein markers: DRG-cyfra21-1, IBL-hCRP, DRG-NSE, NT-CEA, NT-CA125, USCN-SCCA1, h_tPA, S-100, NT-CA19-9.

A kit that provides concentration measurement of the following protein markers: S-100, USCN-SCCA1, IBL-hCRP.

The terms used above, in the patent description and in the patent claims have the following meanings:

Individual at high risk of lung cancer - An individual aged 50-79 years who has smoked at least 20 pack-years of tobacco.

Pack-years (a pack-year) - A traditional measure of the risk of tobacco-related diseases used in medicine. The number of pack-years is calculated by multiplying the number of packs of cigarettes smoked per day by the number of years of smoking, e.g. 1 pack-year refers to smoking 1 pack of cigarettes (20 cigarettes per pack) a day for 1 year.

Low-dose computed tomography (LDCT) scan of the chest - A CT technique that does not involve intravenous administration of a contrast agent but uses low exposure parameters (a voltage of 120 kVp, a tube current of 40-80 mA) to maximise radiological protection and minimise the absorbed dose of radiation, while preserving the diagnostic value and sensitivity.

ELISA (enzyme-linked immunosorbent assay) - An assay used in biomedical studies, both research and diagnostic. It is used to detect specific proteins in the test material using polyclonal or monoclonal antibodies conjugated with an appropriate marker enzyme.
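The pack-year definition and the high-risk criterion given above can be sketched in a few lines of Python. The function names `pack_years` and `is_high_risk` are illustrative and do not come from the patent; the pack size of 20 cigarettes and the thresholds (age 50-79, at least 20 pack-years) are the ones stated in the definitions.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float, pack_size: int = 20) -> float:
    """Packs smoked per day multiplied by the number of years of smoking."""
    return (cigarettes_per_day / pack_size) * years_smoked


def is_high_risk(age: int, cigarettes_per_day: float, years_smoked: float) -> bool:
    """High-risk criterion from the definitions: age 50-79 and at least 20 pack-years."""
    return 50 <= age <= 79 and pack_years(cigarettes_per_day, years_smoked) >= 20


# One pack (20 cigarettes) a day for one year is one pack-year.
print(pack_years(20, 1))
# A hypothetical 62-year-old who smoked 30 cigarettes a day for 25 years.
print(is_high_risk(62, 30, 25))
```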

Positive predictive value (PPV) - The probability that an individual with a positive test result has the disease. If the individual tests positive, the PPV tells the individual how certain he/she can be that he/she is suffering from a given disease. With TP, FP, TN and FN denoting the numbers of true positive, false positive, true negative and false negative results:

$$\mathrm{PPV} = \frac{TP}{TP + FP} \qquad (1)$$

The confidence interval is constructed based on the Clopper-Pearson method for a single proportion.

Negative predictive value (NPV) - The probability that an individual with a negative test result does not have the disease. If the individual tests negative, the NPV tells the individual how certain he/she can be that he/she is not suffering from a given disease.

$$\mathrm{NPV} = \frac{TN}{TN + FN} \qquad (2)$$

The confidence interval is constructed based on the Clopper-Pearson method for a single proportion.

Positive and negative predictive values depend on the prevalence of the disease (prevalence rate).
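As a minimal sketch of how a predictive value and its Clopper-Pearson interval could be computed, the following standard-library Python inverts the binomial distribution by bisection. The function names and the counts in the usage lines are hypothetical, and the patent does not say which software was used; the direct binomial sum is only suitable for sample sizes up to roughly a thousand.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) confidence interval for the proportion k/n."""

    def solve(k_cut: int, target: float) -> float:
        # binom_cdf(k_cut, n, p) decreases as p grows, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if binom_cdf(k_cut, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower bound: P(X >= k) = alpha / 2; upper bound: P(X <= k) = alpha / 2.
    lower = 0.0 if k == 0 else solve(k - 1, 1.0 - alpha / 2.0)
    upper = 1.0 if k == n else solve(k, alpha / 2.0)
    return lower, upper


def predictive_value(correct: int, total: int, alpha: float = 0.05):
    """Point estimate and exact interval, e.g. PPV with correct=TP, total=TP+FP."""
    lower, upper = clopper_pearson(correct, total, alpha)
    return correct / total, lower, upper


# Hypothetical counts: 190 true negatives among 200 negative test results.
npv, npv_lo, npv_hi = predictive_value(190, 200)
print(f"NPV = {npv:.3f} (95% CI {npv_lo:.3f} to {npv_hi:.3f})")
```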

Multiple random variation (MRV) — Also: Monte Carlo cross-validation. A method for the assessment of the performance of prediction and stability of a signature that involves multiple construction of the classifier discriminant function on the basis of two randomly created data subsets: the training set and the testing set. At each step, selected indicators of classification performance (NPV and PPV in this case) are assessed, and the resulting set forms the basis for the estimation of the interval estimate of the indicator for the population. Subsequent draws of the training set and the testing set are independent from the previous draws and their structure (the percentages of sick individuals and healthy individuals) reflects the structure of the baseline data set. The percentage ratio p characterises the ratio of the size of the training set to the size of the testing set in each iteration.

Receiver operating characteristic (ROC) curve - A tool for the assessment of the performance of a classifier; it provides a combined description of the classifier's sensitivity and specificity. This decision-support method is widely used in many applications, including medical diagnostics.

Sensitivity (Sens) - Also: true positive rate. An indicator of classification performance that defines the proportion of positive results in the group of sick individuals.

$$\mathrm{Sens} = \frac{TP}{TP + FN} \qquad (3)$$

Specificity (SPC) — Also: true negative rate. An indicator of classification performance that defines the proportion of negative results in the group of healthy individuals.

$$\mathrm{SPC} = \frac{TN}{TN + FP} \qquad (4)$$
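Sensitivity and specificity do not depend on how common the disease is, whereas the predictive values do. The sketch below computes the two rates from counts and then applies Bayes' rule to obtain PPV and NPV at a chosen prevalence. All function names and numbers are illustrative assumptions, not values from the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


def ppv_at_prevalence(sens: float, spec: float, prev: float) -> float:
    """Bayes' rule: P(disease | positive test) at disease prevalence `prev`."""
    return sens * prev / (sens * prev + (1.0 - spec) * (1.0 - prev))


def npv_at_prevalence(sens: float, spec: float, prev: float) -> float:
    """Bayes' rule: P(no disease | negative test) at disease prevalence `prev`."""
    return spec * (1.0 - prev) / (spec * (1.0 - prev) + (1.0 - sens) * prev)


# The same hypothetical test looks very different in a case-control sample
# (prevalence 25%) and in a screening population (prevalence 2%).
sens, spec = sensitivity(80, 20), specificity(270, 30)
for prev in (0.25, 0.02):
    print(prev, round(ppv_at_prevalence(sens, spec, prev), 3),
          round(npv_at_prevalence(sens, spec, prev), 3))
```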

Area under curve (AUC) - The value of the area under the ROC curve. It equals the probability that the classifier assigns a higher score to a randomly selected case from the group with the disease than to a randomly selected case from the group known to be free of it. The AUC summarises detection performance across the whole range of the system's operation. An AUC of 0.5 corresponds to random classification, and an AUC of 1.0 to an ideal classifier. A curve running closer to the upper left corner therefore represents a higher diagnostic accuracy.
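The probabilistic reading of the AUC can be checked directly by comparing every sick/healthy pair of scores, which is equivalent to a normalised Mann-Whitney U statistic. This is a small illustrative sketch; the function name `auc` and the scores are made up, and ties are counted as half a win by the usual convention.

```python
def auc(sick_scores, healthy_scores):
    """Fraction of (sick, healthy) pairs in which the sick case scores higher.

    Ties count as half a win, so a classifier that cannot separate
    the two groups at all obtains 0.5.
    """
    wins = 0.0
    for s in sick_scores:
        for h in healthy_scores:
            if s > h:
                wins += 1.0
            elif s == h:
                wins += 0.5
    return wins / (len(sick_scores) * len(healthy_scores))


# Hypothetical discriminant-function values for three sick and four healthy individuals.
print(auc([0.40, 0.15, 0.08], [0.05, 0.07, 0.10, 0.03]))
```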

Prediction method— A method that enables a rational, scientific prediction of the occurrence of an event. It is also a method for the prediction of the current status of a system, i.e. a method for the determination of the risk of the presence of an event and/or for ruling out the presence of an event.

Screening test— A type of strategic test, which is conducted among individuals who do not have the symptoms of a specific disease in order to detect the disease, provide early treatment or prevent serious consequences of the disease in future. Screening tests are performed in the entire population or in the so-called high-risk groups. Screening tests are aimed at detecting a specific disease in its early phase, thanks to which early intervention is possible.

Early lung cancer - Asymptomatic lung cancer.

Signature of protein markers - A unique set (combination) of protein markers whose concentrations are elevated and as such determine the diagnosis of lung cancer.

Description of the figures:

Fig. 1 Mean PPV and NPV values for a 9-item signature obtained using the logistic regression method combined with the MRV cross-validation technique, according to the threshold value (thr). The point with the suggested threshold value (thr) of 0.101 is marked red.

Fig. 2 An ROC curve for a 9-item signature of proteins constructed on the basis of estimations of the classifier's sensitivity and specificity carried out using the MRV cross-validation technique. The point with the suggested threshold value (thr) of 0.101 is marked red.

Fig. 3 Mean PPV and NPV values for a 3-item signature obtained using the logistic regression method combined with the forward feature selection algorithm, according to the threshold value (thr). The point with the suggested threshold value (thr) of 0.102 is marked red.

Fig. 4 An ROC curve for a 3-item signature of proteins constructed on the basis of estimations of the classifier's sensitivity and specificity carried out using the MRV cross-validation technique. The point with the suggested threshold value (thr) of 0.102 is marked red.

The present invention is illustrated by the following examples of execution, which are not a limitation of the present invention in any way:

Example 1:

A 10-ml sample of peripheral blood is collected into a white-top tube from an individual at high risk of lung cancer and incubated for 30 minutes at room temperature. The blood is centrifuged at 1000 g for 10 minutes at 18-20 degrees Celsius. Six 500 µl aliquots of serum are collected into cryogenic tubes and frozen at -80 degrees Celsius. The concentrations of 16 proteins are measured using ELISA. The expression value of each of these proteins forms the basis for the calculation of the discriminant function in the model for predicting the occurrence of lung cancer. A value of the discriminant function exceeding the threshold value thr forms the basis for qualifying an individual to the group at high risk of lung cancer.

The following statistical analysis was used to create a signature of high risk of lung cancer. The discriminant function in the prediction model obtained using logistic regression is as follows:

$$p(z) = \frac{1}{1 + e^{-z}} \qquad (5)$$

where p(z) is the value of the discriminant function, and the value of the argument z is calculated as a linear combination of the relative values of expression levels of n selected proteins and is given by the following formula:

$$z = \beta_0 + \sum_{i=1}^{n} \beta_i x_i \qquad (6)$$

where x_i is the relative expression level of the i-th protein and β_i is its coefficient.

The maximum likelihood (ML) method was used to estimate the values of β_i, while the MRV cross-validation technique was the basis for the selection of features and estimation of the NPV and PPV values. The resulting signature consists of 9 proteins showing expression levels that allow one to differentiate between sick and healthy individuals. Their list and their contributions β_i are provided in Table 1. The mean values of NPV, PPV, sensitivity (Sens), specificity (SPC) and AUC for three selected cut-off thresholds thr are provided in Table 2. Fig. 1 illustrates the dependence of the mean NPV and PPV values on the cut-off threshold thr, while Fig. 2 shows the ROC curve.
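The way a logistic discriminant function of this kind turns marker levels into a decision can be sketched as follows. The coefficients, the intercept `beta0` (whether the fitted model includes one is an assumption here), the input values and the function names are all hypothetical; only the logistic form and the comparison against a threshold thr follow the description.

```python
import math


def discriminant(x, beta0, betas):
    """Logistic function of a linear combination of marker levels."""
    z = beta0 + sum(b * v for b, v in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))


def exceeds_threshold(x, beta0, betas, thr):
    """True when the discriminant value is above the cut-off thr."""
    return discriminant(x, beta0, betas) > thr


# Hypothetical three-marker model and one hypothetical serum profile.
beta0, betas = -2.0, [0.2, -1.1, 0.004]
profile = [3.5, 0.1, 40.0]
print(round(discriminant(profile, beta0, betas), 4),
      exceeds_threshold(profile, beta0, betas, thr=0.1))
```

A low threshold such as the one used in the figures favours ruling the disease out: almost every individual with a non-trivial discriminant value is sent for follow-up, which keeps NPV high at the expense of PPV.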

Table 1. Proteins that make up the signature obtained using the logistic regression method with MRV cross-validation, and the related values of β.

miRNA or protein | Estimation of β
IBL-hCRP | 0.22677
DRG-NSE | -0.00153
NT-CEA | 0.00335
h_tPA | 0.00043
S-100 | -1.13229
NT-CA19-9 | 0.00391

Table 2. Estimations of the values of PPV, NPV, sensitivity, specificity and AUC for a 9-item signature of proteins depending on the adopted threshold value thr and the method of error assessment.

Assessment of concentrations of the protein markers selected to be included in the signature in individuals at high risk of lung cancer. Individuals found to have elevated concentrations of all the markers are qualified for follow-up assessment using low-dose computed tomography.

Example 2:

Collection of blood samples and determination of the concentrations of individual proteins are performed as described in Example 1. The discriminant function described by Equations (5) and (6) also remains unchanged. What is modified compared to Example 1 is the method for the selection of proteins: the classifier was constructed using logistic regression in combination with forward feature selection (FS) and with the Bayesian information criterion (BIC) for model selection. The use of backward feature elimination (BE) yielded the same results. The final signature is composed of 3 proteins (which are a subset of the initial set of 9 features given in Table 6), and the obtained estimations of PPV and NPV still fall within the intervals that meet the criteria for diagnostic and predictive efficacy.
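Forward feature selection guided by the BIC can be outlined generically. In the sketch below, `score` is a hypothetical callable that fits a model on a feature subset (including the empty, intercept-only subset) and returns its BIC; the fitting itself is left out, and the toy scoring function in the usage lines only stands in for it.

```python
import math


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def forward_select(candidates, score):
    """Greedy forward selection: add the feature that lowers the score most,
    and stop as soon as no remaining feature improves it."""
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        trial_best, feature = min((score(selected + [f]), f) for f in remaining)
        if trial_best >= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = trial_best
    return selected


def toy_score(subset):
    # Stand-in for "fit logistic regression on subset and return its BIC":
    # markers A and B are informative, C only adds a complexity penalty.
    return 100 - 10 * ("A" in subset) - 5 * ("B" in subset) + 2 * len(subset)


print(forward_select(["A", "B", "C"], toy_score))
```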

The list of proteins making up the signature and their contributions β_i are provided in Table 3. Table 4 provides the mean values of NPV, PPV, sensitivity (Sens), specificity (SPC) and AUC for three selected cut-off thresholds thr. Fig. 3 illustrates the dependence of the mean NPV and PPV values on the cut-off threshold thr, while Fig. 4 shows the ROC curve.

Table 3. Proteins that make up the signature obtained using the logistic regression method with forward feature selection (FS), and the related values of β.

Table 4. Estimations of the values of PPV, NPV, sensitivity, specificity and AUC for a 3-item signature of proteins depending on the adopted threshold value thr and the method of error assessment.

Assessment of concentrations of the 3 protein markers selected to be included in the signature in individuals in whom a nodule has been discovered during a low-dose computed tomography scan. Individuals found to have elevated concentrations of all the markers are qualified for invasive diagnostics.

Analytical procedures— A description of the ELISA method.

1. Serum levels of all the potential predictive and diagnostic markers were determined using ELISA microplates (96-well plates) on the automated measurement platform ETI-MAX 3000 (DiaSorin, Saluggia, Italy). The principle of determination for each parameter was similar.

2. It involved the binding of the analyte present in the tested serum sample to a specific primary monoclonal antibody adsorbed on the surfaces of the microplate wells. After completion of binding, the unbound proteins were removed by washing. A second antibody, a horseradish peroxidase-labelled polyclonal antibody to human serum proteins, was then added. After this antibody had bound to the captured analyte, the samples were washed and a tetramethylbenzidine solution (the peroxidase substrate) was added; tetramethylbenzidine underwent oxidation, yielding a coloured product. Incubation was stopped after 30-60 minutes using H2SO4 and the absorbance value in individual wells was recorded. The concentration of the ligand was calculated automatically from the calibration graph obtained for each of the 96-well microplates. The absorbance value was proportional to the content of the analyte in the test sample. All the measurements were carried out in duplicate.
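Reading a concentration off a calibration graph can be illustrated with simple piecewise-linear interpolation between calibrator points, followed by averaging of the duplicate wells. This is only a sketch under that assumption: the platform's actual curve-fitting model is not described in the text, and the calibrator values and function names below are hypothetical.

```python
def concentration_from_absorbance(absorbance, standards):
    """Interpolate linearly between (concentration, absorbance) calibrators."""
    points = sorted(standards, key=lambda s: s[1])  # order by absorbance
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            if a1 == a0:
                return (c0 + c1) / 2.0
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
    raise ValueError("absorbance outside the calibrated range")


def duplicate_mean(well_a, well_b, standards):
    """Mean concentration of two wells measured for the same sample."""
    return (concentration_from_absorbance(well_a, standards)
            + concentration_from_absorbance(well_b, standards)) / 2.0


# Hypothetical calibrators: (concentration, absorbance) pairs for one plate.
calibrators = [(0.0, 0.05), (10.0, 0.55), (20.0, 1.05)]
print(duplicate_mean(0.30, 0.32, calibrators))
```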

Table 5. Protein markers used in the study determining a signature of the presence of lung cancer

# | Name | Code | Manufacturer
1 | CRP | IBL-EU5931 | IBL International, Hamburg, Germany
2 | CRP high sensitive ELISA | IBL-EU5951 | IBL International, Hamburg, Germany
3 | CEA | NT-DNOV060 | NovaTec Immunodiagnostica, Dietzenbach, Germany
4 | CA 125 | NT-DNOV061 | NovaTec Immunodiagnostica, Dietzenbach, Germany
5 | CA 19-9 | NT-DNOV063 | NovaTec Immunodiagnostica, Dietzenbach, Germany
6 | human t-PA ELISA | BS-BMS258/2 | Bender MedSystems, Vienna, Austria
7 | ELISA Kit for Human Cytokeratin Fragment Antigen 21-1 (CYFRA21-1) 96T | DRG-EIA3943 | DRG Instruments, Marburg, Germany
8 | DKK-1 ELISA | BI-20412 | Biomedica Medizinprodukte, Wien, Austria
9 | ELISA Kit for Human Squamous Cell Carcinoma Antigen 1 (SCCA1) 96T | USCN-E1372HU | USCN Life Science Inc., Wuhan, China
10 | ELISA Kit for Human Squamous Cell Carcinoma Antigen 2 (SCCA2) 96T | USCN-E0159HU | USCN Life Science Inc., Wuhan, China
11 | NSE | DRG-EIA2353 | DRG Instruments, Marburg, Germany
12 | ELISA Kit for Human Antitrypsin Alpha 1 (a1AT) 96T | USCN-E1697HU | USCN Life Science Inc., Wuhan, China
12 | ELISA Kit for Human Antitrypsin Alpha 1 (a1AT) 96T | K6752 | Immunodiagnostik AG, Bensheim, Germany
13 | SAA (Human) ELISA kit | ABN-KA0518 | Abnova, Taipei, Taiwan
14 | RBP4 (Human) ELISA kit | ABN-KA0499 | Abnova, Taipei, Taiwan
15 | Sangtec 100 ELISA | IS-364.701 | DiaSorin, Minnesota, USA
16 | ELISA Kit for Human Pro Gastrin Releasing Peptide (pro-GRP) 96T | USCN-E1186HU | USCN Life Science Inc., Wuhan, China
17 | ELISA Kit for Human Tissue Polypeptide Specific Antigen (TPS) 96T | USCN-E1281HU | USCN Life Science Inc., Wuhan, China

D) Statistical analysis of data

The k-nearest neighbours algorithm (k = 10) was used to impute missing expression levels. Each missing value was replaced by the median value of the nearest proteins, with nearness measured by the Euclidean norm (Troyanskaya et al. 2001).
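A minimal sketch of this kind of imputation is given below. It assumes a matrix with proteins as rows and samples as columns, with None marking a missing value; the function name, the data layout and the normalisation of the distance by the number of shared samples are assumptions of the sketch and are not taken from the text.

```python
from math import sqrt
from statistics import median

def knn_impute(matrix, k=10):
    """Fill missing values with the median over the k nearest proteins.

    matrix[i][j] is the level of protein i in sample j; None marks a gap.
    """
    def distance(row_a, row_b):
        shared = [(a, b) for a, b in zip(row_a, row_b)
                  if a is not None and b is not None]
        if not shared:
            return float("inf")
        # Euclidean distance over shared samples, divided by their number
        # (an assumption) so rows with different overlaps stay comparable.
        return sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))

    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Only proteins that were measured in sample j can donate a value.
            candidates = [(distance(row, other), other[j])
                          for m, other in enumerate(matrix)
                          if m != i and other[j] is not None]
            candidates.sort(key=lambda c: c[0])
            neighbours = [v for _, v in candidates[:k]]
            if neighbours:
                filled[i][j] = median(neighbours)
    return filled

# Hypothetical toy data: 4 proteins x 3 samples, one missing value.
data = [
    [1.0, 2.0, None],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.2],
    [5.0, 6.0, 9.0],
]
imputed = knn_impute(data, k=2)
```

With k = 2 the gap in the first row is filled from the two closest rows, so the distant fourth row does not influence the result.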

The statistical method of logistic regression was used for the construction of the classifier.

A preliminary ordering of features from the most to the least significant ones was carried out using the modified Mann-Whitney rank statistic U.
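The ordering step can be illustrated with the standard Mann-Whitney U statistic. The modification applied in the study is not described in the text, so the unmodified statistic is used here as a stand-in; the marker names and values are hypothetical.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs (a, b) with a > b, ties counted as 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in group_a for b in group_b)

def rank_features(cases, controls):
    """Order features from the most to the least separating.

    cases and controls map a feature name to its list of measured values.
    """
    scores = {}
    for name in cases:
        n_pairs = len(cases[name]) * len(controls[name])
        u = mann_whitney_u(cases[name], controls[name])
        # |U / n_pairs - 0.5| is 0 for no separation, 0.5 for perfect separation.
        scores[name] = abs(u / n_pairs - 0.5)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data for two markers.
cases = {"marker_a": [5, 6, 7], "marker_b": [1, 4, 2]}
controls = {"marker_a": [1, 2, 3], "marker_b": [2, 3, 1]}
ordering = rank_features(cases, controls)
```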

The Monte Carlo cross-validation (MRV) method was used to select the molecular signature. The dataset was divided into a training subset and a testing subset in the proportion p = 0.5. For each partial model, N = 500 independent draws were performed, and the levels of NPV and PPV were assessed from the classification results.
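The repeated random-split procedure can be sketched as follows for a single-feature logistic classifier. This is an illustration under stated assumptions, not the implementation used in the study: the gradient-descent fit, the fixed threshold of 0.5, the function names and the toy data are all hypothetical.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, steps=500, rate=0.1):
    """Single-feature logistic regression fitted by batch gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= rate * grad_w / n
        b -= rate * grad_b / n
    return w, b

def npv_ppv(scores, labels, thr):
    """NPV and PPV when a score >= thr is called positive (nan if undefined)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
    npv = tn / (tn + fn) if tn + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return npv, ppv

def monte_carlo_cv(xs, ys, p=0.5, n_draws=500, thr=0.5, seed=0):
    """Mean NPV and PPV over n_draws random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_train = int(round(p * len(xs)))
    npvs, ppvs = [], []
    for _ in range(n_draws):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        w, b = fit_logistic([xs[i] for i in train], [ys[i] for i in train])
        scores = [sigmoid(w * xs[i] + b) for i in test]
        npv, ppv = npv_ppv(scores, [ys[i] for i in test], thr)
        if not math.isnan(npv):
            npvs.append(npv)
        if not math.isnan(ppv):
            ppvs.append(ppv)
    mean = lambda v: sum(v) / len(v) if v else float("nan")
    return mean(npvs), mean(ppvs)

# Hypothetical, well-separated toy data: 10 controls (0) and 10 cases (1).
xs = [-2.0 - 0.1 * i for i in range(10)] + [2.0 + 0.1 * i for i in range(10)]
ys = [0] * 10 + [1] * 10
mean_npv, mean_ppv = monte_carlo_cv(xs, ys, n_draws=100)
```

The example call uses 100 draws only to keep the run short; the defaults follow the values p = 0.5 and N = 500 given in the text.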

The final signature is a set of proteins together with the threshold value thr of the logistic discriminant function, chosen so as to maximise NPV subject to the constraint PPV > 30%.
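The constrained choice of thr can be sketched as a search over candidate thresholds. The sketch assumes that a score at or above thr is classified as positive and that ties in NPV are broken in favour of the higher PPV; both conventions, the function name and the toy scores are assumptions and are not taken from the text.

```python
def select_threshold(scores, labels, min_ppv=0.30):
    """Return (thr, npv, ppv) maximising NPV among thresholds with PPV > min_ppv.

    scores: classifier outputs; labels: 1 for cancer, 0 for no cancer.
    Returns None when no threshold satisfies the PPV constraint.
    """
    best = None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
        if tp + fp == 0 or tn + fn == 0:
            continue  # PPV or NPV is undefined at this threshold
        ppv = tp / (tp + fp)
        npv = tn / (tn + fn)
        # Ties in NPV are broken by the higher PPV (an assumption).
        if ppv > min_ppv and (best is None or (npv, ppv) > (best[1], best[2])):
            best = (thr, npv, ppv)
    return best

# Hypothetical classifier outputs and true labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
best = select_threshold(scores, labels)
```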

E) Results

In the present invention, we present a signature composed of 9 proteins that show altered expression in serum; they are listed in Table 6. The estimated mean NPV in the population for the logistic classifier created on the basis of each individual protein from this list is at least 75% (Tables 6 and 7). The use of all 9 features improves classification performance to an NPV of 95.65% for the traditional method and 94.09% for the Monte Carlo multiple validation method.

Table 6. List of proteins that make up the signature.

Table 7. Estimates (obtained using the traditional method) of the mean values of Sens, SPC, PPV and NPV for the population with respect to the logistic classifier created on the basis of each individual protein from the list included in Table 5, along with the individually selected threshold values thr.

Table 8. Estimates (obtained using the Monte Carlo multivariate cross-validation technique, MRV) of the mean values of Sens, SPC, PPV and NPV for the population with respect to the logistic classifier created on the basis of each individual protein from the list included in Table 5, along with the individually selected threshold values thr.

Aberle DR, Adams AM, Berg CD, et al. National Lung Screening Trial Research Team. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 2011, 365, 395-409.

Krzakowski M., Jassem J., Rzyman W. et al. Nowotwory płuca i opłucnej oraz śródpiersia. In: Zalecenia postępowania diagnostyczno-terapeutycznego w nowotworach złośliwych - 2013 r. Eds. Krzakowski M., Warzocha K. Via Medica 2013, Gdańsk.

Laprus I., Adamek M., Kozielski J. "Potrzeba badań przesiewowych w kierunku badań wczesnego wykrywania raka płuca - nowe dowody, nowe nadzieje". Pneumol i Alergol Pol 2011, 79, 419-427.

Adriano MP, Sandro MP, Matteo Giaj-Levra, et al. Clinical implications and added costs of incidental findings in an early detection study of lung cancer by using low-dose spiral computed tomography. Clinical Lung Cancer 2012, 14, 139-48.

Tufman A., Rudolph M.H. Biological markers in lung cancer: A clinician's perspective. Cancer Biomarkers 2009, 6, 123-135.

Kulpa J., Wojcik E., Reinfuss M. et al. Carcinoembryonic Antigen, Squamous Cell Carcinoma Antigen, CYFRA 21-1, and Neuron-specific Enolase in Squamous Cell Lung Cancer Patients. Clinical Chemistry 2002, 48, 1931-7.

Hensing T., Salgia R. Molecular biomarkers for future screening of lung cancer. J Surg Oncol 2013, 108, 327-33.

Casas Pina T., Zapata T.I., Lopez B.J. et al. Tumor markers in lung cancer: does the method of obtaining the cut-off point and reference population influence diagnostic yield? Clinical Biochemistry 1999, 32, 467-72.

Zhang H., Zhao Q., Chen Y. et al. Selective expression of S100A7 in lung squamous cell carcinomas and large cell carcinomas but not in adenocarcinomas and small cell carcinomas. Thorax 2008, 63, 352-9.

Patz E.F., Campa M.J., Gottlin E. et al. Panel of serum biomarkers for the diagnosis of lung cancer. JCO 2007, 25, 5578-83.

Diagnostic value of SCC, CEA and CYFRA 21.1 in lung cancer: a Bayesian analysis. Eur Resp J 1997, 10, 603-9.

Buccheri G., Torchio P., Ferrigno D. Clinical equivalence of two cytokeratin markers in non-small cell lung cancer. Chest 2003, 124, 622-32.

Vourlekis J.S., Szabo E. Use of markers for the detection and treatment of lung cancer. Disease Markers 2004, 20, 71-85.

Keloff G.J., Boone C.W., Crowell J.A. et al. Risk biomarkers and current strategies for cancer chemoprevention. J Cell Biochem 1996, 255, 1-14.

Lung cancer biomarkers. Present status and future developments. Arch Pathol Lab Med 2013, 137, 1191-8.

Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, Altman RB. Missing value estimation methods for DNA microarrays. Bioinformatics 2001, 17,