Title:
METHODS FOR THE SCREENING OF TUMOR CELLS
Document Type and Number:
WIPO Patent Application WO/2012/017062
Kind Code:
A2
Abstract:
The present invention relates to a method for the screening of one or more cells derived from a subject for the presence of a particular genotype, comprising: identifying in the one or more cells one or more chromosomal regions, each chromosomal region comprising at least two candidate genetic markers; subjecting the one or more cells to nucleic acid amplification; determining in the nucleic acid amplification products obtained in the preceding step for each of the candidate genetic markers in each chromosomal region at least one parameter being indicative for the allele status of the marker; selecting those chromosomal regions in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent results in the preceding step; determining for each of the candidate genetic markers in each chromosomal region selected in the preceding step at least one parameter being indicative for the allele copy number of the candidate genetic marker; and selecting those chromosomal regions in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent results in the preceding step, wherein any one or more of the chromosomal regions selected in the preceding step is/are indicative for the presence of a particular genotype. The present invention further relates to a corresponding method for diagnosing and/or monitoring in a sample the presence of a tumor or a predisposition to develop a tumor, comprising performing such screening method as well as to the use of such screening method for diagnosing and/or monitoring the presence of a tumor, the presence of a particular tumor type, the presence of a particular tumor stage, the predisposition to develop a tumor, and response to tumor therapy.

Inventors:
HOFFMANN EVA-MARIA (AT)
GEIGL JOCHEN (AT)
SCHWARZBRAUN THOMAS (AT)
SPEICHER MICHAEL (AT)
ULZ PETER (AT)
HEITZER ELLEN (AT)
Application Number:
PCT/EP2011/063499
Publication Date:
February 09, 2012
Filing Date:
August 05, 2011
Assignee:
UNIV GRAZ MED (AT)
HOFFMANN EVA-MARIA (AT)
GEIGL JOCHEN (AT)
SCHWARZBRAUN THOMAS (AT)
SPEICHER MICHAEL (AT)
ULZ PETER (AT)
HEITZER ELLEN (AT)
International Classes:
C12Q1/68
Other References:
PANTEL, K. ET AL., NAT. REV. CANCER, vol. 8, 2008, pages 329 - 340
ALLARD, W.J. ET AL., CLIN. CANCER RES., vol. 10, 2004, pages 6897 - 6904
NAGRATH, S. ET AL., NATURE, vol. 450, 2007, pages 1235 - 1239
SAMBROOK, J., RUSSEL, D.W.: "Molecular cloning: A laboratory manual", 2001, COLD SPRING HARBOR LABORATORY PRESS
AUSUBEL, F.M. ET AL.: "Current Protocols in Molecular Biology", 2001, WILEY & SONS
PERTL, B. ET AL., MOL. HUM. REPROD., vol. 5, 1999, pages 1176 - 1179
SCHOUTEN, B. ET AL., NUCL. ACIDS RES., vol. 30, 2002, pages E57
ZHANG, L. ET AL., PROC. NATL. ACAD. SCI. USA, vol. 89, 1992, pages 5847 - 5851
DEAN, F.B. ET AL., PROC. NATL. ACAD. SCI. USA, vol. 99, 2002, pages 5261 - 5266
LUDECKE, H.J. ET AL., NATURE, vol. 338, 1989, pages 348 - 350
TELENIUS, H. ET AL., GENOMICS, vol. 13, 1992, pages 718 - 725
MANN, K. ET AL., LANCET, vol. 358, 2001, pages 1057 - 1061
Attorney, Agent or Firm:
DILG, HAEUSLER, SCHINDELMANN Patentanwaltsgesellschaft mbH (Munich, DE)
Claims:
CLAIMS

1. Method for the screening of one or more cells derived from a subject for the presence of a particular genotype, comprising:

(a) identifying in the one or more cells one or more chromosomal regions, each chromosomal region comprising at least two candidate genetic markers;

(b) subjecting the one or more cells to nucleic acid amplification;

(c) determining in the nucleic acid amplification products obtained in step (b) for each of the candidate genetic markers in each chromosomal region at least one parameter being indicative for the allele status of the marker;

(d) selecting those chromosomal regions in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent results in step (c);

(e) determining for each of the candidate genetic markers in each chromosomal region selected in step (d) at least one parameter being indicative for the allele copy number of the candidate genetic marker; and

(f) selecting those chromosomal regions in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent results in step (e);

wherein any one or more of the chromosomal regions selected in step (f) is/are indicative for the particular genotype screened for.

2. The method of claim 1, wherein the method is performed using only one cell.

3. The method of claim 1 or 2, wherein the one or more cells are tumor cells, preferably selected from the group consisting of disseminated tumor cells and circulating tumor cells.

4. The method of claim 3, wherein the genotype is indicative for any one or more phenotype in the subject that is selected from the group consisting of the presence of a tumor, the presence of a particular tumor type, the presence of a particular tumor stage, the predisposition to develop a tumor, and response to tumor therapy.

5. The method of any one of claims 1 to 4, wherein the one or more chromosomal regions have a length of at least 3 Mb.

6. The method of any one of claims 1 to 5, further comprising:

comparing the results obtained in any one or both of steps (c) and (e) with those obtained for a control.

7. The method of claim 6, wherein the control is a non-amplified nucleic acid sample, preferably derived from the subject to be analyzed.

8. The method of claim 6 or 7, wherein step (c) comprises determining at least one parameter selected from the group consisting of number of alleles and length of the alleles.

9. The method of any one of claims 6 to 8, wherein step (d) further comprises:

selecting those chromosomal regions in which for the majority of the at least two genetic markers comprised in a chromosomal region the results obtained in step (c) are preserved in the one or more cells analyzed and the control.

10. The method of any one of claims 6 to 9, wherein step (f) further comprises:

selecting those chromosomal regions in which for the majority of the at least two genetic markers comprised in a chromosomal region of the one or more cells analyzed the allele copy number is modified as compared to the control.

11. The method of claim 10, wherein the modification of the allele copy number is selected from the group consisting of a chromosomal loss and a chromosomal gain.

12. The method of any one of claims 1 to 11, wherein any one or both of steps (c) and (e) are performed by means of a PCR-based technique, preferably by quantitative fluorescence PCR.

13. The method of any one of claims 1 to 12, wherein the method is performed in a multiplex-format, preferably in a high-throughput format.

14. Method for diagnosing and/or monitoring in a sample derived from a subject the presence of a tumor or a predisposition to develop a tumor, comprising:

(a) screening one or more cells derived from the sample of the subject according to the method as defined in any one of claims 1 to 13 for the presence of a particular genotype that is indicative for the presence of a tumor or a predisposition to develop a tumor; and optionally

(b) determining in the one or more cells obtained in step (a) that exhibit the particular genotype at least one further parameter being indicative for the presence of a tumor or a predisposition to develop a tumor.

15. Use of the methods as defined in any one of claims 1 to 14 for diagnosing and/or monitoring in a sample derived from a subject any one or more selected from the group consisting of the presence of a tumor, the presence of a particular tumor type, the presence of a particular tumor stage, the predisposition to develop a tumor, and response to tumor therapy.

Description:
METHODS FOR THE SCREENING OF TUMOR CELLS

FIELD OF THE INVENTION

The present invention relates to methods for the genetic screening of one or more cells, preferably of tumor cells, for diagnostic and research purposes. In particular, the method provides for a quality control and a measure for the detection of chromosomal copy number changes for assessing the prognostic impact of the cells analyzed.

BACKGROUND

Despite overall improvements in many cancer therapies, there are still many important questions to be answered, for example, why some patients respond to therapy and others do not, and why some patients relapse. Indeed, current prognostic approaches based on tumor staging usually provide little information with regard to the response of individual patients to treatment. In recent years, a tremendous search has been undertaken in order to identify genetic markers that may refine prognostic information and predict the benefit derived from systemic treatment. Although some important genetic biomarkers with predictive and prognostic information have been established, it is still poorly understood how serial monitoring of tumor genotypes, which are prone to changes under selection pressure, can be performed.

In order to tailor a patient-specific treatment regimen, sufficient quantities of tumor samples must be available for serial monitoring of tumor genotypes. However, such quantities are often problematic to obtain. Biopsies can be associated with complications and frequently yield only sparse quantities of cytological material - a major barrier to translating laboratory findings into clinical therapy. Furthermore, the availability of non-invasive approaches appears particularly attractive in order to spare the patients affected unnecessary surgical intervention. One approach to this end relates to the analysis of circulating tumor cells (CTCs) that can be readily isolated from peripheral blood. The detection of CTCs in patients with various forms of solid tumors has spurred multiple efforts to use them clinically (Pantel, K. et al. (2008) Nat. Rev. Cancer 8, 329-340). Indeed, in cancer patients, CTCs should provide snapshots of genomic alterations in primary tumors and metastases at various stages during the course of disease. Ideally, only single CTCs are subjected to analysis in order to avoid any falsification of the results obtained due to the presence of other cells, particularly of non-tumor cells. Furthermore, the real complexity of cell dissemination and metastasis, and especially the heterogeneity of CTCs, can only be understood when considering individual cells.

In order to reliably establish the role of CTCs as potential biomarkers, hundreds or thousands of cells will have to be analyzed in order to obtain reliable results. Such cell numbers in patients with metastatic cancer were repeatedly reported in the literature (cf., e.g., Allard, W.J. et al. (2004) Clin. Cancer Res. 10, 6897-6904; Nagrath, S. et al. (2007) Nature 450, 1235-1239).

However, presently available technologies, such as FISH (fluorescence in situ hybridization) and array-CGH (comparative genomic hybridization) after whole-genome amplification, are not amenable to high-throughput single cell analysis, owing to their cost and their demands on other resources such as personnel and time.

Accordingly, there remains a need for methods for the (pre-)screening of cells, particularly of tumor cells, in order to select those cells having the greatest impact for further analyses. In particular, there is a need for fast and affordable (i.e. cost-saving) methods enabling a reliable screening of cells in multiplex-format, such as a high-throughput format. Moreover, such method should also aid in an improved non-invasive diagnosis, staging, and monitoring of medical conditions, particularly of tumors, or a predisposition to develop a medical condition.

Thus, it is an object of the present invention to provide such methods for the (pre-)screening of cells, particularly of tumor cells.

SUMMARY OF THE INVENTION

In one aspect, the present invention relates to a method for the screening of one or more cells derived from a subject for the presence of a particular genotype, comprising:

(a) identifying in the one or more cells one or more chromosomal regions, each chromosomal region comprising at least two candidate genetic markers;

(b) subjecting the one or more cells to nucleic acid amplification;

(c) determining in the nucleic acid amplification products obtained in step (b) for each of the candidate genetic markers in each chromosomal region at least one parameter being indicative for the allele status of the marker;

(d) selecting those chromosomal regions in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent results in step (c);

(e) determining for each of the candidate genetic markers in each chromosomal region selected in step (d) at least one parameter being indicative for the allele copy number of the candidate genetic marker; and

(f) selecting those chromosomal regions in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent results in step (e);

wherein any one or more of the chromosomal regions selected in step (f) is/are indicative for the particular genotype screened for.
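By way of illustration only, the two-stage selection of steps (d) and (f) may be sketched as follows. The sketch assumes that "majority" means strictly more than half of the markers in a region, and that the consistency of each marker in steps (c) and (e) has already been reduced to a true/false value; neither assumption is prescribed by the method itself. The function names and region labels are hypothetical.

```python
from typing import Dict, List


def majority_consistent(flags: List[bool]) -> bool:
    """True if a region has at least two markers and more than half are consistent.

    Assumption: "majority" is read as strictly more than 50 % of the markers.
    """
    return len(flags) >= 2 and sum(flags) > len(flags) / 2


def select_regions(regions: Dict[str, List[bool]]) -> List[str]:
    """Return the names of all regions passing the majority criterion."""
    return [name for name, flags in regions.items() if majority_consistent(flags)]


def screen(allele_status_ok: Dict[str, List[bool]],
           copy_number_ok: Dict[str, List[bool]]) -> List[str]:
    """Apply step (d) to the allele-status results, then step (f) to the
    copy-number results of only those regions that passed step (d)."""
    passed_d = select_regions(allele_status_ok)                 # step (d)
    candidates = {r: copy_number_ok[r] for r in passed_d if r in copy_number_ok}
    return select_regions(candidates)                           # step (f)


# Hypothetical per-marker consistency flags for three regions.
status = {"region_A": [True, True, False], "region_B": [True, False], "region_C": [True, True]}
copies = {"region_A": [True, True, True], "region_B": [True, True], "region_C": [False, False]}
selected = screen(status, copies)
```

With only two markers in a region, this reading of "majority" requires both markers to agree, which is why "region_B" is already rejected in step (d).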

In a preferred embodiment, the method is performed using only one cell.

In another preferred embodiment, the one or more cells employed in the method are tumor cells, and are particularly preferably selected from the group consisting of disseminated tumor cells and circulating tumor cells.

In a specific embodiment, the genotype for which the one or more cells are screened is indicative for any one or more phenotype in the subject that is selected from the group consisting of the presence of a tumor, the presence of a particular tumor type, the presence of a particular tumor stage, the predisposition to develop a tumor, and response to tumor therapy.

In another specific embodiment, the one or more chromosomal regions have a length of at least 3 Mb.

In another preferred embodiment, the method further comprises comparing the results obtained in any one or both of steps (c) and (e) with those obtained for a control.

Preferably, the control employed is a non-amplified nucleic acid sample, and is particularly preferably derived from the subject to be analyzed.

In a specific embodiment, step (c) comprises determining at least one parameter selected from the group consisting of number of alleles and length of the alleles.

In a further preferred embodiment, step (d) further comprises selecting those chromosomal regions in which for the majority of the at least two genetic markers comprised in a chromosomal region the results obtained in step (c) are preserved in the one or more cells analyzed and the control.

In yet another preferred embodiment, step (f) further comprises selecting those chromosomal regions in which for the majority of the at least two genetic markers comprised in a chromosomal region of the one or more cells analyzed the allele copy number is modified as compared to the control.

Particularly preferably, the modification of the allele copy number is selected from the group consisting of a chromosomal loss and a chromosomal gain.

In another specific embodiment, any one or both of steps (c) and (e) are performed by means of a PCR-based technique, preferably by quantitative fluorescence PCR.

Preferably, the method is performed in a multiplex-format, particularly preferably in a high-throughput format.

In a further aspect, the present invention relates to a method for diagnosing and/or monitoring in a sample derived from a subject the presence of a tumor or a predisposition to develop a tumor, comprising:

(a) screening one or more cells derived from the sample of the subject according to the method as defined herein above for the presence of a particular genotype that is indicative for the presence of a tumor or a predisposition to develop a tumor; and optionally

(b) determining in the one or more cells obtained in step (a) that exhibit the particular genotype at least one further parameter being indicative for the presence of a tumor or a predisposition to develop a tumor.

In yet another aspect, the present invention relates to the use of the methods as defined herein above for diagnosing and/or monitoring in a sample derived from a subject any one or more selected from the group consisting of the presence of a tumor, the presence of a particular tumor type, the presence of a particular tumor stage, the predisposition to develop a tumor, and response to tumor therapy.

Other embodiments of the present invention will become apparent from the detailed description hereinafter.

DESCRIPTION OF THE DRAWINGS

FIGURE 1: Analysis of the sizes of 5 different markers on chromosome 18 (y-axis). Shown are the results of four different single cell amplifications and two amplifications with 10-cell pools (x-axis; single cell amplification products are "JG_1", "JG_1.3", "JG_1.4", and "JG_1.5", the 10-cell pools are "JG_10.4" and "JG_10.5"). The horizontal green bars display the standard deviation for the allele sizes as measured in all experiments together for each marker allele. The vertical green bars are mainly for orientation and indicate the location of the mean size. The black dots show the results obtained with non-amplified DNA, the red dots the sizes obtained with amplified DNA. A black dot to the left of a red dot indicates that the size obtained with the amplified DNA was larger than the size measured with non-amplified DNA. A comparison with the standard deviation allows the easy identification of outliers. Here, an outlier was observed only for marker D18S978 in cell JG_1. Otherwise, all red and black dots are close together. As marker D18S391 is uninformative, there are no dots for the second allele, whereas the second allele of marker D18S499 was not amplified in any experiment.

FIGURE 2: Peak areas for the same 5 different markers on chromosome 18 and the same experiments as in Figure 1 (here, the markers are indicated on the x-axis). The green dot represents the mean of all peak area measurements from the amplification products; the green bar the standard deviation. The black dots show the results obtained in each experiment with non-amplified DNA, the red dots the peak areas obtained with amplified DNA from the various experiments. Thus, the black dots reflect the "real" peak areas, whereas the red dots show the variability from the experiments. Note that marker D18S391 is uninformative and that the second allele of marker D18S499 was not amplified in any experiment.

FIGURE 3: Illustrated is the same representation as in Figure 1 but with peak areas instead of sizes.

FIGURE 4: Ratio values after normalization. The black line shows the normalized values for the non-amplified DNA, for which all values were set to 1. Because marker D18S391 is uninformative (i.e. there is only one allele present), the second allele has the value 0. The colored lines represent the results for each of the aforementioned experiments (e.g. the red line corresponds to the values of the single cell "JG_1"). For the second allele of marker D18S391, all cells again have the value 0. As the second allele of marker D18S499 was not amplified in any experiment, all amplification products (but not the normalized non-amplified DNA) also have the value 0 for this marker.

FIGURE 5: Same data set as in Figure 4 but allele 1 of each marker was duplicated in order to simulate a data set with trisomy.

FIGURE 6: Illustrated are experimental results of an analysis for a first mix of seven different markers.

FIGURE 7: Illustrated are experimental results of an analysis for a second mix of seven different markers.

FIGURE 8: Illustrated are experimental results of an analysis for a third mix of five different markers.
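The normalization underlying Figures 4 and 5 may be illustrated by the following minimal sketch. It assumes that each ratio is the peak area of an allele in the amplification product divided by the peak area of the same allele in the non-amplified DNA, that an absent or non-amplified allele is assigned the value 0, and that the trisomy simulation of Figure 5 corresponds to doubling the peak area of allele 1. These are assumptions made for illustration, and the peak areas shown are invented and are not measured values.

```python
from typing import Dict, Optional, Tuple

# Peak areas per marker as (allele 1, allele 2); None marks an allele that is
# absent (uninformative marker) or was not amplified.
PeakAreas = Dict[str, Tuple[Optional[float], Optional[float]]]


def normalize(amplified: PeakAreas, reference: PeakAreas) -> Dict[str, Tuple[float, ...]]:
    """Ratio of amplified to non-amplified peak area for each allele.

    Assumption: missing alleles (in either data set) yield the value 0.
    """
    ratios = {}
    for marker, ref_alleles in reference.items():
        amp_alleles = amplified.get(marker, (None, None))
        ratios[marker] = tuple(
            0.0 if (ref is None or amp is None or ref == 0) else amp / ref
            for amp, ref in zip(amp_alleles, ref_alleles)
        )
    return ratios


def simulate_trisomy(areas: PeakAreas) -> PeakAreas:
    """Double allele 1 of every marker (assumed reading of 'duplicated')."""
    return {m: (None if a1 is None else 2 * a1, a2) for m, (a1, a2) in areas.items()}


# Invented peak areas; marker names are taken from the figure descriptions.
non_amplified = {"D18S978": (1000.0, 800.0), "D18S391": (1200.0, None), "D18S499": (900.0, 700.0)}
single_cell = {"D18S978": (500.0, 800.0), "D18S391": (600.0, None), "D18S499": (450.0, None)}

cell_ratios = normalize(single_cell, non_amplified)
reference_ratios = normalize(non_amplified, non_amplified)   # 1 where an allele exists
trisomy_ratios = normalize(simulate_trisomy(single_cell), non_amplified)
```

Normalizing the non-amplified DNA against itself reproduces the pattern described for the black line: the value 1 for every allele present and the value 0 for the second allele of the uninformative marker.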

DETAILED DESCRIPTION OF THE INVENTION

The present invention is based on the unexpected finding that by combining the analysis of several parameters based on the allelic status of candidate genetic markers, all of which are per se well established and can be readily determined, and using a specific evaluation algorithm, a rapid (completion within 24-48 h, or even faster) and cost-effective method for the (pre-)screening of cells could be established. This pre-screening provides reliable information about how many cells are suitable for subsequent analysis, that is, represents a suitable means for quality assessment. Furthermore, the method also provides a first insight into the genotype of the one or more cells analyzed, namely the pattern of chromosomal aberrations (i.e. changes of allelic copy numbers), which may aid in subsequent diagnosis and/or monitoring of a medical condition this genotype is indicative for.

The present invention illustratively described in the following may suitably be practiced in the absence of any element or elements, limitation or limitations, not specifically disclosed herein.

Where the term "comprising" is used in the present description and the claims, it does not exclude other elements or steps. For the purposes of the present invention, the term "consisting of" is considered to be a preferred embodiment of the term "comprising". If hereinafter a group is defined to comprise at least a certain number of embodiments, this is also to be understood to disclose a group, which preferably consists only of these embodiments.

Where an indefinite or definite article is used when referring to a singular noun, e.g., "a", "an" or "the", this includes a plural of that noun unless specifically stated otherwise.

Where numerical values are indicated in the context of the present invention, the skilled person will understand that the technical effect of the feature in question is ensured within an interval of accuracy, which typically encompasses a deviation of the numerical value given of ± 10%, and preferably of ± 5%.

Furthermore, the terms first, second, third, (a), (b), (c), and the like, in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.

Further definitions of terms will be given in the following in the context in which the terms are used. The following terms or definitions are provided solely to aid in the understanding of the invention. These definitions should not be construed to have a scope less than understood by a person of ordinary skill in the art.

In one aspect, the present invention relates to a method for the screening of one or more cells derived from a subject for the presence of a particular genotype, comprising:

(a) identifying in the one or more cells one or more chromosomal regions, each chromosomal region comprising at least two candidate genetic markers;

(b) subjecting the one or more cells to nucleic acid amplification;

(c) determining in the nucleic acid amplification products obtained in step (b) for each of the candidate genetic markers in each chromosomal region at least one parameter being indicative for the allele status of the marker;

(d) selecting those chromosomal regions in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent results in step (c);

(e) determining for each of the candidate genetic markers in each chromosomal region selected in step (d) at least one parameter being indicative for the allele copy number of the candidate genetic marker; and

(f) selecting those chromosomal regions in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent results in step (e);

wherein any one or more of the chromosomal regions selected in step (f) is/are indicative for the particular genotype screened for.

In a preferred embodiment, the method is performed using only one cell (i.e. as a single cell analysis).

The one or more cells are derived from a subject to be analyzed by the present method. Typically, the subject is a mammal such as a mouse, rat, hamster, rabbit, cat, dog, pig, cow, horse or monkey. Preferably, the subject to be diagnosed is a human.

Typically, the one or more cells to be employed in the present invention are purified from a sample collected from the subject to be analyzed. The samples may include body tissues (e.g., biopsies or resections, bone marrow samples, placental tissue, and umbilical cord samples) and fluids, such as blood, sputum, cerebrospinal fluid, amniotic fluid, and urine. In some embodiments, the samples represent particular types of cells, such as oocytes (egg cells), adult stem cells, embryonic stem cells, and stem cell precursor cells such as blastomers. The samples used in the method of the present invention should generally be collected in a clinically acceptable manner. The skilled person is well aware of various methods for the purification of cells from a given sample (see, e.g., Sambrook, J., and Russel, D.W. (2001), Molecular cloning: A laboratory manual (3rd Ed.) Cold Spring Harbor, NY, Cold Spring Harbor Laboratory Press; Ausubel, F.M. et al. (2001) Current Protocols in Molecular Biology, Wiley & Sons, Hoboken, NJ, USA). In specific embodiments, the one or more cells employed in the present invention are derived from a blood sample such as whole blood, plasma, and serum. The term "whole blood", as used herein, refers to blood with all its constituents (i.e. both blood cells and plasma). The term "plasma", as used herein, denotes the blood's liquid medium. The term "serum", as used herein, refers to plasma from which the clotting proteins have been removed.

In a further preferred embodiment, the one or more cells (herein also referred to as "test sample") are tumor cells, and are particularly preferably selected from the group consisting of disseminated tumor cells and circulating tumor cells.

The term "tumor" (also commonly referred to as "cancer"), as used herein, denotes any type of malignant neoplasm, that is, any morphological and/or physiological alterations (based on genetic re-programming) of target cells exhibiting characteristics of a tumor as compared to unaffected (healthy) wild-type control cells. Examples of such alterations may relate inter alia to cell size and shape (enlargement or reduction), cell proliferation (increase in cell number), cell differentiation (change in physiological state), apoptosis (programmed cell death) or cell survival. In other words, a tumor is characterized by uncontrolled division of target cells based on genetic re-programming and by the ability of the target cells to spread, either by direct growth into adjacent tissue through invasion, or by implantation into distant sites by metastasis. Examples of tumors include inter alia breast cancer, colorectal cancer, prostate cancer, leukemia, lymphomas, neuroblastoma, glioblastoma, melanoma, liver cancer, and lung cancer.

The term "circulating tumor cells" (CTCs), as used herein, denotes cells that have detached from a primary tumor and circulate in the bloodstream. The term "disseminated tumor cells" (DTCs), as used herein, denotes cells that have detached from a primary tumor and can be derived from bone marrow. CTCs and DTCs may constitute 'seeds' for subsequent growth of metastases in different tissues.

In another embodiment, the one or more cells are derived from amniotic fluid and/or chorionic villi (i.e. placental tissue). In such case, the method of the present invention may be applied for pre-natal diagnosis. However, in the context of pre-natal diagnosis, it may also be possible to screen fetal cells circulating in maternal blood.

In another embodiment, the one or more cells are oocytes (i.e. egg cells, often also referred to as "ova") or, more specifically, the polar bodies isolated from oocytes. The egg cells of all mammals have two polar bodies. Polar bodies are produced during meiosis, contain relatively little cytoplasm and are haploid (i.e. contain 23 chromosomes). By analyzing the polar bodies, it is possible to infer the genetic status of the egg cells. This application is also referred to as pre-fertilization diagnosis.

In another embodiment, the one or more cells are adult stem cells, embryonic stem cells or blastomers. Stem cells are generally characterized by the ability to renew themselves and to differentiate into a diverse range of specialized cell types. Adult stem cells (as well as the more specific oligopotent progenitor cells) are found in adult organisms in a variety of sources such as blood, bone marrow, and umbilical cord and primarily act as a repair system for the body. Adult stem cells are multipotent, that is, they can differentiate into a number of closely related cells (i.e. are lineage-restricted). Embryonic stem cells are derived from the inner cell mass of blastocysts or earlier morula stage embryos. As used herein, the cells of such early stage embryos from which stem cells can be derived are referred to as "blastomers". Embryonic stem cells are pluripotent and can differentiate into almost all cells. The screening of said cells may also be used in the context of pre-fertilization diagnosis as well as for assessing stem cell quality for therapeutic purposes.

The term "genotype", as used herein, refers to the inherited instructions that an organism carries within its genetic code. In other words, the genotype denotes the genetic constitution of a cell or an organism (i.e. the specific allele makeup), usually with reference to a specific character under consideration. Said character is commonly referred to as "phenotype", which denotes any observable feature or trait of a cell or an organism, such as its morphology, developmental status, biochemical or physiological properties, etc. under a particular set of environmental conditions. Phenotypes result from the expression of an organism's genes as well as the influence of environmental factors and the interactions between the two.

The method of the invention may be employed for the screening of any cellular genotype. Preferably, the genotype screened for is indicative for a specific phenotype of the cells analyzed, particularly for the presence of a medical condition such as cancer or a hereditary disease.

In a preferred embodiment, the genotype is indicative for any one or more phenotype in the subject that is selected from the group consisting of the presence of a tumor, the presence of a particular tumor type (i.e. in order to discriminate between different tumors), the presence of a particular tumor stage (i.e. in order to monitor tumor progression), the predisposition to develop a tumor, and response to tumor therapy.

The term "predisposition to develop a tumor", as used herein, denotes any cellular phenotype being indicative for a pre-cancerous state, i.e. an intermediate state in the transformation of a normal cell into a tumor cell. In other words, the term denotes a state of risk of developing a tumor. An example of such pre-cancerous state is a benign adenoma, which may progress into a malignant adenocarcinoma.

In a first step, the method of the invention involves the identification in the (genome of the) one or more cells of one or more chromosomal regions, each chromosomal region comprising at least two candidate genetic markers, that represent suitable markers for the screening process. The identification (or selection) of the chromosomal regions will depend inter alia on the application of the method, the type of cell(s) employed, the parameters to be analyzed, and the like. The skilled person is well aware of methods for identifying one or more such chromosomal regions for a given setting. Typically, the selection is accomplished based on information retrieved from databases or the scientific literature. For example, for multiple tumors ample information is available with regard to the genetic re-programming of the tumor cells as compared to healthy controls (i.e. the occurrence of chromosomal aberrations, gene mutations, etc.). Accordingly, based on this information the skilled person may readily select one or more chromosomal regions that - at least with a reasonable likelihood - have the potential to be indicative for that tumor.

The one or more chromosomal regions selected may have any length such as at least 0.01 Mb (mega base pairs; 1 Mb = 1000000 bp), at least 0.1 Mb, at least 0.5 Mb, at least 1 Mb, at least 1.5 Mb, at least 2 Mb or at least 2.5 Mb. Preferably, the one or more chromosomal regions have a length of at least 3 Mb, such as, e.g., at least 3.2 Mb, at least 3.4 Mb, at least 3.6 Mb, at least 3.8 Mb, at least 4 Mb, at least 4.5 Mb, at least 5 Mb, or even larger.

The term "candidate genetic marker", as used herein, denotes a gene or a DNA sequence with a known location on a chromosome that can be used to identify a cell or an organism (i.e. that - at least with a reasonable likelihood - is characteristic for the genotype of the one or more cells analyzed). Such marker can be described as a variation (which may arise due to mutation or alteration in the genetic locus on the chromosome) that can be observed (for example, by means of an altered sequence or gene expression level). Examples of candidate genetic markers include inter alia single nucleotide polymorphisms, restriction fragment length polymorphisms, short tandem repeats, mini-satellites, and entire (mutated) genes. Preferred markers are repetitive sequences (i.e. iterations of identical sequence motifs directly adjacent to each other) such as short tandem repeats (also referred to as micro-satellites; the repeats are less than 10 bp in length) and mini-satellites (the repeats are 10 bp to more than 100 bp in length). Typically, each of the one or more chromosomal regions comprises at least two candidate genetic markers. In a further preferred embodiment, each chromosomal region comprises at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, at least ten or even more candidate genetic loci. The candidate genetic loci comprised in a chromosomal region may be located in close proximity to each other or may be distributed over the whole length of the chromosomal region. It is also possible that only part of the candidate genetic loci comprised in a chromosomal region are clustered, whereas the remaining ones are located at different positions.

In a further step of the method, the one or more cells (or the genomic DNA and/or RNA purified thereof) are subjected to nucleic acid amplification, which is typically accomplished by a PCR-based technique well known in the art (cf., e.g., Sambrook, J., and Russel, D.W. (2001), supra; Ausubel, F.M. et al. (2001) supra).

In a further step of the method, in the (whole genome) amplification products obtained (comprising highly fragmented nucleic acids) for each of the candidate genetic markers in each chromosomal region at least one parameter being indicative for the allele status of the marker is determined. Preferably, the at least one parameter is selected from the group consisting of the number of alleles (e.g., presence of paternal and maternal alleles, presence of a third allele being indicative for a trisomy, the presence of chromosomal aberrations) and the length (size) of the alleles.

In a specific embodiment, this determination step is performed by means of a PCR-based technique, preferably by quantitative fluorescence PCR (QF-PCR; Pertl, B. et al. (1999) Mol. Hum. Reprod. 5, 1176-1179; see also below). The selection of a particular method may depend inter alia on the at least one parameter analyzed, the type of cells employed, and the genotype to be screened for. The skilled person is well aware how to select a specific technique for a given application.

Subsequently, those chromosomal regions are selected in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent (i.e. identical or at least similar) results in the above analysis. For example, if a chromosomal region comprises five genetic markers it is selected for further analysis provided that at least three of the markers display identical results with respect to the at least one parameter analyzed (e.g., the number of alleles). Such consistent results provide a measure for the quality of the amplification product used, and thus for the usefulness of including such chromosomal regions in further analyses. On the other hand, multiple differences within the results obtained for the genetic markers comprised in a chromosomal region indicate a low quality.
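The majority rule described above can be illustrated with a short Python sketch. The function name `region_is_consistent` and the representation of the marker results as a plain list are assumptions made purely for illustration; the description does not prescribe any data format, and "similar" results are simplified here to "identical" results.

```python
from collections import Counter


def region_is_consistent(marker_results):
    """Return True if more than half of the markers in a region share one result.

    marker_results: per-marker values of one parameter (e.g. number of alleles).
    The list format is a hypothetical choice, not taken from the description.
    """
    if len(marker_results) < 2:
        return False  # a region comprises at least two candidate markers
    # Count how often the most frequent result occurs among the markers.
    most_common_count = Counter(marker_results).most_common(1)[0][1]
    return most_common_count > len(marker_results) / 2


# Five markers, three of which show two alleles: the region is kept.
print(region_is_consistent([2, 2, 2, 1, 3]))
# Five markers without any majority result: the region is not selected.
print(region_is_consistent([2, 2, 1, 1, 3]))
```

For a region with exactly two markers, this rule requires both markers to agree.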

The likelihood to obtain consistent results will depend inter alia on the distance of the genetic markers on the chromosome (i.e. their respective locations to each other). The smaller the distance, the higher is the likelihood to obtain consistent results.

In another preferred embodiment, the method further comprises comparing the results obtained for the "test sample" with those obtained for a control.

Preferably, the control employed is a non-amplified nucleic acid sample (e.g., genomic cellular DNA not subjected to any amplification step), and is particularly preferably derived from the subject to be analyzed (that is, test sample and control are derived from the same subject). However, within the present invention, the term "control" also refers to reference values derived from databases or published in the scientific literature.

The term "comparing the results", as used herein, also includes normalizing the results obtained for the test sample relative to the control (for example, by setting the control to 100% and calculating the corresponding value for the test sample). In one preferred embodiment, such normalization is performed by means of using the algorithm described in the experimental section below.

In another preferred embodiment, the method further comprises selecting those chromosomal regions in which for the majority of the at least two genetic markers comprised in a chromosomal region the results obtained are preserved in the one or more cells analyzed and the control. For example, if the parameter being indicative for the allele status to be analyzed is allele length, then obtaining identical results in the test sample and the control (e.g., non-amplified DNA) indicates a correct size and provides an additional measure for the quality of the amplification products used.

In the subsequent step, the method of the present invention comprises determining for each of the candidate genetic markers in each chromosomal region selected in the previous step at least one parameter being indicative for the allele copy number of the candidate genetic marker. As used herein, determining the allele status of a genetic marker (as described above) may also include determining the allele copy number.

The allele copy number can be seen as a measure for the occurrence of chromosomal aberrations (i.e. abnormalities of the normal chromosomal structure). Various chromosomal aberration detection procedures, many of them PCR-based, are known in the art such as inter alia multiplex ligation-dependent probe amplification (MLPA; Schouten, B. et al. (2002) Nucl. Acids Res. 30, e57) and quantitative fluorescence PCR (QF-PCR; Pertl, B. et al. (1999), supra), with the latter one being particularly preferred herein.

Thus, in a specific embodiment, this determination step is performed by means of a PCR-based technique, preferably by quantitative fluorescence PCR.

The PCR-based techniques employed herein are performed according to established protocols well known in the art. For performing a QF-PCR any one of the following basic experimental approaches may be used: primer extension pre-amplification PCR (Zhang, L. et al. (1992) Proc. Natl. Acad. Sci. USA 89, 5847-5851), multiple displacement amplification (Dean, F.B. et al. (2002) Proc. Natl. Acad. Sci. USA 99, 5261-5266), linker adapter PCR (Ludecke, H.J. et al. (1989) Nature 338, 348-350), and degenerate oligonucleotide primed PCR (Telenius, H. et al. (1992) Genomics 13, 718-725). Kits for performing such PCR techniques for whole genome amplification are also available from various manufacturers.

Subsequently, in analogy to the previous "determination step", again those chromosomal regions are selected in which the majority of the at least two genetic markers comprised in a chromosomal region shows consistent (i.e. identical or at least similar) results in the above analysis. Such consistent results represent an indication for the significance of the allele copy number (or any variations of the allele copy number) determined in the analysis. In contrast, results based on the mere analysis of individual markers may be misleading, e.g., due to some artifacts.

In another preferred embodiment, the method again further comprises comparing the results obtained for the "test sample" with those obtained for a control. Preferably, the control employed is a non-amplified nucleic acid sample, and is particularly preferably derived from the subject to be analyzed. However, within the present invention, the term "control" also refers to reference values derived from databases or published in the scientific literature. The term "comparing the results", as used herein, also includes normalizing the results obtained for the test sample relative to the control. In one preferred embodiment, such normalization is performed by means of using the algorithm described in the experimental section below.

In yet another preferred embodiment, the method further comprises selecting those chromosomal regions in which for the majority of the at least two genetic markers comprised in a chromosomal region of the one or more cells analyzed the allele copy number is modified (i.e. altered) as compared to the control.

Particularly preferably, the modification of the allele copy number is selected from the group consisting of a chromosomal loss and a chromosomal gain.

The term "chromosomal gain", as used herein, denotes an increase in copy number due to a single, double or triple duplication of chromosomal regions. The duplicated or gained regions may be derived from the same chromosome or from different chromosomes. Preferably, they are from the same chromosome. The term "chromosomal loss", as used herein, denotes a decrease in copy number due to a deletion of chromosomal regions.

Within the present invention, any one or more of the chromosomal regions selected in the final step of the method is/are indicative for the presence of a particular genotype. Hence, the identification of a single chromosomal region may be sufficient to indicate the presence of a particular phenotype. However, in order to provide accurate and reliable results typically the selection of more than one chromosomal region will be required.

Furthermore, the one or more chromosomal regions selected may also have direct diagnostic and/or prognostic value. For example, it is well established in the scientific literature that multiple tumors, for example colorectal cancer/adenocarcinoma and breast cancer, are (partially) caused by or at least associated with chromosomal aberrations. Thus, the detection of such chromosomal aberrations in tumor cells or cells suspected to be derived from a tumor may directly indicate or confirm the actual presence of a tumor.

The method of the present invention may also be performed using a computer-based approach employing various algorithms for evaluating the experimental data obtained. In a preferred embodiment, the method is performed using the algorithm described in the experimental section below.

In a further preferred embodiment, the method is performed in a multiplex-format. The term "multiplex-format", as used herein, refers to the parallel analysis of multiple (i.e. two or more) samples, each sample comprising one or more cells. In particular, the term also relates to (automated) high-throughput analyses of hundreds or thousands of samples, for example by employing array technology.

In a further aspect, the present invention relates to a method for diagnosing and/or monitoring in a sample derived from a subject the presence of a tumor or a predisposition to develop a tumor, comprising:

(a) screening one or more cells derived from the sample of the subject according to the method as defined herein above for the presence of a particular genotype that is indicative for the presence of a tumor or a predisposition to develop a tumor; and optionally

(b) determining in the one or more cells obtained in step (a) that exhibit the particular genotype at least one further parameter being indicative for the presence of a tumor or a predisposition to develop a tumor.

Typically, the method is performed as an in vitro method.

Within the present invention, the terms "diagnosing" and "monitoring" are intended to encompass predictions and likelihood analysis (based on both the qualitative and quantitative measurements). The present method is intended to be used clinically in making decisions concerning treatment modalities, including therapeutic intervention, disease staging, and disease monitoring and surveillance. According to the present invention, an intermediate result for examining the condition of a subject may be provided. Such intermediate result may be combined with additional information to assist a physician, nurse, or other practitioner to diagnose that a subject suffers from the disease. Alternatively, the present invention may be used to detect tumor cells in a subject-derived sample, and provide a doctor with useful information to diagnose that the subject suffers from the disease.

After performing the screening of one or more cells of a subject suffering or at least being suspected to suffer from a tumor according to the corresponding method as described herein above the results may optionally be corroborated and/or supplemented by performing an additional (more sophisticated or specific) analysis, for example, in order to extend the result of the pre-screening that in fact a tumor is present to the determination of stage of tumor progression. Accordingly, the screening method per se may be sufficient to also enable the diagnosis of a tumor. However, an accurate diagnosis may typically require performing further analyses.

In a final aspect, the present invention relates to the use of the methods as defined herein above for diagnosing and/or monitoring in a sample derived from a subject any one or more selected from the group consisting of the presence of a tumor, the presence of a particular tumor type, the presence of a particular tumor stage, the predisposition to develop a tumor, and response to tumor therapy.

The invention is further described by the figures and the following examples, which are solely for the purpose of illustrating specific embodiments of this invention, and are not to be construed as limiting the scope of the invention in any way.

EXAMPLES

Example 1 : Pre-screening of the whole-genome amplification product of single cells

The pre-screening method developed employs a quantitative fluorescence (QF)-PCR assay, which provides information on both the quality of the single-cell PCR product and on copy-number changes. The analyses focus on copy number changes frequently reported in the literature for certain tumor entities. However, in principle any region in the genome can be tested. The method involves determination of the allele size, number of alleles and ratio values corresponding to the copy number of certain microsatellite markers located in close proximity at defined regions with frequent copy number changes in breast and colorectal cancer. The aim is to use sets of primers, which are multiplexed together, as a "one-tube" test.

Herein, a set of 21 chromosomal markers (STRs, short tandem repeats) is used for the analysis of circulating tumor cells (CTCs) derived from breast and colorectal cancers.

TABLE 1: Panel of chromosomal markers used in the QF-PCR strategy. HEX, FAM and ATTO550 NED represent the fluorescent labels employed.

The marker panel shown in Table 1 covers seven chromosomal regions that are frequently gained or lost in breast and colorectal tumors.

TABLE 2: Expected results for CTC analysis with the marker panel in Table 1 based on available literature data. These regions are frequently, but not in all cases, gained or lost.

Tumor entity   Change   Locus 1   Locus 2   Locus 3
Colorectal     Gain     8p        17p       18q
Colorectal     Loss     8q        13q       20q
Breast         Gain     8p        13q       17p
Breast         Loss     1q        8q        20q
Control                 2q23.5

The present strategy includes enrichment of the STR marker panel for those markers located in regions often involved in copy number changes in the respective tumor entity. This significantly increases the likelihood to detect alterations in the analyzed CTCs. The following characteristics are determined for each marker: chromosomal location, heterozygosity, and size range (bp).

The PCR analyses were performed following established protocols known in the art (see, e.g., Sambrook, J., and Russel, D.W. (2001), supra; Ausubel, F.M. et al. (2001), supra), preferably by quantitative fluorescence PCR (QF-PCR; Pertl, B. et al. (1999), supra). For performing a QF-PCR any one of the following basic experimental approaches may be used: primer extension pre-amplification PCR (Zhang, L. et al. (1992), supra), multiple displacement amplification (Dean, F.B. et al. (2002), supra), linker adapter PCR (Ludecke, H.J. et al. (1989), supra), and degenerate oligonucleotide primed PCR (Telenius, H. et al. (1992), supra). Kits for performing such PCR techniques for whole genome amplification are also available from various manufacturers. The PCR products obtained were analyzed on 3000 and 3100 capillary-based genetic analyzers (Applied Biosystems, Carlsbad, CA, USA).

Example 2: Analysis of a marker panel on chromosome 18

The following set of experiments was done with normal, diploid male cells. Four single cell experiments (labeled JG_1, JG_1.3, JG_1.4, JG_1.5) and two experiments with pools of 10 cells (labeled JG_10.4, JG_10.5) were performed which involve the analyses of 5 highly polymorphic markers on chromosome 18 in order to examine the copy number status of chromosome 18. These markers are: D18S386 (located at 18q22.1: 65,794,580-65,794,944), D18S391 (located at 18p11.31: 5,781,224-5,781,405), D18S499 (located at 18q21.32-q21.33), D18S535 (located at 18q12.3: 38,148,789-38,148,934), and D18S978 (located at 18q12.3: 38,338,136-38,338,382).

For interpretation of results, the following criteria were developed. Each marker provides information about the following characteristics:

1. number of alleles (i.e. one or two alleles); markers having a high rate of heterozygosity are selected according to databases in order to increase the likelihood that two alleles will be observed for each marker;

2. marker size; and

3. peak area obtained in the PCR analysis.

The accurate evaluation of these parameters with non-amplified, high-molecular weight genomic DNA is usually relatively easy and can be performed according to published protocols (see, e.g., Mann, K. et al. (2001) Lancet 358, 1057-1061). However, when using highly fragmented DNA from single cell amplification products the peak areas show a great variability with the amplification products. Hence, the development of specific strategies is required. In particular, because of these differences a special evaluation algorithm is necessary for the evaluation of QF-PCR products from single cells. The algorithm employed herein evaluates both the quality of the amplification product and the copy number of selected regions within the genome.

2.1 Quality of the amplification product

The quality criteria are based on the following parameters:

1. A high number of preserved alleles in the amplified DNA having the correct size (established by comparing amplified and non-amplified DNA) suggests good quality (e.g. if 5 markers specific for a chromosomal region are used, 10 alleles should be present in the amplification product; the size of at least 8-9 alleles should be correctly reflected in the amplification product) (cf. Fig. 1).

2. The number of alleles and their peak size is used for estimation of the copy number as described in detail below. The distance of the markers is chosen so that the majority of markers should show similar if not identical results. Multiple differences (i.e. the first marker indicates a loss, the second marker a gain, the third marker a normal copy number, the fourth marker again a loss, and so on) indicate a poor quality.

3. Single-cell amplification products with easily interpretable results are the ones with optimal quality and the best candidates for further processing with high-resolution tools.

2.2 Analysis of marker copy numbers by QF-PCR

Identification of lost regions - A lost region is characterized by loss of heterozygosity (LOH). The presence of two different alleles (i.e. maternal and paternal allele) for a given locus excludes a deletion; a lost region is always characterized by the occurrence of only one allele. However, the presence of only one allele at a single locus is not sufficient to determine that the respective region is indeed lost as such losses may also be due to other factors, such as chance (i.e. the person has in this locus two identical alleles, in this case the person would have only a single peak and such a marker is described as uninformative) or by artifact ("allelic drop out" due to the whole-genome amplification). Therefore, the present results are not based on only a single locus and for this reason the marker panel was selected so that several markers were in close proximity. Only if these markers show concordantly only one allele instead of two is this region defined as lost.

The detection of lost regions can be greatly facilitated if the marker analysis of non-amplified DNA is done in parallel, which provides, for each marker, very accurate information about the number of alleles (i.e. one or two alleles), the size of the markers and the peak areas:

• A single peak in the non-amplified DNA is described as uninformative.

• A minimum of two informative markers is required for confident interpretation, owing to the possibility of primer-site polymorphisms and somatic repeat instability.

In summary, a region is classified as lost if the majority or all markers show only one allele. The diagnosis will be more accurate if non-amplified DNA is available and shows the presence of two alleles so that the presence of a LOH can be easily confirmed.
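The rule for lost regions summarized above can be sketched in Python as follows. The function name `region_is_lost`, the tuple format, and the choice to evaluate the majority only among informative markers are assumptions made for illustration; the description itself only requires that the majority or all markers show one allele and that at least two informative markers are available.

```python
def region_is_lost(markers, min_informative=2):
    """Decide whether a region shows loss of heterozygosity (LOH).

    markers: list of (alleles_unamplified, alleles_amplified) tuples,
             one per marker (hypothetical input format).
    Returns True or False, or None if too few markers are informative.
    """
    # A single peak in the non-amplified DNA makes a marker uninformative.
    informative = [amp for unamp, amp in markers if unamp == 2]
    if len(informative) < min_informative:
        return None
    # Count the informative markers that show only one allele after amplification.
    single_allele = sum(1 for amp in informative if amp == 1)
    return single_allele > len(informative) / 2


# Four informative markers, three of them reduced to one allele: region lost.
print(region_is_lost([(2, 1), (2, 1), (1, 1), (2, 2), (2, 1)]))
```

A single marker reduced to one allele is deliberately not sufficient here, since allelic drop-out in the whole-genome amplification can mimic a loss at an individual locus.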

Identification of gained regions - Accurate identification of gained regions represents a particular challenge. For example, in prenatal applications the diagnosis of a trisomy is based either on the observation of three alleles (i.e. three peaks) or on two alleles in a 2:1 or a 1:2 ratio. In the experimental settings used herein three alleles should not occur.

Identification of gained regions depends not on size but on peak areas or peak area ratio calculations. However, a simple establishment of 2:1 or 1:2 ratios directly from the obtained peak values is often difficult because peak areas are usually not accurately reflected in single cell amplification products. In fact, due to the fragmentation of the DNA of single cell amplification products, it cannot be assumed that a trisomy always results in a 2:1 ratio for each marker used; instead, peak areas calculated from single cell amplification products show a great variability. For some markers the observed ratio may be lower than the expected 2:1 ratio, for other markers it may even be higher. To compensate for such PCR artifacts, not all but only a given number of the markers used have to differ by factor 2 (cf. Figures 2 and 3).

The results obtained suggest that the larger the peak area the larger is the variability of the peak areas in the amplification product. In addition, it is notable that the second allele of marker D18S499 did not show an amplification product in any experiment. This suggests that alleles with larger peak areas could be more prone to amplification failure than alleles with smaller peak areas. However, this is apparently not a general phenomenon as the two alleles of marker D18S978, which have comparable peak area sizes, yielded amplification products in all experiments. It will be seen whether alleles with large peak areas are more prone to amplification failure and therefore more prone to provide false-positive signs for a possible marker loss. If this should turn out to be the case, the size of peak areas has to be included in the evaluation algorithm for losses as well.

Figures 2 and 3 demonstrate the variability of peak areas in the amplification products analyzed. There are a few patterns, which appear to occur relatively frequently (e.g. peak areas of amplification products are more likely to be smaller as compared to peak areas with non-amplified DNA; the variability tends to increase with the size of the peak areas), however, these patterns are insufficient for a reliable peak area ratio calculation. The most accurate peak area ratio calculations can be achieved by normalizing the peak areas.

Each informative marker has two alleles, a paternal and a maternal allele. The respective peak areas could be designated as:

$Pe_A(M_{ij})$ and $Pe_A(P_{ij})$ / $Pe_U(M_{ij})$ and $Pe_U(P_{ij})$

where $Pe_A$ is the peak area obtained with the amplified DNA; $Pe_U$ is the peak area obtained with the non-amplified DNA; $M_{ij}$ is the maternal allele for chromosomal region $i$ and marker $j$; and $P_{ij}$ is the paternal allele for chromosomal region $i$ and marker $j$.

However, in the vast majority of cases it will not be known which of the two alleles is the maternal and which is the paternal one. The parental origin need not be known, however, and therefore the alleles can simply be named $A_1$ and $A_2$, respectively. Thus, the respective peak areas could be designated as:

$Pe_A(A_{1ij})$ and $Pe_A(A_{2ij})$ / $Pe_U(A_{1ij})$ and $Pe_U(A_{2ij})$

where $A_{1ij}$ is allele 1 for chromosomal region $i$ and marker $j$; and $A_{2ij}$ is allele 2 for chromosomal region $i$ and marker $j$.

In the first step of the normalization procedure, the peak areas of the non-amplified DNA are set to the value 1:

i=1; j=1
{ while i ≤ r
  { while j ≤ m_i
    NPe_U(A_{1ij})=1; NPe_U(A_{2ij})=1; j=j+1 }
  i=i+1; j=1 }

where $r$ is the number of regions tested in the assay; $m_i$ is the number of markers for each region $i$; and $NPe_U$ is the normalized peak area obtained with the unamplified DNA.

For uninformative markers, there is only one allele; the missing second allele is assigned a value of 0.

In the next step, the peak areas of the amplified DNA are normalized by dividing their values by the respective value of the non-amplified DNA:

i=1; j=1
{ while i ≤ r
  { while j ≤ m_i
    NPe_A(A_{1ij})=Pe_A(A_{1ij})/Pe_U(A_{1ij}); NPe_A(A_{2ij})=Pe_A(A_{2ij})/Pe_U(A_{2ij}); j=j+1 }
  i=i+1; j=1 }

where $NPe_A$ is the normalized peak area obtained with the amplified DNA.

If $Pe_A$ is smaller than $Pe_U$ (the data shown in Figs. 2 and 3 suggest that this will be the case for the majority of markers), then $NPe_A$ will have a value between 0 and 1; otherwise, if $Pe_A$ is larger than $Pe_U$, then $NPe_A$ is larger than 1. Figure 4 depicts the data after the normalization procedure.
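The two normalization steps may be sketched in Python for a single marker as follows. The function name `normalize_peak_areas` and the example peak areas are hypothetical and are not taken from the figures; a missing allele is encoded as 0, in line with the convention stated above for uninformative markers.

```python
def normalize_peak_areas(amplified, unamplified):
    """Normalize the peak areas of the two alleles of one marker.

    amplified, unamplified: pairs of raw peak areas (allele 1, allele 2).
    Returns the normalized areas of the non-amplified DNA (set to 1)
    and of the amplified DNA (divided by the non-amplified value).
    """
    # Step 1: every allele observed in the non-amplified DNA is set to 1;
    # a missing allele (area 0) keeps the value 0.
    npe_u = tuple(1.0 if u > 0 else 0.0 for u in unamplified)
    # Step 2: amplified areas are divided by the matching non-amplified area.
    npe_a = tuple(a / u if u > 0 else 0.0
                  for a, u in zip(amplified, unamplified))
    return npe_u, npe_a


# Hypothetical peak areas for one informative marker.
npe_u, npe_a = normalize_peak_areas(amplified=(1200.0, 900.0),
                                    unamplified=(2400.0, 1500.0))
print(npe_u, npe_a)
```

In this made-up example both amplified peak areas are smaller than their non-amplified counterparts, so both normalized values fall between 0 and 1.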

Although the peak areas show substantial variability after amplification the two alleles of a given marker often (but not always) have comparable peak areas after normalization (cf. Fig. 4). In the present case, similar peak areas are expected as the experiments were done with normal diploid cells.

The normalization is followed by calculations of ratio values in which the normalized peak areas of the two alleles are divided by each other (cf. also Table 3):

i=1; j=1
{ while i ≤ r
  { while j ≤ m_i
    RPe_A(l_{ij})=NPe_A(A_{1ij})/NPe_A(A_{2ij}); j=j+1 }
  i=i+1; j=1 }

where $RPe_A$ is the ratio between the normalized peak areas obtained with the amplified DNA and $l_{ij}$ indicates the location for chromosomal region $i$ and marker $j$.
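A minimal Python sketch of the ratio calculation for one marker is given here. The function name `peak_area_ratio` and its `informative` flag are assumptions for illustration. The sketch follows two conventions stated further below in the description: the larger peak area is always divided by the smaller one, and the values "-1" and "0" denote an uninformative marker and a non-amplified allele, respectively.

```python
def peak_area_ratio(npe_a1, npe_a2, informative=True):
    """Ratio of the two normalized peak areas of one marker.

    Returns -1.0 for an uninformative marker, 0.0 if one allele did not
    amplify, and otherwise the larger area divided by the smaller area.
    """
    if not informative:
        return -1.0
    larger, smaller = max(npe_a1, npe_a2), min(npe_a1, npe_a2)
    if smaller == 0:
        return 0.0  # one allele is missing in the amplified DNA
    return larger / smaller


print(peak_area_ratio(0.5, 0.25))                    # both alleles present
print(peak_area_ratio(0.4, 0.0))                     # one allele not amplified
print(peak_area_ratio(0.4, 0.0, informative=False))  # single peak in control
```

Dividing the larger by the smaller value means that a ratio is never below 1 for a marker with two amplified alleles, which allows a single threshold to be used in the subsequent classification.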

TABLE 3: Results of the peak area ratio calculations. The "-1" in the column D18S499 indicates that this marker is not informative, whereas the "0" in the column D18S391 denotes that one of the two alleles was not amplified in the respective cell or cells (here in none of the cells); in other words, the value 0 indicates a loss of an allele. Thus, according to the aforementioned criteria, the "0" corresponds to a loss.

Cell sample   D18S386      D18S391   D18S499   D18S535      D18S978
JG_1          1.00320067   0         -1        1.0401018    1.16079108
JG_1.3        1.85345215   0         -1        1.41011197   3.13359948
JG_1.4        1.56319682   0         -1        1.43918118   1.97609924
JG_1.5        1.06500928   0         -1        1.88676892   1.05478284
JG_10.4       1.17107459   0         -1        1.28243175   1.12792145
JG_10.5       1.00295435   0         -1        1.69428161   1.09133434

From such a table, it can be calculated, after normalization, whether the respective chromosomal region is lost, gained or present with a normal copy number. Herein, the following criteria were used. In a first step, each single ratio value is considered and classified according to these criteria:

• Loss: if there is only one allele in amplified DNA and two alleles in unamplified DNA;

• Trisomy: if the allele ratio is greater than 1.8 (to achieve this, our algorithm always divides the larger peak area by the smaller peak area);

• Normal: if two alleles are present and the corresponding allele dosage ratio is less than 1.8;

• Uninformative: if only a single peak is present in the unamplified DNA.
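The four criteria listed above can be written as a small Python function. The function name `classify_marker`, its argument layout and the returned labels are assumptions for illustration; the threshold of 1.8 is taken from the criteria. The example calls use ratio values for cell JG_1.3 from Table 3.

```python
def classify_marker(alleles_unamplified, alleles_amplified, ratio,
                    threshold=1.8):
    """Classify a single marker according to the four criteria above.

    alleles_unamplified: number of peaks in the non-amplified DNA (1 or 2)
    alleles_amplified:   number of peaks in the amplified DNA (1 or 2)
    ratio:               larger normalized peak area / smaller one
    """
    if alleles_amplified not in (1, 2):
        raise ValueError("sketch assumes one or two amplified alleles")
    if alleles_unamplified < 2:
        return "uninformative"  # single peak in the unamplified DNA
    if alleles_amplified == 1:
        return "loss"           # one allele missing after amplification
    if ratio > threshold:
        return "trisomy"
    return "normal"


print(classify_marker(2, 2, 1.85345215))  # D18S386 in JG_1.3
print(classify_marker(2, 2, 1.41011197))  # D18S535 in JG_1.3
print(classify_marker(2, 1, 0.0))         # D18S391
print(classify_marker(1, 1, -1.0))        # D18S499
```

The order of the checks matters: the informativeness of the marker is tested first, so that a single peak in the unamplified DNA is never reported as a loss.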

In a second step, the alleles of each region are evaluated together. In order to achieve a result for the copy number status of a region the majority of classifications of individual regional markers should be consistent. Thus, in the example described herein 5 markers are used. Therefore, at least three markers indicating the same copy number status are required for making a classification. One marker (i.e. D18S499) is not informative and for another marker (i.e. D18S391) one of the two alleles present in the non-amplified DNA did not amplify in any experiment, therefore this marker had to be designated as "lost". Thus, the three remaining markers have to yield consistent results for an unambiguous classification.

i=1; j=1
{ while i ≤ r
  { while j ≤ m_i
    (if RPe_A(l_{ij})>1.8) (countAmp<-countAmp+1) else
    (if RPe_A(l_{ij})=-1) (countUI<-countUI+1) else
    (if RPe_A(l_{ij})=0) (countDel<-countDel+1) else
    (countNI<-countNI+1);
    j=j+1 }
  i=i+1; j=1 }

where countAmp, countUI, countDel and countNI are counters for markers with amplified, uninformative, deleted and normal status, respectively. Herein, if one of these counters is equal to or greater than 3, then the region will be assigned the respective status.
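The counting scheme for one region may be sketched in Python as follows. The function name `classify_region` and the status labels are assumptions for illustration; the threshold of 1.8, the minimum count of 3 and the encoding of "-1" and "0" follow the description. The example inputs are the rows for JG_1 and JG_1.3 from Table 3.

```python
from collections import Counter


def classify_region(ratios, threshold=1.8, min_count=3):
    """Assign a copy number status to one region from its marker ratios.

    ratios: one ratio value per marker; -1 encodes an uninformative
            marker and 0 a marker with a non-amplified allele.
    """
    counts = Counter()
    for ratio in ratios:
        if ratio > threshold:
            counts["amplified"] += 1
        elif ratio == -1:
            counts["uninformative"] += 1
        elif ratio == 0:
            counts["deleted"] += 1
        else:
            counts["normal"] += 1
    # The region takes the status whose counter reaches the minimum count.
    for status, count in counts.items():
        if count >= min_count:
            return status
    return "not classifiable"


print(classify_region([1.00320067, 0, -1, 1.0401018, 1.16079108]))   # JG_1
print(classify_region([1.85345215, 0, -1, 1.41011197, 3.13359948]))  # JG_1.3
```

With five markers and a minimum count of three, at most one counter can reach the threshold, so the assignment is unambiguous whenever it succeeds.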

TABLE 4: Results of the classification (according to the above criteria) for each marker and for the respective region. A "0" indicates loss, a "2" diploidy, a "3" a trisomy, a "4" not classifiable and a "-1" uninformative.

Cell sample   D18S386   D18S391   D18S499   D18S535   D18S978   Result for region
JG_1          2         0         -1        2         2         2
JG_1.3        3         0         -1        2         3         4
JG_1.4        2         0         -1        2         3         4
JG_1.5        2         0         -1        3         2         4
JG_10.4       2         0         -1        2         2         2
JG_10.5       2         0         -1        2         2         2

For example, in cell "JG_1" three markers (D18S386, D18S535, D18S978) were classified as being present in two copies (each marker has two alleles, and the normalized ratios between these two areas are less than 1.8; compare Table 3). Therefore, the entire region is classified as diploid. In contrast, in cell "JG_1.4" there are two diploid markers (D18S386, D18S535), one marker is triploid (D18S978), one marker is lost (D18S391) and one marker is uninformative (D18S499). Therefore, the region in this cell cannot be unequivocally classified and gets a "4".

Herein, of the 5 markers initially used only three remain for a classification because one marker is uninformative, and for another one an allele failed to amplify in every amplification product. Based on three markers and six experiments there remained 18 (3*6) marker results for classification. Of these, 14 (78%) were correctly classified as balanced, which is the expected result as all experiments were done with normal, diploid cells. However, the classification for the entire region depends on the status of the majority of markers (here, on all three markers). Consequently, the eventual classification of a region can readily become "not classifiable" if one of these three markers has a deviant value. Herein, in 3/6 experiments (50%) chromosome 18 was correctly classified as balanced whereas in the other cells it was "not classifiable". Thus, on the marker level, correct classification was achieved for 78% of markers, whereas for the region it decreased to 50%. On the other hand, the region was in no experiment incorrectly determined as gained.
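The marker-level and region-level rates quoted above can be recomputed from the classification codes in Table 4, as in the following Python sketch. The dictionary layout is a hypothetical choice; the codes themselves are those of Table 4 for the three remaining markers (D18S386, D18S535, D18S978) and for the region.

```python
# Table 4 codes: (D18S386, D18S535, D18S978) and the result for the region.
# 2 = diploid, 3 = trisomy, 4 = not classifiable.
table4 = {
    "JG_1":    ((2, 2, 2), 2),
    "JG_1.3":  ((3, 2, 3), 4),
    "JG_1.4":  ((2, 2, 3), 4),
    "JG_1.5":  ((2, 3, 2), 4),
    "JG_10.4": ((2, 2, 2), 2),
    "JG_10.5": ((2, 2, 2), 2),
}

# Marker level: count diploid calls among all remaining marker results.
markers_total = sum(len(markers) for markers, _ in table4.values())
markers_diploid = sum(markers.count(2) for markers, _ in table4.values())
# Region level: count experiments in which the region was called diploid.
regions_diploid = sum(1 for _, region in table4.values() if region == 2)

marker_rate = markers_diploid / markers_total
region_rate = regions_diploid / len(table4)
print(f"{markers_diploid}/{markers_total} markers = {marker_rate:.0%}")
print(f"{regions_diploid}/{len(table4)} regions = {region_rate:.0%}")
```

The drop from the marker level to the region level arises because a single deviant marker among the three remaining ones already prevents the required count of three concordant markers.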

Accordingly, not every region appears to be unequivocally classifiable. However, in the above-described approach only 5 loci were used, which significantly decreases resolution. Increasing the number of markers will improve resolution, but will in turn also increase the costs of the prescreening.

Example 3: Analysis of chromosomal trisomies

In order to check whether trisomies can be identified via the method described herein, in silico simulations were performed. Figure 5 illustrates such a simulation. Here, for each marker the peak areas of alleles 1, but not of alleles 2, were doubled, assuming that a trisomy will be reflected by an exact doubling of the peak area of one of the two alleles. Figure 5 shows the normalized values. Apparently, for the majority of markers with two alleles there is a clear difference between the two peak areas.
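A minimal sketch of such a simulation is given below. It is not the simulation underlying Figure 5: the marker names and peak areas are hypothetical, and the normalization (each peak area divided by the total peak area of the marker) is an assumption chosen for illustration.

```python
def simulate_trisomy(peak_areas):
    """Double the peak area of allele 1 and leave allele 2 unchanged."""
    allele_1, allele_2 = peak_areas
    return (2 * allele_1, allele_2)


def normalize(peak_areas):
    """Assumed normalization: express each peak area as a fraction of the marker total."""
    total = sum(peak_areas)
    return tuple(area / total for area in peak_areas)


# Hypothetical peak areas (allele 1, allele 2) of heterozygous markers in a diploid cell.
diploid_markers = {
    "marker_A": (1000.0, 1000.0),
    "marker_B": (800.0, 1200.0),
}

for name, areas in diploid_markers.items():
    norm_1, norm_2 = normalize(simulate_trisomy(areas))
    print(name, "allele 1 / allele 2 after simulated gain:", round(norm_1 / norm_2, 2))
```

In this sketch a marker with equal starting peak areas ends up at a ratio of exactly 2, whereas a marker with unequal starting areas deviates from 2, which illustrates why a normalization step and several markers per region are used.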

Example 4: Additional Experimental Results

Some additional experimental results are described in the following. The experiments were performed with cells from healthy controls as well as with tumor cells, and the marker patterns were analyzed with an algorithm according to an exemplary embodiment. The analyses included single cell amplification products and genomic DNA from the same individual with the QF-markers in order to achieve optimal sensitivity for the detection of gains and losses (the latter being so-called losses of heterozygosity [LOH]).

Specifically, single cells from a healthy control were selected either with a CellCelector of Aviso, which facilitates automated selection of cells, or manually. In all cases the whole genome amplification (WGA) was performed with the illustra GenomiPhi V2 DNA Amplification Kit and the QF-PCR with the QIAGEN Multiplex PCR Product kit. After selection of a single cell of a healthy control with the CellCelector the following results were obtained:

Three different microsatellite marker sets were adjusted and the testing was performed for chromosomal regions 8p (5 markers), 8q (5 markers), 13q (4 markers), and 17p (5 markers). However, marker generation is easy and straightforward, so that a person skilled in the art can easily switch to other chromosomal regions. To improve sensitivity, the number of markers for each region was increased to 4 or 5. Figures 6 to 8 each show graphs of the source data for un-amplified genomic DNA (top panel) and for two single cells (center and bottom panel). Each row depicts, for the individual markers, the number of alleles (here: always one or two), the size of the marker and the peak areas. In addition, below the graphs of each figure a table is added, which summarizes the exact locations of the microsatellite markers.

Figure 6 shows experimental results of an analysis for a first mix of seven different markers. The first mix consists of three markers for chromosome 8p, one marker for chromosome 8q, one marker for chromosome 13q, and two markers for chromosome 17p. For example, in Figure 6 the first marker represents a region on chromosome 8p and has two alleles, one at 114 bp and the second, which has a very small peak area, at 128 bp. These two alleles were also identified in the two single cells. The same is true for the other markers in this probe set.

Figure 7 shows experimental results of an analysis for a second mix of seven different markers. The second mix consists of one marker for chromosome 8p, four markers for chromosome 8q, and two markers for chromosome 17p. Figure 7 illustrates marker analysis with the same DNA and single cells, respectively, and demonstrates how an evaluation algorithm according to an exemplary embodiment handles variability, which may occur in single cell analysis due to bias during the whole genome amplification process or other events, e.g. during the cell selection. The first marker for chromosome 8q has two alleles at 120 bp and 124 bp. In each of the single cell amplification products the second allele at 124 bp shows a markedly reduced peak area. The algorithm would not classify this microsatellite marker as lost, because both alleles are still present and thus there is no LOH. Conversely, although the peak area of the allele at 120 bp is much larger than that of the 124 bp allele in the two single cells, the algorithm would not call this region amplified, because the peak area at 120 bp in the single cell amplification products has not increased compared to the peak area obtained with the genomic DNA.

Another example is the marker 17p2, which consists of two alleles at 218 bp and 224 bp. The first single cell clearly shows an LOH, as the allele at 218 bp cannot be detected. Still, the region would not be called lost by the algorithm, as marker 17p-5 (Figure 6) clearly shows the presence of two alleles.
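The decisions described for these markers can be sketched as follows. This is not the algorithm of the exemplary embodiment: the function names, all peak areas and the allele sizes of marker 17p-5 are hypothetical, and both the direct comparison of peak areas between genomic DNA and single cell and the rule that a region is called lost only if every heterozygous marker shows LOH are simplifying assumptions.

```python
def shows_loh(genomic, cell):
    """True if a marker that is heterozygous in genomic DNA has exactly one
    detectable allele in the single cell. Arguments map allele size (bp) to peak area."""
    if len(genomic) != 2:
        return False  # markers with a single allele are uninformative for LOH
    detected = [bp for bp in genomic if cell.get(bp, 0.0) > 0.0]
    return len(detected) == 1


def peak_area_increased(genomic, cell, allele_bp):
    """True if the allele's peak area in the cell exceeds that in genomic DNA
    (assumes both measurements are on a comparable scale)."""
    return cell.get(allele_bp, 0.0) > genomic[allele_bp]


def region_lost(markers):
    """Assumed rule: a region is lost only if all heterozygous markers show LOH."""
    informative = [(g, c) for g, c in markers if len(g) == 2]
    return bool(informative) and all(shows_loh(g, c) for g, c in informative)


# (genomic DNA, single cell) peak areas; all values are hypothetical.
marker_8q = ({120: 5000.0, 124: 4800.0}, {120: 4600.0, 124: 700.0})
marker_17p_2 = ({218: 3000.0, 224: 3100.0}, {224: 2800.0})
marker_17p_5 = ({301: 2500.0, 305: 2400.0}, {301: 2300.0, 305: 2000.0})

print("8q marker shows LOH:", shows_loh(*marker_8q))
print("8q allele at 120 bp increased:", peak_area_increased(*marker_8q, 120))
print("17p2 shows LOH:", shows_loh(*marker_17p_2))
print("17p region lost:", region_lost([marker_17p_2, marker_17p_5]))
```

In this sketch the reduced 124 bp allele does not produce a loss call, the larger 120 bp allele does not produce a gain call, and the single LOH at 17p2 is outweighed by the second 17p marker retaining both alleles.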

Figure 8 shows experimental results of an analysis for a third mix of five different markers. The third mix consists of one marker for chromosome 8p, three markers for chromosome 13q, and one marker for chromosome 17p. Figure 8 summarizes the remaining markers, all of which were correctly identified in the single cells.

The above-described results illustrate the reliability that can be achieved by fine-tuning the QF-PCR. In summary, it may be possible to obtain highly reproducible and robust results with the described algorithm. The algorithm turned out to process the data sets very accurately and thus generates reliable data on both the quality of the amplification product and the copy number status of the tested regions.

Example 5: Discussion

The resolution (sensitivity) that can be achieved with the method depends on the number of markers used for each region. An increase of the number of markers along the length of each chromosomal region increases the likelihood of detecting unbalanced chromosome rearrangements. On the other hand, each additional marker increases the complexity of the multiplex-PCR and makes it more expensive.

In order to achieve a reasonable resolution at affordable costs, the question is how to design an optimal marker panel for a given set of individual cells. Preferably, these markers are located in regions that frequently show chromosomal gains and losses in tumors, in order to increase the chance of identifying cells with copy number changes. Tables 1 and 2 show such regions in breast and colon cancer.

In general, the number of markers needed to identify a loss is probably lower than for the identification of a gain. The identification of a loss is relatively straightforward, as it depends on the presence of two alleles in the non-amplified DNA and the loss of one allele in the cells or amplification product, respectively. To avoid wrong interpretation due to amplification artifacts (allelic drop-out), the interpretation depends not on a single marker, but on several markers. For the unequivocal identification of a loss three markers may be sufficient.

In contrast, identification of gains is more complex and depends mainly on a 2:1 peak area ratio. The calculation of such peak area ratios is relatively prone to artifacts, which can be significantly reduced by performing a normalization step. However, due to the more complicated evaluation procedure for a possible gain, it is currently anticipated that at least 5 markers are needed for a reliable identification of gained regions.

Further resolution-determining factors include the number of markers showing a consistent result. In the present example, the copy number status of a region depends on the classification of the majority of markers, i.e. if at least 3 of the 5 markers indicate a certain copy number status, the entire region is set to this value. However, one could also require that 4/5 markers or even 5/5 markers show consistent results, which would make the entire procedure more stringent but, on the other hand, also increase the likelihood of finding unclassifiable cells. The selection of thresholds and stringency criteria may depend on the specific application and on the preferences of individual investigators.
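The effect of the stringency criterion can be illustrated with a small sketch. The function name classify_region_with_threshold, its parameters and the marker calls are hypothetical; the codes follow Table 4 (2 = diploid, 3 = trisomy, 4 = not classifiable, -1 = uninformative).

```python
from collections import Counter


def classify_region_with_threshold(marker_calls, min_consistent=3,
                                   not_classifiable=4, uninformative=-1):
    """Set the region to the most frequent marker call if at least
    min_consistent markers agree; otherwise return not_classifiable."""
    informative = [call for call in marker_calls if call != uninformative]
    if not informative:
        return not_classifiable
    status, count = Counter(informative).most_common(1)[0]
    return status if count >= min_consistent else not_classifiable


# Hypothetical calls for the 5 markers of one region in one cell.
calls = [2, 2, 2, 3, 2]

for required in (3, 4, 5):
    result = classify_region_with_threshold(calls, min_consistent=required)
    print(f"at least {required}/5 consistent markers required -> region call {result}")
```

For these hypothetical calls the region is classified as diploid when 3/5 or 4/5 consistent markers are required, and becomes "not classifiable" when 5/5 are required.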

Further options to improve the evaluation of tumor cells may take advantage of the clonality in tumor cell populations. For example, if a gain of a particular chromosome has a 2:1 ratio for a given allele 1 over allele 2, and if this clone expands, multiple progeny cells will be generated having the same ratio shift between alleles 1 and 2. Thus, this strategy also provides information about clonality, which should in turn facilitate the identification of gained regions.

Subsequently, cells of interest, e.g. selected on the basis of certain patterns of gains and losses, can be subjected to further analyses, e.g. array-CGH. It is conceivable to pool cells with identical patterns of gains and losses, as such cells are likely to be from the same clone. Such "intelligent pooling" or "smart pooling" of cells may also be important if the amplification products are subjected to subsequent sequencing; furthermore, it will also increase the resolution of array-CGH.
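Such a pooling step could, for example, group cells by their region-level classification. The sketch below is illustrative only: the function name pool_by_pattern, the cell identifiers and the region calls are hypothetical, and requiring an exact match of the pattern is an assumption.

```python
from collections import defaultdict


def pool_by_pattern(region_calls):
    """Group cells whose region-level calls are identical.

    region_calls maps a cell identifier to a dict of region -> copy number call
    (codes as in Table 4: 0 = loss, 2 = diploid, 3 = trisomy)."""
    pools = defaultdict(list)
    for cell, calls in region_calls.items():
        pattern = tuple(sorted(calls.items()))  # order-independent key
        pools[pattern].append(cell)
    return dict(pools)


# Hypothetical region calls for four cells.
cells = {
    "cell_1": {"8p": 0, "8q": 3, "13q": 2, "17p": 0},
    "cell_2": {"8p": 2, "8q": 2, "13q": 2, "17p": 2},
    "cell_3": {"8p": 0, "8q": 3, "13q": 2, "17p": 0},
    "cell_4": {"8p": 0, "8q": 2, "13q": 2, "17p": 0},
}

for pattern, members in pool_by_pattern(cells).items():
    print(dict(pattern), "->", members)
```

Here cell_1 and cell_3 share the same pattern of gains and losses and would be pooled, while the other two cells each form their own group.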

The present invention illustratively described herein may suitably be practiced in the absence of any element or elements, limitation or limitations, not specifically disclosed herein. Thus, for example, the terms "comprising", "including", "containing", etc. shall be read expansively and without limitation. Additionally, the terms and expressions employed herein have been used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Thus, it should be understood that although the present invention has been specifically disclosed by embodiments and optional features, modifications and variations of the inventions embodied therein may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention.

The invention has been described broadly and generically herein. Each of the narrower species and sub-generic groupings falling within the generic disclosure also form part of the invention. This includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.

Other embodiments are within the following claims. In addition, where features or aspects of the invention are described in terms of Markush groups, those skilled in the art will recognize that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group.