

Title:
AUTOMATED GENOTYPING
Document Type and Number:
WIPO Patent Application WO/2003/031647
Kind Code:
A1
Abstract:
An automated method of genotyping provides rapid, automated analysis of genetic marker information for the purposes of genotyping. Preferably, the method relates to analysis of genetic markers produced by multiplex fluorescent nucleic acid sequence amplification. This method is particularly suited to computer-implementation for rapid, reliable high-throughput analysis of genetic marker information in genotyping applications such as detection of aneuploidy, polyploidy, trisomies, expansive disorders (such as Huntington's disease, Fragile X), diseases involving mutations and deletions such as cystic fibrosis, chimerism, sex and sex aneuploidies, microsatellite instability such as loss of heterozygosity in cancer, genetic identification such as forensic analysis, linkage analysis and DNA fingerprinting, although without limitation thereto.

Inventors:
FINDLAY IAN (AU)
MATTHEWS PAUL LAWRENCE (AU)
MULCAHY BRENDAN KHALID (AU)
Application Number:
PCT/AU2002/001389
Publication Date:
April 17, 2003
Filing Date:
October 14, 2002
Assignee:
UNIV QUEENSLAND (AU)
FINDLAY IAN (AU)
MATTHEWS PAUL LAWRENCE (AU)
MULCAHY BRENDAN KHALID (AU)
International Classes:
C12Q1/68; (IPC1-7): C12Q1/68
Domestic Patent References:
WO1998024937A2    1998-06-11
WO2000061817A1    2000-10-19
WO1999054500A2    1999-10-28
Foreign References:
EP1026258A2    2000-08-09
US5994064A    1999-11-30
Other References:
FINDLAY I. ET AL.: "Using MF-PCR to diagnose multiple defects from single cells: implications for PGD", MOL. CELL. ENDOCRIN., vol. 183, 2001, pages S05 - S12, XP001199454
FINDLAY I. ET AL.: "Preimplantation genetic diagnosis using fluorescent polymerase chain reaction: results and future developments", J. ASSISTED REPROD. GENET., vol. 16, no. 4, 1999, pages 199 - 206, XP008033799
APESSOS A. ET AL.: "Preimplantation genetic diagnosis of the fragile X syndrome by use of linked polymorphic markers", PRENATAL DIAGNOSIS, vol. 21, 2001, pages 504 - 511, XP008033797
PIYAMONGKOL W. ET AL.: "Preimplantation genetic diagnostic protocols for alpha- and beta-thalassaemias using multiplex fluorescent PCR", PRENATAL DIAGNOSIS, vol. 21, 2001, pages 753 - 759, XP008033798
DREESEN J.C.F.M. ET AL.: "Multiplex PCR of polymorphic markers flanking the CFTR gene; a general approach for preimplantation genetic diagnosis of cystic fibrosis", MOL. HUM. REPROD., vol. 6, no. 5, 2000, pages 391 - 396, XP001199522
BARCELLOS L.F. ET AL.: "Association mapping of disease loci, by use of a pooled DNA genomic screen", AM. J. HUM. GENET., vol. 61, 1997, pages 734 - 747, XP009007918
DEAN N.L. ET AL.: "The development of preimplantation genetic diagnosis for myotonic dystrophy using multiplex fluorescent polymerase chain reaction and its clinical application", MOL. HUM. REPROD., vol. 7, no. 9, 2001, pages 895 - 901, XP001199523
See also references of EP 1451371A4
Attorney, Agent or Firm:
Fisher, Adams Kelly (G.P.O. Box 1413 Brisbane, Queensland 4000, AU)
Claims:
CLAIMS
1. A method of genotyping including the steps of : (i) determining the presence or absence of one or more genetic markers in each of one or more samples, wherein the or each genetic marker in each said sample may be the same or different; (ii) using the or each genetic marker to produce a result for each said sample, the result selected from the group consisting of : (a) a genotype; (b) an indication that a sample error has occurred; (c) an indication that one or more markers has failed; (d) an indication that one or more alleles has failed; (e) an indication that one or more marker errors have occurred; and (f) an indication that genotyping should be repeated; and (iii) producing one or more confidence parameters that qualify the result obtained in step (ii).
2. The method of Claim 1, wherein the confidence parameter is marker reliability for the or each genetic marker used to produce the result in step (ii).
3. The method of Claim 1, wherein at step (ii) a plurality of genetic markers are used to produce the result for each said sample.
4. The method of Claim 1, wherein marker reliability is calculated for the or each same genetic marker in at least two of said samples.
5. The method of Claim 1, wherein said genetic markers are indicative of a chromosomal abnormality.
6. The method of Claim 5, wherein the chromosomal abnormality is trisomy.
7. The method of Claim 5, wherein the chromosomal abnormality is a human chromosomal abnormality.
8. The method of Claim 7, wherein the human chromosomal abnormality is selected from the group consisting of a chromosome 13 abnormality; a chromosome 18 abnormality; and a chromosome 21 abnormality.
9. The method of Claim 1, wherein at least one of said genetic markers is indicative of sex.
10. The method of Claim 1, wherein the one or more genetic markers is a plurality of genetic markers produced by fluorescent multiplex PCR.
11. The method of Claim 10, wherein each said genetic marker is detected as a fluorescence signal.
12. The method of Claim 11, wherein if the fluorescence signal is detected as a split peak, an average fluorescence signal is calculated.
13. The method of Claim 11, wherein if the fluorescence signal is detected as a pull up peak, background fluorescence is removed to produce a subtracted fluorescence signal.
14. A method of genotyping including the steps of : (i) detecting the presence or absence of one or more genetic markers in each of one or more samples, wherein the or each genetic marker is an allelic form of each of one or more chromosomal genetic markers; and (ii) producing a result for the or each said chromosomal genetic marker selected from the group consisting of : (a) an absence of any allelic form of the chromosomal genetic marker indicating a failed result; (b) a presence of one allelic form of the chromosomal genetic marker indicating a homozygous result; (c) a presence of two allelic forms of the chromosomal genetic marker, indicating a heterozygous result; (d) a presence of two allelic forms of the chromosomal genetic marker wherein the ratio of one allelic form to the other is about 2: 1, indicating diallelic trisomy; (e) a presence of three allelic forms of the chromosomal genetic marker, indicating triallelic trisomy; and (f) a presence of three or more allelic forms of two or more different chromosomal markers indicating contamination or trisomy.
15. The method of Claim 14, wherein at step (i), the one or more genetic markers is a plurality of genetic markers, each of which is an allelic form of a respective chromosomal genetic marker.
16. The method of Claim 15, wherein the results at step (ii) are grouped with respect to each said chromosomal genetic marker to produce a group result that indicates either chromosomal trisomy or disomy.
17. The method of Claim 15, wherein steps (i) and (ii) are repeated at least once with respect to at least one of the samples and using the same said genetic markers.
18. The method of Claim 17, wherein the results obtained in each repeat are compared to the previous result.
19. The method of Claim 16, wherein trisomy is determined with respect to one or more human chromosomes.
20. The method of Claim 14 wherein the human chromosome is selected from the group consisting of chromosome 13; chromosome 18 ; and chromosome 21.
21. The method of Claim 14, wherein at step (i) one or more sex markers are also included.
22. The method of Claim 21, wherein a sex result is also produced.
23. The method of Claim 14, wherein the one or more genetic markers is a plurality of genetic markers produced by fluorescent multiplex PCR.
24. A method of genetic data analysis including the steps of: (i) determining the presence or absence of one or more genetic markers in each of one or more samples; and (ii) producing one or more confidence parameters selected from the group consisting of: (a) a reliability parameter for the or each sample; (b) an accuracy parameter for the or each sample; (c) a marker reliability parameter for the or each said genetic marker; (d) an allele dropout parameter that indicates whether one or two allelic forms of the or each genetic marker are present in the or each sample; and (e) an amplification failure parameter with respect to the or each said genetic marker in the or each sample.
25. The method of Claim 24, wherein the one or more samples is a plurality of samples.
26. The method of Claim 25, further including step (iii) of producing one or more parameter analyses from the one or more confidence parameters for each of said plurality of samples in step (ii), said parameter analyses selected from the group consisting of : (A) a reliability analysis; (B) an allele dropout analysis; and (C) an amplification failure analysis.
27. The method of Claim 24, wherein the one or more genetic markers is a plurality of genetic markers produced by fluorescent multiplex PCR.
28. The method of Claim 27, wherein each said genetic marker is present as a fluorescence signal.
29. The method of Claim 25, wherein a fluorescence peak area ratio is calculated for each genetic marker in the or each sample.
30. The method of any one of Claims 1, 14 or 24, wherein said genetic markers are mammalian genetic markers.
31. The method of Claim 30, wherein said mammalian genetic markers are human genetic markers.
32. The method of any one of Claims 1, 14 or 24, wherein said genetic markers are plant genetic markers.
33. A computer-implemented method of genotyping according to Claim 1 or Claim 14.
34. A computer-implemented method of genetic data analysis according to Claim 24.
35. A computer programmed with the computer-implemented method of Claim 33.
36. A computer programmed with the computer-implemented method of Claim 34.
37. A computer programmed with the computer-implemented method of Claim 33 and the computer-implemented method of Claim 34.
38. An automated genotyping system comprising: (i) a source of genetic data; (ii) a computer according to Claim 35, Claim 36 or Claim 37 for analyzing genetic data obtained from said source of genetic data; (iii) a means for displaying a genotyping result produced by said computer.
39. The automated genotyping system of Claim 38, wherein the source of genetic data is a DNA sequencer, the genetic data being in the form of one or more fluorescence signals.
Description:
TITLE
AUTOMATED GENOTYPING

FIELD OF THE INVENTION

THIS INVENTION relates to a method of genotyping. More particularly, this invention relates to automated analysis of genetic marker information for the purposes of genotyping. More specifically, this invention relates to analysis of genetic markers produced by multiplex fluorescent nucleic acid sequence amplification. This method is particularly suited to computer-implementation for rapid, reliable high-throughput analysis of genetic marker information in genotyping applications such as detection of aneuploidy, polyploidy, trisomies, expansive disorders (such as Huntington's disease, Fragile X), diseases involving mutations and deletions such as cystic fibrosis, chimerism, sex and sex aneuploidies, microsatellite instability such as loss of heterozygosity in cancer, forensic analysis, linkage analysis and genetic identification such as DNA fingerprinting, although without limitation thereto.

BACKGROUND OF THE INVENTION

Genetic analysis or "genotyping" is an important technique in genetic research and in commercial applications such as medical and veterinary diagnosis, agriculture and forensics. This applies to mapping genes, genetic diagnosis, genetic predisposition to disease, chromosomal mapping and identity determination (DNA fingerprinting).

In linkage mapping for genetic disease, for example, DNA samples from affected individuals and their families are screened against a library of genetic markers. Resulting data are analyzed to find inherited chromosomal regions correlated with genetic disease.

As more complex traits and diseases are studied, the number of genotypes required grows significantly. Therefore performing accurate and high throughput genotyping is becoming an increasingly important factor for genetic analysis.

One process for genotyping uses a large family of markers called microsatellites which provide selective amplification. A typical amplification process uses polymerase chain reaction (PCR) to amplify short segments of chromosomal DNA known to contain a variable length marker. Each possible length corresponds to a distinct allele for the particular marker.

The length of the marker is measured by separating the amplified DNA in a variety of ways including but not limited to gel or capillary electrophoretic systems.

However, processing genetic marker data is very time consuming. To increase throughput, multiple genetic markers are amplified from a sample. This can be done either by multiplexing, where multiple markers undergo PCR simultaneously in the same reaction, or by a process commonly known as "pooling". In pooling, each marker undergoes PCR individually; the PCR products are then combined or "pooled" together. In both multiplex and pooled samples, markers with overlapping size ranges are tagged with different colored (e.g. fluorescent) dyes so that individual, amplified alleles can be distinguished. The same dye can be used for multiple markers as long as the size ranges do not overlap. Size separation equipment, for example DNA sequencers, is used to scan the gel and produce a pixel map color-coded image in machine-readable format. The pixel information is stored as a file in such a manner that it can be accessed to provide individual traces for each marker or group. Each of these traces can be used to determine genotype.
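The dye and size-range rule described above can be illustrated with a short sketch. It is not taken from the specification: the `Marker` record, the marker names, the dye names and the size windows are all hypothetical, and the code only shows one plausible way to check a panel for same-dye overlaps and to map a peak back to its marker.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Marker:
    """Hypothetical panel entry; the fields are assumptions for illustration."""
    name: str      # marker label
    dye: str       # dye colour used to tag the marker
    min_bp: float  # smallest expected allele size (base pairs)
    max_bp: float  # largest expected allele size (base pairs)


def overlapping_pairs(panel: List[Marker]) -> List[Tuple[str, str]]:
    """Return pairs of markers that share a dye AND have overlapping size ranges."""
    clashes = []
    for i, a in enumerate(panel):
        for b in panel[i + 1:]:
            ranges_overlap = a.min_bp <= b.max_bp and b.min_bp <= a.max_bp
            if a.dye == b.dye and ranges_overlap:
                clashes.append((a.name, b.name))
    return clashes


def assign_marker(panel: List[Marker], dye: str, size_bp: float) -> Optional[str]:
    """Map a peak (dye, size) to the marker whose size window contains it, if any."""
    for m in panel:
        if m.dye == dye and m.min_bp <= size_bp <= m.max_bp:
            return m.name
    return None


# Hypothetical panel: M1 and M3 overlap in size but carry different dyes.
panel = [
    Marker("M1", "blue", 100, 150),
    Marker("M2", "blue", 160, 220),
    Marker("M3", "green", 120, 170),
]
```

With this panel, `overlapping_pairs(panel)` finds no clash, because the only size overlap (M1 and M3) is resolved by dye colour. Adding a fourth blue marker spanning 140 to 165 bp would clash with both M1 and M2.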

Genetic analysis of nucleic acids generates huge amounts of data, typically many hundreds of thousands to millions of data points. First, much of this data is not required and must be filtered out. Second, peaks that correspond to genetic markers must be distinguished from background peaks.

One conventional approach uses "human callers", highly trained people who visually examine the traces to determine whether or not peaks in particular traces correspond to a genetic marker such as an allele. Often two different allele callers examine traces in a double blind fashion. If both callers agree that a particular peak corresponds to an allele the genotype is identified. If there is no agreement, the trace may be uncallable and be subject to a third caller.
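The double-blind calling procedure can be written as a small decision rule. The sketch below is illustrative only: the text does not say how the third caller's opinion is used, so the tie-break (accept the third call only if it matches one of the first two) is an assumption, as are the function name and the genotype strings.

```python
from typing import Optional


def consensus_call(call_a: str, call_b: str,
                   call_c: Optional[str] = None) -> Optional[str]:
    """Return the agreed genotype, or None if the trace stays uncallable.

    call_a, call_b: independent calls from the two blinded callers.
    call_c: optional third call, consulted only on disagreement (assumed rule).
    """
    if call_a == call_b:
        return call_a                      # both callers agree: genotype identified
    if call_c is not None and call_c in (call_a, call_b):
        return call_c                      # assumed tie-break by the third caller
    return None                            # no agreement: uncallable
```

For example, `consensus_call("120/124", "120/120", "120/124")` returns `"120/124"` under the assumed tie-break, while the same disagreement without a third call returns `None`.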

This manual processing is highly skilled, repetitive, time consuming and very expensive. Automated systems including commercial software such as Genotypes (Applied Biosystems) and Profiler™ (Amersham) can be used to increase the speed of analysis. A comparison of such systems is shown in Table 1. Using Table 1 as an example, the time-consuming nature of manual processing results in limited sample throughput: 32 samples per day in Table 1. Current automated processing software can increase throughput to approximately 100 samples per day, again using Table 1 as an example.

SUMMARY OF THE INVENTION

The present inventors have realized that there remain significant limitations to automated genotyping systems and the software typically used to run computer-based systems. In particular, significant deficiencies exist in processing multiple genetic marker information within and between samples and detecting and reporting events such as, but not limited to, reliability of genetic marker amplification, failed amplification, allele dropout (also known as allelic dropout) and preferential amplification. These latter problems compromise or qualify the reliability and/or accuracy of a genotyping result.

In a first aspect, the invention provides a method of genotyping including the steps of : (i) determining the presence or absence of one or more genetic markers in each of one or more samples, wherein the or each genetic marker in each said sample may be the same or different; (ii) using the or each genetic marker to produce a result for each said sample, the result selected from the group consisting of : (a) a genotype; (b) an indication that a sample error has occurred; (c) an indication that one or more markers has failed; (d) an indication that one or more alleles has failed; (e) an indication that one or more marker errors have occurred; and (f) an indication that genotyping should be repeated; and (iii) producing one or more confidence parameters that qualify the result obtained in step (ii).

Preferred confidence parameters include, but are not limited to, average number of genetic markers amplified per sample, marker reliability, accuracy, allele dropout per genetic marker, overall allele dropout, amplification failure of genetic markers or of samples, and preferential amplification of genetic markers.

In a second aspect, the invention provides a method of genotyping including the steps of: (i) detecting the presence or absence of one or more genetic markers in each of one or more samples, wherein the or each genetic marker is an allelic form of each of one or more chromosomal genetic markers; and (ii) producing a result for the or each said chromosomal genetic marker selected from the group consisting of: (a) an absence of any allelic form of the chromosomal genetic marker indicating a failed result; (b) a presence of one allelic form of the chromosomal genetic marker indicating a homozygous result; (c) a presence of two allelic forms of the chromosomal genetic marker, indicating a heterozygous result; (d) a presence of two allelic forms of the chromosomal genetic marker wherein the ratio of one allelic form to the other is about 2:1, indicating diallelic trisomy; (e) a presence of three allelic forms of the chromosomal genetic marker, indicating triallelic trisomy; and (f) a presence of three or more allelic forms of two or more different chromosomal markers indicating contamination or trisomy.
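The outcomes (a) to (f) of the second aspect can be sketched as a pair of functions. This is a minimal illustration, not the Triscan implementation: the function names, the use of peak areas as the measure of each allelic form, and the `ratio_tolerance` that decides what counts as "about 2:1" are all assumptions. Any two-allele ratio outside the tolerance is simply reported as heterozygous here, which is a simplification.

```python
def classify_marker(peak_areas, ratio_tolerance=0.3):
    """Classify one chromosomal marker from the peak areas of its detected alleles.

    peak_areas: list of positive peak areas, one per detected allelic form.
    ratio_tolerance: assumed allowance around 2.0 for the "about 2:1" test.
    """
    n = len(peak_areas)
    if n == 0:
        return "failed"                       # (a) no allelic form detected
    if n == 1:
        return "homozygous"                   # (b) one allelic form
    if n == 2:
        ratio = max(peak_areas) / min(peak_areas)
        if abs(ratio - 2.0) <= ratio_tolerance:
            return "diallelic trisomy"        # (d) two forms at about 2:1
        return "heterozygous"                 # (c) two allelic forms
    if n == 3:
        return "triallelic trisomy"           # (e) three allelic forms
    return "unclassified"                     # more than three: not listed per marker


def sample_flag(alleles_per_marker):
    """(f): three or more allelic forms at two or more different markers."""
    crowded = [m for m, areas in alleles_per_marker.items() if len(areas) >= 3]
    return "contamination or trisomy" if len(crowded) >= 2 else None
```

For instance, `classify_marker([2000, 1000])` returns `"diallelic trisomy"`, whereas `classify_marker([1000, 950])` returns `"heterozygous"`.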

In a third aspect, the invention provides a method of genetic data analysis including the steps of: (i) determining the presence or absence of one or more genetic markers in each of one or more samples; and (ii) producing one or more parameters selected from the group consisting of: (a) a reliability parameter for the or each sample; (b) an accuracy parameter for the or each sample; (c) a marker reliability parameter for the or each said genetic marker; (d) an allele dropout parameter that indicates whether one or two allelic forms of the or each genetic marker are present in the or each sample; and (e) an amplification failure parameter with respect to the or each said genetic marker in the or each sample.
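To make parameters (c) to (e) concrete, the sketch below computes per-marker summaries across several samples. The formulas are assumptions chosen for illustration and are not stated in this passage: amplification failure is taken as the fraction of samples with no allele detected, marker reliability as its complement, and allele dropout as the fraction of samples known in advance to be heterozygous in which only one allele was seen. The function name and input layout are likewise hypothetical.

```python
from collections import defaultdict


def summarise_markers(allele_calls, known_heterozygous=frozenset()):
    """Per-marker summary under assumed definitions (see lead-in).

    allele_calls: {sample: {marker: [detected allele sizes]}}
    known_heterozygous: set of (sample, marker) pairs expected to show two alleles.
    """
    totals = defaultdict(lambda: {"samples": 0, "failed": 0,
                                  "het_expected": 0, "dropout": 0})
    for sample, markers in allele_calls.items():
        for marker, alleles in markers.items():
            t = totals[marker]
            t["samples"] += 1
            if len(alleles) == 0:
                t["failed"] += 1              # amplification failure
            if (sample, marker) in known_heterozygous:
                t["het_expected"] += 1
                if len(alleles) == 1:
                    t["dropout"] += 1         # one of two expected alleles missing

    summary = {}
    for marker, t in totals.items():
        summary[marker] = {
            "reliability": (t["samples"] - t["failed"]) / t["samples"],
            "amplification_failure": t["failed"] / t["samples"],
            # Undefined when no heterozygous controls exist for this marker.
            "allele_dropout": (t["dropout"] / t["het_expected"]
                               if t["het_expected"] else None),
        }
    return summary
```

The dropout figure is left as `None` when no sample is known to be heterozygous for a marker, since a single observed allele cannot then be distinguished from true homozygosity.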

Preferably, according to the aforementioned aspects, the one or more genetic markers is a plurality of genetic markers, such as produced by fluorescent multiplex nucleic acid sequence amplification.

In a fourth aspect, the invention provides a machine-implemented method of genotyping that includes the steps recited in the first aspect, and/or the second aspect and/or the third aspect of the invention.

It will be appreciated that the machine-implemented method may comprise any or all of the methods of the aforementioned aspects of the invention, alone or in any combination.

In a fifth aspect, the invention provides a machine programmed to implement the method of the fourth aspect.

According to the fourth and fifth aspects, the machine may be a computer, calculator or any other programmable memory storage device suitable for implementation of the method of the invention.

Preferably, the machine is a computer.

The computer may be a "stand-alone" computer or may be part of a networked system communicably linked to a server that provides the computer-implemented method.

In one embodiment, the computer-implemented method is expressed in VisualBasic programming language.

In a seventh aspect, the invention provides an automated genotyping system comprising: (i) a source of genetic data; (ii) a computer for analyzing genetic data obtained from said source of genetic data; (iii) a data output means that provides a genotyping result produced at step (ii).

Preferably, the source of genetic data is a DNA sequencer, the genetic data being in the form of one or more fluorescence signals.

Throughout this specification, unless otherwise indicated, "comprise", "comprises" and "comprising" are used inclusively rather than exclusively, so that a stated integer or group of integers may include one or more other non-stated integers or groups of integers.

BRIEF DESCRIPTION OF THE DRAWINGS

Reference is now made to non-limiting embodiments of the present invention described by way of example with reference to the accompanying figures and tables.

Figure 1: Extreme preferential amplification using sex diagnosis as an example. The Y allele peak is much smaller than the X allele. If the Y allele is not detected this male call would be incorrectly called and be misdiagnosed as female.

Figure 2: Extreme preferential amplification using CF diagnosis as an example. The ΔF508 allele peak is much smaller than the wild-type CFTR allele. If the ΔF508 allele is not detected this carrier call would be incorrectly called and misdiagnosed as unaffected.

Figure 3: Detection thresholds in fluorescent PCR and conventional PCR analysis. Preferential amplification would result in misdiagnosis when analyzed by agarose gels whereas fluorescent PCR analysis would give the correct diagnosis. Allelic dropout would result in misdiagnosis in both cases.

Figure 4: Flowchart describing a computer-implemented embodiment of the first aspect of the invention, herein referred to as "Genefiler". Each of pages 2-9 of Figure 4 sets forth in detail a main menu option shown on page 1.

Figure 5: Flowchart describing a computer-implemented embodiment of the second aspect of the invention, herein referred to as "Triscan".

Figure 6: Screen display from AlleleScan at Sort module.

Figure 7: Screen display from AlleleScan at MplexSelect module.

Figure 8: Screen display from AlleleScan at MarkerCount module.

Figure 9: Screen display from AlleleScan at NonSTRsub module.

Figure 10: Screen display from AlleleScan at STRcount module.

Figure 11: Screen display from AlleleScan at Reliability module.

Figure 12: Screen display from AlleleScan at MarkerReliability module.

Figure 13: Screen display from AlleleScan at AlleleDropout module.

Figure 14: Screen display from AlleleScan at AmpFailure module.

Figure 15: Screen display from AlleleScan at Stats module.

Figure 16: Screen display from AlleleScan at Report1 module.

Figure 17: Screen display from AlleleScan at Card2 module.

Figure 18: Screen display from AlleleScan at MplexSelect module.

Figure 19: Screen display from AlleleScan at ColorCard module.

Figure 20: Screen display from AlleleScan at MarkerReport and ReportHeadings modules.

Figure 21: Screen display from AlleleScan at PeakAreaRatio module.

Figure 22: Screen display from AlleleScan at Graphs module.

Figure 23: Screen display from AlleleScan of Results page.

Figure 24: Screen display of Genefiler User Login.

Figure 25: Screen display of Genefiler marker entry.

Figure 26: Screen display of Genefiler sexing analysis parameter entry.

Figure 27: Screen display of Genefiler marker panel management.

Figure 28: Screen display of Genefiler data analysis and diagnosis.

Table 1 : Comparison of currently available manual genotyping method, commercially-available software and method of the present invention.

Table 2: Multiplex 1 contains the listed markers for an embodiment of the invention. In marker name the number immediately after the C refers to chromosome number. The number after the dash refers to marker designation. For example C13-1 refers to marker 1 on chromosome 13.

Table 3: Multiplex 2 contains the listed markers for an embodiment of the invention. In marker name the number immediately after the C refers to chromosome number. The number after the dash refers to marker designation. For example C21-1 refers to marker 1 on chromosome 21.

Table 4: Sample 1 filter breakdown for an embodiment of the invention.

Table 5: Sample 2 filter breakdown for an embodiment of the invention.

Table 6: Listing of remaining multiplex peaks in Sample 1 for an embodiment of the invention.

Table 7: Markers have been assigned the correct result, which in turn allows the group and sample results to be called. In this instance in this embodiment the result is a normal male sample.

Table 8: Listing of remaining multiplex peaks for Sample 1 Multiplex 2 in an embodiment of the invention.

Table 9: Markers have been assigned the correct result, which in turn allows the group and sample results to be called. In this instance in this embodiment the result is a normal male sample.

Table 10: Listing of remaining peaks for Sample 2, Multiplex 1 run 1 for an embodiment of the invention.

Table 11: In this example, Sample 2 shows that there is an abnormal response from the chromosome 18 markers, indicative of Trisomy 18. The sample is flagged for repeat and placed into the repeat multiplex stack for the next Repeat run template. All associated conditions and parameters are copied.

Table 12: Listing of remaining peaks for Sample 2, Multiplex 2.

Table 13: In this example, Sample 2 Group 21 results indicate that the sample is disomic for 21.

Table 14: Listing of remaining peaks for Sample 2, Multiplex 1, run 2.

Table 15: Example of a repeat of Sample 2. Although one of the groups (sexing) failed, the repeat still shows a suspected trisomy 18 group. Because the sample has been repeated and the maximum number of repeats is one, the final result for Sample 2 is Trisomy 18, male. If the maximum number of repeats were higher, further repeats would be conducted for confirmation.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides methods of genotyping and genetic data analysis. The invention provides processing of multiple genetic marker information within and between samples and/or markers, detecting and reporting events such as failed amplification, allele dropout and preferential amplification that qualify or constrain the reliability and accuracy of genotyping results. Preferably, the present invention provides methods for computer analysis of genetic marker information for the purposes of genotyping. This method provides rapid, reliable high-throughput analysis of genetic marker information in genotyping applications and genetic analysis such as detection of aneuploidy, polyploidy, trisomies, expansive disorders (such as Huntington's disease, Fragile X), diseases involving mutations and deletions such as cystic fibrosis, chimerism, sex and sex aneuploidies, microsatellite instability such as loss of heterozygosity in cancer, forensic analysis, genetic diagnosis, genetic screening, paternity testing, pedigree analysis, linkage analysis and genetic identification such as DNA fingerprinting, although without limitation thereto.

It will also be appreciated that the methods as set forth in the first, second and third aspect may be independent or utilized in any combination to form an integrated system for genetic analysis which, preferably, is embodied as software for use in a computer or other programmable machine environment.

It will be apparent to the skilled person that the present invention is broadly applicable to "genetic analysis". As used herein, genetic analysis and genetic diagnosis are used interchangeably and broadly cover, but are not limited to, detection, analysis, identification and/or characterization of genetic material, and encompass terms such as genetic identification, genetic diagnosis, genetic screening, linkage analysis, paternity testing, genotyping and DNA fingerprinting which are variously used throughout this specification.

As used herein, by "genetic marker" is meant any locus or region of a genome.

The genetic marker may be a coding or non-coding region of a genome. For example, genetic markers may be coding regions of genes, non-coding regions of genes such as introns or promoters, or intervening sequences between genes such as those that include tandem repeat sequences, for example satellites, microsatellites and minisatellites, and genetic markers that are autosomal or sex chromosome markers, although without limitation thereto.

Preferred genetic markers are highly polymorphic and display allelic variation between individuals and populations of individuals.

In particular embodiments, preferred genetic markers are short tandem repeat sequences (STRs), such as are used in a variety of genotyping applications such as genetic identification by DNA fingerprinting, forensic DNA analysis, pre-implantation genetic analysis (also known as pre-implantation genetic diagnosis) and fetal genotyping such as prenatal diagnosis or screening.

The term "nucleic acid" as used herein designates single- or double-stranded mRNA, RNA, cRNA and DNA, said DNA inclusive of cDNA and genomic DNA.

Preferably, genetic marker information is produced, at least initially, by amplification of the genetic markers from a nucleic acid sample.

It will be appreciated that the present invention is applicable to analysis of any quantity or quality of detectable nucleic acid sample or product from nucleic acid samples obtained from any source, inclusive of, but not limited to, plants, bacteria, viruses and animals.

Animal samples may be obtained from any animal species, although typically human or other mammalian sources inclusive of domestic livestock such as cattle, sheep, pigs and horses are contemplated, although without limitation thereto.

Nucleic acid amplification techniques are well known to the skilled addressee, and include polymerase chain reaction (PCR) and ligase chain reaction (LCR) as for example described in Chapter 15 of Ausubel et al. CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley & Sons NY, 1995-1999); strand displacement amplification (SDA) as for example described in U.S. Patent No. 5,422,252; rolling circle replication (RCR) as for example described in Liu et al., 1996, J. Am. Chem. Soc. 118 1587 and International Application WO 92/01813 and by Lizardi et al., in International Application WO 97/19193; nucleic acid sequence-based amplification (NASBA) as for example described by Sooknanan et al., 1994, Biotechniques 17 1077; and Q-β replicase amplification as for example described by Tyagi et al., 1996, Proc. Natl. Acad. Sci. USA 93 5395.

A preferred nucleic acid sequence amplification technique is PCR.

The skilled person will also be aware of still further variations of nucleic acid sequence amplification technology that may be useful in amplifying genetic markers for genetic analysis.

As used herein, an "amplification product" refers to a nucleic acid product generated by a nucleic acid amplification technique.

A "primer" is usually a single-stranded oligonucleotide, preferably having 12-50 contiguous nucleotides, which is capable of annealing to a complementary nucleic acid "template" and being extended in a template-dependent fashion by the action of a DNA polymerase such as Taq polymerase, RNA-dependent DNA polymerase or Sequenase™.

In preferred embodiments, the genetic markers are amplified by"multiplex PCR", which involves a reaction utilizing a plurality of different primer sets (for example, primers for CF and sex) to amplify a plurality of genetic markers so that simultaneous diagnoses can be performed. Preferably, multiplex PCR produces a plurality of different sized products, thereby facilitating discrimination between genetic markers and allelic forms thereof.

PCR reactions utilizing a single set of primers amplifying one specific fragment are referred to herein as a"singleplex PCR"and it will be appreciated that this current invention is also applicable to singleplex PCR..

A preferred PCR system is "fluorescent PCR". This system uses fluorescent primers and an automated DNA sequencer to detect PCR product (Tracy & Mulcahy, 1991, Biotechniques 11 68-75). Fluorescent PCR has improved both the accuracy and sensitivity of PCR for genotyping (Ziegle et al., 1992, Genomics 14 1026-1031; Kimpton et al., 1993, PCR Methods and Applications 3 13-22).

Fluorescent amplification products are electrophoresed using gel or capillary systems and pass a scanning laser beam, which induces the tagged amplification product to fluoresce. The DNA sequencer combined with appropriate software is a source of genetic data generally known as a "Genescanner". Stored data can then be analyzed to provide product sizes and the relative amount of amplification product in each sample.

Fluorescent PCR is highly sensitive, approximately 1000 times more sensitive than conventional gel analysis (Hattori et al., 1992, Electrophoresis 13 560-565).

This allows the detection of a signal far below the threshold that can be obtained from conventional methods. This results in highly accurate and reliable detection even when the signal is very weak or much lower (<1%) than that of the other allele.

A further advantage of fluorescent PCR is that several primers can be multiplexed together since different fluorescent dyes can be simultaneously identified even if the size ranges of the amplification products overlap each other (Kimpton et al., 1993, supra). These different dyes allow identification of one amplification product from the others even if the product sizes are within 1-2 bp of each other. This method has been applied to multiplex amplification of as many as fifteen sets of primers, although relatively high amounts of DNA are required.

Fluorescent PCR has already been successfully applied to genetic screening for cystic fibrosis (Cuckle et al., 1996, British Journal of Obstetrics and Gynaecology 103 795-799), Down syndrome (Pertl et al., 1994, Lancet 343 1197), muscular dystrophies (Schwartz et al., 1992, American Journal of Human Genetics 51 721-729; Mansfield et al., 1993a, Human Molecular Genetics 2 43-50) and Lesch-Nyhan disease (Mansfield et al., 1993b, Molecular and Cellular Probes 7 311-324).

As fluorescent PCR provides accurate quantitative measurements, it is therefore possible to determine the product ratio of one allele relative to the other.

These quantitative measurements allow difficulties of single cell PCR such as allelic dropout and preferential amplification to be investigated. These quantitative measurements from each allele can also be compared with each other, which may give an indication of relative numbers of chromosomes.

"Quantitative PCR" is where the amount of PCR product from each allele is compared, allowing a calculation of the relative number of chromosomes. This method has been applied to the detection of trisomies by utilising fluorescent PCR with polymorphic small tandem repeats (STRs; Adinolfi et al., 1995, Bioessays 17 661-664). The exact genomic function of these DNA markers is unclear, and they are found throughout the genome. STRs can also be used to determine the origin of the extra chromosome and, if maternally derived, whether the extra chromosome is derived from meiosis I or meiosis II (Kotzot et al., 1996, European Journal of Human Genetics 4 168-174).
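By way of illustration only, the dosage comparison described above may be sketched as follows. The function name, the result labels and the ratio window (1.8 to 2.4) are assumptions made for this sketch and are not values taken from this specification; in practice such limits would be user defined.

```python
from typing import Sequence


def classify_marker_dosage(
    peak_areas: Sequence[float],
    lower_ratio: float = 1.8,  # assumed lower bound of the 2:1 dosage window
    upper_ratio: float = 2.4,  # assumed upper bound of the 2:1 dosage window
) -> str:
    """Classify one STR marker from the peak areas of its detected alleles."""
    if any(area <= 0 for area in peak_areas):
        raise ValueError("peak areas must be positive")
    count = len(peak_areas)
    if count == 0:
        return "failed"              # no allele detected
    if count == 1:
        return "homozygous"          # dosage cannot be assessed from one peak
    if count == 3:
        return "triallelic trisomy"  # three distinct alleles
    if count > 3:
        return "inconclusive"        # not handled by this sketch
    ratio = max(peak_areas) / min(peak_areas)
    if ratio < lower_ratio:
        return "disomic"             # roughly equal dosage
    if ratio <= upper_ratio:
        return "diallelic trisomy"   # approximately 2:1 dosage
    return "inconclusive"            # imbalance too large for a 2:1 call
```

A ratio above the window is reported as inconclusive rather than trisomic because, on the assumptions of this sketch, such an imbalance could equally arise from preferential amplification.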

The method of the invention may be particularly useful for the purposes of genetic identification referred to as STR profiling, also referred to as "DNA fingerprinting", or typing. Preferably, STR amplification products are produced by fluorescent multiplex PCR as hereinbefore described. DNA fingerprinting has been used by forensic science utilizing DNA markers. These STRs are similar to those used for trisomy detection. Their wide variation in length and their distribution between individuals makes STRs preferred genetic markers. In addition, their small size makes them more likely to survive degradation and allow PCR amplification. These STRs are used to build up a series of identifying markers which are then combined to determine the DNA "fingerprint" (Ziegle et al., 1992, Genomics 14 1026-1031).

The STR profiling system has several advantages over alternative earlier methods (Jeffreys et al., 1985, Nature 316 76-79) using single locus probes (SLPs).

It is more sensitive and requires only ~1 ng of DNA compared to upwards of 50 ng for SLPs. It can also be used for highly degraded DNA as it amplifies 100-400 bp compared to the 1,000-20,000 bp lengths produced by SLPs. It can be performed in a single tube; Southern blotting or hybridization is not required; and since alleles are discrete and can be sized precisely, the binning of alleles, a necessity in SLP analysis, is not required.

As used herein, "amplification failure" is where a genetic marker, either singly or in combination with other markers, fails to be amplified.

The reasons for amplification failure of genetic markers obtained from single cells are unclear but are likely to be numerous. They may include problems with sample preparation, e.g. failure to transfer the cell, degradation or loss of the target sequence and/or problems associated with the PCR. The major cause of PCR failure, however, is probably inefficient cell lysis. This is reflected by the fact that failure varies with the cell type used, probably because different cell types, with their different structure and nature, require different lysis procedures.

In the majority of unique sequences examined, PCR amplification failure occurs in the region of 15-30% of single cells (Li et al., 1988, Nature 335 414-419; Holding & Monk, 1989, Lancet Sep. 532-5; Boehnke et al., 1989, American Journal of Human Genetics 45 21-32; Monk et al., 1993, Prenatal Diagnosis 13 45-53).

Amplification failure from blastomeres from preimplantation embryos can be even higher; Pickering et al., 1992, Human Reproduction 7 1-7, for example, reported very low rates (45%) of β-globin gene amplification using single blastomeres in comparison with single cumulus cells and oocytes (83%). Lesko et al., 1991, American Journal of Human Genetics 49 223, also reported high efficiency of amplification of the ΔF508 locus for cystic fibrosis using nested primers in lymphocytes, but lower efficiency when single blastomeres were used.

As used herein, "allelic dropout" (also known as allele dropout) is failure to amplify one or more heterozygous alleles, or the failure of one or more alleles to reach a threshold of detection when one or more alleles from the same marker is present.

Potential problems with the diagnosis of heterozygous individuals using PCR include the possibility of non-detection of alleles (allelic dropout), caused by total amplification failure of heterozygous alleles or the failure of one or more alleles to reach the threshold of detection whilst other alleles successfully amplify. The concept of allelic dropout has been considered, for example, in microsatellite-based detection of cancers (reviewed by Cawkwell et al., 1995, Gastroenterology 109 465-471).

The rate of allelic dropout appears to be inversely proportional to the amount of template in the sample and directly proportional to the number of primers contained in the PCR. At the single cell level, previous work showed an allelic dropout rate of 25%-33% in cells from heterozygote human embryos (Ray & Handyside, 1994, Miami Bio/Technology Short Reports: proceedings of the 1994 Miami Bio/Technology European symposium Advances in Gene Technology: Molecular Biology and Human Genetic Disease 5 46). This suggested that some of the inaccuracy of CF (cystic fibrosis) diagnosis in single cells may have been due, at least in part, to the allelic dropout of either the affected ΔF508 or the unaffected wild-type CFTR allele.

The question of allelic dropout remains controversial because, although most groups describe allele dropout, some groups have reported no allelic dropout even in large numbers of single cell analyses (Verlinsky & Kuliev, 1992, Preimplantation diagnosis of genetic disease: A new technique in assisted reproduction. Wiley-Liss, New York; Strom et al., 1994, Journal of Assisted Reproduction and Genetics 11 55-62). In general though, the concept of allele dropout is becoming widely accepted and therefore specific PCR failure is relevant, particularly when starting nucleic acid template is low, as for example in single cells.

In light of the foregoing, it will be appreciated that "locus dropout" or "whole locus dropout" or marker failure is where neither allele is amplified to a detectable level.

As used herein, "preferential amplification" is the quantitatively unequal amplification of one of two heterozygous alleles. In other words, one allele is amplified preferentially over another.

The issue of preferential amplification has not been widely addressed in the literature, since conventional detection systems are generally unable to quantify the amount of PCR product from each allele. However, fluorescent PCR is an ideal system to identify preferential amplification for two reasons. Firstly, it provides highly accurate and reliable detection of signals even when signal strength is very weak or many times lower (to <1%) than the other allele. Secondly, it is quantitative.

It is possible to use these quantitative measurements to accurately determine the ratio of signal intensity between the two alleles and thus determine the degree of preferential amplification. Examples of extreme preferential amplification are seen in Figures 1 and 2.

Differences in signal intensity in sister alleles can reflect either preferential amplification or allelic dropout. Figure 3 demonstrates the effect that preferential amplification and allelic dropout would have on conventional and fluorescent analysis. In the example shown in this figure, conventional analysis would result in the diagnosis of a CF carrier as unaffected whereas fluorescent analysis would give the correct diagnosis. If the PCR produced allelic dropout rather than preferential amplification, no signal would be obtained with either technique and misdiagnosis of a carrier cell would occur, whereas if preferential amplification occurred, correct signals would be detected by the fluorescent system only.
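The distinction drawn above between locus dropout, allelic dropout and preferential amplification may be sketched, purely by way of example, for a marker known to be heterozygous. The detection threshold and the balance ratio used here are hypothetical values chosen for illustration; the specification does not prescribe them.

```python
def classify_allele_pair(
    height_a: float,
    height_b: float,
    detection_threshold: float = 50.0,  # assumed minimum detectable peak height
    balance_ratio: float = 0.5,         # assumed minimum lower/higher peak ratio
) -> str:
    """Classify the two alleles of a marker that is known to be heterozygous."""
    a_detected = height_a >= detection_threshold
    b_detected = height_b >= detection_threshold
    if not a_detected and not b_detected:
        return "locus dropout"               # neither allele is detectable
    if a_detected != b_detected:
        return "allelic dropout"             # exactly one allele is detectable
    ratio = min(height_a, height_b) / max(height_a, height_b)
    if ratio < balance_ratio:
        return "preferential amplification"  # both detected, unequal amounts
    return "balanced"
```

With a conventional detection system the effective threshold is much higher, so a weak second allele that this sketch reports as preferential amplification would instead be lost and read as dropout.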

Current automated genotyping systems can increase throughput (Table 1), but are limited in several areas that are addressed by the present invention:

I. In contrast to the present invention, current systems can provide analyses for single markers and chromosomes individually but cannot perform automatic comparison between two repeat samples, i.e. the same sample run in different lanes. Automatic chromosome comparison is equivalent to having a second allele caller comparing results from repeat samples. Automated analysis results in much faster analyses and much increased confidence of diagnosis.

II. Current systems can provide analyses for single markers and chromosomes individually but cannot perform automatic comparison between different markers within the same PCR run in the same or different lanes. The present invention provides automatic marker comparison, which is equivalent to having a second allele caller comparing results from different markers. Automated analysis results in much faster analyses and much increased confidence of diagnosis.

III. Current systems can provide analyses for single markers and chromosomes individually but cannot perform automatic comparison between different multiplexes in the same or different samples. Automatic marker comparison provided by the present invention is equivalent to having a second allele caller comparing results from different markers. Automated analysis results in much faster analyses and much increased confidence of diagnosis.

IV. Current systems can provide analyses for single markers and chromosomes individually, but commercial systems treat each marker as independent and do not combine markers to give an individual or overall confidence value for the marker, diagnosis, sample or any of the parameters mentioned in "Summary of the Invention". The present invention can compare two differing results (for example, trisomy detection uses both double-dose and triallelic results) to provide higher confidence in the result.

V. Although current systems can provide analyses for single markers and chromosomes individually, they do not provide an overall diagnosis report for the marker, PCR or sample.

VI. Current systems do not detect and/or flag failed amplifications or any of the parameters mentioned in "Summary of the Invention" to the user. For example, the present invention provides a scorecard with "failed amp" when amplification fails.

VII. Current systems do not detect and/or flag inconsistent results for each marker, sample or PCR to the user. For example, the present invention provides a scorecard with "repeat 21" indicating that a particular multiplex PCR should be repeated.

VIII. Current systems do not detect and/or flag to the user whether particular markers are uninformative.
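As an illustration of the automatic comparison of repeat samples referred to in the list above, the following sketch compares the allele calls of the same sample run in two lanes and produces a simple scorecard. The data layout (a mapping from marker name to a set of allele sizes, with an empty set denoting failed amplification), the function name and the "concordant" label are assumptions made for this example only.

```python
from typing import Dict, FrozenSet


def compare_repeats(
    first: Dict[str, FrozenSet[int]],
    second: Dict[str, FrozenSet[int]],
) -> Dict[str, str]:
    """Compare per-marker allele calls from two runs of the same sample."""
    scorecard = {}
    for marker in sorted(set(first) | set(second)):
        calls_a = first.get(marker, frozenset())
        calls_b = second.get(marker, frozenset())
        if not calls_a or not calls_b:
            scorecard[marker] = "failed amp"  # no alleles in at least one run
        elif calls_a == calls_b:
            scorecard[marker] = "concordant"  # both runs agree
        else:
            scorecard[marker] = "repeat"      # inconsistent, should be rerun
    return scorecard
```

The second run here plays the role of the second allele caller: a marker is accepted only when both runs independently give the same alleles.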

In particular, the method of the invention provides one or more confidence parameters that may qualify the genotyping result obtained. These can be indicative of reliability and/or accuracy or any of the parameters as herein described, and may indicate that further genotyping should be performed (or repeated) before a reliable, useful genotyping result can be obtained.

The present invention will now be described with reference to particular embodiments suitable for use in computer-implemented genetic analysis.

Reference is made to a computer-implemented method described herein with reference to Figure 4, that addresses a variety of parameters that are of importance in genotyping, particularly where genetic marker data is originally produced by fluorescent multiplex PCR amplification. These parameters include, but are not limited to, allele dropout, preferential amplification, marker reliability and accuracy. Commercial genetic analysis systems (such as Applied Biosystems Genotyper, Genemapper™ and Amersham Profiler) do not determine these parameters.

The method is capable of manipulating data generated by other software, such as commercially available software (e.g. Amersham), and performs three main functions:

A. Improved data management, such as but not limited to data interrogation, panel and marker management and function integration;

B. Calculation of confidence parameters, such as but not limited to marker reliability; and

C. Specific functions not found in other commercial software systems, such as but not limited to "average split peaks" and "pull up peak removal".

As used herein, a "pull up peak" in the context of fluorescent PCR products is a fluorescence signal peak of one channel (filter) produced as a result of an overamplified product peak of another channel (filter). The spectral characteristics of most fluorescent dyes overlap each other to some degree. If the signal from a particular marker is very strong, it can "pull up" signal from other dyes and create erroneous lesser peaks. These lesser peaks, when analyzed, could be taken for peaks from other markers and therefore result in incorrect calling and misdiagnosis. Such pull up can be detected in this invention because the scan numbers of pull up peaks will be identical or very similar to the scan numbers of strong peaks from other dyes. The present invention allows detection, analysis, elimination and reporting of pull up peaks using a user definable number of scans.
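A minimal sketch of pull up peak removal by scan number, as described above, is given below. The `Peak` record, the function name and the default window of two scans are assumptions made for illustration; the number of scans is user definable in the method described.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peak:
    channel: int   # dye channel (filter)
    scan: int      # scan number at the peak apex
    height: float  # signal height


def remove_pull_up(peaks: List[Peak], scan_window: int = 2) -> List[Peak]:
    """Drop peaks that coincide with a higher peak in another dye channel."""
    kept = []
    for peak in peaks:
        is_pull_up = any(
            other.channel != peak.channel
            and abs(other.scan - peak.scan) <= scan_window
            and other.height > peak.height
            for other in peaks
        )
        if not is_pull_up:
            kept.append(peak)
    return kept
```

Comparing scan numbers rather than sizes in base pairs keeps the test independent of the size calling of each channel.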

As used herein, a "split peak" in the context of fluorescent PCR products is a fluorescence signal in the form of a peak corresponding to two or more PCR products that differ from each other by a few base pairs. A typical example is the introduction of an extra base into a subset of PCR amplification products, resulting effectively in two peaks. When signal strength is very strong, this can result in the height of peaks being flattened across a range of fragment sizes. This can incorrectly result in several separate peaks several base pairs apart being reported instead of a single peak. These additional peaks could result in incorrect calling and in misdiagnosis. The present invention allows detection, analysis, correction and reporting of split peaks using a user definable number of scans.
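One possible reading of the "average split peaks" function is sketched below: peaks of a single marker and dye channel that lie close together are collapsed into one peak at their mean size. The use of a gap in base pairs (rather than scans), the default gap of 1.5 bp and the choice to keep the greatest height are all assumptions of this sketch.

```python
from typing import List, Tuple

SizedPeak = Tuple[float, float]  # (size in base pairs, peak height)


def _collapse(group: List[SizedPeak]) -> SizedPeak:
    """Replace a group of split peaks by one peak at the mean size."""
    sizes = [size for size, _ in group]
    heights = [height for _, height in group]
    return (sum(sizes) / len(sizes), max(heights))


def average_split_peaks(
    peaks: List[SizedPeak], max_gap_bp: float = 1.5
) -> List[SizedPeak]:
    """Merge runs of peaks whose neighbours lie within max_gap_bp."""
    merged: List[SizedPeak] = []
    group: List[SizedPeak] = []
    for size, height in sorted(peaks):
        if group and size - group[-1][0] > max_gap_bp:
            merged.append(_collapse(group))
            group = []
        group.append((size, height))
    if group:
        merged.append(_collapse(group))
    return merged
```

The gap should be chosen smaller than the repeat unit of the marker, since otherwise genuine neighbouring alleles would be merged as well.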

This embodiment of the method of the invention uses integrated and compiled data for genetic diagnosis and screening including, but not limited to, detection of aneuploidy, polyploidy, trisomies, expansive disorders (such as Huntington's disease, Fragile X), diseases involving mutations and deletions such as cystic fibrosis, chimerism, sex and sex aneuploidies, microsatellite instability such as loss of heterozygosity in cancer, linkage analysis and genetic identification, although without limitation thereto.

Figure 4 provides a flow chart that describes a method that may be referred to herein as "Genefiler" for convenience.

Reference is now made to the flowchart set forth in Figure 4 where the main menu is described in Figure 4/1. This is the main screen of the program and all other menus and forms are based within it. Menu and toolbar access is provided to all areas of the program. To navigate, a path is chosen and then the user moves to the desired option (a reference numeral is indicated next to each option). The operation of each option is described in more detail as follows.

Figure 4/2 describes User Preferences. Users have the ability to set their own password, change folder options, and modify several "in program" parameters such as, but not limited to, flagging poor performance of any marker (see Figure 4/2). Administrators may, in addition, restrict users from various utilities within the program, either by password or by other means such as restricting permission to save, write, modify and/or making utilities unavailable.

1. User Preferences tab - This tab is open upon entering the form, and is active for all users. It is used to define user parameters and allows a user to set his/her password.

2. User Access tab - This tab is not active for any user who is not an administrator. It is used to set the administrative password and shows all current users, their level of access and their access restrictions. Administrators can also remove users here.

3. This check box, when enabled, selects the bin name for an allele every time it is shown in Genefiler (with the exception of individual well results when selected by a user by double clicking on a well result). If the check box is not enabled or the marker does not have bin names then the allele size is shown in base pairs.

4. This selection shows the user's preferences for marker reliability for any marker used in Genefiler. When the reliability of the marker falls below a user defined percentage for a run then Genefiler will alert the user, such as by displaying a warning, and the log reports in the output file from the run will also show the warning.

5. The user may define his/her own password (which requires confirmation) so that access to his/her projects, runs and marker and panel data is protected. An administrative password will also open any user's account.

6. This checkbox, when enabled, will duplicate any homozygous alleles in any output data and for any statistical purposes. Deselecting the box will mean no duplication of homozygous alleles.

7. This allows the user to specify the folders in which the raw data files will be stored and where the result output files will be placed. Upon exiting, Genefiler will create folders in these locations if they are not present.

8. Restrict user access preferences deny access, or allow access only by administrative password, to a particular user for the following areas of Genefiler:
a) Analysis and diagnosis control
b) Project management
c) Run management
d) Marker management
e) Filter setting

9. Remove user enables an administrator to remove a user. A message box will check that the administrator wishes to remove the user, as removal will delete all marker and panel data, any new filter settings, any project data and any run templates that the user has saved.

10. The set password and disable option buttons will either restrict users while allowing an administrator to enter an area via the administrative password, or disable access to the area in question altogether.
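The marker reliability alert of item 4 above may be sketched as follows. The specification does not define how reliability is computed; this example assumes, purely for illustration, that it is the percentage of wells in a run in which the marker amplified successfully. The function names and the default limit of 80% are likewise hypothetical.

```python
from typing import Dict, List


def marker_reliability(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Percentage of wells in which each marker amplified successfully."""
    return {
        marker: 100.0 * sum(outcomes) / len(outcomes)
        for marker, outcomes in results.items()
        if outcomes  # markers with no wells are skipped
    }


def reliability_warnings(
    results: Dict[str, List[bool]], min_percent: float = 80.0
) -> List[str]:
    """Warnings for markers whose reliability is below the user defined limit."""
    return [
        f"{marker}: reliability {percent:.1f}% is below {min_percent:.1f}%"
        for marker, percent in sorted(marker_reliability(results).items())
        if percent < min_percent
    ]
```

The same list of warnings could be shown to the user and written to the log reports in the output file of the run.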

Referring now to Figure 4/3, Marker and Panel Management provides the ability to add, modify, delete and select markers. Each marker has parameters that can be changed. Panels can also be added, deleted or modified.

1. The marker list is a list of all the markers that are currently available for use for a particular user. Users can add, modify parameters or delete markers in each panel.

2. The panel list is a list of all the panels that are currently available for use by a particular user.

3. The add/modify marker buttons enable the user to add a marker (so long as the name has not been previously used), or modify a currently existing marker. The marker parameters form is then shown for the existing marker or for the new marker.

4. Delete marker will delete the selected marker.

5. Select marker will allow the marker to be placed into the selected panel marker list, or allow it to be deleted. If marker is selected, it will bring up its parameters and will show the position, range and dye color of the marker graphically.

6. The first tab on the marker parameters form contains the marker's analysis parameters. These are as follows:
a) Name of marker (must be unique).
b) Channel, which also represents dye. Uses 2-5 for the Amersham Megabace system or a color if the Applied Biosystems platform is used.
c) Minimum size range - the minimum size of DNA fragment for the marker, in base pairs.
d) Maximum size range - the maximum size of DNA fragment for the marker, in base pairs.
e) STR check box - selects whether the marker is a small tandem repeat polymorphism.
f) Marker group. Allows user defined grouping of genetic markers (e.g. all markers on chromosome 21 can be grouped as "21").
g) Sexing marker. Allows selection of whether the marker is found on the X or Y chromosome and can therefore be used to perform a sex determination of the sample.
h) Peak filter preferences. When enabled, this allows additional filter settings for a particular marker if the marker analysis is problematic.

7. The PCR parameters tab of the marker parameters form shows all the panels that contain the conditions as well as the primer concentrations required for the particular PCR master mix.

8. The marker binning parameters tab of the marker parameters form shows the binning attributes of the particular marker. A bin may be added, deleted or modified (with the restrictions that the new bin cannot overlap an existing bin or have the same identifier as another bin within the marker). Each bin can be selected, providing its user definable minimum and maximum base pair size. The auto-assign bins button will allow Genefiler to automatically create a set of allele bins for a particular marker, or will analyse previous runs to create a series of bins.

9. The sexing parameters tab of the marker parameters form gives details of the X or Y chromosome origin of the marker, and its attributes. The presence of the marker, or the absence of the marker (for a female sample in the case of a Y chromosome marker), indicates a possible sex.

10. Delete panel button will delete the selected panel.

11. Clear panel button will clear the selected panel of any markers that are currently contained within it.

12. Add panel will allow a panel to be created (given that panel names must be unique) and provide the user an opportunity to change the preferred filter and diagnosis parameters of the panel.

13. Panel preferences allow the user to define which preferences, if any, the panel has for filter or diagnosis, in order to ease run creation.

14. Select panel will show all the markers within the panel in the selected panel marker list, and graphically show the range and dye color of those markers.

15. The selected panel marker list shows the list of markers within the currently selected panel.

16. Graphical representation of the panel gives four charts, one for each of the possible four dye colors or channels (excluding the color used for the size standard). Markers are shown graphically within these, with the length of the representation an indication of the size range of the marker in base pairs.
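The bin restrictions of the marker binning parameters (no overlap with an existing bin, no duplicate identifier within a marker) and the use of bin names for allele calling may be sketched as follows. The `Bin` record and the function names are hypothetical and are introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bin:
    name: str      # bin identifier, unique within a marker
    min_bp: float  # minimum size in base pairs
    max_bp: float  # maximum size in base pairs


def can_add_bin(existing: List[Bin], new: Bin) -> bool:
    """Check that a new bin is well formed, unique and non-overlapping."""
    if new.min_bp > new.max_bp:
        return False
    for current in existing:
        if current.name == new.name:
            return False  # identifier already used within the marker
        if new.min_bp <= current.max_bp and current.min_bp <= new.max_bp:
            return False  # size ranges overlap
    return True


def bin_name_for(bins: List[Bin], size_bp: float) -> Optional[str]:
    """Return the name of the bin containing the allele size, if any."""
    for current in bins:
        if current.min_bp <= size_bp <= current.max_bp:
            return current.name
    return None
```

Where `bin_name_for` returns `None`, the allele would simply be shown by its size in base pairs, consistent with the behaviour described for markers without bin names.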

Referring now to filter settings described in Figure 4/4, these provide the ability to create a set of parameters to filter peaks down to only those peaks that are required.

1. Delete filter set-this will delete the currently selected filter set. The default filter set cannot be deleted.

2. Add filter set - this will create a new filter set which, when saved, will be available to the user.

3. Any filter set can be user defined and/or modified by the following parameters:
a) The filter set name, which must be unique.
b) The minimum and maximum peak sizes (in base pairs). A filter set can contain up to three individual ranges. Any peak outside of all active ranges will be removed from the active peak list.
c) Minimum peak height and/or peak area. A filter set will remove all peaks below the peak height or peak area filter set limits.
d) Stutter peak removal - will remove any smaller peak within a user defined distance of another.
e) Split peak removal. If the peak has split (as hereinbefore defined), or if PCR artefacts are present, these can be removed by a user defined amount. An option to average split peaks is also available.
f) Peak pull-up is measured by scan numbers. A peak lower than another but with the same or a nearby scan number (within a user defined range) will be removed if peak pull-up is enabled.
g) All peaks outside of the user defined range, or dye, of any or all the markers within a panel will be removed from the active peak list.
h) All peaks outside of the user defined range, or dye, of any or all the alleles within a marker will be removed from the active peak list.
i) All size standard peaks will be removed if this checkbox is enabled.
j) All peaks within a marker will be removed if they are lower than a user defined percentage of the highest peak within the marker.

5. Clear all filter values/switch off all check boxes will clear all filter values or switch off all check boxes, respectively, for a selected filter set. Save filter set will allow the filter set to be available for any run.
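A few of the filter set parameters described above (size ranges, minimum peak height, stutter peak removal and the percentage-of-highest-peak rule) may be sketched as a single pass over the active peak list. The class and field names, and the order in which the filters are applied, are assumptions of this sketch rather than features of the described software.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Peak:
    size_bp: float
    height: float


@dataclass
class FilterSet:
    name: str
    size_ranges: List[Tuple[float, float]] = field(default_factory=list)
    min_height: float = 0.0
    stutter_window_bp: float = 0.0      # 0 disables stutter removal
    min_percent_of_highest: float = 0.0  # 0 disables the percentage rule


def apply_filter_set(peaks: List[Peak], fs: FilterSet) -> List[Peak]:
    """Reduce a peak list to the active peaks allowed by the filter set."""
    if len(fs.size_ranges) > 3:
        raise ValueError("a filter set can contain up to three size ranges")
    active = [p for p in peaks if p.height >= fs.min_height]
    if fs.size_ranges:
        active = [
            p for p in active
            if any(low <= p.size_bp <= high for low, high in fs.size_ranges)
        ]
    if fs.stutter_window_bp > 0:
        # Remove any smaller peak lying within the window of a higher peak.
        active = [
            p for p in active
            if not any(
                q.height > p.height
                and 0 < abs(q.size_bp - p.size_bp) <= fs.stutter_window_bp
                for q in active
            )
        ]
    if active and fs.min_percent_of_highest > 0:
        cutoff = max(p.height for p in active) * fs.min_percent_of_highest / 100.0
        active = [p for p in active if p.height >= cutoff]
    return active
```

Note that the stutter rule, taken literally, also removes a genuine allele lying within the window of a stronger allele, so the window should be set with the repeat unit of the marker in mind.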

Referring now to Figure 4/5, project management allows creation of sample lists devoted to one project, which can then be incorporated into runs. In addition, sample names are used by the results and statistics forms, for example to compare several results from the same sample.

1. Import project - A text file starting with the words "Genefiler Project file" and containing a list of sample names can be selected and imported into the projects form. Only files containing those words will be allowed.

2. Create new project-A unique project name can be created, so long as the project name is not already in use by another project.

3. Add to project - A project, once selected, can have samples added to it and deleted from it. Sample names will be placed into the "add to" project list box.

4. Delete project will delete the selected project.

5. Auto-number samples will create a list of sample names with a user defined prefix or suffix, running from one user defined number to another by a user defined increment.

6. Sample names must be unique and can be added to a project only if they are unique to that project.

7. The project can be saved for other aspects of Genefiler to use, and is user specific.
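The project import and auto-numbering steps above may be sketched as follows. The specification states only that the file starts with the words "Genefiler Project file" and contains a list of sample names; the layout assumed here (header on the first line, one sample name per line) and the inclusive upper limit of the numbering are assumptions made for illustration.

```python
from typing import List

PROJECT_HEADER = "Genefiler Project file"


def parse_project_file(text: str) -> List[str]:
    """Return the sample names from a project file, rejecting other files."""
    if not text.startswith(PROJECT_HEADER):
        raise ValueError("not a Genefiler project file")
    names: List[str] = []
    for line in text.splitlines()[1:]:  # skip the header line
        name = line.strip()
        if not name:
            continue
        if name in names:
            raise ValueError(f"duplicate sample name: {name}")
        names.append(name)
    return names


def auto_number_samples(
    prefix: str, start: int, stop: int, step: int = 1, suffix: str = ""
) -> List[str]:
    """Create sample names from start to stop (inclusive) by step."""
    return [f"{prefix}{number}{suffix}" for number in range(start, stop + 1, step)]
```

Rejecting duplicate names at import mirrors the requirement that sample names be unique within a project.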

Referring now to Figure 4/6, run creation and management enables a user to create a run in which wells have multiple run parameters.

1. Delete run will delete the selected run. If the run contains results data, a message box will alert the user and ask for confirmation.

2. Use existing run template-When creating a run, the option to use an existing run as a template is given. If no template is selected, then the new run will be blank.

3. Create new run-When creating a run, a unique run name must be entered.

4. Modify run - If a current run is selected, changes can be made to the contents and parameters of the run and stored by saving the run. If the run is not saved, a warning is given upon leaving the run management form asking the user to confirm whether the changes are to be saved.

5. Assign sample name to well - The sample name 96 well format tab will present the sample names selected. A sample name can be user defined, added or modified by selecting the appropriate well and manually entering the name. If no project is selected for the sample name then it will be given the generic project selection and added to the generic project list. In addition, the available projects and all the sample names are available for user selection and can be used to populate the sample list of the run, continuing from the last free well in the run.

6. The sample name may be entered directly.

7. A full list of projects is available to choose an appropriate sample list.

8. Selected samples are placed into a list box and can then be used to populate the run.

9. Each well in the run creation process has a number of parameters that can (or must, depending on the parameter) be selected. The selection is made from available parameters set up elsewhere in Genefiler. Available parameters are as follows:
a) Sample name - unconditional.
b) Panel - unconditional.
c) Project - unconditional (automatically added if not selected).
d) Filter - unconditional (default if not selected).
e) Diagnosis (optional).
f) Analysis (on/off).
g) Test (optional).

10. Saving the run finishes the run creation process. If any changes are made and the run is not saved when the form is closed then a message box gives the option of saving the changes. Any unsaved changes will be lost.
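The per-well run parameters and the population of a 96 well run from a sample list may be sketched as follows. The `Well` record, the row-major well order (A1 to A12, then B1 and so on) and the reading of "continuing from the last free well" as the well following the last occupied well are assumptions of this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# 96 well plate in assumed row-major order: A1..A12, B1..B12, ..., H12.
WELLS = [f"{row}{column}" for row in "ABCDEFGH" for column in range(1, 13)]


@dataclass
class Well:
    sample_name: str                 # unconditional
    panel: str                       # unconditional
    project: str = "generic"         # added automatically if not selected
    filter_set: str = "default"      # default filter if not selected
    diagnosis: Optional[str] = None  # optional
    analysis: bool = True            # on/off
    test: Optional[str] = None       # optional


def populate_run(
    run: Dict[str, Well], sample_names: List[str], panel: str,
    project: str = "generic",
) -> None:
    """Add samples to the run, starting after the last occupied well."""
    occupied = [index for index, well in enumerate(WELLS) if well in run]
    start = max(occupied) + 1 if occupied else 0
    if start + len(sample_names) > len(WELLS):
        raise ValueError("not enough free wells in the run")
    for offset, name in enumerate(sample_names):
        run[WELLS[start + offset]] = Well(name, panel, project)
```

Keeping the defaults on the record itself means a well created with only a sample name and a panel is still complete, as the unconditional parameters require.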

Referring to Figure 4/7 describing Analysis and Diagnosis control, the modules other than those indicated by reference numerals 4 and 14 are optionally available "plug-ins".

1. Trisomy detection. This is an analysis that enables a user to selectively check the status of chromosomes for chromosomal abnormalities such as, but not limited to, trisomies.

2. Sex determination. This is an analysis that enables a user to interrogate the data for sex determination. It requires the presence of sexing markers.

3. Sample comparison. This is an analysis that enables the user to interrogate the data for its correlation as a fingerprint with either a user defined genetic fingerprint or a genetic fingerprint obtained from a specified sample.

4. Output-This is a facility that will be provided within the basic Genefiler package that enables the user to create a user defined output file, or place the results data into a user definable database. The user can define the results data selected and the order that the results are provided. Multiple output files from the same data are possible and can be user defined (see reference numeral 7).

5. Analysis control. This is where a user can create, modify or delete an assessment made up of several different analyses, so that a well and/or sample can be subjected to multiple analyses simultaneously.

6. Select analyses. Different analyses can be selected for a particular assessment. Each analysis is saved separately and named uniquely.

7. Add/remove analyses from assessment. Analyses can be added or removed from an assessment. If there are no analyses within an assessment, the assessment cannot be saved and will be lost upon exiting.

8. The assessment, once saved, can be used to interrogate the data using multiple analyses (e.g. trisomy detection using one set of parameters, trisomy detection using a second set of parameters and an output file).

9. Trisomy detection parameters. Parameters include:
a) Diallelic trisomy detection. Using the user defined ratio of a heterozygous marker result, quantitative testing can be undertaken to provide results indicative of a triallelic status. The higher peak must fall between the upper and lower user defined percentages of the lower peak. If so, the marker is flagged as diallelic trisomy.
b) Group control parameters. These are based on the number of markers that indicate a particular repeat. User defined percentage figures can be modified for user defined parameters such as, but not limited to, the number of markers being disomic, failing, diallelic trisomy, triallelic trisomy, or homozygous. These parameters will give the group a user defined result, based on a user defined hierarchy. (For example, triallelic trisomy has priority 1 and, if set to 0 percent, will always give its defined group result whenever it is present, even if the other 3 markers in a group have failed.) This intelligence or "fuzzy logic" system allows diagnostic accuracy to be manipulated and controlled in a much more efficient manner, particularly for validation, accreditation and quality control systems.
c) Maximum number of repeats. This is the number of repeats that a sample can be subjected to before a final result is given (for example, if the maximum number of repeats is set to 2 then the sample can be run and then repeated twice before its final result is given).
d) Minimum number of markers for diagnosis. This allows the user to set a minimum number of markers before diagnosis is conducted. Increased markers increase diagnostic confidence.
e) Group results. This allows the control of group results and how these affect the sample result status. There are two types available, both based on the number of groups achieving a certain result.

10. Sex determination parameters. The parameters of each sexing marker are reviewed, including its type of sexing marker (XY, X or Y), how a result will affect the overall sexing of the sample, and the minimum number of markers required for a sex determination.

11. Enter fingerprint. This enables the user to enter a user-defined reference genetic fingerprint, based on a user-selected panel, which can be used to interrogate the run and determine the correlation between each sample result and the reference fingerprint.

12. Select Run and Well. This is the same as reference numeral 11, but a run well is used as the reference genetic fingerprint.

13. Sample comparison parameters. These are user defined additional allele, false allele and/or allele dropout parameters. The number of these within a sample will be marked if they are above user-defined limits (fail or mark).

14. Output parameters. Allows user-defined selection from a list of well/sample/result attributes in any order for output. The type of output file is also user defined, as a text file, Microsoft Excel format or database. The user may define the output file name (the destination is determined within the user preferences in settings).

Figure 4/8 sets out Start Analysis with the following modules.

1. Select run and filename. Allows selection of an individual run or up to four runs. In addition a run folder can be user selected, and any raw data file with an associated run template will be automatically selected.

2. The run folder is checked for files that are compatible with the raw data files that Genefiler can use. If none are found, the user is alerted.

3. This determines if the data is in a format suitable for analysis. For example, in a text file rather than tab-delimited format.

4. If all internal checks are positive, analysis is started. If the run has been conducted before, the new run will overwrite the previous one. Each well of the run is processed as described in 5 and 6 below, using the run parameters previously set up and saved as in Figure 4/6. Each well may be analyzed differently depending on its own user-selected parameters (see Figure 4/6, reference numeral 9).

5. All generic peaks are interrogated using the user defined filter parameters previously set up on Figure 4/4 and saved as in reference numeral 5 of Figure 4/4.

Peaks of interest remain active, whilst all other peaks become inactive.

6. With only peaks of interest active, analyses set up on Figure 4/7 and saved as in reference numeral 8 of Figure 4/7, are then used to interrogate the remaining peaks, and produce a run result for each well.

7. The results are collated and Genefiler creates a run file to the user's specifications (see Figure 4/7, reference numerals 4 and 14).

8. Sample results and Analyses can be viewed in the results file (see reference numeral 10 of Figure 4/9,10), or samples can be viewed and compared with all previous runs with that sample (Figure 4/9, reference numeral 13).

Referring to Figure 4/9, upon completion of a run, the form automatically opens and shows the run results. If there are multiple runs, the last completed run will be shown.

1. Results from a run may be presented as a 12x8 array 96 well format.

2. Results from a run may be presented based on user defined individual sample, marker, or panel.

3. Results from a run may be viewed using user-defined charts and statistics.

4. If a user selects a project, this will open up the runs that have been completed containing samples from that project.

5. If a user selects a run, this will open up the panels used within that run.

6. If a panel is selected, then the results for that panel are shown as the number of markers that have produced one or more alleles. The color red indicates that the sample has failed. The color yellow indicates that the sample has not reached the user-defined level required to pass. The color green shows that the sample has succeeded in equalling or surpassing the level required to pass. The color blue indicates that the sample's sizing standard has failed and the sample has been subjected to a total lane failure. Parameters for color assignment are user definable.

7. An individual analysis may be selected, in which case the 96 well format will change to show the user defined definitions of the results of that particular analysis.

8. A group may be selected. The 96 well format will change to show the results from that group of markers.

9. A marker may be selected. The 96 well format will change to show the allele sizes (if user settings are set to show allele sizes) or bin names of the marker alleles for each instance the marker was used in the run.

10. See 7 to 9 above.

11. In the sample management tab, selecting a project will provide all the samples contained within the project.

12. Selecting a sample will show all the run and well identification (ID) details for every instance the sample has been run, or every run template containing the sample. If the run has not yet been analyzed, "Not yet analyzed" will be shown next to the run name and well identification.

13. Selecting an assessment and an analysis will show the results for all the runs that the sample has been subjected to. Selecting an individual run result will provide the panel data for that run and marker alleles/bins and individual marker result.

14. Selecting the project, run, panel or marker will show a variety of statistics concerning the marker. In addition any alerts from the run concerning a marker or sample dropping below the user defined limits in settings will be flagged, as well as any other comments and general statistics of the run.

The computer-implemented method described in Figure 5 may be herein referred to as"Triscan", for convenience. This method is optimized for analysis and detection of chromosomal abnormalities such as aneuploidies and may be used either as a"stand-alone"method or as a"plug-in"to Genefiler.

1. Open Genefiler. This will start Genefiler. Using a particular user name will access that user's marker panels, saved runs and analyzed data. Opening View > Marker + Panel management will open the marker and panel management form if required (see Figure 4/3).

2. Modify and add markers. Markers specific to a particular chromosome will infer the trisomy status of that chromosome. Markers for one chromosome can be grouped accordingly. If marker chromosomes have been previously defined in Genefiler (Marker parameters, General parameters; Figure 4/3), then this aspect can be performed automatically.

3. If the sexing marker checkbox is activated (Marker parameters-general parameters ; Figure 4/3), Genefiler requires the user to complete the sexing marker parameters before exiting the form. These parameters define how the marker alleles show a particular sex.

4. Selecting or creating a panel is required, as trisomy detection analyses are linked to a particular panel (see Figure 4/3 for creating or modifying a panel).

5. The marker and panel settings are saved upon exiting the marker and panel management form. If required, the user may now open the Analysis and diagnosis form (View > Analysis + Diagnosis control).

6. Selection of the trisomy detection tab allows creation, modification or deletion of a particular trisomy detection analysis (see Figure 4/7).

7. For new runs, panel selection is required for completion of the analysis.

8. A diallelic trisomy signal is a marker that shows a 2:1 ratio with 2 peaks. This is translated as a user-definable percentage range of the lower peak (i.e. 170% to 240% indicates that any peak that is between 170% and 240% of the lower peak is flagged as diallelic). If no higher percentage is given then Genefiler will take any peak size around the lower percentage. If no percentages are given then diallelic calling will be deactivated.

9. Marker and Group control. Each marker within a group will be assigned a particular result, dependent on the distribution and number of alleles within it. The possible results are:

a. Fail. The marker shows no alleles.

b. Homozygous. The marker shows only one allele. This is a result that will not give a trisomy status and is therefore considered uninformative.

c. Disomic. The marker shows two alleles that have been found to have peaks of similar size and therefore are not a diallelic trisomy.

d. Diallelic trisomy. The marker shows only two alleles, but the ratio between the two alleles matches a user-definable ratio (approximately 2:1), indicating that there are three chromosomes present, and thus a trisomy for that chromosome.

e. Triallelic trisomy. The marker shows three peaks, indicating that one allele came from each of three chromosomes, and thus a trisomy for that chromosome.

The number of markers that give a particular result for a group can be flagged as a percentage for a particular group result. A hierarchy for each type of result possible can be user defined.

The group control allows a number of groups with a result to affect the overall sample result. Again the number of groups can be user defined and adjusted to provide appropriate results.

10. The maximum number of repeats is user defined, and allows a user to stop a sample from being continually repeated if abnormal or failing. It allows for automation of sample handling and defining repeat status. The number of markers required for a diagnosis can be user defined to stop a limited number of markers from giving an unconfirmed result.

11. Analysis parameters can be newly created, copied, or modified from an existing analysis and saved. Any number and variety of different user defined analyses for trisomy detection can be created and used independently and concurrently to interrogate a sample.

12. The sex determination tab will bring up a review of the marker parameters for each sexing marker available. Again, a panel must be selected.

13. Sexing parameters may be changed here or in the marker parameters window (View > Marker & panel management > select marker (open marker/modify marker/double click on visual display of marker) ). Multiple sexing marker management is user defined to determine the results from all different permutations.

14. Sex determination analyses can be saved. Any number of different types of determination can be saved and used.

15. Analysis control tab enables a number of analyses to be compiled and linked to one assessment to interrogate a sample's result. Any number of saved analyses can be added to one assessment. In addition, any output file format can be added to an assessment. Once an appropriate trisomy detection and/or sex determination analysis is added then an assessment can be saved.

16. After exiting from the Analysis and diagnosis control form, filter settings can be opened to create or modify a filter setting appropriate for the panel to be used. Again, save and exit to the main menu.

17. A new run template with the sample names and the Assessment set up in steps 6-15 is created. Add the appropriate panel, filter setting and project, and save the run.

18. The run is then performed: the template sample is subjected to PCR using the panel, and the product fragments are separated using a fragment analyzer such as a DNA sequencer, e.g. Megabace 1000.

19. A generic peak report is created from the fragment analyzer and the appropriate run template is chosen to conduct the analysis (see Figure 4/8).

20. Genefiler determines peaks of interest (those remaining after the filter settings have been used to eliminate any unwanted background or artefact peaks) and conducts analysis using the user defined parameters previously set up by the trisomy detection, sex determination and any other analysis saved within the assessment used.

21. Markers are automatically assigned a marker result. The markers within a group are subjected to the previous user defined parameters within the trisomy detection analysis. Groups again will show the sample result.

22. If the sample has been previously run using trisomy detection analysis markers, the samples are automatically compared. If the sample's markers are consistent (i.e. the same markers show the same alleles) then results are compared, favouring the new results.

23. If the sample has not been previously run and the maximum number of repeats is above zero (if at zero then the results of this run are given) then the sample is assessed for repeating and the results are stored for that repeat to be run.

24. If less than the maximum number of repeats and the result has failed or is abnormal, then the sample is again assigned for repeating using the same user defined parameters automatically. All repeated runs of a sample are saved.

25. If the maximum number of repeats has been reached then the sample is one of the following:- a: a failed sample, which indicates either a total failure of the sample or a result which is inconsistent; b: a sample that is trisomic for a particular group ; c: the sample is disomic (No repeat is required in this case).

26. If the sample has failed then it is flagged as such and no repeats are sent to any repeat run templates.

27. If the sample is Trisomic then it is flagged as such.

28. If the sample is Disomic then it is flagged as a Normal disomic sample.

29. The sample output is performed and/or stored appropriately. This process is repeated with a new sample (Steps 18-29).

Reference is now made to a computer-implemented embodiment of the third aspect of the invention. This embodiment may be referred to herein as "AlleleScan".

AlleleScan may be used as a standalone method or as a module within "Genefiler" as described in Figure 4. This computer-implemented embodiment of the method according to the third aspect may provide a "front end" system for genetic analysis, and is described with particular reference to Figures 6-23.

AlleleScan provides statistical analysis of genetic marker data from multiple genetic markers within and between samples, markers, bins, multiplexes, panel sets and groups.

These analyses provide one or more confidence parameters, to be described in more detail hereinafter, by providing the following functions in the following example:

NumberSamples: Counts the number of samples to be analyzed. This module starts at the first sample and counts each time the sample number changes, giving a variable that defines the total number of samples.

Sort: Sorts the sample information first by sample number in ascending order and secondly by category (marker) alphabetically. This module also deletes information left-over from previous runs (Figure 6).

Counter: Counts how many cells contain information. This tells the program where the inputted information ends in the spreadsheet.

MplexSelect: The user can select between different multiplexes (Figure 7). This module determines which multiplex has been selected (B10 in the example in Figure 7). In this case multiplex number 1 has been selected. Multiplex 1 is located in Columns G and H, with the individual markers listed along with their STR/non-STR status.

MarkerCount: Analyses each marker in each sample, denoting the presence of a marker peak with a "1", and the absence of a marker peak with a "0" in the appropriate column (Figure 8).

NonSTRsub: Differentiates STR data from non-STR data. This module analyses the selected multiplex and defines the number of non-STR markers within the multiplex and their relative order (Figure 9). The module then opens the ASMain page and moves the non-STR data to its own column/s.

STRcount: With the non-STR data removed to separate columns, this module sums the number of STRs that have successfully amplified for each sample (Figure 10).

Reliability: This module analyses the information generated by the STRcount module and determines user-defined reliability for each sample, denoting a "1" for a reliable sample or a "0" for a failed sample in the appropriate column (Figure 11).

Accuracy: This module analyses the information generated by the STRcount module and determines the user-defined accuracy for each sample, denoting a "1" for an accurate sample or a "0" for a non-accurate sample in the appropriate column (Figure 12).

StartReliables: As the number of non-STRs within a multiplex can vary, the program must define where to place further information. Individual marker reliability data is placed in the columns adjacent to non-STR reliability. As such, this module selects the next unused column for the placement of subsequent data.

MarkerReliability: Using the information from StartReliables this module generates individual marker reliability for each sample across the spreadsheet (Figure 13).

AlleleDropout: In the spreadsheet adjacent to the individual marker reliability data, this module generates individual marker allele dropout for each sample (Figure 14). The program assumes each marker is heterozygous, denoting a "1" when a single peak (homozygote) is detected for the marker or a "0" when both peaks (heterozygote) are detected.

AmpFailure: In the spreadsheet adjacent to the individual marker allele dropout data, this module generates individual marker amplification failure for each sample (Figure 15), denoting a "1" where no peak is detected, or a "0" where 1 or more peaks are detected.

DeleteEnd: Using the information collected in Counter, the program proceeds to the end of the inputted data and deletes all information below it in the spreadsheet.

Stats: Using the data generated by the previous modules Stats calculates run reliability, accuracy, mean STRs/sample along with individual marker reliability, allele dropout and amplification failure (Figure 16).

Report1: This module pastes the run statistics for reliability, accuracy and mean STRs/sample into the ASReport sheet (Figure 17).

Card2: Draws a table representing a 96 well plate on the ASReport sheet and pastes the number of STRs that amplified per sample into the appropriate cell of the table (Figure 18).

MplexSelect: This module is re-run to reset the selected multiplex variables.

ColorCard: Applies color coding to the table created by Card2, using user defined variables for reliability and accuracy (Figure 19).

MarkerReport: Pastes the individual marker statistics generated in ASMain into a code-created table, to show reliability, allele dropout and amplification failure (Figure 20).

ReportHeadings: Inserts the required headings for the report in the ASReport sheet (Figure 20).

PeakAreaRatio: In ASMain this module calculates the peak area ratio of each marker for each sample (calculated as the second peak/the sum of the two peaks). This value is inserted into column I (Figure 21).

Graphs: Pastes the values generated in the previous module for peak area ratio into ASGraphs. This information is then formatted for use with Excel's Graph wizard (Figure 22).

It will be appreciated that the above functions are applicable to samples, markers, bins, multiplexes, panel sets and groups.
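The per-marker flags and the peak area ratio described in the module list above can be sketched in a few lines. This is only an illustrative sketch, not the spreadsheet implementation of AlleleScan: the function names, the marker names STR_A and STR_B, the representation of a sample as a mapping from marker name to number of detected peaks, and the reading of a marker statistic as the fraction of samples carrying a flag are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def marker_flags(peak_count: int) -> Dict[str, int]:
    """Return 0/1 flags for one marker in one sample, per the module descriptions."""
    return {
        "MarkerCount": 1 if peak_count >= 1 else 0,    # a marker peak is present
        "AlleleDropout": 1 if peak_count == 1 else 0,  # single peak, marker assumed heterozygous
        "AmpFailure": 1 if peak_count == 0 else 0,     # no peak detected
    }


def str_count(sample: Dict[str, int], str_markers: Sequence[str]) -> int:
    """Number of STR markers that amplified (one or more peaks) in a sample."""
    return sum(marker_flags(sample.get(m, 0))["MarkerCount"] for m in str_markers)


def flag_rate(samples: List[Dict[str, int]], marker: str, flag: str) -> float:
    """Fraction of samples in which `flag` is set for `marker` (assumed statistic)."""
    if not samples:
        raise ValueError("at least one sample is required")
    flags = [marker_flags(s.get(marker, 0))[flag] for s in samples]
    return sum(flags) / len(flags)


def peak_area_ratio(first_area: float, second_area: float) -> float:
    """Second peak area divided by the sum of the two peak areas."""
    return second_area / (first_area + second_area)


# Hypothetical run of two samples: number of peaks detected per marker.
samples = [{"STR_A": 2, "STR_B": 1}, {"STR_A": 0, "STR_B": 2}]
print(str_count(samples[0], ["STR_A", "STR_B"]))
print(flag_rate(samples, "STR_A", "AmpFailure"))
print(peak_area_ratio(85781.54, 82432.46))
```

A ratio close to 0.5 corresponds to two peaks of similar area; which of the two peaks counts as the "second" is taken here to be the order in which they are passed.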

All run information is presented together in the Results page (Figure 23).

Examples of Genefiler using Trisomy detection and sex determination plug-ins

Introduction

To demonstrate the real-time working of computer-implemented methods of the invention, examples of two samples are presented. One is a disomic sample (sample 1) that required no repeating; the other (sample 2) presented as a Trisomy 18 and gave a result of Trisomy 18 upon repeating. Both samples were run twice with two different multiplexes (multiplex 1 and 2 for each sample).

These examples describe the process of creation of the marker panel, filter set, project and analysis assessment to show the process involved in producing a result.

This example will refer to the Genefiler flow diagrams (Figure 4) and the associated description, and to the Triscan flow diagrams (Figure 5, which incorporates the sex determination plug-in).

The full generic peak report peaks for all results covering both samples (sample A and sample B) consists of many hundreds to thousands of data sets (not shown).

Start up of Genefiler and creation of Marker Panels

After the splash screen, Genefiler presents a User login screen. Once the user has input a name or entered under an existing user name (with the correct password if required), the user is presented with the main menu. Selecting View will bring up the functions required to create and analyse runs. Selecting View > Markers + Panel management will show the screen shown in FIG. 24.

Choosing "Add marker" here will bring up the new marker form. This requires a name. In this example, the first marker in Multiplex 1 is "Sexing1". Pressing OK will call up the marker parameters form as shown in FIG. 25.

The user-defined marker parameters are then completed. The channel number indicates the dye of the primer set used (in this example, FAM labelled). The minimum and maximum size ranges for the marker, its group name (sex in this case) and chromosome (the marker is found on both chromosome X and Y) are also defined. In addition, in this example this marker is a sexing marker, as indicated in the appropriate check box.

Selecting the sexing tab of the marker parameters will allow the user to input the user definable parameters specific to the sexing aspect of the marker (FIG. 26).

Once all user-definable markers and marker parameters are entered, selecting "Add panel" will enable creation of a panel. This is called Multiplex 1 in this example. Selecting a marker and selecting ">" will add the marker to the panel (Table 2). Once all the markers for Multiplex 1 are selected, a new panel, Multiplex 2, is created and populated as in Table 3. Selecting Multiplex 1 will provide the screen display shown in FIG. 27.

Once complete, exiting the Marker and panel management form will save the current status.

Filter settings and project generation

In this example, the following filter settings were used (saved as DEFAULT):

1. Remove standard was activated.

2. Peak area is automatically calculated.

3. Peak size: two peak size ranges were selected.
a. Peak sizes from 0 to 90 base pairs (peak sizes to be removed).
b. Peak sizes from 400 to 800 base pairs (peak sizes to be removed).

4. Minimum Peak Height was set to 300.

5. Minimum Peak Area was set to 1500.

6. Marker allocation was turned on (see panel for details).

7. Split peak correction was set to 1.6 base pairs.

8. Pull up of 1 scan of another peak of a different channel was on.

9. Remove peaks lower than 31% of the highest within a marker was turned on.

10. Additional peak over 3 removed was also turned on.
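As an illustration of how several of these filter settings narrow a generic peak list down to peaks of interest, the following minimal sketch applies the size-range, minimum height, minimum area, 31%-of-highest and maximum-of-three-peaks settings. It is not Genefiler's implementation. The `Peak` class and `filter_peaks` function are hypothetical names; the sketch assumes that peaks have already been allocated to markers, that the 31% comparison is made on peak height, and that the lowest peaks are the ones dropped when a marker has more than three. Standard removal, split peak correction and pull-up removal are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Peak:
    size_bp: float
    height: float
    area: float
    marker: str  # marker the peak was allocated to (assumed already done)


def filter_peaks(
    peaks: Sequence[Peak],
    excluded_sizes: Sequence[Tuple[float, float]] = ((0, 90), (400, 800)),
    min_height: float = 300,
    min_area: float = 1500,
    min_pct_of_highest: float = 31,
    max_peaks_per_marker: int = 3,
) -> List[Peak]:
    """Return the peaks of interest that survive the filter settings."""
    # Settings 3 to 5: drop peaks in excluded size ranges or below the minima.
    kept = [
        p for p in peaks
        if not any(lo <= p.size_bp <= hi for lo, hi in excluded_sizes)
        and p.height >= min_height
        and p.area >= min_area
    ]
    by_marker: Dict[str, List[Peak]] = {}
    for p in kept:
        by_marker.setdefault(p.marker, []).append(p)

    result: List[Peak] = []
    for marker_peaks in by_marker.values():
        highest = max(p.height for p in marker_peaks)
        # Setting 9: remove peaks lower than the given percentage of the highest.
        strong = [p for p in marker_peaks
                  if 100 * p.height / highest >= min_pct_of_highest]
        # Setting 10: keep at most three peaks (assumed: the three highest).
        strong.sort(key=lambda p: p.height, reverse=True)
        result.extend(sorted(strong[:max_peaks_per_marker], key=lambda p: p.size_bp))
    return result


demo = [
    Peak(85.0, 5000.0, 20000.0, "C131"),      # inside the 0-90 bp range
    Peak(120.9, 820.2, 2460.6, "C131"),
    Peak(150.0, 250.0, 2000.0, "C181"),       # below minimum height
    Peak(160.0, 1000.0, 1200.0, "C181"),      # below minimum area
    Peak(234.4, 22003.4, 70410.88, "C211"),
    Peak(236.4, 15318.2, 52081.88, "C211"),
    Peak(240.0, 5000.0, 16000.0, "C211"),     # under 31% of the highest C211 peak
]
print([p.size_bp for p in filter_peaks(demo)])
```

The excluded size ranges are passed as a parameter so that a different panel can use different limits.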

Project generation is not defined here since specific samples were used. These samples were both from the same project, but were based on different gels (in order to get a variation in result type, i.e. one disomic sample and one trisomic sample).

Analysis & Diagnosis

The analysis and diagnosis form is found via View > Analysis and Diagnosis Control. Selecting the tab "Trisomy detection" will bring up the screen shown in FIG. 28.

This enables a user to define the result of a sample by selecting how the marker result will affect the result of a group. The group results will affect the result of the sample.

Genefiler will assign marker status to each marker:

No peaks: Fail

One peak: Homozygous

Two peaks: Heterozygous (trisomy detection will change this to Disomic)

Three peaks: Abnormal (changed to Triallelic trisomy if used for aneuploidy detection)

Diallelic trisomy status is the instance where three alleles are present, but two are the same size, resulting in a seemingly disomic sample, but with a 2:1 ratio in peak area. Genefiler allows a user-defined range to be selected, but in this example if the higher peak is 170% or more of the lower peak then it is considered a 2:1 ratio and therefore diallelic.
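The marker status assignment above can be sketched as follows. This is a minimal illustration rather than Genefiler's own code: the function name `classify_marker` and its parameters are hypothetical, peak areas are used for the ratio because the text refers to a 2:1 ratio in peak area, and a missing upper percentage is read as "no upper limit".

```python
from typing import Optional, Sequence


def classify_marker(
    peak_areas: Sequence[float],
    lower_pct: Optional[float] = 170.0,
    upper_pct: Optional[float] = None,
) -> str:
    """Assign a trisomy-detection marker result from the areas of its peaks."""
    n = len(peak_areas)
    if n == 0:
        return "Fail"
    if n == 1:
        return "Homozygous"
    if n >= 3:
        # The filter settings leave at most three peaks per marker.
        return "Triallelic trisomy"
    low, high = sorted(peak_areas)
    if low <= 0:
        raise ValueError("peak areas must be positive")
    if lower_pct is None:
        # No percentages given: diallelic calling is deactivated.
        return "Disomic"
    ratio_pct = 100.0 * high / low
    in_range = ratio_pct >= lower_pct and (upper_pct is None or ratio_pct <= upper_pct)
    return "Diallelic trisomy" if in_range else "Disomic"


# Peak areas taken from Table 8 (Sample 1, Multiplex 2).
print(classify_marker([85781.54, 82432.46]))  # C211 peaks of similar area
print(classify_marker([35859.42, 20453.9]))   # C213 peaks, ratio above 170%
```

With the 170% threshold of this example, the two C213 areas give a ratio of roughly 175% and the marker is called diallelic, whereas the two C211 areas are nearly equal and the marker stays disomic.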

If greater than or equal to 80% of markers within a group are Disomic, then the group is disomic and a preference of 4 is assigned.

If greater than or equal to 60% of markers within a group have failed, then the group has failed and a preference of 3 is assigned.

If greater than or equal to 60% of markers within a group are Diallelic trisomy, then the group is a suspect trisomy and a preference of 2 is assigned.

If greater than 0% (i.e. any incidence) of markers within a group show three peaks (triallelic trisomy), then the group is a suspect trisomy and is assigned a preference of 1.

If 100% of (i.e. only if all) markers within a group show a homozygous (1 peak) result, then the group is considered uninformative and is assigned a preference of 5.

The preference introduces a hierarchy into the group result to resolve cases where more than one response is correct. This intelligence or "fuzzy logic" system allows diagnostic accuracy to be manipulated and controlled in a much more efficient manner, particularly for validation, accreditation and quality control systems. For example, if three markers are within a group and two have failed (triggering a Failed response with a preference of 3), but the third marker is a triallelic trisomy (triggering a Suspect trisomy response with a preference of 1), the group result is Suspect trisomy, not Failed.
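The group-calling rules and their preference hierarchy can be sketched as below. This is an illustrative sketch under stated assumptions, not the Genefiler implementation: `GroupRule`, `EXAMPLE_RULES` and `call_group` are hypothetical names, percentages are taken over all markers in the group, and the "No result" fallback when no rule is triggered is an assumption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GroupRule:
    marker_result: str    # marker result being counted
    threshold_pct: float  # user-defined percentage of markers in the group
    strict: bool          # True: "greater than"; False: "greater than or equal to"
    group_result: str     # group result given when the rule is triggered
    preference: int       # 1 is the highest priority


# The five rules of this example, with their preferences.
EXAMPLE_RULES = (
    GroupRule("Triallelic trisomy", 0, True, "Suspect trisomy", 1),
    GroupRule("Diallelic trisomy", 60, False, "Suspect trisomy", 2),
    GroupRule("Fail", 60, False, "Failed", 3),
    GroupRule("Disomic", 80, False, "Disomic", 4),
    GroupRule("Homozygous", 100, False, "Uninformative", 5),
)


def call_group(
    marker_results: Sequence[str],
    rules: Sequence[GroupRule] = EXAMPLE_RULES,
    fallback: str = "No result",  # assumed outcome when no rule is triggered
) -> str:
    """Return the group result of the triggered rule with the best preference."""
    if not marker_results:
        return fallback
    counts = Counter(marker_results)
    total = len(marker_results)
    triggered = []
    for rule in rules:
        pct = 100.0 * counts[rule.marker_result] / total
        hit = pct > rule.threshold_pct if rule.strict else pct >= rule.threshold_pct
        if hit:
            triggered.append(rule)
    if not triggered:
        return fallback
    return min(triggered, key=lambda r: r.preference).group_result


# Two failed markers and one triallelic trisomy in the same group.
print(call_group(["Fail", "Fail", "Triallelic trisomy"]))
```

In the three-marker case described above, both the Failed rule (two of three markers, above 60%) and the triallelic rule are triggered, and the rule with preference 1 decides the group result.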

In this example the maximum number of repeats is set to 1, meaning that if a sample requires repeating, the next run is the last and an overall result must be given irrespective of the individual run results.
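The effect of the maximum number of repeats on sample handling can be sketched as a small decision function. This is a hedged illustration only: the function name `next_action`, the three sample result labels and the returned strings are assumptions chosen for the example, and the sketch follows the rule that a disomic sample needs no repeat while a failed or abnormal sample is repeated until the maximum is reached.

```python
def next_action(sample_result: str, repeats_done: int, max_repeats: int) -> str:
    """Decide whether a sample is repeated or given its final result."""
    if sample_result not in {"Disomic", "Trisomic", "Failed"}:
        raise ValueError(f"unexpected sample result: {sample_result!r}")
    if sample_result == "Disomic":
        # A disomic sample requires no repeat.
        return "final: Normal disomic"
    if repeats_done < max_repeats:
        # Failed or abnormal, and repeats remain: assign to a repeat run.
        return "repeat"
    # Maximum number of repeats reached: the result is final.
    return "final: " + sample_result


# With the maximum number of repeats set to 1, as in this example.
print(next_action("Trisomic", repeats_done=0, max_repeats=1))
print(next_action("Trisomic", repeats_done=1, max_repeats=1))
print(next_action("Disomic", repeats_done=0, max_repeats=1))
```

With a maximum of 1, an abnormal first run is sent for repeat and the second run gives the overall result.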

The minimum markers required for a result is also set to 1. This user-defined parameter stops one marker being the only one allowed to predispose a group result, and therefore ensures marker confirmation is required before a result is reported. Setting this parameter to 0 allows only one marker to give a result. It is used, for example, in Multiplex 1 to stop C21-1 giving a group result (C21-1 is used to maintain confirmation across different multiplexes in the same sample, as the same marker is found in both Multiplex 1 and Multiplex 2).

Run template creation

The run template is used by Genefiler to interrogate a generic peak report and to name, filter, store, analyze, diagnose, flag failures, and determine repeat status for each well of the 96 well plate, independent of its neighbors. The parameters for each well include, but are not limited to:

1. Sample name.

2. Panel name (Indicates the panel to be used on the well).

3. Project name (for use in results and sample management).

4. Filter set (Used to filter the peaks so only those of interest remain).

5. Analyze (Used to interrogate the peaks of interest).

6. Test (Used for comparing different experiments).

7. Analysis (used for samples imported post filter, pre analysis).

In this example, Sample 1 was run twice on the Run template PM230, at well B06 with Multiplex 1, DEFAULT filter setting.

Run Results

With the two samples that are shown here, one is a disomic sample that is subjected to both multiplexes in order to determine a chromosome status for Chromosomes 21, 18 and 13, whilst determining the sex of the sample at the same time. The second sample is Trisomy 18 and as such has had a repeat of Multiplex 1 to check the status (in order to confirm the trisomy 18).

The initial filter breakdown is present in Tables 4 and 5.

Sample 1 of Multiplex 1 has the remaining peaks shown in Table 6.

The markers are then assigned to give the correct result, which in turn allows the group and sample results to be called (Table 7). In this example, the result is a normal sample. Genefiler would store this result as normal and no further action would be taken with these Groups. The result would therefore show a Normal male sample.

For Sample 1 Multiplex 2 this leaves the remaining peaks shown in Table 8.

The group shown in Table 9 is in a separate multiplex and again shows that sample 1 is a normal sample. Therefore, again, no further action is warranted with this sample.

For sample 2, multiplex 1 run 1 the remaining peaks are shown in Table 10.

When group results are called for Sample 2, results indicate that there is an abnormal response from the chromosome 18 markers, indicative of Trisomy 18 (Table 11). The sample is flagged for repeat and placed into the repeat multiplex stack for the next Repeat run template. All associated conditions and parameters are copied.

For Sample 2, Multiplex 2 the remaining peaks are shown in Table 12.

Sample 2 Group 21 results indicate that the sample is disomic for 21 (Table 13).

For Sample 2, Multiplex 1 run 2 the remaining peaks are shown in Table 14.

When Sample 2 is repeated, while one of the groups (sexing) failed, results again indicate a suspected trisomy 18 group (Table 15). This would be the final result (as the sample has been repeated and the maximum number of repeats is one), with sample 2 being shown to be a Trisomy 18, male.

Throughout this specification, the aim has been to describe the preferred embodiments of the invention without limiting the invention to any one embodiment or specific collection of features. Various changes and modifications may be made to the embodiments described and illustrated herein without departing from the broad spirit and scope of the invention.

All patent and scientific literature, computer programs and algorithms referred to in this specification are incorporated herein by reference in their entirety.

Table 1

Comparison                                Manual analysis       Using conventional software   Present invention (Genefiler)
Time for analysis of 1 x 96 well plate    120 min               25 min                        20 secs
Analysis time for 1 (9 hr) day's data     360 min               75 min                        9.3 mins
Analysis time for 1 (24 hr) day's data    360 min               75 min                        24 mins
Number of analyses/sample                 3996                  3996                          3996
Max genotypes analyzed per min            6.4                   45                            4608
Sample diagnoses                          Subjective (manual)   Subjective (manual)           Objective (automatic)
Confirmation                              Subjective (manual)   Subjective (manual)           Objective (automatic)
Check on lane/lane comparison             Subjective (manual)   Subjective (manual)           Objective (automatic)
Number of samples analyzed per 9 hr day   32                    100                           >100,000
Max genotypes analyzed per 9 hr day       3,456                 24,300                        2.48 million

Table 2

Marker name   Channel   Min. value   Max. value   Group   Sex type   Sexing parameters
Sexing1       2         104          115          Sex     XY         106, 112
Sexing2       2         177          179          Sex     Y          178 (female if absent)
C13-1         3         115          150          13
C13-2         3         226          291          13
C13-3         4         175          209          13
C18-1         2         145          175          18
C18-2         3         147          191          18
C18-3         2         261          340          18
C21-1         2         202          259          21

Table 3

Marker name   Channel   Min. value   Max. value   Group
C21-1         2         202          259          21
C21-2         3         150          190          21
C21-3         2         263          319          21
C21-4         4         380          430          21

Table 4

Filter step                  Sample 1 Multiplex 1   Sample 1 Multiplex 2
Generic peak no.             105                    143
1  Remove standard           85                     123
3a Sizes 0-90 bp             44                     68
3b Sizes 440-800 bp          44                     68
4  Peak heights <300         44                     68
5  Peak areas <1500          40                     64
6  Marker allocation         29                     24
7  Split peak correction     19                     16
8  Pullup                    19                     16
9  Remove lower peaks        15                     8
10 Remove all but 3          15                     8
Peaks of interest            15                     8

Table 5

Filter step                  Sample 2 Multiplex 1   Sample 2 Multiplex 2   Sample 2 Run 2
Generic peak no.             129                    170                    577
1  Remove standard           109                    150                    557
3a Sizes 0-90 bp             75                     95                     535
3b Sizes 440-800 bp          73                     94                     311
4  Peak heights <300         35                     41                     27
5  Peak areas <1500          30                     29                     14
6  Marker allocation         18                     16                     10
7  Split peak correction     18                     10                     10
8  Pullup                    18                     10                     10
9  Remove lower peaks        17                     10                     10
10 Remove all but 3          17                     8                      10
Peaks of interest            17                     8                      10

Table 6

Run name   Well ID   Sample   Channel   Peak No.   Scan No.   Size (bp)   Height    Width   Area        Marker
PM23001    B06       B06      2         26         2197       106.6       2232.9    2.9     6475.41     Sexing1
PM23001    B06       B06      2         28         2239       112.5       3411.2    3.8     12962.56    Sexing1
PM23001    B06       B06      2         33         2583       157.7       4233.3    4       16933.2     C181
PM23001    B06       B06      2         34         2615       161.8       55762.1   4.1     228624.6    C181
PM23001    B06       B06      2         35         2744       177.7       7344.7    3.3     24237.51    Sexing2
PM23001    B06       B06      2         42         3227       234.4       22003.4   3.2     70410.88    C211
PM23001    B06       B06      2         44         3245       236.4       15318.2   3.4     52081.88    C211
PM23001    B06       B06      2         47         3626       278.6       17181.1   4       68724.4     C183
PM23001    B06       B06      2         48         3838       301.2       13223.5   5.3     70084.55    C183
PM23001    B06       B06      3         12         2300       120.9       820.2     3       2460.6      C131
PM23001    B06       B06      3         16         2784       182.6       18772.3   3.7     69457.51    C182
PM23001    B06       B06      3         18         3295       242         3742.6    4.3     16093.18    C132
PM23001    B06       B06      3         20         3597       275.5       3572.3    5       17861.5     C132
PM23001    B06       B06      4         14         2731       176.2       12559.7   4.6     57774.62    C133
PM23001    B06       B06      4         16         2866       192.5       11899.4   3.7     44027.78    C133

Table 7

Marker name   Marker result       Group   Group result
Sexing1       Male                Sex     Male
Sexing2       Male                Sex     Male
C13-1         Homozygous          13      Disomic
C13-2         Disomic             13      Disomic
C13-3         Disomic             13      Disomic
C18-1         Diallelic trisomy   18      Disomic
C18-2         Homozygous          18      Disomic
C18-3         Disomic             18      Disomic
C21-1         Disomic             21      No result

Sample result: Normal

Table 8

Run name   Well ID   Sample   Channel   Peak No.   Scan No.   Size (bp)   Height    Width   Area        Marker
PM23001    E01       E01      2         35         3381       234.6       23184.2   3.7     85781.54    C211
PM23001    E01       E01      2         37         3401       236.7       17920.1   4.6     82432.46    C211
PM23001    E01       E01      2         43         3792       276.2       8746.2    4.1     35859.42    C213
PM23001    E01       E01      2         45         3956       292.2       4446.5    4.6     20453.9     C213
PM23001    E01       E01      3         27         2806       172         15178.2   3.7     56159.34    C212
PM23001    E01       E01      3         29         2877       180.1       19529     4.2     82021.8     C212
PM23001    E01       E01      4         47         5181       406.3       17807.8   4.5     80135.1     C214
PM23001    E01       E01      4         49         5264       413.9       15078.4   4.9     73884.16    C214

Table 9

Marker name   Marker result       Group   Group result
C21-1         Disomic             21      Disomic
C21-2         Disomic             21      Disomic
C21-3         Diallelic trisomy   21      Disomic
C21-4         Disomic             21      Disomic

Sample result: Disomic

Table 10

Run name   Well ID   Sample   Channel   Peak No.   Scan No.   Size (bp)   Height    Width   Area        Marker
PM23001    B12       B12      2         18         2363       105.8       801.8     3.8     3046.84     Sexing1
PM23001    B12       B12      2         20         2407       111.6       1134.6    3.1     3517.26     Sexing1
PM23001    B12       B12      2         26         2750       153.8       12382.2   4.5     55719.9     C181
PM23001    B12       B12      2         27         2784       157.8       10530.4   3.2     33697.28    C181
PM23001    B12       B12      2         29         2854       165.9       10051.1   3.4     34173.74    C181
PM23001    B12       B12      2         30         2958       177.9       940.5     3       2821.5      Sexing2
PM23001    B12       B12      2         32         3407       226.4       5929.6    4.2     24904.32    C211
PM23001    B12       B12      2         33         3487       234.7       6956.8    3.7     25740.16    C211
PM23001    B12       B12      2         34         4007       286.3       4456.1    4.3     19161.23    C183
PM23001    B12       B12      2         35         4086       293.8       1920.1    3.9     7488.39     C183
PM23001    B12       B12      2         36         4206       305.1       4125.4    4.7     19389.38    C183
PM23001    B12       B12      3         24         2965       178.7       6309.6    2.9     18297.84    C182
PM23001    B12       B12      3         26         3038       186.9       2018.2    4.6     9283.72     C182
PM23001    B12       B12      3         28         3443       230.1       2031.4    3.5     7109.9      C132
PM23001    B12       B12      3         30         3574       243.6       2388      4.1     9790.8      C132
PM23001    B12       B12      4         31         3055       188.8       5678.8    3.1     17604.28    C133
PM23001    B12       B12      4         33         3092       192.9       4882.7    4.5     21972.15    C133

Table 11

Marker name   Marker result        Group   Group result
Sexing1       Male                 Sex     Male
Sexing2       Male                 Sex     Male
C131          Fail                 13      Disomic
C132          Disomic              13      Disomic
C133          Disomic              13      Disomic
C181          Triallelic trisomy   18      Suspected trisomy
C182          Diallelic trisomy    18      Suspected trisomy
C183          Triallelic trisomy   18      Suspected trisomy
C211          Disomic              21      No result

Sample result: Suspected Trisomy 18; Repeat

Table 12

Run name   Well ID   Sample   Channel   Peak No.   Scan No.   Size (bp)   Height    Width   Area        Marker
PM23001    E07       E07      2         31         3275       226.1       8656.7    4       34626.8     C211
PM23001    E07       E07      2         34         3353       234.4       6599.4    4.4     29037.36    C211
PM23001    E07       E07      2         36         3787       279.2       1924.2    4.3     8274.06     C213
PM23001    E07       E07      2         39         3989       299.1       1733.3    3.9     6759.87     C213
PM23001    E07       E07      3         20         2653       155.8       3730.5    3.8     14175.9     C212
PM23001    E07       E07      3         21         2789       172         5638.5    4.5     25373.25    C212
PM23001    E07       E07      4         85         5127       407.5       3026.1    6.6     19972.26    C214
PM23001    E07       E07      4         87         5239       418         6385.1    5.6     35756.
56 C214 Table 13 Marker name Marker result Group Group Result Sample result C211 Disomic C212 Diallelic trisom 21 Disomic Disomic C213 Disomic C214 Diallelic trisomy Table 14 Run name Well ID Sample Channel Peak No. Scan No. Size (bp) Height Width Area Marker PM23201 C05 C05 2 7 2572 153.8 1733. 3 3. 7 6413. 21 C181 PM23201 C05 C05 2 8 2604 157.8 12029 3. 2 3292. 8 C181 PM23201 C05 C05 2 9 2669 165.9 1122. 2 3 3366. 6 C181 PM23201 C05 C05 2 15 3180 226.1 551. 1 3. 6 1983. 96 C211 PM23201 C05 C05 2 17 3254 234.4 687. 7 4. 1 2819.57 C211 PM23201 C05 C05 2 18 3729 286.2 462 4. 4 2032. 8 C183 PM23201 C05 C05 3 62 3214 229.9 403. 9 3. 8 1534. 82 C132 PM23201 C05 C05 3 67 3335 243.5 369. 9 5. 8 2145. 42 C132 PM23201 C05 C05 4 48 2856 188. 7 515. 2 4.6 2369. 92 C133 PM23201 C05 C05 4 50 2889 192. 6 730. 2 3.3 2409. 66 C133 Table 15 Marker name Marker result Group Group Result Sample result Sexingl Fail Sex Fail Sexing2 Fail C131 Fail C132 Disomic 13 Disomic Suspected C133 Disomic Trisomy 18 C181 Triallelic Trisomy Suspected Repeast C182 Fail 18 Trisomy C183 Homozygous C211 Disomic 21 No result
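
The marker allocation step listed in Tables 4 and 5 can be illustrated with the channel and size windows of Table 2. The following Python sketch is not the specification's own implementation; it is a minimal illustration under stated assumptions. The names `MarkerWindow`, `MARKER_WINDOWS` and `candidate_markers` are hypothetical, the window bounds are assumed to be inclusive, and every matching window is returned because the Table 2 windows for C13-1 and C18-2 overlap on channel 3 between 147 and 150 bp. Table 2 writes marker names with a hyphen (C18-1) whereas the peak tables omit it (C181); the sketch uses the Table 2 spelling.

```python
from typing import List, NamedTuple


class MarkerWindow(NamedTuple):
    """One row of Table 2 (hypothetical container, not from the specification)."""
    name: str
    channel: int
    min_bp: float
    max_bp: float
    group: str


# Channel and size windows transcribed from Table 2.
MARKER_WINDOWS = [
    MarkerWindow("Sexing1", 2, 104, 115, "Sex"),
    MarkerWindow("Sexing2", 2, 177, 179, "Sex"),
    MarkerWindow("C13-1", 3, 115, 150, "13"),
    MarkerWindow("C13-2", 3, 226, 291, "13"),
    MarkerWindow("C13-3", 4, 175, 209, "13"),
    MarkerWindow("C18-1", 2, 145, 175, "18"),
    MarkerWindow("C18-2", 3, 147, 191, "18"),
    MarkerWindow("C18-3", 2, 261, 340, "18"),
    MarkerWindow("C21-1", 2, 202, 259, "21"),
]


def candidate_markers(channel: int, size_bp: float) -> List[str]:
    """Return every marker whose channel matches and whose size window
    contains size_bp. Inclusive bounds are an assumption of this sketch."""
    return [
        w.name
        for w in MARKER_WINDOWS
        if w.channel == channel and w.min_bp <= size_bp <= w.max_bp
    ]


if __name__ == "__main__":
    # (channel, size in bp) pairs taken from the first rows of Table 6.
    for channel, size_bp in [(2, 106.6), (2, 157.7), (2, 177.7), (3, 182.6), (4, 176.2)]:
        print(channel, size_bp, candidate_markers(channel, size_bp))
```

Applied to the well B06 peaks of Table 6, each peak falls in exactly one window, and the window agrees with the marker printed in that table. A peak outside every window yields an empty list and would be dropped at this step.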