

Title:
METHOD FOR DETERMINING THE SEQUENCE OF FRAGMENTED NUCLEIC ACIDS
Document Type and Number:
WIPO Patent Application WO/2013/093530
Kind Code:
A1
Abstract:
The invention relates to new sequencing methods for the efficient and reliable identification of the nucleotide sequence of fragmented or degraded nucleic acids, primarily in biological samples. Furthermore, the present invention relates to the spacer molecules used in such methods and to the kits required for carrying out such methods.

Inventors:
PETAK ISTVAN (HU)
PINTER FERENC (HU)
CJSCHWAB RICHARD (HU)
Application Number:
PCT/HU2012/000137
Publication Date:
June 27, 2013
Filing Date:
December 19, 2012
Assignee:
KPS ORVOSI BIOTECHNOLOGIAI ES EGESZSEGUEGYI SZOLGALTATO KFT (HU)
International Classes:
C12Q1/68
Domestic Patent References:
WO2002059353A22002-08-01
WO1997043449A11997-11-20
WO1999016904A11999-04-08
WO1999064624A21999-12-16
WO2007025340A12007-03-08
WO2009124085A12009-10-08
Foreign References:
US5023171A1991-06-11
Other References:
HEATH K E ET AL: "UNIVERSAL PRIMER QUANTITATIVE FLUORESCENT MULTIPLEX (UPQFM) PCR: A METHOD TO DETECT MAJOR AND MINOR REARRANGEMENTS OF THE LOW DENSITY LIPOPROTEIN RECEPTOR GENE", JOURNAL OF MEDICAL GENETICS,JOURNAL OF MEDICAL GENETICS CONF- BRITISH HUMAN GENETICS CONFERENCE; WARWICK, UK; SEPTEMBER 06 -08, 2010, BMJ PUBLISHING GROUP, LONDON, GB, vol. 37, no. 4, 1 April 2000 (2000-04-01), pages 272 - 280, XP001055883, ISSN: 0022-2593, DOI: 10.1136/JMG.37.4.272
NELSON MATTHEW D ET AL: "Overlap extension PCR: an efficient method for transgene construction", METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) UNITED STATES 2012,, vol. 772, 1 January 2011 (2011-01-01), pages 459 - 470, XP009169085, ISSN: 1940-6029, DOI: 10.1007/978-1-61779-228-1_27
BORRAS E. ET AL., BMC CANCER, vol. 11, 24 September 2011 (2011-09-24), pages 406
SOLASSOL J. ET AL., INT. J. MOL. SCI., vol. 12, no. 5, 2011, pages 3191 - 3204
KNAPP M. ET AL., GENES, vol. 1, no. 2, 2010, pages 227 - 243
QUERINGS S. ET AL., PLOS ONE., vol. 6, no. 5, 5 May 2011 (2011-05-05), pages E19601
Attorney, Agent or Firm:
SBGK PATENT AND LAW OFFICES (Budapest, HU)
Claims:


1. A method for the determination of the sequence of fragmented nucleic acids by the PCR amplification of a target sequence and the subsequent dideoxy sequencing, comprising the steps of:

a) isolating the nucleic acid to be sequenced or the cells containing it from a biological sample,

b) performing a PCR reaction in order to fuse and amplify the target sequence, in which one or more artificial spacer molecules containing a segment complementary to the template, a spacer segment of at least 20 bp, a segment complementary to the sequencing primer and a segment complementary to the amplifying primer are used, and

c) sequencing the PCR product thus generated, optionally after purification.

2. A method for the identification of fragmented nucleic acids by a two-step PCR amplification of a target sequence and the subsequent dideoxy sequencing, comprising the steps of:

a) isolating the nucleic acid to be sequenced or the cells containing it from a biological sample,

b) performing a first PCR reaction to amplify the target sequence,

c) then performing a second PCR reaction in order to fuse and amplify the amplicon, in which one or more artificial spacer DNA molecules containing a segment complementary to the amplicon, a spacer segment of at least 20 bp, a segment complementary to the sequencing primer and a segment complementary to the amplifying primer are used, wherein the sequence corresponding to the segment complementary to the amplicon starts at 0 to 70 bp farther towards the 5' end than the 5' end of the segment complementary to the amplifying primer, and

d) sequencing the PCR product thus generated, optionally after purification.

3. A method according to Claim 1 or 2, wherein the isolation of the nucleic acid to be sequenced can be eliminated in step a).

4. A method according to Claim 1 or 2, wherein step a) comprises isolating the nucleic acid to be sequenced from a biological sample.

5. A method according to Claim 1 or 2, wherein step a) comprises isolating the cells containing the nucleic acid to be sequenced from a biological sample.

6. The method according to claim 5, wherein the isolation is carried out by laser microdissection.

7. A method according to Claim 1 or 2, wherein the spacer molecule used in step b) or c) is a single- or double-stranded DNA.

8. A method according to Claim 1 or 2, wherein a single spacer molecule is used in step b) or c).

9. A method according to Claim 1 or 2, wherein one forward and one reverse spacer molecule is used in step b) or c).

10. A method according to Claim 2, wherein a second, nested PCR is carried out in step b).

11. A method according to any one of Claims 1 to 10, wherein sequencing in step d) is performed after purifying the PCR product.

12. A spacer molecule for the method of any one of Claims 1 to 11, comprising:

- a segment (of approx. 20 bp) complementary to the template nucleic acid or amplicon on one end of the target sequence,

- an intermediary DNA segment of 20 to 120 bp, which together with the previous segment ensures an appropriate distance of more than 40 bp but less than 140 bp between the 3' end of the sequencing primer and the target sequence,

- a segment complementary with the sequencing primer (15 to 40 bp), and

- a segment complementary with the amplifying primer (15 to 40 bp).

13. The spacer molecule according to claim 12, which is a single- or double-stranded DNA.

14. The spacer molecule according to claim 12 or 13, which is a forward or reverse spacer molecule.

15. Use of the method according to any one of Claims 1 to 11 for the identification of gene sequences in plant, animal or human tissue samples containing fragmented DNA.

16. Use according to Claim 15 for the identification of the neuraminidase gene from H1N1 influenza in human tissue samples containing fragmented DNA.

17. Use of the method according to any one of Claims 1 to 11 for the identification of mutations in fragmented DNA samples from tumorous tissues.

18. Use according to claim 17 for the identification of the mutations occurring in Exon 2 of the KRAS gene.

19. Use according to claim 17 for the identification of the mutations occurring in Exon 19 and/or Exon 21 of the EGFR gene.

20. Use according to claim 17, wherein the DNA samples are from formalin-fixed tissues.

21. A kit for performing the method of Claim 1 containing the following in addition to the instructions for use:

- one or more spacer molecules according to Claim 12; or

- one or more spacer molecules according to Claim 12 and amplification primers for the PCR; and

- if desired, the polymerase and reagents required for the PCR.

22. A kit for performing the method of Claim 2 containing the following in addition to the instructions for use:

- one or more spacer molecules according to Claim 12; or

- one or more spacer molecules according to Claim 12 and amplification primers for the fusion PCR; or

- one or more spacer molecules according to Claim 12, amplification primers for the PCR and amplification primers for the first amplification PCR; and

- if desired, the polymerases and reagents required for the PCR reaction.

Description:
Method for determining the sequence of fragmented nucleic acids

Technical field of the invention

The present invention relates to the field of molecular diagnostics. More specifically, the invention relates to sequencing methods for the efficient and reliable identification of the nucleotide sequence of damaged (fragmented or degraded) nucleic acids, primarily in biological samples. Furthermore, the present invention relates to the spacer molecules used in such methods and to the kits required for carrying out such methods.

Background

In many cases, the only biological samples available for the determination of the nucleotide sequence of nucleic acids carrying genetic information, i.e., of DNA and RNA molecules, contain these nucleic acids in a fragmented form. The most common examples include tumorous tissue samples removed for diagnostic purposes, fixed with formalin and stored embedded in paraffin. It is widely known that fragmentation of the nucleic acid contents of samples is inevitable already when fixing and storing the samples. The necessity for determining the sequence of certain target regions of the nucleic acids extracted from such samples is increasing. However, due to their small sizes and low quantities, the detection of these nucleic acids is problematic during a subsequent analysis. The most specific method for determining the sequence of nucleic acids involves PCR amplification of the target sequence followed by bidirectional dideoxy sequencing (Sanger's dideoxy sequencing or chain termination method).

However, the boom in the development of molecular diagnostics in recent years has rather focussed on the improvement of nucleic acid sequence analysis methods based on techniques other than dideoxy sequencing, such as "high resolution melting" analysis (HRM), "next generation sequencing" (NGS), and real-time PCR (RT-PCR) [see e.g., Borras E. et al.: BMC Cancer, 2011 Sep 24, 11:406; Solassol J. et al.: Int. J. Mol. Sci. 2011, 12(5):3191-3204; Knapp M. et al.: Genes, 2010, 1(2):227-243].

In these articles, the problems with the efficiency and sensitivity of dideoxy sequencing are explained, but no solutions are offered to overcome them. The only improvement that has been widely used is the introduction of the binding site of the M13 sequencing primer into the target sequence via an amplifying primer carrying the M13 primer in the form of an overhang sequence (Querings S. et al.: PLoS One. 2011 May 5; 6(5):e19601). However, this fails to improve efficiency and sensitivity and only makes sequencing more economical by eliminating the need for purchasing separate sequencing primers for each target sequence and the need for individually optimising each sequencing reaction. Another problem with this common method is that the PCR primer elongated with an overhanging M13 binding site has a greater tendency to form dimers, which reduces PCR efficacy; in addition, synthesis errors such as deletions and insertions occur more frequently in the longer primer, and this makes the sequence picture noisy and thereby reduces the sensitivity of the analysis.

The purpose of the techniques described in WO 97/43449 (Hagiwara Koichi et al.), WO 99/16904 (Liu Qiang et al.) and WO 99/64624 (Wallace Andrew et al.) is to allow simultaneous sequencing of more than one target sequence by joining them. The benefit expected from this was to reduce the costs and increase the efficiency of dideoxy sequencing. However, due to the problems outlined below, these techniques have not become widespread. These methods use two separate PCR reactions for the overlap fusion of the target sequences. In the first one, each target sequence is amplified using separate primers with overhangs that overlap with the other target sequence and with the amplifying primer; next, the reaction products of the first PCRs are mixed and fused in a second PCR. Although the fusion product emerges, the following problems are encountered: a) the PCR amplification of each target sequence in the mixture continues individually and in competition with the fusion product, b) the use of long primers reduces the efficiency of the PCR technique, c) the overlapping primers form dimers during the fusion PCR, causing sequence noise, d) synthesis errors in the primers also appear as noise during sequence analysis, e) due to the two separate PCR reactions, the total costs are not reduced either.

WO 2007/025340 (Keith S. et al.) describes a method for the noise- and discrimination-free amplification of sequences comprising two amplification steps. This method is also useful for the random pre-amplification of DNA samples available in small quantities. Such increased sample quantities are very useful for the evaluation of more than one target sequence in subsequent analyses. However, this technique only increases the amount of the fragments but not their size. Therefore, very short DNA sequences will continue to fail to participate in the sequencing reaction.

WO 2009/124085 (Dhruba J.S. et al.) describes a method for the reduction of the noise generated during nucleic acid sequencing reactions. According to the method, false chain termination products generated during the pre-sequencing amplification are selectively degraded during the sequencing reaction. This method fails to offer a solution for the technological problems associated with the determination of short sequences.

One disadvantage of the dideoxy method is that it is impossible to precisely sequence the first 30 to 50 nucleotides, or a bidirectional analysis produces low quality results that are difficult to evaluate. The most common solution to this problem is to amplify sequences that are longer by 70 bp (primer length + 50 bp) than the desired sequence in order to ensure a high quality segment to be analysed.

In the case of samples containing fragmented nucleic acids, elongation of the PCR target sequence results in a lower number of copies of appropriate size available for the PCR amplification. In such cases, detection sensitivity is reduced and, in many instances, the detection is unsuccessful, or if it is successful, then the chance of allele discrimination (allele bias) increases, that is, randomly only one of the sequence variants will be amplified. In such cases, samples containing both mutant and normal alleles often show either 100% or 0% frequency for the mutant allele. The latter case is of extreme clinical importance because false negative results are generated. Furthermore, the lower initial number of copies of the nucleic acid means higher amplification needs, which increases the probability of the appearance of PCR artefact mutations.

Another common method for solving the above problem is to use primers that overhang the target sequence during the PCR amplification, which generates longer PCR products from shorter target sequences. One disadvantage of this method is the fact that many primers will contain a few nucleotide errors - nucleotide deletions in several cases - as a result of synthesizing long primers. The sequence generated from this DNA strand will give a result which is shifted by one nucleotide. Upon superimposition, such shifted target sequences will render the sequence picture analytically noisy ("sequence noise").

One solution to this problem may be to attach a DNA molecule of at least 70 bp to the target sequence after the amplification.

A previously described method for coupling two DNA molecules is "overlap extension PCR" (OEP in short). The basis of the OEP method described in US 5,023,171 (Ho Steffan et al.) is the fact that the overlapping oligonucleotides on the 3' end of the single-stranded DNAs to be hybridised function as primers for the DNA polymerase to produce the double-stranded fusion DNA molecule.

In the original and most common application of this method, two or more overlapping PCR products are fused. Qiang Liu (WO 99/16904) and Andrew Wallace (WO 99/64624) proposed further methods to increase the efficiency of the fusion. One method uses two PCR reactions performed sequentially. In the first PCR, overlapping DNA strands are generated using primers containing linker sequences, and these strands are fused as primers for each other during the second PCR. The 3' end and 5' end of the linker primers used in the first PCR are complementary to the template DNA and to the adjacent DNA, respectively. One disadvantage of this method is that the 5' end complementary to the adjacent DNA greatly diminishes primer efficiency during the amplification of the template sequence, and this in turn reduces the sensitivity. In addition, primer dimers are formed during the first PCR, and this significantly reduces PCR efficiency in the case of small DNA sample amounts. Unless they are removed, such primer dimers will fuse during the second PCR in the same way as the DNA strand generated from the template sequence. Such DNA strands with no template content generate noise during sequencing. From the method, it also follows that the sequence differences in the linker primers will be incorporated into the fused strand. In case of an insertion or deletion, this will appear as noise running through the entire sequencing electropherogram with an amplitude corresponding to the proportion of primers comprising the insertion/deletion. The above disadvantages are major obstacles in applying the OEP-based methods for sequencing small amounts of DNA and/or DNA containing fragmented nucleic acids.

The problem to be solved by the present invention is to increase the efficiency and reliability of the dideoxy sequencing of individual sequences in samples containing fragmented nucleic acids by overcoming the disadvantages of the above methods.

To solve the above problems, a technique based on overlap extension PCR was designed, which ensures the fusion of short PCR products with the shortest possible primers; in addition, a mutation-free adapter sequence was designed, which fills up the noisy sequence generated during the first 30 to 50 bp of the sequencing reaction without the need for amplifying long PCR products.

Summary of the invention

In the method of the invention, a single PCR is used to attach previously prepared long artificial spacer DNA molecules to both ends of the target sequence to be analysed, and this allows the target sequence to be localised at a sufficient distance from the binding site of the sequencing primer to ensure noise-free Sanger reading. Thus, the method allows bidirectional Sanger sequencing to read the same target sequence from a shorter DNA molecule than the conventional method without fusion. In turn, this is achieved by eliminating the need for adding sequences that are complementary with other DNA strands to the DNA strand comprising the target sequence before the fusion, that is, the need for using primers that contain sequences other than those complementary to the template DNA, as this reduces the efficiency. Another particular advantage is that the adapter DNA hybridises at the binding site of the primer used for the amplification of the target sequence or a few nucleotides more inside, and this prevents adapters from attaching to the primer dimers generated by the previous amplification. Therefore, this method is even more advantageous than when ligation is used to attach the adapter sequences to the target sequences.

Brief description of the figures

Figure 1: Determination of the sequence of nucleic acids by fusing spacer molecules (method of the invention).

Figure 2: Structure of the spacer molecule.

Figure 3: Location of the hybridisation site of the primer used in the pre-fusion PCR and of the spacer molecule.

Figure 4: Key features of the method used to analyse NI resistance mutations in H1N1 influenza. The codons of the most important mutations were marked in accordance with the nomenclature of the H3N2 virus.

Figure 5: Efficient functioning of the method analysing mutations (segments in bold on the electropherogram) causing the NI resistance of H1N1 from the period between 1977 and 2008.

Figure 6: Efficient functioning of the method analysing mutations (segments in bold on the electropherogram) causing the NI resistance of the new type H1N1.

Figure 7: Sequences of the amplicons generated by the conventional method (normal PCR followed by nested PCR) and the new method. Conventional method: bold italic letters: sequences corresponding or complementary to the primers of the conventional PCR; letters in frames: sequences corresponding or complementary to the primers of the nested PCR. New method: grey marking: sequences corresponding to the forward primer and complementary to the reverse primer of the pre-fusion PCR; underlining: sequences corresponding to the template-specific segment of the forward spacer molecule and complementary to the template-specific segment of the reverse spacer molecule; white letters on a black background: sites of the most significant mutations of Exon 21 of EGFR.

Figure 8: Reduction of allele bias in the analysis of a sample containing fragmented DNA using the new mutation detection method.

Detailed description of the invention

In the method of the invention, a single PCR is used to attach previously synthesised spacer DNA molecules to both ends of the target sequence to be analysed with a length that allows the target sequence to be localised at a sufficient distance from the 3' end of the primer to ensure noise-free Sanger reading. Thus, amplification before the dideoxy sequencing is also possible from a nucleic acid longer only by 2x20 bp (2x the length of the primer) than the target sequence. In case of templates with a low initial nucleic acid content, the target sequence can be PCR amplified before the PCR used for coupling the spacer molecule in order to ensure that the template is available in a sufficient number of copies. This can be achieved by using primers that only contain sequences complementary to the template. In such cases, amplification is also possible from a nucleic acid longer by 2x30 bp (2x the length of the primer + 2x10) than the target sequence.

It was experimentally confirmed that amplification of the 2x70 bp extra sequence required for the known dideoxy sequencing can be induced by a shorter sequence, which allows a reduction in the size of the PCR amplicon and this will increase the sensitivity of sequencing, that is, the detection of fragmented DNA sequences carrying mutations and being present in small amounts, to a greater extent than expected, and will reduce the allele bias in DNA samples taken from formalin-fixed tissues (see Example 6).

The spacer molecule used in the method of the invention, which may be single- or double-stranded DNA, comprises the following (see Figure 2):

- a segment (of approx. 20 bp) complementary to the template nucleic acid or amplicon on one end of the target sequence,

- an intermediary DNA segment of at least 20 bp (but less than 120 bp), which together with the previous segment ensures an appropriate distance of more than 40 bp (but less than 140 bp) between the 3' end of the sequencing primer and the target sequence,

- a segment complementary with the sequencing primer (15 to 40 bp),

- a segment complementary with the amplifying primer (15 to 40 bp).
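
The length constraints listed above can be expressed as a small check. The following Python sketch is illustrative only: the class `Spacer`, its field names and the placeholder sequences are assumptions made for this example and do not appear in the source. The approx. 20 bp length of the template-specific segment is not enforced, because the text gives it only as an approximation, and the order of the segments within the molecule is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Spacer:
    """Hypothetical container for the four spacer segments (names are illustrative)."""
    template_segment: str      # complementary to the template or amplicon, approx. 20 bp
    intermediary_segment: str  # at least 20 bp but less than 120 bp
    sequencing_site: str       # complementary to the sequencing primer, 15 to 40 bp
    amplifying_site: str       # complementary to the amplifying primer, 15 to 40 bp

    def distance_to_target(self) -> int:
        """Template-specific plus intermediary length, in bp."""
        return len(self.template_segment) + len(self.intermediary_segment)

    def problems(self) -> List[str]:
        """Return the violated length constraints (empty list if none)."""
        found = []
        if not 20 <= len(self.intermediary_segment) < 120:
            found.append("intermediary segment must be >= 20 bp and < 120 bp")
        if not 40 < self.distance_to_target() < 140:
            found.append("distance to target must be > 40 bp and < 140 bp")
        if not 15 <= len(self.sequencing_site) <= 40:
            found.append("sequencing primer site must be 15 to 40 bp")
        if not 15 <= len(self.amplifying_site) <= 40:
            found.append("amplifying primer site must be 15 to 40 bp")
        return found


# Placeholder sequences chosen only for their lengths, not taken from the source.
ok_spacer = Spacer("A" * 20, "C" * 30, "G" * 18, "T" * 20)
short_spacer = Spacer("A" * 20, "C" * 10, "G" * 18, "T" * 20)
print(ok_spacer.distance_to_target(), ok_spacer.problems())
print(short_spacer.problems())
```

A design with a 20 bp template-specific segment and a 30 bp intermediary segment passes all checks, whereas shortening the intermediary segment to 10 bp violates both the segment length and the distance constraint.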

Accordingly, one aspect of the invention provides a method for the determination of the sequence of fragmented nucleic acids by the PCR amplification of a target sequence and the subsequent dideoxy sequencing, comprising the steps of:

a) isolating the nucleic acid to be sequenced or the cells containing it from a biological sample,

b) performing a PCR reaction in order to fuse and amplify the target sequence, in which one or more artificial spacer molecules containing a segment complementary to the template nucleic acid on one end of the target sequence, a spacer segment of at least 20 bp, a segment complementary to the sequencing primer and a segment complementary to the amplifying primer are used, and

c) sequencing the PCR product thus generated, optionally after purification.

In a preferred embodiment of the above method, the nucleic acid is isolated from the biological sample. In another preferred embodiment, step a) (isolation) may be omitted provided that an appropriate polymerase is used. In another preferred embodiment, cells containing the nucleic acid are isolated from the biological sample, preferably by microdissection.

The PCR reaction fusing the spacer DNA molecules is presented in Figure 1: The complementary 3' ends of the spacer DNA molecule and of the DNA molecules comprising the target sequence hybridise, and thus the polymerase generates the second strand of the DNA molecule fused from one end. Simultaneously, the same occurs on the other end of the target sequence. The spacer DNA molecule for the other end hybridises to the singly fused DNA molecule in the same way, giving rise to the double-stranded DNA molecule fused at both ends, which comprises sequences complementary to both the forward and the reverse primer, thus also allowing the amplification of the fused molecule. The high initial concentration of the amplifying primers ensures that the final concentration of the amplified fused strand greatly exceeds the concentration of the parent molecules and intermediaries. Ultimately, the PCR reaction in this case is a multiplex reaction in which fusion and amplification occur in the same step.
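
As a rough illustration of the fusion step, the sketch below treats each DNA molecule as a plain string written 5' to 3' on one strand and joins molecules wherever their ends overlap. This is a deliberately simplified model, not the reaction itself: strand complementarity, polymerase extension and primer amplification are not modelled, and the function names and toy sequences are assumptions made for this example.

```python
def fuse_upstream(spacer: str, target: str, overlap: int) -> str:
    """Attach `spacer` upstream of `target` when the last `overlap` bases of
    the spacer equal the first `overlap` bases of the target."""
    if overlap <= 0 or spacer[-overlap:] != target[:overlap]:
        raise ValueError("spacer end does not overlap the start of the target")
    return spacer + target[overlap:]


def fuse_downstream(target: str, spacer: str, overlap: int) -> str:
    """Attach `spacer` downstream of `target` when the last `overlap` bases of
    the target equal the first `overlap` bases of the spacer."""
    if overlap <= 0 or target[-overlap:] != spacer[:overlap]:
        raise ValueError("spacer start does not overlap the end of the target")
    return target + spacer[overlap:]


# Toy sequences (hypothetical): a 16-base target and two spacers that each
# share 4 bases with one end of the target.
target = "AAAACCCCGGGGTTTT"
forward_spacer = "TTGACAGT" + "AAAA"
reverse_spacer = "TTTT" + "CATGCA"

singly_fused = fuse_upstream(forward_spacer, target, overlap=4)
doubly_fused = fuse_downstream(singly_fused, reverse_spacer, overlap=4)
print(singly_fused)
print(doubly_fused)
```

The two calls mirror the order described in the text: fusion at one end first, then fusion of the second spacer to the singly fused molecule. In the reaction both ends can fuse in either order.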

If another PCR reaction is used before the fusion PCR, this reaction should be designed to ensure that the 3' ends of the primers are complementary to the sequences of the template nucleic acid located 0 to 70 bp closer to the 3' end than the template-specific section of the spacer DNA molecule (see Figure 3). This prevents fusion between the primer dimers generated in the first PCR and the spacer DNA molecules during the fusion PCR.

Accordingly, another aspect of the invention provides a method for the determination of the sequence of fragmented nucleic acids by a two-step PCR amplification of a target sequence and the subsequent dideoxy sequencing comprising the steps of:

a) isolating the nucleic acid to be sequenced or the cells containing it from a biological sample,

b) performing a first PCR reaction to amplify the target sequence,

c) then performing a second PCR reaction in order to fuse and amplify the amplicon, in which one or more artificial spacer molecules containing a segment complementary to the amplicon, a spacer segment of at least 20 bp, a segment complementary to the sequencing primer and a segment complementary to the amplifying primer are used, wherein the sequence corresponding to the segment complementary to the amplicon starts at 0 to 70 bp farther towards the 5' end than the 5' end of the segment complementary to the amplifying primer, and

d) sequencing the PCR product thus generated, optionally after purification.

In a preferred embodiment of the above method, the nucleic acid is isolated from the biological sample. In another preferred embodiment, step a) (isolation) may be omitted provided that an appropriate polymerase is used. In another preferred embodiment, cells containing the nucleic acid are isolated from the biological sample, preferably by microdissection.

In Step b), a further nested PCR may be performed after the first PCR, if required.

In the methods of the invention the nucleic acid to be sequenced may be DNA or RNA.

As used herein, the fragmented nucleic acid to be sequenced means fragmented DNA, RNA or miRNA. As used herein, the term "fragmented nucleic acid" means a nucleic acid with a length of 30 to 1000 bp, preferably 55 to 300 bp.

The method can be advantageously used in any application in which the DNA segment to be sequenced is not part of a DNA molecule which is at least 70 bp longer than the target segment. Thus, if the sequence of a DNA or RNA segment of 100 bp is to be determined, then it should be located in the middle of a DNA segment of at least 240 bp in length to allow sequencing using the known methods. A segment of 200 bp to be sequenced should be located in the middle of a segment of at least 340 bp in length. The upper limit is only represented by the technical limit of the dideoxy sequencing, which is approx. 1000 bp.
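
The template length arithmetic in this description can be summarised in a few lines. The sketch below is only an illustration: the function name and workflow labels are hypothetical, and the 20 bp primer length is the assumption implied by the "2x20 bp" and "2x30 bp" figures given earlier in the description.

```python
# Flanking sequence needed on each side of the target, in bp.
FLANK_BP = {
    "conventional": 70,          # primer length + 50 bp of unreadable sequence
    "fusion_only": 20,           # one primer length (assumed 20 bp)
    "pre_amplified_fusion": 30,  # primer length + 10 bp
}


def min_template_length(target_bp: int, workflow: str) -> int:
    """Shortest intact template (bp) that still contains the target with the
    flanking sequence required on both sides by the given workflow."""
    return target_bp + 2 * FLANK_BP[workflow]


for target_bp in (100, 200):
    for workflow in FLANK_BP:
        print(target_bp, workflow, min_template_length(target_bp, workflow))
```

For a 100 bp target this gives 240 bp for the conventional approach, as stated above, against 140 bp with the fusion PCR alone and 160 bp when a pre-fusion amplification is used.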

Biological samples include biopsies, tissue samples, cytology smears, blood samples, ascites fluid, spinal cord, urine, etc. For the method of the invention, tissue samples are preferred. The nucleic acid sample may be taken from human, animal or plant tissues or cells. Application of the method in case of a biological sample containing nucleic acid fragments of small sizes and in low amounts is particularly preferable. In such cases, the number of DNA fragments available as templates is increased in proportion to the reduction in the size of the target sequence. One example for such an application is the processing of histological samples or cytology smears in which laser microdissection microscopy is used to introduce nucleic acids corresponding to a few, or a few hundred or a few thousand cells into a test tube, and then the nucleic acid is either isolated or directly amplified by PCR.

The spacer DNA molecule is always synthesised by the user or others commissioned by the user on the basis of the target sequence, which should be known in advance. In addition to the above, the spacer molecule may comprise a binding site for the amplification primer, an M13 binding site and a binding site overlapping with the target sequence.

In one embodiment of the sequencing method of the invention, forward and reverse spacer molecules are used, and these can be prepared in the following manner. The framework of the forward spacer molecules is taken from the region around the M13F primer of the pCR2.1-TOPO vector (Invitrogen), and the framework of the reverse spacer molecules from the region around the M13R primer of the pENTR/H1/TO vector (Invitrogen). Using OEP, the sequences specific to the diagnostic systems are attached to this framework from both ends. The segment complementary to the template is attached to the spacer framework downstream of the M13F or R primer, and further nucleotides required for subsequent restriction cleavages are added beyond that. To the other end of the spacer framework, sequences required to form a binding site for the same restriction enzyme as in the other end are attached. The artificial sequences thus generated are cloned using the CloneJET1.2 PCR Cloning Kit (Fermentas) and Top10 bacteria. Upon isolating the vector from a clone confirmed to contain the sequence, restriction enzymatic cleavage and gel isolation are used to obtain the spacer molecules. For the two diagnostic systems for H1N1 influenza, only forward spacers were used, while for the KRAS and EGFR diagnostic systems, both forward and reverse spacer molecules were used.

In all methods of the invention, a PCR is performed in which spacer molecules are fused to the target sequences. In order to achieve this, the reaction mixture should contain the following components in addition to those required for the functioning of the polymerase: spacer molecules fusing to the ends of the target sequence, and in lower amounts than the amplification primers; and forward and reverse primers amplifying the fused molecule.

If the target sequences are pre-amplified in one or more PCR reactions before the fusion step, then fusion is performed on the amplicons thus prepared. In a preferred embodiment spacer molecules are fused to both ends of the amplicons (see Examples 1 to 3), therefore Figure 1 applies to such systems. In another preferred embodiment, such as in the analysis of the influenza neuraminidase gene, fusion only occurs on one end (see Figure 4), and the other end is used for overlap fusion with sequences overlapping between the 2 neuraminidase gene regions, which are prepared by primers overhanging into the other region during the pre-fusion PCR amplification (see Examples 4 and 5).

Next, the product obtained during the fusion and amplification PCR is sequenced directly or upon purification. If desired, the PCR product can be purified in known manners, e.g., by ethanol precipitation, on silica columns, or with the use of XTerminator®.

Sequencing is performed using a method known per se.

The method of the invention has the following advantages over the methods of the prior art:

1) Increased sensitivity: in contrast with the conventional method using longer DNA segments, this is achieved by reducing the size of the template required for the sequence analysis, which in turn increases the number of DNA fragments available as templates. In comparison with the methods using overlapping primers, sensitivity is improved by eliminating the need for adding sequences that are complementary with other DNA chains to the DNA strand comprising the target sequence, unlike in the previous methods. Namely, although the use of such overlapping (overhanging) primers also reduces the required size of the initial template, they increase the tendency to form dimers, which lowers the efficacy of PCR.

2) Reduced allele bias: again, this may be explained by the reduction of the template size required for sequence analysis and by the resulting increase of the available number of DNA fragments, that is, the chance of randomly amplifying only one of several simultaneously occurring sequence variants is lowered.

3) Reduced frequency of artefact mutations: The increase in the number of DNA fragments associated with the reduction in the template size required for sequence analysis also lowers the required extent of amplification. In addition, all this reduces the chance of encountering polymerase errors.

4) Less noise: in case of using overlapping primers, the sequence differences of these primers will be incorporated in the fusion product. In case of an insertion or deletion, this will appear as noise running through the entire sequencing electropherogram with an amplitude corresponding to the proportion of primers comprising the insertion/deletion. Unless they are removed, the dimers formed by overlapping primers will fuse during the second PCR in the same way as the DNA strand generated from the template sequence. Such fused DNA strands with no template content also generate noise during sequencing. With the method of the invention, both phenomena may be eliminated and thus, significant noise reduction may be achieved. By generating the spacer molecules in bacterial clones, the occurrence of different sequence variants in them can be excluded. Furthermore, appropriate selection of the hybridisation site of the spacer molecule prevents fusion of the reaction products containing no target sequence.
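The template-size argument behind advantages 1) to 3) can be illustrated with a deliberately simple model. The sketch below assumes that strand breaks occur independently at each bond with a fixed probability; this model, the break probability of 0.01 and the 300-bp comparison length are illustrative assumptions and are not taken from the description. Only the 120-bp length corresponds to an amplicon reported in the Examples.

```python
def intact_fraction(template_len: int, break_prob: float) -> float:
    """Probability that a stretch of template_len nucleotides carries no break.

    A stretch of n nucleotides has n - 1 internal bonds; each bond is assumed
    to be broken independently with probability break_prob (hypothetical model).
    """
    if template_len < 1:
        raise ValueError("template_len must be at least 1")
    return (1.0 - break_prob) ** (template_len - 1)


BREAK_PROB = 0.01  # hypothetical per-bond break probability

short_template = intact_fraction(120, BREAK_PROB)  # short amplicon, as in Example 1
long_template = intact_fraction(300, BREAK_PROB)   # hypothetical conventional amplicon

print(f"intact 120-bp templates: {short_template:.3f}")
print(f"intact 300-bp templates: {long_template:.3f}")
```

Under these assumptions, a shorter required template always leaves a larger fraction of usable fragments, which is the qualitative point made in the list above.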

The method of the invention can be used in many fields. One field of application is the identification of gene sequences in plant, animal or human tissue samples containing fragmented DNA. For example, the method of the invention can be used to detect mutations in the H1N1 influenza neuraminidase gene, and this may serve as the basis for therapy planning.

Moreover, the invention is useful for determining the nucleotide sequence of fragmented nucleic acids which optionally contain mutations and these mutations can be reliably detected by precise sequencing. One typical field of this is detecting mutations in fragmented DNA samples from tumorous tissues.

The sequencing method of the invention is particularly useful for the reliable analysis of tissue samples fixed and stored in formalin and thus containing damaged, fragmented nucleic acids. One typical field of application of this is cancer diagnostics, i.e., the identification of genetic markers affecting treatment strategies. Non-limiting examples for this include known mutations in Exons 18, 19, 20 and 21 of EGFR, mutations in Exons 2, 3 and 4 of KRAS, HRAS and NRAS, mutations in Exons 11 and 15 of BRAF, mutations in Exons 19 and 20 of HER2, mutations in Exons 1, 9 and 20 of the PI3KCA gene, mutations in Exons 9, 11, 13 and 17 of KIT, mutations in Exons 12, 14 and 18 of PDGFR-A and mutations in Exons 20, 22, 23, 24 and 25 of the ALK gene.

Another aspect of the invention relates to kits for carrying out the methods of the invention. In addition to the instructions for use, the kit of the invention may contain the following components for the method of the invention using a single PCR: one or more spacer molecules; or one or more spacer molecules and amplification primers. If desired, the kit may also contain the polymerase and reagents required for the PCR.

In addition to the instructions for use, the kit may contain the following components for the method of the invention using more than one PCR: one or more spacer molecules; or one or more spacer molecules and amplification primers for the fusion PCR; or one or more spacer molecules, amplification primers for the fusion PCR and amplification primers for the first amplification PCR. If desired, the kit may also contain the polymerases and reagents required for the PCR reactions.

The following examples are provided for the purpose of describing the present invention in more detail and should not be interpreted as limiting the scope of the invention.

Examples:

The method of the invention is illustrated with the amplification and sequencing of 5 nucleic acid sequences. In 3 cases, the objective was to analyse human gene segments. Due to the mutations present in them, we wished to analyse the most relevant regions of Exon 2 of KRAS and Exons 19 and 21 of EGFR. In these analyses, the initial nucleic acid was DNA. The additional two gene segments analysed by the method of the invention were two segments of the influenza neuraminidase gene which contain mutations associated with varying degrees of resistance against various neuraminidase inhibitors. Since an RNA virus is concerned, RNA is used as template in this analysis. When designing the analysis of the gene regions of H1N1 influenza influencing pharmaceutical effects, the starting materials included gene sequences of H1N1 influenza viruses from the period between 1977 and 2005. The assay thus established can detect all influenzas from the period between 1977 and 2008. Upon analysing the neuraminidase gene of 337 variants of the new type H1N1 influenza emerging in 2009, significant differences were found in comparison with the previous variants. Therefore, using the 337 gene sequences, an optimised assay was devised for the detection of the new type H1N1 influenza, in which the primers and the template-specific parts of the spacer contain a few nucleotides that are different from the components of the H1N1 assay for the year 2008.

The parameters of the diagnostic systems based on the method of the invention are presented in Table 1.

Table 1. Parameters of the diagnostic systems based on the method of the invention. In the case of influenza, the 10- to 11-bp sequence marked with an asterisk does not amplify from the template but fuses with the primers containing sequences overhanging into the other gene region.

Example 1: Analysis of Exon 2 of the KRAS gene

Method:

Preparation of the spacer molecules: For the diagnostic system, one forward and one reverse spacer molecule were prepared. As the framework of the forward spacer molecules, the parts of the pCR2.1-TOPO vector (Invitrogen) around the M13F primer were used; as the framework of the reverse spacer molecules, the parts of the pENTR/H1/TO vector (Invitrogen) around the M13R primer were used.

The sequence acting as the framework of the pCR2.1-TOPO vector forward spacer molecules set forth as SEQ ID NO:1 (104 bp, the segment marked grey corresponds to the forward amplification primer, and the segment in white letters on a black background corresponds to the sequence of the M13F sequencing primer):

GCGATTAAGTTGGGTAACGCCAGGGTTT CCCAGTCAC AATTGTAATACGACTCACTATAGGGCGAATTGGGCCCTCTAG

The sequence acting as the framework of the pENTR/H1/TO vector reverse spacer molecules set forth as SEQ ID NO: 2 (110 bp, the segment marked grey corresponds to the reverse amplification primer, and the segment in white letters on a black background corresponds to the sequence of the M13R primer):

GTGCAATGTAACATCAGAGATTTTGAGACACGGGCCAGAGC CAGGAAACAGC TATGAC C HJGTAATACGACTCACTATAGGGGATATCAGCTGGATGGCAAATAATG

Using OEP, the sequences specific to the diagnostic systems were attached to this framework from both ends. The segment complementary to the KRAS gene was attached to the vector sequence downstream of the M13F or R primer, and further nucleotides required for subsequent restriction cleavages were added beyond that. In case of the forward spacer molecules, the following reverse primer (SEQ ID NO: 3) was used (grey marking: restriction enzyme binding site, underlining: a sequence identical to the KRAS gene segment, italic letters: a segment complementary to the pCR2.1-TOPO vector sequence):

GATGGCCACAAGTTTATATTCAGTCATCTTATAATCTAGAGGGCCCAATTCGC

In case of the reverse spacer molecules, the following reverse primer (SEQ ID NO: 4) was used (grey marking: restriction enzyme binding site, underlining: a segment complementary to the KRAS gene segment, italic letters: a sequence identical to the pENTR/H1/TO vector segment):

GATGCGCAAGAGTGCCTTGACGATACCATTArrrGCCArCCA CTG

To the other (5') end of the spacer molecule framework, sequences required to form a binding site for the same restriction enzyme as in the other end were attached. In case of the forward spacer molecules, the following forward primer (SEQ ID NO: 5) was used (grey marking: restriction enzyme binding site, italic letters: a sequence identical to the pCR2.1-TOPO vector segment):

C^GGCCAGCGATTAAGTTGGGTAACGC

In case of the reverse spacer molecules, the following forward primer (SEQ ID NO: 6) was used (grey marking: restriction enzyme binding site, italic letters: a sequence complementary to the pENTR/H1/TO vector segment):

CTTGGGCAGTGCAATGTAACATCAGAGATTTTG

The attachment of the forward and reverse spacer molecules was achieved in separate PCR reactions. The only difference between the two reactions was in the primers used. Amplification was carried out in solutions containing the pCR2.1-TOPO vector and the pENTR/H1/TO vector, respectively, using AccuPrime Taq HiFi DNA polymerase (Invitrogen, 0.02 U/μl), 1x AccuPrime I PCR buffer and forward and reverse primers each at a concentration of 0.4 μM. The PCR included the following cycles:

Cycle 1: (1x) Step 1 94°C 1min 00sec

Cycle 2: (10x) Step 1 94°C 0min 20sec

Step 2 63°C 0min 20sec (after each cycle, the temperature of this step was reduced by 0.6°C)

Step 3 68°C 0min 40sec

Cycle 3: (30x) Step 1 94°C 0min 20sec

Step 2 63°C 0min 20sec

Step 3 68°C 0min 40sec

Cycle 4: (30x) Step 1 68°C 5min 00sec

Cycle 5: (1x) Step 1 4°C ∞

The artificial sequences thus generated were cloned using the CloneJET1.2 PCR Cloning Kit (Fermentas) and Top10 bacteria (Invitrogen). The pJET1.2 vectors containing the spacer molecules were isolated using MidiPrep. The forward and the reverse spacer molecules were excised from the vector using the restriction enzymes MluNI and AviII, respectively (both from Roche, 0.4 U/μl, 16 hours of incubation at 37°C). Using the E-Gel SizeSelect (Invitrogen) system, the spacer molecules were isolated from gels.
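The touchdown phase of the spacer-preparation program above (Cycle 2) can be written out explicitly. The sketch below reads the note on the 0.6°C reduction as applying to the 63°C step of Cycle 2 over its ten repeats; that reading, and the function and constant names, are assumptions made for illustration.

```python
START_TEMP_C = 63.0   # temperature of Step 2 in the first repeat of Cycle 2
DECREMENT_C = 0.6     # reduction applied after each repeat
REPEATS = 10          # Cycle 2 is run 10x


def touchdown_temperatures(start_c: float, decrement_c: float, repeats: int) -> list[float]:
    """Return the Step 2 temperature used in each repeat of the touchdown cycle."""
    return [round(start_c - decrement_c * i, 1) for i in range(repeats)]


temps = touchdown_temperatures(START_TEMP_C, DECREMENT_C, REPEATS)
for repeat, temp in enumerate(temps, start=1):
    print(f"repeat {repeat:2d}: {temp:.1f} °C")
```

Under this reading, the last touchdown repeat runs at 57.6°C.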

The sequence of the forward spacer molecule (SEQ ID NO: 7) generated by the restriction cleavage is as follows (137 bp, the segment marked grey corresponds to the forward amplification primer, the segment in white letters on a black background corresponds to the M13F sequencing primer, and the underlined segment corresponds to a sequence complementary to a portion of the KRAS gene):

CCAGCG^TTiaGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTi T GTAAAACGACGGCCA

HjGAATTGTAATACGACTCACTATAGGGCGAATTGGGCCCTCTAGATTATAAGATGA CTGAA TATAAACTTGTGG

The sequence of the reverse spacer molecule (SEQ ID NO: 8) generated by the restriction cleavage is as follows (134 bp, the segment marked grey corresponds to the reverse amplification primer, the segment in white letters on a black background corresponds to the M13R sequencing primer, and the underlined segment corresponds to a sequence complementary to a portion of the KRAS gene):

GT CAATGTAACATCAGAGATTTT^

BBBGTAATACGACTCACTATAGGGGATATCAGCTGGATGGCAAATAATGGTATCGTCAAG GCA

CTCTTGC

Pre-fusion PCR amplification: The samples containing fragmented DNA isolated from a formalin-fixed, paraffin-embedded sample were subjected to pre-fusion PCR amplification. The PCR conditions were identical with those of the reaction used in the preparation of the spacer molecule. The sequences of the forward and reverse primers were GACATGTTCTAATATAGTCACATTTTCATTATT (SEQ ID NO: 9) and TCTGAATTAGCTGTATCGTCAAGG (SEQ ID NO: 10), respectively.

The following 120-bp amplicon (SEQ ID NO: 11) was generated upon the PCR (sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, underlining: sequences corresponding to the template-specific portion of the forward spacer molecule and complementary to the template-specific portion of the reverse spacer molecule, white letters on a black background: sites of the most significant KRAS mutations):
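As a quick sanity check on the pre-fusion primers quoted above, their length and GC fraction can be computed directly from the printed sequences. This is only a descriptive sketch; the helper name `gc_fraction` is hypothetical, and no melting temperature is estimated because the description does not state one.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an upper-case DNA sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


# Primer sequences as printed for the KRAS pre-fusion PCR.
PRIMERS = {
    "SEQ ID NO:9 (forward)": "GACATGTTCTAATATAGTCACATTTTCATTATT",
    "SEQ ID NO:10 (reverse)": "TCTGAATTAGCTGTATCGTCAAGG",
}

summary = {
    name: (len(seq), round(gc_fraction(seq), 3)) for name, seq in PRIMERS.items()
}

for name, (length, gc) in summary.items():
    print(f"{name}: {length} nt, GC fraction {gc}")
```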

GACATGTTCTAATATAGTCAGATTTTgATTAT-TTT ATTATAAGATGACTGAATATAAACTT GTGGTAGTTGGAGCT^T^CGTAGGCAAGAGTGeG T GACGAjTACAGCTAATTCAGA

Fusion of the spacer molecules: The forward and reverse spacer molecules were fused to the amplicon generated in the previous step using PCR. In addition to the amplicons, the PCR solution contained the following components: AccuPrime Taq HiFi DNA polymerase (Invitrogen, 0.02 U/μl), 1x AccuPrime I PCR buffer, a forward primer (ATTAAGTTGGGTAACGCCAGGGTTT) (SEQ ID NO: 12) and a reverse primer (AGATTTTGAGACACGGGCCAGA) (SEQ ID NO: 13) each at a concentration of 0.4 μM, and forward and reverse spacer molecules. When preparing the PCR solution, one fifth of the reaction volume was from the 100x dilution of the PCR solution containing the amplicons and another fifth was from the gel isolate of the spacer molecules. Thus, the PCR solution contained ~0.012 pM forward and reverse spacers. The PCR included the following cycles:

Cycle 1: (1x) Step 1 94°C 2min 00sec

Cycle 2: (5x) Step 1 94°C 0min 30sec

Step 2 57°C 0min 30sec

Step 3 68°C 1min 00sec

Cycle 3: (35x) Step 1 94°C 0min 30sec

Step 2 62°C 0min 30sec

Step 3 68°C 1min 00sec

Cycle 4: (1x) Step 1 68°C 10min 00sec

Cycle 5: (1x) Step 1 4°C ∞
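The fusion reaction described above can be summarised numerically. The sketch below encodes the fusion program as plain data, adds up the programmed hold times (ramping between temperatures and the final 4°C hold are ignored) and works out the overall dilution of the pre-fusion PCR solution in the fusion mix. The data layout and names are assumptions chosen for illustration.

```python
from fractions import Fraction

# (repeats, [(temperature in °C, hold time in seconds), ...]) for Cycles 1 to 4.
FUSION_PROGRAM = [
    (1, [(94, 120)]),
    (5, [(94, 30), (57, 30), (68, 60)]),
    (35, [(94, 30), (62, 30), (68, 60)]),
    (1, [(68, 600)]),
]


def programmed_seconds(program) -> int:
    """Total programmed hold time, excluding ramping and the 4°C hold."""
    return sum(repeats * sum(sec for _, sec in steps) for repeats, steps in program)


# Repeats of the three-step cycles only (Cycles 2 and 3).
amplification_cycles = sum(rep for rep, steps in FUSION_PROGRAM if len(steps) == 3)

# One fifth of the reaction volume comes from a 100x dilution of the pre-fusion PCR.
amplicon_dilution = Fraction(1, 5) * Fraction(1, 100)

total = programmed_seconds(FUSION_PROGRAM)
print(f"amplification cycles: {amplification_cycles}")
print(f"programmed time: {total} s ({total / 60:.0f} min)")
print(f"pre-fusion PCR solution is diluted to {amplicon_dilution} of its original strength")
```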

If the KRAS gene segment of the sample did not contain mutations initially, the following 266-bp fusion product (SEQ ID NO: 14) was generated during the reaction (sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, white italic letters on a black background: sequences corresponding to the M13F sequencing primer and complementary to the M13R primer, underlining: sequences corresponding to the template-specific portion of the forward spacer molecule and complementary to the template-specific portion of the reverse spacer molecule, normal white letters on a black background: sites of the most significant KRAS mutations):

AAXTIAAAGGTTTTGGGGGGTTAAAACCGGCCCCAAGGGGGGTTTT--TTTT CCCC CC AAGGTT CCAACCGGAACC CΓ,TΎWϋdΜMΛiΛι ΛlMΜAΛwΜtM^eβlΒSSβfiSΒRΙΆ ,, ΆΆΆΆΤΤ

TTGGTTAAAATTAACCGGAACCTTCCAACCTTAATTAAGGGGGGCCGGAAAATTTTG GGGGGCCCCCCTTCCTTAAGGAATTTTAATTAAAAGGAATTGGAACCTTGGAAAATTAAT TAAAAAA CCTTTTGGTTGGGGTTAAGGTTTTGGGGAAGGCCTTB^STTB^BICCGGTTAAGGGGCCA AAAGGAAGGTTGGCCCCTTTTGGAACCGGAATTAACCCCAATTTTAATTTTTTGGCCCCA ATT

GCCCGTGTCTCAAAATCT

Sequencing of the fusion product: After the PCR, the fusion product was purified using the Exo-SAP-IT (USB) enzyme mixture, and a bidirectional termination reaction was carried out using the BigDye Terminator v3.1 Sequencing Kit (Applied Biosystems), the M13F primer (TGTAAAACGACGGCCAGT) (SEQ ID NO:15), and the M13R primer (CAGGAAACAGCTATGAC) (SEQ ID NO:16). The product of the termination reaction was purified using the BigDye XTerminator kit (Applied Biosystems), and the DNA sequence was read by capillary electrophoresis performed in an ABI Prism 3130 Genetic Analyzer (Applied Biosystems).

Results:

Sequence analysis was successfully carried out in isolates containing fragmented DNA isolated from 10 formalin-fixed, paraffin-embedded samples.

Example 2: Analysis of Exon 19 of the EGFR gene

Method:

Preparation of the spacer molecules: For the diagnostic system, one forward and one reverse spacer molecule were prepared as in Example 1 with the following modifications. In case of the forward spacer molecules, the following reverse primer (SEQ ID NO:17) was used in the OEP reaction to attach the segment complementary with the EGFR gene downstream of the M13F primer and the additional nucleotides required for the restriction cleavage to the pCR2.1-TOPO vector sequence (grey marking: restriction enzyme binding site, underlining: a sequence identical to the EGFR gene segment, italic letters: a segment complementary to the vector sequence):

GAGTTAACTTTCTCACCTTCTGGGATCCACrAGAGGGCCCAArrCGC

In case of the reverse spacer molecules, the following reverse primer (SEQ ID NO: 18) was used (grey marking: restriction enzyme binding site, underlining: a segment complementary to the EGFR gene segment, italic letters: a sequence identical to the pENTR/H1/TO vector segment):

GACAGCTGCTTTGCTGTGTGGGGGTCArrATTTGCCATCCA CTG

Using the latter primer, a guanine in the pENTR/H1/TO vector was replaced by cytosine in the spacer molecule (white letters on a black background in the previous sequence) to eliminate the CAGCTG restriction cleavage site present in this segment in order to prevent a subsequent cleavage of the spacer molecule.

In case of the forward spacer molecules, the following forward primer (SEQ ID NO:19) was used to introduce the restriction enzyme binding site on the other end of the pCR2.1-TOPO vector sequence (grey marking: restriction enzyme binding site, italic letters: a sequence identical to the vector segment):

CTSTTAACGCGATTAAGTTGGG AACGC

In case of the reverse spacer molecules, the following forward primer (SEQ ID NO:20) was used (grey marking: restriction enzyme binding site, italic letters: a sequence complementary to the pENTR/H1/TO vector segment):

CTC GCTGGTGCAATGTAACATCAGAGATTTTG

In case of the forward and reverse spacer molecules, HpaI and PvuII were used, respectively, for the restriction cleavage.

The sequence of the forward spacer molecule (SEQ ID NO:21) generated by the restriction cleavage is as follows (131 bp, the segment marked grey corresponds to the forward amplification primer, the segment in white letters on a black background corresponds to the M13F sequencing primer, and the underlined segment corresponds to a sequence complementary to a portion of the EGFR gene):

AACGCGATT- PTJGGGTAACGGCAGSSTTFTCCCAGTCACGACGTJTGTAAAACGACGGCCA EHGAATTGTAATACGACTCACTATAGGGCGAATTGGGCCCTCTAGTGGATCCCAGAAGGT GA
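The effect of the restriction cleavage on the OEP primers can be checked with a few lines of code. The sketch below assumes the commonly documented recognition sequences and blunt cut positions of HpaI (GTT^AAC) and PvuII (CAG^CTG); these enzyme properties are general knowledge and are not stated in the description. The primer strings are excerpts: only the leading part of SEQ ID NO:17 and SEQ ID NO:18 that is printed cleanly above is used, and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def blunt_cut(seq: str, site: str, cut_offset: int) -> tuple[str, str]:
    """Cut seq at the first occurrence of site, cut_offset bases into the site."""
    start = seq.find(site)
    if start < 0:
        raise ValueError(f"site {site} not found")
    cut = start + cut_offset
    return seq[:cut], seq[cut:]


# Assumed recognition sequence and blunt cut offset for each enzyme.
ENZYMES = {"HpaI": ("GTTAAC", 3), "PvuII": ("CAGCTG", 3)}

# Excerpts (5' portions only) of the reverse OEP primers of Example 2.
SEQ17_PREFIX = "GAGTTAACTTTCTCACCTTCTGGGATCC"
SEQ18_PREFIX = "GACAGCTGCTTTGCTGTGTGGGGGTCA"

# The part 3' of the cut stays in the spacer; on the spacer's top strand it
# appears as its reverse complement at the 3' end.
_, fwd_retained = blunt_cut(SEQ17_PREFIX, *ENZYMES["HpaI"])
_, rev_retained = blunt_cut(SEQ18_PREFIX, *ENZYMES["PvuII"])

fwd_spacer_end = reverse_complement(fwd_retained)
rev_spacer_end = reverse_complement(rev_retained)

print("forward spacer 3' end:", fwd_spacer_end)
print("reverse spacer 3' end:", rev_spacer_end)
```

The two computed ends agree with the final nucleotides of the spacer sequences reported for this example (SEQ ID NO:21 and SEQ ID NO:22), as far as those sequences are legible.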

GAAAGTT

The sequence of the reverse spacer molecule (SEQ ID NO:22) generated by the restriction cleavage is as follows (133 bp, the segment marked grey corresponds to the reverse amplification primer, the segment in white letters on a black background corresponds to the M13R sequencing primer, the underlined segment corresponds to a sequence complementary to a portion of the EGFR gene, white italic letter on a black background: nucleotides replaced in the vector sequence during the OEP):

CTGGTGCAATGTAACATCAGAGATTtTGAGACAt!GGGCCAG^^

BBB^jGTAATACGACTCACTATAGGGGATATCAGgTGGATGGCAAATAATGACCCCCACA CA

GCAAAGCAG

Pre-fusion PCR amplification: The samples containing fragmented DNA isolated from a formalin-fixed, paraffin-embedded sample were subjected to pre-fusion PCR amplification. The only difference between this PCR and the reaction used in Example 1 was in the primers used. The sequences of the forward and reverse primers were TCTCTGTCATAGGGACTCTGGA (SEQ ID NO:23) and AGCCATGGACCCCCACAC (SEQ ID NO:24), respectively. The following 147-bp amplicon (SEQ ID NO:25) was generated upon the PCR (sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, underlining: sequences corresponding to the template-specific portion of the forward spacer molecule and complementary to the template-specific portion of the reverse spacer molecule, white letters on a black background: sites affected by the most significant mutations of Exon 19 of EGFR):

TCTC-TGTeATAG,GGACJCTG-GATCCCAGAAGGTGAGAAAGTTAAAATTCCCGTCGCT ATCg¾ ^^^j^^^^^^^^QQ^g^^^^^^^^^^QcTCGATGTGAGTTTC^GCT

TTGCTGTGTGGGGGTCCATGGCC

Fusion of the spacer molecules: For the fusion, the same PCR reaction was used as in Example 1 with different spacer molecules. If the EGFR gene segment of the sample did not contain mutations, the following 315-bp fusion product (SEQ ID NO:26) was generated (sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, white italic letters on a black background: sequences corresponding to the M13F sequencing primer and complementary to the M13R primer, underlining: sequences corresponding to the template-specific portion of the forward spacer molecule and complementary to the template-specific portion of the reverse spacer molecule, normal white letters on a black background: sites affected by the most significant EGFR mutations):

TGTAATACGACTCACTATAGGGCGAATTGGGCCCTCTAGTGGATCCCAGAAGGTGAG AAAGT TAAAATTCCCGTCGCTATCS AGGAATTAAGAGAAGCAACATCTCCGAAAGCCAACAAGGAAA

ICTCGATGTGAGTTTCTGCTTTGCTGTGTGGGGGTCATTATTTGCCATCCACCTGAT ATCC

CCTATAGTGAGTCGTATTAC TGGTCATAGCTGTTTCCT gGCAGCTC.TGGCCCGTGTCT ' CAA

Sequencing of the fusion product: Sequences were analysed as described in Example 1.

Results:

Sequence analysis was successfully carried out in 10 isolates containing fragmented DNA isolated from formalin-fixed, paraffin-embedded samples.

Example 3: Analysis of Exon 21 of the EGFR gene

Method:

Preparation of the spacer molecules: For the diagnostic system, one forward and one reverse spacer molecule were prepared as in Example 1 with the following modifications. In case of the forward spacer molecules, the following reverse primer (SEQ ID NO:27) was used in the OEP reaction to attach the segment complementary with the EGFR gene downstream of the M13F primer and the additional nucleotides required for the restriction cleavage to the pCR2.1-TOPO vector sequence (grey marking: restriction enzyme binding site, underlining: a sequence identical to the EGFR gene segment, italic letters: a segment complementary to the vector sequence):

GAGCGGCCGCTGGCCAAAATCTGTGATCTTGACATGC CTAGAGGGCCCAATTCGC

In case of the reverse spacer molecules, the following reverse primer (SEQ ID NO:28) was used (grey marking: restriction enzyme binding site, underlining: a segment complementary to the EGFR gene segment, italic letters: a sequence identical to the pENTR/H1/TO vector segment):

CT-TGGCCATGCAGAAGGAGGCAAAGTTGCCATCCAGCTGA TA TCC

In case of the forward spacer molecules, the following forward primer (SEQ ID NO:29) was used to introduce the restriction enzyme binding site on the other end of the pCR2.1-TOPO vector sequence (grey marking: restriction enzyme binding site, italic letters: a sequence identical to the vector segment):

CTGCGGCCGCTGGCCAGCGATTAAGTTGGGTAACGC

In case of the reverse spacer molecules, the following forward primer (SEQ ID NO:30) was used (grey marking: restriction enzyme binding site, italic letters: a sequence complementary to the pENTR/H1/TO vector segment):

GATGGCCAGTGCAATGTAACATCAGAGATTTTG

For the restriction cleavage, MluNI was used for the spacer molecules for both ends. The sequence of the forward spacer molecule (SEQ ID NO:31) generated by the restriction cleavage is as follows (131 bp, the segment marked grey corresponds to the forward amplification primer, the segment in white letters on a black background corresponds to the M13F sequencing primer, and the underlined segment corresponds to a sequence complementary to a portion of the EGFR gene):

C C AGCGAT AAGTTGGGT AAC CCAGG T-TT C C AG T C Ά C.G A ( T IMM3MMMDMF3I BSGAATT GTAATACGACTCACTATAGGGCGAATTGGGCCCT CTAGGCATGT CAAGATCACAG

ATTTTGG

The sequence of the reverse spacer molecule (SEQ ID NO:32) generated by the restriction cleavage is as follows (127 bp, the segment marked grey corresponds to the reverse amplification primer, the segment in white letters on a black background corresponds to the M13R sequencing primer, and the underlined segment corresponds to a sequence complementary to a portion of the EGFR gene):

CCAGTGCAATGTAACATCAGAGA TTTGAGACACGGGCGAG¾

EHB-SGTAATACGACTCACTATAGGGGATATCAGCTGGATGGCAACTTTGCCTCCTTCTG CA TGG

Pre-fusion PCR amplification: The samples containing fragmented DNA isolated from a formalin-fixed, paraffin-embedded sample were subjected to pre-fusion PCR amplification. The only difference between this PCR and the reaction used in Example 1 was in the primers used. The sequences of the forward and reverse primers were GAAAACACCGCAGCATGTC (SEQ ID NO:33) and AAAGCCACCTCCTTACTTTGC (SEQ ID NO:34), respectively. The following 107-bp amplicon (SEQ ID NO:35) was generated upon the PCR (sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, underlining: sequences corresponding to the template-specific portion of the forward spacer molecule and complementary to the template-specific portion of the reverse spacer molecule, white letters on a black background: sites of the most significant mutations of Exon 21 of EGFR):

G-¾AAACAC.CGCAG.CATGTCAAGATCACAGATTTTGGGCBGGCCAAACBGCTGGGTG CGGAAG AGAAAGAATACCATGCAGAAGGAGGCAAAGTAAGGAGGTGGCTTT

Fusion of the spacer molecules: For the fusion, the same PCR reaction was used as in Example 1 with different spacer molecules. If the EGFR gene segment of the sample did not contain mutations, the following 268-bp fusion product (SEQ ID NO:36) was generated (sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, white italic letters on a black background: sequences corresponding to the M13F sequencing primer and complementary to the M13R primer, underlining: sequences corresponding to the template-specific portion of the forward spacer molecule and complementary to the template-specific portion of the reverse spacer molecule, normal white letters on a black background: sites affected by the most significant EGFR mutations):

ATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCAC^AC^T

TGTAATACGACTCACTATAGGGCGAATTGGGCCCTCTAGGCATGTCAAGATCACAGATTT TG GGCBGGCCAAACBGCTGGGTGCGGAAGAGAAAGAATACCA . TGCAGAAGGAGGCAAAGTTGCC ATCCAGCTGATATCCCCTATAGTGAGTCGTATTACBDMMRI^

TGGCCCGTGTCT.CAAAATCT

Sequencing of the fusion product: Sequences were analysed as described in Example 1.

Results:

Sequence analysis was successfully carried out in 10 isolates containing fragmented DNA isolated from formalin-fixed, paraffin-embedded samples.

Example 4: Analysis of the neuraminidase (NA) gene of the H1N1 influenza causing infections between 1977 and 2008

Method:

When designing the analysis of the gene regions of H1N1 influenza influencing pharmaceutical effects, the starting materials included gene sequences of H1N1 influenza viruses from the period between 1977 and 2005. The assay thus established can detect all influenzas from the period between 1977 and 2008. Whereas in the first 3 examples, spacers were fused to both ends of the amplicons (thus, Figure 1 applies to these systems), in the analysis of the influenza neuraminidase gene, spacers were only fused to one end (as illustrated in Figure 4). The other end was subjected to overlap fusion with overlapping sequences between the 2 neuraminidase gene regions generated by primers overhanging into the other region in the pre-fusion PCR amplification.

Method:

Preparation of the spacer molecule: For the diagnostic system, one forward spacer molecule was prepared as in Example 1 with the following modifications. The following reverse primer (SEQ ID NO:37) was used in the OEP reaction to attach the segment complementary with the NA gene downstream of the M13F primer and the additional nucleotides required for the restriction cleavage to the pCR2.1-TOPO vector sequence (grey marking: restriction enzyme binding site, underlining: a sequence identical to the NA gene segment, italic letters: a segment complementary to the vector sequence):

GATT-TAAAAACATCTCCTTTGGAGCC CTAGAGGGCCCAATTCGC

The following forward primer (SEQ ID NO:38) was used to introduce the restriction enzyme binding site on the other end of the pCR2.1-TOPO vector sequence (grey marking: restriction enzyme binding site, italic letters: a sequence identical to the vector segment):

CAT TAAKGCGA TTAAGTTGGGTAACGC

DraI was used for the restriction cleavage. During the gel-isolation, the ~6pg (4 wells) spacer molecule was individually collected into the 1 ml aqueous solution.

The sequence of the spacer molecule (SEQ ID NO:39) generated by the restriction cleavage is as follows (128 bp, the segment marked grey corresponds to the forward amplification primer, the segment in white letters on a black background corresponds to the M13F sequencing primer, and the underlined segment corresponds to a sequence complementary to a portion of the NA gene):

AAA CGA AA TGG TAAOY/!A GCT^

BCHGAATTGTAATACGACTCACTATAGGGCGAATT GGGCCCTCTAGGGCTCCAAAGGAGATGT

TTTT

Pre-fusion PCR amplification: Upon isolation of the RNA, RT-PCR was used. Using Qiagen's OneStep RT-PCR Kit, the reaction solution contained 1x buffer, 400 μM of each dNTP, 0.04 μl/μl enzyme mix, 0.6 μM forward and reverse primer, and 0.1 U/μl RNasin Plus RNase Inhibitor (Promega). The RT-PCR included the following cycles:

Cycle 1: (1x) Step 1 50°C 30min 00sec

Cycle 2: (1x) Step 1 95°C 15min 00sec

Cycle 3: (40x) Step 1 94°C 0min 30sec

Step 2 56°C 0min 30sec

Step 3 72°C 0min 45sec

Cycle 4: (1x) Step 1 72°C 10min 00sec

Cycle 5: (1x) Step 1 4°C ∞

The two NA gene regions were amplified in separate reactions. During the amplification of the first gene region, the following primers were used:

Forward: AAGACAACAGCATAAGAATTGGCTCC (SEQ ID NO:40)

Reverse (in bold italic letters: segment complementary to the other gene region):

CTATTGATTTGGAGCTTCACCTAGAGGACAGCTCATTA (SEQ ID NO:41)

During the amplification of the second gene region, the following primers were used:

Forward (in bold italic letters: segment complementary to the first gene region):

GGTGAAGCTCCAAAT C AAT AG AGT G AA GC AC C C AAT (SEQ ID NO:42)

Reverse: GGATTGTCACCGAACACTCCACT (SEQ ID NO:43)
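The overlap that allows the two NA amplicons to fuse can be read off the overhanging primers listed above. The sketch below takes the reverse primer of the first region (SEQ ID NO:41) and compares its reverse complement with the beginning of the forward primer of the second region (SEQ ID NO:42). Only the first 21 nucleotides of SEQ ID NO:42 are used, because the remainder of that sequence is not printed cleanly; this excerpting, and the helper names, are assumptions of the sketch.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of upstream that is a prefix of downstream."""
    for k in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-k:] == downstream[:k]:
            return k
    return 0


SEQ41_REVERSE = "CTATTGATTTGGAGCTTCACCTAGAGGACAGCTCATTA"
SEQ42_FORWARD_PREFIX = "GGTGAAGCTCCAAATCAATAG"  # excerpt: first 21 nt only

# 3' end of the first amplicon's top strand, as defined by its reverse primer.
first_amplicon_end = reverse_complement(SEQ41_REVERSE)

overlap = suffix_prefix_overlap(first_amplicon_end, SEQ42_FORWARD_PREFIX)
print("3' end of first amplicon:", first_amplicon_end)
print("overlap with second amplicon:", overlap, "nt")
```

In this reading, the end of the first amplicon and the start of the second amplicon share a stretch through which the two products can prime on each other during the fusion PCR.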

The following 206-bp amplicon (SEQ ID NO:44) was generated upon amplifying the first gene region (using the RNA of the Solomon Islands/3/2006 virus as template; sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, in bold italic letters: a segment complementary to the other gene region; underlining: sequences corresponding to the template-specific portion of the spacer molecule, white letters on a black background: sites of the most significant mutations causing resistance to neuraminidase inhibitors):

AAGACAACAGCAT AGAATTGGCTC C AAAG GAG AT GT T T T T GT CATAAGA- C C T T T C AT A

TCATGTTCTCACTTGGAATGCAGAACCTTTTTTCTGACCCAAGGTGCTCTATTAAAT GACAA ACATTCAAATGGGACCGTAAAGg^^^GTCCTTATAGGACCTTAATGAGCTGTCCTCTAG GTGAAGCTCCAAATCAATAG

The following 196-bp amplicon (SEQ ID NO.45) was generated upon amplifying the second gene region (using the RNA of the Solomon Islands/3/2006 virus as template; sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, in bold italic letters: a segment complementary to the other gene region; white letters on a black background: sites of the most significant mutations causing resistance to neuraminidase inhibitors):

SgTgAA^TC^AATCAATAGAGTTGAATGCACCCAATTTTBraaTATGAGGAATGTTr.GT GT TACCCAGACACTGGCACAGTGATGTGTGTATGC¾gGAC¾¾TGGCATGGTTCAAATCG ACC TTGGGTGTCTTTTAATCAAAACTTGGATTATCAAATAGGATACATCTGCAGTGGAGTGTT CG GTGACAATCC

Fusion of the spacer molecule and the amplicons of the two gene regions: The PCR solution for the fusion contained AccuPrime Taq HiFi DNA polymerase (Invitrogen, 0.02 U/μl), 1× AccuPrime I PCR buffer, and a forward primer (SEQ ID NO:46) (ATTAAGTTGGGTAACGCCAGGGTTT) and a reverse primer (SEQ ID NO:47) (CACTCCACTGCAGATGTATCCTATTT), each at a concentration of 0.4 μM. One fifth of the reaction volume came from the 100× dilution of the PCR solution containing the amplicons of one gene region, another fifth from the 100× dilution of the PCR solution containing the amplicons of the other gene region, and a third fifth from the gel isolate of the spacer molecules. When the RNA of the Solomon Islands/3/2006 virus was used as template and the sample did not contain NI resistance mutations, the following 448-bp fusion product (SEQ ID NO:48) was generated (sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, white italic letters on a black background: sequences corresponding to the M13F sequencing primer, underlining: sequences corresponding to the template-specific portion of the forward spacer molecule, normal white letters on a black background: sites affected by the most significant mutations causing resistance to neuraminidase inhibitors):

ATTAAG TGGGT ACGCCAGGG T TCCCAGTCACGACGT^ee S ^raGAAT

TGTAATACGACTCACTATAGGGCGAATTGGGCCCTCTAGGGCTCCAAAGGAGATGTT TTTGT

CATAAGA^^CCTTTCATATCATGTTCTCACTTGGAATGCAGAACCTTTTTTCTGACC CAAG GTGCTCTATTAAATGACAAACATTCAAATGGGACCGTAAAG^^^AGTCCTTATAGGACC

TTAATGAGCTGTCCTCTAGGTGAAGCTCCAAATCAATAGAGTTGAATGCACCCAATT TTg53 TATGAGGAATGTTCCTGTTACCCAGACACTGGCACAGTGATGTGTGTATGC¾^GAC¾ TG GCATGGTTCAAATCGACCTTGGGTGTCTTTTAATCAAAACTTGGATTATCAAATAGGATA CA

TCTGCAGTGGAGTG

Sequencing of the fusion product: The forward and reverse termination reactions were performed using the M13F primer and the reverse primer used during the fusion of the product, respectively. Other conditions of the sequence analysis were identical to those described in Example 1.

Results:

The efficiency of the sequence analysis was confirmed in 4 samples containing influenza viruses. Samples 1 and 2 were sputum samples (Samples ATL and AT according to our nomenclature) taken at the beginning of 2008 from 2 patients infected during an influenza epidemic (caused by the Solomon Islands virus variant of 2006). Samples 3 and 4 were Fluval AB and Fluval P injections (Omnivest), respectively; these are purified, concentrated and formalin-inactivated suspensions of three influenza strains and a single strain, respectively, all propagated in embryonated hen's eggs. Fluval AB contains an H1N1 strain from 2007 (A/Brisbane/59/2007), an H3N2 strain from 2007 (A/Uruguay/716/2007) and an influenza B strain from 2008. Fluval P contains the new type H1N1 influenza A virus (A/California/7/2009).

Whereas the pre-fusion RT-PCR of the influenza virus from 2009 was unsuccessful, it was efficient in all samples containing H1N1 viruses isolated in or before 2008 (Figure 5). Fusion was successful in all three cases. In the case of the Fluval AB sample, the sequence of the fused gene segments was identical to the base sequence of the corresponding region of the neuraminidase gene of A/Brisbane/59/2007 downloaded from the NCBI database. Samples ATL and AT showed high homology with the neuraminidase gene of A/Solomon Islands/3/2006, again downloaded from the NCBI database; only a few nucleotides differed, and none of them affected the hot spots associated with resistance to neuraminidase inhibitors.
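The comparison against a reference sequence described above can be sketched as follows. This is a minimal illustration, assuming the two sequences are already aligned to the same length; the strings and the hot-spot interval are toy values chosen for the example, not data from the analysis, and the function names are hypothetical.

```python
def mismatches(reference, sample):
    """Return 0-based positions where two pre-aligned, equal-length sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def hits_in_hot_spots(positions, hot_spots):
    """Keep the positions that fall inside any half-open (start, end) interval."""
    return [p for p in positions if any(start <= p < end for start, end in hot_spots)]


# Toy strings and a toy codon interval, for illustration only.
reference = "ATGCATGCATGC"
sample = "ATGAATGCATGT"
hot_spots = [(6, 9)]

diffs = mismatches(reference, sample)
print(diffs)
print(hits_in_hot_spots(diffs, hot_spots))
```

An empty second list corresponds to the situation reported for Samples ATL and AT: differences exist, but none of them falls inside a resistance-associated site.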

Example 5: Analysis of the neuraminidase (NA) gene of the new type H1N1

Upon analysing the neuraminidase gene of 337 variants of the new type H1N1 influenza that emerged in 2009, significant differences were found in comparison with the previous variants. Therefore, using the 337 gene sequences, an optimised assay was devised for the detection of the new type H1N1 influenza, in which only the primers and the template-specific parts of the spacer contain a few nucleotides that differ from those used in the analysis described in Example 1.

Method:

Preparation of the spacer molecule: The following reverse primer (SEQ ID NO:49) was used in the OEP reaction to attach the segment complementary to the NA gene downstream of the M13F primer and the additional nucleotides required for the restriction cleavage to the pCR2.1-TOPO vector sequence (grey marking: restriction enzyme binding site, underlining: a sequence identical to the NA gene segment, italic letters: a segment complementary to the vector sequence):

GATTTAAACACATCCCCCTTGGAACCCTAGAGGGCCCAATTCGC

The remaining steps of preparing the primer and spacer molecule used to form the restriction enzyme binding site on the other end of the pCR2.1-TOPO vector sequence are identical to those described in Example 4.

The sequence of the spacer molecule (SEQ ID NO:50) generated by the restriction cleavage is as follows (128 bp, the segment marked grey corresponds to the forward amplification primer, the segment in white letters on a black background corresponds to the M13F sequencing primer, and the underlined segment corresponds to a sequence complementary to a portion of the NA gene):

AAAGCGATTAAG TGGGTAACGCCAGGGTTTT^

BGGAATTGTAATACGACTCACTATAGGGCGAATT GGGCCCTCTAGGGTTCCAAGGGGGATGT

GTTT

Pre-fusion PCR amplification: The only difference between the RT-PCR carried out on the RNA isolate and the reactions used in Example 1 was in the primers used.

During the amplification of the first gene region, the following primers were used:

Forward: AAAGACAACAGTATAAGAATCGGTTCCA (SEQ ID NO:51)

Reverse (in bold italic letters: segment complementary to the other gene region):

CGACTGATTTGGAACTTCACCAATAGGACAGCTCATTA (SEQ ID NO:52)

During the amplification of the second gene region, the following primers were used: Forward (in bold italic letters: segment complementary to the first gene region):

GGTGAAGTTCCAAATCAGTCGAAATGAATGCCCC (SEQ ID NO:53)

Reverse: GGATTGTCTCCGAAAATCCCACT (SEQ ID NO:54)

The following 207-bp amplicon (SEQ ID NO:55) was generated upon amplifying the first gene region (using the RNA of the California/07/2009 virus as template; sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, in bold italic letters: a segment complementary to the other gene region; underlining: sequences corresponding to the template-specific portion of the spacer molecule, white letters on a black background: sites of the most significant mutations causing resistance to neuraminidase inhibitors):

AAAGACAACAGTGTAAGAATCGGTTCCAAGGGGGATGTGTTTGTCATAAGG^ECCATTCA T ATCATGCTCCCCCTTGGAATGCAGAACCTTCTTCTTGACTCAAGGGGCCTTGCTAAATGA CA AACATTCCAATGGAACCATTAAA@3^^AGCCCATATCGAACCCTAATGAGC GTCCFATT

GGTGAAGTTGGAAATCAGTCG

The following 196-bp amplicon (SEQ ID NO:56) was generated upon amplifying the second gene region (using the RNA of the California/07/2009 virus as template; sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, in bold italic letters: a segment complementary to the other gene region; white letters on a black background: sites of the most significant mutations causing resistance to neuraminidase inhibitors):

' G*> &!FGAflSTTCC ^ ■■■■■

TATCCTGATTCTAGTGAAATCACATGTGTGTGC^¾GAT¾^TGGCATGGCTCGAATCGA CC GTGGGTGTCTTTCAACCAGAATCTGGAATATCAGATAGGATACATATGCAGTGGGATTTT CG GAGACAATCC

Fusion of the spacer molecule and the amplicons of the two gene regions: The only difference between this fusion PCR and the one used in Example 4 was the reverse primer (AATCCCACTGCATATGTATCCTATCT) (SEQ ID NO:57). When the RNA of the California/07/2009 virus was used as template and the sample did not contain NI resistance mutations, the following 448-bp fusion product (SEQ ID NO:58) was generated (sequences with grey marking correspond to the forward primer and are complementary to the reverse primer, white italic letters on a black background: sequences corresponding to the M13F sequencing primer, underlining: sequences corresponding to the template-specific portion of the forward spacer molecule, normal white letters on a black background: sites affected by the most significant mutations causing resistance to neuraminidase inhibitors):

ATTAAfl TftflftfrAACflCCAfiftflT^^

TGTAATACGACTCACTATAGGGCGAATTGGGCCCTCTAGGGTTCCAAGGGGGATGTGTTT GT CATAAGG^^CCATTCATATCATGCTCCCCCTTGGAATGCAGAACCTTCTTCTTGACTCAA G GGGCCTTGCTAAATGACAAACATTCCAATGGAACCATTAAA ^^BAGCCCATATCGAACC CTAATGAGCTGTCCTATTGGTGAAGTTCCAAATCAGTCGAAATGAATGCCCCTAATTATB BBI TATGAGGAATGCTCCTGTTATCCTGATTCTAGTGAAATCACATGTGTGTGC^¾GAT^¾ TG

GCATGGCTCGAATCGACCGTGGGTGTCTTTCAACCAGAATCTGGAATATCSGATAGG ATACA TATGCAGTGGG Tl

Sequencing of the fusion product: The forward and reverse termination reactions were performed using the M13F primer and the reverse primer used during the fusion of the product, respectively. Other conditions of the sequence analysis were identical to those described in Example .

Results:

The method for analysing the neuraminidase gene regions of the new type H1N1 was tested with the RNA isolated from the Fluval P vaccine (Omnivest). The vaccine contains the new type virus A/California/7/2009 in an inactivated form. The assay also worked efficiently in this case (Figure 6). The virus-specific sequences of the fusion product were identical to the base sequence of the gene regions corresponding to the neuraminidase gene of A/California/7/2009 downloaded from the NCBI database.

Example 6: Reduced allele discrimination (allele bias) during the analysis of Exon 21 of the EGFR gene using the method of the invention

Reduced allele bias and the resulting increased sensitivity of the inventive method were confirmed using the following model.

Method:

DNA was isolated from an FFPE sample with a tumour ratio of 30% according to the pathologist's opinion. A point mutation (2573T>G=L858R) of Exon 21 of the EGFR gene was detected by sequence analysis after 2 independent (conventional) PCR reactions (the frequency of the mutant allele was 50%). The sample was diluted to the operational limit of the conventional one-step PCR (10× dilution). The diluted sample was subjected to 10 mutation analyses using a conventional method (a normal PCR followed by a nested PCR) and to another 10 using the method of the invention.

During the conventional PCR, the following primers were used for the first amplification: forward: TACTTGGAGGACCGTCGCTTG (SEQ ID NO:59), reverse: GGTCCCTGGTGTCAGGAAAAT (SEQ ID NO:60). After the reaction, the solution was diluted 25× and used in a volume corresponding to one fifth of the nested PCR mixture. In the nested PCR, the following primers were used: forward: AGCCAGGAACGTACTGGTGAAAAC (SEQ ID NO:61), reverse: GCTGGCTGACCTAAAGCCACCT (SEQ ID NO:62). As regards other parameters, both PCR solutions were identical to those described in Example 1, Preparation of the spacer molecule. For the analyses performed using the new method, the procedure described in Example 3 was followed. The sequences of the amplicons prepared using the two methods are shown in Figure 7. Sequencing of the end-products of the two methods gave identical results. The signal intensities of the allele variants were derived from the electropherogram values.

Results:

The results of the comparative analysis are presented in Figure 7. In the conventional nested PCR, 8 of the 10 test tubes contained gel-detectable products after the first PCR, and all 10 test tubes contained gel-detectable products after the nested PCR. When the new method was used, all 10 test tubes contained gel-detectable products after the pre-amplification PCR. In Step 2, all 10 products were successfully fused. The 10+10 products from the 2 methods were successfully sequenced from both directions. The conventional method detected mutations in 5 of the 10 products (with an allele ratio of 100% in 4 cases), corresponding to a sensitivity of 50%. The average allele ratio of the mutant signal was 42%, with an SD of 50%. The new method revealed mutations in 9 of the 10 cases, corresponding to a sensitivity of 90%. Whereas the average allele ratio of the mutant signal was again 42%, the corresponding SD was only 23%.
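The reported figures for the conventional method can be cross-checked with a short calculation. This is a sketch under stated assumptions: the allele ratio of the fifth positive tube is not given in the text and is set to 20% here because that value reproduces the stated mean of 42%; the SD is assumed to be the sample standard deviation (n − 1). The individual values for the new method are not reported, so only its sensitivity is computed.

```python
from statistics import mean, stdev

# Conventional nested PCR, mutant allele ratio (%) per tube:
# 4 tubes at 100 %, 5 tubes with no mutant signal, and one positive tube
# whose value is NOT reported; 20 % is an inferred, assumed value.
conventional = [100, 100, 100, 100, 20, 0, 0, 0, 0, 0]


def sensitivity(detected, total):
    """Percentage of analyses in which the mutation was detected."""
    return 100 * detected / total


print("conventional sensitivity:", sensitivity(5, 10))
print("new method sensitivity:", sensitivity(9, 10))
print("conventional mean:", mean(conventional))
print("conventional sample SD:", round(stdev(conventional), 1))
```

Under these assumptions the sample SD rounds to the reported 50%, which suggests that the conventional method mostly returned all-or-nothing signals, whereas the lower SD of the new method reflects allele ratios clustered closer to the mean.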