Title:
5' AND 3' POLYMERASE CHAIN REACTION WALKING FROM KNOWN DNA SEQUENCES
Document Type and Number:
WIPO Patent Application WO/1992/013104
Kind Code:
A1
Abstract:
The present invention is directed to a method of polymerase chain reaction (PCR) walking for generating vector free libraries of genomic DNA from which flanking sequences of known DNA sequences can be cloned and amplified. This method of PCR walking involves using unphosphorylated or phosphorylated DNA linkers which will ligate only at the 5' ends, blocking the unligated 3' ends, and specifically priming the synthesis of the desired flanking sequence with a primer complementary to the known DNA sequence. Cloning DNA sequences located 5' to the zeta-globin promoter region by using a polymerase chain reaction walking library is presented as an example.

Inventors:
Wu, Kun C.
Deisseroth, Albert B.
Application Number:
PCT/US1992/000532
Publication Date:
August 06, 1992
Filing Date:
January 22, 1992
Assignee:
BOARD OF REGENTS, THE UNIVERSITY OF TEXAS SYSTEM.
International Classes:
C12Q1/68; (IPC1-7): C12N15/10; C12Q1/68
Foreign References:
EP0356021A2 1990-02-28
Other References:
BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS vol. 167, no. 2, 16 March 1990, DULUTH, MN US pages 504 - 506; M. KALMAN ET AL.: 'Polymerase chain reaction (PCR) amplification with a single specific primer'
GENE. vol. 84, 1989, AMSTERDAM NL pages 1 - 8; V. SHYAMALA ET AL: 'Genome walking by single-specific-primer polymerase chain reaction: SSP-PCR' cited in the application
Claims:
CLAIMS:
1. A method for cloning DNA sequences flanking a known DNA sequence comprising: a) digesting DNA containing a known DNA sequence with at least one restriction endonuclease to form double stranded DNA fragments, each strand having a 5' end and a 3' end; b) annealing double stranded DNA linkers, asymmetrical in length, to the double stranded DNA fragments produced in step (a), each DNA linker having a first strand and a second strand; c) ligating the first strand of each DNA linker to the 5' end of each double stranded DNA fragment, and leaving the second strand of the DNA linker unligated to the 3' end of the double stranded DNA fragment; d) blocking the unligated second strand of each DNA fragment; e) denaturing the double stranded DNA fragment produced in step (d) to produce single stranded DNA fragments; f) annealing a DNA primer to the single stranded DNA fragments, each DNA primer having a 3' and 5' end and each DNA primer being complementary to at least a portion of known DNA sequences found in the single stranded DNA fragment; g) adding a polymerase enzyme to initiate a polymerase chain reaction from the 3' end of the annealed DNA primer towards the 5' end of the single stranded DNA fragment and generate a complementary strand to the single stranded DNA fragment; and h) denaturing the complementary strand from the single stranded DNA fragment.
2. The method of claim 1 wherein, in digesting step (a), the restriction endonuclease has specificity for at least one frequently recognized restriction site.
3. The method of claim 1 wherein, in digesting step (a), the restriction endonuclease is selected from the group consisting of Sau3A, EcoRI, TaqI, HpaII, HhaI, RsaI, or MspI or a combination thereof.
4. The method of claim 1 wherein blocking step (d) comprises incorporating at least one dideoxyribonucleotide to the 3' end of the double stranded DNA fragment.
5. The method of claim 1 wherein the double stranded DNA linkers are unphosphorylated.
6. The method of claim 1 wherein the double stranded DNA linkers are phosphorylated.
7. The method of claim 1 wherein, in annealing step (b), the DNA linkers are added in excess of the number of double stranded DNA fragments.
8. The method of claim 7 wherein the ratio of DNA linkers to double stranded DNA fragments is about 50 : 1.
9. The method of claim 1 wherein, in annealing step (b), each DNA linker comprises a first strand of about 18 to 30 mers in length and a second strand of about 18 to 30 mers in length, plus an additional portion when compared to the length of the first strand, the additional portion on the second strand being complementary to the sticky 5' ends of the double stranded DNA fragment.
10. The method of claim 9 wherein, in annealing step (b), each DNA linker comprises a 20 mer for the first strand and a 24 mer for the second strand.
11. The method of claim 5 wherein the unphosphorylated double stranded DNA linker further comprises a GATC cohesive end.
12. The method of claim 1, step (g), wherein said polymerase enzyme is Taq polymerase.
13. A method of forming a vector-free DNA library comprising: a) digesting DNA with at least one restriction endonuclease to form double stranded DNA fragments, each fragment having a 5' end and a 3' end; b) annealing double stranded DNA linkers, asymmetrical in length, to the double stranded DNA fragments, each DNA linker having a cohesive end specific to restriction enzyme sites generated by the restriction endonuclease employed in step (a); c) ligating a strand of the DNA linker to the 5' end of the double stranded DNA fragment; and d) blocking the unligated 3' end of the double stranded DNA fragments.
14. The method of claim 13 wherein, in digesting step (a), the restriction endonuclease has specificity for at least one frequently recognized restriction site.
15. The method of claim 13 wherein, in digesting step (a), the restriction endonuclease is selected from a group consisting of Sau3A, EcoRI, TaqI, HpaII, HhaI, RsaI, or MspI or a combination thereof.
16. The method of claim 13 wherein, in annealing step (b), each double stranded DNA linker is phosphorylated.
17. The method of claim 13 wherein, in annealing step (b), each double stranded DNA linker is unphosphorylated.
18. The method of claim 13 wherein, in annealing step (b), each double stranded DNA linker, asymmetrical in length, comprises a first strand of 18 to 30 mers in length and a second strand of 18 to 30 mers in length plus an additional portion when compared to the length of the first strand, and this additional portion on the second strand is complementary to the sticky ends of the double stranded DNA fragment formed by the restriction endonuclease, such that each DNA linker will anneal to the sticky ends of the DNA fragments.
19. The method of claim 15 wherein each phosphorylated DNA linker comprises a 20 mer for the first strand and a 24 mer strand for the second strand.
20. The method of claim 16 wherein each unphosphorylated DNA linker comprises a 20 mer for the first strand and a 24 mer strand for the second strand.
21. The method of claim 17 wherein the unphosphorylated DNA linker further comprises a GATC cohesive end.
22. The method of claim 13 wherein, in annealing step (b), the ratio of DNA linkers to double stranded DNA fragments is about 50 DNA linkers to 1 double stranded DNA fragment.
23. The method of claim 13 wherein blocking step (d) comprises incorporating at least one dideoxyribonucleotide to the 3' end of the double stranded DNA fragment.
24. A method for cloning a double stranded DNA fragment containing a known DNA sequence and having a DNA linker ligated to the 5' end and a blocked 3' end comprising: a) denaturing the double stranded DNA fragment to yield resulting single strands; b) admixing the resulting single strands from step (a) with a specific primer complementary to a portion of the single stranded DNA fragment and Taq polymerase, thereby initiating a polymerase chain reaction to form complementary strands.
25. A vector-free DNA library comprising double stranded DNA having a 5' end ligated to a DNA linker and a blocked 3' end.
26. The vector-free DNA library of claim 25 wherein each 3' end is blocked with dideoxyribonucleotides.
Description:
5' AND 3' POLYMERASE CHAIN REACTION WALKING FROM KNOWN DNA SEQUENCES

In general, this invention is related to DNA libraries and describes methods for producing DNA libraries from different organisms, as well as for cloning and amplifying DNA sequences that flank known genes. More particularly, the instant invention concerns a method of rapid polymerase chain reaction walking.

Polymerase chain reaction (PCR) involves an enzymatic amplification of a specific DNA segment defined by two DNA primers having opposite polarity and flanking the region to be amplified. Repeated cycles of denaturing, annealing, and elongating, in the presence of thermostable Taq polymerase, result in the amplification and accumulation of the target sequence up to a million fold. This technique has been applied to aid diagnosis of genetic diseases, to detect infectious pathogens, to study activated oncogenes in tumors, and to analyze allelic sequence variation or genetic polymorphism. For general information and a protocol regarding polymerase chain reaction, see United States Patent No. 4,965,188, herein incorporated by reference.

Despite the wide use of the conventional polymerase chain reaction, the conventional method is of limited use when one desires to clone uncharacterized DNA sequences flanking a known region, because only one DNA primer can be designed from a known DNA region. A one-primer polymerase chain reaction generates only a linear increase in the number of copies, whereas when two primers are employed there is an exponential increase in the number of copies.
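The difference between linear and exponential accumulation can be sketched numerically. The following is an idealized illustration only, assuming perfect per-cycle efficiency by default; the function names and the `efficiency` parameter are hypothetical and do not come from the application.

```python
def copies_one_primer(template_copies: int, cycles: int) -> int:
    """One primer: every cycle copies only the original templates,
    so the total grows linearly with the number of cycles."""
    return template_copies * (1 + cycles)


def copies_two_primers(template_copies: int, cycles: int,
                       efficiency: float = 1.0) -> float:
    """Two opposing primers: the products of one cycle serve as templates
    in the next, so the total grows exponentially."""
    return template_copies * (1 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, copies_one_primer(1, n), copies_two_primers(1, n))
```

Under these assumptions, 20 cycles with two primers give roughly a million copies per starting molecule, whereas a single primer gives only 21.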

Thus, a polymerase chain reaction method that requires only one specific DNA primer, and exponentially increases the number of copies generated of an uncharacterized DNA sequence would substantially facilitate the cloning and sequencing of unknown DNA molecules.

Polymerase chain reaction-based methods using one defined DNA primer to isolate unknown sequences have been previously described. For example, asymmetrical PCR and one-sided PCR are techniques that have been used to clone uncharacterized and variable messenger RNAs (Loh, Science 243:217-220 (1989)). Yet problems still remain because these methods do not supply a simple way of linking an artificial DNA primer to one end of a double stranded DNA, due to the symmetrical nature of the DNA molecule itself.

To partially circumvent this problem, two modified PCR methods have been used. First, inverse polymerase chain reaction has been used to clone cellular DNA flanking sequences from an inherited murine ecotropic provirus and to isolate DNA sequences flanking the Insertion Sequence I element from the E. coli genome (Silver and Keerikatte, "Novel Use of Polymerase Chain Reaction To Amplify Cellular DNA Adjacent to an Integrated Provirus", J. Virol. 63:1924-1928 (1989) and Ochman et al., "Genetic Applications of an Inverse Polymerase Chain Reaction", Genetics 120:621-623 (1988)). Inverse polymerase chain reaction involves restriction enzyme digestion and subsequent self-circularization of genomic DNA fragments. The circular DNA product is then used as a template for DNA amplification using two DNA primers designed from a known DNA region. Unfortunately, for efficient amplification of a targeted DNA sequence, inverse polymerase chain reaction requires digestion at a specific site between the two DNA primers, and the existence of this same restriction site in the DNA region to be amplified would abolish the amplification process.

The second method is the single-specific primer polymerase chain reaction (Shyamala and Ames, "Genome walking by single-specific-primer polymerase chain reaction: SSP-PCR", Gene 84:1-8 (1989)). This method involves linking a generic vector to a digested DNA fragment; a specific DNA primer complementary to the vector is then used to initiate the polymerase chain reaction cycle. Although single-specific primer polymerase chain reaction has been used to take chromosomal walks in the histidine transport operon in Salmonella typhimurium, it is unclear whether this procedure would be sensitive enough to detect and amplify mammalian genomic sequences. Thus, the applicability of single-specific primer polymerase chain reaction to cloning a desired gene from mammalian genomes is questionable.

Described herein are methods that eliminate existing problems in the area of amplification and cloning of desired DNA sequences that otherwise would not have been cloned and subsequently sequenced. This invention describes a method for producing a vector-free DNA library, as well as a method to clone and amplify DNA sequences adjacent to a specific known DNA primer.

The present invention has several advantages over the standard cloning procedures. First, making the vector-free library is significantly less tedious than making a lambda or cosmid library because a vector-free library only involves linking a DNA oligonucleotide to the 5' end of the DNA fragment, blocking the 3' OH end by dideoxyribonucleotide and removing the linker oligo.

Second, the method provides for rapid isolation of a flanking sequence once the library has been made. It takes about five days to amplify the DNA fragment from the library, blot to the nitrocellulose filter, hybridize the internal probe, isolate the fragment from the gel, and sequence the fragments.

Third, it is well known that some genomic DNA sequences are preferentially deleted or rearranged in a cosmid or lambda vector because of the incompatibility of those DNA sequences in the host cell. The present invention can be used to resolve the ambiguities of the DNA sequence resulting from the DNA rearrangement by cloning and sequencing the region directly without resorting to making another vector-based library.

Also compared to the previous procedures, only a small quantity of DNA is needed for cloning of a flanking region. This is especially useful in the cloning of rearranged regions in tumors identified with translocated breakpoints or deletions where the amount of DNA isolated from a patient sample is usually limited.

The present invention does not contain the limitations of inverse polymerase chain reaction (IPCR). IPCR requires a rare cutter site between two primers for amplification to occur efficiently. It is well known in the art that the choice of primers is the limiting step in using PCR to clone DNA sequences because of insufficient data on the effect of DNA sequences on the annealing kinetics. In order for the amplification to be successful, optimization of the PCR is required and can be time consuming. Another advantage of the present invention over IPCR is that only one primer need be tested instead of two.

Moreover, with IPCR, each amplified DNA fragment is cloned and part of the sequence has already been identified in the previous step. Consequently, IPCR is inefficient because identified sequences are amplified along with unidentified portions.

On the other hand, with the present invention, two directions of walking can be performed at the same time, to generate DNA sequences in both directions in the same or separate reactions.

The present invention has an advantage over SSP-PCR because the DNA linker in the present invention can be easily incubated in excess of the genomic fragment to prevent ligation among the genomic fragments without increasing the viscosity of the incubation mix. Furthermore, the DNA linker can be removed before the cycles of the amplification to minimize the background.

This method can be used in various applications, such as cloning the region upstream of the 5' end of a cDNA or downstream of the 3' end of the cDNA; intron/exon junctions in the genome, using a primer complementary to the exonic sequence; transposons, where the end sequence of the transposon is usually available; and translocated breakpoints or deletion junctions, when the region of the fused locus has been pinpointed.

Furthermore, as part of the human genome project initiative, the present invention can be carried out repeatedly, allowing more distant sequences to be determined, or the strategy can be modified to make a PCR-based chromosomal jumping library. The use of such a jumping library would enable construction of a physical chromosomal map with known sequences localized every 200 kb on average along the human genome.

The present invention provides a method for generating a vector-free DNA library and for cloning and amplifying DNA sequences adjacent to a specific DNA primer of known sequence. With the instant invention, vector-free DNA libraries may be produced from any source of DNA. The invention relates to the production of vector-free DNA libraries representing different organisms. The invention can be used to directly clone and amplify DNA sequences from unknown regions of genomes by using an artificially linked unphosphorylated 5' DNA linker and a specific primer that is complementary to at least part of a known DNA region as primers to amplify a PCR-based walking DNA library.

This method can also produce an additional 1000 bases (on the average) of DNA flanking a known cloned sequence every 72 hours. The method of the invention does not use cloning vectors. One embodiment of PCR walking involves using unphosphorylated DNA linkers which will ligate only at the 5' ends, blocking the unligated 3' ends by dideoxyribonucleotide triphosphate, and specifically priming the synthesis of the desired flanking sequence with a primer complementary to the known sequence.
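As a rough planning aid, the stated average yield (about 1000 bases per 72-hour round) can be turned into a walking-time estimate. This sketch assumes that rounds are run strictly one after another and that each round delivers the stated average; the function names are hypothetical.

```python
import math


def rounds_needed(distance_bases: int, bases_per_round: int = 1000) -> int:
    """Number of sequential walking rounds needed to cover a distance."""
    return math.ceil(distance_bases / bases_per_round)


def walking_time_days(distance_bases: int, bases_per_round: int = 1000,
                      hours_per_round: float = 72.0) -> float:
    """Elapsed time in days, assuming rounds are performed back to back."""
    return rounds_needed(distance_bases, bases_per_round) * hours_per_round / 24.0


if __name__ == "__main__":
    print(rounds_needed(10_000), walking_time_days(10_000))
```

With these figures, walking 10 kb from a known sequence would take ten rounds, or about 30 days.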

It will be appreciated that significant advantages are provided in generating vector-free DNA libraries and cloning and amplifying DNA sequences flanking known DNA sequences according to the present invention. In particular, this invention allows for the automation of amplification and sequencing of DNA sequences. This method of PCR walking can generate vector-free DNA libraries from virtually all organisms. Once a vector-free DNA library is generated, DNA sequences flanking a known gene can then be cloned and sequenced.

In general, using vector-free DNA libraries, the instant invention allows isolation of DNA sequences from regions with no known restriction site. The need to isolate DNA from such regions is substantial, and the ability to do so is correspondingly valuable.

Thus, the method of this invention and the libraries that are produced by these methods have applications in the fields of medicine, as well as in the area of biological sciences. The methods described in this invention have applications in areas including agriculture, food production, diagnostics, therapeutics, and personal care products.

More specifically, the present invention provides a method for generating a genomic DNA library from an organism and for cloning and amplifying DNA sequences flanking known DNA sequences.

For the purposes of this invention, an organism refers to any source from which an investigator might want to clone and sequence DNA. For example, one could clone and sequence DNA for a number of different genes, isolated from a variety of different sources, including all eukaryotes and all prokaryotes. For the purpose of this invention, a eukaryote is defined as an organism with cells that have nuclear membranes, membrane-bound organelles, 80S ribosomes, and characteristic biochemistry. For the purpose of this invention, a prokaryote is defined as a simple unicellular organism, such as a bacterium or blue-green alga, that contains no nuclear membrane, no membrane-bound organelles, and possesses no characteristic ribosomal system nor biochemistry.

In general, the method described by this invention involves the following steps:

1. digesting DNA containing a known DNA sequence with a restriction endonuclease enzyme, such that the enzyme specifically restricts within or at the known DNA sequence, to produce fragments, each fragment having 5' ends and 3' ends;

2. annealing DNA linkers to the fragments, each of the DNA linkers having a cohesive end specific to restriction enzyme sites generated by the restriction endonuclease;

3. ligating the cohesive end linkers to the fragments;

4. blocking the unligated 3' ends;

5. denaturing the fragments to form upper strands and lower strands;

6. annealing a specific primer to the strands, such a primer being complementary to at least a portion of the known DNA sequence;

7. forming template strands homologous to the known DNA sequences and the sequences flanking one side of the known DNA sequences by a DNA synthesis reaction with a polymerase;

8. denaturing the strands;

9. annealing a linker primer to the template strands;

10. denaturing the template strands;

11. using DNA synthesis to make copies of the DNA sequences flanking the known DNA sequences from the template strands.
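The logic of steps 5 to 11 can be illustrated with a toy string model. This is a minimal sketch under stated assumptions, not the inventors' protocol: strands are plain 5'->3' strings, annealing requires an exact complementary match, and all sequences and names (`LINKER_20MER`, `UNKNOWN_FLANK`, `KNOWN_REGION`, `extend`) are hypothetical. It shows why the linker primer cannot act until the specific primer has been extended across the ligated linker.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (5'->3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def extend(template: str, primer: str):
    """Anneal `primer` to `template` (both 5'->3') and extend it.

    Returns the newly synthesized strand (5'->3'), or None when the
    primer finds no exactly complementary site on the template.
    """
    site = reverse_complement(primer)
    pos = template.find(site)
    if pos == -1:
        return None
    # Synthesis runs from the primer's 3' end to the template's 5' end.
    return reverse_complement(template[: pos + len(site)])


# Hypothetical sequences, chosen only for illustration.
LINKER_20MER = "ACGTTGCAAGCTTCCGTCAG"   # strand ligated to the 5' end
UNKNOWN_FLANK = "GATCTTACCGGATATCCA"    # sequence to be discovered
KNOWN_REGION = "GGCATTCCGAAGTCTGACCA"   # source of the specific primer

# Upper strand of a library fragment: linker at the 5' end, blocked 3' end.
upper_strand = LINKER_20MER + UNKNOWN_FLANK + KNOWN_REGION
specific_primer = reverse_complement(KNOWN_REGION)

# Steps 6-7: the specific primer is extended across the flank and the linker.
template_strand = extend(upper_strand, specific_primer)
# Steps 9-11: only now does the linker primer have a complementary site.
amplified = extend(template_strand, LINKER_20MER)

if __name__ == "__main__":
    print("linker primer on library strand:", extend(upper_strand, LINKER_20MER))
    print("amplified product:", amplified)
```

In this model the linker primer finds no site on the original library strand, which mirrors the role of the blocked 3' ends: the linker complement exists only on strands made by extension from the specific primer.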

The invention also involves a vector-free DNA library comprising fragments of double stranded DNA having a 5' end ligated to a DNA linker and blocked 3' end. In general, according to this invention, the vector-free DNA library is formed by the following steps:

1. digesting DNA with at least one restriction endonuclease enzyme to form double stranded DNA fragments;

2. annealing double stranded DNA linkers to the double stranded DNA fragments, each DNA linker having a cohesive end specific to a restriction enzyme site generated by the restriction endonuclease enzyme;

3. ligating a strand of the DNA linker to the 5' end of the double stranded DNA fragments; and

4. blocking the unligated 3' end of the double stranded DNA fragments.

For the purpose of this invention, DNA linkers are synthetic oligodeoxyribonucleotides that are used to generate cohesive ends at the termini of DNA fragments.

For the purpose of this invention, polymerase chain reaction walking library is defined as DNA fragments by which DNA sequences neighboring known DNA regions can be isolated.

FIG. 1 Figure 1 schematically illustrates the basic steps described in this invention for forming a polymerase chain reaction vector-free walking library, as well as the steps involved in amplifying unknown and uncharacterized DNA sequences using a polymerase chain reaction vector-free walking library.

FIG. 2 Figure 2 illustrates the zeta-globin promoter region target sites for linker primers and specific primers (Oligo A is the 33 mer oligo and Oligo B is the 17 mer oligo).

FIG. 3 Figure 3 is a photograph of an ethidium bromide-stained 2% agarose gel demonstrating DNA fragments generated from two polymerase chain reactions using either oligo A or oligo B as specific primers and Sau3A linker DNA as the other primer. The amplified DNA products were extracted two times with chloroform, ethanol precipitated and run on a 2% agarose gel. Lane M shows DNA size markers (given in nucleotides). Lanes 1 and 2 show amplified DNA product primed by DNA linker alone and oligo B with DNA linker, respectively. Lanes 3 and 4 show polymerase chain reaction DNA generated by oligo A with DNA linker and DNA linker alone, respectively.

FIG. 4 Figure 4 is a photograph of a Southern transfer of a 2% agarose gel transferred to a nitrocellulose filter and probed with 32P-labelled hybrid oligo. Lanes 2 and 3 are the PCR mix containing oligo A or oligo B, respectively. Lanes 1 and 4 are the PCR mix containing the DNA linker alone.

The invention relates to the production of vector-free DNA libraries representing different organisms. The invention can be used to directly clone and amplify DNA sequences from unknown regions of genomes by using an artificially linked unphosphorylated 5' DNA linker and a specific primer that is complementary to at least part of a known DNA region as primers to amplify a PCR-based walking DNA library.

This method can produce an additional 1000 bases (on the average) of DNA flanking a known cloned sequence every 72 hours. The method of the invention does not use cloning vectors. One embodiment of PCR walking involves using unphosphorylated DNA linkers which will ligate only at the 5' ends, blocking the unligated 3' ends by dideoxyribonucleotide triphosphate, and specifically priming the synthesis of the desired flanking sequence with a primer complementary to the known sequence.

Example I

Schematic Representation of the Steps Involved in Forming a Polymerase Chain Reaction Walking Library Which Employs Unphosphorylated DNA Linkers and for Amplifying 5'-Flanking DNA Sequences Using Such a Polymerase Chain Reaction Walking Library

Figure 1 illustrates the steps involved in forming a single primer polymerase chain reaction walking library and for amplifying DNA sequences using such a polymerase chain reaction walking library using Sau3A as the restriction endonuclease, as well as using unphosphorylated DNA linkers.

DNA linkers are synthetic oligodeoxyribonucleotides that are used to generate cohesive ends at the termini of DNA fragments. The steps are divided for purposes of illustration. For example, Steps 6-11 may be performed in one step.

STEP 1

In Step 1, human genomic DNA is digested by treating the DNA with Sau3A to produce DNA fragments small enough to be amplified by polymerase chain reaction. EcoRI is an example of another restriction enzyme that could be used. According to the method of this invention, any restriction endonuclease enzyme can be used, but an enzyme that frequently cuts the DNA is preferred because the PCR amplification size is generally limited to about 3 to 5 kb. For the purposes of this invention, an enzyme that digests DNA at multiple sites (also known as a frequent cutter) is defined as an enzyme that recognizes a four base pair site.
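The digestion in Step 1 can be sketched on a single strand. This illustration assumes the usual Sau3A specificity (cutting immediately 5' of GATC) and models only the top strand; `digest` and `expected_fragment_length` are hypothetical helper names, and the expected spacing assumes random sequence with equal base frequencies.

```python
def digest(sequence: str, site: str = "GATC", cut_offset: int = 0) -> list[str]:
    """Cut the top strand at every occurrence of `site`.

    `cut_offset` is the cut position within the site; 0 models an enzyme
    that cuts immediately 5' of its recognition sequence.
    """
    sequence = sequence.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)

    fragments = []
    previous = 0
    for cut in cut_positions:
        if cut > previous:  # skip an empty fragment at the very start
            fragments.append(sequence[previous:cut])
        previous = cut
    fragments.append(sequence[previous:])
    return fragments


def expected_fragment_length(site_length: int) -> int:
    """Average site spacing in random sequence with equal base frequencies."""
    return 4 ** site_length


if __name__ == "__main__":
    print(digest("AAGATCTTTGATCCC"))
    print(expected_fragment_length(4), expected_fragment_length(6))
```

Under the random-sequence assumption a four-base recognition site occurs about every 256 bp and a six-base site about every 4096 bp, which is consistent with preferring frequent cutters when the amplifiable size is limited to a few kilobases.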

In addition, more than one restriction endonuclease may be used to digest the DNA. For example, in one embodiment of the invention, Sau3A, TaqI, and MspI may be used to generate smaller sized fragments of DNA. Any combination of 2-4 different enzymes that frequently digest the DNA (frequent cutters) could accomplish this objective and is therefore embodied in this invention. Other preferred enzymes include HhaI, HpaII, RsaI and combinations thereof.

STEP 2

In Step 2, an unphosphorylated 20 mer/24 mer DNA linker with GATC cohesive ends (which are complementary to the 5' "CTAG" ends of Sau3A digested DNA fragments), a cohesive end specific to Sau3A, is annealed to the Sau3A DNA fragment. The preferred DNA linker is a 20 mer/24 mer polymer having 20 mers (or 20 bases) as one strand of the DNA linker and 24 mers (or 24 bases) for the other strand of the linker DNA. For the purpose of this invention, "mer" is also defined as a nucleotide "base." According to this invention, only the 20 mer strand of the 20 mer/24 mer linker DNA ligates to the 5' end.

A 20 mer/22 mer DNA linker with GC cohesive ends (cohesive ends specific to TaqI and MspI) may be employed as well. When TaqI and MspI are used, the rest of the procedure is performed as described above when only Sau3A is used.

If other combinations of restriction endonucleases are used, the DNA linkers should complement the cohesive ends produced by the restriction endonucleases. The length of the DNA linker can vary but the ends of the DNA linkers must be cohesive with the restriction enzyme sites produced by the restriction endonuclease.

It should be noted that if the DNA linker is smaller than 16 mers, the frequency of annealing to nonspecific sites increases. Non-specific sites are sites where a DNA linker primer or specific primer anneals without being 100% homologous to the sequence it is annealing to. These sites can be found in any region of the DNA molecule, including the DNA terminus. Thus, using DNA linkers smaller than 16 mers is not recommended. For the purposes of this invention, the phrases DNA linker, linker DNA and linker are used interchangeably and refer to double stranded molecules, whereas primer DNA and specific primer DNA are both single stranded.

DNA linkers 18-30 mers long are preferred. Preferably, the DNA linker comprises a first strand of 18-30 mer and a second strand of 18-30 mer, with the second strand having an additional portion such that the second strand is longer than the first strand. The additional portion of the second strand is complementary to the sticky ends formed by the restriction endonuclease, thus, allowing the DNA linker to anneal to the DNA fragment.
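The asymmetric linker geometry can be checked with a few lines of code. This is a sketch under assumptions: the 20 mer shown is a hypothetical sequence (the application does not give one here), strands are written 5'->3', and the longer strand is modeled as the cohesive-end bases followed by the reverse complement of the shorter strand. The 18 to 30 mer bounds follow the preferred range stated above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (5'->3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_long_strand(short_strand: str, overhang: str = "GATC") -> str:
    """Second linker strand: cohesive end plus complement of the first strand."""
    return overhang + reverse_complement(short_strand)


def linker_problems(short_strand: str, long_strand: str, overhang: str = "GATC",
                    min_len: int = 18, max_len: int = 30) -> list[str]:
    """Return a list of design problems (an empty list means none were found)."""
    problems = []
    if not min_len <= len(short_strand) <= max_len:
        problems.append("first strand outside preferred length range")
    if len(long_strand) != len(short_strand) + len(overhang):
        problems.append("second strand is not first strand plus cohesive end")
    if not long_strand.startswith(overhang):
        problems.append("second strand lacks the cohesive end")
    elif reverse_complement(long_strand[len(overhang):]) != short_strand.upper():
        problems.append("strands are not complementary")
    return problems


if __name__ == "__main__":
    first = "ACGTTGCAAGCTTCCGTCAG"  # hypothetical 20 mer
    second = build_long_strand(first)
    print(len(first), len(second), linker_problems(first, second))
```

For a Sau3A library this yields a 20 mer/24 mer pair; substituting a two-base cohesive end would give a 20 mer/22 mer pair in the same way.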

STEP 3

In Step 3, the Sau3A DNA fragment is ligated to the 20 mer/24 mer DNA linker with GATC cohesive ends using T4 DNA ligase. The unphosphorylated DNA linker will ligate only at the 5' phosphate end of the DNA fragment. The preferred ligation method using unphosphorylated DNA linkers is slightly modified from the protocol described by Seth et al. ("A New Method for Linker Ligation", Gene Anal. Tech. 1:99-103 (1984)). The DNA linker is added in excess of the Sau3A DNA fragments to minimize annealing and ligation among the genomic DNA fragments. The optimal ratio of DNA linkers to Sau3A DNA fragments is 50 to 1; however, other ratios are also acceptable. Thus, because of the design of this method, only the 20 mer strands of the DNA linker ligate to Sau3A genomic DNA fragments. Therefore, for optimal results, the 20 mer sequence should always be unique.
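The 50 to 1 ratio can be converted into an approximate amount of linker for a given mass of digested DNA. This sketch assumes an average mass of about 650 g/mol per base pair (a common approximation, not a figure from the application) and takes the mean fragment length as an input; the function names are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; approximation, not from the source


def fragment_pmol(dna_micrograms: float, mean_fragment_bp: float) -> float:
    """Picomoles of double stranded fragments in a given mass of digested DNA."""
    # ug -> g is 1e-6 and mol -> pmol is 1e12, giving a net factor of 1e6.
    return dna_micrograms * 1e6 / (mean_fragment_bp * AVG_BP_MASS)


def linker_pmol(dna_micrograms: float, mean_fragment_bp: float,
                ratio: float = 50.0) -> float:
    """Picomoles of linker required for the chosen linker-to-fragment ratio."""
    return ratio * fragment_pmol(dna_micrograms, mean_fragment_bp)


if __name__ == "__main__":
    # Example: 1 ug of DNA digested to a mean fragment size of 256 bp.
    print(round(fragment_pmol(1.0, 256), 2), round(linker_pmol(1.0, 256), 1))
```

For 1 µg of DNA with a mean fragment size of 256 bp this gives about 6 pmol of fragments, hence about 300 pmol of linker at a 50 to 1 ratio.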

STEP 4

In Step 4, briefly, the unligated 3' OH ends are blocked by three denaturation cycles (65°C, 75°C, and 85°C) followed by 3 cycles of incubation with Taq polymerase for ten minutes at 73°C, in the presence of 2 mM ddG. The excess DNA linkers are removed from the DNA linker-ligated genomic DNA fragment admixture. The 24 mer strand of the DNA linker that is not ligated to one of the strands of the DNA fragment will also be removed along with the excess DNA linkers. The resulting 5' linker-ligated strand and the 3' OH blocked strand of the Sau3A DNA fragments constitute a vector-free human Sau3A DNA library, also called a vector-free DNA library or a PCR walking library.

More particularly, blocking the hydroxy group at the 3' end is an important step in the invention. In a preferred embodiment of this invention, dideoxyribonucleotide triphosphates are used to block the hydroxy group at the 3' end. Although other methods and reagents can be used, the dideoxyribonucleotide triphosphate method is currently the least expensive, and most efficient, and requires reagents that are simple to work with.

Blocking the hydroxy groups at the 3' end of the DNA fragment is the important aspect of Step 4, whether or not denaturation takes place. Thus, the denaturation portion depicted in Step 4 may not be necessary for optimal blocking of the 3' end hydroxy groups.

The dideoxyribonucleotide that is to be added to the 3' end can be either G, A, T, or C, depending on the last nucleotide of the cohesive end DNA linker. In this blocking step, any polymerase enzyme may be used to add the dideoxyribonucleotide bases; however, the inventors prefer Taq polymerase.

In Step 4 of Figure 1, the dideoxyribonucleotide G is incorporated into the 3' OH end by Taq polymerase to prevent the synthesis of the 24 mer sequence at the 3' end of the DNA fragment.

In the blocking step, Step 4, a percentage of the 3' ends might not be blocked using a single dideoxyribonucleotide base because the dideoxy-triphosphate may contain small impurities of deoxy-triphosphate, which may result in synthesis of the 24 mer. Thus, the inventors recommend that at least two dideoxyribonucleotide bases be used in blocking the 3' end. For example, in the Sau3A digested genomic DNA of Figure 1, optimal blocking would result with all four bases, but blocking would also result from using only two bases.

In Step 4, the excess 24 mer may be removed by denaturing the 24 mer at 85°C, then adding an excess amount of 20 mer to remove the 24 mer by the formation of 24/20 mer duplex. This step prevents the 24 mer from reannealing to the DNA fragments. The excess 24/20 mer duplex DNA is removed by column filtration. Column filtration is a standard method routinely employed by those skilled in this art. The 24 mer strand of the DNA linker that is not ligated to one of the strands of the DNA fragment will also be removed along with the excess DNA linkers.

The resulting DNA fragments without linker and primer DNAs are blocked by incubation with Klenow fragment in the presence of dd-N at room temperature. For the purpose of this invention, dd-N refers to any of the four dideoxyribonucleotides dd-G, dd-A, dd-C or dd-T.

Once the 3' ends have been blocked by dideoxyribonucleotide incorporation, the product is called a vector-free DNA library or a PCR walking library. Amplification of DNA sequences flanking known DNA sequences in the vector-free DNA library using specific primers is the next part of the invention.

STEP 5

Before the DNA sequences adjacent to the promoter specific primers can be cloned, the fragments are denatured as shown in Step 5. Step 5 is crucial because if denaturation is not done first, any unblocked ends will be amplified, resulting in amplification of impurities. The denatured fragments provide independent upper and lower strands with each strand having a 5' end and a 3' end. For the purpose of this invention, these denatured fragments are called strands.

Polymerase chain reaction involving the vector-free DNA library is shown in steps 6-11. These steps may be combined into a single step.

STEP 6

In Step 6, a specific DNA primer of interest is annealed to the upper strand. The specific DNA primer is defined as a complementary sequence to a known DNA sequence in the upper strand. For the purpose of this invention, a 20 mer DNA primer is employed. The inventors prefer to employ a 20 mer but other sequences in the range of 16 to 28 mers would also be effective.
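The definition of the specific DNA primer can be illustrated with a minimal sketch. The Python below derives a primer complementary to a known stretch of the upper strand and checks its length against the 16 to 28 mer range stated above. The sequence and the function names are made up for illustration and do not come from the disclosure.

```python
# Illustrative sketch: derive a primer complementary to a known stretch of
# the upper strand and enforce the 16-28 mer range given in the text.
# EXAMPLE_UPPER_STRAND is a made-up sequence, not a real zeta-globin sequence.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

EXAMPLE_UPPER_STRAND = "ATGCGTACCTGAGGCTTAACGGATCCATGCAAGT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primer(upper_strand: str, start: int, length: int = 20) -> str:
    """Return a primer (5' to 3') complementary to
    upper_strand[start:start + length] (hypothetical helper)."""
    if not 16 <= length <= 28:
        raise ValueError("primer length should be in the range of 16 to 28")
    known = upper_strand[start:start + length]
    if len(known) != length:
        raise ValueError("known region runs past the end of the strand")
    return reverse_complement(known)


if __name__ == "__main__":
    primer = design_primer(EXAMPLE_UPPER_STRAND, start=5)
    print(len(primer), primer)
```

The default of 20 reflects the inventors' stated preference for a 20 mer; other lengths inside the range are accepted.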

STEP 7

Synthetic reaction by Taq polymerase is performed in Step 7 to produce a complementary template strand adjacent to the DNA primer of interest (as depicted in Figure 1, the complementary template strand is being generated from the 3' end to the 5' end). The newly synthesized template strand and the upper strand form a duplex.

STEP 8

Step 8 involves denaturing the duplex product generated in Step 7. This denaturation step yields two strands: one strand is the original strand with the DNA linker ligated to it, and the other strand is the newly synthesized complementary template strand with the DNA primer of interest at one end of this strand. Denaturation is a standard technique known to those skilled in this art. For effective denaturation, the inventors prefer a temperature range from 90°C to 97°C, with 94°C being the optimal temperature. At lower temperatures, denaturation may be incomplete.

STEP 9

Next, a DNA linker is annealed to the template strand produced in Step 7. This DNA linker is the same 20 mer DNA linker that was ligated to the 5' end of the original DNA fragment (see Step 2).

STEP 10

In Step 10, synthesis by Taq polymerase off the newly annealed 20 mer strand of the DNA linker is achieved. For this step to be most effective, a polymerase that remains effective at about 90°C and above is required. At present, Taq polymerase is the only polymerase effective at this temperature. These high temperatures are required to denature the double-stranded amplified DNA product so that the DNA primer can anneal to the single-stranded template for synthesis of DNA in the next round. At temperatures greater than 97°C, Taq polymerase will be inactivated at a faster rate.

STEP 11

In Step 11, 30-40 cycles with specific DNA primers, DNA linkers, and Taq polymerase are repeated to generate upwards of about 10^5 copies of the 5'-flanking DNA sequences. Fewer cycles can be performed, but the inventors prefer at least 30-45 cycles for optimal synthesis of the desired 5'-flanking DNA sequences.

Although not shown in FIG. 1, a second specific DNA primer of interest, complementary to a sequence on the lower strand of the original DNA fragment, can be annealed as described in Steps 6-11.

The distance of each walk is limited by the location of the restriction endonuclease sites 5' upstream and downstream of the known sequence. All the steps may be performed on both strands simultaneously; thus, the upper strand need not be separated from the lower strand for optimal synthesis of the desired flanking regions.
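How a restriction site bounds the walk can be shown with a small sketch. The Python below, given under stated assumptions, finds the nearest Sau3A recognition site (GATC) upstream of a primer position on one strand and reports the distance between them. It treats only the upstream direction on a single strand, and the sequence and function name are hypothetical.

```python
# Illustrative sketch: the walk from a specific primer can extend no further
# than the nearest restriction site upstream of it, because that site is
# where the fragment ends and the DNA linker is ligated.
# Assumption: positions are 0-based indices on a single strand.


def walk_limit(strand: str, primer_start: int, site: str = "GATC"):
    """Return the distance in bases from the nearest upstream site to the
    primer start, or None if no site lies upstream (hypothetical helper)."""
    index = strand.upper().rfind(site, 0, primer_start)
    if index == -1:
        return None
    return primer_start - index


if __name__ == "__main__":
    example = "AAGATCTTTTGATCAAAAACCCCCGGGGG"  # made-up sequence
    print(walk_limit(example, primer_start=20))
```

A return value of None in this sketch would mean that no site lies upstream within the supplied sequence, so the walk limit cannot be determined from it.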

EXAMPLE II

Schematic Representation of the Steps Involved in Forming a Polymerase Chain Reaction Walking Library That Employs Phosphorylated DNA Linkers and for Amplifying 5'-Flanking DNA Sequences Using Such a Polymerase Chain Reaction Walking Library

In yet another embodiment of this invention, in Step 2 phosphorylated DNA linkers are used instead of unphosphorylated DNA linkers. Lambda exonuclease is used to enrich the target DNA products primed by the specific DNA primer. This combination of phosphorylated DNA linkers and lambda exonuclease eliminates any nonspecific DNA resulting from the blocking step by dideoxyribonucleoside triphosphate. Non-specific amplification can decrease the yield of the desired DNA fragments.

The following scheme is proposed to circumvent the problem by enriching the DNA fragments generated by the linker primer and specific primer. The approach makes use of the phosphorylated 20 mer/24 mer linker DNA, with the 5' end of the 20 mer kinased (another term for phosphorylated) in Step 2 of the scheme. Steps 3-12 are essentially identical to those for the non-kinased 20 mer/24 mer DNA linker.

With the newly amplified DNA products, the lambda exonuclease is added to the reaction mix ("Production of Single-Stranded DNA Templates by Exonuclease Digestion Following Polymerase Chain Reaction", Nucleic Acid Res. 17:5865 (1989)). The specificity of the lambda exonuclease allows it to degrade DNA from 5' to 3' only if the 5' end is phosphorylated.

Using the PCR DNA library as described in the invention, two major types of DNA products are produced by DNA amplification: one product is synthesized by employing DNA linkers on both ends; the second product is synthesized by employing a DNA linker as well as a specific DNA primer. After the lambda exonuclease reaction occurs, the DNA fragments produced by DNA linkers alone are degraded on both ends and cannot be regenerated in the next round of the PCR reaction. For the purpose of this invention, DNA linker primer is defined as the 20 mer strand of the DNA linker (as shown in Figure 1, Step 9). In DNA fragments produced by specific priming, only the strand initiated by the linker primer is degraded, while the strand initiated by the specific primer remains intact. This intact strand can be used for generating more targeted DNA fragments in the next round of the PCR reaction. The procedure of PCR amplification and the lambda exonuclease reaction can be repeated a few times to ensure the amplification of the specific DNA products, i.e. the DNA fragments primed by the specific primer.
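The enrichment logic described above can be modeled in a few lines. The Python below is a simplified sketch, not a simulation of enzyme kinetics: each strand is tagged with the primer that initiated it, and the lambda exonuclease step keeps only strands that lack a 5' phosphate. It assumes that the specific primer is unphosphorylated; the class and function names are hypothetical.

```python
# Illustrative sketch of the lambda exonuclease selection step.
# Assumptions: a strand carries a 5' phosphate only if it was initiated by
# the kinased 20 mer linker primer; the specific primer is unphosphorylated;
# a strand with a 5' phosphate is fully degraded.

from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    primed_by: str              # "linker" or "specific"
    five_prime_phosphate: bool  # True if the 5' end is phosphorylated


def make_strand(primed_by: str, linker_is_kinased: bool = True) -> Strand:
    """Build a strand whose 5' end status follows from its primer."""
    return Strand(primed_by, primed_by == "linker" and linker_is_kinased)


def lambda_exonuclease(duplex):
    """Return the strands that survive digestion (no 5' phosphate)."""
    return [strand for strand in duplex if not strand.five_prime_phosphate]


if __name__ == "__main__":
    linker_linker = [make_strand("linker"), make_strand("linker")]
    linker_specific = [make_strand("linker"), make_strand("specific")]
    print(lambda_exonuclease(linker_linker))    # nothing survives
    print(lambda_exonuclease(linker_specific))  # specific-primed strand only
```

In this model the product carrying linkers on both ends leaves no template for the next round, while the product of specific priming leaves the strand initiated by the specific primer.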

EXAMPLE III

Method for Producing a Polymerase Chain Reaction Human Vector-Free Chromosomal Walking Library

Example III presents the necessary steps for producing a polymerase chain reaction human vector-free chromosomal walking library derived from human genomic DNA (isolated from human peripheral blood).

High molecular weight DNA was isolated from human peripheral blood. Briefly, the cells were lysed in Triton-X 100 buffer and digested with proteinase K in the presence of 0.5% sodium dodecyl sulfate overnight. The proteinase K was removed by two rounds of phenol extraction, and RNAs were removed by digesting the phenol-extracted sample with 50 μg/ml pancreatic RNase A for 1 hour at 37°C. The resulting sample was re-extracted with 2 rounds of phenol and ethanol precipitated, and was then ready for restriction endonuclease digestion.

The high molecular weight DNA was digested with Sau3A in medium buffer overnight, for at least 4 hours. The sizes of the Sau3A digested DNA fragments were in the range of 0.1 Kb (100 base pairs) to 10 Kb. The digested DNA was treated with 0.5% sodium dodecyl sulfate at 65°C for 20 hours to inactivate the enzyme, ethanol precipitated, and dissolved in 1X Tris/EDTA. The dissolved DNA was ligated to unphosphorylated 20 mer/24 mer DNA linker with GATC cohesive end overnight at 20°C in standard ligation buffer in the presence of T4 DNA ligase. The DNA linkers were added in excess of the Sau3A digested DNA fragments to minimize ligation among the genomic DNA fragments (DNA linkers : Sau3A DNA fragments = 50 : 1). The inventors prefer a 50 to 1 ratio of DNA linkers to Sau3A digested DNA fragments, but 5 to 1 or 100 to 1 would also suffice. The DNA was now ready for amplification by the polymerase chain reaction method.

Only the 20 mer strands of the DNA linkers were ligated to the genomic DNA fragments. The unligated 3' OH ends were blocked by three cycles of denaturation at 65°C, 75°C, and 85°C followed by 3 cycles of incubation with Taq polymerase for ten minutes at 73°C or other DNA polymerase at 37°C in the presence of 2 mM ddG.

Next, the DNA linker was removed from the linker-ligated genomic fragment by column filtration. Column filtration is a standard protocol known to those skilled in this art. The 5' linker-ligated and the 3' OH blocked (5'L-3'B) Sau3A DNA fragments constitute the vector free human Sau3A DNA library.

EXAMPLE IV

Cloning DNA Sequences Located 5' to the Zeta-Globin Promoter Region by Using a Polymerase Chain Reaction Walking Library Produced in Example III

This example illustrates the necessary steps for cloning DNA sequences neighboring promoter specific primers (promoter specific primer is defined as a DNA primer complementary to the promoter region).

For illustration purposes only, the DNA sequences in the zeta-globin promoter region were amplified using the methods of the invention. It is appreciated that virtually any region 5' to a known sequence may be cloned by employing the methods described in this invention.

Specific DNA primers used were a 33 mer oligomer ("oligo A 33 mer") and a 17 mer oligomer ("oligo B 17 mer") which were complementary to the sequences located, respectively, -302 to -270 and -415 to -399 upstream of the CAP site in the zeta-globin region as illustrated in Figure 2.
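The stated coordinates and oligomer lengths can be cross-checked with simple arithmetic. The short Python sketch below counts positions inclusively, which is an assumption about how the coordinates are meant; the function name is hypothetical.

```python
# Illustrative check: do the stated coordinates relative to the CAP site
# match the stated oligomer lengths?
# Assumption: both end positions are included in the count.


def span_length(start: int, end: int) -> int:
    """Inclusive number of positions between two coordinates."""
    return abs(end - start) + 1


OLIGOS = {
    "oligo A 33 mer": (-302, -270),
    "oligo B 17 mer": (-415, -399),
}

if __name__ == "__main__":
    for name, (start, end) in OLIGOS.items():
        print(name, span_length(start, end))
```

With inclusive counting, -302 to -270 spans 33 positions and -415 to -399 spans 17 positions, matching the 33 mer and 17 mer.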

For this experiment, a portion of the 5' DNA linker ligated and 3' OH blocked Sau3A fragments (the Sau3A vector free library; 0.5 μg) was used as template, and the 20 mer from the 20 mer/24 mer DNA linker, as well as oligo A 33 mer or oligo B 17 mer, were used as the primers. Two polymerase chain reactions were performed as described in Steps 6-11 of the method of Figure 1. The cycles were repeated 30 times to generate the target sequences. For the purpose of this invention, target sequences are defined as DNA sequences primed by the specific primer and primed by the linker primer.

Polymerase Chain Reaction Amplification

In this particular experiment, polymerase chain reaction amplification was performed by an automated method using a Perkin-Elmer DNA thermal cycler. 0.5 μg of genomic DNA as prepared above was incubated in 100 μl of 1X PCR buffer (10 mM Tris-HCl, pH 8.3/50 mM KCl/1.5 mM MgCl2/0.1% gelatin/dNTP at 200 μM each, with each primer at 1 μM). Two units of Taq polymerase were added and the reaction mixture was heated to 94°C for 5 minutes, cooled to 55°C for 2 minutes and brought to 73°C for 3 minutes for the first cycle.
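The final concentrations above translate into absolute amounts per 100 μl reaction by straightforward arithmetic. The Python sketch below reads the dNTP concentration as 200 μM each; the dictionary layout and function name are hypothetical, and no stock concentrations are implied.

```python
# Illustrative arithmetic: amount of each component in a 100 microliter
# reaction, from its final concentration.
# micromolar x microliter = picomole; dividing by 1000 gives nanomole.

REACTION_VOLUME_UL = 100.0

FINAL_CONC_UM = {
    "Tris-HCl": 10_000.0,   # 10 mM
    "KCl": 50_000.0,        # 50 mM
    "MgCl2": 1_500.0,       # 1.5 mM
    "each dNTP": 200.0,     # 200 micromolar (assumed reading of the text)
    "each primer": 1.0,     # 1 micromolar
}


def amount_nmol(conc_um: float, volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Return the amount in nanomoles for a final concentration in micromolar."""
    return conc_um * volume_ul / 1000.0


if __name__ == "__main__":
    for component, conc in FINAL_CONC_UM.items():
        print(f"{component}: {amount_nmol(conc):g} nmol")
```

For instance, each primer at 1 μM in 100 μl corresponds to 0.1 nmol (100 pmol), and each dNTP at 200 μM corresponds to 20 nmol.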

Next came a denaturing step at 94°C for 1 minute, an annealing step at 55°C for 2 minutes, and an elongating step at 73°C for 3 minutes. These steps were repeated for 30 cycles, except that the elongation time for the last cycle was increased to 7 minutes.
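The cycling conditions can be written down as a small data structure, which makes the total hold time easy to estimate. The Python below is a sketch under one stated assumption: the first cycle with the 5 minute denaturation is counted separately from the 30 repeated cycles. Ramp times between temperatures are ignored, and the names are hypothetical rather than thermal cycler commands.

```python
# Illustrative sketch of the thermal cycling program as (temperature in
# degrees C, minutes) pairs.
# Assumption: the first cycle precedes the 30 repeated cycles; ramp times
# are not included.

FIRST_CYCLE = [(94, 5), (55, 2), (73, 3)]
STANDARD_CYCLE = [(94, 1), (55, 2), (73, 3)]
FINAL_ELONGATION_MIN = 7


def build_program(n_cycles: int = 30):
    """Return a list of cycles, each a list of (temperature, minutes)."""
    program = [list(FIRST_CYCLE)]
    for i in range(n_cycles):
        cycle = list(STANDARD_CYCLE)
        if i == n_cycles - 1:
            # Last cycle: elongation extended to 7 minutes.
            cycle[-1] = (73, FINAL_ELONGATION_MIN)
        program.append(cycle)
    return program


def total_hold_minutes(program) -> int:
    """Sum of all hold times in the program."""
    return sum(minutes for cycle in program for _, minutes in cycle)


if __name__ == "__main__":
    print(total_hold_minutes(build_program(30)), "minutes of hold time")
```

Under this reading the hold times add up to 194 minutes; if the first cycle were instead counted among the 30, the total would be 6 minutes shorter.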

The amplified DNA products were extracted two times with chloroform, ethanol precipitated, dissolved in 1X T.E. buffer and were ready for agarose gel analysis. Agarose gels are standard and known to those skilled in this art.

Figure 3 is a photograph of an ethidium bromide stained agarose gel demonstrating DNA fragments from two reactions using oligo A 33 mer and oligo B 17 mer as specific DNA primers and Sau3A linker DNA as the other primer. Lanes 1 and 2 show amplified DNA product primed by DNA linker alone and oligo B 17 mer plus DNA linker, respectively. Lanes 3 and 4 show polymerase chain reaction DNA generated by oligo A 33 mer with DNA linker and DNA linker alone, respectively. Lane M shows marker DNA (size given in nucleotides).

The results demonstrated that a smear of DNA fragments was observed in each lane of the gel because of the non-specific priming of this library. Of note, there were no observed differences between the polymerase chain reaction DNA products generated by the DNA linker and specific primer and those generated by the DNA linker alone.

Southern Blot Analysis of the Amplified DNA Products

Twenty percent of the polymerase chain reaction amplified DNA products, for each reaction, was run on a 2% agarose gel for 4 hours at a voltage of 80 V. The DNA in the gel was transferred to nitrocellulose and probed with a 32P-labeled hybrid oligomer neighboring the region complementary to the specific primer.

The filter was exposed to Kodak X-ray film overnight and developed. The results from this Southern transfer are shown in Figure 4. In Figure 4, two hybridized DNA fragments (sizes of 220 bp and 340 bp) were observed in the polymerase chain reaction mix containing the oligo A 33 mer or oligo B 17 mer (lanes 2 and 3, respectively). No similar hybridized fragments were observed in the polymerase chain reaction mix containing the DNA linker alone or containing the specific DNA linker primer alone (lanes 1 and 4).

Preparing the Hybridized DNA for Sequencing

The hybridized DNA band was then excised from the gel and the DNA was eluted by a standard protocol known to those skilled in this art. The DNA fragment was re-amplified under the same conditions as described above. The second cycle DNA products were digested with restriction enzyme and run on a 2% agarose gel to check the yield and quality of the DNA product. Then, the primers were removed from the products by column filtration. The purified amplified DNA products were ethanol precipitated, dissolved in 1X Tris/EDTA and were ready for DNA sequencing.

EXAMPLE V

Generating a Small Sized DNA Library Using Sau3A, MspI and TaqI and Amplifying a Selected Portion of this Library By Employing a Non-Primer Specific Polymerase Chain Reaction Procedure

A small amount of human genomic DNA was digested with Sau3A, MspI and TaqI, consecutively in appropriate buffers. The size population of the DNA fragments was mostly less than 2.5 kb. DNA of this size can readily be amplified by polymerase chain reaction. Other enzymes such as HhaI, HpaII and RsaI may also be employed to digest the genomic DNA.
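The effect of digesting with three enzymes can be illustrated in silico. The Python sketch below cuts a linear sequence at the recognition sites of Sau3A (^GATC), MspI (C^CGG) and TaqI (T^CGA) and lists the resulting fragment lengths. It considers top-strand cut positions only, assumes complete digestion, and uses a made-up sequence; the function names are hypothetical.

```python
# Illustrative in silico digest with Sau3A, MspI and TaqI.
# Each entry: recognition site and the offset of the top-strand cut
# within the site (Sau3A ^GATC, MspI C^CGG, TaqI T^CGA).
# Assumptions: linear DNA, complete digestion, top strand only.

SITES = {
    "Sau3A": ("GATC", 0),
    "MspI": ("CCGG", 1),
    "TaqI": ("TCGA", 1),
}


def cut_positions(seq: str, enzymes=SITES):
    """Return sorted top-strand cut positions inside the sequence."""
    seq = seq.upper()
    cuts = set()
    for site, offset in enzymes.values():
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    return sorted(c for c in cuts if 0 < c < len(seq))


def fragment_lengths(seq: str, enzymes=SITES):
    """Return the lengths of the fragments produced by the digest."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "AAAAGATCTTTTCCGGAAAATCGATTTT"  # made-up sequence
    print(fragment_lengths(example))
```

Adding enzymes to the table increases the number of cut positions, which is the reason a combined digest yields a smaller fragment size population than a single enzyme.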

The DNA fragments were ligated to 22 mer/20 mer linker DNA with GC cohesive ends and 24/20 mer linker DNA with GATC cohesive ends overnight at 20°C in a standard ligation buffer. A small quantity of the linker ligated DNA prepared above was incubated in a PCR buffer in the presence of 20 mer primer. (Note: The DNA sequence of the two 20 mers was identical in the linker DNA, and this linker DNA may be synthetically or randomly designed.)

The mix was denatured at 94°C for 1 minute, annealed at 55°C for 1 minute, and elongated at 73°C for 15 minutes in sequential order. The cycle of denaturing, annealing and elongating was repeated 30 times. The amplified DNA product was extracted one time with chloroform, ethanol precipitated and dissolved in 1X Tris/EDTA buffer.

The size distribution of amplified DNA fragments was almost the same as that of the original DNA digest, suggesting the amplification may be homogeneous. The gel analysis of the DNA fragments suggested that the original composition of the human DNA digest can be regenerated by this non-primer specific PCR procedure.

Although other theories or mechanisms were not addressed, they should be considered as within the realm of possibility, and the described mechanism is not intended to limit the invention. Even though the invention has been described with a certain degree of particularity, it is evident that many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing disclosure. Accordingly, it is intended that all such alternatives, modifications, and variations which fall within the spirit and the scope of the invention be embraced as defined by the claims.

This technology can also be applied to quantitate the copy number of an amplified gene in a small number of tumor cells. Another application is the use of the three restriction endonucleases to digest DNA to yield DNA fragments for the production of a polymerase chain reaction walking library, where the average DNA size is amenable to polymerase chain reaction amplification (at least 98% of DNA fragments will amplify by the polymerase chain reaction method). An additional application is the use of this strategy to homogeneously amplify DNA micro-dissected from a specific chromosomal region.