Title:
STEM-LOOP COMPOSITE RNA-DNA ADAPTOR-PRIMERS: COMPOSITIONS AND METHODS FOR LIBRARY GENERATION, AMPLIFICATION AND OTHER DOWNSTREAM MANIPULATIONS
Document Type and Number:
WIPO Patent Application WO/2012/103154
Kind Code:
A1
Abstract:
The present invention concerns manipulation of nucleic acid molecules, such as amplification, or adaptor-ligation. RNA/DNA chimeric stem-loop adaptor/primers are utilized during target nucleic acid manipulation. Methods controlling the activation of stem-loop oligonucleotides are within the scope of the invention. Methods combining multiple nucleic acid reactions in one reaction volume are envisioned in various embodiments of the invention.

Inventors:
KURN NURITH (US)
STAPLETON MARK (US)
WANG SHENLONG (US)
Application Number:
PCT/US2012/022448
Publication Date:
August 02, 2012
Filing Date:
January 24, 2012
Assignee:
NUGEN TECHNOLOGIES INC (US)
KURN NURITH (US)
STAPLETON MARK (US)
WANG SHENLONG (US)
International Classes:
C12P19/34
Foreign References:
US7803550B2 2010-09-28
US20090203085A1 2009-08-13
US20060216724A1 2006-09-28
Other References:
METZKER: "Sequencing technologies - the next generation.", NATURE REV GENETICS, vol. 11, no. 1, January 2010 (2010-01-01), pages 31 - 46
BROUDE: "Stem-loop oligonucleotides: a robust tool for molecular biology and biotechnology.", TRENDS BIOTECHNOL, vol. 20, no. 6, June 2002 (2002-06-01), pages 249 - 256, XP004352763, DOI: doi:10.1016/S0167-7799(02)01942-X
Attorney, Agent or Firm:
SKUBATCH, Maya et al. (650 Page Mill Road, Palo Alto, CA, US)
Claims:
CLAIMS

WHAT IS CLAIMED IS:

1. A method of preparing a target nucleic acid molecule for further processing, comprising:

(a) attaching an RNA/DNA chimeric stem-loop oligonucleotide having a portion of its stem and at least a portion of its loop comprise of RNA to said target nucleic acid molecule thereby producing a stem-loop-target complex; and

(b) contacting said stem-loop-target complex with an RNase that cleaves RNA from a RNA/DNA heteroduplex, thereby preparing the stem-loop-target complex for further processing.

2. The method of claim 1, wherein the RNase is RNase H.

3. The method of claims 1-2, wherein further processing comprises nucleic acid sequencing.

4. The method of claim 3, wherein sequencing comprises single molecule sequencing.

5. The method of claim 3, wherein sequencing comprises sequencing by synthesis.

6. The method of claims 1-5, wherein further processing comprises nucleic acid amplification.

7. The method of claim 6, wherein a portion of the stem-loop oligonucleotide is further used to prime an amplification reaction.

8. The method of claim 6-7, wherein the amplification is isothermal amplification.

9. The method of claim 6-7, wherein the amplification is PCR.

10. The method of claims 1-9, wherein the nucleic acid is produced by fragmenting a longer nucleic acid.

11. The method of claims 1-10, wherein the target nucleic acid is blunt ended.

12. The method of claims 1-11, wherein the target nucleic acid comprises a 3' overhang.

13. The method of claims 1-12, wherein the target nucleic acid comprises a 5' overhang.

14. The method of claim 12, wherein the stem-loop oligonucleotide further comprises a 3' overhang.

15. The method of claim 13, wherein the stem-loop oligonucleotide further comprises a 5' overhang.

16. The method of claims 14-15, wherein the stem-loop oligonucleotide overhang and nucleic acid overhang are at least partially complementary to each other.

17. The method of claims 1-16, wherein the attaching step comprises attaching a first RNA/DNA chimeric stem-loop oligonucleotide to a first end of said target nucleic acid and a second RNA/DNA chimeric stem-loop oligonucleotide to a second end of said target nucleic acid.

18. The method of claim 17, wherein said first and said second RNA/DNA chimeric stem-loop oligonucleotides are the same.

19. The method of claims 1-18, wherein the attaching step comprises the use of a ligase.

20. The method of claim 19, wherein the ligase is T4 ligase.

21. The method of claims 1-20, wherein the stem-loop oligonucleotide lacks 5' phosphate.

22. The method of claims 1-21, wherein the target nucleic acid is further extended along the cleaved stem-loop oligonucleotide sequence, thereby forming a target nucleic acid extension product, wherein a portion of the stem-loop oligonucleotide acts as a template.

23. The method of claim 22, wherein the stem-loop oligonucleotide that is in duplex with the nucleic acid extension is further cleaved using an RNase that cleaves RNA from a RNA/DNA heteroduplex, thereby forming a priming ready construct.

24. The method of claim 23, wherein the priming ready construct is further amplified.

25. The method of claim 24, wherein the amplification comprises single primer isothermal amplification (SPIA).

26. The method of claims 1-25, wherein at least one loop of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group.

27. The method of claims 1-26, wherein at least one stem of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group.

28. The method of claims 1-27, wherein at least one overhang of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group.

29. A method for amplifying a nucleic acid molecule, comprising:

contacting in a single mixture

(a) an RNA/DNA stem-loop adaptor having a portion of its stem and at least a portion of its loop comprise of RNA;

(b) a double-stranded nucleic acid with a 3' overhang;

(c) RNase that cleaves RNA from a RNA/DNA heteroduplex;

(d) a DNA polymerase with strand displacement activity.

30. The method of claim 29, further comprising forming an extension product by priming the double-stranded nucleic acid at its 3' overhang using at least a portion of the stem-loop oligonucleotide.

31. The method of claim 30, wherein the loop-RNA in the extension product is further cleaved by the RNase, thereby forming a second double stranded nucleic acid with at least one 3' overhang.

32. The method of claims 29-31, wherein the RNase comprises RNase H.

33. The method of claims 29-32, further comprising at least a second amplification mixture with at least a second stem-loop oligonucleotide, wherein each stem-loop oligonucleotide further comprises a barcode sequence.

34. The method of claims 29-33, wherein the 3' overhang is produced by

(a) forming a product, wherein the stem-loop oligonucleotide is attached to a double stranded nucleic acid by one strand;

(b) extending the double stranded nucleic acid along the stem-loop oligonucleotide; and

(c) cleaving the stem-loop oligonucleotide using an RNase that cleaves RNA sequences in an RNA/DNA heteroduplex.

35. The method of claims 29-33, wherein the 3' overhang is produced during single primer isothermal amplification.

36. A kit for amplifying nucleic acids, comprising:

(a) a stem-loop oligonucleotide, wherein at least a portion of the stem comprises RNA and wherein at least a portion of the loop comprises RNA;

(b) an RNase that cleaves RNA sequences in an RNA/DNA duplex;

(c) a ligase; and

(d) a DNA polymerase.

37. A method of amplifying a double stranded target nucleic acid molecule, comprising:

(a) attaching a strand of a first stem-loop oligonucleotide to said double stranded target nucleic acid molecule thereby producing an oligonucleotide-attached nucleic acid molecule; and

(b) cleaving part of the stem-loop oligonucleotide using an RNase; and using a second stem-loop oligonucleotide as a primer in amplifying the double stranded target nucleic acid molecule.

38. The method of claim 37, wherein the second stem-loop oligonucleotide is an exact copy of the first stem-loop oligonucleotide.

39. The method of claims 37-38, wherein the RNase comprises RNase H.

40. The method of claims 37-39, wherein the stem-loop oligonucleotide comprises an RNA/DNA heteroduplex in a stem.

41. The method of claims 37-40, wherein a strand of the oligonucleotide-attached nucleic acid molecule is extended.

42. The method of claim 41, wherein the extension yields a second RNA/DNA heteroduplex.

43. The method of claim 42, wherein cleaving part of the stem-loop oligonucleotide comprises cleaving the second RNA/DNA heteroduplex.

44. The method of claim 43, wherein cleaving part of the stem-loop oligonucleotide generates a priming ready construct.

45. The method of claim 43, wherein the priming ready construct comprises a single stranded region that is 3' to the double stranded target nucleic acid.

46. The method of claims 37-45, wherein the amplification comprises PCR.

47. The method of claims 37-45, wherein the amplification comprises SPIA.

48. The method of claims 37-47, wherein the attaching comprises ligation.

49. The method of claim 48, wherein T4 ligase is utilized in ligation.

50. The method of claims 37-49, wherein the stem-loop oligonucleotide lacks 5' phosphate.

51. The method of claims 37-50, wherein the target nucleic acid is produced by fragmenting a longer nucleic acid.

52. The method of claims 37-51, wherein the target nucleic acid is blunt ended.

53. The method of claims 37-52, wherein the target nucleic acid comprises a 3' overhang.

54. The method of claims 37-53, wherein the target nucleic acid comprises a 5' overhang.

55. The method of claim 53, wherein the stem-loop oligonucleotide further comprises a 3' overhang.

56. The method of claim 54, wherein the stem-loop oligonucleotide further comprises a 5' overhang.

57. The method of claims 55-56, wherein the stem-loop oligonucleotide overhang and nucleic acid overhang are at least partially complementary to each other.

58. The method of claims 37-57, wherein at least one loop of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group.

59. The method of claims 37-58, wherein at least one stem of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group.

60. The method of claims 37-59, wherein at least one overhang of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group.

Description:
STEM-LOOP COMPOSITE RNA-DNA ADAPTOR-PRIMERS: COMPOSITIONS AND METHODS FOR LIBRARY GENERATION, AMPLIFICATION AND OTHER DOWNSTREAM MANIPULATIONS

CROSS-REFERENCE

[0001] This application claims the benefit of U.S. Provisional Application No. 61/435,730, filed January 24, 2011, which application is incorporated herein by reference.

BACKGROUND OF THE INVENTION

[0002] Advances in the study of biological molecules have been led, in part, by improvements in the technologies used to characterize the molecules or their biological reactions. In particular, the study of the nucleic acids DNA and RNA has benefited from developing technologies used for manipulating and analyzing nucleic acids, such as amplification, quantification, genotyping, and sequence analysis. The expanding use of nucleic acid based techniques demands new methods and compositions that exceed the speed, efficiency, range and/or automation of current techniques. The instant application describes methods and compositions that improve currently available techniques for nucleic acid manipulation.

SUMMARY OF THE INVENTION

[0003] In one aspect, the invention provides methods of preparing a target nucleic acid molecule for further processing, comprising: attaching an RNA/DNA chimeric stem-loop oligonucleotide having a portion of its stem and at least a portion of its loop comprise of RNA to said target nucleic acid molecule thereby producing a stem-loop-target complex; and contacting said stem-loop-target complex with an RNase that cleaves RNA from a RNA/DNA heteroduplex, thereby preparing the stem-loop-target complex for further processing. In some embodiments, the RNase is RNase H. In some embodiments, further processing comprises nucleic acid sequencing. In some embodiments, sequencing comprises single molecule sequencing. In some embodiments, sequencing comprises sequencing by synthesis. In some embodiments, further processing comprises nucleic acid amplification. In some embodiments, a portion of the stem-loop oligonucleotide is further used to prime an amplification reaction. In some embodiments, the amplification is isothermal amplification. In some embodiments, the amplification is PCR. In some embodiments, the nucleic acid is produced by fragmenting a longer nucleic acid. In some embodiments, the target nucleic acid is blunt ended. In some embodiments, the target nucleic acid comprises a 3' overhang. In some embodiments, the target nucleic acid comprises a 5' overhang. In some embodiments, the stem-loop oligonucleotide further comprises a 3' overhang. In some embodiments, the stem-loop oligonucleotide further comprises a 5' overhang. In some embodiments, the stem-loop oligonucleotide overhang and nucleic acid overhang are at least partially complementary to each other. In some embodiments, the attaching step comprises attaching a first RNA/DNA chimeric stem-loop oligonucleotide to a first end of said target nucleic acid and a second RNA/DNA chimeric stem-loop oligonucleotide to a second end of said target nucleic acid.
In some embodiments, said first and said second RNA/DNA chimeric stem-loop oligonucleotides are the same. In some embodiments, the attaching step comprises the use of a ligase. In some embodiments, the ligase is T4 ligase. In some embodiments, the stem-loop oligonucleotide lacks 5' phosphate. In some embodiments, the target nucleic acid is further extended along the cleaved stem-loop oligonucleotide sequence, thereby forming a target nucleic acid extension product, wherein a portion of the stem-loop oligonucleotide acts as a template. In some embodiments, the stem-loop oligonucleotide that is in duplex with the nucleic acid extension is further cleaved using an RNase that cleaves RNA from a RNA/DNA heteroduplex, thereby forming a priming ready construct. In some embodiments, the priming ready construct is further amplified. In some embodiments, the amplification comprises single primer isothermal amplification (SPIA). In some embodiments, at least one loop of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group. In some embodiments, at least one stem of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group. In some embodiments, at least one overhang of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group.

[0004] In another aspect, the invention provides methods for amplifying a nucleic acid molecule, comprising: contacting in a single mixture an RNA/DNA stem-loop adaptor having a portion of its stem and at least a portion of its loop comprise of RNA; a double-stranded nucleic acid with a 3' overhang; RNase that cleaves RNA from a RNA/DNA heteroduplex; a DNA polymerase with strand displacement activity. Some embodiments further comprise forming an extension product by priming the double-stranded nucleic acid at its 3' overhang using at least a portion of the stem-loop oligonucleotide. In some embodiments, the loop-RNA in the extension product is further cleaved by the RNase, thereby forming a second double stranded nucleic acid with at least one 3' overhang. In some embodiments, the RNase comprises RNase H. Some embodiments further comprise at least a second amplification mixture with at least a second stem-loop oligonucleotide, wherein each stem-loop oligonucleotide further comprises a barcode sequence. In some embodiments, the 3' overhang is produced by forming a product, wherein the stem-loop oligonucleotide is attached to a double stranded nucleic acid by one strand; extending the double stranded nucleic acid along the stem-loop oligonucleotide; and cleaving the stem-loop oligonucleotide using an RNase that cleaves RNA sequences in an RNA/DNA heteroduplex. In some embodiments, the 3' overhang is produced during single primer isothermal amplification.

[0005] In yet another aspect, the invention provides kits for amplifying nucleic acids, comprising: a stem-loop oligonucleotide, wherein at least a portion of the stem comprises RNA and wherein at least a portion of the loop comprises RNA; an RNase that cleaves RNA sequences in an RNA/DNA duplex; a ligase; and a DNA polymerase.

[0006] In a further aspect, the invention provides methods of amplifying a double stranded target nucleic acid molecule, comprising: (a) attaching a strand of a first stem-loop oligonucleotide to said double stranded target nucleic acid molecule thereby producing an oligonucleotide-attached nucleic acid molecule; and (b) cleaving part of the stem-loop oligonucleotide using an RNase; and using a second stem-loop oligonucleotide as a primer in amplifying the double stranded target nucleic acid molecule. In some embodiments, the second stem-loop oligonucleotide is an exact copy of the first stem-loop oligonucleotide. In some embodiments, the RNase comprises RNase H. In some embodiments, the stem-loop oligonucleotide comprises an RNA/DNA heteroduplex in a stem. In some embodiments, a strand of the oligonucleotide-attached nucleic acid molecule is extended. In some embodiments, the extension yields a second RNA/DNA heteroduplex. In some embodiments, cleaving part of the stem-loop oligonucleotide comprises cleaving the second RNA/DNA heteroduplex. In some embodiments, cleaving part of the stem-loop oligonucleotide generates a priming ready construct. In some embodiments, the priming ready construct comprises a single stranded region that is 3' to the double stranded target nucleic acid. In some embodiments, the amplification comprises PCR. In some embodiments, the amplification comprises SPIA. In some embodiments, the attaching comprises ligation. In some embodiments, T4 ligase is utilized in ligation. In some embodiments, the stem-loop oligonucleotide lacks 5' phosphate. In some embodiments, the target nucleic acid is produced by fragmenting a longer nucleic acid. In some embodiments, the target nucleic acid is blunt ended. In some embodiments, the target nucleic acid comprises a 3' overhang. In some embodiments, the target nucleic acid comprises a 5' overhang. In some embodiments, the stem-loop oligonucleotide further comprises a 3' overhang. In some embodiments, the stem-loop oligonucleotide further comprises a 5' overhang.
In some embodiments, the stem-loop oligonucleotide overhang and nucleic acid overhang are at least partially complementary to each other. In some embodiments, at least one loop of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group. In some embodiments, at least one stem of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group. In some embodiments, at least one overhang of the stem-loop oligonucleotide further comprises one or more sequence elements selected from the group consisting of a primer binding site, a sequencing primer binding site, a barcode, a promoter, an adaptor-group sequence, a restriction enzyme recognition site, a target complementary overhang, a probe complementary overhang, a probe binding site, a random sequence, and a complementary sequence to any of the sequence elements within the group.

INCORPORATION BY REFERENCE

[0007] All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:

[0009] Figure 1 illustrates examples of stem-loop oligonucleotides according to embodiments of the invention.

[0010] Figures 2A-B illustrate two different methods of ligating stem-loop oligonucleotides to a double stranded target nucleic acid followed by nucleic acid extension and RNA cleavage generating priming ready constructs.

[0011] Figure 3 illustrates a method for appending disparate sequences to the two ends of a target nucleic acid according to the embodiments of the invention.

[0012] Figures 4A-B illustrate methods for performing single primer isothermal amplification (SPIA) using stem-loop oligonucleotide (A) or linear (B) primers according to embodiments of the invention.

[0013] Figure 5 illustrates the use of stem-loop oligonucleotides as pro-primers and primers according to some embodiments of the invention.

[0014] Figure 6 illustrates the yields, in μg, of whole genome amplification (WGA) reactions utilizing either the stem-loop chimeric amplification primer or the corresponding linear chimeric amplification primer.

[0015] Figure 7 illustrates the BioAnalyzer profile of whole genome amplification (WGA) products, showing fluorescence units as a function of size in nucleotides. Line 1, stem-loop primer with 100 ng FFPE sample; line 2, linear primer with 50 ng FFPE sample; and line 3, linear primer with 100 ng FFPE sample.

DEFINITIONS

[0016] The term "nucleic acid" refers to a nucleotide polymer, and unless otherwise limited, includes known analogs of natural nucleotides that can function in a similar manner (e.g., hybridize) to naturally occurring nucleotides.

[0017] The terms "polynucleotide", "nucleotide", "nucleotide sequence", "nucleic acid" and "oligonucleotide" are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three dimensional structure, and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, intergenic DNA, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), small nucleolar RNA, ribozymes, complementary DNA (cDNA), which is a DNA representation of mRNA, usually obtained by reverse transcription of messenger RNA (mRNA) or by amplification; DNA molecules produced synthetically or by amplification, genomic DNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component. Polynucleotide sequences, when provided, are listed in the 5' to 3' direction, unless stated otherwise.
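By way of illustration only, the 5' to 3' listing convention can be sketched in Python. The function name `reverse_complement` and the `rna` flag are hypothetical and do not appear in this specification; the sketch assumes unmodified A, C, G, T and U bases and ignores nucleotide analogs.

```python
# Illustrative sketch only: names are hypothetical, not part of the specification.
# Both the input and the returned strand are written in the 5' to 3' direction.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the strand that pairs with `seq`, also listed 5' to 3'.

    If `rna` is True the result is written with U instead of T.
    """
    out = []
    # Walk the input from its 3' end so the result reads 5' to 3'.
    for base in reversed(seq.upper()):
        comp = _COMPLEMENT[base]  # raises KeyError on unsupported symbols
        if rna and comp == "T":
            comp = "U"
        out.append(comp)
    return "".join(out)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))            # DNA strand pairing with ATGC
    print(reverse_complement("AUGC", rna=True))  # RNA strand pairing with AUGC
```

Because both strands are written 5' to 3', the pairing partner of the first listed base is the last base of the returned string.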

[0018] The term nucleic acid encompasses double- or triple-stranded nucleic acids, as well as single-stranded molecules. In double- or triple-stranded nucleic acids, the nucleic acid strands need not be coextensive (i.e., a double-stranded nucleic acid need not be double-stranded along the entire length of both strands).

[0019] The term nucleic acid also encompasses any chemical modification thereof, such as by methylation and/or by capping. Nucleic acid modifications can include addition of chemical groups that incorporate additional charge, polarizability, hydrogen bonding, electrostatic interaction, and functionality to the individual nucleic acid bases or to the nucleic acid as a whole. Such modifications may include base modifications such as 2'-position sugar modifications, 5-position pyrimidine modifications, 8-position purine modifications, modifications at cytosine exocyclic amines, substitutions of 5-bromo-uracil, backbone modifications, unusual base pairing combinations such as the isobases isocytidine and isoguanidine, and the like.

[0020] More particularly, in certain embodiments, nucleic acids can include polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), and any other type of nucleic acid that is an N- or C-glycoside of a purine or pyrimidine base, as well as other polymers containing nonnucleotidic backbones, for example, polyamide (e.g., peptide nucleic acids (PNAs)) and polymorpholino (commercially available from the Anti-Virals, Inc., Corvallis, Oreg., as Neugene) polymers, and other synthetic sequence-specific nucleic acid polymers providing that the polymers contain nucleobases in a configuration which allows for base pairing and base stacking, such as is found in DNA and RNA. The term nucleic acid also encompasses linked nucleic acids (LNAs), which are described in U.S. Pat. Nos. 6,794,499, 6,670,461, 6,262,490, and 6,770,748, which are incorporated herein by reference in their entirety for their disclosure of LNAs.

[0021] The nucleic acid(s) can be derived from a completely chemical synthesis process, such as a solid phase-mediated chemical synthesis, from a biological source, such as through isolation from any species that produces nucleic acid, or from processes that involve the manipulation of nucleic acids by molecular biology tools, such as DNA replication, PCR amplification, reverse transcription, or from a combination of those processes.

[0022] As used herein, the term "target polynucleotide" or "target nucleic acid" refers to a nucleic acid molecule or polynucleotide in a starting population of nucleic acid molecules having a target sequence whose presence, amount, and/or nucleotide sequence, or changes in these, are desired to be determined. In general, a target polynucleotide is a double-stranded nucleic acid molecule, and may be derived from any source of or process for generating double-stranded nucleic acid molecules.

[0023] As used herein, the term "target sequence" refers generally to a nucleic acid sequence on a single strand of nucleic acid. The target sequence may be a portion of a gene, a regulatory sequence, genomic DNA, cDNA, RNA including mRNA, miRNA, and rRNA, or others. The target sequence may be a target sequence from a sample or a secondary target such as a product of an amplification reaction. The "target sequence" can be embedded in a molecule that includes the nucleotide sequence of a target nucleic acid, such as, for example, the amplification product obtained by amplifying a target nucleic acid or the cDNA produced upon reverse transcription of an RNA target nucleic acid.

[0024] A "nucleotide probe" or "probe" refers to a polynucleotide used for detecting or identifying its corresponding target polynucleotide in a hybridization reaction. Thus, a probe is hybridizable to one or more target polynucleotides. Probes can be perfectly complementary to one or more target polynucleotides in a sample, or contain one or more nucleotides that are not complemented by a corresponding nucleotide in the one or more target polynucleotides in a sample.

[0025] As used herein, the term "complementary" refers to the capacity for precise pairing between two nucleotides. That is, if a nucleotide at a given position of a nucleic acid is capable of hydrogen bonding with a nucleotide of another nucleic acid, then the two nucleic acids are considered to be complementary to one another at that position. Complementarity between two single-stranded nucleic acid molecules may be "partial," in which only some of the nucleotides bind, or it may be complete when total complementarity exists between the single-stranded molecules. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands.

[0026] "Hybridization" and "annealing" refer to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of PCR or another amplification reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme. A first sequence that can be stabilized via hydrogen bonding with the bases of the nucleotide residues of a second sequence is said to be "hybridizable" to said second sequence. In such a case, the second sequence can also be said to be hybridizable to the first sequence.

[0027] In general, a "complement" of a given sequence is a sequence that is fully or substantially complementary to and hybridizable to the given sequence. In general, a first sequence that is hybridizable to a second sequence or set of second sequences is specifically or selectively hybridizable to the second sequence or set of second sequences, such that hybridization to the second sequence or set of second sequences is preferred (e.g. thermodynamically more stable under a given set of conditions, such as stringent conditions commonly used in the art) to hybridization with non-target sequences during a hybridization reaction. Typically, hybridizable sequences share a degree of sequence complementarity over all or a portion of their respective lengths, such as between 25%-100% complementarity, including at least about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, and 100% sequence complementarity.
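By way of illustration only, the degree of sequence complementarity discussed above can be sketched as a simple position-by-position count. The function names, the restriction to Watson-Crick pairs, and the requirement of equal-length sequences are assumptions of this sketch and are not part of the disclosure.

```python
# Watson-Crick partners for DNA and RNA bases.
# Hoogsteen and other sequence-specific pairings are ignored in this sketch.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(first: str, second: str) -> float:
    """Percentage of positions that can pair when the strands are antiparallel.

    Both sequences are written 5'->3' and must have equal length. The second
    strand is reversed so that each base of the first strand faces its
    antiparallel partner.
    """
    if len(first) != len(second):
        raise ValueError("this sketch only compares equal-length sequences")
    if not first:
        raise ValueError("sequences must not be empty")
    facing = second[::-1].upper()
    matches = sum((a, b) in PAIRS for a, b in zip(first.upper(), facing))
    return 100.0 * matches / len(first)


def classify(first: str, second: str) -> str:
    """Label a pair of strands as complete, partial, or not complementary."""
    pct = percent_complementarity(first, second)
    if pct == 100.0:
        return "complete"
    return "partial" if pct > 0 else "none"
```

A real hybridization prediction would also weigh stringency conditions and thermodynamic stability, which this count does not attempt to model.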

[0028] The term "hybridized" as applied to a polynucleotide refers to a polynucleotide in a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. The hybridization reaction may constitute a step in a more extensive process, such as the initiation of PCR, or the enzymatic cleavage of a polynucleotide by a ribozyme. A sequence hybridized with a given sequence is referred to as the "complement" of the given sequence.

[0029] "Specific hybridization" refers to the binding of a nucleic acid to a target nucleotide sequence in the absence of substantial binding to other nucleotide sequences present in the hybridization mixture under defined stringency conditions. Those of skill in the art recognize that relaxing the stringency of the hybridization conditions allows sequence mismatches to be tolerated.

[0030] As used herein, the term "complementary" refers to the capacity for precise pairing between two nucleotides. I.e., if a nucleotide at a given position of a nucleic acid is capable of hydrogen bonding with a nucleotide of another nucleic acid, then the two nucleic acids are considered to be complementary to one another at that position. Complementarity between two single-stranded nucleic acid molecules may be "partial," in which only some of the nucleotides bind, or it may be complete when total complementarity exists between the single-stranded molecules. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands.

[0031] The term "oligonucleotide" is generally used to refer to a nucleic acid that is relatively short, generally shorter than 200 nucleotides, more particularly, shorter than 100 nucleotides, most particularly, shorter than 50 nucleotides. Typically, oligonucleotides are single-stranded DNA molecules.

[0032] The term "primer" refers to an oligonucleotide that is capable of hybridizing (also termed "annealing") with a nucleic acid and serving as an initiation site for nucleotide (RNA or DNA) polymerization under appropriate conditions (i.e., in the presence of four different nucleoside triphosphates and an agent for polymerization, such as DNA or RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. The appropriate length of a primer depends on the intended use of the primer, but primers are typically at least 7 nucleotides long and, more typically range from 10 to 30 nucleotides, or even more typically from 15 to 30 nucleotides, in length. Other primers can be somewhat longer, e.g., 30 to 50 nucleotides long. In this context, "primer length" refers to the portion of an oligonucleotide or nucleic acid that hybridizes to a complementary "target" sequence and primes nucleotide synthesis. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize with a template. The term "primer site" or "primer binding site" refers to the segment of the target nucleic acid to which a primer hybridizes. A construct presenting a primer binding site is often referred to as a "priming ready construct" or "amplification ready construct".
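The relationship between primer length and hybridization temperature noted above can be illustrated with the Wallace rule, a common rule of thumb for short oligonucleotides that is not taken from this disclosure. The function name below is hypothetical, and the estimate is a rough sketch only.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C).

    The Wallace rule is a rule of thumb for short primers; it ignores salt,
    primer concentration, and nearest-neighbour effects.
    """
    seq = primer.upper()
    at = sum(seq.count(base) for base in "ATU")  # U is counted like T
    gc = sum(seq.count(base) for base in "GC")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T, U")
    return 2 * at + 4 * gc
```

Under this estimate a shorter primer of similar composition yields a lower value than a longer one, consistent with the statement that short primers generally require cooler temperatures.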

[0033] A primer is said to anneal to another nucleic acid if the primer, or a portion thereof, hybridizes to a nucleotide sequence within the nucleic acid. The statement that a primer hybridizes to a particular nucleotide sequence is not intended to imply that the primer hybridizes either completely or exclusively to that nucleotide sequence.

[0034] As used herein, "expression" refers to the process by which a polynucleotide is transcribed into mRNA and/or the process by which the transcribed mRNA (also referred to as "transcript") is subsequently translated into peptides, polypeptides, or proteins. The transcripts and the encoded polypeptides are collectively referred to as "gene product." If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell.

DETAILED DESCRIPTION OF THE INVENTION

[0035] The practice of the present invention employs, unless otherwise indicated, conventional techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics and recombinant DNA, which are within the skill of the art. See Sambrook, Fritsch and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition (1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F. M. Ausubel, et al. eds., (1987)); the series METHODS IN ENZYMOLOGY (Academic Press, Inc.): PCR 2: A PRACTICAL APPROACH (M.J. MacPherson, B.D. Hames and G.R. Taylor eds. (1995)), Harlow and Lane, eds. (1988) ANTIBODIES, A LABORATORY MANUAL, and ANIMAL CELL CULTURE (R.I. Freshney, ed. (1987)).

[0036] The present invention relates to methods and compositions relating to the use of stem-loop oligonucleotides as adaptors and/or primers. In various embodiments, the stem-loop oligonucleotides are chimeras of RNA and DNA. In some embodiments, the stem-loop oligonucleotide is activated through the cleavage of an RNA/DNA heteroduplex stem (e.g. by degradation of RNA by RNase H). In some embodiments, the stem-loop oligonucleotide comprises an RNA sequence in at least a portion of its loop and/or in its stem. Methods of replication and amplification using the stem-loop oligonucleotides described herein can also generate RNA/DNA heteroduplexes. Further, methods and compositions in various embodiments of the invention relate to the use of a stem-loop oligonucleotide as a dual purpose agent acting as an adaptor and primer at once.

[0037] In some embodiments, the present invention allows for the amplification of molecules having at least one double stranded region by using adaptors that reduce the propensity to form adaptor dimers. In certain aspects, the present invention provides an inert oligonucleotide (e.g. a stem-loop oligonucleotide) for attachment to a double stranded molecule such that it renders the oligonucleotide-ligated molecule capable of being modified, such as amplified, for example by strand displacement, single primer isothermal amplification (SPIA) or polymerase chain reaction. For example, upon cleavage of the inert adaptor, the oligonucleotide becomes active and suitable for providing at least in part one or more sequences employable for adaptor dependent nucleic acid extension or amplification. For another example, during SPIA amplification or polymerase chain reaction, non-cleaved adaptor/primer (e.g. stem-loop composite oligonucleotide) cannot prime effectively due to duplex formation along its stem. Its use as a primer (before it is cleaved) can be mildly or strictly limited due to the internal stem formation competing with target hybridization.

[0038] In addition to the advantages provided by the adaptors/primers of the invention, the invention further provides novel conditions for modification of nucleic acid molecules with the composite stem-loop adaptors, and subsequent amplification with said activated oligonucleotides. In most cases in the art, the process of nucleic acid analysis involves multiple enzymatic reactions that are performed in a sequential manner, frequently with intermediate purification steps between at least some of the reactions. For example, preparation of a library from genomic or viral DNA may involve fragmentation, purification, "polishing" (end-repair) of fragments, a second purification or heat inactivation, adaptor ligation and subsequent extension reactions in separate steps that often need a considerable amount of time, equipment use and human intervention, creating major obstacles for high throughput and diagnostic-type applications. The process becomes more complicated when library preparation involves additional enzymatic steps, such as DNA or library digestion with methylation-sensitive or methylation-specific endonucleases (for example, in the preparation of Methylome libraries) or restriction nuclease cleavage within the adaptor sequence to produce sticky ends for DNA cloning.

[0039] In specific aspects, the present invention introduces a concept of multiplexing two or more enzymatic processes in one reaction, and teaches how to optimize a highly multiplexed enzymatic process. In particular embodiments, the present invention is directed to compositions and methods for simultaneous processing of DNA molecules with a combination of enzymes in a one-step-one-tube reaction, producing either a collection of molecules suitable for further processing, for example for amplification, or, in some cases, amplified DNA molecules. In addition, the methods and compositions of the present invention provide composite adaptor/primer oligonucleotides and methods of using the same that eliminate the possible formation of stable adaptor dimers, thus eliminating potential difficulties regularly introduced by other commonly employed methods and reagents. In particular embodiments, the methods of the invention can be easily applied to any type of fragmented double stranded DNA including but not limited to, for example, free DNA isolated from plasma, serum, and/or urine; apoptotic DNA from cells and/or tissues; DNA fragmented enzymatically in vitro (for example, by DNase I and/or restriction endonuclease); and/or DNA fragmented by mechanical forces (hydro-shear, sonication, nebulization, etc.). Additional suitable methods and compositions of producing nucleic acid molecules comprising stem-loop oligonucleotides are further described in detail in US Patent No. 7,803,550, which is herein incorporated by reference in its entirety.

[0040] In other embodiments, the invention can be easily applied to any high molecular weight double stranded DNA including, for example, DNA isolated from tissues, cell culture, bodily fluids, animal tissue, plant, bacteria, fungi, viruses, etc.

[0041] Disclosed are compositions and methods for the generation of libraries using chimeric RNA-DNA stem-loop oligonucleotides from any nucleic acid molecules. In various embodiments, the chimeric stem-loop oligonucleotides comprise heteroduplex stems. The stem-loop oligonucleotides described herein can be used as adaptors, primers or both. In various embodiments, the nucleic acid molecules are double stranded DNA molecules. In some embodiments, the unique structure of the stem-loop chimeric oligonucleotides provides for the generation of partial duplex DNA from fragments of the sample dsDNA comprising a dsDNA fragment insert and 3'-ssDNA at one or both ends, which comprise a sequence complementary to a sequence of the stem-loop composite chimeric oligonucleotide.

[0042] The partial duplex product thus generated can be a substrate for amplification, for example, PCR or isothermal amplification. In some embodiments, the amplification reaction employs the same chimeric RNA-DNA stem-loop oligonucleotide or a different stem-loop oligonucleotide as a primer. For example, an isothermal amplification reaction comprising a chimeric RNA-DNA stem-loop oligonucleotide or a portion thereof, DNA polymerase with strand displacement, and RNase H can be performed for nucleic acid amplification using as substrate said partial duplex product.

[0043] In some embodiments, the disclosed stem-loop chimeric adaptor-pro-primer is used as a primer for amplification, for example isothermal amplification of nucleic acids. Such pro-primer molecules can be activated to a primer in the reaction mixture. Primer activation can be carried out by an RNase targeting RNA in an RNA/DNA duplex, such as RNase H. The RNase can be included as a component of the amplification reaction mixture. The concentration of the pro-primer, the amount of RNase activity and other reaction conditions can be adjusted to achieve activation of the pro-primer at a desired rate. For example, the pro-primer can be activated in the same reaction mixture, at a rate that is comparable to the overall duration of the amplification reaction, allowing for gradual activation of stem-loop oligonucleotides over time. The pro-primer chimeric RNA-DNA stem-loop oligonucleotide can be useful for the isothermal amplification, for example SPIA, of nucleic acids employing any other method for the generation of substrates for SPIA amplification, such as described for RNA amplification or whole genome amplification in US Patent Nos. 6,251,639 and 6,946,251, which are both herein incorporated by reference in their entirety. Primer activation from the pro-primer stem-loop chimeric oligonucleotide serves as a "slow release" of an active primer in the reaction. In general, formation of side products, for example due to high primer concentration in the initial period of an amplification reaction, can be mitigated, leading to an increase in desired amplification products (specificity) and higher yield. For example, stem-loop oligonucleotide dimers, which are formed from the attachment of two stem-loop oligonucleotides, can be targeted by RNase activity cleaving the RNA in the RNA/DNA duplex of the stem, thus releasing the remaining portions of the two oligonucleotides from each other.
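The "slow release" behaviour described above can be pictured with a toy kinetic model. The sketch below assumes simple first-order activation of the pro-primer, which is an assumption made here for illustration and is not stated in the disclosure; the function name, parameter names, and any numeric values used with them are hypothetical.

```python
import math


def active_primer(initial_pro_primer: float, rate_per_min: float, minutes: float) -> float:
    """Amount of activated primer after `minutes`, assuming first-order activation.

    P_active(t) = P0 * (1 - exp(-k * t)), where P0 is the starting amount of
    pro-primer and k is an assumed activation rate constant that would depend
    on RNase H activity and reaction conditions.
    """
    if initial_pro_primer < 0 or rate_per_min < 0 or minutes < 0:
        raise ValueError("amounts, rates and times must be non-negative")
    return initial_pro_primer * (1.0 - math.exp(-rate_per_min * minutes))
```

In such a model, lowering the rate constant keeps the active primer concentration low during the initial period of the reaction, which is the condition the paragraph above associates with fewer side products.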

Stem-loop chimeric RNA-DNA adaptor/pro-primer:

[0044] In various embodiments, a chimeric/composite stem-loop oligonucleotide comprises any oligonucleotide having a sequence that can be joined to a target polynucleotide. In many embodiments, at least a portion of the oligonucleotide sequence is known. Stem-loop oligonucleotides can comprise DNA, RNA, nucleotide analogues, non-canonical nucleotides, labeled nucleotides, modified nucleotides, or combinations thereof. Stem-loop oligonucleotides can comprise one or more of single-stranded and double-stranded regions, giving rise to stems, loops and overhangs. For example, a stem-loop oligonucleotide can have one or more loops, stems and optionally one or more overhangs. In a single strand stem-loop oligonucleotide, multiple overhangs can be achieved by lack of complementarity in the 5' and 3' distal regions. Any one of stems, loops, and overhangs can comprise DNA or RNA sequences. In general, a partial-duplex oligonucleotide comprises one or more single-stranded regions and one or more double-stranded regions. Double-stranded compositions can comprise two separate oligonucleotides hybridized to one another (also referred to as an "oligonucleotide duplex"), and hybridization may leave one or more blunt ends, one or more 3' overhangs, one or more 5' overhangs, one or more bulges resulting from mismatched and/or unpaired nucleotides, or any combination of these. In some embodiments, a single-stranded adaptor comprises two or more sequences that are able to hybridize with one another. When two such hybridizable sequences are contained in a single-stranded oligonucleotide, hybridization yields a hairpin structure or a stem-loop structure. When two hybridized regions of an oligonucleotide are separated from one another by a non-hybridized region, a "bubble" structure results. An oligonucleotide comprising a bubble structure can consist of a single strand comprising internal hybridizations, or may comprise two or more strands hybridized to one another. Internal sequence hybridization, such as between two hybridizable sequences in a strand, can produce a double-stranded structure in a single-stranded adaptor oligonucleotide. Oligonucleotides of different kinds can be used in combination, such as a hairpin oligonucleotide, stem-loop oligonucleotide and a double-stranded oligonucleotide, or oligonucleotides of different sequences. Hybridizable sequences of the adaptors and primers described herein may or may not include one or both ends of the oligonucleotide. When neither of the ends is included in the hybridizable sequences, both ends are "free" or "overhanging." When only one end is hybridizable to another sequence in the adaptor, the other end forms an overhang, such as a 3' overhang or a 5' overhang. When both the 5'-terminal nucleotide and the 3'-terminal nucleotide are included in the hybridizable sequences, such that the 5'-terminal nucleotide and the 3'-terminal nucleotide are complementary and hybridize with one another, the end is referred to as "blunt." Different adaptors can be joined to target polynucleotides in sequential reactions or simultaneously. For example, the first and second adaptors can be added to the same reaction. Adaptors can be manipulated prior to combining with target polynucleotides. For example, terminal phosphates can be added or removed.

[0045] Stem-loop oligonucleotides according to the various embodiments of the invention vary in size. Stem-loop oligonucleotides are typically at least 10 nucleotides long and, more typically range from 14 to 60 nucleotides, or even more typically from 18, 21, 24, 27 to 40, 45 or 50 nucleotides, in length. Other stem-loop oligonucleotides can be somewhat longer, e.g., 50 to 80 nucleotides long or longer. Stem-loop oligonucleotides can have any of the length ranges capped by any of the limits above, e.g. 18 to 60 nucleotides.

[0046] In some embodiments, one of the hybridizable sequences in a single-stranded stem-loop structure comprises RNA. For example, a stem-loop oligonucleotide can comprise a 5' end comprising sequence A' and a 3' end comprising sequence A, where A is hybridizable to A', one of A or A' comprises DNA, and the other of A or A' comprises RNA. Examples of such stem-loop oligonucleotides are illustrated in Figure 1. Similarly, a stem-loop oligonucleotide can comprise a 5' end comprising sequence B and a 3' end comprising sequence B', where B is hybridizable to B', one of B or B' comprises DNA, and the other of B or B' comprises RNA. In some embodiments, one of A or A' consists entirely of DNA, and/or one of A or A' consists entirely of RNA. In some embodiments, one of B or B' consists entirely of DNA, and/or one of B or B' consists entirely of RNA. Sequence A can be the same as or different from sequence B and/or B'. Sequence A' can be the same as or different from sequence B and/or B'. In some embodiments, the stem comprising RNA (e.g. A, A', B, or B') further comprises one or more terminal DNA residues (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more terminal DNA residues), such that the sequence comprising RNA is flanked by DNA residues at both ends (i.e. both the 5' end and the 3' end of the sequence comprising RNA). Hybridization of a sequence comprising RNA to a sequence comprising DNA creates an RNA-DNA heteroduplex. In some embodiments, RNA is cleaved by an enzyme that cleaves RNA from an RNA-DNA heteroduplex, such as enzymes comprising ribonuclease activity. Preferably, the enzyme comprising ribonuclease activity cleaves ribonucleotides in an RNA/DNA heteroduplex regardless of the identity and type of nucleotides adjacent to the ribonucleotide to be cleaved. It is preferred that the ribonuclease cleaves independent of sequence identity. Examples of suitable enzymes comprising ribonuclease activity for the methods and compositions of the invention are well known in the art, including ribonuclease H (RNase H) and enzymes comprising RNase H activity, e.g., Hybridase. In some embodiments, cleavage of RNA from an RNA-DNA heteroduplex removes all double-stranded character from a single-stranded stem-loop oligonucleotide, such that extension by a polymerase that uses the oligonucleotide as template requires no strand displacement step or strand displacement activity. In some embodiments, one or both ends of a stem-loop oligonucleotide comprising RNA in its stem are joined to a target polynucleotide, such that cleavage of the RNA from the RNA-DNA heteroduplex produces a 5' overhang or a 3' overhang. In some embodiments, a stem-loop oligonucleotide comprises a tag/barcode sequence. Reactions of the stem-loop oligonucleotide with a target nucleic acid, such as ligation, extension, priming etc., can yield a target nucleic acid tagged with the barcode sequence. Blunt or sticky ended stem-loop oligonucleotides can be joined with blunt or sticky ended target polynucleotides. In some embodiments, an end comprising a 5' overhang produced by cleavage of RNA from an RNA-DNA heteroduplex is filled in by the extension of the produced 3' end using the 5' overhang as template.
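The activation of a chimeric stem-loop oligonucleotide by cleavage of the RNA in its heteroduplex stem can be sketched as a toy segment model. The `Segment` class, the `rnase_h_digest` function, and the particular pro-primer layout below are hypothetical constructs for illustration; the model treats each named region as a unit and does not represent real enzyme kinetics or partial cleavage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    name: str                          # label such as "A'", "loop", "A"
    chemistry: str                     # "RNA" or "DNA"
    paired_with: Optional[str] = None  # name of the hybridized partner segment, if any


def rnase_h_digest(segments: List[Segment]) -> List[Segment]:
    """Remove RNA segments that are hybridized to DNA and unpair their partners.

    Single-stranded RNA (for example an RNA loop) is left intact, because the
    modelled activity only targets RNA within an RNA-DNA heteroduplex.
    """
    chemistry = {s.name: s.chemistry for s in segments}
    cleaved = {
        s.name
        for s in segments
        if s.chemistry == "RNA"
        and s.paired_with is not None
        and chemistry.get(s.paired_with) == "DNA"
    }
    survivors = []
    for s in segments:
        if s.name in cleaved:
            continue
        partner = None if s.paired_with in cleaved else s.paired_with
        survivors.append(Segment(s.name, s.chemistry, partner))
    return survivors


# Hypothetical pro-primer written 5'->3': RNA stem arm A', RNA loop, DNA stem arm A.
pro_primer = [
    Segment("A'", "RNA", paired_with="A"),
    Segment("loop", "RNA"),
    Segment("A", "DNA", paired_with="A'"),
]
activated = rnase_h_digest(pro_primer)
```

In this sketch the digested product keeps the single-stranded RNA loop and the now unpaired DNA arm A, that is, a linear oligonucleotide with no remaining double-stranded character.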

[0047] Various exemplary configurations of the stem-loop chimeric RNA-DNA adaptor/pro-primer are shown in Figure 1. In some embodiments, the stem-loop chimeric RNA-DNA oligonucleotides are composed of sequences that are complementary at their distal ends (A and A') and form a DNA-RNA heteroduplex stem. In various embodiments, the stem-loop structures comprise a single stranded loop region (B). All or part of the loop region can comprise RNA (loop-RNA). The complement of the DNA sequence at the 3'-region as well as the RNA loop sequence can be removed, exposing a priming site for nucleic acid amplification. In some embodiments, an isothermal amplification comprising an RNA-DNA chimeric primer utilizes said priming site. In some embodiments the RNA/DNA chimeric primer is a stem-loop chimeric pro-primer. In some embodiments the stem-loop chimeric adaptor and the stem-loop chimeric pro-primer are the same.

[0048] In some embodiments, the 3' or 5'-region of the chimeric RNA-DNA stem-loop oligonucleotide of the current invention comprises a 3' or 5'-single stranded overhang, comprising DNA or RNA or a combination of RNA and DNA. The 3' or 5'-single stranded overhangs may be utilized for hybridization to sticky-ended template fragments, such as those generated by fragmentation of the template dsDNA by restriction enzymes, for example, or when fragments are tailed by a single nucleotide, such as for example in the case of AT cloning procedures as described in detail in US Patent Pub. US 20100167954 and Marchuk et al. (Construction of T-vectors, a rapid and general system for direct cloning of unmodified PCR products. Marchuk et al, Nucleic Acids Res. 1991 March 11; 19(5): 1154), which are both incorporated herein in their entirety. In these cases, the overhang sequence can be designed to be sufficiently complementary to a 3' or 5'-overhang on a template fragment, allowing hybridization under the reaction conditions.

[0049] The sequence in the 3' or 5'-region comprising the stem of the stem-loop chimeric oligonucleotide of the invention may be all RNA or may comprise RNA and DNA nucleotides. The length of the RNA sequence in the stem region can be designed to optimize the rate of cleavage of the RNA portion by an RNase H that cleaves the RNA sequences in a DNA-RNA heteroduplex. In some embodiments the chimeric stem-loop oligonucleotide adaptor and pro-primer are the same. In some embodiments, the stem sequence is designed for optimal performance of the chimeric primer generated by cleavage of the RNA portion of the stem heteroduplex by RNase H.

[0050] In various embodiments of the invention, a plurality of stem-loop oligonucleotides are used as adaptors and/or primers. For example, a first stem-loop oligonucleotide and a second stem-loop oligonucleotide can be used to attach to two different ends of a target nucleic acid. For another example, a first stem-loop oligonucleotide and a second stem-loop oligonucleotide can be used to prime from two different ends of a target nucleic acid. For yet another example, a first stem-loop oligonucleotide can be used as an adaptor and a second stem-loop oligonucleotide can be used as a primer. A first and a second stem-loop oligonucleotide can be 25%-100% similar to each other, including at least about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, and 100% sequence similarity. A first and a second stem-loop oligonucleotide can have 25%-100% complementarity, including at least about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, and 100% sequence complementarity.

[0051] In some embodiments, a second stem-loop oligonucleotide interacts with a product generated using a first stem-loop oligonucleotide. For example, a first stem-loop oligonucleotide can be attached to a target nucleic acid via the oligonucleotide's 3' end and the target nucleic acid can be extended complementary to the attached oligonucleotide, as described below (Figure 2a). A portion of the attached oligonucleotide can be cleaved, generating a priming ready construct. A second stem-loop oligonucleotide that is hybridizable to the priming ready construct, such as one that has a sequence similarity with the first stem-loop oligonucleotide, can further prime an amplification reaction. For another example, a first stem-loop oligonucleotide can be attached to a target nucleic acid via the oligonucleotide's 5' end. Upon cleavage of a heteroduplex stem region, a priming ready construct can be generated comprising a portion of the stem-loop oligonucleotide in a 3' overhang. A second stem-loop oligonucleotide that is hybridizable to the priming ready construct, such as one that has a sequence complementarity with the first stem-loop oligonucleotide, can further prime an amplification reaction.

[0052] The sequence of the stem-loop chimeric RNA-DNA adaptor/pro-primer may further comprise one or more unique identifier sequences, or molecular barcodes, which will be useful for multiplex amplification and analysis of pooled samples.
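As a hedged illustration of how molecular barcodes support analysis of pooled samples, the sketch below sorts reads into samples by an exact match on a leading barcode. The function name, the placement of the barcode at the start of each read, and the fixed barcode length are assumptions of this sketch and are not specified in the disclosure.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def demultiplex(reads: Iterable[str], barcodes: Dict[str, str]) -> Dict[str, List[str]]:
    """Group reads by sample using an exact match on a leading barcode.

    `barcodes` maps a barcode sequence to a sample name, and all barcodes must
    share one length. The barcode is trimmed from assigned reads; reads with
    an unknown barcode are kept whole under "unassigned".
    """
    lengths = {len(b) for b in barcodes}
    if len(lengths) != 1:
        raise ValueError("this sketch expects barcodes of one fixed length")
    (n,) = lengths
    bins: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:n])
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(read[n:])
    return dict(bins)
```

A practical pipeline would usually tolerate sequencing errors in the barcode, which this exact-match sketch does not.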

[0053] Further features and uses of adaptor oligonucleotides are described in US Patent Publications US 2011-0319290 and US 2011-0105364, which are both herein incorporated by reference in their entirety.

Library generation using stem-loop chimeric RNA-DNA adaptor/pro-primer

[0054] In various embodiments, library generation of target nucleic acids is carried out by ligation of stem-loop chimeric RNA-DNA adaptor/pro-primer to the ends of the said target nucleic acids. The target nucleic acids can be dsDNA or fragments thereof. Cleavage of the RNA portion of the heteroduplex DNA-RNA stem can be achieved using an RNase targeting said RNA, such as RNase H. Replication of the released stem and/or loop sequences that are attached to a first strand of the target nucleic acid can be achieved, for example, by extending a complementary strand along the released sequences. The released sequences can be directly adjacent to the 3'-DNA sequence of the stem or include the same. Synthesis of a DNA sequence along the stem-loop oligonucleotide sequences, for example by DNA polymerase, can yield a DNA-RNA heteroduplex comprising one or more RNA sequences from the stem-loop oligonucleotide and the synthesized DNA that is complementary. Cleavage of the RNA sequence(s) in the RNA-DNA heteroduplex, for example with RNase H, can result in a 3' overhang that is complementary to at least a portion of the stem-loop oligonucleotide. In some embodiments, the RNA sequence(s) forming a duplex with the synthesized DNA originate from the loop of the stem-loop oligonucleotide. In some embodiments the DNA polymerase is a DNA dependent DNA polymerase with RNA dependent DNA polymerase activity. In some embodiments the DNA dependent DNA polymerase and the RNA dependent DNA polymerase are two different enzymes.

[0055] Figures 2a and 2b illustrate a series of reaction steps for the generation of nucleic acid libraries, according to some embodiments of the invention. The two figures differ in the first step of attachment of the stem-loop chimeric RNA-DNA adaptor/pro-primer to the fragments of the dsDNA target, yielding two different kinds of stem-loop-target complexes. Adaptor attachment through ligation may be carried out by ligation to one of the strands of the dsDNA fragment (Figure 2a), resulting in a stem-loop-target complex with a nick or gap, or to both strands of the dsDNA fragment (Figure 2b), resulting in a stem-loop-target complex with strand continuity, without a nick or a gap between the stem-loop oligonucleotide and the target. The nature of the ligation may depend on the nature of the ends of the dsDNA and the stem-loop adaptor as well as the choice of the ligase activity. In many cases, ligation depends on the phosphorylation of the 5' end. In these cases, ligation to both strands of the dsDNA can be achieved using a phosphorylated 5'-end on both the target nucleic acid and the stem-loop chimeric RNA-DNA adaptor/pro-primer. When only one of the components comprises a 5'-phosphate, only one of the strands of the target nucleic acid can be ligated to the adaptor/pro-primer. In some embodiments, a strand of the target nucleic acid is ligated to the double stranded portion of the stem-loop. When one end, e.g. the 5'-end, of the chimeric stem-loop RNA-DNA adaptor/pro-primer comprises RNA, employment of a ligase capable of ligating a DNA sequence to an RNA sequence can be beneficial. The ligase can further comprise DNA-DNA ligation activity, allowing for a DNA end of the stem-loop oligonucleotide to ligate to a DNA target nucleic acid.
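The dependence of the ligation outcome on 5' phosphorylation can be summarized in a small decision function. This is a minimal sketch that assumes a ligase requiring a 5'-phosphate next to a 3'-hydroxyl at each junction; the function name and the returned labels are hypothetical.

```python
def ligation_outcome(target_5p_phosphate: bool, adaptor_5p_phosphate: bool) -> str:
    """Predict which junctions can be sealed at one adaptor-target end.

    Assumes a ligase that joins a 3'-hydroxyl to an adjacent 5'-phosphate:
      - the adaptor 3' end can join the target strand only if that strand's
        5' end carries a phosphate;
      - the target 3' end can join the adaptor only if the adaptor's 5' end
        carries a phosphate.
    """
    sealed = int(target_5p_phosphate) + int(adaptor_5p_phosphate)
    if sealed == 2:
        return "both strands ligated (continuous, as in Figure 2b)"
    if sealed == 1:
        return "one strand ligated (nick or gap remains, as in Figure 2a)"
    return "no ligation"
```

The sketch ignores end compatibility (blunt versus sticky ends) and the choice of ligase, both of which the paragraph above identifies as further determinants of the ligation.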

[0056] After ligation to either a single strand of the target nucleic acid, for example a double stranded target nucleic acid (Figure 2a), or to both strands (Figure 2b), a complementary sequence can be synthesized, for example using 3'-extension along the stem-loop oligonucleotide sequence. Cleaving the stem of the adaptor using an RNase, for example RNase H, can linearize the stem-loop structure and release it from one of the strands of the target nucleic acid at the same time. The stem-loop sequences can be replicated by a DNA polymerase through extension of a free 3'-end of the target nucleic acid, forming a target nucleic acid extension product. Target nucleic acid-adaptor constructs with a single strand attachment can be replicated without the activity of an RNase. Extension of the fragment strand along the ligated chimeric DNA-RNA oligonucleotide can be carried out with a DNA polymerase comprising DNA- and RNA-dependent DNA polymerase activities, such as a reverse transcriptase or a DNA polymerase comprising both activities, such as Bst, or a combination thereof. The target nucleic acid extension product resulting from the extension may comprise a DNA-RNA heteroduplex comprising an RNA sequence originating from the adaptor, e.g. from the loop, from any other adaptor sequence, or from a combination of adaptor locations, and a DNA extension product. The RNA portion of the heteroduplex can be cleaved by an RNase, such as RNase H, in the same reaction mixture combining the product of the extension by DNA polymerase and RNase H. In various embodiments, the product of this reaction comprises a library of partial duplexes comprising dsDNA inserts of the target nucleic acids flanked by 3'-single stranded DNA extensions (or overhangs) comprising sequences complementary to the A and B sequences of the stem-loop chimeric RNA-DNA adaptor/pro-primer (Figure 2a and 2b).

[0057] The partial duplex products described above are suitable for further manipulation, for example generation of libraries comprising dsDNA fragments and 5'- and/or 3'-single stranded DNA overhangs as illustrated in Figure 3 or isothermal amplification using the chimeric RNA-DNA primers, as illustrated in Figures 4a and 4b.

Generation of libraries of dsDNA fragments with 5'- and 3'-single stranded non-complementary DNA comprising defined sequences

[0058] In various embodiments, libraries comprising dsDNA fragment inserts and known overhang sequences at one or both ends are generated from the partial duplexes described above. A partial duplex with a 3' overhang can be combined with an oligonucleotide comprising a 3'-sequence sufficiently complementary to a sequence of the single stranded 3' overhang portion of the partial duplex and a 5'-tail sequence. The complementary sequence can be directly adjacent to the double stranded region. The 5' tail sequence can be any desired sequence that is unable to efficiently hybridize to the 3' overhang based on lack of complementarity, under conditions which allow hybridization of the 3' end of the oligonucleotide to the partial duplex. The 3'-hybridized portion of the hybridized oligonucleotide can be ligated to the 5' end of the partial duplex downstream (Figure 3), thereby generating terminal Y (or forked) products comprising a dsDNA portion and non-complementary 3'- and 5'-single stranded sequences at a given end. Methods and compositions relating to terminal Y products are further described in US Patent No. 7,741,463, which is herein incorporated in its entirety.
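The design of such a tailed oligonucleotide can be sketched in Python. This is a minimal illustration under stated assumptions: the function names, the example sequences, and the crude "longest complementary run" screen are hypothetical and are not taken from the application. Sequences are treated as DNA written 5' to 3'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_tailed_oligo(overhang: str, tail: str) -> str:
    """Build 5'-tail + hybridizing portion-3'.

    overhang -- the 3' single-stranded overhang of the partial duplex (5'->3')
    tail     -- the desired 5' tail, intended not to hybridize to the overhang
    """
    return tail.upper() + reverse_complement(overhang)


def longest_complementary_run(tail: str, overhang: str) -> int:
    """Length of the longest stretch of the tail that could pair with the overhang."""
    rc_tail = reverse_complement(tail)
    target = overhang.upper()
    best = 0
    for i in range(len(rc_tail)):
        # Only look for runs longer than the best one found so far.
        for j in range(i + best + 1, len(rc_tail) + 1):
            if rc_tail[i:j] in target:
                best = j - i
            else:
                break
    return best


if __name__ == "__main__":
    overhang = "GATTACAG"  # hypothetical overhang sequence
    tail = "CCCCCC"        # hypothetical non-complementary tail
    print(design_tailed_oligo(overhang, tail))
    print(longest_complementary_run(tail, overhang))
```

In this sketch the 3' end of the designed oligonucleotide pairs with the base of the overhang nearest the double stranded region, which is the position from which ligation to the adjacent 5' end would proceed. A short complementary run is only a rough proxy for "unable to efficiently hybridize"; actual behavior depends on the hybridization conditions.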

[0059] A library comprising nucleic acids with greater dsDNA content or fully double stranded (ds) nucleic acids with desired sequences at both ends can be generated. The desired sequences can be different on each end of a strand. An oligonucleotide comprising a sequence complementary to a single stranded 3'-end of a terminal Y product like the ones described above can be hybridized to the terminal Y product. Extension of the hybridized oligonucleotide (primer) by DNA polymerase can generate dsDNA products comprising a target insert and a distinct sequence at each end, as shown in Figure 3. This process provides for the generation of libraries with asymmetric end sequences without the need to append/ligate a different sequence to each end. The described process of appending sequences is symmetric with increased efficiency for the library generation.

[0060] Further amplification of the various libraries described herein is possible, when desired, by various methods including, for example, PCR employing suitable forward and reverse primers or isothermal amplification/SPIA with one or more suitable primer(s). In some embodiments, the amplification primers may further comprise tails that are not hybridizable to the unique sequences of the library, thus extending the common sequences used in the library as may be desired for further manipulation of the libraries, such as for sequencing.

Isothermal amplification using chimeric RNA-DNA primers

[0061] Partial duplex products comprising a dsDNA insert of the fragmented dsDNA template described above (Figures 2a and 2b) can be used as substrates for isothermal amplification employing chimeric RNA-DNA primers (SPIA), as shown in Figures 4a and 4b. Amplification may be carried out with a stem-loop chimeric RNA-DNA pro-primer or a linear, non-stem-loop chimeric RNA-DNA primer. The chimeric primer may be the stem-loop chimeric adaptor/pro-primer employed for the generation of the said partial duplex.

[0062] Amplification can be carried out in reaction mixtures comprising a partial duplex, a chimeric RNA-DNA primer, a DNA polymerase with strand displacement activity, and an RNase, such as RNase H, which cleaves RNA in an RNA-DNA duplex. The chimeric RNA-DNA primer may be generated in the reaction mixture from the stem-loop chimeric RNA-DNA pro-primer through cleavage of the RNA portion of the stem RNA-DNA heteroduplex. The generation of an active primer from the pro-primer in the reaction mixture reduces the potential accumulation of undesired products that may be generated in the presence of a high concentration of active primer. Active primer can be generated from the stem-loop chimeric RNA-DNA pro-primer by cleavage of the RNA in the RNA-DNA heteroduplex stem portion by RNase H as shown in Figure 4a (insert).

[0063] The sequence of the chimeric RNA-DNA primer can comprise a sequence that is the same as sequence A and/or B (the loop sequence of the stem-loop chimeric pro-primer; Figure 4a), or may comprise sequences hybridizable to the complement of the loop sequence (Figure 4b).

[0064] Figure 5 illustrates the use of stem-loop oligonucleotides as pro-primers and primers according to some embodiments of the invention. Accordingly, a composite amplification primer can be generated in the amplification reaction mixture from a stem-loop chimeric pro-primer. The amplification reaction mixture can comprise one or more target partial duplex nucleic acid(s), for example a partial duplex DNA with a 3' overhang as illustrated in the figure, one or more chimeric stem-loop pro-primer(s), one or more DNA polymerase(s) with strand displacement activity, and one or more RNases targeting RNA in RNA/DNA heteroduplexes, for example RNase H. The RNA portion of the RNA/DNA heteroduplex at the stem of the chimeric stem-loop pro-primer can be cleaved by RNase H to generate, for example, a linear composite primer comprising a 3'-DNA and 5'-RNA. The linearized amplification primer can hybridize to a 3'-single stranded DNA portion (overhang) of a target partial duplex and can be extended by the DNA polymerase with strand displacement activity. The RNA portion of the hybridized primer in a heteroduplex can be cleaved by RNase H to free a portion of the primer binding site. A second linear composite amplification primer can hybridize to the freed primer binding site, and can be extended along the target DNA strand. The previously synthesized primer extension product (amplification product) can be displaced by the newly extended primer. Repeated cycles of primer hybridization, primer extension by strand displacement DNA polymerase, and cleavage of the RNA portion of the hybridized primer can generate multiple copies of a target nucleic acid (Figure 5).
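The cycle described for Figure 5 can be pictured with a deliberately simplified Python sketch. Everything in it is an assumption made for illustration: composite primers are written as strings with lowercase letters for RNA and uppercase letters for DNA, RNase H cleavage is modeled as removal of the 5' RNA portion, and each round is taken to yield exactly one extension. It is not a kinetic model and does not represent reaction conditions.

```python
def rnase_h_trim(primer: str) -> str:
    """Remove the 5' RNA portion (lowercase) of a hybridized composite primer.

    This stands in for RNase H cleavage of RNA in the RNA/DNA heteroduplex,
    which frees part of the primer binding site for the next primer.
    """
    return primer.lstrip("acgu")


def displaced_products(n_rounds: int) -> int:
    """Count extension products displaced from one template strand.

    Each round: a composite primer hybridizes to the freed binding site and is
    extended; if an earlier extension product is still on the template, the new
    extension displaces it.
    """
    displaced = 0
    product_on_template = False
    for _ in range(n_rounds):
        if product_on_template:
            displaced += 1  # strand displacement releases the previous product
        product_on_template = True  # the newly extended primer now occupies the template
    return displaced


if __name__ == "__main__":
    print(rnase_h_trim("gacugaACGTAC"))  # DNA portion that remains hybridized
    for rounds in (1, 2, 5, 10):
        print(rounds, displaced_products(rounds))
```

Under these assumptions the number of released copies grows linearly with the number of rounds, because every round uses the original template strand rather than earlier products.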

Target polynucleotides

[0065] In various embodiments of the invention, nucleic acids are used as substrates for further manipulation. The input nucleic acid can be DNA, or complex DNA, for example genomic DNA. The input DNA may also be cDNA. The cDNA can be generated from RNA, e.g., mRNA. The input DNA can be of a specific species, for example, human, rat, mouse, other animals, plants, bacteria, algae, viruses, and the like. The input nucleic acid also can be from a mixture of genomes of different species such as host-pathogen, bacterial populations and the like. The input DNA can be cDNA made from a mixture of genomes of different species. Alternatively, the input nucleic acid can be from a synthetic source. The input DNA can be mitochondrial DNA. The input DNA can be cell-free DNA. The cell-free DNA can be obtained from, e.g., a serum or plasma sample. The input DNA can comprise one or more chromosomes. For example, if the input DNA is from a human, the DNA can comprise one or more of chromosome 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, or Y. The DNA can be from a linear or circular genome. The DNA can be plasmid DNA, cosmid DNA, bacterial artificial chromosome (BAC), or yeast artificial chromosome (YAC). The input DNA can be from more than one individual or organism. The input DNA can be double stranded or single stranded. The input DNA can be part of chromatin. The input DNA can be associated with histones. The methods described herein can be applied to high molecular weight DNA, such as is isolated from tissues or cell culture, for example, as well as highly degraded DNA, such as cell-free DNA from blood and urine and/or DNA extracted from formalin-fixed, paraffin-embedded tissues, for example.

The different samples from which the target polynucleotides are derived can comprise multiple samples from the same individual, samples from different individuals, or combinations thereof. In some embodiments, a sample comprises a plurality of polynucleotides from a single individual. In some embodiments, a sample comprises a plurality of polynucleotides from two or more individuals. An individual is any organism or portion thereof from which target polynucleotides can be derived, non-limiting examples of which include plants, animals, fungi, protists, monerans, viruses, mitochondria, and chloroplasts. Sample polynucleotides can be isolated from a subject, such as a cell sample, tissue sample, or organ sample derived therefrom, including, for example, cultured cell lines, biopsy, blood sample, or fluid sample containing a cell. The subject may be an animal, including but not limited to, an animal such as a cow, a pig, a mouse, a rat, a chicken, a cat, a dog, etc., and is usually a mammal, such as a human. Samples can also be artificially derived, such as by chemical synthesis. In some embodiments, the samples comprise DNA. In some embodiments, the samples comprise genomic DNA. In some embodiments, the samples comprise mitochondrial DNA, chloroplast DNA, plasmid DNA, bacterial artificial chromosomes, yeast artificial chromosomes, oligonucleotide tags, or combinations thereof. In some embodiments, the samples comprise DNA generated by primer extension reactions using any suitable combination of primers and a DNA polymerase, including but not limited to polymerase chain reaction (PCR), reverse transcription, and combinations thereof. Where the template for the primer extension reaction is RNA, the product of reverse transcription is referred to as complementary DNA (cDNA). Primers useful in primer extension reactions can comprise sequences specific to one or more targets, random sequences, partially random sequences, and combinations thereof. 
Reaction conditions suitable for primer extension reactions are known in the art. In general, sample polynucleotides comprise any polynucleotide present in a sample, which may or may not include target polynucleotides.

[0067] Methods for the extraction and purification of nucleic acids are well known in the art. For example, nucleic acids can be purified by organic extraction with phenol, phenol/chloroform/isoamyl alcohol, or similar formulations, including TRIzol and TRI Reagent. Other non-limiting examples of extraction techniques include: (1) organic extraction followed by ethanol precipitation, e.g., using a phenol/chloroform organic reagent (Ausubel et al., 1993), with or without the use of an automated nucleic acid extractor, e.g., the Model 341 DNA Extractor available from Applied Biosystems (Foster City, Calif.); (2) stationary phase adsorption methods (U.S. Pat. No. 5,234,809; Walsh et al., 1991); and (3) salt-induced nucleic acid precipitation methods (Miller et al., 1988), such precipitation methods being typically referred to as "salting-out" methods. Another example of nucleic acid isolation and/or purification includes the use of magnetic particles to which nucleic acids can specifically or non-specifically bind, followed by isolation of the beads using a magnet, and washing and eluting the nucleic acids from the beads (see e.g. U.S. Pat. No. 5,705,628). In some embodiments, the above isolation methods may be preceded by an enzyme digestion step to help eliminate unwanted protein from the sample, e.g., digestion with proteinase K, or other like proteases. See, e.g., U.S. Pat. No. 7,001,724. If desired, RNase inhibitors may be added to the lysis buffer. For certain cell or sample types, it may be desirable to add a protein denaturation/digestion step to the protocol. Purification methods may be directed to isolate DNA, RNA, or both. When both DNA and RNA are isolated together during or subsequent to an extraction procedure, further steps may be employed to purify one or both separately from the other. Sub-fractions of extracted nucleic acids can also be generated, for example, by purification by size, sequence, or other physical or chemical characteristic. In addition to an initial nucleic acid isolation step, purification of nucleic acids can be performed after any step in the methods of the invention, such as to remove excess or unwanted reagents, reactants, or products.

Fragmentation methods

[0068] In some embodiments, sample polynucleotides are fragmented into a population of fragmented insert DNA molecules of one or more specific size range(s). In some embodiments, fragments are generated from at least about 1, 10, 100, 1000, 10000, 100000, 300000, 500000, or more genome-equivalents of starting DNA. Fragmentation may be accomplished by methods known in the art, including chemical, enzymatic, and mechanical fragmentation. In some embodiments, the fragments have an average length from about 10 to about 10,000 nucleotides. In some embodiments, the fragments have an average length from about 50 to about 2,000 nucleotides. In some embodiments, the fragments have an average length from about 100-2,500, 10-1,000, 10-800, 10-500, 50-500, 50-250, or 50-150 nucleotides. In some embodiments, the fragments have an average length less than 500 nucleotides, such as less than 400 nucleotides, less than 300 nucleotides, less than 200 nucleotides, or less than 150 nucleotides. In some embodiments, the fragmentation is accomplished mechanically, comprising subjecting sample polynucleotides to acoustic sonication. In some embodiments, the fragmentation comprises treating the sample polynucleotides with one or more enzymes under conditions suitable for the one or more enzymes to generate double-stranded nucleic acid breaks. Examples of enzymes useful in the generation of polynucleotide fragments include sequence specific and non-sequence specific nucleases. Non-limiting examples of nucleases include DNase I, Fragmentase, restriction endonucleases, variants thereof, and combinations thereof. For example, digestion with DNase I can induce random double-stranded breaks in DNA in the absence of Mg++ and in the presence of Mn++. In some embodiments, fragmentation comprises treating the sample polynucleotides with one or more restriction endonucleases. Fragmentation can produce fragments having 5' overhangs, 3' overhangs, blunt ends, or a combination thereof. In some embodiments, such as when fragmentation comprises the use of one or more restriction endonucleases, cleavage of sample polynucleotides leaves overhangs having a predictable sequence. In some embodiments, the method includes the step of size selecting the fragments via standard methods such as column purification or isolation from an agarose gel.

Combinations of fragmentation methods can be utilized, such as a combination of enzymatic and chemical methods. In a particular example, an abasic site can be generated, e.g. using a glycosylase (Uracil-DNA glycosylase, Thymine-DNA glycosylase etc.), and the abasic site can be cleaved using a chemical method, such as by contacting the abasic site with dimethylethylenediamine (DMED).

In some embodiments, the 5' and/or 3' end nucleotide sequences of fragmented DNA are not modified prior to ligation with one or more adaptor oligonucleotides. For example, fragmentation by a restriction endonuclease can be used to leave a predictable overhang, followed by ligation with one or more adaptor oligonucleotides comprising an overhang complementary to the predictable overhang on a DNA fragment. In another example, cleavage by an enzyme that leaves a predictable blunt end can be followed by ligation of blunt-ended DNA fragments to adaptor oligonucleotides comprising a blunt end. In some embodiments, the fragmented DNA molecules are blunt-end polished (or "end repaired") to produce DNA fragments having blunt ends, prior to being joined to adaptors. The blunt-end polishing step may be accomplished by incubation with a suitable enzyme, such as a DNA polymerase that has both 3' to 5' exonuclease activity and 5' to 3' polymerase activity, for example T4 polymerase. In some embodiments, end repair is followed by an addition of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides, such as one or more adenine, one or more thymine, one or more guanine, or one or more cytosine, to produce an overhang. DNA fragments having an overhang can be joined to one or more adaptor oligonucleotides having a complementary overhang, such as in a ligation reaction. For example, a single adenine can be added to the 3' ends of end repaired DNA fragments using a template independent polymerase, followed by ligation to one or more adaptors each having a thymine at a 3' end. In some embodiments, adaptor oligonucleotides can be joined to blunt end double-stranded DNA fragment molecules which have been modified by extension of the 3' end with one or more nucleotides followed by 5' phosphorylation. 
In some cases, extension of the 3' end may be performed with a polymerase such as for example Klenow polymerase or any of the suitable polymerases provided herein, or by use of a terminal deoxynucleotide transferase, in the presence of one or more dNTPs in a suitable buffer containing magnesium. In some embodiments, target polynucleotides having blunt ends are joined to one or more adaptors comprising a blunt end. Phosphorylation of 5' ends of DNA fragment molecules may be performed for example with T4 polynucleotide kinase in a suitable buffer containing ATP and magnesium. The fragmented DNA molecules may optionally be treated to dephosphorylate 5' ends or 3' ends, for example, by using enzymes known in the art, such as phosphatases.
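The notion of a predictable overhang left by a restriction endonuclease can be illustrated with a short Python sketch. EcoRI is used here only as a familiar example (recognition site GAATTC, cut between G and A on each strand, leaving a 5'-AATT overhang); the application does not single out this enzyme. The function name and the example sequence are hypothetical, and only the top strand is modeled.

```python
def digest_top_strand(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut the top strand of a DNA sequence at every occurrence of a recognition site.

    seq        -- top strand, written 5'->3'
    site       -- recognition sequence (default: EcoRI, GAATTC)
    cut_offset -- number of site bases left on the upstream fragment
                  (1 for EcoRI, which cuts after the G)
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


if __name__ == "__main__":
    for fragment in digest_top_strand("CCGAATTCGG"):
        print(fragment)
```

In this example every downstream fragment begins with AATT on the top strand, which is the sequence an adaptor with a complementary overhang would be designed to match.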

[0070] In some embodiments, each of the plurality of independent samples comprises at least about 1pg, 10pg, 100pg, 1ng, 10ng, 20ng, 30ng, 40ng, 50ng, 75ng, 100ng, 150ng, 200ng, 250ng, 300ng, 400ng, 500ng, 1μg, 1.5μg, 2μg, or more of nucleic acid material. In some embodiments, each of the plurality of independent samples comprises less than about 1pg, 10pg, 100pg, 1ng, 10ng, 20ng, 30ng, 40ng, 50ng, 75ng, 100ng, 150ng, 200ng, 250ng, 300ng, 400ng, 500ng, 1μg, 1.5μg, 2μg, or more of nucleic acid.

[0071] In some embodiments, each of the individual or plurality of samples comprises a single polynucleotide target or a single genome.

[0072] In another aspect, the invention provides compositions that can be used in the above described methods. Compositions of the invention can comprise any one or more of the elements described herein. In one embodiment, the composition comprises a plurality of target polynucleotides, each target polynucleotide comprising one or more barcode sequences selected from a plurality of barcode sequences, wherein said target polynucleotides are from two or more different samples, and further wherein the sample from which each of said polynucleotides is derived can be identified in a combined sequencing reaction with an accuracy of at least 95% based on a single barcode contained in the sequence of said target polynucleotide. In some embodiments, the composition comprises a plurality of first adaptor/primer oligonucleotides, wherein each of said first adaptor/primer oligonucleotides comprises at least one of a plurality of barcode sequences, wherein each barcode sequence of the plurality of barcode sequences differs from every other barcode sequence in said plurality of barcode sequences at at least three nucleotide positions.

Features of stem-loop oligonucleotides

[0073] Adaptors/primers can contain one or more of a variety of sequence elements, including but not limited to, one or more amplification primer annealing sequences or complements thereof, one or more sequencing primer annealing sequences or complements thereof, one or more transcriptional promoter sequences (e.g. T7, T3, SP6, etc.) or complements thereof, one or more barcode sequences, one or more common sequences shared among multiple different adaptors or subsets of different adaptors (adaptor group sequence), one or more restriction enzyme recognition sites, one or more overhangs complementary to one or more target polynucleotide overhangs, one or more probe binding sites (e.g. for attachment to a sequencing platform, such as a flow cell for massive parallel sequencing, such as developed by Illumina, Inc.), one or more random or near-random sequences (e.g. one or more nucleotides selected at random from a set of two or more different nucleotides at one or more positions, with each of the different nucleotides selected at one or more positions represented in a pool of adaptors comprising the random sequence), and combinations thereof. Two or more sequence elements can be non-adjacent to one another (e.g. separated by one or more nucleotides), adjacent to one another, partially overlapping, or completely overlapping. For example, an amplification primer annealing sequence can also serve as a sequencing primer annealing sequence. Sequence elements can be located at or near the 3' end, at or near the 5' end, or in the interior of the adaptor oligonucleotide. In stem-loop oligonucleotides comprising stem, loop and/or overhang regions, sequence elements can be located partially or completely outside the secondary structure, partially or completely inside the secondary structure, or in between sequences participating in the secondary structure. For example, sequence elements can be located partially or completely inside or outside the hybridizable sequences (the "stem"), including in the sequence between the hybridizable sequences (the "loop"). In some embodiments, the first adaptor/primer oligonucleotides in a plurality of first adaptor/primer oligonucleotides having different barcode sequences comprise a sequence element common among all first adaptor oligonucleotides in the plurality. In some embodiments, all second adaptor/primer oligonucleotides comprise a sequence element common among all second adaptor/primer oligonucleotides that is different from the common sequence element shared by the first adaptor/primer oligonucleotides. A difference in sequence elements can be any such that at least a portion of different adaptors/primers do not completely align, for example, due to changes in sequence length, deletion or insertion of one or more nucleotides, or a change in the nucleotide composition at one or more nucleotide positions (such as a base change or base modification). In some embodiments, an adaptor/primer oligonucleotide comprises a 5' overhang, a 3' overhang, or both that is complementary to one or more target polynucleotides. 
Complementary overhangs can be one or more nucleotides in length, including but not limited to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more nucleotides in length. Complementary overhangs may comprise a fixed sequence.

Complementary overhangs may comprise a random sequence of one or more nucleotides, such that one or more nucleotides are selected at random from a set of two or more different nucleotides at one or more positions, with each of the different nucleotides selected at one or more positions represented in a pool of adaptors with complementary overhangs comprising the random sequence. In some embodiments, an adaptor/primer overhang is complementary to a target polynucleotide overhang produced by restriction endonuclease digestion. In some embodiments, an adaptor overhang consists of an adenine or a thymine.
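A pool of adaptors with random overhangs, in which every allowed nucleotide is represented at every random position, can be enumerated with a few lines of Python. This is only a sketch: the function name and its parameters are assumptions made for illustration, and the overhangs are treated as plain strings.

```python
from itertools import product


def overhang_pool(length: int, alphabet: str = "ACGT") -> list[str]:
    """List every overhang of the given length over the given nucleotide set.

    length   -- number of random positions in the overhang
    alphabet -- the set of two or more nucleotides allowed at each position
    """
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


if __name__ == "__main__":
    pool = overhang_pool(2)
    print(len(pool))  # pool size is len(alphabet) ** length
    print(pool[:4])
```

The pool size grows as the alphabet size raised to the overhang length, so longer random overhangs quickly dilute the fraction of adaptors matching any one target end.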

Adaptor/primer oligonucleotides can have any suitable length, at least sufficient to accommodate the one or more sequence elements of which they are comprised. In some embodiments, adaptor/primers are about, less than about, or more than about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, 200, or more nucleotides in length. In some embodiments, the stem of a stem-loop adaptor/primer is about, less than about, or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, or more nucleotides in length. Stems may be designed using a variety of different sequences that result in hybridization between the complementary regions on a stem-loop chimeric adaptor, resulting in a local region of double-stranded DNA. For example, stem sequences may be utilized that are from 15 to 18 nucleotides in length with equal representation of G:C and A:T base pairs. Such stem sequences are predicted to form stable dsDNA structures below their predicted melting temperatures of ~45°C. Sequences participating in the stem of the stem-loop adaptor can be perfectly complementary, such that each base of one region in the stem hybridizes via hydrogen bonding with each base in the other region in the stem according to Watson-Crick base-pairing rules. Alternatively, sequences in the stem may deviate from perfect complementarity. For example, there can be mismatches and/or bulges within the stem structure created by opposing bases that do not follow Watson-Crick base pairing rules, and/or one or more nucleotides in one region of the stem that do not have the one or more corresponding base positions in the other region participating in the stem. Mismatched sequences may be cleaved using enzymes that recognize mismatches. The stem and/or loop of a stem-loop can comprise DNA, RNA, or both DNA and RNA. In some embodiments, the stem and/or loop of a stem-loop, or one or both of the hybridizable sequences forming the stem of a stem-loop, comprise nucleotides, bonds, or sequences that are substrates for cleavage, such as by an enzyme, including but not limited to endonucleases and glycosylases. The composition of a stem may be such that only one of the hybridizable sequences forming the stem is cleaved. For example, one of the sequences forming the stem may comprise RNA while the other sequence forming the stem consists of DNA, such that cleavage by an enzyme that cleaves RNA in an RNA-DNA duplex, such as RNase H, cleaves only the sequence comprising RNA. The stem and/or loop of a stem-loop oligonucleotide can comprise non-canonical nucleotides (e.g. uracil), and/or methylated nucleotides. In some embodiments, the loop sequence of a hairpin adaptor is about, less than about, or more than about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or more nucleotides in length.
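A candidate pair of stem sequences can be screened against the properties named in this paragraph (length, balance of G:C and A:T pairs, and perfect or imperfect complementarity) with a short Python sketch. The helper names and the example arm are hypothetical. Uracil is treated as thymine for pairing purposes, bulges are not modeled, and no melting temperature is computed, so the sketch says nothing about the ~45°C figure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq: str) -> str:
    """Upper-case a sequence and treat RNA uracil as thymine for pairing."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    return normalize(seq).translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of positions that are G or C."""
    s = normalize(seq)
    return (s.count("G") + s.count("C")) / len(s)


def stem_mismatches(arm_5p: str, arm_3p: str) -> int:
    """Count positions at which the two stem arms (each 5'->3') fail to pair.

    Arms must be of equal length; bulges are outside this simple model.
    """
    if len(arm_5p) != len(arm_3p):
        raise ValueError("arms must have equal length in this sketch")
    expected = reverse_complement(arm_5p)
    return sum(a != b for a, b in zip(expected, normalize(arm_3p)))


if __name__ == "__main__":
    arm_5p = "ACGTGCATCAGTCGAT"          # hypothetical 16-nt stem arm
    arm_3p = reverse_complement(arm_5p)  # perfectly complementary partner
    print(len(arm_5p), gc_fraction(arm_5p), stem_mismatches(arm_5p, arm_3p))
```

An arm written with uracil in place of thymine scores the same as its DNA counterpart here, which matches the case where one stem strand comprises RNA and the other consists of DNA.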

Barcodes

[0075] As used herein, the term "barcode" refers to a known nucleic acid sequence that allows some feature of a polynucleotide with which the barcode is associated to be identified. In some embodiments, the feature of the polynucleotide to be identified is the sample from which the polynucleotide is derived. In some embodiments, barcodes are at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more nucleotides in length. In some embodiments, barcodes are shorter than 10, 9, 8, 7, 6, 5, or 4 nucleotides in length. In some embodiments, barcodes associated with some polynucleotides are of different length than barcodes associated with other polynucleotides. In general, barcodes are of sufficient length and comprise sequences that are sufficiently different to allow the identification of samples based on barcodes with which they are associated. In some embodiments, a barcode, and the sample source with which it is associated, can be identified accurately after the mutation, insertion, or deletion of one or more nucleotides in the barcode sequence, such as the mutation, insertion, or deletion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides. In some embodiments, each barcode in a plurality of barcodes differs from every other barcode in the plurality at at least three nucleotide positions, such as at least 3, 4, 5, 6, 7, 8, 9, 10, or more positions. In some embodiments, both a first adaptor/primer and a second adaptor/primer comprise at least one of a plurality of barcode sequences. In some embodiments, barcodes for second adaptor/primer oligonucleotides are selected independently from barcodes for first adaptor/primer oligonucleotides. In some embodiments, first adaptor/primer oligonucleotides and second adaptor/primer oligonucleotides having barcodes are paired, such that adaptor/primers of the pair comprise the same or different one or more barcodes. In some embodiments, the methods of the invention further comprise identifying the sample from which a target polynucleotide is derived based on a barcode sequence to which the target polynucleotide is joined. In general, a barcode comprises a nucleic acid sequence that when joined to a target polynucleotide serves as an identifier of the sample from which the target polynucleotide was derived.

In some embodiments, the plurality of barcode sequences from which barcode sequences are selected includes sequences selected from the group consisting of: AAA, TTT, CCC, GGG. In some embodiments, the plurality of barcode sequences from which barcode sequences are selected includes sequences selected from the group consisting of: AAAA, CTGC, GCTG, TGCT, ACCC, CGTA, GAGT, TTAG, AGGG, CCAT, GTCA, TATC, ATTT, CACG, GGAC, and TCGA. In some embodiments, the plurality of barcode sequences from which barcode sequences are selected includes sequences selected from the group consisting of: AAAAA, AACCC, AAGGG, AATTT, ACACG, ACCAT, ACGTA, ACTGC, AGAGT, AGCTG, AGGAC, AGTCA, ATATC, ATCGA, ATGCT, ATTAG, CAACT, CACAG, CAGTC, CATGA, CCAAC, CCCCA, CCGGT, CCTTG, CGATA, CGCGC, CGGCG, CGTAT, CTAGG, CTCTT, CTGAA, CTTCC, GAAGC, GACTA, GAGAT, GATCG, GCATT, GCCGG, GCGCC, GCTAA, GGAAG, GGCCT, GGGGA, GGTTC, GTACA, GTCAC, GTGTG, GTTTT, TAATG, TACGT, TAGCA, TATAC, TCAGA, TCCTC, TCGAG, TCTCT, TGACC, TGCAA, TGGTT, TGTGG, TTAAT, TTCCG, TTGGC, and TTTTA.
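The requirement that each barcode differ from every other barcode at at least three nucleotide positions can be checked, and used for error-tolerant sample assignment, with a short Python sketch. The function names and the one-mismatch assignment rule are assumptions made for illustration and are not a method prescribed by the application. Only substitutions are considered; insertions and deletions are not modeled. The four-nucleotide barcodes listed above serve as example data.

```python
from itertools import combinations
from typing import Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes: list[str]) -> int:
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def assign_barcode(observed: str, barcodes: list[str], max_mismatches: int = 1) -> Optional[str]:
    """Return the unique barcode within max_mismatches of the observed read, else None."""
    hits = [b for b in barcodes if hamming(observed, b) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


FOUR_MERS = [
    "AAAA", "CTGC", "GCTG", "TGCT", "ACCC", "CGTA", "GAGT", "TTAG",
    "AGGG", "CCAT", "GTCA", "TATC", "ATTT", "CACG", "GGAC", "TCGA",
]

if __name__ == "__main__":
    print(min_pairwise_distance(FOUR_MERS))
    print(assign_barcode("AAAC", FOUR_MERS))  # one substitution away from AAAA
```

When the minimum pairwise distance is three, a read carrying a single substitution remains closer to its original barcode than to any other, which is why the sketch accepts at most one mismatch by default.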

Attachment of Adaptors

Ligation

[0077] The terms "joining" and "ligation" as used herein, with respect to two polynucleotides, such as a stem-loop adaptor/primer oligonucleotide and a target polynucleotide, refer to the covalent attachment of two separate polynucleotides to produce a single larger polynucleotide with a contiguous backbone. Methods for joining two polynucleotides are known in the art, and include without limitation, enzymatic and non-enzymatic (e.g. chemical) methods. Examples of ligation reactions that are non-enzymatic include the non-enzymatic ligation techniques described in U.S. Pat. Nos. 5,780,613 and 5,476,930, which are herein incorporated by reference. In some embodiments, an adaptor oligonucleotide is joined to a target polynucleotide by a ligase, for example a DNA ligase or RNA ligase. Multiple ligases, each having characterized reaction conditions, are known in the art, and include, without limitation, NAD+-dependent ligases including tRNA ligase, Taq DNA ligase, Thermus filiformis DNA ligase, Escherichia coli DNA ligase, Tth DNA ligase, Thermus scotoductus DNA ligase (I and II), thermostable ligase, Ampligase thermostable DNA ligase, VanC-type ligase, 9°N DNA Ligase, Tsp DNA ligase, and novel ligases discovered by bioprospecting; ATP-dependent ligases including T4 RNA ligase, T4 DNA ligase, T3 DNA ligase, T7 DNA ligase, Pfu DNA ligase, DNA ligase 1, DNA ligase III, DNA ligase IV, and novel ligases discovered by bioprospecting; and wild-type, mutant isoforms, and genetically engineered variants thereof. Ligation can be between polynucleotides having hybridizable sequences, such as complementary overhangs. Ligation can also be between two blunt ends. Generally, a 5' phosphate is utilized in a ligation reaction. The 5' phosphate can be provided by the target polynucleotide, the adaptor oligonucleotide, or both. 5' phosphates can be added to or removed from polynucleotides to be joined, as needed. Methods for the addition or removal of 5' phosphates are known in the art, and include without limitation enzymatic and chemical processes. Enzymes useful in the addition and/or removal of 5' phosphates include kinases, phosphatases, and polymerases. In some embodiments, both of the two ends joined in a ligation reaction (e.g. an adaptor end and a target polynucleotide end) provide a 5' phosphate, such that two covalent linkages are made in joining the two ends. In some embodiments, only one of the two ends joined in a ligation reaction (e.g. only one of an adaptor end and a target polynucleotide end) provides a 5' phosphate, such that only one covalent linkage is made in joining the two ends. In some embodiments, only one strand at one or both ends of a target polynucleotide is joined to an adaptor oligonucleotide. In some embodiments, both strands at one or both ends of a target polynucleotide are joined to an adaptor oligonucleotide. In some embodiments, 3' phosphates are removed prior to ligation. In some embodiments, an adaptor oligonucleotide is added to both ends of a target polynucleotide, wherein one or both strands at each end are joined to one or more adaptor oligonucleotides. When both strands at both ends are joined to an adaptor oligonucleotide, joining can be followed by a cleavage reaction that leaves a 5' overhang that can serve as a template for the extension of the corresponding 3' end, which 3' end may or may not include one or more nucleotides derived from the adaptor oligonucleotide. In some embodiments, a target polynucleotide is joined to a first adaptor oligonucleotide on one end and a second adaptor oligonucleotide on the other end. In some embodiments, the target polynucleotide and the adaptor to which it is joined comprise blunt ends. In some embodiments, separate ligation reactions are carried out for each sample, using a different first adaptor oligonucleotide comprising at least one barcode sequence for each sample, such that no barcode sequence is joined to the target polynucleotides of more than one sample. A target polynucleotide that has an adaptor/primer oligonucleotide joined to it is considered "tagged" by the joined adaptor.
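By way of illustration only, the per-sample barcode condition described above (no barcode sequence joined to the target polynucleotides of more than one sample) can be checked before ligation with a short script. The following is a minimal sketch; the function name, the sample names, and the barcode sequences are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def shared_barcodes(sample_barcodes):
    """Return the barcodes that are assigned to more than one sample.

    sample_barcodes maps a sample name to the barcode sequences carried
    by the first adaptor oligonucleotide(s) used for that sample.
    """
    owners = defaultdict(set)
    for sample, barcodes in sample_barcodes.items():
        for barcode in barcodes:
            owners[barcode.upper()].add(sample)
    # Keep only barcodes seen in two or more samples.
    return {bc: sorted(names) for bc, names in owners.items() if len(names) > 1}


# Hypothetical barcode assignments, for illustration only.
design = {
    "sample_1": ["ACGTAC"],
    "sample_2": ["TGCATG", "GATTCA"],
    "sample_3": ["ACGTAC"],
}
conflicts = shared_barcodes(design)
```

An empty result indicates that the samples can be pooled after ligation and later separated by barcode; a non-empty result lists each conflicting barcode with the samples that share it.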

[0078] In some embodiments, joining of an adaptor/primer to a target polynucleotide produces a joined product polynucleotide having a 3' overhang comprising a nucleotide sequence derived from the adaptor/primer. In some embodiments, a primer oligonucleotide comprising a sequence complementary to all or a portion of the 3' overhang is hybridized to the overhang and extended using a DNA polymerase to produce a primer extension product hybridized to one strand of the joined product polynucleotide. The DNA polymerase may comprise strand displacement activity, such that one strand of the joined product polynucleotide is displaced during primer extension.

Pooling

[0079] In some embodiments, after joining at least one adaptor oligonucleotide to a target polynucleotide, the 3' end of one or more target polynucleotides is extended using the one or more joined adaptor oligonucleotides as template. For example, an adaptor comprising two hybridized oligonucleotides that is joined to only the 5' end of a target polynucleotide allows for the extension of the unjoined 3' end of the target using the joined strand of the adaptor as template, concurrently with or following displacement of the unjoined strand. If both strands of an adaptor comprising two hybridized oligonucleotides are joined to a target polynucleotide such that the joined product has a 5' overhang, the complementary 3' end can be extended using the 5' overhang as template. As a further example, a stem-loop adaptor oligonucleotide can be joined to the 5' end of a target polynucleotide. While double-stranded in secondary structure, such a stem-loop adaptor remains a single strand, and thus constitutes a 5' overhang appended to the target polynucleotide (e.g. when the 5' end of the stem-loop adaptor is not joined to the target polynucleotide). Removal of the secondary structure, either prior to (e.g. thermal denaturing, or degradation) or concurrently with (e.g. strand displacement) the activity of a polymerase, provides a template for the extension of the 3' end of the complementary strand of the target polynucleotide. In some embodiments, the 3' end of the target polynucleotide that is extended comprises one or more nucleotides from an adaptor oligonucleotide. For target polynucleotides to which adaptors are joined on both ends, extension can be carried out for both 3' ends of a double-stranded target polynucleotide having 5' overhangs. This 3' end extension, or "fill-in" reaction, generates a complementary sequence, or "complement," to the adaptor oligonucleotide template that is hybridized to the template, thus filling in the 5' overhang to produce a double-stranded sequence region. Where both ends of a double-stranded target polynucleotide have 5' overhangs that are filled in by extension of the complementary strands' 3' ends, the product is completely double-stranded. Extension can be carried out by any suitable polymerase known in the art, such as a DNA polymerase, many of which are commercially available. DNA polymerases can comprise DNA-dependent DNA polymerase activity, RNA-dependent DNA polymerase activity, or DNA-dependent and RNA-dependent DNA polymerase activity. DNA polymerases can be thermostable or non-thermostable. Examples of DNA polymerases include, but are not limited to, Taq polymerase, Tth polymerase, Tli polymerase, Pfu polymerase, PfuTurbo polymerase, Pyrobest polymerase, Pwo polymerase, KOD polymerase, Bst polymerase, Sac polymerase, Sso polymerase, Poc polymerase, Pab polymerase, Mth polymerase, Pho polymerase, ES4 polymerase, VENT polymerase, DEEPVENT polymerase, EX-Taq polymerase, LA-Taq polymerase, Expand polymerases, Platinum Taq polymerases, Hi-Fi polymerase, Tbr polymerase, Tfl polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tih polymerase, Tfi polymerase, Klenow fragment, and variants, modified products and derivatives thereof. 3' end extension can be performed before or after pooling of target polynucleotides from independent samples.
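The fill-in reaction described above can be illustrated at the sequence level. The following is a minimal sketch, assuming a DNA-only alphabet (A, C, G, T), both strands written 5' to 3', and a recessed strand that is perfectly hybridized to the 3' portion of the template strand; the function names and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def fill_in(template, recessed):
    """Extend the 3' end of `recessed` across the 5' overhang of `template`.

    `template` is the strand carrying the joined adaptor at its 5' end.
    `recessed` is the complementary strand, whose 3' end stops short of
    the template's 5' end. Both are written 5'->3'.
    """
    template = template.upper()
    recessed = recessed.upper()
    paired = len(recessed)
    if paired == 0 or paired > len(template):
        raise ValueError("recessed strand must pair with part of the template")
    if reverse_complement(template[-paired:]) != recessed:
        raise ValueError("recessed strand is not hybridized to the template 3' portion")
    overhang = template[:-paired]  # single-stranded 5' overhang
    # The polymerase copies the overhang, adding its complement to the 3' end.
    return recessed + reverse_complement(overhang)


# Hypothetical sequences, for illustration only.
adaptor = "GGATCC"
target = "ATGCCGTA"
joined_strand = adaptor + target
filled = fill_in(joined_strand, reverse_complement(target))
```

After the fill-in, the extended strand is the full reverse complement of the joined strand, so the product is double-stranded over the former overhang.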

In some embodiments, the fill-in reaction is followed by or performed as part of amplification of one or more target polynucleotides using a first primer and/or a second primer, wherein the first primer comprises a sequence that is hybridizable to at least a portion of the complement of one or more of a first adaptor/primer oligonucleotides, and further wherein the second primer comprises a sequence that is hybridizable to at least a portion of the complement of one or more of a second adaptor/primer oligonucleotides. Each of the first and second primers may be of any suitable length, such as about, less than about, or more than about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, or more nucleotides, any portion or all of which may be complementary to the corresponding target sequence (e.g. about, less than about, or more than about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or more nucleotides). "Amplification" refers to any process by which the copy number of a target sequence is increased. Methods for primer-directed amplification of target polynucleotides are known in the art, and include without limitation, methods based on the polymerase chain reaction (PCR). Conditions favorable to the amplification of target sequences by PCR are known in the art, can be optimized at a variety of steps in the process, and depend on characteristics of elements in the reaction, such as target type, target concentration, sequence length to be amplified, sequence of the target and/or one or more primers, primer length, primer concentration, polymerase used, reaction volume, ratio of one or more elements to one or more other elements, and others, some or all of which can be altered. In general, PCR involves the steps of denaturation of the target to be amplified (if double stranded), hybridization of one or more primers to the target, and extension of the primers by a DNA polymerase, with the steps repeated (or "cycled") in order to amplify the target sequence. 
Steps in this process can be optimized for various outcomes, such as to enhance yield, decrease the formation of spurious products, and/or increase or decrease specificity of primer annealing. Methods of optimization are well known in the art and include adjustments to the type or amount of elements in the amplification reaction and/or to the conditions of a given step in the process, such as temperature at a particular step, duration of a particular step, and/or number of cycles. In some embodiments, an amplification reaction comprises at least 5, 10, 15, 20, 25, 30, 35, 50, or more cycles. In some embodiments, an amplification reaction comprises no more than 5, 10, 15, 20, 25, 35, 50, or more cycles. Cycles can contain any number of steps, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more steps. Steps can comprise any temperature or gradient of temperatures, suitable for achieving the purpose of the given step, including but not limited to, 3' end extension (e.g. adaptor fill-in), primer annealing, primer extension, and strand denaturation. Steps can be of any duration, including but not limited to about, less than about, or more than about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 120, 180, 240, 300, 360, 420, 480, 540, 600, or more seconds, including indefinitely until manually interrupted. Cycles of any number comprising different steps can be combined in any order. In some embodiments, different cycles comprising different steps are combined such that the total number of cycles in the combination is about, less than about, or more than about 5, 10, 15, 20, 25, 30, 35, 50, or more cycles. In some embodiments, amplification is performed following the fill-in reaction. Amplification can be performed before or after pooling of target polynucleotides from independent samples.
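As an illustration of how cycles comprising different steps can be combined, the following minimal sketch represents an amplification program as an ordered list of stages and computes the total number of cycles and the total hold time. The class names, step names, temperatures, and durations are hypothetical values chosen only for the example, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temperature_c: float
    seconds: int


@dataclass(frozen=True)
class Stage:
    steps: tuple      # ordered Step objects making up one cycle
    cycles: int = 1   # number of times the steps are repeated


def total_cycles(program):
    """Total number of cycles across all stages of the program."""
    return sum(stage.cycles for stage in program)


def hold_time_seconds(program):
    """Sum of step durations over every cycle (ramp times ignored)."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


# Hypothetical program: one fill-in step followed by two cycling stages.
program = [
    Stage((Step("adaptor fill-in", 72.0, 180),)),
    Stage((Step("denaturation", 94.0, 30),
           Step("annealing", 55.0, 30),
           Step("extension", 72.0, 60)), cycles=5),
    Stage((Step("denaturation", 94.0, 30),
           Step("annealing", 63.0, 30),
           Step("extension", 72.0, 60)), cycles=15),
]
```

Keeping each stage as its own record makes it straightforward to vary a single temperature, duration, or cycle number during optimization without touching the remaining stages.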

Methods

Methods of Amplification

[0081] The methods, compositions and kits described herein can be useful to generate amplification-ready products directly from genomic DNA or whole or partial transcriptome RNA for downstream applications such as massively parallel sequencing (Next Generation Sequencing methods), multiplexed quantification of large sets of sequence regions of interest, such as by high density qPCR arrays and other highly parallel quantification platforms (selective massively parallel target pre-amplification), as well as generation of libraries with enriched population of sequence regions of interest. The methods described herein can be used to generate a collection of at least 25, 50, 75, 100, 500, 1000, 2500, 5000, 10,000, 25,000, 50,000, 100,000, 500,000, or 1,000,000 amplification-ready target sequence regions of interest directly from a sample of complex DNA using a plurality of oligonucleotides.

[0082] Methods of nucleic acid amplification are well known in the art. In some embodiments, the amplification method is isothermal. In other embodiments the amplification method is linear. In other embodiments the amplification is exponential.

[0083] SPIA Amplification

[0084] The sequence regions of interest can be amplified by a linear amplification method such as single primer isothermal amplification (SPIA). SPIA enables generation of multiple copies of the strand-specific sequence regions of interest and employs a single amplification primer, thus reducing the complexity associated with multiple oligonucleotide design and manufacturing; it also enables the use of a generic amplification primer, and can be linear. The fidelity of quantification of the copy number of the sequence regions of interest in the complex genomic DNA sample is a highly desirable feature of the presented methods of the invention.

[0085] Amplification by SPIA can occur under conditions permitting composite primer hybridization, primer extension by a DNA polymerase with strand displacement activity, cleavage of RNA from a RNA/DNA heteroduplex and strand displacement. Insofar as the composite amplification primer hybridizes to the 3'-single-stranded portion (of the partially double stranded polynucleotide which is formed by cleaving RNA in the complex comprising a RNA/DNA partial heteroduplex) comprising, generally, the complement of at least a portion of the composite amplification primer sequence, composite primer hybridization may be under conditions permitting specific hybridization. In SPIA, all steps are isothermal (in the sense that thermal cycling is not required), although the temperatures for each of the steps may or may not be the same. It is understood that various other embodiments can be practiced given the general description provided above. For example, as described and exemplified herein, certain steps may be performed as temperature is changed (e.g., raised, or lowered).

[0086] Although generally only one composite amplification primer is described above, it is further understood that the SPIA amplification methods can be performed in the presence of two or more different first and/or second composite primers that randomly prime template polynucleotide. In addition, the amplification polynucleotide products of two or more separate amplification reactions conducted using two or more different first and/or second composite primers that randomly prime template polynucleotide can be combined.

[0087] The composite amplification primers are primers that are composed of RNA and DNA portions. In the amplification composite primer, both the RNA and the DNA portions are generally complementary and can hybridize to a sequence in the amplification-ready product to be copied or amplified. In some embodiments, a 3'-portion of the amplification composite primer is DNA and a 5'-portion of the composite amplification primer is RNA. The composite amplification primer is designed such that the primer is extended from the 3'-DNA portion to create a primer extension product. The 5'-RNA portion of this primer extension product in a RNA/DNA heteroduplex is susceptible to cleavage by RNase H, thus freeing a portion of the polynucleotide for the hybridization of an additional composite amplification primer. The extension of the amplification composite primer by a DNA polymerase with strand displacement activity releases the primer extension product from the original primer and creates another copy of the sequence of the polynucleotide. Repeated rounds of primer hybridization, primer extension with strand displacement DNA synthesis, and RNA cleavage create multiple copies of the strand-specific sequence of the polynucleotide.

[0088] In some embodiments, the composite amplification primer is generated in the amplification reaction mixture from a stem-loop chimeric pro-primer. The amplification reaction mixture can comprise a target partial duplex nucleic acid, for example a target partial duplex DNA, a chimeric stem-loop pro-primer, DNA polymerase with strand displacement activity, and an RNase targeting RNA in a RNA/DNA heteroduplex, for example RNase H. The RNA portion of the RNA/DNA heteroduplex at the stem of the chimeric stem-loop pro-primer can be cleaved by RNase H to generate, for example, a linear composite primer comprising a 3'-DNA and 5'-RNA. The linearized amplification primer can hybridize to a 3'-single stranded DNA portion (overhang) of a target partial duplex and can be extended by the DNA polymerase with strand displacement activity. The RNA portion of the hybridized primer in a heteroduplex can be cleaved by RNase H to free a portion of the primer binding site. A second linear composite amplification primer can hybridize to the freed primer binding site, and can be extended along the target DNA strand. The previously synthesized primer extension product (amplification product) can be displaced by the newly extended primer. Repeated cycles of primer hybridization, primer extension by strand displacement DNA polymerase, and cleavage of the RNA portion of the hybridized primer can generate multiple copies of a target nucleic acid (Figure 5).
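The repeated cycle of primer hybridization, extension with strand displacement, and RNase H cleavage can be illustrated with a simple counting model. The following is a minimal sketch under idealized assumptions that are not part of the disclosure: every template binds exactly one new composite primer per round, every extension runs to completion, and the template strand itself is never copied from the products. The function name is hypothetical.

```python
def spia_product_counts(templates, rounds):
    """Cumulative number of primer extension products after each round.

    templates: number of target strands carrying the primer binding site.
    rounds:    number of hybridization/extension/cleavage rounds.
    """
    hybridized = 0  # products still annealed to their template
    displaced = 0   # products released by strand displacement
    totals = []
    for _ in range(rounds):
        # RNase H has removed the RNA portion of the previous primer, so a
        # new composite primer binds; its extension displaces the product
        # made in the previous round.
        displaced += hybridized
        hybridized = templates
        totals.append(displaced + hybridized)
    return totals


counts = spia_product_counts(templates=10, rounds=4)
```

Because only the original target strand serves as template, the product count in this model grows by the same amount in every round, which is the linear behavior attributed to SPIA above.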

[0089] Other Amplification Methods

[0090] Some aspects of the invention comprise the amplification of polynucleotide molecules or sequences within the polynucleotide molecules. Amplification generally refers to a method that can result in the formation of one or more copies of a nucleic acid or polynucleotide molecule or in the formation of one or more copies of the complement of a nucleic acid or polynucleotide molecule. Amplifications can be used in the invention, for example, to amplify or analyze a polynucleotide bound to a solid surface. The amplifications can be performed, for example, after archiving the samples in order to analyze the archived polynucleotide.

[0091] In some aspects of the invention, exponential amplification of nucleic acids or polynucleotides is used. These methods often depend on the product-catalyzed formation of multiple copies of a nucleic acid or polynucleotide molecule or its complement. The amplification products are sometimes referred to as "amplicons." One such method for the enzymatic amplification of specific double stranded sequences of DNA is polymerase chain reaction (PCR). This in vitro amplification procedure is based on repeated cycles of denaturation, oligonucleotide primer annealing, and primer extension by thermophilic template dependent polynucleotide polymerase, resulting in the exponential increase in copies of the desired sequence of the polynucleotide analyte flanked by the primers. The two different PCR primers, which anneal to opposite strands of the DNA, are positioned so that the polymerase catalyzed extension product of one primer can serve as a template strand for the other, leading to the accumulation of a discrete double stranded fragment whose length is defined by the distance between the 5' ends of the oligonucleotide primers. Other amplification techniques that can be used in the methods of the provided invention include, e.g., AFLP (amplified fragment length polymorphism) PCR (see e.g., Vos et al. 1995. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Research 23: 4407-14), allele-specific PCR (see e.g., Saiki R K, Bugawan T L, Horn G T, Mullis K B, Erlich H A (1986). Analysis of enzymatically amplified beta-globin and HLA-DQ alpha DNA with allele-specific oligonucleotide probes. Nature 324: 163-166), Alu PCR, assembly PCR (see e.g., Stemmer W P, Crameri A, Ha K D, Brennan T M, Heyneker H L (1995). Single-step assembly of a gene and entire plasmid from large numbers of oligodeoxyribonucleotides. Gene 164: 49-53), asymmetric PCR (see e.g., Saiki R K supra), colony PCR, helicase dependent PCR (see e.g., Myriam Vincent, Yan Xu and Huimin Kong (2004). Helicase-dependent isothermal DNA amplification. EMBO reports 5 (8): 795-800), hot start PCR, inverse PCR (see e.g., Ochman H, Gerber A S, Hartl D L. Genetics. 1988 November; 120(3):621-3), in situ PCR, intersequence-specific PCR or ISSR PCR, digital PCR, linear-after-the-exponential-PCR or Late PCR (see e.g., Pierce K E and Wangh L T (2007). Linear-after-the-exponential polymerase chain reaction and allied technologies: Real-time detection strategies for rapid, reliable diagnosis from single cells. Methods Mol. Med. 132: 65-85), long PCR, nested PCR, real-time PCR, duplex PCR, multiplex PCR, quantitative PCR, or single cell PCR.

[0092] Another method for amplification involves amplification of a single stranded polynucleotide using a single oligonucleotide primer. The single stranded polynucleotide that is to be amplified contains two non-contiguous sequences that are substantially or completely complementary to one another and, thus, are capable of hybridizing together to form a stem-loop structure. This single stranded polynucleotide already may be part of a polynucleotide analyte or may be created as the result of the presence of a polynucleotide analyte.
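Whether a single stranded polynucleotide contains two non-contiguous, mutually complementary sequences capable of forming a stem-loop can be checked computationally. The following is a minimal sketch, assuming a DNA-only alphabet, perfect complementarity within the stem, and hypothetical minimum stem and loop lengths; it ignores thermodynamic stability, which a real design would also have to consider.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_stem(seq, min_stem=4, min_loop=3):
    """Find the longest pair of arms able to form a perfect stem.

    Returns (i, j, k), meaning seq[i:i+k] pairs with seq[j:j+k] and at
    least `min_loop` unpaired bases separate the two arms, or None when
    no stem of at least `min_stem` base pairs exists.
    """
    seq = seq.upper()
    n = len(seq)
    # Try the longest possible stem first.
    for k in range((n - min_loop) // 2, min_stem - 1, -1):
        for i in range(0, n - 2 * k - min_loop + 1):
            partner = reverse_complement(seq[i:i + k])
            j = seq.find(partner, i + k + min_loop)
            if j != -1:
                return i, j, k
    return None


# Hypothetical sequence: 6-nt arms separated by a 4-nt loop.
hairpin = "GCGCAT" + "TTTT" + "ATGCGC"
stem = find_stem(hairpin)
```

The search reports only the first longest stem found; sequences with several competing stems would need a more complete secondary-structure analysis.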

[0093] Another method for achieving the result of an amplification of nucleic acids is known as the ligase chain reaction (LCR). This method uses a ligase enzyme to join pairs of preformed nucleic acid probes. The probes hybridize with each complementary strand of the nucleic acid analyte, if present, and ligase is employed to bind each pair of probes together resulting in two templates that can serve in the next cycle to reiterate the particular nucleic acid sequence.

[0094] Another method for achieving nucleic acid amplification is the nucleic acid sequence based amplification (NASBA). This method is a promoter-directed, enzymatic process that induces in vitro continuous, homogeneous and isothermal amplification of a specific nucleic acid to provide RNA copies of the nucleic acid. The reagents for conducting NASBA include a first DNA primer with a 5'-tail comprising a promoter, a second DNA primer, reverse transcriptase, RNase H, T7 RNA polymerase, NTPs and dNTPs.

[0095] Another method for amplifying a specific group of nucleic acids is the Q-beta-replicase method, which relies on the ability of Q-beta-replicase to amplify its RNA substrate exponentially. The reagents for conducting such an amplification include "midi-variant RNA" (amplifiable hybridization probe), NTP's, and Q-beta-replicase.

[0096] Another method for amplifying nucleic acids is known as 3SR and is similar to NASBA except that the RNase H activity is present in the reverse transcriptase. Amplification by 3SR is an RNA specific target method whereby RNA is amplified in an isothermal process combining promoter directed RNA polymerase, reverse transcriptase and RNase H with target RNA. See for example Fahy et al. PCR Methods Appl. 1:25-33 (1991).

[0097] Another method for amplifying nucleic acids is the Transcription Mediated Amplification (TMA) used by Gen-Probe. The method is similar to NASBA in utilizing two enzymes in a self-sustained sequence replication. See U.S. Pat. No. 5,299,491 herein incorporated by reference.

[0098] Another method for amplification of nucleic acids is Strand Displacement Amplification (SDA) (Westin et al 2000, Nature Biotechnology, 18, 199-202; Walker et al 1992, Nucleic Acids Research, 20, 7, 1691-1696), which is an isothermal amplification technique based upon the ability of a restriction endonuclease such as HincII or BsoBI to nick the unmodified strand of a hemiphosphorothioate form of its recognition site, and the ability of an exonuclease deficient DNA polymerase such as Klenow exo minus polymerase, or Bst polymerase, to extend the 3'-end at the nick and displace the downstream DNA strand. Exponential amplification results from coupling sense and antisense reactions in which strands displaced from a sense reaction serve as targets for an antisense reaction and vice versa.

[0099] Another method for amplification of nucleic acids is Rolling Circle Amplification (RCA) (Lizardi et al. 1998, Nature Genetics, 19:225-232). RCA can be used to amplify single stranded molecules in the form of circles of nucleic acids. In its simplest form, RCA involves the hybridization of a single primer to a circular nucleic acid. Extension of the primer by a DNA polymerase with strand displacement activity results in the production of multiple copies of the circular nucleic acid concatenated into a single DNA strand.
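The concatenated product of rolling circle amplification can be illustrated at the sequence level. The following is a minimal sketch, assuming a DNA-only alphabet and a circular template written 5' to 3'; the function names and the example sequence are hypothetical. The polymerase is modeled as copying the circle one base at a time, wrapping around indefinitely, so the product consists of tandem repeats complementary to the circle.

```python
PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def rolling_circle(circle, origin, length):
    """Return `length` nucleotides of rolling circle product (5'->3').

    circle: circular template written 5'->3' (last base joins the first).
    origin: index of the template base copied first.
    The template is read 3'->5', so the index decreases and wraps around.
    """
    circle = circle.upper()
    n = len(circle)
    return "".join(PAIR[circle[(origin - m) % n]] for m in range(length))


# Hypothetical circular template, for illustration only.
circle = "ATGCCGTAAC"
three_turns = rolling_circle(circle, len(circle) - 1, 3 * len(circle))
```

Starting at the last base of the written circle yields exact tandem copies of its reverse complement; starting elsewhere yields the same repeats in a shifted register.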

[00100] In some embodiments of the invention, RCA is coupled with ligation. For example, a single oligonucleotide can be used both for ligation and as the circular template for RCA. This type of polynucleotide can be referred to as a "padlock probe" or a "RCA probe." For a padlock probe, both termini of the oligonucleotide contain sequences complementary to a domain within a nucleic acid sequence of interest. The first end of the padlock probe is substantially complementary to a first domain on the nucleic acid sequence of interest, and the second end of the padlock probe is substantially complementary to a second domain, adjacent to the first domain. Hybridization of the oligonucleotide to the target nucleic acid results in the formation of a hybridization complex. Ligation of the ends of the padlock probe results in the formation of a modified hybridization complex containing a circular polynucleotide. In some cases, prior to ligation, a polymerase can fill in the gap by extending one end of the padlock probe. The circular polynucleotide thus formed can serve as a template for RCA that, with the addition of a polymerase, results in the formation of an amplified product nucleic acid. The methods of the invention described herein can produce amplified products with defined sequences on both the 5'- and 3'-ends. Such amplified products can be used as padlock probes.
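The arrangement of the two padlock probe arms relative to the adjacent target domains can be illustrated as follows. This is a minimal sketch, assuming a DNA-only alphabet, perfectly complementary arms of equal length, and no gap between the two domains; the function name, the linker, and the target sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_padlock(target, junction, arm_length, linker):
    """Build a padlock probe whose ends abut at `junction` on the target.

    target:   sequence of interest, written 5'->3'.
    junction: index separating the first domain (upstream of the
              junction) from the second domain (downstream of it).
    """
    if junction - arm_length < 0 or junction + arm_length > len(target):
        raise ValueError("arms do not fit on the target")
    first_domain = target[junction - arm_length:junction]
    second_domain = target[junction:junction + arm_length]
    five_prime_arm = reverse_complement(first_domain)
    three_prime_arm = reverse_complement(second_domain)
    return {
        "five_prime_arm": five_prime_arm,
        "three_prime_arm": three_prime_arm,
        "probe": five_prime_arm + linker.upper() + three_prime_arm,
    }


# Hypothetical target and linker, for illustration only.
target = "TTGACCATGCCGTAAGCTTGGA"
padlock = design_padlock(target, junction=11, arm_length=8, linker="ACACACAC")
```

When the probe is hybridized, its 3' end and 5' end meet at the junction, so ligation (which, as noted earlier, generally uses a 5' phosphate) closes the probe into a circle.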

[00101] Some aspects of the invention utilize the linear amplification of nucleic acids or polynucleotides. Linear amplification generally refers to a method that involves the formation of one or more copies of the complement of only one strand of a nucleic acid or polynucleotide molecule, usually a nucleic acid or polynucleotide analyte. Thus, the primary difference between linear amplification and exponential amplification is that in the latter process, the product serves as substrate for the formation of more product, whereas in the former process the starting sequence is the substrate for the formation of product but the product of the reaction, i.e. the replication of the starting template, is not a substrate for generation of products. In linear amplification the amount of product formed increases as a linear function of time as opposed to exponential amplification where the amount of product formed is an exponential function of time.

[00102] In some embodiments, amplification methods can be solid-phase amplification, polony amplification, colony amplification, emulsion PCR, bead RCA, surface RCA, surface SDA, etc., as will be recognized by one of skill in the art. In some embodiments, amplification methods that result in amplification of free DNA molecules in solution or tethered to a suitable matrix by only one end of the DNA molecule can be used. Methods that rely on bridge PCR, where both PCR primers are attached to a surface (see, e.g., WO 2000/018957 and Adessi et al, Nucleic Acids Research (2000): 28(20): E87) can be used. In some cases the methods of the invention can create a "polymerase colony technology," or "polony," referring to a multiplex amplification that maintains spatial clustering of identical amplicons (see Harvard Molecular Technology Group and Lipper Center for Computational Genetics website). These include, for example, in situ polonies (Mitra and Church, Nucleic Acid Research 27, e34, Dec. 15, 1999), in situ rolling circle amplification (RCA) (Lizardi et al, Nature Genetics 19, 225, July 1998), bridge PCR (U.S. Pat. No. 5,641,658), picotiter PCR (Leamon et al, Electrophoresis 24, 3769, November 2003), and emulsion PCR (Dressman et al, PNAS 100, 8817, Jul. 22, 2003). The methods of the invention provide new methods for generating and using polonies.

Other Downstream Applications

[00103] An important aspect of the invention is that the methods and compositions disclosed herein can be utilized for downstream analyses efficiently, cost-effectively, and with minimal loss of biological material.

[00104] In particular embodiments, the oligonucleotide-attached nucleic acid molecule is further modified, such as by cloning, including by incorporation of the modified molecule into a vector, said incorporation occurring at ends in the modified molecule generated by endonuclease cleavage within the inverted repeat.

[00105] In specific embodiments, the oligonucleotide-attached nucleic acid molecule is immobilized to a solid support, such as non-covalently or covalently.

[00106] The libraries of amplified, non-amplified, non-targeted nucleic acids, as well as enriched copies of selected sequence regions of interest are useful for massively parallel sequencing (Next Generation Sequencing methods), multiplexed quantification of large sets of sequence regions of interest, such as by high density qPCR arrays and other highly parallel quantification platforms (selective massively parallel target pre-amplification), as well as generation of libraries with enriched population of sequence regions of interest.

Sequencing

[00107] In one embodiment, the invention provides for products ready for amplification in preparation for sequencing. In some embodiments, the target polynucleotides are pooled followed by sequencing one or more polynucleotides in the pool. Sequencing methods utilizing adaptor incorporated sequences are well known in the art and are further described, for example, in U.S. Pat. Nos. 8,053,192 and 8,017,335.

[00108] Sequencing processes are generally template dependent. Nucleic acid sequence analysis that employs template dependent synthesis identifies individual bases, or groups of bases as they are added during a template mediated synthesis reaction, such as a primer extension reaction, where the identity of the base is complementary to the template sequence to which the primer sequence is hybridized during synthesis. Other such processes include ligation driven processes, where oligonucleotides or polynucleotides are complexed with an underlying template sequence, in order to identify the sequence of nucleotides in that sequence. Typically, such processes are enzymatically mediated using nucleic acid polymerases, such as DNA polymerases, RNA polymerases, reverse transcriptases, and the like, or other enzymes such as in the case of ligation driven processes, e.g., ligases.

[00109] Sequence analysis using template dependent synthesis can include a number of different processes. For example, in the ubiquitously practiced four-color Sanger sequencing methods, a population of template molecules is used to create a population of complementary fragment sequences. Primer extension is carried out in the presence of the four naturally occurring nucleotides, and with a sub-population of dye labeled terminator nucleotides, e.g., dideoxyribonucleotides, where each type of terminator (ddATP, ddGTP, ddTTP, ddCTP) includes a different detectable label. As a result, a nested set of fragments is created where the fragments terminate at each nucleotide in the sequence beyond the primer, and are labeled in a manner that permits identification of the terminating nucleotide. The nested fragment population is then subjected to size based separation, e.g., using capillary electrophoresis, and the label associated with each different sized fragment is detected to identify the terminating nucleotide. As a result, the sequence of labels moving past a detector in the separation system provides a direct readout of the sequence information of the synthesized fragments, and by complementarity, the underlying template (See, e.g., U.S. Pat. No. 5,171,534, incorporated herein by reference in its entirety for all purposes).
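The way a nested set of terminated fragments yields a sequence readout can be illustrated with a short model. The following is a minimal sketch, assuming a DNA-only alphabet, exactly one labeled fragment per position beyond the primer, and ideal size separation; the function names and the template sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminated_fragments(extension):
    """One (length, label) pair per position of the extension product.

    The label stands for the dye on the terminator that ended the
    fragment, i.e. the identity of its 3'-terminal nucleotide.
    """
    return [(i + 1, base) for i, base in enumerate(extension.upper())]


def read_by_size(fragments):
    """Order fragments shortest first and read their labels in turn."""
    return "".join(label for _, label in sorted(fragments))


# Hypothetical template region beyond the primer, for illustration only.
template = "TTGACCATGC"
extension = reverse_complement(template)
# Present the fragments in reverse order to show that sorting by size,
# as in electrophoresis, restores the synthesized sequence.
pool = list(reversed(terminated_fragments(extension)))
readout = read_by_size(pool)
```

The readout is the sequence of the synthesized strand; the underlying template follows by complementarity.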

[00110] Other examples of template dependent sequencing methods include sequence by synthesis processes, where individual nucleotides are identified iteratively, as they are added to the growing primer extension product.

[00111] Pyrosequencing is an example of a sequence by synthesis process that identifies the incorporation of a nucleotide by assaying the resulting synthesis mixture for the presence of by-products of the sequencing reaction, namely pyrophosphate. In particular, a primer/template/polymerase complex is contacted with a single type of nucleotide. If that nucleotide is incorporated, the polymerization reaction cleaves the nucleoside triphosphate between the α and β phosphates of the triphosphate chain, releasing pyrophosphate. The presence of released pyrophosphate is then identified using a chemiluminescent enzyme reporter system that converts the pyrophosphate, with AMP, into ATP, then measures ATP using a luciferase enzyme to produce measurable light signals. Where light is detected, the base is incorporated; where no light is detected, the base is not incorporated. Following appropriate washing steps, the various bases are cyclically contacted with the complex to sequentially identify subsequent bases in the template sequence. (See, e.g., U.S. Pat. No. 6,210,891, incorporated herein by reference in its entirety for all purposes.)
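The cyclic contacting of the complex with one nucleotide type at a time can be illustrated with a simple model. The following is a minimal sketch, assuming a DNA-only alphabet, a hypothetical fixed flow order, and an idealized light signal equal to the number of nucleotides incorporated in a flow (so that a run of identical bases gives a proportionally larger signal); the function names are hypothetical.

```python
def pyrogram(extension, flow_order="TACG", max_flows=400):
    """Simulate nucleotide flows over the strand being synthesized.

    extension: sequence that the polymerase will synthesize (5'->3').
    Returns a list of (nucleotide, signal) pairs, one per flow, where
    signal is the number of nucleotides incorporated in that flow.
    """
    extension = extension.upper()
    position = 0
    flows = []
    for i in range(max_flows):
        if position >= len(extension):
            break
        base = flow_order[i % len(flow_order)]
        incorporated = 0
        # The flowed nucleotide is incorporated for as long as it
        # matches the next position to be synthesized.
        while position < len(extension) and extension[position] == base:
            incorporated += 1
            position += 1
        flows.append((base, incorporated))
    return flows


def call_bases(flows):
    """Rebuild the synthesized sequence from the flow signals."""
    return "".join(base * signal for base, signal in flows)


flows = pyrogram("TTAGC")
```

Flows with a signal of zero correspond to the case in which no light is detected and the base is not incorporated.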

[00112] In related processes, the primer/template/polymerase complex is immobilized upon a substrate and the complex is contacted with labeled nucleotides. The immobilization of the complex may be through the primer sequence, the template sequence and/or the polymerase enzyme, and may be covalent or noncovalent. For example, immobilization of the complex can be via a linkage between the polymerase or the primer and the substrate surface. A variety of types of linkages are useful for this attachment, including, e.g., provision of biotinylated surface components, using e.g., biotin-PEG-silane linkage chemistries, followed by biotinylation of the molecule to be immobilized, and subsequent linkage through, e.g., a streptavidin bridge. Other synthetic coupling chemistries, as well as non-specific protein adsorption can also be employed for immobilization. In alternate configurations, the nucleotides are provided with and without removable terminator groups. Upon incorporation, the label is coupled with the complex and is thus detectable. In the case of terminator bearing nucleotides, all four different nucleotides, bearing individually identifiable labels, are contacted with the complex. Incorporation of the labeled nucleotide arrests extension, by virtue of the presence of the terminator, and adds the label to the complex. The label and terminator are then removed from the incorporated nucleotide, and following appropriate washing steps, the process is repeated. In the case of non-terminated nucleotides, a single type of labeled nucleotide is added to the complex to determine whether it will be incorporated, as with pyrosequencing. Following removal of the label group on the nucleotide and appropriate washing steps, the various different nucleotides are cycled through the reaction mixture in the same process. (See, e.g., U.S. Pat. No. 6,833,246, incorporated herein by reference in its entirety for all purposes.)
For example, the Illumina Genome Analyzer System is based on technology described in WO 98/44151, hereby incorporated by reference, wherein DNA molecules are bound to a sequencing platform (flow cell) via an anchor probe binding site (otherwise referred to as a flow cell binding site) and amplified in situ on a glass slide. The DNA molecules are then annealed to a sequencing primer and sequenced in parallel base-by-base using a reversible terminator approach. Typically, the Illumina Genome Analyzer System utilizes flow-cells with 8 channels, generating sequencing reads of 18 to 36 bases in length, generating >1.3 Gbp of high quality data per run (see www.illumina.com).

[00113] In yet a further sequence by synthesis process, the incorporation of differently labeled nucleotides is observed in real time as template dependent synthesis is carried out. In particular, an individual immobilized primer/template/polymerase complex is observed as fluorescently labeled nucleotides are incorporated, permitting real time identification of each added base as it is added. In this process, label groups are attached to a portion of the nucleotide that is cleaved during incorporation. For example, by attaching the label group to a portion of the phosphate chain removed during incorporation, i.e., a β, γ, or other terminal phosphate group on a nucleoside polyphosphate, the label is not incorporated into the nascent strand, and instead, natural DNA is produced. Observation of individual molecules typically involves the optical confinement of the complex within a very small illumination volume. By optically confining the complex, one creates a monitored region in which randomly diffusing nucleotides are present for a very short period of time, while incorporated nucleotides are retained within the observation volume for longer as they are being incorporated. This results in a characteristic signal associated with the incorporation event, which is also characterized by a signal profile that is characteristic of the base being added. In related aspects, interacting label components, such as fluorescence resonance energy transfer (FRET) dye pairs, are provided upon the polymerase or other portion of the complex and the incorporating nucleotide, such that the incorporation event puts the labeling components in interactive proximity, and a characteristic signal results that is, again, also characteristic of the base being incorporated (See, e.g., U.S. Pat. Nos. 6,056,661, 6,917,726, 7,033,764, 7,052,847, 7,056,676, 7,170,050, 7,361,466, 7,416,844 and Published U.S. Patent Application No. 2007-0134128, the full disclosures of which are hereby incorporated herein by reference in their entirety for all purposes).

[00114] In some embodiments, the nucleic acids in the sample can be sequenced by ligation. This method uses a DNA ligase enzyme to identify the target sequence, for example, as used in the polony method and in the SOLiD technology (Applied Biosystems, now Invitrogen). In general, a pool of all possible oligonucleotides of a fixed length is provided, labeled according to the sequenced position. Oligonucleotides are annealed and ligated; the preferential ligation by DNA ligase for matching sequences results in a signal corresponding to the complementary sequence at that position.

[00115] In some embodiments, sequencing comprises extension of a sequencing primer comprising a sequence hybridizable to at least a portion of the complement of the first adaptor oligonucleotide. In some embodiments, sequencing comprises extension of a sequencing primer comprising a sequence hybridizable to at least a portion of the complement of the second adaptor oligonucleotide. A sequencing primer may be of any suitable length, such as about, less than about, or more than about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, or more nucleotides, any portion or all of which may be complementary to the corresponding target sequence (e.g. about, less than about, or more than about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or more nucleotides). In some embodiments, sequencing comprises a calibration step, wherein the calibration is based on each of the nucleotides at one or more nucleotide positions in the barcode sequences. Calibration can be useful in processing the sequencing data, for example, by facilitating or increasing the accuracy of identifying a base at a given position in the sequence.

[00116] In some embodiments, accurate identification of the sample from which a target polynucleotide is derived is based on at least a portion of the sequence obtained for the target polynucleotide and is at least 90%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.8%, 99.85%, 99.9%, 99.95%, 99.99%, or more accurate. In some embodiments, the sample source of a target polynucleotide is identified based on a single barcode contained in the sequence. In some embodiments, accuracy can be increased by identifying the source of a target polynucleotide using two or more barcodes contained in the sequence. Multiple barcodes can be joined to a target polynucleotide by the incorporation of multiple barcodes into a single adaptor/primer to which a target polynucleotide is joined, and/or by joining two or more adaptors/primers having one or more barcodes to a target polynucleotide. In some embodiments, the identity of the sample source of a target polynucleotide comprising two or more barcode sequences may be accurately determined using only one of the barcode sequences that it comprises. In general, accurate identification of a sample from which a target polynucleotide is derived comprises correct identification of a sample source from among two or more samples in a pool, such as about, less than about, or more than about 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 16, 20, 24, 28, 32, 36, 40, 50, 60, 70, 80, 90, 100, 128, 192, 384, 500, 1000 or more samples in a pool.

[00117] In some embodiments, the methods are useful for preparing target polynucleotide(s) for sequencing by the sequencing by ligation methods commercialized by Applied Biosystems (e.g., SOLiD sequencing). In general, double stranded fragment polynucleotides can be prepared by the methods of the present invention, and then incorporated into a water-in-oil emulsion along with polystyrene beads and amplified, for example by PCR. In some cases, alternative amplification methods can be employed in the water-in-oil emulsion such as any of the methods provided herein. The amplified product in each water microdroplet formed by the emulsion can interact, bind, or hybridize with the one or more beads present in that microdroplet leading to beads with a plurality of amplified products of substantially one sequence. When the emulsion is broken, the beads float to the top of the sample and are placed onto an array. The methods can include a step of rendering the nucleic acid bound to the beads single stranded or partially single stranded. Sequencing primers are then added along with a mixture of four different fluorescently labeled oligonucleotide probes. The probes bind specifically to the two bases in the polynucleotide to be sequenced immediately adjacent and 3' of the sequencing primer to determine which of the four bases are at those positions. After washing and reading the fluorescence signal from the first incorporated probe, a ligase is added. The ligase cleaves the oligonucleotide probe between the fifth and sixth bases, removing the fluorescent dye from the polynucleotide to be sequenced. The whole process is repeated using a different sequencing primer until all of the intervening positions in the sequence are imaged. The process allows the simultaneous reading of millions of DNA fragments in a 'massively parallel' manner. This 'sequence-by-ligation' technique uses probes that encode for two bases rather than just one, allowing error recognition by signal mismatching, leading to increased base determination accuracy.
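The error recognition afforded by two-base encoding can be sketched as follows. The color assignment used here (the XOR of two-bit base codes with A=0, C=1, G=2, T=3) is the commonly described four-color scheme and is an assumption for illustration; it is not specified in this document, and the function names are hypothetical.

```python
BASES = "ACGT"  # assumed two-bit codes: A=0, C=1, G=2, T=3


def encode_colors(seq):
    """Encode each adjacent pair of bases as one of four colors (0-3)."""
    return [BASES.index(a) ^ BASES.index(b) for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Recover the base sequence from a known first base and the colors."""
    seq = [first_base]
    for color in colors:
        seq.append(BASES[BASES.index(seq[-1]) ^ color])
    return "".join(seq)


def count_color_changes(seq_a, seq_b):
    """Count positions at which the color strings of two sequences differ."""
    pairs = zip(encode_colors(seq_a), encode_colors(seq_b))
    return sum(1 for x, y in pairs if x != y)


if __name__ == "__main__":
    reference = "ATGGC"
    colors = encode_colors(reference)
    print(colors)
    print(decode_colors("A", colors))
    # A true internal single-base substitution changes two adjacent colors.
    print(count_color_changes(reference, "ATCGC"))
```

Because every internal base contributes to two colors, a genuine substitution alters two adjacent colors, whereas an isolated single-color change is more likely a measurement error. That is the sense in which signal mismatching supports error recognition.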

[00118] In some embodiments, the methods are useful for preparing target polynucleotide(s) for sequencing by synthesis using the methods commercialized by 454/Roche Life Sciences including but not limited to the methods and apparatus described in Margulies et al., Nature (2005) 437:376-380; and U.S. Pat. Nos. 7,244,559; 7,335,762; 7,211,390; 7,244,567; 7,264,929; and 7,323,305. In general, double stranded fragment polynucleotides can be prepared by the methods of the present invention, immobilized onto beads, and compartmentalized in a water-in-oil PCR emulsion. In some cases, alternative amplification methods can be employed in the water-in-oil emulsion such as any of the methods provided herein. When the emulsion is broken, amplified fragments remain bound to the beads. The methods can include a step of rendering the nucleic acid bound to the beads single stranded or partially single stranded. The beads can be enriched and loaded into wells of a fiber optic slide so that there is approximately 1 bead in each well. Nucleotides are flowed across and into the wells in a fixed order in the presence of polymerase, sulfurylase, and luciferase. Addition of nucleotides complementary to the target strand can result in a chemiluminescent signal that is recorded, such as by a camera. The combination of signal intensity and positional information generated across the plate allows software to determine the DNA sequence.

[00119] In some embodiments, the methods are useful for preparing target polynucleotide(s) for sequencing by the methods commercialized by Helicos Biosciences Corporation (Cambridge, Mass.) as described in U.S. application Ser. No. 11/167,046, and U.S. Pat. Nos. 7,501,245; 7,491,498; 7,276,720; and in U.S. Patent Application Publication Nos. US20090061439; US20080087826; US20060286566; US20060024711; US20060024678; US20080213770; and US20080103058. In general, double stranded fragment polynucleotides can be prepared by the methods of the present invention, and then immobilized onto a flow-cell surface. The methods can include a step of rendering the nucleic acid bound to the flow-cell surface single stranded or partially single stranded. Polymerase and labeled nucleotides are then flowed over the immobilized DNA. After fluorescently labeled nucleotides are incorporated into the DNA strands by a DNA polymerase, the surface is illuminated with a laser, and an image is captured and processed to record single molecule incorporation events to produce sequence data.

[00120] In some embodiments, the methods are useful for preparing target polynucleotide(s) for sequencing by the sequencing by ligation methods commercialized by Dover Systems. Generally, double stranded fragment polynucleotides can be prepared by the methods of the present invention. The polynucleotides can then be amplified in an emulsion in the presence of magnetic beads. Any amplification methods can be employed in the water-in-oil emulsion such as any of the methods provided herein. The resulting beads with immobilized clonal polynucleotide polonies are then purified by magnetic separation, capped, amine functionalized, and covalently immobilized in a series of flow cells. The methods can include a step of rendering the nucleic acid bound to the flow-cell surface single stranded or partially single stranded. Then, a series of anchor primers are flowed through the cell, where they hybridize to the synthetic oligonucleotide sequences at the 3' or 5' end of proximal or distal genomic DNA tags. Once an anchor primer is hybridized, a mixture of fully degenerate nonanucleotides ('nonamers') and T4 DNA ligase is flowed into the cell; each of the nonamer mixture's four components is labeled with one of four fluorophores, which correspond to the base type at the query position. The fluorophore-tagged nonamers selectively ligate onto the anchor primer, providing a fluorescent signal that identifies the corresponding base on the genomic DNA tag. Once the probes are ligated, fluorescently labeling the beads, the array is imaged in four colors. Each bead on the array will fluoresce in only one of the four images, indicating whether there is an A, C, G, or T at the position being queried. After imaging, the annealed primer-fluorescent probe complex, as well as residual enzyme, is chemically stripped from the array using guanidine HCl and sodium hydroxide. After each cycle of base reads at a given position has been completed, and the primer-fluorescent probe complex has been stripped, the anchor primer is replaced, and a new mixture of fluorescently tagged nonamers is introduced, for which the query position is shifted one base further into the genomic DNA tag. Seven bases are queried in this fashion, with the sequence performed from the 5' end of the proximal tag, followed by six base reads with a different anchor primer from the 3' end of the proximal tag, for a total of 13 base pair reads for this tag. This sequence is then repeated for the 5' and 3' ends of the distal tag, resulting in another 13 base pair reads. The ultimate result is a read length of 26 bases (thirteen from each of the paired tags). However, it is understood that this method is not limited to 26 base read lengths.

[00121] In some embodiments, the methods are useful for preparing target polynucleotide(s) from selectively enriched populations of specific sequence regions of interest in a strand-specific manner for sequencing by the methods well known in the art and further described below.

[00122] For example, the methods are useful for sequencing by the method commercialized by Illumina as described in U.S. Pat. Nos. 5,750,341; 6,306,597; and 5,969,119. In general, double stranded fragment polynucleotides can be prepared by the methods of the present invention to produce amplified nucleic acid sequences tagged at one end (e.g., (A)/(A')) or both ends (e.g., (A)/(A') and (C)/(C')). In some cases, single stranded nucleic acid tagged at one or both ends is amplified by the methods of the present invention (e.g., by SPIA or linear PCR). The resulting nucleic acid is then denatured and the single stranded amplified polynucleotides are randomly attached to the inside surface of flow-cell channels. Unlabeled nucleotides are added to initiate solid-phase bridge amplification to produce dense clusters of double-stranded DNA. To initiate the first base sequencing cycle, four labeled reversible terminators, primers, and DNA polymerase are added. After laser excitation, fluorescence from each cluster on the flow cell is imaged. The identity of the first base for each cluster is then recorded. Cycles of sequencing are performed to determine the fragment sequence one base at a time. For paired-end sequencing, such as, for example, when the polynucleotides are labeled at both ends by the methods of the present invention, sequencing templates can be regenerated in situ so that the opposite end of the fragment can also be sequenced.
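The cycle-by-cycle base identification described above can be sketched for a single cluster. This is an illustrative simplification only: the function name `call_bases`, the per-cycle intensity values, and the rule of calling the brightest of four channels are assumptions, not the base-calling method of any commercial instrument.

```python
def call_bases(intensities_per_cycle):
    """Call one base per sequencing cycle for a single cluster.

    `intensities_per_cycle` is a list with one dict per cycle, mapping each
    labeled reversible terminator (A, C, G, T) to its measured fluorescence.
    Assumption: the brightest channel in a cycle identifies the added base.
    """
    return "".join(
        max(channels, key=channels.get) for channels in intensities_per_cycle
    )


if __name__ == "__main__":
    # Hypothetical fluorescence readings for three cycles of one cluster.
    cluster = [
        {"A": 0.91, "C": 0.05, "G": 0.02, "T": 0.04},
        {"A": 0.03, "C": 0.08, "G": 0.85, "T": 0.06},
        {"A": 0.02, "C": 0.04, "G": 0.07, "T": 0.88},
    ]
    print(call_bases(cluster))
```

Each cycle contributes exactly one base because the terminator arrests extension until it is removed, so the read length equals the number of cycles performed.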

[00123] In some embodiments, the methods are useful for preparing target polynucleotide(s) for sequencing by the methods commercialized by Pacific Biosciences as described in U.S. Pat. Nos. 7,462,452; 7,476,504; 7,405,281; 7,170,050; 7,462,468; 7,476,503; 7,315,019; 7,302,146; 7,313,308; and US Application Publication Nos. US20090029385; US20090068655; US20090024331; and US20080206764. In general, double stranded fragment polynucleotides can be prepared by the methods of the present invention. The polynucleotides can then be immobilized in zero mode waveguide arrays. The methods may include a step of rendering the nucleic acid bound to the waveguide arrays single stranded or partially single stranded. Polymerase and labeled nucleotides are added in a reaction mixture, and nucleotide incorporations are visualized via fluorescent labels attached to the terminal phosphate groups of the nucleotides. The fluorescent labels are clipped off as part of the nucleotide incorporation. In some cases, circular templates are utilized to enable multiple reads on a single molecule.

[00124] Another example of a sequencing technique that can be used in the methods of the provided invention is nanopore sequencing (see e.g. Soni G V and Meller A. (2007) Clin Chem 53: 1996-2001). A nanopore can be a small hole of the order of 1 nanometer in diameter. Immersion of a nanopore in a conducting fluid and application of a potential across it can result in a slight electrical current due to conduction of ions through the nanopore. The amount of current that flows is sensitive to the size of the nanopore. As a DNA molecule passes through a nanopore, each nucleotide on the DNA molecule obstructs the nanopore to a different degree. Thus, the change in the current passing through the nanopore as the DNA molecule passes through the nanopore can represent a reading of the DNA sequence.

[00125] Another example of a sequencing technique that can be used in the methods of the provided invention is semiconductor sequencing provided by Ion Torrent (e.g., using the Ion Personal Genome Machine (PGM)). Ion Torrent technology can use a semiconductor chip with multiple layers, e.g., a layer with micro-machined wells, an ion-sensitive layer, and an ion sensor layer. Nucleic acids can be introduced into the wells, e.g., a clonal population of a single nucleic acid can be attached to a single bead, and the bead can be introduced into a well. To initiate sequencing of the nucleic acids on the beads, one type of deoxyribonucleotide (e.g., dATP, dCTP, dGTP, or dTTP) can be introduced into the wells. When one or more nucleotides are incorporated by DNA polymerase, protons (hydrogen ions) are released in the well, which can be detected by the ion sensor. The semiconductor chip can then be washed and the process can be repeated with a different deoxyribonucleotide. A plurality of nucleic acids can be sequenced in the wells of a semiconductor chip. The semiconductor chip can comprise chemical-sensitive field effect transistor (chemFET) arrays to sequence DNA (for example, as described in U.S. Patent Application Publication No. 20090026082). Incorporation of one or more triphosphates into a new nucleic acid strand at the 3' end of the sequencing primer can be detected by a change in current by a chemFET. An array can have multiple chemFET sensors.
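The way a series of single-nucleotide flows yields a sequence can be sketched as below. The sketch assumes, for illustration only, that the measured signal in a well is roughly proportional to the number of nucleotides incorporated in that flow and can be rounded to an integer count; the function name `decode_flows` and the signal values are hypothetical.

```python
def decode_flows(flows):
    """Reconstruct the synthesized strand from per-flow signals.

    `flows` is a sequence of (nucleotide, signal) pairs in the order the
    deoxyribonucleotides were introduced into the well.
    Assumption: rounding the signal gives the number of bases incorporated
    (0 means the nucleotide was not incorporated in that flow).
    """
    return "".join(
        nucleotide * max(0, round(signal)) for nucleotide, signal in flows
    )


if __name__ == "__main__":
    # Hypothetical normalized signals from one well over four flows.
    well = [("T", 2.1), ("A", 0.9), ("C", 0.1), ("G", 1.0)]
    print(decode_flows(well))
```

The design choice worth noting is that a flow with no incorporation still matters: it records that the next base is not that nucleotide, which keeps the remaining flows in register.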

Genetic Analysis

[00126] The methods of the present invention can be used in the analysis of genetic information of selective genomic regions of interest as well as genomic regions which may interact with the selective region of interest. Amplification methods as disclosed herein can be used in the devices, kits, and methods known to the art for genetic analysis, such as, but not limited to those found in U.S. Pat. Nos. 6,449,562, 6,287,766, 7,361,468, 7,414,117, 6,225,109, and 6,110,709. In some cases, amplification methods of the present invention can be used to amplify target nucleic acid for DNA hybridization studies to determine the presence or absence of polymorphisms. The polymorphisms, or alleles, can be associated with diseases or conditions such as genetic disease. In other cases the polymorphisms can be associated with susceptibility to diseases or conditions, for example, polymorphisms associated with addiction, degenerative and age related conditions, cancer, and the like. In other cases, the polymorphisms can be associated with beneficial traits such as increased coronary health, or resistance to diseases such as HIV or malaria, or resistance to degenerative diseases such as osteoporosis, Alzheimer's or dementia.

[00127] Any of the compositions described herein may be comprised in a kit. In a non-limiting example the kit, in suitable container means, comprises one or more RNA/DNA stem-loop oligonucleotides. The kit can further contain enzymes and/or reagents useful for ligation, cleavage and/or amplification. The kit can contain a DNA polymerase. The kit can contain reagents for amplification, for example reagents useful for single primer isothermal amplification methods. The kit can further optionally contain reagents for sequencing, for example, reagents useful for next-generation massively parallel sequencing methods.

[00128] The containers of the kits can generally include at least one vial, test tube, flask, bottle, syringe or other containers, into which a component may be placed, and preferably, suitably aliquoted. Where there is more than one component in the kit, the kit also can generally contain a second, third or other additional container into which the additional components can be separately placed. However, various combinations of components can be comprised in a container.

[00129] When the components of the kit are provided in one or more liquid solutions, the liquid solution can be an aqueous solution. However, the components of the kit can be provided as dried powder(s). When reagents and/or components are provided as a dry powder, the powder can be reconstituted by the addition of a suitable solvent.

[00130] A kit can include instructions for employing the kit components as well as the use of any other reagent not included in the kit. Instructions can include variations that can be implemented.

[00131] In one aspect, the invention provides kits containing any one or more of the elements disclosed in the above methods and compositions. In some embodiments, a kit comprises a composition of the invention, in one or more containers. In some embodiments, the invention provides kits comprising adaptors, primers, and/or other oligonucleotides described herein. In some embodiments, the kit further comprises one or more of: (a) a DNA ligase, (b) a DNA-dependent DNA polymerase, (c) an RNA-dependent DNA polymerase, (d) random primers, (e) primers comprising at least 4 thymidines at the 3' end, (f) a DNA endonuclease, (g) a DNA-dependent DNA polymerase having 3' to 5' exonuclease activity, (h) a plurality of primers, each primer having one of a plurality of selected sequences, (i) a DNA kinase, (j) a DNA exonuclease, (k) magnetic beads, (l) an enzyme comprising RNase H activity, (m) an RNA ligase, and (n) one or more buffers suitable for one or more of the elements contained in said kit. The adaptors, primers, other oligonucleotides, and reagents can be, without limitation, any of those described above. Elements of the kit can further be provided, without limitation, in any suitable amounts and/or using any of the combinations (such as in the same kit or same container) described above or any other suitable combination known in the art. The kits may further comprise additional agents, such as those described above, for use according to the methods of the invention. The kit elements can be provided in any suitable container, including but not limited to test tubes, vials, flasks, bottles, ampules, syringes, or the like. The agents can be provided in a form that may be directly used in the methods of the invention, or in a form that requires preparation prior to use, such as in the reconstitution of lyophilized agents. Agents may be provided in aliquots for single use or as stocks from which multiple uses, such as in a number of reactions, may be obtained.

EXAMPLES

Example 1: Whole transcriptome amplification using stem-loop chimeric amplification primer and the corresponding linear (non stem-loop) chimeric amplification primer

[00132] In this example cDNA was generated from total RNA (50 pg HeLa total RNA) and from a control sample without added total RNA (no template control, NTC). The reagents and protocols for first and second strand cDNA were those of Ovation Pico (NuGEN Technologies). The chimeric first strand primer used for this example comprises a combination of a chimeric primer with a random sequence at the 3'-end (Primer A random, in Table 1) and a chimeric primer with an oligo dT at the 3'-portion (Primer A dT, in Table 1) and further comprises the same 5'-RNA tail. The 5'-RNA tail of the first strand cDNA primer comprises the same sequence as the single chimeric amplification primer used in the isothermal single primer amplification step, as shown below.

[00133] The double stranded cDNA with an RNA/DNA heteroduplex at one end generated by the first and second cDNA synthesis steps above was purified as per the manufacturer's instructions, and subjected to the single primer isothermal amplification (SPIA) step, using the reagents and conditions of the Ovation Pico kit (NuGEN Technologies) and either the stem-loop chimeric amplification primer (Stem-Loop amplification primer A, Table 1) or the corresponding linear chimeric amplification primer (Linear amplification primer A, Table 1). The amplification primer generated from the stem-loop amplification primer as well as the linear amplification primers are chimeric primers comprising a DNA sequence at the 3'-end and an RNA sequence at the 5'-end, which are the same sequence as the corresponding tail of the first strand primers. SPIA amplification was carried out according to the manufacturer's instructions. SPIA amplification products were purified using MinElute columns (QIAGEN) using the manufacturer's instructions and reagents.

Table 1

[00134] Table 1 illustrates the primer sequences with ribonucleotides appearing in lower case and deoxyribonucleotides appearing in upper case. "N" represents a random nucleotide.

[00135] The yields of transcriptome amplification employing the stem-loop chimeric amplification primer and the corresponding linear chimeric amplification primer using the method of the invention are summarized in Table 2.

Table 2.

[00136] The results demonstrate the higher amplification yields obtained in reactions employing the stem-loop chimeric amplification primer of the invention as compared to the corresponding linear chimeric amplification primer. This is likely due to employing a pro-primer which is activated during the initial reaction, thus leading to lower production of non-specific products as indicated by the lower yield of amplification product in the no target control.

Example 2: Whole genome amplification (genomic DNA derived from FFPE sample) using stem-loop chimeric amplification primer and the corresponding linear (non stem-loop) chimeric amplification primer

[00137] Whole genome amplification (WGA) was performed as described in the Ovation WGA FFPE System Users Guide (Part #6200, NuGEN Technologies Inc). Briefly, 50 or 100 ng of isolated FFPE genomic DNA was fragmented, end-repaired and adenosine-tailed to generate 5' phosphorylated/3' A-overhang fragments. Adaptors suitable for amplification using the method of the invention were utilized for this example (Adaptor B, Table 3). Adaptor B comprises two oligonucleotides that when annealed form a single T-overhang to aid in ligation efficiency and a long 3' overhang that comprises the priming site for the SPIA chimeric amplification primer. After the adaptors were ligated to the fragments, the reactions were purified to remove excess adaptors using Agencourt® RNAClean® XP beads. The resulting adaptor-ligated fragments were isothermally amplified using the enzyme mixture and protocol from the kit and purified using an appropriate purification method, such as Qiagen® spin columns. Either the stem-loop chimeric amplification primer (Stem-loop amplification primer B, Table 3) or the corresponding linear chimeric amplification primer (Linear amplification primer B, Table 3) was employed in the isothermal single primer amplification step.

[00138] Amplification products were purified as per the manufacturer's instructions and the yields of whole genome amplification reactions were determined using Nanodrop (Thermo Scientific; Figure 6). The size distribution of the amplification products from these reactions was determined using a Bioanalyzer (Agilent) (Figure 7).

Table 3

[00139] While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.