Title:
BLOCKED ASYMMETRIC HAIRPIN ADAPTORS
Document Type and Number:
WIPO Patent Application WO/2024/015962
Kind Code:
A1
Abstract:
Disclosed herein are blocked asymmetric hairpin adaptors that include a spacer that prevents extension by a strand-displacing polymerase. Such adaptors can be utilized for creating libraries of templates that can be processed into template concatemers and/or interrogated by various sequencing methods.

Inventors:
SPARKS ANDREW (US)
HUROWITZ EVAN (US)
SHEN MIN-JUI RICHARD (US)
Application Number:
PCT/US2023/070211
Publication Date:
January 18, 2024
Filing Date:
July 14, 2023
Assignee:
PACIFIC BIOSCIENCES CALIFORNIA INC (US)
International Classes:
C12Q1/6855
Domestic Patent References:
WO2019018366A1 2019-01-24
WO2009120372A2 2009-10-01
WO2010117470A2 2010-10-14
WO1989010977A1 1989-11-16
WO2004018497A2 2004-03-04
WO1991006678A1 1991-05-16
WO2007123744A2 2007-11-01
Foreign References:
US20120196279A1 2012-08-02
US20220033895A1 2022-02-03
US5235033A 1993-08-10
US5034506A 1991-07-23
US5001050A 1991-03-19
US5198543A 1993-03-30
US5576204A 1996-11-19
US8420366B2 2013-04-16
US8257954B2 2012-09-04
US8936926B2 2015-01-20
US20100260465A1 2010-10-14
US8921086B2 2014-12-30
US20160319345A1 2016-11-03
US11326206B2 2022-05-10
US20070099208A1 2007-05-03
US20090026082A1 2009-01-29
US20090127589A1 2009-05-21
US20100137143A1 2010-06-03
US20100282617A1 2010-11-11
US6210891B1 2001-04-03
US6258568B1 2001-07-10
US6274320B1 2001-08-14
US5599675A 1997-02-04
US5750341A 1998-05-12
US10077470B2 2018-09-18
US10443098B2 2019-10-15
US10975427B2 2021-04-13
US20190119742A1 2019-04-25
US20180187245A1 2018-07-05
US7057026B2 2006-06-06
US7329492B2 2008-02-12
US7211414B2 2007-05-01
US7315019B2 2008-01-01
US7405281B2 2008-07-29
US20080108082A1 2008-05-08
US5302509A 1994-04-12
US6828100B1 2004-12-07
US20090247414A1 2009-10-01
Other References:
PACIFIC BIOSCIENCES OF CALIFORNIA: "Template Preparation and Sequencing Guide", 1 January 2014 (2014-01-01), XP093090656, Retrieved from the Internet [retrieved on 20231011]
MEIJER ET AL.: "Φ29 Family of Phages", MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS, vol. 65, no. 2, 2001, pages 261 - 287
STRYER, L: "Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual", vol. I-IV, 1995, COLD SPRING HARBOR LABORATORY PRESS, article "Genome Analysis: A Laboratory Manual Series"
BERG ET AL.: "Biochemistry", 2002, W. H. FREEMAN PUB., article "Oligonucleotide Synthesis: A Practical Approach"
COX: "Lehninger, Principles of Biochemistry", 2000, W. H. FREEMAN PUB.
LIZARDI ET AL., NAT. GENET., vol. 19, 1998, pages 225 - 232
EID ET AL., SCIENCE, vol. 323, 2009, pages 133 - 138
TRAVERS ET AL., NUCLEIC ACIDS RES, vol. 38, 2010, pages e159
RONAGHI ET AL., ANALYTICAL BIOCHEMISTRY, vol. 242, no. 1, 1996, pages 84 - 9
RONAGHI, GENOME RES., vol. 11, no. 1, 2001, pages 3 - 11
RONAGHI ET AL., SCIENCE, vol. 281, no. 5375, 1998, pages 363
SHENDURE ET AL., SCIENCE, vol. 309, 2005, pages 1728 - 1732
BAINS ET AL., JOURNAL OF THEORETICAL BIOLOGY, vol. 135, no. 3, 1988, pages 303 - 7
DRMANAC ET AL., NATURE BIOTECHNOLOGY, vol. 16, 1998, pages 54 - 58
FODOR ET AL., SCIENCE, vol. 251, no. 4995, 1995, pages 767 - 773
LEVENE ET AL., SCIENCE, vol. 299, 2003, pages 682 - 686
LUNDQUIST ET AL., OPT. LETT., vol. 33, 2008, pages 1026 - 1028
KORLACH ET AL., PROC. NATL. ACAD. SCI. USA, vol. 105, 2008, pages 1176 - 1181
BENTLEY ET AL., NATURE, vol. 456, 2008, pages 53 - 59
Attorney, Agent or Firm:
NG, Jennifer et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A blocked hairpin adaptor comprising:

(i) a loop strand comprising a first and second adaptor sequence separated by a polymerase-blocking spacer, wherein the first adaptor sequence is 3’ of the spacer and the second adaptor sequence is 5’ of the spacer, wherein the first adaptor sequence and the second adaptor sequence are each single stranded, wherein the first adaptor sequence comprises a first binding site comprising at least 6 nucleotides, wherein the second adaptor sequence comprises a second binding site comprising at least 10 nucleotides; and

(ii) a stem domain comprising a first stem sequence and a second stem sequence, wherein the 3’ end of the first adaptor sequence is connected to the first stem sequence and the 5’ end of the second adaptor sequence is connected to the second stem sequence, and wherein the first stem sequence and the second stem sequence each comprise at least 5 nucleotides, and wherein the first stem sequence is the reverse complement of the second stem sequence.

2. The blocked hairpin adaptor of claim 1, wherein the spacer prevents extension by a strand-displacing polymerase.

3. The blocked hairpin adaptor of claim 1 or 2, wherein the spacer comprises a non-nucleotide linker that blocks extension by the strand-displacing polymerase.

4. The blocked hairpin adaptor of claim 3, wherein the spacer comprises at least one polyethylene glycol (PEG) group.

5. The blocked hairpin adaptor of claim 4, wherein the spacer comprises at least 6 polyethylene glycol (PEG) groups.

6. The blocked hairpin adaptor of claim 4, wherein the spacer comprises a non-nucleotide linker backbone comprising at least 10 bonds in length.

7. The blocked hairpin adaptor of any one of claims 3-6, wherein the non-nucleotide linker comprises an eight-member ring.

8. The blocked hairpin adaptor of claim 7, wherein the eight-member ring is cyclooctene or cyclooctane.

9. The blocked hairpin adaptor of any one of claims 1-8, wherein the spacer comprises a non-nucleotide linker comprising at least one selected from the group consisting of an abasic site linker, an SpC3 linker, an SpC6 linker, an SpC9 linker, an SpC12 linker, and an SpC18 linker.

10. The blocked hairpin adaptor of any one of claims 1-9, wherein the non-nucleotide linker comprises a conjugate generated in a click chemistry reaction.

11. The blocked hairpin adaptor of claim 10, wherein the click chemistry reaction is a strain-promoted click chemistry reaction.

12. The blocked hairpin adaptor of any one of claims 1-11, wherein the spacer further comprises a continuous polynucleotide sequence comprising a cleavable site.

13. The blocked hairpin adaptor of claim 12, wherein the cleavable site is at the 5’ end of the continuous polynucleotide sequence of the spacer.

14. The blocked hairpin adaptor of claim 12, wherein the cleavable site is at the 3’ end of the continuous polynucleotide sequence of the spacer.

15. The blocked hairpin adaptor of any one of claims 12-14, wherein the cleavable site is a uracil.

16. The blocked hairpin adaptor of any one of claims 12-15, wherein the spacer comprising a continuous polynucleotide sequence further comprises at least one blocking moiety that prevents extension by the strand-displacing polymerase.

17. The blocked hairpin adaptor of claim 16, wherein the blocking moiety is a side chain of an unnatural nucleotide.

18. The blocked hairpin adaptor of claim 16, wherein the blocking moiety is a thermally stable unnatural nucleotide hybridized to a complementary sequence of the loop strand, optionally wherein the thermally stable unnatural nucleotide is a locked nucleic acid.

19. The blocked hairpin adaptor of any one of claims 16-18, wherein the blocking moiety of the spacer is 5’ to the continuous polynucleotide sequence of the spacer.

20. The blocked hairpin adaptor of any one of claims 16-18, wherein the blocking moiety of the spacer is 3’ to the continuous polynucleotide sequence of the spacer.

21. The blocked hairpin adaptor of any one of claims 1-20, wherein the spacer prevents extension by a strand-displacing polymerase and cannot be modified to allow extension while the polymerase is blocked by the spacer.

22. The blocked hairpin adaptor of any one of claims 1-21, wherein the first stem domain and the second stem domain each comprise from 7-20 nucleotides.

23. The blocked hairpin adaptor of any one of claims 1-22, wherein the stem domain comprises a blunt end at the free end that is not connected to the loop strand.

24. The blocked hairpin adaptor of any one of claims 1-22, wherein the stem domain comprises an overhang end at the free end that is not connected to the loop strand.

25. The blocked hairpin adaptor of claim 24, wherein the overhang end is an adenosine on the 3’ strand of the stem domain.

26. The blocked hairpin adaptor of claim 24 or 25, wherein the overhang end comprises no more than 4 nucleotides in length.

27. The blocked hairpin adaptor of any one of claims 1-26, wherein the first adaptor sequence further comprises a first binding site comprising at least 10 nucleotides in length.

28. The blocked hairpin adaptor of any one of claims 1-26, wherein the second adaptor sequence further comprises a second binding site comprising at least 10 nucleotides in length.

29. The blocked hairpin adaptor of any one of claims 1-26, wherein the first adaptor sequence and/or the second adaptor sequence comprises at least one tag.

30. The blocked hairpin adaptor of any one of claims 1-27, wherein the first adaptor sequence and the second adaptor sequence each comprise at least one tag sequence.

31. The blocked hairpin adaptor of any one of claims 1-27, wherein the first stem sequence and/or the second stem sequence comprises at least one tag sequence.

32. The blocked hairpin adaptor of any one of claims 1-27, wherein the first stem sequence and the second stem sequence each comprise at least one tag sequence.

33. The blocked hairpin adaptor of any one of claims 30-32, wherein the sequences of the first binding site and the second binding site are different and the at least one tag sequence on the first adaptor sequence can be read from the second binding site and the at least one tag sequence on the second adaptor sequence can be read from the first binding site.

34. The blocked hairpin adaptor of any one of claims 1-33, wherein the at least one tag sequence is a sample index sequence or a molecular index sequence.

35. The blocked hairpin adaptor of claim 34, wherein the at least one tag sequence on the first adaptor sequence and the at least one tag sequence on the second adaptor sequence together provide a dual sample index.

36. The blocked hairpin adaptor of claim 34, wherein the molecular index sequence comprises a variable or degenerate nucleotide sequence.

37. The blocked hairpin adaptor of any one of claims 1-36, wherein the first adaptor sequence and the second adaptor sequence each comprise a sample index.

38. The blocked hairpin adaptor of any one of claims 34-37, wherein the sample index sequence or the molecular index sequence comprise a click chemistry spacer.

39. A blocked hairpin adaptor comprising:

(i) a single-stranded loop strand comprising from 5’ to 3’: a second sequencing primer binding site, at least one tag sequence, a second primer binding site comprising at least 10 nucleotides in length, a polymerase-blocking spacer, a first primer binding site comprising at least 10 nucleotides in length, and a first sequencing primer binding site; and

(ii) a stem domain comprising: a first stem sequence at the 5’ of the loop strand and a second stem sequence at the 3’ of the loop strand, wherein the first and second stem sequences form a double-stranded DNA sequence under annealing conditions and wherein the free ends of the first and second stem sequences form a blunt end or an overhang end.

40. A blocked hairpin adaptor comprising:

(i) a loop strand comprising a first and second adaptor sequence separated by a polymerase-blocking spacer, wherein the first adaptor sequence is 3’ of the spacer and the second adaptor sequence is 5’ of the spacer, wherein the first adaptor sequence and the second adaptor sequence are each single stranded, wherein the first adaptor sequence comprises the reverse complement of a first primer binding site and a first tag sequence 3’ of the reverse complement of the first primer binding site, and wherein the second adaptor sequence comprises a second primer binding site and a second tag sequence 3’ of the second primer binding site; and

(ii) a stem domain comprising a first stem sequence and a second stem sequence, wherein the 3’ end of the first adaptor sequence is connected to the first stem sequence and the 5’ end of the second adaptor sequence is connected to the second stem sequence, and wherein the first stem sequence and the second stem sequence each comprise at least 5 nucleotides, and wherein the first stem sequence is the reverse complement of the second stem sequence.

41. A blocked hairpin adaptor comprising:

(i) a loop strand comprising a first and second adaptor sequence separated by a polymerase-blocking spacer, wherein the first adaptor sequence is 3’ of the spacer and the second adaptor sequence is 5’ of the spacer, wherein the first adaptor sequence and the second adaptor sequence are each single stranded, and wherein the spacer comprises a non-nucleotide linear polymer of at least 30 bonds in length; and

(ii) a stem domain comprising a first stem sequence and a second stem sequence, wherein the 3’ end of the first adaptor sequence is connected to the first stem sequence and the 5’ end of the second adaptor sequence is connected to the second stem sequence, and wherein the first stem sequence and the second stem sequence each comprise at least 5 nucleotides, and wherein the first stem sequence is the reverse complement of the second stem sequence.

42. The blocked hairpin adaptor of claim 41, wherein the spacer comprises a non-nucleotide linear polymer ranging from about 30 to 300 bonds in length or from about 60 to 200 bonds in length.

43. A kit comprising:

(a) the blocked hairpin adaptor of any one of claims 1-42; and

(b) a first primer complementary to the first binding site of the first adaptor sequence and/or a second primer complementary to the second binding site of the second adaptor sequence.

44. The kit of claim 43, further comprising a first polymerase.

45. The kit of claim 44, wherein the first polymerase has strand-displacing activity.

46. The kit of claim 44, wherein the first polymerase is a thermostable polymerase.

47. The kit of any one of claims 44-46, further comprising a second polymerase, wherein the first polymerase is a thermostable polymerase and the second polymerase has higher strand-displacing activity than the first polymerase.

48. The kit of any one of claims 44-46, further comprising a second polymerase, wherein the first polymerase has higher strand-displacing activity and the second polymerase is a thermostable polymerase.

49. The kit of any one of claims 43-48, wherein the first primer and/or second primer comprises at least one tag sequence.

50. The kit of claim 49, wherein the at least one tag sequence is a sample index sequence or a molecular index sequence.

51. The kit of claim 50, wherein the molecular index sequence comprises a degenerate nucleotide sequence.

52. The kit of claim 50, wherein the sample index sequence comprises dual sample index sequences.

53. The kit of any one of claims 43-52, further comprising a DNA ligase.

54. The kit of any one of claims 43-52, wherein the first primer and/or second primer is a tailed primer.

55. The kit of any one of claims 43-54, further comprising at least one reagent for fragmenting a double-stranded DNA (dsDNA) library.

56. The kit of any one of claims 43-55, further comprising at least one reagent for generating an overhang end on a dsDNA fragment that hybridizes to an overhang end on the blocked hairpin adaptor.

57. The kit of any one of claims 43-56, further comprising a splint oligonucleotide that hybridizes to a portion of each of the first and second adaptor sequences under annealing conditions.

58. A method comprising:

(a) providing a sample comprising fragmented dsDNA molecules comprising target sequences;

(b) contacting the fragmented dsDNA molecules with a plurality of the blocked hairpin adaptors of any one of claims 1-41; and

(c) ligating two blocked hairpin adaptors at each of the stem domains of the adaptors to opposite ends of each of the fragmented dsDNA molecules to form circularized adaptor-target sequence constructs, wherein each of the stem domains of the blocked hairpin adaptors is double stranded during ligation.

59. The method of claim 58, wherein providing the sample of step (a) further comprises performing end repair of the fragmented dsDNA molecules.

60. The method of claim 58 or 59, wherein providing the sample of step (a) further comprises performing tailing of the fragmented dsDNA molecules.

61. The method of any one of claims 58-60, wherein providing the sample of step (a) comprises generating a library of fragmented dsDNA molecules comprising mechanical or enzymatic fragmentation shearing or target-specific pre-amplification.

62. The method of claim 61, wherein generating the library of fragmented dsDNA molecules further comprises performing end repair, and optionally performing A-tailing of the fragmented dsDNA molecules.

63. The method of claim 61 or 62, wherein the library of fragmented dsDNA molecules comprises blunt ends.

64. The method of claim 61 or 62, wherein the library of fragmented dsDNA molecules comprises overhang ends complementary to the overhang ends of the blocked hairpin adaptors.

65. The method of any one of claims 58-64, further comprising extending the circularized adaptor-target sequence constructs by primer extension comprising contacting the circularized constructs with a tailed primer that specifically hybridizes to the second primer binding site, and a strand-displacing polymerase, thereby producing linear target sequence constructs.

66. The method of any one of claims 58-65, further comprising amplifying by PCR the linear target sequence constructs comprising contacting the linear target sequence constructs with a reaction mixture comprising: a first primer comprising the reverse complement of the first primer binding site and a second primer comprising the sequence of the second primer binding site, and a thermostable polymerase, thereby amplifying linear target sequence constructs.

67. The method of claim 65, wherein the reaction mixture further comprises a strand-displacing polymerase.

68. The method of any one of claims 65-67, wherein the tailed primer comprises at least one tag sequence.

69. The method of any one of claims 66-68, wherein the first primer is a tailed primer comprising at least one tag sequence.

70. The method of any one of claims 66-69, wherein the second primer is a tailed primer comprising at least one tag sequence.

71. The method of any one of claims 65-70, wherein the tailed primer comprises at least one sample index.

72. The method of any one of claims 65-71, wherein each of the linear target sequence constructs comprises a sample index.

73. The method of any one of claims 65-72, wherein the sample of step (a) is a plurality of different samples.

74. The method of claim 73, further comprising combining the linear target sequence constructs from the plurality of different samples.

75. The method of any one of claims 65-74, further comprising circularizing the linear target sequence constructs comprising hybridizing a splint oligonucleotide to the 5’ and 3’ ends of the linear construct and ligating the ends, thereby producing circularized target sequence constructs.

76. The method of claim 75, wherein the splint oligonucleotide hybridizes to the first binding site or the first binding site and a region of the second binding site.

77. The method of claim 75 or 76, further comprising forming concatemers of the circularized target sequence constructs comprising rolling circle amplification (RCA) with a strand-displacing polymerase and an RCA primer hybridized to the circularized target sequence construct.

78. The method of claim 77, wherein the RCA primer is immobilized on a surface.

79. The method of claim 78, wherein the circularized target sequence constructs are deposited on the surface.

80. The method of claim 77, wherein the RCA primer is in solution.

81. The method of any one of claims 75-80, wherein the splint oligonucleotide is the RCA primer.

82. The method of any one of claims 75-81, further comprising sequencing.

83. The method of any one of claims 75-81, further comprising clustering before sequencing.

84. The method of claim 82 or 83, wherein the sequencing is cyclic sequencing or single molecule real-time sequencing.

85. A method comprising:

(a) providing a plurality of blocked hairpin adaptors, each of the blocked hairpin adaptors comprising: (i) a loop strand comprising a first and second adaptor sequence separated by a polymerase-blocking spacer, wherein the first adaptor sequence is 3’ of the spacer and the second adaptor sequence is 5’ of the spacer, wherein the first adaptor sequence and the second adaptor sequence are each single-stranded, wherein the first adaptor sequence comprises the reverse complement of a first primer binding site, wherein the second adaptor sequence comprises a second primer binding site; and wherein the first and second primer binding sites each comprise at least 10 nucleotides in length, and

(ii) a stem domain comprising a first stem sequence and a second stem sequence that form a double-stranded DNA sequence under annealing conditions, wherein the 3’ end of the first adaptor sequence is connected to the 5’ end of the first stem sequence and the 5’ end of the second adaptor sequence is connected to the 3’ end of the second stem sequence, and wherein the first stem sequence comprises at least 5 nucleotides that are complementary to the second stem sequence, each comprise at least 5 nucleotides;

(b) ligating the hairpin adaptors to both ends of fragmented dsDNA molecules to form circularized adaptor-target sequence constructs; and

(c) amplifying the circularized adaptor-target sequence constructs by PCR to produce linear target sequence constructs; wherein the PCR amplification is performed with a strand-displacing polymerase, a first primer that specifically hybridizes to the first primer binding site, and a second primer that specifically hybridizes to the second primer binding site; and wherein the polymerase-blocking spacer prevents extension by the strand-displacing polymerase.

Description:
BLOCKED ASYMMETRIC HAIRPIN ADAPTORS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of priority to U.S. Provisional Patent Application No. 63/389,611 filed July 15, 2022, the contents of which are incorporated herein by reference in their entirety.

REFERENCE TO SEQUENCE LISTING

[0002] This application contains a Sequence Listing, which is being submitted electronically in XML format in accordance with the WIPO Standard ST.26 and is hereby incorporated by reference in its entirety. The XML copy, created on July 11, 2023, is named “49217-4010WO.xml” and is 2,833 bytes in size.

BACKGROUND

[0003] Nucleic acid template generation is an important step in sequencing applications. It can be advantageous for such templates to have homogeneous structures. This allows one to efficiently process templates of a variety of size ranges using a single protocol. Also, use of a primer binding site common to the templates can help reduce sequencing bias and normalize sequencing performance from templates of different sizes.

[0004] There is an increasing demand for efficient, low-cost methods for the preparation of high-quality nucleic acid templates for sequencing technologies. The present disclosure provides improved compositions and methods that would be useful for preparing library elements that can be applied to numerous downstream analyses, including sequencing applications.

BRIEF SUMMARY

[0005] In one aspect, provided herein is a blocked hairpin adaptor comprising: (i) a loop strand comprising a first and second adaptor sequence separated by a polymerase-blocking spacer, wherein the first adaptor sequence is 3’ of the spacer and the second adaptor sequence is 5’ of the spacer, wherein the first adaptor sequence and the second adaptor sequence are each single stranded, wherein the first adaptor sequence comprises a first binding site comprising at least 6 nucleotides, wherein the second adaptor sequence comprises a second binding site comprising at least 10 nucleotides; and (ii) a stem domain comprising a first stem sequence and a second stem sequence, wherein the 3’ end of the first adaptor sequence is connected to the first stem sequence and the 5’ end of the second adaptor sequence is connected to the second stem sequence, and wherein the first stem sequence and the second stem sequence each comprise at least 5 nucleotides, and wherein the first stem sequence is the reverse complement of the second stem sequence.
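For illustration only, the structural constraints recited in the aspect above can be expressed as a small consistency check. The following Python sketch is not part of the disclosure: the class name `BlockedHairpinAdaptor`, its field names, and the example sequences are hypothetical, and the spacer is modeled as an opaque label rather than as a chemical structure.

```python
from dataclasses import dataclass
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class BlockedHairpinAdaptor:
    """Hypothetical model; every sequence is written 5' to 3'."""
    second_stem: str          # joined to the 5' end of the second adaptor sequence
    second_binding_site: str  # lies 5' of the spacer
    spacer: str               # opaque label for the polymerase-blocking spacer
    first_binding_site: str   # lies 3' of the spacer
    first_stem: str           # joined to the 3' end of the first adaptor sequence

    def problems(self) -> List[str]:
        """List every recited constraint that this adaptor fails to meet."""
        found = []
        if len(self.first_binding_site) < 6:
            found.append("first binding site shorter than 6 nt")
        if len(self.second_binding_site) < 10:
            found.append("second binding site shorter than 10 nt")
        if min(len(self.first_stem), len(self.second_stem)) < 5:
            found.append("stem sequence shorter than 5 nt")
        if self.first_stem.upper() != reverse_complement(self.second_stem):
            found.append("first stem is not the reverse complement of the second stem")
        return found


# Made-up sequences chosen only to satisfy the length and pairing constraints.
adaptor = BlockedHairpinAdaptor(
    second_stem="ATCTCTCTC",
    second_binding_site="ACGTTGCAAGGT",
    spacer="PEG",
    first_binding_site="TTGACCGA",
    first_stem=reverse_complement("ATCTCTCTC"),
)
print(adaptor.problems())  # an empty list means all four checks passed
```

The check covers only sequence length and stem complementarity; whether a given spacer actually blocks a strand-displacing polymerase is an experimental question that this sketch does not address.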

[0006] In some embodiments, the spacer prevents extension by a strand-displacing polymerase. In some embodiments, the spacer comprises a non-nucleotide linker that blocks extension by the strand-displacing polymerase.

[0007] In some embodiments, the spacer comprises at least one polyethylene glycol (PEG) group. In various embodiments, the spacer comprises at least 6 polyethylene glycol (PEG) groups.

[0008] In many embodiments, the spacer comprises a non-nucleotide linker backbone comprising at least 10 bonds in length. In some embodiments, the non-nucleotide linker comprises an eight-member ring. In many embodiments, the eight-member ring is cyclooctene or cyclooctane.

[0009] In some embodiments, the spacer comprises a non-nucleotide linker comprising at least one selected from the group consisting of an abasic site linker, an SpC3 linker, an SpC6 linker, an SpC9 linker, an SpC12 linker, and an SpC18 linker.

[0010] In some embodiments, the non-nucleotide linker comprises a conjugate generated in a click chemistry reaction. In various embodiments, the click chemistry reaction is a strain-promoted click chemistry reaction.

[0011] In some embodiments, the spacer further comprises a continuous polynucleotide sequence comprising a cleavable site. In many embodiments, the cleavable site is at the 5’ end of the continuous polynucleotide sequence of the spacer. In various embodiments, the cleavable site is at the 3’ end of the continuous polynucleotide sequence of the spacer. In some embodiments, the cleavable site is a uracil.

[0012] In some embodiments, the spacer comprising a continuous polynucleotide sequence further comprises at least one blocking moiety that prevents extension by the strand-displacing polymerase. In many embodiments, the blocking moiety is a side chain of an unnatural nucleotide. In various embodiments, the blocking moiety is a thermally stable unnatural nucleotide hybridized to a complementary sequence of the loop strand, optionally wherein the thermally stable unnatural nucleotide is a locked nucleic acid.

[0013] In some embodiments, the blocking moiety of the spacer is 5’ to the continuous polynucleotide sequence of the spacer. In some embodiments, the blocking moiety of the spacer is 3’ to the continuous polynucleotide sequence of the spacer.

[0014] In some embodiments, the spacer prevents extension by a strand-displacing polymerase and cannot be modified to allow extension while the polymerase is blocked by the spacer.

[0015] In some embodiments, the first stem domain and the second stem domain each comprise from 7-20 nucleotides.

[0016] In some embodiments, the stem domain comprises a blunt end at the free end that is not connected to the loop strand. In many embodiments, the stem domain comprises an overhang end at the free end that is not connected to the loop strand.

[0017] In some embodiments, the overhang end is an adenosine on the 3’ strand of the stem domain. In many embodiments, the overhang end comprises no more than 4 nucleotides in length. In some instances, the overhang end comprises 1, 2, 3, or 4 nucleotides in length.

[0018] In some embodiments, the first adaptor sequence further comprises a first binding site comprising at least 10 nucleotides in length. In some embodiments, the second adaptor sequence further comprises a second binding site comprising at least 10 nucleotides in length.

[0019] In some embodiments, the first adaptor sequence and/or the second adaptor sequence comprises at least one tag. In various embodiments, the first adaptor sequence and the second adaptor sequence each comprise at least one tag sequence. In many embodiments, the first stem sequence and/or the second stem sequence comprises at least one tag sequence. In some embodiments, the first stem sequence and the second stem sequence each comprise at least one tag sequence.

[0020] In some embodiments, the sequences of the first binding site and the second binding site are different and the at least one tag sequence on the first adaptor sequence can be read from the second binding site and the at least one tag sequence on the second adaptor sequence can be read from the first binding site.

[0021] In some embodiments, the at least one tag sequence is a sample index sequence or a molecular index sequence. In many embodiments, the at least one tag sequence on the first adaptor sequence and the at least one tag sequence on the second adaptor sequence together provide a dual sample index.
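As a hedged illustration of how a dual sample index might be used downstream, the sketch below assigns a read to a sample only when both tag sequences match a known pair. The sample names, the six-nucleotide tag sequences, and the function `assign_sample` are hypothetical and are not taken from the disclosure; exact matching is assumed for simplicity.

```python
from typing import Dict, Optional, Tuple

# Hypothetical sample sheet:
# (tag on first adaptor sequence, tag on second adaptor sequence) -> sample name.
SAMPLE_SHEET: Dict[Tuple[str, str], str] = {
    ("ACGTAC", "TTGCAA"): "sample_A",
    ("ACGTAC", "GGATCC"): "sample_B",
    ("CATGCA", "TTGCAA"): "sample_C",
}


def assign_sample(first_tag: str, second_tag: str) -> Optional[str]:
    """Return the sample for an exact tag pair, or None if the pair is unknown."""
    return SAMPLE_SHEET.get((first_tag.upper(), second_tag.upper()))


print(assign_sample("ACGTAC", "GGATCC"))  # sample_B
print(assign_sample("CATGCA", "GGATCC"))  # None: this combination was never assigned
```

Because the lookup key is the pair rather than either tag alone, a combination of two individually valid tags that was never assigned is rejected instead of being attributed to a sample.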

[0022] In various embodiments, the molecular index sequence comprises a variable or degenerate nucleotide sequence. In some embodiments, the first adaptor sequence and the second adaptor sequence each comprise a sample index. In some embodiments, the sample index sequence or the molecular index sequence comprise a click chemistry spacer.
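To make the notion of a degenerate molecular index concrete, the following sketch counts how many distinct sequences a degenerate pattern can encode. It assumes the pattern is written with standard IUPAC nucleotide codes (for example, N for any base); the function name `index_diversity` and the example patterns are hypothetical.

```python
# Number of bases that each IUPAC nucleotide code can stand for.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def index_diversity(pattern: str) -> int:
    """Count the distinct sequences that a degenerate pattern can represent."""
    total = 1
    for code in pattern.upper():
        total *= IUPAC_CHOICES[code]  # raises KeyError on a non-IUPAC character
    return total


print(index_diversity("NNNNNNNN"))  # 65536, i.e. 4 ** 8
print(index_diversity("NNRYNN"))    # 1024, i.e. 4 * 4 * 2 * 2 * 4 * 4
```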

[0023] In another aspect, provided is a blocked hairpin adaptor comprising: (i) a single-stranded loop strand comprising from 5’ to 3’: a second sequencing primer binding site, at least one tag sequence, a second primer binding site comprising at least 10 nucleotides in length, a polymerase-blocking spacer, a first primer binding site comprising at least 10 nucleotides in length, and a first sequencing primer binding site; and (ii) a stem domain comprising: a first stem sequence at the 5’ of the loop strand and a second stem sequence at the 3’ of the loop strand, wherein the first and second stem sequences form a double-stranded DNA sequence under annealing conditions and wherein the free ends of the first and second stem sequences form a blunt end or an overhang end.

[0024] In some aspects, provided is a blocked hairpin adaptor comprising: (i) a loop strand comprising a first and second adaptor sequence separated by a polymerase-blocking spacer, wherein the first adaptor sequence is 3’ of the spacer and the second adaptor sequence is 5’ of the spacer, wherein the first adaptor sequence and the second adaptor sequence are each single stranded, wherein the first adaptor sequence comprises the reverse complement of a first primer binding site and a first tag sequence 3’ of the reverse complement of the first primer binding site, and wherein the second adaptor sequence comprises a second primer binding site and a second tag sequence 3’ of the second primer binding site; and (ii) a stem domain comprising a first stem sequence and a second stem sequence, wherein the 3’ end of the first adaptor sequence is connected to the first stem sequence and the 5’ end of the second adaptor sequence is connected to the second stem sequence, and wherein the first stem sequence and the second stem sequence each comprise at least 5 nucleotides, and wherein the first stem sequence is the reverse complement of the second stem sequence.

[0025] In some aspects, provided is a blocked hairpin adaptor comprising: (i) a loop strand comprising a first and second adaptor sequence separated by a polymerase-blocking spacer, wherein the first adaptor sequence is 3’ of the spacer and the second adaptor sequence is 5’ of the spacer, wherein the first adaptor sequence and the second adaptor sequence are each single stranded, and wherein the spacer comprises a non-nucleotide linear polymer of at least 30 bonds in length; and (ii) a stem domain comprising a first stem sequence and a second stem sequence, wherein the 3’ end of the first adaptor sequence is connected to the first stem sequence and the 5’ end of the second adaptor sequence is connected to the second stem sequence, and wherein the first stem sequence and the second stem sequence each comprise at least 5 nucleotides, and wherein the first stem sequence is the reverse complement of the second stem sequence.

[0026] In some embodiments, the spacer comprises a non-nucleotide linear polymer ranging from about 30 to 300 bonds in length or from about 60 to 200 bonds in length.

[0027] In some aspects, described herein is a kit comprising: (a) any of the blocked hairpin adaptors provided; and (b) a first primer complementary to the first binding site of the first adaptor sequence and/or a second primer complementary to the second binding site of the second adaptor sequence.

[0028] In some embodiments, the kit further comprises a first polymerase. In many embodiments, the first polymerase has strand-displacing activity. In some embodiments, the first polymerase is a thermostable polymerase.

[0029] In some embodiments, the kit further comprises a second polymerase, wherein the first polymerase is a thermostable polymerase and the second polymerase has higher strand-displacing activity than the first polymerase. In various embodiments, the kit further comprises a second polymerase, wherein the first polymerase has higher strand-displacing activity and the second polymerase is a thermostable polymerase.

[0030] In some embodiments, the first primer and/or second primer comprises at least one tag sequence. In some embodiments, the at least one tag sequence is a sample index sequence or a molecular index sequence. In many embodiments, the molecular index sequence comprises a degenerate nucleotide sequence. In some embodiments, the sample index sequence comprises dual sample index sequences.

[0031] In some embodiments, the kit further comprises a DNA ligase.

[0032] In many embodiments, the first primer and/or second primer is a tailed primer.

[0033] In some embodiments, the kit further comprises at least one reagent for fragmenting a double-stranded DNA (dsDNA) library. In various embodiments, the kit further comprises at least one reagent for generating an overhang end on a dsDNA fragment that hybridizes to an overhang end on the blocked hairpin adaptor. In some embodiments, the kit further comprises a splint oligonucleotide that hybridizes to a portion of each of the first and second adaptor sequences under annealing conditions.

[0034] In many aspects, provided is a method comprising: (a) providing a sample comprising fragmented dsDNA molecules comprising target sequences; (b) contacting the fragmented dsDNA molecules with a plurality of any of the blocked hairpin adaptors described herein; and (c) ligating two blocked hairpin adaptors at each of the stem domains of the adaptors to opposite ends of each of the fragmented dsDNA molecules to form circularized adaptor-target sequence constructs, wherein each of the stem domains of the blocked hairpin adaptors is double stranded during ligation.

[0035] In some embodiments of the method, the providing the sample of step (a) further comprises performing end repair of the fragmented dsDNA molecules. In many embodiments, the providing the sample of step (a) further comprises performing tailing of the fragmented dsDNA molecules. Non-limiting examples of tailing include A-tailing, C-tailing, G-tailing, and T-tailing such that one or more As, one or more Cs, one or more Gs, and one or more Ts, respectively, are added to the 3’ ends of the dsDNA molecules. In various embodiments, the providing the sample of step (a) comprises generating a library of fragmented dsDNA molecules by mechanical shearing, enzymatic fragmentation, or target-specific pre-amplification.

[0036] In some embodiments, the generating the library of fragmented dsDNA molecules further comprises performing end repair, and optionally performing A-tailing of the fragmented dsDNA molecules.

[0037] In some embodiments, the library of fragmented dsDNA molecules comprises blunt ends. In many embodiments, the library of fragmented dsDNA molecules comprises overhang ends complementary to the overhang ends of the blocked hairpin adaptors.

[0038] In some embodiments, the method further comprises extending the circularized adaptor-target sequence constructs by primer extension comprising contacting the circularized constructs with a tailed primer that specifically hybridizes to the second primer binding site, and a strand-displacing polymerase, thereby producing linear target sequence constructs.

[0039] In some embodiments, the method further comprises amplifying by PCR the linear target sequence constructs comprising contacting the linear target sequence constructs with a reaction mixture comprising: a first primer comprising the reverse complement of the first primer binding site and a second primer comprising the sequence of the second primer binding site, and a thermostable polymerase, thereby amplifying linear target sequence constructs.

[0040] In some embodiments, the reaction mixture further comprises a strand-displacing polymerase.

[0041] In some embodiments, the tailed primer comprises at least one tag sequence. In many embodiments, the first primer is a tailed primer comprising at least one tag sequence. In various embodiments, the second primer is a tailed primer comprising at least one tag sequence. In some embodiments, the tailed primer comprises at least one sample index.

[0042] In some embodiments, each of the linear target sequence constructs comprises a sample index.

[0043] In some embodiments, the sample of step (a) is a plurality of different samples.

[0044] In some embodiments, the method further comprises combining the linear target sequence constructs from the plurality of different samples.

[0045] In some embodiments, the method further comprises circularizing the linear target sequence constructs comprising hybridizing a splint oligonucleotide to the 5’ and 3’ ends of the linear construct and ligating the ends, thereby producing circularized target sequence constructs.

[0046] In some embodiments, the splint oligonucleotide hybridizes to the first binding site or the first binding site and a region of the second binding site.

[0047] In some embodiments, the method further comprises forming concatemers of the circularized target sequence constructs comprising rolling circle amplification (RCA) with a strand-displacing polymerase and an RCA primer hybridized to the circularized target sequence construct.

[0048] In some embodiments, the RCA primer is immobilized on a surface. In many embodiments, the circularized target sequence constructs are deposited on the surface.

[0049] In some embodiments, the RCA primer is in solution.

[0050] In some embodiments, the splint oligonucleotide is the RCA primer.

[0051] In various embodiments, the method further comprises sequencing. In some embodiments, the method further comprises clustering before sequencing.

[0052] In some embodiments, the sequencing is cyclic sequencing or single molecule real-time sequencing.

[0053] In some aspects, provided is a method comprising:

[0054] (a) providing a plurality of blocked hairpin adaptors, each blocked hairpin adaptor comprising: (i) a loop strand comprising a first and second adaptor sequence separated by a polymerase-blocking spacer, wherein the first adaptor sequence is 3’ of the spacer and the second adaptor sequence is 5’ of the spacer, wherein the first adaptor sequence and the second adaptor sequence are each single stranded, wherein the first adaptor sequence comprises the reverse complement of a first primer binding site, wherein the second adaptor sequence comprises a second primer binding site, wherein the first and second primer binding sites are each at least 10 nucleotides in length; and (ii) a stem domain comprising a first stem sequence and a second stem sequence that form a double-stranded DNA sequence under annealing conditions, wherein the 3’ end of the first adaptor sequence is connected to the 5’ end of the first stem sequence and the 5’ end of the second adaptor sequence is connected to the 3’ end of the second stem sequence, wherein the first stem sequence and the second stem sequence each comprise at least 5 nucleotides, and wherein the first stem sequence comprises at least 5 nucleotides that are complementary to the second stem sequence;

[0055] (b) ligating the hairpin adaptors to both ends of fragmented dsDNA molecules to form circularized adaptor-target sequence constructs; and

[0056] (c) amplifying the circularized adaptor-target sequence constructs by PCR to produce linear target sequence constructs; wherein the PCR amplification is performed with a strand-displacing polymerase, a first primer that specifically hybridizes to the first primer binding site, and a second primer that specifically hybridizes to the second primer binding site; and wherein the polymerase-blocking spacer prevents extension by the strand-displacing polymerase.

BRIEF DESCRIPTION OF THE DRAWINGS

[0057] FIGS. 1A and 1B depict adaptor ligation products. FIG. 1A shows ligation products from 2 complementary adaptor pairs (A:A’ adaptor sequences and B:B’ adaptor sequences). FIG. 1B shows ligation products from asymmetric hairpin adaptors containing an A adaptor sequence and a B’ adaptor sequence.

[0058] FIG. 2 depicts examples of asymmetric hairpin adaptors including a cleavable linker or a non-cleavable linker.

[0059] FIG. 3 depicts primer extension to generate separate ssDNA library elements.

[0060] FIG. 4 depicts a schematic diagram of an integrated library preparation workflow using non-cleavable asymmetric hairpin adaptors.

[0061] FIG. 5 provides exemplary embodiments of blocked asymmetric hairpin adaptors with different spacers.

[0062] FIGS. 6A and 6B provide information relating to the blocked asymmetric hairpin adaptors with different spacers of Example 1. FIG. 6A shows a schematic of the PCR workflow using two blocked asymmetric hairpin adaptors connected to a dsDNA fragment. FIG. 6B shows qPCR data from templates containing different adaptors.

[0063] FIG. 7 shows an exemplary schematic of a blocked asymmetric hairpin adaptor containing a spacer, adaptor arms and a stem domain.

[0064] FIG. 8 shows an exemplary schematic of a blocked asymmetric hairpin adaptor with sample and molecular indexes.

[0065] FIG. 9 shows an exemplary schematic of a blocked asymmetric hairpin adaptor and primers for amplification of adaptor ligation products.

DETAILED DESCRIPTION

I. Introduction

[0066] Provided herein is a method for creating library elements appropriate for, in many instances, short read sequencing. Library preparation from double-stranded DNA (dsDNA) samples entails optionally fragmenting the dsDNA into appropriately sized fragments, repairing (and optionally A-tailing) the ends of the dsDNA such that they can be ligated to adaptors, and ligating dsDNA adaptors to each end of the dsDNA fragments.

[0067] Different adaptor arms (“A” adaptor arms vs. “B'” adaptor arms in FIG. 1A) are ligated on each end of a DNA fragment to create a library element appropriate for existing short read sequencing methods. Some library preparation methods accomplish this by ligating a mixture of A and B dsDNA adaptor arms to dsDNA fragments. This results in about 50% of library elements with A and B adaptor arms (FIG. 1A). By contrast, ligating a single asymmetric adaptor in accordance with the constructs and methods described herein results in about 100% of library elements with A and B' adaptor arms (FIG. 1B). Moreover, some circularization/RPC generation processes are strand specific, in that they require that ssDNA library elements contain the A adaptor and the reverse complement of the B (e.g., B') adaptor to generate ssDNA circles. In such cases, only about 25% of two-adaptor ligation products will contain A and B' (vs A' and B), whereas about 100% of asymmetric adaptor ligation products will contain A and B'.

[0068] In some embodiments, an asymmetric hairpin adaptor comprises (1) a dsDNA stem domain that can ligate to dsDNA fragments and (2) an ssDNA region containing two different sequences (A and B' in FIG. 1B). Ligation of asymmetric adaptors to both ends of a dsDNA fragment connects the 5’ and 3’ ends of each ssDNA fragment with A and B' arms, respectively. In some embodiments, the double stranded asymmetric adaptor ligation products (see, FIG. 1B) are separated into separate strands prior to performing clustering and sequencing.
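The approximate percentages above can be reproduced with a small enumeration. The following is only a minimal sketch under simplifying assumptions that are not stated in the disclosure: each fragment end is assumed to receive an A or a B adaptor independently and with equal probability, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def two_adaptor_fractions():
    """Enumerate which adaptor (A or B) lands on each end of one fragment.

    Assumption (illustrative only): both ends ligate independently and
    A and B are equally likely at each end.
    """
    outcomes = list(product("AB", repeat=2))  # (left end, right end)
    # Products that carry one A and one B adaptor, in either order.
    mixed = [o for o in outcomes if set(o) == {"A", "B"}]
    # For one given strand, only a single ordering places A at the 5' end
    # and B' at the 3' end.
    strand_specific = [o for o in outcomes if o == ("A", "B")]
    total = len(outcomes)
    return Fraction(len(mixed), total), Fraction(len(strand_specific), total)


mixed, strand_specific = two_adaptor_fractions()
print(f"A and B on opposite ends: {float(mixed):.0%}")
print(f"A and B' on a given strand: {float(strand_specific):.0%}")
```

Under these assumptions the enumeration gives one half and one quarter, matching the "about 50%" and "about 25%" figures for mixed A and B adaptors. A single asymmetric hairpin adaptor has only one possible outcome per end, which is why the corresponding figure is about 100%.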

[0069] While various embodiments of the invention(s) of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the invention(s). It should be understood that various alternatives to the embodiments of the invention(s) described herein may be employed in practicing any one of the invention(s) set forth herein.

II. Definitions

[0070] As used herein, the term “nucleotide” can be used to refer to a native nucleotide or analog thereof. Examples include, but are not limited to, nucleotide triphosphates (NTPs) such as ribonucleotide triphosphates (rNTPs), deoxyribonucleotide triphosphates (dNTPs), or non-natural analogs thereof such as dideoxyribonucleotide triphosphates (ddNTPs) or reversibly terminated nucleotide triphosphates (rtNTPs).

[0071] By "nucleic acid", "polynucleotide", "oligonucleotide", or grammatical equivalents herein is meant at least two nucleotides covalently linked together. A nucleic acid of the present invention will generally contain phosphodiester bonds, although in some cases, nucleic acid analogs are included that may have alternate backbones, comprising, for example, phosphoramide, phosphorothioate, phosphorodithioate, and peptide nucleic acid (PNA) backbones and linkages. Other analog nucleic acids include those with positive backbones, non-ionic backbones, and non-ribose backbones, including those described in US5,235,033 and US5,034,506. The template nucleic acid may also have other modifications, such as the inclusion of heteroatoms, the attachment of labels, such as dyes, or substitution with functional groups which will still allow for base pairing and for recognition by the enzyme.

[0072] As used herein, a "substantially identical" nucleic acid is one that has at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% sequence identity to a reference nucleic acid sequence. The length of comparison is preferably the full length of the nucleic acid, but is generally at least 20 nucleotides, 30 nucleotides, 40 nucleotides, 50 nucleotides, 75 nucleotides, 100 nucleotides, 125 nucleotides, or more.
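As an illustration of the identity thresholds above, the following minimal sketch computes percent identity between two sequences. It assumes the sequences have already been aligned to equal length without gaps; the function names and the default 80% threshold argument are hypothetical choices for this example, and a real comparison would normally use an alignment tool.

```python
def percent_identity(query: str, reference: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Assumption (illustrative only): both sequences are the same length
    and contain no gap characters.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def is_substantially_identical(query: str, reference: str,
                               threshold: float = 80.0) -> bool:
    """True when identity meets the chosen threshold (e.g., 80, 90, 95)."""
    return percent_identity(query, reference) >= threshold


# Hypothetical 10-nt sequences that differ at one position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, is_substantially_identical("ACGTACGTAC", "ACGTACGTAA", 95.0))
```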

[0073] By "asymmetric nucleic acid sequence" is meant a nucleic acid that has a different nucleic acid composition at the first end of the nucleic acid as compared to the second end of the nucleic acid. In certain embodiments, the nucleic acid composition at the first end comprises a first annealed oligonucleotide that is not present at the second end. In some embodiments, an “asymmetric hairpin adaptor” includes a first adaptor sequence at a first end and a second adaptor sequence at a second end of a hairpin adaptor where the first adaptor and the second adaptor have at least one nucleic acid composition difference. The nucleic acid composition difference between the first adaptor and the second adaptor can be any desired difference, including but not limited to one or more nucleic acid sequence differences (e.g., substitutions, deletions, insertions, inversions, rearrangements (e.g., a different order of functional domains), or any combination thereof), or a nucleic acid modification difference. In some embodiments, and as detailed herein, asymmetric hairpin adaptors include a first primer binding site at one end and a second primer binding site at the second end that bind to different oligonucleotide sequences.

[0074] By "linker" is meant a moiety that functions to attach a first functional element or moiety to another. With respect to attaching nucleic acid domains to each other or to distinct moieties, linkers could be additional nucleotide bases (DNA, RNA, PNA, etc.), peptides, carbon-chain, poly-ethylene-glycol spacers, etc. Attachment of functional element and/or moieties with a linker can be covalent or non-covalent. No limitation in this regard is intended.

[0075] As used herein, the term "polymerase" can be used to refer to a nucleic acid synthesizing enzyme, including but not limited to, DNA polymerase, RNA polymerase, reverse transcriptase, primase and transferase. Typically, the polymerase has one or more active sites at which nucleotide binding and/or catalysis of nucleotide polymerization may occur. The polymerase may catalyze the polymerization of nucleotides to the 3’ end of the first strand of the double stranded nucleic acid molecule. For example, a polymerase catalyzes the addition of a next correct nucleotide to the 3’ oxygen moiety of the first strand of the double stranded nucleic acid molecule via a phosphodiester bond, thereby covalently incorporating the nucleotide to the first strand of the double stranded nucleic acid molecule. Optionally, a polymerase need not be capable of nucleotide incorporation under one or more conditions used in a method set forth herein. For example, a mutant polymerase may be capable of forming a ternary complex but incapable of catalyzing nucleotide incorporation.

[0076] By "strand-displacing nucleic acid polymerase", "strand-displacing polymerase", and equivalents thereof is meant a nucleic acid polymerase that has both 5' to 3' template dependent nucleic acid synthesis activity and 5' to 3' strand displacement activity. Thus, when such a polymerase encounters a double-stranded region of a template during nucleic acid synthesis, it will displace the non-template strand while continuing nucleic acid synthesis on the template strand. On circular templates (e.g., templates having a double-stranded insert with hairpin adaptors at both ends), such polymerases can enter into rolling circle replication under suitable nucleic acid synthesis conditions. While any suitable strand-displacing nucleic acid polymerase can be used, in certain embodiments the polymerase is a phi29 (Φ29) DNA polymerase or a modified version thereof. Where a modified recombinant Φ29 DNA polymerase is employed, it can be homologous to a wild-type or exonuclease deficient Φ29 DNA polymerase, e.g., as described in US5,001,050, US5,198,543, or US5,576,204, the full disclosures of which are incorporated herein by reference in their entirety for all purposes. Alternately, the modified recombinant DNA polymerase can be homologous to other Φ29-type DNA polymerases, such as B103, GA-1, PZA, Φ15, BS32, M2Y, Nf, G1, Cp-1, PRD1, PZE, SF5, Cp-5, Cp-7, PR4, PR5, PR722, L17, Φ21, or the like. For nomenclature, see also, Meijer et al. (2001) "Φ29 Family of Phages" Microbiology and Molecular Biology Reviews, 65(2):261-287. Suitable polymerases are described, for example, in US8,420,366 and US8,257,954, incorporated herein by reference in their entirety for all purposes. Other strand displacing polymerases can also be used, e.g., as described in US8,936,926, US2010/0260465, and US8,921,086, each of which is hereby incorporated by reference herein in its entirety for all purposes. In certain embodiments, a polymerase/template complex (such as a polymerase-nucleic acid complex) generated according to aspects of the present disclosure is used directly in a sequencing-by-synthesis reaction, e.g., a SMRT® Sequencing reaction (Pacific Biosciences of California, Inc.).

[0077] By "sample index" is meant one or more nucleotide sequences that can be used to demultiplex and assign sequence reads to the correct samples during data analysis. By "dual sample index" is meant two nucleotide sequences that are together unique to particular sample libraries that have been pooled together and can be used to demultiplex and assign sequence reads to the correct samples during data analysis. In some embodiments, samples may be normalized prior to pooling.
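The demultiplexing role of a dual sample index can be sketched as a lookup on the index pair. This is a minimal illustration, not a description of any particular analysis pipeline: the function name, the tuple layout of a read, the sample names, and the index sequences are all hypothetical, and only exact index matches are assigned.

```python
def demultiplex(reads, sample_sheet):
    """Assign reads to samples by their (index1, index2) pair.

    reads: iterable of (index1, index2, insert_sequence) tuples.
    sample_sheet: dict mapping (index1, index2) -> sample name.
    Assumption (illustrative only): exact matching, no error tolerance.
    """
    assigned = {name: [] for name in sample_sheet.values()}
    unassigned = []
    for index1, index2, insert in reads:
        sample = sample_sheet.get((index1, index2))
        if sample is None:
            unassigned.append(insert)  # index pair not in the pool
        else:
            assigned[sample].append(insert)
    return assigned, unassigned


# Hypothetical indexes and reads for two pooled samples.
sheet = {("ACGTACGT", "TTGCAAGC"): "sample_1",
         ("GGATCCAA", "CAGTTGAC"): "sample_2"}
reads = [("ACGTACGT", "TTGCAAGC", "insert_a"),
         ("GGATCCAA", "CAGTTGAC", "insert_b"),
         ("ACGTACGT", "CAGTTGAC", "insert_c")]
assigned, unassigned = demultiplex(reads, sheet)
print(assigned, unassigned)
```

In this sketch the third read carries a pair of indexes that each exist in the pool but do not occur together, so it is left unassigned. This illustrates why the two sequences of a dual sample index are treated as a pair.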

[0078] As used herein, the term "extension," when used in reference to a nucleic acid, means a process of adding at least one nucleotide to the 3’ end or 5’ end of the nucleic acid. The term “polymerase extension,” when used in reference to a nucleic acid, refers to a polymerase catalyzed process of adding at least one nucleotide to the 3’ end of the nucleic acid. A nucleotide or oligonucleotide that is added to a nucleic acid by extension is said to be incorporated into the nucleic acid. Accordingly, the term "incorporating" can be used to refer to the process of joining a nucleotide or oligonucleotide to the 3’ end or 5’ end of a nucleic acid by formation of a phosphodiester bond.

[0079] As used herein, the term "primer" refers to a nucleic acid having a sequence that binds to a nucleic acid at or near a template sequence. Generally, the primer binds in a configuration that allows replication of the template, for example, via polymerase extension of the primer. The primer can be a first portion of a nucleic acid molecule that binds to a second portion of the nucleic acid molecule, the first portion being a primer sequence and the second portion being a primer binding sequence. Alternatively, the primer can be a first nucleic acid molecule that binds to a second nucleic acid molecule having the template sequence. A primer can consist of DNA, RNA or analogs thereof. A primer can have an extendible 3’ end or a 3’ end that is blocked from primer extension.

[0080] As used herein, the articles "a" and "an" refer to one or to more than one (i.e. to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.

[0081] As used herein, the term "comprising" is intended to mean that the compositions and methods include the recited elements, but not excluding others. "Consisting essentially of" when used to define compositions and methods, shall mean excluding other elements of any essential significance to the composition or method. "Consisting of" shall mean excluding more than trace elements of other ingredients for claimed compositions and substantial method steps. Embodiments defined by each of these transition terms are within the scope of this invention. Accordingly, it is intended that the methods and compositions can include additional steps and components (comprising) or alternatively including steps and compositions of no significance (consisting essentially of) or alternatively, intending only the stated method steps or compositions (consisting of).

[0082] As used herein, the term "about" will be understood by persons of ordinary skill in the art and will vary to some extent on the context in which it is used. As used herein, “about” when referring to a measurable value such as an amount, a temporal duration, and the like, is meant to encompass variations of ±20% or ±10%, more preferably ±5%, even more preferably ±1%, and still more preferably ±0.1% from the specified value, as such variations are appropriate to perform the disclosed methods. All numerical designations, e.g., pH, temperature, time, concentration, and molecular weight, including ranges, are approximations which are varied (+) or (-) by increments of 0.1. It is to be understood, although not always explicitly stated, that all numerical designations are preceded by the term "about".

[0083] Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.

[0084] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are incorporated herein by reference for the purpose of describing and disclosing devices, compositions, formulations and methodologies which are described in the publication and which might be used in connection with the presently described invention.

[0085] In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, well-known features and procedures to those skilled in the art have not been described in order to avoid obscuring the invention.

[0086] The practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art. Such conventional techniques include polymer array synthesis, hybridization, ligation, phage display, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the examples herein below. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press), Stryer, L. (1995) Biochemistry (4th Ed.) Freeman, New York, Gait, "Oligonucleotide Synthesis: A Practical Approach" 1984, IRL Press, London, Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y. and Berg et al. (2002) Biochemistry, 5th Ed., W. H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entirety by reference for all purposes.

III. Blocked Asymmetric Hairpin Adaptor Compositions

[0087] Provided herein is a stem-loop or hairpin adaptor that includes a polymerase-blocking spacer. In some cases, the polymerase-blocking spacer prevents extension by a strand-displacing polymerase. In one aspect, a blocked asymmetric hairpin (stem-loop) adaptor structure includes a first adaptor arm (e.g., “B'” adaptor arm of FIG. 1B) and a second adaptor arm (e.g., “A” adaptor arm of FIG. 1B) separated by a spacer. The spacer can be helpful for the efficient and controlled generation of separate ssDNA library elements from short insert template structures. In some cases, the two adaptor arms can contain different features such as different primer binding sites and tag sequences (e.g., one or more sample index sequences and/or one or more molecular index sequences). In other words, the two adaptor arms are asymmetric or possess different sequences. In some instances, one or both adaptor arms include a sample index. In some instances, one or both adaptor arms include a molecular index. The different sequences allow for separate index and insert reads without paired-end sequencing. Such asymmetric hairpin adaptors can maximize template library yield and allow for the interrogation of both strands of dsDNA fragments.

[0088] In some embodiments, a single asymmetric hairpin adaptor includes (a) a continuous polynucleotide as a first adaptor arm, (b) a spacer, and (c) a second continuous polynucleotide as a second adaptor arm such that the free ends of the first and second polynucleotides can form a stem domain (e.g., stem region) by way of hybridization. A single hairpin adaptor can contain from 5’ to 3’ end: a first continuous polynucleotide linked at its 3’ end to a spacer which is linked to the 5’ end of a second continuous polynucleotide. Under certain conditions, a portion of the 5’ end of the first continuous polynucleotide hybridizes to a portion of the 3’ end of the second continuous polynucleotide, thereby creating a stem domain, while the other portions of the first and second continuous polynucleotides that are not part of the stem domain form a loop strand.

[0089] In some embodiments, a blocked asymmetric hairpin adaptor comprises two sequence regions that can hybridize together and form a stem domain. In other words, the sequences are reverse complements of each other. In some embodiments, a blocked asymmetric hairpin adaptor described includes a stem domain containing a first stem sequence and a second stem sequence that can hybridize to each other. In some embodiments, an asymmetric hairpin adaptor is single stranded and includes a first stem sequence at the 5’ end and a second stem sequence at the 3’ end such that the first stem sequence is the reverse complement of the second stem sequence. In some embodiments, the first stem sequence or a portion thereof has reverse complementarity to the second stem sequence or a portion thereof.
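The reverse-complement relationship between the two stem sequences can be checked programmatically. The following is a minimal sketch: the function names and example sequences are hypothetical, only unmodified A/C/G/T bases are handled, and the 5-nucleotide minimum mirrors the stem length recited in the summary above.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def stems_can_pair(first_stem: str, second_stem: str, min_len: int = 5) -> bool:
    """True when both stems meet the minimum length and the first stem is
    the exact reverse complement of the second stem."""
    if len(first_stem) < min_len or len(second_stem) < min_len:
        return False
    return first_stem.upper() == reverse_complement(second_stem).upper()


# Hypothetical 9-nt stem sequences, written 5' to 3'.
first = "GATCGGAAG"
second = "CTTCCGATC"
print(reverse_complement(second), stems_can_pair(first, second))
```

The check here requires full-length complementarity. Embodiments in which only a portion of each stem sequence is complementary would need a partial-match comparison instead.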

[0090] In some embodiments, the spacer is a polymerase-blocking spacer. An exemplary spacer prevents extension by a strand-displacing polymerase. In some embodiments, a spacer containing a blocking moiety controls the progress of a polymerase on an adaptor. In other words, the spacer provides a reaction stop or pause point for a polymerase at the blocking moiety.

[0091] In some embodiments, blocked asymmetric hairpin adaptors include binding sites (e.g., primer binding sites, amplification primer binding sites, sequencing primer binding sites, and binding sites for a splint oligonucleotide or a portion thereof), tag sequences (e.g., sample indexes, molecular indexes, and the like), polymerase-blocking elements (e.g., spacers) or any combination thereof.

[0092] In some embodiments, a blocked asymmetric hairpin adaptor contains a unique molecular index or unique molecular identifier (UMI). In other embodiments, a blocked asymmetric hairpin adaptor contains at least one molecular index. In some embodiments, a molecular index is about 5 to about 20 nucleotides, e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides. In other embodiments, a molecular index is less than 5 nucleotides. In some embodiments, a molecular index is more than 20 nucleotides. A molecular index can be a random polynucleotide octamer. In some embodiments, a molecular index is a variable or degenerate sequence. Additional examples of a molecular index and uses thereof are provided in, e.g., U.S. Pat. App. Publ. No. 2016/0319345, the contents of which are herein incorporated by reference.
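A random octamer molecular index can be modeled as eight independent draws from the four bases. This is a minimal sketch for illustration only: the function name is hypothetical, and in practice such sequences typically arise from degenerate positions during oligonucleotide synthesis rather than from software.

```python
import random


def random_molecular_index(length: int = 8, rng: random.Random = None) -> str:
    """Draw a random molecular index of the given length from A, C, G, T.

    Assumption (illustrative only): each base is equally likely at
    every position.
    """
    rng = rng or random.Random()
    return "".join(rng.choice("ACGT") for _ in range(length))


umi = random_molecular_index()
# Number of distinct sequences available for a fully degenerate octamer.
n_possible = 4 ** 8
print(umi, n_possible)
```

A fully degenerate octamer offers 4^8 = 65,536 distinct sequences. Longer indexes within the 5 to 20 nucleotide range described above enlarge this space by a factor of four per added position.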

[0093] In some embodiments, a blocked asymmetric hairpin adaptor contains a sample index. In other embodiments, a blocked asymmetric hairpin adaptor contains at least one sample index. A sample index of the loop strand can be a defined polynucleotide sequence. [0094] In some embodiments, each adaptor arm contains one or more features including binding sites, such as but not limited to, a primer binding site, an amplification primer binding site, a sequencing primer binding site and a binding site for a splint oligonucleotide or a portion thereof, and tag sequences, such as, but not limited to, a sample index and a molecular index/identifier (unique molecular index).

[0095] A blocked asymmetric hairpin adaptor can be ligated to each end of a dsDNA library fragment. As such, each dsDNA fragment is ligated to two blocked two asymmetric hairpin adaptors to produce a dsDNA template capped by asymmetric hairpin adaptors at each end. The resulting template is structurally linear by topologically circular.

[0096] In some case, a blocked hairpin adaptor is an asymmetric cleavable hairpin adaptor structure that includes an oligonucleotide with an A adaptor arm (a first adaptor arm) and a B' adaptor arm (a second adaptor arm) separated by a cleavable moiety (see, for example FIG. 2). In some instances, the cleavable moiety includes a DNA linker containing one or more uracil residues that can be cleaved such as by a uracil-DNA glycosylase. In some embodiments, asymmetric cleavable hairpin adaptor structure includes a uracil-containing oligonucleotide between the A adaptor arm and the B' adaptor arm. In some embodiments, cleavable hairpin adaptors can be used to generate a dsDNA template capped by hairpin adaptors. Such a cleavable dsDNA template can undergo subsequent ssDNA library element separation by cleavage and denaturation. Hairpin adaptors that are both cleavable and blocked may be suitable for alternate downstream workflows.

[0097] In some embodiments, a blocked hairpin adaptor is a non-cleavable hairpin adaptor structure that includes an oligonucleotide with an A adaptor arm (a first adaptor region) and a B’ adaptor arm (a second adaptor arm) separated by a non-cleavable flexible moiety (see, for example FIG. 2). In some instances, the non-cleavable moiety includes a linker containing one or more polyethylene glycol (PEG) groups. In some embodiments, asymmetric non-cleavable hairpin adaptors can be used to generate a dsDNA template capped by hairpin adaptors. Such a non-cleavable dsDNA template can undergo subsequent separation of the two ssDNA library elements by polymerase extension and denaturation.

A. Spacers

[0098] In some embodiments, the spacer of a blocked asymmetric hairpin adaptor provides one or more of the following: structural flexibility to facilitate efficient and complete primer extension across the entire length of all ssDNA library elements; and blockage of primer extension across the adaptor region junction to prevent creation of extension products containing multiple (e.g., complementary) copies of the template insert. Non-limiting examples of spacers for asymmetric hairpin adaptors include PEG linkers of various lengths, abasic sites, polynucleotide sequences with a blocking moiety, unnatural nucleotides or analogs thereof, modified nucleotides, damaged nucleotides, any large photolabile groups, strand-binding moieties, synthetic linkers, and combinations thereof. The blocked asymmetric hairpin adaptor described herein includes a spacer that prevents extension by a strand-displacing polymerase. In addition, the spacer cannot be modified to allow extension while the polymerase is blocked by the spacer. In some cases, a spacer includes a feature or moiety selected from the group consisting of a PEG linker, a synthetic linker, a non-nucleotide linker, an abasic site, a nick, a non-native/unnatural nucleotide or analog thereof, a primer binding site for a primer (or a primer), a large photolabile group, a strand-binding moiety, a damaged base, and a modified base.
In some embodiments, the spacer comprises at least one feature selected from the group consisting of a non-nucleotide linker, a non-nucleotide linker that blocks extension by the strand-displacing polymerase, such as but not limited to at least one PEG group, a non-nucleotide linker backbone containing at least 10 bonds in length, a non-nucleotide linker containing an eight-member ring, a continuous polynucleotide sequence containing a cleavable site, and a continuous polynucleotide sequence further containing at least one blocking moiety that prevents extension by the strand-displacing polymerase.

[0099] In some embodiments, the spacer contains a non-nucleotide linker that blocks extension by the strand-displacing polymerase. The non-nucleotide linker includes a linear polymer such as a hydrophobic linear polymer. In many embodiments, the spacer includes at least one polyethylene glycol (PEG) group. In some embodiments, the spacer includes one or more, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 99, 100, 103, 105, 107, 110, 113, 117, 120, 125, 130, 136, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, or more, PEG groups. In some embodiments, the spacer includes from about 50-300 PEG groups, e.g., 50, 60, 70, 80, 90, 99, 100, 103, 105, 107, 110, 113, 117, 120, 125, 130, 136, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 50-300, 50-280, 50-270, 50-250, 50-235, 50-230, 50-220, 50-210, 50-200, 50-190, 50-180, 50-170, 50-160, 50-150, 50-140, 50-130, 50-120, 50-110, or 50-100 PEG groups. In some embodiments, the spacer includes about 100 PEG groups.

[0100] In some embodiments, the spacer includes a non-nucleotide linear polymer of at least 30, e.g., 30, 35, 40, 45, 50, 55, 60, 65, 68, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 350, 400, 450, or more chemical bonds in length. In many embodiments, the non-nucleotide linear polymer of the spacer ranges from about 30 to 300, e.g., about 30-300, 40-300, 50-300, 60-300, 70-300, 80-300, 90-300, 100-300, 30-200, 40-200, 50-200, 60-200, 70-200, 80-200, 90-200, 100-200, 120-300, or 200-300 chemical bonds in length. In some embodiments, the non-nucleotide linear polymer ranges from about 60 to 200, e.g., about 60-200, 70-200, 80-200, 90-200, 100-200, 60-100, 60-155, 60-165, 60-175, 60-185, 65-195, 70-100, 70-155, 70-165, 70-175, 70-185, 70-195, 75-100, 75-155, 75-165, 75-175, 75-185, 75-195, 80-100, 80-155, 80-165, 80-175, 80-185, 80-195, 85-100, 85-155, 85-165, 85-175, 85-185, 85-195, 90-100, 90-155, 90-165, 90-175, 90-185, 90-195, 95-100, 95-155, 95-165, 95-175, 95-185, or 95-195 chemical bonds in length. In some instances, the spacer includes a linker backbone such as a non-nucleotide linker backbone. Such a non-nucleotide linker backbone can include at least about 10, e.g., 10, 15, 20, 24, 25, 30, 35, 36, 38, 40, 42, 45, 47, 50, 54, 58, 60, 62, 65, 68, 70, 72, 74, 76, 78, 80, 84, 87, 90, 92, 95, 98, 100, 105, 108, 110, 114, 117, 120, 125, 130, 136, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, or 300 chemical bonds in length. In various embodiments, a non-nucleotide linker backbone includes at least 10 chemical bonds in length. In some embodiments, a non-nucleotide linker backbone includes 300 or fewer chemical bonds in length.

[0101] In many embodiments, a non-nucleotide linker of the spacer includes an eight-member ring, such as, but not limited to, cyclooctene or cyclooctane. In some embodiments, a non-nucleotide linker of a spacer includes a conjugate generated by a click chemistry reaction. In some cases, the click chemistry reaction is a copper-free click reaction such as a reaction using DBCO-TEG. In some cases, the click chemistry reaction is a strain-promoted click chemistry reaction. In many instances, the click chemistry reaction is a strain-promoted alkyne-azide cycloaddition.

[0102] In many embodiments, a non-nucleotide linker of the spacer includes one or more linkers selected from the group consisting of an abasic site linker, a spacer C3 (SpC3 or a spacer 3 carbon chain or a propyl spacer) linker, a spacer C6 (SpC6 or a spacer 6 carbon chain) linker, a spacer C9 (SpC9 or a spacer 9 carbon chain) linker, a spacer C12 (SpC12 or a spacer 12 carbon chain) linker, a spacer C18 (SpC18 or a spacer 18 carbon chain) linker, spacer 9 and spacer 18. In some embodiments, an abasic site linker can include a 1’,2’-dideoxyribose (dSpacer or abasic furan). In many instances the spacer C3 (SpC3) linker can include multiple (2 or more) C3 spacers, such as but not limited to, SpC3-SpC3, SpC3-SpC3-SpC3, SpC3-SpC3-SpC3-SpC3, SpC3-SpC3-SpC3-SpC3-SpC3, and the like. The spacer C6 (SpC6) linker can include multiple C6 spacers, such as but not limited to, SpC6-SpC6, SpC6-SpC6-SpC6, SpC6-SpC6-SpC6-SpC6, SpC6-SpC6-SpC6-SpC6-SpC6, and the like. The spacer C9 (SpC9) linker can include multiple C9 spacers, such as but not limited to, SpC9-SpC9, SpC9-SpC9-SpC9, SpC9-SpC9-SpC9-SpC9, SpC9-SpC9-SpC9-SpC9-SpC9, and the like. The spacer C12 (SpC12) linker can include multiple C12 spacers, such as but not limited to, SpC12-SpC12, SpC12-SpC12-SpC12, SpC12-SpC12-SpC12-SpC12, SpC12-SpC12-SpC12-SpC12-SpC12, and the like.
The spacer C18 (SpC18) linker can include multiple C18 spacers, such as but not limited to, SpC18-SpC18, SpC18-SpC18-SpC18, SpC18-SpC18-SpC18-SpC18, SpC18-SpC18-SpC18-SpC18-SpC18, and the like. In other embodiments, the spacer includes a 9-atom (6 carbons and 3 oxygens) triethylene glycol spacer (spacer 9 or Sp9). The Sp9 linker can include multiple Sp9 spacers, such as but not limited to, Sp9-Sp9, Sp9-Sp9-Sp9, Sp9-Sp9-Sp9-Sp9, Sp9-Sp9-Sp9-Sp9-Sp9, and the like. In many embodiments, the spacer includes an 18-atom (12 carbons and 6 oxygens) hexa-ethylene glycol spacer (spacer 18 or Sp18). The Sp18 linker can include multiple Sp18 spacers, such as but not limited to, Sp18-Sp18, Sp18-Sp18-Sp18, Sp18-Sp18-Sp18-Sp18, Sp18-Sp18-Sp18-Sp18-Sp18, and the like. In some embodiments, the spacer is a C6 hexanediol spacer or a six carbon glycol spacer. In some embodiments, the linker includes one or more nucleotides fused or connected to any spacer described herein. For instance, the linker includes TTTTT (5 thymine nucleotides) and a SpC3 spacer (also known as TTTTT-SpC3).
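The hyphen-joined linker notation used above (for example, SpC3-SpC3-SpC3 or TTTTT-SpC3) can be parsed mechanically. The sketch below is an illustration under stated assumptions: the parser, its function names, and the per-unit atom table are hypothetical helpers, with atom counts taken from the chain lengths named in the text (3, 6, 9, 12, or 18 carbons for SpC units; 9 or 18 atoms for Sp9 and Sp18). Atoms in the linkages that join units are not counted, so the totals are not the "chemical bonds in length" figures given for the linker backbone.

```python
import re

# Chain atoms per spacer unit as named in the text; linkages between units
# are deliberately left out of these counts.
UNIT_ATOMS = {
    "SpC3": 3, "SpC6": 6, "SpC9": 9, "SpC12": 12, "SpC18": 18,
    "Sp9": 9, "Sp18": 18,
}

def parse_linker(notation):
    """Split a hyphen-joined linker string into nucleotide runs and spacer units."""
    parts = []
    for token in notation.split("-"):
        if token in UNIT_ATOMS:
            parts.append(("spacer", token))
        elif re.fullmatch(r"[ACGTU]+", token):
            parts.append(("nucleotides", token))
        else:
            raise ValueError(f"unrecognized token: {token!r}")
    return parts

def spacer_unit_atoms(notation):
    """Sum the chain atoms contributed by the spacer units alone."""
    return sum(UNIT_ATOMS[name] for kind, name in parse_linker(notation) if kind == "spacer")

print(parse_linker("TTTTT-SpC3"))
print(spacer_unit_atoms("Sp18-Sp18-Sp18"))
```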

[0103] In some embodiments, the spacer also includes a continuous polynucleotide sequence containing a cleavable site. The cleavable site can be at the 5’ end of the continuous polynucleotide sequence. The cleavable site can be at the 3’ end of the continuous polynucleotide sequence. In some instances, a cleavable site of the continuous polynucleotide sequence is a uracil. In other instances, a cleavable site of the continuous polynucleotide sequence comprises a uracil.

[0104] In some embodiments, the spacer containing a continuous polynucleotide sequence can contain at least one blocking moiety described herein that prevents extension by the strand-displacing polymerase. In some instances, the blocking moiety of the spacer is a thermally stable unnatural nucleotide hybridized to a complementary sequence of the loop strand. The thermally stable unnatural nucleotide can be a locked nucleic acid (LNA). In some embodiments, the continuous polynucleotide sequence of the spacer comprises one or more, e.g., 1, 2, 3, 4, 5, 6, or more blocking moieties described herein. The one or more blocking moieties can be located at the 5’ end of the continuous polynucleotide sequence of the spacer. The one or more blocking moieties can be located at the 3’ end of the continuous polynucleotide sequence of the spacer.

[0105] In some embodiments, the blocking moiety is an abasic site, e.g., introduced by a DNA glycosylase. In some embodiments, the blocking moiety includes one or more, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more non-native nucleotides. In some embodiments, the blocking moiety is a photolabile group that inhibits polymerase mediated replication. In some embodiments, the blocking moiety is a strand-binding moiety that prevents processive synthesis.

B. Loop Strands

[0106] Described is a loop strand of a blocked asymmetric hairpin adaptor that includes a region of a first adaptor arm, a spacer, and a region of a second adaptor arm. In some cases, a loop strand contains from 5’ to 3’: a first adaptor region, a spacer and a second adaptor region. Each adaptor region of the loop strand is linked to its corresponding stem strand of its adaptor arm. In other words, the 5’ end of the first adaptor region is connected to the 3’ end of the first stem strand of the first adaptor arm. The second adaptor region of the loop is linked to the second stem strand such that the 3’ end of the second adaptor region is connected to the 5’ end of the second stem strand of the second adaptor arm. In some instances, the loop strands of the first and second adaptors are neither the same nor complementary. In some instances, the loop strands of two adaptors do not hybridize. In some embodiments, the loop strand or a portion thereof is single-stranded. In some instances, the first and second adaptor arms of the loop strand are single-stranded. Referring to FIG. 7, the sequences of the first adaptor arm and the second adaptor arm are neither the same, nor complementary, nor reverse complementary sequences.

[0107] In some embodiments, the loop strand of the hairpin adaptor contains one or more features including binding sites, such as but not limited to, a primer binding site, an amplification primer binding site, a sequencing primer binding site, and a binding site for a splint oligonucleotide; and tag sequences, such as, but not limited to, a sample index and a molecular index/identifier (unique molecular index). In some embodiments, the loop strand of the hairpin adaptor does not include a molecular index. In some embodiments, the loop strand of the hairpin adaptor does not include a sample index. In some embodiments, the loop strand has one or more binding sites. In some embodiments, the loop strand has one or more primer binding sites.
In some embodiments, the loop strand has one or more primer binding sites and one or more sample indexes. In some instances, the primer binding site is identical to at least a portion of a primer sequence. In some instances, the primer binding site is the reverse complement of a primer sequence. The primer binding site can be a primer binding site for an initial extension, an amplification primer binding site or a sequencing primer binding site. In one embodiment, the primer binding site is a sample insert reading primer binding site. The term “binding site” used in the context of a hairpin adaptor also includes the reverse complement thereof, such that an extension product of a hairpin adaptor ligation product provides a sequence which can be bound. This design may allow for PCR amplification. For example, a hairpin adaptor may include a primer binding site (e.g., 5’ of a spacer) that is bound by a primer in a first round of extension to form a first extension product, and the hairpin adaptor may further comprise another primer binding site (e.g., 3’ of the spacer) for which the reverse complement sequence of the first extension product will be extended from another primer in a second round of extension.

[0108] In some embodiments, a loop strand of a first adaptor arm includes one or more of the following: a binding site, a primer binding site or the reverse complement thereof, a molecular index, a sample index, a sequencing primer binding site or the reverse complement thereof, and any combination thereof. In some embodiments, a loop strand of a first adaptor arm includes a binding site, a primer binding site or the reverse complement thereof, a molecular index, and a sample index. In some embodiments, a loop strand of a first adaptor arm includes a binding site, a molecular index, and a sample index. In some embodiments, a loop strand of a first adaptor arm includes a binding site and a sample index. In some embodiments, a loop strand of a first adaptor arm includes a binding site and a molecular index. In some embodiments, a loop strand of a first adaptor arm includes a primer binding site or the reverse complement thereof, a molecular index, and a sample index. In some embodiments, a loop strand of a first adaptor arm includes a primer binding site or the reverse complement thereof, a molecular index, and a sequencing primer binding site or the reverse complement thereof. In some embodiments, a loop strand of a first adaptor arm includes a primer binding site or the reverse complement thereof, a sample index, and a sequencing primer binding site or the reverse complement thereof. In some embodiments, a loop strand of a first adaptor arm includes a primer binding site or the reverse complement thereof and a sample index. In some embodiments, a loop strand of a first adaptor arm includes a primer binding site or the reverse complement thereof and a molecular index. In some embodiments, a loop strand of a first adaptor arm includes a primer binding site or the reverse complement thereof and a sequencing primer binding site or the reverse complement thereof. In some embodiments, the primer binding site is a primer binding site for an initial extension. 
In various embodiments, the primer binding site is an amplification binding site. In one embodiment, the primer binding site is a sample insert reading primer binding site. In some embodiments, the primer binding site is a sequencing primer binding site.

[0109] In some embodiments, a loop strand of a second adaptor arm includes one or more of the following: a binding site, a primer binding site or the reverse complement thereof, a molecular index, a sample index, a sequencing primer binding site or the reverse complement thereof and any combination thereof. In some embodiments, a loop strand of a second adaptor arm includes a binding site, a primer binding site or the reverse complement thereof, a molecular index, and a sample index. In some embodiments, a loop strand of a second adaptor arm includes a binding site, a molecular index, and a sample index. In some embodiments, a loop strand of a second adaptor arm includes a binding site and a sample index. In some embodiments, a loop strand of a second adaptor arm includes a binding site and a molecular index.

[0110] In some embodiments, a loop strand of a second adaptor arm includes a primer binding site or the reverse complement thereof, a molecular index, and a sample index. In some embodiments, a loop strand of a second adaptor arm includes a primer binding site or the reverse complement thereof, a molecular index, and a sequencing primer binding site or the reverse complement thereof. In some embodiments, a loop strand of a second adaptor arm includes a primer binding site or the reverse complement thereof, a sample index, and a sequencing primer binding site or the reverse complement thereof. In some embodiments, a loop strand of a second adaptor arm includes a primer binding site or the reverse complement thereof and a sample index. In some embodiments, a loop strand of a second adaptor arm includes a primer binding site or the reverse complement thereof and a molecular index. In some embodiments, a loop strand of a second adaptor arm includes a primer binding site or the reverse complement thereof and a sequencing primer binding site or the reverse complement thereof. In many embodiments, the primer binding site is an amplification binding site. In certain embodiments, the primer binding site is a sample insert reading primer binding site. In some embodiments, the primer binding site is a sequencing primer binding site.

[0111] In some embodiments, a first and/or second adaptor arm includes a primer binding site or the reverse complement thereof and a sequencing primer binding site or the reverse complement thereof. In some embodiments, a first and/or second arm includes a sequencing primer binding site or the reverse complement thereof. In some embodiments, a first and/or second adaptor arm includes a primer binding site or the reverse complement thereof. In some embodiments, a primer binding site or the reverse complement thereof is an amplification primer binding site or the reverse complement thereof. In some embodiments, a primer binding site or the reverse complement thereof is a primer binding site of an initial extension primer or the reverse complement thereof. In other embodiments, the primer binding site is a sample insert reading primer binding site. In many instances, the first and second adaptor arms include different primer binding sites.

[0112] In some instances, the primer binding site is identical to at least a portion of a primer sequence. In some instances, the primer binding site is the reverse complement of a primer sequence.

[0113] In some embodiments, a first and/or second adaptor arm includes a binding site such as a binding site for a splint oligonucleotide or a portion thereof.

[0114] In various embodiments, the binding site is at least 3, e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 or more nucleotides in length. In some embodiments, the binding site is at least 6, e.g., 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 or more nucleotides in length. In many embodiments, the binding site is at least 10, e.g., 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 or more nucleotides in length. In various embodiments, the binding site is at least 12, e.g., 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 or more nucleotides in length.

[0115] In some embodiments, a first and/or second adaptor region of a loop strand includes at least one unique molecular index. In some embodiments, a first and/or second adaptor region of a loop strand includes at least one sample index. In some cases, sample index sequences can be used in a dual sample index format. In some embodiments, a first and second adaptor region of a loop strand each include a portion of a tag sequence that can be combined to function as dual sample indexes (e.g., combinatorial sample indexes).

[0116] In one embodiment, combinatorial sample indexes include a click chemistry conjugate (e.g., a click chemistry spacer). A click chemistry spacer can be attached to two adaptor arms to form the hairpin loop. This could be performed by ligation of the two adaptor arms, by conjugation such as click chemistry (e.g., copper click chemistry and strain promoted click chemistry), or any other suitable bioconjugation method known in the art. The conjugation may form the spacer. In some methods for combinatorial synthesis, the combinatorial indexes are arranged in a row by column format, such that each row and column has a specific sample index and the intersecting wells have a unique combination of two sample indexes.
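The row-by-column arrangement described above can be sketched as a simple plate map in which each well receives the index of its row and the index of its column. The function name, the well-naming scheme (letter for row, number for column), and the index sequences below are hypothetical and chosen only for illustration.

```python
from itertools import product

def combinatorial_plate(row_indexes, column_indexes):
    """Map each well (e.g., 'B3') to its (row index, column index) pair."""
    plate = {}
    for (r, row_idx), (c, col_idx) in product(
        enumerate(row_indexes), enumerate(column_indexes)
    ):
        well = f"{chr(ord('A') + r)}{c + 1}"  # rows lettered, columns numbered
        plate[well] = (row_idx, col_idx)
    return plate

rows = ["ACGTAC", "TGCATG"]             # hypothetical row sample indexes
cols = ["GATTAC", "CCGGAA", "TTAGGC"]   # hypothetical column sample indexes
plate = combinatorial_plate(rows, cols)
for well, pair in plate.items():
    print(well, pair)
```

With distinct row indexes and distinct column indexes, R rows and C columns yield R x C unique pairs from only R + C index sequences, which is the practical appeal of a combinatorial format.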

C. Stem Domains

[0117] Described is a stem domain of a blocked asymmetric hairpin adaptor. A stem domain includes a stem strand of the first adaptor arm (e.g., a single-stranded polynucleotide sequence located at the 5’ end of the first adaptor arm) and a stem strand of the second adaptor arm (e.g., a single-stranded polynucleotide sequence located at the 3’ end of the second adaptor arm) such that the two strands can hybridize to form a double stranded nucleic acid molecule (see, FIG. 7). In other words, the sequence of the first stem strand (e.g., first stem sequence) is the reverse complement of the second stem strand (e.g., second stem sequence).
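The reverse-complement relationship between the two stem strands can be checked with a few lines of code. This is a minimal sketch assuming a blunt-ended stem with strands of equal length and no overhang or poly(T) tail; the function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def stems_can_pair(first_stem, second_stem):
    """True if the first stem strand is the reverse complement of the second."""
    return first_stem.upper() == reverse_complement(second_stem).upper()

print(stems_can_pair("GCATCAG", "CTGATGC"))  # complementary pair
print(stems_can_pair("GCATCAG", "GCATCAG"))  # identical strands do not pair
```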

[0118] The two stem strands of a blocked asymmetric hairpin adaptor can hybridize under annealing conditions. In some embodiments, an annealing condition includes a condition in which an adaptor is ligated to an end of a dsDNA fragment. In some embodiments, an annealing condition includes a condition in which two adaptors are ligated to opposite ends of a dsDNA fragment.

[0119] In some embodiments, the first and second stem sequences are each at least 3 nucleotides, e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides. In some embodiments, the first and second stem sequences are each at least 5 nucleotides, e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides. In some embodiments, the first and second stem sequences are each at least 7 nucleotides, e.g., 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides. In some embodiments, the first and second stem sequences are each at least 10 nucleotides, e.g., 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides. In some embodiments, the first and second stem sequences are each about 7-20 nucleotides, e.g., 7-20, 7-19, 7-18, 7-17, 7-16, 7-15, 7-14, 7-13, 7-10, 8-20, 10-20, 10-19, 12-20, 14-20, 15-20, and 16-20 nucleotides and the like. The lengths of the first and second stem sequences can be the same. The lengths of the first and second stem sequences can be different.

[0120] In some embodiments, the stem domain formed by the first and second stem sequences is at least 3 nucleotides, e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides. In some embodiments, the stem domain is at least 5 nucleotides, e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides. In some embodiments, the stem domain is at least 7 nucleotides, e.g., 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides. In some embodiments, the stem domain is at least 10 nucleotides, e.g., 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides. In some embodiments, the stem domain is about 7-20 nucleotides, e.g., 7-20, 7-19, 7-18, 7-17, 7-16, 7-15, 7-14, 7-13, 7-10, 8-20, 10-20, 10-19, 12-20, 14-20, 15-20, and 16-20 nucleotides and the like.

[0121] In some instances, the first and second stem sequences hybridize and form a blunt end. In certain instances, the first and second stem sequences hybridize and form an overhang end. In certain instances, the stem domain includes a poly(T) tail. In some embodiments, the first stem sequence has a poly(T) end. In some embodiments, the second stem sequence has a poly(T) end. The poly(T) tail of the blocked asymmetric hairpin adaptor is useful for TA ligation to dsDNA fragments for generating a template library.

[0122] In some embodiments, the first and/or second stem sequence includes a primer binding site or the reverse complement thereof. In some embodiments, the first and/or second stem sequence includes a sequencing primer binding site or the reverse complement thereof. In some embodiments, the first and/or second stem sequence includes a sample index or a portion thereof (e.g., sample identifier sequence). The sample index can span across a stem sequence and a sequence of a loop strand. In some embodiments, the first and/or second stem sequence includes a molecular index or a portion thereof (e.g., molecular identifier sequence). The molecular index can span across a stem sequence and a sequence of a loop strand.

[0123] In some embodiments, a first and/or second adaptor arm includes one or more tag sequences. In some embodiments, a first and/or second adaptor arm includes a tag sequence. In some embodiments, a first and/or second adaptor arm includes one or more sample index sequences. In some embodiments, a first and/or second adaptor arm includes a sample index sequence. In some embodiments, a first and/or second adaptor arm includes one or more molecular index sequences. In some embodiments, a first and/or second adaptor arm includes a molecular index sequence.

[0124] In some embodiments, a first and/or second loop strand includes one or more tag sequences or portions thereof. In some embodiments, a first and/or second loop strand includes a tag sequence or a portion thereof. In some embodiments, a first and/or second loop strand includes one or more sample index sequences or portions thereof. In some embodiments, a first and/or second loop strand includes a sample index sequence or a portion thereof. In some embodiments, a first and/or second loop strand includes one or more molecular index sequences or portions thereof. In some embodiments, a first and/or second loop strand includes a molecular index sequence or a portion thereof.

[0125] In some embodiments, a first and/or second stem strand includes one or more tag sequences or portions thereof. In some embodiments, a first and/or second stem strand includes a tag sequence or a portion thereof. In some embodiments, a first and/or second stem strand includes one or more sample index sequences or portions thereof. In some embodiments, a first and/or second stem strand includes a sample index sequence or a portion thereof. In some embodiments, a first and/or second stem strand includes one or more molecular index sequences or portions thereof. In some embodiments, a first and/or second stem strand includes a molecular index sequence or a portion thereof.

[0126] In some embodiments, an exemplary blocked asymmetric hairpin adaptor comprises one or more of the following components from 5’ to 3’: a primer binding site (or the reverse complement thereof), a sample index sequence, a molecular index, and a binding site such as, but not limited to, a primer binding site (or the reverse complement thereof), a spacer, another primer binding site (or the reverse complement thereof), another sample index sequence, another molecular index, and a primer binding site (or the reverse complement thereof). In some instances, such adaptors are used with an untailed primer for downstream extension and/or PCR amplification methods.

D. Exemplary Embodiments of Adaptors

[0127] In some embodiments, two blocked asymmetric hairpin adaptors, each comprising the following components from 5’ to 3’: an “SP2'” primer binding site, a sample index (e.g., “SID2'”), a molecular index (e.g., “UMI2'”), a “P1'” primer binding site, a spacer, an “A” primer binding site, a sample index (e.g., “SID1”), a molecular index (e.g., “UMI1”), an “SP1” primer binding site, and a poly(T) tail, are added to opposite ends of a dsDNA template fragment to generate a circularized adaptor-target sequence construct. FIG. 8 shows an exemplary blocked asymmetric hairpin adaptor. Forward and reverse untailed primers can be used with such circularized adaptor-target sequence constructs in extension and amplification methods to generate single-stranded linear target sequence constructs. In some embodiments, the forward tailed primer includes from 5’ to 3’: a “P1” primer sequence and the reverse tailed primer includes from 5’ to 3’: an “A” primer. In some embodiments, a PCR amplification primer set comprising an “A” primer and a “P1” primer is used. For downstream sequencing applications, the sequencing primer can include an “SP1” primer, an “SP2'” primer, an “SP1” primer or an “SP2” primer depending on the strand of the original template insert.

[0128] In some embodiments, an exemplary blocked asymmetric hairpin adaptor comprises one or more of the following components from 5’ to 3’: a primer binding site (or the reverse complement thereof), a spacer, and another primer binding site (or the reverse complement thereof). In some instances, such adaptors are used with tailed primers for downstream extension and/or PCR amplification methods. In some embodiments, two blocked asymmetric hairpin adaptors, each comprising the following components from 5’ to 3’: an “SP2'” primer binding site, a spacer, and an “SP1” primer binding site, are added to opposite ends of a dsDNA template fragment to generate circularized adaptor-target sequence constructs. FIG. 9 shows an exemplary blocked asymmetric hairpin adaptor that includes neither a sample index nor a molecular index. Forward and reverse tailed primers can be used with such circularized adaptor-target sequence constructs in extension and amplification methods to generate single-stranded linear target sequence constructs. In some embodiments, a forward tailed primer includes from 5’ to 3’: a primer sequence (“P1”), a sample index and a sequencing primer sequence (“SP2”). In some embodiments, a reverse tailed primer includes from 5’ to 3’: a primer sequence (“A”), a sample index and a sequencing primer sequence (“SP1”). In some embodiments, a forward tailed primer includes from 5’ to 3’: a “P1” primer sequence and a reverse tailed primer includes from 5’ to 3’: an “A” primer. In some embodiments, a PCR amplification primer set comprising an “A” primer and a “P1” primer is used. For downstream sequencing applications, the sequencing primer can include an “SP1” primer, an “SP2” primer, an “SP1” primer or an “SP2” primer depending on the strand of the original template insert.
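The 5’ to 3’ layout of the tailed primers described above (amplification primer sequence, then sample index, then sequencing primer sequence) can be represented as a small data structure. This is a sketch only: the class name, field names, and all sequences are hypothetical placeholders, not sequences from the application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TailedPrimer:
    """Segments of a tailed primer, listed in 5'->3' order."""
    amplification: str  # e.g., the "P1" or "A" primer sequence
    sample_index: str
    sequencing: str     # e.g., the "SP2" or "SP1" sequencing primer sequence

    def sequence(self):
        """Concatenate the segments 5'->3'."""
        return self.amplification + self.sample_index + self.sequencing

# Placeholder sequences standing in for "P1", "A", "SP2" and "SP1".
P1, A = "CCACTACGCC", "CCATCTCATC"
SP2, SP1 = "GTGACTGGAG", "ACACGACGCT"

forward = TailedPrimer(P1, "ACGTAC", SP2)
reverse = TailedPrimer(A, "TGCATG", SP1)
print(forward.sequence())
print(reverse.sequence())
```

Because the adaptor in this design carries no sample index, changing only the `sample_index` field yields a set of primers that assign samples at the extension or amplification step.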

III. Methods of Using Asymmetric Hairpin Adaptor-Tagged Nucleic Acids to Generate Single-Stranded Template Sequence Constructs

A. Producing Circularized Adaptor-Target Sequence Constructs

[0129] Described herein is a method of generating circularized adaptor-target sequence constructs from a nucleic acid sample. In some embodiments, the nucleic acid sample comprises the entire genome of an organism. In some embodiments, the nucleic acid sample comprises a portion of the entire genome of an organism. In some aspects, a set of loci are selected to be enriched, e.g., where the set of loci are structurally or functionally related. Such nucleic acid molecules of the sample can comprise both natural and non-natural, artificial, or non-canonical nucleotides including, but not limited to, DNA, RNA, LNA (locked nucleic acid), PNA (peptide nucleic acid), morpholino nucleic acid, glycol nucleic acid, threose nucleic acid, and mimetics and combinations thereof. The starting population of nucleic acids can be from any source, e.g., a whole genome, a collection of chromosomes, a single chromosome, or one or more regions from one or more chromosomes. It can be derived from cloned DNA (e.g., BACs, YACs, PACs, etc.), cDNA, or amplified DNA (by PCR, whole genome amplification, e.g., using Phi29 polymerase). In some embodiments, the nucleic acid sample includes genomic DNA, synthetic DNA, amplified DNA, complementary DNA (cDNA), and the like.

[0130] The nucleic acid sample can include fragmented dsDNA molecules. In some embodiments, a sample, e.g., a DNA sample, is fragmented by way of mechanical, enzymatic, chemical, or electrochemical cleavage or shearing to produce a plurality of fragmented dsDNA molecules. Non-limiting examples of methods of generating nucleic acid fragments from a genomic DNA, a cDNA, or a DNA concatemer include mechanical methods, such as sonication, mechanical shearing, nebulization, hydroshearing, and the like; enzymatic methods, such as exonuclease digestion, restriction endonuclease digestion, and the like; chemical methods, such as treatment with hydroxyl radicals, Cu(II):thiol combinations, diazonium salts, and the like; and electrochemical cleavage. In some embodiments, the plurality of fragmented dsDNA molecules is produced by target-specific pre-amplification, such as PCR or LCR amplification of target nucleic acids. The fragmented dsDNA molecules can undergo end repair. In some instances, the fragmented dsDNA undergoes end repair in a reaction containing T4 DNA polymerase, T4 polynucleotide kinase, and dNTPs. The fragmented dsDNA molecules can undergo A-tailing. In some embodiments, the fragmented dsDNA molecules undergo end repair followed by A-tailing. As will be recognized by those skilled in the art, A-tailing includes adding one or more adenine bases to a dsDNA molecule to form 3’ overhangs.

[0131] The fragmented dsDNA molecules can have blunt ends. In some instances, the fragmented dsDNA molecules have overhang ends that are complementary to the overhang ends of a blocked asymmetric hairpin adaptor. In some embodiments, the fragmented dsDNA molecules have poly(A) tails that are complementary to the poly(T) tails of blocked asymmetric hairpin adaptors.

[0132] Any of the blocked asymmetric hairpin adaptors described can be ligated under suitable conditions to the fragmented dsDNA molecules. In some embodiments, the adaptors and the dsDNA molecules are contacted together under a condition that facilitates the formation of a stem domain via hybridization of the stem sequence of the first arm and the stem sequence of the second arm of the adaptor. The resulting double-stranded stem domains of the adaptors can be ligated to dsDNA molecules using a ligase. An exemplary embodiment of a circularized adaptor-target sequence construct is shown in FIG. 1B.

[0133] In some embodiments, provided is a double-stranded nucleic acid molecule comprising (a) a sense strand with a 5’ terminal end (e.g., a 5’ end of the sense strand), an intermediate portion (e.g., an intermediate portion of the sense strand), and a 3’ terminal end (e.g., a 3’ end of the sense strand) and (b) an antisense strand with a 5’ terminal end (e.g., a 5’ end of the antisense strand), an intermediate portion (e.g., an intermediate portion of the antisense strand), and a 3’ terminal end (e.g., a 3’ end of the antisense strand). A first single-stranded asymmetric hairpin (e.g., first stem-loop where the loop includes an asymmetric sequence) adaptor can be ligated to the double-stranded nucleic acid molecule such that the 3’ terminal end of the hairpin adaptor is connected to the 5’ terminal end of the sense strand of the double-stranded nucleic acid molecule and the 5’ end of the hairpin adaptor is connected to the 3’ terminal end of the antisense strand. A second asymmetric hairpin adaptor can be ligated to the same double-stranded nucleic acid molecule such that the 5’ terminal end of the second hairpin adaptor is connected to the 3’ terminal end of the sense strand of the same double-stranded nucleic acid molecule and the 3’ end of the hairpin adaptor is connected to the 5’ terminal end of the antisense strand. The resulting nucleic acid molecule comprises the double-stranded nucleic acid molecule with both the first hairpin adaptor and the second hairpin adaptor ligated thereto.

[0134] In some embodiments, described herein is a single template nucleic acid molecule comprising a duplex region (e.g., a dsDNA template); a first asymmetric hairpin adaptor linking the terminus of the first end of the duplex region; and a second asymmetric hairpin adaptor linking the terminus of the second end of the duplex region. In some instances, a region (e.g., a stem or portion thereof) of the first hairpin adaptor is complementary to a region of the second hairpin adaptor (e.g., a stem or portion thereof). In some embodiments, the duplex region can be separated or melted apart to transform the single template nucleic acid molecule into a topologically single-stranded, circular nucleic acid molecule. In some embodiments, the first asymmetric hairpin adaptor and second asymmetric hairpin adaptor are identical. In some embodiments, the first and second hairpin adaptors comprise regions that hybridize to one another prior to or during the ligation to the duplex region.
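
The connectivity described in the two preceding paragraphs can be sketched in Python as a minimal illustration. Reading the closed molecule 5’ to 3’, the first adaptor is followed by the sense strand, then the second adaptor, then the antisense strand, which joins back to the first adaptor. The function names and the short sequences are hypothetical placeholders; real adaptors carry stem, spacer, and primer binding regions that are not modeled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def circularized_construct(sense: str, adaptor1: str, adaptor2: str) -> str:
    """Return one linear reading (5' to 3') of the closed circle.

    Order: adaptor1 -> sense strand -> adaptor2 -> antisense strand.
    The 3' end of the returned string joins back to its 5' end.
    """
    antisense = reverse_complement(sense)
    return adaptor1 + sense + adaptor2 + antisense


# Placeholder insert and adaptor sequences (assumptions).
insert_sense = "ATGGCC"
hairpin = "TTTT"
circle = circularized_construct(insert_sense, hairpin, hairpin)
```

Once the duplex is melted, the whole string is a single continuous strand, which is why the construct is topologically single-stranded and circular.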

B. Generating Single-Stranded, Linear Target Sequence Constructs

[0135] Described herein are methods of generating single-stranded, linear target sequence constructs from circularized adaptor-target sequence constructs.

[0136] In some embodiments, the plurality of circularized adaptor-target sequence constructs are contacted with a primer and a strand-displacing polymerase to produce a plurality of linear target sequence constructs. In some instances, the primer comprises a sequence that is complementary to a binding site, e.g., a primer binding site, on the blocked asymmetric hairpin adaptor. In some embodiments, the primer allows for primer extension of the circularized adaptor-target sequence construct to generate a linear target sequence construct. In some instances, the extension is from the second adaptor arm (e.g., “A” adaptor arm of FIG. 1B). In some embodiments, sequencing methods are not performed during the primer extension step. Primer extension can be a first step of linear amplification. In some embodiments, the single-stranded, linear target sequence construct undergoes linear amplification using one of the primers described herein. In other cases, primer extension can be a first step of PCR. In some embodiments, amplification is performed on the linear target sequence constructs using suitable primers for polymerase chain reaction (PCR).

[0137] In various embodiments, the circularized adaptor-target sequence constructs undergo amplification by contacting the constructs with a reaction mixture comprising primers and a thermostable polymerase. In some instances, the primers are suitable primers for PCR. In certain embodiments, one primer comprises a sequence that is complementary to a primer binding site of the blocked asymmetric hairpin adaptor and another different primer comprises a sequence that is complementary to a different primer binding site of the adaptor. In some embodiments, the reaction mixture includes a strand-displacing polymerase. In some instances, amplification is performed in combination with primer extension. In some instances, amplification is performed without primer extension. In some instances, one of the primers is a tailed primer comprising at least one tag sequence, e.g., a sample index sequence or a molecular index sequence. In other instances, each of the primers is a tailed primer comprising at least one tag sequence, e.g. a sample index sequence or a molecular index sequence. A tailed primer comprising a tag sequence may further comprise a primer binding site from which a sequencing primer can read the tag sequence. In some embodiments, at least one tag sequence is at the portion of the tailed primer that is not complementary to the nucleic acid template for amplification, such as at the 5’ end of the primer. In some embodiments, at least one tag sequence is at the portion of the tailed primer that is complementary to the nucleic acid template for amplification.

[0138] If a blocked asymmetric hairpin adaptor does not include a sample index, a tailed primer that includes at least a sample index sequence can be used in an initial extension. In some embodiments, two tailed primers, each including a sample index, are used when performing amplification by PCR. Such methods are useful for generating linear target sequence constructs comprising a sample index. Any of the linear target sequence constructs described can include a sample index.

[0139] In some embodiments, a set of linear target sequence constructs include a unique sample index such that the constructs can be combined with another set of linear target sequence constructs having a different sample index. In some instances, one set of the linear target sequence constructs are from a sample, and the other set of linear target sequence constructs are from a different sample. The combination of sets of linear target sequence constructs can be performed prior to circularization for rolling circle amplification (RCA). The combination of sets of linear target sequence constructs can be performed prior to clustering or concatemer synthesis. The combination of sets of linear target sequence constructs can be performed prior to sequencing.
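
As a minimal sketch of why a unique sample index permits pooling, the following Python example assigns pooled reads back to their samples. It assumes, purely for illustration, that the index occupies the first bases of each read and that matching is exact; the function name, index sequences, and reads are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample, index_len):
    """Group reads by a sample index assumed to sit at the start of each read.

    Reads whose index is not recognized are placed under "unassigned".
    The index is trimmed from each read before it is stored.
    """
    bins = defaultdict(list)
    for read in reads:
        index, insert = read[:index_len], read[index_len:]
        sample = index_to_sample.get(index, "unassigned")
        bins[sample].append(insert)
    return dict(bins)


# Placeholder pooled reads and index table (assumptions).
pooled_reads = ["AACCGGATTA", "GGTTCCATAG", "AACCTTGACA", "TTTTACGTAC"]
index_table = {"AACC": "sample_1", "GGTT": "sample_2"}
by_sample = demultiplex(pooled_reads, index_table, index_len=4)
```

A practical pipeline would typically tolerate index mismatches; exact matching is used here only to keep the sketch short.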

[0140] In one aspect, provided herein is a method of using primer extension to generate ssDNA library elements from circularized adaptor-target sequence constructs containing blocked asymmetric hairpin adaptors. See, for example, FIG. 4. After formation of circularized adaptor-target sequence constructs, primers (e.g., those that are reverse complementary to the B' adaptor region) are hybridized to the B' adaptor region of the circularized adaptor-target sequence constructs and extended to create separate primer extension products from the two constituent ssDNA library elements. In some instances, the separate extension products serve as separate templates for clustering and short read sequencing. In some instances, the separate extension products are amplified by PCR prior to clustering and sequencing. In some embodiments, primer extension across the A/B junction in the asymmetric hairpin adaptors does not produce primer extension products spanning both insert strands.

[0141] Certain polymerases used in the primer extension reaction can be helpful for the efficient and controlled generation of separate ssDNA library elements from short insert template structures. In some embodiments, the processivity, strand displacement, and proofreading activity of phi29 polymerase facilitate efficient high fidelity primer extension. In some instances, phi29 primer extension products are separated by denaturation and then serve directly as substrates for clustering and sequencing. In some instances, PCR thermocycling is performed using a thermostable proofreading polymerase to generate primer extension products for clustering and sequencing. In some instances, a combined reaction containing both phi29 and a thermostable polymerase is performed. In some embodiments, an initial incubation step at 25-37°C is permissive for phi29 activity for an initial primer extension. In some instances, subsequent PCR thermocycling that includes a denaturing step can inactivate phi29 and also accomplish PCR amplification. An example of an integrated library prep workflow incorporating this exemplary method is presented in FIG. 4.

C. Generating Circularized Target Sequence Constructs

[0142] Described herein are methods of generating circularized target sequence constructs from single-stranded, linear target sequence constructs. The linear target sequence constructs of the previous section can be circularized and used as nucleic acid templates in various sequencing methods. Circularization of the target sequence can be achieved using a splint oligonucleotide. A splint oligonucleotide is complementary to both ends of a single-stranded, linear fragment. In some embodiments, a splint oligonucleotide binds (e.g., hybridizes) to a binding site at or around one end of the linear fragment and a different binding site at or around the opposite end of the linear fragment. In some instances, a sequence at the 5’ end of a splint oligonucleotide binds to a sequence at one end of the linear fragment and a sequence at the 3’ end of the splint binds to a sequence at the opposite end of the linear fragment. In some embodiments, a splint oligonucleotide is contacted with a single-stranded linear target sequence construct under conditions such that the splint oligonucleotide hybridizes to the 5’ and 3’ ends of a linear single-stranded target sequence construct, the ends of which can then be ligated together to form a circularized, single-stranded target sequence construct. A splint oligonucleotide is complementary to both ends of the linear, single-stranded target sequence. The splint oligonucleotide brings the ends of the single-stranded fragment together so they can be ligated using a double-strand-specific ligase (e.g., T4 or Taq ligase) to generate a single-stranded, circular molecule.
It will be understood that where annealing of a splint oligonucleotide to the ends of a single-stranded nucleic acid results in a gap between the 3'- and 5'-termini, a gap-filling operation is carried out using an appropriate polymerase enzyme (e.g., T4 DNA polymerase) prior to the ligation step, and such gap-filling methods are known to those of ordinary skill in the art. Descriptions of producing a single-stranded, circular molecule from a single-stranded, linear molecule can be found in, for example, U.S. Pat. App. Pub. No. 2012/0196279.
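
The relationship between a splint and the two ends of a linear construct can be sketched in Python as follows. This is an illustration under stated assumptions: the ends abut with no gap, both splint arms have the same length, and the function name `design_splint` and the example sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_splint(linear: str, arm_len: int) -> str:
    """Return a splint (5' to 3') that bridges both ends of `linear`.

    After circularization the junction reads, 5' to 3', as the last
    `arm_len` bases of the linear strand followed by its first `arm_len`
    bases. The splint is the reverse complement of that junction, so its
    5' arm pairs with the 5' end of the strand and its 3' arm pairs with
    the 3' end of the strand.
    """
    if arm_len < 1 or len(linear) < 2 * arm_len:
        raise ValueError("linear construct is too short for the requested arms")
    head = linear[:arm_len]    # 5' end of the linear strand
    tail = linear[-arm_len:]   # 3' end of the linear strand
    return reverse_complement(tail + head)


# Placeholder linear construct (assumption).
linear_construct = "AACCGGTTAC"
splint = design_splint(linear_construct, arm_len=3)
```

If annealing would leave a gap between the termini, the junction string would also need the gap-filling bases, which this sketch does not model.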

IV. Methods of Using the Circularized Target Sequence Constructs

[0143] Described herein are methods for processing circularized target sequence (e.g., template sequence) constructs. In some embodiments, the constructs are sequenced by standard sequencing methods including but not limited to cyclic sequencing or single molecule real-time sequencing. In some embodiments, the circularized target sequence constructs are processed to form concatemers before sequencing.

A. Rolling Circle Amplification (RCA)

[0144] In some embodiments, concatemers of the circularized target sequence constructs can be formed by way of rolling circle amplification (RCA). RCA can be performed using methods known in the art including, for example, those described in Lizardi et al., Nat. Genet., 19:225-232 (1998). Generally, the method involves a polymerase, such as, for example, a Φ29 (phi29) DNA polymerase. The polymerase extends an RCA primer that is annealed to the circular nucleic acid template such that the polymerase laps around the circular template multiple times, thereby producing a concatemeric single-stranded nucleic acid construct (e.g., a “sense” strand) that contains multiple tandem repeats, each of which is complementary to the circular nucleic acid template. Typically, RCA is performed initially in the presence of a low concentration of a polymer, such as dendrimers (e.g., polyamidoamine (PAMAM)), and subsequently in the presence of a polymer. In one configuration, an RCA reaction is stopped by denaturing the polymerase, for example, by heating the sample at 60°C, 65°C, 70°C, 75°C, 80°C, or more. In one configuration, an RCA reaction is stopped by removing one or more components of the amplification, such as the polymerase and/or dNTPs. Components of RCA can be removed by, for example, washing.
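
As an idealized sketch of the tandem-repeat structure described above, the following Python example builds an RCA product as repeated copies of the complement of a circular template. It ignores where on the circle the primer anneals (which only rotates the repeat unit) and assumes error-free synthesis; the function name and template sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def rolling_circle_product(circle: str, laps: int) -> str:
    """Return an idealized concatemer after `laps` trips around `circle`.

    `circle` is one linear reading (5' to 3') of the circular template.
    Each lap appends one copy of its reverse complement to the product.
    """
    if laps < 1:
        raise ValueError("laps must be at least 1")
    return reverse_complement(circle) * laps


# Placeholder circular template (assumption).
template_circle = "ATGGCA"
concatemer = rolling_circle_product(template_circle, laps=3)
```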

[0145] In some embodiments, RCA includes using a strand-displacing polymerase and a primer, e.g., an RCA primer, that hybridizes to the circularized target sequence constructs. In some embodiments, the RCA primer is a splint oligonucleotide such as a splint oligonucleotide used to circularize a single-stranded, linear target construct. Descriptions of RCA and methods thereof can be found in, for example, US11,326,206, the disclosure of which is incorporated herein by reference.

[0146] The target sequence (e.g., template sequence) construct can be amplified via rolling circle amplification to generate a concatemer comprising multiple copies of the template that is subsequently sequenced to generate a sequencing read that is internally redundant. Concatemer synthesis (formation) can be performed in solution (“solution-phase synthesis”) or on a solid surface (“solid-phase synthesis”). Generally, stabilization of concatemers is not significantly affected by the method employed to synthesize the concatemer in need of stabilization. Thus, it is specifically contemplated that any of the aspects and embodiments of the methods of forming stabilized concatemers or any of the aspects and embodiments of the compositions of stabilized concatemers or any of the aspects and embodiments of the methods of use described herein can utilize concatemers provided from solution-phase or solid-phase synthesis.
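
One way the internal redundancy of such a read could be exploited is a per-position majority vote across the repeated copies. The Python sketch below is an assumption-laden illustration, not a method stated in this disclosure: it presumes the repeat unit length is known, that errors are substitutions only (no insertions or deletions), and it discards any trailing partial repeat. The function name and read are hypothetical.

```python
from collections import Counter


def consensus_from_repeats(read: str, unit_len: int) -> str:
    """Return a majority-vote consensus over complete tandem repeats.

    The read is cut into consecutive units of `unit_len` bases; the most
    common base at each position across units is reported.
    """
    n_units = len(read) // unit_len
    if unit_len < 1 or n_units == 0:
        raise ValueError("read must contain at least one complete repeat")
    units = [read[i * unit_len:(i + 1) * unit_len] for i in range(n_units)]
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*units))


# Placeholder read: three copies of a 6-base unit, the second with one error.
redundant_read = "ACGTAC" + "ACGAAC" + "ACGTAC"
consensus = consensus_from_repeats(redundant_read, unit_len=6)
```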

[0147] In some embodiments, the concatemers can be deposited on a surface such as a structured surface. In some embodiments, concatemers formed by solution-phase synthesis are deposited on a surface.

[0148] In some embodiments, the RCA primer is in solution. In some instances, the primer is a solution-phase primer.

[0149] In some embodiments, the RCA primer is immobilized or attached to a surface, e.g., a solid support. In some instances, the primer is a surface primer, i.e., is immobilized on the surface prior to extension in the presence of polymerase to form the RCA product. A splint oligonucleotide may circularize a linear construct prior to RCA production, such as through ligation of the linear construct as described herein. Suitable surfaces include, but are not limited to, a structured surface, a planar substrate, a hydrogel, a nanohole array, a microparticle, a nanoparticle, a flow cell surface, a surface of a solid support, or a surface of a solid support within a flow cell. The surface can be planar or curved. As discussed in further detail below, the solid support can be made from any of a variety of materials used for analytical biochemistry. Suitable materials may include, for example, glass, polymeric materials, silicon, quartz (fused silica), borofloat glass, silica, silica-based materials, carbon, metals, an optical fiber or bundle of optical fibers, sapphire, or plastic materials. The material can be selected based on properties desired for a particular use. For example, materials that are transparent to a desired wavelength of radiation are useful for analytical techniques that will utilize radiation of that wavelength. Conversely, it may be desirable to select a material that does not pass radiation of a certain wavelength (e.g., being opaque, absorptive, or reflective). Wavelength regions that may pass or not pass through a particular material include, for example, UV, VIS (e.g., red, yellow, green, or blue) or IR. Other properties of a material that can be exploited are inertness or reactivity to certain reagents used in a downstream process, such as those set forth herein, or ease of manipulation, or low cost of manufacture. Descriptions of methods of immobilizing an RCA primer and methods of using such can be found in, for example, U.S. Pat. App. Pub. No. US2007/0099208, WO2019/018366 and Lizardi et al., Nat. Genet., 19:225-232 (1998), each of which is incorporated herein by reference.

B. Sequencing Methods

[0150] Adaptors provided herein are useful for sequencing technologies. Any sequencing method can be performed using constructs that include any of the asymmetric hairpin adaptors described herein including, but not limited to, concatemer clusters as described. For instance, the sequencing method can be cyclic sequencing or single molecule real-time (SMRT) sequencing. In some instances, cyclic sequencing includes, but is not limited to, sequencing-by-binding and sequencing-by-synthesis. Descriptions of SMRT sequencing can be found in, for example, WO2009/120372; WO2010/117470; Eid et al., Science, 2009, 323, 133-138; and Travers et al., Nucleic Acids Res, 2010, 38, e159, each of which is incorporated herein by reference.

[0151] Suitable sequencing processes for use in the provided methods include, but are not limited to, sequencing by binding, sequencing by synthesis (sequencing by incorporation), pH-based sequencing, sequencing by polymerase monitoring, sequencing by hybridization, and other methods of massively parallel sequencing or next-generation sequencing. Suitable surfaces for carrying out sequencing include, but are not limited to, a planar substrate, a hydrogel, a nanohole array, a microparticle, or a nanoparticle. Exemplary sequencing platforms including methods, reagents and solid-phase surfaces are set forth below and in the cited references.

[0152] Sequencing-by-binding (SBB) includes a sequencing technique wherein specific binding of a polymerase and cognate nucleotide to a primed template nucleic acid is used for identifying the next correct nucleotide to be incorporated into the primer strand of the primed template nucleic acid. The specific binding interaction need not result in chemical incorporation of the nucleotide into the primer. The specific binding interaction can precede chemical incorporation of the nucleotide into the primer strand or precede chemical incorporation of an analogous, next correct nucleotide into the primer. Thus, identification of the next correct nucleotide can take place without incorporation of the next correct nucleotide.

[0153] Some SBS embodiments include detection of a proton released upon incorporation of a nucleotide into an extension product. For example, sequencing based on detection of released protons can use reagents and an electrical detector that are commercially available from Thermo Fisher (Waltham, MA) or described in U.S. Pat. App. Pub. Nos. 2009/0026082, 2009/0127589, 2010/0137143, and 2010/0282617, each of which is incorporated by reference.

[0154] Other sequencing procedures can be used, such as pyrosequencing. Pyrosequencing detects the release of inorganic pyrophosphate (PPi) as particular nucleotides are incorporated into a nascent primer hybridized to a template nucleic acid strand. See, e.g., Ronaghi, et al., Analytical Biochemistry 242 (1), 84-9 (1996); Ronaghi, Genome Res. 11 (1), 3-11 (2001); Ronaghi et al. Science 281 (5375), 363 (1998); and U.S. Patent Nos. 6,210,891, 6,258,568, and 6,274,320, each of which is incorporated herein by reference. In pyrosequencing, released PPi can be detected by being converted to adenosine triphosphate (ATP) by ATP sulfurylase, and the resulting ATP can be detected via luciferase-produced photons. Thus, the sequencing reaction can be monitored via a luminescence detection system.
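
The detection principle described above can be illustrated with an idealized flowgram simulation in Python. The sketch assumes that one nucleotide type is dispensed per flow in a fixed cyclic order and that the light signal is exactly proportional to the number of bases incorporated in that flow; both are simplifying assumptions, and the function names, flow order, and template are hypothetical.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flowgram(template_3to5: str, flow_order: str = "TACG",
                      max_flows: int = 100):
    """Return a list of (nucleotide, signal) pairs, one per flow.

    `template_3to5` lists template bases in the order the polymerase
    encounters them. The signal for a flow counts how many consecutive
    template positions accept the dispensed nucleotide, standing in for
    the amount of pyrophosphate released.
    """
    position = 0
    flows = []
    for i in range(max_flows):
        if position >= len(template_3to5):
            break
        nucleotide = flow_order[i % len(flow_order)]
        signal = 0
        while (position < len(template_3to5)
               and _COMPLEMENT[template_3to5[position]] == nucleotide):
            signal += 1
            position += 1
        flows.append((nucleotide, signal))
    return flows


def call_sequence(flows) -> str:
    """Convert a flowgram back into the synthesized strand (5' to 3')."""
    return "".join(nucleotide * signal for nucleotide, signal in flows)


# Placeholder template (assumption), read in the polymerase's direction.
flowgram = simulate_flowgram("AACGT")
called = call_sequence(flowgram)
```

Flows with a signal of zero correspond to dispensed nucleotides that were not incorporated, so no light would be expected for them in this idealized model.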

[0155] Sequencing-by-ligation reactions are also useful, including, for example, those described in Shendure et al. Science 309: 1728-1732 (2005) and U.S. Patent Nos. 5,599,675 and 5,750,341, each of which is incorporated herein by reference. Some embodiments can include sequencing-by-hybridization procedures as described, for example, in Bains et al., Journal of Theoretical Biology 135 (3), 303-7 (1988); Drmanac et al., Nature Biotechnology 16, 54-58 (1998); Fodor et al., Science 251 (4995), 767-773 (1995); and WO1989/10977, each of which is incorporated herein by reference. In both sequencing-by-ligation and sequencing-by-hybridization procedures, primers that are hybridized to nucleic acid templates are subjected to repeated cycles of extension by oligonucleotide ligation. Typically, the oligonucleotides are fluorescently labeled and can be detected to determine the sequence of the template.

[0156] Some embodiments can utilize methods involving real-time monitoring of DNA polymerase activity. For example, nucleotide incorporations can be detected through fluorescence resonance energy transfer (FRET) interactions between a fluorophore-bearing polymerase and gamma-phosphate-labeled nucleotides, or with zero mode waveguides (ZMWs). Techniques and reagents for sequencing via FRET and/or ZMW detection are described, for example, in Levene et al., Science, 299, 682-686 (2003); Lundquist et al., Opt. Lett., 33, 1026-1028 (2008); and Korlach et al., Proc. Natl. Acad. Sci. USA, 105, 1176-1181 (2008), the disclosures of which are incorporated herein by reference.

i. Sequencing-By-Binding

[0157] In one aspect, sequencing is performed with a sequencing by binding (SBB) technique. Exemplary particularly useful sequencing by binding reactions are described in U.S. Pat. Nos. 10,077,470, 10,443,098, and 10,975,427 and U.S. Pat. App. Pub. Nos. 20190119742 and 20180187245, each of which is incorporated by reference herein in its entirety. Generally, methods for determining the sequence of a template nucleic acid molecule through sequencing by binding can be based on formation of a ternary complex (between polymerase, primed nucleic acid, and cognate nucleotide) under specified conditions. The method can include an examination phase followed by a nucleotide incorporation phase.

[0158] The examination phase in a sequencing by binding procedure can be carried out in a flow cell having at least one template nucleic acid molecule (e.g., a concatemeric RCA product) primed with a primer; contacting the primed template nucleic acid molecule(s) with a first reaction mixture that includes a polymerase and at least one nucleotide type; observing the interaction of polymerase and a nucleotide with the primed template nucleic acid molecule(s), under conditions where the nucleotide is not covalently added to the primer(s); and identifying a next base in each template nucleic acid using the observed interaction of the polymerase and nucleotide with the primed template nucleic acid molecule(s). The interaction between the primed template, polymerase, and nucleotide can be detected in a variety of schemes. For example, the nucleotides can contain a detectable label. Each nucleotide can have a distinguishable label with respect to other nucleotides. Alternatively, some or all of the different nucleotide types can have the same label and the nucleotide types can be distinguished, e.g., based on separate deliveries of different nucleotide types to the flow cell. In some embodiments, the polymerase can be labeled. Polymerases that are associated with different nucleotide types can have unique labels that distinguish the type of nucleotide to which they are associated. Alternatively, polymerases can have similar labels and the different nucleotide types can be distinguished based on separate deliveries of different nucleotide types to the flow cell (e.g., delivering the labeled polymerase in combination with one or more unlabeled nucleotides at a time).

[0159] During the examination phase, discrimination between correct and incorrect nucleotides can be facilitated by ternary complex stabilization. A variety of conditions and reagents can be useful for ternary complex stabilization, e.g., by preventing incorporation of nucleotide and/or preventing dissociation of the ternary complex. For example, the primer can contain a reversible blocking moiety that prevents covalent attachment of nucleotide, cofactors that are required for extension (such as divalent metal ions) can be absent, inhibitory divalent cations that inhibit polymerase-based primer extension can be present, the polymerase that is present in the examination phase can have a chemical modification and/or mutation that inhibits primer extension, and/or the nucleotides can have chemical modifications that inhibit incorporation, such as 5' modifications that remove or alter the native triphosphate moiety.

[0160] The extension phase can then be carried out by creating conditions in the flow cell where a nucleotide can be added to the primer on each template nucleic acid molecule. In some embodiments, this involves removal of reagents used in the examination phase and replacing them with reagents that facilitate extension. For example, examination reagents can be replaced with a polymerase and nucleotide(s) that are capable of extension. Alternatively, one or more reagents can be added to the examination phase reaction to create extension conditions. For example, catalytic divalent cations can be added to an examination mixture that was deficient in the cations, and/or polymerase inhibitors can be removed or disabled, and/or extension competent nucleotides can be added, and/or a deblocking reagent can be added to render primer(s) extension competent, and/or extension competent polymerase can be added.

Optionally, the nucleotide that is enzymatically incorporated into the primer strand of the primed template nucleic acid molecule is different from the nucleotide used in the examination step to identify the next correct nucleotide.

[0161] Optionally, the polymerase used in the incorporation step is different from the polymerase used in the examination step. Optionally, the incorporated nucleotide is a reversible terminator nucleotide, where primer extension is limited to a single nucleotide incorporation prior to removal of a reversible terminator moiety. Thus, for embodiments employing reversible terminator nucleotides, a deblocking reagent can be delivered to a flow cell (before or after detection occurs). Washes can be carried out between the various delivery steps.

[0162] The above examination and extension phases can be carried out cyclically such that in each cycle a single next correct nucleotide is examined (i.e., the next correct nucleotide being a nucleotide that correctly binds to the nucleotide in a template nucleic acid that is located immediately 5' of the base in the template that is hybridized to the 3' end of the hybridized primer) and, subsequently, a single next correct nucleotide is added to the primer. Any number of cycles can be carried out including, for example, at least 1, 2, 5, 10, 20, 25, 30, 40, 50, 75, 100, 150 or more cycles. Alternatively or additionally, the number of cycles can be capped at no more than 150, 100, 75, 50, 40, 30, 25, 20, 10, 5, 2, or 1 cycles. This cyclical sequencing process produces a read of all or a portion of the template nucleic acid’s sequence.
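
The cyclic examination and extension phases described above can be sketched in Python as an idealized model. In each cycle the base call is made during examination, before anything is added to the primer, and the primer is then advanced by exactly one nucleotide. Ternary complex formation is reduced here to a simple complementarity check, which is an assumption; the function names and template are hypothetical.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def examine(template_base: str, nucleotide_types: str = "ACGT") -> str:
    """Examination phase: report which nucleotide type would form a stable
    ternary complex, without adding it to the primer."""
    for nucleotide in nucleotide_types:
        if _COMPLEMENT[template_base] == nucleotide:
            return nucleotide
    raise ValueError(f"no cognate nucleotide for template base {template_base!r}")


def sbb_read(template_3to5: str, cycles: int) -> str:
    """Run up to `cycles` examination/extension cycles and return the calls.

    `template_3to5` lists the template bases downstream of the primer in
    the order in which they are examined.
    """
    extended_by = 0   # number of bases added to the primer so far
    calls = []
    for _ in range(cycles):
        if extended_by >= len(template_3to5):
            break
        # Examination: identify the next correct nucleotide (no incorporation).
        calls.append(examine(template_3to5[extended_by]))
        # Extension: add a single nucleotide so the next position is exposed.
        extended_by += 1
    return "".join(calls)


read_three_cycles = sbb_read("TGCA", cycles=3)
read_capped_by_template = sbb_read("TGCA", cycles=10)
```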

[0163] A sequencing by synthesis (SBS) technique can also be used. This technique generally involves the enzymatic extension of a primer through the iterative addition of nucleotides against a template strand to which the primer is hybridized. Briefly, SBS can be initiated by contacting target nucleic acids, attached to features in a flow cell, with one or more labeled nucleotides, DNA polymerase, etc. Those features where a primer is extended using the target nucleic acid as template will incorporate a labeled nucleotide that can be detected. Optionally, the labeled nucleotides can further include a reversible termination property that terminates further primer extension once a nucleotide has been added to a primer. For example, a nucleotide analog having a reversible terminator moiety can be added to a primer so that subsequent extension cannot occur until a deblocking agent is delivered to remove the moiety. Thus, for embodiments that use reversible termination, a deblocking reagent can be delivered to the flow cell (before or after detection occurs). Washes can be carried out between the various delivery steps. The cycle can then be repeated. Exemplary SBS procedures, reagents and detection instruments that can be readily adapted for use with an array of concatemers in the methods of the present disclosure are described, for example, in Bentley et al., Nature 456:53-59 (2008), WO04/018497, WO91/06678, WO07/123744, U.S. Pat. Nos. 7,057,026, 7,329,492, 7,211,414, 7,315,019, and 7,405,281, and U.S. Pat. App. Pub. No. 2008/0108082, each of which is incorporated herein by reference. Also useful are SBS methods that are commercially available from Illumina, Inc. (San Diego, CA).

ii. Sequencing-By-Synthesis

[0164] Sequencing-by-synthesis (SBS) involves the enzymatic extension of a nascent primer through the iterative addition of nucleotides against a template strand to which the primer is hybridized. SBS has been described, for example, in U.S. Pat. Nos. 5,302,509 and 6,828,100, and U.S. Pat. App. Pub. No. 2009/0247414; the content of each is incorporated herein by reference in its entirety. SBS differs from SBB, above, in that labeled nucleotides are incorporated into the extending strand, assayed and then the label is removed or deactivated, and the 3’ block removed, to iteratively sequence a template. On the other hand, in SBB, a labeled base is not incorporated into an extending strand. Rather, ternary complex formation is assayed, usually for the presence of a labeled base but sometimes for the presence of a labeled polymerase or other feature, after which point the complex is disassembled and a 3’ blocked, unlabeled base is used to extend the primer strand.

[0165] Briefly, SBS can be initiated by contacting target nucleic acids, attached to sites in a flow cell, with one or more labeled nucleotides, DNA polymerase, etc. Those sites where a primer is extended using the target nucleic acid as template will incorporate a labeled nucleotide that can be detected. Detection can include scanning using an apparatus or method set forth herein. Optionally, the labeled nucleotides can further include a reversible termination property that terminates further primer extension once a nucleotide has been added to a primer. For example, a nucleotide analog having a reversible terminator moiety can be added to a primer such that subsequent extension cannot occur until a deblocking agent is delivered to remove the moiety. Thus, for embodiments that use reversible termination, a deblocking reagent can be delivered to the vessel (before or after detection occurs). Washes can be carried out between the various delivery steps. The cycle can be performed n times to extend the primer by n nucleotides, thereby detecting a sequence of length n.
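For illustration, the reversible-terminator cycle of paragraph [0165] can be modeled as a small state machine in Python. The `Site` class, its field names, and the choice to deblock after detection and to clear the label in the same step are assumptions of this sketch; the disclosure permits deblocking before or after detection and does not prescribe any software model.

```python
from dataclasses import dataclass
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


@dataclass
class Site:
    """Hypothetical model of one flow-cell site with a primed template."""
    template_3to5: str            # template bases in the order they are copied
    extended: str = ""            # nucleotides added to the primer so far
    blocked: bool = False         # True while a terminator moiety is present
    label: Optional[str] = None   # label currently detectable at the site

    def incorporate(self):
        """Add one labeled, reversibly terminated nucleotide if possible."""
        if self.blocked or len(self.extended) >= len(self.template_3to5):
            return
        base = COMPLEMENT[self.template_3to5[len(self.extended)]]
        self.extended += base
        self.blocked = True       # extension stops after a single addition
        self.label = base

    def detect(self):
        """Return the label seen at the site, or None if there is none."""
        return self.label

    def deblock(self):
        """Remove the terminator moiety and (in this sketch) the label."""
        self.blocked = False
        self.label = None


def run_sbs(site, n_cycles):
    """Perform n cycles of incorporate, detect, deblock; return the calls."""
    calls = []
    for _ in range(n_cycles):
        site.incorporate()
        signal = site.detect()
        if signal is not None:
            calls.append(signal)
        site.deblock()
    return "".join(calls)
```

With `Site("TGCA")`, `run_sbs(site, 3)` returns `ACG`, a sequence of length 3 from 3 cycles. Calling `incorporate()` twice without an intervening `deblock()` extends the primer by only one nucleotide, which is the behavior the reversible terminator is meant to enforce.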

[0166] Exemplary SBS procedures, reagents and detection components that can be readily adapted for use with a method or composition of the present disclosure are described, for example, in Bentley et al., Nature 456:53-59 (2008), WO04/018497; WO91/06678; WO07/123744; U.S. Pat. Nos. 7,057,026; 7,329,492; 7,211,414; 7,315,019 or 7,405,281, and U.S. Pat. App. Pub. No. 2008/0108082, each of which is incorporated herein by reference. One or more reagents used in an SBS process can optionally be delivered via a mixed-phase fluid (e.g., a fluid foam, fluid slurry or fluid emulsion), contacted with a mixed-phase fluid, and/or removed by a mixed-phase fluid. A mixed-phase fluid can be removed from a flow cell for detection during an SBS process.

iii. SMRT® Sequencing

[0167] In SMRT® sequencing, a nucleic acid synthesis complex comprising a polymerase enzyme, a template sequence and a primer sequence complementary to a portion of the template sequence, is immobilized within a confined illumination volume, e.g., the evanescent optical field produced by illumination of a zero mode waveguide (e.g., subwavelength optical nanostructures fabricated in a thin metallic film), or in a total internal reflectance fluorescence microscope system or other optical confinement system. The reaction mixture surrounding the complex contains the four different nucleotides (A, G, T and C), each labeled with a spectrally distinguishable fluorescent label attached through its terminal phosphate group. Because of the small illumination volume, nucleotides and their associated fluorescent labels diffuse in and out of the illumination volume very quickly, and thus provide very short fluorescent signals. When a particular nucleotide is incorporated by the polymerase in a primer extension reaction, the fluorescent label associated with the nucleotide is retained within the illumination volume for a longer period of time. Once the nucleotide is incorporated, the fluorescent label is cleaved from it through the action of the polymerase, and the label diffuses away. By identifying longer pulses of different spectral characteristics, the system can detect, in real time, the identity of each incorporated base as it is being incorporated. Shorter pulses not associated with incorporation tend to be so short that they are not detected by the camera, while pulses from incorporation are more pronounced and detectable.
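The pulse-based discrimination described above can be illustrated with a minimal Python sketch that keeps only pulses long enough to be attributed to incorporation and maps each retained pulse to a base by its spectral channel. The channel-to-base assignment, the 10 ms duration threshold, and the `Pulse` record are hypothetical values chosen for this sketch; they are not parameters of any actual instrument or of the disclosure.

```python
from typing import NamedTuple


class Pulse(NamedTuple):
    """Hypothetical record of one fluorescent pulse."""
    channel: int         # index of the spectral channel in which it was seen
    duration_ms: float   # how long the label stayed in the illumination volume


# Hypothetical assignment of spectral channels to labeled nucleotides.
CHANNEL_TO_BASE = {0: "A", 1: "C", 2: "G", 3: "T"}

# Hypothetical cutoff separating diffusion events from incorporation events.
MIN_INCORPORATION_MS = 10.0


def call_bases(pulses, min_duration_ms=MIN_INCORPORATION_MS):
    """Return the base calls for pulses at least min_duration_ms long."""
    return "".join(
        CHANNEL_TO_BASE[pulse.channel]
        for pulse in pulses
        if pulse.duration_ms >= min_duration_ms
    )


# Illustrative trace: short pulses model free diffusion, long ones incorporation.
example_trace = [
    Pulse(0, 0.5),
    Pulse(2, 40.0),
    Pulse(3, 1.0),
    Pulse(3, 25.0),
    Pulse(1, 60.0),
]
```

Applied to `example_trace`, `call_bases` returns `GTC`: the two sub-threshold pulses are discarded, and the three longer pulses are called in the order in which they occurred.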

EXAMPLES

[0168] The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.

Example 1: Exemplary Embodiments of Blocked Asymmetric Hairpin Adaptors

[0169] Examples of blocked asymmetric hairpin adaptors are shown in the table of FIG. 5. The adaptors included spacers of various types and lengths such as a spacer containing an abasic site, a 3-carbon spacer (SpC3), a triplet of 3-carbon spacers (SpC3-SpC3-SpC3), a TTTTT-SpC3 spacer, a triethylene glycol spacer (Sp9), a doublet of Sp9 spacers (Sp9-Sp9), and an 18-atom hexaethylene glycol spacer (Sp18). The different adaptors were ligated to double-stranded DNA fragments and the resulting templates were evaluated by quantitative PCR (qPCR) using tailed primers that specifically bind to the adaptor sequences (see FIG. 6A).

[0170] Different adaptors generated different qPCR yields as denoted by Ct values. The highest yield was from the longest spacer, the Sp18 spacer.
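For readers comparing Ct values, a lower Ct indicates more amplifiable template. The following Python sketch shows one common way to convert a Ct difference into a relative yield, assuming the amplicon amount is multiplied by (1 + efficiency) in each cycle, with an efficiency of 1.0 corresponding to perfect doubling. The adaptor names and Ct values in the sketch are hypothetical placeholders and are not the measurements of this Example.

```python
def relative_yield(ct, reference_ct, efficiency=1.0):
    """Yield of a sample relative to a reference, from their Ct values.

    Assumes each cycle multiplies the amplicon by (1 + efficiency), so a
    sample that crosses threshold k cycles earlier started with
    (1 + efficiency) ** k times more amplifiable template.
    """
    return (1.0 + efficiency) ** (reference_ct - ct)


def rank_by_yield(ct_by_adaptor):
    """Return adaptor names ordered from highest yield (lowest Ct) down."""
    return sorted(ct_by_adaptor, key=ct_by_adaptor.get)


# Hypothetical Ct values, for illustration only.
hypothetical_ct = {"adaptor_X": 22.0, "adaptor_Y": 20.0, "adaptor_Z": 25.0}
```

With these placeholder values, `rank_by_yield(hypothetical_ct)` returns `['adaptor_Y', 'adaptor_X', 'adaptor_Z']`, and `relative_yield(20.0, 22.0)` returns `4.0`, meaning a Ct that is two cycles lower corresponds to a fourfold higher yield under the perfect-doubling assumption.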

[0171] The following examples are presented in order to more fully illustrate some embodiments of the invention. They should in no way be construed, however, as limiting the broad scope of the invention. One skilled in the art can readily devise many variations and modifications of the principles disclosed herein without departing from the scope of the invention.