Title:
METHODS FOR ENRICHING NUCLEIC ACID TARGET SEQUENCES
Document Type and Number:
WIPO Patent Application WO/2024/064915
Kind Code:
A2
Abstract:
The invention provides methods for enriching nucleic acid target sequences from a sample, for example, from a biological sample or from a nucleic acid library.

Inventors:
SHUBER ANTHONY P (US)
GILLEY CAITLIN M (US)
WITKOWSKI ROSEMARY TURINGAN (US)
Application Number:
PCT/US2023/074937
Publication Date:
March 28, 2024
Filing Date:
September 22, 2023
Assignee:
FLAGSHIP PIONEERING INNOVATIONS VI LLC (US)
International Classes:
C12Q1/6806; C12Q1/686
Attorney, Agent or Firm:
GUSTAFSON, Megan A. et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A nucleic acid enrichment method, the method comprising: cutting a nucleic acid molecule that includes a target sequence to generate a single stranded overhang at a cut end of the molecule that includes the target; filling in the overhang with at least one labeled nucleotide; and enriching the molecule that includes the target by contacting at least one of the labeled nucleotides in the molecule with a capture domain.

2. The method of claim 1, wherein the cutting step is performed by a nuclease.

3. The method of claim 2, wherein the nuclease is a CRISPR-Cas nuclease.

4. The method of claim 3, wherein the nuclease is a type II or a type V CRISPR-Cas nuclease.

5. The method of claim 3 or claim 4, wherein the nuclease is a Cas9, Cas12, or CasX nuclease.

6. The method of claim 5, wherein the nuclease is a Cas12a/Cpf1 nuclease.

7. The method of any one of claims 3-6, wherein the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

8. The method of any one of claims 1-7, wherein the cutting step is performed at room temperature.

9. The method of any one of claims 1-8, wherein the overhang is filled in using a DNA polymerase.

10. The method of claim 9, wherein the DNA polymerase is DNA polymerase I.

11. The method of claim 10, wherein the DNA polymerase I consists of the Klenow fragment.

12. The method of any one of claims 1-11, wherein the label comprises biotin or digoxigenin.

13. The method of any one of claims 1-12, wherein the capture domain comprises avidin, streptavidin, or a DIG-binding protein.

14. The method of any one of claims 1-13, wherein the capture domain comprises or is connected to a solid support.

15. The method of claim 14, wherein the solid support is a bead, a well, a tube, or a slide.

16. The method of claim 15, wherein the capture domain comprises streptavidin connected to the bead.

17. The method of any one of claims 1-16, wherein the nucleic acid molecule is present in a nucleic acid sequencing library, and the method enriches target sequences in the library.

18. The method of any one of claims 1-16, wherein the nucleic acid molecule was obtained from a nucleic acid sample from a subject.

19. The method of claim 18, wherein the nucleic acid sample is a plasma sample, and the plasma sample is used directly in the nucleic acid enrichment method without prior enrichment or purification of the nucleic acid.

20. The method of claim 18 or 19, wherein the nucleic acid sample comprises cell free DNA (cfDNA).

21. The method of claim 20, wherein cytosines in the cfDNA have been converted to uracils.

22. The method of claim 20 or 19 wherein the cfDNA has been treated with bisulfite.

23. The method of any one of claims 1-16 and 18-22, wherein the method further comprises preparing a library before or after enriching the molecule that includes the target.

24. The method of any one of claims 1-20 and 23, the method further comprising the step of converting methylated cytosines to uracils.

25. The method of any one of claims 1-24, the method further comprising a wash step to remove nucleic acid molecules that do not include the target.

26. The method of any one of claims 1-25, the method further comprising amplifying the nucleic acid molecule.

27. The method of claim 26, wherein the amplification occurs while the nucleic acid is in contact with the capture domain.

28. The method of any one of claims 1-27, the method further comprising sequencing the enriched molecule.

29. The method of any one of claims 1-28, the method further comprising separating the nucleic acid molecule from the capture domain.

30. The method of claim 29, wherein the separating step is performed using heat elution, a chemical agent, mechanical disruption, or combinations thereof.

31. The method of claim 29 or 30, wherein the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

32. The method of any one of claims 1-31, wherein the method further comprises an additional enrichment step.

33. The method of claim 32, wherein the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences.

34. The method of claim 32 or 33, wherein the additional enrichment step comprises hybrid capture.

35. The method of any one of claims 32-34, wherein the additional enrichment step comprises using a nucleic acid binding protein.

36. A method of capturing a nucleic acid molecule having a target sequence, the method comprising: cutting a nucleic acid molecule that includes a target sequence to generate a single stranded overhang at a cut end of the molecule that includes the target; filling in the overhang with at least one labeled nucleotide; and capturing the molecule that includes the target by contacting at least one of the labeled nucleotides in the molecule with a capture domain.

37. The method of claim 36, wherein the cutting step is performed by a nuclease.

38. The method of claim 37, wherein the nuclease is a CRISPR-Cas nuclease.

39. The method of claim 38, wherein the nuclease is a type II or a type V CRISPR-Cas nuclease.

40. The method of claim 38 or claim 39, wherein the nuclease is a Cas9, Cas12, or CasX nuclease.

41. The method of claim 40, wherein the nuclease is a Cas12a/Cpf1 nuclease.

42. The method of any one of claims 36-41, wherein the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

43. The method of any one of claims 36-42, wherein the cutting step is performed at room temperature.

44. The method of any one of claims 36-43, wherein the overhang is filled in using a DNA polymerase.

45. The method of claim 44, wherein the DNA polymerase is DNA polymerase I.

46. The method of claim 45, wherein the DNA polymerase I consists of the Klenow fragment.

47. The method of any one of claims 36-46, wherein the label comprises biotin or digoxigenin.

48. The method of any one of claims 36-47, wherein the capture domain comprises avidin, streptavidin, or a DIG-binding protein.

49. The method of any one of claims 36-48, wherein the capture domain comprises or is connected to a solid support.

50. The method of claim 49, wherein the solid support is a bead, a well, a tube, or a slide.

51. The method of claim 50, wherein the capture domain comprises streptavidin connected to the bead.

52. The method of any one of claims 36-51, wherein the nucleic acid molecule is present in a nucleic acid sequencing library, and the method captures target sequences of interest in the library.

53. The method of any one of claims 36-51, wherein the nucleic acid molecule was obtained from a nucleic acid sample from a subject.

54. The method of claim 53, wherein the nucleic acid sample is a plasma sample, and the plasma sample is used directly in the method of capturing a nucleic acid molecule without prior enrichment or purification of the nucleic acid.

55. The method of claim 53, wherein the nucleic acid sample comprises cfDNA.

56. The method of claim 55, wherein cytosines in the cfDNA have been converted to uracils.

57. The method of claim 55 or 56 wherein the cfDNA has been treated with bisulfite.

58. The method of any one of claims 36-51 and 53-57, wherein the method further comprises preparing a library before or after capturing the molecule that includes the target.

59. The method of any one of claims 36-55 and 58, the method further comprising the step of converting methylated cytosines to uracils.

60. The method of any one of claims 36-59, the method further comprising a wash step to remove nucleic acid molecules that do not include the target.

61. The method of any one of claims 36-60, the method further comprising amplifying the nucleic acid molecule.

62. The method of claim 61, wherein the amplification occurs while the nucleic acid is in contact with the capture domain.

63. The method of any one of claims 36-62, the method further comprising sequencing the captured molecule.

64. The method of any one of claims 36-63, the method further comprising separating the nucleic acid molecule from the capture domain.

65. The method of claim 64, wherein the separating step is performed using heat elution, a chemical agent, mechanical disruption, or combinations thereof.

66. The method of claim 64 or 65, wherein the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

67. The method of any one of claims 36-66, wherein the method further comprises an additional enrichment step.

68. The method of claim 67, wherein the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences.

69. The method of claim 67 or 68, wherein the additional enrichment step comprises hybrid capture.

70. The method of any one of claims 67-69, wherein the additional enrichment step comprises using a nucleic acid binding protein.

71. A nucleic acid enrichment method, the method comprising: cutting a nucleic acid molecule that includes a target sequence to generate a single stranded overhang at a cut end of the molecule that includes the target; filling in the overhang with at least one labeled nucleotide; and enriching the molecule that includes the target by separating labeled molecules from unlabeled molecules.

72. The method of claim 71, wherein the cutting step is performed by a nuclease.

73. The method of claim 72, wherein the nuclease is a CRISPR-Cas nuclease.

74. The method of claim 73, wherein the nuclease is a type II or a type V CRISPR-Cas nuclease.

75. The method of claim 73 or claim 74, wherein the nuclease is a Cas9, Cas12, or CasX nuclease.

76. The method of claim 75, wherein the nuclease is a Cas12a/Cpf1 nuclease.

77. The method of any one of claims 71-76, wherein the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

78. The method of any one of claims 71-77, wherein the cutting step is performed at room temperature.

79. The method of any one of claims 71-78, wherein the overhang is filled in using a DNA polymerase.

80. The method of claim 79, wherein the DNA polymerase is DNA polymerase I.

81. The method of claim 80, wherein the DNA polymerase I consists of the Klenow fragment.

82. The method of any one of claims 71-81, wherein the label comprises biotin, digoxigenin, or a fluorophore.

83. The method of any one of claims 71-82, wherein the capture domain comprises or is connected to a solid support.

84. The method of claim 83, wherein the solid support is a bead, a well, a tube, or a slide.

85. The method of claim 84, wherein the capture domain comprises streptavidin connected to the bead.

86. The method of any one of claims 71-85, wherein the nucleic acid molecule is present in a nucleic acid sequencing library, and the method enriches target sequences of interest in the library.

87. The method of any one of claims 71-86, wherein the nucleic acid molecule was obtained from a nucleic acid sample from a subject.

88. The method of claim 87, wherein the nucleic acid sample is a plasma sample, and the plasma sample is used directly in the nucleic acid enrichment method without prior enrichment or purification of the nucleic acid.

89. The method of claim 88, wherein the nucleic acid sample comprises cell free DNA (cfDNA).

90. The method of claim 89, wherein cytosines in the cfDNA have been converted to uracils.

91. The method of claim 89 or 90 wherein the cfDNA has been treated with bisulfite.

92. The method of any one of claims 71-85 and 87-91, wherein the method further comprises preparing a library before or after enriching the molecule that includes the target.

93. The method of any one of claims 71-89 and 92, the method further comprising the step of converting methylated cytosines to uracils.

94. The method of any one of claims 71-93, wherein the method includes a wash step.

95. The method of any one of claims 71-94, the method further comprising amplifying the nucleic acid molecule.

96. The method of claim 95, wherein the amplification occurs while the nucleic acid is in contact with the capture domain.

97. The method of any one of claims 71-96, the method further comprising sequencing the enriched molecule.

98. The method of any one of claims 71-97, the method further comprising separating the nucleic acid molecule from the capture domain.

99. The method of claim 98, wherein the separating step is performed using heat elution, a chemical agent, mechanical disruption, or combinations thereof.

100. The method of claim 98 or 99, wherein the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

101. The method of any one of claims 71-100, wherein the method further comprises an additional enrichment step.

102. The method of claim 101, wherein the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences.

103. The method of claim 101 or 102, wherein the additional enrichment step comprises hybrid capture.

104. The method of any one of claims 101-103, wherein the additional enrichment step comprises using a nucleic acid binding protein.

105. A method of producing a nucleic acid library enriched for regions of interest, the method comprising: cutting a plurality of nucleic acid molecules comprising regions of interest to generate single stranded overhangs at cut ends of the molecules that include the regions of interest; filling in each overhang with at least one labeled nucleotide; and enriching the molecules that include the regions of interest by contacting the labeled nucleotides in the molecule with capture domains.

106. The method of claim 105, wherein the cutting step is performed by a nuclease.

107. The method of claim 106, wherein the nuclease is a CRISPR-Cas nuclease.

108. The method of claim 107, wherein the nuclease is a type II or a type V CRISPR-Cas nuclease.

109. The method of claim 107 or claim 108, wherein the nuclease is a Cas9, Cas12, or CasX nuclease.

110. The method of claim 109, wherein the nuclease is a Cas12a/Cpf1 nuclease.

111. The method of any one of claims 105-110, wherein the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecules that include the regions of interest.

112. The method of any one of claims 105-111, wherein the cutting step is performed at room temperature.

113. The method of any one of claims 105-112, wherein the overhangs are filled in using a DNA polymerase.

114. The method of claim 113, wherein the DNA polymerase is DNA polymerase I.

115. The method of claim 114, wherein the DNA polymerase I consists of the Klenow fragment.

116. The method of any one of claims 105-115, wherein the label comprises biotin, digoxigenin, or a fluorophore.

117. The method of any one of claims 105-116, wherein the capture domains comprise or are connected to solid supports.

118. The method of claim 117, wherein the solid supports are beads, wells, tubes, or slides.

119. The method of claim 118, wherein the capture domains comprise streptavidin connected to beads.

120. The method of any one of claims 105-119, the method further comprising amplifying the nucleic acid molecules.

121. The method of claim 120, wherein the amplifying is performed with primers that comprise adapters to facilitate sequencing of the nucleic acid molecules.

122. The method of any one of claims 105-121, wherein the nucleic acid molecule was obtained from a nucleic acid sample from a subject.

123. The method of claim 122, wherein the nucleic acid sample is a plasma sample, and the plasma sample is used directly in the nucleic acid enrichment method without prior enrichment or purification of the nucleic acid.

124. The method of claim 123, wherein the nucleic acid sample comprises cell free DNA (cfDNA).

125. The method of claim 124, wherein cytosines in the cfDNA have been converted to uracils.

126. The method of claim 124 or 125 wherein the cfDNA has been treated with bisulfite.

127. The method of any one of claims 105-124, the method further comprising the step of converting methylated cytosines to uracils.

128. The method of any one of claims 105-127, the method further comprising a wash step to remove nucleic acid molecules that do not include the regions of interest.

129. The method of any one of claims 105-128, the method further comprising amplifying the nucleic acid molecule.

130. The method of claim 129, wherein the amplification occurs while the nucleic acid is in contact with the capture domain.

131. The method of any one of claims 105-130, the method further comprising separating the nucleic acid molecules from the capture domains.

132. The method of claim 131, wherein the separating step is performed using heat elution, a chemical agent, mechanical disruption, or combinations thereof.

133. The method of claim 131 or 132, wherein the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

134. The method of any one of claims 105-133, wherein the method further comprises an additional enrichment step.

135. The method of claim 134, wherein the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences.

136. The method of claim 134 or 135, wherein the additional enrichment step comprises hybrid capture.

137. The method of any one of claims 134-136, wherein the additional enrichment step comprises using a nucleic acid binding protein.

138. A method for producing a nucleic acid library enriched for regions of interest, the method comprising: obtaining a sample comprising a plurality of nucleic acids, wherein a subset of the plurality of nucleic acids comprise regions of interest; optionally converting methylated cytosines to uracils; adding nucleic acid adapters to the plurality of nucleic acids to form a nucleic acid library; cutting the subset of the plurality of nucleic acid molecules having regions of interest to generate single stranded overhangs at cut ends of the molecules that include the regions of interest; filling in each overhang with at least one labeled nucleotide; enriching the molecules that include the regions of interest by contacting the labeled nucleotides in the molecule with capture domains; and amplifying the molecules that include the regions of interest to form the nucleic acid library enriched for regions of interest.

139. A method for producing a nucleic acid library enriched for regions of interest, the method comprising: obtaining a sample comprising a plurality of nucleic acids, wherein a subset of the plurality of nucleic acids comprise regions of interest; cutting the subset of the plurality of nucleic acid molecules having regions of interest to generate single stranded overhangs at cut ends of the molecules that include the regions of interest; filling in each overhang with at least one labeled nucleotide; and enriching the molecules that include the regions of interest by contacting the labeled nucleotides in the molecule with capture domains; removing the molecules that include the regions of interest from the capture domains; optionally converting methylated cytosines to uracils; and adding nucleic acid adapters to the plurality of nucleic acids to form the nucleic acid library enriched for regions of interest.

140. The method of claim 138 or 139, wherein the cutting is performed by a nuclease.

141. The method of claim 140, wherein the nuclease is a CRISPR-Cas nuclease.

142. The method of claim 141, wherein the nuclease is a type II or a type V CRISPR-Cas nuclease.

143. The method of claim 141 or claim 142, wherein the nuclease is a Cas9, Cas12, or CasX nuclease.

144. The method of claim 143, wherein the nuclease is a Cas12a/Cpf1 nuclease.

145. The method of any one of claims 138-144, wherein the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

146. The method of any one of claims 138-145, wherein the cutting step is performed at room temperature.

147. The method of any one of claims 138-146, wherein the overhang is filled in using a DNA polymerase.

148. The method of claim 147, wherein the DNA polymerase is DNA polymerase I.

149. The method of claim 148, wherein the DNA polymerase I consists of the Klenow fragment.

150. The method of any one of claims 138-149, wherein the label comprises biotin or digoxigenin.

151. The method of any one of claims 138-150, wherein the capture domain comprises avidin, streptavidin, or a DIG-binding protein.

152. The method of any one of claims 138-151, wherein the capture domain comprises or is connected to a solid support.

153. The method of claim 152, wherein the solid support is a bead, a well, a tube, or a slide.

154. The method of claim 153, wherein the capture domain comprises streptavidin connected to the bead.

155. The method of any one of claims 138-154, wherein the method further comprises an additional enrichment step.

156. The method of claim 155, wherein the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences.

157. The method of claim 155 or 156, wherein the additional enrichment step comprises hybrid capture.

158. The method of any one of claims 155-157, wherein the additional enrichment step comprises using a nucleic acid binding protein.

159. A nucleic acid library, produced by the method of any one of claims 138-158.

160. A kit comprising: a nuclease that cuts a nucleic acid molecule including a target sequence to generate a single stranded overhang at a cut end of the molecule that includes the target; labeled dNTPs; DNA polymerase; and a capture moiety comprising a capture domain.

161. The kit of claim 160, wherein the nuclease is a CRISPR-Cas nuclease.

162. The kit of claim 161, wherein the nuclease is a type II or a type V CRISPR-Cas nuclease.

163. The kit of claim 161 or claim 162, wherein the nuclease is a Cas9, Cas12, or CasX nuclease.

164. The kit of claim 163, wherein the nuclease is a Cas12a/Cpf1 nuclease.

165. The kit of any one of claims 160-164, wherein the DNA polymerase is DNA polymerase I.

166. The kit of claim 165, wherein the DNA polymerase I consists of the Klenow fragment.

167. The kit of any one of claims 160-166, wherein the label comprises biotin, digoxigenin, or a fluorophore.

168. The kit of any one of claims 160-167, wherein the capture moiety comprises a solid support.

169. The kit of claim 168, wherein the solid support is a bead, a well, a tube, or a slide.

170. The kit of claim 169, wherein the capture domain comprises streptavidin connected to the bead.

171. A nucleic acid enrichment method comprising the steps of:

(a) designing a first set of guide RNAs to bind a first set of target sequences for cleavage with a first nuclease,

(b) designing a second set of guide RNAs to bind a second set of target sequences for cleavage with a second nuclease,

(c) adding the first and second sets of guide sequences and the first and second nucleases to a nucleic acid comprising a plurality of target sequences,

(d) generating single stranded overhangs at the cleavage sites in the first and second sets of target sequences,

(e) filling in each overhang with at least one labeled nucleotide; and

(f) enriching the target sequences by contacting at least one of the labeled nucleotides in the molecule with a capture domain.

172. The method of claim 171, wherein the first nuclease or the second nuclease is a CRISPR-Cas nuclease.

173. The method of claim 172, wherein the first nuclease or the second nuclease is a type II or a type V CRISPR-Cas nuclease.

174. The method of claim 172 or claim 173, wherein the first nuclease or the second nuclease is a Cas9, Cas12, or CasX nuclease.

175. The method of claim 174, wherein the first nuclease or the second nuclease is a Cas12a/Cpf1 nuclease.

176. The method of any one of claims 171-175, wherein the first nuclease or the second nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

177. The method of any one of claims 171-176, wherein the cutting step is performed at room temperature.

178. The method of any one of claims 171-177, wherein the overhang is filled in using a DNA polymerase.

179. The method of claim 178, wherein the DNA polymerase is DNA polymerase I.

180. The method of claim 179, wherein the DNA polymerase I consists of the Klenow fragment.

181. The method of any one of claims 171-180, wherein the label comprises biotin or digoxigenin.

182. The method of any one of claims 171-181, wherein the capture domain comprises avidin, streptavidin, or a DIG-binding protein.

183. The method of any one of claims 171-182, wherein the capture domain comprises or is connected to a solid support.

184. The method of claim 183, wherein the solid support is a bead, a well, a tube, or a slide.

185. The method of claim 184, wherein the capture domain comprises streptavidin connected to the bead.

186. The method of any one of claims 171-185, wherein the nucleic acid molecule is present in a nucleic acid sequencing library, and the method enriches target sequences in the library.

187. The method of any one of claims 171-185, wherein the nucleic acid molecule was obtained from a nucleic acid sample from a subject.

188. The method of claim 187, wherein the nucleic acid sample is a plasma sample, and the plasma sample is used directly in the nucleic acid enrichment method without prior enrichment or purification of the nucleic acid.

189. The method of claim 187 or 188, wherein the nucleic acid sample comprises cell free DNA (cfDNA).

190. The method of claim 189, wherein cytosines in the cfDNA have been converted to uracils.

191. The method of claim 189 or 190 wherein the cfDNA has been treated with bisulfite.

192. The method of any one of claims 171-185 and 187-191, wherein the method further comprises preparing a library before or after enriching the molecule that includes the target.

193. The method of any one of claims 171-189 and 192, the method further comprising the step of converting methylated cytosines to uracils.

194. The method of any one of claims 171-193, the method further comprising a wash step to remove nucleic acid molecules that do not include the target.

195. The method of any one of claims 171-194, the method further comprising amplifying the nucleic acid molecule.

196. The method of claim 195, wherein the amplification occurs while the nucleic acid is in contact with the capture domain.

197. The method of any one of claims 171-196, the method further comprising sequencing the enriched molecule.

198. The method of any one of claims 171-197, the method further comprising separating the nucleic acid molecule from the capture domain.

199. The method of claim 198, wherein the separating step is performed using heat elution, a chemical agent, mechanical disruption, or combinations thereof.

200. The method of claim 198 or 199, wherein the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

201. The method of any one of claims 171-200, wherein the method further comprises an additional enrichment step.

202. The method of claim 201, wherein the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences.

203. The method of claim 201 or 202, wherein the additional enrichment step comprises hybrid capture.

204. The method of any one of claims 201-203, wherein the additional enrichment step comprises using a nucleic acid binding protein.

Description:
METHODS FOR ENRICHING NUCLEIC ACID TARGET SEQUENCES

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of and priority to U.S. Provisional Patent Application No. 63/409,589, filed on September 23, 2022 and U.S. Provisional Patent Application No. 63/497,175, filed on April 19, 2023, the entire contents of each of which are incorporated herein by reference.

FIELD OF THE INVENTION

[0002] The invention relates generally to methods for enriching nucleic acid target sequences from a sample, for example, from a biological sample or from a nucleic acid library.

BACKGROUND

[0003] Detection of target sequences in a nucleic acid can be a challenge when the target sequence is present at a low frequency in the nucleic acid sample. Amplification and/or sequencing of target sequences can fail if such sequences occur at a low frequency. For example, circulating tumor DNA (ctDNA) is present at a very low frequency in most early-stage and many advanced-stage cancer patients (Bettegowda et al. (2014) Sci Transl Med 6(224): p. 224ra24). Accordingly, a major challenge in the identification of ctDNA is how to identify a trace amount of ctDNA within a much larger amount of total cell free DNA (cfDNA). Several recent studies have adopted reduced-representation bisulfite sequencing (RRBS; Guo et al. (2017) Nat Genet 49(4): p. 635-642), whole-genome bisulfite sequencing (WGBS; Li et al. (2018) Nucleic Acids Res. 46(15):e89), or methylated DNA immunoprecipitation sequencing (MeDIP-seq; Shen et al. (2018) Nature 563(7732):579-583) approaches to enrich methylated DNA sequences from a cell-free DNA sample. However, all of these techniques suffer from poor coverage in regions of interest in exchange for the availability of genome-wide information.

[0004] Accordingly, there is a need in the art for improved techniques for enriching target sequences of interest in a nucleic acid sample.

SUMMARY OF THE INVENTION

[0005] The disclosure relates to methods of enriching target sequences in a nucleic acid sample. The methods include, for example, cutting a nucleic acid molecule that includes a target sequence to form a single-stranded overhang, filling in the overhang with a label, and capturing the nucleic acid molecule that includes the target, thereby enriching the target sequence. The methods can be used to enrich target sequences prior to assembling a nucleic acid library or can be used to enrich target sequences in an existing library.

[0006] In one aspect, the disclosure relates to a nucleic acid enrichment method. The method includes cutting a nucleic acid molecule that includes a target sequence to generate a single stranded overhang at a cut end of the molecule that includes the target; filling in the overhang with at least one labeled nucleotide; and enriching the molecule that includes the target by contacting at least one of the labeled nucleotides in the molecule with a capture domain.
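The three steps recited above (cutting, filling in with a label, and capturing) can be illustrated with a toy model. The following Python sketch is an assumption-laden illustration only and is not taken from the disclosure: the `Molecule` class, the function names, and the example sequences are hypothetical, and the model idealizes the chemistry by assuming that every target-containing molecule is cut and labeled and that no off-target molecule is.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    """Hypothetical stand-in for one nucleic acid molecule in a sample."""
    sequence: str
    labeled: bool = False


def cut_and_label(molecules, target):
    """Idealized cutting and fill-in: mark every molecule containing the target as labeled."""
    for m in molecules:
        if target in m.sequence:
            m.labeled = True
    return molecules


def capture(molecules):
    """Idealized capture: keep only molecules that carry a label."""
    return [m for m in molecules if m.labeled]


def target_fraction(molecules, target):
    """Fraction of molecules in the pool that contain the target sequence."""
    if not molecules:
        return 0.0
    return sum(target in m.sequence for m in molecules) / len(molecules)


# Illustrative pool in which one of four molecules carries the target.
target = "GATTACA"
pool = [
    Molecule("AAACCCGGGTTT"),
    Molecule("TTTTGATTACAGGGG"),
    Molecule("CCCCCCCC"),
    Molecule("GGGGAAAA"),
]

before = target_fraction(pool, target)
enriched = capture(cut_and_label(pool, target))
after = target_fraction(enriched, target)
print(before, after)
```

In practice, cutting, labeling, and capture efficiencies would each be below 100%, so the enriched pool would retain some background molecules and lose some target molecules; the sketch only shows how enrichment follows from labeling target-containing molecules selectively.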

[0007] In certain embodiments, the cutting step is performed by a nuclease, for example, a CRISPR-Cas nuclease. In certain embodiments, the nuclease is a type II CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas9 nuclease. In certain embodiments, the nuclease is a type V CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas12 nuclease. In certain embodiments, the nuclease is a Cas12a/Cpf1 nuclease. In certain embodiments, the nuclease is a MAD7 nuclease. In certain embodiments, the nuclease is a CasX nuclease. In certain embodiments, the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

[0008] In certain embodiments, the cutting step is performed at room temperature.

[0009] In certain embodiments, the overhang is filled in using a DNA polymerase. In certain embodiments, the DNA polymerase is DNA polymerase I. In certain embodiments, the DNA polymerase I consists of the Klenow fragment.

[0010] In certain embodiments, the label comprises biotin or digoxigenin. In certain embodiments, the capture domain comprises avidin, streptavidin, or a DIG-binding protein. In certain embodiments, the capture domain comprises or is connected to a solid support. In certain embodiments, the solid support is a bead, a well, a tube, or a slide. In certain embodiments, the capture domain comprises streptavidin connected to the bead.

[0011] In certain embodiments, the nucleic acid molecule is present in a nucleic acid sequencing library, and the method enriches target sequences in the library.

[0012] In certain embodiments, the nucleic acid molecule was obtained from a nucleic acid sample from a subject. In certain embodiments, the nucleic acid sample is a plasma sample. In certain embodiments, the plasma sample is used directly in the nucleic acid enrichment method (for example, directly in the cutting step) without prior enrichment or purification of the nucleic acid.

[0013] In certain embodiments, the nucleic acid sample comprises cell free DNA (cfDNA). In certain embodiments, cytosines in the cfDNA have been converted to uracils. In certain embodiments, the cfDNA has been treated with bisulfite. In certain embodiments, the method further comprises the step of converting methylated cytosines to uracils.

[0014] In certain embodiments, the method further comprises preparing a library before or after enriching the molecule that includes the target.

[0015] In certain embodiments, the method further comprises a wash step to remove nucleic acid molecules that do not include the target.

[0016] In certain embodiments, the method further comprises amplifying the nucleic acid molecule. In certain embodiments, the amplification occurs while the nucleic acid is in contact with the capture domain.

[0017] In certain embodiments, the method further comprises sequencing the enriched molecule.

[0018] In certain embodiments, the method further comprises separating the nucleic acid molecule from the capture domain. In certain embodiments, the separating step comprises heat elution off of the capture domain. In certain embodiments, the separating step is performed using a chemical agent. In certain embodiments, the separating step is performed using mechanical disruption. In certain embodiments, the separating step is performed using heat elution, a chemical agent, mechanical disruption, or combinations thereof. In certain embodiments, the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

[0019] In certain embodiments, the method further comprises an additional enrichment step. In certain embodiments, the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences. In certain embodiments, the additional enrichment step comprises hybrid capture. In certain embodiments, the additional enrichment step comprises using a nucleic acid binding protein.

[0020] In another aspect, the disclosure relates to a method of capturing a nucleic acid molecule having a target sequence. The method includes cutting a nucleic acid molecule that includes a target sequence to generate a single stranded overhang at a cut end of the molecule that includes the target; filling in the overhang with at least one labeled nucleotide; and capturing the molecule that includes the target by contacting at least one of the labeled nucleotides in the molecule with a capture domain.

[0021] In certain embodiments, the cutting step is performed by a nuclease. In certain embodiments, the nuclease is a CRISPR-Cas nuclease. In certain embodiments, the nuclease is a type II CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas9 nuclease. In certain embodiments, the nuclease is a type V CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas12 nuclease. In certain embodiments, the nuclease is a Cas12a/Cpf1 nuclease. In certain embodiments, the nuclease is a MAD7 nuclease. In certain embodiments, the nuclease is a CasX nuclease. In certain embodiments, the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

[0022] In certain embodiments, the cutting step is performed at room temperature.

[0023] In certain embodiments, the overhang is filled in using a DNA polymerase. In certain embodiments, the DNA polymerase is DNA polymerase I. In certain embodiments, the DNA polymerase I consists of the Klenow fragment.

[0024] In certain embodiments, the label comprises biotin or digoxigenin. In certain embodiments, the capture domain comprises avidin, streptavidin, or a DIG-binding protein. In certain embodiments, the capture domain comprises or is connected to a solid support. In certain embodiments, the solid support is a bead, a well, a tube, or a slide. In certain embodiments, the capture domain comprises streptavidin connected to the bead.

[0025] In certain embodiments, the nucleic acid molecule is present in a nucleic acid sequencing library, and the method captures target sequences of interest in the library.

[0026] In certain embodiments, the nucleic acid molecule was obtained from a nucleic acid sample from a subject. In certain embodiments, the nucleic acid sample is a plasma sample and the plasma sample is used directly in the nucleic acid enrichment method without prior enrichment or purification of the nucleic acid.

[0027] In certain embodiments, the nucleic acid sample comprises cell free DNA (cfDNA). In certain embodiments, cytosines in the cfDNA have been converted to uracils. In certain embodiments, the cfDNA has been treated with bisulfite. In certain embodiments, the method further comprises the step of converting methylated cytosines to uracils.

[0028] In certain embodiments, the method further comprises preparing a library before or after capturing the molecule that includes the target.

[0029] In certain embodiments, the method further comprises a wash step to remove nucleic acid molecules that do not include the target.

[0030] In certain embodiments, the method further comprises amplifying the nucleic acid molecule. In certain embodiments, the amplification occurs while the nucleic acid is in contact with the capture domain. In certain embodiments, the method further comprises sequencing the captured molecule. In certain embodiments, the method further comprises separating the nucleic acid molecule from the capture domain. In certain embodiments, the separating step comprises heat elution off of the capture domain. In certain embodiments, the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

[0031] In certain embodiments, the method further comprises an additional enrichment step. In certain embodiments, the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences. In certain embodiments, the additional enrichment step comprises hybrid capture. In certain embodiments, the additional enrichment step comprises using a nucleic acid binding protein.

[0032] In another aspect, the disclosure relates to a nucleic acid enrichment method. The method includes cutting a nucleic acid molecule that includes a target sequence to generate a single stranded overhang at a cut end of the molecule that includes the target; filling in the overhang with at least one labeled nucleotide; and enriching the molecule that includes the target by separating labeled molecules from unlabeled molecules.

[0033] In certain embodiments, the cutting step is performed by a nuclease. In certain embodiments, the nuclease is a CRISPR-Cas nuclease. In certain embodiments, the nuclease is a type V CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas9 or Cas12 nuclease. In certain embodiments, the nuclease is a Cas12a/Cpf1 nuclease. In certain embodiments, the nuclease is a MAD7 nuclease. In certain embodiments, the nuclease is a CasX nuclease. In certain embodiments, the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

[0034] In certain embodiments, the cutting step is performed at room temperature.

[0035] In certain embodiments, the overhang is filled in using a DNA polymerase. In certain embodiments, the DNA polymerase is DNA polymerase I. In certain embodiments, the DNA polymerase I consists of the Klenow fragment.

[0036] In certain embodiments, the label comprises biotin, digoxigenin, or a fluorophore. In certain embodiments, the capture domain comprises or is connected to a solid support. In certain embodiments, the solid support is a bead, a well, a tube, or a slide. In certain embodiments, the capture domain comprises streptavidin connected to the bead. In certain embodiments, the nucleic acid molecule is present in a nucleic acid sequencing library, and the method enriches target sequences of interest in the library.

[0037] In certain embodiments, the nucleic acid molecule was obtained from a nucleic acid sample from a subject. In certain embodiments, the nucleic acid sample is a plasma sample. In certain embodiments, the plasma sample is used directly in the nucleic acid enrichment method without prior enrichment or purification of the nucleic acid.

[0038] In certain embodiments, the nucleic acid sample comprises cell free DNA (cfDNA). In certain embodiments, cytosines in the cfDNA have been converted to uracils. In certain embodiments, the cfDNA has been treated with bisulfite. In certain embodiments, the method further comprises the step of converting methylated cytosines to uracils.

[0039] In certain embodiments, the method further comprises preparing a library before or after enriching the molecule that includes the target.

[0040] In certain embodiments, the method includes a wash step.

[0041] In certain embodiments, the method further comprises amplifying the nucleic acid molecule. In certain embodiments, the amplification occurs while the nucleic acid is in contact with the capture domain. In certain embodiments, the method further comprises sequencing the enriched molecule. In certain embodiments, the method further comprises separating the nucleic acid molecule from the capture domain. In certain embodiments, the separating step comprises heat elution off of the capture domain. In certain embodiments, the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

[0042] In certain embodiments, the method further comprises an additional enrichment step. In certain embodiments, the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences. In certain embodiments, the additional enrichment step comprises hybrid capture. In certain embodiments, the additional enrichment step comprises using a nucleic acid binding protein.

[0043] In another aspect, the disclosure relates to a method of producing a nucleic acid library enriched for regions of interest. The method includes cutting a plurality of nucleic acid molecules comprising regions of interest to generate single stranded overhangs at cut ends of the molecules that include the regions of interest; filling in each overhang with at least one labeled nucleotide; and enriching the molecules that include the regions of interest by contacting the labeled nucleotides in the molecules with capture domains.

[0044] In certain embodiments, the cutting step is performed by a nuclease. In certain embodiments, the nuclease is a CRISPR-Cas nuclease. In certain embodiments, the nuclease is a type II CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas9 nuclease. In certain embodiments, the nuclease is a type V CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas12 nuclease. In certain embodiments, the nuclease is a Cas12a/Cpf1 nuclease. In certain embodiments, the nuclease is a MAD7 nuclease. In certain embodiments, the nuclease is a CasX nuclease. In certain embodiments, the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecules that include the regions of interest.

[0045] In certain embodiments, the cutting step is performed at room temperature.

[0046] In certain embodiments, the overhangs are filled in using a DNA polymerase. In certain embodiments, the DNA polymerase is DNA polymerase I. In certain embodiments, the DNA polymerase I consists of the Klenow fragment.

[0047] In certain embodiments, the label comprises biotin, digoxigenin, or a fluorophore. In certain embodiments, the capture domains comprise or are connected to solid supports. In certain embodiments, the solid supports are beads, wells, tubes, or slides. In certain embodiments, the capture domains comprise streptavidin connected to beads. In certain embodiments, the method further comprises amplifying the nucleic acid molecules. In certain embodiments, the amplifying is performed with primers that comprise adapters to facilitate sequencing of the nucleic acid molecules.

[0048] In certain embodiments, the nucleic acid molecule was obtained from a nucleic acid sample from a subject. In certain embodiments, the nucleic acid sample is a plasma sample. In certain embodiments, the plasma sample is used directly in the nucleic acid enrichment method without prior enrichment or purification of the nucleic acid.

[0049] In certain embodiments, the nucleic acid sample comprises cell free DNA (cfDNA). In certain embodiments, cytosines in the cfDNA have been converted to uracils. In certain embodiments, the cfDNA has been treated with bisulfite. In certain embodiments, the method further comprises the step of converting methylated cytosines to uracils.

[0050] In certain embodiments, the method further comprises a wash step to remove nucleic acid molecules that do not include the regions of interest.

[0051] In certain embodiments, the method further comprises amplifying the nucleic acid molecule. In certain embodiments, the amplification occurs while the nucleic acid is in contact with the capture domain. In certain embodiments, the method further comprises separating the nucleic acid molecules from the capture domains. In certain embodiments, the separating step comprises heat elution off of the capture domain. In certain embodiments, the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

[0052] In certain embodiments, the method further comprises an additional enrichment step. In certain embodiments, the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences. In certain embodiments, the additional enrichment step comprises hybrid capture. In certain embodiments, the additional enrichment step comprises using a nucleic acid binding protein.

[0053] In another aspect, the disclosure relates to a method for producing a nucleic acid library enriched for regions of interest. The method includes obtaining a sample comprising a plurality of nucleic acids, wherein a subset of the plurality of nucleic acids comprise regions of interest; optionally converting methylated cytosines to uracils; adding nucleic acid adapters to the plurality of nucleic acids to form a nucleic acid library; cutting the subset of the plurality of nucleic acid molecules having regions of interest to generate single stranded overhangs at cut ends of the molecules that include the regions of interest; filling in each overhang with at least one labeled nucleotide; enriching the molecules that include the regions of interest by contacting the labeled nucleotides in the molecules with capture domains; and amplifying the molecules that include the regions of interest to form the nucleic acid library enriched for regions of interest.

[0054] In another aspect, the disclosure relates to a method for producing a nucleic acid library enriched for regions of interest. The method includes obtaining a sample comprising a plurality of nucleic acids, wherein a subset of the plurality of nucleic acids comprise regions of interest; cutting the subset of the plurality of nucleic acid molecules having regions of interest to generate single stranded overhangs at cut ends of the molecules that include the regions of interest; filling in each overhang with at least one labeled nucleotide; enriching the molecules that include the regions of interest by contacting the labeled nucleotides in the molecules with capture domains; removing the molecules that include the regions of interest from the capture domains; optionally converting methylated cytosines to uracils; and adding nucleic acid adapters to the plurality of nucleic acids to form the nucleic acid library enriched for regions of interest.

[0055] In certain embodiments, the cutting step is performed by a nuclease. In certain embodiments, the nuclease is a CRISPR-Cas nuclease. In certain embodiments, the nuclease is a type II CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas9 nuclease. In certain embodiments, the nuclease is a type V CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas12 nuclease. In certain embodiments, the nuclease is a Cas12a/Cpf1 nuclease. In certain embodiments, the nuclease is a MAD7 nuclease. In certain embodiments, the nuclease is a CasX nuclease. In certain embodiments, the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

[0056] In certain embodiments, the cutting step is performed at room temperature.

[0057] In certain embodiments, the overhang is filled in using a DNA polymerase. In certain embodiments, the DNA polymerase is DNA polymerase I. In certain embodiments, the DNA polymerase I consists of the Klenow fragment. In certain embodiments, the label comprises biotin or digoxigenin. In certain embodiments, the capture domain comprises avidin, streptavidin, or a DIG-binding protein.

[0058] In certain embodiments, the capture domain comprises or is connected to a solid support. In certain embodiments, the solid support is a bead, a well, a tube, or a slide. In certain embodiments, the capture domain comprises streptavidin connected to the bead. In another aspect, the disclosure relates to a nucleic acid library, produced by the methods described herein.

[0059] In certain embodiments, the method further comprises an additional enrichment step. In certain embodiments, the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences. In certain embodiments, the additional enrichment step comprises hybrid capture. In certain embodiments, the additional enrichment step comprises using a nucleic acid binding protein.

[0060] In another aspect, the disclosure relates to a kit comprising a nuclease that cuts a nucleic acid molecule including a target sequence to generate a single stranded overhang at a cut end of the molecule that includes the target; labeled dNTPs; DNA polymerase; and a capture moiety comprising a capture domain.

[0061] In certain embodiments, the nuclease is a CRISPR-Cas nuclease. In certain embodiments, the nuclease is a type II CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas9 nuclease. In certain embodiments, the nuclease is a type V CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas12 nuclease. In certain embodiments, the nuclease is a Cas12a/Cpf1 nuclease. In certain embodiments, the nuclease is a MAD7 nuclease. In certain embodiments, the nuclease is a CasX nuclease.

[0062] In another aspect, the disclosure relates to a nucleic acid enrichment method comprising the steps of (a) designing a first set of guide RNAs to bind a first set of target sequences for cleavage with a first nuclease, (b) designing a second set of guide RNAs to bind a second set of target sequences for cleavage with a second nuclease, (c) adding the first and second sets of guide sequences and the first and second nucleases to a nucleic acid comprising a plurality of target sequences, (d) generating single stranded overhangs at the cleavage sites in the first and second sets of target sequences, (e) filling in each overhang with at least one labeled nucleotide; and (f) enriching the target sequences by contacting at least one of the labeled nucleotides in the molecule with a capture domain.
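Steps (a) and (b) above can be illustrated with a minimal sketch that scans a target sequence for candidate spacers for two different nucleases. The PAM patterns used here (a 5' TTTV PAM for a Cas12a-type nuclease and a 3' NGG PAM for a Cas9-type nuclease), the 20-nucleotide spacer length, the top-strand-only scan, and the name `candidate_spacers` are assumptions made for illustration and are not taken from the disclosure.

```python
import re

# Assumed PAM conventions (illustrative only):
#   Cas12a-type: 5'-TTTV-3' PAM immediately upstream of a 20-nt protospacer
#   Cas9-type:   5'-NGG-3' PAM immediately downstream of a 20-nt protospacer
# Lookaheads are used so that overlapping sites are all reported.
CAS12A_SITE = re.compile(r"(?=(TTT[ACG])([ACGT]{20}))")
CAS9_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def candidate_spacers(target):
    """Return candidate spacers per nuclease, scanning the given strand only."""
    target = target.upper()
    return {
        "Cas12a": [m.group(2) for m in CAS12A_SITE.finditer(target)],
        "Cas9": [m.group(1) for m in CAS9_SITE.finditer(target)],
    }


if __name__ == "__main__":
    # Hypothetical target sequences, not sequences from the disclosure.
    print(candidate_spacers("TTTA" + "ACGT" * 5))
    print(candidate_spacers("GATTACAGATTACAGATTAC" + "AGG"))
```

Targets for which only one nuclease yields a usable site could then be assigned to the corresponding guide set, and both sets combined in a single reaction as in step (c).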

[0063] In certain embodiments, the first nuclease or the second nuclease is a CRISPR-Cas nuclease. In certain embodiments, the first nuclease or the second nuclease is a type II or a type V CRISPR-Cas nuclease. In certain embodiments, the first nuclease or the second nuclease is a Cas9, Cas12, or CasX nuclease. In certain embodiments, the first nuclease or the second nuclease is a Cas12a/Cpf1 nuclease.

[0064] In certain embodiments, the first nuclease or the second nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence, wherein the spacer sequence binds to the nucleic acid molecule that includes the target sequence.

[0065] In certain embodiments, the cutting step is performed at room temperature.

[0066] In certain embodiments, the overhang is filled in using a DNA polymerase. In certain embodiments, the DNA polymerase is DNA polymerase I. In certain embodiments, the DNA polymerase I consists of the Klenow fragment.

[0067] In certain embodiments, the label comprises biotin or digoxigenin. In certain embodiments, the capture domain comprises avidin, streptavidin, or a DIG-binding protein. In certain embodiments, the capture domain comprises or is connected to a solid support. In certain embodiments, the solid support is a bead, a well, a tube, or a slide. In certain embodiments, the capture domain comprises streptavidin connected to the bead. In certain embodiments, the nucleic acid molecule is present in a nucleic acid sequencing library, and the method enriches target sequences in the library. In certain embodiments, the nucleic acid molecule was obtained from a nucleic acid sample from a subject. In certain embodiments, the nucleic acid sample is a plasma sample. In certain embodiments, the plasma sample is used directly in the nucleic acid enrichment method without prior enrichment or purification of the nucleic acid.

[0068] In certain embodiments, the nucleic acid sample comprises cell free DNA (cfDNA). In certain embodiments, cytosines in the cfDNA have been converted to uracils. In certain embodiments, the cfDNA has been treated with bisulfite.

[0069] In certain embodiments, the method further comprises preparing a library before or after enriching the molecule that includes the target. In certain embodiments, the method further comprises the step of converting methylated cytosines to uracils.

[0070] In certain embodiments, the method further comprises a wash step to remove nucleic acid molecules that do not include the target.

[0071] In certain embodiments, the method further comprises amplifying the nucleic acid molecule. In certain embodiments, the amplification occurs while the nucleic acid is in contact with the capture domain.

[0072] In certain embodiments, the method further comprises sequencing the enriched molecule.

[0073] In certain embodiments, the method further comprises separating the nucleic acid molecule from the capture domain. In certain embodiments, the separating step is performed using heat elution, a chemical agent, mechanical disruption, or combinations thereof.

[0074] In certain embodiments, the method further comprises amplifying the nucleic acid after separation of the nucleic acid from the capture domain.

[0075] In certain embodiments, the method further comprises an additional enrichment step. In certain embodiments, the target sequence comprises a plurality of target sequences, and the enrichment step enriches a subset of the target sequences. In certain embodiments, the additional enrichment step comprises hybrid capture. In certain embodiments, the additional enrichment step comprises using a nucleic acid binding protein.

[0076] These and other aspects and features of the invention are described in the following detailed description and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0077] The foregoing and other objects, features and advantages of the invention will become apparent from the following description of preferred embodiments, as illustrated in the accompanying drawings. Like referenced elements identify common features in the corresponding drawings. The drawings are not necessarily to scale, with emphasis instead being placed on illustrating the principles of the present invention, in which:

[0078] FIGURE 1 is a schematic flowchart showing a method according to the disclosure for enriching target sequences (i.e., regions of interest (ROI)) from a nucleic acid library.

[0079] FIGURE 2 is a schematic flowchart showing a method according to the disclosure for enriching target sequences (i.e., regions of interest (ROI)) from a sample and constructing a nucleic acid library using the enriched target sequences.

[0080] FIGURE 3 is a schematic of a bisulfite conversion reaction.

[0081] FIGURE 4 provides electrophoresis results for an experiment testing whether biotinylated dNTPs could be incorporated into a nucleic acid comprising a target sequence and enriched using streptavidin beads. Bands representing a biotinylated target fragment bound to beads were seen in both the 1x and 5x polymerase (“Enzyme”) conditions and with anywhere from 10% to 100% biotinylated dNTPs. Three negative controls, “cut control” lacking polymerase enzyme and biotinylated dNTPs, “no bind control” lacking Cas12, crRNA, and polymerase enzyme, and “bind control” which contained a biotinylated amplicon, did not contain biotinylated target fragment. Accordingly, streptavidin beads were capable of binding to and isolating target fragments that had incorporated biotinylated dNTPs.

[0082] FIGURE 5 provides a flow chart showing an exemplary process overview for Cas12a positive enrichment of target sequences.

[0083] FIGURE 6 provides a schematic of the steps of the exemplary library creation method of Example 5.

[0084] FIGURE 7 shows the sequencing results of a library constructed in Example 5 using the methods of the disclosure.

[0085] FIGURE 8 shows the sequencing results of a library constructed in Example 6 using the methods of the disclosure. As shown, target CpG-4 within a 5-plex target was successfully enriched using the methods of the disclosure.

DETAILED DESCRIPTION

[0086] The disclosure relates to methods of enriching target sequences in a nucleic acid sample. The methods include, for example, cutting a nucleic acid molecule that includes a target sequence to form a single-stranded overhang, filling in the overhang with a label, and capturing the nucleic acid molecule that includes the target, thereby enriching the target sequence. The methods can be used, for example, to enrich target sequences prior to assembling a nucleic acid library or can be used to enrich target sequences in an existing library.

[0087] An exemplary method of enriching target sequences is shown in FIG. 1. A cell-free DNA (cfDNA) sample comprising methylated nucleotides that have been converted using bisulfite treatment is used to construct a nucleic acid library. sgRNAs complementary to target sequences are constructed, and the library is exposed to Cas12 and the sgRNAs. The sgRNAs direct Cas12 to the target sequence, where it cleaves the DNA, leaving an overhang. Biotinylated dNTPs are added with a polymerase (Klenow fragment) to fill in the nucleotides complementary to the overhang. Streptavidin beads bind to the biotinylated nucleotides of the target sequences, and any uncut, off-target sequences are washed away. A PCR amplification is then performed to amplify the enriched target sequences that have been bound to the bead.
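The cut, fill-in, and capture sequence described for FIG. 1 can be sketched in silico as follows. This is a toy model under stated assumptions, not the disclosed protocol: the TTTV PAM, the top-strand cut 18 bases after the PAM, a spacer of at least 18 nucleotides, and the names `Fragment`, `cut_and_fill`, and `capture` are all hypothetical choices made for illustration. Fill-in with biotinylated dNTPs is represented simply by a `labeled` flag on each cut product.

```python
import re
from dataclasses import dataclass

# Assumed Cas12a-like geometry (illustrative values, not from the disclosure).
PAM = "TTT[ACG]"   # 5' PAM on the strand shown
CUT_OFFSET = 18    # top-strand cut position, in bases after the PAM


@dataclass
class Fragment:
    seq: str
    labeled: bool = False  # True once a labeled nucleotide has been filled in


def cut_and_fill(library, spacer):
    """Cut fragments containing PAM + spacer and mark both cut products as labeled."""
    site = re.compile(f"({PAM}){re.escape(spacer)}")
    products = []
    for seq in library:
        match = site.search(seq)
        if match is None:
            # No target site: the fragment stays uncut and unlabeled.
            products.append(Fragment(seq))
            continue
        cut = match.end(1) + CUT_OFFSET
        # Both products carry an overhang, so both are modeled as labeled.
        products.append(Fragment(seq[:cut], labeled=True))
        products.append(Fragment(seq[cut:], labeled=True))
    return products


def capture(products):
    """Model bead capture plus wash: keep labeled products only."""
    return [frag for frag in products if frag.labeled]


if __name__ == "__main__":
    spacer = "CGTACGTACGTACGTACGTA"              # hypothetical spacer
    on_target = "AAA" + "TTTA" + spacer + "CGGG"
    off_target = "GGGGCCCCGGGGCCCC"
    for frag in capture(cut_and_fill([on_target, off_target], spacer)):
        print(frag.seq)
```

In this model the off-target fragment is never labeled, so it is absent from the captured set, which mirrors the wash step that removes uncut sequences.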

[0088] Another exemplary method of enriching target sequences is shown in FIG. 2. In this method, a cell-free DNA (cfDNA) sample is exposed to Cas12 and sgRNAs complementary to target sequences in a target region. The sgRNAs direct Cas12 to the target sequence, where it cleaves the DNA, leaving an overhang. Biotinylated dNTPs are added with a polymerase (Klenow fragment) to fill in the nucleotides complementary to the overhang. Streptavidin beads bind to the biotinylated nucleotides present in the target region, and any uncut, off-target sequences are washed away. The enriched target sequences are eluted off of the beads using heat treatment. The eluted target sequences are then treated with bisulfite to preserve methylation information. The bisulfite converted target sequences are used to construct a library that is enriched for target sequences.
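How bisulfite treatment preserves methylation information can be sketched as follows. The sketch relies on standard bisulfite chemistry, in which unmethylated cytosines are deaminated to uracil (read as T after amplification) while 5-methylcytosines are protected and still read as C. The function names and the single-strand, complete-conversion model are assumptions made for illustration.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Return the post-conversion read-out of one strand.

    `methylated` holds 0-based positions of methylated cytosines.
    Assumes complete conversion and models a single strand only.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            out.append("T")   # unmethylated C -> U, read as T
        else:
            out.append(base)  # methylated C and all other bases are unchanged
    return "".join(out)


def call_methylation(reference, converted):
    """Positions where a reference C is still read as C, i.e. inferred methylated."""
    return [
        i
        for i, (ref, obs) in enumerate(zip(reference.upper(), converted))
        if ref == "C" and obs == "C"
    ]


if __name__ == "__main__":
    reference = "ACGTCCGA"  # hypothetical sequence
    converted = bisulfite_convert(reference, methylated={1})
    print(converted, call_methylation(reference, converted))
```

Comparing the converted read against the unconverted reference recovers the methylated positions, which is why conversion can be performed either before library construction (FIG. 1) or after enrichment (FIG. 2).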

[0089] Unless otherwise defined herein, scientific and technical terms used in this application shall have the meanings that are commonly understood by those of ordinary skill in the art.

[0090] The practice of the present disclosure will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are within the skill of the art. Such techniques are explained fully in the literature, such as, Molecular Cloning: A Laboratory Manual, second edition (Sambrook et al., 1989) Cold Spring Harbor Press; Oligonucleotide Synthesis (M.J. Gait, ed., 1984); Methods in Molecular Biology, Humana Press; Cell Biology: A Laboratory Notebook (J.E. Cellis, ed., 1998) Academic Press; Animal Cell Culture (R.I. Freshney, ed., 1987); Introduction to Cell and Tissue Culture (J. P. Mather and P.E. Roberts, 1998) Plenum Press; Cell and Tissue Culture: Laboratory Procedures (A. Doyle, J.B. Griffiths, and D.G. Newell, eds., 1993-1998) J. Wiley and Sons; Methods in Enzymology (Academic Press, Inc.); Gene Transfer Vectors for Mammalian Cells (J.M. Miller and M.P. Calos, eds., 1987); Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987); PCR: The Polymerase Chain Reaction, (Mullis et al., eds., 1994); Sambrook and Russell, Molecular Cloning: A Laboratory Manual, 3rd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (2001); Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, NY (2002); Harlow and Lane Using Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1998); Coligan et al., Short Protocols in Protein Science, John Wiley & Sons, NY (2003); Short Protocols in Molecular Biology (Wiley and Sons, 1999).

[0091] Enzymatic reactions and purification techniques are performed according to manufacturer’s specifications, as commonly accomplished in the art or as described herein. The nomenclatures used in connection with, and the laboratory procedures and techniques of, analytical chemistry, biochemistry, immunology, molecular biology, synthetic organic chemistry, and medicinal and pharmaceutical chemistry described herein are those well-known and commonly used in the art. Standard techniques are used for chemical syntheses, and chemical analyses.

[0092] Throughout this specification and embodiments, the word “comprise,” or variations such as “comprises” or “comprising,” will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.

[0093] It is understood that wherever embodiments are described herein with the language “comprising,” otherwise analogous embodiments described in terms of “consisting of” and/or “consisting essentially of” are also provided.

[0094] Any example(s) following the term “e.g.” or “for example” is not meant to be exhaustive or limiting.

[0095] Unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular.

[0096] Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the disclosure are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors necessarily resulting from the standard deviation found in their respective testing measurements. Moreover, all ranges disclosed herein are to be understood to be inclusive of the numbers defining the range and to encompass any and all subranges subsumed therein. For example, a stated range of “1 to 10” should be considered to include any and all subranges between (and inclusive of) the minimum value of 1 and the maximum value of 10; that is, all subranges beginning with a minimum value of 1 or more, e.g., 1 to 6.1, and ending with a maximum value of 10 or less, e.g., 5.5 to 10.

[0097] Where aspects or embodiments of the disclosure are described in terms of a Markush group or other grouping of alternatives, the present disclosure encompasses not only the entire group listed as a whole, but each member of the group individually and all possible subgroups of the main group, but also the main group absent one or more of the group members. The present disclosure also envisages the explicit exclusion of one or more of any of the group members in an embodiment of the disclosure.

[0098] Exemplary methods and materials are described herein, although methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present disclosure. The materials, methods, and examples are illustrative only and not intended to be limiting.

I. Definitions

[0099] The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.

[00100] As used herein, the term “about” or “approximately” can mean within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which can depend in part on how the value is measured or determined, e.g., the limitations of the measurement system. For example, “about” can mean within 1 or more than 1 standard deviation, per the practice in the art. “About” can mean a range of ±20%, ±10%, ±5%, or ±1% of a given value. The term “about” or “approximately” can mean within an order of magnitude, within 5-fold, or within 2-fold, of a value. Where a particular value is described in the application and claims, unless otherwise stated the term “about” meaning within an acceptable error range for the particular value can be assumed. The term “about” can have the meaning as commonly understood by one of ordinary skill in the art. The term “about” can refer to ±10%. The term “about” can refer to ±5%.

[00101] It should be understood that the expression of “at least one of” includes individually each of the recited objects after the expression and the various combinations of two or more of the recited objects unless otherwise understood from the context and use. The expression “and/or” in connection with three or more recited objects should be understood to have the same meaning unless otherwise understood from the context.

[00102] As used herein, the term “biological sample,” or “sample” refers to any sample taken from a subject, which can reflect a biological state associated with the subject, and that includes cell free DNA. A biological sample can take any of a variety of forms, such as a liquid biopsy (e.g., blood, urine, stool, saliva, or mucus), or a tissue biopsy, or other solid biopsy. Examples of biological samples include, but are not limited to, blood, whole blood, plasma, serum, urine, cerebrospinal fluid, fecal, saliva, sweat, tears, pleural fluid, pericardial fluid, or peritoneal fluid of the subject. A biological sample can include any tissue or material derived from a living or dead subject. A biological sample can be a cell-free sample. A biological sample can comprise a nucleic acid (e.g., DNA or RNA) or a fragment thereof. The term “nucleic acid” can refer to deoxyribonucleic acid (DNA), ribonucleic acid (RNA) or any hybrid or fragment thereof. The nucleic acid in the sample can be a cell-free nucleic acid. A sample can be a liquid sample or a solid sample (e.g., a cell or tissue sample). A biological sample can be a bodily fluid, such as blood, plasma, serum, urine, vaginal fluid, fluid from a hydrocele (e.g., of the testis), vaginal flushing fluids, pleural fluid, ascitic fluid, cerebrospinal fluid, saliva, sweat, tears, sputum, bronchoalveolar lavage fluid, discharge fluid from the nipple, aspiration fluid from different parts of the body (e.g., thyroid, breast), etc. A biological sample can be a stool sample. In various embodiments, the majority of DNA in a biological sample that has been enriched for cell-free DNA (e.g., a plasma sample obtained via a centrifugation protocol) can be cell-free (e.g., greater than 50%, 60%, 70%, 80%, 90%, 95%, or 99% of the DNA can be cell-free). A biological sample can be treated to physically disrupt tissue or cell structure (e.g., centrifugation and/or cell lysis), thus releasing intracellular components into a solution which can further contain enzymes, buffers, salts, detergents, and the like which can be used to prepare the sample for analysis.

[00103] As used herein, the terms “nucleic acid” and “nucleic acid molecule” are used interchangeably. The terms refer to nucleic acids of any composition form, such as deoxyribonucleic acid (DNA, e.g., complementary DNA (cDNA), genomic DNA (gDNA) and the like), and/or DNA analogs (e.g., containing base analogs, sugar analogs and/or a non-native backbone and the like), all of which can be in single- or double-stranded form. Unless otherwise limited, a nucleic acid can comprise known analogs of natural nucleotides, some of which can function in a similar manner as naturally occurring nucleotides. A nucleic acid can be in any form useful for conducting processes herein (e.g., linear, circular, supercoiled, single-stranded, double-stranded and the like). A nucleic acid in some embodiments can be from a single chromosome or fragment thereof (e.g., a nucleic acid sample may be from one chromosome of a sample obtained from a diploid organism). In certain embodiments nucleic acids comprise nucleosomes, fragments or parts of nucleosomes or nucleosome-like structures. Nucleic acids can comprise protein (e.g., histones, DNA binding proteins, and the like). Nucleic acids analyzed by processes described herein can be substantially isolated and are not substantially associated with protein or other molecules. Nucleic acids can also include derivatives, variants and analogs of DNA synthesized, replicated or amplified from single-stranded (“sense” or “antisense,” “plus” strand or “minus” strand, “forward” reading frame or “reverse” reading frame) and double-stranded polynucleotides. Deoxyribonucleotides can include deoxyadenosine, deoxycytidine, deoxyguanosine and deoxythymidine. A nucleic acid may be prepared using a nucleic acid obtained from a subject as a template.

[00104] As used herein, the terms “template nucleic acid” and “template nucleic acid molecule(s)” are used interchangeably. The terms refer to nucleic acid that has been obtained from a sample and processed to form an immortalized library. The template nucleic acid can be nucleic acid obtained directly from the sample, or nucleic acid that is derived from that obtained directly from the sample. Examples of nucleic acid derived from a sample include DNA that has been reverse-transcribed from RNA obtained directly from a sample, or DNA that has been amplified from DNA obtained directly from a sample, for example, by PCR.

[00105] As used herein, the term “cell-free nucleic acids” refers to nucleic acid molecules that can be found outside cells, in bodily fluids such as blood, whole blood, plasma, serum, urine, cerebrospinal fluid, fecal, saliva, sweat, tears, pleural fluid, pericardial fluid, or peritoneal fluid of a subject. Cell-free nucleic acids originate from one or more healthy cells and/or from one or more cancer cells, or from non-human sources such as bacteria, fungi, and viruses. Examples of the cell-free nucleic acids include but are not limited to cell-free DNA (“cfDNA”), including mitochondrial DNA or genomic DNA, and cell-free RNA. In certain embodiments herein, instruments for assessing the quality of the cell-free nucleic acids, such as the TapeStation System from Agilent Technologies (Santa Clara, CA) can be used. Concentrating low-abundance cfDNA can be accomplished, for example, using a Qubit™ Fluorometer from Thermo Fisher Scientific (Waltham, MA).

[00106] As used herein, the term “methylation” refers to a modification of a nucleic acid where a hydrogen atom on the pyrimidine ring of a cytosine base is converted to a methyl group, forming 5-methylcytosine. Methylation can occur at dinucleotides of cytosine and guanine referred to herein as “CpG sites”. Methylation of cytosine can occur in cytosines in other sequence contexts, for example, 5'-CHG-3' and 5'-CHH-3', where H is adenine, cytosine or thymine. Cytosine methylation can also be in the form of 5-hydroxymethylcytosine. Methylation of DNA can include methylation of non-cytosine nucleotides, such as N6-methyladenine. Anomalous cfDNA methylation can be identified as hypermethylation or hypomethylation, both of which may be indicative of cancer status. As is well known in the art, DNA methylation anomalies (compared to healthy controls) can cause different effects, which may contribute to cancer.
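The three sequence contexts named above (CpG, CHG, and CHH, where H is adenine, cytosine or thymine) can be assigned to each cytosine of a strand with a short scan. The following is an illustrative sketch only; the function name `cytosine_contexts` and the example sequence are hypothetical, and the code reads a single strand in the 5' to 3' direction without considering the complementary strand.

```python
H_BASES = {"A", "C", "T"}  # IUPAC "H": any base except G


def cytosine_contexts(seq):
    """Return {position: context} for each cytosine on the given strand.

    Context is "CpG", "CHG", or "CHH". Cytosines that cannot be classified
    (too close to the 3' end, or followed by an ambiguous base) are
    reported as "unknown".
    """
    seq = seq.upper()
    contexts = {}
    for i, base in enumerate(seq):
        if base != "C":
            continue
        nxt = seq[i + 1] if i + 1 < len(seq) else None
        nxt2 = seq[i + 2] if i + 2 < len(seq) else None
        if nxt == "G":
            contexts[i] = "CpG"
        elif nxt in H_BASES and nxt2 == "G":
            contexts[i] = "CHG"
        elif nxt in H_BASES and nxt2 in H_BASES:
            contexts[i] = "CHH"
        else:
            contexts[i] = "unknown"
    return contexts


print(cytosine_contexts("CGCAGCTT"))
```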

[00107] As used herein the term “methylation index” for each genomic site (e.g., a CpG site, a region of DNA where a cytosine nucleotide is followed by a guanine nucleotide in the linear sequence of bases along its 5'→3' direction) can refer to the proportion of sequence reads showing methylation at the site over the total number of reads covering that site. The “methylation density” of a region can be the number of reads at sites within a region showing methylation divided by the total number of reads covering the sites in the region. The sites can have specific characteristics (e.g., the sites can be CpG sites). The “CpG methylation density” of a region can be the number of reads showing CpG methylation divided by the total number of reads covering CpG sites in the region (e.g., a particular CpG site, CpG sites within a CpG island, or a larger region). For example, the methylation density for each 100-kb bin in the human genome can be determined from the total number of unconverted cytosines (which can correspond to methylated cytosine) at CpG sites as a proportion of all CpG sites covered by sequence reads mapped to the 100-kb region. In some embodiments, this analysis is performed for other bin sizes, e.g., 50-kb or 1-Mb, etc. In some embodiments, a region is an entire genome or a chromosome or part of a chromosome (e.g., a chromosomal arm). A methylation index of a CpG site can be the same as the methylation density for a region when the region includes that CpG site. The “proportion of methylated cytosines” can refer to the number of cytosine sites, “C's,” that are shown to be methylated (for example unconverted after bisulfite conversion) over the total number of analyzed cytosine residues, e.g., including cytosines outside of the CpG context, in the region. The methylation index, methylation density and proportion of methylated cytosines are examples of “methylation levels.”
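The methylation index and methylation density defined above reduce to simple ratios of read counts. The following is a minimal sketch under the assumption that each site has already been summarized as a pair (reads showing methylation, total reads covering the site); the function names and the example counts are hypothetical.

```python
def methylation_index(methylated_reads, total_reads):
    """Proportion of reads showing methylation at a single site."""
    if total_reads == 0:
        raise ValueError("site has no read coverage")
    return methylated_reads / total_reads


def methylation_density(site_counts):
    """Methylated reads over total reads, pooled across all sites in a region.

    site_counts: iterable of (methylated_reads, total_reads), one per site.
    """
    counts = list(site_counts)
    methylated = sum(m for m, _ in counts)
    total = sum(t for _, t in counts)
    if total == 0:
        raise ValueError("region has no read coverage")
    return methylated / total


# Hypothetical region with three CpG sites.
region = [(8, 10), (2, 10), (5, 5)]
per_site = [methylation_index(m, t) for m, t in region]
density = methylation_density(region)
print(per_site, density)
```

When a region contains only one covered site, the density and the index of that site coincide, consistent with the definition above.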

[00108] Certain portions of a genome comprise regions with a high frequency of CpG sites. A CpG site is a portion of a genome that has cytosine and guanine separated by only one phosphate group and is often denoted as “5'-C-phosphate-G-3'”, or “CpG” for short. Regions with a high frequency of CpG sites are commonly referred to as “CG islands” or “CGIs”. It has been found that certain CGIs and certain features of certain CGIs in tumor cells tend to be different from the same CGIs or features of the CGIs in healthy cells. Such CGIs and features of the genome are referred to herein as “cancer informative CGIs”, which are defined and described in more detail below. An “informative CpG” can be specified by reference to a specific CpG site, or to a collection of one or more CpG sites by reference to a CG island that contains the collection. These cancer informative CGIs tend to have methylation patterns in tumor cells that are different from the methylation patterns in healthy cells. DNA fragments from other CGIs may not express such differences.

[00109] As used herein, the term “methylation profile” (also called methylation status) can include information related to DNA methylation for a region. Information related to DNA methylation can include a methylation index of a CpG site, a methylation density of CpG sites in a region, a distribution of CpG sites over a contiguous region, a pattern or level of methylation for each individual CpG site within a region that contains more than one CpG site, and non-CpG methylation. A methylation profile of a substantial part of the genome can be considered equivalent to the methylome. “DNA methylation” in mammalian genomes can refer to the addition of a methyl group to position 5 of the heterocyclic ring of cytosine (e.g., to produce 5-methylcytosine) among CpG dinucleotides. Methylation of cytosine can occur in cytosines in other sequence contexts, for example, 5'-CHG-3' and 5'-CHH-3', where H is adenine, cytosine or thymine. Cytosine methylation can also be in the form of 5-hydroxymethylcytosine. Methylation of DNA can include methylation of non-cytosine nucleotides, such as N6-methyladenine.

[00110] The term “epitype” or “nucleic acid epitype” refers to a region of nucleic acid (i.e., DNA or RNA) containing an epigenetic variation. For example, the epigenetic variation could be methylation or non-methylation of one or more nucleotides in a region of nucleic acid. For instance, in some embodiments the nucleotide that could be methylated or non-methylated may be a cytidine, e.g., at a CpG site (e.g., the nucleotide could be 5-methylcytidine or cytidine). Exemplary CpG sites may be found in, for example, CpG islands (CGIs) shown in TABLES 1-4. CpG islands (CGIs) may be regions having a length greater than 200 bp, a GC content greater than 50% and a ratio of observed to expected CpG greater than 0.6. CpG islands are often found in promoter regions, where methylation is associated with transcriptional repression. Generally, a nucleic acid epitype containing one or more CpG sites may have a methylation pattern, such as any of fully non-methylated (e.g., none of the CpG sites in the epitype are methylated), partially methylated (e.g., at least one but not all of the CpG sites in the epitype are methylated), or fully methylated (e.g., all of the CpG sites in the epitype are methylated). In other embodiments, the nucleotide that could be methylated or non-methylated may be adenosine (e.g., the nucleotide could be N6-methyladenosine or adenosine).
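The CpG island criteria and the three methylation patterns described in this paragraph can be checked programmatically. The following is a hedged sketch: the thresholds (length greater than 200 bp, GC content greater than 50%, observed-to-expected CpG ratio greater than 0.6) come from the text, while the formula for the expected CpG count, (number of C × number of G) / length, is an assumption based on a commonly used convention and is not stated in this document. All function names are hypothetical.

```python
def cpg_island_metrics(seq):
    """Return (length, GC fraction, observed/expected CpG ratio)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0, 0.0, 0.0
    c, g = seq.count("C"), seq.count("G")
    observed = seq.count("CG")
    expected = c * g / n  # assumed definition of the expected CpG count
    ratio = observed / expected if expected else 0.0
    return n, (c + g) / n, ratio


def looks_like_cpg_island(seq):
    """Apply the three thresholds given in the text."""
    length, gc_fraction, obs_exp = cpg_island_metrics(seq)
    return length > 200 and gc_fraction > 0.5 and obs_exp > 0.6


def classify_methylation_pattern(cpg_states):
    """cpg_states: one boolean per CpG site in the epitype (True = methylated)."""
    states = list(cpg_states)
    if not states:
        raise ValueError("epitype contains no CpG sites")
    if all(states):
        return "fully methylated"
    if not any(states):
        return "fully non-methylated"
    return "partially methylated"


print(looks_like_cpg_island("CG" * 101))
print(classify_methylation_pattern([True, False, True]))
```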

[00111] As used herein, the term “amplifying” means performing an amplification reaction. In one aspect, an amplification reaction is “template-driven” in that base pairing of reactants, either nucleotides or oligonucleotides, have complements in a template polynucleotide that are required for the creation of reaction products. In one aspect, template-driven reactions are primer extensions with a nucleic acid polymerase, or oligonucleotide ligations with a nucleic acid ligase. Such reactions include, but are not limited to, polymerase chain reactions (PCRs), linear polymerase reactions, nucleic acid sequence-based amplification (NASBAs), rolling circle amplifications, and the like, disclosed in the following references, each of which is incorporated by reference herein in its entirety: Mullis et al., U.S. Pat. Nos. 4,683,195; 4,965,188; 4,683,202; 4,800,159 (PCR); Gelfand et al., U.S. Pat. No. 5,210,015 (real-time PCR with “taqman” probes); Wittwer et al., U.S. Pat. No. 6,174,670; Kacian et al., U.S. Pat. No. 5,399,491 (“NASBA”); Lizardi, U.S. Pat. No. 5,854,033; Aono et al., Japanese patent publ. JP 4-262799 (rolling circle amplification); and the like. In one aspect, the amplification reaction is PCR. An amplification reaction may be a “real-time” amplification if a detection chemistry is available that permits a reaction product to be measured as the amplification reaction progresses, e.g., “real-time PCR”, or “real-time NASBA” as described in Leone et al., Nucleic Acids Research, 26: 2150-2155 (1998), and like references.

[00112] A “reaction mixture” means a solution containing all the necessary reactants for performing a reaction, which may include, but is not limited to, buffering agents to maintain pH at a selected level during a reaction, salts, co-factors, scavengers, and the like.

[00113] The terms “fragment” or “segment”, as used interchangeably herein, refer to a portion of a larger polynucleotide molecule. A polynucleotide, for example, can be broken up, or fragmented into, a plurality of segments. Various methods of fragmenting nucleic acid are well known in the art. These methods may be, for example, either chemical or physical or enzymatic in nature. Enzymatic fragmentation may include partial degradation with a DNase; partial depurination with acid; the use of restriction enzymes; intron-encoded endonucleases; DNA-based cleavage methods, such as triplex and hybrid formation methods, that rely on the specific hybridization of a nucleic acid segment to localize a cleavage agent to a specific location in the nucleic acid molecule; or other enzymes or compounds which cleave a polynucleotide at known or unknown locations. Physical fragmentation methods may involve subjecting a polynucleotide to a high shear rate. High shear rates may be produced, for example, by moving DNA through a chamber or channel with pits or spikes, or forcing a DNA sample through a restricted size flow passage, e.g., an aperture having a cross sectional dimension in the micron or submicron range. Other physical methods include sonication and nebulization. Combinations of physical and chemical fragmentation methods may likewise be employed, such as fragmentation by heat and ion-mediated hydrolysis. See, e.g., Sambrook et al., “Molecular Cloning: A Laboratory Manual,” 3rd Ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2001) (“Sambrook et al.) which is incorporated herein by reference for all purposes. These methods can be optimized to digest a nucleic acid into fragments of a selected size range.

[00114] The terms “polymerase chain reaction” or “PCR”, as used interchangeably herein, mean a reaction for the in vitro amplification of specific DNA sequences by the simultaneous primer extension of complementary strands of DNA. In other words, PCR is a reaction for making multiple copies or replicates of a target nucleic acid flanked by primer binding sites, such reaction comprising one or more repetitions of the following steps: (i) denaturing the target nucleic acid, (ii) annealing primers to the primer binding sites, and (iii) extending the primers by a nucleic acid polymerase in the presence of nucleoside triphosphates. Usually, the reaction is cycled through different temperatures optimized for each step in a thermal cycler instrument. Particular temperatures, durations at each step, and rates of change between steps depend on many factors that are well-known to those of ordinary skill in the art, e.g., exemplified by the following references: McPherson et al., editors, PCR: A Practical Approach and PCR2: A Practical Approach (IRL Press, Oxford, 1991 and 1995, respectively). For example, in a conventional PCR using Taq DNA polymerase, a double stranded target nucleic acid may be denatured at a temperature >90° C, primers annealed at a temperature in the range 50-75° C, and primers extended at a temperature in the range 72-78° C. The term “PCR” encompasses derivative forms of the reaction, including, but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplexed PCR, and the like. The particular format of PCR being employed is discernible by one skilled in the art from the context of an application. Reaction volumes can range from a few hundred nanoliters, e.g., 200 nL, to a few hundred μL, e.g., 200 μL.
“Reverse transcription PCR,” or “RT-PCR,” means a PCR that is preceded by a reverse transcription reaction that converts a target RNA to a complementary single stranded DNA, which is then amplified, an example of which is described in Tecott et al., U.S. Pat. No. 5,168,038, the disclosure of which is incorporated herein by reference in its entirety. “Real-time PCR” means a PCR for which the amount of reaction product, i.e., amplicon, is monitored as the reaction proceeds. There are many forms of real-time PCR that differ mainly in the detection chemistries used for monitoring the reaction product, e.g., Gelfand et al., U.S. Pat. No. 5,210,015 (“taqman”); Wittwer et al., U.S. Pat. Nos. 6,174,670 and 6,569,627 (intercalating dyes); Tyagi et al., U.S. Pat. No. 5,925,517 (molecular beacons); the disclosures of which are hereby incorporated by reference herein in their entireties. Detection chemistries for real-time PCR are reviewed in Mackay et al., Nucleic Acids Research, 30: 1292-1305 (2002), which is also incorporated herein by reference. “Nested PCR” means a two-stage PCR wherein the amplicon of a first PCR becomes the sample for a second PCR using a new set of primers, at least one of which binds to an interior location of the first amplicon. As used herein, “initial primers” in reference to a nested amplification reaction mean the primers used to generate a first amplicon, and “secondary primers” mean the one or more primers used to generate a second, or nested, amplicon. “Asymmetric PCR” means a PCR wherein one of the two primers employed is in great excess concentration so that the reaction is primarily a linear amplification in which one of the two strands of a target nucleic acid is preferentially copied. The excess concentration of asymmetric PCR primers may be expressed as a concentration ratio. Typical ratios are in the range of from 10 to 100. “Multiplexed PCR” means a PCR wherein multiple target sequences (or a single target sequence and one or more reference sequences) are simultaneously amplified in the same reaction mixture, e.g., Bernard et al., Anal. Biochem., 273: 221-228 (1999) (two-color real-time PCR). Usually, distinct sets of primers are employed for each sequence being amplified.
Typically, the number of target sequences in a multiplex PCR is in the range of from 2 to 50, or from 2 to 40, or from 2 to 30. “Quantitative PCR” means a PCR designed to measure the abundance of one or more specific target sequences in a sample or specimen. Quantitative PCR includes both absolute quantitation and relative quantitation of such target sequences. Quantitative measurements are made using one or more reference sequences or internal standards that may be assayed separately or together with a target sequence. The reference sequence may be endogenous or exogenous to a sample or specimen, and in the latter case, may comprise one or more competitor templates. Typical endogenous reference sequences include segments of transcripts of the following genes: β-actin, GAPDH, β2-microglobulin, ribosomal RNA, and the like. Techniques for quantitative PCR are well-known to those of ordinary skill in the art, as exemplified in the following references, which are incorporated by reference herein in their entireties: Freeman et al., Biotechniques, 26: 112-126 (1999); Becker-Andre et al., Nucleic Acids Research, 17: 9437-9447 (1989); Zimmerman et al., Biotechniques, 21: 268-279 (1996); Diviacco et al., Gene, 122: 3013-3020 (1992); and Becker-Andre et al., Nucleic Acids Research, 17: 9437-9446 (1989).

[00115] The term “primer” as used herein means an oligonucleotide, either natural or synthetic, that is capable, upon forming a duplex with a polynucleotide template, of acting as a point of initiation of nucleic acid synthesis and being extended from its 3' end along the template so that an extended duplex is formed. Extension of a primer is usually carried out with a nucleic acid polymerase, such as a DNA or RNA polymerase. The sequence of nucleotides added in the extension process is determined by the sequence of the template polynucleotide. Usually, primers are extended by a DNA polymerase. Primers usually have a length in the range of from 14 to 40 nucleotides, or in the range of from 18 to 36 nucleotides. Primers are employed in a variety of nucleic amplification reactions, for example, linear amplification reactions using a single primer, or polymerase chain reactions, employing two or more primers. Guidance for selecting the lengths and sequences of primers for particular applications is well known to those of ordinary skill in the art, as evidenced by the following reference that is incorporated by reference herein in its entirety: Dieffenbach, editor, PCR Primer: A Laboratory Manual, 2nd Edition (Cold Spring Harbor Press, New York, 2003).

[00116] The terms “unique identifier”, “unique sequence tag”, “sequence tag”, “tag” or “barcode”, as used interchangeably herein, refer to an oligonucleotide that is attached to a polynucleotide or template molecule and is used to identify and/or track the polynucleotide or template in a reaction or a series of reactions. A unique identifier may be attached to the 3'- or 5'-end of a polynucleotide or template, or it may be inserted into the interior of such polynucleotide or template to form a linear conjugate, sometimes referred to herein as a “tagged polynucleotide,” or “tagged template,” or the like. A unique identifier may vary widely in size and compositions; the following references, which are incorporated herein by reference in their entireties, provide guidance for selecting sets of unique identifiers appropriate for particular embodiments: Brenner, U.S. Pat. No. 5,635,400; Brenner and Macevicz, U.S. Pat. No. 7,537,897; Brenner et al., Proc. Natl. Acad. Sci., 97: 1665-1670 (2000); Church et al., European patent publication 0 303 459; Shoemaker et al., Nature Genetics, 14: 450-456 (1996); Morris et al., European patent publication 0799897A1; Wallace, U.S. Pat. No. 5,981,179; and the like. Lengths and compositions of unique identifiers can vary widely, and the selection of particular lengths and/or compositions depends on several factors including, without limitation, how unique identifiers are used to generate a readout, e.g., via a hybridization reaction or via an enzymatic reaction, such as sequencing; whether they are labeled, e.g., with a fluorescent dye or the like; the number of distinguishable oligonucleotide identifiers required to unambiguously identify a set of polynucleotides, and the like, and how different the identifiers of a particular set must be in order to ensure reliable identification, e.g., freedom from cross hybridization or misidentification from sequencing errors. In one aspect, unique identifiers can each have a length within a range of from about 2 to about 36 nucleotides, or from about 4 to about 30 nucleotides, or from about 8 to about 20 nucleotides, or from about 6 to about 10 nucleotides. In one aspect, sets of unique identifiers are used, wherein each unique identifier of a set has a unique nucleotide sequence that differs from that of every other tag of the same set by at least two bases; in another aspect, sets of unique identifiers are used wherein the sequence of each unique identifier of a set differs from that of every other unique identifier of the same set by at least three bases.
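The requirement that every tag in a set differ from every other tag by at least two (or three) bases can be verified by computing pairwise Hamming distances. The following is a minimal sketch that assumes all tags in a set have the same length and that only substitutions are considered; the function names and the example tags are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("tags must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags):
    """Smallest Hamming distance between any two tags in the set."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


def is_valid_tag_set(tags, min_distance=2):
    """True if every pair of tags differs by at least min_distance bases."""
    return min_pairwise_distance(tags) >= min_distance


# Hypothetical 6-nucleotide tag set.
tags = ["ACGTAC", "TGCATG", "ACGGTC"]
print(min_pairwise_distance(tags), is_valid_tag_set(tags, min_distance=3))
```

A minimum distance of two allows a single substitution error to be detected; a minimum distance of three additionally allows it to be corrected to the nearest tag.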

[00117] Aspects of the invention involve the use of unique identifiers. Unique identifiers in accordance with embodiments of the invention can serve many functions. For example, unique sequence tags can include molecular barcode sequences, unique molecular identifier (UMI) sequences, or index sequences. In one embodiment, unique sequence tags (e.g., barcode or index sequences) can be used to identify DNA sequences originating from a common source such as a sample type, tissue, subject, or individual. In accordance with one embodiment, barcodes or index sequences can be used for multiplex sequencing. In one embodiment, unique sequence tags (e.g., unique molecular identifiers (UMIs)) can be used to identify unique nucleic acid sequences from a mixed nucleic acid sample. For example, differing unique molecular identifiers (e.g., UMIs) can be used to differentiate ssDNA molecules, dsDNA molecules, or damaged molecules (e.g., nicked dsDNA) contained in a cfDNA sample. In another embodiment, unique molecular identifiers (e.g., UMIs) can be used to reduce amplification bias, which is the asymmetric amplification of different targets due to differences in nucleic acid composition (e.g., high GC content). The unique molecular identifiers (UMIs) can be used to discriminate between nucleic acid mutations that arise during amplification. The unique sequence tags can be present in a multi-functional nucleic acid adapter, which adapter can comprise both a unique sequence tag and a universal priming site. In some embodiments, unique sequence tags can be greater than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18 nucleotides in length.
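One common way UMIs are used to reduce amplification bias is to collapse reads that share a UMI and mapping coordinates into a single original molecule. The following is an illustrative sketch only, not the method of this disclosure: it assumes reads have already been aligned and are represented as hypothetical (umi, start, end) tuples, and it groups by exact match, whereas practical pipelines typically also tolerate sequencing errors within the UMI.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Group reads into families that share a UMI and fragment coordinates.

    reads: iterable of (umi, start, end) tuples.
    Returns {(umi, start, end): number of reads in that family}.
    """
    families = defaultdict(int)
    for umi, start, end in reads:
        families[(umi, start, end)] += 1
    return dict(families)


# Hypothetical reads: three PCR duplicates of one molecule plus two others.
reads = [
    ("AACG", 100, 250),
    ("AACG", 100, 250),
    ("AACG", 100, 250),
    ("TTGC", 100, 250),
    ("AACG", 300, 420),
]
families = count_unique_molecules(reads)
print(len(families), "unique molecules from", len(reads), "reads")
```

Counting families rather than raw reads means a molecule that amplified efficiently contributes once, the same as one that amplified poorly.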

[00118] In one embodiment, ssDNA molecules in a mixture of dsDNA and ssDNA molecules can be tagged with unique sequence tags (e.g., ssDNA-specific tags, barcodes or UMIs) using an ssDNA ligation protocol and converted to dsDNA prior to preparation of a combined cfDNA library.

[00119] In another embodiment, dsDNA molecules in a mixture of dsDNA and ssDNA molecules can be tagged with unique molecular identifiers (e.g., UMIs) in a dsDNA ligation protocol using Y-shaped sequencing adapters and then ssDNA molecules can be tagged with unique identifiers (e.g., barcode or unique UMI) and converted to dsDNA.

[00120] In some embodiments, the methods of the invention involve differential tagging of populations of cfDNA molecules (e.g., dsDNA molecules, ssDNA molecules, and nicked dsDNA molecules) in a sample with unique sequence tags to distinguish sequence information derived from one population of cfDNA molecules (e.g., dsDNA molecules) from sequence information derived from another population of cfDNA molecules (e.g., ssDNA molecules). Analysis of all populations of cfDNA molecules (e.g., dsDNA molecules, ssDNA molecules, and nicked dsDNA molecules) may increase the sensitivity of certain protocols, for example, a cancer screening protocol. Without being bound by theory, it is believed that ssDNA molecules and/or nicked dsDNA may provide additional valuable insight for cancer detection and screening from a cfDNA sample, and/or may be more representative of tumor content in a cfDNA sample.

[00121] In one embodiment, ssDNA molecules in a mixture of dsDNA and ssDNA molecules can be tagged with unique sequence tags (e.g., ssDNA-specific tags, barcodes or UMIs) using an ssDNA ligation protocol and converted to dsDNA prior to preparation of a combined cfDNA library.

[00122] In another embodiment, dsDNA molecules in a mixture of dsDNA and ssDNA molecules can be tagged with unique sequence tags (e.g., UMIs) in a dsDNA ligation protocol using Y-shaped sequencing adapters (also referred to herein as “Y adapters”) and then ssDNA molecules can be tagged with unique sequence tags (e.g., barcode or unique UMI) and converted to dsDNA.

[00123] In one embodiment, the incorporated unique sequence tags and ssDNA-specific tag can be used to distinguish sequencing reads as being originally derived from dsDNA or ssDNA in a cfDNA sample.

[00124] In another embodiment, the incorporated unique sequence tags (e.g., UMIs) and ssDNA-specific tags (e.g., barcodes or UMIs) can be used to obtain fragment size information and genome position associated with sequencing reads from nicked dsDNA fragments in a cfDNA sample.

[00125] In yet another embodiment, the incorporated unique sequence tags (e.g., UMIs) and ssDNA-specific tags (e.g., barcodes or UMIs) are used to reduce error introduced by amplification, library preparation, and/or sequencing.

[00126] As used herein, the term “sensitivity” refers to the ability of a diagnostic assay to correctly identify subjects with a condition of interest. As used herein, the term “specificity” refers to the ability of a diagnostic assay to correctly identify subjects without a condition of interest.
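The two definitions above correspond to the standard ratios computed from a confusion matrix. The following minimal sketch shows the arithmetic; the function names and the example counts are hypothetical and are not results from the disclosure.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of subjects with the condition that the assay correctly identifies."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of subjects without the condition that the assay correctly identifies."""
    return true_neg / (true_neg + false_pos)


# Made-up counts used only to show the calculation
sens = sensitivity(true_pos=90, false_neg=10)   # 90 of 100 affected subjects detected
spec = specificity(true_neg=950, false_pos=50)  # 950 of 1000 unaffected subjects cleared
```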

[00127] As used herein, the term “subject” refers to any living or non-living organism, including but not limited to a human (e.g., a male human, female human, fetus, pregnant female, child, or the like), a non-human animal, a plant, a bacterium, a fungus or a protist. Any human or non-human animal can serve as a subject, including but not limited to mammal, reptile, avian, amphibian, fish, ungulate, ruminant, bovine (e.g., cattle), equine (e.g., horse), caprine and ovine (e.g., sheep, goat), swine (e.g., pig), camelid (e.g., camel, llama, alpaca), monkey, ape (e.g., gorilla, chimpanzee), ursid (e.g., bear), poultry, dog, cat, mouse, rat, fish, dolphin, whale and shark. In some embodiments, a subject is a male or female of any age (e.g., a man, a woman or a child).

II. Methods of Isolating and/or Enriching Target Nucleic Acid Sequences

[00128] Detecting a target sequence in a nucleic acid can be a challenge when the target sequence is present at a low frequency in the nucleic acid sample. The instant disclosure provides methods that improve detection of nucleic acids containing a target sequence, e.g., rare target sequences, by isolating and/or enriching such target sequences in a nucleic acid sample. For example, the rare target sequence may be in a nucleic acid sequence from a cfDNA sample, such as a cfDNA sample that has been treated with bisulfite or chemical conversion to convert cytosines to uracils to preserve information regarding the methylation status of a particular nucleic acid sequence (e.g., comprising a CpG site) in a subject. The target sequence may be indicative of the risk of developing or the presence of cancer in the subject from whom the sample was taken. In certain embodiments, the target sequence is present in a nucleic acid library (e.g., the sample may be a nucleic acid library), and the methods described herein enrich the target sequence in the nucleic acid library.

a. Enrichment Using Staggered Nucleic Acid Cleavage of a Target Sequence and Overhang Fill with a Label

[00129] A nucleic acid target sequence can be isolated and/or enriched by cutting a nucleic acid molecule that includes the target sequence to form a single-stranded overhang, filling in the overhang with a label, and capturing the nucleic acid molecule that includes the target by contacting at least one of the labeled nucleotides with a capture domain, thereby isolating and/or enriching the target sequence. In other embodiments, a nucleic acid molecule that includes the target molecule can be enriched by separating labeled nucleic acids from unlabeled nucleic acids without the use of a capture domain. For example, labeled (e.g., fluorescence-labeled) nucleic acids can be separated from unlabeled nucleic acids using a method that sorts fluorescent molecules away from non-fluorescent molecules and/or by using magnetic fields to sort a nucleic acid labeled with a magnetic label away from non-labeled nucleic acids.

[00130] Nucleic acids containing or suspected of containing a target sequence can be contacted with a nuclease that (1) recognizes a target sequence and (2) cuts (cleaves) the nucleic acid molecules that contain the target sequence. In certain embodiments, the nuclease cleaves the nucleic acid within the target sequence. In certain embodiments, the nuclease cleaves the nucleic acid near the target sequence. For example, the nuclease may cleave the nucleic acid within about 1 nt to about 20 nt, about 2 nt to about 20 nt, about 5 nt to about 20 nt, about 10 nt to about 20 nt, about 15 nt to about 20 nt, about 1 nt to about 15 nt, about 2 nt to about 15 nt, about 5 nt to about 15 nt, about 10 nt to about 15 nt, about 1 nt to about 10 nt, about 2 nt to about 10 nt, about 5 nt to about 10 nt, about 1 nt to about 5 nt, or about 2 nt to about 5 nt of the target sequence.

[00131] Nucleases suitable for use herein generate a staggered cut in the nucleic acid, leaving a single stranded overhang of unpaired nucleotides. The overhang may be any length, for example, between 1 and 3 nt, between 1 and 5 nt, between 1 and 10 nt, between 1 and 15 nt, between 3 and 5 nt, between 3 and 10 nt, between 3 and 15 nt, between 5 and 10 nt, between 5 and 15 nt, between 10 and 15 nt, or about 2 nt, about 3 nt, about 4 nt, about 5 nt, about 6 nt, about 7 nt, about 8 nt, about 9 nt, about 10 nt, about 11 nt, about 12 nt, about 13 nt, about 14 nt, or about 15 nt. Exemplary nucleases include type II and type V CRISPR-Cas nucleases. In certain embodiments, the nuclease is a Cas9 or Cas12 nuclease, or a variant thereof (see, e.g., Liu et al. (2019) Nature Communications 10; Article 5524). In certain embodiments, the nuclease is a Cas12a/Cpf1 nuclease. In certain embodiments, the nuclease is a MAD7 nuclease. In certain embodiments, the nuclease is a CasX nuclease. In certain embodiments, the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence that binds to the target sequence.

[00132] In certain embodiments, two or more nucleases are used in the methods described herein. In certain embodiments, the two or more nucleases are used sequentially. In certain embodiments, the two or more nucleases are used simultaneously. In certain embodiments, two or more nucleases are used to increase the number or variety of target sequences that can be enriched. For example, in certain embodiments, if one or more target sequences is not located near a Cas12 (e.g., Cas12a) PAM site, a second nuclease (e.g., a Cas9 or a CasX nuclease) may be used if the second nuclease has a PAM site near the remaining target sequences.
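The choice between nucleases based on PAM availability can be sketched as a simple sequence scan. The sketch below assumes PAM motifs that are not stated in this disclosure: TTTV for Cas12a and NGG for a Streptococcus pyogenes Cas9. It scans only the strand supplied, ignores which side of the protospacer the PAM must sit on, and uses a hypothetical 20 nt window; the names `PAMS`, `pam_positions`, and `choose_nuclease` and the example sequences are illustrative only.

```python
import re

# Assumed PAM motifs (not taken from the disclosure): TTTV for Cas12a, NGG for SpCas9
PAMS = {"Cas12a": "TTT[ACG]", "Cas9": "[ACGT]GG"}


def pam_positions(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    return [m.start() for m in re.finditer(f"(?=({motif}))", seq)]


def choose_nuclease(seq, target_start, target_end, window=20, order=("Cas12a", "Cas9")):
    """Pick the first nuclease whose PAM starts within `window` nt of the target span."""
    lo, hi = max(0, target_start - window), target_end + window
    for name in order:
        if any(lo <= p < hi for p in pam_positions(seq, PAMS[name])):
            return name
    return None


# Hypothetical sequences: one with a TTTV site, one with only an NGG site, one with neither
with_cas12_pam = "GCGCTTTAGCATGCATGCATGCATGCATGC"
with_cas9_pam = "GCGCATCAGCATGCATGGATGCATGCATGC"
without_pam = "ACACACACACAC"

first = choose_nuclease(with_cas12_pam, 10, 20)
second = choose_nuclease(with_cas9_pam, 10, 20)
third = choose_nuclease(without_pam, 2, 8)
```

Trying Cas12a first and falling back to Cas9 mirrors the order described in the paragraph above; a real guide design tool would also check both strands and PAM orientation.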

[00133] An enrichment method may include one or more of the steps of: (1) designing guide RNAs to bind target sequences for cleavage with Cas12 (e.g., Cas12a), (2) designing guide RNAs to bind target sequences for cleavage with Cas9 (e.g., for additional target sequences that are not near a Cas12 PAM), and (3) adding a mixture of guide sequences, Cas12, and Cas9 to the nucleic acid comprising a plurality of target sequences.

[00134] In certain embodiments, an end-repair step may be performed prior to a cutting step. For example, when enriching a nucleic acid that may contain overhangs, such as cfDNA, an end-repair step can be performed to blunt-end repair any overhangs unrelated to the target sequence, prior to the cutting step.

[00135] The overhangs are filled in using a polymerase, such as a DNA polymerase. In certain embodiments, the DNA polymerase is the Klenow fragment of DNA polymerase I. The polymerase reaction includes nucleotides (free nucleotides), such as dNTPs, which are used by the DNA polymerase to fill in the overhangs. Klenow fragment can be used in an amount of from about 0.01 units/µL to about 1 unit/µL, for example, from about 0.05 units/µL to about 0.5 units/µL, from about 0.075 units/µL to about 0.125 units/µL or at about 0.1 unit/µL. The nucleotides are associated with (e.g., bound to) a label, which allows for the separation and/or isolation of the nucleic acid comprising the target sequence from other nucleic acids not containing the target sequence. In certain embodiments, the label comprises a fluorophore, a magnetic moiety, biotin or digoxigenin.
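The fill-in step can be pictured with a short sketch. It assumes a 5' overhang (so that a polymerase can extend the recessed 3' end) and assumes that only one nucleotide type carries the label, here a labeled T analog; both are illustrative assumptions rather than requirements of the method. The function name `fill_in` and the example overhangs are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def fill_in(overhang_5to3, labeled_base="T"):
    """Return the strand synthesized opposite a 5' overhang, written 5'->3'.

    Positions where `labeled_base` is incorporated are shown in lowercase to
    stand in for a labeled nucleotide (e.g., a biotinylated analog). The second
    return value is the number of labeled nucleotides incorporated.
    """
    new_strand = "".join(COMPLEMENT[b] for b in reversed(overhang_5to3))
    marked = "".join(b.lower() if b == labeled_base else b for b in new_strand)
    return marked, new_strand.count(labeled_base)


# Hypothetical 5-nt overhang containing two A residues
filled, n_labels = fill_in("AGCAT")

# Hypothetical overhang with no A residues: no labeled T can be incorporated
filled_none, n_none = fill_in("GGCC")
```

The second example illustrates a design consideration under these assumptions: if only one labeled nucleotide type is supplied, an overhang lacking the complementary base receives no label, which is one reason a reaction might include more than one labeled nucleotide.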

[00136] In certain embodiments, a labeled (e.g., biotin-labeled) nucleic acid comprising a target sequence is exposed to a capture domain (e.g., avidin), forming a capture domain-label-nucleic acid target complex. The capture domain can be bound to a solid support, such as a bead. Thus, after exposure to the solid support-bound capture domain, the beads will be bound to the capture domain-label-nucleic acid target complex, allowing non-target sequences to be separated from the nucleic acid comprising the target sequence, e.g., by a wash step.

[00137] In certain embodiments, the capture domain comprises avidin, streptavidin, or a DIG-binding protein. In certain embodiments, the solid support is a bead, a well, a tube, or a slide.

[00138] The steps of the methods described herein, such as the enrichment method, including the cutting step, can be performed at a variety of temperatures, including but not limited to room temperature and/or about 20°C to about 45°C, about 20°C, about 25°C, about 30°C, about 35°C, about 37°C, about 40°C, about 45°C, or any ranges therein (e.g., about 20°C to about 25°C, about 25°C to about 37°C, about 35°C to about 45°C, and so on).

b. Additional Enrichment Steps

[00139] In certain embodiments, the target sequence or a subset of target sequences in the nucleic acid can be enriched using one or more additional enrichment steps. The one or more additional enrichment steps can be performed using any enrichment method known in the art. Non-limiting examples include hybrid capture and use of DNA-binding proteins to enrich a target sequence or a subset of target sequences.

[00140] One or more additional enrichment steps can be performed before or after enrichment using a targeted cutting and overhang filling method described above. For example, in certain embodiments, the method comprises a first step of subjecting a plurality of nucleic acid molecules that include one or more (e.g., a plurality of) target sequences to a nucleic acid enrichment method that includes cutting the nucleic acid molecules that include the one or more (e.g., the plurality of) target sequences to generate single stranded overhangs at the cut end of the molecules that include the one or more (e.g., the plurality of) target sequences, filling the overhangs with at least one labeled nucleotide, and enriching the molecules that include the one or more (e.g., the plurality of) target sequences by contacting the labeled nucleotides in the molecules with the capture domains and/or separating labeled molecules from unlabeled molecules. The method comprises a second step of subjecting the one or more (e.g., the plurality of) target molecules to one or more additional enrichment steps to enrich for a subset of the target sequences. In certain embodiments, the first step enriches for target sequences in a genomic region of interest and the second step enriches for a subset of the target sequences that contains a methylation pattern (e.g., an epitype) of interest.

[00141] In certain embodiments, the method comprises a first step of subjecting a plurality of nucleic acid molecules that include one or more (e.g., a plurality of) target sequences to an enrichment step to enrich for the one or more (e.g., the plurality of) target sequences. The method comprises a second step of subjecting the one or more (e.g., the plurality of) target molecules to a nucleic acid enrichment method that includes cutting the one or more (e.g., the plurality of) target sequences to generate single stranded overhangs at the cut end of the target sequences or a subset of the target sequences, filling the overhangs with at least one labeled nucleotide, and enriching the molecules that include the one or more target sequences or subset of target sequences by contacting the labeled nucleotides in the molecules with the capture domains and/or separating labeled molecules from unlabeled molecules. In certain embodiments, the first step enriches for target sequences in a genomic region of interest and the second step enriches for a subset of the target sequences that contains a methylation pattern (e.g., an epitype) of interest.

i. Hybrid Capture

[00142] In certain embodiments, a target sequence or a subset of target sequences in the nucleic acid can be enriched by subjecting the nucleic acid comprising the target sequence or the subset of target sequences to hybrid capture. In hybrid capture, labeled (e.g., biotinylated) capture probes that can bind to one or more target sequences or subsets of target sequences are exposed to the nucleic acid comprising the one or more target sequences. The capture probes are specific to a sequence of interest, for example, a methylation pattern of interest that can be detected as a bisulfite-converted epitype. Examples of such hybrid capture probe sets include the KAPA HyperPrep Kit and SeqCAP Epi Enrichment System from Roche Diagnostics (Pleasanton, CA).

[00143] Hybrid capture can be performed before or after enrichment using the targeted cutting and overhang filling method described above. For example, in certain embodiments, the method comprises a first step of subjecting a plurality of nucleic acid molecules that include one or more (e.g., a plurality of) target sequences to a nucleic acid enrichment method that includes cutting the nucleic acid molecules that include the one or more (e.g., the plurality of) target sequences to generate single stranded overhangs at the cut end of the molecules that include the one or more (e.g., the plurality of) target sequences, filling the overhangs with at least one labeled nucleotide, and enriching the molecules that include the one or more (e.g., the plurality of) target sequences by contacting the labeled nucleotides in the molecules with the capture domains and/or separating labeled molecules from unlabeled molecules. The method comprises a second step of subjecting the one or more (e.g., the plurality of) target molecules to hybrid capture to enrich for a subset of the target sequences. In certain embodiments, the first step enriches for target sequences in a genomic region of interest and the second step enriches for a subset of the target sequences that contains a methylation pattern (e.g., an epitype) of interest.

[00144] In certain embodiments, the method comprises a first step of subjecting a plurality of nucleic acid molecules that include one or more (e.g., a plurality of) target sequences to hybrid capture to enrich for the one or more (e.g., the plurality of) target sequences. The method comprises a second step of subjecting the one or more (e.g., the plurality of) target molecules to a nucleic acid enrichment method that includes cutting the one or more (e.g., the plurality of) target sequences to generate single stranded overhangs at the cut end of the target sequences or a subset of the target sequences, filling the overhangs with at least one labeled nucleotide, and enriching the molecules that include the one or more target sequences or subset of target sequences by contacting the labeled nucleotides in the molecules with the capture domains and/or separating labeled molecules from unlabeled molecules. In certain embodiments, the first step enriches for target sequences in a genomic region of interest and the second step enriches for a subset of the target sequences that contains a methylation pattern (e.g., an epitype) of interest.

ii. Nucleic Acid Binding Proteins

[00145] In certain embodiments, a target sequence or a subset of target sequences in the nucleic acid can be enriched by subjecting the nucleic acid comprising the target sequence to nucleic acid binding proteins (also referred to herein as protein binders). The nucleic acid binding protein may bind a particular sequence or may bind to methylated CpGs. Exemplary nucleic acid binding proteins that bind to a particular sequence include transcription factors and nuclease deficient CRISPR enzymes (e.g., dCas9). Exemplary DNA binding proteins that bind methylated CpGs include methyl-CpG-binding domain (MBD) proteins such as MECP2 (methyl-CpG-binding protein 2), MBD1, MBD2, MBD3, MBD4, MBD5, MBD6, the Kaiso family proteins, and the SET- and Ring finger-associated (SRA) domain family. In certain embodiments, the MBD protein is selected from MECP2, MBD1, MBD2, and MBD4. See, e.g., Du et al. (2015) Epigenomics 7(6): 1051-1073, incorporated by reference herein for all purposes.

[00146] In certain embodiments, a nucleic acid comprising a target sequence is exposed to a protein comprising a nucleic acid binding protein, which binds to a target sequence or a subset of target sequences. The target sequence or subset of target sequences can be enriched by isolating the target sequence-nucleic acid binding protein complex, for example, using an antibody to the nucleic acid binding protein. In certain embodiments, the nucleic acid binding protein is attached to a label. In certain embodiments, the label comprises a fluorophore, biotin or digoxigenin. In certain embodiments, the label binds a capture domain that can be used to isolate and/or separate the target nucleic acid or subset of target nucleic acids from a sample. In certain embodiments, the capture domain comprises avidin, streptavidin, or a DIG-binding protein. In certain embodiments, the solid support is a bead, a well, a tube, or a slide.

[00147] In certain embodiments, a nucleic acid comprising a target sequence is exposed to a protein comprising a methyl-CpG-binding domain (MBD), which binds to a methylated CpG within the target sequence. The target sequence can be enriched by isolating the target sequence-MBD complex. Because CGIs are typically not methylated, enrichment using an MBD would enrich for methylated fragments, for example, rare methylated fragments. In certain embodiments, the MBD is MBD3, which binds to 5-hydroxymethylcytosine. In certain embodiments, the method enriches for hydroxymethylcytosine-containing fragments of a CGI.

[00148] Enrichment using a nucleic acid binding protein can be performed before or after enrichment using the targeted cutting and overhang filling method described above. For example, in certain embodiments, the method comprises a first step of subjecting a plurality of nucleic acid molecules that include one or more (e.g., a plurality of) target sequences to a nucleic acid enrichment method that includes cutting the nucleic acid molecules that include the one or more (e.g., the plurality of) target sequences to generate single stranded overhangs at the cut end of the molecules that include the one or more (e.g., the plurality of) target sequences, filling the overhangs with at least one labeled nucleotide, and enriching the molecules that include the one or more (e.g., the plurality of) target sequences by contacting the labeled nucleotides in the molecules with the capture domains and/or separating labeled molecules from unlabeled molecules. The method comprises a second step of combining the one or more (e.g., the plurality of) target molecules with a nucleic acid binding protein to enrich for a subset of the target sequences. In this method, the nucleic acid binding protein binds to the one or more (e.g., the plurality of) target molecules and the nucleic acid binding protein-target complex is isolated as described above. In certain embodiments, the first step enriches for target sequences in a genomic region of interest and the second step enriches for a subset of the target sequences that contains a methylation pattern (e.g., an epitype) of interest.

[00149] In certain embodiments, the method comprises a first step of subjecting a plurality of nucleic acid molecules that include one or more (e.g., a plurality of) target sequences to hybrid capture to enrich for the one or more (e.g., the plurality of) target sequences. The method comprises a second step of subjecting the one or more (e.g., the plurality of) target molecules to a nucleic acid enrichment method that includes cutting the one or more (e.g., the plurality of) target sequences to generate single stranded overhangs at the cut end of the target sequences or a subset of the target sequences, filling the overhangs with at least one labeled nucleotide, and enriching the molecules that include the one or more target sequences or subset of target sequences by contacting the labeled nucleotides in the molecules with the capture domains and/or separating labeled molecules from unlabeled molecules. In certain embodiments, the first step enriches for target sequences in a genomic region of interest and the second step enriches for a subset of the target sequences that contains a methylation pattern (e.g., an epitype) of interest.

[00150] After isolating and/or separating the nucleic acid comprising a target sequence, the method can include amplifying the nucleic acid molecule, e.g., by PCR. Amplification can occur while the nucleic acid is in contact with the capture domain, or the nucleic acid can be removed from the capture domain (e.g., by heat elution, a chemical agent, mechanical disruption, or combinations thereof) prior to amplification.

III. Nucleic Acid Libraries Enriched for Regions of Interest and Methods of Making Same

[00151] In another aspect, the disclosure relates to a method for producing a nucleic acid library enriched for regions of interest (i.e., targets). In certain embodiments, regions of interest are enriched prior to making the library. In other embodiments, the library is made and then regions of interest present in the library are enriched.

[00152] A method of making a nucleic acid library enriched for regions of interest can include obtaining a sample comprising a plurality of nucleic acids, wherein the plurality of nucleic acids comprise regions of interest. In certain embodiments, methylated cytosines are converted to uracils. Adapters are added to the plurality of nucleic acids to form a nucleic acid library. To enrich the nucleic acid library that is initially created for regions of interest, the plurality of nucleic acid molecules having regions of interest can be cut, e.g., by a nuclease, to generate single stranded overhangs at cut ends of the molecules that include the regions of interest. The overhangs are filled in, e.g., using a polymerase, with at least one labeled nucleotide. The molecules that include the regions of interest are then enriched by contacting the labeled nucleic acids in the molecule with capture domains which can be used to separate and/or isolate the labeled nucleic acids from unlabeled nucleic acids. Nucleic acids containing the regions of interest can be amplified to form the nucleic acid library enriched for regions of interest.

[00153] An exemplary method of making a nucleic acid library enriched for regions of interest is shown in FIG. 1. A cell-free DNA (cfDNA) sample comprising methylated nucleotides that have been converted using bisulfite treatment is used to construct a nucleic acid library. sgRNAs complementary to target sequences are constructed, and the library is exposed to Cas12 and the sgRNAs. The sgRNAs direct Cas12 to the target sequence, where Cas12 cleaves the DNA, leaving an overhang. Biotinylated dNTPs are added with a polymerase (Klenow fragment) to fill in the nucleotides complementary to the overhang. Streptavidin beads bind to the biotinylated nucleotides of the target sequences, and any uncut, off-target sequences are washed away. A PCR amplification is then performed to amplify the enriched target sequences that have been bound to the bead.

[00154] Enriched libraries can also be made by enriching for regions of interest prior to making the library. In this method, a sample is obtained which comprises a plurality of nucleic acids, wherein a subset of the plurality of nucleic acids comprise or are suspected to comprise regions of interest. The subset of the plurality of nucleic acid molecules having regions of interest are cut to generate single stranded overhangs at cut ends of the molecules that include the regions of interest. The overhangs are filled in, e.g., using a polymerase, with at least one labeled nucleotide. The nucleic acids that include the regions of interest are then enriched by contacting the labeled nucleic acids with capture domains which can be used to separate and/or isolate the labeled nucleic acids from unlabeled nucleic acids. The nucleic acids that include the regions of interest are removed from the capture domains. In certain embodiments, the nucleic acids are treated to convert methylated cytosines to uracils, to preserve information about the methylation state of the nucleic acids. Nucleic acid adapters are added to the plurality of nucleic acids to form the nucleic acid library enriched for regions of interest.

[00155] Another exemplary method of making a nucleic acid library enriched for regions of interest is shown in FIG. 2. In this method, a cell-free DNA (cfDNA) sample is exposed to Cas12 and sgRNAs complementary to target sequences in a target region. The sgRNAs direct Cas12 to the target sequence, where Cas12 cleaves the DNA, leaving an overhang. Biotinylated dNTPs are added with a polymerase (Klenow fragment) to fill in the nucleotides complementary to the overhang. Streptavidin beads bind to the biotinylated nucleotides present in the target region, and any uncut, off-target sequences are washed away. The enriched target sequences are eluted off of the beads using heat treatment. The eluted target sequences are then treated with bisulfite to preserve methylation information. The bisulfite converted target sequences are used to construct a library that is enriched for target sequences.
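How bisulfite treatment preserves methylation information can be shown with a small in silico sketch. It relies on the general behavior of conventional bisulfite chemistry, in which unmethylated cytosines are deaminated to uracil (read as T after amplification) while methylated cytosines are protected and still read as C. The function name `bisulfite_convert`, the example sequence, and the methylated position are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Convert unmethylated C to T (the read-out of uracil); leave methylated C unchanged.

    `methylated_positions` holds 0-based indices of cytosines treated as methylated.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


# Hypothetical fragment with a CpG at positions 1-2 and another at positions 5-6
fragment = "ACGTCCGA"

methylated_read = bisulfite_convert(fragment, methylated_positions={1})
unmethylated_read = bisulfite_convert(fragment)
```

Comparing the two converted reads shows that the only retained C marks the methylated position, which is the information a downstream sequencing step would recover.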

[00156] In certain embodiments, nucleic acids containing or suspected of containing a target sequence are contacted with a nuclease that (1) recognizes a region of interest (i.e., target sequence) and (2) cuts (cleaves) the nucleic acid molecules that contain the target sequence. In certain embodiments, the nuclease cleaves the nucleic acid within the target sequence. In certain embodiments, the nuclease cleaves the nucleic acid near the target sequence. For example, the nuclease may cleave the nucleic acid within about 1 nt to about 20 nt, about 2 nt to about 20 nt, about 5 nt to about 20 nt, about 10 nt to about 20 nt, about 15 nt to about 20 nt, about 1 nt to about 15 nt, about 2 nt to about 15 nt, about 5 nt to about 15 nt, about 10 nt to about 15 nt, about 1 nt to about 10 nt, about 2 nt to about 10 nt, about 5 nt to about 10 nt, about 1 nt to about 5 nt, or about 2 nt to about 5 nt of the target sequence.

[00157] Nucleases suitable for use herein generate a staggered cut in the nucleic acid, leaving a single stranded overhang of unpaired nucleotides. The overhang may be any length, for example, between 1 and 3 nt, between 1 and 5 nt, between 1 and 10 nt, between 1 and 15 nt, between 3 and 5 nt, between 3 and 10 nt, between 3 and 15 nt, between 5 and 10 nt, between 5 and 15 nt, between 10 and 15 nt, or about 2 nt, about 3 nt, about 4 nt, about 5 nt, about 6 nt, about 7 nt, about 8 nt, about 9 nt, about 10 nt, about 11 nt, about 12 nt, about 13 nt, about 14 nt, or about 15 nt. Exemplary nucleases include type II and type V CRISPR-Cas nucleases. In certain embodiments, the nuclease is a Cas9 or Cas12 nuclease, or a variant thereof (see, e.g., Liu et al. (2019) supra). In certain embodiments, the nuclease is a Cas12a/Cpf1 nuclease. In certain embodiments, the nuclease is a MAD7 nuclease. In certain embodiments, the nuclease is a CasX nuclease. In certain embodiments, the nuclease is associated with a guide RNA (gRNA) comprising a spacer sequence that binds to the target sequence.

[00158] As described above, overhangs are filled in using a polymerase, such as a DNA polymerase. In certain embodiments, the DNA polymerase is the Klenow fragment of DNA polymerase I. The polymerase reaction includes nucleotides (free nucleotides), such as dNTPs, which are used by the DNA polymerase to fill in the overhangs. In certain embodiments, at least one nucleotide comprises a label. In certain embodiments, the label comprises a fluorophore, biotin or digoxigenin. In certain embodiments, the target nucleic acid is enriched by isolating and/or separating the labeled nucleic acid. In certain embodiments, the label binds to a capture domain. In certain embodiments, the capture domain comprises avidin, streptavidin, or a DIG-binding protein. In certain embodiments, the capture moiety comprises or is connected to a solid support. In certain embodiments, the solid support is a bead, a well, a tube, or a slide.

[00159] In any of the foregoing embodiments, at any step in the method, the target sequence or a subset of target sequences in the nucleic acid can be enriched by subjecting the nucleic acid comprising the target sequence to hybrid capture. For example, hybrid capture can be performed before or after enrichment using the targeted cutting and overhang filling method described above. Hybrid capture can be performed before or after addition of adaptors. Hybrid capture can be performed before or after conversion of nucleotides (e.g., by bisulfite conversion). In certain embodiments, hybrid capture enriches for target sequences in a genomic region of interest and targeted cutting and overhang filling enriches for a subset of the target sequences that contains a methylation pattern (e.g., an epitype) of interest. In certain embodiments, targeted cutting and overhang filling enriches for target sequences in a genomic region of interest and hybrid capture enriches for a subset of the target sequences that contains a methylation pattern (e.g., an epitype) of interest.

[00160] After isolating and/or separating the nucleic acid comprising a target sequence, the method can include amplifying the nucleic acid molecule. Amplification can occur while the nucleic acid is in contact with the capture domain, or the nucleic acid can be removed from the capture domain (e.g., by heat elution, a chemical agent, mechanical disruption, or combinations thereof) prior to amplification.

[00161] In any of the foregoing embodiments, adaptors can be attached to a nucleic acid by any means known in the art, for example, as are used in connection with next generation sequencing (NGS). For example, adapters, such as a Y adapter, can be attached to a nucleic acid by ligation. In certain embodiments, the adapter is attached by nucleic acid amplification of the cell-free nucleic acid using a primer comprising the adapter. In certain embodiments, the adapter comprises one or more of a flow cell binding site, an index, a unique molecular identifier (UMI), and a sequencing binding site. Such adapters can be used to subsequently sequence the library using NGS.
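The adapter elements listed above can be represented as a simple record to make their relationship concrete. In the sketch below, the element order, the placeholder sequences, the 8 nt UMI length, and the names `Adapter` and `random_umi` are all hypothetical; they are not sequences or layouts specified by the disclosure or by any sequencing platform.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Adapter:
    flow_cell_site: str   # placeholder for a flow cell binding site
    index: str            # sample index (barcode)
    umi: str              # unique molecular identifier
    sequencing_site: str  # placeholder for a sequencing primer binding site

    def sequence(self):
        # Element order is an assumption made for this sketch
        return self.flow_cell_site + self.index + self.umi + self.sequencing_site


def random_umi(length, rng):
    """Draw a random UMI of the given length from the four DNA bases."""
    return "".join(rng.choice("ACGT") for _ in range(length))


rng = random.Random(0)  # seeded only so the sketch is repeatable
adapter = Adapter(
    flow_cell_site="AATGATACGG",  # made-up placeholder sequence
    index="ACGTAC",               # made-up placeholder sequence
    umi=random_umi(8, rng),
    sequencing_site="TCTTTCCCTA",  # made-up placeholder sequence
)
full_sequence = adapter.sequence()
```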

[00162] In another aspect, the disclosure relates to a nucleic acid library, produced by the methods described herein.

IV. Nucleic Acids

A. Source of Nucleic Acids

[00163] Nucleic acids used in the methods described herein can be derived from any source, such as a sample taken from the environment or from a subject (e.g., a human subject). A biological sample can be treated to physically disrupt tissue or cell structure (e.g., centrifugation and/or cell lysis), thus releasing intracellular components into a solution which can further contain enzymes, buffers, salts, detergents, and the like which can be used to prepare the sample for analysis. A biological sample can take any of a variety of forms, such as a liquid biopsy (e.g., blood, urine, stool, saliva, or mucus), or a tissue biopsy, or other solid biopsy. Examples of biological samples include, but are not limited to, blood, whole blood, plasma, serum, urine, cerebrospinal fluid, feces, saliva, sweat, tears, pleural fluid, pericardial fluid, or peritoneal fluid of the subject. A biological sample can include any tissue or material derived from a living or dead subject. A biological sample can be a cell-free sample. A sample can be a liquid sample or a solid sample (e.g., a cell or tissue sample). A biological sample can be a bodily fluid, such as blood, plasma, serum, urine, vaginal fluid, fluid from a hydrocele (e.g., of the testis), vaginal flushing fluids, pleural fluid, ascitic fluid, cerebrospinal fluid, saliva, sweat, tears, sputum, bronchoalveolar lavage fluid, discharge fluid from the nipple, aspiration fluid from different parts of the body (e.g., thyroid, breast), etc.

[00164] The nucleic acid can be of any composition form, such as deoxyribonucleic acid (DNA, e.g., complementary DNA (cDNA), genomic DNA (gDNA) and the like), and/or DNA analogs (e.g., containing base analogs, sugar analogs and/or a non-native backbone and the like), and/or ribonucleic acid (RNA) and/or RNA analogs, all of which can be in single- or double-stranded form. In certain embodiments, single-stranded nucleic acids can be made double stranded prior to cutting with an enzyme. Unless otherwise limited, a nucleic acid can comprise known analogs of natural nucleotides, some of which can function in a similar manner as naturally occurring nucleotides. A nucleic acid can be in any form useful for conducting processes herein (e.g., linear, circular, supercoiled, single-stranded, double-stranded and the like). A nucleic acid in some embodiments can be from a single chromosome or fragment thereof (e.g., a nucleic acid sample may be from one chromosome of a sample obtained from a diploid organism). In certain embodiments nucleic acids comprise nucleosomes, fragments or parts of nucleosomes or nucleosome-like structures. Nucleic acids can comprise protein (e.g., histones, DNA binding proteins, and the like). Nucleic acids analyzed by processes described herein can be substantially isolated and are not substantially associated with protein or other molecules. Nucleic acids can also include derivatives, variants and analogs of DNA synthesized, replicated or amplified from single-stranded (“sense” or “antisense,” “plus” strand or “minus” strand, “forward” reading frame or “reverse” reading frame) and double-stranded polynucleotides. Deoxyribonucleotides can include deoxyadenosine, deoxycytidine, deoxyguanosine and deoxythymidine. A nucleic acid may be prepared using a nucleic acid obtained from a subject as a template.

[00165] In certain embodiments, the nucleic acid is a cell-free nucleic acid, which can be found in bodily fluids such as blood, whole blood, plasma, serum, urine, cerebrospinal fluid, feces, saliva, sweat, tears, pleural fluid, pericardial fluid, or peritoneal fluid of a subject. In certain embodiments, a plasma sample can be used directly in the methods disclosed herein (for example, in the cutting step), without prior purification or isolation of nucleic acids in the plasma. Cell-free nucleic acids originate from one or more healthy cells and/or from one or more cancer cells, or from non-human sources such as bacteria, fungi, or viruses. Examples of the cell-free nucleic acids include but are not limited to cell-free DNA (“cfDNA”), including mitochondrial DNA or genomic DNA, and cell-free RNA. In certain embodiments herein, instruments for assessing the quality of the cell-free nucleic acids, such as the TapeStation System from Agilent Technologies (Santa Clara, CA) can be used. Concentrating low-abundance cfDNA can be accomplished, for example, using a Qubit Fluorometer from Thermo Fisher Scientific (Waltham, MA).

[00166] In various embodiments, the majority of DNA in a biological sample that has been enriched for cell-free DNA (e.g., a plasma sample obtained via a centrifugation protocol) can be cell-free (e.g., greater than 50%, 60%, 70%, 80%, 90%, 95%, or 99% of the DNA can be cell-free).

B. Nucleic Acids Derived From Methylated Nucleic Acids

[00167] A methylated nucleic acid is a nucleic acid having a modification in which a hydrogen atom on the pyrimidine ring of a cytosine base is converted to a methyl group, forming 5-methylcytosine. Methylation can occur at dinucleotides of cytosine and guanine referred to herein as “CpG sites”, which can be a target for enrichment. Methylation of cytosine can occur in cytosines in other sequence contexts, for example, 5'-CHG-3' and 5'-CHH-3', where H is adenine, cytosine or thymine. Cytosine methylation can also be in the form of 5-hydroxymethylcytosine. Methylation of DNA can include methylation of non-cytosine nucleotides, such as N6-methyladenine (6mA). Anomalous cfDNA methylation can be identified as hypermethylation or hypomethylation, both of which may be indicative of cancer status. As is well known in the art, DNA methylation anomalies (compared to healthy controls) can cause different effects, which may contribute to cancer.

[00168] In certain embodiments, the nucleic acid comprises a CpG site (i.e., cytosine and guanine separated by only one phosphate group). In certain embodiments, the nucleic acid comprises a CpG island (also referred to as a “CG island” or “CGI”) or a portion thereof, which is the target for enrichment. Because certain CGIs and certain features of certain CGIs in tumor cells tend to be different from the same CGIs or features of the CGIs in healthy cells, detection of such CGIs can be informative of a health condition. In certain embodiments, the CGI is a “cancer informative CGI”, which is defined and described in more detail below. In certain embodiments, the CpG is an “informative CpG”, e.g., a “cancer informative CGI”. Such CGIs may have methylation patterns in tumor cells that are different from the methylation patterns in healthy cells. Accordingly, detection of a cancer informative CGI can be informative regarding a subject’s risk of developing cancer or can be indicative that the subject has cancer. Exemplary cancer informative CGIs, which can be target sequences as described herein, are identified in, e.g., Table 1 of U.S. Patent Publication 2020/0109456A1, Tables 2 and 3 of WO2022/133315, and TABLES 1-4 provided herein.

C. Converting Unmethylated Nucleic Acids

[00169] In certain aspects, the nucleic acids of the invention have been treated to convert one or more unmethylated nucleotides (e.g., cytosines) to another nucleotide (a “converted nucleotide”, as used herein, such as a uracil), for example, prior to amplification. In certain embodiments, one or more unmethylated cytosines are converted to a nucleotide that pairs with adenine (e.g., the unmethylated cytosine may be converted to uracil). In certain embodiments, one or more unmethylated adenines are converted to a base that pairs with cytosine (e.g., the unmethylated adenine may be converted to inosine (I)). In certain embodiments, one or more methylated cytosines (e.g., a 5-methylcytosine (5mC)) are converted to a thymine, which pairs with adenine. In certain embodiments, methylated cytosines are protected from conversion (e.g., deamination) during the conversion step.

[00170] After a nucleic acid has been treated to convert unmethylated, or, in some cases, methylated nucleotides, into another nucleotide, the nucleic acid may be amplified. During amplification, the converted nucleotide pairs with its complementary nucleotide, and in the next round of amplification, the complementary nucleotide pairs with a replacement nucleotide. For example, following the conversion of an unmethylated cytosine to a uracil, the nucleic acid may be amplified such that an adenine pairs with the uracil in the first round of replication, and in the second round of replication, the adenine pairs with a thymine. Accordingly, the thymine replaces the uracil in the original nucleic acid sequence, and is referred to herein as a “replacement nucleotide”.
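The conversion and replacement steps described above can be sketched in a few lines of Python. This is an illustration only, not part of the disclosed method: the single-letter encoding in which "M" stands for 5-methylcytosine, and the function names `convert` and `copy_strand`, are assumptions introduced for this sketch. Conversion is modeled as complete, and newly synthesized strands are modeled as carrying no methylation.

```python
# Illustrative sketch: "C" is unmethylated cytosine, "M" is 5-methylcytosine
# (a hypothetical encoding chosen for this example).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def convert(strand: str) -> str:
    """Convert every unmethylated C to U; methylated C (M) is protected."""
    return strand.replace("C", "U")


def copy_strand(template: str) -> str:
    """Return the newly synthesized complement, written 5'->3'.

    A methylated C on the template still pairs with G, and the new
    strand is built from unmodified nucleotides only.
    """
    bases = template.replace("M", "C")
    return "".join(COMPLEMENT[b] for b in reversed(bases))


original = "ACMGTCA"
converted = convert(original)          # unmethylated Cs become Us
first_round = copy_strand(converted)   # A pairs with each U
second_round = copy_strand(first_round)  # T pairs with each of those As

print(converted)     # AUMGTUA
print(first_round)   # TAACGAT
print(second_round)  # ATCGTTA
```

In the second-round product, each position that held an unmethylated cytosine now holds a thymine (the replacement nucleotide), while the methylated position reads as cytosine.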

[00171] In certain aspects, the nucleic acids of the invention have been selectively deaminated. Selective deamination refers to a process in which unmethylated cytosine residues are selectively deaminated over methylated cytosine (5-methylcytosine) residues. In certain embodiments, deamination of cytosine forms uracil, effectively inducing a C to T point mutation to allow for detection of methylated cytosines. Methods of deaminating cytosine are known in the art, and include bisulfite conversion and enzymatic conversion. In certain embodiments, the enzymatic conversion comprises subjecting the nucleic acid to TET2, which oxidizes methylated cytosines, thereby protecting them, and subsequent exposure to APOBEC, which converts unprotected (i.e., unmethylated) cytosines to uracils.

[00172] In some embodiments, the conversion, for example, bisulfite conversion or enzymatic conversion, uses commercially available kits. Bisulfite conversion can be performed using commercially available technologies, such as an EZ DNA Methylation-Gold, EZ DNA Methylation-Direct or EZ DNA Methylation-Lightning kit (Zymo Research Corp (Irvine, California)) or EpiTect Fast available from Qiagen (Germantown, MD). In another example a kit such as APOBECSeq (NEBiolabs) or OneStep qMethyl-PCR Kit (Zymo Research Corp (Irvine, California)) is used.

i. Bisulfite conversion

[00173] Bisulfite conversion is performed on DNA by denaturation using high heat, preferential deamination (at an acidic pH) of unmethylated cytosines, which are then converted to uracil by desulfonation (at an alkaline pH). Methylated cytosines remain unchanged on the single-stranded DNA (ssDNA) product. An overview of bisulfite conversion is provided in FIG. 3.

[00174] In some embodiments the methods include treatment of the sample with bisulfite (e.g., sodium bisulfite, potassium bisulfite, ammonium bisulfite, magnesium bisulfite, sodium metabisulfite, potassium metabisulfite, ammonium metabisulfite, magnesium metabisulfite and the like). Unmethylated cytosine is converted to uracil through a three-step process during sodium bisulfite modification. As shown in FIG. 3, the steps are sulphonation to convert cytosine to cytosine sulphonate, deamination to convert cytosine sulphonate to uracil sulphonate and alkali desulfonation to convert uracil sulphonate to uracil. Conversion on methylated cytosine is much slower and is not observed at significant levels in a 4-16 hour reaction. (See Clark et al., Nucleic Acids Res., 22(15):2990-7 (1994).) If the cytosine is methylated it will remain a methylated cytosine. If the cytosine is unmethylated it will be converted to uracil.

When the modified strand is copied, for example, through extension of a locus specific primer, a random or degenerate primer or a primer to an adaptor, a G will be incorporated in the interrogation position (opposite the C being interrogated) if the C was methylated and an A will be incorporated in the interrogation position if the C was unmethylated and converted to U.

When the double stranded extension product is amplified those Cs that were converted to Us and resulted in incorporation of A in the extended primer will be replaced by Ts during amplification. Those Cs that were not converted (i.e., the methylated Cs) and resulted in the incorporation of G will be replaced by unmethylated Cs during amplification.

ii. Enzymatic conversion

[00175] In certain embodiments, the enzymatic treatment with a cytidine deaminase enzyme is used to convert cytosine to uracil. Enzymatic conversion can include an oxidation step, in which Tet methylcytosine dioxygenase 2 (TET2) catalyzes the oxidation of 5mC to 5hmC to protect methylated cytosines from conversion by subsequent exposure to a cytidine deaminase. Other protection steps known in the art can be used in addition to or in place of oxidation by TET2. After the oxidation step, the nucleic acid is treated with the cytidine deaminase to convert one or more unmethylated cytosines to uracils. As with bisulfite conversion, when the modified strand is copied, a G will be incorporated in the interrogation position (opposite the C being interrogated) if the C was methylated and an A will be incorporated in the interrogation position if the C was unmethylated. When the double stranded extension product is amplified those Cs that were converted to Us and resulted in incorporation of A in the extended primer will be replaced by Ts during amplification. Those Cs that were not modified and resulted in the incorporation of G will remain as C.

[00176] In certain embodiments the cytidine deaminase may be APOBEC. In certain embodiments the cytidine deaminase includes activation induced cytidine deaminase (AID) and apolipoprotein B mRNA editing enzymes, catalytic polypeptide-like (APOBEC). In certain embodiments, the APOBEC enzyme is selected from the human APOBEC family consisting of: APOBEC-1 (Apo1), APOBEC-2 (Apo2), AID, APOBEC-3A, -3B, -3C, -3DE, -3F, -3G, -3H and APOBEC-4 (Apo4). In certain embodiments, the APOBEC enzyme is APOBEC-seq.

iii. Nitrite Conversion

[00177] In certain embodiments, nitrite treatment is used to deaminate adenine and cytosine. Deamination of an A results in conversion to an inosine (I), which is read by a polymerase as a G, whereas deamination of a methylated A (N6-methyladenine (6mA)) results in a nitrosylated 6mA (6mA-NO), which causes the base to be read by a polymerase as an A. Deamination of a C results in conversion to a uracil, which is read by a polymerase as a T, whereas deamination of an N4-methylcytosine (4mC) to 4mC-NO or a 5-methylcytosine (5mC) to a T causes the base to be read by a polymerase as a C or a T, respectively. For 5mC bases, the C to T ratio at the 5mC position is about 40% higher than other cytosine positions, allowing 5mC to be differentiated from C. (See, Li et al. (2022) Genome Biology 23: 122.)
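The base-by-base read-outs described above can be collected in a small lookup table. The following Python sketch is illustrative only: the dictionary name `NITRITE_READOUT` and the helper `read_after_nitrite` are assumptions, and the table models each deamination as going to completion, whereas the text notes that 5mC is distinguished from C by a difference in the C to T ratio rather than by an all-or-nothing change.

```python
# Input base -> (product after nitrite treatment, base read by a polymerase).
# Simplifying assumption: every listed reaction goes to completion.
NITRITE_READOUT = {
    "A": ("I", "G"),         # adenine -> inosine, read as G
    "6mA": ("6mA-NO", "A"),  # nitrosylated 6mA, read as A
    "C": ("U", "T"),         # cytosine -> uracil, read as T
    "4mC": ("4mC-NO", "C"),  # nitrosylated 4mC, read as C
    "5mC": ("T", "T"),       # 5mC -> thymine, read as T
}


def read_after_nitrite(bases: list[str]) -> list[str]:
    """Return the base a polymerase would read at each input position."""
    return [NITRITE_READOUT[base][1] for base in bases]


print(read_after_nitrite(["A", "6mA", "C", "4mC", "5mC"]))
# ['G', 'A', 'T', 'C', 'T']
```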

V. Guide RNAs (gRNAs, sgRNAs)

[00178] A “guide RNA” (“gRNA”) is a type of RNA that includes a CRISPR RNA sequence (crRNA, also referred to as a “guide sequence” or “spacer”), and, in certain embodiments, a trans-activating CRISPR RNA sequence (tracrRNA). The tracrRNA, if present, binds to an endonuclease (e.g., a CRISPR enzyme) and the crRNA is complementary to a target sequence. In certain embodiments, the guide RNA is referred to as a single guide RNA (sgRNA), which refers to a guide RNA comprising both a crRNA and tracrRNA.

[00179] A guide sequence can be designed to have complementarity to a target sequence of the disclosure, where hybridization between a target sequence and a guide sequence promotes the formation of a gene editing endonuclease complex (e.g., a CRISPR complex). Full complementarity may not be required, provided there is sufficient complementarity to cause hybridization and promote formation of a gene editing endonuclease complex (e.g., a CRISPR complex). In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. In certain embodiments, the guide sequence and the target sequence exhibit full (100%) complementarity.
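As a rough illustration of the degree of complementarity discussed above, the following Python sketch scores a guide RNA against a DNA target of the same length. It assumes the two sequences are already aligned without gaps and are both written 5' to 3', so it does not perform the optimal alignment step that the specification assigns to a suitable alignment algorithm. The function name `percent_complementarity` is hypothetical.

```python
# Watson-Crick partner on the DNA target for each RNA guide base.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna: str, target_dna: str) -> float:
    """Percentage of guide positions that base-pair with the target.

    Both sequences are given 5'->3'; the strands pair antiparallel,
    so the target is read in reverse. Assumes equal length, no gaps.
    """
    if not guide_rna or len(guide_rna) != len(target_dna):
        raise ValueError("sequences must be non-empty and of equal length")
    target_3_to_5 = target_dna.upper()[::-1]
    paired = sum(
        RNA_TO_DNA_PAIR.get(g) == t
        for g, t in zip(guide_rna.upper(), target_3_to_5)
    )
    return 100.0 * paired / len(guide_rna)


print(percent_complementarity("GUAC", "GTAC"))  # 100.0
print(percent_complementarity("GUAC", "GTAA"))  # 75.0
```

A guide with a score at or above one of the listed thresholds (for example, 90%) would fall within the corresponding embodiment under this simplified, ungapped scoring.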

[00180] Optimal alignment of the polyribonucleotide to the target sequence may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina®, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net). In some embodiments, a guide sequence is about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length.

[00181] The ability of a guide sequence to direct sequence-specific binding of a CRISPR complex to a target sequence may be assessed by any suitable assay. For example, the components of a CRISPR system sufficient to form a CRISPR complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the CRISPR sequence, followed by an assessment of preferential cleavage within the target sequence, such as by Surveyor assay as described herein. Similarly, cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a CRISPR complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible, and will occur to those skilled in the art.

VI. Nucleases

[00182] The nuclease used in the methods described herein can be an endonuclease, for example, a Cas protein, that is capable of cleaving DNA to effect a staggered break at the intended locus, wherein the break results in an overhang. Non-limiting examples of Cas proteins that are capable of cleaving DNA to effect a staggered break at the intended locus, wherein the break results in an overhang, include type V CRISPR enzymes such as Cas12, Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f1, Cas12g, Cas12h, Cas12i, homologs thereof, or modified versions thereof (see, e.g., Liu et al. (2019) supra). For example, in some embodiments, the DNA endonuclease is a Cas12 endonuclease that effects a staggered break at a locus within or near a target sequence, producing a 1-5 nt overhang. Cas12 recognizes a 5’-T-rich PAM, such as TTN or TTTN.

[00183] In certain embodiments, the endonuclease is a Cas12a/Cpf1 endonuclease; a homolog thereof, a recombinant of the naturally occurring molecule thereof, a codon-optimized version thereof, a modified version thereof, and combinations of any of the foregoing. The Cas12a/Cpf1 endonuclease can be derived from a variety of bacterial species. For example, in certain embodiments, the Cas12a/Cpf1 endonuclease is derived from Acidaminococcus bacteria or Lachnospiraceae bacteria. In a specific embodiment, the Cas12a/Cpf1 endonuclease is a Lachnospiraceae bacterium ND2006 Cpf1.

[00184] In certain embodiments, the endonuclease is a MAD7 endonuclease, a homolog thereof, a recombinant of the naturally occurring molecule thereof, a codon-optimized version thereof, a modified version thereof, and combinations of any of the foregoing. MAD7 is a codon-optimized endonuclease that can be derived from Eubacterium rectale (Inscripta, Boulder, CO). MAD7 is described in U.S. Patent No. 9,982,279.

[00185] In addition, it has recently been discovered that the Cas9 enzyme (a type II CRISPR enzyme that recognizes a 3’-G-rich PAM such as NGG), previously thought to create blunt cuts (i.e., breaks in DNA that do not result in an overhang), is capable of creating 1 nt, 2 nt, and 3 nt overhangs. (See, e.g., Shi et al. (2019) Cell Discovery 5:53.) Accordingly, in certain embodiments, the endonuclease is a Cas9 protein.

[00186] Another nuclease capable of cleaving DNA to effect a staggered break at the intended locus, wherein the break results in an overhang, is a CasX nuclease. (See, Liu et al. (2019) Nature 566:218-223.) CasX recognizes a 5’-TTCN PAM and is capable of creating 10-nt overhangs. (Id.)
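The PAM motifs named above (TTTN for Cas12, NGG for Cas9, and TTCN for CasX) can be located in a sequence with a simple pattern search. The Python sketch below is a minimal illustration under stated assumptions: it scans only the strand supplied, treats N as any of A, C, G, or T, and does not model where the nuclease cuts relative to the PAM. The names `PAM_MOTIFS` and `find_pam_sites` are hypothetical.

```python
import re

# PAM motifs as given in the text; N matches any base.
PAM_MOTIFS = {"Cas12": "TTTN", "Cas9": "NGG", "CasX": "TTCN"}


def find_pam_sites(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of the PAM motif on the given strand."""
    # A lookahead lets overlapping matches be reported.
    pattern = "(?=(" + motif.replace("N", "[ACGT]") + "))"
    return [m.start() for m in re.finditer(pattern, sequence.upper())]


example = "ACTTTAGGCTTCA"  # made-up sequence for illustration
for nuclease, motif in PAM_MOTIFS.items():
    print(nuclease, motif, find_pam_sites(example, motif))
# Cas12 TTTN [2]
# Cas9 NGG [5]
# CasX TTCN [9]
```

Scanning the reverse complement as well would be needed to find PAMs on the opposite strand; that step is omitted here for brevity.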

[00187] In some embodiments, the endonuclease (e.g., a CRISPR enzyme) directs cleavage at the location of a target sequence, such as within the target sequence and/or within the complement of the target sequence. In some embodiments, the endonuclease directs cleavage within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500, or more base pairs from the first or last nucleotide of a target sequence.

[00188] In general, “CRISPR system” refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or other sequences and transcripts from a CRISPR locus. In some embodiments, one or more elements of a CRISPR system is derived from a type I, type II, or type III CRISPR system. In some embodiments, one or more elements of a CRISPR system is derived from a particular organism comprising an endogenous CRISPR system, such as Streptococcus pyogenes. In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system).

VII. Kits

[00189] Also disclosed herein are kits for enriching a target nucleic acid and/or making an enriched nucleic acid library. For example, the kit may include a nuclease that cuts a nucleic acid molecule including a target sequence to generate a single stranded overhang at a cut end of the molecule that includes the target; labeled dNTPs; DNA polymerase; and a capture moiety comprising a capture domain.

[00190] In certain embodiments, the kit includes a nuclease, such as a CRISPR-Cas nuclease. In certain embodiments, the nuclease is a type II CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas9 nuclease. In certain embodiments, the nuclease is a type V CRISPR-Cas nuclease. In certain embodiments, the nuclease is a Cas12 nuclease. In certain embodiments, the nuclease is a Cas12a/Cpf1 nuclease. In certain embodiments, the nuclease is a MAD7 nuclease. In certain embodiments, the nuclease is a CasX nuclease.

[00191] In certain embodiments, the DNA polymerase is DNA polymerase I. In certain embodiments, the DNA polymerase I consists of the Klenow fragment. In certain embodiments, the label comprises biotin, digoxigenin, a magnetic moiety or a fluorophore. In certain embodiments, the capture moiety comprises avidin, streptavidin, or a DIG-binding molecule. In certain embodiments, the capture moiety comprises or is connected to a solid support.

[00192] Kits contemplated herein may further include a solid support, such as a bead, a well, a tube, or a slide. In certain embodiments, the capture domain comprises streptavidin connected to a bead.

[00193] Throughout the description, where apparatus, devices, and systems are described as having, including, or comprising specific components, or where processes and methods are described as having, including, or comprising specific steps, it is contemplated that, additionally, there are apparatus, devices, and systems of the present invention that consist essentially of, or consist of, the recited components, and that there are processes and methods according to the present invention that consist essentially of, or consist of, the recited processing steps.

[00194] Practice of the invention will be more fully understood from the following examples, which are presented herein for illustrative purposes only, and should not be construed as limiting the invention in any way.

EXAMPLES

Example 1 - Cleaving Target Sequences Using CRISPR-Cas12a

[00195] This example describes an exemplary method for target cleavage (e.g., at a CpG site) using a gene editing system (CRISPR-Cas12a), for use in an enrichment method provided herein.

Prepare target specimen

[00196] DNA samples (either genomic DNA (gDNA) or sheared genomic DNA (shDNA)) comprising a target sequence were obtained. The shDNA was sheared to approximately 180 bp to serve as a model for cfDNA. Herring DNA lacking the target sequence of interest was used as a negative control. An amplicon containing the target sequence of interest (HPRT control target, or one of six experimental CpG sites) was generated using PCR (New England BioLabs® LongAmp®) and purified using solid-phase reversible immobilization (SPRI).

Cleaving target specimen using Cas12a

[00197] One (1) µM Cas12a and 300 nM crRNA were incubated at room temperature for at least 10 minutes to create Cas complexes. Thirty (30) nM of each amplicon generated in the purification step was added to the complexes to cut the amplicon at the target sites with a 4-base overhang on the opposite strand to the PAM. A solid-phase reversible immobilization (SPRI) selection was performed to remove unwanted DNA fragments, excess enzymes, dNTPs and molecules. Purified cleaved DNA was analyzed using the Agilent TapeStation and Qubit™ Fluorometer to determine cutting efficiency.

Results

[00198] As shown in TABLE 5, Cas12a cut each target, with an average cutting efficiency of between 48% and 93%.

TABLE 5

Example 2 - Cleaving Target Sequences in Plasma Using CRISPR-Cas12a

[00199] This example demonstrates that a CRISPR cleavage reaction can be performed in plasma instead of buffer, suggesting that a plasma sample can be used directly in a CRISPR cleavage reaction in connection with the methods of the disclosure.

Prepare target specimen

[00200] A human DNA sample comprising a target sequence of interest was obtained. An amplicon containing the target sequence of interest (HPRT control target) was generated using PCR (New England BioLabs® LongAmp®) and purified using solid-phase reversible immobilization (SPRI).

Cleaving target specimen using Cas12a in plasma

[00201] One (1) µM (1x), 3 µM (3x), or 5 µM (5x) Cas12a and 300 nM (1x), 900 nM (3x), or 1.5 µM (5x) crRNA, respectively, were incubated at room temperature for at least 10 minutes to create Cas complexes. Thirty (30) nM of the HPRT amplicon and 21 µL of deionized water or plasma were added to the complexes to cut the amplicon at the target sites with a 4-base overhang on the opposite strand to the PAM. A solid-phase reversible immobilization (SPRI) selection was performed to remove unwanted DNA fragments, excess enzymes, dNTPs and molecules. Purified cleaved DNA was analyzed using the Agilent TapeStation and Qubit™ Fluorometer to determine cutting efficiency.

[00202] The experiment was repeated using various combinations of 3 µM (3x) or 5 µM (5x) Cas12a and 300 nM (1x), 600 nM (2x), or 900 nM (3x) crRNA.

Results

[00203] As shown in TABLE 6, Cas12a is capable of cutting a target DNA sequence in the presence of plasma, and increasing the amount of Cas12a and crRNA in the reaction increases the efficiency of cutting to a level that is similar to the efficiency of cutting in buffer.

TABLE 6

Example 3 - Cleaving Target Sequences at Room Temperature Using CRISPR-Cas12a

[00204] This example demonstrates that the methods of the disclosure can be performed at room temperature.

Prepare target specimen

[00205] A human DNA sample comprising a target sequence of interest (CpG-4) was obtained. An amplicon containing the target sequence of interest was generated using PCR (New England BioLabs® LongAmp®) and purified using solid-phase reversible immobilization (SPRI).

Cleaving target specimen using Cas12a at room temperature

[00206] One (1) µM Cas12a and 300 nM crRNA were incubated at room temperature for at least 10 minutes to create Cas complexes. Thirty (30) nM of the CpG-4 amplicon and 21 µL of deionized water were added to the complexes to cut the amplicon at the target site. The reaction was incubated at room temperature for 30 s, 1 min, 3 min, 5 min, or 10 min, and then 1 µL ProK was added. A solid-phase reversible immobilization (SPRI) selection was performed to remove unwanted DNA fragments, excess enzymes, dNTPs and molecules. Purified cleaved DNA was analyzed using the Agilent TapeStation and Qubit™ Fluorometer to determine cutting efficiency.

Results

[00207] As shown in TABLE 7, Cas12a is capable of cutting a target DNA sequence at room temperature, with the highest efficiency of cutting seen at the 5-minute and 10-minute incubation times (i.e., 5 minutes or longer).

TABLE 7

Example 4 - Incorporation of Biotin-Labeled dNTPs at an Overhang of a CRISPR Cut Site

[00208] This experiment demonstrates that biotin-labeled dNTPs can be incorporated at the overhang of a CRISPR cut site at a target sequence.

Prepare target specimen

[00209] A human DNA sample comprising a target sequence of interest was obtained. An amplicon containing the target sequence of interest was generated using PCR (New England BioLabs® LongAmp®) and purified using solid-phase reversible immobilization (SPRI).

Cleaving target specimen using Cas12a and filling in overhangs with biotinylated dNTPs

[00210] Cas12a (1 µM) and crRNA (300 nM) were incubated for at least 10 minutes to create Cas complexes. Amplicon was added to the complexes to cut the amplicon at the target sites with a 4-base overhang on the opposite strand to the PAM. The overhang bases were filled in using DNA Polymerase-I and 1 mM biotinylated-dNTPs and/or 1 mM unlabeled dNTPs. DNA polymerase was used at 0.1 units/µL (1x) or 0.5 units/µL (5x). Streptavidin beads were added and bound to DNA containing biotinylated dNTPs. The reaction mixture was centrifuged and the beads separated from the supernatant. Bead and supernatant samples were analyzed using the Agilent TapeStation and Qubit™ Fluorometer to determine cutting efficiency.

Results

[00211] As shown in FIG. 4, the results of the experiment show that streptavidin beads were capable of binding to and isolating target fragments that had incorporated biotinylated dNTPs. Bands representing a biotinylated target fragment bound to beads were seen in both the 1x and 5x polymerase (“Enzyme”) conditions and with anywhere from 10% to 100% biotinylated dNTPs. Three negative controls, “cut control” lacking polymerase enzyme and biotinylated dNTPs, “no bind control” lacking Cas12, crRNA, and polymerase enzyme, and “bind control” which contained a biotinylated amplicon, did not contain biotinylated target fragment.

Example 5 - Positive Enrichment of Target Sequences using CRISPR and Library Generation

[00212] This example provides an exemplary process overview for Cas12a positive enrichment of target sequences. A flowchart of the experimental design is shown in FIG. 5 and a schematic of each step is shown in FIG. 6. This example demonstrates successful completion of a target-sequence enriched library using CRISPR to enrich sequences of interest.

End-repair cfDNA

[00213] Cell-free DNA comprising a target of interest was blunt end-repaired by incubating cfDNA, dNTPs and Klenow fragment (3’-5’ exo-) at 37°C for 30 minutes. A solid-phase reversible immobilization (SPRI) selection was used to remove unwanted DNA fragments, excess enzymes, dNTPs and molecules.

Positive enrichment using Cas12a

[00214] Cas12a and crRNA were incubated at room temperature (25°C) for 10 minutes to create Cas complexes. The target specimen was then spiked into the complexes to cut the specimen at the target sites with a 4-base overhang on the opposite strand to the PAM. Biotinylated dNTPs and Klenow fragment (3’-5’ exo-; 0.1 units/µL) were then added and incubated at 37°C for 30 minutes to fill in the overhang. A solid-phase reversible immobilization (SPRI) selection was performed to remove unwanted DNA fragments, excess enzymes, dNTPs and molecules. DNA comprising the target sequence (having biotinylated dNTPs) was bound to streptavidin beads using the biotin:streptavidin interaction. A series of washes removed off-target DNA molecules and the samples were enriched for on-target fragments and depleted for off-target fragments. Streptavidin beads with target DNA bound were resuspended in water.

Unmethylated cytosines converted to uracils

[00215] Bisulfite conversion was performed on DNA bound to streptavidin beads by denaturation using high heat, preferential deamination (at an acidic pH) of unmethylated cytosines, which were then converted to uracil by desulfonation (at an alkaline pH). Methylated cytosines remained unchanged on the single-stranded DNA (ssDNA) product.
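The net effect of the conversion on a single strand can be sketched in Python as follows. The input sequence and the set of methylated positions are hypothetical, and the function `bisulfite_convert` is an illustration of the base change only, not of the chemistry or of any particular kit.

```python
def bisulfite_convert(seq: str, methylated=frozenset()) -> str:
    """Model bisulfite conversion of one DNA strand.

    Unmethylated cytosines become uracil (U); cytosines whose 0-based
    index is listed in `methylated` are left unchanged.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical strand with a methylated cytosine at index 1 only.
print(bisulfite_convert("ACGTCCGA", methylated={1}))  # ACGTUUGA
```

After amplification the uracils are read as thymines, so methylated and unmethylated cytosines can be told apart by comparing reads against the reference sequence.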

Generate library

[00216] The bisulfite converted ssDNA was then used to create a library (“library creation”, LC) using Adaptase® technology from IDT. This technology uses an enzymatic reaction resulting in unbiased addition of a truncated adapter. The Adaptase® enzymatic reaction performed end-repairing, tailing of 3’ ends and ligation of the first truncated adapter complement to 3’ ends simultaneously. A uracil-free reverse complement to the bisulfite converted ssDNA was then generated using the truncated adapter to prime and extend. A solid-phase reversible immobilization (SPRI) selection was performed to remove unwanted ssDNA fragments, excess adapters and molecules. A ligation reaction was performed, adding truncated P5 adapter to the 3’ end of the uracil-free reverse complement fragment. A solid-phase reversible immobilization (SPRI) selection was used to remove unwanted ssDNA fragments, excess adapters and molecules. Indexing PCR amplification was performed with a high fidelity DNA polymerase and unique, known 10-bp barcodes. Indices allow for sample multiplexing in the downstream assay. The product was a bisulfite converted dsDNA library with full length adapters. Post-PCR, a SPRI selection was done to remove unwanted ssDNA fragments, excess primers, excess adapters and excess molecules. After library construction, the library quality and quantity were evaluated using the Agilent TapeStation and Qubit Fluorometer, respectively.

Sequencing of enriched library

[00217] Sequencing was performed on an iSeq using paired-end 150x150 base sequencing with a 5% PhiX spike-in. The sequencing data generated was then demultiplexed using the assigned barcodes, aligned to the human genome and trimmed. The cleaned-up data was then processed through a quality pipeline to collapse duplicate reads, and the sequencing data was evaluated. As shown in FIG. 7, the library exhibited a conversion efficiency of 99.04%.
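The example does not state how the conversion efficiency was calculated. One common definition, assumed here, is the percentage of cytosines expected to be unmethylated that are read as converted. A minimal Python sketch under that assumption, with made-up counts that are not data from this example:

```python
def conversion_efficiency(converted: int, unconverted: int) -> float:
    """Percentage of assessed cytosines that were read as converted.

    Both arguments are counts of cytosine observations at positions
    assumed to be unmethylated (an assumption of this sketch).
    """
    total = converted + unconverted
    if total == 0:
        raise ValueError("no cytosine observations to evaluate")
    return 100.0 * converted / total


# Illustrative counts only.
print(conversion_efficiency(converted=990, unconverted=10))
```

Which cytosines are counted (for example non-CpG positions or a spiked-in unmethylated control) changes the result, so the definition should be reported alongside the figure.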

Example 6 — Enrichment of a Target Region Within a Nucleic Acid Having Multiple Target Sites

The methods of Example 5 were repeated using gDNA as the nucleic acid source and CpG-5plex as the target, which contained multiple cut sites. Enrichment for one of the targets (CpG-4) within the CpG-5plex is shown in FIG. 8. As shown, the CpG-4 target is enriched in the resulting library, whereas a no Cas/crRNA (“no cut”) control and a library constructed using the gDNA without the enrichment steps (“no C-Select”) showed no enrichment. These results demonstrate that a library enriched for specific target sequences can be constructed using the methods of the disclosure.

INCORPORATION BY REFERENCE

[00218] The entire disclosure of each of the patent and scientific documents referred to herein is incorporated by reference for all purposes.

EQUIVALENTS

[00219] The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting on the invention described herein. Scope of the invention is thus indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.

TABLE 1 - List of CGIs

Reference Pos (hg19 coordinates)

1 chr13:108518334-108518633

2 chr6:137242315-137245442

3 chr2:177016416-177016632

4 chr5:2738953-2741237

5 chr4:111553079-111554210

6 chr15:96909815-96910030

7 chr6:42072032-42072701

8 chr10:123922850-123923542

9 chr16:86612188-86613821

10 chr19:47151768-47153125

11 chr1:110610265-110613303

12 chr5:3594467-3603054

13 chr9:126773246-126780953

14 chr3:138656627-138659107

15 chr4:4859632-4860191

16 chr10:118895963-118898037

17 chr7:103086344-103086840

18 chr19:407011-409511

19 chr10:22764708-22767050

20 chr16:86549069-86550512

21 chr9:96713325-96718186

22 chr8:139508795-139509774

23 chr2:73143055-73148260

24 chr8:26721642-26724566

25 chr9:129386112-129389231

26 chr12:49483601-49484255

27 chr16:54325040-54325703

28 chr8:72468560-72469561

29 chr18:70533965-70536871

30 chr9:98111364-98112362

31 chr1:50882997-50883426

32 chr10:88122924-88127364

33 chr11:31839353-31839813

34 chr10:101290025-101290338

35 chr6:41528266-41528900

36 chr16:51183699-51188763

37 chr5:140346105-140346931

38 chr9:23820691-23822135

39 chr20:690575-691099

40 chr1:177133392-177133846

41 chr5:45695394-45696510

42 chr2:45395869-45398186

43 chr20:48184193-48184833

44 chr6:6002471-6005125

45 chrl4:101192851-101193499 chr8:4848968-4852635 chr8:53851701-53854426 chrl2:186863-187610 chr5:54519054-54519628 chr6:108485671-108490539 chr3:157815581-157816095 chrll:626728-628037 chr2:177012371-177012675 chrl7:59531723-59535254 chrl6:55364823-55365483 chr8:99960497-99961438 chr7:42267545-42267823 chrl7:14202632- 14203258 chrl0:102891010-102891794 chr5:174158680-174159729 chrl4:33402094-33404079 chr2:177036254-177037213 chrl0:106399567-106402812 chr6:166579973-166583423 chrll:123066517-123066986 chrll:44327240-44327932 chrl4:95237622-95238211 chr9:102590742-102591303 chrl5:76630029-75630970 chr4:24801109-24801902 chr8:97169731-97170432 chr3:6902823-6903516 chr22:48884884-48887043 chrl5:45408573-45409528 chr9:100610696-100611517 chr4:174448333-174448845 chrl6:20084707-20085305 chr4:174439812-174440249 chr6:10381558-10382354 chrl5:35046443-35047480 chrl0:119494493-119494991 chr5:72676120-72678421 chrll:44325657-44326517 chrl7:46670522-46671458 chrl4:92789494-92790712 chr4:174459200-174460054 chr2:80549578-80549798 chr7:153748407-153750444 chr6:1389139-1391393 chrl6:49314037-49316543 chr2:105459127-105461770 chr21:38079941-38081833 chr4:174427891-174428192 chrl4:60973772-60974123 chr8:99985733-99986983 chr2:63281034-63281347 chrl2:101109863-101111622 chrl:119549144-119551320 chr5:38257825-38259136 chr5:54522302-54523533 chrl:165324191-165326328 chrl5:33602816-33604003 chrl0:118030732-118034230 chr2:45240372-45241579 chr4:174430386-174430861 chr6:50810642-50810994 chr5:122430676-122431443 chrl0:109674196-109674964 chr8:97172634-97173880 chr8:11536767-11538961 chr5:180486154-180486892 chr2:38301276-38304518 chrl0:1778784-1780018 chrl2:54424610-54425173 ch r 17 :46669434-46669811 chrll:8190226-8190671 chr8:25900562-25905842 chrl2:81102034-81102716 chr7:27199661-27200960 chrl0:119311204-119312104 chrl2:130387609-130389139 chr7:155258827-155261403 chr6:117591533-117592279 chrl0:111216604-111217083 chrl:29585897-29586598 chr2:144694656- 144695180 
chrl2:48397889-48398731 chr5:2748368-2757024 chrl2:114845861-114847650 chr2:80529677-80530846 chr5:1874907-1879032 chr6:100905952-100906686 chrl5:96904722-96905050 chr5:134374385-134376751 chr2:66652691-66654218 chrl2:54440642-54441543 chr6:108495654-108495986 chrl7:70112824-70114271 chr3:87841796-87842563 chr7:96650221-96651551 chr4:110222970-110224257 chr6:78172231-78174088 chr7:155164557-155167854 chrl2:113900750-113906442 chr9:112081402-112082905 chrl2:114886354-114886579 chr5:3590644-3592000 chr2:119592602-119593845 chr20:21485932-21496714 chrl8:11148307-11149936 chrl7:46824785-46825372 chrl0:100992156-100992687 chrl4:36986362-36990576 chrl8:55094825-55096310 chrl5:96895306-96895729 chrl7:36717727-36718593 chr2:223183013-223185468 chr7:30721372-30722445 chrl:53527572-53528974 chrl8:56939624-56941540 chr5:175085004-175085756 chrl0:50817601-50820356 chrl4:60975732-60978180 chrl5:89920793-89922768 chr9:122131086-122132214 chrl:217311467-217311773 chrl4:38724254-38725537 chrl4:61103978-61104663 chrl8:73167402-73167920 chrl:50880916-50881516 chr2:241758141-241760783 chrll:31825743-31826967 chr7:27260101-27260467 chr20:41817475-41819212 chr3:238391-240140 chr7:121950249-121950927 chr5:72526203-72526497 chrl5:96903311-96903711 chrl0:26504383-26507434 chr6:100915602-100915883 chrl:18962842-18963481 chr3:127794369-127796136 chr7:27203915-27206462 chr8:25899335-25899692 chrl2:114838312-114838889 chr6:38682949-38583265 chrll:31841315-31842003 chr4:174451828-174452962 chr9:129372737-129378106 chr2:176964062-176965509 chr2:176931575-176932663 chrl2:114833911-114834210 chrll:79148358-79152200 chr2:177024501-177025692 chr5:172672311-172672971 chr7:27291119-27292197 chrl:180198119-180204975 chrl4:37126786-37128274 chr2:200333687-200334172 chrl4:58331676-58333121 chr3:147131066- 147131333 chrl3:109147798-109149019 chrl4:48143433-48145589 chr6:100905444-100905697 chrl7:14200579- 14200996 chr6:1379693-1380014 chrl:34642382-34643024 chr2:119599059-119599299 
chr2:119613031-119615565 chr4:85413997-85414874 chr9:17906419-17907488 chrl2:29302034-29302954 chr20:10200088-10200384 chr8:57358126-57359415 chrl0:63212495-63213009 chr2:176936246-176936809 chrll:20618197-20619920 chrl8:19744936-19752363 chrl4:29234889-29235908 chrl7:46673532-46674181 chr4:144620822- 144622218 chrl6:82660651-82661813 chr3:192125821-192127994 chr2:119599458-119600966 chr22:44257942-44258612 chrl9:13616752-13617267 chr3:147138916- 147139564 chr9:969529-973276 chrl8:55103154-55108853 chr4:174422024-174422443 chr4:57521621-57522703 chrl5:79724099-79725643 chrl4:37135513-37136348 chrl0:23480697-23482455 chr2:45169505-45171884 chrl8:30349690-30352302 chr6:99291327-99291737 chr9:21970913-21971190 chr4:107146-107898 chrl2:117798076-117799448 chr2:219736132-219736592 chrl0:118892161-118892639 chrll:27743472-27744564 chrl2:65218245-65219143 chrl2:75601081-75601752 chr7:54612324-54612558 chr6:100912071-100913337 chrl0:102905714-102906693 chr8:87081653-87082046 chr6:50818180-50818431 chrl:91189139-91189400 chr2:118981769-118982466 chrl0:50602989-50606783 chrl7:59528979-59530266 chr4:147559205- 147561901 chrl:4713989-4716555 chrl3:102568425-102569495 chrl6:6068914-6070401 chr22:29709281-29712013 chrl0:100993820-100994188 chr6:391188-393790 chr2:176977284-176977540 chr4:4868440-4869173 chr6:137809342-137810204 chrl2:54321301-54321721 chr2:105468851-105473488 chr8:55366180-55367628 chrl2:72665683-72667551 chr4:54966163-54968063 chr5:134366913-134367438 chrl:226075150-226075680 chr20:17206528-17206952 chr4:172733734-172735118 chrl8:55019707-55021605 chr2:162279835-162280709 chr6:1381743-1385211 chr7:103968783-103969959 chr6:150358872-150359394 chr2:119914126-119916663 chr7:27278945-27279469 chrl2:114851957-114852360 chrl6:24267040-24267527 chr6:7229877-7230865 chr2:45227644-45228783 chr4:174450046-174451469 chr4:154712073-154712706 chr3:22413492-22414365 chr20:21694472-21695344 chr6:1378445-1379318 chr8:70981873-70984888 chrl2:53107912-53108471 
chrl0:102996034-102996646 chr3:157821232-157821604 chr4:111554965-111555504 chrl3:58206526-58208930 chrl0:22634000-22634862 chr9:22005887-22006229 chr5:159399004-159399928 chr2:31805293-31806403 chr6:100903491-100903713 chr5:77268350-77268787 chrl4:85997468-85998637 chr5:92923487-92924497 chrll:64480199-64481344 chrl3:28366549-28368505 chr5:77805753-77806313 chr9:79633326-79636030 chr4:93226348-93227007 chr2:223170486-223171140 chrl:91172102-91172771 chrl:1181756-1182470 chr8:65281903-65283043 chrl0:94825546-94826320 chr6:108491033-108491410 chr21:38076752-38077685 chrl:91183240-91184540 chr3:147136903- 147137328 chrl5:96911511-96911808 chrl4:57274607-57276840 chrl3:112726281-112728419 chr2:171672310-171675447 chr8:11559596-11562956 chrl0:48438411-48439320 chrl8:59000683-59001692 chrl5:91642908-91643702 chr5:3592391-3592644 chrl9:56988313-55989741 chr6:26614013-26614851 chrll:27742059-27742273 chr3:147113608- 147114479 chrl4:57264638-57265561 chr7:155302253-155303158 chrll:31848487-31848776 chrl6:54970301-54972846 chrl9:30715549-30715753 chr9:96710811-96711717 chrl8:77557780-77558948 chr20:21686199-21687689 chrll:31847132-31847958 chrl6:86530747-86532994 chrl:203044722-203045390 chrl5:53096014-53096482 chr7:97361132-97363018 chrl4:29236835-29237832 chrl3:79182859-79183880 chrll:69517840-69519929 chrl:231296559-231297345 chrl9:8675333-8575699 chrl:63795363-63796140 chr4:90228714-90229010 chr3:62362610-62363082 chrl9:5827754-5828405 chrl0:125732220-125732843 chr9:136293566-136294160 chrl:53782394-63790471 chr4:4867386-4857673 chr9:133534534-133542394 chrl5:100913438-100914022 chrl0:101279941-101280382 chrl3:53419897-53422872 chrl:77747314-77748224 chrl4:36974548-36975425 chrl2:57618769-57619402 chr7:49813008-49815752 chr4:188916605-188916876 chrll:31831620-31839038 chr8:132052203-132054749 chr2:237071794-237078762 chr20:39994545-39995810 chrll:132812662-132813075 chr5:1707351S9-170739863 chrl:221051966-221053673 chr5:72529099-72529976 chrl4:36973169-35973740 
chr4:158141404-158141836 chrl4:103655241-103655928 chrl:65731411-65731849 chrl:38218190-38218977 chr3:128719865-128721245 chrl5:33009530-33011696 chr2:162275161-162275596 chr7:155241323-155243757 chrl9:46001830-46002686 chr6:137814355-137815202 chr7:70596228-70598382 chrl5:96959341-96960531 chrl6:66612749-66613412 chr6:110299365-110301267 chrl5:27215951-27216856 chrll:88241710-88242562 chr2:124782252-124783255 chrl7:70111979-70112308 chr2:63283936-63284147 chrl7:46800945-46801288 chr6:1393049-1394170 chr3:137489594-137491004 chrl5:60296135-60298520 chrl2:106979429-106981086 chrl2:54360374-54360660 chrl4:36991594-36992488 chr4:156129168-156130209 chr4:54975387-54976202 chr3:137482964-137484454 chrl0:118893527-118894432 chrl8:76737005-76741244 chrl0:110671724-110672326 chr5:71014917-71015715 chr6:50787285-50788091 chrl9:3868585-3869217 chr4:5894071-5895116 chrll:131780328-131781532 chr6:101846766-101847135 chrll:71952112-71952528 chr5:172663616-172664584 chr9:23822412-23822667 chr4:5891981-5892365 chrl:217310749-217311178 chrl0:108923780-108924805 chr6:100038655-100039477 chr7:121945345-121946235 chr3:147126988- 147128999 chr7:121956543-121957341 chr4:156680095-156681386 chr4:85404986-85405252 chrl:221064889-221065600 chrl7:73749618-73750178 chr8:55370170-55372525 chr6:70992040-70992912 chrl6:55513220-55513526 chr6:106433984-106434459 chrl4:29254365-29255069 chr6:33655965-33656238 chr9:19788215-19789288 chrll:115630398-115631117 chrl:34628783-34630976 chrl4:101923575-101925995 chrl7:72855621-72858012 chr2:223162946-223163912 chr4:85417659-85420799 chrl:156390403-156391581 chr3:147130342- 147130577 chr2:119602616-119604486 chr9:120175253-120177496 chr4:174443355-174443948 chr5:145724294- 145724551 chrll:32454874-32457311 chr2:176949511-176949795 chrl:18436551-18437673 chr3:26665950-26666164 chr3:170303044-170303249 chr2:223176493-223177515 chr2:182321761-182323029 chrl8:44789742-44790678 chrl7:46796234-46797292 chrl8:44772992-44775577 chr8:101117922-101118693 
chr7:27134097-27134303 chrl0:102507482-102509646 chrl9:39754973-39756540 chr7:26415746-26416891 chrl4:37116188-37117628 chr4:174421347-174421559 chr6:85472702-85474132 chr20:22557517-22559240 chr6:117198089-117198705 chrl0:71331926-71333392 chrl9:36334994-35335321 chr4:46995128-45995872 chr9:135455164-135458586 chr8:65290108-65290946 chrl0:94828102-94829040 chrl:116380359-116382364 chrl5:47476369-47477499 chr3:147115764- 147116421 chrl7:59485573-59485780 chrl0:23983366-23984978 chr2:176949993-176950336 chr9:137967110-137967727 chr2:176957054-176958279 chrll:119293320-119293943 chrll:132813562-132814395 chr2:237068071-237068834 chrl0:27547668-27548402 chr4:4866438-4866813 chr21:19617098-19617874 chrl:91185156-91185577 chrl9:15292399-15292632 chrl:145075483- 145075845 chr2:19560963-19561650 chrl4:57260878-57262123 chr8:55378928-55380186 chr6:99290279-99290771 chrl9:13124959-13125259 chrl5:27112030-27113479 chr8:145925410- 145926101 chrll:124629723-124629926 chr4:109093038-109094546 chr3:62356773-62357315 chrl4:37131181-37132785 chrl0:124905634-124906161 chr7:35296921-35298218 chrl9:36248979-36249307 chrl2:15475318-15475901 chr5:87985470-87985810 chrl2:54423427-54423712 chr7:96653467-96654199 chr2:45155195-45157049 chrl5:96896928-96897301 chrl2:58004982-58005351 chr2:176933131-176933449 chr2:176962179-176962487 chr20:25063838-25065525 chrl2:5153012-5154346 chr3:154146347-154146965

■ ' :165323486-165323811 chr21:38065179-38066185 chrl0:119000435-119001530 chrl2:45444202-45445386 chr4:158143296-158144053 chr5:76932317-76933523 chr5:172659049-172660277 chr2:223168653-223169008 chrl:248020330-248021252 chrl8:904578-909574 chrl2:127940451-127940907 chr9:135461934-135462909 chrl7:48041282-48043064 chr4:94755786-94756310 chrl0:130338695-130338994 chr2:119616133-119616825 chr2:177042751-177043444 chr2:105478600-105479188 chr5:172670829-172671824 chr2:176952695-176953297 chrl3:28549839-28550246 chrl3:112720564-112723582 chr6:100895773-100896062 chr7:136553854-136556194 chr6:127441553-127441760 chrl:119526782-119527192 chrl2:49484920-49485178 chr9:23850910-23851522 chr2:220299483-220300243 chr5:1881924-1887743 chr8:57360585-57360815 chrl8:74961556-74963822 chr5:172660720-172661133 chrl7:75277317-75278172 chrl0:99789614-99791320 chr2:176944087-176948446 chr4:154709512-154710827 chr5:140798757- 140799359 chr3:44063314-44063837 chrl5:79574830-79575211 chr2:223161531-223161919 chr6:134210639-134211218 chrl0:102899177-102899489 chrl3:79181944-79182222 chr7:71800757-71802768 chr3:186078710-186080111 chrl:24229115-24229537 chrl6:48844551-48845264 chr7:113724924-113727795 chr22:44726724-44727590 chr4:15779998-15780729 chr4:41869174-41869459 chrl:38941919-38942404 chr2:176971706-176972305 chr2:119607378-119607910 chr5:76934581-76935296 chrl2:103696090-103696418 chr5:63255044-63255407 chrl:221067447-221068185 chr2:119611296-119611881 chrl0:124907283-124911035 chrl2:114878143-114879155 chrl2:49371690-49375550 chrl7:36719544-36719938 chrl7:46696553-46696926 chr3:147142181- 147142391 chr8:9762661-9764748 chrl4:74706188-74708192 chr3:12837992-12838359 chr20:37352130-37357372 chrl0:8077829-8078378 chr4:4864456-4864834 chr4:13524062-13526083 chrl:66258440-66258918 chrll:17740789-17743779 chrl2:106975195-106975714 chr9:91792662-91793611 chrl:149333785- 149334111 chr3:170303532-170303768 chr5:72594147-72595808 chr5:145725286- 145725852 chrl0:23462224-23463889 
chr20:21689758-21690048 chrl5:53080458-53083699 chr2:154727906-154728271 chr5:170743178-170744107 chrl0:102899822-102900263 chr5:134368578-134370466 chr2:66808568-66809404 chr7:96651963-96652246 chrl:91190489-91192804 chrl7:75368688-75370506 chr4:185939222-185942747 chr7:43152020-43153340 chrl3:84453654-84453897 chr2:176956504-176956707 chr7:87563342-87564571 chr20:17208550-17208756 chr22:19746924-19747141 chr2:223159725-223160487 chrl2:131200509-131200726 chrl8:44336183-44337110 chr2:63285949-63287097 chr4:13526553-13526770 chrl5:89949373-89951130 chrl9:55815940-55816277 chrl7:50235175-50236466 chrl9:58545115-58545897 chrl2:113592203-113592620 chrl2:115109503-115110061 chr4:164264821-164265772 chrl:2772126-2772665 chr3:71834068-71834653 chrl2:5018585-5021171 chrl5:74419870-74423044 chr3:147108511- 147111703 chr5:88185224-88185589 chrl2:54354529-54355491 chrl0:101290625-101291178 chr8:11557852-11558252 chr8:105478672-105479340 chrll:20181200-20182325 chrl9:54483021-54483572 chrl3:112707804-112708696 chrl6:22824616-22826459 chr4:66536065-66536674 chr4:154713537-154714240 chr7:12151220-12151559 chrl2:119212110-119212393 chrl7:14201726- 14202052 chr20:21376358-21378245 chrl3:36045931-36046143 chrl5:60287107-60287663 chr9:100613938-100614622 chrl0:102475276-102475579 chr7:121940006-121940648 chr5:37834671-37835128 chrl:197887088-197887791 chrl2:99139386-99139769 chr6:1619093-1621094 chrl2:113917394-113918107 chrl4:24044886-24046760 chr5:77253832-77254049 chr4:85403830-85404524 chr6:166666837-166667541 chrl8:77547965-77549038 chr2:219848919-219850541 chrl7:7832532-7833164 chr5:134363092-134365146 chrl0:103043990-103044480 chr8:97171805-97172022 chr20:57089460-57090237 chrl2:114840853-114841063 chr4:66535193-66535620 chr8:85096759-85097247 chr6:10881846-10882051 chrl3:28498226-28499046 chrl:161695637-161697298 chrll:2890388-2891337 chrl7:5000369-5001205 chrl3:27334226-27335205 chrl0:22623350-22625875 chr2:157185557-157186355 chr7:20370003-20371504 chr4:961347-962155 
chrl2:49485766-49485977 chr3:62356119-62356378 chrll:14995128- 14995908 chrl2:53359192-53359507 chrl6:51168266-51169110 chrl4:57278709-57279116 chr6:37616722-37617179 chrl8:11750953-11752756 chrl9:45260352-45261809 chrl:119531991-119532196 chrl9:36523391-36523887 chrl2:52652018-52652743 chr8:49468683-49468959 chr8:9760750-9761643 chr7:19146923-19147308 chrl3:32889533-32889900 chr5:140797162- 140797701 chr21:42218489-42219222 chrl9:54411376-54411968 chr3:62354291-62355012 chrl2:113590806-113591304 chrl:225865068-225865328 chr7:130790358-130792773 chrl5:53076187-53077926 chrl:214158726-214159080 chrl2:3308812-3310270 chrl:39044059-39044561 chrl0:119312766-119313563 chrl2:65514878-65515863 chrl2:54366815-54369103 chrl2:114885105-114885418 chrl6:2228190-2230946 chrll:68622722-68623252 chr2:25499763-25500429 chr5:172661486-172662228 chrl7:46691520-46692097 chrl2:75602991-75603344 chr2:80531367-80531719 chr5:158478378-158478630 chr2:177017266-177017489 chr2:63282514-63283122 chr7:155595692-155599414 chr5:172665306-172666072 chrl2:114843022-114843610 chrl3:112758598-112760491 chr4:4858389-4858893 chrl6:55365814-55366022 chr9:96108466-96108992 chrl2:3475010-3475654 chr9:86152353-86153777 chr6:10384965-10385492 chr22:31500396-31501239 chr5:179228283-179229003 chr6:137816474-137817223 chr2:106681982-106682403 chrl4:95239375-95239679 chr7:154001964-154002281 chrl:1476093- 1476669 chrl5:89904822-89906050 chrll:89224416-89224718 chr9:100615234-100617510 chr3:172165372-172166738 chrl:202678881-202679769 chrl4:37053134-37053690 chr4:41875445-41875794 chr2:162273294-162273725 chrl:181287300-181287873 chrl3:79181327-79181614 chr8:145103285- 145108027 chr22:42305617-42307254 chr8:102505512-102506430 chrl7:74533281-74534566 chrl:214156000-214156851 chr20:2780978-2781497 chr4:4861227-4862241 chrl9:13215244-13215543 chr7:121943867-121944538 chrl7:71948478-71949255 chr2:127413696-127414171 chrl:113286332-113287172 chrl:47009575-47010132 chrl6:62069121-62070634 chrl6:3013651-3015131 
chrl8:76732970-76734765 chr4:155664819-155665833 chr6:72298274-72298528 chrl5:89147660-89149198 chrl7:33775294-33775794 chrl8:44337510-44338100 chrl0:8076002-8077261 chrl3:112717125-112717421 chrl5:89914363-89915061 chrl:228785986-228786204 chrl:156358050-156358252 chr7:751712-752150 chr3:137489051-137489409 chrl7:7905927-7907445 chrl8:35144907-35147628 chr3:9177691-9178189 chr6:10390888-10391098 chrl4:37052537-37052838 chrl:47909712-47911020 chrl3:93879245-93880877 chrl:50893468-50893745 chr7:27282085-27283136 chr4:147558231- 147558583 chrl9:13124569-13124788 chrl7:46619087-46619314 chr3:44596535-44597018 chrl4:24803678-24804353 chr2:3286324-3286530 chrl2:14134626- 14135242 chrl2:114881649-114881937 chr20:22548967-22549720 chr8:3782248S-37824008 chrl3:100641334-100642188 chr4:206377-206892 chr3:11034445-11035384 chr7:152622343-152623305 chrl0:22629360-22630328 chr4:140201064- 140201449 chrl9:46318490-46319266 chr3:121902742-121903645 chr9:77112712-77113583 chr2:114256775-114258043 chrl0:15761423-15762101 chrl:115880167-115881332 chr6:50791110-50791573 chr6:55039170-55039392 chr2:176980755-176981423 chr8:86350765-86351196 chr8:24812946-24814299 chr7:19184818-19185033 chr5:76936126-76936984 chr5:87980878-87981272 chr9:77111778-77112042 chrll:20622720-20623399 chrl:50882433-50882660 chrl7:35291899-35300875 chrl7:46675044-46675589 chr20:5296265-5297798 chr7:156871054-156871297 chr4:681313-681514 chr2:177039551-177039951 chrl7:46695325-46695553 chrl:41283840-41284591 chr9:16726859-16727273 chrl:65991001-65991811 chrl:181452706-181453073 chr8:120428398-120429178 chr3:32863174-32863415 chr4:134069152-134070442 chrl2:123754049-123754373 chr5:63256548-63257886 chr5:1879689-1879928 chrl0:118899247-118900329 chr20:2731063-2731395 chr5:134385967-134386370 chr2:177014948-177015214 chrl:67218079-67218293 chrll:65408344-65408631 chr7:156801418-156801632 chrl8:54788959-54789194 chr2:220173870-220174283 chr2:220173021-220173271 chrl2:113908887-113910681 chr6:100897080-100897621 
chrl:155290606-155291001 chr2:130763483-130763764 chrl2:129337870-129338653 chr21:34395128-34400245 chrl2:52115410-52115679 chr3:126113547-126113967 chrl6:3220438-3221356 chrl:119543056-119543454 chrl4:62279476-62280019 chrll:636906- 640628 chrl0:102893660-102895059 chr3:3840513-3842772 chrl:119529819-119530712 chr9:32782936-32783625 chrl9:1064897-1065191 chr5:54527319-54527760 chr7:156795355-156799394 chrl:155147185-155147444 chr9:37002489-37002957 chrll:69831571-69832484 chr2:128421719-128422182 chr22:38476836-38478839 chrl9:54412710-54413087 chr9:123656750-123656972 chr7:129422997-129423355 chrl9:36336275-36337138 chr2:50574045-50574817 chrl0:102975969-102978096 chr6:5996185-5996486 chr3:26664104-26664796 chr7:155170623-155170939 chr8:65286067-65286659 chrl4:37125219-37125661 chrll:65816404-65816665 chr6:41908745-41909711 chrl7:46620367-46621373 chr2:142887724- 142888553 chrl:221050448-221050864 chrl2:106974412-106974951 chrl4:57278068-57278287 chrl:67773329-67773767 chrl7:40936445-40936668 chr20:2729997-2730797 chrl2:113013099-113013529 chr7:155244046-155244357 chrl:214153214-214153668 chrl:156863415-156863711 chrl:114695136-114696672 chrl4:85996494-85996958 chr7:100823307-100823701 chr20:52789252-52790986 chr5:178421225-178422337 chrll:36397926-36399398 chrl3:36052553-36053119 chrl4:57283967-57284558 chr4:25090106-25090510 chr2:5831187-5831413 chr6:117869097-117869530 chrl9:58094739-58095764 chr4:85422929-85423190 chrl3:100547172-100547431 chr8:68864584-68864946 chrl6:49311413-49312308 chr7:19184221-19184686 chr2:19562749-19562965 chrl9:54481412-54481955 chrl0:124901907-124902617 chr3:62357639-62359774 chrll:31827696-31827921 chrl7:43037166-43037740 chr7:37955622-37956555 chr6:106429111-106429772 chr6:50682334-50683214 chr5:76923887-76924502 chr6:168841818-168843100 chr7:19145872-19146256 chr20:32856659-32857248 chrl7:79859808-79860963 chr7:95225503-95226194 chrl4:105167663-105168129 chrl7:14248391- 14248721 chrl6:84002269-84002860 chr9:104499849-104501076 
chrl7 :46604362-46604881 chr2:87015974-87018182 chrl4:36990873-36991209 chr5:52777788-52777996 chrl9:35633847-35634629 chrl:221055492-221055800 chrl:146551476- 146551764 chrl3:100642774-100643094 chrl4:85999532-86000478 chrl3:36049570-36050159 chr2:119606038-119606313 chrll:123065426-123066184 chr3:172167526-172167866 chr4:41882450-41882964 chr8:142528185- 142529029 chr9:79637814-79638169 chr3:19189688-19190100 chr4:122301567-122302290 chrl0:130339526-130339777 chr9:35846310-35846638 chrl5:53097551-53098476 chr2:157184389-157184632 chr5:145718289- 145720095 chrll:105481126-105481422 chr5:170741603-170742751 chr3:62355315-62355534 chrl:38219702-38220012 chr4:41881177-41881418 chrl3:112715359-112716234 chrl7:1880789-1881116 chrl8:56887091-56887665 chr6:10390038-10390565 chrll:69516931-69517218 chrl9:39737689-39739288 chr3:157812053-157812764 chrl4:37049333-37051726 chr7:156409023-156409294 chrll:46366876-46367101 chr5:50685453-50686148 chr4:41883492-41884570 chrl3:112709884-112712665 chr22:44287497-44288061 chr22:46440393-45441019 chr8:23562475-23565175 chr2:207506774-207507422 chr4:169799086-169799625 chr3:133393118-133393657 chr8:41424341-41425300 chr4:100870377-100871994 chr4:107956555-107957453 chrl7:79314962-79320653 chr2:30453566-30455655 chrl:18956895-18959829 chrl2:41086522-41087102 chr22:42685894-42686095 chr6:100914946-100915245 986 chrl:46951168-46951792

987 chr4:41749184-41749811

988 chr11:128419198-128419513

989 chr2:171671598-171671804

990 chr1:170630456-170630851

991 chr20:44657463-44659243

992 chr9:139096665-139096993

993 chr7:155174128-155175248

994 chr14:36993488-36994488

995 chr3:138654837-138655363

996 chr4:5709985-5710495

997 chr15:23157794-23158624

998 chr20:9496471-9496893

999 chr4:174437914-174438346

1000 chr5:140305712-140307193

1001 chr15:79576059-79576270

1002 chr14:38678245-38680937

1003 chr10:102473206-102474026

1004 chr17:59486727-59487132

1005 chr3:64253533-64253819

1006 chr10:102484200-102484476

1007 chr7:27198182-27198514

1008 chr2:97192977-97193383

1009 chr9:77113709-77113927

1010 chr6:154360586-154361008

1011 chr11:44324875-44325087

1012 chr2:182521221-182521927

1013 chr7:124404700-124406189

1014 chr2:132182327-132183101

1015 chr7:101005899-101007443

1016 chr7:149744402-149746469

1017 chr8:50822270-50822860

1018 chr7:27227520-27229043

1019 chr6:134212690-134213098

1020 chr13:36044844-36045481

1021 chr11:132934059-132934291

1022 chr16:51189800-51190260

1023 chr1:155145342-155145938

1024 chr4:682724-683079

1025 chr5:92939795-92940216

1026 chr10:134597357-134602649

1027 chr1:200009807-200010036

1028 chr19:12666243-12666682

1029 chr9:97401286-97402067

1030 chr2:107103833-107104053

1031 chr15:89910521-89912177

1032 chr5:140789094-140789762

1033 chr2:114033359-114033617

1034 chr17:12568667-12569335

1035 chr11:68622108-68622339

1036 chr1:160340604-160340843

1037 chr7:103085710-103086132

1038 chr15:76628998-76629207

1039 chr20:10198135-10198984

1040 chr20:44660342-44660948

1041 chr17:35290403-35290663

1042 chr17:933026-933236

1043 chr4:128544031-128544903

1044 chr1:50881884-50882103

1045 chr10:125425495-125426642

1046 chr17:46801784-46802071

1047 chr1:25255527-25259005

1048 chr3:32861141-32861429

1049 chr17:70116274-70119998

1050 chr10:75407413-75407706

1051 chr2:467849-468659

1052 chr11:132952538-132953307

1053 chr3:6904133-6904641

1054 chr10:120353692-120355821

1055 chr7:20830567-20830817

1056 chr11:71950815-71951408

1057 chr14:95240083-95240341

1058 chr19:5829048-5829474

1059 chr20:9495253-9495597

1060 chr9:112083333-112083549

1061 chr15:96873408-96877721

1062 chr16:67208067-67208678

1063 chr1:175568376-175568808

1064 chr6:5999149-5999787

1065 chr3:129693127-129694841

1066 chr6:10383525-10384114

1067 chr11:636435-636668

1068 chr1:181451311-181452049

1069 chr9:135464586-135466240

1070 chr15:60289325-60289533

1071 chr16:49309123-49309353

1072 chr1:243646394-243646888

1073 chr12:54071053-54071265

1074 chr1:91176404-91176701

1075 chr5:140864527-140864748

1076 chr4:47034427-47034940

1077 chr10:102489343-102491011

1078 chr10:102419147-102419668

1079 chr12:81471569-81472119

1080 chr6:50813314-50813699

1081 chr5:158526133-158526431

1082 chr1:119543821-119544339

1083 chr5:77140542-77140914

1084 chr8:23567180-23567678

1085 chr1:41831976-41832542

1086 chr2:139537692-139538650

1087 chr7:100075303-100075551

1088 chr2:176969217-176969895

1089 chr7:27284639-27286237

1090 chr5:31193952-31194419

1091 chr6:37616393-37616621

1092 chr19:1748167-1750243

1093 chr10:101281181-101282116

1094 chr21:31311386-31312106

1095 chr2:176973427-176973718

1096 chr15:96900142-96900644

1097 chr7:158936507-158938492

1098 chr3:63263989-63264205

1099 chr16:71459781-71460338

1100 chr7:155601175-155603235

1101 chr12:54447744-54448091

1102 chr12:53491572-53491955

1103 chr10:16561604-16563822

1104 chr11:133994709-133995090

1105 chr2:137522460-137523696

1106 chr17:12877270-12877773

1107 chr8:98289604-98290404

1108 chr4:185937242-185937750

1109 chr3:185911344-185912228

1110 chr12:54378696-54380102

1111 chr1:221060850-221061071

1112 chr12:63543636-63544967

1113 chr6:6006689-6007043

1114 chr19:51169659-51172023

1115 chr1:1474962-1475220

1116 chr14:54418677-54418881

1117 chr6:108497595-108497996

1118 chr17:37764092-37764304

1119 chr4:109092578-109092839

1120 chr1:91182097-91182364

1121 chr13:112760865-112761113

1122 chr12:122018170-122018457

1123 chr7:142494563-142495248

1124 chr13:58203586-58204322

1125 chr1:92945907-92952609

1126 chr12:106977388-106977713

1127 chr5:76925445-76926875

1128 chr16:3190765-3191389

1129 chr1:12123488-12124148

1130 chr17:48545570-48546900

1131 chr12:113916433-113916717

1132 chr4:41747508-41747944

1133 chr19:46916587-46916862

1134 chr15:49254984-49255564

1135 chr19:8674332-8674764

1136 chr2:223167205-223167560

1137 chr17:1173535-1174733

1138 chr3:75955759-75956308

1139 chr5:115697134-115697589

1140 chr8:21644908-21647845

1141 chr5:59189046-59189894

1142 chr12:54338761-54339168

1143 chr16:31053479-31053800

1144 chr1:50892437-50893243

1145 chr17:40935964-40936180

1146 chr19:44203558-44203987

1147 chr4:81109887-81110460

1148 chr1:2979275-2980758

1149 chr16:49872449-49872926

1150 chr1:200008392-200009047

1151 chr16:49316997-49317263

1152 chr2:114034594-114036041

1153 chr2:105480197-105480760

1154 chr18:44777632-44778084

1155 chr19:13213450-13213821

1156 chr17:6616422-6617471

1157 chr14:36977518-36977996

1158 chr1:214160798-214161034

1159 chr1:91182509-91182857

1160 chr10:130508443-130508658

1161 chr2:154728944-154729328

1162 chr15:89952271-89953061

1163 chr18:55102427-55102708

1164 chr22:31198491-31199033

1165 chr10:50821487-50821688

1166 chr7:100076454-100076785

1167 chr18:13641584-13642415

1168 chr18:13868532-13869026

1169 chr6:168841438-168841699

1170 chr1:61515875-61516831

1171 chr7:32110063-32110910

1172 chr7:56355508-56355798

1173 chr19:12767749-12767980

1174 chr19:19371675-19372393

1175 chr14:69256676-69257036

1176 chr17:75447477-75447821

1177 chr14:24801680-24802153

1178 chr5:148033472-148034080

1179 chr10:125650820-125651373

1180 chr11:43568921-43569854

1181 chr22:37212769-37213467

1182 chr2:162283581-162284677

1183 chr8:130995921-130996149

1184 chr11:70508328-70508617

1185 chr16:88943427-88943669

1186 chr19:42891311-42891646

1187 chr15:53079220-53079579

1188 chr17:46690390-46691055

1189 chr4:41880224-41880500

1190 chr1:156105707-156106171

1191 chr6:5997027-5997414

1192 chr1:18964180-18964401

1193 chr14:36983440-36983738

1194 chr12:54445876-54446113

1195 chr5:87968635-87968907

1196 chr1:29587087-29587412

1197 chr11:60718428-60718888

1198 chr2:66672431-66673636

1199 chr4:81119095-81119391

1200 chr10:76573195-76573507

1201 chr22:42322043-42322909

1202 chr19:45898879-45900315

1203 chr14:95826675-95826941

1204 chr17:48194634-48195085

1205 chr19:49669275-49669552

1206 chr15:96897596-96898046

1207 chr19:40314926-40315144

1208 chr9:120507227-120507642

1209 chr5:145722467-145722925

1210 chr3:19188246-19188772

1211 chr5:140787447-140788044

1212 chr19:50881418-50881664

1213 chr10:102896342-102896665

1214 chr7:53286851-53287192

1215 chr15:89903446-89903720

1216 chr10:23461300-23461610

1217 chr2:127783081-127783311

1218 chr11:72532612-72533774

1219 chr2:119605200-119605620

1220 chr18:12254147-12255089

1221 chr7:100817759-100817975

1222 chr14:77736733-77737772

1223 chr12:127212279-127212529

1224 chr2:119606569-119606826

1225 chr1:155264318-155265536

1226 chr12:131199824-131200157

1227 chr1:91300979-91301891

1228 chr6:100909210-100909444

1229 chr6:4079052-4079443

1230 chr2:233251361-233253414

1231 chr4:960505-960836

1232 chr19:21769189-21769786

1233 chr10:102279162-102279730

1234 chr12:127210778-127211651

1235 chr12:54069625-54070177

1236 chr15:53087211-53087488

1237 chr13:28365545-28365785

1238 chr12:113913615-113914322

1239 chr14:51338712-51339146

1240 chr7:155604725-155605095

1241 chr3:62364017-62364316

1242 chr6:6008857-6009299

1243 chr3:46618307-46618669

1244 chr17:33776553-33776888

1245 chr12:58158855-58160000

1246 chr2:219857682-219858917

1247 chr19:44278273-44278777

1248 chr10:101282725-101282934

1249 chr20:2539133-2539877

1250 chr12:58003880-58004249

1251 chr16:51147490-51147944

1252 chr1:179544720-179545307

1253 chr2:71787430-71787897

1254 chr10:129534410-129537366

1255 chr6:42145847-42146053

1256 chr14:24802927-24803159

1257 chr22:29707479-29707797

1258 chr9:132459587-132460017

1259 chr17:40937258-40937480

1260 chr4:151504011-151505085

1261 chr1:18967251-18968119

1262 chr19:56598038-56600296

1263 chr19:35633409-35633697

1264 chr2:171678546-171680358

1265 chr6:134638797-134639021

1266 chr1:36549554-36549965

1267 chr19:12833104-12833574

1268 chr3:137487429-137488021

1269 chr9:139715663-139716441

1270 chr6:37617863-37618147

1271 chr17:32484007-32484280

1272 chr7:156409577-156409865

1273 chr5:11384681-11385521

1274 chr8:102504478-102504841

1275 chr20:33296514-33298242

1276 chr20:57415135-57417153

1277 chr10:71331449-71331691

1278 chr3:75667777-75669067

1279 chr16:67571252-67572728

1280 chr19:36500169-36500530

1281 chr2:154729613-154729918

1282 chr12:48399168-48399372

1283 chr4:41867385-41867586

1284 chr17:46800533-46800746

1285 chr20:44685771-44687610

1286 chr19:10406934-10407342

1287 chr6:108496715-108497320

1288 chr5:158523906-158524598

1289 chr9:124413512-124414193

1290 chr20:57427691-57427995

1291 chr16:10912159-10912719

1292 chr7:149389654-149389976

1293 chr1:173638662-173639045

1294 chr19:55597977-55598887

1295 chr14:62279037-62279339

1296 chr3:13114627-13115245

1297 chr2:3750828-3751927

1298 chr4:85402764-85403175

1299 chr17:74017769-74018658

1300 chr5:54523676-54523901

1301 chr7:89747892-89749036

1302 chr18:72916107-72917233

1303 chr9:136294738-136295236

1304 chr1:201252452-201253648

1305 chr5:146888750-146889840

1306 chr14:52734207-52735486

1307 chr13:20875518-20876214

1308 chr18:77560088-77560292

1309 chr2:102803672-102804556

1310 chr2:176982107-176982402

1311 chr17:6679205-6679710

1312 chr19:10463626-10464378

1313 chr5:140810494-140812617

1314 chr11:46299544-46300216

1315 chr11:64136814-64138187

1316 chr6:6007387-6007797

1317 chr17:37321482-37322099

1318 chr10:94455524-94455896

1319 chr13:51417371-51418149

1320 chr8:11565217-11567212

1321 chr1:226127112-226127695

1322 chr2:3287874-3288228

1323 chr6:10882926-10883149

1324 chr22:19746155-19746369

1325 chr3:12838471-12838782

1326 chr9:36739534-36739782

1327 chr9:134429866-134430491

1328 chr11:70672834-70673055

1329 chr14:24641053-24642220

1330 chr7:27283408-27283614

1331 chr12:49182421-49182658

1332 chr1:44031286-44031853

1333 chr1:114696886-114697185

1334 chr15:89901914-89902785

1335 chr11:65352231-65353134

1336 chr7:72838383-72838815

1337 chr22:38379093-38379964

1338 chr4:155663809-155664315

1339 chr9:100619984-100620192

1340 chr7:143582125-143582610

1341 chr7:23287221-23287508

1342 chr11:64815040-64815722

1343 chr2:87088816-87089037

1344 chr20:57426729-57427047

1345 chr10:43428167-43429460

1346 chr10:121577529-121578385

1347 chr4:190939801-190940591

1348 chr6:100037323-100037544

1349 chr19:12880574-12880888

1350 chr2:171670110-171670549

1351 chr7:124404174-124404432

1352 chr7:97840559-97840845

1353 chr19:50879606-50880094

1354 chr1:113265573-113265787

1355 chr19:2424005-2427983

1356 chr3:127633993-127634588

1357 chr10:50817095-50817309

1358 chr2:171676552-171676980

1359 chr1:86621278-86622871

1360 chr1:164545540-164545917

1361 chr22:19967279-19967808

1362 chr11:67350928-67351953

1363 chr20:36226617-36226841

1364 chr19:14089570-14089796

1365 chr19:38700333-38700577

1366 chr1:18435566-18435904

1367 chr8:21905461-21905757

1368 chr2:176950595-176950846

1369 chr17:75251958-75252180

1370 chr15:37390175-37390380

1371 chr9:98113447-98113662

1372 chr1:40235767-40237190

1373 chr8:144811237-144811446

1374 chr8:99984584-99985072

1375 chr7:152621916-152622149

1376 chr1:40769186-40769871

1377 chr19:2428349-2428731

1378 chr17:15820620-15821325

1379 chr22:25081850-25082112

1380 chr1:19203874-19204234

1381 chr20:61703526-61704022

1382 chr2:237080188-237080432

1383 chr1:156338758-156339251

1384 chr1:149332993-149333389

1385 chr22:50496441-50497393

1386 chr7:27146069-27146600

1387 chr13:100547633-100548911

1388 chr4:190939007-190939274

1389 chr7:73894815-73895110

1390 chr19:35632356-35632572

1391 chr16:67918679-67918909

1392 chr2:108602824-108603467

1393 chr2:238864315-238865170

1394 chr8:144808221-144810978

1395 chr8:145101631-145101834

1396 chr12:132905449-132906206

1397 chr6:99275763-99276038

1398 chr5:140800760-140801072

1399 chr17:75242871-75243613

1400 chr17:41278134-41278460

1401 chr12:122016170-122017693

1402 chr10:131264948-131265710

1403 chr17:46631800-46632212

1404 chr14:105167277-105167501

1405 chr10:23982382-23982589

1406 chr19:50931270-50931638

1407 chr3:27771638-27771942

1408 chr18:74799144-74800038

1409 chr1:21616380-21617101

1410 chr1:147782066-147782473

1411 chr7:6590563-6590957

1412 chr7:97839862-97840222

1413 chr12:113914440-113914657

1414 chr19:7933263-7934898

1415 chr20:22559553-22560001

1416 chr15:53086629-53086858

1417 chr10:94180315-94180754

1418 chr5:140052059-140053381

1419 chr10:101287162-101287920

1420 chr14:38677154-38677787

1421 chr22:39262338-39263211

1422 chr18:74153239-74155073

1423 chr15:59157045-59157594

1424 chr4:963804-964115

1425 chr11:624780-625053

1426 chr7:1362811-1363643

1427 chr19:36246328-36247982

1428 chr5:54528095-54528404

1429 chr12:54359658-54359906

1430 chr2:127782613-127782829

1431 chr19:406131-405511

1432 chr17:46697413-45697701

1433 chr18:43608140-43608510

1434 chr16:23724270-23724775

1435 chr18:55922987-55924068

1436 chr15:60291879-60292167

1437 chr14:92788913-92789204

1438 chr19:1108394-1109610

1439 chr11:124628367-124629590

1440 chr1:32052471-32052771

1441 chr19:11594372-11594987

1442 chr19:870774-871318

1443 chr2:54086775-54087266

1444 chr2:241459632-241460047

1445 chr7:127990926-127992616

1446 chr1:208132327-208133117

1447 chr7:90893567-90896683

1448 chr1:41284847-41285149

1449 chr11:32452144-32452708

1450 chr5:77146998-77147785

1451 chr19:45901452-45901688

1452 chr7:6661875-6662695

1453 chr6:161188084-161188639

1454 chr17:934417-935088

1455 chr11:65409636-65410127

1456 chr17:19883325-19883610

1457 chr18:77549524-77550299

1458 chr1:38461584-38461988

1459 chr19:10464666-10464927

1460 chr17:70120139-70120442

1461 chr7:27147589-27148389

1462 chr2:31806545-31806782

1463 chr11:119292689-119292891

1464 chr19:18979351-18981200

1465 chr6:42879279-42879623

1466 chr12:130908777-130909191

1467 chr17:46629553-46629816

1468 chr1:202162958-202163390

1469 chr17:21367114-21367592

1470 chr16:84001805-84002011

1471 chr1:221057463-221057757

1472 chr17:27899511-27900067

1473 chr15:40268581-40269061

1474 chr22:37465056-37465331

1475 chr17:77805866-77809046

1476 chr19:13198699-13198999

1477 chr3:184056419-184056671

1478 chr22:37911979-37912258

1479 chr19:19368708-19369681

1480 chr11:64135815-64136381

1481 chr18:77552401-77552603

1482 chr19:58554354-58554587

1483 chr20:57414595-57414896

1484 chr4:190938106-190938848

1485 chr5:172110282-172111166

1486 chr16:68480864-68482822

1487 chr9:139395020-139395287

1488 chr12:113515164-113515970

1489 chr1:221054554-221054888

1490 chr8:144990270-145002135

1491 chr9:131154346-131155923

1492 chr6:150335525-150336278

1493 chr9:115824684-115825033

1494 chr12:54519768-54520457

1495 chr6:35479872-35480154

1496 chr19:3870788-3871043

1497 chr19:48965002-48965792

1498 chr6:35479388-35479678

1499 chr12:52408381-52408675

1500 chr1:221068782-221069159

1501 chr6:46655262-45556738

1502 chr3:55508335-55508708

1503 chr1:39980365-39981768

1504 chr16:3067521-3068358

1505 chr1:1473107-1473342

1506 chr10:105362549-105362827

1507 chr17:46698880-46699083

1508 chr2:198029068-198029438

1509 chr20:17209418-17209622

1510 chr12:49183049-49183282

1511 chr16:58030214-58031633

1512 chr10:94820026-94823252

1513 chr11:725596-726870

1514 chr6:170732119-170732442

1515 chr12:120835586-120835927

1516 chr20:36012595-36013439

1517 chr8:143545445-143546178

1518 chr6:27228100-27228364

1519 chr21:32624144-32624382

1520 chr9:95477296-95477708

1521 chr10:105420685-105421076

1522 chr1:1470604-1471450

1523 chr1:146552328-146552577

1524 chr19:33625467-33625805

1525 chr11:64478843-64479598

1526 chr20:57428308-57428516

1527 chr7:27182613-27185562

1528 chr19:51815157-51815458

1529 chr17:46607804-46608390

1530 chr12:52408860-52409121

1531 chr19:10405924-10406398

1532 chr11:14993452-14993661

1533 chr19:13135317-13136169

1534 chr7:750788-751237

1535 chr1:53742297-53742845

1536 chr1:200010625-200010832

1537 chr5:139138875-139139242

1538 chr17:45949676-45949885

1539 chr3:128722283-128723036

1540 chr15:89312719-89313183

1541 chr9:135039673-135039978

1542 chr19:12831793-12832225

1543 chr20:51589707-51590020

1544 chr20:3145121-3145746

1545 chr8:65710990-65711722

1546 chr11:128694084-128694688

1547 chr2:20870006-20871280

1548 chr19:18977466-18977833

1549 chr3:49947621-49948430

1550 chr6:30139718-30140263

1551 chr12:104697348-104697984

1552 chr10:105361784-105362188

1553 chr6:29894140-29895117

1554 chr4:187219320-187219745

1555 chr15:67073306-67073943

1556 chr2:220412341-220412678

1557 chr6:170730395-170730887

1558 chr9:115822071-115823416

1559 chr1:10764449-10764925

1560 chr17:46627787-46628444

1561 chr19:51601822-51602260

1562 chr19:55814067-55814278

1563 chr6:138745348-138745593

1564 chr9:124987743-124991086

1565 chr22:46318693-46319087

1566 chr16:3013016-3013228

1567 chr4:114900355-114900810

1568 chr19:1063544-1064265

1569 chr19:1110399-1110701

1570 chr7:97841636-97842005

1571 chr8:57359899-57360114

1572 chr17:72915558-72916510

1573 chr1:16860873-16862296

1574 chr17:75398284-75398527

1575 chr9:139397412-139397710

1576 chr6:33393592-33393908

1577 chr6:29595298-29595795

1578 chr12:6438272-6438931

1579 chr3:113160299-113160641

1580 chr1:55505060-55506015

1581 chr11:132951692-132952260

1582 chr4:81118137-81118603

1583 chr19:38876070-38876332

1584 chr19:58549305-58549712

1585 chr17:43472527-43474343

1586 chr9:139396205-139397040

1587 chr16:3192181-3192669

1588 chr6:33048416-33048814

1589 chr7:128555329-128556650

1590 chr19:46915311-46915802

1591 chr6:30095173-30095610

Table 2: Example CGIs

Table 3: Additional Example CGIs

Table 4: Additional Example CGIs