Title:
COMPOSITIONS AND METHODS FOR DETECTION OF GENOMIC VARIANCE AND DNA METHYLATION STATUS
Document Type and Number:
WIPO Patent Application WO/2018/195211
Kind Code:
A1
Abstract:
In one aspect, provided herein is an integrated method for simultaneous detection of a genomic variance and quantification of a DNA methylation state/status on one or more (e.g., hundreds of thousands of) targets, without splitting the limited materials between two different workflows. The present disclosure relates to compositions, kits, devices, and methods for conducting genetic and genomic analysis, for example, by polynucleotide sequencing. In particular aspects, provided herein are compositions, kits, and methods for constructing libraries for simultaneous detection of genomic variants and DNA methylation status on limited DNA inputs, such as circulating polynucleotide fragments in the body of a subject, including circulating tumor DNA.

Inventors:
LIU, Rui (50 Guanglan Rd, Ste. 12-101, Shanghai, 3, 201203, CN)
Application Number:
US2018/028185
Publication Date:
October 25, 2018
Filing Date:
April 18, 2018
Assignee:
SINGLERA GENOMICS, INC. (505 Coast Blvd S, Suite 307, La Jolla, CA, 92037, US)
International Classes:
C12N9/16; C12P19/34; C12Q1/6806; C12Q1/683; C12Q1/686
Domestic Patent References:
WO2016115530A12016-07-21
WO2016172442A12016-10-27
Foreign References:
US20080261217A12008-10-23
US20070092883A12007-04-26
EP3168309A12017-05-17
Attorney, Agent or Firm:
CHEN, Peng et al. (Rimon, P.C., One Embarcadero Center, Suite 40, San Francisco, CA, 94111, US)
Claims:
CLAIMS

1. A method for analyzing a first target polynucleotide sequence and a methylation status of a second target polynucleotide sequence in a sample, comprising:

1) contacting a sample comprising a polynucleotide with a methylation-sensitive restriction enzyme (MSRE), wherein the MSRE selectively cleaves the polynucleotide at a residue when it is unmethylated or selectively cleaves the polynucleotide at the residue when it is methylated;

2) subjecting the sample from step 1) to polynucleotide amplification, using a mixture of:

i) a first primer set for amplifying a first target polynucleotide sequence in the sample, and

ii) a second primer set for analyzing a methylation status of a second target polynucleotide sequence in the sample, wherein the methylation status is of a residue in the second target polynucleotide sequence, and one primer of the second primer set hybridizes to the uncleaved second target polynucleotide sequence and together with another primer in the set, amplifies the uncleaved sequence but not the second target polynucleotide sequence cleaved at the residue by the MSRE; and

3) sequencing polynucleotides amplified in step 2),

wherein the first target polynucleotide sequence is analyzed using sequencing reads from the amplified first target polynucleotide sequence, and the methylation status of the residue of the second target polynucleotide sequence is analyzed by comparing the observed number of sequencing reads (No) from the amplified second target polynucleotide sequence to a reference number.

2. The method of claim 1, wherein the MSRE cleaves the polynucleotide at a residue when it is unmethylated and does not cleave at the residue when it is methylated.

3. The method of claim 1 or 2, wherein the method comprises amplification and sequencing of a polynucleotide from a sample that is not contacted with the MSRE.

4. The method of any one of claims 1-3, wherein the MSRE is selected from the group consisting of HpaII, SalI, SalI-HF®, ScrFI, BbeI, NotI, SmaI, XmaI, MboI, BstBI, ClaI, MluI, NaeI, NarI, PvuI, SacII, HhaI, and any combination thereof.

5. The method of any one of claims 1-4, wherein the first target polynucleotide sequence comprises a genetic or epigenetic information, such as a mutation, a single nucleotide polymorphism (SNP), a copy number variation (CNV), a DNA modification such as DNA methylation, and/or a histone modification.

6. The method of claim 5, wherein the mutation comprises a point mutation, an insertion, a deletion, an indel, an inversion, a truncation, a fusion, a translocation, an amplification, or any combination thereof.

7. The method of claim 5 or 6, wherein the genetic or epigenetic information is associated with a condition or disease in a subject or a population, such as a cancer-related mutation.

8. The method of any one of claims 1-7, wherein the second target polynucleotide sequence comprises one or more CpG sites within the recognition site of the MSRE, wherein at each CpG site the cytosine (C) comprises a 5-methyl moiety or a 5-hydrogen moiety.

9. The method of claim 8, wherein the second target polynucleotide sequence comprises a regulatory sequence for a gene, such as a promoter region, an enhancer region, an insulator region, a silencer region, a 5'UTR region, a 3'UTR region, or a splice control region, and the one or more CpG sites are within the regulatory sequence.

10. The method of claim 9, wherein the gene is associated with a condition or disease in a subject or a population, such as a gene overexpressed, underexpressed, constitutively active, silenced, or ectopically expressed in a cancer or neoplasia.

11. The method of any one of claims 1-10, wherein the sample is a biological sample.

12. The method of claim 11, wherein the biological sample is from a subject having or suspected of having a disease or condition, such as a cancer or neoplasia.

13. The method of claim 11 or 12, wherein the biological sample is a sample comprising circulating tumor DNA (ctDNA), such as a blood, serum, plasma, or body fluid sample, or any combination thereof.

14. The method of any one of claims 1-13, wherein the polynucleotide in the sample is or comprises a double-stranded sequence.

15. The method of any one of claims 1-13, wherein the polynucleotide in the sample is or comprises a single-stranded sequence, and the method optionally comprises converting the single-stranded sequence to a double-stranded sequence based on sequence complementarity, for example, by primer extension.

16. The method of any one of claims 1-15, wherein the first and second target polynucleotide sequences are on the same molecule or on different molecules, for example, two different DNA fragments, in the sample.

17. The method of any one of claims 1-16, wherein the first and second target polynucleotide sequences are on the same gene, optionally wherein the first target polynucleotide sequence is in a coding region of the gene whereas the second target polynucleotide sequence is in a non-coding, regulatory region of the gene.

18. The method of any one of claims 1-16, wherein the first and second target polynucleotide sequences are on different genes, optionally wherein the genes function in the same biological pathway or network.

19. The method of any one of claims 1-18, wherein the first and second target polynucleotide sequences are on the same or different chromosomes, or on the same or different extrachromosomal DNA molecules (such as mitochondrial DNA), or one on a chromosome and the other on an extrachromosomal DNA molecule.

20. The method of any one of claims 1-19, wherein the amplification step comprises a polymerase chain reaction (PCR), reverse-transcription PCR amplification, allele-specific PCR (ASPCR), single-base extension (SBE), allele specific primer extension (ASPE), strand displacement amplification (SDA), transcription mediated amplification (TMA), ligase chain reaction (LCR), nucleic acid sequence based amplification (NASBA), primer extension, rolling circle amplification (RCA), self-sustained sequence replication (3SR), the use of Q Beta replicase, nick translation, or loop-mediated isothermal amplification (LAMP), or any combination thereof.

21. The method of claim 20, wherein the allele-specific PCR (ASPCR) is used to amplify the first target polynucleotide sequence, and the first set of primers comprise at least two allele-specific primers and a common primer, and optionally the ASPCR uses a DNA polymerase without a 3' to 5' exonuclease activity.

22. The method of claim 21, wherein at least one of the at least two allele-specific primers is specific for a cancer mutation.

23. The method of any one of claims 1-22, wherein the second set of primers comprise a common primer and at least two primers each for a different CpG site in the second target polynucleotide sequence.

24. The method of any one of claims 1-23, further comprising purifying polynucleotides from the sample in step 1), purifying polynucleotides from the sample in step 2), and/or purifying polynucleotides during the sequencing step 3).

25. The method of any one of claims 1-24, wherein the sequencing step comprises attaching a sequencing adapter and/or a sample-specific barcode to each polynucleotide.

26. The method of claim 25, wherein the attaching step is performed using a polymerase chain reaction (PCR).

27. The method of any one of claims 1-26, wherein the sequencing is a high-throughput sequencing, a digital sequencing, or a next-generation sequencing (NGS) such as Illumina (Solexa) sequencing, Roche 454 sequencing, Ion Torrent: Proton / PGM sequencing, and SOLiD sequencing.

28. The method of any one of claims 1-27, wherein the reference number is determined in parallel as the analysis of the first and second target polynucleotide sequences, as the expected number of sequencing reads (Ne) based on a control locus and/or a reference sample, with or without a control reaction using an isoschizomer of the MSRE that is methylation insensitive.

29. The method of claim 28, wherein the sample is a tumor sample and the reference sample is from a normal tissue adjacent to the tumor, and/or the methylation status at the residue in the second target polynucleotide sequence is a qualitative or quantitative readout, for example, as indicated by the methylation level mC = No/Ne.

30. The method of any one of claims 1-30, wherein the first primer set and/or the second primer set comprise one or more primers listed in Table 1 and/or Table 2, in any suitable combination.

31. The method of any one of claims 1-30, wherein the first primer set comprises one or more primers for a gene selected from the group consisting of ABCB1, CYP2C19, CYP2C8, CYP2D6, CYP3A4, CYP3A5, DPYD, GSTP1, MTHFR, NQO1, RHEB, SULT1A1, UGT1A1, MPL, JAK1, NRAS, DDR2, PTEN, FGFR2, HRAS, ATM, CBL, KRAS, ERBB3, CDK4, HNF1A, FLT3, RB1, AKT1, IDH2, CDH1, TP53, ERBB2, STAT3, SMAD4, STK11, GNA11, JAK3, PPP2R1A, RET, DNMT3A, ALK, NFE2L2, SF3B1, PIK3CA, ERBB4, GNAS, U2AF1, SLC19A1, SMARCB1, CHEK2, VHL, RAF1, CTNNB1, PDGFRA, KIT, KDR, FBXW7, APC, NEUROG1, CSF1R, NPM1, TPMT, EGFR, MET, SMO, BRAF, EZH2, FGFR1, JAK2, CDKN2A, PAX5, PTCH1, ABL1, NOTCH1, ARAF, MED12, BTK, and any combination thereof.

32. The method of claim 31, wherein the one or more primers comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 61-788, or any combination thereof.

33. The method of any one of claims 1-32, wherein the second primer set comprises one or more primers for a gene selected from the group consisting of NDRG4, SEPT, MLH1, WTN5A, AGTR1, BMP3, SFRP2, NEUROG1, TFPI2, SDC2, and any combination thereof.

34. The method of claim 33, wherein the one or more primers comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 1-60, or any combination thereof.

35. The method of any one of claims 1-34, wherein the amplification is multiplexed.

36. The method of any one of claims 1-35, wherein the analysis of the first target polynucleotide sequence and the analysis of the methylation status of the second target polynucleotide sequence are conducted simultaneously in a single reaction.

37. The method of any one of claims 1-36, wherein the polynucleotide concentration in the sample is less than about 0.1 ng/mL, less than about 1 ng/mL, less than about 3 ng/mL, less than about 5 ng/mL, less than about 10 ng/mL, less than about 20 ng/mL, or less than about 100 ng/mL.

38. The method of any one of claims 1-37, which is used for the diagnosis and/or prognosis of a disease or condition in a subject, predicting the responsiveness of a subject to a treatment, identifying a pharmacogenetics marker for the disease/condition or treatment, and/or screening a population for a genetic information.

39. The method of claim 38, wherein the disease or condition is a cancer or neoplasia, and the treatment is a cancer or neoplasia treatment.

40. A kit, comprising:

a methylation-sensitive restriction enzyme (MSRE), wherein the MSRE selectively cleaves at a residue when it is unmethylated or selectively cleaves at the residue when it is methylated;

a first primer set for amplifying a first target polynucleotide sequence in a sample; and/or

a second primer set for analyzing a methylation status of a second target polynucleotide sequence in the sample, wherein the methylation status is of a residue in the second target polynucleotide sequence, and one primer of the second primer set hybridizes to the uncleaved second target polynucleotide sequence and together with another primer in the set, amplifies the uncleaved sequence but not the second target polynucleotide sequence cleaved at the residue by the MSRE.

41. The kit of claim 40, wherein the MSRE is selected from the group consisting of HpaII, SalI, SalI-HF®, ScrFI, BbeI, NotI, SmaI, XmaI, MboI, BstBI, ClaI, MluI, NaeI, NarI, PvuI, SacII, HhaI, and any combination thereof.

42. The kit of claim 40 or 41, wherein the first set of primers comprise at least two allele-specific primers and a common primer, and optionally a DNA polymerase without a 3' to 5' exonuclease activity.

43. The kit of any one of claims 40-42, wherein the second set of primers comprise a common primer and at least two primers each for a different CpG site in the second target polynucleotide sequence.

44. The kit of any one of claims 40-43, further comprising an agent for purifying polynucleotides from a sample.

45. The kit of any one of claims 40-44, further comprising an agent for sequencing, such as a sequencing adapter and/or a sample-specific barcode.

46. The kit of any one of claims 40-45, wherein the first and second sets of primers are mixed.

47. The kit of any one of claims 40-45, wherein the first and second sets of primers are in separate vials and the kit further comprises instruction to mix all or a subset of the primers, and/or wherein the first primer set and/or the second primer set comprise one or more primers listed in Table 1 and/or Table 2, in any suitable combination.

48. The kit of any one of claims 40-47, wherein the first primer set comprises one or more primers for a gene selected from the group consisting of ABCB1, CYP2C19, CYP2C8, CYP2D6, CYP3A4, CYP3A5, DPYD, GSTP1, MTHFR, NQO1, RHEB, SULT1A1, UGT1A1, MPL, JAK1, NRAS, DDR2, PTEN, FGFR2, HRAS, ATM, CBL, KRAS, ERBB3, CDK4, HNF1A, FLT3, RB1, AKT1, IDH2, CDH1, TP53, ERBB2, STAT3, SMAD4, STK11, GNA11, JAK3, PPP2R1A, RET, DNMT3A, ALK, NFE2L2, SF3B1, PIK3CA, ERBB4, GNAS, U2AF1, SLC19A1, SMARCB1, CHEK2, VHL, RAF1, CTNNB1, PDGFRA, KIT, KDR, FBXW7, APC, NEUROG1, CSF1R, NPM1, TPMT, EGFR, MET, SMO, BRAF, EZH2, FGFR1, JAK2, CDKN2A, PAX5, PTCH1, ABL1, NOTCH1, ARAF, MED12, BTK, and any combination thereof.

49. The kit of claim 48, wherein the one or more primers comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 61-788, or any combination thereof.

50. The kit of any one of claims 40-49, wherein the second primer set comprises one or more primers for a gene selected from the group consisting of NDRG4, SEPT, MLH1, WTN5A, AGTR1, BMP3, SFRP2, NEUROG1, TFPI2, SDC2, and any combination thereof.

51. The kit of claim 50, wherein the one or more primers comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 1-60, or any combination thereof.

52. The kit of any one of claims 40-51, further comprising instructions for comparing an observed number of sequencing reads to a reference number.

53. The kit of claim 52, further comprising a reference sample and/or information of a control locus.

54. The kit of any one of claims 40-53, further comprising separate vials for one or more components and/or instructions for using the kit.

Description:
COMPOSITIONS AND METHODS FOR DETECTION OF GENOMIC VARIANCE AND DNA METHYLATION STATUS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of priority to U.S. Provisional Application Serial No. 62/487,422, filed on April 19, 2017, the content of which is incorporated by reference in its entirety for all purposes. In some aspects, the present disclosure relates to U.S. Provisional Application Serial No. 62/487,423, filed on April 19, 2017, and U.S. Provisional Application Serial No. 62/657,544, filed April 13, 2018, the contents of both of which are incorporated by reference in their entireties for all purposes.

TECHNICAL FIELD

[0002] The present disclosure relates to compositions, kits, devices, and methods for conducting genetic and genomic analysis, for example, by polynucleotide sequencing. In particular aspects, provided herein are compositions, kits, and methods for constructing libraries for simultaneous detection of genomic variants and DNA methylation status on limited DNA inputs, such as circulating polynucleotide fragments in the body of a subject, including circulating tumor DNA.

BACKGROUND

[0003] In the following discussion, certain articles and methods are described for background and introductory purposes. Nothing contained herein is to be construed as an "admission" of prior art. Applicant expressly reserves the right to demonstrate, where appropriate, that the articles and methods referenced herein do not constitute prior art under the applicable statutory provisions.

[0004] Mammalian (including human) cells typically have DNA methylation at CpG di-nucleotides. The status of CpG methylation in general can be determined with at least four mechanisms: (i) sodium bisulfite treatment to convert the modification status into different genetic codes; (ii) affinity enrichment by antibodies or methyl-CpG binding proteins; (iii) digestion by methyl-sensitive restriction enzymes; and (iv) direct sequencing by nanopores or PacBio polymerase real-time monitoring. Depending on the number of targets per assay, the methylation information can be read out by gel electrophoresis, real-time quantitative PCR, Sanger sequencing, microarray, second-generation sequencing, or mass spectrometry. Notably, while genome-wide measurements provide very rich information for discovery purposes, many clinical assays focus on a limited number of the most informative and reliable markers, and use PCR, hybridization-based enrichment, or padlock capture to enrich assay targets specifically. Laird (2010), "Principles and challenges of genome-wide DNA methylation analysis," Nat Rev Genet 11: 191-203; and Plongthongkum et al. (2014), "Advances in the profiling of DNA modifications: cytosine methylation and beyond," Nat Rev Genet 15: 647-661. In general, bisulfite-based methods provide absolute quantification at single-base resolution, both of which are highly desirable features. Yet the chemical treatment is harsh and tends to lead to material losses, which can compromise the assay sensitivity on low-input samples.

[0005] Methods for detecting and quantifying germ line or somatic genetic variants have evolved over the past three decades. While Sanger sequencing and real-time quantitative PCR based methods have been routinely implemented in clinical labs, several targeted sequencing methods based on next-generation sequencing have started to be implemented as clinical tests. Rehm (2013), "Disease-targeted sequencing: a cornerstone in the clinic," Nat Rev Genet 14: 295-300. These tests typically use hybridization capture methods, multiplexed PCR, or circularization capture using padlock probes or selectors. These methods differ in scalability, uniformity, library conversion efficiency, and assay cost.

[0006] Many clinical samples contain limited amounts of DNA molecules, which can often be degraded or fragmented. For multiple diagnostic purposes, it will be beneficial to obtain multiple layers of information for making accurate and specific predictions of disease status or disease types. There is a growing need for assays that can efficiently read out both genomic and epigenetic information from very limited amounts of DNA material, and that can be easily deployed and robustly implemented in clinical laboratories. The present disclosure addresses this and other related needs.

BRIEF SUMMARY

[0007] The summary is not intended to be used to limit the scope of the claimed subject matter. Other features, details, utilities, and advantages of the claimed subject matter will be apparent from the detailed description including those aspects disclosed in the accompanying drawings and in the appended claims.

[0008] In one aspect, provided herein is a method for analyzing a first target polynucleotide sequence and a methylation status of a second target polynucleotide sequence in a sample, comprising contacting a sample containing or suspected of containing a polynucleotide with a methylation-sensitive restriction enzyme (MSRE). In one aspect, the MSRE selectively cleaves the polynucleotide at a residue when it is unmethylated or selectively cleaves the polynucleotide at the residue when it is methylated.

[0009] In another aspect, the method comprises subjecting an MSRE-treated sample to polynucleotide amplification, using a mixture of: i) a first primer set for amplifying a first target polynucleotide sequence in the sample, and ii) a second primer set for analyzing a methylation status of a second target polynucleotide sequence in the sample.

[0010] In any of the preceding embodiments, the methylation status can be of a residue in the second target polynucleotide sequence, and one primer of the second primer set can hybridize to the uncleaved second target polynucleotide sequence and together with another primer in the set, can amplify the uncleaved sequence but not the second target polynucleotide sequence cleaved at the residue by the MSRE.

[0011] In any of the preceding embodiments, the method can further comprise sequencing the amplified polynucleotides.

[0012] In any of the preceding embodiments, the first target polynucleotide sequence can be analyzed using sequencing reads from the amplified first target polynucleotide sequence.

[0013] In any of the preceding embodiments, the methylation status of the residue of the second target polynucleotide sequence can be analyzed by comparing the observed number of sequencing reads (No) from the amplified second target polynucleotide sequence to a reference number.

[0014] In yet another aspect, provided herein is a method for analyzing a first target polynucleotide sequence and a methylation status of a second target polynucleotide sequence in a sample. In one embodiment, the method comprises: (1) contacting a sample comprising a polynucleotide with a methylation-sensitive restriction enzyme (MSRE), and the MSRE selectively cleaves the polynucleotide at a residue when it is unmethylated or selectively cleaves the polynucleotide at the residue when it is methylated; (2) subjecting the sample from step (1) to polynucleotide amplification, using a mixture of: i) a first primer set for amplifying a first target polynucleotide sequence in the sample, and ii) a second primer set for analyzing a methylation status of a second target polynucleotide sequence in the sample, and the methylation status is of a residue in the second target polynucleotide sequence, and one primer of the second primer set hybridizes to the uncleaved second target polynucleotide sequence and together with another primer in the set, amplifies the uncleaved sequence but not the second target polynucleotide sequence cleaved at the residue by the MSRE; and (3) sequencing polynucleotides amplified in step (2), and the first target polynucleotide sequence is analyzed using sequencing reads from the amplified first target polynucleotide sequence, and the methylation status of the residue of the second target polynucleotide sequence is analyzed by comparing the observed number of sequencing reads (No) from the amplified second target polynucleotide sequence to a reference number.
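As an illustration of step (1) and the primer behavior in step (2), the following minimal Python sketch checks in silico whether an amplicon would remain intact after digestion. It assumes an MSRE that cuts only unmethylated sites and uses the HpaII recognition sequence CCGG as the example; the function names and the representation of methylation as a set of site positions are hypothetical and not part of the disclosure.

```python
# Illustrative sketch only: helper names and data layout are assumptions.
# HpaII recognizes CCGG and is blocked when the internal CpG is methylated.
HPAII_SITE = "CCGG"


def find_sites(amplicon: str, site: str = HPAII_SITE) -> list[int]:
    """Return 0-based start positions of every occurrence of `site`."""
    amplicon = amplicon.upper()
    positions = []
    start = amplicon.find(site)
    while start != -1:
        positions.append(start)
        start = amplicon.find(site, start + 1)
    return positions


def survives_digestion(amplicon: str, methylated_sites: set[int],
                       site: str = HPAII_SITE) -> bool:
    """True if no recognition site in the amplicon is left unmethylated.

    An amplicon with no recognition site always survives, which mirrors a
    first-target (variant) amplicon that the enzyme does not touch.
    """
    return all(pos in methylated_sites for pos in find_sites(amplicon, site))
```

Under these assumptions, only template molecules for which `survives_digestion` is true can be spanned by the second primer set and contribute sequencing reads.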

[0015] In any of the preceding embodiments, the MSRE can cleave the polynucleotide at a residue when it is unmethylated and not cleave at the residue when it is methylated.

[0016] In any of the preceding embodiments, the method can further comprise amplification and sequencing of a polynucleotide from a sample that is not contacted with the MSRE.

[0017] In any of the preceding embodiments, the MSRE can be selected from the group consisting of HpaII, SalI, SalI-HF®, ScrFI, BbeI, NotI, SmaI, XmaI, MboI, BstBI, ClaI, MluI, NaeI, NarI, PvuI, SacII, HhaI, and any combination thereof.

[0018] In any of the preceding embodiments, the first target polynucleotide sequence can comprise a genetic or epigenetic information, such as a mutation, a single nucleotide polymorphism (SNP), a copy number variation (CNV), a DNA modification such as DNA methylation, and/or a histone modification. In one embodiment, the mutation comprises a point mutation, an insertion, a deletion, an indel, an inversion, a truncation, a fusion, a translocation, an amplification, or any combination thereof. In any of the preceding embodiments, the genetic or epigenetic information can be associated with a condition or disease in a subject or a population, such as a cancer-related mutation.

[0019] In any of the preceding embodiments, the second target polynucleotide sequence can comprise one or more CpG sites within the recognition site of the MSRE. In one embodiment, at each CpG site the cytosine (C) comprises a 5-methyl moiety or a 5-hydrogen moiety.

[0020] In any of the preceding embodiments, the second target polynucleotide sequence can comprise a regulatory sequence for a gene, such as a promoter region, an enhancer region, an insulator region, a silencer region, a 5'UTR region, a 3'UTR region, or a splice control region, and one or more CpG sites are located within the regulatory sequence. In one aspect, the gene is associated with a condition or disease in a subject or a population, such as a gene overexpressed, underexpressed, constitutively active, silenced, or ectopically expressed in a cancer or neoplasia.

[0021] In any of the preceding embodiments, the sample can be a biological sample. In one aspect, the biological sample is from a subject having or suspected of having a disease or condition, such as a cancer or neoplasia.

[0022] In any of the preceding embodiments, the sample can comprise circulating tumor DNA (ctDNA), such as a blood, serum, plasma, or body fluid sample, or any combination thereof.

[0023] In any of the preceding embodiments, the polynucleotide in the sample can be or comprise a double-stranded sequence.

[0024] In any of the preceding embodiments, the polynucleotide in the sample can be or comprise a single-stranded sequence.

[0025] In any of the preceding embodiments, the method can comprise converting the single-stranded sequence to a double-stranded sequence based on sequence complementarity, for example, by primer extension.

[0026] In any of the preceding embodiments, the first and second target polynucleotide sequences can be on the same molecule or on different molecules, for example, two different DNA fragments, in the sample.

[0027] In any of the preceding embodiments, the first and second target polynucleotide sequences can be on the same gene.

[0028] In any of the preceding embodiments, the first target polynucleotide sequence can be in a coding region of a gene whereas the second target polynucleotide sequence can be in a non-coding and/or regulatory region of or for the same gene.

[0029] In any of the preceding embodiments, the first and second target polynucleotide sequences can be on different genes. In one aspect, the genes function in the same biological pathway or network.

[0030] In any of the preceding embodiments, the first and second target polynucleotide sequences can be on the same or different chromosomes, or on the same or different extrachromosomal DNA molecules (such as mitochondrial DNA), or one on a chromosome and the other on an extrachromosomal DNA molecule.

[0031] In any of the preceding embodiments, the amplification step can comprise a polymerase chain reaction (PCR), reverse-transcription PCR amplification, allele-specific PCR (ASPCR), single-base extension (SBE), allele specific primer extension (ASPE), strand displacement amplification (SDA), transcription mediated amplification (TMA), ligase chain reaction (LCR), nucleic acid sequence based amplification (NASBA), primer extension, rolling circle amplification (RCA), self-sustained sequence replication (3SR), the use of Q Beta replicase, nick translation, or loop-mediated isothermal amplification (LAMP), or any combination thereof.

[0032] In any of the preceding embodiments, allele-specific PCR (ASPCR) can be used to amplify the first target polynucleotide sequence, and the first set of primers comprise at least two allele-specific primers and a common primer. In one aspect, the ASPCR uses a DNA polymerase without a 3' to 5' exonuclease activity. In another aspect, at least one of the at least two allele-specific primers is specific for a cancer mutation.

[0033] In any of the preceding embodiments, the second set of primers can comprise a common primer and at least two primers each for a different CpG site in the second target polynucleotide sequence.

[0034] In any of the preceding embodiments, the method can further comprise purifying polynucleotides from an MSRE-treated sample, purifying polynucleotides from the sample from the amplification step, and/or purifying polynucleotides before, during, and/or after the sequencing step.

[0035] In any of the preceding embodiments, the sequencing step can comprise attaching a sequencing adapter and/or a sample-specific barcode to each polynucleotide. In one aspect, the attaching step is performed using a polymerase chain reaction (PCR).
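To show how a sample-specific barcode can be used downstream, the sketch below assigns pooled reads back to their samples. It is a simplified illustration that assumes the barcode sits at the start of each read and must match exactly; the function name, the barcode layout, and the example sequences are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len):
    """Group reads by sample using an exact-match barcode at the read start.

    Assumption for illustration: each read begins with a fixed-length
    sample barcode. Reads whose barcode is not recognized are dropped.
    """
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode)
        if sample is not None:
            by_sample[sample].append(insert)
    return dict(by_sample)
```

For example, with barcodes `{"ACGT": "s1", "TTAG": "s2"}`, a read `ACGTTTGA` would be assigned to `s1` with insert `TTGA`. Per-sample, per-amplicon read counts obtained this way are the raw material for the read-count comparison described in this summary.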

[0036] In any of the preceding embodiments, the sequencing can be a high-throughput sequencing, a digital sequencing, or a next-generation sequencing (NGS) such as Illumina (Solexa) sequencing, Roche 454 sequencing, Ion Torrent: Proton / PGM sequencing, and SOLiD sequencing.

[0037] In any of the preceding embodiments, the reference number can be predetermined (for example, based on literature) or determined in parallel as the analysis of the first and second target polynucleotide sequences. In one aspect, the reference number is an expected number of sequencing reads (Ne) based on a control locus and/or a reference sample, with or without a control reaction using an isoschizomer of the MSRE that is methylation insensitive.
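One way the expected read number could be derived from a control locus is sketched below. This is an assumed normalization, not the procedure of the disclosure: it supposes that an undigested aliquot of the sample was also sequenced, and that the control locus lacks the MSRE recognition site so its read count changes between the two libraries only through sequencing depth. All names are hypothetical.

```python
def expected_reads(target_reads_undigested: int,
                   control_reads_undigested: int,
                   control_reads_digested: int) -> float:
    """Estimate the reads expected at a target locus if nothing were cleaved.

    The control locus is assumed to be unaffected by the enzyme, so the
    ratio of its read counts captures the depth difference between the
    digested and undigested libraries.
    """
    if control_reads_undigested <= 0:
        raise ValueError("control locus needs reads in the undigested library")
    depth_scale = control_reads_digested / control_reads_undigested
    return target_reads_undigested * depth_scale
```

For instance, 400 target reads in the undigested library, with the control locus going from 1000 to 2000 reads, would give an expected 800 target reads in the digested library.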

[0038] In any of the preceding embodiments, the sample can be a tumor sample and the reference sample can be from a normal tissue adjacent to the tumor.

[0039] In any of the preceding embodiments, the methylation status at the residue in the second target polynucleotide sequence can be a qualitative or quantitative readout, for example, as indicated by the methylation level mC = No/Ne.
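The ratio of observed to expected reads can be turned into both readouts with a few lines of Python. The sketch assumes an MSRE that cleaves only unmethylated sites, so the surviving read fraction tracks the methylated fraction. The clamping to the range 0 to 1 and the 0.5 cutoff for the qualitative call are illustrative assumptions, not values from the disclosure.

```python
def methylation_level(n_observed: float, n_expected: float) -> float:
    """Quantitative readout: observed reads divided by expected reads.

    The result is clamped to [0, 1] because sampling noise can push the
    raw ratio slightly above 1.
    """
    if n_expected <= 0:
        raise ValueError("expected read number must be positive")
    return min(1.0, max(0.0, n_observed / n_expected))


def call_status(level: float, threshold: float = 0.5) -> str:
    """Qualitative readout using a hypothetical cutoff."""
    return "methylated" if level >= threshold else "unmethylated"
```

With 200 observed reads against 800 expected, the level is 0.25, which this hypothetical cutoff would report as unmethylated.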

[0040] In any of the preceding embodiments, the first primer set and/or the second primer set can comprise one or more primers listed in Table 1 and/or Table 2, in any suitable combination.

[0041] In any of the preceding embodiments, the first primer set can comprise one or more primers for a gene selected from the group consisting of ABCB1, CYP2C19, CYP2C8, CYP2D6, CYP3A4, CYP3A5, DPYD, GSTP1, MTHFR, NQO1, RHEB, SULT1A1, UGT1A1, MPL, JAK1, NRAS, DDR2, PTEN, FGFR2, HRAS, ATM, CBL, KRAS, ERBB3, CDK4, HNF1A, FLT3, RB1, AKT1, IDH2, CDH1, TP53, ERBB2, STAT3, SMAD4, STK11, GNA11, JAK3, PPP2R1A, RET, DNMT3A, ALK, NFE2L2, SF3B1, PIK3CA, ERBB4, GNAS, U2AF1, SLC19A1, SMARCB1, CHEK2, VHL, RAF1, CTNNB1, PDGFRA, KIT, KDR, FBXW7, APC, NEUROG1, CSF1R, NPM1, TPMT, EGFR, MET, SMO, BRAF, EZH2, FGFR1, JAK2, CDKN2A, PAX5, PTCH1, ABL1, NOTCH1, ARAF, MED12, BTK, and any combination thereof.

[0042] In any of the preceding embodiments, the one or more primers from the first primer set can comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 61-788, or any combination thereof.

[0043] In any of the preceding embodiments, the second primer set can comprise one or more primers for a gene selected from the group consisting of NDRG4, SEPT, MLH1, WNT5A, AGTR1, BMP3, SFRP2, NEUROG1, TFPI2, SDC2, and any combination thereof.

[0044] In any of the preceding embodiments, the one or more primers from the second primer set can comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 1-60, or any combination thereof.

[0045] In any of the preceding embodiments, the amplification can be multiplexed.

[0046] In any of the preceding embodiments, the analysis of the first target polynucleotide sequence and the analysis of the methylation status of the second target polynucleotide sequence can be conducted simultaneously in a single reaction.

[0047] In any of the preceding embodiments, the polynucleotide concentration in the sample can be less than about 0.1 ng/mL, less than about 1 ng/mL, less than about 3 ng/mL, less than about 5 ng/mL, less than about 10 ng/mL, less than about 20 ng/mL, or less than about 100 ng/mL.

[0048] In any of the preceding embodiments, the method can be used for the diagnosis and/or prognosis of a disease or condition in a subject, predicting the responsiveness of a subject to a treatment, identifying a pharmacogenetics marker for the disease/condition or treatment, and/or screening a population for genetic information. In one aspect, the disease or condition is a cancer or neoplasia, and the treatment is a cancer or neoplasia treatment.

[0049] In another aspect, disclosed herein is a kit, comprising: a methylation-sensitive restriction enzyme (MSRE), and the MSRE selectively cleaves at a residue when it is unmethylated or selectively cleaves at the residue when it is methylated; a first primer set for amplifying a first target polynucleotide sequence in a sample; and/or a second primer set for analyzing a methylation status of a second target polynucleotide sequence in the sample, and the methylation status is of a residue in the second target polynucleotide sequence, and one primer of the second primer set hybridizes to the uncleaved second target polynucleotide sequence and, together with another primer in the set, amplifies the uncleaved sequence but not the second target polynucleotide sequence cleaved at the residue by the MSRE. In one embodiment, the MSRE is selected from the group consisting of HpaII, SalI, SalI-HF®, ScrFI, BbeI, NotI, SmaI, XmaI, MboI, BstBI, ClaI, MluI, NaeI, NarI, PvuI, SacII, HhaI, and any combination thereof.

[0050] In any of the preceding embodiments, the first set of primers can comprise at least two allele-specific primers and a common primer.

[0051] In any of the preceding embodiments, the kit can comprise a DNA polymerase without a 3' to 5' exonuclease activity.

[0052] In any of the preceding embodiments, the second set of primers of the kit can comprise a common primer and at least two primers each for a different CpG site in the second target polynucleotide sequence.

[0053] In any of the preceding embodiments, the kit can further comprise an agent for purifying polynucleotides from a sample.

[0054] In any of the preceding embodiments, the kit can further comprise an agent for sequencing, such as a sequencing adapter and/or a sample-specific barcode.

[0055] In any of the preceding embodiments, the first and second sets of primers can be mixed, for example, in one vial within the kit, or the first and second sets of primers can be in separate vials and the kit can further comprise an instruction to mix all or a subset of the primers.

[0056] In any of the preceding embodiments, the first primer set and/or the second primer set of the kit can comprise one or more primers listed in Table 1 and/or Table 2, in any suitable combination.

[0057] In any of the preceding embodiments, the first primer set of the kit can comprise one or more primers for a gene selected from the group consisting of ABCB1, CYP2C19, CYP2C8, CYP2D6, CYP3A4, CYP3A5, DPYD, GSTP1, MTHFR, NQO1, RHEB, SULT1A1, UGT1A1, MPL, JAK1, NRAS, DDR2, PTEN, FGFR2, HRAS, ATM, CBL, KRAS, ERBB3, CDK4, HNF1A, FLT3, RB1, AKT1, IDH2, CDH1, TP53, ERBB2, STAT3, SMAD4, STK11, GNA11, JAK3, PPP2R1A, RET, DNMT3A, ALK, NFE2L2, SF3B1, PIK3CA, ERBB4, GNAS, U2AF1, SLC19A1, SMARCB1, CHEK2, VHL, RAF1, CTNNB1, PDGFRA, KIT, KDR, FBXW7, APC, NEUROG1, CSF1R, NPM1, TPMT, EGFR, MET, SMO, BRAF, EZH2, FGFR1, JAK2, CDKN2A, PAX5, PTCH1, ABL1, NOTCH1, ARAF, MED12, BTK, and any combination thereof.

[0058] In any of the preceding embodiments, the first primer set of the kit can comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 61-788, or any combination thereof.

[0059] In any of the preceding embodiments, the second primer set of the kit can comprise one or more primers for a gene selected from the group consisting of NDRG4, SEPT, MLH1, WNT5A, AGTR1, BMP3, SFRP2, NEUROG1, TFPI2, SDC2, and any combination thereof.

[0060] In any of the preceding embodiments, the second primer set of the kit can comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 1-60, or any combination thereof.

[0061] In any of the preceding embodiments, the kit can further comprise an instruction of comparing an observed number of sequencing reads to a reference number. In one embodiment, the kit further comprises a reference sample and/or information of a control locus.

[0062] In any of the preceding embodiments, the kit can further comprise separate vials for one or more components and/or instructions for using the kit.

BRIEF DESCRIPTION OF THE DRAWINGS

[0063] FIG. 1 is an overview of the MSA-Seq (methylation specific amplification sequencing) method, according to one aspect of the present disclosure.

[0064] FIG. 2 shows validation of analytical performance with synthetic DNA mixtures (1%, 5%, 10%, 20%, 50%) of fragmented genomic DNA from the cancer cell line HCT116, which is methylated at the 24 CpG sites, with genomic DNA from NA12878 that is unmethylated at all these sites. MSA-Seq was performed on these mixtures in triplicate.

[0065] FIG. 3 shows MSMC-Seq quantified CpG methylation for tumor clustering. MSMC stands for Multiple Sequentially Markovian Coalescent, a method for clustering multiple genome sequences, and in this instance, MSMC performs unbiased hierarchical clustering of tumor subgroups based on quantified CpG methylation.

DETAILED DESCRIPTION

[0066] Numerous specific details are set forth in the following description in order to provide a thorough understanding of the present disclosure. These details are provided for the purpose of example and the claimed subject matter may be practiced according to the claims without some or all of these specific details. It is to be understood that other embodiments can be used and structural changes can be made without departing from the scope of the claimed subject matter. It should be understood that the various features and functionality described in one or more of the individual embodiments are not limited in their applicability to the particular embodiment with which they are described. They instead can be applied, alone or in some combination, to one or more of the other embodiments of the disclosure, whether or not such embodiments are described, and whether or not such features are presented as being a part of a described embodiment. For the purpose of clarity, technical material that is known in the technical fields related to the claimed subject matter has not been described in detail so that the claimed subject matter is not unnecessarily obscured.

[0067] All publications, including patent documents, scientific articles and databases, referred to in this application are incorporated by reference in their entireties for all purposes to the same extent as if each individual publication were individually incorporated by reference. Citation of the publications or documents is not intended as an admission that any of them is pertinent prior art, nor does it constitute any admission as to the contents or date of these publications or documents.

[0068] All headings are for the convenience of the reader and should not be used to limit the meaning of the text that follows the heading, unless so specified.

[0069] The practice of the provided embodiments will employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and sequencing technology, which are within the skill of those who practice in the art. Such conventional techniques include polypeptide and protein synthesis and modification, polynucleotide synthesis and modification, polymer array synthesis, hybridization and ligation of polynucleotides, detection of hybridization, and nucleotide sequencing. Specific illustrations of suitable techniques can be had by reference to the examples herein. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Green, et al., Eds., Genome Analysis: A Laboratory Manual Series (Vols. I-IV) (1999); Weiner, Gabriel, Stephens, Eds., Genetic Variation: A Laboratory Manual (2007); Dieffenbach, Dveksler, Eds., PCR Primer: A Laboratory Manual (2003); Bowtell and Sambrook, DNA Microarrays: A Molecular Cloning Manual (2003); Mount, Bioinformatics: Sequence and Genome Analysis (2004); Sambrook and Russell, Condensed Protocols from Molecular Cloning: A Laboratory Manual (2006); and Sambrook and Russell, Molecular Cloning: A Laboratory Manual (2002) (all from Cold Spring Harbor Laboratory Press); Ausubel et al. eds., Current Protocols in Molecular Biology (1987); T. Brown ed., Essential Molecular Biology (1991), IRL Press; Goeddel ed., Gene Expression Technology (1991), Academic Press; A. Bothwell et al. eds., Methods for Cloning and Analysis of Eukaryotic Genes (1990), Bartlett Publ.; M. Kriegler, Gene Transfer and Expression (1990), Stockton Press; R. Wu et al. eds., Recombinant DNA Methodology (1989), Academic Press; M. McPherson et al., PCR: A Practical Approach (1991), IRL Press at Oxford University Press; Stryer, Biochemistry (4th Ed.) (1995), W. H. Freeman, New York N.Y.; Gait, Oligonucleotide Synthesis: A Practical Approach (2002), IRL Press, London; Nelson and Cox, Lehninger, Principles of Biochemistry (2000) 3rd Ed., W. H. Freeman Pub., New York, N.Y.; Berg, et al., Biochemistry (2002) 5th Ed., W. H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entireties by reference for all purposes.

A. Definitions

[0070] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which the present disclosure belongs. If a definition set forth in this section is contrary to or otherwise inconsistent with a definition set forth in the patents, applications, published applications and other publications that are herein incorporated by reference, the definition set forth in this section prevails over the definition that is incorporated herein by reference.

[0071] As used herein, "a" or "an" means "at least one" or "one or more." As used herein, the singular forms "a," "an," and "the" include the plural reference unless the context clearly dictates otherwise.

[0072] Throughout this disclosure, various aspects of the claimed subject matter are presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the claimed subject matter. Accordingly, the description of a range should be considered to have specifically disclosed all the possible sub-ranges as well as individual numerical values within that range. For example, where a range of values is provided, it is understood that each intervening value, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the claimed subject matter. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the claimed subject matter, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the claimed subject matter. This applies regardless of the breadth of the range.

[0073] Reference to "about" a value or parameter herein includes (and describes) variations that are directed to that value or parameter per se. For example, description referring to "about X" includes description of "X." Additionally, use of "about" preceding any series of numbers includes "about" each of the recited numbers in that series. For example, description referring to "about X, Y, or Z" is intended to describe "about X, about Y, or about Z."

[0074] The term "average" as used herein refers to either a mean or a median, or any value used to approximate the mean or the median, unless the context clearly indicates otherwise.

[0075] A "subject" as used herein refers to an organism, or a part or component of the organism, to which the provided compositions, methods, kits, devices, and systems can be administered or applied. For example, the subject can be a mammal or a cell, a tissue, an organ, or a part of the mammal. As used herein, "mammal" refers to any of the mammalian class of species, preferably human (including humans, human subjects, or human patients). Mammals include, but are not limited to, farm animals, sport animals, pets, primates, horses, dogs, cats, and rodents such as mice and rats.

[0076] As used herein the term "sample" refers to anything which may contain a target molecule for which analysis is desired, including a biological sample. As used herein, a "biological sample" can refer to any sample obtained from a living or viral (or prion) source or other source of macromolecules and biomolecules, and includes any cell type or tissue of a subject from which nucleic acid, protein and/or other macromolecule can be obtained. The biological sample can be a sample obtained directly from a biological source or a sample that is processed. For example, isolated nucleic acids that are amplified constitute a biological sample. Biological samples include, but are not limited to, body fluids, such as blood, plasma, serum, cerebrospinal fluid, synovial fluid, urine, sweat, semen, stool, sputum, tears, mucus, amniotic fluid or the like, an effusion, a bone marrow sample, ascitic fluid, pelvic wash fluid, pleural fluid, spinal fluid, lymph, ocular fluid, extract of nasal, throat or genital swab, cell suspension from digested tissue, or extract of fecal material, and tissue and organ samples from animals and plants and processed samples derived therefrom.

[0077] The terms "polynucleotide," "oligonucleotide," "nucleic acid" and "nucleic acid molecule" are used interchangeably herein to refer to a polymeric form of nucleotides of any length, and comprise ribonucleotides, deoxyribonucleotides, and analogs or mixtures thereof. The terms include triple-, double- and single-stranded deoxyribonucleic acid ("DNA"), as well as triple-, double- and single-stranded ribonucleic acid ("RNA"). It also includes modified, for example by alkylation, and/or by capping, and unmodified forms of the polynucleotide. More particularly, the terms "polynucleotide," "oligonucleotide," "nucleic acid," and "nucleic acid molecule" include polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), including tRNA, rRNA, hRNA, and mRNA, whether spliced or unspliced, any other type of polynucleotide which is an N- or C-glycoside of a purine or pyrimidine base, and other polymers containing nonnucleotidic backbones, for example, polyamide (e.g., peptide nucleic acids ("PNAs")) and polymorpholino (commercially available from the Anti-Virals, Inc., Corvallis, OR, as Neugene) polymers, and other synthetic sequence-specific nucleic acid polymers providing that the polymers contain nucleobases in a configuration which allows for base pairing and base stacking, such as is found in DNA and RNA. Thus, these terms include, for example, 3'-deoxy-2',5'-DNA, oligodeoxyribonucleotide N3' to P5' phosphoramidates, 2'-O-alkyl-substituted RNA, hybrids between DNA and RNA or between PNAs and DNA or RNA, and also include known types of modifications, for example, labels, alkylation, "caps," substitution of one or more of the nucleotides with an analog, inter-nucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), with negatively charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), and with positively charged linkages (e.g., aminoalkylphosphoramidates, aminoalkylphosphotriesters), those containing pendant moieties, such as, for example, proteins (including enzymes (e.g., nucleases), toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g., acridine, psoralen, etc.), those containing chelates (of, e.g., metals, radioactive metals, boron, oxidative metals, etc.), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as well as unmodified forms of the polynucleotide or oligonucleotide. A nucleic acid generally will contain phosphodiester bonds, although in some cases nucleic acid analogs may be included that have alternative backbones such as phosphoramidite, phosphorodithioate, or methylphosphoroamidite linkages; or peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with bicyclic structures including locked nucleic acids, positive backbones, non-ionic backbones and non-ribose backbones. Modifications of the ribose-phosphate backbone may be done to increase the stability of the molecules; for example, PNA:DNA hybrids can exhibit higher stability in some environments. The terms "polynucleotide," "oligonucleotide," "nucleic acid" and "nucleic acid molecule" can comprise any suitable length, such as at least 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, 1,000 or more nucleotides.

[0078] It will be appreciated that, as used herein, the terms "nucleoside" and "nucleotide" include those moieties which contain not only the known purine and pyrimidine bases, but also other heterocyclic bases which have been modified. Such modifications include methylated purines or pyrimidines, acylated purines or pyrimidines, or other heterocycles. Modified nucleosides or nucleotides can also include modifications on the sugar moiety, e.g., wherein one or more of the hydroxyl groups are replaced with halogen, aliphatic groups, or are functionalized as ethers, amines, or the like. The term "nucleotidic unit" is intended to encompass nucleosides and nucleotides.

[0079] The terms "complementary" and "substantially complementary" include the hybridization or base pairing or the formation of a duplex between nucleotides or nucleic acids, for instance, between the two strands of a double-stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single-stranded nucleic acid. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single-stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the other strand, usually at least about 90% to about 95%, and even about 98% to about 100%. In one aspect, two complementary sequences of nucleotides are capable of hybridizing, preferably with less than 25%, more preferably with less than 15%, even more preferably with less than 5%, most preferably with no mismatches between opposed nucleotides. Preferably the two molecules will hybridize under conditions of high stringency.

[0080] As used herein, for a reference sequence, the reverse complementary sequence is the complementary sequence of the reference sequence in the reverse order. For example, for 5'-ATCG-3', the complementary sequence is 3'-TAGC-5', and the reverse-complementary sequence is 5'-CGAT-3'.
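The definition can be sketched in a few lines of code. The function names below are hypothetical, and the sketch handles only unambiguous DNA bases (A, C, G, T); the complement is returned in the same left-to-right order as the input, so it reads 3' to 5' when the input is written 5' to 3'.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complement read in reverse order (written 5' to 3')."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    print(complement("ATCG"))          # complement of 5'-ATCG-3', read 3' to 5'
    print(reverse_complement("ATCG"))  # reverse complement, read 5' to 3'
```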

[0081] "Hybridization" as used herein may refer to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide. In one aspect, the resulting double-stranded polynucleotide can be a "hybrid" or "duplex." "Hybridization conditions" typically include salt concentrations of approximately less than 1 M, often less than about 500 mM and may be less than about 200 mM. A "hybridization buffer" includes a buffered salt solution such as 5× SSPE, or other such buffers known in the art. Hybridization temperatures can be as low as 5°C, but are typically greater than 22°C, and more typically greater than about 30°C, and typically in excess of 37°C. Hybridizations are often performed under stringent conditions, i.e., conditions under which a sequence will hybridize to its target sequence but will not hybridize to other, non-complementary sequences. Stringent conditions are sequence-dependent and are different in different circumstances. For example, longer fragments may require higher hybridization temperatures for specific hybridization than short fragments. As other factors may affect the stringency of hybridization, including base composition and length of the complementary strands, presence of organic solvents, and the extent of base mismatching, the combination of parameters is more important than the absolute measure of any one parameter alone. Generally stringent conditions are selected to be about 5°C lower than the Tm for the specific sequence at a defined ionic strength and pH. The melting temperature Tm can be the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. Several equations for calculating the Tm of nucleic acids are well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation Tm = 81.5 + 0.41(% G + C), when a nucleic acid is in aqueous solution at 1 M NaCl (see e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references (e.g., Allawi and SantaLucia, Jr., Biochemistry, 36:10581-94 (1997)) include alternative methods of computation which take structural and environmental, as well as sequence characteristics into account for the calculation of Tm.
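As a worked illustration of the simple estimate above, the following sketch computes Tm from the G + C percentage and then subtracts about 5°C to suggest a stringent hybridization temperature. The function names and the example sequence are hypothetical; the formula is only the rough 1 M NaCl estimate quoted in the text and ignores length, mismatches, and nearest-neighbor effects.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    upper = seq.upper()
    gc_count = sum(1 for base in upper if base in "GC")
    return 100.0 * gc_count / len(upper)


def simple_tm(seq: str) -> float:
    """Estimate Tm in degrees C as 81.5 + 0.41 * (% G + C)."""
    return 81.5 + 0.41 * gc_percent(seq)


def stringent_temperature(seq: str, offset: float = 5.0) -> float:
    """Suggest a temperature about `offset` degrees C below the estimated Tm."""
    return simple_tm(seq) - offset


if __name__ == "__main__":
    example = "ATGCGCTA"  # hypothetical sequence, 50% G + C
    print(simple_tm(example), stringent_temperature(example))
```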

[0082] In general, the stability of a hybrid is a function of the ion concentration and temperature. Typically, a hybridization reaction is performed under conditions of lower stringency, followed by washes of varying, but higher, stringency. Exemplary stringent conditions include a salt concentration of at least 0.01 M to no more than 1 M sodium ion concentration (or other salt) at a pH of about 7.0 to about 8.3 and a temperature of at least 25°C. For example, conditions of 5× SSPE (750 mM NaCl, 50 mM sodium phosphate, 5 mM EDTA at pH 7.4) and a temperature of approximately 30°C are suitable for allele-specific hybridizations, though a suitable temperature depends on the length and/or GC content of the region hybridized. In one aspect, "stringency of hybridization" in determining percentage mismatch can be as follows: 1) high stringency: 0.1× SSPE, 0.1% SDS, 65°C; 2) medium stringency: 0.2× SSPE, 0.1% SDS, 50°C (also referred to as moderate stringency); and 3) low stringency: 1.0× SSPE, 0.1% SDS, 50°C. It is understood that equivalent stringencies may be achieved using alternative buffers, salts and temperatures. For example, moderately stringent hybridization can refer to conditions that permit a nucleic acid molecule such as a probe to bind a complementary nucleic acid molecule. The hybridized nucleic acid molecules generally have at least 60% identity, including for example at least any of 70%, 75%, 80%, 85%, 90%, or 95% identity. Moderately stringent conditions can be conditions equivalent to hybridization in 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42°C, followed by washing in 0.2× SSPE, 0.2% SDS, at 42°C. High stringency conditions can be provided, for example, by hybridization in 50% formamide, 5× Denhardt's solution, 5× SSPE, 0.2% SDS at 42°C, followed by washing in 0.1× SSPE, and 0.1% SDS at 65°C. Low stringency hybridization can refer to conditions equivalent to hybridization in 10% formamide, 5× Denhardt's solution, 6× SSPE, 0.2% SDS at 22°C, followed by washing in 1× SSPE, 0.2% SDS, at 37°C. Denhardt's solution contains 1% Ficoll, 1% polyvinylpyrrolidone, and 1% bovine serum albumin (BSA). 20× SSPE (sodium chloride, sodium phosphate, EDTA) contains 3 M sodium chloride, 0.2 M sodium phosphate, and 0.025 M EDTA. Other suitable moderate stringency and high stringency hybridization buffers and conditions are well known to those of skill in the art and are described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, Plainview, N.Y. (1989); and Ausubel et al., Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons (1999).
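The buffer strengths named above are simple dilutions of the 20× SSPE stock. The sketch below, which is only an illustration, scales the stated 20× composition (3 M sodium chloride, 0.2 M sodium phosphate, 0.025 M EDTA) to any working strength; the dictionary and function names are hypothetical.

```python
# Molar composition of the 20x SSPE stock as stated in the text.
SSPE_20X_MOLAR = {
    "sodium chloride": 3.0,
    "sodium phosphate": 0.2,
    "EDTA": 0.025,
}


def sspe_composition(fold: float) -> dict:
    """Return molar concentrations of each component at the given strength.

    fold: working strength, e.g. 5 for 5x SSPE or 0.1 for 0.1x SSPE.
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: molar * fold / 20.0 for name, molar in SSPE_20X_MOLAR.items()}


if __name__ == "__main__":
    for strength in (5, 1, 0.2, 0.1):
        in_millimolar = {k: v * 1000 for k, v in sspe_composition(strength).items()}
        print(strength, in_millimolar)
```

By this linear scaling, 5× SSPE works out to 750 mM sodium chloride and 50 mM sodium phosphate, matching the figures quoted above, while EDTA comes to 6.25 mM, which the text gives in rounded form as 5 mM.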

[0083] Alternatively, substantial complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementary over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementary. See M. Kanehisa, Nucleic Acids Res. 12:203 (1984).

[0084] A "primer" used herein can be an oligonucleotide, either natural or synthetic, that is capable, upon forming a duplex with a polynucleotide template, of acting as a point of initiation of nucleic acid synthesis and being extended from its 3' end along the template so that an extended duplex is formed. The sequence of nucleotides added during the extension process is determined by the sequence of the template polynucleotide. Primers usually are extended by a polymerase, for example, a DNA polymerase.

[0085] "Ligation" may refer to the formation of a covalent bond or linkage between the termini of two or more nucleic acids, e.g., oligonucleotides and/or polynucleotides, in a template-driven reaction. The nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically. As used herein, ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5' carbon terminal nucleotide of one oligonucleotide with a 3' carbon of another nucleotide.

[0086] "Amplification," as used herein, generally refers to the process of producing multiple copies of a desired sequence. "Multiple copies" means at least 2 copies. A "copy" does not necessarily mean perfect sequence complementarity or identity to the template sequence. For example, copies can include nucleotide analogs such as deoxyinosine, intentional sequence alterations (such as sequence alterations introduced through a primer comprising a sequence that is hybridizable, but not complementary, to the template), and/or sequence errors that occur during amplification.

[0087] "Sequence determination" and the like include determination of information relating to the nucleotide base sequence of a nucleic acid. Such information may include the identification or determination of partial as well as full sequence information of the nucleic acid. Sequence information may be determined with varying degrees of statistical reliability or confidence. In one aspect, the term includes the determination of the identity and ordering of a plurality of contiguous nucleotides in a nucleic acid.

[0088] The terms "Sequencing," "High throughput sequencing," or "next generation sequencing" include sequence determination using methods that determine many (typically thousands to billions) of nucleic acid sequences in an intrinsically parallel manner, i.e. where DNA templates are prepared for sequencing not one at a time, but in a bulk process, and where many sequences are read out preferably in parallel, or alternatively using an ultra-high throughput serial process that itself may be parallelized. Such methods include but are not limited to pyrosequencing (for example, as commercialized by 454 Life Sciences, Inc., Branford, CT); sequencing by ligation (for example, as commercialized in the SOLiD™ technology, Life Technologies, Inc., Carlsbad, CA); sequencing by synthesis using modified nucleotides (such as commercialized in TruSeq™ and HiSeq™ technology by Illumina, Inc., San Diego, CA; HeliScope™ by Helicos Biosciences Corporation, Cambridge, MA; and PacBio RS by Pacific Biosciences of California, Inc., Menlo Park, CA); sequencing by ion detection technologies (such as Ion Torrent™ technology, Life Technologies, Carlsbad, CA); sequencing of DNA nanoballs (Complete Genomics, Inc., Mountain View, CA); nanopore-based sequencing technologies (for example, as developed by Oxford Nanopore Technologies, LTD, Oxford, UK); and like highly parallelized sequencing methods.

[0089] "SNP" or "single nucleotide polymorphism" may include a genetic variation between individuals; e.g., a single nitrogenous base position in the DNA of organisms that is variable. SNPs are found across the genome; much of the genetic variation between individuals is due to variation at SNP loci, and often this genetic variation results in phenotypic variation between individuals. SNPs for use in the present disclosure and their respective alleles may be derived from any number of sources, such as public databases (U.C. Santa Cruz Human Genome Browser Gateway (genome.ucsc.edu/cgi-bin/hgGateway) or the NCBI dbSNP website (ncbi.nlm.nih.gov/SNP/)), or may be experimentally determined as described in U.S. Pat. No. 6,969,589; and US Pub. No. 2006/0188875 entitled "Human Genomic Polymorphisms." Although the use of SNPs is described in some of the embodiments presented herein, it will be understood that other biallelic or multi-allelic genetic markers may also be used. A biallelic genetic marker is one that has two polymorphic forms, or alleles. As mentioned above, for a biallelic genetic marker that is associated with a trait, the allele that is more abundant in the genetic composition of a case group as compared to a control group is termed the "associated allele," and the other allele may be referred to as the "unassociated allele." Thus, for each biallelic polymorphism that is associated with a given trait (e.g., a disease or drug response), there is a corresponding associated allele. Other biallelic polymorphisms that may be used with the methods presented herein include, but are not limited to, multinucleotide changes, insertions, deletions, and translocations.

[0090] It will be further appreciated that references to DNA herein may include genomic DNA, mitochondrial DNA, episomal DNA, and/or derivatives of DNA such as amplicons, RNA transcripts, cDNA, DNA analogs, etc. The polymorphic loci that are screened in an association study may be in a diploid or a haploid state and, ideally, would be from sites across the genome. Sequencing technologies available for SNP sequencing, such as the BeadArray platform (GOLDENGATE™ assay) (Illumina, Inc., San Diego, CA) (see Fan, et al., Cold Spring Symp. Quant. Biol., 68:69-78 (2003)), may be employed.

[0091] In some embodiments, the term "methylation state" or "methylation status" refers to the presence or absence of 5-methylcytosine ("5-mC" or "5-mCyt") at one or a plurality of CpG dinucleotides within a DNA sequence. Methylation states at one or more particular CpG methylation sites (each having two CpG dinucleotide sequences) within a DNA sequence include "unmethylated," "fully-methylated," and "hemi-methylated." The term "hemi-methylation" or "hemimethylation" refers to the methylation state of a double stranded DNA wherein only one strand thereof is methylated. The term "hypermethylation" refers to the average methylation state corresponding to an increased presence of 5-mCyt at one or a plurality of CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the amount of 5-mCyt found at corresponding CpG dinucleotides within a normal control DNA sample. The term "hypomethylation" refers to the average methylation state corresponding to a decreased presence of 5-mCyt at one or a plurality of CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the amount of 5-mCyt found at corresponding CpG dinucleotides within a normal control DNA sample.

[0092] "Multiplexing" or "multiplex assay" herein may refer to an assay or other analytical method in which the presence and/or amount of multiple targets, e.g., multiple nucleic acid sequences, can be assayed simultaneously by using more than one marker, each of which has at least one different detection characteristic, e.g., a fluorescence characteristic (for example excitation wavelength, emission wavelength, emission intensity, FWHM (full width at half maximum peak height), or fluorescence lifetime) or a unique nucleic acid or protein sequence characteristic.

[0093] As used herein, "disease or disorder" refers to a pathological condition in an organism resulting from, e.g., infection or genetic defect, and characterized by identifiable symptoms.

B. Genetic variant detection.

[0094] Mutant DNA molecules offer unique advantages over cancer-associated biomarkers because they are specific. Though mutations occur in individual normal cells at a low rate (about 10⁻⁹ to 10⁻¹⁰ mutations/bp/generation), such mutations represent such a tiny fraction of the total normal DNA that they are orders of magnitude below the detection limit of certain art methods. Several studies have shown that mutant DNA can be detected in stool, urine, and blood of CRC patients (Osborn and Ahlquist, Stool screening for colorectal cancer: molecular approaches, Gastroenterology 2005; 128: 192-206).

[0095] Based on the sequencing results, detection of mutant DNA (including tumor-associated mutations) in a patient can be made, and diagnosis of a disease such as cancer and predictions regarding tumor recurrence can be made. Based on the predictions, treatment and surveillance decisions can be made. For example, circulating tumor DNA that indicates a future recurrence can lead to additional or more aggressive therapies as well as additional or more sophisticated imaging and monitoring. Circulating DNA refers to DNA that is ectopic to a tumor.

[0096] Samples which can be analyzed include blood and stool. Blood samples may be for example a fraction of blood, such as serum or plasma. Similarly stool can be fractionated to purify DNA from other components. Tumor samples are used to identify a somatically mutated gene in the tumor that can be used as a marker of tumor in other locations in the body. Thus, as an example, a particular somatic mutation in a tumor can be identified by any standard means known in the art. Typical means include direct sequencing of tumor DNA, using allele-specific probes, allele-specific amplification, primer extension, etc. Once the somatic mutation is identified, it can be used in other compartments of the body to distinguish tumor derived DNA from DNA derived from other cells of the body. Somatic mutations are confirmed by determining that they do not occur in normal tissues of the body of the same patient. Types of tumors which can be diagnosed and/or monitored in this fashion are virtually unlimited. Any tumor which sheds cells and/or DNA into the blood or stool or other bodily fluid can be used. Such tumors include, in addition to colorectal tumors, tumors of the breast, lung, kidney, liver, pancreas, stomach, brain, head and neck, lymphatics, ovaries, uterus, bone, blood, etc.

[0097] In one aspect, highly parallel next-generation sequencing methods are used to analyze a target sequence in a sample, in order to detect a genetic variant associated with a disease or condition, such as cancer. Such sequencing methods can be carried out, for example, using a one pass sequencing method or using paired-end sequencing. Next generation sequencing methods include, but are not limited to, hybridization-based methods, such as disclosed in Drmanac, U.S. Pat. Nos. 6,864,052; 6,309,824; and 6,401,267; and Drmanac et al., U.S. patent publication 2005/0191656, and sequencing by synthesis methods, e.g., Nyren et al., U.S. Pat. No. 6,210,891; Ronaghi, U.S. Pat. No. 6,828,100; Ronaghi et al. (1998), Science, 281: 363-365; Balasubramanian, U.S. Pat. No. 6,833,246; Quake, U.S. Pat. No. 6,911,345; Li et al., Proc. Natl. Acad. Sci., 100: 414-419 (2003); Smith et al., PCT publication WO 2006/074351; use of reversible extension terminators, e.g., Turner, U.S. Pat. No. 6,833,246; and ligation-based methods, e.g., Shendure et al. (2005), Science, 309: 1728-1739, Macevicz, U.S. Pat. No. 6,306,597; Stoddart et al., PNAS USA. 2009 Apr. 20; Xiao et al., Nat Methods. 2009 March; 6(3): 199-201, all of which references are incorporated by reference herein for all purposes.

[0098] For Illumina sequencing, on each end, these constructs have flow cell binding sites, P5 and P7, which allow the library fragment to attach to the flow cell surface. The P5 and P7 regions of single-stranded library fragments anneal to their complementary oligos on the flow cell surface. The flow cell oligos act as primers and a strand complementary to the library fragment is synthesized. Then, the original strand is washed away, leaving behind fragment copies that are covalently bonded to the flow cell surface in a mixture of orientations. Copies of each fragment are then generated by bridge amplification, creating clusters. Then, the P5 region is cleaved, resulting in clusters containing only fragments which are attached by the P7 region. This ensures that all copies are sequenced in the same direction. The sequencing primer anneals to the P5 end of the fragment, and begins the sequencing by synthesis process. Index reads are performed when a sample is barcoded. When Read 1 is finished, everything from Read 1 is removed and an index primer is added, which anneals at the P7 end of the fragment and sequences the barcode. Then, everything is stripped from the template, which forms clusters by bridge amplification as in Read 1. This leaves behind fragment copies that are covalently bonded to the flow cell surface in a mixture of orientations. This time, P7 is cut instead of P5, resulting in clusters containing only fragments which are attached by the P5 region. This ensures that all copies are sequenced in the same direction (opposite Read 1). The sequencing primer anneals to the P7 region and sequences the other end of the template.

[0099] Next-generation sequencing platforms, such as MiSeq (Illumina Inc., San Diego, CA), can also be used for highly multiplexed assay readout. A variety of statistical tools, such as the Proportion test, multiple comparison corrections based on False Discovery Rates (see Benjamini and Hochberg, 1995, Journal of the Royal Statistical Society Series B (Methodological) 57, 289-300), and Bonferroni corrections for multiple testing, can be used to analyze assay results. In addition, approaches developed for the analysis of differential expression from RNA-Seq data can be used to reduce variance for each target sequence and increase overall power in the analysis. See Smyth, 2004, Stat. Appl. Genet. Mol. Biol. 3, Article 3.
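By way of illustration only, the two multiple-testing corrections named above can be sketched as follows. The function names and the example p-values are hypothetical and are not taken from the disclosure; the sketch shows the standard Benjamini-Hochberg step-up rule next to a plain Bonferroni threshold.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where p <= alpha / number_of_tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical per-target p-values from a multiplexed assay.
example_p = [0.001, 0.012, 0.025, 0.041, 0.20]
print(benjamini_hochberg(example_p))
print(bonferroni(example_p))
```

On these illustrative values the false discovery rate procedure rejects more hypotheses than the Bonferroni correction, which reflects the usual trade-off between the two approaches.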

[00100] In any of the preceding embodiments, the method can be used for the diagnosis and/or prognosis of a disease or condition in a subject, predicting the responsiveness of a subject to a treatment, identifying a pharmacogenetic marker for the disease/condition or treatment, and/or screening a population for genetic information. In one aspect, the disease or condition is a cancer or neoplasia, and the treatment is a cancer or neoplasia treatment.

[00101] In some embodiments, the nucleic acid molecule of interest disclosed herein is a cell-free DNA, such as cell-free fetal DNA (also referred to as "cfDNA") or ctDNA. cfDNA circulates in the body, such as in the blood, of a pregnant mother, and represents the fetal genome, while ctDNA circulates in the body, such as in the blood, of a cancer patient, and is generally pre-fragmented. In other embodiments, the nucleic acid molecule of interest disclosed herein is an ancient and/or damaged DNA, for example, due to storage under damaging conditions such as in formalin-fixed samples, or partially digested samples.

[00102] As cancer cells die, they release DNA into the bloodstream. This DNA, known as ctDNA, is highly fragmented, with an average length of approximately 150 base pairs. Once the white blood cells are removed, ctDNA generally comprises a very small fraction of the remaining plasma DNA, for example, ctDNA may constitute less than about 10% of the plasma DNA. Generally, this percentage is less than about 1%, for example, less than about 0.5% or less than about 0.01%. Additionally, the total amount of plasma DNA is generally very low, for example, at about 10 ng/mL of plasma.
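As a rough worked example of what these figures imply, the number of tumor-derived genome copies per mL of plasma can be estimated as sketched below. The value of about 3.3 pg per haploid human genome is an assumption introduced here for illustration and is not stated in the disclosure; the function name is likewise hypothetical.

```python
# Assumption (not from the disclosure): one haploid human genome weighs about 3.3 pg.
HAPLOID_GENOME_MASS_PG = 3.3


def tumor_copies_per_ml(plasma_dna_ng_per_ml, tumor_fraction):
    """Estimate tumor-derived haploid genome copies per mL of plasma."""
    genome_equivalents = plasma_dna_ng_per_ml * 1000.0 / HAPLOID_GENOME_MASS_PG
    return genome_equivalents * tumor_fraction


# About 10 ng/mL plasma DNA, at tumor fractions of 1%, 0.5% and 0.01%.
for fraction in (0.01, 0.005, 0.0001):
    print(fraction, round(tumor_copies_per_ml(10.0, fraction), 2))
```

Under these assumptions, 10 ng/mL corresponds to roughly three thousand haploid genome equivalents per mL, so a 1% tumor fraction leaves on the order of thirty mutant copies per locus per mL, and a 0.01% fraction leaves less than one.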

[00103] A DNA sample can be contacted with primers that result in specific amplification of a mutant sequence, if the mutant sequence is present in the sample. "Specific amplification" means that the primers amplify a specific mutant sequence and not other mutant sequences or the wild-type sequence. Allele-specific amplification-based methods or extension-based methods are described in WO 93/22456 and U.S. Pat. Nos. 4,851,331; 5,137,806; 5,595,890; and 5,639,611, all of which are specifically incorporated herein by reference for their teachings regarding same. While methods such as ligase chain reaction, strand displacement assay, and various transcription-based amplification methods can be used (see, e.g., review by Abramson and Myers, Current Opinion in Biotechnology 4:41-47 (1993)), PCR and/or sequencing methods can also be used.

[00104] Multiple allele-specific primers, such as multiple mutant alleles or various combinations of wild-type and mutant alleles, can be employed simultaneously in a single amplification and/or sequencing reaction. Amplification products can be distinguished by different labels or size.

C. DNA methylation and analysis.

[00105] DNA methylation was the first discovered epigenetic mark. Epigenetics is the study of changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence. Methylation predominantly involves the addition of a methyl group to the carbon-5 position of cytosine residues of the dinucleotide CpG and is associated with repression or inhibition of transcriptional activity.

[00106] DNA methylation may affect the transcription of genes in two ways. First, the methylation of DNA itself may physically impede the binding of transcriptional proteins to the gene and, second and likely more important, methylated DNA may be bound by proteins known as methyl-CpG-binding domain proteins (MBDs). MBD proteins then recruit additional proteins to the locus, such as histone deacetylases and other chromatin remodeling proteins that can modify histones, thereby forming compact, inactive chromatin, termed heterochromatin. This link between DNA methylation and chromatin structure is very important. In particular, loss of methyl-CpG-binding protein 2 (MeCP2) has been implicated in Rett syndrome; and methyl-CpG-binding domain protein 2 (MBD2) mediates the transcriptional silencing of hypermethylated genes in cancer.

[00107] DNA methylation is an important regulator of gene transcription and a large body of evidence has demonstrated that genes with high levels of 5-methylcytosine in their promoter region are transcriptionally silent, and that DNA methylation gradually accumulates upon long-term gene silencing. DNA methylation is essential during embryonic development and in somatic cells patterns of DNA methylation are generally transmitted to daughter cells with a high fidelity. Aberrant DNA methylation patterns - hypermethylation and hypomethylation compared to normal tissue - have been associated with a large number of human malignancies. Hypermethylation typically occurs at CpG islands in the promoter region and is associated with gene inactivation. Global hypomethylation has also been implicated in the development and progression of cancer through different mechanisms.

[00108] The detection of methylated DNA, therefore, can be useful in the diagnosis of certain cancers and, for example, for following treatment efficacy. For example, WO1998056952A1 discloses a cancer diagnostic method based upon DNA methylation differences at specific CpG sites, and the method comprises bisulfite treatment of DNA, followed by methylation-sensitive single nucleotide primer extension (Ms-SNuPE) for determination of strand-specific methylation status at cytosine residues. U.S. 8,541,207 B2 discloses methods for analyzing the methylation state of DNA with a gene array. WO2005123942A2 discloses a method for analyzing methylation patterns in DNA and identifying aberrantly methylated genes in disease tissue. Other methods for detection of cytosine methylation are disclosed in WO2005071106A1, WO2003074730A1, EP1342794A1, EP1461458A2, EP1360317A2, U.S. 7,524,629 B2, WO2000070090A1, WO2000026401A1, US20060134650A1, and U.S. 7,247,428 B2. All of the patent documents in this paragraph are incorporated by reference for all purposes.

[00109] One example of a cancer wherein bisulfite sequencing has proven useful is colorectal cancer, for which the detection of methylated Septin 9 (mS9) is used as a screening biomarker. Other examples of cancers with target sequences for bisulfite conversion are esophageal squamous cell carcinoma (Baba et al., Surg. Today, 2013), breast cancer (Dagdemir et al., In vivo, 2013, 27(1): 1-9), prostate cancer (Willard and Koochekpour, Am. J. Cancer Res. 2012, 2(6):620-657), non-Hodgkin's lymphomas (Yin et al., Front Genet, 2012, 3:233), oral cancers (Gasche and Goel, Future Oncol., 2012, 8(11): 1407-1425), etc. One of ordinary skill in the art will appreciate that the methods of the present invention are applicable to and easily adapted to the improved detection of these and other cancers known to be manifested at least in part by hypermethylation or hypomethylation of target gene sequences. Likewise, other medical conditions known to those of skill in the art wherein hypermethylation and/or hypomethylation are part of the known etiology will have improved detection, for diagnosis and/or prognosis and/or as companion diagnostics, with the application of the methods disclosed herein.

[00110] Bisulfite conversion is the use of bisulfite reagents to treat DNA to determine its pattern of methylation. The treatment of DNA with bisulfite converts cytosine residues to uracil but leaves 5-methylcytosine residues unaffected. Thus, bisulfite treatment introduces specific changes in the DNA sequence that depend on the methylation status of the individual cytosine residues. Various analyses can be performed on the altered sequence to retrieve this information, for example, in order to differentiate between single nucleotide polymorphisms (SNP) resulting from the bisulfite conversion. U.S. Patent No. 7,620,386, U.S. Patent No. 9,365,902, and U.S. Patent Application Publication 2006/0134643, all of which are incorporated herein by reference, exemplify methods known to one of ordinary skill in the art with regard to detecting sequences altered due to bisulfite conversion. However, one consequence of bisulfite conversion is that the double-stranded conformation of the original target is disrupted due to loss of sequence complementarity. In addition, bisulfite conversion is a harsh treatment that tends to lead to material losses, which can compromise the assay sensitivity on low-input samples, such as cell-free DNA, including circulating tumor DNA (also referred to as "cell-free tumor DNA," or "ctDNA").
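The sequence change introduced by bisulfite treatment can be illustrated with a minimal in-silico sketch. The function name, the example sequence, and the zero-based position convention are hypothetical; the sketch also assumes that uracil is read as thymine after amplification and sequencing, and it models a single strand only.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Return the read-out of a bisulfite-treated strand.

    Unmethylated C is converted (C -> U, read as T); 5-methylcytosine,
    given as zero-based positions, is left as C.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(sequence.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


template = "ACGTCCGGA"
print(bisulfite_convert(template, methylated_positions=[5]))  # ATGTTCGGA
print(bisulfite_convert(template, methylated_positions=[]))   # ATGTTTGGA
```

The two outputs differ only at the methylated cytosine, which is the information a downstream assay reads out. The sketch also makes visible why the converted strand is no longer complementary to its original partner.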

D. Simultaneous detection of genetic variants and DNA methylation on limited sample input.

[00111] Simultaneous detection of genetic variants and DNA methylation is difficult with first- and second-generation sequencing, especially when the input DNA amount is low and that limited input needs to be further divided for two separate workflows, one for genetic variant detection and the other for DNA methylation analysis.

[00112] Flusberg et al. (2010) in "Direct detection of DNA methylation during single-molecule, real-time sequencing," Nat. Methods 7: 461-465, and Manrao et al. (2012) in "Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase," Nat. Biotechnol. 30: 349-353, attempted to combine third generation sequencing with DNA methylation analysis. However, their detection accuracy was low, and far from being adequate for routine clinical tests.

[00113] In one aspect, disclosed herein is a method (MSA-seq) for efficient quantification of DNA methylation status of multiple CpG sites, and simultaneous detection and quantification of genetic variants at multiple targets. In some embodiments, the input DNAs, such as ctDNA, are first digested with methylation-sensitive restriction enzymes, such as HpaII and/or SalI, followed by multiplexed amplification of assayed targets and next-generation sequencing (FIG. 1, left panel). The methylation levels of the target CpG sites are inferred by the relative read depth, whereas the genetic variants are called from the raw sequencing reads (FIG. 1, right panel). In one aspect, the majority of genetic variants are accessible with a single-reaction assay. The variants in the ctDNA can be interrogated using various methods, including next generation sequencing discussed above.

[00114] In some embodiments, for a minority of variants that are located too close to the restriction enzyme recognition sites, a second multiplexed amplification reaction is performed on the undigested input DNA, for a separate sequencing library.
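One possible way to decide which variant amplicons belong in the second, undigested reaction is sketched below. This is an illustrative interpretation and not a procedure stated in the disclosure: it assumes that a variant amplicon is "too close" whenever its sequence contains a recognition site of the enzyme used, and it uses CCGG, the known HpaII recognition sequence, as the default motif. The function names and amplicon sequences are hypothetical.

```python
def amplicon_has_site(amplicon_seq, motifs=("CCGG",)):
    """True if the amplicon contains any of the given recognition motifs.

    CCGG is palindromic, so scanning one strand is sufficient for it.
    """
    seq = amplicon_seq.upper()
    return any(motif in seq for motif in motifs)


def assign_reactions(variant_amplicons, motifs=("CCGG",)):
    """Split variant amplicons into those usable in the digested reaction
    and those that need the separate undigested reaction."""
    digested_ok, needs_undigested = [], []
    for name, seq in variant_amplicons.items():
        if amplicon_has_site(seq, motifs):
            needs_undigested.append(name)
        else:
            digested_ok.append(name)
    return digested_ok, needs_undigested


# Hypothetical amplicon sequences for two variant targets.
amplicons = {"amp1": "ATGCATGCATGC", "amp2": "ATGCCGGTTA"}
print(assign_reactions(amplicons))  # (['amp1'], ['amp2'])
```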

[00115] While methylation-sensitive restriction enzyme digestion has been adopted for multiple methylation assays, including several NGS-based methods, such as Methyl-seq, MCA-seq, HELP-seq and MSCC, MSA-seq is unique in that genomic fragments containing the targeted CpG sites are extracted from the remaining genomic fragments by multiplexed amplification with at least one defined end, and the methylation levels are correlated with the amplifiable fragments. For a review of methods for methylation analysis, see Laird (2010), "Principles and challenges of genome-wide DNA methylation analysis," Nat Rev Genet 11: 191-203.

[00116] In one aspect, the present method does not rely on adaptor ligation with the digested ends. The number of targeted CpG sites per assay is highly flexible, in the range from one to tens of thousands. The methylation levels can be quantitated by normalization using the read depth information of internal control loci that do not contain the digestion sites, without requiring a second control reaction using methyl-insensitive restriction enzymes. In another aspect, the present method does not involve bisulfite conversion, which can result in >90% loss of DNA molecules. The combination of these features leads to high scalability, superior sensitivity and low input requirements, which are particularly relevant to liquid biopsies.

[00117] In one aspect of the present disclosure, target capture can be implemented with at least three different methods, including multiplexed PCR (Qiagen Multiplexed PCR, Thermo Fisher AmpliSeq), padlock capture (Roche HEAT-Seq), and selector capture (Agilent HaloPlex). In some embodiments, primers or probes targeting short genomic intervals (40-200 bp including the oligo annealing regions) covering the CpG sites of interest are designed. A separate set of primers or probes is also designed for the genetic variants (mutations) of interest. Typically, the larger fraction of the target sequences in the second set does not contain restriction enzyme recognition sites, hence their sequencing read depth can be used as the internal control for the calculation of CpG methylation levels. In the rare situation where all targets in the second set can be digested by the restriction enzyme(s), additional amplicons will be designed as non-digested internal controls. The relative read depth (mean and variance) for all amplicons in an assay is first determined by multiplexed amplification and sequencing on non-digested DNA fragments that mimic the fragment size distribution of real samples. In one aspect, this only needs to be done once for each type of clinical sample. For each clinical sample of interest, the methylation of each target CpG site is determined by calculating the ratio of observed read depth over expected read depth after regression normalization. In one aspect, genetic variants are called by routine variant calling procedures, including read mapping, local alignment, variant calling and/or filtering.
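The read-depth calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the "regression normalization" is modeled here as a least-squares fit through the origin on the internal control amplicons, the enzyme is assumed to cleave only unmethylated sites (so the surviving fraction approximates the methylated fraction), and all names and read counts are hypothetical.

```python
def fit_scale(observed, baseline, control_ids):
    """Least-squares slope through the origin relating sample depth to the
    baseline (undigested reference) depth, using control amplicons only."""
    numerator = sum(observed[a] * baseline[a] for a in control_ids)
    denominator = sum(baseline[a] ** 2 for a in control_ids)
    return numerator / denominator


def methylation_levels(observed, baseline, control_ids, target_ids):
    """Ratio of observed to expected read depth per CpG amplicon, clipped to [0, 1]."""
    scale = fit_scale(observed, baseline, control_ids)
    levels = {}
    for amplicon in target_ids:
        expected = scale * baseline[amplicon]
        ratio = observed[amplicon] / expected
        levels[amplicon] = min(1.0, max(0.0, ratio))
    return levels


# Hypothetical depths: baseline from undigested reference DNA, observed from a digested sample.
baseline = {"ctrl1": 100, "ctrl2": 200, "cpg1": 150, "cpg2": 50}
observed = {"ctrl1": 200, "ctrl2": 400, "cpg1": 150, "cpg2": 100}
print(methylation_levels(observed, baseline, ["ctrl1", "ctrl2"], ["cpg1", "cpg2"]))
# {'cpg1': 0.5, 'cpg2': 1.0}
```

In this example the sample was sequenced twice as deeply as the reference, so the expected depth of each CpG amplicon is doubled before the ratio is taken. A fuller treatment would also carry the baseline variance forward to give a confidence interval for each ratio.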

[00118] In one aspect, the present method has a number of immediate clinical applications. One such application is non-invasive screening, early detection, or monitoring of tumors using patients' plasma, stool, urine or other types of biofluids. Another application is non-invasive prenatal screening of fetal aneuploidy, such as trisomy 21 (Down's syndrome).

[00119] In one aspect, provided herein is a method for analyzing a first target polynucleotide sequence and a methylation status of a second target polynucleotide sequence in a sample, comprising contacting a sample containing or suspected of containing a polynucleotide with a methylation-sensitive restriction enzyme (MSRE). In one aspect, the MSRE selectively cleaves the polynucleotide at a residue when it is unmethylated or selectively cleaves the polynucleotide at the residue when it is methylated. In any of the preceding embodiments, the MSRE can be selected from the group consisting of HpaII, SalI, SalI-HF®, ScrFI, BbeI, NotI, SmaI, XmaI, MboI, BstBI, ClaI, MluI, NaeI, NarI, PvuI, SacII, HhaI, and any combination thereof.

[00120] In one aspect, disclosed herein is a method for analyzing a first target set of polynucleotide sequence for sequence changes and a second target set of polynucleotide sequence for methylation status in a sample, comprising: 1) contacting a sample comprising a polynucleotide with an MSRE, wherein the MSRE selectively cleaves the polynucleotide at a residue when it is unmethylated or selectively cleaves the polynucleotide at the residue when it is methylated; 2) subjecting the sample from step 1) to polynucleotide amplification, using a mixture of: i) a first primer set for amplifying a first target set of polynucleotide sequence in the sample, and ii) a second primer set for analyzing a methylation status of a second target set of polynucleotide sequence in the sample, wherein the methylation status is of a residue in the second target set of polynucleotide sequence, and one primer of the second primer set hybridizes to the uncleaved second target polynucleotide sequence and together with another primer in the set, amplifies the uncleaved sequence but not the second target polynucleotide sequence cleaved at the residue by the MSRE; and 3) sequencing analysis of polynucleotides amplified in step 2), wherein the first target set of polynucleotide sequence is analyzed using sequencing reads from the amplified first target set of polynucleotide sequence, and the methylation status of the residue of the second target polynucleotide sequence is analyzed by comparing the observed number of sequencing reads (N_o) from the amplified second target set of polynucleotide sequence to an expected reference number (N_e).
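The logic of steps 1) to 3) can be illustrated with a deliberately simplified model. The sketch below assumes complete digestion, exactly one read per surviving molecule, and uniform amplification; none of these idealizations is stated in the disclosure, and the function name and molecule counts are hypothetical.

```python
def count_amplifiable(molecules, cleaves_unmethylated=True):
    """Count molecules that remain amplifiable after MSRE treatment.

    molecules: list of booleans, True if the target residue is methylated.
    cleaves_unmethylated: True for an enzyme blocked by methylation,
    False for an enzyme that cuts only the methylated residue.
    """
    survivors = 0
    for is_methylated in molecules:
        cleaved = (not is_methylated) if cleaves_unmethylated else is_methylated
        if not cleaved:
            survivors += 1
    return survivors


# 100 template molecules, 30 of which carry a methylated target residue.
templates = [True] * 30 + [False] * 70
n_expected = len(templates)                # N_e: read count expected without cleavage
n_observed = count_amplifiable(templates)  # N_o: read count observed after digestion
print(n_observed / n_expected)             # 0.3
```

With an enzyme that cuts only the methylated residue (`cleaves_unmethylated=False`), the same input gives a ratio of 0.7, so the ratio must be interpreted according to the selectivity of the enzyme used.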

[00121] In one embodiment, the first target set of polynucleotide sequence is analyzed using sequencing reads from the amplified first target set of polynucleotide sequence, as compared to a reference sequence, for example, a wild-type sequence and/or a human sequence for the target sequence. The comparison can be done by sequence alignment.

[00122] In another embodiment, the first target set of polynucleotide sequence is analyzed without comparing sequencing reads from the amplified first target set of polynucleotide sequence to a reference sequence, for example, by aligning all the sequencing reads to obtain a consensus sequence so that it is possible to tell which variants are the minority alleles. In one aspect, the minority allele comprises a mutation.
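A reference-free consensus of this kind can be sketched as a per-position majority vote. The sketch assumes the reads have already been aligned to one another and trimmed to equal length, which is a simplification; the function name and the reads are hypothetical.

```python
from collections import Counter


def consensus_and_minor_alleles(aligned_reads):
    """Build a majority-vote consensus and report minority alleles.

    aligned_reads: equal-length strings, already aligned to each other.
    Returns (consensus, {position: {minor_base: count}}).
    Ties are resolved in favor of the base seen first in the column.
    """
    consensus = []
    minor = {}
    for position, column in enumerate(zip(*aligned_reads)):
        counts = Counter(column)
        major_base, _ = counts.most_common(1)[0]
        consensus.append(major_base)
        others = {base: n for base, n in counts.items() if base != major_base}
        if others:
            minor[position] = others
    return "".join(consensus), minor


reads = ["ACGT", "ACGT", "ACTT", "ACGT"]
print(consensus_and_minor_alleles(reads))  # ('ACGT', {2: {'T': 1}})
```

In practice, minority alleles found this way would still need filtering against sequencing error before being treated as candidate mutations.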

[00123] In one embodiment, a sample contacted with an MSRE can be analyzed by constructing a single-stranded library by ligation, as disclosed in U.S. Provisional Application No. , entitled "Compositions and Methods for Library Construction and Sequence Analysis," filed April 19, 2017 (Attorney Docket No. 737993000200), which is incorporated herein by reference in its entirety for all purposes. In one aspect, the MSRE treatment is before the dephosphorylation and/or the denaturing step of the single-stranded ligation method. In one embodiment, a method comprising ligating a set of adaptors to a library of single-stranded polynucleotides is provided, and in the method, an MSRE-treated sample is denatured to create the library of single-stranded polynucleotides, and the ligation is catalyzed by a single-stranded DNA (ssDNA) ligase, each single-stranded polynucleotide is blocked at the 5' end to prevent ligation at the 5' end, each adaptor comprises a unique molecular identifier (UMI) sequence that earmarks the single-stranded polynucleotide to which the adaptor is ligated, each adaptor is blocked at the 3' end to prevent ligation at the 3' end, and the 5' end of the adaptor is ligated to the 3' end of the single-stranded polynucleotide by the ssDNA ligase to form a linear ligation product, thereby obtaining a library of linear, single-stranded ligation products. In any of the preceding embodiments, the method can further comprise converting the library of linear, single-stranded ligation products into a library of linear, double-stranded ligation products. In one aspect, the conversion uses a primer or a set of primers each comprising a sequence that is reverse-complement to the adaptor and/or hybridizable to the adaptor. In any of the preceding embodiments, the method can further comprise amplifying and/or purifying the library of linear, double-stranded ligation products. In any of the preceding embodiments, the method herein can comprise amplifying the library of linear, double-stranded ligation products, e.g., by a polymerase chain reaction (PCR), using a primer or a set of primers each comprising a sequence that is reverse-complement to the adaptor and/or hybridizable to the adaptor, and a primer hybridizable to the target sequence (e.g., an EGFR gene sequence), thereby obtaining an amplified library of linear, double-stranded ligation products comprising sequence information of the target sequence. In any of the preceding embodiments, the method can further comprise sequencing the amplified library of linear, double-stranded ligation products. Thus, the methylation status and/or genetic variant analysis of one or more target sequences can be performed using semi-targeted amplification of the single-stranded library.
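Because each adaptor carries a UMI that earmarks one original single-stranded polynucleotide, reads sharing a UMI and a target can be collapsed to count original molecules rather than amplification copies. The disclosure does not spell out this counting step; the sketch below is a common downstream use of UMIs, offered as an assumption, with hypothetical names and read data and with no allowance for sequencing errors within the UMI.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count distinct UMIs per target.

    reads: iterable of (umi, target) tuples, one per sequencing read.
    Returns {target: number_of_distinct_umis}.
    """
    umis_by_target = defaultdict(set)
    for umi, target in reads:
        umis_by_target[target].add(umi)
    return {target: len(umis) for target, umis in umis_by_target.items()}


# Hypothetical reads: two PCR copies of one EGFR molecule, a second EGFR
# molecule, and one SDC2 molecule that happens to share a UMI sequence.
reads = [("AACG", "EGFR"), ("AACG", "EGFR"), ("TTGA", "EGFR"), ("AACG", "SDC2")]
print(count_unique_molecules(reads))  # {'EGFR': 2, 'SDC2': 1}
```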

[00124] The target sequence(s) for methylation analysis and the target sequence(s) for variant detection can be on the same molecule or on different molecules, for example, two different DNA fragments, in the sample. In one aspect, the target polynucleotide sequences can be on the same gene. In another aspect, the first target polynucleotide sequence can be in a coding region of a gene whereas the second target polynucleotide sequence can be in a non-coding and/or regulatory region of or for the same gene. In another aspect, the target polynucleotide sequences can be on different genes. In one aspect, the genes function in the same biological pathway or network. In another aspect, the target polynucleotide sequences can be on the same or different chromosomes (for example, as shown in Table 3) or on the same or different extrachromosomal DNA molecules (such as mitochondrial DNA), or one on a chromosome and the other on an extrachromosomal DNA molecule.

[00125] In summary, one aspect of the present disclosure is an integrated method for simultaneous detection of both a genomic variance and quantification of a DNA methylation state/status on one or more (e.g., hundreds of thousands of) targets, without splitting the limited materials for two different workflows.

E. Kits.

[00126] Disclosed in another aspect herein is a kit, comprising: a first primer set for amplifying a first target polynucleotide sequence in a sample; and/or a second primer set for analyzing a methylation status of a second target polynucleotide sequence in the sample, wherein the methylation status is of a residue in the second target polynucleotide sequence, and one primer of the second primer set hybridizes to the uncleaved second target polynucleotide sequence and together with another primer in the set, amplifies the uncleaved sequence but not the second target polynucleotide sequence cleaved at the residue by the MSRE. In one embodiment, the kit further comprises an MSRE, and the MSRE selectively cleaves at a residue when it is unmethylated or selectively cleaves at the residue when it is methylated. In one embodiment, the MSRE is selected from the group consisting of HpaII, SalI, SalI-HF®, ScrFI, BbeI, NotI, SmaI, XmaI, MboI, BstBI, ClaI, MluI, NaeI, NarI, PvuI, SacII, HhaI, and any combination thereof.

[00127] In any of the preceding embodiments, the first primer set of the kit can comprise one or more primers for a gene selected from the group consisting of ABCB1, CYP2C19, CYP2C8, CYP2D6, CYP3A4, CYP3A5, DPYD, GSTP1, MTHFR, NQO1, RHEB, SULT1A1, UGT1A1, MPL, JAK1, NRAS, DDR2, PTEN, FGFR2, HRAS, ATM, CBL, KRAS, ERBB3, CDK4, HNF1A, FLT3, RB1, AKT1, IDH2, CDH1, TP53, ERBB2, STAT3, SMAD4, STK11, GNA11, JAK3, PPP2R1A, RET, DNMT3A, ALK, NFE2L2, SF3B1, PIK3CA, ERBB4, GNAS, U2AF1, SLC19A1, SMARCB1, CHEK2, VHL, RAF1, CTNNB1, PDGFRA, KIT, KDR, FBXW7, APC, NEUROG1, CSF1R, NPM1, TPMT, EGFR, MET, SMO, BRAF, EZH2, FGFR1, JAK2, CDKN2A, PAX5, PTCH1, ABL1, NOTCH1, ARAF, MED12, BTK, and any combination thereof.

[00128] In any of the preceding embodiments, the first primer set of the kit can comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 61-788, or any combination thereof.

[00129] In any of the preceding embodiments, the second primer set of the kit can comprise one or more primers for a gene selected from the group consisting of NDRG4, SEPT, MLH1, WNT5A, AGTR1, BMP3, SFRP2, NEUROG1, TFPI2, SDC2, and any combination thereof.

[00130] In any of the preceding embodiments, the second primer set of the kit can comprise, consist essentially of, or consist of a sequence set forth in SEQ ID NOs: 1-60, or any combination thereof.

[00131] Diagnostic kits based on the kit components described above are also provided, and they can be used to diagnose a disease or condition in a subject, for example, cancer. In another aspect, the kit can be used to predict an individual's response to a drug, therapy, treatment, or a combination thereof. Such test kits can include devices and instructions that a subject can use to obtain a sample, e.g., of ctDNA, without the aid of a health care provider.

[00132] For use in the applications described or suggested above, kits or articles of manufacture are also provided. Such kits may comprise at least one reagent specific for genotyping a marker for a disease or condition, and may further include instructions for carrying out a method described herein.

[00133] In some embodiments, provided herein are compositions and kits comprising primers and primer pairs, which allow the specific amplification of the polynucleotides or of any specific parts thereof, and probes that selectively or specifically hybridize to nucleic acid molecules or to any part thereof for the purpose of detection, either qualitatively or quantitatively. Probes may be labeled with a detectable marker, such as, for example, a radioisotope, fluorescent compound, bioluminescent compound, chemiluminescent compound, metal chelator or enzyme. Such probes and primers can be used to detect the presence of polynucleotides in a sample and as a means for detecting cells expressing proteins encoded by the polynucleotides. As will be understood by the skilled artisan, a great many different primers and probes may be prepared based on the sequences provided herein and used effectively to amplify, clone and/or determine the presence and/or levels of polynucleotides, such as genomic DNAs, mtDNAs, and fragments thereof.

[00134] In some embodiments, the kit may additionally comprise reagents for detecting presence of polypeptides. Such reagents may be antibodies or other binding molecules that specifically bind to a polypeptide. In some embodiments, such antibodies or binding molecules may be capable of distinguishing a structural variation to the polypeptide as a result of polymorphism, and thus may be used for genotyping. The antibodies or binding molecules may be labeled with a detectable marker, such as, for example, a radioisotope, fluorescent compound, bioluminescent compound, chemiluminescent compound, metal chelator or enzyme. Other reagents for performing binding assays, such as ELISA, may be included in the kit.

[00135] In some embodiments, the kits comprise reagents for genotyping at least two, at least three, at least five, at least ten, or more markers. The markers may be a polynucleotide marker (such as a cancer-associated mutation or SNP) or a polypeptide marker (such as overexpression or a post-translational modification, including hyper- or hypo-phosphorylation, of a protein) or any combination thereof. In some embodiments, the kits may further comprise a surface or substrate (such as a microarray) for capture probes for detection of amplified nucleic acids.

[00136] The kits may further comprise a carrier means being compartmentalized to receive in close confinement one or more container means such as vials, tubes, and the like, each of the container means comprising one of the separate elements to be used in the method. For example, one of the container means may comprise a probe that is or can be detectably labeled. Such probe may be a polynucleotide specific for a biomarker. The kit may also have containers containing nucleotide(s) for amplification of the target nucleic acid sequence and/or a container comprising a reporter-means bound to a reporter molecule, such as an enzymatic, fluorescent, or radioisotope label.

[00137] The kit typically comprises the container(s) described above and one or more other containers comprising materials desirable from a commercial and user standpoint, including buffers, diluents, filters, needles, syringes, and package inserts with instructions for use. A label may be present on the container to indicate that the composition is used for a specific therapy or non-therapeutic application, and may also indicate directions for either in vivo or in vitro use, such as those described above.

[00138] The kit can further comprise a set of instructions and materials for preparing a tissue or cell or body fluid sample and preparing nucleic acids (such as ctDNA) from the sample.

H. Further exemplary embodiments

[00139] In any of the preceding embodiments, the ssDNA ligase can be a Thermus bacteriophage RNA ligase such as a bacteriophage TS2126 RNA ligase (e.g., CircLigase™ and CircLigase II™), or an archaebacterium RNA ligase such as Methanobacterium thermoautotrophicum RNA ligase 1. In other aspects, the ssDNA ligase is an RNA ligase, such as a T4 RNA ligase, e.g., T4 RNA ligase I, e.g., New England Biosciences, M0204S, T4 RNA ligase 2, e.g., New England Biosciences, M0239S, T4 RNA ligase 2 truncated, e.g., New England Biosciences, M0242S, T4 RNA ligase 2 truncated KQ, e.g., M0373S, or T4 RNA ligase 2 truncated K227Q, e.g., New England Biosciences, M0351S. In any of the preceding embodiments, the ssDNA ligase can also be a thermostable 5' App DNA/RNA ligase, e.g., New England Biosciences, M0319S, or T4 DNA ligase, e.g., New England Biosciences, M0202S.

[00140] In some embodiments, the present methods comprise ligating a set of adaptors to a library of single-stranded polynucleotides using a single-stranded DNA (ssDNA) ligase. Any suitable ssDNA ligase, including the ones disclosed herein, can be used. The adaptors can be used at any suitable level or concentration, e.g., from about 1 μM to about 100 μM, such as about 1 μM, 10 μM, 20 μM, 30 μM, 40 μM, 50 μM, 60 μM, 70 μM, 80 μM, 90 μM, or 100 μM, or any subrange thereof. The adapter can comprise or begin with any suitable sequences or bases. For example, the adapter sequence can begin with all 2 bp combinations of bases.

[00141] In some embodiments, the ligation reaction can be conducted in the presence of a crowding agent. In one aspect, the crowding agent comprises a polyethylene glycol (PEG), such as PEG 4000, PEG 6000, or PEG 8000, Dextran, and/or Ficoll. The crowding agent, e.g., PEG, can be used at any suitable level or concentration. For example, the crowding agent, e.g., PEG, can be used at a level or concentration from about 0% (w/v) to about 25% (w/v), e.g., at about 0% (w/v), 1% (w/v), 2% (w/v), 3% (w/v), 4% (w/v), 5% (w/v), 6% (w/v), 7% (w/v), 8% (w/v), 9% (w/v), 10% (w/v), 11% (w/v), 12% (w/v), 13% (w/v), 14% (w/v), 15% (w/v), 16% (w/v), 17% (w/v), 18% (w/v), 19% (w/v), 20% (w/v), 21% (w/v), 22% (w/v), 23% (w/v), 24% (w/v), or 25% (w/v), or any subrange thereof.

[00142] In some embodiments, the ligation reaction can be conducted for any suitable length of time. For example, the ligation reaction can be conducted for a time from about 2 to about 16 hours, e.g., for about 2 hours, 3 hours, 4 hours, 5 hours, 6 hours, 7 hours, 8 hours, 9 hours, 10 hours, 11 hours, 12 hours, 13 hours, 14 hours, 15 hours, or 16 hours, or any subrange thereof.

[00143] In some embodiments, the ssDNA ligase in the ligation reaction can be used in any suitable volume. For example, the ssDNA ligase in the ligation reaction can be used at a volume from about 0.5 μl to about 2 μl, e.g., at about 0.5 μl, 0.6 μl, 0.7 μl, 0.8 μl, 0.9 μl, 1 μl, 1.1 μl, 1.2 μl, 1.3 μl, 1.4 μl, 1.5 μl, 1.6 μl, 1.7 μl, 1.8 μl, 1.9 μl, or 2 μl, or any subrange thereof.

[00144] In some embodiments, the ligation reaction can be conducted in the presence of a ligation enhancer, e.g., betaine. The ligation enhancer, e.g., betaine, can be used at any suitable volume, e.g., from about 0 μl to about 1 μl, e.g., at about 0 μl, 0.1 μl, 0.2 μl, 0.3 μl, 0.4 μl, 0.5 μl, 0.6 μl, 0.7 μl, 0.8 μl, 0.9 μl, 1 μl, or any subrange thereof.

[00145] In some embodiments, the ligation reaction can be conducted using a T4 RNA ligase I, e.g., the T4 RNA ligase I from New England Biosciences, M0204S, in the following exemplary reaction mix (20 μl): 1X Reaction Buffer (50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 1 mM DTT), 25% (wt/vol) PEG 8000, 1 mM hexammine cobalt chloride (optional), 1 μl (10 units) T4 RNA Ligase, and 1 mM ATP. The reaction can be incubated at 25°C for 16 hours. The reaction can be stopped by adding 40 μl of 10 mM Tris-HCl pH 8.0, 2.5 mM EDTA.

[00146] In some embodiments, the ligation reaction can be conducted using a Thermostable 5' App DNA/RNA ligase, e.g., the Thermostable 5' App DNA/RNA ligase from New England Biosciences, M0319S, in the following exemplary reaction mix (20 μl): ssDNA/RNA Substrate 20 pmol (1 pmol/μl), 5' App DNA Oligonucleotide 40 pmol (2 pmol/μl), 10X NEBuffer 1 (2 μl), 50 mM MnCl2 (for ssDNA ligation only) (2 μl), Thermostable 5' App DNA/RNA Ligase (2 μl (40 pmol)), and Nuclease-free Water (to 20 μl). The reaction can be incubated at 65°C for 1 hour. The reaction can be stopped by heating at 90°C for 3 minutes.

[00147] In some embodiments, the ligation reaction can be conducted using a T4 RNA ligase 2, e.g., the T4 RNA ligase 2 from New England Biosciences, M0239S, in the following exemplary reaction mix (20 μl): T4 RNA ligase buffer (2 μl), enzyme (1 μl), PEG (10 μl), DNA (1 μl), Adapter (2 μl), and water (4 μl). The reaction can be incubated at 25°C for 16 hours. The reaction can be stopped by heating at 65°C for 20 minutes.

[00148] In some embodiments, the ligation reaction can be conducted using a T4 RNA ligase 2 Truncated, e.g., the T4 RNA ligase 2 Truncated from New England Biosciences, M0242S, in the following exemplary reaction mix (20 μl): T4 RNA ligase buffer (2 μl), enzyme (1 μl), PEG (10 μl), DNA (1 μl), Adapter (2 μl), and water (4 μl). The reaction can be incubated at 25°C for 16 hours. The reaction can be stopped by heating at 65°C for 20 minutes.

[00149] In some embodiments, the ligation reaction can be conducted using a T4 RNA ligase 2 Truncated K227Q, e.g., the T4 RNA ligase 2 Truncated K227Q from New England Biosciences, M0351S, in the following exemplary reaction mix (20 μl): T4 RNA ligase buffer (2 μl), enzyme (1 μl), PEG (10 μl), DNA (1 μl), Adenylated Adapter (0.72 μl), and water (5.28 μl). The reaction can be incubated at 25°C for 16 hours. The reaction can be stopped by heating at 65°C for 20 minutes.

[00150] In some embodiments, the ligation reaction can be conducted using a T4 RNA ligase 2 Truncated KQ, e.g., the T4 RNA ligase 2 Truncated KQ from New England Biosciences, M0373S, in the following exemplary reaction mix (20 μl): T4 RNA ligase buffer (2 μl), enzyme (1 μl), PEG (10 μl), DNA (1 μl), Adenylated Adapter (0.72 μl), and water (5.28 μl). The reaction can be incubated at 25°C for 16 hours. The reaction can be stopped by heating at 65°C for 20 minutes.

[00151] In some embodiments, the ligation reaction can be conducted using a T4 DNA ligase, e.g., the T4 DNA ligase from New England Biosciences, M0202S, in the following exemplary reaction mix (20 μl): T4 RNA ligase buffer (2 μl), enzyme (1 μl), PEG (10 μl), DNA (1 μl), Adenylated Adapter (0.72 μl), and water (5.28 μl). The reaction can be incubated at 16°C for 16 hours. The reaction can be stopped by heating at 65°C for 10 minutes.
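As a non-limiting illustration, the per-reaction volumes of the 20 μl ligation mix with adenylated adapter can be checked and scaled to a master mix for several reactions. The sketch below is only an example: the function name `master_mix`, the 10% pipetting overage, and the choice to add the DNA to each tube separately are assumptions made for illustration and are not part of the disclosure.

```python
# Per-reaction volumes in microliters for the 20 ul mix with adenylated
# adapter, copied from the exemplary reaction mix described above.
PER_REACTION_UL = {
    "T4 RNA ligase buffer": 2.0,
    "enzyme": 1.0,
    "PEG": 10.0,
    "DNA": 1.0,
    "adenylated adapter": 0.72,
    "water": 5.28,
}


def master_mix(per_reaction, n_reactions, overage=0.10, per_sample=("DNA",)):
    """Scale the shared components for n reactions plus a pipetting overage.

    The 10% overage and the decision to exclude DNA (added per tube) are
    illustrative assumptions, not values taken from the source text.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in per_reaction.items()
        if name not in per_sample
    }


# The listed components should add up to the stated 20 ul reaction volume.
total_ul = round(sum(PER_REACTION_UL.values()), 2)

# Shared components for 8 reactions (DNA is left out and added per tube).
mix_for_8 = master_mix(PER_REACTION_UL, n_reactions=8)

if __name__ == "__main__":
    print("per-reaction total (ul):", total_ul)
    for name, volume in mix_for_8.items():
        print(f"{name}: {volume} ul")
```

Summing the listed volumes is a quick consistency check on any of the exemplary mixes before scaling them.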

[00152] The second strand synthesis step can be conducted using any suitable enzyme. For example, the second strand synthesis step can be conducted using Bst polymerase, e.g., New England Biosciences, M0275S or Klenow fragment (3'->5' exo-), e.g., New England Biosciences, M0212S.

[00153] In some embodiments, the second strand synthesis step can be conducted using Bst polymerase, e.g., New England Biosciences, M0275S, in the following exemplary reaction mix (10 μl): water (1.5 μl), primer (0.5 μl), dNTP (1 μl), ThermoPol Reaction buffer (5 μl), and Bst (2 μl). The reaction can be incubated at 62°C for 2 minutes and at 65°C for 30 minutes. After the reaction, the double stranded DNA molecules can be further purified.

[00154] In some embodiments, the second strand synthesis step can be conducted using Klenow fragment (3'->5' exo-), e.g., New England Biosciences, M0212S, in the following exemplary reaction mix (10 μl): water (0.5 μl), primer (0.5 μl), dNTP (1 μl), NEB buffer (2 μl), and Klenow exo- (3 μl). The reaction can be incubated at 37°C for 5 minutes and at 75°C for 20 minutes. After the reaction, the double stranded DNA molecules can be further purified.

[00155] After the second strand synthesis, but before the first or semi-targeted PCR, the double stranded DNA can be purified. The double stranded DNA can be purified using any suitable technique or procedure. For example, the double stranded DNA can be purified using any of the following kits: Zymo clean and concentrator, Zymo Research, D4103; Qiaquick, Qiagen, 28104; Zymo ssDNA purification kit, Zymo Research, D7010; Zymo Oligo purification kit, Zymo Research, D4060; and AmpureXP beads, Beckman Coulter, A63882: 1.2x- x bead ratio.

[00156] The first or semi-targeted PCR can be conducted using any suitable enzyme or reaction conditions. For example, the polynucleotides or DNA strands can be annealed at a temperature ranging from about 52°C to about 72°C, e.g., at about 52°C, 53°C, 54°C, 55°C, 56°C, 57°C, 58°C, 59°C, 60°C, 61°C, 62°C, 63°C, 64°C, 65°C, 66°C, 67°C, 68°C, 69°C, 70°C, 71°C, or 72°C, or any subrange thereof. The first or semi-targeted PCR can be conducted for any suitable number of cycles. For example, the first or semi-targeted PCR can be conducted for 10-40 cycles, e.g., for 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 cycles. The primer pool can be used at any suitable concentration. For example, the primer pool can be used at a concentration ranging from about 5 nM to about 200 nM, e.g., at about 5 nM, 6 nM, 7 nM, 8 nM, 9 nM, 10 nM, 20 nM, 30 nM, 40 nM, 50 nM, 60 nM, 70 nM, 80 nM, 90 nM, 100 nM, 110 nM, 120 nM, 130 nM, 140 nM, 150 nM, 160 nM, 170 nM, 180 nM, 190 nM, or 200 nM, or any subrange thereof.

[00157] The first or semi-targeted PCR can be conducted using any suitable temperature cycle conditions. For example, the first or semi-targeted PCR can be conducted using any of the following cycle conditions: 95°C 3 minutes, (95°C 15 seconds, 62°C 30 seconds, 72°C 90 seconds) x3 or x5; or (95°C 15 seconds, 72°C 90 seconds) x23 or x21, 72°C 1 minute, 4°C forever.
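As a non-limiting illustration, a cycling program of this kind can be written down as data and its total programmed hold time computed. The sketch below assumes one possible reading of the conditions above, namely 3 three-step cycles followed by 23 two-step cycles; that reading, the data layout, and the function names are assumptions for illustration only. Ramp times and the final 4°C hold are not counted.

```python
# Each stage has a repeat count and a list of (temperature in C, seconds).
# Assumed reading of the text: 3 three-step cycles, then 23 two-step cycles.
PROGRAM = [
    {"repeats": 1, "steps": [(95, 180)]},                     # initial denaturation
    {"repeats": 3, "steps": [(95, 15), (62, 30), (72, 90)]},  # three-step cycles
    {"repeats": 23, "steps": [(95, 15), (72, 90)]},           # two-step cycles
    {"repeats": 1, "steps": [(72, 60)]},                      # final extension
]


def total_hold_seconds(program):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return sum(
        stage["repeats"] * sum(seconds for _, seconds in stage["steps"])
        for stage in program
    )


def total_cycles(program):
    """Number of amplification cycles (stages with more than one step)."""
    return sum(stage["repeats"] for stage in program if len(stage["steps"]) > 1)


if __name__ == "__main__":
    print("cycles:", total_cycles(PROGRAM))
    print("programmed hold time (min):", total_hold_seconds(PROGRAM) / 60)
```

Under this reading, the x3 with x23 pairing and the x5 with x21 pairing both give 26 amplification cycles in total.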

[00158] In some embodiments, the first or semi-targeted PCR can be conducted using KAPA SYBR FAST, e.g., KAPA Biosciences, KK4600, in the following exemplary reaction mix (50 μl): DNA (2 μl), KAPA SYBR (25 μl), Primer Pool (26 nM each) (10 μl), Aprimer (100 μM) (0.4 μl), and water (12.6 μl). The first or semi-targeted PCR can be conducted using any of the following cycle conditions: 95°C 30 seconds, (95°C 10 seconds, 50-56°C 45 seconds, 72°C 35 seconds) x40.
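As a non-limiting illustration, the final concentration of each primer component in the 50 μl reaction follows from the dilution relation C1·V1 = C2·V2. The sketch below assumes that "26 nM each" and "100 μM" are the concentrations of the stock solutions added to the reaction rather than final concentrations; the source does not state which is meant, and the function name is hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after dilution, in the same unit as stock_conc.

    Applies C1 * V1 = C2 * V2, solved for C2.
    """
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume_ul / total_volume_ul


# Assumption: 26 nM (each primer) and 100 uM are stock concentrations.
primer_pool_final_nM = final_concentration(26.0, 10.0, 50.0)
a_primer_final_uM = final_concentration(100.0, 0.4, 50.0)

if __name__ == "__main__":
    print(f"each pool primer: {primer_pool_final_nM:.1f} nM")
    print(f"A primer: {a_primer_final_uM:.1f} uM")
```

If the stated values are instead final concentrations, the same relation can be solved for the stock concentration.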

[00159] In some embodiments, the first or semi-targeted PCR can be conducted using KAPA HiFi, e.g., KAPA Biosciences, KK2601, in the following exemplary reaction mix (50 μl): DNA (15 μl), KAPA HiFi (25 μl), Primer Pool (26 nM each) (10 μl), and Aprimer (100 μM) (0.4 μl). The first or semi-targeted PCR can be conducted using any of the following cycle conditions: 95°C 3 minutes, (98°C 20 seconds, 53-54°C 15 seconds, 72°C 35 seconds) x15, 72°C 2 minutes, 4°C forever.

[00160] Bisulfite conversion can be conducted using any suitable techniques, procedures or reagents. In some embodiments, bisulfite conversion can be conducted using any of the following kits and procedures provided in the kit: EpiMark Bisulfite Conversion Kit, New England Biosciences, E3318S; EZ DNA Methylation Kit, Zymo Research, D5001; MethylCode Bisulfite Conversion Kit, Thermo Fisher Scientific, MECOV50; EZ DNA Methylation Gold Kit, Zymo Research, D5005; EZ DNA Methylation Direct Kit, Zymo Research, D5020; EZ DNA Methylation Lightning Kit, Zymo Research, D5030T; EpiJET Bisulfite Conversion Kit, Thermo Fisher Scientific, K1461; or EpiTect Bisulfite Kit, Qiagen, 59104.

[00161] In some embodiments, DNA molecules can be prepared using the procedures illustrated in Example 4, including the steps for constructing a single-stranded polynucleotide library, conversion of the single-stranded polynucleotide library to a double-stranded polynucleotide library, semi-targeted amplification of the double-stranded polynucleotide library, and construction of a sequencing library. Such DNA molecules can further be analyzed for methylation status using any suitable methods or procedures.

I. Examples

Example 1

[00162] In this example, 24 CpG sites that overlap with the HpaII recognition motif in the promoters of ten genes (AGTR1, BMP3, MLH1, NDRG4, NEUROG1, SDC2, SEPT, SFRP2, TFPI2, WNT5A) were selected. An AmpliSeq customized primer set was designed to cover these methylation targets, as well as 370 genomic regions that are commonly mutated in cancers.
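As a non-limiting illustration, candidate CpG sites that overlap the HpaII recognition motif (CCGG) can be located in an amplicon sequence, and a simple model can indicate whether the amplicon would remain intact after digestion. The sketch below is an assumption-laden example: the function names and the test sequence are hypothetical, and the model treats a site as protected whenever its central CpG cytosine is listed as methylated.

```python
import re


def hpaii_cpg_positions(seq):
    """0-based positions of the central CpG cytosine in each CCGG site.

    A lookahead is used so that overlapping matches would also be found.
    """
    seq = seq.upper()
    return [match.start() + 1 for match in re.finditer(r"(?=CCGG)", seq)]


def survives_digestion(seq, methylated_positions):
    """True if every CCGG site is protected by methylation at its CpG.

    Simplified model: HpaII cuts any CCGG whose central CpG cytosine is
    unmethylated, and a cut amplicon can no longer be amplified.
    """
    methylated = set(methylated_positions)
    return all(pos in methylated for pos in hpaii_cpg_positions(seq))


# Hypothetical amplicon with two CCGG sites, used only for illustration.
AMPLICON = "ATCCGGTTACCGGA"
SITES = hpaii_cpg_positions(AMPLICON)

if __name__ == "__main__":
    print("CpG positions within CCGG:", SITES)
    print("fully methylated survives:", survives_digestion(AMPLICON, SITES))
    print("partly methylated survives:", survives_digestion(AMPLICON, SITES[:1]))
```

An amplicon without any CCGG site is never cut in this model, which is why such amplicons can serve for variant detection on the digested DNA.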

[00163] Mixtures (1%, 5%, 10%, 20%, 50%) were created of fragmented genomic DNA from the cancer cell line HCT116, which is methylated at the 24 CpG sites, with genomic DNA from NA12878, which is unmethylated at all these sites. MSA-seq was performed on these mixtures in triplicate. The methylation measurements show high correlation (average correlation coefficient R=0.983) and linearity with the expected values (FIG. 2). FIG. 3 shows MSMC-Seq quantified CpG methylation for tumor clustering. This method of unbiased hierarchical clustering separates the tumor samples into 3 groups based on methylation biomarker level/status: Group A, Group B, and the group in between A and B.
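As a non-limiting illustration, the agreement between expected and measured methylation in such a dilution series can be summarized with a Pearson correlation coefficient. In the sketch below the expected values are the HCT116 fractions of the mixtures (assuming the methylation fraction equals the mixing fraction), while the measured values are made-up numbers used only to exercise the function; they are not data from this disclosure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Expected methylation equals the HCT116 fraction of each mixture.
EXPECTED = [0.01, 0.05, 0.10, 0.20, 0.50]
# Hypothetical measurements for one CpG site, for illustration only.
MEASURED = [0.02, 0.06, 0.09, 0.22, 0.48]

r_value = pearson_r(EXPECTED, MEASURED)

if __name__ == "__main__":
    print(f"R = {r_value:.3f}")
```

An average over sites, as reported above, would be obtained by applying the same function to each CpG site and averaging the results.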

[00164] Exemplary primer pairs used are listed in Table 1 below.

[00167] Table 3 lists the chromosome location and starting and ending positions of the genes for methylation analysis and variant detection.

[00168] To demonstrate the feasibility of quantifying DNA methylation and identifying genetic variants on tumor samples, MSA-seq was applied to 10 pairs of tumor and adjacent normal tissues from colorectal cancer (CRC) patients.

[00169] With 20 ng of FFPE input DNA per sample, the DNA methylation levels of the 24 promoter CpG sites on the ten genes were quantified, and the ten tumor samples were classified into two distinct groups: the first group is highly methylated for SEPT, AGTR1, SDC2, SFRP2 and TFPI2, whereas the second group is also highly methylated on additional genes such as WNT5A, MLH1 and BMP3. With the same data set, 0-12 somatic mutations in each of the 10 tumor samples were also identified (Table 4).

[00170] All 28 mutations were detected in a single reaction on the HpaII-digested DNA, without the need for a separate reaction on undigested DNA.

[00171] Table 4. Somatic mutations identified in 10 CRC tumor samples.

[00172] A customized AmpliSeq primer panel was designed using the Ion AmpliSeq Designer tool available at ampliseq.com, and purchased from ThermoFisher Scientific. For the purpose of method calibration, genomic DNAs from the cell lines HCT116 and NA12878 were fragmented by Bioruptor. A series of synthetic DNA mixtures was prepared containing HCT116 at 0%, 1%, 5%, 10%, 20% and 50%. In each reaction, 10 ng of DNA mixture was digested with NEB MspI/HpaII at 37°C for 4 hours, purified with AMPure beads, and processed with the AmpliSeq amplification and Ion library preparation protocol with slight modifications in volume. Ten tumor samples derived from colorectal cancer patients underwent the same procedure as paired digested and undigested reactions to calibrate the background. The resulting sequencing libraries were sequenced on an Ion PGM/S5 sequencer. Mutation calling was performed with Torrent Suite. CpG methylation levels were calculated from the amplicon read depth data using customized Perl/Python scripts.
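As a non-limiting illustration of how a methylation level might be derived from amplicon read depth, the sketch below compares a target amplicon in the digested and undigested libraries after normalizing each to control amplicons that lack the enzyme's recognition site. This normalization scheme, the function name, and the read counts are assumptions made for illustration; the customized scripts actually used are not described in the source.

```python
def methylation_level(target_digested, control_digested,
                      target_undigested, control_undigested):
    """Estimate the methylated fraction of a target CpG from read depths.

    Assumed model: only methylated (uncut) templates are amplified in the
    digested library, so the target depth relative to site-free control
    amplicons drops in proportion to the unmethylated fraction. The
    undigested library supplies the baseline ratio. The result is clipped
    to the range 0..1 to absorb sampling noise.
    """
    if min(control_digested, target_undigested, control_undigested) <= 0:
        raise ValueError("control and undigested depths must be positive")
    if target_digested < 0:
        raise ValueError("read depth cannot be negative")
    digested_ratio = target_digested / control_digested
    baseline_ratio = target_undigested / control_undigested
    return max(0.0, min(1.0, digested_ratio / baseline_ratio))


if __name__ == "__main__":
    # Hypothetical read depths, for illustration only.
    level = methylation_level(
        target_digested=500, control_digested=1000,
        target_undigested=1000, control_undigested=1000,
    )
    print(f"estimated methylation: {level:.2f}")
```

Normalizing to control amplicons in both libraries is one way to make the estimate insensitive to differences in total sequencing depth between the digested and undigested reactions.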