

Title:
METHOD FOR DETECTING CANCER USING 5-HYDROXYMETHYLCYTOSINE (5-HMC)
Document Type and Number:
WIPO Patent Application WO/2023/137294
Kind Code:
A1
Abstract:
Disclosed herein is a method which includes extracting genomic deoxyribonucleic acid (DNA) at locations at or near cancer hotspots from a subject, modifying Tier-1 5hmC on the DNA to a modified 5hmC, detecting and identifying the presence or absence of the modified 5hmC, quantifying the detected and identified modified 5hmC; and providing a report comprising a score, wherein the score is indicative of the presence of cancer.

Inventors:
LU YABIN (US)
JINGANG LU MICHAEL (US)
Application Number:
PCT/US2023/060434
Publication Date:
July 20, 2023
Filing Date:
January 11, 2023
Assignee:
LU YABIN (US)
International Classes:
C12Q1/6886
Foreign References:
US20210108274A12021-04-15
Attorney, Agent or Firm:
TANKHA, Ashok (US)
Claims:
CLAIMS

We claim:

1. A method comprising: extracting genomic deoxyribonucleic acid (DNA) from locations at or near cancer hotspots from a human; modifying specific Tier-1 5-hydroxymethylcytosine (5hmC) on the DNA to a modified specific Tier-1 5hmC; detecting and identifying presence or absence of the modified specific Tier-1 5hmC; quantifying the detected and identified modified specific Tier-1 5hmC; and providing a report comprising a score, wherein the score is indicative of the likelihood of a status, a degree, or a severity of the risk of cancer, wherein the specific Tier-1 exist in cancer cell lines, in transformed and immortalized cells.

2. The method according to claim 1, wherein extracting genomic deoxyribonucleic acid (DNA) comprises pre-extracting the genomic deoxyribonucleic acid (DNA) from a tissue or plasma of the human.

3. The method according to claim 1, wherein modifying 5hmC comprises modifying 5hmC into a derivative of a nitrogenous base or a non-nitrogenous base which is a different base from unmodified 5hmC.

4. The method according to claim 1, wherein the modified 5hmC is a derivative of a cytosine (C) or a thymine (T), which is a base different from the unmodified 5hmC.

5. The method according to claim 1, wherein the specific Tier-1 sites exist in cancerous cells in organs of the human.


6. The method according to claim 1, wherein the specific Tier-1 sites exist in colon of the human.

7. The method according to claim 1, wherein the modifying 5hmC comprises modifying 5hmC with chemical reactions, and wherein the chemical reactions comprise oxidation or reduction reaction.

8. The method according to claim 1, wherein the modifying 5hmC comprises modifying 5hmC with enzymatic reactions.

9. The method according to claim 1, wherein the modifying 5hmC comprises modifying 5hmC with an oxidation agent or a reduction agent.

10. The method according to claim 9, wherein the oxidation agent is potassium perruthenate (KRuO4), manganese dioxide (MnO2), potassium permanganate (KMnO4), tetrapropylammonium perruthenate (TPAP), tetrabutylammonium perruthenate (TBAP), polymer-supported perruthenate (PSP), tetraphenylphosphonium ruthenate, or a combination thereof.

11. The method according to claim 9, wherein the reduction agent is pic-borane, pyridine borane, sodium borohydride (NaBH4), sodium cyanoborohydride (NaBH3CN), lithium borohydride (LiBH4), or a combination thereof.

12. The method according to claim 1, wherein detecting and identifying comprises replicating, amplifying, or copying the DNA region having the modified specific Tier-1 5hmC by DNA or RNA polymerase.

13. The method according to claim 12, wherein detecting and identifying further comprises identifying the 5hmC and its location recognized by the DNA or RNA polymerase as a different deoxyribonucleotide on its complementary strand.


14. The method according to claim 13, wherein the different deoxyribonucleotide is cytosine (C) mutated to thymine (T), or a Guanine (G) mutated to Adenine (A) on the complementary strand.

15. The method according to claim 1, wherein identifying, and detecting comprises capturing, sequestering and enriching DNA fragments having a base pair less than 1000 from a tissue or a cell of the human, monoclonal antibodies, or polyclonal antibodies, wherein the DNA fragments have a specific affinity in binding to 5hmC.

16. The method according to claim 1, wherein quantifying the detected and identified modified specific Tier-1 5hmC comprises one or more of quantifying or recording the quantity of the occurrence of 5hmC.

17. The method according to claim 1, wherein the at or near region of the hotspot includes DNA with sequence 40 base pair upstream and 40 base pair downstream of the hotspot.

18. The method according to claim 1, wherein the method further comprises using a reference material, wherein the reference material is a primary standard, a secondary standard, a calibrator, a quality control, a validation sample.

19. The method according to claim 17, wherein the method comprises using the reference material for the specific Tier-1 5hmC at cancer hotspots and nearby region as a part of the reference DNA sequence composition for detection, and quantification.

Description:
METHOD FOR DETECTING CANCER USING 5-HYDROXYMETHYLCYTOSINE (5-hmC)

CROSS REFERENCE TO RELATED APPLICATION

[0001] This international application claims priority to and the benefit of the non-provisional patent application titled “Method For Detecting Cancer Using 5-Hydroxymethylcytosine (5-hmC)”, application number 17/961,571, filed in the United States Patent and Trademark Office on October 07, 2022, which is a continuation-in-part of the non-provisional patent application titled “Method For Detecting Cancer Using 5-Hydroxymethylcytosine (5-hmC)”, application number 17/577,033, filed in the United States Patent and Trademark Office on January 17, 2022. The specifications of the above referenced patent applications are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

[0002] The present disclosure relates to a method for detecting cancer. More particularly, it relates to a method for detecting, screening or predicting a likelihood of cancer using specific genomic 5-hydroxymethylcytosine (5hmC) sites at or near cancer mutation hot spots.

BACKGROUND

[0003] Cancer is a major disease worldwide. Each year, tens of millions of people are diagnosed with cancer around the world, and more than half of the patients eventually die from it. In many countries, cancer ranks as the second most common cause of death after cardiovascular diseases. Early detection of cancer in a person improves cure rates and outcomes for many types of cancers.

[0004] Efforts in using mutation hotspots as cancer biomarkers have not been fully successful due to the fact that cancer is usually associated with many mutations. These hotspots often do not show up in the majority of cancer cases. No single hotspot is prevalent enough to be used as a universal sensitive cancer marker. Universal markers like methylated cytosine (5-methylcytosine or 5mC) and Tumor Mutation Burden (TMB) have been widely explored as simple markers. However, both markers still lack large-scale validation, precluding implementation in clinical practice.

[0005] Mammalian deoxyribonucleic acid (DNA) contains oxidized forms of 5-methylcytosine (5mC). The base 5-hydroxymethylcytosine (5hmC) is the most commonly occurring oxidation product. In one well-known mechanism, 5hmC is produced from 5mC in an enzymatic pathway involving three 5mC oxidases: Ten-eleven translocation 1 (TET1), TET2, and TET3. Formation of 5hmC from 5mC lowers the level of 5mC in the genome. The conversion of 5mC to 5hmC may be the first step in a pathway leading towards DNA demethylation. However, the biological role of 5hmC is still unclear, and there are conflicting reports on the effects of TET inhibition and suppressed hydroxymethylation (5hmC), such as promoting somatic cell reprogramming, increased expression of tumor suppressor genes, and reduced cholangiocarcinoma progression.

[0006] Studies on the functional role of 5hmC have focused heavily on changes in chromosome-wide global 5hmC density or concentration, on regulation of transcription in the promoter region, or on loss of 5hmC across many types of cancer. Unlike the uniform distribution of 5mC outside of the promoter regions, satellites, and repeat DNA sequences, 5hmC has distinct distributions across different functional regions, and its abundance varies across different tissues and cell types. Tissue type plays a dominant role in determining the distribution patterns of 5hmC. 5hmC is enriched primarily in the distal regulatory regions, gene bodies of actively expressing genes and promoters, indicating its connection with active transcription. Genome-wide analysis of 5mC has indicated the global hypo-methylation pattern in tumor tissues, whereas depletion of 5hmC has also been associated with the hyper-methylation of gene bodies in various cancers. Significant enrichment of 5hmC is observed in both tissue-specific and cancer-specific differentially methylated regions as compared with that of 5mC.

[0007] Using massively parallel sequencing, thousands of genes in which 5hmC is differentially expressed were studied simultaneously in pancreatic cancer patients. Hundreds of genes related to pancreatic development or cancer were found to carry many 5hmC sites. By measuring signal (“peaks”) from thousands of 5hmC sites together, “global” 5hmC profiles or patterns of increase or decrease were observed at the chromosomal level or at the level of clusters of gene sequences. For example, the size of the group was described as “log [counts per million (base pair)] on 320 genes, a subset of the 13,180 genes that exhibited a statistically significant (FDR=0.05) increase or decrease in 5hmC”. Even though sample genes and their genomic locations are listed based on filtering criteria, each gene was covered by a sequence of a few thousand base pairs, without pointing out specific, individual 5hmC sites. There is no identification of specific individual 5hmC sites linked to cancer or to hotspot mutations linked to cancer. Rather, it was assumed that the individual hydroxymethylation biomarkers may not have significant individual significance in the evaluation of a pancreatic lesion.

[0008] In our study, we demonstrated that, after chemical treatment to convert 5hmC to uracil (read as thymine in NGS), 5hmCs are detected within CpG islands located either at or near a cancer mutation hotspot (within an 80bp flanking region). 5hmC detected at these discrete CpG sites was present in a significantly greater proportion in cancer cells than in normal cells. The results showed that the 5hmCs detected at or near cancer mutation hotspots consist almost entirely of two characteristically distinct 5hmC groups: a Tier 1 group, comprising cytosine (C) residues that are 3 to 8-fold more likely to be detected as 5hmC in gDNA from tumor cells than from normal cells; and a Tier 2 group, in which the allele frequency (AF) of 5hmC detected is equal in normal and tumor cells. It was hypothesized that the Tier 1 group of 5hmC is associated with cancer cells and cancer hotspot formation. The 5hmC is an intermediate or precursor before the eventual C to T or G to A mutation. Unlike previous studies looking at the “global” 5hmC signals or patterns of 5hmC (as a group) across large chromosomal regions, this study is based on identified specific, individual 5hmC sites at or near known cancer hotspots that display higher 5hmC occurrence in cancer cells. Tier 1 sites, detected individually or in combination, can serve as specific markers for cancer. At Tier 2 5hmC sites, cancer and normal cells have similar levels of 5hmC, so Tier 2 sites are not useful as markers to distinguish cancer cells from normal cells.

[0009] The detection of these specifically selected, individual Tier-1 5hmC sites at or near hotspot CpG sites in cancer cells can be a more convenient, more direct, and more sensitive cancer detection method than analysing the methylation profile at the chromosomal level or from hundreds of sequences of entire genes.

[0010] Thus, there is a need for methods for detecting cancer using these specifically located 5hmCs directly at specific base (C or G) resolution.

SUMMARY OF THE INVENTION

[0011] A method is disclosed to detect risk of cancer. The method includes extracting genomic deoxyribonucleic acid (DNA) from locations at or near cancer hotspots from a subject, modifying the specific Tier-1 5-hydroxymethylcytosine (5hmC) on the DNA to a modified specific Tier-1 5hmC, detecting and identifying presence or absence of modified Tier-1 5hmC, quantifying the detected and identified modified specific Tier-1 5hmC, and providing a report comprising a score, wherein the score is indicative of the likelihood of a status, a degree, or a severity of the risk of cancer, wherein the specific Tier-1 exist in cancer cell lines, in transformed and immortalized cells.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] FIGS. 1A-1C and 2A-2D illustrate examples of individual 5hmC sites as specific cancer marker (Tier 1) or not as marker (Tier 2).

[0013] FIG. 3 illustrates average AF% of detected C>T (G>A) at hotspots before and after DNA treatment.

[0014] FIG. 4 illustrates 5hmC sites in tumor as percentage of 5hmC in normal at increasing AF cut-off.

[0015] FIG. 5 illustrates an example amplification plot from qPCR.

DETAILED DESCRIPTION OF THE INVENTION

[0016] Before explaining at least one embodiment of the disclosure in detail, it is to be understood that the disclosure is not necessarily limited in its application to the details set forth in the following description. The disclosure is capable of other embodiments, and of being practiced or carried out in various ways. Also, it is to be understood that the phraseology and terminology employed herein is for the purpose of description and should not be regarded as limiting.

[0017] As used herein, a cancer mutation hot spot is any single nucleotide having C-to-T or G-to-A substitution mutations reported in the literature that is associated with any cancer. A C>T or G>A change at a hotspot results in an amino acid change, such as ATM p.R337C, SMARCA4 p.T790M, IDH1 p.R137H, KRAS p.G12C, etc. By way of example, hotspots comprise the following (Table 1):

Table 1

*HGVS, human genome variation society, http://www.hgvs.org

†Reference sequence > Altered sequence

[0018] More examples include, but are not limited to, the hotspots identified in the following:

[0019] Cancer Discov. 2018 Feb; 8(2): 174-183 (Supplementary Material - Refer to Web version on PubMed Central for supplementary material); Database: The Journal of Biological Databases and Curation, 2020, 1-8; npj Genomic Medicine (2021) 6, Article number: 33; Computational and Structural Biotechnology Journal, Volume 18, 2020, Pages 3567-3576.

[0020] As used herein, the nearby region of the hotspot includes DNA with sequence 40 base pair upstream and 40 base pair downstream of the hotspot.
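The nearby-region definition above can be illustrated with a minimal Python sketch. It assumes 1-based inclusive genomic coordinates and the chromosome naming `chr2`; the names `Window`, `hotspot_window`, and `is_at_or_near` are hypothetical helpers introduced only for illustration and are not part of the disclosure. The example position is the ERBB4 R711C hotspot coordinate given later in this description.

```python
from typing import NamedTuple

FLANK_BP = 40  # 40 bp upstream and 40 bp downstream, per the definition above


class Window(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive (assumed convention)


def hotspot_window(chrom: str, pos: int, flank: int = FLANK_BP) -> Window:
    """Return the region spanning `flank` bp on each side of a hotspot base."""
    if pos < 1 or flank < 0:
        raise ValueError("pos must be >= 1 and flank must be >= 0")
    return Window(chrom, max(1, pos - flank), pos + flank)


def is_at_or_near(chrom: str, pos: int, window: Window) -> bool:
    """True if a cytosine position falls at or near the hotspot."""
    return chrom == window.chrom and window.start <= pos <= window.end


# Example: hotspot ERBB4 R711C at Chr2: 211623993
erbb4 = hotspot_window("chr2", 211623993)
print(erbb4, erbb4.end - erbb4.start + 1)
```

Under these assumptions the window covers the hotspot base plus 40 bases on each side, 81 positions in total.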

[0021] As used herein, Tier 1 5hmC are cytosine (C) residues that exhibit 3 to 8-fold more likelihood of becoming 5hmCs in genomic DNAs from tumor-cells than from normal-cells, and Tier 2 5hmC are sites that exhibit equal allele frequency of 5hmC in both normal and tumor-cells.
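As a hedged illustration of the Tier 1 and Tier 2 definitions, the following Python sketch classifies a single site from its tumor and normal allele frequencies. The three-fold threshold comes from the definition above; the tolerance used to call two frequencies "equal" (`equal_tol`), the detection floor (`floor`), and the function name `classify_site` are assumptions made for this example only.

```python
def classify_site(af_tumor: float, af_normal: float,
                  min_fold: float = 3.0,
                  equal_tol: float = 0.25,
                  floor: float = 0.001) -> str:
    """Classify a cytosine site from 5hmC allele frequencies (fractions 0..1).

    min_fold  : tumor/normal ratio at or above which a site is called Tier 1
    equal_tol : assumed tolerance around a ratio of 1.0 for calling Tier 2
    floor     : assumed detection floor, also used to avoid division by zero
    """
    if not (0.0 <= af_tumor <= 1.0 and 0.0 <= af_normal <= 1.0):
        raise ValueError("allele frequencies must lie between 0 and 1")
    if af_tumor < floor and af_normal < floor:
        return "not_detected"
    fold = max(af_tumor, floor) / max(af_normal, floor)
    if fold >= min_fold:
        return "tier1"
    if abs(fold - 1.0) <= equal_tol:
        return "tier2"
    return "unclassified"


print(classify_site(0.12, 0.03))  # four-fold higher in tumor
print(classify_site(0.05, 0.05))  # equal in tumor and normal
```

Sites that fall between the two definitions are reported as `unclassified` rather than forced into a tier, since the text does not say how such sites are handled.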

[0022] As used herein, the genomic DNA includes total or partial full-length or fragmented (i.e., cell-free DNA) genomic DNA isolated from any human tissues, including plasma.

[0023] The term “genome” generally refers to an entirety of an organism’s hereditary information. A genome can be encoded either in DNA or in RNA. A genome can comprise coding regions that code for proteins as well as non-coding regions. A genome can include the sequence of all chromosomes together in an organism. For example, the human genome has a total of 46 chromosomes. The sequence of all of these together constitutes a human genome.

[0024] The terms “subject” and “patient” are used interchangeably herein, and refer to an animal, for example, a human from whom cells can be obtained. The term “mammal” is intended to encompass a singular “mammal” and plural “mammals,” and includes, but is not limited to humans; primates such as apes, monkeys, orangutans, and chimpanzees; canids such as dogs and wolves; felids such as cats, lions, and tigers; equids such as horses, donkeys, and zebras; food animals such as cows, pigs, and sheep; ungulates such as deer and giraffes; rodents such as mice, rats, hamsters and guinea pigs; and bears. In some preferred embodiments, a mammal is a human.

[0025] The term “sample” as used herein relates to a material or mixture of materials, typically, although not necessarily, in liquid form, containing one or more analytes of interest. Nucleic acid samples may be complex in that they contain multiple different molecules that contain sequences. Genomic DNA from a mammal (e.g., mouse or human) are types of complex samples. Any sample containing nucleic acid, e.g., genomic DNA made from tissue culture cells or a sample of tissue, may be employed herein.

[0026] Using chemical oxidation and reduction techniques combined with Next Generation Sequencing (NGS), the present inventors explored the existence of 5hmC at cytosine and 5'-C-phosphate-G-3' (CpG) sites within the gene bodies of a group of oncogenes, especially at or near (e.g., within 40 base pairs) the known cancer mutation hotspots. The cancer mutation hotspot can be expressed as a single base on genomic DNA that is frequently observed to have a single nucleotide variant (SNV). The present inventors found that 5hmC does not exist randomly on all CpG sites of a gene, but rather on a small portion of the CpG sites or cytosine residues. It exists specifically at cytosine sites (mostly at cytosines in CpG islands) located right at or within 40 base pairs of a cancer mutation hotspot. Sometimes 5hmC occurs on a cytosine (C) that is not adjacent to a guanine (G). The results show the presence of two characteristically distinct 5hmC groups: a Tier 1 group, with 3 to 8-fold more 5hmCs detected in tumor-cell derived DNA than in normal-cell derived DNA; and a Tier 2 group, with equal allele frequency of 5hmC in normal-cell and tumor-cell derived DNA, at 5 CpG hotspot sites as well as 5 non-CpG hotspots. Significantly more Tier 1 group 5hmC sites are found at hotspots in either tumor cells or cell lines rendered immortal (by transforming agents such as SV40 T-antigen (Simian Vacuolating Virus 40 TAg)) than in healthy normal cells.

[0027] FIGS. 1A-1C and 2A-2D illustrate examples of individual 5hmC sites as specific cancer marker (Tier 1) or not as marker (Tier 2), i.e., Tier 1 and Tier 2 Group 5hmCs at or near hotspots.

[0028] In particular, FIGS. 1A-1C are a representative Tier 1 observation (arrow 102) at cancer mutation hotspot ERBB4 R711C (Chr2: 211623993).

[0029] In FIGS. 1A-1C, the top half of each plot shows the untreated group, i.e., the background level of hotspot mutation. The bottom half shows the treated group. Y-Axis: Allele Frequency (AF) shown as fraction of C>T mutation at all observed hotspots. X-Axis: genome coordinates of all nearby CpG sites. Vertical Arrows: Hotspot Location (Vertical arrows 101: Wildtype C or G; Vertical arrows 102: Mutation T or A); Arrows 103 (Tier 2 site): 5hmC detected in DNA from all cells; and Arrows 104 (Tier 1 site near hotspot): 5hmC detected in cancerous cells at non-hotspot CpG sites.

[0030] FIGS. 2A-2D illustrate several more examples of the Tier 1 group. In FIGS. 2A-2D, Arrows 202 identify location of Tier 1 5hmC at or near cancer mutation hotspots; Y-Axis: Allele Frequency (AF) shown as fraction of C>T mutation at all observed hotspots; X-Axis: genome coordinates of nearby CpG sites; Arrows 203 identify location of Tier 1 5hmC not at hotspot; and Arrows 204 identify location of Tier 2 5hmC detected in DNA from all cells.

[0031] Allele frequencies (AF%) of detected 5hmC at each cancer mutation hotspot (17 Hotspots) after the treatment are shown in Table 2 and Table 3. Examples of Tier 1 group 5hmC at cancer mutation hotspots (>8% are in bold) are listed in Table 2. Examples of Tier 2 group 5hmC at cancer mutation hotspots (>8% are in bold) are listed in Table 3.

Table 2

Table 3

[0032] DNAs from normal cells (PBMC) and the two cancerous/tumor cells are compared. Both base C and G of the CpG are checked. AFs higher than 8% are shown in bold and those between 4% and 8% can also be noted. In cancerous cells, most CpG hotspot sites have both the C and G in the CpG island mutated. One of the non-CpG hotspots, KRAS G12C (a “CC”, with a C to A mutation), showed significantly more 5hmCs in cancerous cells than in normal cells.

[0033] The observations in Tables 1 and 2, as averaged AF% for each group before and after the 5hmC>T conversion, are plotted in FIG. 3. Significantly higher AF% is observed in both PAM3005 (transformed cells) and HCT116 (cancer cell line) in Tier 1, while the AF% was comparable among groups in Tier 2. The background level of AF% is comparable for all groups.
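The averaged AF% described here can be sketched as follows. This is a minimal illustration that assumes AF at a site is the number of reads showing the converted base (T, or A on the complementary strand) divided by the total reads covering that site; the function names and the read counts are hypothetical and are not data from the study.

```python
from statistics import mean
from typing import Iterable, Tuple


def allele_frequency_pct(converted_reads: int, total_reads: int) -> float:
    """AF% at one site: converted (C>T or G>A) reads over total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= converted_reads <= total_reads:
        raise ValueError("converted_reads must be between 0 and total_reads")
    return 100.0 * converted_reads / total_reads


def mean_af_pct(sites: Iterable[Tuple[int, int]]) -> float:
    """Average AF% over (converted_reads, total_reads) pairs for a group."""
    return mean(allele_frequency_pct(c, t) for c, t in sites)


# Hypothetical read counts for two hotspot sites, before and after treatment
untreated = [(1, 200), (2, 200)]
treated = [(24, 200), (16, 200)]
print(mean_af_pct(untreated), mean_af_pct(treated))
```

Comparing the untreated and treated averages separates the background C>T level from the signal introduced by the 5hmC conversion.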

[0034] An expanded study covering 33 cancer mutation hotspots, employing 12 normal and 12 colorectal cancer samples, further confirmed the above results obtained in cultured cells. Significantly more 5hmC sites were observed in tumor than in normal DNA at higher AF. For example, at 5% AF or above, an average of 609 5hmC sites were found in each tumor DNA versus 479 in normal DNA. At 10% or higher, the average number was about 153 in tumor versus 66 in normal. The number of extra 5hmC (Tier 1) sites found in tumor was proportionally higher in the high AF range. Calculated as a percentage of the 5hmC sites found in normal, there were 2%, 36%, 170%, and 283% more 5hmC counts in tumor, and 24%, 46%, 147%, and 230% higher sums of AF values in tumor than in normal gDNA, when the AF detection criteria were set at above 1%, 5%, 10%, and 12%, respectively (see FIG. 4).
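The cut-off comparison described in this paragraph can be sketched in Python. The sketch assumes each sample is summarized as a list of per-site AF% values, counts a site when its AF is at or above the cut-off, and expresses the tumor excess as a percentage of the normal value. The function names and the AF values are hypothetical and do not reproduce the study data.

```python
from typing import Dict, List, Optional, Sequence


def percent_more(tumor_value: float, normal_value: float) -> Optional[float]:
    """Tumor excess as a percentage of the normal value (None if normal is 0)."""
    if normal_value == 0:
        return None
    return 100.0 * (tumor_value - normal_value) / normal_value


def compare_at_cutoffs(tumor_afs: Sequence[float],
                       normal_afs: Sequence[float],
                       cutoffs: Sequence[float] = (1.0, 5.0, 10.0, 12.0)
                       ) -> List[Dict[str, Optional[float]]]:
    """Compare site counts and AF sums in tumor vs normal at each AF cut-off."""
    rows = []
    for cutoff in cutoffs:
        tumor_hits = [af for af in tumor_afs if af >= cutoff]
        normal_hits = [af for af in normal_afs if af >= cutoff]
        rows.append({
            "cutoff_pct": cutoff,
            "tumor_count": len(tumor_hits),
            "normal_count": len(normal_hits),
            "pct_more_count": percent_more(len(tumor_hits), len(normal_hits)),
            "pct_more_af_sum": percent_more(sum(tumor_hits), sum(normal_hits)),
        })
    return rows


# Hypothetical per-site AF% values for one tumor and one normal sample
tumor = [2.0, 6.0, 11.0, 13.0, 15.0]
normal = [2.0, 6.0, 7.0, 11.0]
for row in compare_at_cutoffs(tumor, normal):
    print(row)
```

Returning `None` when no normal site passes the cut-off avoids reporting an undefined percentage.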

[0035] FIG. 4 illustrates 5hmC sites in tumor as percentage of 5hmC in normal at increasing AF cut-off.

[0036] Tier 1 5hmC sites showing three-fold or higher AF in colorectal tumor cells than in normal colon cells (in the 80bp hotspot flanking regions studied) are listed in Table 4 (Cancer Hotspot Targets with Single Nucleotide Variant (SNV)) below. About half of these sites coincide with known mutation hotspots. Table 4 does not include Tier 1 sites that were not detected in the experiment, nor Tier 1 sites in cells from other tumor types.

Table 4

[0037] The association of an increased quantity of specific, individual 5hmC at or near specific Tier-1 hotspots in cancer cells provides a way to distinguish cancer cells from normal cells directly at specific base (C or G) resolution. Because normal sequencing techniques do not read 5hmC as a mutation, the increased 5hmC occurrence at specific hotspots is a more sensitive marker of cancerous cells before the occurrence of many mutations (e.g., C to T changes). Furthermore, the detection of these specifically selected, individual Tier-1 5hmC sites at or near hotspot CpG sites in cancer cells can be a more convenient, more direct cancer detection method than analysing the group 5hmC profile at the chromosomal level or from hundreds of sequences of entire genes.

[0038] Thus, the detection and quantification of the number of selected, specific, individually targeted Tier-1 5hmC sites, or their prevalence at or near many cancer mutation hot spots in a given cell, enables one to detect, screen and predict the likelihood of cancer occurrence or the severity of the cancer. Moreover, the existence of 5hmC at many hotspots in cancer cell lines suggests a previously unknown higher order mechanism underlying the development of cancer. Markers along the 5hmC-mediated mechanism or pathway in cancer development are not only better diagnostic targets than mutations at hotspots, but also potentially better therapeutic targets. Drugs that directly or indirectly prevent 5hmC from occurring, prevent 5hmC from being converted to a uracil or thymine analog, or correct 5hmC back to regular cytosine may prevent or treat cancer.

[0039] In one aspect, the present disclosure provides a method which includes: extracting genomic deoxyribonucleic acid (DNA) from locations at or near specific target cancer hotspots from a subject; modifying specific Tier-1 5-hydroxymethylcytosine (5hmC) on the DNA to a modified specific Tier-1 5hmC; detecting and identifying presence or absence of the modified specific Tier-1 5hmC; quantifying the detected and identified modified specific Tier-1 5hmC; and providing a report comprising a score, wherein the score is indicative of the likelihood of a status, a degree, or a severity of the risk of cancer.

[0040] In one embodiment of this aspect, the specific Tier-1 5hmC can exist in cancer cell lines, in transformed and immortalized cells.

[0041] In particular, the present disclosure provides selected specific Tier-1 5-hydroxymethylcytosine (5hmC) at or near cancer mutation hot spots as targets for early cancer detection. Such methods provide for high sensitivity detection of one or more genetic variants.

[0042] In another embodiment, the method comprises quantifying the detected and identified specific Tier-1 5hmC at or near cancer mutation hot spots located at a specific set of oncogenes in which, when mutated, a cytosine (C) is mutated to thymine (T), or a Guanine (G) is mutated to Adenine (A) on the complementary strand after amplification.

[0043] A cancer mutation hot spot is any single nucleotide having substitution mutations reported in the literature that is associated with any cancer. The cancer mutation hotspot can also be expressed as a single base on genomic DNA that is frequently observed to have single nucleotide variant (SNV) or deletion.

[0044] In another embodiment, modifying specific Tier-1 5hmC on the DNA to a modified 5hmC includes treating genomic deoxyribonucleic acid (DNA) to convert 5hmC on the DNA to a modified 5hmC, which includes any technique to modify 5hmC into another derivative of a nitrogenous base, such as a derivative of a cytosine (C) or a thymine (T), or any non-nitrogenous molecule which can be detected as a different base from the original 5hmC. The detected different base can be used to calculate the quantity of 5hmC at any specific nucleotide location on the human genome.

[0045] In another embodiment, treating genomic deoxyribonucleic acid (DNA) to convert specific Tier-1 5hmC on the DNA to a modified 5hmC includes a method that employs either chemical or enzymatic reaction processes or both to modify the 5hmC into another derivative of a nitrogenous base, such as a derivative of a cytosine (C) or a thymine (T), or any non-nitrogenous molecule which can be detected as a different base from the original 5hmC.

[0046] In another embodiment, treating genomic deoxyribonucleic acid (DNA) to convert specific Tier-1 5hmC on the DNA to a modified 5hmC includes a method that employs either oxidation or reduction reaction processes or both to modify the 5hmC into another derivative of a nitrogenous base, such as a derivative of a cytosine (C) or a thymine (T), or any non-nitrogenous molecule which can be detected as a different base from the original 5hmC (C). In preferred embodiments, the oxidation or reduction reaction processes can be either chemical or enzymatic reactions.

[0047] Preferably, the oxidising agent may be an organic or inorganic chemical compound. Suitable oxidising agents are well known in the art and include metal oxides, such as potassium perruthenate (KRuO4), manganese dioxide (MnO2), and potassium permanganate (KMnO4). Particularly useful oxidising agents are those that may be used in aqueous conditions. However, oxidising agents that are suitable for use in organic solvents may also be employed where practicable. In some embodiments, the oxidising agent may comprise a perruthenate anion (RuO4−). Suitable perruthenate oxidising agents include organic and inorganic perruthenate salts, such as potassium perruthenate (KRuO4) and other metal perruthenates; tetraalkylammonium perruthenates, such as tetrapropylammonium perruthenate (TPAP) and tetrabutylammonium perruthenate (TBAP); polymer-supported perruthenate (PSP) and tetraphenylphosphonium ruthenate.

[0048] Advantageously, the oxidising agent or the oxidising conditions may also preserve the DNA in a denatured state. Optionally, the polynucleotide (DNA) may be subjected to further, repeat oxidising steps.

[0049] Suitable reducing agents are well known in the art and include pic-borane, pyridine borane, sodium borohydride (NaBH4), sodium cyanoborohydride (NaBH3CN) and lithium borohydride (LiBH4). Particularly useful reducing agents are those that may be used in aqueous conditions, as such are most convenient for the handling of the polynucleotide (DNA). However, reducing agents that are suitable for use in organic solvents may also be employed where practicable.

[0050] In another embodiment, the method further includes any technique for one or more of capturing, sequestering and enriching DNA fragments of 1000 base pairs or less from any human tissue or cells by any molecule, such as monoclonal or polyclonal antibodies, having specific affinity in binding to specific Tier-1 5hmC. The captured, sequestered, or enriched DNA can then be analyzed to calculate the quantity of a variable which is a function of the quantity of cancer-specific genetic features, which include but are not limited to the quantity of cancer mutation hotspots.

[0051] In another embodiment, the method quantifies the number of detected specific Tier-1 5hmC occurring at or near a specific hotspot or multiple hotspots, or at one or more cytosines near the hotspot.

[0052] In another embodiment, the present disclosure comprises any anti-cancer therapeutic methods or agents targeting either the specific Tier-1 5hmC itself, biochemical steps of converting regular cytosine to 5hmC, conversion of the 5hmC to uracil- or thymine- analog, or the 5hmC-mediated pathway that leads to cancer development.

[0053] In another embodiment, the method comprises any reference material, including but not limited to primary standard, secondary standard, calibrator, quality control, validation sample, using any of the specific Tier-1 5hmC at hotspot and its nearby region as part of the reference DNA sequence composition for diagnosis of cancer via specific Tier-1 5hmC detection, and quantification.

[0054] In another embodiment, the method includes quantifying a variable which is a function of a quantity of specific Tier-1 5-hydroxymethylcytosine (5hmC) at any specific nucleotide location on a human genome; and thereby detecting, screening or predicting a likelihood of cancer occurrence in a subject.
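The disclosure does not specify the function that maps Tier-1 5hmC quantities to a reported score. Purely as an illustration, the Python sketch below assumes one simple choice: the fraction of a Tier-1 site panel whose AF% is at or above a cut-off, reported together with the summed AF% of those sites. The function name `tier1_score`, the 5% default cut-off, and the site labels are hypothetical.

```python
from typing import Dict, List, Union

ScoreReport = Dict[str, Union[int, float, List[str]]]


def tier1_score(site_afs_pct: Dict[str, float],
                af_cutoff_pct: float = 5.0) -> ScoreReport:
    """Summarize a panel of Tier-1 sites into a simple illustrative score.

    site_afs_pct maps a site label to its measured 5hmC AF% in the sample.
    The score here is an assumed example, not the scoring of the disclosure.
    """
    if not site_afs_pct:
        raise ValueError("at least one Tier-1 site is required")
    positive = sorted(s for s, af in site_afs_pct.items() if af >= af_cutoff_pct)
    return {
        "n_sites": len(site_afs_pct),
        "n_positive": len(positive),
        "positive_fraction": len(positive) / len(site_afs_pct),
        "af_sum_pct": sum(site_afs_pct[s] for s in positive),
        "positive_sites": positive,
    }


# Hypothetical measurements for a four-site panel
report = tier1_score({"site_A": 12.0, "site_B": 2.0, "site_C": 6.0, "site_D": 0.0})
print(report)
```

Any clinically meaningful score would need calibration against reference material and validated samples; this sketch only shows the shape of such a calculation.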

[0055] In another embodiment, the method provides diagnostic methods that comprise the following steps:

[0056] Step 1: Modification of specific Tier-1 5hmC at locations which are at or near the said cancer hotspots.

[0057] Genomic DNA from human tissue (including plasma) is pre-extracted from a patient specimen. It is subjected to a treatment to convert 5hmC on the DNA to a different moiety, such as a uracil, that is recognizable so as to identify the location of the 5hmC.

[0058] Examples of modification methods comprise the following:

(1) DNA containing 5hmC or the cancer hotspots and its adjacent region is oxidized by potassium perruthenate (KRuO4) or other salts of high oxidation state of transition metals such as potassium permanganate (KMnO4), or other oxidizing agent, to produce an aldehyde, such as 5-formylcytosine (5fC). 5fC is then reduced by a reducing agent such as pic-borane or pyridine borane to produce a uracil derivative, dihydrouracil (DHU). DHU is then recognized as thymine (T) in the subsequent replication or amplification reaction involving RNA and/or DNA polymerase and any DNA sequence identification method. Alternatively, cytosine (C)’s complementary base, guanine (G), is recognized as adenine (A) after replication or amplification reaction involving RNA and/or DNA polymerase and any DNA sequence identification method.

(2) DNA containing 5hmC or the cancer hotspots and its adjacent region is oxidized by enzymes such as Ten-eleven translocation (TET)l, TET2, and TET3, or another oxidative enzyme modifying 5hmC. The product of the oxidation of 5hmC is 5fC and subsequently 5-carboxylcytosine (5caC). These products are then reduced by reduction agents such as bisulfite (NaHSOs) and produce derivative of uracil which is subsequently recognized as thymine (T) in the subsequence replication or amplification reaction involving RNA and/or DNA polymerase. Alternatively, cytosine (C)’s complementary base, guanine (G) is recognized as Adenine (A) after replication or amplification reaction involving RNA and/or DNA polymerase and any DNA sequence identification method. In addition to replication or amplification, C-to-T change or G-to-A change at the hotspot can be recognized (and distinguished from other nucleotides) by other methods disclosed in (1).

(3) The chemicals or enzymes in oxidations and reductions in (1) and (2) can be optionally switched to achieve the same modifying result, i.e., either C is converted to T, or after replication, its complementary base G is converted to A.

(4) Different modifications using oxidation or reduction can be applied to regular cytosine base (C), 5-methylcytosine (5mC), and 5hmC separately in order to produce different products so that the three can be distinguished and identified in subsequent procedures. For example, the bisulfite reaction can distinguish regular cytosine from 5mC and 5hmC by modifying the regular cytosine. Alternatively, TET can separate both 5mC and 5hmC from regular cytosine by modifying both 5mC and 5hmC. In a separate experiment, 5mC and 5hmC can be distinguished by specifically protecting 5hmC from oxidation and reduction, using β-glucose transferase to attach glucose to the hydroxyl group of 5hmC to create 5-glucosyl-hydroxymethylcytosine (5-ghmC). The unprotected 5mC can be reduced to produce DHU while the same reaction is blocked on 5-ghmC.

(5) Alternatively, regular cytosine, 5mC and 5-ghmC can be distinguished by their susceptibility to restriction digestion by enzymes such as MspI and HpaII.

(6) Sequence-guided or sequence-dependent recognition or cutting of DNA in the vicinity of regular cytosine, 5mC and 5-ghmC at or near a cancer mutation hotspot is performed via techniques such as DNA- or RNA-guided gene editing (such as CRISPR technology), homologous recombination, or transposition via a transposon.
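As an illustration only, the base readouts described in (1) and (4) can be sketched as a toy lookup model. The one-letter codes for cytosine states (C, M, H) and all function names below are assumptions introduced for this sketch; the code models only which base a polymerase would report, not the underlying chemistry.

```python
# Toy model of the readouts described in (1) and (4); not a chemistry simulation.
# Hypothetical one-letter codes for the template strand:
#   C = regular cytosine, M = 5mC, H = 5hmC; A, G, T are ordinary bases.

def read_after_hmc_conversion(template):
    """Method (1): 5hmC -> 5fC -> DHU, which is read as T; C and 5mC read as C."""
    readout = {"H": "T", "M": "C", "C": "C"}
    return "".join(readout.get(base, base) for base in template)

def read_after_bisulfite(template):
    """Method (4): bisulfite modifies regular C (read as T); 5mC and 5hmC read as C."""
    readout = {"C": "T", "M": "C", "H": "C"}
    return "".join(readout.get(base, base) for base in template)

def hmc_positions(template):
    """0-based positions where the method (1) readout differs from an untreated readout."""
    untreated = template.replace("H", "C").replace("M", "C")
    converted = read_after_hmc_conversion(template)
    return [i for i, (u, c) in enumerate(zip(untreated, converted)) if u != c]

template = "ACGHTMGC"  # hypothetical 8-base fragment with one 5hmC and one 5mC
print(read_after_hmc_conversion(template))  # ACGTTCGC
print(read_after_bisulfite(template))       # ATGCTCGT
print(hmc_positions(template))              # [3]
```

Comparing the converted readout against an untreated readout of the same fragment is what localizes the 5hmC: only position 3 shows a C-to-T change under method (1).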

[0059] Step 2: Detection, identification or confirmation of the presence or absence of modified 5hmC at specific Tier-1 locations which are at or near the said cancer hotspots.

[0060] Example: (1) The DNA region having the modified specific Tier-1 5hmC is replicated, amplified or copied by DNA or RNA polymerase. The modified bases (from Step 1) contribute to the identification of the 5hmC and its location by being recognized by the polymerase as a different deoxyribonucleotide, such as thymine (T, for modified C) or adenine (A) on its complementary strand.

(2) The DNA region having the protected specific Tier-1 5hmC (such as 5-ghmC) is replicated, amplified or copied by DNA or RNA polymerase as regular cytosine, while regular cytosine and other cytosine derivatives such as 5mC are recognized as a different deoxyribonucleotide, such as thymine (T) or adenine (A) on its complementary strand.

(3) The detection methods of (1) and (2) comprise various processes of replication or amplification mediated by DNA or RNA polymerase. These methods comprise Sanger sequencing, massively parallel sequencing or Next Generation Sequencing (NGS), any form of single-cell sequencing, such as technologies from Pacific Biosciences, Oxford Nanopore Technology, Quantapore (CA-USA), and Stratos (WA-USA), Polymerase Chain Reaction (PCR), Droplet Digital PCR (ddPCR), Quantitative PCR (qPCR), Reverse Transcription PCR, and isothermal amplification.

[0061] As examples shown in FIGS. 1A-1C, 2A-2D, 3 and 4 and Tables 2 and 3, the number of reads of 5hmC was obtained by NGS, and allele frequency (AF) can be calculated reflecting the frequency or amount of the detected 5hmC. The detection signal of Tier-1 5hmC can be generated by a single 5hmC site or multiple specific Tier-1 5hmC sites.

(4) The detection methods of (1) and (2) can include RNA-guided gene editing methods.

(5) Regular C, 5mC, 5hmC and their modified forms generated in Step 1 can be distinguished from each other by using different restriction enzymes that exhibit differential cutting efficiency among modified or unmodified forms. With or without PCR amplification, the size pattern of the restriction product can be compared using agarose gel or any form of chromatography.

(6) The detection methods of (1) and (2) can employ any technique to capture, sequester or enrich DNA fragments of 1000 base pairs or less from any human tissue or cells by any molecule, such as monoclonal or polyclonal antibodies from any species, having specificity and affinity in binding to 5hmC.

(7) In addition to replication or amplification, the C-to-T change or G-to-A change at the hotspot can be recognized and distinguished from other nucleotides by other methods, comprising chromatographic methods (e.g., size exclusion, affinity binding, ion-exchange chromatography), mass spectrometry, affinity binding and labelling methods utilizing antigen-antibody interaction (such as in ELISA), and molecule-to-molecule affinity binding (such as ligand-receptor binding).

[0062] Another example of detection method:

1) DNA oligos (primers) containing DNA sequence at or adjacent to Tier 1 sites are synthesized. These primers can be either immobilized on a solid surface (flat or non-flat, such as a magnetic bead surface) or chemically cross-linked to a moiety that is able to bind to a solid surface via affinity binding.

2) Labeled DNA oligos (probes) containing DNA sequence including one or multiple Tier 1 sites and sequences adjacent to them are synthesized. The deoxyribonucleotide at the Tier 1 5hmC location is a T (or A for the base complementary to 5hmC). This allows the probe to specifically hybridize to modified 5hmC after it is modified to uracil and subsequently amplified as T. The probe can serve as reporter (marker) during subsequent detection step (6).

3) The liquid biopsy sample (either from plasma or another body fluid) contains numerous cell-free DNA (cfDNA) fragments derived from genomic DNA of either normal or tumor cells. The total cfDNA can be extracted from the sample using a variety of methods.

4) After extraction, cfDNA is subject to 5hmC modification (described in Step 1).

5) The pre-synthesized primers (oligos) from 1) are brought into contact with the modified extracted cfDNA (from 3)) via mixing or incubation under specific conditions promoting denaturation of the double-stranded DNA, followed by hybridization of single-stranded DNA molecules based on the complementary pairing scheme (i.e., A to T, C to G).

6) The hybridized DNA is pulled out from the mixture via affinity binding followed by washing. This step can be skipped if enough DNA containing Tier 1 sites is available for analysis. If there is insufficient Tier 1 collected, multiple rounds of steps 2) to 5) can be done to accumulate sufficient DNA containing Tier 1 sites.

7) The probe from 2) is mixed with the hybridized DNA from 5) in a qPCR or ddPCR reaction. The labeled moiety of the probe provides signal indicating the quantity of the Tier 1 5hmC in the sample.

8) FIG. 5 shows an example amplification plot from qPCR. Nine Tier-1 5hmC targets were selected from Table 4 for preparing the assay, with primers and probes synthesized. Equal amounts of gDNA from a pool of colorectal cancer patients and their matched normal colorectal tissue were extracted. Real-time fluorescence curves, indicating the real-time detection of 5hmC, were plotted. Cycle Threshold (Ct) values (26.1 for the cancer sample and 31.6 for the normal sample), which reflect the relative concentration of the 5hmC, can be obtained.
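The probe design rule in 2), where the base at the Tier 1 5hmC location is a T (or an A on the complementary strand), can be sketched as follows. This is a minimal illustration under stated assumptions: the reference sequence, the site index, the flank length, and the function names are all hypothetical, and real probe design would also need to account for melting temperature, labels, and strand choice.

```python
# Sketch of the rule in 2): place T at the Tier 1 5hmC position (A on the complement).
# The sequence, position, and flank length used below are hypothetical examples.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def design_probe(reference, site, flank):
    """Return the window around `site` (0-based) with the cytosine replaced by T."""
    if reference[site] != "C":
        raise ValueError("expected a cytosine at the Tier 1 site")
    start = max(0, site - flank)
    end = min(len(reference), site + flank + 1)
    return reference[start:site] + "T" + reference[site + 1:end]

reference = "GATTACGCATGG"  # hypothetical genomic context
probe = design_probe(reference, site=5, flank=3)
print(probe)                      # TTATGCA
print(reverse_complement(probe))  # TGCATAA (carries A opposite the converted site)
```

The two printed sequences correspond to the two options named in 2): a T at the 5hmC location, or an A for the base complementary to it.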

[0063] Step 3: Quantification of the detected and identified 5hmC at locations which are at or near the said cancer hotspots.

[0064] Quantifying or recording the quantity of the occurrence of 5hmC can be of the following forms:

(1) Absolute number, count, read, or event of such 5hmC found in a given sample preparation.

(2) Absolute number, count, read, or event of such 5hmC detected on one or more specific genes in any given sample preparation. The quantified number can be either from a single specific Tier-1 5hmC or multiple Tier-1 5hmCs.

(3) Relative allele frequency or ratio or percentage of absolute numbers of 5hmC relative to either regular cytosine (C) or the combination of regular C, 5mC and 5hmC at the same allele (base location) in the situations in (1) and (2).

As examples shown in FIGS. 1A-1C and 2A-2D, 3 and 4, and Tables 1 and 2, allele frequency (AF) was calculated, indicating the quantification of the 5hmC based on the number of reads of 5hmC obtained by NGS.

(4) Relative numbers derived, transformed, or calculated from signal (e.g., fluorescence index), absorbance, intensity, color, hue, area of peak, or other measurements which reflect the numbers in (1), (2), or (3).

As an example in FIG. 5, the difference in Ct values (DeltaCt) can be calculated to indicate the degree of 5hmC concentration difference. In addition, an average, sum, square, exponential power, difference, ratio, or other simple mathematical operation or transformation may be used to reflect the quantity of the detected and identified specific Tier-1 5hmC at locations which are at or near the said cancer hotspots.
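The two relative quantities named in (3) can be written out as a short sketch. The read counts below are hypothetical and the function names are assumptions introduced for illustration; the disclosure does not prescribe a particular implementation.

```python
# Sketch of the relative quantities in (3), computed from per-site read counts.
# All counts are hypothetical example values, not data from the disclosure.

def hmc_ratio_to_c(hmc_reads, c_reads):
    """Ratio of 5hmC reads to regular-cytosine reads at one base location."""
    if c_reads <= 0:
        raise ValueError("need at least one regular-cytosine read")
    return hmc_reads / c_reads

def hmc_allele_frequency(hmc_reads, c_reads, mc_reads):
    """5hmC reads as a fraction of all C + 5mC + 5hmC reads at one base location."""
    total = hmc_reads + c_reads + mc_reads
    if total <= 0:
        raise ValueError("no reads cover this location")
    return hmc_reads / total

# Hypothetical site covered by 200 reads: 30 5hmC, 150 regular C, 20 5mC.
print(hmc_ratio_to_c(30, 150))            # 0.2
print(hmc_allele_frequency(30, 150, 20))  # 0.15
```

For a signal built from multiple Tier-1 sites, the same functions could be applied per site and the results summed or averaged, in line with the simple operations listed in the text.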

[0065] Step 4: The quantity obtained in Step 3 is applied to a predetermined algorithm to generate a score that can be compared with predetermined criteria indicative of the status, degree, severity, or size of the risk of cancer of that patient.

[0066] Examples:

(1) A score is a calculated value of a variable that measures the propensity or likelihood of a patient getting cancer (or the severity of cancer). In the examples shown in FIG. 4, the score is a percentage calculated between the AF of detected 5hmC in cfDNA and that in gDNA from normal tissue.

(2) In the example of FIG. 5, the score can be either the deltaCt itself or a ratio, converted from the deltaCt value, between the concentrations of targeted Tier-1 5hmC in tumor and normal tissue. In this case, the ratio is 44.8.

[0067] The score calculated in (1) and (2) can be compared to a predetermined cut-off value (criteria or limit values, see Step 5) to determine the presence of tumor.
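The conversion from deltaCt to a concentration ratio can be sketched with the common qPCR assumption that the product doubles every cycle, so that ratio = 2^deltaCt. That doubling assumption and the function names are introduced here for illustration; the text does not state which amplification efficiency was used. With the rounded Ct values given for FIG. 5 (26.1 and 31.6), this sketch yields about 45.3, close to the stated ratio of 44.8; the small gap may reflect unrounded Ct values or a slightly different efficiency.

```python
# Sketch: convert a Ct difference into a concentration ratio.
# Assumes product doubles each cycle (efficiency = 2.0); this is an assumption,
# not a parameter stated in the disclosure.

def delta_ct(ct_normal, ct_cancer):
    """Ct difference; positive when the cancer sample crosses threshold earlier."""
    return ct_normal - ct_cancer

def fold_ratio(dct, efficiency=2.0):
    """Concentration ratio implied by a Ct difference at the given per-cycle efficiency."""
    return efficiency ** dct

dct = delta_ct(ct_normal=31.6, ct_cancer=26.1)  # Ct values quoted for FIG. 5
print(round(dct, 2))              # 5.5
print(round(fold_ratio(dct), 1))  # 45.3
```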

[0068] Step 5: Via mass observations (clinical trials) on a population of normal and precancer or cancer patient samples, steps 1, 2 and 3 are used to generate raw data for generating an algorithm.

(1) The algorithm is a mathematical relationship between the quantified specific Tier-1 5hmC values (obtained in Step 3) and a score representing the degree of likelihood of having cancer.

(2) The score representing the likelihood of cancer can be obtained by giving a severity number to each patient based on the patient’s size of tumor or stages of cancer.

(3) Regression models may be established between the quantified specific Tier-1 5hmC values (obtained in Step 3) and the score representing the likelihood of cancer.

(4) Based on data from a large population, the cut-off value, i.e., the score that separates normal from cancer patients, can be statistically determined.
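One way a cut-off might be statistically determined, as in (4), is to scan candidate thresholds and keep the one that best separates the two groups. The sketch below uses Youden's J statistic (sensitivity + specificity - 1) as the selection criterion; that choice, the function name, and the scores are all assumptions made for illustration, since the disclosure does not name a specific statistic.

```python
# Sketch: choose a score cut-off by maximizing Youden's J (an assumed criterion).
# The scores below are hypothetical, not data from the disclosure.

def best_cutoff(normal_scores, cancer_scores):
    """Return (cutoff, sensitivity, specificity); a score >= cutoff is called cancer."""
    candidates = sorted(set(normal_scores) | set(cancer_scores))
    best = None
    for cutoff in candidates:
        sensitivity = sum(s >= cutoff for s in cancer_scores) / len(cancer_scores)
        specificity = sum(s < cutoff for s in normal_scores) / len(normal_scores)
        j = sensitivity + specificity - 1
        # Strict comparison keeps the lowest cut-off when several tie on J.
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]

normal = [0.8, 1.1, 1.5, 2.0, 2.4]  # hypothetical scores from normal samples
cancer = [2.2, 3.0, 4.5, 5.1, 6.0]  # hypothetical scores from cancer samples
cutoff, sensitivity, specificity = best_cutoff(normal, cancer)
print(cutoff, sensitivity, specificity)  # 2.2 1.0 0.8
```

In practice the cut-off would be derived from a much larger population and validated on independent samples; other criteria, such as a fixed specificity target, would fit the same loop.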

[0069] In another embodiment, the present disclosure provides both the Tier1 and Tier2 5hmC sites as targets for making contrived patient-like reference materials, including positive or negative quality control samples, standards (e.g., a primary standard, a secondary standard, or a calibrator), or validation samples for assays that detect Tier1 or Tier2 5hmC in order to detect cancer. Synthetic DNA fragments mimicking the 5hmC patterns (at Tier1 or Tier2 sites) in genomic DNAs from either tumor cells or normal cells can be produced either through DNA synthesis in vitro or site-directed gene editing in vivo. The resulting contrived sample can be used to monitor the performance of the assay or to calibrate the measurement system within the assay.

[0070] In another embodiment, the present disclosure provides anti-cancer therapeutic methods targeting Tier-1 5hmC at or near hotspots that comprise the following strategies:

(1) Methods or agents preventing the conversion from regular cytosine to 5hmC at or near cancer mutation hotspots.

[0071] Many biochemical processes or pathways exist that result in 5hmCs, specifically located at or near cancer mutation hotspots, from regular cytosine or an intermediate, such as 5mC.

[0072] For example, the enzymes Ten-eleven translocation 1 (TET1), TET2, and TET3 catalyze the conversion of 5mC to 5hmC. Inhibitors of TET can be used to prevent this process. Specifically, any inhibitors that directly or indirectly inhibit 5hmC formation at or near cancer mutation hotspots to achieve an anti-cancer effect are encompassed within the scope of this disclosure.

[0073] Alternatively, methods or agents that prevent the formation of 5hmC at or near cancer hotspots through TET-independent mechanisms are also encompassed within the scope of this disclosure.

(2) Methods or agents preventing the formation of uracil- or thymine-analog from 5hmC at or near cancer mutation hotspots.

Any methods or agents that directly or indirectly inhibit the cellular process converting 5hmC to uracil- or thymine- analog at or near cancer mutation hotspots are encompassed within the scope of this disclosure.

(3) Methods or agents converting, directly or indirectly, 5hmC to cytosine or another cytosine derivative (recognized as “C” by RNA or DNA polymerases) at or near cancer mutation hotspots are also encompassed within the scope of this disclosure. All combinations of modification strategies aimed at identifying 5hmC at locations which are at or near the said cancer hotspots are encompassed within the scope of this disclosure.

[0074] The below references cited herein are hereby incorporated by reference for all purposes.

[0075] REFERENCES:

[0076] Pfeifer GP, et al. 5-hydroxymethylcytosine and its potential roles in development and cancer. Epigenetics & Chromatin. 2013; 6: 10.

[0077] Singh AK, et al. Selective targeting of TET catalytic domain promotes somatic cell reprogramming. PNAS. 2020; 117: 3621-3626.

[0078] Gerecke C, et al. Vitamin C in combination with inhibition of mutant IDH1 synergistically activates TET enzymes and epigenetically modulates gene silencing in colon cancer cells. Epigenetics 2020 Mar; 15(3): 307-322

[0079] Bai X, et al. Ten-Eleven Translocation 1 promotes malignant progression of Cholangiocarcinoma with wild-type isocitrate dehydrogenase 1. Hepatology. 2021 May; 73(5): 1747-1763

[0080] Margalit S, et al. 5-Hydroxymethylcytosine as a clinical biomarker: Fluorescence-based assay for high-throughput epigenetic quantification in human tissues. Int. J. Cancer. 2019; 146: 115-122.

[0081] Li W, et al. 5-Hydroxymethylcytosine signatures in circulating cell-free DNA as diagnostic biomarkers for human cancers. Cell Research. 2017; 27: 1243-1257.

[0082] Zeng C, et al. Towards precision medicine: advances in 5-hydroxymethylcytosine cancer biomarker discovery in liquid biopsy. Cancer Communications. 2019; 39: 12.

[0083] Song CX, et al. Mapping recently identified nucleotide variants in the genome and transcriptome. Nat. Biotechnol. 2012; 30: 1107-1116.

[0084] Yu M, et al. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell. 2012 Jun 8; 149(6): 1368-80.

[0085] Nestor CE, et al. Tissue type is a major modifier of the 5-hydroxymethylcytosine content of human genes. Genome Res. 2012; 22: 467-477.

[0086] Thomson JP, et al. The application of genome-wide 5-hydroxymethylcytosine studies in cancer research. Epigenomics 2017; 9: 77-91.

[0087] Han D, et al. A highly sensitive and robust method for genome-wide 5hmC profiling of rare cell populations. Mol Cell. 2016; 63: 711-719.

[0088] Chen K, et al. Loss of 5-hydroxymethylcytosine is linked to gene body hypermethylation in kidney cancer. Cell Res. 2016; 26: 103-118.

[0089] Vasanthakumar A, et al. 5-hydroxymethylcytosine in cancer: significance in diagnosis and therapy. Cancer Genet. 2015; 208: 167-177.

[0090] Li X, et al. Whole-genome analysis of the methylome and hydroxymethylome in normal and malignant lung and liver. Genome Res. 2016; 26: 1730-1741.

[0091] Kohler F, et al. DNA Methylation in Epidermal Differentiation, Aging, and Cancer. J Invest Dermatol. 2020; 140: 38-47.

[0092] Sholl LM, et al. The Promises and Challenges of Tumor Mutation Burden as an Immunotherapy Biomarker: A Perspective from the International Association for the Study of Lung Cancer Pathology Committee. J Thorac Oncol. 2020; 15: 1409-1424.

[0093] Constancio V, et al. DNA Methylation-Based Testing in Liquid Biopsies as Detection and Prognostic Biomarkers for the Four Major Cancer Types. Cells 2020;9(3):624.

[0094] Addeo A, et al. TMB or not TMB as a biomarker: That is the question. Crit Rev Oncol Hematol. 2021; 163: 103374.

[0095] Arensdorf P, et al. Pancreatic Ductal Adenocarcinoma Evaluation Using Cell-free DNA Hydroxymethylation Profile. U.S. Patent Publication No. 2021/0108274.

[0096] The above disclosure of this invention is directed primarily to embodiments and practices thereof. It will be readily apparent to those skilled in the art that further changes and modifications in actual implementation of the concepts described herein can easily be made or may be learned by practice of the invention, without departing from the spirit and scope of the invention as defined by the following claims.