Title:
LARGE-SCALE EPIGENOMIC REPROGRAMMING LINKS ANABOLIC GLUCOSE METABOLISM TO DISTANT METASTASIS DURING THE EVOLUTION OF PANCREATIC CANCER PROGRESSION
Document Type and Number:
WIPO Patent Application WO/2018/067840
Kind Code:
A1
Abstract:
The present invention relates to a method of identifying targets for epigenetic reprogramming comprising detecting large organized heterochromatin lysine (K)-9 modified domains (LOCKs) and large DNA hypomethylated blocks in a sample containing DNA from a subject having cancer, for example, PDAC. The invention also provides for the use of differentially expressed genes to identify metastatic propensity in primary tumors, wherein the genes are selected from genes in the Tables herein, oxidative stress genes, EMT genes, immunological response genes, DNA repair genes, glucose metabolism genes, oxPPP genes, and PGD genes. Further, the invention provides a method for identifying agents or compounds to affect epigenomic changes, including inhibition of oxPPP comprising analyzing a sample from a subject before and after contacting with the agent or compound and determining the effect of the agent or compound on the epigenomic changes.

Inventors:
MCDONALD OLIVER (US)
LI XIN (US)
IACOBUZIO-DONAHUE CHRISTINE A (US)
FEINBERG ANDREW P (US)
Application Number:
PCT/US2017/055376
Publication Date:
April 12, 2018
Filing Date:
October 05, 2017
Assignee:
UNIV JOHNS HOPKINS (US)
MEMORIAL SLOAN KETTERING CANCER CENTER (US)
UNIV VANDERBILT (US)
International Classes:
A61K39/395; A61P35/04; A61P43/00; C12Q1/68; G01N33/50; G01N33/574
Domestic Patent References:
WO2013152186A1 2013-10-10
WO2016144371A1 2016-09-15
WO2014056627A1 2014-04-17
WO2016172332A1 2016-10-27
Foreign References:
US20140128283A1 2014-05-08
US7153691B2 2006-12-26
US20150174138A1 2015-06-25
Other References:
See also references of EP 3522924A4
Attorney, Agent or Firm:
HAILE, Lisa A. et al. (US)
Claims:
What is claimed is:

1. A method of identifying a target for epigenetic reprogramming comprising detecting large organized heterochromatin lysine (K)-9 modified domains (LOCKs) and large DNA hypomethylated blocks in a sample containing DNA from a subject having cancer.

2. The method of claim 1, wherein the sample is from a solid tumor.

3. The method of claim 1, wherein the subject has or is at risk of having PDAC and/or metastasis thereof.

4. The method of claim 1, wherein the detection comprises analysis of H3K9Me2/3 and/or H4K20Me3.

5. The method of claim 1, wherein the detection comprises analysis of H3K27Ac and/or H3K9Ac.

6. The method of claim 1, wherein detection is by Western blotting.

7. The method of claim 1, wherein detection is by ChIP for antibodies to H3K9Me2/3 and/or H4K20Me3, optionally followed by sequencing.

8. The method of claim 1, wherein detection is by ChIP for antibodies to H3K27Ac and/or H3K9Ac, optionally followed by sequencing.

9. The method of claim 1, wherein detection is by whole genome bisulfite sequencing.

10. The method of claims 2-8, further comprising gene expression analysis.

11. The method of claims 1-10, wherein there is an absence of driver mutations for metastasis.

12. The method of claims 1-11, further comprising analysis of euchromatin islands and/or euchromatin LOCKs.

13. A method for identifying changes in ECDs in a DNA sample comprising analysis of euchromatin domains (ECDs) prior to and following a treatment regime to provide a prognosis or analysis of responsiveness to the treatment regime.

14. A method for identifying changes in LOCKs in a DNA sample comprising analysis of LOCKs prior to and following a treatment regime to provide a prognosis or analysis of responsiveness to the treatment regime.

15. The method of claims 13 or 14, wherein analysis is by combinatorial methods.

16. Use of differentially expressed genes to identify metastatic propensity in primary tumors, wherein the genes are selected from genes in the Tables herein, oxidative stress genes, EMT genes, immunological response genes, DNA repair genes, glucose metabolism genes, oxPPP genes, and PGD genes.

17. A method for reversing or affecting epigenomic changes involving oxPPP inhibition.

18. A method for reversing or affecting epigenomic changes involving PGD RNAi.

19. A method for reversing or affecting epigenomic changes involving 6AN.

20. A method for identifying agents or compounds to affect epigenomic changes, including inhibition of oxPPP comprising analyzing a sample from a subject as in claim 1 before and after contacting with the agent or compound and determining the effect of the agent or compound on the epigenomic changes.

21. The method of claims 17-19, wherein the method comprises methods selected from the group consisting of tumorsphere assays, matrigel assays, injection of cells into organotypic stroma.

22. An agent or compound identified by the method of claim 20, for use in treating a subject in need thereof.

23. The method of claim 20, wherein the subject has cancer.

24. The method of claim 20, wherein the subject has PDAC.

25. The method of claim 20, wherein the use is for metastasis treatment and/or prevention.

26. The method of claim 20, wherein the agent or compound is administered prior to, simultaneously with or following treatment with chemotherapy or radiation therapy or other treatment regime.

Description:
LARGE-SCALE EPIGENOMIC REPROGRAMMING LINKS ANABOLIC GLUCOSE METABOLISM TO DISTANT METASTASIS DURING THE EVOLUTION OF PANCREATIC CANCER PROGRESSION

CROSS REFERENCE TO RELATED APPLICATION(S)

[0001] This application claims the benefit of priority under 35 U.S.C. §119(e) of U.S. Serial No. 62/405,155, filed October 6, 2016, the entire contents of which are incorporated herein by reference in their entirety.

INCORPORATION OF SEQUENCE LISTING

[0002] The material in the accompanying sequence listing is hereby incorporated by reference into this application. The accompanying sequence listing text file, named JHU4080_lWO_lWO_Sequence_Listing, was created on October 4, 2017, and is 9 kb. The file can be accessed using Microsoft Word on a computer that uses Windows OS.

STATEMENT OF GOVERNMENT SUPPORT

[0003] This work was supported by NIH grant CA38548 (APF), National Institutes of Health grants CA140599, CA179991 (CID), the AACR Pancreatic Cancer Action Network Pathway to Leadership grant (OGM), Vanderbilt GI SPORE (OGM), and institutional grants from the American Cancer Society to the Vanderbilt-Ingram Cancer Center (OGM), and CA180682 (AMM). The government has certain rights in the invention.

BACKGROUND OF THE INVENTION

FIELD OF THE INVENTION

[0004] The invention relates generally to genetic analysis and more specifically to cancer and the epigenetic influence on progression and metastases of cancer.

BACKGROUND INFORMATION

[0005] During the evolutionary progression of pancreatic ductal adenocarcinoma (PDAC), heterogeneous subclonal populations emerge that drive primary tumor growth, regional spread, distant metastasis, and patient death. However, the genetics of metastases largely reflects that of the primary tumor in untreated patients, and PDAC driver mutations are shared by all subclones. This raises the possibility that an epigenetic process might be operative during metastasis. Here we detected striking epigenetic reprogramming of global chromatin modifications during the natural evolutionary history of distant metastasis. Genome-wide mapping revealed that these global changes were targeted to thousands of large chromatin domains across the genome that collectively specified malignant traits, including euchromatin and large organized chromatin K9-modified (LOCK) heterochromatin. Parallel to these changes, distant metastases co-evolved a dependence on the oxidative branch of the pentose phosphate pathway (oxPPP), and oxPPP inhibition selectively reversed reprogrammed chromatin and blocked tumorigenic potential. Thus, divergent metabolic, epigenetic, and tumorigenic programs emerged during the evolution of pancreatic cancer progression.

[0006] Despite significant progress in survival rates for most human cancers, PDAC remains nearly universally lethal with survival rates of 8%. In fact, PDAC is projected to be the second-leading cause of cancer deaths in the western world by 2020. Primary PDACs have been shown to contain distinct subclonal populations. However, these subclones share identical driver mutations and the genetics of metastases largely reflects that of the primary tumor. Furthermore, subclones are defined genetically by their unique progressor mutations, the vast majority if not all of which are thought to be passenger events. This raises questions as to what mechanisms might drive progression and metastasis during the natural history of disease evolution.

[0007] One prometastatic candidate is epigenomic regulation. In particular, the inventors wished to investigate the role of large-scale epigenomic changes during PDAC subclonal evolution and distant metastasis, especially within heterochromatin domains including large organized heterochromatin lysine (K)-9 modified domains (LOCKs) and large DNA hypomethylated blocks. These regions could represent selectable targets for large-scale epigenetic reprogramming, since they occupy over half of the genome, partially overlap with one another, and are found in many human cancers including PDAC. It was therefore hypothesized that epigenomic dysregulation within these regions could be a major selective force for tumor progression, given the lack of any consistent metastasis-specific driver mutations.

SUMMARY OF THE INVENTION

[0008] The present invention relates to a method of identifying targets for epigenetic reprogramming comprising detecting large organized heterochromatin lysine (K)-9 modified domains (LOCKs) and large DNA hypomethylated blocks in a sample containing DNA from a subject having cancer. For example, the method applies to a subject that has or is at risk of having PDAC and/or metastasis thereof. In one aspect, the detection comprises analysis of H3K9Me2/3 and/or H4K20Me3. In another aspect, the detection comprises analysis of H3K27Ac and/or H3K9Ac.

[0009] In another embodiment, the invention provides for the use of differentially expressed genes to identify metastatic propensity in primary tumors, wherein the genes are selected from genes in the Tables herein, oxidative stress genes, EMT genes, immunological response genes, DNA repair genes, glucose metabolism genes, oxPPP genes, and PGD genes.

[00010] In another embodiment, the invention provides a method for identifying agents or compounds to affect epigenomic changes, including inhibition of oxPPP comprising analyzing a sample from a subject before and after contacting with the agent or compound and determining the effect of the agent or compound on the epigenomic changes.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] Figures 1A-1G relate to global epigenetic reprogramming during the evolution of distant metastasis.

[0011] Figure 1A is a series of immunohistochemical stains.

[0012] Figure 1B is a series of immunohistochemical stains.

[0013] Figure 1C is a series of immunohistochemical stains.

[0014] Figure 1D is a series of immunohistochemical stains.

[0015] Figure 1E is a series of immunohistochemical stains.

[0016] Figure 1F is a series of western blot images.

[0017] Figure 1G is a series of graphical representations of data.

[0018] Figures 2A-2E relate to epigenomic reprogramming of chromatin domains during PDAC subclonal evolution.

[0019] Figure 2A is a graphical representation of data.

[0020] Figure 2B is a graphical representation of data.

[0021] Figure 2C is a graphical representation of data.

[0022] Figure 2D is a graphical representation of data.

[0023] Figure 2E is a graphical representation of data.

[0024] Figures 3A-3G relate to reprogrammed chromatin domains encoding divergent malignant properties.

[0025] Figure 3A is a graphical representation of data.

[0026] Figure 3B is a series of western blot images.

[0027] Figure 3C is a graphical representation of data.

[0028] Figure 3D is a series of western blot images.

[0029] Figure 3E is a series of western blot images.

[0030] Figure 3F is a graphical representation of data.

[0031] Figure 3G is a series of images of tumor forming assays and related graphical plots.

[0032] Figures 4A-4F relate to hyperactive glucose metabolism and 6PG depletion in distant metastatic subclones.

[0033] Figure 4A is a graphical representation of data.

[0034] Figure 4B is a graphical representation of data.

[0035] Figure 4C is a graphical representation of data.

[0036] Figure 4D is a schematic diagram.

[0037] Figure 4E is a series of graphical representations of data.

[0038] Figure 4F is a graphical representation of data.

[0039] Figures 5A-5D relate to PGD-dependence in distant metastatic subclones.

[0040] Figure 5A is a series of western blot images.

[0041] Figure 5B is a series of western blot images.

[0042] Figure 5C is a series of images of tumor forming assays and related graphical plots.

[0043] Figure 5D is a series of images of tumor forming assays and related graphical plots.

[0044] Figures 6A-6F relate to reversal of reprogrammed chromatin, tumorigenicity, and malignant gene expression programs by 6AN.

[0045] Figure 6A is a series of graphical representations of data.

[0046] Figure 6B is a series of graphical representations of data.

[0047] Figure 6C is a series of images of tumor forming assays and related graphical plots.

[0048] Figure 6D is a series of images and related graphical plots.

[0049] Figure 6E is a series of graphical plots.

[0050] Figure 6F is a series of images of tumor forming assays and related graphical plots.

[0051] Figures 7A-7B relate to reprogrammed chromatin across distant metastatic subclones.

[0052] Figure 7A is a series of western blot images.

[0053] Figure 7B is a series of western blot images.

[0054] Figures 8A-8E relate to specificity of reprogrammed histone modifications.

[0055] Figure 8A is a series of immunohistochemical stains.

[0056] Figure 8B is a graphical representation of data.

[0057] Figure 8C is a table.

[0058] Figure 8D is a series of western blot images.

[0059] Figure 8E is a series of western blot images.

[0060] Figure 9 is a series of graphical plots relating to enrichment of heterochromatin modifications within LOCKs.

[0061] Figures 10A-10B relate to reprogramming of H3K9Me3 in LOCKs during PDAC subclonal evolution.

[0062] Figure 10A is a graphical representation of data.

[0063] Figure 10B is a graphical representation of data.

[0064] Figures 11A-11B relate to local reprogramming of DE gene loci within LOCKs.

[0065] Figure 11A is a series of graphical representations of data.

[0066] Figure 11B is a series of graphical representations of data.

[0067] Figure 12 is a series of graphical plots relating to enrichment of euchromatin modifications within ECDs.

[0068] Figures 13A-13E relate to reprogramming of large LOCKs during PDAC evolution.

[0069] Figure 13A is a graphical representation of data.

[0070] Figure 13B is a graphical representation of data.

[0071] Figure 13C is a graphical representation of data.

[0072] Figure 13D is a graphical representation of data.

[0073] Figure 13E is a graphical representation of data.

[0074] Figures 14A-14F relate to malignant heterogeneity between A38 subclones.

[0075] Figure 14A is a graphical representation of data.

[0076] Figure 14B is a series of graphical representations of data.

[0077] Figure 14C is a series of immunohistochemical stains.

[0078] Figure 14D is a series of immunohistochemical stains.

[0079] Figure 14E is a series of western blot images.

[0080] Figure 14F is a series of images of tumor forming assays and related graphical plots.

[0081] Figures 15A-15C relate to rearrangements targeted to Large LOCKs and ECDs.

[0082] Figure 15A is a series of graphical representations of data.

[0083] Figure 15B is a series of graphical representations of data.

[0084] Figure 15C is a series of graphical representations of data.

[0085] Figures 16A-16B relate to enhanced glucose metabolism with depleted 6PG levels across distant metastases.

[0086] Figure 16A is a series of graphical representations of data.

[0087] Figure 16B is a series of graphical representations of data.

[0088] Figures 17A-17C relate to 6AN targeting of glucose metabolism and the PGD step of the PPP.

[0089] Figure 17A is a series of graphical representations of data.

[0090] Figure 17B is a series of graphical representations of data.

[0091] Figure 17C is a series of graphical representations of data.

[0092] Figures 18A-18C relate to selective modulation by 6AN of the reprogrammed chromatin state of distant metastatic subclones.

[0093] Figure 18A is a series of western blot images.

[0094] Figure 18B is a series of western blot images.

[0095] Figure 18C is a series of western blot images.

[0096] Figures 19A-19D relate to 6AN regulated gene expression in LOCK-EI regions.

[0097] Figure 19A is a series of graphical representations of data.

[0098] Figure 19B is a series of graphical representations of data.

[0099] Figure 19C is a series of graphical representations of data.

[00100] Figure 19D is a series of graphical representations of data.

[00101] Figures 20A-20C relate to selective blocking of tumor formation by 6AN in distant metastatic subclones.

[00102] Figure 20A is a series of images of tumor forming assays and related graphical plots.

[00103] Figure 20B is a series of images of tumor forming assays and related graphical plots.

[00104] Figure 20C is a series of images of tumor forming assays and related graphical plots.

[00105] Figures 21A-21C relate to reprogramming of the TOP2B locus in response to 6AN.

[00106] Figure 21A is a series of graphical representations of data.

[00107] Figure 21B is a series of graphical representations of data.

[00108] Figure 21C is a series of graphical representations of data.

[00109] Figure 22 is a screen shot illustrating data directly linking loss of large-scale heterochromatic regions as described herein to increased variability of gene expression, allowing for increased phenotypic plasticity. An example is the gene SHC4, which is involved in ERK signaling and tumor invasion and metastasis. Its expression variability statistical index measured by single cell RNA experiments is +1.12 in the A38-5 (epigenomically altered, distant metastatic) line in the paper, and -2.68 in the corresponding A38-41 (epigenomically stable, locally invasive) line, with a FDR p value of 0.00. Figure 22 represents the data showing the loss of LOCKs over the gene in A38-5.

DETAILED DESCRIPTION OF THE INVENTION

[00110] The present invention is based on the seminal discovery that a prometastatic candidate is epigenomic regulation. The invention is based on discovery of the role of large-scale epigenomic changes during PDAC subclonal evolution and distant metastasis, especially within heterochromatin domains including large organized heterochromatin lysine (K)-9 modified domains (LOCKs) and large DNA hypomethylated blocks. These regions represent selectable targets for large-scale epigenetic reprogramming, since they occupy over half of the genome, partially overlap with one another, and are found in many human cancers including PDAC. The inventors therefore hypothesized that epigenomic dysregulation within these regions could be a major selective force for tumor progression, given the lack of any consistent metastasis-specific driver mutations.

[00111] Before the present systems and methods are described, it is to be understood that this invention is not limited to particular systems, methods, and experimental conditions described, as such systems, methods, and conditions may vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only in the appended claims.

[00112] As used in this specification and the appended claims, the singular forms "a", "an", and "the" include plural references unless the context clearly dictates otherwise. Thus, for example, reference to "the method" includes one or more methods, and/or steps of the type described herein which will become apparent to those persons skilled in the art upon reading this disclosure and so forth.

[00113] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred methods and materials are now described.

[00114] The present invention provides a method of identifying targets for epigenetic reprogramming comprising detecting large organized heterochromatin lysine (K)-9 modified domains (LOCKs) and large DNA hypomethylated blocks in a sample containing DNA from a subject having cancer. For example, the method applies to a subject that has or is at risk of having PDAC and/or metastasis thereof. In one aspect, the detection comprises analysis of H3K9Me2/3 and/or H4K20Me3.

[00115] As used herein, reprogramming is intended to refer to a process that alters or reverses the differentiation status of a somatic cell that is either partially or terminally differentiated. Reprogramming of a somatic cell may be a partial or complete reversion of the differentiation status of the somatic cell. In an exemplary aspect, reprogramming is complete wherein a somatic cell is reprogrammed into an iPS cell. However, reprogramming may be partial, such as reversion into any less differentiated state, for example, reverting a terminally differentiated cell into a cell of a less differentiated state, such as a multipotent cell.

[00116] As used herein, pluripotent cells include cells that have the potential to divide in vitro for an extended period of time (greater than one year) and have the unique ability to differentiate into cells derived from all three embryonic germ layers, namely endoderm, mesoderm and ectoderm.

[00117] Somatic cells for use with the present invention may be primary cells or immortalized cells. Such cells may be primary cells (non-immortalized cells), such as those freshly isolated from an animal, or may be derived from a cell line (immortalized cells). In an exemplary aspect, the somatic cells are mammalian cells, such as, for example, human cells or mouse cells. They may be obtained by well-known methods, from different organs, such as, but not limited to skin, brain, lung, pancreas, liver, spleen, stomach, intestine, heart, reproductive organs, bladder, kidney, urethra and other urinary organs, or generally from any organ or tissue containing living somatic cells, or from blood cells. Mammalian somatic cells useful in the present invention include, by way of example, adult stem cells, Sertoli cells, endothelial cells, granulosa epithelial cells, neurons, pancreatic islet cells, epidermal cells, epithelial cells, hepatocytes, hair follicle cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T lymphocytes), erythrocytes, macrophages, monocytes, mononuclear cells, fibroblasts, cardiac muscle cells, other known muscle cells, and generally any live somatic cells. In particular embodiments, fibroblasts are used. The term somatic cell, as used herein, is also intended to include adult stem cells. An adult stem cell is a cell that is capable of giving rise to all cell types of a particular tissue. Exemplary adult stem cells include hematopoietic stem cells, neural stem cells, and mesenchymal stem cells.

[00118] As discussed herein, alterations in methylation patterns occur during differentiation or dedifferentiation of a cell which work to regulate gene expression of critical factors that are 'turned on' or 'turned off' at various stages of differentiation. As such, one of skill in the art would appreciate that many types of agents are capable of altering the methylation status of one or more nucleic acid sequences of a somatic cell to induce pluripotency that may be suitable for use with the present invention.

[00119] An agent, as used herein, is intended to include any agent capable of altering the methylation status of one or more nucleic acid sequences of a somatic cell. For example, an agent useful in any of the methods of the invention may be any type of molecule, for example, a polynucleotide, a peptide, a peptidomimetic, peptoids such as vinylogous peptoids, chemical compounds, such as organic molecules or small organic molecules, or the like. In various aspects, the agent may be a polynucleotide, such as a DNA molecule, an antisense oligonucleotide or an RNA molecule, such as microRNA, dsRNA, siRNA, stRNA, and shRNA.

[00120] MicroRNAs (miRNAs) are single-stranded RNA molecules whose expression is known to be regulated by methylation and which play a key role in regulation of gene expression during differentiation and dedifferentiation of cells. Thus an agent may be one that inhibits or induces expression of miRNA or may be a mimic miRNA. As used herein, "mimic" microRNAs are intended to mean microRNAs exogenously introduced into a cell that have the same or substantially the same function as their endogenous counterparts.

[00121] In various aspects of the present invention, an agent that alters the methylation status of one or more nucleic acid sequences is a nuclear reprogramming factor. Nuclear reprogramming factors may be genes that induce pluripotency and are utilized to reprogram differentiated or semi-differentiated cells to a phenotype that is more primitive than that of the initial cell, such as the phenotype of a pluripotent stem cell. Those skilled in the art would understand that such genes and agents are capable of generating a pluripotent stem cell from a somatic cell upon expression of one or more such genes having been integrated into the genome of the somatic cell or upon contact of the somatic cell with the agent or expression product of the gene. As used herein, a gene that induces pluripotency is intended to refer to a gene that is associated with pluripotency and capable of generating a less differentiated cell, such as a pluripotent stem cell from a somatic cell upon integration and expression of the gene. The expression of a pluripotency gene is typically restricted to pluripotent stem cells, and is crucial for the functional identity of pluripotent stem cells.

[00122] Several genes have been found to be associated with pluripotency and suitable for use with the present invention as reprogramming factors. Such genes are known in the art and include, by way of example, SOX family genes (SOX1, SOX2, SOX3, SOX15, SOX18), KLF family genes (KLF1, KLF2, KLF4, KLF5), MYC family genes (C-MYC, L-MYC, N-MYC), SALL4, OCT4, NANOG, LIN28, STELLA, NOBOX, POU5F1 or a STAT family gene. STAT family members may include for example STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6. While in some instances, use of only one gene to induce pluripotency may be possible, in general, expression of more than one gene is required to induce pluripotency. For example, two, three, four or more genes may be simultaneously integrated into the somatic cell genome as a polycistronic construct to allow simultaneous expression of such genes. In an exemplary aspect, four genes are utilized to induce pluripotency including OCT4, POU5F1, SOX2, KLF4 and C-MYC. Additional genes known as reprogramming factors suitable for use with the present invention are disclosed in U.S. Patent Application No. 10/997,146 and U.S. Patent Application No. 12/289,873, incorporated herein by reference.

[00123] All of these genes commonly exist in mammals, including human, and thus homologues from any mammals may be used in the present invention, such as genes derived from mammals including, but not limited to mouse, rat, bovine, ovine, horse, and ape. Further, in addition to wild-type gene products, mutant gene products including substitution, insertion, and/or deletion of several (e.g., 1 to 10, 1 to 6, 1 to 4, 1 to 3, and 1 or 2) amino acids and having similar function to that of the wild-type gene products can also be used. Furthermore, the combinations of factors are not limited to the use of wild-type genes or gene products. For example, Myc chimeras or other Myc variants can be used instead of wild-type Myc.

[00124] The present invention is not limited to any particular combination of nuclear reprogramming factors. As discussed herein a nuclear reprogramming factor may comprise one or more gene products. The nuclear reprogramming factor may also comprise a combination of gene products as discussed herein. Each nuclear reprogramming factor may be used alone or in combination with other nuclear reprogramming factors as disclosed herein. Further, nuclear reprogramming factors of the present invention can be identified by screening methods, for example, as discussed in U.S. Patent Application No. 10/997,146, incorporated herein by reference. Additionally, the nuclear reprogramming factor of the present invention may contain one or more factors relating to differentiation, development, proliferation or the like and factors having other physiological activities, as well as other gene products which can function as a nuclear reprogramming factor.

[00125] The nuclear reprogramming factor may include a protein or peptide. The protein may be produced from a gene as discussed herein, or alternatively, in the form of a fusion gene product of the protein with another protein, peptide or the like. The protein or peptide may be a fluorescent protein and/or a fusion protein. For example, a fusion protein with green fluorescent protein (GFP) or a fusion gene product with a peptide such as a histidine tag can also be used. Further, by preparing and using a fusion protein with the TAT peptide derived from the virus HIV, intracellular uptake of the nuclear reprogramming factor through cell membranes can be promoted, thereby enabling induction of reprogramming only by adding the fusion protein to a medium, thus avoiding complicated operations such as gene transduction. Since preparation methods of such fusion gene products are well known to those skilled in the art, skilled artisans can easily design and prepare an appropriate fusion gene product depending on the purpose.

[00126] In certain embodiments, the agent alters the methylation status of one or more nucleic acid sequences, such as any gene listed in a Table set forth herein.

[00127] Expression profiling of reprogrammed somatic cells to assess their pluripotency characteristics may also be conducted. Expression of individual genes associated with pluripotency may also be examined. Additionally, expression of embryonic stem cell surface markers may be analyzed. As used herein, "expression" refers to the production of a material or substance as well as the level or amount of production of a material or substance. Thus, determining the expression of a specific marker refers to detecting either the relative or absolute amount of the marker that is expressed or simply detecting the presence or absence of the marker. As used herein, "marker" refers to any molecule that can be observed or detected. For example, a marker can include, but is not limited to, a nucleic acid, such as a transcript of a specific gene, a polypeptide product of a gene, a non-gene product polypeptide, a glycoprotein, a carbohydrate, a glycolipid, a lipid, a lipoprotein or a small molecule.

[00128] Detection and analysis of a variety of genes known in the art to be associated with pluripotent stem cells may include analysis of genes such as, but not limited to OCT4, NANOG, SALL4, SSEA-1, SSEA-3, SSEA-4, TRA-1-60, TRA-1-81, or a combination thereof. iPS cells may express any number of pluripotent cell markers, including: alkaline phosphatase (AP); ABCG2; stage specific embryonic antigen-1 (SSEA-1); SSEA-3; SSEA-4; TRA-1-60; TRA-1-81; Tra-2-49/6E; ERas/ECAT5; E-cadherin; β-III-tubulin; γ-smooth muscle actin (γ-SMA); fibroblast growth factor 4 (Fgf4); Cripto; Dax1; zinc finger protein 296 (Zfp296); N-acetyltransferase-1 (Nat1); ES cell associated transcript 1 (ECAT1); ESG1/DPPA5/ECAT2; ECAT3; ECAT6; ECAT7; ECAT8; ECAT9; ECAT10; ECAT15-1; ECAT15-2; Fthl17; Sall4; undifferentiated embryonic cell transcription factor (Utf1); Rex1; p53; G3PDH; telomerase, including TERT; silent X chromosome genes; Dnmt3a; Dnmt3b; TRIM28; F-box containing protein 15 (Fbx15); Nanog/ECAT4; Oct3/4; Sox2; Klf4; c-Myc; Esrrb; TDGF1; GABRB3; Zfp42; FoxD3; GDF3; CYP25A1; developmental pluripotency-associated 2 (DPPA2); T-cell lymphoma breakpoint 1 (Tcl1); DPPA3/Stella; DPPA4; as well as other general markers for pluripotency, for example any genes used during induction to reprogram the cell. iPS cells can also be characterized by the down-regulation of markers characteristic of the differentiated cell from which the iPS cell is induced.

[00129] As used herein, "differentiation" refers to a change that occurs in cells to cause those cells to assume certain specialized functions and to lose the ability to change into certain other specialized functional units. Cells capable of differentiation may be any of totipotent, pluripotent or multipotent cells. Differentiation may be partial or complete with respect to mature adult cells.

[00130] "Differentiated cell" refers to a non-embryonic, non-parthenogenetic or non-pluripotent cell that possesses a particular differentiated, i.e., non-embryonic, state. The three earliest differentiated cell types are endoderm, mesoderm, and ectoderm.

[00131] Pluripotency can also be confirmed by injecting the cells into a suitable animal, e.g., a SCID mouse, and observing the production of differentiated cells and tissues. Still another method of confirming pluripotency is using the subject pluripotent cells to generate chimeric animals and observing the contribution of the introduced cells to different cell types. Methods for producing chimeric animals are well known in the art and are described in U.S. Pat. No. 6,642,433, incorporated by reference herein.

[00132] Yet another method of confirming pluripotency is to observe cell differentiation into embryoid bodies and other differentiated cell types when cultured under conditions that favor differentiation (e.g., removal of fibroblast feeder layers).

[00133] In various aspects of the invention, methylation status is converted to an M value. As used herein, an M value can be a log ratio of intensities from total (Cy3) and McrBC-fractionated DNA (Cy5): positive and negative M values are quantitatively associated with methylated and unmethylated sites, respectively.

[00134] In various aspects of the invention, large hypomethylated blocks are identified. Hypomethylation is present when there is a measurable decrease in methylation. In some embodiments, a DNA block is hypomethylated when less than 50% of the methylation sites analyzed are not methylated. Methods for determining methylation states are provided herein and are known in the art. In some embodiments, methylation status is converted to an M value. As used herein, an M value can be a log ratio of intensities from total (Cy3) and McrBC-fractionated DNA (Cy5): positive and negative M values are quantitatively associated with methylated and unmethylated sites, respectively. M values are calculated as described in the Examples. In some embodiments, M values which range from -0.5 to 0.5 represent unmethylated sites as defined by the control probes, and values from 0.5 to 1.5 represent baseline levels of methylation.
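The M value ranges quoted above can be illustrated with a minimal Python sketch. The function name and labels are hypothetical, and because both stated ranges include 0.5, the sketch assumes that a value of exactly 0.5 is assigned to the baseline range; the text does not specify this.

```python
def classify_m_value(m):
    """Label an M value using the ranges quoted in the text.

    Assumption (not stated in the text): a value of exactly 0.5 is
    treated as baseline methylation rather than unmethylated.
    """
    if -0.5 <= m < 0.5:
        return "unmethylated"
    if 0.5 <= m <= 1.5:
        return "baseline methylation"
    # Values outside both ranges are not characterized in the text.
    return "outside stated ranges"


if __name__ == "__main__":
    for value in (-0.2, 0.5, 1.2, 2.4):
        print(value, classify_m_value(value))
```

Values outside both ranges are reported as such rather than interpreted, since the text gives no label for them.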

[00135] Numerous methods for analyzing methylation status of a gene are known in the art and can be used in the methods of the present invention to identify either hypomethylation or hypermethylation of the one or more DMRs. In various embodiments, the determining of methylation status in the methods of the invention is performed by one or more techniques selected from the group consisting of a nucleic acid amplification, polymerase chain reaction (PCR), methylation specific PCR, bisulfite pyrosequencing, single-strand conformation polymorphism (SSCP) analysis, restriction analysis, microarray technology, and proteomics. As illustrated in the Examples herein, analysis of methylation can be performed by bisulfite genomic sequencing. Bisulfite treatment modifies DNA, converting unmethylated, but not methylated, cytosines to uracil. Bisulfite treatment can be carried out using the METHYLEASY bisulfite modification kit (Human Genetic Signatures).

[00136] In some embodiments, bisulfite pyrosequencing, which is a sequencing-based analysis of DNA methylation that quantitatively measures multiple, consecutive CpG sites individually with high accuracy and reproducibility, may be used.

[00137] It will be recognized that depending on the site bound by the primer and the direction of extension from a primer, that the primers listed above can be used in different pairs. Furthermore, it will be recognized that additional primers can be identified within the hypomethylated blocks, especially primers that allow analysis of the same methylation sites as those analyzed with primers that correspond to the primers disclosed herein.

[00138] Altered methylation can be identified by identifying a detectable difference in methylation. For example, hypomethylation can be determined by identifying whether, after bisulfite treatment, a uracil or a cytosine is present at a particular location. If uracil is present after bisulfite treatment, then the residue is unmethylated. Hypomethylation is present when there is a measurable decrease in methylation.
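The uracil-versus-cytosine readout described above can be sketched in Python as follows. This is an illustrative simplification, not the analysis used in the Examples: the function name is hypothetical, the reference and the read are assumed to be already aligned and of equal length, and uracil is assumed to be read out as thymine after amplification.

```python
def call_cytosines(reference, converted_read):
    """Call methylation at each reference cytosine from a bisulfite read.

    Assumptions: both strings are pre-aligned and equal in length, and
    converted (unmethylated) cytosines appear as T (or U) in the read.
    """
    if len(reference) != len(converted_read):
        raise ValueError("reference and read must have the same length")
    calls = []
    pairs = zip(reference.upper(), converted_read.upper())
    for position, (ref_base, read_base) in enumerate(pairs):
        if ref_base != "C":
            continue  # only cytosines are informative
        if read_base == "C":
            calls.append((position, "methylated"))    # protected from conversion
        elif read_base in ("T", "U"):
            calls.append((position, "unmethylated"))  # converted by bisulfite
        else:
            calls.append((position, "ambiguous"))     # mismatch or sequencing error
    return calls


if __name__ == "__main__":
    print(call_cytosines("ACGTCG", "ACGTTG"))
```

Comparing such calls between a test sample and a control would then indicate whether methylation has measurably decreased.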

[00139] In an alternative embodiment, the method for analyzing methylation of a hypomethylated block can include amplification using a primer pair specific for methylated residues within a DMR. In these embodiments, selective hybridization or binding of at least one of the primers is dependent on the methylation state of the target DNA sequence (Herman et al., Proc. Natl. Acad. Sci. USA, 93:9821 (1996)). For example, the amplification reaction can be preceded by bisulfite treatment, and the primers can selectively hybridize to target sequences in a manner that is dependent on bisulfite treatment. For example, one primer can selectively bind to a target sequence only when one or more base of the target sequence is altered by bisulfite treatment, thereby being specific for an unmethylated target sequence.

[00140] Other methods are known in the art for determining methylation status of a hypomethylated block, including, but not limited to, array-based methylation analysis and Southern blot analysis.

[00141] Methods using an amplification reaction, for example methods above for detecting hypomethylation or hypermethylation of one or more hypomethylated blocks, can utilize a real-time detection amplification procedure. For example, the method can utilize molecular beacon technology (Tyagi et al., Nature Biotechnology, 14: 303 (1996)) or Taqman™ technology (Holland et al., Proc. Natl. Acad. Sci. USA, 88:7276 (1991)).

[00142] Also, methyl light (Trinh et al., Methods 25(4):456-62 (2001), incorporated herein in its entirety by reference), Methyl Heavy (Epigenomics, Berlin, Germany), or SNuPE (single nucleotide primer extension) (see, e.g., Watson et al., Genet Res. 75(3):269-74 (2000)) can be used in the methods of the present invention related to identifying altered methylation of DMRs.

[00143] As used herein, the term "selective hybridization" or "selectively hybridize" refers to hybridization under moderately stringent or highly stringent physiological conditions, which can distinguish related nucleotide sequences from unrelated nucleotide sequences.

[00144] As known in the art, in nucleic acid hybridization reactions, the conditions used to achieve a particular level of stringency will vary, depending on the nature of the nucleic acids being hybridized. For example, the length, degree of complementarity, nucleotide sequence composition (for example, relative GC:AT content), and nucleic acid type, for example, whether the oligonucleotide or the target nucleic acid sequence is DNA or RNA, can be considered in selecting hybridization conditions. An additional consideration is whether one of the nucleic acids is immobilized, for example, on a filter. Methods for selecting appropriate stringency conditions can be determined empirically or estimated using various formulas, and are well known in the art (see, e.g., Sambrook et al., supra, 1989).

[00145] An example of progressively higher stringency conditions is as follows: 2X SSC/0.1% SDS at about room temperature (hybridization conditions); 0.2X SSC/0.1% SDS at about room temperature (low stringency conditions); 0.2X SSC/0.1% SDS at about 42°C (moderate stringency conditions); and 0.1X SSC at about 68°C (high stringency conditions). Washing can be carried out using only one of these conditions, for example, high stringency conditions, or each of the conditions can be used, for example, for 10 to 15 minutes each, in the order listed above, repeating any or all of the steps listed.

[00146] The degree of methylation in the DNA associated with the DMRs being assessed may be measured by fluorescent in situ hybridization (FISH) by means of probes which identify and differentiate between genomic DNAs, associated with the DMRs being assessed, which exhibit different degrees of DNA methylation. FISH is described, for example, in de Capoa et al. (Cytometry 31:85-92, 1998), which is incorporated herein by reference. In this case, the biological sample will typically be any which contains sufficient whole cells or nuclei to perform short term culture. Usually, the sample will be a sample that contains 10 to 10,000, or, for example, 100 to 10,000, whole cells.

[00147] Additionally, as mentioned above, methyl light, methyl heavy, and array-based methylation analysis can be performed by using bisulfite-treated DNA that is then PCR-amplified, against microarrays of oligonucleotide target sequences with the various forms corresponding to unmethylated and methylated DNA.

[00148] The term "nucleic acid molecule" is used broadly herein to mean a sequence of deoxyribonucleotides or ribonucleotides that are linked together by a phosphodiester bond. As such, the term "nucleic acid molecule" is meant to include DNA and RNA, which can be single stranded or double stranded, as well as DNA/RNA hybrids. Furthermore, the term "nucleic acid molecule" as used herein includes naturally occurring nucleic acid molecules, which can be isolated from a cell, as well as synthetic molecules, which can be prepared, for example, by methods of chemical synthesis or by enzymatic methods such as by the polymerase chain reaction (PCR), and, in various embodiments, can contain nucleotide analogs or a backbone bond other than a phosphodiester bond.

[00149] The terms "polynucleotide" and "oligonucleotide" also are used herein to refer to nucleic acid molecules. Although no specific distinction from each other or from "nucleic acid molecule" is intended by the use of these terms, the term "polynucleotide" is used generally in reference to a nucleic acid molecule that encodes a polypeptide, or a peptide portion thereof, whereas the term "oligonucleotide" is used generally in reference to a nucleotide sequence useful as a probe, a PCR primer, an antisense molecule, or the like. Of course, it will be recognized that an "oligonucleotide" also can encode a peptide. As such, the different terms are used primarily for convenience of discussion.

[00150] A polynucleotide or oligonucleotide comprising naturally occurring nucleotides and phosphodiester bonds can be chemically synthesized or can be produced using recombinant DNA methods, using an appropriate polynucleotide as a template. In comparison, a polynucleotide comprising nucleotide analogs or covalent bonds other than phosphodiester bonds generally will be chemically synthesized, although an enzyme such as T7 polymerase can incorporate certain types of nucleotide analogs into a polynucleotide and, therefore, can be used to produce such a polynucleotide recombinantly from an appropriate template.

[00151] In another aspect, the present invention includes kits that are useful for carrying out the methods of the present invention. The components contained in the kit depend on a number of factors, including: the particular analytical technique used to detect methylation or measure the degree of methylation or a change in methylation, and the one or more hypomethylated blocks being assayed for.

[00152] Accordingly, the present invention provides a kit for determining a methylation status of one or more hypomethylated blocks of the invention.

[00153] To examine DNAm on a genome-wide scale, comprehensive high-throughput array-based relative methylation (CHARM) analysis, which is a microarray-based method agnostic to preconceptions about DNAm, including location relative to genes and CpG content, was carried out. The resulting quantitative measurements of DNAm, denoted with M, are log ratios of intensities from total (Cy3) and McrBC-fractionated DNA (Cy5): positive and negative M values are quantitatively associated with methylated and unmethylated sites, respectively. For each sample, ~4.6 million CpG sites across the genome of iPS cells, parental somatic cells and ES cells were analyzed using a custom-designed NimbleGen HD2 microarray, including all of the classically defined CpG islands as well as all nonrepetitive lower CpG density genomic regions of the genome. 4,500 control probes were included to standardize these M values so that unmethylated regions were associated, on average, with values of 0. CHARM is 100% specific at 90% sensitivity for known methylation marks identified by other methods (for example, in promoters) and includes the approximately half of the genome not identified by conventional region preselection. The CHARM results were also extensively corroborated by quantitative bisulfite pyrosequencing analysis.

[00154] In one aspect of the invention, methylation density is determined for a region of nucleic acid. Density may be used as an indication of a hypomethylated block region of DNA, for example. A density of about 0.2 to 0.7, about 0.3 to 0.7, 0.3 to 0.6 or 0.3 to 0.4, or 0.3, may be indicative of a hypomethylated block (the calculated DNA methylation density is the number of methylated CpGs divided by the total number of CpGs sequenced for each sample). Methods for determining methylation density are well known in the art. For example, a method for determining methylation density of target CpG islands has been established by Luo et al., Analytical Biochemistry, Vol. 387:2, 2009, pp. 143-149. In that method, a DNA microarray was prepared by spotting a set of PCR products amplified from bisulfite-converted sample DNAs. This method not only allows the quantitative analysis of regional methylation density of a set of given genes but could also provide information on methylation density for a large number of clinical samples, as well as use in the methods of the invention regarding iPS cell generation and detection. Other methods are well known in the art (e.g., Holemon et al., BioTechniques, 43:5, 2007, pp. 683-693).
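The density calculation defined above (methylated CpGs divided by total CpGs sequenced) can be sketched in Python. The function names are hypothetical, and the default bounds use the broadest range quoted in the text (about 0.2 to 0.7); the narrower ranges can be supplied as arguments.

```python
def methylation_density(methylated_cpgs, total_cpgs):
    """Return the number of methylated CpGs divided by total CpGs sequenced."""
    if total_cpgs <= 0:
        raise ValueError("total_cpgs must be positive")
    if not 0 <= methylated_cpgs <= total_cpgs:
        raise ValueError("methylated_cpgs must lie between 0 and total_cpgs")
    return methylated_cpgs / total_cpgs


def within_block_range(density, low=0.2, high=0.7):
    """Check a density against one of the ranges quoted in the text.

    The defaults are the broadest quoted range; treating the bounds as
    inclusive is an assumption of this sketch.
    """
    return low <= density <= high


if __name__ == "__main__":
    density = methylation_density(methylated_cpgs=42, total_cpgs=120)
    print(round(density, 3), within_block_range(density))
```

A density inside the chosen range is only indicative, in the wording of the text, and would normally be read alongside the other assays described herein.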

[00155] The present invention is described partly in terms of functional components and various processing steps. Such functional components and processing steps may be realized by any number of components, operations and techniques configured to perform the specified functions and achieve the various results. For example, the present invention may employ various biological samples, biomarkers, elements, materials, computers, data sources, storage systems and media, information gathering techniques and processes, data processing criteria, statistical analyses, regression analyses and the like, which may carry out a variety of functions. In addition, although the invention is described in the medical diagnosis context, the present invention may be practiced in conjunction with any number of applications, environments and data analyses; the systems described herein are merely exemplary applications for the invention.

[00156] Methods for analysis according to various aspects of the present invention may be implemented in any suitable manner, for example using a computer program operating on the computer system. An exemplary analysis system, according to various aspects of the present invention, may be implemented in conjunction with a computer system, for example a conventional computer system comprising a processor and a random access memory, such as a remotely-accessible application server, network server, personal computer or workstation. The computer system also suitably includes additional memory devices or information storage systems, such as a mass storage system and a user interface, for example a conventional monitor, keyboard and tracking device. The computer system may, however, comprise any suitable computer system and associated equipment and may be configured in any suitable manner. In one embodiment, the computer system comprises a stand-alone system. In another embodiment, the computer system is part of a network of computers including a server and a database.

[00157] The software required for receiving, processing, and analyzing biomarker information may be implemented in a single device or implemented in a plurality of devices. The software may be accessible via a network such that storage and processing of information takes place remotely with respect to users. The analysis system according to various aspects of the present invention and its various elements provide functions and operations to facilitate biomarker analysis, such as data gathering, processing, analysis, reporting and/or diagnosis. The present analysis system maintains information relating to methylation and samples and facilitates analysis and/or diagnosis. For example, in the present embodiment, the computer system executes the computer program, which may receive, store, search, analyze, and report information relating to the epigenome. The computer program may comprise multiple modules performing various functions or operations, such as a processing module for processing raw data and generating supplemental data and an analysis module for analyzing raw data and supplemental data to generate a disease status model and/or diagnosis information.

[00158] The procedures performed by the analysis system may comprise any suitable processes to facilitate analysis and/or disease diagnosis. In one embodiment, the analysis system is configured to establish a disease status model and/or determine disease status in a patient. Determining or identifying disease status may comprise generating any useful information regarding the condition of the patient relative to the disease, such as performing a diagnosis, providing information helpful to a diagnosis, assessing the stage or progress of a disease, identifying a condition that may indicate a susceptibility to the disease, identifying whether further tests may be recommended, predicting and/or assessing the efficacy of one or more treatment programs, or otherwise assessing the disease status, likelihood of disease, or other health aspect of the patient.

[00159] The analysis system may also provide various additional modules and/or individual functions. For example, the analysis system may also include a reporting function, for example to provide information relating to the processing and analysis functions. The analysis system may also provide various administrative and management functions, such as controlling access and performing other administrative functions.

[00160] The analysis system suitably generates a disease status model and/or provides a diagnosis for a patient based on raw biomarker data and/or additional subject data relating to the subjects. The data may be acquired from any suitable biological samples.

[00161] [00162] The following examples are provided to further illustrate the advantages and features of the present invention, but are not intended to limit the scope of the invention. While they are typical of those that might be used, other procedures, methodologies, or techniques known to those skilled in the art may alternatively be used.

EXAMPLE 1

LARGE-SCALE EPIGENOMIC REPROGRAMMING LINKS ANABOLIC GLUCOSE METABOLISM TO DISTANT METASTASIS DURING THE EVOLUTION OF PANCREATIC CANCER PROGRESSION

[00163] BACKGROUND AND HYPOTHESIS

[00164] As discussed previously, one prometastatic candidate is epigenomic regulation. In particular, we wished to investigate the role of large-scale epigenomic changes during PDAC subclonal evolution and distant metastasis, especially within heterochromatin domains including large organized heterochromatin lysine (K)-9 modified domains (LOCKs) and large DNA hypomethylated blocks. These regions could represent selectable targets for large-scale epigenetic reprogramming, since they occupy over half of the genome, partially overlap with one another, and are found in many human cancers including PDAC. The inventors therefore hypothesized that epigenomic dysregulation within these regions could be a major selective force for tumor progression, given the lack of any consistent metastasis-specific driver mutations.

[00165] RESULTS

[00166] Reprogramming of global epigenetic state during the evolution of distant metastasis

[00167] To test this hypothesis, we first determined whether large-scale changes in epigenetic modifications could be detected during PDAC evolution in patient samples in vivo. We previously collected matched primary and metastatic PDAC lesions from individual patients by rapid autopsy, and reported the genetic progression of subclonal evolution by whole exome and Sanger sequencing for mutations, paired-end sequencing for rearrangements, and whole genome sequencing in a subset of these samples. These samples represent a unique resource especially suited to study tumor evolution, since they were collected from matched primary and metastatic tumors from the same patient(s), each has been deep sequenced, individual subclones have been identified, and no metastasis-specific driver mutations are present. From these patients, we selected a large panel of diverse PDAC samples to test for global epigenomic reprogramming during subclonal evolution. As summarized in Table 1, these samples were chosen because they represented the diversity of PDAC evolution (different regions of primary tumor paired to peritoneal and distant metastases), each sample represented a sequence-verified (sub)clonal population, patients were both treated and untreated, driver mutations were shared by all subclones in each patient in the absence of metastasis-specific drivers, formalin-fixed tissue was available for immunoassays, frozen tissue was available for whole-genome bisulfite sequencing, and cell lines were available for all other experiments.

[00168] Table 1: PDAC sample characteristics with LOCK epigenetic changes

ARID1A

Sample (clonal origin) | Disease spread | Source | Driver mutations | Treatment | LOCK epigenetic changes
A132PrF (Founder Clone) | Distant metastases | Tissue: Primary Tumor | KRAS, CDKN2A, TP53, ATM | Untreated | K9Me2/3 IHC: Diffusely Positive
A132PrS (Subclone) | Distant metastases | Tissue: Primary Tumor | KRAS, CDKN2A, TP53, ATM | Untreated | K9Me2/3 IHC: Positive+Negative
A132Lv (Metastasis) | Distant metastases | Tissue: Liver | KRAS, CDKN2A, TP53, ATM | Untreated | K9Me2/3 IHC: Positive+Negative
A38PrF (Founder Clone) | Local-regional + Distant metastases | Tissue: Primary Tumor | KRAS, TP53, SMAD4 | Gem., Bev. | K9Me2/3 IHC: Diffusely Positive
A38PrS1 (Peritoneal Precursor Subclone) | Local-regional + Distant metastases | Tissue: Primary Tumor | KRAS, TP53, SMAD4 | Gem., Bev. | K9Me2/3 IHC: Diffusely Positive
A38PrS2 (Liver/Lung Precursor Subclone) | Local-regional + Distant metastases | Tissue: Primary Tumor | KRAS, TP53, SMAD4, SMARCA2^c | Gem., Bev. | K9Me2/3 IHC: Positive+Negative
A38Lg1 (Metastasis) | Local-regional + Distant metastases | Tissue: Lung | KRAS, TP53, SMAD4 | Gem., Bev. | K9Me2/3 IHC: Diffusely Negative
A38Per (Metastasis) | Local-regional + Distant metastases | Cell Line: Peritoneum | KRAS, TP53, SMAD4 | Gem., Bev. | K9Me2 Western Blot: High (100%, cont.); K9Me2 ChIP-seq: High (100%, cont.); WGBS: High (79%, cont.)
AsPC1^d (N/A) | Local-regional | Cell Line: Ascites | KRAS, CDKN2A, TP53, SMAD4 | Gem. | K9Me2 Western Blot: High (102%)
HPAFII^d (N/A) | Local-regional | Cell Line: Ascites | KRAS, TP53 | Gem. | K9Me2 Western Blot: High (94%)
Capan2^d (N/A) | Local-regional | Cell Line: Primary Tumor | KRAS, CDKN2A, TP53 | Gem. | K9Me2 Western Blot: High (93%)
A2Lg (Metastasis) | Local-regional + Distant metastases | Cell Line: Lung | KRAS, TP53 | Taxoprex., Gem. | K9Me2 Western Blot: Reduced (40%)
A2Lv (Metastasis) | Local-regional + Distant metastases | Cell Line: Liver | KRAS, TP53 | Taxoprex., Gem. | K9Me2 Western Blot: Reduced (50%)
A6Lv (Metastasis) | Local-regional + Distant metastases | Cell Line: Liver | KRAS, TP53, MLL3 | Gem., Trox. | K9Me2 Western Blot: Reduced (47%)
A10Lv (Metastasis) | Distant metastases | Cell Line: Liver | KRAS, TP53, MLL3 | Untreated | K9Me2 Western Blot: Reduced (61%)
A13Pr1 (Subclone) | Distant metastases | Cell Line: Primary Tumor | KRAS, CDKN2A, MYC, TP53 | Untreated | K9Me2 Western Blot: Reduced (64%); K9Me2 ChIP-seq: Reduced (25%); WGBS: Reduced (72%)
A13Pr2 (Subclone) | Distant metastases | Cell Line: Primary Tumor | KRAS, CDKN2A, MYC, TP53 | Untreated | K9Me2 Western Blot: Reduced (51%); K9Me2 ChIP-seq: Reduced (81%); WGBS: Reduced (73%)
A13Lg (Metastasis) | Distant metastases | Cell Line: Lung | KRAS, CDKN2A, MYC, TP53 | Untreated | K9Me2/3 Western Blot: Reduced (53%); K9Me2 ChIP-seq: Reduced (86%); WGBS: Reduced (71%)
A320 (Metastasis) | Local-regional + Distant metastases | Cell Line: Omentum^e | KRAS, TP53 | 5-FU | K9Me2 Western Blot: Reduced (58%)
A38Lv (Metastasis) | Local-regional + Distant metastases | Cell Line: Liver | KRAS, TP53, SMAD4 | Gem., Bev. | K9Me2 Western Blot: Reduced (47%); WGBS: Reduced (67%)
A38Lg (Metastasis) | Local-regional + Distant metastases | Cell Line: Lung | KRAS, TP53, SMAD4 | Gem., Bev. | K9Me2 Western Blot: Reduced (58%); K9Me2 ChIP-seq: Reduced (48%); WGBS: Reduced (72%)

[00169] Table 1: Summary of H3K9 and DNA methylation changes across tissue and cell line samples. Samples from patients with local-regional spread (peritoneal/ascites) showed relatively high global H3K9/DNA methylation as indicated by multiple assays (right two columns), while samples from patients with distant metastases showed reduced methylation across all assays, which initiated in primary tumors as indicated.

[00170] Abbreviations: Gem. (Gemcitabine), Bev. (Bevacizumab), Taxoprex. (Taxoprexin), Trox. (Troxacitabine).

[00171] Superscript Notes

[00172] a Clonal origins represent phylogenetic estimates from previously published (ref. 1) and other unpublished (text footnote 1) whole-genome sequencing data.

[00173] b Western blot data reflect densitometry percentages of H3K9Me2 signals relative to A38Per controls (cont.). Western blots are shown in Supplementary Fig. 1 and the absolute densitometry values are shown in Fig. 1g with p-values included in the figure legend. ChIP-seq data reflect percent of LOCK Mb with reduced H3K9Me2 relative to A38Per controls (cont.), as detailed with Mb and RPKM values in Supplementary Data File 2. WGBS data reflect percent of DNA methylation within LOCKs relative to A124Pr controls (tissues) and A38Per controls (cell lines), as detailed in Supplementary Figs 1 and 2.

[00174] c This metastasis from a chemotherapy-treated patient had a missense mutation in SMARCA2 of unclear significance.

[00175] d These cell lines were not from the rapid autopsy cohort and rely on previously published genotyping data which may underestimate the driver mutations.

[00176] e The A320 cell line was isolated from an omental mass lesion in a patient with very aggressive disease including widespread lung metastases, and showed findings similar to the other distant (lung/liver) metastatic subclones.

[00177] We began our analysis with the formalin-fixed tissue samples (totaling 16 uniquely matched, sequence-verified tumor sections, Table 1). We performed immunostains against heterochromatin modifications (e.g. H3K9Me2/3) in order to detect global changes from heterochromatin domains (including LOCKs as defined in Wen et al. (Large histone H3 lysine 9 dimethylated chromatin blocks distinguish differentiated from embryonic stem cells. Nat Genet 41, 246-50 (2009)), and McDonald et al. (Genome-scale epigenetic reprogramming during epithelial-to-mesenchymal transition. Nat Struct Mol Biol 18, 867-74 (2011))) that might be selectable targets during subclonal evolution. Immunostains revealed diffusely positive (>80%) H3K9Me2/3 staining of PDAC cell nuclei across both primary tumor and peritoneal metastatic subclones from patients who presented with peritoneal carcinomatosis (Fig. 1a, b and Table 1). In contrast, samples from patients who presented with distant metastatic disease displayed progressive loss of H3K9Me2/3 during subclonal evolution. This manifested as heterogeneous (mixtures of positive + negative PDAC nuclei) staining in primary tumors followed by either diffusely negative (<20%) staining or retention of heterogeneous staining in the paired metastases (Fig. 1c, d and Table 1). We also observed similar results during subclonal evolution from a patient for which sequence-verified primary tumor subclones that seeded both peritoneal and distant metastases were available. The peritoneal precursor retained diffusely strong staining of heterochromatin modifications as seen in the clone that founded the neoplasm (Fig. 1e, top two panels). In contrast, cell-to-cell heterogeneity of staining patterns emerged in the primary tumor subclone that seeded distant metastasis, followed by diffuse loss of staining at the distant metastatic site (Fig. 1e, bottom two panels). The collective findings across samples from patients with distant metastatic disease thus suggested that reprogramming initiated during subclonal evolution in the primary tumor, and that these changes were inherited or even accentuated in subclones that formed tumors at the distant metastatic sites themselves.
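The staining categories used above can be expressed as a small Python sketch. It assumes, as one reading of the text, that the >80% and <20% cut-offs refer to the fraction of PDAC nuclei staining positive and that anything in between is scored as heterogeneous; the function name and the counting inputs are hypothetical.

```python
def classify_staining(positive_nuclei, total_nuclei):
    """Assign an IHC category from the fraction of positive nuclei.

    Assumed cut-offs: >80% positive is diffusely positive, <20% positive
    is diffusely negative, and everything in between is heterogeneous.
    """
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    if not 0 <= positive_nuclei <= total_nuclei:
        raise ValueError("positive_nuclei must lie between 0 and total_nuclei")
    fraction = positive_nuclei / total_nuclei
    if fraction > 0.80:
        return "diffusely positive"
    if fraction < 0.20:
        return "diffusely negative"
    return "heterogeneous (positive + negative)"


if __name__ == "__main__":
    for positive in (95, 50, 5):
        print(positive, classify_staining(positive, 100))
```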

[00178] To expand our analysis to more patient samples and test the generality of our findings, we employed twelve low-passaged cell lines collected from eight patients (Table 1), including a subset that corresponded to the patient tissues above. Cell lines were isolated from nine distant metastatic subclones, a peritoneal metastasis with paired liver and lung metastases that corresponded to the patient presented in Fig. 1e, and two (non-founder) primary tumor subclones matched to a lung metastasis collected from the same patient. Importantly, six of the nine distant metastatic cell lines were previously whole exome sequenced, and mutations present in the cell lines were also present in the corresponding patient tissues as detected by Sanger sequencing. Because rapid autopsy cell lines were largely isolated from patients who presented with distant metastases, additional PDAC samples from other sources of regional disease were also included: malignant ascites fluid from two patients with peritoneal carcinomatosis (AsPC1, HPAFII), and a primary tumor from a long-term survivor without distant metastases (Capan2).

[00179] We then examined whether global changes in chromatin as seen in the patient tissues were maintained in cell lines, which could reflect genome-wide reprogramming events. We began by performing western blots for eight histone modifications with well-understood functions. Comparison of local-regional PDAC samples (A38Per, AsPC1, HPAFII, Capan2) by western blots showed minimal or non-recurrent global changes across all histone modifications tested (Fig. 7a), similar to the evolution of peritoneal carcinomatosis observed in patient tissues. In contrast, distant metastases displayed striking reprogramming of methylation and acetylation that was targeted to specific histone residues (Fig. 7b), including between the paired peritoneal (A38Per) and distant metastatic (A38Lv, A38Lg) subclones that corresponded to the patient presented in Fig. 1e (Fig. 1f). This manifested as recurrent reductions in H3K9Me2/3 and H4K20Me3 that were coupled to hyper-acetylation of H3K9Ac and H3K27Ac in distant metastases (summarized in Fig. 1g and Table 1, shown in Fig. 7b). H3K9 methylation is critical for encoding heterochromatic epigenetic states over large genomic regions including LOCKs, and H3K27Ac encodes gene regulatory elements. Reprogramming appeared specific, as we observed no consistent/recurrent changes in H3K27Me3 or H3K36Me3 across samples (Fig. 1g, Fig. 7) and the reprogrammed modifications themselves were not dependent on proliferation rates and could not be induced by PDAC chemotherapy (Fig. 8). Finally, western blots on cell lines isolated from matched primary tumor (A13Pr1, A13Pr2) and distant metastatic (A13Lg) subclones collected from a patient who presented with widespread distant metastases in the absence of regional (peritoneal) spread also showed reductions in H3K9Me3 and H4K20Me3 between the primary tumor subclones, which were retained in the distant metastasis (Fig. 7c), further suggesting that reprogramming initiated in the primary tumor during the evolution of distant metastasis. Thus, in vivo tissue and in vitro cell culture findings across the collective 30 patient samples strongly suggested that global epigenetic state was reprogrammed during the evolution of distant metastasis.

[00180] The epigenomic landscape of PDAC subclonal evolution

[00181] We next wished to map the locations of reprogrammed chromatin modifications across the PDAC genome. To this end, we comprehensively mapped the epigenetic landscape of PDAC evolution with chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) for histone modifications with well-understood functions (heterochromatin: H3K9Me2, H3K9Me3, H3K27Me3; euchromatin: H3K27Ac, H3K36Me3). To capture the diversity of subclonal evolution and malignant progression, ChIP-seq was performed on sequence-verified cell lines isolated from matched subclones, including a peritoneal metastasis (A38Per) matched to a lung metastasis (A38Lg) from the same patient, and two primary tumor subclones (A13Pr1, A13Pr2) that were also matched to a lung metastasis (A13Lg) from the same patient. For each patient, all subclones shared identical driver gene mutations without acquisition of new metastasis-specific drivers (Table 1). We also performed RNA-seq in parallel to identify matched gene expression changes. Finally, we complemented these datasets with whole genome bisulfite sequencing (WGBS) across these cell lines and frozen tumor tissues that corresponded to a subset of the formalin-fixed tumor sections presented in Fig. 1. In all, we generated 183 datasets with 19.3 x 10^9 uniquely aligned sequencing reads, including >15.0 x 10^6 (median: 32.3 x 10^6) uniquely aligned reads for each ChIP experiment as recommended by ENCODE guidelines (Supplementary Data 1). Experiments were performed as biological replicates, with good correlation between replicates (median correlation coefficient: 0.956; range: 0.746-0.997, Supplementary Data 1). To our knowledge, this represents the first comprehensive genome-wide analysis of epigenetic reprogramming during the evolutionary progression of a human cancer.
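As an illustration only, the read-depth criterion described above (more than 15.0 x 10^6 uniquely aligned reads per ChIP experiment) could be checked with a short script such as the following. This is a minimal sketch and not the pipeline used for these experiments: the dataset names and read counts are hypothetical, and the function name `qc_summary` is an assumption introduced for this example.

```python
from statistics import median

# Per-experiment floor for uniquely aligned reads, taken from the text.
MIN_UNIQUE_READS = 15_000_000


def qc_summary(unique_reads):
    """Return the median read count and the datasets at or below the floor."""
    failing = sorted(
        name for name, count in unique_reads.items() if count <= MIN_UNIQUE_READS
    )
    return {"median_reads": median(unique_reads.values()), "failing": failing}


# Hypothetical dataset names and counts, for illustration only.
toy_reads = {
    "sampleA_H3K9Me2_rep1": 32_300_000,
    "sampleA_H3K9Me2_rep2": 41_000_000,
    "sampleB_H3K27Ac_rep1": 14_200_000,
}
summary = qc_summary(toy_reads)
print(summary)
```

In this toy input, only the third dataset falls below the floor and would be flagged for re-sequencing or exclusion.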
[00182] Because global chromatin modifications were stably inherited during the evolution of peritoneal carcinomatosis whereas reprogramming emerged during distant metastasis, we began by comparing the peritoneal subclone against the distant metastatic subclones and their matched primary tumor subclones. This analysis revealed a striking and unexpected degree of genome-wide epigenetic reprogramming that was targeted to thousands of chromatin domains covering >95% of the PDAC genome. These included heterochromatin regions corresponding to large organized chromatin lysine (K)-modified domains (LOCKs) that occupied approximately half the genome, gene-rich euchromatin domains (ECDs), and a smaller subset of very large LOCK domains that were uniquely reprogrammed compared to other heterochromatin regions. The collective domain characteristics are detailed in Supplementary Data 2, and comprehensive statistical analyses across experiments are presented in Supplementary Data 3.

[00183] We first analyzed heterochromatin domains defined by strong, broad enrichments of H3K9Me2 and H3K27Me3 with depletion of euchromatin modifications (Fig. 2a, Fig. 9, and Supplementary methods). Thousands of heterochromatin domains were detected in each sample (range: 2,008-3,166) and were organized into large block-like segments (median lengths: 232Kb-311Kb) that occupied more than half of the genome in each subclone (average: 61.7% of the genome, range: 54.1-71.7%, Supplementary Data 2). The domain calls were robust as determined by sensitivity analyses across multiple thresholds (Supplementary Data 3), and the called heterochromatin regions themselves overlapped significantly with previously reported LOCK heterochromatin domains (avg: 76.7 +/- 16.9% overlap; p<0.01 by permutation testing), suggesting that heterochromatin largely corresponded to LOCKs. Similar to immunostain (Fig. 1a-d) and western blot data (Fig. 1e, f), we detected strong H3K9Me2 enrichment across LOCKs in the peritoneal subclone, whereas these same regions displayed global reductions of H3K9Me2 in the distant metastases and their matched primary tumor subclones (Fig. 2a, average: 591Mb/1,470Mb; range: 204-1,110Mb/1,470Mb; p<2.2e-16 by chi square; Table 1 and Supplementary Data 3). In contrast, high global levels of H3K27Me3 were detected from these regions across all subclones (Fig. 2a), similar to the western blot findings (Fig. 1g). We also detected patient-specific patterns of global H3K9Me3 reprogramming in LOCKs (Fig. 10). In patient A38, H3K9Me3 was largely absent from A38Lg LOCKs relative to A38Per (loss of H3K9Me3 from 184/208Mb, 88.5%, p<2.2e-16). Loss of LOCK-wide H3K9Me3 was also detected between the paired primary tumor subclones in patient A13 (106Mb/118Mb, 90.0% in A13Pr2 vs. A13Pr1, p<2.2e-16), and similar to western blot findings (Supplementary Fig. 1c) this change was inherited in the distant metastatic subclone (152Mb/169Mb, 90.2% in A13Lg vs. A13Pr1, p<2.2e-16). Finally, localized reprogramming events were also detected specifically within chromatin encoding differentially expressed (DE) genes in LOCKs (Fig. 11, Supplementary Data 3). This included reciprocal changes in H3K27Ac and H3K9Me2 over promoters coupled to similar reciprocal changes in H3K36Me3 and H3K27Me3 over gene bodies of LOCK genes that were up- and down-regulated. This suggested that DE genes from LOCKs were situated in specific sub-regions that possessed gene regulatory potential, consistent with hybrid LOCK-euchromatin islands (LOCK-EIs). Collectively, these findings indicated that heterochromatin domains (LOCKs) represented a major target for both global and local chromatin reprogramming events during the evolution of distant metastasis.
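The two summary quantities used in this paragraph, the fraction of the genome occupied by called domains and the fraction of called domains overlapping previously reported LOCKs, can be illustrated with a simple interval calculation. The following is a minimal sketch under stated assumptions: it handles a single hypothetical chromosome, the coordinates are toy values rather than data from this study, and the function names are introduced for this example only. It does not reproduce the domain-calling or permutation-testing procedures described in the Supplementary methods.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base pairs


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bp(intervals: List[Interval]) -> int:
    """Total base pairs covered, counting overlapping regions once."""
    return sum(end - start for start, end in merge_intervals(intervals))


def overlap_bp(a: List[Interval], b: List[Interval]) -> int:
    """Base pairs shared by two interval sets (two-pointer sweep)."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


# Toy coordinates on one hypothetical 1 Mb chromosome (not study data).
chrom_length = 1_000_000
called_domains = [(0, 250_000), (200_000, 400_000), (600_000, 900_000)]
reference_locks = [(100_000, 350_000), (700_000, 1_000_000)]

genome_fraction = covered_bp(called_domains) / chrom_length
lock_overlap_fraction = overlap_bp(called_domains, reference_locks) / covered_bp(called_domains)
print(f"genome covered: {genome_fraction:.1%}; overlap with LOCKs: {lock_overlap_fraction:.1%}")
```

A genome-wide version would repeat the same calculation per chromosome and sum the base-pair totals before dividing.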

[00184] Because LOCKs correspond to a subset of block-like regions that are DNA hypomethylated in pancreatic and other human cancers, we also asked whether DNA methylation changes were targeted to LOCKs during PDAC subclonal evolution. For this analysis, we performed WGBS on all of the cell lines with ChIP-seq data reported above. Because frozen tissues corresponding to the cell lines had been exhausted during previous studies, we selected 7 other frozen tissue samples for in vivo WGBS that were uniquely matched to the same formalin-fixed tissues with IHC data presented in Fig. 1a (A124, local-regional spread) and Fig. 1c (A125, distant metastasis). Normal pancreas was included with frozen tissue samples as an internal control. All samples and results of WGBS with corresponding IHC, western blot, and ChIP-seq findings are summarized in Table 1, and quantified WGBS results with statistical analyses are presented in Tables 2 and 3. These experiments revealed significant reductions in LOCK-wide DNA methylation across cell lines isolated from distant metastases relative to peritoneal carcinomatosis (Fig. 2b, c, Table 3). These findings matched the reductions of H3K9Me2 in LOCKs detected by ChIP-seq on the same samples (Fig. 2a), revealing that reprogramming of DNA methylation in hypomethylated block regions is targeted to LOCKs with reprogrammed histone methylation (Table 1). Analysis of the same LOCK regions from the frozen tissue samples also revealed relatively high DNA methylation in LOCKs from patient A124 (peritoneal spread) and the founder clone from patient A125, while the primary tumor and distant metastatic subclone descendants displayed striking loss of DNA methylation that was even more pronounced than that seen in the cell lines (Fig. 2b, c, Table 1, and Table 2). We also detected strong, localized DNA hypomethylation from down-regulated DE genes in the hybrid LOCK-EI sub-regions, while up-regulated genes remained hypermethylated with sharp dips at the 5'-ends of genes, similar to H3K9Me2 (Fig. 11). Thus, DNA methylation was globally and locally reprogrammed across LOCKs from primary tumor and distant metastatic subclones, similar to histone modifications. Based on the collective immunostain (Fig. 1a-e), western blot (Fig. 1f, g), ChIP-seq (Fig. 2a), and WGBS (Fig. 2b, c) data (summarized in Table 1), we conclude that a substantial fraction of global reprogramming events was targeted to heterochromatin domains (LOCKs) during the evolution of distant metastasis.

[00185] Table 2. Percent CpG methylation levels across LOCK domains detected in frozen tissue samples by WGBS. DNA methylation levels were relatively high in both primary tumor and metastatic tumors from patient A124, who presented with peritoneal carcinomatosis. Similar high levels of DNA methylation were also detected in the founder clone from patient A125, which were significantly reduced in the primary tumor subclone that seeded distant metastases and in the liver metastases themselves. P-values were calculated with paired Wilcoxon tests using a 3% threshold.

[00186] Table 3. Percent CpG methylation levels across LOCK domains detected in cell lines by WGBS. DNA methylation levels were highest for the peritoneal subclone A38Per across cell lines. Methylation was significantly reduced in distant metastases from the same patient (A38Lv, A38Lg) and in primary tumor precursors (A13Pr1/2) and the matched lung metastasis from patient A13. P-values were calculated with paired Wilcoxon tests using a 3% threshold.

[00187] We next analyzed reprogramming within ECDs, which were defined by enrichments for global euchromatin modifications H3K27Ac and H3K36Me3 with depletion of heterochromatin modifications (Fig. 12). Similar to heterochromatin, thousands of these domains (range: 1,935-2,318) were partitioned into large, block-like segments (median lengths: 207Kb-277Kb) that occupied similar lengths of the genome across subclones (average: 29% of the genome; range: 23.5%-32.0%, Supplementary Data 2). All subclones displayed similar global patterns of modifications within ECDs, including broad H3K36Me3 signals over gene bodies that were flanked by sharp peaks of H3K27Ac and dips in DNA methylation at gene regulatory elements, consistent with actively transcribed euchromatin (Fig. 2d). However, mapping DE genes from RNA-seq data to ECDs (Supplementary Data 3) identified clear patterns of local reprogramming events within chromatin encoding these genes (Fig. 2e, Supplementary Data 3). Genes up-regulated from ECDs acquired increased levels of both H3K36Me3 and H3K27Ac, which could reflect a permissive chromatin state or hyperactive transcription. In contrast, down-regulated genes displayed greatly reduced H3K36Me3 with relatively minor reductions of H3K27Ac, which could reflect an inactive yet poised chromatin state or direct transcriptional repression. Unlike LOCKs, DNA methylation remained stable around DE genes in ECDs (data not shown). Thus, reprogramming in ECDs was largely localized and targeted to H3K27Ac and H3K36Me3 in chromatin encoding DE genes.

[00188] Finally, we also detected patient-specific reprogramming targeted to a unique subset of very large LOCK domains. Although these regions were situated within DNA hypomethylated blocks similar to other LOCKs, they differed in several other respects. First, these domains were substantially larger (median lengths: 730Kb-1,340Kb vs. 232-311Kb for other LOCKs, Supplementary Data 2). Second, they were strongly enriched with H3K9Me3 yet depleted of H3K9Me2/H3K27Me3 (Fig. 13 and Supplementary Data 2). Third, their abundance was patient-specific: subclones from patient A13 possessed very few of these domains (range: 50-111 domains covering 1.4-3.5% of the genome) while they occupied a much higher fraction of the A38 genome (range: 226-344 domains covering 14.5-20.6% of the genome, Fig. 13a and Supplementary Data 2). Finally, unlike reprogramming changes detected in other LOCKs (loss of H3K9Me2/3 and DNA methylation), reprogramming in these LOCKs was characterized by loss of H3K9Me3 coupled to increased H3K9Me2 and DNA methylation (Fig. 13b-e). Although the functional significance of these findings is uncertain, they could hold implications for patterns of genome instability that emerged during subclonal evolution, as outlined below.

[00189] Reprogrammed chromatin domains specify malignant heterogeneity

[00190] Subclonal evolution may generate significant phenotypic heterogeneity within an individual patient, and we have hypothesized that such diversity could be encoded by large-scale epigenetic changes similar to those detected above. We therefore wished to investigate in depth whether reprogrammed chromatin domains might encode heterogeneous malignant properties between PDAC subclones from the same patient. To this end, we selected matched peritoneal and lung metastasis subclones from the same patient (A38Per and A38Lg), performed gene ontology (GO) analyses on reprogrammed LOCK and ECD genes that were differentially expressed between the subclones (Tables 4-7, derived from reprogrammed genes as shown in Fig. 2e and Fig. 11), and then tested whether GO results matched actual phenotypic differences measured by experimental assays. This analysis revealed that reprogrammed LOCKs and ECDs encoded substantial phenotypic differences that emerged during subclonal evolution, as described below.

[00191] Table 4: GO analysis of DE genes that were up-regulated from reprogrammed LOCKs in A38Lg, relative to A38Per. Genes involved in redox balance (oxidation-reduction, NADP) and EMT (cell adhesion, migration) were up-regulated from reprogrammed DE genes in LOCKs (detailed in Supplementary Data 3).

GO Terms                 # of Genes    % of Genes    P-value
Oxidation-reduction      64            6.0           3.9e-6
Oxidoreductase           56            5.2           4.7e-6
EGF-like domain          28            2.6           6.4e-5
Transferase              105           9.8           1.0e-4
NADP                     21            2.0           1.7e-4
Cell Adhesion            41            3.8           1.8e-4
Cell Migration           31            2.9           2.7e-4
Cell Morphogenesis       29            2.7           3.8e-4
Mitochondrion            67            6.3           4.0e-4
Acetylation              174           16.3          5.0e-4

[00192] Table 5: GO analysis of DE genes that were down-regulated from reprogrammed LOCKs in A38Lg, relative to A38Per. Genes involved in differentiation state (cell adhesion, development, epithelial genes), immune regulation (immune response, cytokines, inflammation), and response to environmental cues (transmembrane signaling, extracellular matrix, secretion, locomotion) were down-regulated from reprogrammed LOCKs (Detailed in Supplementary Data 3).

[00193] Table 6: GO analysis of DE genes that were up-regulated from reprogrammed ECDs in A38Lg, relative to A38Per. Genes involved in post-translational modifications, cell cycle control, DNA repair, response to stress, and DNA/RNA/protein biosynthesis were up-regulated from reprogrammed ECDs (detailed in Supplementary Data 3).

GO Terms                           # of Genes    % of Genes    P-value
Acetylation                        622           21.7          1.1e-59
Phosphoprotein                     1290          45.0          5.1e-53
Cell Cycle                         154           5.4           3.4e-30
Mitotic Cell Cycle                 137           4.8           4.9e-30
Organelle Fission                  95            3.3           3.7e-25
DNA Metabolic Process              157           5.5           9.4e-25
DNA Repair                         95            3.3           1.4e-17
Response to DNA Damage Stimulus    113           3.9           3.8e-17
DNA Replication                    68            2.4           2.7e-14
Cellular Response to Stress        144           5.0           3.1e-14
Protein Biosynthesis               62            2.1           2.8e-12
ATP Binding                        256           8.9           1.3e-11
Ribonucleoprotein                  78            2.7           4.1e-11
Nucleotide Binding                 309           10.7          4.3e-11
Translation                        86            3.0           2.5e-9
ncRNA Metabolic Process            66            2.3           3.5e-9
Mitochondrion                      167           5.8           4.3e-9
DNA Recombination                  39            1.4           4.6e-9
Microtubule-based Process          70            2.4           5.9e-9

[00194] Table 7: GO analysis of DE genes that were down-regulated from reprogrammed ECDs in A38Lg, relative to A38Per. Genes involved in oncogenic signal transduction cascades (SH3 domains, transmembrane proteins, kinases, Ras signaling), cell motion (wounding, migration, locomotion), and cell death control (apoptosis) were down-regulated from reprogrammed ECDs (detailed in Supplementary Data 3).

[00195] First, a large number of DE genes involved in redox (oxidation-reduction) balance were up-regulated from reprogrammed LOCKs in A38Lg (Table 4, Supplementary Data 3). This subclone was accordingly highly resistant to H2O2-mediated oxidative stress (Fig. 3a), and possessed higher oxidoreductase activity and NADPH levels than A38Per (Fig. 14a, b). Second, genes encoding differentiation state (epithelial vs. EMT) were reciprocally expressed from A38Per and A38Lg LOCKs (Table 5, Supplementary Data 3), and we confirmed several well-known epithelial and EMT expression changes (e.g. CDH1/E-cadherin, CDH2/N-cadherin) at the protein level by western blots (Fig. 3b). Further consistent with GO results, A38Per maintained well-differentiated (epithelial) morphology while A38Lg was poorly differentiated (EMT-like) across multiple in vitro culture conditions (Fig. 14c), and immunofluorescence experiments showed that EMT emerged in the primary tumor subclone that seeded the A38Lg metastasis in vivo (Fig. 14d). We also note that immune-related genes were differentially expressed from reprogrammed LOCKs (Table 5), which could hold implications for PDAC immunotherapy. Third, genes involved in DNA repair and cell stress responses were significantly up-regulated in ECDs from A38Lg, including genes crucial for maintenance of genome integrity (Fanconi anemia complex, non-homologous end joining, and the TOP2B/OGG1/KDM1A complex, among others, Table 6, Supplementary Data 3). This subclone was accordingly highly resistant to PDAC chemotherapy (gemcitabine) compared to A38Per (Fig. 3c), and western blots showed hyper-phosphorylation of histone H2AX S139 (γH2AX, a signature of activated DNA repair pathways, Fig. 3d). Fourth, genes involved in oncogenic signal transduction cascades were down-regulated in ECDs from A38Lg, especially KRAS/ERK-related genes (Table 7, Supplementary Data 3). Indeed, A38Lg showed loss of phosphorylated ERK (Fig. 3e), resistance to ERK inhibition (Fig. 3f), and minimal response to knockdown of oncogenic KRAS in 3D tumor forming assays (Fig. 3g, Fig. 14e, f), despite possessing the same KRAS G12V mutation as A38Per. Finally, mapping previously reported rearrangements from this patient to chromatin domains revealed that rearrangements were preferentially targeted to ECDs and the small subset of uniquely reprogrammed large LOCK domains, whereas other LOCKs were strongly depleted (Fig. 15).

[00196] Thus, reprogrammed chromatin domains collectively specified malignant gene expression programs, divergent phenotypic properties, and patterns of genome instability that emerged during subclonal evolution in patient A38. This patient was unusual in having received chemotherapy prior to tissue harvesting and had a missense mutation in SMARCA2 of unclear significance (CID, unpublished observations), and thus in this case epigenetic selection may have occurred downstream of a genetic driver. Although the nature and extent of such findings will certainly vary among patients, they imply that PDAC is capable of acquiring substantial epigenetic and malignant diversity during subclonal evolution, even in the same cancer from the same patient.

[00197] Anabolic glucose metabolism controls epigenetic state and tumorigenicity

[00198] We next asked whether a recurrent, metastasis-intrinsic pathway might have been selected for during subclonal evolution to exert upstream control over global epigenetic state and tumorigenic potential. Several recent studies have linked nutrient status and metabolic activity to global levels of histone modifications. Because distant metastases in the rapid autopsy cohort were largely isolated from organs (liver, lung) that provide a rich supply of glucose, we asked whether reprogrammed chromatin and tumorigenicity in these subclones might have evolved a dependence on specific aspects of glucose metabolism.

[00199] Altered glucose metabolism (i.e. Warburg effect) is a well-known property of neoplastic and highly proliferative cells. Although most of our metastatic subclones actually displayed modest proliferative rates in culture (e.g. Fig. 8) and in vivo, we nonetheless asked whether distant metastases might have acquired further adaptations in glucose metabolism. Surprisingly, relative to proliferative (immortalized) normal HPDE cells and local-regional PDAC samples, glucose strongly stimulated metabolic (oxidoreductase) activity across distant metastatic subclones (Fig. 4a), and glucose was accordingly required for these subclones to withstand oxidative stress (Fig. 4b, c). Distant metastases also hyper-consumed glucose, as we detected elevated glucose uptake and lactate secretion in distant metastases and their precursors relative to peritoneal carcinomatosis (Fig. 16a). To determine if excess glucose uptake was specifically incorporated into downstream metabolic pathways, we selected paired peritoneal and distant metastatic subclones from the same patient, incubated them with 13C[1-2]-labeled glucose, and measured glucose incorporation into metabolic products with liquid chromatography followed by high resolution mass spectrometry (LC-HRMS). These experiments revealed elevated incorporation of both C1- and C1,2-labeled glucose into lactate and nucleotides in the distant metastasis (Fig. 4d, e), consistent with enhanced glucose entry into both glycolysis and the pentose phosphate pathway (PPP).

[00200] We next asked whether distant metastases might have evolved a dependence on specific enzymatic steps in either of these glucose-driven pathways, which we hypothesized would manifest as severe depletion of metabolite substrate secondary to hyper-consumption. To test this, we surveyed glycolytic and PPP metabolite profiles across a diverse panel of samples including HPDE cells, peritoneal carcinomatosis, distant metastases, and primary tumor precursor subclones. Analysis of all detected glycolytic and pentose phosphate metabolites (Fig. 16b) revealed a striking, recurrent depletion of 6-phosphogluconic acid (6PG) across distant metastases and their precursors (Fig. 4f). 6PG is the substrate for 6-phosphogluconate dehydrogenase (PGD), an enzyme involved in anabolic glucose metabolism that operates within the oxidative branch of the PPP.

[00201] Glucose may enter the PPP via the oxidative (oxPPP) or the non-oxidative (noxPPP) branch of the pathway, which are thought to be uncoupled. Although some studies in other cancers have suggested that PGD is an important oncogene, it is KRAS-mediated noxPPP activation that drives primary tumor growth in mouse models of PDAC. Because KRAS and other driver mutations are acquired early in PDAC progression and shared by all subclones that evolve thereafter, we hypothesized that PGD dependence might have been selected for specifically during the evolution of distant metastasis to maintain reprogrammed chromatin and tumorigenicity. Glucose deprivation, RNAi against PGD, and 6-aminonicotinamide (6AN, a nicotinamide antimetabolite prodrug reported to preferentially inhibit PGD) had no effect on global chromatin modifications in the peritoneal subclone, while all treatments reversed the reprogrammed chromatin state of the paired lung metastasis from the same patient (Fig. 5a). PGD loss-of-function appeared specific, as PGD knockdown did not alter expression of KRAS or other PPP components (Fig. 5b).

[00202] We next asked whether PGD knockdown might affect intrinsic tumor forming capacity across a larger panel of subclones. Despite their aggressive behavior in patients, distant metastatic subclones were unable to effectively form metastatic tumors in immunodeficient mice, and PGD RNAi was not toxic to any subclones grown in routine 2-D cultures (data not shown). To bypass these limitations, we treated cells with RNAi and used 3-D matrigel tumor-forming assays to measure the effects of PGD knockdown on intrinsic tumor-forming capacity. PGD RNAi had minimal effect on the ability of HPDE cells to form spheres or local-regional PDACs to form tumors by these assays (Fig. 5c). Remarkably, PGD RNAi universally interfered with the ability of distant metastatic subclones to form tumors (Fig. 5d). These findings suggested that PGD might represent a therapeutic target with selectivity for PDAC distant metastasis. Because 6AN could represent a lead compound for future design of PGD targeted therapies, we stringently tested it for activity against distant metastases with metabolomics, western blots, multiple 3D tumorigenic assays, RNA-seq, and ChIP experiments.

[00203] 6AN treatments slowed rates of glucose consumption and lactate secretion with no effect on glutamine consumption or glutamate secretion in distant metastatic and precursor subclones (Fig. 17a), and 6AN reversed the previously detected high incorporation of glucose into lactate and nucleotides (Fig. 17b). Furthermore, steady state levels of glucose and metabolites directly upstream of the PGD reaction were dramatically elevated in response to 6AN with corresponding reductions in downstream metabolites (Fig. 17c), which is consistent with strong PGD inhibition as previously reported by others.

[00204] We next tested the effects of 6AN on epigenetic state. Strikingly, 6AN treatments quantitatively reversed several reprogrammed chromatin modifications across distant metastatic subclones with minimal effect on normal cells or local-regional PDACs (Fig. 18a, b; summarized in Fig. 6a, b), and this effect persisted upon removal of 6AN from the media (Fig. 18c). Because these changes mirrored aspects of LOCK reprogramming, we examined the chromatin state of LOCK DE genes regulated by 6AN, as identified by RNA-seq (Supplementary Data 3, 4). This revealed that DE genes were located within the reprogrammed hybrid LOCK-EI regions that possessed strong H3K27Ac and H3K36Me3, low H3K27Me3, and sharp 5'-depletion (dips) of H3K9Me2 (Fig. 19a, Supplementary Data 3). ChIP-seq experiments on control and 6AN-treated A38Lg cells further showed that the quantitative increase of global H3K9Me2 was targeted to LOCK regions that were reprogrammed in A38Lg vs. A38Per (Fig. 19b), while the reduced H3K27Ac was specifically targeted to genes repressed from LOCKs with no effect on other LOCK genes or ECD-regulated genes (Fig. 19c). Levels of H3K27Me3 remained stable across all regions in response to 6AN (Fig. 19b, d), similar to western blot findings. Collectively, these experiments demonstrated that 6AN selectively and quantitatively targeted several chromatin changes within LOCKs that emerged during the evolution of distant metastasis.

[00205] Because 6AN modulated the global epigenetic state, we hypothesized that it might also selectively block tumorigenic potential in distant metastatic subclones, similar to PGD knockdown experiments. Strikingly, 6AN selectively and strongly blocked tumor formation in distant metastatic and primary tumor precursor subclones but not local-regional PDACs across multiple 3D tumorigenic experimental platforms, including suspension tumorsphere assays (Supplementary Fig. 20), matrigel tumor forming assays (Fig. 6b), and injection of PDAC cells into organotypic stroma that recapitulates aspects of in vivo patient tumors (Fig. 6c). Thus, like PGD knockdown, chemical inhibition of PGD by 6AN selectively blocked the tumorigenic potential of distant metastatic subclones.

[00206] We next examined our RNA-seq datasets to explore whether the above findings might be linked to regulation of malignant gene expression programs. Remarkably, over half (952/1832, 52%, Supplementary Data 4) of 6AN down-regulated genes from A38Lg corresponded to genes that were over-expressed in this subclone (compared to the peritoneal subclone from the same patient). In addition, a large fraction of 6AN up-regulated genes also matched DE genes that were repressed (914/2122, 42%, Supplementary Data 4). Even more striking, nearly one-third (255/891, 29%, Supplementary Data 4) of recurrently over-expressed genes across distant metastatic subclones were down-regulated by 6AN. Comparative GO analyses on these gene subsets produced overlapping results that were strongly enriched for cancer-related functions, including mitotic cell cycle control, acetylation, chromosome stability, DNA repair, cell stress responses, and anabolic/biosynthetic activities (Tables 8-10).
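The overlap fractions quoted above are of the form "number of genes shared by two lists, divided by the size of the first list". A minimal sketch of that calculation follows. The gene symbols are hypothetical placeholders and the function name `overlap_fraction` is an assumption for this example; only the two count pairs in `reported` are taken from the paragraph above.

```python
def overlap_fraction(query, reference):
    """Fraction of genes in `query` that also appear in `reference`."""
    if not query:
        raise ValueError("query gene set is empty")
    return len(query & reference) / len(query)


# Hypothetical placeholder gene symbols, for illustration only.
down_by_6an = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
overexpressed_in_metastasis = {"GENE_B", "GENE_C", "GENE_E"}
toy_fraction = overlap_fraction(down_by_6an, overexpressed_in_metastasis)

# Count pairs quoted in the text: (shared genes, total genes in the query list).
reported = {
    "6AN down-regulated genes over-expressed in A38Lg": (952, 1832),
    "recurrently over-expressed genes down-regulated by 6AN": (255, 891),
}
percentages = {
    label: round(100 * shared / total) for label, (shared, total) in reported.items()
}
print(toy_fraction, percentages)
```

Because the fraction depends on which list is used as the denominator, the direction of each comparison should be stated alongside the percentage, as is done in the text.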

[00207] Table 8: GO analysis of genes that were both recurrently over-expressed in distant metastases and down-regulated by 6AN, detected by RNA-seq (detailed in Supplementary Data 3).

[00208] Table 9: GO analysis of recurrently over-expressed genes detected by RNA-seq in distant metastatic subclones and primary tumor precursors, relative to peritoneal carcinomatosis (detailed in Supplementary Data 3).

[00209] Table 10: GO analysis of DE genes detected by RNA-seq that were down-regulated in response to 6AN, compared to DMSO control cells (A38Lg subclone, detailed in Supplementary Data 3).

[00210] The above findings led us to hypothesize that 6AN-ablation of tumorigenicity in distant metastatic subclones might be mediated through epigenetic control of cancer-related genes important to maintain tumorigenic capacity. To validate this, we selected two candidate genes for in-depth experiments: N-cadherin (CDH2) and topoisomerase 2β (TOP2B). CDH2 and TOP2B are both thought to be important for cancer progression, are not known to be mutated in PDAC, can be therapeutically targeted, were recurrently over-expressed across distant metastatic and primary tumor precursor subclones by RNA-seq (Supplementary Data 3), and were selectively repressed by 6AN, which we confirmed with RT-PCR (Fig. 6e top panels). Furthermore, CDH2 was located within a reprogrammed LOCK targeted by 6AN, and TOP2B was located immediately adjacent to a LOCK boundary. ChIP-qPCR assays performed on control and 6AN treated cells showed nearly identical enrichments for H3K9Me2 and H3K27Ac across these gene loci in the peritoneal subclone (Fig. 6e left panels and Fig. 21a). In contrast, 6AN treatments on the matched lung metastasis from the same patient resulted in enrichment of H3K9Me2 across both loci with concordant reductions of H3K27Ac over the CDH2 genic region (Fig. 6e right panels, Fig. 21a). This strongly suggested that a major downstream effect of 6AN treatments was epigenetic repression of over-expressed cancer genes. We therefore performed RNAi experiments to test whether these genes were required to selectively maintain tumorigenicity. Indeed, RNAi selectively blocked 3D tumor formation in distant metastatic and precursor subclones that over-expressed CDH2 and TOP2B, with no effect on HPDE cells or peritoneal carcinomatosis (Fig. 6f). Collectively, these targeted validation studies strongly supported conclusions inferred from the sequencing data, in that inhibition of PGD/oxPPP by 6AN selectively targeted gene expression, epigenetic state, and downstream tumorigenic functions of over-expressed cancer genes (CDH2/TOP2B).

[00211] FIGURE LEGENDS

[00212] Figure 1: Global epigenetic reprogramming during the evolution of distant metastasis. a-b, Immunohistochemical (IHC) stains against H3K9Me2/3 performed on tumor sections from 6 subclones collected from two patients who presented with widespread peritoneal carcinomatosis (a: patient A124, b: patient A141) showed similar strong nuclear staining across all primary tumor and peritoneal subclones. c-d, Similar stains on 6 subclones from two patients who presented with widespread distant metastases (c: patient A125, d: patient A132) showed progressive loss of nuclear staining that initiated in primary tumor subclones that seeded metastases (middle panel) and was further lost (c) or stably inherited (d) in the liver metastases. Scale bars=100μm for IHC, 20μm for IF. e, IHC against the indicated modifications performed on tumor sections representing 4 paired subclones from a patient (patient A38) that presented with both peritoneal carcinomatosis and distant metastases shows that the peritoneal precursor subclone in the primary tumor that seeded carcinomatosis inherited strong nuclear staining of heterochromatin modifications as seen in the parental clone that founded the neoplasm. In contrast, the primary tumor precursor subclone that seeded distant metastases showed cell-to-cell variation in staining, with complete loss of staining in the paired lung metastasis. Staining for the euchromatin modification H3K36Me3 remained stable across all subclones. f, Similar to IHC on tissues (e), western blots on cell lines collected from the peritoneal subclone (Per), liver metastasis, and lung metastasis from patient A38 also showed loss of heterochromatin modifications in distant metastatic subclones, with corresponding increased acetylation. Levels of H3K27Me3 and H3K36Me3 did not differ between subclones. g, Densitometry summary of western blot findings for the indicated histone modifications across cell lines from distant metastatic subclones compared to peritoneal carcinomatosis (Supplementary Fig. 1b, c, n=8 biological replicates, error bars=s.e.m., *p<0.01).

[00213] Figure 2: Epigenomic reprogramming of chromatin domains during PDAC subclonal evolution. a, Representative (left panels) and total summarized (right panels) ChIP-seq experiments revealed loss of H3K9Me2 from LOCKs between peritoneal (A38Per) and distant metastatic and primary tumor precursor subclones (others). H3K27Me3 remained strong in all subclones. b, Bisulfite-seq data on cell lines (A38, A13, top panel) and frozen tissue samples (A124, A125 panels) showed that samples from local-regional spread and parental clones (A38Per, A124PrF, A124Per, A125PrF) possessed hypermethylated LOCKs. In contrast, distant metastatic subclones (A125Lv1/2, A13Lg, A38Lv, A38Lg) and their primary tumor subclones (A125PrS, A13Pr1, A13Pr2) showed hypomethylation of DNA across the same LOCK regions. c, Examples of individual LOCKs displaying hyper- vs. hypomethylation across subclones, as described above. d, Global levels of H3K36Me3, H3K27Ac, and DNA methylation within ECDs did not show any clear differences between subclones. e, In contrast, distant metastatic subclones and primary tumor subclones (red lines) displayed local reprogramming of H3K36Me3 and H3K27Ac specifically over DE genes within ECDs, compared to the same DE genic ECD regions from A38Per (black lines).

[00214] Figure 3: Reprogrammed chromatin domains encode divergent malignant properties. a, A38Lg was remarkably resistant to H2O2 treatments compared to A38Per. MTT signals reflect cell viability normalized to untreated controls. n=4 technical replicates, *p<0.03. b, Western blots showed that proteins involved in epithelial and EMT differentiation were differentially expressed between A38Per and A38Lg, as predicted by GO analyses of reprogrammed DE genes from LOCKs. c, A38Lg was completely resistant to gemcitabine compared to A38Per, as predicted by GO analyses of reprogrammed DE genes from ECDs. MTT signals reflect cell viability normalized to untreated controls. n=4 technical replicates, *p<0.01. d, A38Lg possessed elevated levels of γH2AX by western blot, consistent with activation of DNA repair pathways. e, Western blots showed that A38Lg lost hyper-phosphorylated ERK and f, was resistant to ERK targeted therapy, compared to A38Per. MTT signals reflect cell viability, normalized to untreated controls. n=4 technical replicates, *p<0.03. g, A38Lg also lost sensitivity to KRAS knockdown by matrigel 3D tumor forming assays, compared to A38Per. n=4 technical replicates, *p<0.01.

[00215] Figure 4: Hyperactive glucose metabolism and 6PG depletion in distant metastatic subclones. a, MTT assays performed on equal numbers (20K) of viable, growth-arrested cells from the indicated subclones showed greatly elevated signal (oxidoreductase activity) across distant metastatic subclones, compared to HPDE and local-regional PDAC samples (n=4 technical replicates for each, error bars=s.d.m., *p<10^-5). b, Normalized cell counts for the indicated samples incubated with (+) or without (-) 10mM glucose and treated with 1mM H2O2 as indicated (+, -) for 24h showed that normal HPDE cells were sensitive to H2O2 under either glucose condition (as expected), whereas local-regional PDAC samples were resistant to H2O2 irrespective of glucose availability (n=3 technical replicates for each, error bars=s.d.m.). c, In contrast, distant metastatic subclones were sensitive to H2O2 when glucose was not present in the media (n=3 technical replicates for each, error bars=s.d.m., *p<0.001). d, Simplified schematic of 13C-(1,2)-labeled glucose flow through glycolysis and the PPP. Glucose that enters the oxidative branch of the PPP has one labeled carbon cleaved during conversion of 6PG to Ru5P (m+1), whereas glucose that travels through glycolysis or the non-oxidative PPP retains both labeled carbons (m+2). Note that cross-talk allows glucose with either labeling pattern to re-enter the other pathway and incorporate. e, LC-MS for nucleotides and lactate showed that these downstream metabolites acquired greatly elevated 13C-1,2 labels from glucose in the lung metastasis from patient A38 (A38Lg), compared to its paired peritoneal subclone (A38Per, n=3 biological replicates, error bars=s.d.m., *p<0.01). f, Steady state LC-HRMS measurements for 6PG showed either complete (ND: not detected) or near complete loss of metabolite across distant metastases and their precursors compared to peritoneal carcinomatosis and HPDE cells (Supplementary Fig. 10b).

[00216] Figure 5: PGD-dependence in distant metastatic subclones. a, Western blots against indicated histone modifications performed on paired peritoneal (A38Per) and distant metastatic (A38Lg) subclones from the same patient showed that global levels of reprogrammed H3K9Me2/3 and acetylation in A38Lg were reversed by removal of glucose from the media (left panel), PGD RNAi (middle panel), and 6AN treatments (right panel). b, Western blots on A38Lg indicated that PGD knockdown by RNAi did not perturb expression of other PPP components or KRAS. c, PGD RNAi did not affect the ability of normal HPDE cells or local-regional PDAC samples to form tumors in 3D matrigel assays (representative photomicrographs shown with quantified numbers of tumors/well, n=4 technical replicates for each, error bars=s.d.m.). d, In contrast, PGD RNAi significantly reduced tumor formation across all distant metastatic subclones that were available for testing from the rapid autopsy cohort (n=4 technical replicates for each, error bars=s.d.m., *p<0.01). Scale bars: 200μm.

[00217] Figure 6: Reversal of reprogrammed chromatin, tumorigenicity, and malignant gene expression programs by 6AN. a, Densitometry summary of western blots shown in Supplementary Fig. 12a (n=8 biological replicates, error bars=s.e.m., *p<0.01). 6AN selectively reversed reprogrammed H3K9Me2 and acetylation across most distant metastatic subclones, with minimal or non-recurrent effects on H3K27Me3 or H4K20Me3. b, Densitometry summary of western blots shown in Supplementary Fig. 12b (n=6 biological replicates, error bars=s.e.m.). 6AN had minimal effects on histone modifications across normal (HPDE, fibroblast) or local-regional PDAC samples. c, 6AN ablated tumor formation in 3D matrigel assays (n=4 technical replicates for each, error bars=s.d.m., *p<0.01, scale bars: 200μm) and 3D tumorsphere assays (Supplementary Fig. 14c) across distant metastatic subclones. 6AN had minimal effects on local-regional PDAC samples by either assay (Supplementary Fig. 14a,b). d, 6AN also blocked the ability of distant metastatic subclones to form tumors when injected into 3D organotypic stromal cultures (n=3 technical replicates for each, error bars=s.d.m., *p<0.05; scale bars: 200μm). e, Real-time RT-PCR (top panels) showed that the distant metastatic subclone (A38Lg) over-expressed CDH2 relative to the peritoneal subclone (A38Per) from the same patient, and that expression was repressed by 6AN (n=4 technical PCR replicates from two biological replicate experiments, error bars: s.d.m., *p=0.002). ChIP assays for H3K9Me2 and H3K27Ac with PCR primers (location indicated by numbers) spaced across the 1.4Mb chromatin domain showed that 6AN induced spreading of H3K9Me2 across the locus with corresponding loss of H3K27Ac in A38Lg, with no effect on A38Per (n=2 biological replicates, error bars=s.e.m.). f, RNAi against both CDH2 and TOP2B selectively blocked tumor formation in distant metastatic and precursor subclones that over-expressed these genes by RNA-seq, with no effect on A38Per or HPDE cells (n=4 technical replicates for each, error bars=s.d.m., *p<0.01; scale bars: 200μm). Specific RNAi knockdown of gene expression is shown in Supplementary Fig. 15c, d.

[00218] Figure 7: Reprogrammed chromatin across distant metastatic subclones. a, Western blots showed minimal or inconsistent changes for the indicated histone modifications between local-regional PDAC samples, including A38Per (Per). b, In contrast, a panel of distant metastatic subclones showed recurrent changes in specific modifications, compared to A38Per. c, Reprogramming was also observed between primary tumor subclones (Pr1, Pr2) and the lung met. from the same patient.

[00219] Figure 8: Specificity of reprogrammed histone modifications. a, Ki67 stains showed similar cell cycle rates for peritoneal and the matched lung met grown in serum, and serum-free media (SFM) arrested growth. b, Serial cell counts for the indicated times confirmed equal growth rates and growth arrest in SFM. c, GO analysis on RNA-seq data from cells cultured in serum vs. SFM further confirmed growth arrest in SFM (lung data, peritoneal gave identical results). d, Western blots showed persistence of reprogrammed chromatin modifications in serum/proliferative (-) and SFM/growth arrested (+) cells. e, Treatment of the peritoneal subclone with PDAC chemotherapies (Gem: Gemcitabine, G+FU: Gemcitabine+5-Fluorouracil) did not induce loss of methylation or gain of acetylation as seen between peritoneal and distant metastases, confirming that reprogramming was unrelated to treatment effects.

[00220] Figure 9: Enrichment of heterochromatin modifications within LOCKs. Plots of ChIP-seq read densities normalized to inputs for histone modifications (left labels) showed that heterochromatin modifications (H3K9Me2/3, H3K27Me3) were enriched in regions that were called LOCKs (0% to 100%, bottom panel labels) for each subclone (indicated above graphs). In contrast, euchromatin modifications (H3K36Me3, H3K27Ac) were depleted from LOCKs.

[00221] Figure 10: Reprogramming of H3K9Me3 in LOCKs during PDAC subclonal evolution. a, ChIP-seq data from paired peritoneal (A38Per) and lung (A38Lg) metastatic subclones detected dramatic reduction of H3K9Me3 in A38Lg that overlapped with H3K9Me2 (which marks LOCK domains). b, Similar data for patient A13, which also showed loss of H3K9Me3 from LOCK regions in A13Pr2/A13Lg subclones, compared to the A13Pr1 primary tumor subclone.

[00222] Figure 11 : Local reprogramming of DE gene loci within LOCKs. a, Mapping DE genes from RNA-seq (distant mets and precursors vs. A38Per) to LOCKs revealed reciprocal changes in H3K27Me3 and H3K27Ac/H3K36Me3/DNA methylation around genes downregulated in LOCKs. b, The opposite changes in H3K27Me3 and H3K27Ac/H3K36Me3 were detected from genes upregulated in LOCKs. DNA methylation remained high in these regions. P-values for each comparison are listed in Supplementary Data 3.

[00223] Figure 12: Enrichment of euchromatin modifications within ECDs. Plots of ChIP-seq read densities normalized to inputs for histone modifications (left labels) showed that euchromatin modifications (H3K36Me3, H3K4Me3, H3K27Ac) were enriched in regions that were called ECDs (0% to 100%, bottom panel labels) for each subclone (indicated above the graphs). In contrast, heterochromatin modifications (H3K9Me2/3, H3K27Me3) were depleted from ECDs.

[00224] Figure 13: Reprogramming of large LOCKs during PDAC evolution. a, H3K9Me3 was enriched and DNA hypomethylated in large LOCK domains. b-c, Striking reprogramming of H3K9Me3/2 and DNA methylation was detected in a subset of A38 large LOCKs. d, H3K9Me3 was also enriched across large HPDE LOCKs. e, Examples of reprogrammed domains between samples.

[00225] Figure 14: Malignant heterogeneity between A38 subclones. a, Oxidoreductase capacity was measured with MTT assays performed on equal numbers of growth-arrested cells in the absence of serum, and MTT signals were normalized to total cell numbers per well. Consistent with GO results, A38Lg possessed higher oxidoreductase activity. n=4 technical replicates. b, NADPH/NADP levels were measured with enzyme cycling assays on equal numbers of growth-arrested cells. More NADPH/million cells was detected in A38Lg. n=2 biological replicates. c, A38Per and A38Lg maintained well/poorly differentiated morphology in patient tissues and across three separate in vitro culture conditions as indicated. d, IF performed on fixed tissues from the primary tumor showed loss of E-cadherin with gain of vimentin in the precursor subclone that seeded A38Lg, consistent with EMT. e, RNAi knockdown of KRAS blocked f, 3-D tumor formation in suspension assays more efficiently in A38Per than A38Lg. n=4 technical reps.

[00226] Figure 15: Rearrangements targeted to Large LOCKs and ECDs. a, Total breakpoints were not significantly enriched within Large LOCKs or ECDs. b, Unlike typical LOCKs, Large LOCK/ECD breakpoints were significantly joined to breakpoints from homologous domains to form rearrangements. c, Examples of Large LOCK rearrangements that generated translocations and amplifications.

[00227] Figure 16: Enhanced glucose metabolism with depleted 6PG levels across distant metastases. a, Extra-cellular glucose consumption and lactate secretion were elevated in distant mets relative to per. (n=3). b, Schematic of glycolytic (outside) and PPP (boxed) metabolites with intra-cellular metabolite levels plotted for each sample. Data represent LC-MS signals normalized to protein (n=3-5).

[00228] Figure 17: 6AN targets glucose metabolism and the PGD step of the PPP. a, 6AN selectively slowed rates of extra-cellular glucose consumption and lactate secretion in metastatic subclones with no effect on glutamine/glutamate. b, 6AN reduced incorporation of intracellular 13C-labeled glucose into metabolites downstream of the PPP. c, 6AN greatly increased metabolite levels of PGD substrate (6PG) and upstream metabolites (G1,5L) with corresponding reductions in downstream products.

[00229] Figure 18: 6AN selectively modulated the reprogrammed chromatin state of distant metastatic subclones. a, 6AN treatments generally increased global H3K9Me2 with corresponding decreased acetylation in distant metastatic subclones. b, Normal cells and local-regional PDACs did not show such changes. c, 6AN changes persisted after 3d treatment (+) followed by removal of 6AN from the media (+-) for an additional 3d.

[00230] Figure 19: 6AN targeted reprogrammed LOCK regions. a, Mapping 6AN repressed DE genes to A38Lg LOCKs revealed that these were located in reprogrammed LOCK-EI regions. b, ChIP-seq on DMSO vs. 6AN treated A38Lg detected a quantitative increase in LOCK-wide H3K9Me2 from reprogrammed regions (as aligned to A38Per LOCKs). c, ChIP-seq also detected 6AN-reduced H3K27Ac specifically from genes repressed in LOCKs with d, unchanged H3K27Me3.

[00231] Figure 20: 6AN selectively blocked tumor formation in distant metastatic subclones. a, 6AN did not interfere with the ability of local-regional PDAC samples to form tumors in 3-D matrigel assays or b, in 3-D suspension tumorsphere assays. n=2-4. c, In contrast, 6AN strongly blocked the ability of distant metastatic subclones to form tumors in 3-D suspension tumorsphere assays (shown) and 3-D matrigel assays (Fig. 5b). n=4, p<0.003. Scale bars: 200μm.

[00232] Figure 21: Reprogramming of the TOP2B locus in response to 6AN. a, RT-qPCR (top panels) showed that 6AN selectively repressed TOP2B in the lung metastatic subclone. Similarly, ChIP assays showed that 6AN induced spreading of H3K9Me2 across the locus in the lung metastasis, with no effect on the paired peritoneal subclone. b,c, Representative RT-qPCR verified RNAi knockdown of CDH2 and TOP2B with minimal effect on vimentin (normalized to ERK, which was equally expressed across all conditions).

[00233] DISCUSSION

[00234] The first major result of this study was widespread epigenetic reprogramming during the evolution of distant metastasis in the absence of metastasis-specific driver mutations, i.e. those not already present in the founder clone of the primary tumor. These involved large-scale reprogramming of histone H3K9 and DNA methylation within large heterochromatin domains (LOCKs and hypomethylated blocks), as well as regional changes in gene regulatory modifications (H3K27Ac, H3K36Me3). Second, these changes specified heterogeneous malignant properties that emerged during subclonal evolution. In particular, evolutionarily divergent subclones from the same patient showed changes in gene expression from reprogrammed regions consistent with their individual malignant properties, including oxidoreductase capacity, differentiation state, chemoresistance, oncogene addiction, and patterns of genome instability. Third, it was the PGD step of the oxPPP that controlled aspects of reprogrammed chromatin and tumorigenicity in distant metastatic subclones, as shown by metabolomics, genetic knockdown of PGD, chemical inhibition of PGD, and knockdown of downstream target genes. This strongly suggests that this anabolic glucose pathway was selected during the evolution of distant metastasis to maintain malignant epigenetic state and tumorigenic properties.

[00235] These findings also raise several important but complex questions, which we are pursuing in other studies. Perhaps the most complicated pertains to the extent of epigenetic and malignant heterogeneity between subclones across patients. Just to answer this in a single patient, a combination of whole-genome mapping, RNA-seq, bioinformatics, and several downstream experimental approaches were required. We hypothesize that such heterogeneity is a function of evolutionary time: patients who present with late-stage, widely metastatic disease may possess more epigenetic and malignant divergence between subclones in their tumors than patients who present with early-stage disease. This possibility underscores the pressing need to detect cancers early, before such malignant heterogeneity arises.

[00236] Also unclear are the precise mechanisms whereby PGD/oxPPP activity controls global epigenetic state, which are likely to be complex. This could be mediated through any of the known oxPPP-dependent changes in cellular metabolism, including redox balance, fatty acid biosynthesis, and/or ribose biosynthesis, any of which can affect global epigenetic state through control of metabolite cofactors that activate or inhibit entire classes of chromatin modifying enzymes. PGD activity itself is also complex and subject to several modes of regulation, including transcriptional over-expression, post-transcriptional repression, post-translational modification, protein:protein interactions, substrate availability, feedback inhibition, cross-talk with other pathways, and subcellular localization including a highly conserved yet uncharacterized nuclear fraction (C. Lyssiotis, personal communication). PGD-dependence may be selected for by any of these mechanisms during the evolution of distant metastasis in different patients.

[00237] A final question is how global epigenetic changes are targeted to specific chromatin domains that encode gene expression changes during subclonal evolution. We hypothesize that transcription factors and chromatin modifying enzymes that directly bind these regions play major roles in targeting the reprogramming events, and several candidates were recurrently over-expressed in our RNA-seq datasets. This includes the histone demethylase KDM1A (LSD1), which could be particularly important since we previously showed that this enzyme controls LOCK reprogramming and other studies have shown that it regulates breast cancer metastasis.

[00238] In summary, our findings in conjunction with deep sequencing studies on many of the same samples reported here suggest a model whereby driver mutations arise early to initiate PDAC tumorigenesis, followed by a period of subclonal evolution that generates heterogeneous metabolic, epigenetic, and malignant properties. Like driver mutations, those properties that confer increased fitness to cells that acquire them may be selected for and clonally expanded during invasive tumor growth and metastatic spread. The strong oxPPP-PGD dependence we observe in distant metastatic subclones could reflect such selection: distant metastatic sites provide ample glucose to fuel the pathway, pathway products (glucose-dependent NADPH) reduce oxygen species encountered within the sites, and the pathway itself is coupled to epigenetic programs that promote tumorigenesis. As such, reversal of malignant epigenetic programs by targeting the oxPPP could represent an effective therapeutic strategy for metastatic PDAC, one of the most lethal of all human malignancies.

[00239] DATA DEPOSITS

[00240] All ChIP-seq, RNA-seq, and bisulfite-seq sequencing data have been deposited online (GEO Number: GSE63126) at the following URL: ncbi.nlm.nih.gov/geo/query/acc.cgi?token=sxyjkaqsvfalheh&acc=GSE63126

[00241] METHODS SUMMARY

[00242] Tissue samples and cell lines were previously collected from PDAC patients by rapid autopsy, sequence-validated, and monitored for mycoplasma as previously described. Low passage (2-17) rapid autopsy cell lines were cultured at 37°C in DMEM with 10% fetal bovine serum (FBS, Gibco). For MTT assays, 15,000 cells/well were plated into 96 well plates in triplicate, treated 12 hours later, and assayed after 24hr (glucose responses) or 6 days (chemotherapy) with CellTiter96 (Promega). For glucose response assays, nutrient-deplete DMEM (no glucose, glutamine, pyruvate, or serum) was used with addition of glucose as indicated. For glucose-dependent oxidative stress analysis, cells were plated in triplicate and grown to 80% confluence followed by incubation in nutrient-deplete DMEM containing 10% dialyzed FBS with or without 10mM glucose and 1mM H2O2 for 24 hours. Cells were then washed with PBS, trypsinized, and viable cells counted with a hemocytometer. Glucose uptake and lactate secretion were measured with a YSI 7100 Bioanalyzer as described in the supplementary methods. For 13C-1,2 glucose tracing and steady state metabolite profiling, the Q Exactive MS (QE-MS; Thermo Scientific) coupled to liquid chromatography (LC Ultimate 3000 UHPLC) was used for metabolite separation and detection as previously described. Detailed conditions are provided in the supplementary methods.

[00243] Histones were acid extracted as described and western blots performed on 3.5ug histones, which were checked by Ponceau stains prior to western blot to ensure equal loading. Densitometry was performed with ImageJ software. RNA was extracted with Trizol reagent (Life Technologies) and isopropanol precipitated. Genomic DNA was purified with MasterPure DNA extraction reagents (Epicenter). Immunohistochemistry, H&E staining, and immunofluorescence on formalin-fixed, paraffin-embedded (FFPE) tissue microarray sections (TMAs) were performed according to standard procedures. Antibodies used for western blot, IHC, and ChIP are listed in Table 11.

[00244] Table 11. Antibodies and conditions used for western blots, immunostain, and ChIP experiments

[00245] RNAi experiments were performed with siRNA transfections (Oligofectamine, Life Technologies) using negative control siRNA (Sigma, SIC001) and pre-designed siRNA oligonucleotides against indicated genes in parallel (Sigma, PGD: SASI_Hs02_00334150, CDH2: SASI_Hs01_00153995, TOP2B: SASI_Hs02_00311874). siRNAs against mutant KRAS G12V (CUACGCCAACAGCUCCAAC) (SEQ ID NO: 1) were custom designed. Cells were incubated with siRNAs for 4 days after transfection and harvested. For drug treatments in 2-D, cells were grown to 70-80% confluency and treated for 3 days with 250uM 6AN or DMSO negative control.

[00246] 3-D matrigel assays were adapted from Cheung et al. (Control of alveolar differentiation by the lineage transcription factors GATA6 and HOPX inhibits lung adenocarcinoma metastasis. Cancer Cell 23, 725-38 (2013)). Briefly, 2-D cultures were trypsinized into single cells, 4,000 cells/mL were suspended and thoroughly mixed in ice-cold DMEM containing 5% matrigel (BD systems) and 2% FBS (+/- DMSO/6AN as needed), 500ul plated in quadruplicate into 24 well ultra-low attachment plates, and incubated for at least 7 days to allow tumor growth. Well-formed tumors were then counted and representative photographs taken with an EVOS instrument. 3-D suspension tumorsphere assays were performed with 20,000 starting cells/well in ultra-low attachment 6 well plates as described, and tumors counted/photographed after at least 7 days of tumor growth.

[00247] Organotypic tumor forming assays were adapted from Ridky et al. (Invasive three-dimensional organotypic neoplasia from multiple normal human epithelia. Nat Med 16, 1450-5 (2010)) and Andl et al. (Epidermal growth factor receptor mediates increased cell proliferation, migration, and aggregation in esophageal keratinocytes in vitro and in vivo. J Biol Chem 278, 1824-30 (2003)). Briefly, 6 well permeable transwell plates (Costar 3414) were overlayed with 1mL type 1 collagen containing 10X DMEM (acellular layer). Human dermal fibroblasts (ATCC) were suspended (12x10^6 cells/mL) in a mixture of ice cold 10X DMEM, 10% FBS, 52.5% collagen, and 17.5% matrigel (cellular layer), thoroughly mixed, and 2mL/well plated over the acellular layer. The mixture was allowed to partially solidify for approximately 15 minutes at 37°C, followed by triplicate injection of 1x10^6 PDAC cells (suspended in 20ul DMEM) into the cellular layer. Cells were incubated for 24 hours in fibroblast growth media above and below the inserts to initiate contraction of the discs. Fresh media with DMSO or 6AN was then added and replenished every 2 days for 6 additional days, followed by addition of DMEM with DMSO or 6AN underneath the inserts (no media on the top) for an additional 7 days. Discs were harvested, fixed overnight in 10% formalin, thinly sectioned, paraffin embedded, and stained with H&E. Tumors were photographed and measured with an Olympus BX53 microscope using cellSens Standard software.

[00248] Tests for statistical significance (two-tailed Student's t-test) were performed on data collected from technical replicate (performed in parallel at the same time) or biological replicate (performed at different times) experiments as indicated in the figure legends using Excel software for western blot densitometry, MTT assays, and tumor measurements. Whole genome bisulfite sequencing and RNA-seq were performed with HiSeq instruments (Illumina) as described in Hansen et al. (Increased methylation variation in epigenetic domains across cancer types. Nat Genet 43, 768-75 (2011)). ChIP assays were performed as previously described for fixed cells with sonication. For ChIP-qPCR, equal amounts of paired input/IP DNA were amplified by real-time PCR (Roche LightCycler96) and fold enrichments calculated. Primer sequences are listed in Supplementary Table 11. For ChIP-seq, immunoprecipitated and input DNA was further sheared to 200-300bp fragments, size-selected on agarose gels, and sequenced on either HiSeq (Illumina) or SOLiD (Applied Biosystems) formats with comparable results. IP sequencing reads were normalized to their corresponding inputs. Sequencing procedures, bioinformatics methods including domain calls, and statistical analyses are described in detail within the supplementary methods section.
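The text states only that fold enrichments were calculated from paired input/IP amplifications and does not give the formula. The following is a minimal sketch of one common convention, assuming equal DNA input per reaction and a fixed amplification factor per cycle (2.0 for ideal doubling). The function name and the Ct values in the usage line are hypothetical and are not taken from the study.

```python
def fold_enrichment(ct_input, ct_ip, efficiency=2.0):
    """Fold enrichment of IP over input from real-time PCR threshold cycles.

    Assumption (not stated in the source): equal amounts of input and IP DNA
    were amplified, and each cycle multiplies product by `efficiency`.
    A lower Ct in the IP than in the input means enrichment above 1.
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    return efficiency ** (ct_input - ct_ip)


# Hypothetical example: the IP crosses threshold two cycles before the input.
example = fold_enrichment(ct_input=25.0, ct_ip=23.0)
```

With ideal doubling, a two-cycle difference corresponds to a four-fold enrichment; a measured primer efficiency can be passed in place of 2.0.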

[00249] Table 12. Real-time PCR primer sequences used for ChIP-qPCR and RT-PCR experiments

CDH2 ChIP Chr18_25.808R: CTGCTAGCGTAGCCATCTGAGATCG 17
TOP2B ChIP Chr3_25.398F: GCCCTGTCTTCCCAGAATCATTGC 18
TOP2B ChIP Chr3_25.398R: CATGAAGCCTATGAAGATCATTATGG 19
TOP2B ChIP Chr3_25.540F: TTTAGCCAGCAAGTATTCTAGCATGG 20
TOP2B ChIP Chr3_25.540R: GTCAGTGTGATTCAGTAACAATGATGG 21
TOP2B ChIP Chr3_25.622F: CCTGCTCAAGGCTGACATGTCACC 22
TOP2B ChIP Chr3_25.622R: GTCGGACTCGATGGTCAGCACTGG 23
TOP2B ChIP Chr3_25.733F: AACCCGAAACTTTCAATGCACTTGG 24
TOP2B ChIP Chr3_25.733R: CTTCCTCTATAGTGAAGACCCTAGG 25
TOP2B ChIP Chr3_25.812F: TATGGCCATTCTTGCAGCAGTAAGG 26
TOP2B ChIP Chr3_25.812R: AAAGTTGGCTAAGGACATGAATAGGC 27
TOP2B ChIP Chr3_25.973F: GGAGATTCCCTCAGGTGCCTATACC 28
TOP2B ChIP Chr3_25.973R: CTGGTGTTCCAGGCACCACTGAGG 29
CDH2 RT-PCR CDH2F: TTATTACTCCTGGTGCGAGT 30
CDH2 RT-PCR CDH2R: GAGCTGATGACAAATAGCGG 31
TOP2B RT-PCR TOP2BF: GTTACAGGTGGTCGTAATGGTT 32
TOP2B RT-PCR TOP2BR: TTGGCTTCAGAAGTCTTCATCA 33

[00250] SUPPLEMENTARY METHODS

[00251] YSI metabolite analysis. Metabolite consumption (glucose and glutamine) and production (lactate and glutamate) were measured using a YSI 7100 Bioanalyzer. Indicated cell lines were plated at day -1 in a 6-well plate. At day 0 cells were counted (3 wells) or cultured in either regular medium or medium supplemented with the indicated compound. Tissue culture supernatants (1mL, n=3, each condition) were harvested 72 hours after cell plating. Tissue culture conditions were optimized to ensure nutrient availability and exponential cell growth. Metabolite consumption/production data were normalized to cell number area under the curve, as previously described (Lee et al 2014: PMID: 24998913). The area under the curve (AUC) was calculated as $\mathrm{AUC} = \frac{N(T)\,d}{\ln 2}\left(1 - 2^{-T/d}\right)$, where $N(T)$ is the final cell count, $d$ is doubling time, and $T$ is time of experiment. Doubling time was calculated as $d = T\,\frac{\log 2}{\log(Q_2/Q_1)}$, where $Q_1$ is starting cell number and $Q_2$ is final cell number, as determined by manual counting using a hemocytometer.
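The doubling-time and area-under-the-curve normalization described above can be sketched as follows. This is an illustrative implementation of the two stated formulas, assuming exponential growth over the whole experiment; the function names and the cell counts in the usage lines are hypothetical.

```python
import math


def doubling_time(t, q_start, q_final):
    """d = T * log(2) / log(Q2 / Q1), in the same time units as t."""
    if q_final <= q_start:
        raise ValueError("final cell number must exceed starting cell number")
    return t * math.log(2) / math.log(q_final / q_start)


def cell_number_auc(n_final, t, d):
    """AUC = N(T) * d / ln(2) * (1 - 2**(-T/d)), in cell x time units."""
    return n_final * d / math.log(2) * (1.0 - 2.0 ** (-t / d))


def normalize_to_auc(metabolite_change, q_start, q_final, t):
    """Metabolite consumed or produced per cell per unit time."""
    d = doubling_time(t, q_start, q_final)
    return metabolite_change / cell_number_auc(q_final, t, d)


# Hypothetical example: 1e5 cells grow to 8e5 cells over 72 hours.
d_hours = doubling_time(72.0, 1e5, 8e5)
auc = cell_number_auc(8e5, 72.0, d_hours)
```

In this example the cells double three times in 72 hours, so the doubling time is 24 hours. The AUC equals the integral of the exponential growth curve from the starting to the final cell count, which is why dividing by it gives a per-cell, per-hour rate.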

[00252] LC-HRMS Metabolite Profiling. LC-HRMS samples were prepared and analyzed as described in Liu et al. (Development and quantitative evaluation of a high-resolution metabolomics technology. Anal Chem 86, 2175-84 (2014)). For glucose tracing experiments, cells were plated into 6 well plates in triplicate, grown in DMEM with 10% FBS until 70-80% confluent, washed 2X with nutrient deplete DMEM, and incubated in nutrient deplete DMEM containing 10mM 13C-1,2 labeled glucose (Cayman) and 10% dialyzed FBS (Invitrogen) for an additional 36 hours. Additional replicates were also included and counted at the end of the experiment for normalization. Metabolism was quenched by quickly removing media and adding 1mL pre-chilled (-80°C) LC-MS grade 80% methanol (Sigma), incubating at -80°C for at least 20 minutes, followed by scraping into the methanol and pelleting of metabolites by centrifugation. For drug treatments, cells were incubated in standard DMEM +/- DMSO/6AN for 36 hours, followed by incubation in labeled glucose media +/- DMSO/6AN for an additional 36 hours, then quenched and pelleted as above. Pellets were reconstituted in equal volumes of 1:1 LC-MS grade acetonitrile:methanol and water and 5ul were injected to the LC-QE-MS for analysis. For steady state measurements cells were incubated in growth media (DMEM with 10% FBS for PDACs, keratinocyte serum-free media for HPDE) until they reached 80-90% confluence, followed by 48 hours in DMEM without serum (for PDACs, since the standard growth media for comparison HPDE cells also did not contain serum). Metabolism was then quenched with methanol and metabolites pelleted as above. Pellets were reconstituted into a volume normalized to protein content (15uL of 1:1 acetonitrile:methanol and 15uL of water was used per 1mg protein) and analyzed by LC-QE-MS. Raw data collected from the LC-QE-MS was processed on Sieve 2.0 (Thermo Scientific) using a targeted frame-seed that included glycolytic/PPP metabolites as required for the analysis. The output file including detected m/z and relative intensity in different samples was obtained after data processing, and replicates of selected metabolites from each sample were graphed and presented as shown in the figures.
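The protein-normalized reconstitution step for steady state samples is simple arithmetic and can be sketched as below. The per-milligram volumes come from the text; the function name, the dictionary keys, and the 2.0 mg example are hypothetical.

```python
def reconstitution_volumes_ul(protein_mg):
    """Volumes for reconstituting a metabolite pellet, scaled to protein content.

    Per 1 mg protein the text specifies 15 uL of 1:1 acetonitrile:methanol
    and 15 uL of water.
    """
    if protein_mg <= 0:
        raise ValueError("protein content must be positive")
    organic_ul = 15.0 * protein_mg
    water_ul = 15.0 * protein_mg
    return {
        "acetonitrile_methanol_ul": organic_ul,
        "water_ul": water_ul,
        "total_ul": organic_ul + water_ul,
    }


# Hypothetical pellet containing 2.0 mg protein.
volumes = reconstitution_volumes_ul(2.0)
```

Scaling the volume to protein keeps the metabolite concentration per unit of cell material comparable between samples before injection.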

[00253] Preparation of sequencing libraries. Libraries were prepared from 2-10 ng of IP ChIP DNA and 100ng of input DNA and sequenced on Illumina HiSeq (APF laboratory). Briefly, samples were checked for quality and concentration from 150-250bp on a bioanalyzer. DNA was end-repaired using Klenow polymerase in 58ul of reaction buffer. For IP DNA, Klenow was diluted 1:5. Samples were incubated at 20°C for 30 minutes and subsequently purified on QIAquick PCR purification columns. A-tails were then added to the DNA with Klenow and dATP in NEB buffer 2 at 37°C for 30 minutes and cleaned with Qiagen MinElute PCR purification columns. Sequencing adapters were then ligated onto the DNA for 15 minutes at room temperature followed by cleaning with MinElute columns. Samples were then run on 2% agarose gels and DNA from 216bp-366bp (DNA plus adapters) was cut from the gel and purified with Qiagen Gel extraction kits. Concentrations were then checked on a bioanalyzer and 8ng were PCR amplified with Phusion polymerase (Fisher) for 15 cycles (10sec 98°C, 30sec 65°C, 30sec 72°C) followed by 5 minutes at 72°C. Samples were then cleaned with Ampure kits (Illumina) and washed with 80% ethanol. DNA samples were resuspended at the end of the cleanup into 17.5ul buffer EB (Qiagen) and subjected to next generation sequencing on Illumina HiSeq platform according to manufacturer's instructions. For SOLiD sequencing, ChIP DNA was prepared and samples were processed according to manufacturer's protocols in the Johns Hopkins CRBII core facility.

[00254] BS-Seq data processing. 100 bp paired-end HiSeq2000 sequencing reads were aligned with the BSmooth bisulfite alignment pipeline (version 0.7.1) as previously described in Hansen et al. (Increased methylation variation in epigenetic domains across cancer types. Nat Genet 43, 768-75 (2011)). Briefly, reads were aligned by Bowtie2 (version 2.0.1) against the human genome (hg19) as well as the lambda phage genome. After alignment, methylation measurements for each CpG were extracted from aligned reads. We filtered out measurements with mapping quality < 20 or nucleotide base quality on cytosine position < 10, and we also removed measurements from the 5'-most 10 nucleotides of both mates. Then, the bsseq package in BSmooth was used to identify small and large differentially methylated regions (DMRs). Only CpGs with coverage of at least 3 in all samples were included in our analysis. For small DMRs, a smoothing window of 20 CpGs or 1 kb was used, with t-statistic cutoffs of -4.6 and 4.6 and a methylation difference greater than 20%. For large DMRs, a smoothing window of 200 CpGs or 10,000 bp was used, with t-statistic cutoffs of -2 and 2, a methylation difference greater than 10%, and a DMR length > 5 kb.
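
The measurement filters and DMR cutoffs stated above can be summarized as simple predicates. This is an illustrative sketch and not the BSmooth/bsseq implementation: the names `DmrCutoffs`, `keep_measurement`, and `passes_dmr_cutoffs` are hypothetical, methylation differences are assumed to be expressed as fractions (0.20 for 20%), and treating the t-statistic cutoff as inclusive is an assumption, since the text does not say whether the boundary value itself passes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DmrCutoffs:
    t_stat: float          # absolute t-statistic cutoff
    min_meth_diff: float   # methylation difference must exceed this (fraction)
    min_length_bp: int = 0 # DMR length must exceed this (bp)


# Thresholds taken from the text; the container itself is illustrative.
SMALL_DMR = DmrCutoffs(t_stat=4.6, min_meth_diff=0.20)
LARGE_DMR = DmrCutoffs(t_stat=2.0, min_meth_diff=0.10, min_length_bp=5_000)


def keep_measurement(mapq, base_qual, offset_from_5p):
    """Keep a CpG measurement unless mapping quality < 20, base quality < 10,
    or it lies in the 5'-most 10 nucleotides (0-based offsets 0..9)."""
    return mapq >= 20 and base_qual >= 10 and offset_from_5p >= 10


def passes_dmr_cutoffs(t_stat, meth_diff, length_bp, cutoffs):
    """Apply t-statistic, methylation-difference, and length cutoffs.
    Inclusive comparison on |t| is an assumption."""
    return (
        abs(t_stat) >= cutoffs.t_stat
        and abs(meth_diff) > cutoffs.min_meth_diff
        and length_bp > cutoffs.min_length_bp
    )
```

Keeping the small-DMR and large-DMR thresholds in separate constant objects makes it straightforward to run the same predicate over both smoothing scales.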

[00255] RNA-Seq data processing. 100 bp paired-end HiSeq2000 sequencing reads were aligned against the human genome (hg19) by OSA (version 2.0.1) with default parameters. After alignment, only uniquely aligned reads were kept for further analysis. Gene annotation information was downloaded from ENSEMBL (ensembl.com, release 66). Read counts for each gene in all samples were estimated using HTSeq (huber.embl.de/users/anders/HTSeq/doc/overview.html) and were then used to identify differentially expressed (DE) genes using the DESeq package. Genes with FDR < 0.01 and fold-change > 1.5 were considered DE genes.
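
The DE criterion above (FDR < 0.01 and fold-change > 1.5) can be sketched as a small predicate. This is an illustration, not DESeq code: the function name `is_de` is hypothetical, and applying the fold-change cutoff symmetrically to up- and down-regulated genes (that is, also accepting ratios below 1/1.5) is an assumption.

```python
import math


def is_de(fdr, fold_change, fdr_cutoff=0.01, fc_cutoff=1.5):
    """Return True if a gene meets the stated DE thresholds.

    fold_change is a positive expression ratio (sample / reference).
    The cutoff is applied on the log2 scale so that up- and
    down-regulation are treated symmetrically (an assumption).
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    return fdr < fdr_cutoff and abs(math.log2(fold_change)) > math.log2(fc_cutoff)
```

Under these assumptions, a gene with FDR 0.001 and a ratio of 0.5 (two-fold down) is called DE, while one with a ratio of 1.2 is not.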

[00256] ChIP-seq data processing. For 46 bp paired-end Illumina HiSeq2000 sequencing data, reads were aligned against the human genome (hg19) using BWA with default parameters as described in Li et al. (Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754-60 (2009)). After alignment, duplicate reads were removed and only uniquely aligned reads were kept for further analysis. For 48 bp single-end SOLiD sequencing data, reads were aligned using Bowtie with default parameters and only uniquely aligned reads were kept for further analysis. For narrow histone modification peaks (H3K4Me3 and H3K27Ac), MACS2 was used for peak calling with default parameters. For broad histone modification enrichments (H3K36Me3, H3K27me3, and H3K9Me2/3), peak calling was performed using RSEG, which is based on a hidden Markov model (HMM) and specifically designed for identifying broad histone peaks (see Song et al. (Identifying dispersed epigenomic domains from ChIP-Seq data. Bioinformatics 27, 870-1 (2011))).

[00257] Identifying large chromatin domains. We define LOCK domains for heterochromatin modifications (H3K9Me2/H3K27Me3) based on the peak calling results from RSEG. Briefly, peaks shorter than 5 kb were first removed to prevent regions with many nearby, short peaks from being called as LOCKs. Then, neighboring peaks separated by less than 20 kb were merged into one domain. Merged regions greater than 100 kb identified in both biological replicates were called LOCKs. We noticed another unique subset of LOCKs that were invariably larger than 500 kb, strongly enriched with H3K9Me3, depleted of H3K9Me2 and H3K27Me3, and flanked by strong peaks of H3K27Me3 at the boundaries. Because of this, we defined these LOCKs as H3K9Me3 regions with length greater than 500 kb and with less than 50% of their length overlapping H3K27me3. Finally, the large regions (>50 kb) between heterochromatin domains that contained at least one gene with corresponding euchromatic H3K4Me3/H3K27Ac regulatory peaks were defined as ECDs. Because we found that H3K27Ac alone was sufficient for these calls, H3K4Me3 was used for the initial test dataset with A38Per/Lg but was not required in subsequent datasets (A13Pr1/2, A13Lg). All code used for domain calls is available upon request.
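
The per-replicate LOCK calling steps described above (drop peaks shorter than 5 kb, merge neighbors less than 20 kb apart, keep merged regions greater than 100 kb) can be sketched as follows. This is a minimal illustration and not the authors' code, which the text says is available upon request: the function name `call_locks` and the `(chrom, start, end)` tuple representation are assumptions, and the requirement that a domain be identified in both biological replicates is deliberately left out because the text does not specify how concordance was scored.

```python
def call_locks(peaks, min_peak_bp=5_000, max_gap_bp=20_000, min_domain_bp=100_000):
    """Call candidate LOCK domains from broad peaks of one replicate.

    peaks: iterable of (chrom, start, end) with half-open coordinates.
    Returns a list of merged (chrom, start, end) domains.
    """
    # Step 1: remove peaks shorter than 5 kb.
    kept = sorted(p for p in peaks if p[2] - p[1] >= min_peak_bp)

    # Step 2: merge neighboring peaks on the same chromosome whose
    # gap is less than 20 kb.
    merged = []
    for chrom, start, end in kept:
        if merged and merged[-1][0] == chrom and start - merged[-1][2] < max_gap_bp:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])

    # Step 3: keep merged regions greater than 100 kb.
    return [tuple(d) for d in merged if d[2] - d[1] > min_domain_bp]
```

Filtering short peaks before merging, rather than after, is what prevents a cluster of many small nearby peaks from being stitched into a spurious large domain.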

[00258] Defining different gene groups. Genes were classified as belonging within euchromatin (> 50% of genic region located in ECDs) or heterochromatin (> 50% of genic region located in those heterochromatin domains including LOCKs and G-LOCKs). A handful of other genes did not fit these criteria and were classified as "other".
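
The gene grouping rule above can be illustrated with a short overlap calculation. This is a sketch under stated assumptions: the function names are hypothetical, coordinates are half-open and on a single chromosome, and the domain lists passed in are assumed to be non-overlapping within each list.

```python
def overlap_bp(start, end, domains):
    """Total bases of [start, end) covered by non-overlapping (start, end) domains."""
    return sum(
        max(0, min(end, d_end) - max(start, d_start))
        for d_start, d_end in domains
    )


def classify_gene(start, end, ecds, heterochromatin):
    """Classify a gene by the fraction of its genic region in each domain type.

    > 50% in ECDs -> "euchromatin"; > 50% in heterochromatin domains
    (LOCKs and G-LOCKs) -> "heterochromatin"; otherwise "other".
    """
    length = end - start
    if length <= 0:
        raise ValueError("gene must have positive length")
    if overlap_bp(start, end, ecds) / length > 0.5:
        return "euchromatin"
    if overlap_bp(start, end, heterochromatin) / length > 0.5:
        return "heterochromatin"
    return "other"
```

A gene split exactly in half between the two domain types satisfies neither strict inequality and falls into the "other" group.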

[00259] Quantifying and enrichment plotting of ChIP-seq and RNA-seq. To plot each histone modification on defined large chromatin domains and their flanking regions, we divided flanking sequences of chromatin domains into bins with fixed length (in bp) and domains themselves into bins with fixed percentage of each domain length. ChIP enrichment was measured and normalized as described in Hawkins et al. (Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479-91 (2010)). In brief, the number of reads per kilobase of bin per million reads sequenced was calculated for each ChIP and its input control (denoted as RPKM_ChIP and RPKM_input). ChIP enrichment is measured as ΔRPKM = RPKM_ChIP - RPKM_input, and ChIP enrichment regions should have ΔRPKM > 0. Then all ΔRPKM values were normalized to a scale between 0 and 1, and the average normalized ChIP enrichment signals across all large chromatin domains were plotted for each histone mark. RNA-Seq data was also normalized by the number of reads per kilobase of bin per million reads sequenced and plotted similarly.
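
The enrichment measure above, RPKM of the ChIP minus RPKM of its input control per bin, can be sketched as follows. The function names are hypothetical, and the use of min-max scaling for the "scale between 0 and 1" step is an assumption, since the text defers the normalization details to Hawkins et al.

```python
def rpkm(read_count, bin_length_bp, total_reads):
    """Reads per kilobase of bin per million reads sequenced."""
    return read_count / (bin_length_bp / 1_000) / (total_reads / 1_000_000)


def delta_rpkm(chip_counts, input_counts, bin_lengths_bp, chip_total, input_total):
    """Per-bin ChIP enrichment: RPKM of ChIP minus RPKM of input."""
    return [
        rpkm(c, length, chip_total) - rpkm(i, length, input_total)
        for c, i, length in zip(chip_counts, input_counts, bin_lengths_bp)
    ]


def min_max_scale(values):
    """Rescale values to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Because domain bins are defined as a fixed percentage of each domain's length, `bin_lengths_bp` varies between domains, which is why the per-kilobase normalization is applied bin by bin.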

[00260] SUPPLEMENTARY DATA

[00261] Supplementary Data 1. This lists numbers of sequencing reads (total reads and uniquely aligned reads) for all replicate samples for ChIP-seq, WGBS, and RNA-seq experiments and includes correlation coefficients between the replicate samples.

[00262] Supplementary Data 1A: Summary of ChIP-seq reads for all replicate samples

Samples (Name_replicate#_IP antibody)   Total reads   Uniquely aligned reads

38Per_1_K27ac 27,100,820 23,396,028

38Per_2_K27ac 20,750,076 17,913,278

38Lg_1_K27ac 22,821,996 19,946,514

38Lg_2_K27ac 25,996,970 22,475,058

38Per_1_K9ac 27,148,594 22,200,344

38Per_2_K9ac 25,585,898 18,706,456

38Lg_1_K9ac 26,607,612 21,750,232

38Lg_2_K9ac 28,316,524 23,222,230

38Per_1_K4me3 28,059,734 23,718,542

38Per_1_K4me3 43,068,697 20,704,383

38Lg_2_K4me3 24,455,456 21,184,788

38Lg_2_K4me3 44,715,633 17,458,603

38Per_1_K36me3 44,738,783 23,961,134

38Per_2_K36me3 25,140,434 19,239,814

38Lg_1_K36me3 25,532,398 18,263,078

38Lg_2_K36me3 48,929,336 24,747,992

38Per_1_K27me3 26,546,612 21,523,318

38Per_1_K27me3 47,076,156 21,613,912

38Lg_2_K27me3 25,528,442 20,886,386

38Lg_2_K27me3 44,444,667 16,568,247

38Per_1_K9me2 69,629,048 55,830,176

38Per_K9me2_2 64,949,506 51,281,848

38Lg_K9me2_1 51,787,694 39,725,736

38Lg_2_K9me2 71,421,136 55,643,850

38Per_1_K9me3 28,530,078 18,843,038

38Per_2_K9me3 30,869,994 19,860,544

38Lg_1_K9me3 33,055,178 20,251,534

13Pr1_2_Input 37,251,906 30,188,840 for 13Pr1_2 K36Me3 and K27Ac

13Pr1_1_Input 45,330,014 36,495,018 for 13Pr1_1 K27Me3 and K4Me3

13Pr1_2_Input_ 41,671,660 33,507,094 for 13Pr1_2 K27Me3 and K4Me3

13Lg_1_K27Me3 46,174,838 35,837,518

13Lg_2_K27Me3 47,763,132 36,963,900

13Lg_1_K36Me3 48,735,296 39,047,890

13Lg_2_K36Me3 44,439,570 35,973,060

13Lg_1_K27Ac 52,682,716 44,187,348

13Lg_2_K27Ac 38,043,964 32,298,344

13Lg_1_K9Me3 44,449,474 25,717,050

13Lg_2_K9Me3 49,020,848 28,456,692

13Lg_1_K9Me2 40,580,872 31,979,380

13Lg_2_K9Me2 42,760,754 33,395,094

13Lg_1_input 53,086,242 42,631,966 for 13Lg_1 K27Ac and K9Me2

13Lg_2_input 41,676,088 33,503,572 for 13Lg_2 K27Ac and K9Me2

13Lg_1_Input 40,822,392 31,354,916 for 13Lg_1 K36Me3

13Lg_2_Input 45,292,342 35,189,166 for 13Lg_2 K36Me3

13Lg_1_Input 49,243,126 37,741,490 for 13Lg_1 K27Me3 and K9Me3

13Lg_2_Input 47,226,322 36,317,110 for 13Lg_2 K27Me3 and K9Me3

HPDE_1_K27Me3 41,619,730 1,644,221,182

HPDE_2_K27Me3 45,844,562 1,870,133,582

HPDE_1_K36Me3 38,327,244 28,084,826

HPDE_2_K36Me3 35,519,752 27,012,400

HPDE_1_K27Ac 52,294,724 43,428,888

HPDE_2_K27Ac 35,271,886 30,001,598

HPDE_1_K9Me3 49,006,354 31,150,780

HPDE_2_K9Me3 49,995,186 31,313,578

HPDE_1_K9Me2 47,415,884 36,297,824

HPDE_2_K9Me2 47,621,364 36,772,334

HPDE_1_K4Me3 45,262,500 39,528,934

HPDE_2_K4Me3 34,511,978 30,181,046

HPDE_1_Input 50,286,666 40,743,792 for HPDE_1 K9Me3 and K9Me2

HPDE_2_Input 45,780,736 37,154,676 for HPDE_2 K9Me3 and K9Me2

HPDE_1_Input 36,424,754 29,600,008 for HPDE_1 K36Me3 and K27Ac

HPDE_2_Input 48,330,362 39,054,926 for HPDE_2 K36Me3 and K27Ac

HPDE_1_Input 38,269,792 30,829,930 for HPDE_1 K27Me3 and K4Me3

HPDE_2_Input 37,635,368 30,421,766 for HPDE_2 K27Me3 and K4Me3

38Lg_DMSO_1_K27Me3 45,364,340 33,889,440

38Lg_DMSO_2_K27Me3 37,628,254 29,077,804

38-5_DMSO_1_K9Me2 57,024,354 42,891,924

38Lg_DMSO_1_K27Ac 44,665,878 37,664,038

38-5_DMSO_1_Input_batch4 69,705,942 56,571,128 for 38-5_DMSO_1 K9Me2

38Lg_DMSO_1_Input 37,296,752 30,470,440 for 38Lg_DMSO_1 K27Ac

38Lg_DMSO_1_Input 49,727,864 40,284,668 for 38Lg_DMSO_1 K27Me3

38Lg_DMSO_2_Input 40,505,458 32,886,902 for 38Lg_DMSO_2 K27Me3

38Lg_6AN_1_K27Me3 50,310,568 39,528,064

38Lg_6AN_2_K27Me3 38,884,546 32,325,978

38-5_6AN_1_K9Me2 33,324,396 24,956,330

38Lg_6AN_1_K27Ac 40,895,998 34,565,998

38-5_6AN_1_Input_batch4 42,480,878 34,537,520 for 38-5_6AN_1 K9Me2

38Lg_6AN_1_Input_batch3 40,615,920 33,033,908 for 38Lg_6AN_1 K27Ac

38Lg_6AN_1_Input_batch5 45,637,332 37,039,730 for 38Lg_6AN_1 K27Me3

38Lg_6AN_2_Input_batch5 36,771,474 29,835,218 for 38Lg_6AN_2 K27Me3

Total 7,441,856,062

[00263] Supplementary Data 1B: Summary of WGBS reads for all replicate samples

[00264] Supplementary Data 1C: Summary of RNA-seq reads for all replicate samples

[00265] Supplementary Data 1D: Summary of sequencing correlation coefficients for each replicate

Samples Modification Correlation coefficients between replicates

38Per K4Me3 0.8487501

38Per K36Me3 0.7908561

38Per K27Me3 0.8087827

38Per K9Me2 0.9533903

38Per K9Me3 0.973981

38Per K9Ac 0.8405352

38Per K27Ac 0.9662306

38Per K16Ac 0.8958976

38Lg K4Me3 0.8977917

38Lg K36Me3 0.7457856

38Lg K27Me3 0.8283084

38Lg K9Me2 0.9345982

38Lg K9Me3 0.9870598

38Lg K9Ac 0.8557192

38Lg K27Ac 0.9749662

38Lg K16Ac 0.8634484

13Pr2 K27Ac 0.9795735

13Pr2 K27Me3 0.957889

13Pr2 K36Me3 0.970403

13Pr2 K9Me2 0.9710885

13Pr2 K9Me3 0.9918936

13Pr1 K27Ac 0.9797524

13Pr1 K27Me3 0.9481441

13Pr1 K36Me3 0.9697859

13Pr1 K9Me2 0.9238059

13Pr1 K9Me3 0.9841044

13Pr1 K4Me3 0.9969566

13Lg K27Ac 0.9824737

13Lg K27Me3 0.9518218

13Lg K36Me3 0.9647369

13Lg K9Me2 0.9505661

13Lg K9Me3 0.9927826

HPDE K27Ac 0.9969341

HPDE K27Me3 0.9694532

HPDE K36Me3 0.9582686

HPDE K9Me2 0.9820727

HPDE K9Me3 0.9655677

HPDE K4Me3 0.991134

38Lg_DMSO K27Ac 0.9523149

38Lg_DMSO K27Me3 0.9190836

38Lg_DMSO K36Me3 0.8830686

38Lg_6AN K27Ac 0.9515394

38Lg_6AN K27Me3 0.9328865

38Lg_6AN K36Me3 0.8750725

Average 0.9331653

Median 0.9556397

[00266] Supplementary Data 2. This provides summaries of chromatin domain calls for LOCKs, large LOCKs, and ECDs for each sample including median lengths, ranges, % genome coverage, and levels of individual histone modifications in each type of domain. Median lengths, ranges, and % genome coverage for each heterochromatin modification individually (irrespective of domain location) are also included.

[00267] Supplementary Data 2A: Summary of large chromatin domains detected by ChIP-seq

[00268] Supplementary Data 2B: Summary of broad heterochromatin modifications detected by ChIP-seq

[00269] Supplementary Data 3. This provides all p-values calculated for sequencing experiments, as designated by the figure labels. Sensitivity analyses for LOCK domain calls are also included.

[00270] Supplementary Data 3A: p-values for H3K9Me2 reprogramming across LOCKs

[00271] Supplementary Data 3B: p-values for reprogrammed euchromatin modifications from DE genes

Supplementary Data 3C: p-values for reprogramming of H3K9Me3 across

[00273] Supplementary Data 3D: p-values for reprogramming of modifications from LOCK DE genes

Sample   K9Me2 (<A38Per Wilcox)   K27me3 (<A38Per Wilcox)   K27ac (>A38Per Wilcox)   K36me3 (>A38Per Wilcox)   DNA Methylation (>A38Per Wilcox)

A38Lg    p=0.99       p< 2.2e-16    p=1.163e-06   p=0.000000007195   p=0.01699

A13Pr2   p< 2.2e-16   p< 2.2e-16    p< 2.2e-16    p< 2.2e-16         p=3.312e-08

A13Pr1   p= 0.51      p=2.738e-06   p=1.362e-06   p=0.0004268        p=0.99

A13Lg    p< 2.2e-16   p< 2.2e-16    p=2.362e-13   p< 2.2e-16         p<2.2e-16

[00274] Supplementary Data 3E: p-values for reprogramming across Large LOCK domains

[00275] Supplementary Data 3F: p-values for 6AN RNA/ChIP-seq experiments

[00276] Supplementary Data 3G: LOCK sensitivity analyses

[00277] Supplementary Data 4. This file lists all differentially expressed (DE) genes detected in each sample by RNA-seq, including level of expression, p-values, directional changes, and chromatin domains that each DE gene mapped to. Analysis of recurrent DE genes detected across distant metastatic samples and between control (DMSO) and 6AN treated cells is also reported.

[00278] Supplementary Data 4A: Summary of DE genes between A38Per and A13Pr1 detected by RNA-seq and mapped to chromatin domains (data not shown - publicly available on the World Wide Web at nature.com/ng/journal/v49/n3/full/ng.3753.html?foxtrotcallback=true#supplementary-information by clicking link "Supplementary Table 7").

[00279] Supplementary Data 4B: Summary of DE genes between A38Per and A13Pr2 detected by RNA-seq and mapped to chromatin domains (data not shown - publicly available on the World Wide Web at nature.com/ng/journal/v49/n3/full/ng.3753.html?foxtrotcallback=true#supplementary-information by clicking link "Supplementary Table 7").

[00280] Supplementary Data 4C: Summary of DE genes between A38Per and A13Lg detected by RNA-seq and mapped to chromatin domains (data not shown - publicly available on the World Wide Web at nature.com/ng/journal/v49/n3/full/ng.3753.html?foxtrotcallback=true#supplementary-information by clicking link "Supplementary Table 7").

[00281] Supplementary Data 4D: Summary of DE genes between A38Per and A38Lg detected by RNA-seq and mapped to chromatin domains (data not shown - publicly available on the World Wide Web at nature.com/ng/journal/v49/n3/full/ng.3753.html?foxtrotcallback=true#supplementary-information by clicking link "Supplementary Table 7").

[00282] Supplementary Data 4E: Summary of DE genes between A38Per and A38Lv detected by RNA-seq (ChIP-seq not performed for chromatin domains) (data not shown - publicly available on the World Wide Web at nature.com/ng/journal/v49/n3/full/ng.3753.html?foxtrotcallback=true#supplementary-information by clicking link "Supplementary Table 7").

[00283] Supplementary Data 4F: Summary of DE genes recurrently up/down-regulated across primary tumor precursor (A13Pr1/2) and distant metastatic subclones (A13Lg, A38Lg, A38Lg) vs. A38Per (data not shown - publicly available on the World Wide Web at nature.com/ng/journal/v49/n3/full/ng.3753.html?foxtrotcallback=true#supplementary-information by clicking link "Supplementary Table 7").

[00284] Supplementary Data 4G: Summary of DE genes between control (DMSO) and 6AN treated A38Lg cells detected by RNA-seq and mapped to chromatin domains (data not shown - publicly available on the World Wide Web at nature.com/ng/journal/v49/n3/full/ng.3753.html?foxtrotcallback=true#supplementary-information by clicking link "Supplementary Table 7").

[00285] Supplementary Data 4H: Comparison of overlaps between DE genes between matched lung and peritoneal subclones (A38Lg vs. A38Per) and DE genes between control and 6AN treated A38Lg cells.

RP11-

PPAPDCIA TMEM59 TSC22D1

346D6.6.1

RGS5 GAS8 NBAS ITGAM

RP11- ABC7-

ENG PTPRZ1 618G20.2.1 42389800N19.1.1

RIPPLY 1 RASSF7 SSTR5 GPR116

UHRF1 MRPS6 CRIP2 CYP24A1

RP11-

SUSD5 NUDT18 DHRS9

353B9.1.1

KRT80 GRAMD4 RPL31 GRPR

KRT32 SERINC5 KIAA1143 AIF1L

DNAJC3-

SLC26A7 TMEM102 CGN

AS1

RP11-

TFPI2 TMPRSS5 C1R 314P12.3.

1

RP11- TMEM63

GAB2 CYFIP2 314P12.2.1 B

ATP6V0A

MMP7 FAM109A FAM174B

4

KLHL23 TLR4 SYT11 MPP7

KREMEN

Clorfl lO C20orf96 DET1

1

GRIN2A SMAD6 TBC1D8B TTN

PADI2 VAMP4 TAZ LEF1

AL162759.1.

MYOIE UCP2 SP140 1

RP4-

MYBL2 DVL1 SP6

647C14.2.1

GLYATL AC069513.3.

AARS ANXA10 2 1

RRM2 ITPKC DALRD3 NYAP2

E2F2 USP18 SEMA3C MCOLN3

DENND2

CPA4 RAB33B ATP2B 1

A

RP11-

HHIP THTPA HSH2D

325F22.3.1

MYPN ADHFEl GPER Clorfl06

RP11-

LENG8 ITPK1 EVL 150012.6.1

CLSPN RILPL2 IL1RAP C7orf58

RP1-

WDR69 T SNARE 1 RPL29

95L4.4.1

RP11-

SPIRE2 EFHC1 EL11 150012.1.1

AKAP12 MMAA CRTAP DDN

DOCK 10 SLC15A3 B3GAT1 snoU13

PLEKHM

PSG5 SEL1L SLC27A2

1

RP11-

SPC25 ME1 DGAT2

403113.8.1

TM4SF4 STARD10 TBC1D9B SHANK2

FAM71D SNPH EEF1A1P5 HNRPCP

OLFML2 AC005083

EME2 VKORC1

A .1.1

OXCT1 DHRS3 SNX21 S1PR3

FAM111

MX1 KDELCl ARL14 B snoU13 RRN3P1 AIP SAMD5

AC073130.1. RPl- ARHGDI

FAM100A

1 34B21.6.1 B

CCNE2 DDO OSBPL6 GNG4

CDC42BP

CDC25A TPRN C4orf34

G

RP3-

CD58 FAM18B2 SMO

324017.4.1

MCM10 PLCG2 TXNDC15 TIAM1

RP11- RP11-

LIN7A NMU

424C20.2.1 22011.1.1

CTD-

MSRB3 ZNF554 SOX6

2287016.1.1

HEATR7

GLI4 FLYWCH1 PPFIBP2 B l

TP73 TRAF1 Cl lorf2 LY6D

PRKDC ANKRA2 CHST12 PLAC1

DLEU2 SYNGR2 FTX SYTL5

CTB-

FAM113B CCNO SELL

164N12.1.1

AL844908.5.

SDPR CROCC MY05C

1

RPl l-

P0LE2 HIVEP2 UNC5A

996F15.2.1

MEST RHBDD2 CPEB2 SPNS2

ASTN2 GATS. l PDIA6 BK

LAMP5 RALGPS1 STON1 PPYR1

MKI67 VWA1 ARRDC3 FAM46B

HEATR7

TMEFF2 C7orf23 COL17A1

A

CPA5 HLA-K TESK1 MACC1

ANKRD18D

MAF1 HEG1 DNAH12 P

RPl l-

ESC02 TMEM8A CYP2J2

85K15.2.1

LDLRAD

TUBA1B DDX58 MTHFD2

3

PYROXD

WNT7A RWDD2A RAB6B

1

MCM4 CYP27C1 RPL10 BTBD11

AKAP5 RNF145 PHACTR1 PLA2G7

TYMS ZDHHC14 NUDT22 TBC1D30

CDCA2 KLF7 TRIM4 SNX25P1

CAP2 KYNU RHOG TESC

RP11-

LDLR EIF1B COL1A1

527N22.2.1

RP11- ZDHHC11

SSR4 FERMTl

297M9.2.1 B

DEPDC1 AC007383.3.

ARG2 MED12L B 1

RP11-

DPF1 SRCRB4D TCEAL1

582J16.5.1

RP11-

LRP5L ZFPL1 BRI3BP 55418.2.1

RP11-

FOXC1 RGS10 LAD1 33N16.1.1

MAP3K1

AACS MEIS3 C16orf74 4 EXOl SLC44A2 LLGL1 CD163L1

RP11-

SEPP1 C9orf37 ANXA8L1

253E3.3.1

FABP6 ARHGAP25 USE1 AEN

RP11-

STAT6 YPEL1 C8orf46 184116.2.1

RP11-

FOSL1 PLA2G15 CDH1

2711.2.1

CCDC99 VPS28 RPS27 XDH

AC016831.7.

R3HDM2 VEGFB XK 1

FJX1 Cl lorf35 CELSR3 ANXA8L2

SCN5A ARHGEFIOL EPHX1 COL12A1

RP11- PPARGC1

PSMD9 TINF2

687M24.3.1 B

METTL7

ILDR2 ZNF70 C6orfl32

A

MUC5AC TMTC2 C3orfl8 TTC3P1

AC092614.2.

NFE2L1 PITPNC1 GALNT6 1

APOBEC3B C5orfl3 ISL1 SEMA7A

RP11-

ZNF768 COR06 F2RL1 54A9.1.1

NCAPG SIK1 NEK3 ALDH1A1

RPl l-

ATAD2 OSCP1 CRABP2

313D6.4.1

SMC4 ACADS RPL27A CA2

RPl l-

FUT9 HERC6 FGFBP1

216F19.1.1

ZNF488 CLDN9 ABC A3 SNAI2

TK1 RHEBL1 PITPNM2 PLS1

CTD-

CHST6 LGR5 KLHL13

2023N9.1.1

RP5-

HELLS VMAC AGA

862P8.2.1

MMP24 AQP3 FAM113A RAB38

SLITRK3 DNAL4 GPR37 SLC7A8

ELOVL2 HBP1 VASH1 WNT3

SERINC2 ZG16B CAMKK1 SARDH

AC002066.1.

FAM100B MAST1 TPM1 1

RPl l-

DIAPH3 EREG RASEF

574K11.20.1

RP11-

ZNF862 RPL12 SKAP1 291L15.2.1

GINS1 NARFL NSUN5P2 TRIM 14

RP5- RP11-

SWSAP1 CTSL2

968P14.2.1 47714.3.1

DHFR YPEL5 NSUN5P1 GK

ROR1 SLFN5 HAUS4 VEPH1

RP11-

MCM3 CCNDBP1 TMEM234

800 A3.4.1

RP1-

PAQR8 MRC2 FGFR3 140K8.5.1

GLYATL SLC25A2

PYCR1 PADI1 1 7

GOLGA8

CDCA7 SLC26A6 CDS1

B CSMD2 IKZF2 C3orf78 JPH1

HSD17B1

RBPMS2 CD68 FAM81A

1

AP003068.23.

RBL1 LMBR1L GALNT12

1

RP11-

KCNJ14 PAK3 DPY19L2

677M14.3.1

CCBE1 DDR1 GMPPA ITGBL1

CCND1 C7orf53 EEF1A1P6 SYNP02

AC093673.5. RP11-

E2F8 RELB

1 157P1.4.1

AN RD2

CDC45 WASH3P CBS

2

LMNB 1 PDCD4 MIF4GD GRHL1

Cl lorf41 NR1D1 CNPY4 KRT15

RP11-

BEST3 NCOA7 GGT7

93L9.1.1

CTPS GOLGA2 LRP1B C15orf62

RASSF10 CEBPG FAIM2 AADAC

AC112229

POLA1 C2orfl5 ZNF333

.1.1

TUBB6 KLF13 AGXT2L2 TRIM59

HOMER 1 GEMIN8 B9D1 DNAH10

CTD-

CD200 ARRDC2 EEF1A1 2021J15.2.

1

SLC39A1

CENPM ATRNL1 KRT19

1

ERCC6L SAT2 PAOX KIAA1199

HMMR ZER1 CTSA GPM6A

CENPW KIAA1407 SLC45A1 AKAP6

RP11-

OAS1 RAB3IL1 GPR110

29813.4.1

RP11-

AC004410.1 C6orfl QPCT 259P6.1.1

RP11-

CTD-

THBS1 DGAT1 382A18.1.

2314G24.2.1

1

GPR63 RNF103 CYP4X1 METTL7B

DTL EFHC2 HCFC1R1 PHACTR2

RP11-

C17orf69 FAM70B PLEKHA7 101K10.8.1

RP11- LINC0049

SHMT2 PLAU 204J18.3.1 3

ST6GALNAC

WDR4 SEC14L5 SLC4A3

2

BRIP1 KLHL31 CLK4 KRT18

LMNB2 LGMN UGDH-AS1 C3orf52

C14orf49 LLGL2 ITGA11 SYTL2

EIF6 ENDOV CGREF1 PKDCC

ARHGEF

CST1 RPL22L1 ASB9 26

RP13- SLC22A2

ETNK2 FKBP2

516M14.1.1 0

POLQ LIPA OXA1L IGSF3

RP11-

FAM13A C10orfl02 CLDN10 573111.2.1 APLN ST3GAL1 DNASE1L1 MB 0 ATI

RP11-

APOBEC3G HIGD2A TMEM169

304F15.3.1

NDC80 PLAUR MLLT3 SYK

MCM6 ACYP2 APOO 1496.1 MAST4

CAV1 RTN2 FAM156B SDC1

RP11-

RP11-

HIF1A TRPT1 613M10.6. 53M11.3.1

1

RPl l-

MARK1 DNASE2 PPM1H

571M6.6.1

CTA- RPl l-

CA11 GCHFR 445C9.15.1 243J18.3.1

USP13 SLC7A7 SGSH RHOD

RP5- CTD-

ABCA10 EZR

1172A22.1.1 3074O7.5.1

RP11-

UCN2 PTK6 CNTNAP1 303E16.2.

1

MCAM RPS6KA2 TNFRSFIOC RCC2

ZWINT WHAMMP3 B3GNT1 KDR

LINC004 RP11-

ENGASE LRP1

60 41612.1.1

BIRC5 LPCAT4 P2RX6 GPR65

FLNC EPS8L3 SSBP2 BCL7A

SYNE2 TMEM198B ZC3H6 PPCDC

PRICKLE

HMGN2 CCNG1 NBPF10

3

AC108463

ADCY3 PDE4DIP TBC1D4

.1

PDSS1 GPR108 MAGED2 CADM4

CDC A3 TMC4 DNAJC1 SPTBN5

AXL PTP4A3 PCK2 CUEDC1

KIAA010

ITGAX MANF CMTM4 1

NAV3 TSSC4 ZCWPW2 CDK5R1

KIF11 AN09 Clorf213 C19orf21

LPCAT2 PPP1R3B CACNG6 Clorfl l6

AC004540.5.

KIF4A ANTXR2 UNC5C

1

CTD-

ARHGEF3 TSLP ALDH5A1

2334D19.1.1

RP5-

EML5 C10orf32 MB21D2

1103G7.4.1

RP11-

NME1 PGM2L1 CIS

7K24.3.1

ARHGAP11

HLA-H SPATA25 DSG2 A

MCM2 TUBG2 PPAP2A PLD6

RPl l-

SLC5A11 SLC16A3 GALNT5

390P2.4.1

RPl l-

CHN1 CCNL2 ZNF185

539L10.3.1

KIF23 SGK1 SH2B2 P2RY2

ARHGAP19 MEF2C NEURL2 PLCXD1

SLC8A1 CAMTA2 KLHDC2 AIG1

FBN2 HLA-F- SHISA2 PALM AS1

FKBP5 PIWIL4 PXK PKP4

IGSF9B LSS WBSCR27 ZNF462

KIF15 PCMTD2 DSEL SAA1

NUP210 CFI TRIM46 SPATA13

AS3MT ARL4D CTSF LAMA3

TNFAIP8L3 MVK KLHL35 NIDI

PM20D2 CDH17 PDIA5 PEAR1

OPN3 MAFF PLK3 PYGL

RRM1 CACNB 1 ZNF815 USP40

DNA2 C2orf63 NICNl DIS3L

DEPDC1 IL15 SCAND1 ID2

PDE1A ST3GAL6 CALHM3 FUT1

RP5-

ALYREF C9orf7 PRSS3

827C21.4.1

MORC4 ENDOU LETMD1 NMNAT2

FBX05 PIM3 JAKMIP3 DNAJA4

DPY19L2P1 CLDN7 LZTFL1 DOCK9

CENPF DAB2IP CTNS DDI2

RPl l-

GPR19 TARSL2 PLAC8

280F2.2.1

SYCE2 REC8 PDK1 FBP1

GINS4 FAM160A2 MBL1P RAG1

XRCC2 ZFP36 LHPP PRPF4

RP11-

DARS2 BLVRB RPS6KA1

73K9.2.1

HMGB3 STARD4 CALHM2 DSP

KIAA152 RPl l-

SGK2 SLC22A5 4 277P12.20.1

NUP188 CIRl PTCHD2 AADAT

MAPK8IP

DPF3 UST FARP2

3

STMN1 USP20 IRF2BPL MCC

RP11-

WDHD1 PLK1S1 ZNF658

149123.3.1

RP11-

NCAPD3 TTC39B B3GNT5

108M9.4.1

CCNE1 HIP1R ABHD14B NT5DC3

TRPV2 CYP4F3 DLG4 ABCC9

CDKN3 SH3YL1 PCDHGA7 PNPLA4

PRR11 TNFSF12 MAGED1 ZNF717

CSRP2 TMEM8B LEPREL2 RAVER2

COTL1 IGSF8 LOXL2 MREG

MSH2 PTK2B SLC43A2 RBP1

KIF18B NR4A2 ZEB 1-AS1 RAPGEF3

SKP2 SPIRE 1 LENG8-AS1 CNKSR3

RP11-

ADORA2

DLL1 C17orf72 757F18.5. B

1

GFRA3 TMED 1 C16orf93 PRSS12

ZNF724P SQLE ARHGAP4 HOOK1 RASGRF

ABCC6 CNTD 1 COL16A1 1

FAM83D VSIG10L PCDHGA3 MST4.1

BUB 1 FA2H STK32A CMIP

MTHFD1 OSBPL7 SLC29A4 SMC5

DHX9 KCNIP3 LEPRE1 ANXA2P2

NRG1 LYNX1 HOMER3 STEAP2

PCNA ULK3 MCEE NUDT14

UTP20 HINT3 XBP1 MMD

MIR17H AN RD5

FDFT1 ALDH1L2

G 6

ANLN TRIB 1 MAP2 SURF2

RUVBL1 NPDC1 NNMT SH2D3A

MY01B VEGFC NUCB 1 MAP3K1

NCAPG2 GPRC5C KLRC2 UACA

FH SIGIRR TMEM158 TAF2

GNG11 TCEANC SEPT7L DCBLD2

GS1- RPl l- RP11-

MKNK2

465N13.1.1 161M6.4.1 295M3.1.1

AC079922.3. NAALAD

MT1L N4BP2L1

1 L2

TNS1 KCNMB4 PCDHB7 CNTRL

RP5-

BARD1 HSD17B7P2 CD82

1182A14.3.1

FBXL19-

THOP1 GET4 NAT6

AS1

SKA1 FKBP10 PTX3 RHBDL2

RP11-

C1QL1 RDH10 ZNF836 448G15.3.

1

RP11-

SLC22A1

Cl lorf82 TMEM106A 357H14.19

5

.1

CCNA2 CCDC57 TFF2 TUBGCP5

SLC35F3 SHB CYP1A1 INADL

ClOorfU

TMPRSS3 PPP1R3E C9orf64 0

NEXN NYNRIN TMEM120A ARAP2

ASF IB TNFRSF9 POLN KRT16

PRTFDC1 SESN3 RAPGEF4 ALS2CL

FAT3 A2LD1 RABAC1

RP11-

H2AFX CACNA1G

475N22.4.1

PRIM1 ZNF628 LOX

SNX18P3 JAG2 FAM151B

RP11- RP11-

PTPRM

973F15.1.1 244H3.1.1

GOLGA8

ADRB2 CLEC2B

A

FBXL13 TP53INP2 KDELR3

CDC20 WASH4P ISYNA1

CENPK TNIP1 RASL11A

RP4-

CDCA4 CALR

659J6.2.1 RP11-

AT AD 5 CCDC126

108K14.4.1

SMC1A PTPRB METTL12

FAM196

U6 BNIP3 B

UGT1A1 SPRY3 FAM175A

CTD-

MMP15 CRLF2

2574D22.2.1

MCM5 SLC23A3 TMEM231

MELK TBC1D3F CRELD2

LYPD1 ABCA5 PLXNA3

RP4- AC026202.3.

DLGAP5

798A10.2.1 1

NUSAP1 CLDN15 ASNS

TUBBP1 TMC7 RASIP1

TCTEX1D

MYH10 SFRP5

2

MIR155H RP11-

NOS3

G 66N24.3.1

ARHGEF1

CCDC138 ABCC3

6

ERCC2 TPBG RIBCl

KIF20A FAAH HSP90B 1

TMEM14

TMEM150A HYOU1 B

H2AFY2 LRRC56 BAMBI

OXTR CEP85L TPP1

RDM1 MIA ZCCHC24

PLEK2 CDH6 FAM161B

AC004080.12

SHCBP1 PON3

.1

GJC1 CCDC92 MAPT

AC018755. i l

SNRNP25 PTPRH

.1

KCNQ5 FOXQ1 PLOD2

ODC1 J01415.23 MAN1A1

EBNA1B

PPFIA3 NUCB2 P2

SMPDL3

FABP3 DERL3

A

AC007283.5.

MCM7 SCN1B

1

CHAC2 ALPPL2 FN1

BORA MTHFR PLCD1

ASRGL1 WASH2P CCDC85B

VIPR1 RSAD2 BTBD 19

PTTG1 LIMA1 AP000769.1

PPP1R16 AC147651.3.

ACTB

A 1

TOP2A PIK3C2B ATHL1

NEIL 3 XAF1 CYP2E1

ALDH1B

KIAA0513 FAM182B 1

GRAMD1

CEP250 PDIA4

C FAM131

GLDN ZFP2 B

FAH GPCPD1 SLC25A29

CIT CABP4 RAB24

RPl l-

CDCA8 NXNL2

755F10.1.1

CDK2 SEMA4C QPCTL

EZH2 PHYHIP ALDOC

AC093734. i l

PSMC3 AKR1C1

.1

FGGY IRF9 LRRC29

RPl l-

NXPH4 ZNF517

307013.1.1

RP5-

FRMD4A 1187M17.10. SEMA3F

1

HIST1H1

CSPG4 RUSCl-ASl

c

AN RD4

CSE1L EVI2B

2

KIF20B ZC3H12A TIE1

C2CD3 JUND HSPA5

PPIAP29 SEMA4B TSPYL2

CTA-

CTAGE5 HERPUD1

221G9.10.1

TPX2 NLRP1 AQP2

RP4-

DKK1 GNE

794H19.2.1

PFAS LTB4R2 C2orfl6

PBK BBS12 AKR1C2

DICER 1-

TRIP 13 ANGPTL4

AS

CCDC18 CMPK2 HCN2

UBE2T CALB2 NOG

AL357673.1 DHRS2 AKR1B 1

PDE12 LRFN3 ST7-AS1

NUP155 PROC EVI2A

AC022007.5.

ARNTL2 PROS1

1

POLD2 TMEM80 ZNF575

NCAPD2 RALGDS RCN3

VCL SLC6A8 PCDHB 15

CCDC85 RP13-

TMC6

C 895J2.7.1

FANCD2 HMGCR C9orfl50

RP11-

TBC1D17 ANGPT1 117P22.1.1

MYLK TBC1D3 ARSA

RP11- RPl l-

GRN

394B2.4.1 49111.1.1

PRLR ROM1 PRPH

AC004383.5.

AP2B 1 MID1IP1

1

RASSF2 C15orf61 P4HA1

PRKAG2 SPATA20 ANGPTL2 AC002480.4.

COL4A6 FAM78A

1

PSMC3IP STAT2 MTMR9LP

ARRB2 PYGM CRELD 1

CAMK4 RHBDF1 UPB 1

AC002480.3.

TSPAN2 LRCH4

1

RPl l-

RFC2 PODN

554A11.9.1

PLK2 SPRY4 GALNT9

SRRT CITED4 AL137145.2

WNT7B LIPG VLDLR

ADORA1 CAPS IFITM10

C22orf29 EIF1 LDHD

HTR1B CYP4F12 ER01LB

ITGB8 AKAP17A NTNG2

ZNF660 ACTR1B NPY1R

EPB41L2 WDR66

COQ3 APOO 1468.1

AASS AP001372.2.1

PAICS FAM214A

CENPI GSN

CASC5 FAM193B

TUBA1C TPRG1L

NUDCD1 ARHGEF2

L2HGDH TNFAIP3

CHML CDA

CCNB 1 PLA2R1

RP11-

ZSWIM4

799021.1.1

DNMT1 IRAK2

CDK1 LYZ

RP11-

TJP3

673C5.1.1

NUP37 C8orf55

CTB-

TPMT

131B5.5.1

AURKA AC103810.1

PCDHGB

PEA15

2

NRGN KLC4

RP4-

PLK1

541C22.5.1

ZNF124 SIX5

CGNL1 SEL1L3

SNRPD1 MIR29C

PTPN14 MZF1

CDC7 ADAM8

DERA WDR45

PYGOl ZNFXl-ASl

LRRCC1 AC017099.3. 1

FAM173

ARSD B

AHCY DLX4

UBAC2-AS1 HK2

HIGD1A HS3ST1

GMNN ABCG1

NCAPH CCDC146

CEP128 EGLN3

SLC38A5 CLK1

TUBB4B UPK3B

RP11-

SLC16A6 512F24.1.1

BICC1 HLA-F

HMGB 1P

NT5M 5

NUF2 BTG1

VCAN C7orf63

FAM64A RASSF9

DNAJC9 CTSL1

RANBP1 ENDOD1

C4BPB FAM116B

C9orfl40 KNDC1

SKA2 PPP1R3F

RP11-

LARP6 181C3.2.1

PKMYT1 FBX06

RFWD3 CCDC69

TCOF1 PRRT1

CTD-

KNTC1

2258A20.4.1

DKC1 TSPAN1

RP11-

MPP2

362F19.1.1

CCNF SLC2A10

CTD-

Clorfl l2

2341M24.1.1

AC046143.7.

C2orf81 1

STMN3 FZD4

OIP5 RAB40C

DGKH EXD3

NRM IFIT3

PCDHGC

RFC3

5

RP11-

RBBP8

285F7.2.1

HTR7P1 RHBDF2

PRICKLE

NMT2

4

ARHGAP11

RABL2A B

DPYSL3 WBP1 TCAM1.1 LAMB 3

RP4-

YBX1

697K14.7.1

HSP90AA

TMEM91 1

PCDHAC

ACTA2

1

MPP5 TRIM2

C8orf84 C16orf7

ASPM RTP4

LRRC8C KIAA1875

RP11-

TBC1D7

540D14.6.1

RP11-

JUP

462L8.1.1

ZNF367 NEU1

RP11-

DDX60 956J14.1.1

CCT5 HOOK2

C6orf52 BC02

NEK6 TBC1D8

TSPY26P PRRX1

MRPL1 KRTCAP3

FARSB PINK1

PKP2 NFIL3

CAND2 LCN2

MRT04 TYMP

KIAA058 RP11- 6 429J17.2.1

DEK WARS

FAM54A COL6A1

FEN1 LZTS2

RP3-

NPIPL2

510D11.2.1

SMC2 LCA5L

KB-

RELT

1460A1.5.1

SGOL1 CYP4V2

ANKRD1 FAM59A

SUV39H2 SEZ6L2

KDELC2 PTPRE

HMGB2 NR2F6

ATRIP COL11A2

SACS RENBP

DEPDC7 AQP6

HMGB 1 ULBP1

RP11-

CX3CL1 380J14.1.1

POP1 CCDC149

GFAP ASAH1

TNFRSF1

NUP205

4

CENPA C12orf63 TPGS2 C12orf57

CD 109 PSPN

RP11-

WDR3

263K19.6.1

NOP56 MAPRE3

RP11-

PTRF

44N21.1.1

TNFSF13

PHIP B

QSER1 FM05

FAM86A PRX

POC1A GSDMB

CYB5RL JUNB

H2AFZ HMGCS1

AC092329.1 CSF2RA

RAD51A

HEXDC PI

CCP110 PLXND1

CEP55 NFAM1

ZNF347 PERI

NCS1 CALCOCOl

RAD54L DRAM1

MDM2 AOC2

SMTN GAS7

CDC25C IFIT2

TFDP1 MTRNR2L9

NEURL1

TSTD1 B

TBCD NR4A1

KRBOX1 TM6SF1

SAE1 CBLB

CHEK1 CCDC24

RP11-

FAM66C

678B3.2.1

REV3L C9orfl6

SPC24 KDM6B

TMPO TBX6

DHRS4L2 MMP28

MYOF ACSL1

TUBA4A G0S2

CDC42EP

ADAT2

2

RP11-

SCD

348A11.4.1

BST1 NFKBIA

FOXM1 PLEKHF1

AC027612.6.

AGFG2 1

RP11-

LEPREL4

1334A24.4.1

UBE2N ULK1

PRIM2 ADAMTS13 CAMK2N

S100A2

1

NEGRI PDE7B

HNRNPA

AC073343.1 B

AC002117.1.

CCDC41

1

CAMK1 SPSB3

GTSE1 SLC5A12

MMP2 PLA2G6

BOLA3 HOXD4

HS3ST3B

SYT15

1

BCL2 JAK3

ECHDC3 ClOorflO

KIFC1 NFKBIZ

G3BP1 DUSP8

PFN1 NEIL1

RAET1K ECEL1P2

RP11-

MNS1

783K16.13.1

SIRPA KLHDC1

RP11-

MST1 152N13.12.1

PORCN ROB04

NOC3L SYT17

RAD18 NR1H3

ESPL1 GDPD3

PTER CEBPB

RP11-

SDCBP2

1277A3.2.1

NUP107 NOV

SGOL2 DNAH7

DI02 IER5L

ZNF726 FANK1

SFXN2 ORAI3

FRMD6 SLC5A3

CENPN B3GALT4

NT5DC2 ZDHHC1

RFC4 KIF27

NOP 16 GIMAP2

CDC42EP

TUBG1

5

AC006028.9.

DZIP1

1

IP05 PRSS16

LHFP TGFBR3

ARHGAP22 FASN

TMEM48 ATP IB 1

RPA3 C17orfl08

TMC07 EPHB6 THSD1 BCL6

AC013461.1. RP4- 1 811H24.6.1

RP11-

CCDC165

49619.1.1

MAGOH

CXCL2 B

EFEMP1 CACNA2D4

EED PPM1K

KIF14 C17orfl03

NUP35 MXD 1

DTYMK MAGIX

SSX2IP PTPRCAP

CELF2 DPM3

NET02 EVPL

PHLPP2 SLC2A13

PTBP1 IL1B

FKBP3 KCNE4

URB2 INPP5J

EEF1E1 FRAT1

BEGAIN DUOX2

CACYBP FBXL15

HMGN5 C15orf48

HJURP FAM86FP

DBF4 NR3C2

XRCC3 STX1B

RP11-

TMCC3

273G15.2.1

SPAG5 RASSF4

AC108488.3.

P2RX4 1

ATOX1 TPD52L1

RP11-

SLC25A3

420G6.4.1

TUBB P2RY11

CCDC152 ADCK3

CENPL PARP10

GAS6 AC103810.2

RP11-

HERC5 540A21.2.1

NXT2 TFEB

LRR1 FAM84B

AC093627.10

E2F1

.1

RP11-

OAS2

122A3.2.1

ARHGAP

KCNH4

9

MYEOV FBX024

TMEM194B CD34

RANGAP

CD 14 1

CNTF OTUD1 NASP ICA1

AC005152.2.

MAP7D3

1

RP1-

WDFY3-AS2

239B22.1.1

DUS2L HIST1H2AC

FKBP1A SELPLG

NOLC1 PDGFRB

RP11-

ZNF702P

566K11.1.1

CDCA5 SHC2

IMMP2L CES3

SUPT16H GBP5

MIR621 MYL5

RP11-

JHDM1D

64D22.2.1

CEP97 YPEL2

ITGB3BP DHX58

ERCC8 IFIT1

RP3-

WDR62

395M20.8.1

ZNF239 DFNB31

RP11-

LYG1 680F8.1.1

GNL3L FAM66D

RP5-

MAPKAPK3

882C2.2.1

ADK IL1R1

CENPV LTB4R

UTRN PDXDC2P

PSMD1 IL4I1

GTF2H3 FOX04

USP49 AC127496.1

C3orf26 AN ZF1

CYP26B 1 CSF2

JAM3 LENG9

AC097500.2.

SPA17

1

BTG3 CCDC19

MRPL15 FURIN

KPNA2 IFIH1

RRP15 GBP4

HSPB 11 CCNG2

C17orf89 FZD1

HAT1 SNHG5

UBE2S PDZD7

RBM12B DAPK2

MRPL47 C20orfl95

MSI2 C9orfl63

KIAA002

PCSK4 0 PTPN1 KIF26B

SERPINE

GLRX3

2

CFL1 MSMOl

CTC-

CDH24

523E23.1.1

HNRNPA HIST1H3 3 E

XYLB PAQR6

DRAP1 P4HA2

ACTR3B AOC3

CHCHD3 IRS2

H2AFV MAFB

NSMCE1 MXD4

COLQ DNER

CEP41 MDGA1

NUTF2 CH25H

SMARCC

TST

1

KIF2C RARRES3

ATP5G1 UNC13A

ISPD MST1P9

CHTF8 DNAH2

IP09 KCNK5

AC026271.4.

PPIL6 1

RP11-

TTF2

1391J7.1.1

DHRS4 RHPN1

AC068282.3.

FOSB

1

RP11-

SF3B3

202P11.1.1

GSG2 MIR29B2

CEACAM

AHCTF1

1

LPHN3 RNF24

KIF22 YPEL3

IMMP1L GPT

SNRPB CALML6

DCLRE1 RP1- A 163M9.6.1

NPPA-

ZNF681

AS1

CDH2 PELI2

GPR125 FAM71E1

AN RD18A BPI

RP11-

RASSF5 11011.12.1

SCD5 TMEM53

ARMC10 ABTB 1

AC009948.5.

UCN

1

KIF5C IDUA RP11-

GDPD1 381E24.1.1

RP4-

PACSIN3

758J18.10.1

GNB4 BMF

RP11-

TRIB2 58E21.3.1

PARP1 DYRK1B

HSPA4L SCNN1D

PRPS1 CLIP2

PPIL1 INHA

XRCC6 FAM47E

COQ2 LRRC24

CBX2 GFI1

EPB41L4A-

LBH AS1

RNASEH

MVD

1

RFT1 THBS3

Clorfl l4 CARNS1

ITSN1 C16orf79

SLC16A1

ATXN10

3

CTD-

WHSC1

2292P10.4.1

RP5-

PDE5A

991G20.4.1

CTC-

SAAL1

378H22.2.1

FER MUC1

RP1- AC008440.10

152L7.5.1 .1

MCM8 PCSK9

C4orf10 CTD-2547L24.3.1

CROT ICAM5

GTF2H2 MUC20

BAG2 PPP1R3C

METTL8 KLF4

SEC14L2 NINJ1

C20orf94 CLIP3

HDAC8 CTD-2517M22.14.1

PRMT3 AC022098.1

HAUS1 ABCA6

POLR3G BBC3

RP13-

CCT2

15E13.1.1

RP11-14N7.2.1 CXCL3

AC034193.5.1 RP11-369J21.5.1

CNTLN PRR15

CCDC34 C14orf45

DLX1 TMEM198

ABCC4 ZNF425

RFC5 GAL3ST1

PSMC5 SLC1A7

CSTF2 ISG15

LIG1 ATF3

SLC19A1 ICOSLG

BRCC3 ATP8A1

ZW10 LINC00324

AFAP1L1 FBXO32

CDC123 EXOC3L4

RP11-521B24.3.1 SRCIN1

LRRC58 HSF4

ALMS1 DBP

FAM111A CDH3

PUS7 AC021593.1

DYNC1H1 RNF152

EFNB2 ISG20

BCAS4 KIAA1683

RP5-

MYH9

885L7.10.1

HSPA14 CARD14

TMEM56 PIP5K1B

ZDHHC2 NCF2

KIF18A PPP1R32

CDK4 RAB17

CKAP2L CHPF

MRE11A KLHL24

PSRC1 PDGFRA

NUP88 DNAH10OS

RCC1 RP11-454H13.6.1

GEMIN4 ABCA7

C1orf74 TCP11L2

VRK1 S1PR1

PSMC1 HIST1H2BD

DOCK1 NEURL3

ALDH3A1 SYT5

GLT8D2 GPR35

KATNAL1 CARD9

SERBP1 CTD-2313N18.5.1

WDR77 LIPH

PHTF2 EFNA3

PSMB2 C3

BCAT1 MID2

STRA13 RP11-536G4.2.1

UBE2C PNPLA7

PHEX CYP7A1

ADSL TESK2

HSDL2 TNFSF15

CEP192 ZNF385C

GOT2 RP11-757G1.6.1

STIP1 FBXO2

VBP1 LRRN1

GNPNAT1 ZSCAN4

DLC1 PRSS27

FUBP1 EFNA1

HEATR1 CNNM1

TBC1D1 DDIT4

ECT2 SCN9A

GALNT1 C2

EXOG CCDC114

CTD-2366F13.1.1 RRM2P3

POLA2 AP000696.2.1

SLC25A15 PDE4C

CATSPER

NEU3

G

DOCK5 ELF5

HS3ST3A

CETN3

1

AC073321.5.

BMP4

1

CCDC72 CFB

SORL1 PARM1

UCHL5 C1orf145

ARPC4 C19orf51

FAM198B CTC-454M9.1.1

KIAA1586 CCPG1

GSTO1 CTH

AC097359.1 MALAT1

BMP2K TRIB3

SSBP1 NYAP1

C3orf67 KLF9

SMC3 LINC00176

AP1B1 IER3

ACAD9 GBP2

C2CD4C VEGFA

SLIRP SAT1

SNAPC1 BAIAP3 RP11-

RPL26L1

93B 14.5.1

MGAT5B ANKRD24

FSTL1 GAA

PARVB CASP5

TIMM23 OASL

KLF12 MLXIPL

C1QBP RP11-115C10.1.1

CCRL1 TSC22D3

SLC16A7 OPRL1

CDC6 SERPINA5

AC004381.6.1 SLCO4A1

CBFB IGSF10

APAF1 S100P

PRKAR1B EPAS1

FAM72D SCARF1

FOXRED2 TNFSF10

TMEM209 ECT2L

CEP76 PSD

DLG5 CHAC1

AC003665.1.

ANG

1

ACTL6A CA9

PDP1 PIK3IP1

C11orf51 FLRT3

RP11-

RPL39L

736K20.5.1

NUBPL CEACAM22P

SFPQ MYO15B

EME1 DPEP1

FANCM KLK10

HMBS ADM2

HNRNPR ACCN3

HDAC9 LDB3

HECW1 FER1L4

GCSH CEBPD

LAPTM4B TMEM105

GDF11 CCDC40

RP3-

NAV2

395C13.1.1

NR2F2 ADCY4

INO80C NKPD1

PPAT SLC6A9

UMPS C13orf33

CACNB4 SERPINA3

PLCL2 ACSS2

PRMT5 SLC2A6

ACADSB EGR1

INCENP TSPEAR-AS1

LARS2 MAPK15

RP11-

ZNF714

178D12.1.1

CCDC86 FOS

PSMD2 ANGPTL1

LRRC20 ZNF467

ROCK2 RP11-712B9.2.1

C12orf48 PRSS35

TNPO1 NHLRC4

PPAP2B SLC6A12

C14orf126 ICAM1

CDC42BPA C2CD4A

BCCIP GPR132

PSIP1 STRC

LSM3 C6orf223

MALT1 NFE2

CTD-

AAK1

2319112.1.1

CABYR FAM167B

RBM12 PNCK

CYCS TPPP3

SFMBT1 ITGA10

MPHOSPH9 ODF3B

SEH1L WFDC10B

DCAF13 GRIP2

TIMM10 TXNIP

LYRM1 NUPR1

FAM171A1 ARRDC4

CENPH SSPO

RP11-117L6.1.1 TBX4

BRIX1 LAMP3

PEG10 SLCO4C1

THOC7 HPN

CDCA7L FAM132A

HPRT1 GAL3ST2

CEP170 NGFR

SF3A3 C2CD4B

PMVK RANBP3L

HSPD1 C17orf28

TDP1 C21orf90

DNMT3B MYO15A

CKAP2 LRRC4C

C12orf24 PLCH2

ANKRD28 HSPA6

CKAP5 IGFALS

POLR1E KLB

DCUN1D5 AC005013.5.1

ORC1 CSF3R

KCTD1 RYR1

PSMA7

CCDC134

PSMC2

PLS3

PIGN

CTD-2224J9.2.1

NUP153

ME3

PITPNM3

ABCD3

IQCC

DCLRE1B

USP1

AC108463.2.1

UPF2

SSRP1

ALG8

STARD13

NEK2

FCF1

GNG12

MASTL

TBC1D5

AP3B1

CBX3

PGRMC1

ATG3

POLH

PDCD11

RP11-666A20.1.1

ZDHHC23

RGS17

FAM72A

SEPHS1

FAM122B

VOPP1

PSMA3

FLVCR2. 1

LIN9

MANEA

FAM208B

GTPBP4

TTI1

CCDC88A

FAM48A

TMSB15B

SELRC1

RIMKLB

WDR17

ODZ3

CNOT1

SMCHD1

RDX

POLD3

LAS1L

CLTC

PPP1R14B

GEMIN5

AGK

C10orf125

SMS

PDAP1

LYPD6

RP11-290F20.1.1

SLC36A1

TRAIP

RP6-65G23.3.1

PPIA

RP11-85G18.4.1

KCNC4

GDAP1

CABLES1

LGALS1

RPP30

PI4K2B

DSN1

ZNF620

USP6NL

C9orf100

CAV2

PTCD2

KIAA1147

TOP2B

TXNRD1

MYL6

RNASEH2A

RBFOX2

CCT6A

AMMECR1

DLD

BAI2

ORC5

TIMM8A

BEND6

TSC22D2

NAP1L5

ATIC

SIGLEC15

AP4S1

BAX

TTK

WDR12

TFAM

GPRIN1

PRDX1

GEMIN6

MMACHC

JUN

ERC1

EPHB2

XPO6

KIF24

SRGAP1

AURKB

TRRAP

GPD2

SCFD2

SMAP2

GSS

[00286] Supplementary Data 41: Comparison of overlaps between recurrent DE genes across all samples (vs. A38Per) and DMSO vs. 6AN treated A38Lg cells

GTPBP4 DRAM1 ABHD14B ARAP2

TUBA4A TRIM2 PDK1 ANKRD22

SMCHD1 NFKBIZ C9orf150 CDS1

FOXM1 ADAM8 MLLT3 ALS2CL

SMARCC1 TRIB2 SSBP2 GALNT6

TFDP1 GSDMB C3orf23 KRT15

XPO6 SGK1 DDT XDH

TOP2B ARHGEF16 SARM1 DGAT2

HEATR1 LRCH4 OSBPL6 FBP1

ODC1 HERC5 LEPREL2 CLDN10

ZWINT PLEKHM1 EEF1A1P5 RHOD

ATXN10 ASAH1 HCN2 PRSS3

NUP155 TMEM8A VLDLR CTSL2

CDCA7L PRICKLE4 BAMBI SNAI2

SNRPB NEU1 USE1 GALNT12

GNPNAT1 TNFSF15 CGREF1 P2RY2

WDR3 PLAUR ATHL1 RAG1

HJURP AC103810.2 GPR37 PADI3

KNTC1 RHBDF1 DERL3 TESC

REV3L HSD17B11 PLK3 SELL

CDH2 ROBO4 SLC29A4 GPR116

FARSB EREG TMEM234 GCHFR

CCDC88A VWA1 RP11-66N24.3.1 RP11-314P12.3.1

NETO2 TJP3 CORO6 CHST4

NUP153 PDCD4 C9orf37 SHANK2

USP1 PTPRE EEF1A1P6 DNAH10

MCM6 C15orf48 DNAJC3-AS1 KRT23

POLD2 TPBG C3orf78 COL1A1

SF3A3 TNFAIP3 MCEE NMNAT2

TYMS PROS1 RP11-480A16.1.1 AKAP6

KIAA1324 GPR108 ZNF70 AADAC

TMEM48 AGFG2 NAT6 VEPH1

TCOF1 RHBDD2 SEMA3F C7orf58

PHIP DGAT1 VKORC1 DNAH12

BRIX1 HBP1 RP5-1103G7.4.1 SLC22A20

UTP20 RNF103 P2RX6 MCC

CTPS TSC22D3 C10orf102 RP11-7K24.3.1

RANBP1 CAPS THTPA ANKRD56

SACS SLC2A6 FTX CTD-2021J15.2.1

DTL ABCA7 AC002472.8.1 ARL14

CDC20 TMC4 RP11-243J18.3.1 C15orf62

AP1B1 FAM193B CTD-307407.5.1 METTL7B

CDC6 VPS28 RP11-574K11.20.1 SLC4A3

ABCD3 ARSD CES4A KLHL13

EBNA1BP2 NCF2 METTL12 RP11-357H14.19.1

PHTF2 MMP28 RP11-390P2.4.1 FGFR3

TRIP13 PARP10 RP11-216F19.1.1 PKDCC

RUVBL1 LZTS2 RP11-85K15.2.1 RP11-157P1.4.1

RCC1 ANKRD13D SEPT7L SMO

MYBL2 DNASE2 CD22

CHML LYNX1 SYTL5

CCDC165 APOBEC3G TNNT2

SEH1L CDH3 ADAMTS14

WDR12 AQP3 ANXA10

TMEM56 NYNRIN TTC3P1

ERC1 ABCG1 NMU

RFC3 SH3YL1 RP11-597D13.9.1

DEPDC1 FBXO2 UNC5C

RP1-239B22.1.1 PPP1R16A AADAT

BMP2K SESN3 GRHL1

NT5DC2 TMPRSS3 TMEM169

MPHOSPH9 HEATR7A CYP24A1

HDAC9 KYNU RP11-448G15.3.1

STMN3 METTL7A ASB9

MRTO4 FA2H NAALADL2

CDCA8 ZC3H12A FAM46B

S100A2 CFI CFTR

PI4K2B SLC15A3 C16orf74

MCM8 SIGIRR RP11-41612.1.1

GTF2H3 FER1L4 PLAC1

JUN SLC39A11 ZNF658

TTF2 FOXQ1 SP6

FAM171A1 RASSF5 TTN

FKBP5 DHRS3 C8orf46

ADSL ARHGEF3 GPR65

DIAPH3 S100P LEF1

KIAA0586 ZG16B RP11-6F2.4.1

POLQ PTK6 DENND2A

EXO1 CDA

EEF1E1 IL32

ODZ3 ZNF862

SSX2IP CD14

CEP97 TMED1

ABCC4 G0S2

RAD51AP1 ENDOV

MRPL47 SMAD6

C9orf140 MIA

GEMIN5 CACNB1

FANCD2 ARL4D

CCNE2 JAK2

UMPS KCNMB4

BARD1 IRAK2

USP13 RELB

CENPV CEBPB

SMTN MAFF

PRPS1 C9orf16

CCNF RALGPS1

RFC5 LTB4R

ATP5G1 CSF2RA

POLR3G C10orf32

CBX2 SAT2

NOP16 ALPPL2

MCM10 CCDC69

SLC16A7 NR1H3

POP1 GRAMD1C

CLSPN FAM113B

PFAS WASH7P

KLF12 TGFBR3

GEMIN4 YPEL3

C3orf26 LRRN1

GMNN MAST3

CENPN ABC7-42389800N19.1.1

KIF15 ZNF467

GDAP1 XAF1

THOC7 C16orf7

OPN3 RP3-395M20.8.1

C22orf29 DNAH2

BAG2 ADCY4

ERCC2 CDC42EP5

JAM3 ITGAX

DERA ANKRD42

GPRIN1 NLRP1

E2F2 AC021593.1

CDC45 N4BP2L1

ARRB2 CLDN15

NCS1 PIWIL4

SFMBT1 S1PR1

RAD54L HLA-H

C1orf112 NEURL3

PRKAR1B P2RX4

FANCM AC007283.5.1

CENPL CA11

SELRC1 DPEP1

DOCK10 TBC1D3F

ORC1 IER5L

ALDH1B1 PER1

BCL2 ZSWIM4

PSMC3IP ITGA10

SNRNP25 IL15

PRIM1 SERPINA3

HSPA4L RTP4

SUV39H2 GLI4

MAGOHB C17orf103

HMGB1P5 SIX5

RP11-14N7.2.1 SNPH

WDR77 PIP5K1B

CENPH TFEB

NME1 IDUA

LRRC8C TMEM102

AC027612.6.1 SPRY3

METTL8 FMO5

POLR1E MTRNR2L9

HOMER1 VSIG10L

RGS17 GIMAP2

DCLRE1B PLCH2

POLE2 LRRC6

CENPW LRFN3

SKA1 RILPL2

ADAT2 TNFSF12

CCDC41 CROCC

WDR4 HLA-K

CCDC18 PRICKLE3

PDSS1 CXCL3

C12orf24 NOS3

PARVB FBXO6

MMP24 CCDC146

C14orf126 CSF2

SLC25A15 CEP85L

NEGR1 GEMIN8

CAP2 ZNF628

OIP5 CXCL2

KIF5C ZDHHC1

ARHGAP11B NR3C2

FAM86A RP11-285F7.2.1

TBC1D7 PLEKHF1

CEP128 RP11-403113.8.1

DLEU2 B3GALT4

BOLA3 NEIL1

COQ3 KNDC1

C17orf89 PON3

ARHGEF26 CFB

GDF11 ENDOU

ACTR3B FOXO4

ASRGL1 DDO

XYLB BBC3

RP1-140K8.5.1 KB-1460A1.5.1

SFXN2 SLC16A13

TIMM8A GAS7

MNS1 RNF152

GTF2H2 ABCA6

AC009948.5.1 SLC16A6

ZNF239 RENBP

RP11-253E3.3.1 KIAA1683

ASTN2 BBS12

CHAC2 CACNA2D4

CSMD3 UPK3B

RASSF2 AP001372.2.1

CCDC138 ZNF517

C1orf74 ACYP2

CCDC134 ARHGAP25

CABYR PRX

RP6-65G23.3.1 CES3

PPIAP29 ACCN3

CYB5RL C19orf51

RP11-521B24.3.1 IL4I1

RP1-152L7.5.1 VMAC

C20orf94 TNFRSF9

AL357673.1 RP11-757G1.6.1

ISPD NFAM1

ELOVL2 RP11-353B9.1.1

RBM24 PDZD7

RP11-204J18.3.1 CARD14

DPF1 NYAP1

RP11-381E24.1.1 CD34

RP11-1334A24.4.1 RP11-325F22.3.1

MIR621 EXOC3L4

AC092329.1 RP5-1182A14.3.1

RP3-324017.4.1 JAK3

CTD-2574D22.2.1 WDFY3-AS2

PDE1A PDGFRA

RP11-33N16.1.1 MLXIPL

RP11-618G20.2.1 KLHDC1

MIR29C

RP11-712B9.2.1

CTD-2292P10.4.1

PCSK4

CLDN9

PODN

COL11A2

RP11-65J3.1.1

PPP1R32

EFHC2

TMEM105

AC005152.2.1

CASP5

TBX6

DNAH10OS

SCARF1

RP11-420G6.4.1

ROM1

PYGM

SLCO4C1

CCDC114

FAM71E1

RP11-263K19.6.1

CTD-2341M24.1.1

PCDHAC1

CLIP3

C7orf63

C17orf108

ECEL1P2

LRRC24

ZNF385C

TPPP3

RP11-108M9.4.1

ANKRD24

SRCRB4D

MIR29B2

DNAH7

RP11-454H13.6.1

RP11-369J21.5.1

TMPRSS5

PTPRCAP

BCO2

C2orf81

FAM66C

LINC00176

GPR132

SLC5A12

PDE4C

ICAM5

C20orf195

FBXO24

NGFR

TNFSF13B

MST1P9

CH25H

CTC-523E23.1.1

NFE2

CTC-378H22.2.1

FAM132A

GBP5

HOXD4

UCN

NKPD1

ELF5

LINC00324

RP11-115C10.1.1

ANGPTL1

RP11-536G4.2.1

FAM66D

RP4-541C22.5.1

HSPA6

CTD-2313N18.5.1

CTD-2547L24.3.1

CYP7A1

IGFALS

RRM2P3

U7

[00287] Although the invention has been described with reference to the above example, it will be understood that modifications and variations are encompassed within the spirit and scope of the invention. Accordingly, the invention is limited only by the following claims.