

Title:
ARID5B OVEREXPRESSION IN INFLAMMATORY DISEASES
Document Type and Number:
WIPO Patent Application WO/2018/195522
Kind Code:
A1
Abstract:
Disclosed herein are compositions and methods for diagnosing and treating inflammatory diseases based on the disclosed overexpression and hypomethylation of ARID5B that is associated with an increased risk of inflammatory disease. Therefore, methods are disclosed that involve assaying samples from subjects for ARID5B gene expression, or for methylation of an ARID5B CpG site that is negatively correlated with ARID5B gene expression. Also disclosed are methods that involve administering to the subject a composition comprising an ARID5B inhibitor in an amount effective to reduce endogenous expression levels of ARID5B in the subject.

Inventors:
LIU YONGMEI (US)
Application Number:
PCT/US2018/028747
Publication Date:
October 25, 2018
Filing Date:
April 21, 2018
Assignee:
UNIV WAKE FOREST HEALTH SCIENCES (US)
International Classes:
C12Q1/68; A61K39/395; C07H21/04; C40B40/08; G01N33/48; G01N33/50; G01N33/564
Domestic Patent References:
WO2016105535A2 2016-06-30
WO2016049024A2 2016-03-31
WO2015124921A1 2015-08-27
WO2014036314A2 2014-03-06
Foreign References:
US20020182586A1 2002-12-05
US20090130096A1 2009-05-21
Other References:
LIU, Y ET AL.: "Transcriptomics and Methylomics of Atherosclerosis in Circulating Monocytes - the Multi-Ethnic Study of Atherosclerosis. Abstract 53", CIRCULATION, vol. 131, no. 1, 10 March 2015 (2015-03-10), pages 1
REYNOLDS, LM ET AL.: "Age-related variations in the methylome associated with gene expression in human monocytes and T cells", NATURE COMMUNICATIONS, vol. 5, 18 November 2014 (2014-11-18), pages 1 - 8, XP055428472
LIU, Y ET AL.: "Blood monocyte transcriptome and epigenome analyses reveal loci associated with human Atherosclerosis", NATURE COMMUNICATIONS, vol. 8, no. 1, 30 August 2017 (2017-08-30), pages 1 - 12, XP055550825
Attorney, Agent or Firm:
GILES, P. Brian (US)
Claims:
WHAT IS CLAIMED IS:

1. A method, comprising obtaining a sample from a subject having or suspected of having an inflammatory disease, and assaying the sample for methylation of an ARID5B CpG site affecting ARID5B gene expression.

2. The method of claim 1, wherein the CpG methylation site comprises a CpG site found in the nucleic acid sequence (SEQ ID NO:1).

3. The method of claim 2, wherein the CpG methylation site comprises cg25953130.

4. A method for diagnosing or prognosing an inflammatory disease in a subject, comprising assaying a sample from the subject for ARID5B expression levels, wherein detection of elevated ARID5B expression levels compared to a control is an indication that the subject has an inflammatory disease.

5. The method of claim 4, wherein the ARID5B expression levels are assayed by assaying for methylation of an ARID5B CpG site that negatively correlates with ARID5B gene expression.

6. The method of claim 5, wherein the CpG methylation site is found in the nucleic acid sequence SEQ ID NO:1.

7. The method of claim 6, wherein the CpG methylation site comprises cg25953130.

8. The method of any one of claims 4 to 6, wherein the inflammatory disease is diabetes mellitus type 2 or a cardiovascular disease.

9. The method of claim 8, wherein the cardiovascular disease comprises atherosclerosis.

10. A method for treating an inflammatory disease in a subject, comprising

obtaining a sample from a subject having or suspected of having an inflammatory disease,

assaying the sample for ARID5B expression levels, and

treating the subject for an inflammatory disease if the ARID5B expression levels are elevated compared to a control.

11. The method of claim 10, wherein the ARID5B expression levels are assayed by assaying for methylation of an ARID5B CpG site that negatively correlates with ARID5B gene expression.

12. The method of claim 11, wherein the CpG methylation site is found in the nucleic acid sequence SEQ ID NO:1.

13. The method of claim 12, wherein the CpG methylation site comprises cg25953130.

14. The method of any one of claims 10 to 13, wherein the subject is at high risk for cardiovascular disease, wherein the subject is treated for cardiovascular disease if ARID5B expression levels are elevated.

15. The method of any one of claims 10 to 13, wherein the subject is at high risk for diabetes mellitus type 2, wherein the subject is treated for diabetes if ARID5B expression levels are elevated.

16. The method of claim 14 or 15, wherein the subject is treated with an antiinflammatory agent if ARID5B expression levels are elevated.

17. A method for treating a subject with an inflammatory disease, comprising administering to the subject a composition comprising an ARID5B inhibitor in an amount effective to reduce endogenous expression levels of ARID5B in the subject.

18. The method of claim 17, wherein the ARID5B inhibitor is a gene silencing agent.

19. The method of claim 18, wherein the gene silencing agent comprises an antisense oligonucleotide, siRNA, or shRNA.

20. The method of claim 17, wherein the ARID5B inhibitor is a CRISPR-Cas system that methylates an ARID5B CpG site that negatively correlates with ARID5B gene expression.

21. The method of claim 20, wherein the CpG methylation site is found in the nucleic acid sequence SEQ ID NO:1.

22. The method of claim 21, wherein the CpG methylation site comprises cg25953130.

23. The method of any one of claims 17 to 22, wherein the inflammatory disease is diabetes mellitus type 2 or a cardiovascular disease.

24. The method of claim 23, wherein the cardiovascular disease comprises atherosclerosis.

25. The method of any one of claims 17 to 24, wherein prior to treatment the method further comprises detecting elevated ARID5B expression levels in a sample from the subject.

26. The method of claim 25, wherein the ARID5B expression levels are detected by assaying for methylation of an ARID5B CpG site that negatively correlates with ARID5B gene expression.

Description:
ARID5B OVEREXPRESSION IN INFLAMMATORY DISEASES

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims benefit of U.S. Provisional Application No. 62/488,320, filed April 21, 2017, which is hereby incorporated herein by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with Government Support under Grant No. HL101250 and HL135009 awarded by the National Institutes of Health. The Government has certain rights in the invention.

BACKGROUND

Despite improvements in prevention and therapy of atherosclerotic cardiovascular disease (CVD), it remains the leading cause of death in the developed world. Although traditional risk factors for atherosclerosis are established, there is substantial unexplained variation in the atherosclerosis phenotype. Most acute coronary syndromes and ischemic strokes are caused by rupture-prone vulnerable plaques; however, the risk factors for plaque vulnerability are largely unknown. Genome-wide association studies (GWAS) have identified a number of susceptibility variants for CVD; however, they explain only a small percent of CVD risk. The available therapies continue to primarily target hyperlipidemia and prevention of thrombosis. To develop effective therapeutic approaches for primary and secondary CVD prevention, a better understanding of the molecular mechanisms and pathophysiology of the disease is needed.

SUMMARY

Disclosed herein are compositions and methods for diagnosing and treating inflammatory diseases based on the disclosed overexpression and hypomethylation of ARID5B that is associated with an increased risk of inflammatory disease.

Therefore, a method is disclosed that involves obtaining a sample from a subject, such as a subject having or suspected of having an inflammatory disease, and assaying the sample for ARID5B gene expression, or for methylation of an ARID5B CpG site that is negatively correlated with ARID5B gene expression.

In some embodiments, the CpG methylation site is one or more of the underlined CpG sites found in the nucleic acid sequence: ATAAATATTYGGTGTTATAATAGAGTAGTAATTAAATGGAAATTTTAATATAATTGTTAATAGTAGTAAGAATTGGAATAGYGTTAGGTATATAAAGTTGATGTTTTGTYGTTGGTTGGTTGTTAGGAGTATAGTGA (SEQ ID NO:1). For example, the CpG methylation site can be cg25953130.
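As an illustration only, the candidate CpG positions in SEQ ID NO:1 can be located programmatically. The sketch below assumes that "Y" is the IUPAC code for a pyrimidine (C or T), as would be expected for a bisulfite-converted sequence in which a CpG cytosine reads as C when methylated and T when unmethylated; the text does not state this. The function name is hypothetical, and because the underlining of the original sequence listing is not reproduced here, the sketch does not indicate which site corresponds to cg25953130.

```python
# SEQ ID NO:1 as printed in the text, split into three pieces for readability.
SEQ_ID_NO_1 = (
    "ATAAATATTYGGTGTTATAATAGAGTAGTAATTAAATGGAAATTTTAATATA"
    "ATTGTTAATAGTAGTAAGAATTGGAATAGYGTTAGGTATATAAAGTTGATGTTTTG"
    "TYGTTGGTTGGTTGTTAGGAGTATAGTGA"
)


def candidate_cpg_offsets(sequence: str) -> list[int]:
    """Return the 0-based offset of every 'YG' dinucleotide.

    Assumption (not from the source): 'Y' marks a cytosine whose
    methylation state is being assayed, so each 'YG' is a candidate CpG.
    """
    # Drop any whitespace left over from line wrapping and normalize case.
    seq = "".join(sequence.split()).upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "YG"]


offsets = candidate_cpg_offsets(SEQ_ID_NO_1)
print(f"{len(offsets)} candidate CpG sites at offsets {offsets}")
```

The offsets are relative to the start of SEQ ID NO:1, not to genomic coordinates.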

Also disclosed is a method for diagnosing or prognosing an inflammatory disease in a subject that involves assaying a sample from the subject for ARID5B expression levels, wherein detection of elevated ARID5B expression levels compared to a control is an indication that the subject has an inflammatory disease. Since methylation of the disclosed ARID5B CpG sites negatively correlates with ARID5B gene expression, in some embodiments, ARID5B expression levels are assayed by assaying for methylation of the ARID5B CpG sites.
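The two ideas in this paragraph, a negative correlation between CpG methylation and expression and a comparison of a sample against a control, can be sketched as follows. This is a minimal illustration under stated assumptions: the numeric values are made-up toy data, and the rule for "elevated compared to a control" (sample level above the mean control level times a fold factor) is a hypothetical choice, since the text does not define one.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def is_elevated(sample_level, control_levels, fold=1.0):
    """Hypothetical rule: sample exceeds fold x the mean control level."""
    control_mean = sum(control_levels) / len(control_levels)
    return sample_level > fold * control_mean


# Toy values for illustration only; they are not data from the disclosure.
methylation = [0.80, 0.70, 0.60, 0.50, 0.40]   # fraction methylated
expression = [1.0, 1.5, 2.0, 2.5, 3.0]          # normalized expression
r = pearson_r(methylation, expression)
print(f"Pearson r = {r:.2f}")                   # negative: lower methylation, higher expression
```

In practice the control definition and any cutoff would have to be established from an appropriate reference population.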

The inflammatory disease of the disclosed methods can be selected from the group comprising diabetes mellitus type 2 or a cardiovascular disease (e.g. atherosclerosis or stroke).

Also disclosed is a method for treating an inflammatory disease in a subject that involves obtaining a sample from a subject having or suspected of having an inflammatory disease, assaying the sample for ARID5B expression levels, and treating the subject for an inflammatory disease if the ARID5B expression levels are elevated compared to a control.

For example, in some embodiments the subject presents with a high risk for cardiovascular disease (CVD), e.g. based on family history, obesity, diabetes, hyperlipidemia, hypertension, or prior incidence of CVD. In these embodiments, the subject can be treated for CVD if ARID5B expression levels are elevated. Treatments for cardiovascular disease include lifestyle changes (e.g. diet and exercise), statins, and/or blood thinners. In some embodiments, the subject is treated with an anti-inflammatory agent if ARID5B expression levels are elevated.

In some cases, the choice of treatment is based on the level of ARID5B expression. In particular, the disclosed methods use ARID5B expression levels to stratify subjects based on disease risk. Subjects determined to be at higher risk based on elevated ARID5B expression can be treated with standard of care for that risk, including the use of aggressive treatments earlier than would otherwise be used. For example, all subjects with CVD should be recommended lifestyle modification. As ARID5B levels elevate, the subject can be treated with a statin. As levels elevate further, the subject can be further treated with ezetimibe and/or PCSK9 inhibitors. As levels elevate even further, the subject can be further treated with a PCSK9 antibody.

In some embodiments the subject presents with a high risk for diabetes or pre-diabetes, e.g. based on family history, weight, blood glucose levels, and/or prior incidence of diabetes. In these embodiments, the subject is treated for diabetes or pre-diabetes if ARID5B expression levels are elevated. Treatments for diabetes include lifestyle changes (diet and exercise), metformin, insulin, sulfonylurea, GLP-1 agonists, thiazolidinedione, glinide, SGLT2 inhibitors, DPP-4 inhibitors, alpha-glucosidase inhibitors, and pramlintide. In some embodiments, the subject is treated with an anti-inflammatory agent if ARID5B expression levels are elevated. In some cases, the choice of treatment is based on the level of ARID5B expression.
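The stepwise CVD escalation described above can be expressed as a simple tiering function. This is a sketch only: the function name and the three cutoff values are hypothetical placeholders, because the text describes the ordering of treatments as ARID5B levels rise but does not give numeric thresholds.

```python
def cvd_treatment_plan(arid5b_level, cutoffs):
    """Return the cumulative list of treatments for a given ARID5B level.

    cutoffs: three strictly increasing, hypothetical thresholds
    (statin, ezetimibe/PCSK9 inhibitor, PCSK9 antibody). The source
    gives the order of escalation but no numeric values.
    """
    statin_cut, second_cut, third_cut = cutoffs
    if not statin_cut < second_cut < third_cut:
        raise ValueError("cutoffs must be strictly increasing")

    # Lifestyle modification is recommended for all subjects with CVD.
    plan = ["lifestyle modification"]
    if arid5b_level >= statin_cut:
        plan.append("statin")
    if arid5b_level >= second_cut:
        plan.append("ezetimibe and/or PCSK9 inhibitor")
    if arid5b_level >= third_cut:
        plan.append("PCSK9 antibody")
    return plan


# Placeholder cutoffs on an arbitrary normalized expression scale.
example_cutoffs = (1.0, 2.0, 3.0)
print(cvd_treatment_plan(2.5, example_cutoffs))
```

Each tier adds to the previous one rather than replacing it, which mirrors the "further treated with" wording of the paragraph.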

In some embodiments, ARID5B expression levels are used to screen subjects for the presence or risk of an inflammatory disease, such as CVD or diabetes, before the subject presents with any clinical symptoms. In these embodiments, detection of elevated ARID5B expression levels can be an indication that the subject should be screened for an inflammatory disease, such as CVD or diabetes. In some cases, the subject is directed to make lifestyle changes (e.g. diet and exercise) to reduce the risk of disease onset. In some cases, the subject is treated prophylactically with antiinflammatory agents to prevent the onset of an inflammatory disease. In some cases, the subject is treated using standards of care for CVD and/or diabetes prevention, e.g. statins.

Also disclosed is a method of treating a subject with elevated ARID5B expression levels that involves administering to the subject an anti-inflammatory agent. For example, in some embodiments, the subject having elevated ARID5B expression levels is treated with an anti-IL6 drug, such as tocilizumab (Actemra), siltuximab (Sylvant), sarilumab, clazakizumab, olokizumab (CDP6038), elsilimomab, BMS-945429 (ALD518), sirukumab (CNTO 136), CPSI-2364, ARGX-109, FE301, and FM101.

Also disclosed is a method for treating a subject with an inflammatory disease that involves administering to the subject a composition comprising an ARID5B inhibitor in an amount effective to reduce endogenous expression levels of ARID5B in the subject.

In some embodiments, the ARID5B inhibitor is a gene silencing agent. As an example, the gene silencing agent can be an antisense oligonucleotide, siRNA, or shRNA. In some embodiments, the ARID5B inhibitor is a CRISPR-Cas system that methylates a disclosed ARID5B CpG site that negatively correlates with ARID5B gene expression. In some embodiments, prior to treatment, the method further comprises detecting elevated ARID5B expression levels in a sample from the subject. Since methylation of the disclosed ARID5B CpG sites negatively correlates with ARID5B gene expression, in some embodiments, ARID5B expression levels are assayed by assaying for methylation of the ARID5B CpG sites.

The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF DRAWINGS

Figure 1. Transcriptomic associations with Carotid Plaque and CAC Scores in 1,208 MESA participants. The two volcano plots show effect size (β) versus -log10(p-value) for associations of each mRNA expression (for 10,989 unique mRNAs) with A) carotid plaque score and B) CAC. The red line illustrates the threshold for an FDR<0.05 (linear regression adjusting for age, sex, race, and study site). Expression of ARID5B and PDLIM7 (bolded) was associated with both carotid plaque and CAC score; green star indicates mRNA expression most significantly associated with carotid plaque or CAC. The two forest plots show the direction and effect size (β) for associations of each mRNA expression (in Panels A and B) with C) carotid plaque score and D) CAC score in the full model (including traditional CVD risk factors), overall and by the two independent studies (study 1: JHU + CO, red; study 2: UMN + WFU, blue); φ significantly different effect sizes observed by study (p_interaction<0.05). E) The forest plot shows ARID5B mRNA expression associations with carotid plaque, stratified by age (<65 years and >65 years), sex, ethnicity/race, CVD and statin use.

Figure 2. Methylomic associations with Carotid Plaque and CAC Scores in 1,208 MESA participants. The two Manhattan plots show chromosomal locations and -log10(p-value) for associations of each CpG site (for 484,817 CpG sites) with A) carotid plaque score and B) CAC score. The red line illustrates the threshold for an FDR≤0.05 (linear regression adjusting for age, sex, race, and study site). One CpG at ILVBL (bolded, green dot) was associated with both carotid plaque and CAC; green star indicates CpGs most significantly associated with carotid plaque or CAC. Notably, for ARID5B, both methylation (cg25953130, green dot) and mRNA expression were significantly associated with carotid plaque score. The two forest plots show the direction and effect size (β) for associations of each CpG site (in Panels A and B, the most significant association at each unique locus is shown) with C) carotid plaque score and D) CAC in the full model (including traditional CVD risk factors), overall and by the two independent studies (study 1: JHU + CO, red; study 2: UMN + WFU, blue); † significantly different effect sizes observed by study (p_interaction<0.05). E) The forest plot shows ARID5B cg25953130 methylation associations with carotid plaque score, stratified by age (<65 years and >65 years), sex, ethnicity/race, CVD and statin use.

Figure 3. In vivo and in silico functional analysis of ARID5B CpG cg25953130. A) ARID5B mRNA expression (y-axis; normalized value) is significantly negatively correlated (Pearson r) with methylation of a CpG (cg25953130) in 1,264 CD14+ samples from MESA participants. B) A regional association plot of ARID5B CpG methylation with carotid plaque score in MESA (y-axis: -log10(p-value), x-axis: position on chromosome (chr) 10) is shown in the top panel; the bottom panel shows that the ARID5B expression-associated CpG (cg25953130, chr10:63,753,550, hg19, indicated by the light blue line), located in an ARID5B intron, overlaps a DNase hypersensitive site in a CD14+ sample from BLUEPRINT, histone marks indicative of a strong enhancer/promoter in a CD14+ and B cell line sample from BLUEPRINT and ENCODE (see Supplementary Fig. 3 for ChromHMM color code), as well as a transcription factor binding site for EP300, detected in a neuroblastoma cell line (SK-N-SH_RA). C) A physical interaction is detected between the ARID5B promoter and the region near the ARID5B CpG cg25953130 by both Hi-C and ChIA-PET data. The heatmap (white, low; red, high; top panel) graphically displays Hi-C interaction counts (the normalized number of contacts between a pair of loci) for the large region surrounding ARID5B in a B cell line (GM12878; reported by Rao et al. 2014); active enhancer marks (H3K27ac and H3K4me1 in GM12878 from ENCODE) are shown below; the bottom panel zooms in on the ARID5B (blue highlighted region) Hi-C interaction. There were 1937 contacts between the ARID5B promoter and the cg25953130 region. Below the Hi-C interaction is a depiction of detected ChIA-PET interactions (Chromatin Interaction Analysis by Paired-End Tag sequencing) for the region flanking ARID5B in the B cell line (GM12878; reported by Heidari et al. 2015), represented by blue curves.

Figure 4. ARID5B expression and methylation are associated with subclinical and clinical CVD risk. Odds ratio of subclinical and clinical CVD (in the full model with adjustment of traditional CVD risk factors and statin use) by tertiles of A) ARID5B mRNA expression and B) ARID5B methylation (cg25953130).

Figure 5. siRNA Knockdown of ARID5B alters immune/inflammatory response and lipid metabolism genes. After 3 h of LPS treatment (100 ng ml-1) of THP1-monocyte ARID5B knockdown samples compared to control samples (scrambled siRNA), A) the transcriptome (N=8 per group) was enriched with the inflammatory response genes including the listed bio-functions and canonical pathways (enrichment FDR<0.05 from IPA); proportion of genes upregulated shown as red, downregulated shown as blue. B) Relative IL1A levels (mean ± standard deviation) of mRNA expression in cells (n=3 per group, by RT-PCR) and protein expression in culture media (n=3 per group, by ELISA) decreased. C) Cell migration (mean ± standard deviation, n=10 and 4 per group in experiments 1 and 2, respectively), and phagocytosis (mean ± standard deviation, n=8, 4, and 4 per group in experiments 1, 2, and 3, respectively) were inhibited.

Figures 6A and 6B. Histograms of subclinical CVD phenotypes.

Figures 7A to 7D. Reproducibility of Microarray mRNA Expression and DNA Methylomic Data Over Time. A) Correlation of the repeated measurements of transcriptomic profiles for one subject between two visits 5 months apart; B) correlation of transcriptomic profiles between two subjects at visit 1; C) correlation of the repeated measurements of methylomic profiles for one subject between two visits 5 months apart; D) correlation of methylomic profiles between two subjects at visit 1.

Figure 8. Chromatin Map of Monocytes. Six chromatin states (top panel) predicted using ChromHMM and four CD14+ histone marks (from BLUEPRINT) are enriched with various genomic features (middle panel); annotation of the six chromatin states is provided (bottom panel).

Figures 9A and 9B. siRNA Knockdown of ARID5B alters immune/inflammatory response and lipid metabolism genes. A) Relative ARID5B mRNA expression levels in human THP1-monocytes after treatment with two ARID5B siRNAs individually and in combination. Levels shown are the average (±S.E.M.). After 3 h of LPS treatment (100 ng ml-1), the THP1-monocyte transcriptome in 8 ARID5B knockdown samples compared to 8 control samples (scrambled siRNA) was enriched with inflammatory response genes; B) heatmap shows mRNA expression of selected key inflammatory response genes (including cytokine, interferon signaling, and antigen processing and presentation genes) in the 8 scrambled siRNA samples compared to 8 siARID5B samples.

Figures 10A and 10B. Time Course of LPS-induced TNF and IL1A Gene Expression in Human THP1 monocytes. Gene expression of (A) TNF and (B) IL1A, in LPS (100 ng/ml) stimulated THP1 cells.

Figures 11A and 11B are bar graphs showing diabetes (Fig. 11A) and subclinical CVD (Fig. 11B) risk as a function of ARID5B expression with overweight/obesity or impaired glucose tolerance/diabetes status.

DETAILED DESCRIPTION

The disclosed subject matter can be understood more readily by reference to the following detailed description, the Figures, and the examples included herein.

Before the present disclosure is described in greater detail, it is to be understood that this disclosure is not limited to particular embodiments described, and as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.

Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the disclosure. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present disclosure, the preferred methods and materials are now described.

All publications and patents cited in this specification are cited to disclose and describe the methods and/or materials in connection with which the publications are cited. All such publications and patents are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference. Such incorporation by reference is expressly limited to the methods and/or materials described in the cited publications and patents and does not extend to any lexicographical definitions from the cited publications and patents. Any lexicographical definition in the publications and patents cited that is not also expressly repeated in the instant application should not be treated as such and should not be read as defining any terms appearing in the accompanying claims. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present disclosure is not entitled to antedate such publication by virtue of prior disclosure. Further, the dates of publication provided could be different from the actual publication dates that may need to be independently confirmed.

As will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete components and features which may be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present disclosure. Any recited method can be carried out in the order of events recited or in any other order that is logically possible.

It is understood that the disclosed methods and systems are not limited to the particular methodology, protocols, and systems described as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims.

Definitions

Unless otherwise expressly stated, it is in no way intended that any method or aspect set forth herein be construed as requiring that its steps be performed in a specific order. Accordingly, where a method claim does not specifically state in the claims or descriptions that the steps are to be limited to a specific order, it is in no way intended that an order be inferred, in any respect. This holds for any possible non-express basis for interpretation, including matters of logic with respect to arrangement of steps or operational flow, plain meaning derived from grammatical organization or punctuation, or the number or type of aspects described in the specification.

As used in the specification and the appended claims, the singular forms "a," "an" and "the" include plural referents unless the context clearly dictates otherwise.

The word "or" as used herein means any one member of a particular list and can also include any combination of members of that list.

Ranges can be expressed herein as from "about" one particular value, and/or to "about" another particular value. When such a range is expressed, a further aspect includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent "about," it will be understood that the particular value forms a further aspect. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint. It is also understood that there are a number of values disclosed herein, and that each value is also herein disclosed as "about" that particular value in addition to the value itself. For example, if the value "10" is disclosed, then "about 10" is also disclosed. It is also understood that each unit between two particular units is also disclosed. For example, if 10 and 15 are disclosed, then 11, 12, 13, and 14 are also disclosed.

As used herein, the terms "optional" or "optionally" mean that the subsequently described event or circumstance can or cannot occur, and that the description includes instances where said event or circumstance occurs and instances where it does not.

As used herein, the term "subject" refers to the target of administration, e.g., an animal. Thus, the subject of the herein disclosed methods can be a vertebrate, such as a mammal. The subject of the herein disclosed methods can be a human. The term does not denote a particular age or sex. Thus, adult and newborn subjects, as well as fetuses, whether male or female, are intended to be covered. In one aspect, the subject is a patient. A patient refers to a subject afflicted with a disease or disorder, such as, for example, cancer and/or aberrant cell growth. The term "patient" includes human and veterinary subjects. In an aspect, the subject has been diagnosed with a need for treatment for an inflammatory disease. In another aspect, the subject has not yet been diagnosed and the disclosed methods are used to predict a higher risk for developing an inflammatory disease.

The terms "treating", "treatment", "therapy", and "therapeutic treatment" as used herein refer to curative therapy, prophylactic therapy, or preventative therapy. As used herein, the terms refer to the medical management of a subject or a patient with the intent to cure, ameliorate, stabilize, or prevent a disease, pathological condition, or disorder, such as, for example, cancer or a tumor. This term includes active treatment, that is, treatment directed specifically toward the improvement of a disease, pathological condition, or disorder, and also includes causal treatment, that is, treatment directed toward removal of the cause of the associated disease, pathological condition, or disorder. In addition, this term includes palliative treatment, that is, treatment designed for the relief of symptoms rather than the curing of the disease, pathological condition, or disorder; preventative treatment, that is, treatment directed to minimizing or partially or completely inhibiting the development of the associated disease, pathological condition, or disorder; and supportive treatment, that is, treatment employed to supplement another specific therapy directed toward the improvement of the associated disease, pathological condition, or disorder. In various aspects, the term covers any treatment of a subject, including a mammal (e.g., a human), and includes: (i) preventing the disease from occurring in a subject that can be predisposed to the disease but has not yet been diagnosed as having it; (ii) inhibiting the disease, i.e., arresting its development; or (iii) relieving the disease, i.e., causing regression of the disease.

As used herein, the terms "administering" and "administration" refer to any method of providing a composition to a subject. Such methods are well known to those skilled in the art and include, but are not limited to, intracardiac administration, oral administration, transdermal administration, administration by inhalation, nasal administration, topical administration, intravaginal administration, ophthalmic administration, intraaural administration, intracerebral administration, rectal administration, sublingual administration, buccal administration, and parenteral administration, including injectables such as intravenous administration, intra-arterial administration, intramuscular administration, and subcutaneous administration.

Administration can be continuous or intermittent. In various aspects, a preparation can be administered therapeutically; that is, administered to treat an existing disease or condition. In further various aspects, a preparation can be administered prophylactically; that is, administered for prevention of a disease or condition.

The term "contacting" as used herein refers to bringing a disclosed composition or peptide or pharmaceutical preparation and a cell, target receptor, or other biological entity together in such a manner that the compound can affect the activity of the target (e.g., receptor, transcription factor, cell, etc.), either directly; i.e., by interacting with the target itself, or indirectly; i.e., by interacting with another molecule, co-factor, factor, or protein on which the activity of the target is dependent.

As used herein, the terms "effective amount" and "amount effective" refer to an amount that is sufficient to achieve the desired result or to have an effect on an undesired condition. For example, in an aspect, an effective amount of the polymeric nanoparticle is an amount that kills and/or inhibits the growth of cells without causing extraneous damage to surrounding non-cancerous cells. For example, a "therapeutically effective amount" refers to an amount that is sufficient to achieve the desired therapeutic result or to have an effect on undesired symptoms, but is generally insufficient to cause adverse side effects. The specific therapeutically effective dose level for any particular patient will depend upon a variety of factors including the disorder being treated and the severity of the disorder; the specific composition employed; the age, body weight, general health, sex and diet of the patient; the time of administration; the route of administration; the rate of excretion of the specific compound employed; the duration of the treatment; drugs used in combination or coincidental with the specific compound employed and like factors well known in the medical arts.

The term "pharmaceutically acceptable" describes a material that is not biologically or otherwise undesirable, i.e., without causing an unacceptable level of undesirable biological effects or interacting in a deleterious manner.

As used herein, the term "pharmaceutically acceptable carrier" refers to sterile aqueous or nonaqueous solutions, dispersions, suspensions or emulsions, as well as sterile powders for reconstitution into sterile injectable solutions or dispersions just prior to use. Examples of suitable aqueous and nonaqueous carriers, diluents, solvents or vehicles include water, ethanol, polyols (such as glycerol, propylene glycol, polyethylene glycol and the like), carboxymethylcellulose and suitable mixtures thereof, vegetable oils (such as olive oil) and injectable organic esters such as ethyl oleate. Proper fluidity can be maintained, for example, by the use of coating materials such as lecithin, by the maintenance of the required particle size in the case of dispersions and by the use of surfactants. These compositions can also contain adjuvants such as preservatives, wetting agents, emulsifying agents and dispersing agents. Prevention of the action of microorganisms can be ensured by the inclusion of various antibacterial and antifungal agents such as paraben, chlorobutanol, phenol, sorbic acid and the like. It can also be desirable to include isotonic agents such as sugars, sodium chloride and the like. Prolonged absorption of the injectable pharmaceutical form can be brought about by the inclusion of agents, such as aluminum monostearate and gelatin, which delay absorption. Injectable depot forms are made by forming microencapsule matrices of the drug in biodegradable polymers such as polylactide-polyglycolide, poly(orthoesters) and poly(anhydrides). Depending upon the ratio of drug to polymer and the nature of the particular polymer employed, the rate of drug release can be controlled. Depot injectable formulations are also prepared by entrapping the drug in liposomes or microemulsions which are compatible with body tissues.
The injectable formulations can be sterilized, for example, by filtration through a bacterial-retaining filter or by incorporating sterilizing agents in the form of sterile solid compositions which can be dissolved or dispersed in sterile water or other sterile injectable media just prior to use. Suitable inert carriers can include sugars such as lactose. Desirably, at least 95% by weight of the particles of the active ingredient have an effective particle size in the range of 0.01 to 10 micrometers.

As used herein, the term "RNA silencing agent" refers to an RNA which is capable of preventing complete processing (e.g., the full translation and/or expression) of a mRNA molecule through a post-transcriptional silencing mechanism. RNA silencing agents include small (<50 b.p.), noncoding RNA molecules, for example RNA duplexes comprising paired strands, as well as precursor RNAs from which such small non-coding RNAs can be generated. Exemplary RNA silencing agents include siRNAs, miRNAs, and siRNA-like duplexes, as well as precursors thereof.

As used herein, the term "small interfering RNA" ("siRNA") (also referred to in the art as "short interfering RNAs") refers to an RNA (or RNA analog) comprising between about 10-50 nucleotides (or nucleotide analogs) which is capable of directing or mediating RNA interference. Preferably, an siRNA comprises between about 15-30 nucleotides or nucleotide analogs, more preferably between about 16-25 nucleotides (or nucleotide analogs), even more preferably between about 18-23 nucleotides (or nucleotide analogs), and even more preferably between about 19-22 nucleotides (or nucleotide analogs) (e.g., 19, 20, 21 or 22 nucleotides or nucleotide analogs).

As used herein, the term "microRNA" ("miRNA"), also referred to in the art as "small temporal RNAs" ("stRNAs"), refers to a small (10-50 nucleotide) RNA which is capable of directing or mediating RNA silencing. A "natural miRNA" refers to a microRNA that occurs naturally. An "miRNA disorder" shall refer to a disease or disorder characterized by an aberrant expression or activity of a natural miRNA.

As used herein, the term "RNA interference" ("RNAi") (also referred to in the art as "gene silencing" and/or "target silencing", e.g., "target mRNA silencing") refers to a selective intracellular degradation of RNA. RNAi occurs in cells naturally to remove foreign RNAs (e.g., viral RNAs). Natural RNAi proceeds via fragments cleaved from free dsRNA which direct the degradative mechanism to other similar RNA sequences. As used herein, the term "translational repression" refers to a selective inhibition of mRNA translation. Natural translational repression proceeds via miRNAs cleaved from shRNA precursors. Both RNAi and translational repression are mediated by RISC. Both RNAi and translational repression occur naturally or can be initiated by the hand of man, for example, to silence the expression of target genes.

As used herein, the term "antisense strand" of an siRNA or RNAi agent refers to a strand that is substantially complementary to a section of about 10-50 nucleotides, e.g., about 15-30, 16-25, 18-23 or 19-22 nucleotides of the mRNA of the gene targeted for silencing. The antisense strand or first strand has sequence sufficiently complementary to the desired target mRNA sequence to direct target-specific RNA interference (RNAi), e.g., complementarity sufficient to trigger the destruction of the desired target mRNA by the RNAi machinery or process (RNAi interference) or complementarity sufficient to trigger translational repression of the desired target mRNA. The term "sense strand" or "second strand" of an siRNA or RNAi agent refers to a strand that is complementary to the antisense strand or first strand. Antisense and sense strands can also be referred to as first or second strands, the first or second strand having complementarity to the target sequence and the respective second or first strand having complementarity to said first or second strand. miRNA duplex intermediates or siRNA-like duplexes include a miRNA strand having sufficient complementarity to a section of about 10-50 nucleotides of the mRNA of the gene targeted for silencing and a miRNA* strand having sufficient complementarity to form a duplex with the miRNA strand.
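By way of non-limiting illustration, the relationship between a target mRNA site, the antisense strand, and the sense strand can be sketched in Python. This is a minimal sketch that assumes perfect Watson-Crick complementarity. The target sequence and all function names are hypothetical and are not taken from the ARID5B sequence or from any commercially available siRNA.

```python
# Illustrative sketch only: the 19-nt target site below is invented for the example.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_strand(target_mrna_site: str) -> str:
    """Return the RNA strand perfectly complementary to the site, written 5'->3'."""
    site = target_mrna_site.upper().replace("T", "U")  # accept DNA-style input
    return "".join(COMPLEMENT[base] for base in reversed(site))


def sense_strand(antisense: str) -> str:
    """Return the strand complementary to the antisense strand, written 5'->3'."""
    return antisense_strand(antisense)


def in_preferred_length_range(strand: str, low: int = 19, high: int = 22) -> bool:
    """Check the strand length against the 19-22 nucleotide range recited above."""
    return low <= len(strand) <= high


target_site = "GCAUGGACUUCAAGCUGAA"  # hypothetical target mRNA site
guide = antisense_strand(target_site)
print(guide, sense_strand(guide) == target_site, in_preferred_length_range(guide))
```

With perfect complementarity the sense strand reproduces the target site itself. The substantially complementary strands contemplated above would tolerate mismatches that this sketch does not model.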

As used herein, the term "guide strand" refers to a strand of an RNAi agent, e.g., an antisense strand of an siRNA duplex, that enters into the RISC complex and directs cleavage of the target mRNA.

The practice of the present invention will typically employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant nucleic acid (e.g., DNA) technology, immunology, and RNA interference (RNAi) which are within the skill of the art. Non-limiting descriptions of certain of these techniques are found in the following publications: Ausubel, F., et al., (eds.), Current Protocols in Molecular Biology, Current Protocols in Immunology, Current Protocols in Protein Science, and Current Protocols in Cell Biology, all John Wiley & Sons, N.Y., edition as of December 2008; Sambrook, Russell, and Sambrook, Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 2001; Harlow, E. and Lane, D., Antibodies - A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1988; Freshney, R.I., "Culture of Animal Cells, A Manual of Basic Technique", 5th ed., John Wiley & Sons, Hoboken, NJ, 2005. Non-limiting information regarding therapeutic agents and human diseases is found in Goodman and Gilman's The Pharmacological Basis of Therapeutics, 11th Ed., McGraw Hill, 2005, Katzung, B. (ed.) Basic and Clinical Pharmacology, McGraw-Hill/Appleton & Lange; 10th ed. (2006) or 11th edition (July 2009).

As used herein, the terms "inflammatory disease" and "inflammatory disorder" refer to diseases, disorders, and conditions characterized by abnormal inflammation. Examples include type 2 diabetes and cardiovascular diseases, such as atherosclerosis, stroke, and ischemic heart disease.

As used herein, the term "anti-inflammatory agent" refers to a substance or treatment that reduces inflammation or swelling. For example, the anti-inflammatory agent can be an anti-IL6 drug, such as tocilizumab (Actemra), siltuximab (Sylvant), Sarilumab, clazakizumab, olokizumab (CDP6038), elsilimomab, BMS-945429 (ALD518), sirukumab (CNTO 136), CPSI-2364, ARGX-109, FE301, and FM101. The anti-inflammatory agent can be an anti-IL1 drug, such as canakinumab or rilonacept. The anti-inflammatory agent can be a nonsteroidal anti-inflammatory drug (NSAID). Examples of anti-inflammatory agents include Alclofenac, Alclometasone Dipropionate, Algestone Acetonide, alpha Amylase, Amcinafal, Amcinafide, Amfenac Sodium, Amiprilose Hydrochloride, Anakinra, Anirolac, Anitrazafen, Apazone, Balsalazide Disodium, Bendazac, Benoxaprofen, Benzydamine Hydrochloride, Bromelains, Broperamole, Budesonide, Carprofen, Cicloprofen, Cintazone, Cliprofen, Clobetasol Propionate, Clobetasone Butyrate, Clopirac, Cloticasone Propionate, Cormethasone Acetate, Cortodoxone, Decanoate, Deflazacort, Delatestryl, Depo-Testosterone, Desonide, Desoximetasone, Dexamethasone Dipropionate, Diclofenac Potassium, Diclofenac Sodium, Diflorasone Diacetate, Diflumidone Sodium, Diflunisal, Difluprednate, Diftalone, Dimethyl Sulfoxide, Drocinonide, Endrysone, Enlimomab, Enolicam Sodium, Epirizole, Etodolac, Etofenamate, Felbinac, Fenamole, Fenbufen, Fenclofenac, Fenclorac, Fendosal, Fenpipalone, Fentiazac, Flazalone, Fluazacort, Flufenamic Acid, Flumizole, Flunisolide Acetate, Flunixin, Flunixin Meglumine, Fluocortin Butyl, Fluorometholone Acetate, Fluquazone, Flurbiprofen, Fluretofen, Fluticasone Propionate, Furaprofen, Furobufen, Halcinonide, Halobetasol Propionate, Halopredone Acetate, Ibufenac, Ibuprofen, Ibuprofen Aluminum, Ibuprofen Piconol, Ilonidap, Indomethacin, Indomethacin Sodium, Indoprofen, Indoxole, Intrazole, Isoflupredone Acetate, Isoxepac, Isoxicam, Ketoprofen, Lofemizole Hydrochloride, Lornoxicam, Loteprednol Etabonate, Meclofenamate Sodium, Meclofenamic Acid, Meclorisone Dibutyrate, Mefenamic Acid, Mesalamine, Meseclazone, Mesterolone, Methandrostenolone, Methenolone, Methenolone Acetate, Methylprednisolone Suleptanate, Morniflumate, Nabumetone, Nandrolone, Naproxen, Naproxen Sodium, Naproxol, Nimazone, Olsalazine Sodium, Orgotein, Orpanoxin, Oxandrolane, Oxaprozin, Oxyphenbutazone, Oxymetholone, Paranyline Hydrochloride, Pentosan Polysulfate Sodium, Phenbutazone Sodium Glycerate, Pirfenidone, Piroxicam, Piroxicam Cinnamate, Piroxicam Olamine, Pirprofen, Prednazate, Prifelone, Prodolic Acid, Proquazone, Proxazole, Proxazole Citrate, Rimexolone, Romazarit, Salcolex, Salnacedin, Salsalate, Sanguinarium Chloride, Seclazone, Sermetacin, Stanozolol, Sudoxicam, Sulindac, Suprofen, Talmetacin, Talniflumate, Talosalate, Tebufelone, Tenidap, Tenidap Sodium, Tenoxicam, Tesicam, Tesimide, Testosterone, Testosterone Blends, Tetrydamine, Tiopinac, Tixocortol Pivalate, Tolmetin, Tolmetin Sodium, Triclonide, Triflumidate, Zidometacin, Zomepirac Sodium.

Methods of Diagnosis

Gene Expression Assay

Methods of "determining gene expression levels" include methods that quantify levels of gene transcripts as well as methods that determine whether a gene of interest is expressed at all. A measured expression level may be expressed as any quantitative value, for example, a fold-change in expression, up or down, relative to a control gene or relative to the same gene in another sample, or a log ratio of expression, or any visual representation thereof, such as, for example, a "heatmap" where a color intensity is representative of the amount of gene expression detected. Exemplary methods for detecting the level of expression of a gene include, but are not limited to, Northern blotting, dot or slot blots, reporter gene matrix, nuclease protection, RT-PCR, microarray profiling, differential display, 2D gel electrophoresis, SELDI-TOF, ICAT, enzyme assay, antibody assay, and MNAzyme-based detection methods. Optionally a gene whose level of expression is to be detected may be amplified, for example by methods that may include one or more of: polymerase chain reaction (PCR), strand displacement amplification (SDA), loop-mediated isothermal amplification (LAMP), rolling circle amplification (RCA), transcription-mediated amplification (TMA), self-sustained sequence replication (3SR), nucleic acid sequence based amplification (NASBA), or reverse transcription polymerase chain reaction (RT-PCR).
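By way of non-limiting illustration, a fold-change and a log ratio of expression of the kind mentioned above can be computed as follows. This is a minimal sketch that assumes the expression levels are already normalized, positive, linear-scale values. The function names and the numeric inputs are hypothetical and do not represent measured ARID5B data.

```python
import math


def fold_change(sample_level: float, reference_level: float) -> float:
    """Ratio of the level in a sample to the level in a reference (control gene or sample)."""
    if sample_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive, linear-scale values")
    return sample_level / reference_level


def log2_ratio(sample_level: float, reference_level: float) -> float:
    """Log ratio of expression: positive when up, negative when down, relative to reference."""
    return math.log2(fold_change(sample_level, reference_level))


# Hypothetical values: a four-fold increase corresponds to a log2 ratio of 2.
print(fold_change(8.0, 2.0), log2_ratio(8.0, 2.0))
```

A log ratio treats increases and decreases symmetrically about zero, which is one reason it is a convenient scale for a heatmap.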

A number of suitable high throughput formats exist for evaluating expression patterns and profiles of the disclosed genes. Numerous technological platforms for performing high throughput expression analysis are known. Generally, such methods involve a logical or physical array of either the subject samples, the biomarkers, or both. Common array formats include both liquid and solid phase arrays. For example, assays employing liquid phase arrays, e.g., for hybridization of nucleic acids, binding of antibodies or other receptors to ligand, etc., can be performed in multiwell or microtiter plates. Microtiter plates with 96, 384 or 1536 wells are widely available, and even higher numbers of wells, e.g., 3456 and 9600 can be used. In general, the choice of microtiter plates is determined by the methods and equipment, e.g., robotic handling and loading systems, used for sample preparation and analysis. Exemplary systems include, e.g., xMAP® technology from Luminex (Austin, TX), the SECTOR® Imager with MULTI-ARRAY® and MULTI-SPOT® technologies from Meso Scale Discovery (Gaithersburg, MD), the ORCA™ system from Beckman-Coulter, Inc. (Fullerton, Calif.) and the ZYMATE™ systems from Zymark Corporation (Hopkinton, MA), miRCURY LNA™ microRNA Arrays (Exiqon, Woburn, MA).

Alternatively, a variety of solid phase arrays can favorably be employed to determine expression patterns in the context of the disclosed methods, assays and kits. Exemplary formats include membrane or filter arrays (e.g., nitrocellulose, nylon), pin arrays, and bead arrays (e.g., in a liquid "slurry"). Typically, probes corresponding to nucleic acid or protein reagents that specifically interact with (e.g., hybridize to or bind to) an expression product corresponding to a member of the candidate library, are immobilized, for example by direct or indirect cross-linking, to the solid support. Essentially any solid support capable of withstanding the reagents and conditions necessary for performing the particular expression assay can be utilized. For example, functionalized glass, silicon, silicon dioxide, modified silicon, any of a variety of polymers, such as (poly)tetrafluoroethylene, (poly)vinylidenedifluoride, polystyrene, polycarbonate, or combinations thereof can all serve as the substrate for a solid phase array.

In one embodiment, the array is a "chip" composed, e.g., of one of the above-specified materials. Polynucleotide probes, e.g., RNA or DNA, such as cDNA, synthetic oligonucleotides, and the like, or binding proteins such as antibodies or antigen-binding fragments or derivatives thereof, that specifically interact with expression products of individual components of the candidate library are affixed to the chip in a logically ordered manner, i.e., in an array. In addition, any molecule with a specific affinity for either the sense or anti-sense sequence of the marker nucleotide sequence (depending on the design of the sample labeling), can be fixed to the array surface without loss of specific affinity for the marker and can be obtained and produced for array production, for example, proteins that specifically recognize the specific nucleic acid sequence of the marker, ribozymes, peptide nucleic acids (PNA), or other chemicals or molecules with specific affinity.

Microarray expression may be detected by scanning the microarray with a variety of laser or CCD-based scanners, and extracting features with numerous software packages, for example, IMAGENE™ (Biodiscovery), Feature Extraction Software (Agilent), SCANLYZE™ (Stanford Univ., Stanford, CA.), GENEPIX™ (Axon Instruments).

Methylation Assay

There are many techniques for measuring DNA methylation. For example, one can use Methylation-Specific Quantitative PCR (MS-QPCR) to measure DNA methylation. (See: Eads C. A., MethyLight: a high-throughput assay to measure DNA methylation. Nucleic Acids Res. 2000 Apr. 15; 28(8): E32; Darst R. P., Bisulfite sequencing of DNA. Curr Protoc Mol Biol. 2010 July; Chapter 7: Unit 7.9.1-17; and Cottrell S. E., et al., A real-time PCR assay for DNA-methylation using methylation specific blockers, Nucleic Acids Res. 2004; 32(1): e10.) In some embodiments, the methylation is quantified with the PyroMark™ MD Pyrosequencing System (Qiagen) using PyroMark® Gold Q96 Reagents (Qiagen, Cat#972804). Other approaches for methylation quantification include, for example, methylation specific QPCR or quantitative bisulfite sequencing of methylation.
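By way of non-limiting illustration, methylation at a single CpG site can be expressed as a percentage from bisulfite-based read-outs. The sketch below relies on the general principle that bisulfite treatment converts unmethylated cytosine so that it is read as T, while methylated cytosine is still read as C. The function name and the counts are hypothetical, and the sketch does not reproduce the calculation performed by any particular instrument software.

```python
def percent_methylation(c_signal: float, t_signal: float) -> float:
    """Percent methylation at one CpG site after bisulfite conversion.

    c_signal: signal or read count for C (methylated, protected from conversion)
    t_signal: signal or read count for T (unmethylated, converted)
    """
    if c_signal < 0 or t_signal < 0:
        raise ValueError("signals must be non-negative")
    total = c_signal + t_signal
    if total == 0:
        raise ValueError("no signal at this CpG site")
    return 100.0 * c_signal / total


# Hypothetical counts: 30 C reads and 70 T reads give 30% methylation.
print(percent_methylation(30, 70))
```

A lower percentage at a CpG site that is negatively correlated with expression would be consistent with the hypomethylation described herein.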

ARID5B Inhibitors

Disclosed herein are ARID5B inhibitors that reduce the level of endogenous ARID5B in a subject. The term "inhibitor" refers to a compound which is capable of reducing the expression of a gene or the activity of the product of such gene to an extent sufficient to achieve a desired biological or physiological effect. Therefore, the term "inhibitor" as used herein includes one or more of an oligonucleotide inhibitor, including siRNA, shRNA, miRNA and ribozymes.

Gene Silencing

As used herein, the phrase "RNA interference" (also called "RNAi" herein) refers to its meaning as is generally accepted in the art. The term generally refers to the biological process of inhibiting, decreasing, or down-regulating gene expression in a cell, and which is mediated by short interfering nucleic acid molecules (e.g., siRNAs, miRNAs, shRNAs). Additionally, the term "RNA interference" (or "RNAi") is meant to be equivalent to other terms used to describe sequence-specific RNA interference, such as post-transcriptional gene silencing, translational inhibition, transcriptional inhibition, or epigenetics. For example, nucleic acid agents of the invention can be used to epigenetically silence ARID5B at either the post-transcriptional level or the pre-transcriptional level. In a non-limiting example, epigenetic modulation of gene expression by nucleic acid agents of the invention can result from modification of chromatin structure or methylation patterns to alter gene expression. In another non-limiting example, modulation of gene expression by nucleic acid agents of the invention can result from cleavage of RNA (either coding or non-coding RNA) via RISC, or via translational inhibition, as is known in the art, or modulation can result from transcriptional inhibition.

The terms "inhibit," "down-regulate," "reduce" or "knockdown" as used herein refer to their meanings as are generally accepted in the art. With reference to exemplary single-stranded RNAi molecules of the invention, the terms generally refer to the reduction in the (i) expression of a gene or target sequence and/or the level of RNA molecules encoding one or more proteins or protein subunits, and/or (ii) the activity of one or more proteins or protein subunits, below that observed in the absence of the single-stranded RNAi molecules of the invention. Down-regulation can also be associated with post-transcriptional silencing, such as RNAi-mediated cleavage, or by alteration in DNA methylation patterns or DNA chromatin structure. Inhibition, down-regulation, reduction or knockdown with an RNAi agent can be in reference to an inactive molecule, an attenuated molecule, an RNAi agent with a scrambled sequence, or an RNAi agent with mismatches. The phrase "gene silencing" refers to a partial or complete loss-of-function through targeted inhibition of an endogenous target gene in a cell. As such, the term is used interchangeably with RNAi, "knockdown," "inhibition," "down-regulation," or "reduction" of expression of a target gene.

The nucleic acid sequence for ARID5B is known in the art, and it is within the skill of those in the art to design and produce gene silencing oligonucleotides such as siRNA using this sequence. For example, siRNA that silence ARID5B are commercially available from ThermoFisher Scientific (e.g., Catalog Nos. S38579, S38580, S38581).

Targeted Methylation

The disclosed methods can involve the targeted methylation of CpG sites within the ARID5B gene. Enzymes capable of modifying methylation status are known in the art and can be targeted to the ARID5B gene to methylate the CpG sites. In some embodiments, targeted DNA methylation of the CpG sites within the ARID5B gene is accomplished using a gene editing tool such as a zinc finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN), mega-nuclease, CRISPR/Cas, structure-guided endonuclease (SGN), or targetron. All of them can achieve precise genetic modifications by inducing targeted DNA double-strand breaks (DSBs). Depending on the cell cycle stage, as well as the presence or absence of a repair template with homologous terminal regions, the DSB may then be repaired by either the non-homologous end joining repair system (NHEJ) or the homologous recombination-based double-strand break repair pathway (HDR).

In some embodiments, targeted DNA methylation of the CpG sites within the ARID5B gene is accomplished using a CRISPR system. Compositions and methods for making and using CRISPR-Cas systems are described in U.S. Pat. No. 8,697,359, which is incorporated herein in its entirety. In particular, the method can involve a catalytically inactive site specific nuclease fused to an effector domain having methylation activity; and a guide sequence or a nucleic acid that encodes a guide sequence. In some aspects, the catalytically inactive site specific nuclease is a catalytically inactive Cas protein (e.g., a Cas9 protein or a Cpf1 protein). The guide sequences may be ribonucleic acid guide sequences. In certain aspects, the guide sequence is from about 10 base pairs to about 150 base pairs in length. The one or more guide sequences may comprise two or more guide sequences.

There are various ways that a polypeptide comprising a catalytically inactive site specific nuclease fused to an effector domain having methylation or demethylation activity can be delivered to a cell or subject, e.g., by administering a nucleic acid that encodes the polypeptide, which nucleic acid may be, e.g., a viral vector or may be a translatable nucleic acid (e.g., synthetic modified mRNA). Examples of modified mRNA are described in Warren et al. (Cell Stem Cell 7(5):618-30, 2010), Mandal PK, Rossi DJ. (Nat Protoc. 2013 8(3):568-82), US Pat. Pub. No. 20120046346 and/or WO2011/130624.

In some aspects, one or more guide sequences include sequences that recognize DNA in a site-specific manner. For example, guide sequences can include guide ribonucleic acid (RNA) sequences utilized by a CRISPR system or sequences within a TALEN or zinc finger system that recognize DNA in a site-specific manner. The guide sequences comprise a portion that is complementary to a portion of each of the one or more genomic sequences and comprise a binding site for the catalytically inactive site specific nuclease. In some embodiments, the RNA sequence is referred to as guide RNA (gRNA) or single guide RNA (sgRNA).

In some aspects, a single RNA sequence can be complementary to one or more (e.g., all) of the genomic sequences that are being modulated or modified. In one aspect, a single RNA is complementary to a single target genomic sequence. In a particular aspect in which two or more target genomic sequences are to be modulated or modified, multiple (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more) RNA sequences are introduced wherein each RNA sequence is complementary to (specific for) one target genomic sequence. In some aspects, two or more, three or more, four or more, five or more, or six or more RNA sequences are complementary to (specific for) different parts of the same target sequence. In one aspect, two or more RNA sequences bind to different sequences of the same region of DNA. In some aspects, a single RNA sequence is complementary to at least two or more (e.g., all) of the target genomic sequences. It will also be apparent to those of skill in the art that the portion of the RNA sequence that is complementary to one or more of the genomic sequences and the portion of the RNA sequence that binds to the catalytically inactive site specific nuclease can be introduced as a single sequence or as 2 (or more) separate sequences into a cell, zygote, embryo or nonhuman animal. In some embodiments the sequence that binds to the catalytically inactive site specific nuclease comprises a stem-loop.

In some embodiments, the RNA sequence used to modify gene expression is a modified RNA sequence comprising a 5-methylcytidine (5mC). The RNA sequence can vary in length from about 8 base pairs (bp) to about 200 bp. In some embodiments, the RNA sequence can be about 9 to about 190 bp; about 10 to about 150 bp; about 15 to about 120 bp; about 20 to about 100 bp; about 30 to about 90 bp; about 40 to about 80 bp; about 50 to about 70 bp in length.

The portion of each genomic sequence to which each RNA sequence is complementary can also vary in size. In particular aspects, the portion of each genomic sequence to which the RNA is complementary can be about 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 100 nucleotides (contiguous nucleotides) in length. In some embodiments, each RNA sequence can be at least about 70%, 75%, 80%, 85%, 90%, 95%, 100%, etc. identical or similar to the portion of each genomic sequence. In some embodiments, each RNA sequence is completely or partially identical or similar to each genomic sequence. For example, each RNA sequence can differ from perfect complementarity to the portion of the genomic sequence by about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, etc. nucleotides. In some embodiments, one or more RNA sequences are perfectly complementary (100%) across at least about 10 to about 25 (e.g., about 20) nucleotides of the genomic sequence.
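By way of non-limiting illustration, the number of nucleotides by which an RNA sequence differs from perfect complementarity to a portion of a genomic sequence can be counted as follows. This is a minimal sketch that assumes equal-length sequences, antiparallel pairing, and Watson-Crick base pairs only. The sequences and function names are hypothetical and are not ARID5B sequences.

```python
# (RNA base, DNA base) combinations treated as complementary in this sketch.
PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_mismatches(rna: str, dna_target: str) -> int:
    """Count positions at which the RNA does not pair with the DNA strand it hybridizes to.

    Both sequences are written 5'->3'; pairing is antiparallel, so the RNA is
    compared against the DNA strand read in reverse.
    """
    if len(rna) != len(dna_target):
        raise ValueError("this sketch only compares sequences of equal length")
    paired = zip(rna.upper(), reversed(dna_target.upper()))
    return sum((r, d) not in PAIRS for r, d in paired)


def percent_complementarity(rna: str, dna_target: str) -> float:
    """Percentage of positions that are Watson-Crick paired."""
    return 100.0 * (len(rna) - count_mismatches(rna, dna_target)) / len(rna)


dna_portion = "ATGCATGCAT"          # hypothetical genomic portion
perfect_rna = "AUGCAUGCAU"          # pairs at every position
one_mismatch_rna = "AUGCAUGCAA"     # last base does not pair
print(count_mismatches(perfect_rna, dna_portion),
      percent_complementarity(one_mismatch_rna, dna_portion))
```

The sketch does not score wobble pairs, gaps, or bulges, any of which a fuller treatment of partial complementarity would need to address.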

As described herein, the one or more RNA sequences also comprise a (one or more) binding site for a (one or more) catalytically inactive site specific nuclease. The catalytically inactive site specific nuclease may be a catalytically inactive CRISPR associated (Cas) protein. In a particular aspect, upon hybridization of the one or more RNA sequences to the one or more genomic sequences, the catalytically inactive site specific nuclease binds to the one or more RNA sequences. In some aspects, the method further comprises administering one or more catalytically inactive Cas nucleic acid or protein.

A variety of CRISPR associated (Cas) genes or proteins which are known in the art can be used in the methods of the invention and the choice of Cas protein will depend upon the particular conditions of the method. Specific examples of Cas proteins include Cas1, Cas2, Cas3, Cas4, Cas5, Cas5e (or CasD), Cas6, Cas6e, Cas6f, Cas7, Cas8a1, Cas8a2, Cas8b, Cas8c, Cas9, Cas10, Cas10d, CasF, CasG, CasH, Csy1, Csy2, Csy3, Cse1 (or CasA), Cse2 (or CasB), Cse3 (or CasE), Cse4 (or CasC), Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csz1, Csx15, Csf1, Csf2, Csf3, Csf4, Cpf1, and Cu1966. In a particular aspect, the Cas nucleic acid or protein used in the methods is Cas9. In some embodiments a Cas protein, e.g., a Cas9 protein, may be from any of a variety of prokaryotic species. In some embodiments a particular Cas protein, e.g., a particular Cas9 protein, may be selected to recognize a particular protospacer-adjacent motif (PAM) sequence. In certain embodiments a Cas protein, e.g., a Cas9 protein, may be obtained from a bacteria or archaea or synthesized using known methods. In certain embodiments, a Cas protein may be from a gram positive bacteria or a gram negative bacteria. In some embodiments, the Cas protein is Cpf1 protein or a functional portion thereof. In some embodiments, the Cas protein is Cpf1 from any bacterial species or functional portion thereof.

Disclosed herein is a catalytically inactive Cas9 protein tethered to all or a portion of a methyltransferase to create a chimeric protein that can be guided to specific DNA sites by one or more RNA sequences (sgRNA) to methylate ARID5B CpG sites. DNA methylation is established by two de novo DNA methyltransferases (Dnmt3a/b), and is maintained by Dnmt1.

As will be apparent to those of skill in the art, the method can further comprise introducing other molecules or factors into the cell to facilitate methylation of the genomic sequence. An agent that enhances DNA methylation may be an inhibitor of an endogenous DNA demethylase.

Sequence-specific nucleases have been developed to increase the efficiency of gene targeting or genome editing in animal and plant systems. Among them, zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) are the two most commonly used sequence-specific chimeric proteins. Once the ZFN or TALEN constructs are introduced into and expressed in cells, the programmable DNA binding domain can specifically bind to a corresponding sequence and guide the chimeric nuclease to make a specific DNA strand cleavage. A pair of ZFNs or TALENs can be introduced to generate double strand breaks (DSBs), which activate the DNA repair systems and significantly increase the frequency of both nonhomologous end joining (NHEJ) and homologous recombination (HR). These and other methods for DNA methylation editing are disclosed in WO2017090724, WO2018035495, and WO2018053035, which are incorporated by reference in their entirety for these teachings.

Pharmaceutical Compositions

The compositions disclosed can be used therapeutically in combination with a pharmaceutically acceptable carrier. By "pharmaceutically acceptable" is meant a material that is not biologically or otherwise undesirable, i.e., the material may be administered to a subject, along with the nucleic acid or vector, without causing any undesirable biological effects or interacting in a deleterious manner with any of the other components of the pharmaceutical composition in which it is contained. The carrier would naturally be selected to minimize any degradation of the active ingredient and to minimize any adverse side effects in the subject, as would be well known to one of skill in the art.

The disclosed compositions can be formulated in buffer solutions such as phosphate buffered saline solutions, liposomes, micellar structures, and capsids. Formulations of DsiRNA agent with cationic lipids can be used to facilitate transfection of the DsiRNA agent into cells. For example, cationic lipids, such as lipofectin (U.S. Pat. No. 5,705,188), cationic glycerol derivatives, and polycationic molecules, such as polylysine (published PCT International Application WO 97/30731), can be used. Suitable lipids include Oligofectamine, Lipofectamine (Life Technologies), NC388 (Ribozyme Pharmaceuticals, Inc., Boulder, Colo.), or FuGene 6 (Roche), all of which can be used according to the manufacturer's instructions.

Such compositions typically include the nucleic acid molecule and a pharmaceutically acceptable carrier. As used herein the language "pharmaceutically acceptable carrier" includes saline, solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.

A pharmaceutical composition is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates; and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle, which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

For administration by inhalation, the compositions are delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer. Such methods include those described in U.S. Pat. No. 6,468,798.

Systemic administration can also be by transmucosal or transdermal means.

For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

The compounds can also be administered by a method suitable for administration of nucleic acid agents, such as a DNA vaccine. These methods include gene guns, bioinjectors, skin patches, micro-particle DNA vaccine technology, transdermal needle-free vaccination, intranasal delivery, and microencapsulation. In one embodiment, the disclosed compositions are prepared with carriers that will protect the nucleic acids against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Such formulations can be prepared using standard techniques. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

As defined herein, a therapeutically effective amount of a nucleic acid molecule (i.e., an effective dosage) depends on the nucleic acid selected. For instance, if a plasmid encoding an siRNA agent is selected, single dose amounts in the range of approximately 1 pg to 1000 mg may be administered; in some embodiments, 10, 30, 100, or 1000 pg, or 10, 30, 100, or 1000 ng, or 10, 30, 100, or 1000 µg, or 10, 30, 100, or 1000 mg may be administered. In some embodiments, 1-5 g of the compositions can be administered. The compositions can be administered from one or more times per day to one or more times per week, including once every other day. The skilled artisan will appreciate that certain factors may influence the dosage and timing required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments.

The nucleic acid molecules of the invention can be inserted into expression constructs, e.g., viral vectors, retroviral vectors, expression cassettes, or plasmid viral vectors, e.g., using methods known in the art, including but not limited to those described in Xia et al., (2002), supra. Expression constructs can be delivered to a subject by, for example, inhalation, orally, intravenous injection, local administration (see U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994), Proc. Natl. Acad. Sci. USA, 91, 3054-3057). The pharmaceutical preparation of the delivery vector can include the vector in an acceptable diluent, or can comprise a slow release matrix in which the delivery vehicle is imbedded. Alternatively, where the complete delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

The expression constructs may be constructs suitable for use in the appropriate expression system and include, but are not limited to retroviral vectors, linear expression cassettes, plasmids and viral or virally-derived vectors, as known in the art. Such expression constructs may include one or more inducible promoters, RNA Pol III promoter systems such as U6 snRNA promoters or H1 RNA polymerase III promoters, or other promoters known in the art. The constructs can include one or both strands of the siRNA. Expression constructs expressing both strands can also include loop structures linking both strands, or each strand can be separately transcribed from separate promoters within the same construct.

Suitably formulated pharmaceutical compositions of this invention can be administered by means known in the art such as by parenteral routes, including intravenous, intramuscular, intraperitoneal, subcutaneous, transdermal, airway (aerosol), rectal, vaginal and topical (including buccal and sublingual) administration. In some embodiments, the pharmaceutical compositions are administered by intravenous or intraparenteral infusion or injection.

In general, a suitable dosage unit of nucleic acid agent will be in the range of 0.001 to 0.25 milligrams per kilogram body weight of the recipient per day, or in the range of 0.01 to 20 micrograms per kilogram body weight per day, or in the range of 0.01 to 10 micrograms per kilogram body weight per day, or in the range of 0.10 to 5 micrograms per kilogram body weight per day, or in the range of 0.1 to 2.5 micrograms per kilogram body weight per day. A pharmaceutical composition comprising the nucleic acid agent can be administered once daily. However, the therapeutic agent may also be dosed in dosage units containing two, three, four, five, six or more sub-doses administered at appropriate intervals throughout the day. In that case, the nucleic acid agent contained in each sub-dose must be correspondingly smaller in order to achieve the total daily dosage unit. The dosage unit can also be compounded for a single dose over several days, e.g., using a conventional sustained release formulation which provides sustained and consistent release of the nucleic acid agent over a several day period. Sustained release formulations are well known in the art. In this embodiment, the dosage unit contains a corresponding multiple of the daily dose. Regardless of the formulation, the pharmaceutical composition must contain nucleic acid agent in a quantity sufficient to inhibit expression of the target gene in the animal or human being treated. The nucleic acid agent can be compounded in such a way that the sum of the multiple units of dsRNA together contain a sufficient dose.
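The weight-based arithmetic described above can be illustrated with a short sketch. The names `plan_dose` and `DosePlan` are hypothetical, the example values are illustrative only, and the sketch simply assumes that sub-doses divide the daily dosage unit evenly and that a sustained release unit holds the daily dose multiplied by the number of days it covers.

```python
from dataclasses import dataclass


@dataclass
class DosePlan:
    daily_dose_mg: float      # total daily dosage unit
    sub_dose_mg: float        # amount contained in each sub-dose
    sustained_unit_mg: float  # single unit covering several days


def plan_dose(weight_kg, dose_mg_per_kg_day, sub_doses_per_day=1, sustained_days=1):
    """Compute dosage amounts from body weight (hypothetical helper).

    Assumes equal sub-doses and a sustained release unit that is a simple
    multiple of the daily dose.
    """
    if weight_kg <= 0 or dose_mg_per_kg_day < 0:
        raise ValueError("weight must be positive and dose non-negative")
    if sub_doses_per_day < 1 or sustained_days < 1:
        raise ValueError("sub_doses_per_day and sustained_days must be >= 1")
    daily = weight_kg * dose_mg_per_kg_day
    return DosePlan(
        daily_dose_mg=daily,
        sub_dose_mg=daily / sub_doses_per_day,
        sustained_unit_mg=daily * sustained_days,
    )


# Illustrative values only: 70 kg recipient, 0.01 mg/kg/day,
# four sub-doses per day, or one unit covering three days.
plan = plan_dose(70, 0.01, sub_doses_per_day=4, sustained_days=3)
```

Each sub-dose is smaller than the daily dosage unit by the number of sub-doses, while a multi-day unit is larger by the number of days covered.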

Data can be obtained from cell culture assays and animal studies to formulate a suitable dosage range for humans. The dosage of compositions of the invention lies within a range of circulating concentrations that include the ED50 (as determined by known methods) with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For a compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range of the compound that includes the IC50 (i.e., the concentration of the test compound which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels of dsRNA in plasma may be measured by standard methods, for example, by high performance liquid chromatography.

The pharmaceutical compositions can be included in a kit, container, pack, or dispenser together with instructions for administration.

Methods of Treatment

The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) inflammatory disease or disorders. Subjects at risk for the disease can be identified by, for example, one or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the detection of or the manifestation of symptoms characteristic of the disease or disorder, such that the disease or disorder is prevented or, alternatively, delayed in its progression. Another aspect of the invention pertains to methods of treating subjects therapeutically, i.e., altering the onset of symptoms of the disease or disorder.

With regard to both prophylactic and therapeutic methods of treatment, such treatments may be specifically tailored or modified, based on knowledge obtained from the field of pharmacogenomics. "Pharmacogenomics", as used herein, refers to the application of genomics technologies such as gene sequencing, statistical genetics, and gene expression analysis to drugs in clinical development and on the market. More specifically, the term refers to the study of how a patient's genes determine his or her response to a drug (e.g., a patient's "drug response phenotype", or "drug response genotype"). Thus, another aspect of the invention provides methods for tailoring an individual's prophylactic or therapeutic treatment according to that individual's drug response genotype. Pharmacogenomics allows a clinician or physician to target prophylactic or therapeutic treatments to patients who will most benefit from the treatment and to avoid treatment of patients who will experience toxic drug-related side effects.

A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

EXAMPLES

Example 1: Blood Monocyte Transcriptome and Epigenome Analyses Reveal Loci Associated with Human Atherosclerosis.

Epigenomics and transcriptomics can illuminate the interplay between the genome and its environment and may provide insights into the molecular basis of complex diseases, including cardiovascular disease (CVD) (Lewis, J. & Bird, A. FEBS Lett. 285:155-159 (1991); Bird, A.P. & Wolffe, A.P. Cell 99:451-454 (1999); Jones, P.A. & Takai, D. Science 293:1068-1070 (2001); Egger, G., et al. Nature 429:457-463 (2004); Petronis, A. Nature 465:721-727 (2010)). Epigenetic targeting also is an attractive treatment strategy for reordering dysregulated gene expression. To date, epigenome-wide studies of CVD traits are limited (Reinius, L.E., et al. PLoS One. 7:e41361 (2012)) and their interpretation is potentially complicated by use of data from mixed cell types, which may obscure cell-type specific functional mechanisms. Monocytes and their derived macrophages are key players in inflammation that contribute to the development of many chronic diseases, including atherosclerosis (Lessner, S.M., et al. Am. J. Pathol. 160:2145-2155 (2002); Osterud, B. & Bjorklid, E. Physiol Rev. 83:1069-1112 (2003); Hansson, G.K. N. Engl. J. Med. 352:1685-1695 (2005); Ley, K., et al. Arterioscler. Thromb. Vasc. Biol. 31:1506-1516 (2011)).

There are blood monocyte epigenome and transcriptome signatures of several CVD risk factors, including age, obesity, and cigarette smoking (Reynolds, L.M. et al. Nat. Commun. 5:5366 (2014); Reynolds, L.M. et al. BMC Genomics 16:333 (2015); Ding, J. et al. Diabetes. 64:3464-3474 (2015); Reynolds, L.M. et al. Circ Cardiovasc. Genet 8:707-716 (2015)). In this Example, transcriptome and methylome features that are associated with atherosclerosis are identified. In addition to traditional CVD risk factors, we also integrate these findings with histone modification, DNase-seq, Hi-C (genome-wide chromosome conformation capture), and ChIA-PET (Chromatin Interaction Analysis by Paired-End Tag) sequencing data, and in vitro functional data in order to characterize novel molecular mechanisms of atherosclerosis. This Example discloses a role of ARID5B in atherogenesis, and an epigenetically controlled regulatory site of ARID5B expression.

Methods

Participants

The present analyses are primarily based on data collected at MESA exam 5 with concurrent analyses of purified monocyte samples of 1,264 randomly selected MESA participants from four MESA sites [Johns Hopkins University (JHU); Columbia University (CO); University of Minnesota (UMN); Wake Forest University (WFU)]. The study protocol was approved by the Institutional Review Boards of the four institutions. All participants signed informed consent.

Measurement of Atherosclerosis and Prevalent CVD

All measurements in humans were obtained at MESA exam 5 unless otherwise specified. To obtain carotid artery plaque scores, readers at the UW Atherosclerosis Imaging Research Program Laboratory adjudicated carotid plaque presence or absence, defined as a focal abnormal wall thickness (carotid IMT >1.5 mm) or a focal thickening of >50% of the surrounding IMT, as reported previously (Stein, J.H. et al. J Am Soc Echocardiogr. 21:93-111 (2008); Touboul, P.J. et al. Cerebrovasc. Dis. 23:75-80 (2007)). Presence or absence of plaque acoustic shadowing was recorded. A total plaque score (range 0-12) was calculated to describe carotid plaque burden. One point per plaque was allocated for the near and far walls of each segment (CCA, bulb, ICA) of each carotid artery that was interrogated. For carotid plaque presence, intra-reader reproducibility was excellent (per reader Kappa = 0.82-1.0, overall Kappa = 0.83, 95% confidence interval [CI] 0.70-0.96), as was inter-reader reproducibility (Kappa = 0.89; 95% CI 0.72-1.00) (Tattersall, M.C. et al. Stroke 45:3257-3262 (2014)). The CT Reading Center for cardiac scans in the MESA is at UCLA-Biomedical Research Institute. Coronary artery calcium (CAC) score was determined using the Agatston method, which accounts for both lesion area and calcium density using Hounsfield brightness. The re-read agreement for the CAC score (intraclass correlation coefficient = 0.99) was excellent. Prevalent CVD was defined as a past history of myocardial infarction, angina (which included definite angina and probable angina if coronary revascularization was performed at the same time or afterwards), resuscitated cardiac arrest, or stroke.
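The total plaque score described above can be sketched as a simple count over the twelve interrogated sites (two arteries, three segments, two walls). The function name, the "left"/"right" labels, and the tuple representation are assumptions made for illustration; the sketch takes the adjudicated plaque presence at each site as given.

```python
from itertools import product

ARTERIES = ("left", "right")       # assumed labels for the two carotid arteries
SEGMENTS = ("CCA", "bulb", "ICA")  # segments named in the text
WALLS = ("near", "far")            # walls named in the text

# All 2 x 3 x 2 = 12 sites that can each contribute one point.
ALL_SITES = set(product(ARTERIES, SEGMENTS, WALLS))


def total_plaque_score(sites_with_plaque):
    """Return the total plaque score (0-12): one point per site with plaque.

    `sites_with_plaque` is an iterable of (artery, segment, wall) tuples
    at which plaque was adjudicated present.
    """
    sites = set(sites_with_plaque)
    unknown = sites - ALL_SITES
    if unknown:
        raise ValueError(f"unrecognized sites: {sorted(unknown)}")
    return len(sites)


score = total_plaque_score([
    ("left", "bulb", "far"),
    ("right", "ICA", "near"),
])
```

Counting distinct sites keeps the score within the stated 0-12 range even if a site is listed twice.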

Measurement of Covariates

Weight was measured with a Detecto Platform Balance Scale to the nearest 0.5 kg. Height was measured with a stadiometer (Accu-Hite Measure Device with level bubble) to the nearest 0.1 cm. Body-mass index (BMI) was defined as weight in kilograms divided by the square of height in meters. Type II diabetes mellitus (T2D) was defined as fasting glucose ≥ 7.0 mmol L⁻¹ (≥ 126 mg dL⁻¹) or use of hypoglycemic medication, and impaired fasting glucose was defined as fasting glucose 5.6-6.9 mmol L⁻¹ (100-125 mg dL⁻¹). Plasma IL-6 was measured by ultra-sensitive ELISA (Quantikine HS Human IL-6 Immunoassay; R&D Systems, Minneapolis, MN).

Plasma C-reactive protein (CRP) was measured using the BNII nephelometer (High Sensitivity CRP; Dade Behring Inc., Deerfield, IL). Resting blood pressure was measured three times in the seated position using a Dinamap model Pro 100 automated oscillometric sphygmomanometer (Critikon, Tampa, FL). The average of the last two measurements was used in analysis. Hypertension was defined as systolic pressure greater than or equal to 140 mm Hg, diastolic pressure greater than or equal to 90 mm Hg or current use of anti-hypertensive medication. The single nucleotide polymorphisms (SNPs) data were derived from MESA Affymetrix 6.0 array genotype data (Fox, C.S. et al. PLoS. Genet. 8:e1002705 (2012)).
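The covariate definitions above (BMI, glucose status, and hypertension) can be expressed as a brief sketch. The function names and the "neither" label are hypothetical, and the sketch assumes that any fasting glucose from 5.6 up to, but not including, 7.0 mmol/L counts as impaired fasting glucose.

```python
def bmi(weight_kg, height_cm):
    """Body-mass index: weight in kg divided by the square of height in m."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def glucose_status(fasting_glucose_mmol_l, on_hypoglycemic_medication=False):
    """Classify glucose status using the thresholds stated in the text."""
    if on_hypoglycemic_medication or fasting_glucose_mmol_l >= 7.0:
        return "T2D"
    if fasting_glucose_mmol_l >= 5.6:
        return "IFG"  # impaired fasting glucose
    return "neither"  # label assumed for illustration


def mean_of_last_two(readings):
    """Average the last two of three resting blood pressure readings."""
    if len(readings) != 3:
        raise ValueError("expected exactly three readings")
    return (readings[1] + readings[2]) / 2.0


def has_hypertension(systolic, diastolic, on_antihypertensive_medication=False):
    """Apply the >= 140 / >= 90 mm Hg or medication definition."""
    return (
        on_antihypertensive_medication
        or mean_of_last_two(systolic) >= 140
        or mean_of_last_two(diastolic) >= 90
    )
```

For example, `has_hypertension([150, 138, 140], [80, 80, 80])` is false because only the last two systolic readings (mean 139 mm Hg) enter the definition.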

Blood Cell Count and Purification of CD14+ and CD4+ cells

Samples for complete blood count (CBC) with differential analysis were obtained by venipuncture and collected into tubes containing ethylenediaminetetraacetic acid (EDTA). Total circulating white blood cell count and cell subtype counts were performed at local LabCorp. Blood was also collected in sodium heparin-containing Vacutainer CPT™ cell separation tubes (Becton Dickinson, Rutherford, NJ) to separate peripheral blood mononuclear cells from other elements within two hours from blood draw. Subsequently, monocytes and T cells were isolated with anti-CD14 and anti-CD4 monoclonal antibody coated magnetic beads, respectively, using an autoMACS automated magnetic separation unit (Miltenyi Biotec, Bergisch Gladbach, Germany). Initially, flow cytometry analysis of 18 specimens was performed, including samples from all four MESA field centers, which were found to be consistently > 90% pure.

DNA/RNA Extraction

DNA and RNA were isolated from samples simultaneously using the AllPrep DNA/RNA Mini Kit (Qiagen, Inc., Hilden, Germany). DNA and RNA QC metrics included optical density (OD) measurements using a NanoDrop spectrophotometer and evaluation of the integrity of 18S and 28S ribosomal RNA using the Agilent 2100 Bioanalyzer with RNA 6000 Nano chips (Agilent Technology, Inc., Santa Clara, CA) following manufacturer's instructions. RNA with RIN (RNA Integrity) scores > 9.0 was used for global expression microarrays. The median RIN for our 1,264 samples was 9.9.

Global mRNA Expression Quantification

The Illumina HumanHT-12 v4 Expression BeadChip and Illumina Bead Array Reader were used to perform the genome-wide expression analysis, as previously described (Liu, Y. et al. Hum Mol Genet 22:5065-5074 (2013)). This data has been deposited in the NCBI Gene Expression Omnibus and is accessible through GEO Series accession number (GSE56047).

Epigenome-wide Methylation Quantification

Illumina HumanMethylation450 BeadChips and the HiScan reader were used to perform the epigenome-wide methylation analysis, as previously described (Liu, Y. et al. Hum Mol Genet 22:5065-5074 (2013)). This methylation data has been deposited in the NCBI Gene Expression Omnibus and is accessible through GEO Series accession number (GSE56046).

Quality Control and Pre-Processing of Microarray Data

Data pre-processing and quality control (QC) analyses were performed in R using Bioconductor packages, as previously described (Liu, Y. et al. Hum Mol Genet 22:5065-5074 (2013)). For both monocyte and T cell assays, we included 2% blind duplicates. Correlations among technical replicates exceeded 0.997.

Multidimensional scaling plots showed the five common control samples were highly clustered together and identified three outlier samples, which were excluded subsequently.

The Illumina HumanHT-12 v4 Expression BeadChip included >47,000 probes for >30,000 genes (with unique Entrez gene IDs). Statistical analyses excluded probes with non-detectable expression in >90% of MESA samples (using a detection p-value cut-off of 0.0001), probes overlapping repetitive elements or regions, probes with low variance across the samples (<10th percentile), or probes targeting putative and/or not well-characterized genes, i.e., gene names starting with KIAA, FLJ, HS, MGC, or LOC. A total of 14,619 mRNA transcripts from 10,989 unique genes were included in the analyses of the presented manuscript.
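Two of the probe exclusion rules above (detection and low variance) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name and dictionary layout are hypothetical, a probe is assumed to be "detected" in a sample when its detection p-value is below the cut-off, the variance filter is assumed to be applied after the detection filter, and the 10th percentile is approximated by rank.

```python
import statistics


def filter_probes(expression, detection_p, p_cutoff=0.0001,
                  max_undetected_fraction=0.90, variance_percentile=0.10):
    """Return the IDs of probes that pass both filters.

    expression / detection_p: dict mapping probe ID to a list of
    per-sample expression values / detection p-values.
    """
    # Step 1: drop probes not detected in more than 90% of samples.
    detected = []
    for probe, pvals in detection_p.items():
        undetected = sum(p >= p_cutoff for p in pvals)
        if undetected / len(pvals) <= max_undetected_fraction:
            detected.append(probe)
    if not detected:
        return []

    # Step 2: drop probes whose variance falls below the (rank-based)
    # 10th percentile of the remaining probes.
    variances = {p: statistics.pvariance(expression[p]) for p in detected}
    ranked = sorted(variances.values())
    cutoff = ranked[int(variance_percentile * len(ranked))]
    return [p for p in detected if variances[p] >= cutoff]
```

Filters on repetitive elements and gene-name prefixes depend on annotation tables and are omitted from the sketch.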

The Illumina HumanMethylation450 BeadChip included probes for 485K CpG sites. Exclusion criteria included probes with "detected" methylation levels in <90% of MESA samples using a detection p-value cut-off of 0.05 and 65 control probes which assay highly-polymorphic single nucleotide polymorphisms (SNPs) rather than DNA methylation (Pidsley, R. et al. BMC Genomics 14:293 (2013)). Methylation data for the total of 484,817 CpG sites were included in the analyses.

To estimate residual sample contamination, we generated separate enrichment scores for neutrophils, B cells, T cells, and natural killer cells as described previously (Liu, Y. et al. Hum Mol Genet 22:5065-5074 (2013)). We adjusted for residual sample contamination with non-monocyte cell types in all the analyses. Although most monocytes (80-90%) are expected to be CD14+CD16- (Wong, K.L. et al. Blood 118:e16-e31 (2011)), we also adjusted for expression of the FCGR3A gene (CD16a).

RNA sequencing

A subset of 374 samples was randomly selected from the 1,264 MESA monocyte samples for RNA sequencing. Total RNA samples were enriched for mRNA by depleting rRNA using the MICROBExpress kit from Ambion and following the manufacturer's instructions. Poly(A) mRNA was enriched, and Illumina compatible, strand-specific libraries were constructed using Illumina's TruSeq Stranded mRNA HT Sample Prep Kit (Illumina, RS-122-2103). 1 of total RNA with RIN ≥ 8.0 was converted into a library of stranded template molecules suitable for subsequent cluster generation and sequencing by Illumina HiSeq. The libraries generated were validated using Agilent 2100 Bioanalyzer and quantitated using Quant-iT dsDNA HS Kit (Invitrogen) and qPCR. Six individually indexed cDNA libraries were pooled and sequenced on Illumina HiSeq, resulting in an average of close to 30 million reads per sample. Libraries were clustered onto flow cells using Illumina's TruSeq PE Cluster Kit v3 (PE-401-3001) and sequenced 2 × 100 cycles using TruSeq SBS Kit-HS (FC-401-3001) on an Illumina HiSeq™ 2500. A total of 64 lanes were run to generate approximately 30 million 2 × 101 paired-end reads per sample. The Illumina HiSeq Control Software (HCS v2.0.12) with Real Time Analysis (RTA v1.3.61) was used to provide the management and execution of the HiSeq 2500.

Illumina sequencing runs were processed to de-multiplex samples and generate FastQ files using the Illumina-provided configureBclToFastq.pl script to automate running CASAVA 1.8.4 using default parameters for removal of sequencing reads failing the chastity filter (yes) and mismatches in the barcode read (0).

Following generation of FastQ files, reads were trimmed to remove poor quality reads (or read tails) using Btrim (5 base sliding window average with Q > 15) (Kong, Y. Genomics 98:152-153 (2011)) and then trimmed to remove any adaptor sequence present in the reads using custom perl scripts (trim sequences containing 11 base tag of adaptor, final length >40 bases). The Ensembl GRCh37 Homo sapiens reference file, annotations and Bowtie2 indexes were downloaded from the iGenomes website for mapping of the sequencing reads to the genome and read counting. Bowtie2 (2.1.0) and TopHat2 (2.0.8) were used to map the sequencing reads to the genome using a mate-inner-distance of 100 bp and 'firststrand' options (Langmead, B., et al. Genome Biol. 10, R25 (2009); Trapnell, C., et al. Bioinformatics. 25:1105-1111 (2009)). Following alignment, bam files were merged using the samtools (0.1.19) merge function, and read counts per gene were obtained using HTSeq (0.5.4p3). The 'intersection-strict' overlap resolution mode and 'stranded reverse' options were used in HTSeq.

Data pre-processing and QC analyses were performed in R using Bioconductor packages. The transcript-based raw count data files for each sample from TopHat2 were combined into a count matrix with 56,303 features (rows) and 374 MESA samples (columns). The median total count per sample was 28.8 million. Reads denoted by TopHat2 as "no_feature", "ambiguous", "too_low_aQual", "not_aligned", or "alignment_not_unique" were removed. Counts were converted to Counts Per Million (CPM) using the cpm function of the edgeR package (Robinson, M.D., et al. Bioinformatics. 26:139-140 (2010)), and all features with CPM < 0.25 in >90% of the 374 MESA samples were removed. Features assigned to the mitochondrial genome were removed as well. Using the biomaRt package and querying the Ensembl BioMart database, Entrez Gene IDs, Gene Symbols, genome coordinates, gene length and percent GC content were obtained for 12,585 features which had a corresponding Entrez ID or Illumina HumanHT-12 v4 probe ID. To be able to continue to use the flexible and computationally efficient linear modeling functions in R, we transformed the raw count data to log counts per million (y = logCPM):

where c_gs is the raw count of gene transcript g in sample s, and T_s is the normalized total count of sample s, using the Trimmed Mean of M-values (TMM) normalization method (Robinson, M.D. & Oshlack, A. Genome Biol. 11:R25 (2010)) as implemented in the calcNormFactors function in the edgeR Bioconductor package (Robinson, M.D., et al. Bioinformatics. 26:139-140 (2010)). Either only this TMM normalization was performed, or quantile normalization (QN) was applied to the logCPM values. Because the logCPM values' variance tends to decrease with increasing count for smaller counts, we used the voom function of the limma package to estimate the mean-variance trend non-parametrically and to predict the residual variance of each individual observation for each gene. Then the inverse residual variances were incorporated into the linear modeling (lm) as weights in a standard manner. The same low variance filter that was used for the microarray data was imposed for the logCPM data, removing another 192 features with the lowest variance and retaining 12,380 features for analysis. Weighted linear model analyses were then performed with otherwise exactly the same models as for the microarray data.
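The CPM filter and logCPM transformation described above can be sketched in a few lines. This is a hedged illustration rather than the edgeR/limma implementation: the function names are hypothetical, the normalized total counts T_s are taken as inputs (TMM normalization is not reproduced), and the offsets of 0.5 on the count and 1 on the total are the conventional voom-style values, assumed here because they are not stated in the text.

```python
import math


def log_cpm(counts, norm_totals, count_offset=0.5, total_offset=1.0):
    """Return log2 counts per million for a genes x samples count matrix.

    counts[g][s] is the raw count c_gs; norm_totals[s] is the normalized
    total count T_s of sample s. The offsets are assumed voom-style values.
    """
    return [
        [
            math.log2((c + count_offset) / (t + total_offset) * 1e6)
            for c, t in zip(row, norm_totals)
        ]
        for row in counts
    ]


def keep_feature(cpm_values, min_cpm=0.25, max_low_fraction=0.90):
    """Keep a feature unless its CPM is below 0.25 in more than 90% of samples."""
    low = sum(v < min_cpm for v in cpm_values)
    return low / len(cpm_values) <= max_low_fraction


# Illustrative counts for one gene in two samples.
y = log_cpm([[0, 1]], [999999.0, 2999999.0])
```

The count offset keeps the logarithm defined for zero counts, which is why such an offset is commonly used before linear modeling of count data.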

Bisulfite treatment of genomic DNA and pyrosequencing

A subset of 90 samples was selected from the 1,264 MESA monocyte samples, based on carotid plaque score extremes, matched for age, sex, and race. Genomic DNA from monocytes was bisulfite modified using the EZ DNA Methylation™ Kit (Zymo Research Co., Irvine, CA) according to the manufacturer's protocol for the Infinium methylation assay. Primers (F1-biotinylated 5'-GAAAATAGGAAATGTTTAATTGTG-3' (SEQ ID NO:2), R1 5'-TCACTATACTCCTAACAACCAACC-3' (SEQ ID NO:3)) for pyrosequencing assays of ARID5B-cg25953130 were designed using PSQ assay design software version 1.0.6 (Biotage, Uppsala, Sweden). PCR was performed with the PyroMark PCR Master Mix (Qiagen Inc.). Pyrosequencing was conducted using PyroMark Gold Q98 Reagents (Qiagen Inc.). Methylation values for each CpG site were calculated using Pyro Q-CpG software 1.0.9 (Biotage).

Weighted Gene Co-Expression Network Analyses

The Weighted Gene Co-Expression Network Analysis (WGCNA) method was used to construct network modules of highly correlated transcripts, using the R package WGCNA (Langfelder, P. & Horvath, S. BMC. Bioinformatics. 9:559 (2008)). A total of 1,261 MESA samples were included in the co-expression network analysis after removing three outlier samples based on hierarchical clustering, and 13,196 mRNA transcripts were included. First, an unsigned weighted network was constructed based on the pairwise correlations among all transcripts considered, using a soft thresholding power of 5, chosen to produce approximately a scale-free topology. Then, using the topological overlap measure to estimate the network interconnectedness, the transcripts were hierarchically clustered. The default parameters of WGCNA were used, except for changing the correlation type from Pearson to biweight midcorrelation (which is more robust to outliers), the deepSplit setting from 2 to 3, the detectCutHeight value from 0.995 to 0.999, the maximum block size from 5,000 to 14,000 transcripts, and the minimum size for module detection from 20 to 10. WGCNA produces a set of modules, each containing a unique set of mRNA transcripts. The module eigengene was obtained to represent each module, which corresponds to the first eigenvector of the within-module expression correlation matrix (or the first right-singular vector of the standardized within-module expression matrix).
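The unsigned soft-thresholding step described above can be illustrated with a minimal Python sketch. It is not the WGCNA implementation: it uses the plain Pearson correlation rather than the biweight midcorrelation used in the analysis, and the transcript names and expression values are hypothetical toy data. Only the form of the adjacency, a_ij = |cor(x_i, x_j)|^power with a power of 5, is taken from the text.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation (a simplification; the analysis used bicor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def unsigned_adjacency(profiles, power=5):
    """Unsigned weighted adjacency: |correlation| raised to the soft power."""
    names = list(profiles)
    adjacency = {}
    for i in names:
        for j in names:
            if i == j:
                adjacency[(i, j)] = 1.0
            else:
                r = pearson(profiles[i], profiles[j])
                adjacency[(i, j)] = abs(r) ** power
    return adjacency

# Hypothetical expression profiles for three transcripts across four samples.
toy_profiles = {
    "t1": [1.0, 2.0, 3.0, 4.0],
    "t2": [2.0, 4.1, 5.9, 8.0],
    "t3": [4.0, 1.0, 3.0, 2.0],
}
toy_adjacency = unsigned_adjacency(toy_profiles)
```

Raising the absolute correlation to a power shrinks weak correlations toward zero while leaving strong ones nearly intact, which is why a soft threshold can approximate a scale-free topology without a hard cutoff.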

Association Analyses

The overall goal of the association analysis was to characterize the associations of each measure of atherosclerosis (carotid plaque and CAC) with each of the genome-wide measures of mRNA expression and DNA methylation. Analyses were performed using the linear model (lm) function of the Stats package and the stepAIC function of the MASS package in R. Reported correlations (r) represent the partial Pearson product-moment correlation coefficient. Separate linear regression models were fit for each measure of atherosclerosis, with 1) genome-wide (log2 transformed) mRNA expression profiles, 2) network module eigengenes from WGCNA, and 3) genome-wide DNA methylation profiles (M-values). The M-value is well suited for high-level analyses and can be transformed into the beta-value, an estimate of the percent methylation of an individual CpG site that ranges from 0 to 1 (M is logit(beta-value)). P-values were adjusted for multiple testing using the q-value FDR method (Storey, J.D. & Tibshirani, R. Proc. Natl. Acad. Sci. U. S. A 100:9440-9445 (2003)) and Benjamini-Hochberg FDR (Benjamini, Y. & Hochberg, Y. Journal of the Royal Statistical Society B 57:289-300 (1995)) when applicable. All analyses accounted for effects of residual cell contamination and covariates including age, gender, ethnicity, and study site. The full model also adjusted for traditional CVD risk factors (cigarette smoking, BMI, HDL-C and LDL-C levels, hypertension, diabetes mellitus) and statin use.
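As an illustration of the two transformations mentioned above, the following Python sketch converts between M-values and beta-values and applies a Benjamini-Hochberg adjustment. It assumes the base-2 logit convention for the M-value, M = log2(beta / (1 - beta)); the function names are hypothetical, and the q-value method of Storey and Tibshirani is a different procedure that is not reproduced here.

```python
import math

def m_to_beta(m_value):
    """Beta-value from M-value, assuming M = log2(beta / (1 - beta))."""
    return 2.0 ** m_value / (2.0 ** m_value + 1.0)

def beta_to_m(beta):
    """M-value from beta-value (base-2 logit); beta must lie strictly in (0, 1)."""
    return math.log2(beta / (1.0 - beta))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

An M-value of 0 corresponds to a beta-value of 0.5, that is, 50% methylation at the site.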

Mediation analysis

To investigate the genomic features as a potential molecular link between CVD risk factors and extent of atherosclerosis, mediation analyses were performed under an assumed causal model in which a CVD risk factor leads to a change in the genomic feature, which at least partially mediates the effects of the CVD risk factor on atherosclerotic burden. The mediation analyses accounted for the biological and technical covariates and were performed by robust Structural Equation Modeling (SEM) as implemented in the R package lavaan (Rosseel, Y. Journal of Statistical Software 48:1-36 (2012)). SEM analysis in general, and as implemented in lavaan, is based on Maximum Likelihood and the normal distribution, but provides several approaches to effectively deal with non-normal data. A first approach consists of computing robust standard errors (SE) by sandwich-type covariance matrices and scaled test statistics, in particular the Satorra-Bentler statistic whose amount of rescaling reflects the degree of kurtosis, while the second approach uses specific bootstrapping methods to obtain both SE and test statistics (Rosseel, Y. Journal of Statistical Software 48:1-36 (2012)). The reported results are based on bootstrapping, which was found to be somewhat more conservative than the use of robust SE and the Satorra-Bentler statistic.
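The assumed causal model (risk factor X, genomic feature M, atherosclerosis measure Y) can be sketched with a simple product-of-coefficients decomposition. This is not the lavaan SEM analysis described above: it uses ordinary least squares, omits all covariates, and provides no bootstrap inference. The function name and the data are hypothetical and serve only to show how a total effect splits into a direct and an indirect (mediated) part.

```python
def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def mediation_effects(x, m, y):
    """OLS product-of-coefficients mediation for X -> M -> Y (no covariates)."""
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                           # effect of X on M
    b = (smy * sxx - sxy * sxm) / det       # effect of M on Y, adjusting for X
    direct = (sxy * smm - smy * sxm) / det  # effect of X on Y, adjusting for M
    total = sxy / sxx                       # effect of X on Y, ignoring M
    indirect = a * b
    return {
        "total": total,
        "direct": direct,
        "indirect": indirect,
        "proportion_mediated": indirect / total,
    }

# Hypothetical toy data.
x_toy = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
m_toy = [0.1, 0.9, 2.2, 2.8, 4.1, 5.2]
y_toy = [0.3, 1.1, 1.9, 3.2, 3.8, 5.1]
effects = mediation_effects(x_toy, m_toy, y_toy)
```

For ordinary least squares the total effect equals the direct effect plus the indirect effect, so the proportion mediated is the indirect effect divided by the total effect.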

In vivo Functional Annotation Analysis

CpGs with methylation significantly associated with carotid plaque scores or CAC were investigated for association with cis-gene expression by performing a look-up in the results from our previous analysis (Liu, Y. et al. Hum Mol Genet 22:5065-5074 (2013)) of the same samples. Briefly, to identify DNA methylation associated with gene expression in cis, separate linear regression models were fit with the M-value for each CpG site (adjusted for methylation chip and position effects) as a predictor of transcript expression for any autosomal gene within 1 Mb of the CpG in question. Covariates were age, sex, race/ethnicity, and study site. mRNA expression and DNA methylation of the most significantly associated mRNA and CpGs associated with carotid plaque scores and CAC were also investigated for association with nearby genetic variants. A large window (± 1 Mb) surrounding the mRNA/CpG was investigated to avoid missing any potential effects. Covariates were age, sex, race/ethnicity, and study site. Separate linear regression models were fit with single nucleotide polymorphisms (SNPs) located within 1 Mb as a predictor of the mRNA expression (log2 transformed) or the methylation (M-value) in the MESA samples from Caucasian participants, including SNPs with a minor allele frequency > 0.05 in the MESA Caucasian population. P-values were adjusted for multiple testing using the q-value FDR method (Storey, J.D. & Tibshirani, R. Proc. Natl. Acad. Sci. U. S. A 100:9440-9445 (2003)).
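The cis window used above can be sketched as a simple pairing step that lists which CpG-gene pairs would be tested. The identifiers and coordinates below are hypothetical, and measuring distance to the nearest gene boundary is an assumption, since the text states only that genes within 1 Mb of the CpG were considered.

```python
def cis_pairs(cpgs, genes, window=1_000_000):
    """List (CpG, gene) pairs on the same chromosome within `window` bp.

    Distance is taken to the nearest gene boundary (an assumption);
    a CpG inside the gene body has distance 0.
    """
    pairs = []
    for cpg_id, (cpg_chrom, cpg_pos) in cpgs.items():
        for gene_id, (gene_chrom, gene_start, gene_end) in genes.items():
            if cpg_chrom != gene_chrom:
                continue
            if cpg_pos < gene_start:
                distance = gene_start - cpg_pos
            elif cpg_pos > gene_end:
                distance = cpg_pos - gene_end
            else:
                distance = 0
            if distance <= window:
                pairs.append((cpg_id, gene_id))
    return pairs

# Hypothetical coordinates, for illustration only.
toy_cpgs = {"cpgA": ("chr10", 5_000_000), "cpgB": ("chr4", 100_000)}
toy_genes = {
    "geneX": ("chr10", 4_900_000, 5_100_000),  # contains cpgA
    "geneY": ("chr10", 6_500_000, 6_600_000),  # 1.5 Mb from cpgA
    "geneZ": ("chr4", 900_000, 950_000),       # 0.8 Mb from cpgB
}
toy_pairs = cis_pairs(toy_cpgs, toy_genes)
```

Each retained pair would then be fit with its own linear regression model, as described in the text.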

In silico Functional Annotation Analysis

For the differentially expressed co-expression modules, Ingenuity® Pathway Analysis (IPA®, QIAGEN Redwood City) was used to examine the enrichment of canonical pathways and bio-functions. In silico functional prediction of chromatin states in monocytes was performed using ChromHMM (Ernst, J. & Kellis, M. Nat Methods 9:215-216 (2012)) to predict segmentation among six states, based on histone modifications in monocyte samples from the BLUEPRINT project (Adams, D. et al. Nat Biotechnol 30:224-226 (2012); Saeed, S. et al. Science 345:1251086 (2014)) (H3K27ac, H3K4me1, H3K4me3) and ENCODE (Rosenbloom, K.R. et al. Nucleic Acids Res. 41:D56-D63 (2013)) (H3K36me3). Other functional information utilized includes DNase hypersensitive hotspot data in a monocyte sample (C001UY46) from the BLUEPRINT project (Adams, D. et al. Nat Biotechnol 30:224-226 (2012); Saeed, S. et al. Science 345:1251086 (2014)), and transcription factor binding sites detected in any cell type from ENCODE (Rosenbloom, K.R. et al. Nucleic Acids Res. 41:D56-D63 (2013)). Data were accessed from the UCSC Genome Browser (Karolchik, D. et al. Nucleic Acids Res (2013)) and the Gene Expression Omnibus.

Hi-C data in a B-cell line (GM12878) (Rao, S.S. et al. Cell 159:1665-1680 (2014); Dixon, J.R. et al. Nature 485:376-380 (2012); Heidari, N. et al. Genome Res 24:1905-1917 (2014); Dixon, J.R. et al. Nature 518:331-336 (2015)), including the contact matrix heatmap and virtual 4C results, were adapted from the Yue lab Hi-C Interactions and Virtual 4C website.

Functional Evaluation of ARID5B Using In Vitro Models

To examine the functional role of ARID5B, siRNA-mediated ARID5B knockdown was used in LPS-stimulated human THP1-monocytes. Using the protocols as described above, transcriptomic profiling was performed using the Illumina HumanHT-12 v4 Expression BeadChip, and ARID5B mRNA expression was measured by qPCR.

The human monocytic THP-1 cell line was purchased from the American Type Culture Collection (ATCC). Cells were maintained in complete RPMI 1640 medium (Invitrogen) supplemented with 100 units ml⁻¹ of penicillin, 100 μg ml⁻¹ of streptomycin, 2 mM L-glutamine, and 10% fetal bovine serum (FBS, HyClone, Logan, UT) in a humidified incubator with 5% CO2 at 37 °C.

Two target-specific siRNAs for ARID5B, one for exons 8 and 9 (Life Technologies, siRNA ID: s38579, designated as siARID5B1) and one for exon 6 (siRNA ID: s38580, designated as siARID5B2), were used initially to compare knockdown efficiency. Then, siARID5B1 (s38579) was chosen for the remaining in vitro experiments. siRNA transfection was performed using THP1 monocytes that were split 24 h prior to transfection. 10 nM of siARID5B was transfected into THP-1 cells by nucleofection for 24 h and 48 h using the Amaxa® Human Monocyte Nucleofector Kit and an Amaxa Nucleofector II device (Lonza, Inc.). Scrambled siRNAs were transfected as negative controls. Samples were incubated for 24 h, followed by 3 h of LPS (100 ng ml⁻¹) stimulation.

Levels of cytokines in human monocytic THP-1 cells presented in Fig. 10 were measured by quantitative real-time RT-PCR using gene-specific TaqMan probe sets in an ABI PRISM 7000 sequence detection system (Life Technologies). GAPDH mRNA was the internal loading control.

To reduce false positive rates, two independent experiments (4 siARID5B vs. 4 scrambled siRNA samples per experiment) were performed. To detect differential expression between two groups with small sample sizes, the regularized t-test implemented in the limma R package was used (Smyth, G.K. Stat. Appl. Genet. Mol. Biol. 3:Article3 (2004)). For the differentially expressed genes, Ingenuity® Pathway Analysis (IPA®, QIAGEN Redwood City) was used to examine the enrichment of canonical pathways and bio-functions.

To further interrogate the potential pro-inflammatory role that ARID5B may play, a third siARID5B knockdown experiment was performed as described above, and ELISA assays of culture media were performed for IL-1α, the pro-inflammatory cytokine that was most significantly reduced by ARID5B mRNA knockdown.

Supernatants collected from LPS-stimulated THP1 cells were analyzed for human IL-1α production using a commercial sandwich ELISA kit (R&D Systems) according to the manufacturer's instructions. The results are representative of three or more experiments performed in triplicate and are presented as means ± SEM. Levels of IL-1α mRNA expression presented in Figure 5B were measured by RT-PCR as described previously.

To further examine the effects of ARID5B knockdown on cellular functions suggested by the transcriptomic profile changes, the siARID5B knockdown experiment was repeated as described above and THP1-monocyte migration and phagocytosis assays were performed. Cell migration was evaluated using Transwell inserts (6.5 mm diameter) with polycarbonate filters (5-μm pore size, Corning Costar) in 24-well plates. THP1 cells (30,000 cells per well) were added to the upper chamber of the insert. The lower chamber contained 600 μl of RPMI 1640 medium/1% FBS with the chemokine MCP-1 (40 ng/ml). The plates were incubated at 37°C in 5% CO2 for 24 h, and cells that had migrated into the lower chamber were counted using a Countess cell counter. Cell phagocytosis was evaluated using yellow-green carboxylate-modified microspheres (Thermo Fisher Scientific) in 24-well plates. Cells (500,000 cells per well) were incubated with 2 μl of beads (beads:cells = 5:1) for 60 minutes. After thorough washing 10 times with cold PBS, the percentage of engulfed beads in cells was calculated using FACS.

Data Availability

Genome-wide gene expression and DNA methylation data have been deposited in the NCBI Gene Expression Omnibus (GEO) and are accessible through GEO Series accession numbers GSE56047 and GSE56046, respectively.

Results

Clinical Data and Sample Characteristics

The Multi-Ethnic Study of Atherosclerosis (MESA) is a multi-site, longitudinal study designed to investigate the prevalence, correlates, and progression of subclinical cardiovascular disease in a population cohort of 6,814 participants. Since its inception in 2000, five clinic visits have collected extensive clinical, socio-demographic, lifestyle, behavior, laboratory, nutrition, and medication data (Bild, D.E. et al. Am. J. Epidemiol. 156:871-881 (2002)). At Exam 5, carotid ultrasound and computed tomography (CT) were used to quantify carotid plaque burden (carotid plaque score, range 0-12) and coronary artery calcification (CAC Agatston score), respectively. These two measures of atherosclerosis burden independently predict future CVD events in MESA and other cohorts (Tattersall, M.C. et al. Stroke 45:3257-3262 (2014); Gepner, A.D. et al. Circ. Cardiovasc. Imaging 8 (2015); Plichart, M. et al. Atherosclerosis 219:917-924 (2011); Stein, J.H., et al. J. Am. Soc. Echocardiogr. 21:93-111 (2008)) (distribution of scores presented in Fig. 6). Table 1 shows characteristics of the MESA participants at Exam 5, overall and stratified by study site (study site 1: N=709, site 2: N=499), including demographics (age, sex, race/ethnicity, and study site), traditional CVD risk factors [cigarette smoking, body-mass index (BMI), high- and low-density lipoprotein cholesterol (HDL-C and LDL-C), hypertension, type II diabetes mellitus (T2DM)], atherosclerosis burden measures, prevalent CVD, and statin use. The transcriptome and methylome of monocytes purified at MESA Exam 5 were profiled concurrently. In a separate effort to evaluate reproducibility of single-time measures, there was a high consistency of repeated Illumina microarray data over five months (Fig. 7), suggesting both RNA expression and DNA methylation at most loci can be stable over time in an individual.

Blood Leukocyte Count and Atherosclerosis

Total white blood cell (WBC) count and its constituent subtypes were measured for all samples prior to monocyte purification, which provided the absolute monocyte count. Higher monocyte count, but not monocyte percentage or other leukocyte count, was marginally associated with carotid plaque score and CAC [natural log (carotid plaque score/CAC + 1), p=0.026 with 0.4% of variability explained, and p=0.079, respectively], in agreement with previous reports (Madjid, M., et al. J Am Coll Cardiol. 44:1945-1956 (2004)). Positive immunoselection (magnetic beads) (Lyons, P.A. et al. Genomics 8:64 (2007)) was used to produce samples of monocytes with greater than 90% purity. Residual contamination with neutrophils, B cells, T cells, and natural killer cells was estimated as previously reported (Liu, Y. et al. Hum Mol Genet 22:5065-5074 (2013)). Likewise, the generally small percentage of CD14+ monocytes that are also CD16+ (~10%) (Wong, K.L. et al. Blood 118:e16-e31 (2011)) was estimated based on expression of FCGR3A (CD16a). Neither the surrogates of residual cell contamination nor the percentage of CD14+CD16+ cells was associated with the measures of atherosclerosis.

Transcriptome Signature of Atherosclerosis

Transcriptomic studies of atherosclerosis in mouse models and in humans (Berisha, S.Z., et al. PLoS. One. 8:e65003 (2013); Maiwald, S., et al. Curr. Opin. Clin. Nutr. Metab Care 16:411-417 (2013); Sivapalaratnam, S. et al. PLoS. One. 7:e32166 (2012)) have been reported. However, no consensus atherosclerosis biomarkers or pathways have been identified. Of the 10,989 unique genes with RNA expression detectable in monocytes, genes were identified with expression associated with carotid plaque score (n=2) and CAC (n=13) at a q-value-based false discovery rate (FDR) (Storey, J.D. & Tibshirani, R. Proc. Natl. Acad. Sci. U. S. A 100:9440-9445 (2003)) of 0.05 after adjusting for demographics (Figures 1A and 1B). At an FDR level of 0.05 for the genome-wide search, signals with small effect sizes would inevitably be missed because of the limited statistical power at this sample size; additional signals at an FDR of 0.20 are therefore shown in Table 2 (21 gene transcripts with carotid plaque, and 104 gene transcripts with CAC). Expression of two genes, ARID5B and PDLIM7 (PDZ and LIM domain protein 7), was positively associated with both measures of atherosclerosis (FDR≤0.05); ARID5B was most significantly associated with carotid plaque score (p=6.30x10⁻⁸, FDR=1.08x10⁻³; with CAC: p=2.47x10⁻⁵, FDR=0.03). ARID5B plays a key metabolic role in adipose, liver and smooth muscle, and was previously implicated in lipid metabolism and adipogenesis in mice (Whitson, R.H., et al. Biochem. Biophys. Res Commun. 312:997-1004 (2003); Yamakawa, T., et al. Mol Endocrinol. 22:441-453 (2008)). PDLIM7 is an actin and protein kinase adaptor that promotes mineralization, which might be relevant to plaques with calcium. The most significant signal associated with CAC was lower expression of CACNA2D3 (Calcium Channel, Voltage-Dependent, Alpha 2/Delta Subunit 3; p=2.08x10⁻⁸, FDR=3.19x10⁻⁴; with carotid plaque: p=6.71x10⁻³, FDR=0.40), a tumor suppressor gene that can induce mitochondrial-mediated apoptosis (Wong, A.M. et al. Int J Cancer. 133:2284-2295 (2013)).

Additionally adjusting for other traditional CVD risk factors and statin use (in a model designated as the full model) had minimal impact on the significant associations (Figures 1C and 1D). All of these associations had a consistent direction of effect across the independent study sites, particularly for three genes (ARID5B, TSPYL1, and ADM) which were cross-validated (p<0.05, Figures 1C and 1D). Race/ethnicity- and sex-specific analyses also were consistent across various strata (Table 3). For the top signal, ARID5B, the associations with carotid plaque were consistent in direction and nominally significant across subgroups of age (< or > 65 years), sex, and statin use (Figure 1E). Results utilizing RNA-sequencing derived expression levels from a subset (N=354) of the monocyte samples validated significant associations between ARID5B and CAC (beta ± SE = 0.45 ± 0.20, p = 0.03) and carotid plaque (beta ± SE = 0.13 ± 0.06, p = 0.04).

To identify network modules of highly correlated transcripts in an unbiased fashion, the weighted gene co-expression network analysis was applied (Langfelder, P. & Horvath, S. BMC. Bioinformatics. 9:559 (2008)), identifying 40 co-expressed gene network modules. Three modules were significantly (FDR≤0.05) associated with CAC (Table 4), including a cholesterol metabolism transcriptional network (CMTN) with 12 functionally coupled genes that was also associated with carotid plaque. The CMTN eigengene (the first principal component of the CMTN) is associated with a transcriptional profile expected to increase intracellular cholesterol, including up-regulation of cholesterol uptake (LDLR and MYLIP) and cholesterol and fatty acid synthesis genes (HMGCS1, FDFT1, SQLE, CYP51A1, SC4MOL, SC5DL, SCD and FADS1), as well as down-regulation of cholesterol efflux genes (ABCG1 and ABCA1). The CMTN is a signature feature of obesity, which was associated with CAC (p=3.34x10⁻⁴) (Ding, J. et al. Diabetes. 64:3464-3474 (2015)). Here, it is further shown that the CMTN was positively associated with carotid plaque (p=2.31x10⁻⁴; FDR=0.009). Among the CMTN members, LDLR expression was most significantly associated with CAC (p=1.34x10⁻⁴, FDR=0.10) and ABCG1 expression was most significantly associated with carotid plaque (p=1.48x10⁻⁴, FDR=0.18, Table 2). The other two network modules significantly (FDR<0.05) associated with CAC were enriched with genes involved in phagosome formation (module 39; FDR=1.32x10⁻²) and migration of cells (module 23; FDR=3.96x10⁻²; Table 4).

Methylome Signature of Atherosclerosis

Of the 484,817 CpG sites, 31 and 7 had methylation significantly associated with carotid plaque and CAC, respectively, including one CpG (cg23661483 in an exon of ILVBL) associated with both carotid plaque and CAC (FDR≤0.05, adjusting for demographics, Figures 2A and 2B, and Table 5 showing additional signals for FDR≤0.10). The most significant CpG (cg06126421) associated with carotid plaque score (p=2.00x10⁻¹⁰, FDR=8.91x10⁻⁵) is an intergenic CpG, which recently was reported as one of the most significant smoking-associated methylation sites genome-wide (Shenker, N.S. et al. Hum Mol Genet 22:843-851 (2013)). Notably, the carotid plaque-associated methylation sites include one CpG in ARID5B (cg25953130, intron 2, p=4.31x10⁻⁷, FDR=0.01), which tends to be hypomethylated in the individuals with higher carotid plaque scores. Similar results were found between cg25953130 methylation and CAC (p=6.80x10⁻⁵, FDR=0.32). The 37 differentially methylated sites associated with carotid plaque or CAC showed more variable CpG methylation levels across the MESA population than the methylome as a whole: the interdecile range (IDR) of the percentage of methylation (measured by beta-value (Du, P. et al. BMC Bioinformatics. 11:587 (2010))) ranged from 4 to 24% (median: 10%), compared to a median of 4% for the whole methylome. The majority of the methylation sites associated with carotid plaque or CAC had inverse associations with the atherosclerosis measures (Table 5). The 37 atherosclerosis-associated CpGs are distributed among 30 unique genomic loci.

Additional adjustments in the full model had minimal impact on the significant associations, which were also consistent across the two study sites for the majority (61%) of differentially methylated sites (p<0.05, as shown for each unique locus in Figures 2C and 2D). Race/ethnicity- and sex-specific analyses also showed high consistency across the various strata (Table 6), suggesting shared effects across ancestries. At the genome-wide level, race/ethnicity-specific analyses did not uncover additional significant associations. The associations of ARID5B methylation (cg25953130, intron) with carotid plaque were consistent in direction and nominally significant across subgroups of age (< or > 65 years), sex, CVD status, and statin use (Figure 2E). Pyrosequencing-derived methylation of cg25953130 from a subset (n=90) of the monocyte samples significantly correlated with microarray-based methylation levels (r = 0.92, p=5.2x10⁻³⁷), and validated significant associations between cg25953130 methylation and carotid plaque (beta ± SE = -0.07 ± 0.03, p=6.31x10⁻³).

In Vivo and In Silico Functional Validation

DNA methylation has been viewed as an important potential regulator of gene expression (Jones, P.A. Nat Rev Genet 13:484-492 (2012)). To prioritize the list of differentially methylated CpGs, expression-associated methylation sites (eMS) reported in the same monocyte samples from a previous study were assessed (Wong, K.L et al. Blood 118:e16-e31 (2011)). Five atherosclerosis-associated CpGs whose methylation was significantly related to mRNA expression of at least one nearby gene were identified (Table 7). Of the five atherosclerosis-associated eMS, four had methylation correlated with mRNA expression profiles that were nominally (p<0.05) associated with atherosclerosis, including expression of SC4MOL. The CpG most significantly associated with CAC (cg05119988, located in the 5' UTR region of SC4MOL) significantly correlated (r=-0.17, p=1.2x10⁻¹⁰, FDR=7.3x10⁻⁸) with mRNA expression of SC4MOL, a member of the identified CMTN, which was nominally associated with both carotid plaque and CAC. The SC4MOL eMS resides in a predicted weak promoter region of a B cell line; however, this region was identified as heterochromatin in monocytes (using ChromHMM (Ernst, J. & Kellis, M. Nat Methods 9:215-216 (2012)) for histone modification data from BLUEPRINT (Adams, D. et al. Nat Biotechnol 30:224-226 (2012); Saeed, S. et al. Science 345:1251086 (2014)), Fig. 8, Table 5 and Table 7).

The ARID5B CpG (cg25953130) was the only CpG site significantly associated with expression of ARID5B (p=3.84x10⁻¹⁴, FDR=5.70x10⁻⁶; Figure 3A). As shown in Figure 3B, it overlaps a DNase hypersensitive hotspot (BLUEPRINT monocyte data (Adams, D. et al. Nat Biotechnol 30:224-226 (2012); Saeed, S. et al. Science 345:1251086 (2014))), a predicted strong enhancer (using both monocyte and B cell line histone mark data from the BLUEPRINT (Adams, D. et al. Nat Biotechnol 30:224-226 (2012); Saeed, S. et al. Science 345:1251086 (2014)) and ENCODE (Rosenbloom, K.R. et al. Nucleic Acids Res. 41:D56-D63 (2013)) projects, respectively), and a transcription factor binding site occupied by EP300 (in a neuroblastoma cell line). More importantly, chromatin-capture sequencing technologies (both Hi-C and ChIA-PET) confirmed direct interactions between regions in the ARID5B cg25953130 locus and the ARID5B promoter region in a B cell line (Rao, S.S. et al. Cell 159:1665-1680 (2014); Dixon, J.R. et al. Nature 485:376-380 (2012); Heidari, N. et al. Genome Res 24:1905-1917 (2014); Dixon, J.R. et al. Nature 518:331-336 (2015)) (Figure 3C). These data, together with the publicly available functional data, strongly support the presence of an ARID5B regulatory region in the ARID5B gene body flanking ARID5B cg25953130.

To test whether the assumed methylation effects on atherosclerosis burden were mediated through its associated mRNA expression, Structural Equation Modeling was used with bootstrapping (SEM, R package lavaan (Rosseel, Y. Journal of Statistical Software 48:1-36 (2012))) to perform mediation analyses. ARID5B mRNA expression significantly mediated 15% and 14% of the total effect of this ARID5B CpG on carotid plaque score (indirect effect, p=2.1x10⁻⁴) and CAC (p=2.1x10⁻³), respectively.

The inverse association between methylation of this ARID5B CpG and carotid plaque score after adjusting for ARID5B expression (direct effect) was also significant. Jointly, the associations of ARID5B gene expression and methylation levels with atherosclerosis explain an additional 2.3% of the variability in carotid plaque score beyond well-known CVD risk factors and statin use. The effect sizes of ARID5B gene expression and methylation levels associating with carotid plaque score are higher than the effect sizes of T2DM (1% of variability) or hypertension (0.9%) in the same model. These data suggest that different types of related genomic features (mRNA expression and DNA methylation) may offer additive value in the prediction of CVD susceptibility.

To further demonstrate the clinical relevance of ARID5B, the tertiles of ARID5B mRNA expression and ARID5B cg25953130 methylation were associated with presence of carotid plaque (defined as carotid plaque score greater than zero, N=816 cases), presence of CAC (defined as CAC greater than zero, N=844 cases), and prevalent CVD (history of a coronary heart event or stroke, N=64 cases). ARID5B mRNA expression was positively associated with presence of carotid plaque (3rd tertile odds ratio=2.10, 95% CI: 1.42-3.09, p=1.87x10⁻⁴), presence of CAC (3rd tertile odds ratio=2.10, 95% CI: 1.42-3.09, p=1.87x10⁻⁴), and prevalent CVD (3rd tertile odds ratio=2.33, 95% CI: 1.08-5.02, p=3.12x10⁻², Figure 4A); while ARID5B cg25953130 methylation was inversely associated with presence of carotid plaque (3rd tertile odds ratio=0.64, 95% CI: 0.45-0.92, p=1.59x10⁻²) and presence of CAC (3rd tertile odds ratio=0.64, 95% CI: 0.45-0.92, p=1.59x10⁻², Figure 4B) in the full model. To examine the dose-response relationship between ARID5B and extent of atherosclerosis, which may indicate their potential contribution to the progression of plaques, we performed linear regression analysis while excluding those with a carotid plaque score of zero. The associations with carotid plaque score remained significant for ARID5B mRNA expression (unique variability explained: 1.2%, p=1.1x10⁻³) and cg25953130 methylation (unique variability explained: 0.73%, p=1.1x10⁻²).
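The relationship between an odds ratio, its 95% confidence interval, and a p-value can be sketched as follows. The sketch assumes a Wald-type interval that is symmetric on the log-odds scale and a two-sided normal test; the text does not state how the reported p-values were computed, so this is an illustrative consistency check, not the analysis itself. The function name is hypothetical.

```python
import math

def p_from_odds_ratio_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate two-sided p-value from an odds ratio and its 95% CI.

    Assumes the CI is exp(log(OR) +/- z_crit * SE), i.e. a Wald interval.
    """
    standard_error = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value

# Values quoted in the text for the 3rd tertile of ARID5B expression
# and of cg25953130 methylation.
z_expr, p_expr = p_from_odds_ratio_ci(2.10, 1.42, 3.09)
z_meth, p_meth = p_from_odds_ratio_ci(0.64, 0.45, 0.92)
```

Under these assumptions the expression odds ratio gives a p-value on the order of 10⁻⁴ and the methylation odds ratio a p-value on the order of 10⁻², the same orders of magnitude as the values quoted in the text.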

Identified Genomic Features and Known CVD Risk Factors

Among the atherosclerosis-associated genomic features, eight mRNA (Table 8) and 33 CpGs (Table 9) were also associated with one or more traditional CVD risk factors, particularly demographics (age, sex, race/ethnicity), cigarette smoking, and obesity. For three mRNA and 29 CpGs (bolded in Table 6 and Table 9), the predicted effects of the majority of CVD risk factors on atherosclerosis that were mediated through the genomic features had a direction similar to that of the observed associations between the CVD risk factors and the measure of atherosclerosis.

ARID5B expression was positively associated with many CVD risk factors (FDR<0.05 with adjustment for demographic variables), such as age (p=2.06x10⁻¹³), BMI (p=1.67x10⁻⁸), T2DM (p=3.43x10⁻⁵), and inflammatory stress (measured by plasma interleukin-6 (IL-6) levels; p=1.36x10⁻¹⁰), and inversely associated with HDL-C levels (p=6.86x10⁻⁶). ARID5B methylation (cg25953130) was associated inversely with age (p=3.33x10⁻¹¹) and plasma IL-6 levels (p=0.004), and the ARID5B CpG tended to be hypomethylated in current smokers (p=7.19x10⁻⁷). If ARID5B is in fact causal, mediation analyses showed the expression of ARID5B significantly mediated 10% and 25% of the total effect of age and IL-6 on carotid plaque score (p=7.46x10⁻⁶, 4.33x10⁻⁵, respectively), and the ARID5B methylation significantly mediated 7% and 10% of the total effect of age and IL-6 on carotid plaque score (p=2.57x10⁻⁴, 8.28x10⁻³, respectively).

ARID5B RNA Expression and Methylation in CD4+ T cells

To examine the ARID5B expression and methylation across cell-types, similar analyses were performed in a subset of 517 MESA CD4+ T cell samples.

Correlations for ARID5B expression and methylation between monocytes and T cells were weak (r=0.12, 0.30, respectively). Within T cell samples, ARID5B expression was inversely correlated with cg25953130 methylation (r = -0.45, p = 1.27x10⁻³¹), as seen in monocytes. The ARID5B mRNA and cg25953130 methylation associations with carotid plaque and CAC were not significant in T cell samples (p>0.05), but remained statistically significant (p ranges from 3.8x10⁻³ to 7.2x10⁻³) when analyzed in the monocyte samples from the same subset. In the same data, however, there was an association of AHRR cg05575921 with carotid plaque score and CAC score in T cell samples (p=3.18x10⁻⁵, 5.89x10⁻⁵, respectively), with a strong correlation between the two cell types for the AHRR methylation site (r=0.97). AHRR hypomethylation is a well-known, robust biomarker of smoking that is linked to carotid plaque score in MESA (Reynolds, L.M. et al. Circ Cardiovasc. Genet 8:707-716 (2015)). These results demonstrate examples of both potential cell-type specific and shared genomic features in relation to burden of atherosclerosis.

Functional Evaluation of ARID5B Using In Vitro Models

Although little is known about the majority of identified genomic features, pleiotropic effects of ARID5B in adipogenesis (Whitson, R.H., et al. Biochem. Biophys. Res Commun. 312:997-1004 (2003)), chondrogenesis (Hata, K. et al. Nat Commun. 4:2850. (2013)), autoimmune diseases (Yang, W. et al. Am J Hum Genet. 92:41-51 (2013); Okada, Y. et al. Nat Genet. 44:511-516 (2012)), lipid metabolism (Yamakawa, T., et al. Mol Endocrinol. 22:441-453 (2008)), and smooth muscle cell differentiation (Watanabe, M. et al. Circ Res. 91:382-389 (2002)) have been previously reported. As a transcription coactivator that is part of the H3K9me2 demethylase complex with PHD finger protein 2 (PHF2), ARID5B is expected to activate its target genes by removing the repressive H3K9me2 histone mark (demethylation of H3K9me2) from the promoter region of target genes (Hata, K. et al. Nat Commun. 4:2850. (2013); Baba, A. et al. Nat Cell Biol. 13:668-675 (2011)). To evaluate the functional role of ARID5B in monocytes/macrophages, the effects of its knockdown on transcriptomic profiles were examined in LPS-stimulated human THP1-monocytes in two independent experiments (LPS: 100 ng ml⁻¹, n=4 per group in each experiment).

ARID5B mRNA knockdown efficiency was 85% and 76% in the two experiments (Fig. 9); both consistently showed that ARID5B knockdown decreased expression of 1,320 genes and increased expression of 1,162 genes (FDR<0.005 in the discovery set and FDR<0.05 in the replication set). The 2,482 ARID5B-modified genes displayed a significant overrepresentation of related pathways including inflammatory/immune response, chemotaxis, migration, extravasation signaling, and phagocytosis, and also the lipid synthesis functional pathway, compared to the background list of genes detectable on the array (enrichment FDR<0.05, Figure 5A, full list of genes provided in Table 10). Enriched bio-functions and canonical pathways in the inflammatory response pathway consisted of mainly down-regulated genes, including key proinflammatory cytokines (e.g. TNF, IL1A), activator and effector cytokines from the type I interferon signaling pathway (e.g. IRF3, IFNB1, STAT1 and STAT2), and antigen processing and presentation genes (e.g. MHC class II cell surface receptors: HLA-DRA and HLA-DRBs, Fig. 9, bolded in Table 10), collectively indicating decreased pathway activation. Decreased expression of genes in two of the enriched pathways, phagocytosis and lipid synthesis (Table 10), overlapped with the members of two atherosclerosis-associated networks we identified in MESA (CMTN and module 39 - enriched for the phagosome formation pathway, Table 4), which suggests ARID5B may contribute to the gene network associations with atherosclerosis.

To interrogate the potential pro-inflammatory role of ARID5B, a third ARID5B knockdown experiment was performed, and ELISA assays of the culture media were carried out for IL1A, the pro-inflammatory cytokine that was most significantly reduced by ARID5B mRNA knockdown (FDR: 3.79x10⁻⁸ and 4.48x10⁻⁴ in the first two experiments, respectively). ARID5B knockdown decreased levels of both IL1A mRNA (p=0.02) in the THP1-monocytes and IL1A protein (p=4.28x10⁻⁴) in their culture media, as shown in Figure 5B.

To further examine the effects of ARID5B knockdown on cellular functions suggested by the transcriptomic profile changes, THP1-monocyte migration and phagocytosis assays were performed. ARID5B knockdown suppressed monocyte migration (p=3x10⁻⁷ and p=0.004 in experiments 1 and 2, respectively; Figure 5C). ARID5B knockdown also moderately inhibited monocyte phagocytosis (p=0.008), as shown in Figure 5C.

Example 2: Risk Stratification Role of ARID5B Expression in Type II Diabetes and Atherosclerosis

Using the data reported in Example 1, Figure 11A shows the odds ratio of type II diabetes (T2DM) according to ARID5B levels after stratification of subjects by overweight and obesity status, the strongest risk factor for T2DM. Tertiles of ARID5B were used in these analyses to ensure adequate representation of ARID5B levels in all strata under investigation. The ARID5B tertiles showed a strong effect in the overweight and obese subgroups: subjects with diabetes more frequently had ARID5B values in the medium and highest tertiles (the odds ratio for obese subjects in the highest tertile of ARID5B was 11 compared to those with normal weight in the lowest tertile of ARID5B; p=1.1x10⁻⁶).
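As a rough illustration of the tertile-based stratification described above, the following Python sketch assigns expression values to tertiles and computes an unadjusted odds ratio for one stratum against a reference stratum. The source does not specify how the tertile cut points or the odds ratios were computed (they may come from an adjusted regression model); the function names, the cut-point rule, and the counts used here are hypothetical and are not taken from the study.

```python
def tertile_cutoffs(values):
    """Return two cut points splitting the values into three similar-sized groups."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def assign_tertile(value, cutoffs):
    """Return 1 (lowest), 2 (medium) or 3 (highest) for a single value."""
    low, high = cutoffs
    if value < low:
        return 1
    if value < high:
        return 2
    return 3


def odds_ratio(cases_stratum, controls_stratum, cases_ref, controls_ref):
    """Unadjusted odds ratio of a stratum relative to a reference stratum."""
    return (cases_stratum * controls_ref) / (controls_stratum * cases_ref)


# Hypothetical expression values, used only to show the tertile assignment.
expression = [1, 2, 3, 4, 5, 6, 7, 8, 9]
cutoffs = tertile_cutoffs(expression)
tertiles = [assign_tertile(v, cutoffs) for v in expression]
print(cutoffs, tertiles)

# Invented counts: a stratum with 30 cases and 20 controls versus a
# reference stratum with 10 cases and 40 controls.
print(odds_ratio(30, 20, 10, 40))  # 6.0
```

In the setting described above, the stratum would be, for example, obese subjects in the highest ARID5B tertile, and the reference would be normal-weight subjects in the lowest tertile.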

Figure 11B shows the mean carotid plaque scores (log transformed) according to ARID5B levels after stratification of subjects by impaired glucose tolerance and diabetes status, one of the strongest risk factors for atherosclerotic CVD. The ARID5B tertiles showed a significant effect (p<0.001) in the normal, impaired glucose tolerance, and diabetes subgroups.

In conclusion, the risk of type II diabetes and atherosclerosis increases with higher levels of ARID5B in the low-, medium-, and high-risk groups alike. ARID5B significantly improves risk stratification for both T2DM and atherosclerosis, above and beyond their known risk factors.

Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed invention belongs. Publications cited herein and the materials for which they are cited are specifically incorporated by reference.

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.