Title:
DNA METHYLATION MARKERS FOR NEUROPSYCHIATRIC DISORDERS AND METHODS, USES AND KITS THEREOF
Document Type and Number:
WIPO Patent Application WO/2018/107294
Kind Code:
A1
Abstract:
The present disclosure provides epigenetic signatures, comprising genomic CpG dinucleotide sequences, genes, and/or genomic regions, which are differentially methylated in individuals with pathogenic copy number variants or a pathogenic mutation that are associated with a neuropsychiatric disorder (ND) or increased likelihood of an ND, and their use in methods and kits for detecting and/or screening for ND, or the likelihood of ND.

Inventors:
WEKSBERG ROSANNA (CA)
CHOUFANI SANAA (CA)
BUTCHER DARCI (CA)
SIU MICHELLE (CA)
Application Number:
PCT/CA2017/051516
Publication Date:
June 21, 2018
Filing Date:
December 14, 2017
Assignee:
HOSPITAL FOR SICK CHILDREN (CA)
International Classes:
C12Q1/68; C12Q1/6809; G01N33/48
Domestic Patent References:
WO2010120526A2 (2010-10-21)
WO2011127194A1 (2011-10-13)
WO2016182835A1 (2016-11-17)
Foreign References:
CA2963831A1 (2016-04-21)
CA2675290A1 (2008-07-31)
US20100029009A1 (2010-02-04)
Attorney, Agent or Firm:
BERESKIN & PARR LLP/S.E.N.C.R.L., S.R.L. (CA)
Claims:
CLAIMS:

1. A method of detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of ND, in a human subject, comprising: determining a sample methylation profile from a sample comprising DNA from said subject, said sample profile comprising:

(a) the methylation level of at least 5, optionally at least 10, at least 25, at least 41, at least 50, at least 75, at least 90, or all CpG loci from (i) Table 2 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i);

(b) the methylation level of all CpG loci from (i) Table 5 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i); and/or

(c) the methylation level of at least 5, optionally at least 7, at least 10, at least 25, at least 50, at least 53, at least 75, at least 100, at least 125, at least 140, at least 175, at least 200, at least 250 or all CpG loci from (i) Table 8 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i); and

determining the level of similarity of said sample profile to one or more control profiles, wherein (i) a high level of similarity of the sample profile to an ND specific control profile; (ii) a low level of similarity to a non-ND control profile; and/or (iii) a higher level of similarity to an ND specific control profile than to a non-ND control profile indicates the presence of, or an increased likelihood of, ND.

2. The method of claim 1, wherein the CpG loci of (a) comprise (i) CpG loci from Table 2 having an absolute delta-beta value > 0.05, optionally > 0.06, > 0.07, > 0.1, or > 0.15; and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i).

3. The method of claim 1, wherein the CpG loci of (b) comprise (i) CpG loci from Table 5 having an absolute delta-beta value > 0.05; and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i).

4. The method of claim 1, wherein the CpG loci of (c) comprise (i) CpG loci from Table 8 having an absolute delta-beta value > 0.05, optionally > 0.06, > 0.07, > 0.1, > 0.12, > 0.15 or > 0.17; and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i).

5. A method of detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of an ND, in a human subject, comprising: determining a sample methylation profile from a sample comprising DNA from said subject, said sample profile comprising the methylation level of CpG loci, wherein the CpG loci are the loci from Tables 2, 5 and/or 8 having an absolute delta-beta value > 0.05; and

determining the level of similarity of said sample profile to one or more control profiles, wherein (i) a high level of similarity of the sample profile to an ND specific control profile; (ii) a low level of similarity to a non-ND control profile; and/or (iii) a higher level of similarity to an ND specific control profile than to a non-ND control profile indicates the presence of, or an increased likelihood of, the ND.

6. The method of any one of claims 1 to 5, wherein determining the sample methylation profile comprises the steps:

a) providing the sample comprising genomic DNA from the subject;

b) optionally, isolating DNA from the sample;

c) optionally, treating DNA from the sample with bisulfite for a time and under conditions sufficient to convert non-methylated cytosines to uracils;

d) optionally, amplifying the DNA; and

e) determining the methylation level at the CpG loci by means of bisulfite sequencing, pyrosequencing, methylation-sensitive single-strand conformation analysis (MS-SSCA), high resolution melting analysis (HRM), combined bisulfite restriction analysis (COBRA), methylation-sensitive single nucleotide primer extension (MS-SnuPE), base-specific cleavage/MALDI-TOF, methylation-specific PCR (MSP), methylation-sensitive restriction enzyme-based methods, microarray-based methods, whole-genome bisulfite sequencing (WGBS, MethylC-seq or BS-seq), reduced-representation bisulfite sequencing (RRBS), and/or enrichment-based methods such as MeDIP-seq, MBD-seq, or MRE-seq.

7. The method of any one of claims 1 to 6, wherein a high level of similarity to the control profile is indicated by a correlation coefficient between the sample profile and the control profile having an absolute value between 0.5 and 1, optionally between 0.75 and 1, and a low level of similarity to the control profile is indicated by a correlation coefficient between the sample profile and the control profile having an absolute value between 0 and 0.5, optionally between 0 and 0.25.

8. The method of any one of claims 1 to 7, wherein a higher level of similarity to the ND specific profile than to the non-ND control profile is indicated by a higher correlation value computed between the sample profile and the ND specific profile than an equivalent correlation value computed between the sample profile and the non-ND control profile, optionally wherein the correlation value is a correlation coefficient.

9. The method of claim 7 or 8, wherein the correlation coefficient is a linear correlation coefficient, optionally a Pearson correlation coefficient.

10. The method of any one of claims 1 to 9, wherein methylation level is measured as a β-value.

11. The method of any one of claims 1 to 10, wherein determining the profile of methylated DNA from the subject comprises contacting the DNA with at least one agent that provides for determination of a CpG methylation status of at least one, optionally all, of the selected CpG loci, wherein the agent comprises an oligonucleotide-immobilized substrate comprising a plurality of capture probes, each capture probe comprising a pair of capture oligonucleotides, wherein the capture oligonucleotide pairs comprise (a) an oligonucleotide comprising nucleotide sequence complementary to or identical to a nucleotide sequence of genomic DNA comprising a selected CpG locus, and (b) an oligonucleotide comprising nucleotide sequence complementary to or identical to a nucleotide sequence of genomic DNA comprising the same selected CpG locus of (a), in which the cytosine residue of the CpG locus is replaced with a thymine residue.

12. The method of claim 11, wherein the contacting is under hybridizing conditions.

13. The method of any one of claims 1 to 12, wherein the methylation levels of the selected loci of at least one control profile are derived from one or more samples, optionally from historical methylation data for a patient or pool of patients.

14. The method of any one of claims 1 to 13, wherein the non-ND control profile comprises methylation levels for the selected CpG loci listed in Tables 2, 5 and/or 8.

15. The method of any one of claims 1 to 14, wherein the ND specific control profile comprises methylation levels for the selected CpG loci listed in Tables 2, 5 and/or 8.

16. The method of any one of claims 1 to 15, wherein methylation level of a selected CpG locus not listed in Tables 2, 5 and/or 8 is assumed to be equivalent to the methylation level of a CpG locus listed in Tables 2, 5 and/or 8 with which the selected DNA CpG locus is associated.

17. The method of any one of claims 1 to 16, wherein the sample is derived from blood, fibroblast tissue, buccal tissue, lymphoblastoid cell line, saliva or a prenatal sample, optionally a CVS, placenta, circulating fetal DNA and/or amniotic fluid sample.

18. The method of claim 17, wherein the sample is derived from blood.

19. The method of any one of claims 1 to 18, wherein the human subject is a fetus.

20. A method of detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of an ND, in a human subject, comprising:

determining a sample methylation profile from a sample comprising DNA from said subject, said sample profile comprising:

(a) the methylation level of at least 5, optionally at least 7, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45 or all the genes from Table 2;

(b) the methylation level of all the genes from Table 5; and/or

(c) the methylation level of at least 2, optionally at least 3, at least 5, at least 10, at least 15, at least 20, at least 25, at least 35, at least 50, at least 75, at least 100, at least 125, at least 150, at least 170, or all the genes from Table 8; and

determining the level of similarity of said sample profile to one or more control profiles, wherein (i) a high level of similarity of the sample profile to an ND specific control profile; (ii) a low level of similarity to a non-ND control profile; and/or (iii) a higher level of similarity to an ND specific control profile than to a non-ND control profile indicates the presence of, or an increased likelihood of, the ND.

21. The method of claim 20, wherein determining the methylation levels of the selected genes comprises the steps:

a) providing the sample comprising genomic DNA from the subject;

b) optionally, isolating DNA from the sample;

c) optionally, treating DNA from the sample with bisulfite for a time and under conditions sufficient to convert non-methylated cytosines to uracils;

d) optionally, amplifying the DNA; and

e) determining the methylation status at the selected genes by means of bisulfite sequencing, pyrosequencing, methylation-sensitive single-strand conformation analysis (MS-SSCA), high resolution melting analysis (HRM), combined bisulfite restriction analysis (COBRA), methylation-sensitive single nucleotide primer extension (MS-SnuPE), base-specific cleavage/MALDI-TOF, methylation-specific PCR (MSP), methylation-sensitive restriction enzyme-based methods, microarray-based methods, whole-genome bisulfite sequencing (WGBS, MethylC-seq or BS-seq), reduced-representation bisulfite sequencing (RRBS), and/or enrichment-based methods such as MeDIP-seq, MBD-seq, or MRE-seq.

22. The method of claim 20 or 21, wherein the methylation level is measured as a β-value.

23. The method of claim 22, wherein hypermethylation is indicated by the gene having a significantly higher methylation beta value in the ND specific control profile compared to the non-ND control profile and hypomethylation is indicated by the gene having a significantly lower methylation beta value in the ND specific control profile compared to the non-ND control profile.

24. The method of any one of claims 20 to 23, wherein the sample is derived from blood, fibroblast tissue, buccal tissue, lymphoblastoid cell line, saliva or a prenatal sample, optionally a CVS, placenta, circulating fetal DNA and/or amniotic fluid sample.

25. The method of claim 24, wherein the sample is derived from blood.

26. The method of any one of claims 20 to 25, wherein the human subject is a fetus.

27. A method of determining a course of management for an individual with a neuropsychiatric disorder (ND), or an increased likelihood of an ND, comprising:

a) identifying an individual with an ND or an increased likelihood of an ND, according to the method of any one of claims 1 to 26; and

b) assigning a course of management for the ND and/or symptoms of the ND, comprising i) testing for at least one medical condition associated with ND and ii) applying an appropriate medical intervention based on the results of the testing.

28. The method of claim 27, wherein the medical condition associated with ND is selected from developmental delay, ASD, ID, ADHD, and SZ.

29. A kit for detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of ND, in a sample, comprising:

a) at least one detection agent for determining the methylation level of:

(i) (A) at least 2, optionally at least 5, at least 10, at least 25, at least 41, at least 50, at least 75, at least 90, or all CpG loci from (i) Table 2 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i);

(B) the methylation level of all CpG loci from (i) Table 5 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i); and/or

(C) the methylation level of at least 5, optionally at least 7, at least 10, at least 25, at least 50, at least 53, at least 75, at least 100, at least 125, at least 140, at least 175, at least 200, at least 250 or all CpG loci from (i) Table 8 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i); and/or

(ii) (A) the methylation level of at least 5, optionally at least 7, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45 or all the genes from Table 2;

(B) the methylation level of all the genes from Table 5; and/or

(C) the methylation level of at least 2, optionally at least 3, at least 5, at least 10, at least 15, at least 20, at least 25, at least 35, at least 50, at least 75, at least 100, at least 125, at least 150, at least 170, or all the genes from Table 8; and

b) instructions for use.

30. The kit according to claim 29, further comprising bisulfite conversion reagents, methylation-dependent restriction enzymes, methylation-sensitive restriction enzymes, PCR reagents, probes and/or primers.

31. The kit according to claim 29 or 30, further comprising a computer-readable medium that causes a computer to compare methylation levels from a sample at the selected CpG loci to one or more control profiles and compute a correlation value between the sample and control profile.

Description:
TITLE: DNA METHYLATION MARKERS FOR NEUROPSYCHIATRIC DISORDERS AND METHODS, USES AND KITS THEREOF

RELATED APPLICATIONS

[0001] This application claims the benefit of priority to United States Provisional Application No. 62/434,495 filed December 15, 2016, the contents of which are incorporated herein by reference in their entirety.

FIELD

[0002] The disclosure relates to methods and kits for detecting and/or screening for a neuropsychiatric disorder or an increased likelihood or risk of a neuropsychiatric disorder (ND), in a human subject.

BACKGROUND

[0003] Neuropsychiatric disorders (NDs) including autism spectrum disorder (ASD), intellectual disability (ID), attention deficit/hyperactivity disorder (ADHD) and schizophrenia (SZ) affect ~3% of the population. It is clear that genetics plays an important etiologic role in these disorders. However, the same genetic change can be associated with very broad neurodevelopmental, physiological and morphological phenotypes/outcomes. That is, genetics contributes only a fraction of the risk for a specific phenotype, e.g. ~25% in ASD. The relative contributions of other factors such as environment and epigenetics, the latter of which refers to a mechanism of regulation of gene expression without altering the DNA sequence itself, are not well understood. Heterogeneity, both genetic and phenotypic, poses one of the greater challenges in understanding the underlying mechanisms of these disorders.

[0004] The molecular basis of genetically and phenotypically heterogeneous NDs has been significantly advanced by the more recent findings obtained through genomic research. Both microarrays and next generation sequencing, specifically exome sequencing, have dramatically increased the rate at which new copy number variants and gene mutations causing NDs are being identified. A significant number of genes implicated in the etiology of these disorders are epigenes. Epigenetics has emerged as a vital genome-wide regulatory mechanism that modulates the transcriptome via DNA methylation (DNAm), histone modifications, and chromatin conformation. Such epigenetic markers are precisely programmed during normal development across different tissues and cell types by an estimated 800 epigenes (Turinsky, A.L. et al., 2010). To date, genomic aberrations in approximately 80 epigenes have been identified as being causative in syndromic and non-syndromic ID (S-ID and NS-ID) (Bjornsson, H.T., 2015; Kleefstra, T. et al., 2014), where some of these genes also overlap with known ND-associated risk genes (Tatton-Brown, K. et al., 2014; Bernier, R. et al., 2014; Grafodatskaya, D. et al., 2013; Lalani, S.R. et al., 2006; Choufani, S. et al., 2015). Many of these conditions present with overlapping features, making them difficult to distinguish clinically, particularly at a young age. In contrast, despite a role for epigenetics in the etiology of NDs, given the heterogeneity, it is not expected that each individual with an ND will necessarily have detectable epigenetic aberrations. For example, preliminary evidence suggests that Rett syndrome (OMIM# 312750), which is strongly associated with ASD and where the majority of cases are caused by the loss of function of an epigene called Methyl CpG binding Protein (MECP2), does not result in appreciable DNAm changes (unpublished data). Furthermore, studies of DNAm in mixed ASD cohorts have yielded conflicting results, with some reports of modest, yet significant, differences, and other reports of no differences (Berko, E.R. et al., 2014; Ladd-Acosta, C. et al., 2014; Nguyen, A. et al., 2010; Wong, C.C. et al., 2014; Ginsberg, M.R. et al., 2012). These findings are most likely driven by etiologic heterogeneity of the cases studied.

SUMMARY

[0005] The present inventors assessed two types of genomic aberrations that have all been associated with an increased risk for neuropsychiatric disorders, including autism spectrum disorder (ASD) and others: 1) copy number variants (CNVs), which often involve large genomic regions and multiple genes, and 2) single gene mutations. Two genomic regions not known to contain genes that have confirmed and direct roles in epigenetic regulation, or epigenes, were chosen to examine the epigenetic outcomes that accompany these genomic alterations that are variably associated with neuropsychiatric disorders (NDs). In addition, mutations in one epigene were studied. This mutation and these CNVs are associated with, but do not always present with, neuropsychiatric outcomes including ASD, ID, ADHD, SZ or congenital anomalies. The de novo gene mutation that was chosen causes haploinsufficiency in an epigene (chromodomain helicase DNA-binding protein 8 [CHD8]), and the two pathogenic CNVs that were selected are from genomic regions on chromosomes 16 (16p11.2 deletions) and 22 (22q11.2 deletions), which do not yet have a known direct effect on epigenes. These pathogenic CNVs encompass genomic regions (involving specific genes) that have previously been shown to be associated with aberrant developmental trajectories for a significant number of individuals carrying these CNVs. CHD8 mutations and 16p11.2 deletions are the two most common single gene mutations and CNVs, respectively, associated with ASD, each accounting for no more than 1% of ASD cases (Bernier, R. et al., 2014; Hanson, E. et al., 2015). 16p11.2 deletions and duplications and 22q11.2 deletions are also variably associated with ASD and/or SZ. In contrast, 22q11.2 duplications generally have a milder phenotype, with evidence for protection against SZ risk. All 3 genomic aberrations are also associated with ID to varying degrees.

[0006] The present disclosure identifies DNA methylation (DNAm) markers which are capable of differentiating pathogenic 16p11.2 or 22q11.2 deletions or pathogenic CHD8 mutations from controls. The DNAm markers and the methods of their use described herein may provide useful alternative or supplementary diagnostics to currently available methods of detecting and/or screening for NDs, including autism spectrum disorder (ASD), attention deficit hyperactivity disorder (ADHD), intellectual disability (ID) and schizophrenia (SZ), or likelihood of said NDs, because risk or increased likelihood of such disorders is known to be associated with these CNV deletions and the pathogenic mutation disclosed herein.

[0007] Accordingly, in an aspect, there is provided a method of detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of an ND, in a human subject, comprising determining a sample DNA methylation (DNAm) profile from a sample of DNA from said subject, said sample profile comprising:

(a) the methylation level of at least 5, optionally at least 10, at least 25, at least 41, at least 50, at least 75, at least 90, or all CpG loci from (i) Table 2 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i);

(b) the methylation level of all CpG loci from (i) Table 5 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i); and/or

(c) the methylation level of at least 5, optionally at least 7, at least 10, at least 25, at least 50, at least 53, at least 75, at least 100, at least 125, at least 140, at least 175, at least 200, at least 250 or all CpG loci from (i) Table 8 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i).

[0008] The method further comprises determining the level of similarity of said sample profile to one or more control profiles, wherein (i) a high level of similarity of the sample profile to an ND specific control profile; (ii) a low level of similarity to a non-ND control profile; and/or (iii) a higher level of similarity to an ND specific control profile than to a non-ND control profile indicates the presence of, or an increased likelihood of, the ND.

[0009] In an embodiment, the CpG loci of (a) comprise (i) CpG loci from Table 2 having an absolute delta-beta value > 0.05, optionally > 0.06, > 0.07, > 0.1, or > 0.15; and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i).

[0010] In an embodiment, the CpG loci of (b) comprise (i) CpG loci from Table 5 having an absolute delta-beta value > 0.05; and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i).

[0011] In an embodiment, the CpG loci of (c) comprise (i) CpG loci from Table 8 having an absolute delta-beta value > 0.05, optionally > 0.06, > 0.07, > 0.1, > 0.12, > 0.15 or > 0.17; and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i).
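The delta-beta thresholds in the preceding embodiments can be illustrated with a minimal Python sketch. It assumes that delta-beta for a locus is the mean β-value in the ND group minus the mean β-value in the non-ND group, which is a common convention but is not defined in this part of the disclosure. The function names, locus identifiers and β-values are hypothetical and are not taken from Tables 2, 5 or 8.

```python
from statistics import mean


def delta_beta(case_betas, control_betas):
    """Mean beta-value in cases minus mean beta-value in controls (assumed definition)."""
    return mean(case_betas) - mean(control_betas)


def select_loci(case_by_locus, control_by_locus, threshold=0.05):
    """Keep loci whose absolute delta-beta is strictly greater than the threshold."""
    selected = {}
    for locus, case_betas in case_by_locus.items():
        db = delta_beta(case_betas, control_by_locus[locus])
        if abs(db) > threshold:
            selected[locus] = db
    return selected


# Hypothetical beta-values for two illustrative loci (not data from the disclosure).
cases = {"cg_example_1": [0.80, 0.82, 0.78], "cg_example_2": [0.40, 0.42, 0.41]}
controls = {"cg_example_1": [0.60, 0.62, 0.61], "cg_example_2": [0.39, 0.41, 0.40]}

selected = select_loci(cases, controls)
print(sorted(selected))
```

Raising the threshold argument (for example to 0.1 or 0.15) would mirror the optional, stricter cut-offs recited above.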

[0012] The ND, disclosed herein, may be any ND that is associated with the CNV 16p11.2 deletion with respect to the CpG loci of Table 2; may be any ND that is associated with the CNV 22q11.2 deletion with respect to the CpG loci of Table 5 and/or may be any ND that is associated with a pathogenic CHD8 mutation with respect to the CpG loci of Table 8. For example, the ND may be autism spectrum disorder (ASD), schizophrenia (SZ), attention deficit/hyperactivity disorder (ADHD) or intellectual disability (ID). In an embodiment, the ND is autism spectrum disorder. In another embodiment, the ND is SZ.

[0013] In another embodiment, the sample profile comprises CpG loci from a), b) and c).

[0014] In another aspect, there is provided a method of detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of ND, in a sample from a human subject, comprising:

(a) determining a sample methylation profile from a sample of DNA from said subject, said sample profile comprising

(i) the methylation level of CpG loci, wherein the CpG loci are the loci from Table 2 having an absolute delta-beta value > 0.05;

(ii) the methylation level of CpG loci, wherein the CpG loci are the loci from Table 5 having an absolute delta-beta value > 0.05; and/or (iii) the methylation level of CpG loci, wherein the CpG loci are the loci from Table 8 having an absolute delta-beta value > 0.05; and

(b) determining the level of similarity of said sample profile to one or more control profiles, wherein (i) a high level of similarity of the sample profile to an ND specific control profile; (ii) a low level of similarity to a non-ND control profile; and/or (iii) a higher level of similarity to an ND specific control profile than to a non-ND control profile indicates the presence of, or an increased likelihood of, ND.

[0015] In another embodiment, a sample profile comprises CpG loci from (i), (ii) and (iii).

[0016] In an embodiment, the CpG loci comprise CpG loci from Table 2 having an absolute delta-beta value > 0.05, optionally > 0.06, > 0.07, > 0.1, or > 0.15.

[0017] In an embodiment, the CpG loci comprise CpG loci from Table 8 having an absolute delta-beta value > 0.05, optionally > 0.06, > 0.07, > 0.1, > 0.12, > 0.15 or ≥ 0.17.

[0018] In an embodiment, the sample is a blood sample.

[0019] In another embodiment, determining the sample methylation profile comprises the steps:

a) providing the sample comprising genomic DNA from the subject;

b) optionally, isolating DNA from the sample;

c) optionally, treating DNA from the sample with sodium bisulfite for a time and under conditions sufficient to convert non-methylated cytosines to uracils;

d) optionally, amplifying the DNA; and

e) determining the methylation level at the CpG loci by means of bisulfite sequencing, pyrosequencing, methylation-sensitive single-strand conformation analysis (MS-SSCA), high resolution melting analysis (HRM), combined bisulfite restriction analysis (COBRA), methylation-sensitive single nucleotide primer extension (MS-SnuPE), base-specific cleavage/MALDI-TOF, methylation-specific PCR (MSP), methylation-sensitive restriction enzyme-based methods, microarray-based methods, whole-genome bisulfite sequencing (WGBS, MethylC-seq or BS-seq), reduced-representation bisulfite sequencing (RRBS), and/or enrichment-based methods such as MeDIP-seq, MBD-seq, or MRE-seq.

[0020] In another embodiment, a higher level of similarity to the ND specific control profile than to the non-ND control profile, respectively, is indicated by a higher correlation value computed between the sample profile and the ND specific control profile than an equivalent correlation value computed between the sample profile and the non-ND control profile, optionally wherein the correlation value is a correlation coefficient.

[0021] In an embodiment, the correlation coefficient is a linear correlation coefficient, optionally a Pearson correlation coefficient or a Spearman correlation coefficient.

[0022] In yet another embodiment, a high level of similarity to the control profile is indicated by a Pearson correlation coefficient between the sample profile and the control profile having an absolute value between 0.5 and 1, optionally between 0.75 and 1, and a low level of similarity to the control profile is indicated by a correlation coefficient between the sample profile and the control profile having an absolute value between 0 and 0.5, optionally between 0 and 0.25.
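As a rough illustration of these similarity bands, the following Python sketch maps a correlation coefficient to a similarity level by its absolute value. The function name and the "intermediate" label are assumptions made for illustration; the disclosure does not say how the boundary value 0.5 itself is treated, nor how a value falling between the optional 0.25 and 0.75 cut-offs is labelled.

```python
def similarity_level(r, high_cutoff=0.5, low_cutoff=0.5):
    """Classify a correlation coefficient by its absolute value.

    Assumption: a value exactly at high_cutoff counts as "high".
    With the optional stricter cut-offs (0.75 and 0.25), values in
    between are reported as "intermediate" (a hypothetical label).
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    magnitude = abs(r)
    if magnitude >= high_cutoff:
        return "high"
    if magnitude <= low_cutoff:
        return "low"
    return "intermediate"


print(similarity_level(0.9))               # default bands
print(similarity_level(0.6, 0.75, 0.25))   # optional stricter bands
```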

[0023] In an embodiment, the methylation level is measured as a β-value.

[0024] In another embodiment, an ND score is calculated according to the following formula:

ND score(B) = r(B, ND profile) - r(B, non-ND profile)

where r is a Pearson correlation coefficient, and B is a vector of DNAm levels across the selected methylation loci in the sample, for example methylation loci listed in Tables 2, 5 or 8.
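The ND score formula can be sketched in Python as follows. This is a minimal illustration, assuming that the sample and both control profiles are given as equal-length lists of β-values in the same locus order; the function names and the numeric profiles are hypothetical and do not come from the disclosure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2 loci)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def nd_score(sample, nd_profile, non_nd_profile):
    """ND score(B) = r(B, ND profile) - r(B, non-ND profile)."""
    return pearson(sample, nd_profile) - pearson(sample, non_nd_profile)


# Hypothetical beta-values over four loci (illustration only).
nd_profile = [0.80, 0.20, 0.70, 0.30]
non_nd_profile = [0.30, 0.70, 0.20, 0.80]
sample = [0.75, 0.25, 0.65, 0.35]

score = nd_score(sample, nd_profile, non_nd_profile)
print(f"ND score: {score:.3f}")
```

Because each correlation lies between -1 and 1, the score lies between -2 and 2. A positive score means the sample correlates more strongly with the ND specific control profile than with the non-ND control profile.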

[0025] In another embodiment, determining the sample methylation profile comprises contacting the DNA with at least one agent that provides for determination of a CpG methylation status of at least one, optionally all, of the selected CpG loci, wherein the agent comprises an oligonucleotide-immobilized substrate comprising a plurality of capture probes, each capture probe comprising a pair of capture oligonucleotides, wherein the capture oligonucleotide pairs comprise (a) an oligonucleotide comprising nucleotide sequence complementary to or identical to a nucleotide sequence of genomic DNA comprising a selected CpG, and (b) an oligonucleotide comprising nucleotide sequence complementary to or identical to a nucleotide sequence of genomic DNA comprising the same selected CpG locus of (a), in which the cytosine residue of the CpG locus is replaced with a thymine residue.

[0026] In yet another embodiment, the contacting is under hybridizing conditions.

[0027] In an embodiment, the methylation levels of the selected loci of at least one ND specific control profile are derived from one or more samples, optionally from historical methylation data for a patient or pool of patients, known to have a 16p11.2 pathogenic deletion or duplication, a 22q11.2 pathogenic deletion or duplication or a CHD8 pathogenic mutation. It is expected that subprofiles specific to certain ND phenotypes will be determined and used as a particular ND specific control profile.

[0028] In another embodiment, the non-ND control profile comprises DNAm levels for the selected ND loci listed in Tables 2, 5 and/or 8. In yet another embodiment, the ND specific control profile comprises DNAm levels for the selected CpG loci listed in Tables 2, 5 and/or 8. In an embodiment, the methylation level of an associated CpG locus not listed in Tables 2, 5 and/or 8 is assumed to be equivalent to the methylation level of the CpG locus listed in Tables 2, 5 and/or 8 with which it is associated.
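The assumption about associated CpG loci can be sketched in Python as a simple nearest-locus lookup. This is illustrative only: the function name, the tuple layout and the coordinates are hypothetical, and the sketch assumes that "within 300 nucleotides" is inclusive and that, when several listed loci are in range, the nearest one is used.

```python
def assumed_beta(chrom, pos, listed_loci, window=300):
    """Return the beta-value of the nearest listed locus within the window.

    listed_loci is an iterable of (chromosome, position, beta) tuples.
    Returns None when no listed locus on the same chromosome is in range.
    """
    best = None
    for listed_chrom, listed_pos, beta in listed_loci:
        if listed_chrom != chrom:
            continue
        distance = abs(listed_pos - pos)
        if distance <= window and (best is None or distance < best[0]):
            best = (distance, beta)
    return None if best is None else best[1]


# Hypothetical coordinates and beta-values (not taken from the Tables).
listed = [("chr16", 1000, 0.8), ("chr16", 1500, 0.3)]

print(assumed_beta("chr16", 1100, listed))               # nearest listed locus is 100 nt away
print(assumed_beta("chr16", 1200, listed, window=150))   # no listed locus within 150 nt
```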

[0029] In an embodiment, the sample is derived from blood, fibroblast tissue, buccal tissue, lymphoblastoid cell line, saliva or a prenatal sample. The prenatal sample is optionally a CVS, placenta, circulating fetal DNA and/or amniotic fluid sample. In another embodiment, the sample is derived from a tissue biopsy. In yet another embodiment, the sample is derived from blood.

[0030] In another embodiment, the human subject is a fetus.

[0031] Another aspect provides a method of detecting and/or screening for neuropsychiatric disorders (ND), or an increased likelihood of ND, in a human subject, comprising determining a sample DNA methylation (DNAm) profile from a sample of DNA from said subject, said sample profile comprising:

(a) the methylation level of at least 5, optionally at least 7, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45 or all the genes from Table 2;

(b) the methylation level of all the genes from Table 5; and/or

(c) the methylation level of at least 2, optionally at least 3, at least 5, at least 10, at least 15, at least 20, at least 25, at least 35, at least 50, at least 75, at least 100, at least 125, at least 150, at least 170, or all the genes from Table 8.

[0032] In another embodiment, the sample profile comprises genes from (a), (b) and (c).

[0033] The method further comprises determining the level of similarity of said sample profile to one or more control profiles, wherein (i) a high level of similarity of the sample profile to an ND specific control profile; (ii) a low level of similarity to a non-ND control profile; and/or (iii) a higher level of similarity to an ND specific control profile than to a non-ND control profile indicates the presence of, or an increased likelihood of, ND.
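By way of illustration only, the comparison in criterion (iii) may be sketched as follows. The sketch assumes Pearson correlation as the similarity metric (the metric named in the figure descriptions) and that each profile is an ordered list of β values at the same selected CpG loci. The function names and the numeric values are hypothetical and are not taken from the disclosure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must cover the same loci (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def classify_sample(sample, nd_control, non_nd_control):
    """Criterion (iii): compare similarity to the ND and non-ND control profiles."""
    r_nd = pearson(sample, nd_control)
    r_non_nd = pearson(sample, non_nd_control)
    label = "ND-like" if r_nd > r_non_nd else "non-ND-like"
    return label, r_nd, r_non_nd


# Hypothetical beta values at five selected CpG loci (illustrative only).
nd_control = [0.90, 0.10, 0.80, 0.20, 0.70]
non_nd_control = [0.50, 0.50, 0.40, 0.60, 0.50]
sample = [0.85, 0.15, 0.75, 0.25, 0.65]

label, r_nd, r_non_nd = classify_sample(sample, nd_control, non_nd_control)
print(label, round(r_nd, 3), round(r_non_nd, 3))
```

Criteria (i) and (ii) would additionally require thresholds for "high" and "low" similarity. No such thresholds are assumed here.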

[0034] Another aspect of the disclosure provides a method of assigning a course of management for an individual with a neuropsychiatric disorder (ND), or an increased likelihood of ND, comprising:

a) identifying an individual with ND or an increased likelihood of ND, according to the methods described herein; and

b) assigning a course of management for the ND, and/or symptoms of the ND, comprising i) testing for at least one medical condition associated with the ND and ii) applying an appropriate medical intervention based on the results of the testing.

[0035] In one embodiment, the medical condition is selected from, but not limited to, developmental difficulties, ID and NDs, including neurodevelopmental disorders such as ASD and ADHD for both pathogenic CNVs and CHD8 mutations; for 16p11.2 deletions/duplications: obesity, speech, impaired language and motor function, morphological anomalies (e.g. macro- and microcephaly); for 22q11.2 deletions/duplications: congenital anomalies (e.g. heart defects, cleft palate), impaired immune function; for both 16p11.2 and 22q11.2 CNVs: SZ and seizures.

[0036] Another aspect of the disclosure provides a kit for detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of ND, in a sample, comprising:

a) at least one detection agent for determining the methylation level of:

(i) (A) at least 5, optionally at least 10, at least 25, at least 41, at least 50, at least 75, at least 90, or all CpG loci from (i) Table 2 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i);

(B) the methylation level of all CpG loci from (i) Table 5 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i); and/or

(C) the methylation level of at least 5, optionally at least 7, at least 10, at least 25, at least 50, at least 53, at least 75, at least 100, at least 125, at least 140, at least 175, at least 200, at least 250 or all CpG loci from (i) Table 8 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i); and/or

(ii) (A) the methylation level of at least 5, optionally at least 7, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45 or all the genes from Table 2;

(B) the methylation level of all the genes from Table 5; and/or

(C) the methylation level of at least 2, optionally at least 3, at least 5, at least 10, at least 15, at least 20, at least 25, at least 35, at least 50, at least 75, at least 100, at least 125, at least 150, at least 170, or all the genes from Table 8; and

b) instructions for use.

[0037] In an embodiment, the kit further comprises bisulfite conversion reagents, methylation-dependent restriction enzymes, methylation-sensitive restriction enzymes, PCR reagents, probes and/or primers. In an embodiment, the probes or primers are specific to the selected CpG loci, selected from the loci in Tables 2, 5 and/or 8.

[0038] In an embodiment, the kit further comprises a computer-readable medium that causes a computer to compare methylation levels from a sample at the selected CpG loci to one or more control profiles and compute a correlation value between the sample and control profile. In an embodiment, the computer-readable medium obtains the control profile from historical methylation data for a patient or pool of patients known to have, or not have, one of the pathogenic deletions or mutation disclosed herein and often an associated ND, or known to have a particular methylation signature determined to be specific to a range of NDs as disclosed herein. In some embodiments, the computer-readable medium causes a computer to update the control profile based on the testing results from the testing of a new patient.

[0039] Other features and advantages of the present disclosure will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples while indicating embodiments of the disclosure are given by way of illustration only, since various changes and modifications within the spirit and scope of the disclosure will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

[0040] Embodiments are described below in relation to the drawings in which:

[0041] Figure 1 shows a volcano plot showing the relationship between the average change in blood DNAm in the 16p11.2 deletion cohort compared to normal controls (Δβ effect size, X-axis) and the statistical significance of such changes (p-value of the F-test implemented in the minfi software package, after Benjamini-Hochberg correction for multiple testing, shown in logarithmic scale, Y-axis). Each semi-transparent point represents one of the CpG sites on the HumanMethylation450K array. Array probes on sex chromosomes, near SNPs and cross-reactive probes were removed. The horizontal line represents the statistical significance level p=0.05. The vertical lines represent the effect size of 5% change in DNAm. The data cohorts contained 8 affected samples and 20 normal controls.

[0042] Figure 2 shows hierarchical clustering of 8 affected samples (black bar) and 20 control samples (grey bar) from blood. The heatmap shows the clustering based on the DNAm levels across the 91 CpG sites that exhibited significant changes in methylation (p<0.05 and at least 5% DNAm difference) between the two cohorts. Clustering was performed based on the Pearson correlation metric with average linkage (correlation scale shown on the right).

[0043] Figure 3 shows classification of various categories of blood DNAm samples. Two median-methylation profiles were built over the 91 significant CpGs: one using the 8 mutation samples (circles), and another using the 20 Control samples (squares). 1056 normal blood DNAm samples derived from GEO (crosses) were then examined, almost all of which were more similar to the Control profile (specificity = 98.9%). 6 variants were also classified, of which 2 cases showed a higher similarity to the pathogenic mutation cases and the remaining 4 variants were more similar to the controls. Pearson correlation was used as the similarity metric.

[0044] Figure 4 shows a volcano plot showing the relationship between the average change in blood DNAm in the 22q11.2 deletion cohort compared to normal controls (Δβ effect size, X-axis) and the statistical significance of such changes (p-value of the F-test implemented in the minfi software package, after Benjamini-Hochberg correction for multiple testing, shown in logarithmic scale, Y-axis). Each semi-transparent point represents one of the CpG sites on the HumanMethylation450K array. Array probes on sex chromosomes, near SNPs and cross-reactive probes were removed. The horizontal line represents the statistical significance level p=0.01. The vertical lines represent the effect size of 5% change in DNAm. The data cohorts contained 7 affected samples and 20 normal controls.

[0045] Figure 5 shows hierarchical clustering of 7 affected samples (black bar) and 20 control samples (grey bar) from blood. The heatmap shows the clustering based on the DNAm levels across the 51 CpG sites that exhibited significant changes in methylation (p<0.01 and at least 5% DNAm difference) between the two cohorts. Clustering was performed based on the Pearson correlation metric with average linkage (correlation scale shown on the right).

[0046] Figure 6 shows classification of various categories of blood DNAm samples. Two median-methylation profiles were built over the 51 significant CpGs: one using the 7 mutation samples (circles), and another using the 20 Control samples (squares). 1056 normal blood DNAm samples derived from GEO (crosses) were then examined, almost all of which were more similar to the Control profile (specificity = 82.5%). Pearson correlation was used as the similarity metric.

[0047] Figure 7 shows a volcano plot showing the relationship between the average change in blood DNAm in the CHD8 mutation cohort compared to normal controls (Δβ effect size, X-axis) and the statistical significance of such changes (p-value of the F-test implemented in the minfi software package, after Benjamini-Hochberg correction for multiple testing, shown in logarithmic scale, Y-axis). Each semi-transparent point represents one of the CpG sites on the HumanMethylation450K array. Array probes on sex chromosomes, near SNPs and cross-reactive probes were removed. The horizontal line represents the statistical significance level p=0.01. The vertical lines represent the effect size of 5% change in DNAm. The data cohorts contained 8 affected samples and 85 normal controls.

[0048] Figure 8 shows hierarchical clustering of 8 affected samples (black bar) and 85 control samples (grey bar) from blood. The heatmap shows the clustering based on the DNAm levels across the 264 CpG sites that exhibited significant changes in methylation (p<0.01 and at least 5% DNAm difference) between the two cohorts. Clustering was performed based on the Pearson correlation metric with average linkage (correlation scale shown on the right).

Figure 9 shows classification of various categories of blood DNAm samples. Two median-methylation profiles were built over the 264 significant CpGs: one using the 8 mutation samples (circles), and another using the 85 Control samples (squares). 1056 normal blood DNAm samples derived from GEO (crosses) were then examined, almost all of which were more similar to the Control profile (specificity = 99.6%). 4 variants were also classified, none of which showed a higher similarity to the pathogenic mutation cases, therefore classifying them as benign variants. These 4 variants were known to harbour inherited, missense mutations and were therefore not expected to show higher similarity to the pathogenic mutation cases. Pearson correlation was used as the similarity metric.
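By way of illustration only, the hierarchical clustering described for Figures 2, 5 and 8 (Pearson correlation metric with average linkage) may be sketched as follows. The sketch assumes that the distance between two samples is taken as 1 minus their Pearson correlation, which is one common convention and is not stated in the disclosure. The sample names and β values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    names = list(profiles)
    # Pairwise sample distances, assumed here to be 1 - Pearson correlation.
    dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dist[frozenset((a, b))] = 1.0 - pearson(profiles[a], profiles[b])

    def cluster_distance(c1, c2):
        # Average linkage: mean of all between-cluster sample distances.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical beta values at four CpG sites (illustrative only).
profiles = {
    "affected_1": [0.90, 0.10, 0.80, 0.20],
    "affected_2": [0.85, 0.15, 0.82, 0.18],
    "control_1": [0.20, 0.80, 0.30, 0.70],
    "control_2": [0.25, 0.78, 0.28, 0.72],
}
for cluster_a, cluster_b, distance in average_linkage(profiles):
    print(cluster_a, cluster_b, round(distance, 4))
```

The exhaustive pair search is adequate for cohorts of the size described (tens of samples). A dedicated clustering library would be preferable for larger data sets.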

DETAILED DESCRIPTION

[0049] The present inventors have demonstrated that patients with copy number variants (CNVs) of 16p11.2 and 22q11.2 or pathogenic mutations in CHD8 have unique blood DNAm signatures at specific DNAm genome-wide markers that are not confounded with sex, age or ethnic background. These DNAm profiles have the ability to correctly classify single non-synonymous sequence mutation variants in CHD8 or CNVs as pathogenic or benign variants and are not detected in >1000 independent DNA methylomes of patients without the specific genetic aberrations described herein. Furthermore, since only some patients with the same mutations or CNVs also present with neuropsychiatric and/or dysmorphological phenotypes, there is the potential that the identification of disease- and phenotype-specific DNAm signatures will distinguish those individuals with and without such phenotypes. Despite the variable phenotypes, DNAm signatures are strong enough to differentiate between those with and without the pathogenic CNVs in question, regardless of their presentations.

I. Definitions

[0050] Terms of degree such as "substantially", "about" and "approximately" as used herein mean a reasonable amount of deviation of the modified term such that the end result is not significantly changed. These terms of degree should be construed as including a deviation of at least ±5% of the modified term if this deviation would not negate the meaning of the word it modifies or unless the context suggests otherwise to a person skilled in the art.

[0051] As used herein, the term "isolated" or "purified" when used in relation to a DNA molecule refers to a DNA molecule that is extracted and separated from one or more contaminants with which it naturally occurs.

[0052] As used herein, "methylation" refers specifically to DNA methylation or DNAm, and more particularly to a modification in which a methyl group or hydroxymethyl group is added to the 5 position of a cytosine residue to form a 5-methyl cytosine (5-mCyt) or 5-hydroxymethylcytosine (5-hmC).

[0053] As used herein, "CpG locus" or "methylation locus" refers to an individual CpG dinucleotide sequence in genomic DNA which is capable of being methylated. Individual CpG loci may be identified by reference to an Illumina CpG locus (Illumina ID #) which is defined by a chromosome number, genomic coordinate (referenced to NCBI, hg19), genome build (37), and +/- strand designation to unambiguously define each CpG locus. The genomic information is publicly available through the UCSC genome browser at https://genome.ucsc.edu/.

[0054] The term "methylation level" refers to a measure of the amount of methylation at a target site (for example, a CpG locus) within a DNA molecule in a sample. For example, the level of methylation can be measured for one or more CpG dinucleotides, or for a region of DNA. If the methylation level of a target site within a sample is higher than a reference level, the sample is considered to have increased methylation relative to the reference at the target site. Conversely, if the methylation level of a target site within a sample is lower than the reference level, the sample is considered to have a decreased methylation level relative to the reference at the target site. The target site may be an individual CpG locus or a region of DNA comprising multiple CpG loci, for example, a gene promoter. Methylation levels of a target site may be measured by methods known in the art, for example, as a "β value" or "beta value", which is calculated as:

β value = intensity of the methylated target (M)/(intensity of the unmethylated target (U) + intensity of the methylated target (M) + 100)

[0055] A β value of zero indicates no methylation and a value of one indicates 100% methylation.
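By way of illustration only, the β value calculation above may be sketched as follows. The function name and the intensity values are hypothetical. The constant 100 in the denominator is taken from the formula as stated.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Beta value = M / (U + M + offset), following the formula in the text."""
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (unmethylated + methylated + offset)


# Hypothetical probe intensities (illustrative only).
print(beta_value(9000, 900))   # mostly methylated target
print(beta_value(0, 5000))     # unmethylated target
```

Because of the offset, the denominator stays positive even when both intensities are zero, and the computed value approaches but does not reach one for a fully methylated target.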

[0056] As used herein, the term "methylation status" refers to whether a specified target DNA site is methylated or not methylated. The target site may be an individual CpG locus or a region of DNA comprising multiple CpG loci, for example, a gene promoter. For example, a target site may have a methylation status of "methylated" or "hypermethylated" if the target has significantly higher methylation beta value in an ND specific control profile compared to a non-ND control profile. Conversely, a target site may have a methylation status of "not methylated" or "hypomethylated" if the target has significantly lower methylation beta value in an ND specific control profile compared to a non-ND control profile. The methylation status of at least some CpG loci in a pathogenic duplication region of a CNV disclosed herein is likely to be opposite to that of the deletion, such that a locus that is hypermethylated in the deletion, is likely to be hypomethylated in a duplication.

[0057] As used herein, the term "delta beta" or "delta β" refers to the difference between the β value of a methylation target in two different samples, for example, the β value of a methylation target in an ND specific control profile and the β value of the same methylation target in a non-ND control profile.

[0058] As used herein the term "gene" refers to a genomic DNA sequence that comprises a coding sequence associated with the production of a polypeptide or polynucleotide product (e.g., rRNA, tRNA). The methylation level of a gene as used herein, encompasses the methylation level of sequences which are known or predicted to affect expression of the gene, including the promoter, enhancer, and transcription factor binding sites. As used herein, the term "enhancer" refers to a cis-acting region of DNA that is located up to 1 Mbp (upstream or downstream) of a gene.

[0059] As used herein, the term "sample methylation profile" or "sample profile" refers to the methylation levels at one or more target sequences in a subject's genomic DNA. The target sequence may be an individual CpG locus or a region of DNA comprising multiple CpG loci, for example, a gene promoter or CpG island. The methylation profile of a sample tested according to the methods disclosed herein is referred to as a sample profile.

[0060] In some embodiments, the sample methylation profile is compared to one or more control profiles. The control profile may be a reference value and/or may be derived from one or more samples, optionally from historical methylation data for a patient or pool of patients who are known to have (ND specific or positive control), or not have an ND (non-ND or negative control). In one embodiment the patient or pool of patients is known to have a 16p11.2 pathogenic deletion or duplication, a 22q11.2 pathogenic deletion or duplication, or a pathogenic mutation in CHD8 and increased risk for NDs. In such cases, the historical methylation data can be a value that is continually updated as further samples are collected and individuals are identified as ND or not-ND or as individuals are identified with a particular ND phenotype. It will be understood that the control profile represents an average of the methylation levels for selected CpG loci as described herein. Average methylation values may, for example, be the mean values or median values.

[0061] For example, an "ND specific control profile" or "ND control profile" may be generated by measuring the methylation levels at specified target sequences in genomic DNA from an individual subject, or population of subjects, who are known to have a 16p11.2 pathogenic deletion or duplication, a 22q11.2 pathogenic deletion or duplication, or a CHD8 pathogenic mutation. The ND specific control profile or ND control profile also includes the methylation levels at specified target sequences in genomic DNA from an individual subject, or population of subjects, who are known to have a particular ND, such as ASD, ADHD, SZ or ID, and a 16p11.2 pathogenic deletion or duplication, a 22q11.2 pathogenic deletion or duplication, or a CHD8 pathogenic mutation. Similarly, a "non-ND control profile" may be generated by measuring the methylation levels at specified target sequences in genomic DNA from an individual subject or population of subjects who are known to not have ND.
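By way of illustration only, building a control profile as a per-locus median over a pool of samples, and updating it as a further sample is collected, may be sketched as follows. The sketch assumes each sample is a mapping from locus identifier to β value and that the pool is retained so that the median can be recomputed. The locus identifiers and values are hypothetical.

```python
from statistics import median


def build_control_profile(samples):
    """Return the per-locus median beta value across a pool of samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    loci = set(samples[0])
    for sample in samples:
        if set(sample) != loci:
            raise ValueError("all samples must cover the same loci")
    return {locus: median(s[locus] for s in samples) for locus in sorted(loci)}


def update_control_profile(samples, new_sample):
    """Add a newly classified sample to the pool and rebuild the profile."""
    updated_pool = samples + [new_sample]
    return updated_pool, build_control_profile(updated_pool)


# Hypothetical pool of three samples at two loci (illustrative only).
pool = [
    {"locus_A": 0.80, "locus_B": 0.20},
    {"locus_A": 0.84, "locus_B": 0.10},
    {"locus_A": 0.90, "locus_B": 0.30},
]
profile = build_control_profile(pool)
pool, updated_profile = update_control_profile(
    pool, {"locus_A": 0.70, "locus_B": 0.40}
)
print(profile)
print(updated_profile)
```

Replacing `median` with `statistics.mean` would give the mean-based variant of the average that the text also contemplates.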

[0062] In certain embodiments, the tissue source from which the sample profile and control profile are derived is matched, so that they are both derived from the same or similar tissue.

[0063] As used herein, the phrase "detecting and/or screening" for a condition refers to a method or process of determining if a subject has or does not have said condition. Where the condition is a likelihood or risk for a disease or disorder, the phrase "detecting and/or screening" will be understood to refer to a method or process of determining if a subject is at an increased or decreased likelihood for the disease or disorder.

[0064] As used herein, the term "sensitivity" refers to the ability of the test to correctly identify those patients with the disease or disorder, such that a 100% sensitivity indicates a test that correctly identifies all patients with the disease or disorder. Sensitivity is calculated as:

Sensitivity = (True Positives)/(True Positives + False Negatives). A high sensitivity as used herein refers to a sensitivity of greater than 50%.

[0065] As used herein, the term "specificity" refers to the ability of a test to correctly identify those patients without the disease or disorder, such that a 100% specificity indicates a test that correctly identifies all patients without the disease or disorder. Specificity is calculated as: Specificity = (True Negatives)/(True Negatives + False Positives). A high specificity as used herein refers to a specificity of greater than 50%.
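By way of illustration only, the sensitivity and specificity calculations defined above may be sketched as follows. The counts used in the example are hypothetical and are not results from the disclosure.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity = TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no affected subjects in the test set")
    return true_positives / total


def specificity(true_negatives, false_positives):
    """Specificity = TN / (TN + FP)."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no unaffected subjects in the test set")
    return true_negatives / total


def is_high(value):
    """'High' as defined in the text: greater than 50%."""
    return value > 0.5


# Hypothetical counts (illustrative only).
sens = sensitivity(true_positives=8, false_negatives=0)
spec = specificity(true_negatives=95, false_positives=5)
print(sens, spec, is_high(sens), is_high(spec))
```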

[0066] As used herein, the term "CpG" or "CG" site refers to cytosine and guanosine residues located sequentially (5'->3') in a polynucleotide DNA sequence. The term "CpG island" refers to a region of genomic DNA characterized by a high frequency of CpG sites, for example, a CpG island may be characterized by CpG dinucleotide content of at least 60% over the length of the island. As used herein the term "CpG island shore" refers to a region of DNA occurring within 2kbp (upstream or downstream) of a CpG island. As used herein the term "body" (in reference to a gene) refers to the genomic region covering the entire gene from the transcription start site to the end of the transcript. As used herein the term "distance from TSS" refers to the genomic distance in base pairs between a specific CpG locus and the nearest transcription start site.

[0067] As used herein, a first CpG locus is "associated" with a second CpG locus, if the methylation status at the first locus is reasonably predictive of the methylation status of the second locus and vice versa. CpG loci may be considered "associated", for example, if they occur within the same CpG island, CpG island shore, gene promoter or gene enhancer region. CpG loci may also be considered "associated" by virtue of their genomic proximity, for example, CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of each other may be considered associated.
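By way of illustration only, identifying CpG loci that are "associated" by genomic proximity may be sketched as follows. The sketch assumes each locus is described by a chromosome and a single genomic coordinate, that "within 300 nucleotides" means an absolute coordinate difference of at most 300 on the same chromosome, and that strand is ignored. The locus identifiers and coordinates are hypothetical.

```python
def associated_loci(listed, candidates, max_distance=300):
    """Map each listed locus to nearby candidate loci on the same chromosome.

    Both arguments map a locus ID to a (chromosome, position) tuple.
    """
    result = {}
    for listed_id, (chrom, pos) in listed.items():
        nearby = [
            cand_id
            for cand_id, (cand_chrom, cand_pos) in candidates.items()
            if cand_id != listed_id
            and cand_chrom == chrom
            and abs(cand_pos - pos) <= max_distance
        ]
        result[listed_id] = sorted(nearby)
    return result


# Hypothetical coordinates (illustrative only).
listed = {"cpg_listed_1": ("chr16", 1000)}
candidates = {
    "cpg_a": ("chr16", 1150),  # 150 nucleotides away
    "cpg_b": ("chr16", 1301),  # 301 nucleotides away
    "cpg_c": ("chr22", 1100),  # different chromosome
}
print(associated_loci(listed, candidates))
print(associated_loci(listed, candidates, max_distance=150))
```

Association by shared CpG island, shore, promoter or enhancer would require annotation data and is not covered by this sketch.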

[0068] As used herein, the term "treating DNA from the sample with bisulfite" refers to treatment of DNA with a reagent comprising bisulfite, disulfite, hydrogen sulfite or combinations thereof, for a time and under conditions sufficient to convert unmethylated DNA cytosine residues to uracil, thereby facilitating the identification of methylated and unmethylated CpG dinucleotide sequences. Bisulfite modifications to DNA may be detected according to methods known in the art, for example, using sequencing or detection probes which are capable of discerning the presence of a cytosine or uracil residue at the CpG site.

[0069] The term "subject" as used herein refers to a human subject and includes, for example, a fetus.

[0070] The terms "complementary" or "complementarity" are used in reference to a first polynucleotide (which may be an oligonucleotide) which is in "antiparallel association" with a second polynucleotide (which also may be an oligonucleotide). As used herein, the term "antiparallel association" refers to the alignment of two polynucleotides such that individual nucleotides or bases of the two associated polynucleotides are paired substantially in accordance with Watson-Crick base-pairing rules. Complementarity may be "partial," in which only some of the polynucleotides' bases are matched according to the base pairing rules. Or, there may be "complete" or "total" complementarity between the polynucleotides. Those skilled in the art of nucleic acid technology can determine duplex stability empirically by considering a number of variables, including, for example, the length of the first polynucleotide, which may be an oligonucleotide, the base composition and sequence of the first polynucleotide, and the ionic strength and incidence of mismatched base pairs.

[0071] The term "hybridize" refers to the sequence specific non-covalent binding interaction with a complementary nucleic acid. Appropriate stringency conditions which promote hybridization are known to those skilled in the art, or can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. For example, 6.0 x sodium chloride/sodium citrate (SSC) at about 45°C for 15 minutes, followed by a wash of 2.0 x SSC at 50°C for 15 minutes may be employed.

[0072] The stringency may be selected based on the conditions used in the wash step. For example, the salt concentration in the wash step can be selected from a high stringency of about 0.2 x SSC at 50°C for 15 minutes. In addition, the temperature in the wash step can be at high stringency conditions, at about 65°C for 15 minutes.

[0073] By "at least moderately stringent hybridization conditions" it is meant that conditions are selected which promote selective hybridization between two complementary nucleic acid molecules in solution. Hybridization may occur to all or a portion of a nucleic acid sequence molecule. The hybridizing portion is typically at least 15 (e.g. 20, 25, 30, 40 or 50) nucleotides in length. Those skilled in the art will recognize that the stability of a nucleic acid duplex, or hybrids, is determined by the Tm, which in sodium containing buffers is a function of the sodium ion concentration and temperature (Tm = 81.5°C - 16.6 (Log10 [Na+]) + 0.41 (%(G+C) - 600/l), or similar equation). Accordingly, the parameters in the wash conditions that determine hybrid stability are sodium ion concentration and temperature. In order to identify molecules that are similar, but not identical, to a known nucleic acid molecule a 1% mismatch may be assumed to result in about a 1°C decrease in Tm; for example, if nucleic acid molecules are sought that have a >95% sequence identity, the final wash temperature will be reduced by about 5°C. Based on these considerations those skilled in the art will be able to readily select appropriate hybridization conditions. In an embodiment, stringent hybridization conditions are selected. By way of example the following conditions may be employed to achieve stringent hybridization: hybridization at 5x sodium chloride/sodium citrate (SSC)/5x Denhardt's solution/1.0% SDS at Tm - 5°C based on the above equation, followed by a wash of 0.2x SSC/0.1% SDS at 60°C for 15 minutes. Moderately stringent hybridization conditions include a washing step in 3x SSC at 42°C for 15 minutes. It is understood, however, that equivalent stringencies may be achieved using alternative buffers, salts and temperatures. Additional guidance regarding hybridization conditions may be found in: Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., 1989, 6.3.1-6.3.6 and in: Sambrook et al., Molecular Cloning, a Laboratory Manual, Cold Spring Harbor Laboratory Press, 2000, Third Edition.

[0074] The term "oligonucleotide" as used herein refers to a nucleic acid substantially free of cellular material or culture medium when produced by recombinant DNA techniques, or chemical precursors, or other chemicals when chemically synthesized. The term "nucleic acid" and/or "oligonucleotide" as used herein refers to a sequence of nucleotide or nucleoside monomers consisting of naturally occurring bases, sugars, and intersugar (backbone) linkages, and is intended to include DNA and RNA which can be either double stranded or single stranded, represent the sense or antisense strand. The term also includes modified or substituted oligomers comprising non-naturally occurring monomers or portions thereof.

[0075] As used herein, the term "amplify", "amplifying" or "amplification" of DNA refers to the process of generating at least one copy of a DNA molecule or portion thereof. Methods of amplification of DNA are well known in the art, including but not limited to polymerase chain reaction (PCR), ligase chain reaction (LCR), self-sustained sequence replication (3SR), nucleic acid sequence based amplification (NASBA), strand displacement amplification (SDA), multiple displacement amplification (MDA) and rolling circle amplification (RCA).

[0076] The term "copy number variant" or "CNV" refers to a structural variant in the genome that results in a change in the number of copies (increased or decreased) of particular genomic regions, encompassing multiple genes.

[0077] As used herein, the term "neuropsychiatric disorder" refers to a neuropsychiatric disorder that has been associated with a 16p11.2 pathogenic deletion or a duplication thereof, a 22q11.2 pathogenic deletion or a duplication thereof and/or a CHD8 pathogenic mutation, and includes, without limitation, autism spectrum disorder (ASD), attention deficit hyperactivity disorder (ADHD), intellectual disability (ID), autistic symptoms and schizophrenia (SZ). The subject may present with a combination of autism and/or one or more of these other neuropsychiatric disorders.

[0078] As used herein, the term "16p11.2" refers to the genomic region in which a recurrent deletion or reciprocal duplication occurs, affecting approximately 25 genes.

[0079] As used herein, the term "16p11.2 pathogenic deletion" (OMIM# 611913) or "16p11.2 pathogenic duplication" (OMIM# 614671) refers to aberrations in the gene dosage of these particular genomic regions due to deletions and duplications of ~600kb, where the majority of individuals harbouring this deletion or duplication will present with a range of CNV-specific phenotypes including morphological anomalies and NDs (Leung, T.Y. et al., 2010).

[0080] As used herein, the term "22q11.2" refers to the genomic region in which a recurrent deletion or reciprocal duplication occurs, affecting approximately 30-40 genes.

[0081] As used herein, the term "22q11.2 pathogenic deletion" (OMIM# 188400, 192430) or "22q11.2 pathogenic duplication" (OMIM# 608363) refers to aberrations in the gene dosage of these particular genomic regions due to deletions or duplications of ~3Mb, where the majority of individuals harbouring this deletion or duplication will present with a range of CNV-specific phenotypes including morphological anomalies and NDs.

[0082] As used herein, "CHD8" refers to the chromodomain helicase DNA-binding protein 8 gene located in the 14q11.2 region (OMIM# 610528).

[0083] As used herein, the term "CHD8 pathogenic mutation" (OMIM# 615032) refers to loss of function mutations, including nonsense mutations and deletions, leading to a reduction of functioning protein product and associated with the observation of variable phenotypes including both morphological anomalies and NDs (e.g. ID, ADHD), in the majority of individuals harbouring these mutations.

II. Methods

[0084] As set out in Table 2, the instant disclosure identifies 91 distinct CpG loci, each of which shows a statistically significant (corrected p-value < 0.05) difference in methylation levels between individuals with a 16p11.2 pathogenic deletion and controls over the tested population. As set out in Table 5, the instant disclosure identifies 51 distinct CpG loci, each of which shows a statistically significant (corrected p-value < 0.01) difference in methylation levels between individuals with a 22q11.2 pathogenic deletion and controls over the tested population. As set out in Table 8, the instant disclosure identifies 264 CpG loci, each of which shows a statistically significant (corrected p-value < 0.01) difference in methylation levels between individuals with a pathogenic CHD8 mutation with a high risk for ASD and non-ND controls over the tested population. As described in the Examples, the methylation levels of the disclosed loci, or a subset thereof, may be used in diagnostic testing for determining a pathogenic mutation associated with a genomic aberration such as a mutation that is associated with an increased risk for a neuropsychiatric disorder (ND), with up to 100% sensitivity and specificity. It will be understood that the sensitivity and specificity of the methods described will tend to increase with the number of CpG loci or sites selected for testing (i.e. the size of the signature), to a maximal sensitivity/specificity of 100%. However, signatures utilizing fewer CpG loci are described herein which retain greater than 50% sensitivity and specificity and are useful for assessing likelihood of ND.
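By way of illustration only, selecting loci by corrected p-value and effect size may be sketched as follows. The sketch starts from per-locus raw p-values and Δβ values that are assumed to be already computed (the figure descriptions refer to an F-test in the minfi package, which is not reproduced here), applies the Benjamini-Hochberg correction, and treats the "at least 5% DNAm difference" criterion from the figure descriptions as an absolute Δβ of at least 0.05, which is an assumption. The locus identifiers and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_loci(loci, pvalues, delta_betas, alpha=0.05, min_effect=0.05):
    """Keep loci with adjusted p < alpha and |delta beta| >= min_effect."""
    adjusted = benjamini_hochberg(pvalues)
    return [
        locus
        for locus, q, d in zip(loci, adjusted, delta_betas)
        if q < alpha and abs(d) >= min_effect
    ]


# Hypothetical loci, raw p-values and delta-beta values (illustrative only).
loci = ["cg_h1", "cg_h2", "cg_h3", "cg_h4"]
pvalues = [0.001, 0.02, 0.03, 0.5]
delta_betas = [0.10, -0.08, 0.02, 0.30]

print(select_loci(loci, pvalues, delta_betas, alpha=0.05))
print(select_loci(loci, pvalues, delta_betas, alpha=0.01))
```

The `alpha` argument corresponds to the corrected p-value threshold, which the text gives as 0.05 for Table 2 and 0.01 for Tables 5 and 8.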

[0085] Useful methylation signatures according to the described methods are not intended to be limited to the sites of Table 2, Table 5, and Table 8, but are intended to include associated CpG loci, and associated gene and non-gene regions. DNAm at a single CpG locus can predict DNAm of multiple other loci residing in near genomic proximity or overlapping CpG islands. Accordingly, "associated" loci and regions are loci and regions, the methylation levels or status of which may be reasonably predicted by the methylation levels or status of one or more of the CpG loci of Table 2, Table 5, and Table 8. CpG loci may be considered "associated", for example, if they occur within the same CpG island, CpG island shore, gene promoter or gene enhancer region. CpG loci may also be considered "associated" by virtue of their proximity, for example, CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of each other may be considered associated.
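By way of illustration only, the proximity criterion described above can be sketched in Python as follows. The function name, probe identifiers and genomic coordinates are hypothetical and are not taken from Table 2, Table 5 or Table 8; the sketch also assumes that "within 300 nucleotides" is inclusive and that loci on different chromosomes are never associated by proximity.

```python
def associated_loci(signature, candidates, max_distance=300):
    """Map each signature CpG to candidate CpGs lying within max_distance nucleotides.

    Both arguments are dicts of the form {locus_id: (chromosome, position)}.
    """
    result = {}
    for sig_id, (sig_chrom, sig_pos) in signature.items():
        result[sig_id] = sorted(
            cand_id
            for cand_id, (chrom, pos) in candidates.items()
            if cand_id != sig_id
            and chrom == sig_chrom
            and abs(pos - sig_pos) <= max_distance  # inclusive bound (assumption)
        )
    return result


# Hypothetical loci for illustration only.
signature = {"cg_sig_1": ("chr12", 75_784_900)}
candidates = {
    "cg_a": ("chr12", 75_785_050),  # 150 nt away, same chromosome
    "cg_b": ("chr12", 75_785_400),  # 500 nt away, same chromosome
    "cg_c": ("chr16", 75_784_950),  # 50 nt away numerically, different chromosome
}

print(associated_loci(signature, candidates))
print(associated_loci(signature, candidates, max_distance=150))
```

Association by shared CpG island, shore, promoter or enhancer would require annotation data and is not covered by this sketch.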

[0086] Accordingly, an aspect of the disclosure provides a method of detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of ND, in a human subject, comprising determining a sample methylation profile from a sample of DNA from said subject, said sample profile comprising:

(a) the methylation level of at least 5, optionally at least 10, at least 25, at least 41, at least 50, at least 75, at least 90, or all CpG loci from (i) Table 2 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i);

(b) the methylation level of all CpG loci from (i) Table 5 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i); and/or

(c) the methylation level of at least 5, optionally at least 7, at least 10, at least 25, at least 50, at least 53, at least 75, at least 100, at least 125, at least 140, at least 175, at least 200, at least 250 or all CpG loci from (i) Table 8 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i).

[0087] Methods of DNAm profiling of target genomic regions are generally known in the art (Harris, R.A. et al., 2010; Stevens, M. et al., 2013; Hirst, M., 2013).

[0088] For example, a non-limiting list of exemplary methods that may be used to determine methylation levels at a specified target sequence of DNA include: bisulfite sequencing, pyrosequencing, methylation-sensitive single-strand conformation analysis (MS-SSCA), high resolution melting analysis (HRM), methylation-sensitive single nucleotide primer extension (MS-SnuPE), base-specific cleavage/MALDI-TOF, methylation-specific PCR (MSP), methylation-sensitive restriction enzyme-based methods and/or microarray-based methods.

[0089] In an embodiment, methylation levels are measured using an agent that provides for determination of a CpG methylation status of at least one, optionally all, of the selected CpG loci, wherein the agent comprises an oligonucleotide-immobilized substrate comprising a plurality of capture probes, each capture probe comprising a pair of capture oligonucleotides, wherein the capture oligonucleotide pairs comprise (a) an oligonucleotide comprising a nucleotide sequence complementary to or identical to a nucleotide sequence of genomic DNA comprising a selected CpG locus, and (b) an oligonucleotide comprising a nucleotide sequence complementary to or identical to a nucleotide sequence of genomic DNA comprising the same selected CpG locus of (a), in which the cytosine residue of the CpG locus is replaced with a thymine residue. A non-limiting example of such an agent includes a "microarray", comprising an ordered set of probes fixed to a solid surface that permits analysis such as methylation analysis of a plurality of genomic target sequences.

[0090] According to the methods described herein, similarity of the DNAm profile from a sample to one or more control profiles may be used to identify individuals as having a specific gene mutation or CNV, an ND, or an increased likelihood of ND, or as not having an ND. For example, in an embodiment, the method comprises determining the level of similarity of a sample profile to one or more control profiles, wherein (i) a high level of similarity of the sample profile to an ND specific control profile; (ii) a low level of similarity to a non-ND control profile; and/or (iii) a higher level of similarity to an ND specific control profile than to a non-ND control profile indicates the presence of, or an increased likelihood of, the ND.

[0091] It will be appreciated that the control profile may be a reference value, or derived from one or more samples, optionally from historical methylation data for a patient or pool of patients. The control profile may be a reference value and/or may be derived from one or more samples, optionally from historical methylation data for a patient or pool of patients who are known to have, or not have, a 16p11.2 pathogenic deletion or duplication (for Table 2 loci), a 22q11.2 pathogenic deletion or duplication (for Table 5 loci) or a CHD8 pathogenic mutation (for Table 8 loci) or from a patient or pool of patients who are known to have ND and either a 16p11.2 pathogenic deletion or duplication (for Table 2 loci), a 22q11.2 pathogenic deletion or duplication (for Table 5 loci) or a CHD8 pathogenic mutation (for Table 8 loci). In such cases, the historical methylation data can be a value that is continually updated as further samples are collected and individuals are identified as ND or non-ND. For example, the control data may be stored in an online database, which is continually updated with methylation data from diagnosed ND and non-ND patients. It will be understood that the control profile represents an average of the methylation levels for selected CpG loci as described herein. Similarly, a reference value and/or a control profile from one or more samples, including historical methylation data, may be derived from a subset of individuals that have a particular ND phenotype and either a 16p11.2 pathogenic deletion or duplication (for Table 2 loci), a 22q11.2 pathogenic deletion or duplication (for Table 5 loci) or a CHD8 pathogenic mutation (for Table 8 loci).

[0092] In an embodiment, the "ND specific control profile" or "ND control profile" is generated by measuring the methylation levels at specified target sequences in genomic DNA from an individual subject, or population of subjects, who are known to have a 16p11.2 pathogenic deletion or duplication, a 22q11.2 pathogenic deletion or duplication, or a CHD8 pathogenic mutation. The ND specific control profile or ND control profile also includes the methylation levels at specified target sequences in genomic DNA from an individual subject, or population of subjects, who are known to have a particular ND, such as ASD, ADHD, SZ or ID, and a 16p11.2 pathogenic deletion or duplication, a 22q11.2 pathogenic deletion or duplication, or a CHD8 pathogenic mutation. Similarly, the "non-ND control profile" is generated by measuring the methylation levels at specified target sequences in genomic DNA from an individual subject or population of subjects who are known to not have ND.

[0093] In certain embodiments, the tissue source from which the sample profile and control profile are derived is matched, so that they are both derived from the same or similar tissue. In other embodiments, the sample profile and control profile are derived from different tissues. In certain other embodiments, the ND specific control profile and the non-ND control profile are derived from historical data and can indicate similarity of a sample to either the ND or non-ND profiles.

[0094] Methods of determining the similarity between methylation profiles are well known in the art. Methods of determining similarity may in some embodiments provide a non-quantitative measure of similarity, for example, using visual clustering. In another embodiment, similarity may be determined using methods which provide a quantitative measure of similarity.

[0095] For example, in an embodiment, similarity may be measured using hierarchical clustering, optionally using Manhattan distance. For example, unsupervised hierarchical clustering of a sample with an ND specific control profile indicates similarity to the ND specific control profile. Likewise, unsupervised hierarchical clustering of a sample with a non-ND control profile indicates similarity to the non-ND control profile.

[0096] The Manhattan distance function computes the distance that would be traveled to get from one data point to the other if a grid-like path is followed. The Manhattan distance between two items is the sum of the absolute differences of their corresponding components.

[0097] The formula for this distance between a point X=(X1, X2, etc.) and a point Y=(Y1, Y2, etc.) is:

$$d(X, Y) = \sum_{i=1}^{n} \left| X_i - Y_i \right|$$

where n is the number of variables, and Xi and Yi are the values of the ith variable at points X and Y, respectively.
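As a minimal sketch of the distance just described, the following Python function sums the absolute component-wise differences between two profiles. The function name and the beta values used are hypothetical and serve only to illustrate the calculation.

```python
def manhattan_distance(x, y):
    """Sum of absolute differences between corresponding components of x and y."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of CpG loci")
    return sum(abs(a - b) for a, b in zip(x, y))


# Hypothetical beta values at three CpG loci.
sample_profile = [0.10, 0.50, 0.90]
control_profile = [0.20, 0.50, 0.60]

# Component-wise differences are 0.10, 0.00 and 0.30, so the distance is about 0.40.
print(manhattan_distance(sample_profile, control_profile))
```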

[0098] In another embodiment, similarity may be measured by computing a "correlation coefficient", which is a measure of the interdependence of random variables that ranges in value from -1 to +1, indicating perfect negative correlation at -1, absence of correlation at zero, and perfect positive correlation at +1. In an embodiment, the correlation coefficient may be a linear correlation coefficient, for example, a Pearson product-moment correlation coefficient.

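[0099] A Pearson correlation coefficient (r) is calculated using the following formula:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where n is the number of paired values, x_i and y_i are the ith values of x and y, and \bar{x} and \bar{y} are the respective means of x and y.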

[00100] In one embodiment, x and y are the beta values for various CpG loci in a sample profile and a control profile, respectively.

[00101] In an embodiment, a correlation coefficient calculated between the sample profile and the control profile indicates a high level of similarity to the control profile when the correlation coefficient has an absolute value between 0.5 and 1, optionally between 0.75 and 1, and a low level of similarity to the control profile when the correlation coefficient has an absolute value between 0 and 0.5, optionally between 0 and 0.25.
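The correlation-based similarity measure of the preceding paragraphs can be sketched in Python as follows. The function names and example values are hypothetical. Because the ranges recited above share the endpoint 0.5, the sketch assumes that an absolute value of exactly 0.5 is treated as high similarity.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def similarity_level(r, threshold=0.5):
    """Label a correlation as 'high' or 'low'; the boundary handling is an assumption."""
    return "high" if abs(r) >= threshold else "low"


# Hypothetical beta values across four CpG loci.
sample = [0.75, 0.25, 0.65, 0.35]
control = [0.80, 0.20, 0.70, 0.30]
r = pearson_r(sample, control)
print(r, similarity_level(r))
```

Passing threshold=0.75 corresponds to the optional narrower range for high similarity.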

[00102] It will be appreciated that any "correlation value" which provides a quantitative scaling measure of similarity between methylation profiles may be used to measure similarity. A sample profile may be identified as belonging to an individual with an ND, or an increased likelihood of ND, where the sample profile has high similarity to the ND profile, low similarity to the non-ND profile, or higher similarity to the ND profile than to the non-ND profile. Conversely, a sample profile may be identified as belonging to an individual without an ND, or a decreased likelihood of ND, where the sample profile has high similarity to the non-ND profile, low similarity to the ND profile, or higher similarity to the non-ND profile than to the ND profile.

[00103] Similarly, in an embodiment, a sample profile may be identified as belonging to an individual with ND, or an increased likelihood of ND, based on calculation of an ND Score, which generally is defined by the following formula:

ND score(B) = r (B, ND profile) - r (B, control profile)

where r is the Pearson correlation coefficient, and B is a vector of DNAm levels across the selected CpG loci.

[00104] A sample profile with a positive ND Score is more similar to the ND specific profile across the selected CpG loci, and is therefore classified as "ND"; whereas a sample with a negative ND Score is more similar to the non-ND profile across the selected CpG loci, and is classified as "not ND".
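The ND Score and the resulting classification can be illustrated with the following Python sketch. The function names and all profile values are hypothetical. The disclosure does not state how a score of exactly zero is handled; the sketch assumes such a sample is reported as "not ND".

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length, non-constant profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def nd_score(sample, nd_profile, control_profile):
    """ND score(B) = r(B, ND profile) - r(B, control profile)."""
    return pearson_r(sample, nd_profile) - pearson_r(sample, control_profile)


def classify(sample, nd_profile, control_profile):
    """Positive score -> 'ND'; otherwise 'not ND' (zero handling is an assumption)."""
    return "ND" if nd_score(sample, nd_profile, control_profile) > 0 else "not ND"


# Hypothetical reference profiles and sample across four selected CpG loci.
nd_profile = [0.80, 0.20, 0.70, 0.30]
control_profile = [0.30, 0.60, 0.20, 0.70]
sample = [0.75, 0.25, 0.65, 0.35]

print(nd_score(sample, nd_profile, control_profile))
print(classify(sample, nd_profile, control_profile))
```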

[00105] As used herein the term "sample" refers to a biological sample comprising genomic DNA from a human subject. The sample may, for example, comprise blood, fibroblast tissue, buccal tissue, and/or amniotic fluid. In one embodiment, the sample comprises blood.

[00106] Median methylation levels for pathogenic mutation/CNV cases (ND or higher likelihood of ND) and benign (non-ND) cases reported in Tables 2, 5 and/or 8 were identified using whole blood samples. Based on DNAm profiles in other disorders with mutations in epigenes, it is predicted that the DNAm profile for ND and non-ND can be present in other samples, for example, fibroblast tissue, buccal tissue, lymphoblastoid cell lines, saliva or a prenatal sample. The prenatal sample is optionally a CVS, placenta, circulating fetal DNA and/or amniotic fluid sample.

[00107] Another aspect provides a method of detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of an ND, in a human subject, comprising determining a sample DNAm profile from a sample of DNA from said subject, said sample profile comprising:

(a) the methylation level of at least 5, optionally at least 7, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45 or all the genes from Table 2;

(b) the methylation level of all the genes from Table 5; and/or

(c) the methylation level of at least 2, optionally at least 3, at least 5, at least 10, at least 15, at least 20, at least 25, at least 35, at least 50, at least 75, at least 100, at least 125, at least 150, at least 170, or all the genes from Table 8.

[00108] The method further comprises determining the level of similarity of said sample profile to one or more control profiles, wherein (i) a high level of similarity of the sample profile to an ND specific control profile; (ii) a low level of similarity to a non-ND control profile; and/or (iii) a higher level of similarity to an ND specific control profile than to a non-ND control profile indicates the presence of, or an increased likelihood of, the ND.

[00109] In one embodiment, the genes comprise ANKMY1, CAPS2, CCDC17, GLIPR1L2 and IZUMO1. It is shown in Table 3, for example, that at an absolute delta beta of 0.15 and p-value < 0.05, the 5 genes ANKMY1, CAPS2, CCDC17, GLIPR1L2 and IZUMO1 provide a specificity of 100% and a sensitivity of 88%.

[00110] In yet another embodiment, the genes comprise SOX9 and USP18. It is shown in Table 9, for example, that at an absolute delta beta of 0.15 and a p-value < 0.01, SOX9 and USP18 provide a specificity of 65% and a sensitivity of 63%.

[00111] It will also be appreciated by a person of skill in the art that the methods described herein can be used to distinguish between an ND with a pathogenic 16p11.2 deletion, an ND with a 22q11.2 pathogenic deletion and an ND with a CHD8 pathogenic mutation. It will also be appreciated by a person of skill in the art that the methods described herein can be used to distinguish between subsets of individuals with ND and a pathogenic 16p11.2 deletion, subsets of individuals with ND and a 22q11.2 pathogenic deletion and subsets of individuals with ND and a CHD8 pathogenic mutation disclosed herein. The subsets of individuals may be stratified based on ND phenotype, such as ASD, ADHD, ID or SZ.

[00112] While NDs with a 16p11.2 pathogenic deletion or duplication, with a 22q11.2 pathogenic deletion or duplication and with a CHD8 pathogenic mutation share some characteristics such as ASD, ID, ADHD and SZ, there are likely also other clinical characteristics that are typical of the individual ND genotypes.

[00113] Improved molecular (epigenetically-based) predictions will enable genotype- and ND risk-specific medical management guidelines, which will allow for optimization of health care. Presently, to confirm the diagnosis of a genetic condition, one typically relies upon the ability to detect the specific gene mutation through DNA sequencing. However, sequencing has many inherent challenges, most notably, the common finding of DNA sequence changes of unclear significance. Additionally, there is substantial heterogeneity in the presentation of disorders caused by the same mutation or CNV. Therefore, additional molecular methods of classification can be achieved by evaluating epigenetic markers, such as the DNAm signatures disclosed herein.

[00114] Therefore, a proper diagnosis of a particular ND genotype (which may, in some instances, be specific to a subset of ND phenotype) allows for testing, treatment and medical management appropriate for each condition, given the differences in their clinical characteristics.

[00115] Confirmation of a diagnosis of an ND aids in medical management by enabling targeted screening for the multisystem manifestations of these complex conditions, optimizing the opportunity for early intervention and management. Recommended evaluations following a diagnosis include: for all mutations/CNVs, neuropsychological and cognitive assessment to screen for developmental difficulties, ID and neuropsychiatric disorders, including neurodevelopmental disorders such as ASD and ADHD; for 16p11.2 deletions/duplications, weight monitoring, speech, language and motor function assessments, physical assessments for morphological anomalies (e.g. macro- and microcephaly); for 22q11.2 deletions/duplications, assessments for congenital anomalies (e.g. heart defects, cleft palate), assessment of immune function; for both 16p11.2 and 22q11.2 CNVs, neuropsychiatric assessments for SZ risk, seizures. Early identification of the above medical and cognitive issues provides the opportunity for an enhanced quality of life for individuals with these CNVs.

[00116] Accordingly, an aspect of the disclosure provides a method of assigning a course of management for an individual with a neuropsychiatric disorder (ND), or an increased likelihood of an ND, comprising:

a) identifying an individual with an ND or an increased likelihood of an ND, according to the methods described herein; and

b) assigning a course of management for the ND and/or symptoms of the ND, comprising i) testing for at least one medical condition associated with the ND and ii) applying an appropriate medical intervention based on the results of the testing.

[00117] As used herein, the term "a course of management" refers to any testing, treatment, medical intervention and/or therapy applied to an individual with an ND and/or symptoms of an ND. Medical interventions include, but are not limited to, pharmaceutical treatments, surgical procedures, weight management, physical or occupational therapy. In addition, behavioral and cognitive assessments, anticipatory follow up and guidance, monitoring and therapy would be indicated.

[00118] In one embodiment, the medical condition associated with ND is selected from, but is not limited to, developmental delay, ASD, ID, ADHD, SZ, morphological anomalies (e.g. craniofacial, cardiac, etc.), obesity, and growth abnormalities.

III. Kits

[00119] Another aspect provides a kit for detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of an ND, in a sample, comprising:

(a) at least one detection agent for determining:

(A) the methylation level of at least 5, optionally at least 10, at least 25, at least 41, at least 50, at least 75, at least 90, or all CpG loci from (i) Table 2 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i);

(B) the methylation level of all CpG loci from (i) Table 5 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i); and/or

(C) the methylation level of at least 5, optionally at least 7, at least 10, at least 25, at least 50, at least 53, at least 75, at least 100, at least 125, at least 140, at least 175, at least 200, at least 250 or all CpG loci from (i) Table 8 and/or (ii) associated CpG loci residing within 300 nucleotides, optionally within 150 nucleotides, of the CpG loci of (i).

and;

(b) instructions for use.

[00120] Another aspect provides a kit for detecting and/or screening for a neuropsychiatric disorder (ND), or an increased likelihood of an ND, in a sample, comprising:

(a) at least one detection agent for determining:

(i) the methylation level of at least 5, optionally at least 7, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45 or all the genes from Table 2;

(ii) the methylation level of all the genes from Table 5; and/or

(iii) the methylation level of at least 2, optionally at least 3, at least 5, at least 10, at least 15, at least 20, at least 25, at least 50, at least 75, at least 100, at least 125, at least 150, at least 170, or all the genes from Table 8.

(b) instructions for use.

[00121] In an embodiment, the instructions for use comprise instructions for carrying out the methods disclosed herein.

[00122] In an embodiment, the kit further comprises bisulfite conversion reagents, methylation-dependent restriction enzymes, methylation-sensitive restriction enzymes, PCR reagents, probes and/or primers. For example, in one embodiment, the probes or primers are specific to selected CpG loci of Tables 2, 5 and/or 8.

[00123] In another embodiment, the kit further comprises a computer-readable medium that causes a computer to compare methylation levels from a sample at the selected genes to one or more control profiles and compute a correlation value between the sample and control profile.

[00124] In another embodiment, the kit further comprises a computer-readable medium that causes a computer to compare methylation levels from a sample at the selected CpG loci to one or more control profiles and compute a correlation value between the sample and control profile.

[00125] In an embodiment, the control profiles include profiles from individuals or reference values from individuals with ND and a pathogenic deletion or mutation disclosed herein. In another embodiment, the control profile is from individuals or reference values from individuals with a particular ND phenotype and a pathogenic deletion or mutation disclosed herein.

[00126] The above disclosure generally describes the present application. A more complete understanding can be obtained by reference to the following specific examples. These examples are described solely for the purpose of illustration and are not intended to limit the scope of the disclosure. Changes in form and substitution of equivalents are contemplated as circumstances might suggest or render expedient. Although specific terms have been employed herein, such terms are intended in a descriptive sense and not for purposes of limitation.

[00127] The following non-limiting examples are illustrative of the present disclosure:

EXAMPLES

[00128] The present inventors hypothesized that substratifying cases of neuropsychiatric disorders based on genomic aberrations as well as clinical phenotypes within a mixed cohort of e.g. ASD patients will enhance the specific epigenetic patterns. It has been shown that each epigenetic pattern is unique to each CNV and specific gene mutation but there is also further epigenetic variation likely due to each individual's genomic/environmental status. It is hypothesized that epigenetic patterns observed in association with specific CNVs can differentiate between highly prevalent phenotypes e.g. 16p11.2 deletions with and without ASD. This approach can be applied in future to other genetically or clinically substratified groups, e.g. for the examination of the variable SZ phenotype observed with 22q11.2 deletions. The present inventors have demonstrated that epigenetic patterns specific to each mutation differentiate between pathogenic and benign mutations, allowing for more accurate clinical diagnosis where there are variable phenotypes presenting with the same genetic mutation. Given that many of these variable phenotypes overlap between specific gene mutations/CNVs, from a diagnostic or risk assessment perspective, it would be advantageous to be able to screen and categorize an individual using DNAm signatures through a DNA microarray, which is more cost and labour efficient than whole genome sequencing. These data can be adapted and utilized by clinicians to better classify patients molecularly and to assess for risk of various NDDs and neuropsychiatric conditions.

Subjects and methods

Sample Collection

16p11.2 deletion

[00129] Extracted DNA samples from whole blood of 8 individuals with 16p11.2 pathogenic deletions in the 600kb risk locus (Table 1 - 16p11.2 deletions table) were obtained through Dr. Jim Stavropoulos of the Department of Paediatric and Laboratory Medicine (DPLM) at the Hospital for Sick Children. Six additional samples (Table 1) were obtained from TCAG at the Hospital for Sick Children. These samples were compared with 20 sex-, age- and ethnicity-matched controls.

22q11.2 deletion

[00130] Extracted DNA samples from whole blood of 7 individuals with 22q11.2 pathogenic deletions (Table 4 - 22q11.2 deletions table) were obtained through Dr. Jim Stavropoulos of the DPLM at the Hospital for Sick Children. These samples were compared with 20 sex-, age- and ethnicity-matched controls.

CHD8

[00131] Extracted DNA samples from whole blood of 8 individuals with CHD8 pathogenic mutations (Table 7 - CHD8 mutation table) were obtained from the Simons Simplex Collection (Ginsberg, M.R. et al., 2012). Extracted DNA samples from whole blood of 4 individuals with non-synonymous missense mutations in CHD8 (Table 7) were obtained from The Centre for Applied Genomics (TCAG) at the Hospital for Sick Children. These samples were compared with 85 sex-, age- and ethnicity-matched controls. Phenotypic information was available for all of the subjects.

[00132] All subjects were recruited following informed consent. The studies were approved by the Research Ethics Boards of the Hospital for Sick Children, Toronto.

Methylation Array Analysis

[00133] The following procedure was applied to each dataset individually. DNA samples were modified by sodium bisulfite conversion (EpiTect PLUS Bisulfite Kit, QIAGEN). The sodium bisulfite converted DNA was then hybridized to the Illumina Infinium HumanMethylation450 BeadChip Array (450K array) to interrogate over 485,577 CpG sites in the human genome at single nucleotide resolution. Illumina GenomeStudio software was used to extract DNAm values (β-values), calculated after control probe normalization and background subtraction using the formula C/(C+T), and ranging between 0 (no methylation) and 1 (full methylation). Autosomal probes that cross-react with sex chromosome probes, non-specific probes, and probes targeting CpG sites at a known SNP (Chen, Y.-a. et al., 2013; Chen, Y.A. et al., 2011) were excluded, leaving a remaining 426,377 CpG sites. Remaining sites were then assessed for quality using the R package IMA to remove probes of poor quality (the final set of probes used for group comparisons varies depending on this step). Using a statistical package in R designed specifically for analyzing 450K array data (minfi) (Aryee, M.J. et al., 2014), groups were compared at each CpG using the F-test and the FDR corrected p-values were reported. The delta β-value (Δβ) was defined for each probe as the difference between average control and average mutation/CNV case methylation levels.
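For illustration, the β-value formula C/(C+T) and the per-probe Δβ described above can be sketched in Python as follows. The function names and signal intensities are hypothetical, the sketch omits the normalization and background subtraction steps, and it assumes that Δβ is taken as the control average minus the case average.

```python
from statistics import mean


def beta_value(c_signal, t_signal):
    """Beta value C/(C+T): 0 means no methylation, 1 means full methylation."""
    total = c_signal + t_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return c_signal / total


def delta_beta(control_betas, case_betas):
    """Average control beta minus average case beta for one probe (sign is an assumption)."""
    return mean(control_betas) - mean(case_betas)


# Hypothetical methylated (C) and unmethylated (T) intensities for one probe.
print(beta_value(300, 100))

# Hypothetical per-sample beta values for one probe in controls and cases.
print(delta_beta([0.20, 0.40], [0.60, 0.80]))
```

Under this sign convention, a negative Δβ indicates higher methylation in the mutation/CNV cases than in controls.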

[00134] To determine the appropriate significance level for the F-tests and Δβ threshold, volcano plots (Figures 1, 4, 7) were first examined in order to select the appropriate p-value significance level for each cohort comparison. This p-value threshold was confirmed by a series of leave-one-out (LOO) cross-validations on the combined dataset. In each LOO iteration, one sample was removed from the dataset for the subsequent validation step (Tables 3, 6, 9). The remaining samples were used to generate the list of significant CpGs for that LOO iteration, as well as the median DNAm profiles for the subjects in a mutation/CNV group and for the control group, respectively, where both profiles were restricted to the LOO-derived list of CpGs. To ensure robust results, statistically significant probes were additionally filtered for the effect size: only those significant probes were retained for which the DNAm difference (Δβ) was greater by absolute value than a pre-selected threshold. The retained validation sample was then compared to both reference profiles, using only the significant CpGs, and with Pearson correlation as the measure of similarity. The sample was assigned to the group with the more similar profile, and the assignment compared to the true status of the sample (those with a pathogenic mutation/CNV or control). Iterating the LOO process over all samples, the classification accuracy was estimated in terms of the specificity and sensitivity for a given level of significance and Δβ. The specificity of the predictive model was also estimated on a collection of 1056 normal blood samples extracted from the GEO. For this, the list of significant CpGs was derived by comparing all pathogenic mutation/CNV samples to all control samples; the two respective DNAm profiles were created; and each of the 1056 GEO samples was classified as described above.
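The leave-one-out procedure described above can be sketched in simplified form in Python. This is not the inventors' implementation: for brevity the sketch selects CpGs by the effect size filter |Δβ| alone and omits the F-test and FDR correction, and all function names and beta values are hypothetical. It assumes at least two CpGs pass the filter in every iteration and that no profile is constant.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation between two equal-length, non-constant profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def select_cpgs(cases, controls, min_delta):
    """Indices of CpGs with |mean(control) - mean(case)| > min_delta.

    Stand-in for the significance test plus effect size filter (assumption).
    """
    return [
        i
        for i in range(len(cases[0]))
        if abs(mean(s[i] for s in controls) - mean(s[i] for s in cases)) > min_delta
    ]


def loo_accuracy(cases, controls, min_delta=0.05):
    """Return (sensitivity, specificity) estimated by leave-one-out classification."""
    samples = [(s, True) for s in cases] + [(s, False) for s in controls]
    true_pos = true_neg = 0
    for k, (held_out, is_case) in enumerate(samples):
        training = samples[:k] + samples[k + 1:]
        train_cases = [s for s, c in training if c]
        train_controls = [s for s, c in training if not c]
        idx = select_cpgs(train_cases, train_controls, min_delta)
        # Median reference profiles restricted to the CpGs selected in this iteration.
        case_profile = [median(s[i] for s in train_cases) for i in idx]
        control_profile = [median(s[i] for s in train_controls) for i in idx]
        test = [held_out[i] for i in idx]
        predicted_case = pearson_r(test, case_profile) > pearson_r(test, control_profile)
        if is_case and predicted_case:
            true_pos += 1
        if not is_case and not predicted_case:
            true_neg += 1
    return true_pos / len(cases), true_neg / len(controls)


# Hypothetical beta values: three cases and three controls across five CpGs.
cases = [
    [0.80, 0.20, 0.65, 0.50, 0.52],
    [0.82, 0.22, 0.63, 0.49, 0.50],
    [0.78, 0.18, 0.66, 0.51, 0.51],
]
controls = [
    [0.30, 0.70, 0.40, 0.50, 0.51],
    [0.32, 0.72, 0.42, 0.51, 0.50],
    [0.28, 0.68, 0.39, 0.49, 0.52],
]

sensitivity, specificity = loo_accuracy(cases, controls)
print(sensitivity, specificity)
```

Re-deriving the CpG list inside each iteration, as in the description above, keeps the held-out sample from influencing the signature against which it is tested.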

[00135] Once the LOO iterations and validation on GEO indicated the best parameter combination (p-value and Δβ thresholds), it was used to derive the final classification signature.

16p11.2

[00136] There have been several recurrent 16p11.2 deletion (OMIM # 611913) risk loci identified in the literature. The present inventors focused on deletions, both de novo and inherited, in the ~600kb risk locus between BP4-5, which encompasses ~25 genes, including a seizure related gene Seizure Related 6-Like Protein 2 (SEZ6L2) and one unconfirmed epigene, HIRA Interacting Protein 3 (HIRIP3), which may play a role in chromatin remodelling. Without wishing to be bound by theory, the presence of an epigene in the CNV region is not the only way in which epigenetic outcomes may be altered; other genes in the region, some of which have uncharacterized functions, may have downstream effects on epigenetic regulation, therefore also leading to aberrant DNAm profiles observed in association with the CNV. Nor should one assume that genetic mutations in epigenes necessarily lead to epigenetic aberrations that can be implicated in the pathogenesis of a disorder.

[00137] There is remarkable phenotypic variability with this CNV, where individuals with the 16p11.2 pathogenic deletion most commonly have delays in language, speech, and motor function, as well as a significantly higher rate of obesity (Walters, R.G. et al., 2010). Interestingly, not all patients with the deletion have ASD (20-40%) (Hanson, E. et al., 2015), but almost all patients exhibit ASD features. 16p11.2 pathogenic deletions and duplications exhibit some overlap as well as reciprocal and CNV-specific phenotypes given the differences in gene region dosage; seizures are observed with both the deletion and the duplication, most individuals with either CNV have speech and motor delay and dysmorphology, however ASD is more prevalent and more severe in the deletion, whereas congenital anomalies and ADHD are more common with the duplication (Fernandez, B.A. et al., 2010; Shinawi, M. et al., 2010). Macrocephaly is seen with the deletion, where the opposite phenotype, microcephaly, is observed with the duplication (Qureshi, A.Y. et al., 2014; McCarthy, S.E. et al., 2009). An increased risk for SZ (8-24-fold) is a neuropsychiatric outcome that is only seen with the duplication (McCarthy, S.E. et al., 2009).

16p11.2 Deletion Signature

[00138] The 16p11.2 deletion signature was derived from a comparison of DNAm from 8 individuals with pathogenic deletions in the 600kb risk locus of the 16p11.2 region to 20 controls (Table 2) using the analysis methodology previously described in the Methylation Array Analysis section.

[00139] The LOO procedure confirmed that the p-value threshold 0.05, when combined with the effect size threshold |Δβ| > 5%, was the necessary significance level at which the LOO procedure makes no classification errors (i.e. sensitivity and specificity of 100%) (Table 3). Applying the statistical tests with these parameters to the full collection of 8 16p11.2 deletion samples and 20 controls, a "signature set" of 91 significant CpG sites was derived. As expected, the set defined a perfect separation between the samples with a 16p11.2 pathogenic deletion and controls (Figure 2).

16p11.2 Deletion Signature Validation

[00140] The set of 91 probes for specific CpG sites comprising the 16p11.2 deletion signature were located within the bodies or promoter regions of 49 known genes (Table 2). 12 individual genes had 2 or more differentially methylated probes, including GLI Pathogenesis-Related 1 Like 2 (GLIPR1L2, 9 probes), Eukaryotic Translation Initiation Factor 4E (EIF4E, 6 probes), and Coiled-Coil Domain Containing 17 (CCDC17, 4 probes).

[00141] Next, the specificity of the signature CpGs was validated on a collection of 1056 normal blood samples derived from GEO. Similar to the LOO procedure, median DNAm profiles for the 8 pathogenic 16p11.2 deletion samples and for the 20 control samples, respectively, were generated. The Pearson correlation of each of the GEO samples was computed with the reference 16p11.2 deletion profile and the reference control profile, using the 91 significant CpG sites. A high degree (98.9%) of specificity was achieved with the GEO control samples (Figure 3). Similar estimates were tabulated for additional parameter combinations for effect size threshold |Δβ| from 5% to 25% and significance level from p < 0.05 to 0.0001 (Table 10).
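The correlation-based validation described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure: the decision rule (a sample is called "case" when its Pearson correlation with the case reference exceeds its correlation with the control reference) is assumed, and the function names are hypothetical. Inputs are beta values restricted to the signature CpG sites.

```python
import numpy as np

def median_profile(beta):
    """Median methylation per CpG site over samples (rows)."""
    return np.median(np.asarray(beta, dtype=float), axis=0)

def classify_sample(sample, case_ref, control_ref):
    """Call a sample by comparing its Pearson correlation with each reference."""
    r_case = np.corrcoef(sample, case_ref)[0, 1]
    r_control = np.corrcoef(sample, control_ref)[0, 1]
    # Assumed rule: the more strongly correlated reference wins.
    label = "case" if r_case > r_control else "control"
    return label, r_case, r_control

def specificity(control_samples, case_ref, control_ref):
    """Fraction of known-normal samples that are called 'control'."""
    calls = [classify_sample(s, case_ref, control_ref)[0] for s in control_samples]
    return calls.count("control") / len(calls)

# Toy reference profiles over 4 signature sites (illustrative values only).
case_ref = median_profile([[0.40, 0.20, 0.80, 0.60],
                           [0.42, 0.18, 0.78, 0.62],
                           [0.38, 0.22, 0.82, 0.58]])
control_ref = median_profile([[0.20, 0.40, 0.60, 0.80],
                              [0.22, 0.38, 0.62, 0.78],
                              [0.18, 0.42, 0.58, 0.82]])
normal_like = np.array([0.21, 0.39, 0.62, 0.79])
deletion_like = np.array([0.41, 0.22, 0.79, 0.60])
```

Because Pearson correlation ignores a uniform shift across all sites, a signature discriminates under this rule only when it mixes sites that gain and sites that lose methylation relative to controls, or otherwise changes the shape of the profile.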

[00142] The signature was then applied to classify 6 subjects with 16p11.2 deletions (including overlapping deletions, variable deletion sizes, variable phenotype) into either pathogenic or benign mutations (Figure 3). Using the same classification procedure as was used to define the signature, 2 of the variants were predicted to be pathogenic and 4 were predicted to be benign (Table 1). The 2 cases showing higher similarity to the pathogenic mutation cases harbour deletions identical to those of the 8 mutation samples used to derive the signature. Of the 4 variants that were more similar to controls, one is known to have a shorter deletion in the 16p11.2 region that does not overlap the 600kb region common to the 8 mutation samples, one is known to have 50% mosaicism for the deletion, one has unconfirmed deletion coordinates, and one has confirmed deletion coordinates but a less severe diagnosis (PDD-NOS).

16p11.2 Duplication Signature

[00143] The 16p11.2 duplication signature is derived from a comparison of DNAm from individuals with pathogenic duplications in the 600kb risk locus of the 16p11.2 region to controls using the analysis methodology previously described in the Methylation Array Analysis section.

[00144] The LOO procedure confirms that a p-value threshold, when combined with an effect size threshold |Δβ|, is the necessary significance level at which the LOO procedure makes no classification errors (i.e., sensitivity and specificity of 100%). Applying the statistical tests with these parameters to the full collection of 16p11.2 duplication samples and controls, a "signature set" of significant CpG sites is derived. It is expected that the signature set will likely contain some regions of methylation that will be opposite to that of the deletion signature, i.e., in certain regions where there is hypermethylation in the deletion signature, it is expected that there is hypomethylation in the duplication signature, and vice versa.
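The expected reciprocity between deletion and duplication signatures could be checked, once both sets of effect sizes are available, along the lines of the following sketch. It assumes per-site Δβ values (case minus control, on a 0 to 1 scale) for the same CpG sites in both comparisons; the function name `reciprocal_sites` and the reuse of the 5% effect-size threshold are assumptions made for illustration.

```python
import numpy as np

def reciprocal_sites(delta_del, delta_dup, min_abs_delta=0.05):
    """Indices of sites altered in both CNVs but in opposite directions."""
    delta_del = np.asarray(delta_del, dtype=float)
    delta_dup = np.asarray(delta_dup, dtype=float)
    # Require a sizeable effect in both comparisons before comparing signs.
    strong = (np.abs(delta_del) > min_abs_delta) & (np.abs(delta_dup) > min_abs_delta)
    opposite = np.sign(delta_del) != np.sign(delta_dup)
    return np.flatnonzero(strong & opposite)

# Toy effect sizes for 4 shared CpG sites (illustrative values only).
delta_del = [0.10, -0.08, 0.07, 0.02]
delta_dup = [-0.09, 0.06, 0.08, -0.10]
mirrored = reciprocal_sites(delta_del, delta_dup)
```

Here the first two sites are hypermethylated in one CNV and hypomethylated in the other; the third changes in the same direction in both, and the fourth is too weak in the deletion comparison to count.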

16p11.2 Duplication Signature Validation

[00145] The set of probes for specific CpG sites comprising the 16p11.2 duplication signature are located within the bodies or promoter regions of known genes.

[00146] Next, the specificity of the signature CpGs is validated on a collection of normal blood samples derived from GEO. Similar to the LOO procedure, median DNAm profiles for the pathogenic 16p11.2 duplication samples and for the control samples, respectively, are generated. The Pearson correlation of each of the GEO samples is computed with the reference 16p11.2 duplication profile and the reference control profile, using the significant CpG sites. A high degree of specificity is achieved with the GEO control samples.

[00147] The signature is then applied to classify subjects with 16p11.2 duplications (including variable deletion sizes, variable phenotype) into either pathogenic or benign CNVs.

22q11.2

[00148] Recurrent hemizygous 3 Mb deletions between LCR22-2 and LCR22-4 in the 22q11.2 (OMIM# 188400, 192430) region have also been identified. This region encompasses 30-40 genes, including DiGeorge critical region 6 (DGCR6), SZ-associated proline dehydrogenase 1 (PRODH), as well as many genes with no known or characterized function. This region does not contain any putative or confirmed epigenes. However, there are significant alterations to the epigenetic profiles, suggesting downstream epigenetic effects. Much like individuals with 16p11.2 CNVs, individuals with 22q11.2 deletions also have variable physical and neuropsychiatric phenotypes. 22q11.2 deletions are also strongly associated with, but do not always present with, ASD (~14% with ASD diagnosis, >20% with ASD symptoms) (Fine, S.E. et al., 2005), SZ (Murphy, K.C. et al., 1999) (20-25% of patients with deletions, but not with duplications, OMIM# 608363), as well as congenital heart malformations, cleft palate and T-cell immunodeficiency (Cohen, E. et al., 1999). The 22q11.2 duplication typically has a milder phenotype, and therefore is largely undetected, but has a diverse range of phenotypes, ranging from normal phenotype to a combination of behavioural and morphological abnormalities (Yobb, T.M. et al., 2005). It presents variably with seizures, heart defects and velopharyngeal insufficiency, examples of overlapping phenotypes with the deletion. There are also distinct phenotypes specific to the duplication, including particular craniofacial abnormalities and growth and motor delay.

22q11.2 Deletion Signature

[00149] The 22q11.2 deletion signature was derived from a comparison of DNAm from 7 individuals with pathogenic deletions in the 3 Mb DiGeorge/Velocardiofacial Syndrome region in the 22q11.2 region to 20 controls (Table 5) using the analysis methodology previously described in the Methylation Array Analysis section.

[00150] The LOO procedure confirmed that the p-value threshold 0.01, when combined with the effect size threshold |Δβ| > 5%, was the necessary significance level at which the LOO procedure yielded a sensitivity of 71% and a specificity of 100% (Table 6). Applying the statistical tests with these parameters to the full collection of 7 pathogenic 22q11.2 deletion samples and 20 controls, a "signature set" of 51 significant CpG sites was derived. As expected, the set defined a perfect separation between the samples with a pathogenic 22q11.2 deletion and controls (Figure 5).
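The leave-one-out (LOO) estimates of sensitivity and specificity reported above can be illustrated with the following sketch. It is a simplified stand-in, not the disclosed procedure: site selection inside each fold uses only the effect-size threshold on the difference of group medians (the statistical test and p-value threshold of the Methylation Array Analysis section are omitted), and the held-out sample is called by comparing Pearson correlations with the two median reference profiles. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def loo_sensitivity_specificity(beta, is_case, min_abs_delta=0.05):
    """Leave-one-out estimate using median profiles and Pearson correlation.

    beta: array (samples, sites) of beta values; is_case: boolean per sample.
    """
    beta = np.asarray(beta, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    n = beta.shape[0]
    correct_case = correct_control = 0
    for i in range(n):
        train = np.arange(n) != i  # hold out sample i
        case_ref = np.median(beta[train & is_case], axis=0)
        control_ref = np.median(beta[train & ~is_case], axis=0)
        # Stand-in selection: effect size only (no significance test here).
        sites = np.abs(case_ref - control_ref) > min_abs_delta
        if sites.sum() < 2:
            continue  # too few sites to correlate; counted as an error
        r_case = np.corrcoef(beta[i, sites], case_ref[sites])[0, 1]
        r_control = np.corrcoef(beta[i, sites], control_ref[sites])[0, 1]
        called_case = r_case > r_control
        if is_case[i] and called_case:
            correct_case += 1
        elif not is_case[i] and not called_case:
            correct_control += 1
    sensitivity = correct_case / is_case.sum()
    specificity = correct_control / (~is_case).sum()
    return sensitivity, specificity

# Synthetic example: 7 cases, 20 controls, 50 sites, 10 of them altered.
rng = np.random.default_rng(0)
base = np.linspace(0.2, 0.8, 50)
shift = np.zeros(50)
shift[np.arange(0, 50, 5)] = np.tile([0.2, -0.2], 5)  # mixed gain and loss
controls = base + rng.normal(0.0, 0.01, size=(20, 50))
cases = base + shift + rng.normal(0.0, 0.01, size=(7, 50))
beta = np.vstack([cases, controls])
is_case = np.array([True] * 7 + [False] * 20)
sens, spec = loo_sensitivity_specificity(beta, is_case)
```

With 7 case samples, each held-out case contributes one seventh of the sensitivity estimate, so a reported sensitivity of 71% is consistent with 5 of 7 held-out cases being called correctly.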

22q11.2 Deletion Signature Validation

[00151] The set of 51 probes for specific CpG sites comprising the 22q11.2 deletion signature were located within the bodies or promoter regions of 42 known genes (Table 5). Only one gene contained multiple differentially methylated probes: 2 probes in the Developmental Pluripotency Associated 5 (DPPA5) gene.

[00152] Next, the specificity of the signature CpGs was validated on a collection of 1056 normal blood samples derived from GEO. Similar to the LOO procedure, median DNAm profiles for the 7 pathogenic 22q11.2 deletion samples and for the 20 control samples, respectively, were generated. The Pearson correlation of each of the GEO samples was computed with the reference 22q11.2 deletion profile and the reference control profile, using the 51 significant CpG sites. A high degree (82.5%) of specificity was achieved with the GEO control samples (Figure 6). Similar estimates were tabulated for additional parameter combinations for effect size threshold |Δβ| from 5% to 25% and significance level from p < 0.05 to 0.0001 (Table 11).

22q11.2 Duplication Signature

[00153] The 22q11.2 duplication signature is derived from a comparison of DNAm from individuals with pathogenic duplications in the 3 Mb DiGeorge/Velocardiofacial Syndrome region in the 22q11.2 region to controls using the analysis methodology previously described in the Methylation Array Analysis section.

[00154] The LOO procedure confirms that a p-value threshold, when combined with an effect size threshold |Δβ|, is the necessary significance level at which the LOO procedure makes no classification errors (i.e., sensitivity and specificity of 100%). Applying the statistical tests with these parameters to the full collection of 22q11.2 duplication samples and controls, a "signature set" of significant CpG sites is derived. It is expected that the signature set will likely contain some regions of methylation that will be opposite to that of the deletion signature, i.e., in certain regions where there is hypermethylation in the deletion signature, it is expected that there is hypomethylation in the duplication signature, and vice versa.

22q11.2 Duplication Signature Validation

[00155] The set of probes for specific CpG sites comprising the 22q11.2 duplication signature are located within the bodies or promoter regions of known genes.

[00156] Next, the specificity of the signature CpGs is validated on a collection of normal blood samples derived from GEO. Similar to the LOO procedure, median DNAm profiles for the pathogenic 22q11.2 duplication samples and for the control samples, respectively, are generated. The Pearson correlation of each of the GEO samples is computed with the reference 22q11.2 duplication profile and the reference control profile, using the significant CpG sites. A high degree of specificity is achieved with the GEO control samples.

[00157] The signature is then applied to classify subjects with 22q11.2 duplications (including variable deletion sizes, variable phenotype) into either pathogenic or benign CNVs.

Results for CHD8

[00158] CHD8 (14q11.2, OMIM# 615032) belongs to the chromodomain helicase DNA binding (CHD) family of proteins, which use the energy from ATP hydrolysis to modify chromatin structure through alterations in nucleosome positioning and composition, thus modifying gene expression. The CHD family is characterized by the presence of tandem chromo (chromatin organisation modifier) domains in the N-terminal region and a catalytic, central SNF2-related helicase/ATPase domain (Tajul-Arifin, K. et al., 2003). CHD8 is a known binding partner of CHD7, mutations in which lead to CHARGE syndrome (OMIM# 214800, coloboma of the eye, heart defects, choanal atresia, retardation of growth and development, genital hypoplasia, and ear/deafness/vestibular/olfactory/other cranial nerve disorders), which is another ASD-associated syndrome with a distinct methylation signature (Butcher, D.T. et al., 2013). CHD8 plays a significant role during embryonic development by binding to β-catenin (Thompson, B.A. et al., 2008) and in its regulation of WNT signaling (Nishiyama, M. et al., 2012).

[00159] CHD8 mutations associated with ASD are rare, accounting for less than 1% of ASD cases. Severe disruptive nonsense, splice and frameshift de novo mutations in CHD8 are highly associated with ASD (>85% of individuals), as well as other phenotypes including macrocephaly (80%), prominent supraorbital ridge (>75%), ID (60%), and attention problems (60%) (Bernier, R. et al., 2014). Although nonsense mutations in CHD8 can be predictive of severity of outcome, the same cannot be said for other genetic variants such as missense mutations or variants of unknown significance (VUS). Therefore, additional molecular markers beyond genetics alone, such as the presently disclosed DNAm signature, can help to properly classify mutations as pathogenic or benign.

CHD8 Signature

[00160] The CHD8 signature was derived from a comparison of DNAm from 8 individuals with pathogenic mutations in the CHD8 gene to 85 controls (Table 8) using the analysis methodology previously described in the Methylation Array Analysis section.

[00161] The LOO procedure confirmed that the p-value threshold 0.01, when combined with the effect size threshold |Δβ| > 5%, was the necessary significance level at which the LOO procedure yielded a sensitivity of 88% and a specificity of 100% (Table 9). Applying the statistical tests with these parameters to the full collection of 8 CHD8 pathogenic mutation samples and 85 controls, a "signature set" of 264 significant CpG sites was derived. As expected, the set defined a near perfect separation between the samples with a pathogenic CHD8 mutation and controls (Figure 8) save for one sample (CHD8-8), which has a mutation in the last exon of the CHD8 gene, and likely has residual CHD8 activity, thus rendering it non-pathogenic.

CHD8 Signature Validation

[00162] The set of 264 probes for specific CpG sites comprising the CHD8 signature were located within the bodies or promoter regions of 173 known genes (Table 8). 32 individual genes had 2 or more differentially methylated probes, including Zinc Finger and BTB domain containing 22 (ZBTB22, 12 probes), CD47 antigen molecule (CD47, 5 probes) and ArfGAP with GTPase domain, Ankyrin repeat and PH domain 2 (AGAP2, 4 probes).

[00163] Next, the specificity of the signature CpGs was validated on a collection of 1056 normal blood samples derived from GEO. Similar to the LOO procedure, median DNAm profiles for the 8 CHD8 pathogenic de novo mutation samples and for the 85 control samples, respectively, were generated. The Pearson correlation of each of the GEO samples was computed with the reference CHD8 profile and the reference control profile, using the 264 significant CpG sites. A near perfect degree (99.6%) of specificity was achieved with the GEO control samples (Figure 7). This high specificity estimate was encouraging, given the diversity and unknown phenotype of the combined data from GEO sources. Similar estimates were tabulated for additional parameter combinations for effect size threshold |Δβ| from 5% to 25% and significance level from p < 0.05 to 0.0001 (Table 12).

[00164] The signature was then applied to classify 4 subjects with CHD8 inherited missense mutations into either pathogenic or benign mutations (Figure 7). Using the same classification procedure as was used to define the signature, none of the variants were predicted to be pathogenic. This was expected given the nature of their mutations, which would be less deleterious than the de novo nonsense mutation cases used to derive the signature, and given their inheritance directly from a phenotypically normal parent (Table 7).

[00165] While the present disclosure has been described with reference to what are presently considered to be the examples, it is to be understood that the disclosure is not limited to the disclosed examples. To the contrary, the disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

[00166] All publications, patents and patent applications are herein incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety.

Table 1. 16p11.2 deletions CNV table

Table 3. Cross-validation results for different effect-size (absolute delta beta, |Δβ|) thresholds at p-value < 0.05. Shown are the specificity (Spec) and sensitivity (Sens) of the LOO procedure; the specificity for 1056 normal blood samples derived from GEO (Spec GEO); the total number of significant sites (CGs) in the resulting "16p11.2 deletion signature" set; and the gene names (Names) and their total number (Genes) corresponding to the significant sites. One optimal combination (highlighted in bold) was selected to be p-value < 0.05 and |Δβ| > 5%. The p-values are corrected for multiple testing (Benjamini-Hochberg correction).

Table 4. 22q11.2 deletions CNV table

Table 6. Cross-validation results for different effect-size (absolute delta beta, |Δβ|) thresholds at p-value < 0.01. Shown are the specificity (Spec) and sensitivity (Sens) of the LOO procedure; the specificity for 1056 normal blood samples derived from GEO (Spec GEO); the total number of significant sites (CGs) in the resulting "22q11.2 deletion signature" set; and the gene names (Names) and their total number (Genes) corresponding to the significant sites. One optimal combination (highlighted in bold) was selected to be p-value < 0.01 and |Δβ| > 5%. The p-values are corrected for multiple testing (Benjamini-Hochberg correction).

Table 7. CHD8 mutation table

Table 9. Cross-validation results for different effect-size (absolute delta beta, |Δβ|) thresholds at p-value < 0.01. Shown are the specificity (Spec) and sensitivity (Sens) of the LOO procedure; the specificity for 1056 normal blood samples derived from GEO (Spec GEO); the total number of significant sites (CGs) in the resulting "CHD8 signature" set; and the gene names (Names) and their total number (Genes) corresponding to the significant sites. One optimal combination (highlighted in bold) was selected to be p-value < 0.01 and |Δβ| > 5%. The p-values are corrected for multiple testing (Benjamini-Hochberg correction).
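The Benjamini-Hochberg correction named in the table legends above can be sketched as follows. This is the standard step-up adjustment written with NumPy only; the function name `benjamini_hochberg` is hypothetical, and the disclosure does not state which software was used for this step.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                         # ascending raw p-values
    scaled = p[order] * m / np.arange(1, m + 1)   # p_(i) * m / i
    # Enforce monotonicity: each adjusted value is the minimum of the
    # scaled values at its own rank and all higher ranks.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

# Toy p-values for 4 CpG sites (illustrative values only).
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

The adjusted values would then be compared against the chosen significance level (for example 0.05 or 0.01) before the effect-size threshold is applied.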

References

1. Turinsky, A.L. et al. DAnCER: disease-annotated chromatin epigenetics resource. Nucleic Acids Res 39, D889-94 (2010).

2. Bjornsson, H.T. The Mendelian disorders of the epigenetic machinery. Genome Res 25, 1473-81 (2015).

3. Kleefstra, T., Schenck, A., Kramer, J.M. & van Bokhoven, H. The genetics of cognitive epigenetics. Neuropharmacology 80, 83-94 (2014).

4. Tatton-Brown, K. et al. Mutations in the DNA methyltransferase gene DNMT3A cause an overgrowth syndrome with intellectual disability. Nat Genet 46, 385-8 (2014).

5. Bernier, R. et al. Disruptive CHD8 mutations define a subtype of autism early in development. Cell 158, 263-76 (2014).

6. Grafodatskaya, D. et al. Multilocus loss of DNA methylation in individuals with mutations in the histone H3 lysine 4 demethylase KDM5C. BMC Med Genomics 6, 1 (2013).

7. Lalani, S.R. et al. Spectrum of CHD7 mutations in 110 individuals with CHARGE syndrome and genotype-phenotype correlation. Am J Hum Genet 78, 303-14 (2006).

8. Choufani, S. et al. NSD1 mutations generate a genome-wide DNA methylation signature. Nature Communications In press. (2015).

9. Berko, E.R. et al. Mosaic epigenetic dysregulation of ectodermal cells in autism spectrum disorder. PLoS Genet 10, e1004402 (2014).

10. Ladd-Acosta, C. et al. Common DNA methylation alterations in multiple brain regions in autism. Mol Psychiatry 19, 862-71 (2014).

11. Nguyen, A., Rauch, T.A., Pfeifer, G.P. & Hu, V.W. Global methylation profiling of lymphoblastoid cell lines reveals epigenetic contributions to autism spectrum disorders and a novel autism candidate gene, RORA, whose protein product is reduced in autistic brain. FASEB J 24, 3036-51 (2010).

12. Wong, C.C. et al. Methylomic analysis of monozygotic twins discordant for autism spectrum disorder and related behavioural traits. Mol Psychiatry 19, 495-503 (2014).

13. Ginsberg, M.R., Rubin, R.A., Falcone, T., Ting, A.H. & Natowicz, M.R. Brain transcriptional and epigenetic associations with autism. PLoS One 7, e44736 (2012).

14. Hanson, E. et al. The cognitive and behavioral phenotype of the 16p11.2 deletion in a clinically ascertained population. Biol Psychiatry 77, 785-93 (2015).

15. Leung, T.Y., Pooh, R.K., Wang, C.C., Lau, T.K. & Choy, K.W. Classification of pathogenic or benign status of CNVs detected by microarray analysis. Expert Rev Mol Diagn 10, 717-21 (2010).

16. Harris, R.A. et al. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications. Nat Biotechnol 28, 1097-105 (2010).

17. Stevens, M. et al. Estimating absolute methylation levels at single-CpG resolution from methylation enrichment and restriction enzyme sequencing methods. Genome Res 23, 1541-53 (2013).

18. Hirst, M. Epigenomics: sequencing the methylome. Methods Mol Biol 973, 39-54 (2013).

19. Fischbach, G.D. & Lord, C. The Simons Simplex Collection: a resource for identification of autism genetic risk factors. Neuron 68, 192-5 (2010).

20. Chen, Y.A. et al. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics 8, 203-209 (2013).

21. Chen, Y.A. et al. Sequence overlap between autosomal and sex-linked probes on the Illumina HumanMethylation27 microarray. Genomics 97, 214-22 (2011).

22. Aryee, M.J. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363-9 (2014).

23. Walters, R.G. et al. A new highly penetrant form of obesity due to deletions on chromosome 16p11.2. Nature 463, 671-5 (2010).

24. Fernandez, B.A. et al. Phenotypic spectrum associated with de novo and inherited deletions and duplications at 16p11.2 in individuals ascertained for diagnosis of autism spectrum disorder. J Med Genet 47, 195-203 (2010).

25. Shinawi, M. et al. Recurrent reciprocal 16p11.2 rearrangements associated with global developmental delay, behavioural problems, dysmorphism, epilepsy, and abnormal head size. J Med Genet 47, 332-41 (2010).

26. Qureshi, A.Y. et al. Opposing brain differences in 16p11.2 deletion and duplication carriers. J Neurosci 34, 11199-211 (2014).

27. McCarthy, S.E. et al. Microduplications of 16p11.2 are associated with schizophrenia. Nat Genet 41, 1223-7 (2009).

28. Fine, S.E. et al. Autism spectrum disorders and symptoms in children with molecularly confirmed 22q11.2 deletion syndrome. J Autism Dev Disord 35, 461-70 (2005).

29. Murphy, K.C., Jones, L.A. & Owen, M.J. High rates of schizophrenia in adults with velo-cardio-facial syndrome. Arch Gen Psychiatry 56, 940-5 (1999).

30. Cohen, E., Chow, E.W., Weksberg, R. & Bassett, A.S. Phenotype of adults with the 22q11 deletion syndrome: A review. Am J Med Genet 86, 359-65 (1999).

31. Yobb, T.M. et al. Microduplication and triplication of 22q11.2: a highly variable syndrome. Am J Hum Genet 76, 865-76 (2005).

32. Tajul-Arifin, K. et al. Identification and analysis of chromodomain-containing proteins encoded in the mouse transcriptome. Genome Res 13, 1416-29 (2003).

33. Butcher, D.T. et al. DNA methylation alternations in CHARGE patients with heterozygous CHD7 mutations. Proceedings of the 63rd Annual Meeting of the American Society of Human Genetics. (2013).

34. Thompson, B.A., Tremblay, V., Lin, G. & Bochar, D.A. CHD8 is an ATP-dependent chromatin remodeling factor that regulates beta-catenin target genes. Mol Cell Biol 28, 3894-904 (2008).

35. Nishiyama, M., Skoultchi, A.I. & Nakayama, K.I. Histone H1 recruitment by CHD8 is essential for suppression of the Wnt-beta-catenin signaling pathway. Mol Cell Biol 32, 501-12 (2012).