Title:
METHOD OF DIAGNOSING OR PROGNOSING A NEUROLOGICAL DISORDER
Document Type and Number:
WIPO Patent Application WO/2014/012144
Kind Code:
A1
Abstract:
Enabled herein are methods for the diagnosis and prognosis of Autism Spectrum Disorder (ASD), predicated in part on the identification of genetic markers instructive as to the presence of or a predisposition to developing ASD, or which are indicative of the absence of ASD or protection against developing ASD. The genetic markers are single nucleotide polymorphisms (SNPs) populating molecular pathways that are associated directly or indirectly with the development of or protection from ASD. The SNPs are applied to generate a predictive classifier for phenotypes of affected individuals and their parents. Methods of treatment, kits and base stations for performing the methods of diagnosis and prognosis disclosed herein are also provided.

Inventors:
PANTELIS CHRISTOS (AU)
SKAFIDAS EFSTRATIOS (AU)
CHANA GURSHARAN (AU)
EVERALL IAN (AU)
TESTA RENEE (AU)
ZANTOMIO DANIELA (AU)
Application Number:
PCT/AU2013/000798
Publication Date:
January 23, 2014
Filing Date:
July 19, 2013
Assignee:
UNIV MELBOURNE (AU)
International Classes:
C12Q1/68
Domestic Patent References:
WO2012079008A2 (2012-06-14)
WO2000050444A1 (2000-08-31)
WO2005008249A1 (2005-01-27)
WO2006055927A2 (2006-05-26)
WO1999042575A2 (1999-08-26)
Foreign References:
US20110207124A1 (2011-08-25)
Other References:
WANG, K. ET AL.: "Common genetic variants on 5p14.1 associate with autism spectrum disorders", NATURE, vol. 459, 28 May 2009 (2009-05-28), pages 528 - 533
SKAFIDAS, E. ET AL.: "Predicting the diagnosis of autism spectrum disorder using gene pathway analysis", MOLECULAR PSYCHIATRY, 11 September 2012 (2012-09-11), pages 1 - 7
SKAFIDAS, E. ET AL.: "Genetic Factors Identifying Risk and Resilience in Autism Spectrum Disorders", BIOLOGICAL PSYCHIATRY, vol. 73, no. 9, 1 May 2013 (2013-05-01), pages 147S
Attorney, Agent or Firm:
DAVIES COLLISON CAVE (Melbourne, Victoria 3000, AU)
Claims:
CLAIMS

A method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject comprising a gene encoding calcium-activated potassium channel subunit β4 and screening for a genetic marker in said gene that is statistically associated with ASD or the absence of ASD, wherein the presence of the genetic marker is indicative that the subject has ASD, a predisposition to developing ASD, the absence of ASD or is protected from developing ASD.

The method of Claim 1, wherein the genetic marker is a single nucleotide polymorphism (SNP).

The method of Claim 2 wherein the SNP is indicative of the presence of ASD or a predisposition to developing ASD in the subject.

The method of Claim 3, wherein the SNP is rs968122.

The method of Claim 3, further comprising identifying a SNP statistically associated with ASD in a gene encoding a protein selected from the list consisting of:

(a) guanine nucleotide-binding protein G(o) subunit alpha;

(b) metabotropic glutamate receptor 5;

(c) phosphatidylinositol-3,4,5-trisphosphate 5-phosphatase 1;

(d) adenylate cyclase type 8;

(e) voltage dependent calcium channel α2/δ subunit 3;

(f) platelet-derived growth factor D;

(g) adenylate cyclase type 3;

(h) phosphatidylinositol-4-phosphate 3-kinase C2 domain-containing δ polypeptide;

(i) voltage-dependent calcium channel α1δ subunit;

(j) Calmodulin 1;

(k) receptor tyrosine protein kinase erb B4;

(l) protein tyrosine phosphatase receptor-type R; and

(m) L-type voltage-dependent calcium channel α1C subunit.

The method of Claim 4, wherein the SNP in the gene encoding guanine nucleotide-binding protein G(o) subunit alpha is rs876619.

The method of Claim 4, wherein the SNP in the gene encoding metabotropic glutamate receptor 5 is rs11020772.

The method of Claim 4, wherein the SNP in the gene encoding platelet-derived growth factor D is rs1818106.

The method of Claim 4, wherein the gene encoding phosphatidylinositol-3,4,5-trisphosphate 5-phosphatase 1 is INPP5D and the SNP is selected from the list consisting of rs9288685 and rs10193128.

The method of Claim 4, wherein the gene encoding adenylate cyclase type 8 is ADCY8 and the SNP is rs7842798.

The method of Claim 4, wherein the gene encoding voltage dependent calcium channel α2/δ subunit 3 is CACNA2D3 and the SNP is rs3773540.

The method of Claim 4, wherein the gene encoding adenylate cyclase type 3 is ADCY3 and the SNP is rs2384061.

The method of Claim 4, wherein the gene encoding phosphatidylinositol-4-phosphate 3-kinase C2 domain-containing δ polypeptide is PIK3C2G and the SNP is rs12582971.

The method of Claim 4, wherein the gene encoding voltage-dependent calcium channel α1δ subunit is CACNA1A and the SNP is rs10409541.

The method of Claim 4, wherein the gene encoding Calmodulin 1 is CALM1 and the SNP is rs2300497.

The method of Claim 4, wherein the gene encoding receptor tyrosine protein kinase erb B4 is ERBB4 and the SNP is rs7562445.

The method of Claim 4, wherein the gene encoding tyrosine phosphatase receptor-type R is PTPRR and the SNP is rs7313997.

The method of Claim 4, wherein the gene encoding L-type voltage-dependent calcium channel α1C subunit is CACNA1C and the SNP is rs2239118.

The method of Claim 1, further comprising identifying a profile of two or more SNPs as listed in Table 1.

The method of Claim 2 wherein the SNP is indicative of the absence of ASD or protection from developing ASD in the subject.

The method of Claim 20, wherein the SNP is rs12317962.

The method of Claim 21, further comprising identifying a SNP statistically associated with the absence of ASD in a gene encoding a protein selected from the list consisting of:

(a) cGMP-dependent protein kinase 1 alpha isozyme;

(b) nuclear factor NF-kappa-B p105 subunit;

(c) C-terminal binding protein 2;

(d) olfactory receptor 6S1;

(e) olfactory receptor 10H3;

(f) platelet-derived growth factor D;

(g) Axin-2;

(h) ubiquitin-conjugated enzyme E2D2;

(i) olfactory receptor 5L1;

(j) deoxycytidine kinase;

(k) guanine nucleotide binding protein α14 subunit;

(l) metabotropic glutamate receptor 5; and

(m) guanine nucleotide-binding protein G(o) subunit alpha.

The method of Claim 21, wherein the gene encoding cGMP-dependent protein kinase 1 alpha isozyme is PRKG1 and the SNP is rs17629494.

The method of Claim 22, wherein the gene encoding nuclear factor NF-kappa-B p105 subunit is NFKB1 and the SNP is rs4648135.

The method of Claim 22, wherein the gene encoding C-terminal binding protein 2 is CTBP2 and the SNP is rs17643974.

The method of Claim 22, wherein the gene encoding olfactory receptor 6S1 is OR6S1 and the SNP is rs1243679.

The method of Claim 22, wherein the gene encoding olfactory receptor 10H3 is OR10H3 and the SNP is rs2240228.

The method of Claim 22, wherein the SNP in a gene encoding platelet-derived growth factor D is rs260808.

The method of Claim 22, wherein the gene encoding Axin-2 is AXIN2 and the SNP is rs4128941.

The method of Claim 22, wherein the gene encoding ubiquitin-conjugated enzyme E2D2 is UBE2D2 and the SNP is rs769052.

The method of Claim 22, wherein the gene encoding olfactory receptor 5L1 is OR5L1 and the SNP is rs984371.

The method of Claim 22, wherein the gene encoding deoxycytidine kinase is DCK and the SNP is rs4308342.

The method of Claim 22, wherein the gene encoding guanine nucleotide binding protein, α14 subunit is GNA14 and the SNP is rs11145506.

The method of Claim 22, wherein the SNP in a gene encoding metabotropic glutamate receptor 5 is selected from the list consisting of rs905646 and rs6483362.

The method of Claim 22, wherein the SNP in a gene encoding guanine nucleotide-binding protein G(o) subunit alpha is rs8053370.

The method of any one of Claims 1 to 35, wherein the subject is human.

The method of claim 36, wherein the human is one of an ethnic group of humans.

The method of Claim 37, wherein the ethnic group is not Han Chinese.

The method of Claim 38, comprising identifying an SNP in a gene encoding a peptide involved in Wnt signalling.

The method of any one of Claims 3 to 19 further comprising, where the subject is determined as having or having a predisposition to developing ASD, exposing the subject to a treatment for inhibiting the progression of ASD or for inhibiting the onset of ASD or for ameliorating the symptoms of ASD.

A kit for determining whether a subject has or has a predisposition to develop ASD, the kit comprising a set of primers and/or probes for identifying a genetic marker in accordance with the method of any one of Claims 1 to 3 .

A method of any one of Claims 1 to 39, further comprising:

(a) receiving data on the presence of the genetic marker from a user via a communications network;

(b) processing the data to establish an ASD index value based on the presence or absence of the genetic marker;

(c) determining the status of the subject in accordance with the results of the ASD index value in comparison with a predetermined value; and

(d) transferring an indication of the status of the subject to the user via a communications network.

The method of Claim 41, further comprising:

(a) having the user determine the data using a remote end station; and

(b) transferring the data from the end station to the base station via the communications network.

The method of Claim 42, further comprising transferring the data through a firewall.

The method of any one of Claims 41 to 43, further comprising causing the base station to:

(a) determine payment information, the payment information representing the provision of payment by the user; and

(b) perform the data processing and transfer in response to the determination of the payment information.

A base station for stratifying a subject with respect to ASD in accordance with the method of any one of Claims 1 to 39, the base station comprising:

(a) a store method;

(b) a processing system, the processing system being adapted to:

(i) receive subject data from a user via a communications network, the data comprising information on the presence or absence of the genetic marker in the subject;

(ii) performing a processing function including comparing the data to predetermined data;

(iii) determining the status of the subject in accordance with the results of the processing function including the comparison; and

(c) output an indication of the status of the subject to the user via the communications network.

The base station of Claim 45, wherein the processing system is adapted to receive data from a remote end station adapted to determine the data.

The base station of Claim 46, wherein the processing system comprises:

(a) a first processing system, adapted to:

(i) receive the data; and

(ii) determine the state of the subject in accordance with the data including comparing the data; and

(b) a second processing system adapted to:

(i) receive the data from the processing system;

(ii) perform the processing function including the comparison; and

(iii) transfer the results to the first processing system.

48. A method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

49. A method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

50. A method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

51. A method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

52. A method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 1, and wherein the suitable number of SNPs is statistically associated with the presence of ASD or a predisposition to developing ASD.

53. The method of claim 52, wherein the method comprises screening for all SNPs listed in Table 1 that are statistically associated with the presence of ASD or a predisposition to developing ASD.

54. A method for determining the absence of ASD or protection from developing ASD in a subject, the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 1, and wherein the suitable number of SNPs is statistically associated with the presence of ASD or a predisposition to developing ASD.

55. The method of claim 54, wherein the method comprises screening for all SNPs listed in Table 1 that are statistically associated with the absence of ASD or a protection from developing ASD.

56. A method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 5, wherein the suitable number of SNPs is statistically associated with the presence of ASD or a predisposition to developing ASD.

57. The method of claim 56, wherein the suitable number of SNPs comprises: rs17618615, rs6650972, rs 108231 5, rs1942052, rs9798267, rs4696443, rs4648135, rs2431 6, rs7145618, rs1013459, rsl09S2662, rs7580690, rs7756516, rs3935743, rs7903424, rs8054767, rs11001056, rs2684777, rs2300497, rs16931011, rs4324526, rs11736177, rs2239118, rs3020827, rs2270838, rs17643974, rs2036109, rs12582971, rs1873423, rs3734464, rs7067880, rs1029088, rsl 046868 J , rs7731023, rs8840805 rs629720, rs2920022, rs4465567, rs2394538, r l 0794 197, rs? 100765, rs312 ! 309, rs42540S6, rs976266, rs13359392, rs27i 6191 , rs6483362 and rsl 0407 ; 44.

58. The method of claim 56, wherein the method comprises screening for all SNPs identified in Table 5 that are statistically associated with the presence of ASD or a predisposition to developing ASD.

59. A method for determining the absence of ASD or protection from developing ASD in a subject, the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 5 that are statistically associated with the absence of ASD or protection from developing ASD.

60. The method of claim 59, wherein the suitable number of SNPs comprises: rs11583646, rs258?891, rs1480645, rs!2462609, rs46513435 rs8888!7, rsl()50395, rs697! 99, rs1928168, rs2272!97, rs11644436, rs806346L rs11602535, rs339408, rs! 0762342, rs7536307, rs7512378, rs4947963, rs3904668, rs4128941, rs7870040, rs6679454, rs 271 6928, rslG783235. rs3910363, rs4647992 and rs16853387.

Description:
METHOD OF DIAGNOSING OR PROGNOSING A NEUROLOGICAL DISORDER

TECHNICAL FIELD

[0001] This disclosure relates generally to methods of diagnosing and prognosing a neurological disorder. In particular, methods are taught herein for the diagnosis and prognosis of Autism Spectrum Disorder using genetic markers, including single nucleotide polymorphisms.

BACKGROUND

[0002] Neurological disorders represent potentially debilitating conditions and can affect people of all ages. One particular disorder is Autism Spectrum Disorder (ASD). ASD is a complex group of sporadic and familial developmental disorders affecting 1 in 150 births and characterized by abnormal social interaction, impaired communication, and stereotypic behaviours. The etiology of ASD is poorly understood; however, a genetic basis is evidenced by the greater than 70% concordance in monozygotic (MZ) twins and elevated risk in siblings compared to the population. Despite this, the diagnosis of ASD, in keeping with other psychiatric disorders, depends entirely on a clinical interview, and there are no biomarkers to aid diagnosis.

[0003] The search for genetic loci in ASD, including linkage and genome wide association screens (GWAS), has identified a number of candidate genes and loci on almost every chromosome, with multiple hotspots on several chromosomes (e.g. CNTNAP2, NGLNX4, NRXN1, IMMP2L, DOCK4, SEMA5A, SYNGAP1, DLGAP2, SHANK2 and SHANK3), and copy number variations (CNVs). However, none of these have provided adequate specificity or accuracy in the diagnosis or prognosis of ASD. Also lacking is information on multiple genetic variants and their additive contribution to ASD, taking into account genetic differences between ethnicities and a consideration of protective versus vulnerability genetic markers. Thus, there is a need for methods that provide better specificity and/or accuracy in the diagnosis and prognosis of ASD.

SUMMARY

[0004] Enabled herein are methods predicated in part on the identification of genetic markers instructive as to the presence of or a predisposition to developing ASD, or which are indicative of the absence of ASD or protection against developing ASD. In an embodiment, the genetic markers are single nucleotide polymorphisms (SNPs) populating molecular pathways that are associated directly or indirectly with the development of or protection from ASD. The SNPs have been applied to generate a predictive classifier for phenotypes of affected individuals and their parents. Methods for the diagnosis and prognosis of ASD are provided herein.

[0005] Disclosed herein is a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject comprising a gene encoding calcium-activated potassium channel subunit β4 and screening for a genetic marker in the gene that is statistically associated with ASD or the absence of ASD, wherein the presence of the genetic marker is indicative that the subject has ASD, a predisposition to developing ASD, the absence of ASD or is protected from developing ASD.

[0006] In an embodiment, taught herein is a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject comprising a gene encoding calcium-activated potassium channel subunit β4 and screening for a SNP in the gene that is statistically associated with ASD or the absence of ASD, wherein the presence of the SNP is indicative that the subject has ASD, a predisposition to developing ASD, the absence of ASD or is protected from developing ASD.

[0007] In an embodiment, the genetic marker (e.g., SNP) is indicative of the presence of ASD or a predisposition to developing ASD in the subject. In an embodiment, the SNP is rs968122.

[0008] Methods taught herein also encompass screening for additional genetic markers, such as SNPs, which are statistically associated with ASD. In an embodiment, the genetic marker is a SNP in a gene encoding a protein selected from the list consisting of:

(a) guanine nucleotide-binding protein G(o) subunit alpha;

(b) metabotropic glutamate receptor 5;

(c) phosphatidylinositol-3,4,5-trisphosphate 5-phosphatase 1;

(d) adenylate cyclase type 8;

(e) voltage dependent calcium channel α2/δ subunit 3;

(f) platelet-derived growth factor D;

(g) adenylate cyclase type 3;

(h) phosphatidylinositol-4-phosphate 3-kinase C2 domain-containing δ polypeptide;

(i) voltage-dependent calcium channel α1δ subunit;

(j) Calmodulin 1;

(k) receptor tyrosine protein kinase erb B4;

(l) protein tyrosine phosphatase receptor-type R; and

(m) L-type voltage-dependent calcium channel α1C subunit.

[0009] In embodiments disclosed herein, the SNP in the gene encoding guanine nucleotide-binding protein G(o) subunit alpha is rs876619; the SNP in the gene encoding metabotropic glutamate receptor 5 is rs11020772; the SNP in the gene encoding platelet-derived growth factor D is rs1818106; the gene encoding phosphatidylinositol-3,4,5-trisphosphate 5-phosphatase 1 is INPP5D and the SNP is selected from the list consisting of rs9288685 and rs10193128; the gene encoding adenylate cyclase type 8 is ADCY8 and the SNP is rs7842798; the gene encoding voltage dependent calcium channel α2/δ subunit 3 is CACNA2D3 and the SNP is rs3773540; the gene encoding adenylate cyclase type 3 is ADCY3 and the SNP is rs2384061; the gene encoding phosphatidylinositol-4-phosphate 3-kinase C2 domain-containing δ polypeptide is PIK3C2G and the SNP is rs12582971; the gene encoding voltage-dependent calcium channel α1δ subunit is CACNA1A and the SNP is rs10409541; the gene encoding Calmodulin 1 is CALM1 and the SNP is rs2300497; the gene encoding receptor tyrosine protein kinase erb B4 is ERBB4 and the SNP is rs7562445; the gene encoding tyrosine phosphatase receptor-type R is PTPRR and the SNP is rs7313997; and the gene encoding L-type voltage-dependent calcium channel α1C subunit is CACNA1C and the SNP is rs2239118.

[0010] In an embodiment disclosed herein, the method further comprises identifying a profile of two or more SNPs as listed in Table 1.

[0011] In an embodiment, the genetic marker (e.g., SNP) is indicative of the absence of ASD or protection from developing ASD in a subject. In an embodiment, the SNP is rs12317962.

[0012] Methods taught herein also encompass screening for additional genetic markers, such as SNPs, which are statistically associated with the absence of ASD or protection from developing ASD. In an embodiment, the genetic marker is a SNP in a gene encoding a protein selected from the list consisting of:

(a) cGMP-dependent protein kinase 1 alpha isozyme;

(b) nuclear factor NF-kappa-B p105 subunit;

(c) C-terminal binding protein 2;

(d) olfactory receptor 6S1;

(e) olfactory receptor 10H3;

(f) platelet-derived growth factor D;

(g) Axin-2;

(h) ubiquitin-conjugated enzyme E2D2;

(i) olfactory receptor 5L1;

(j) deoxycytidine kinase;

(k) guanine nucleotide binding protein α14 subunit;

(l) metabotropic glutamate receptor 5; and

(m) guanine nucleotide-binding protein G(o) subunit alpha.

[0013] In embodiments disclosed herein, the gene encoding cGMP-dependent protein kinase 1 alpha isozyme is PRKG1 and the SNP is rs17629494; the gene encoding nuclear factor NF-kappa-B p105 subunit is NFKB1 and the SNP is rs4648135; the gene encoding C-terminal binding protein 2 is CTBP2 and the SNP is rs17643974; the gene encoding olfactory receptor 6S1 is OR6S1 and the SNP is rs1243679; the gene encoding olfactory receptor 10H3 is OR10H3 and the SNP is rs2240228; the SNP in the gene encoding platelet-derived growth factor D is rs260808; the gene encoding Axin-2 is AXIN2 and the SNP is rs4128941; the gene encoding ubiquitin-conjugated enzyme E2D2 is UBE2D2 and the SNP is rs769052; the gene encoding olfactory receptor 5L1 is OR5L1 and the SNP is rs984371; the gene encoding deoxycytidine kinase is DCK and the SNP is rs4308342; the gene encoding guanine nucleotide binding protein, α14 subunit is GNA14 and the SNP is rs11145506; the SNP in a gene encoding metabotropic glutamate receptor 5 is selected from rs905646 and rs6483362; and the SNP in the gene encoding guanine nucleotide-binding protein G(o) subunit alpha is rs8053370.
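By way of illustration only, the gene and SNP pairings recited in paragraphs [0009] and [0013] can be held in a simple lookup structure and checked against the markers detected in a sample. The following Python sketch is not part of the disclosure: the names `RISK_MARKERS`, `PROTECTIVE_MARKERS`, `ScreenResult` and `screen_markers` are hypothetical, only the markers for which a gene symbol is given above are included, and a detected marker is assumed to be represented simply by its rs identifier.

```python
from dataclasses import dataclass

# Gene symbol -> SNP identifiers, transcribed from paragraph [0009]
# (markers associated with ASD or a predisposition to ASD).
RISK_MARKERS = {
    "INPP5D": ("rs9288685", "rs10193128"),
    "ADCY8": ("rs7842798",),
    "CACNA2D3": ("rs3773540",),
    "ADCY3": ("rs2384061",),
    "PIK3C2G": ("rs12582971",),
    "CACNA1A": ("rs10409541",),
    "CALM1": ("rs2300497",),
    "ERBB4": ("rs7562445",),
    "PTPRR": ("rs7313997",),
    "CACNA1C": ("rs2239118",),
}

# Gene symbol -> SNP identifiers, transcribed from paragraph [0013]
# (markers associated with absence of, or protection from, ASD).
PROTECTIVE_MARKERS = {
    "PRKG1": ("rs17629494",),
    "NFKB1": ("rs4648135",),
    "CTBP2": ("rs17643974",),
    "OR6S1": ("rs1243679",),
    "OR10H3": ("rs2240228",),
    "AXIN2": ("rs4128941",),
    "UBE2D2": ("rs769052",),
    "OR5L1": ("rs984371",),
    "DCK": ("rs4308342",),
    "GNA14": ("rs11145506",),
}


@dataclass(frozen=True)
class ScreenResult:
    """Hypothetical container for the markers found in one sample."""
    risk_hits: tuple
    protective_hits: tuple


def _hits(panel, detected):
    # Collect every panel SNP that also appears in the detected set.
    return tuple(sorted(
        snp for snps in panel.values() for snp in snps if snp in detected
    ))


def screen_markers(detected):
    """Report which listed markers are present in a set of detected rs IDs."""
    return ScreenResult(
        risk_hits=_hits(RISK_MARKERS, detected),
        protective_hits=_hits(PROTECTIVE_MARKERS, detected),
    )
```

A call such as `screen_markers({"rs7842798", "rs769052"})` returns the subset of each panel found in the sample. How such hits would be weighted or combined is not specified in this passage and is left open in the sketch.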

[0014] The instant disclosure is also instructional for a method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

[0015] The instant disclosure is also instructional for a method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

[0016] The instant disclosure is also instructional for a method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

[0017] The instant disclosure is also instructional for a method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

[0018] The instant disclosure is also instructional for a method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and amplifying genomic DNA or corresponding RNA using primers which are selective for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

[0019] The instant disclosure is also instructional for a method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and amplifying genomic DNA or corresponding RNA using primers which are selective for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

[0020] The instant disclosure is also instructional for a method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and amplifying genomic DNA or corresponding RNA using primers which are selective for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

[0021] The instant disclosure is also instructional for a method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and amplifying genomic DNA or corresponding RNA using primers which are selective for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

[0022] The instant disclosure is also instructional for a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 1, and wherein the suitable number of SNPs is statistically associated with the presence of ASD or a predisposition to developing ASD.

[0023] The instant disclosure is also instructional for a method for determining the absence of ASD or protection from developing ASD in a subject, the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 1, and wherein the suitable number of SNPs is statistically associated with the presence of ASD or a predisposition to developing ASD.

[0024] The instant disclosure is also instructional for a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject and screening for an SNP in the genetic sample that is statistically associated with the presence of ASD or a predisposition to developing ASD, wherein the SNP is selected from the list in Table 5 that is statistically associated with the presence of ASD or a predisposition to developing ASD.

[0025] The instant disclosure is also instructional for a method for determining the absence of ASD or protection from developing ASD in a subject, the method comprising collecting a genetic sample from the subject and screening for an SNP in the genetic sample that is statistically associated with the absence of ASD or protection from developing ASD, wherein the SNP is selected from the list in Table 5 that is statistically associated with the absence of ASD or protection from developing ASD.

[0026] The instant disclosure is also instructional for a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 5, wherein the suitable number of SNPs is statistically associated with the presence of ASD or a predisposition to developing ASD.

[0027] The instant disclosure is also instructional for a method for determining the absence of ASD or protection from developing ASD in a subject, the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 5 that are statistically associated with the absence of ASD or protection from developing ASD.

[0028] Also taught herein are methods further comprising, where the subject is determined as having or having a predisposition to developing ASD, exposing the subject to a treatment for inhibiting the progression of ASD or for inhibiting the onset of ASD or for ameliorating the symptoms of ASD.

[0029] The instant disclosure is also instructional for a kit for determining whether a subject has or has a predisposition to develop ASD, the kit comprising a set of primers and/or probes for identifying the genetic marker, such as a SNP, in accordance with the methods disclosed herein.

[0030] Further provided is a machine-implemented means for stratifying a subject with respect to ASD in accordance with the methods disclosed herein, comprising:

(a) receiving subject data from a user via a communications network, the data comprising information on the presence or absence of the genetic marker in the subject;

(b) performing a processing function including comparing the data to predetermined data;

(c) determining the status of the subject in accordance with the results of the processing function including the comparison; and

(d) transferring an indication of the status of the subject to the user via the communications network.
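Steps (a) to (d) can be pictured with a minimal sketch. It assumes, purely for illustration, that the subject data arrive as a mapping from SNP identifier to presence or absence, and that the predetermined data are a set of marker identifiers with a cut-off count. The names `stratify_subject`, `reference_markers` and `min_hits` are hypothetical and do not appear in the disclosure, and the network transfer of steps (a) and (d) is reduced to a function call and a returned dictionary.

```python
def stratify_subject(subject_data, reference_markers, min_hits=1):
    """Compare received marker data with predetermined data and return a status.

    subject_data: dict mapping SNP identifier -> True/False (marker present?)
    reference_markers: set of SNP identifiers treated as the predetermined data
    min_hits: hypothetical cut-off on the number of matching markers
    """
    # Step (b): the processing function is a comparison against the reference set.
    hits = sorted(
        snp for snp, present in subject_data.items()
        if present and snp in reference_markers
    )
    # Step (c): derive a status from the result of the comparison.
    if len(hits) >= min_hits:
        status = "marker profile detected"
    else:
        status = "marker profile not detected"
    # Step (d): the returned value stands in for the indication sent to the user.
    return {"status": status, "matched_markers": hits}


# Illustrative input only; the presence/absence values are made up.
received = {"rs968122": True, "rs876619": False}
result = stratify_subject(received, {"rs968122", "rs876619"})
```

In a deployed system the comparison step would be replaced by whatever processing function the base station implements; the sketch only shows the order of operations.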

[0031] In an embodiment, the genetic marker is a SNP.

[0032] Also disclosed herein is a method further comprising:

(a) having the user determine the data using a remote end station; and

(b) transferring the data from the end station to the base station via the communications network.

[0033] The method disclosed herein may further comprise transferring the data through a firewall. In an embodiment, the method further comprises causing the base station to:

(a) determine payment information, the payment information representing the provision of payment by the user; and

(b) perform the data processing and transfer in response to the determination of the payment information.

[0034] The instant disclosure is also instructional for a base station for stratifying a subject with respect to ASD in accordance with the methods disclosed herein, the base station comprising:

(a) a store method;

(b) a processing system, the processing system being adapted to:

(c) receive subject data from a user via a communications network, the data comprising information on the presence or absence of the genetic marker in the subject;

(d) perform a processing function including comparing the data to predetermined data;

(e) determine the status of the subject in accordance with the results of the processing function including the comparison; and

(f) output an indication of the status of the subject to the user via the communications network.

[0035] In an embodiment disclosed herein, the processing system is adapted to receive data from a remote end station adapted to determine the data. The processing system may also comprise:

(a) a first processing system adapted to:

(i) receive the data; and

(ii) determine the status of the subject in accordance with the data including comparing the data; and

(b) a second processing system adapted to:

(i) receive the data from the processing system;

(ii) perform the processing function including the comparison; and

(iii) transfer the results to the first processing system.

BRIEF DESCRIPTION OF THE FIGURES

[0036] Figures 1a and 1b show flow charts on the subjects used in the analyses described herein. AGRE - Autism Genetic Resource Exchange; SFARI - Simons Foundation Autism Research Initiative; WTBC - Wellcome Trust 1958 normal birth cohort; CEU - of Central (Western & Northern) European origin; HAN - of Han Chinese origin; TSI - of Tuscan Italian origin. For Fig 1a & 1b: 'red boxes' - samples used in developing the predictive algorithm; 'blue boxes' - samples used to investigate different ethnic groups; 'green boxes' - validation sets; 'light green boxes' - relatives assessed, including parents and unaffected siblings.

[0037] Figure 2 shows the Cumulative Coefficient Estimation Error and Percentage Classification Error as a function of p-value; p=0.005 provides good trade-off between classification performance and cumulative regression coefficient error.

[0038] Figure 3a shows the genetic-based classification of the CEU population (AGRE and Controls) for ASD and Non-ASD individuals, showing the Gaussian approximation of the distribution of individuals. As both the mapped ASD and control populations were well approximated by normal distributions, the asymptotic Test Positive Predictive Value (PPV) and Negative Predictive Value (NPV) were determined. For individuals with CEU ancestry the PPV and NPV were 96.72% and 94.74%, respectively. (Note the test was substantially less predictive on individuals with different ancestry, e.g., Han Chinese). Key: ASD - Autism Spectrum Disorder; Autism Classifier Score - scores for each individual derived from the predictive algorithm, with greater values representing greater risk for autism.

[0039] Figure 3b shows Genetic based classification of CEU population, including 1st degree relatives (parents and siblings of ASD children). Note that the distribution of relatives of ASD children maps between the ASD and the control groups, with no difference found between mothers and fathers (see Supplementary material S5). Key: ASD - Autism Spectrum Disorder; Relatives - 1st degree relatives (parents and siblings); Siblings - siblings of ASD cases not meeting criteria for ASD; Autism Classifier Score - scores for each individual derived from the predictive algorithm, with greater values representing greater risk for autism.

[0040] Figure 4 shows classifier performance. Labels were randomly permuted on the training sample and the resulting classifier was used to determine clinical status in the independent validation samples. The graph indicates the percentage of classifiers versus misclassification rate. The trained classifier on the correctly labelled data had the best performance of all other classifiers. The probability that the other classifier trained on the permuted labeled data has better performance is less than 1x10 " .
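The label-permutation check described for Figure 4 can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the classifier is replaced by a hypothetical nearest-mean rule on a one-dimensional score, and the function names and toy data are invented for the example. The idea shown is only the general one: retrain on shuffled training labels, evaluate on the untouched validation set, and count how often a shuffled-label classifier beats the correctly trained one.

```python
import random


def train_threshold(scores, labels):
    """Nearest-mean rule on a 1-D score; returns (threshold, direction)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    direction = 1 if mean_pos >= mean_neg else -1
    return (mean_pos + mean_neg) / 2, direction


def error_rate(model, scores, labels):
    """Misclassification rate of a (threshold, direction) model."""
    threshold, direction = model
    wrong = 0
    for s, y in zip(scores, labels):
        predicted_case = direction * (s - threshold) > 0
        if predicted_case != (y == 1):
            wrong += 1
    return wrong / len(labels)


def permutation_test(train_x, train_y, valid_x, valid_y, n_perm=1000, seed=0):
    """Return (observed error, fraction of permuted-label models with lower error)."""
    rng = random.Random(seed)
    observed = error_rate(train_threshold(train_x, train_y), valid_x, valid_y)
    shuffled = list(train_y)
    better = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # shuffling keeps the class counts unchanged
        permuted_model = train_threshold(train_x, shuffled)
        if error_rate(permuted_model, valid_x, valid_y) < observed:
            better += 1
    return observed, better / n_perm


# Toy, well-separated data (made up for illustration).
train_x = [-2, -1, 0, 1, 5, 6, 7, 8]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
valid_x = [-1.5, 0.5, 6.5, 7.5]
valid_y = [0, 0, 1, 1]
observed_error, p_better = permutation_test(train_x, train_y, valid_x, valid_y)
```

With a finite number of permutations the estimated fraction cannot resolve probabilities much smaller than one over the number of permutations, so very small probabilities require correspondingly many permutations.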

[0041] Figure 5 shows ASD prediction based on the area under ROC. The Receiver Operator Characteristic (ROC) curve determined for the independent validation set (SFARI & WTBC), not previously seen by the classifier, shows the performance of the classifier as a function of false positive and true positive rates, as compared to random. The area under the curve is 0.749.
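For orientation, the area under an ROC curve equals the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control, with ties counted as one half. The sketch below computes that quantity directly from two lists of scores; the function name `roc_auc` and the example scores are hypothetical and are not taken from the validation data.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0      # case ranked above control
            elif case == control:
                wins += 0.5      # ties count as half
    return wins / (len(case_scores) * len(control_scores))


# Made-up scores for illustration only.
auc_example = roc_auc([3, 5, 7], [1, 4, 6])
```

A value of 0.5 corresponds to a random classifier and 1.0 to perfect separation of cases from controls.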

[0042] Figure 6 shows the distribution of Relatives and Parents in AGRE. The mean values for parents (mothers = 2.83, S.D. = 2.17; fathers = 2.93, S.D. = 2.34) are similar to the mean value for the relatives overall (parents and siblings combined, mean = 2.68, S.D. = 2.27). Values for unaffected siblings (not meeting diagnostic criteria for ASD) fell between parents and ASD cases (mean = 4.74, S.D. = 3.80). Mean for Controls = -0.95, S.D. = 3.01; Mean for ASD cases = 7.74, S.D. = 2.07.

[0043] Figure 7 shows a principal components analysis demonstrating separation of HapMap populations on 3 principal components (PC). PC - Principal Component; ASW - African Ancestry South Western USA; CEU - Central European; CHB - Han Chinese Beijing; CHD - Han Chinese Denver; GIH - Gujarati Indians in Houston; JPT - Japanese Tokyo; LWK - Luhya in Webuye; MEX - Mexican Los Angeles; MKK - Maasai in Kinyawa, Kenya; TSI - Tuscan Italians; YRI - Yoruba in Ibadan, Nigeria.

[0044] Figure 8 shows a principal components analysis demonstrating separation of HapMap populations on 3 principal components (PC). Principal component analysis was performed on the 237 classifier SNPs. Within the assessed populations (White Non-Hispanic AGRE and SFARI) and the 58 WTBC, the two principal components account for 1.86 and 1.69% of the variance. Furthermore, a two-sample Kolmogorov-Smirnov test comparing whether AGRE or SFARI differed from 58-WTBC on either of the two major principal components found that the null hypothesis (i.e., that the data was from the same distribution) could not be rejected (p<0.4).

[0045] Figures 9-11 show training and performance on an expanded dataset including all White Non-Hispanics from the AGRE, SFARI and WTBC using bootstrapping, in which 80% of the sample was used for training and the remaining 20% as a validation set, repeated 10,000 times in order to estimate the performance of the classifier based on the identified SNPs. Note that over all random samples, the trained classifier exceeded 70% performance on both cases and controls (Figure 9). Validation performance on both cases and controls was determined for all 10,000 trained classifiers, illustrating good performance on both cases and controls for all of the trained samples. Average classifier performance based on 10,000 classifiers on random validation subsamples was 70.88% with a standard deviation of 1.67% and a 90% confidence interval [CI: 68.08-73.61] (Figure 10). The performance on AGRE versus SFARI indicates similar levels of classification performance (Figure 11). Key: WNH - White Non-Hispanic; AGRE - Autism Genetic Resource Exchange; SFARI - Simons Foundation Autism Research Initiative.

[0046] Figure 12 shows the distribution of marked language delay (A) and classification performance on minimal and marked language delay within WNH ASD individuals within the AGRE and SFARI cohorts. Marked language delay is defined as individuals with an ADOS score of two or greater. Minimal language delay is defined as individuals with a score of 1 or less. The 90% confidence interval of classifier percentage correct performance had a median of 59.67% [CI: 51.42-67.92] (B). Distribution of gaze avoidance in AGRE and SFARI ASD populations that is commensurate with our classifier's ability to discern between siblings not meeting diagnostic criteria and probands. The 90% confidence interval had a median of 58.36 [CI: 50.20-65.62] (C). Classification performance on gaze avoidance within WNH ASD individuals within the AGRE and SFARI cohorts (D). ASD - Autism Spectrum Disorder; WNH - White Non-Hispanic; AGRE - Autism Genetic Resource Exchange; SFARI - Simons Foundation Autism Research Initiative.
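The repeated 80%/20% resampling described for Figures 9-11 can be sketched as below. The sketch makes several assumptions that are not taken from the disclosure: the classifier is a hypothetical midpoint-of-means threshold on a single score, higher scores are assumed to indicate cases, the interval is read off as empirical 5th and 95th percentiles, and the data are synthetic.

```python
import random
import statistics


def repeated_holdout(scores, labels, n_rounds=200, train_fraction=0.8, seed=0):
    """Repeated random train/validation splits; returns (mean, 5th pct, 95th pct) accuracy."""
    rng = random.Random(seed)
    indices = list(range(len(scores)))
    n_train = int(train_fraction * len(indices))
    accuracies = []
    for _ in range(n_rounds):
        rng.shuffle(indices)
        train, valid = indices[:n_train], indices[n_train:]
        cases = [scores[i] for i in train if labels[i] == 1]
        controls = [scores[i] for i in train if labels[i] == 0]
        if not cases or not controls:
            continue  # skip a split that lacks one of the classes
        # Assumed rule: call a subject a case when the score exceeds the midpoint.
        threshold = (statistics.mean(cases) + statistics.mean(controls)) / 2
        correct = sum((scores[i] > threshold) == (labels[i] == 1) for i in valid)
        accuracies.append(correct / len(valid))
    accuracies.sort()
    lower = accuracies[int(0.05 * (len(accuracies) - 1))]
    upper = accuracies[int(0.95 * (len(accuracies) - 1))]
    return statistics.mean(accuracies), lower, upper


# Synthetic, fully separated scores: 10 controls (0..9) and 10 cases (20..29).
toy_scores = list(range(10)) + list(range(20, 30))
toy_labels = [0] * 10 + [1] * 10
summary = repeated_holdout(toy_scores, toy_labels)
```

Strictly speaking this is repeated random subsampling without replacement rather than a bootstrap with replacement; which of the two was used is not stated in the figure description.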

[0047] Figure 13 shows a principal component (PC) analysis on SNPs in the classifier mapped to data from the Autism Genome Project (AGP) European cohort, as reported by AGP. Related controls refer to parents.

[0048] Figure 14 shows the distribution of related controls and cases in the AGP cohort demonstrating significant overlap/non overlap between parents and affected children. ASD - Autism Spectrum Disorder

[0049] Figure 15 shows the distribution of training classification performance within the AGP European Cohort (A) and validation of classification performance (B). Training Performance in the AGP cohort demonstrated that over all random samples, the trained classifier showed performance exceeding 61% on both cases and controls (C). Validation Classification Performance on AGP cohorts distinguished pseudo controls from probands. The 90% confidence interval of classifier performance had a median percentage correct performance of 58.27% [CI: 55.20-60.98] (D). WNH - White Non-Hispanic.

DETAILED DESCRIPTION

[0050] Throughout this specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or method step or group of elements, integers or method steps, but not the exclusion of any other element, integer or method step or group of elements, integers or method steps.

[0051] The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that that prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavor to which this specification relates.

[0052] As used herein, the singular forms "a", "an" and "the" include plural aspects unless the context clearly dictates otherwise. For example, reference to "a SNP" may include a single SNP, as well as two or more SNPs; reference to "an agent" includes a single agent, as well as two or more agents; reference to "the invention" includes single and multiple aspects of the invention; and so forth. Aspects taught herein are encompassed by the term "invention". All aspects of the invention are enabled within the width of the claims.

[0053] Disclosed herein are genetic markers which are predictive of the presence of or a predisposition to developing ASD, or indicative of the absence of ASD or protection against developing ASD. In an embodiment, the genetic markers are SNPs. Reference to "SNP" includes a single SNP or a panel of SNPs. Machine learning has been applied to the identified SNPs to generate a predictive classifier for ASD diagnosis. In particular, it has been found that an SNP located within the gene encoding calcium-activated potassium channel subunit β4 is a significant genetic diagnostic classifier for ASD (i.e., of the presence or a predisposition to developing ASD or of the absence or protection against developing ASD).

[0054] Thus, in an aspect disclosed herein, there is provided a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject comprising a gene encoding calcium-activated potassium channel subunit β4 and screening for a genetic marker in the gene that is statistically associated with ASD or the absence of ASD, wherein the presence of the genetic marker is indicative that the subject has ASD, a predisposition to developing ASD, the absence of ASD or is protected from developing ASD.

[0055] In an embodiment, the genetic marker is a SNP. Hence, further taught herein is a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject comprising a gene encoding calcium-activated potassium channel subunit β4 and screening for a SNP in the gene that is statistically associated with ASD or the absence of ASD, wherein the presence of the SNP is indicative that the subject has ASD, a predisposition to developing ASD, the absence of ASD or is protected from developing ASD.

[0056] The term "ASD", as used herein, would be clear to persons skilled in the art and includes Autism, Asperger's or Pervasive Developmental Disorder- Not Otherwise Specified (PDD-NOS). In an embodiment, the term "ASD" does not include RETT syndrome and/or Fragile X.

[0057] The term "indicative", as used herein, denotes an association or affiliation of a subject closely to a group or population of subjects who present, or are likely to present, with the same or similar clinical manifestations of ASD or a response to the treatment of ASD. The clinical manifestations of ASD are encompassed by symptoms of ASD.

[0058] Whilst the instant disclosure contemplates any genetic marker, such as methylation profile, nucleotide substitution, addition and/or deletion, SNPs are particularly useful up to the present time.

[0059] SNPs designate variations of individual base pairs in a DNA strand compared to the corresponding wild-type sequence within a population. The term "wild-type", as used herein, refers to an allele of a gene which, when present in two copies in a subject, results in a wild-type phenotype and typically refers to an allele in which an SNP of interest is absent. Studies suggest that SNPs represent about 90% of all genetic variants in the human genome, although they may have a disproportionate frequency of occurrence in certain regions of the genome. SNPs can occur as a conservative nucleic acid substitution; that is, in which a base (e.g., cytosine) is replaced by another base (e.g., thymine), or as nucleic acid deletions or insertions. The majority of known SNPs are found in non-coding regions of the genome. These variants can affect regulatory sequences, such as promoters, enhancers or splicing sites, which in turn can affect the expression of genes. SNPs that are found within a coding region can be silent. For example, a nucleic acid substitution may not alter the translation of the corresponding triplet code into the analogous amino acid and, hence, will have no influence on the translated peptide sequence. However, because of the different frequency of equivalent tRNAs for specific base triplets, differences in the efficiency of translation can arise and, as a consequence, the expression of certain genes can be influenced post-transcriptionally by silent SNPs.

[0060] In the genome, biallelic SNPs can occur in three possible genotypes; namely, in one of two homozygotic forms (allele 1/allele 1 or allele 2/allele 2) or in one heterozygotic form (allele 1/allele 2). Since genomic DNA is double-stranded, each SNP can be identified with reference to each of the two strands. Thus, the SNPs may contain one substitution of one nucleotide by another at the polymorphic site of an SNP, or they may have a deletion of a nucleotide from one, or an insertion of a nucleotide into, one of two corresponding sequences. The SNP may be present in a "silent" or non-coding region of the gene, such as in the promoter region or in the 3' untranslated region. The SNP may also be present in the coding region of a particular gene and it may therefore be detectable at the mRNA level.
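The three possible genotypes of a biallelic SNP can be made concrete with a short sketch. The function name `genotype_class` and the example alleles (C as allele 1, T as allele 2) are hypothetical and serve only to illustrate the homozygous and heterozygous forms named above.

```python
def genotype_class(allele_a, allele_b, allele_1, allele_2):
    """Classify a pair of observed alleles at a biallelic SNP.

    allele_a, allele_b: the two alleles observed in the subject
    allele_1, allele_2: the two alleles known at the polymorphic site
    """
    if not {allele_a, allele_b} <= {allele_1, allele_2}:
        raise ValueError("observed allele is not one of the two known alleles")
    if allele_a != allele_b:
        return "heterozygous (allele 1/allele 2)"
    if allele_a == allele_1:
        return "homozygous (allele 1/allele 1)"
    return "homozygous (allele 2/allele 2)"


# Example with made-up alleles: C is allele 1, T is allele 2.
examples = [
    genotype_class("C", "C", "C", "T"),
    genotype_class("T", "C", "C", "T"),
    genotype_class("T", "T", "C", "T"),
]
```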

[0061] SNPs can be classified by a unique reference SNP ID number ("rs#"), allocated by the SNP database (dbSNP), which is an archive for genetic variations within and across different species developed and hosted by the National Center for Biotechnology Information (NCBI) in collaboration with the National Human Genome Research Institute (NHGRI).

[0062] It is taught herein that genetic markers in the Wnt signaling pathway contribute to the diagnosis of ASD in a CEU cohort, but not in a Han Chinese population. In an aspect, the genetic marker is a SNP. Thus, in an embodiment disclosed herein, the ethnic group is not Han Chinese and the method disclosed herein comprises identifying an SNP in a gene encoding a peptide involved in Wnt signaling.

[0063] In an embodiment, the SNP is indicative of the presence of ASD or a predisposition to developing ASD in the subject. In a certain embodiment, the SNP is rs968122.

[0064] The statistical power of predicting whether a subject has ASD or has a predisposition to developing ASD increases where additional SNPs (that are indicative of the presence of ASD or predictive of the development of ASD) are detected. Thus, in an embodiment disclosed herein, the method further comprises identifying an SNP statistically associated with ASD, or a predisposition to developing ASD, in a gene encoding a protein selected from the list consisting of:

(a) guanine nucleotide-binding protein G(o) subunit alpha;

(b) metabotropic glutamate receptor 5;

(c) phosphatidylinositol-3,4,5-trisphosphate 5-phosphatase 1;

(d) adenylate cyclase type 8;

(e) voltage dependent calcium channel α2/δ subunit 3 ;

(f) platelet-derived growth factor D;

(g) adenylate cyclase type 3;

(h) phosphatidylinositol-4-phosphate 3-kinase C2 domain-containing δ polypeptide;

(i) voltage-dependent calcium channel α1A subunit;

(j) Calmodulin 1;

(k) receptor tyrosine protein kinase erb B4;

(l) protein tyrosine phosphatase receptor-type R; and

(m) L-type voltage-dependent calcium channel α1C subunit.

[0065] In an embodiment disclosed herein, the SNP in the gene encoding guanine nucleotide-binding protein G(o) subunit alpha is rs876619.

[0066] In an embodiment disclosed herein, the SNP in the gene encoding metabotropic glutamate receptor 5 is rs11020772.

[0067] In an embodiment disclosed herein, the SNP in the gene encoding platelet-derived growth factor D is rs1818106.

[0068] In an embodiment disclosed herein, the gene encoding phosphatidylinositol-3,4,5-trisphosphate 5-phosphatase 1 is INPP5D and the SNP is selected from the list consisting of rs9288685 and rs10193128.

[0069] In an embodiment disclosed herein, the gene encoding adenylate cyclase type 8 is ADCY8 and the SNP is rs7842798.

[0070] In an embodiment disclosed herein, the gene encoding voltage dependent calcium channel α2/δ subunit 3 is CACNA2D3 and the SNP is rs3773540.

[0071] In an embodiment disclosed herein, the gene encoding adenylate cyclase type 3 is ADCY3 and the SNP is rs2384061.

[0072] In an embodiment disclosed herein, the gene encoding phosphatidylinositol-4-phosphate 3-kinase C2 domain-containing δ polypeptide is PIK3C2G and the SNP is rs12582971.

[0073] In an embodiment disclosed herein, the gene encoding voltage-dependent calcium channel α1A subunit is CACNA1A and the SNP is rs10409541.

[0074] In an embodiment disclosed herein, the gene encoding Calmodulin 1 is CALM1 and the SNP is rs2300497.

[0075] In an embodiment disclosed herein, the gene encoding receptor tyrosine protein kinase erb B4 is ERBB4 and the SNP is rs7562445.

[0076] In an embodiment disclosed herein, the gene encoding protein tyrosine phosphatase receptor-type R is PTPRR and the SNP is rs7313997.

[0077] In an embodiment disclosed herein, the gene encoding L-type voltage-dependent calcium channel α1C subunit is CACNA1C and the SNP is rs2239118.

[0078] In a further embodiment, the methods disclosed herein further comprise identifying a profile of two or more SNPs as listed in Table 1.

Table 1 - List of 237 SNPs and their weightings

SNP | WEIGHT LOWER (0.95) | WEIGHT | WEIGHT HIGHER (0.95) | delta | GENE NUMBER | GENE SYMBOL

rs968122 1.5465 1.5555 1.5645 0.0090 27345 KCNMB4

rs876619 0.9476 1.2092 1.4708 0.2616 2775 GNAO1

rs11020772 0.8553 0.8641 0.8729 0.0088 2915 GRM5

rs9288685 0.5856 0.5998 0.6140 0.0142 3635 INPP5D

0.5836 0.5946 0.6056 0.0110 3635 INPP5D

0.5298 0.5386 0.5474 0.0088 114 ADCY8

0.5125 0.5208 0.5291 0.0083 55799 CACNA2D3

0.5002 0.5161 0.5320 0.0159 80310 PDGFD

0.4195 0.4306 0.4417 0.0111 109 ADCY3

0.3983 0.4295 0.4607 0.0312 5288 PIK3C2G

0.4067 0.4189 0.4311 0.0122 773 CACNA1A

0.3782 0.3889 0.3996 0.0107 801 CALM1

0.3741 0.3843 0.3945 0.0102 2066 ERBB4

0.3382 0.3567 0.3752 0.0185 5801 PTPRR

0.3348 0.3552 0.3756 0.0204 775 CACNA1C

0.3214 0.3445 0.3676 0.0231 11060 WWP2

0.3262 0.3392 0.3522 0.0130 5071 PARK2

0.3140 0.3248 0.3356 0.0108 5332 PLCB4

0.3143 0.3234 0.3325 0.0091 5592 PRKG1

0.2958 0.3078 0.3198 0.0120 10725 NFAT5

0.2923 0.3048 0.3173 0.0125 51776 ZAK

0.2614 0.2727 0.2840 0.0113 661 POLR3D

0.2445 0.2659 0.2873 0.0214 3778 KCNMA1

0.2517 0.2616 0.2715 0.0099 3709 ITPR2

0.2460 0.2562 0.2664 0.0102 815 CAMK2A

0.2455 0.2561 0.2667 0.0106 115 ADCY9

0.2349 0.2548 0.2747 0.0199 80310 PDGFD

0.2357 0.2504 0.2651 0.0147 773 CACNA1A

0.2370 0.2464 0.2558 0.0094 5592 PRKG1

0.2327 0.2440 0.2553 0.0113 53343 NUDT9

0.2353 0.2439 0.2525 0.0086 2272 FHIT

0.2263 0.2434 0.2605 0.0171 5071 PARK2

0.2243 0.2390 0.2537 0.0147 83439 TCF7L1

0.2223 0.2381 0.2539 0.0158 390892 OR7A10

0.2260 0.2344 0.2428 0.0084 6934 TCF7L2

0.2116 0.2318 0.2520 0.0202 3098 HK1

0.2207 0.2294 0.2381 0.0087 5581 PRKCE

0.2193 0.2283 0.2373 0.0090 4293 MAP3K9

0.2112 0.2217 0.2322 0.0105 51465 UBE2J1

0.2095 0.2177 0.2259 0.0082 5592 PRKG1

0.2068 0.2169 0.2270 0.0101 23265 EXOC7

0.1935 0.2101 0.2267 0.0166 51807 TUBA8

0.1950 0.2077 0.2204 0.0127 7220 TRPC1

0.1930 0.2016 0.2102 0.0086 148 ADRA1A

0.1849 0.2003 0.2157 0.0154 5527 PPP2R5C

0.1848 0.1979 0.2110 0.0131 10125 RASGRP1

0.1830 0.1912 0.1994 0.0082 6262 RYR2

0.1801 0.1906 0.2011 0.0105 51422 PRKAG2

0.1739 0.1856 0.1973 0.0117 815 CAMK2A

0.1760 0.1844 0.1928 0.0084 956 ENTPD3

0.1567 0.1825 0.2083 0.0258 1716 DGUOK

0.1725 0.1804 0.1883 0.0079 5908 RAP1B

-0.0869 0.1750 0.4369 0.2619 2775 GNAO1

0.1646 0.1735 0.1824 0.0089 3315 HSPB1

0.1613 0.1717 0.1821 0.0104 3709 ITPR2

0.1606 0.1696 0.1786 0.0090 783 CACNB2

0.1458 0.1650 0.1842 0.0192 1261 CNGA3

0.1552 0.1646 0.1740 0.0094 1608 DGKG

0.1532 0.1627 0.1722 0.0095 5136 PDE1A

0.0751 0.1619 0.2487 0.0868 5590 PRKCZ

0.1421 0.1604 0.1787 0.0183 51422 PRKAG2

0.1475 0.1597 0.1719 0.0122 3710 ITPR3

0.1496 0.1588 0.1680 0.0092 55120 FANCL

0.1069 0.1581 0.2093 0.0512 6502 SKP2

0.1452 0.1573 0.1694 0.0121 5634 PRPS2

0.1403 0.1573 0.1743 0.0170 5592 PRKG1

0.1399 0.1496 0.1593 0.0097 390054 OR52A5

0.1416 0.1491 0.1566 0.0075 8450 CUL4B

0.1333 0.1426 0.1519 0.0093 2774 GNAL

0.1151 0.1413 0.1675 0.0262 5608 MAP2K6

-0.0267 0.1403 0.3073 0.1670 5801 PTPRR

0.1242 0.1383 0.1524 0.0141 2774 GNAL

0.1219 0.1311 0.1403 0.0092 5144 PDE4D

0.1135 0.1303 0.1471 0.0168 11060 WWP2

0.1143 0.1292 0.1441 0.0149 2890 GRIA1

0.1143 0.1290 0.1437 0.0147 3815 KIT

0.1169 0.1284 0.1399 0.0115 5336 PLCG2

0.1153 0.1283 0.1413 0.0130 5579 PRKCB

0.0028 0.1283 0.2538 0.1255 5593 PRKG2

0.1063 0.1248 0.1433 0.0185 4041 LRP5

0.1135 0.1227 0.1319 0.0092 2932 GSK3B

0.1094 0.1206 0.1318 0.0112 1488 CTBP2

0.1056 0.1156 0.1256 0.0100 4734 NEDD4

0.1032 0.1129 0.1226 0.0097 5321 PLA2G4A

0.0997 0.1119 0.1241 0.0122 55799 CACNA2D3

0.0903 0.1104 0.1305 0.0201 57521 RPTOR

0.0978 0.1075 0.1172 0.0097 3480 IGF1R

0.0968 0.1070 0.1172 0.0102 338751 OR52L1

0.0974 0.1041 0.1108 0.0067 2892 GRIA3

0.0921 0.1037 0.1153 0.0116 5592 PRKG1

0.0723 0.0998 0.1273 0.0275 5494 PPM1A

0.0913 0.0990 0.1067 0.0077 5562 PRKAA1

0.0736 0.0895 0.1054 0.0159 5579 PRKCB

0.0747 0.0891 0.1035 0.0144 808 CALM3

0.0725 0.0887 0.1049 0.0162 5071 PARK2

0.0710 0.0814 0.0918 0.0104 4790 NFKB1

0.0709 0.0798 0.0887 0.0089 83439 TCF7L1

0.0580 0.0789 0.0998 0.0209 57521 RPTOR

0.0661 0.0780 0.0899 0.0119 5608 MAP2K6

0.0583 0.0747 0.0911 0.0164 55799 CACNA2D3

0.0457 0.0739 0.1021 0.0282 9630 GNA14

0.0556 0.0728 0.0900 0.0172 114 ADCY8

0.0476 0.0692 0.0908 0.0216 2890 GRIA1

0.0544 0.0654 0.0764 0.0110 341276 OR10A2

0.0233 0.0571 0.0909 0.0338 3708 ITPR1

0.0445 0.0545 0.0645 0.0100 7048 TGFBR2

0.0435 0.0533 0.0631 0.0098 1387 CREBBP

0.0451 0.0532 0.0613 0.0081 1488 CTBP2

0.0200 0.0433 0.0666 0.0233 5579 PRKCB

0.0327 0.0404 0.0481 0.0077 83439 TCF7L1

0.0137 0.0368 0.0599 0.0231 5336 PLCG2

0.0121 0.0243 0.0365 0.0122 6262 RYR2

-0.0119 0.0083 0.0285 0.0202 11060 WWP2

-0.0015 0.0062 0.0139 0.0077 1633 DCK

-0.0165 -0.0068 0.0029 0.0097 10846 PDE10A

-0.0709 -0.0220 0.0269 0.0489 9630 GNA14

-0.0381 -0.0238 -0.0095 0.0143 775 CACNA1C

-0.0373 -0.0284 -0.0195 0.0089 219981 OR5A2

-0.0415 -0.0301 -0.0187 0.0114 55799 CACNA2D3

-0.0404 -0.0328 -0.0252 0.0076 5142 PDE4B

-0.0577 -0.0358 -0.0139 0.0219 11184 MAP4K1

-0.0485 -0.0372 -0.0259 0.0113 8833 GMPS

-0.0493 -0.0393 -0.0293 0.0100 55799 CACNA2D3

-0.0519 -0.0436 -0.0353 0.0083 51366 UBR5

-0.0796 -0.0449 -0.0102 0.0347 6262 RYR2

-0.0569 -0.0457 -0.0345 0.0112 5136 PDE1A

-0.0819 -0.0549 -0.0279 0.0270 11060 WWP2

-0.0768 -0.0550 -0.0332 0.0218 3708 ITPR1

-0.0813 -0.0601 -0.0389 0.0212 10381 TUBB3

-0.0724 -0.0642 -0.0560 0.0082 5144 PDE4D

-0.0921 -0.0774 -0.0627 0.0147 5071 PARK2

-0.1001 -0.0803 -0.0605 0.0198 775 CACNA1C

-0.0896 -0.0812 -0.0728 0.0084 57521 RPTOR

-0.0935 -0.0817 -0.0699 0.0118 341276 OR10A2

-0.1090 -0.0915 -0.0740 0.0175 326 AIRE

-0.1079 -0.0968 -0.0857 0.0111 3708 ITPR1

-0.1296 -0.0983 -0.0670 0.0313 5581 PRKCE

-0.1163 -0.1037 -0.0911 0.0126 81285 OR51E2

-0.1283 -0.1086 -0.0889 0.0197 55821 ALLC

-0.1209 -0.1095 -0.0981 0.0114 89780 WNT3A

-0.1293 -0.1192 -0.1091 0.0101 10580 SORBS1

-0.1352 -0.1247 -0.1142 0.0105 2005 ELK4

-0.1402 -0.1250 -0.1098 0.0152 7325 UBE2E2

-0.1435 -0.1256 -0.1077 0.0179 5071 PARK2

-0.1345 -0.1263 -0.1181 0.0082 55799 CACNA2D3

-0.1372 -0.1273 -0.1174 0.0099 23295 MGRN1

-0.1380 -0.1274 -0.1168 0.0106 5592 PRKG1

-0.1382 -0.1299 -0.1216 0.0083 5592 PRKG1

-0.1417 -0.1300 -0.1183 0.0117 5502 PPP1R1A

-0.1448 -0.1310 -0.1172 0.0138 399694 SHC4

-0.1407 -0.1325 -0.1243 0.0082 26289 AK5

-0.1526 -0.1332 -0.1138 0.0194 9965 FGF19

-0.1869 -0.1371 -0.0873 0.0498 326 AIRE

-0.1463 -0.1373 -0.1283 0.0090 5330 PLCB2

-0.1619 -0.1394 -0.1169 0.0225 5592 PRKG1

-0.1687 -0.1468 -0.1249 0.0219 55811 ADCY10

-0.1563 -0.1469 -0.1375 0.0094 5071 PARK2

-0.1555 -0.1471 -0.1387 0.0084 5332 PLCB4

-0.1985 -0.1486 -0.0987 0.0499 3098 HK1

-0.1640 -0.1503 -0.1366 0.0137 5156 PDGFRA

-0.1751 -0.1583 -0.1415 0.0168 5027 P2RX7

-0.1883 -0.1653 -0.1423 0.0230 2775 GNAO1

-0.1798 -0.1657 -0.1516 0.0141 6263 RYR3

-0.1862 -0.1678 -0.1494 0.0184 5071 PARK2

-0.1804 -0.1692 -0.1580 0.0112 10242 KCNMB2

-0.1809 -0.1706 -0.1603 0.0103 773 CACNA1A

-0.1790 -0.1709 -0.1628 0.0081 10369 CACNG2

-0.1847 -0.1742 -0.1637 0.0105 2272 FHIT

-0.1841 -0.1761 -0.1681 0.0080 2257 FGF12

-0.2053 -0.1773 -0.1493 0.0280 2272 FHIT

-0.1939 -0.1777 -0.1615 0.0162 3708 ITPR1

-0.1900 -0.1804 -0.1708 0.0096 5592 PRKG1

-0.1907 -0.1807 -0.1707 0.0100 489 ATP2A3

-0.1943 -0.1834 -0.1725 0.0109 7048 TGFBR2

-0.1923 -0.1835 -0.1747 0.0088 7249 TSC2

-0.2242 -0.1886 -0.1530 0.0356 774 CACNA1B

-0.1990 -0.1902 -0.1814 0.0088 3704 ITPA

-0.2110 -0.1978 -0.1846 0.0132 390037 OR52I1

-0.2144 -0.2008 -0.1872 0.0136 2272 FHIT

-0.2130 -0.2009 -0.1888 0.0121 1608 DGKG

-0.2174 -0.2033 -0.1892 0.0141 783 CACNB2

-0.2223 -0.2100 -0.1977 0.0123 2895 GRID2

rs4651343 -0.2186 -0.2102 -0.2018 0.0084 5321 PLA2G4A

rs6442886 -0.2372 -0.2103 -0.1834 0.0269 3708 ITPR1

rs2471226 -0.2226 -0.2121 -0.2016 0.0105 107 ADCY1

rs339408 -0.2272 -0.2168 -0.2064 0.0104 9322 TRIP10

rs4553343 -0.2357 -0.2181 -0.2005 0.0176 2977 GUCY1A2

rs17193 -0.2283 -0.2189 -0.2095 0.0094 10846 PDE10A

rs2246219 -0.2492 -0.2288 -0.2084 0.0204 9626 GUCA1C

rs38557 -0.2623 -0.2330 -0.2037 0.0293 781 CACNA2D1

rs1881638 -0.2467 -0.2342 -0.2217 0.0125 51422 PRKAG2

rs3776825 -0.2444 -0.2348 -0.2252 0.0096 815 CAMK2A

rs1480645 -0.2632 -0.2434 -0.2236 0.0198 27115 PDE7B

rs1453541 -0.2665 -0.2442 -0.2219 0.0223 219983 OR4D6

rs10505029 -0.2542 -0.2450 -0.2358 0.0092 51366 UBR5

rs4947963 -0.2612 -0.2509 -0.2406 0.0103 1956 EGFR

rs888817 -0.2644 -0.2565 -0.2486 0.0079 5924 RASGRF2

rs917948 -0.2711 -0.2577 -0.2443 0.0134 5536 PPP5C

rs312481 -0.2709 -0.2614 -0.2519 0.0095 776 CACNA1D

rs10770675 -0.2761 -0.2623 -0.2485 0.0138 5139 PDE3A

rs9355980 -0.2800 -0.2634 -0.2468 0.0166 5071 PARK2

rs6774037 -0.2977 -0.2659 -0.2341 0.0318 3708 ITPR1

rs10783235 -0.2843 -0.2714 -0.2585 0.0129 121275 OR10AD1

rs2033655 -0.2833 -0.2747 -0.2661 0.0086 109 ADCY3

rs7918241 -0.2952 -0.2755 -0.2558 0.0197 1488 CTBP2

rs2161630 -0.2876 -0.2770 -0.2664 0.0106 10725 NFAT5

rs12207523 -0.2923 -0.2846 -0.2769 0.0077 5169 ENPP3

rs10768450 -0.2965 -0.2884 -0.2803 0.0081 119682 OR51L1

rs919741 -0.3291 -0.3083 -0.2875 0.0208 815 CAMK2A

rs697851 -0.3325 -0.3239 -0.3153 0.0086 3707 ITPKB

rs2271986 -0.3384 -0.3249 -0.3114 0.0135 4842 NOS1

rs12716928 -0.3452 -0.3370 -0.3288 0.0082 5336 PLCG2

rs6508808 -0.3703 -0.3554 -0.3405 0.0149 6261 RYR1

rs7297848 -0.3747 -0.3624 -0.3501 0.0123 26259 FBXW8

rs2320172 -0.3751 -0.3627 -0.3503 0.0124 4286 MITF

rs2272197 -0.3812 -0.3715 -0.3618 0.0097 4216 MAP3K4

rs6971999 -0.3859 -0.3765 -0.3671 0.0094 26211 OR2F1

rs1050395 -0.4095 -0.3913 -0.3731 0.0182 490 ATP2B1

rs1872902 -0.4263 -0.4124 -0.3985 0.0139 80310 PDGFD

rs3935743 -0.4425 -0.4278 -0.4131 0.0147 5336 PLCG2

rs1369450 -0.5485 -0.4997 -0.4509 0.0488 114 ADCY8

rs11102321 -0.5120 -0.5030 -0.4940 0.0090 5906 RAP1A

rs17629494 -0.5242 -0.5070 -0.4898 0.0172 5592 PRKG1

rs4648135 -0.5807 -0.5260 -0.4713 0.0547 4790 NFKB1

rs17643974 -0.5527 -0.5424 -0.5321 0.0103 1488 CTBP2

rs1243679 -0.5771 -0.5674 -0.5577 0.0097 341799 OR6S1

rs2240228 -0.5942 -0.5816 -0.5690 0.0126 26532 OR10H3

rs260808 -0.5938 -0.5836 -0.5734 0.0102 80310 PDGFD

rs4128941 -0.6166 -0.6082 -0.5998 0.0084 8313 AXIN2

rs769052 -0.6321 -0.6235 -0.6149 0.0086 7322 UBE2D2

rs984371 -0.7273 -0.7181 -0.7089 0.0092 219437 OR5L1

rs4308342 -1.0196 -0.8938 -0.7680 0.1258 1633 DCK

rs11145506 -0.9400 -0.9172 -0.8944 0.0228 9630 GNA14

rs905646 -0.9700 -0.9624 -0.9548 0.0076 2915 GRM5

rs6483362 -0.9894 -0.9661 -0.9428 0.0233 2915 GRM5

rs12317962 -1.4869 -1.3200 -1.1531 0.1669 27345 KCNMB4

rs8053370 -1.7162 -1.6956 -1.6750 0.0206 2775 GNAO1

[0079] With respect to Table 1, [symbol] denotes an SNP that is indicative that the subject has ASD or a predisposition to developing ASD; and [symbol] denotes an SNP that is indicative of the absence of ASD or protection from developing ASD.

[0080] In an embodiment disclosed herein, the SNP in a gene encoding calcium-activated potassium channel subunit β4 is indicative of the absence of ASD or protection from developing ASD in the subject. In an embodiment disclosed herein, the SNP is rs12317962.

[0081] It would be understood by persons skilled in the art that the statistical power of predicting whether ASD is absent in a subject or a subject has a reduced likelihood of developing ASD (i.e., protection against developing ASD) is likely to increase where additional SNPs (that are indicative of the absence of ASD, or predictive of protection against developing ASD) are detected. Thus, in an embodiment disclosed herein, the method further comprises identifying an SNP statistically associated with the absence of ASD, or protection against developing ASD, in a gene encoding a protein selected from the list consisting of:

(a) cGMP-dependent protein kinase 1 alpha isozyme;

(b) nuclear factor NF-kappa-B p105 subunit;

(c) C-terminal binding protein 2;

(d) olfactory receptor 6S1;

(e) olfactory receptor 10H3;

(f) platelet-derived growth factor D;

(g) Axin-2;

(h) ubiquitin-conjugated enzyme E2D2;

(i) olfactory receptor 5L1;

(j) deoxycytidine kinase;

(k) guanine nucleotide binding protein α14 subunit;

(l) metabotropic glutamate receptor 5; and

(m) guanine nucleotide-binding protein G(o) subunit alpha.

[0082] In an embodiment disclosed herein, the gene encoding cGMP-dependent protein kinase 1 alpha isozyme is PRKG1 and the SNP is rs17629494.

[0083] In an embodiment disclosed herein, the gene encoding nuclear factor NF-kappa-B p105 subunit is NFKB1 and the SNP is rs4648135.

[0084] In an embodiment disclosed herein, the gene encoding C-terminal binding protein 2 is CTBP2 and the SNP is rs17643974.

[0085] In an embodiment disclosed herein, the gene encoding olfactory receptor 6S1 is OR6S1 and the SNP is rs1243679.

[0086] In an embodiment disclosed herein, the gene encoding olfactory receptor 10H3 is OR10H3 and the SNP is rs2240228.

[0087] In an embodiment disclosed herein, the SNP in a gene encoding platelet-derived growth factor D is rs260808.

[0088] In an embodiment disclosed herein, the gene encoding Axin-2 is AXIN2 and the SNP is rs4128941.

[0089] In an embodiment disclosed herein, the gene encoding ubiquitin-conjugated enzyme E2D2 is UBE2D2 and the SNP is rs769052.

[0090] In an embodiment disclosed herein, the gene encoding olfactory receptor 5L1 is OR5L1 and the SNP is rs984371.

[0091] In an embodiment disclosed herein, the gene encoding deoxycytidine kinase is DCK and the SNP is rs4308342.

[0092] In an embodiment disclosed herein, the gene encoding guanine nucleotide binding protein, α14 subunit is GNA14 and the SNP is rs11145506.

[0093] In an embodiment disclosed herein, the SNP in a gene encoding metabotropic glutamate receptor 5 is selected from the list consisting of rs905646 and rs6483362.

[0094] In an embodiment disclosed herein, the SNP in a gene encoding guanine nucleotide-binding protein G(o) subunit alpha is rs8053370.

[0095] In another aspect disclosed herein, there is provided a method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

[0096] In another aspect disclosed herein, there is provided a method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

[0097] In another aspect disclosed herein, there is provided a method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

[0098] In another aspect disclosed herein, there is provided a method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and screening the genetic sample for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

[0099] In another aspect disclosed herein, there is provided a method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and amplifying genomic DNA or corresponding RNA using primers which are selective for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

[0100] In another aspect disclosed herein, there is provided a method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and amplifying genomic DNA or corresponding RNA using primers which are selective for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 1 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

[0101] In another aspect disclosed herein, there is provided a method for determining whether a subject has a genetic profile associated with Autism Spectrum Disorder (ASD) or with a risk of developing ASD, the method comprising collecting a genetic sample from the subject and amplifying genomic DNA or corresponding RNA using primers which are selective for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and ASD, or a risk of developing ASD.

[0102] In another aspect disclosed herein, there is provided a method for determining whether a subject has a genetic profile associated with an absence of Autism Spectrum Disorder (ASD) or with protection from developing ASD, the method comprising collecting a genetic sample from the subject and amplifying genomic DNA or corresponding RNA using primers which are selective for the presence or absence of a selected number of SNPs, wherein the SNPs are selected from the list in Table 5 and wherein the selected number of SNPs provides 50% or greater correlation between the genetic profile and the absence of ASD, or protection from developing ASD.

[0103] In another aspect disclosed herein, there is provided a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 1, and wherein the suitable number of SNPs is statistically associated with the presence of ASD or a predisposition to developing ASD. In an embodiment disclosed herein, the method comprises screening for all SNPs listed in Table 1 that are statistically associated with the presence of ASD or a predisposition to developing ASD.

[0104] In another aspect disclosed herein, there is provided a method for determining the absence of ASD or protection from developing ASD in a subject, the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 1, and wherein the suitable number of SNPs is statistically associated with the absence of ASD or protection from developing ASD. In an embodiment disclosed herein, the method comprises screening for all SNPs listed in Table 1 that are statistically associated with the absence of ASD or protection from developing ASD.

[0105] In another aspect disclosed herein, there is provided a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject and screening for an SNP in the genetic sample that is statistically associated with the presence of ASD or a predisposition to developing ASD, wherein the SNP is selected from the list in Table 5 that is statistically associated with the presence of ASD or a predisposition to developing ASD.

[0106] In an embodiment disclosed herein, the SNP is selected from the group consisting of: rs17618615, rs6650972, rs10823195, rs1942052, rs9798267, rs4696443, rs4648135, rs243196, rs7145618, rs1013459, rs10952662, rs7580690, rs7756516, rs3935743, rs7903424, rs8054767, rs11001056, rs2684777, rs2300497, rs16931011, rs4324526, rs11736177, rs2239118, rs3020827, rs2270838, rs17643974, rs2036109, rs12582971, rs1873423, rs3734464, rs7067880, rs1029088, rs10468681, rs7731023, rs884080, rs629720, rs2920022, rs4465567, rs2394538, rs10794197, rs7100765, rs3121309, rs4254056, rs976266, rs13359392, rs2716191, rs6483362 and rs10407144.

[0107] In another aspect disclosed herein, there is provided a method for determining the absence of ASD or protection from developing ASD in a subject, the method comprising collecting a genetic sample from the subject and screening for an SNP in the genetic sample that is statistically associated with the absence of ASD or protection from developing ASD, wherein the SNP is selected from the list in Table 5 that is statistically associated with the absence of ASD or protection from developing ASD.

[0108] In an embodiment disclosed herein, the SNP is selected from the group consisting of: rs11583646, rs2587891, rs1480645, rs12462609, rs4651343, rs888817, rs1050395, rs6971999, rs1928168, rs2272197, rs11644436, rs8063461, rs11602535, rs339408, rs10762342, rs7536307, rs7512378, rs4947963, rs3904668, rs4128941, rs7870040, rs6679454, rs12716928, rs10783235, rs3910363, rs4647992 and rs16853387.

[0109] In another aspect disclosed herein, there is provided a method for determining whether a subject has or has a predisposition to develop Autism Spectrum Disorder (ASD), the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 5, wherein the suitable number of SNPs is statistically associated with the presence of ASD or a predisposition to developing ASD. In an embodiment disclosed herein, the method comprises screening for the following SNPs: rs17618615, rs6650972, rs10823195, rs1942052, rs9798267, rs4696443, rs4648135, rs243196, rs7145618, rs1013459, rs10952662, rs7580690, rs7756516, rs3935743, rs7903424, rs8054767, rs11001056, rs2684777, rs2300497, rs16931011, rs4324526, rs11736177, rs2239118, rs3020827, rs2270838, rs17643974, rs2036109, rs12582971, rs1873423, rs3734464, rs7067880, rs1029088, rs10468681, rs7731023, rs884080, rs629720, rs2920022, rs4465567, rs2394538, rs10794197, rs7100765, rs3121309, rs4254056, rs976266, rs13359392, rs2716191, rs6483362 and rs10407144.

[0110] In an embodiment disclosed herein, the method comprises screening for all SNPs identified in Table 5 that are statistically associated with the presence of ASD or a predisposition to developing ASD.

[0111] In another aspect disclosed herein, there is provided a method for determining the absence of ASD or protection from developing ASD in a subject, the method comprising collecting a genetic sample from the subject and screening the genetic sample for a suitable number of SNPs selected from the list in Table 5 that are statistically associated with the absence of ASD or protection from developing ASD.

[0112] In an embodiment disclosed herein, the SNPs are selected from the group consisting of: rs11583646, rs2587891, rs1480645, rs12462609, rs4651343, rs888817, rs1050395, rs6971999, rs1928168, rs2272197, rs11644436, rs8063461, rs11602535, rs339408, rs10762342, rs7536307, rs7512378, rs4947963, rs3904668, rs4128941, rs7870040, rs6679454, rs12716928, rs10783235, rs3910363, rs4647992 and rs16853387.

[0113] In an embodiment disclosed herein, the method comprises screening for all SNPs listed in Table 5 that are statistically associated with the absence of ASD or protection from developing ASD.

[0114] Reference to "50% or greater" includes 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 and 100%.

[0115] In an embodiment disclosed herein, the correlation is 60% or greater. In an embodiment, the correlation is 70% or greater. In an embodiment disclosed herein, the correlation is 80% or greater. In an embodiment disclosed herein, the correlation is 85% or greater. In an embodiment disclosed herein, the correlation is 90% or greater. In an embodiment disclosed herein, the correlation is 95% or greater.

[0116] In an embodiment, the SNPs have a weighting (weight) not equal to zero (0), as shown, for example, in Tables 1 and 5. It should be noted that the weight assigned to each SNP is indicative only and may change to reflect, for example, changes in the reference allele (see Table 5 herein where exemplary reference alleles are shown). A change in the reference allele of an SNP may also change the threshold value used to predict the presence or absence of ASD in accordance with the present invention. The threshold may also be changed to alter the sensitivity or specificity of the test depending on its intended use. The present disclosure is instructional for calculating the ASD score following changes in reference alleles.
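
By way of illustration only, the interplay between weights, reference alleles and the threshold can be sketched as follows. The sketch assumes a simple linear score in which each genotype is coded as the count (0, 1 or 2) of the non-reference allele; that coding, the function names and any threshold passed in are assumptions made for the example and are not taken from this disclosure. The three weights are the middle of the three values listed for those SNPs in Table 1.

```python
from typing import Dict, Tuple

# Middle weight values listed in Table 1 for three SNPs (illustrative subset).
WEIGHTS: Dict[str, float] = {
    "rs12317962": -1.3200,  # KCNMB4
    "rs8053370": -1.6956,   # GNAO1
    "rs905646": -0.9624,    # GRM5
}


def asd_score(genotypes: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of allele counts (assumed coding: 0, 1 or 2 non-reference alleles)."""
    return sum(weights[snp] * genotypes[snp] for snp in weights)


def flip_reference(
    snp: str,
    genotypes: Dict[str, int],
    weights: Dict[str, float],
    threshold: float,
) -> Tuple[Dict[str, int], Dict[str, float], float]:
    """Re-express one SNP against the other allele without changing the outcome.

    Under the assumed coding the new count is 2 - g and the weight changes sign,
    so the score shifts by -2 * w; shifting the threshold by the same amount
    keeps (score - threshold) unchanged.
    """
    w = weights[snp]
    new_genotypes = {**genotypes, snp: 2 - genotypes[snp]}
    new_weights = {**weights, snp: -w}
    return new_genotypes, new_weights, threshold - 2 * w


# Hypothetical subject and hypothetical threshold of 0.0.
subject = {"rs12317962": 1, "rs8053370": 0, "rs905646": 2}
score = asd_score(subject, WEIGHTS)
g2, w2, t2 = flip_reference("rs905646", subject, WEIGHTS, 0.0)
score_after_flip = asd_score(g2, w2)
```

Under these assumptions, the margin between score and threshold is identical before and after the reference allele is changed, which is one way a change of reference allele can alter both a weight and the threshold without altering the prediction.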

[0117] The weights assigned to each SNP may also differ when other SNPs that are in high linkage disequilibrium with the initial set of SNPs are chosen to build the classifier. Thus, in an embodiment, the methods disclosed herein can use additional or substitute SNPs that are in high linkage disequilibrium with those listed in Tables 1 and 5. In another embodiment, any one or more of the SNPs listed in Tables 1 and 5 can be substituted with one or more SNPs that are in high linkage disequilibrium with those listed in Tables 1 and 5.
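
One common way of quantifying linkage disequilibrium between two biallelic SNPs is the squared correlation coefficient r², computed from haplotype frequencies. The sketch below is a minimal illustration that assumes phased haplotypes are available as pairs of 0/1 allele indicators; the function name and input format are hypothetical, and this disclosure does not prescribe a particular measure or cut-off for "high" linkage disequilibrium.

```python
from typing import List, Tuple


def ld_r_squared(haplotypes: List[Tuple[int, int]]) -> float:
    """r^2 between two biallelic SNPs from phased haplotypes.

    Each haplotype is (a, b), where a and b are 0/1 allele indicators
    at the first and second SNP respectively.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # allele frequency at SNP 1
    p_b = sum(b for _, b in haplotypes) / n          # allele frequency at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                              # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom
```

A value of r² close to 1 indicates that one SNP is a near-perfect proxy for the other, which is the situation in which a substitution of the kind described above would be expected to preserve most of the information carried by the original SNP.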

[0118] In an embodiment, a subject is first selected after a clinical diagnosis indicating the likelihood or otherwise of the subject having ASD. Alternatively, the genetic analysis is included as part of the clinical assessment of ASD.

[0119] A skilled artisan is familiar with methods of screening for the presence of (i.e., detecting) an SNP within a nucleic acid sample derived from a subject, also referred to herein as a genetic sample. Such methods include direct determination (e.g., sequencing) and indirect determination (e.g. amplification, hybridization and/or detection of proteins encoded by the gene in which the SNP is located, where such proteins vary in sequence, structure or function as compared to the wild-type proteins).

[0120] A genetic sample in which an SNP of interest is to be detected in accordance with the methods disclosed herein may be derived from any suitable sample derived from a subject. For example, the nucleic acid can be isolated from a biological sample (genetic sample) such as whole blood, cells from oral mucosa, somatic cells from nail, hair and the like, germ cells, sputum, amniotic fluid, urine, gastric juice, gastric lavage fluid and the like, mitochondria and tissue biopsies. The biological sample may be freshly derived from the subject, or it may be a sample that has previously been obtained from the subject and kept in storage (short, medium or long term). Such sample may, for example, have been stored frozen or paraffin-embedded. The biological sample may also be subject to fractionation or processing to remove highly abundant components.

[0121] Methods of collecting a genetic sample for use in the diagnostic and prognostic methods disclosed herein are known to persons skilled in the art. Such methods may include the use of raw material (e.g., cells or cell lysates) or they may include steps of isolating the nucleic acid from a sample. Commercially available genomic DNA or RNA isolation kits are available for this purpose. Sample nucleic acid can be obtained from any cell type or tissue of a subject. For example, a subject's bodily fluid (e.g. blood) can be obtained by known techniques (e.g., venepuncture). Alternatively, nucleic acid tests can be performed on dry samples (e.g., hair or skin). Foetal nucleic acid samples can be obtained from maternal blood. The genetic sample may also be an in utero sample (e.g., amniocytes or chorionic villi) that can be used, for example, for prenatal testing.

[0122] The methods disclosed herein can also be performed in situ directly upon tissue sections (fixed and/or frozen) obtained from a subject by biopsy or resection, such that no nucleic acid purification is necessary. Nucleic acid reagents (e.g., primers and probes) can also be used for such in situ procedures (see, for example, Nuovo, G. J. (1992) "PCR In Situ Hybridization: Protocols And Applications", Raven Press, NY).

[0123] The nucleic acid may be DNA or RNA (e.g., mRNA), single-stranded or double-stranded. Where the SNP represents a polymorphism found in a non-coding region, then the nucleic acid will comprise genomic DNA. On the other hand, where the SNP represents a polymorphism found in a coding region, the nucleic acid can comprise RNA, such as total RNA or mRNA.

[0124] The term "isolated", as used herein with respect to nucleic acids, would be understood by persons skilled in the art as meaning DNA or RNA molecules that are substantially free of other non-nucleic acid molecules normally present in their natural source, such as proteinaceous and other cellular material. It would also be understood by persons skilled in the art that the nucleic acid need not be completely free of non-nucleic acid material, as long as the non-nucleic acid material does not adversely affect the particular method that is to be employed for screening for an SNP of interest.

[0125] The nucleic acid that is contained in the sample may be a nucleic acid originally contained in the sample or, alternatively, it may be an amplicon that has been prepared by nucleic acid amplification using the original nucleic acid in a biological sample as a template. Amplifying nucleic acid molecules originally contained in a sample may be required where the number of nucleic acid molecules in the original sample is below the level required for detection using the screening method of choice. It would be understood by persons skilled in the art that different methods of screening for SNPs will have different levels of sensitivity and, as a consequence, the number of copies of a nucleic acid molecule that is required for the identification of an SNP of interest will vary.

[0126] Methods of amplifying nucleic acid found in the original sample will be known to persons skilled in the art. Amplification may be performed directly, whereby an amplicon is generated by amplifying the nucleic acid originally contained in a biological sample as a template or indirectly by amplifying a template cDNA molecule that has been generated from RNA originally contained in the biological sample by reverse transcription. These amplicons may be used as template nucleic acids for screening for SNPs of interest in accordance with the methods disclosed herein. The length of the amplicon(s) can vary and may be, for example, from 50 bases to 1000 bases, or from 80 bases to 200 bases. Examples of suitable nucleic acid amplification methods that can be employed in the amplification steps described herein include PCR, NASBA (nucleic acid sequence based amplification), TMA (transcription-mediated amplification) and SDA (strand displacement amplification). Other examples of suitable nucleic acid amplification methods are described below.

[0127] Allele-specific PCR is a diagnostic or cloning technique used to identify or utilize SNPs. It requires prior knowledge of a DNA sequence, including differences between alleles, and uses primers whose 3' ends encompass the SNP. PCR amplification under stringent conditions is much less efficient in the presence of a mismatch between template and primer, so successful amplification with an SNP-specific primer signals presence of the specific SNP in a sequence.
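
The primer layout described above, in which the 3' end of the primer falls on the SNP, can be sketched as follows. This is a minimal illustration only: the template sequence, the function name and the default primer length are hypothetical, and a practical design would also consider melting temperature, secondary structure and genome-wide specificity.

```python
from typing import Dict, Sequence


def allele_specific_primers(
    template: str, snp_index: int, alleles: Sequence[str], length: int = 20
) -> Dict[str, str]:
    """Return one forward primer per allele, each ending (3' end) on the SNP base.

    template  : 5'->3' sequence of the strand the primer is identical to
    snp_index : 0-based position of the SNP within the template
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for the requested primer length")
    upstream = template[start:snp_index]  # bases shared by all allele-specific primers
    return {allele: upstream + allele for allele in alleles}


# Hypothetical 15-base template with a G/A SNP at index 10.
primers = allele_specific_primers("ACGTACGTACGGTCA", 10, ("G", "A"), length=6)
```

The primers differ only at their final base, so under stringent conditions only the primer matching the allele present in the template is expected to be extended efficiently.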

[0128] Assembly PCR or Polymerase Cycling Assembly (PCA) comprises the artificial synthesis of long DNA sequences by performing PCR on a pool of long oligonucleotides with short overlapping segments. The oligonucleotides alternate between sense and antisense directions, and the overlapping segments determine the order of the PCR fragments thereby selectively producing the final long DNA product.

[0129] Asymmetric PCR is used to preferentially amplify one strand of the original DNA more than the other. It finds use in some types of sequencing and hybridization probing where having only one of the two complementary strands is required. PCR is carried out as usual, but with a great excess of the primers for the chosen strand. Due to the slow amplification later in the reaction after the limiting primer has been used up, extra cycles of PCR are required. A modification on this process, known as Linear-After-The-Exponential-PCR (LATE-PCR), uses a limiting primer with a higher melting temperature (Tm) than the excess primer to maintain reaction efficiency as the limiting primer concentration decreases mid-reaction.

[0130] Helicase-dependent amplification is similar to traditional PCR, but uses a constant temperature rather than cycling through denaturation and annealing/extension cycles. DNA Helicase, an enzyme that unwinds DNA, is used in place of thermal denaturation.

[0131] Hot-start PCR is a technique that reduces non-specific amplification during the initial set up stages of the PCR. The technique may be performed manually by heating the reaction components to the melting temperature (e.g., 95°C) before adding the polymerase. Specialized enzyme systems have been developed that inhibit the polymerase's activity at ambient temperature, either by the binding of an antibody or by the presence of covalently bound inhibitors that only dissociate after a high-temperature activation step. Hot-start/cold-finish PCR is achieved with new hybrid polymerases that are inactive at ambient temperature and are instantly activated at elongation temperature.

[0132] Ligation-mediated PCR uses small DNA linkers ligated to the DNA of interest and multiple primers annealing to the DNA linkers; it has been used for DNA sequencing, genome walking, and DNA footprinting.

[0133] Multiplex PCR uses multiple, unique primer sets within a single PCR mixture to produce amplicons of varying sizes specific to different DNA sequences. By targeting multiple genes at once, additional information may be gained from a single test run that otherwise would require several times the reagents and more time to perform. Annealing temperatures for each of the primer sets must be optimized to work correctly within a single reaction, and amplicon sizes (that is, their base-pair lengths) should be different enough to form distinct bands when visualized by gel electrophoresis. Multiplex Ligation-dependent Probe Amplification (MLPA) permits multiple targets to be amplified with only a single primer pair, thus avoiding the resolution limitations of multiplex PCR.

[0134] Nested PCR increases the specificity of DNA amplification by reducing background due to non-specific amplification of DNA. Two sets of primers are used in two successive PCRs. In the first reaction, one pair of primers is used to generate DNA products, which besides the intended target, may still consist of non-specifically amplified DNA fragments. The product(s) are then used in a second PCR with a set of primers whose binding sites are completely or partially different from and located 3' of each of the primers used in the first reaction. Nested PCR is often more successful in specifically amplifying long DNA fragments than conventional PCR, but it requires more detailed knowledge of the target sequences.

[0135] Reverse Transcription PCR (RT-PCR) is a method used to amplify, isolate or identify a known sequence from a cellular or tissue RNA. The PCR is preceded by a reaction using reverse transcriptase to convert RNA to cDNA. RT-PCR is widely used in expression profiling, to determine the expression of a gene or to identify the sequence of an RNA transcript, including transcription start and termination sites and, if the genomic DNA sequence of a gene is known, to map the location of exons and introns in the gene. The 5' end of a gene (corresponding to the transcription start site) is typically identified by an RT-PCR method, named Rapid Amplification of cDNA Ends (RACE-PCR).

[0136] Thermal asymmetric interlaced PCR (TAIL-PCR) is used to isolate unknown sequence flanking a known sequence. Within the known sequence TAIL-PCR uses a nested pair of primers with differing annealing temperatures; a degenerate primer is used to amplify in the other direction from the unknown sequence.

[0137] Touchdown PCR is a variant of PCR that can reduce nonspecific amplification by gradually lowering the annealing temperature as PCR cycling progresses. The annealing temperature at the initial cycles is usually a few degrees (3-5°C) above the Tm of the primers used, while at the later cycles, it is a few degrees (3-5°C) below the primer Tm. The higher temperatures give greater specificity for primer binding, and the lower temperatures permit more efficient amplification from the specific products formed during the initial cycles.
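
A touchdown annealing profile of this kind can be sketched as a per-cycle temperature schedule. The sketch below assumes a linear decrease from a few degrees above the primer Tm to a few degrees below it; the function name, the 4°C default offsets and the linear step are assumptions for illustration, and practical protocols often hold the final temperature for additional cycles.

```python
from typing import List


def touchdown_schedule(
    primer_tm: float, cycles: int, start_offset: float = 4.0, end_offset: float = 4.0
) -> List[float]:
    """Annealing temperature (degrees C) for each cycle.

    Starts at primer_tm + start_offset and decreases linearly to
    primer_tm - end_offset over the given number of cycles.
    """
    if cycles < 2:
        raise ValueError("at least two cycles are needed to form a gradient")
    start = primer_tm + start_offset
    end = primer_tm - end_offset
    step = (start - end) / (cycles - 1)
    return [round(start - i * step, 2) for i in range(cycles)]


# Hypothetical primer pair with a Tm of 60 degrees C, annealed over 9 cycles.
schedule = touchdown_schedule(60.0, 9)
```

The early, hotter cycles favour only perfectly matched primer binding, and the later, cooler cycles amplify the specific product already formed.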

[0138] For amplifying at least a portion of a nucleic acid molecule of interest (i.e., a portion of the gene in which an SNP(s) of interest may be located), a forward primer (i.e., 5' primer) and a reverse primer (i.e., 3' primer) will be used. Forward and reverse primers hybridize to complementary strands of a double stranded nucleic acid, such that upon extension from each primer, a double stranded nucleic acid is amplified.
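
The relationship between the two primers and the two strands can be illustrated with a short sketch. It assumes the target is given as the 5'->3' sequence of one strand; the forward primer is then identical to the start of that strand and the reverse primer is the reverse complement of its end. The function names, default length and example sequence are hypothetical.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(target: str, length: int = 18) -> Tuple[str, str]:
    """Return (forward, reverse) primers flanking the target, both written 5'->3'.

    The forward primer anneals to the complementary strand at the 5' end of
    the given strand; the reverse primer anneals to the given strand at its 3' end.
    """
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse


# Hypothetical 12-base target with 4-base primers, for readability only.
forward, reverse = primer_pair("ATGCCGTTAGCA", length=4)
```

Extension from each primer copies the opposite strand, so the region between and including the two primer sites is amplified as double-stranded product.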

[0139] Primers used to amplify a target nucleic acid molecule in accordance with the methods disclosed herein typically comprise relatively short nucleic acid sequences that hybridize specifically to a nucleic acid sequence of interest. A primer can be used alone in a detection method, or a primer can be used together with at least one other primer or probe in a detection method. In an embodiment disclosed herein, primers comprise a nucleotide sequence which comprises a region having a nucleotide sequence that specifically hybridizes under stringent conditions to about: 6, or alternatively 8, or alternatively 10, or alternatively 12, or alternatively 25, or alternatively 30, or alternatively 40, or alternatively 50, or alternatively 75 consecutive nucleotides of the nucleic acid sequence of interest. Examples include allele specific hybridization using primers overlapping the SNP and having about 5, or alternatively 10, or alternatively 20, or alternatively 25, or alternatively 30 nucleotides around the SNP.

[0140] Nucleic acids that are obtained by amplification of the nucleic acid molecules originally contained in the genetic sample derived from a subject may be further subjected to detection hybridization using nucleic acid capture probes, a Tm determination and a polymorphism check/assessment.

[0141] Any of a variety of sequencing reactions known to those skilled in the art can also be used to directly sequence at least a portion of the gene of interest and detect the presence of an SNP in that gene in accordance with the methods disclosed herein, by comparing the sample sequence with the corresponding wild-type (control) sequence. Exemplary sequencing reactions include those based on techniques developed by Maxam and Gilbert ((1977) Proc. Natl. Acad. Sci. USA 74:560) or Sanger et al. ((1977) Proc. Natl. Acad. Sci. USA 74:5463). It is also contemplated that any of a variety of automated sequencing procedures can be utilized when performing the subject assays (Biotechniques (1995) 19:448), including sequencing by mass spectrometry (see, for example, U.S. patent 5,547,835 and International Patent Application WO 94/16101, entitled "DNA Sequencing by Mass Spectrometry" by Koster; U.S. patent 5,547,835 and International Patent Application WO 94/21822 entitled "DNA Sequencing by Mass Spectrometry Via Exonuclease Degradation" by Koster; U.S. patent 5,605,798; Cohen et al. (1996) Adv. Chromat. 36:127-162; and Griffin et al. (1993) Appl. Biochem. Bio. 38:147-159). Other exemplary sequencing methods are disclosed in U.S. patent 5,580,732 and U.S. patent 5,571,676.

[0142] In other examples, the presence of an SNP in a gene can be detected by restriction enzyme analysis, whereby the specific SNP results in a nucleotide sequence comprising a restriction site that is absent from the nucleotide sequence of another allelic variant or from the wild-type sequence.
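
Whether a given allele creates a restriction site can be checked in silico before any digest is attempted. The sketch below is illustrative only: the flanking sequences are invented, the function name is hypothetical, and only one strand is searched, which is sufficient for a palindromic recognition sequence such as that of EcoRI (GAATTC).

```python
from typing import Dict, Sequence


def alleles_with_site(
    flank5: str, flank3: str, alleles: Sequence[str], site: str
) -> Dict[str, bool]:
    """For each allele, report whether the recognition site spans the SNP position."""
    snp_pos = len(flank5)
    result = {}
    for allele in alleles:
        seq = flank5 + allele + flank3
        found = False
        start = seq.find(site)
        while start != -1:
            # Count the site only if it overlaps the SNP itself, so that sites
            # lying wholly within a flank (present for every allele) are ignored.
            if start <= snp_pos < start + len(site):
                found = True
                break
            start = seq.find(site, start + 1)
        result[allele] = found
    return result


# Hypothetical A/G SNP; the A allele completes an EcoRI site (GAATTC).
sites = alleles_with_site("CCTGGA", "TTCAGG", ("A", "G"), "GAATTC")
```

In this invented example only the A allele would be cut, so digestion of an amplicon spanning the SNP would be expected to give two fragments for the A allele and an intact fragment for the G allele.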

[0143] Protection from cleavage agents (such as a nuclease, hydroxylamine or osmium tetroxide and with piperidine) can also be used to detect mismatched bases in RNA/RNA, DNA/DNA, or RNA/DNA heteroduplexes. In general, the technique of "mismatch cleavage" starts by providing heteroduplexes formed by hybridizing a control nucleic acid, which is optionally labelled, e.g., RNA or DNA, comprising a nucleotide sequence with an SNP of interest, with a sample nucleic acid, e.g., RNA or DNA, obtained from a tissue sample. The double-stranded duplexes are treated with an agent that cleaves single-stranded regions of the duplex, such as duplexes formed based on base-pair mismatches between the control and sample strands. For instance, RNA/DNA duplexes can be treated with RNase and DNA/DNA hybrids treated with S1 nuclease to enzymatically digest the mismatched regions. Alternatively, either DNA/DNA or RNA/DNA duplexes can be treated with hydroxylamine or osmium tetroxide and with piperidine in order to digest mismatched regions. After digestion of the mismatched regions, the resulting material is then separated by size on denaturing polyacrylamide gels to determine whether the control and sample nucleic acids have an identical nucleotide sequence or at which nucleotides they differ. In some embodiments, the control or sample nucleic acid is labelled for detection.

[0144] Alterations in electrophoretic mobility may also be used to identify an SNP of interest. For example, single strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild type nucleic acid sequences. Single-stranded DNA fragments of sample and control nucleic acids are denatured and allowed to renature. The secondary structure of single-stranded nucleic acids varies according to sequence, and the resulting alteration in electrophoretic mobility enables the detection of a single base difference. The DNA fragments may be labelled or detected with labelled probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to sequence differences. Such methods may also utilize heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility.

[0145] SNPs of interest may also be identified by analyzing the movement of a nucleic acid comprising the SNP in polyacrylamide gels containing a gradient of denaturant, which is assayed using denaturing gradient gel electrophoresis (DGGE). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example, by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. A temperature gradient can be used in place of a denaturing agent gradient to identify differences in the mobility of control and sample DNA.

[0146] Other examples of methods for detecting SNPs include selective oligonucleotide hybridization, selective amplification, or selective primer extension. For example, oligonucleotide probes may be prepared in which the known SNP is placed centrally (allele-specific probes) and then hybridized to target DNA under conditions which permit hybridization only if a perfect match is found. Such allele-specific oligonucleotide hybridization techniques may be used for the detection of the nucleotide changes in the polymorphic region of the gene of interest. For example, oligonucleotides having the nucleotide sequence of the specific allelic variant are attached to a hybridizing membrane and this membrane is then hybridized with labelled sample nucleic acid. Analysis of the hybridization signal will then reveal the identity of the nucleotides of the sample nucleic acid.
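
As a minimal, purely illustrative sketch of how allele-specific probes with a centrally placed SNP might be assembled in silico, the following Python function joins equal-length flanking arms around each allele. The function name, the flanking sequences and the arm length are hypothetical and serve only to show the central placement of the polymorphic base.

```python
def allele_specific_probes(flank5: str, alleles: tuple, flank3: str, arm: int = 8) -> dict:
    """Build one probe per allele with the polymorphic base at the centre.

    flank5 and flank3 are the sequences immediately 5' and 3' of the SNP;
    'arm' bases are taken from each side so the SNP sits in the middle.
    """
    if len(flank5) < arm or len(flank3) < arm:
        raise ValueError("flanking sequence shorter than requested arm length")
    left = flank5[-arm:]   # bases directly upstream of the SNP
    right = flank3[:arm]   # bases directly downstream of the SNP
    return {allele: left + allele + right for allele in alleles}


# Hypothetical flanks and alleles for demonstration only.
probes = allele_specific_probes("ACGTACGTAC", ("A", "G"), "TTGACCGTAA", arm=4)
for allele, probe in probes.items():
    print(allele, probe, "SNP at index", len(probe) // 2)
```

Each probe differs from the other at the central position only, which is the property that allows stringent hybridization conditions to discriminate between alleles.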

[0147] SNPs can also be identified using an oligonucleotide ligation assay (OLA). OLA uses two oligonucleotides that are designed to be capable of hybridizing to abutting sequences of a single strand of a target sequence (i.e., a sequence comprising the SNP of interest). One of the oligonucleotides is linked to a separation marker (e.g., biotin), and the other is detectably labelled. If the precise complementary sequence is found in a target molecule, the oligonucleotides will hybridize such that their termini abut and create a ligation substrate. Ligation then permits the labelled oligonucleotide to be recovered using avidin, or another biotin ligand. Examples of methods using OLA would be familiar to persons skilled in the art.

[0148] SNPs can also be detected by using a specialized exonuclease-resistant nucleotide. For example, a primer complementary to the allelic sequence immediately 3' to the polymorphic site is permitted to hybridize to a target molecule obtained from a particular animal or human. If the polymorphic site on the target molecule contains a nucleotide that is complementary to the particular exonuclease-resistant nucleotide derivative present, then that derivative will be incorporated onto the end of the hybridized primer. Such incorporation renders the primer resistant to exonuclease, and thereby permits its detection. Since the identity of the exonuclease-resistant derivative of the sample is known, a finding that the primer has become resistant to exonucleases reveals that the nucleotide present in the polymorphic site of the target molecule was complementary to that of the nucleotide derivative used in the reaction. This method has the advantage that it does not require the determination of large amounts of extraneous sequence data.

[0149] The amplification of nucleic acid molecules in the original genetic sample can be sequence specific, whereby the conditions favour amplification of the nucleic acid sequence comprising the SNP of interest (e.g., using sequence specific primers). Alternatively, amplification of the nucleic acid molecules in the original genetic sample can be non-sequence specific, whereby the conditions favour general amplification of nucleic acid molecules with the expectation that the amplicons will include those comprising the SNP of interest. Where non-sequence specific amplification is performed, the screening of an SNP of interest will usually be performed by using an SNP detection probe, also referred to interchangeably herein as an SNP capture probe. The term "probes" includes naturally occurring or recombinant single- or double-stranded nucleic acids or chemically synthesized nucleic acids. Persons skilled in the art would understand that the nucleic acid sequence of the SNP detection probe is not particularly limited, as long as it comprises a base that is complementary to the target nucleic acid sequence (i.e., a nucleic acid sequence comprising the SNP of interest) or a nucleic acid sequence which can hybridize (e.g., under stringent conditions) to a sequence that is complementary to the target nucleic acid sequence. The length of the probe may vary, depending, for example, on the nature of the screening method used and the hybridization conditions employed. In some embodiments, the probe may be from 5 mer to 50 mer or, more preferably, from 10 mer to 30 mer. In some embodiments, the sequence of the SNP detection probe may be one that has the base corresponding to the sequence of interest, more preferably one that is 90-100% identical to the sequence which is complementary to the sequence of interest, with the exception that it comprises the base corresponding to the SNP of interest. Whilst not essential, it would be understood by persons skilled in the art that, when the sequence of the SNP detection probe corresponds to the mutant type (i.e., the SNP), detection sensitivity may be increased.

[0150] In an embodiment disclosed herein, the detection probe may comprise a base corresponding to the wild-type sequence, wherein the absence of hybridization of the probe to a sample nucleic acid is indicative of the presence of an SNP.

[0151] Detection probes may be labelled by nick translation, Klenow fill-in reaction, PCR or other methods known to persons skilled in the art. Suitable detectable labels will be known to persons skilled in the art. Examples include a fluorescent dye and a fluorophore, which emits fluorescence by itself (i.e., when it is not hybridized with a complementary sequence) and becomes quenched through hybridization of the labelled probe with a complementary sequence. The probe may be labelled with a fluorescent dye on a base located in the 3' region (e.g., 3' terminus) or the 5' region (e.g., 5' terminus) of the probe. Nucleic acid bases that may be used to attach the detectable label include cytosine. Other detection means for detecting a detectable label in accordance with the methods described herein would be well-known to persons skilled in the art.

[0152] Examples of the suitable fluorescent dyes include fluorescein, phosphor, rhodamine and polymethine dye derivatives, BODIPY FL (Molecular Probes Inc.), FLUOREPRIME (Amersham Pharmacia), FLUOREDITE (Millipore Corporation), FAM (ABI), Cy3 and Cy5 (Amersham Pharmacia) and TAMRA (Molecular Probes Inc.). A combination of detectable labels may also be used, where appropriate, as long as, for example, the detectable labels are detectable under different detection conditions (e.g., at different wavelengths). Suitable combinations of detectable labels would be familiar to persons skilled in the art.

[0153] The SNP detection probes may also be labelled with two fluorescent dye molecules to form so-called "molecular beacons", which signal binding to a complementary nucleic acid sequence through relief of intramolecular fluorescence quenching between dyes bound to opposing ends on an oligonucleotide probe. The use of molecular beacons for genotyping will be familiar to persons skilled in the art. A quenching molecule is useful with a particular fluorophore if it has sufficient spectral overlap to substantially inhibit fluorescence of the fluorophore when the two are held proximal to one another, such as in a molecular beacon, or when attached to the ends of an oligonucleotide probe from about 1 to about 25 nucleotides.

[0154] Labelled SNP detection probes can also be used in conjunction with amplification of a nucleic acid molecule comprising the SNP of interest to provide real-time measurements of amplification products during PCR. Such approaches would be familiar to persons skilled in the art and can either employ intercalating dyes (such as ethidium bromide) to indicate the amount of double-stranded DNA present, or they can employ probes containing fluorescence-quencher pairs, where the probe is cleaved during amplification to release a fluorescent molecule whose concentration is proportional to the amount of double-stranded DNA present. During amplification, the probe is digested by the nuclease activity of a polymerase when hybridized to the target sequence to cause the fluorescent molecule to be separated from the quencher molecule, thereby causing fluorescence from the reporter molecule to appear. The TaqMan approach uses a probe containing a reporter molecule-quencher molecule pair that specifically anneals to a region of a target polynucleotide containing the polymorphism.

[0155] In an embodiment disclosed herein, SNP detection probes are affixed to a solid support for use as "gene chips" or "microarrays". Such gene chips can be used to screen for the SNPs of interest by a number of techniques known to one of skill in the art. In another embodiment, the SNP detection probes are affixed to an electrode surface for the electrochemical detection of nucleic acid sequences (as described, e.g., in U.S. Patent No. 5,952,172).

[0156] Suitable solid supports for use in gene chips would be known to persons skilled in the art. Examples include glass, polystyrene, polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses, polyacrylamides, gabbros, and magnetite. The support material may have virtually any possible structural configuration, so long as the coupled molecule (e.g., SNP detection probe, antibody) is capable of binding to its target. The support configuration may be spherical, as in a bead, or cylindrical, as in the inside surface of a test tube, or the external surface of a rod. The support surface may be flat, such as a sheet, test strip, etc., or alternatively polystyrene beads. SNP detection probes may be attached to the solid support by a variety of processes known to persons skilled in the art, including lithography. A gene chip may hold any number of detection probes of different sequences (specificities).

[0157] Suitable formats for gene chips or microarrays are known to persons skilled in the art. Examples include LabCard (ACLARA Bio Sciences Inc.); GeneChip (Affymetrix, Inc.); LabChip (Caliper Technologies Corp.); a low-density array with electrochemical sensing (Clinical Micro Sensors); LabCD System (Gamera Bioscience Corp.); Omni Grid (Gene Machines); Q Array (Genetix Ltd.); a high-throughput, automated mass spectrometry system with liquid-phase expression technology (Gene Trace Systems, Inc.); a thermal jet spotting system (Hewlett Packard Company); Hyseq HyChip (Hyseq, Inc.); BeadArray (Illumina, Inc.); GEM (Incyte Microarray Systems); a high-throughput microarraying system that can dispense from 12 to 64 spots onto multiple glass slides (Intelligent Bio-Instruments); Molecular Biology Workstation and NanoChip (Nanogen, Inc.); a microfluidic glass chip (Orchid Biosciences, Inc.); BioChip Arrayer with four PiezoTip piezoelectric drop-on-demand tips (Packard Instruments, Inc.); FlexJet (Rosetta Inpharmatics, Inc.); MALDI-TOF mass spectrometer (Sequenom); ChipMaker 2 and ChipMaker 3 (TeleChem International, Inc.); and GenoSensor (Vysis, Inc.).

[0158] The probes and/or primers for use in accordance with the methods disclosed herein may be modified, for example, to improve their stability. Methods of modifying probes and/or primers would be known to persons skilled in the art. Exemplary nucleic acid molecules which are modified include phosphoramidate, phosphorothioate and methylphosphonate analogs of DNA. They can also be modified at the base moiety, sugar moiety, or phosphate backbone, for example, to improve stability of the molecule. The probes and/or primers may include other appended groups such as hybridization-triggered cleavage agents or intercalating agents. The nucleic acid probes and/or primers can also include at least one modified sugar moiety, examples of which include arabinose, 2-fluoroarabinose, xylulose, and hexose or, alternatively, comprise at least one modified phosphate backbone selected from the group consisting of a phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphorodiamidate, a methylphosphonate, an alkyl phosphotriester, and a formacetal or analog thereof.

[0159] Suitable detection probes include both sense and antisense nucleic acid sequences, and may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by persons skilled in the art. Also included are synthetic molecules that mimic polynucleotides in their ability to bind to a designated sequence via hydrogen bonding and other chemical interactions. Such molecules are known in the art and include, for example, those in which peptide linkages substitute for phosphate linkages in the backbone of the molecule.

[0160] The term "complementary" refers to the nucleic acid sequence that is the complement of a target nucleic acid sequence. When referring to double stranded nucleic acids, the complement of a nucleic acid having SEQ ID NO:X refers to the complementary strand of the strand having SEQ ID NO:X or to any nucleic acid having the nucleotide sequence of the complementary strand of SEQ ID NO:X. When referring to a single stranded nucleic acid having the nucleotide sequence SEQ ID NO:X, the complement of this nucleic acid is a nucleic acid having a nucleotide sequence which is complementary to that of SEQ ID NO:X. The nucleotide sequences and complementary sequences thereof are always given in the 5' to 3' direction. The terms "complement" and "reverse complement" can be used interchangeably herein.
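
For illustration only, the reverse complement of a DNA sequence, written in the 5' to 3' direction as described above, may be computed as in the following Python sketch. The function name and the example sequence are hypothetical; only the unambiguous bases A, C, G and T are handled.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of seq, written 5' to 3'."""
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check.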

[0161] The means and conditions for facilitating the hybridization of the SNP detection probe and the sample nucleic acid molecule (whether the original nucleic acid molecule in the genetic sample or an amplicon generated from the original nucleic acid molecule) would be known to persons skilled in the art. Conditions for obtaining single-stranded nucleic acids by denaturing double-stranded nucleic acids and conditions for hybridizing the single-stranded nucleic acid sequences with each other are well-known in the art. In some embodiments, the heating temperature for dissociating the double-stranded nucleic acid molecules can be in a range of from 85°C to 95°C. The duration of heating will also vary and can be, for example, in a range of from 1 second to 10 minutes, more preferably from 1 second to 5 minutes. The dissociated single-stranded nucleic acid and the SNP detection probe can be hybridized, for example, by lowering the heating temperature after dissociation. The hybridization temperature will also vary, for example, from about 40°C to about 50°C.

[0162] In determining whether hybridization between the single-stranded nucleic acid and the SNP detection probe has taken place (i.e., the detection step), in some embodiments, a signal change can be measured based on the dissociation of the SNP detection probe from the single-stranded nucleic acid molecule by changing the temperature of the sample containing the hybrid in order to dissociate the hybrid. For instance, the signal value that indicates dissociation of the hybrid of a single-stranded nucleic acid (whether obtained by amplification or not) and the SNP detection probe can be measured as absorbance at a wavelength of 260 nm. In some embodiments, the dissociation may be measured by measuring the signal of a detectable label. When measurement of the signal of the detectable label is employed, detection sensitivity may be increased. An example of a suitable SNP detection probe includes a labelled probe showing a signal by itself, but not showing a signal when hybridized. Such a probe will typically not show a signal when hybridized with a target sequence (e.g., when double-strand DNA is formed), but will show a signal when the probe is dissociated by heating. Another example is a labelled SNP detection probe that typically does not show a signal by itself, but shows a signal when hybridized. Such a probe will show a signal when hybridized with a target sequence (e.g., when double-strand DNA is formed), but the signal may be decreased (i.e., quenched) when the SNP detection probe is dissociated by heating. Thus, by detecting the signal of the detectable label with a specific condition for the signal (absorption wavelength and the like), the progress of dissociation of the hybrid may be monitored and the melting temperature (Tm) for a particular hybrid can be determined, in a similar manner to measurement of absorbance, for example, at 260 nm.

[0163] Signal changes based on dissociation of the hybrid may be produced by changing the temperature of a reaction solution. For example, the reaction solution (i.e., a solution containing a hybrid between the single-strand DNA and the labelled SNP detection probe) can be heated, and a change of signal value associated with the increase in temperature can be measured. If, for example, a probe comprising a labelled cytosine residue as its terminal base is used, when the probe is hybridized with a single strand DNA, fluorescence is decreased (or quenched), and when the probe is dissociated, fluorescence is emitted. Thus, by gradually heating a hybrid having decreased (or quenched) fluorescence, an increase of fluorescent intensity associated with an increase in the reaction temperature can be measured.

[0164] In the Tm determination, Tm may be determined by analyzing a signal change obtained in the measuring step, and then assessed. Such measurement will be familiar to persons skilled in the art. For example, an amount of change in fluorescent intensity per unit time may be calculated from the fluorescence obtained at each temperature. When the amount of change is defined as [-d(increased amount of fluorescent intensity)/dt], the temperature showing the lowest value can be determined as the Tm. Alternatively, when the amount of change is defined as [d(increased amount of fluorescent intensity)/dt], the temperature showing the highest value can be determined as the Tm. Where the labelled probe that is used is not a quenching probe, but is a probe which does not show a signal by itself and shows a signal when hybridized, then a decrease of fluorescent intensity can be measured.
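
A minimal sketch of such a Tm determination is given below in Python, for illustration only. It assumes that the temperature is ramped at a constant rate, so that the change in signal per unit time is proportional to the change per degree, and it takes the absolute value of the finite-difference slope so that either a quenching probe or a signal-on-hybridization probe can be handled. The function name and the melting-curve values are hypothetical.

```python
def estimate_tm(temps, fluorescence):
    """Estimate Tm as the midpoint of the temperature interval in which the
    fluorescence changes most steeply (largest |dF/dT|, finite differences)."""
    if len(temps) != len(fluorescence) or len(temps) < 2:
        raise ValueError("need two equally long series with at least two points")
    best_tm, best_slope = None, -1.0
    for i in range(len(temps) - 1):
        delta_t = temps[i + 1] - temps[i]
        slope = abs((fluorescence[i + 1] - fluorescence[i]) / delta_t)
        if slope > best_slope:
            best_slope = slope
            best_tm = (temps[i] + temps[i + 1]) / 2
    return best_tm


# Hypothetical melting curve for a quenching probe: signal rises on dissociation.
temps = [50, 55, 60, 65, 70]
signal = [1.0, 1.1, 1.3, 3.0, 3.1]
print(estimate_tm(temps, signal))  # 62.5
```

Real melting curves are typically sampled much more finely and smoothed before differentiation; the coarse grid here is used only to keep the example short.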

[0165] It would be understood by persons skilled in the art that, from the results of the analysis of Tm, a Tm which indicates dissociation of a hybrid of full complementary strands (match) may be higher than that of a hybrid of one base-different strands (mismatch). Therefore, by determining the Tm of both a hybrid of full complementary strands and a hybrid of one base-different strands in advance, the SNP may be determined. For example, when a base of the target base site is presumed to be the mutant type, and an SNP detection probe complementary to the target sequence containing the SNP is used, the target base may be identified as the mutant type if the Tm of a formed hybrid is identical to the Tm of the hybrid of full complementary strands. Similarly, if the Tm of a formed hybrid is identical to the Tm of the hybrid of one base-different strands (lower than the Tm of the hybrid of full complementary strands), the target base can be identified as a normal type. If both Tms are detected, for example, it can be determined that a nucleic acid of mutation type and a nucleic acid of normal type co-exist in that sample.
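
The comparison described above can be sketched, by way of example only, as the following Python function. It assumes that the probe is complementary to the mutant sequence, that the match and mismatch Tm values have been determined in advance, and that a tolerance (here a hypothetical 1.0°C, smaller than half the difference between the two reference Tm values) is used to decide whether an observed Tm is "identical" to a reference Tm. All numerical values are invented for the example.

```python
def call_genotype(observed_tms, tm_match, tm_mismatch, tolerance=1.0):
    """Classify a sample from the Tm peaks observed with a mutant-complementary probe."""
    has_match = any(abs(t - tm_match) <= tolerance for t in observed_tms)
    has_mismatch = any(abs(t - tm_mismatch) <= tolerance for t in observed_tms)
    if has_match and has_mismatch:
        return "mutant and normal types co-exist"
    if has_match:
        return "mutant"       # fully complementary hybrid, higher Tm
    if has_mismatch:
        return "normal"       # one-base-different hybrid, lower Tm
    return "undetermined"     # no peak near either reference Tm


# Hypothetical reference values: 66.0 (match) and 58.0 (mismatch).
print(call_genotype([66.1], 66.0, 58.0))
print(call_genotype([58.2, 65.8], 66.0, 58.0))
```

An "undetermined" result is returned when no observed Tm falls near either reference value, which in practice might prompt a repeat of the assay.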

[0166] In some embodiments, when a labelled SNP detection probe that shows a signal by itself but does not show a signal when hybridized (e.g., a guanine quenching probe) is used, the probe emits fluorescence when a single-stranded nucleic acid and a labelled SNP detection probe are dissociated, but when the probe is hybridized by lowering the temperature, the fluorescence is decreased (or quenched). Thus, by gradually lowering the temperature of the reaction solution, a decrease of fluorescent intensity can be measured. Conversely, when a labelled SNP detection probe that does not show a signal by itself but shows a signal when hybridized is used, the probe does not emit fluorescence when a single strand DNA and a probe are dissociated, but when the probe is hybridized by lowering the temperature, the probe emits fluorescence. Thus, by gradually lowering the temperature of the reaction solution, an increase of fluorescent intensity can be measured.

[0167] Antibodies directed against wild type or mutant peptides encoded by the allelic variants of the gene of interest (i.e., the genes in which the SNP is located) may also be used to detect the presence (or absence) of an SNP of interest. Such methods may be used to detect abnormalities in the level of expression of the peptide or abnormalities in the structure and/or tissue, cellular, or subcellular location of the peptide. Proteins from the tissue or cell type to be analyzed may be detected or isolated using techniques which are well known to persons skilled in the art, such as Western blot analysis. Other examples include immunofluorescence techniques employing a fluorescently labelled antibody (see below) coupled with light microscopic, flow cytometric, or fluorometric detection. The antibodies (or fragments thereof) that are employed in the methods disclosed herein may also be used for histology, as in immunofluorescence or immunoelectron microscopy, for in situ detection of the peptides or their allelic variants. In situ detection may be accomplished by removing a histological specimen from a patient, and applying thereto a labelled antibody that specifically binds to a protein encoded by the allelic variant on which the SNP is located or normally located. Through the use of such a procedure, it is possible to determine not only the presence of the peptide, but also its distribution in the examined tissue. One of ordinary skill in the art will understand that any of a wide variety of histological methods (such as staining procedures) can be modified in order to achieve such in situ detection.

[0168] The screening of a sample for an SNP of interest may also include methods that determine the peptide function. For example, a peptide encoded by an allelic variant of the gene of interest (i.e., the genes in which the SNP is located) may have different activity as compared to the wild-type peptide. The activity of the peptide encoded by the allelic variant may be greater or less than the activity of the wild-type peptide, or it may have different activity altogether (e.g., where the peptide is an enzyme, it may display activity on a different substrate). For example, where the SNP of interest is located in the gene encoding calcium-activated potassium channel subunit β4 (KCNMB4), the presence of the SNP may be detected by comparing the activity of the calcium-activated potassium channel in a cell derived from a test subject with the activity of the calcium-activated potassium channel in a cell derived from a subject known to express the wild-type protein. Persons skilled in the art would be familiar with the type of methods that can be used to assay a cell for KCNMB4 activity. An example is provided by Wang et al. (2006, J Gen Physiol. 127(4):449-65).

[0169] The term "encode", as it is applied to nucleic acid molecules, refers to a nucleic acid molecule that is said to "encode" a polypeptide if, in its native state or when manipulated by methods well known to persons skilled in the art, it can be transcribed and/or translated to produce the mRNA for the polypeptide and/or a fragment thereof. The antisense strand is the complement of such a nucleic acid, and the encoding sequence can be deduced therefrom.

[0170] In an embodiment disclosed herein, the subject is human. In an embodiment disclosed herein, the human is one of an ethnic group of humans. In an embodiment disclosed herein, the ethnic group is not Han Chinese.

[0171] The methods for the diagnosis or prognosis of ASD, as disclosed herein, can also be used to monitor the course of treatment or therapy or for determining the most appropriate course of treatment for a subject in need thereof. The term "treatment" includes amelioration or at least management of symptoms of ASD.

[0172] Based on the diagnostic or prognostic information, a medical practitioner can recommend a suitable regimen or therapeutic protocol (e.g., behavioural or pharmacological intervention) that is known to be beneficial in subjects with ASD; also referred to as "pharmacogenomics". For example, an individual's genetic profile can enable a medical practitioner: 1) to more effectively prescribe a drug that will address the molecular basis of the disease or condition (e.g., where the SNP is located on a gene associated with a known molecular pathway for which intervening agents (agonists or antagonists) are known); and 2) to better determine the appropriate dosage of a particular agent. The SNP expression patterns of a subject can therefore be used to determine the appropriate drug and dose to administer to the subject.

[0173] Thus, in an embodiment disclosed herein, the method further comprises, where the subject is determined as having or having a predisposition to developing ASD, exposing the subject to a treatment for inhibiting the progression of ASD or for inhibiting the onset of ASD or for ameliorating the symptoms of ASD.

[0174] Methods of treating a subject having or predisposed to developing ASD would be known to persons skilled in the art. Examples include (a) determining the presence of an SNP in a target gene in accordance with the methods disclosed herein; and (b) administering to the subject an effective amount of a compound that targets the gene, or the polypeptide encoded by said gene to inhibit or enhance its activity, as required (i.e., an agonist or antagonist). The methods of treatment or prophylaxis may also include gene therapy to replace an SNP that is indicative of the presence of ASD (or a predisposition to developing ASD) with a nucleotide base corresponding to the wild-type sequence, or a conservative nucleic acid substitution thereof. In another embodiment, the treatment comprises administering to the subject an agent that is capable of expressing in the subject a nucleic acid molecule comprising an SNP identified as being protective against developing ASD, where such expression counters (or negates) the phenotype associated with the SNP that is indicative of the presence of ASD or a predisposition to developing ASD. The SNP identified as being protective against developing ASD need not be an SNP located on the same gene as the SNP that is indicative of the presence of ASD or a predisposition to developing ASD and can therefore be located on another gene.

[0175] Methods for the treatment or prophylaxis of ASD also include administering to a subject an agent that targets the expression and/or activity of a peptide encoded by the allelic variant on which the SNP is located. The particular agent will depend on the nature of the peptide and persons skilled in the art would be familiar with the type of agent that can be used, having regard to the relevant peptide.

[0176] Other suitable examples of methods for the treatment or prophylaxis of ASD include the use of a medication for managing high energy levels, inability to focus, depression or seizures that are occasionally seen in patients with ASD. Further examples include the use of anti-psychotic drugs such as Risperidone and Aripiprazole, particularly for the treatment of younger patients with ASD who experience severe behavioural difficulties, including tantrums, aggression and self-injury behaviour. Other methods include early intervention treatment with a focus on improving a child's development. Such early intervention services may include speech and social interaction therapy. Further examples include auditory training, discrete trial training, vitamin therapy, anti-yeast therapy, communication therapy, music therapy, occupational therapy, physical therapy and sensory integration. These may include behavioural and communication approaches, dietary intervention, medication or complementary and alternative medicines.

[0177] Another example of a suitable method for the treatment or prophylaxis of ASD is Applied Behaviour Analysis (ABA), which seeks to encourage positive behaviour and discourage negative behaviour, particularly in children with ASD. Examples of ABA include discrete trial training, early intensive behavioural intervention, pivotal response training and verbal behaviour intervention.

[0178] The ability to target populations having the highest likelihood of developing ASD, based on their SNP genetic profile as herein disclosed, can enable: 1) the repositioning of marketed drugs with disappointing market results; 2) the rescue of drug candidates whose clinical development has been discontinued as a result of safety or efficacy limitations, which are patient subgroup-specific; and 3) an accelerated and less costly development for drug candidates and more optimal drug labelling.

[0179] In another aspect disclosed herein, there is provided a kit for determining whether a subject has or has a predisposition to develop ASD, the kit comprising a set of primers and/or probes for identifying an SNP in accordance with the methods disclosed herein.

[0180] In an embodiment, the kit comprises one or more of the components herein described (e.g., primers, probes, detectable labels, reagents for amplification, etc.) and, optionally, instructions for use. The kit may comprise at least one probe or primer which is capable of specifically hybridizing to the polymorphic region of the gene of interest and instructions for use. The kits may comprise at least one of the above described nucleic acids. Preferred kits for amplifying at least a portion of the gene of interest comprise two primers, at least one of which is capable of hybridizing to the allelic variant sequence. Such kits are suitable for detection of genotype by, for example, fluorescence detection, electrochemical detection, or other detection methods.

[0181] Nucleic acid molecules, whether used as probes or primers, contained in a kit can be detectably labelled. Labels can be detected either directly, for example in the case of fluorescent labels, or indirectly. Indirect detection can include any detection method known to one of skill in the art, including biotin-avidin interactions, antibody binding and the like. Fluorescently labelled oligonucleotides also can contain a quenching molecule. The nucleic acid molecules may be bound to a solid support, as employed, for example, in gene chips and microarrays.

[0182] The kits may also include reagents for preparing (isolating) nucleic acid molecules from a genetic sample derived from a subject. The kits may also include all or some of the positive controls, negative controls, sequencing markers, and antibodies described herein for screening a subject's genetic sample for an SNP of interest.

[0183] Further provided is a machine-implemented means for stratifying a subject with respect to ASD in accordance with the methods disclosed herein, comprising:

(e) receiving subject data from a user via a communications network, the data comprising information on the presence or absence of the genetic marker in the subject;

(f) performing a processing function including comparing the data to predetermined data;

(g) determining the status of the subject in accordance with the results of the processing function including the comparison; and

(h) transferring an indication of the status of the subject to the user via the communications network.

[0184] In an embodiment, the genetic marker is a SNP.

[0185] In an embodiment, the methods disclosed herein further comprise having the user determine the data using a remote end station; and transferring the data from the end station to the base station via the communications network.

[0186] In an embodiment, the methods disclosed herein further comprise transferring the data through a firewall.

[0187] In an embodiment, the methods disclosed herein further comprise:

(a) determining payment information, the payment information representing the provision of payment by the user; and

(b) performing the data processing and transfer in response to the determination of the payment information.

[0188] Also provided is a base station for stratifying a subject with respect to ASD in accordance with the methods disclosed herein, the base station comprising:

(a) a store method;

(b) a processing system, the processing system being adapted to:

(i) receive subject data from a user via a communications network, the data comprising information on the presence or absence of the SNP in the subject;

(ii) perform a processing function including comparing the data to predetermined data;

(iii) determine the status of the subject in accordance with the results of the processing function including the comparison; and

(c) output an indication of the status of the subject to the user via the communications network.

[0189] In an embodiment, the processing system is adapted to receive data from a remote end station adapted to determine the data.

[0190] In an embodiment, the processing system comprises:

(a) a first processing system adapted to:

(i) receive the data; and

(ii) determine the status of the subject in accordance with the data including comparing the data; and

(b) a second processing system adapted to:

(i) receive the data from the processing system;

(ii) perform the processing function including the comparison; and

(iii) transfer the results to the first processing system.

[0191] Also provided is a prognostic panel of genetic markers comprising one or more primers and/or one or more SNP detection probes, as herein described, for screening a subject for SNPs in accordance with the methods disclosed herein.

[0192] In one aspect of the panels, the one or more primers and/or one or more probes may be attached to a solid support, as hereinbefore described (i.e., as a microarray).

[0193] Those skilled in the art will appreciate that the invention described herein is susceptible to variations and modifications other than those specifically described. It is to be understood that the invention includes all such variations and modifications which fall within the spirit and scope. The invention also includes all of the steps, features, compositions and compounds referred to or indicated in this specification, individually or collectively, and any and all combinations of any two or more of said steps or features.

[0194] Certain embodiments of the invention will now be described with reference to the following examples which are intended for the purpose of illustration only and are not intended to limit the scope of the generality hereinbefore described.

EXAMPLES

Example 1 - Subjects

[0195] The following study was approved by the University of Melbourne Human Research Ethics Committee (Approval Numbers 0932503.1, 0932503.2).

[0196] Index sample: Subject data from 2,609 probands with ASD (including Autism, Asperger's or Pervasive Developmental Disorder-Not Otherwise Specified, but excluding Rett syndrome and Fragile X), and 4,165 relatives of probands, was available from AGRE 1; 1,862 probands and 2,587 first-degree relatives had SNP data from the Illumina 550 platform relevant to analyses (see Figure 1a). Diagnosis of ASD was made by a specialist clinician and confirmed using the Autism Diagnostic Interview Revised (ADI-R 2). Control training data was obtained from HapMap 3 instead of relatives, as the latter may possess SNPs that predispose to ASD and skew analysis (see Figures 1a and 1b).

[0197] Independent validation samples: 737 probands with ASD (ADI-R diagnosed) derived from SFARI; 2,930 control subjects from WTBC (see Figure 1b).

[0198] As SNP incidence rates vary according to ancestral heritage, HapMap data (Phase 3 NCBI build 36) was utilized to allocate individuals to their closest ethnicity. Individuals of mixed ethnicity were excluded; HapMap data has 1,403,896 SNPs available from 11 ethnicities. Any SNPs not included on the AGRE Illumina 550 platform were discarded, resulting in 407,420 SNPs. Mitochondrial SNPs reported in the AGRE, but not available in HapMap were excluded. The 30 most prevalent (>95%) SNPs within each ethnicity were identified and each ASD individual assigned to the group for which they shared the highest number of ethnically specific SNPs. HapMap groups were determined to be appropriate for analysis, as prevalence rates of the 30 SNPs relevant to each ethnicity were similar for each AGRE group assigned to that ethnicity, p<0.05.
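The assignment step described in [0198] can be sketched as follows. This is a minimal illustration only: the marker identifiers, the three-marker sets and the function name are hypothetical (the study used the 30 most prevalent SNPs per HapMap group), and the exclusion of individuals of mixed ethnicity is not modelled.

```python
def assign_ancestry_group(individual_snps, group_markers):
    """Return the group whose marker set shares the most SNPs with the individual."""
    return max(
        group_markers,
        key=lambda group: len(individual_snps & group_markers[group]),
    )


# Hypothetical marker sets; real sets would hold the group-specific SNP identifiers.
group_markers = {
    "CEU": {"m1", "m2", "m3"},
    "TSI": {"m3", "m4", "m5"},
    "HAN": {"m6", "m7", "m8"},
}

# Hypothetical individual carrying three CEU markers and one HAN marker.
individual = {"m1", "m2", "m3", "m6"}
assigned = assign_ancestry_group(individual, group_markers)
```

If two groups tie, this sketch returns whichever appears first in the dictionary; a fuller treatment would flag such individuals for review.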

Example 2 - Gene Set Enrichment Analysis (GSEA)

[0199] Pathway analysis was selected because it depicts how groups of genes may contribute to ASD etiology (see Example 7) and mitigates the statistical problem of conducting the large number of multiple comparisons required in GWAS studies. The current pathway analysis differs from previous ASD analyses in three ways: (1) we divided the cohort into ethnically homogeneous samples with similar SNP rates; (2) both protective and contributory SNPs were accounted for in the analysis; and (3) the pathway test statistic was calculated using permutation analysis. Although this is computationally expensive, benefits include taking account of rare alleles, small sample sizes and familial effects. It also relaxes the Hardy-Weinberg equilibrium assumption, namely that allele and genotype frequencies remain constant within a population over generations. Pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) and SNP-to-gene data were obtained from the National Center for Biotechnology Information (NCBI). Intronic and exonic SNPs were included. AGRE individuals most closely matching the genetics of Utah residents of Western and Northern European (CEU), Tuscan Italian (TSI) and Han Chinese origin were used in the analysis. CEU individuals (975 affected individuals and 165 controls) were chosen as the index sample, representing the largest group affected in AGRE (Figure 1a). The CEU and Han Chinese had 116,753 SNPs that differed, whereas the CEU and TSI had 627 SNPs, differing in allelic prevalence at p<1x10^-5. The pathway test statistic was calculated for CEU and Han individuals using a "set-based test" in the PLINK 4 software package, with p=0.05, r2=0.5 and permutations set to 2,000,000. The significance threshold was set conservatively at p<2x10^-6, calculated from the number of pathways being examined (200) and the maximum number of retained SNPs per pathway (125). Therefore, significance was set at < 0.05/(200x125)=2x10^-6 (see Figure 4).
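The threshold arithmetic in [0199] can be checked with a short calculation. The function name is illustrative; the inputs (0.05, 200 pathways, 125 retained SNPs per pathway) are the values stated in the text.

```python
def bonferroni_threshold(alpha, n_pathways, max_snps_per_pathway):
    """Divide the family-wise alpha by the number of pathway-by-SNP comparisons."""
    return alpha / (n_pathways * max_snps_per_pathway)


# Values stated in the text: 200 pathways, at most 125 retained SNPs per pathway.
threshold = bonferroni_threshold(0.05, 200, 125)
```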

Example 3 - Predicting ASD phenotype based upon candidate SNPs.

[0200] For each individual, a 775-dimensional vector was constructed, corresponding to the 775 SNPs identified as part of the GSEA. To examine whether SNPs could predict an individual's clinical status (ASD versus non-ASD), two-tail unpaired t-tests were used to identify which of the 775 SNPs had statistically significant differences in mean SNP value (p<0.005). This significance level provided low classification error while maintaining acceptable variance in estimation of regression coefficients for each SNP's contribution status, and provided the set of SNPs that maximized the classifier output between the populations (see Figure 2 and Example 8). This resulted in 237 SNPs selected for regression analysis (see Table 1). Each dimension of the vector was assigned a value of 0, 1 or 3, dependent on a SNP having two copies of the dominant allele, being heterozygous, or having two copies of the minor allele. The '0, 1, 3' weighting provided greater classification accuracy over '0, 1, 2'. Such approaches using super-additive models have been used previously to understand genetic interactions 5. The formulae for the classifier and classifier performance are presented in Example 9.
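The '0, 1, 3' coding described in [0200] can be illustrated as follows. This is a sketch under stated assumptions: genotypes are represented as two-character allele strings and the minor allele of each SNP is known in advance; the function names and the example genotypes are hypothetical.

```python
# Super-additive coding from the text: 0, 1 or 2 minor-allele copies map to 0, 1, 3.
WEIGHTS = {0: 0, 1: 1, 2: 3}


def encode_genotype(genotype, minor_allele):
    """Map one genotype (e.g. 'AG') to its 0/1/3 value by counting minor alleles."""
    minor_count = sum(1 for allele in genotype if allele == minor_allele)
    return WEIGHTS[minor_count]


def encode_individual(genotypes, minor_alleles):
    """Build the per-individual feature vector, one entry per SNP."""
    return [encode_genotype(g, m) for g, m in zip(genotypes, minor_alleles)]


# Hypothetical three-SNP individual; the study vector has one entry per retained SNP.
vector = encode_individual(["AA", "AG", "GG"], ["G", "G", "G"])
```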

[0201] The CEU sample was divided into a training set (732 ASD individuals and 123 controls) and the remainder comprised the validation set. An affected individual was given a value of 10 and an unaffected individual a value of -10, providing a sufficiently large separation to maximize the distance between means (see Example 9). Least squares regression analysis of the training set determined coefficients whose product mapped SNPs to clinical status. The Kolmogorov-Smirnov goodness of fit test assessed the nature of the distribution of SNPs by classification. At p=0.05, the distributions were accepted as being normally distributed, allowing determination of positive and negative predictive values (see ROC, Figure 5). The Durbin-Watson test was used to investigate the residual errors of the training set to determine if further correlations existed. At p=0.05 the residuals were uncorrelated. Regression coefficients were used to assess individual SNP contribution to clinical status.
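For reference, the Durbin-Watson statistic mentioned in [0201] is the ratio of the summed squared successive differences of the residuals to their summed squares; values near 2 suggest no first-order correlation. The sketch below computes only the statistic (not the associated significance test), and the residual values are made up for illustration.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    numerator = sum(
        (current - previous) ** 2
        for previous, current in zip(residuals, residuals[1:])
    )
    denominator = sum(r ** 2 for r in residuals)
    return numerator / denominator


# Made-up residuals: perfectly alternating signs give a value well above 2,
# while identical residuals give 0.
alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```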

Example 4 - AGRE validation

[0202] After analyzing the CEU training cohort, three cohorts were used for validation: 285 (243 probands, 42 controls) CEUs; a genetically similar TSI sample (65 patients, 88 controls); and a genetically dissimilar Han Chinese population (33 patients, 169 controls). To illustrate overlap in SNPs in first-degree relatives of individuals with ASD (n=1,512), we mapped the SNPs of parents (n=1,219; 581 Male) and unaffected siblings (n=293; 98 Male) of CEU origin who did not meet criteria for ASD. Finally, the predictive model was modified to test predictive accuracy using the 10, 30 and 60 SNPs having the greatest weightings.

[0203] Independent validation: Samples included 507 CEU and 18 TSI subjects with ASD from SFARI, and 2,557 CEU and 63 TSI from WTBC (see Figure 1b).

Example 5 - Identification of Affected Pathways

[0204] Analyses focused on 978 CEU ASD individuals, in which 13 KEGG pathways were significantly affected (p<1x10^-5). The pathway analysis identified 775 significant SNPs perturbed in ASD. A number of the pathways were populated by the same genes and had inter-related functions (see Table 2).

[0205] The most significant pathways were: calcium signaling, gap junction, long term depression (LTD), long term potentiation (LTP), olfactory transduction, and mitogen activated kinase-like protein (MAPK) signaling. GSEA on the genetically distinct Han Chinese identified six pathways that overlapped with the 13 pathways in the CEU cohort (estimate of this occurring by chance, p=0.05), including: purine metabolism, calcium signaling, phosphatidylinositol signaling, gap junction, LTP, and LTD. Related to these pathways, the statistically significant SNPs in both populations were rs3790095 within GNAO1, rs1869901 within PLCB2, rs6806529 within ADCY5 and rs9313203 in ADCY2.

Example 6 - Diagnostic Prediction of ASD

[0206] From the 775 SNPs identified within the CEU cohort, accurate genetic classification of ASD versus non-ASD was possible using 237 SNPs determined to be highly significant (p<0.005). Figure 3a shows the distribution of ASD and non-ASD individuals based on genetic classification.

[0207] An individual's clinical status was set to ASD if their score exceeded the threshold of 3.93. This threshold corresponds to the intersection points of the two normal curves. The theoretical classification error was 8.55%, and positive (ASD) and negative predictive values (Controls) were 96.72% and 94.74%, respectively. Classification accuracy for the 285 CEU AGRE validation individuals was 85.6% and 84.3% for the TSI, while accuracy for the Han Chinese population was only 56.4%.
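The idea of placing the threshold where two fitted normal curves intersect can be sketched as follows. The sketch assumes equal weighting of the two curves and solves the quadratic obtained by equating the two log-densities; the means and standard deviations used below are hypothetical, as the fitted parameters behind the 3.93 threshold are not given in this section.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))


def intersection_threshold(mean_a, sd_a, mean_b, sd_b):
    """Return the point between the two means where the two normal densities are equal."""
    # Equating log-densities gives a*x^2 + b*x + c = 0.
    a = 1 / (2 * sd_a ** 2) - 1 / (2 * sd_b ** 2)
    b = mean_b / sd_b ** 2 - mean_a / sd_a ** 2
    c = (mean_a ** 2 / (2 * sd_a ** 2)
         - mean_b ** 2 / (2 * sd_b ** 2)
         + math.log(sd_a / sd_b))
    if abs(a) < 1e-12:
        # Equal variances: the quadratic degenerates to a line (midpoint of the means).
        return -c / b
    root_term = math.sqrt(b * b - 4 * a * c)
    low, high = sorted((mean_a, mean_b))
    for root in ((-b + root_term) / (2 * a), (-b - root_term) / (2 * a)):
        if low <= root <= high:
            return root
    raise ValueError("no intersection between the two means")


# Hypothetical fitted parameters for the control and ASD score distributions.
equal_sd_threshold = intersection_threshold(-10.0, 2.0, 10.0, 2.0)
unequal_sd_threshold = intersection_threshold(-10.0, 2.0, 10.0, 3.0)
```

With unequal standard deviations the threshold shifts toward the mean of the narrower distribution, which is one way a threshold away from the midpoint of the two training targets can arise.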

[0208] Using the same classifier with the identical set of SNPs, accuracy of prediction of ASD in the independent datasets was 71.6%; positive and negative predictive accuracies were 70.8% and 71.8%, respectively. SNPs were compared to the affected and unaffected individuals.

[0209] Table 2: Statistically significant pathways for the CEU and Han Chinese. p-values in bold are statistically significant. The pathways highlighted in 'bold' denote pathways that have reached statistical significance in both populations. KEGG (Kyoto Encyclopedia of Genes and Genomes; ftp.kegg.jp); CEU - of Central (Western & Northern) European origin; HAN - of Han Chinese origin.

Table 2

Substitute Sheet (Rule 26) RO/AU

[0210] Figure 3b shows that relatives (parents and unaffected siblings combined) fall between the two distributions, with a mean score of 2.68 (SD=2.27). The percentage overlap of the relatives and affected individuals was 30.4%. The mean scores of the mothers and fathers did not differ (at p=0.05) with scores of 2.83 (SD=2.17) and 2.93 (SD=2.34), respectively (see S5), while unaffected siblings (not meeting diagnostic criteria for ASD) fell between parents and cases (mean=4.74, SD=3.80). In testing the robustness of the predictive model, using fewer SNPs monotonically decreased accuracy in the AGRE-CEU analyses to 72% for 60 SNPs, 58% for 30 SNPs and 53.5% for 10 SNPs, with the distribution of parents being indistinguishable from controls.

[0211] Of the 237 SNPs within our classifier, presence of some contributed to vulnerability to ASD (Table 3a), while others were protective (Table 3b). Eight SNPs in three genes, GRM5, GNAO1 and KCNMB4, were highly discriminatory in determining an individual's classification as ASD or non-ASD. For KCNMB4, rs968122 highly contributed to a clinical diagnosis of ASD whilst rs12317962 was protective; for GNAO1, SNP rs876619 contributed whilst rs8053370 was protective; for GRM5, SNP rs11020772 was contributory whilst rs905646 and rs6483362 were protective.

[0212] Tables 3a and 3b: List of the 15 most contributory (Table 3a) and 15 most protective (Table 3b) SNPs for ASD diagnosis in the CEU cohort. Weight indicates the contribution of each SNP to ASD clinical status. 'Weight Lower' indicates the 0.95 lower error bar of the estimate; 'Weight Higher' indicates the 0.95 upper error bar for that SNP. Note that some genes have SNPs that contribute to risk for ASD and SNPs that protect against ASD.

Risk SNPs and their weightings

Protective SNPs and their weightings


Example 7 - Gene set enrichment analysis (GSEA)

[0213] GSEA was undertaken to consider all possible genes related to pathways that might contribute to risk for autism. We were interested in examining the contribution of multiple SNPs to risk for autism, each with potentially small effect, rather than seeking to identify individual or small numbers of SNPs of large effect. The latter approach, while providing some information about the genes contributing to autism, has failed to provide any ability to predict which individuals may be at risk. The approach we have taken is to identify which of the known pathways are perturbed in ASD (using KEGG canonical pathways). Here, instead of attempting to identify significance for individual SNPs or genes, we sought to identify canonical pathways that differed compared with control subjects. This has the benefit of taking into account the complex interactions of genes and, since this approach analyzes a much smaller number of sets, it considerably increases the power of the analyses.

[0214] The collection of SNPs on the Illumina platform relevant to particular pathways was compared and a determination was made as to whether SNPs related to these pathways were perturbed. Data for the AGRE cohort provided SNP information from the Illumina 550 platform. The other datasets (HapMap, SFARI, Wellcome Trust) provided SNP data from the Illumina 1M-Duo. The total number of SNPs consistent across the two platforms was 407,420. The number of KEGG pathway genes examined was 5,936. For each KEGG pathway, we determined the collection of SNPs residing on genes that form part of the pathway. This was performed by first identifying all genes that reside on a pathway. NCBI data mapping a SNP to a gene was then used to identify all SNPs relevant to a pathway. This included both intronic and exonic SNPs.
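The pathway-to-SNP mapping described in [0214] amounts to two lookups followed by a platform filter, which can be sketched as below. All pathway, gene and SNP identifiers here are placeholders invented for illustration, as is the function name; real inputs would be the KEGG pathway gene lists, the NCBI SNP-to-gene mapping and the set of SNPs shared by the genotyping platforms.

```python
def snps_for_pathway(pathway, pathway_genes, gene_snps, platform_snps):
    """Collect SNPs on the pathway's genes, keeping only those present on the platform."""
    snps = set()
    for gene in pathway_genes[pathway]:
        snps.update(gene_snps.get(gene, ()))
    return sorted(snps & platform_snps)


# Placeholder data standing in for KEGG (pathway -> genes) and NCBI (gene -> SNPs).
pathway_genes = {"pathway_A": ["GENE1", "GENE2"], "pathway_B": ["GENE2", "GENE3"]}
gene_snps = {"GENE1": ["snp_a", "snp_b"], "GENE2": ["snp_c"], "GENE3": ["snp_d"]}
platform_snps = {"snp_a", "snp_c", "snp_d"}  # SNPs common to the genotyping platforms

pathway_a_snps = snps_for_pathway("pathway_A", pathway_genes, gene_snps, platform_snps)
```

A gene shared by two pathways (GENE2 here) contributes its SNPs to both, consistent with the observation that several pathways were populated by the same genes.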

[0215] The p-values relevant to the pathways were calculated using permutation analysis. This was necessary as analyses were performed on a set of SNPs, not a single SNP. The set-based association analysis procedure used was as described in PLINK; see Purcell (2007) 4. In brief, SNPs that were in linkage disequilibrium (LD) above a certain threshold were removed, so as to identify independent SNPs with the highest loadings. The statistic for each set was calculated as the mean of the single-SNP statistics and the dataset was permuted 2,000,000 times while keeping LD between SNPs constant. For the two million permutations, the listed p-value was the number of times the permuted set statistic exceeded the mean p-value for that set.
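The permutation logic in [0215] can be sketched as follows. This is not PLINK's implementation: the single-SNP statistic used here (absolute case-control difference in mean genotype value) is a simplified stand-in, LD pruning is omitted, and the function names, data and permutation count are illustrative. Shuffling only the case/control labels leaves each individual's genotype vector intact, which is what keeps the correlation (LD) between SNPs unchanged across permutations.

```python
import random


def set_statistic(genotypes, labels):
    """Mean over SNPs of the absolute case-control difference in mean genotype value."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    n_snps = len(genotypes[0])
    total = 0.0
    for s in range(n_snps):
        case_mean = sum(g[s] for g in cases) / len(cases)
        control_mean = sum(g[s] for g in controls) / len(controls)
        total += abs(case_mean - control_mean)
    return total / n_snps


def permutation_p_value(genotypes, labels, n_permutations=2000, seed=0):
    """Empirical p-value: share of label permutations with a statistic >= the observed one."""
    rng = random.Random(seed)
    observed = set_statistic(genotypes, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # permute phenotype labels only
        if set_statistic(genotypes, shuffled) >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_permutations + 1)


# Made-up data: 10 cases homozygous for the minor allele at 3 SNPs, 10 controls with none.
genotypes = [[2, 2, 2]] * 10 + [[0, 0, 0]] * 10
labels = [1] * 10 + [0] * 10
p_value = permutation_p_value(genotypes, labels)
```

The smallest p-value this estimator can report is 1/(n_permutations + 1), which is why very stringent thresholds require a large number of permutations.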

[0216] All available KEGG pathways were tested. Only those showing statistical significance at p<10^-5 were retained. The significance threshold of p<10^-5 was set according to the number of pathways being examined, which was 200. Therefore, significance was less than 0.05/200 [set at less than 1 x 10^-5].

[0217] Pathways were then determined. For post-pathway analysis, only SNPs that were part of the significant pathways were considered. Analysis was then performed to assess whether the collection of SNPs relevant to each pathway was more or less represented in ASD individuals versus controls. 775 SNPs were identified as being statistically significant SNPs.

Example 8 - Identification of SNPs

[0218] The 775 SNPs identified from the pathway analysis step were then examined further to determine which SNPs were most relevant to discriminating the groups. A linear classifier forms a hyperplane on the feature space of variables separating the two classes (ASD subjects versus controls). SNPs that have the greatest mean difference between populations are good candidate SNPs for group separation. In this analysis, Bonferroni correction was used, with the p-value set at 0.05/775, which was rounded down to 1 x 10^-5.

[0219] This procedure, however, does not ensure that the identified set of SNPs is linearly independent. In order to address such collinearity, the covariance matrix of SNPs was calculated, as were all covariance matrices with one SNP removed iteratively. The covariance matrix is a real symmetric, positive semi-definite matrix, which mandates that its eigenvalues are greater than or equal to zero. Using a property of linear algebra, namely that the trace of the matrix is equal to the sum of the eigenvalues, the contribution to the total variance of each SNP was determined by removing the SNP and calculating the difference between the trace of the covariance matrix with and without that SNP. The SNPs that contributed least to the trace of the covariance matrix were thereby removed. This process was continued until the covariance matrix was full rank. In this way the remaining 237 SNPs were not linearly dependent on each other.
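The pruning loop in [0219] can be sketched in plain Python as below. It relies on the fact that removing one SNP lowers the trace of the covariance matrix by exactly that SNP's variance (its diagonal entry), so "contributes least to the trace" is read here as "has the smallest variance". The rank routine, tolerance, function names and toy data are assumptions made for illustration; a practical implementation would use a numerical linear algebra library.

```python
def covariance_matrix(columns):
    """Sample covariance matrix of a list of equal-length SNP columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]


def matrix_rank(matrix, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    rank = 0
    for col in range(len(m[0])):
        pivot = max(range(rank, len(m)), key=lambda r: abs(m[r][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def prune_to_full_rank(columns):
    """Drop the SNP contributing least to the trace until the covariance is full rank."""
    kept = list(range(len(columns)))
    while len(kept) > 1:
        cov = covariance_matrix([columns[i] for i in kept])
        if matrix_rank(cov) == len(kept):
            break
        # Trace drop on removing SNP k equals its variance cov[k][k].
        weakest = min(range(len(kept)), key=lambda k: cov[k][k])
        del kept[weakest]
    return kept


# Toy data: the middle SNP is constant across individuals, so it adds nothing to the trace.
columns = [[0, 1, 3, 1], [3, 3, 3, 3], [1, 0, 0, 3]]
kept_indices = prune_to_full_rank(columns)
```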

[0220] It should be noted that the SNP weights were not assumed to be Gaussian. The distributions of the weights for each of the SNPs were also examined by taking random subsamples of individuals and their genetic data, which were used to train the classifier, providing weights for each SNP with each training set. This was iteratively run 100,000 times and a histogram of the weights for each SNP was plotted. This allowed examination of the distribution of the weights for each SNP, allowing a determination to be made of the confidence interval for each SNP.

Example 9 - Formula for the classifier & classifier performance

(a) Formula for the classifier

$$Y^{j} = W_{0} + \sum_{i} W_{i}\,\mathrm{SNP}_{i}^{j}$$

where $Y^{j}$ and $W_{i}$ correspond to the weighted output for individual j and the regression coefficient (weight) for each SNP, respectively, and $\mathrm{SNP}_{i}^{j}$ is the value of SNP i for individual j.

[0221] That is, the sum of weight $W_{i}$ x $\mathrm{SNP}_{i}^{j}$ (the '0,1,3' value of the relevant allele) + an offset $W_{0}$ (determined by the least squares analysis). Therefore, the weighting can be negative, so that a more deleterious effect is not necessarily assumed to be related to the minor allele. It can be either the least or the most deleterious and the offset can also change the contribution of those SNPs to the clinical phenotype.

[0222] As herein described, an affected individual was given a value of 10 and an unaffected individual a value of -10, to provide a sufficiently large separation to maximize the distance between means. Thus, given the formula above:

Let C denote the group of controls, let A denote the group of affected individuals, and let W denote the weight vector defined above.

The mean of each group can be shown to be equal to

The distance between the two means of the distribution is given by

where

$\mu_{C} = E_{i \in C}(\mathrm{SNP}^{i})$ and $\mu_{A} = E_{i \in A}(\mathrm{SNP}^{i})$.

The optimal weight vector W, which maximizes the distance between the two groups, can be shown to correspond to the eigenvector with the maximum eigenvalue to the matrix

[0223] As this value is independent of $W_{0}$, $W_{0}$ can be chosen such that the means of the two population distributions are symmetric about the origin. It is also evident that a scale factor can be chosen to place the two means at an arbitrary but symmetric location about the origin. Hence, the choice of the mean value for training is arbitrary, provided that the X values have no physical significance; that is, they do not measure a patient variable.
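The relationships stated in this example can be written out explicitly. The following is a reconstruction from the definitions in the text rather than a reproduction of the original equations: it assumes the classifier output is the affine function of the SNP vector described in [0221], and the rank-one matrix M is one form consistent with the eigenvector statement, adopted here as an assumption.

```latex
\begin{align*}
% Classifier output for individual j (SNP^j is the 0/1/3-coded vector)
Y^{j} &= W^{\top}\mathrm{SNP}^{j} + W_{0} \\
% Group means follow by linearity of expectation
E_{j \in C}\!\left[Y^{j}\right] &= W^{\top}\mu_{C} + W_{0}, \qquad
E_{j \in A}\!\left[Y^{j}\right] = W^{\top}\mu_{A} + W_{0} \\
% Distance between the means: the offset W_0 cancels
d(W) &= W^{\top}(\mu_{A} - \mu_{C}) \\
% Assumed form: for unit-norm W, d(W)^2 = W^T M W is maximized by the top eigenvector of M
M &= (\mu_{A} - \mu_{C})(\mu_{A} - \mu_{C})^{\top}, \qquad W \propto \mu_{A} - \mu_{C} \\
% Offset that places the two group means symmetrically about the origin
W_{0} &= -\tfrac{1}{2}\,W^{\top}(\mu_{A} + \mu_{C})
\end{align*}
```

With this choice of $W_{0}$ the group means sit at $\pm d(W)/2$, and multiplying both $W$ and $W_{0}$ by $20/d(W)$ moves them to the training targets of 10 and -10.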


[0224] Using pathway analysis, a genetic diagnostic classifier was generated based on a linear function of 237 SNPs (Table 1) that accurately distinguished ASD from controls within a CEU cohort. This same diagnostic classifier was able to correctly predict and identify ASD individuals with accuracy exceeding 85.8% and 84.3% in the unseen CEU and TSI cohorts, respectively. This classifier was able to predict ASD group membership in subjects derived from two independent datasets with an accuracy of 71.7%. However, the classifier was less accurate at predicting ASD in the genetically distinct Han Chinese cohort, which may be explained by differences in allelic prevalence. While only 627 SNPs significantly differed between the TSI and CEU cohorts, this figure increased to 116,753 SNPs between the CEU and Han Chinese. Interestingly, parents and siblings of ASD-CEU individuals fell as distinct groups between the ASD and controls, reinforcing a genetic basis for ASD with neurobehavioral abnormalities reported in parents of ASD individuals.

[0225] There was considerable overlap in the pathways implicated in both the CEU and Han Chinese populations. The analysis demonstrated that SNPs in the Wnt signaling pathway contributed to a diagnosis of ASD in the CEU cohort, but not in the Han Chinese population. Completion of diagnostic classification studies for other ethnic groups will invariably aid in identification of common pathological mechanisms for ASD.

[0226] The SNPs contributing most to diagnosis in our classifier corresponded to genes for KCNMB4, GNAO1, GRM5, INPP5D, and ADCY8. The three SNPs that markedly skewed an individual towards ASD were related to the genes coding for KCNMB4, GNAO1, and GRM5. Homozygosity for the KCNMB4 SNP carries a higher risk of ASD than SNPs related to GNAO1 and GRM5. By contrast, a number of SNPs protected against ASD, including rs8053370 (GNAO1), rs12317962 (KCNMB4), rs6483362 and rs905646 (GRM5). KCNMB4 is a potassium channel that is important in neuronal excitability and has been implicated in epilepsy and dyskinesia 6,7. It is highly expressed within the fusiform gyrus, as well as in superior temporal, cingulate, and orbitofrontal regions (Allen Human Brain Atlas, Allen Institute for Brain Science), which are areas implicated in face identification and emotion face processing deficits seen in ASD. GNAO1 protein is a subgroup of Gα(o), a G-protein that couples with many neurotransmitter receptors. Gα(o) knockout mice exhibit "autism-like" features, including impaired social interaction, poor motor skills, anxiety and stereotypic turning behaviour 9. GNAO1 has also been shown to have a role in nervous development, co-localizing with GRIN1 at neuronal dendrites and synapses 10, and interacting with GAP-43 at neuronal growth cones 11, with increased levels of GAP-43 demonstrated in the white matter adjacent to the anterior cingulate cortex in brains from ASD patients 12.

[0227] In the findings disclosed herein, GRM5 SNPs have both a contributory (rs11020772) and protective (rs905646, rs6483362) effect on ASD. GRM5 is highly expressed in hippocampus, inferior temporal gyrus, inferior frontal gyrus and putamen (Allen Human Brain Atlas), regions implicated in ASD brain MRI studies 13. GRM5 has a role in synaptic plasticity, modulation of synaptic excitation, innate immune function, and microglial activation 14-17. GRM5 positive allosteric modulators can reverse the negative behavioral effects of NMDA receptor antagonists, including stereotypies, sensory motor gating deficits, and deficits in working, spatial and recognition memory 18, features described in ASD 19,20. With regard to GRM5's involvement with neuroimmune function, this receptor is expressed on microglia 14,21, with microglial activation demonstrated by us and others in frontal cortex in ASD 22,23.

[0228] Further, as GRM5 signalling is mediated via signalling through GPCRs, a possible interaction between GNAO1 and GRM5 is plausible. Genes such as PLCB2, ADCY2, ADCY5 and ADCY8 encode proteins involved in G-protein signalling. Given this association, GRM5 may represent a pivotal etiological target for ASD; however, further work is needed in demonstrating these potential interactions and contribution to glutamatergic dysregulation in ASD. In conclusion, within genetically homogeneous populations, our predictive genetic classifier obtained a high level of diagnostic accuracy. This demonstrates that genetic biomarkers can correctly classify ASD from non-ASD individuals. Further, the approach of identifying groups of SNPs that populate known KEGG pathways, as disclosed herein, has identified potential cellular processes that are perturbed in ASD, which are common across ethnic groups. Finally, a small number of genes with various SNPs of influential weighting were identified that strongly determine whether a subject fell within the control or ASD group. Overall these findings indicate that an SNP-based test may allow for early identification of ASD.

[0229] A predictive classifier as described herein is a useful tool for screening at birth or during infancy to provide an index of "at-risk status", including probability estimates of ASD likelihood. Identifying clinical and brain-based developmental trajectories within such a group would provide the opportunity to investigate potential psychological, social and/or pharmacological interventions to prevent or ameliorate the disorder.

Example 10 - Population stratification

[0230] Population stratification (i.e., differences in ethnicity between the control and Autism Spectrum Disorder (ASD) populations) was not considered to have a negative impact on the findings as hereinbefore described. Given the genetic variation across different ancestral populations, ethnicity can be an important confound that needs to be addressed in genetic studies of this kind. However, the use of "relatives" as controls in isolation is highly problematic. Use of a normal (independent) control population of comparable ethnicity, determined from ancestral markers, is more important and is particularly relevant to the generation of a diagnostic classifier. Additional analyses below demonstrate the findings as hereinbefore described are not explained by population stratification and that by using machine-learning principles, single nucleotide polymorphisms (SNPs) with relatively weak predictive values can be combined into a stronger and more accurate classifier. The findings described herein also demonstrate that using parents and siblings for comparison with ASD subjects is problematic given the overlap in genotype and phenotype that is seen between these groups. The findings described herein also demonstrate that the approach taken, combined with analyses comparing ASD subjects to their relatives, is more informative with regard to identifying protective, as well as at risk, alleles.

[0231] The use of independent control samples, as opposed to "related" controls is also a common strategy employed in scientific investigation. This point is highlighted by recent evidence suggesting that, although parents and unaffected siblings may not strictly meet diagnostic criteria, they exhibit autistic features (Losh M et al. Arch Gen Psychiatry 2009; 66(5):518-526), a finding that is mirrored by previous analysis demonstrating that first degree relatives fall in-between normal controls and probands for ASD2 (see Figures 3b and 6).

[0232] It is noted that about 30 SNPs are able to classify patients versus controls with 58% accuracy. It is important to clarify that this statistic is relevant to the comparison of cases versus independent controls. By contrast, when "related controls" are used, the appropriate statistic from the data described herein is approximately 60%, but only when the full set of 237 SNPs of Table 1 was included in the analysis.

[0233] As stated above, using "related controls" is problematic as it skews and underestimates genetic diversity within ethnic populations when attempting to look for candidate risk and protective genes for disease. Evidence against using parents as controls in such studies includes the following:

[0234] In a recent paper, Sullivan and colleagues demonstrated that families of children with ASD had a greater incidence of other psychiatric disorders, and vice versa. They concluded that schizophrenia, bipolar disorder and ASD shared common aetiologic factors (Sullivan PF, et al. Arch Gen Psychiatry 2012; 69(11): 1099-1103).

[0235] Family and twin studies of ASD have shown increased incidence of ASD traits in relatives of affected individuals. In their review, Happé and Ronald (Neuropsychol Rev 2008; 18(4):287-304) indicated that there are consistent findings of higher rates of communication, language and social impairments in relatives of patients with ASD.

[0236] In order to deal with the confound of SNPs that overlap between cases and "relative controls", previous studies have used the transmission disequilibrium test (TDT) in their analyses. It should be noted, however, that TDT analysis has its own problems, particularly if there is evidence for assortative mating.

[0237] Consideration was also given to ethnic diversity. In the studies described herein, care was taken to ensure ethnic homogeneity among ASD and control individuals. In doing so, SNP markers were identified that separated out the HapMap ethnic groups in order to confirm the ethnic origin of each individual in the analysis. Subjects were divided into those of Central European (CEU), Tuscan (TSI) and Han Chinese (HAN) backgrounds. The data show that the classifier was sensitive to the effects of ethnicity, and while the classifier performed well for CEU and TSI groups, which are genetically similar, its performance was decreased in the ethnically different HAN population. Interestingly, the CEU and TSI groups showed major differences in SNP allele rates on chromosome 2, which were not related to our classifier.

[0238] The difference in classification performance across ethnic groups is not attributed to population stratification. To the contrary, ethnic differences necessitate the development of separate and ethnically distinct genetic classifiers. Importantly, the analyses described herein identified overlapping cellular and molecular pathways across ethnically distinct populations. This suggests that, while SNPs coding for the genes relevant to these pathways may differ between ethnic groups, the pathways relevant to the underlying pathophysiology of ASD may be common.

[0239] To ascertain whether Estonian subgroups may be a particular problem in these studies, the following analyses are presented. Figure 7 shows a principal components analysis (PCA) of SNP data for the available HapMap populations. The results demonstrate that PCA adequately separates the various ethnic groups, which are represented as separable clusters. This technique performs as expected, as it has been used extensively to detect and correct for population stratification effects in GWAS (Price AL, et al. Nat Genet 2006; 38(8):904-909).

[0240] The same PCA was also performed on all 237 SNPs identified by the classifier (see Table 1), for all 'white non-hispanics' from each of the following cohorts: AGRE, SFARI and WTBC58. As shown in Figure 8, the SNP data were plotted against the two major principal components (PC1 and PC2), and the results demonstrate that there are no population stratification effects. Furthermore, PC1 accounts for 1.86% of the variance and PC2 accounts for 1.69% of the variance.
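The PCA step described above can be illustrated with a minimal sketch. It is not the procedure used in the studies: the genotypes are simulated, the two populations and their allele frequencies are invented, and the helper names (`simulate_genotypes`, `top_components`) are hypothetical. The sketch extracts the two leading principal components by power iteration and reports the fraction of total variance each one accounts for.

```python
import random

def simulate_genotypes(n, freqs, rng):
    """Draw n individuals; each genotype is a 0/1/2 count of the coded allele."""
    return [[(rng.random() < f) + (rng.random() < f) for f in freqs]
            for _ in range(n)]

def center_columns(rows):
    """Subtract each SNP's mean so that every column has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix (SNP by SNP) of column-centered data."""
    n, p = len(centered), len(centered[0])
    cov = [[0.0] * p for _ in range(p)]
    for row in centered:
        for i in range(p):
            for j in range(p):
                cov[i][j] += row[i] * row[j] / (n - 1)
    return cov

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def top_components(cov, k, iters=300, seed=1):
    """Leading k (eigenvalue, eigenvector) pairs by power iteration with deflation."""
    rng = random.Random(seed)
    work = [row[:] for row in cov]
    pairs = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in work]
        for _ in range(iters):
            w = mat_vec(work, v)
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigval = sum(a * b for a, b in zip(v, mat_vec(work, v)))
        pairs.append((eigval, v))
        # Remove the component just found so the next pass finds the next one.
        for i in range(len(work)):
            for j in range(len(work)):
                work[i][j] -= eigval * v[i] * v[j]
    return pairs

# Two simulated populations with very different allele frequencies (assumption).
rng = random.Random(0)
pop_a = simulate_genotypes(60, [0.1] * 10, rng)
pop_b = simulate_genotypes(60, [0.9] * 10, rng)

centered = center_columns(pop_a + pop_b)
cov = covariance(centered)
(eig1, pc1), (eig2, pc2) = top_components(cov, 2)

total_var = sum(cov[i][i] for i in range(len(cov)))
frac1, frac2 = eig1 / total_var, eig2 / total_var

scores_pc1 = mat_vec(centered, pc1)
mean_a = sum(scores_pc1[:60]) / 60
mean_b = sum(scores_pc1[60:]) / 60
print(f"PC1 explains {frac1:.1%}, PC2 explains {frac2:.1%}")
print(f"mean PC1 score: population A {mean_a:.2f}, population B {mean_b:.2f}")
```

In this toy setting the two populations differ strongly, so PC1 carries most of the variance and the group means fall on opposite sides of zero. Small variance fractions with overlapping groups, as reported for the cohorts above, are the pattern expected when no such structure is present.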

[0241] A two-sample Kolmogorov-Smirnov test was performed for each sample for both principal components (PC1 and PC2). Even at p=0.4, the null hypothesis (i.e., that the two samples derive from the same distribution) could not be rejected, adding further support to the notion that population stratification is unlikely to explain the differences we identified between ASD and normative samples.
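A two-sample Kolmogorov-Smirnov comparison of this kind can be sketched as follows. The sketch assumes the plain asymptotic Kolmogorov distribution for the p-value, which may differ from whatever software was actually used, and the two "cohorts" are simulated principal component scores rather than real data. The function names are hypothetical.

```python
import math
import random

def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value equal to x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_pvalue(d, n, m, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and the p-value is effectively 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * total))

# Simulated PC scores for two cohorts drawn from the same distribution (assumption).
rng = random.Random(7)
cohort_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
cohort_b = [rng.gauss(0.0, 1.0) for _ in range(200)]

d_stat = ks_statistic(cohort_a, cohort_b)
p_value = ks_pvalue(d_stat, len(cohort_a), len(cohort_b))
reject_at_0_4 = p_value < 0.4
print(f"D = {d_stat:.3f}, p = {p_value:.3f}, reject at 0.4: {reject_at_0_4}")
```

Failing to reject even at a lenient threshold such as 0.4 is a stronger statement than failing to reject at 0.05, which is the point the paragraph makes.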

[0242] To further demonstrate that there was nothing exceptional in the subsample, a cross validation scheme was used (Cordell HJ. Nat Rev Genet 2009; 10(6):392-404), in which a random sample of 80% of all the white non-hispanic (WNH) data across all of the cohorts (involving >4,000 subjects) was used to train a classifier; the remaining 20% of the population was then used to determine classification accuracy (validation sample). This technique was performed a large number of times (10,000 iterations). Figure 9 shows the training performance and the validation performance for all 10,000 classifiers. In particular, as shown in Figure 10, the performance on validation cases versus control cases shows that the classifier is correctly classifying each of these groups with the same high accuracy, with the 90% confidence interval for classifier correct classification performance being [CI: 68.08-73.61]. In Figure 11, the classification performance on SFARI versus AGRE validation samples demonstrates similar performance on all WNH individuals in all cohorts, further indicating that population stratification is unlikely.
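The repeated 80%/20% scheme can be sketched as below. Everything specific here is an assumption made for illustration: the data are a single simulated score per subject, the "classifier" is a midpoint threshold between class means standing in for the real SNP classifier, the loop runs 500 times rather than 10,000, and the interval is taken as the 5th to 95th percentile of validation accuracy.

```python
import math
import random
import statistics

def split_80_20(items, rng):
    """Shuffle a copy of the data and split it into 80% training, 20% validation."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def train_threshold(train):
    """Stand-in classifier: the midpoint between the case and control means."""
    cases = [x for x, label in train if label == 1]
    controls = [x for x, label in train if label == 0]
    return (statistics.mean(cases) + statistics.mean(controls)) / 2

def accuracy(threshold, data):
    """Fraction of subjects whose predicted label matches the true label."""
    return sum((x > threshold) == (label == 1) for x, label in data) / len(data)

def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list, q in [0, 100]."""
    rank = math.ceil(q / 100 * len(sorted_values)) - 1
    return sorted_values[min(len(sorted_values) - 1, max(0, rank))]

# Simulated one-dimensional scores: label 1 = case, label 0 = control (assumption).
rng = random.Random(42)
data = ([(rng.gauss(0.5, 1.0), 1) for _ in range(200)]
        + [(rng.gauss(-0.5, 1.0), 0) for _ in range(200)])

N_ITER = 500  # the text reports 10,000 iterations; fewer are used to keep this fast
accs = []
for _ in range(N_ITER):
    train, valid = split_80_20(data, rng)
    accs.append(accuracy(train_threshold(train), valid))

accs.sort()
ci_low, ci_high = percentile(accs, 5), percentile(accs, 95)
median_acc = statistics.median(accs)
print(f"median accuracy {median_acc:.3f}, 90% interval [{ci_low:.3f}, {ci_high:.3f}]")
```

Because each iteration validates only on subjects held out from training, the spread of `accs` reflects how much the estimate depends on which subjects happen to fall in the validation sample.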

[0243] The data presented in Figure 3b demonstrate that the classifier was able to discern siblings as separable from probands. Given these probands and their siblings are from the same ethnic origin, in the absence of other factors that relate to the disorder, it would follow that the probands and their siblings would have the same background SNP allele rates. The classifier represents a linear function of the 237 identified SNPs and their determined weightings. With the ethnic background SNP rates being the same, the expected value and standard deviation of the two distributions for "related-controls" (parents and siblings) would be comparable; that is, the classifier should not be able to distinguish siblings and probands. Clearly, however, a separation of parents and siblings from probands is demonstrated, which would suggest that the probands would not be distinguishable from their siblings based on ethnicity alone.
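A linear function of SNPs and weightings can be written as a weighted sum. The sketch below is illustrative only: it uses three weightings copied from Table 5, and it assumes that each SNP is coded as the number of copies (0, 1 or 2) of the listed allele, that untyped SNPs contribute zero, and that there is no intercept. None of these coding choices is stated in the text, and `classifier_score` is a hypothetical name.

```python
# Three weightings copied from Table 5; all other SNPs are omitted for brevity.
WEIGHTS = {
    "rs769052_G": -0.104746,
    "rs7864216_A": -0.106676,
    "rs919741_A": -0.153087,
}

def classifier_score(allele_counts, weights=WEIGHTS, bias=0.0):
    """Weighted sum of allele counts.

    allele_counts maps "rsID_allele" to the number of copies carried (0, 1 or 2).
    SNPs missing from allele_counts are treated as zero copies (assumption).
    """
    return bias + sum(w * allele_counts.get(snp, 0) for snp, w in weights.items())

# Hypothetical subject: two copies of rs769052 G, one copy of rs919741 A.
example_score = classifier_score({"rs769052_G": 2, "rs919741_A": 1})
print(f"classifier score: {example_score:.6f}")
```

Because the score is a sum over many SNPs, two groups with identical allele rates at every SNP would produce score distributions with the same mean and spread, which is the basis of the argument in the paragraph above.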

[0244] To further confirm the SNPs identified were relevant to diagnosis, additional analysis was undertaken to determine whether one could predict some of the core symptoms of ASD (language delay and gaze avoidance) within the WNH AGRE and SFARI ASD populations using the SNPs identified in the classifier. Further analysis was also performed to determine if subgroups of ASD could be distinguished based on individual symptoms. As shown in Table 4, below, and Figures 12A-D, we are able to predict these core features with reasonable accuracy (almost 60% accuracy; 90% confidence interval [CI: 51.42-67.92] and with median correct classification performance of 59.67%). Therefore, even within the AGRE and SFARI ASD populations, the SNPs are informative about core features in the disorder, which suggests that these SNPs are relevant to ASD and are not an effect of ethnic differences.

Performance of classifier in predicting ASD symptoms

[0245] Table 4 shows classifier performance in predicting symptoms. Poor performance on Stereotypical Speech and Immediate Echolalia could be due to the fact that there are other SNPs or CNVs that need to be included in our model or that these symptoms may have a large environmental component.

AGP DATA (6626 Individuals - 1241 Cases - 5585 Relatives (Mainly Parents))

[0246] The ability of the identified SNPs to construct a classifier was also examined in the Autism Genome Project (AGP) dataset, a third independent sample. Figure 13 shows the distribution of related controls and probands plotted against the two main principal components from this analysis. Figure 14 shows the classifier output for the cases and related controls, indicating some separation but with a high degree of overlap.

[0247] These further analyses, together with the findings as hereinbefore described, demonstrate that the performance of the classifier cannot be attributed merely to population effects.

[0248] In conclusion, the SNPs identified show ability to distinguish ASD versus controls in 3 independent cohorts of individuals of White Non-Hispanic ancestry. Furthermore, the results indicate that the identified SNPs can be used to distinguish individuals with certain features, including marked language delay and gaze avoidance.

Example 11 - linkage disequilibrium

[0249] SNPs in high linkage disequilibrium (LD) across the SFARI and AGRE sets were removed in the original data set. As not all SNPs of interest were genotyped across all validation sets, SNPs in linkage disequilibrium with them that were available across all data sets were chosen instead, as shown in Table 5, below.
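Choosing a substitute SNP on the basis of linkage disequilibrium can be sketched as follows. The text does not say how LD was measured or what cut-off was applied, so this sketch makes its own assumptions: LD is approximated by the squared correlation (r²) between allele counts in a reference panel where both SNPs are typed, the cut-off is 0.8, and the SNP names and genotypes are invented.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two vectors of 0/1/2 allele counts."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A monomorphic SNP carries no information about any other SNP.
        return 0.0
    return sxy * sxy / (sxx * syy)

def pick_proxy(target, candidates, genotypes, min_r2=0.8):
    """Return the candidate in strongest LD with the target, or None if none reaches min_r2."""
    best, best_r2 = None, min_r2
    for snp in candidates:
        r2 = r_squared(genotypes[target], genotypes[snp])
        if r2 >= best_r2:
            best, best_r2 = snp, r2
    return best

# Hypothetical reference panel of six individuals (all names and values invented).
reference_panel = {
    "snp_of_interest": [0, 1, 2, 1, 0, 2],
    "candidate_1": [0, 1, 2, 1, 0, 2],
    "candidate_2": [1, 0, 1, 2, 1, 0],
}
proxy = pick_proxy("snp_of_interest", ["candidate_1", "candidate_2"], reference_panel)
print(f"selected proxy: {proxy}")
```

The same `r_squared` function could also serve the first step described above, removing one SNP from each pair that is in high LD, by dropping any SNP whose r² with an already retained SNP exceeds the cut-off.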

Table 5 - List of 192 SNPs and their weightings

SNP             Weighting
rs10505029_G    -0.077891
rs4145903_A     -0.085711
rs11072416_G    -0.086911
rs1430158_G     -0.097637
rs1003854_G     -0.104695
rs769052_G      -0.104746
rs7864216_A     -0.106676
rs4748444_G     -0.110758
rs4575213_A     -0.117188
rs8053370_G     -0.118313
rs17193_A       -0.128641
rs919741_A      -0.153087
rs6139034_A     -0.153803
rs7108524_G     -0.161021
rs1881628_A     -0.172548
rs750438_A      -0.175337
rs2283492_A     -0.176971
rs2503220_G     -0.186014
rs756944_A      -0.186256
rs11102321_G    -0.190884
rs7926083_C     -0.222941
rs3748386_A     -0.223168
rs1002424_G     -0.225631
rs2179871_A     -0.226334
rs2239316_G     -0.241987
rs7079293_A     -0.254784
rs11602256_A    -0.257169
rs4643498_A     -0.257191
rs11048476_G    -0.259268
rs12972670_G    -0.293899
rs1937671_A     -0.294107
rs10770675_A    -0.294482
rs260808_A      -0.305394
rs7842798_G     -0.310071
rs1872902_A     -0.331569
rs2302898_A     -0.332808
rs17682073_G    -0.353338
rs7918241_A     -0.361916
rs12207523_A    -0.371119
rs7971175_G     -0.403818
rs9347587_G     -0.40845
rs2271986_A     -0.433163
rs1659506_A     -0.462193
rs7677751_A     -0.476296
rs11583646_A    -0.505221

[0250] With respect to Table 5, [marker illegible] denotes an SNP that is indicative that the subject has ASD or a predisposition to developing ASD; and [marker illegible] denotes an SNP that is indicative of the absence of ASD or protection from developing ASD. The letter that follows the SNP ID number ("rs#") represents the specific allele. For example, "A" represents adenosine, "C" represents cytidine and "G" represents guanosine.

References

1. Geschwind DH, Sowinski J, Lord C, et al. Am J Hum Genet 2001;69:463-6.

2. Lord C, Rutter M, Le Couteur A. J Autism Dev Disord 1994;24:659-85.

3. The International HapMap Project. Nature 2003;426:789-96.

4. Purcell S, Neale B, Todd-Brown K, et al. Am J Hum Genet 2007;81:559-75.

5. Perez-Perez JM, Candela H, Micol JL. Trends Genet 2009;25:368-76.

6. Cavalleri GL, Weale ME, Shianna KV, et al. Lancet Neurol 2007;6:970-80.

7. Lee US, Cui J. J Physiol 2009;587:1481-98.

8. Monk CS, Weng SJ, Wiggins JL, et al. J Psychiatry Neurosci 2010;35:105-14.

9. Jiang M, Gold MS, Boulay G, et al. Proc Natl Acad Sci U S A 1998;95:3269-74.

10. Masuho I, Mototani Y, Sahara Y, et al. Dev Dyn 2008;237:2415-29.

11. Yang H, Wan L, Song F, Wang M, Huang Y. Int J Biochem Cell Biol 2009;41:1495-501.

12. Zikopoulos B, Barbas H. J Neurosci 2010;30:14595-609.

13. Toal F, Daly EM, Page L, et al. Psychol Med 2010;40:1171-81.

14. Drouin-Ouellet J, Brownell AL, Saint-Pierre M, et al. Glia 2011;59:188-99.

15. Le Duigou C, Holden T, Kullmann DM. Neuropharmacology 2010.

16. Popkirov SG, Manahan-Vaughan D. Cereb Cortex 2010.

17. Suzuki E, Okada T. Brain Res 2010;1313:45-52.

18. Fowler SW, Ramsey AK, Walker JM, et al. Neurobiol Learn Mem 2010.

19. Sacco R, Curatolo P, Manzi B, et al. Autism Res 2010;3:237-52.

20. Boyd BA, Baranek GT, Sideris J, et al. Autism Res 2010;3:78-87.

21. Byrnes KR, et al. Glia 2009;57:550-60.

22. Vargas DL, et al. Ann Neurol 2005;57:67-81.

23. Morgan JT, Chana G, Pardo CA, et al. Biol Psychiatry 2010;68:368-76.