Title:
METHOD OF USE OF YKL-40 IN SCOLIOSIS
Document Type and Number:
WIPO Patent Application WO/2019/140517
Kind Code:
A1
Abstract:
Disclosed herein are novel molecular markers in the CHI3L1 gene associated with idiopathic scoliosis (IS). Accordingly, the present disclosure concerns novel methods of identifying subjects at risk of developing IS or suffering from IS and of genotyping and classifying IS subjects into genetic and functional groups. Also provided are compositions, DNA chips and kits for applying the methods. Novel methods, uses and compositions for the prevention or treatment of IS by increasing the levels of YKL-40 polypeptide are also disclosed.

Inventors:
SAMUELS MARK E (CA)
NADA DINA TAREK (EG)
MOREAU ALAIN (CA)
Application Number:
PCT/CA2019/050054
Publication Date:
July 25, 2019
Filing Date:
January 16, 2019
Assignee:
CHU SAINTE JUSTINE (CA)
International Classes:
C12Q1/6883; A61K38/17; A61P19/00; C07K14/47; C12Q1/68; C12Q1/6876; G01N33/48; G01N33/53
Domestic Patent References:
WO2009070764A12009-06-04
WO2015032005A12015-03-12
WO2015032004A12015-03-12
Other References:
TSAI ET AL.: "CHI3L1 polymorphisms associate with asthma in a Taiwanese population", BMC MEDICAL GENETICS, vol. 15, no. 86, 23 July 2014 (2014-07-23), pages 1 - 8, XP021194396
SALES DE GAUZY ET AL.: "Fasting total ghrelin levels are increased in patients with adolescent idiopathic scoliosis", SCOLIOSIS, vol. 10, no. 33, 30 November 2015 (2015-11-30), pages 1 - 5, XP055625418
NADA ET AL.: "Association of circulating YKL-40 levels and CHI3L1 gene variants with the risk of spinal deformity progression in adolescent idiopathic scoliosis", SCOLIOSIS AND SPINAL DISORDERS, vol. 13, no. 1, 4 May 2019 (2019-05-04), XP055625421
KAZAKOVA ET AL.: "YKL-40 - a novel biomarker in clinical practice?", FOLIA MEDICA LI, vol. 51, no. 1, January 2009 (2009-01-01), pages 5 - 14, XP055625429
AKOUME ET AL.: "Cell-based Assay Protocol for the Prognostic Prediction of Idiopathic Scoliosis Using Cellular Dielectric Spectroscopy", JOURNAL OF VISUALIZED EXPERIMENTS, vol. 80, 16 October 2013 (2013-10-16), pages 1 - 9, XP055323360, DOI: 10.3791/50768
Attorney, Agent or Firm:
LAVERY, DE BILLY LLP (CA)
Claims:
CLAIMS:

1. A method of determining whether a subject is at risk of developing Idiopathic scoliosis (IS) comprising: (a) (i) determining the level of YKL-40 protein in a biological sample from the subject; and/or (ii) determining the presence or absence of at least one variant in at least one allele of the CHI3L1 gene of the subject, or a marker in linkage disequilibrium therewith; and (b) determining the risk of developing IS based on the level of YKL-40 detected and/or based on the presence or absence of the at least one variant.

2. A method of determining whether a subject is at risk of developing Idiopathic scoliosis (IS) comprising: (i) determining the level of YKL-40 protein in a biological sample from the subject; (ii) determining the level of ghrelin protein in a biological sample from the subject; and (iii) determining the risk of developing IS based on the ratio between the level of ghrelin protein and the level of YKL-40 protein.

3. The method of claim 1 or 2, wherein the risk of developing (IS) is a risk of developing a severe scoliosis.

4. The method of claim 1 or 2, wherein the risk of developing (IS) is a risk of severe scoliosis progression.

5. The method of any one of claims 1 to 4, further comprising classifying the subject in the FG1, FG2 or FG3 endophenotype group.

6. A method of classifying a subject suffering from IS or at risk of suffering from IS in the FG1, FG2 or FG3 endophenotype group comprising (a) (i) determining the level of secreted YKL-40 protein in a biological sample from the subject; and/or (ii) determining the presence or absence of at least one variant in at least one allele of the CHI3L1 gene of the subject, or a marker in linkage disequilibrium therewith, and (b) classifying the subject based on the level of secreted YKL-40 protein and/or based on the presence or absence of the at least one variant in the CHI3L1 gene.

7. A method of classifying a subject suffering from IS or at risk of suffering from IS in the FG1, FG2 or FG3 endophenotype group comprising (i) determining the level of secreted ghrelin in a blood sample from the subject; and (ii) classifying the subject in the FG1 endophenotype when the level of circulating ghrelin is lower than the level in a control sample or reference value; or (ii) classifying the subject based on the level of secreted ghrelin protein.

8. A method of genotyping a subject comprising determining the genotype of the subject for at least one variant in the CHI3L1 gene.

9. The method of any one of claims 1 to 8, wherein the at least one variant comprises at least one of the following single nucleotide polymorphisms (SNPs): rs55700740, rs7542294, rs946259, rs880633, rs1538372, rs4950881, rs10399805, rs6691378, rs946261, rs946262, rs116415868 and rs10920579.

10. The method of claim 9, wherein the at least one variant comprises at least one of the following SNPs: rs55700740, rs946259, rs880633, rs1538372, rs4950881, rs946261, rs946262, rs10920576, or any combination thereof.

11. The method of claim 10, wherein the at least one variant is rs1538372, and wherein the presence of this variant is associated with a higher likelihood that the subject belongs to endophenotype group FG1 relative to endophenotype group FG2 or FG3.

12. The method of claim 10, wherein the at least one variant is rs946262 and/or rs10920579, and wherein the presence of at least one of these two variants is associated with an increased risk of scoliosis if the subject is a male.

13. The method of any one of claims 1 to 10, comprising determining the presence or absence of at least two variants.

14. The method of any one of claims 1 to 10, comprising determining the presence or absence of at least four variants.

15. The method of any one of claims 1 to 10, comprising determining the presence or absence of at least 6 variants.

16. The method of any one of claims 1 to 10, comprising determining the presence or absence of at least 8 variants.

17. The method of any one of claims 1 to 14, wherein detecting the presence or absence of at least one variant comprises detecting the presence or absence of one of the combination of SNPs (haplotypes) identified in FIGs. 5 to 10.

18. The method of claim 17, wherein the subject belongs to the endophenotype FG3 group, and wherein the haplotype G-G-A-G-G-A (rs880633|rs1538372|rs4950881|rs10399805|rs6691378|rs946261) is indicative of a reduced risk of spinal deformity progression.

19. The method of claim 18, wherein the subject belongs to the FG3 endophenotype group.

20. The method of claim 17, wherein the presence of the haplotype A-A-G-G-G-G (rs880633|rs1538372|rs4950881|rs10399805|rs6691378|rs946261) is indicative of an increased risk of spinal deformity progression.

21. The method of claim 20, wherein the subject belongs to the FG2 endophenotype group.

22. The method of claim 17, wherein the presence of the haplotype A-A-G-G-G-G (rs10399805|rs6691378|rs946261|rs946262|rs116415868|rs10920579) is indicative of a reduced risk of spinal deformity progression.

23. The method of claim 22, wherein the subject belongs to the FG3 endophenotype group.

24. The method of any one of claims 20 to 23, wherein the subject is a female.

25. The method of any one of claims 1 to 24, wherein the subject is a pediatric subject.

26. The method of any one of claims 1 to 25, wherein the subject is at risk of developing IS.

27. The method of any one of claims 1 to 26, wherein the subject has at least one family member having IS.

28. A method of treating or preventing IS in a subject comprising increasing the level of YKL-40 protein in the subject.

29. A method of reducing a GiPCR signaling defect in a cell of a subject having Idiopathic scoliosis (IS) or at risk of developing IS comprising increasing the level of YKL-40 protein in the cell.

30. A method of increasing GiPCR signaling in a cell of a subject having Idiopathic scoliosis (IS) or at risk of developing IS comprising increasing the level of YKL-40 protein in the cell.

31. A method of reducing the effect of OPN on GiPCR signaling in a cell of a subject having Idiopathic scoliosis (IS) or at risk of developing IS comprising contacting the cells with YKL-40 protein.

32. The method of any one of claims 28 to 31, comprising administering an effective amount of (i) a YKL-40 polypeptide, (ii) a nucleic acid encoding the YKL-40 polypeptide of (i), or (iii) a cell expressing the YKL-40 polypeptide of (i) or the nucleic acid of (ii), to the subject.

33. The method of claim 32, wherein said YKL-40 polypeptide comprises an amino acid sequence having at least 70% identity with the sequence set forth in SEQ ID NO: 25 or SEQ ID NO: 26.

34. The method of claim 33, wherein said YKL-40 polypeptide comprises an amino acid sequence having at least 90% identity with the amino acid sequence set forth in SEQ ID NO: 25 or SEQ ID NO: 26.

35. The method of claim 34, wherein said YKL-40 polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 25 or SEQ ID NO: 26.

36. The method of any one of claims 28 to 35, wherein the subject belongs to the FG1 endophenotype group.

37. Use of an agent that increases the level of YKL-40 protein for (a) treating or preventing Idiopathic scoliosis (IS) in a subject or (b) the manufacture of a medicament for treating or preventing IS in a subject.

38. Use of an agent that increases the level of YKL-40 protein for (a) reducing a GiPCR signaling defect in a cell of a subject having IS or at risk of developing IS; or (b) the manufacture of a medicament for reducing a GiPCR signaling defect in a cell of a subject having IS or at risk of developing IS.

39. Use of an agent that increases the level of YKL-40 protein for (a) increasing GiPCR signaling in a cell of a subject having IS or at risk of developing IS; or (b) the manufacture of a medicament for increasing GiPCR signaling in a cell of a subject having IS or at risk of developing IS.

40. Use of an agent that increases the level of YKL-40 protein for (a) reducing the effect of OPN on GiPCR signaling in a cell of a subject having IS or at risk of developing IS; or (b) the manufacture of a medicament for reducing the effect of OPN on GiPCR signaling in a cell of a subject having IS or at risk of developing IS.

41. The use of any one of claims 37 to 40, wherein said agent is (i) a YKL-40 polypeptide, (ii) a nucleic acid encoding the YKL-40 polypeptide of (i), or (iii) a cell expressing the YKL-40 polypeptide of (i) or the nucleic acid of (ii).

42. The use of claim 41, wherein said YKL-40 polypeptide comprises an amino acid sequence having at least 70% identity with the sequence set forth in SEQ ID NO: 25 or SEQ ID NO: 26.

43. The use of claim 42, wherein said YKL-40 polypeptide comprises an amino acid sequence having at least 90% identity with the amino acid sequence set forth in SEQ ID NO: 25 or SEQ ID NO: 26.

44. The use of claim 43, wherein said YKL-40 polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 25 or SEQ ID NO: 26.

45. The use of any one of claims 37 to 44, wherein the subject belongs to the FG1 endophenotype group.

46. A composition or kit for (i) determining whether a subject is at risk of developing Idiopathic scoliosis (IS) or (ii) classifying a subject in the FG1, FG2 or FG3 endophenotype group, comprising:

(a) One or more oligonucleotide probes or primers for detecting a variant in the CHI3L1 gene;

(b) One or more restriction enzymes for detecting the presence or absence of a variant in the CHI3L1 gene;

(c) One or more antibodies for determining the level of YKL-40; and/or

(d) One or more antibodies for determining the level of ghrelin.

47. The composition of claim 46, further comprising a biological sample from a subject having IS or at risk of developing IS.

48. An oligonucleotide primer or probe for detecting at least one variant or haplotype in the CHI3L1 gene.

49. The oligonucleotide primer or probe of claim 48, wherein the at least one variant or haplotype in the CHI3L1 gene is at least one of the SNPs defined in claim 9 and/or of the haplotypes defined in claim 17.

50. A composition for (i) treating or preventing IS in a subject suffering from IS or at risk of developing IS; (ii) reducing a GiPCR signaling defect in a cell of a subject suffering from IS or at risk of developing IS; (iii) increasing GiPCR signaling in a cell of a subject suffering from IS or at risk of developing IS; and/or (iv) reducing the effect of OPN on GiPCR signaling in a cell of a subject suffering from IS or at risk of developing IS, the composition comprising an agent that increases the level of YKL-40 protein in the subject.

51. The composition of claim 50, wherein said agent is (i) a YKL-40 polypeptide, (ii) a nucleic acid encoding the YKL-40 polypeptide of (i), or (iii) a cell expressing the YKL-40 polypeptide of (i) or the nucleic acid of (ii).

52. The composition of claim 51, wherein said YKL-40 polypeptide comprises an amino acid sequence having at least 70% identity with the sequence set forth in SEQ ID NO: 25 or SEQ ID NO: 26.

53. The composition of claim 52, wherein said YKL-40 polypeptide comprises an amino acid sequence having at least 90% identity with the amino acid sequence set forth in SEQ ID NO: 25 or SEQ ID NO: 26.

54. The composition of claim 53, wherein said YKL-40 polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 25 or SEQ ID NO: 26.

55. The composition of any one of claims 50 to 54, wherein the subject belongs to the FG1 endophenotype group.

Description:
METHOD OF USE OF YKL-40 IN SCOLIOSIS

CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims the benefit of United States provisional patent application serial No. 62/617,820 filed on January 16, 2018, the content of which is incorporated by reference in its entirety.

FIELD OF THE INVENTION

The present disclosure relates to Idiopathic Scoliosis. More specifically, the present disclosure is concerned with reagents, methods, compositions and kits for the assessment and management of Idiopathic Scoliosis.

REFERENCE TO SEQUENCE LISTING

Pursuant to 37 C.F.R. 1.821(c), a sequence listing is submitted herewith as an ASCII compliant text file named 14033_172_SL_ST25.txt, that was created on January 16, 2018 and having a size of 12 kilobytes. The content of the aforementioned file named 14033_172_SL_ST25.txt is hereby incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION

Idiopathic scoliosis is a prevalent spinal deformity that affects an average of 1-4% of the global pediatric population (1). It is characterized by an abnormal three-dimensional curvature of the spine with an onset that can occur between birth and sexual maturity. Thus, it has been classified as infantile, juvenile, or adolescent based on when a curve is initiated (2). Adolescent Idiopathic Scoliosis (AIS) represents the most common form of scoliosis and occurs between the ages of 10 and 15 years, with girls affected more severely than boys (3). Although the etiology of AIS remains unclear, it is widely accepted that the syndrome is influenced by genetic factors (4, 5). The great phenotypic heterogeneity of AIS, together with the multiple loci identified so far in genetic studies (6), suggests that AIS is more likely to be multifactorial.

Phenotypic complexity and possible genetic heterogeneity have slowed progress in understanding AIS etiology using traditional genetic approaches. Interestingly, previous work has demonstrated that AIS patients present with a distinctive systemic signaling dysfunction for G inhibitory (Gi)-coupled receptors (7, 8). The differential Gi signaling dysfunction among AIS patients allows their classification into three distinct biological endophenotypes (FG1, FG2, and FG3), based on the maximum Gi signaling response in cells (osteoblasts and other cell types) exposed to Gi specific stimuli (9). The use of endophenotypes in complex diseases has the advantage of partitioning the genetic variation, thus providing the study with greater power to detect a genetic effect. "Endophenotype" is a term that was first introduced to genetics in 1966 to describe "microscopic and internal" characteristics while phenotypes describe "obvious and external" characteristics (10, 11). Endophenotypes are heritable traits that are representative of the molecular path from genes to the phenotype (12, 13).

The present description refers to a number of documents, the content of which is herein incorporated by reference in their entirety.

SUMMARY OF THE INVENTION

In the course of identifying genes involved in the differential signaling response occurring in AIS, CHI3L1 was identified as one of the genes that showed a significant differential expression among the different AIS endophenotypes.

The CHI3L1 gene (HGNC: 1932; Entrez Gene: 1116; Ensembl: ENSG00000133048; OMIM: 601525; UniProtKB: P36222; Cytogenetic band: 1q32.1) encodes the secretory factor YKL-40. YKL-40 is a member of the family "mammalian chitinase-like proteins", which correspond to glycoproteins that bind to heparin. YKL-40 was first discovered in 1989 when it was reported to be secreted in vitro by MG63 osteosarcoma cell lines in large amounts (14). YKL-40 is expressed in many tissues and is secreted by several types of solid tumors. The exact function of YKL-40 in normal tissues and in certain pathological conditions remains unknown. YKL-40 acts as a growth factor in cells involved in tissue remodeling. It may have a role in cancer cell proliferation, survival and their ability to invade surrounding tissue (15). In addition, elevated serum levels of YKL-40 have also been observed in patients with non-malignant diseases, particularly in contexts of inflammation (16). The objectives of the present work were to determine whether the differential expression of the CHI3L1 gene initially found in AIS osteoblasts is more systemic and plays a role in AIS pathogenesis. Therefore, plasma YKL-40 levels were analyzed in a large cohort of AIS patients who are distributed among the three known biological endophenotypes (FG1, FG2 and FG3). Also, YKL-40 levels were compared between AIS patients and controls, and AIS patients were further sub-classified according to their curve severity. Moreover, the same cohort was genotyped for 12 SNPs in the CHI3L1 gene to test the association of those SNPs with the different phenotypes and/or with the plasma levels of YKL-40 among the different sub-classifications of AIS patients.

The studies presented herein demonstrate a significant association between plasma YKL-40 levels, CHI3L1 gene single nucleotide polymorphisms and reduced susceptibility to the development of severe spinal deformities in the context of AIS.

Plasma YKL-40 levels of each biological endophenotype of AIS patients were compared with those of controls and, since the interaction between gender and group was statistically significant, separate analyses for females and males were performed. Surprisingly, for females no significant difference between the groups was observed, whereas in males there were significant differences among the groups (P = 0.001). Specifically, males classified in biological endophenotype FG1 showed significantly higher plasma YKL-40 levels than controls and the two other AIS endophenotypes. Furthermore, as shown herein, when AIS patients were classified based on their spinal deformity severity, the non-severe group showed statistically significant higher levels of YKL-40 than controls. Thus, there are higher circulating YKL-40 (e.g., SEQ ID NO: 26) levels in non-severe cases of AIS as well as in male AIS patients classified in the biological endophenotype FG1, which is the AIS subgroup least likely to develop a severe curvature when compared to the other endophenotypes. Finally, treatment with YKL-40 rescued Gi-coupled receptor signaling dysfunction observed in AIS osteoblasts.

Collectively, the present findings reveal a novel role for YKL-40 in AIS pathogenesis and provide a first molecular mechanism by which this glycoprotein could interfere with spinal deformity progression.

Accordingly, in a first aspect, the present disclosure concerns a method of determining whether a subject is at risk of developing Idiopathic scoliosis (IS) (e.g., AIS) comprising: (i) determining the level of YKL-40 protein in a biological sample from the subject; and/or (ii) determining the presence or absence of at least one variant in at least one allele of the CHI3L1 gene of the subject, or a marker in linkage disequilibrium therewith. The risk of developing Idiopathic scoliosis is then determined based on the level of YKL-40 protein determined in (i) and/or based on the presence or absence of the at least one variant (polymorphic marker) in the CHI3L1 gene determined in (ii).

Subjects with higher levels of YKL-40 protein have a lower risk of developing a scoliosis (or a severe scoliosis) than those with lower levels of YKL-40 protein. Thus, in embodiments, the detection of a high level of YKL-40 protein (e.g., a level above that of a control sample or reference value) is associated with a decreased risk of developing a scoliosis (e.g., a severe scoliosis) and the detection of a low level of YKL-40 protein (e.g., a level equal to or lower than that of a control sample or reference value) is associated with an increased risk of developing a scoliosis (e.g., a severe scoliosis).
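The comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the disclosure: the function name `ykl40_risk`, the returned labels, and the assumption that the sample and the reference value are expressed in the same units are all hypothetical.

```python
def ykl40_risk(level: float, reference: float) -> str:
    """Compare a measured YKL-40 level with a control/reference value.

    Hypothetical helper: both values are assumed to be in the same units.
    A level above the reference maps to a decreased risk; a level equal to
    or lower than the reference maps to an increased risk.
    """
    if level < 0 or reference < 0:
        raise ValueError("levels must be non-negative")
    return "decreased" if level > reference else "increased"


# Example with made-up numbers
print(ykl40_risk(50.0, 40.0))
print(ykl40_risk(40.0, 40.0))
```

The "equal to" case is deliberately grouped with the lower levels, following the wording of the paragraph above.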

In another aspect, the present disclosure provides a method of determining whether a subject is at risk of developing Idiopathic Scoliosis (IS) (e.g., AIS) comprising (i) determining the level of YKL-40 protein in a biological sample from the subject; (ii) determining the level of ghrelin protein in a biological sample from the subject; and (iii) determining the risk of developing IS based on the ratio between the level of ghrelin and the level of YKL-40 detected. The lower the level of ghrelin and the higher the level of YKL-40, the lower the risk of developing a scoliosis. Accordingly, in embodiments, the detection of a ratio between ghrelin and YKL-40 (ghrelin:YKL-40) below that of a control sample or reference value is indicative of a lower risk of developing a scoliosis. In embodiments, the detection of a ratio between YKL-40 and ghrelin (YKL-40:ghrelin) above that of a control sample or reference value is indicative of a lower risk of developing a scoliosis.
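As a hedged illustration of the ratio-based determination, the following Python sketch computes the ghrelin:YKL-40 ratio and compares it with a reference ratio. The function names and the example values are hypothetical; the disclosure does not prescribe units or a particular reference ratio, so both are assumed to be supplied by the user in consistent units.

```python
def ghrelin_ykl40_ratio(ghrelin: float, ykl40: float) -> float:
    """Return the ghrelin:YKL-40 ratio (hypothetical helper).

    Both levels are assumed to be measured in consistent units.
    """
    if ghrelin < 0 or ykl40 <= 0:
        raise ValueError("ghrelin must be >= 0 and YKL-40 must be > 0")
    return ghrelin / ykl40


def ratio_indicates_lower_risk(ghrelin: float, ykl40: float,
                               reference_ratio: float) -> bool:
    """True when the ghrelin:YKL-40 ratio is below the reference ratio."""
    return ghrelin_ykl40_ratio(ghrelin, ykl40) < reference_ratio


# Example with made-up numbers
print(ghrelin_ykl40_ratio(10.0, 40.0))
print(ratio_indicates_lower_risk(10.0, 40.0, reference_ratio=0.5))
```

Testing ghrelin:YKL-40 below a reference is equivalent to testing YKL-40:ghrelin above the reciprocal reference, so only one direction is implemented here.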

In embodiments, the risk of developing IS is a risk of developing a severe scoliosis. In embodiments, the risk of developing IS is a risk of scoliosis progression. In embodiments, the risk of developing IS is a risk of severe scoliosis progression.

In an embodiment, the above-noted methods further comprise classifying the subject in the FG1, FG2 or FG3 endophenotype group.

The present disclosure also provides a method of classifying a subject (e.g., a male subject) suffering from IS or at risk of developing IS (e.g., AIS) in the FG1, FG2 or FG3 endophenotype group comprising (i) determining the level of YKL-40 protein (e.g., circulating or secreted YKL-40) in a biological sample (e.g., biological fluid sample such as blood, plasma or serum) from the subject; and/or (ii) determining the presence or absence of at least one variant (e.g., SNP) in at least one allele of the CHI3L1 gene of the subject, or a marker in linkage disequilibrium therewith. The subject is classified in the FG1, FG2 or FG3 endophenotype based on the level of circulating YKL-40 protein determined and/or based on the presence or absence of the at least one gene variant in the CHI3L1 gene, or of the marker in linkage disequilibrium therewith. In embodiments, the subject is classified in the FG3 subgroup when the level of circulating YKL-40 protein is higher than that of a control sample or reference value. In embodiments, the subject is not classified in the FG3 subgroup when the level of YKL-40 protein is lower than that of a control sample or reference value.

The present disclosure also provides a method of classifying a subject having IS (e.g., AIS) in the FG1, FG2 or FG3 endophenotype group comprising (i) determining the level of ghrelin protein (e.g., circulating or secreted ghrelin) in a biological sample (e.g., biological fluid sample such as blood, plasma or serum) from the subject; and (ii) classifying the subject in the FG1 endophenotype when the level of circulating ghrelin is lower than the level in a control sample or reference value; or (ii) classifying the subject as belonging to the FG2 or FG3 endophenotype when the level of circulating ghrelin is higher than that of a control sample or reference value.

In embodiments, determining the presence or absence of at least one variant in the CHI3L1 gene or a marker in linkage disequilibrium therewith comprises determining the genotype of the subject (i.e., the presence or absence of a given variant in both alleles of the CHI3L1 gene of the subject) for the at least one variant or marker in linkage disequilibrium therewith.

In embodiments, determining the presence or absence of at least one variant in the CHI3L1 gene or a marker in linkage disequilibrium therewith comprises determining the haplotype of the subject for a plurality of variants (SNPs) in the CHI3L1 gene (i.e., the presence or absence of a given set of variants in the CHI3L1 gene of the subject, e.g., the haplotypes listed in any of FIGs. 5 to 10 or marker(s) in linkage disequilibrium therewith).

In embodiments, determining the presence or absence of at least one variant in the CHI3L1 gene or a marker in linkage disequilibrium therewith comprises determining the presence or absence of at least one variant (e.g., SNP) in the CHI3L1 gene in at least one allele of the subject. In embodiments, determining the presence or absence of at least one variant (e.g., SNP) in the CHI3L1 gene or a marker in linkage disequilibrium therewith comprises determining the presence or absence of at least one variant in the CHI3L1 gene in both alleles of the subject (i.e., determining whether the subject is homozygous or heterozygous for the at least one gene variant).
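The ghrelin-based classification rule above could be expressed as a small Python function. This is a sketch under stated assumptions: the function name and return labels are hypothetical, and because the text does not say how a level exactly equal to the reference is handled, that case is returned as `None` rather than guessed.

```python
from typing import Optional


def classify_by_ghrelin(ghrelin: float, reference: float) -> Optional[str]:
    """Classify a subject from a circulating ghrelin level (hypothetical helper).

    Lower than the reference -> "FG1"; higher -> "FG2 or FG3".
    A level equal to the reference is not covered by the text, so None is
    returned for that case.
    """
    if ghrelin < reference:
        return "FG1"
    if ghrelin > reference:
        return "FG2 or FG3"
    return None


# Example with made-up numbers
print(classify_by_ghrelin(80.0, 100.0))
print(classify_by_ghrelin(120.0, 100.0))
```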
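To make the distinction between detecting a variant in at least one allele and in both alleles concrete, here is a minimal Python sketch. The function name, the representation of a genotype as a pair of single-letter alleles, and the example alleles are assumptions for illustration; they are not taken from the tables or figures of the disclosure.

```python
from typing import Tuple


def variant_zygosity(alleles: Tuple[str, str], variant_allele: str) -> str:
    """Report how many of the subject's two alleles carry the variant.

    Hypothetical helper: `alleles` holds the two observed alleles at one
    SNP position, and `variant_allele` is the allele of interest.
    """
    count = sum(1 for a in alleles if a.upper() == variant_allele.upper())
    return ("absent", "heterozygous", "homozygous")[count]


# Example with made-up genotypes
print(variant_zygosity(("A", "G"), "G"))
print(variant_zygosity(("G", "G"), "G"))
```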

In a further aspect, the present disclosure concerns a method of genotyping a subject (e.g., having IS or at risk of developing IS) comprising determining the genotype of the subject for at least one variant in the CHI3L1 gene.

In embodiments, the at least one variant in the CHI3L1 gene is at least one of the variants (SNPs) listed in Tables 1, 3, 4, 5A and/or 6. In embodiments, the at least one variant comprises one or more of the following SNPs: rs55700740, rs946259, rs880633, rs1538372, rs4950881, rs946261, rs946262 and rs10920576.

In embodiments, the method comprises determining the presence or absence of at least two SNPs. In embodiments, the method comprises determining the presence or absence of at least three SNPs. In embodiments, the method comprises determining the presence or absence of at least four SNPs. In embodiments, the method comprises determining the presence or absence of at least five SNPs. In embodiments, the method comprises determining the presence or absence of at least six SNPs. In embodiments, the method comprises determining the presence or absence of at least seven SNPs. In embodiments, the method comprises determining the presence or absence of all SNPs identified herein.

In embodiments, the at least one variant in the C///?Z /gene comprises or consists of a combination of SNPs listed in

FIGs.5 to 10.

In embodiments, the above methods comprise the use of an oligonucleotide probe or primer. In embodiments, the oligonucleotide probe or primer enables the specific detection of a variant sequence (allele) set forth in Tables 1, 3, 4* 5A and/or 6.

In embodiments, the biological sample is a biological fluid sample. In embodiments, the biological fluid sample is a blood sample. In embodiments, the biological fluid sample is plasma In embodiments, the biological fluid sample is serum. In embodiments, the biological sample is a cell sample. In embodiments, the biological sample is a protein sample. In embodiments, the biological sample is a nucleic acid sample

In embodiments, the YKL-40 is a circulating (secreted) YKL-40 polypeptide In embodiments, the circulating YKL-40 polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 26, a fragment thereof, or an allelic variant thereof.

In embodiments, the subject has at least one family member diagnosed with IS (e.g., AIS) (first, second or third degree relative) In embodiments, the subject is diagnosed with IS. In embodiments, the subject is a subject diagnosed with IS and belonging to the FG1 endophenotype. In embodiments, the subject is a subject diagnosed with IS and belonging to the FG2 endophenotype. In embodiments, the subject is a subject diagnosed with IS and belonging to the FG3 endophenotype.

In embodiments, the subject is a male. In embodiments, the subject is a female In embodiments, the subject is between 6 and 26 years old. In embodiments, the subject is a pediatric subject. In embodiments, the pediatric subject is between 6 and 18 years old. In embodiments, the pediatric subject is between 10 and 15 years old In embodiments, the subject has at least one family member diagnosed with IS [e.g, AIS). In embodiments, the subject has at least one SNP identified herein [e.g., see Tables 1, 3, 4, 5A and/or 6) in at least one allele of the CHI3L 1 gene. In embodiments, the subject has at least two SNPs identified herein in at least one allele of the CHI3L1 gene. In embodiments, the subject has at least one SNP identified herein in both alleles of the CHI3L 1 gene (i.e., is homozygote for the SNP). In embodiments, the subject has at least two SNPs identified herein in both alleles of the CHI3L /gene (i.e., the subject is homozygote for at least two SNPs identified herein).

The present disclosure also concerns a method of treating or preventing IS [e.g., AIS) comprising increasing the level of secreted YKL-40 protein in the subject. In an embodiment, the method comprises administering an exogenous or recombinant YKL-40 polypeptide, or a cell expressing a YKL-40 polypeptide, to the subject In other embodiment, the method comprises increasing the expression of the endogenous YKL-40 polypeptide or correcting a defective CHI3L 1 gene, e.g. using a genome-editing technique such as the CRISPR/Cas9 system. The present disclosure also concerns the use of an exogenous or recombinant YKL-40 polypeptide for treating or preventing IS [e.g., AIS) in a subject, or for manufacture of a medicament for treating or preventing Idiopathic scoliosis in a subject. The present disclosure also concerns an exogenous or recombinant YKL-40 polypeptide for treating or preventing IS [e.g., AIS) in a subject.

The present disclosure also provides a method [in vitro ox in vivo) of reducing the GiPCR signaling defect in a cell of a subject having IS or at risk of developing IS [e.g, AIS) comprising contacting the cell with a YKL-40 polypeptide or increasing the expression of the endogenous YKL-40 polypeptide (or correcting a defective CHI3L 1 gene) in the cell. The present disclosure also concerns the use of an exogenous or recombinant YKL-40 polypeptide for reducing the GiPCR signaling defect in a cell of a subject having IS [e.g, AIS) or at risk of developing IS, or for the manufacture of a medicament for reducing the GiPCR signaling defect in a cell of a subject having IS or at risk of developing IS. The present disclosure also concerns an exogenous or recombinant YKL-40 polypeptide for reducing the GiPCR signaling defect in a cell of a subject having IS [e.g., AIS) or at risk of developing IS.

The present disclosure also provides a method of increasing GiPCR signaling in a cell of a subject having IS (e.g., AIS) or at risk of developing IS comprising contacting the cell with a YKL-40 polypeptide or increasing the expression of the endogenous YKL-40 polypeptide (or correcting a defective CHI3L1 gene) in the cell. The present disclosure also concerns the use of an exogenous or recombinant YKL-40 polypeptide for increasing GiPCR signaling in a cell of a subject having IS or at risk of developing IS, or for the manufacture of a medicament for increasing GiPCR signaling in a cell of a subject having IS or at risk of developing IS. The present disclosure also concerns an exogenous or recombinant YKL-40 polypeptide for increasing GiPCR signaling in a cell of a subject having IS or at risk of developing IS.

The present disclosure also provides a method of reducing the effect of OPN on GiPCR signaling in a cell of a subject having IS (e.g., AIS) or at risk of developing IS comprising contacting the cell with a YKL-40 polypeptide or increasing the expression of the endogenous YKL-40 polypeptide (or correcting a defective CHI3L1 gene) in the cell. The present disclosure also concerns the use of an exogenous or recombinant YKL-40 polypeptide for reducing the effect of OPN on GiPCR signaling in a cell of a subject having IS (e.g., AIS) or at risk of developing IS, or for the manufacture of a medicament for reducing the effect of OPN on GiPCR signaling in a cell of a subject having IS or at risk of developing IS. The present disclosure also concerns an exogenous or recombinant YKL-40 polypeptide for reducing the effect of OPN on GiPCR signaling in a cell of a subject having IS (e.g., AIS) or at risk of developing IS.

In embodiments, the above-noted method of treating or preventing, or use, further comprises (i) determining the level of YKL-40 protein in a biological sample from the subject; (ii) determining the level of ghrelin protein in a biological sample from the subject; (iii) classifying the subject in the FG1, FG2 or FG3 functional group; and/or (iv) determining the presence or absence of at least one variant (SNP) in at least one allele of the CHI3L1 gene of the subject.

In another aspect, the present disclosure concerns a composition or kit for use in methods disclosed herein (for example, a kit for (i) detecting a variant in at least one allele of the CHI3L1 gene in a biological sample; (ii) determining whether a subject is at risk of developing IS (e.g., AIS); (iii) treating or preventing IS (e.g., AIS) in a subject; (iv) genotyping a subject for at least one variant in the CHI3L1 gene (e.g., SNP disclosed herein); (v) classifying a subject into a particular genetic or endophenotype group (FG1, FG2 or FG3); (vi) reducing the GiPCR signaling defect in a cell of a subject (e.g., having IS or at risk of developing IS); (vii) reducing the effect of OPN on GiPCR signaling in a cell (e.g., of a subject having IS or at risk of developing IS), etc.). The kit may comprise, for example, one or more oligonucleotide probes or primers, one or more antibodies specific for detection of YKL-40 or of a CHI3L1/YKL-40 gene variant, and/or soluble YKL-40 (for treating and preventing IS). In embodiments, the composition or kit further comprises reagents for classifying the subject in the FG1, FG2 or FG3 functional groups as disclosed, for example, in WO/2003/073102, WO/2010/040234, WO/2012/045176, WO/2015/032005, WO/2014/201560 and WO/2014/201557. In embodiments, the composition or kit further comprises a biological sample from the subject.

In another aspect, the present disclosure concerns a DNA chip comprising at least one oligonucleotide for detecting the presence or absence of at least one CHI3L1 gene variant (SNP set forth in Tables 1, 3, 4, 5A and/or 6) and a substrate on which the oligonucleotide is immobilized. In embodiments, the variant is a variant (or combination of variants, including haplotypes) associated with a reduced risk of developing a scoliosis disclosed herein (e.g., set forth in Tables 1, 3, 4, 5A and/or 6 and/or FIGs. 5 to 10).

In a further aspect, the present disclosure provides oligonucleotide probes or primers for use in the above-described methods, compositions, kits, DNA chips, etc. In embodiments, the oligonucleotide is for the specific detection of a variant of the present disclosure and comprises a nucleotide sequence including the desired variant nucleotide (e.g., see Table 1). In embodiments, the variant is a variant associated with a reduced risk of developing a scoliosis disclosed herein. In embodiments, the oligonucleotide hybridizes to a reference (ancestral) or a variant polynucleotide sequence set forth in Table 1 or its complementary sequence. In embodiments, the oligonucleotide primer or probe further comprises a label. In embodiments, the oligonucleotide primer or probe comprises or consists of at least 10 nucleotides of a polynucleotide sequence set forth in any one of SEQ ID NOs: 1-24 and 27, or the complement thereof, and includes the nucleotide from the ancestral or variant allele set forth in Table 1 (or its complement). In embodiments, the oligonucleotide primer or probe comprises or consists of at least 10 nucleotides of a polynucleotide sequence set forth in any one of SEQ ID NOs: 1, 3, 5, 7, 9 and 11, or the complement thereof, and includes the nucleotide from the ancestral allele set forth in Table 1 (or its complement). In embodiments, the oligonucleotide primer or probe comprises or consists of at least 10 nucleotides of a polynucleotide sequence set forth in any one of SEQ ID NOs: 2, 4, 6, 8, 10 and 12, or the complement thereof, and includes the nucleotide from the variant allele set forth in Table 1 (or its complement). In embodiments, the oligonucleotide primer or probe consists of 10 to 100 nucleotides, preferably 10 to 60 nucleotides, 10 to 50 nucleotides, 10 to 40 nucleotides or 10 to 30 nucleotides. In embodiments, the oligonucleotide primer or probe consists of at least 12 nucleotides.
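
By way of illustration only, the following minimal Python sketch shows how a pair of allele-specific probes of a chosen length could be assembled around a polymorphic nucleotide. The function name `allele_probes`, the flanking sequences and the alleles used in the usage note are hypothetical placeholders; they are not the sequences of SEQ ID NOs: 1-24 or 27, and the centering of the polymorphic nucleotide is an assumption made for this example.

```python
from typing import Tuple


def allele_probes(flank5: str, ancestral: str, variant: str,
                  flank3: str, length: int = 21) -> Tuple[str, str]:
    """Return (ancestral_probe, variant_probe), each `length` nt long.

    The polymorphic nucleotide is placed at (or next to) the center of
    the probe. All names and the centering choice are illustrative.
    """
    # Length bounds taken from the 10 to 100 nucleotide embodiment.
    if not 10 <= length <= 100:
        raise ValueError("probe length must be between 10 and 100 nt")
    if len(ancestral) != 1 or len(variant) != 1:
        raise ValueError("alleles must be single nucleotides")

    left = (length - 1) // 2      # nucleotides taken 5' of the SNP
    right = length - 1 - left     # nucleotides taken 3' of the SNP
    if len(flank5) < left or len(flank3) < right:
        raise ValueError("flanking sequence too short for this length")

    five = flank5[len(flank5) - left:]
    three = flank3[:right]
    return five + ancestral + three, five + variant + three
```

For example, `allele_probes("ACGTACGTACGT", "C", "T", "TTGACCATGGCA", length=21)` returns two 21-nucleotide strings that differ only at the central position (index 10).
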
In a further aspect, the present disclosure relates to the use of methods, compositions, kits, oligonucleotide primers or probes, and DNA chips of the present disclosure for (i) detecting a variant in at least one allele of the CHI3L1 gene in a biological sample; (ii) determining whether a subject is at risk of developing IS; (iii) treating or preventing IS in a subject; (iv) genotyping a subject for at least one variant in the CHI3L1 gene (e.g., SNP disclosed herein); (v) classifying a subject in a particular genetic or endophenotype group (FG1, FG2 or FG3); (vi) reducing the GiPCR signaling defect in a cell of a subject (e.g., having IS or at risk of developing IS); (vii) reducing the effect of OPN on GiPCR signaling in a cell of a subject (e.g., having IS or at risk of developing IS), etc.

Other objects, advantages and features of the present disclosure will become more apparent upon reading of the following non-restrictive description of specific embodiments thereof, given by way of example only with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

In the appended drawings:

FIG. 1 shows plasma YKL-40 levels as a function of sex and AIS biological endophenotypes. ANOVA and a two-sided t-test were applied, and the p-values presented are after Bonferroni adjustment for pairwise comparisons;

FIG. 2 shows linkage disequilibrium blocks of 12 SNPs identified herein;

FIG. 3 shows that YKL-40 rescues the Gi-coupled receptor signaling defect induced by rOPN. Primary osteoblasts obtained from three scoliotic patients were pre-treated with purified rOPN (0.5 g/ml) with and without rYKL-40 (0.5 g/ml) for 18 h prior to stimulation with 10 mM of oxymetazoline. Error bars show the SEM of independent experiments performed three times in duplicate. Data represent the percentage of the maximum impedance measured by CDS assay and were normalized to the response achieved in the absence of rOPN (vehicle only). *P < 0.01 based on one-way ANOVA followed by Dunnett's post-hoc test;

FIG. 4 shows expression analysis of CHI3L1 in primary human osteoblasts obtained from AIS patients classified in the FG1 biological endophenotype vs. AIS patients classified in the FG2+FG3 biological endophenotypes. Statistical analysis was performed with an unpaired t-test (two-tailed p-value);

FIG. 5 depicts a table showing the results of the haplotype association analyses of plasma YKL-40 levels;

FIG. 6 depicts a table showing the results of the haplotype association analyses of plasma YKL-40 levels in endophenotype FG1;

FIG. 7 depicts a table showing the results of the haplotype association analyses of plasma YKL-40 levels in endophenotype FG2;

FIG. 8 depicts a table showing the results of the haplotype association analyses of plasma YKL-40 levels in endophenotype FG3;

FIG. 9 depicts a table showing the results of the haplotype association analyses of plasma YKL-40 levels in female subjects;

FIG. 10 depicts a table showing the results of the haplotype association analyses of plasma YKL-40 levels in male subjects;

FIG. 11A depicts the amino acid sequence of human YKL-40 (RefSeq accession No. NP_001267.2, SEQ ID NO: 25), with the mature form in bold; and

FIG. 11B depicts the nucleotide sequence of human YKL-40 transcript (NCBI Reference Sequence: NM_001276.2, SEQ ID NO: 27), with the coding sequence in bold.

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

The cellular and molecular mechanisms underlying spinal deformity progression in adolescent idiopathic scoliosis (AIS) remain poorly understood. In the studies described herein, it is demonstrated that increased circulating levels of YKL-40, a secreted glycoprotein encoded by the chitinase 3-like 1 (CHI3L1) gene, and some single nucleotide polymorphisms (SNPs) associated with higher plasma YKL-40 levels correlate with the risk of spinal deformity progression. A total of 728 French-Canadian patients and 216 controls were genotyped for 12 SNPs located in the CHI3L1 gene or its promoter. The occurrence of single polymorphisms, haplotypes and SNP-SNP interactions was statistically analyzed for association with disease risk, scoliosis severity phenotypes and endophenotypes as well as circulating YKL-40 levels. Plasma YKL-40 levels were determined by ELISA and SNPs were analyzed by multiplex polymerase chain reaction and genotyping. Functional effects of YKL-40 were tested on primary osteoblasts obtained from AIS patients by cellular dielectric spectroscopy assay. A significantly reduced risk of disease progression was observed with eight SNPs (rs55700740, rs946259, rs880633, rs1538372, rs4950881, rs946261, rs946262 and rs10920579) associated with higher plasma YKL-40 levels. Circulating YKL-40 levels were also significantly increased in males classified in the first biological endophenotype (FG1) and associated with a non-severe scoliosis phenotype (Cobb angle < 40°). Treatment with YKL-40 rescued Gi-coupled receptor signaling dysfunction observed in AIS osteoblasts. Collectively, the findings described herein reveal a novel role for YKL-40 in AIS pathogenesis and provide a first molecular mechanism by which this glycoprotein could interfere with spinal deformity progression.

Table 1: Gene variants in the CHI3L1 gene linked to IS

Definitions

In order to provide clear and consistent understanding of the terms in the instant application, the following definitions are provided.

The articles "a," "an" and "the" are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article.

As used herein, the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, un-recited elements or method steps, and are used interchangeably with the phrases "including but not limited to" and "comprising but not limited to".

The term "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".

For the recitation of numeric ranges herein, each intervening number therebetween with the same degree of precision is explicitly contemplated. For example, for the range of 18-20, the numbers 18, 19 and 20 are explicitly contemplated, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9 and 7.0 are explicitly contemplated.
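
Purely as an illustration of this convention, the short Python sketch below lists the values contemplated by a range at the precision with which its endpoints are written. The function name `enumerate_range` is hypothetical, and the use of decimal strings to carry the precision is an assumption of this example rather than part of the disclosure.

```python
from decimal import Decimal
from typing import List


def enumerate_range(low: str, high: str) -> List[str]:
    """List every value from low to high at the endpoints' precision.

    Endpoints are passed as strings so that "6.0" keeps its one decimal
    place, which determines the step between contemplated values.
    """
    lo, hi = Decimal(low), Decimal(high)
    if lo > hi:
        raise ValueError("low must not exceed high")

    # The finest decimal place used by either endpoint sets the step,
    # e.g. exponent -1 for "6.0" gives a step of 0.1.
    exponent = min(lo.as_tuple().exponent, hi.as_tuple().exponent)
    step = Decimal(1).scaleb(exponent)

    values = []
    current = lo
    while current <= hi:
        values.append(str(current.quantize(step)))
        current += step
    return values
```

With this sketch, `enumerate_range("18", "20")` yields `["18", "19", "20"]`, and `enumerate_range("6.0", "7.0")` yields the eleven values 6.0 through 7.0 in steps of 0.1.
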

Unless otherwise defined herein, scientific and technical terms used in connection with the present disclosure shall have the meanings that are commonly understood by those of ordinary skill in the art. For example, any nomenclature used in connection with, and techniques of, cell and tissue culture, molecular biology, immunology, microbiology, genetics and protein and nucleic acid chemistry and hybridization described herein are those that are well known and commonly used in the art. The meaning and scope of the terms should be clear; in the event, however, of any latent ambiguity, definitions provided herein take precedence over any dictionary or extrinsic definition. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular.

Practice of the methods, as well as preparation and use of the products and compositions disclosed herein employ, unless otherwise indicated, conventional techniques in molecular biology, biochemistry, chromatin structure and analysis, computational chemistry, cell culture, recombinant DNA and related fields as are within the skill of the art. These techniques are fully explained in the literature. See, for example, Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989 and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, "Chromatin" (P. M. Wassarman and A. P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, "Chromatin Protocols" (P. B. Becker, ed.), Humana Press, Totowa, 1999.

As used herein, “CHI3L1 gene” (GCID: GC01M203148; HGNC: 1932; Entrez Gene: 1116; Ensembl: ENSG00000133048; OMIM: 601525; UniProtKB: P36222; located on chromosome 1 (1q32.1); RefSeqGene on chromosome 1: NG_013056.1) refers to the gene encoding YKL-40, a secreted glycoprotein that is approximately 40 kDa in size in humans. YKL-40 is expressed and secreted by various cell types including macrophages, chondrocytes, fibroblast-like synovial cells, vascular smooth muscle cells, and hepatic stellate cells. YKL-40 lacks chitinase activity due to mutations within the active site. A non-limiting example of a human YKL-40 protein sequence is the sequence depicted in RefSeq accession No. NP_001267.2 (SEQ ID NO: 25). The first 21 residues of the sequence correspond to the signal peptide, and the mature YKL-40 protein comprises residues 22-383 (SEQ ID NO: 26). Residues 70-71, 97-100 and 204-207 are chitooligosaccharide-binding domains, and the region encompassing residues 324-338 is believed to be involved in AKT1 activation and IL-8 production. For the methods and uses described herein based on the administration or use of a YKL-40 polypeptide, a native YKL-40 protein (e.g., comprising the sequence of the mature form set forth in SEQ ID NO:26 or of the precursor set forth in SEQ ID NO:25), or a variant or fragment thereof having the biological activity of the native YKL-40 protein, and more particularly the ability to rescue (partially or completely) the α1-adrenergic receptor signaling dysfunction induced by rOPN, may be used. In an embodiment, the YKL-40 polypeptide comprises an amino acid sequence having at least 70% identity with the sequence set forth in SEQ ID NO:26 or SEQ ID NO:25, and has the biological activity of the native YKL-40 protein.
In further embodiments, the YKL-40 polypeptide comprises an amino acid sequence having at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the sequence set forth in SEQ ID NO:26 or SEQ ID NO:25, and has the biological activity of the native YKL-40 protein. In an embodiment, the YKL-40 polypeptide comprises the amino acid sequence set forth in SEQ ID NO:26 or SEQ ID NO:25.
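
As a non-limiting illustration, the following Python sketch computes a percent identity from two sequences that have already been aligned (gaps written as "-") and compares it to a threshold such as 70%. The function names and the toy peptides in the usage note are hypothetical and are not SEQ ID NO:25 or SEQ ID NO:26. The denominator used here (all alignment columns that are not gaps in both sequences) is only one of several conventions in use and is an assumption of this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")

    # A column counts as identical only if both residues match and
    # neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the percent identity is at least `threshold`."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For instance, the toy alignment `"MGVKASQTGF"` versus `"MGVRASQ-GF"` has 8 identical columns out of 10, i.e. 80% identity, which satisfies a 70% threshold but not a 90% threshold.
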

As used herein, “ghrelin” refers to a protein or polypeptide encoded by the ghrelin gene (GHRL: GCID:GC03M010285; HGNC: 18129; Entrez Gene: 51738; Ensembl: ENSG00000157017; OMIM: 605353; UniProtKB: Q9UBU3; located on chromosome 3 (3p25.3)). The GHRL gene encodes the ghrelin-obestatin preproprotein that is cleaved to yield two peptides, ghrelin and obestatin. Ghrelin is a powerful appetite stimulant and plays an important role in energy homeostasis. Non-limiting examples of ghrelin protein sequences include NP_001128413; NP_001128416; NP_001128417; NP_001128418; and NP_001289750.

As used herein, the term “Idiopathic scoliosis” or “IS” refers to the common complex disorder of the spine. It is a three-dimensional deformity of the skeleton characterized by a lateral curvature of ≥10° on a standing radiograph (Cobb method), combined with vertebral rotation. It is the most common form of spinal disorder. It mostly occurs at the age of adolescence and affects 1-4% (1) of the global pediatric population, with higher prevalence in females, who are generally more severely affected than males. The term “IS” includes Infantile (age of onset < 3 years old), Juvenile (age of onset between 3 and 9 years old) and Adolescent (age of onset between 10 and 15 years old) idiopathic scoliosis. A subject “diagnosed with IS” is a subject having a minimum curvature in the coronal plane of 10°, shown by, for example, a standing posteroanterior spinal radiograph, by the Cobb method, with vertebral rotation and without any congenital or genetic disorder which could be the source of the spinal deformity observed.

As used herein, the terms “risk of developing IS” (e.g., a subject at risk of developing IS) or the like refer to a genetic or metabolic predisposition of a subject to develop a scoliosis (i.e., spinal deformity) and/or a more severe scoliosis at a future time (i.e., curve progression of the spine). For instance, an increase of the Cobb angle of a subject (e.g., from 40° to 50° or from 18° to 25°) is a “development” of a scoliosis (i.e., a scoliosis progression). The terminology “a subject at risk of developing IS” includes asymptomatic subjects who are more likely than the general population to suffer from IS at a future time and includes subjects (e.g., children) having at least one parent, sibling or family member suffering from a scoliosis (either first degree, second degree or third degree relative). It also includes subjects who carry one or more known IS susceptibility markers (SNPs or other mutation/genetic variations). Also included in the terminology “a subject at risk of developing a scoliosis” are asymptomatic subjects (i.e., subjects who do not yet have a spinal deformity of over 10°) but who have been identified as having a GiPCR signaling defect and classified in the FG1, FG2 or FG3 endophenotype group using well known methods (e.g., cAMP measurement, cellular impedance, etc.; see, for example, WO/2003/073102, WO/2010/040234, WO/2012/045176, WO/2015/032005, WO/2014/201560, WO/2014/201557).

As used herein, the terms “severe scoliosis”, “severe IS” or “severe scoliosis progression” refer to a scoliosis (or scoliosis progression) with a Cobb angle of 40° or more.
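
For illustration, the numeric thresholds recited in the definitions above (Cobb angle of at least 10° for IS, 40° or more for severe scoliosis, and the age-of-onset bands) can be sketched in Python as follows. The function names and the returned labels are hypothetical. The sketch deliberately covers only the numeric cut-offs and does not capture the other diagnostic criteria (vertebral rotation, exclusion of congenital or genetic causes).

```python
from typing import Optional


def cobb_category(cobb_angle_deg: float) -> str:
    """Map a Cobb angle (degrees) onto the thresholds defined herein."""
    if cobb_angle_deg < 0:
        raise ValueError("Cobb angle cannot be negative")
    if cobb_angle_deg < 10:
        return "below IS threshold"   # curvature under 10 degrees
    if cobb_angle_deg < 40:
        return "non-severe"           # 10 degrees up to, not including, 40
    return "severe"                   # 40 degrees or more


def onset_subtype(age_of_onset_years: int) -> Optional[str]:
    """Map an age of onset (whole years) to the IS subtype, if any."""
    if age_of_onset_years < 0:
        raise ValueError("age cannot be negative")
    if age_of_onset_years < 3:
        return "infantile"
    if age_of_onset_years <= 9:
        return "juvenile"
    if age_of_onset_years <= 15:
        return "adolescent"
    return None  # outside the age bands recited in the definition
```

Under these assumptions, a Cobb angle of 25° is labeled "non-severe", a Cobb angle of 40° is labeled "severe", and an age of onset of 12 years maps to "adolescent".
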

As used herein, the terminology “polynucleotide sample” or “nucleic acid sample” is meant to refer to a DNA or RNA (including cDNA) sample from a test subject. The sample should contain a sufficient amount of polynucleotides for determining the presence or absence of SNPs and/or haplotypes (i.e., for genotyping) disclosed herein according to the selected method. The choice of the sample type will of course depend on the specific conditions of the assay. For example, gene variants (e.g., SNPs) found in intronic (or other untranscribed) sequences may not be detected using an RNA (or cDNA) sample, as known in the art. Preferably, the sample is a cell sample from the subject but is not so limited as long as the polynucleotide sample allows for the detection of the gene variant.

As used herein, the term “biological fluid sample” refers to blood, saliva, tears, sweat, urine, semen and milk. As used herein, the terminology “blood sample” is meant to refer to blood, plasma or serum.

As used herein, the term “subject” is meant to refer to any mammal including human, mouse, rat, dog, chicken, cat, pig, monkey, horse, etc. Preferably, the subject is a human, such as a human pediatric subject.

“Polymorphism” or “variant”. The genomic sequence within populations is not identical when individuals are compared. Rather, the genome exhibits sequence variability between individuals at many locations in the genome. Such variations in sequence are commonly referred to as polymorphisms, and there are many such sites within each genome. For example, the human genome exhibits sequence variations which occur on average every 500 base pairs. Thus, as used herein, a “polymorphism” or “variant” refers to a variation in the sequence of a nucleic acid (e.g., a gene sequence). Such variation includes insertions, deletions, and substitutions of one or more nucleotides.

The most common sequence variation (or polymorphism) consists of base variations at a single base position in the genome, and such sequence variants, or polymorphisms, are commonly called Single Nucleotide Polymorphisms ("SNPs"). There are usually two possibilities (or two alleles) at each SNP site: the original allele (ancestral) and the mutated allele (variant allele in Table 1), although there may be 3 or 4 possibilities for each SNP site. Due to natural genetic drift and possibly also selective pressure, the original mutation has resulted in a polymorphism characterized by a particular frequency of its alleles in any given population. There may also exist SNPs that vary between paired chromosomes in an individual. Each individual is in this instance either homozygous for one allele of the polymorphism (i.e., both chromosomal copies of the individual have the same nucleotide at the SNP location), or the individual is heterozygous (i.e., the two sister chromosomes of the individual contain different nucleotides). As used herein, an SNP thus refers to a variation at a single nucleotide in a given nucleic acid sequence.
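
The homozygous/heterozygous distinction described above can be sketched, for illustration only, as a small Python function operating on the two nucleotides observed at a biallelic SNP site. The function name `zygosity`, the returned labels and the example alleles are hypothetical; the sketch assumes a site with exactly one ancestral and one variant allele.

```python
def zygosity(allele_1: str, allele_2: str,
             ancestral: str, variant: str) -> str:
    """Classify a genotype at a biallelic SNP site.

    allele_1 and allele_2 are the nucleotides carried by the two
    chromosomal copies; ancestral and variant are the two known alleles.
    """
    known = {ancestral.upper(), variant.upper()}
    a1, a2 = allele_1.upper(), allele_2.upper()
    if a1 not in known or a2 not in known:
        raise ValueError("observed allele is neither ancestral nor variant")

    if a1 != a2:
        return "heterozygous"
    if a1 == ancestral.upper():
        return "homozygous ancestral"
    return "homozygous variant"
```

For a hypothetical site with ancestral allele C and variant allele T, the genotypes C/C, C/T and T/T are classified as "homozygous ancestral", "heterozygous" and "homozygous variant", respectively.
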

In general terms, each version of the sequence with respect to the polymorphic site represents a specific allele of the polymorphic site. These sequence variants can all be referred to as polymorphisms, occurring at specific polymorphic sites characteristic of the sequence variant in question. In general terms, polymorphisms can comprise any number of specific alleles.

In some instances, reference is made to different alleles at a variant/polymorphic site without choosing a reference allele. Alternatively, a reference sequence can be referred to for a particular polymorphic site. The reference allele is sometimes referred to as the "wild-type" allele or "ancestral allele" and refers herein to the allele from a "non-affected" or control/reference individual (e.g., an individual that does not display a trait or disease phenotype, i.e., that does not suffer from a scoliosis or that has a lower risk of (or predisposition to) developing a scoliosis).

A “gene variant”, "genetic marker" or "polymorphic marker", as described herein, refers to a variation (mutation or alteration) in a gene sequence that occurs in a given population. Each polymorphic marker/gene variant has at least two sequence variations characteristic of particular alleles at the polymorphic site. The marker/gene variant can comprise any allele of any variant type found in the genome, including variations in a single nucleotide (SNPs), microsatellites, insertions, deletions, duplications and translocations. The polymorphic marker/gene variant, if found in a transcribed region of the genome, can be detected not only in genomic DNA but also in RNA. In addition, when the polymorphism/variant is found in the gene portion that is translated into a polypeptide or protein, the polymorphic marker/gene variant can be detected at the protein/polypeptide level.

The term “defective CHI3L1 gene” as used herein refers to a CHI3L1 gene comprising one or more mutations that affect the expression of the CHI3L1 gene and/or that result in a YKL-40 protein having reduced activity relative to the native protein. In an embodiment, the defective CHI3L1 gene comprises one or more of the SNPs (variant allele) disclosed herein (e.g., variant allele in Table 1). The polymorphic marker/gene variant of the present disclosure and its specific sequence variation can be detected by various means such as by sequencing the nucleic acid or protein. Alternatively, when the polymorphism/variation affects the function of the gene or of its translated protein/polypeptide, the biological activity can be evaluated in order to identify which allele is present in the subject's sample. For example, if a particular risk allele (comprising a risk variant or combination of risk variants) affects the enzymatic activity of the protein, then the presence of the allele or variant(s) can be assessed by performing an enzymatic test. Alternatively, if the risk allele (comprising a gene variant or combination of variants) affects the expression level of a polypeptide or nucleic acid, then the presence of the variant(s) can be determined by assessing the expression level (e.g., by immunoassays, amplification assays, etc.) of such protein or nucleic acid and comparing it to a reference level in a control sample (e.g., a sample from a subject not suffering from a scoliosis or at risk of developing a scoliosis).

An "allele" refers to the nucleotide sequence of a given locus (position) on a chromosome. A polymorphic marker allele thus refers to the composition (i.e., sequence) of the marker on a chromosome. Genomic DNA from an individual contains two alleles for any given polymorphic marker, representative of each copy of the marker on each chromosome. A “risk allele”, a “susceptibility allele”, a “predisposition allele” or a “risk variant” is a nucleic acid sequence variation that is associated with an increased risk of (i.e., compared to a control/reference) or predisposition to suffering from IS. Conversely, a “protective allele” or “protective variant” is a sequence variation of a polymorphic marker that is associated with a lower risk of (i.e., compared to a control/reference) or predisposition to suffering from IS.

As used herein, the term “haplotype” refers to a set or collection of linked single-nucleotide polymorphism (SNP) alleles that tend to occur together (e.g., that are associated statistically with a reduced risk of developing IS).

As used herein, the term “non-conservative mutation” or “non-conservative substitution” in the context of polypeptides refers to a mutation in a polypeptide that changes an amino acid to a different amino acid with different biochemical properties (i.e., charge, hydrophobicity and/or size). Although there are many ways to classify amino acids, they are often sorted into six main groups on the basis of their structure and the general chemical characteristics of their R groups: (i) Aliphatic (Glycine, Alanine, Valine, Leucine, Isoleucine); (ii) Hydroxyl or Sulfur/Selenium-containing (also known as polar amino acids) (Serine, Cysteine, Selenocysteine, Threonine, Methionine); (iii) Cyclic (Proline); (iv) Aromatic (Phenylalanine, Tyrosine, Tryptophan); (v) Basic (Histidine, Lysine, Arginine); and (vi) Acidic and their Amide (Aspartate, Glutamate, Asparagine, Glutamine). Thus, a non-conservative substitution includes one that changes an amino acid of one group with another amino acid of another group (e.g., an aliphatic amino acid for a basic, a cyclic, an aromatic or a polar amino acid; a basic amino acid for an acidic amino acid; a negatively charged amino acid (aspartic acid or glutamic acid) for a positively charged amino acid (lysine, arginine or histidine), etc.).

Conversely, a “conservative substitution” or “conservative mutation” in the context of polypeptides is a mutation that changes an amino acid to a different amino acid with similar biochemical properties (e.g., charge, hydrophobicity and size). For example, leucine and isoleucine are both aliphatic, branched hydrophobic residues. Similarly, aspartic acid and glutamic acid are both small, negatively charged residues. Therefore, changing a leucine for an isoleucine (or vice versa) or changing an aspartic acid for a glutamic acid (or vice versa) are examples of conservative substitutions.
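
As a simplified illustration of the six-group classification recited above, the Python sketch below treats a substitution as conservative when both residues fall in the same group and as non-conservative otherwise. This same-group rule is an assumption made for the example; the definitions herein rest on similarity of biochemical properties, which a group lookup only approximates. The names `AMINO_ACID_GROUPS` and `is_conservative` are hypothetical, and residues are given as standard one-letter codes (U for selenocysteine).

```python
# Six groups as listed in the definition of non-conservative substitution.
AMINO_ACID_GROUPS = {
    "aliphatic": "GAVLI",
    "hydroxyl or sulfur/selenium-containing": "SCUTM",
    "cyclic": "P",
    "aromatic": "FYW",
    "basic": "HKR",
    "acidic and their amide": "DENQ",
}

# Reverse lookup: one-letter code -> group name.
GROUP_OF = {
    residue: group
    for group, residues in AMINO_ACID_GROUPS.items()
    for residue in residues
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same group (simplified rule)."""
    a, b = original.upper(), replacement.upper()
    if a not in GROUP_OF or b not in GROUP_OF:
        raise ValueError("unknown amino acid code")
    if a == b:
        raise ValueError("not a substitution: residues are identical")
    return GROUP_OF[a] == GROUP_OF[b]
```

Under this simplified rule, leucine to isoleucine (`"L"`, `"I"`) and aspartic acid to glutamic acid (`"D"`, `"E"`) are conservative, whereas aspartic acid to lysine (`"D"`, `"K"`) is non-conservative.
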

"Complement" or "complementary" as used herein refers to Watson-Crick [e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. "Complementarity" refers to a property shared between two nucleic acid sequences, such that when they are aligned antiparallel to each other, the nucleotide bases at each position will be complementary.

Sequence similarity

"Homology" and "homologous” refers to sequence similarity between two polypeptides or two nucleic acid molecules. Homology can be determined by comparing each position in the aligned sequences. A degree of homology between nucleic acid or between amino acid sequences is a function of the number of identical or matching nucleotides or amino acids at positions shared by the sequences. As the term is used herein, a nucleic acid sequence is "substantially homologous" to another sequence if the two sequences are substantially identical and the functional activity of the sequences is conserved (as used herein, the term“homologous” does not infer evolutionary relatedness, but rather refers to substantial sequence identity, and thus is interchangeable with the terms“identityTidentical”) Two nucleic acid sequences are considered substantially identical if, when optimally aligned (with gaps permitted), they share at least about 50% sequence similarity or identity, or if the sequences share defined functional motifs. In alternative embodiments, sequence similarity in optimally aligned substantially identical sequences may be at least 60%, 70%, 75%, 80%, 85%, 90% or 95%. For the sake of brevity, the units (e.g., 66, 67. .81 , 82,...91, 92% ...) have not systematically been recited but are considered, nevertheless, within the scope of the present disclosure

Substantially complementary nucleic acids are nucleic acids in which the complement of one molecule is substantially identical to the other molecule. Two nucleic acid or protein sequences are considered substantially identical if, when optimally aligned, they share at least about 70% sequence identity. In alternative embodiments, sequence identity may for example be at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 98% or at least 99%. Optimal alignment of sequences for comparisons of identity may be conducted using a variety of algorithms, such as the local homology algorithm of Smith and Waterman, 1981, Adv. Appl. Math 2: 482, the homology alignment algorithm of Needleman and Wunsch, 1970, J. Mol. Biol. 48:443, the search for similarity method of Pearson and Lipman (Pearson and Lipman 1988), and the computerized implementations of these algorithms (such as GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, Madison, WI, U.S.A.). Sequence identity may also be determined using the BLAST algorithm, described in Altschul et al. (Altschul et al. 1990) (using the published default settings). Software for performing BLAST analysis may be available through the National Center for Biotechnology Information (through the internet at http://www.ncbi.nlm.nih.gov/). The BLAST algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. Initial neighborhood word hits act as seeds for initiating searches to find longer HSPs. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased.
Extension of the word hits in each direction is halted when any of the following conditions is met: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. One measure of the statistical similarity between two sequences using the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. In alternative embodiments of the invention, nucleotide or amino acid sequences are considered substantially identical if the smallest sum probability in a comparison of the test sequences is less than about 1, preferably less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
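
The three halting conditions recited above may be illustrated, without limitation, by the following simplified Python sketch of an ungapped extension of a word hit in one direction. It is not the BLAST implementation: the function name, the match/mismatch scores (+1/-3), the value of X and the treatment of "falls off by X" as "falls off by at least X" are all assumptions chosen for this example.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 seed_score: int, match: int = 1, mismatch: int = -3,
                 x_drop: int = 5) -> tuple[int, int]:
    """Extend a seed to the right without gaps.

    Returns (best cumulative score, number of extended positions at which
    that best score was reached). All scoring values are illustrative.
    """
    score = best = seed_score
    best_len = 0
    i = 0
    # Condition 3: stop when the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += match if query[q_start + i] == subject[s_start + i] else mismatch
        i += 1
        if score > best:
            best, best_len = score, i
        # Condition 1: score fell off by X from its maximum achieved value.
        # Condition 2: cumulative score went to zero or below.
        if best - score >= x_drop or score <= 0:
            break
    return best, best_len
```

With query "ACGTACGTTT", subject "ACGTACGAAA" and a three-letter seed at the start of both (seed_score = 3), extending from position 3 reaches a best score of 7 after four further matches and then halts after two mismatches, returning (7, 4).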

An alternative indication that two nucleic acid sequences are substantially complementary is that the two sequences hybridize to each other under moderately stringent, or preferably stringent, conditions. Hybridization to filter-bound sequences under moderately stringent conditions may, for example, be performed in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 65°C, and washing in 0.2 x SSC/0.1% SDS at 42°C (Ausubel 2010). Alternatively, hybridization to filter-bound sequences under stringent conditions may, for example, be performed in 0.5 M NaHPO4, 7% SDS, 1 mM EDTA at 65°C, and washing in 0.1 x SSC/0.1% SDS at 68°C (Ausubel 2010). Hybridization conditions may be modified in accordance with known methods depending on the sequence of interest (Tijssen 1993). Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point for the specific sequence at a defined ionic strength and pH.

As used herein the term "treating" or "treatment" in reference to scoliosis is meant to refer to at least one of a reduction of Cobb angle in a preexisting spinal deformity, improvement of column mobility, preservation/maintenance of column mobility, improvement of equilibrium and balance in a specific plane, maintenance/preservation of equilibrium and balance in a specific plane, improvement of functionality in a specific plane, preservation/maintenance of functionality in a specific plane, cosmetic improvement, and a combination of any of the above.

As used herein the term "preventing" or "prevention" in reference to scoliosis is meant to refer to at least one of a reduction in the progression of a Cobb angle in a patient having a scoliosis or in an asymptomatic patient, a complete prevention of the appearance of a spinal deformity, including changes affecting the rib cage and pelvis in 3D, or a combination of any of the above.

As used herein the term "follow-up schedule" is meant to refer to the future medical visits prescribed to a subject diagnosed with a scoliosis or at risk of developing a scoliosis once the diagnosis or risk evaluation is made. For example, when a subject is identified as being at risk of developing a severe scoliosis or at risk of rapid curve progression (e.g., a subject classified as belonging to the FG2 subgroup in accordance with the present disclosure), the number of medical visits (e.g., to the orthopedist) is increased and/or the number of x-rays in a given period (e.g., 3, 6 or 12 months) is increased. On the other hand, when a subject is identified as having a lower risk of curve progression or rapid curve progression (e.g., a subject classified as belonging to the FG1 or FG3 subgroup), the number of medical visits or x-rays may be decreased to less than the average (e.g., less than 22 x-rays over a 3 year period or less than 1 visit every 3 months, 6 months or 12 months).

Detection of CHI3L1 gene variants/polymorphic markers/haplotypes

Detecting specific gene variants or polymorphic markers and/or haplotypes of the present disclosure can be accomplished by methods known in the art. Such detection can be made at the nucleic acid or amino acid (protein) level.

For example, standard techniques for genotyping for the presence of gene variants (e.g., SNPs and/or microsatellite markers) can be used, such as sequencing, fluorescence-based techniques (Chen, X. et al., Genome Res. 9(5): 492-98 (1999)), methods utilizing PCR, LCR, nested PCR and other methods for nucleic acid amplification. Specific methodologies available for SNP genotyping include, but are not limited to, restriction fragment length polymorphism (RFLP), TaqMan™ genotyping assays and SNPlex™ platforms (Applied Biosystems), mass spectrometry (e.g., MassARRAY™ system from Sequenom™), minisequencing methods, real-time PCR, Bio-Plex™ system (BioRad), CEQ and SNPstream™ systems (Beckman), Molecular Inversion Probe™ array technology (e.g., Affymetrix™ GeneChip), and BeadArray™ Technologies (e.g., Illumina GoldenGate™ and Infinium™ assays). By these or other methods available to the person skilled in the art, one or more alleles at polymorphic markers, including microsatellites, SNPs or other types of polymorphic markers, can be identified.

Linkage Disequilibrium

In order to determine the risk of developing a scoliosis it is also possible to assess the presence of a gene variant (such as a SNP) in linkage disequilibrium with any of the gene variants identified herein (e.g., SNPs/variants listed in any of Tables 1, 3, 4, 5A and/or 6 and/or FIGs. 5 to 10).

Once a first SNP has been identified in a genomic region of interest, the practitioner of ordinary skill in the art can easily identify additional SNPs in linkage disequilibrium with this first SNP. In the context of the invention, the additional SNPs in linkage disequilibrium with a first SNP are within the same gene of said first SNP. Linkage disequilibrium (LD) is defined as the non-random association of alleles at different loci across the genome. Alleles at two or more loci are in LD if their combination occurs more or less frequently than expected by chance in the population.

For example, if a particular genetic element (e.g., an allele of a polymorphic marker, or a haplotype) occurs in a population at a frequency of 0.50 (50%) and another element occurs at a frequency of 0.50 (50%), then the predicted occurrence of a person's having both elements is 0.25 (25%), assuming a random distribution of the elements. However, if it is discovered that the two elements occur together at a frequency higher than 0.25, then the elements are said to be in linkage disequilibrium, since they tend to be inherited together at a higher rate than what their independent frequencies of occurrence (e.g., allele or haplotype frequencies) would predict.

When there is a causal locus in a DNA region, due to LD, one or more SNPs nearby are likely associated with the trait too. Therefore, any SNPs in LD with a first SNP associated with IS (e.g., AIS) or an associated disorder will be associated with this trait. Identification of additional SNPs in linkage disequilibrium with a given SNP involves: (a) amplifying a fragment from the gene comprising a first SNP from a plurality of individuals; (b) identifying second SNPs in the gene comprising said first SNP; (c) conducting a linkage disequilibrium analysis between said first SNP and second SNPs; and (d) selecting said second SNPs as being in linkage disequilibrium with said first marker. Subcombinations comprising steps (b) and (c) are also contemplated.

Methods to identify SNPs and to conduct linkage disequilibrium analysis can be carried out by the skilled person without undue experimentation by using well-known methods.

Thus, the practitioner of ordinary skill in the art can easily identify SNPs or combinations of SNPs within haplotypes in linkage disequilibrium with the at-risk gene variant (e.g., risk SNP).

Such markers are mapped and listed in public databases like HapMap, as is well known to the skilled person. Genomic LD maps have been generated across the genome, and such LD maps have been proposed to serve as a framework for mapping disease genes (Risch et al., 1996; Maniatis et al., 2002; Reich et al., 2001). If all polymorphisms in the genome were independent at the population level (i.e., no LD), then every single one of them would need to be investigated in association studies, to assess all the different polymorphic states. However, due to linkage disequilibrium between polymorphisms, tightly linked polymorphisms are strongly correlated, which reduces the number of polymorphisms that need to be investigated in an association study to observe a significant association. Another consequence of LD is that many polymorphisms may give an association signal due to the fact that these polymorphisms are strongly correlated.

The two metrics most commonly used to measure LD are D' and r², which can be written in terms of each other and the allele frequencies. Both measures range from 0 (the two alleles are independent or in equilibrium) to 1 (the two alleles are completely dependent or in complete disequilibrium), but with different interpretations. D' is equal to 1 if at most two or three of the possible haplotypes defined by two markers are present, and <1 if all four possible haplotypes are present. r² measures the statistical correlation between two markers and is equal to 1 if only two haplotypes are present.
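
For illustration only, D' and r² for a pair of biallelic markers may be computed from allele and haplotype frequencies using the standard population-genetics definitions, as in the following Python sketch. The function name and the input values are assumptions made for this example; the first call reuses the 0.50/0.50 allele frequencies of the worked example above with a hypothetical observed joint frequency of 0.40.

```python
def ld_metrics(p_a: float, p_b: float, p_ab: float) -> tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic markers.

    p_a, p_b: frequencies of allele A at marker 1 and allele B at marker 2.
    p_ab: frequency of the haplotype carrying both A and B.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both markers must be polymorphic (0 < frequency < 1)")
    # D: observed haplotype frequency minus the frequency expected by chance.
    d = p_ab - p_a * p_b
    # D is normalised by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # r^2: squared correlation between the two markers.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


d, d_prime, r2 = ld_metrics(0.5, 0.5, 0.40)   # D = 0.15, D' = 0.6, r^2 = 0.36
```

With frequencies 0.5, 0.3 and 0.3 (only three of the four haplotypes present), the same function gives D' = 1 while r² is about 0.43, which reflects the different interpretation of the two measures.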

Most SNPs in humans probably arose by single base modifying events that took place within chromosomes long ago. A single newly created allele, at its time of origin, would have been surrounded by a series of alleles at other polymorphic loci like SNPs, establishing a unique grouping of alleles (i.e., a haplotype). If this specific haplotype is transmitted intact to the next generations, complete LD exists between the new allele and each of the nearby polymorphisms, meaning that these alleles would be 100% predictive of the new allele. Thus, because of complete LD (D' = 1 or r² = 1), an allele of one polymorphic marker can be used as a surrogate for a specific allele of another. Events like recombination may decrease LD between markers. However, moderate (i.e., 0.5 ≤ r² < 0.8) to high (i.e., 0.8 ≤ r² < 1) LD conserves the "surrogate" properties of markers. In LD-based association studies, when LD exists between markers and an unknown pathogenic allele, then all markers show a similar association with the disease.

It is well known that many SNPs have alleles that show strong LD (or high LD, defined as r² ≥ 0.80) with other nearby SNP alleles. In regions of the genome with strong LD, a selection of evenly spaced SNPs, or those chosen on the basis of their LD with other SNPs (proxy SNPs or Tag SNPs), can capture most of the genetic information of SNPs which are not genotyped, with only slight loss of statistical power. In association studies, this region of LD is adequately covered using few SNPs (Tag SNPs), and a statistical association between a SNP and the phenotype under study means that the SNP is a causal variant or is in LD with a causal variant. It is a general consensus that a proxy (or Tag SNP) is defined as a SNP in LD (r² ≥ 0.8) with one or more other SNPs. The genotype of the proxy SNP could predict the genotype of the other SNP via LD and inversely. In particular, any SNP in LD with one of the SNPs used herein may be replaced by one or more proxy SNPs defined according to their LD as r² ≥ 0.8.

These SNPs in linkage disequilibrium can also be used in the methods according to the present disclosure, and more particularly in the diagnostic methods according to the present disclosure. In particular, the presence of SNPs in linkage disequilibrium (LD) with the above identified SNPs may be genotyped, in place of, or in addition to, said identified SNPs. In the context of the present disclosure, the SNPs in linkage disequilibrium with the above identified SNP are within the same gene of the above identified SNP. Therefore, in the present disclosure, the presence of SNPs in linkage disequilibrium (LD) with a SNP of interest and located within the same gene as the SNP of interest may be genotyped, in place of, or in addition to, said SNP of interest. Preferably, such an SNP and the SNP of interest have r² ≥ 0.70, preferably r² ≥ 0.75, more preferably r² ≥ 0.80, and/or have D' ≥ 0.60, preferably D' ≥ 0.65, D' ≥ 0.7, D' ≥ 0.75, more preferably D' ≥ 0.80. Most preferably, such an SNP and the SNP of interest have r² ≥ 0.80, which is used as reference value to define "LD" between SNPs.
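
As a non-limiting sketch, selection of proxy SNPs against an r² threshold may be expressed in Python as follows. The function name, the marker labels ("snp_a", "snp_b", "snp_c") and their r² values are hypothetical placeholders introduced for this example; they are not data from the present disclosure.

```python
def select_proxies(r2_by_snp: dict[str, float], r2_min: float = 0.80) -> list[str]:
    """Return candidate SNPs whose r^2 with the SNP of interest is at least r2_min,
    ordered from strongest to weakest LD."""
    kept = [snp for snp, r2 in r2_by_snp.items() if r2 >= r2_min]
    return sorted(kept, key=lambda snp: -r2_by_snp[snp])


# Hypothetical r^2 values between a SNP of interest and three same-gene candidates.
candidates = {"snp_a": 0.95, "snp_b": 0.72, "snp_c": 0.80}
proxies = select_proxies(candidates)            # ["snp_a", "snp_c"]
relaxed = select_proxies(candidates, 0.70)      # ["snp_a", "snp_c", "snp_b"]
```

The default of 0.80 corresponds to the reference value recited above; passing 0.70 or 0.75 reproduces the less stringent alternatives.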

Exemplary markers in linkage disequilibrium are shown in FIG.2.

Compositions and kits

Compositions and kits for use in the methods of the present disclosure (i.e., for determining the risk of developing a scoliosis; for genotyping a subject and for classifying a subject suffering from a scoliosis or at risk of developing a scoliosis) may include for example (i) one or more reagents for detecting the level of YKL-40 in a biological sample; and/or (ii) one or more reagents for detecting the presence or absence of at least one CHI3L1 gene variant (e.g., one or more variants listed in any of Tables 1, 3, 4, 5A and/or 6 and/or FIGs. 5 to 10) or a substitute marker in linkage disequilibrium therewith.

Compositions and kits can comprise oligonucleotide primers and hybridization probes (e.g., allele-specific oligonucleotide primers and hybridization probes for determining the presence or absence of a variant in the CHI3L1 gene (e.g., one or more variants listed in any of Tables 1, 3, 4, 5A and/or 6 and/or FIGs. 5 to 10)), restriction enzymes (e.g., for RFLP analysis) and/or antibodies that bind to a YKL-40 polypeptide (wild-type or variant polypeptide).

The kit (or composition) may also include any necessary buffers, enzymes (e.g., DNA polymerase) and/or reagents necessary for performing the methods of the present disclosure. The kit may comprise one or more labeled nucleic acids (or labeled antibody) capable of specific detection of YKL-40 polypeptide or at least one CHI3L1 gene variant of the present disclosure (e.g., one or more variants listed in any of Tables 1, 3, 4, 5A and/or 6 and/or FIGs. 5 to 10) or any markers in linkage disequilibrium therewith as well as reagents for the detection of the label.

Reagents may be provided in separate containers or premixed depending on the requirements of the method. Suitable labels are well known in the art and will be chosen according to the specific method used. Non-limiting examples of suitable labels (including non-naturally occurring labels/synthetic labels) include a radioisotope, a fluorescent label, a magnetic label, an enzyme, etc.

The detection of a CHI3L1 gene variant (e.g., one or more variants listed in any of Tables 1, 3, 4, 5A and/or 6 and/or FIGs. 5 to 10) associated with IS in accordance with the present disclosure may be determined by DNA chip analysis. Such a DNA chip or nucleic acid microarray consists of different nucleic acid probes that are chemically attached to a substrate, which can be a microchip, a glass slide or a microsphere-sized bead. A microchip may be constituted of polymers, plastics, resins, polysaccharides, silica or silica-based materials, carbon, metals, inorganic glasses, or nitrocellulose. Probes comprise nucleic acids such as cDNAs or oligonucleotides that may be about 10 to about 60 base pairs. To determine the alteration of the genes, a sample from a test subject is labelled and contacted with the microarray in hybridization conditions, leading to the formation of complexes between target nucleic acids that are complementary to probe sequences attached to the microarray surface. The presence of labelled hybridized complexes is then detected. Many variants of the microarray hybridization technology are available to the person skilled in the art.

In embodiments, there is provided a composition (e.g., a diagnostic composition) or assay mixture which is generated following one or more steps of the methods described herein and which includes a biological sample (e.g., cell sample, blood sample, plasma sample, serum sample, etc.) from the subject to be tested, and means or reagents for detecting the proteins, nucleic acids, and/or SNPs described herein. The preparation of such a composition occurs while testing a subject's biological sample for the risk of developing a scoliosis (including the risk of developing a more severe scoliosis); for aiding in the prevention and treatment of scoliosis including for determining the best treatment regimen; for adapting an undergoing treatment regimen; for selecting a new treatment regimen; for determining the frequency of a specific treatment regimen or follow-up schedule; or for classifying the subject into a particular genetic group or endophenotype (FG1, FG2 or FG3). Such compositions may be prepared using the kits described herein. In embodiments, compositions and kits of the present disclosure may thus comprise one or more oligonucleotide probes or amplification primers for the detection (e.g., amplification or hybridization) of at least one CHI3L1 gene variant of the present disclosure (e.g., a variant or reference sequence defined in one or more variants listed in any of Tables 1, 3, 4, 5A and/or 6 and/or FIGs. 5 to 10). In embodiments, oligonucleotide probes are provided in the form of a microarray or DNA chip.
The kit may further include instructions to use the kit in accordance with the methods of the present disclosure (e.g., for determining the risk of (or predisposition to) developing a scoliosis; for genotyping a subject; for treating a subject; for selecting the best follow-up schedule or treatment regimen; for determining the risk of future parents to have a child likely to suffer from IS (e.g., AIS) (or a severe form of IS); or for classifying a subject suffering from a scoliosis or at risk of developing a scoliosis in a specific genetic or functional group).

The present disclosure is illustrated in further detail by the following non-limiting examples.

EXAMPLE 1: CLINICAL AND BIOCHEMICAL CHARACTERISTICS OF COHORT

A summary of demographic features, clinical profiles and plasma YKL-40 levels for the French-Canadian cohort studied herein is provided in Table 2A. Plasma YKL-40 levels were analyzed in 710 patients and 227 controls. As expected, there were more females in patients than in controls (Fisher's exact test P = 0.001). Plasma YKL-40 levels and genotypes for the 12 CHI3L1 SNPs were analyzed for 728 patients with AIS and 216 healthy controls after ancestral and relatedness testing. Stratification by scoliosis severity was determined only in the participants who had reached skeletal maturity at the time of blood collection, which resulted in 132 AIS patients as severe cases (Cobb angle ≥ 40°) and 227 AIS patients as non-severe cases (Cobb angle 10°-39°). Demographic and clinical data for the second cohort of AIS (n=137) and control subjects (n=51) genotyped by the multiplex polymerase chain reaction are provided in Table 2B.

Table 2A: Clinical and biochemical characteristics of the French-Canadian cohort studied herein

Females

Groups | N | Mean Age (Years) | Mean Cobb Angle (°) | YKL-40 (ng/ml)
All AIS | 598 | 13.7 ± 2.2 (6.9-26.0) | 28 ± 16 (10-90) | 32 ± 20 (3-326)
Endophenotype FG1 | 124 | 13.5 ± 2.1 (6.9-18.7) | 28 ± 15 (10-83) | 30 ± 12 (5-69)
Endophenotype FG2 | 229 | 13.8 ± 2.2 (7.3-19.1) | 30 ± 17 (10-89) | 32 ± 19 (3-183)
Endophenotype FG3 | 241 | 13.6 ± 2.2 (7.7-26.0) | 26 ± 15 (10-90) | 33 ± 24 (4-326)
Healthy Control Subjects | 124 | 12.5 ± 3.3 (7.1-18.3) | NA | 29 ± 13 (8-81)

Males

Groups | N | Mean Age (Years) | Mean Cobb Angle (°) | YKL-40 (ng/ml)
All AIS | 112 | 14.0 ± 2.1 (7.4-18.2) | 22 ± 11 (10-72) | 39 ± 36 (8-298)
Endophenotype FG1 | 21 | 14.2 ± 1.6 (10.9-16.7) | 27 ± 20 (10-76) | 57 ± 62 (14-298)
Endophenotype FG2 | 28 | 14.0 ± 2.6 (8.7-18.2) | 24 ± 12 (10-56) | 36 ± 39 (9-228)
Endophenotype FG3 | 60 | 14.0 ± 2.3 (7.4-18.2) | 22 ± 14 (10-66) | 33 ± 16 (8-96)
Healthy Control Subjects | 103 | 12.4 ± 3.1 (7.2-17.6) | NA | 28 ± 13 (4-90)

All Subjects

Groups | N | Mean Age (Years) | Mean Cobb Angle (°) | YKL-40 (ng/ml)
All AIS | 710 | 13.8 ± 2.2 (6.9-26.0) | 27 ± 16 (10-90) | 33 ± 23 (3-326)
Endophenotype FG1 | 145 | 13.6 ± 2.0 (6.9-18.7) | 28 ± 16 (10-83) | 34 ± 27 (5-298)
Endophenotype FG2 | 257 | 13.9 ± 2.2 (7.3-19.1) | 29 ± 17 (10-89) | 32 ± 22 (3-228)
Endophenotype FG3 | 301 | 13.7 ± 2.2 (7.4-26.0) | 25 ± 14 (10-90) | 33 ± 23 (4-326)
Healthy Control Subjects | 227 | 12.5 ± 3.2 (7.1-18.3) | NA | 29 ± 13 (4-90)

Table 2B: Demographic and clinical data for the replication cohort genotyped using multiplex PCR

EXAMPLE 2: ASSOCIATION BETWEEN PLASMA YKL-40 LEVELS AND AIS BIOLOGICAL ENDOPHENOTYPES

When all AIS patients were compared to matched healthy controls, patients showed higher plasma YKL-40 levels than controls (P = 0.002). Previous work demonstrated that AIS patients have a distinctive systemic signaling dysfunction for G inhibitory (Gi)-coupled receptors, allowing their functional classification in three distinct biological endophenotypes with respect to the impedance values: FG1 = (10-40 Ω), FG2 = (40-80 Ω), and FG3 = (80-120 Ω) (while healthy controls always exceed 120 Ω) (8). The possible interaction between gender and endophenotype was first tested, and was found to be statistically significant (P = 0.009). Therefore, the analyses for females and males were performed separately. When only females of the three groups were compared, there was no significant difference in circulating YKL-40 levels among the three different biological endophenotypes. When only males of the three groups were analyzed, there were significant differences among the biological endophenotypes (P = 0.001, FIG. 1). After Bonferroni adjusted pairwise comparisons, AIS FG1 males showed higher YKL-40 levels than control males (P = 0.001) and AIS FG3 males (P = 0.042), respectively. Therefore, changes observed in plasma YKL-40 levels replicated at the protein level the previous expression analyses using primary osteoblasts obtained from AIS patients and matched healthy controls (FIG. 4).
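
Purely as an illustrative sketch, the assignment of a subject to an endophenotype from a measured impedance value may be written in Python as below. The function name, the returned labels and the handling of the shared boundary values (40 and 80 are assigned here to the higher group, and 120 to FG3) are assumptions made for this example, since the ranges recited above do not specify which group a boundary value belongs to.

```python
def classify_endophenotype(impedance_ohm: float) -> str:
    """Map an impedance value (in ohms) to an endophenotype label.

    Boundary handling is an assumption of this sketch, not part of the disclosure.
    """
    if impedance_ohm < 10:
        return "unclassified"      # below the lowest recited range
    if impedance_ohm < 40:
        return "FG1"               # 10-40 ohms
    if impedance_ohm < 80:
        return "FG2"               # 40-80 ohms
    if impedance_ohm <= 120:
        return "FG3"               # 80-120 ohms
    return "control range"         # healthy controls exceed 120 ohms


labels = [classify_endophenotype(v) for v in (25, 60, 100, 150)]
# ["FG1", "FG2", "FG3", "control range"]
```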

EXAMPLE 3: ASSOCIATION BETWEEN PLASMA YKL-40 LEVELS AND SCOLIOSIS SEVERITY PHENOTYPE

The IS patients were further classified into severe cases (Cobb ≥ 40°) or non-severe ones (Cobb < 40°) to assess for possible associations between plasma YKL-40 levels and scoliosis severity phenotype. No evidence of statistically significant interaction was found between gender and AIS status. Bonferroni adjusted pairwise comparisons between both AIS case groups and controls showed a statistically significant elevation of plasma YKL-40 levels in the non-severe AIS cases relative to controls (P = 0.003).

EXAMPLE 4: ASSOCIATION BETWEEN PLASMA GHRELIN AND YKL-40 LEVELS

Plasma ghrelin levels were measured in a subset of AIS patients and matched healthy controls. The clinical and demographic summary of the participants tested is provided in Table 2C. Analysis of all AIS patients compared to matched controls did not show a significant effect of circulating ghrelin levels on plasma YKL-40 levels. However, when AIS patients were stratified according to their biological endophenotypes, mean plasma ghrelin levels were significantly lowered only in FG1 endophenotype samples (99.9 ± 44.9 pg/ml) relative to the mean control value (162.8 ± 63.9 pg/ml; P = 0.028). In view of the link between reduced ghrelin levels and upregulation of YKL-40 secretion observed in obese prepubertal children (17), it is possible that the decreased ghrelin concentration in the blood of AIS patients classified in the FG1 endophenotype could contribute in part to the elevation of plasma YKL-40 in this AIS subgroup.

Table 2C: Demographic and clinical data for a subset of AIS patients and control subjects tested for unacylated ghrelin

EXAMPLE 5: GENETIC VARIANTS IN CHI3L1 GENE AND PLASMA YKL-40 LEVELS

To determine whether the CHI3L1 genotypes affected circulating YKL-40 levels, 12 SNPs were analyzed in the same cohort involved in plasma YKL-40 measurement. The results showed that eight SNPs were significantly associated with the plasma YKL-40 levels in the AIS patients (Table 3), including rs55700740 (P = 3.8 × 10⁻⁵), rs946259 (P = 3.9 × 10⁻⁵), rs880633 (P = 3.8 × 10⁻⁵), rs1538372 (P = 5.9 x 10 s), rs4950881 (P = 6.1 × 10⁻⁴), rs946261 (P = 4.3 × 10⁻⁸), rs10920579 (P = 1.1 x 10 s), with the strongest association displayed by rs946262 (P = 6 × 10⁻¹²). By comparison, only two of these SNPs were associated with plasma YKL-40 levels in the healthy control group, namely rs1538372 (P = 5.7 × 10⁻⁴) and rs946261 (P = 0.0018), which is consistent with the fact that AIS patients showed higher plasma YKL-40 levels than controls (P = 0.002).

Table 3: Prevalence of the studied SNPs in CHI3L1 gene and their associations with plasma YKL-40 levels in AIS patients and healthy controls

The patient samples were further divided into two groups, non-severe and severe, based on their scoliosis phenotype. The same eight SNPs previously associated with plasma YKL-40 levels were still significantly associated with the non-severe cases: rs55700740 (P = 0.00063), rs946259 (P = 0.00064), rs880633 (P = 0.00063), rs1538372 (P = 1.9 × 10⁻⁵), rs4950881 (P = 0.00011), rs946261 (P = 2.2 × 10⁻⁵), rs10920579 (P = 6.5 × 10⁻⁶), while the SNP rs946262 still showed the most significant association (P = 6.2 × 10⁻⁹). By comparison, in the severe group, only three SNPs showed marginal associations: rs55700740 (P = 0.008), rs946259 (P = 0.008), rs880633 (P = 0.008) (Table 4).

Table 4: Prevalence of the studied SNPs in CHI3L1 gene and their associations with plasma YKL-40 levels in function of scoliosis severity

Columns 2 to 4: AIS severe (Cobb ≥ 40°), N=132. Columns 5 to 7: AIS non-severe (Cobb < 40°), N=227.

SNP / genotype | N (%) | Mean YKL-40 (ng/ml) | P value | N (%) | Mean YKL-40 (ng/ml) | P value
rs55700740 | | | 0.008 | | | 0.0006
CC | 38 (28.8%) | 28.67 | | 57 (25.1%) | 26.5 |
CA | 68 (51.5%) | 29.53 | | 107 (47.1%) | 34.59 |
AA | 26 (19.7%) | 40.17 | | 63 (27.8%) | 38 |
rs7542294 | | | 0.651 | | | 0.69
GG | 93 (70.5%) | 31.48 | | 152 (67%) | 33.53 |
GA | 39 (29.5%) | 31.87 | | 67 (29.5%) | 34.02 |
AA | 0 | 30.26 | | 8 (3.5%) | 31.65 |
rs946259 | | | 0.008 | | | 0.0006
GG | 37 (28%) | 28.25 | | 57 (25.2%) | 26.4 |
GA | 68 (51.5%) | 29.83 | | 106 (46.9%) | 34.55 |
AA | 27 (20.5%) | 39.42 | | 63 (27.9%) | 37.97 |
rs880633 | | | 0.008 | | | 0.0006
GG | 27 (20.5%) | 39.42 | | 63 (27.8%) | 37.97 |
GA | 68 (51.5%) | 29.83 | | 107 (47.1%) | 34.51 |
AA | 37 (28%) | 28.25 | | 57 (25.1%) | 26.4 |
rs1538372 | | | 0.012 | | | 2 × 10⁻⁵
GG | 50 (37.9%) | 37 | | 100 (45%) | 37.69 |
GA | 59 (44.7%) | 28.33 | | 97 (43.7%) | 31.84 |
AA | 23 (17.4%) | 28.68 | | 25 (11.3%) | 22.31 |
rs4950881 | | | 0.012 | | | 0.0001
GG | 17 (12.9%) | 25.06 | | 15 (6.6%) | 20.78 |
GA | 52 (39.4%) | 29.85 | | 80 (35.2%) | 32.01 |
AA | 63 (47.7%) | 34.77 | | 132 (58.1%) | 35.47 |
rs10399805 | | | 0.695 | | | 0.353
GG | 99 (75.6%) | 31.26 | | 165 (72.7%) | 33.1 |
GA | 30 (22.9%) | 31.5 | | 56 (24.7%) | 34.84 |
AA | 2 (1.5%) | 20.05 | | 6 (2.6%) | 33.63 |
rs6691378 | | | 0.626 | | | 0.364
GG | 100 (75.8%) | 31.60 | | 165 (72.7%) | 33 |
GA | 30 (22.7%) | 31.49 | | 56 (24.7%) | 35.5 |
AA | 2 (1.5%) | 20.05 | | 6 (2.6%) | 31.72 |
rs946261 | | | 0.079 | | | 2.2 × 10⁻⁵
AA | 45 (34.1%) | 34.52 | | 87 (38.3%) | 40.34 |
AG | 67 (50.8%) | 30.62 | | 102 (44.9%) | 31.68 |
GG | 20 (15.2%) | 27.11 | | 38 (16.7%) | 24.64 |
rs946262 | | | 0.023 | | | 6.23 × 10⁻⁹
GG | 78 (60%) | 34.67 | | 149 (65.9%) | 38.31 |
GA | 47 (36.2%) | 27.93 | | 69 (30.5%) | 25.43 |
AA | 5 (3.8%) | 26.35 | | 8 (3.5%) | 13.36 |
rs116415868 | | | 0.921 | | | 0.619
GG | 131 (99.2%) | 31.49 | | 223 (98.2%) | 33.69 |
GA | 1 (0.8%) | 29.94 | | 4 (1.8%) | 30.47 |
rs10920579 | | | 0.08 | | | 6.5 × 10⁻⁶
GG | 85 (64.4%) | 33.26 | | 164 (72.2%) | 36.91 |
GA | 41 (31.1%) | 29.09 | | 57 (25.1%) | 25.6 |
AA | 6 (4.5%) | 23.49 | | 6 (2.6%) | 13.69 |

Similarly, the same analyses were performed separately for each of the three biological endophenotype groups. Significant associations with plasma YKL-40 levels were found in the FG2 and FG3 endophenotypes, such as rs946261 (P = 0.0001 and P = 0.00015, respectively), rs946262 (P = 7 × 10⁻⁶ and P = 1.5*10-', respectively), and rs10920579 (P = 0.0013 and P = 2.2 × 10⁻⁵, respectively). Other SNPs showed specific associations only with AIS patients classified in the FG2 endophenotype: rs55700740 (P = 0.00052), rs946259 (P = 0.00069), rs880633 (P = 0.000689), rs1538372 (P = 0.000634) (Table 5A). Of note, the previously reported SNPs rs450928 and rs10399931, known to modulate YKL-40 levels (24, 25), were not included in the initial analysis as they were not assayed in the SNP genotyping array. To address this issue, a targeted sequencing approach (Sanger method) was performed using a limited subgroup of AIS patients (Table 5B) producing very high circulating YKL-40 levels (>100 ng/ml) and considered as non-severely affected (mean Cobb angle = 21°). Neither SNP was associated with plasma YKL-40 levels in this subgroup, and no additional common or rare variants were detected in the CHI3L1 gene, its proximal promoter and 3'UTR regions among the patients sequenced.

Table 5A: Prevalence of the studied SNPs in CHI3L1 gene and their associations with plasma YKL-40 levels in function of AIS biological endophenotypes

Columns 2 to 4: FG1 (N=146). Columns 5 to 7: FG2 (N=240).

SNP / genotype | N (%) | Mean YKL-40 (ng/ml) | P value | N (%) | Mean YKL-40 (ng/ml) | P value
rs55700740 | | | 0.186 | | | 0.0005
CC | 27 (18.5%) | 27.92 | | 65 (27.3%) | 25.12 |
CA | 78 (53.4%) | 34.18 | | 123 (51.7%) | 33 |
AA | 41 (28.1%) | 37.47 | | 50 (21%) | 40.16 |
rs7542294 | | | 0.613 | | | 0.68
GG | 90 (61.6%) | 32.79 | | 170 (70.8%) | 32.45 |
GA | 53 (36.3%) | 36.5 | | 60 (25%) | 32.02 |
AA | 3 (2.1%) | 28.26 | | 10 (4.2%) | 28.7 |
rs946259 | | | 0.19 | | | 0.0007
GG | 27 (18.6%) | 27.92 | | 65 (27.1%) | 25.1 |
GA | 77 (53.1%) | 34.33 | | 123 (51.2%) | 32.84 |
AA | 41 (28.3%) | 37.47 | | 52 (21.7%) | 39.69 |
rs880633 | | | 0.19 | | | 0.0007
GG | 41 (28.1%) | 37.47 | | 52 (21.7%) | 39.69 |
GA | 78 (53.4%) | 34.18 | | 123 (51.2%) | 32.84 |
AA | 27 (18.5%) | 27.92 | | 65 (27.1%) | 25.1 |
rs1538372 | | | 0.045 | | | 0.0006
GG | 79 (54.5%) | 38.53 | | 83 (35.6%) | 38.47 |
GA | 56 (38.6%) | 29.2 | | 123 (52.8%) | 30.31 |
AA | 10 (6.9%) | 26.51 | | 27 (11.6%) | 22.04 |
rs4950881 | | | 0.079 | | | 0.016
GG | 4 (2.7%) | 16.66 | | 17 (7.1%) | 21.46 |
GA | 51 (34.9%) | 30.39 | | 94 (39.2%) | 30.14 |
AA | 91 (62.3%) | 36.79 | | 129 (53.8%) | 35.08 |
rs10399805 | | | 0.436 | | | 0.93
GG | 95 (65.1%) | 32.37 | | 177 (74.1%) | 32.15 |
GA | 49 (33.6%) | 37.8 | | 53 (22.2%) | 33.3 |
AA | 2 (1.4%) | 23.54 | | 9 (3.8%) | 28.4 |
rs6691378 | | | 0.378 | | | 0.972
GG | 95 (65.1%) | 32.19 | | 178 (74.2%) | 31.98 |
GA | 49 (33.6%) | 38.17 | | 52 (21.7%) | 33.53 |
AA | 2 (1.4%) | 23.54 | | 10 (4.2%) | 28.44 |
rs946261 | | | 0.133 | | | 0.0001
AA | 54 (37%) | 36.22 | | 77 (32.1%) | 40.92 |
AG | 70 (47.9%) | 35.52 | | 115 (47.9%) | 29.41 |
GG | 22 (15.1%) | 23.12 | | 48 (20%) | 25.33 |
rs946262 | | | 0.03 | | | 7 × 10⁻⁶
GG | 103 (70.5%) | 36.95 | | 138 (57.7%) | 37.87 |
GA | 39 (26.7%) | 28 | | 93 (38.9%) | 25.48 |
AA | 4 (2.7%) | 14.9 | | 8 (3.3%) | 14.61 |
rs116415868 | | | 0.865 | | | 0.977
GG | 145 (99.3%) | 34.04 | | 235 (97.9%) | 32.17 |
GA | 1 (0.7%) | 29.31 | | 5 (2.1%) | 32.5 |
rs10920579 | | | 0.036 | | | 0.0013
GG | 109 (74.7%) | 36.55 | | 162 (67.5%) | 35.37 |
GA | 33 (22.6%) | 27.63 | | 71 (29.6%) | 26.5 |
AA | 4 (2.7%) | 14.9 | | 7 (2.9%) | 15.14 |

FG3 (N=286)

SNP / genotype | N (%) | Mean YKL-40 (ng/ml) | P value
rs55700740 | | | 0.035
CC | 66 (23.1%) | 28.33 |
CA | 153 (53.5%) | 34.2 |
AA | 67 (23.4%) | 36.88 |
rs7542294 | | | 0.98
GG | 196 (68.5%) | 33.66 |
GA | 81 (28.3%) | 32.85 |
AA | 9 (3.1%) | 36.07 |
rs946259 | | | 0.03
GG | 63 (22%) | 27.94 |
GA | 156 (54.5%) | 34.24 |
AA | 67 (23.4%) | 36.88 |
rs880633 | | | 0.03
GG | 67 (23.4%) | 36.88 |
GA | 156 (54.5%) | 34.24 |
AA | 63 (22%) | 27.94 |
rs1538372 | | | 0.015
GG | 125 (43.7%) | 36.35 |
GA | 128 (44.8%) | 32.88 |
AA | 33 (11.5%) | 25.24 |
rs4950881 | | | 0.091
GG | 23 (8.1%) | 23.82 |
GA | 97 (34%) | 33.66 |
AA | 165 (57.9%) | 34.53 |
rs10399805 | | | 0.591
GG | 208 (73.2%) | 33.07 |
GA | 69 (24.3%) | 33.16 |
AA | 7 (2.5%) | 42.57 |
rs6691378 | | | 0.599
GG | 209 (73.3%) | 33.25 |
GA | 70 (24.6%) | 34.21 |
AA | 6 (2.1%) | 39.2 |
rs946261 | | | 0.0001
AA | 106 (37.1%) | 39.89 |
AG | 140 (49%) | 30.95 |
GG | 40 (14%) | 25.42 |
rs946262 | | | 1.5*10-'
GG | 195 (68.7%) | 38.01 |
GA | 79 (27.8%) | 25.23 |
AA | 10 (3.5%) | 17.06 |
rs116415868 | | | 0.583
GG | 276 (96.8%) | 33.66 |
GA | 9 (3.2%) | 29.42 |
rs10920579 | | | 2.3 × 10⁻⁵
GG | 206 (72.3%) | 36.83 |
GA | 70 (24.6%) | 25.27 |
AA | 9 (3.2%) | 17.64 |

Table 5B: Demographic and clinical data for AIS patients identified as YKL-40 overproducers

rTlL: right thoracic and left lumbar; rT: right thoracic; lTrTlL: left thoracic, right thoracic and left lumbar

EXAMPLE 6: ASSOCIATION OF CHI3L1 GENE VARIANTS IN FUNCTION OF AIS BIOLOGICAL ENDOPHENOTYPE CLASSIFICATION

None of the individual SNPs showed a significant association with the disease when IS cases were compared to the matched control group. However, rs1538372 was the only SNP showing a significant difference when AIS biological endophenotypes were compared. Indeed, this SNP was more strongly associated with AIS patients classified in endophenotype FG1 when compared to AIS cases classified in FG2 after Bonferroni correction (P = 0.012) (Table 6). Likewise, none of the individual SNPs showed any significant difference in function of scoliosis severity. However, upon sex separation, two SNPs showed a significant difference between the severe AIS and control males: rs946262 and rs10920576 (P = 0.012 and P = 0.005, respectively) after Bonferroni correction.
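
For illustration, a Bonferroni correction multiplies each unadjusted P value by the number of tests in the family and caps the result at 1. The Python sketch below uses a hypothetical unadjusted P value of 0.001 and assumes, purely for the example, a family of 12 tests (one per SNP); neither the function name nor these inputs are taken from the present disclosure.

```python
def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P value: p multiplied by the number of tests, capped at 1."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


adjusted = bonferroni_adjust(0.001, 12)   # approximately 0.012
capped = bonferroni_adjust(0.2, 12)       # 1.0, since 0.2 * 12 exceeds 1
```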

Table 6. Comparison between AIS biological endophenotypes for the associations of each of the 12 SNPs

(P values)

SNP            FG1 vs.    FG2 vs.    FG3 vs.    FG1 vs.    FG1 vs.    FG2 vs.
               Control    Control    Control    FG2        FG3        FG3

rs55700740     0.899      0.099      0.437      Ό89        0.415      0.509
rs7542294      0.027      0.195      0.248      0.042      0.232      0.606
rs946259       0.918      0.082      0.338      0.108      0.485      0.411
rs880633       0.908      0.084      0.349      0.107      0.486      0.411
rs1538372      0.186      0.133      0.832      0.001      0.072      0.15
rs4950881      0.368      0.706      0.485      Ό99        0.09       0.472
rs10399805     0.058      0.153      0.434      0.028      0.112      0.626
rs6691378      0.027      0.094      0.432      Ό17        0.142      0.319
rs946261       0.368      0.014      0.444      0.388      0.95       0.153
rs946262       0.773      0.097      0.672      Ό36        0.902      0.025
rs116415868    0.644      0.27       0.08       0.267      0.095      0.315
rs10920579     0.798      0.347      0.729      0.306      0.895      0.451

EXAMPLE 7: HAPLOTYPE ANALYSIS OF THE CHI3L1 VARIANTS

Strong linkage disequilibrium was found among the 12 SNPs, as shown in FIG. 2. It was next investigated whether certain haplotypes were associated with plasma YKL-40 levels. Evidence of strong associations of haplotypes with plasma YKL-40 levels was found (FIG. 5). For instance, the haplotype A-A-G-G-G-G (rs880633|rs1538372|rs4950881|rs10399805|rs6691378|rs946261), which was present in 15% of the analyzed subjects, showed a strong negative association with plasma YKL-40 levels (P = 2 × 10⁻⁹ and coefficient = -9.56). By further stratifying the samples into three biological endophenotypes (sample sizes shown in Table 7), it was found that this haplotype showed the strongest association in endophenotype FG2 (P = 9.9 × 10⁻⁶ and coefficient = -13.53), relative to endophenotypes FG1 (P = 0.0044 and coefficient = -11.42) and FG3 (P = 0.031 and coefficient = -19.14) (FIGs. 6, 7, and 8). Of note, it showed a stronger association in females (P = 1.6 × 10⁻⁷ and coefficient = -10.08) than in males (P = 0.0021 and coefficient = -9.01) (FIGs. 9 and 10). In the endophenotype FG3 group, it was also found that the haplotype G-G-A-G-G-A (rs880633|rs1538372|rs4950881|rs10399805|rs6691378|rs946261), which was present in 48% of the cases, showed a positive correlation with plasma YKL-40 levels (P = 7.6 × 10⁻⁶ and coefficient = 36; FIG. 8). In females, haplotype A-A-G-G-G-G (rs10399805|rs6691378|rs946261|rs946262|rs116415868|rs10920579) showed a significant association with plasma YKL-40 levels (P = 2.8 × 10⁻⁶ and coefficient = -8.43; FIG. 9). All these results support a strong association of certain haplotypes with plasma YKL-40 levels in relation to the risk of disease progression.

Table 7: Sample sizes for haplotype analyses of plasma YKL-40 levels

EXAMPLE 8: FUNCTIONAL ASSESSMENT OF THE ROLE OF YKL-40 IN AIS PATHOGENESIS

A differential Gi-coupled receptor signaling dysfunction was previously demonstrated in primary osteoblasts and other cell types obtained from AIS patients, which led to the identification of three biological endophenotypes associated with AIS as measured by CDS assay (8, 9). To examine the possible functional impact of increased plasma YKL-40 levels, primary osteoblasts from three scoliotic patients were screened for their response to oxymetazoline (10 µM), a selective GiPCR agonist activating the α1-adrenergic receptor normally coupled to Gi proteins, as a cellular readout. Exposure to rOPN reduced α1-adrenergic receptor signaling, while treatment with purified YKL-40 partially or completely rescued the signaling dysfunction induced by rOPN, providing evidence that increasing YKL-40 levels could attenuate scoliosis severity (FIG. 3).

In conclusion, a positive correlation was found between plasma YKL-40 levels and the non-severe form of scoliosis, as well as with male patients classified in AIS endophenotype FG1, who are less prone to spinal deformity progression. A negative correlation was observed between circulating ghrelin levels and plasma YKL-40 levels in AIS patients classified in the FG1 endophenotype but not in the other two AIS subgroups. Data presented herein also show associations between certain SNPs and haplotypes of the CHI3L1 gene and spinal deformity severity and/or plasma YKL-40 levels. Positive associations between SNPs in the CHI3L1 gene and some AIS endophenotypes (FG2 and FG3) were observed. Finally, functional in vitro analysis provided compelling evidence that elevation of YKL-40 could reduce the severity of scoliosis by interfering with the Gi-signaling dysfunction induced by osteopontin (OPN) in AIS. The role of OPN in scoliosis development in humans and different animal models has more recently been reported by different groups (27-29). Collectively, the findings reported herein suggest that YKL-40 could act as a protective disease-modifying factor in the context of AIS and may be used in the prevention or treatment of AIS (e.g., to reduce scoliosis progression or the risk of developing a severe scoliosis).

EXAMPLE 9: MATERIALS AND METHODS

Study populations. A total of 804 French-Canadian AIS patients and 239 age- and sex-matched healthy controls were enrolled between January 2008 and December 2012 in three pediatric spine centers in Montreal and surrounding schools. All participants are residents of Quebec and of European descent. Each AIS patient was clinically examined by an orthopedic surgeon at the participating hospitals. The full medical history of each participant was collected to assess for other conditions, including YKL-40 related diseases (e.g., asthma). No other disease was found at the time of sample collection. All healthy control subjects were screened by an orthopedic surgeon using the Adams forward-bending test with a scoliometer. Any children with an apparent spinal curvature or a family history of scoliosis were excluded from the control cohort. Ancestry and relatedness testing were performed by applying, respectively, EIGENSTRAT (Principal Component Analysis or PCA, analysis of self-reported ethnicity) and PLINK identity-by-descent (IBD). Self-reported French-Canadian individuals falling outside the main core cluster were removed from further analyses. Another analysis was performed on the main core cluster to look for any remaining population substructures. Using the IBD approach, ancestral outliers and related samples (pi_hat > 0.1875) were removed prior to SNP analyses. Upon classification of the patients based on their spinal deformity severity at skeletal maturity, 227 AIS patients were considered non-severe cases (Cobb angle 10°-39°) at the time of measuring the YKL-40 levels, while 132 patients were considered severe cases (Cobb angle ≥ 40°).
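The two cohort filters described above (relatedness by pi_hat and severity by Cobb angle) can be illustrated with a minimal Python sketch. The function names, the sample identifiers, and the choice of which member of a related pair to drop are assumptions made for illustration only; the source states the thresholds but not these details.

```python
def classify_severity(cobb_angle_deg):
    """Map a Cobb angle (degrees) to the severity classes used in the text.

    Assumption: angles below 10 degrees fall outside both classes and
    return None; angles from 10 up to (but excluding) 40 are non-severe.
    """
    if cobb_angle_deg >= 40:
        return "severe"
    if cobb_angle_deg >= 10:
        return "non-severe"
    return None


def drop_related(samples, pi_hat_pairs, threshold=0.1875):
    """Remove one sample from each pair whose pi_hat exceeds the threshold.

    Assumption: the second member of each related pair is the one removed.
    """
    removed = set()
    for first, second, pi_hat in pi_hat_pairs:
        if pi_hat > threshold and first not in removed and second not in removed:
            removed.add(second)
    return [s for s in samples if s not in removed]


# Hypothetical example data, not taken from the study.
kept = drop_related(["s1", "s2", "s3"], [("s1", "s2", 0.25), ("s1", "s3", 0.05)])
print(kept, classify_severity(25), classify_severity(40))
```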

Isolation and classification of patient osteoblasts. Osteoblasts were derived from bone specimens removed from affected vertebrae (number varied from T3 to L4) as part of corrective surgery for severe scoliosis (AIS cases) and from trauma cases (controls). Under sterile conditions, bone was fragmented with a bone cutter, then incubated in 1x Dulbecco's Modification of Eagle's Medium (aDMEM) (Wisent Inc, Montreal QC, Canada) supplemented with 10% HyClone™ Fetal Bovine Serum (FBS) (Thermo Fisher Scientific, Logan UT, USA) and 1% antibiotic (Invitrogen), at 37°C with 5% CO2 until confluent (10-14 days). At confluence, osteoblasts were isolated by trypsinization and frozen in liquid nitrogen. Patient endophenotypes were classified from osteoblast cultures using cellular dielectric spectroscopy (CDS), as previously described in (8) and (9).

Extraction of RNA. Total RNA was extracted using TRIzol™ (Invitrogen, CA, USA) according to the manufacturer's instructions. After washing with 70% ethanol and drying at room temperature, RNA was resuspended in 50 µl RNase-free water (Qiagen, Canada). An aliquot of total RNA diluted in RNase-free water was used for quantity and integrity checking (using NanoDrop™ Version 3.7.1), and the remaining sample was stored at -80°C until gene expression analysis. All samples maintained a 28S/18S rRNA ratio of 1.5 or greater.

Microarray. Genome-wide expression patterns were assessed on microarrays performed in triplicate using the Affymetrix GeneChip® Human Exon 1.0 ST array at the Centre d'Innovation de Genome Quebec, Montreal. All protocols were conducted as described in the Affymetrix GeneChip Expression Analysis technical manuals. Analysis of microarray data was conducted using GeneSifter software (www.genesifter.net/web/DC). Raw data were normalized using Robust Multi-array Average (RMA) (35), and these derived values were log2-transformed. Data were filtered using a Kruskal-Wallis test followed by a false discovery rate (FDR) correction according to Benjamini and Hochberg (36), and a threshold of at least a 3-fold differential expression. Probes were considered significant if their FDR-corrected p value was ≤ 0.05. This filtered dataset was explored using clustering algorithms, and candidate genes were selected for single-gene analysis via real-time quantitative PCR.
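The probe filter described above (Benjamini-Hochberg FDR correction combined with a minimum 3-fold change) can be sketched as follows. This is an illustrative Python sketch, not the GeneSifter implementation; the probe identifiers, the dictionary keys and the input values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_probes(probes, alpha=0.05, min_fold=3.0):
    """Keep probes with adjusted p <= alpha and |log2 fold change| >= log2(min_fold)."""
    adjusted = benjamini_hochberg([p["p"] for p in probes])
    cutoff = math.log2(min_fold)
    return [p["id"] for p, q in zip(probes, adjusted)
            if q <= alpha and abs(p["log2_fc"]) >= cutoff]


# Hypothetical probes: raw p value and log2 fold change per probe.
probes = [
    {"id": "probe_a", "p": 0.005, "log2_fc": 2.0},
    {"id": "probe_b", "p": 0.01, "log2_fc": 1.0},
    {"id": "probe_c", "p": 0.03, "log2_fc": -1.7},
    {"id": "probe_d", "p": 0.9, "log2_fc": 3.0},
]
print(select_probes(probes))
```

In this hypothetical input, probe_b fails the fold-change cutoff and probe_d fails the FDR cutoff, so only probe_a and probe_c are retained.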

Quantitative Real-Time PCR. All materials and methods followed the MIQE (Minimum Information for Publication of Quantitative Real-Time PCR Experiments) guidelines (37). The CHI3L1 gene, selected from the filtered microarray dataset, was further validated by real-time quantitative PCR (RT-qPCR) using PerfeCTa SYBR Green SuperMix™ (Quanta Biosciences) on a 7900HT Real-Time PCR system (Applied Biosystems), according to the manufacturer's protocols. CHI3L1 primers were: Forward primer: 5'-CAGGAAAGCGTCAAAAGCAAGGTG-3' (SEQ ID NO:28) and Reverse primer: 5'-GAGTGCATCCTTGATGGCATTGGT-3' (SEQ ID NO:29). For each subject, 1 µg of mRNA was reverse transcribed into cDNA using the Thermoscript™ RT-PCR system (Invitrogen CA, USA), according to the manufacturer's protocol. A 1 in 10 dilution of cDNA was used for RT-qPCR. Each cDNA sample was loaded in triplicate for every gene tested on an optical 384-well reaction plate, and beta-actin was used as an endogenous control for normalizing gene expression, based on Stephens et al., 2011 (38). The fold-change in gene expression was determined by the program RQ manager (39). RQ values were log2-transformed, and the Levene test was used to check for equal variances between groups. If variances were equal, either an ANOVA or a t-test was used; if not, the Wilcoxon/Kruskal-Wallis test was used to examine whether average expression levels differed among clusters. Statistical analyses were performed using JMP 9 software (SAS Institute Inc 2012).
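For orientation, the relative quantification step can be sketched with the 2^-ΔΔCT calculation of reference (39), using beta-actin as the endogenous control. This is an illustrative Python sketch and not the RQ manager software; the CT values and the choice of calibrator sample are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean of the replicate CT values."""
    return sum(values) / len(values)


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_calibrator, ct_ref_calibrator):
    """Relative quantity (RQ) by the 2^-ddCT method.

    Each argument is a list of replicate CT values (triplicates here).
    """
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical triplicate CT values: CHI3L1 (target) and beta-actin (reference).
rq = fold_change_ddct(
    ct_target_sample=[24.0, 24.2, 23.8],
    ct_ref_sample=[18.0, 18.1, 17.9],
    ct_target_calibrator=[26.0, 26.0, 26.0],
    ct_ref_calibrator=[18.0, 18.0, 18.0],
)
log2_rq = math.log2(rq)  # log2 transform applied before the variance tests
print(round(rq, 3), round(log2_rq, 3))
```

With these hypothetical inputs, ΔCT is 6 for the sample and 8 for the calibrator, so ΔΔCT is -2 and the target is about 4-fold higher in the sample.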

Measuring plasma YKL-40 and ghrelin levels. Peripheral blood samples were collected in EDTA-treated tubes and then centrifuged. Plasma samples were collected, aliquoted and stored at -80°C until thawed and analyzed. The concentration of plasma YKL-40 was measured using an enzyme-linked immunosorbent assay (ELISA) kit (Quidel, San Diego, CA, USA) according to the protocol provided by the manufacturer. Unacylated ghrelin was measured in the plasma of a subgroup of AIS patients exhibiting severe scoliosis and of matched control subjects using an EIA kit (Cayman Chemicals, Ann Arbor, MI, USA) according to the manufacturer's specifications. Both assays were performed in duplicate and the mean values were taken for the analysis. The optical density was measured at 450 nm using a DTX880 microplate reader (Beckman Coulter, Brea, California, USA).

Genotyping of SNPs in the CHI3L1 gene. Genomic DNA samples were derived from the peripheral blood of the subjects of the same cohort using the PureLink® Genomic DNA kit (Thermo Fisher Scientific, Waltham, Massachusetts, USA). They were then genotyped for 12 SNPs in the region of the CHI3L1 gene. Part of the cohort was genotyped with the Illumina Human Omni 2.5M Bead Chip as part of a GWAS study previously done by our team at the McGill University and Genome Quebec Innovation Centre (31). These SNPs were chosen because their genotypes were already available for most of the cohort used for biochemical analysis. The 12 SNPs present on the Illumina Human Omni 2.5M Bead Chip were therefore further genotyped in the rest of the cohort using multiplex PCR at the McGill University and Genome Quebec Innovation Centre. Multiplex PCR of the 12 SNPs was performed using standard procedures with 20 ng of template genomic DNA and HotStarTaq™ DNA polymerase enzyme (QIAGEN). PCR reactions were run on the QIAxcel™ (QIAGEN) to assess the amplification, followed by single base extension using iPlex™ Thermo Sequenase. Genotypes were determined by MALDI-TOF mass spectrometry and data were analyzed using MassARRAY Typer Analyzer software.

Sanger sequencing. Sanger sequencing was performed at the Genome Quebec Innovation Centre at McGill University on a limited subgroup of the AIS patients (n=7) producing very high circulating YKL-40 levels (>100 ng/ml) and considered non-severely affected (Table 5B). The primers were designed using the program Primer3. Sanger sequence chromatograms were analyzed using Mutation Surveyor (SoftGenetics, Inc.).

Cellular dielectric spectroscopy (CDS) assay. Functional effects of YKL-40 were investigated by a CDS assay as previously described (11, 13). In brief, primary osteoblasts derived from bone fragments obtained intraoperatively from AIS patients and control subjects (trauma cases) were seeded into the CellKey™ standard 96-well microplate at a density of 10 × 10⁴ cells per well and incubated in standard conditions (37°C / 5% CO2) with 0.5 µg/ml of purified recombinant OPN (rOPN) or the vehicle (saline buffer) for 18 h prior to stimulation. After overnight incubation, cells were directly stimulated with oxymetazoline (10 µM) (Tocris Chemical Co. St. Louis, MO, USA), a specific ligand activating the α1-adrenergic receptor normally coupled to Gi proteins. The same test was done with and without treatment of the cells with recombinant YKL-40 (rYKL-40) to assess its effect.

Phenotypic analyses. To compare patients and controls, and to compare among different sub-classifications of patients and controls, an ANOVA test was used with the log-transformed plasma YKL-40 level as the dependent variable and the phenotype and sex as independent variables, with age as a covariate. A P value (two-sided) < 0.05 was considered statistically significant.

Individual SNP association analyses. The allele frequency of each SNP was calculated separately for each endophenotype subclassification of the patients and controls. Individual SNP association analyses were performed by comparing the allele frequencies of each SNP between each endophenotype pair of the patients and between patients and controls. The significance was calculated using Fisher's exact test (two-sided). The software SPSS v.23 was used for these statistical analyses. The quantitative association analysis of the plasma YKL-40 levels with each SNP was performed using the 'qassoc' option in PLINK v1.09 (32). The presented P values have been corrected for multiple comparisons using Bonferroni correction.
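The allele-based comparison described above can be sketched in Python: allele counts are derived from genotype counts, two groups are compared with a two-sided Fisher's exact test on the 2x2 allele table, and the P value is Bonferroni-corrected for 12 SNPs. The study used SPSS and PLINK; this stand-alone sketch only illustrates the arithmetic, and the second group's allele counts are hypothetical.

```python
from math import comb


def allele_counts(n_hom_first, n_het, n_hom_second):
    """Allele counts from genotype counts (each subject carries two alleles)."""
    return 2 * n_hom_first + n_het, 2 * n_hom_second + n_het


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (same margins)
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Genotype counts for rs1538372 in the whole cohort (GG, GA, AA) from the table above.
g_count, a_count = allele_counts(125, 128, 33)
print(g_count, a_count)  # 378 G alleles and 194 A alleles (572 in total)

# Hypothetical allele counts for a second group, for illustration only.
p_raw = fisher_exact_two_sided(g_count, a_count, 60, 60)
print(p_raw, bonferroni(p_raw, 12))
```

As a consistency check on the reported figures, an uncorrected P value of 0.001 multiplied by 12 tests gives the corrected value of 0.012.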

Haplotype association analysis. The linkage disequilibrium blocks of the 12 SNPs were estimated based on the genotype data using Haploview (33). The haplotypes were inferred using UNPHASED (34) with a sliding window of up to six SNPs. Association analyses were carried out between the inferred haplotypes and the YKL-40 level using an in-house R program. Specifically, a linear regression model was used for the haplotype associations with YKL-40 levels. The association analyses were also performed on various subsets of the samples, such as males, females, and endophenotypes. Subgroups with no more than three samples and haplotypes with a frequency < 0.01 were removed from the analyses. To correct for multiple testing, the experiment-wise significance threshold P value was calculated based on the total number of estimated independent linkage disequilibrium blocks. In this study, as no more than three linkage disequilibrium blocks were observed among the 12 SNPs (FIG. 2), three was used as the number of independent tests. Significant associations were reported only when the original P value was < 0.0167 (corresponding to a corrected P value < 0.05).
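The regression and threshold logic can be sketched as follows. The in-house R program is not disclosed, so this Python sketch rests on assumptions: each subject is coded by the number of copies (0, 1 or 2) of the haplotype carried, the regression coefficient is the ordinary least-squares slope, and all data values are hypothetical.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def experiment_wise_threshold(alpha=0.05, n_blocks=3):
    """Per-test threshold when correcting for n_blocks independent LD blocks."""
    return alpha / n_blocks


def is_reported(p_value, alpha=0.05, n_blocks=3):
    """True if the uncorrected P value passes the experiment-wise threshold."""
    return p_value < experiment_wise_threshold(alpha, n_blocks)


# Hypothetical data: haplotype copies per subject and plasma YKL-40 (ng/ml).
copies = [0, 0, 1, 1, 2, 2]
ykl40 = [40.0, 38.0, 30.0, 28.0, 20.0, 18.0]
slope, intercept = ols_slope_intercept(copies, ykl40)
print(slope, intercept)                        # negative slope: fewer ng/ml per copy
print(round(experiment_wise_threshold(), 4))   # 0.05 / 3, i.e. about 0.0167
print(is_reported(0.0044), is_reported(0.02))
```

A negative slope in this coding corresponds to the negative coefficients reported for the A-A-G-G-G-G haplotype; the slope is the estimated change in plasma YKL-40 per additional copy of the haplotype.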

The scope of the claims should not be limited by the preferred embodiments set forth in the examples, but should be given the broadest interpretation consistent with the description as a whole.

REFERENCES

1. Cheng JC, Castelein RM, Chu WC, Danielsson AJ, Dobbs MB, Grivas TB, Gurnett CA, Luk KD, Moreau A, Newton PO, Stokes IA, Weinstein SL, Burwell RG. Adolescent idiopathic scoliosis. Nature reviews. Disease primers. 2015; 1: 1-20.

2. Donzelli S, Zaina F, Lusini M, Minnella S, Negrini S. In favour of the definition "adolescents with idiopathic scoliosis": juvenile and adolescent idiopathic scoliosis braced after ten years of age, do not show different end results. SOSORT award winner 2014. Scoliosis. 2014; 9(1): 7.

3. Asher MA, Burton DC. Adolescent idiopathic scoliosis: natural history and long term treatment effects. Scoliosis. 2006; 1(1): 2.

4. Tang NL, Yeung HY, Hung VW, Di Liao C, Lam TP, Yeung HM, et al. Genetic epidemiology and heritability of AIS: A study of 415 Chinese female patients. Journal of orthopaedic research. 2012; 30(9): 1464-1469.

5. Grauers A, Rahman I, Gerdhem P. Heritability of scoliosis. European Spine Journal. 2012; 21 (6): 1069-1074.

6. Gorman KF, Julien C, Moreau A. The genetic epidemiology of idiopathic scoliosis. European Spine Journal. 2012; 21(10): 1905-1919.

7. Azeddine B, Letellier K, Moldovan F, Moreau A. Molecular determinants of melatonin signaling dysfunction in adolescent idiopathic scoliosis. Clinical orthopaedics and related research. 2007; 462: 45-52.

8. Akoume M-Y, Azeddine B, Turgeon I, Franco A, Labelle H, Poitras B, et al. Cell-based screening test for idiopathic scoliosis using cellular dielectric spectroscopy. Spine. 2010; 35(13): E601-E8.

9. Akoume M-Y, Franco A, Moreau A. Cell-based assay protocol for the prognostic prediction of idiopathic scoliosis using cellular dielectric spectroscopy. JoVE (Journal of Visualized Experiments). 2013; (80): e50768-e.

10. John B, Lewis KR. Chromosome variability and geographic distribution in insects. Science. 1966; 152: 711-21.

11. Ertekin-Taner N. Gene expression endophenotypes: a novel approach for gene discovery in Alzheimer's disease. Molecular neurodegeneration. 2011; 6(1): 31.

12. Gottesman II, Gould TD. The endophenotype concept in psychiatry: etymology and strategic intentions. American Journal of Psychiatry. 2003; 160(4): 636-45.

13. Chan RC, Gottesman II. Neurological soft signs as candidate endophenotypes for schizophrenia: a shooting star or a Northern star? Neuroscience & Biobehavioral Reviews. 2008; 32(5): 957-71.

14. Johansen JS. Studies on serum YKL-40 as a biomarker in diseases with inflammation, tissue remodelling, fibroses and cancer. Dan Med Bull. 2006; 53(2): 172-209.

15. Johansen JS, Schultz NA, Jensen BV. Plasma YKL-40: a potential new cancer biomarker? Future Oncology. 2009; 5(7): 1065-82.

16. Huang K, Wu L. YKL-40: a potential biomarker for osteoarthritis. Journal of International Medical Research. 2009; 37(1): 18-24.

17. Tsuji T, Matsuyama Y, Natsume N, Hasegawa Y, Kondo S, Kawakami H, Yoshihara H, Iwata H. Analysis of chondrex (YKL-40, HC gp-39) in the cerebrospinal fluid of patients with spine disease. Spine (Phila Pa 1976). 2002; 27(7): 732-5.

18. Julien C, Gorman KF, Akoume M-Y, Moreau A. Towards a comprehensive diagnostic assay for scoliosis. Personalized Medicine. 10(1): 97-103 (2013).

19. Kyrgios I, Galli-Tsinopoulou A, Stylianou C. Ghrelin-leptin network influences serum chitinase 3-like protein 1 (YKL-40) levels in obese prepubertal children. Regulatory peptides. 2013; 183: 69-73.

20. de Gauzy JS, Gennero I, Delrous O, Salles J-P, Lepage B, Accadbled F. Fasting total ghrelin levels are increased in patients with adolescent idiopathic scoliosis. Scoliosis. 2015; 10(1): 33.

21. Yu HG, Zhang HQ, Zhou ZH, Wang YJ. High Ghrelin Level Predicts the Curve Progression of Adolescent Idiopathic Scoliosis Girls. Biomed Res Int. 2018: 9784083 (2018).

22. Ober C, Tan Z, Sun Y, Possick JD, Pan L, Nicolae R, et al. Effect of variation in CHI3L1 on serum YKL-40 level, risk of asthma, and lung function. New England Journal of Medicine. 2008; 358(16): 1682-91.

23. Nielsen KR, Steffensen R, Boegsted M, Baech J, Lundbye-Christensen S, Hetland ML, et al. Promoter polymorphisms in the chitinase 3-like 1 gene influence the serum concentration of YKL-40 in Danish patients with rheumatoid arthritis and in healthy subjects. Arthritis research & therapy. 2011; 13(3): R109.

24. Kjaergaard A, Johansen J, Bojesen S, Nordestgaard B. Role of inflammatory marker YKL-40 in the diagnosis, prognosis and cause of cardiovascular and liver diseases. Critical reviews in clinical laboratory sciences. 2016; 53(6): 396-408.

25. Zheng JL, Lu L, Hu J, Zhang RY, Zhang Q, Chen QJ, et al. Genetic polymorphisms in chitinase 3-like 1 (CHI3L1) are associated with circulating YKL-40 levels, but not with angiographic coronary artery disease in a Chinese population. Cytokine. 2011; 54(1): 51-5.

26. Aziz M, Wissing ML, Naver KV, Faber J, Skouby SO. Polycystic ovary syndrome and low grade inflammation with special reference to YKL-40. Gynecol Endocrinol. 30 (4): 311-315 (2014).

27. Sun G, Lam T, Bobby N, Yim P, Lee K, Moreau A, et al. High osteopontin plasma level associated with abnormal cortical bone mineral density in girls with adolescent idiopathic scoliosis. Stud Health Technol Inform. 2012; 176: 457.

28. Yadav MC, Huesa C, Narisawa S, Hoylaerts MF, Moreau A, Farquharson C, et al. Ablation of osteopontin improves the skeletal phenotype of Phospho1-/- mice. Journal of Bone and Mineral Research. 2014; 29(11): 2369-81.

29. Xie N, Li M, Wu T, Liu J, Wang B, Tang F. Does elevated osteopontin level play an important role in the development of scoliosis in bipedal mice? The Spine Journal. 2015; 15(7): 1660-4.

30. Hoo C, Gatam L. Role of Remodelling in Adolescent Idiopathic Scoliosis: an Evaluation of Osteopontin Level. The Journal of Indonesian Orthopaedic. 2012; 40(2).

31. Tang QL, Julien C, Eveleigh R, Bourque G, Franco A, Labelle H, Grimard G, Parent S, Ouellet J, Mac-Thiong JM, Gorman KF, Moreau A. A replication study for association of 53 single nucleotide polymorphisms in ScoliScore test with adolescent idiopathic scoliosis in French-Canadian population. Spine (Phila Pa 1976). 40(8): 537-43 (2015).

32. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ & Sham PC. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81: 559-75 (2007).

33. Barrett JC, Fry B, Maller J, & Daly MJ (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2): 263-265.

34. Dudbridge F (2008) Likelihood-based association analysis for nuclear families and unrelated subjects with missing genotype data. Hum Hered 66(2): 87-98.

35. Irizarry RA, Bolstad BM, Collin F, et al. (2003) Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 31(4): e15.

36. Hu JX, Zhao H, Zhou HH. (2010) False discovery rate control with groups. Journal of the American Statistical Association 104 (491): 1215-1227.

37. Bustin SA, Benes V, Garson JA, et al. (2009) The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin Chem. 55(4): 611-22.

38. Stephens AS, Stephens SR, Morrison NA. (2011) Internal control genes for quantitative RT-PCR expression analysis in mouse osteoblasts, osteoclasts and macrophages. BMC Res Notes. 4: 410.

39. Livak KJ, and Schmittgen TD. (2001) Analysis of Relative Gene Expression Data Using Real-Time Quantitative PCR and the 2^-ΔΔCT Method. Methods 25: 402-408.