JACKSON AIMEE (US)
ANDREONE BENJAMIN (US)
GODINHO BRUNO (US)
CHEN QINGMIN (US)
US10799591B2 | 2020-10-13 |
BONHAM LUKE W., SIRKIS DANIEL W., YOKOYAMA JENNIFER S.: "The transcriptional landscape of microglial genes in aging and neurodegenerative disease", FRONT. IMMUNOL., vol. 10, no. 1170, 4 June 2019 (2019-06-04), pages 1 - 17, XP085870060
GAO ET AL.: "Transcriptional regulation of homeostatic and disease-associated-microglial genes by IRF1, LXRβ, and CEBPα", GLIA, vol. 67, no. 10, October 2019 (2019-10-01), pages 1958 - 1975, XP071740511
LEE J.-K., CHUNG J., MCALPINE F. E., TANSEY M. G.: "Regulator of G-protein signaling-10 negatively regulates NF-κB in microglia and neuroprotects dopaminergic neurons in hemiparkinsonian rats", J. NEUROSCI., vol. 31, no. 33, 17 August 2011 (2011-08-17), pages 11879 - 11888, XP055974864
MRAK ET AL.: "Interleukin-1 and the immunogenetics of Alzheimer disease", J. NEUROPATHOL. EXP. NEUROL., vol. 59, no. 6, June 2000 (2000-06-01), pages 471 - 476, XP009019231
ALTERMAN ET AL.: "A divalent siRNA chemical scaffold for potent and sustained modulation of gene expression throughout the central nervous system", NAT. BIOTECHNOL., vol. 37, no. 8, August 2019 (2019-08-01), pages 884 - 894, XP036850007, DOI: 10.1038/s41587-019-0205-0
MOUMNÉ LARA, BETUING SANDRINE, CABOCHE JOCELYNE: "Multiple aspects of gene dysregulation in Huntington's disease", FRONT. NEUROL., vol. 4, no. 127, 23 October 2013 (2013-10-23), pages 1 - 10, XP055974877
CLAIMS
1. A method of delivering a branched small interfering RNA (siRNA) molecule to a microglial cell in a subject in need of microglial gene silencing, the method comprising administering the branched siRNA molecule to the central nervous system of the subject.
2. The method of claim 1, wherein the subject has been diagnosed as having a disease associated with expression of a dysregulated microglial gene or dysregulated microglial gene pathway.
3. The method of claim 2, wherein the dysregulated microglial gene exhibits increased expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the microglial gene in microglial cells of a reference subject.
4. The method of claim 2, wherein the dysregulated microglial gene exhibits reduced expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the microglial gene in microglial cells of a reference subject.
5. The method of claim 1, wherein the microglial gene is a positive regulator of a gene for which increased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
6. The method of claim 1, wherein the microglial gene is a negative regulator of a gene for which decreased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
7. The method of claim 1, wherein the microglial gene is a splice isoform of a gene for which overexpression of the splice isoform relative to the expression of the splice isoform in a reference subject is associated with a disease state.
8. The method of any one of claims 2-7, wherein the disease is a neuroinflammatory or neurodegenerative disease.
9. The method of any one of claims 1-8, wherein the dysregulated gene is selected from the group consisting of ABCA7, ABI3, ADAM10, APOC1, APOE, AXL, BIN1, C1QA, C3, C9ORF72, CASS4, CCL5, CD2AP, CD33, CD68, CLPTM1, CLU, CR1, CSF1, CST7, CTSB, CTSD, CTSL, CXCL10, CXCL13, DSG2, ECHDC3, EPHA1, FABP5, FERMT2, FTH1, GNAS, GRN, HBEGF, HLA-DRB1, HLA-DRB5, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IGF1, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, ITGAX, LILRB4, LPL, MEF2C, MMP12, MS4A4A, MS4A6A, NLRP3, NME8, NOS2, PICALM, PILRA, PLCG2, PTK2B, SCIMP, SLC24A4, SORL1, SPI1, SPP1, SPPL2A, TBK1, TNF, TREM2, TREML2, TYROBP, and ZCWPW1.
10. The method of any one of claims 1-9, wherein the subject is a human.
11. The method of any one of claims 1-10, wherein the branched siRNA is administered to the subject intrathecally, intracerebroventricularly, or intrastriatally.
12. The method of any one of claims 1-11, wherein the siRNA molecule is di-branched.
13. The method of any one of claims 1-12, wherein the siRNA comprises (i) an antisense strand having complementarity to one or more genes selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF, and (ii) a sense strand having complementarity to the antisense strand.
14. The method of claim 13, wherein the antisense strand has the following formula, in the 5'-to-3' direction: Z-((A-P-)n(B-P-)m)q; wherein Z is a 5' phosphorus stabilizing moiety; each A is, independently, a 2'-O-methyl (2'-O-Me) ribonucleoside; each B is, independently, a 2'-fluoro (2'-F) ribonucleoside; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5; m is an integer from 1 to 5; and q is an integer between 1 and 15.
15. The method of claim 14, wherein Z is represented in any one of Formula I-VIII: wherein Nuc represents a nucleobase selected from the group consisting of adenine, uracil, guanine, thymine, and cytosine, and R represents optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, phenyl, benzyl, hydroxy, or hydrogen.
16. The method of claim 14 or 15, wherein Z is (E)-vinylphosphonate represented in Formula III.
17. The method of any one of claims 13-16, wherein at least 50% of the ribonucleosides are 2'-O-Me ribonucleoside.
18. The method of any one of claims 13-17, wherein at least 60% of the ribonucleosides are 2'-O-Me ribonucleoside.
19. The method of any one of claims 13-18, wherein at least 70% of the ribonucleosides are 2'-O-Me ribonucleoside.
20. The method of any one of claims 13-19, wherein at least 80% of the ribonucleosides are 2'-O-Me ribonucleoside.
21. The method of any one of claims 13-20, wherein at least 90% of the ribonucleosides are 2'-O-Me ribonucleoside.
22. The method of any one of claims 13-21, wherein the length of the antisense strand is between 10 and 30 nucleotides.
23. The method of any one of claims 13-22, wherein the length of the antisense strand is between 15 and 25 nucleotides.
24. The method of claim 23, wherein the length of the antisense strand is 20 nucleotides.
25. The method of claim 23, wherein the length of the antisense strand is 21 nucleotides.
26. The method of claim 23, wherein the length of the antisense strand is 22 nucleotides.
27. The method of claim 23, wherein the length of the antisense strand is 23 nucleotides.
28. The method of claim 23, wherein the length of the antisense strand is 24 nucleotides.
29. The method of claim 23, wherein the length of the antisense strand is 25 nucleotides.
30. The method of claim 22, wherein the length of the antisense strand is 26 nucleotides.
31. The method of claim 22, wherein the length of the antisense strand is 27 nucleotides.
32. The method of claim 22, wherein the length of the antisense strand is 28 nucleotides.
33. The method of claim 22, wherein the length of the antisense strand is 29 nucleotides.
34. The method of claim 22, wherein the length of the antisense strand is 30 nucleotides.
35. The method of any one of claims 13-34, wherein the length of the sense strand is between 12 and 30 nucleotides.
36. The method of claim 35, wherein the length of the sense strand is 14 nucleotides.
37. The method of claim 35, wherein the length of the sense strand is 15 nucleotides.
38. The method of claim 35, wherein the length of the sense strand is 16 nucleotides.
39. The method of claim 35, wherein the length of the sense strand is 17 nucleotides.
40. The method of claim 35, wherein the length of the sense strand is 18 nucleotides.
41. The method of claim 35, wherein the length of the sense strand is 19 nucleotides.
42. The method of claim 35, wherein the length of the sense strand is 20 nucleotides.
43. The method of claim 35, wherein the length of the sense strand is 21 nucleotides.
44. The method of claim 35, wherein the length of the sense strand is 22 nucleotides.
45. The method of claim 35, wherein the length of the sense strand is 23 nucleotides.
46. The method of claim 35, wherein the length of the sense strand is 24 nucleotides.
47. The method of claim 35, wherein the length of the sense strand is 25 nucleotides.
48. The method of claim 35, wherein the length of the sense strand is 26 nucleotides.
49. The method of claim 35, wherein the length of the sense strand is 27 nucleotides.
50. The method of claim 35, wherein the length of the sense strand is 28 nucleotides.
51. The method of claim 35, wherein the length of the sense strand is 29 nucleotides.
52. The method of claim 35, wherein the length of the sense strand is 30 nucleotides.
53. A branched siRNA molecule comprising a sense strand and an antisense strand, wherein the antisense strand comprises a region having complementarity to a segment of contiguous nucleotides within a gene selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF.
54. The molecule of claim 53, wherein the antisense strand has complementarity to a portion of a gene encoding a positive regulator of a gene for which increased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
55. The molecule of claim 53, wherein the antisense strand has complementarity to a portion of a gene encoding a negative regulator of a gene for which decreased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
56. The molecule of claim 53, wherein the antisense strand has complementarity to a splice isoform of a gene for which overexpression of the splice isoform relative to the expression of the splice isoform in a reference subject is associated with a disease state.
57. The molecule of any one of claims 53-56, wherein the sense strand has complementarity to the antisense strand.
58. The molecule of any one of claims 53-57, wherein the siRNA molecule is di-branched.
59. The molecule of any one of claims 53-58, wherein the antisense strand of the branched siRNA has the following formula in the 5'-to-3' direction: Z-((A-P-)n(B-P-)m)q; wherein Z is a 5' phosphorus stabilizing moiety; each A is, independently, a 2'-O-Me ribonucleoside; each B is, independently, a 2'-F ribonucleoside; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5; m is an integer from 1 to 5; and q is an integer between 1 and 15.
60. The molecule of claim 59, wherein Z is represented in any one of Formula I-VIII: wherein Nuc represents a nucleobase selected from the group consisting of adenine, uracil, guanine, thymine, and cytosine, and R represents optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, phenyl, benzyl, hydroxy, or hydrogen.
61. The molecule of claim 59 or 60, wherein Z is (E)-vinylphosphonate as represented in Formula III.
62. The molecule of any one of claims 53-61, wherein the length of the antisense strand is between 10 and 30 nucleotides.
63. The molecule of claim 62, wherein the length of the antisense strand is between 15 and 30 nucleotides.
64. The molecule of claim 62, wherein the length of the antisense strand is 20 nucleotides.
65. The molecule of claim 62, wherein the length of the antisense strand is 21 nucleotides.
66. The molecule of claim 62, wherein the length of the antisense strand is 22 nucleotides.
67. The molecule of claim 62, wherein the length of the antisense strand is 23 nucleotides.
68. The molecule of claim 62, wherein the length of the antisense strand is 24 nucleotides.
69. The molecule of claim 62, wherein the length of the antisense strand is 25 nucleotides.
70. The molecule of claim 62, wherein the length of the antisense strand is 26 nucleotides.
71. The molecule of claim 62, wherein the length of the antisense strand is 27 nucleotides.
72. The molecule of claim 62, wherein the length of the antisense strand is 28 nucleotides.
73. The molecule of claim 62, wherein the length of the antisense strand is 29 nucleotides.
74. The molecule of claim 62, wherein the length of the antisense strand is 30 nucleotides.
75. The molecule of any one of claims 53-74, wherein the length of the sense strand is between 12 and 30 nucleotides.
76. The molecule of claim 75, wherein the length of the sense strand is 14 nucleotides.
77. The molecule of claim 75, wherein the length of the sense strand is 15 nucleotides.
78. The molecule of claim 75, wherein the length of the sense strand is 16 nucleotides.
79. The molecule of claim 75, wherein the length of the sense strand is 17 nucleotides.
80. The molecule of claim 75, wherein the length of the sense strand is 18 nucleotides.
81. The molecule of claim 75, wherein the length of the sense strand is 19 nucleotides.
82. The molecule of claim 75, wherein the length of the sense strand is 20 nucleotides.
83. The molecule of claim 75, wherein the length of the sense strand is 21 nucleotides.
84. The molecule of claim 75, wherein the length of the sense strand is 22 nucleotides.
85. The molecule of claim 75, wherein the length of the sense strand is 23 nucleotides.
86. The molecule of claim 75, wherein the length of the sense strand is 24 nucleotides.
87. The molecule of claim 75, wherein the length of the sense strand is 25 nucleotides.
88. The molecule of claim 75, wherein the length of the sense strand is 26 nucleotides.
89. The molecule of claim 75, wherein the length of the sense strand is 27 nucleotides.
90. The molecule of claim 75, wherein the length of the sense strand is 28 nucleotides.
91. The molecule of claim 75, wherein the length of the sense strand is 29 nucleotides.
92. The molecule of claim 75, wherein the length of the sense strand is 30 nucleotides.
93. A method of treating a subject diagnosed as having a disease associated with expression of a dysregulated microglial gene or dysregulated microglial gene pathway, the method comprising administering to the subject the branched siRNA molecule of any one of claims 53-92.
94. The method of claim 93, wherein the dysregulated microglial gene is selected from the group consisting of ABCA7, ABI3, ADAM10, APOC1, APOE, AXL, BIN1, C1QA, C3, C9ORF72, CASS4, CCL5, CD2AP, CD33, CD68, CLPTM1, CLU, CR1, CSF1, CST7, CTSB, CTSD, CTSL, CXCL10, CXCL13, DSG2, ECHDC3, EPHA1, FABP5, FERMT2, FTH1, GNAS, GRN, HBEGF, HLA-DRB1, HLA-DRB5, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IGF1, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, ITGAX, LILRB4, LPL, MEF2C, MMP12, MS4A4A, MS4A6A, NLRP3, NME8, NOS2, PICALM, PILRA, PLCG2, PTK2B, SCIMP, SLC24A4, SORL1, SPI1, SPP1, SPPL2A, TBK1, TNF, TREM2, TREML2, TYROBP, and ZCWPW1.
95. The method of claim 93, wherein the dysregulated microglial gene exhibits increased expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the same gene in microglial cells of a reference subject.
96. The method of claim 93, wherein the dysregulated microglial gene exhibits reduced expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the same gene in microglial cells of a reference subject.
97. The method of claim 93, wherein the administering of the branched siRNA molecule to the subject results in silencing of a gene in the subject.
98. The method of claim 97, wherein the silencing of a gene comprises silencing any one of the genes selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF.
99. The method of claim 97, wherein silencing of a gene comprises silencing of a positive regulator of a gene for which increased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
100. The method of claim 97, wherein silencing of a gene comprises silencing of a gene for which decreased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
101. The method of claim 97, wherein silencing of a gene comprises silencing of a splice isoform of a gene for which overexpression of the splice isoform relative to the expression of the splice isoform in a reference subject is associated with a disease state.
102. The method of any one of claims 93-101, wherein the subject is a human.
BACKGROUND OF THE INVENTION
In many species, introduction of double-stranded RNA (dsRNA) induces potent and specific gene silencing. This phenomenon occurs in both plants and animals and has roles in viral defense and transposon silencing mechanisms. Short interfering RNAs (siRNAs), which are generally much shorter than the target gene, have been shown to be effective at gene silencing.
Microglia are a type of glial cell found in the central nervous system (CNS). Microglia are an essential component of the CNS immune system; however, microglia with dysregulated genes can also be a source of disease. For example, a disease state may precipitate as a result of overactive microglial genes or genes with reduced expression and/or activity in microglia. Therefore, silencing of effector genes or pathway regulatory genes may be needed to restore normal gene network function and ameliorate the disease state. Thus, there remains a need for new and improved therapeutics capable of permeating microglial cells and silencing microglial genes in order to restore genetic and biochemical pathway activity in microglia from a disease state towards a normal healthy state.
SUMMARY OF THE INVENTION
In an aspect, the invention features a method of delivering a branched small interfering RNA (siRNA) molecule to a microglial cell in a subject in need of microglial gene silencing. The method may include administering the branched siRNA molecule to the subject (e.g., to the central nervous system of the subject).
In some embodiments, the subject has been diagnosed as having a disease associated with expression of a dysregulated microglial gene or dysregulated microglial gene pathway. In some embodiments, the subject has been diagnosed as having a disease associated with expression and/or activity of a dysregulated microglial gene (e.g., altered expression and/or activity of a wild-type or mutated microglial gene).
In some embodiments, the dysregulated microglial gene exhibits increased expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the microglial gene in microglial cells of a reference subject. In some embodiments, the dysregulated microglial gene exhibits reduced expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the microglial gene in microglial cells of a reference subject.
In some embodiments, the microglial gene is a positive regulator of a gene for which increased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
In some embodiments, the microglial gene is a negative regulator of a gene for which decreased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
In some embodiments, the microglial gene is a splice isoform of a gene for which overexpression of the splice isoform relative to the expression of the splice isoform in a reference subject is associated with a disease state.
In some embodiments, the disease is a neuroinflammatory disease or a neurodegenerative disease. In some embodiments, the disease is Alzheimer’s disease. In some embodiments, the disease is Amyotrophic Lateral Sclerosis. In some embodiments, the disease is Parkinson’s disease. In some embodiments, the disease is frontotemporal dementia. In some embodiments, the disease is Huntington’s disease. In some embodiments, the disease is multiple sclerosis. In some embodiments, the disease is progressive supranuclear palsy.
In some embodiments, the dysregulated microglial gene is selected from the group consisting of ABCA7, ABI3, ADAM10, APOC1, APOE, AXL, BIN1, C1QA, C3, C9ORF72, CASS4, CCL5, CD2AP, CD33, CD68, CLPTM1, CLU, CR1, CSF1, CST7, CTSB, CTSD, CTSL, CXCL10, CXCL13, DSG2, ECHDC3, EPHA1, FABP5, FERMT2, FTH1, GNAS, GRN, HBEGF, HLA-DRB1, HLA-DRB5, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IGF1, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, ITGAX, LILRB4, LPL, MEF2C, MMP12, MS4A4A, MS4A6A, NLRP3, NME8, NOS2, PICALM, PILRA, PLCG2, PTK2B, SCIMP, SLC24A4, SORL1, SPI1, SPP1, SPPL2A, TBK1, TNF, TREM2, TREML2, TYROBP, and ZCWPW1.
In some embodiments, the subject is a mammal (e.g., a human).
In some embodiments, the branched siRNA is administered to the subject intrathecally, intracerebroventricularly, or intrastriatally.
In some embodiments, the siRNA molecule is di-branched. In some embodiments, the siRNA molecule is tri-branched. In some embodiments, the siRNA molecule is tetra-branched.
In some embodiments, the siRNA comprises (i) an antisense strand having complementarity to a portion of one or more genes selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF, and (ii) a sense strand having complementarity to the antisense strand.
In some embodiments, the siRNA includes (i) an antisense strand having complementarity to a portion of a gene encoding a positive regulator of a gene for which increased expression and/or activity (relative, e.g., to the level of expression and/or activity observed in a reference subject) is associated with a disease state.
In some embodiments, the siRNA includes (i) an antisense strand having complementarity to a portion of a gene encoding a negative regulator of a gene for which decreased expression and/or activity (relative, e.g., to the level of expression and/or activity observed in a reference subject) is associated with a disease state.
In some embodiments, the siRNA includes (i) an antisense strand having complementarity to a splice isoform of a gene for which overexpression of the splice isoform relative to the expression of the splice isoform in a reference subject is associated with a disease state.
In any of the foregoing embodiments, the siRNA may also include (ii) a sense strand having complementarity to the antisense strand.
In some embodiments, the antisense strand has complementarity (e.g., at least 85% complementarity, such as 85% complementarity, 86% complementarity, 87% complementarity, 88% complementarity, 89% complementarity, 90% complementarity, 91% complementarity, 92% complementarity, 93% complementarity, 94% complementarity, 95% complementarity, 96% complementarity, 97% complementarity, 98% complementarity, 99% complementarity, or 100% complementarity) to a portion of at least 10 contiguous nucleotides of an mRNA molecule encoding one or more of the above genes. For example, the antisense strand may have complementarity to a portion of 10 contiguous nucleotides, 11 contiguous nucleotides, 12 contiguous nucleotides, 13 contiguous nucleotides, 14 contiguous nucleotides, 15 contiguous nucleotides, 16 contiguous nucleotides, 17 contiguous nucleotides, 18 contiguous nucleotides, 19 contiguous nucleotides, 20 contiguous nucleotides, 21 contiguous nucleotides, 22 contiguous nucleotides, 23 contiguous nucleotides, 24 contiguous nucleotides, 25 contiguous nucleotides, 26 contiguous nucleotides, 27 contiguous nucleotides, 28 contiguous nucleotides, 29 contiguous nucleotides, 30 contiguous nucleotides, 31 contiguous nucleotides, 32 contiguous nucleotides, 33 contiguous nucleotides, 34 contiguous nucleotides, 35 contiguous nucleotides, 36 contiguous nucleotides, 37 contiguous nucleotides, 38 contiguous nucleotides, 39 contiguous nucleotides, 40 contiguous nucleotides, 41 contiguous nucleotides, 42 contiguous nucleotides, 43 contiguous nucleotides, 44 contiguous nucleotides, 45 contiguous nucleotides, 46 contiguous nucleotides, 47 contiguous nucleotides, 48 contiguous nucleotides, 49 contiguous nucleotides, or 50 contiguous nucleotides, or more, of an mRNA molecule encoding one or more of the above genes.
In some embodiments, the antisense strand has complementarity (e.g., at least 85% complementarity, such as 85% complementarity, 86% complementarity, 87% complementarity, 88% complementarity, 89% complementarity, 90% complementarity, 91% complementarity, 92% complementarity, 93% complementarity, 94% complementarity, 95% complementarity, 96% complementarity, 97% complementarity, 98% complementarity, 99% complementarity, or 100% complementarity) to a portion of from 10 to 50 contiguous nucleotides of an mRNA molecule encoding one or more of the above genes. For example, the antisense strand may have complementarity to a portion of from 11 contiguous nucleotides to 45 contiguous nucleotides, from 12 contiguous nucleotides to 40 contiguous nucleotides, from 13 contiguous nucleotides to 35 contiguous nucleotides, from 14 contiguous nucleotides to 30 contiguous nucleotides, from 15 contiguous nucleotides to 29 contiguous nucleotides, from 16 contiguous nucleotides to 28 contiguous nucleotides, from 17 contiguous nucleotides to 27 contiguous nucleotides, from 18 contiguous nucleotides to 26 contiguous nucleotides, or from 19 contiguous nucleotides to 22 contiguous nucleotides of an mRNA molecule encoding one or more of the above genes.
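The percent-complementarity language above can be made concrete with a short Python sketch. The function names (`reverse_complement`, `percent_complementarity`, `best_window`) are hypothetical and are not taken from this disclosure. The sketch assumes that only Watson-Crick pairs (A-U, G-C) count as complementary, that G-U wobble pairs do not, and that both strands are written in the 5'-to-3' direction.

```python
from typing import Tuple

# Watson-Crick partners for RNA bases (assumption: G-U wobble pairs are not counted).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the fully complementary strand, written 5'-to-3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def percent_complementarity(antisense: str, target: str) -> float:
    """Percent of antisense positions that pair with an equal-length mRNA segment.

    Strands pair antiparallel, so the antisense strand (5'-to-3') is compared
    against the target segment read 3'-to-5'.
    """
    if not antisense or len(antisense) != len(target):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        COMPLEMENT.get(a) == t
        for a, t in zip(antisense.upper(), reversed(target.upper()))
    )
    return 100.0 * paired / len(antisense)


def best_window(antisense: str, mrna: str) -> Tuple[int, float]:
    """Slide along the mRNA and return (start index, percent) of the best segment."""
    n = len(antisense)
    if len(mrna) < n:
        raise ValueError("mRNA is shorter than the antisense strand")
    scores = [
        percent_complementarity(antisense, mrna[i:i + n])
        for i in range(len(mrna) - n + 1)
    ]
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, scores[start]
```

Under these assumptions, a 20-nucleotide antisense strand with two mismatched positions scores 90% against its target segment, which would satisfy the "at least 85% complementarity" threshold recited above.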
In some embodiments, the antisense strand comprises a region represented by the following chemical formula, in the 5'-to-3' direction:
Z-((A-P-)n(B-P-)m)q; wherein Z is a 5' phosphorus stabilizing moiety; each A is, independently, a 2'-O-methyl (2'-O-Me) ribonucleoside; each B is, independently, a 2'-fluoro-ribonucleoside; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5 (e.g., 1, 2, 3, 4, or 5); m is an integer from 1 to 5 (e.g., 1, 2, 3, 4, or 5); and q is an integer between 1 and 15 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15).
In some embodiments, the antisense strand has a structure represented by Formula A-I, wherein Formula A-I is, in the 5'-to-3' direction:
A-B-(A')j-C-P2-D-P1-(C'-P1)k-C'
Formula A-I; wherein A is represented by the formula C-P1-D-P1; each A' is represented by the formula C-P2-D-P2; B is represented by the formula C-P2-D-P2-D-P2-D-P2; each C is a 2'-O-methyl (2'-O-Me) ribonucleoside; each C', independently, is a 2'-O-Me ribonucleoside or a 2'-fluoro (2'-F) ribonucleoside; each D is a 2'-F ribonucleoside; each P1 is a phosphorothioate internucleoside linkage; each P2 is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the antisense strand has a structure represented by Formula A1, wherein Formula A1 is, in the 5'-to-3' direction:
A-S-B-S-A-O-B-O-B-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-A-S-A-S-B-S-A
Formula A1; wherein A represents a 2'-O-Me ribonucleoside, B represents a 2'-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
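As an illustration of how the generic formula relates to the explicit pattern, the following Python sketch expands Formula A-I for chosen values of j and k. The function name `expand_formula_a_i` and the `c_prime` argument are hypothetical. The output uses the symbol set of Formula A1 (A for 2'-O-Me, B for 2'-F, S for phosphorothioate, O for phosphodiester). The observation that j = 4 and k = 4, with the C' positions chosen as A, A, A, B, A, reproduces Formula A1 is a reading derived from the symbol definitions above and is not a statement made in this disclosure.

```python
OME, FLUORO = "A", "B"  # nucleosides: A = 2'-O-Me (C in Formula A-I), B = 2'-F (D)
PS, PO = "S", "O"       # linkages: S = phosphorothioate (P1), O = phosphodiester (P2)


def expand_formula_a_i(j: int, k: int, c_prime: str) -> str:
    """Expand A-B-(A')j-C-P2-D-P1-(C'-P1)k-C' into an explicit 5'-to-3' pattern.

    c_prime lists the k + 1 independently chosen C' nucleosides, 5'-to-3',
    as a string over {"A", "B"} (a hypothetical encoding for this sketch).
    """
    if not (1 <= j <= 7 and 1 <= k <= 7):
        raise ValueError("j and k must each be an integer from 1 to 7")
    if len(c_prime) != k + 1 or set(c_prime) - {OME, FLUORO}:
        raise ValueError("c_prime must hold k + 1 symbols drawn from 'A' and 'B'")

    tokens = [OME, PS, FLUORO, PS]                               # A  = C-P1-D-P1
    tokens += [OME, PO, FLUORO, PO, FLUORO, PO, FLUORO, PO]      # B  = C-P2-D-P2-D-P2-D-P2
    tokens += [OME, PO, FLUORO, PO] * j                          # (A')j, A' = C-P2-D-P2
    tokens += [OME, PO, FLUORO, PS]                              # C-P2-D-P1
    for nucleoside in c_prime[:-1]:
        tokens += [nucleoside, PS]                               # (C'-P1)k
    tokens.append(c_prime[-1])                                   # terminal C'
    return "-".join(tokens)


# Formula A1 as written in the text, used here only for comparison.
FORMULA_A1 = (
    "A-S-B-S-A-O-B-O-B-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-"
    "A-O-B-S-A-S-A-S-A-S-B-S-A"
)

pattern = expand_formula_a_i(j=4, k=4, c_prime="AAABA")
nucleoside_count = len(pattern.split("-")[0::2])  # 2j + k + 9 nucleosides in general
```

With these inputs the expansion has 21 nucleosides and matches the Formula A1 string character for character, which suggests that Formula A1 can be read as one member of the Formula A-I family.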
In some embodiments, the antisense strand has a structure represented by Formula A-II, wherein Formula A-II is, in the 5'-to-3' direction:
A-B-(A')j-C-P2-D-P1-(C-P1)k-C'
Formula A-II; wherein A is represented by the formula C-P1-D-P1; each A' is represented by the formula C-P2-D-P2; B is represented by the formula C-P2-D-P2-D-P2-D-P2; each C is a 2'-O-methyl (2'-O-Me) ribonucleoside; each C', independently, is a 2'-O-Me ribonucleoside or a 2'-fluoro (2'-F) ribonucleoside; each D is a 2'-F ribonucleoside; each P1 is a phosphorothioate internucleoside linkage; each P2 is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the antisense strand has a structure represented by Formula A2, wherein Formula A2 is, in the 5'-to-3' direction:
A-S-B-S-A-O-B-O-B-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-A-S-A-S-A-S-A
Formula A2; wherein A represents a 2'-O-Me ribonucleoside, B represents a 2'-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S-III, wherein Formula S-III is, in the 5'-to-3' direction:
E-(A')m-F
Formula S-III; wherein E is represented by the formula (C-P1)2; F is represented by the formula (C-P2)3-D-P1-C-P1-C, (C-P2)3-D-P2-C-P2-C, (C-P2)3-D-P1-C-P1-D, or (C-P2)3-D-P2-C-P2-D; A', C, D, P1, and P2 are as defined in Formula II; and m is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the sense strand has a structure represented by Formula S1, wherein Formula S1 is, in the 5'-to-3' direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-S-A-S-A
Formula S1; wherein A represents a 2'-O-Me ribonucleoside, B represents a 2'-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S2, wherein Formula S2 is, in the 5'-to-3' direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-O-A-O-A
Formula S2; wherein A represents a 2'-O-Me ribonucleoside, B represents a 2'-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S3, wherein Formula S3 is, in the 5'-to-3' direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-S-A-S-B
Formula S3; wherein A represents a 2'-O-Me ribonucleoside, B represents a 2'-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S4, wherein Formula S4 is, in the 5'-to-3' direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-O-A-O-B
Formula S4; wherein A represents a 2'-O-Me ribonucleoside, B represents a 2'-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
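For comparing explicit patterns such as Formulas S1 to S4, a small parser can report strand length, the fraction of 2'-O-Me ribonucleosides, and the number of phosphorothioate linkages. The Python sketch below is illustrative only: `StrandStats` and `strand_stats` are hypothetical names, and the sketch assumes the A/B/S/O symbol set defined for these formulas, with nucleosides and linkages strictly alternating.

```python
from typing import NamedTuple


class StrandStats(NamedTuple):
    length: int          # number of nucleosides
    ome_fraction: float  # share of nucleosides that are 2'-O-Me (symbol A)
    ps_linkages: int     # number of phosphorothioate linkages (symbol S)


def strand_stats(pattern: str) -> StrandStats:
    """Summarize a pattern such as 'A-S-B-O-A' (nucleoside-linkage-nucleoside...)."""
    tokens = [token.strip() for token in pattern.split("-")]
    nucleosides, linkages = tokens[0::2], tokens[1::2]
    if (
        len(tokens) % 2 == 0
        or set(nucleosides) - {"A", "B"}
        or set(linkages) - {"S", "O"}
    ):
        raise ValueError("pattern must alternate nucleosides (A/B) and linkages (S/O)")
    return StrandStats(
        length=len(nucleosides),
        ome_fraction=nucleosides.count("A") / len(nucleosides),
        ps_linkages=linkages.count("S"),
    )


# Sense-strand patterns as written in Formulas S1 to S4 above.
SENSE_PATTERNS = {
    "S1": "A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-S-A-S-A",
    "S2": "A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-O-A-O-A",
    "S3": "A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-S-A-S-B",
    "S4": "A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-O-A-O-B",
}

summary = {name: strand_stats(p) for name, p in SENSE_PATTERNS.items()}
```

Read this way, each of the four patterns describes a 16-nucleoside sense strand; S1 and S3 carry four phosphorothioate linkages, while S2 and S4 carry two.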
In some embodiments, the antisense strand has a structure represented by Formula A-IV, wherein Formula A-IV is, in the 5'-to-3' direction:
A-(A')j-C-P2-B-(C-P1)k-C'
Formula A-IV; wherein A is represented by the formula C-P1-D-P1; each A' is represented by the formula C-P2-D-P2; B is represented by the formula D-P1-C-P1-D-P1; each C is a 2'-O-Me ribonucleoside; each C', independently, is a 2'-O-Me ribonucleoside or a 2'-F ribonucleoside; each D is a 2'-F ribonucleoside; each P1 is a phosphorothioate internucleoside linkage; each P2 is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the antisense strand has a structure represented by Formula A3, wherein Formula A3 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-B-S-A-S-A-S-A
Formula A3; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S-V, wherein Formula S-V is, in the 5’-to-3’ direction:
E-(A’)m-C-P2-F
Formula S-V; wherein E is represented by the formula (C-P1)2;
F is represented by the formula D-P1-C-P1-C, D-P2-C-P2-C, D-P1-C-P1-D, or D-P2-C-P2-D;
A’, C, D, P1, and P2 are as defined in Formula IV; and m is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
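By way of non-limiting illustration only, the general Formula S-V can be expanded into the flat A/B/O/S shorthand for a chosen value of m and each of the four alternatives recited for F. The Python sketch below is offered under that assumption; the function name `expand_s_v` and the list names are hypothetical and do not appear in the disclosure.

```python
# Shorthand: A = 2'-O-Me nucleoside (C), B = 2'-F nucleoside (D),
# S = phosphorothioate (P1), O = phosphodiester (P2).
E = ["A", "S", "A", "S"]          # (C-P1)2
A_PRIME = ["A", "O", "B", "O"]    # C-P2-D-P2
F_OPTIONS = [
    ["B", "S", "A", "S", "A"],    # D-P1-C-P1-C
    ["B", "O", "A", "O", "A"],    # D-P2-C-P2-C
    ["B", "S", "A", "S", "B"],    # D-P1-C-P1-D
    ["B", "O", "A", "O", "B"],    # D-P2-C-P2-D
]

def expand_s_v(m, f):
    """Expand E-(A')m-C-P2-F into a flat pattern for a given m and F option."""
    if not 1 <= m <= 7:
        raise ValueError("m must be an integer from 1 to 7")
    return "-".join(E + A_PRIME * m + ["A", "O"] + f)

variants = [expand_s_v(5, f) for f in F_OPTIONS]
for v in variants:
    print(v)
```

With m = 5, the four expansions are 16 nucleosides long and appear to correspond, in order, to the patterns set out in this description as Formulas S5, S6, S7, and S8.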
In some embodiments, the sense strand has a structure represented by Formula S5, wherein Formula S5 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-A
Formula S5; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S6, wherein Formula S6 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A
Formula S6; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S7, wherein Formula S7 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-B
Formula S7; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S8, wherein Formula S8 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B
Formula S8; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the antisense strand has a structure represented by Formula A-VI, wherein Formula A-VI is, in the 5’-to-3’ direction:
A-Bj-E-Bk-E-F-Gl-D-P1-C’
Formula A-VI; wherein A is represented by the formula C-P1-D-P1; each B is represented by the formula C-P2; each C is a 2’-O-Me ribonucleoside; each C’, independently, is a 2’-O-Me ribonucleoside or a 2’-F ribonucleoside; each D is a 2’-F ribonucleoside; each E is represented by the formula D-P2-C-P2;
F is represented by the formula D-P1-C-P1; each G is represented by the formula C-P1; each P1 is a phosphorothioate internucleoside linkage; each P2 is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and
l is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the antisense strand has a structure represented by Formula A4, wherein Formula A4 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-A-O-A-O-B-O-A-O-A-O-A-O-A-O-A-O-A-O-A-O-B-O-A-O-B-S-A-S-A-S-A-S-B-S-A
Formula A4; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
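By way of non-limiting illustration only, a specific pattern such as Formula A4 can be tested against the general Formula A-VI, and the three repeat counts (referred to here as j, k, and l, for the two B blocks and the G block) can be recovered. The Python sketch below uses a regular expression for this purpose; the names `A_VI` and `fit_a_vi` are hypothetical and are not part of the disclosure.

```python
import re

# Formula A4 (5'-to-3'), in the A/B/O/S shorthand.
FORMULA_A4 = ("A-S-B-S-A-O-A-O-A-O-B-O-A-O-A-O-A-O-A-O-A-O-A-O-A-O-B-O-A"
              "-O-B-S-A-S-A-S-A-S-B-S-A")

# Formula A-VI, A-Bj-E-Bk-E-F-Gl-D-P1-C', rewritten in the shorthand
# (C -> A, D -> B, P1 -> S, P2 -> O) with the hyphens removed.
A_VI = re.compile(
    r"ASBS"                  # A  = C-P1-D-P1
    r"(?P<j>(?:AO){1,7})"    # Bj = (C-P2) repeated j times
    r"BOAO"                  # E  = D-P2-C-P2
    r"(?P<k>(?:AO){1,7})"    # Bk = (C-P2) repeated k times
    r"BOAO"                  # E
    r"BSAS"                  # F  = D-P1-C-P1
    r"(?P<l>(?:AS){1,7})"    # Gl = (C-P1) repeated l times
    r"BS"                    # D-P1
    r"[AB]"                  # C' = 2'-O-Me or 2'-F
)

def fit_a_vi(pattern):
    """Return (j, k, l) if the pattern fits Formula A-VI, otherwise None."""
    match = A_VI.fullmatch(pattern.replace("-", "").replace(" ", ""))
    if match is None:
        return None
    # Each repeated unit is two symbols long (one nucleoside, one linkage).
    return tuple(len(match.group(name)) // 2 for name in ("j", "k", "l"))

print(fit_a_vi(FORMULA_A4))
```

Under this reading, Formula A4 fits Formula A-VI with j = 3, k = 6, and l = 2, with the 3’-terminal C’ being a 2’-O-Me ribonucleoside.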
In some embodiments, the sense strand has a structure represented by Formula S-VII, wherein Formula S-VII is, in the 5’-to-3’ direction:
H-Bm-In-A’-Bo-H-C
Formula S-VII; wherein A’ is represented by the formula C-P2-D-P2; each H is represented by the formula (C-P1)2; each I is represented by the formula (D-P2);
B, C, D, P1, and P2 are as defined in Formula VI; m is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); n is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and o is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the sense strand has a structure represented by Formula S9, wherein Formula S9 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-A-O-A-O-B-O-B-O-B-O-A-O-B-O-A-O-A-O-A-O-A-S-A-S-A
Formula S9; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the antisense strand also has a 5’ phosphorus stabilizing moiety at the 5’ end of the antisense strand.
In some embodiments, the sense strand also has a 5’ phosphorus stabilizing moiety at the 5’ end of the sense strand.
In some embodiments, each 5’-phosphorus stabilizing moiety is, independently, represented by any one of Formulas I-VIII: wherein Nuc represents a nucleobase, such as adenine, uracil, guanine, thymine, or cytosine, and R represents optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl (e.g., optionally substituted C1-C6 alkyl, optionally substituted C2-C6 alkenyl, or optionally substituted C2-C6 alkynyl), phenyl, benzyl, hydroxy, or hydrogen.
In some embodiments, Z is (E)-vinylphosphonate as represented in Formula III.
In some embodiments, n is from 1 to 4. In some embodiments, n is from 1 to 3. In some embodiments, n is from 1 to 2. In some embodiments, n is 1.
In some embodiments, m is from 1 to 4. In some embodiments, m is from 1 to 3. In some embodiments, m is from 1 to 2. In some embodiments, m is 1.
In some embodiments, n and m are each 1.
In some embodiments, 50% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
In some embodiments, 60% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
In some embodiments, 70% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
In some embodiments, 80% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
In some embodiments, 90% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
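By way of non-limiting illustration only, the 2'-O-Me percentage of a given antisense pattern can be computed and compared against the 50%, 60%, 70%, 80%, and 90% levels recited above. The Python sketch below applies this to the patterns given in this description as Formulas A3 and A4; the function name `percent_2_o_me` is hypothetical and not part of the disclosure.

```python
# Antisense patterns of Formulas A3 and A4 (5'-to-3');
# A = 2'-O-Me, B = 2'-F, O = phosphodiester, S = phosphorothioate.
PATTERNS = {
    "A3": ("A-S-B-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A"
           "-O-B-S-A-S-B-S-A-S-A-S-A"),
    "A4": ("A-S-B-S-A-O-A-O-A-O-B-O-A-O-A-O-A-O-A-O-A-O-A-O-A-O-B-O-A"
           "-O-B-S-A-S-A-S-A-S-B-S-A"),
}
THRESHOLDS = (50, 60, 70, 80, 90)

def percent_2_o_me(pattern):
    """Percentage of nucleosides in the pattern that are 2'-O-Me (symbol A)."""
    nucleosides = pattern.replace(" ", "").split("-")[0::2]
    return 100 * nucleosides.count("A") / len(nucleosides)

for name, pattern in PATTERNS.items():
    pct = percent_2_o_me(pattern)
    met = [t for t in THRESHOLDS if pct >= t]
    print(f"{name}: {pct:.1f}% 2'-O-Me, meets thresholds {met}")
```

On this count, Formula A3 has 12 of 21 nucleosides as 2'-O-Me (about 57.1%) and so meets only the 50% level, whereas Formula A4 has 16 of 21 (about 76.2%) and meets the 50%, 60%, and 70% levels.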
In some embodiments, 10% or less of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
In some embodiments, 100% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
In some embodiments, 9 internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
In some embodiments, the length of the antisense strand is between 10 and 30 nucleotides (e.g., 10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, 25 nucleotides, 26 nucleotides, 27 nucleotides, 28 nucleotides, 29 nucleotides, or 30 nucleotides), 15 and 25 nucleotides (e.g., 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, or 25 nucleotides), or 18 and 23 nucleotides (e.g., 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, or 23 nucleotides). In some embodiments, the length of the antisense strand is 20 nucleotides. In some embodiments, the length of the antisense strand is 21 nucleotides. In some embodiments, the length of the antisense strand is 22 nucleotides. In some embodiments, the length of the antisense strand is 23 nucleotides. In some embodiments, the length of the antisense strand is 24 nucleotides. In some embodiments, the length of the antisense strand is 25 nucleotides. In some embodiments, the length of the antisense strand is 26 nucleotides. In some embodiments, the length of the antisense strand is 27 nucleotides. In some embodiments, the length of the antisense strand is 28 nucleotides. In some embodiments, the length of the antisense strand is 29 nucleotides. In some embodiments, the length of the antisense strand is 30 nucleotides.
In some embodiments, the siRNA molecules of the branched compound are joined to one another by way of a linker (e.g., an ethylene glycol oligomer, such as tetraethylene glycol). In some embodiments, the siRNA molecules of the branched compound are joined to one another by way of a linker between the sense strand of one siRNA molecule and the sense strand of the other siRNA molecule. In some embodiments, the siRNA molecules are joined by way of linkers between the antisense strand of one siRNA molecule and the antisense strand of the other siRNA molecule. In some embodiments, the siRNA molecules of the branched compound are joined to one another by way of a linker between the sense strand of one siRNA molecule and the antisense strand of the other siRNA molecule.
In some embodiments, the length of the sense strand is between 12 and 30 nucleotides (e.g., 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, 25 nucleotides, 26 nucleotides, 27 nucleotides, 28 nucleotides, 29 nucleotides, or 30 nucleotides), or 14 and 18 nucleotides (e.g., 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, or 18 nucleotides). In some embodiments, the length of the sense strand is 15 nucleotides.
In some embodiments, the length of the sense strand is 16 nucleotides. In some embodiments, the length of the sense strand is 17 nucleotides. In some embodiments, the length of the sense strand is 18 nucleotides. In some embodiments, the length of the sense strand is 19 nucleotides. In some embodiments, the length of the sense strand is 20 nucleotides. In some embodiments, the length of the sense strand is 21 nucleotides. In some embodiments, the length of the sense strand is 22 nucleotides. In some embodiments, the length of the sense strand is 23 nucleotides. In some embodiments, the length of the sense strand is 24 nucleotides. In some embodiments, the length of the sense strand is 25 nucleotides. In some embodiments, the length of the sense strand is 26 nucleotides. In some embodiments, the length of the sense strand is 27 nucleotides. In some embodiments, the length of the sense strand is 28 nucleotides. In some embodiments, the length of the sense strand is 29 nucleotides. In some embodiments, the length of the sense strand is 30 nucleotides.
In some embodiments, 4 internucleoside linkages are phosphorothioate linkages.
In some embodiments, the antisense strand is 18 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 18 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 18 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 18 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 18 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 19 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 19 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 19 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 19 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 19 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 19 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 20 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 20 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 20 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 20 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 20 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 20 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 20 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 21 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 21 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 21 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 21 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 21 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 21 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 21 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 21 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 22 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 22 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 22 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 22 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 22 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 22 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 22 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 22 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 22 nucleotides in length and the sense strand is
22 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
22 nucleotides in length.
In some embodiments, the antisense strand is 23 nucleotides in length and the sense strand is
23 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
22 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
23 nucleotides in length.
In some embodiments, the antisense strand is 24 nucleotides in length and the sense strand is
24 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
22 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
23 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
24 nucleotides in length.
In some embodiments, the antisense strand is 25 nucleotides in length and the sense strand is
25 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
22 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
23 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
24 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
25 nucleotides in length.
In some embodiments, the antisense strand is 26 nucleotides in length and the sense strand is
26 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
22 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
23 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
24 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
25 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
26 nucleotides in length.
In some embodiments, the antisense strand is 27 nucleotides in length and the sense strand is
27 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
22 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
23 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
24 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
25 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
26 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
27 nucleotides in length.
In some embodiments, the antisense strand is 28 nucleotides in length and the sense strand is
28 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
22 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
23 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
24 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
25 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
26 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
27 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
28 nucleotides in length.
In some embodiments, the antisense strand is 29 nucleotides in length and the sense strand is
29 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
14 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
15 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
16 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
17 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
18 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
19 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
20 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
21 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
22 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
23 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
24 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
25 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
26 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
27 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
28 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
29 nucleotides in length.
In some embodiments, the antisense strand is 30 nucleotides in length and the sense strand is
30 nucleotides in length.
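By way of non-limiting illustration only, the antisense and sense strand length combinations enumerated above follow a simple rule: the antisense strand is 18 to 30 nucleotides in length, and the sense strand is from 14 nucleotides up to the length of the antisense strand. The Python sketch below regenerates the enumerated combinations under that reading; the variable name `pairs` is hypothetical.

```python
# Antisense lengths 18-30, each paired with sense lengths from 14 up to
# the antisense length, as enumerated in the preceding embodiments.
pairs = [(antisense, sense)
         for antisense in range(18, 31)
         for sense in range(14, antisense + 1)]

print(len(pairs))            # number of enumerated length combinations
print(pairs[0], pairs[-1])   # first and last combinations
```

Under this rule there are 143 combinations, running from an 18-nucleotide antisense strand with a 14-nucleotide sense strand to a 30-nucleotide antisense strand with a 30-nucleotide sense strand.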
In another aspect, the invention features a branched siRNA molecule including a sense strand and an antisense strand, wherein the antisense strand includes a region having complementarity to a segment of contiguous nucleotides within a gene selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF.
In some embodiments, the antisense strand has complementarity to a portion of a gene encoding a positive regulator of a gene for which increased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
In some embodiments, the antisense strand has complementarity to a portion of a gene encoding a negative regulator of a gene for which decreased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
In some embodiments, the antisense strand has complementarity to a splice isoform of a gene for which overexpression of the splice isoform relative to the expression of the splice isoform in a reference subject is associated with a disease state.
In some embodiments, the sense strand has complementarity to the antisense strand.
In some embodiments, the siRNA molecule is di-branched. In some embodiments, the siRNA molecule is tri-branched. In some embodiments, the siRNA molecule is tetra-branched.
In some embodiments, the antisense strand of the branched siRNA has the following Formula in the 5'-to-3' direction:
Z-((A-P-)n(B-P-)m)q; wherein Z is a 5' phosphorus stabilizing moiety; each A is, independently, a 2'-O-Me ribonucleoside; each B is, independently, a 2'-fluoro-ribonucleoside; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5 (e.g., 1, 2, 3, 4, or 5); m is an integer from 1 to 5 (e.g., 1, 2, 3, 4, or 5); and q is an integer between 1 and 15 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15).
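By way of non-limiting illustration only, the order of 2'-O-Me and 2'-F ribonucleosides implied by the repeating unit ((A-P-)n(B-P-)m)q can be written out for chosen values of n, m, and q. The Python sketch below does so while leaving each linkage P unresolved, since each P is independently a phosphodiester or phosphorothioate linkage; the function name `nucleoside_order` is hypothetical and not part of the disclosure.

```python
def nucleoside_order(n, m, q):
    """Order of 2'-O-Me ('A') and 2'-F ('B') nucleosides in ((A-P-)n(B-P-)m)q."""
    if not (1 <= n <= 5 and 1 <= m <= 5 and 1 <= q <= 15):
        raise ValueError("n and m must be 1 to 5, and q must be 1 to 15")
    # n 2'-O-Me nucleosides followed by m 2'-F nucleosides, repeated q times.
    return ("A" * n + "B" * m) * q

print(nucleoside_order(1, 1, 10))   # strictly alternating, 20 nucleosides
print(nucleoside_order(2, 1, 7))    # two 2'-O-Me then one 2'-F, 21 nucleosides
```

The total number of nucleosides is q(n + m), so that n = m = 1 with q = 10 gives a strictly alternating 20-nucleoside arrangement.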
In some embodiments, the antisense strand has a structure represented by Formula A-I, wherein Formula A-I is, in the 5’-to-3’ direction:
A-B-(A’)j-C-P2-D-P1-(C’-P1)k-C’
Formula A-I; wherein A is represented by the formula C-P1-D-P1; each A’ is represented by the formula C-P2-D-P2; B is represented by the formula C-P2-D-P2-D-P2-D-P2; each C is a 2’-O-methyl (2’-O-Me) ribonucleoside; each C’, independently, is a 2’-O-Me ribonucleoside or a 2’-fluoro (2’-F) ribonucleoside; each D is a 2’-F ribonucleoside; each P1 is a phosphorothioate internucleoside linkage; each P2 is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the antisense strand has a structure represented by Formula A1, wherein Formula A1 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-B-O-B-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-A-S-A-S-B-S-A
Formula A1; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
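The relationship between the generic antisense formula and the fully written-out Formula A1 can be checked mechanically. The sketch below is an illustration under stated assumptions, not part of the disclosure: the function name expand_formula_a_i is hypothetical, the generic symbols C and D are mapped onto the letters A and B used in Formula A1, and P1 and P2 are mapped onto S and O. With j = 4, k = 4, and the C’ choices shown, the expansion appears to reproduce Formula A1 as written.

```python
OME, FLUORO = "A", "B"  # 2'-O-Me (C) and 2'-F (D) ribonucleosides
PS, PO = "S", "O"       # phosphorothioate (P1) and phosphodiester (P2)


def expand_formula_a_i(j: int, k: int, c_prime: list[str]) -> str:
    """Expand A-B-(A')j-C-P2-D-P1-(C'-P1)k-C' into a dash-separated string.

    c_prime holds the k + 1 independently chosen C' nucleosides, 5' to 3'.
    """
    if not (1 <= j <= 7 and 1 <= k <= 7):
        raise ValueError("j and k must each be from 1 to 7")
    if len(c_prime) != k + 1 or any(x not in (OME, FLUORO) for x in c_prime):
        raise ValueError("c_prime needs k + 1 entries, each 'A' or 'B'")
    block_a = [OME, PS, FLUORO, PS]                              # A  = C-P1-D-P1
    block_b = [OME, PO, FLUORO, PO, FLUORO, PO, FLUORO, PO]      # B  = C-P2-(D-P2)x3
    a_prime = [OME, PO, FLUORO, PO]                              # A' = C-P2-D-P2
    tokens = block_a + block_b + a_prime * j + [OME, PO, FLUORO, PS]
    for nucleoside in c_prime[:-1]:
        tokens += [nucleoside, PS]                               # (C'-P1)k
    tokens.append(c_prime[-1])                                   # terminal C'
    return "-".join(tokens)


FORMULA_A1 = (
    "A-S-B-S-A-O-B-O-B-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-"
    "A-S-A-S-A-S-B-S-A"
)
expanded = expand_formula_a_i(j=4, k=4, c_prime=["A", "A", "A", "B", "A"])
print(expanded == FORMULA_A1)
```

The C’ positions are passed in explicitly because the formula leaves each of them open to either modification; other values of j, k, and c_prime yield other members of the same family.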
In some embodiments, the antisense strand has a structure represented by Formula A-II, wherein Formula A-II is, in the 5’-to-3’ direction:
A-B-(A’)j-C-P2-D-P1-(C-P1)k-C’
Formula A-II; wherein A is represented by the formula C-P1-D-P1; each A’ is represented by the formula C-P2-D-P2; B is represented by the formula C-P2-D-P2-D-P2-D-P2; each C is a 2’-O-methyl (2’-O-Me) ribonucleoside; each C’, independently, is a 2’-O-Me ribonucleoside or a 2’-fluoro (2’-F) ribonucleoside; each D is a 2’-F ribonucleoside; each P1 is a phosphorothioate internucleoside linkage; each P2 is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the antisense strand has a structure represented by Formula A2, wherein Formula A2 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-B-O-B-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-A-S-A-S-A-S-A
Formula A2; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S-III, wherein Formula S-III is, in the 5’-to-3’ direction:
E-(A’)m-F
Formula S-III; wherein E is represented by the formula (C-P1)2; F is represented by the formula (C-P2)3-D-P1-C-P1-C, (C-P2)3-D-P2-C-P2-C, (C-P2)3-D-P1-C-P1-D, or (C-P2)3-D-P2-C-P2-D; A’, C, D, P1, and P2 are as defined in Formula II; and m is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the sense strand has a structure represented by Formula S1, wherein Formula S1 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-S-A-S-A
Formula S1; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S2, wherein Formula S2 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-O-A-O-A
Formula S2; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S3, wherein Formula S3 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-S-A-S-B
Formula S3; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S4, wherein Formula S4 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-O-A-O-B
Formula S4; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the antisense strand has a structure represented by Formula A-IV, wherein Formula A-IV is, in the 5’-to-3’ direction:
A-(A’)j-C-P2-B-(C-P1)k-C’
Formula A-IV; wherein A is represented by the formula C-P1-D-P1; each A’ is represented by the formula C-P2-D-P2; B is represented by the formula D-P1-C-P1-D-P1; each C is a 2’-O-Me ribonucleoside; each C’, independently, is a 2’-O-Me ribonucleoside or a 2’-F ribonucleoside; each D is a 2’-F ribonucleoside; each P1 is a phosphorothioate internucleoside linkage; each P2 is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the antisense strand has a structure represented by Formula A3, wherein Formula A3 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-B-S-A-S-A-S-A
Formula A3; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S-V, wherein Formula S-V is, in the 5’-to-3’ direction:
E-(A’)m-C-P2-F
Formula S-V; wherein E is represented by the formula (C-P1)2; F is represented by the formula D-P1-C-P1-C, D-P2-C-P2-C, D-P1-C-P1-D, or D-P2-C-P2-D; A’, C, D, P1, and P2 are as defined in Formula IV; and m is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the sense strand has a structure represented by Formula S5, wherein Formula S5 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-A
Formula S5; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S6, wherein Formula S6 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A
Formula S6; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S7, wherein Formula S7 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-B
Formula S7; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S8, wherein Formula S8 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B
Formula S8; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the antisense strand has a structure represented by Formula A-VI, wherein Formula A-VI is, in the 5’-to-3’ direction:
A-Bj-E-Bk-E-F-Gl-D-P1-C’
Formula A-VI; wherein A is represented by the formula C-P1-D-P1; each B is represented by the formula C-P2; each C is a 2’-O-Me ribonucleoside; each C’, independently, is a 2’-O-Me ribonucleoside or a 2’-F ribonucleoside; each D is a 2’-F ribonucleoside; each E is represented by the formula D-P2-C-P2; F is represented by the formula D-P1-C-P1; each G is represented by the formula C-P1; each P1 is a phosphorothioate internucleoside linkage; each P2 is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and l is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the antisense strand has a structure represented by Formula A4, wherein Formula A4 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-A-O-A-O-B-O-A-O-A-O-A-O-A-O-A-O-A-O-A-O-B-O-A-O-B-S-A-S-A-S-A-S-B-S-A
Formula A4; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the sense strand has a structure represented by Formula S-VII, wherein Formula S-VII is, in the 5’-to-3’ direction:
H-Bm-In-A’-Bo-H-C
Formula S-VII; wherein A’ is represented by the formula C-P2-D-P2; each H is represented by the formula (C-P1)2; each I is represented by the formula (D-P2); B, C, D, P1, and P2 are as defined in Formula VI; m is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); n is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and o is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7).
In some embodiments, the sense strand has a structure represented by Formula S9, wherein Formula S9 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-A-O-A-O-B-O-B-O-B-O-A-O-B-O-A-O-A-O-A-O-A-S-A-S-A
Formula S9; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments, the antisense strand also has a 5’ phosphorus stabilizing moiety at the 5’ end of the antisense strand.
In some embodiments, the sense strand also has a 5’ phosphorus stabilizing moiety at the 5’ end of the sense strand.
In some embodiments, each 5’-phosphorus stabilizing moiety is, independently, represented by any one of Formula I-VIII: wherein Nuc represents a nucleobase, such as adenine, uracil, guanine, thymine, or cytosine, and R represents optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl (e.g., optionally substituted C1-C6 alkyl, optionally substituted C2-C6 alkenyl, or optionally substituted C2-C6 alkynyl), phenyl, benzyl, hydroxy, or hydrogen.
In some embodiments, Z is (E)-vinylphosphonate as represented in Formula III.
In some embodiments, each P is independently selected from phosphodiester and phosphorothioate.
In some embodiments, n is from 1 to 4 (e.g., 1 , 2, 3, or 4), 1 to 3 (e.g., 1 , 2, or 3), or 1 to 2. In some embodiments, n is 1.
In some embodiments, m is from 1 to 4 (e.g., 1 , 2, 3, or 4), 1 to 3 (e.g., 1 , 2, or 3), or 1 to 2. In some embodiments, m is 1.
In some embodiments, n and m are each 1.
In some embodiments, 50% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
In some embodiments, 60% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
In some embodiments, 70% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
In some embodiments, 80% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
In some embodiments, 90% or more of the ribonucleotides in the antisense strand are 2'-O-Me ribonucleotides (e.g., 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% of the ribonucleotides in the antisense strand may be 2'-O-Me ribonucleotides).
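The percentage thresholds above can be checked for any strand written in the A/B/O/S notation used for Formulas A1-A4 and S1-S9. The following is a minimal sketch, assuming that notation as input; the helper name composition and the returned field names are hypothetical. It is shown applied to the strand of Formula A1.

```python
def composition(formula: str) -> dict[str, float]:
    """Summarize a strand written as nucleoside-linkage-nucleoside-...

    A = 2'-O-Me ribonucleoside, B = 2'-F ribonucleoside,
    O = phosphodiester linkage, S = phosphorothioate linkage.
    """
    tokens = formula.replace(" ", "").split("-")
    nucleosides = tokens[0::2]  # even positions are nucleosides
    linkages = tokens[1::2]     # odd positions are linkages
    if len(linkages) != len(nucleosides) - 1:
        raise ValueError("expected alternating nucleosides and linkages")
    return {
        "length": len(nucleosides),
        "pct_2_o_me": 100 * nucleosides.count("A") / len(nucleosides),
        "n_phosphorothioate": linkages.count("S"),
        "n_phosphodiester": linkages.count("O"),
    }


formula_a1 = (
    "A-S-B-S-A-O-B-O-B-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-"
    "A-S-A-S-A-S-B-S-A"
)
stats = composition(formula_a1)
print(stats)
print(stats["pct_2_o_me"] >= 50)  # tests the "50% or more" threshold
```

The same function can be applied to a sense-strand formula to count its phosphorothioate linkages.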
In some embodiments, 10% or less of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages. In some embodiments, 100% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
In some embodiments, the length of the antisense strand is between 10 and 30 nucleotides (e.g., 10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, 25 nucleotides, 26 nucleotides, 27 nucleotides, 28 nucleotides, 29 nucleotides, or 30 nucleotides), 15 and 25 nucleotides (e.g., 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, or 25 nucleotides), or 18 and 23 nucleotides (e.g., 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, or 23 nucleotides). In some embodiments, the length of the antisense strand is 21 nucleotides. In some embodiments, the length of the antisense strand is 22 nucleotides. In some embodiments, the length of the antisense strand is 23 nucleotides. In some embodiments, the length of the antisense strand is 24 nucleotides. In some embodiments, the length of the antisense strand is 25 nucleotides. In some embodiments, the length of the antisense strand is 26 nucleotides. In some embodiments, the length of the antisense strand is 27 nucleotides. In some embodiments, the length of the antisense strand is 28 nucleotides. In some embodiments, the length of the antisense strand is 29 nucleotides. In some embodiments, the length of the antisense strand is 30 nucleotides.
In some embodiments, 9 internucleoside linkages are phosphorothioate.
In some embodiments, the sense strand of the branched siRNA has the following formula in the 5'-to-3' direction:
Y-((A-P-)n(B-P-)m)qL-((B-P-)m(A-P-)n)q;
wherein Y is a hydrophobic moiety (e.g., cholesterol, vitamin D, or tocopherol); L is a linker; each A is, independently, a 2'-O-Me ribonucleoside; each B is, independently, a 2'-fluoro-ribonucleoside; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5 (e.g., 1, 2, 3, 4, or 5); m is an integer from 1 to 5 (e.g., 1, 2, 3, 4, or 5); and q is an integer between 1 and 15 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15).
In some embodiments, Y is cholesterol.
In some embodiments, Y is tocopherol.
In some embodiments, L is an ethylene glycol oligomer.
In some embodiments, L is tetraethylene glycol.
In some embodiments, each P is independently selected from phosphodiester and phosphorothioate.
In some embodiments, n is from 1 to 4 (e.g., 1 , 2, 3, or 4), 1 to 3 (e.g., 1 , 2, or 3), or 1 to 2. In some embodiments, n is 1.
In some embodiments, m is from 1 to 4 (e.g., 1 , 2, 3, or 4), 1 to 3 (e.g., 1 , 2, or 3), or 1 to 2. In some embodiments, m is 1.
In some embodiments, n and m are each 1.
In some embodiments, 10% or less of the ribonucleosides are 2'-O-Me ribonucleosides.
In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the ribonucleosides are 2'-O-Me ribonucleosides.
In some embodiments, 10% or less of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages. In some embodiments, 100% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
In some embodiments, the length of the sense strand is between 12 and 30 nucleotides (e.g., 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, 25 nucleotides, 26 nucleotides, 27 nucleotides, 28 nucleotides, 29 nucleotides, or 30 nucleotides), or 14 and 18 nucleotides (e.g., 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, or 18 nucleotides). In some embodiments, the length of the sense strand is 16 nucleotides. In some embodiments, the length of the sense strand is 17 nucleotides. In some embodiments, the length of the sense strand is 18 nucleotides. In some embodiments, the length of the sense strand is 19 nucleotides. In some embodiments, the length of the sense strand is 20 nucleotides. In some embodiments, the length of the sense strand is 21 nucleotides. In some embodiments, the length of the sense strand is 22 nucleotides. In some embodiments, the length of the sense strand is 23 nucleotides. In some embodiments, the length of the sense strand is 24 nucleotides. In some embodiments, the length of the sense strand is 25 nucleotides. In some embodiments, the length of the sense strand is 26 nucleotides. In some embodiments, the length of the sense strand is 27 nucleotides. In some embodiments, the length of the sense strand is 28 nucleotides. In some embodiments, the length of the sense strand is 29 nucleotides. In some embodiments, the length of the sense strand is 30 nucleotides.
In some embodiments, 4 internucleoside linkages are phosphorothioate.
In another aspect, the invention features a method of treating a subject diagnosed as having a disease associated with expression of a dysregulated microglial gene (e.g., a wild-type or mutated microglial gene), the method including administering to the subject the branched siRNA molecule of any one of the above aspects or embodiments.
In some embodiments, the dysregulated microglial gene is selected from the group consisting of ABCA7, ABI3, ADAM10, APOC1, APOE, AXL, BIN1, C1QA, C3, C9ORF72, CASS4, CCL5, CD2AP, CD33, CD68, CLPTM1, CLU, CR1, CSF1, CST7, CTSB, CTSD, CTSL, CXCL10, CXCL13, DSG2, ECHDC3, EPHA1, FABP5, FERMT2, FTH1, GNAS, GRN, HBEGF, HLA-DRB1, HLA-DRB5, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IGF1, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, ITGAX, LILRB4, LPL, MEF2C, MMP12, MS4A4A, MS4A6A, NLRP3, NME8, NOS2, PICALM, PILRA, PLCG2, PTK2B, SCIMP, SLC24A4, SORL1, SPI1, SPP1, SPPL2A, TBK1, TNF, TREM2, TREML2, TYROBP, and ZCWPW1.
In some embodiments, the dysregulated microglial gene exhibits increased expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the same gene in microglial cells of a reference subject.
In some embodiments, the dysregulated microglial gene exhibits reduced expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the same gene in microglial cells of a reference subject.
In some embodiments, the administering of the branched siRNA molecule to the subject results in silencing of a gene in the subject.
In some embodiments, the silencing of a gene comprises silencing any one of the genes selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF.
In some embodiments, silencing of a gene comprises silencing of a positive regulator of a gene for which increased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
In some embodiments, silencing of a gene comprises silencing of a negative regulator of a gene for which decreased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
In some embodiments, silencing of a gene comprises silencing of a splice isoform of a gene for which overexpression of the splice isoform relative to the expression of the splice isoform in a reference subject is associated with a disease state.
In some embodiments, the subject is a human.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 1A-1D are a series of fluorescence images of brain and spinal cord tissue of cynomolgus macaques treated with a single intrathecal injection of Cy3-labeled di-siRNA of the disclosure. Fluorescence images were acquired from representative regions of the brain, including cortex (FIG. 1A), hippocampus (FIG. 1B), and caudate nucleus (FIG. 1C), and of the spinal cord (FIG. 1D). Microglial cells (Iba1 channel), di-siRNAs (Cy3 channel), and cell nuclei (DAPI) were labeled. White arrows indicate colocalization of Cy3 di-siRNA signal within microglial cells labeled with the Iba1 antibody. Scale bars = 20 μm.
DEFINITIONS
Unless otherwise defined herein, scientific and technical terms used herein have the meanings that are commonly understood by those of ordinary skill in the art. In the event of any latent ambiguity, definitions provided herein take precedence over any dictionary or extrinsic definition. Unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular.
The use of "or" means "and/or" unless stated otherwise. The use of the term "including," as well as other forms, such as "includes" and "included," is not limiting.
As used herein, the term "nucleic acids" refers to RNA or DNA molecules consisting of a chain of ribonucleotides or deoxyribonucleotides, respectively.
As used herein, the term "therapeutic nucleic acid" refers to a nucleic acid molecule (e.g., ribonucleic acid) that has partial or complete complementarity to, and interacts with, a disease-associated target mRNA and mediates silencing of expression of the mRNA.
As used herein, the term "carrier nucleic acid" refers to a nucleic acid molecule (e.g., ribonucleic acid) that has sequence complementarity with, and hybridizes with, a therapeutic nucleic acid.
As used herein, the term "3' end" refers to the end of the nucleic acid that contains an unmodified hydroxyl group at the 3' carbon of the ribose ring.
As used herein, the term "nucleoside" refers to a molecule made up of a heterocyclic base and its sugar.
As used herein, the term "nucleotide" refers to a nucleoside having a phosphate group on its 3' or 5' sugar hydroxyl group.
As used herein, the term "siRNA" refers to small interfering RNA duplexes that induce the RNA interference (RNAi) pathway. siRNA molecules can vary in length (generally, between 18 and 30 base pairs) and contain varying degrees of complementarity to their target mRNA. The term "siRNA" includes duplexes of two separate strands, as well as single strands that optionally form hairpin structures comprising a duplex region.
As used herein, the term "antisense strand" refers to the strand of the siRNA duplex that contains some degree of complementarity to the target gene.
As used herein, the term "sense strand" refers to the strand of the siRNA duplex that contains complementarity to the antisense strand.
As used herein, the terms "chemically modified nucleotide" or "nucleotide analog" or "altered nucleotide" or "modified nucleotide" refer to a non-standard nucleotide, including non-naturally occurring ribonucleotides or deoxyribonucleotides. Exemplary nucleotide analogs are modified at any position so as to alter certain chemical properties of the nucleotide yet retain the ability of the nucleotide analog to perform its intended function.
As used herein, the term "metabolically stabilized" refers to RNA molecules that contain ribonucleotides that have been chemically modified from 2'-hydroxyl groups to 2'-O-methyl groups.
As used herein, the term "phosphorothioate" refers to the phosphate group of a nucleotide that is modified by substituting one or more of the oxygens of the phosphate group with sulfur.
As used herein, the term "ethylene glycol chain" refers to a carbon chain with the formula ((CH2OH)2).
As used herein, “alkyl” refers to a saturated hydrocarbon group. Alkyl groups may be acyclic or cyclic and contain only C and H when unsubstituted. When an alkyl residue having a specific number of carbons is named, all geometric isomers having that number of carbons are intended to be encompassed and described; thus, for example, “butyl” is meant to include n-butyl, sec-butyl, and iso-butyl. Examples of alkyl include ethyl, propyl, butyl, pentyl, hexyl, heptyl, octyl, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, cycloheptyl, cyclooctyl, and the like. In some embodiments, alkyl may be substituted. Suitable substituents that may be introduced into an alkyl group include, for example, hydroxy, alkoxy, amino, alkylamino, and halo, among others.
As used herein, “alkenyl” refers to an acyclic or cyclic unsaturated hydrocarbon group having at least one site of olefinic unsaturation (i.e., having at least one moiety of the formula C=C). Alkenyl groups contain only C and H when unsubstituted. When an alkenyl residue having a specific number of carbons is named, all geometric isomers having that number of carbons are intended to be encompassed and described; thus, for example, “butenyl” is meant to include n-butenyl, sec-butenyl, and iso-butenyl. Examples of alkenyl include -CH=CH2, -CH2-CH=CH2, and -CH2-CH=CH-CH=CH2. In some embodiments, alkenyl may be substituted. Suitable substituents that may be introduced into an alkenyl group include, for example, hydroxy, alkoxy, amino, alkylamino, and halo, among others.
As used herein, “alkynyl” refers to an acyclic or cyclic unsaturated hydrocarbon group having at least one site of acetylenic unsaturation (i.e., having at least one moiety of the formula C≡C). Alkynyl groups contain only C and H when unsubstituted. When an alkynyl residue having a specific number of carbons is named, all geometric isomers having that number of carbons are intended to be encompassed and described; thus, for example, “pentynyl” is meant to include n-pentynyl, sec-pentynyl, iso-pentynyl, and tert-pentynyl. Examples of alkynyl include -C≡CH and -C≡C-CH3. In some embodiments, alkynyl may be substituted. Suitable substituents that may be introduced into an alkynyl group include, for example, hydroxy, alkoxy, amino, alkylamino, and halo, among others.
As used herein, the term "phenyl" denotes a monocyclic arene in which one hydrogen atom from a carbon atom of the ring has been removed. A phenyl group can be unsubstituted or substituted with one or more suitable substituents, wherein the substituent replaces an H of the phenyl group.
As used herein, the term “benzyl” refers to a monovalent radical obtained when a hydrogen atom attached to the methyl group of toluene is removed. A benzyl generally has the formula of phenyl-CH2-. A benzyl group can be unsubstituted or substituted with one or more suitable substituents. For example, the substituent may replace an H of the phenyl component and/or an H of the methylene (-CH2-) component.
As used herein, the term "amide" refers to an alkyl or aromatic group that is attached to an amino-carbonyl functional group.
As used herein, the terms "internucleoside" and "internucleotide" refer to the bonds between nucleosides and nucleotides, respectively.
As used herein, the term "triazol" refers to heterocyclic compounds with the formula (C2H3N3), having a five-membered ring of two carbons and three nitrogens, the positions of which can change, resulting in multiple isomers.
As used herein, the term "terminal group" refers to the group at which a carbon chain or nucleic acid ends.
As used herein, the term "lipophilic amino acid" refers to an amino acid comprising a hydrophobic moiety (e.g., an alkyl chain or an aromatic ring).
As used herein, the term "antagomiRs" refers to nucleic acids that can function as inhibitors of miRNA activity.
As used herein, the term "gapmers" refers to chimeric antisense nucleic acids that contain a central block of deoxynucleotide monomers sufficiently long to induce RNase H cleavage. The deoxynucleotide block is flanked by ribonucleotide monomers or ribonucleotide monomers containing modifications.
As used herein, the term "mixmers" refers to nucleic acids that are comprised of a mix of locked nucleic acids (LNAs) and DNA.
As used herein, the term "guide RNAs" refers to nucleic acids that have sequence complementarity to a specific sequence in the genome immediately or 1 base pair upstream of the protospacer adjacent motif (PAM) sequence as used in CRISPR/Cas9 gene editing systems.
Alternatively, “guide RNAs” may refer to nucleic acids that have sequence complementarity (e.g., are antisense) to a specific messenger RNA (mRNA) sequence. In this context, a guide RNA may also have sequence complementarity to a “passenger RNA” sequence of equal or shorter length, which is identical or substantially identical to the sequence of mRNA to which the guide RNA hybridizes.
As used herein, the term "target of delivery" refers to the organ or part of the body to which delivery of the branched oligonucleotide compositions is desired.
As used herein, the term “branched siRNA” refers to a compound containing two or more double-stranded siRNA molecules covalently bound to one another. Branched siRNA molecules may be “di-branched,” also referred to herein as “di-siRNA,” wherein the siRNA molecule comprises 2 siRNA molecules covalently bound to one another, e.g., by way of a linker. Branched siRNA molecules may be “tri-branched,” also referred to herein as “tri-siRNA,” wherein the siRNA molecule comprises 3 siRNA molecules covalently bound to one another, e.g., by way of a linker. Branched siRNA molecules may be “tetra-branched,” also referred to herein as “tetra-siRNA,” wherein the siRNA molecule comprises 4 siRNA molecules covalently bound to one another, e.g., by way of a linker.
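The definition above can be pictured as a simple data structure. The sketch below is illustrative only: the class names Duplex and BranchedSiRNA, the field names, and the placeholder strand labels are all hypothetical and do not describe any particular molecule of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Duplex:
    """One double-stranded siRNA unit (strand contents are placeholders)."""
    antisense: str
    sense: str


@dataclass(frozen=True)
class BranchedSiRNA:
    """Two or more duplexes covalently joined, e.g., by way of a linker."""
    duplexes: tuple[Duplex, ...]
    linker: str = "unspecified linker"

    def __post_init__(self) -> None:
        if len(self.duplexes) < 2:
            raise ValueError("a branched siRNA contains two or more duplexes")

    @property
    def branching(self) -> str:
        """Return the name used in the text for 2, 3, or 4 duplexes."""
        names = {2: "di-siRNA", 3: "tri-siRNA", 4: "tetra-siRNA"}
        count = len(self.duplexes)
        return names.get(count, f"{count}-branched siRNA")


unit = Duplex(antisense="antisense-placeholder", sense="sense-placeholder")
print(BranchedSiRNA(duplexes=(unit, unit)).branching)
```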
As used herein, the term “5' phosphorus stabilizing moiety” refers to a terminal phosphate group that includes phosphates as well as modified phosphates (e.g., phosphorothioates, phosphodiesters, phosphonates). The phosphate moiety can be located at either terminus but is preferred at the 5'-terminal nucleoside. In one aspect, the terminal phosphate is unmodified having the formula -O-P(=O)(OH)OH. In another aspect, the terminal phosphate is modified such that one or more of the O and OH groups are replaced with H, O, S, N(R’), or alkyl where R’ is H, an amino protecting group, or unsubstituted or substituted alkyl. In some embodiments, the 5' and/or 3' terminal group can comprise from 1 to 3 phosphate moieties that are each, independently, unmodified (di- or tri-phosphates) or modified.
As used herein, the term “between X and Y” is inclusive of the values of X and Y. For example, “between X and Y” refers to the range of values between the value of X and the value of Y, as well as the value of X and the value of Y.
As used herein, an "amino acid" refers to a molecule containing amine and carboxyl functional groups and a side chain specific to the amino acid. In some embodiments, the amino acid is chosen from the group of proteinogenic amino acids. In other embodiments, the amino acid is an L-amino acid or a D-amino acid. In other embodiments, the amino acid is a synthetic amino acid (e.g., a beta-amino acid).
It is understood that certain internucleotide linkages provided herein, including, e.g., phosphodiester and phosphorothioate, comprise a formal charge of -1 at physiological pH, and that said formal charge will be balanced by a cationic moiety, e.g., an alkali metal such as sodium or potassium, an alkaline earth metal such as calcium or magnesium, or an ammonium or guanidinium ion.
The phosphate group of the nucleotide may also be modified, e.g., by substituting one or more of the oxygens of the phosphate group with sulfur (e.g., phosphorothioates), or by making other substitutions which allow the nucleotide to perform its intended function such as described in, for example, Eckstein, Antisense Nucleic Acid Drug Dev. 2000 Apr. 10(2):117-21, Rusckowski et al. Antisense Nucleic Acid Drug Dev. 2000 Oct. 10(5):333-45, Stein, Antisense Nucleic Acid Drug Dev. 2001 Oct. 11(5):317-25, Vorobjev et al. Antisense Nucleic Acid Drug Dev. 2001 Apr. 11(2):77-85, and U.S. Pat. No. 5,684,143. Certain of the above-referenced modifications (e.g., phosphate group modifications) preferably decrease the rate of hydrolysis of, for example, polynucleotides comprising said analogs in vivo or in vitro.
As used herein, the term “complementary” refers to two nucleotides that form canonical Watson-Crick base pairs. For the avoidance of doubt, Watson-Crick base pairs in the context of the present disclosure include adenine-thymine, adenine-uracil, and cytosine-guanine base pairs. A proper Watson-Crick base pair is referred to in this context as a “match,” while each unpaired nucleotide, and each incorrectly paired nucleotide, is referred to as a “mismatch.” Alignment for purposes of determining percent nucleic acid sequence complementarity can be achieved in various ways that are within the capabilities of one of skill in the art, for example, using publicly available computer software such as BLAST, BLAST-2, or Megalign software.
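By way of illustration only, the match/mismatch rule stated above can be sketched in a few lines of Python. The names `WATSON_CRICK_PAIRS` and `is_match` are hypothetical and are not part of the disclosure; an unpaired position is represented here, by assumption, as `None`.

```python
# Illustrative sketch only; names are hypothetical, not part of the disclosure.
# Canonical Watson-Crick pairs as listed in the definition:
# adenine-thymine, adenine-uracil, and cytosine-guanine.
WATSON_CRICK_PAIRS = {frozenset("AT"), frozenset("AU"), frozenset("CG")}


def is_match(base_a, base_b):
    """Return True for a "match" (canonical pair), False for a "mismatch".

    An unpaired nucleotide is passed as None (an assumption of this sketch)
    and is always scored as a mismatch, as is any non-canonical pairing.
    """
    if base_a is None or base_b is None:
        return False
    return frozenset((base_a.upper(), base_b.upper())) in WATSON_CRICK_PAIRS
```

Under this sketch a G-U wobble pair is scored as a mismatch, because only the three canonical pairings named in the definition are accepted.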
As used herein, the term “percent (%) sequence complementarity” with respect to a reference polynucleotide sequence is defined as the percentage of nucleic acids in a candidate sequence that are complementary to the nucleic acids in the reference polynucleotide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence complementarity. A given nucleotide is considered to be “complementary” to a reference nucleotide as described herein if the two nucleotides form canonical Watson-Crick base pairs. For the avoidance of doubt, Watson-Crick base pairs in the context of the present disclosure include adenine-thymine, adenine-uracil, and cytosine-guanine base pairs. A proper Watson-Crick base pair is referred to in this context as a “match,” while each unpaired nucleotide, and each incorrectly paired nucleotide, is referred to as a “mismatch.” Alignment for purposes of determining percent nucleic acid sequence complementarity can be achieved in various ways that are within the capabilities of one of skill in the art, for example, using publicly available computer software such as BLAST, BLAST-2, or Megalign software. Those skilled in the art can determine appropriate parameters for aligning sequences, including any algorithms needed to achieve maximal complementarity over the full length of the sequences being compared. As an illustration, the percent sequence complementarity of a given nucleic acid sequence, A, to a given nucleic acid sequence, B, (which can alternatively be phrased as a given nucleic acid sequence, A that has a certain percent complementarity to a given nucleic acid sequence, B) is calculated as follows:
100 multiplied by (the fraction X/Y) where X is the number of complementary base pairs in an alignment (e.g., as executed by computer software, such as BLAST) in that program’s alignment of A and B, and where Y is the total number of nucleic acids in B. It will be appreciated that where the length of nucleic acid sequence A is not equal to the length of nucleic acid sequence B, the percent sequence complementarity of A to B will not equal the percent sequence complementarity of B to A. As used herein, a query nucleic acid sequence is considered to be “completely complementary” to a reference nucleic acid sequence if the query nucleic acid sequence has 100% sequence complementarity to the reference nucleic acid sequence.
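As a non-limiting worked example, the calculation 100 × (X/Y) may be sketched as follows. This sketch does not use BLAST or any other alignment software; it substitutes, as a simplifying assumption, an exhaustive search over ungapped antiparallel alignments, and the function name `percent_complementarity` is hypothetical. Both sequences are assumed to be written 5' to 3'.

```python
# Illustrative sketch only; not the alignment method of BLAST or Megalign.
# Ordered canonical pairs: A-T, A-U, and C-G, in either orientation.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_complementarity(a, b):
    """Percent complementarity of sequence A to reference sequence B.

    X is the largest number of canonical base pairs found over all ungapped
    antiparallel alignments of A against B; Y is the total length of B.
    """
    if not b:
        raise ValueError("reference sequence B must not be empty")
    a_rev = a.upper()[::-1]  # reverse A so that it pairs antiparallel to B
    b = b.upper()
    best_x = 0
    # Slide reversed A across B, counting canonical pairs at each offset.
    for offset in range(-(len(a_rev) - 1), len(b)):
        x = 0
        for i, base in enumerate(a_rev):
            j = i + offset
            if 0 <= j < len(b) and (base, b[j]) in PAIRS:
                x += 1
        best_x = max(best_x, x)
    return 100.0 * best_x / len(b)
```

For example, with A = AUGC and B = GCAUGG, all four nucleotides of A pair with B, so the complementarity of A to B is 100 × 4/6, whereas the complementarity of B to A is 100 × 4/4. This reproduces the asymmetry noted above for sequences of unequal length.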
The term “gene silencing” refers to the suppression of gene expression, e.g., transgene, heterologous gene and/or endogenous gene expression, which may be mediated through processes that affect transcription and/or through processes that affect post-transcriptional mechanisms. In some embodiments, gene silencing occurs when an RNAi molecule initiates the inhibition or degradation of the mRNA transcribed from a gene of interest in a sequence-specific manner via RNA interference, thereby preventing translation of the gene's product.
The phrase “overactive disease driver gene,” as used herein, refers to a microglial gene having increased activity and/or expression that contributes to or causes a disease state in a subject (e.g., a human). The disease state may be caused or exacerbated by the overactive disease driver gene directly or by way of an intermediate gene(s).
The term “negative regulator,” as used herein, refers to a microglial gene that negatively regulates (e.g., reduces or inhibits) the expression and/or activity of another microglial gene or set of genes (e.g., dysregulated microglial gene or dysregulated microglial gene pathway).
The term “positive regulator,” as used herein, refers to a microglial gene that positively regulates (e.g., increases or saturates) the expression and/or activity of another microglial gene or set of microglial genes (e.g., dysregulated microglial gene or dysregulated microglial gene pathway).
The term “phosphate moiety,” as used herein, refers to a terminal phosphate group that includes phosphates as well as modified phosphates. The phosphate moiety can be located at either terminus but is preferred at the 5'-terminal nucleoside. In one aspect, the terminal phosphate is unmodified having the formula -O-P(=O)(OH)OH. In another aspect, the terminal phosphate is modified such that one or more of the O and OH groups are replaced with H, O, S, N(R’), or alkyl where R’ is H, an amino protecting group, or unsubstituted or substituted alkyl. In some embodiments, the 5' and/or 3' terminal group can comprise from 1 to 3 phosphate moieties that are each, independently, unmodified (di- or tri-phosphates) or modified.
In the context of this invention, the term "oligonucleotide" refers to an oligomer or polymer of ribonucleic acid (RNA) or deoxyribonucleic acid (DNA) or mimetics thereof. This term includes oligonucleotides composed of naturally-occurring nucleobases, sugars and covalent internucleoside (backbone) linkages as well as oligonucleotides having non-naturally-occurring portions that function similarly. Such modified or substituted oligonucleotides are often preferred over native forms because of desirable properties such as, for example, enhanced cellular uptake, enhanced affinity for nucleic acid target and increased stability in the presence of nucleases.
As used herein, the term “reference subject” refers to a healthy control subject who is the same as or similar to a subject treated with a composition of the disclosure with respect to, e.g., age, sex, geographical region, and/or education level. A healthy reference subject is one that does not suffer from a disease associated with expression of a dysregulated microglial gene or a dysregulated microglial gene pathway. Moreover, a healthy reference subject is one that does not suffer from a disease associated with altered (e.g., increased or decreased) expression and/or activity of a microglial gene.
As used herein, the terms “treat,” “treated,” or “treating” mean both therapeutic treatment and prophylactic or preventative measures wherein the object is to prevent or slow down (lessen) an undesired physiological condition, disorder, or disease, or obtain beneficial or desired clinical results. Beneficial or desired clinical results include, but are not limited to, alleviation of symptoms; diminishment of the extent of a condition, disorder, or disease; stabilized (i.e., not worsening) state of condition, disorder, or disease; delay in onset or slowing of condition, disorder, or disease progression; amelioration of the condition, disorder, or disease state or remission (whether partial or total), whether detectable or undetectable; an amelioration of at least one measurable physical parameter, not necessarily discernible by the patient; or enhancement or improvement of condition, disorder, or disease. Treatment includes eliciting a clinically significant response without excessive levels of side effects. Treatment also includes prolonging survival as compared to expected survival if not receiving treatment.
Genes described herein
As used herein, the term “ABCA7” refers to the gene encoding Phospholipid-transporting ATPase ABCA7. The terms “ABCA7” and "Phospholipid-transporting ATPase ABCA7" include wild-type forms of the ABCA7 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type ABCA7. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type ABCA7 nucleic acid sequence (e.g., SEQ ID NO: 1 , European Nucleotide Archive (ENA) accession number AF250238). SEQ ID NO: 1 is a wild-type gene sequence encoding ABCA7 protein, and is shown below:
ATGGCCTTCTGGACACAGCTGATGCTGCTGCTCTGGAAGAATTTCATGTATCGCCGG AGA
CAGCCGGTCCAGCTCCTGGTCGAATTGCTGTGGCCTCTCTTCCTCTTCTTCATCCTG GTG
GCTGTTCGCCACTCCCACCCGCCCCTGGAGCACCATGAATGCCACTTCCCAAACAAG CCA
CTGCCATCGGCGGGCACCGTGCCCTGGCTCCAGGGTCTCATCTGTAATGTGAACAAC ACC
TGCTTTCCGCAGCTGACACCGGGCGAGGAGCCCGGGCGCCTGAGCAACTTCAACGAC TCC
CTGGTCTCCCGGCTGCTAGCCGATGCCCGCACTGTGCTGGGAGGGGCCAGTGCCCAC AGG
ACGCTGGCTGGCCTAGGGAAGCTGATCGCCACGCTGAGGGCTGCACGCAGCACGGCC CAG
CCTCAACCAACCAAGCAGTCTCCACTGGAACCACCCATGCTGGATGTCGCGGAGCTG CTG
ACGTCACTGCTGCGCACGGAATCCCTGGGGTTGGCACTGGGCCAAGCCCAGGAGCCC TTG
CACAGCTTGTTGGAGGCCGCTGAGGACCTGGCCCAGGAGCTCCTGGCGCTGCGCAGC CTG
GTGGAGCTTCGGGCACTGCTGCAGAGACCCCGAGGGACCAGCGGCCCCCTGGAGTTG CTG
TCAGAGGCCCTCTGCAGTGTCAGGGGACCTAGCAGCACAGTGGGCCCCTCCCTCAAC TGG
TACGAGGCTAGTGACCTGATGGAGCTGGTGGGGCAGGAGCCAGAATCCGCCCTGCCA GAC
AGCAGCCTGAGCCCCGCCTGCTCGGAGCTGATTGGAGCCCTGGACAGCCACCCGCTG TCC
CGCCTGCTCTGGAGACGCCTGAAGCCTCTGATCCTCGGGAAGCTACTCTTTGCACCA GAT
ACACCTTTTACCCGGAAGCTCATGGCCCAGGTCAACCGGACCTTCGAGGAGCTCACC CTG
CTGAGGGATGTCCGGGAGGTGTGGGAGATGCTGGGACCCCGGATCTTCACCTTCATG AAC
GACAGTTCCAATGTGGCCATGCTGCAGCGGCTCCTGCAGATGCAGGATGAAGGAAGA AGG
CAGCCCAGACCTGGAGGCCGGGACCACATGGAGGCCCTGCGATCCTTTCTGGACCCT GGG
AGCGGTGGCTACAGCTGGCAGGACGCACACGCTGATGTGGGGCACCTGGTGGGCACG CTG
GGCCGAGTGACGGAGTGCCTGTCCTTGGACAAGCTGGAGGCGGCACCCTCAGAGGCA GCC
CTGGTGTCGCGGGCCCTGCAACTGCTCGCGGAACATCGATTCTGGGCCGGCGTCGTC TTC
TTGGGACCTGAGGACTCTTCAGACCCCACAGAGCACCCAACCCCAGACCTGGGCCCC GGC
CACGTGCGCATCAAAATCCGCATGGACATTGACGTGGTCACGAGGACCAATAAGATC AGG
GACAGGTTTTGGGACCCTGGCCCAGCCGCGGACCCCCTGACCGACCTGCGCTACGTG TGG
GGCGGCTTCGTGTACCTGCAAGACCTGGTGGAGCGTGCAGCCGTCCGCGTGCTCAGC GGC
GCCAACCCCCGGGCCGGCCTCTACCTGCAGCAGATGCCCTATCCGTGCTATGTGGAC GAC
GTGTTCCTGCGTGTGCTGAGCCGGTCGCTGCCGCTCTTCCTGACGCTGGCCTGGATC TAC
TCCGTGACACTGACAGTGAAGGCCGTGGTGCGGGAGAAGGAGACGCGGCTGCGGGAC ACC
ATGCGCGCCATGGGGCTCAGCCGCGCGGTGCTCTGGCTAGGCTGGTTCCTCAGCTGC CTC
GGGCCCTTCCTGCTCAGCGCCGCACTGCTGGTTCTGGTGCTCAAGCTGGGAGACATC CTC
CCCTACAGCCACCCGGGCGTGGTCTTCCTGTTCTTGGCAGCCTTCGCGGTGGCCACG GTG
ACCCAGAGCTTCCTGCTCAGCGCCTTCTTCTCCCGCGCCAACCTGGCTGCGGCCTGC GGC
GGCCTGGCCTACTTCTCCCTCTACCTGCCCTACGTGCTGTGTGTGGCTTGGCGGGAC CGG
CTGCCCGCGGGTGGCCGCGTGGCCGCGAGCCTGCTGTCGCCCGTGGCCTTCGGCTTC GGC
TGCGAGAGCCTGGCTCTGCTGGAGGAGCAGGGCGAGGGCGCGCAGTGGCACAACGTG GGC
ACCCGGCCTACGGCAGACGTCTTCAGCCTGGCCCAGGTCTCTGGCCTTCTGCTGCTG GAC
GCGGCGCTCTACGGCCTCGCCACCTGGTACCTGGAAGCTGTGTGCCCAGGCCAGTAC GGG
ATCCCTGAACCATGGAATTTTCCTTTTCGGAGGAGCTACTGGTGCGGACCTCGGCCC CCC
AAGAGTCCAGCCCCTTGCCCCACCCCGCTGGACCCAAAGGTGCTGGTAGAAGAGGCA CCG
CCCGGCCTGAGTCCTGGCGTCTCCGTTCGCAGCCTGGAGAAGCGCTTTCCTGGAAGC CCG
CAGCCAGCCCTGCGGGGGCTCAGCCTGGACTTCTACCAGGGCCACATCACCGCCTTC CTG
GGCCACAACGGGGCCGGCAAGACCACCACCCTGTCCATCTTGAGTGGCCTCTTCCCA CCC
AGTGGTGGCTCTGCCTTCATCCTGGGCCACGACGTCCGCTCCAGCATGGCCGCCATC CGG
CCCCACCTGGGCGTCTGTCCTCAGTACAACGTGCTGTTTGACATGCTGACCGTGGAC GAG
CACGTCTGGTTCTATGGGCGGCTGAAGGGTCTGAGTGCCGCTGTAGTGGGCCCCGAG CAG
GACCGTCTGCTGCAGGATGTGGGGCTGGTCTCCAAGCAGAGTGTGCAGACTCGCCAC CTC
TCTGGTGGGATGCAACGGAAGCTGTCCGTGGCCATTGCCTTTGTGGGCGGCTCCCAA GTT
GTTATCCTGGACGAGCCTACGGCTGGCGTGGATCCTGCTTCCCGCCGCGGTATTTGG GAG
CTGCTGCTCAAATACCGAGAAGGTCGCACGCTGATCCTCTCCACCCACCACCTGGAT GAG
GCAGAGCTGCTGGGAGACCGTGTGGCTGTGGTGGCAGGTGGCCGCTTGTGCTGCTGT GGC
TCCCCACTCTTCCTGCGCCGTCACCTGGGCTCCGGCTACTACCTGACGCTGGTGAAG GCC
CGCCTGCCCCTGACCACCAATGAGAAGGCTGACACTGACATGGAGGGCAGTGTGGAC ACC
AGGCAGGAAAAGAAGAATGGCAGCCAGGGCAGCAGAGTCGGCACTCCTCAGCTGCTG GCC
CTGGTACAGCACTGGGTGCCCGGGGCACGGCTGGTGGAGGAGCTGCCACACGAGCTG GTG
CTGGTGCTGCCCTACACGGGTGCCCATGACGGCAGCTTCGCCACACTCTTCCGAGAG CTA
GACACGCGGCTGGCGGAGCTGAGGCTCACTGGCTACGGGATCTCCGACACCAGCCTC GAG
GAGATCTTCCTGAAGGTGGTGGAGGAGTGTGCTGCGGACACAGATATGGAGGATGGC AGC
TGCGGGCAGCACCTATGCACAGGCATTGCTGGCCTAGACGTAACCCTGCGGCTCAAG ATG
CCGCCACAGGAGACAGCGCTGGAGAACGGGGAACCAGCTGGGTCAGCCCCAGAGACT GAC
CAGGGCTCTGGGCCAGACGCCGTGGGCCGGGTACAGGGCTGGGCACTGACCCGCCAG CAG
CTCCAGGCCCTGCTTCTCAAGCGCTTTCTGCTTGCCCGCCGCAGCCGCCGCGGCCTG TTC
GCCCAGATCGTGCTGCCTGCCCTCTTTGTGGGCCTGGCCCTCGTGTTCAGCCTCATC GTG
CCTCCTTTCGGGCACTACCCGGCTCTGCGGCTCAGTCCCACCATGTACGGTGCTCAG GTG
TCCTTCTTCAGTGAGGACGCCCCAGGGGACCCTGGACGTGCCCGGCTGCTCGAGGCG CTG
CTGCAGGAGGCAGGACTGGAGGAGCCCCCAGTGCAGCATAGCTCCCACAGGTTCTCG GCA
CCAGAAGTTCCTGCTGAAGTGGCCAAGGTCTTGGCCAGTGGCAACTGGACCCCAGAG TCT
CCATCCCCAGCCTGCCAGTGTAGCCAGCCCGGTGCCCGGCGCCTGCTGCCCGACTGC CCG
GCTGCAGCTGGTGGTCCCCCTCCGCCCCAGGCAGTGACCGGCTCTGGGGAAGTGGTT CAG
AACCTGACAGGCCGGAACCTGTCTGACTTCCTGGTCAAGACCTACCCGCGCCTGGTG CGC
CAGGGCCTGAAGACTAAGAAGTGGGTGAATGAGGTCAGGTACGGAGGCTTCTCGCTG GGG
GGCCGAGACCCAGGCCTGCCCTCGGGCCAAGAGTTGGGCCGCTCAGTGGAGGAGTTG TGG
GCGCTGCTGAGTCCCCTGCCTGGCGGGGCCCTCGACCGTGTCCTGAAAAACCTCACA GCC
TGGGCTCACAGCCTGGACGCTCAGGACAGTCTCAAGATCTGGTTCAACAACAAAGGC TGG
CACTCCATGGTGGCCTTTGTCAACCGAGCCAGCAACGCAATCCTCCGTGCTCACCTG CCC
CCAGGCCGGGCCCGCCACGCCCACAGCATCACCACACTCAACCACCCCTTGAACCTC ACC
AAGGAGCAGCTGTTTGAGGCTGCATTGATGGCCTCCTCGGTGGACGTCCTCGTCTCC ATC
TGTGTGGTCTTTGCCATGTCCTTTGTCCCGGCCAGCTTCACTCTTGTCCTCATTGAG GAG
CGAGTCACCCGAGCCAAGCACCTGCAGCTCATGGGGGGCCTGTCCCCCACCCTCTAC TGG
CTTGGCAACTTTCTCTGGGACATGTGTAACTACTTGGTGCCAGCATGCATCGTGGTG CTC
ATCTTTCTGGCCTTCCAGCAGAGGGCATATGTGGCCCCTGCCAACCTGCCTGCTCTC CTG
CTGTTGCTACTACTGTATGGCTGGTCGATCACACCGCTCATGTACCCAGCCTCCTTC TTC
TTCTCCGTGCCCAGCACAGCCTATGTGGTGCTCACCTGCATAAACCTCTTTATTGGC ATC
AATGGAAGCATGGCCACCTTTGTGCTTGAGCTCTTCTCTGATCAGAAGCTGCAGGAG GTG
AGCCGGATCTTGAAACAGGTCTTCCTTATCTTCCCCCACTTCTGCTTGGGCCGGGGG CTT
ATTGACATGGTGCGGAACCAGGCCATGGCTGATGCCTTTGAGCGCTTGGGAGACAGG CAG
TTCCAGTCACCCCTGCGCTGGGAGGTGGTCGGCAAGAACCTCTTGGCCATGGTGATA CAG
GGGCCCCTCTTCCTTCTCTTCACACTACTGCTGCAGCACCGAAGCCAACTCCTGCCA CAG
CCCAGGGTGAGGTCTCTGCCACTCCTGGGAGAGGAGGACGAGGATGTAGCCCGTGAA CGG
GAGCGGGTGGTCCAAGGAGCCACCCAGGGGGATGTGTTGGTGCTGAGGAACTTGACC AAG
GTATACCGTGGGCAGAGGATGCCAGCTGTTGACCGCTTGTGCCTGGGGATTCCCCCT GGT
GAGTGTTTTGGGCTGCTGGGTGTGAATGGAGCAGGGAAGACGTCCACGTTTCGCATG GTG
ACGGGGGACACATTGGCCAGCAGGGGCGAGGCTGTGCTGGCAGGCCACAGCGTGGCC CGG
GAACCCAGTGCTGCGCACCTCAGCATGGGATACTGCCCTCAATCCGATGCCATCTTT GAG
CTGCTGACGGGCCGCGAGCACCTGGAGCTGCTTGCGCGCCTGCGCGGTGTCCCGGAG GCC
CAGGTTGCCCAGACCGCTGGCTCGGGCCTGGCGCGTCTGGGACTCTCATGGTACGCA GAC
CGGCCTGCAGGCACCTACAGCGGAGGGAACAAACGCAAGCTGGCGACGGCCCTGGCG CTG
GTTGGGGACCCAGCCGTGGTGTTTCTGGACGAGCCGACCACAGGCATGGACCCCAGC GCG
CGGCGCTTCCTTTGGAACAGCCTTTTGGCCGTGGTGCGGGAGGGCCGTTCAGTGATG CTC
ACCTCCCATAGCATGGAGGAGTGTGAAGCGCTCTGCTCGCGCCTAGCCATCATGGTG AAT
GGGCGGTTCCGCTGCCTGGGCAGCCCGCAACATCTCAAGGGCAGATTCGCGGCGGGT CAC
ACACTGACCCTGCGGGTGCCCGCCGCAAGGTCCCAGCCGGCAGCGGCCTTCGTGGCG GCC
GAGTTCCCTGGGTCGGAGCTGCGCGAGGCACATGGAGGCCGCCTGCGCTTCCAGCTG CCG
CCGGGAGGGCGCTGCGCCCTGGCGCGCGTCTTTGGAGAGCTGGCGGTGCACGGCGCA GAG
CACGGCGTGGAGGACTTTTCCGTGAGCCAGACGATGCTGGAGGAGGTATTCTTGTAC TTC
TCCAAGGACCAGGGGAAGGACGAGGACACCGAAGAGCAGAAGGAGGCAGGAGTGGGA GTG
GACCCCGCGCCAGGCCTGCAGCACCCCAAACGCGTCAGCCAGTTCCTCGATGACCCT AGC
ACTGCCGAGACTGTGCTCTGAGCCTCCCTCCCCTGCGGGGCCGCGGGGAGGCCCTGG GAA
TGGCAAGGGCAAGGTAGAGTGCCTAGGAGCCCTGGACTCAGGCTGGCAGAGGGGCTG GTG
CCCTGGAGAAAATAAAGAGAAGGCTGGAGAGAAGCCGTGCTTGGTGAA
(SEQ ID NO: 1)
As used herein, the term “ABI3” refers to the gene encoding ABI gene family member 3. The terms “ABI3” and "ABI gene family member 3" include wild-type forms of the ABI3 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type ABI3. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type ABI3 nucleic acid sequence (e.g., SEQ ID NO: 2, ENA accession number AF037886). SEQ ID NO: 2 is a wild-type gene sequence encoding ABI3 protein, and is shown below:
TCCTATCCACCCTCCACTCCCCTGTCCCTTGGTGACTCATCCCTGAGCTTCCCAAGG AAG
CCCCCACCCTCTGCCCTTTCCTCCCGCCTTCCATGAGTGGAAAATCCACCTCCGCCC CCT
ATAGCAGGCCAGCCCCCTTCCTCCCCAGTCTCCGACCCCATCCCCCAGCCGACCAGT TTC
CTCTCCAGGACCAGGGAGCAATCACAGCTGCCCCGACCTTGGCTTCCTCTGCTGGGT GGG
ATTGGGGGCTGGGCCCCCAAATGGGCCCCTGGCTTCCCCCTTCCTCTGGGCAGGGGA CAG
AGAGACACAGGCTCGGGGAGCAGGACTGACTTCCTCTTGTCCCGGAATGAGCATGCC TGC
CCTTTGCAAGCAGGTTTGGGTCTCACGCAGAGGAAACCAAAAGCAATAAGAGGGAGG GAA
GGCAGAGCAACCAATCAAGGGCAGGGTGAGACTCAAAACGAGCGGGCTCCCTGGGGA GCC
AGACAGAGGCTGGGGGTGATGGCGGAGCTACAGCAGCTGCAGGAGTTTGAGATCCCC ACT
GGCCGGGAGGCTCTGAGGGGCAACCACAGTGCCCTGCTGCGGGTCGCTGACTACTGC GAG
GACAACTATGTGCAGGCCACAGACAAGCGGAAGGCGCTGGAGGAGACCATGGCCTTC ACT
ACCCAGGCACTGGCCAGCGTGGCCTACCAGGTGGGCAACCTGGCCGGGCACACTCTG CGC
ATGTTGGACCTGCAGGGGGCCGCCCTGCGGCAGGTGGAAGCCCGTGTAAGCACGCTG GGC
CAGATGGTGAACATGCATATGGAGAAGGTGGCCCGAAGGGAGATCGGCACCTTAGCC ACT
GTCCAGCGGCTGCCCCCCGGCCAGAAGGTCATCGCCCCAGAGAACCTACCCCCTCTC ACG
CCCTACTGCAGGAGACCCCTCAACTTTGGCTGCCTGGACGACATTGGCCATGGGATC AAG
GACCTCAGCACGCAGCTGTCAAGAACAGGCACCCTGTCTCGAAAGAGCATCAAGGCC CCT
GCCACACCCGCCTCCGCCACCTTGGGGAGACCACCCCGGATTCCCGAGCCAGTGCAC CTG
CCGGTGGTGCCCGACGGCAGACTCTCCGCCGCCTCCTCTGCGTCTTCCCTGGCCTCG GCC
GGCAGCGCCGAAGGTGTCGGTGGGGCCCCCACGCCCAAGGGGCAGGCAGCACCTCCA GCC
CCACCTCTCCCCAGCTCCTTGGACCCACCTCCTCCACCAGCAGCCGTCGAGGTGTTC CAG
CGGCCTCCCACGCTGGAGGAGTTGTCCCCACCCCCACCGGACGAAGAGCTGCCCCTG CCA
CTGGACCTGCCTCCTCCTCCACCCCTGGATGGAGATGAATTGGGGCTGCCTCCACCC CCA
CCAGGATTTGGGCCTGATGAGCCCAGCTGGGTGCCTGCCTCATACTTGGAGAAAGTG GTG
ACACTGTACCCATACACCAGCCAGAAGGACAATGAGCTCTCCTTCTCTGAGGGCACT GTC
ATCTGTGTCACTCGCCGCTACTCCGATGGCTGGTGCGAGGGCGTCAGCTCAGAGGGG ACT
GGATTCTTCCCTGGGAACTATGTGGAGCCCAGCTGCTGACAGCCCAGGGCTCTCTGG GCA
GCTGATGTCTGCACTGAGTGGGTTTCATGAGCCCCAAGCCAAAACCAGCTCCAGTCA CAG
CTGGACTGGGTCTGCCCACCTCTTGGGCTGTGAGCTGTGTTCTGTCCTTCCTCCCAT CGG
AGGGAGAAGGGGTCCTGGGGAGAGAGAATTTATCCAGAGGCCTGCTGCAGATGGGGA AGA
GCTGGAAACCAAGAAGTTTGTCAACAGAGGACCCCTACTCCATGCAGGACAGGGTCT CCT
GCTGCAAGTCCCAACTTTGAATAAAACAGATGATGTCCTGTGACTGCCCCACAGAGA TAA
GGGGCCAGGAGGGATTGAAAGGCATCCCAGTTCTAAGGCTGCTGCTAATTACAGCCC CCA
ACCTCCAACCCACCAGCTGACCTAGAAGCAGCATCTTCCCATTTCCTCAGTACCCAC AAA
GTGCAGCCCACATTGGACCCCAGACACCCCTCTGCAGCCATTGACTGCAACTTGTTC TTT
T GCCCATTAAAAAAAAAAAAAAAAAAAAA
(SEQ ID NO: 2)
As used herein, the term “ADAM10” refers to the gene encoding ADAM Metallopeptidase Domain 10. The terms “ADAM10” and "ADAM Metallopeptidase Domain 10" include wild-type forms of the ADAM10 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type ADAM10. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type ADAM10 nucleic acid sequence (e.g., SEQ ID NO: 3, NCBI Reference Sequence: NM_001110.3). SEQ ID NO: 3 is a wild-type gene sequence encoding ADAM10 protein, and is shown below:
GCGGCGGCAGGCCTAGCAGCACGGGAACCGTCCCCCGCGCGCATGCGCGCGCCCCTG AAGCGCC
TGGGGGACGGGTAGGGGCGGGAGGTAGGGGCGCGGCTCCGCGTGCCAGTTGGGTGCC CGCGCG
TCACGTGGTGAGGAAGGAGGCGGAGGTCTGAGTTTCGAAGGAGGGGGGGAGAGAAGA GGGAACG
AGCAAGGGAAGGAAAGCGGGGAAAGGAGGAAGGAAACGAACGAGGGGGAGGGAGGTC CCTGTTTT
GGAGGAGCTAGGAGCGTTGCCGGCCCCTGAAGTGGAGCGAGAGGGAGGTGCTTCGCC GTTTCTCC
TGCCAGGGGAGGTCCCGGCTTCCCGTGGAGGCTCCGGACCAAGCCCCTTCAGCTTCT CCCTCCGG
ATCGATGTGCTGCTGTTAACCCGTGAGGAGGCGGCGGCGGCGGCAGCGGCAGCGGAA GATGGTGT
TGCTGAGAGTGTTAATTCTGCTCCTCTCCTGGGCGGCGGGGATGGGAGGTCAGTATG GGAATCCTT
TAAATAAATATATCAGACATTATGAAGGATTATCTTACAATGTGGATTCATTACACC AAAAACACCAGC
GTGCCAAAAGAGCAGTCTCACATGAAGACCAATTTTTACGTCTAGATTTCCATGCCC ATGGAAGACAT
TTCAACCTACGAATGAAGAGGGACACTTCCCTTTTCAGTGATGAATTTAAAGTAGAAACA TCAAATAA
AGTACTTGATTATGATACCTCTCATATTTACACTGGACATATTTATGGTGAAGAAGG AAGTTTTAGCCA
TGGGTCTGTTATTGATGGAAGATTTGAAGGATTCATCCAGACTCGTGGTGGCACATT TTATGTTGAGC
CAGCAGAGAGATATATTAAAGACCGAACTCTGCCATTTCACTCTGTCATTTATCATG AAGATGATATTA
ACTATCCCCATAAATACGGTCCTCAGGGGGGCTGTGCAGATCATTCAGTATTTGAAA GAATGAGGAA
ATACCAGATGACTGGTGTAGAGGAAGTAACACAGATACCTCAAGAAGAACATGCTGC TAATGGTCCA
GAACTTCTGAGGAAAAAACGTACAACTTCAGCTGAAAAAAATACTTGTCAGCTTTAT ATTCAGACTGA
TCATTTGTTCTTTAAATATTACGGAACACGAGAAGCTGTGATTGCCCAGATATCCAG TCATGTTAAAG
CGATTGATACAATTTACCAGACCACAGACTTCTCCGGAATCCGTAACATCAGTTTCA TGGTGAAACGC
ATAAGAATCAATACAACTGCTGATGAGAAGGACCCTACAAATCCTTTCCGTTTCCCA AATATTGGTGT
GGAGAAGTTTCTGGAATTGAATTCTGAGCAGAATCATGATGACTACTGTTTGGCCTA TGTCTTCACAG
ACCGAGATTTTGATGATGGCGTACTTGGTCTGGCTTGGGTTGGAGCACCTTCAGGAA GCTCTGGAG
GAATATGTGAAAAAAGTAAACTCTATTCAGATGGTAAGAAGAAGTCCTTAAACACTG GAATTATTACT
GTTCAGAACTATGGGTCTCATGTACCTCCCAAAGTCTCTCACATTACTTTTGCTCAC GAAGTTGGACA
TAACTTTGGATCCCCACATGATTCTGGAACAGAGTGCACACCAGGAGAATCTAAGAA TTTGGGTCAA
AAAGAAAATGGCAATTACATCATGTATGCAAGAGCAACATCTGGGGACAAACTTAAC AACAATAAATT
CTCACTCTGTAGTATTAGAAATATAAGCCAAGTTCTTGAGAAGAAGAGAAACAACTG TTTTGTTGAAT
CTGGCCAACCTATTTGTGGAAATGGAATGGTAGAACAAGGTGAAGAATGTGATTGTG GCTATAGTGA
CCAGTGTAAAGATGAATGCTGCTTCGATGCAAATCAACCAGAGGGAAGAAAATGCAA ACTGAAACCT
GGGAAACAGTGCAGTCCAAGTCAAGGTCCTTGTTGTACAGCACAGTGTGCATTCAAG TCAAAGTCTG
AGAAGTGTCGGGATGATTCAGACTGTGCAAGGGAAGGAATATGTAATGGCTTCACAG CTCTCTGCCC
AGCATCTGACCCTAAACCAAACTTCACAGACTGTAATAGGCATACACAAGTGTGCAT TAATGGGCAAT
GTGCAGGTTCTATCTGTGAGAAATATGGCTTAGAGGAGTGTACGTGTGCCAGTTCTG ATGGCAAAGA
TGATAAAGAATTATGCCATGTATGCTGTATGAAGAAAATGGACCCATCAACTTGTGC CAGTACAGGGT
CTGTGCAGTGGAGTAGGCACTTCAGTGGTCGAACCATCACCCTGCAACCTGGATCCC CTTGCAACG
ATTTTAGAGGTTACTGTGATGTTTTCATGCGGTGCAGATTAGTAGATGCTGATGGTC CTCTAGCTAGG
CTTAAAAAAGCAATTTTTAGTCCAGAGCTCTATGAAAACATTGCTGAATGGATTGTG GCTCATTGGTG
GGCAGTATTACTTATGGGAATTGCTCTGATCATGCTAATGGCTGGATTTATTAAGAT ATGCAGTGTTC
ATACTCCAAGTAGTAATCCAAAGTTGCCTCCTCCTAAACCACTTCCAGGCACTTTAA AGAGGAGGAG
ACCTCCACAGCCCATTCAGCAACCCCAGCGTCAGCGGCCCCGAGAGAGTTATCAAAT GGGACACAT
GAGACGCTAACTGCAGCTTTTGCCTTGGTTCTTCCTAGTGCCTACAATGGGAAAACT TCACTCCAAA
GAGAAACCTATTAAGTCATCATCTCCAAACTAAACCCTCACAAGTAACAGTTGAAGA AAAAATGGCAA
GAGATCATATCCTCAGACCAGGTGGAATTACTTAAATTTTAAAGCCTGAAAATTCCA ATTTGGGGGTG
GGAGGTGGAAAAGGAACCCAATTTTCTTATGAACAGATATTTTTAACTTAATGGCAC AAAGTCTTAGA
ATATTATTATGTGCCCCGTGTTCCCTGTTCTTCGTTGCTGCATTTTCTTCACTTGCA GGCAAACTTGG
CTCTCAATAAACTTTTACCACAAATTGAAATAAATATATTTTTTTCAACTGCCAATC AAGGCTAGGAGG
CTCGACCACCTCAACATTGGAGACATCACTTGCCAATGTACATACCTTGTTATATGC AGACATGTATT
TCTTACGTACACTGTACTTCTGTGTGCAATTGTAAACAGAAATTGCAATATGGATGT TTCTTTGTATTA
TAAAATTTTTCCGCTCTTAATTAAAAATTACTGTTTAATTGACATACTCAGGATAAC AGAGAATGGTGG
TATTCAGTGGTCCAGGATTCTGTAATGCTTTACACAGGCAGTTTTGAAATGAAAATC AATTTACCTTTC
TGTTACGATGGAGTTGGTTTTGATACTCATTTTTTCTTTATCACATGGCTGCTACGG GCACAAGTGAC
T ATACT G AAG AAC AC AG TT AAGT GTT GT GCAAACTGGACAT AGCAGCACAT ACT ACTTCAG AGTT CAT GATGTAGATGTCTGGTTTCTGCTTACGTCTTTTAAACTTTCTAATTCAATTCCATTTTTC AATTAATAGG T G AAATTTT ATT CAT G CTTT GAT AG AAATT ATGTC AAT G AAAT GATT CTTTTT ATTT GTAGCCTACTTAT TTGTGTTTTTCATATATCTGAAATATGCTAATTATGTTTTCTGTCTGATATGGAAAAGAA AAGCTGTGT CTTT AT C AAAAT ATTT AAACG GTTTTTT C AG CAT AT CAT C ACT GAT C ATT G GT AACC ACT A AAG AT GAG T AATTT GCTTAAGT AGTAGTT AAAATT GT AG AT AG G CCTTCT G AC ATTTTTTTTCCT AAAATTTTT AAC A GCATTGAAGGTGAAACAGCACAATGTCCCATTCCAAATTTATTTTTGAAACAGATGTAAA TAATTGGC ATTTTAAAGAGAAAGCAAAAACATTTAATGTATTAACAGGCTTATTGCTATGCAGGAAAT AGAAGGGG C ATT AC AAAAATT G AAGCTTGT G AC AT ATTT ATT G CTTCT GTTTTCC AACT AC AT C ACTT C AACT AG AA GTAAAGCTATGATTTTCCTGACTTCACATAGGAGGCAAATTTAGAGAAAGTTGTAAAGAT TTCTATGTT TTGGGTTTTTTTTTTTCCTTTTTTTTTTTAAGAGTATAAGGTTTACACAATCATTCTCAT AATGTGACGC AAGCCAGCAAGGCCAAAAATGCTAGAGAAAATAACGGGATCTCTTCCTTGTAAACTTGTA CAGTATGT GGT G ACTTTTT CAAAAT ACAGCTTTTTGTACAT G ATTT AG AG AC AAATTTT GTACAT G AAACCCC AG AT AG ACT AT AAAT AATT CT AAAC AAAC AAGTAG GT AG AT ATGTATGT AATT G CTTTT AAAT C ATTT AAAT G C CTTTGTTTTTGGACTGTGCAAAGGTTGGAAGTGGGTTTGCATTTCTAAAATGGTGACTTT TATTCTGC AAGAGTTCTTAGTAACTTCTTGAGTGTGGTAGACTTTGGAACATGTAAATTTTTTGCTTG TAATGTTAT CCTGTGGTAGGATTTTGGCAGGTACACACACTGCCCTATTTTATTTTGAGTCTAAGTTAA ATGTTTTCT G AAAAG AGAT ACAT GC ACT G AACTCTTTCC ACT GCGAAT CAAG AT GT GGTAAT AT AAAAGG AT CAAG A C AAAT GAG AT CT AAT ACT ACTGTC AGTTTT AAT GTCC ACTGT GTTTT ATAC AGTAT CTTTTTTT GTTC AC TTTGGAAATTTTTACT AAAAATTGCAAAAAAT AAAGTATT GT GCAAAG AT GT AAGGTTTTTT G AAACTT G AAAT GC ATT AAT AAAT AG ACG ATT AAAT C AACTT G AAG GTTCTATACT CTTT G AACT CT GAG AACT AT C ACAAGAAGCTTCCCACAAGGCAGTGTTTTCTTACAGTTGTCTCTTCCTACAAAAGTATAG ATTATCTTT ATTCTTAATACTTTGGAATCCATGTAGAAAATTTCCAGTTAGATACTCTGCGTACACACA ATAAACCTT TTTAAAACACCCAAAAAAAAAAAAAAAAAA (SEQ ID NO: 3)
As used herein, the term “APOC1” refers to the gene encoding Apolipoprotein C1. The terms “APOC1” and "Apolipoprotein C1" include wild-type forms of the APOC1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type APOC1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type APOC1 nucleic acid sequence (e.g., SEQ ID NO: 4, NCBI Reference Sequence: NM_001645). SEQ ID NO: 4 is a wild-type gene sequence encoding APOC1 protein, and is shown below:
AACGCTCACGGGACAGGGGCAGAGGAGAAAAACGTGGGTGGACAGAGGGAGGCAGGC GGTCAGG
GGAAGGCTCAGGAGGAGGGAGATCAACATCAACCTGCCCCGCCCCCTCCCCAGCCTG ATAAAGGT
CCTGCGGGCAGGACAGGACCTCCCAACCAAGCCCTCCAGCAAGGATTCAGAGTGCCC CTCCGGCC
TCGCCATGAGGCTCTTCCTGTCGCTCCCGGTCCTGGTGGTGGTTCTGTCGATCGTCT TGGAAGGCC
CAGCCCCAGCCCAGGGGACCCCAGACGTCTCCAGTGCCTTGGATAAGCTGAAGGAGT TTGGAAACA
CACTGGAGGACAAGGCTCGGGAACTCATCAGCCGCATCAAACAGAGTGAACTTTCTG CCAAGATGC
GGGAGTGGTTTTCAGAGACATTTCAGAAAGTGAAGGAGAAACTCAAGATTGACTCAT GAGGACCTGA
AGGGTGACATCCCAGGAGGGGCCTCTGAAATTTCCCACACCCCAGCGCCTGTGCTGA GGACTCCCT
CCATGTGGCCCCAGGTGCCACCAATAAAAATCCTACAGAAAATTCAAAAAAAAAAAA AAAAAA (SEQ ID NO: 4)
As used herein, the term “APOE” refers to the gene encoding Apolipoprotein E. The terms “APOE” and "Apolipoprotein E" include wild-type forms of the APOE gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type APOE. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type APOE nucleic acid sequence (e.g., SEQ ID NO: 5, ENA accession number M12529). SEQ ID NO: 5 is a wild-type gene sequence encoding APOE protein, and is shown below:
CCCCAGCGGAGGTGAAGGACGTCCTTCCCCAGGAGCCGACTGGCCAATCACAGGCAG GAA
GATGAAGGTTCTGTGGGCTGCGTTGCTGGTCACATTCCTGGCAGGATGCCAGGCCAA GGT
GGAGCAAGCGGTGGAGACAGAGCCGGAGCCCGAGCTGCGCCAGCAGACCGAGTGGCA GAG
CGGCCAGCGCTGGGAACTGGCACTGGGTCGCTTTTGGGATTACCTGCGCTGGGTGCA GAC
ACTGTCTGAGCAGGTGCAGGAGGAGCTGCTCAGCTCCCAAGTCACCCAAGAACTGAG GGC
GCTGATGGACGAGACCATGAAGGAGTTGAAGGCCTACAAATCGGAACTGGAGGAACA ACT
GACCCCGGTAGCGGAGGAGACGCGGGCACGGCTGTCCAAGGAGCTGCAGACGGCGCA GGC
CCGGCTGGGCGCGGACATGGAGGACGTGTGCGGCCGCCTGGTGCAGTACCGCGGCGA GGT
GCAGGCCATGCTCGGCCAGAGCACCGAGGAGCTGCGGGTGCGCCTCGCCTCCCACCT GCG
CAAGCTGCGTAAGCGGCTCCTCCGCGATCCCGATGACCTGCAGAAGCGCCTGGCAGT GTA
CCAGGCCGGGGCCCGCGAGGGCGCCGAGCGCGGCCTCAGCGCCATCCGCGAGCGCCT GGG
GCCCCTGGTGGAACAGGGCCGCGTGCGGGCCGCCACTGTGGGCTCCCTGGCCGGCCA GCC
GCTACAGGAGCGGGCCCAGGCCTGGGGCGAGCGGCTGCGCGCGCGGATGGAGGAGAT GGG
CAGTCGGACCCGCGACCGCCTGGACGAGGTGAAGGAGCAGGTGGCGGAGGTGCGCGC CAA
GCTGGAGGAGCAGGCCCAGCAGATACGCCTGCAGGCCGAGGCCTTCCAGGCCCGCCT CAA
GAGCTGGTTCGAGCCCCTGGTGGAAGACATGCAGCGCCAGTGGGCCGGGCTGGTGGA GAA
GGTGCAGGCTGCCGTGGGCACCAGCGCCGCCCCTGTGCCCAGCGACAATCACTGAAC GCC
GAAGCCTGCAGCCATGCGACCCCACGCCACCCCGTGCCTCCTGCCTCCGCGCAGCCT GCA
GCGGGAGACCCTGTCCCCGCCCCAGCCGTCCTCCTGGGGTGGACCCTAGTTTAATAA AGA
TTCACCAAGTTTCACGC
(SEQ ID NO: 5)
As used herein, the term “AXL” refers to the gene encoding Tyrosine-protein kinase receptor UFO. The terms “AXL” and "Tyrosine-protein kinase receptor UFO" include wild-type forms of the AXL gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type AXL. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type AXL nucleic acid sequence (e.g., SEQ ID NO: 6, ENA accession number M76125). SEQ ID NO: 6 is a wild-type gene sequence encoding AXL protein, and is shown below:
GCTGGGCAAAGCCGGTGGCAAGGGCCTCCCCTGCCGCTGTGCCAGGCAGGCAGTGCC AAA
TCCGGGGAGCCTGGAGCTGGGGGGAGGGCCGGGGACAGCCCGGCCCTGCCCCCTCCC CCG
CTGGGAGCCCAGCAACTTCTGAGGAAAGTTTGGCACCCATGGCGTGGCGGTGCCCCA GGA
TGGGCAGGGTCCCGCTGGCCTGGTGCTTGGCGCTGTGCGGCTGGGCGTGCATGGCCC CCA
GGGGCACGCAGGCTGAAGAAAGTCCCTTCGTGGGCAACCCAGGGAATATCACAGGTG CCC
GGGGACTCACGGGCACCCTTCGGTGTCAGCTCCAGGTTCAGGGAGAGCCCCCCGAGG TAC
ATTGGCTTCGGGATGGACAGATCCTGGAGCTCGCGGACAGCACCCAGACCCAGGTGC CCC
TGGGTGAGGATGAACAGGATGACTGGATAGTGGTCAGCCAGCTCAGAATCACCTCCC TGC
AGCTTTCCGACACGGGACAGTACCAGTGTTTGGTGTTTCTGGGACATCAGACCTTCG TGT
CCCAGCCTGGCTATGTTGGGCTGGAGGGCTTGCCTTACTTCCTGGAGGAGCCCGAAG ACA
GGACTGTGGCCGCCAACACCCCCTTCAACCTGAGCTGCCAAGCTCAGGGACCCCCAG AGC
CCGTGGACCTACTCTGGCTCCAGGATGCTGTCCCCCTGGCCACGGCTCCAGGTCACG GCC
CCCAGCGCAGCCTGCATGTTCCAGGGCTGAACAAGACATCCTCTTTCTCCTGCGAAG CCC
ATAACGCCAAGGGGGTCACCACATCCCGCACAGCCACCATCACAGTGCTCCCCCAGC AGC
CCCGTAACCTCCACCTGGTCTCCCGCCAACCCACGGAGCTGGAGGTGGCTTGGACTC CAG
GCCTGAGCGGCATCTACCCCCTGACCCACTGCACCCTGCAGGCTGTGCTGTCAGACG ATG
GGATGGGCATCCAGGCGGGAGAACCAGACCCCCCAGAGGAGCCCCTCACCTCGCAAG CAT
CCGTGCCCCCCCATCAGCTTCGGCTAGGCAGCCTCCATCCTCACACCCCTTATCACA TCC
GCGTGGCATGCACCAGCAGCCAGGGCCCCTCATCCTGGACCCACTGGCTTCCTGTGG AGA
CGCCGGAGGGAGTGCCCCTGGGCCCCCCTAAGAACATTAGTGCTACGCGGAATGGGA GCC
AGGCCTTCGTGCATTGGCAAGAGCCCCGGGCGCCCCTGCAGGGTACCCTGTTAGGGT ACC
GGCTGGCGTATCAAGGCCAGGACACCCCAGAGGTGCTAATGGACATAGGGCTAAGGC AAG
AGGTGACCCTGGAGCTGCAGGGGGACGGGTCTGTGTCCAATCTGACAGTGTGTGTGG CAG
CCTACACTGCTGCTGGGGATGGACCCTGGAGCCTCCCAGTACCCCTGGAGGCCTGGC GCC
CAGTGAAGGAACCTTCAACTCCTGCCTTCTCGTGGCCCTGGTGGTATGTACTGCTAG GAG
CAGTCGTGGCCGCTGCCTGTGTCCTCATCTTGGCTCTCTTCCTTGTCCACCGGCGAA AGA
AGGAGACCCGTTATGGAGAAGTGTTTGAACCAACAGTGGAAAGAGGTGAACTGGTAG TCA
GGTACCGCGTGCGCAAGTCCTACAGTCGTCGGACCACTGAAGCTACCTTGAACAGCC TGG
GCATCAGTGAAGAGCTGAAGGAGAAGCTGCGGGATGTGATGGTGGACCGGCACAAGG TGG
CCCTGGGGAAGACTCTGGGAGAGGGAGAGTTTGGAGCTGTGATGGAAGGCCAGCTCA ACC
AGGACGACTCCATCCTCAAGGTGGCTGTGAAGACGATGAAGATTGCCATCTGCACGA GGT
CAGAGCTGGAGGATTTCCTGAGTGAAGCGGTCTGCATGAAGGAATTTGACCATCCCA ACG
TCATGAGGCTCATCGGTGTCTGTTTCCAGGGTTCTGAACGAGAGAGCTTCCCAGCAC CTG
TGGTCATCTTACCTTTCATGAAACATGGAGACCTACACAGCTTCCTCCTCTATTCCC GGC
TCGGGGACCAGCCAGTGTACCTGCCCACTCAGATGCTAGTGAAGTTCATGGCAGACA TCG
CCAGTGGCATGGAGTATCTGAGTACCAAGAGATTCATACACCGGGACCTGGCGGCCA GGA
ACTGCATGCTGAATGAGAACATGTCCGTGTGTGTGGCGGACTTCGGGCTCTCCAAGA AGA
TCTACAATGGGGACTACTACCGCCAGGGACGTATCGCCAAGATGCCAGTCAAGTGGA TTG
CCATTGAGAGTCTAGCTGACCGTGTCTACACCAGCAAGAGCGATGTGTGGTCCTTCG GGG
TGACAATGTGGGAGATTGCCACAAGAGGCCAAACCCCATATCCGGGCGTGGAGAACA GCG
AGATTTATGACTATCTGCGCCAGGGAAATCGCCTGAAGCAGCCTGCGGACTGTCTGG ATG
GACTGTATGCCTTGATGTCGCGGTGCTGGGAGCTAAATCCCCAGGACCGGCCAAGTT TTA CAGAGCTGCGGGAAGATTTGGAGAACACACTGAAGGCCTTGCCTCCTGCCCAGGAGCCTG
ACGAAATCCTCTATGTCAACATGGATGAGGGTGGAGGTTATCCTGAACCCCCTGGAG CTG
CAGGAGGAGCTGACCCCCCAACCCAGCCAGACCCTAAGGATTCCTGTAGCTGCCTCA CTG
CGGCTGAGGTCCATCCTGCTGGACGCTATGTCCTCTGCCCTTCCACAACCCCTAGCC CCG
CTCAGCCTGCTGATAGGGGCTCCCCAGCAGCCCCAGGGCAGGAGGATGGTGCCTGAG ACA
ACCCTCCACCTGGTACTCCCTCTCAGGATCCAAGCTAAGCACTGCCACTGGGGAAAA CTC
CACCTTCCCACTTTTCCACCCCACGCCTTATCCCCACTTGCAGCCCTGTCTTCCTAC CTA
TCCCACCTCCATCCCAGACAGGTCCCTCCCCTTCTCTGTGCAGTAGCATCACCTTGA AAG
CAGTAGCATCACCATCTGTAAAAGGAAGGGGTTGGATTGCAATATCTGAAGCCCTCC CAG
GTGTTAACATTCCAAGACTCTAGAGTCCAAGGTTTAAAGAGTCTAGATTCAAAGGTT CTA
GGTTTCAAAGATGCTGTGAGTCTTTGGTTCTAAGGACCTGAAATTCCAAAGTCTCTA ATT
CTATTAAAGTGCTAAGGTTCTAAGGCAAAAAAAAAAAAAAAAAAAAA
(SEQ ID NO: 6)
As used herein, the term “BIN1” refers to the gene encoding Myc box-dependent-interacting protein 1. The terms “BIN1” and "Myc box-dependent-interacting protein 1" include wild-type forms of the BIN1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type BIN1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type BIN1 nucleic acid sequence (e.g., SEQ ID NO: 7, ENA accession number AF004015). SEQ ID NO: 7 is a wild-type gene sequence encoding BIN1 protein, and is shown below:
ATGGCAGAGATGGGCAGTAAAGGGGTGACGGCGGGAAAGATCGCCAGCAACGTGCAG AAG
AAGCTCACCCGCGCGCAGGAGAAGGTTCTCCAGAAGCTGGGGAAGGCAGATGAGACC AAG
GATGAGCAGTTTGAGCAGTGCGTCCAGAATTTCAACAAGCAGCTGACGGAGGGCACC CGG
CTGCAGAAGGATCTCCGGACCTACCTGGCCTCCGTCAAAGCCATGCACGAGGCTTCC AAG
AAGCTGAATGAGTGTCTGCAGGAGGTGTATGAGCCCGATTGGCCCGGCAGGGATGAG GCA
AACAAGATCGCAGAGAACAACGACCTGCTGTGGATGGATTACCACCAGAAGCTGGTG GAC
CAGGCGCTGCTGACCATGGACACGTACCTGGGCCAGTTCCCCGACATCAAGTCACGC ATT
GCCAAGCGGGGGCGCAAGCTGGTGGACTACGACAGTGCCCGGCACCACTACGAGTCC CTT
CAAACCGCCAAAAAGAAGGATGAAGCCAAAATTGCCAAGCCTGTCTCGCTGCTTGAG AAA
GCCGCCCCCCAGTGGTGCCAAGGCAAACTGCAGGCTCATCTCGTAGCTCAAACTAAC CTG
CTCCGAAATCAGGCCGAGGAGGAGCTCATCAAAGCCCAGAAGGTGTTTGAGGAGATG AAT
GTGGATCTGCAGGAGGAGCTGCCGTCCCTGTGGAACAGCCGCGTAGGTTTCTACGTC AAC
ACGTTCCAGAGCATCGCGGGCCTGGAGGAAAACTTCCACAAGGAGATGAGCAAGCTC AAC
CAGAACCTCAATGATGTGCTGGTCGGCCTGGAGAAGCAACACGGGAGCAACACCTTC ACG
GTCAAGGCCCAGCCCAGTGACAACGCGCCTGCAAAAGGGAACAAGAGCCCTTCGCCT CCA
GATGGCTCCCCTGCCGCCACCCCCGAGATCAGAGTCAACCACGAGCCAGAGCCGGCC GGC
GGGGCCACGCCCGGGGCCACCCTCCCCAAGTCCCCATCTCAGCTCCGGAAAGGCCCA CCA
GTCCCTCCGCCTCCCAAACACACCCCGTCCAAGGAAGTCAAGCAGGAGCAGATCCTC AGC
CTGTTTGAGGACACGTTTGTCCCTGAGATCAGCGTGACCACCCCCTCCCAGTTTGAG GCC CCGGGGCCTTTCTCGGAGCAGGCCAGTCTGCTGGACCTGGACTTTGACCCCCTCCCGCCC
GTGACGAGCCCTGTGAAGGCACCCACGCCCTCTGGTCAGTCAATTCCATGGGACCTC TGG
GAGCCCACAGAGAGTCCAGCCGGCAGCCTGCCTTCCGGGGAGCCCAGCGCTGCCGAG GGC
ACCTTT GCT GT GTCCTGGCCCAGCCAGACGGCCGAGCCGGGGCCT GCCCAACCAGCAGAG
GCCTCGGAGGTGGCGGGTGGGACCCAACCTGCGGCTGGAGCCCAGGAGCCAGGGGAG ACG
GCGGCAAGTGAAGCAGCCTCCAGCTCTCTTCCTGCTGTCGTGGTGGAGACCTTCCCA GCA
ACTGTGAATGGCACCGTGGAGGGCGGCAGTGGGGCCGGGCGCTTGGACCTGCCCCCA GGT
TTCATGTTCAAGGTACAGGCCCAGCACGACTACACGGCCACTGACACAGACGAGCTG CAG
CTCAAGGCTGGTGATGTGGTGCTGGTGATCCCCTTCCAGAACCCTGAAGAGCAGGAT GAA
GGCTGGCTCATGGGCGTGAAGGAGAGCGACTGGAACCAGCACAAGGAGCTGGAGAAG TGC
CGTGGCGTCTTCCCCGAGAACTTCACTGAGAGGGTCCCATGA
(SEQ ID NO: 7)
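The "at least 70% sequence identity" criterion used in the definitions above can be illustrated with a short calculation. The following is a minimal sketch only, not the method prescribed by this disclosure: it assumes a simple global (Needleman-Wunsch) alignment with illustrative scoring parameters, and it assumes percent identity is counted as identical aligned positions divided by the alignment length, gaps included. The function names `percent_identity` and `meets_identity_threshold` are hypothetical.

```python
def percent_identity(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> float:
    """Globally align a and b and return percent identity.

    Assumption: identity = 100 * identical aligned positions / alignment
    length (gap columns included). Scoring values are illustrative.
    """
    a, b = a.upper(), b.upper()
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner, counting identical
    # columns and the total number of alignment columns.
    i, j = n, m
    matches = length = 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        length += 1
    return 100.0 * matches / length if length else 0.0


def meets_identity_threshold(candidate: str, reference: str,
                             threshold: float = 70.0) -> bool:
    """Return True if candidate is at least `threshold` percent identical
    to reference under the assumptions stated above."""
    return percent_identity(candidate, reference) >= threshold


if __name__ == "__main__":
    reference = "ATGGCAGAGA"  # first ten bases of SEQ ID NO: 7
    candidate = "ATGGCTGAGA"  # hypothetical variant with one substitution
    print(percent_identity(candidate, reference))
    print(meets_identity_threshold(candidate, reference))
```

The quadratic-memory table is adequate for short fragments; for full-length transcripts such as those listed here, an established alignment tool would be the more practical choice, and its definition of percent identity may differ from the one assumed in this sketch.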
As used herein, the term “C1QA” refers to the gene encoding Complement C1q A Chain. The terms “C1QA” and "Complement C1q A Chain" include wild-type forms of the C1QA gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type C1QA. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type C1QA nucleic acid sequence (e.g., SEQ ID NO: 8, NCBI Reference Sequence: NM_015991.3). SEQ ID NO: 8 is a wild-type gene sequence encoding C1QA protein, and is shown below:
AGTCTTGCTGAAGTCTGCTTGAAATGTCCCTGGTGAGCTTCTGGCCACTGGGGAAGT TCAGGGGGC
AGGTCTGAAGAAGGGGAAGTAGGAAGGGATGTGAAACTTGGCCACAGCCTGGAGCCA CTCCTGCTG
GGCAGCCCACAGGGTCCCTGGGCGGAGGGCAGGAGCATCCAGTTGGAGTTGACAACA GGAGGCA
GAGGCATCATGGAGGGTCCCCGGGGATGGCTGGTGCTCTGTGTGCTGGCCATATCGC TGGCCTCT
ATGGTGACCGAGGACTTGTGCCGAGCACCAGACGGGAAGAAAGGGGAGGCAGGAAGA CCTGGCAG
ACGGGGGCGGCCAGGCCTCAAGGGGGAGCAAGGGGAGCCGGGGGCCCCTGGCATCCG GACAGG
CATCCAAGGCCTTAAAGGAGACCAGGGGGAACCTGGGCCCTCTGGAAACCCCGGCAA GGTGGGCT
ACCCAGGGCCCAGCGGCCCCCTCGGAGCCCGTGGCATCCCGGGAATTAAAGGCACCA AGGGCAGC
CCAGGAAACATCAAGGACCAGCCGAGGCCAGCCTTCTCCGCCATTCGGCGGAACCCC CCAATGGG
GGGCAACGTGGTCATCTTCGACACGGTCATCACCAACCAGGAAGAACCGTACCAGAA CCACTCCGG
CCGATTCGTCTGCACTGTACCCGGCTACTACTACTTCACCTTCCAGGTGCTGTCCCA GTGGGAAATC
TGCCTGTCCATCGTCTCCTCCTCAAGGGGCCAGGTCCGACGCTCCCTGGGCTTCTGT GACACCACC
AACAAGGGGCTCTTCCAGGTGGTGTCAGGGGGCATGGTGCTTCAGCTGCAGCAGGGT GACCAGGT
CTGGGTTGAAAAAGACCCCAAAAAGGGTCACATTTACCAGGGCTCTGAGGCCGACAG CGTCTTCAG
CGGCTTCCTCATCTTCCCATCTGCCTGAGCCAGGGAAGGACCCCCTCCCCCACCCAC CTCTCTGGC
TTCCATGCTCCGCCTGTAAAATGGGGGCGCTATTGCTTCAGCTGCTGAAGGGAGGGG GCTGGCTCT
GAGAGCCCCAGGACTGGCTGCCCCGTGACACATGCTCTAAGAAGCTCGTTTCTTAGA CCTCTTCCTG
GAATAAACATCTGTGTCTGTGTCTGCTGAACATGAGCTTCAGTTGCTACTCGGAGCA TTGAGAGGGA
GGCCTAAGAATAATAACAATCCAGTGCTTAAGAGTCAAAAAAAAAAAA
(SEQ ID NO: 8)
As used herein, the term “C3” refers to the gene encoding Complement C3. The terms “C3” and "Complement C3" include wild-type forms of the C3 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type C3. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type C3 nucleic acid sequence (e.g., SEQ ID NO: 9, NCBI Reference Sequence: NM_000064.3). SEQ ID NO: 9 is a wild-type gene sequence encoding C3 protein, and is shown below:
AGATAAAAAGCCAGCTCCAGCAGGCGCTGCTCACTCCTCCCCATCCTCTCCCTCTGT CCCTCTGTCC
CTCTGACCCTGCACTGTCCCAGCACCATGGGACCCACCTCAGGTCCCAGCCTGCTGC TCCTGCTAC
TAACCCACCTCCCCCTGGCTCTGGGGAGTCCCATGTACTCTATCATCACCCCCAACA TCTTGCGGCT
GGAGAGCGAGGAGACCATGGTGCTGGAGGCCCACGACGCGCAAGGGGATGTTCCAGT CACTGTTA
CT GTCCACGACTTCCCAGGCAAAAAACTAGT GCT GTCCAGT GAGAAGACT GTGCTGACCCCT GCCA
CCAACCACATGGGCAACGTCACCTTCACGATCCCAGCCAACAGGGAGTTCAAGTCAG AAAAGGGGC
GCAACAAGTTCGTGACCGTGCAGGCCACCTTCGGGACCCAAGTGGTGGAGAAGGTGG TGCTGGTC
AGCCTGCAGAGCGGGTACCTCTTCATCCAGACAGACAAGACCATCTACACCCCTGGC TCCACAGTT
CTCTATCGGATCTTCACCGTCAACCACAAGCTGCTACCCGTGGGCCGGACGGTCATG GTCAACATT
GAGAACCCGGAAGGCATCCCGGTCAAGCAGGACTCCTTGTCTTCTCAGAACCAGCTT GGCGTCTTG
CCCTTGTCTTGGGACATTCCGGAACTCGTCAACATGGGCCAGTGGAAGATCCGAGCC TACTATGAAA
ACTCACCACAGCAGGTCTTCTCCACTGAGTTTGAGGTGAAGGAGTACGTGCTGCCCA GTTTCGAGGT
CATAGTGGAGCCTACAGAGAAATTCTACTACATCTATAACGAGAAGGGCCTGGAGGT CACCATCACC
GCCAGGTTCCTCTACGGGAAGAAAGTGGAGGGAACTGCCTTTGTCATCTTCGGGATC CAGGATGGC
GAACAGAGGATTTCCCTGCCTGAATCCCTCAAGCGCATTCCGATTGAGGATGGCTCG GGGGAGGTT
GTGCTGAGCCGGAAGGTACTGCTGGACGGGGTGCAGAACCCCCGAGCAGAAGACCTG GTGGGGAA
GTCTTTGTACGTGTCTGCCACCGTCATCTTGCACTCAGGCAGTGACATGGTGCAGGC AGAGCGCAG
CGGGATCCCCATCGTGACCTCTCCCTACCAGATCCACTTCACCAAGACACCCAAGTA CTTCAAACCA
GGAATGCCCTTTGACCTCATGGTGTTCGTGACGAACCCTGATGGCTCTCCAGCCTAC CGAGTCCCC
GTGGCAGTCCAGGGCGAGGACACTGTGCAGTCTCTAACCCAGGGAGATGGCGTGGCC AAACTCAG
CATCAACACACACCCCAGCCAGAAGCCCTTGAGCATCACGGTGCGCACGAAGAAGCA GGAGCTCTC
GGAGGCAGAGCAGGCTACCAGGACCATGCAGGCTCTGCCCTACAGCACCGTGGGCAA CTCCAACA
ATTACCTGCATCTCTCAGTGCTACGTACAGAGCTCAGACCCGGGGAGACCCTCAACG TCAACTTCCT
CCTGCGAATGGACCGCGCCCACGAGGCCAAGATCCGCTACTACACCTACCTGATCAT GAACAAGGG
CAGGCTGTTGAAGGCGGGACGCCAGGTGCGAGAGCCCGGCCAGGACCTGGTGGTGCT GCCCCTG
TCCATCACCACCGACTTCATCCCTTCCTTCCGCCTGGTGGCGTACTACACGCTGATC GGTGCCAGC
GGCCAGAGGGAGGTGGTGGCCGACTCCGTGTGGGTGGACGTCAAGGACTCCTGCGTG GGCTCGCT
GGTGGTAAAAAGCGGCCAGTCAGAAGACCGGCAGCCTGTACCTGGGCAGCAGATGAC CCTGAAGA
TAGAGGGTGACCACGGGGCCCGGGTGGTACTGGTGGCCGTGGACAAGGGCGTGTTCG TGCTGAAT
AAGAAGAACAAACTGACGCAGAGTAAGATCTGGGACGTGGTGGAGAAGGCAGACATC GGCTGCACC
CCGGGCAGTGGGAAGGATTACGCCGGTGTCTTCTCCGACGCAGGGCTGACCTTCACG AGCAGCAG
TGGCCAGCAGACCGCCCAGAGGGCAGAACTTCAGTGCCCGCAGCCAGCCGCCCGCCG ACGCCGTT CCGTGCAGCTCACGGAGAAGCGAATGGACAAAGTCGGCAAGTACCCCAAGGAGCTGCGCA AGTGC TGCGAGGACGGCATGCGGGAGAACCCCATGAGGTTCTCGTGCCAGCGCCGGACCCGTTTC ATCTC CCTGGGCGAGGCGTGCAAGAAGGTCTTCCTGGACTGCTGCAACTACATCACAGAGCTGCG GCGGC AGCACGCGCGGGCCAGCCACCTGGGCCTGGCCAGGAGTAACCTGGATGAGGACATCATTG CAGAA G AG AACATCGTTTCCCG AAGT GAGTTCCCAG AG AGCT GGCTGT GG AACGTT GAGG ACTT GAAAG AG CCACCGAAAAATGGAATCTCTACGAAGCTCATGAATATATTTTTGAAAGACTCCATCACC ACGTGGGA GATTCTGGCTGTGAGCATGTCGGACAAGAAAGGGATCTGTGTGGCAGACCCCTTCGAGGT CACAGT AATGCAGGACTTCTTCATCGACCTGCGGCTACCCTACTCTGTTGTTCGAAACGAGCAGGT GGAAATC CGAGCCGTTCTCTACAATTACCGGCAGAACCAAGAGCTCAAGGTGAGGGTGGAACTACTC CACAAT CCAGCCTTCTGCAGCCTGGCCACCACCAAGAGGCGTCACCAGCAGACCGTAACCATCCCC CCCAAG TCCTCGTTGTCCGTTCCATATGTCATCGTGCCGCTAAAGACCGGCCTGCAGGAAGTGGAA GTCAAG GCTGCTGTCTACCATCATTTCATCAGTGACGGTGTCAGGAAGTCCCTGAAGGTCGTGCCG GAAGGA ATCAGAATGAACAAAACTGTGGCTGTTCGCACCCTGGATCCAGAACGCCTGGGCCGTGAA GGAGTG CAGAAAGAGGACATCCCACCTGCAGACCTCAGTGACCAAGTCCCGGACACCGAGTCTGAG ACCAGA ATTCTCCTGCAAGGGACCCCAGTGGCCCAGATGACAGAGGATGCCGTCGACGCGGAACGG CTGAA GCACCTCATTGTGACCCCCTCGGGCTGCGGGGAACAGAACATGATCGGCATGACGCCCAC GGTCAT CGCTGTGCATTACCTGGATGAAACGGAGCAGTGGGAGAAGTTCGGCCTAGAGAAGCGGCA GGGGG CCTTGGAGCTCATCAAGAAGGGGTACACCCAGCAGCTGGCCTTCAGACAACCCAGCTCTG CCTTTG CGGCCTTCGTGAAACGGGCACCCAGCACCTGGCTGACCGCCTACGTGGTCAAGGTCTTCT CTCTGG CTGTCAACCTCATCGCCATCGACTCCCAAGTCCTCTGCGGGGCTGTTAAATGGCTGATCC TGGAGAA GCAGAAGCCCGACGGGGTCTTCCAGGAGGATGCGCCCGTGATACACCAAGAAATGATTGG TGGATT ACGGAACAACAACGAGAAAGACATGGCCCTCACGGCCTTTGTTCTCATCTCGCTGCAGGA GGCTAA AGATATTTGCGAGGAGCAGGTCAACAGCCTGCCAGGCAGCATCACTAAAGCAGGAGACTT CCTTGA AGCCAACTACATGAACCTACAGAGATCCTACACTGTGGCCATTGCTGGCTATGCTCTGGC CCAGATG GGCAGGCTGAAGGGGCCTCTTCTTAACAAATTTCTGACCACAGCCAAAGATAAGAACCGC TGGGAG GACCCTGGTAAGCAGCTCTACAACGTGGAGGCCACATCCTATGCCCTCTTGGCCCTACTG CAGCTA AAAG ACTTT G ACTTTGTGCCTCCCGTCGT GCGTTGGCT CAAT G AAC AG AG AT ACT ACGGTGGTGGCT ATGGCTCTACCCAGGCCACCTTCATGGTGTTCCAAGCCTTGGCTCAATACCAAAAGGACG CCCCTGA 
CCACCAGGAACTGAACCTTGATGTGTCCCTCCAACTGCCCAGCCGCAGCTCCAAGATCAC CCACCG TATCCACTGGGAATCTGCCAGCCTCCTGCGATCAGAAGAGACCAAGGAAAATGAGGGTTT CACAGTC ACAGCTGAAGGAAAAGGCCAAGGCACCTTGTCGGTGGTGACAATGTACCATGCTAAGGCC AAAGAT CAACTCACCTGTAATAAATTCGACCTCAAGGTCACCATAAAACCAGCACCGGAAACAGAA AAGAGGC CTCAGGATGCCAAGAACACTATGATCCTTGAGATCTGTACCAGGTACCGGGGAGACCAGG ATGCCA CTATGTCTATATTGGACATATCCATGATGACTGGCTTTGCTCCAGACACAGATGACCTGA AGCAGCT GGCCAATGGTGTTGACAGATACATCTCCAAGTATGAGCTGGACAAAGCCTTCTCCGATAG GAACACC CTCATCATCTACCTGGACAAGGTCTCACACTCTGAGGATGACTGTCTAGCTTTCAAAGTT CACCAATA CTTTAATGTAGAGCTTATCCAGCCTGGAGCAGTCAAGGTCTACGCCTATTACAACCTGGA GGAAAGC T GTACCCGGTT CTACCATCCGGAAAAGG AGG AT GG AAAGCT GAACAAGCT CT GCCGTG AT G AACT G TGCCGCTGTGCTGAGGAGAATTGCTTCATACAAAAGTCGGATGACAAGGTCACCCTGGAA GAACGG CTGGACAAGGCCTGTGAGCCAGGAGTGGACTATGTGTACAAGACCCGACTGGTCAAGGTT CAGCTG TCCAATGACTTTGACGAGTACATCATGGCCATTGAGCAGACCATCAAGTCAGGCTCGGAT GAGGTGC AGGTTGGACAGCAGCGCACGTTCATCAGCCCCATCAAGTGCAGAGAAGCCCTGAAGCTGG AGGAGA AGAAACACTACCTCATGTGGGGTCTCTCCTCCGATTTCTGGGGAGAGAAGCCCAACCTCA GCTACAT CATCGGGAAGGACACTTGGGTGGAGCACTGGCCCGAGGAGGACGAATGCCAAGACGAAGA GAACC AGAAACAATGCCAGGACCTCGGCGCCTTCACCGAGAGCATGGTTGTCTTTGGGTGCCCCA ACTGAC CACACCCCCATTCCCCCACTCCAGATAAAGCTTCAGTTATATCTCAAAAAAAAAAAAAAA AA (SEQ ID NO: 9)
As used herein, the term “C9orf72” refers to the gene encoding Guanine nucleotide exchange C9orf72. The terms “C9orf72” and "Guanine nucleotide exchange C9orf72" include wild-type forms of the C9orf72 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type C9orf72. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type C9orf72 nucleic acid sequence (e.g., SEQ ID NO: 10, ENA accession number JN681271). SEQ ID NO: 10 is a wild-type gene sequence encoding C9orf72 protein, and is shown below:
AGG AAAGAG AGGTGCGT CAAACAGCG ACAAGTTCCGCCCACGTAAAAG AT G ACGCTT GGT
GTGTCAGCCGTCCCTGCTGCCCGGTTGCTTCTCTTTTGGGGGCGGGGTCTAGCAAGA GCA
GGTGTGGGTTTAGGAGATATCTCCGGAGCATTTGGATAATGTGACAGTTGGAATGCA GTG
ATGTCGACTCTTTGCCCACCGCCATCTCCAGCTGTTGCCAAGACAGAGATTGCTTTA AGT
GGCAAATCACCTTTATTAGCAGCTACTTTTGCTTACTGGGACAATATTCTTGGTCCT AGA
GTAAGGCACATTTGGGCTCCAAAGACAGAACAGGTACTTCTCAGTGATGGAGAAATA ACT
TTTCTTGCCAACCACACTCTAAATGGAGAAATCCTTCGAAATGCAGAGAGTGGTGCT ATA
GAT GTAAAGTTTTTTGTCTTGTCT GAAAAGGG AGT GATT ATTGTTT CATTAAT CTTT GAT
GGAAACTGGAATGGGGATCGCAGCACATATGGACTATCAATTATACTTCCACAGACA GAA
CTTAGTTTCTACCTCCCACTTCATAGAGTGTGTGTTGATAGATTAACACATATAATC CGG
AAAG G AAG AAT ATGG ATG CAT AAG G AAAG AC AAG AAAAT GTCC AG AAG ATT AT CTT AG AA
GGCACAGAGAGAATGGAAGATCAGGGTCAGAGTATTATTCCAATGCTTACTGGAGAA GTG
ATTCCT GT AAT GG AACT GCTTTCATCT AT G AAATCACACAGTGTTCCT G AAG AAAT AG AT
ATAGCTG ATAC AGTACT C AAT GAT GAT GAT ATT GGT G AC AG CTGT CAT G AAG GCTTT CTT
CTCAATGCCATCAGCTCACACTTGCAAACCTGTGGCTGTTCCGTTGTAGTAGGTAGC AGT
GCAGAGAAAGTAAATAAGATAGTCAGAACATTATGCCTTTTTCTGACTCCAGCAGAG AGA
AAATGCTCCAGGTTATGTGAAGCAGAATCATCATTTAAATATGAGTCAGGGCTCTTT GTA
CAAGGCCTGCTAAAGGATTCAACTGGAAGCTTTGTGCTGCCTTTCCGGCAAGTCATG TAT
GCTCCATATCCCACCACACACATAGATGTGGATGTCAATACTGTGAAGCAGATGCCA CCC
TGTCATGAACATATTTATAATCAGCGTAGATACATGAGATCCGAGCTGACAGCCTTC TGG
AGAGCCACTT CAGAAGAAGACATGGCTCAGG AT ACG AT CAT CT AC ACT GACGAAAGCTTT
ACTCCT G ATTT G AAT ATTTTT C AAG AT GTCTT AC AC AG AG AC ACT CT AGTG AAAG CCTT C
CTGGATCAGGTCTTTCAGCTGAAACCTGGCTTATCTCTCAGAAGTACTTTCCTTGCA CAG
TTTCT ACTTGTCCTTCACAGAAAAGCCTT G AC ACT AAT AAAAT AT ATAGAAGACG AT ACG
CAGAAGGGAAAAAAGCCCTTTAAATCTCTTCGGAACCTGAAGATAGACCTTGATTTA ACA
GCAGAGGGCGATCTTAACATAATAATGGCTCTGGCTGAGAAAATTAAACCAGGCCTA CAC
T CTTTT AT CTTT GG AAG ACCTTT CTAC ACT AGTGT G C AAG AACG AG AT GTTCT AAT G ACT TTTT AAAT GTGT AACTT AAT AAGCCT ATT C CAT C AC AAT CAT G ATCG CT G GT AAAGT AG C TCAGTGGTGTGGGGAAACGTTCCCCTGGATCATACTCCAGAATTCTGCTCTCAGCAATTG CAGTTAAGTAAGTTACACTACAGTTCTCACAAGAGCCTGTGAGGGGATGTCAGGTGCATC ATTACATTGGGTGTCTCTTTTCCTAGATTTATGCTTTTGGGATACAGACCTATGTTTACA AT AT AAT AAAT ATT ATT G CTAT CTTTT AAAG AT AT AAT AAT AGG ATGT AAACTT G ACC AC AACT ACT GTTTTTTT GAAAT AC AT GATTC AT GGTTTACAT GTGTC AAGGTG AAAT CT GAG TTGG CTTTT AC AG AT AGTT G ACTTT CTAT CTTTT GG C ATT CTTT G GTGTGT AG AATT ACT GTAATACTTCTGCAATCAACTGAAAACTAGAGCCTTTAAATGATTTCAATTCCACAGAAA G AAAGT GAG CTT G AAC AT AGG AT GAG CTTT AG AAAG AAAATT GAT C AAGC AG ATGTTT AA TTGGAATTGATTATTAGATCCTACTTTGTGGATTTAGTCCCTGGGATTCAGTCTGTAGAA ATGTCTAATAGTTCTCTATAGTCCTTGTTCCTGGTGAACCACAGTTAGGGTGTTTTGTTT ATTTT ATT GTTCTTGCT ATT GTT GAT ATT CTATGTAGTT GAG CTCTGT AAAAG G AAATT G T ATTTT AT GTTTT AGT AATTGTT GCCAACTTTTTAAATT AATTTTCATT ATTTTT G AGCC AAATTGAAATGTGCACCTCCTGTGCCTTTTTTCTCCTTAGAAAATCTAATTACTTGGAAC AAGTTCAGATTTCACTGGTCAGTCATTTTCATCTTGTTTTCTTCTTGCTAAGTCTTACCA TGTACCTG CTTT G GC AAT C ATT G C AACT CT GAG ATT AT AAAAT G CCTT AG AG AAT AT ACT AACT AAT AAGATCTTTTTTT CAG AAAC AG AAAAT AGTTCCTT G AGTACTTCCTT CTT GCA TTTCTGCCTATGTTTTTGAAGTTGTTGCTGTTTGCCTGCAATAGGCTATAAGGAATAGCA GGAGAAATTTTACTGAAGTGCTGTTTTCCTAGGTGCTACTTTGGCAGAGCTAAGTTATCT TTTGTTTT CTT AATGCGTTTGGACCATTTT GCT GGCTATAAAAT AACT GATT AAT AT AAT TCT AAC AC AAT GTT G AC ATT GTAGTT AC AC AAAC AC AAAT AAAT ATTTT ATTT AAAATT C TGGAAGTAATATAAAAGGGAAAATATATTTATAAGAAAGGGATAAAGGTAATAGAGCCCT TCTGCCCCCCACCCACCAAATTTACACAACAAAATGACATGTTCGAATGTGAAAGGTCAT AAT AGCTTTCCC AT CAT G AATC AG AAAG ATGTGGACAGCTT GAT GTTTT AG ACAACCACT G AACT AGAT G ACTGTT GT ACTGTAGCT CAGT CATTTAAAAAAT AT AT AAAT ACTACCTT G TAGTGTCCCATACTGTGTTTTTTACATGGTAGATTCTT ATTT AAGTGCTAACTGGTT ATT TTCTTTGGCTGGTTTATTGTACTGTTATACAGAATGTAAGTTGTACAGTGAAATAAGTTA TTAAAGCATGTGTAAAC ATTGTT AT AT AT CTTTTCTCCT AAATGGAGAATTTT G AAT AAA AT 
ATATTTGAAATTTT
(SEQ ID NO: 10)
As used herein, the term “CASS4” refers to the gene encoding Cas scaffolding protein family member 4. The terms “CASS4” and "Cas scaffolding protein family member 4" include wild-type forms of the CASS4 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CASS4. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CASS4 nucleic acid sequence (e.g., SEQ ID NO: 11, ENA accession number AJ276678). SEQ ID NO: 11 is a wild-type gene sequence encoding CASS4 protein, and is shown below:
G AAG AGT G GT GTTTTTTT CTTCTTCTTCTT CTTTT GTG GTTT C AC AT AGC AAAT G AGT G A CAGTCTCTACTTACAGACAAAGTGAGACGTCAGGCATTGAGACATAGCTCCATAGAATTC AGTTTCTGAGAACCAGCCAGAAGCATGCAGTGACATTGCACAATCTGCCTCTGAAGCTGG
AGATACTAGCTGCAGAGCTCAGGGGAGCTGCTCCACATCACCGACATGAAGGGAACA GGC
ATCATGGACTGTGCGCCCAAGGCACTCCTGGCCAGGGCACTTTATGACAACTGCCCT GAC
TGCTCTGACGAGCTGGCTTTCAGCAGAGGGGACATCCTGACCATTCTGGAGCAACAC GTG
CCAGAAAGCGAGGGTTGGTGGAAGTGTTTGCTCCATGGGAGGCAAGGCCTGGCCCCT GCC
AACCGCCTCCAAATCCTCACGGAGGTCGCTGCAGACAGGCCGTGCCCCCCATTCCTG AGA
GGCCTGGAAGAAGCTCCTGCCAGCTCAGAGGAGACCTATCAGGTGCCCACTCTACCC CGC
CCTCCCACTCCAGGCCCCGTTTATGAGCAGATGAGGAGTTGGGCGGAGGGGCCCCAG CCC
CCTACTGCCCAAGTCTATGAATTCCCCGACCCTCCCACCAGTGCCAGAATCATCTGT GAA
AAGACTCTCAGCTTTCCAAAACAGGCCATCCTCACGCTTCCCAGACCTGTCCGGGCC TCA
CTGCCGACTCTGCCTTCCCAGGTGTATGACGTGCCTACCCAGCACCGGGGCCCCGTG GTC
CTGAAGGAGCCAGAGAAGCAGCAGTTATATGACATACCAGCCAGCCCCAAGAAGGCA GGA
CTCCATCCCCCAGACAGCCAAGCAAGTGGGCAGGGTGTTCCCCTGATATCAGTGACT ACC
TTAAGAAGAGGCGGTTACAGCACATTACCAAATCCTCAGAAATCGGAATGGATTTAT GAC
ACTCCAGTGTCTCCAGGAAAGGCCAGCGTCAGAAACACGCCTCTCACCAGCTTTGCG GAA
GAATCAAGGCCCCACGCTCTCCCCAGTTCCAGCTCCACTTTCTACAATCCTCCAAGT GGC
AGATCCAGGTCCCT CACTCCAC AACT G AAT AACAAT GT GCCCATGCAG AAAAAACT CAGC
CTTCCAGAAATTCCTTCTTATGGCTTTCTTGTACCCAGAGGCACATTTCCTTTGGAT GAA
GAT GTCAGCAACAAGGTTCCTT CAAGCTT CT CT G ATTCCCCGAGTGGACAGC AG AACACC
AAGCCCAATATAGACATCCCTAAAGCAACGTCGAGTGTTTCTCAGGCTGGGAAGGAG CTG
GAGAAAGCCAAGGAGGTGTCAGAGAATTCCGCGGGCCATAATTCCTCATGGTTCTCC AGA
CGGACAACTTCCCCATCTCCTGAACCGGACAGATTATCAGGTTCCAGTTCTGACAGC AGA
GCTAGCATCGTTTCCTCGTGCTCCACCACATCCACCGACGACTCCTCCAGCTCTTCC TCG
GAGGAGTCAGCAAAGGAGCTCTCCTTGGACCTGGATGTGGCCAAGGAGACAGTGATG GCT
CTGCAGCACAAGGTGGTCAGCTCTGTCGCTGGCCTGATGCTCTTTGTCAGCAGGAAG TGG
AGATTCCGAGACTATCTGGAGGCCAACATTGATGCAATCCACAGGTCCACTGATCAC ATA
GAAGCCTCTGTAAGAGAATTTCTGGATTTTGCCCGAGGAGTCCATGGGACTGCCTGT AAC
CTCACTGACAGTAACCTTCAGAACAGAATTCGGGACCAGATGCAGACCATCTCCAAC TCC
TACCGCATCCTGCTTGAAACAAAGGAAAGCTTGGATAATCGCAATTGGCCTCTGGAA GTT
CTT GT GACT G ACAGT GTCCAGAACAGCCCAGAT G ACCTT GAG AGGTTTGTCATGGT GGCA
CGGATGCTTCCAGAAGACATCAAGAGGTTTGCCTCCATTGTCATTGCCAATGGAAGG CTC
CTTTTTAAGCGGAACTGTGAAAAGGAAGAGACTGTGCAGTTGACCCCAAATGCAGAA TTT
AAGTGTGAAAAATACATCCAGCCTCCCCAAAGAGAAACTGAATCACACCAAAAGAGT ACC
CCTTCC ACT AAG C AAAG G G AAG AT GAACACTCTTCT G AACT ATT AAAG AAAAAT AG AGC A
AATATCTGTGGACAGAATCCTGGCCCTCTTATACCTCAGCCTTCGAGTCAACAGACT CCT
GAGAGGAAACCCCGCTTATCTGAACACTGCCGGCTGTACTTTGGGGCGCTCTTCAAA GCC
ATCAGCGCATTTCACGGCAGCCTCAGCAGCAGCCAGCCCGCGGAGATCATCACTCAG AGC
AAGCTGGTCATCATGGTGGGACAGAAGCTGGTGGACACGCTGTGCATGGAGACCCAG GAG
AGGGACGTGCGCAATGAGATCCTCCGCGGCAGCAGTCACCTCTGCAGCCTGCTCAAG GAC
GTAGCGCTGGCCACTAAGAATGCCGTGCTCACATACCCCAGCCCTGCCGCGCTGGGG CAC
CTCCAGGCGGAGGCTGAGAAGCTGGAGCAACACACGCGGCAGTTCAGAGGGACACTG GGA
TGAGGACTGTCTACCTCCCTTCCTCCTCTGCTCACC
(SEQ ID NO: 11)
As used herein, the term “CCL5” refers to the gene encoding C-C motif chemokine 5. The terms “CCL5” and "C-C motif chemokine 5" include wild-type forms of the CCL5 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CCL5. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CCL5 nucleic acid sequence (e.g., SEQ ID NO: 12, ENA accession number M21121). SEQ ID NO: 12 is a wild-type gene sequence encoding CCL5 protein, and is shown below:
CCTCCGACAGCCTCTCCACAGGTACCATGAAGGTCTCCGCGGCACGCCTCGCTGTCA TCC
TCATTGCTACTGCCCTCTGCGCTCCTGCATCTGCCTCCCCATATTCCTCGGACACCA CAC
CCTGCTGCTTTGCCTACATTGCCCGCCCACTGCCCCGTGCCCACATCAAGGAGTATT TCT
ACACCAGTGGCAAGTGCTCCAACCCAGCAGTCGTCTTTGTCACCCGAAAGAACCGCC AAG
TGTGTGCCAACCCAGAGAAGAAATGGGTTCGGGAGTACATCAACTCTTTGGAGATGA GCT
AGG ATGGAGAGTCCTT G AACCT G AACTTACACAAATTTGCCT GTTT CTGCTTGCT CTT GT
CCTAGCTTGGGAGGCTTCCCCTCACTATCCTACCCCACCCGCTCCTTGAAGGGCCCA GAT
TCTGACCACGACGAGCAGCAGTTACAAAAACCTTCCCCAGGCTGGACGTGGTGGCTC AGC
CTTGTAATCCCAGCACTTTGGGAGGCCAAGGTGGGTGGATCACTTGAGGTCAGGAGT TCG
AGACAGCCTGGCCAACATGATGAAACCCCATGTGTACTAAAAATACAAAAAATTAGC CGG
GCGTGGTAGCGGGCGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATGG CGT
GAACCCGGGAGCGGAGCTTGCAGTGAGCCGAGATCGCGCCACTGCACTCCAGCCTGG GCG
ACAG AGCGAG ACTCCGT CT CAAAAAAAAAAAAAAAAAAAAAAAAAAT ACAAAAATT AGCC
GCGTGGTGGCCCACGCCTGTAATCCCAGCTACTCGGGAGGCTAAGGCAGGAAAATTG TTT
GAACCCAGGAGGTGGAGGCTGCAGTGAGCTGAGATTGTGCCACTTCACTCCAGCCTG GGT
GACAAAGTGAGACTCCGTCACAACAACAACAACAAAAAGCTTCCCCAACTAAAGCCT AGA
AGAGCTTCTGAGGCGCTGCTTTGTCAAAAGGAAGTCTCTAGGTTCTGAGCTCTGGCT TTG
CCTTGGCTTTGCAAGGGCTCTGTGACAAGGAAGGAAGTCAGCATGCCTCTAGAGGCA AGG
AAGGGAGGAACACTGCACTCTTAAGCTTCCGCCGTCTCAACCCCTCACAGGAGCTTA CTG
GCAAACATGAAAAATCGGGG
(SEQ ID NO: 12)
As used herein, the term “CD2AP” refers to the gene encoding CD2-associated protein. The terms “CD2AP” and "CD2-associated protein" include wild-type forms of the CD2AP gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CD2AP. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CD2AP nucleic acid sequence (e.g., SEQ ID NO: 13, ENA accession number AF146277). SEQ ID NO: 13 is a wild-type gene sequence encoding CD2AP protein, and is shown below:
GGAATTCCGGGAGGAGCGGACGTCGGCTTCTCCCCGCGGGAGCCCCCAGCATGGTTGACT
AT ATTGTG G AGT ATG ACT AT GAT G CTGTAC AT GAT GAT G AATT AACT ATTCG AGTT G GAG
AAATCATCAGGAATGTGAAAAAGCTACAGGAGGAAGGGTGGCTGGAAGGAGAACTAA ATG
G G AG AAG AG G AAT GTTC CCT G AC AATTTCGTT AAG G AAATT AAAAG AG AG AC GG AATT C A
AGGATGACAGTTTGCCCATCAAACGGGAAAGGCATGGGAATGTAGCAAGTCTTGTAC AAC
GAATAAGCACCTATGGACTTCCAGCTGGAGGAATTCAGCCACATCCACAAACCAAAA ACA
TTAAGAAGAAGACCAAG AAGCGTC AGT GT AAAGTT CTTTTT GAGT ACATTCCACAAAAT G
AGG AT G AACT GG AGCT G AAAGTGGGAGATATT ATT GAT ATT AAT GAAGAGGTAG AAGAAG
GCTGGTGGAGT GG AACCCT G AAT AACAAGTTGGG ACT GTTTCCCT CAAATTTT GT GAAAG
AATT AG AGGTAACAG AT GAT GGTGAAACT CAT G AAGCCCAGG ACG ATT CAG AAACTGTTT
TGGCTGGGCCTACTTCACCTATACCTTCTCTGGGAAATGTGAGTGAAACTGCATCTG GAT
CAGTTACACAGCCAAAGAAAATTCGAGGAATTGGATTTGGAGACATTTTTAAAGAAG GTT
CT GT G AAACTTCGG AC AAG AACATCCAGT AGT GAAAC AG AAG AG AAAAAACCAGAAAAGC
CCTTAATCCTACAGTCACTGGGACCCAAAACTCAGAGTGTGGAGATAACAAAAACAG ATA
CCG AAGGT AAAATT AAAGCT AAAGAATATTGTAG AACATT ATTTGCCT AT GAAGGT ACT A
AT G AAG AT G AACTT ACTTTT AAAG AGG GG G AG AT AATCCATTT G ATAAGT AAG GAG ACT G
GAGAAGCTGGCTGGTGGAGGGGCGAACTTAATGGTAAAGAAGGAGTATTTCCAGACA ATT
TTGCT GTCCAG AT AAAT GAACTT GAT AAAG ACTTTCCAAAACC AAAGAAACCACCACCTC
CTGCTAAGGCTCCAGCTCCAAAGCCTGAACTGATAGCTGCAGAGAAGAAATATTTTT CTT
T AAAGCCT G AAG AAAAGGAT G AAAAAT CAACACT GG AACAGAAACCTT CTAAACCAGC AG
CTCCACAAGTCCCACCCAAGAAACCTACTCCACCTACCAAAGCCAGTAATTTATTGA GAT
CTTCTGGAACAGTGTACCCAAAGCGACCTGAAAAACCAGTTCCTCCACCACCTCCTA TAG
CC AAG ATT AAT G GG G AAGTTT CT AG C ATTT CAT C AAAATTT G AAACT G AGCC AGTAT C AA
AACT AAAG CT AG ATT CT G AAC AG CTGCCCCTT AG ACC AAAAT C AGT AG ACTTT GATT C AC
TTACAGTAAGGACCTCCAAAGAAACAGATGTTGTAAATTTTGATGACATAGCTTCCT CAG
AAAACTTGCTTCATCTCACTGCAAATAGACCAAAGATGCCTGGAAGAAGGTTGCCGG GCC
GTTTCAATGGTGGACATTCTCCAACTCACAGCCCCGAAAAAATCTTGAAGTTACCAA AAG
AAGAAGACAGTGCCAACCTGAAGCCATCTGAATTAAAAAAAGATACATGCTACTCTC CAA
AGCCATCTGTGTACCTTTCAACACCTTCCAGTGCTTCTAAAGCAAATACAACTGCTT TCC
TGACTCCATTAGAAATCAAAGCTAAAGTGGAAACAGATGATGTGAAAAAAAATTCCC TGG
AT GAACTT AG AGCCCAG ATTATT GAATTGTT GT GCATTGTAGAAGCACT GAAAAAGGATC
ACGGGAAAGAACTGGAAAAACTGCGAAAAGATTTGGAAGAAGAGAAGACAATGAGAA GTA
ATCTAGAGATGGAAATAGAGAAGCTGAAAAAAGCTGTCCTGTCTTCTTGAGTGGTGT GGA
CCTGGTGTTCATAATGTTCCAGGGATTCAGAAGCAACGCTATGAACTTCAGCTGACT TGT
T ACTT AAAAATTGTG AATT CTGTTGTTGT GAT AAAT AT GAG C AAAT G AAGT GT AAT ATCT
AT AG AAAAGT AG AGT G AG GGTG AATTT AT AT AT AT ATTTT GTTTT G CC AAT AT G AAG AAA
AAGAGGCCTTATTTCTTAACTGTGCTGGGATTGCAAACACTTTTTAAAAAATTGTTT GCT
T G AAAAT ACTACT G AAT AT AAAT AAG AAT GTGCTCAGT AGTTTTTTT ATT G AAACTT GTA
TT ATTTTT AAAG AG AT CTATACTAT AAAT ATGGTG ATAT ATTT AC AAGTAAT CTGT AAG A
TATACT ATTT GAG AGG G AC AG ATT AG CCTTTT AGTAACT ATAGT C ACT ACTTTTTCCAT A
ATGCATAAGGGATATAAACTCTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG TGT
ATATATATATATATATTTTTACTTTTATCCTCTTACCGAAGGTTACACTGTTGTGCC TGT TTGTCTG C AAT G CTGTTT AT ATTTT G GGT GAT G AAAT AG G AGTTTCCT AG CTATAT AAAC C AG ATT ACT C ACCC AT G CAT ATAGT AAG AACT AAT G AAT AAT C AAAAT AATTT CAT C AAC TTTT AG AAT ATTTT ATGTTG CTTGC ACTAT AG G AGT CAT AAAAGG AACTT AGTT AAAAT A TGTT GG ATTGTT AAAC ATTT G GG G AAAT AT G AACT GT ATTTT AAATTTGTT AG GTCTG AA AAAT CT AAAACTGTT AATTTAACCCTTAACTT GTGCCT AG AAACT ACAGCACAT AT AAAA TATGTAAACACCAGCCTGTTGCTGTACTTTTCTGCTTATTTTACAGCCTCAAATATTTCT C ATT ATCTTGTC ACTTAGTTCTT C ATGTTT CTCCTTCT G ACTTTT AAT AAT GGT AAT AG G AAAACAAAACCCAAAGCTTTTCAAACTTCAGTGTGAGGTTTCCTATTTTGACAAGTTAAC TTGT AAAT ACT C AG GTTTT ACG ATGTAT AATTT ACCT AAT AG ACC AAACT AACT CAT G G A GAT ATTTT G AACT ATT ATTT AGGTAC AAACTTT AT AAAG AAT GTTAGT ATGTC AT AAAAT AT AACATT AC AG CTT ATTT AAAAC C AAAT AT ATT G AACAT ATTTT AAAAT AC ATTT C AC A G AATGG AT GAATT AGTT GTTT CTT CAAAAGTT ACTTAT G AACAGTT G AAT GCCTTT AAAA TGTTCTGTCTGTAGGTACATCTAAAAACACAAGTGGGTTTATTTAAATTTTTAAAATTTG AAATTTTTT ATTT GCCAAAAATT GTTTTATGCTTT ATT AT ATCGCAAAT GAGT GTCAGAT TTTT GAGT ACCAAT GAT CAT GCTTCCATTTTTTTTAGTTTTAAACCACCAAACCAAT ATT TTTCCTTT AAATTTT AAT CTT AT AAT AT AG AAAT CTT ATGTT AAT G AAATTTT GT CAT GT TT C AAAT AAAG AAAACT G AAGT AG AAAAT AG AAAT G CC AGTAAAC AAC AT AAT GTTT AAT TT AC AACTT AC ATT AG GG GTTT G GG GG AAT G CT AATT AT AT ATT GAG AAT AT AC ATT AG A ACTCTTCAAAATGGGCTCTTCTAATGAGGTCACTACTGAACAAAATTGTTCCCTCTTCTG TT AAAT AG AAT AG GTTT AAAT G ACT AGTC AAAT GAATT ATTTT CTCCTTGTT AAAT AAAT TAAATCTTACTTTCTTTTAATGACCAACCTTAGGTAAAACAAAAATATTGTAATCCTAGA AATTATCCTCCAGCTTTCTCACCTGAAAATCTATTGAAGTGATCCCTGGTCATCCTAATA ATGGGATGAGGGAAGTTTCCAGCAGATTTCAGGCTGTTCTTAAAGTTTTTGTTGGTCATT TTCT C AAT AGTAC AT G AAAT C AAG ATGCTT AT GAG CAT GG AAAT GT ATTT AAAGTTTTT G CTT GT GTCCTCCT CAGTCAG AATAG AAAAGTAACT GAAAT ACT CTT ACCTTT CT GTCCTT GAT AAAAT AGTAAAGAAAACCAAACAAACCCAGGCCTGATGGGAAAAATGATTCCTTTAT T CT AG C AATT ACTTT CTGTTG GTAT GG GAAAT GTT ATT AATTT CT ATT ACT AAAGTT 
CAT AT C AC AAAAT GAT ATTT AAT AAT AACCTTGGGGT AAAT CAT G AATTTTTTTTT CTACGTG T GAGT AT AAAAG AC AAAAGTT G AAC AG CAT G G AAT CTT C ATTGCC AAATT ATT AGTG AAT GTATAGTTCAGGTATTCTTTGAGACACACAGTATCATTAATTTCCGAATTGTATTTCAGT GTTATTTTTTGTTTGTGACCACTAAGCTTCTGTCTTAATACAAAGCTGTTACCTTCTACA G AATTT AAGTCTGAAGAT GT AAAG AGAG AACAGGCCTT GT GT AACAG AAG ATACT CTTTT TTAT GCTCCTT ACT GT GATCACAG AAAAATT AAAAATCCAAGT GCTCT CT AG ATTTGTT G AT AAAC ATTTT ATGCTTG C ATTT AAACTT G AAATGTAT G AGC AG AAT GAG AC AAT C AGTT AAATCAGAAATGAGAAGTATTATAATGTAAAGGCCTTGTTTTGCTGTAGCAATAAAATGA CCAAGTGCAAT G ACTT G ATTTAATAAAATCCGG AATT C (SEQ ID NO: 13)
As used herein, the term “CD33” refers to the gene encoding Myeloid cell surface antigen CD33. The terms “CD33” and "Myeloid cell surface antigen CD33" include wild-type forms of the CD33 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CD33. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CD33 nucleic acid sequence (e.g., SEQ ID NO: 14, ENA accession number M23197). SEQ ID NO: 14 is a wild-type gene sequence encoding CD33 protein, and is shown below:
GCTTCCTCAGACATGCCGCTGCTGCTACTGCTGCCCCTGCTGTGGGCAGGGGCCCTG GCT
ATGGATCCAAATTTCTGGCTGCAAGTGCAGGAGTCAGTGACGGTACAGGAGGGTTTG TGC
GTCCTCGTGCCCTGCACTTTCTTCCATCCCATACCCTACTACGACAAGAACTCCCCA GTT
CATGGTTACTGGTTCCGGGAAGGAGCCATTATATCCGGGGACTCTCCAGTGGCCACA AAC
AAGCTAGATCAAGAAGTACAGGAGGAGACTCAGGGCAGATTCCGCCTCCTTGGGGAT CCC
AGTAGGAACAACTGCTCCCTGAGCATCGTAGACGCCAGGAGGAGGGATAATGGTTCA TAC
TTCTTTCGGATGGAGAGAGGAAGTACCAAATACAGTTACAAATCTCCCCAGCTCTCT GTG
CATGTGACAGACTTGACCCACAGGCCCAAAATCCTCATCCCTGGCACTCTAGAACCC GGC
CACTCCAAAAACCTTACCTGCTCTGTGTCCTGGGCCTGTGAGCAGGGAACACCCCCG ATC
TTCTCCTGGTTGTCAGCTGCCCCCACCTCCCTGGGCCCCAGGACTACTCACTCCTCG GTG
CTCATAATCACCCCACGGCCCCAGGACCACGGCACCAACCTGACCTGTCAGGTGAAG TTC
GCTGGAGCTGGTGTGACTACGGAGAGAACCATCCAGCTCAACGTCACCTATGTTCCA CAG
AACCCAACAACTGGTATCTTTCCAGGAGATGGCTCAGGGAAACAAGAGACCAGAGCA GGA
CTGGTTCATGGGGCCATTGGAGGAGCTGGTGTTACAGCCCTGCTCGCTCTTTGTCTC TGC
CTCATCTTCTTCATAGTGAAGACCCACAGGAGGAAAGCAGCCAGGACAGCAGTGGGC AGC
AATGACACCCACCCTACCACAGGGTCAGCCTCCCCGAAACACCAGAAGAACTCCAAG TTA
CATGGCCCCACTGAAACCTCAAGCTGTTCAGGTGCCGCCCCTACTGTGGAGATGGAT GAG
GAGCTGCATTATGCTTCCCTCAACTTTCATGGGATGAATCCTTCCAAGGACACCTCC ACC
GAATACTCAGAGGTCAGGACCCAGTGAGGAACCCTCAAGAGCATCAGGCTCAGCTAG AAG
ATCCACATCCTCTACAGGTCGGGGACCAAAGGCTGATTCTTGGAGATTTAACTCCCC ACA
GGCAATGGGTTTATAGACATTATGTGAGTTTCCTGCTATATTAACATCATCTTGAGA CTT
TGCAAGCAGAGAGTCGTGGAATCAAATCTGTGCTCTTTCATTTGCTAAGTGTATGAT GTC
AC AC AAG CTCCTT AACCTTCC AT GT CTCC ATTTT CTTCTCTGTGAAGT AG GTAT AAG AAG
TCCTATCTCATAGGGATGCTGTGAGCATTAAATAAAGGTACACATGGAAAACACCAG
(SEQ ID NO: 14)
As used herein, the term “CD68” refers to the gene encoding CD68 molecule. The terms “CD68” and "CD68 molecule" include wild-type forms of the CD68 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CD68. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CD68 nucleic acid sequence (e.g., SEQ ID NO: 15, NCBI Reference Sequence: NM_001251.2). SEQ ID NO: 15 is a wild-type gene sequence encoding CD68 protein, and is shown below:
TTAATTACAAAAACTAATGACTAAGAGAGAGGTGGCTAGAGCTGAGGCCCCTGAGTC AGGCTGTGG
GTGGGATCATCTCCAGTACAGGAAGTGAGACTTTCATTTCCTCCTTTCCAAGAGAGG GCTGAGGGAG CAGGGTTGAGCAACTGGTGCAGACAGCCTAGCTGGACTTTGGGTGAGGCGGTTCAGCCAT GAGGCT
GGCTGTGCTTTTCTCGGGGGCCCTGCTGGGGCTACTGGCAGCCCAGGGGACAGGGAA TGACTGTC
CTCACAAAAAATCAGCTACTTTGCTGCCATCCTTCACGGTGACACCCACGGTTACAG AGAGCACTGG
AACAACCAGCCACAGGACTACCAAGAGCCACAAAACCACCACTCACAGGACAACCAC CACAGGCAC
CACCAGCCACGGACCCACGACTGCCACTCACAACCCCACCACCACCAGCCATGGAAA CGTCACAGT
TCATCCAACAAGCAATAGCACTGCCACCAGCCAGGGACCCTCAACTGCCACTCACAG TCCTGCCAC
CACTAGTCATGGAAATGCCACGGTTCATCCAACAAGCAACAGCACTGCCACCAGCCC AGGATTCACC
AGTTCTGCCCACCCAGAACCACCTCCACCCTCTCCGAGTCCTAGCCCAACCTCCAAG GAGACCATT
GGAGACTACACGTGGACCAATGGTTCCCAGCCCTGTGTCCACCTCCAAGCCCAGATT CAGATTCGA
GTCATGTACACAACCCAGGGTGGAGGAGAGGCCTGGGGCATCTCTGTACTGAACCCC AACAAAACC
AAGGTCCAGGGAAGCTGTGAGGGTGCCCATCCCCACCTGCTTCTCTCATTCCCCTAT GGACACCTC
AGCTTTGGATTCATGCAGGACCTCCAGCAGAAGGTTGTCTACCTGAGCTACATGGCG GTGGAGTAC
AATGTGTCCTTCCCCCACGCAGCACAGTGGACATTCTCGGCTCAGAATGCATCCCTT CGAGATCTCC
AAGCACCCCTGGGGCAGAGCTTCAGTTGCAGCAACTCGAGCATCATTCTTTCACCAG CTGTCCACCT
CGACCTGCTCTCCCTGAGGCTCCAGGCTGCTCAGCTGCCCCACACAGGGGTCTTTGG GCAAAGTTT
CTCCTGCCCCAGTGACCGGTCCATCTTGCTGCCTCTCATCATCGGCCTGATCCTTCT TGGCCTCCTC
GCCCTGGTGCTTATTGCTTTCTGCATCATCCGGAGACGCCCATCCGCCTACCAGGCC CTCTGAGCAT
TTGCTTCAAACCCCAGGGCACTGAGGGGGTTGGGGTGTGGTGGGGGGGTACCCTTAT TTCCTCGAC
ACGCAACTGGCTCAAAGACAATGTTATTTTCCTTCCCTTTCTTGAAGAACAAAAAGA AAGCCGGGCAT
GACGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCTGAGGCAGGTGGATCACTGG AGGTCAGGA
GTTTGAGACCAGCCTGGCCAACATGGTGAAACCCTGTCTCTACTAAAAATACAATTA GCCAGGTGTG
GCGGCGTAATCCCAGCTGGCCTGTAATCCCAGCTACTTGGGAGGCTGAGGCAGAACT GCTTGAACC
CAGGAGGTGGAGGTTGCAGTGAGCCGTCATCGCGCCACTAAGCCAAGATCGCGCCAC TGCACTCC
AGCCTGGGCGACAGAGCCAGACTGTCTCAAATAAATAAATATGAGATAATGCAGTCG GGAGAAGGG
AGGGAGAGAATTTTATTAAATGTGACGAACTGCCCCCCCCCCCCCCCCAGCAGGAGA GCAGCAAAA
TTTATGCAAATCTTTGACGGGGTTTTCCTTGTCCTGCCAGGATTAAAAGCCATGAGT TTCTTGTCAAA
AAAAAAAAAAAAAA
(SEQ ID NO: 15)
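The "at least 70% sequence identity" criterion recited in these definitions can be illustrated with a minimal sketch. It assumes the two nucleic acid sequences have already been aligned to equal length by some external tool, with `-` marking gaps, and it counts identity over all alignment columns. That convention, and the helper names `normalize_sequence`, `percent_identity`, and `meets_identity_threshold`, are illustrative assumptions and are not taken from this specification.

```python
def normalize_sequence(raw: str) -> str:
    """Strip whitespace and line breaks from a sequence copied out of a listing."""
    return "".join(raw.split()).upper()


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: identity is the number of matching non-gap columns divided
    by the total number of alignment columns. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 70.0
) -> bool:
    """True when the pair reaches the stated minimum percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under these assumptions, a candidate variant would first be aligned against the reference (e.g., SEQ ID NO: 15 after whitespace is removed with `normalize_sequence`) and then checked with `meets_identity_threshold`.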
As used herein, the term “CLPTM1” refers to the gene encoding CLPTM1 Regulator of GABA Type A Receptor Forward Trafficking. The terms “CLPTM1” and "CLPTM1 Regulator of GABA Type A Receptor Forward Trafficking" include wild-type forms of the CLPTM1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CLPTM1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CLPTM1 nucleic acid sequence (e.g., SEQ ID NO: 16, NCBI Reference Sequence: NM_001294.3). SEQ ID NO: 16 is a wild-type gene sequence encoding CLPTM1 protein, and is shown below:
AGGTTGGTCCTTCCATAGCCGGAAGTGGCCTTCCTGAGAGGCGTGGCTGCGGCACTC TTGCCGGAT
AGGGTGGCCCGGCGGGGCTAGGAAAGCGTGAAATCTCGCGCGATTGCGCTGCGAAGT CGGGGAC
GGGGCGGGGCTGGCGGCGGGGGCGGGGACCCGGAGCGGGAAGATGGCGGCGGCGCAG GAGGC GGACGGGGCCCGCAGCGCCGTGGTGGCGGCCGGGGGAGGCAGCTCCGGTCAGGTGACCAG CAAT
GGCAGCATCGGGAGGGACCCGCCAGCGGAGACCCAGCCTCAGAACCCACCGGCCCAG CCGGCAC
CCAATGCCTGGCAGGTCATCAAAGGTGTGCTGTTTAGGATCTTCATCATCTGGGCCA TCAGCAGTTG
GTTCCGCCGAGGGCCGGCCCCTCAGGACCAGGCGGGCCCCGGAGGAGCTCCACGCGT CGCCAGC
CGCAACCTGTTCCCCAAAGACACTTTAATGAACCTGCATGTGTACATCTCAGAGCACGAGCACTTTA
CAGACTTCAACGCCACGTCGGCACTCTTCTGGGAACAGCACGATCTTGTGTATGGCG ACTGGACTA
GCGGCGAGAACTCAGACGGCTGCTACGAGCACTTTGCTGAGCTCGATATCCCACAGA GCGTCCAGC
AGAACGGCTCCATCTACATCCACGTTTACTTCACCAAGAGTGGCTTCCACCCAGACC CCCGGCAGAA
GGCCCTGTACCGCCGGCTTGCCACAGTCCACATGTCCCGGATGATCAACAAATACAA GCGCAGACG
ATTTCAGAAAACCAAGAACCTGCTGACAGGAGAGACAGAAGCGGACCCAGAAATGAT CAAGAGGGC
TGAGGACTATGGGCCTGTGGAGGTGATCTCCCATTGGCACCCCAACATCACCATCAA CATCGTGGA
CGACCACACGCCGTGGGTGAAGGGCAGTGTGCCCCCTCCCCTGGATCAATATGTGAA GTTCGACGC
CGTGAGCGGTGACTACTATCCCATCATCTACTTCAATGACTACTGGAACCTGCAGAA GGACTACTAC
CCCATCAACGAGAGCCTGGCCAGCCTGCCGCTCCGCGTCTCCTTCTGCCCACTCTCG CTTTGGCGC
TGGCAGCTCTATGCTGCCCAGAGCACCAAGTCGCCCTGGAACTTCCTGGGTGATGAG TTGTACGAG
CAGTCAGATGAGGAGCAGGACTCGGTGAAGGTGGCCCTGCTGGAGACCAACCCCTAC CTGCTGGC
GCTCACCATCATCGTGTCTATCGTTCACAGTGTCTTCGAGTTCCTGGCCTTCAAGAA TGATATCCAGT
TCTGGAACAGCCGGCAGTCCCTGGAGGGCCTGTCCGTGCGCTCCGTCTTCTTCGGCG TTTTCCAGT
CATTCGTGGTCCTCCTCTACATCCTGGACAACGAGACCAACTTCGTGGTCCAGGTCA GCGTCTTCAT
TGGGGTCCTCATCGACCTCTGGAAGATCACCAAGGTCATGGACGTCCGGCTGGACCG AGAGCACAG
GGTGGCAGGAATCTTCCCCCGCCTATCCTTCAAGGACAAGTCCACGTATATCGAGTC CTCGACCAAA
GTGTATGATGATATGGCATTCCGGTACCTGTCCTGGATCCTCTTCCCGCTCCTGGGC TGCTATGCCG
TCTACAGTCTTCTGTACCTGGAGCACAAGGGCTGGTACTCCTGGGTGCTCAGCATGC TCTACGGCTT
CCTGCTGACCTTCGGCTTCATCACCATGACGCCCCAGCTCTTCATCAACTACAAGCT CAAGTCTGTG
GCCCACCTTCCCTGGCGCATGCTCACCTACAAGGCCCTCAACACATTCATCGACGAC CTGTTCGCCT
TTGTCATCAAGATGCCCGTTATGTACCGGATCGGCTGCCTGCGGGACGATGTGGTTT TCTTCATCTA
CCTCTACCAACGGTGGATCTACCGCGTCGACCCCACCCGAGTCAACGAGTTTGGCAT GAGTGGAGA
AGACCCCACAGCTGCCGCCCCCGTGGCCGAGGTTCCCACAGCAGCAGGGGCCCTCAC GCCCACAC
CTGCACCCACCACGACCACCGCCACCAGGGAGGAGGCCTCCACGTCCCTGCCCACCA AGCCCACC
CAGGGGGCCAGCTCTGCCAGCGAGCCCCAGGAAGCCCCTCCAAAGCCAGCAGAGGAC AAGAAAAA
GGATTAGTCGAGACTGGTCCTCACCTGCTCCGGCTCCTGGCGACCACTACCCCTGCG TCCCGGCCC
CCTCGCCTCCCCTCCCTGTCGCCCTTTCCCTGGACAGATCAGGCCGGGGCGGTGGGA GGCCCGCC
TCAGGTCAGGGCCCAGCGTGTGATGTAGGGGCCGGGGCAGGCCAGGGTTTGTTTGTG GAGGCGCT
GTCTGTCCCTCTGTCCCTCTGTGTTTCCAGCCATCTCGCCCTGCCAGCCCAGCACCA CTGGGAATCA
TGGTGAAGCTGATGCAGCGTTGCCGAGGGGGTGGGTTGGGCGGGGGTGGGGCCGGGC CCCCCTA
CGGGATGCCCACGGCCGTTCATCATCTTGTCCCTCGTCCCCCTACCACACTCCCCCT CCTAGACCG
CCGCCCTTTAACACAGTCTGGATTTAATAAATTCATATGGGTGTTTAACTTAAACTC AGCACTAAAAAA
AAAAAAAAAAAA
(SEQ ID NO: 16)
As used herein, the term “CLU” refers to the gene encoding Clusterin. The terms “CLU” and "Clusterin" include wild-type forms of the CLU gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CLU. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CLU nucleic acid sequence (e.g., SEQ ID NO: 17, ENA accession number M25915). SEQ ID NO: 17 is a wild-type gene sequence encoding CLU protein, and is shown below:
CTGCGAACCCTCTCTACTCTCCGAAGGGAATTGTCCTTCCTGGCTTCCACTACTTCC ACC
CCTGAATGCACAGGCAGCCCGGCCCAAGTCTCCCACTAGGGATGCAGATGGATTCGG TGT
GAAGGGCTGGCTGCTGTTGCCTCCGGCTCTTGAAAGTCAAGTTCAGAGGCGTGCAAA GAC
TCCAGAATTGGAGGCATGATGAAGACTCTGCTGCTGTTTGTGGGGCTGCTGCTGACC TGG
GAGAGTGGGCAGGTCCTGGGGGACCAGACGGTCTCAGACAATGAGCTCCAGGAAATG TCC
AATCAGGGAAGTAAGTACGTCAATAAGGAAATTCAAAATGCTGTCAACGGGGTGAAA CAG
ATAAAGACTCTCATAGAAAAAACAAACGAAGAGCGCAAGACACTGCTCAGCAACCTAGAA
GAAGCCAAGAAGAAGAAAGAGGATGCCCTAAATGAGACCAGGGAATCAGAGACAAAG CTG
AAGGAGCTCCCAGGAGTGTGCAATGAGACCATGATGGCCCTCTGGGAAGAGTGTAAG CCC
TGCCTGAAACAGACCTGCATGAAGTTCTACGCACGCGTCTGCAGAAGTGGCTCAGGC CTG
GTTGGCCGCCAGCTTGAGGAGTTCCTGAACCAGAGCTCGCCCTTCTACTTCTGGATG AAT
GGTGACCGCATCGACTCCCTGCTGGAGAACGACCGGCAGCAGACGCACATGCTGGAT GTC
ATGCAGGACCACTTCAGCCGCGCGTCCAGCATCATAGACGAGCTCTTCCAGGACAGG TTC
TTCACCCGGGAGCCCCAGGATACCTACCACTACCTGCCCTTCAGCCTGCCCCACCGG AGG
CCTCACTTCTTCTTTCCCAAGTCCCGCATCGTCCGCAGCTTGATGCCCTTCTCTCCG TAC
GAGCCCCTGAACTTCCACGCCATGTTCCAGCCCTTCCTTGAGATGATACACGAGGCT CAG
CAGGCCATGGACATCCACTTCCACAGCCCGGCCTTCCAGCACCCGCCAACAGAATTC ATA
CGAGAAGGCGACGATGACCGGACTGTGTGCCGGGAGATCCGCCACAACTCCACGGGC TGC
CTGCGGATGAAGGACCAGTGTGACAAGTGCCGGGAGATCTTGTCTGTGGACTGTTCC ACC
AACAACCCCTCCCAGGCTAAGCTGCGGCGGGAGCTCGACGAATCCCTCCAGGTCGCT GAG
AGGTTGACCAGGAAATATAACGAGCTGCTAAAGTCCTACCAGTGGAAGATGCTCAAC ACC
TCCTCCTTGCTGGAGCAGCTGAACGAGCAGTTTAACTGGGTGTCCCGGCTGGCAAAC CTC
ACGCAAGGCGAAGACCAGTACTATCTGCGGGTCACCACGGTGGCTTCCCACACTTCT GAC
TCGGACGTTCCTTCCGGTGTCACTGAGGTGGTCGTGAAGCTCTTTGACTCTGATCCC ATC
ACTGTGACGGTCCCTGTAGAAGTCTCCAGGAAGAACCCTAAATTTATGGAGACCGTG GCG
GAGAAAGCGCTGCAGGAATACCGCAAAAAGCACCGGGAGGAGTGAGATGTGGATGTT GCT
TTTGCACCTACGGGGGCATCTGAGTCCAGCTCCCCCCAAGATGAGCTGCAGCCCCCC AGA
GAGAGCTCTGCACGTCACCAAGTAACCAGGC
(SEQ ID NO: 17)
As used herein, the term “CR1” refers to the gene encoding Complement receptor type 1. The terms “CR1” and "Complement receptor type 1" include wild-type forms of the CR1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CR1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CR1 nucleic acid sequence (e.g., SEQ ID NO: 18, ENA accession number Y00816). SEQ ID NO: 18 is a wild-type gene sequence encoding CR1 protein, and is shown below:
CGTGGTTTGTAGATGTGCTTGGGGAGAATGGGGGCCTCTTCTCCAAGAAGCCCGGAG CCT
GTCGGGCCGCCGGCGCCCGGTCTCCCCTTCTGCTGCGGAGGATCCCTGCTGGCGGTT GTG
GTGCTGCTTGCGCTGCCGGTGGCCTGGGGTCAATGCAATGCCCCAGAATGGCTTCCA TTT
GCCAGGCCTACCAACCTAACTGATGAGTTTGAGTTTCCCATTGGGACATATCTGAAC TAT
GAATGCCGCCCTGGTTATTCCGGAAGACCGTTTTCTATCATCTGCCTAAAAAACTCA GTC
TGGACTGGTGCTAAGGACAGGTGCAGACGTAAATCATGTCGTAATCCTCCAGATCCT GTG
AATGGCATGGTGCATGTGATCAAAGGCATCCAGTTCGGATCCCAAATTAAATATTCT TGT
ACTAAAGGATACCGACTCATTGGTTCCTCGTCTGCCACATGCATCATCTCAGGTGAT ACT
GTCATTTGGGATAATGAAACACCTATTTGTGACAGAATTCCTTGTGGGCTACCCCCC ACC
ATCACCAATGGAGATTTCATTAGCACCAACAGAGAGAATTTTCACTATGGATCAGTG GTG
ACCTACCGCTGCAATCCTGGAAGCGGAGGGAGAAAGGTGTTTGAGCTTGTGGGTGAG CCC
TCCATATACTGCACCAGCAATGACGATCAAGTGGGCATCTGGAGCGGCCCCGCCCCT CAG
TGCATTATACCTAACAAATGCACGCCTCCAAATGTGGAAAATGGAATATTGGTATCT GAC
AACAGAAGCTTATTTTCCTTAAATGAAGTTGTGGAGTTTAGGTGTCAGCCTGGCTTT GTC
ATGAAAGGACCCCGCCGTGTGAAGTGCCAGGCCCTGAACAAATGGGAGCCGGAGCTA CCA
AGCTGCTCCAGGGTATGTCAGCCACCTCCAGATGTCCTGCATGCTGAGCGTACCCAAAGG
GACAAGGACAACTTTTCACCTGGGCAGGAAGTGTTCTACAGCTGTGAGCCCGGCTAC GAC
CTCAGAGGGGCTGCGTCTATGCGCTGCACACCCCAGGGAGACTGGAGCCCTGCAGCC CCC
ACATGTGAAGTGAAATCCTGTGATGACTTCATGGGCCAACTTCTTAATGGCCGTGTG CTA
TTTCCAGTAAATCTCCAGCTTGGAGCAAAAGTGGATTTTGTTTGTGATGAAGGATTT CAA
TTAAAAGGCAGCTCTGCTAGTTACTGTGTCTTGGCTGGAATGGAAAGCCTTTGGAAT AGC
AGTGTTCCAGTGTGTGAACAAATCTTTTGTCCAAGTCCTCCAGTTATTCCTAATGGGAGA
CACACAGGAAAACCTCTGGAAGTCTTTCCCTTTGGAAAAGCAGTAAATTACACATGC GAC
CCCCACCCAGACAGAGGGACGAGCTTCGACCTCATTGGAGAGAGCACCATCCGCTGC ACA
AGTGACCCTCAAGGGAATGGGGTTTGGAGCAGCCCTGCCCCTCGCTGTGGAATTCTG GGT
CACTGTCAAGCCCCAGATCATTTTCTGTTTGCCAAGTTGAAAACCCAAACCAATGCA TCT
GACTTTCCCATTGGGACATCTTTAAAGTACGAATGCCGTCCTGAGTACTACGGGAGG CCA
TTCTCTATCACATGTCTAGATAACCTGGTCTGGTCAAGTCCCAAAGATGTCTGTAAA CGT
AAATCATGTAAAACTCCTCCAGATCCAGTGAATGGCATGGTGCATGTGATCACAGAC ATC
CAGGTTGGATCCAGAATCAACTATTCTTGTACTACAGGGCACCGACTCATTGGTCAC TCA
TCTGCTGAATGTATCCTCTCGGGCAATGCTGCCCATTGGAGCACGAAGCCGCCAATT TGT
CAACGAATTCCTTGTGGGCTACCCCCCACCATCGCCAATGGAGATTTCATTAGCACC AAC
AGAGAGAATTTTCACTATGGATCAGTGGTGACCTACCGCTGCAATCCTGGAAGCGGA GGG
AGAAAGGTGTTTGAGCTTGTGGGTGAGCCCTCCATATACTGCACCAGCAATGACGAT CAA
GTGGGCATCTGGAGCGGCCCGGCCCCTCAGTGCATTATACCTAACAAATGCACGCCT CCA
AATGTGGAAAATGGAATATTGGTATCTGACAACAGAAGCTTATTTTCCTTAAATGAA GTT
GTGGAGTTTAGGTGTCAGCCTGGCTTTGTCATGAAAGGACCCCGCCGTGTGAAGTGC CAG
GCCCTGAACAAATGGGAGCCGGAGCTACCAAGCTGCTCCAGGGTATGTCAGCCACCT CCA
GATGTCCTGCATGCTGAGCGTACCCAAAGGGACAAGGACAACTTTTCACCCGGGCAG GAA GTGTTCTACAGCT GT GAGCCCGGCTAT GACCTCAGAGGGGCT GCGTCTAT GCGCT GCACA
CCCCAGGGAGACTGGAGCCCTGCAGCCCCCACATGTGAAGTGAAATCCTGTGATGACTTC
ATGGGCCAACTTCTTAATGGCCGTGTGCTATTTCCAGTAAATCTCCAGCTTGGAGCA AAA
GTGGATTTTGTTTGTGATGAAGGATTTCAATTAAAAGGCAGCTCTGCTAGTTATTGT GTC
TTGGCTGGAATGGAAAGCCTTTGGAATAGCAGTGTTCCAGTGTGTGAACAAATCTTT TGT
CCAAGTCCTCCAGTTATTCCTAATGGGAGACACACAGGAAAACCTCTGGAAGTCTTT CCC
TTTGGAAAAGCAGTAAATTACACATGCGACCCCCACCCAGACAGAGGGACGAGCTTCGAC
CTCATTGGAGAGAGCACCATCCGCTGCACAAGTGACCCTCAAGGGAATGGGGTTTGG AGC
AGCCCTGCCCCTCGCTGTGGAATTCTGGGTCACTGTCAAGCCCCAGATCATTTTCTGTTT
GCCAAGTTGAAAACCCAAACCAATGCATCTGACTTTCCCATTGGGACATCTTTAAAG TAC
GAATGCCGTCCTGAGTACTACGGGAGGCCATTCTCTATCACATGTCTAGATAACCTG GTC
TGGTCAAGTCCCAAAGATGTCTGTAAACGTAAATCATGTAAAACTCCTCCAGATCCA GTG
AATGGCATGGTGCATGTGATCACAGACATCCAGGTTGGATCCAGAATCAACTATTCT TGT
ACTACAGGGCACCGACTCATTGGTCACTCATCTGCTGAATGTATCCTCTCAGGCAATACT
GCCCATTGGAGCACGAAGCCGCCAATTTGTCAACGAATTCCTTGTGGGCTACCCCCA ACC
ATCGCCAATGGAGATTTCATTAGCACCAACAGAGAGAATTTTCACTATGGATCAGTG GTG
ACCTACCGCTGCAATCTTGGAAGCAGAGGGAGAAAGGTGTTTGAGCTTGTGGGTGAG CCC
TCCATATACTGCACCAGCAATGACGATCAAGTGGGCATCTGGAGCGGCCCCGCCCCT CAG
TGCATTATACCTAACAAATGCACGCCTCCAAATGTGGAAAATGGAATATTGGTATCT GAC
AACAGAAGCTTATTTTCCTTAAATGAAGTTGTGGAGTTTAGGTGTCAGCCTGGCTTT GTC
ATGAAAGGACCCCGCCGTGTGAAGTGCCAGGCCCTGAACAAATGGGAGCCAGAGTTA CCA
AGCTGCTCCAGGGTGTGTCAGCCGCCTCCAGAAATCCTGCATGGTGAGCATACCCCA AGC
CATCAGGACAACTTTTCACCTGGGCAGGAAGTGTTCTACAGCTGTGAGCCTGGCTAT GAC
CTCAGAGGGGCTGCGTCTCTGCACTGCACACCCCAGGGAGACTGGAGCCCTGAAGCC CCG
AGATGTGCAGTGAAATCCTGTGATGACTTCTTGGGTCAACTCCCTCATGGCCGTGTG CTA
TTTCCACTTAATCTCCAGCTTGGGGCAAAGGTGTCCTTTGTCTGTGATGAAGGGTTT CGC
TTAAAGGGCAGTTCCGTTAGTCATTGTGTCTTGGTTGGAATGAGAAGCCTTTGGAAT AAC
AGTGTTCCTGTGTGTGAACATATCTTTTGTCCAAATCCTCCAGCTATCCTTAATGGG AGA
CACACAGGAACTCCCTCTGGAGATATTCCCTATGGAAAAGAAATATCTTACACATGT GAC
CCCCACCCAGACAGAGGGATGACCTTCAACCTCATTGGGGAGAGCACCATCCGCTGC ACA
AGTGACCCTCATGGGAATGGGGTTTGGAGCAGCCCTGCCCCTCGCTGTGAACTTTCT GTT
CGTGCTGGTCACTGTAAAACCCCAGAGCAGTTTCCATTTGCCAGTCCTACGATCCCA ATT
AATGACTTTGAGTTTCCAGTCGGGACATCTTTGAATTATGAATGCCGTCCTGGGTAT TTT
GGGAAAATGTTCTCTATCTCCTGCCTAGAAAACTTGGTCTGGTCAAGTGTTGAAGAC AAC
TGTAGACGAAAATCATGTGGACCTCCACCAGAACCCTTCAATGGAATGGTGCATATA AAC
ACAGATACACAGTTTGGATCAACAGTTAATTATTCTTGTAATGAAGGGTTTCGACTC ATT
GGTTCCCCATCTACTACTTGTCTCGTCTCAGGCAATAATGTCACATGGGATAAGAAG GCA
CCTATTTGTGAGATCATATCTTGTGAGCCACCTCCAACCATATCCAATGGAGACTTC TAC
AGCAACAATAGAACATCTTTTCACAATGGAACGGTGGTAACTTACCAGTGCCACACT GGA
CCAGATGGAGAACAGCTGTTTGAGCTTGTGGGAGAACGGTCAATATATTGCACCAGC AAA
GATGATCAAGTTGGTGTTTGGAGCAGCCCTCCCCCTCGGTGTATTTCTACTAATAAA TGC
ACAG CTCC AG AAGTT G AAAAT G C AATT AG AGT ACC AG G AAAC AG G AGTTT CTTTTCCCT C ACTGAGATCATCAGATTTAGATGTCAGCCCGGGTTTGTCATGGTAGGGTCCCACACTGTG
CAGTGCCAGACCAATGGCAGATGGGGGCCCAAGCTGCCACACTGCTCCAGGGTGTGT CAG
CCGCCTCCAGAAATCCT GCATGGT GAGCAT ACCCT AAGCC AT CAGG AC AACTTTT CACCT
GGGCAGGAAGT GTTCTACAGCT GT GAGCCCAGCTAT GACCTCAGAGGGGCT GCGTCTCT G
CACTGCACGCCCCAGGGAGACTGGAGCCCTGAAGCCCCTAGATGTACAGTGAAATCC TGT
GATGACTTCCTGGGCCAACTCCCTCATGGCCGTGTGCTACTTCCACTTAATCTCCAG CTT
GGGGCAAAGGTGTCCTTTGTTTGCGATGAAGGGTTCCGATTAAAAGGCAGGTCTGCT AGT
CATTGTGTCTTGGCTGGAATGAAAGCCCTTTGGAATAGCAGTGTTCCAGTGTGTGAA CAA
ATCTTTTGTCCAAATCCTCCAGCTATCCTTAATGGGAGACACACAGGAACTCCCTTT GGA
GAT ATTCCCT AT GG AAAAG AAATATCTT ACGCATGCG ACACCCACCCAG AC AGAGGGAT G
ACCTTCAACCTCATTGGGGAGAGCTCCATCCGCTGCACAAGTGACCCTCAAGGGAAT GGG
GTTTGGAGCAGCCCTGCCCCTCGCTGTGAACTTTCTGTTCCTGCTGCCTGCCCACAT CCA
CCCAAGATCCAAAACGGGCATTACATTGGAGGACACGTATCTCTATATCTTCCTGGG ATG
ACAATCAGCTACACTTGTGACCCCGGCTACCTGTTAGTGGGAAAGGGCTTCATTTTC TGT
ACAGACCAGGGAATCTGGAGCCAATTGGATCATTATTGCAAAGAAGTAAATTGTAGC TTC
CCACTGTTTATGAATGGAATCTCGAAGGAGTTAGAAATGAAAAAAGTATATCACTAT GGA
GATTATGTGACTTTGAAGTGTGAAGATGGGTATACTCTGGAAGGCAGTCCCTGGAGC CAG
TGCCAGGCGGATGACAGATGGGACCCTCCTCTGGCCAAATGTACCTCTCGTGCACAT GAT
G CTCT CAT AGTT G GC ACTTT ATCTG GTACG AT CTT CTTT ATTTT ACTC AT C ATTTTCCT C
TCTTGGATAATTCTAAAGCACAGAAAAGGCAATAATGCACATGAAAACCCTAAAGAA GTG
GCTATCCATTTACATTCTCAAGGAGGCAGCAGCGTTCATCCCCGAACTCTGCAAACA AAT
GAAGAAAATAGCAGGGTCCTTCCTTGACAAAGTACTATACAGCTGAAGAACATCTCG AAT
ACAATTTTGGTGGGAAAGGAGCCAATTGATTTCAACAGAATCAGATCTGAGCTTCAT AAA
GTCTTT G AAGT GACTT CACAG AG ACGCAG AC AT GTGCACTT GAAGATGCT GCCCCTTCCC
TGGTACCTAGCAAAGCTCCTGCCTCTTTGTGTGCGTCACTGTGAAACCCCCACCCTT CTG
CCTCGTGCTAAACGCACACAGTATCTAGTCAGGGGAAAAGACTGCATTTAGGAGATA GAA
AATAGTTTGGATTACTTAAAGGAATAAGGTGTTGCCTGGAATTTCTGGTTTGTAAGG TGG
TCACTGTTCTTTTTTAAAATATTTGTAATATGGAATGGGCTCAGTAAGAAGAGCTTG GAA
AAT GCAGAAAGTT AT G AAAAAT AAGTCACTT AT AATT AT GCT ACCT ACT GAT AACCACTC
CT AAT ATTTT GATT C ATTTT CTGCCTATCTT CTTT C AC AT ATGT GTTTTTTT AC AT ACGT
ACTTTTCCCCCCTTAGTTTGTTTCCTTTTATTTTATAGAGCAGAACCCTAGTCTTTT AAA
C AGTTT AG AGT G AAAT ATATGCTAT AT C AGTTTTT ACTTT CTCT AGG G AG AAAAATT AAT
TT ACT AG AAAG GC AT G AAAT GAT C ATGG G AAG AGTG GTT AAG ACT ACT G AAG AG AAAT AT
TTGGAAAATAAGATTTCGATATCTTCTTTTTTTTTGAGATGGAGTCTGGCTCTGTCT CCC
AGGCTGGAGTGCAGTGGCGTAATCTCGGCTCACTGCAACGTCCGCCTCCCG
(SEQ ID NO: 18)
As used herein, the term “CSF1” refers to the gene encoding Macrophage colony-stimulating factor 1. The terms “CSF1” and "Macrophage colony-stimulating factor 1" include wild-type forms of the CSF1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CSF1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CSF1 nucleic acid sequence (e.g., SEQ ID NO: 19, ENA accession number M37435). SEQ ID NO: 19 is a wild-type gene sequence encoding CSF1 protein, and is shown below:
CCTGGGTCCTCTCGGCGCCAGAGCCGCTCTCCGCATCCCAGGACAGCGGTGCGGCCC TCG
GCCGGGGCGCCCACTCCGCAGCAGCCAGCGAGCCAGCTGCCCCGTATGACCGCGCCG GGC
GCCGCCGGGCGCTGCCCTCCCACGACATGGCTGGGCTCCCTGCTGTTGTTGGTCTGT CTC
CTGGCGAGCAGGAGTATCACCGAGGAGGTGTCGGAGTACTGTAGCCACATGATTGGG AGT
GGACACCTGCAGTCTCTGCAGCGGCTGATTGACAGTCAGATGGAGACCTCGTGCCAA ATT
ACATTTGAGTTTGTAGACCAGGAACAGTTGAAAGATCCAGTGTGCTACCTTAAGAAG GCA
TTTCTCCTGGTACAAGACATAATGGAGGACACCATGCGCTTCAGAGATAACACCGCC AAT
CCCATCGCCATTGTGCAGCTGCAGGAACTCTCTTTGAGGCTGAAGAGCTGCTTCACC AAG
G ATTAT GAAGAGC AT GACAAGGCCTGCGTCCG AACTTT CTAT GAGACACCT CTCCAGTT G
CT GG AG AAG GT C AAG AATGTCTTT AAT G AAAC AAAG AAT CTCCTT G AC AAG G ACT GG AAT
ATTTTCAGCAAGAACTGCAACAACAGCTTTGCTGAATGCTCCAGCCAAGATGTGGTG ACC
AAGCCTGATTGCAACTGCCTGTACCCCAAAGCCATCCCTAGCAGTGACCCGGCCTCT GTC
TCCCCTCATCAGCCCCTCGCCCCCTCCATGGCCCCTGTGGCTGGCTTGACCTGGGAG GAC
TCTGAGGGAACTGAGGGCAGCTCCCTCTTGCCTGGTGAGCAGCCCCTGCACACAGTG GAT
CCAGGCAGTGCCAAGCAGCGGCCACCCAGGAGCACCTGCCAGAGCTTTGAGCCGCCA GAG
ACCCCAGTTGTCAAGGACAGCACCATCGGTGGCTCACCACAGCCTCGCCCCTCTGTC GGG
GCCTTCAACCCCGGGATGGAGGATATTCTTGACTCTGCAATGGGCACTAATTGGGTC CCA
GAAGAAGCCTCTGGAGAGGCCAGTGAGATTCCCGTACCCCAAGGGACAGAGCTTTCC CCC
TCCAGGCCAGGAGGGGGCAGCATGCAGACAGAGCCCGCCAGACCCAGCAACTTCCTC TCA
GCATCTTCTCCACTCCCTGCATCAGCAAAGGGCCAACAGCCGGCAGATGTAACTGCT ACA
GCCTTGCCCAGGGTGGGCCCCGTGATGCCCACTGGCCAGGACTGGAATCACACCCCC CAG
AAGACAGACCATCCATCTGCCCTGCTCAGAGACCCCCCGGAGCCAGGCTCTCCCAGG ATC
TCATCACTGCGCCCCCAGGCCCTCAGCAACCCCTCCACCCTCTCTGCTCAGCCACAG CTT
TCCAGAAGCCACTCCTCGGGCAGCGTGCTGCCCCTTGGGGAGCTGGAGGGCAGGAGG AGC
ACCAGGGATCGGACGAGCCCCGCAGAGCCAGAAGCAGCACCAGCAAGTGAAGGGGCA GCC
AGGCCCCTGCCCCGTTTTAACTCCGTTCCTTTGACTGACACAGGCCATGAGAGGCAG TCC
GAGGGATCCTCCAGCCCGCAGCTCCAGGAGTCTGTCTTCCACCTGCTGGTGCCCAGT GTC
ATCCTGGTCTTGCTGGCTGTCGGAGGCCTCTTGTTCTACAGGTGGAGGCGGCGGAGC CAT
CAAGAGCCTCAGAGAGCGGATTCTCCCTTGGAGCAACCAGAGGGCAGCCCCCTGACT CAG
GATGACAGACAGGTGGAACTGCCAGTGTAGAGGGAATTCTAAGCTGGACGCACAGAA CAG
TCTCTTCGTGGGAGGAGACATTATGGGGCGTCCACCACCACCCCTCCCTGGCCATCC TCC
T GGAAT GT GGTCT GCCCTCCACCAGAGCTCCT GCCT GCCAGGACTGGACCAGAGCAGCCA
GGCTGGGGCCCCTCTGTCTCAACCCGCAGACCCTTGACTGAATGAGAGAGGCCAGAG GAT
GCTCCCCATGCTGCCACTATTTATTGTGAGCCCTGGAGGCTCCCATGTGCTTGAGGA AGG
CTGGTGAGCCCGGCTCAGGACCCTCTTCCCTCAGGGGCTGCAGCCTCCTCTCACTCC CTT
CCATGCCGGAACCCAGGCCAGGGACCCACCGGCCTGTGGTTTGTGGGAAAGCAGGGT GCA
CGCTGAGGAGTGAAACAACCCTGCACCCAGAGGGCCTGCCTGGTGCCAAGGTATCCC AGC
CTGGACAGGCATGGACCTGTCTCCAGACAGAGGAGCCTGAAGTTCGTGGGGCGGGAC AGC CTCGGCCT GATTTCCCGTAAAGGT GT GCAGCCT GAGAGACGGGAAGAGGAGGCCTCT GCA
CCTGCTGGTCTGCACTGACAGCCTGAAGGGTCTACACCCTCGGCTCACCTAAGTCCC TGT
GCTGGTTGCCAGGCCCAGAGGGGAGGCCAGCCCTGCCCTCAGGACCTGCCTGACCTG CCA
GTGATGCCAAGAGGGGGATCAAGCACTGGCCTCTGCCCCTCCTCCTTCCAGCACCTG CCA
GAGCTTCTCCAGCAGGCCAAGCAGAGGCTCCCCTCATGAAGGAAGCCATTGCACTGT GAA
CACTGTACCTGCCTGCTGAACAGCCTCCCCCCGTCCATCCATGAGCCAGCATCCGTC CGT
CCTCCACTCTCCAGCCTCTCCCCAGCCTCCTGCACTGAGCTGGCCTCACCAGTCGAC TGA
GGGAGCCCCTCAGCCCTGACCTTCTCCTGACCTGGCCTTTGACTCCCCGGAGTGGAG TGG
GGTGGGAGAACCTCCTGGGCCGCCAGCCAGAGCCGCTCTTTAGGCTGTGTTCTTCGC CCA
GGTTTCTGCATCTTCCACTTTGACATTCCCAAGAGGGAAGGGACTAGTGGGAGAGAG CAA
GGGAGGGGAGGGCACAGACAGAGAGCCTACAGGGCGAGCTCTGACTGAAGATGGGCC TTT
GAAATATAGGTATGCACCTGAGGTTGGGGGAGGGTCTGCACTCCCAAACCCCAGCGC AGT
GTCCTTTCCCTGCTGCCGACAGGAACCTGGGGCTGAGCAGGTTATCCCTGTCAGGAG CCC
TGGACTGGGCTGCATCTCAGCCCCACCTGCATGGTATCCAGCTCCCATCCACTTCTC ACC
CTTCTTTCCTCCTGACCTTGGTCAGCAGTGATGACCTCCAACTCTCACCCACCCCCT CTA
CCATCACCTCTAACCAGGCAAGCCAGGGTGGGAGAGCAATCAGGAGAGCCAGGCCTC AGC
TTCCAATGCCTGGAGGGCCTCCACTTTGTGGCCAGCCTGTGGTGCTGGCTCTGAGGC CTA
GGCAACGAGCGACAGGGCTGCCAGTTGCCCCTGGGTTCCTTTGTGCTGCTGTGTGCC TCC
TCTCCTGCCGCCCTTTGTCCTCCGCTAAGAGACCCTGCCCTACCTGGCCGCTGGGCC CCG
TGACTTTCCCTTCCTGCCCAGGAAAGTGAGGGTCGGCTGGCCCCACCTTCCCTGTCC TGA
TGCCGACAGCTTAGGGAAGGGCACTGAACTTGCATATGGGGCTTAGCCTTCTAGTCA CAG
CCTCTAT ATTT G ATGCT AG AAAAC AC AT ATTTTT AAAT G G AAG AAAAAT AAAAAGG C ATT
CCCCCTTCATCCCCCTACCTTAAACATATAATATTTTAAAGGTCAAAAAAGCAATCC AAC
CCACTGCAGAAGCTCTTTTTGAGCACTTGGTGGCATCAGAGCAGGAGGAGCCCCAGA GCC
ACCTCTGGTGTCCCCCAGGCTACCTGCTCAGGAACCCCTTCTGTTCTCTGAGAACTC AAC
AG AGG AC ATT GGCTCACGCACTGT G AG ATTTT GTTTTT AT ACTT G C AACT GGT G AATT AT
TTTTT AT AAAGTC ATTT AAAT ATCT ATTT AAAAG AT AGG AAG CTG CTTATAT ATTT AAT A
ATAAAAGAAGTGCACAAGCTGCCGTTGACGTAGCTCGAG
(SEQ ID NO: 19)
As used herein, the term “CST7” refers to the gene encoding Cystatin-F. The terms “CST7” and "Cystatin-F" include wild-type forms of the CST7 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CST7. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CST7 nucleic acid sequence (e.g., SEQ ID NO: 20, ENA accession number AF031824). SEQ ID NO: 20 is a wild-type gene sequence encoding CST7 protein, and is shown below:
GGCTCAGCACAGGCACAAACCATT GCCCGGCACT GGCCCGT GOT GCCTGAGAAGGATTGG CACGGGCACAGACCACTGCCCCCACCTGCCCTGCGCCATCTACCCAAGAAGGCTCGGCAC GGGCACCAACCACTGCCTCCAACTGCCCCATGCTGCCTGAGAAGGCACTGCACGGCCACC CCCAACTGCCCCGCACTGTCCCTACCCGGGCAGCCATGCGAGCGGCTGGAACTCTGCTGG
CCTTCTGCTGCCTGGTCTTGAGCACCACTGGGGGCCCTTCCCCAGATACTTGTTCCC AGG
ACCTTAACTCACGTGTGAAGCCAGGATTTCCTAAAACAATAAAGACCAATGACCCAG GAG
TCCTCCAAGCAGCCAGATACAGTGTTGAAAAGTTCAACAACTGCACGAACGACATGT TCT
TGTTCAAGGAGTCCCGCATCACAAGGGCCCTAGTTCAGATAGTGAAAGGCCTGAAAT ATA
TGCTGGAGGTGGAAATTGGCAGAACTACCTGCAAGAAAAACCAGCACCTGCGTCTGG ATG
ACTGTGACTTCCAAACC AACCACACCTT G AAGCAG ACT CT G AGCTGCTACT CT GAAGTCT
GGGTCGTGCCCTGGCTCCAGCACTTCGAGGTGCCTGTTCTCCGTTGTCACTGACCCC CGC
CTCTTCAGCAAGACCACAGCCATGACAAACACCAGGATGCATGCTCCTTGTCCCCTC CCA
CCCGCCTCATGACCCAGCCTCACAGACCCTCTCAGGCCTCTGACGAGTGAGCGGGTG AAG
TGCCACTGGGTCACCGCAGGGCAGCTGGAATGGCAGCATGGTAGCACCTCCTAACAG ATT
AAATAGATCACATTTGCTTCTAAAATT
(SEQ ID NO: 20)
As used herein, the term “CTSB” refers to the gene encoding Cathepsin B. The terms “CTSB” and "Cathepsin B" include wild-type forms of the CTSB gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CTSB. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CTSB nucleic acid sequence (e.g., SEQ ID NO: 21, ENA accession number M14221). SEQ ID NO: 21 is a wild-type gene sequence encoding CTSB protein, and is shown below:
AATTCCGCGGCAACCGCTCCGGCAACGCCAACCGCTCCGCTGCGCGCAGGCTGGGCT GCA
GGCTCTCGGCTGCAGCGCTGGGCTGGTGTGCAGTGGTGCGACCACGGCTCACGGCAG CCT
CAGCCACCCAGATGTAAGCGATCTGGTTCCCACCTCAGCCTTCCGAGTAGTGGATCT AGG
ATCTGGCTTCCAACATGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTGGTGTTGG CCA
ATGCCCGGAGCAGGCCCTCTTTCCATCCCGTGTCGGATGAGCTGGTCAACTATGTCA ACA
AACGGAATACCACGTGGCAGGCCGGGCACAACTTCTACAACGTGGACATGAGCTACT TGA
AGAGGCTATGTGGTACCTTCCTGGGTGGGCCCAAGCCACCCCAGAGAGTTATGTTTA CCG
AGGACCTGAAGCTGCCTGCAAGCTTCGATGCACGGGAACAATGGCCACAGTGTCCCA CCA
TCAAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGCCTTCGGGGCTGTGG AAG
CCATCTCTGACCGCATCTGCATCCACACCAATGCGCACGTCAGCGTGGAGGTGTCGG CGG
AGGACCTGCTCACCTGCTGTGGCAGCATGTGTGGGGACGGCTGTAATGGTGGCTATC CTG
CTGAAGCTTGGAACTTCTGGACAAGAAAAGGCCTGGTTTCTGGTGGCCTCTATGAAT CCC
ATGTAGGGTGCAGACCGTACTCCATCCCTCCCTGTGAGCACCACGTCAACGGCTCCC GGC
CCCCATGCACGGGGGAGGGAGATACCCCCAAGTGTAGCAAGATCTGTGAGCCTGGCT ACA
GCCCGACCTACAAACAGGACAAGCACTACGGATACAATTCCTACAGCGTCTCCAATA GCG
AGAAGGACATCATGGCCGAGATCTACAAAAACGGCCCCGTGGAGGGAGCTTTCTCTG TGT
ATTCGGACTTCCTGCTCTACAAGTCAGGAGTGTACCAACACGTCACCGGAGAGATGA TGG
GTGGCCATGCCATCCGCATCCTGGGCTGGGGAGTGGAGAATGGCACACCCTACTGGC TGG
TTGCCAACTCCT GG AAC ACT GACT GGGGTG AC AAT GGCTTCTTT AAAAT ACT CAGAGGAC AGGATCACTGCGGAATCGAATCAGAAGTGGTGGCTGGAATTCCACGCACCGATCAGTACT
GGGAAAAGATCTAATCTGCCGTGGGCCTGTCGTGCCAGTCCTGGGGGCGAGATCGGG GTA
G AAAGT C ATTTT ATT CTTT AAGTT C ACGTAAG AT AC AAGTTT CAGGCAGGGTCT G AAG G A
CTGGATTGGCCAAAGTCCTCCAAGGAGACCAAGTCCTGGCTACATCCCAGCCTGTGG TTA
CAGTGCAGACAGGCCATGTGAGCCACCGCTGCCAGCACAGAGCGTCCTTCCCCCTGT AGA
CTAGTGCCGTGGGAGTACCTGCTGCCCAGCTGCTGTGGCCCCCTCCGTGATCCATCC ATC
TCCAGGGAGCAAGACAGAGACGCAGGATGGAAAGCGGAGTTCCTAACAGGATGAAAG TTC
CCCCATCAGTTCCCCCAGTACCTCCAAGCAAGTAGCTTTCCACATTTGTCACAGAAA TCA
GAGGAGAGATGGTGTTGGGAGCCCTTTGGAGAACGCCAGTCTCCAGGTCCCCCTGCA TCT
ATCGAGTTTGCAATGTCACAACCTCTCTGATCTTGTGCTCAGCATGATTCTTTAATA GAA
GTTTTATTTTTCGTGCACTCTGCTAATCATGTGGGTGAGCCAGTGGAACAGCGGGAG CCT
GTGCTGGTTTGCAGATTGCCTCCTAATGACGCGGCTCAAAAGGAAACCAAGTGGTCA GGA
GTTGTTTCTGACCCACTGATCTCTACTACCACAAGGAAAATAGTTTAGGAGAAACCA GCT
TTTACTGTTTTTGAAAAATTACAGCTTCACCCTGTCAAGTTAACAAGGAATGCCTGT GCC
AATAAAAGGTTTCTCCAACTTG
(SEQ ID NO: 21)
As used herein, the term “CTSD” refers to the gene encoding Cathepsin D. The terms “CTSD” and "Cathepsin D" include wild-type forms of the CTSD gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CTSD. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CTSD nucleic acid sequence (e.g., SEQ ID NO: 22, ENA accession number M11233). SEQ ID NO: 22 is a wild-type gene sequence encoding CTSD protein, and is shown below:
GGCTATAAGCGCACGGCCTCGGCGACCCTCTCCGACCCGGCCGCCGCCGCCATGCAG CCC
TCCAGCCTTCTGCCGCTCGCCCTCTGCCTGCTGGCTGCACCCGCCTCCGCGCTCGTC AGG
ATCCCGCTGCACAAGTTCACGTCCATCCGCCGGACCATGTCGGAGGTTGGGGGCTCT GTG
GAGGACCTGATTGCCAAAGGCCCCGTCTCAAAGTACTCCCAGGCGGTGCCAGCCGTG ACC
GAGGGGCCCATTCCCGAGGTGCTCAAGAACTACATGGACGCCCAGTACTACGGGGAG ATT
GGCATCGGGACGCCCCCCCAGTGCTTCACAGTCGTCTTCGACACGGGCTCCTCCAAC CTG
TGGGTCCCCTCCATCCACTGCAAACTGCTGGACATCGCTTGCTGGATCCACCACAAG TAC
AACAGCGACAAGTCCAGCACCTACGTGAAGAATGGTACCTCGTTTGACATCCACTAT GGC
TCGGGCAGCCTCTCCGGGTACCTGAGCCAGGACACTGTGTCGGTGCCCTGCCAGTCA GCG
TCGTCAGCCTCTGCCCTGGGCGGTGTCAAAGTGGAGAGGCAGGTCTTTGGGGAGGCC ACC
AAGCAGCCAGGCATCACCTTCATCGCAGCCAAGTTCGATGGCATCCTGGGCATGGCC TAC
CCCCGCATCTCCGTCAACAACGT GCTGCCCGT CTTCGACAACCT GAT GCAGCAGAAGCT G
GTGGACCAGAACATCTTCTCCTTCTACCTGAGCAGGGACCCAGATGCGCAGCCTGGG GGT
GAGCTGATGCTGGGTGGCACAGACTCCAAGTATTACAAGGGTTCTCTGTCCTACCTG AAT
GTCACCCGCAAGGCCTACTGGCAGGTCCACCTGGACCAGGTGGAGGTGGCCAGCGGG CTG
ACCCTGTGCAAGGAGGGCTGTGAGGCCATTGTGGACACAGGCACTTCCCTCATGGTG GGC CCGGTGGATGAGGTGCGCGAGCTGCAGAAGGCCATCGGGGCCGTGCCGCTGATTCAGGGC GAGTACATGATCCCCTGTGAGAAGGTGTCCACCCTGCCCGCGATCACACTGAAGCTGGGA GGCAAAGGCTACAAGCTGTCCCCAGAGGACTACACGCTCAAGGTGTCGCAGGCCGGGAAG ACCCTCTGCCTGAGCGGCTTCATGGGCATGGACATCCCGCCACCCAGCGGGCCACTCTGG ATCCTGGGCGACGTCTTCATCGGCCGCTACTACACTGTGTTTGACCGTGACAACAACAGG GTGGGCTTCGCCGAGGCTGCCCGCCTCTAGTTCCCAAGGCGTCCGCGCGCCAGCACAGAA ACAGAGGAGAGTCCCAGAGCAGGAGGCCCCTGGCCCAGCGGCCCCTCCCACACACACCCA CACACTCGCCCGCCCACTGTCCTGGGCGCCCTGGAAGCCGGCGGCCCAAGCCCGACTTGC TGTTTTGTTCTGTGGTTTTCCCCTCCCTGGGTTCAGAAATGCTGCCTGCCTGTCTGTCTC TCCATCTGTTTGGTGGGGGTAGAGCTGATCCAGAGCACAGATCTGTTTCGTGCATTGGAA GACCCCACCCAAGCTTGGCAGCCGAGCTCGTGTATCCTGGGGCTCCCTTCATCTCCAGGG AGTCCCCTCCCCGGCCCTACCAGCGCCCGCTGGGCTGAGCCCCTACCCCACACCAGGCCG TCCTCCCGGGCCCTCCCTTGGAAACCTGCCCTGCCTGAGGGCCCCTCTGCCCAGCTTGGG CCCAGCTGGGCTCTGCCACCCTACCTGTTCAGTGTCCCGGGCCCGTTGAGGATGAGGCCG CTAGAGGCCTGAGGATGAGCTGGAAGGAGTGAGAGGGGACAAAACCCACCTTGTTGGAGC CTGCAGGGTGGTGCTGGGACTGAGCCAGTCCCAGGGGCATGTATTGGCCTGGAGGTGGGG TTGGGATTGGGGGCTGGTGCCAGCCTTCCTCTGCAGCTGACCTCTGTTGTCCTCCCCTTG GGCGGCTGAGAGCCCCAGCTGACATGGAAATACAGTTGTTGGCCTCCGGCCTCCCCTC (SEQ ID NO: 22)
As used herein, the term “CTSL” refers to the gene encoding Cathepsin L1. The terms “CTSL” and "Cathepsin L1" include wild-type forms of the CTSL gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CTSL. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CTSL nucleic acid sequence (e.g., SEQ ID NO: 23, ENA accession number X12451). SEQ ID NO: 23 is a wild-type gene sequence encoding CTSL protein, and is shown below:
AGAACCGCGACCTCCGCAACCTTGAGCGGCATCCGTGGAGTGCGCCTGCAGCTACGA CCG
CAGCAGGAAAGCGCCGCCGGCCAGGCCCAGCTGTGGCCGGACAGGGACTGGAAGAGA GGA
CGCGGTCGAGTAGGTGTGCACCAGCCCTGGCAACGAGAGCGTCTACCCCGAACTCTG CTG
GCCTTGAGGTGGGGAAGCCGGGGAGGGCAGTTGAGGACCCCGCGGAGGCGCGTGACT GGT
TGAGCGGGCAGGCCAGCCTCCGAGCCGGGTGGACACAGGTTTTAAAACATGAATCCT ACA
CTCATCCTTGCTGCCTTTTGCCTGGGAATTGCCTCAGCTACTCTAACATTTGATCAC AGT
TTAGAGGCACAGTGGACCAAGTGGAAGGCGATGCACAACAGATTATACGGCATGAAT GAA
G AAGGATGGAGG AGAGCAGTGTGGGAGAAG AAC AT GAAG AT GATT GAACT GCACAAT CAG
GAATACAGGGAAGGGAAACACAGCTTCACAATGGCCATGAACGCCTTTGGAGACATG ACC
AGTGAAGAATTCAGGCAGGTGATGAATGGCTTTCAAAACCGTAAGCCCAGGAAGGGG AAA
GTGTTCCAGGAACCTCTGTTTTATGAGGCCCCCAGATCTGTGGATTGGAGAGAGAAA GGC
TACGTGACTCCTGTGAAGAATCAGGGTCAGTGTGGTTCTTGTTGGGCTTTTAGTGCT ACT
GGTGCTCTTGAAGGACAGATGTTCCGGAAAACTGGGAGGCTTATCTCACTGAGTGAG CAG AATCTGGTAGACTGCTCTGGGCCTCAAGGCAATGAAGGCTGCAATGGTGGCCTAATGGAT
TATGCTTTCCAGTATGTTCAGGATAATGGAGGCCTGGACTCTGAGGAATCCTATCCA TAT
GAGGCAACAGAAGAATCCTGTAAGTACAATCCCAAGTATTCTGTTGCTAATGACACC GGC
TTTGTGGACATCCCTAAGCAGGAGAAGGCCCTGATGAAGGCAGTTGCAACTGTGGGG CCC
ATTTCTGTTGCTATTGATGCAGGTCATGAGTCCTTCCTGTTCTATAAAGAAGGCATT TAT
TTTGAGCCAGACTGTAGCAGTGAAGACATGGATCATGGTGTGCTGGTGGTTGGCTAC GGA
TTT G AAAG C AC AG AAT C AG AT AAC AAT AAAT ATT G GCTGGT G AAG AAC AGCT G GG GT G AA
GAATGGGGCATGGGTGGCTACGTAAAGATGGCCAAAGACCGGAGAAACCATTGTGGA ATT
GCCTCAGCAGCCAGCTACCCCACTGTGTGAGCTGGTGGACGGTGATGAGGAAGGACT TGA
CTGGGGATGGCGCATGCATGGGAGGAATTCATCTTCAGTCTACCAGCCCCCGCTGTG TCG
GAT ACACACTCG AAT CATT GAAG ATCCG AGT GT GATTT GAATT CTGTG AT ATTTTCACAC
TGGT AAAT GTTACCTCT ATTTT AATT ACTGCTAT AAAT AG GTTT AT ATT ATT GATT C ACT
TACT G ACTTT GC ATTTTCGTTTTT AAAAG GAT GTAT AAATTTTT ACCTGTTT AAAT AAAA
TTTAATTTCAAATGT
(SEQ ID NO: 23)
As used herein, the term “CXCL10” refers to the gene encoding C-X-C motif chemokine 10. The terms “CXCL10” and "C-X-C motif chemokine 10" include wild-type forms of the CXCL10 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CXCL10. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CXCL10 nucleic acid sequence (e.g., SEQ ID NO: 24, ENA accession number X02530). SEQ ID NO: 24 is a wild-type gene sequence encoding CXCL10 protein, and is shown below:
G AG AC ATTCCT C AATT G CTT AG AC AT ATT CTGAGCCT AC AG C AG AG GAACCTCCAGTCTC
AGCACCATGAATCAAACTGCGATTCTGATTTGCTGCCTTATCTTTCTGACTCTAAGT GGC
ATTCAAGGAGTACCTCTCTCTAGAACCGTACGCTGTACCTGCATCAGCATTAGTAAT CAA
CCTGTT AATCC AAG GT CTTT AG AAAAACTT G AAATT ATTCCT G C AAG CC AATTTT GTCC A
CGTGTTGAGATCATTGCTACAATGAAAAAGAAGGGTGAGAAGAGATGTCTGAATCCA GAA
TCGAAGGCCATCAAGAATTTACTGAAAGCAGTTAGCAAGGAAATGTCTAAAAGATCT CCT
TAAAACCAGAGGGGAGCAAAATCGATGCAGTGCTTCCAAGGATGGACCACACAGAGG CTG
CCTCTCCCATCACTTCCCTACATGGAGTATATGTCAAGCCATAATTGTTCTTAGTTT GCA
GTTACACTAAAAGGTGACCAATGATGGTCACCAAATCAGCTGCTACTACTCCTGTAG GAA
GGTTAATGTTCATCATCCTAAGCTATTCAGTAATAACTCTACCCTGGCACTATAATG TAA
GCTCTACTGAGGTGCTATGTTCTTAGTGGATGTTCTGACCCTGCTTCAAATATTTCC CTC
ACCTTTCCCATCTTCCAAGGGTACTAAGGAATCTTTCTGCTTTGGGGTTTATCAGAA TTC
T C AG AAT CT C AAAT AACT AAAAGGTATGC AAT C AAAT CTG CTTTTT AAAG AAT G CT CTTT
ACTTCATGGACTTCCACTGCCATCCTCCCAAGGGGCCCAAATTCTTTCAGTGGCTAC CTA
CATACAATTCCAAACACATACAGGAAGGTAGAAATATCTGAAAATGTATGTGTAAGT ATT
CTT ATTT AAT G AAAG ACTGTAC AAAGTAT AAGTCTT AG AT GTATAT ATTTCCT AT ATT GT
TTTCAGTGTACATGGAATAACATGTAATTAAGTACTATGTATCAATGAGTAACAGGAAAA
TTTTAAAAATACAGATAGATATATGCTCTGCATGTTACATAAGATAAATGTGCTGAATGG
TTTTCAAATAAAAATGAGGTACTCTCCTGGAAATATTAAGAAAGACTATCTAAATGTTGA
AAGATCAAAAGGTTAATAAAGTAATTATAACT
(SEQ ID NO: 24)
As used herein, the term “CXCL13” refers to the gene encoding C-X-C motif chemokine 13. The terms “CXCL13” and "C-X-C motif chemokine 13" include wild-type forms of the CXCL13 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type CXCL13. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type CXCL13 nucleic acid sequence (e.g., SEQ ID NO: 25, ENA accession number AF044197). SEQ ID NO: 25 is a wild-type gene sequence encoding CXCL13 protein, and is shown below:
TTCGGCACTTGGGAGAAGATGTTTGAAAAAACTGACTCTGCTAATGAGCCTGGACTC AGA
GCTCAAGTCTGAACTCTACCTCCAGACAGAATGAAGTTCATCTCGACATCTCTGCTT CTC
ATGCTGCTGGTCAGCAGCCTCTCTCCAGTCCAAGGTGTTCTGGAGGTCTATTACACA AGC
TTGAGGTGTAGATGTGTCCAAGAGAGCTCAGTCTTTATCCCTAGACGCTTCATTGAT CGA
ATTCAAATCTTGCCCCGTGGGAATGGTTGTCCAAGAAAAGAAATCATAGTCTGGAAG AAG
AACAAGTCAATTGTGTGTGTGGACCCTCAAGCTGAATGGATACAAAGAATGATGGAA GTA
TT GAGAAAAAGAAGTTCTT CAACTCT ACC AGTTCCAGTGTTT AAGAGAAAG ATTCCCT GA
TGCTGATATTTCCACTAAGAACACCTGCATTCTTCCCTTATCCCTGCTCTGGATTTT AGT
TTTGTGCTTAGTTAAATCTTTTCCAGGGAGAAAGAACTTCCCCATACAAATAAGGCA TGA
GGACTATGTGAAAAATAACCTTGCAGGAGCTGATGGGGCAAACTCAAGCTTCTTCAC TCA
CAGCACCCTAT AT ACACTT GG AGTTTGCATT CTT ATT CAT CAGGGAGG AAAGTTT CTTT G
AAAAT AGTT ATT C AGTTATAAGT AAT AC AGG ATT ATTTT GATT AT AT ACTTGTT GTTT AA
T GTTT AAAATTT CTT AG AAAAC AAT G G AAT G AG AATTT AAGCC T C AAATTT G AAC AT GTG
G CTT G AATT AAG AAG AAAATT AT G GC AT AT ATT AAAAGC AGG CTT CTAT G AAAG ACT C AA
AAAGCTGCCTGGGAGGCAGATGGAACTTGAGCCTGTCAAGAGGCAAAGGAATCCATG TAG
T AG AT ATCCTCTGCTT AAAAACT C ACT ACG G AG GAG AATT AAGTCCT ACTTTT AAAG AAT
TT CTTT AT AAAATTT ACTGTCT AAG ATT AAT AG C ATTCG AAG ATCCCC AG ACTT CAT AG A
ATACTCAGGGAAAGCATTTAAAGGGTGATGTACACATGTATCCTTTCACACATTTGC CTT
GACAAACTTCTTTCACTCACATCTTTTTCACTGACTTTTTTTGTGGGGGCGGGGCCG GGG
G G ACT CTGGTATCT AATT CTTT AAT G ATTCCT AT AAAT CT AAT G AC ATT C AAT AAAGTT G
AGCAAACATTTT ACTT
(SEQ ID NO: 25)
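The "at least 70% sequence identity" criterion used in the definitions above can be illustrated with a short calculation. The following is a minimal sketch, assuming the two nucleic acid sequences have already been aligned to equal length (with '-' marking gaps) and assuming identity is counted as matching columns divided by aligned columns. The text here does not state which alignment method or identity convention applies, and the function names `percent_identity` and `meets_identity_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: a gap ('-') opposite a base counts as a mismatch, and
    columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # skip columns that are gaps in both sequences
        columns += 1
        if x == y:
            matches += 1  # identical bases in this column
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under these assumptions, a candidate variant would first be aligned to the reference sequence (for example, SEQ ID NO: 25) with an alignment tool of choice, and the aligned strings would then be passed to `meets_identity_threshold`. Other identity conventions, such as dividing by the length of the shorter sequence, can give different percentages for the same pair.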
As used herein, the term “DSG2” refers to the gene encoding Desmoglein 2. The terms “DSG2” and "Desmoglein 2" include wild-type forms of the DSG2 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type DSG2. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type DSG2 nucleic acid sequence (e.g., SEQ ID NO: 26, NCBI Reference Sequence: NM_001943.4). SEQ ID NO: 26 is a wild-type gene sequence encoding DSG2 protein, and is shown below:
CCACCTCTGTAAAAGCGGCCCGGGCCGGCCCCCGGCTCCATTTTCTCGCGGCGGCCA CACCTGGA
GCCGCGCCTTTGGGTTGGGCTGGGCTGGGCCGCGCAACCGCCACGGGAAGACAGCCC TCGGGGC
GGGGAGGGAGAGGGTGGCCGGGCCGGGGGGAGGCCGGGGCCAGGGAGGAGCCGAGTG CGCGC
TCGGGGCAGGCGGCGGCGCGGAGCGGTGCGGCGGCGGGAGGCGGAGGCGAGGGTGCG ATGGC
GCGGAGCCCGGGACGCGCGTACGCCCTGCTGCTTCTCCTGATCTGCTTTAACGTTGG AAGTGGACT
T C ACTT AC AG GTCTT AAG C AC AAG AAAT G AAAAT AAGCTGCTTCCT AAAC AT C CT C ATTT AGTGCGGC
AAAAGCGCGCCTGGATCACCGCCCCCGTGGCTCTTCGGGAGGGAGAGGATCTGTCCA AGAAGAAT
CC AATT G CC AAG AT AC ATT CTG ATCTTG C AG AAG AAAG AG G ACT C AAAATT ACTT AC AAAT AC ACTGG
AAAAGGGATTACAGAGCCACCTTTTGGTATATTTGTCTTTAACAAAGATACTGGAGA ACTGAATGTTA
CCAGCATTCTTGATCGAGAAGAAACACCATTTTTTCTGCTAACAGGTTACGCTTTGG ATGCAAGAGGA
AACAATGTAGAGAAACCCTTAGAGCTACGCATTAAGGTTCTTGATATCAATGACAAC GAACCAGTGTT
CACACAGGATGTCTTTGTTGGGTCTGTTGAAGAGTTGAGTGCAGCACATACTCTTGT GATGAAAATCA
ATGCAACAGATGCAGATGAGCCCAATACCCTGAATTCGAAAATTTCCTATAGAATCG TATCTCTGGAG
CCTGCTTATCCTCCAGTGTTCTACCTAAATAAAGATACAGGAGAGATTTATACAACC AGTGTTACCTT
G G ACAG AG AG G AAC AC AGC AGCT AC ACTTT G AC AGT AG AAGC AAG AG AT G G C AAT G GAG AAG TT AC
AGACAAACCTGTAAAAC AAGCT CAAGTT CAGATTCGT ATTTTGG AT GT CAAT G ACAAT AT ACCTGTAG
TAGAAAATAAAGTGCTTGAAGGGATGGTTGAAGAAAATCAAGTCAACGTAGAAGTTA CGCGCATAAA
AGT GTTCG AT G C AG AT G AAAT AGGTTCT GAT AATT G GCT G GC AAATTTT AC ATTT G CAT C AG G AAAT G
AAGGAGGTTATTTCCACATAGAAACAGATGCTCAAACTAACGAAGGAATTGTGACCC TTATTAAGGAA
GTAGATTATGAAGAAATGAAGAATCTTGACTTCAGTGTTATTGTCGCTAATAAAGCA GCTTTTCACAA
GTCGATTAGGAGTAAATACAAGCCTACACCCATTCCCATCAAGGTCAAAGTGAAAAA TGTGAAAGAA
GGCATTCATTTTAAAAGCAGCGTCATCTCAATTTATGTTAGCGAGAGCATGGATAGA TCAAGCAAAGG
CCAAATAATTGGAAATTTTCAAGCTTTTGATGAGGACACTGGACTACCAGCCCATGC AAGATATGTAA
AATT AG AAG AT AG AG AT AATT GGATCTCTGTG GATT CTGT C AC AT CT G AAATT AAACTT GC AAAACTT C
CT G ATTTT G AAT CT AG AT AT GTT C AAAAT G GC AC AT AC ACT GT AAAG ATTGTG G CC AT AT C AG AAG ATT
ATCCT AG AAAAACC AT CACT GGCACAGTCCTT AT CAAT GTT G AAG AC ATC AACG ACAACTGTCCC AC A
CTGATAGAGCCTGTGCAGACAATCTGTCACGATGCAGAGTATGTGAATGTTACTGCA GAGGACCTGG
ATGGACACCCAAACAGTGGCCCTTTCAGTTTCTCCGTCATTGACAAACCACCTGGCA TGGCAGAAAA
ATGGAAAATAGCACGCCAAGAAAGTACCAGTGTGCTGCTGCAACAAAGTGAGAAAAA GCTTGGGAG
AAGTGAAATTC AGTTCCT GATTT CAG ACAAT CAGGGTTTT AGTT GTCCT GAAAAGCAGGTCCTT ACAC
TCACAGTTTGTGAGTGTCTGCATGGCAGCGGCTGCAGGGAAGCACAGCATGACTCCT ATGTGGGCC
TGGGACCCGCAGCAATTGCGCTCATGATTTTGGCCTTTCTGCTCCTGCTATTGGTAC CACTTTTACTG
CTGATGTGCCATTGCGGAAAGGGCGCCAAAGGCTTTACCCCCATACCTGGCACCATA GAGATGCTG
CATCCTTGGAATAATGAAGGAGCACCACCTGAAGACAAGGTGGTGCCATCATTTCTG CCAGTGGATC
AAGGGGGCAGTCTAGTAGGAAGAAATGGAGTAGGAGGTATGGCCAAGGAAGCCACGA TGAAAGGA
AGTAGCTCTGCTTCCATTGTCAAAGGGCAACATGAGATGTCCGAGATGGATGGAAGG TGGGAAGAA
CACAGAAGCCTGCTTTCTGGTAGAGCTACCCAGTTTACAGGGGCCACAGGCGCTATC ATGACCACT
GAAACCACGAAGACCGCAAGGGCCACAGGGGCTTCCAGAGACATGGCCGGAGCTCAG GCAGCTGC TGTTGCACTGAACGAAGAATTCTTAAGAAATTATTTCACTGATAAAGCGGCCTCTTACAC TGAGGAAG
ATGAAAATCACACAGCCAAAGATTGCCTTCTGGTTTATTCTCAGGAAGAAACTGAAT CGCTGAATGCT
TCTATTGGTTGTTGCAGTTTTATTGAAGGAGAGCTAGATGACCGCTTCTTAGATGAT TTGGGACTTAA
ATT C AAG AC ACT AG CT G AAGTTT GCCT G GGTC AAAAAAT AG AT AT AAAT AAG G AAATT GAG C AG AG AC
AAAAACCTGCC AC AG AAACAAGT AT G AACACAGCTTCACATTC ACT CTGTGAGCAA ACT ATGGTT AAT
TCAGAGAATACCTACTCCTCTGGCAGTAGCTTCCCAGTTCCAAAATCTTTGCAAGAA GCCAATGCAG
AGAAAGTAACTCAGGAAATAGTCACTGAAAGATCTGTGTCTTCTAGGCAGGCGCAAA AGGTAGCTAC
ACCTCTTCCTGACCCAATGGCTTCTAGAAATGTGATAGCAACAGAAACTTCCTATGT CACAGGGTCCA
CTATGCCACCAACCACTGTGATCCTGGGTCCTAGCCAGCCACAGAGCCTTATTGTGA CAGAGAGGG
TGTATGCTCCAGCTTCTACCTTGGTAGATCAGCCTTATGCTAATGAAGGTACAGTTG TGGTCACTGAA
AGAGTAATACAGCCTCATGGGGGTGGATCGAATCCTCTGGAAGGCACTCAGCATCTT CAAGATGTAC
CTTACGTCATGGT GAGGGAAAGAGAGAGCTTCCTT GCCCCCAGCTCAGGT GT GCAGCCTACTCT GG
CCATGCCTAATATAGCAGTAGGACAGAATGTGACAGTGACAGAAAGAGTTCTAGCAC CTGCTTCCAC
TCTGCAATCCAGTTACCAGATTCCCACTGAAAATTCTATGACGGCTAGGAACACCAC GGTGTCTGGA
GCTGGAGTCCCTGGCCCTCTGCCAGATTTTGGTTTAGAGGAATCTGGTCATTCTAAT TCTACCATAAC
CACATCTTCCACCAGAGTTACCAAGCATAGCACTGTACAGCATTCTTACTCCTAAAC AGCAGTCAGCC
AC AAACT G ACC C AG AGTTT AATT AG C AGT G ACT AATTT C ATGTTTCC AAT GTACCT G ATTTTT CAT GAG
CCTT AC AG AC AC AC AG AG AC AC AT AC AC ATT G ATCTT AAAATTTTT CTC AGT C ACT GAT AT GC AAAG G
ACCACACTGTCTCTGCTTCCAGGAGTATTTTAGAAATGTTCCACAATTTACTGAAGA CATAGAGATGA
TGCTGCTGCTTAGGTGCCTTTTAGCAAGCTATGCAAACAATCCTGATAAAACAAGAT ACATAGAGAGT
CAATCTGGCTTCTGAGAATTTACCAAGTGAACAGAGTACCTAGTTCATCAGCCGTCC AGTAAAGCAA
CCCAGGAAACTGACTGGGTCTCTTTGCCTACCGTATTAACATTAAACATTGATGTTC TGTATTCTGTA
CTTT ACTG C AC CC AG C AG ACTTT CAAC AACT C ATT G ATCC AAAG AT AC AT GC AC AGTCT GAG C AC C AG
CT ATGGT GCT CAT AACTT CTTT AAGACTT G AACCCTTTCAATCT GT GT GATTC ATT AAATT GG ACC ATT
GAT GAT AAG AAT AC AC ATT GT ATGTTTCT GT GCACAT GACAGTGTGTGTGT GTGCACGTACATACT GT
ATAGTCTTAAAAATAGCATTATACTGGCCAGGGGTGGTGGCTAACGCCTGTAATCCC AGCACTTTGG
GAGGCCGAGGCGGGTGGATCAACTGTGGTCAGGAGTTTGAGATCAGCCAGGCCAACC TGGTGAAA
CCCCGTCTCTACTAAAAATACAAAAATTAGCTGGGCGTGATGGTGGGCGCCTGTAAT CCCAGCTACT
TGGGAGGCTGAGGCAGGAGAATCACTTGAACCCGGGAGGCGGAGGTTGCAGTGAGCC GAGATCGC
ACCATTGCACTCCAGTCTGGGCAACAGAGTGAGATTCCGTCTCAAAAAAAAAAAGAA AAGGAAAAAA
AAATAGCATTATACCTCTTCCTTGTCTCAACCGCCATGAAAATTCTGAACACTCCAA ATTCAGTTGAAT
AATCCAAAACAAAATTT ATAAGTAT AAAATAATTTTACTTCTTAT AGTAATAGTAT ACTTT AAAAAGCCT
C AGG GTAT ATT ATCTTCT AAAC AGCT AC AATT C AGTG C AG CT AC ATT AACC AACT ATGTTCTCT AGTT G
AGAACAACTAGGCCTATTTCACTGCTGTGTAGCCTCAGTGCCTAACATGGGTGCCAA ATAAATATTCG
T AG AATT AC ACT GAATT GT AAAAACC ATTCGTTTTT GTTTACAATT GCCAAAAATCT CAAAAGGCCCT G
T ATTT ATGTAATT CTTT G AAATT ATT ATTTT ATTTT G ATTT CT C AGTT ATT GACTGGCTGGGTGTGACTT
AGT AC AT AAGTACT C AAT ATT AT AAAAACCT C AAAT AATT G ACTT G ATTTT AC AC AAC ATCCTTCCCTTT
TCT AC AAGTT AATTTTTTT AC AAAT C ATTT GGGTTATCTCCT AAAT AG GTT AT ATTTT ATT G CTTCT AG A
AAC AAT GTTT C AAAAT AT ATGTGC ATT ATC AGT AAT AATTTGTAT AAAT ATTT C CC AC AAC AATTTT CAT
AATTTTCAAAGACTAATTTCTTGACTGAAGATATTTTGCTAGGGAAGTGAAACTTTA AAATTTTGTAGA
TTTT AAAAAAT ATTGTT GAATGGT GT CAT GC AAAGGATTT AT AT AGT GT GCTCCCACT AACTGTACAG A
TCAGGACACATATTTTTAGACATCTAAGTCTGTAGCTTAAATGGAGGTTACTCTTCC ATCATCTAGAAT TGTTTACTTAGTAATTGTTGTTTCTTTTATTATTATAGACTTACTATCAGTTTTATTTTG CCAAGTATGCA
ACAGGTATATCACTAGTATATGAAAATGTAAATATCACTTGTGTACTCAAACAAAAG TTGGTCTTAAGC
TTCCACCTTGAGCAGCCTTGGAAACCTAACCTGCCTCTTTTAGCATAATCACATTTT CTAAATGATTTT
CTTTGTTCCT G AAAAAGTGATTTGTATT AGTTTTACATTTGTTTTTTGGAAG ATT AT ATTTGTAT AT GT A
TCATCATAAAATATTTAAATAAAAAGTATCTTTAGAGTGACCCTTTCCCCATAGATT TTTATTTCTCTAT
TATATTTTACAAGGAATATAACTCAGTTTGTTAGGGAGAGTGCCTTAAAGGCAGGTG TTTCTTGGACT
TTGTTATTTAATTAGATCTGCTTGCAATAAAAAAAGTTGTCGGTTATCTAAAATTCA AAAAAAAAAAAAA
AAAA
(SEQ ID NO: 26)
As used herein, the term “ECHDC3” refers to the gene encoding Enoyl-CoA Hydratase Domain Containing 3. The terms “ECHDC3” and "Enoyl-CoA Hydratase Domain Containing 3" include wild-type forms of the ECHDC3 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type ECHDC3. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type ECHDC3 nucleic acid sequence (e.g., SEQ ID NO: 27, NCBI Reference Sequence: NM_024693.4). SEQ ID NO: 27 is a wild-type gene sequence encoding ECHDC3 protein, and is shown below:
GGGGCGGGGCGTGCCGGGGCGGGGCGTAGTACGGACTGGGCCTGGCCTGGGGCGTCC CCGCGA
AGCCTGGGCCTGTCAGGCGGTTCCGTCCGGGTCTCGGCCACCGTCGAGTTCCGTCGA GTTCCGTC
CCGGCCCTGCTCACAGCAGCGCCCTCGGAGCGCCCAGCACCTGCGGCCGGCCAGGCA GCGCGAT
CCTGCGGCGTCTGGCCATCCCGAATGCTATGGCCGCCGTCGCCGTCTTGCGGGCCTT CGGGGCAA
GTGGGCCCATGTGTCTCCGGCGCGGCCCCTGGGCCCAGCTCCCCGCCCGCTTCTGCA GCCGGGA
CCCGGCCGGGGCGGGGCGGCGGGAGTCGGAGCCGCGGCCCACCAGCGCGCGGCAGCT GGACGG
CATAAGGAACATCGTCTTGAGCAATCCCAAGAAGAGGAACACGTTGTCACTTGCAAT GCTGAAATCT
CTCCAAAGTGACATTCTTCATGACGCTGACAGCAACGATCTGAAAGTCATTATCATC TCGGCTGAGG
GGCCTGTGTTTTCTTCTGGGCATGACTTAAAGGAGCTGACAGAGGAGCAAGGCCGTG ATTACCATG
CCGAAGTATTTCAGACCTGTTCCAAGGTCATGATGCACATCCGGAACCACCCCGTCC CCGTCATTGC
CATGGTCAATGGCCTGGCCACGGCTGCCGGCTGTCAACTGGTTGCCAGCTGCGACAT TGCCGTGG
CGAGCGACAAGTCCTCTTTTGCCACTCCTGGGGTGAACGTCGGGCTCTTCTGTTCTA CCCCTGGGG
TTGCCTTGGCAAGAGCAGTGCCTAGAAAGGTGGCCTTGGAGATGCTCTTTACTGGTG AGCCCATTTC
TGCCCAGGAGGCCCTGCTCCACGGGCTGCTTAGCAAGGTGGTGCCAGAGGCGGAGCT GCAGGAG
GAGACCATGCGGATCGCTAGGAAGATCGCATCGCTGAGCCGTCCGGTGGTGTCCCTG GGCAAAGC
CACCTTCTACAAGCAGCTGCCCCAGGACCTGGGGACGGCTTACTACCTCACCTCCCA GGCCATGGT
GGACAACCTGGCCCTGCGGGACGGGCAGGAGGGCATCACGGCCTTCCTCCAGAAGAG AAAACCTG
TCTGGTCACACGAGCCAGTGTGAGTGGAGGCAGAGGAGTGAGGCCCACGGGCAGCGC CCAGGAG
CCCACCTTCCCCTCTGGCCCAGCCACCACTGCCTCTCAGCTTCAACAGGTGACAGGC TGCTTTCGT
GACTTGATATTGGTGTCATAGCATTTGGCCTACATTAAAAGCCACAATTTCATGGGG AAAGGACAAAA
T GGAGAGT GACTGAGGTGCT GACCTCAGT GCAAGGCT GGT GAACCCT GCAGCGGGCCAGCTATGG
TGGGAAGCCTGGCATTTGGGGTGCTCCTTGCAACGTCTTAAGCAAGCGACCCCCCTG ACATAGCAA
AAGGTGGCAACCCATGGAGGCAGAAAGAAGGACGCCAGCCTGACCCTTATCTGAAAC GTCCTAAGC AGAGTTAATCCTGGCTGCTCAGGAGAGGCGACACATTTCAAATCTCCACGAGATATTCTC CACACAG AAAATCTTCTTGATTCTATAGAGACTTAATCATGCCTATGGCTTTGAATAATCTTATGTG ATTTAAATAA ATT AAAT CTTT AT AAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 27)
As used herein, the term “EPHA1” refers to the gene encoding Ephrin type-A receptor 1. The terms “EPHA1” and "Ephrin type-A receptor 1" include wild-type forms of the EPHA1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type EPHA1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type EPHA1 nucleic acid sequence (e.g., SEQ ID NO: 28, ENA accession number M18391). SEQ ID NO: 28 is a wild-type gene sequence encoding EPHA1 protein, and is shown below:
GCCCCCGCCCGGCCCGCCCCGCTCTCCTAGTCCCTTGCAACCTGGCGCTGCATCCGG GCC
ACTGTCCCAGGTCCCAGGTCCCGGCCCGGAGCTATGGAGCGGCGCTGGCCCCTGGGG CTA
GGGCTGGTGCTGCTGCTCTGCGCCCCGCTGCCCCCGGGGGCGCGCGCCAAGGAAGTT ACT
CTGATGGACACAAGCAAGGCACAGGGAGAGCTGGGCTGGCTGCTGGATCCCCCAAAA GAT
GGGTGGAGTGAACAGCAACAGATACTGAATGGGACACCCCTCTACATGTACCAGGAC TGC
CCAATGCAAGGACGCAGAGACACTGACCACTGGCTTCGCTCCAATTGGATCTACCGC GGG
GAGGAGGCTTCCCGCGTCCACGTGGAGCTGCAGTTCACCGTGCGGGACTGCAAGAGT TTC
CCTGGGGGAGCCGGGCCTCTGGGCTGCAAGGAGACCTTCAACCTTCTGTACATGGAG AGT
GACCAGGATGTGGGCATTCAGCTCCGACGGCCCTTGTTCCAGAAGGTAACCACGGTG GCT
GCAGACCAGAGCTTCACCATTCGAGACCTTGCGTCTGGCTCCGTGAAGCTGAATGTG GAG
CGCTGCTCTCTGGGCCGCCTGACCCGCCGTGGCCTCTACCTCGCTTTCCACAACCCG GGT
GCCTGTGTGGCCCTGGTGTCTGTCCGGGTCTTCTACCAGCGCTGTCCTGAGACCCTG AAT
GGCTTGGCCCAATTCCCAGACACTCTGCCTGGCCCCGCTGGGTTGGTGGAAGTGGCG GGC
ACCTGCTTGCCCCACGCGCGGGCCAGCCCCAGGCCCTCAGGTGCACCCCGCATGCAC TGC
AGCCCT GAT GGCGAGTGGCT GGT GCCT GTAGGACGGT GCCACTGTGAGCCT GGCTAT GAG
GAAGGTGGCAGTGGCGAAGCATGTGTTGCCTGCCCTAGCGGCTCCTACCGGATGGAC ATG
GACACACCCCATTGTCTCACGTGCCCCCAGCAGAGCACTGCTGAGTCTGAGGGGGCC ACC
ATCTGTACCTGTGAGAGCGGCCATTACAGAGCTCCCGGGGAGGGCCCCCAGGTGGCA TGC
ACAGGTCCCCCCTCGGCCCCCCGAAACCTGAGCTTCTCTGCCTCAGGGACTCAGCTC TCC
CTGCGTTGGGAACCCCCAGCAGATACGGGGGGACGCCAGGATGTCAGATACAGTGTG AGG
TGTTCCCAGTGTCAGGGCACAGCACAGGACGGGGGGCCCTGCCAGCCCTGTGGGGTG GGC
GTGCACTTCTCGCCGGGGGCCCGGGCGCTCACCACACCTGCAGTGCATGTCAATGGC CTT
GAACCTTATGCCAACTACACCTTTAATGTGGAAGCCCAAAATGGAGTGTCAGGGCTG GGC
AGCTCTGGCCATGCCAGCACCTCAGTCAGCATCAGCATGGGGCATGCAGAGTCACTG TCA
GGCCTGTCTCTGAGACTGGTGAAGAAAGAACCGAGGCAACTAGAGCTGACCTGGGCG GGG
TCCCGGCCCCGAAGCCCTGGGGCGAACCTGACCTATGAGCTGCACGTGCTGAACCAG GAT
GAAGAACGGTACCAGATGGTTCTAGAACCCAGGGTCTTGCTGACAGAGCTGCAGCCT GAC
ACCACATACATCGTCAGAGTCCGAATGCTGACCCCACTGGGTCCTGGCCCTTTCTCC CCT GATCATGAGTTTCGGACCAGCCCACCAGTGTCCAGGGGCCTGACTGGAGGAGAGATTGTA
GCCGTCATCTTTGGGCTGCTGCTTGGTGCAGCCTTGCTGCTTGGGATTCTCGTTTTC CGG
TCCAGGAGAGCCCAGCGGCAGAGGCAGCAGAGGCACGTGACCGCGCCACCGATGTGG ATC
GAGAGGACAAGCTGTGCTGAAGCCTTATGTGGTACCTCCAGGCATACGAGGACCCTG CAC
AGGGAGCCTTGGACTTTACCCGGAGGCTGGTCTAATTTTCCTTCCCGGGAGCTTGAT CCA
GCGTGGCT GATGGT GGACACT GTCATAGGAGAAGGAGAGTTT GGGGAAGT GTATCGAGGG
ACCCTCAGGCTCCCCAGCCAGGACTGCAAGACTGTGGCCATTAAGACCTTAAAAGAC ACA
TCCCCAGGTGGCCAGTGGTGGAACTTCCTTCGAGAGGCAACTATCATGGGCCAGTTT AGC
CACCCGCATATTCTGCATCTGGAAGGCGTCGTCACAAAGCGAAAGCCGATCATGATC ATC
ACAGAATTTATGGAGAATGCAGCCCTGGATGCCTTCCTGAGGGAGCGGGAGGACCAG CTG
GTCCCTGGGCAGCTAGTGGCCATGCTGCAGGGCATAGCATCTGGCATGAACTACCTC AGT
AATCACAATTATGTCCACCGGGACCTGGCTGCCAGAAACATCTTGGTGAATCAAAAC CTG
TGCTGCAAGGTGTCTGACTTTGGCCTGACTCGCCTCCTGGATGACTTTGATGGCACA TAC
GAAACCCAGGGAGGAAAGATCCCTATCCGTTGGACAGCCCCTGAAGCCATTGCCCAT CGG
ATCTTCACCACAGCCAGCGATGTGTGGAGCTTTGGGATTGTGATGTGGGAGGTGCTG AGC
TTTGGGGACAAGCCTTATGGGGAGATGAGCAATCAGGAGGTTATGAAGAGCATTGAG GAT
GGGTACCGGTTGCCCCCTCCTGTGGACTGCCCTGCCCCTCTGTATGAGCTCATGAAG AAC
TGCTGGGCATATGACCGTGCCCGCCGGCCACACTTCCAGAAGCTTCAGGCACATCTG GAG
CAACTGCTTGCCAACCCCCACTCCCTGCGGACCATTGCCAACTTTGACCCCAGGGTG ACT
CTTCGCCTGCCCAGCCTGAGTGGCTCAGATGGGATCCCGTATCGAACCGTCTCTGAG TGG
CTCGAGTCCATACGCATGAAACGCTACATCCTGCACTTCCACTCGGCTGGGCTGGAC ACC
ATGGAGTGTGTGCTGGAGCTGACCGCTGAGGACCTGACGCAGATGGGAATCACACTG CCC
GGGCACCAGAAGCGCATTCTTTGCAGTATTCAGGGATTCAAGGACTGATCCCTCCTC TCA
CCCCATGCCCAATCAGGGTGCAAGGAGCAAGGACGGGGCCAAGGTCGCTCATGGTCA CTC
CCTGCGCCCCTTCCCACAACCTGCCAGACTAGGCTATCGGTGCTGCTTCTGCCCGCT TTA
AGGAGAACCCTGCTCTGCACCCCAGAAAACCTCTTTGTTTTAAAAGGGAGGTGGGGG TAG
AAGTAAAAGGATGATCATGGGAGGGAGCTCAGGGGTTAATATATATACATACATACA CAT
ATATATATTGTTGTAAATAAACAGGAAATGATTTTCTGCCTCCATCCCACCCATCAG GGC
TGCAGGCACT
(SEQ ID NO: 28)
As used herein, the term “FABP5” refers to the gene encoding Fatty acid-binding protein 5. The terms “FABP5” and "Fatty acid-binding protein 5" include wild-type forms of the FABP5 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type FABP5. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type FABP5 nucleic acid sequence (e.g., SEQ ID NO: 29, ENA accession number M94856). SEQ ID NO: 29 is a wild-type gene sequence encoding FABP5 protein, and is shown below:
ACCGCCGACGCAGACCCCTCTCTGCACGCCAGCCCGCCCGCACCCACCATGGCCACA GTT
CAGCAGCTGGAAGGAAGATGGCGCCTGGTGGACAGCAAAGGCTTTGATGAATACATG AAG GAGCTAGGAGTGGGAATAGCTTTGCGAAAAATGGGCGCAATGGCCAAGCCAGATTGTATC ATCACTTGTG AT GGTAAAAACCT CACCAT AAAAACT GAGAGCACTTT G AAAACAAC AC AG TTTTCTTGTACCCTGGGAGAGAAGTTTGAAGAAACCACAGCTGATGGCAGAAAAACTCAG ACTGTCTGCAACTTTACAGATGGTGCATTGGTTCAGCATCAGGAGTGGGATGGGAAGGAA AGCACAATAACAAGAAAATTGAAAGATGGGAAATTAGTGGTGGAGTGTGTCATGAACAAT GTCACCT GT ACTCGG AT CTAT G AAAAAGT AGAATAAAAATTCCATC AT CACTTT GG ACAG G AGTT AATT AAG AG AAT G ACC AAG CT C AGTT C AAT G AGO AAAT CTC CAT ACT GTTT CTTT CTTTTTTTTTT C ATT ACTGTGTT C AATT AT CTTT AT CAT AAAC ATTTT AC ATGC AGCTAT TTCAAAGTGTGTTGGATTAATTAGGATCATCCCTTTGGTTAATAAATAAATGTGTTTGTG CT (SEQ ID NO: 29)
As used herein, the term “FERMT2” refers to the gene encoding Fermitin family homolog 2. The terms “FERMT2” and "Fermitin family homolog 2" include wild-type forms of the FERMT2 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type FERMT2. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type FERMT2 nucleic acid sequence (e.g., SEQ ID NO: 30, ENA accession number Z24725). SEQ ID NO: 30 is a wild-type gene sequence encoding FERMT2 protein, and is shown below:
CAAAAAGTGTGTGGAAAGGTGGATTGAGGGAGCGGGACCCCCGCGGGACCCGAGGGG GCG
GCAGGCGGGGAACGGGGAGTCAGCCCGCGCTGTGTCTCGGGGCCGGCCGGCAGGAAG GAG
CCATGGCTCTGGACGGGATAAGGATGCCAGATGGCTGCTACGCGGACGGGACGTGGG AAC
TGAGTGTCCATGTGACGGACCTGAACCGCGATATCACCCTGAGAGTGACCGGCGAGG TGC
ACATTGGAGGCGTGATGCTTAAGCTGGTGGAGAAACTCGATGTAAAAAAAGATTGGT CTG
ACCATGCTCTCTGGTGGGAAAAGAAGAGAACTTGGCTTCTGAAGACACATTGGACCT TAG
ATAAGTATGGTATTCAGGCAGATGCTAAGCTTCAGTTCACCCCTCAGCACAAACTGC TCC
GCCTGCAGCTTCCCAACATGAAGTATGTGAAGGTGAAAGTGAATTTCTCTGATAGAG TCT
TCAAAGCTGTTTCTGACATCTGTAAGACTTTTAATATCAGACACCCCGAAGAACTTT CTC
T CTTAAAG AAACCCAG AGATCCAACAAAG AAAAAAAAGAAGAAGCT AG AT G ACC AGT CT G
AAGATGAGGCACTTGAATTAGAGGGGCCTCTTATCACTCCTGGATCAGGAAGTATAT ATT
CAAGCCCAGGACTGTATAGTAAAACAATGACCCCCACTTATGATGCTCATGATGGAA GCC
CCTT GT CACC AACTT CTGCTTGGTTTGGT GACAGTGCTTT GT CAG AAGGCAATCCTGGT A
TACTTGCTGTCAGTCAACCAATCACGTCACCAGAAATCTTGGCAAAAATGTTCAAGC CTC
AAG CTCTTCTTG AT AAAG C AAAAAT C AACC AAGG AT G GCTT G ATTCCT C AAG AT CTCTC A
T GGAAC AAG AT GT GAAGG AAAAT GAGGCCTT GOT GCTCCG ATTCAAGT ATT ACAGCTTTT
TTGATTTGAATCCAAAGTATGATGCAATCAGAATCAATCAGCTTTATGAGCAGGCCA AAT
GGGCCATTCTCCTGGAAGAGATTGAATGCACAGAAGAAGAAATGATGATGTTTGCAG CCC
TGC AGTATC ATAT C AAT AAGCTGT C AAT CAT G AC AT C AG AG AAT C ATTT G AAC AAC AGT G
ACAAAGAAGTTGATGAAGTTGATGCTGCCCTTTCAGACCTGGAGATTACTCTGGAAG GGG
GTAAAACGTCAACAATTTTGGGTGACATTACTTCCATTCCTGAACTTGCTGACTACA TTA
AAGTTTTCAAGCCAAAAAAGCTGACTCTGAAAGGTTACAAACAATATTGGTGCACCT TCA AAG ACACAT CC ATTT CTT GTT AT AAG AGCAAAG AAGAATCC AGTGGCACACCAGCT CAT C
AGATGAACCTCAGGGGATGTGAAGTTACCCCAGATGTAAACATTTCAGGCCAAAAAT TTA
ACATT AAACTCCT G ATTCCAGTTGCAGAAGGCAT G AAT G AAAT CTGGCTTCGTTGTGACA
ATGAAAAACAGTATGCACACTGGATGGCAGCCTGCAGATTAGCCTCCAAAGGCAAGA CCA
TGGCGGACAGTTCTTACAACTTAGAAGTTCAGAATATTCTTTCCTTTCTGAAGATGC AGC
ATTT AAACCCAG ATCCTC AGTT AAT ACC AGAGC AG ATCACG ACT GAT AT AACTCCT GAAT
GTTTGGTGTCTCCCCGCTATCTAAAAAAGTATAAGAACAAGCAGATAACAGCGAGAA TCT
TGGAGGCCCATCAGAATGTAGCTCAGATGAGTCTAATTGAAGCCAAGATGAGATTTA TTC
AAGCTTGGCAGTCACTACCTGAATTTGGCATCACTCACTTCATTGCAAGGTTCCAAG GGG
G C AAAAAAG AAG AACTT ATT GG AATT G CAT AC AAC AG ACT GATT CG GAT G GAT GCC AGC A
CTGGAGATGCAATTAAAACATGGCGTTTCAGCAACATGAAACAGTGGAATGTCAACT GGG
AAATCAAAATGGTCACCGTAGAGTTTGCAGATGAAGTACGATTGTCCTTCATTTGTA CTG
AAGTAGATTGCAAAGTGGTTCATGAATTCATTGGTGGCTACATATTTCTCTCAACAC GTG
CAAAAG ACCAAAACG AG AGTTT AG AT GAAGAG AT GTTCT ACAAACTT ACC AGTGGTTGGG
TGTGAATAGAAATACTGTTTAATGAAACTCCACGGCCATAACAATATTTAACTTTAA AAG
CT GTTTGTT ATATGCTG CTT AAT AAAGT AAG CTT G AAATTT AT C ATTTT AT CAT G AAAAC
TTCTTTGCCTTACCAGACCAGTTAATATGTGCACTAAACAAGCACGACTATTAATCT ATC
ATGTTATGATATAATAAACTTGAATTTGGCACACATTCCTTAGGGCCATGAATTGAA AAC
T GAAAT AGT GGGCAAAT CAGG AACAAACCAT CACT G ATTT ACT GATTTAAGCT AGCCAAA
CTGTAAGAAACAAGCCATCTATTTTAAAGCTATCCAGGGCTTAACCTATATGAACTC TAT
TT AT CAT GTCT AAT G CAT GTG ATTT AAT GTAT GTTT AATTT GAT AT C ATGTTTT AAAAT A
TCCTACTTCTGGTAGCCATTTAATTCCTCCCCCTACCCCCAAATAAATCAGGCATGC AGG
AGGCCTGATATTTAGTAATGTCATTGTGTTTGACCTTGAAGGAAAATGCTATTAGTC CGT
CGTGCTTNATTTGTTTTTGTCCTTGAATAAGCATGTTATGTATATNGTCTCGTGTTT TTA
TTTTTACACCATATTGTATTACACTTTTAGTATTCACCAGCATAANCACTGTCTGCC TAA
AAT AT G CAACT CTTT GC ATT AC AAT AT G AAGTAAAGTT CTATGAAGTATG C ATTTT GTGT
AACT AAT GT AAAAAC AC AAATTTT AT AAAATT GT AC AGTTTTTT AAAAACT ACT C AC AAC
T AG CAG ATGG CTT AAAT GT AG CA AT CTCTGCGTT AATT AAATGCCTTT AAG AG AT AT AAT
TAACGTG C AGTTTT AAT AT CT ACT AAATT AAG AAT G ACTT C ATT AT GAT CAT G ATTT GCC
ACAATGTCCTTAACTCTAATGCCTGGACTGGCCATGTTCTAGTCTGTTGCGCTGTTA CAA
TCTGTATTGGTGCTAGTCAGAAAATTCCTAGCTCACATAGCCCAAAAGGGTGCGAGG GAG
AGGTGGATTACCAGTATTGTTCAATAATCCATGGTTCAAAGACTGTATAAATGCATT TTA
TTTT AAAT AAAAG C AAAACTTTT ATTT AAA
(SEQ ID NO: 30)
As used herein, the term “FTH1” refers to the gene encoding Ferritin heavy chain. The terms “FTH1” and "Ferritin heavy chain" include wild-type forms of the FTH1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type FTH1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type FTH1 nucleic acid sequence (e.g., SEQ ID NO: 31, ENA accession number X00318). SEQ ID NO: 31 is a wild-type gene sequence encoding FTH1 protein, and is shown below:
CACCGCACCCTCGGACTGCCCCAAGGCCCCCGCCGCCGCTCCAGCGCCGCGCAGCCA CCGCCGC
CGCCGCCGCCTCTCCTTAGTCGCCGCCATGACGACCGCGTCCACCTCGCAGGTGCGC CAGAACTA
CCACCAGGACTCAGAGGCCGCCATCAACCGCCAGATCAACCTGGAGCTCTACGCCTC CTACGTTTA
CCT GTCCATGTCTTACT ACTTT GACCGCG AT GATGT GGCTTT GAAG AACTTT GCCAAAT ACTTT CTT C
ACCAATCTCATGAGGAGAGGGAACATGCTGAGAAACTGATGAAGCTGCAGAACCAAC GAGGTGGCC
GAATCTTCCTTCAGGATATCAAGAAACCAGACTGTGATGACTGGGAGAGCGGGCTGA ATGCAATGGA
GTGT GCATT ACATTTGGAAAAAAAT GT GAAT CAGTC ACT ACT GG AACTGCACAAACT GGCC ACT GACA
AAAATGACCCCCATTTGTGTGACTTCATTGAGACACATTACCTGAATGAGCAGGTGA AAGCCATCAAA
GAATTGGGTGACCACGTGACCAACTTGCGCAAGATGGGAGCGCCCGAATCTGGCTTG GCGGAATAT
CTCTTTGACAAGCACACCTGGGAGACAGTGATAATGAAAGCTAAGCCTCGGGCTAAT TTCCCATAGC
CGTGGGGTGACTTCCTGGTCACCAAGGCAGTGCATGCATGTTGGGGTTTCCTTTACC TTTTCTATAA
GTTGTACCAAAACATCCACTTAAGTTCTTTGATTTGTACCATTCCTTCAAATAAAGA AATTTGGTACCC
(SEQ ID NO: 31)
As used herein, the term “GNAS” refers to the gene encoding Guanine nucleotide-binding protein G(s) subunit alpha isoforms XLas. The terms “GNAS” and "Guanine nucleotide-binding protein G(s) subunit alpha isoforms XLas" include wild-type forms of the GNAS gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type GNAS. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type GNAS nucleic acid sequence (e.g., SEQ ID NO: 32, ENA accession number X04408). SEQ ID NO: 32 is a wild-type gene sequence encoding GNAS protein, and is shown below:
GCGGGCGTGCTGCCGCCGCTGCCGCCGCCGCCGCAGCCCGGCCGCGCCCCGCCGCCG CCG
CCGCCGCCATGGGCTGCCTCGGGAACAGTAAGACCGAGGACCAGCGCAACGAGGAGA AGG
CGCAGCGTGAGGCCAACAAAAAGATCGAGAAGCAGCTGCAGAAGGACAAGCAGGTCT ACC
GGGCCACGCACCGCCT GCT GCT GCTGGGTGCTGGAGAATCT GGTAAAAGCACCATT GT GA
AGCAGATGAGGATCCTGCATGTTAATGGGTTTAATGGAGAGGGCGGCGAAGAGGACC CGC
AGGCTGCAAGGAGCAACAGCGATGGTGAGAAGGCAACCAAAGTGCAGGACATCAAAA ACA
ACCTGAAAGAGGCGATTGAAACCATTGTGGCCGCCATGAGCAACCTGGTGCCCCCCG TGG
AGCT GGCCAACCCCG AG AACCAGTT CAGAGTGGACT ACATCCT G AGTGT GAT GAACGTG C
CTGACTTTGACTTCCCTCCCGAATTCTATGAGCATGCCAAGGCTCTGTGGGAGGATG AAG
GAGTGCGTGCCTGCTACGAACGCTCCAACGAGTACCAGCTGATTGACTGTGCCCAGT ACT
TCCTGGACAAGATCGACGTGATCAAGCAGGCTGACTATGTGCCGAGCGATCAGGACC TGC
TTCGCTGCCGTGTCCTGACTTCTGGAATCTTTGAGACCAAGTTCCAGGTGGACAAAG TCA
ACTTCCACATGTTTGACGTGGGTGGCCAGCGCGATGAACGCCGCAAGTGGATCCAGT GCT
TCAACGATGTGACTGCCATCATCTTCGTGGTGGCCAGCAGCAGCTACAACATGGTCA TCC
GGGAGGACAACCAGACCAACCGCCTGCAGGAGGCTCTGAACCTCTTCAAGAGCATCT GGA ACAACAGATGGCTGCGCACCATCTCTGTGATCCTGTTCCTCAACAAGCAAGATCTGCTCG
CTGAGAAAGTCCTTGCTGGGAAATCGAAGATTGAGGACTACTTTCCAGAATTTGCTC GCT
ACACTACTCCTGAGGATGCTACTCCCGAGCCCGGAGAGGACCCACGCGTGACCCGGG CCA
AGT ACTT CATTCGAG AT GAGTTTCT GAGG AT CAGCACTGCCAGTGGAG AT GGGCGT CACT
ACTGCTACCCTCATTTCACCTGCGCTGTGGACACTGAGAACATCCGCCGTGTGTTCA ACG
ACTGCCGTGACATCATTCAGCGCATGCACCTTCGTCAGTACGAGCTGCTCTAAGAAG GGA
ACCCCCAAATTTAATTAAAGCCTTAAGCACAATTAATTAAAAGTGAAACGTAATTGT ACA
AGCAGTTAATCACCCACCATAGGGCATGATTAACAAAGCAACCTTTCCCTTCCCCCG AGT
GATTTTGCGAAACCCCCTTTTCCCTTCAGCTTGCTTAGATGTTCCAAATTTAGAAAG CTT
AAGGCGGCCTACAGAAAAAGGAAAAAAGGCCACAAAAGTTCCCTCTCACTTTCAGTA AAA
AT AAAT AAAAC AGO AGO AG C AAAC AAAT AAAAT G AAAT AAAAG AAAC AAAT G AAAT AAAT
ATTGTGTT GT GCAGCATT AAAAAAAATCAAAAT AAAAATT AAAT GT G AGCAAAG
(SEQ ID NO: 32)
As used herein, the term “GRN” refers to the gene encoding Progranulin. The terms “GRN” and "Progranulin" include wild-type forms of the GRN gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type GRN. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type GRN nucleic acid sequence (e.g., SEQ ID NO: 33, ENA accession number X62320). SEQ ID NO: 33 is a wild-type gene sequence encoding GRN protein, and is shown below:
GCTGCTGCCCAAGGACCGCGGAGTCGGACGCAGGCAGACCATGTGGACCCTGGTGAG CTG
GGTGGCCTTAACAGCAGGGCTGGTGGCTGGAACGCGGTGCCCAGATGGTCAGTTCTG CCC
TGTGGCCTGCTGCCTGGACCCCGGAGGAGCCAGCTACAGCTGCTGCCGTCCCCTTCT GGA
CAAATGGCCCACAACACTGAGCAGGCATCTGGGTGGCCCCTGCCAGGTTGATGCCCA CTG
CTCTGCCGGCCACTCCTGCATCTTTACCGTCTCAGGGACTTCCAGTTGCTGCCCCTT CCC
AGAGGCCGTGGCATGCGGGGATGGCCATCACTGCTGCCCACGGGGCTTCCACTGCAG TGC
AGACGGGCGATCCTGCTTCCAAAGATCAGGTAACAACTCCGTGGGTGCCATCCAGTG CCC
TGATAGTCAGTTCGAATGCCCGGACTTCTCCACGTGCTGTGTTATGGTCGATGGCTC CTG
GGGGTGCTGCCCCATGCCCCAGGCTTCCTGCTGTGAAGACAGGGTGCACTGCTGTCC GCA
CGGTGCCTTCTGCGACCTGGTTCACACCCGCTGCATCACACCCACGGGCACCCACCC CCT
GGCAAAGAAGCTCCCTGCCCAGAGGACTAACAGGGCAGTGGCCTTGTCCAGCTCGGT CAT
GTGTCCGGACGCACGGTCCCGGTGCCCTGATGGTTCTACCTGCTGTGAGCTGCCCAG TGG
GAAGTATGGCT GOT GCCCAATGCCCAACGCCACCT GOT GCTCCGATCACCTGCACT GOT G
CCCCCAAGACACTGTGTGTGACCTGATCCAGAGTAAGTGCCTCTCCAAGGAGAACGC TAC
CACGGACCTCCTCACTAAGCTGCCTGCGCACACAGTGGGGGATGTGAAATGTGACAT GGA
GGTGAGCTGCCCAGATGGCTATACCTGCTGCCGTCTACAGTCGGGGGCCTGGGGCTG CTG
CCCTTTTACCCAGGCTGTGTGCTGTGAGGACCACATACACTGCTGTCCCGCGGGGTT TAC
GTGTGACACGCAGAAGGGTACCTGTGAACAGGGGCCCCACCAGGTGCCCTGGATGGA GAA
GGCCCCAGCTCACCTCAGCCTGCCAGACCCACAAGCCTTGAAGAGAGATGTCCCCTG TGA TAATGTCAGCAGCTGTCCCTCCTCCGATACCTGCTGCCAACTCACGTCTGGGGAGTGGGG
CTGCTGTCCAATCCCAGAGGCTGTCTGCTGCTCGGACCACCAGCACTGCTGCCCCCA GGG
CTACACGTGTGTAGCTGAGGGGCAGTGTCAGCGAGGAAGCGAGATCGTGGCTGGACT GGA
GAAGATGCCTGCCCGCCGGGCTTCCTTATCCCACCCCAGAGACATCGGCTGTGACCA GCA
CACCAGCTGCCCGGTGGGGCAGACCTGCTGCCCGAGCCTGGGTGGGAGCTGGGCCTG CTG
CCAGTTGCCCCATGCTGTGTGCTGCGAGGATCGCCAGCACTGCTGCCCGGCTGGCTA CAC
CT GCAACGT GAAGGCTCGATCCT GCGAGAAGGAAGT GGTCTCT GCCCAGCCTGCCACCTT
CCT GGCCCGTAGCCCTCACGT GGGTGTGAAGGACGT GGAGT GT GGGGAAGGACACTTCT G
CCATGATAACCAGACCTGCTGCCGAGACAACCGACAGGGCTGGGCCTGCTGTCCCTA CCG
CCAGGGCGTCTGTTGTGCTGATCGGCGCCACTGCTGTCCTGCTGGCTTCCGCTGCGC AGC
CAGGGGTACCAAGT GTTT GCGCAGGGAGGCCCCGCGCT GGGACGCCCCTTT GAGGGACCC
AGCCTTGAGACAGCTGCTGTGAGGGACAGTACTGAAGACTCTGCAGCCCTCGGGACC CCA
CTCGGAGGGTGCCCTCTGCTCAGGCCTCCCTAGCACCTCCCCCTAACCAAATTCTCC CTG
GACCCCATTCTGAGCTCCCCATCACCATGGGAGGTGGGGCCTCAATCTAAGGCCTTC CCT
GTCAGAAGGGGGTTGTGGCAAAAGCCACATTACAAGCTGCCATCCCCTCCCCGTTTC AGT
GGACCCTGTGGCCAGGTGCTTTTCCCTATCCACAGGGGTGTTTGTGTGTGTGCGCGT GTG
CGTTT CAAT AAAGTTT GT ACACTTTCAAAAAAAAAAAAAAAAAAAAAAAAAA
(SEQ ID NO: 33)
As used herein, the term “HBEGF” refers to the gene encoding Heparin Binding EGF Like Growth Factor. The terms “HBEGF” and "Heparin Binding EGF Like Growth Factor" include wild-type forms of the HBEGF gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type HBEGF. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type HBEGF nucleic acid sequence (e.g., SEQ ID NO: 34, NCBI Reference Sequence: NM_001945.2). SEQ ID NO: 34 is a wild-type gene sequence encoding HBEGF protein, and is shown below:
ATTCGGCCGAAGGAGCTACGCGGGCCACGCTGCTGGCTGGCCTGACCTAGGCGCGCG GGGTCGG
GCGGCCGCGCGGGCGGGCTGAGTGAGCAAGACAAGACACTCAAGAAGAGCGAGCTGC GCCTGGG
TCCCGGCCAGGCTTGCACGCAGAGGCGGGCGGCAGACGGTGCCCGGCGGAATCTCCT GAGCTCC
GCCGCCCAGCTCTGGTGCCAGCGCCCAGTGGCCGCCGCTTCGAAAGTGACTGGTGCC TCGCCGCC
TCCTCTCGGTGCGGGACCATGAAGCTGCTGCCGTCGGTGGTGCTGAAGCTCTTTCTG GCTGCAGTT
CTCTCGGCACTGGTGACTGGCGAGAGCCTGGAGCGGCTTCGGAGAGGGCTAGCTGCT GGAACCAG
CAACCCGGACCCTCCCACTGTATCCACGGACCAGCTGCTACCCCTAGGAGGCGGCCG GGACCGGA
AAGTCCGTGACTTGCAAGAGGCAGATCTGGACCTTTTGAGAGTCACTTTATCCTCCA AGCCACAAGC
ACTGGCCACACCAAACAAGGAGGAGCACGGGAAAAGAAAGAAGAAAGGCAAGGGGCT AGGGAAGA
AGAGGGACCCATGTCTTCGGAAATACAAGGACTTCTGCATCCATGGAGAATGCAAAT ATGTGAAGGA
GCTCCGGGCTCCCTCCTGCATCTGCCACCCGGGTTACCATGGAGAGAGGTGTCATGG GCTGAGCCT
CCCAGTGGAAAATCGCTTATATACCTATGACCACACAACCATCCTGGCCGTGGTGGC TGTGGTGCTG
TCATCTGTCTGTCTGCTGGTCATCGTGGGGCTTCTCATGTTTAGGTACCATAGGAGA GGAGGTTATG
ATGTGG AAAAT G AAG AG AAAGT GAAGTTGGGCAT G ACT AATTCCCACT G AG AGAGACTTGTGCT CAA GGAATCGGCTGGGGACTGCTACCTCTGAGAAGACACAAGGTGATTTCAGACTGCAGAGGG GAAAGA
CTTCCATCTAGTCACAAAGACTCCTTCGTCCCCAGTTGCCGTCTAGGATTGGGCCTC CCATAATTGC
TTTGCCAAAATACCAGAGCCTTCAAGTGCCAAACAGAGTATGTCCGATGGTATCTGG GTAAGAAGAA
AGCAAAAGCAAGGGACCTTCATGCCCTTCTGATTCCCCTCCACCAAACCCCACTTCC CCTCATAAGT
TT GTTT AAAC ACTT AT CTT CT GG ATT AG AAT G CCG GTT AAATT C CAT ATG CTC CAG GAT CTTT G ACTG A
AAAAAAAAAAGAAGAAGAAGAAGGAGAGCAAGAAGGAAAGATTT GT GAACTGGAAGAAAGCAACAAA
GATTGAGAAGCCATGTACTCAAGTACCACCAAGGGATCTGCCATTGGGACCCTCCAG TGCTGGATTT
GATGAGTTAACTGTGAAATACCACAAGCCTGAGAACTGAATTTTGGGACTTCTACCC AGATGGAAAAA
TAACAACTATTTTTGTTGTTGTTGTTTGTAAATGCCTCTTAAATTATATATTTATTT TATTCTATGTATGT
T AATTT ATTT AGTTTTT AAC AAT CT AAC AAT AAT ATTT C AAGT GCCT AG ACTGTT ACTTTGG C AATTT C C
TGGCCCTCCACTCCTCATCCCCACAATCTGGCTTAGTGCCACCCACCTTTGCCACAA AGCTAGGATG
GTTCTGTGACCCATCTGTAGTAATTTATTGTCTGTCTACATTTCTGCAGATCTTCCG TGGTCAGAGTG
CCACTGCGGGAGCTCTGTATGGTCAGGATGTAGGGGTTAACTTGGTCAGAGCCACTC TATGAGTTG
GACTTCAGTCTTGCCTAGGCGATTTTGTCTACCATTTGTGTTTTGAAAGCCCAAGGT GCTGATGTCAA
AGTGTAACAGATATCAGTGTCTCCCCGTGTCCTCTCCCTGCCAAGTCTCAGAAGAGG TTGGGCTTCC
ATGCCTGTAGCTTTCCTGGTCCCTCACCCCCATGGCCCCAGGCCCACAGCGTGGGAA CTCACTTTC
CCTTGTGTCAAGACATTTCTCTAACTCCTGCCATTCTTCTGGTGCTACTCCATGCAG GGGTCAGTGCA
GCAGAGGACAGTCTGGAGAAGGTATTAGCAAAGCAAAAGGCTGAGAAGGAACAGGGA ACATTGGAG
CTGACTGTTCTTGGTAACTGATTACCTGCCAATTGCTACCGAGAAGGTTGGAGGTGG GGAAGGCTTT
GTATAATCCCACCCACCTCACCAAAACGATGAAGTTATGCTGTCATGGTCCTTTCTG GAAGTTTCTGG
T GCCATTT CT G AACT GTTACAACTT GT ATTTCCAAACCTGGTT CAT ATTT AT ACTTTGCAATCC AAATAA
AG AT AACC CTT ATTCC AT AAAAAAAAAAAAAAAAAAAAAAAA
(SEQ ID NO: 34)
As used herein, the term “HLA-DRB1” refers to the gene encoding HLA class II histocompatibility antigen, DRB1 beta chain. The terms “HLA-DRB1” and "HLA class II histocompatibility antigen, DRB1 beta chain" include wild-type forms of the HLA-DRB1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type HLA-DRB1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type HLA-DRB1 nucleic acid sequence (e.g., SEQ ID NO: 35, ENA accession number X00699). SEQ ID NO: 35 is a wild-type gene sequence encoding HLA-DRB1 protein, and is shown below:
CTGCTCTGGCCCCTGGTCCTGTCCTGTTCTCCAGCATGGTGTGTCTGAGGCTCCCTG GAG
GCTCCTGCATGGCAGTTCTGACAGTGACACTGATGGTGCTGAGCTCCCCACTGGCTT TGG
CT GGGGACACC AG ACC ACGTTT CTTGGAGT ACT CTACGT CT GAGT GT CATTT CTT CAAT G
GGACGGAGCGGGTGCGGTACCTGGACAGATACTTCCATAACCAGGAGGAGAACGTGC GCT
TCGACAGCGACGTGGGGGAGTTCCGGGCGGTGACGGAGCTGGGGCGGCCTGATGCCG AGT
ACTGGAACAGCCAGAAGGACCTCCTGGAGCAGAAGCGGGGCCGGGTGGACAACTACT GCA
GACACAACTACGGGGTTGTGGAGAGCTTCACAGTGCAGCGGCGAGTCCATCCTAAGG TGA
CTGTGTATCCTTCAAAGACCCAGCCCCTGCAGCACCATAACCTCCTGGTCTGTTCTG TGA
GTGGTTTCTATCCAGGCAGCATTGAAGTCAGGTGGTTCCGGAATGGCCAGGAAGAGA AGA
CTGGGGTGGTGTCCACAGGCCTGATCCACAATGGAGACTGGACCTTCCAGACCCTGGTGA
TGCTGGAAACAGTTCCTCGGAGTGGAGAGGTTTACACCTGCCAAGTGGAGCACCCAA GCG
TGACAAGCCCTCTCACAGTGGAATGGAGAGCACGGTCTGAATCTGCACAGAGCAAGA TGC
TGAGTGGAGTCGGGGGCTTTGTGCTGGGCCTGCTCTTCCTTGGGGCCGGGCTGTTCA TCT
ACTTCAGGAATCAGAAAGGACACTCTGGACTTCAGCCAAGAGGATTCCTGAGCTGAA GTG
CAGATGACACATTCAAAGAAGAACTTTCTGCCCCAGCTTTGCAGGATGAAAAGCTTT CCC
TCCTGGCTGTTATTCTTCCACAAGAGAGGGCTTTCTCAGGACCTGGTTGCTACTGGT TCA
GCAACTGCAGAAAATGTCCTCCCTTGTGGCTTCCTCAGCTCCTGTTCTTGGCCTGAA GCC
CCACAGCTTTGATGGCAGTGCCTCATCTTCAACTTTTGTGCTCCCCTTTGCCTAAAC CCT
ATGGCCTCCTGTGCATCTGTACTCACCCTGTACCA
(SEQ ID NO: 35)
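The percent-identity thresholds recited in these definitions (e.g., at least 70% identity to a wild-type sequence) can be illustrated with a minimal sketch. The sketch assumes the two sequences have already been aligned to equal length by a separate alignment tool, and that gap columns count as mismatches. This section does not specify an alignment method or a gap convention, so both are assumptions made for illustration only. The function names `clean_sequence`, `percent_identity`, and `meets_threshold` are hypothetical.

```python
def clean_sequence(raw: str) -> str:
    """Strip whitespace and line breaks from a sequence listing and upper-case it."""
    return "".join(raw.split()).upper()


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Assumption: a column with a gap ('-') in exactly one sequence counts as a
    mismatch; columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In practice the aligned inputs would come from an established alignment program; the helper above only shows how a threshold such as 70% would be evaluated once an alignment is available.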
As used herein, the term “HLA-DRB5” refers to the gene encoding HLA class II histocompatibility antigen, DR beta 5 chain. The terms “HLA-DRB5” and "HLA class II histocompatibility antigen, DR beta 5 chain" include wild-type forms of the HLA-DRB5 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type HLA-DRB5. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type HLA-DRB5 nucleic acid sequence (e.g., SEQ ID NO: 36, ENA accession number M20429). SEQ ID NO: 36 is a wild-type gene sequence encoding HLA-DRB5 protein, and is shown below:
CCAGCATGGTGTGTCTGAAGCTCCCTGGAGGTTCCTACATGGCAAAGCTGACAGTGA CAC
TGATGGTGCTGAGCTCCCCACTGGCTTTGGCTGGGGACACCCGACCACGTTTCTTGC AGC
AGGATAAGTATGAGTGTCATTTCTTCAACGGGACGGAGCGGGTGCGGTTCCTGCACA GAG
ACATCTATAACCAAGAGGAGGACTTGCGCTTCGACAGCGACGTGGGGGAGTACCGGG CGG
T GACGGAGCT GGGGCGGCCT GACGCT GAGTACTGGAACAGCCAGAAGGACTTCCT GGAAG
ACAGGCGCGCCGCGGTGGACACCTACTGCAGACACAACTACGGGGTTGGTGAGAGCT TCA
CAGTGCAGCGGCGAGTTGAGCCTAAGGTGACTGTGTATCCTGCAAGGACCCAGACCC TGC
AGCACCACAACCTCCTGGTCTGCTCTGTGAATGGTTTCTATCCAGGCAGCATTGAAG TCA
GGTGGTTCCGGAACAGCCAGGAAGAGAAGGCTGGGGTGGTGTCCACAGGCCTGATTC AGA
ATGGAGACTGGACCTTCCAGACCCTGGTGATGCTGGAAACAGTTCCTCGAAGTGGAG AGG
TTTACACCTGCCAAGTGGAGCACCCAAGCGTGACGAGCCCTCTCACAGTGGAATGGA GAG
CACAGTCTGAATCTGCACAGAGCAAGATGCTGAGTGGAGTCGGGGGCTTTGTGCTGG GCC
TGCTCTTCCTTGGGGCCGGGCTATTCATCTACTTCAAGAATCAGAAAGGGCACTCTG GAC
TTCACCCAACAGGACTCGTGAGCTGAAGTGCAGATGACCACATTCAAGGGGGAACCT TCT
GCCCCAGCTTTGCATGATGAAAAGCTTTCCTGCTTGGCTCTTATTCTTCCACAAGAG AGG
ACTTTCTCAGGCCCTGGTTGCTACCGGTTCAGCAACTCTGCAGAAAATGTCCATCCT TGT
GGCTTCCTCAGCTCCTGCCCCTTGGCCTGAAGTCCCAGCATTGATGGCAGTGCCTCA TCT
TCAACTTTAGTGCTCCCCTTTACCTAACCCTACGGCCTCCCATGCATCTGTACTCCC CCT
GTGTGCCACAAATGCACTACGTTATTAAATTTTTCTGAAGCCCAGAGTTAAAAATCA TCT
GTCCACCTGGCTCCAAAGACAAAAAATAAAAA
(SEQ ID NO: 36)
As used herein, the term “IFIT1” refers to the gene encoding Interferon-induced protein with tetratricopeptide repeats 1. The terms “IFIT1” and "Interferon-induced protein with tetratricopeptide repeats 1" include wild-type forms of the IFIT1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IFIT1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IFIT1 nucleic acid sequence (e.g., SEQ ID NO: 37, ENA accession number X03557). SEQ ID NO: 37 is a wild-type gene sequence encoding IFIT1 protein, and is shown below:
CCAGATCTCAGAGGAGCCTGGCTAAGCAAAACCCTGCAGAACGGCTGCCTAATTTAC AGC AAC CAT G AGT AC AAAT G GT GAT GAT CAT C AG GT C AAG GAT AGTCT G G AGC AATT GAG AT G T C ACTTT AC AT G GG AGTT ATCC ATT G ATG ACG AT G AAAT G CCT G ATTT AG AAAAC AG AGT CTTGG AT CAG ATT G AATTCCTAGACACC AAAT ACAGTGTGGG AATACACAACCT ACT AGC CTATGTGAAACACCTGAAAGGCCAGAATGAGGAAGCCCTGAAGAGCTTAAAAGAAGCTGA AAACTTAAT GCAGGAAGAACAT GACAACCAAGCAAAT GT GAGGAGTCTGGT GACCT GGGG CAACTTT GCCT GG AT GT ATT ACC AC AT GGGCAGACT GGCAGAAGCCCAG ACTT ACCT GG A CAAGGTGGAGAACATTTGCAAGAAGCTTTCAAATCCCTTCCGCTATAGAATGGAGTGTCC AGAAAT AGACT GT GAGG AAGGATGGGCCTTGCT G AAGTGTGG AGGAAAG AATT AT GAACG GGCCAAGGCCTGCTTTGAAAAGGTGCTTGAAGTGGACCCTGAAAACCCTGAATCCAGCGC TGGGTATGCGATCTCTGCCTATCGCCTGGATGGCTTTAAATTAGCCACAAAAAATCACAA GCCATTTTCTTTGCTTCCCCTAAGGCAGGCTGTCCGCTTAAATCCAGACAATGGATATAT TAAGGTTCTCCTTGCCCTGAAGCTTCAGGATGAAGGACAGGAAGCTGAAGGAGAAAAGTA C ATT G AAG AAG CTCT AG CC AAC AT GTCCTC AC AG AC CTATGT CTTTCG AT AT G C AG CC AA GTTTTACCGAAGAAAAGGCTCTGTGGATAAAGCTCTTGAGTTATTAAAAAAGGCCTTGCA GGAAACACCCACTTCTGTCTTACTGCATCACCAGATAGGGCTTTGCTACAAGGCACAAAT GATCCAAATCAAGGAGGCTACAAAAGGGCAGCCTAGAGGGCAGAACAGAGAAAAGCTAGA CAAAAT GAT AAG AT CAGCCAT ATTT CATTTT G AAT CTGCAGTGG AAAAAAAGCCCACATT TGAGGTGGCTCATCTAGACCTGGCAAGAATGTATATAGAAGCAGGCAATCACAGAAAAGC T GAAGAG AATTTT CAAAAATT GTT AT GCAT G AAACC AGTGGTAGAAG AAAC AAT GCAAG A CAT AC ATTT CTACTATGGTCG GTTT CAG G AATTT C AAAAG AAAT CTG ACGT C AAT GC AAT T ATCC ATT ATTT AAAAG CTAT AAAAAT AG AAC AG GCAT C ATT AAC AAG GG AT AAAAGTAT CAATTCTTTGAAGAAATTGGTTTTAAGGAAACTTCGGAGAAAGGCATTAGATCTGGAAAG CTT GAGCCTCCTT GGGTTCGT CTACAAATT GG AAGGAAAT AT GAAT G AAGCCCT GGAGT A CT AT G AGCGGGCCCT GAG ACT GGCTGCT GACTTT G AGAACTCT GT GAGACAAGGTCCTT A G GC ACCC AG AT AT C AGC C ACTTT C AC ATTT C ATTT CATTTT ATGCT AAC ATTT ACT AAT C AT CTTTT CTG CTT ACT GTTTT CAG AAAC ATT AT AATT C ACTGT AAT G ATGTAATT CTT G A AT AAT AAAT CT G AC AAAAT ATT
(SEQ ID NO: 37)
As used herein, the term “IFIT3” refers to the gene encoding Interferon-induced protein with tetratricopeptide repeats 3. The terms “IFIT3” and "Interferon-induced protein with tetratricopeptide repeats 3" include wild-type forms of the IFIT3 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IFIT3. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IFIT3 nucleic acid sequence (e.g., SEQ ID NO: 38, ENA accession number AF026939). SEQ ID NO: 38 is a wild-type gene sequence encoding IFIT3 protein, and is shown below:
GTGGAAACCTCTTCAGCATTTGCTTGGAATCAGTAAGCTAAAAACAAAATCAACCGG GAC
CCCAGCTTTTCAGAACTGCAGGGAAACAGCCATCATGAGTGAGGTCACCAAGAATTC CCT
GGAGAAAATCCTCCCACAGCTGAAATGCCATTTCACCTGGAACTTATTCAAGGAAGA CAG
TGTCT C AAGG G AT CTAG AAG AT AG AGT GTGT AAC C AG ATT G AATTTTT AAAC ACT G AGTT
CAAAGCTACAATGTACAACTTGTTGGCCTACATAAAACACCTAGATGGTAACAACGA GGC
AGCCCTGGAATGCTTACGGCAAGCTGAAGAGTTAATCCAGCAAGAACATGCTGACCA AGC
AGAAATCAGAAGTCTAGTCACTTGGGGAAACTACGCCTGGGTCTACTATCACTTGGG CAG
ACTCTCAGATGCTCAGATTTATGTAGATAAGGTGAAACAAACCTGCAAGAAATTTTC AAA
TCCATACAGT ATT G AGT ATT CT GAACTT G ACTGTG AGGAAGGGTGGACAC AACTGAAGTG
TGGAAGAAATGAAAGGGCGAAGGTGTGTTTTGAGAAGGCTCTGGAAGAAAAGCCCAA CAA
CCCAGAATTCTCCTCTGGACTGGCAATTGCGATGTACCATCTGGATAATCACCCAGA GAA
ACAGTTCTCTACTGATGTTTTGAAGCAGGCCATTGAGCTGAGTCCTGATAACCAATA CGT
CAAGGTTCTCTTGGGCCTGAAACTGCAGAAGATGAATAAAGAAGCTGAAGGAGAGCA GTT
TGTTGAAGAAGCCTTGGAAAAGTCTCCTTGCCAAACAGATGTCCTCCGCAGTGCAGC CAA
ATTTT AC AG AAG AAAAG GT G ACCT AG AC AAAG CT ATT G AACTGTTT C AACG G GTGTTGG A
ATCCACACCAAACAATGGCTACCTCTATCACCAGATTGGGTGCTGCTACAAGGCAAA AGT
AAG ACAAATGCAG AAT ACAGG AG AAT CT G AAGCT AGT GG AAATAAAG AGAT GATT G AAGC
ACT AAAG C AAT ATGCTATG G ACT ATTCG AAT AAAG CTCTT GAG AAGG G ACT G AATCCT CT
G AATGC AT ACTCCG AT CTCGCT GAGTTCCTGGAGACGG AATGTT AT CAG ACACCATT CAA
TAAGGAAGTCCCTGATGCTGAAAAGCAACAATCCCATCAGCGCTACTGCAACCTTCA GAA
ATATAATGGGAAGTCTGAAGACACTGCTGTGCAACATGGTTTAGAGGGTTTGTCCAT AAG
C AAAAAAT C AACTG AC AAG G AAG AG AT C AAAG ACC AAC C AC AG AATGTATCC G AAAAT CT
G CTTCC AC AAAAT G C ACC AAATT ATT G GTATCTT C AAG GATT AATT CAT AAGC AG AAT GG
AGATCTGCTGCAAGCAGCCAAATGTTATGAGAAGGAACTGGGCCGCCTGCTAAGGGA TGC
CCCTTCAGGCATAGGCAGTATTTTCCTGTCAGCATCTGAGCTTGAGGATGGTAGTGA GGA
AATGGGCCAGGGCGCAGTCAGCTCCAGTCCCAGAGAGCTCCTCTCTAACTCAGAGCA ACT
GAACTGAGACAGAGGAGGAAAACAGAGCATCAGAAGCCTGCAGTGGTGGTTGTGACG GGT
AGGAGGATAGGAAGACAGGGGGCCCCAACCTGGGATTGCTGAGCAGGGAAGCTTTGC ATG
TTGCTCTAAGGTACATTTTTAAAGAGTTGTTTTTTGGCCGGGCGCAGTGGCTCATGC CTG
TAATCCCAGCACTTTGGGAGGCCGAGGTGGGCGGATCACGAGGTCTGGAGTTTGAGA CCA
TCCTGGCTAACACAGTGAAATCCCGTCTCTACTAAAAATACAAAAAATTAGCCAGGC GTG
GTGGCTGGCACCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGA ACC
TGGAAGGAAGAGGTTGCAGTGAGCCAAGATTGCGCCCCTGCACTCCAGCCTGGGCAACAG
AGCAAGACTC
(SEQ ID NO: 38)
As used herein, the term “IFITM3” refers to the gene encoding Interferon Induced Transmembrane Protein. The terms “IFITM3” and "Interferon Induced Transmembrane Protein" include wild-type forms of the IFITM3 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IFITM3. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IFITM3 nucleic acid sequence (e.g., SEQ ID NO: 39, NCBI Reference Sequence: NM_021034.2). SEQ ID NO: 39 is a wild-type gene sequence encoding IFITM3 protein, and is shown below:
AGGAAAAGGAAACTGTTGAGAAACCGAAACTACTGGGGAAAGGGAGGGCTCACTGAG AACCATCCC
AGTAACCCGACCGCCGCTGGTCTTCGCTGGACACCATGAATCACACTGTCCAAACCT TCTTCTCTCC
TGTCAACAGTGGCCAGCCCCCCAACTATGAGATGCTCAAGGAGGAGCACGAGGTGGC TGTGCTGG
GGGCGCCCCACAACCCTGCTCCCCCGACGTCCACCGTGATCCACATCCGCAGCGAGA CCTCCGTG
CCCGACCATGTCGTCTGGTCCCTGTTCAACACCCTCTTCATGAACCCCTGCTGCCTG GGCTTCATAG
CATTCGCCTACTCCGTGAAGTCTAGGGACAGGAAGATGGTTGGCGACGTGACCGGGG CCCAGGCC
TATGCCTCCACCGCCAAGTGCCTGAACATCTGGGCCCTGATTCTGGGCATCCTCATG ACCATTCTGC
TCATCGTCATCCCAGTGCTGATCTTCCAGGCCTATGGATAGATCAGGAGGCATCACT GAGGCCAGG
AGCTCTGCCCATGACCTGTATCCCACGTACTCCAACTTCCATTCCTCGCCCTGCCCC CGGAGCCGA
GTCCTGTATCAGCCCTTTATCCTCACACGCTTTTCTACAATGGCATTCAATAAAGTG CACGTGTTTCT
G GTG CT AAAAAAAAAA
(SEQ ID NO: 39)
As used herein, the term “IFNAR1” refers to the gene encoding Interferon alpha/beta receptor 1. The terms “IFNAR1” and "Interferon alpha/beta receptor 1" include wild-type forms of the IFNAR1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IFNAR1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IFNAR1 nucleic acid sequence (e.g., SEQ ID NO: 40, ENA accession number J03171). SEQ ID NO: 40 is a wild-type gene sequence encoding IFNAR1 protein, and is shown below:
TTAGGACGGGGCGATGGCGGCTGAGAGGAGCTGCGCGTGCGCGAACATGTAACTGGT GGG
ATCTGCGGCGGCTCCCAGATGATGGTCGTCCTCCTGGGCGCGACGACCCTAGTGCTC GTC
GCCGTGGGCCCATGGGTGTTGTCCGCAGCCGCAGGTGGAAAAAATCTAAAATCTCCT CAA
AAAGTAGAGGTCGACATCATAGATGACAACTTTATCCTGAGGTGGAACAGGAGCGAT GAG
TCTGTCGGGAATGTGACTTTTTCATTCGATTATCAAAAAACTGGGATGGATAATTGG ATA
AAATTGTCTGGGTGTCAGAATATTACTAGTACCAAATGCAACTTTTCTTCACTCAAG CTG
AATGTTTATGAAGAAATTAAATTGCGTATAAGAGCAGAAAAAGAAAACACTTCTTCATGG
TATGAGGTTGACTCATTTACACCATTTCGCAAAGCTCAGATTGGTCCTCCAGAAGTACAT
TTAGAAGCTGAAGATAAGGCAATAGTGATACACATCTCTCCTGGAACAAAAGATAGT GTT
ATGTGGGCTTTGGATGGTTTAAGCTTTACATATAGCTTACTTATCTGGAAAAACTCT TCA
G GTGTAG AAG AAAGG ATT G AAAAT ATTT ATTCC AG AC AT AAAATTT AT AAACT CT C ACC A
GAG ACT ACTT ATT GTCT AAAAGTT AAAG C AGC ACTACTTACGTC AT GG AAAATT G GTGTC
T ATAGTCCAGT AC ATTGTATAAAG ACC ACAGTT G AAAAT G AACT ACCTCCACCAGAAAAT
ATAGAAGTCAGTGTCCAAAATCAGAACTATGTTCTTAAATGGGATTATACATATGCA AAC
ATGACCTTTCAAGTTCAGTGGCTCCACGCCTTTTTAAAAAGGAATCCTGGAAACCAT TTG
TATAAATGGAAACAAATACCTGACTGTGAAAATGTCAAAACTACCCAGTGTGTCTTT CCT
CAAAACGTTTTCCAAAAAGGAATTTACCTTCTCCGCGTACAAGCATCTGATGGAAAT AAC
AC AT CTTTTT GGTCT G AAG AG AT AAAGTTT G ATACT G AAAT AC AAG CTTT CCTACTTCCT
CCAGTCTTTAACATTAGATCCCTTAGTGATTCATTCCATATCTATATCGGTGCTCCA AAA
CAGTCTGGAAACACGCCTGTGATCCAGGATTATCCACTGATTTATGAAATTATTTTT TGG
G AAAAC ACTT CAAAT GCT G AG AG AAAAATTATCG AGAAAAAAACT GAT GTT AC AGTTCCT
AATTT G AAACCACT GACT GT ATATT GT GT GAAAGCCAGAGCACACACCAT GG AT G AAAAG
CT GAATAAAAGCAGT GTTTTTAGT GACGCT GTAT GT GAGAAAACAAAACCAGGAAATACC
TCTAAAATTTGGCTTATAGTTGGAATTTGTATTGCATTATTTGCTCTCCCGTTTGTC ATT
TATG CT G CG AAAGT CTTCTT GAG AT G CAT C AATT ATGTCTT CTTTCC AT C ACTT AAACCT
TCTTCCAGTATAGATGAGTATTTCTCTGAACAGCCATTGAAGAATCTTCTGCTTTCA ACT
TCT G AGG AAC AAATCG AAAAAT GTTT CAT AATT G AAAAT AT AAGC AC AATT GCT AC AGT A
G AAG AAACT AAT C AAACT GAT G AAG AT CAT AAAAAAT AC AGTTCCC AAACT AG CC AAG AT
T CAGGAAATT ATTCT AAT G AAGAT GAAAGCG AAAGT AAAAC AAGTG AAG AACT ACAGC AG
GACTTTGTATGACCAGAAATGAACTGTGTCAAGTATAAGGTTTTTCAGCAGGAGTTA CAC
TGGGAGCCTGAGGTCCTCACCTTCCTCTCAGTAACTACAGAGAGGACGTTTCCTGTT TAG
GGAAAGAAAAAACATCTTCAGATCATAGGTCCTAAAAATACGGGCAAGCTCTTAACT ATT
TAAAAATGAAATTACAGGCCCGGGCACGGTGGCTCACACCTGTAATCCCAGCACTTT GGG
AGGCTGAGGCAGGCAGATCATGAGGTCAAGAGATCGAGACCAGCCTGGCCAACGTGG TGA
AACCCCATCTCTACTAAAAATACAAAAATTAGCCGGGTAGTAGGTAGGCGCGCGCCT GTT
GTCTTAGCTACTCAGGAGGCTGAGGCAGGAGAATCGCTTGAAAACAGGAGGTGGAGG TTG
CAGT GAGCCG AG AT CACGCCACT GCACTCCAGCCT GGTG ACAGCGTGAG ACTCTTT AAAA
AAAG AAATT AAAAGAGTT GAGACAAACGTTTCCTACATT CTTTTCCATGTGTAAAAT CAT
GAAAAAGCCTGTCACCGGACTTGCATTGGATGAGATGAGTCAGACCAAAACAGTGGC CAC
CCGTCTTCCTCCTGTGAGCCTAAGTGCAGCCGTGCTAGCTGCGCACCGTGGCTAAGG ATG
ACGTCTGTGTTCCTGTCCATCACTGATGCTGCTGGCTACTGCATGTGCCACACCTGT CTG
TTCGCCATTCCTAACATTCTGTTTCATTCTTCCTCGGGAGATATTTCAAACATTTGG TCT
TTTCTTTTAACACTGAGGGTAGGCCCTTAGGAAATTTATTTAGGAAAGTCTGAACAC GTT
ATCACTTGGTTTTCTGGAAAGTAGCTTACCCTAGAAAACAGCTGCAAATGCCAGAAA GAT
GATCCCTAAAAATGTTGAGGGACTTCTGTTCATTCATCCCGAGAACATTGGCTTCCA CAT
CACAGTATCTACCCTTACATGGTTTAGGATTAAAGCCAGGCAATCTTTTACTATG
(SEQ ID NO: 40)
As used herein, the term “IFNAR2” refers to the gene encoding Interferon alpha/beta receptor 2. The terms “IFNAR2” and "Interferon alpha/beta receptor 2" include wild-type forms of the IFNAR2 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IFNAR2. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IFNAR2 nucleic acid sequence (e.g., SEQ ID NO: 41, ENA accession number X77722). SEQ ID NO: 41 is a wild-type gene sequence encoding IFNAR2 protein, and is shown below:
GCTTTTGTCCCCCGCCCGCCGCTTCTGTCCGAGAGGCCGCCCGCGAGGCGCATCCTG ACC
GCGAGCGTCGGGTCCCAGAGCCGGGCGCGGCTGGGGCCCGAGGCTAGCATCTCTCGG GAG
CCGCAAGGCGAGAGCTGCAAAGTTTAATTAGACACTTCAGAATTTTGATCACCTAAT GTT
G ATTTCAGATGTAAAAGTCAAG AG AAG ACT CT AAAAAT AGCAAAG AT GCTTTT GAGCCAG
AATGCCTTCATCGTCAGATCACTTAATTTGGTTCTCATGGTGTATATCAGCCTCGTG TTT
G GT ATTT CAT AT G ATTCG CCT GATT AC AC AG AT G AAT CTTG C ACTTT C AAG AT AT CATT G
CGAAATTTCCGGTCCATCTTATCATGGGAATTAAAAAACCACTCCATTGTACCAACT CAC
T AT AC ATT GCTGTAT AC AAT CAT G AGT AAACC AG AAG ATTT G AAGGTG GTT AAG AACT GT
GCAAATACCACAAGATCATTTTGTGACCTCACAGATGAGTGGAGAAGCACACACGAG GCC
T ATGTCACCGTCCT AG AAGGATTCAGCGGG AAC ACAACGTT GTT CAGTT GCT CAC ACAAT
TTCTGGCTGGCCATAGACATGTCTTTTGAACCACCAGAGTTTGAGATTGTTGGTTTT ACC
AAC CAC ATT AAT GT GAT G GTG AAATTTCC AT CT ATTGTT GAG G AAG AATT AC AGTTT GAT
TTAT CTCTCGT CATT G AAG AACAGT CAG AGGGAATT GTT AAGAAGCAT AAACCCG AAATA
AAAG G AAAC AT G AGTG G AAATTT CAC CTATAT CATT G AC AAGTT AATT C C AAAC ACG AAC
TACTGTGTATCTGTTTATTTAGAGCACAGTGATGAGCAAGCAGTAATAAAGTCTCCC TTA
AAATGCACCCTCCTTCCACCTGGCCAGGAATCAGAATCAGCAGAATCTGCCAAAATA GGA
GGAATAATT ACT GT GTTTTT GATAGCATT GGTCTT G ACAAGC ACC AT AGTGACACT G AAA
TGGATTGGTTATATATGCTTAAGAAATAGCCTCCCCAAAGTCTTGAGGCAAGGTCTC ACT
AAGGGCTGGAATGCAGTGGCTATTCACAGGTGCAGTCATAATGCACTACAGTCTGAA ACT
CCTGAGCTCAAACAGTCGTCCTGCCTAAGCTTCCCCAGTAGCTGGGATTACAAGCGT GCA
TCCCT GT GCCCCAGT GATTAAGTTTTATT AT GT AGAAAAT AAAG AG C AAAC AGTTAC AAA
AGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
(SEQ ID NO: 41)
As used herein, the term “IGF1” refers to the gene encoding Insulin-like growth factor I. The terms “IGF1” and "Insulin-like growth factor I" include wild-type forms of the IGF1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IGF1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IGF1 nucleic acid sequence (e.g., SEQ ID NO: 42, ENA accession number X00173). SEQ ID NO: 42 is a wild-type gene sequence encoding IGF1 protein, and is shown below:
CTTCAGAAGCAATGGGAAAAATCAGCAGTCTTCCAACCCAATTATTTAAGTGCTGCTTTT
GTGATTTCTTGAAGGTGAAGATGCACACCATGTCCTCCTCGCATCTCTTCTACCTGG CGC
TGTGCCTGCTCACCTTCACCAGCTCTGCCACGGCTGGACCGGAGACGCTCTGCGGGG CTG
AGCTGGTGGATGCTCTTCAGTTCGTGTGTGGAGACAGGGGCTTTTATTTCAACAAGC CCA
CAGGGTATGGCTCCAGCAGTCGGAGGGCGCCTCAGACAGGTATCGTGGATGAGTGCT GCT
TCCGGAGCTGTGATCTAAGGAGGCTGGAGATGTATTGCGCACCCCTCAAGCCTGCCA AGT
CAGCTCGCTCTGTCCGTGCCCAGCGCCACACCGACATGCCCAAGACCCAGAAGGAAG TAC
ATTT GAAG AACGCAAGT AG AGGG AGTGCAGG AAACAAG AACT AC AGGAT GT AGG AAG ACC
CTCCTGAGGAGTGAAGAGTGACATGCCACCGCAGGATCCTTTGCTCTGCACGAGTTA CCT
GTTAAACTTTGGAACACCTACCAAAAAATAAGTTTGATAACATTTAAAAGATGGGCG TTT
CCCCCAATGAAATACACAAGTAAACATTCCAACATTGTCTTTAGGAGTGATTTGCAC CTT
G C AAAAAT G GTCCTG G AGTT G GT AG ATT G CTGTTG AT CTTTT AT C AAT AATGTT CTAT AG
AAAAG
(SEQ ID NO: 42)
As used herein, the term “IL10RA” refers to the gene encoding Interleukin-10 receptor subunit alpha. The terms “IL10RA” and "Interleukin-10 receptor subunit alpha" include wild-type forms of the IL10RA gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IL10RA. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IL10RA nucleic acid sequence (e.g., SEQ ID NO: 43, ENA accession number U00672). SEQ ID NO: 43 is a wild-type gene sequence encoding IL10RA protein, and is shown below:
AAAGAGCTGGAGGCGCGCAGGCCGGCTCCGCTCCGGCCCCGGACGATGCGGCGCGCC CAG
GATGCTGCCGTGCCTCGTAGTGCTGCTGGCGGCGCTCCTCAGCCTCCGTCTTGGCTC AGA
CGCTCATGGGACAGAGCTGCCCAGCCCTCCGTCTGTGTGGTTTGAAGCAGAATTTTT CCA
CCACATCCTCCACTGGACACCCATCCCAAATCAGTCTGAAAGTACCTGCTATGAAGTGGC
GCTCCTGAGGTATGGAATAGAGTCCTGGAACTCCATCTCCAACTGTAGCCAGACCCT GTC
CTATGACCTTACCGCAGTGACCTTGGACCTGTACCACAGCAATGGCTACCGGGCCAG AGT
GCGGGCTGTGGACGGCAGCCGGCACTCCAACTGGACCGTCACCAACACCCGCTTCTC TGT
GGAT GAAGTG ACT CT GACAGTT GGCAGTGTG AACCT AGAG ATCCACAAT GGCTT CATCCT
CGGGAAGATTCAGCTACCCAGGCCCAAGATGGCCCCCGCGAATGACACATATGAAAG CAT
CTTCAGTCACTTCCGAGAGTATGAGATTGCCATTCGCAAGGTGCCGGGAAACTTCAC GTT
CACACACAAGAAAGTAAAACATGAAAACTTCAGCCTCCTAACCTCTGGAGAAGTGGG AGA
GTTCTGTGTCCAGGTGAAACCATCTGTCGCTTCCCGAAGTAACAAGGGGATGTGGTC TAA
AGAGGAGT GC AT CTCCCTCACCAGGCAGT ATTT CACCGT GACCAACGTCAT CAT CTTCTT
TGCCTTTGTCCTGCTGCTCTCCGGAGCCCTCGCCTACTGCCTGGCCCTCCAGCTGTA TGT
GCGGCGCCGAAAGAAGCTACCCAGTGTCCTGCTCTTCAAGAAGCCCAGCCCCTTCAT CTT
CATCAGCCAGCGTCCCTCCCCAGAGACCCAAGACACCATCCACCCGCTTGATGAGGA GGC
CTTTTTGAAGGTGTCCCCAGAGCTGAAGAACTTGGACCTGCACGGCAGCACAGACAG TGG
CTTTGGCAGCACCAAGCCATCCCTGCAGACTGAAGAGCCCCAGTTCCTCCTCCCTGA CCC
TCACCCCCAGGCTGACAGAACGCTGGGAAACGGGGAGCCCCCTGTGCTGGGGGACAGCTG
CAGTAGTGGCAGCAGCAATAGCACAGACAGCGGGATCTGCCTGCAGGAGCCCAGCCT GAG
CCCCAGCACAGGGCCCACCTGGGAGCAACAGGTGGGGAGCAACAGCAGGGGCCAGGA TGA
CAGTGGCATTGACTTAGTTCAAAACTCTGAGGGCCGGGCTGGGGACACACAGGGTGG CTC
GGCCTT GGGCCACCACAGTCCCCCGGAGCCT GAGGT GCCT GGGGAAGAAGACCCAGCTGC
T GTGGCATTCCAGGGTT ACCT G AGGC AG ACC AG ATGTGCT GAAG AG AAGGC AACCAAGAC
AGGCTGCCTGGAGGAAGAATCGCCCTTGACAGATGGCCTTGGCCCCAAATTCGGGAG ATG
CCTGGTTGATGAGGCAGGCTTGCATCCACCAGCCCTGGCCAAGGGCTATTTGAAACA GGA
TCCTCTAGAAATGACTCTGGCTTCCTCAGGGGCCCCAACGGGACAGTGGAACCAGCC CAC
TGAGGAATGGTCACTCCTGGCCTTGAGCAGCTGCAGTGACCTGGGAATATCTGACTG GAG
CTTTGCCCATGACCTTGCCCCTCTAGGCTGTGTGGCAGCCCCAGGTGGTCTCCTGGG CAG
CTTTAACTCAGACCTGGTCACCCTGCCCCTCATCTCTAGCCTGCAGTCAAGTGAGTG ACT
CGGGCTGAGAGGCTGCTTTTGATTTTAGCCATGCCTGCTCCTCTGCCTGGACCAGGA GGA
GGGCCCTGGGGCAGAAGTTAGGCACGAGGCAGTCTGGGCACTTTTCTGCAAGTCCAC TGG
GGCTGGCCCAGCCAGGCTGCAGGGCTGGTCAGGGTGTCTGGGGCAGGAGGAGGCCAA CTC
ACTGAACTAGT GCAGGGTATGT GGGT GGCACT GACCT GTTCT GTTGACTGGGGCCCT GCA
GACTCTGGCAGAGCTGAGAAGGGCAGGGACCTTCTCCCTCCTAGGAACTCTTTCCTG TAT
CATAAAGGATTATTTGCTCAGGGGAACCATGGGGCTTTCTGGAGTTGTGGTGAGGCC ACC
AGGCTGAAGTCAGCTCAGACCCAGACCTCCCTGCTTAGGCCACTCGAGCATCAGAGC TTC
CAGCAGGAGGAAGGGCTGTAGGAATGGAAGCTTCAGGGCCTTGCTGCTGGGGTCATT TTT
AGGGGAAAAAGGAGGATATGATGGTCACATGGGGAACCTCCCCTCATCGGGCCTCTG GGG
CAGGAAGCTTGTCACTG GAAG AT CTT AAG GTATAT ATTTT CTG G AC ACT C AAAC AC AT C A
TAATGGATTCACTGAGGGGAGACAAAGGGAGCCGAGACCCTGGATGGGGCTTCCAGC TCA
GAACCCATCCCTCTGGTGGGTACCTCTGGCACCCATCTGCAAATATCTCCCTCTCTC CAA
CAAATGGAGTAGCATCCCCCTGGGGCACTTGCTGAGGCCAAGCCACTCACATCCTCA CTT
TGCTGCCCCACCATCTTGCTGACAACTTCCAGAGAAGCCATGGTTTTTTGTATTGGT CAT
AACTCAGCCCTTTGGGCGGCCTCTGGGCTTGGGCACCAGCTCATGCCAGCCCCAGAG GGT
CAGGGTTGGAGGCCTGTGCTTGTGTTTGCTGCTAATGTCCAGCTACAGACCCAGAGG ATA
AGCCACTGGGCACTGGGCTGGGGTCCCTGCCTTGTTGGTGTTCAGCTGTGTGATTTT GGA
CTAGCCACTTGTCAGAGGGCCTCAATCTCCCATCTGTGAAATAAGGACTCCACCTTT AGG
GGACCCTCCATGTTTGCTGGGTATTAGCCAAGCTGGTCCTGGGAGAATGCAGATACT GTC
CGTGG ACT ACC AAGCTGGCTT GTTT CTT AT GCCAGAGGCT AACAG ATCCAATGGG AGTCC
ATGGTGTCATGCCAAGACAGTATCAGACACAGCCCCAGAAGGGGGCATTATGGGCCC TGC
CTCCCCATAGGCCATTTGGACTCTGCCTTCAAACAAAGGCAGTTCAGTCCACAGGCA TGG
AAGCTGTGAGGGGACAGGCCTGTGCGTGCCATCCAGAGTCATCTCAGCCCTGCCTTT CTC
TGGAGCATTCTGAAAACAGATATTCTGGCCCAGGGAATCCAGCCATGACCCCCACCC CTC
T GCCAAAGTACTCTTAGGT GCCAGTCT GGTAACT GAACTCCCTCTGGAGGCAGGCTT GAG
GGAGGATTCCTCAGGGTTCCCTTGAAAGCTTTATTTATTTATTTTGTTCATTTATTT ATT
GGAGAGGCAGCATTGCACAGTGAAAGAATTCTGGATATCTCAGGAGCCCCGAAATTC TAG
CTCTGACTTTGCTGTTTCCAGTGGTATGACCTTGGAGAAGTCACTTATCCTCTTGGA GCC
TCAGTTTCCTCATCTGCAGAATAATGACTGACTTGTCTAATTCATAGGGATGTGAGG TTC
TGCTGAGGAAATGGGTATGAATGTGCCTTGAACACAAAGCTCTGTCAATAAGTGATACAT
GTTTTTTATTCCAATAAATTGTCAAGACCACA
(SEQ ID NO: 43)
As used herein, the term “IL1A” refers to the gene encoding Interleukin-1 alpha. The terms “IL1A” and "Interleukin-1 alpha" include wild-type forms of the IL1A gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IL1A. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IL1A nucleic acid sequence (e.g., SEQ ID NO: 44, ENA accession number X02531). SEQ ID NO: 44 is a wild-type gene sequence encoding IL1A protein, and is shown below:
ATGGCCAAAGTTCCAGACATGTTTGAAGACCTGAAGAACTGTTACAGTGAAAATGAA GAA
GACAGTTCCTCCATTGATCATCTGTCTCTGAATCAGAAATCCTTCTATCATGTAAGC TAT
GGCCCACTCCATGAAGGCTGCATGGATCAATCTGTGTCTCTGAGTATCTCTGAAACC TCT
AAAACATCCAAGCTTACCTTCAAGGAGAGCATGGTGGTAGTAGCAACCAACGGGAAG GTT
CT GAAG AAG AG ACGGTT GAGTTT AAGCC AATCCAT CACT GAT GAT G ACCT GG AGGCCATC
GCCAATGACTCAGAGGAAGAAATCATCAAGCCTAGGTCAGCACCTTTTAGCTTCCTG AGC
AAT GT G AAAT AC AACTTT AT GAGG AT CAT CAAAT ACGAATTC ATCCT GAAT G ACGCCCT C
AATCAAAGTATAATTCGAGCCAATGATCAGTACCTCACGGCTGCTGCATTACATAAT CTG
GAT GAAGC AGTG AAATTT GACATGGGTGCTTATAAGT CAT CAAAGGAT GAT GCT AAAATT
ACCGT GATTCT AAG AAT CT CAAAAACT CAATT GT AT GT GACTGCCCAAGAT G AAGACCAA
CCAGT GCTGCT G AAGGAG AT GCCT GAG AT ACCCAAAACC AT CACAGGT AGTG AG ACC AAC
CTCCTCTTCTTCTGGGAAACTCACGGCACTAAGAACTATTTCACATCAGTTGCCCAT CCA
AACTTGTTTATTGCCACAAAGCAAGACTACTGGGTGTGCTTGGCAGGGGGGCCACCC TCT
ATCACTGACTTTCAGATACTGGAAAACCAGGCGTAGGTCTGGAGTCTCACTTGTCTC ACT
TGTGCAGTGTTGACAGTTCATATGTACCATGTACATGAAGAAGCTAAATCCTTTACT GTT
AGT CATTT GCT GAG CAT GTACT G AG CCTT GT AATT CT AAAT GAAT GTTT AC ACT CTTTGT
AAG AGT G G AAC C AAC ACTAAC AT AT AAT GTTGTT ATTT AAAG AACAC C CTAT ATTTT GCA
T AGTACC AAT C ATTTT AATT ATT ATT CTT CAT AAC AATTTT AG GAGG AC C AG AGCT ACTG
ACTATGGCTACCAAAAAGACTCTACCCATATTACAGATGGGCAAATTAAGGCATAAG AAA
ACTAAGAAATATGCACAATAGCAGTTGAAACAAGAAGCCACAGACCTAGGATTTCAT GAT
TT CATTT C AACT GTTT GCCTTCTG CTTTT AAGTTGCTG AT G AACT CTT AAT CAAAT AG C A
TAAGTTTCTGGGACCTCAGTTTTATCATTTTCAAAATGGAGGGAATAATACCTAAGC CTT
CCTGCCGCAACAGTTTTTTATGCTAATCAGGGAGGTCATTTTGGTAAAATACTTCTC GAA
GCCGAGCCTCAAGATGAAGGCAAAGCACGAAATGTTATTTTTT AATT ATT ATTTATATAT
GTATTT AT AAAT AT ATTT AAG AT AATT AT AAT ATACTAT ATTT AT G GG AACC CCTT CAT C
CTCTGAGTGTGACCAGGCATCCTCCACAATAGCAGACAGTGTTTTCTGGGATAAGTA AGT
TTGATTTCATTAATACAGGGCATTTTGGTCCAAGTTGTGCTTATCCCATAGCCAGGA AAC
T CTGC ATT CTAGTACTT GGG AG ACCT GT AAT CAT AT AAT AAAT GT AC ATT AATT ACCTTG
AGCCAGTAATTGGTCCGATCTTTGACTCTTTTGCCATTAAACTTACCTGGGCATTCT TGT
TTCATTCAATTCCACCTGCAATCAAGTCCTACAAGCTAAAATTAGATGAACTCAACTTTG
ACAACCATAGACCACTGTTATCAAAACTTTCTTTTCTGGAATGTAATCAATGTTTCTTCT
AGGTTCTAAAAATTGTGATCAGACCATAATGTTACATTATTATCAACAATAGTGATTGAT
AGAGTGTTATCAGTCATAACTAAATAAAGCTTGCAAGTGAGGGAGTCATTTCATTGGCGT
TTGAGTCAGCAAAGAAGTCAAG
(SEQ ID NO: 44)
As used herein, the term “IL1B” refers to the gene encoding Interleukin-1 beta. The terms “IL1B” and "Interleukin-1 beta" include wild-type forms of the IL1B gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IL1B. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IL1B nucleic acid sequence (e.g., SEQ ID NO: 45, ENA accession number X02770). SEQ ID NO: 45 is a wild-type gene sequence encoding IL1B protein, and is shown below:
ACAAACCTTTTCGAGGCAAAAGGCAAAAAAGGCTGCTCTGGGATTCTCTTCAGCCAA TCT
TCAATGCTCAAGTGTCTGAAGCAGCCATGGCAGAAGTACCTAAGCTCGCCAGTGAAA TGA
TGGCTTATTACAGTGGCAATGAGGATGACTTGTTCTTTGAAGCTGATGGCCCTAAAC AGA
TGAAGTGCTCCTTCCAGGACCTGGACCTCTGCCCTCTGGATGGCGGCATCCAGCTAC GAA
TCTCCGACCACCACTACAGCAAGGGCTTCAGGCAGGCCGCGTCAGTTGTTGTGGCCA TGG
ACAAGCTGAGGAAGATGCTGGTTCCCTGCCCACAGACCTTCCAGGAGAATGACCTGA GCA
CCTTCTTTCCCTTCATCTTTGAAGAAGAACCTATCTTCTTCGACACATGGGATAACG AGG
CTTAT GT GCACGATGCACCT GTACGATCACT GAACT GCACGCTCCGGGACTCACAGCAAA
AAAGCTTGGTGATGTCTGGTCCATATGAACTGAAAGCTCTCCACCTCCAGGGACAGG ATA
TGGAGCAACAAGTGGTGTTCTCCATGTCCTTTGTACAAGGAGAAGAAAGTAATGACA AAA
TACCTGTGGCCTTGGGCCTCAAGGAAAAGAATCTGTACCTGTCCTGCGTGTTGAAAG ATG
ATAAGCCCACTCT ACAGCT GG AG AGTGT AG ATCCCAAAAATT ACCCAAAG AAGAAGATGG
AAAAGCG ATTT GT CTT CAACAAGAT AGAAAT CAATAACAAGCTGGAATTT GAGTCT GCCC
AGTTCCCCAACTGGTACATCAGCACCTCTCAAGCAGAAAACATGCCCGTCTTCCTGG GAG
GGACCAAAGGCGGCCAGGATATAACTGACTTCACCATGCAATTTGTGTCTTCCTAAA GAG
AGCTGTACCCAGAGAGTCCTGTGCTGAATGTGGACTCAATCCCTAGGGCTGGCAGAA AGG
GAACAGAAAGGTTTTTGAGTACGGCTATAGCCTGGACTTTCCTGTTGTCTACACCAA TGC
CCAACTGCCTGCCTTAGGGTAGTGCTAAGAGGATCTCCTGTCCATCAGCCAGGACAG TCA
GCTCTCTCCTTTCAGGGCCAATCCCAGCCCTTTTGTTGAGCCAGGCCTCTCTCACCT CTC
CTACTCACTTAAAGCCCGCCTGACAGAAACCAGGCCACATTTTGGTTCTAAGAAACC CTC
CTCTGTCATTCGCTCCCACATTCTGATGAGCAACCGCTTCCCTATTTATTTATTTAT TTG
TTTGTTTGTTTTGATTCATTGGTCTAATTTATTCAAAGGGGGCAAGAAGTAGCAGTG TCT
GTAAAAGAGCCTAGTTTTTAATAGCTATGGAATCAATTCAATTTGGACTGGTGTGCT CTC
TTT AAAT C AAGTCCTTT AATT AAG ACT G AAAAT ATATAAGCT C AG ATT ATTT AAAT G GG A
AT ATTT AT AAAT G AGC AAAT AT CAT ACTGTT C AAT G GTTCT C AAAT AAACTT C ACT
(SEQ ID NO: 45)
As used herein, the term “IL1RAP” refers to the gene encoding Interleukin-1 receptor accessory protein. The terms “IL1RAP” and "Interleukin-1 receptor accessory protein" include wild-type forms of the IL1RAP gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type IL1RAP. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type IL1RAP nucleic acid sequence (e.g., SEQ ID NO: 46, ENA accession number AF029213). SEQ ID NO: 46 is a wild-type gene sequence encoding IL1RAP protein, and is shown below:
T CTCAAAGG AT G ACACTT CT GT GGT GT GT AGTG AGTCT CTACTTTT AT GGAATCCTGCAA
AGTGATGCCTCAGAACGCTGCGATGACTGGGGACTAGACACCATGAGGCAAATCCAA GTG
TTTGAAGATGAGCCAGCTCGCATCAAGTGCCCACTCTTTGAACACTTCTTGAAATTC AAC
TACAGCACAGCCCATTCAGCTGGCCTTACTCTGATCTGGTATTGGACTAGGCAGGAC CGG
GACCTTGAGGAGCCAATTAACTTCCGCCTCCCCGAGAACCGCATTAGTAAGGAGAAA GAT
GTGCTGTGGTTCCGGCCCACTCTCCTCAATGACACTGGCAACTATACCTGCATGTTA AGG
AAC ACT ACAT ATTGCAGCAAAGTT GCATTTCCCTTGGAAGTT GTT CAAAAAG ACAGCT GT
TTCAATTCCCCCATGAAACTCCCAGTGCATAAACTGTATATAGAATATGGCATTCAG AGG
ATCACTTGTCCAAATGTAGATGGATATTTTCCTTCCAGTGTCAAACCGACTATCACT TGG
TATATGGGCTGTTATAAAATACAGAATTTTAATAATGTAATACCCGAAGGTATGAAC TTG
AGTTTCCTCATTGCCTTAATTTCAAATAATGGAAATTACACATGTGTTGTTACATAT CCA
GAAAATGGACGTACGTTTCATCTCACCAGGACTCTGACTGTAAAGGTAGTAGGCTCT CCA
AAAAATGCAGTGCCCCCTGTGATCCATTCACCTAATGATCATGTGGTCTATGAGAAA GAA
CCAGGAGAGGAGCTACTCATTCCCTGTACGGTCTATTTTAGTTTTCTGATGGATTCT CGC
AAT GAGGTTT GGTGGACCATT GAT GG AAAAAAACCT GAT G AC AT CACT ATT GATGTCACC
ATTAACGAAAGTAT AAGTCATAGT AGAACAG AAG AT G AAACAAG AACT CAGATTTT GAGC
ATCAAGAAAGTTACCTCTGAGGATCTCAAGCGCAGCTATGTCTGTCATGCTAGAAGT GCC
AAAGGCGAAGTTGCCAAAGCAGCCAAGGTGAAGCAGAAAGTGCCAGCTCCAAGATAC ACA
GTGGAACTGGCTTGTGGTTTTGGAGCCACAGTCCTGCTAGTGGTGATTCTCATTGTT GTT
TACCATGTTTACTGGCTAGAGATGGTCCTATTTTACCGGGCTCATTTTGGAACAGAT GAA
ACCATTTTAGATGGAAAAGAGTATGATATTTATGTATCCTATGCAAGGAATGCGGAA GAA
G AAGAATTT GT ATTACT GACCCTCCGT GG AGTTTT GG AG AAT G AATTT GG AT ACAAGCT G
TGCATCTTTGACCGAGACAGTCTGCCTGGGGGAATTGTCACAGATGAGACTTTGAGC TTC
ATTCAGAAAAGCAGACGCCTCCTGGTTGTTCTAAGCCCCAACTACGTGCTCCAGGGA ACC
CAAGCCCTCCTGGAGCTCAAGGCTGGCCTAGAAAATATGGCCTCTCGGGGCAACATC AAC
GTCATTTTAGTACAGTACAAAGCTGTGAAGGAAACGAAGGTGAAAGAGCTGAAGAGG GCT
AAGACGGTGCTCACGGTCATTAAATGGAAAGGGGAAAAATCCAAGTATCCACAGGGC AGG
TTCTGGAAGCAGCTGCAGGTGGCCATGCCAGTGAAGAAAAGTCCCAGGCGGTCTAGC AGT
GATGAGCAGGGCCTCTCGTATTCATCTTTGAAAAATGTATGAAAGGAATAATGAAAA GGA
(SEQ ID NO: 46)
As used herein, the term “INPP5D” refers to the gene encoding Phosphatidylinositol 3,4,5-trisphosphate 5-phosphatase 1. The terms “INPP5D” and "Phosphatidylinositol 3,4,5-trisphosphate 5-phosphatase 1" include wild-type forms of the INPP5D gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type INPP5D. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type INPP5D nucleic acid sequence (e.g., SEQ ID NO: 47, ENA accession number X98429). SEQ ID NO: 47 is a wild-type gene sequence encoding INPP5D protein, and is shown below:
GTTGCTGTCGCCGTTGCTGTCGGCCGAGGCCACCAAGAGGCAACGGGCGGCAGGTTG CAG
TGGAGGGGCCTCCGCTCCCCTCGGTGGTGTGTGGGTCCTGGGGGTGCCTGCCGGCCC AGC
CGAGGAGGCCCACGCCCACCATGGTCCCCTGCTGGAACCATGGCAACATCACCCGCT CCA
AGGCGGAGGAGCTGCTTTCCAGGACAGGCAAGGACGGGAGCTTCCTCGTGCGTGCCA GCG
AGTCCATCTCCCGGGCATACGCGCTCTGCGTGCTGTATCGGAATTGCGTTTACACTT ACA
GAATTCTGCCCAATGAAGATGATAAATTCACTGTTCAGGCATCCGAAGGCGTCTCCA TGA
GGTTCTTCACCAAGCTGGACCAGCTCATCGAGTTTTACAAGAAGGAAAACATGGGGC TGG
TGACCCATCTGCAATACCCTGTGCCGCTGGAGGAAGAGGACACAGGCGACGACCCTG AGG
AGGACACAGAAAGTGTCGTGTCTCCACCCGAGCTGCCCCCAAGAAACATCCCGCTGA CTG
CCAGCTCCTGTGAGGCCAAGGAGGTTCCTTTTTCAAACGAGAATCCCCGAGCGACCG AGA
CCAGCCGGCCGAGCCTCTCCGAGACATTGTTCCAGCGACTGCAAAGCATGGACACCA GTG
GGCTTCCAGAAGAGCATCTTAAGGCCATCCAAGATTATTTAAGCACTCAGCTCGCCC AGG
ACTCTGAATTTGTGAAGACAGGGTCCAGCAGTCTTCCTCACCTGAAGAAACTGACCA CAC
TGCTCTGCAAGGAGCTCTATGGAGAAGTCATCCGGACCCTCCCATCCCTGGAGTCTC TGC
AGAGGTTATTTGACCAGCAGCTCTCCCCGGGCCTCCGTCCACGTCCTCAGGTTCCTG GTG
AGGCCAATCCCATCAACATGGTGTCCAAGCTCAGCCAACTGACAAGCCTGTTGTCGT CCA
TTGAAGACAAGGTCAAGGCCTTGCTGCACGAGGGTCCTGAGTCTCCGCACCGGCCCT CCC
TTATCCCTCCAGTCACCTTTGAGGTGAAGGCAGAGTCTCTGGGGATTCCTCAGAAAA TGC
AGCTCAAAGTCGACGTTGAGTCTGGGAAACTGATCATTAAGAAGTCCAAGGATGGTT CTG
AGGACAAGTTCTACAGCCACAAGAAAATCCTGCAGCTGATTAAGTCACAGAAATTTC TGA
ATAAGTTGGTGATCTTGGTGGAAACGGAGAAGGAGAAGATCCTGCGGAAGGAATATG TTT
TTGCTGACTCCAAAAAGAGAGAAGGCTTCTGCCAGCTCCTGCAGCAGATGAAGAACA AGC
ACTCAGAGCAGCCGGAGCCCGACATGATCACCATCTTCATCGGCACCTGGAACATGG GTA
ACGCCCCCCCTCCCAAGAAGATCACGTCCTGGTTTCTCTCCAAGGGGCAGGGAAAGA CGC
GGGACGACTCTGCGGACTACATCCCCCATGACATTTACGTGATCGGCACCCAAGAGG ACC
CCCTGAGTGAGAAGGAGTGGCTGGAGATCCTCAAACACTCCCTGCAAGAAATCACCA GTG
TGACTTTTAAAACAGTCGCCATCCACACGCTCTGGAACATCCGCATCGTGGTGCTGG CCA
AGCCTGAGCACGAGAACCGGATCAGCCACATCTGTACTGACAACGTGAAGACAGGCA TTG
CAAACACACTGGGGAACAAGGGAGCCGTGGGGGTGTCGTTCATGTTCAATGGAACCT CCT
TAGGGTTCGTCAACAGCCACTTGACTTCAGGAAGTGAAAAGAAACTCAGGCGAAACC AAA
ACTATATGAACATTCTCCGGTTCCTGGCCCTGGGCGACAAGAAGCTGAGTCCCTTTA ACA
TCACTCACCGCTTCACGCACCTCTTCTGGTTTGGGGATCTTAACTACCGTGTGGATC TGC
CTACCTGGGAGGCAGAAACCATCATCCAGAAAATCAAGCAGCAGCAGTACGCAGACC TCC
TGTCCCACGACCAGCTGCTCACAGAGAGGAGGGAGCAGAAGGTCTTCCTACACTTCG AGG
AGGAAGAAATCACGTTTGCCCCAACCTACCGTTTTGAGAGACTGACTCGGGACAAAT ACG
CCTACACCAAGCAGAAAGCGACAGGGATGAAGTACAACTTGCCTTCCTGGTGTGACC GAG
TCCTCTGGAAGTCTTATCCCCTGGTGCACGTGGTGTGTCAGTCTTATGGCAGTACCA GCG
ACATCATGACGAGTGACCACAGCCCTGTCTTTGCCACATTTGAGGCAGGAGTCACTT CCC
AGTTTGTCTCCAAGAACGGTCCCGGGACTGTTGACAGCCAAGGACAGATTGAGTTTC TCA
GGTGCTATGCCACATTGAAGACCAAGTCCCAGACCAAATTCTACCTGGAGTTCCACT CGA
GCTGCTTGGAGAGTTTTGTCAAGAGTCAGGAAGGAGAAAATGAAGAAGGAAGTGAGG GGG
AGCTGGTGGTGAAGTTTGGTGAGACTCTTCCAAAGCTGAAGCCCATTATCTCTGACC CTG
AGTACCTGCTAGACCAGCACATCCTCATCAGCATCAAGTCCTCTGACAGCGACGAAT CCT
ATGGCGAGGGCTGCATTGCCCTTCGGTTAGAGGCCACAGAAACGCAGCTGCCCATCT ACA
CGCCTCTCACCCACCATGGGGAGTTGACAGGCCACTTCCAGGGGGAGATCAAGCTGC AGA
CCTCTCAGGGCAAGACGAGGGAGAAGCTCTATGACTTTGTGAAGACGGAGCGTGATG AAT
CCAGTGGGCCAAAGACCCTGAAGAGCCTCACCAGCCACGACCCCATGAAGCAGTGGG AAG
TCACTAGCAGGGCCCCTCCGTGCAGTGGCTCCAGCATCACTGAAATCATCAACCCCA ACT
ACATGGGAGTGGGGCCCTTTGGGCCACCAATGCCCCTGCACGTGAAGCAGACCTTGT CCC
CTGACCAGCAGCCCACAGCCTGGAGCTACGACCAGCCGCCCAAGGACTCCCCGCTGG GGC
CCTGCAGGGGAGAAAGTCCTCCGACACCTCCCGGCCAGCCGCCCATATCACCCAAGA AGT
TTTTACCCTCAACAGCAAACCGGGGTCTCCCTCCCAGGACACAGGAGTCAAGGCCCA GTG
ACCTGGGGAAGAACGCAGGGGACACGCTGCCTCAGGAGGACCTGCCGCTGACGAAGC CCG
AGATGTTTGAGAACCCCCTGTATGGGTCCCTGAGTTCCTTCCCTAAGCCTGCTCCCA GGA
AGGACCAGGAATCCCCCAAAATGCCGCGGAAGGAACCCCCGCCCTGCCCGGAACCCG GCA
TCTTGTCGCCCAGCATCGTGCTCACCAAAGCCCAGGAGGCTGATCGCGGCGAGGGGC CCG
GCAAGCAGGTGCCCGCGCCCCGGCTGCGCTCCTTCACGTGCTCATCCTCTGCCGAGG GCA
GGGCGGCCGGCGGGGACAAGAGCCAAGGGAAGCCCAAGACCCCGGTCAGCTCCCAGG CCC
CGGTGCCGGCCAAGAGGCCCATCAAGCCTTCCAGATCGGAAATCAACCAGCAGACCC CGC
CCACCCCGACGCCGCGGCCGCCGCTGCCAGTCAAGAGCCCGGCGGTGCTGCACCTCC AGC
ACTCCAAGGGCCGCGACTACCGCGACAACACCGAGCTCCCGTATCACGGCAAGCACC GGC
CGGAGGAGGGGCCACCAGGGCCTCTAGGCAGGACTGCCATGCAGTGAAGCCCTCAGT GAG
CTGCCACTGAGTCGGGAGCCCAGAGGAACGGCGTGAAGCCACTGGACCCTCTCCCGG GAC
CTCCTGCTGGCTCCTCCTGCCCAGCTTCCTATGCAAGGCTTTGTGTTTTCAGGAAAG GGC
CTAGCTTCTGTGTGGCCCACAGAGTTCACTGCCTGTGAGACTTAGCACCAAGTGCTG AGG
CTGGAAGAAAAACGCACACCAGACGGGCAACAAACAGTCTGGGTCCCCAGCTCGCTC TTG
GTACTTGGGACCCCAGTGCCTTGTTGAGGGCGCCATTCTGAAGAAAGGAACTGCAGC GCC
GATTTGAGGGTGGAGATATAGATAATAATAATATTAATAATAATAATGGCCACATGG ATC
GAACACTCATGGTGTGCCAAGTGCTGTGCTAAGTGCTTTACGAACATTCGTCATATC AGG
ATGACCTCGAGAGCTGAGGCTCTAGCACCTAAAACCACGTGCCCAAACCCACCAGTT TAA
AACGGTGTGTGTTCGGAGGGGTGAAAGCATTAAGAAGCCCAGTGCCCTCCTGGAGTG AGA
CAAGGGCTCGGCCTTAAGGAGCTGAAGAGTCTGGGTAGCTTGTTTAGGGTACAAGAA GCC
TGTTCTGTCCAGCTTCAGTGACACAAGCTGCTTTAGCTAAAGTCCCGCGGGTTCCGG CAT
GGCTAGGCTGAGAGCAGGGATCTACCTGGCTTCTCAGTTCTTTGGTTGGAAGGAGCA GGA
AATCAGCTCCTATTCTCCAGTGGAGAGATCTGGCCTCAGCTTGGGCTAGAGATGCCA AGG
CCTGTGCCAGGTTCCCTGTGCCCTCCTCGAGGTGGGCAGCCATCACCAGCCACAGTT AAG CCAAGCCCCCCAACATGTATTCCATCGTGCTGGTAGAAGAGTCTTTGCTGTTGCTCCCGA
AAGCCGTGCTCTCCAGCCTGGCTGCCAGGGAGGGTGGGCCTCTTGGTTCCAGGCTCT TGA
AATAGTGCAGCCTTTTCTTCCTATCTCTGTGGCTTTCAGCTCTGCTTCCTTGGTTAT TAG
GAGAATAGATGGGTGATGTCTTTCCTTATGTTGCTTTTTCAACATAGCAGAATTAAT GTA
GGGAGCTAAATCCAGTGGTGTGTGTGAATGCAGAAGGGAATGCACCCCACATTCCCA TGA
TGGAAGTCTGCGTAACCAATAAATTGTGCCTTTCTCACTCAAAACC
(SEQ ID NO: 47)
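The "at least 70% sequence identity" criterion recited in the definitions of this section can be illustrated with a short computation. The following Python sketch is an assumption made for illustration only: the text above does not specify an alignment algorithm or scoring scheme, so the sketch uses a basic Needleman-Wunsch global alignment with hypothetical scores (match +1, mismatch -1, gap -1) and reports identity as matching positions divided by alignment length. The function names `normalize`, `global_align`, and `percent_identity` are hypothetical. Established alignment tools may use different scoring and a different denominator and may therefore report different values.

```python
def normalize(seq: str) -> str:
    """Remove whitespace (e.g., listing line breaks and spacing) and upper-case."""
    return "".join(seq.split()).upper()


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two gapped strings.

    Scoring values are illustrative assumptions, not taken from the source text.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Percent identity = identical aligned positions / alignment length * 100."""
    a, b = normalize(a), normalize(b)
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    al_a, al_b = global_align(a, b)
    matches = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * matches / len(al_a)


if __name__ == "__main__":
    reference = "ACGTACGTAC"  # toy sequences, not taken from any SEQ ID NO
    candidate = "ACGTACGTAA"
    pid = percent_identity(reference, candidate)
    print(f"{pid:.1f}% identity; meets 70% threshold: {pid >= 70.0}")
```

The quadratic-memory table in this sketch is suitable only for short sequences; for full-length sequences such as those listed in this section, a dedicated alignment tool would be the more practical choice.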
As used herein, the term “ITGAM” refers to the gene encoding Integrin Subunit Alpha M. The terms “ITGAM” and "Integrin Subunit Alpha M" include wild-type forms of the ITGAM gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type ITGAM. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type ITGAM nucleic acid sequence (e.g., SEQ ID NO: 48, NCBI Reference Sequence: NM_000632.3). SEQ ID NO: 48 is a wild-type gene sequence encoding ITGAM protein, and is shown below:
TTTTCTGCCCTTCTTTGCTTTGGTGGCTTCCTTGTGGTTCCTCAGTGGTGCCTGCAA CCCCTGGTTCA
CCTCCTTCCAGGTTCTGGCTCCTTCCAGCCATGGCTCTCAGAGTCCTTCTGTTAACA GCCTTGACCT
TATGTCATGGGTTCAACTTGGACACTGAAAACGCAATGACCTTCCAAGAGAACGCAA GGGGCTTCGG
GCAGAGCGTGGTCCAGCTTCAGGGATCCAGGGTGGTGGTTGGAGCCCCCCAGGAGAT AGTGGCTG
CCAACCAAAGGGGCAGCCTCTACCAGTGCGACTACAGCACAGGCTCATGCGAGCCCA TCCGCCTGC
AGGTCCCCGTGGAGGCCGTGAACATGTCCCTGGGCCTGTCCCTGGCAGCCACCACCA GCCCCCCT
CAGCTGCTGGCCTGTGGTCCCACCGTGCACCAGACTTGCAGTGAGAACACGTATGTG AAAGGGCTC
TGCTTCCTGTTTGGATCCAACCTACGGCAGCAGCCCCAGAAGTTCCCAGAGGCCCTC CGAGGGTGT
CCT CAAGAGG AT AGT GACATT GCCTT CTT GATT GAT GGCTCTGGT AGCAT CATCCCACAT GACTTTCG
GCGGATGAAGGAGTTTGTCTCAACTGTGATGGAGCAATTAAAAAAGTCCAAAACCTT GTTCTCTTTGA
TGCAGTACTCTGAAGAATTCCGGATTCACTTTACCTTCAAAGAGTTCCAGAACAACC CTAACCCAAGA
TCACTGGTGAAGCCAATAACGCAGCTGCTTGGGCGGACACACACGGCCACGGGCATC CGCAAAGT
GGTACGAGAGCTGTTTAACATCACCAACGGAGCCCGAAAGAATGCCTTTAAGATCCT AGTTGTCATC
ACGGATGGAGAAAAGTTTGGCGATCCCTTGGGATATGAGGATGTCATCCCTGAGGCA GACAGAGAG
GGAGTCATTCGCTACGTCATTGGGGTGGGAGATGCCTTCCGCAGTGAGAAATCCCGC CAAGAGCTT
AATACCATCGCATCCAAGCCGCCTCGTGATCACGTGTTCCAGGTGAATAACTTTGAG GCTCTGAAGA
CCATTCAGAACCAGCTTCGGGAGAAGATCTTTGCGATCGAGGGTACTCAGACAGGAA GTAGCAGCT
CCTTTGAGCATGAGATGTCTCAGGAAGGCTTCAGCGCTGCCATCACCTCTAATGGCC CCTTGCTGAG
CACTGTGGGGAGCTATGACTGGGCTGGTGGAGTCTTTCTATATACATCAAAGGAGAA AAGCACCTTC
ATCAACATGACCAGAGTGGATTCAGACATGAATGATGCTTACTTGGGTTATGCTGCC GCCATCATCTT
ACGGAACCGGGTGCAAAGCCTGGTTCTGGGGGCACCTCGATATCAGCACATCGGCCT GGTAGCGAT
GTTCAGGCAGAACACTGGCATGTGGGAGTCCAACGCTAATGTCAAGGGCACCCAGAT CGGCGCCTA
CTTCGGGGCCTCCCTCTGCTCCGTGGACGTGGACAGCAACGGCAGCACCGACCTGGT CCTCATCG
GGGCCCCCCATTACTACGAGCAGACCCGAGGGGGCCAGGTGTCCGTGTGCCCCTTGC CCAGGGGG
AGGGCTCGGTGGCAGTGTGATGCTGTTCTCTACGGGGAGCAGGGCCAACCCTGGGGC CGCTTTGG GGCAGCCCTAACAGTGCTGGGGGACGTAAATGGGGACAAGCTGACGGACGTGGCCATTGG GGCCC CAGGAGAGGAGGACAACCGGGGTGCTGTTTACCTGTTTCACGGAACCTCAGGATCTGGCA TCAGCC CCTCCCATAGCCAGCGGATAGCAGGCTCCAAGCTCTCTCCCAGGCTCCAGTATTTTGGTC AGTCACT GAGTGGGGGCCAGGACCTCACAATGGATGGACTGGTAGACCTGACTGTAGGAGCCCAGGG GCACG TGCTGCTGCTCAGGTCCCAGCCAGTACTGAGAGTCAAGGCAATCATGGAGTTCAATCCCA GGGAAG TGGCAAGGAATGTATTTGAGTGTAATGATCAGGTGGTGAAAGGCAAGGAAGCCGGAGAGG TCAGAG T CTGCCTCCAT GTCCAGAAG AGCACACGGGATCGGCT AAG AG AAGGACAGATCCAGAGT GTTGTGA CTTATGACCTGGCTCTGGACTCCGGCCGCCCACATTCCCGCGCCGTCTTCAATGAGACAA AGAACA GCACACGCAGACAGACACAGGTCTTGGGGCTGACCCAGACTTGTGAGACCCTGAAACTAC AGTTGC CGAATTGCATCGAGGACCCAGTGAGCCCCATTGTGCTGCGCCTGAACTTCTCTCTGGTGG GAACGC CATTGTCTGCTTTCGGGAACCTCCGGCCAGTGCTGGCGGAGGATGCTCAGAGACTCTTCA CAGCCT TGTTTCCCTTTGAGAAGAATTGTGGCAATGACAACATCTGCCAGGATGACCTCAGCATCA CCTTCAGT TTCATGAGCCTGGACTGCCTCGTGGTGGGTGGGCCCCGGGAGTTCAACGTGACAGTGACT GTGAGA AATGATGGTGAGGACTCCTACAGGACACAGGTCACCTTCTTCTTCCCGCTTGACCTGTCC TACCGGA AGGTGTCCACGCTCCAGAACCAGCGCTCACAGCGATCCTGGCGCCTGGCCTGTGAGTCTG CCTCCT CCACCGAAGTGTCTGGGGCCTTGAAGAGCACCAGCTGCAGCATAAACCACCCCATCTTCC CGGAAA ACTCAGAGGTCACCTTTAATATCACGTTTGATGTAGACTCTAAGGCTTCCCTTGGAAACA AACTGCTC CTCAAGGCCAATGTGACCAGTGAGAACAACATGCCCAGAACCAACAAAACCGAATTCCAA CTGGAGC TGCCGGTGAAATATGCTGTCTACATGGTGGTCACCAGCCATGGGGTCTCCACTAAATATC TCAACTT CACGGCCTCAGAGAATACCAGTCGGGTCATGCAGCATCAATATCAGGTCAGCAACCTGGG GCAGAG GAGCCTCCCCATCAGCCTGGTGTTCTTGGTGCCCGTCCGGCTGAACCAGACTGTCATATG GGACCG CCCCCAGGTCACCTTCTCCGAGAACCTCTCGAGTACGTGCCACACCAAGGAGCGCTTGCC CTCTCA CTCCGACTTTCTGGCTGAGCTTCGGAAGGCCCCCGTGGTGAACTGCTCCATCGCTGTCTG CCAGAG AATCCAGTGTGACATCCCGTTCTTTGGCATCCAGGAAGAATTCAATGCTACCCTCAAAGG CAACCTC TCGTTTGACTGGTACATCAAGACCTCGCATAACCACCTCCTGATCGTGAGCACAGCTGAG ATCTTGT TTAACGATTCCGTGTTCACCCTGCTGCCGGGACAGGGGGCGTTTGTGAGGTCCCAGACGG AGACCA AAGTGGAGCCGTTCGAGGTCCCCAACCCCCTGCCGCTCATCGTGGGCAGCTCTGTCGGGG GACTG CTGCTCCTGGCCCTCATCACCGCCGCGCTGTACAAGCTCGGCTTCTTCAAGCGGCAATAC AAGGAC 
ATGATGAGTGAAGGGGGTCCCCCGGGGGCCGAACCCCAGTAGCGGCTCCTTCCCGACAGA GCTGC CTCTCGGTGGCCAGCAGGACTCTGCCCAGACCACACGTAGCCCCCAGGCTGCTGGACACG TCGGA CAGCGAAGTATCCCCGACAGGACGGGCTTGGGCTTCCATTTGTGTGTGTGCAAGTGTGTA TGTGCG TGTGTGCAAGTGTCTGTGTGCAAGTGTGTGCACATGTGTGCGTGTGCGTGCATGTGCACT TGCACG CCCATGTGTGAGTGTGTGCAAGTATGTGAGTGTGTCCAAGTGTGTGTGCGTGTGTCCATG TGTGTGC AAGTGTGTGCATGTGTGCGAGTGTGTGCATGTGTGTGCTCAGGGGCGTGTGGCTCACGTG TGTGAC TCAGATGTCTCTGGCGTGTGGGTAGGTGACGGCAGCGTAGCCTCTCCGGCAGAAGGGAAC TGCCT GGGCTCCCTTGTGCGTGGGTGAAGCCGCTGCTGGGTTTTCCTCCGGGAGAGGGGACGGTC AATCC TGTGGGTGAAGACAGAGGGAAACACAGCAGCTTCTCTCCACTGAAAGAAGTGGGACTTCC CGTCGC CTGCGAGCCTGCGGCCTGCTGGAGCCTGCGCAGCTTGGATGGAGACTCCATGAGAAGCCG TGGGT GGAACCAGG AACCTCCTCCAC ACC AGCGCT GATGCCCAAT AAAG AT GCCC ACT GAGG AAT GAT G AA G CTTCCTTT CTG GATT C ATTT ATT ATTT C AAT GT G ACTTT AATTTTTT G GAT G GAT AAG CTTGTCTATGG T AC AAAAAT CAC AAGG C ATT C AAGT GT AC AGT G AAAAGT CTCC CTTTCC AG AT ATT C AAGT C AC CTCC TTAAAGGTAGTCAAGATTGTGTTTTGAGGTTTCCTTCAGACAGATTCCAGGCGATGTGCA AGTGTATG C ACGTGTG C AC AC AC AC C AC AC AT AC AC AC AC AC AAG CTTTTTT AC AC AAAT G GT AGC AT ACTTT AT A TTGGTCTGTATCTTG CTTTTTTT C ACC AAT ATTT CT C AG AC ATCG GTT CAT ATT AAG AC AT AAATT ACTT TTTCATTCTTTTATACCGCTGCATAGTATTCCATTGTGTGAGTGTACCATAATGTATTTA ACCAGTCTT CTTTT GAT ATACT ATTTT C ATT CTCTTGTT ATT GC AT C AAT G CTG AGTT AAT AAAT C AAAT ATATGTC AT TTTTGCATATATGTAAGGATAA (SEQ ID NO: 48)
As used herein, the term “ITGAX” refers to the gene encoding Integrin alpha-X. The terms “ITGAX” and "Integrin alpha-X" include wild-type forms of the ITGAX gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type ITGAX. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type ITGAX nucleic acid sequence (e.g., SEQ ID NO: 49, ENA accession number M81695). SEQ ID NO: 49 is a wild-type gene sequence encoding ITGAX protein, and is shown below:
GAATTCCTGCCACTCTTCCTGCAACGGCCCAGGAGCTCAGAGCTCCACATCTGACCT TCT
AGTCATGACCAGGACCAGGGCAGCACTCCTCCTGTTCACAGCCTTAGCAACTTCTCT AGG
TTTCAACTTGGACACAGAGGAGCTGACAGCCTTCCGTGTGGACAGCGCTGGGTTTGG AGA
CAGCGTGGTCCAGTATGCCAACTCCTGGGTGGTGGTTGGAGCCCCCCAAAAGATAAC AGC
TGCCAACCAAACGGGTGGCCTCTACCAGTGTGGCTACAGCACTGGTGCCTGTGAGCC CAT
CGGCCTGCAGGTGCCCCCGGAGGCCGTGAACATGTCCCTGGGCCTGTCCCTGGCGTC TAC
CACCAGCCCTTCCCAGCTGCTGGCCTGCGGCCCCACCGTGCACCACGAGTGCGGGAG GAA
CATGTACCTCACCGGACTCTGCTTCCTCCTGGGCCCCACCCAGCTCACCCAGAGGCT CCC
GGTGTCCAGGCAGGAGTGCCCAAGACAGGAGCAGGACATTGTGTTCCTGATCGATGG CTC
AGGCAGCATCTCCTCCCGCAACTTTGCCACGATGATGAACTTCGTGAGAGCTGTGAT AAG
CCAGTTCCAGAGACCCAGCACCCAGTTTTCCCTGATGCAGTTCTCCAACAAATTCCA AAC
ACACTTCACTTTCGAGGAATTCAGGCGCACGTCAAACCCCCTCAGCCTGTTGGCTTC TGT
TCACCAGCTGCAAGGGTTTACATACACGGCCACCGCCATCCAAAATGTCGTGCACCG ATT
GTTCCATGCCTCATATGGGGCCCGTAGGGATGCCACCAAAATTCTCATTGTCATCAC TGA
TGGGAAGAAAGAAGGCGACAGCCTGGATTATAAGGATGTCATCCCCATGGCTGATGC AGC
AGGCATCATCCGCTATGCAATTGGGGTTGGATTAGCTTTTCAAAACAGAAATTCTTG GAA
AGAATTAAATGACATTGCATCGAAGCCCTCCCAGGAACACATATTTAAAGTGGAGGA CTT
TGATGCTCTGAAAGATATTCAAAACCAACTGAAGGAGAAGATCTTTGCCATTGAGGG TAC
GGAGACCACAAGCAGTAGCTCCTTCGAATTGGAGATGGCACAGGAGGGCTTCAGCGC TGT
GTTCACACCTGATGGCCCCGTTCTGGGGGCTGTGGGGAGCTTCACCTGGTCTGGAGG TGC
CTTCCTGTACCCCCCAAATATGAGCCCTACCTTCATCAACATGTCTCAGGAGAATGT GGA
CATGAGGGACTCTTACCTGGGTTACTCCACCGAGCTGGCCCTCTGGAAAGGGGTGCA GAG
CCTGGTCCTGGGGGCCCCCCGCTACCAGCACACCGGGAAGGCTGTCATCTTCACCCA GGT
GTCCAGGCAATGGAGGATGAAGGCCGAAGTCACGGGGACTCAGATCGGCTCCTACTT CGG
GGCCTCCCTCTGCTCCGTGGACGTAGACACCGACGGCAGCACCGACCTGGTCCTCAT CGG
GGCCCCCCATTACTACGAGCAGACCCGAGGGGGCCAGGTGTCTGTGTGTCCCTTGCC CAG GGGGTGGAGAAGGTGGTGGTGTGATGCTGTTCTCTACGGGGAGCAGGGCCACCCCTGGGG
TCGCTTTGGGGCGGCTCTGACAGTGCTGGGGGATGTGAATGGGGACAAGCTGACAGA CGT
GGTCATCGGGGCCCCAGGAGAGGAGGAGAACCGGGGTGCTGTCTACCTGTTTCACGG AGT
CTTGGGACCCAGCATCAGCCCCTCCCACAGCCAGCGGATCGCGGGCTCCCAGCTCTC CTC
CAGGCTGCAGTATTTTGGGCAGGCACTGAGCGGGGGTCAAGACCTCACCCAGGATGG ACT
GGTGGACCTGGCTGTGGGGGCCCGGGGCCAGGTGCTCCTGCTCAGGACCAGACCTGT GCT
CTGGGTGGGGGTGAGCATGCAGTTCATACCTGCCGAGATCCCCAGGTCTGCGTTTGA GTG
TCGGGAGCAGGTGGTCTCTGAGCAGACCCTGGTACAGTCCAACATCTGCCTTTACAT TGA
CAAACGTTCTAAGAACCTGCTTGGGAGCCGTGACCTCCAAAGCTCTGTGACCTTGGA CCT
GGCCCTCGACCCTGGCCGCCTGAGTCCCCGTGCCACCTTCCAGGAAACAAAGAACCG GAG
TCTGAGCCGAGTCCGAGTCCTCGGGCTGAAGGCACACTGTGAAAACTTCAACCTGCT GCT
CCCGAGCTGCGTGGAGGACTCTGTGACCCCCATTACCTTGCGTCTGAACTTCACGCT GGT
GGGCAAGCCCCTCCTTGCCTTCAGAAACCTGCGGCCTATGCTGGCCGCACTGGCTCA GAG
ATACTTCACGGCCTCCCTACCCTTTGAGAAGAACTGTGGAGCCGACCATATCTGCCA GGA
CAATCTCGGCATCTCCTTCAGCTTCCCAGGCTTGAAGTCCCTGCTGGTGGGGAGTAA CCT
GGAGCTGAACGCAGAAGTGATGGTGTGGAATGACGGGGAAGACTCCTACGGAACCAC CAT
CACCTTCTCCCACCCCGCAGGACTGTCCTACCGCTACGTGGCAGAGGGCCAGAAACA AGG
GCAGCTGCGTTCCCTGCACCTGACATGTGACAGCGCCCCAGTTGGGAGCCAGGGCAC CTG
GAGCACCAGCTGCAGAATCAACCACCTCATCTTCCGTGGCGGCGCCCAGATCACCTT CTT
GGCTACCTTTGACGTCTCCCCCAAGGCTGTCCTGGGAGACCGGCTGCTTCTGACAGC CAA
TGTGAGCAGTGAGAACAACACTCCCAGGACCAGCAAGACCACCTTCCAGCTGGAGCT CCC
GGTGAAGTATGCTGTCTACACTGTGGTTAGCAGCCACGAACAATTCACCAAATACCT CAA
CTTCTCAGAGTCTGAGGAGAAGGAAAGCCATGTGGCCATGCACAGATACCAGGTCAA TAA
CCTGGGACAGAGGGACCTGCCTGTCAGCATCAACTTCTGGGTGCCTGTGGAGCTGAA CCA
GGAGGCTGTGTGGATGGATGTGGAGGTCTCCCACCCCCAGAACCCATCCCTTCGGTG CTC
CTCAGAGAAAATCGCACCCCCAGCATCTGACTTCCTGGCGCACATTCAGAAGAATCC CGT
GCTGGACTGCTCCATTGCTGGCTGCCTGCGGTTCCGCTGTGACGTCCCCTCCTTCAG CGT
CCAGGAGGAGCTGGATTTCACCCTGAAGGGCAACCTCAGCTTTGGCTGGGTCCGCCA GAT
ATTGCAGAAGAAGGTGTCGGTCGTGAGTGTGGCTGAAATTACGTTCGACACATCCGT GTA
CTCCCAGCTTCCAGGACAGGAGGCATTTATGAGAGCTCAGACGACAACGGTGCTGGA GAA
GTACAAGGTCCACAACCCCACCCCCCTCATCGTAGGCAGCTCCATTGGGGGTCTGTT GCT
GCTGGCACTCATCACAGCGGTACTGTACAAAGTTGGCTTCTTCAAGCGTCAGTACAA GGA
AATGATGGAGGAGGCAAATGGACAAATTGCCCCAGAAAACGGGACACAGACCCCCAG CCC
GCCCAGTGAGAAATGATCCCTCTTTGCCTTGGACTTCTTCTCCCGCGATTTTCCCCA CTT
ACTTACCCTCACCTGTCAGGCTGACGGGGAGGAACCACTGCACCACCGAGAGAGGCT GGG
ATGGGCCTGCTTCCTGTCTTTGGGAGAAAACGTCTTGCTTGGGAAGGGGCCTTTGTC TTG
TCAAGGTTCCAACT GGAAACCCTTAGGACAGGGTCCCT GCT GT GTTCCCCAAAAGGACTT
G ACTTGCAATTTCT ACCT AGAAAT ACAT GG ACAATACCCCC AGGCCT CAGTCTCCCTT CT
CCCAT G AGGCACGAAT GATCTTT CTTTCCTTTCCTTTTTTTTTTTTTT CTTTT CTTTTTT
TTTTTTTTTGAGACGGAGTCTCGCTCTGTCACCCAGGCTGGAGTGCAATGGCGTGAT CTC
GGCTCGCTGCAACCTCCGCCTCCCGGGTTCAAGTAATTCTGCTGTCTCAGCCTCCTG CGT
AGCTGGGACTACAGGCACACGCCACCTCGCCCGGCCCGATCTTTCTAAAATACAGTT CTG AATATGCTGCTCATCCCCACCTGTCTTCAACAGCTCCCCATTACCCTCAGGACAATGTCT
GAACTCTCCAGCTTCGCGTGAGAAGTCCCCTTCCATCCCAGAGGGTGGGCTTCAGGG CGC
ACAGCATGAGAGCCTCTGTGCCCCCATCACCCTCGTTTCCAGTGAATTAGTGTCATG TCA
GCATCAGCTCAGGGCTTCATCGTGGGGCTCTCAGTTCCGATTCCCCAGGCTGAATTG GGA
GTGAGATGCCTGCATGCTGGGTTCTGCACAGCTGGCCTCCCGCGGTTGGGTCAACAT TGC
TGGCCTGGAAGGGAGGAGCGCCCTCTAGGGAGGGACATGGCCCCGGTGCGGCTGCAG CTC
ACCAGCCCCAGGGGCAGAAGAGACCCAACCACTTCCTATTTTTTGAGGCTATGAATA TAG
T ACCT G AAAAAAT GCC AAGCACT AG ATTATTTTTTTAAAAAGCGT ACTTT AAAT GTTTGT
GTTAATACACATTAAAACATCGCACAAAAACGATGCATCTACCGCTCCTTGGGAAAT AAT
CTGAAAGGTCTAAAAATAAAAAAGCCTTCTGTGG
(SEQ ID NO: 49)
As used herein, the term “LILRB4” refers to the gene encoding Leukocyte immunoglobulin-like receptor subfamily B member 4. The terms “LILRB4” and "Leukocyte immunoglobulin-like receptor subfamily B member 4" include wild-type forms of the LILRB4 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type LILRB4. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type LILRB4 nucleic acid sequence (e.g., SEQ ID NO: 50, ENA accession number U91925). SEQ ID NO: 50 is a wild-type gene sequence encoding LILRB4 protein, and is shown below:
TGAGATGAGAGCTGCCGACAGTTGGGGGTCAAGGGAGGAGACGCCATGATCCCCACC TTC
ACGGCTCTGCTCTGCCTCGGGCTGAGTCTGGGCCCCAGGACCCACATGCAGGCAGGG CCC
CTCCCCAAACCCACCCTCTGGGCTGAGCCAGGCTCTGTGATCAGCTGGGGGAACTCT GTG
ACCATCTGGTGTCAGGGGACCCTGGAGGCTCGGGAGTACCGTCTGGATAAAGAGGAA AGC
CCAGCACCCTGGGACAGACAGAACCCACTGGAGCCCAAGAACAAGGCCAGATTCTCC ATC
CCATCCATGACAGAGGACTATGCAGGGAGATACCGCTGTTACTATCGCAGCCCTGTA GGC
TGGTCACAGCCCAGTGACCCCCTGGAGCTGGTGATGACAGGAGCCTACAGTAAACCC ACC
CTTTCAGCCCTGCCGAGTCCTCTTGTGACCTCAGGAAAGAGCGTGACCCTGCTGTGT CAG
TCACGGAGCCCAATGGACACTTTCCTTCTGATCAAGGAGCGGGCAGCCCATCCCCTA CTG
CATCTGAGATCAGAGCACGGAGCTCAGCAGCACCAGGCTGAATTCCCCATGAGTCCT GTG
ACCTCAGTGCACGGGGGGACCTACAGGTGCTTCAGCTCACACGGCTTCTCCCACTAC CTG
CTGTCACACCCCAGTGACCCCCTGGAGCTCATAGTCTCAGGATCCTTGGAGGGTCCC AGG
CCCTCACCCACAAGGTCCGTCTCAACAGCTGCAGGCCCTGAGGACCAGCCCCTCATG CCT
ACAGGGTCAGTCCCCCACAGTGGTCTGAGAAGGCACTGGGAGGTACTGATCGGGGTC TTG
GTGGTCTCCATCCTGCTTCTCTCCCTCCTCCTCTTCCTCCTCCTCCAACACTGGCGT CAG
GGAAAACACAGGACATTGGCCCAGAGACAGGCTGATTTCCAACGTCCTCCAGGGGCT GCC
GAGCCAGAGCCCAAGGACGGGGGCCTACAGAGGAGGTCCAGCCCAGCTGCTGACGTC CAG
GGAGAAAACTTCTGTGCTGCCGTGAAGAACACACAGCCTGAGGACGGGGTGGAAATG GAC
ACTCGGCAGAGCCCACACGATGAAGACCCCCAGGCAGTGACGTATGCCAAGGTGAAA CAC
TCCAGACCTAGGAGAGAAATGGCCTCTCCTCCCTCCCCACTGTCTGGGGAATTCCTG GAC ACAAAGGACAGACAGGCAGAAGAGGACAGACAGAT GGACACTGAGGCT GCT GCATCT GAA
GCCCCCCAGGATGTGACCTACGCCCAGCTGCACAGCTTTACCCTCAGACAGAAGGCA ACT
GAGCCTCCTCCATCCCAGGAAGGGGCCTCTCCAGCTGAGCCCAGTGTCTATGCCACT CTG
GCCATCCACTAATCCAGGGGGGACCCAGACCCCACAAGCCATGGAGACTCAGGACCC CAG
AAGGCATGGAAGCTGCCTCCAGTAGACATCACTGAACCCCAGCCAGCCCAGACCCCT GAC
ACAGACCACTAGAAGATTCCGGGAACGTTGGGAGTCACCTGATTCTGCAAAGATAAA TAA
TATCCCT GC ATT AT C AAAAT AAAGTAG CAGACCTCT C AATT C A
(SEQ ID NO: 50)
As used herein, the term “LPL” refers to the gene encoding Lipoprotein lipase. The terms “LPL” and "Lipoprotein lipase" include wild-type forms of the LPL gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type LPL. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type LPL nucleic acid sequence (e.g., SEQ ID NO: 51, ENA accession number M15856). SEQ ID NO: 51 is a wild-type gene sequence encoding LPL protein, and is shown below:
CCCCTCTTCCTCCTCCTCAAGGGAAAGCTGCCCACTTCTAGCTGCCCTGCCATCCCC TTT
AAAGGGCGACTTGCTCAGCGCCAAACCGCGGCTCCAGCCCTCTCCAGCCTCCGGCTC AGC
CGGCTCATCAGTCGGTCCGCGCCTTGCAGCTCCTCCAGAGGGACGCGCCCCGAGATG GAG
AGCAAAGCCCTGCTCGTGCTGACTCTGGCCGTGTGGCTCCAGAGTCTGACCGCCTCC CGC
GGAGGGGTGGCCGCCGCCGACCAAAGAAGAGATTTTATCGACATCGAAAGTAAATTT GCC
CTAAGGACCCCTGAAGACACAGCTGAGGACACTTGCCACCTCATTCCCGGAGTAGCA GAG
TCCGTGGCTACCTGTCATTTCAATCACAGCAGCAAAACCTTCATGGTGATCCATGGC TGG
ACGGTAACAGGAATGTATGAGAGTTGGGTGCCAAAACTTGTGGCCGCCCTGTACAAG AGA
GAACCAGACTCCAATGTCATTGTGGTGGACTGGCTGTCACGGGCTCAGGAGCATTAC CCA
GTGTCCGCGGGCTACACCAAACTGGTGGGACAGGATGTGGCCCGGTTTATCAACTGG ATG
GAGGAGGAGTTTAACTACCCTCTGGACAATGTCCATCTCTTGGGATACAGCCTTGGA GCC
CATGCTGCTGGCATTGCAGGAAGTCTGACCAATAAGAAAGTCAACAGAATTACTGGC CTC
GATCCAGCTGGACCTAACTTTGAGTATGCAGAAGCCCCGAGTCGTCTTTCTCCTGAT GAT
GCAGATTTTGTAGACGTCTTACACACATTCACCAGAGGGTCCCCTGGTCGAAGCATT GGA
ATCCAGAAACCAGTTGGGCATGTTGACATTTACCCGAATGGAGGTACTTTTCAGCCA GGA
TGTAACATTGGAGAAGCTATCCGCGTGATTGCAGAGAGAGGACTTGGAGATGTGGAC CAG
CTAGTGAAGTGCTCCCACGAGCGCTCCATTCATCTCTTCATCGACTCTCTGTTGAAT GAA
GAAAATCCAAGTAAGGCCTACAGGTGCAGTTCCAAGGAAGCCTTTGAGAAAGGGCTC TGC
TTGAGTTGTAGAAAGAACCGCTGCAACAATCTGGGCTATGAGATCAATAAAGTCAGA GCC
AAAAGAAGCAGCAAAATGTACCTGAAGACTCGTTCTCAGATGCCCTACAAAGTCTTC CAT
T ACCAAGTAAAGATT CATTTTT CTGGG ACT GAGAGT GAAACCC AT ACCAAT CAGGCCTTT
GAGATTTCTCTGTATGGCACCGTGGCCGAGAGTGAGAACATCCCATTCACTCTGCCT GAA
GTTTCCACAAAT AAGACCT ACTCCTT CCT AATTT ACACAG AGGTAGAT ATTGG AG AACT A
CT C ATGTT G AAG CT C AAAT GG AAG AGTG ATT CAT ACTTT AG CTG GT C AG ACT G GTG G AGC
AGTCCCGGCTTCGCCATTCAGAAGATCAGAGTAAAAGCAGGAGAGACTCAGAAAAAG GTG ATCTTCTGTTCTAGGGAGAAAGTGTCTCATTTGCAGAAAGGAAAGGCACCTGCGGTATTT
GTGAAATGCCAT G ACAAGT CT CT G AATAAGAAGT CAGGCT GAAACT GGGCG AAT CT AC AG
AAC AAAG AACGGC AT GT GAATTCT GT G AAG AAT GAAGTGGAGGAAGT AACTTTT ACAAAA
CATACCCAGTGTTTGGGGTGTTTCAAAAGTGGATTTTCCTGAATATTAATCCCAGCC CTA
CCCTTGTTAGTTATTTTAGGAGACAGTCTCAAGCACTAAAAAGTGGCTAATTCAATT TAT
GGGGTATAGTGGCCAAATAGCACATCCTCCAACGTTAAAAGACAGTGGATCATGAAA AGT
GCTGTTTTGTCCTTTGAGAAAGAAATAATTGTTTGAGCGCAGAGTAAAATAAGGCTC CTT
CATGTGGCGTATTGGGCCATAGCCTATAATTGGTTAGAACCTCCTATTTTAATTGGA ATT
CT GGATCTTTCGG ACT GAGGCCTTCT CAAACTTT ACT CTAAGTCTCCAAGAAT ACAG AAA
ATGCTTTTCCGCGGCACGAATCAGACTCATCTACACAGCAGTATGAATGATGTTTTA GAA
T GATTCCCT CTT GCT ATT GGAAT GT GGTCCAGACGTCAACCAGG AAC AT GT AACTTGG AG
AGGGACGAAGAAAGGGTCTGATAAACACAGAGGTTTTAAACAGTCCCTACCATTGGC CTG
CAT CAT G AC AAAGTT AC AAATT C AAGG AG AT AT AAAAT CTAG AT C AATT AATTCTT AAT A
GGCTTTATCGTTTATTGCTTAATCCCTCTCTCCCCCTTCTTTTTTGTCTCAAGATTA TAT
TATAATAATGTTCTCTGGGTAGGTGTTGAAAATGAGCCTGTAATCCTCAGCTGACAC ATA
ATTT GAAT GGTGCAG AAAAAAAAAAGAT ACCGT AATTTT ATTATTAGATT CTCCAAAT GA
TTTT CAT C AATTT AAAAT C ATT C AAT ATCT G AC AGTT ACTCTT C AGTTTT AG GCTT ACCT
TGGTCATGCTTCAGTTGTACTTCCAGTGCGTCTCTTTTGTTCCTGGCTTTGACATGA AAA
GAT AGGTTT G AGTT CAAATTTT GCATT GTGTG AGCTT CTACAG ATTTTAGACAAGGACCG
TTTTTACTAAGTAAAAGGGTGGAGAGGTTCCTGGGGTGGATTCCTAAGCAGTGCTTG TAA
ACCATCGCGTGCAATGAGCCAGATGGAGTACCATGAGGGTTGTTATTTGTTGTTTTT AAC
AACT AAT C AAG AGT G AGT G AAC AACT ATTT AT AAACT AG AT CTCCT ATTTTT C AG AAT G C
TCTTCTACGTATAAATATGAAATGATAAAGATGTCAAATATCTCAGAGGCTATAGCT GGG
AAC CCG ACT GT G AAAGT ATGT GAT ATCT G AAC AC AT ACT AG AAAGCT CT GC AT GTGTGTT
GTCCTTCAGCATAATTCGGAAGGGAAAACAGTCGATCAAGGGATGTATTGGAACATG TCG
GAGTAGAAATTGTTCCTGATGTGCCAGAACTTCGACCCTTTCTCTGAGAGAGATGAT CGT
GCCTATAAATAGTAGGACCAATGTTGTGATTAACATCATCAGGCTTGGAATGAATTC TCT
CT AAAAAT AAAAT GAT GT AT GATTT GTTGTT GGCATCCCCTTT ATTAATT CATTAAATTT
CTGGATTTGGGTTGTGACCCAGGGTGCATTAACTTAAAAGATTCACTAAAGCAGCAC ATA
GCACTGGGAACTCTGGCTCCGAAAAACTTTGTTATATATATCAAGGATGTTCTGGCT TTA
CATTTTATTTATTAGCTGTAAATACATGTGTGGATGTGTAAATGGAGCTTGTACATA TTG
GAAAGGTCATTGTGGCTATCTGCATTTATAAATGTGTGGTGCTAACTGTATGTGTCT TTA
T CAGTGATGGT CTCACAG AGCCAACTC ACT CTT AT G AAATGGGCTTT AACAAAACAAGAA
AG AAACGTACTT AACT GT GTG AAG AAAT G GAAT C AGCTTTT AAT AAAATT G AC AAC ATTT
TATTACCAC
(SEQ ID NO: 51)
As used herein, the term “MEF2C” refers to the gene encoding Myocyte-specific enhancer factor 2C. The terms “MEF2C” and "Myocyte-specific enhancer factor 2C" include wild-type forms of the MEF2C gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type MEF2C. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type MEF2C nucleic acid sequence (e.g., SEQ ID NO: 52, ENA accession number L08895). SEQ ID NO: 52 is a wild-type gene sequence encoding MEF2C protein, and is shown below:
GAATTCCCAGCTCTCTGCTCGCTCTGCTCGCAGTCACAGACACTTGAGCACACGCGT ACA
CCCAGACATCTTCGGGCTGCTATTGGATTGACTTTGAAGGTTCTGTGTGGGTCGCCG TGG
CTGCATGTTTGAATCAGGTGGAGAAGCACTTCAACGCTGGACGAAGTAAAGATTATT GTT
GTTATTTTTTTTTTCTCTCTCTCTCTCTCTTAAGAAAGGAAAATATCCCAAGGACTA ATC
TGATCGGGTCTTCCTTCATCAGGAACGAATGCAGGAATTTGGGAACTGAGCTGTGCA AGT
G CT G AAG AAG G AG ATTT GTTT G G AGG AAAC AG G AAAG AG AAAG AAAAG G AAG G AAAAAAT
ACATAATTTCAGGGACGAGAGAGAGAAGAAAAACGGGGACTATGGGGAGAAAAAAGA TTC
AG ATT ACG AG GATT AT G GAT G AACGT AAC AG AC AG GT G AC ATTT AC AAAG AGG AAATTT G
GGTTGATGAAGAAGGCTTATGAGCTGAGCGTGCTGTGTGACTGTGAGATTGCGCTGA TCA
TCTTCAACAGCACCAACAAGCTGTTCCAGTATGCCAGCACCGACATGGACAAAGTGC TTC
T CAAGT ACACGGAGT AC AACG AGCCGC AT GAGAGCCGGACAAACT CAG ACATCGT GGAGA
CGTTGAGAAAGAAGGGCCTTAATGGCTGTGACAGCCCAGACCCCGATGCGGACGATT CCG
TAGGTCACAGCCCTGAGTCTGAGGACAAGTACAGGAAAATTAACGAAGATATTGATC TAA
TGATCAGCAGGCAAAGATTGTGTGCTGTTCCACCTCCCAACTTCGAGATGCCAGTCT CCA
TCCCAGTGTCCAGCCACAACAGTTTGGTGTACAGCAACCCTGTCAGCTCACTGGGAA ACC
CCAACCTATTGCCACTGGCTCACCCTTCTCTGCAGAGGAATAGTATGTCTCCTGGTG TAA
CACATCGACCTCCAAGTGCAGGTAACACAGGTGGTCTGATGGGTGGAGACCTCACGT CTG
GTGCAGGCACCAGTGCAGGGAACGGGTATGGCAATCCCCGAAACTCACCAGGTCTGC TGG
TCTCACCTGGTAACTTGAACAAGAATATGCAAGCAAAATCTCCTCCCCCAATGAATT TAG
GAATGAATAACCGTAAACCAGATCTCCGAGTTCTTATTCCACCAGGCAGCAAGAATA CGA
TGCCATCAGTGTCTGAGGATGTCGACCTGCTTTTGAATCAAAGGATAAATAACTCCC AGT
CGGCTCAGTCATTGGCTACCCCAGTGGTTTCCGTAGCAACTCCTACTTTACCAGGAC AAG
GAATGGGAGGATATCCATCAGCCATTTCAACAACATATGGTACCGAGTACTCTCTGA GTA
GTGCAGACCTGTCATCTCTGTCTGGGTTTAACACCGCCAGCGCTCTTCACCTTGGTT CAG
TAACTGGCTGGCAACAGCAACACCTACATAACATGCCACCATCTGCCCTCAGTCAGT TGG
GAGCTTGCACTAGCACTCATTTATCTCAGAGTTCAAATCTCTCCCTGCCTTCTACTC AAA
GCCTCAACATCAAGTCAGAACCTGTTTCTCCTCCTAGAGACCGTACCACCACCCCTT CGA
GATACCCACAACACACGCGCCACGAGGCGGGGAGATCTCCTGTTGACAGCTTGAGCA GCT
GTAGCAGTTCGTACGACGGGAGCGACCGAGAGGATCACCGGAACGAATTCCACTCCC CCA
TTGGACTCACCAGACCTTCGCCGGACGAAAGGGAAAGTCCCTCAGTCAAGCGCATGC GAC
TTT CT G AAG GAT G GG C AAC AT GAT CAG ATT ATT ACTT ACT AGTTTTTTTTTTTTT CTTG C
AGTGTGTGTGTGTGCTATACCTTAATGGGGAAGGGGGGTCGATATGCATTATATGTG CCG
T GTGTGGAAAAAAAAAAAGTCAGGT ACT CTGTTTTGTAAAAGT ACTTTT AAATTGCCT CA
GTG AT AC AGTAT AAAG AT AAAC AG AAAT G CT GAG AT AAG CTT AG C ACTT G AGTTGTAC AA
CAGAACACTTGTACAAAAT AG ATTTT AAGGCT AACTT CTTTTCACTGTT GT GCTCCTTT G
CAAAAT GTATGTT AC AAT AG AT AGTGT CATGTT GCAGGTTCAACGTTATTTACAT GT AAA
TAGACAAAAGGAAACATTTGCCAAAAGCGGCAGATCTTTACTGAAAGAGAGAGCAGC TGT
TATGCAACATATAGAAAAATGTATAGATGCTTGGACAGACCCGGTAATGGGTGGCCA TTG GTAAATGTTAGGAACACACCAGGTCACCTGACATCCCAAGAATGCTCACAAACCTGCAGG
CATATCATTGGCGTATGGCACTCATTAAAAAGGATCAGAGACCATTAAAAGAGGACC ATA
CCTATTAAAAAAAAATGTGGAGTTGGAGGGCTAACATATTTAATTAAATAAATAAAT AAA
TCTGGGTCTG CAT CTCTT ATT AAAT AAAAAT AT AAAAAT ATGTAC ATT AC ATTTT G CTTA
TTTTCATATAAAAGGTAAGACAGAGTTTGCAAAGCATTTGTGGCTTTTTGTAGTTTA CTT
AAGCCAAAATGTGTTTTTTTCCCCTTGATAGCTTCGCTAATATTTTAAACAGTCCTG TAA
AAAACCAAAAAGGACTTTTTGTATAGAAAGCACTACCCTAAGCCATGAAGAACTCCA TGC
TTTGCTAACC AAG AT AACTGTTTTCT CTTTGTAG AAGTTTT GTTTTT G AAAT GT GT ATTT
CT AATT AT AT AAAAT ATT AAG AAT CTTTT AAAAAAAT CTGTG AAATT AAC ATGCTT GTGT
ATAGCTTTCTAATATATATAATATTATGGTAATAGCAGAAGTTTTGTTATCTTAATA GCG
GGAGGGGGGTATATTTGTGCAGTTGCACATTTGAGTAACTATTTTCTTTCTGTTTTC TTT
TACTCTGCTTACATTTTATAAGTTTAAGGTCAGCTGTCAAAAGGATAACCTGTGGGG TTA
GAACATATCACATTGCAACACCCTAAATTGTTTTTAATACATTAGCAATCTATTGGG TCA
ACT G AC ATCC ATTGTAT AT ACT AGTTT CTTT C ATGC T ATTTTT ATTTT GTTTTTT GC ATT
TTTATCAAATGCAGGGCCCCTTTCTGATCTCACCATTTCACCATGCATCTTGGAATT CAG
TAAGTGCATATCCTAACTTGCCCATATTCTAAATCATCTGGTTGGTTTTCAGCCTAG AAT
TTGATACGCTTTTTAGAAATATGCCCAGAATAGAAAAGCTATGTTGGGGCACATGTC CTG
CAAATATGGCCCTAGAAACAAGTGATATGGAATTTACTTGGTGAATAAGTTATAAAT TCC
C AC AG AAG AAAAAT GT G AAAG ACT G GGTGCT AG AC AAG AAG G AAGC AG GT AAAG GG AT AG
TTGCTTT GT CATCCGTTTTTAATTATTTT AACT G ACCCTT G ACAATCTT GT CAGCAAT AT
AGGACTGTTGAACAATCCCGGTGTGTCAGGACCCCCAAATGTCACTTCTGCATAAAG CAT
GTATGT CAT CTATTTTTT CTT CAAT AAAGAG ATTT AATAGCC ATTT CAAG AAATCCCAT A
AAG AAC CTCTCTAT GTCCCTTTTTTT AATTT AAAAAAAT GACTCTTGTCT AAT ATTCGT C
TATAAGGGATTAATTTTCAGACCCTTTAATAAGTGAGTGCCATAAGAAAGTCAATAT ATA
TT GTTT AAAAG AT ATTT C AGTCT AGG AAAG ATTTTCCTT CTCTT GG AAT GT G AAG AT CTG
TCGATTCATCTCCAATCATATGCATTGACATACACAGCAAAGAAGATATAGGCAGTA ATA
T CAACACTGCT AT ATC AT GTGTAGG ACATTTCTT ATCCATTTTTT CT CTTTT ACTT GCAT
AGTTGCTATGTGTTTCTCATTGTAAAAGGCTGCCGCTGGGTGGCAGAAGCCAAGAGA CCT
T ATT AACT AG GCT AT ATTTTT CTT AACTT G ATCT G AAATCC AC AATT AG ACC AC AAT GCA
CCTTTGGTTGTATCCATAAAGGATGCTAGCCTGCCTTGTACTAATGTTTTATATATT
(SEQ ID NO: 52)
As used herein, the term “MMP12” refers to the gene encoding Macrophage metalloelastase. The terms “MMP12” and "Macrophage metalloelastase" include wild-type forms of the MMP12 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type MMP12. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type MMP12 nucleic acid sequence (e.g., SEQ ID NO: 53, ENA accession number L23808). SEQ ID NO: 53 is a wild-type gene sequence encoding MMP12 protein, and is shown below:
TAGAAGTTTACAATGAAGTTTCTTCTAATACTGCTCCTGCAGGCCACTGCTTCTGGA GCT CTTCCCCTGAACAGCTCTACAAGCCTGGAAAAAAATAATGTGCTATTTGGTGAGAGATAC
TT AG AAAAATTTT ATG GCCTT GAG AT AAAC AAACTT CC AGT G AC AAAAAT G AAAT ATAGT
GGAAACTTAATGAAGGAAAAAATCCAAGAAATGCAGCACTTCTTGGGTCTGAAAGTG ACC
GGGCAACT GG AC AC AT CT ACCCT GGAG AT GATGCACGCACCTCGAT GT GG AGTCCCCGAT
CTCCATCATTTCAGGGAAATGCCAGGGGGGCCCGTATGGAGGAAACATTATATCACC TAC
AGAATCAATAATTACACACCTGACATGAACCGTGAGGATGTTGACTACGCAATCCGG AAA
GCTTTCCAAGTATGGAGTAATGTTACCCCCTTGAAATTCAGCAAGATTAACACAGGC ATG
GCTGACATTTTGGTGGTTTTTGCCCGTGGAGCTCATGGAGACTTCCATGCTTTTGAT GGC
AAAGGTGGAATCCTAGCCCATGCTTTTGGACCTGGATCTGGCATTGGAGGGGATGCA CAT
TTCG AT GAG G ACG AATT CTGG ACT AC AC ATT C AG G AG GC AC AAACTT GTTCCTCACTGCT
GTTCACGAGATTGGCCATTCCTTAGGTCTTGGCCATTCTAGTGATCCAAAGGCTGTA ATG
TTCCCCACCTACAAATATGTCGACATCAACACATTTCGCCTCTCTGCTGATGACATA CGT
GGCATTCAGTCCCTGTATGGAGACCCAAAAGAGAACCAACGCTTGCCAAATCCTGAC AAT
T CAG AACCAGCT CTCT GTG ACCCC AATTT GAGTTTT GATGCT GT CACT ACCGTGGGAAAT
AAGATCTTTTTCTTCAAAGACAGGTTCTTCTGGCTGAAGGTTTCTGAGAGACCAAAG ACC
AGTGTTAATTTAATTTCTTCCTTATGGCCAACCTTGCCATCTGGCATTGAAGCTGCT TAT
GAAATTGAAGCCAGAAATCAAGTTTTTCTTTTTAAAGATGACAAATACTGGTTAATT AGC
AATTTAAGACCAGAGCCAAATTATCCCAAGAGCATACATTCTTTTGGTTTTCCTAAC TTT
GTGAAAAAAATTGATGCAGCTGTTTTTAACCCACGTTTTTATAGGACCTACTTCTTT GTA
GATAACCAGTATTGGAGGTATGATGAAAGGAGACAGATGATGGACCCTGGTTATCCC AAA
CTGATTACCAAGAACTTCCAAGGAATCGGGCCTAAAATTGATGCAGTCTTCTATTCT AAA
AAC AAAT ACTACT ATTT CTTCC AAG G ATCT AACC AATTT G AAT AT G ACTTCCT ACTCC AA
CGTATCACCAAAACACTGAAAAGCAATAGCTGGTTTGGTTGTTAGAAATGGTGTAAT TAA
TGGTTTTTGTTAGTTCACTTCAGCTTAATAAGTATTTATTGCATATTTGCTATGTCC TCA
GTGTACC ACTACTT AG AG AT ATGTAT CAT AAAAAT AAAAT CTGT AAACC AT AG GT AAT G A
TT AT AT AAAAT AC AT AAT ATTTTT C AATTTT G AAAACT CT AATT GTCC ATT CTTG CTT G A
CTCTACT ATT AAGTTT G AAAAT AGTT ACCTT C AAAG C AAG AT AATT CT ATTT G AAG CAT G
CTCTGTAAGTTGCTTCCT AAC ATCCTT GG ACT G AG AAATT ATACTTACTTCT G GC AT AAC
T AAA ATT AAGTATATATATTTTGGCT C A AAT AAA ATT G
(SEQ ID NO: 53)
As used herein, the term “MS4A4A” refers to the gene encoding Membrane Spanning 4-Domains A4A. The terms “MS4A4A” and "Membrane Spanning 4-Domains A4A" include wild-type forms of the MS4A4A gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type MS4A4A. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type MS4A4A nucleic acid sequence (e.g., SEQ ID NO: 54, NCBI Reference Sequence: NM_148975.2). SEQ ID NO: 54 is a wild-type gene sequence encoding MS4A4A protein, and is shown below:
ATTCTCAGCACAGCCTTTAAGGTTCCAAACATCTGCTAGAAGAGGAATGCAGATTTA AACTGAGTGAG
GTGTGGAGTGGGGGAAGTTGATTGGGTCTAGACCAAAGAACTTTGAGGAACTTGCCC AGAGCCCTG CATGCATCAGACCTACAGCAGACATTGCAGGCCTGAAGAAAGCACCTTTTCTGCTGCCAT GACAACC
ATGCAAGGAATGGAACAGGCCATGCCAGGGGCTGGCCCTGGTGTGCCCCAGCTGGGA AACATGGC
TGTCATACATTCACATCTGTGGAAAGGATTGCAAGAGAAGTTCTTGAAGGGAGAACC CAAAGTCCTT
GGGGTTGTGCAGATTCTGACTGCCCTGATGAGCCTTAGCATGGGAATAACAATGATG TGTATGGCAT
CTAATACTTATGGAAGTAACCCTATTTCCGTGTATATCGGGTACACAATTTGGGGGT CAGTAATGTTT
ATTATTTCAGGATCCTTGTCAATTGCAGCAGGAATTAGAACTACAAAAGGCCTGGTC CGAGGTAGTCT
AGGAATGAATATCACCAGCTCTGTACTGGCTGCATCAGGGATCTTAATCAACACATT TAGCTTGGCGT
TTTATTCATTCCATCACCCTTACTGTAACTACTATGGCAACTCAAATAATTGTCATG GGACTATGTCCA
TCTTAATGGGTCTGGATGGCATGGTGCTCCTCTTAAGTGTGCTGGAATTCTGCATTG CTGTGTCCCT
CTCTGCCTTTGGATGTAAAGTGCTCTGTTGTACCCCTGGTGGGGTTGTGTTAATTCT GCCATCACATT
CTCACATGGCAGAAACAGCATCTCCCACACCACTTAATGAGGTTTGAGGCCACCAAA AGATCAACAG
AC AAAT GOT CC AG AAAT CTATGCTGACTGT G AC AC AAG AG CCT C AC AT G AG AAATT AC CAGT ATCC AA
CTTCGATACTGATAGACTTGTTGATATTATTATTATATGTAATCCAATTATGAACTG TGTGTGTATAGA
GAG AT AAT AAATT C AAAATT ATGTTCT C ATTTTTTTCCCT GG AACT C AAT AACT C ATTT C ACT GG CTCTT
T ATCG AG AGTACT AG AAGTTAAATT AAT AAAT AAT GCATTTAAT G AGGCAACAGCACTT G AAAGTTTTT
CATTCATCATAAGAACTTTATATAAAGGCATTACATTGGCAAATAAGGTTTGGAAGC AGAAGAGCAAA
AAAAAGATATTGTTAAAATGAGGCCTCCATGCAAAACACATACTTCCCTCCCATTTA TTTAACTTTTTTT
TTCTCCTACCTATGGGGACCAAAGTGCTTTTTCCTTCAGGAAGTGGAGATGCATGGC CATCTCCCCC
TCCCTTTTTCCTTCTCCTGCTTTTCTTTCCCCATAGAAAGTACCTTGAAGTAGCACA GTCCGTCCTTG
CATGTGCACGAGCTATCATTTGAGTAAAAGTATACATGGAGTAAAAATCATATTAAG CATCAGATTCA
ACTTATATTTTCTATTTCATCTTCTTCCTTTCCCTTCTCCCACCTTCTACTGGGCAT AATTATATCTTAA
TCATATATGGAAATGTGCAACATATGGTATTTGTTAAATACGTTTGTTTTTATTGCA GAGCAAAAATAA
ATCAAATTAGAAGCAATAAAAAAAAAAAAAAAAAAAA
(SEQ ID NO: 54)
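The definitions in this section repeatedly rely on a percent sequence identity threshold (at least 70%). As a rough illustration of what such a comparison involves, the following is a minimal sketch in Python. It assumes the two nucleic acid sequences have already been aligned to equal length with `-` marking gaps, and it counts identity over all aligned columns. This excerpt does not state which alignment method or identity convention is intended, so the function names `percent_identity` and `meets_identity_threshold` and the column-counting rule are illustrative assumptions only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (not from the source text): both inputs are already aligned,
    have equal length, and use '-' for gaps. Columns that are gaps in both
    sequences are ignored; a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # skip columns that carry no residue in either sequence
        columns += 1
        if a == b:
            matches += 1

    if columns == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `percent_identity("ACGTACGTAC", "ACGTACGTTT")` compares ten aligned positions, eight of which match, so the pair would satisfy a 70% threshold under these assumptions. Other conventions (for instance, dividing by the length of the shorter sequence or excluding gapped columns) give different values for the same pair.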
As used herein, the term “MS4A6A” refers to the gene encoding Membrane-spanning 4-domains subfamily A member 6A. The terms “MS4A6A” and "Membrane-spanning 4-domains subfamily A member 6A" include wild-type forms of the MS4A6A gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type MS4A6A. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type MS4A6A nucleic acid sequence (e.g., SEQ ID NO: 55, ENA accession number AB013104). SEQ ID NO: 55 is a wild-type gene sequence encoding MS4A6A protein, and is shown below:
GAGAACCAGAGTTAAAACCTCTTTGGAGCTTCTGAGGACTCAGCTGGAACCAACGGG CAC
AGTTGGCAACACCATCATGACATCACAACCTGTTCCCAATGAGACCATCATAGTGCT CCC
ATCAAATGTCATCAACTTCTCCCAAGCAGAGAAACCCGAACCCACCAACCAGGGGCA GGA
T AG CCT G AAG AAAC AT CT AC ACGC AG AAAT C AAAGTT ATT GG G ACT ATCC AG ATCTTGTG
TGGCATGATGGTATTGAGCTTGGGGATCATTTTGGCATCTGCTTCCTTCTCTCCAAA TTT
T ACCC AAGT G ACTT CTACACTGTT G AACT CTGCTT ACCCATT CATAGG ACCCTTTTTTTT
TATCATCTCTGGCTCTCTATCAATCGCCACAGAGAAAAGGTTAACCAAGCTTTTGGT GCA TAGCAGCCTGGTTGGAAGCATTCTGAGTGCTCTGTCTGCCCTGGTGGGTTTCATTATCCT GTCTGTCAAACAGGCCACCTTAAATCCTGCCTCACTGCAGTGGAACTCTCTCTCTGATGC TGATTTGCACTCTGCTGGAATTCTGCCTAGCTGTGCTCACTGCTGTGCTGCGGTGGAAAC AGGCTTACTCTGACTTCCCTGGGAGTGGACTTTTCCTGCCTCACAGTTACATTGGTAATT CT GGCATGTCCT CAAAAAT G ACT CAT GACTGTGG AT AT GAAG AACT ATT GACTTCTT AAG AAAAAAGGG AG AAAT ATTAAT CAGAAAGTT G ATTCTT AT GATAATAT GG AAAAGTT AACC ATT AT AG AAAAGC AAAG CTT G AGTTTCCT AAAT GT AAG CTTTT AAAGTAAT G AACATT AA AAAAAACCATTATTTCACTGTC (SEQ ID NO: 55)
As used herein, the term “NLRP3” refers to the gene encoding NACHT, LRR and PYD domains-containing protein 3. The terms “NLRP3” and "NACHT, LRR and PYD domains-containing protein 3" include wild-type forms of the NLRP3 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type NLRP3. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type NLRP3 nucleic acid sequence (e.g., SEQ ID NO: 56, ENA accession number AF410477). SEQ ID NO: 56 is a wild-type gene sequence encoding NLRP3 protein, and is shown below:
GTAGAT GAGG AAACT GAAGTT GAGG AAT AGTG AAGAGTTTGTCCAAT GT CATAGCCCCGT
AATCAACGGGACAAAAATTTTCTTGCTGATGGGTCAAGATGGCATCGTGAAGTGGTT GTT
CACCGTAAACTGTAATACAATCCTGTTTATGGATTTGTTTGCATATTTTTCCCCCCA TAG
GGAAACCTTTTTTCCATGGCTCAGGACACACTCCTGGATCGAGCCAACAGGAGAACT TTC
TGGTAAGCATTTGGCTAACTTTTTTTTTTTTGAGATGGAGTCTTGCTGTGTCGCCTA GGC
TGGAGTGCAGTGGCGTGATCTTGGCTCACTGCAGCCTCCACCTCCCGGGTTCAATCA ATT
CTCCTACCTCAACTTCCTGAGTAGCTGGGATTACAGGCGCCCGCCACCACACCCGGC TCA
TTTTTGTACTTTTAGTAGAGACACAGTTTTGCCATGTTGGCCAGGCTGGTCTTGAAT TCC
TCAGCTCAGGTGATATGCCTGCCTTGGCCTCTCAAAGTGCTGGGATTACAGGCGTGA GCC
ACTGTGCCCGGCCTTGGCTAACTTTTCAAAATTAAAGATTTTGACTTGTTACAGTCA TGT
G ACATTTTTTT CTTTCTGTTTGGT GAGTTTTT GAT AATTT AT AT CT CTCAAAGT GGAG AC
TTTAAAAAAGACTCATCTGTGTGCCGTGTTCACTGCCTGGTATCTTAGTGTGGACCG AAG
CCTAAGGACCCTGAAAACAGCTGCAGATGAAGATGGCAAGCACCCGCTGCAAGCTGG CCA
GGTACCTGGAGGACCTGGAGGATGTGGACTTGAAGAAATTTAAGATGCACTTAGAGG ACT
ATCCTCCCCAGAAGGGCTGCATCCCCCTCCCGAGGGGTCAGACAGAGAAGGCAGACC ATG
TGGATCTAGCCACGCTAATGATCGACTTCAATGGGGAGGAGAAGGCGTGGGCCATGG CCG
TGTG GAT CTTCGCTGCGAT C AAC AG G AG AG ACCTTT AT G AG AAAG C AAAAAG AG AT G AGC
CGAAGTGGGGTTCAGATAATGCACGTGTTTCGAATCCCACTGTGATATGCCAGGAAG ACA
GCATTGAAGAGGAGTGGATGGGTTTACTGGAGTACCTTTCGAGAATCTCTATTTGTA AAA
T GAAG AAAG ATT ACCGTAAGAAGT ACAG AAAGT ACGT G AG AAG C AG ATT C C AGT GC ATT G
AAG ACAGG AATGCCCGTCTGGGTG AG AGT GT GAGCCT CAACAAACGCT ACACACGACT GC
GTCTCATCAAGGAGCACCGGAGCCAGCAGGAGAGGGAGCAGGAGCTTCTGGCCATCG GCA
AGACCAAGACGTGTG AG AGCCCCGTGAGTCCCATTAAG AT GG AGTT GCT GTTT GACCCCG ATGATGAGCATTCTGAGCCTGTGCACACCGTGGTGTTCCAGGGGGCGGCAGGGATTGGGA
AAACAATCCTGGCCAGGAAGATGATGTTGGACTGGGCGTCGGGGACACTCTACCAAG ACA
GGTTTGACTATCTGTTCTATATCCACTGTCGGGAGGTGAGCCTTGTGACACAGAGGA GCC
TGGGGGACCTGATCATGAGCTGCTGCCCCGACCCAAACCCACCCATCCACAAGATCG TGA
GAAAACCCTCCAGAATCCTCTTCCTCATGGACGGCTTCGATGAGCTGCAAGGTGCCT TTG
ACGAGCACATAGGACCGCTCTGCACTGACTGGCAGAAGGCCGAGCGGGGAGACATTC TCC
TGAGCAGCCTCATCAGAAAGAAGCTGCTTCCCGAGGCCTCTCTGCTCATCACCACGA GAC
CTGTGGCCCTGGAGAAACTGCAGCACTTGCTGGACCATCCTCGGCATGTGGAGATCC TGG
GTTTCTCCGAGGCCAAAAGGAAAGAGTACTTCTTCAAGTACTTCTCTGATGAGGCCC AAG
CCAGGGCAGCCTTCAGTCTGATTCAGGAGAACGAGGTCCTCTTCACCATGTGCTTCA TCC
CCCTGGTCTGCTGGATCGTGTGCACTGGACTGAAACAGCAGATGGAGAGTGGCAAGA GCC
TTGCCCAGACATCCAAGACCACCACCGCGGTGTACGTCTTCTTCCTTTCCAGTTTGC TGC
AGCCCCGGGGAGGGAGCCAGGAGCACGGCCTCTGCGCCCACCTCTGGGGGCTCTGCT CTT
TGGCTGCAGATGGAATCTGGAACCAGAAAATCCTGTTTGAGGAGTCCGACCTCAGGA ATC
ATGGACTGCAGAAGGCGGATGTGTCTGCTTTCCTGAGGATGAACCTGTTCCAAAAGG AAG
TGGACTGCGAGAAGTTCTACAGCTTCATCCACATGACTTTCCAGGAGTTCTTTGCCG CCA
TGTACTACCTGCTGGAAGAGGAAAAGGAAGGAAGGACGAACGTTCCAGGGAGTCGTT TGA
AGCTTCCCAGCCGAGACGTGACAGTCCTTCTGGAAAACTATGGCAAATTCGAAAAGG GGT
ATTTGATTTTTGTTGTACGTTTCCTCTTTGGCCTGGTAAACCAGGAGAGGACCTCCT ACT
TGGAGAAGAAATTAAGTTGCAAGATCTCTCAGCAAATCAGGCTGGAGCTGCTGAAAT GGA
TTGAAGTGAAAGCCAAAGCTAAAAAGCTGCAGATCCAGCCCAGCCAGCTGGAATTGT TCT
ACTGTTTGTACGAGATGCAGGAGGAGGACTTCGTGCAAAGGGCCATGGACTATTTCC CCA
AG ATT GAG AT C AAT CT CTCC ACC AG AAT G G ACC AC AT G GTTT CTTCCTTTT G C ATT GAGA
ACTGTCATCGGGTGGAGTCACTGTCCCTGGGGTTTCTCCATAACATGCCCAAGGAGG AAG
AGGAGGAGGAAAAGGAAGGCCGACACCTTGATATGGTGCAGTGTGTCCTCCCAAGCT CCT
CTCATGCTGCCTGTTCTCATGGATTGGTGAACAGCCACCTCACTTCCAGTTTTTGCC GGG
GCCT CTTTT CAGTT CT GAGCACC AGCCAG AGT CTAACT G AATT GG ACCT CAGTG ACAATT
CTCTGGGGGACCCAGGGATGAGAGTGTTGTGTGAAACGCTCCAGCATCCTGGCTGTA ACA
TTCGGAGATTGTGGTTGGGGCGCTGTGGCCTCTCGCATGAGTGCTGCTTCGACATCT CCT
TGGTCCTCAGCAGCAACCAGAAGCTGGTGGAGCTGGACCTGAGTGACAACGCCCTCG GTG
ACTTCGGAATCAGACTTCTGTGTGTGGGACTGAAGCACCTGTTGTGCAATCTGAAGA AGC
TCTGGTTGGTCAGCTGCTGCCTCACATCAGCATGTTGTCAGGATCTTGCATCAGTAT TGA
GCACCAGCCATTCCCTGACCAGACTCTATGTGGGGGAGAATGCCTTGGGAGACTCAG GAG
TCGCAATTTT AT GT G AAAAAGCC AAG AATCCACAGT GT AACCT GC AG AAACT GGGGTTGG
TGAATTCTGGCCTTACGTCAGTCTGTTGTTCAGCTTTGTCCTCGGTACTCAGCACTA ATC
AGAATCTCACGCACCTTTACCTGCGAGGCAACACTCTCGGAGACAAGGGGATCAAAC TAC
T CT GT GAGGGACTCTTGCACCCCG ACT GC AAGCTT CAGGTGTT GG AATT AG ACAACT GCA
ACCTCACGTCACACTGCTGCTGGGATCTTTCCACACTTCTGACCTCCAGCCAGAGCC TGC
GAAAGCTGAGCCTGGGCAACAATGACCTGGGCGACCTGGGGGTCATGATGTTCTGTG AAG
TGCTGAAACAGCAGAGCTGCCTCCTGCAGAACCTGGGGTTGTCTGAAATGTATTTCA ATT
ATGAGACAAAAAGTGCGTTAGAAACACTTCAAGAAGAAAAGCCTGAGCTGACCGTCG TCT
TTGAGCCTTCTTGGTAGGAGTGGAAACGGGGCTGCCAGACGCCAGTGTTCTCCGGTC CCT CCAGCTGGGGGCCCTCAGGTGGAGAGAGCTGCGATCCATCCAGGCCAAGACCACAGCTCT GTGATCCTTCCGGT GGAGT GTCGGAGAAGAGAGCTT GCCGACGAT GCCTTCCTGT GCAGA GCTTGGGCATCTCCTTTACGCCAGGGTGAGGAAGACACCAGGACAATGACAGCATCGGGT GTTGTTGTCATCACAGCGCCTCAGTTAGAGGATGTTCCTCTTGGTGACCTCATGTAATTA G CT C ATT C AAT AAAGC ACTTT CTTT ATTTT (SEQ ID NO: 56)
As used herein, the term “NME8” refers to the gene encoding Thioredoxin domain-containing protein 3. The terms “NME8” and "Thioredoxin domain-containing protein 3" include wild-type forms of the NME8 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type NME8. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type NME8 nucleic acid sequence (e.g., SEQ ID NO: 57, ENA accession number AF202051). SEQ ID NO: 57 is a wild-type gene sequence encoding NME8 protein, and is shown below:
CGGCCACAACGAGGGAGCCGATTTAGATCCTCTGGGCCTGTTCCTTCCTTTTCTTTA AAC GTCCCAGTCTAGCTTAGAGGAGGACCTGTTTTGTTAGATAAATGGCAAGCAAAAAACGAG AAGTCCAGTTACAGACAGTCATCAATAATCAAAGCCTGTGGGATGAGATGTTGCAGAACA AAGGCTT AAC AGTG ATT GAT GTTT ACCAAGCCTGGT GTGGACCTT GCAGAGCAATGC AAC CTTTATT CAG AAAATT GAAAAAT G AACT G AACGAAG ACG AAATT CT GC ATTTTGCT GTCG CAGAAGCT G AC AACATT GTG ACTTTGCAGCCATTT AGAGAT AAATGTGAACCT GTTTTT C TCTTTAGTGTTAATGGCAAAATTATCGAAAAGATTCAGGGTGCAAATGCACCGCTTGTTA AT AAAAAAGTT ATT AATTT G ATCG AT GAG G AG AG AAAAATT G C AGC AG GT G AAAT GG CTC GACCTCAGTATCCTGAAATTCCATTAGTAGACTCAGATTCAGAAGTTAGTGAAGAATCAC CATGTGAAAGTGTTCAGGAATTATACAGTATTGCTATTATCAAACCGGATGCTGTGATTA GTAAAAAAGTT CT AG AAATT AAAAG AAAAATT ACC AAAGCT GG ATTT ATT AT AG AAG CAG AGCATAAGACAGTGCTCACTGAAGAACAAGTTGTCAACTTCTATAGTCGAATAGCAGACC AGTGTGACTTCGAAGAGTTTGTCTCTTTTATGACAAGTGGCTTAAGCTATATTCTAGTTG T ATCT CAAGGAAGT AAACACAATCCTCCCT CT G AAGAAACCG AACC AC AG ACTGACACCG AACCTAACGAACGATCTGAGGATCAACCTGAGGTCGAAGCCCAGGTTACACCTGGAATGA T G AAG AAC AAAC AAG AC AGTTT AC AAG AAT ATCTG G AAAG AC AAC ATTT AGCT CAG CTCT GTGACATTGAAGAGGATGCAGCTAATGTTGCTAAGTTCATGGATGCTTTCTTCCCCGATT TT AAAAAAAT G AAAAG CAT G AAATT AG AAAAG AC ATT G GC ATT ACTTCG ACC AAAT CTCT TT CAT G AAAG G AAAG AT G ATGTTTT G CGT ATT ATT AAAG AT G AAG ACTT C AAAAT ACTGG AGCAAAGACAAGTAGTATTATCGGAAAAAGAAGCACAAGCACTGTGCAAGGAATATGAAA AT G AAGACT ATTTTAAT AAACTT AT AG AAAACAT G ACCAGTGGTCCAT CTCT AGCCCTT G TTTT ATT GAG AG AC AAT GGCTTGC AAT ACT GG AAACAATT ACTGGG ACC AAG AACTGTT G AAGAAGCCATTGAATATTTTCCAGAGAGTTTATGTGCACAGTTTGCGATGGACAGTTTGC CGGTCAACCAGTTGTATGGCAGCGATTCATTAGAAACCGCTGAAAGGGAAATACAGCATT T CTTTCCT CTT C AAAG C ACTTT AGG CTT GATT AAACCT CAT G C AAC AAGT G AAC AAAG AG AGCAG ATCCT GAAGAT AGTTAAGGAGGCT GG ATTTG AT CT G AC ACAGGT GAAGAAAAT GT TCCTAACTCCT GAG C AAAT AG AG AAAATTT ATCC AAAAGT AAC AGG AAAAG ACTTTT AT A AAG ATTT ATT G G AAAT GTTATCTGT GG GTCCAT CTATG GT CAT GATT CT G ACC AAGTG G A ATGCTGTTGCAGAATGGAGACGATTGATGGGCCCAACAGACCCAGAAGAAGCAAAATTAC 
TTTCCCCTGACTCCATCCGAGCCCAGTTTGGAATAAGTAAATTGAAAAACATTGTCCATG G AGCATCT AACGCCT AT GAAGC AAAAG AGGTTGTT AAT AG ACT CTTT GAGG ATCCT G AGG AAAACT AAAGT ATATACTGT G AAAACTTT GAG AAG AT AAT AC AT AT GTT C ACGTC AAT AT ACAACC ATTTGGCACAGCTTCCTGGGAGG AAT AAT AAGAAAAAC AT GCTTTGGAGG AAAA CTCAAGATACAAAAATGAATGGCTATGCATAATAACAATAAAAATGTATTCCCCAAAC (SEQ ID NO: 57)
As used herein, the term “NOS2” refers to the gene encoding Nitric oxide synthase, inducible. The terms “NOS2” and "Nitric oxide synthase, inducible" include wild-type forms of the NOS2 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type NOS2. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type NOS2 nucleic acid sequence (e.g., SEQ ID NO: 58, ENA accession number L24553). SEQ ID NO: 58 is a wild-type gene sequence encoding NOS2 protein, and is shown below:
AAGCCCCACAGTGAAGAACATCTGAGCTCAAATCCAGATAAGTGACATAAGTGACCT GCT
TTGTAAAGCCATAGAGATGGCCTGTCCTTGGAAATTTCTGTTCAAGACCAAATTCCA CCA
GTATGCAATGAATGGGGAAAAAGACATCAACAACAATGTGGAGAAAGCCCCCTGTGC CAC
CTCCAGTCCAGT GACACAGG AT GACCTT CAGTAT CAC AACCT CAGCAAGCAGCAGAAT G A
GTCCCCGCAGCCCCTCGTGGAGACGGGAAAGAAGTCTCCAGAATCTCTGGTCAAGCT GGA
TGCAACCCCATTGTCCTCCCCACGGCATGTGAGGATCAAAAACTGGGGCAGCGGGAT GAC
TTTCCAAGACACACTTCACCATAAGGCCAAAGGGATTTTAACTTGCAGGTCCAAATC TTG
CCTGGGGTCCATTATGACTCCCAAAAGTTTGACCAGAGGACCCAGGGACAAGCCTAC CCC
TCCAGATGAGCTTCTACCTCAAGCTATCGAATTTGTCAACCAATATTACGGCTCCTT CAA
AGAGGCAAAAATAGAGGAACATCTGGCCAGGGTGGAAGCGGTAACAAAGGAGATAGA AAC
AACAGGAACCTACCAACTGACGGGAGATGAGCTCATCTTCGCCACCAAGCAGGCCTG GCG
CAATGCCCCACGCTGCATTGGGAGGATCCAGTGGTCCAACCTGCAGGTCTTCGATGC CCG
CAGCTGTTCCACTGCCCGGGAAATGTTTGAACACATCTGCAGACACGTGCGTTACTC CAC
CAACAATGGCAACATCAGGTCGGCCATCACCGTGTTCCCCCAGCGGAGTGATGGCAA GCA
CGACTTCCGGGTGTGGAATGCTCAGCTCATCCGCTATGCTGGCTACCAGATGCCAGA TGG
CAGCATCAGAGGGGACCCTGCCAACGTGGAATTCACTCAGCTGTGCATCGACCTGGG CTG
GAAGCCCAAGTACGGCCGCTTCGATGTGGTCCCCCTGGTCCTGCAGGCCAATGGCCG TGA
CCCTGAGCTCTTCGAAATCCCACCTGACCTTGTGCTTGAGGTGGCCATGGAACATCC CAA
ATACGAGTGGTTTCGGGAACTGGAGCTAAAGTGGTACGCCCTGCCTGCAGTGGCCAA CAT
GCTGCTTGAGGTGGGCGGCCTGGAGTTCCCAGGGTGCCCCTTCAATGGCTGGTACAT GGG
CACAGAGATCGGAGTCCGGGACTTCTGTGACGTCCAGCGCTACAACATCCTGGAGGA AGT
GGGCAGGAGAATGGGCCTGGAAACGCACAAGCTGGCCTCGCTCTGGAAAGACCAGGC TGT
CGTTGAGATCAACATTGCTGTGCTCCATAGTTTCCAGAAGCAGAATGTGACCATCAT GGA CCACCACTCGGCTGCAGAATCCTTCATGAAGTACATGCAGAATGAATACCGGTCCCGTGG
GGGCTGCCCGGCAGACTGGATTTGGCTGGTCCCTCCCATGTCTGGGAGCATCACCCC CGT
GTTTCACCAGGAGATGCTGAACTACGTCCTGTCCCCTTTCTACTACTATCAGGTAGA GGC
CTGGAAAACCCATGTCTGGCAGGACGAGAAGCGGAGACCCAAGAGAAGAGAGATTCC ATT
GAAAGTCTTGGTCAAAGCTGTGCTCTTTGCCTGTATGCTGATGCGCAAGACAATGGC GTC
CCGAGTCAGAGTCACCATCCTCTTTGCGACAGAGACAGGAAAATCAGAGGCGCTGGC CTG
GGACCTGGGGGCCTTATTCAGCTGTGCCTTCAACCCCAAGGTTGTCTGCATGGATAA GTA
CAGGCT GAGCT GCCT GGAGGAGGAACGGCT GCT GTTGGTGGT GACCAGTACGTTT GGCAA
TGGAGACTGCCCTGGCAATGGAGAGAAACTGAAGAAATCGCTCTTCATGCTGAAAGA GCT
CAACAACAAATTCAGGTACGCTGTGTTTGGCCTCGGCTCCAGCATGTACCCTCGGTT CTG
CGCCTTTGCTCATGACATTGATCAGAAGCTGTCCCACCTGGGGGCCTCTCAGCTCAC CCC
GATGGGAGAAGGGGATGAGCTCAGTGGGCAGGAGGACGCCTTCCGCAGCTGGGCCGT GCA
AACCTTCAAGGCAGCCTGTGAGACGTTTGATGTCCGAGGCAAACAGCACATTCAGAT CCC
CAAGCTCTACACCTCCAATGTGACCTGGGACCCGCACCACTACAGGCTCGTGCAGGA CTC
ACAGCCTTTGGACCTCAGCAAAGCCCTCAGCAGCATGCATGCCAAGAACGTGTTCAC CAT
GAGGCTCAAATCTCGGCAGAATCTACAAAGTCCGACATCCAGCCGTGCCACCATCCT GGT
GGAACTCTCCTGTGAGGATGGCCAAGGCCTGAACTACCTGCCGGGGGAGCACCTTGG GGT
TTGCCCAGGCAACCAGCCGGCCCTGGTCCAAGGCATCCTGGAGCGAGTGGTGGATGG CCC
CACACCCCACCAGACAGT GCGCCTGGAGGCCCT GGAT GAGAGT GGCAGCTACT GGGTCAG
TGACAAGAGGCTGCCCCCCTGCTCACTCAGCCAGGCCCTCACCTACTTCCTGGACAT CAC
CACACCCCCAACCCAGCTGCTGCTCCAAAAGCTGGCCCAGGTGGCCACAGAAGAGCC TGA
GAGACAGAGGCTGGAGGCCCTGTGCCAGCCCTCAGAGTACAGCAAGTGGAAGTTCAC CAA
CAGCCCCACATTCCTGGAGGTGCTAGAGGAGTTCCCGTCCCTGCGGGTGTCTGCTGG CTT
CCTGCTTTCCCAGCTCCCCATTCTGAAGCCCAGGTTCTACTCCATCAGCTCCTCCCG GGA
T CACACGCCC ACGGAGATCCACCT G ACT GTGGCCGTGGT CACCT ACC AC ACCCG AGAT GG
CCAGGGTCCCCTGCACCACGGCGTCTGCAGCACATGGCTCAACAGCCTGAAGCCCCA AGA
CCCAGTGCCCTGCTTTGTGCGGAATGCCAGCGGCTTCCACCTCCCCGAGGATCCCTC CCA
TCCTTGCATCCTCATCGGGCCTGGCACAGGCATCGCGCCCTTCCGCAGTTTCTGGCA GCA
ACGGCTCCATGACTCCCAGCACAAGGGAGTGCGGGGAGGCCGCATGACCTTGGTGTT TGG
GTGCCGCCGCCCAGATGAGGACCACATCTACCAGGAGGAGATGCTGGAGATGGCCCA GAA
GGGGGTGCTGCATGCGGTGCACACAGCCTATTCCCGCCTGCCTGGCAAGCCCAAGGT CTA
TGTTCAGGACATCCTGCGGCAGCAGCTGGCCAGCGAGGTGCTCCGTGTGCTCCACAA GGA
GCCAGGCCACCTCTATGTTTGCGGGGATGTGCGCATGGCCCGGGACGTGGCCCACAC CCT
GAAGCAGCTGGTGGCTGCCAAGCTGAAATTGAATGAGGAGCAGGTCGAGGACTATTT CTT
T CAGCT CAAG AGCCAG AAGCGCT AT CACG AAG AT ATCTTT GGTGCTGT ATTTCCTT ACGA
GGCGAAGAAGGACAGGGTGGCGGTGCAGCCCAGCAGCCTGGAGATGTCAGCGCTCTG AGG
GCCTACAGGAGGGGTTAAAGCTGCCGGCACAGAACTTAAGGATGGAGCCAGCTCT
(SEQ ID NO: 58)
As used herein, the term “PICALM” refers to the gene encoding Phosphatidylinositol-binding clathrin assembly protein. The terms “PICALM” and "Phosphatidylinositol-binding clathrin assembly protein" include wild-type forms of the PICALM gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type PICALM. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type PICALM nucleic acid sequence (e.g., SEQ ID NO: 59, ENA accession number U45976). SEQ ID NO: 59 is a wild-type gene sequence encoding PICALM protein, and is shown below:
GCGCGGCCCCGAACCGCCGCCAGGCCGGCACGGGGGAAGGAGCCGGTGGGGGTAGGG GGT
GCGGTGGGGGGTGGGGACCCTCCGGCTCTTGGGGGTCCCAGTCCCCGCCGGCTGCTG AGC
GGGTGGGGTGGTGGAGGAGCTGCAGAGATGTCCGGCCAGAGCCTGACGGACCGAATC ACT
GCCGCCCAGCACAGTGTCACCGGCTCTGCCGTATCCAAGACAGTATGCAAGGCCACG ACC
CACGAGATCATGGGGCCCAAGAAAAAGCACCTGGACTACTTAATTCAGTGCACAAAT GAG
AT G AAT GT G AAC ATCCC AC AGTT G GC AG AC AGTTT ATTT G AAAG AACT ACT AAT AGTAGT
TGGGTGGTGGTCTT C AAAT CTCT C ATT AC AACTC AT C ATTT GAT G GTGTATG G AAAT GAG
CGTTTTATTCAGTATTTGGCTTCAAGAAACACGTTGTTTAACTTAAGCAATTTTTTG GAT
AAAAGT GG ATT G C AAGG AT AT G AC AT GTCT AC ATTT ATT AG GCG GT ATAGTAG AT ATTT A
AAT GAG AAAG C AGTTT CAT AC AG AC AAGTT G C ATTT G ATTT C AC AAAAGT G AAG AG AG GG
G CT GAT G G AGTT AT GAG AAC AAT G AAC AC AG AAAAACTCCT AAAAACT GTAC C AATT ATT
C AG AAT C AAAT GG ATGC ACTTCTT G ATTTT AAT GTT AAT AG C AAT G AACTT AC AAAT G GG
GTAATAAATGCTGCCTTCATGCTCCTGTTCAAAGATGCCATTAGACTGTTTGCAGCA TAC
CAT G AAGG AATTATT AATTT GTTGGAAAAAT ATTTT GAT AT GAAAAAG AACC AAT GCAAA
G AAGGTCTT G AC AT CTAT AAG AAGTTCCT AACT AG GAT G AC AAG AAT CT C AG AGTTCCT C
AAAGTTGCAGAGCAAGTTGGAATTGACAGAGGTGATATACCAGACCTTTCACAGGCC CCT
AGCAGTCTTCTTGATGCTTTGGAACAACATTTAGCTTCCTTGGAAGGAAAGAAAATC AAA
GATTCTACAGCTGCAAGCAGGGCAACTACACTTTCCAATGCAGTGTCTTCCCTGGCA AGC
ACTGGTCTATCTCT G AC C AAAGT G GAT G AAAGG G AAAAG C AGGC AGC ATT AG AG G AAG AA
CAGGCACGTTTGAAAGCTTTAAAGGAACAGCGCCTAAAAGAACTTGCAAAGAAACCT CAT
ACCTCTTTAACAACTGCAGCCTCTCCTGTATCCACCTCAGCAGGAGGGATAATGACT GCA
CCAGCCATTGACATATTTTCTACCCCTAGTTCTTCTAACAGCACATCAAAGCTGCCC AAT
GAT CT GCTT G ATTT GCAGCAGCCAACTTTT CACCCATCTGTACATCCT ATGTCAACT GCT
TCTCAGGTAGCAAGTACATGGGGAGATCCTTTCTCTGCTACTGTAGATGCTGTTGAT GAT
GCCATTCCAAGCTTAAATCCTTTCCTCACAAAAAGTAGTGGTGATGTTCACCTTTCC ATT
T CTTCAGATGTATCT ACTTTT ACT ACT AGG AC ACCT ACTCAT G AAATGTTTGTT GGATT C
ACTCCTTCTCCAGTTGCACAGCCACACCCTTCAGCTGGCCTTAATGTTGACTTTGAA TCT
GTGTTTGGAAATAAATCTACAAATGTTATTGTAGATTCTGGGGGCTTTGATGAACTA GGT
GGACTTCTCAAACCAACAGTGGCCTCTCAGAACCAGAACCTTCCTGTTGCCAAACTC CCA
CCTAGCAAGTTAGTATCTGATGACTTGGATTCATCTTTAGCCAACCTTGTGGGCAAT CTT
GGCATCGGAAATGGAACCACTAAGAATGATGTAAATTGGAGTCAACCAGGTGAAAAG AAG
TTAACTGGGGGATCTAACTGCGAACCAAAGGTTGCACCAACAACCGCTTGGAATGCT GCA
ACAAT GGCACCCCCT GT AAT GGCCTAT CCT GCT ACTACACCAAC AGGC AT G ATAGG AT AT
GGAATTCCTCCACAAATGGGAAGTGTTCCTGTAATGACGCAACCAACCTTAATATAC AGC
CAGCCTGTCATGAGACCTCCAAACCCCTTTGGCCCTGTATCAGGAGCACAGATACAG TTT ATGTAACTTG AT GG AAG AAAAT G G AATT ACTCC AAAAAG AC AAGT GCT C AAG C AG C AAAA
TCCTTACTTCCAGCAAAATCCAAACTGCTGTCTCTTAAATCTCTTAAACTCTCTTCT TCC
ATTAGGATGCTACAAGTANCTCAGTGAAGGCCCATGAAGGGAATTGGGGACTAGTTT ATA
GGGNGAACGTATTCATTACAGTTTATAAAGGCCAGGATTGGNTTGGATTTTAGGATT ANG
TTC
(SEQ ID NO: 59)
As used herein, the term “PILRA” refers to the gene encoding Paired Immunoglobin Like Type 2 Receptor Alpha. The terms “PILRA” and "Paired Immunoglobin Like Type 2 Receptor Alpha" include wild-type forms of the PILRA gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type PILRA. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type PILRA nucleic acid sequence (e.g., SEQ ID NO: 60, NCBI Reference Sequence: NM_013439.2). SEQ ID NO: 60 is a wild-type gene sequence encoding PILRA protein, and is shown below:
AATAGGGGAAAATAAGCCAGATGGATAAAGGAAGTGCTGGTCACCCTGGAGGTGCAC TGGTTTGGG
GAAGGCTCCTGGCCCCCACAGCCCTCTTCGGAGCCTGAGCCCGGCTCTCCTCACTCA CCTCAACCC
CCAGGCGGCCCCTCCACAGGGCCCCTCTCCTGCCTGGACGGCTCTGCTGGTCTCCCC GTCCCCTG
GAGAAGAACAAGGCCATGGGTCGGCCCCTGCTGCTGCCCCTACTGCCCTTGCTGCTG CCGCCAGC
ATTTCTGCAGCCTAGTGGCTCCACAGGATCTGGTCCAAGCTACCTTTATGGGGTCAC TCAACCAAAA
CACCTCTCAGCCTCCATGGGTGGCTCTGTGGAAATCCCCTTCTCCTTCTATTACCCC TGGGAGTTAG
CCACAGCTCCCGACGTGAGAATATCCTGGAGACGGGGCCACTTCCACAGGCAGTCCT TCTACAGCA
CAAGGCCGCCTTCCATTCACAAGGATTATGTGAACCGGCTCTTTCTGAACTGGACAG AGGGTCAGAA
GAGCGGCTTCCTCAGGATCTCCAACCTGCAGAAGCAGGACCAGTCTGTGTATTTCTG CCGAGTTGA
GCTGGACACACGGAGCTCAGGGAGGCAGCAGTGGCAGTCCATCGAGGGGACCAAACT CTCCATCA
CCCAGGCTGTCACGACCACCACCCAGAGGCCCAGCAGCATGACTACCACCTGGAGGC TCAGTAGC
ACAACCACCACAACCGGCCTCAGGGTCACACAGGGCAAACGACGCTCAGACTCTTGG CACATAAGT
CTGGAGACTGCTGTGGGGGTGGCAGTGGCTGTCACTGTGCTCGGAATCATGATTTTG GGACTGATC
TGCCTCCTCAGGTGGAGGAGAAGGAAAGGTCAGCAGCGGACTAAAGCCACAACCCCA GCCAGGGA
ACCCTT CC AAAACACAG AGG AGCCATAT G AG AAT ATCAGG AAT G AAGGACAAAAT AC AG ATCCC AAG
CTAAATCCCAAGGATGACGGCATCGTCTATGCTTCCCTTGCCCTCTCCAGCTCCACC TCACCCAGAG
CACCTCCCAGCCACCGTCCCCTCAAGAGCCCCCAGAACGAGACCCTGTACTCTGTCT TAAAGGCCT
AACCAATGGACAGCCCTCTCAAGACTGAATGGTGAGGCCAGGTACAGTGGCGCACAC CTGTAATCC
CAGCTACTCTGAAGCCTGAGGCAGAATCAAGTGAGCCCAGGAGTTCAGGGCCAGCTT TGATAATGG
AGCG AGATGCCATCT CT AGTT AAAAAT AT AT ATT AACAATAAAGT AACAAATTT AAAAAG AT AAAAAAA
(SEQ ID NO: 60)
As used herein, the term “PLCG2” refers to the gene encoding 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-2. The terms “PLCG2” and "1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-2" include wild-type forms of the PLCG2 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type PLCG2. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type PLCG2 nucleic acid sequence (e.g., SEQ ID NO: 61, ENA accession number M37238). SEQ ID NO: 61 is a wild-type gene sequence encoding PLCG2 protein, and is shown below:
GAATTCGGCGCTGAGTGACCCGAGTCGGGACGCGGGCTGCGCGCGCGGGACCCCGGA GCC
CAAACCCGGGGCAGGCGGGCAGCTGTGCCCGGGCGGCACGGCCAGCTTCCTGATTTC TCC
CGATTCCTTCCTTCTCCCTGGAGCGGCCGACAATGTCCACCACGGTCAATGTAGATT CCC
TTGCGGAATATGAGAAGAGCCAGATCAAGAGAGCCCTGGAGCTGGGGACGGTGATGA CTG
TGTTCAGCTTCCGCAAGTCCACCCCCGAGCGGAGAACCGTCCAGGTGATCATGGAGA CGC
GGCAGGTGGCCTGGAGCAAGACCGCCGACAAGATCGAGGGCTTCTTGGATATCATGG AAA
TAAAAGAAATCCGCCCAGGGAAGAACTCCAAAGATTTCGAGCGAGCAAAAGCAGTTC GCC
AGAAAGAAGACTGCTGCTTCACCATCCTATATGGCACTCAGTTCGTCCTCAGCACGC TCA
GCTTGGCAGCTGACTCTAAAGAGGATGCAGTTAACTGGCTCTCTGGCTTGAAAATCT TAC
ACCAGGAAGCGATGAATGCGTCCACGCCCACCATTATCGAGAGTTGGCTGAGAAAGC AGA
T ATATT CTGTGGATC AAACCAGAAG AAACAGC AT CAGTCTCCGAG AGTT GAAGACCAT CT
TGCCCCTGATCAACTTTAAAGTGAGCAGTGCCAAGTTCCTTAAAGATAAGTTTGTGG AAA
T AG GAG C AC AC AAAG AT G AGCT C AG CTTT G AAC AGTTCC AT CTCTTCTAT AAAAAACTT A
TGTTTGAACAGCAAAAATCGATTCTCGATGAATTCAAAAAGGATTCGTCCGTGTTCA TCC
T GGGGAACACTGACAGGCCGGAT GCCTCT GCT GTTTACCT GCAT GACTTCCAGAGGTTTC
TCATACATGAACAGCAGGAGCATTGGGCTCAGGATCTGAACAAAGTCCGTGAGCGGA TGA
CAAAGTTCATT GAT GACACCATGCGT G AAACTGCT GAGCCTTTCTTGTTTGTGGAT G AGT
TCCT CACGTACCT GTTTTCACG AG AAAACAGC AT CT GGGAT GAGAAGTAT GACGCGGTGG
ACAT GCAGG AC AT GAACAACCCCCTGTCT CATTACT GG AT CTCCTCGT CACAT AACACGT
ACCTTACAGGTGACCAGCTGCGGAGCGAGTCGTCCCCAGAAGCTTACATCCGCTGCC TGC
GCATGGGCTGTCGCTGCATTGAACTGGACTGCTGGGACGGGCCCGATGGGAAGCCGG TCA
TCTACCATGGCTGGACGCGGACTACCAAGATCAAGTTTGATGACGTCGTGCAGGCCA TCA
AAG ACCACGCCTTTGTT ACCTCGAGCTTCCCAGT GATCCT GTCCATCG AGGAGCACT GCA
GCGTGGAGCAACAGCGTCACATGGCCAAGGCCTTCAAGGAAGTATTTGGCGACCTGC TGT
TGACGAAGCCCACGGAGGCCAGTGCTGACCAGCTGCCCTCGCCCAGCCAGCTGCGGG AGA
AGATCATCATCAAGCATAAGAAGCTGGGCCCCCGAGGCGATGTGGATGTCAACATGG AGG
ACAAGAAGGACGAACACAAGCAACAGGGGGAGCTGTACATGTGGGATTCCATTGACC AGA
AATGGACTCGGCACTACTGCGCCATTGCTGATGCCAAGCTGTCCTTCAGTGATGACA TTG
AACAGACTATGGAGGAGGAAGTGCCCCAGGATATACCCCCTACAGAACTACATTTTG GGG
AGAAATGGTTCCACAAGAAGGTGGAGAAGAGGACGAGTGCCGAGAAGTTGCTGCAGG AAT
ACTGCATGGAGACGGGGGGCAAGGATGGCACCTTCCTGGTTCGGGAGAGCGAGACCT TCC
CCAATGACTACACCCTGTCCTTCTGGCGGTCAGGCCGGGTCCAGCACTGCCGGATCC GCT
CCACCATGG AGGGCGGGACCCT G AAAT ACT ACTT GACT G AC AACCT GAGGTT CAGGAGG A
TGTATGCCCTCATCCAGCACTACCGCGAGACGCACCTGCCGTGCGCCGAGTTCGAGC TGC
GGCTCACGGACCCTGTGCCCAACCCCAACCCCCACGAGTCCAAGCCGTGGTACTATG ACA
GCCTGAGCCGCGGAGAGGCAGAGGACATGCTGATGAGGATTCCCCGGGACGGGGCCT TCC TGATCCGGAAGCGAGAGGGGAGCGACTCCTATGCCATCACCTTCAGGGCTAGGGGCAAGG
TAAAGCATTGTCGCATCAACCGGGACGGCCGGCACTTTGTGCTGGGGACCTCCGCCT ATT
TT GAGAGT CTGGT GGAGCTCGT CAGTT ACT ACGAGAAGCATT CACT CT ACCGAAAG AT GA
GACTGCGCTACCCCGTGACCCCCGAGCTCCTGGAGCGCTACAATACGGAAAGAGATA TAA
ACTCCCTCTACGACGTCAGCAGAATGTATGTGGATCCCAGTGAAATCAATCCGTCCA TGC
CTCAGAGAACCGTGAAAGCTCTGTATGACTACAAAGCCAAGCGAAGCGATGAGCTGA GCT
TCTGCCGTGGTGCCCTCATCCACAATGTCTCCAAGGAGCCCGGGGGCTGGTGGAAAG GAG
ACTATGGAACCAGGATCCAGCAGTACTTCCCATCCAACTACGTCGAGGACATCTCAA CTG
CAGACTTCGAGGAGCTAGAAAAGCAGATTATTGAAGACAATCCCTTAGGGTCTCTTT GCA
GAGGAATATTGGACCTCAATACCTATAACGTCGTGAAAGCCCCTCAGGGAAAAAACC AGA
AGTCCTTTGTCTTCATCCTGGAGCCCAAGGAGCAGGGCGATCCTCCGGTGGAGTTTG CCA
CAGACAGGGTGGAGG AGCT CTTT GAGT GGTTT CAG AGCATCCG AG AG ATCACGTGG AAGA
TTGACAGCAAGGAGAACAACATGAAGTACTGGGAGAAGAACCAGTCCATCGCCATCG AGC
TCTCTGACCTGGTTGTCTACTGCAAACCAACCAGCAAAACCAAGGACAACTTAGAAA ATC
CTGACTTCCGAGAAATCCGCTCCTTTGTGGAGACGAAGGCTGACAGCATCATCAGAC AGA
AGCCCGTCGACCTCCTGAAGTACAATCAAAAGGGCCTGACCCGCGTCTACCCAAAGG GAC
AAAGAGTTGACTCTTCAAACTACGACCCCTTCCGCCTCTGGCTGTGCGGTTCTCAGA TGG
TGGCACTCAATTTCCAGACGGCAGATAAGTACATGCAGATGAATCACGCATTGTTTT CTC
TCAACGGGCGCACGGGCTACGTTCTGCAGCCTGAGAGCATGAGGACAGAGAAATATG ACC
CGATGCCACCCGAGTCCCAGAGGAAGATCCTGATGACGCTGACAGTCAAGGTTCTCG GTG
CTCGCCATCTCCCCAAACTTGGACGAAGTATTGCCTGTCCCTTTGTAGAAGTGGAGA TCT
GTGGAGCCGAGTATGGCAACAACAAGTTCAAGACGACGGTTGTGAATGATAATGGCC TCA
GCCCTATCTGGGCTCCAACACAGGAGAAGGTGACATTTGAAATTTATGACCCAAACC TGG
CATTT CTGCGCTTT GTGGTTT AT GAAG AAG AT ATGTT CAGCGATCCC AACTTT CTT GCTC
ATGCCACTTACCCCATTAAAGCAGTCAAATCAGGATTCAGGTCCGTTCCTCTGAAGA ATG
GGTACAGCGAGGACATAGAGCTGGCTTCCCTCCTGGTTTTCTGTGAGATGCGGCCAG TCC
TGGAGAGCGAAGAGGAACTTTACTCCTCCTGTCGCCAGCTGAGGAGGCGGCAAGAAG AAC
TGAACAACCAGCTCTTTCTGTATGACACACACCAGAACTTGCGCAATGCCAACCGGG ATG
CCCTGGTTAAAGAGTTCAGTGTTAATGAGAACCACTCCAGCTGTACCAGGAGAAATG CAA
C AAG AGGTT AAG AG AG AAG AG AGTC AG C AAC AGC AAGTTTT ACTC AT AG AAG CTGGGGTA
TGTGTGTAAGGGTATTGTGTGTGTGCGCATGTGTGTTTGCATGTAGGAGAACGTGCC CTA
TTCACACTCTGGGAAGACGCTAATCTGTGACATCTTTTCTTCAAGCCTGCCATCAAG GAC
ATTTCTTAAGACCCAACTGGCATGAGTTGGGGTAATTTCCTATTATTTTCATCTTGG ACA
ACTTCTAACTTATATCTTTATAGAGGATTCCCCAAAATGTGCTCCTCATTTTTGGCC TCT
CATGTTCCAAACCTCATTGAATAAAAAGCAATGAAAACCTTG
(SEQ ID NO: 61)
As used herein, the term “PTK2B” refers to the gene encoding Protein-tyrosine kinase 2-beta. The terms “PTK2B” and "Protein-tyrosine kinase 2-beta" include wild-type forms of the PTK2B gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type PTK2B. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type PTK2B nucleic acid sequence (e.g., SEQ ID NO: 62, ENA accession number U33284). SEQ ID NO: 62 is a wild-type gene sequence encoding PTK2B protein, and is shown below:
CGGTACAGGTAAGTCGGCCGGGCAGGTAGGGGTGCCCGAGGAGTAGTCGCTGGAGTC CGC
GCCTCCCTGGGACTGCAATGTGCCGGTCTTAGCTGCTGCCTGAGAGGATGTCTGGGG TGT
CCGAGCCCCTGAGCCGAGTAAAGTTGGGCACATTACGCCGGCCTGAAGGCCCTGCAG AGC
CCATGGTGGTGGTACCAGTAGATGTGGAAAAGGAGGACGTGCGTATCCTCAAGGTCT GCT
TCTATAGCAACAGCTTCAATCCTGGGAAGAACTTCAAACTGGTCAAATGCACTGTCC AGA
CGGAGATCCGGGAGATCATCACCTCCATCCTGCTGAGCGGGCGGATCGGGCCCAACA TCC
GGTTGGCTGAGTGCTATGGGCTGAGGCTGAAGCACATGAAGTCCGATGAGATCCACT GGC
T GCACCCACAGAT GACGGT GGGT GAGGT GCAGGACAAGTATGAGTGTCT GCACGT GGAAG
CCGAGTGGAGGTATGACCTTCAAATCCGCTACTTGCCAGAAGACTTCATGGAGAGCC TGA
AGGAGGACAGGACCACGCTGCTCTATTTTTACCAACAGCTCCGGAACGACTACATGC AGC
GCTACGCCAGCAAGGTCAGCGAGGGCATGGCCCTGCAGCTGGGCTGCCTGGAGCTCA GGC
GGTTCTTCAAGGATATGCCCCACAATGCACTTGACAAGAAGTCCAACTTCGAGCTCC TAG
AAAAGGAAGTGGGGCTGGACTTGTTTTTCCCAAAGCAGATGCAGGAGAACTTAAAGC CCA
AACAGTTCCGGAAGATGATCCAGCAGACCTTCCAGCAGTACGCCTCGCTCAGGGAGG AGG
AGTGCGTCATGAAGTTCTTCAACACTCTCGCCGGCTTCGCCAACATCGACCAGGAGA CCT
ACCGCT GT G AACT CATT CAAGG ATGGAACATT ACT GT GG ACCTGGT CATTGGCCCT AAAG
GGATCCGCCAGCTGACTAGTCAGGACGCAAAGCCCACCTGCCTGGCCGAGTTCAAGC AGA
TCAGGTCCATCAGGTGCCTCCCGCTGGAGGAGGGCCAGGCAGTACTTCAGCTGGGCA TTG
AAGGTGCCCCCCAGGCCTTGTCCATCAAAACCTCATCCCTAGCAGAGGCTGAGAACA TGG
CTGACCTCATAGACGGCTACTGCCGGCTGCAGGGTGAGCACCAAGGCTCTCTCATCA TCC
ATCCTAGGAAAGATGGTGAGAAGCGGAACAGCCTGCCCCAGATCCCCATGCTAAACC TGG
AGGCCCGGCGGTCCCACCTCTCAGAGAGCTGCAGCATAGAGTCAGACATCTACGCAG AGA
TTCCCGACGAAACCCTGCGAAGGCCCGGAGGTCCACAGTATGGCATTGCCCGTGAAG ATG
TGGTCCTGAATCGTATTCTTGGGGAAGGCTTTTTTGGGGAGGTCTATGAAGGTGTCT ACA
CAAATCACAAAGGGGAGAAAATCAATGTAGCTGTCAAGACCTGCAAGAAAGACTGCA CTC
TGGACAACAAGGAGAAGTTCATGAGCGAGGCAGTGATCATGAAGAACCTCGACCACC CGC
ACATCGTGAAGCTGATCGGCATCATTGAAGAGGAGCCCACCTGGATCATCATGGAAT TGT
ATCCCTATGGGGAGCTGGGCCACTACCTGGAGCGGAACAAGAACTCCCTGAAGGTGC TCA
CCCTCGTGCTGTACTCACTGCAGATATGCAAAGCCATGGCCTACCTGGAGAGCATCA ACT
GCGTGCACAGGGACATTGCTGTCCGGAACATCCTGGTGGCCTCCCCTGAGTGTGTGA AGC
TGGGGGACTTTGGTCTTTCCCGGTACATTGAGGACGAGGACTATTACAAAGCCTCTG TGA
CTCGTCTCCCCATCAAATGGATGTCCCCAGAGTCCATTAACTTCCGACGCTTCACGA CAG
CCAGTGACGTCTGGATGTTCGCCGTGTGCATGTGGGAGATCCTGAGCTTTGGGAAGC AGC
CCTTCTTCTGGCTGGAGAACAAGGATGTCATCGGGGTGCTGGAGAAAGGAGACCGGC TGC
CCAAGCCTGATCTCTGTCCACCGGTCCTTTATACCCTCATGACCCGCTGCTGGGACT ACG
ACCCCAGTGACCGGCCCCGCTTCACCGAGCTGGTGTGCAGCCTCAGTGACGTTTATC AGA
TGGAGAAGGACATTGCCATGGAGCAAGAGAGGAATGCTCGCTACCGAACCCCCAAAA TCT
TGGAGCCCACAGCCTTCCAGGAACCCCCACCCAAGCCCAGCCGACCTAAGTACAGAC CCC CTCCGCAAACCAACCTCCTGGCTCCAAAGCTGCAGTTCCAGGTTCCTGAGGGTCTGTGTG
CCAGCTCTCCTACGCTCACCAGCCCTATGGAGTATCCATCTCCCGTTAACTCACTGC ACA
CCCCACCTCTCCACCGGCACAATGTCTTCAAACGCCACAGCATGCGGGAGGAGGACT TCA
TCCAACCCAGCAGCCGAGAAGAGGCCCAGCAGCTGTGGGAGGCTGAAAAGGTCAAAA TGC
GGCAAATCCTGGACAAACAGCAGAAGCAGATGGTGGAGGACTACCAGTGGCTCAGGC AGG
AGG AGAAGTCCCTGGACCCCATGGTTTAT AT GAAT GAT AAGTCCCC ATT GACGCCAG AG A
AGGAGGTCGGCTACCTGGAGTTCACAGGGCCCCCACAGAAGCCCCCGAGGCTGGGCG CAC
AGTCCATCCAGCCCACAGCTAACCTGGACCGGACCGATGACCTGGTGTACCTCAATG TCA
TGGAGCTGGTGCGGGCCGTGCTGGAGCTCAAGAATGAGCTCTGTCAGCTGCCCCCCG AGG
GCTACGTGGTGGTGGTGAAGAATGTGGGGCTGACCCTGCGGAAGCTCATCGGGAGCG TGG
ATGATCTCCTGCCTTCCTTGCCGTCATCTTCACGGACAGAGATCGAGGGCACCCAGA AAC
TGCTCAACAAAGACCTGGCAGAGCTCATCAACAAGATGCGGCTGGCGCAGCAGAACG CCG
T GACCTCCCT G AGTG AGG AGTGCAAGAGGCAG ATGCT G ACGGCTT CAC ACACCCT GGCTG
TGGACGCCAAGAACCTGCTCGACGCTGTGGACCAGGCCAAGGTTCTGGCCAATCTGG CCC
ACCCACCTGCAGAGTGACGGAGGGTGGGGGCCACCTGCCTGCGTCTTCCGCCCCTGC CTG
CCATGTACCTCCCCTGCCTTGCTGTTGGTCATGTGGGTCTTCCAGGGAGAAGGCCAA GGG
GAGTCACCTTCCCTTGCCACTTTGCACGACGCCCTCTCCCCACCCCTACCCCTGGCT GTA
CTGCTCAGGCTGCAGCTGGACAGAGGGGACTCTGGGCTATGGACACAGGGTGACGGT GAC
AAAGATGGCTCAGAGGGGGACTGCTGCTGCCTGGCCACTGCTCCCTAAGCCAGCCT
(SEQ ID NO: 62)
As used herein, the term “SCIMP” refers to the gene encoding SLP Adaptor and CSK Interacting Membrane Protein. The terms “SCIMP” and "SLP Adaptor and CSK Interacting Membrane Protein" include wild-type forms of the SCIMP gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type SCIMP. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type SCIMP nucleic acid sequence (e.g., SEQ ID NO: 63, NCBI Reference Sequence: NM_207103.3). SEQ ID NO: 63 is a wild-type gene sequence encoding SCIMP protein, and is shown below:
ACTGTCTCTAGCAGTGGGTGAAGGCCTGTGAGTGAGGAATGCCTCTCACCAGCTGTG CCTGAGCTG
CAGCACTCCAGCCACTGCTGTCTCCTTAGCTGCTCACATATGGATACTTTCACAGTT CAGGATTCCAC
TGCAATGAGCTGGTGGAGGAATAATTTCTGGATCATCTTAGCTGTGGCCATCATCGT TGTCTCTGTG
GGTCTGGGCCTCATCCTGTACTGTGTCTGTAAGTGGCAGCTTAGACGAGGCAAGAAA TGGGAAATT
GCCAAGCCCCTGAAACACAAGCAAGTAGATGAAGAAAAGATGTATGAGAATGTTCTT AATGAGTCGC
CAGTTCAATTACCGCCTCTGCCACCGAGGAATTGGCCTTCTCTAGAAGACTCTTCCC CACAGGAAGC
CCCAAGTCAGCCGCCCGCTACATACTCACTGGTAAATAAAGTTAAAAATAAGAAGAC TGTTTCCATCC
CAAGCTACATTGAGCCTGAAGATGACTATGACGATGTTGAAATCCCTGCAAATACTG AAAAAGCATCA
TTTTGAAACAGCCATTTCTTCTTTTTGGCAAAACTGAAGAGGGTTCACACAACTTAT TTTAAAACAATC
AAGAATGGTTGAACTTCAGTAGGTCTCTGGGCCCTGAAAGCCAGTGGTGATTTTATG AAGCTCTATA
AGATAAAGCACTTCCCAAACCTTAGATGAAGACACCCCTGCGATCGGATGACTGCAG CCAGAGGAG ACACATGGGTGCTCGGCTCTGAGGACTTAGAGGGGTCAGCCTTGTGCTGTTGAGGAAACT TTCCAT
GGGAAGGACCACGGGGCTCCATGGCTCCCACCTGTGGGAAACTACTCATTTCTTGGC ATTCTTTCCC
CCTTCATTCCCTTTGGTTTGCATGGTTCTGAGTGATATTAAATCTCAGCATTTGGTT GTGCAGACCCT
CCCAGGCTCCCATCCCCAGCAAGGCCCTCACCAAGCATGCTGGTCTTTACCCTCTCA CCCCACCCA
CCTCCTGCACTGTGAGGCTGTGGGTGAGTTACAGCTGAGTGCTCTCGTGCCCAGGTT CCCACACCA
CATCTCGCGAGTTTGCAAGGGCAGGGAGTACCTTTTGTTCTCGTGAACCCTCCCCAC CTAGACACCC
T GCAAACCCCAGT GCCTTTATAT GATGTAGGCCAAATT GACC AT AG AGATTT G AGTTTT CACCT AG GT
TTTCTCCCCGT GCTTGCAAGTT GT ACT GT AAC AAT GG ACAAAGG ACAAAAGTT ACCTTCT GATTTACA
CCTAGAAGCATCATTTTGCAATAGGTGTGTTGGGGGTGCTACAGGAAAAATACATTT CCCCCAGGAC
AAATCATGGGGAACAGGAAAGAAAAGGGGCATGTAACAATGGCATATACAAGATGAG AGTTCAGGG
GGCTTAATATCCCCTGTCCATCATTTTCATCAGTACTTACTCGAGTTCTAGGAAAAC AGCCTCAAGCC
CCTTCCTTCCAGATCACTGTCCCTGGGCATCTGGGAGGAGGCAGAAGGTCCACTGTG ATGTGCTGC
AGCCAATGAGATGGGCCAGGGACATGGGCAGATGTCTTGTTAAACAAGTGTCCTAAT GGGGTCAAC
AAGGCCCGAGTCAGCTTTATAGGCTCTTAGACCTCATCAATTCCTTCTAGCTGATCG CCAGAGCCCT
AGGACTTGACTCATTCTAACTATACTCACAAGATGCTGGTTTCTAAGTGACCTCTGG GAAATCTGGCA
AAT GAACAGCCTT GCAGAGAGAGCACTGT GAACCT GGAAAGGCCT GAGAGT GACTCAGATTTCCCT
CAAGAGATGGGAAAATGTGTTCCTCCCATTTTCAAGCTTTCTCCCTCAATCAACGCT GGAGCACTGG
GGACCTGGGCTTCCTCCCTGGTTCTCTCTTTCCAGACTCTATGAAGGCTTCCACCTT GCTATTAATAC
CTCCTTGGGAGGCCAAGGTGGGCGGATCACCTGAGGTCGGGAGTTCGAGACCAGCCT GACCAACA
TGGAGAAACCCCATTTCTACTAAAAATACAAAATTAGTCAGGCATGGTCGCGCATGC CTGTAATCCCA
GCTACTTGGGAGGCTGAGGCAGAAGAATCGCTTGAAACTGGGAGGCGGAGGTTGCGG TGAGCCGA
G AACATGCCATT GC ACTCCAGCCT GG ACAACAAG AGT GAAACTCCATCT AAAAAT AAATAAAT AAAT A
AATAAATAAACCCTCCTTATGTTAGGCCAGTAGTTATCTAACTATGGCCTTATGGGA CTCTGGTATCC
CACCAGCCAAAGAGAGGACTCTTCCCAAATTATAGAACAAAAATAAGCCAAAGGATT GGAGTGTTTC
AAAC ACATGCTTTCGT CTTATAAATGTT CTGTAAACCCTCCAT G ACT AT G ACAAAAGTT AAAAACAAAT
GCCAGACAAA
(SEQ ID NO: 63)
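The "at least 70% sequence identity" criterion recited in these definitions can be illustrated with a short calculation. The following is a minimal sketch only. It assumes the two nucleic acid sequences have already been aligned to equal length, with "-" marking gaps, and that identity is counted over every aligned column not gapped in both sequences. Other conventions for the denominator exist and may give different values. The names percent_identity and meets_threshold and the toy sequences are hypothetical and form no part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: columns gapped in both sequences are ignored;
    every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the pair reaches the stated identity threshold (default 70%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example: a 10-nucleotide reference and a variant with one substitution.
reference = "ACTGTCTCTA"
variant = "ACTGTCTGTA"
identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity; meets 70% threshold: {meets_threshold(reference, variant)}")
```

In practice the alignment itself would come from an established alignment tool. The sketch covers only the counting step applied to its output.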
As used herein, the term “SLC24A4” refers to the gene encoding Solute Carrier Family 24 Member 4. The terms “SLC24A4” and "Solute Carrier Family 24 Member 4" include wild-type forms of the SLC24A4 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type SLC24A4. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type SLC24A4 nucleic acid sequence (e.g., SEQ ID NO: 64, NCBI Reference Sequence: NM_153646.3). SEQ ID NO: 64 is a wild-type gene sequence encoding SLC24A4 protein, and is shown below:
AGACGGCACCCAGGCGCTCCGGGATGGCGCTCCGCGGGACCCTCCGGCCGCTCAAAG TTCGCAG
GAGGCGAGAGATGCTGCCGCAGCAAGTCGGCTTCGTGTGCGCGGTGCTGGCCCTGGT GTGCTGTG
CGTCCGGCCTCTTCGGCAGCTTGGGGCACAAAACAGCTTCTGCTAGCAAACGTGTCC TGCCAGACA
CGTGGAGAAATAGAAAGTTGATGGCCCCAGTGAATGGGACACAGACAGCCAAGAACT GCACAGATC
CTGCGATTCACGAGTTCCCCACAGATCTGTTCTCCAATAAGGAGCGACAGCACGGAG CCGTCCTGC TGCACATCCTTGGTGCTCTGTATATGTTCTATGCCTTGGCCATAGTGTGCGATGACTTCT TTGTTCCG
TCTCTAGAGAAGATCTGTGAGAGACTCCATCTGAGCGAAGATGTGGCTGGAGCCACC TTCATGGCT
GCAGGAAGCTCAACGCCAGAGCTGTTTGCGTCTGTTATTGGGGTGTTCATCACCCAT GGGGACGTC
GGGGTGGGCACCATCGTGGGCTCTGCTGTGTTCAACATCCTGTGCATAATTGGAGTG TGCGGACTG
TTTGCTGGCCAGGTGGTCCGTCTGACGTGGTGGGCCGTGTGCCGAGACTCCGTGTAC TACACCATC
TCTGTCATCGTGCTCATCGTGTTCATATATGATGAACAAATTGTGTGGTGGGAAGGC CTGGTGCTCA
TCATCTTGTATGTGTTTTATATTCTGATCATGAAGTACAATGTGAAGATGCAAGCCT TTTTCACAGTCA
AACAAAAGAGCATTGCAAACGGTAACCCGGTCAACAGTGAGCTGGAGGCTGGTAATG ATTTCTATGA
CGGTAGCTATGATGACCCTTCCGTGCCATTGCTGGGGCAAGTGAAGGAGAAGCCACA GTATGGCAA
GAACCCCGTGGTGATGGTGGACGAGATTATGAGCTCCAGCCCTCCCAAGTTCACCTT CCCTGAAGC
AGGCTTACGAATCATGATCACCAATAAGTTTGGACCCAGGACCCGACTACGGATGGC CAGCAGGAT
CATCATTAATGAGCGGCAGAGACTGATCAACTCGGCCAATGGTGTGAGCAGTAAGCC GCTTCAAAAC
GGGAGGCACGAGAACATTGAGAACGGGAATGTTCCTGTGGAAAACCCCGAAGACCCT CAGCAGAAT
CAGGAGCAGCAGCCGCCGCCACAGCCACCACCGCCAGAGCCAGAGCCGGTGGAGGCT GACTTCCT
GTCCCCCTTCTCCGTGCCGGAGGCCAGAGGGGACAAGGTCAAGTGGGTGTTCACCTG GCCCCTCA
TCTTCCTCCTGTGCGTCACCATTCCCAACTGCAGCAAGCCCCGCTGGGAGAAGTTCT TCATGGTCAC
CTTCATCACCGCCACGCTGTGGATCGCTGTGTTCTCCTACATCATGGTGTGGCTGGT GACTATTATC
GGATACACACTTGGGATCCCGGATGTCATCATGGGCATTACTTTCCTGGCAGCAGGG ACAAGTGTTC
CAGACTGCATGGCCAGCCTAATTGTGGCGAGACAAGGCCTTGGGGACATGGCAGTCT CCAACACCA
TAGGAAGCAACGTGTTTGACATCCTGGTAGGACTTGGTGTACCGTGGGGCCTGCAGA CCATGGTTG
TTAATTATGGATCAACAGTGAAGATCAACAGCCGGGGGCTGGTCTATTCCGTGGTCC TGTTGCTGGG
CTCTGTCGCTCTCACCGTCCTCGGCATCCACCTAAACAAGTGGCGACTGGACCGGAA GCTGGGTGT
CTACGTGCTGGTTCTCTACGCCATCTTCTTGTGCTTCTCCATAATGATAGAGTTTAA CGTCTTTACCTT
CGTCAACTTGCCGATGTGCCGGGAAGACGATTAGCGCTGAGTCGCGGCCCCTGGGAG CTGATCTG
GACACCCTGTGACACTGGCGTTCTCCTCTCCCCTCCTTCCCCCACCACAGGTCTCTC CTGCATAGGC
AGCCACTGTCCGTTCTTTCACACACTGGAAGGAAGAGCCATCGTGGTCTTTGTCTGG CCACAGGCCA
GGCTGCTGGGCATCCTCCTCCTCCTTGGAGTTCCGCCCCTGCAAGGCTGGATTTGGG GGCCATTAT
CTGAGCAGCTTCAAAGACCCCTGAGCTGCCAACCACGGAGATGTGCCAAGCATCTCA TCTCTCCTG
CACACTTTAGTCAGAAGGACTTCTGCATGCAGTTTGTCTTTCTGTTCTGCAGGCAGC TTCAGAATTGA
GGTCATTTGTGAGCACAAGATCTCATAGGGCAGGTGCAAAATAGGAATGTTGTTCTC AAGTGTCACC
TCCAGCCCAGAGGTGGTTCCTTAGGCAGCATGTGCTCCTGGGAGCCTCTGACTTTTG CTGGAAGCA
GCCACAGTTTGGAAGGGGCAAGACCTCAACCTGTTGGGGTTTAGGGCCCATGATGGC AGACATTCT
ACCCCTTTTCCT GG AAAAACTGGAAG AAT GAAAAT AATTTTTTT CTGTGGAAG AG AG AAAAT G AGTG A
AT ATT CTTCT C ACTTTT ATT GAT GC ATT C AG AG AAT AAGC AAT G AAAT ATT AAAAAAT G AAAC AT CAT AT
AGGTC AT CAT ACTT G AAAATT AT C ATTCC AT AT G AAAG GAT CAT GAT AC AC AC C AAAAAAGTAAT G ATC
GTAAAGACACAAATCCTCTGTATGCCATCTTGCATTGGCACTGAGGTGTTTGGTTTG GAATAGGGAA
AAAG GT AAG AG ACT AAC GT GG AAAG GTGCTAACT C AG AG ACT G GAG ATT AT AGTTT AC AG CTGTACT
TTCCAGATCTTCTATGTGACACAATGCACTGTCCTTGTGGGTTTGTCATTTATTGGT TAATGCTCTAGT
TTCAAAACCACCCTGTTGAAAGTTCCAGTTATTTATATGCCCAACAAATTTCATAGC CTGCTGAACTGA
ACT G AGT GT GT CAG AAGT GCTGGTTAAT GACGAG AAG AGATT GCCT GAAAAACAACAAACT GCTTTC
TGGTTAGCTGAAGGCAAGTGTGAAAATCAGAATTTAGAATATTTAGAGCTAAGCTTC TGGAACCACGT
AGTTTCTACACGTGGCAGGCCAAGAATGGGAGGCTGACTCAAAACTAGATAGAAAAA TATAAAATAAT CTTCG ACC ACTT GAT AG CTCT C AAAT AT AT ATTT AAAAG ATTT AT G AAT AC AAACC ATTT AT G GTTT AT G
ATTT CT AAAAAG AAAGC AC AATT AATTTT AT AG AG AGGTTTTTT ATTTTTTT AAT ATTT CT ATT G C AAAAG
TCTATCCGATTTGATGCACTTTGAATATTGAGATATTTTGCACGGATGAATGTATGG GAACTACCCAT
GAT GAT GT AAG AGGAAAGAAC ATTTTTTTGTGATT CAC CAGACATC ACTTTAAACTT GGTG AT GAGTTT
AAATCCAGTAGCTAATCCCTTCCTGAGACTCAAAGATCGTGACGCTGGTTGGAATTT CTGACTGTGC
CCTTTAGGGCCTCCTGAGTTTCAAAAGGAGGAAGTGTTCGTGCTTGTGTCCCTGAAG TTCCCTGTTG
CATGAGCCTGCGACAGGACCTCACCCCCACCACCAGGCTTCTATTTGGGATTCACAT CAGTATTAGT
ATCGTAGCTACACCAAGTTCAGGCTTCTCTTTTTGTTTTTTTACCTAGAAATTGGGC TCAGTGGTCTTC
AACTTGAGGACGAGGGTGATTTTCCTAAGAAATCAGCAAAGAGGGAAGGCAGGGCCC CTGTAGATT
CACCAGTATAAACTTCAGCTGCAGGGATTCCAGAGCCCTCGGGACCACTCTGTCACC TTAATAGCCA
AGTTCTCCTGGTTCCTCCGATCTTACAGGCTCATCCAGGTTCCAAAGTGCTTCTGTC TCTGTTTTGAT
TCTCCAAACTGCTCTGTGATGTATGTAGGGATTATTCTCCCCACTTAACAGAAAGTA GTGTCTTGGAG
AGGTCAAGGGTCTCTAGTTCAATGGCCAGTCATAGCAGAAGGGAGGCCAAGCACCAG TCCATCACC
CCTCCCAGGCCAGCCTCTGTAAGTTGGCCACACTTGGGGAGTGAGTGTGGGTATGAC TTTACCCTC
CTGGTTGGTTCTTACTGTTTGAGTCAAAACCTCATCAATATATCATTGACTCCTGGG TTCCTCAGGTC
ATTTCCTAATATCTGTCCCTATCCAATGCCTCTATTTTATCTTGAAAAAAGGACCAA AAATTATTTTTAG
CTATGGCAAGGCACAGGCCACATGGCCCCTGATGGCGTCCCTGCTGGTTTTCAATTC TCTGAAGCCT
TGTGTAGCTTTCAGAGCACACGTATCCTAATTACCCTCCTCTTCCTCAGCAGAACCC ATTTGAGATTC
TAAATGAATACTCTTAGTCTCTAAAGTTGCAGTTAGAAACTAAAATAATGTTTTTTA ATATGTAATATGC
TCCTCTTGGCTAATTTTCTTTTGACTTTAATGTGCCAATGTAACTTCCTTTAAAGGA TCTATGCATTTAT
T AAAT CTG G AAAACT ATATGT AC ACT GT AG GT GG AAAATT CT CTTTTTT AACT AAAT ATTTTTCC AT CAC
AAATTTAAAGAATTGCATGATTAATTAGGCTTTCATTTTTAAATTACGCTTTCATCA CTACGCAGGATTA
CTTTATTTTATTCCCAAAGCTCATTAGCATGGGATAATTACTCTGCTACAGAAATAG GCAATTTAAAAA
AAT G AATTT AG CTCTTCT C ATT G GG GG C AG AAAAG AAAAAAAAAACC ATT G C ACT C AG AT G G AAAAT G
CCTATAGACACAGGAGCAGGTGGTTCCTGTGGACTTCTGGTTTGGAATTTTGCCTCA CCAGGTCAAG
CGTGGTTAGGGTGGAAGGTGTCCAGTATCTTGAAAACCTGGCCCTGGAGGAAGGTTC TGGGTCAGC
TGCAATGAGAGACTGGTGATTAAGGGCACCGTGGGCAGGACACAGTCCTCGCCTTAC CCACCCCAT
CCTTCCTGTTACCCACAGTCTGCTGGCCTCCATGCCTCTTCCCCTTGTCACTTGTGT CTCCTCCTTAT
GCACAGAGCTGCCTGCCTTTATGAATTTTCTTTTCTTTTTTTTTTGAGACAGCGTCT TGCTGTGTCACC
CAGGCTGGAGTGCAGTGGTGCCATCTTGGCTCACTGCAACCTCCGCCTCCCAGGTTC AAGCAATTC
TTGTGCCTCAGCCTCCTGAGGATTACAGGCGTGCGCCACCACACCCAGCTAATTTTT GTATTTTTAGT
AGAGACGGGTTTTCACCATGTTGGCCAGGCTGGTCTCAAACTCCTGACCTCAGGTGA TCCACCCAC
CTCGGTTTCCCAAAGTGCTGGGATTACAGGTGTGAGCCACCACGCCCAGCCTGCCCT TGTGAATTTT
CACCTGCTCCTTACCCCTCACCTGTTAGGACTGTTTCTTGCTTTTGCCCCTGTCGGT CCCCTGCCTTA
ACAGACCTAAGCAGCTGATAATGCACCAAGCTTCCCTGACCAGGTGGGGTGTGTCTA TCACCCAAG
GGCAGTCCTACAGACCCTGACCAAAGGCCGTTCCTGGGCGGCCCAAGGTCCAGGTTT CTTCCACCT
GCTCTTCCCTGTTTATGGGGATTTGCAAGCCTAATTGCATCAGCAGGAGCCCATCTC TCAGAGAACC
CGGACTCCCCAAGCAGACTGGGATTTTGGGAAGGGTGTGGGGGGTGTCATTGCTGGA TACCCGTCT
TTCTGCCTGTCCTTTCTCCTCTCTGAATCCTGGGGCCCCTCTCCCTCCTTAAAGCTG GAGTGGACAG
AGGGACAGGAGAGGATCAGAGTTCATCCCCCCTGGGAAAGAGCAAGAGCGAATGAAT CCCAGCGC
CAGCGGCTGAGGCTGCCTTCCGTGCCTTCCCTCCATGGGCGACGGGTGAGTGGGGCT TAGGAAAC
T GGAAC AGGG AAGGTT CTGTT ACCACACTTTGGAACTTTCCCCCT GGGATT CAGCAGTT GAG AAGC A GAGACCTTTCTGCCCTGGGTGAATGGGTCCTTGGGGGAGGGGTTGGTCTTTTGTCTCGCA TCCCCA
TCTTTCCTTTCCTTCTGGGCCATGCTCCTCCCTGGCTGGAAAAAGGTGGCTGTGCTG TCCCTGTGAT
CCACTCTCAGCAAATGCGTGTGGCTCAAATAAACAAAGAACTTACCTGTTAGAGTGA AAATCCTCAG
GAGATTGTACCCAAATGCCATGCTCTAAATATTCATGGTCTCTCTAATGCCCTCAAG ACGTGATTTCC
ATGGGAACCATCCTCCCCTGGGGGCAGTTAGCAGGAGTACGTGGGGCACGTGAGGTG GTCCTCCT
TTCAGCACACCGTGCCCATAGAAACTTCTAGAAATTTCTGAAAATGCTCTGTGGGCA GCTCTTGGGT
GGCAGTAAGTCCATCAACCCCCATCTACCCCGGGCCTGAAGCGCTGCGCTTGCTCTC TTTATGTGTG
TGCACCCGAAGGATTTCCTGGTCTCTGTAGCTGATCCTGTGAGCCCCTCAAGCATGA AGCCTCCCTT
GGGGCTTCTCAAAGCATGGAGAGGGGCCCTTCCTGTCCTTTGGGAAAATCTTCCCCA CTGTGTCAGT
TATATGGGAACAAGAGTGATGGGGTCTTTCTCTAGGCCTGTGCCACAGGACAGAGAA CACGGGATT
CTGCTGTTCGCTTTGAGCCACAGCCTTTACCAGCCCGGCTTGTGTGGGGGGCCCCTT CGCCTTGCT
GCAAAGAGCTGTTCCCCAAAGGGCATATCCACAGGGTACAGGTTTTAAAAAGGCTTT TTTTTTTTTTT
TTGAGACAGGGTCTCGCTCTGTCGCCTAGACTCAGTGCAGTGGCGCCATGTTGGCTG GTTGCAACC
TCCACCTCCTGGGTTCAAGTGATTCTCCCACCTCAGCCTCTCTGGTAGCTGGGACTA CAGGCACGC
GCCACCATGCCCAGCTAATTTTTGGATTTTTAGTAGAGAAGGAGTTTCACCATGCTG ACCAGGCTGG
TTTCGAACTCCTGACCTCAAGTGATCCGCCCGCCTGGGCCTCCCAGAGTGCTGAGAT TACAGGCGT
GAGCCACCGCACCTGGCCAAAAAAAGGCATTTTGATTTAGGTTGCTGTGTTTGCTTG TTGATAAAGAA
AACTCAATCGGGACACTAGTTTTGTGCTCAGCTTTAGGCCGGGTAGCTAATGGGAGG ATGTCCAGCC
TGTCACTGTGCTCCCAGCGCAAGGAAATGGGTGCCCACCTGGAATCAGGAGAAGAGG CTTTTCCCT
CCTGTTCTGCAACCAGGGTGGAGCTATCTTTCCAGGGAAGCCAGCTGAGAGGTTTTA GGGCTTTGG
TTATTTTATGGGGGTTTTAAACCTCCTAACTTTTCAATGACAAATGGCTCCCAGGTG CCATAGTCTCT
GTTAAATCCTCAAACATTCACAAGCACACACTGCCAGGGGCACGGGTTGTCTTTCAC CTGCATGTTT
CTAAGGCTCTTTATTCAATCTCACGGTGTCAGTGTCCAGTTGTCAAAGTTATGAATC TTCCTCCTGCT
TCTAAACAGGGCTGACAGTATACTCTCGTCTAGTCTAGGAACATGTCTGCTGCTGGG ATACCCTGGT
ACCAGGATTTGAGGGCCACGGGTGGCATCTCTGAGAGCTGAAAATCCACAGAGTGCC TGTGGGAAA
GCCAAGCCCTTGGCTGTGTGGCTTTTCTATCCCTTGGATTTACAGGTCTGGGAATTG GCTGCTTCTT
AGTTATAACCCCAGTGACAAATGCTGGCTTAAGCCACACCTGTTCCCACTGTTGCTA GAATTCAAACA
GTTGCTTTTTTTTTTTCTTTTTGAGAAAGGGCCTCACTCTGTTGCCCAGGCTGGAGT GCAGTGGCTTG
ATCACAGCTCACGAAAGCCTCAAACTCCTAGGCTCAAGTGATCCTCCTGAAAAGTAG GTAGGACTAC
AGG CAC AT G CC ACC AC AT AC AG CT AATTT GTTTT C ATTTTTTTTTTTTTT AG AG AC AG GATCTCGCTGT
GTTCCCCAGGTAGGTCTTGAACTCCTGGCCTCAAGTGATCCTCCTGCCTTGACCTCC CAAAGTGCTG
GATTACAAGCGTGAGCCCCTGCACCCGGCCCAAGCAGTTGCTTCTTTTTTTCTCTTT TTTTTTTTTTTT
GAGATGGAGCCTCACTCTGTTGCCCAGGCTGGAGTGCAGTGGCGCGATCTCCACTCA CTGCAAGCT
CCGCCTCCCGGGTTCATGCCATTCTCCTGCCTCAGCCTCCCGAGTAGCTGGGACTAC AGGCGCCTG
CCACCACACCCAGCTAATTTTTTGTATTTTTGGTACAGACAGGGTTTCACCGTGTTA GCCAGGATGGT
CTTGATCTCCTGATCTCGTGATCCGCCCACCCCGGCCTCCCAAAGTGCTGGATTACA AGCGTGAGC
CACCGCGCCCCGCCAAGCAGTTGCTTCTTATGCAACATGTTGGTTGGGACTTGTCCA CGGGCCAGG
CCAATAAAATTCTTAATCCTGCAGAGAGTCAGTACCCTCATCACCCCATCACTGGAA AACAAATGTTT
TAAGCTATCAAGAGAGGGAATGTGCAGCTTTTGGTTTCTAGATGCATGGTTTGGTGT GATCTACCTTT
GTGCCT AAAGGGAAT GTCCCAAACAACAG AGCCTT CTTTGCTGTCACTCCAGAATTCT CT AC ACAG A
ATTTCCCAAGTCCATTCAGGACAGACGCGCAGTCCTCTTTCAATGGAAGAAGAGAGG ACTTTTCCCC
TCCT G AAAAAT GACTGGAGTGT G AAC AAG GC AG CTCTG TTTTT CT AAAT AAGTTGTTCTTGT G AGTTT TTTCTGGCCACTGGGCATCTCTGCCCTCACTTTTCATCCCTGCCCTCTAAGCTGCAGACC CCATGAC
CACACTGTCTGCTTCCTTGAGCTTCCCGCACGAGGCTTGGACCTGGGGGACCTGGAG ACCCTGCGG
ACAGAACTGTGGCTGAGCCACTGTGGCCAACTCTTGGGGAGCTCCACAGTGGGGGTT GCTGGTCTG
TGAGGCTGAGTCTCCATTTCAGAGCACACACTCCCTGGCAGGGCGCCTCTGCCTGTG TCTCCTGCC
CAGCAGCCGCCAGCAGGGAATAGTTGCTGGTGTCTGAGCACAAAGAGAGCTTTGATT ACCTAGAGA
GGAAAAAGGCTGTCAGCCAGATGCAGCCAGGCCCAGGGGTAGATACAGGAGTTGCTA AGGAAGGG
GCCGAGCCAGGAGAGGCCAGGCAGATCCACAAAGCCCAAGGGGATGCAGGCTGGGTG TGGTTTCT
G AGGGAACCT ACCAAAT AGCAGGT AG ATGG AAT CAGAGG ACT CTT GT GTCCT G AAAG AACCTCCTT A
AAAACAACTAAAACGAAGAACTTCTGGGGCTGTTCACACATTGTTCAAGTCACCCCA AGATCGTTCTG
GCACGCTGAGCTGAACACCACCATCTTTGTTCATTCTCTCTCTAATGGGCAAAGCAG GATCATCGAG
TTGAAAAGTTGTAAATAATGAGGATATTTATCCCGCTATTTATTTTTTCAATAACTG TGACCTCCTGCA
CTGT G AAT GCTCTGT G AC AT GAG ATT CTT AGTTT AAT AAAACT GT C ATT AAATTT G AAT G AATT GAT AT
T ATT GGTTACT G AAC ACT G GC AT G AGTTT ATTTTT ATTGTG AAG AAAAAAAT CT AC AG C AAT CT AAACT
AAACCTTTCTAAGAAATCTAGCAGTCAGTATTGTAATGCAATATATCAAAATCTGTA CACTGTCAATAA
AAT AAAT GAGCACAAAAAAAAAAAAAA
(SEQ ID NO: 64)
As used herein, the term “SORL1” refers to the gene encoding Sortilin-related receptor. The terms “SORL1” and "Sortilin-related receptor" include wild-type forms of the SORL1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type SORL1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type SORL1 nucleic acid sequence (e.g., SEQ ID NO: 65, ENA accession number Y08110). SEQ ID NO: 65 is a wild-type gene sequence encoding SORL1 protein, and is shown below:
CCGGCCCAGCGGCTCTCCTGGCCTCGCGCTGCACATTCTCTCCTGGCGGCGGCGCCA CCT
GCAGTAGCGTTCGCCCGAACATGGCGACACGGAGCAGCAGGAGGGAGTCGCGACTCC CGT
TCCTATTCACCCTGGTCGCACTGCTGCCGCCCGGAGCTCTCTGCGAAGTCTGGACGC AGA
GGCTGCACGGCGGCAGCGCGCCCTTGCCCCAGGACCGGGGCTTCCTCGTGGTGCAGG GCG
ACCCGCGCGAGCTGCGGCTGTGGGCGCGCGGGGATGCCAGGGGGGCGAGCCGCGCGG ACG
AGAAGCCGCTCCGGAGGAAACGGAGCGCTGCCCTGCAGCCCGAGCCCATCAAGGTGT ACG
GACAGGTTAGTCTGAATGATTCCCACAATCAGATGGTGGTGCACTGGGCTGGAGAGA AAA
GCAACGTGATCGTGGCCTTGGCCCGAGATAGCCTGGCATTGGCGAGGCCCAAGAGCA GTG
ATGTGTACGTGTCTTACGACTATGGAAAATCATTCAAGAAAATTTCAGACAAGTTAA ACT
TTGGCTTGGGAAATAGGAGTGAAGCTGTTATCGCCCAGTTCTACCACAGCCCTGCGG ACA
ACAAGCGGTACATCTTTGCAGACGCTTATGCCCAGTACCTCTGGATCACGTTTGACT TCT
GCAACACTCTTCAAGGCTTTTCCATCCCATTTCGGGCAGCTGATCTCCTCCTACACA GTA
AGGCCTCCAACCTTCTCTTGGGCTTTGACAGGTCCCACCCCAACAAGCAGCTGTGGA AGT
CAGATGACTTTGGCCAGACCTGGATCATGATTCAGGAACATGTCAAGTCCTTTTCTT GGG
G AATT G ATCCCT AT G ACAAACCAAAT ACC AT CTACATT GAACG ACACG AACCCT CTGGCT
ACTCCACTGTCTTCCGAAGTACAGATTTCTTCCAGTCCCGGGAAAACCAGGAAGTGA TCC TTGAGGAAGTGAGAGATTTTCAGCTTCGGGACAAGTACATGTTTGCTACAAAGGTGGTGC
ATCTCTTGGGCAGTGAACAGCAGTCTTCTGTCCAGCTCTGGGTCTCCTTTGGCCGGA AGC
CCATGAGAGCAGCCCAGTTTGTCACAAGACATCCTATTAATGAATATTACATCGCAG ATG
CCTCCG AGGACCAGGTGTTT GTGTGTGTCAGCCACAGT AACAACCG CACCAATTTATACA
TCTCAGAGGCAGAGGGGCTGAAGTTCTCCCTGTCCTTGGAGAACGTGCTCTATTACA GCC
CAGGAGGGGCCGGCAGTGACACCTTGGTGAGGTATTTTGCAAATGAACCATTTGCTG ACT
TCCACCGAGTGGAAGGATTGCAAGGAGTCTACATTGCTACTCTGATTAATGGTTCTA TGA
ATGAGGAGAACATGAGATCGGTCATCACCTTTGACAAAGGGGGAACCTGGGAGTTTC TTC
AGGCTCCAGCCTTCACGGGATATGGAGAGAAAATCAATTGTGAGCTTTCCCAGGGCT GTT
CCCTTCATCTGGCTCAGCGCCTCAGTCAGCTCCTCAACCTCCAGCTCCGGAGAATGC CCA
TCCTGTCCAAGGAGTCGGCTCCAGGCCTCATCATCGCCACTGGCTCAGTGGGAAAGA ACT
TGGCTAGCAAGACAAACGTGTACATCTCTAGCAGTGCTGGAGCCAGGTGGCGAGAGG CAC
TTCCTGGACCTCACTACTACACATGGGGAGACCACGGCGGAATCATCACGGCCATTG CCC
AGGGCATGGAAACCAACGAGCTAAAATACAGTACCAATGAAGGGGAGACCTGGAAAA CAT
TCATCTTCTCTGAGAAGCCAGTGTTTGTGTATGGCCTCCTCACAGAACCTGGGGAGA AGA
GCACTGTCTTCACCATCTTTGGCTCGAACAAAGAGAATGTCCACAGCTGGCTGATCC TCC
AGGTCAATGCCACGGATGCCTTGGGAGTTCCCTGCACAGAGAATGACTACAAGCTGT GGT
CACCATCTGATGAGCGGGGGAATGAGTGTTTGCTGGGACACAAGACTGTTTTCAAAC GGC
GGACCCCCCATGCCACATGCTTCAATGGAGAGGACTTTGACAGGCCGGTGGTCGTGT CCA
ACTGCTCCTGCACCCGGGAGGACTATGAGTGTGACTTCGGTTTCAAGATGAGTGAAG ATT
TGTCATTAGAGGTTTGTGTTCCAGATCCGGAATTTTCTGGAAAGTCATACTCCCCTC CTG
TGCCTTGCCCTGTGGGTTCTACTTACAGGAGAACGAGAGGCTACCGGAAGATTTCTG GGG
ACACTTGTAGCGGAGGAGATGTTGAAGCGCGACTGGAAGGAGAGCTGGTCCCCTGTC CCC
TGGCAGAAGAGAACGAGTTCATTCTGTATGCTGTGAGGAAATCCATCTACCGCTATG ACC
TGGCCTCGGGAGCCACCGAGCAGTTGCCTCTCACCGGGCTACGGGCAGCAGTGGCCC TGG
ACTTTGACTATGAGCACAACTGTTTGTATTGGTCCGACCTGGCCTTGGACGTCATCC AGC
GCCTCTGTTTGAATGGAAGCACAGGGCAAGAGGTGATCATCAATTCTGGCCTGGAGA CAG
T AG AAGCTTT GGCTTTT G AACCCCT CAGCCAGCTGCTTT ACT GGGT AG AT GCAGGCTT CA
AAAAGATTGAGGTAGCTAATCCAGATGGCGACTTCCGACTCACAATCGTCAATTCCT CTG
TGCTTGATCGTCCCAGGGCTCTGGTCCTCGTGCCCCAAGAGGGGGTGATGTTCTGGA CAG
ACTGGGGAGACCTGAAGCCTGGGATTTATCGGAGCAATATGGATGGTTCTGCTGCCT ATC
ACCTGGTGTCTGAGGATGTGAAGTGGCCCAATGGCATCTCTGTGGACGACCAGTGGA TTT
ACTGGACGGATGCCTACCTGGAGTGCATAGAGCGGATCACGTTCAGTGGCCAGCAGC GCT
CTGTCATTCTGGACAACCTCCCGCACCCCTATGCCATTGCTGTCTTTAAGAATGAAA TCT
ACTGGGATGACTGGTCACAGCTCAGCATATTCCGAGCTTCCAAATACAGTGGGTCCC AGA
TGGAGATTCTGGCAAACCAGCTCACGGGGCTCATGGACATGAAGATTTTCTACAAGG GGA
AGAACACT GGAAGCAAT GCCT GT GTGCCCAGGCCAT GCAGCCT GCTGTGCCT GCCCAAGG
CCAACAACAGTAGAAGCT GCAGGTGTCCAGAGGAT GT GTCCAGCAGT GT GCTTCCATCAG
GGGACCTGATGTGTGACTGCCCTCAGGGCTATCAGCTCAAGAACAATACCTGTGTCA AAG
AAGAGAACACCTGTCTTCGCAACCAGTATCGCTGCAGCAACGGGAACTGTATCAACA GCA
TTTGGT GGTGTGACTTT GACAACG ACTGTGG AG ACAT GAGCG AT G AG AG AAACTGCCCT A
CCACCATCTGTGACCTGGACACCCAGTTTCGTTGCCAGGAGTCTGGGACTTGTATCC CAC TGTCCT AT AAATGTG ACCTT GAGG AT GACTGTGG AGACAAC AGTG AT GAAAGTCATT GT G
AAATGCACCAGTGCCGGAGTGACGAGTACAACTGCAGTTCCGGCATGTGCATCCGCT CCT
CCTGGGTATGTGACGGGGACAACGACTGCAGGGACTGGTCTGATGAAGCCAACTGTA CCG
CCATCTATCACACCTGTGAGGCCTCCAACTTCCAGTGCCGAAACGGGCACTGCATCC CCC
AGCGGTGGGCGTGTGACGGGGATACGGACTGCCAGGATGGTTCCGATGAGGATCCAG TCA
ACTGTGAGAAGAAGTGCAATGGATTCCGCTGCCCAAACGGCACTTGCATCCCATCCA GCA
AACATTGTGATGGTCTGCGTGATTGCTCTGATGGCTCCGATGAACAGCACTGCGAGC CCC
TCTGTACGCACTTCATGGACTTTGTGTGTAAGAACCGCCAGCAGTGCCTGTTCCACT CCA
TGGTCTGTGACGGAATCATCCAGTGCCGCGACGGGTCCGATGAGGATGCGGCGTTTG CAG
GAT GCTCCCAAGATCCT GAGTTCCACAAGGTAT GT GATGAGTTCGGTTTCCAGT GTCAGA
ATGGAGTGTGCATCAGTTTGATTTGGAAGTGCGACGGGATGGATGATTGCGGCGATT ATT
CTGATGAAGCCAACTGCGAAAACCCCACAGAAGCCCCAAACTGCTCCCGCTACTTCC AGT
TTCGGTGTG AG AATGGCCACTGCATCCCCAAC AG ATGGAAAT GT GACAGGG AG AACG ACT
GTGGGGACTGGTCTGATGAGAAGGATTGTGGAGATTCACATATTCTTCCCTTCTCGA CTC
CT GGGCCCTCCACGT GTCT GCCCAATTACTACCGCT GCAGCAGT GGGACCTGCGT GATGG
ACACCTGGGTGTGCGACGGGTACCGAGATTGTGCAGATGGCTCTGACGAGGAAGCCT GCC
CCTTGCTTGCAAACGTCACTGCTGCCTCCACTCCCACCCAACTTGGGCGATGTGACC GAT
TTGAGTTCGAATGCCACCAACCGAAGACGTGTATTCCCAACTGGAAGCGCTGTGACG GCC
ACCAAGATTGCCAGGATGGCCGGGACGAGGCCAATTGCCCCACACACAGCACCTTGA CTT
GCATGAGCAGGGAGTTCCAGTGCGAGGACGGGGAGGCCTGCATTGTGCTCTCGGAGC GCT
GCGACGGCTTCCTGGACTGCTCGGACGAGAGCGATGAAAAGGCCTGCAGTGATGAGT TGA
CTGTGTACAAAGTACAGAATCTTCAGTGGACAGCTGACTTCTCTGGGGATGTGACTT TGA
CCTGGATGAGGCCCAAAAAAATGCCCTCTGCATCTTGTGTATATAATGTCTACTACA GGG
TGGTTGGAGAGAGCATATGGAAGACTCTGGAGACCCACAGCAATAAGACAAACACTG TAT
TAAAAGTCTTGAAACCAGATACCACGTATCAGGTTAAAGTACAGGTTCAGTGTCTCA GCA
AGGCACACAACACCAATGACTTTGTGACCCTGAGGACCCCAGAGGGATTGCCAGATG CCC
CTCGAAATCTCCAGCTGTCACTCCCCAGGGAAGCAGAAGGTGTGATTGTAGGCCACT GGG
CTCCTCCCATCCACACCCATGGCCTCATCCGTGAGTACATTGTAGAATACAGCAGGA GTG
GTTCCAAGATGTGGGCCTCCCAGAGGGCTGCTAGTAACTTTACAGAAATCAAGAACT TAT
TGGTCAACACTCTATACACCGTCAGAGTGGCTGCGGTGACTAGTCGTGGAATAGGAA ACT
GGAGCGATTCTAAATCCATTACCACCATAAAAGGAAAAGTGATCCCACCACCAGATA TCC
ACATTGACAGCTATGGTGAAAATTATCTAAGCTTCACCCTGACCATGGAGAGTGATA TCA
AGGTGAATGGCTATGTGGTGAACCTTTTCTGGGCATTTGACACCCACAAGCAAGAGA GGA
G AACTTT GAACTTCCG AGG AAGCAT ATT GT CAC ACAAAGTTGGCAAT CT G ACAGCTCAT A
CATCCTATGAGATTTCTGCCTGGGCCAAGACTGACTTGGGGGATAGCCCTCTGGCAT TTG
AGCATGTTATGACCAGAGGGGTTCGCCCACCTGCACCTAGCCTCAAGGCCAAAGCCA TCA
ACCAGACTGCAGTGGAATGTACCTGGACCGGCCCCCGGAATGTGGTTTATGGTATTT TCT
ATGCCACGTCCTTT CTT GACCT CT ATCGCAACCCG AAG AGCTT GACTACTTCACTCCAC A
ACAAGACGGTCATTGTCAGTAAGGATGAGCAGTATTTGTTTCTGGTCCGTGTAGTGG TAC
CCTACCAGGGGCCATCCTCTGACTACGTTGTAGTGAAGATGATCCCGGACAGCAGGC TTC
CACCCCGTCACCTGCATGTGGTTCATACGGGCAAAACCTCCGTGGTCATCAAGTGGG AAT
CACCGTATGACTCTCCTGACCAGGACTTGTTGTATGCAATTGCAGTCAAAGATCTCA TAA G AAAGACT G ACAGG AGCTACAAAGT AAAATCCCGTAACAGC ACT GT GG AAT ACACCCTT A
ACAAGTTGGAGCCTGGCGGGAAATACCACATCATTGTCCAACTGGGGAACATGAGCA AAG
ATTCCAGCATAAAAATTACCACAGTTTCATTATCAGCACCTGATGCCTTAAAAATCA TAA
CAGAAAATGATCATGTTCTTCTGTTTTGGAAAAGCCTGGCTTTAAAGGAAAAGCATT TTA
ATGAAAGCAGGGGCTATGAGATACACATGTTTGATAGTGCCATGAATATCACAGCTT ACC
TTGG G AAT ACTACT G AC AATTT CTTT AAAATTTCC AAC CT G AAG AT G GGTC AT AATT AC A
CGTTCACCGTCCAAGCAAGATGCCTTTTTGGCAACCAGATCTGTGGGGAGCCTGCCA TCC
TGCTGTACGATGAGCTGGGGTCTGGTGCAGATGCATCTGCAACGCAGGCTGCCAGAT CTA
CGGATGTTGCTGCTGTGGTGGTGCCCATCTTATTCCTGATACTGCTGAGCCTGGGGG TGG
GGTTTGCCATCCTGTACACGAAGCACCGGAGGCTGCAGAGCAGCTTCACCGCCTTCG CCA
ACAGCCACTACAGCTCCAGGCTGGGGTCCGCAATCTTCTCCTCTGGGGATGACCTGG GGG
AAG AT GAT G AAG ATGCCCCTAT GAT AACT GG ATTTTCAGAT G ACGTCCCCAT GGTGATAG
CCT GAAAGAGCTTTCCT CACT AG AAACC AAATGGT GT AAAT ATTTT ATTT GAT AAAG AT A
GTTGATGGTTTATTTTAAAAGATGCACTTTGAGTTGCAATATGTTATTTTTATATGG GCC
(SEQ ID NO: 65)
As used herein, the term “SPI1” refers to the gene encoding Transcription factor PU.1. The terms “SPI1” and "Transcription factor PU.1" include wild-type forms of the SPI1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type SPI1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type SPI1 nucleic acid sequence (e.g., SEQ ID NO: 66, ENA accession number X52056). SEQ ID NO: 66 is a wild-type gene sequence encoding SPI1 protein, and is shown below:
AAAATCAGGAACTTGTGCTGGCCCTGCAATGTCAAGGGAGGGGGCTCACCCAGGGCT CCT
GTAGCTCAGGGGGCAGGCCTGAGCCCTGCACCCGCCCCACGACCGTCCAGCCCCTGA CGG
GCACCCCATCCTGAGGGGCTCTGCATTGGCCCCCACCGAGGCAGGGGATCTGACCGA CTC
GGAGCCCGGCTGGATGTTACAGGCGTGCAAAATGGAAGGGTTTCCCCTCGTCCCCCC TCC
ATCAGAAGACCTGGTGCCCTATGACACGGATCTATACCAACGCCAAACGCACGAGTA TTA
CCCCTATCTCAGCAGTGATGGGGAGAGCCATAGCGACCATTACTGGGACTTCCACCC CCA
CCACGTGCACAGCGAGTTCGAGAGCTTCGCCGAGAACAACTTCACGGAGCTCCAGAG CGT
GCAGCCCCCGCAGCTGCAGCAGCTCTACCGCCACATGGAGCTGGAGCAGATGCACGT CCT
CGATACCCCCATGGTGCCACCCCATCCCAGTCTTGGCCACCAGGTCTCCTACCTGCC CCG
GATGTGCCTCCAGTACCCATCCCTGTCCCCAGCCCAGCCCAGCTCAGATGAGGAGGA GGG
CGAGCGGCAGAGCCCCCCACTGGAGGTGTCTGACGGCGAGGCGGATGGCCTGGAGCC CGG
GCCTGGGCTCCTGCCTGGGGAGACAGGCAGCAAGAAGAAGATCCGCCTGTACCAGTT CCT
GTTGGACCTGCTCCGCAGCGGCGACATGAAGGACAGCATCTGGTGGGTGGACAAGGA CAA
GGGCACCTTCCAGTTCTCGTCCAAGCACAAGGAGGCGCTGGCGCACCGCTGGGGCAT CCA
GAAGGGCAACCGCAAGAAGATGACCTACCAGAAGATGGCGCGCGCGCTGCGCAACTA CGG
CAAGACGGGCGAGGTCAAGAAGGTGAAGAAGAAGCTCACCTACCAGTTCAGCGGCGA AGT
GCTGGGCCGCGGGGGCCTGGCCGAGCGGCGCCACCCGCCCCACTGAGCCCGCAGCCC CCG CCGGCCCCGCCAGGCCTCCCCGCTGGCCATAGCATTAAGCCCTCGCCCGGCCCGGACACA
GGGAGGACGCTCCCGGGGCCCAGAGGCAGGACTGTGGCGGGCCGGGCTCCGTCACCC GCC
CCTCCCCCCACTCCAGGCCCCCTCCACATCCCGCTTCGCCTCCCTCCAGGACTCCAC CCC
GGCTCCCGACGCCAGCTGGGCGTCAGACCCACCGGCAACCTTGCAGAGGACGACCCG GGG
TACTGCCTTGGGAGTCTCAAGTCCGTATGTAAATCAGATCTCCCCTCTCACCCCTCC CAC
CCATTAACCTCCTCCCAAAAAACAAGTAAAGTTATTCTCAATCC
(SEQ ID NO: 66)
As used herein, the term “SPP1” refers to the gene encoding Secreted Phosphoprotein 1. The terms “SPP1” and "Secreted Phosphoprotein 1" include wild-type forms of the SPP1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type SPP1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type SPP1 nucleic acid sequence (e.g., SEQ ID NO: 67, NCBI Reference Sequence: NM_001040058.1). SEQ ID NO: 67 is a wild-type gene sequence encoding SPP1 protein, and is shown below:
CTCCCTGTGTTGGTGGAGGATGTCTGCAGCAGCATTTAAATTCTGGGAGGGCTTGGT TGTCAGCAG
CAGCAGGAGGAGGCAGAGCACAGCATCGTCGGGACCAGACTCGTCTCAGGCCAGTTG CAGCCTTC
TCAGCCAAACGCCGACCAAGGAAAACTCACTACCATGAGAATTGCAGTGATTTGCTT TTGCCTCCTA
GGCATCACCTGTGCCATACCAGTTAAACAGGCTGATTCTGGAAGTTCTGAGGAAAAG CAGCTTTACA
ACAAATACCCAGATGCTGTGGCCACATGGCTAAACCCTGACCCATCTCAGAAGCAGA ATCTCCTAGC
CCCACAGAAT GCT GT GTCCTCT G AAG AAACCAAT G ACTTT AAACAAG AGACCCTTCCAAGTAAGTCC
AACG AAAGCC AT GACCACATGGAT GAT ATGG AT GAT GAAG AT GAT GAT G ACCAT GT GG ACAGCCAG
GACTCCATTGACTCGAACGACTCTGATGATGTAGATGACACTGATGATTCTCACCAG TCTGATGAGT
CT CACC ATT CT GAT GAAT CT GAT GAACT GGTCACT GATTTTCCCACGG ACCT GCCAGCAACCG AAGT
TTTCACTCCAGTTGTCCCCACAGTAGACACATATGATGGCCGAGGTGATAGTGTGGT TTATGGACTG
AGGTCAAAATCTAAGAAGTTTCGCAGACCTGACATCCAGTACCCTGATGCTACAGAC GAGGACATCA
CCTCACACATGGAAAGCGAGGAGTTGAATGGTGCATACAAGGCCATCCCCGTTGCCC AGGACCTGA
ACGCGCCTTCT GATT GGGACAGCCGT GGGAAGGACAGTTAT GAAACGAGTCAGCT GGAT GACCAGA
GTGCT GAAACCCACAGCCAC AAGC AGTCCAGATT ATAT AAGCGGAAAGCCAAT GAT G AG AGCAAT G A
GCATTCCGATGTGATTGATAGTCAGGAACTTTCCAAAGTCAGCCGTGAATTCCACAG CCATGAATTTC
ACAGCCAT GAAGAT ATGCTGGTT GT AG ACCCC AAAAGTAAGG AAG AAGAT AAACACCT G AAATTTCG
T ATTT CT CAT G AATT AG AT AGT GC AT CTTCT G AGGTC AATT AAAAG G AG AAAAAAT AC AATTT CTC ACT
TTGCATTTAGT CAAAAGAAAAAATGCTTT AT AGCAAAAT GAAAGAG AAC AT GAAAT GCTTCTTTCT CAG
TTTATTGGTTGAATGTGTATCTATTTGAGTCTGGAAATAACTAATGTGTTTGATAAT TAGTTTAGTTTGT
GGCTTCATGGAAACTCCCTGTAAACTAAAAGCTTCAGGGTTATGTCTATGTTCATTC TATAGAAGAAA
T GC AAACT ATC ACTGT ATTTT AAT ATTT GTT ATT CTCT CAT GAAT AG AAATTT AT GTAG AAG C AAAC AAA
ATACTTTTACCCACTTAAAAAGAGAATATAACATTTTATGTCACTATAATCTTTTGT TTTTTAAGTTAGT
GTATATTTTGTTGTGATTATCTTTTTGTGGTGTGAATAAATCTTTTATCTTGAATGT AATAAGAATTTGG
TGGTGTCAATTGCTTATTTGTTTTCCCACGGTTGTCCAGCAATTAATAAAACATAAC CTTTTTTACTGC
CT AAAAAAAAAAAAAAAAA
(SEQ ID NO: 67)
As used herein, the term “SPPL2A” refers to the gene encoding Signal Peptide Peptidase Like 2A. The terms “SPPL2A” and "Signal Peptide Peptidase Like 2A" include wild-type forms of the SPPL2A gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type SPPL2A. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type SPPL2A nucleic acid sequence (e.g., SEQ ID NO: 68, NCBI Reference Sequence: NM_001040058.1). SEQ ID NO: 68 is a wild-type gene sequence encoding SPPL2A protein, and is shown below:
AAGAGGAAGTCGCGCTGCTGTGGCGGCCGCTGTAGCAGCGGCGGTCCAGTCGTAGCC CGGCCGC
CCGCGCCTGTCCGGTCCGGTCCGGCCACGGAGGCAGCGCAGCGGCGGGACTCCGAGC CTACCCC
GCCGAGTGAGCTGCGCCGCACCGTGCCGTCCCACCCGGCACCCACCAGTCCGATGGG GCCGCAG
CGGCGGCTGTCCCCTGCCGGGGCCGCCCTACTCTGGGGCTTCCTGCTCCAGCTGACA GCCGCTCA
GGAAGCAATCTTGCATGCGTCTGGAAATGGCACAACCAAGGACTACTGCATGCTTTA TAACCCTTATT
GGACAGCT CTTCCAAGTACCCT AGAAAAT GCAACTTCCATTAGTTT GAT G AATCT GACTTCCACACCA
CTATGCAACCTTTCTGATATTCCTCCTGTTGGCATAAAGAGCAAAGCAGTTGTGGTT CCATGGGGAA
GCTGCCATTTT CTT GAAAAAGCCAGAATT GCACAG AAAGG AGGTGCT G AAGC AATGTT AGTTGTCAA
T AACAGTGTCCT ATTTCCTCCCT CAGGT AAC AG AT CT GAATTTCCT GAT GT GAAAAT ACT GATT GCATT
TATAAGCT AC AAAG ACTTT AG AG AT AT G AAC C AG ACT CT AGG AG AT AAC ATT ACTGT G AAAATGTATT
CTCCATCGTGGCCTAACTTTGATTATACTATGGTGGTTATTTTTGTAATTGCGGTGT TCACTGTGGCA
TT AG GTGG ATACT GG AGTG G ACT AGTT G AATT GG AAAACTT G AAAG C AGT G AC AACT G AAG AT AG AG
AAAT G AGG AAAAAG AAG G AAG AAT ATTT AACTTTT AGTCCTCTT AC AGTT GT AAT ATTTGTG GT CAT CT
GCTGTGTTATGATGGTCTTACTTTATTTCTTCTACAAATGGTTGGTTTATGTTATGA TAGCAATTTTCTG
CATAGCATCAGCAATGAGTCTGTACAACTGTCTTGCTGCACTAATTCATAAGATACC ATATGGACAAT
GCACGATTGCATGTCGTGGCAAAAACATGGAAGTGAGACTTATTTTTCTCTCTGGAC TGTGCATAGC
AGTAGCTGTTGTTTGGGCTGTGTTTCGAAATGAAGACAGGTGGGCTTGGATTTTACA GGATATCTTG
GGGATTGCTTTCTGTCTGAATTTAATTAAAACACTGAAGTTGCCCAACTTCAAGTCA TGTGTGATACTT
CT AGGCCTT CTCCTCCT CTAT G ATGTATTTTTT GTTTT CAT AACACCATT CAT CACAAAG AAT GGTG AG
AGTATCATGGTTGAACTCGCAGCTGGACCTTTTGGAAATAATGAAAAGTTGCCAGTA GTCATCAGAGT
ACCAAAACTGATCTATTTCTCAGTAATGAGTGTGTGCCTCATGCCTGTTTCAATATT GGGTTTTGGAG
ACATT ATT GT ACC AGGCCTGTT G ATTGCAT ACT GT AGAAGATTT GATGTT CAG ACT GGTT CTT CTT AC A
TATACTATGTTTCGTCTACAGTTGCCTATGCTATTGGCATGATACTTACATTTGTTG TTCTGGTGCTGA
TGAAAAAGGGGCAACCTGCTCTCCTCTATTTAGTACCTTGCACACTTATTACTGCCT CAGTTGTTGCC
TGGAGACGTAAGGAAATGAAAAAGTTCTGGAAAGGTAACAGCTATCAGATGATGGAC CATTTGGATT
GTGCAACAAATGAAGAAAACCCTGTGATATCTGGTGAACAGATTGTCCAGCAATAAT ATTATGTGGAA
CTGCTAT AATGTGTC ATT G ATTTT CT AC AAAT AG ACTTCG ACTTTTT AAATT G ACTTTT G AATT G AC AAT
CT GAAAGAGT CTT CAAT GAT AT GCTT GCAAAAAT ATATTTTT AT GAGCT GGTACT G ACAGTT ACATC AT
AAAT AACT AAAAC GCTTT GCTTTT AAT GTT AAAGTT GTGCCTT C AC ATT AAAT AAAAC AT ATGGTCTGT
GTAGTTTCCG AG AT GTACTATAT AC AGTAT ATTTTT CT AAAAAAAAA
(SEQ ID NO: 68)
As used herein, the term “TBK1” refers to the gene encoding Serine/threonine-protein kinase TBK1. The terms “TBK1” and "Serine/threonine-protein kinase TBK1" include wild-type forms of the TBK1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type TBK1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type TBK1 nucleic acid sequence (e.g., SEQ ID NO: 69, ENA accession number AF191838). SEQ ID NO: 69 is a wild-type gene sequence encoding TBK1 protein, and is shown below:
GCCGGCGGTGGCGCGGCGGAGACCCGGCTGGTATAACAAGAGGATTGCCTGATCCAG CCA
AGATGCAGAGCACTTCTAATCATCTGTGGCTTTTATCTGATATTTTAGGCCAAGGAG CTA
CT GCAAATGTCTTTCGT GGAAG ACAT AAG AAAACTGGT GATTT ATTT GCT AT CAAAGT AT
TTAATAACATAAGCTTCCTTCGTCCAGTGGATGTTCAAATGAGAGAATTTGAAGTGT TGA
AAAAACT C AAT C AC AAAAAT ATTGTC AAATT ATTT GCT ATT G AAG AG G AG AC AAC AAC AA
G AC AT AAAGT ACTT ATT AT G G AATTTT GTCC AT GT G GG AGTTT AT AC ACT GTTTT AG AAG
AACCTTCTAATGCCTATGGACTACCAGAATCTGAATTCTTAATTGTTTTGCGAGATG TGG
TGGGTGGAATGAATCATCTACGAGAGAATGGTATAGTGCACCGTGATATCAAGCCAG GAA
ATATCATGCGTGTTATAGGGGAAGATGGACAGTCTGTGTACAAACTCACAGATTTTG GTG
CAGCT AG AG AATTAGAAG AT GAT GAGC AGTTTGTTT CTCT GT AT GGCACAG AAG AAT ATT
TGC ACCCT GAT ATGTAT GAG AG AG C AGTG CT AAG AAAAG AT CAT C AG AAG AAAT AT GG AG
CAACAGTTGATCTTTGGAGCATTGGGGTAACATTTTACCATGCAGCTACTGGATCAC TGC
CATTTAGACCCTTTGAAGGGCCTCGTAGGAATAAAGAAGTGATGTATAAAATAATTA CAG
GAAAGCCTTCTGGTGCAATATCTGGAGTACAGAAAGCAGAAAATGGACCAATTGACT GGA
GTGGAGACATGCCTGTTTCTTGCAGTCTTTCTCGGGGTCTTCAGGTTCTACTTACCC CTG
TTCTTGCAAACATCCTTGAAGCAGATCAGGAAAAGTGTTGGGGTTTTGACCAGTTTT TTG
CAGAAACTAGTGATATACTTCACCGAATGGTAATTCATGTTTTTTCGCTACAACAAA TGA
CAGCT CAT AAG ATTT AT ATT CAT AG CTAT AAT ACTGCTACTAT ATTT CAT GAACTGGTAT
ATAAACAAACCAAAATTATTTCTTCAAATCAAGAACTTATCTACGAAGGGCGACGCT TAG
TCTTAGAACCTGGAAGGCTGGCACAACATTTCCCTAAAACTACTGAGGAAAACCCTA TAT
TTGTAGTAAGCCGGGAACCTCTGAATACCATAGGATTAATATATGAAAAAATTTCCC TCC
CTAAAGTACATCCACGTTATGATTTAGACGGGGATGCTAGCATGGCTAAGGCAATAA CAG
GGGTTGTGTGTTATGCCTGCAGAATTGCCAGTACCTTACTGCTTTATCAGGAATTAA TGC
G AAAGGGGATACG AT GGCT GATT GAATT AATT AAAGAT G ATTAC AAT G AAACT GTT CACA
AAAAG AC AGAAGTTGTG AT CAC ATTGG ATTT CTGTAT CAG AAACATT GAAAAAACTGTG A
AAGTATATGAAAAGTTGATGAAGATCAACCTGGAAGCGGCAGAGTTAGGTGAAATTT CAG
ACATACACACCAAATTGTTGAGACTTTCCAGTTCTCAGGGAACAATAGAAACCAGTC TTC
AGGATATCGACAGCAGATTATCTCCAGGTGGATCACTGGCAGACGCATGGGCACATC AAG
AAGGCACTCATCCGAAAGACAGAAATGTAGAAAAACTACAAGTCCTGTTAAATTGCA TGA
CAG AG ATTT ACTAT C AGTT C AAAA AAG AC AAAGC AG AACGT AG ATT AG CTT AT AAT G AAG
AACAAATCCACAAATTTGATAAGCAAAAACTGTATTACCATGCCACAAAAGCTATGA CGC
ACTTTACAGATGAATGTGTTAAAAAGTATGAGGCATTTTTGAATAAGTCAGAAGAAT GGA TAAGAAAGATGCTTCATCTTAGGAAACAGTTATTATCGCTGACTAATCAGTGTTTTGATA
TT G AAG AAG AAGTAT C AAAAT AT C AAG AAT ATACT A AT G AGTT AC AAG AAACTCTG CCT C
AGAAAATGTTTACAGCTTCCAGTGGAATCAAACATACCATGACCCCAATTTATCCAA GTT
CT AAC AC ATT AGT AG AAAT G ACTCTTG GTAT G AAG AAATT AAAG G AAG AG AT GG AAGG GG
TGGTTAAAGAACTTGCTGAAAATAACCACATTTTAGAAAGGTTTGGCTCTTTAACCA TGG
ATGGTGGCCTTCGCAACGTTGACTGTCTTTAGCTTTCTAATAGAAGTTTAAGAAAAG TTT
CCGTTTGCACAAGAAAATAACGCTTGGGCATTAAATGAATGCCTTTATAGATAGTCA CTT
GTTTCTACAATTCAGTATTTGATGTGGTCGTGTAAATATGTACAATATTGTAAATAC ATA
AAAAAT AT AC AAATTTTT GGCTGCTGT G AAG AT GT AATTTT AT CTTTT AAC ATTT AT AAT
T ATAT G AGG AAATTT G ACCTCAGT GAT CACGAG AAG AAAGCCAT G ACCG ACCAAT ATGTT
GACATACTGATCCTCTACTCTGAGTGGGGCTAAATAAGTTATTTTCTCTGACCGCCT ACT
GGAAATATTTTTAAGTGGAACCAAAATAGGCATCCTTACAAATCAGGAAGACTGACT TGA
CACGTTTGTAAATGGTAGAACGGTGGCTACTGTGAGTGGGGAGCAGAACCGCACCAC TGT
TATACTGGGATAACAATTTTTTTGAGAAGGATAAAGTGGCATTATTTTATTTTACAA GGT
GCCCAGATCCCAGTT ATCCTT GT ATCCAT GT AATTTCAG AT GAATT ATT AAGC AAACATT
TTAAAGT
(SEQ ID NO: 69)
As used herein, the term “TNF” refers to the gene encoding Tumor necrosis factor. The terms “TNF” and "Tumor necrosis factor" include wild-type forms of the TNF gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type TNF. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type TNF nucleic acid sequence (e.g., SEQ ID NO: 70, ENA accession number X01394). SEQ ID NO: 70 is a wild-type gene sequence encoding TNF protein, and is shown below:
GCAGAGGACCAGCTAAGAGGGAGAGAAGCAACTACAGACCCCCCCTGAAAACAACCC TCA
GACGCCACATCCCCTGACAAGCTGCCAGGCAGGTTCTCTTCCTCTCACATACTGACC CAC
GGCTCCACCCTCTCTCCCCTGGAAAGGACACCATGAGCACTGAAAGCATGATCCGGG ACG
TGGAGCTGGCCGAGGAGGCGCTCCCCAAGAAGACAGGGGGGCCCCAGGGCTCCAGGC GGT
GCTTGTTCCTCAGCCTCTTCTCCTTCCTGATCGTGGCAGGCGCCACCACGCTCTTCT GCC
TGCTGCACTTTGGAGTGATCGGCCCCCAGAGGGAAGAGTTCCCCAGGGACCTCTCTC TAA
TCAGCCCTCTGGCCCAGGCAGTCAGATCATCTTCTCGAACCCCGAGTGACAAGCCTG TAG
CCCATGTTGTAGCAAACCCTCAAGCTGAGGGGCAGCTCCAGTGGCTGAACCGCCGGG CCA
ATGCCCTCCTGGCCAATGGCGTGGAGCTGAGAGATAACCAGCTGGTGGTGCCATCAG AGG
GCCTGTACCTCATCTACTCCCAGGTCCTCTTCAAGGGCCAAGGCTGCCCCTCCACCC ATG
TGCTCCTCACCCACACCATCAGCCGCATCGCCGTCTCCTACCAGACCAAGGTCAACC TCC
TCTCTGCCATCAAGAGCCCCTGCCAGAGGGAGACCCCAGAGGGGGCTGAGGCCAAGC CCT
GGTATGAGCCCATCTATCTGGGAGGGGTCTTCCAGCTGGAGAAGGGTGACCGACTCA GCG
CTGAGATCAATCGGCCCGACTATCTCGACTTTGCCGAGTCTGGGCAGGTCTACTTTG GGA
TCATTGCCCTGTGAGGAGGACGAACATCCAACCTTCCCAAACGCCTCCCCTGCCCCA ATC CCTTTATTACCCCCTCCTTCAGACACCCTCAACCTCTTCTGGCTCAAAAAGAGAATTGGG
GGCTTAGGGTCGGAACCCAAGCTTAGAACTTTAAGCAACAAGACCACCACTTCGAAA CCT
GGGATTCAGGAATGTGTGGCCTGCACAGTGAATTGCTGGCAACCACTAAGAATTCAA ACT
GGGGCCTCCAGAACTCACTGGGGCCTACAGCTTTGATCCCTGACATCTGGAATCTGG AGA
CCAGGGAGCCTTTGGTTCTGGCCAGAATGCTGCAGGACTTGAGAAGACCTCACCTAG AAA
TTGACACAAGTGGACCTTAGGCCTTCCTCTCTCCAGATGTTTCCAGACTTCCTTGAG ACA
CGGAGCCCAGCCCTCCCCATGGAGCCAGCTCCCTCTATTTATGTTTGCACTTGTGAT TAT
TT ATT ATTT ATTT ATT ATTT ATTT ATTT AC AG AT GAAT GT ATTT ATTT G GG AG AC CGG GG
TATCCTGGGGGACCCAATGTAGGAGCTGCCTTGGCTCAGACATGTTTTCCGTGAAAA CGG
AGCTGAACAATAGGCTGTTCCCATGTAGCCCCCTGGCCTCTGTGCCTTCTTTTGATT ATG
TTTTTT AAAAT ATTT ATCT GATT AAGTTGTCTAAACAAT GCT GATTTGGTG ACC AACTGT
CACTCATTGCTGAGCCTCTGCTCCCCAGGGGAGTTGTGTCTGTAATCGCCCTACTAT TCA
GTGGCGAGAAATAAAGTTTGCTT
(SEQ ID NO: 70)
As used herein, the term “TREM2” refers to the gene encoding Triggering receptor expressed on myeloid cells 2. The terms “TREM2” and "Triggering receptor expressed on myeloid cells 2" include wild-type forms of the TREM2 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type TREM2. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type TREM2 nucleic acid sequence (e.g., SEQ ID NO: 71, ENA accession number AF213457). SEQ ID NO: 71 is a wild-type gene sequence encoding TREM2 protein, and is shown below:
TGACATGCCTGATCCTCTCTTTTCTGCAGTTCAAGGGAAAGACGAGATCTTGCACAA GGC
ACTCTGCTTCTGCCCTTGGCTGGGGAAGGGTGGCATGGAGCCTCTCCGGCTGCTCAT CTT
ACTCTTTGTCACAGAGCTGTCCGGAGCCCACAACACCACAGTGTTCCAGGGCGTGGC GGG
CCAGTCCCTGCAGGTGTCTTGCCCCTATGACTCCATGAAGCACTGGGGGAGGCGCAA GGC
CTGGTGCCGCCAGCTGGGAGAGAAGGGCCCATGCCAGCGTGTGGTCAGCACGCACAA CTT
GTGGCTGCTGTCCTTCCTGAGGAGGTGGAATGGGAGCACAGCCATCACAGACGATAC CCT
GGGTGGCACTCTCACCATTACGCTGCGGAATCTACAACCCCATGATGCGGGTCTCTA CCA
GTGCCAGAGCCTCCATGGCAGTGAGGCTGACACCCTCAGGAAGGTCCTGGTGGAGGT GCT
GGCAGACCCCCTGGATCACCGGGATGCTGGAGATCTCTGGTTCCCCGGGGAGTCTGA GAG
CTTCGAGGATGCCCATGTGGAGCACAGCATCTCCAGGAGCCTCTTGGAAGGAGAAAT CCC
CTTCCCACCCACTTCCATCCTTCTCCTCCTGGCCTGCATCTTTCTCATCAAGATTCT AGC
AGCCAGCGCCCTCTGGGCTGCAGCCTGGCATGGACAGAAGCCAGGGACACATCCACC CAG
TGAACTGGACTGTGGCCATGACCCAGGGTATCAGCTCCAAACTCTGCCAGGGCTGAG AGA
CACGTGAAGGAAGATGATGGGAGGAAAAGCCCAGGAGAAGTCCCACCAGGGACCAGC CCA
GCCTGCATACTTGCCACTTGGCCACCAGGACTCCTTGTTCTGCTCTGGCAAGAGACT ACT
CTGCCTGAACACTGCTTCTCCTGGACCCTGGAAGCAGGGACTGGTTGAGGGAGTGGG GAG
GTGGTAAG AACACCT GACAACTTCT GAAT ATT GG ACATTTT AAACACTT ACAAATAAAT C
CAAGACTGTCATATTTAAAAA (SEQ ID NO: 71)
As used herein, the term “TREML2” refers to the gene encoding Triggering Receptor Expressed on Myeloid Cells Like 2. The terms “TREML2” and "Triggering Receptor Expressed on Myeloid Cells Like 2" include wild-type forms of the TREML2 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type TREML2. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type TREML2 nucleic acid sequence (e.g., SEQ ID NO: 72, NCBI Reference Sequence: NM_024807.3). SEQ ID NO: 72 is a wild-type gene sequence encoding TREML2 protein, and is shown below:
CAATGAATCCCTGCGGTTGGCTGGGGGCAGTGGGTCCCACACTGCCTCACTTCCCTA AATGGGCAG
CTTCACTTTTAGAACCCCGGGTCCTTCCCTGGCAGGCCCAGGTGGCACATCCTGTGT CGGGTGGGC
CCTCACCTTGGATCTCCAGGCCTGACACTGCCCAGCTGGATGGAACCATGGCCCCAG CCTTCCTGC
TGCTGCTGCTGCTGTGGCCACAGGGTTGCGTCTCAGGCCCCTCTGCTGACAGTGTAT ACACAAAAG
TGAGGCTCCTTGAAGGGGAGACTCTGTCTGTGCAGTGCTCCTATAAGGGCTACAAAA ACCGCGTGG
AGGGCAAGGTTTGGTGCAAAATCAGGAAGAAGAAGTGTGAGCCTGGCTTTGCCCGAG TCTGGGTGA
AAGGGCCCCGCTACTTGCTGCAGGACGATGCCCAGGCCAAGGTGGTCAACATCACCA TGGTGGCC
CTCAAGCTCCAGGACTCAGGCCGATACTGGTGCATGCGCAACACCTCTGGGATCCTG TACCCCTTG
ATGGGCTTCCAGCT GG AT GT GT CTCCAGCTCCCC AAACT G AG AGG AACATTCCTTT CAC ACAT CTGG
ACAACATCCTCAAGAGTGGAACTGTCACAACTGGCCAAGCCCCTACCTCAGGCCCTG ATGCCCCTTT
TACCACTGGTGTGATGGTGTTCACCCCAGGACTCATCACCTTGCCTAGGCTCTTAGC CTCCACCAGA
CCTGCCTCCAAGACAGGCTACAGCTTCACTGCTACCAGCACCACCAGCCAGGGACCC AGGAGGACC
ATGGGGTCCCAGACAGTGACCGCGTCTCCCAGCAATGCCAGGGACTCCTCTGCTGGC CCAGAATCC
ATCTCCACTAAGTCTGGGGACCTCAGCACCAGATCGCCCACCACAGGGCTCTGCCTC ACCAGCAGA
TCTCTCCTCAACAGACTACCCTCCATGCCCTCCATCAGGCACCAGGATGTTTACTCC ACTGTGCTTG
GGGTGGTGCTGACCCTCCTGGTGCTGATGCTGATCATGGTCTATGGGTTTTGGAAGA AGAGACACA
TGGCAAGCTACAGCATGTGCAGCGATCCTTCTACACGTGACCCACCTGGAAGACCAG AGCCCTATG
TGGAAGTCTACTTGATCTGAGGCCACTTAAGCATGGGGTGGGGAGCTTCTCCCAGAG TGGCCCCAG
GGGGTTAGAGGAGGGGTGAAGATTGGGGCCAGTATCGATCTTATGAAGCTGGAGGAC TTGTGCAGT
GCTGGACTCACCCAGGACTTCCCAAACCCAGAGGCTGCCATCCTAAGCAGCCCCACA GCCCAGTGT
TCTCCTTGGGGGCAGGAACCTGGGGAGGGGCCCAGAGCAAAGGGCATCAGGGAGAAA GTCCCGAG
GAAATGTGACCAGTGGTTTCTGCTCGGAGCTGCAGACCCCAGGGCTCTTGGTGGAGG CAGGGGAA
CCCTGAGAGTGCTGTTTACAGAGAACCTCAGCTCCCGTCTGCCTCAGAAACCCTATT GGGCTGAGCT
GCCCTCCCCACCAGGGCCACTGTGTCCTCTGCTTCCCTCCGTTCTGCTTCAGCTTCC CCTAAGGTTA
GGGAAGAAAGAATCGGGCTCACGAATGCCAGAGGCAGTGATGTCCCATCCTGGAGGA GAGGAAAC
AGTGACTAAAAGCTGGGGACCCACAGAGGGGTTGGCAGCTTCTCTTGTCGGGACAGG TGTCCTTTG
CTGGGCCTCTGGATGGCCCTGCCCTGACTGGGGCTGCTCCTCCCTCCTGTCCTGGGA CCGCGCAG
AGCCCACGCTCTCACTGCTGCCTCCTGCTGGCCGCTGCCTCCTTAGAAAGCTGTGAC CAGGCAGCT
AAGAGCCTCT GGGCT GCAGGGTCAGCCTCTCCCAAGACT GAAGT GCAGAGGCTGGACTT GGGGCT
CTCTCCCCCAGCTTCTACACCTGGGCTCCAAGTCTGAGTTCCCACAGGGGACCCAGC AGCCTCCAG GAAGTCCATACCCTGGGGTGGCTGAGACCTTGGCTCTGTATGGAGGCTGCTCACCCCACA GACACT
GGTGGGGAGACCATGGCTCAGAGGAAGGGTGGAGCAACCCTCCTCCTACCCCTCAGG ATAGAGAG
AGAAGACACACTTGGGACACAGTGAAGACAGTAACTTGGAACTGACCACGGCCTGGA GGACTGGCC
CAGGCAGGGGGACAGGGAAAATGGAGCCCAAGTAGCCTCTGGCCAGGGACCCAATGT CCCGAGGA
ATCTGCCTCCCACCCACTGACTCAGGGCTCAGACTCAGCCTCTATTGTCCAGAGCAC TGGCTTGGC
GTCCAGCAATGAAGGCTGGAGAATGCAGCCTGGATTCCCCTACACACACACACACAC ACACACACA
C AC AC AC AC AC AC AC AC AC AC AC AC AC AG GT GTCTACT GACCT GGAGT GACT GGAATAGCACCTGG
GGATAAATGTGACAACTGTGCATTGAACCCTGGGTCAGGGACGTTCCAATGGCCAAG AGAGTGACA
CAGCCAGGACCCTGGTGGACAGCCAGAGGGGCCACTTCAGGATGGATGTGGGGAGAG TGGAAGAG
GCAGGGAGTAATCCTGGGGGACAGCAGGGAGGAGGCACTTCTTCCCTATGTCCAGGA GAGGGCAA
TAGAGGGAAGACTGAGGCTGAAGAATTGACGGCTCTGGACCCAGGACAGACAGACAG ACAGACAGA
CAGACAGACAGACAGACACGCACACACACCCATCTCTGTCTAGCAAGCAGCCTCCTA AGATAGCTGT
TCTCCCTATCATGACGGTGTAGCCACCATCCTGTTGTATACTAGGAGAGAACTTAAC CCACCTGGGG
GAAAATAGCTCCCCAAGAGCTGGCACCAGTACCACTGATGGCCCTGCTTCCTCTGAG TGAGATGCC
CAGGAGGAGGAGCCCTAGGGAAGAAGTCAGGGACAGGGACCAGGATACCACTCTGTC ACTGTGTG
ACCCTCAGCAAGTCACTAACCCTTGGCCTCATTTTTCCTGTCTTGTGAAAGAGGACA ATAATTCCTAC
TTCTCAAGATTGTTTTCAAGATAAAATAACATTAGCATTGTACAATGATGCAAATGC CTCATTACCATT
ATTCCTTAAGTTGTTTTCCAGCTCTAATGTTGTTTCCAACATTACATTTAAGACCTT AGGATTCTGTTTC
TTGCTTTTGTCATATCTCTTCCCAAGTGTCATCACTATATGGATGTTGAGGGCCCCC GATGACAGTCC
CTTTGGTAAGGTCCTCTTTTGAGGAGGGGAGGGTACAGGGTGGACTCATCTCAGTGT GAACTTGGC
AAGTCACTGTCCCTCTCTGATCTTGTTTCCTCATCTGGAGAAGGAGTGAGAGAGGAG AAAGGAAGAA
ACCAGTCAGGCAGGCAGTTAGGGTGGGTTCTCGGTAGAATTCTTTTAAACAAAAGAA CAGCCTGAAA
AATCAAGCTGCAGGCACAGATATGGGAACTTGCACAGGGGGGCTTGCCTAAGACATG CCCACAGCC
T CATAG AT AAG AC AG ACT ACACAGGT GACTTGCCCAAAC AT GCCT GCAATGG AAAATTT CATCCCCT
GACATGTGCAGTAAGGGGAACAAAGCAATATGGAGTAAGTAACTCAAGCCAAGGGCC CACATGTAC
ATTAGAAGGACAGCAGGGAGCTACCAGAAATTCATGCCTTATGCAGATGAGCTGCCC AGTCCTCATC
GGTTTCTTATAAAAGCCTTTACATTCAACTGTAAAAATGGCAACCCTCTTTCAGGCC TCCTCTCCACA
G C AG AG AGCTTT CTTCTCTCACT C ATT AAACTTT C ACTCC AAC CT C AAAAAAAAAAAAAAAAAA
(SEQ ID NO: 72)
As used herein, the term “TYROBP” refers to the gene encoding TYRO protein tyrosine kinase-binding protein. The terms “TYROBP” and "TYRO protein tyrosine kinase-binding protein" include wild-type forms of the TYROBP gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type TYROBP. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type TYROBP nucleic acid sequence (e.g., SEQ ID NO: 73, ENA accession number AF019562). SEQ ID NO: 73 is a wild-type gene sequence encoding TYROBP protein, and is shown below:
CCACGCGTCCGCGCTGCGCCACATCCCACCGGCCCTTACACTGTGGTGTCCAGCAGC ATC
CGGCTTCATGGGGGGACTTGAACCCTGCAGCAGGCTCCTGCTCCTGCCTCTCCTGCT GGC
TGTAAGTGGTCTCCGTCCTGTCCAGGCCCAGGCCCAGAGCGATTGCAGTTGCTCTAC GGT GAGCCCGGGCGT GCT GGCAGGGATCGTGAT GGGAGACCT GGTGCTGACAGT GCTCATT GC CCTGGCCGTGTACTTCCTGGGCCGGCTGGTCCCTCGGGGGCGAGGGGCTGCGGAGGCAGC GACCCGGAAACAGCGTATCACTGAGACCGAGTCGCCTTATCAGGAGCTCCAGGGTCAGAG GTCGGATGTCTACAGCGACCTCAACACACAGAGGCCGTATTACAAATGAGCCCGAATCAT GACAGTCAGCAACATGATACCTGGATCCAGCCATTCCTGAAGCCCACCCTGCACCTCATT CCAACTCCTACCGCGATACAGACCCACAGAGTGCCATCCCTGAGAGACCAGACCGCTCCC C AAT ACTCTCCT AAAAT AAAC AT G AAG C AC AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 73)
As used herein, the term “ZCWPW1” refers to the gene encoding Zinc finger CW-type PWWP domain protein 1. The terms “ZCWPW1” and "Zinc finger CW-type PWWP domain protein 1" include wild-type forms of the ZCWPW1 gene, as well as variants (e.g., splice variants and polymorphisms) of wild-type ZCWPW1. Examples of such variants are nucleic acids having at least 70% sequence identity (e.g., 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.9% identity, or more) to a wild-type ZCWPW1 nucleic acid sequence (e.g., SEQ ID NO: 74, ENA accession number AL136735). SEQ ID NO: 74 is a wild-type gene sequence encoding ZCWPW1 protein, and is shown below:
CGCCGTTTTCCCGGGGAGATGCGCCGCCCGGTCTCCCTGCCAGCGGAGTGCTGGGCC GAG
GACAGGGCGGCAGGGGTGACAGTGGGGTCCAGGAGAGTCTCAAAATCCTAAGCTTTC AGT
ATTTGTTATTGTGAAAGAAGTTAATTCACCTGAAACAGAGGAGGGGCAACCTGAGTT ATC
AGAAAGTGACTTCCTGGCCTTCCCTTCTTTACTGATCAGAGGCACACAAAGCGTAGT TTC
T AAGCT G AAT GAT G ACAACGTT GCAG AAT AAAG AAG AAT GT GGAAAGGG ACCAAAGAG AA
TCTTTGCCCCACCTGCACAAAAATCTTACAGCCTGTTACCTTGTAGCCCTAACTCCC CTA
AGGAGGAGACCCCGGGGATCAGTTCCCCAGAGACAGAGGCCAGGATAAGCCTGCCAA AGG
CCAGTTTAAAGAAGAAAGAGGAAAAAGCAACCATGAAGAATGTTCCAAGCAGGGAAC AGG
AGAAAAAAAGAAAGGCACAAATCAACAAGCAAGCAGAGAAGAAAGAAAAGGAAAAAT CAA
GTCTTACCAATGCAGAATTTGAGGAGATTGTCCAGATTGTTCTGCAGAAGTCCCTTC AGG
AGT GCTT GGGG AT GGG AT CTGGCCTT G ATTTT GCAGAGACTT CTT GTGCCCAGCCCGT AG
TATCT ACC C AAT C AG AC AAGG AG CC AGG AATT ACTGCTTCTGCTACT GAT ACTG AT AAT G
CT AATGG AG AGG AGGT ACCAC AT ACT CAAG AG ATTTCAGT GT CTTGGGAAGGTG AAGCT G
CCCCT GAG AT AAGG AC ATCT AAGTT AGGCCAGCC AG ATCCTGCACCCTCT AAGAAGAAAT
CCAATAGACTCACCTTAAGCAAAAGAAAGAAGGAAGCTCATGAGAAGGTGGAGAAAA CTC
AAG GTG G AC AT GAG C AC AG AC AGG AAG ACCG ACT AAAG AAAAC AGTT C AGG AT C ATT CTC
AGATCAGGGACCAGCAAAAAGGAGAGATAAGTGGTTTTGGTCAATGTCTGGTCTGGG TCC
AGTGTTCCTTCCCAAACTGTGGGAAATGGAGGCGGCTGTGTGGGAACATTGACCCCT CAG
TTCTCCCAGATAATTGGTCCTGTGATCAGAACACAGATGTGCAGTATAATCGCTGTG ATA
TTCCTGAGGAGACCTGGACAGGGCTTGAGAGTGATGTGGCCTATGCCTCCTACATCC CAG
GATCCATCATCTGGGCCAAGCAATACGGTTACCCCTGGTGGCCAGGCATGATAGAAT CTG
ATCCTGACTTAGGGGAATATTTTCTTTTTACTTCCCATCTTGATTCCCTGCCGTCTA AGT
ACCATGTGACGTTTTTTGGAGAAACAGTTTCTCGTGCATGGATCCCAGTCAACATGC TAA
AGAACTTCCAGGAGCTGTCCCTGGAGCTATCAGTCATGAAAAAGCGCAGAAATGACT GCA GCCAGAAACTGGGGGTGGCCCTGATGATGGCTCAAGAGGCAGAACAGATCAGCATTCAGG
AACGGGTTAACTTGTTTGGTTTCTGGAGCCGATTCAACGGATCTAACAGTAATGGGG AAA
GAAAAGACTTACAGCTCTCTGGTTTGAACAGCCCAGGATCCTGCTTAGAGAAAAAGG AGA
AAGAGGAAGAGTTGGAAAAGGAGGAAGGAGAGAAAACAGACCCAATTTTGCCCATTC GTA
AGCGAGTCAAAATACAGACCCAAAAAAACCAAGCCAAGAGGGCTTGGGGGTGATGCA GGC
ACAGCAGATGGCCGAGGCAGGACACTGCAGAGGAAGATAATGAAGAGATCTCTAGGC AGG
AAATCCACAGCTCCTCCTGCACCCAGAATGGGAAGGAAAGAAGGCCAAGGGAATTCA GAT
TCTGACCAGCCAGGCCCTAAGAAAAAATTTAAAGCTCCCCAGAGCAAGGCCTTGGCA GCC
AGCTTTTCAGAGGGAAAAGAAGTTAGAACAGTGCCAAAGAACCTGGGCCTATCAGCG TGT
AAGGGGGCCTGCCCCTCATCTGCGAAAGAAGAGCCCAGACACCGGGAACCCCTGACC CAG
GAGGCTGGAAGTGTCCCCCTTGAGGACGAAGCCTCCAGTGACCTGGACCTGGAGCAA CTC
ATGGAAGATGTTGGGAGAGAGCTGGGGCAGAGCGGGGAGCTGCAGCACAGCAACAGT GAT
GGCGAGGACTTCCCCGTGGCGCTGTTTGGGAAGTAGCTGGTGCTCCTCTGCTCCCTC TTT
TTCTCCCTTCTCTGGGGCGCAGGAGGGAGAAGTTGCTAAGTGCTGGGTCTGTTCATT GGC
TATGAGGTTCAAATGTGTGTGGTGCAGTTTCTGTGTTAATAAAGCAGGTTACAGTCG AAA
AAAAAAAAAAAAAAAAA
(SEQ ID NO: 74)
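By way of illustration only, the sequence identity thresholds recited in the gene definitions above (e.g., at least 70% identity to a wild-type sequence) could be checked along the lines of the following Python sketch. The function names, the convention of scoring matches over aligned columns, and the treatment of a gap as a mismatch are assumptions made for this example; this section does not prescribe a particular alignment tool or scoring convention, and the sketch expects two sequences that have already been aligned.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two pre-aligned sequences.

    Assumption for this sketch: '-' marks a gap, a gap opposite a base
    counts as a mismatch, and columns where both sequences are gapped
    are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the two aligned sequences share at least `threshold` % identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned ten-nucleotide sequences differing at a single position would score 90% under this convention and would therefore satisfy a 70% threshold.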
DETAILED DESCRIPTION OF THE INVENTION
The present invention provides new forms of siRNA, including single- and double-stranded short interfering RNA (ds-siRNA), and methods for their use in treating a patient in need of microglial gene silencing (e.g., a patient having dysregulated microglial gene expression, such as a patient with, e.g., Alzheimer’s disease, amyotrophic lateral sclerosis, Parkinson’s disease, frontotemporal dementia, Huntington’s disease, multiple sclerosis, or progressive supranuclear palsy). The branched siRNA in the present invention has shown a surprising ability to permeate the cell. The branched compositions described herein may employ a variety of modifications known and previously unknown in the art. The siRNA of the invention may contain an antisense strand including a region that is represented by Formula IX:
Z-((A-P-)n(B-P-)m)q; (IX)
wherein Z is a 5’ phosphorus stabilizing moiety; each A is, independently, a 2'-modified-ribonucleoside of a first type; each B is, independently, a 2'-modified-ribonucleoside of a second type; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5; m is an integer from 1 to 5; and q is an integer between 1 and 15. The embodiments of each part of Formula IX and the methods of use for the molecules Formula IX represents are described herein.
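The repeat structure of Formula IX can be made concrete with a short Python sketch that expands the formula into an explicit string for chosen values of n, m, and q. The function name and the single-letter symbols are assumptions of this example. For simplicity it also writes every linkage with one symbol, whereas under Formula IX each P is independently a phosphodiester or phosphorothioate linkage, and it treats "between 1 and 15" as inclusive.

```python
def expand_formula_ix(n: int, m: int, q: int,
                      a: str = "A", b: str = "B",
                      linkage: str = "P", prefix: str = "Z") -> str:
    """Expand Z-((A-P-)n(B-P-)m)q into an explicit string.

    Bounds follow the ranges stated for Formula IX: n and m from 1 to 5,
    q from 1 to 15 (treated as inclusive in this sketch).
    """
    for name, value, upper in (("n", n, 5), ("m", m, 5), ("q", q, 15)):
        if not 1 <= value <= upper:
            raise ValueError(f"{name} must be between 1 and {upper}, got {value}")
    # One repeat unit: n nucleosides of the first type, then m of the second.
    unit = [a] * n + [b] * m
    nucleosides = unit * q
    body = "".join(f"{nucleoside}-{linkage}-" for nucleoside in nucleosides)
    return f"{prefix}-{body}"
```

With n = 1, m = 1, and q = 2 the sketch yields Z-A-P-B-P-A-P-B-P-, i.e., a strictly alternating arrangement of the two nucleoside types; larger n or m values give blocks of like nucleosides within each repeat.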
In some embodiments, the siRNA of the invention may have a sense strand represented by Formula X:
Y-((A-P-)n(B-P-)m)qL-((B-P-)m(A-P-)n)q; (X)
wherein Y is a hydrophobic moiety (e.g., cholesterol, vitamin D, or tocopherol); L is a linker; each A is, independently, a 2'-modified-ribonucleoside of a first type; each B is, independently, a 2'-modified-ribonucleoside of a second type; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5; m is an integer from 1 to 5; and q is an integer between 1 and 15. The embodiments of each part of Formula X and the methods of use for the molecules Formula X represents are described herein.
siRNA Structure
The simplest siRNAs consist of a ribonucleic acid comprising a single- or double-stranded structure, formed by a first strand, and in the case of a double-stranded siRNA, a second strand. The first strand comprises a stretch of contiguous nucleotides that is at least partially complementary to a target nucleic acid. The second strand also comprises a stretch of contiguous nucleotides, and this second stretch is at least partially identical to a target nucleic acid. The first strand and the second strand may be hybridized to each other to form a double-stranded structure. The hybridization typically occurs by Watson-Crick base pairing.
Depending on the sequence of the first and second strand, the hybridization or base pairing is not necessarily complete or perfect, which means that the first and second strand are not 100% base-paired due to mismatches. One or more mismatches may also be present within the duplex without necessarily impacting the siRNA activity.
The first strand contains a stretch of contiguous nucleotides which is essentially complementary to a target nucleic acid. Typically, the target nucleic acid sequence is, in accordance with the mode of action of interfering ribonucleic acids, a single-stranded RNA, preferably an mRNA. Such hybridization most likely occurs through Watson-Crick base pairing but is not necessarily limited thereto. The extent to which the stretch of contiguous nucleotides of the first strand is complementary to a target nucleic acid sequence can be between 80% and 100%, e.g., 80%, 85%, 90%, 95%, or 100% complementary. siRNAs described herein may employ modifications to the nucleobase, phosphate backbone, ribose core, 5'- and 3'-ends, and branching, wherein multiple strands of siRNA may be covalently linked.
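As a non-limiting illustration of the 80% to 100% complementarity range discussed above, the following Python sketch scores a first (antisense) strand against an equal-length target site. The function name is hypothetical, and the sketch assumes that both sequences are written 5' to 3', that pairing is antiparallel, and that only Watson-Crick pairs (A-U and G-C) are counted, even though the hybridization described above is not necessarily limited to such pairs.

```python
# Watson-Crick partners for RNA; T is converted to U before scoring.
_WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(antisense: str, target_site: str) -> float:
    """Percent of antisense positions that Watson-Crick pair with the target.

    Both sequences are given 5'->3' and must have the same length; the
    antisense strand is compared against the reversed target site because
    the two strands pair in antiparallel orientation.
    """
    antisense = antisense.upper().replace("T", "U")
    target_site = target_site.upper().replace("T", "U")
    if not antisense or len(antisense) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1
        for base, partner in zip(antisense, reversed(target_site))
        if _WATSON_CRICK.get(base) == partner
    )
    return 100.0 * paired / len(antisense)
```

Under these assumptions, an antisense strand that is the exact reverse complement of the target site scores 100%, and a single mismatch in a ten-nucleotide stretch lowers the score to 90%, which remains within the 80% to 100% range stated above.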
Length of siRNA molecules
It is within the scope of the invention that any length, known and previously unknown in the art, may be employed for the current invention. As described herein, potential lengths for an antisense strand of the branched siRNA of the present invention are between 10 and 30 nucleotides (e.g., 10 nucleotides, 11 nucleotides, 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, 25 nucleotides, 26 nucleotides, 27 nucleotides, 28 nucleotides, 29 nucleotides, or 30 nucleotides), between 15 and 25 nucleotides (e.g., 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, or 25 nucleotides), or between 18 and 23 nucleotides (e.g., 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, or 23 nucleotides). In some embodiments, the antisense strand is 21 nucleotides. In some embodiments, the antisense strand is 22 nucleotides. In some embodiments, the antisense strand is 23 nucleotides. In some embodiments, the antisense strand is 24 nucleotides. In some embodiments, the antisense strand is 25 nucleotides. In some embodiments, the antisense strand is 26 nucleotides. In some embodiments, the antisense strand is 27 nucleotides. In some embodiments, the antisense strand is 28 nucleotides. In some embodiments, the antisense strand is 29 nucleotides. In some embodiments, the antisense strand is 30 nucleotides.
In some embodiments, the sense strand of the branched siRNA of the present invention is between 12 and 30 nucleotides (e.g., 12 nucleotides, 13 nucleotides, 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, 25 nucleotides, 26 nucleotides, 27 nucleotides, 28 nucleotides, 29 nucleotides, or 30 nucleotides), or 14 and 18 nucleotides (e.g., 14 nucleotides, 15 nucleotides, 16 nucleotides, 17 nucleotides, or 18 nucleotides). In some embodiments, the sense strand is 16 nucleotides. In some embodiments, the sense strand is 17 nucleotides. In some embodiments, the sense strand is 18 nucleotides. In some embodiments, the sense strand is 19 nucleotides. In some embodiments, the sense strand is 20 nucleotides. In some embodiments, the sense strand is 21 nucleotides. In some embodiments, the sense strand is 22 nucleotides. In some embodiments, the sense strand is 23 nucleotides. In some embodiments, the sense strand is 24 nucleotides. In some embodiments, the sense strand is 25 nucleotides. In some embodiments, the sense strand is 26 nucleotides. In some embodiments, the sense strand is 27 nucleotides. In some embodiments, the sense strand is 28 nucleotides. In some embodiments, the sense strand is 29 nucleotides. In some embodiments, the sense strand is 30 nucleotides.
2' modifications
The present invention includes single- and double-stranded compositions comprising at least one alternating motif. Alternating motifs of the present invention may have the formula ((A-P-)n(B-P-)m)q where A is a nucleoside of a first type, B is a nucleoside of a second type, n is from 1 to 5, m is from 1 to 5, and q is from 1 to 15, and P is an internucleoside linkage. The result may include a regular or irregular pattern of alternating nucleosides of the first and second types. Each of the types of nucleosides may be identical with the exception that at least the 2’-substituent groups are different.
Possible 2'-modifications comprise all possible orientations of OH; F; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; O-, S-, or N-alkynyl; or O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may be substituted or unsubstituted C1 to C10 alkyl or C2 to C10 alkenyl and alkynyl. In some embodiments, the modification includes a 2’-O-methyl (2’-O-Me) modification. Some embodiments use O[(CH2)nO]mCH3, O(CH2)nOCH3, O(CH2)nNH2, O(CH2)nCH3, O(CH2)nONH2, and O(CH2)nON[(CH2)nCH3]2, where n and m are from 1 to about 10. Other potential sugar substituent groups include: C1 to C10 lower alkyl, substituted lower alkyl, alkenyl, alkynyl, alkaryl, aralkyl, O-alkaryl or O-aralkyl, SH, SCH3, OCN, Cl, Br, CN, CF3, OCF3, SOCH3, SO2CH3, ONO2, NO2, N3, NH2, heterocycloalkyl, heterocycloalkaryl, aminoalkylamino, polyalkylamino, substituted silyl, a group for improving the pharmacokinetic properties of an oligonucleotide, or a group for improving the pharmacodynamic properties of an oligonucleotide, and other substituents having similar properties. In some embodiments, the modification includes 2'-methoxyethoxy (2'-O-CH2CH2OCH3, also known as 2'-O-(2-methoxyethyl) or 2'-MOE). In some embodiments, the modification includes 2'-dimethylaminooxyethoxy, i.e., an O(CH2)2ON(CH3)2 group, also known as 2'-DMAOE, and 2'-dimethylaminoethoxyethoxy (also known in the art as 2'-O-dimethylaminoethoxyethyl or 2'-DMAEOE), i.e., 2'-O-CH2OCH2N(CH3)2. Other potential sugar substituent groups include aminopropoxy (-OCH2CH2CH2NH2), allyl (-CH2-CH=CH2), -O-allyl (-O-CH2-CH=CH2) and fluoro (F). 2'-sugar substituent groups may be in the arabino (up) position or ribo (down) position. In some embodiments, the 2'-arabino modification is 2'-F. Similar modifications may also be made at other positions on the oligomeric compound, particularly the 3' position of the sugar on the 3' terminal nucleoside or in 2'-5' linked oligonucleotides and the 5' position of the 5' terminal nucleotide.
Oligonucleotides may also have sugar mimetics such as cyclobutyl moieties in place of the pentofuranosyl sugar.
Nucleobase modifications
Oligomeric compounds may also include nucleosides or other surrogate or mimetic monomeric subunits that include a nucleobase (often referred to in the art simply as "base" or "heterocyclic base moiety"). The nucleobase is another moiety that has been extensively modified or substituted, and such modified and/or substituted nucleobases are amenable to the present invention. As used herein, "unmodified" or "natural" nucleobases include the purine bases adenine (A) and guanine (G), and the pyrimidine bases thymine (T), cytosine (C) and uracil (U). Modified nucleobases, also referred to herein as heterocyclic base moieties, include other synthetic and natural nucleobases such as 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-halouracil and cytosine, 5-propynyl (-C≡C-CH3) uracil and cytosine and other alkynyl derivatives of pyrimidine bases, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo particularly 5-bromo, 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine and 7-methyladenine, 2-F-adenine, 2-amino-adenine, 8-azaguanine and 8-azaadenine, 7-deazaguanine and 7-deazaadenine and 3-deazaguanine and 3-deazaadenine. Nucleobases may also include those in which the purine or pyrimidine base is replaced with other heterocycles, for example 7-deaza-adenine, 7-deazaguanosine, 2-aminopyridine and 2-pyridone. Further nucleobases include those disclosed in United States Patent No. 3,687,808, those disclosed in The Concise Encyclopedia of Polymer Science and Engineering, pages 858-859, Kroschwitz, J.I., ed. John Wiley & Sons, 1990, those disclosed by Englisch et al., Angewandte Chemie, International Edition, 1991, 30, 613, and those disclosed by Sanghvi, Y.S., Chapter 15, Antisense Research and Applications, pages 289-302, Crooke, S.T. and Lebleu, B., ed., CRC Press, 1993. Oligomeric compounds of the present invention can also include polycyclic heterocyclic compounds in place of one or more heterocyclic base moieties. A number of tricyclic heterocyclic compounds have been previously reported. These compounds are routinely used in antisense applications to increase the binding properties of the modified strand to a target strand.
Representative cytosine analogs that make 3 hydrogen bonds with a guanosine in a second strand include 1,3-diazaphenoxazine-2-one (Kurchavov, et al., Nucleosides and Nucleotides, 1997, 16, 1837-1846), 1,3-diazaphenothiazine-2-one (Lin, K.-Y.; Jones, R. J.; Matteucci, M. J. Am. Chem. Soc. 1995, 117, 3873-3874), and 6,7,8,9-tetrafluoro-1,3-diazaphenoxazine-2-one (Wang, J.; Lin, K.-Y., Matteucci, M. Tetrahedron Lett. 1998, 39, 8385-8388). Incorporated into oligonucleotides, these base modifications were shown to hybridize with complementary guanine, and the latter was also shown to hybridize with adenine and to enhance helical thermal stability by extended stacking interactions (also see U.S. Patent Application entitled "Modified Peptide Nucleic Acids" filed May 24, 2002, Serial number 10/155,920; and U.S. Patent Application entitled "Nuclease Resistant Chimeric Oligonucleotides" filed May 24, 2002, Serial number 10/013,295, both of which are herein incorporated by reference in their entirety). Further helix-stabilizing properties have been observed when a cytosine analog/substitute has an aminoethoxy moiety attached to the rigid 1,3-diazaphenoxazine-2-one scaffold (Lin, K.-Y.; Matteucci, M. J. Am. Chem. Soc. 1998, 120, 8531-8532).
Internucleoside linkage modifications
Another variable in the design of the present invention is the set of internucleoside linkages making up the phosphate backbone. Although the natural RNA phosphate backbone may be employed here, derivatives thereof, known and yet unknown in the art, may be used which enhance desirable characteristics of an siRNA. Although not limiting, of particular importance in the present invention is protecting parts, or the whole, of the siRNA from hydrolysis. One example of a modification that decreases the rate of hydrolysis is a phosphorothioate linkage. Any portion or the whole of the backbone may contain phosphate substitutions (e.g., phosphorothioates, phosphodiesters, etc.). For instance, the internucleoside linkages may be between 0 and 100% phosphorothioate, e.g., between 0 and 100%, 10 and 100%, 20 and 100%, 30 and 100%, 40 and 100%, 50 and 100%, 60 and 100%, 70 and 100%, 80 and 100%, 90 and 100%, 0 and 90%, 0 and 80%, 0 and 70%, 0 and 60%, 0 and 50%, 0 and 40%, 0 and 30%, 0 and 20%, 0 and 10%, 10 and 90%, 20 and 80%, 30 and 70%, 40 and 60%, 10 and 40%, 20 and 50%, 30 and 60%, 40 and 70%, 50 and 80%, or 60 and 90% phosphorothioate linkages. Similarly, the internucleoside linkages may be between 0 and 100% phosphodiester linkages, e.g., between 0 and 100%, 10 and 100%, 20 and 100%, 30 and 100%, 40 and 100%, 50 and 100%, 60 and 100%, 70 and 100%, 80 and 100%, 90 and 100%, 0 and 90%, 0 and 80%, 0 and 70%, 0 and 60%, 0 and 50%, 0 and 40%, 0 and 30%, 0 and 20%, 0 and 10%, 10 and 90%, 20 and 80%, 30 and 70%, 40 and 60%, 10 and 40%, 20 and 50%, 30 and 60%, 40 and 70%, 50 and 80%, or 60 and 90% phosphodiester linkages.
Specific examples of some potential oligomeric compounds useful in this invention include oligonucleotides containing modified (e.g., non-naturally occurring) internucleoside linkages. As defined in this specification, oligonucleotides having modified internucleoside linkages include internucleoside linkages that retain a phosphorus atom and internucleoside linkages that do not have a phosphorus atom. For the purposes of this specification, and as sometimes referenced in the art, modified oligonucleotides that do not have a phosphorus atom in their internucleoside backbone can also be considered to be oligonucleosides. In the C. elegans system, modification of the internucleotide linkage (phosphorothioate) did not significantly interfere with RNAi activity. Based on this observation, it is suggested that some compositions of the invention can also have one or more modified internucleoside linkages. A preferred phosphorus-containing modified internucleoside linkage is the phosphorothioate internucleoside linkage.
In some embodiments, the modified oligonucleotide backbones containing a phosphorus atom therein include, for example, phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates including 3'-alkylene phosphonates, 5'-alkylene phosphonates, phosphinates, phosphoramidates including 3'-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates, and boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs of these, and those having inverted polarity wherein one or more internucleotide linkages is a 3' to 3', 5' to 5' or 2' to 2' linkage. In some embodiments, the modified oligonucleotide backbones that do not include a phosphorus atom therein have backbones that are formed by short chain alkyl or cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more short chain heteroatomic or heterocyclic internucleoside linkages. These include those having morpholino linkages (formed in part from the sugar portion of a nucleoside); siloxane backbones; sulfide, sulfoxide and sulfone backbones; formacetyl and thioformacetyl backbones; methylene formacetyl and thioformacetyl backbones; riboacetyl backbones; alkene containing backbones; sulfamate backbones; methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide backbones; amide backbones; and others having mixed N, O, S and CH2 component parts.
siRNA Patterning
Nucleosides used in the invention tolerate a range of modifications in the nucleobase and sugar.
A complete siRNA, single-stranded or double-stranded, may have 1, 2, 3, 4, 5, or more different nucleosides that each appear in the siRNA strand or strands once or more. The nucleosides may appear in a repeating pattern (e.g., alternating between two modified nucleosides) or may be a strand of one type of nucleoside with substitutions of a second type of nucleoside. Similarly, internucleoside linkages may be of one or more types appearing in a single- or double-stranded siRNA in a repeating pattern (e.g., alternating between two internucleoside linkages) or may be a strand of one type of internucleoside linkage with substitutions of a second type of internucleoside linkage. Though the siRNAs of the invention tolerate a range of substitution patterns, the following exemplify some preferred patterns in which A and B represent nucleosides of two types, and T and P represent internucleoside linkages of two types:
Pattern 1 :
A-T-B-T-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A -T-A-T-A-T-A-T-A-T-A-T
A-T-A-T-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-T-A -T
Pattern 2:
A-T-A-T-A-P-B-P-B-P-B-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A -T-A-T-A-T-A-T-A-T-A-T
A-T-A-T-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-T-A -T
Pattern 3:
A-T-B-T-A-P-B-P-B-P-B-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A -T-A-T-A-T-A-T-A-T-A-T
A-T-A-T-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-T-A -T
Pattern 4:
A-T-B-T-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A -T-A-T-A-T-A-T-A-T-A-T
A-T-A-T-A-P-A-P-A-P-A-P-B-P-A-P-A-P-B-P-B-P-A-P-A-P-A-T-A -T Pattern 5:
A-T-B-T-A-P-A-P-A-P-B-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A -T-B-T-A-T-A-T-A-T-A-T
A-T-A-T-A-P-A-P-A-P-A-P-B-P-A-P-B-P-B-P-B-P-A-P-A-P-A-T-A -T.
In some embodiments, T represents phosphorothioate, and P represents phosphodiester.
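The pattern notation above alternates nucleoside symbols (A, B) with linkage symbols (T, P), so a strand can be read mechanically. The following is a minimal illustrative sketch, not part of the disclosed methods: the function name `parse_pattern` and the dictionary `LINKAGE_NAMES` are hypothetical, and the linkage assignment follows the embodiment in which T is phosphorothioate and P is phosphodiester.

```python
from collections import Counter

# Assumption: T = phosphorothioate, P = phosphodiester (one embodiment in the text).
LINKAGE_NAMES = {"T": "phosphorothioate", "P": "phosphodiester"}


def parse_pattern(pattern):
    """Split a hyphen-delimited pattern into nucleoside and linkage symbols.

    Even positions (0, 2, 4, ...) are nucleosides; odd positions are linkages.
    Stray spaces and a trailing period are tolerated.
    """
    tokens = [t.strip(" .") for t in pattern.split("-")]
    tokens = [t for t in tokens if t]
    return tokens[0::2], tokens[1::2]


# First strand of Pattern 1.
strand = (
    "A-T-B-T-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-"
    "A-T-A-T-A-T-A-T-A-T-A-T"
)
nucleosides, linkages = parse_pattern(strand)
linkage_counts = Counter(LINKAGE_NAMES[symbol] for symbol in linkages)

print("nucleosides:", len(nucleosides), dict(Counter(nucleosides)))
print("linkages:", dict(linkage_counts))
```

Reading the strand this way makes it easy to see where the phosphorothioate linkages cluster (at the two ends) and where the second nucleoside type is substituted.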
In some embodiments, the siRNA molecule of the disclosure features any one of the siRNA nucleotide modification patterns and/or internucleoside linkage modification patterns described in International Patent Application Publication Nos. WO 2016/161388 and WO 2020/041769, the disclosures of which are incorporated in their entirety herein.
In some embodiments of the disclosure, the siRNA may contain an antisense strand including a region represented by Formula A-I, wherein Formula A-I is, in the 5’-to-3’ direction:
A-B-(A’)ⱼ-C-P²-D-P¹-(C’-P¹)ₖ-C’
Formula A-I; wherein A is represented by the formula C-P¹-D-P¹; each A’ is represented by the formula C-P²-D-P²; B is represented by the formula C-P²-D-P²-D-P²-D-P²; each C is a 2’-O-methyl (2’-O-Me) ribonucleoside; each C’, independently, is a 2’-O-Me ribonucleoside or a 2’-fluoro (2’-F) ribonucleoside; each D is a 2’-F ribonucleoside; each P¹ is a phosphorothioate internucleoside linkage; each P² is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7). In some embodiments, j is 4. In some embodiments, k is 4. In some embodiments, j is 4 and k is 4. The antisense strand is complementary (e.g., fully or partially complementary) to a target nucleic acid sequence.
In some embodiments, the antisense strand includes a structure represented by Formula A1, wherein Formula A1 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-B-O-B-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-A-S-A-S-B-S-A
Formula A1; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
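To illustrate how the generic antisense formula relates to the explicit structure of Formula A1, the sketch below expands the generic formula for j = 4 and k = 4 and checks that Formula A1 is one instance of it. This is an illustrative reading only: the function names are hypothetical, the letters A, B, O, and S are borrowed from Formula A1 (2’-O-Me, 2’-F, phosphodiester, phosphorothioate), and the wildcard "X" for positions that may be either 2’-O-Me or 2’-F is an assumption introduced here.

```python
def expand_antisense_template(j, k):
    """Expand A-B-(A')j-C-P2-D-P1-(C'-P1)k-C' into a flat token list.

    Tokens: 'A' = 2'-O-Me, 'B' = 2'-F, 'X' = either (hypothetical wildcard),
            'S' = phosphorothioate, 'O' = phosphodiester.
    """
    block_a = ["A", "S", "B", "S"]                      # 2'-O-Me, PS, 2'-F, PS
    block_b = ["A", "O", "B", "O", "B", "O", "B", "O"]  # 2'-O-Me then three 2'-F, all PO
    block_a_prime = ["A", "O", "B", "O"]                # repeated j times
    middle = ["A", "O", "B", "S"]                       # 2'-O-Me, PO, 2'-F, PS
    tail = ["X", "S"] * k + ["X"]                       # k wildcard/PS pairs, final wildcard
    return block_a + block_b + block_a_prime * j + middle + tail


def matches_template(template, strand):
    """Return True if a hyphen-delimited strand fits the template ('X' matches A or B)."""
    tokens = [t.strip() for t in strand.split("-")]
    if len(tokens) != len(template):
        return False
    return all(
        (t == "X" and s in ("A", "B")) or t == s
        for t, s in zip(template, tokens)
    )


# Formula A1, split into segments only for readability.
FORMULA_A1 = (
    "A-S-B-S-"
    "A-O-B-O-B-O-B-O-"
    "A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-"
    "A-O-B-S-"
    "A-S-A-S-A-S-B-S-A"
)

template = expand_antisense_template(j=4, k=4)
print("nucleosides in template:", len(template[0::2]))
print("Formula A1 fits template:", matches_template(template, FORMULA_A1))
```

With j = 4 and k = 4 the template describes a 21-nucleoside strand, and Formula A1 fits it with the wildcard positions resolved as A, A, A, B, A.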
In some embodiments of the disclosure, the siRNA may contain an antisense strand including a region represented by Formula A-II, wherein Formula A-II is, in the 5’-to-3’ direction:
A-B-(A’)ⱼ-C-P²-D-P¹-(C-P¹)ₖ-C’
Formula A-II; wherein A is represented by the formula C-P¹-D-P¹; each A’ is represented by the formula C-P²-D-P²; B is represented by the formula C-P²-D-P²-D-P²-D-P²; each C is a 2’-O-methyl (2’-O-Me) ribonucleoside; each C’, independently, is a 2’-O-Me ribonucleoside or a 2’-fluoro (2’-F) ribonucleoside; each D is a 2’-F ribonucleoside; each P¹ is a phosphorothioate internucleoside linkage; each P² is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7). In some embodiments, j is 4. In some embodiments, k is 4. In some embodiments, j is 4 and k is 4. The antisense strand is complementary (e.g., fully or partially complementary) to a target nucleic acid sequence.
In some embodiments of the disclosure, the antisense strand includes a structure represented by Formula A2, wherein Formula A2 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-B-O-B-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-A-S-A-S-A-S-A
Formula A2; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S-III, wherein Formula S-III is, in the 5’-to-3’ direction:
E-(A’)ₘ-F
Formula S-III; wherein E is represented by the formula (C-P¹)₂; F is represented by the formula (C-P²)₃-D-P¹-C-P¹-C, (C-P²)₃-D-P²-C-P²-C, (C-P²)₃-D-P¹-C-P¹-D, or (C-P²)₃-D-P²-C-P²-D; A’, C, D, P¹, and P² are as defined in Formula I; and m is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7). In some embodiments, m is 4. The sense strand is complementary (e.g., fully or partially complementary) to the antisense strand.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S1, wherein Formula S1 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-S-A-S-A
Formula S1; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S2, wherein Formula S2 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-O-A-O-A
Formula S2; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S3, wherein Formula S3 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-S-A-S-B
Formula S3; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S4, wherein Formula S4 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A-O-A-O-B-O-A-O-B
Formula S4; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
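The four alternative definitions of F in the generic sense-strand formula appear to correspond, for m = 4, to Formulas S1 through S4. The sketch below checks that reading. It is illustrative only: the function name `expand_sense_template` and both dictionaries are hypothetical, and the A/B/O/S letters follow the legend used for Formulas S1 to S4 (2’-O-Me, 2’-F, phosphodiester, phosphorothioate).

```python
def expand_sense_template(m, f_block):
    """Expand E-(A')m-F, with E = two 2'-O-Me/PS pairs and A' = 2'-O-Me, PO, 2'-F, PO."""
    e_block = "A-S-A-S"
    a_prime = "A-O-B-O"
    return "-".join([e_block] + [a_prime] * m + [f_block])


# The four F alternatives, written in the A/B/O/S notation.
F_BLOCKS = {
    "S1": "A-O-A-O-A-O-B-S-A-S-A",  # three 2'-O-Me/PO pairs, 2'-F, PS, 2'-O-Me, PS, 2'-O-Me
    "S2": "A-O-A-O-A-O-B-O-A-O-A",  # same, with PO in place of PS
    "S3": "A-O-A-O-A-O-B-S-A-S-B",  # as S1, ending in 2'-F
    "S4": "A-O-A-O-A-O-B-O-A-O-B",  # as S2, ending in 2'-F
}

# Formulas S1 to S4, split into segments only for readability.
PRINTED = {
    "S1": "A-S-A-S-" "A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-" "A-O-A-O-A-O-B-S-A-S-A",
    "S2": "A-S-A-S-" "A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-" "A-O-A-O-A-O-B-O-A-O-A",
    "S3": "A-S-A-S-" "A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-" "A-O-A-O-A-O-B-S-A-S-B",
    "S4": "A-S-A-S-" "A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-" "A-O-A-O-A-O-B-O-A-O-B",
}

results = {
    name: expand_sense_template(4, f_block) == PRINTED[name]
    for name, f_block in F_BLOCKS.items()
}
for name, ok in results.items():
    print(name, "matches expansion with m = 4:", ok)
```

Under this reading, S1 and S3 carry two phosphorothioate linkages at the 3’ end, while S2 and S4 carry phosphodiester linkages there; the final nucleoside is 2’-O-Me in S1 and S2 and 2’-F in S3 and S4.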
In some embodiments of the disclosure, the siRNA may contain an antisense strand including a region represented by Formula A-IV, wherein Formula A-IV is, in the 5’-to-3’ direction:
A-(A’)ⱼ-C-P²-B-(C-P¹)ₖ-C’
Formula A-IV; wherein A is represented by the formula C-P¹-D-P¹; each A’ is represented by the formula C-P²-D-P²; B is represented by the formula D-P¹-C-P¹-D-P¹; each C is a 2’-O-Me ribonucleoside; each C’, independently, is a 2’-O-Me ribonucleoside or a 2’-F ribonucleoside; each D is a 2’-F ribonucleoside; each P¹ is a phosphorothioate internucleoside linkage; each P² is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7). In some embodiments, j is 6. In some embodiments, k is 4. In some embodiments, j is 6 and k is 4. The antisense strand is complementary (e.g., fully or partially complementary) to a target nucleic acid.
In some embodiments of the disclosure, the antisense strand includes a structure represented by Formula A3, wherein Formula A3 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-B-S-A-S-A-S-A
Formula A3; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the siRNA of the disclosure may have a sense strand represented by Formula S-V, wherein Formula S-V is, in the 5’-to-3’ direction:
E-(A’)ₘ-C-P²-F
Formula S-V; wherein E is represented by the formula (C-P¹)₂; F is represented by the formula D-P¹-C-P¹-C, D-P²-C-P²-C, D-P¹-C-P¹-D, or D-P²-C-P²-D; A’, C, D, P¹, and P² are as defined in Formula IV; and m is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7). In some embodiments, m is 5. The sense strand is complementary (e.g., fully or partially complementary) to the antisense strand.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S5, wherein Formula S5 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-A
Formula S5; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S6, wherein Formula S6 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-A
Formula S6; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S7, wherein Formula S7 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-S-A-S-B
Formula S7; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S8, wherein Formula S8 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B-O-A-O-B
Formula S8; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the siRNA may contain an antisense strand including a region represented by Formula A-VI, wherein Formula A-VI is, in the 5’-to-3’ direction:
A-Bⱼ-E-Bₖ-E-F-Gₗ-D-P¹-C’
Formula A-VI; wherein A is represented by the formula C-P¹-D-P¹; each B is represented by the formula C-P²; each C is a 2’-O-Me ribonucleoside; each C’, independently, is a 2’-O-Me ribonucleoside or a 2’-F ribonucleoside; each D is a 2’-F ribonucleoside; each E is represented by the formula D-P²-C-P²; F is represented by the formula D-P¹-C-P¹; each G is represented by the formula C-P¹; each P¹ is a phosphorothioate internucleoside linkage; each P² is a phosphodiester internucleoside linkage; j is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); k is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and l is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7). In some embodiments, j is 3. In some embodiments, k is 6. In some embodiments, l is 2. In some embodiments, j is 3, k is 6, and l is 2. The antisense strand is complementary (e.g., fully or partially complementary) to a target nucleic acid.
In some embodiments of the disclosure, the antisense strand includes a structure represented by Formula A4, wherein Formula A4 is, in the 5’-to-3’ direction:
A-S-B-S-A-O-A-O-A-O-B-O-A-O-A-O-A-O-A-O-A-O-A-O-A-O-B-O-A-O-B-S-A-S-A-S-A-S-B-S-A
Formula A4; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the siRNA may contain a sense strand including a region represented by Formula S-VII, wherein Formula S-VII is, in the 5’-to-3’ direction:
H-Bₘ-Iₙ-A’-Bₒ-H-C
Formula S-VII; wherein A’ is represented by the formula C-P²-D-P²; each H is represented by the formula (C-P¹)₂; each I is represented by the formula (D-P²); B, C, D, P¹, and P² are as defined in Formula VI; m is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); n is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7); and o is an integer from 1 to 7 (e.g., 1, 2, 3, 4, 5, 6, or 7). In some embodiments, m is 3. In some embodiments, n is 3. In some embodiments, o is 3. In some embodiments, m is 3, n is 3, and o is 3. The sense strand is complementary (e.g., fully or partially complementary) to the antisense strand.
In some embodiments of the disclosure, the sense strand includes a structure represented by Formula S9, wherein Formula S9 is, in the 5’-to-3’ direction:
A-S-A-S-A-O-A-O-A-O-B-O-B-O-B-O-A-O-B-O-A-O-A-O-A-O-A-S-A-S-A
Formula S9; wherein A represents a 2’-O-Me ribonucleoside, B represents a 2’-F ribonucleoside, O represents a phosphodiester internucleoside linkage, and S represents a phosphorothioate internucleoside linkage.
In some embodiments of the disclosure, the siRNA may contain an antisense strand including a region that is represented by Formula VIII:
5' phosphorus stabilizing moiety
To further protect the siRNA from degradation, a 5'-phosphorus stabilizing moiety may be employed. A 5'-phosphorus stabilizing moiety replaces the 5'-phosphate to prevent hydrolysis of the phosphate. Hydrolysis of the 5'-phosphate prevents binding to RISC, a necessary step in gene silencing. Any replacement for phosphate that does not impede binding to RISC is contemplated in this disclosure.
In some embodiments, the replacement for the 5'-phosphate is also stable to in vivo hydrolysis. Each siRNA strand may independently and optionally employ any suitable 5'-phosphorus stabilizing moiety.
Some exemplary endcaps are shown in Formulas I-VIII. Nuc in Formulas I-VIII represents a nucleobase or nucleobase derivative or replacement as described herein. X in Formulas I-VIII represents a 2’-modification as described herein. Some embodiments employ hydroxy as in Formula I, phosphate as in Formula II, vinylphosphonates as in Formulas III and VI, 5’-methyl-substituted phosphates as in Formulas IV, VI, and VIII, or methylenephosphonates as in Formula VII. Some embodiments employ a 5'-vinylphosphonate as the 5'-phosphorus stabilizing moiety, as shown in Formula III.
siRNA Branching
Branching of the siRNA molecules is a key feature of the present invention. The siRNA molecule may be unbranched, or may be dibranched, tribranched, or tetrabranched, with the branches connected through a linker. Each main branch may be further branched to allow for 2, 3, 4, 5, 6, 7, or 8 separate RNA single- or double-strands. The branch points on the linker may stem from the same atom, or from separate atoms along the linker. Some exemplary embodiments are listed in Table 1, where L represents a linker, and X represents any atom suitable for the siRNA molecule branch points:
Table 1: Branched siRNA structures
Linkers
Multiple strands of siRNA described herein may be covalently attached by way of a linker. This branching improves, inter alia, cell permeability, allowing better access into microglia in the CNS. Any linking moiety may be employed which is not incompatible with the siRNAs of the present invention. Exemplary linkers include ethylene glycol chains of 2 to 10 subunits (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10 subunits), alkyl chains, carbohydrate chains, block copolymers, peptides, RNA, DNA, and others.
In some embodiments, any carbon or oxygen atom of the linker is optionally replaced with a nitrogen atom, bears a hydroxyl substituent, or bears an oxo substituent. In some embodiments, the linker is a polyethylene glycol (PEG) linker. The PEG linkers suitable for use with the disclosed compositions and methods include linear or non-linear PEG linkers. Examples of non-linear PEG linkers include branched PEGs, linear forked PEGs, or branched forked PEGs.
PEG linkers of various weights may be used with the disclosed compositions and methods. For example, the PEG linker may have a weight that is between 5 and 500 Daltons. In some embodiments, a PEG linker having a weight that is between 500 and 1,000 Daltons may be used. In some embodiments, a PEG linker having a weight that is between 1,000 and 10,000 Daltons may be used. In some embodiments, a PEG linker having a weight that is between 200 and 20,000 Daltons may be used. In some embodiments, the linker is covalently attached to a sense strand of the siRNA. In some embodiments, the linker is covalently attached to an antisense strand of the siRNA. In some embodiments, the PEG linker is a triethylene glycol (TrEG) linker. In some embodiments, the PEG linker is a tetraethylene glycol (TEG) linker.
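As a rough guide to how the subunit counts and the Dalton ranges above relate, the sketch below estimates the average mass of a free oligo(ethylene glycol) diol, HO-(CH2CH2O)n-H, from its number of repeat units. This is an approximation offered for orientation only: the function names are hypothetical, the free-diol end groups are an assumption, and a linker conjugated to siRNA strands will differ in mass because its end groups are replaced by the attachment chemistry.

```python
# Average masses in Daltons, computed from standard atomic weights.
ETHYLENE_OXIDE_UNIT_DA = 44.053  # C2H4O repeat unit
WATER_DA = 18.015                # H and OH end groups of a free diol (assumption)


def oligo_ethylene_glycol_mass(n_units):
    """Approximate average mass of HO-(CH2CH2O)n-H for n repeat units."""
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    return n_units * ETHYLENE_OXIDE_UNIT_DA + WATER_DA


def units_within(low_da, high_da, max_units=500):
    """List the repeat-unit counts whose estimated mass lies in [low_da, high_da]."""
    return [
        n for n in range(1, max_units + 1)
        if low_da <= oligo_ethylene_glycol_mass(n) <= high_da
    ]


for name, n in (("triethylene glycol", 3), ("tetraethylene glycol", 4)):
    print(f"{name}: about {oligo_ethylene_glycol_mass(n):.1f} Da")

in_range = units_within(500, 1000)
print("repeat units giving 500 to 1,000 Da:", in_range[0], "to", in_range[-1])
```

On this estimate, the 2-to-10-subunit ethylene glycol chains mentioned above all fall below 500 Daltons, while the 500-to-1,000-Dalton range corresponds to roughly 11 to 22 repeat units.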
In some embodiments, the linker is an alkyl chain linker. In some embodiments, the linker is a peptide linker. In some embodiments, the linker is an RNA linker. In some embodiments, the linker is a DNA linker.
Linkers may covalently link 2, 3, 4, or 5 unique siRNA strands. The linker may covalently bind to any part of the siRNA oligomer. In some embodiments, the linker attaches to the 3' end of nucleosides of each siRNA strand. In some embodiments, the linker attaches to the 5' end of nucleosides of each siRNA strand. In some embodiments, the linker attaches to a nucleoside of an siRNA strand (e.g., sense or antisense strand) by way of a covalent bond-forming moiety. In some embodiments, the covalent bond-forming moiety is selected from the group consisting of an alkyl, ester, amide, carbonate, carbamate, triazole, urea, formacetal, phosphonate, phosphate, and phosphate derivative (e.g., phosphorothioate, phosphoramidate, etc.).
In some embodiments, the linker has a structure of Formula L1, as is shown below:
(Formula L1)
In some embodiments, the linker has a structure of Formula L2, as is shown below:
(Formula L2)
In some embodiments, the linker has a structure of Formula L3, as is shown below:
(Formula L3)
In some embodiments, the linker has a structure of Formula L4, as is shown below:
(Formula L4)
In some embodiments, the linker has a structure of Formula L5, as is shown below:
(Formula L5)
In some embodiments, the linker has a structure of Formula L6, as is shown below:
(Formula L6)
In some embodiments, the linker has a structure of Formula L7, as is shown below:
(Formula L7)
In some embodiments, the linker has a structure of Formula L8, as is shown below:
(Formula L8)
In some embodiments, the linker has a structure of Formula L9, as is shown below:
(Formula L9)
In some embodiments, the selection of a linker for use with one or more of the branched siRNA molecules disclosed herein may be based on the hydrophobicity of the linker, such that, e.g., desirable hydrophobicity is achieved for the one or more branched siRNA molecules of the disclosure. For example, a linker containing an alkyl chain may be used to increase the hydrophobicity of the branched siRNA molecule as compared to a branched siRNA molecule having a less hydrophobic linker or a hydrophilic linker.
Methods of Treatment
The invention provides methods of treating a subject in need of gene silencing. The gene silencing may be performed in order to silence defective or overactive microglial genes, silence negative regulators of microglial genes with reduced expression and/or activity, silence wild type microglial genes with an activating role in a pathway(s) that increases expression and/or activity of a disease driver gene, silence splice isoforms of a microglial gene(s) that, when selectively knocked down, may elevate total expression and/or activity of the gene(s), among other reasons, so long as the goal is to restore genetic and biochemical pathway activity from a disease state towards a healthy state. The active compound can be administered in any suitable dose. The actual dosage amount of a composition of the present invention administered to a patient can be determined by physical and physiological factors such as body weight, severity of condition, previous or concurrent therapeutic interventions, idiopathy of the patient and on the route of administration. Depending upon the dosage and the route of administration, the number of administrations of a preferred dosage and/or an effective amount may vary according to the response of the subject. The practitioner responsible for administration will, in any event, determine the concentration of active ingredient(s) in a composition and appropriate dose(s) for the individual subject. Administration may occur any suitable number of times per day, and for as long as necessary. Subjects may be adult or pediatric humans, with or without comorbid diseases.
Diseases
The methods of the invention feature delivering a branched siRNA molecule to a microglial cell in a subject in need of microglial gene silencing. Subjects in need of microglial gene silencing may be suffering from neurodegenerative diseases in which neuroinflammation is a primary component of the disease pathology (e.g., Alzheimer’s disease, amyotrophic lateral sclerosis, Parkinson’s disease, frontotemporal dementia, Huntington’s disease, multiple sclerosis, or progressive supranuclear palsy).
Alzheimer’s disease
Alzheimer’s disease (AD) is a late-onset neurodegenerative disorder responsible for the majority of dementia cases in the elderly. AD patients suffer from a progressive cognitive decline characterized by symptoms including an insidious loss of short- and long-term memory, attention deficits, language-specific problems, disorientation, impaired impulse control, social withdrawal, anhedonia, and other symptoms. Distinguishing neuropathological features of AD are extracellular aggregates of amyloid-β plaques and neurofibrillary tangles composed of hyperphosphorylated microtubule-associated tau proteins. Accumulation of these aggregates is associated with neuronal loss and atrophy in a number of brain regions including the frontal, temporal, and parietal lobes of the cerebral cortex as well as subcortical structures like the basal forebrain cholinergic system and the locus coeruleus within the brainstem. AD is also associated with increased neuroinflammation characterized by reactive gliosis and elevated levels of pro-inflammatory cytokines.
Amyotrophic Lateral Sclerosis
Amyotrophic Lateral Sclerosis (ALS) is a fast-progressing fatal neurodegenerative disease that affects motor neurons both in the brain and spinal cord, consequently resulting in paralysis of voluntary muscles at later stages of disease. ALS affects about 6 persons per 100,000 people and typically leads to death within 3 to 5 years after the onset of symptoms, with no cure yet available. ALS leads to muscle weakness, atrophy, and muscle spasms as a result of degeneration of upper and lower motor neurons. Cognitive and behavioral dysfunction (e.g., language dysfunction, executive dysfunction, social cognition, and verbal memory dysfunction), and frontotemporal dementia are all possible symptoms of ALS.
Parkinson’s disease
Parkinson’s disease (PD) is a progressive disorder that affects movement, and it is recognized as the second most common neurodegenerative disease after Alzheimer’s disease. Common symptoms of PD include resting tremor, rigidity, and bradykinesia. Non-motor symptoms, such as depression, constipation, pain, sleep disorders, genitourinary problems, cognitive decline, and olfactory dysfunction, are also increasingly being associated with PD. A key feature of PD is the death of dopaminergic neurons in the substantia nigra pars compacta, and, for that reason, most current treatments for PD focus on increasing dopamine. Another well-known neuropathological hallmark of PD is the presence of Lewy bodies containing α-synuclein in brain regions affected by PD, which are thought to contribute to the disease.
PD is thought to result from a combination of genetic and environmental risk factors. There is no single gene responsible for all Parkinson’s disease cases, and the vast majority of PD cases seem to be sporadic and not directly inherited. Mutations in the genes encoding parkin, PTEN-induced putative kinase 1 (PINK1), leucine-rich repeat kinase 2 (LRRK2), and Parkinsonism-associated deglycase (DJ-1) have been found to be associated with PD, but they represent only a small subset of the total number of PD cases. Occupational exposure to some pesticides and herbicides has also been proposed as a risk factor for PD. The synthetic neurotoxin MPTP can cause Parkinsonism, but its use is extremely rare.
Frontotemporal dementia
Frontotemporal dementia (FTD; also known as frontotemporal lobar degeneration (FTLD)) is a clinical syndrome characterized by progressive neurodegeneration in the frontal and temporal lobes of the cerebral cortex. The manifestation of FTD is complex and heterogeneous, and may present as one of three clinically distinct variants: 1) behavioral-variant frontotemporal dementia (BVFTD), characterized by changes in behavior and personality, apathy, social withdrawal, perseverative behaviors, attentional deficits, disinhibition, and a pronounced degeneration of the frontal lobe; 2) semantic dementia (SD), characterized by fluent, anomic aphasia, progressive loss of semantic knowledge of words, objects, and concepts, and a pronounced degeneration of the anterior temporal lobes, in which patients furthermore exhibit a flat affect, social deficits, perseverative behaviors, and disinhibition; or 3) progressive nonfluent aphasia, characterized by motor deficits in speech production, reduced language expression, and pronounced degeneration of the perisylvian cortex. Neuronal loss in brains of FTD patients is associated with one of three distinct neuropathologies: 1) the presence of tau-positive neuronal and glial inclusions; 2) ubiquitin (ub)-positive and TAR DNA-binding protein 43 (TDP-43)-positive, but tau-negative inclusions; or 3) ub and fused in sarcoma (FUS)-positive, but tau and TDP-43-negative inclusions. These neuropathologies are considered to be important in the etiology of FTD.
Nearly half of FTD patients have a first-degree family member with dementia, ALS, or Parkinson’s disease, suggesting a strong genetic link to the cause of the disease. A number of mutations in chromosome 17q21 have been linked to FTD presentation.
Huntington’s disease
Huntington's Disease (HD) is an example of a trinucleotide repeat expansion disorder. This class of disorders involves the localized expansion of unstable repeats of sets of three nucleotides and can result in loss of function of a gene in which the expanded repeat is found, a gain of toxic function, or both. Trinucleotide repeats can be located in any part of the gene, including coding and non-coding regions.
Repeats located within coding regions typically involve a repeated glutamine-encoding triplet (CAG) or an alanine-encoding triplet (GCN). Expanded repeat regions within non-coding sequences can lead to aberrant expression of the gene, while expanded repeats within coding regions (also known as codon reiteration disorders) may cause protein misfolding and aggregation. Typically, regions of wild-type genes contain a variable number of repeat sequences in the normal population, but in the afflicted populations, the number of repeats can increase by anywhere from a doubling to an order of magnitude. In HD, repeats are inserted within the N-terminal coding region of the large cytosolic protein Huntingtin (Htt). Normal Htt alleles contain 15-20 CAG repeats, while alleles containing 35 or more repeats can be considered to confer a risk for developing the disease. Alleles containing 36-39 repeats are considered incompletely penetrant, and individuals harboring those alleles may or may not develop the disease (or may exhibit delayed presentation later in life), while alleles containing 40 repeats or more are considered completely penetrant. Individuals with juvenile-onset HD (<21 years of age) are often found to have 60 or more CAG repeats.
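The CAG repeat thresholds described above can be summarized as a simple lookup. The sketch below is illustrative only and is not a diagnostic tool: the function names and category labels are hypothetical, the cut-offs are taken directly from the preceding paragraph, and repeat counts that the paragraph does not address (below 15 and 21 to 34) are deliberately reported as not characterized rather than guessed.

```python
def classify_htt_cag_repeats(n_repeats):
    """Map a CAG repeat count to the category given in the preceding paragraph."""
    if n_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if n_repeats >= 40:
        return "completely penetrant"
    if 36 <= n_repeats <= 39:
        return "incompletely penetrant"
    if n_repeats == 35:
        # The text says alleles with 35 or more repeats confer risk,
        # but assigns penetrance categories only from 36 upward.
        return "confers risk"
    if 15 <= n_repeats <= 20:
        return "normal"
    return "not characterized in the text"


def in_juvenile_onset_range(n_repeats):
    """True for counts of 60 or more, which the text associates with juvenile onset."""
    return n_repeats >= 60


for count in (18, 27, 35, 38, 42, 65):
    label = classify_htt_cag_repeats(count)
    note = " (juvenile-onset range)" if in_juvenile_onset_range(count) else ""
    print(f"{count} repeats: {label}{note}")
```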
Multiple sclerosis
Multiple sclerosis (MS) is the most common demyelinating disease of the CNS affecting young adults (disease onset between 20 and 40 years of age) and is the third leading cause of disability after trauma and rheumatic diseases in the US.
MS patients present with destruction of myelin, death of oligodendrocytes, and axonal loss. The main pathologic finding in MS is the presence of infiltrating mononuclear cells, predominantly T lymphocytes and macrophages, which breach the blood brain barrier and induce active inflammation within the CNS. The neurological symptoms that characterize MS include complete or partial vision loss, diplopia, sensory symptoms, motor weakness that can progress to complete paralysis, bladder dysfunction, and cognitive deficits. The associated inflammatory foci lead to myelin destruction, plaques of demyelination, gliosis, and axonal loss within the brain and spinal cord and are the primary drivers of the clinical manifestations of neurological disability.
The etiology of MS is not fully understood. The disease develops in genetically predisposed subjects exposed to yet undefined environmental factors and the pathogenesis involves autoimmune mechanisms associated with autoreactive T cells against myelin antigens. It is well established that not one dominant gene determines genetic susceptibility to develop MS, but rather many genes, each with different influence, are involved. The detailed molecular mechanisms underlying MS etiology are still to be elucidated.
Progressive supranuclear palsy
Progressive supranuclear palsy (PSP), a progressive and fatal tauopathy, represents ~10% of all Parkinsonian cases in the US. PSP patients have a variety of motor disorders, including postural instability, falls, abnormalities in gait, bradykinesia, vertical gaze paralysis, pseudobulbar paralysis, and axial stiffness without limb stiffness, in addition to cognitive impairments such as apathy, loss of executive function, and reduced fluency. Neuropathology of PSP is characterized by an accumulation of tau protein, which is associated with abnormal intracellular microtubules, resulting in insoluble filament deposits. The neuropathological presentation of PSP neurodegeneration is located in the subcortical regions, including the substantia nigra, globus pallidus, and subthalamic nucleus. PSP neurodegeneration is characterized by the destruction of tissues and cytokine profiles of activated microglia and astrocytes.
There are currently no disease-modifying treatments for PSP. The current standard of care is palliative. Patients in the advanced stages of the disease often have feeding tubes inserted to avoid choking hazards and to provide nutrition. Although therapies are available to decrease some symptoms of PSP, none protect the brain from neurodegeneration. Current medications to treat symptoms of PSP include dopamine agonists, tricyclic antidepressants, methysergide, and onabotulinumtoxin A (to treat muscle stiffness in the face). However, as the disease progresses and symptoms worsen, medications may fail to adequately decrease symptoms.
Gene targets
The methods of the invention feature delivering a branched siRNA molecule to a microglial cell in a subject in need of microglial gene silencing. Patients in need of microglial gene silencing may have dysregulated expression and/or activity of a gene selected from the group consisting of ABCA7, ABI3, ADAM10, APOC1, APOE, AXL, BIN1, C1QA, C3, C9ORF72, CASS4, CCL5, CD2AP, CD33, CD68, CLPTM1, CLU, CR1, CSF1, CST7, CTSB, CTSD, CTSL, CXCL10, CXCL13, DSG2, ECHDC3, EPHA1, FABP5, FERMT2, FTH1, GNAS, GRN, HBEGF, HLA-DRB1, HLA-DRB5, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IGF1, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, ITGAX, LILRB4, LPL, MEF2C, MMP12, MS4A4A, MS4A6A, NLRP3, NME8, NOS2, PICALM, PILRA, PLCG2, PTK2B, SCIMP, SLC24A4, SORL1, SPI1, SPP1, SPPL2A, TBK1, TNF, TREM2, TREML2, TYROBP, and ZCWPW1.
In some embodiments, the patient in need of microglial gene silencing may require silencing of any one of the genes selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF.
Pharmaceutical compositions
The branched siRNA molecules in the present invention can be formulated into a pharmaceutical composition for administration to a subject in a biologically compatible form suitable for administration in vivo. Accordingly, in one aspect, the present invention provides a pharmaceutical composition containing a branched siRNA in admixture with a suitable diluent, carrier, or excipient. The siRNA can be administered, for example, orally or by intravenous injection.
Conventional procedures and ingredients for the selection and preparation of suitable formulations are described, for example, in Remington: The Science and Practice of Pharmacy (2012, 22nd ed.) and in The United States Pharmacopeia: The National Formulary (2015, USP 38 NF 33).
Under ordinary conditions of storage and use, a pharmaceutical composition may contain a preservative, e.g., to prevent the growth of microorganisms. Pharmaceutical compositions may include sterile aqueous solutions, dispersions, or powders, e.g., for the extemporaneous preparation of sterile solutions or dispersions. In all cases the form may be sterilized using techniques known in the art and may be fluidized to the extent that it may be easily administered to a subject in need of treatment.
A pharmaceutical composition may be administered to a subject, e.g., a human subject, alone or in combination with pharmaceutically acceptable carriers, as noted herein, the proportion of which may be determined by the solubility and/or chemical nature of the compound, chosen route of administration, and standard pharmaceutical practice.
Regimens
A physician having ordinary skill in the art can readily determine an effective amount of siRNA for administration to a mammalian subject (e.g., a human) in need thereof. For example, a physician could start prescribing doses of a siRNA of the invention at levels lower than that required in order to achieve the desired therapeutic effect and gradually increase the dosage until the desired effect is achieved. Alternatively, a physician may begin a treatment regimen by administering a siRNA at a high dose and subsequently administer progressively lower doses until a therapeutic effect is achieved (e.g., a reduction in expression of a target gene sequence). In general, a suitable daily dose of a siRNA of the invention will be an amount of the siRNA which is the lowest dose effective to produce a therapeutic effect. A single-strand or double-strand siRNA of the invention may be administered by injection, e.g., intrathecally, intracerebroventricularly, or intrastriatally. A daily dose of a therapeutic composition of a siRNA of the invention may be administered as a single dose or as two, three, four, five, six or more doses administered separately at appropriate intervals throughout the day, week, month, or year, optionally, in unit dosage forms. While it is possible for a siRNA of the invention to be administered alone, it may also be administered as a pharmaceutical formulation in combination with excipients, carriers, and optionally, additional therapeutic agents.
Routes of administration
The method of the invention contemplates any route of administration tolerated by the therapeutic composition. Some embodiments of the method include injection intrathecally, intracerebroventricularly, or intrastriatally.
Intrathecal injection is the direct injection into the spinal column or subarachnoid space. By injecting directly into the CSF of the spinal column, the siRNA molecule of the invention has direct access to microglia in the spinal column and a route to access the microglia in the brain by bypassing the blood-brain barrier.
Intracerebroventricular (ICV) injection is a method to directly inject into the CSF of the cerebral ventricles. Similar to intrathecal injection, ICV is a method of injection which bypasses the blood-brain barrier. ICV injection provides access to the microglia of the brain and spinal column without the danger of the therapeutic being degraded in the blood.
Intrastriatal injection is the direct injection into the striatum, or corpus striatum. The striatum is an area in the subcortical basal ganglia in the brain. Injecting into the striatum bypasses the blood-brain barrier and the pharmacokinetic challenges of injection into the blood stream and allows for direct access to the microglia of the brain and spinal column.
EXAMPLES
Example 1. Protocol for Uptake of di-siRNA in Microglia of Non-Human Primates
The experiments described in this example were conducted to assess the ability of branched siRNA molecules to permeate the central nervous system and internalize within microglial cells. To this end, a branched siRNA compound targeting the huntingtin (HTT) gene and conjugated to a fluorescent dye (Cy3) was first injected into the cerebrospinal fluid via intrathecal injection into non-human primates (NHP; cynomolgus macaque). Central nervous system tissue samples were later obtained from the animals. To assess the extent to which the branched siRNA molecules were internalized by microglial cells, the tissue samples were stained using fluorescent-labeled antibodies that are specific for markers expressed in certain cell types (e.g., microglia). Fluorescence microscopy was then utilized to determine the degree of colocalization of the Cy3-labeled branched siRNA molecules and antibody-labeled microglial cells, which served as an indicator of microglial uptake. These experiments, and their results, are described in further detail below:
Paraffin-embedded CNS tissue slides were tested. A dose of fluorescently labeled branched siRNA was administered to an NHP (cynomolgus macaque) via intrathecal injection, and a distribution study was performed 48 hours after injection. The control was an uninjected NHP. NHP tissues for imaging were post-fixed for 48-72 hours in 4% PFA at 5±3°C and then transferred to PBS. All tissues were paraffin-embedded, sliced into 4 µm sections, and mounted on slides for immunofluorescence staining. Subsequently, sections were deparaffinized and subjected to antigen retrieval. Samples were deparaffinized by two changes of xylene, 5 minutes each, then 50% xylene + 50% ethanol (100%) for 5 minutes. Samples were hydrated by two changes of 100% ethanol for 3 minutes each, then 90%, 80%, 70%, and 50% ethanol for 3 minutes each, followed by a distilled water rinse. Antigen retrieval was carried out using 150 mL of Tris-EDTA buffer (pH 9), placing the staining dish in a pressure cooker (containing 1200 mL ddH2O) for 10 minutes, allowing the slides to cool to room temperature, followed by section-wise rinsing with H2O and TBST. Sections were blocked with Background Terminator Blocking Reagent, and the slides were then incubated with the primary antibody against the microglial-specific gene, Iba-1, for 1.5 hours at room temperature, followed by treatment with a secondary antibody labeled with Alexa Fluor 488 (Alexa-488). Alexa-488 was used to visualize the Iba-1 antibody. DAPI was used to visualize cell nuclei. Tissues were washed three times for 5 min with TBS-Tween 20. Fluoromount-G was used to mount glass coverslips, and slides were left to dry at 4°C overnight protected from light. An Olympus VS200 slide scanner was used to acquire immunofluorescent images of brain and spinal cord (20× objective). Images within each imaging channel were acquired under the same settings for light intensity and exposure times.
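The deparaffinization and rehydration series described above can be tabulated to check the bench time it implies. The following is a minimal sketch in Python, assuming the baths are run back to back with no transfer time; the names BATHS, total_minutes, and schedule are illustrative and are not part of the protocol. The distilled water rinse is left out because no duration is stated for it.

```python
# Deparaffinization and rehydration baths as listed in the protocol text.
# Each entry: (solution, number of changes, minutes per change).
# The final distilled-water rinse is omitted because the text gives no duration.
BATHS = [
    ("xylene", 2, 5),
    ("50% xylene + 50% ethanol (100%)", 1, 5),
    ("100% ethanol", 2, 3),
    ("90% ethanol", 1, 3),
    ("80% ethanol", 1, 3),
    ("70% ethanol", 1, 3),
    ("50% ethanol", 1, 3),
]


def total_minutes(baths):
    """Sum of the timed bath durations, in minutes."""
    return sum(changes * minutes for _, changes, minutes in baths)


def schedule(baths):
    """Return a list of (start_minute, solution) tuples, one per change."""
    timeline = []
    clock = 0
    for solution, changes, minutes in baths:
        for _ in range(changes):
            timeline.append((clock, solution))
            clock += minutes
    return timeline


if __name__ == "__main__":
    for start, solution in schedule(BATHS):
        print(f"t = {start:2d} min: {solution}")
    print(f"total timed steps: {total_minutes(BATHS)} min")
```

Under these assumptions the timed baths add up to 33 minutes across nine changes, not counting the untimed water rinse or the subsequent antigen retrieval.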
Colocalization of DAPI stained nuclei, Alexa-488-labeled Iba-1 antibody, and Cy3-labeled siRNA was observed across all tested brain and spinal cord tissues of cynomolgus macaques, indicating microglial cell penetration and/or uptake of the branched di-siRNA. Control experiments included uninjected NHP control (no Cy3-siRNA), non-specific primary antibody (isotype antibody control), and no secondary antibody (no Alexa Fluor 488 reagent). Robust colocalization was observed in the cortex (FIG. 1A), hippocampus (FIG. 1B), caudate nucleus (FIG. 1C), and spinal cord (FIG. 1D). Controls showed no co-localization of Cy3 and Alexa Fluor 488 signals, indicating specificity of detection of microglial uptake (not shown).
These results demonstrate that the ds-siRNA agents of the present disclosure are capable of being internalized by microglial cells of CNS tissues, including brain and spinal cord, and support the use of such agents for treatment of neurological conditions, such as Alzheimer’s disease or amyotrophic lateral sclerosis.
Example 2. Method of Treating a Patient with Alzheimer’s Disease
A subject diagnosed with Alzheimer’s disease is treated with a dose and frequency determined by a practitioner (e.g., three times daily, twice daily, once daily, once weekly, once monthly, bi-monthly, once every 4 months, once every 5 months, once every 6 months, once every 7 months, once every 8 months, once every 9 months, once every 10 months, once every 11 months, or annually). Dosage and frequency are determined based on the subject’s height, weight, age, sex, and other disorders.
The branched siRNA is selected by the practitioner for compatibility with the disease and subject. Single- or double-stranded branched siRNA are available for selection. The siRNA chosen has an antisense strand, and in the case of double-stranded siRNA, a sense strand with a sequence and RNA modifications (e.g., natural and non-natural internucleoside linkages, modified sugars, and 5'-phosphorus stabilizing moieties) best suited to the patient and the disease being targeted (e.g.,
PSM-A-T-B-T-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-T-A-T-B-T-A-T-B-T-A-T-B-T
B-T-A-T-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-T-B-T
where A and B are different nucleosides, T is phosphorothioate, P is a phosphodiester, and PSM is a 5'-phosphorus stabilizing moiety).
The branched siRNA is delivered by the route best suited to the patient and condition (e.g., intrathecally, intracerebroventricularly, or intrastriatally), at a rate tolerable to the patient until the subject has reached a maximum tolerated dose, or until the symptoms of the disease are ameliorated satisfactorily.
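The exemplary modification patterns given in this example alternate nucleoside tokens (A, B) with linkage tokens (T, P). The following is a minimal sketch in Python of how such a pattern string could be split into nucleosides and linkages and summarized. It assumes the patterns are read as hyphen-separated tokens with the extraction whitespace removed; the function names parse_pattern and summarize, and the labels ANTISENSE and SENSE for the two exemplary strands, are illustrative assumptions and do not appear in the text.

```python
from collections import Counter

# Exemplary patterns from the text, written without extraction whitespace.
# Labeling the first strand "antisense" and the second "sense" is an assumption.
ANTISENSE = (
    "PSM-A-T-B-T-"
    "A-P-B-P-A-P-B-P-"
    "A-P-B-P-A-P-B-P-"
    "A-P-B-P-A-P-"
    "B-T-A-T-B-T-A-T-"
    "B-T-A-T-B-T"
)
SENSE = (
    "B-T-A-T-"
    "B-P-A-P-B-P-A-P-"
    "B-P-A-P-B-P-A-P-"
    "B-P-A-P-B-P-"
    "A-T-B-T"
)


def parse_pattern(pattern):
    """Split a pattern into (prefix, nucleosides, linkages)."""
    tokens = pattern.split("-")
    # An optional leading PSM token marks the 5'-phosphorus stabilizing moiety.
    prefix = tokens.pop(0) if tokens[0] == "PSM" else None
    nucleosides = tokens[0::2]
    linkages = tokens[1::2]
    if len(nucleosides) != len(linkages):
        raise ValueError("expected alternating nucleoside/linkage tokens")
    return prefix, nucleosides, linkages


def summarize(pattern):
    """Count nucleosides and linkage types and check A/B alternation."""
    prefix, nucleosides, linkages = parse_pattern(pattern)
    counts = Counter(linkages)
    return {
        "prefix": prefix,
        "length": len(nucleosides),
        "phosphorothioate": counts["T"],
        "phosphodiester": counts["P"],
        "alternating": all(a != b for a, b in zip(nucleosides, nucleosides[1:])),
    }


if __name__ == "__main__":
    print("antisense:", summarize(ANTISENSE))
    print("sense:", summarize(SENSE))
```

A summary of this kind makes it easy to compare a candidate pattern against percentage-based limits on modified nucleosides or linkage types.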
Example 3. Method of Treating a Patient with Amyotrophic Lateral Sclerosis
A subject diagnosed with Amyotrophic Lateral Sclerosis is treated with a dose and frequency determined by a practitioner (e.g., three times daily, twice daily, once daily, once weekly, once monthly, bi-monthly, once every 4 months, once every 5 months, once every 6 months, once every 7 months, once every 8 months, once every 9 months, once every 10 months, once every 11 months, or annually). Dosage and frequency are determined based on the subject’s height, weight, age, sex, and other disorders.
The branched siRNA is selected by the practitioner for compatibility with the disease and subject. Single- or double-stranded branched siRNA are available for selection. The siRNA chosen has an antisense strand, and in the case of double-stranded siRNA, a sense strand with a sequence and RNA modifications (e.g., natural and non-natural internucleoside linkages, modified sugars, and 5'-phosphorus stabilizing moieties) best suited to the patient and the disease being targeted (e.g.,
PSM-A-T-B-T-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-T-A-T-B-T-A-T-B-T-A-T-B-T
B-T-A-T-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-P-B-P-A-T-B-T
where A and B are different nucleosides, T is phosphorothioate, P is a phosphodiester, and PSM is a 5'-phosphorus stabilizing moiety).
The branched siRNA is delivered by the route best suited to the patient and condition (e.g., intrathecally, intracerebroventricularly, or intrastriatally), at a rate tolerable to the patient until the subject has reached a maximum tolerated dose, or until the symptoms of the disease are ameliorated satisfactorily.
SPECIFIC EMBODIMENTS
Some specific embodiments are listed below. The enumerated embodiments below should not be construed to limit the scope of the invention; rather, they are presented as some examples of the utility of the invention.
E1. A method of delivering a branched small interfering RNA (siRNA) molecule to a microglial cell in a subject in need of microglial gene silencing, the method comprising administering the branched siRNA molecule to the central nervous system of the subject.
E2. The method of E1, wherein the subject has been diagnosed as having a disease associated with expression of a dysregulated microglial gene or dysregulated microglial gene network.
E3. The method of E2, wherein the dysregulated microglial gene exhibits increased expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the microglial gene in microglial cells of a reference subject.
E4. The method of E2, wherein the dysregulated microglial gene exhibits reduced expression and/or activity in microglial cells of the subject as compared to the expression and/or activity of the microglial gene in microglial cells of a reference subject.
E5. The method of any one of E1-E4, wherein the delivering of the branched siRNA molecule to the subject results in silencing of a gene in the subject.
E6. The method of any one of E1-E5, wherein the siRNA includes (i) an antisense strand having complementarity to a portion of a gene encoding a positive regulator of a gene for which increased expression and/or activity (relative, e.g., to the level of expression and/or activity observed in a reference subject) is associated with a disease state.
E7. The method of any one of E1-E5, wherein the siRNA includes (i) an antisense strand having complementarity to a portion of a gene encoding a negative regulator of a gene for which decreased expression and/or activity (relative, e.g., to the level of expression and/or activity observed in a reference subject) is associated with a disease state.
E8. The method of any one of E1-E5, wherein the siRNA includes (i) an antisense strand having complementarity to a splice isoform of a gene for which overexpression of the splice isoform relative to the expression of the splice isoform in a reference subject is associated with a disease state.
E9. The method of any one of E6-E8, wherein the siRNA includes (ii) a sense strand having complementarity to the antisense strand.
E10. The method of any one of E1-E9, wherein the silencing of the microglial gene in the subject treats a disease state in the subject.
E11. The method of any one of E1-E10, wherein the disease is a neuroinflammatory or neurodegenerative disease.
E12. The method of any one of E2-E10, wherein the dysregulated gene is selected from the group consisting of ABCA7, ABI3, ADAM10, APOC1, APOE, AXL, BIN1, C1QA, C3, C9ORF72, CASS4, CCL5, CD2AP, CD33, CD68, CLPTM1, CLU, CR1, CSF1, CST7, CTSB, CTSD, CTSL, CXCL10, CXCL13, DSG2, ECHDC3, EPHA1, FABP5, FERMT2, FTH1, GNAS, GRN, HBEGF, HLA-DRB1, HLA-DRB5, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IGF1, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, ITGAX, LILRB4, LPL, MEF2C, MMP12, MS4A4A, MS4A6A, NLRP3, NME8, NOS2, PICALM, PILRA, PLCG2, PTK2B, SCIMP, SLC24A4, SORL1, SPI1, SPP1, SPPL2A, TBK1, TNF, TREM2, TREML2, TYROBP, and ZCWPW1, or a negative regulator, positive regulator, or splice isoform thereof.
E13. The method of any one of E1-E12, wherein the subject is a mammal, e.g., a human.
E14. The method of any one of E1-E13, wherein the branched siRNA is administered to the subject intrathecally, intracerebroventricularly, or intrastriatally.
E15. The method of any one of E1-E14, wherein the branched siRNA is administered to the subject intrathecally.
E16. The method of any one of E1-E14, wherein the branched siRNA is administered to the subject intracerebroventricularly.
E17. The method of any one of E1-E14, wherein the branched siRNA is administered to the subject intrastriatally.
E18. The method of any one of E1-E17, wherein the siRNA molecule is di-branched.
E19. The method of any one of E1-E17, wherein the siRNA molecule is tri-branched.
E20. The method of any one of E1-E17, wherein the siRNA molecule is tetra-branched.
E21. The method of any one of E1-E20, wherein the siRNA comprises (i) an antisense strand having complementarity to one or more genes selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF and (ii) a sense strand having complementarity to the antisense strand.
E22. The method of E21, wherein the antisense strand has the following formula, in the 5'-to-3' direction:
Z-((A-P-)n(B-P-)m)q; wherein Z is a 5' phosphorus stabilizing moiety; each A is, independently, a 2'-O-methyl (2'-O-Me) ribonucleoside; each B is, independently, a 2'-fluoro-ribonucleoside; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5; m is an integer from 1 to 5; and q is an integer between 1 and 15.
E23. The method of E22, wherein Z is represented by any one of Formulas I-VIII, wherein Nuc represents a nucleobase, such as adenine, uracil, guanine, thymine, or cytosine, and R represents optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl (e.g., optionally substituted C1-C6 alkyl, optionally substituted C2-C6 alkenyl, or optionally substituted C2-C6 alkynyl), phenyl, benzyl, hydroxy, or hydrogen.
E24. The method of E22 or E23, wherein Z is (E)-vinylphosphonate represented in Formula III.
E25. The method of any one of E22-E24, wherein n is from 1 to 4.
E26. The method of any one of E22-E25, wherein n is from 1 to 3.
E27. The method of any one of E22-E26, wherein n is from 1 to 2.
E28. The method of any one of E22-E27, wherein n is 1.
E29. The method of any one of E22-E28, wherein m is from 1 to 4.
E30. The method of any one of E22-E29, wherein m is from 1 to 3.
E31. The method of any one of E22-E30, wherein m is from 1 to 2.
E32. The method of any one of E22-E31, wherein m is 1.
E33. The method of any one of E22-E32, wherein n and m are each 1.
E34. The method of any one of E22-E33, wherein 10% or less of the ribonucleosides are 2'-O-Me ribonucleoside.
E35. The method of any one of E22-E34, wherein at least 10% of the ribonucleosides are 2'-O-Me ribonucleoside.
E36. The method of any one of E22-E35, wherein at least 20% of the ribonucleosides are 2'-O-Me ribonucleoside.
E37. The method of any one of E22-E36, wherein at least 30% of the ribonucleosides are 2'-O-Me ribonucleoside.
E38. The method of any one of E22-E37, wherein at least 40% of the ribonucleosides are 2'-O-Me ribonucleoside.
E39. The method of any one of E22-E38, wherein at least 50% of the ribonucleosides are 2'-O-Me ribonucleoside.
E40. The method of any one of E22-E39, wherein at least 60% of the ribonucleosides are 2'-O-Me ribonucleoside.
E41. The method of any one of E22-E40, wherein at least 70% of the ribonucleosides are 2'-O-Me ribonucleoside.
E42. The method of any one of E22-E41, wherein at least 80% of the ribonucleosides are 2'-O-Me ribonucleoside.
E43. The method of any one of E22-E42, wherein at least 90% of the ribonucleosides are 2'-O-Me ribonucleoside.
E44. The method of any one of E22-E43, wherein 10% or less of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E45. The method of any one of E22-E44, wherein at least 10% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E46. The method of any one of E22-E45, wherein at least 20% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E47. The method of any one of E22-E46, wherein at least 30% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E48. The method of any one of E22-E47, wherein at least 40% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E49. The method of any one of E22-E48, wherein at least 50% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E50. The method of any one of E22-E49, wherein at least 60% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E51. The method of any one of E22-E50, wherein at least 70% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E52. The method of any one of E22-E51, wherein at least 80% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E53. The method of any one of E22-E52, wherein at least 90% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E54. The method of any one of E22-E53, wherein 100% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E55. The method of any one of E22-E54, wherein 9 internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E56. The method of any one of E22-E55, wherein the length of the antisense strand is between 10 and 30 nucleotides.
E57. The method of any one of E22-E56, wherein the length of the antisense strand is between 15 and 25 nucleotides.
E58. The method of any one of E22-E57, wherein the length of the antisense strand is between 18 and 23 nucleotides.
E59. The method of any one of E22-E58, wherein the length of the antisense strand is 18 nucleotides.
E60. The method of any one of E22-E56, wherein the length of the antisense strand is 19 nucleotides.
E61. The method of any one of E22-E56, wherein the length of the antisense strand is 20 nucleotides.
E62. The method of any one of E22-E56, wherein the length of the antisense strand is 21 nucleotides.
E63. The method of any one of E22-E56, wherein the length of the antisense strand is 22 nucleotides.
E64. The method of any one of E22-E56, wherein the length of the antisense strand is 23 nucleotides.
E65. The method of any one of E22-E56, wherein the length of the antisense strand is 24 nucleotides.
E66. The method of any one of E22-E56, wherein the length of the antisense strand is 25 nucleotides.
E67. The method of any one of E22-E56, wherein the length of the antisense strand is 26 nucleotides.
E68. The method of any one of E22-E56, wherein the length of the antisense strand is 27 nucleotides.
E69. The method of any one of E22-E56, wherein the length of the antisense strand is 28 nucleotides.
E70. The method of any one of E22-E56, wherein the length of the antisense strand is 29 nucleotides.
E71. The method of any one of E22-E56, wherein the length of the antisense strand is 30 nucleotides.
E72. The method of E22, wherein the antisense strand includes a structure of Formula A1, wherein Formula A1 is:
A-T-B-T-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A-T-A-T-A-T-A-T-A-T-A-T
(Formula A1); wherein A represents a 2'-O-methyl ribonucleoside, B represents a 2'-F ribonucleoside, T represents a phosphorothioate internucleoside linkage, and P represents a phosphodiester internucleoside linkage.
E73. The method of E22, wherein the antisense strand includes a structure of Formula A2, wherein Formula A2 is:
A-T-A-T-A-P-B-P-B-P-B-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A-T-A-T-A-T-A-T-A-T-A-T
(Formula A2); wherein A represents a 2'-O-methyl ribonucleoside, B represents a 2'-F ribonucleoside, T represents a phosphorothioate internucleoside linkage, and P represents a phosphodiester internucleoside linkage.
E74. The method of E22, wherein the antisense strand includes a structure of Formula A3, wherein Formula A3 is:
A-T-B-T-A-P-B-P-B-P-B-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A-T-A-T-A-T-A-T-A-T-A-T
(Formula A3); wherein A represents a 2'-O-methyl ribonucleoside, B represents a 2'-F ribonucleoside, T represents a phosphorothioate internucleoside linkage, and P represents a phosphodiester internucleoside linkage.
E75. The method of E22, wherein the antisense strand includes a structure of Formula A4, wherein Formula A4 is:
A-T-B-T-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A-T-A-T-A-T-A-T-A-T-A-T
(Formula A4); wherein A represents a 2'-O-methyl ribonucleoside, B represents a 2'-F ribonucleoside, T represents a phosphorothioate internucleoside linkage, and P represents a phosphodiester internucleoside linkage.
E76. The method of E22, wherein the antisense strand includes a structure of Formula A5, wherein Formula A5 is:
A-T-B-T-A-P-A-P-A-P-B-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-B-T-A-T-B-T-A-T-A-T-A-T-A-T
(Formula A5); wherein A represents a 2'-O-methyl ribonucleoside, B represents a 2'-F ribonucleoside, T represents a phosphorothioate internucleoside linkage, and P represents a phosphodiester internucleoside linkage.
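For illustration only, the general antisense formula of E22, Z-((A-P-)n(B-P-)m)q, can be expanded into a flat token string for chosen values of n, m, and q. The following is a minimal sketch in Python, not a definitive reading of the formula: the function names expand_strand and fraction_2ome are hypothetical, the range for q is assumed to be inclusive, each P is left unresolved (phosphodiester or phosphorothioate), and the trailing hyphen of the formula is dropped.

```python
def expand_strand(n, m, q, prefix="Z"):
    """Expand prefix-((A-P-)n(B-P-)m)q into a flat, hyphen-separated string.

    A is a 2'-O-Me ribonucleoside, B is a 2'-fluoro-ribonucleoside, and each
    P is an unresolved internucleoside linkage, as defined in E22.
    """
    # Bounds follow E22; treating the bounds on q as inclusive is an assumption.
    for name, value, upper in (("n", n, 5), ("m", m, 5), ("q", q, 15)):
        if not 1 <= value <= upper:
            raise ValueError(f"{name} must be between 1 and {upper}")
    unit = ["A", "P"] * n + ["B", "P"] * m
    return "-".join([prefix] + unit * q)


def fraction_2ome(n, m):
    """Fraction of nucleosides in the repeating unit that are 2'-O-Me (A)."""
    return n / (n + m)


if __name__ == "__main__":
    strand = expand_strand(n=1, m=1, q=10)
    nucleosides = strand.split("-")[1:][0::2]
    print(strand)
    print(len(nucleosides), "nucleosides;",
          f"{fraction_2ome(1, 1):.0%} are 2'-O-Me")
```

With n and m each equal to 1, as in E33, the repeating unit alternates A and B, so half of the nucleosides are 2'-O-Me; q then sets the strand length at two nucleosides per repeat.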
E77. The method of E22, wherein the sense strand has the following formula in the 5'-to-3' direction:
Y-((A-P-)n(B-P-)m)qL-((B-P-)m(A-P-)n)q; wherein Y is a hydrophobic moiety (e.g., cholesterol, vitamin D, or tocopherol);
L is a linker; each A is, independently, a 2'-O-Me ribonucleoside; each B is, independently, a 2'-fluoro-ribonucleoside; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5; m is an integer from 1 to 5; and each q is an integer between 1 and 15.
E78. The method of E77, wherein Y is cholesterol.
E79. The method of E77, wherein Y is tocopherol.
E80. The method of any one of E77-E79, wherein L is an ethylene glycol oligomer.
E81. The method of any one of E77-E80, wherein L is tetraethylene glycol.
E82. The method of any one of E77-E81, wherein the linker attaches to the sense strand by way of a covalent bond-forming moiety.
E83. The method of E82, wherein the covalent bond-forming moiety is selected from the group consisting of an alkyl, ester, amide, carbamate, phosphonate, phosphate, phosphorothioate, phosphoroamidate, triazole, urea, and formacetal.
E84. The method of E77, wherein L includes a structure of Formula L1, wherein Formula L1 is:
(Formula L1)
E85. The method of E77, wherein L includes a structure of Formula L2, wherein Formula L2 is:
(Formula L2)
E86. The method of E77, wherein L includes a structure of Formula L3, wherein Formula L3 is:
(Formula L3)
E87. The method of E77, wherein L includes a structure of Formula L4, wherein Formula L4 is:
(Formula L4)
E88. The method of E77, wherein L includes a structure of Formula L5, wherein Formula L5 is:
(Formula L5)
E89. The method of E77, wherein L includes a structure of Formula L6, wherein Formula L6 is:
(Formula L6)
E90. The method of E77, wherein L includes a structure of Formula L7, wherein Formula L7 is:
(Formula L7)
E91. The method of E77, wherein L includes a structure of Formula L8, wherein Formula L8 is:
(Formula L8)
E92. The method of E77, wherein L includes a structure of Formula L9, wherein Formula L9 is:
(Formula L9)
E93. The method of any one of E77-E92, wherein each P is independently selected from a phosphodiester linkage and a phosphorothioate linkage.
E94. The method of any one of E77-E93, wherein n is from 1 to 4.
E95. The method of any one of E77-E94, wherein n is from 1 to 3.
E96. The method of any one of E77-E95, wherein n is from 1 to 2.
E97. The method of any one of E77-E96, wherein n is 1.
E98. The method of any one of E77-E97, wherein m is from 1 to 4.
E99. The method of any one of E77-E98, wherein m is from 1 to 3.
E100. The method of any one of E77-E99, wherein m is from 1 to 2.
E101. The method of any one of E77-E100, wherein m is 1.
E102. The method of any one of E77-E101, wherein n and m are each 1.
E103. The method of any one of E77-E102, wherein 10% or less of the ribonucleosides are 2'-O-Me ribonucleoside.
E104. The method of any one of E77-E103, wherein at least 10% of the ribonucleosides are 2'-O-Me ribonucleoside.
E105. The method of any one of E77-E104, wherein at least 20% of the ribonucleosides are 2'-O-Me ribonucleoside.
E106. The method of any one of E77-E105, wherein at least 30% of the ribonucleosides are 2'-O-Me ribonucleoside.
E107. The method of any one of E77-E106, wherein at least 40% of the ribonucleosides are 2'-O-Me ribonucleoside.
E108. The method of any one of E77-E107, wherein at least 50% of the ribonucleosides are 2'-O-Me ribonucleoside.
E109. The method of any one of E77-E108, wherein at least 60% of the ribonucleosides are 2'-O-Me ribonucleoside.
E110. The method of any one of E77-E109, wherein at least 70% of the ribonucleosides are 2'-O-Me ribonucleoside.
E111. The method of any one of E77-E110, wherein at least 80% of the ribonucleosides are 2'-O-Me ribonucleoside.
E112. The method of any one of E77-E111, wherein at least 90% of the ribonucleosides are 2'-O-Me ribonucleoside.
E113. The method of any one of E77-E112, wherein 10% or less of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E114. The method of any one of E77-E113, wherein at least 10% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E115. The method of any one of E77-E114, wherein at least 20% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E116. The method of any one of E77-E115, wherein at least 30% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E117. The method of any one of E77-E116, wherein at least 40% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E118. The method of any one of E77-E117, wherein at least 50% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E119. The method of any one of E77-E118, wherein at least 60% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E120. The method of any one of E77-E119, wherein at least 70% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E121. The method of any one of E77-E120, wherein at least 80% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E122. The method of any one of E77-E121, wherein at least 90% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E123. The method of any one of E77-E122, wherein 100% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E124. The method of any one of E77-E123, wherein the length of the sense strand is between 12 and 30 nucleotides.
E125. The method of any one of E77-E124, wherein the length of the sense strand is between 14 and 28 nucleotides.
E126. The method of any one of E77-E125, wherein the length of the sense strand is between 16 and 26 nucleotides.
E127. The method of any one of E77-E126, wherein the length of the sense strand is between 18 and 24 nucleotides.
E128. The method of any one of E77-E125, wherein the length of the sense strand is 14 nucleotides.
E129. The method of any one of E77-E125, wherein the length of the sense strand is 15 nucleotides.
E130. The method of any one of E77-E125, wherein the length of the sense strand is 16 nucleotides.
E131. The method of any one of E77-E125, wherein the length of the sense strand is 17 nucleotides.
E132. The method of any one of E77-E125, wherein the length of the sense strand is 18 nucleotides.
E133. The method of any one of E77-E125, wherein the length of the sense strand is 19 nucleotides.
E134. The method of any one of E77-E125, wherein the length of the sense strand is 20 nucleotides.
E135. The method of any one of E77-E125, wherein the length of the sense strand is 21 nucleotides.
E136. The method of any one of E77-E125, wherein the length of the sense strand is 22 nucleotides.
E137. The method of any one of E77-E125, wherein the length of the sense strand is 23 nucleotides.
E138. The method of any one of E77-E125, wherein the length of the sense strand is 24 nucleotides.
E139. The method of any one of E77-E125, wherein the length of the sense strand is 25 nucleotides.
E140. The method of any one of E77-E125, wherein the length of the sense strand is 26 nucleotides.
E141. The method of any one of E77-E125, wherein the length of the sense strand is 27 nucleotides.
E142. The method of any one of E77-E125, wherein the length of the sense strand is 28 nucleotides.
E143. The method of any one of E77-E125, wherein the length of the sense strand is 29 nucleotides.
E144. The method of any one of E77-E125, wherein the length of the sense strand is 30 nucleotides.
E145. The method of any one of E77-E144, wherein 4 internucleoside linkages are phosphorothioate linkages.
E146. The method of E77, wherein the sense strand includes a structure of Formula S1, wherein Formula S1 is:
A-T-A-T-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-P-A-T-A-T
Formula S1; wherein A represents a 2’-O-methyl ribonucleoside, B represents a 2’-F ribonucleoside, T represents a phosphorothioate internucleoside linkage, and P represents a phosphodiester internucleoside linkage.
E147. The method of E77, wherein the sense strand includes a structure of Formula S2, wherein Formula S2 is:
A-T-A-T-A-P-A-P-A-P-A-P-B-P-A-P-A-P-B-P-B-P-A-P-A-P-A-T-A-T
Formula S2; wherein A represents a 2’-O-methyl ribonucleoside, B represents a 2’-F ribonucleoside, T represents a phosphorothioate internucleoside linkage, and P represents a phosphodiester internucleoside linkage.
E148. The method of E77, wherein the sense strand includes a structure of Formula S3, wherein Formula S3 is:
A-T-A-T-A-P-A-P-A-P-A-P-B-P-A-P-B-P-B-P-B-P-A-P-A-P-A-T-A-T
Formula S3; wherein A represents a 2’-O-methyl ribonucleoside, B represents a 2’-F ribonucleoside, T represents a phosphorothioate internucleoside linkage, and P represents a phosphodiester internucleoside linkage.
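As an illustration of how the Formula S1-S3 notation can be read, the following minimal Python sketch splits a formula string into nucleoside symbols (A, B) and linkage symbols (T, P) and tallies them. The function name `parse_formula` and the returned dictionary keys are hypothetical and chosen only for illustration; the string used is Formula S2 with any stray whitespace removed.

```python
from collections import Counter

# Symbols as defined for Formulas S1-S3.
NUCLEOSIDES = {"A": "2'-O-methyl ribonucleoside", "B": "2'-F ribonucleoside"}
LINKAGES = {"T": "phosphorothioate", "P": "phosphodiester"}

def parse_formula(formula):
    """Split a hyphen-delimited formula into nucleoside and linkage symbols.

    Hypothetical helper for illustration; not part of the disclosure.
    """
    tokens = [t.strip() for t in formula.split("-") if t.strip()]
    unknown = [t for t in tokens if t not in NUCLEOSIDES and t not in LINKAGES]
    if unknown:
        raise ValueError(f"unrecognized symbols: {unknown}")
    nucleosides = [t for t in tokens if t in NUCLEOSIDES]
    linkages = [t for t in tokens if t in LINKAGES]
    return {
        "nucleosides": nucleosides,
        "linkages": linkages,
        "nucleoside_counts": Counter(nucleosides),
        "linkage_counts": Counter(linkages),
    }

FORMULA_S2 = "A-T-A-T-A-P-A-P-A-P-A-P-B-P-A-P-A-P-B-P-B-P-A-P-A-P-A-T-A-T"
result = parse_formula(FORMULA_S2)
print(len(result["nucleosides"]), dict(result["linkage_counts"]))
```

Under this reading, Formula S2 has 15 nucleosides, three of which are 2’-F ribonucleosides, and four phosphorothioate linkages, the same number of phosphorothioate linkages recited in E145.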
E149. A branched siRNA molecule comprising a sense strand and an antisense strand, wherein the antisense strand comprises a region having complementarity to a segment of contiguous nucleotides within a gene selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF, or a negative regulator, positive regulator, or splice isoform thereof.
E150. The molecule of E149, wherein the siRNA molecule is di-branched.
E151. The molecule of E149, wherein the siRNA molecule is tri-branched.
E152. The molecule of E149, wherein the siRNA molecule is tetra-branched.
E153. The molecule of any one of E149-E152, wherein the antisense strand of the branched siRNA has the following formula in the 5'-to-3' direction:
Z-((A-P-)n(B-P-)m)q; wherein Z is a 5' phosphorus stabilizing moiety; each A is, independently, a 2'-O-Me ribonucleoside; each B is, independently, a 2'-fluoro-ribonucleoside; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5; m is an integer from 1 to 5; and q is an integer between 1 and 15.
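The repeat notation in the E153 formula can be made concrete with a short sketch. The following Python snippet expands ((A-P-)n(B-P-)m)q into a list of nucleoside symbols for given n, m, and q; it omits Z and the independent choice of each P linkage. The function name `expand_antisense` is hypothetical, and treating "between 1 and 15" as inclusive is an assumption.

```python
def expand_antisense(n, m, q):
    """Expand ((A-P-)n(B-P-)m)q into nucleoside symbols, 5' to 3'.

    A = 2'-O-Me ribonucleoside, B = 2'-fluoro-ribonucleoside.
    The bounds on q are treated as inclusive (an assumption).
    """
    if not (1 <= n <= 5 and 1 <= m <= 5 and 1 <= q <= 15):
        raise ValueError("n and m must be 1-5 and q must be 1-15")
    # One repeat unit is n copies of A followed by m copies of B.
    return (["A"] * n + ["B"] * m) * q

strand = expand_antisense(n=1, m=1, q=10)
print("".join(strand))
print(len(strand))
```

With n and m each equal to 1, the pattern alternates 2'-O-Me and 2'-fluoro ribonucleosides, and in general the expanded strand contains q(n + m) nucleosides.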
E154. The molecule of E153, wherein Z is represented in any one of Formulas I-VIII:
(Formulas I-VIII)
wherein Nuc represents a nucleobase, such as adenine, uracil, guanine, thymine, or cytosine, and R represents optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl (e.g., optionally substituted C1-C6 alkyl, optionally substituted C2-C6 alkenyl, or optionally substituted C2-C6 alkynyl), phenyl, benzyl, hydroxy, or hydrogen.
E155. The molecule of E153 or E154, wherein Z is (E)-vinylphosphonate as represented in Formula III.
E156. The molecule of any one of E153-E155, wherein each P is independently selected from phosphodiester and phosphorothioate.
E157. The molecule of any one of E153-E156, wherein n is from 1 to 4.
E158. The molecule of any one of E153-E157, wherein n is from 1 to 3.
E159. The molecule of any one of E153-E158, wherein n is from 1 to 2.
E160. The molecule of any one of E153-E159, wherein n is 1.
E161. The molecule of any one of E153-E160, wherein m is from 1 to 4.
E162. The molecule of any one of E153-E161, wherein m is from 1 to 3.
E163. The molecule of any one of E153-E162, wherein m is from 1 to 2.
E164. The molecule of any one of E153-E163, wherein m is 1.
E165. The molecule of any one of E153-E164, wherein n and m are each 1.
E166. The molecule of any one of E153-E165, wherein 10% or less of the ribonucleosides are 2'-O-Me ribonucleosides.
E167. The molecule of any one of E153-E166, wherein at least 10% of the ribonucleosides are 2'-O-Me ribonucleosides.
E168. The molecule of any one of E153-E167, wherein at least 20% of the ribonucleosides are 2'-O-Me ribonucleosides.
E169. The molecule of any one of E153-E168, wherein at least 30% of the ribonucleosides are 2'-O-Me ribonucleosides.
E170. The molecule of any one of E153-E169, wherein at least 40% of the ribonucleosides are 2'-O-Me ribonucleosides.
E171. The molecule of any one of E153-E170, wherein at least 50% of the ribonucleosides are 2'-O-Me ribonucleosides.
E172. The molecule of any one of E153-E171, wherein at least 60% of the ribonucleosides are 2'-O-Me ribonucleosides.
E173. The molecule of any one of E153-E172, wherein at least 70% of the ribonucleosides are 2'-O-Me ribonucleosides.
E174. The molecule of any one of E153-E173, wherein at least 80% of the ribonucleosides are 2'-O-Me ribonucleosides.
E175. The molecule of any one of E153-E174, wherein at least 90% of the ribonucleosides are 2'-O-Me ribonucleosides.
E176. The molecule of any one of E153-E175, wherein 10% or less of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E177. The molecule of any one of E153-E176, wherein at least 10% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E178. The molecule of any one of E153-E177, wherein at least 20% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E179. The molecule of any one of E153-E178, wherein at least 30% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E180. The molecule of any one of E153-E179, wherein at least 40% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E181. The molecule of any one of E153-E180, wherein at least 50% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E182. The molecule of any one of E153-E181, wherein at least 60% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E183. The molecule of any one of E153-E182, wherein at least 70% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E184. The molecule of any one of E153-E183, wherein at least 80% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E185. The molecule of any one of E153-E184, wherein at least 90% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E186. The molecule of any one of E153-E185, wherein 100% of the internucleoside linkages are phosphodiester linkages or phosphorothioate.
E187. The molecule of any one of E153-E186, wherein the length of the antisense strand is between 10 and 30 nucleotides.
E188. The molecule of any one of E153-E187, wherein the length of the antisense strand is between 15 and 25 nucleotides.
E189. The molecule of any one of E153-E188, wherein the length of the antisense strand is between 18 and 23 nucleotides.
E190. The molecule of any one of E153-E187, wherein the length of the antisense strand is 18 nucleotides.
E191. The molecule of any one of E153-E187, wherein the length of the antisense strand is 19 nucleotides.
E192. The molecule of any one of E153-E187, wherein the length of the antisense strand is 20 nucleotides.
E193. The molecule of any one of E153-E187, wherein the length of the antisense strand is 21 nucleotides.
E194. The molecule of any one of E153-E187, wherein the length of the antisense strand is 22 nucleotides.
E195. The molecule of any one of E153-E187, wherein the length of the antisense strand is 23 nucleotides.
E196. The molecule of any one of E153-E187, wherein the length of the antisense strand is 24 nucleotides.
E197. The molecule of any one of E153-E187, wherein the length of the antisense strand is 25 nucleotides.
E198. The molecule of any one of E153-E187, wherein the length of the antisense strand is 26 nucleotides.
E199. The molecule of any one of E153-E187, wherein the length of the antisense strand is 27 nucleotides.
E200. The molecule of any one of E153-E187, wherein the length of the antisense strand is 28 nucleotides.
E201. The molecule of any one of E153-E187, wherein the length of the antisense strand is 29 nucleotides.
E202. The molecule of any one of E153-E187, wherein the length of the antisense strand is 30 nucleotides.
E203. The molecule of any one of E149-E202, wherein 9 internucleoside linkages are phosphorothioate.
E204. The molecule of any one of E149-E203, wherein the sense strand of the branched siRNA has the following formula in the 5'-to-3' direction:
Y-((A-P-)n(B-P-)m)qL-((B-P-)m(A-P-)n)q; wherein Y is a hydrophobic moiety (e.g., cholesterol, vitamin D, or tocopherol);
L is a linker; each A is, independently, a 2'-O-Me ribonucleoside; each B is, independently, a 2'-fluoro-ribonucleoside; each P is, independently, an internucleoside linkage selected from a phosphodiester linkage and a phosphorothioate linkage; n is an integer from 1 to 5; m is an integer from 1 to 5; and q is an integer between 1 and 15.
E205. The molecule of E204, wherein Y is cholesterol.
E206. The molecule of E204, wherein Y is tocopherol.
E207. The molecule of any one of E204-E206, wherein L is an ethylene glycol oligomer.
E208. The molecule of E207, wherein L is tetraethylene glycol.
E209. The molecule of any one of E204-E208, wherein L attaches to the sense strand by way of a covalent bond-forming moiety.
E210. The molecule of E209, wherein the covalent bond-forming moiety is selected from the group consisting of an alkyl, ester, amide, carbamate, phosphonate, phosphate, phosphorothioate, phosphoroamidate, triazole, urea, and formacetal.
E211. The molecule of E204, wherein L includes a structure of Formula L1, wherein Formula L1 is:
(Formula L1)
E212. The molecule of E204, wherein L includes a structure of Formula L2, wherein Formula L2 is:
(Formula L2)
E213. The molecule of E204, wherein L includes a structure of Formula L3, wherein Formula L3 is:
(Formula L3)
E214. The molecule of E204, wherein L includes a structure of Formula L4, wherein Formula L4 is:
(Formula L4)
E215. The molecule of E204, wherein L includes a structure of Formula L5, wherein Formula L5 is:
(Formula L5)
E216. The molecule of E204, wherein L includes a structure of Formula L6, wherein Formula L6 is:
(Formula L6)
E217. The molecule of E204, wherein L includes a structure of Formula L7, wherein Formula L7 is:
(Formula L7)
E218. The molecule of E204, wherein L includes a structure of Formula L8, wherein Formula L8 is:
(Formula L8)
E219. The molecule of E204, wherein L includes a structure of Formula L9, wherein Formula L9 is:
(Formula L9)
E220. The molecule of any one of E204-E210, wherein each P is independently selected from phosphodiester and phosphorothioate.
E221. The molecule of any one of E204-E220, wherein n is from 1 to 4.
E222. The molecule of any one of E204-E221, wherein n is from 1 to 3.
E223. The molecule of any one of E204-E222, wherein n is from 1 to 2.
E224. The molecule of any one of E204-E223, wherein n is 1.
E225. The molecule of any one of E204-E224, wherein m is from 1 to 4.
E226. The molecule of any one of E204-E225, wherein m is from 1 to 3.
E227. The molecule of any one of E204-E226, wherein m is from 1 to 2.
E228. The molecule of any one of E204-E227, wherein m is 1.
E229. The molecule of any one of E204-E228, wherein n and m are each 1.
E230. The molecule of any one of E204-E229, wherein 10% or less of the ribonucleosides are 2'-O-Me ribonucleosides.
E231. The molecule of any one of E204-E230, wherein at least 10% of the ribonucleosides are 2'-O-Me ribonucleosides.
E232. The molecule of any one of E204-E231, wherein at least 20% of the ribonucleosides are 2'-O-Me ribonucleosides.
E233. The molecule of any one of E204-E232, wherein at least 30% of the ribonucleosides are 2'-O-Me ribonucleosides.
E234. The molecule of any one of E204-E233, wherein at least 40% of the ribonucleosides are 2'-O-Me ribonucleosides.
E235. The molecule of any one of E204-E234, wherein at least 50% of the ribonucleosides are 2'-O-Me ribonucleosides.
E236. The molecule of any one of E204-E235, wherein at least 60% of the ribonucleosides are 2'-O-Me ribonucleosides.
E237. The molecule of any one of E204-E236, wherein at least 70% of the ribonucleosides are 2'-O-Me ribonucleosides.
E238. The molecule of any one of E204-E237, wherein at least 80% of the ribonucleosides are 2'-O-Me ribonucleosides.
E239. The molecule of any one of E204-E238, wherein at least 90% of the ribonucleosides are 2'-O-Me ribonucleosides.
E240. The molecule of any one of E204-E239, wherein 10% or less of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E241. The molecule of any one of E204-E240, wherein at least 10% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E242. The molecule of any one of E204-E241, wherein at least 20% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E243. The molecule of any one of E204-E242, wherein at least 30% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E244. The molecule of any one of E204-E243, wherein at least 40% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E245. The molecule of any one of E204-E244, wherein at least 50% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E246. The molecule of any one of E204-E245, wherein at least 60% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E247. The molecule of any one of E204-E246, wherein at least 70% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E248. The molecule of any one of E204-E247, wherein at least 80% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E249. The molecule of any one of E204-E248, wherein at least 90% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E250. The molecule of any one of E204-E249, wherein 100% of the internucleoside linkages are phosphodiester linkages or phosphorothioate linkages.
E251. The molecule of any one of E204-E250, wherein the length of the sense strand is between 12 and 30 nucleotides.
E252. The molecule of any one of E204-E251, wherein the length of the sense strand is between 14 and 28 nucleotides.
E253. The molecule of any one of E204-E252, wherein the length of the sense strand is between 16 and 26 nucleotides.
E254. The molecule of any one of E204-E253, wherein the length of the sense strand is between 18 and 24 nucleotides.
E255. The molecule of any one of E204-E251, wherein the length of the sense strand is 14 nucleotides.
E256. The molecule of any one of E204-E251, wherein the length of the sense strand is 15 nucleotides.
E257. The molecule of any one of E204-E251, wherein the length of the sense strand is 16 nucleotides.
E258. The molecule of any one of E204-E251, wherein the length of the sense strand is 17 nucleotides.
E259. The molecule of any one of E204-E251, wherein the length of the sense strand is 18 nucleotides.
E260. The molecule of any one of E204-E251, wherein the length of the sense strand is 19 nucleotides.
E261. The molecule of any one of E204-E251, wherein the length of the sense strand is 20 nucleotides.
E262. The molecule of any one of E204-E251, wherein the length of the sense strand is 21 nucleotides.
E263. The molecule of any one of E204-E251, wherein the length of the sense strand is 22 nucleotides.
E264. The molecule of any one of E204-E251, wherein the length of the sense strand is 23 nucleotides.
E265. The molecule of any one of E204-E251, wherein the length of the sense strand is 24 nucleotides.
E266. The molecule of any one of E204-E251, wherein the length of the sense strand is 25 nucleotides.
E267. The molecule of any one of E204-E251, wherein the length of the sense strand is 26 nucleotides.
E268. The molecule of any one of E204-E251, wherein the length of the sense strand is 27 nucleotides.
E269. The molecule of any one of E204-E251, wherein the length of the sense strand is 28 nucleotides.
E270. The molecule of any one of E204-E251, wherein the length of the sense strand is 29 nucleotides.
E271. The molecule of any one of E204-E251, wherein the length of the sense strand is 30 nucleotides.
E272. The molecule of any one of E204-E271, wherein 4 internucleoside linkages are phosphorothioate.
E273. A method of treating a subject diagnosed as having a disease associated with expression of a dysregulated microglial gene, the method comprising administering to the subject the branched siRNA molecule of any one of E149-E271.
E274. The method of any one of E11-E148 or E273, wherein the disease is a neuroinflammatory disease.
E275. The method of any one of E11-E148, E273, or E274, wherein the disease is a neurodegenerative disease.
E276. The method of any one of E11-E148 or E273-E275, wherein the disease is Alzheimer’s disease.
E277. The method of any one of E11-E148 or E273-E275, wherein the disease is amyotrophic lateral sclerosis.
E278. The method of any one of E11-E148 or E273-E275, wherein the disease is Parkinson’s disease.
E279. The method of any one of E11-E148 or E273-E275, wherein the disease is frontotemporal dementia.
E280. The method of any one of E11-E148 or E273-E275, wherein the disease is Huntington’s disease.
E281. The method of any one of E11-E148 or E273-E275, wherein the disease is multiple sclerosis.
E282. The method of any one of E11-E148 or E273-E275, wherein the disease is progressive supranuclear palsy.
E283. The method of E273, wherein the dysregulated microglial gene is selected from the group consisting of ABCA7, ABI3, ADAM10, APOC1, APOE, AXL, BIN1, C1QA, C3, C9ORF72, CASS4, CCL5, CD2AP, CD33, CD68, CLPTM1, CLU, CR1, CSF1, CST7, CTSB, CTSD, CTSL, CXCL10, CXCL13, DSG2, ECHDC3, EPHA1, FABP5, FERMT2, FTH1, GNAS, GRN, HBEGF, HLA-DRB1, HLA-DRB5, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IGF1, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, ITGAX, LILRB4, LPL, MEF2C, MMP12, MS4A4A, MS4A6A, NLRP3, NME8, NOS2, PICALM, PILRA, PLCG2, PTK2B, SCIMP, SLC24A4, SORL1, SPI1, SPP1, SPPL2A, TBK1, TNF, TREM2, TREML2, TYROBP, and ZCWPW1.
E284. The method of E273, wherein the administering of the branched siRNA molecule to the subject results in silencing of a microglial gene in the subject.
E285. The method of E284, wherein silencing of a microglial gene comprises silencing of any one of the genes selected from the group consisting of APOE, BIN1, C1QA, C3, C9ORF72, CCL5, CD33, CLU/APOJ, CR1, CXCL10, CXCL13, IFIT1, IFIT3, IFITM3, IFNAR1, IFNAR2, IL10RA, IL1A, IL1B, IL1RAP, INPP5D, ITGAM, MEF2C, MMP12, NLRP3, NOS2, PILRA, PLCG2, PTK2B, SLC24A4, TBK1, and TNF.
E286. The method of E273, wherein the microglial gene is an overactive disease driver gene (e.g., a dysregulated microglial gene).
E287. The method of E273, wherein the gene is a positive regulator of a gene for which increased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
E288. The method of E273, wherein the gene is a negative regulator of a gene for which decreased expression and/or activity relative to the level of expression and/or activity observed in a reference subject is associated with a disease state.
E289. The method of E273, wherein the gene is a splice isoform of a gene for which overexpression of the splice isoform relative to the expression of the splice isoform in a reference subject is associated with a disease state.
E290. The method of any one of E273-E289, wherein the subject is a human.
OTHER EMBODIMENTS
Various modifications and variations of the described disclosure will be apparent to those skilled in the art without departing from the scope and spirit of the disclosure. Although the disclosure has been described in connection with specific embodiments, it should be understood that the disclosure as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the disclosure that are obvious to those skilled in the art are intended to be within the scope of the disclosure.
Other embodiments are in the claims.