REPLACEMENT OF RAG1 FOR USE IN THERAPY - OSPEDALE SAN RAFFAELE SRL

Title:

REPLACEMENT OF RAG1 FOR USE IN THERAPY

Document Type and Number:

WIPO Patent Application WO/2022/079054

Kind Code:

A1

Abstract:

The present invention relates to an isolated polynucleotide comprising from 5' to 3': a first homology region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homology region for use in treating a RAG-deficient immunodeficiency.

Inventors:

VILLA ANNA (IT)
GENOVESE PIETRO (IT)
NALDINI LUIGI (IT)
SACCHETTI NICOLO (IT)
CASTIELLO MARIA CARMINA (IT)
FERRARI SAMUELE (IT)

Application Number:

PCT/EP2021/078222

Publication Date:

April 21, 2022

Filing Date:

October 12, 2021

Assignee:

OSPEDALE SAN RAFFAELE SRL (IT)
FOND TELETHON (IT)

International Classes:

C12N9/10; A61K38/17; A61P37/02; C07K14/47; C12N15/85

Domestic Patent References:

WO1998017815A1	1998-04-30
WO2020002380A1	2020-01-02

Foreign References:

US20190038771A1	2019-02-07

Other References:

SACCHETI: "ESGCT XXV Anniversary Congress in Collaboration with the German Society for Gene Therapy October 17-20, 2017 Berlin, Germany", HUMAN GENE THERAPY, vol. 28, no. 12, 1 December 2017 (2017-12-01), GB, pages A1 - A125, XP055882996, ISSN: 1043-0342, Retrieved from the Internet DOI: 10.1089/hum.2017.29055.abstracts
"ESID 2014 ORAL PRESENTATIONS", JOURNAL OF CLINICAL IMMUNOLOGY, KLUWER ACADEMIC PUBLISHERS, NEW YORK, vol. 34, no. Suppl 2, 1 October 2014 (2014-10-01), pages 139 - 515, XP037077267, ISSN: 0271-9142, [retrieved on 20141025], DOI: 10.1007/S10875-014-0101-9
ITAI M PESSACH ET AL: "Gene therapy for primary immunodeficiencies: Looking ahead, toward gene correction", JOURNAL OF ALLERGY AND CLINICAL IMMUNOLOGY, ELSEVIER, AMSTERDAM, NL, vol. 127, no. 6, 18 February 2011 (2011-02-18), pages 1344 - 1350, XP028088959, ISSN: 0091-6749, [retrieved on 20110225], DOI: 10.1016/J.JACI.2011.02.027
LAGRESLE-PEYROU CHANTAL ET AL: "Restoration of Human B-cell Differentiation Into NOD-SCID Mice Engrafted With Gene-corrected CD34+ Cells Isolated From Artemis or RAG1-deficient Patients", MOLECULAR THERAPY, vol. 16, no. 2, 1 February 2008 (2008-02-01), US, pages 396 - 403, XP055882975, ISSN: 1525-0016, DOI: 10.1038/sj.mt.6300353
TENG G, SCHATZ DG, ADVANCES IN IMMUNOLOGY, vol. 128, 2015, pages 1 - 39
NOTARANGELO LD ET AL., NAT REV IMMUNOL, vol. 16, no. 4, 2016, pages 234 - 246
HADDAD E ET AL., BLOOD, vol. 132, no. 17, 2018, pages 1737 - 49
KIM, M.S. ET AL., NATURE, vol. 518, no. 7540, 2015, pages 507 - 511
LAGRESLE-PEYROU C ET AL., BLOOD, vol. 107, no. 1, 2006, pages 63 - 72
PIKE-OVERZET K ET AL., LEUKEMIA, vol. 25, no. 9, 2011, pages 1471 - 83
PIKE-OVERZET K ET AL., JOURNAL OF ALLERGY AND CLINICAL IMMUNOLOGY, vol. 134, 2014, pages 242 - 243
VAN TIL NP ET AL., J ALLERGY CLIN IMMUNOL., vol. 133, no. 4, 2014, pages 1116 - 23
ARBUCKLE, J.L. ET AL., BMC BIOCHEMISTRY, vol. 12, no. 1, 2011, pages 23
RU, H. ET AL., CELL, vol. 163, no. 5, 2015, pages 1138 - 1152
RAN, F.A. ET AL., NATURE PROTOCOLS, vol. 8, no. 11, 2013, pages 2281 - 2308
MA, S.L. ET AL., PLOS ONE, vol. 10, no. 6, 2015, pages e0130729
DEVEREUX ET AL.: "Oligonucleotide Synthesis: A Practical Approach", vol. 12, 1984, UNIVERSITY OF WISCONSIN, pages: 387
ALTSCHUL ET AL., J. MOL. BIOL, 1990, pages 403 - 410
MADEIRA, F. ET AL., NUCLEIC ACIDS RESEARCH, vol. 47, no. W1, 2019, pages W636 - W641
FEMS MICROBIOL. LETT., vol. 177, no. 1, 1999, pages 187 - 50
AYUSO, E. ET AL., CURRENT GENE THERAPY, vol. 10, no. 6, 2010, pages 423 - 436
MERTEN, O.W. ET AL., MOLECULAR THERAPY-METHODS & CLINICAL DEVELOPMENT, vol. 3, 2016, pages 16017
NADEAU, I., KAMEN, A., BIOTECHNOLOGY ADVANCES, vol. 20, no. 7-8, 2003, pages 475 - 489
CUI, Y. ET AL., INTERDISCIPLINARY SCIENCES: COMPUTATIONAL LIFE SCIENCES, vol. 10, no. 2, 2018, pages 455 - 465
CRADICK TJ ET AL., MOL THER - NUCLEIC ACIDS, vol. 3, no. 12, 2014, pages e214
HENDEL A ET AL., NAT BIOTECHNOL., vol. 33, no. 2, 2015, pages 187 - 97
MURUGAN, K. ET AL., MOLECULAR CELL, vol. 68, no. 1, 2017, pages 15 - 25
JIANG, F., DOUDNA, J.A., ANNUAL REVIEW OF BIOPHYSICS, vol. 46, 2017, pages 505 - 529
KIM S ET AL., GENOME RES., vol. 24, no. 6, 2014, pages 1012 - 9
SCHOTT, J.W. ET AL., MOLECULAR THERAPY-METHODS & CLINICAL DEVELOPMENT, vol. 14, 2019, pages 134 - 147
YANG, H. ET AL., MOLECULAR THERAPY-NUCLEIC ACIDS, vol. 20, 2020, pages 451 - 458
HUANG, X. ET AL., F1000RESEARCH, vol. 8, 2019, pages 1833
WILBIE, D. ET AL., ACCOUNTS OF CHEMICAL RESEARCH, vol. 52, no. 6, 2019, pages 1555 - 1564
SCHIROLI, G. ET AL., CELL STEM CELL, vol. 24, no. 4, 2019, pages 551 - 565
FERRARI, S. ET AL., NATURE BIOTECHNOLOGY, 2020, pages 1 - 11
NOTARANGELO, L.D. ET AL., NATURE REVIEWS IMMUNOLOGY, vol. 16, no. 4, 2016, pages 234 - 246
DELMONTE, O.M. ET AL., JOURNAL OF CLINICAL IMMUNOLOGY, vol. 38, no. 6, 2018, pages 646 - 655
SAMBROOK, J., FRITSCH, E.F. AND MANIATIS, T.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS
AUSUBEL, F.M ET AL.: "Current Protocols in Molecular Biology", 1995, JOHN WILEY & SONS
ROE, B., CRABTREE, J., KAHN, A.: "DNA Isolation and Sequencing: Essential Techniques", 1996, JOHN WILEY & SONS
OETTINGER MA ET AL., SCIENCE, vol. 248, no. 4962, 1990, pages 1517 - 23
LILLEY, D.M., DAHLBERG, J.E.: "Methods in Enzymology: DNA Structures Part A: Synthesis and Physical", 1992, ACADEMIC PRESS
TSAI SQ ET AL., NAT BIOTECHNOL, vol. 33, no. 2, 2015, pages 187 - 97
ZHU LJ ET AL., BMC GENOMICS, vol. 18, no. 1, 2017
DELMONTE OM ET AL., BLOOD, vol. 135, no. 9, 2020, pages 610 - 9
DOI K, TAKEUCHI Y, UIRUSU, vol. 65, 2015, pages 27 - 36
TEN BOEKEL E ET AL., IMMUNITY, vol. 8, no. 2, 1998, pages 199 - 207
VAN TIL NP ET AL., J ALLERGY CLIN IMMUNOL, vol. 133, no. 4, 2014, pages 1099 - 10
ZHANG Y ET AL., ADVANCES IN IMMUNOLOGY, 2010, pages 93 - 133
PAPAEMMANUIL E ET AL., NAT GENET., vol. 46, no. 2, 2014, pages 116 - 25
VAKULSKAS CA ET AL., NAT MED., vol. 24, no. 8, 2018, pages 1216 - 24
GENOVESE P ET AL., NATURE, vol. 510, no. 7504, 2014, pages 235 - 40
KASS EM, JASIN M, FEBS LETTERS, vol. 584, 2010, pages 3703 - 8
CLACKSON T, GENE THERAPY, vol. 7, 2000, pages 120 - 5
LANGMEAD B, SALZBERG SL, NAT METHODS., vol. 9, no. 4, 2012, pages 357 - 9
BASSO-RICCI L ET AL., CYTOM PART A., vol. 91, no. 10, 2017, pages 952 - 65
SEET ET AL., NAT METHODS, 2017
LIANG HE ET AL., IMMUNITY, vol. 17, 2002, pages 639 - 651
BREDEMEYER AL ET AL., NATURE, vol. 442, no. 7101, 2006, pages 466 - 470
DE RAVIN SS ET AL., BLOOD, vol. 116, 2010, pages 1263 - 1271

Attorney, Agent or Firm:

D YOUNG & CO LLP (GB)

Claims:

CLAIMS

1. An isolated polynucleotide comprising from 5’ to 3’: a first homology region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homology region.

2. The isolated polynucleotide according to claim 1, wherein:

(i) the first homology region is homologous to a first region of the RAG1 intron 1 and the second homology region is homologous to a second region of the RAG1 intron 1; or

(ii) the first homology region is homologous to a first region of the RAG1 intron 1 or the RAG1 exon 2 and the second homology region is homologous to a second region of the RAG1 exon 2.

3. The isolated polynucleotide according to claim 1 or claim 2, wherein the first homology region is homologous to a first region of the RAG1 intron 1 and the second homology region is homologous to a second region of the RAG1 intron 1.

4. The isolated polynucleotide according to any preceding claim, wherein:

(i) the first homology region is homologous to a region upstream of chr 11 : 36569295 and the second homology region is homologous to a region downstream of chr 11 : 36569298;

(ii) the first homology region is homologous to a region upstream of chr 11 : 36573790 and the second homology region is homologous to a region downstream of chr 11 : 36573793;

(iii) the first homology region is homologous to a region upstream of chr 11 : 36573641 and the second homology region is homologous to a region downstream of chr 11 : 36573644;

(iv) the first homology region is homologous to a region upstream of chr 11 : 36573351 and the second homology region is homologous to a region downstream of chr 11 : 36573354;

(v) the first homology region is homologous to a region upstream of chr 11 : 36569080 and the second homology region is homologous to a region downstream of chr 11 : 36569083;

(vi) the first homology region is homologous to a region upstream of chr 11 : 36572472 and the second homology region is homologous to a region downstream of chr 11 : 36572475;

(vii) the first homology region is homologous to a region upstream of chr 11 : 36571458 and the second homology region is homologous to a region downstream of chr 11 : 36571461;

(viii) the first homology region is homologous to a region upstream of chr 11 : 36571366 and the second homology region is homologous to a region downstream of chr 11 : 36571369;

(ix) the first homology region is homologous to a region upstream of chr 11 : 36572859 and the second homology region is homologous to a region downstream of chr 11 : 36572862;

(x) the first homology region is homologous to a region upstream of chr 11 : 36571457 and the second homology region is homologous to a region downstream of chr 11 : 36571460;

(xi) the first homology region is homologous to a region upstream of chr 11 : 36569351 and the second homology region is homologous to a region downstream of chr 11 : 36569354; or

(xii) the first homology region is homologous to a region upstream of chr 11 : 36572375 and the second homology region is homologous to a region downstream of chr 11 : 36572378.

5. The isolated polynucleotide according to any preceding claim, wherein:

(i) the first homology region is homologous to a region upstream of chr 11 : 36569295 and the second homology region is homologous to a region downstream of chr 11 : 36569298;

(ii) the first homology region is homologous to a region upstream of chr 11 : 36573351 and the second homology region is homologous to a region downstream of chr 11 : 36573354; or

(iii) the first homology region is homologous to a region upstream of chr 11 : 36571366 and the second homology region is homologous to a region downstream of chr 11 : 36571369;

preferably wherein the first homology region is homologous to a region upstream of chr 11 : 36569295 and the second homology region is homologous to a region downstream of chr 11 : 36569298.

6. The isolated polynucleotide according to any preceding claim, wherein the first homology region is homologous to a region comprising chr 11 : 36569245-chr 11 : 36569294 and/or the second homology region is homologous to a region comprising chr 11 : 36569299-chr 11 : 36569348.

7. The isolated polynucleotide according to any preceding claim, wherein the 3’ terminal sequence of the first homology region comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 7 and/or the 5’ terminal sequence of the second homology region comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 19.

8. The isolated polynucleotide according to any preceding claim, wherein the first homology region comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 31, or a fragment thereof and/or the second homology region comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 32, or a fragment thereof.

9. The isolated polynucleotide according to any preceding claim, wherein the first and second homology regions are each 50-1000bp in length, 100-500 bp in length, or 200-400 bp in length.

10. The isolated polynucleotide according to any preceding claim, wherein the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence encoding an amino acid sequence that has at least 70% identity to SEQ ID NO: 4 or SEQ ID NO: 5.

11. The isolated polynucleotide according to any preceding claim, wherein the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 6.

12. The isolated polynucleotide according to any preceding claim, wherein the splice acceptor site comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 33.

13. The isolated polynucleotide according to any preceding claim, wherein the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a polyadenylation sequence, optionally wherein the polyadenylation sequence is a bGH polyadenylation sequence.

14. The isolated polynucleotide according to any preceding claim, wherein the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a polyadenylation sequence comprising or consisting of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 35.

15. The isolated polynucleotide according to any preceding claim, wherein the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a Kozak sequence, optionally wherein the Kozak sequence comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 36.

16. The isolated polynucleotide according to any preceding claim, wherein the polynucleotide comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 39.

17. A vector comprising the polynucleotide according to any preceding claim.

18. The vector according to claim 17, wherein the vector is a viral vector, optionally an adeno-associated viral (AAV) vector such as an AAV6 vector.

19. A guide RNA comprising or consisting of a nucleotide sequence that has at least 90% identity to any of SEQ ID NOs: 41-52 or 53-55, optionally wherein the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 41 or 53 (preferably SEQ ID NO: 41).

20. The guide RNA according to claim 19, wherein from one to five of the terminal nucleotides at the 5’ end and/or 3’ end of the guide RNA are chemically modified to enhance stability, optionally wherein three terminal nucleotides at the 5’ end and/or 3’ end of the guide RNA are chemically modified to enhance stability, optionally wherein the chemical modification is modification with 2'-O-methyl 3'phosphorothioate.

21. A kit, a composition, or a gene-editing system, comprising the polynucleotide according to any one of claims 1 to 16 or the vector according to any one of claims 17 or 18.

22. The kit, composition, or gene-editing system according to claim 21, wherein the kit, composition, or gene-editing system further comprises a guide RNA according to claim 19 or claim 20.

23. The kit, composition, or gene-editing system according to claim 21 or claim 22, wherein the kit, composition, or gene-editing system further comprises an RNA-guided nuclease, optionally wherein the RNA-guided nuclease is a Cas9 endonuclease.

24. Use of the isolated polynucleotide according to any one of claims 1 to 16, the vector according to any one of claims 17 or 18, the guide RNA according to any one of claims 19 or 20, or the kit, composition, or gene-editing system according to any one of claims 21 to 23, for gene editing a cell or a population of cells.

25. An isolated genome comprising the polynucleotide according to any one of claims 1 to 16.

26. An isolated cell comprising the polynucleotide according to any one of claims 1 to 16 or the genome according to claim 25.

27. The isolated cell according to claim 26, wherein the cell is a hematopoietic stem cell (HSC), a hematopoietic progenitor cell (HPC), or a lymphoid progenitor cell (LPC).

28. The isolated cell according to claim 26 or claim 27, wherein the cell is a CD34+ cell.

29. A population of cells comprising one or more isolated cells according to any one of claims 26 to 28.

30. The population of cells according to claim 29, wherein at least 50% of the population of cells are CD34+ cells.

31. The population of cells according to claim 29 or claim 30, wherein at least 20% of the population of cells are CD34+ cells comprising the genome according to claim 25.

32. A method of gene editing a population of cells comprising:

(a) providing a population of cells; and

(b) delivering an RNA-guided nuclease, a guide RNA according to claim 19 or claim 20, and a vector according to claim 17 or claim 18, to the population of cells to obtain a population of gene-edited cells.

33. A method of treating a RAG-deficient immunodeficiency in a subject comprising:

(a) providing a population of cells;

(b) delivering an RNA-guided nuclease, a guide RNA according to claim 19 or claim 20, and a vector according to claim 17 or claim 18, to the population of cells to obtain a population of gene-edited cells; and

(c) administering the population of gene-edited cells to the subject.

34. The method according to claim 32 or claim 33, wherein the population of cells comprises or consists of HSCs, HPCs, and/or LPCs and/or wherein the population of cells comprises or consists of CD34+ cells.

35. The method according to any one of claims 32 to 34, wherein the population of cells is pre-activated, optionally wherein the population of cells is cultured with one or more cytokines selected from: one or more early acting cytokines such as TPO, IL-6, IL-3, SCF, FLT3-L; one or more transduction enhancers such as PGE2; and one or more expansion enhancers such as UM171, UM729, SR1.

36. The method according to any one of claims 32 to 35, wherein the RNA-guided nuclease and/or guide RNA is delivered prior to the vector and/or simultaneously with the vector.

37. The method according to any one of claims 32 to 36, wherein the RNA-guided nuclease is Cas9, optionally wherein the Cas9 and the guide RNA are delivered preassembled as Cas9 RNPs.

38. The method according to any one of claims 32 to 37, wherein the method further comprises delivering a p53 inhibitor and/or a HDR enhancer, optionally wherein the p53 inhibitor and/or a HDR enhancer is delivered simultaneously with the RNA-guided nuclease and/or guide RNA.

39. The method according to any one of claims 32 to 38, wherein the population of gene-edited cells is defined according to any one of claims 29 to 31.

40. A population of gene-edited cells obtainable by the method according to any one of claims 32 to 39.

41. A method of treating a RAG-deficient immunodeficiency comprising administering the isolated cell according to any one of claims 26 to 28, the population of cells according to any one of claims 29 to 31, or the population of gene-edited cells according to claim 40, to a subject in need thereof.

42. The isolated cell according to any one of claims 26 to 28, the population of cells according to any one of claims 29 to 31, or the population of gene-edited cells according to claim 40, for use in treating a RAG-deficient immunodeficiency in a subject.

43. The method according to claim 41, or the isolated cell, population of cells, or population of gene-edited cells for use according to claim 42, wherein the RAG-deficient immunodeficiency is T- B- severe combined immunodeficiency (SCID), Omenn syndrome, atypical SCID or combined immunodeficiency with granuloma/autoimmunity (CID-G/AI).

44. The method according to claim 41 or claim 43, or the isolated cell, population of cells, or population of gene-edited cells for use according to claim 42 or claim 43, wherein the subject has a RAG1 deficiency.

45. The method according to any one of claims 41, 43, or 44, or the isolated cell, population of cells, or population of gene-edited cells for use according to any one of claims 42 to 44, wherein the subject has a mutation in the RAG1 gene, optionally in RAG1 exon 2.

Description:

REPLACEMENT OF RAG1 FOR USE IN THERAPY

FIELD OF THE INVENTION

The present invention relates to methods for gene-editing cells to introduce a RAG1 polypeptide, for example as a treatment for severe combined immunodeficiency. The present invention also relates to polynucleotides, vectors, guide RNAs, kits, compositions, and gene editing systems for use in said methods. The present invention also relates to genomes and cells obtained or obtainable by said methods.

BACKGROUND TO THE INVENTION

The RAG1 and RAG2 proteins initiate V(D)J recombination, allowing generation of a diverse repertoire of T and B cells (Teng G, Schatz DG. Advances in Immunology. 2015;128:1-39). RAG mutations in humans cause a broad spectrum of phenotypes, including T- B- SCID, Omenn syndrome (OS), atypical SCID (AS) and combined immunodeficiency with granuloma/autoimmunity (CID-G/AI) (Notarangelo LD, et al. Nat Rev Immunol. 2016;16(4):234-246).

Hematopoietic stem cell transplantation (HSCT) is the mainstay of treatment for severe forms of RAG1 deficiency, including T- B- SCID, OS and AS, with an overall survival of ~80% after transplantation from donors other than matched siblings (Haddad E, et al. Blood. 2018;132(17):1737-49). However, the overall survival rate is lower with non-matched-sibling donors, and a high rate of graft failure and poor T and B cell immune reconstitution are observed in the absence of myeloablative or reduced intensity conditioning. Besides donor type and conditioning, other factors associated with worse outcomes after HSCT include age (>3.5 months of life) and infections at the time of transplantation.

Gene therapy is an alternative approach that may overcome the obstacles associated with HSCT. The rationale for developing such a strategy is the selective advantage of gene-corrected hematopoietic stem cells (HSCs), which can overcome the block in T and B cell development that occurs in the absence of RAG activity. In recent years, lentiviral vectors have become the strategy of choice to deliver the transgene of interest, and allow its expression under the control of suitable promoters (Naldini L, Nature. 2015;526:351-360). In the case of RAG1 deficiency, however, endogenous RAG1 gene expression is tightly regulated during the cell cycle and during lymphoid development, which raises the risk that ectopic or dysregulated gene expression could lead to immune dysregulation or leukemia (Lagresle-Peyrou C, et al. Blood. 2006;107(1):63-72; Pike-Overzet K, et al. Leukemia. 2011;25(9):1471-83; and Pike-Overzet K, et al. Journal of Allergy and Clinical Immunology. 2014;134:242-243). Several groups have examined the safety and efficacy of lentivirus-mediated gene therapy for RAG deficiency in preclinical models, which showed poor immune reconstitution or severe signs of inflammation, with cellular infiltrates in the skin, lung, liver and kidney, and the presence of circulating anti-double strand DNA antibodies (van Til NP, et al. J Allergy Clin Immunol. 2014;133(4):1116-23).

Overall, these data raise significant concerns about the clinical use of conventional RAG1 gene therapy vectors, which yield suboptimal levels and deregulated patterns of gene expression.

Thus, there is a demand for improved treatments for RAG1 deficiency.

SUMMARY OF THE INVENTION

The present inventors have developed a gene editing strategy to correct mutations in the RAG1 gene by targeting the genomic region located 5’ of the second exon; the second exon contains the entire coding sequence of the gene.

The present inventors have designed and selected a panel of CRISPR-Cas9 nucleases and identified specific sites in non-repeated regions of the first intron of the human RAG1 gene. The present inventors have identified guide RNAs and optimal conditions for the delivery of the CRISPR-Cas9 nuclease ribonucleoprotein complexes. In parallel, the present inventors have developed a donor DNA carrying the human RAG1 cDNA.

The gene editing strategy allows a high level of activity (measured as frequency of NHEJ-mutagenesis) and targeting efficiency (measured as GFP expression), both in a surrogate cell line deficient in RAG1 expression and expressing a recombination cassette, and in human CD34+ HSCs obtained from mobilized peripheral blood (mPB). High editing efficiencies were reached in mPB CD34+ cells using the gene editing strategy.

In one aspect, the present invention provides a polynucleotide comprising from 5’ to 3’: a first homology region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homology region.

In another aspect, the present invention provides a polynucleotide comprising from 5’ to 3’: a first homology region, a nucleotide sequence encoding a RAG1 polypeptide, and a second homology region.
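As a rough illustration of the 5’ to 3’ element order recited in the two aspects above, the following Python sketch models a donor polynucleotide as an ordered list of element labels and checks their relative order. The class, the labels and the method name are hypothetical and introduced only for illustration; they are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical element labels, listed in the 5' to 3' order recited above.
WITH_SPLICE_ACCEPTOR = [
    "first_homology_region",
    "splice_acceptor",
    "rag1_coding_sequence",
    "second_homology_region",
]
# Second aspect: the same order without the splice acceptor sequence.
WITHOUT_SPLICE_ACCEPTOR = [
    e for e in WITH_SPLICE_ACCEPTOR if e != "splice_acceptor"
]


@dataclass
class DonorLayout:
    """Illustrative model of a donor polynucleotide (labels only, no sequences)."""

    elements: List[str]  # element labels listed 5' to 3'

    def has_recited_order(self) -> bool:
        """Return True if the recited elements occur in the recited order.

        Additional elements may be interleaved, because "comprising"
        is open-ended; only the relative order of the recited elements
        is checked.
        """
        core = [e for e in self.elements if e in WITH_SPLICE_ACCEPTOR]
        return core in (WITH_SPLICE_ACCEPTOR, WITHOUT_SPLICE_ACCEPTOR)


layout = DonorLayout(
    [
        "first_homology_region",
        "splice_acceptor",
        "kozak",
        "rag1_coding_sequence",
        "polyadenylation_sequence",
        "second_homology_region",
    ]
)
print(layout.has_recited_order())
```

The check deliberately ignores extra labels such as a Kozak or polyadenylation sequence, so the same function covers layouts with and without those optional elements.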

In some embodiments:

(i) the first homology region is homologous to a first region of the RAG1 intron 1 and the second homology region is homologous to a second region of the RAG1 intron 1; or

(ii) the first homology region is homologous to a first region of the RAG1 intron 1 or the RAG1 exon 2 and the second homology region is homologous to a second region of the RAG1 exon 2.

In some embodiments, the first homology region is homologous to a first region of the RAG1 intron 1 and the second homology region is homologous to a second region of the RAG1 intron 1.

In some embodiments, the first homology region is homologous to a first region of the RAG1 intron 1 and the second homology region is homologous to a second region of the RAG1 exon 2.

In some embodiments, the first homology region is homologous to a first region of the RAG1 exon 2 and the second homology region is homologous to a second region of the RAG1 exon 2.

In some embodiments:

(i) the first homology region is homologous to a region upstream of chr 11 : 36569295 and the second homology region is homologous to a region downstream of chr 11 : 36569298;

(ii) the first homology region is homologous to a region upstream of chr 11 : 36573790 and the second homology region is homologous to a region downstream of chr 11 : 36573793;

(iii) the first homology region is homologous to a region upstream of chr 11 : 36573641 and the second homology region is homologous to a region downstream of chr 11 : 36573644;

(iv) the first homology region is homologous to a region upstream of chr 11 : 36573351 and the second homology region is homologous to a region downstream of chr 11 : 36573354;

(v) the first homology region is homologous to a region upstream of chr 11 : 36569080 and the second homology region is homologous to a region downstream of chr 11 : 36569083;

(vi) the first homology region is homologous to a region upstream of chr 11 : 36572472 and the second homology region is homologous to a region downstream of chr 11 : 36572475;

(vii) the first homology region is homologous to a region upstream of chr 11 : 36571458 and the second homology region is homologous to a region downstream of chr 11 : 36571461;

(viii) the first homology region is homologous to a region upstream of chr 11 : 36571366 and the second homology region is homologous to a region downstream of chr 11 : 36571369;

(ix) the first homology region is homologous to a region upstream of chr 11 : 36572859 and the second homology region is homologous to a region downstream of chr 11 : 36572862;

(x) the first homology region is homologous to a region upstream of chr 11 : 36571457 and the second homology region is homologous to a region downstream of chr 11 : 36571460;

(xi) the first homology region is homologous to a region upstream of chr 11 : 36569351 and the second homology region is homologous to a region downstream of chr 11 : 36569354; or

(xii) the first homology region is homologous to a region upstream of chr 11 : 36572375 and the second homology region is homologous to a region downstream of chr 11 : 36572378.
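The twelve coordinate pairs in options (i) to (xii) above can be tabulated and checked mechanically. The Python sketch below simply records the pairs as listed and computes the offset between the two coordinates of each pair; the dictionary and function names are hypothetical and used only for illustration.

```python
# Coordinate pairs on chr 11 as listed in options (i) to (xii):
# (position the first homology region lies upstream of,
#  position the second homology region lies downstream of).
BOUNDARY_PAIRS = {
    "i": (36569295, 36569298),
    "ii": (36573790, 36573793),
    "iii": (36573641, 36573644),
    "iv": (36573351, 36573354),
    "v": (36569080, 36569083),
    "vi": (36572472, 36572475),
    "vii": (36571458, 36571461),
    "viii": (36571366, 36571369),
    "ix": (36572859, 36572862),
    "x": (36571457, 36571460),
    "xi": (36569351, 36569354),
    "xii": (36572375, 36572378),
}


def boundary_offsets(pairs):
    """Return, for each option, the second coordinate minus the first."""
    return {label: second - first for label, (first, second) in pairs.items()}


offsets = boundary_offsets(BOUNDARY_PAIRS)
# Options ordered by genomic position of the first coordinate.
ordered = sorted(BOUNDARY_PAIRS, key=lambda label: BOUNDARY_PAIRS[label][0])
print(offsets)
print(ordered)
```

Run as written, every option shows the same offset of 3 between its two coordinates, which is a quick consistency check when transcribing the list.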

In some embodiments:

(i) the first homology region is homologous to a region upstream of chr 11 : 36569295 and the second homology region is homologous to a region downstream of chr 11 : 36569298;

(ii) the first homology region is homologous to a region upstream of chr 11 : 36573351 and the second homology region is homologous to a region downstream of chr 11 : 36573354; or

(iii) the first homology region is homologous to a region upstream of chr 11 : 36571366 and the second homology region is homologous to a region downstream of chr 11 : 36571369.

In preferred embodiments, the first homology region is homologous to a region upstream of chr 11 : 36569295 and the second homology region is homologous to a region downstream of chr 11 : 36569298.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36573790 and the second homology region is homologous to a region downstream of chr 11 : 36573793.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36573641 and the second homology region is homologous to a region downstream of chr 11 : 36573644.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36573351 and the second homology region is homologous to a region downstream of chr 11 : 36573354.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36569080 and the second homology region is homologous to a region downstream of chr 11 : 36569083.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36572472 and the second homology region is homologous to a region downstream of chr 11 : 36572475.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36571458 and the second homology region is homologous to a region downstream of chr 11 : 36571461.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36571366 and the second homology region is homologous to a region downstream of chr 11 : 36571369.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36572859 and the second homology region is homologous to a region downstream of chr 11 : 36572862.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36571457 and the second homology region is homologous to a region downstream of chr 11 : 36571460.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36569351 and the second homology region is homologous to a region downstream of chr 11 : 36569354.

In some embodiments, the first homology region is homologous to a region upstream of chr 11 : 36572375 and the second homology region is homologous to a region downstream of chr 11 : 36572378.

In preferred embodiments, the first homology region is homologous to a region comprising chr 11 : 36569245-chr 11 : 36569294 and/or the second homology region is homologous to a region comprising chr 11 : 36569299-chr 11 : 36569348.
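The arithmetic behind the two coordinate ranges in the preceding paragraph can be sketched as follows. The sketch assumes the coordinates are 1-based and inclusive at both ends, which the text does not state explicitly; the function and variable names are hypothetical.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a range whose start and end positions are both included."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Ranges recited for the first and second homology regions (chr 11).
FIRST_REGION = (36569245, 36569294)
SECOND_REGION = (36569299, 36569348)

first_len = inclusive_length(*FIRST_REGION)
second_len = inclusive_length(*SECOND_REGION)
# Positions lying strictly between the two ranges.
between = SECOND_REGION[0] - FIRST_REGION[1] - 1

print(first_len, second_len, between)
```

Under that assumption each range spans 50 positions, and 4 positions (36569295 to 36569298) lie between them.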

In some embodiments, the 3’ terminal sequence of the first homology region comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 7 and/or the 5’ terminal sequence of the second homology region comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 19.

In some embodiments, the first homology region comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 31, or a fragment thereof and/or the second homology region comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 32, or a fragment thereof.

In some embodiments, the first and second homology regions are each 50-1000bp in length, 100-500 bp in length, or 200-400 bp in length.
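The three length ranges above are nested, so a given homology region length may satisfy one, two or all three of them. A minimal Python sketch, assuming the endpoints of each range are inclusive (an assumption, as the text does not say), with hypothetical names:

```python
# Length ranges recited above, narrowest first (base pairs, endpoints
# assumed inclusive).
LENGTH_RANGES = [
    ("200-400 bp", 200, 400),
    ("100-500 bp", 100, 500),
    ("50-1000 bp", 50, 1000),
]


def ranges_satisfied(length_bp: int) -> list:
    """Return the labels of all recited ranges that contain length_bp."""
    return [label for label, low, high in LENGTH_RANGES if low <= length_bp <= high]


for length in (300, 450, 50, 1200):
    print(length, ranges_satisfied(length))
```

For example, a 300 bp region falls within all three ranges, whereas a 50 bp region falls only within the broadest one.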

In some embodiments, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence encoding an amino acid sequence that has at least 70% identity to SEQ ID NO: 4 or SEQ ID NO: 5.

In some embodiments, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 6.

In some embodiments, the splice acceptor site comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 33.

In preferred embodiments, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a polyadenylation sequence, optionally wherein the polyadenylation sequence is a bGH polyadenylation sequence.

In some embodiments, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a polyadenylation sequence comprising or consisting of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 35.

In some embodiments, the nucleotide sequence encoding a RAG1 polypeptide is operably linked to a Kozak sequence, optionally wherein the Kozak sequence comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 36.

In some embodiments, the polynucleotide comprises or consists of a nucleotide sequence that has at least 70% identity to SEQ ID NO: 39.

In another aspect, the present invention provides a vector comprising the polynucleotide of the invention.

In some embodiments, the vector is a viral vector, optionally an adeno-associated viral (AAV) vector such as an AAV6 vector. In some embodiments, the vector is a lentiviral vector, such as an integration-defective lentiviral vector (IDLV).

In another aspect, the present invention provides a guide RNA comprising or consisting of a nucleotide sequence that has at least 90% identity to any of SEQ ID NOs: 41-52.

In another aspect, the present invention provides a guide RNA comprising or consisting of a nucleotide sequence that has at least 90% identity to any of SEQ ID NOs: 53-55.

In preferred embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 41. In preferred embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 53. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 42. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 43. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 44. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 45. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 46. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 47. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 48. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 49. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 50. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 51. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 52. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 54. In some embodiments, the guide RNA comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 55.
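To make the "at least 90% identity" threshold concrete, the sketch below computes a simple ungapped, position-by-position identity between two equal-length sequences. This is only one possible reading of sequence identity, assumed here for illustration; the function name `percent_identity` is hypothetical, and the 20-nucleotide strings are arbitrary placeholders, not sequences from the sequence listing.

```python
def percent_identity(query, reference):
    """Ungapped, position-wise identity (%) between equal-length sequences.

    Assumption for illustration only: no alignment or gaps are considered.
    """
    if not reference or len(query) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


# Arbitrary placeholder 20-mers (NOT taken from SEQ ID NOs: 41-55).
reference = "ACGTACGTACGTACGTACGT"
one_mismatch = "ACGTACGTACGTACGTACGA"     # differs at the last position
three_mismatches = "TCGTACGTACTTACGTACGA"  # differs at three positions

print(percent_identity(one_mismatch, reference) >= 90.0)      # True
print(percent_identity(three_mismatches, reference) >= 90.0)  # False
```

Under this ungapped reading, a 20-nucleotide sequence meets the 90% threshold with up to two mismatched positions.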

In some embodiments, from one to five of the terminal nucleotides at the 5’ end and/or 3’ end of the guide RNA are chemically modified to enhance stability, optionally wherein three terminal nucleotides at the 5’ end and/or 3’ end of the guide RNA are chemically modified to enhance stability, optionally wherein the chemical modification is modification with 2'-O-methyl 3'-phosphorothioate.

In another aspect, the present invention provides a kit comprising the polynucleotide or the vector of the invention.

In another aspect, the present invention provides a composition comprising the polynucleotide or the vector of the invention.

In another aspect, the present invention provides a gene-editing system comprising the polynucleotide or the vector of the invention.

In some embodiments, the kit, composition, or gene-editing system further comprises a guide RNA of the invention. In some embodiments, the kit, composition, or gene-editing system further comprises an RNA-guided nuclease, optionally wherein the RNA-guided nuclease is a Cas9 endonuclease.

In another aspect, the present invention provides for use of the polynucleotide, the vector, the kit, the composition, or the gene-editing system, for gene editing a cell or a population of cells. In some embodiments, the use is ex vivo or in vitro use.

In another aspect, the present invention provides a genome comprising the polynucleotide of the invention.

In another aspect, the present invention provides a genome comprising a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide located in the RAG1 intron 1 or RAG1 exon 2. In some embodiments, the splice acceptor sequence and the nucleotide sequence encoding RAG1 are located in the RAG1 intron 1.

In some embodiments:

(i) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36569295 to chr 11 : 36569298;

(ii) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36573790 to chr 11 : 36573793;

(iii) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36573641 to chr 11 : 36573644;

(iv) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36573351 to chr 11 : 36573354;

(v) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36569080 to chr 11 : 36569083;

(vi) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36572472 to chr 11 : 36572475;

(vii) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36571458 to chr 11 : 36571461;

(viii) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36571366 to chr 11 : 36571369;

(ix) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36572859 to chr 11 : 36572862;

(x) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36571457 to chr 11 : 36571460;

(xi) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36569351 to chr 11 : 36569354; or

(xii) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36572375 to chr 11 : 36572378.

In some embodiments:

(i) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36569295 to chr 11 : 36569298;

(ii) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36573351 to chr 11 : 36573354; or

(iii) the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36571366 to chr 11 : 36571369.

In some embodiments, the splice acceptor sequence and the nucleotide sequence encoding RAG1 replace chr 11 : 36569295 to chr 11 : 36569298.

In another aspect, the present invention provides a cell comprising the polynucleotide, the vector, or the genome of the invention.

In another aspect, the present invention provides a population of cells comprising one or more cells of the present invention.

In another aspect, the present invention provides a method of gene editing a population of cells comprising delivering the polynucleotide or the vector of the invention to a population of cells to obtain a population of gene-edited cells. In some embodiments, the method is an ex vivo or in vitro method.

In another aspect, the present invention provides a method of treating immunodeficiency in a subject in need thereof, comprising delivering the polynucleotide or the vector of the invention to a population of cells to obtain a population of gene-edited cells and administering the population of gene-edited cells to the subject.

In another aspect, the present invention provides a population of gene-edited cells obtainable by the method of the invention.

In another aspect, the present invention provides the polynucleotide, the vector, the guide RNA, the kit, the composition, or the gene-editing system, for use in treating immunodeficiency in a subject.

In another aspect, the present invention provides a method of treating a subject comprising administering a cell, a population of cells, or a population of gene edited cells of the present invention to the subject.

In another aspect, the present invention provides a method of treating immunodeficiency in a subject in need thereof comprising administering a cell, a population of cells, or a population of gene edited cells of the present invention to the subject.

In another aspect, the present invention provides a cell, a population of cells, or a population of gene edited cells of the present invention for use as a medicament. In another aspect, the present invention provides a cell, a population of cells, or a population of gene edited cells of the present invention for use in treating immunodeficiency in a subject.

DESCRIPTION OF DRAWINGS

Figure 1. Generation of NALM6 Cas9 and K562 Cas9 cell lines

A) Schematic representation of the gene correction approach; B) Schematic representation of the protocol for generation of K562 Cas9 and NALM6 Cas9 cell lines; C) Vector Copy Number (VCN) of the integrated Cas9-containing cassette measured by ddPCR, telomerase was used as normalizer; D) Cas9 expression for scaling doses of doxycycline measured by qPCR in NALM6 Cas9 (left panel) and K562 Cas9 (right panel) cell lines, represented as fold change vs actin.

Figure 2. Selection of the best performing gRNA

A) Schematic representation of the intronic and exonic loci targeted by the different gRNAs tested; B) Schematic representation of the experimental protocol; C) Percentages of NHEJ-induced indels in K562 Cas9 treated with different doses of plasmids encoding for different guides, 7 days after transfection, n=1; D) Percentages of NHEJ-induced indels in NALM6 Cas9 treated with different doses of plasmids encoding for guides 3, 7 and 9, 7 days after transfection, n=1; E) Percentages of NHEJ-induced indels in NALM6 Cas9 treated with different doses of guides 3 and 9 as in vitro preassembled RNPs, 7 days after transfection, n=1.

Figure 3. Donor DNA optimization

A) RAG1 gene expression measured by RT-qPCR, represented as fold change vs RAG1 expression in 293T cell line, actin was used as normalizer; B) Schematic representation of the different SA_GFP DNA donors tested; C) Schematic representation of the splicing mechanism with SA_GFP_SD donor; D) Percentage of targeted cells measured by flow cytometry as GFP+ cells, 7 days after transfection; E) GFP expression levels measured as Mean Fluorescence Intensity (MFI) gating on GFP+ events; F) Representative FlowJo plots; One-way ANOVA, Geisser-Greenhouse correction for multiple comparison, n=3. P values: *<0.05; **<0.005; ***<0.0005; ****<0.0001. Mean±SD are shown.

Figure 4. Off-target analysis

A) Table shows the top 10 off-target sites predicted by in silico COSMID tool for guide 9. The off-target sequence, type of PAM, score, number of mismatches and chromosomal position are shown. B-C) Cutting efficiency measured as percentage of NHEJ (D) and dsDNA tag integration (ODN) on target site are evaluated by RFLP in K562 cells. D-E) Plots show the coverage of on-target reads (chromosome 11) of guide 9 (D) and guide 7 (E) and off-target reads identified for guide 7 by relaxed constraints (chromosome 20 and 9).

Figure 5. Optimization of the gene editing protocol, guide 3 efficiency

A) Schematic representation of the gene editing protocol; B) Schematic representation of the gating strategy; C) Percentages of NHEJ-induced indels in hCB-CD34⁺ cells treated with different doses of guides 3 and 9 as in vitro preassembled RNPs, n=2; D) Percentage of targeted cells using guide 3, measured by flow cytometry as GFP⁺ cells in the hCD34⁺ gate, n=1; E) Percentage of targeted cells using guide 3, measured by flow cytometry as GFP⁺ cells in the three main hCD34⁺ cell subpopulations defined by hCD133 and hCD90 expression, n=1.

Figure 6. Optimization of the gene editing protocol, guide 9 efficiency

A) Percentages of viable cells measured by flow cytometry as 7AAD⁻/AnnexinV⁻ at day 4; B) Total number of cells at day 7 expressed as fold increase compared to day 3; C) Frequency of hCD34⁺ cells at day 7 measured by flow cytometry; D) Distribution of the 3 hCD34⁺ cell subpopulations measured by flow cytometry based on the expression of hCD133 and hCD90 at day 7; E) Frequency of targeted cells measured by flow cytometry as GFP⁺ cells in the 3 hCD34⁺ cell subpopulations based on the expression of hCD133 and hCD90 at day 7; F) Percentages of targeted cells measured by ddPCR at day 7, telomerase genomic site was used as normalizer; G) Total number of edited cells at day 7 calculated on frequency of targeted cells by ddPCR. One-way ANOVA, Geisser-Greenhouse correction for multiple comparison, n=3. P values: *<0.05; **<0.005; ***<0.0005; ****<0.0001. Mean±SD are shown.

Figure 7. In vivo transplantation of gene edited hCB-CD34⁺ cells

A) Percentages of targeted cells measured by ddPCR at day 4, telomerase genomic site was used as normalizer; B) Treated cell engraftment measured by flow cytometry as frequency of hCD45⁺ cells in peripheral blood (PB); C) Targeted cell engraftment measured in PB by flow cytometry as frequency of GFP⁺ cells in hCD45⁺ gate; D, F, H) B cell, T cell and myeloid cell frequency in PB measured as percentage of hCD19⁺ cells (D), hCD3⁺ cells (F), hCD13⁺ cells (H) in hCD45⁺ gate, respectively; E, G, I) Targeted cells among the B-cell, T-cell and myeloid-cell compartments in PB measured as GFP⁺ cells in the hCD19⁺ gate (E), hCD3⁺ gate (G) and hCD13⁺ gate (I), respectively; L) Frequency of hCD34⁺ cells measured by flow cytometry among hCD45⁺ cells in the bone marrow; M) Frequency of targeted cells measured by flow cytometry as GFP⁺ cells among hCD34⁺ cells in the bone marrow; N) Frequency of GFP⁺ expressing cells measured by flow cytometry, among different T-cell development stages in the thymus (according to the expression of hCD4 and hCD8), in the peripheral blood and in the spleen (according to the expression of hCD3, hCD4 and hCD8), 17 weeks after transplant. Mann-Whitney test at 17 weeks after transplant. Group size: SA_GFP n=5; PGK_GFP n=4. P values: *<0.05; **<0.005; ***<0.0005; ****<0.0001. Mean±SD are shown.

Figure 8. Test corrective donor on hMPB-CD34⁺ cells

A) Schematic representation of the corrective donor; B) Schematic representation of the experimental protocol; C) Percentages of targeted cells measured by ddPCR on sorted hCD34⁺ cell subpopulations according to the expression of hCD133 and hCD90 at day 4, telomerase genomic region was used as normalizer; D) Total number of cells at day 4 represented as fold increase compared to day 0. N=3.

Figure 9. In vivo transplantation of edited hMPB-CD34⁺ cells from HD and RAG1-patient

A) Schematic representation of the experimental groups; B) Percentage of targeted cells measured by ddPCR at day 4, telomerase genomic region was used as normalizer; C) Cell engraftment measured by flow cytometry in PB as frequency of hCD45⁺ cells; D) Frequency of targeted cells among human cells measured by ddPCR in PB 8 weeks after transplant, telomerase genomic region was used as normalizer; E) Immune cell distribution in PB of mice transplanted with treated or untreated MPB-CD34⁺ cells from HD, measured by flow cytometry according to the expression of hCD19, hCD3 and hCD13 in the hCD45⁺ gate; F) Immune cell distribution in PB of mice transplanted with treated or untreated MPB-CD34⁺ cells derived from a RAG1-patient, measured by flow cytometry according to the expression of hCD19, hCD3 and hCD13 in the hCD45⁺ gate; G, H) Analyses in bone marrow (G) and spleen (H) of the proportion of human engraftment measured as frequency of hCD45⁺ cells by flow cytometry (left panels) and of targeting efficiency measured as HDR by ddPCR (right panels). Mean±SD are shown.

Figure 10. Multiparametric analysis of hMPB-CD34⁺ cells from HD and RAG1-patient before and after gene editing manipulation.

A, B) Analysis of HSPC composition was performed in MPB-CD34⁺ cells derived from a healthy donor (HD, A) and a RAG1-patient (Pt, B) by flow cytometry. The analysis was performed before the expansion phase (day-3) and 1 day after the gene editing procedure (GE). Untreated cells (UT) were also analyzed on the same day as edited cells. Graphs show 20 subtypes analyzed in the Lineage negative (Lin⁻) CD34⁺ gate including: Hematopoietic Stem cells (HSC), Multipotent Progenitors (MPP), Multi-Lymphoid Progenitors (MLP), Early T Progenitors (ETP), B and NK cell precursors (Pre-B/NK), common myeloid progenitors (CMP), granulocyte-monocyte progenitors (GMP), megakaryoerythroid progenitors (MEP), megakaryocyte progenitors (MKp) and erythroid progenitors (EP).

Figure 11. Donor Screening for RAG1 editing.

A) Schematic representations of donor constructs. HA_L, left homology arm; HA_R, right homology arm; SA, splice acceptor; SD, splice donor; BGHpA, bovine growth hormone poly A; WPRE, Woodchuck hepatitis virus post-transcriptional regulatory element; IRES, the internal ribosome entry site sequence; PEST, proline (P), glutamic acid (E), serine (S), and threonine (T). B) Schematic representation of the experimental protocol. C) GFP expression levels shown as Mean Fluorescence Intensity (MFI) gating on GFP+ events measured by flow cytometry over time (d, days after editing). D) Modulation of GFP expression in serum starved cells is shown as ratio of GFP MFI of starved cells (- FBS) and GFP MFI of not starved cells (+ FBS) (1 experiment representative of 3).

Figure 12. Editing enhancer effects on HDR efficiency of RAG1 locus.

A) Schematic representation of the gene editing protocol (upper panel) and artificial thymic organoid protocol (ATO) (lower panel). B) HDR efficiency is shown as percentages of edited alleles measured by ddPCR 7 days after editing; C) Frequency of targeted cells measured by flow cytometry as GFP⁺ cells among hCD34⁺ subsets 7 days after editing; D) Analysis of HSPC composition was performed in MPB or BM CD34⁺ cells derived from a healthy donor by flow cytometry. The analysis was performed before the expansion phase (day 0) and 1 day after the gene editing procedure (GE, day 4). Untreated cells (UT) were also analyzed on the same day as edited cells. Graphs show 20 subtypes analyzed in the Lineage negative (Lin⁻) CD34⁺ gate including: Hematopoietic Stem cells (HSC), Multipotent Progenitors (MPP), Multi-Lymphoid Progenitors (MLP), Early T Progenitors (ETP), B and NK cell precursors (Pre-B/NK), common myeloid progenitors (CMP), granulocyte-monocyte progenitors (GMP), megakaryoerythroid progenitors (MEP), megakaryocyte progenitors (MKp) and erythroid progenitors (EP).

Figure 13. Editing enhancer effects on T cell differentiation potential.

A) Representative images of artificial thymic organoid (ATO) 4 weeks after ATO seeding with untreated cells (UT) or edited cells with or without HDR enhancers. B) Total number of cells harvested from ATOs 4 weeks after ATO seeding. C) HDR efficiency is shown as percentages of edited alleles measured by ddPCR in bulk differentiated T cells 4 weeks after ATO seeding. D) HDR efficiency is measured as percentage of GFP+ cells within distinct T cell subpopulations by flow cytometry 4 weeks after ATO seeding.

Figure 14. Donor constructs for the intronic correction strategy.

Schematic representation of the SA_coRAG1 CDS_BGHpA (A) and SA_coRAG1 CDS_SD (B) donor templates used for the intronic correction strategy. HA, homology arm; SA, splice acceptor; SD, splice donor; coRAG1 CDS, codon optimized RAG1 coding sequence; BGHpA, bovine growth hormone poly A; Ex., exon; gRNA, guide RNA; 3’UTR, 3’ untranslated region; HDR, homology directed repair.

Figure 15. Corrective donor comparison in NALM6.Rag1KO cells.

(A) Schematic representation of the experiment performed to compare the correction efficacy of the two donors: the SA_coRAG1 CDS_BGHpA vs the SA_coRAG1 CDS_SD donor. (B) RAG1 CDS expression was evaluated in various NALM6.Rag1KO edited clones by RT-qPCR and measured as relative expression to the housekeeping beta-actin. (C) Recombination activity was evaluated 7 days after serum-starvation as proportion of GFP+ cells gated on transduced cells by flow cytometry.

Figure 16. Corrective donor comparison in HD-HSPC.

(A) Hematopoietic stem and progenitor cells were edited by guide 9 and Cas9 as RNP in combination with SA_coRAG1 CDS_BGHpA or SA_coRAG1 CDS_SD donor. The proportion of edited alleles was evaluated by ddPCR in bulk HSPC 4 days after the editing. (B) The proportion of edited alleles was evaluated by ddPCR in HSPC subsets isolated by cell sorting. (C) Kinetics of cell growth in untreated (UT) or edited HSPC according to the indicated donors, doses and days after gene editing (GE). (D) Colony forming unit (CFU) assay was performed on untreated or edited HSPC by counting the number of red (erythroid), white (myeloid) and mixed colonies under a microscope 14 days after the plating. (E) Distribution of the CD34+ cell subpopulations and CD34- cells measured by flow cytometry based on the expression of hCD133 and hCD90 analysed 4 days after the editing. (F) Representative plots of the T cell differentiation stages analysed by flow cytometry 7 weeks after ATO seeding. (G) HDR efficiency is measured as proportion of edited alleles in bulk, CD4+ CD8+ double positive (DP) cells and CD4- CD8- double negative (DN) cells by flow cytometry 6 weeks after ATO seeding.

DETAILED DESCRIPTION

It must be noted that as used herein and in the appended claims, the singular forms "a", "an", and "the" include plural referents unless the context clearly dictates otherwise. The terms "comprising", "comprises" and "comprised of" as used herein are synonymous with "including", "includes" or "containing", "contains", and are inclusive or open-ended and do not exclude additional, non-recited members, elements or method steps. The terms "comprising", "comprises" and "comprised of" also include the term "consisting of".

Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, any nucleic acid sequences are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.

All recited genomic locations are based on human genome assembly GRCh38.p13 (GCF_000001405.39). One of skill in the art will be able to identify the corresponding genome locations in alternative genome assemblies and convert the recited genomic location accordingly. For example, RAG1 is located at chr 11 : 36510353 to 36579762 in assembly GRCh38.p13 and at chr 11 : 36532053 to 36601312 in assembly GRCh37.p13.
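The conversion between assemblies can be illustrated with the two RAG1 coordinate pairs recited in this paragraph. The minimal sketch below only subtracts the recited endpoints; the names `RAG1_LOCUS` and `endpoint_offsets` are hypothetical. The start and end offsets it reports are not equal, so a single fixed shift would not reproduce both recited endpoints, which suggests using an assembly-aware conversion tool rather than simple arithmetic.

```python
# RAG1 locus endpoints as recited in the description (chr 11).
RAG1_LOCUS = {
    "GRCh38.p13": (36510353, 36579762),
    "GRCh37.p13": (36532053, 36601312),
}


def endpoint_offsets(source, target):
    """Return (start offset, end offset) going from `source` to `target`."""
    source_start, source_end = RAG1_LOCUS[source]
    target_start, target_end = RAG1_LOCUS[target]
    return target_start - source_start, target_end - source_end


start_offset, end_offset = endpoint_offsets("GRCh38.p13", "GRCh37.p13")
print(start_offset, end_offset)  # 21700 21550
```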

The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that such publications constitute prior art to the claims appended hereto.

Recombination activating gene 1 (RAG1)

The present invention relates to methods for gene-editing cells to introduce a RAG1 polypeptide, for example as a treatment for severe combined immunodeficiency. The present invention also relates to polynucleotides, vectors, guide RNAs, kits, compositions, and gene editing systems for use in said methods, and genomes and cells obtained or obtainable by said methods.

“RAG1” is the abbreviated name of the polypeptide encoded by recombination activating gene 1 and is also known as RAG-1, RNF74, and recombination activating 1.

RAG1 is the catalytic component of the RAG complex, a multiprotein complex that mediates the DNA cleavage phase during V(D)J recombination. V(D)J recombination assembles a diverse repertoire of immunoglobulin and T-cell receptor genes in developing B- and T-lymphocytes through rearrangement of different V (variable), in some cases D (diversity), and J (joining) gene segments. In the RAG complex, RAG1 mediates the DNA-binding to the conserved recombination signal sequences (RSS) and catalyses the DNA cleavage activities by introducing a double-strand break between the RSS and the adjacent coding segment. RAG2 is not a catalytic component but is required for all known catalytic activities.

A “RAG1 polypeptide” is a polypeptide having RAG1 activity, for example a polypeptide which is able to form a RAG complex, mediate DNA-binding to the RSS, and introduce a double-strand break between the RSS and the adjacent coding segment. Suitably, a RAG1 polypeptide may have the same or similar activity to a wild-type RAG1, e.g. may have at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 100%, at least 110%, at least 120%, at least 130%, at least 140%, or at least 150% of the activity of a wild-type RAG1 polypeptide.

The RAG1 polypeptide may be a fragment of RAG1 and/or a RAG1 variant.

A “fragment of RAG1” may refer to a portion or region of a full-length RAG1 polypeptide that has the same or similar activity as a full-length RAG1 polypeptide, i.e. the fragment may be a functional fragment. The fragment may have at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of the activity of a full-length RAG1 polypeptide. A person skilled in the art would be able to generate fragments based on the known structural and functional features of RAG1. These are described, for instance, in Arbuckle, J.L., et al., 2011. BMC biochemistry, 12(1), p.23; Ru, H., et al., 2015. Cell, 163(5), pp.1138-1152; and Kim, M.S., et al., 2015. Nature, 518(7540), pp.507-511.

The minimal regions of RAG1 required for catalysis have been identified. These regions are referred to as the core proteins. Core RAG1 consists of multiple structural domains, termed the nonamer binding domain (NBD; residues 389-464), the central domain (residues 528-760), and the C-terminal domain (residues 761-980). Besides the ability to recognize the RSS nonamer and heptamer through the NBD and the central domain, respectively, core RAG1 contains the essential acidic active site residues (Arbuckle, J.L., et al., 2011. BMC biochemistry, 12(1), p.23). Suitably, a fragment of RAG1 comprises the nonamer binding domain, the central domain, and/or the C-terminal domain.
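The residue ranges above lend themselves to a simple check of which core domains a candidate fragment would contain in full. The following is a minimal sketch under the assumption of 1-based inclusive residue numbering; the names `CORE_RAG1_DOMAINS` and `domains_in_fragment`, and the example fragment boundaries, are hypothetical and chosen only for illustration.

```python
# Core RAG1 domain boundaries as recited above (1-based, inclusive residues).
CORE_RAG1_DOMAINS = {
    "nonamer binding domain": (389, 464),
    "central domain": (528, 760),
    "C-terminal domain": (761, 980),
}


def domains_in_fragment(frag_start, frag_end):
    """List the core domains lying entirely within [frag_start, frag_end]."""
    return [
        name
        for name, (dom_start, dom_end) in CORE_RAG1_DOMAINS.items()
        if frag_start <= dom_start and dom_end <= frag_end
    ]


# Hypothetical fragments, for illustration only.
print(domains_in_fragment(389, 980))
# ['nonamer binding domain', 'central domain', 'C-terminal domain']
print(domains_in_fragment(500, 980))
# ['central domain', 'C-terminal domain']
```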

A “RAG1 variant” may include an amino acid sequence or a nucleotide sequence which may be at least 50%, at least 55%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85% or at least 90% identical, optionally at least 95% or at least 97% or at least 99% identical to a wild-type RAG1 polypeptide. RAG1 variants may have the same or similar activity to a wild-type RAG1 polypeptide, e.g. may have at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 100%, at least 110%, at least 120%, at least 130%, at least 140%, or at least 150% of the activity of a wild-type RAG1 polypeptide. A person skilled in the art would be able to generate RAG1 variants based on the known structural and functional features of RAG1 and/or using conservative substitutions.

The gene encoding RAG1 (NCBI gene ID: 5896) is located in the human genome at chr 11 : 36510353 to 36579762.

Several alternative mRNAs are transcribed from the RAG1 gene. Transcript variant 1 (NM_000448) has two exons and one intron. As used herein, the region of the RAG1 gene corresponding to the first exon of transcript variant 1 is called the “RAG1 exon 1”, the region of the RAG1 gene corresponding to the intron of transcript variant 1 is called the “RAG1 intron 1”, and the region of the RAG1 gene corresponding to the second exon (which encodes a RAG1 polypeptide) is called the “RAG1 exon 2”.

Suitably, the RAG1 exon 1 is from chr 11 : 36568006 to chr 11 : 36568122; the RAG1 intron 1 is from chr 11 : 36568123 to chr 11 : 36573290; and/or the RAG1 exon 2 is from chr 11 : 36573291 to chr 11 : 36579762.
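The region boundaries above can be used to work out region lengths and to determine which region a recited genomic position falls in. The sketch below assumes 1-based inclusive coordinates on chr 11 of GRCh38.p13; the names `RAG1_REGIONS`, `region_length` and `locate` are hypothetical helpers introduced for illustration.

```python
# Region boundaries as recited above (chr 11, 1-based, inclusive).
RAG1_REGIONS = {
    "RAG1 exon 1": (36568006, 36568122),
    "RAG1 intron 1": (36568123, 36573290),
    "RAG1 exon 2": (36573291, 36579762),
}


def region_length(name):
    """Length in nucleotides of the named region."""
    start, end = RAG1_REGIONS[name]
    return end - start + 1


def locate(position):
    """Return the name of the region containing `position`, or None."""
    for name, (start, end) in RAG1_REGIONS.items():
        if start <= position <= end:
            return name
    return None


print(region_length("RAG1 exon 1"))    # 117
print(region_length("RAG1 intron 1"))  # 5168
print(locate(36569295))                # RAG1 intron 1
print(locate(36573790))                # RAG1 exon 2
```

With these boundaries, the position chr 11 : 36569295 recited in the description lies in the RAG1 intron 1, whereas chr 11 : 36573790 lies in the RAG1 exon 2.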

Suitably, the RAG1 exon 1 consists of the nucleotide sequence of SEQ ID NO: 1, or variants thereof; the RAG1 intron 1 consists of the nucleotide sequence of SEQ ID NO: 2, or variants thereof; and/or the RAG1 exon 2 consists of the nucleotide sequence of SEQ ID NO: 3, or variants thereof.

Illustrative RAG1 exon 1 (SEQ ID NO: 1) agaaacaagagggcaaggagagagcagagaacacactttgccttctctttggtattgagt aatatcaaccaaattgc agacatctcaacactttggccaggcagcctgctgagcaag

Illustrative RAG1 intron 1 (SEQ ID NO: 2) gtaacactcatacttttcatgccttgagccaaaatatttattacatttttatgtttctaa ctagaagtgcttgagctttttttccttcc aggtgatgaggggatggaatgagcaaagctacatcaatttttttttaatgtatgaaaata aaaaaggtacaagaggcc aagtttagggccactgaaggttcatagaaagatgcaaaatatctgaattactataaatga atgctattgtcagaggaaa ggtttaaggagtgcttcttgaatgaatgtgtacaaatcagcagaaggtaaggtgtgagac tcttggaaatgaatactggt agttcaggtgagaaaaataatcaggaacataatagggtgggaggaaatgtatggtttccc aggtattaacaagtattg ccaggcatttcctgaactagattggcctaagtaggagaccaatgtttctcaaaatattca ctcattttagaatcactgaatg tttaaaaatgcaatttctggattccttcccaaacagccagactctttgggacctgatgat ctgcatttctttttaaaaacaaa ctcgctcatgattctgatttgtattaattttgagaattgccatggtagagaccctgcttt gaggttatgttcttgagtcaggattc ctggccagggattgtgatgatatatttctctttctgaagtggttcatgcaagaggttgtc tgaaggaagagcaagaattgt agtgttattttgtggatacttgagacttataaaaaggctttttattttgtcacatttttg atacatgatgtttggcaaaaaacaga cgatagtatttgcagagtgaatgaataagtggaacaggtgtgataatgagaggtcacact tgagcacacagttattact tggaaattgtgtacagactaagttgaagatgttaggagggaagattgtgggccaagtaac ggggtgtatgtgtgtgggt atagggtgggcagctgggatggaaatggggggctgctgctgctgctgcaccctggcctcc tgaactaatgatatcact caccagaaactactgttcctgcactgtccaagccaccccaaactagtttgtcaaaatgaa tctgtgctgtgtggaggga ggcacgcctgtagctctgatgtcagatggcaatgtcgagatggcagtggccggtggggac agggctgagccagcac caaccactcagcctttgagatcccgaggctggtctactgctgagaccttttgttagaaga gaggagatcaagcatttgc aaggtttctgagtgtcaaaatatgaatccaagataactctttcacaatcctaacttcatg ctgtctacaggtccatattttag cctgctttctccatgttcatccgaaaagaaagaaaagctaagggtggtggtcatatttga aattagccagatcttaagtttt tctgggggaaatttagaagaaaatatggaaaagtgactatgagcacatatacagctagtc tttaaaacagttttatcca aaataaatgtatcacaaaattaataaaaatagttacttgcttgttttgaataattcaaat gatacaaaaattaataaaataa aaagtgcaaaaggccctcttatcaatgccaattctatttttttcagaaattaaacactgt taagattttagtgtgtatcctttca gaattcctgtgatttcatatatgtacaaatacaaacgtatctacataaagggaatcctac tatacttgctattgtcattctattc tctgctttttcatgtgagcatctttccatgtcactgatgcatacagaaattgcacatatg catcagtgcatacagaaaattaa 
attttctgcatggttttccactgtatgtctggaccatagtttatttaataataatgccct ttgggtaattatttatattgtttcctgcttt ttcaaagtaacagcttttgaaacaaatctctctctgtctttatataaatattgttgcatt cctgtggaaatgtttctattggataa cttcccaaaaggagatttattgcatcaaagataatatattcaaaaattttaaagatattg ctaaattgtctagtaggtatttta taccaatttatactcctcccaagaatgtatggagatatcttaatttctccatgccttcat taatgctgaaccatataagtagttt taatctttgctaattgaatagataaaaaatatctaatctaagtctagttcttaaaagttc tatcttctaccaaaagtaatacac gtctattttagggagtaaaaatcacaagtaaggataaaaaatagtgcagcaataaacaca ggagtgtagatgtctctg aacatactgatttaacttcctttggataaatacccagtagtaggactgctggatcatata ataattctatctttagtttttttgag gacctccatactattcttcatagtggctgtactaatttacattcctaccaactgtgtatg aaggttcccttttctctacatccttg ccagcattcattattgcttgtcatttggatacaatctattttaactggggtgagatgaca tctcattgtagttttgatatgcatttc tctgatgatcagtggtgttgagcaccttttcatatacctgtttgccatttgtatgtcttc ctttgagaaatgtctattcagatatttt acctattttaaaatcggattattagattgtttcctgtagagttgtttgagctccttgtat attctggttattaatctcttgtcagatgc atagcttacaaatattttctcccatcatgtggattgtgtcttcactttgtggattgttta ctttgctgtgcagaagcttttaacttga tgcaatcccatttgtccacttttgctttggttgccttccacaggagtatttaaataaatg tagtttggtagattttggtatagtaat gcaggccagtgggagtcaggggagaaatgtgtagggaagtgagatagttctaaggatcct acaaacatgccttatga ttgacttactcaatgtgaaagtcaatattaaacttgatgagctctagagatggtcatgca ttttaaaaagaattactcaaaa tattgtcttggaataccagagagcaagtgctttaagtataggctgggaagtaaaatgcta aaggaatgagaaggcattt ggggttgagttcaacctaagaggcaggggagccacagggaaagacctagcacctgccaca gaagagaattagg aagcagaattgaactataagcaattttgaggtgttcgttgggctgcagttgaaatatttt ttgaggttaatgagacatttgaa atggccgtgtattgtttaactcttgcatagtcctgcatagggaacaatctaataggattt ctctgtgaatcaagtcttagaaa tttgcttttaatttttatgaaaaacgcccatttctttgtttttgagacagagtcctgctc tgtcatccaggctgggttgcagtggc gtgatcttggcccactgcaatctctgcctcctgggttcaggcaattttcctgtctcagcc tcccgagtagctgggatttcaa gtgcctgccaccatgcccggctaaatttttttgtatttttggtacagatggagtatcacc atgttggccaggctggtctcgaa ctcctgacctcaagtgattcaccagccttgacctcccaaagtgttgggatcacaggcatg agccactgtgcctgtgccc 
caaaacaccaatttctgatgtgtgatgcatgtaagatagaacaaacttcagtaaagcggg gacttgaaaagaggcttt ggtaacagctgtcagcattaacccttgcccctccgtacctcctaatcccacccctgctca aagtatgttcatctgagaattt gtctccataactatgtgactataaaaattctcatcgattttgttagttgatcaattgagg gaaaaacatatgttacttgatata actggtgggtcaaaagaattaacccaggcaaatttgagataggtggatgggatgatggat tgaaaatacagctgctct ctttccaatcatgtactaagtaatttgggaaagattgatctaattgggtctagagagtac acttcacatggcattgtttgactt tttttctgcatcgctagcgatctgtgcattacaactcaaatcagtcgggtttcctggcat atgtaattgccaatgttttttacca gaagagaaacattactcccacctcttcttattatgttacaaactatagtgctaatgacca tcgaccaacagtgactttcag gatgacctgtgtgagttttatctgaaaccatgtgaatttttcatcttaaaagtcccttag aatctcagtctatgtacactcaggt ttgttgcaggtttagagttccgtgttttttgtttctaatgtagacacagccttataattt acaacagcattcactaattaaaattgt aagcataattactatccacgatacttattattagtttgcattcataaagctcaaaattca cttcatcctttcaagtagtgaata attagtttctttgggtttgcagctttatcatccttttatgacccatttggaagaaataaa caaccaaccccctggaagactgc tttaaaaagctggaaatacattgtccagctagtacaatgaggctaatacaatgtggaaaa tattacttttctttgattttagt agcctgtttatctttacatttactgaacaaataactattgagcacctaatgtatactggg acccttggggaggcaaagatg aatcaaagattctgtccttaaagaccttaaggtttttgtggaaggaaataaaactttaca tgtatatatttaagcacttatat gtgtgtaacaggtataagtaaccataaacactgtcagaagaggaaataactctatgatca gcacctaacatgatatatt aaggtagaagatttaatacatatcttttggaatacatgaataaataattgaatgtattta tttttattatttataagatacatca gtgggatattgatattggtcttaatatgacttgttttcattgttctcag

Illustrative RAG1 exon 2 (SEQ ID NO: 3) gtacctcagccagcATGGCAGCCTCTTTCCCACCCACCTTGGGACTCAGTTCTGCCCC AGATGAAATTCAGCACCCACATATTAAATTTTCAGAATGGAAATTTAAGCTGTTC CGGGTGAGATCCTTTGAAAAGACACCTGAAGAAGCTCAAAAGGAAAAGAAGGAT TCCTTTGAGGGGAAACCCTCTCTGGAGCAATCTCCAGCAGTCCTGGACAAGGC TGATGGTCAGAAGCCAGTCCCAACTCAGCCATTGTTAAAAGCCCACCCTAAGTT TTCAAAGAAATTTCACGACAACGAGAAAGCAAGAGGCAAAGCGATCCATCAAGC CAACCTTCGACATCTCTGCCGCATCTGTGGGAATTCTTTTAGAGCTGATGAGCA CAACAGGAGATATCCAGTCCATGGTCCTGTGGATGGTAAAACCCTAGGCCTTTT ACGAAAGAAGGAAAAGAGAGCTACTTCCTGGCCGGACCTCATTGCCAAGGTTTT CCGGATCGATGTGAAGGCAGATGTTGACTCGATCCACCCCACTGAGTTCTGCC ATAACTGCTGGAGCATCATGCACAGGAAGTTTAGCAGTGCCCCATGTGAGGTTT ACTTCCCGAGGAACGTGACCATGGAGTGGCACCCCCACACACCATCCTGTGAC

ATCTGCAACACTGCCCGTCGGGGACTCAAGAGGAAGAGTCTTCAGCCAAACTT GCAGCTCAGCAAAAAACTCAAAACTGTGCTTGACCAAGCAAGACAAGCCCGTCA GCACAAGAGAAGAGCTCAGGCAAGGATCAGCAGCAAGGATGTCATGAAGAAGA TCGCCAACTGCAGTAAGATACATCTTAGTACCAAGCTCCTTGCAGTGGACTTCC CAGAGCACTTTGTGAAATCCATCTCCTGCCAGATCTGTGAACACATTCTGGCTG ACCCTGTGGAGACCAACTGTAAGCATGTCTTTTGCCGGGTCTGCATTCTCAGAT GCCTCAAAGTCATGGGCAGCTATTGTCCCTCTTGCCGATATCCATGCTTCCCTA CTGACCTGGAGAGTCCAGTGAAGTCCTTTCTGAGCGTCTTGAATTCCCTGATGG TGAAATGTCCAGCAAAAGAGTGCAATGAGGAGGTCAGTTTGGAAAAATATAATC ACCACATCTCAAGTCACAAGGAATCAAAAGAGATTTTTGTGCACATTAATAAAGG GGGCCGGCCCCGCCAACATCTTCTGTCGCTGACTCGGAGAGCTCAGAAGCACC

GGCTGAGGGAGCTCAAGCTGCAAGTCAAAGCCTTTGCTGACAAAGAAGAAGGT

GGAGATGTGAAGTCCGTGTGCATGACCTTGTTCCTGCTGGCTCTGAGGGCGAG

GAATGAGCACAGGCAAGCTGATGAGCTGGAGGCCATCATGCAGGGAAAGGGCT

CTGGCCTGCAGCCAGCTGTTTGCTTGGCCATCCGTGTCAACACCTTCCTCAGCT

GCAGTCAGTACCACAAGATGTACAGGACTGTGAAAGCCATCACAGGGAGACAG

ATTTTTCAGCCTTTGCATGCCCTTCGGAATGCTGAGAAGGTACTTCTGCCAGGC

TACCACCACTTTGAGTGGCAGCCACCTCTGAAGAATGTGTCTTCCAGCACTGAT

GTTGGCATTATTGATGGGCTGTCTGGACTATCATCCTCTGTGGATGATTACCCA

GTGGACACCATTGCAAAGAGGTTCCGCTATGATTCAGCTTTGGTGTCTGCTTTG

ATGGACATGGAAGAAGACATCTTGGAAGGCATGAGATCCCAAGACCTTGATGAT

TACCTGAATGGCCCCTTCACTGTGGTGGTGAAGGAGTCTTGTGATGGAATGGG

AGACGTGAGTGAGAAGCATGGGAGTGGGCCTGTAGTTCCAGAAAAGGCAGTCC

GTTTTTCATTCACAATCATGAAAATTACTATTGCCCACAGCTCTCAGAATGTGAA

AGTATTTGAAGAAGCCAAACCTAACTCTGAACTGTGTTGCAAGCCATTGTGCCTT

ATGCTGGCAGATGAGTCTGACCACGAGACGCTGACTGCCATCCTGAGTCCTCT

CATTGCTGAGAGGGAGGCCATGAAGAGCAGTGAATTAATGCTTGAGCTGGGAG

GCATTCTCCGGACTTTCAAGTTCATCTTCAGGGGCACCGGCTATGATGAAAAAC

TTGTGCGGGAAGTGGAAGGCCTCGAGGCTTCTGGCTCAGTCTACATTTGTACTC

TTTGTGATGCCACCCGTCTGGAAGCCTCTCAAAATCTTGTCTTCCACTCTATAAC

CAGAAGCCATGCTGAGAACCTGGAACGTTATGAGGTCTGGCGTTCCAACCCTTA

CCATGAGTCTGTGGAAGAACTGCGGGATCGGGTGAAAGGGGTCTCAGCTAAAC

CTTTCATTGAGACAGTCCCTTCCATAGATGCACTCCACTGTGACATTGGCAATG

CAGCTGAGTTCTACAAGATCTTCCAGCTAGAGATAGGGGAAGTGTATAAGAATC

CCAATGCTTCCAAAGAGGAAAGGAAAAGGTGGCAGGCCACACTGGACAAGCAT

CTCCGGAAGAAGATGAACCTCAAACCAATCATGAGGATGAATGGCAACTTTGCC

AGGAAGCTCATGACCAAAGAGACTGTGGATGCAGTTTGTGAGTTAATTCCTTCC

GAGGAGAGGCACGAGGCTCTGAGGGAGCTGATGGATCTTTACCTGAAGATGAA

ACCAGTATGGCGATCATCATGCCCTGCTAAAGAGTGCCCAGAATCCCTCTGCCA

GTACAGTTTCAATTCACAGCGTTTTGCTGAGCTCCTTTCTACGAAGTTCAAGTAT

AGGTATGAGGGAAAAATCACCAATTATTTTCACAAAACCCTGGCCCATGTTCCTG

AAATTATTGAGAGGGATGGCTCCATTGGGGCATGGGCAAGTGAGGGAAATGAG

TCTGGTAACAAACTGTTTAGGCGCTTCCGGAAAATGAATGCCAGGCAGTCCAAA

TGCTATGAGATGGAAGATGTCCTGAAACACCACTGGTTGTACACCTCCAAATAC

CTCCAGAAGTTTATGAATGCTCATAATGCATTAAAAACCTCTGGGTTTACCATGA

ACCCTCAGGCAAGCTTAGGGGACCCATTAGGCATAGAGGACTCTCTGGAAAGC

CAAGATTCAATGGAATTTTAAgtagggcaaccacttatgagttggtttttgcaattg agtttccctctgggttg cattgagggcttctcctagcaccctttactgctgtgtatggggcttcaccatccaagagg tggtaggttggagtaagatgc tacagatgctctcaagtcaggaatagaaactgatgagctgattgcttgaggcttttagtg agttccgaaaagcaacagg aaaaatcagttatctgaaagctcagtaactcagaacaggagtaactgcaggggaccagag atgagcaaagatctgt gtgtgttggggagctgtcatgtaaatcaaagccaaggttgtcaaagaacagccagtgagg ccaggaaagaaattggt cttgtggttttcatttttttcccccttgattgattatattttgtattgagatatgataag tgccttctatttcatttttgaataattcttcatt tttataattttacatatcttggcttgctatataagattcaaaagagctttttaaattttt ctaataatatcttacatttgtacagcatg atgacctttacaaagtgctctcaatgcatttacccattcgttatataaatatgttacatc aggacaactttgagaaaatcagt ccttttttatgtttaaattatgtatctattgtaaccttcagagtttaggaggtcatctgc tgtcatggatttttcaataatgaatttag aatacacctgttagctacagttagttattaaatcttctgataatatatgtttacttagct atcagaagccaagtatgattctttat ttttactttttcatttcaagaaatttagagtttccaaatttagagcttctgcatacagtc ttaaagccacagaggcttgtaaaa atataggttagcttgatgtctaaaaatatatttcatgtcttactgaaacattttgccaga ctttctccaaatgaaacctgaatc aatttttctaaatctaggtttcatagagtcctctcctctgcaatgtgttattctttctat aatgatcagtttactttcagtggattcag aattgtgtagcaggataaccttgtatttttccatccgctaagtttagatggagtccaaac gcagtacagcagaagagtta acatttacacagtgctttttaccactgtggaatgttttcacactcatttttccttacaac aattctgaggagtaggtgttgttatta tctccatttgatgggggtttaaatgatttgctcaaagtcatttaggggtaataaatactt ggcttggaaatttaacacagtcct tttgtctccaaagcccttcttctttccaccacaaattaatcactatgtttataaggtagt atcagaatttttttaggattcacaac taatcactatagcacatgaccttgggattacatttttatggggcaggggtaagcaagttt ttaaatcatttgtgtgctctggct cttttgatagaagaaagcaacacaaaagctccaaagggccccctaaccctcttgtggctc cagttatttggaaactatg atctgcatccttaggaatctgggatttgccagttgctggcaatgtagagcaggcatggaa ttttatatgctagtgagtcata atgatatgttagtgttaattagttttttcttcctttgattttattggccataattgctac tcttcatacacagtatatcaaagagcttg ataatttagttgtcaaaagtgcatcggcgacattatctttaattgtatgtatttggtgct tcttcagggattgaactcagtatcttt cattaaaaaacacagcagttttccttgctttttatatgcagaatatcaaagtcatttcta atttagttgtcaaaaacatataca 
tattttaacattagtttttttgaaaactcttggttttgtttttttggaaatgagtgggcc actaagccacactttcccttcatcctgct taatccttccagcatgtctctgcactaataaacagctaaattcacataatcatcctattt actgaagcatggtcatgctggtt tatagattttttacccatttctactctttttctctattggtggcactgtaaatactttcc agtattaaattatccttttctaacactgta ggaactattttgaatgcatgtgactaagagcatgatttatagcacaacctttccaataat cccttaatcagatcacattttga taaaccctgggaacatctggctgcaggaatttcaatatgtagaaacgctgcctatggttt tttgcccttactgttgagactg caatatcctagaccctagttttatactagagttttatttttagcaatgcctattgcaagt gcaattatatactccagggaaattc accacactgaatcgagcatttgtgtgtgtatgtgtgaagtatatactgggacttcagaag tgcaatgtatttttctcctgtga aacctgaatctacaagttttcctgccaagccactcaggtgcattgcagggaccagtgata atggctgatgaaaattgat gattggtcagtgaggtcaaaaggagccttgggattaataaacatgcactgagaagcaaga ggaggagaaaaagat gtctttttcttccaggtgaactggaatttagttttgcctcagatttttttcccacaagat acagaagaagataaagatttttttgg ttgagagtgtgggtcttgcattacatcaaacagagttcaaattccacacagataagaggc aggatatataagcgccag tggtagttgggaggaataaaccattatttggatgcaggtggtttttgattgcaaatatgt gtgtgtcttcagtgattgtatgac agatgatgtattcttttgatgttaaaagattttaagtaagagtagatacattgtacccat tttacattttcttattttaactacagt aatctacataaatatacctcagaaatcatttttggtgattattttttgttttgtagaatt gcacttcagtttattttcttacaaataac cttacattttgtttaatggcttccaagagccttttttttttttgtatttcagagaaaatt caggtaccaggatgcaatggatttattt gattcaggggacctgtgtttccatgtcaaatgttttcaaataaaatgaaatatgagtttc aatactttttatattttaatatttcca ttcattaatattatggttattgtcagcaattttatgtttgaatatttgaaataaaagttt aagatttgaaaa

In the illustrative RAG1 exon 2 (SEQ ID NO: 3), upper case letters indicate a nucleotide sequence which encodes a RAG1 polypeptide.
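The case convention described above can be used to pull the coding portion out of a mixed-case listing programmatically. The following is a minimal sketch only: the function name `coding_segment` and the toy input are hypothetical, and it assumes the coding sequence is the single longest run of upper-case A/C/G/T once whitespace from the listing has been removed.

```python
import re


def coding_segment(mixed_case_seq: str) -> str:
    """Return the longest upper-case A/C/G/T run in a mixed-case listing."""
    # Sequence listings are wrapped with spaces and line breaks; drop them first.
    compact = "".join(mixed_case_seq.split())
    # Upper-case runs mark coding sequence; lower-case flanks are non-coding.
    runs = re.findall(r"[ACGT]+", compact)
    return max(runs, key=len) if runs else ""


# Toy input (not SEQ ID NO: 3): lower-case flanks around a short upper-case run.
toy = "gtacctcagccagc ATGGCAGCC TCTTTCTAA gtagggcaacc"
cds = coding_segment(toy)
print(cds, len(cds))
```

Applied to a full listing, the returned string can then be checked for a start codon and a length divisible by three before further use.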

RAG1 polypeptides

The RAG1 polypeptide may be a human RAG1 polypeptide. Suitably, the RAG1 polypeptide may comprise or consist of a polypeptide sequence of UniProtKB accession P15918, or a fragment or variant thereof.

In some embodiments of the invention, the RAG1 polypeptide comprises or consists of an amino acid sequence which is at least 70% identical to SEQ ID NO: 4 or a fragment thereof. Suitably, the RAG1 polypeptide comprises or consists of an amino acid sequence which is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identical to SEQ ID NO: 4 or a fragment thereof.

In some embodiments, the RAG1 polypeptide comprises or consists of SEQ ID NO: 4 or a fragment thereof.

RAG1 polypeptide isoform 1, UniProtKB accession P15918 (SEQ ID NO: 4)

MAASFPPTLGLSSAPDEIQHPHIKFSEWKFKLFRVRSFEKTPEEAQKEKKDSFEGKP SLEQSPAVLDKADGQKPVPTQPLLKAHPKFSKKFHDNEKARGKAIHQANLRHLCRI CGNSFRADEHNRRYPVHGPVDGKTLGLLRKKEKRATSWPDLIAKVFRIDVKADVDS IHPTEFCHNCWSIMHRKFSSAPCEVYFPRNVTMEWHPHTPSCDICNTARRGLKRKS LQPNLQLSKKLKTVLDQARQARQHKRRAQARISSKDVMKKIANCSKIHLSTKLLAVD FPEHFVKSISCQICEHILADPVETNCKHVFCRVCILRCLKVMGSYCPSCRYPCFPTDL ESPVKSFLSVLNSLMVKCPAKECNEEVSLEKYNHHISSHKESKEIFVHINKGGRPRQ HLLSLTRRAQKHRLRELKLQVKAFADKEEGGDVKSVCMTLFLLALRARNEHRQADE LEAIMQGKGSGLQPAVCLAIRVNTFLSCSQYHKMYRTVKAITGRQIFQPLHALRNAE KVLLPGYHHFEWQPPLKNVSSSTDVGIIDGLSGLSSSVDDYPVDTIAKRFRYDSALV SALMDMEEDILEGMRSQDLDDYLNGPFTVVVKESCDGMGDVSEKHGSGPVVPEK AVRFSFTIMKITIAHSSQNVKVFEEAKPNSELCCKPLCLMLADESDHETLTAILSPLIA EREAMKSSELMLELGGILRTFKFIFRGTGYDEKLVREVEGLEASGSVYICTLCDATRL EASQNLVFHSITRSHAENLERYEVWRSNPYHESVEELRDRVKGVSAKPFIETVPSID ALHCDIGNAAEFYKIFQLEIGEVYKNPNASKEERKRWQATLDKHLRKKMNLKPIMRM NGNFARKLMTKETVDAVCELIPSEERHEALRELMDLYLKMKPVWRSSCPAKECPES LCQYSFNSQRFAELLSTKFKYRYEGKITNYFHKTLAHVPEIIERDGSIGAWASEGNE SGNKLFRRFRKMNARQSKCYEMEDVLKHHWLYTSKYLQKFMNAHNALKTSGFTM NPQASLGDPLGIEDSLESQDSMEF

In some embodiments of the invention, the RAG1 polypeptide comprises or consists of an amino acid sequence which is at least 70% identical to SEQ ID NO: 5 or a fragment thereof. Suitably, the RAG1 polypeptide comprises or consists of an amino acid sequence which is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identical to SEQ ID NO: 5 or a fragment thereof.

In some embodiments, the RAG1 polypeptide comprises or consists of SEQ ID NO: 5 or a fragment thereof.

RAG1 polypeptide isoform 2, UniProtKB accession P15918 (SEQ ID NO: 5)

MAASFPPTLGLSSAPDEIQHPHIKFSEWKFKLFRVRSFEKTPEEAQKEKKDSFEGKP SLEQSPAVLDKADGQKPVPTQPLLKAHPKFSKKFHDNEKARGKAIHQANLRHLCRI CGNSFRADEHNRRYPVHGPVDGKTLGLLRKKEKRATSWPDLIAKVFRIDVKADVDS IHPTEFCHNCWSIMHRKFSSAPCEVYFPRNVTMEWHPHTPSCDICNTARRGLKRKS LQPNLQLSKKLKTVLDQARQARQHKRRAQARISSKDVMKKIANCSKIHLSTKLLAVD FPEHFVKSISCQICEHILADPVETNCKHVFCRVCILRCLKVMGSYCPSCRYPCFPTDL ESPVKSFLSVLNSLMVKCPAKECNEEVSLEKYNHHISSHKESKEIFVHINKGGRPRQ HLLSLTRRAQKHRLRELKLQVKAFADKEEGGDVKSVCMTLFLLALRARNEHRQADE LEAIMQGKGSGLQPAVCLAIRVNTFLSCSQYHKMYRTVKAITGRQIFQPLHALRNAE KVLLPGYHHFEWQPPLKNVSSSTDVGIIDGLSGLSSSVDDYPVDTIAKRFRYDSALV SALMDMEEDILEGMRSQDLDDYLNGPFTVVVKESCDGMGDVSEKHGSGPVVPEK AVRFSFTIMKITIAHSSQNVKVFEEAKPNSELCCKPLCLMLADESDHETLTAILSPLIA EREAMKSSELMLELGGILRTFKFIFRGTGYDEKLVREVEGLEASGSVYICTLCDATRL EASQNLVFHSITRSHAENLERYEVWRSNPYHESVEELRDRVKGVSAKPFIETVPSID ALHCDIGNAAEFYKIFQLEIGEVYKNPNASKEERKRWQATLDKHLRKKMNLKPIMRM NGNFARKLMTKETVDAVCELIPSEERHEALRELMDLYLKMKPVWRSSCPAKECPES LCQYSFNSQRFAELLSTKFKYRN

RAG1 polynucleotides

The nucleotide sequence encoding a RAG1 polypeptide may be codon-optimised. Suitably, the nucleotide sequence encoding a RAG1 polypeptide may be codon-optimised for expression in a human cell.

Different cells differ in their usage of particular codons. This codon bias corresponds to a bias in the relative abundance of particular tRNAs in the cell type. By altering the codons in the sequence so that they match the relative abundance of the corresponding tRNAs, it is possible to increase expression. By the same token, it is possible to decrease expression by deliberately choosing codons for which the corresponding tRNAs are known to be rare in the particular cell type. Thus, an additional degree of translational control is available. Codon usage tables are known in the art for mammalian cells (e.g. humans), as well as for a variety of other organisms.
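As a rough illustration of the principle described above, the sketch below back-translates a polypeptide using one codon per amino acid. It is an assumption-laden toy: the `PREFERRED_CODON` table is a hypothetical, hand-picked set of codons often reported as frequent in human genes, not a table taken from this document, and the "one preferred codon" strategy is a simplification that is not stated to be the method used to obtain SEQ ID NO: 6.

```python
# Hypothetical one-codon-per-residue table (illustrative, not from the source).
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein: str, stop_codon: str = "TGA") -> str:
    """Encode a polypeptide with one chosen codon per residue, plus a stop."""
    # Remove the spaces and line breaks that sequence listings contain.
    residues = "".join(protein.split()).upper()
    try:
        codons = [PREFERRED_CODON[aa] for aa in residues]
    except KeyError as exc:
        raise ValueError(f"unsupported residue: {exc.args[0]}") from None
    return "".join(codons) + stop_codon


# First ten residues of the polypeptide listing, used here only as toy input.
out = back_translate("MAASF PPTLG")
print(out)
```

A practical tool would typically also weigh factors such as GC content, repeats and cryptic splice sites, which this sketch ignores.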

In some embodiments of the invention, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence which is at least 70% identical to SEQ ID NO: 6 or a fragment thereof. Suitably, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of a nucleotide sequence which is at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identical to SEQ ID NO: 6 or a fragment thereof.

In some embodiments of the invention, the nucleotide sequence encoding a RAG1 polypeptide comprises or consists of the nucleotide sequence SEQ ID NO: 6 or a fragment thereof.

Exemplary nucleotide sequence encoding a RAG1 polypeptide (SEQ ID NO: 6) atggccgcctccttcccacctacccttggattgtcctccgcccctgacgaaattcaacat ccccacatcaaattctcgga gtggaagttcaagctctttcgcgtgcgctcgttcgaaaagacccccgaggaagcccaaaa ggagaagaaagactc attcgaaggaaaacccagcctcgaacagtccccggccgtcctggacaaggccgacgggca gaagcctgtgccga cccagccgctgctgaaagcgcacccgaaattctccaagaagtttcacgataacgagaagg cccggggaaaggcc atccaccaagcaaaccttagacacctgtgccgcatctgtgggaactcattcagagccgac gaacataaccggagat accctgtgcatggccctgtcgacggaaagaccctggggctcctgagaaagaaggagaaga gggcgacatcctgg ccggacctgatcgcaaaggtgttcagaatcgacgtgaaggcagatgtggacagcatccac ccaaccgagttctgcc acaactgctggagcattatgcaccggaagttcagctcagcgccctgtgaagtgtacttcc cgcgcaacgtgactatgg agtggcatccacacactccgtcctgcgacatctgtaacactgctcggcgcggactcaaga ggaagtccctgcagccg aatctgcagctgagcaagaagcttaagaccgtgctggaccaggctcggcaggcccgccag cacaagcgacgcgc ccaggcccggatctcatctaaggatgtgatgaagaagatcgccaattgcagcaaaatcca cctgtctaccaagctgct ggcggtggacttcccggagcacttcgtgaagtccatcagctgtcagatctgcgagcatat tctcgccgaccccgtgga gactaattgcaagcacgtgttctgccgcgtgtgcatcctgcgctgcctgaaggtcatggg ctcctattgcccttcctgccg gtacccctgtttccctactgatctggagtccccggtcaagtccttcttgtccgtgctgaa ctccctgatggtcaaatgtcccg caaaggagtgcaatgaggaagtgtccctggaaaagtacaaccaccacatcagcagccaca aggagtccaaaga aatctttgtgcacattaacaagggcggtcggccccggcagcatctgctctcgctgactcg ccgggcccagaagcaca ggctccgggagctgaagctgcaagtcaaggccttcgccgacaaggaagagggaggagatg tgaagtccgtgtgca tgaccctgtttttgctggcgctgcgggctcggaacgaacacagacaagctgatgaactgg aggccatcatgcagggc aaaggatcgggactccagccggctgtgtgtctcgccatccgcgtcaacacattcctctca tgctcccaataccacaag atgtacaggactgtgaaggccatcaccggacggcagatctttcagccactccacgccctt cggaacgcagaaaagg tcttgctgccgggataccatcatttcgaatggcagccgcccttgaaaaacgtgtcctcgt ccaccgacgtgggcattatt gatgggctgagcggcctgtcctcctctgtggatgactaccctgtggataccatcgccaaa cggttcagatacgattccg cgctggtgtcggccctgatggacatggaggaggacatcctggagggaatgagatcacaag atctggacgactacct caacgggcccttcacggtggtggtcaaggaatcgtgcgatggaatgggcgacgtgtcgga gaagcacggttccgga 
cctgtggtgccggaaaaggccgtgcgcttctccttcaccatcatgaagatcaccattgcg catagctcccagaacgtca aagtgttcgaagaggccaagccgaactcagagctctgctgcaagccgctgtgcctgatgt tggcggacgagagcga tcacgaaaccctgaccgccattctgtcgcctctgatcgcggagagggaggccatgaagtc ctccgaactgatgctgg agctgggcggtattttgcggacttttaagttcatcttccggggaaccggttatgacgaaa agctcgtgcgcgaagtgga gggcctggaagcctcaggctccgtctacatctgcactctctgcgacgccacccggctgga ggcgtcacagaatcttgt gttccactcgatcactaggtcccacgcggagaacctggaacgctatgaggtctggcgctc taacccataccacgaatc cgtggaagaacttcgggacagagtgaagggagtgtcagcaaagcctttcattgaaaccgt gcctagcatcgacgcc ctccattgcgacatcggcaacgccgccgagttctacaagatcttccagcttgagatcggg gaagtgtacaagaaccc gaacgcctccaaggaagaaagaaagcggtggcaggctacccttgacaaacacctccgcaa gaagatgaacctg aagcccattatgcggatgaacggaaacttcgctaggaagctgatgactaaggaaacggtc gacgcggtctgtgaact gatccccagcgaagaacgacatgaagcgctgcgcgaactcatggacctgtacctgaagat gaagcctgtctggcgg agctcgtgccctgccaaggagtgcccggagtcgctgtgtcagtacagctttaacagccaa aggttcgcagagctgctg tcgaccaagttcaagtacagatacgaaggaaagattaccaactacttccacaagactctc gctcacgtgcccgagatt atcgaacgcgatggttccatcggggcctgggcctccgagggcaacgagtcgggcaacaag ttgttccgccggtttag aaagatgaacgcccgccagtccaagtgctacgaaatggaagatgtgctgaagcatcactg gctgtatacctccaagt acctccagaagttcatgaacgcacataacgccctcaagacctccgggttcaccatgaacc cccaggcctccctcggt gaccctctgggaattgaagatagcttggagagccaggactcgatggaattcta

Polynucleotides and genomes

In one aspect, the present invention provides a polynucleotide comprising from 5’ to 3’: a first homology region, a splice acceptor sequence, a nucleotide sequence encoding a RAG1 polypeptide, and a second homology region. The polynucleotide may be an isolated polynucleotide. The polynucleotide may be a DNA molecule, e.g. a double-stranded DNA molecule.

Suitably, the polynucleotide of the invention may be limited to a size suitable to be inserted into a vector (e.g. an adeno-associated viral (AAV) vector, such as AAV6). Suitably, the polynucleotide of the invention may be 5.0 kb or less, 4.9 kb or less, 4.8 kb or less, 4.7 kb or less, 4.6 kb or less, 4.5 kb or less, 4.4 kb or less, 4.3 kb or less, 4.2 kb or less, 4.1 kb or less, or 4.0 kb or less in total size. In some embodiments, the polynucleotide of the invention is 4.1 kb or less or 4.0 kb or less in size.

In another aspect, the present invention provides a genome comprising a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide. Suitably, the genome may comprise the polynucleotide of the present invention. The genome may be an isolated genome. The genome may be a mammalian genome, e.g. a human genome.
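The size ceilings given above for the polynucleotide lend themselves to a simple arithmetic check. The sketch below is illustrative only: the function name `fits_size_limit`, the component names and the component lengths are hypothetical placeholders, and it assumes 1 kb is counted as 1000 bp.

```python
def fits_size_limit(component_lengths_bp: dict, limit_kb: float = 4.0) -> tuple:
    """Return (total length in bp, whether it is within limit_kb)."""
    total_bp = sum(component_lengths_bp.values())
    # Assumption: 1 kb = 1000 bp.
    return total_bp, total_bp <= limit_kb * 1000


# Placeholder lengths for a hypothetical construct (not values from the source).
donor_parts = {
    "first_homology_region": 300,
    "splice_acceptor": 40,
    "rag1_coding_sequence": 3132,
    "second_homology_region": 300,
}
total, ok = fits_size_limit(donor_parts, limit_kb=4.0)
print(total, ok)
```

With these placeholder values the total is 3772 bp, which is within a 4.0 kb ceiling; lengthening either homology region shows how quickly the margin is consumed.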

Homology regions

A “homology region” (also known as “homology arm”) is a nucleotide sequence which is located upstream or downstream of a nucleotide sequence to be inserted (a “nucleotide sequence insert”, e.g. a splice acceptor sequence and a nucleotide sequence encoding a RAG1 polypeptide). The polynucleotide of the present invention comprises two homology regions, one upstream of the nucleotide sequence insert (the “first homology region”) and one downstream of the nucleotide sequence insert (the “second homology region”).

Each “homology region” is designed such that the nucleotide sequence insert can be introduced into a genome at a site of a double strand break (DSB) by homology-directed repair (HDR). One of skill in the art will be able to design homology arms depending on the desired insertion site (i.e. the site of the DSB) (see e.g. Ran, F.A., et al., 2013. Nature protocols, 8(11), pp.2281-2308). Each “homology region” is homologous to a region on either side of the DSB. For example, the first homology region may be homologous to a region upstream of the DSB and the second homology region may be homologous to a region downstream of the DSB.
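By way of illustration, the relationship between a DSB site and the two homology regions can be sketched as a pair of string slices. This is a simplified sketch under stated assumptions: `homology_arms` is a hypothetical helper, the genomic sequence is assumed to be given as the coding strand in 5’ to 3’ orientation, `cut_index` is assumed to be the 0-based index of the first base downstream of the DSB, and each arm is taken as an exact copy of the flanking sequence.

```python
def homology_arms(genomic: str, cut_index: int, arm_len: int = 300) -> tuple:
    """Return (upstream arm, downstream arm) flanking a DSB position."""
    if not 0 < cut_index < len(genomic):
        raise ValueError("cut_index must fall inside the genomic sequence")
    if arm_len <= 0:
        raise ValueError("arm_len must be positive")
    # Bases immediately 5' of the break; shorter if the sequence runs out.
    upstream = genomic[max(0, cut_index - arm_len):cut_index]
    # Bases immediately 3' of the break; slicing truncates at the end.
    downstream = genomic[cut_index:cut_index + arm_len]
    return upstream, downstream


# Toy sequence with a break between the C block and the G block.
first_arm, second_arm = homology_arms("AAAACCCCGGGGTTTT", cut_index=8, arm_len=4)
print(first_arm, second_arm)
```

Here the first arm corresponds to the region upstream of the break and the second arm to the region downstream of it.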

As used herein, the term “homologous” means that the nucleotide sequences are similar or identical. For example, the nucleotide sequences may be at least 70% identical, at least 75% identical, at least 80% identical, at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, at least 99% identical, or 100% identical.
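The percentage thresholds above can be made concrete with a small calculation. The following is a sketch under explicit assumptions: the two sequences are taken to be already aligned to equal length with `-` marking gaps, and identity is computed as matching positions divided by aligned columns. The source does not specify the alignment method or this denominator, and `percent_identity` is a hypothetical helper name.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, identity >= 70.0)
```

Other conventions, such as dividing by the length of the shorter sequence, give different values, so the convention should be fixed before comparing against a threshold.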

As used herein, “upstream” and “downstream” both refer to relative positions in DNA or RNA. Each strand of DNA or RNA has a 5’ end and a 3’ end and, by convention, “upstream” and “downstream” are defined relative to the 5’ to 3’ direction in which RNA transcription takes place: “upstream” is toward the 5’ end and “downstream” is toward the 3’ end. For example, when considering double-stranded DNA, “upstream” is toward the 5’ end of the coding strand for the gene in question (e.g. RAG1) and “downstream” is toward the 3’ end of the coding strand for the gene in question (e.g. RAG1).

The homology regions may be any length suitable for HDR. The homology regions may be the same or different lengths. Suitably, the homology regions are each independently 50-1000 bp in length, 100-500 bp in length, or 200-400 bp in length. For example, the first homology region may be 50-1000 bp in length and homologous to a region upstream of a DSB and the second homology region may be 50-1000 bp in length and homologous to a region downstream of the DSB.
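The 5’ to 3’ arrangement of the polynucleotide and the arm-length range mentioned above can be combined in a short sketch. This is illustrative only: `assemble_donor` is a hypothetical helper, all sequences are toy placeholders rather than sequences from this document, and the default 50-1000 bp window simply mirrors the broadest range stated above.

```python
def assemble_donor(first_arm: str, splice_acceptor: str, rag1_cds: str,
                   second_arm: str, min_arm: int = 50, max_arm: int = 1000) -> str:
    """Concatenate donor components 5' to 3' after checking arm lengths."""
    for name, arm in (("first", first_arm), ("second", second_arm)):
        if not min_arm <= len(arm) <= max_arm:
            raise ValueError(
                f"{name} homology region is {len(arm)} bp; "
                f"expected {min_arm}-{max_arm} bp"
            )
    # Order: first homology region, splice acceptor, coding sequence, second arm.
    return first_arm + splice_acceptor + rag1_cds + second_arm


# Toy placeholders only; real components would be far longer.
donor = assemble_donor("A" * 60, "CAG", "ATGTAA", "T" * 60)
print(len(donor))
```

The two arms are validated independently, so they may differ in length as long as each falls within the chosen range.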

In some embodiments:

(i) the first homology region is homologous to a first region of the RAG1 intron 1 and the second homology region is homologous to a second region of the RAG1 intron 1; or

(ii) the first homology region is homologous to a first region of the RAG1 intron 1 or the RAG1 exon 2 and the second homology region is homologous to a second region of the RAG1 exon 2.

In some embodiments, the first homology region is homologous to a first region of the RAG1 intron 1 and the second homology region is homologous to a second region of the RAG1 intron 1.