Title:
METHODS AND COMPOSITIONS FOR DIFFERENTIATING STEM CELLS
Document Type and Number:
WIPO Patent Application WO/2020/247836
Kind Code:
A1
Abstract:
The subject matter disclosed herein is generally directed to modulation of genes and pathways that drive differentiation of LGR5+ stem cells. The methods and compositions can be used to treat diseases associated with aberrant epithelial barrier function.

Inventors:
MEAD BENJAMIN (US)
SHALEK ALEXANDER K (US)
KARP JEFFREY (US)
HATTORI KAZUKI (US)
Application Number:
PCT/US2020/036446
Publication Date:
December 10, 2020
Filing Date:
June 05, 2020
Assignee:
MASSACHUSETTS INST TECHNOLOGY (US)
BRIGHAM & WOMENS HOSPITAL INC (US)
International Classes:
A61K31/00; A61K35/38; A61K48/00; A61P1/00; C12N1/38; C12N5/071; C12N15/63; G01N33/50; G01N33/53
Domestic Patent References:
WO2018089386A1 2018-05-17
WO2020159609A1 2020-08-06
Foreign References:
EP2970890A1 2016-01-20
Other References:
OSCAR YANES ET AL: "Metabolic oxidation regulates embryonic stem cell differentiation", NATURE CHEMICAL BIOLOGY, vol. 6, no. 6, 1 June 2010 (2010-06-01), pages 411 - 417, XP055102242, ISSN: 1552-4450, DOI: 10.1038/nchembio.364
AYYAZ ARSHAD ET AL: "Single-cell transcriptomes of the regenerating intestine reveal a revival stem cell", NATURE, NATURE PUBLISHING GROUP UK, LONDON, vol. 569, no. 7754, 24 April 2019 (2019-04-24), pages 121 - 125, XP036778180, ISSN: 0028-0836, [retrieved on 20190424], DOI: 10.1038/S41586-019-1154-Y
Attorney, Agent or Firm:
SCHER, Michael B. et al. (US)
Claims:
CLAIMS

What is claimed is:

1. A method of differentiating a stem cell or a stem cell enriched population of cells, comprising: contacting the population of cells with one or more cell cycle inhibitors; and/or contacting the population of cells with one or more agents capable of increasing expression or activity of one or more genes or gene products selected from the group consisting of:

- Atf3; or

- Atf3, Klf6, Hbegf, Ubc, Muc13; or

- Klf6, Atf3, Hbegf, Ubc, Muc13, 2210404O07Rik, Pigr, Ndrg1, Slc16a6, Btg2, Thbs1, Cldn4, Edn1, Smim6, Arrdc3, Abcg8, Tmem171, Prr15, Rnase4, Itln1, Mmp7, Gm15284, Defa17, Lyz1, Tff3, Rps26, Rpl13, Hmgn1, Sox4, Malat1, Slc12a2, Atp1a1, Tmprss2, Gls, Mt2, Mt1, mmu-mir-6236 and Gm26924; or

- upregulated genes in Table 3A; or

- upregulated pathways in Table 3B, and/or contacting the population of cells with one or more agents capable of decreasing expression or activity of one or more genes or gene products selected from the group consisting of:

- Topbp1, Rpa1, Mcm7, Atad2, Lig1, Whsc1, Dutl, Urfl, Mcm3, Mcm5, Mcm4, Hells, Mcm6, Mcm2, Pcna, Tgm2, Crip2, Fen1, Ccnd1, Dtl, Cdt1, Ptma, Hook1, Tubb5, Mki67, Ranbp1 and Kpnb1; or

- Itpr3, Cbr1, Kmt2a, Hsph1, Gm2697, Bex1, Cad, Rpph1, Gclc, Cbr3, Pfkp, Ugdh, Elovl6, Slc25a4, Galk1, Wars, Slc38a1, Parp1, Mrpl12, Gsr, Gnl3, Ybx3, Ckb, Isyna1, Nop56, Nolc1, Reep6, Ccnd2, Srm, Gpx2, Add3, Oxct1, Nhp2, Bzw2, Prmt5, Rangap1, H1f0, Sae1, C1qbp, Tpi1, Hmgcs1, Smoc2, Axin2, Cdk4, Reg3g, Igfbp4, Rgcc, Maged1, Tcof1, Dctpp1, Nasp, Gart, Rif1, Paics, Hjurp, Ezh2, Rrm1, Dnmt1, Ipo5, Fads1, Topbp1, Rpa1, Mcm7, Atad2, Lig1, Whsc1, Dutl, Urfl, Mcm3, Mcm5, Mcm4, Hells, Mcm6, Mcm2, Pcna, Tgm2, Crip2, Fen1, Ccnd1, Dtl, Cdt1, Ptma, Hook1, Tubb5, Mki67, Ranbp1 and Kpnb1; or

- downregulated genes in Table 3A; or

- downregulated pathways in Table 3B, and/or contacting the population of cells with one or more agents capable of inducing a quiescent stem cell signature.

2. The method of claim 1, wherein the one or more agents capable of inducing a quiescent stem cell signature comprise a map kinase inhibitor.

3. The method of claim 2, wherein the map kinase inhibitor is cobimetinib.

4. The method of any of claims 1 to 3, wherein the one or more agents comprises a small molecule, genetic modifying agent, nucleic acid molecule encoding a gene product, protein, or any combination thereof.

5. The method of claim 4, wherein the genetic modifying agent comprises a CRISPR system, RNAi system, a zinc finger nuclease system, a TALE system, or a meganuclease.

6. The method of claim 5, wherein the CRISPR system is a Class 1 or Class 2 CRISPR system.

7. The method of claim 6, wherein the Class 2 system comprises a Type II Cas polypeptide.

8. The method of claim 7, wherein the Type II Cas is a Cas9.

9. The method of claim 6, wherein the Class 2 system comprises a Type V Cas polypeptide.

10. The method of claim 9, wherein the Type V Cas is Cas12a, Cas12b, Cas12c, Cas12d (CasY), Cas12e (CasX), or Cas14.

11. The method of claim 6, wherein the Class 2 system comprises a Type VI Cas polypeptide.

12. The method of claim 11, wherein the Type VI Cas is Cas13a, Cas13b, Cas13c or Cas13d.

13. The method of any of claims 6 to 12, wherein the CRISPR system comprises a dCas fused or otherwise linked to a nucleotide deaminase.

14. The method of claim 13, wherein the nucleotide deaminase is a cytidine deaminase or an adenosine deaminase.

15. The method of claim 6, wherein the CRISPR system is a prime editing system.

16. The method of any one of claims 1 to 15, further comprising contacting the population of cells with a Wnt agonist.

17. The method of claim 16, wherein the Wnt agonist is CHIR99021.

18. The method of any one of claims 1 to 17, further comprising contacting the population of cells with a Notch inhibitor.

19. The method of claim 18, wherein the Notch inhibitor is DAPT.

20. The method of any one of claims 1 to 19, wherein the stem cells are leucine-rich repeat-containing G-protein coupled receptor 5-positive (LGR5+) cells.

21. The method of claim 20, wherein the LGR5+ cells are LGR5+ intestinal stem cells (ISC), LGR5+ cochlear progenitors (LCP), LGR5+ stem cells of the respiratory epithelium, or LGR5+ stem cells of the skin.

22. The method of any one of claims 1 to 21, wherein the population of cells is an in vivo population of cells.

23. The method of any one of claims 1 to 21, wherein the population of cells is an in vitro organoid model system.

24. The method of claim 23, further comprising obtaining intestinal cells from a subject.

25. The method of claim 23, further comprising obtaining inner ear cells from a subject.

26. The method of any of claims 1 to 25, wherein secretory cells are increased in the population of cells.

27. The method of claim 26, wherein the secretory cells comprise Paneth cells.

28. The method of claim 27, wherein the Paneth cells express one or more genes or gene products selected from the group consisting of human lysozyme (LYZ), a human alpha defensin (DEFA), human matrix metalloproteinase-7 (MMP-7), and cluster of differentiation 24 (CD24).

29. The method of claim 28, wherein the human alpha defensin is human alpha defensin 5 (DEFA5) or human alpha defensin 6 (DEFA6).

30. The method of any of claims 1 to 25, wherein sensory cells are increased in the population of cells.

31. The method of claim 30, wherein the sensory cells comprise inner ear hair cells.

32. The method of any of claims 23 to 31, wherein the method comprises: a) culturing an LGR5+ enriched population of cells in a hydrogel matrix in the presence of EGF, Noggin, R-spondin 1, CHIR99021 and valproic acid (ENR+CV); b) culturing the ENR+CV cells in the presence of EGF, Noggin, R-spondin 1, CHIR99021 and DAPT (ENR+CD); and c) contacting the cells according to claim 1.

33. The method of any of claims 1 to 32, wherein the method further comprises introducing differentiated cells to a subject.

34. The method of any of claims 1 to 33, wherein the method is for treating a condition or disease by increasing Paneth cells in a subject in need thereof.

35. The method of claim 34, wherein the condition or disease is selected from the group consisting of graft-versus-host disease (GVHD), inflammatory bowel disease (IBD), Crohn’s disease, necrotizing enterocolitis, microbial dysbiosis, impaired intestinal epithelial barrier function, obesity, intestinal inflammation, allergy, respiratory inflammation, asthma, and psoriasis.

36. The method of claim 34 or 35, wherein the method comprises targeted administration of the one or more agents or cells to the intestine of the subject.

37. The method of any of claims 1 to 33, wherein the method is for treating hearing loss by increasing hair cells in the inner ear of a subject in need thereof.

38. The method of claim 37, wherein the method comprises targeted administration of the one or more agents or cells to the inner ear of the subject.

39. A population of cells produced by the method of any of claims 1 to 32.

40. A method of treating a condition or disease by increasing Paneth cells in a subject in need thereof comprising systemically administering an inhibitor of nuclear export to the subject in a dosage less than or equal to 0.2 mg/kg.

41. The method of claim 40, wherein the dosage is between 0.01 to 0.2 mg/kg.

42. The method of claim 40, wherein the dosage is less than or equal to 0.01 mg/kg.

43. The method of any of claims 40 to 42, wherein the inhibitor is administered orally.

44. The method of any of claims 40 to 42, wherein the inhibitor is administered by injection.

45. The method of any of claims 40 to 44, wherein the inhibitor of nuclear export inhibits XPO1.

46. The method of claim 45, wherein the inhibitor is KPT-330.

47. The method of claim 45, wherein the inhibitor is KPT-8602.

48. The method of claim 45, wherein the inhibitor is Leptomycin B.

49. The method of any of claims 40 to 48, wherein the condition or disease is selected from the group consisting of inflammatory bowel disease (IBD), Crohn’s disease, intestinal inflammation, graft-versus-host disease (GVHD), necrotizing enterocolitis, microbial dysbiosis, impaired intestinal epithelial barrier function, obesity, allergy, respiratory inflammation, asthma, and psoriasis.

50. A method of detecting intestinal cell types comprising detecting in a sample comprising intestinal cells the expression of one or more markers selected from Table 6 for one or more cell types selected from the group consisting of stem I cells, stem II cells, stem III cells, early enterocytes, enterocytes, early Paneth cells, Paneth cells and enteroendocrine cells.

51. The method of claim 50, wherein the fraction of each cell type is determined.

52. A method of identifying an agent capable of differentiating a stem cell or a stem cell enriched population of cells, comprising: a) applying a candidate agent to an intestinal organoid model; and

b) detecting one or more intestinal cell types according to claim 50, whereby a shift in cell types indicates that the agent is capable of differentiating a stem cell or a stem cell enriched population of cells.

53. A method of differentiating a stem cell or a stem cell enriched population of cells comprising contacting the population of cells with one or more inhibitors of aurora kinase b.

54. The method of claim 53, wherein the inhibitor is ZM447439.

Description:
METHODS AND COMPOSITIONS FOR DIFFERENTIATING STEM CELLS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 62/858,852, filed June 7, 2019 and U.S. Provisional Application No. 62/980,002, filed February 21, 2020. The entire contents of the above-identified applications are hereby fully incorporated herein by reference.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

[0002] This invention was made with government support under Grant Nos. GM119419, CA217377 and HL095722 awarded by the National Institutes of Health. The government has certain rights in the invention.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

[0003] The contents of the electronic sequence listing (“BROD-4380WP_ST25.txt”; size is 9 kb; created on June 2, 2020) are herein incorporated by reference in their entirety.

TECHNICAL FIELD

[0004] The subject matter disclosed herein is generally directed to modulation of pathways that drive differentiation of LGR5+ stem cells.

BACKGROUND

[0005] The intestinal epithelium is a complex tissue that plays a key role in digestion and mediates innate and adaptive immune functions. The small intestinal epithelium is formed by a single layer of cells arranged into villi— primarily composed of enterocytes, absorptive cells, and secretory Goblet cells— and crypts, which contain intestinal stem cells (ISCs) and secretory Paneth cells (PCs). Intestinal stem cells differentiate into mature intestinal cells, but signaling pathways and factors that modulate differentiation to Paneth cells are insufficiently understood. Several inflammatory and disease states are associated with intestinal irregularities, including inflammatory bowel disease, Crohn’s disease, necrotizing enterocolitis, and intestinal inflammation. Intestinal stem cell differentiation is related to differentiation of other stem cells (e.g., stem cells found in the inner ear, barrier tissues, respiratory epithelium (lung, nose) and skin). Related diseases associated with irregularities include hearing loss, allergy, asthma, and psoriasis. Thus, there is a need for improved understanding of stem cell differentiation.

SUMMARY

[0006] In one aspect, the present invention provides for a method of differentiating a stem cell or a stem cell enriched population of cells comprising contacting the population of cells with one or more cell cycle inhibitors; and/or contacting the population of cells with one or more agents capable of increasing expression or activity of one or more genes or gene products selected from the group consisting of: Atf3; or Atf3, Klf6, Hbegf, Ubc, Muc13; or Klf6, Atf3, Hbegf, Ubc, Muc13, 2210404O07Rik, Pigr, Ndrg1, Slc16a6, Btg2, Thbs1, Cldn4, Edn1, Smim6, Arrdc3, Abcg8, Tmem171, Prr15, Rnase4, Itln1, Mmp7, Gm15284, Defa17, Lyz1, Tff3, Rps26, Rpl13, Hmgn1, Sox4, Malat1, Slc12a2, Atp1a1, Tmprss2, Gls, Mt2, Mt1, mmu-mir-6236 and Gm26924; or upregulated genes in Table 3A; or upregulated pathways in Table 3B, and/or contacting the population of cells with one or more agents capable of decreasing expression or activity of one or more genes or gene products selected from the group consisting of: Topbp1, Rpa1, Mcm7, Atad2, Lig1, Whsc1, Dutl, Urfl, Mcm3, Mcm5, Mcm4, Hells, Mcm6, Mcm2, Pcna, Tgm2, Crip2, Fen1, Ccnd1, Dtl, Cdt1, Ptma, Hook1, Tubb5, Mki67, Ranbp1 and Kpnb1; or Itpr3, Cbr1, Kmt2a, Hsph1, Gm2697, Bex1, Cad, Rpph1, Gclc, Cbr3, Pfkp, Ugdh, Elovl6, Slc25a4, Galk1, Wars, Slc38a1, Parp1, Mrpl12, Gsr, Gnl3, Ybx3, Ckb, Isyna1, Nop56, Nolc1, Reep6, Ccnd2, Srm, Gpx2, Add3, Oxct1, Nhp2, Bzw2, Prmt5, Rangap1, H1f0, Sae1, C1qbp, Tpi1, Hmgcs1, Smoc2, Axin2, Cdk4, Reg3g, Igfbp4, Rgcc, Maged1, Tcof1, Dctpp1, Nasp, Gart, Rif1, Paics, Hjurp, Ezh2, Rrm1, Dnmt1, Ipo5, Fads1, Topbp1, Rpa1, Mcm7, Atad2, Lig1, Whsc1, Dutl, Urfl, Mcm3, Mcm5, Mcm4, Hells, Mcm6, Mcm2, Pcna, Tgm2, Crip2, Fen1, Ccnd1, Dtl, Cdt1, Ptma, Hook1, Tubb5, Mki67, Ranbp1 and Kpnb1; or downregulated genes in Table 3A; or downregulated pathways in Table 3B, and/or contacting the population of cells with one or more agents capable of inducing a quiescent stem cell signature.

[0007] In certain embodiments, the one or more agents capable of inducing a quiescent stem cell signature comprise a map kinase inhibitor. In certain embodiments, the map kinase inhibitor is cobimetinib.

[0008] In certain embodiments, the one or more agents comprises a small molecule, genetic modifying agent, nucleic acid molecule encoding a gene product, protein, or any combination thereof.

[0009] In certain embodiments, the genetic modifying agent comprises a CRISPR system, RNAi system, a zinc finger nuclease system, a TALE system, or a meganuclease. In certain embodiments, the CRISPR system is a Class 1 or Class 2 CRISPR system. In certain embodiments, the Class 2 system comprises a Type II Cas polypeptide. In certain embodiments, the Type II Cas is a Cas9. In certain embodiments, the Class 2 system comprises a Type V Cas polypeptide. In certain embodiments, the Type V Cas is Cas12a, Cas12b, Cas12c, Cas12d (CasY), Cas12e (CasX), or Cas14. In certain embodiments, the Class 2 system comprises a Type VI Cas polypeptide. In certain embodiments, the Type VI Cas is Cas13a, Cas13b, Cas13c or Cas13d. In certain embodiments, the CRISPR system comprises a dCas fused or otherwise linked to a nucleotide deaminase. In certain embodiments, the nucleotide deaminase is a cytidine deaminase or an adenosine deaminase. In certain embodiments, the CRISPR system is a prime editing system.

[0010] In certain embodiments, the method further comprises contacting the population of cells with a Wnt agonist. In certain embodiments, the Wnt agonist is CHIR99021. In certain embodiments, the method further comprises contacting the population of cells with a Notch inhibitor. In certain embodiments, the Notch inhibitor is DAPT.

[0011] In certain embodiments, the stem cells are leucine-rich repeat-containing G-protein coupled receptor 5-positive (LGR5+) cells. In certain embodiments, the LGR5+ cells are LGR5+ intestinal stem cells (ISC), LGR5+ cochlear progenitors (LCP), LGR5+ stem cells of the respiratory epithelium, or LGR5+ stem cells of the skin.

[0012] In certain embodiments, the population of cells is an in vivo population of cells. In certain embodiments, the population of cells is an in vitro organoid model system.

[0013] In certain embodiments, the method further comprises obtaining intestinal cells from a subject. In certain embodiments, the method further comprises obtaining inner ear cells from a subject.

[0014] In certain embodiments, secretory cells are increased in the population of cells. In certain embodiments, the secretory cells comprise Paneth cells. In certain embodiments, the Paneth cells express one or more genes or gene products selected from the group consisting of human lysozyme (LYZ), a human alpha defensin (DEFA), human matrix metalloproteinase-7 (MMP-7), and cluster of differentiation 24 (CD24). In certain embodiments, the human alpha defensin is human alpha defensin 5 (DEFA5) or human alpha defensin 6 (DEFA6). In certain embodiments, sensory cells are increased in the population of cells. In certain embodiments, the sensory cells comprise inner ear hair cells.

[0015] In certain embodiments, the method comprises: a) culturing an LGR5+ enriched population of cells in a hydrogel matrix in the presence of EGF, Noggin, R-spondin 1, CHIR99021 and valproic acid (ENR+CV); b) culturing the ENR+CV cells in the presence of EGF, Noggin, R-spondin 1, CHIR99021 and DAPT (ENR+CD); and c) contacting the cells according to any embodiment herein.

[0016] In certain embodiments, the method further comprises introducing differentiated cells to a subject.

[0017] In certain embodiments, the method is for treating a condition or disease by increasing Paneth cells in a subject in need thereof. In certain embodiments, the condition or disease is selected from the group consisting of graft-versus-host disease (GVHD), inflammatory bowel disease (IBD), Crohn’s disease, necrotizing enterocolitis, microbial dysbiosis, impaired intestinal epithelial barrier function, obesity, intestinal inflammation, allergy, respiratory inflammation, asthma, and psoriasis. In certain embodiments, the method comprises targeted administration of the one or more agents or cells to the intestine of the subject.

[0018] In certain embodiments, the method is for treating hearing loss by increasing hair cells in the inner ear of a subject in need thereof. In certain embodiments, the method comprises targeted administration of the one or more agents or cells to the inner ear of the subject.

[0019] In another aspect, the present invention provides for a population of cells produced by the method of any embodiment herein.

[0020] In another aspect, the present invention provides for a method of treating a condition or disease by increasing Paneth cells in a subject in need thereof comprising systemically administering an inhibitor of nuclear export to the subject in a dosage less than or equal to 0.2 mg/kg. In certain embodiments, the dosage is between 0.01 to 0.2 mg/kg. In certain embodiments, the dosage is less than or equal to 0.01 mg/kg. In certain embodiments, the inhibitor is administered orally. In certain embodiments, the inhibitor is administered by injection. In certain embodiments, the inhibitor of nuclear export inhibits XPO1. In certain embodiments, the inhibitor is KPT-330. In certain embodiments, the inhibitor is KPT-8602. In certain embodiments, the inhibitor is Leptomycin B. In certain embodiments, the condition or disease is selected from the group consisting of inflammatory bowel disease (IBD), Crohn’s disease, intestinal inflammation, graft-versus-host disease (GVHD), necrotizing enterocolitis, microbial dysbiosis, impaired intestinal epithelial barrier function, obesity, allergy, respiratory inflammation, asthma, and psoriasis.

[0021] In another aspect, the present invention provides for a method of detecting intestinal cell types comprising detecting in a sample comprising intestinal cells the expression of one or more markers selected from Table 4 for one or more cell types selected from the group consisting of stem I cells, stem II cells, stem III cells, early enterocytes, enterocytes, early Paneth cells, Paneth cells and enteroendocrine cells. In certain embodiments, the fraction of each cell type is determined.

[0022] In another aspect, the present invention provides for a method of identifying an agent capable of differentiating a stem cell or a stem cell enriched population of cells comprising applying a candidate agent to an intestinal organoid model; and detecting one or more intestinal cell types according to claim 44, whereby a shift in cell types indicates that the agent is capable of differentiating a stem cell or a stem cell enriched population of cells.
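By way of non-limiting illustration only, the determination of cell type fractions and of a shift in cell types between a control sample and a sample exposed to a candidate agent may be sketched as follows. The function names, the example labels, and the use of a simple difference in fractions as the measure of a shift are assumptions made for this sketch and are not taken from the disclosure.

```python
from collections import Counter


def cell_type_fractions(labels):
    """Return the fraction of cells assigned to each cell type label."""
    if not labels:
        raise ValueError("at least one labeled cell is required")
    counts = Counter(labels)
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


def fraction_shift(control_labels, treated_labels):
    """Return treated-minus-control fraction for every observed cell type.

    Assumption: a 'shift in cell types' is summarized here as a plain
    difference in fractions; other summaries are equally possible.
    """
    control = cell_type_fractions(control_labels)
    treated = cell_type_fractions(treated_labels)
    cell_types = sorted(set(control) | set(treated))
    return {ct: treated.get(ct, 0.0) - control.get(ct, 0.0) for ct in cell_types}


# Hypothetical per-cell labels, for illustration only
control = ["stem I"] * 6 + ["Paneth"] * 2 + ["enterocyte"] * 2
treated = ["stem I"] * 2 + ["Paneth"] * 6 + ["enterocyte"] * 2
shift = fraction_shift(control, treated)
```

In this hypothetical example the Paneth cell fraction rises and the stem I fraction falls by the same amount, which is the kind of change the method would read as a differentiating effect.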

[0023] In another aspect, the present invention provides for a method of differentiating a stem cell or a stem cell enriched population of cells comprising contacting the population of cells with one or more inhibitors of aurora kinase b. In certain embodiments, the inhibitor is ZM447439.

[0024] These and other aspects, objects, features, and advantages of the example embodiments will become apparent to those having ordinary skill in the art upon consideration of the following detailed description of illustrated example embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] An understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention may be utilized, and the accompanying drawings of which:

[0026] FIG. 1 - Schematic showing an intestinal organoid model system to study drivers of epithelial composition.

[0027] FIG. 2 - Schematic for in vitro 6-day PC differentiation screen and multiplexed single-well assays. Also shown is a schematic of the “3-D” organoid culture and “2.5-D” culture, which enables the enhanced multiplexed measurement of secreted supernatant lysozyme and cell pellet adenosine-5'-triphosphate (ATP). LYZ activity in 1 is referred to as LYZ.NS and LYZ activity in 2 is referred to as LYZ.S.

[0028] FIG. 3A-3F - Define hits by significant increases in secretion. A. Replicate UMVUE SSMD for each well and assay in screen, large points are deemed hits above FPL and FNL-determined cutoff, circled points are hits in both LYZ.NS and LYZ.S assays, each point represents the SSMD from 3 replicates of 3 bio. donors relative to whole-plate control. B. SSMD for LYZ.NS, LYZ.S, and ATP assays for 6-day ENR+CD+treatment versus ENR+CD+DMSO (vehicle) control, with an FPL=0.05 determined cutoff (0.89), large dots signify treatment-doses passing cutoffs for both LYZ assays, n=8 well replicates. C. Venn diagram of treatment hits based on replicate SSMD across the 3 assays. D. Biological potency for LYZ.NS versus LYZ.S assays based on mean fold change (based on n=8 well replicates) of treatment relative to control, grey signifies treatments advanced for profiling. E. Mean fold change of assay effect for hits in LYZ.S and LYZ.NS, only points above 1.28 standard deviations of all treatment mean fold changes (corresponding to the top 10% of a normal distribution) are deemed potent hits. F. Table of validated small molecules at their maximal doses (top indicates TGF-beta/Smad pathway, middle indicates Tyrosine kinase pathway).
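For illustration only, the hit-calling quantities referred to for FIG. 3 may be sketched as follows. This sketch assumes a simple method-of-moments estimate of the strictly standardized mean difference (SSMD) for two independent groups, not the UMVUE estimate named in the figure, and it assumes that the 1.28 standard deviation criterion means a mean fold change exceeding the mean of all treatment mean fold changes by 1.28 standard deviations. All function and variable names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev, variance


def ssmd_mm(treated, control):
    """Method-of-moments SSMD for two independent groups.

    Assumption: this is a simpler estimator than the UMVUE SSMD
    named in the figure description.
    """
    pooled = math.sqrt(variance(treated) + variance(control))
    return (mean(treated) - mean(control)) / pooled


def potent_hits(mean_fold_changes, z_cutoff=1.28):
    """Return treatments whose mean fold change exceeds mean + z_cutoff * SD.

    mean_fold_changes maps a treatment name to its mean fold change.
    """
    values = list(mean_fold_changes.values())
    threshold = mean(values) + z_cutoff * stdev(values)
    return [name for name, fc in mean_fold_changes.items() if fc > threshold]


# Fraction of a standard normal distribution lying above 1.28 SD
upper_tail = 1.0 - NormalDist().cdf(1.28)

# Hypothetical replicate values and fold changes, for illustration only
example_ssmd = ssmd_mm([2.0, 2.2, 1.8], [1.0, 1.1, 0.9])
example_hits = potent_hits(
    {"a": 1.0, "b": 1.1, "c": 0.9, "d": 1.0,
     "e": 1.05, "f": 0.95, "g": 1.0, "h": 3.0}
)
```

The value of `upper_tail` is approximately 0.10, consistent with the statement that a 1.28 standard deviation cutoff corresponds to the top 10% of a normal distribution.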

[0029] FIG. 4 - Schematic and fold-change (FC) results for early (day 0-3) vs. full (day 0-6) treatment at indicated doses of indicated compounds relative to ENR+CD control in LYZ.NS normalized to ATP, LYZ.S normalized to ATP, and ATP assays.
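As a non-limiting sketch, a fold change of lysozyme activity normalized to ATP and expressed relative to control may be computed as follows. It is assumed here that each well yields one LYZ reading and one ATP reading, that normalization is a per-well LYZ/ATP ratio, and that the control reference is the mean of the control ratios; the function name and example readings are hypothetical.

```python
from statistics import mean


def atp_normalized_fold_change(lyz_treated, atp_treated, lyz_control, atp_control):
    """Return per-well fold changes of LYZ/ATP relative to the control mean.

    Assumption: wells are paired by position, so lyz_treated[i] and
    atp_treated[i] come from the same well.
    """
    control_ratio = mean(l / a for l, a in zip(lyz_control, atp_control))
    return [(l / a) / control_ratio for l, a in zip(lyz_treated, atp_treated)]


# Hypothetical readings in arbitrary units, for illustration only
fold_changes = atp_normalized_fold_change(
    lyz_treated=[12.0, 8.0],
    atp_treated=[3.0, 4.0],
    lyz_control=[10.0, 10.0],
    atp_control=[5.0, 5.0],
)
```

A fold change of 1 indicates no difference from control after accounting for ATP, so a treatment that lowers ATP while holding LYZ steady registers as an increase per unit of ATP.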

[0030] FIG. 5A-5C - Population RNA-seq suggests KPT-330 drives secretory gene expression. A. Graph showing the number of differentially expressed genes after treatment with the indicated small molecule in the indicated condition (top bar is ENR, middle is ENR+CD, bottom is combined). B. Gene set scores for the indicated cell type after treatment with the small molecules under the ENR and ENR+CD conditions. C. Plots showing differentially expressed genes in response to KPT-330 treatment under the ENR and ENR+CD conditions.

[0031] FIG. 6A-6B - Assays in conventional (3D) system confirm KPT-330 differentiation. A. Graph showing measurements of the Paneth cell composition (Lyz+/CD24+) via flow cytometry of ENR+CD organoids treated with the indicated small molecules. B. Graph showing measurements of LYZ secretion compared to ATP in ENR organoids with induced (Cch - carbamyl choline) secretion.

[0032] FIG. 7A-7B - Additional Xpo1 inhibitors suggest action through known mode of action. A. Chemical structures of Xpo1 inhibitors. B. Graphs showing measurements of the Paneth cell composition (Lyz+/CD24+) via flow cytometry of ENR+CD organoids treated with the indicated Xpo1 inhibitors.

[0033] FIG. 8 - (left) Western blotting for LYZ in organoids cultured in ENR+CV or ENR with or without XPO1 inhibitors for six days. (right) Western blotting for LYZ in organoids cultured in ENR+CD with or without XPO1 inhibitors for six days.

[0034] FIG. 9 - (left) Lysozyme secretion assay normalized by whole-well ATP, conducted on organoids differentiated in ENR+CD for 6 days with multiple XPO1 inhibitors, with both induced (Cch - carbamyl choline) and non-induced secretion. Dunnett’s multiple comparison test: ** adj. p < 0.01, *** adj. p < 0.005, n = 5 well replicates. (right) Lysozyme secretion assay normalized by whole-well ATP, conducted on organoids differentiated in ENR for 6 days with multiple XPO1 inhibitors, with both induced (Cch - carbamyl choline) and non-induced secretion. Dunnett’s multiple comparison test: ** adj. p < 0.01, **** adj. p < 0.0001, n = 8 well replicates.

[0035] FIG. 10A-10B - Population RNA-seq across conditions with organoids (Media: ENR, ENR+C, ENR+D, ENR+CD; Drug: none, KPT-330, KPT-8602; timing: 3 days, 6 days). A. Heatmap showing treatment cell type identity scores for each condition. B. Plot showing the effect of SINEs (Selective Inhibitor of Nuclear Export) on intestinal organoid differentiation.

[0036] FIG. 11 - Graph showing LYZ secretion and ATP in ENR+CD organoids across the indicated time course with KPT-330 treatment.

[0037] FIG. 12 - Single cell RNA sequencing. Seq-well results for single cells from each of the indicated organoids (control, treated with KPT-330; and indicated time points).

[0038] FIG. 13A-13C - Unsupervised differentiation landscape. A. UMAP clustering of single organoid cells shows cells cluster by cell type. B. (left) UMAP plots with treatment status and time status projected on plots. (right) Expression of the indicated marker genes projected on the plots. C. (left) UMAP clustering of single organoid cells shows cells cluster by cell type. (right) Fraction of indicated cell types across time course in cells treated and untreated with KPT-330.

[0039] FIG. 14A-14B - KPT-330 enhances stem conversion to mature cells. A. Organoid cell composition over time in untreated cells. B. Organoid cell composition over time in KPT-330 treated cells.

[0040] FIG. 15A-15C - Stem III (cycling) / Stem II (intermediate) express Xpo1 + nuclear export signal (NES) transcripts. A. Violin plots showing Xpo1 expression across intestinal cell types. B. Violin plots showing NES1 expression across intestinal cell types. C. Violin plots showing Xpo1 expression levels and NES-containing gene score within control cells.

[0041] FIG. 16 - Differentially expressed genes are well-distributed across cell types. Graph showing the number of differentially expressed genes in each cell type in control and KPT-330 treated organoids.

[0042] FIG. 17A-17C - KPT-330 induces ‘stress-response’ and cell cycle inhibitory modules in Stem II/III. A. Heatmap showing a gene x gene correlation analysis of genes upregulated after KPT-330 treatment (see, Table 1). B. Heatmap showing a gene x gene correlation analysis of genes downregulated after KPT-330 treatment (see, Table 2). C. Violin plots showing expression of Atf3 in response to KPT-330.

[0043] FIG. 18A-18B - KPT-330 induces a quiescent ISC signature. A. UMAP clustering of day 0-1 stem cell populations shaded by stem cell type (left), an active ISC signature (middle), and a quiescent ISC signature. B. Violin plots showing changes in stem cell signatures in response to KPT-330 treatment.

[0044] FIG. 19 - Induction of stem quiescence enhances effect of KPT-330. Graph showing LYZ secretion and ATP from 6 day ENR+CD organoids after treatment with cobimetinib and KPT-330.

[0045] FIG. 20 - KPT-330 treatment in vivo. Graph showing body weight of C57BL6 wild-type, 10-week-old, male mice from 0 to 8 days after treatment with control or KPT-330 (4 mice/group; 4 times administration per week, 1 week, oral gavage).

[0046] FIG. 21A-21C - Analysis of samples collected from the mice of figure 20. A. Western blot for lysozyme of samples collected from the small intestine proximal region. B. Graph showing band intensity of the western blot in (A). C. Histological analysis of the distal small intestine from control and treated mice using immunohistochemistry for lysozyme.

[0047] FIG. 22 - Analysis of samples collected from the mice of figure 20. Quantification of histology suggests a pro-differentiation effect. Graphs showing Paneth cell, Goblet cell, and cycling cell numbers in the proximal and distal small intestines under each treatment condition using markers for each cell type.

[0048] FIG. 23A-23G - A. Diagram for the stem-enriched to Paneth-enriched organoid differentiation screen, and schematic of the multiplexed functional secretion assays performed on day 6. B. Replicate UMVUE SSMD for each well and assay in screen, colored points are deemed hits above FPL and FNL-determined cutoff in both LYZ.NS and LYZ.S assays, each point represents the SSMD from 3 replicates of 3 biological donors relative to whole-plate control. C. Mean fold change of assay effect for hits in LYZ.S and LYZ.NS, only points above 1.28 standard deviations of all treatment mean fold changes (corresponding to the top 10% of a normal distribution) are deemed potent hits. D. Biological potency for LYZ.NS versus LYZ.S assays based on mean fold change (based on n=8 well replicates) of treatment relative to control, orange signifies treatments advanced for profiling. E. Flow cytometry analyses of 3D-cultured intestinal organoids, treated with 6 hit compounds during 6 days culture in ENR+CD media. Paneth cells were identified as lysozyme-positive and CD24-positive cells. Means and individual values are shown (N=4), and the dotted line represents the average of Paneth cell proportion in control samples. Dunnett’s multiple comparisons test; ****p < 0.0001, ***p < 0.001. F. Organoids were treated for 6 days with ENR+CD media containing increasing concentrations of KPT-330. Organoids were incubated in fresh basal media with or without 10 mM carbamoylcholine chloride (Cch) for 3 h on day 6. All data were normalized to ATP abundance and further standardized to the control in each experiment. Means and individual values are shown (N=5 (D,G,H), N=8 (E)), and the dotted line represents the value 1. G. Flow cytometry analyses of 3D-cultured intestinal organoids, treated with 160 nM KPT-8602 or 2 ng/mL Leptomycin B during 6 days culture in ENR+CD media. Paneth cells were identified as lysozyme-positive and CD24-positive cells. Means and individual values are shown (N=4).
Unpaired two-tailed t-test; **p < 0.01. Dunnett’s multiple comparisons test; ****p < 0.0001, ***p < 0.001, *p < 0.05.
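
By way of illustration only, the 1.28 standard deviation cutoff recited for panel C of FIG. 23 can be sketched in Python as follows. The sketch checks that 1.28 standard deviations corresponds to approximately the top 10% of a normal distribution and flags treatments above that cutoff. The function name `potent_hits`, the treatment names, and the fold-change values are hypothetical and are not taken from the screen data, and applying the cutoff as "mean of all treatments plus 1.28 sample standard deviations" is an assumption about how the criterion may be implemented.

```python
from statistics import NormalDist, mean, stdev

# z-score that leaves 10% of a standard normal distribution in the upper tail
Z_TOP_10_PERCENT = NormalDist().inv_cdf(0.90)  # approximately 1.28

def potent_hits(fold_changes, z_cutoff=1.28):
    """Return treatments whose mean fold change lies more than z_cutoff
    sample standard deviations above the mean of all treatments.

    fold_changes: dict mapping a (hypothetical) treatment name to its
    mean fold change relative to control.
    """
    values = list(fold_changes.values())
    threshold = mean(values) + z_cutoff * stdev(values)
    return [name for name, fc in fold_changes.items() if fc > threshold]

# Hypothetical values, for illustration only
example = {"cmpd_A": 1.0, "cmpd_B": 1.1, "cmpd_C": 0.9, "cmpd_D": 1.0, "cmpd_E": 2.5}
print(round(Z_TOP_10_PERCENT, 2))
print(potent_hits(example))
```

With these hypothetical values, only the single outlying treatment exceeds the cutoff.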

[0049] FIG. 24A-24G - A. Diagram for the stem-enriched to Paneth-enriched organoid differentiation. B. UMAP plot with projection of cells at each time point. C. UMAP with projection of cell type clusters. D. Heatmap showing differentially expressed genes between cell types. E. Violin plots showing expression of each module across the cell types. F. Fraction of indicated cell types across time course in cells treated and untreated with KPT-330. G. Odds ratio of indicated cell types across time course in cells treated vs. untreated with KPT-330.
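
By way of illustration only, an odds ratio of the kind shown in panel G of FIG. 24 may be computed from cell counts as sketched below in Python. The function name `odds_ratio` and the counts are hypothetical, and the sketch assumes the conventional two-by-two definition (odds of a cell belonging to the indicated type in treated cells divided by the same odds in untreated cells); the exact calculation used to generate the figure is not specified here.

```python
def odds_ratio(type_treated, other_treated, type_untreated, other_untreated):
    """Conventional 2x2 odds ratio.

    type_*  : number of cells of the indicated cell type
    other_* : number of all remaining cells in the same condition
    Counts must be non-zero where they appear as divisors.
    """
    odds_treated = type_treated / other_treated
    odds_untreated = type_untreated / other_untreated
    return odds_treated / odds_untreated

# Hypothetical counts: 30 of 100 treated cells and 10 of 100 untreated cells
# are of the indicated type.
print(odds_ratio(30, 70, 10, 90))
```

A value above 1 indicates enrichment of the cell type under treatment, and a value below 1 indicates depletion.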

[0050] FIG. 25A-25J - A. Relative (log normalized) expression of Xpo1 across stem cells. B. Nuclear export signal module score across stem cells. C. Graph showing experimental conditions for treatment of ENR+CD with KPT-330. D. Graph showing the percentage of Paneth cells after treatment as in C. E. Plot showing genes upregulated and downregulated after KPT-330 treatment. F. Graph showing GSEA programs upregulated and downregulated after KPT-330 treatment. G. Graphs showing stress module and mitogen signaling module expression across the cell types in treated vs. untreated cells. H. Violin plots showing active ISC module and quiescent ISC module scores +/- KPT-330 treatment. I. Graphs showing the effects on Paneth cells of further treatment with an AP-1 inhibitor or ERK inhibitor. Organoids were treated for 6 days with ENR+CD media containing 160 nM KPT-330 and/or 20 nM Cobimetinib. Organoids were incubated in fresh basal media with or without 10 mM carbamoylcholine chloride for 3 h on day 6. J. Proposed mechanism for Xpo1 inhibition in rebalancing cycling stem cell fate decisions towards secretory Paneth cells and absorptive enterocytes. All data were normalized to ATP abundance and further standardized to the control in each experiment. Means and individual values are shown (N=5 (D,G,H), N=8 (E)), and the dotted line represents the value 1. One sample t-test compared to 1, followed by the Two-stage linear step-up method of Benjamini, Krieger and Yekutieli for adjusting p-values; **p < 0.01, *p < 0.05 (D,E,G). Tukey's multiple comparisons test; ****p < 0.0001, **p < 0.01 (H).

[0051] FIG. 26A-26D - A. Design for in vivo oral gavage of KPT-330 in wild-type (WT) C57BL/6 mice. B. Graphs showing mean Paneth cell number in crypts isolated from the small intestine of mice treated with KPT-330. C. Graphs showing mean Olfm4+ stem cell number in crypts isolated from the small intestine of mice treated with KPT-330. D. Graphs showing mean goblet cell number in crypts isolated from the small intestine of mice treated with KPT-330.

[0052] FIG. 27A-27J - A. Distribution of all sample data (n=5676 wells) for each assay following data transformation and normalization, dotted line indicates median of distribution for which all fold change calculations are determined. B. Spearman correlation (r) between all sample wells by screen plate and biological replicate. C. ATP, LYZ.NS, LYZ.S assay controls across all plates and replicates, Welch’s t test for ATP, one-way ANOVA with Dunnett’s multiple comparison test * adj. p<0.05, **** adj. p<0.0001. D. SSMD for LYZ.NS, LYZ.S, and ATP assays for 6-day ENR+CD+treatment versus ENR+CD+DMSO (vehicle) control, with an FPL=0.05 determined cutoff (0.89), circled dots signify treatment-doses passing cutoffs for both LYZ assays, n=8 well replicates. E. Flow cytometry gating strategy to select viable mature Paneth cells. F. Lysozyme secretion assay of 3D-cultured intestinal organoids, treated with 6 hit compounds during 6 days culture in ENR media. Organoids were treated with 10 mM carbamoylcholine chloride (Cch) for 3 h on day 6. All data were normalized to ATP abundance and further standardized to the control in each experiment. Means and individual values are shown (N=5), and the dotted line represents the value 1. One sample t-test compared to 1, followed by the Two-stage linear step-up method of Benjamini, Krieger and Yekutieli for adjusting p-values; **p < 0.01, *p < 0.05. G. Lysozyme secretion assay of 3D-cultured intestinal organoids, treated with 160 nM KPT-330, 160 nM KPT-8602 or 2 ng/mL Leptomycin B during 6 days culture in ENR+CD media. Organoids were incubated in fresh basal media with or without 10 µM carbamoylcholine chloride for 3 h on day 6. All data were normalized to ATP abundance and further standardized to the control in each experiment. Means and individual values are shown (N=5), and the dotted line represents the value 1.
One sample t-test compared to 1, followed by the Two-stage linear step-up method of Benjamini, Krieger and Yekutieli for adjusting p-values; **p < 0.01, *p < 0.05. H. Lysozyme secretion assay of 3D-cultured intestinal organoids, treated with 160 nM KPT-330, 160 nM KPT-8602 or 2 ng/mL Leptomycin B during 6 days culture in ENR media. Organoids were incubated in fresh basal media with or without 10 µM carbamoylcholine chloride for 3 h on day 6. All data were normalized to ATP abundance and further standardized to the control in each experiment. Means and individual values are shown (N=5), and the dotted line represents the value 1. One sample t-test compared to 1, followed by the Two-stage linear step-up method of Benjamini, Krieger and Yekutieli for adjusting p-values; **p < 0.01, *p < 0.05. I. Western blotting of lysozyme in 3D-cultured intestinal organoids, cultured in ENR+CD media for 6 days. J. Western blotting of lysozyme in 3D-cultured intestinal organoids, cultured in ENR media for 6 days.
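
By way of illustration only, the comparison of a strictly standardized mean difference (SSMD) against the 0.89 cutoff recited for panel D of FIG. 27 can be sketched in Python as follows. The sketch uses the simple method-of-moments SSMD estimate for two independent groups (difference of means divided by the square root of the summed sample variances), which is an assumption made for brevity and is not the UMVUE estimate referred to in FIG. 23. The function names and well values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def ssmd(treatment, control):
    """Method-of-moments SSMD estimate for two independent groups:
    (mean(treatment) - mean(control)) / sqrt(var(treatment) + var(control)).
    Each group needs at least two replicate values."""
    return (mean(treatment) - mean(control)) / sqrt(
        variance(treatment) + variance(control)
    )

def passes_cutoff(treatment, control, cutoff=0.89):
    """True if the estimated SSMD exceeds the cutoff (0.89 in FIG. 27D)."""
    return ssmd(treatment, control) > cutoff

# Hypothetical normalized assay readouts, for illustration only
treated_wells = [3.0, 4.0, 5.0]
vehicle_wells = [1.0, 2.0, 3.0]
print(ssmd(treated_wells, vehicle_wells))
print(passes_cutoff(treated_wells, vehicle_wells))
```

In the screen as described, a treatment-dose is circled only when it passes the cutoff in both LYZ assays, so a check of this kind would be applied once per assay.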

[0053] FIG. 28A-28F - A. Violin plots showing UMI, percent mitochondrial, and detected gene distributions across samples. B. Violin plots showing UMI, percent mitochondrial, and detected gene distributions across cell type clusters. C. Projection of lineage-defining gene sets from a murine small intestinal scRNA-seq atlas on UMAP plots. D. Projection of gene sets identified to correspond to known ISC subsets in vivo on UMAP plots of stem cells. E. Violin plots showing module scores for stem cell types. F. UMAP plots showing clusters across all three conditions, day 0 ENR+CV, and day 0.25-6 ENR+CD and ENR+CD + KPT-330.

[0054] FIG. 29A-29I - A. Violin plots showing expression of Xpo1 across cell types. B. Violin plots showing expression of genes known to contain a NES across cell types. C. Graph showing the expression of key mediators in the mitogen-activated protein kinase (MAPK) pathway, NFAT, AP-1, and Aurora kinase activity within the stem populations. D. Graph showing a LYZ secretion assay across a time course of KPT-330 treatment. E. Graph showing a LYZ secretion assay across a time course of KPT-8602 treatment. Lysozyme secretion assay of 3D-cultured intestinal organoids, treated with 160 nM KPT-8602 during the indicated time frame in ENR+CD media. Organoids were incubated in fresh basal media with 10 mM carbamoylcholine chloride for 3 h on day 6. All data were normalized to ATP abundance and further standardized to the control in each experiment. Means and individual values are shown (N=5), and the dotted line represents the value 1. One sample t-test compared to 1, followed by the Two-stage linear step-up method of Benjamini, Krieger and Yekutieli for adjusting p-values; **p < 0.01, *p < 0.05. F. Graph showing Xpo1, Atf3, Trp53 (p53), Ccnd1, Cdk4/6, and Cdkn1a (p21) expression across all cell types and the fraction of cells in each which express each gene +/- KPT-330 treatment. G. Graph showing the effects on Paneth cells of further treatment with two known p53 inhibitors, pifithrin-α (PFTα) and serdemetan (serd.). H. Graph showing the effects on Paneth cells of treatment with the Cdk4/6 inhibitor palbociclib. I. Graph showing the effects on Paneth cells of treatment with the Aurora kinase B inhibitor ZM447439.

[0055] FIG. 30A-30E - A. Graph showing body weight of C57BL/6 wild-type mice administered KPT-330 at a dose of 10 mg/kg via oral gavage every other day over a two-week span. B. Graph showing body weight of C57BL/6 wild-type mice administered KPT-330 at a dose of 50-fold (0.2 mg/kg), 200-fold (0.05 mg/kg), and 1,000-fold (0.01 mg/kg) below the 10 mg/kg dose via oral gavage every other day over a two-week span. C. Immunohistochemistry images of Paneth cells within well preserved crypts. D. Immunohistochemistry images of Olfm4+ stem cells within well preserved crypts. E. Immunohistochemistry images of PAS+ goblet cells within well preserved crypts.
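
By way of illustration only, the fold-reduced doses recited for panel B of FIG. 30 follow from dividing the 10 mg/kg reference dose by the stated fold reduction, as sketched below in Python. The function name `reduced_dose` is hypothetical; the reference dose and fold values are those recited in the figure description.

```python
def reduced_dose(reference_mg_per_kg, fold_reduction):
    """Dose obtained by lowering a reference dose by the given fold."""
    return reference_mg_per_kg / fold_reduction

REFERENCE_DOSE_MG_PER_KG = 10.0  # dose recited for panel A of FIG. 30

for fold in (50, 200, 1000):
    dose = reduced_dose(REFERENCE_DOSE_MG_PER_KG, fold)
    print(f"{fold}-fold below {REFERENCE_DOSE_MG_PER_KG:g} mg/kg: {dose:g} mg/kg")
```

The three results correspond to the 0.2 mg/kg, 0.05 mg/kg, and 0.01 mg/kg doses recited above.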

[0056] The figures herein are for illustrative purposes only and are not necessarily drawn to scale.

DETAILED DESCRIPTION OF THE EXAMPLE EMBODIMENTS

General Definitions

[0057] Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. Definitions of common terms and techniques in molecular biology may be found in Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch, and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (F.M. Ausubel et al. eds.); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (1995) (M.J. MacPherson, B.D. Hames, and G.R. Taylor eds.); Antibodies, A Laboratory Manual (1988) (Harlow and Lane, eds.); Antibodies A Laboratory Manual, 2nd edition 2013 (E.A. Greenfield ed.); Animal Cell Culture (1987) (R.I. Freshney, ed.); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992); and Marten H. Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).

[0058] As used herein, the singular forms “a”, “an”, and “the” include both singular and plural referents unless the context clearly dictates otherwise.

[0059] The term “optional” or “optionally” means that the subsequent described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.

[0060] The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.

[0061] The terms “about” or “approximately” as used herein when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, are meant to encompass variations of and from the specified value, such as variations of +/-10% or less, +/-5% or less, +/-1% or less, and +/-0.1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosed invention. It is to be understood that the value to which the modifier “about” or “approximately” refers is itself also specifically, and preferably, disclosed.

[0062] As used herein, a “biological sample” may contain whole cells and/or live cells and/or cell debris. The biological sample may contain (or be derived from) a “bodily fluid”. The present invention encompasses embodiments wherein the bodily fluid is selected from amniotic fluid, aqueous humour, vitreous humour, bile, blood serum, breast milk, cerebrospinal fluid, cerumen (earwax), chyle, chyme, endolymph, perilymph, exudates, feces, female ejaculate, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, sebum (skin oil), semen, sputum, synovial fluid, sweat, tears, urine, vaginal secretion, vomit and mixtures of one or more thereof. Biological samples include cell cultures, bodily fluids, and cell cultures from bodily fluids. Bodily fluids may be obtained from a mammalian organism, for example by puncture, or other collecting or sampling procedures.

[0063] The terms “subject,” “individual,” and “patient” are used interchangeably herein to refer to a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.

[0064] Various embodiments are described hereinafter. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation to the broader aspects discussed herein. One aspect described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced with any other embodiment(s). Reference throughout this specification to “one embodiment”, “an embodiment,” or “an example embodiment” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” or “an example embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments. Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention. For example, in the appended claims, any of the claimed embodiments can be used in any combination.

[0065] Reference is made to US Patent Application No. 16/240,361, filed January 4, 2019 and International Patent Publication No. WO2014159356A1. Reference is also made to the manuscript entitled, “High-throughput organoid screening enables engineering of intestinal epithelial composition,” by Benjamin E. Mead, Kazuki Hattori, Lauren Levy, Marko Vukovic, Daphne Sze, Juan D. Matute, Jinzhi Duan, Robert Langer, Richard S. Blumberg, Jose Ordovas-Montanes, Alex K. Shalek, Jeffrey M. Karp, posted to bioRxiv on April 28, 2020 (bioRxiv 2020.04.27.063727; doi: doi.org/10.1101/2020.04.27.063727).

[0066] All publications, published patent documents, and patent applications cited herein are hereby incorporated by reference to the same extent as though each individual publication, published patent document, or patent application was specifically and individually indicated as being incorporated by reference.

OVERVIEW

[0067] Stem cells of the small intestine integrate diverse signals to regulate regeneration and differentiation, which in turn set the composition of the intestinal epithelium and support barrier function. Therapeutically directing stem cell differentiation may therefore provide novel approaches to augment barrier function by altering the abundance or quality of specialized cells of the epithelium, including the secretory Paneth, goblet, and enteroendocrine populations. Utilizing an organoid model of directed intestinal stem cell differentiation into antimicrobial-secreting Paneth cells, Applicants performed a first-of-its-kind high throughput phenotypic screen of over four hundred well-characterized target-specific small molecules for pro-differentiation effects, independent of known cues in Wnt and Notch signaling pathways. Applicants identified and validated three compounds which significantly increase the abundance of mature Paneth cells in the differentiation model through the inhibition of the nuclear exporter Xpo1. With single-cell RNA-sequencing and validating experiments Applicants reveal that enhanced stem-to-Paneth differentiation through Xpo1 inhibition is driven by a pan-epithelial stress response combined with an interruption of mitogen signaling in actively cycling intestinal progenitors. Applicants extended the observation of pro-Paneth cell differentiation in vivo and demonstrated that oral dosing of Xpo1 inhibitor KPT-330 at 1,000-fold lower than used in cancer increases Paneth cell abundance. In totality, Applicants provide a framework to conduct translational studies in organoid models and demonstrate a pathway for discovery of novel molecular targets controlling barrier tissue composition.

[0068] Embodiments disclosed herein provide methods and compositions for differentiating stem cells to mature cells. The stem cells may be present in a cell culture, spheroid, organoid, tissue explant, or in vivo (see, e.g., Yin X, Mead BE, Safaee H, Langer R, Karp JM, Levy O. Cell Stem Cell 2016; 18:25-38). The methods may be used to treat diseases requiring an increase in mature cells (e.g., Paneth cells, inner ear hair cells, respiratory cells, stomach cells, kidney cells). “Cell differentiation” refers to the process by which a cell becomes specialized to perform a specific function, such as in the conversion of post-natal stem cells into cells having a more specialized function. In certain embodiments, LGR5+ stem cells are differentiated into Paneth cells or cells that express characteristics of Paneth cells.

[0069] The intestinal epithelium is a complex tissue that plays a key role in digestion and mediates important innate and adaptive immune functions. The small intestinal epithelium is formed by a single layer of cells arranged into villi (primarily composed of enterocytes, absorptive cells, and secretory Goblet cells) and crypts, which contain intestinal stem cells (ISCs) and secretory Paneth cells (PCs). Cells located in the villi are specialized for nutrient absorption, cells residing in the crypts are integral to regenerating the intestinal epithelium, and specialized cells throughout the epithelium provide for a protective barrier between host and microbe. Goblet cells secrete mucins into the lumen of the intestine to create a physical barrier between the host and the bacteria populating the gut. PCs contribute to the barrier by secreting antimicrobial proteins (AMPs) to form a biochemical barrier. In a healthy small intestinal epithelium, PCs are potent modulators of the gut microflora through the known secretion of multiple antimicrobial protein families including lysozyme (LYZ), angiogenin, ribonuclease A family, regenerating islet-derived 3 gamma (REG3G), and peptides such as cystine-rich (CRS) peptides and alpha-defensins (DEFA). PCs also secrete cytokines including interleukin-17 (IL-17) and are involved in signaling across the innate and adaptive immune system. The gut microbiota participates in a variety of different functions including metabolism, host defense and immune development and has been linked to pathogenesis in gastrointestinal, autoimmune, and other diseases.

[0070] Genetic, morphological, and functional alterations in PCs have been shown to drive microbial dysbiosis, impaired intestinal epithelial barrier function, and inflammation. This includes the heterogeneous collection of pathologies that manifest as inflammatory bowel disease (IBD). Genetic associations linked to impaired PC function in IBD populations include abnormalities in NOD2 (innate immune activation), ATG16L1 (granule exocytosis), and XBP1 (ER stress response). The AMPs secreted by PCs also play a crucial role in protection against infection from enteric pathogens. Notably, in in vivo murine models, PC-depleted and AMP-deficient mice are more susceptible to bacterial translocation and inflammation. As well, in necrotizing enterocolitis (NEC), AMP secretion and PC number are altered, corresponding with intestinal immaturity and dysbiosis. The immature epithelial barrier appears to be more sensitive to bacteria and bacterial translocation, leading to excessive inflammation and systemic infection. Furthermore, PC disruption in mice replicates human NEC pathology, suggesting that PCs may initiate NEC. PCs have also been implicated in Graft versus Host disease (GvHD), which occurs after an allogeneic stem cell transplant in which the donor T cells cause an inflammatory response in the host. Patients with GvHD also exhibit a loss in PC number, reduced expression of AMPs, and dysbiosis. Notably, Gram-negative bacteria become more prevalent and, when paired with impaired barrier function, can lead to severe sepsis.

[0071] Additionally, growing evidence implicates the gut microbiota in the development of metabolic syndrome, which precipitates cardiovascular disease, type 2 diabetes, and obesity, affecting nearly a third of Americans. Interestingly, PC abnormalities relating to ER stress response have been correlated with the very obese. Furthermore, increasing the population dynamics of certain ‘protective’ bacteria has been shown to mitigate a pro-obesity effect and metabolic syndrome-associated low-grade inflammation; this microbiota modulation may be done in the future through a PC axis.

[0072] The importance of impaired barrier function and dysregulation of the gut microbiota in the etiology of these diseases suggests that PCs present a promising therapeutic axis. This has already been demonstrated in GvHD. Treatment with R-spondin1 (R-Spo1), a potent WNT agonist, can elevate the secretion of alpha-defensins and restore the dysbiosis seen in mice with GvHD by stimulating ISCs to differentiate into PCs. However, while treatment with R-Spo1 illustrates the importance of PC regeneration, it faces many challenges in clinical translation to humans. R-Spo1 is shown to significantly increase crypt size, and hyperactive WNT activation is implicated in precancerous hyperplasia and PC metaplasia. While the effects of R-Spo1 are inconclusive with respect to malignancy, WNT signaling must be carefully balanced to ensure homeostasis rather than priming for cancer. Other signaling pathways known to drive PC differentiation, including Notch signaling, face similar challenges. Activation of Notch signaling amplifies the proliferative progenitor population and promotes an absorptive cell lineage. Conversely, deactivation of Notch signaling amplifies differentiation to all secretory cell types and secretory cell hyperplasia. As these pathways affect multiple cell types in the intestinal epithelium and may lead to hyperplasia, they are not therapeutically viable. Therefore, a more specific PC targeted treatment to accomplish PC regeneration is desirable.

[0073] Intestinal stem cell differentiation is related to differentiation of stem cells in other tissues (e.g., stem cells found in the inner ear, barrier tissues, respiratory epithelium (lung, nose) and skin). Related diseases associated with irregularities in other tissues include hearing loss, inflammation, allergy, asthma, and psoriasis. Thus, the methods described herein can be used for regeneration of cells in other organs. The regeneration of cells may result in increased barrier function in other tissues.

[0074] As used herein a “barrier cell” or “barrier tissues” refers generally to various epithelial tissues of the body such as, but not limited to, those that line the respiratory system, digestive system, urinary system, and reproductive system as well as cutaneous systems. The epithelial barrier may vary in composition between tissues but is composed of basal and apical components, or crypt/villus components in the case of intestine. Reduced epithelial barrier integrity is a characteristic of severe clinical presentations associated with type 2 inflammatory (T2I) responses (see, e.g., International Patent Publication No. WO 2019/018441).

Stem cells and stem cell enriched populations

[0075] In certain embodiments, the method of the present invention can be used to differentiate stem cells or a population of cells enriched for stem cells. In certain embodiments, pluripotent cells may be used. In certain embodiments, differentiation of stem cells present in vivo can be used for regeneration of mature epithelial cell types, in particular LGR5+ stem cells present in tissues. Differentiation of stem cells to epithelial cells may increase barrier function. Stem cells differentiated ex vivo may also be used for regeneration of mature epithelial cell types in vivo.

[0076] In certain embodiments, organoids enriched for stem cells are differentiated. As used herein, the term "organoid" or "epithelial organoid" refers to a cell cluster or aggregate that resembles an organ, or part of an organ, and possesses cell types relevant to that particular organ. Organoid systems have been described previously, for example, for brain, retinal, stomach, lung, thyroid, small intestine, colon, liver, kidney, pancreas, prostate, mammary gland, fallopian tube, taste buds, salivary glands, and esophagus (see, e.g., Clevers, Modeling Development and Disease with Organoids, Cell. 2016 Jun 16; 165(7): 1586-1597).

[0077] Intestinal organoids, derived from intestinal stem cells (ISCs) and composed of ISCs, Paneth cells (PCs), enteroendocrine cells (EECs), goblet cells and absorptive enterocytes, have been invaluable to the study of intestinal biology (Clevers, 2016). Conventional intestinal organoids produced from the spontaneous differentiation of ISCs have been used to study PCs in vitro in multiple contexts (Farin, et al. Paneth cell extrusion and release of antimicrobial products is directly controlled by immune cell-derived IFN-γ. J. Exp. Med. 2014; 211: 1393-405; and Wilson, et al., A small intestinal organoid model of non-invasive enteric pathogen-epithelial cell interactions. Mucosal Immunol. Nature; 2014; 8: 1-10).

[0078] In certain embodiments, the stem cells are LGR5+ stem cells. “LGR5” is an acronym for the leucine-rich repeat-containing G-protein coupled receptor 5, also known as G-protein coupled receptor 49 (GPR49) or G-protein coupled receptor 67 (GPR67). It is a protein that in humans is encoded by the Lgr5 gene. “LGR5+ cell” or “LGR5-positive cell” is a cell that expresses Lgr5. Lgr5 is a member of GPCR class A receptor proteins. R-spondin proteins are the biological ligands of LGR5. LGR5 is a biomarker of adult stem cells in certain tissues. LGR5 is a marker of adult intestinal stem cells. The high turnover rate of the intestinal lining is due to a dedicated population of stem cells found at the base of the intestinal crypt. In vivo lineage tracing showed that LGR5 is expressed in nascent nephron cell clusters within the developing kidney. Specifically, the LGR5+ stem cells contribute to the formation of the thick ascending limb of Henle’s loop and the distal convoluted tubule. However, expression is eventually truncated after postnatal day 7, a stark contrast to the facultative expression of LGR5 in actively renewing tissues such as in the intestines (Barker, et al, 2012 "Lgr5+ve stem/progenitor cells contribute to nephron formation during kidney development". Cell Rep. 2 (3): 540-52). The stomach lining also possesses populations of LGR5+ stem cells, although there are two conflicting theories: one is that LGR5+ stem cells reside in the isthmus, the region between the pit cells and gland cells, where most cellular proliferation takes place. However, lineage tracing had revealed LGR5+ stem cells at the bottom of the gland (Barker, et al, 2010 "Lgr5+ve stem cells drive self-renewal in the stomach and build long-lived gastric units in vitro". Cell Stem Cell. 6 (1): 25-36), an architecture reminiscent of that of the intestinal arrangement.
This suggests that LGR5+ stem cells give rise to transit-amplifying cells, which migrate towards the isthmus where they proliferate and maintain the stomach epithelium (Barker N, Clevers H, 2010 "Leucine-rich repeat-containing G-protein-coupled receptors as markers of adult stem cells". Gastroenterology. 138 (5): 1681-96). LGR5+ve stem cells were pinpointed as the precursor for sensory hair cells that line the cochlea (Ruffner, et al. 2012 "R-Spondin potentiates Wnt/β-catenin signaling through orphan receptors LGR4 and LGR5". PLoS ONE. 7 (7): e40976).

[0079] Pluripotent cells may include any mammalian stem cell. As used herein, the term "stem cell" refers to a multipotent cell having the capacity to self-renew and to differentiate into multiple cell lineages. Mammalian stem cells may include, but are not limited to embryonic stem cells of various types, such as murine embryonic stem cells, e.g., as described by Evans & Kaufman 1981 (Nature 292: 154-6) and Martin 1981 (PNAS 78: 7634-8); rat pluripotent stem cells, e.g., as described by Iannaccone et al. 1994 (Dev Biol 163: 288-292); hamster embryonic stem cells, e.g., as described by Doetschman et al. 1988 (Dev Biol 127: 224-227); rabbit embryonic stem cells, e.g., as described by Graves et al. 1993 (Mol Reprod Dev 36: 424-433); porcine pluripotent stem cells, e.g., as described by Notarianni et al. 1991 (J Reprod Fertil Suppl 43: 255-60) and Wheeler 1994 (Reprod Fertil Dev 6: 563-8); sheep embryonic stem cells, e.g., as described by Notarianni et al. 1991 (supra); bovine embryonic stem cells, e.g., as described by Roach et al. 2006 (Methods Enzymol 418: 21-37); human embryonic stem (hES) cells, e.g., as described by Thomson et al. 1998 (Science 282: 1145-1147); human embryonic germ (hEG) cells, e.g., as described by Shamblott et al. 1998 (PNAS 95: 13726); embryonic stem cells from other primates such as Rhesus stem cells, e.g., as described by Thomson et al. 1995 (PNAS 92:7844-7848) or marmoset stem cells, e.g., as described by Thomson et al. 1996 (Biol Reprod 55: 254-259). In certain embodiments, the pluripotent cells may include, but are not limited to lymphoid stem cells, myeloid stem cells, neural stem cells, skeletal muscle satellite cells, epithelial stem cells, endodermal and neuroectodermal stem cells, germ cells, extraembryonic and embryonic stem cells, mesenchymal stem cells, intestinal stem cells, embryonic stem cells, and induced pluripotent stem cells (iPSCs).

[0080] As noted, prototype "human ES cells" are described by Thomson et al. 1998 (supra) and in US 6,200,806. The scope of the term covers pluripotent stem cells that are derived from a human embryo at the blastocyst stage, or before substantial differentiation of the cells into the three germ layers. ES cells, in particular hES cells, are typically derived from the inner cell mass of blastocysts or from whole blastocysts. Derivation of hES cell lines from the morula stage has been documented and ES cells so obtained can also be used in the invention (Strelchenko et al. 2004. Reproductive BioMedicine Online 9: 623-629). As noted, prototype "human EG cells" are described by Shamblott et al. 1998 (supra). Such cells may be derived, e.g., from gonadal ridges and mesenteries containing primordial germ cells from fetuses. In humans, the fetuses may be typically 5-11 weeks post-fertilization.

[0081] In certain embodiments, mouse embryonic stem cells are used. In certain embodiments, mouse embryonic stem cells differentiated into a target cell may be transferred to a mouse to perform in vivo functional studies.

[0082] Human embryonic stem cells may include, but are not limited to the HUES66, HUES64, HUES3, HUES8, HUES53, HUES28, HUES49, HUES9, HUES48, HUES45, HUES1, HUES44, HUES6, H1, HUES62, HUES65, H7, HUES13 and HUES63 cell lines.

[0083] General techniques useful in the practice of this invention in cell culture and media uses are known in the art (e.g., Large Scale Mammalian Cell Culture (Hu et al. 1997. Curr Opin Biotechnol 8: 148); Serum-free Media (K. Kitano. 1991. Biotechnology 17: 73); or Large Scale Mammalian Cell Culture (Curr Opin Biotechnol 2: 375, 1991)). The terms “culturing” or “cell culture” are common in the art and broadly refer to maintenance of cells and potentially expansion (proliferation, propagation) of cells in vitro. Typically, animal cells, such as mammalian cells, such as human cells, are cultured by exposing them to (i.e., contacting them with) a suitable cell culture medium in a vessel or container adequate for the purpose (e.g., a 96-, 24-, or 6-well plate, a T-25, T-75, T-150 or T-225 flask, or a cell factory), at art-known conditions conducive to in vitro cell culture, such as temperature of 37°C, 5% v/v CO2 and > 95% humidity.

[0084] Methods related to culturing stem cells are also useful in the practice of this invention (see, e.g., "Teratocarcinomas and embryonic stem cells: A practical approach" (E. J. Robertson, ed., IRL Press Ltd. 1987); "Guide to Techniques in Mouse Development" (P. M. Wasserman et al. eds., Academic Press 1993); "Embryonic Stem Cells: Methods and Protocols" (Kursad Turksen, ed., Humana Press, Totowa N.J., 2001); "Embryonic Stem Cell Differentiation in vitro" (M. V. Wiles, Meth. Enzymol. 225: 900, 1993); "Properties and uses of Embryonic Stem Cells: Prospects for Application to Human Biology and Gene Therapy" (P. D. Rathjen et al., 1993). Differentiation of stem cells is reviewed, e.g., in Robertson. 1997. Meth Cell Biol 75: 173; Roach and McNeish. 2002. Methods Mol Biol 185: 1-16; and Pedersen. 1998. Reprod Fertil Dev 10: 31). For further elaboration of general techniques useful in the practice of this invention, the practitioner can refer to standard textbooks and reviews in cell biology, tissue culture, and embryology (see, e.g., Culture of Human Stem Cells (R. Ian Freshney, Glyn N. Stacey, Jonathan M. Auerbach - 2007); Protocols for Neural Cell Culture (Laurie C. Doering - 2009); Neural Stem Cell Assays (Navjot Kaur, Mohan C. Vemuri - 2015); Working with Stem Cells (Henning Ulrich, Priscilla Davidson Negraes - 2016); and Biomaterials as Stem Cell Niche (Krishnendu Roy - 2010)). In certain embodiments, stem cells are spontaneously differentiated or directed to differentiate (see, e.g., Amit and Itskovitz-Eldor, Derivation and spontaneous differentiation of human embryonic stem cells, J Anat. 2002 Mar; 200(3): 225-232). For further methods of cell culture solutions and systems, see International Patent Publication No. WO2014159356A1.

[0085] In certain embodiments, iPSCs or iPSC cell lines are used to identify transcription factors for differentiation of target cells. iPSCs advantageously can be used to generate patient specific models and cell types. iPSCs are a type of pluripotent stem cell that can be generated directly from adult cells. Further, because embryonic stem cells can only be derived from embryos, it has so far not been feasible to create patient-matched embryonic stem cell lines.

[0086] Various strategies can be used to induce pluripotency, or increase potency, in cells (Takahashi, K., and Yamanaka, S., Cell 126, 663-676 (2006); Takahashi et al., Cell 131, 861-872 (2007); Yu et al., Science 318, 1917-1920 (2007); Zhou et al., Cell Stem Cell 4, 381-384 (2009); Kim et al., Cell Stem Cell 4, 472-476 (2009); Yamanaka et al., 2009; Saha, K., Jaenisch, R., Cell Stem Cell 5, 584-595 (2009)), and improve the efficiency of reprogramming (Shi et al., Cell Stem Cell 2, 525-528 (2008a); Shi et al., Cell Stem Cell 3, 568-574 (2008b); Huangfu et al., Nat Biotechnol 26, 795-797 (2008a); Huangfu et al., Nat Biotechnol 26, 1269-1275 (2008b); Silva et al., PLoS Biol 6, e253, doi: 10.1371/journal.pbio.0060253 (2008); Lyssiotis et al., PNAS 106, 8912-8917 (2009); Ichida et al., Cell Stem Cell 5, 491-503 (2009); Maherali, N., Hochedlinger, K., Curr Biol 19, 1718-1723 (2009b); Esteban et al., Cell Stem Cell 6, 71-79 (2010); and Feng et al., Cell Stem Cell 4, 301-312 (2009)).

[0087] Generally, techniques for reprogramming involve modulation of specific cellular pathways, either directly or indirectly, using polynucleotide-, polypeptide- and/or small molecule-based approaches (see, e.g., International Patent Publication No. WO 2012/087965A2). The developmental potency of a cell may be increased, for example, by contacting a cell with one or more pluripotency factors. “Contacting”, as used herein, can involve culturing cells in the presence of a pluripotency factor (such as, for example, small molecules, proteins, peptides, etc.) or introducing pluripotency factors into the cell. Pluripotency factors can be introduced into cells by culturing the cells in the presence of the factor, including transcription factors such as proteins, under conditions that allow for introduction of the transcription factor into the cell. See, e.g., Zhou H et al., Cell Stem Cell. 2009 May 8;4(5):381-4; WO/2009/117439. Introduction into the cell may be facilitated, for example, using transient methods, e.g., protein transduction, microinjection, non-integrating gene delivery, mRNA transduction, etc., or any other suitable technique. In some embodiments, the transcription factors are introduced into the cells by expression from a recombinant vector that has been introduced into the cell, or by incubating the cells in the presence of exogenous transcription factor polypeptides such that the polypeptides enter the cell. In particular embodiments, the pluripotency factor is a transcription factor.
Exemplary transcription factors that are associated with increasing, establishing, or maintaining the potency of a cell include, but are not limited to, Oct-3/4, Cdx-2, Gbx2, Gsh1, HesX1, HoxA10, HoxA11, HoxB1, Irx2, Isl1, Meis1, Meox2, Nanog, Nkx2.2, Onecut, Otx1, Otx2, Pax5, Pax6, Pdx1, Tcf1, Tcf2, Zfhx1b, Klf-4, Atbf1, Esrrb, Gcnf, Jarid2, Jmjd1a, Jmjd2c, Klf-3, Klf-5, Mel-18, Myst3, Nac1, REST, Rex-1, Rybp, Sall4, Sall1, Tif1, YY1, Zeb2, Zfp281, Zfp57, Zic3, Coup-Tf1, Coup-Tf2, Bmi1, Rnf2, Mta1, Pias1, Pias2, Pias3, Piasy, Sox2, Lef1, Sox15, Sox6, Tcf-7, Tcf7l1, c-Myc, L-Myc, N-Myc, Hand1, Mad1, Mad3, Mad4, Mxi1, Myf5, Neurog2, Ngn3, Olig2, Tcf3, Tcf4, Foxc1, Foxd3, BAF155, C/EBPβ, mafa, Eomes, Tbx-3, Rfx4, Stat3, Stella, and UTF-1. Exemplary transcription factors include Oct4, Sox2, Klf4, c-Myc, and Nanog.

[0088] Small molecule reprogramming agents are also pluripotency factors and may also be employed in the methods of the invention for inducing reprogramming and maintaining or increasing cell potency. In some embodiments of the invention, one or more small molecule reprogramming agents are used to induce pluripotency of a somatic cell, increase or maintain the potency of a cell, or improve the efficiency of reprogramming. In some embodiments, small molecule reprogramming agents are employed in the methods of the invention to improve the efficiency of reprogramming. Improvements in efficiency of reprogramming can be measured by (1) a decrease in the time required for reprogramming and generation of pluripotent cells (e.g., by shortening the time to generate pluripotent cells by at least a day compared to a similar or same process without the small molecule), or alternatively, or in combination, (2) an increase in the number of pluripotent cells generated by a particular process (e.g., increasing the number of cells reprogrammed in a given time period by at least 10%, 30%, 50%, 100%, 200%, 500%, etc. compared to a similar or same process without the small molecule). In some embodiments, a 2- fold to 20-fold improvement in reprogramming efficiency is observed. In some embodiments, reprogramming efficiency is improved by more than 20 fold. In some embodiments, a more than 100 fold improvement in efficiency is observed over the method without the small molecule reprogramming agent (e.g., a more than 100 fold increase in the number of pluripotent cells generated). Several classes of small molecule reprogramming agents may be important to increasing, establishing, and/or maintaining the potency of a cell. 
Exemplary small molecule reprogramming agents include, but are not limited to: agents that inhibit H3K9 methylation or promote H3K9 demethylation; agents that inhibit H3K4 demethylation or promote H3K4 methylation; agents that inhibit histone deacetylation or promote histone acetylation; L-type Ca channel agonists; activators of the cAMP pathway; DNA methyltransferase (DNMT) inhibitors; nuclear receptor ligands; GSK3 inhibitors; MEK inhibitors; TGFβ receptor/ALK5 inhibitors; HDAC inhibitors; Erk inhibitors; ROCK inhibitors; FGFR inhibitors; and PARP inhibitors. Exemplary small molecule reprogramming agents include GSK3 inhibitors; MEK inhibitors; TGFβ receptor/ALK5 inhibitors; HDAC inhibitors; Erk inhibitors; and ROCK inhibitors.

[0089] In some embodiments of the invention, small molecule reprogramming agents are used to replace one or more transcription factors in the methods of the invention to induce pluripotency, improve the efficiency of reprogramming, and/or increase or maintain the potency of a cell. For example, in some embodiments, a cell is contacted with one or more small molecule reprogramming agents, wherein the agents are included in an amount sufficient to improve the efficiency of reprogramming. In other embodiments, one or more small molecule reprogramming agents are used in addition to transcription factors in the methods of the invention. In one embodiment, a cell is contacted with at least one pluripotency transcription factor and at least one small molecule reprogramming agent under conditions to increase, establish, and/or maintain the potency of the cell or improve the efficiency of the reprogramming process. In another embodiment, a cell is contacted with at least one pluripotency transcription factor and at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, or at least ten small molecule reprogramming agents under conditions and for a time sufficient to increase, establish, and/or maintain the potency of the cell or improve the efficiency of reprogramming. The state of potency or differentiation of cells can be assessed by monitoring the pluripotency characteristics (e.g., expression of markers including, but not limited to, SSEA-3, SSEA-4, TRA-1-60, TRA-1-81, TRA-2-49/6E, Oct-3/4, Sox2, Nanog, GDF3, REX1, FGF4, ESG1, DPPA2, DPPA4, and hTERT).

Diseases Associated with Paneth Cells and Stem Cell Regulation

[0090] In certain embodiments, diseases associated with Lgr5+ stem cell differentiation (e.g., diseases associated with aberrant barrier function, or diseases of the gut, respiratory system or inner ear) are treated, diagnosed or monitored using the methods and compositions of the present invention. Aberrant barrier function can be associated with increased inflammation and cell death. Epithelial barrier tissues are central to immunological homeostasis, interfacing with stromal and immune cells to coordinate appropriate responses to environmental stimuli. Because of this centrality, therapeutic development for a wide spectrum of diseases has sought to identify immune-modifying targets within barrier tissue epithelium. Within the intestinal epithelium, barrier and absorptive function is provided by cellular specialists. Adult intestinal stem cells (ISCs) provide a source of constant regeneration from which an ordered process of differentiation into secretory and absorptive epithelia sets composition, thereby setting function. Compelling evidence suggests that either changes in epithelial composition arising from aberrant cues driving stem cell differentiation or changes in the functional quality of differentiated specialists may be a precipitating factor in certain cancers, infections, and immune-mediated diseases.

[0091] The intestinal epithelium is ordered in a single-layer ‘conveyor belt’ originating from ISCs, conventionally identified as Lgr5+ (Barker et al, 2007). ISCs are co-localized at the base of intestinal crypts with antimicrobial and niche-supporting Paneth cells. Paneth cells support stem cell function through the secretion of growth signaling molecules that are required for proliferation and maintenance. The epithelial conveyor extends from rapidly-dividing crypt-adjacent progenitors into lumenal protrusions known as villi - primarily composed of nutrient-absorbing enterocytes, along with secretory goblet and enteroendocrine populations. Under homeostatic conditions, Wnt, BMP, and Notch signaling maintain the ISC niche (Kim et al, 2005; Pinto et al, 2003). However, ISCs have a demonstrated capacity to integrate dietary and immune-derived signals to modulate their self-renewal and differentiation into specific secretory lineages (Beyaz et al, 2016; Biton et al, 2018; von Moltke et al, 2016). Further, following major injury to the epithelium, the ISC niche has a remarkable capacity to regenerate from non-stem or quiescent stem pools (Ayyaz et al, 2019; Tetteh et al, 2016; Yan et al, 2017). Cellular identity in the stem cell niche is fluid in response to multiple stimuli, and alterations in the barrier that arise from this stem cell population may be directly altered in disease, or potentially controlled via the ‘synthetic’ provision of novel cues.

[0092] Compelling evidence in the upper respiratory tract and skin suggests that inflammatory disease may be driven by alterations in the tissue stem cell population (Naik et al, 2017; Ordovas-Montanes et al, 2018). In the upper respiratory tract, alterations in stem cells manifest as shifted differentiation trajectories which alter the quality of mature epithelial specialists. This observation may extend to disease involving the small intestine - for example, the multiple observations of altered Paneth cell quality of unknown origin in inflammatory bowel disease (IBD) (Gassler, 2017; Khor et al, 2011; Liu et al., 2016; McGuckin et al., 2009; Xavier and Podolsky, 2007). Similar Paneth cell aberrations occur in necrotizing enterocolitis (NEC), where Paneth cell number and quality are diminished, corresponding with intestinal immaturity, excessive inflammation and systemic infection (McElroy et al, 2013; Sherman et al., 2005; Tanner et al., 2015; White et al., 2017). Emerging evidence suggests that certain viral pathogens, including a subset of coronaviruses, may mediate their profound disruption of the intestinal barrier via a Paneth cell axis (Wu et al., 2020). Finally, Paneth cells are implicated in Graft versus Host disease (GvHD), which occurs after an allogeneic hematopoietic stem cell transplant in which donor T cells cause an inflammatory response in the host. Patients with GvHD can exhibit a loss in Paneth cell number and quality, and microbial dysbiosis (Eriguchi et al., 2012). Given that tissue stem cells possess significant control over barrier composition and function, and given the known connections between stem cell aberrations and human disease, observations of altered Paneth cell quality in disease raise the question of whether targeting ISCs to restore Paneth cells is therapeutically viable.

[0093] This has been demonstrated in GvHD. Treatment with R-spondin1 (R), a potent WNT agonist, can elevate the secretion of alpha-defensins and resolve dysbiosis seen in mice with GvHD by stimulating ISCs to differentiate into Paneth cells (Hayase et al., 2017). However, while treatment with R illustrates the importance of stem cell cues driving barrier tissue reconstitution via specialist-specific differentiation, it faces a major challenge in clinical translation because WNT activation is implicated in precancerous hyperplasia (Han et al., 2017; Okubo and Hogan, 2004; Sansom et al., 2004). While the effects of R are inconclusive with respect to malignancy (Kim et al., 2005; Zhou et al., 2017b), WNT signaling must be carefully balanced to ensure homeostasis without priming for cancer. Other signaling pathways known to drive Paneth cell differentiation, including Notch signaling, face similar challenges. Activation of Notch signaling amplifies the proliferative progenitor population and promotes an absorptive cell lineage (Fre et al., 2005; Jensen et al, 2000; VanDussen et al., 2012). Conversely, deactivation of Notch signaling amplifies differentiation to all secretory cell types and causes secretory cell hyperplasia (VanDussen and Samuelson, 2010). As these pathways affect multiple cell types in the intestinal epithelium and may lead to hyperplasia, they are not therapeutically viable. Therefore, a more specific treatment to accomplish stem to Paneth differentiation is desirable.

[0094] A skilled person can readily determine diseases that can be treated by reducing an inflammatory response.
Type 2 inflammatory responses have been associated with allergic asthma, therapy-resistant asthma, steroid-resistant severe allergic airway inflammation, systemic steroid-dependent severe eosinophilic asthma, chronic rhino-sinusitis (CRS), atopic dermatitis, food allergies, persistence of chronic airway inflammation, and primary eosinophilic gastrointestinal disorders (EGIDs), including but not limited to eosinophilic esophagitis (EoE), eosinophilic gastritis, eosinophilic gastroenteritis, and eosinophilic colitis (see, e.g., Van Rijt et al., Type 2 innate lymphoid cells: at the cross-roads in allergic asthma, Seminars in Immunopathology July 2016, Volume 38, Issue 4, pp 483-496; Rivas et al., IL-4 production by group 2 innate lymphoid cells promotes food allergy by blocking regulatory T-cell function, J Allergy Clin Immunol. 2016 Sep; 138(3):801-811 e9; and Morita, Hideaki et al. Innate lymphoid cells in allergic and nonallergic inflammation, Journal of Allergy and Clinical Immunology, Volume 138, Issue 5, 1253-1264). Asthma is characterized by recurrent episodes of wheezing, shortness of breath, chest tightness, and coughing. Sputum may be produced from the lung by coughing but is often hard to bring up. During recovery from an attack, it may appear pus-like due to high levels of eosinophils. Symptoms are usually worse at night and in the early morning or in response to exercise or cold air. Some people with asthma rarely experience symptoms, usually in response to triggers, whereas others may have marked and persistent symptoms. CRS is characterized by inflammation of the mucosal surfaces of the nose and para-nasal sinuses, and it often coexists with allergic asthma. Atopic dermatitis is a chronic inflammatory skin disease that is characterized by eosinophilic infiltration and high serum IgE levels. Similar to allergic asthma and CRS, atopic dermatitis has been associated with increased expression of TSLP, IL-25, and IL-33 in the skin.
Primary eosinophilic gastrointestinal disorders (EGIDs), including eosinophilic esophagitis (EoE), eosinophilic gastritis, eosinophilic gastroenteritis, and eosinophilic colitis, are disorders that exhibit eosinophil-rich inflammation in the gastrointestinal tract in the absence of known causes for eosinophilia such as parasite infection and drug reaction.

[0095] In certain embodiments, a disease or disorder that can be treated by reducing an inflammatory response or maintaining homeostasis may be any inflammatory disease or disorder such as, but not limited to, asthma, allergy, allergic rhinitis, allergic airway inflammation, atopic dermatitis (AD), chronic obstructive pulmonary disease (COPD), inflammatory bowel disease (IBD), multiple sclerosis, arthritis, psoriasis, eosinophilic esophagitis, eosinophilic pneumonia, eosinophilic psoriasis, hypereosinophilic syndrome, graft-versus-host disease, uveitis, cardiovascular disease, pain, lupus, vasculitis, chronic idiopathic urticaria and Eosinophilic Granulomatosis with Polyangiitis (Churg-Strauss Syndrome).

[0096] The asthma may be allergic asthma, non-allergic asthma, severe refractory asthma, asthma exacerbations, viral-induced asthma or viral-induced asthma exacerbations, steroid resistant asthma, steroid sensitive asthma, eosinophilic asthma or non-eosinophilic asthma and other related disorders characterized by airway inflammation or airway hyperresponsiveness (AHR).

[0097] The COPD may be a disease or disorder associated in part with, or caused by, cigarette smoke, air pollution, occupational chemicals, allergy or airway hyperresponsiveness.

[0098] The allergy may be associated with foods, pollen, mold, dust mites, animals, or animal dander.

[0099] The IBD may be ulcerative colitis (UC), Crohn's Disease, collagenous colitis, lymphocytic colitis, ischemic colitis, diversion colitis, Behcet's syndrome, infective colitis, indeterminate colitis, and other disorders characterized by inflammation of the mucosal layer of the large intestine or colon.

[0100] The arthritis may be selected from the group consisting of osteoarthritis, rheumatoid arthritis and psoriatic arthritis.

[0101] In certain embodiments, hearing loss is treated using the methods of the present invention. In certain embodiments, hearing loss is treated by differentiating stem cells of the inner ear in vivo or by transferring cells differentiated ex vivo. In certain embodiments, the treatment to enrich for Paneth cell differentiation in the intestines by inhibiting nuclear export, as described herein, is used to sustain and/or modulate inner ear stem cells, such that differentiation to hair cells is improved. Deafness can be caused by genetic and environmental factors, mostly affecting the non-regenerating hair cells of the inner ear. Recent work established a protocol for expansion of Lgr5-positive cochlear cells as organoids, to obtain Lgr5-positive cochlear progenitors (LCPs) in large numbers in vitro, using a combination of growth factors and small molecules that are used herein for intestinal stem cells (ISCs) (see, e.g., McLean et al, 2017 Clonal expansion of Lgr5-positive cells from mammalian cochlea and high-purity generation of sensory hair cells. Cell Rep. 18 1917-192; and Lenz et al., Applications of Lgr5-Positive Cochlear Progenitors (LCPs) to the Study of Hair Cell Differentiation Front Cell Dev Biol. 2019; 7: 14). The LCPs could be efficiently differentiated into hair cells. The methods of the present invention provide for an improved targeted differentiation of LGR5+ stem cells.

Target Genes and Pathways

[0102] In certain embodiments, specific pathways or biological programs are modulated to enhance Paneth cell differentiation. In certain embodiments, the specific pathways or biological programs are detected or monitored. As used herein the term “biological program” can be used interchangeably with “expression program” or “transcriptional program” and may refer to a set of genes that share a role in a biological function (e.g., an activation program, cell differentiation program, proliferation program). Biological programs can include a pattern of gene expression that results in a corresponding physiological event or phenotypic trait. Biological programs can include up to several hundred genes that are expressed in a spatially and temporally controlled fashion. Expression of individual genes can be shared between biological programs. Expression of individual genes can be shared among different single cell types; however, expression of a biological program may be cell type specific or temporally specific (e.g., the biological program is expressed in a cell type at a specific time). Multiple biological programs may include the same gene, reflecting the gene’s roles in different processes. Applicants have identified that XPO1 inhibitors enhance stem cell conversion to mature cells of the secretory pathway. Applicants performed single-cell sequencing of an ex vivo cell-based system treated with XPO1 inhibitors. Applicants identified pathways (e.g., stress response and mitogen signaling) and genes differentially expressed in response to the treatment (see, Tables 1, 2 and 3; and Examples). Applicants identified up- and down-regulated genes and used correlation analysis to identify genes that are up- and down-regulated together. In certain embodiments, downstream targets of XPO1 inhibition can be used for any therapeutic, diagnostic or screening methods described herein.
In certain embodiments, genes that are upregulated or downregulated in response to nuclear export inhibition are modulated or detected according to the methods described further herein.

Table 1.

Table 2.

Table 3A-B. Differentially expressed genes and gene set enrichment analysis (GSEA) over differentially expressed genes between KPT-330 treated and untreated stem II / III cells over days 0.25, 1, 2 in organoid differentiation time course single-cell RNA-seq.

Table 3A. DE results ranked by Log2 fold-change for: stem II / III, 0.25-2 days. The list of genes was obtained using the following significance cut-offs: FDR < 0.05 and abs(Log2 fold-change) > 2*st.dev (0.208).

Table 3B. GSEA MSigDB Hallmark Genesets v7. The following significance cut-offs were used: FDR < 0.05.
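The cut-offs stated for Table 3A can be illustrated with a short sketch. The Python below is a minimal, hypothetical example and not the Applicants' analysis code: the record layout (gene, log2fc, fdr), the function names and the toy values are assumptions made for illustration, as are the choice to compute the standard deviation over all log2 fold-changes in the comparison and the descending sort order.

```python
from statistics import stdev


def two_sd_threshold(log2_fold_changes):
    """Return 2 x the sample standard deviation of the log2 fold-changes.

    Assumption: the standard deviation is taken over every gene tested.
    """
    return 2.0 * stdev(log2_fold_changes)


def select_de_genes(results, fdr_cutoff=0.05, lfc_threshold=0.208):
    """Keep genes with FDR < fdr_cutoff and |log2FC| > lfc_threshold.

    `results` is a list of dicts with hypothetical keys
    'gene', 'log2fc' and 'fdr'. Hits are ranked by log2 fold-change,
    highest first (an assumed ordering).
    """
    hits = [
        r for r in results
        if r["fdr"] < fdr_cutoff and abs(r["log2fc"]) > lfc_threshold
    ]
    return sorted(hits, key=lambda r: r["log2fc"], reverse=True)


# Toy records (made-up values, not data from the tables).
toy_results = [
    {"gene": "GeneA", "log2fc": 0.90, "fdr": 0.001},
    {"gene": "GeneB", "log2fc": -0.50, "fdr": 0.010},
    {"gene": "GeneC", "log2fc": 0.10, "fdr": 0.0001},  # effect too small
    {"gene": "GeneD", "log2fc": 1.20, "fdr": 0.200},   # not significant
]

selected = [r["gene"] for r in select_de_genes(toy_results)]
print(selected)
```

Both conditions must hold for a gene to be retained, so a gene with a very small FDR but a fold-change inside the threshold band is excluded, as is a gene with a large fold-change that fails the FDR cut-off.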

[0103] The present invention includes the use of gene signatures, biological programs, or pathways. As used herein a “signature” may encompass any gene or genes, protein or proteins, or epigenetic element(s) whose expression profile or whose occurrence is associated with a specific cell type, subtype, or cell state of a specific cell type or subtype within a population of cells. For ease of discussion, when discussing gene expression, any of gene or genes, protein or proteins, or epigenetic element(s) may be substituted. As used herein, the terms “signature”, “expression profile”, or “expression program” may be used interchangeably. It is to be understood that also when referring to proteins (e.g. differentially expressed proteins), such may fall within the definition of “gene” signature. Levels of expression or activity or prevalence may be compared between different cells in order to characterize or identify for instance signatures specific for cell (sub)populations. Increased or decreased expression or activity or prevalence of signature genes may be compared between different cells in order to characterize or identify for instance specific cell (sub)populations. The detection of a signature in single cells may be used to identify and quantitate for instance specific cell (sub)populations. A signature may include a gene or genes, protein or proteins, or epigenetic element(s) whose expression or occurrence is specific to a cell (sub)population, such that expression or occurrence is exclusive to the cell (sub)population. A gene signature as used herein, may thus refer to any set of up- and down-regulated genes that are representative of a cell type or subtype. A gene signature as used herein, may also refer to any set of up- and down-regulated genes between different cells or cell (sub)populations derived from a gene-expression profile. For example, a gene signature may comprise a list of genes differentially expressed in a distinction of interest.

[0104] The signature as defined herein (be it a gene signature, protein signature or other genetic or epigenetic signature) can be used to indicate the presence of a cell type, a subtype of the cell type, the state of the microenvironment of a population of cells, a particular cell type population or subpopulation, and/or the overall status of the entire cell (sub)population. Furthermore, the signature may be indicative of cells within a population of cells in vivo. The signature may also be used to suggest for instance particular therapies, or to follow up treatment, or to suggest ways to modulate immune systems. The signatures of the present invention may be discovered by analysis of expression profiles of single cells within a population of cells from isolated samples (e.g. tumor samples), thus allowing the discovery of novel cell subtypes or cell states that were previously invisible or unrecognized. The presence of subtypes or cell states may be determined by subtype specific or cell state specific signatures. The presence of these specific cell (sub)types or cell states may be determined by applying the signature genes to bulk sequencing data in a sample. Not being bound by a theory, the signatures of the present invention may be microenvironment specific, such as their expression in a particular spatio-temporal context. Not being bound by a theory, signatures as discussed herein are specific to a particular pathological context. Not being bound by a theory, a combination of cell subtypes having a particular signature may indicate an outcome. Not being bound by a theory, the signatures can be used to deconvolute the network of cells present in a particular pathological condition. Not being bound by a theory, the presence of specific cells and cell subtypes is indicative of a particular response to treatment, including, for example, increased or decreased susceptibility to treatment. The signature may indicate the presence of one particular cell type.
In one embodiment, the novel signatures are used to detect multiple cell states or hierarchies that occur in subpopulations of cancer cells that are linked to a particular pathological condition (e.g. cancer grade), or linked to a particular outcome or progression of the disease (e.g. metastasis), or linked to a particular response to treatment of the disease.

[0105] The signature according to certain embodiments of the present invention may comprise or consist of one or more genes, proteins and/or epigenetic elements, such as for instance 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. In certain embodiments, the signature may comprise or consist of two or more genes, proteins and/or epigenetic elements, such as for instance 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. In certain embodiments, the signature may comprise or consist of three or more genes, proteins and/or epigenetic elements, such as for instance 3, 4, 5, 6, 7, 8, 9, 10 or more. In certain embodiments, the signature may comprise or consist of four or more genes, proteins and/or epigenetic elements, such as for instance 4, 5, 6, 7, 8, 9, 10 or more. In certain embodiments, the signature may comprise or consist of five or more genes, proteins and/or epigenetic elements, such as for instance 5, 6, 7, 8, 9, 10 or more. In certain embodiments, the signature may comprise or consist of six or more genes, proteins and/or epigenetic elements, such as for instance 6, 7, 8, 9, 10 or more. In certain embodiments, the signature may comprise or consist of seven or more genes, proteins and/or epigenetic elements, such as for instance 7, 8, 9, 10 or more. In certain embodiments, the signature may comprise or consist of eight or more genes, proteins and/or epigenetic elements, such as for instance 8, 9, 10 or more. In certain embodiments, the signature may comprise or consist of nine or more genes, proteins and/or epigenetic elements, such as for instance 9, 10 or more. In certain embodiments, the signature may comprise or consist of ten or more genes, proteins and/or epigenetic elements, such as for instance 10, 11, 12, 13, 14, 15, or more. It is to be understood that a signature according to the invention may for instance also include genes or proteins as well as epigenetic elements combined.

[0106] In certain embodiments, a signature is characterized as being specific for a particular cell or cell (sub)population if it is upregulated or only present, detected or detectable in that particular cell or cell (sub)population, or alternatively is downregulated or only absent, or undetectable in that particular cell or cell (sub)population. In this context, a signature consists of one or more differentially expressed genes/proteins or differential epigenetic elements when comparing different cells or cell (sub)populations, including comparing different tumor cells or tumor cell (sub)populations, as well as comparing tumor cells or tumor cell (sub)populations with non-tumor cells or non-tumor cell (sub)populations. It is to be understood that “differentially expressed” genes/proteins include genes/proteins which are up- or down-regulated as well as genes/proteins which are turned on or off. When referring to up- or down-regulation, in certain embodiments, such up- or down-regulation is preferably at least two-fold, such as two-fold, three-fold, four-fold, five-fold, or more, such as for instance at least ten-fold, at least 20-fold, at least 30-fold, at least 40-fold, at least 50-fold, or more. Alternatively, or in addition, differential expression may be determined based on common statistical tests, as is known in the art. In certain embodiments, differential expression may be determined by comparing expression to the mean or median expression of all expressed genes or to a subset of genes.
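As a purely illustrative reading of the fold-change language above, the following Python sketch labels a gene as up-regulated, down-regulated, turned on, turned off, or unchanged between two conditions. The function name, the label strings, and the treatment of a zero reference level as "on" are assumptions introduced here; the specification does not prescribe a particular implementation, and a statistical test could be used instead of or in addition to a fold threshold.

```python
def classify_regulation(test_level, reference_level, min_fold=2.0):
    """Label the change in expression between two conditions.

    test_level, reference_level: non-negative expression values
    (hypothetical units, e.g. normalized counts).
    min_fold: minimum fold-change to call up- or down-regulation;
    2.0 corresponds to the "at least two-fold" example in the text.
    """
    if test_level < 0 or reference_level < 0:
        raise ValueError("expression levels must be non-negative")
    if min_fold <= 1:
        raise ValueError("min_fold must be greater than 1")

    # Genes absent in both conditions show no change.
    if test_level == 0 and reference_level == 0:
        return "unchanged"
    # Genes detected in only one condition are "turned on or off".
    if reference_level == 0:
        return "on"
    if test_level == 0:
        return "off"

    fold = test_level / reference_level
    if fold >= min_fold:
        return "up"
    if fold <= 1.0 / min_fold:
        return "down"
    return "unchanged"


# Toy usage with made-up values.
for test, ref in [(8, 2), (2, 8), (3, 2), (5, 0), (0, 5)]:
    print(test, ref, classify_regulation(test, ref))
```

Raising `min_fold` to 10, 20 or 50 reproduces the stricter thresholds listed in the paragraph above without changing the rest of the logic.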

[0107] As discussed herein, differentially expressed genes/proteins, or differential epigenetic elements may be differentially expressed on a single cell level, or may be differentially expressed on a cell population level. Preferably, the differentially expressed genes/proteins or epigenetic elements as discussed herein, such as those constituting the gene signatures as discussed herein, when referring to the cell population level, refer to genes that are differentially expressed in all or substantially all cells of the population (such as at least 80%, preferably at least 90%, such as at least 95% of the individual cells). This allows one to define a particular subpopulation of cells. As referred to herein, a “subpopulation” of cells preferably refers to a particular subset of cells of a particular cell type which can be distinguished or are uniquely identifiable and set apart from other cells of this cell type. The cell subpopulation may be phenotypically characterized and is preferably characterized by the signature as discussed herein. A cell (sub)population as referred to herein may consist of a (sub)population of cells of a particular cell type characterized by a specific cell state.
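The population-level criterion above can be sketched as a simple fraction check: a gene counts as differentially expressed at the population level when it is flagged in at least 80% (or, more strictly, 90% or 95%) of the individual cells. The function names and the per-cell boolean input are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Iterable


def fraction_expressing(cell_flags: Iterable[bool]) -> float:
    """Fraction of cells in which the gene is flagged as differentially expressed."""
    flags = list(cell_flags)
    if not flags:
        raise ValueError("population is empty")
    return sum(1 for flag in flags if flag) / len(flags)


def is_population_level(cell_flags: Iterable[bool],
                        min_fraction: float = 0.80) -> bool:
    """True if the gene is flagged in at least `min_fraction` of the cells."""
    return fraction_expressing(cell_flags) >= min_fraction
```

For example, a gene flagged in 9 of 10 cells passes the 80% and 90% thresholds but not the 95% threshold.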

[0108] When referring to induction, or alternatively suppression of a particular signature, preferably is meant induction or alternatively suppression (or upregulation or downregulation) of at least one gene/protein and/or epigenetic element of the signature, such as for instance at least two, at least three, at least four, at least five, at least six, or all genes/proteins and/or epigenetic elements of the signature.

In vitro Models

[0109] In certain embodiments, the present invention provides methods of generating target cell types in vitro. In vitro models may be obtained by modulating factors or pathways as described herein.

[0110] In certain embodiments, the in vitro models of the present invention may be used to study development, cell biology and disease. In certain embodiments, the in vitro models of the present invention may be used to screen for drugs capable of modulating the target cells or for determining toxicity of drugs (e.g., toxic to Paneth cells). In certain embodiments, the in vitro models of the present invention may be used to identify specific cell states and/or subtypes.

[0111] In certain embodiments, the in vitro models of the present invention may be used in perturbation studies. Perturbations may include conditions, substances or agents. Agents may be of physical, chemical, biochemical and/or biological nature. Perturbations may include treatment with a small molecule, protein, RNAi, CRISPR system, TALE system, Zn finger system, meganuclease, pathogen, allergen, biomolecule, or environmental stress. Such methods may be performed in any manner appropriate for the particular application.

[0112] In certain embodiments, the in vitro models are configured for performing perturb-seq. Methods and tools for genome-scale screening of perturbations in single cells using CRISPR have been described, herein referred to as perturb-seq (see e.g., Dixit et al., “Perturb-Seq: Dissecting Molecular Circuits with Scalable Single-Cell RNA Profiling of Pooled Genetic Screens” 2016, Cell 167, 1853-1866; Adamson et al., “A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response” 2016, Cell 167, 1867-1882; Feldman et al., Lentiviral co-packaging mitigates the effects of intermolecular recombination and multiple integrations in pooled genetic screens, bioRxiv 262121, doi: doi.org/10.1101/262121; Datlinger, et al., 2017, Pooled CRISPR screening with single-cell transcriptome readout. Nature Methods. Vol.14 No.3 DOI: 10.1038/nmeth.4177; Hill et al., On the design of CRISPR-based single cell molecular screens, Nat Methods. 2018 Apr; 15(4): 271-274; Replogle, et al., “Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing” Nat Biotechnol (2020). doi.org/10.1038/s41587-020-0470-y; and International Patent Publication No. WO/2017/075294).

THERAPEUTIC METHODS

[0113] In certain embodiments, diseases characterized by an inflammatory response are treated by differentiating stem cells in vivo or by transferring cells differentiated ex vivo (e.g., differentiating to Paneth cells). In certain embodiments, a disease may be treated by inducing target cells in vivo. Target cells may be induced in vivo by activating or inhibiting a pathway, modulation of expression or activity of a target gene, such as, expressing transcription factors at a specific site of the disease (e.g., ATF3). Transcription factors may be provided to specific cells at a location of disease. In certain embodiments, mRNA is provided. In certain embodiments, low dose nuclear export inhibitors are administered to a subject.

[0114] It will be understood by the skilled person that treating as referred to herein encompasses enhancing treatment, or improving treatment efficacy. Treatment may include inhibition of an inflammatory response, enhancing an immune response, tumor regression as well as inhibition of tumor growth, metastasis or tumor cell proliferation, or inhibition or reduction of otherwise deleterious effects associated with the tumor.

[0115] Efficaciousness of treatment is determined in association with any known method for diagnosing or treating the particular disease. The invention comprehends a treatment method comprising any one of the methods or uses herein discussed.

[0116] The phrase "therapeutically effective amount" as used herein refers to a sufficient amount of a drug, agent, or compound to provide a desired therapeutic effect.

[0117] As used herein “patient” refers to any human being receiving or who may receive medical treatment and is used interchangeably herein with the term “subject”.

[0118] Therapy or treatment according to the invention may be performed alone or in conjunction with another therapy, and may be provided at home, the doctor’s office, a clinic, a hospital’s outpatient department, or a hospital. Treatment generally begins at a hospital so that the doctor can observe the therapy’s effects closely and make any adjustments that are needed. The duration of the therapy depends on the age and condition of the patient, the stage of the cancer, and how the patient responds to the treatment. Additionally, a person having a greater risk of developing an inflammatory response (e.g., aberrant barrier function) may receive prophylactic treatment to inhibit or delay symptoms of the disease.

[0119] In certain embodiments, a cell-based therapeutic includes engraftment of the cells of the present invention. As used herein, the term "engraft" or "engraftment" refers to the process of cell incorporation into a tissue of interest in vivo through contact with existing cells of the tissue. In certain embodiments, the cell-based therapy may comprise adoptive cell transfer (ACT). As used herein adoptive cell transfer and adoptive cell therapy are used interchangeably. In certain embodiments, the target cells differentiated according to the methods described herein may be transferred to a subject in need thereof. If possible, use of autologous cells helps the recipient by minimizing GVHD issues. In certain embodiments, autologous stem cells are harvested from a subject and the cells are modulated to differentiate the stem cells into target cells (e.g., by overexpressing a transcription factor, such as ATF3).

Pharmaceutical Compositions and Delivery

[0120] Target cells of the present invention may be combined with various components to produce compositions of the invention. The compositions may be combined with one or more pharmaceutically acceptable carriers or diluents to produce a pharmaceutical composition (which may be for human or animal use). Suitable carriers and diluents include, but are not limited to, isotonic saline solutions, for example phosphate-buffered saline. The composition of the invention may be administered by direct injection. The composition may be formulated for parenteral, intramuscular, intravenous, subcutaneous, intraocular, oral, transdermal administration, or injection into the spinal fluid.

[0121] Compositions comprising target cells may be delivered by injection or implantation. Cells may be delivered in suspension or embedded in a support matrix such as natural and/or synthetic biodegradable matrices. Natural matrices include, but are not limited to, collagen matrices. Synthetic biodegradable matrices include, but are not limited to, polyanhydrides and polylactic acid. These matrices may provide support for fragile cells in vivo.

[0122] The compositions may also comprise the target cells of the present invention, and at least one pharmaceutically acceptable excipient, carrier, or vehicle.

[0123] Delivery may also be by controlled delivery, i.e., delivered over a period of time which may be from several minutes to several hours or days. Delivery may be systemic (for example by intravenous injection) or directed to a particular site of interest. Cells may be introduced in vivo using liposomal transfer.

[0124] Target cells may be administered in doses of from 1 × 10^5 to 1 × 10^7 cells per kg. For example a 70 kg patient may be administered 1.4 × 10^6 cells for reconstitution of tissues. The dosages may be any combination of the target cells listed in this application.
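As a rough illustration of how a per-kilogram cell dose scales with body weight, the following Python sketch multiplies the stated range of 1 × 10^5 to 1 × 10^7 cells per kg by a subject's weight. The constant and function names are hypothetical. The sketch derives totals from the per-kg bounds only and does not attempt to reproduce the specific 70 kg example figure quoted in the paragraph.

```python
# Bounds of the per-kilogram dose range stated in the paragraph (cells per kg).
MIN_CELLS_PER_KG = 1e5
MAX_CELLS_PER_KG = 1e7


def total_cell_dose(cells_per_kg: float, body_weight_kg: float) -> float:
    """Total number of cells for a given per-kg dose and body weight."""
    if cells_per_kg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return cells_per_kg * body_weight_kg


def total_dose_range(body_weight_kg: float) -> tuple[float, float]:
    """Lower and upper total cell counts implied by the per-kg range."""
    return (total_cell_dose(MIN_CELLS_PER_KG, body_weight_kg),
            total_cell_dose(MAX_CELLS_PER_KG, body_weight_kg))
```

Under these assumptions, `total_dose_range(70.0)` spans 7 × 10^6 to 7 × 10^8 cells.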

[0125] The modifying agents and other modulating agents, or components thereof, or nucleic acid molecules thereof, or nucleic acid molecules encoding or providing components thereof, may be delivered by a delivery system herein described.

[0126] Vector delivery, e.g., plasmid, viral delivery: the modulating agents can be delivered using any suitable vector, e.g., plasmid or viral vectors, such as adeno-associated virus (AAV), lentivirus, adenovirus or other viral vector types, or combinations thereof. In some embodiments, the vector, e.g., plasmid or viral vector, is delivered to the tissue of interest by, for example, an intramuscular injection, while at other times the delivery is via intravenous, transdermal, intranasal, oral, mucosal, or other delivery methods. Such delivery may be either via a single dose, or multiple doses. One skilled in the art understands that the actual dosage to be delivered herein may vary greatly depending upon a variety of factors, such as the vector choice, the target cell, organism, or tissue, the general condition of the subject to be treated, the degree of transformation/modification sought, the administration route, the administration mode, the type of transformation/modification sought, etc.

[0127] In certain embodiments, small molecules, proteins, mRNA or cells are administered via targeted injection (e.g., the tissue to be repaired), intravenous, infusion, or other delivery methods. Such delivery may be either via a single dose, or multiple doses. One skilled in the art understands that the actual dosage to be delivered herein may vary greatly depending upon a variety of factors, such as the target cell, or tissue, the general condition of the subject to be treated, the degree of modification sought, the administration route, the administration mode, the type of modification sought, etc.

[0128] In certain embodiments, transcription factors are expressed in target tissue cells temporarily. In certain embodiments, the time of transcription factor expression or enhancement is only the time required to differentiate stem cells into target cells. In certain embodiments, transcription factors are expressed or enhanced for 1 to 14 days, preferably, about 2 days. In certain embodiments, the means of delivery does not result in integration of a sequence encoding transcription factors in the genome of target cells.

[0129] A“pharmaceutical composition” refers to a composition that usually contains an excipient, such as a pharmaceutically acceptable carrier that is conventional in the art and that is suitable for administration to cells or to a subject.

[0130] The pharmaceutical composition according to the present invention can, in one alternative, include a prodrug. When a pharmaceutical composition according to the present invention includes a prodrug, prodrugs and active metabolites of a compound may be identified using routine techniques known in the art. (See, e.g., Bertolini et al., J. Med. Chem., 40, 2011-2016 (1997); Shan et al., J. Pharm. Sci., 86 (7), 765-767; Bagshawe, Drug Dev. Res., 34, 220-230 (1995); Bodor, Advances in Drug Res., 13, 224-331 (1984); Bundgaard, Design of Prodrugs (Elsevier Press 1985); Larsen, Design and Application of Prodrugs, Drug Design and Development (Krogsgaard-Larsen et al., eds., Harwood Academic Publishers, 1991); Dear et al., J. Chromatogr. B, 748, 281-293 (2000); Spraul et al., J. Pharmaceutical & Biomedical Analysis, 10, 601-605 (1992); and Prox et al., Xenobiol., 3, 103-112 (1992)).

[0131] The term “pharmaceutically acceptable” as used throughout this specification is consistent with the art and means compatible with the other ingredients of a pharmaceutical composition and not deleterious to the recipient thereof.

[0132] As used herein, “carrier” or “excipient” includes any and all solvents, diluents, buffers (such as, e.g., neutral buffered saline or phosphate buffered saline), solubilizers, colloids, dispersion media, vehicles, fillers, chelating agents (such as, e.g., EDTA or glutathione), amino acids (such as, e.g., glycine), proteins, disintegrants, binders, lubricants, wetting agents, emulsifiers, sweeteners, colorants, flavorings, aromatizers, thickeners, agents for achieving a depot effect, coatings, antifungal agents, preservatives, stabilizers, antioxidants, tonicity controlling agents, absorption delaying agents, and the like. The use of such media and agents for pharmaceutical active components is well known in the art. Such materials should be non-toxic and should not interfere with the activity of the cells or active components.

[0133] The precise nature of the carrier or excipient or other material will depend on the route of administration. For example, the composition may be in the form of a parenterally acceptable aqueous solution, which is pyrogen-free and has suitable pH, isotonicity and stability. For general principles in medicinal formulation, the reader is referred to Cell Therapy: Stem Cell Transplantation, Gene Therapy, and Cellular Immunotherapy, by G. Morstyn & W. Sheridan eds., Cambridge University Press, 1996; and Hematopoietic Stem Cell Therapy, E. D. Ball, J. Lister & P. Law, Churchill Livingstone, 2000.

[0134] The pharmaceutical composition can be applied parenterally, rectally, orally or topically. Preferably, the pharmaceutical composition may be used for intravenous, intramuscular, subcutaneous, peritoneal, peridural, rectal, nasal, pulmonary, mucosal, or oral application. In a preferred embodiment, the pharmaceutical composition according to the invention is intended to be used as an infusion. The skilled person will understand that compositions which are to be administered orally or topically will usually not comprise cells, although it may be envisioned for oral compositions to also comprise cells, for example when gastro-intestinal tract indications are treated. Each of the cells or active components (e.g., immunomodulants) as discussed herein may be administered by the same route or may be administered by a different route. By means of example, and without limitation, cells may be administered parenterally and other active components may be administered orally.

[0135] Liquid pharmaceutical compositions may generally include a liquid carrier such as water or a pharmaceutically acceptable aqueous solution. For example, physiological saline solution, tissue or cell culture media, dextrose or other saccharide solution or glycols such as ethylene glycol, propylene glycol or polyethylene glycol may be included.

[0136] The composition may include one or more cell protective molecules, cell regenerative molecules, growth factors, anti-apoptotic factors or factors that regulate gene expression in the cells. Such substances may render the cells independent of their environment.

[0137] Such pharmaceutical compositions may contain further components ensuring the viability of the cells therein. For example, the compositions may comprise a suitable buffer system (e.g., phosphate or carbonate buffer system) to achieve desirable pH, more usually near neutral pH, and may comprise sufficient salt to ensure isoosmotic conditions for the cells to prevent osmotic stress. For example, suitable solution for these purposes may be phosphate-buffered saline (PBS), sodium chloride solution, Ringer's Injection or Lactated Ringer's Injection, as known in the art. Further, the composition may comprise a carrier protein, e.g., albumin (e.g., bovine or human albumin), which may increase the viability of the cells.

[0138] Further suitable pharmaceutically acceptable carriers or additives are well known to those skilled in the art and for instance may be selected from proteins such as collagen or gelatine, carbohydrates such as starch, polysaccharides, sugars (dextrose, glucose and sucrose), cellulose derivatives like sodium or calcium carboxymethylcellulose, hydroxypropyl cellulose or hydroxypropylmethyl cellulose, pregelatinized starches, pectin agar, carrageenan, clays, hydrophilic gums (acacia gum, guar gum, arabic gum and xanthan gum), alginic acid, alginates, hyaluronic acid, polyglycolic and polylactic acid, dextran, pectins, synthetic polymers such as water-soluble acrylic polymer or polyvinylpyrrolidone, proteoglycans, calcium phosphate and the like.

[0139] In certain embodiments, a pharmaceutical cell preparation as taught herein may be administered in a form of liquid composition. In embodiments, the cells or pharmaceutical composition comprising such can be administered systemically, topically, within an organ or at a site of organ dysfunction or lesion.

[0140] Preferably, the pharmaceutical compositions may comprise a therapeutically effective amount of the specified immune cells and/or other active components (e.g., immunomodulants). The term “therapeutically effective amount” refers to an amount which can elicit a biological or medicinal response in a tissue, system, animal or human that is being sought by a researcher, veterinarian, medical doctor or other clinician, and in particular can prevent or alleviate one or more of the local or systemic symptoms or features of a disease or condition being treated.

[0141] It will be appreciated that therapeutic entities in accordance with the invention will be administered with suitable carriers, excipients, and other agents that are incorporated into formulations to provide improved transfer, delivery, tolerance, and the like. A multitude of appropriate formulations can be found in the formulary known to all pharmaceutical chemists: Remington's Pharmaceutical Sciences (15th ed, Mack Publishing Company, Easton, PA (1975)), particularly Chapter 87 by Blaug, Seymour, therein. These formulations include, for example, powders, pastes, ointments, jellies, waxes, oils, lipids, lipid (cationic or anionic) containing vesicles (such as Lipofectin™), DNA conjugates, anhydrous absorption pastes, oil-in-water and water-in-oil emulsions, emulsions carbowax (polyethylene glycols of various molecular weights), semi-solid gels, and semi-solid mixtures containing carbowax. Any of the foregoing mixtures may be appropriate in treatments and therapies in accordance with the present invention, provided that the active ingredient in the formulation is not inactivated by the formulation and the formulation is physiologically compatible and tolerable with the route of administration. See also Baldrick P. “Pharmaceutical excipient development: the need for preclinical guidance.” Regul. Toxicol Pharmacol. 32(2):210-8 (2000), Wang W. “Lyophilization and development of solid protein pharmaceuticals.” Int. J. Pharm. 203(1-2):1-60 (2000), Charman WN “Lipids, lipophilic drugs, and oral drug delivery-some emerging concepts.” J Pharm Sci. 89(8):967-78 (2000), Powell et al. “Compendium of excipients for parenteral formulations” PDA J Pharm Sci Technol. 52:238-311 (1998) and the citations therein for additional information related to formulations, excipients and carriers well known to pharmaceutical chemists.

[0142] The medicaments of the invention are prepared in a manner known to those skilled in the art, for example, by means of conventional dissolving, lyophilizing, mixing, granulating or confectioning processes. Methods well known in the art for making formulations are found, for example, in Remington: The Science and Practice of Pharmacy, 20th ed., ed. A. R. Gennaro, 2000, Lippincott Williams & Wilkins, Philadelphia, and Encyclopedia of Pharmaceutical Technology, eds. J. Swarbrick and J. C. Boylan, 1988-1999, Marcel Dekker, New York.

[0143] Administration of medicaments of the invention may be by any suitable means that results in a compound concentration that is effective for treating or inhibiting (e.g., by delaying) the development of a disease. The compound is admixed with a suitable carrier substance, e.g., a pharmaceutically acceptable excipient that preserves the therapeutic properties of the compound with which it is administered. One exemplary pharmaceutically acceptable excipient is physiological saline. The suitable carrier substance is generally present in an amount of 1-95% by weight of the total weight of the medicament. The medicament may be provided in a dosage form that is suitable for administration. Thus, the medicament may be in the form of, e.g., tablets, capsules, pills, powders, granulates, suspensions, emulsions, solutions, gels including hydrogels, pastes, ointments, creams, plasters, drenches, delivery devices, injectables, implants, sprays, or aerosols.

[0144] Administration can be systemic or local. In addition, it may be advantageous to administer the composition into the central nervous system by any suitable route, including intraventricular and intrathecal injection. Pulmonary administration may also be employed by use of an inhaler or nebulizer, and formulation with an aerosolizing agent. It may also be desirable to administer the agent locally to the area in need of treatment; this may be achieved by, for example, and not by way of limitation, local infusion during surgery, topical application, by injection, by means of a catheter, by means of a suppository, or by means of an implant.

[0145] Various delivery systems are known and can be used to administer the pharmacological compositions including, but not limited to, encapsulation in liposomes, microparticles, microcapsules; minicells; polymers; capsules; tablets; and the like. In one embodiment, the agent may be delivered in a vesicle, in particular a liposome. In a liposome, the agent is combined, in addition to other pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, sulfatides, lysolecithin, phospholipids, saponin, bile acids, and the like. Preparation of such liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. Pat. No. 4,837,028 and U.S. Pat. No. 4,737,323. In yet another embodiment, the pharmacological compositions can be delivered in a controlled release system including, but not limited to: a delivery pump (See, for example, Saudek, et al., New Engl. J. Med. 321:574 (1989)) and a semi-permeable polymeric material (See, for example, Howard, et al., J. Neurosurg. 71:105 (1989)). Additionally, the controlled release system can be placed in proximity of the therapeutic target (e.g., a tumor), thus requiring only a fraction of the systemic dose. See, for example, Goodson, In: Medical Applications of Controlled Release, 1984. (CRC Press, Boca Raton, Fla.).

[0146] The amount of the agents which will be effective in the treatment of a particular disorder or condition will depend on the nature of the disorder or condition, and may be determined by standard clinical techniques by those of skill within the art. In addition, in vitro assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed in the formulation will also depend on the route of administration, and the overall seriousness of the disease or disorder, and should be decided according to the judgment of the practitioner and each patient's circumstances. Ultimately, the attending physician will decide the amount of the agent with which to treat each individual patient. In certain embodiments, the attending physician will administer low doses of the agent and observe the patient's response. Larger doses of the agent may be administered until the optimal therapeutic effect is obtained for the patient, and at that point the dosage is not increased further. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems. Ultimately the attending physician will decide on the appropriate duration of therapy using compositions of the present invention. Dosage will also vary according to the age, weight and response of the individual patient.

[0147] There are a variety of techniques available for introducing nucleic acids into viable cells. The techniques vary depending upon whether the nucleic acid is transferred into cultured cells in vitro, or in vivo in the cells of the intended host. Techniques suitable for the transfer of nucleic acid into mammalian cells in vitro include the use of liposomes, electroporation, microinjection, cell fusion, DEAE-dextran, the calcium phosphate precipitation method, etc. The currently preferred in vivo gene transfer techniques include transfection with viral (typically retroviral) vectors and viral coat protein-liposome mediated transfection.

Modulating Agents

[0148] Applicants identified up- and down-regulated genes in response to inhibition of XPO1 (Tables 1, 2 and 3). Any of these genes may be targeted to modulate differentiation ex vivo or in vivo. Applicants identified that stress response genes were upregulated and cell cycle genes were downregulated in response to XPO1 inhibition. Applicants also identified that XPO1 inhibition induces a quiescent signature and that differentiation can be enhanced by inducing stem quiescence. Modulating agents targeting these genes and pathways are described further herein.

[0149] In certain embodiments, a transcription factor is targeted. In certain embodiments ATF3 is targeted. In certain embodiments, ATF3 activity is enhanced by modulation of post-translational modification sites as described further herein. In certain embodiments, ATF3 expression is upregulated as described further herein. In certain embodiments, endogenous ATF3 is expressed in stem cells as described further herein. ATF3 is induced upon physiological stress in various tissues (Chen et al., 1996 "Analysis of ATF3, a transcription factor induced by physiological stresses and modulated by gadd153/Chop10". Molecular and Cellular Biology. 16 (3): 1157-68). It is also a marker of regeneration following injury of dorsal root ganglion neurons, as injured regenerating neurons activate this transcription factor (Linda et al., 2011 "Activating transcription factor 3, a useful marker for regenerative response after nerve root injury". Frontiers in Neurology. 2: 30). Functional validation studies have shown that ATF3 can promote regeneration of peripheral neurons, but is not capable of promoting regeneration of central nervous system neurons (Mahar M, and Cavalli V 2018 "Intrinsic mechanisms of neuronal axon regeneration". Nature Reviews. Neuroscience. 19 (6): 323-337).

[0150] In certain embodiments, the present invention provides for one or more modulating agents that target the signature genes or pathways identified. Targeting the identified signature genes or pathways may provide for enhanced differentiation of stem cells into a target cell. In certain embodiments, the modulating agent is a therapeutic agent used in the treatment of a disease.

[0151] The terms “therapeutic agent”, “therapeutic capable agent” or “treatment agent” are used interchangeably and refer to a molecule or compound that confers some beneficial effect upon administration to a subject. The beneficial effect includes enablement of diagnostic determinations; amelioration of a disease, symptom, disorder, or pathological condition; reducing or preventing the onset of a disease, symptom, disorder or condition; and generally counteracting a disease, symptom, disorder or pathological condition.

[0152] As used herein, “treatment” or “treating,” or “palliating” or “ameliorating” are used interchangeably. These terms refer to an approach for obtaining beneficial or desired results including but not limited to a therapeutic benefit and/or a prophylactic benefit. By therapeutic benefit is meant any therapeutically relevant improvement in or effect on one or more diseases, conditions, or symptoms under treatment. For prophylactic benefit, the compositions may be administered to a subject at risk of developing a particular disease, condition, or symptom, or to a subject reporting one or more of the physiological symptoms of a disease, even though the disease, condition, or symptom may not have yet been manifested. As used herein "treating" includes ameliorating, curing, preventing it from becoming worse, slowing the rate of progression, or preventing the disorder from re-occurring (i.e., to prevent a relapse).

[0153] The term “effective amount” or “therapeutically effective amount” refers to the amount of an agent that is sufficient to effect beneficial or desired results. The therapeutically effective amount may vary depending upon one or more of: the subject and disease condition being treated, the weight and age of the subject, the severity of the disease condition, the manner of administration and the like, which can readily be determined by one of ordinary skill in the art. The term also applies to a dose that will provide an image for detection by any one of the imaging methods described herein. The specific dose may vary depending on one or more of: the particular agent chosen, the dosing regimen to be followed, whether it is administered in combination with other compounds, timing of administration, the tissue to be imaged, and the physical delivery system in which it is carried.

Small molecules

[0154] In certain embodiments, the one or more agents is a small molecule. The term “small molecule” refers to compounds, preferably organic compounds, with a size comparable to those organic molecules generally used in pharmaceuticals. The term excludes biological macromolecules (e.g., proteins, peptides, nucleic acids, etc.). Preferred small organic molecules range in size up to about 5000 Da, e.g., up to about 4000, preferably up to 3000 Da, more preferably up to 2000 Da, even more preferably up to about 1000 Da, e.g., up to about 900, 800, 700, 600 or up to about 500 Da. In certain embodiments, the small molecule may act as an antagonist or agonist (e.g., blocking an enzyme active site or activating a receptor by binding to a ligand binding site).

[0155] One type of small molecule applicable to the present invention is a degrader molecule. Proteolysis Targeting Chimera (PROTAC) technology is a rapidly emerging alternative therapeutic strategy with the potential to address many of the challenges currently faced in modern drug development programs. PROTAC technology employs small molecules that recruit target proteins for ubiquitination and removal by the proteasome (see, e.g., Zhou et al., Discovery of a Small-Molecule Degrader of Bromodomain and Extra-Terminal (BET) Proteins with Picomolar Cellular Potencies and Capable of Achieving Tumor Regression. J. Med. Chem. 2018, 61, 462-481; Bondeson and Crews, Targeted Protein Degradation by Small Molecules, Annu Rev Pharmacol Toxicol. 2017 Jan 6; 57: 107-123; and Lai et al., Modular PROTAC Design for the Degradation of Oncogenic BCR-ABL Angew Chem Int Ed Engl. 2016 Jan 11; 55(2): 807-810).

Nuclear Export Inhibitors

[0156] In certain embodiments, low dosages of nuclear export inhibitors are administered to a subject in need thereof. In certain embodiments, low dosages include dosages that are below 0.01, 0.05, 0.1, 0.15 or 0.2 mg/kg. In certain embodiments, low dosages administered to the subject may include dosages between 0.001 and 0.02 mg/kg of inhibitor. In certain embodiments, low dosages can be achieved by directly administering the inhibitor to the tissue of interest.
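As an illustration of how the recited per-kilogram limits relate to an absolute dose, the following minimal sketch converts a mg/kg dose to milligrams and checks it against the thresholds and range stated above. The function names and the 70 kg body weight used in the usage note are hypothetical and are not taken from the disclosure.

```python
# Ceilings (mg/kg) recited above for "low dosage" embodiments.
LOW_DOSE_CEILINGS_MG_PER_KG = (0.01, 0.05, 0.1, 0.15, 0.2)
# Range (mg/kg) recited above for certain embodiments.
LOW_DOSE_RANGE_MG_PER_KG = (0.001, 0.02)


def absolute_dose_mg(dose_mg_per_kg: float, weight_kg: float) -> float:
    """Convert a per-kilogram dose into a total dose in milligrams."""
    if dose_mg_per_kg < 0 or weight_kg <= 0:
        raise ValueError("dose must be non-negative and weight positive")
    return dose_mg_per_kg * weight_kg


def ceilings_satisfied(dose_mg_per_kg: float) -> list:
    """Return the recited ceilings that the dose falls strictly below."""
    return [c for c in LOW_DOSE_CEILINGS_MG_PER_KG if dose_mg_per_kg < c]


def in_recited_range(dose_mg_per_kg: float) -> bool:
    """True if the dose lies within the recited 0.001 to 0.02 mg/kg range."""
    low, high = LOW_DOSE_RANGE_MG_PER_KG
    return low <= dose_mg_per_kg <= high
```

Under these assumptions, a dose of 0.01 mg/kg for a hypothetical 70 kg subject corresponds to roughly 0.7 mg in total and lies within the recited range.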

[0157] Specific proteins and RNAs are carried into and out of the nucleus by specialized transport molecules, which are classified as importins if they transport molecules into the nucleus, and exportins if they transport molecules out of the nucleus (Terry L J et al. 2007. Crossing the nuclear envelope: hierarchical regulation of nucleocytoplasmic transport. Science 318: 1412-1416; and Sorokin A V et al. 2007. Nucleocytoplasmic transport of proteins. Biochemistry 72: 1439-1457). Proteins that are transported into or out of the nucleus contain nuclear import/localization (NLS) or export (NES) sequences that allow them to interact with the relevant transporters. Chromosomal Region Maintenance 1 (Crm1), which is also called exportin-1 or Xpo1, is a major exportin.

[0158] Crm1 inhibitors have been shown to induce apoptosis in cancer cells even in the presence of activating oncogenic or growth stimulating signals, while sparing normal (untransformed) cells. Most studies of Crm1 inhibition have utilized the natural product Crm1 inhibitor Leptomycin B (LMB). LMB itself is highly toxic to neoplastic cells, but poorly tolerated with marked gastrointestinal toxicity in animals (Roberts et al., 1986) and humans (Newlands et al., 1996). Derivatization of LMB to improve drug-like properties leads to compounds that retain antitumor activity and are better tolerated in animal tumor models (Yang et al., 2007; Yang et al., 2008; Mutka et al., 2009). In certain embodiments, the low dosages of nuclear export inhibitors described herein are not toxic to the subject. In certain embodiments, a dosage is used that is not toxic to a subject. As used herein, “toxic” refers to the ability of a substance or mixture of substances to cause harmful effects over an extended period, usually upon repeated or continuous exposure, sometimes lasting for the entire life of the exposed organism, i.e., capable of causing death or serious debilitation. Non-limiting examples of nuclear export inhibitors applicable to the present invention include KPT-330, KPT-8602, Leptomycin B, Selinexor (Vogl et al., J Clin Oncol. 2018 Mar 20; 36(9): 859-866) and any of the compounds disclosed in US9428490B2 and US9861614B2.

Cell Cycle Inhibitors

[0159] In certain embodiments, the small molecule or agent is a cell cycle inhibitor (see e.g., Dickson and Schwartz, Development of cell-cycle inhibitors for cancer therapy, Curr Oncol. 2009 Mar; 16(2): 36-43). In certain embodiments, the cell cycle inhibitor may be, but is not limited to, flavopiridol, indisulam, AZD5438, SNS-032, bryostatin-1, seliciclib, PD 0332991, and SCH 727965. In certain embodiments, the cell cycle inhibitor is a CDK inhibitor. In one embodiment, the cell cycle inhibitor is a CDK4/6 inhibitor, such as LEE011, palbociclib (PD-0332991), and Abemaciclib (LY2835219) (see, e.g., US9259399B2; WO2016025650A1; US Patent Publication No. 20140031325; US Patent Publication No. 20140080838; US Patent Publication No. 20130303543; US Patent Publication No. 2007/0027147; US Patent Publication No. 2003/0229026; US Patent Publication No. 2004/0048915; US Patent Publication No. 2004/0006074; US Patent Publication No. 2007/0179118; each of which is incorporated by reference herein in its entirety). Currently there are three CDK4/6 inhibitors that are either approved or in late-stage development: palbociclib (PD-0332991; Pfizer), ribociclib (LEE011; Novartis), and abemaciclib (LY2835219; Lilly) (see e.g., Hamilton and Infante, Targeting CDK4/6 in patients with cancer, Cancer Treatment Reviews, Volume 45, April 2016, Pages 129-138).

MEK Inhibitors

[0160] In certain embodiments, the small molecule or agent is a MEK inhibitor. A MEK inhibitor is a chemical or drug that inhibits the mitogen-activated protein kinase kinase enzymes MEK1 and/or MEK2. They can be used to affect the MAPK/ERK pathway. Non-limiting examples of MEK inhibitors include Cobimetinib or XL518, Trametinib (GSK1120212), Binimetinib (MEK 162), Selumetinib, PD-325901, CI-1040, PD035901, and TAK-733.

[0161] In certain embodiments, mRNA encoding a gene product, such as a transcription factor, is delivered to a subject in need thereof. In certain embodiments, the mRNA is a modified mRNA (see, e.g., US Patent 9428535 B2).

Vectors

[0162] In certain embodiments, vectors are used to overexpress or modulate expression of genes, such as transcription factors. Vectors for introducing CRISPR systems are described further herein.

[0163] The term “vector” generally denotes a tool that allows or facilitates the transfer of an entity from one environment to another. More particularly, the term “vector” as used throughout this specification refers to nucleic acid molecules to which nucleic acid fragments (cDNA) may be inserted and cloned, i.e., propagated. Hence, a vector is typically a replicon, into which another nucleic acid segment may be inserted, such as to bring about the replication of the inserted segment in a defined host cell or vehicle organism.

[0164] A vector thus typically contains an origin of replication and other entities necessary for replication and/or maintenance in a host cell. A vector may typically contain one or more unique restriction sites allowing for insertion of nucleic acid fragments. A vector may also preferably contain a selection marker, such as, e.g., an antibiotic resistance gene or auxotrophic gene (e.g., URA3, which encodes an enzyme necessary for uracil biosynthesis or TRP1, which encodes an enzyme required for tryptophan biosynthesis), to allow selection of recipient cells that contain the vector. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends or no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art.

[0165] Expression vectors are generally configured to allow for and/or effect the expression of nucleic acids (e.g., cDNA, CRISPR system) introduced thereto in a desired expression system, e.g., in vitro, in a host cell, host organ and/or host organism. For example, the vector can express nucleic acids functionally or operatively linked to regulatory element(s) and hence the regulatory element(s) drive expression. The promoter(s) can be constitutive promoter(s) and/or conditional promoter(s) and/or inducible promoter(s). In certain embodiments, the vectors comprise regulatory sequences for inducible expression of cDNAs encoding transcription factors. Thus, expression of the transcription factors in cells can be induced at particular time points after introducing the vectors. Inducible expression systems are known in the art and may include, for example, Tet on/off systems (see, e.g., Gossen et al., Transcriptional activation by tetracyclines in mammalian cells. Science. 1995 Jun 23;268(5218): 1766-9).

[0166] In certain example embodiments, the vectors disclosed herein may further encode an epitope tag in frame with the gene for use in downstream assessment of protein expression and gene abundance in cell populations respectively. Epitope tags provide high sensitivity and specificity in detection by specific antigen binding molecules (e.g., antibodies, aptamers). Exemplary epitope tags include, but are not limited to, Flag, CBP, GST, HA, HBH, MBP, Myc, polyHis, S-tag, SUMO, TAP, TRX, or V5.

[0167] Vectors may include, without limitation, plasmids (which refer to circular double stranded DNA loops which, in their vector form, are not bound to the chromosome), episomes, phagemids, bacteriophages, bacteriophage-derived vectors, bacterial artificial chromosomes (BAC), yeast artificial chromosomes (YAC), P1-derived artificial chromosomes (PAC), transposons, cosmids, linear nucleic acids, viral vectors, etc., as appropriate. A vector can be a DNA or RNA vector. A vector can be a self-replicating extrachromosomal vector or a vector which integrates into a host genome; hence, vectors can be autonomous or integrative.

[0168] The term “viral vectors” refers to the use of viruses or virus-associated vectors as carriers of the nucleic acid construct into the cell. Constructs may be integrated and packaged into non-replicating, defective viral genomes like adenovirus, adeno-associated virus (AAV), or herpes simplex virus (HSV) or others, including retroviral and lentiviral vectors, for infection or transduction into cells. The vector may or may not be incorporated into the cell’s genome. The constructs may include viral sequences for transfection, if desired. Alternatively, the construct may be incorporated into vectors capable of episomal replication, e.g., EPV and EBV vectors.

[0169] Methods for introducing nucleic acids, including vectors, expression cassettes and expression vectors, into cells (e.g., transfection, transduction or transformation) are known to the person skilled in the art, and may include calcium phosphate co-precipitation, electroporation, micro-injection, protoplast fusion, lipofection, exosome-mediated transfection, transfection employing polyamine transfection reagents, bombardment of cells by nucleic acid-coated tungsten micro projectiles, viral particle delivery, etc.

Genetic Modulating Agents

[0170] In certain embodiments, the one or more modulating agents may be a genetic modifying agent (e.g., gene editing system). The genetic modifying agent may comprise a CRISPR system, a zinc finger nuclease system, a TALEN, a meganuclease, or RNAi.

CRISPR

[0171] In some embodiments, a polynucleotide of the present invention described elsewhere herein (e.g. Table 1, 2 and 3) can be modified using a CRISPR-Cas and/or Cas-based system.

[0172] In general, a CRISPR-Cas or CRISPR system, as used herein and in documents such as International Patent Publication No. WO 2014/093622 (PCT/US2013/074667), refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or “RNA(s)” as that term is herein used (e.g., RNA(s) to guide Cas, such as Cas9, e.g., CRISPR RNA and transactivating (tracr) RNA or a single guide RNA (sgRNA) (chimeric RNA)) or other sequences and transcripts from a CRISPR locus. In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). See, e.g., Shmakov et al. (2015) “Discovery and Functional Characterization of Diverse Class 2 CRISPR-Cas Systems”, Molecular Cell, DOI: dx.doi.org/10.1016/j.molcel.2015.10.008.

[0173] CRISPR-Cas systems generally fall into two classes based on the architecture of their effector modules, and each class is further subdivided by type and subtype. The two classes are Class 1 and Class 2. Class 1 CRISPR-Cas systems have effector modules composed of multiple Cas proteins, some of which form crRNA-binding complexes, while Class 2 CRISPR-Cas systems include a single, multi-domain crRNA-binding protein.

[0174] In some embodiments, the CRISPR-Cas system that can be used to modify a polynucleotide of the present invention described herein can be a Class 1 CRISPR-Cas system. In some embodiments, the CRISPR-Cas system that can be used to modify a polynucleotide of the present invention described herein can be a Class 2 CRISPR-Cas system.

[0175] In certain embodiments, a CRISPR system is used to enhance expression or activity of transcription factors (e.g., ATF3). In certain embodiments, the transcription factor expression or activity is enhanced temporarily, such that the enhancement is not permanent. In certain embodiments, expression of the transcription factor from its endogenous gene is enhanced (e.g., by directing an activator to the gene). In certain embodiments, genes are targeted for downregulation. In certain embodiments, genes are targeted for editing.

[0176] In certain embodiments, modification of transcription factor mRNA by a Cas13-deaminase system can be used to modulate transcription factor activity in order to generate target cells (see, e.g., International Patent Publication No. WO 2019/084062). In certain embodiments, the modification silences ubiquitination, methylation, acetylation, succinylation, glycosylation, O-GlcNAc, O-linked glycosylation, iodination, nitrosylation, sulfation, carboxyglutamation, phosphorylation, or a combination thereof. In some embodiments, the modification increases a half-life of a target TF. In certain embodiments, the transcription activity is enhanced by modifying a phosphorylation site on the transcription factor (see, e.g., Hunter and Karin, 1992, The regulation of Transcription by Phosphorylation. Cell, Vol. 70, 375-387; and Whitmarsh and Davis, 2000, Regulation of transcription factor function by phosphorylation. CMLS, Cell. Mol. Life Sci. 57: 1172).

Class 1 CRISPR-Cas Systems

[0177] In some embodiments, the CRISPR-Cas system that can be used to modify a polynucleotide of the present invention described herein can be a Class 1 CRISPR-Cas system. Class 1 CRISPR-Cas systems are divided into types I, III, and IV. See Makarova et al. 2020. Nat. Rev. Microbiol. 18: 67-83, particularly as described in Figure 1. Type I CRISPR-Cas systems are divided into 9 subtypes (I-A, I-B, I-C, I-D, I-E, I-F1, I-F2, I-F3, and I-G). Makarova et al., 2020. Class 1, Type I CRISPR-Cas systems can contain a Cas3 protein that can have helicase activity. Type III CRISPR-Cas systems are divided into 6 subtypes (III-A, III-B, III-C, III-D, III-E, and III-F). Type III CRISPR-Cas systems can contain a Cas10 that can include an RNA recognition motif called Palm and a cyclase domain that can cleave polynucleotides. Makarova et al., 2020. Type IV CRISPR-Cas systems are divided into 3 subtypes (IV-A, IV-B, and IV-C). Makarova et al., 2020. Class 1 systems also include CRISPR-Cas variants, including Type I-A, I-B, I-E, I-F and I-U variants, which can include variants carried by transposons and plasmids, including versions of subtype I-F encoded by a large family of Tn7-like transposons and smaller groups of Tn7-like transposons that encode similarly degraded subtype I-B systems. Peters et al., PNAS 114 (35) (2017); DOI: 10.1073/pnas.1709035114; see also Makarova et al. 2018. The CRISPR Journal, v. 1, n5, Figure 5.

[0178] The Class 1 systems typically use a multi-protein effector complex, which can, in some embodiments, include ancillary proteins, such as one or more proteins in a complex referred to as a CRISPR-associated complex for antiviral defense (Cascade), one or more adaptation proteins (e.g., Cas1, Cas2, RNA nuclease), and/or one or more accessory proteins (e.g., Cas4, DNA nuclease), CRISPR associated Rossmann fold (CARF) domain containing proteins, and/or RNA transcriptase.

[0179] The backbone of the Class 1 CRISPR-Cas system effector complexes can be formed by RNA recognition motif domain-containing protein(s) of the repeat-associated mysterious proteins (RAMPs) family subunits, e.g., Cas5, Cas6, and/or Cas7. RAMP proteins are characterized by having one or more RNA recognition motif domains. In some embodiments, multiple copies of RAMPs can be present. In some embodiments, the Class 1 CRISPR-Cas system can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more Cas5, Cas6, and/or Cas7 proteins. In some embodiments, the Cas6 protein is an RNase, which can be responsible for pre-crRNA processing. When present in a Class 1 CRISPR-Cas system, Cas6 can be optionally physically associated with the effector complex.

[0180] Class 1 CRISPR-Cas system effector complexes can, in some embodiments, also include a large subunit. The large subunit can be composed of or include a Cas8 and/or Cas10 protein. See, e.g., Figures 1 and 2. Koonin EV, Makarova KS. 2019. Phil. Trans. R. Soc. B 374: 20180087, DOI: 10.1098/rstb.2018.0087 and Makarova et al. 2020.

[0181] Class 1 CRISPR-Cas system effector complexes can, in some embodiments, include a small subunit (for example, Cas11). See, e.g., Figures 1 and 2. Koonin EV, Makarova KS. 2019. Origins and evolution of CRISPR-Cas systems. Phil. Trans. R. Soc. B 374: 20180087, DOI: 10.1098/rstb.2018.0087.

[0182] In some embodiments, the Class 1 CRISPR-Cas system can be a Type I CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-A CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-B CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-C CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-D CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-E CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-F1 CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-F2 CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-F3 CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a subtype I-G CRISPR-Cas system. In some embodiments, the Type I CRISPR-Cas system can be a CRISPR-Cas variant, such as a Type I-A, I-B, I-E, I-F, or I-U variant, which can include variants carried by transposons and plasmids, including versions of subtype I-F encoded by a large family of Tn7-like transposons and smaller groups of Tn7-like transposons that encode similarly degraded subtype I-B systems as previously described.

[0183] In some embodiments, the Class 1 CRISPR-Cas system can be a Type III CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-A CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-B CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-C CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-D CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-E CRISPR-Cas system. In some embodiments, the Type III CRISPR-Cas system can be a subtype III-F CRISPR-Cas system.

[0184] In some embodiments, the Class 1 CRISPR-Cas system can be a Type IV CRISPR-Cas system. In some embodiments, the Type IV CRISPR-Cas system can be a subtype IV-A CRISPR-Cas system. In some embodiments, the Type IV CRISPR-Cas system can be a subtype IV-B CRISPR-Cas system. In some embodiments, the Type IV CRISPR-Cas system can be a subtype IV-C CRISPR-Cas system.

[0185] The effector complex of a Class 1 CRISPR-Cas system can, in some embodiments, include a Cas3 protein that is optionally fused to a Cas2 protein, a Cas4, a Cas5, a Cas6, a Cas7, a Cas8, a Cas10, a Cas11, or a combination thereof. In some embodiments, the effector complex of a Class 1 CRISPR-Cas system can have multiple copies, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14, of any one or more Cas proteins.

Class 2 CRISPR-Cas Systems

[0186] The compositions, systems, and methods described in greater detail elsewhere herein can be designed and adapted for use with Class 2 CRISPR-Cas systems. Thus, in some embodiments, the CRISPR-Cas system is a Class 2 CRISPR-Cas system. Class 2 systems are distinguished from Class 1 systems in that they have a single, large, multi-domain effector protein. In certain example embodiments, the Class 2 system can be a Type II, Type V, or Type VI system, which are described in Makarova et al. “Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants” Nature Reviews Microbiology, 18:67-81 (Feb 2020), incorporated herein by reference. Each type of Class 2 system is further divided into subtypes. See Makarova et al. 2020, particularly at Figure 2. Class 2, Type II systems can be divided into 4 subtypes: II-A, II-B, II-C1, and II-C2. Class 2, Type V systems can be divided into 17 subtypes: V-A, V-B1, V-B2, V-C, V-D, V-E, V-F1, V-F1 (V-U3), V-F2, V-F3, V-G, V-H, V-I, V-K (V-U5), V-U1, V-U2, and V-U4. Class 2, Type VI systems can be divided into 5 subtypes: VI-A, VI-B1, VI-B2, VI-C, and VI-D.

[0187] The distinguishing feature of these types is that their effector complexes consist of a single, large, multi-domain protein. Type V systems differ from Type II systems in their nuclease architecture: Type II effectors (e.g., Cas9) contain two nuclease domains that are each responsible for the cleavage of one strand of the target DNA, with the HNH nuclease inserted inside the RuvC-like nuclease domain sequence, whereas Type V effectors (e.g., Cas12) contain only a RuvC-like nuclease domain that cleaves both strands. Type VI effectors (Cas13) are unrelated to the effectors of Type II and V systems, contain two HEPN domains, and target RNA. Cas13 proteins also display collateral activity that is triggered by target recognition. Some Type V systems have also been found to possess this collateral activity toward single-stranded DNA in in vitro contexts.
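The domain architectures described in this paragraph can be summarized in a small lookup table. The following is a minimal sketch; the class and function names are hypothetical, and the table records only what the paragraph states (example effector, nuclease domains, and target nucleic acid).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Class2Effector:
    system_type: str                   # "II", "V", or "VI"
    example: str                       # example effector named in the text
    nuclease_domains: Tuple[str, ...]  # domains named in the text
    target: str                        # "DNA" or "RNA"


# One entry per Class 2 type, as characterized in the paragraph above.
EFFECTORS = {
    "II": Class2Effector("II", "Cas9", ("HNH", "RuvC-like"), "DNA"),
    "V": Class2Effector("V", "Cas12", ("RuvC-like",), "DNA"),
    "VI": Class2Effector("VI", "Cas13", ("HEPN", "HEPN"), "RNA"),
}


def types_targeting(nucleic_acid: str) -> list:
    """Return the Class 2 types whose effectors target the given nucleic acid."""
    return [t for t, e in EFFECTORS.items() if e.target == nucleic_acid]
```

In this representation, the Type II entry carries two distinct nuclease domains (one per DNA strand), the Type V entry carries a single domain that cleaves both strands, and the Type VI entry is the only one that targets RNA.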

[0188] In some embodiments, the Class 2 system is a Type II system. In some embodiments, the Type II CRISPR-Cas system is a II-A CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-B CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-C1 CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-C2 CRISPR-Cas system. In some embodiments, the Type II system is a Cas9 system. In some embodiments, the Type II system includes a Cas9.

[0189] In some embodiments, the Class 2 system is a Type V system. In some embodiments, the Type V CRISPR-Cas system is a V-A CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-B1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-B2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-C CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-D CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-E CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F1 (V-U3) CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F3 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-G CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-H CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-I CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-K (V-U5) CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U4 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system includes a Cas12a (Cpf1), Cas12b (C2c1), Cas12c (C2c3), CasX, and/or Cas14.

[0190] In some embodiments, the Class 2 system is a Type VI system. In some embodiments, the Type VI CRISPR-Cas system is a VI-A CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-B1 CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-B2 CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-C CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-D CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system includes a Cas13a (C2c2), Cas13b (Group 29/30), Cas13c, and/or Cas13d.
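The class, type, and subtype hierarchy enumerated in the Class 1 and Class 2 sections can be captured as a nested mapping. The following is a minimal sketch, using hypothetical names, that records the subtypes listed in this specification and looks up the class and type of a given subtype.

```python
# Class -> type -> subtypes, as enumerated in this specification.
CRISPR_CAS_TAXONOMY = {
    "Class 1": {
        "I": ["I-A", "I-B", "I-C", "I-D", "I-E", "I-F1", "I-F2", "I-F3", "I-G"],
        "III": ["III-A", "III-B", "III-C", "III-D", "III-E", "III-F"],
        "IV": ["IV-A", "IV-B", "IV-C"],
    },
    "Class 2": {
        "II": ["II-A", "II-B", "II-C1", "II-C2"],
        "V": [
            "V-A", "V-B1", "V-B2", "V-C", "V-D", "V-E", "V-F1", "V-F1 (V-U3)",
            "V-F2", "V-F3", "V-G", "V-H", "V-I", "V-K (V-U5)",
            "V-U1", "V-U2", "V-U4",
        ],
        "VI": ["VI-A", "VI-B1", "VI-B2", "VI-C", "VI-D"],
    },
}


def classify(subtype: str) -> tuple:
    """Return (class, type) for a subtype label listed in the taxonomy."""
    for cas_class, types in CRISPR_CAS_TAXONOMY.items():
        for cas_type, subtypes in types.items():
            if subtype in subtypes:
                return cas_class, cas_type
    raise KeyError(f"unrecognized subtype: {subtype}")
```

The list lengths reproduce the subtype counts stated in the text: 9, 6, and 3 for Types I, III, and IV, and 4, 17, and 5 for Types II, V, and VI.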

Specialized Cas-based Systems

[0191] In some embodiments, the system is a Cas-based system that is capable of performing a specialized function or activity. For example, the Cas protein may be fused, operably coupled to, or otherwise associated with one or more functional domains. In certain example embodiments, the Cas protein may be a catalytically dead Cas protein (“dCas”) and/or have nickase activity. A nickase is a Cas protein that cuts only one strand of a double stranded target. In such embodiments, the dCas or nickase provides a sequence specific targeting functionality that delivers the functional domain to or proximate a target sequence. Example functional domains that may be fused to, operably coupled to, or otherwise associated with a Cas protein can be or include, but are not limited to, a nuclear localization signal (NLS) domain, a nuclear export signal (NES) domain, a translational activation domain, a transcriptional activation domain (e.g., VP64, p65, MyoD1, HSF1, RTA, and SET7/9), a translation initiation domain, a transcriptional repression domain (e.g., a KRAB domain, NuE domain, NcoR domain, and a SID domain such as a SID4X domain), a nuclease domain (e.g., FokI), a histone modification domain (e.g., a histone acetyltransferase), a light inducible/controllable domain, a chemically inducible/controllable domain, a transposase domain, a homologous recombination machinery domain, a recombinase domain, an integrase domain, and combinations thereof. Methods for generating catalytically dead Cas9 or a nickase Cas9 (International Patent Publication No. WO 2014/204725; Ran et al. Cell. 2013 Sept 12; 154(6): 1380-1389), Cas12 (Liu et al. Nature Communications, 8, 2095 (2017)), and Cas13 (International Patent Publication Nos. WO 2019/005884 and WO 2019/060746) are known in the art and incorporated herein by reference.
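The distinction drawn above between a fully active Cas protein, a nickase, and a dCas can be modeled by the number of strands each cuts in a double stranded target. The following is a minimal sketch with hypothetical class and method names; the assumption that a fully active Cas cuts both strands applies to double-stranded DNA targets only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class CasActivity(Enum):
    """Enum value = number of strands cut in a double stranded target."""
    WILD_TYPE = 2   # assumption: fully active nuclease cuts both strands
    NICKASE = 1     # cuts only one strand, per the definition above
    DEAD = 0        # catalytically dead ("dCas")


@dataclass
class CasFusion:
    activity: CasActivity
    functional_domains: List[str] = field(default_factory=list)

    @property
    def strands_cut(self) -> int:
        return self.activity.value

    def is_targeting_scaffold(self) -> bool:
        """True for a dCas or nickase that carries at least one functional domain."""
        return (self.activity is not CasActivity.WILD_TYPE
                and bool(self.functional_domains))
```

For instance, `CasFusion(CasActivity.DEAD, ["VP64"])` represents a dCas carrying a transcriptional activation domain: it cuts no strands and serves only to deliver the domain to the target sequence.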

[0192] In some embodiments, the functional domains can have one or more of the following activities: methylase activity, demethylase activity, translation activation activity, translation initiation activity, translation repression activity, transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, nuclease activity, single-strand RNA cleavage activity, double-strand RNA cleavage activity, single-strand DNA cleavage activity, double-strand DNA cleavage activity, molecular switch activity, chemical inducibility, light inducibility, and nucleic acid binding activity. In some embodiments, the one or more functional domains may comprise epitope tags or reporters. Non-limiting examples of epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags. Examples of reporters include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and auto-fluorescent proteins including blue fluorescent protein (BFP).

[0193] The one or more functional domain(s) may be positioned at, near, and/or in proximity to a terminus of the effector protein (e.g., a Cas protein). In embodiments having two or more functional domains, each of the functional domains can be positioned at or near or in proximity to a terminus of the effector protein (e.g., a Cas protein). In some embodiments, such as those where the functional domain is operably coupled to the effector protein, the one or more functional domains can be tethered or linked via a suitable linker (including, but not limited to, GlySer linkers) to the effector protein (e.g., a Cas protein). When there is more than one functional domain, the functional domains can be the same or different. In some embodiments, all the functional domains are the same. In some embodiments, all of the functional domains are different from each other. In some embodiments, at least two of the functional domains are different from each other. In some embodiments, at least two of the functional domains are the same as each other.

[0194] Other suitable functional domains can be found, for example, in International Patent Publication No. WO 2019/018423.

Split CRISPR-Cas systems

[0195] In some embodiments, the CRISPR-Cas system is a split CRISPR-Cas system. See, e.g., Zetsche et al., 2015. Nat. Biotechnol. 33(2): 139-142, the compositions and techniques of which can be used in and/or adapted for use with the present invention. Split CRISPR-Cas proteins are set forth in further detail herein and in documents incorporated herein by reference. In certain embodiments, each part of a split CRISPR protein is attached to a member of a specific binding pair, and when bound with each other, the members of the specific binding pair maintain the parts of the CRISPR protein in proximity. In certain embodiments, each part of a split CRISPR protein is associated with an inducible binding pair. An inducible binding pair is one which is capable of being switched “on” or “off” by a protein or small molecule that binds to both members of the inducible binding pair. In some embodiments, CRISPR proteins may preferably be split between domains, leaving domains intact. In particular embodiments, said Cas split domains (e.g., RuvC and HNH domains in the case of Cas9) can be simultaneously or sequentially introduced into the cell such that said split Cas domain(s) process the target nucleic acid sequence in the algae cell. The reduced size of the split Cas compared to the wild type Cas allows other methods of delivery of the systems to the cells, such as the use of cell penetrating peptides as described herein.

Base Editing

[0196] In some embodiments, a polynucleotide of the present invention described elsewhere herein (e.g., Table 1, 2 and 3, such as ATF3) can be modified using a base editing system. In some embodiments, a Cas protein is connected or fused to a nucleotide deaminase. Thus, in some embodiments the Cas-based system can be a base editing system. As used herein, “base editing” refers generally to the process of polynucleotide modification via a CRISPR-Cas-based or Cas-based system that does not include excising nucleotides to make the modification. Base editing can convert base pairs at precise locations without generating excess undesired editing byproducts that can be made using traditional CRISPR-Cas systems.

[0197] In certain example embodiments, the nucleotide deaminase may be a DNA base editor used in combination with a DNA binding Cas protein such as, but not limited to, Class 2 Type II and Type V systems. Two classes of DNA base editors are generally known: cytosine base editors (CBEs) and adenine base editors (ABEs). CBEs convert a C•G base pair into a T•A base pair (Komor et al. 2016. Nature. 533:420-424; Nishida et al. 2016. Science. 353; and Li et al. Nat. Biotech. 36:324-327) and ABEs convert an A•T base pair to a G•C base pair. Collectively, CBEs and ABEs can mediate all four possible transition mutations (C to T, A to G, T to C, and G to A). Rees and Liu. 2018. Nat. Rev. Genet. 19(12):770-788, particularly at Figures 1b, 2a-2c, 3a-3f, and Table 1. In some embodiments, the base editing system includes a CBE and/or an ABE. Base editors also generally do not need a DNA donor template or rely on homology-directed repair. Komor et al. 2016. Nature. 533:420-424; Nishida et al. 2016. Science. 353; and Gaudelli et al. 2017. Nature. 551:464-471. Upon binding to a target locus in the DNA, base pairing between the guide RNA of the system and the target DNA strand leads to displacement of a small segment of ssDNA in an “R-loop”. Nishimasu et al. Cell. 156:935-949. DNA bases within the ssDNA bubble are modified by the enzyme component, such as a deaminase. In some systems, the catalytically disabled Cas protein can be a variant or modified Cas that has nickase functionality and can generate a nick in the non-edited DNA strand to induce cells to repair the non-edited strand using the edited strand as a template. Komor et al. 2016. Nature. 533:420-424; Nishida et al. 2016. Science. 353; and Gaudelli et al. 2017. Nature. 551:464-471. Base editors may be further engineered to optimize conversion of nucleotides (e.g. A:T to G:C). Richter et al. 2020. Nature Biotechnology doi.org/10.1038/s41587-020-0453-z.
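By way of non-limiting illustration, the relationship between the two DNA base editor classes and the four transition mutations can be sketched as a small lookup. The function name `choose_base_editor` and the strand labels are hypothetical and are introduced here only for illustration; the sketch assumes that an edit made on the opposite strand appears as the complementary change on the strand being read.

```python
# Illustrative sketch only: maps a desired single-base change to the base
# editor class (CBE or ABE) described above. Names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Chemistry acting directly on the edited strand, as stated in the text:
# CBE converts C to T (C•G -> T•A); ABE converts A to G (A•T -> G•C).
DIRECT_EDITS = {("C", "T"): "CBE", ("A", "G"): "ABE"}


def choose_base_editor(ref: str, alt: str):
    """Return (editor, strand) for a single-base change, or None when the
    change is not a transition that a CBE or ABE can mediate."""
    ref, alt = ref.upper(), alt.upper()
    if (ref, alt) in DIRECT_EDITS:
        return DIRECT_EDITS[(ref, alt)], "same strand"
    # Editing the complementary base on the opposite strand produces the
    # complementary change on the strand being read (e.g. G to A via a CBE).
    comp = (COMPLEMENT[ref], COMPLEMENT[alt])
    if comp in DIRECT_EDITS:
        return DIRECT_EDITS[comp], "opposite strand"
    return None
```

Under these assumptions, a G to A change is assigned to a CBE acting on the opposite strand, while a transversion such as A to C returns `None`, consistent with the statement that CBEs and ABEs collectively cover only the four transitions.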

[0198] Other example Type V base editing systems are described in International Patent Publication Nos. WO 2018/213708 and WO 2018/213726, and International Patent Application Nos. PCT/US2018/067207, PCT/US2018/067225, and PCT/US2018/067307, which are incorporated by reference herein.

[0199] In certain example embodiments, the base editing system may be an RNA base editing system. As with DNA base editors, a nucleotide deaminase capable of converting nucleotide bases may be fused to a Cas protein. However, in these embodiments, the Cas protein will need to be capable of binding RNA. Example RNA-binding Cas proteins include, but are not limited to, RNA-binding Cas9s such as Francisella novicida Cas9 (“FnCas9”), and Class 2 Type VI Cas systems. The nucleotide deaminase may be a cytidine deaminase or an adenosine deaminase, or an adenosine deaminase engineered to have cytidine deaminase activity. In certain example embodiments, the RNA base editor may be used to delete or introduce a post-translational modification site in the expressed mRNA. In contrast to DNA base editors, whose edits are permanent in the modified cell, RNA base editors can provide edits where finer temporal control may be needed, for example in modulating a particular immune response. Example Type VI RNA-base editing systems are described in Cox et al. 2017. Science 358:1019-1027, International Patent Publication Nos. WO 2019/005884, WO 2019/005886, and WO 2019/071048, and International Patent Application Nos. PCT/US20018/05179 and PCT/US2018/067207, which are incorporated herein by reference. An example FnCas9 system that may be adapted for RNA base editing purposes is described in International Patent Publication No. WO 2016/106236, which is incorporated herein by reference.

[0200] An example method for delivery of base-editing systems, including use of a split-intein approach to divide CBE and ABE into reconstitutable halves, is described in Levy et al. Nature Biomedical Engineering doi.org/10.1038/s41441-019-0505-5 (2019), which is incorporated herein by reference.

Prime Editing

[0201] In some embodiments, a polynucleotide of the present invention described elsewhere herein (e.g., Table 1, 2 and 3, such as ATF3) can be modified using a prime editing system (See e.g., Anzalone et al. 2019. Nature. 576:149-157). Like base editing systems, prime editing systems can be capable of targeted modification of a polynucleotide without generating double stranded breaks and do not require donor templates. Further, prime editing systems can be capable of all 12 possible combination swaps. Prime editing can operate via a “search-and-replace” methodology and can mediate targeted insertions, deletions, all 12 possible base-to-base conversions, and combinations thereof. Generally, a prime editing system, as exemplified by PE1, PE2, and PE3 (Id.), can include a reverse transcriptase fused or otherwise coupled or associated with an RNA-programmable nickase, and a prime-editing extended guide RNA (pegRNA) to facilitate direct copying of genetic information from the extension on the pegRNA into the target polynucleotide. Embodiments that can be used with the present invention include these and variants thereof. Prime editing can have the advantage of lower off-target activity than traditional CRISPR-Cas systems along with few byproducts and greater or similar efficiency as compared to traditional CRISPR-Cas systems.
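By way of non-limiting illustration, the count of 12 possible base-to-base conversions follows from pairing each of the four DNA bases with each of the three other bases. The short sketch below enumerates them and separates the four transitions from the eight transversions; the names `classify`, `conversions`, `transitions`, and `transversions` are hypothetical and used only for this illustration.

```python
from itertools import permutations

BASES = "ACGT"
PURINES = {"A", "G"}


def classify(ref: str, alt: str) -> str:
    """A transition swaps purine for purine or pyrimidine for pyrimidine;
    any other single-base change is a transversion."""
    return "transition" if (ref in PURINES) == (alt in PURINES) else "transversion"


# All ordered pairs of distinct bases: 4 x 3 = 12 conversions.
conversions = {(r, a): classify(r, a) for r, a in permutations(BASES, 2)}
transitions = sorted(k for k, v in conversions.items() if v == "transition")
transversions = sorted(k for k, v in conversions.items() if v == "transversion")
```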

[0202] In some embodiments, the prime editing guide molecule can specify both the target polynucleotide information (e.g. sequence) and contain a new polynucleotide cargo that replaces target polynucleotides. To initiate transfer from the guide molecule to the target polynucleotide, the PE system can nick the target polynucleotide at a target site to expose a 3’ hydroxyl group, which can prime reverse transcription of an edit-encoding extension region of the guide molecule (e.g. a prime editing guide molecule or peg guide molecule) directly into the target site in the target polynucleotide. See e.g. Anzalone et al. 2019. Nature. 576:149-157, particularly at Figures 1b, 1c, related discussion, and Supplementary discussion.

[0203] In some embodiments, a prime editing system can be composed of a Cas polypeptide having nickase activity, a reverse transcriptase, and a guide molecule. The Cas polypeptide can lack nuclease activity. The guide molecule can include a target binding sequence as well as a primer binding sequence and a template containing the edited polynucleotide sequence. The guide molecule, Cas polypeptide, and/or reverse transcriptase can be coupled together or otherwise associate with each other to form an effector complex and edit a target sequence. In some embodiments, the Cas polypeptide is a Class 2, Type V Cas polypeptide. In some embodiments, the Cas polypeptide is a Cas9 polypeptide (e.g. is a Cas9 nickase). In some embodiments, the Cas polypeptide is fused to the reverse transcriptase. In some embodiments, the Cas polypeptide is linked to the reverse transcriptase.

[0204] In some embodiments, the prime editing system can be a PE1 system or variant thereof, a PE2 system or variant thereof, or a PE3 (e.g. PE3, PE3b) system. See e.g., Anzalone et al. 2019. Nature. 576:149-157, particularly at pgs. 2-3, Figs. 2a, 3a-3f, 4a-4b, Extended Data Figs. 3a-3b, 4,

[0205] The peg guide molecule can be about 10 to about 200 or more nucleotides in length,

nucleotides in length. Optimization of the peg guide molecule can be accomplished as described in Anzalone et al. 2019. Nature. 576: 149-157, particularly at pg. 3, Fig. 2a-2b, and Extended Data Figs. 5a-c.

CAST Systems

[0206] In some embodiments, a polynucleotide of the present invention described elsewhere herein (e.g. Table 1, 2 and 3, such as ATF3) can be modified using a CRISPR-Associated Transposase (“CAST”) system, such as any of those described in PCT/US2019/066835. CAST systems can include a Cas protein that is catalytically inactive, or engineered to be catalytically active, and can further comprise a transposase (or subunits thereof) that catalyzes RNA-guided DNA transposition. Such systems are able to insert DNA sequences at a target site in a DNA molecule without relying on host cell repair machinery. CAST systems can be Class 1 or Class 2 CAST systems. An example Class 1 system is described in Klompe et al. Nature, doi: 10.1038/s41586-019-1323, which is incorporated herein by reference. An example Class 2 system is described in Strecker et al. Science. doi: 10.1126/science.aax9181 (2019), and PCT/US2019/066835, which are incorporated herein by reference.

Guide Molecules

[0207] The CRISPR-Cas or Cas-Based system described herein can, in some embodiments, include one or more guide molecules. The terms guide molecule, guide sequence and guide polynucleotide, refer to polynucleotides capable of guiding Cas to a target genomic locus and are used interchangeably as in foregoing cited documents such as International Patent Publication No. WO 2014/093622 (PCT/US2013/074667). In general, a guide sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. The guide molecule can be a polynucleotide.

[0208] The ability of a guide sequence (within a nucleic acid-targeting guide RNA) to direct sequence-specific binding of a nucleic acid-targeting complex to a target nucleic acid sequence may be assessed by any suitable assay. For example, the components of a nucleic acid-targeting CRISPR system sufficient to form a nucleic acid-targeting complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target nucleic acid sequence, such as by transfection with vectors encoding the components of the nucleic acid-targeting complex, followed by an assessment of preferential targeting (e.g., cleavage) within the target nucleic acid sequence, such as by Surveyor assay (Qiu et al. 2004. BioTechniques. 36(4):702-707). Similarly, cleavage of a target nucleic acid sequence may be evaluated in a test tube by providing the target nucleic acid sequence, components of a nucleic acid-targeting complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible, and will occur to those skilled in the art.

[0209] In some embodiments, the guide molecule is an RNA. The guide molecule(s) (also referred to interchangeably herein as guide polynucleotide and guide sequence) that are included in the CRISPR-Cas or Cas-based system can be any polynucleotide sequence having sufficient complementarity with a target nucleic acid sequence to hybridize with the target nucleic acid sequence and direct sequence-specific binding of a nucleic acid-targeting complex to the target nucleic acid sequence. In some embodiments, the degree of complementarity, when optimally aligned using a suitable alignment algorithm, can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), Clustal W, Clustal X, BLAT, Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, CA), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
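By way of non-limiting illustration, a degree of complementarity between a guide sequence and a target sequence can be estimated as sketched below. This is a deliberately simplified, ungapped comparison of two equal-length RNA sequences and is not an implementation of any of the alignment algorithms named above; the function name `percent_complementarity` is hypothetical. Both sequences are assumed to be written 5’ to 3’, so the guide is compared against the target read from its 3’ end.

```python
# Watson-Crick partners for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide: str, target: str) -> float:
    """Percent of guide positions that Watson-Crick pair with the target.

    Simplifying assumptions (not from the source): no gaps, equal lengths,
    and antiparallel pairing of two sequences written 5'->3'.
    """
    # Accept DNA-style input by treating T as U.
    guide = guide.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    if not guide or len(guide) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        RNA_COMPLEMENT.get(g) == t for g, t in zip(guide, reversed(target))
    )
    return 100.0 * paired / len(guide)
```

For example, under these assumptions a four-nucleotide guide `AUGC` scores 100.0 against its reverse complement `GCAU` and 75.0 against `GCAA`. A gapped local or global alignment would be needed to reproduce the "optimally aligned" figure contemplated in the text.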

[0210] A guide sequence, and hence a nucleic acid-targeting guide, may be selected to target any target nucleic acid sequence. The target sequence may be DNA. The target sequence may be any RNA sequence. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), micro-RNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmatic RNA (scRNA). In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA and lncRNA. In some more preferred embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.

[0211] In some embodiments, a nucleic acid-targeting guide is selected to reduce the degree of secondary structure within the nucleic acid-targeting guide. In some embodiments, about or less than about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, or fewer of the nucleotides of the nucleic acid-targeting guide participate in self-complementary base pairing when optimally folded. Optimal folding may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another example folding algorithm is the online webserver RNAfold, developed at the Institute for Theoretical Chemistry at the University of Vienna, using the centroid structure prediction algorithm (see e.g., A.R. Gruber et al., 2008, Cell 106(1): 23-24; and PA Carr and GM Church, 2009, Nature Biotechnology 27(12): 1151-62).
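By way of non-limiting illustration, the fraction of guide nucleotides that can participate in self-complementary base pairing can be roughly bounded as sketched below. The sketch substitutes the classical Nussinov base-pair maximization recursion for the free-energy minimization performed by mFold or RNAfold, so it gives an upper-bound style estimate rather than an optimal thermodynamic fold. The function names, the allowance of G-U wobble pairs, and the minimum hairpin loop of three unpaired nucleotides are assumptions made only for this illustration.

```python
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Nussinov recursion: maximum number of nested base pairs in seq,
    requiring at least min_loop unpaired bases inside any hairpin."""
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]  # dp[i][j] = best pairing of seq[i..j]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j left unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


def paired_fraction(seq: str) -> float:
    """Fraction of nucleotides involved in a pair under the sketch above."""
    return 2 * max_base_pairs(seq) / len(seq) if seq else 0.0
```

A candidate guide could then be compared against one of the thresholds recited above (for example, 50%) by testing whether `paired_fraction(guide)` falls at or below 0.5, with the caveat that a free-energy-based fold may report fewer paired positions.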

[0212] In certain embodiments, a guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat (DR) sequence and a guide sequence or spacer sequence. In certain embodiments, the guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat sequence fused or linked to a guide sequence or spacer sequence. In certain embodiments, the direct repeat sequence may be located upstream (i.e., 5’) from the guide sequence or spacer sequence. In other embodiments, the direct repeat sequence may be located downstream (i.e., 3’) from the guide sequence or spacer sequence.

[0213] In certain embodiments, the crRNA comprises a stem loop, preferably a single stem loop. In certain embodiments, the direct repeat sequence forms a stem loop, preferably a single stem loop.

[0214] In certain embodiments, the spacer length of the guide RNA is from 15 to 35 nt. In certain embodiments, the spacer length of the guide RNA is at least 15 nucleotides. In certain embodiments, the spacer length is from 15 to 17 nt, e.g., 15, 16, or 17 nt, from 17 to 20 nt, e.g., 17, 18, 19, or 20 nt, from 20 to 24 nt, e.g., 20, 21, 22, 23, or 24 nt, from 23 to 25 nt, e.g., 23, 24, or 25 nt, from 24 to 27 nt, e.g., 24, 25, 26, or 27 nt, from 27 to 30 nt, e.g., 27, 28, 29, or 30 nt, from 30-35 nt, e.g., 30, 31, 32, 33, 34, or 35 nt, or 35 nt or longer.
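By way of non-limiting illustration, the arrangement of a direct repeat relative to a spacer and the 15 to 35 nt spacer length range described above can be sketched as follows. The function name `build_crrna`, its parameters, and the position labels are hypothetical; no particular direct repeat sequence is implied, and any sequences supplied are the caller's own.

```python
def build_crrna(direct_repeat: str, spacer: str, dr_position: str = "5prime",
                min_len: int = 15, max_len: int = 35) -> str:
    """Join a direct repeat (DR) and a spacer into a single crRNA string.

    dr_position is "5prime" when the DR lies upstream of the spacer and
    "3prime" when it lies downstream. The default length bounds mirror the
    15 to 35 nt range recited in the text.
    """
    if not (min_len <= len(spacer) <= max_len):
        raise ValueError(
            f"spacer length {len(spacer)} outside {min_len}-{max_len} nt"
        )
    if dr_position == "5prime":
        return direct_repeat + spacer
    if dr_position == "3prime":
        return spacer + direct_repeat
    raise ValueError("dr_position must be '5prime' or '3prime'")
```

Embodiments permitting spacers of 35 nt or longer could be accommodated in this sketch by passing a larger `max_len`.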

[0215] The “tracrRNA” sequence or analogous terms includes any polynucleotide sequence that has sufficient complementarity with a crRNA sequence to hybridize. In some embodiments, the degree of complementarity between the tracrRNA sequence and crRNA sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97.5%, 99%, or higher. In some embodiments, the tracr sequence is about or more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or more nucleotides in length. In some embodiments, the tracr sequence and crRNA sequence are contained within a single transcript, such that hybridization between the two produces a transcript having a secondary structure, such as a hairpin.

[0216] In general, degree of complementarity is with reference to the optimal alignment of the sca sequence and tracr sequence, along the length of the shorter of the two sequences. Optimal alignment may be determined by any suitable alignment algorithm, and may further account for secondary structures, such as self-complementarity within either the sca sequence or tracr sequence. In some embodiments, the degree of complementarity between the tracr sequence and sca sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97.5%, 99%, or higher.

[0217] In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or 100%; a guide RNA or sgRNA can be about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length; or a guide RNA or sgRNA can be less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length; and tracr RNA can be 30 or 50 nucleotides in length. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence is greater than 94.5% or 95% or 95.5% or 96% or 96.5% or 97% or 97.5% or 98% or 98.5% or 99% or 99.5% or 99.9%, or 100%. Off target is less than 100% or 99.9% or 99.5% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% or 94% or 93% or 92% or 91% or 90% or 89% or 88% or 87% or 86% or 85% or 84% or 83% or 82% or 81% or 80% complementarity between the sequence and the guide, with it being advantageous that off target is 100% or 99.9% or 99.5% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% complementarity between the sequence and the guide.

[0218] In some embodiments according to the invention, the guide RNA (capable of guiding Cas to a target locus) may comprise (1) a guide sequence capable of hybridizing to a genomic target locus in the eukaryotic cell; (2) a tracr sequence; and (3) a tracr mate sequence. All (1) to (3) may reside in a single RNA, i.e. an sgRNA (arranged in a 5’ to 3’ orientation), or the tracr RNA may be a different RNA than the RNA containing the guide and tracr sequence. The tracr hybridizes to the tracr mate sequence and directs the CRISPR/Cas complex to the target sequence. Where the tracr RNA is on a different RNA than the RNA containing the guide and tracr sequence, the length of each RNA may be optimized to be shortened from their respective native lengths, and each may be independently chemically modified to protect from degradation by cellular RNase or otherwise increase stability.

[0219] Many modifications to guide sequences are known in the art and are further contemplated within the context of this invention. Various modifications may be used to increase the specificity of binding to the target sequence and/or increase the activity of the Cas protein and/or reduce off-target effects. Example guide sequence modifications are described in International Patent Application No. PCT/US2019/045582, specifically paragraphs [0178]-[0333], which is incorporated herein by reference.

Target Sequences, PAMs, and PFSs

Target Sequences

[0220] In the context of formation of a CRISPR complex, “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex. A target sequence may comprise RNA polynucleotides. The term “target RNA” refers to an RNA polynucleotide being or comprising the target sequence. In other words, the target polynucleotide can be a polynucleotide or a part of a polynucleotide to which a part of the guide sequence is designed to have complementarity and to which the effector function mediated by the complex comprising the CRISPR effector protein and a guide molecule is to be directed. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell.

[0221] The guide sequence can specifically bind a target sequence in a target polynucleotide. The target polynucleotide may be DNA. The target polynucleotide may be RNA. The target polynucleotide can have one or more (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc. or more) target sequences. The target polynucleotide can be on a vector. The target polynucleotide can be genomic DNA. The target polynucleotide can be episomal. Other forms of the target polynucleotide are described elsewhere herein.

[0222] The target sequence may be DNA. The target sequence may be any RNA sequence. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), micro-RNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmatic RNA (scRNA). In some preferred embodiments, the target sequence (also referred to herein as a target polynucleotide) may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA and lncRNA. In some more preferred embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.

PAM and PFS Elements

[0223] PAM elements are sequences that can be recognized and bound by Cas proteins. Cas proteins/effector complexes can then unwind the dsDNA at a position adjacent to the PAM element. It will be appreciated that Cas proteins and systems that include them that target RNA do not require PAM sequences (Marraffini et al. 2010. Nature. 463:568-571). Instead, many rely on PFSs, which are discussed elsewhere herein. In certain embodiments, the target sequence should be associated with a PAM (protospacer adjacent motif) or PFS (protospacer flanking sequence or site); that is, a short sequence recognized by the CRISPR complex. Depending on the nature of the CRISPR-Cas protein, the target sequence should be selected such that its complementary sequence in the DNA duplex (also referred to herein as the non-target sequence) is upstream or downstream of the PAM. In some embodiments, the complementary sequence of the target sequence is downstream or 3’ of the PAM or upstream or 5’ of the PAM. The precise sequence and length requirements for the PAM differ depending on the Cas protein used, but PAMs are typically 2-5 base pair sequences adjacent to the protospacer (that is, the target sequence). Examples of the natural PAM sequences for different Cas proteins are provided herein below and the skilled person will be able to identify further PAM sequences for use with a given Cas protein.

[0224] The ability to recognize different PAM sequences depends on the Cas polypeptide(s) included in the system. See e.g. Gleditzsch et al. 2019. RNA Biology. 16(4):504-517. Table 4 below shows several Cas polypeptides and the PAM sequence they recognize.

[0225] In a preferred embodiment, the CRISPR effector protein may recognize a 3’ PAM. In certain embodiments, the CRISPR effector protein may recognize a 3’ PAM which is 5’ H, wherein H is A, C or U.

[0226] Further, engineering of the PAM Interacting (PI) domain on the Cas protein may allow programming of PAM specificity, improve target site recognition fidelity, and increase the versatility of the CRISPR-Cas protein, for example as described for Cas9 in Kleinstiver BP et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature. 2015 Jul 23;523(7561):481-5. doi: 10.1038/nature14592. As further detailed herein, the skilled person will understand that Cas13 proteins may be modified analogously. Gao et al., “Engineered Cpf1 Enzymes with Altered PAM Specificities,” bioRxiv 091611; doi: http://dx.doi.org/10.1101/091611 (Dec. 4, 2016). Doench et al. created a pool of sgRNAs, tiling across all possible target sites of a panel of six endogenous mouse and three endogenous human genes and quantitatively assessed their ability to produce null alleles of their target gene by antibody staining and flow cytometry. The authors showed that optimization of the PAM improved activity and also provided an on-line tool for designing sgRNAs.

[0227] PAM sequences can be identified in a polynucleotide using an appropriate design tool, which are commercially available as well as online. Such freely available tools include, but are not limited to, CRISPRFinder and CRISPRTarget. Mojica et al. 2009. Microbiol. 155(Pt. 3):733-740; Altschul et al. 1990. J. Mol. Biol. 215:403-410; Biswas et al. 2013. RNA Biol. 10:817-827; and Grissa et al. 2007. Nucleic Acids Res. 35:W52-57. Experimental approaches to PAM identification can include, but are not limited to, plasmid depletion assays (Jiang et al. 2013. Nat. Biotechnol. 31:233-239; Esvelt et al. 2013. Nat. Methods. 10:1116-1121; Kleinstiver et al. 2015. Nature. 523:481-485), screening by a high-throughput in vivo model called PAM-SCANR (Pattanayak et al. 2013. Nat. Biotechnol. 31:839-843 and Leenay et al. 2016. Mol. Cell. 16:253), and negative screening (Zetsche et al. 2015. Cell. 163:759-771).
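By way of non-limiting illustration, computational identification of candidate target sites adjacent to a PAM can be sketched with a regular-expression scan. The sketch is not an implementation of CRISPRFinder or CRISPRTarget. It assumes a 3’ PAM written in IUPAC notation, defaults to the NGG motif commonly associated with Streptococcus pyogenes Cas9 and a 20-nt protospacer, and scans only the strand supplied; the function name `find_3prime_pam_sites` and its parameters are hypothetical.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_3prime_pam_sites(dna: str, pam: str = "NGG", protospacer_len: int = 20):
    """Return (start, protospacer, pam_match) for every protospacer that is
    immediately followed on its 3' side by a sequence matching the PAM.

    Only the given strand is scanned; a fuller search would also scan the
    reverse complement.
    """
    dna = dna.upper()
    pam_re = "".join(IUPAC[b] for b in pam.upper())
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})({pam_re}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]
```

For a Cas protein recognizing a different PAM, the `pam` argument could be replaced with the corresponding IUPAC motif; motifs located 5’ of the protospacer would require the two capture groups to be swapped.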

[0228] As previously mentioned, CRISPR-Cas systems that target RNA do not typically rely on PAM sequences. Instead, such systems typically recognize protospacer flanking sites (PFSs). Thus, Type VI CRISPR-Cas systems typically recognize PFSs instead of PAMs. PFSs represent an analogue to PAMs for RNA targets. Type VI CRISPR-Cas systems employ a Cas13. Some Cas13 proteins analyzed to date, such as Cas13a (C2c2) identified from Leptotrichia shahii (LshCas13a), have a specific discrimination against G at the 3’ end of the target RNA. The presence of a C at the corresponding crRNA repeat site can indicate that nucleotide pairing at this position is rejected. However, some Cas13 proteins (e.g. LwaCas13a and PspCas13b) do not seem to have a PFS preference. See e.g. Gleditzsch et al. 2019. RNA Biology. 16(4):504-517.
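By way of non-limiting illustration, the discrimination against G at the 3’ flank of a target RNA can be expressed as a simple check on the nucleotide immediately 3’ of the protospacer. The function name `has_permissive_3prime_pfs` and its parameters are hypothetical, and the sketch assumes that the PFS is the single nucleotide directly following the protospacer.

```python
def has_permissive_3prime_pfs(target_rna: str, protospacer_start: int,
                              protospacer_len: int, disfavored: str = "G") -> bool:
    """Return True when the base immediately 3' of the protospacer exists
    and is not one of the disfavored bases (G by default)."""
    rna = target_rna.upper().replace("T", "U")
    flank_index = protospacer_start + protospacer_len
    if protospacer_start < 0 or flank_index >= len(rna):
        return False  # no 3' flanking nucleotide available to inspect
    return rna[flank_index] not in disfavored
```

For a Cas13 protein with no apparent PFS preference, an empty `disfavored` string could be passed so that any flanking nucleotide is accepted.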

[0229] Some Type VI proteins, such as subtype B, have 5’-recognition of D (G, T, A) and a 3’-motif requirement of NAN or NNA. One example is the Cas13b protein identified in Bergeyella zoohelcum (BzCas13b). See e.g. Gleditzsch et al. 2019. RNA Biology. 16(4):504-517.

[0230] Overall Type VI CRISPR-Cas systems appear to have less restrictive rules for substrate (e.g. target sequence) recognition than those that target DNA (e.g. Type V and type II).

Zinc Finger Nucleases

[0231] In some embodiments, the polynucleotide is modified using a Zinc Finger nuclease or system thereof. One type of programmable DNA-binding domain is provided by artificial zinc-finger (ZF) technology, which involves arrays of ZF modules to target new DNA-binding sites in the genome. Each finger module in a ZF array targets three DNA bases. A customized array of individual zinc finger domains is assembled into a ZF protein (ZFP).
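By way of non-limiting illustration, because each finger module targets three DNA bases, a candidate binding site can be divided into consecutive triplets to determine how many finger modules an array would need. The function name `split_into_finger_triplets` is hypothetical, and the sketch does not address which finger recognizes which triplet or the orientation of the array on the DNA.

```python
def split_into_finger_triplets(site: str) -> list:
    """Split a DNA binding site into consecutive 3-base subsites, one per
    zinc finger module. The site length must be a non-zero multiple of 3."""
    site = site.upper()
    if not site or len(site) % 3 != 0:
        raise ValueError("site length must be a non-zero multiple of 3")
    return [site[i:i + 3] for i in range(0, len(site), 3)]
```

Under this sketch, a 9-bp site corresponds to a three-finger array and an 18-bp site to a six-finger array.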

[0232] ZFPs can comprise a functional domain. The first synthetic zinc finger nucleases (ZFNs) were developed by fusing a ZF protein to the catalytic domain of the Type IIS restriction enzyme FokI. (Kim, Y. G. et al., 1994, Chimeric restriction endonuclease, Proc. Natl. Acad. Sci. U.S.A. 91, 883-887; Kim, Y. G. et al., 1996, Hybrid restriction enzymes: zinc finger fusions to Fok I cleavage domain. Proc. Natl. Acad. Sci. U.S.A. 93, 1156-1160). Increased cleavage specificity can be attained with decreased off target activity by use of paired ZFN heterodimers, each targeting different nucleotide sequences separated by a short spacer. (Doyon, Y. et al., 2011, Enhancing zinc-finger-nuclease activity with improved obligate heterodimeric architectures. Nat. Methods 8, 74-79). ZFPs can also be designed as transcription activators and repressors and have been used to target many genes in a wide variety of organisms. Exemplary methods of genome editing using ZFNs can be found for example in U.S. Patent Nos. 6,534,261, 6,607,882, 6,746,838, 6,794,136, 6,824,978, 6,866,997, 6,933,113, 6,979,539, 7,013,219, 7,030,215, 7,220,719, 7,241,573, 7,241,574, 7,585,849, 7,595,376, 6,903,185, and 6,479,626, all of which are specifically incorporated by reference.

TALE Nucleases

[0233] In some embodiments, a TALE nuclease or TALE nuclease system can be used to modify a polynucleotide. In some embodiments, the methods provided herein use isolated, non-naturally occurring, recombinant or engineered DNA binding proteins that comprise TALE monomers or half monomers as a part of their organizational structure that enable the targeting of nucleic acid sequences with improved efficiency and expanded specificity.

[0234] Naturally occurring TALEs or “wild type TALEs” are nucleic acid binding proteins secreted by numerous species of proteobacteria. TALE polypeptides contain a nucleic acid binding domain composed of tandem repeats of highly conserved monomer polypeptides that are predominantly 33, 34 or 35 amino acids in length and that differ from each other mainly in amino acid positions 12 and 13. In advantageous embodiments, the nucleic acid is DNA. As used herein, the term “polypeptide monomers”, “TALE monomers” or “monomers” will be used to refer to the highly conserved repetitive polypeptide sequences within the TALE nucleic acid binding domain and the term “repeat variable di-residues” or “RVD” will be used to refer to the highly variable amino acids at positions 12 and 13 of the polypeptide monomers. As provided throughout the disclosure, the amino acid residues of the RVD are depicted using the IUPAC single letter code for amino acids. A general representation of a TALE monomer which is comprised within the DNA binding domain is X1-11-(X12X13)-X14-33 or 34 or 35, where the subscript indicates the amino acid position and X represents any amino acid. X12X13 indicate the RVDs. In some polypeptide monomers, the variable amino acid at position 13 is missing or absent and in such monomers, the RVD consists of a single amino acid. In such cases the RVD may be alternatively represented as X*, where X represents X12 and (*) indicates that X13 is absent. The DNA binding domain comprises several repeats of TALE monomers and this may be represented as (X1-11-(X12X13)-X14-33 or 34 or 35)z, where in an advantageous embodiment, z is at least 5 to 40. In a further advantageous embodiment, z is at least 10 to 26.

[0235] The TALE monomers can have a nucleotide binding affinity that is determined by the identity of the amino acids in its RVD. For example, polypeptide monomers with an RVD of NI can preferentially bind to adenine (A), monomers with an RVD of NG can preferentially bind to thymine (T), monomers with an RVD of HD can preferentially bind to cytosine (C) and monomers with an RVD of NN can preferentially bind to both adenine (A) and guanine (G). In some embodiments, monomers with an RVD of IG can preferentially bind to T. Thus, the number and order of the polypeptide monomer repeats in the nucleic acid binding domain of a TALE determines its nucleic acid target specificity. In some embodiments, monomers with an RVD of NS can recognize all four base pairs and can bind to A, T, G or C. The structure and function of TALEs is further described in, for example, Moscou et al., Science 326: 1501 (2009); Boch et al., Science 326: 1509-1512 (2009); and Zhang et al., Nature Biotechnology 29: 149-153 (2011).
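By way of non-limiting illustration, the RVD-to-nucleotide preferences recited above can be captured in a lookup table and used to propose an ordered series of RVDs for a DNA target, or to check whether a given series is compatible with a target. The names `RVD_TO_BASES`, `DEFAULT_RVD_FOR_BASE`, `rvds_for_target`, and `rvds_match` are hypothetical, and the choice of NN for guanine is an assumption of this sketch, since NN is described as binding both adenine and guanine.

```python
# RVD preferences as recited in the text (IUPAC single-letter amino acids).
RVD_TO_BASES = {
    "NI": {"A"},
    "NG": {"T"},
    "IG": {"T"},
    "HD": {"C"},
    "NN": {"A", "G"},
    "NS": {"A", "C", "G", "T"},
}

# One RVD chosen per base for design purposes (an assumption of this sketch).
DEFAULT_RVD_FOR_BASE = {"A": "NI", "T": "NG", "C": "HD", "G": "NN"}


def rvds_for_target(target: str) -> list:
    """Propose one RVD per nucleotide, in the same order as the target."""
    return [DEFAULT_RVD_FOR_BASE[base] for base in target.upper()]


def rvds_match(rvds: list, dna: str) -> bool:
    """True if every RVD can bind the nucleotide at its position."""
    dna = dna.upper()
    return len(rvds) == len(dna) and all(
        base in RVD_TO_BASES[rvd] for rvd, base in zip(rvds, dna)
    )
```

Because NN accepts both A and G, a series designed for `ATCG` under this sketch would also be reported as compatible with `ATCA`, which illustrates why more guanine-selective RVDs may be preferred in some embodiments.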

[0236] The polypeptides used in methods of the invention can be isolated, non-naturally occurring, recombinant or engineered nucleic acid-binding proteins that have nucleic acid or DNA binding regions containing polypeptide monomer repeats that are designed to target specific nucleic acid sequences.

[0237] As described herein, polypeptide monomers having an RVD of HN or NH preferentially bind to guanine and thereby allow the generation of TALE polypeptides with high binding specificity for guanine containing target nucleic acid sequences. In some embodiments, polypeptide monomers having RVDs RN, NN, NK, SN, NH, KN, HN, NQ, HH, RG, KH, RH and SS can preferentially bind to guanine. In some embodiments, polypeptide monomers having RVDs RN, NK, NQ, HH, KH, RH, SS and SN can preferentially bind to guanine and can thus allow the generation of TALE polypeptides with high binding specificity for guanine containing target nucleic acid sequences. In some embodiments, polypeptide monomers having RVDs HH, KH, NH, NK, NQ, RH, RN and SS can preferentially bind to guanine and thereby allow the generation of TALE polypeptides with high binding specificity for guanine containing target nucleic acid sequences. In some embodiments, the RVDs that have high binding specificity for guanine are RN, NH, RH and KH. Furthermore, polypeptide monomers having an RVD of NV can preferentially bind to adenine and guanine. In some embodiments, monomers having RVDs of H*, HA, KA, N*, NA, NC, NS, RA, and S* bind to adenine, guanine, cytosine and thymine with comparable affinity.

[0238] The predetermined N-terminal to C-terminal order of the one or more polypeptide monomers of the nucleic acid or DNA binding domain determines the corresponding predetermined target nucleic acid sequence to which the polypeptides of the invention will bind. As used herein, the monomers and at least one or more half monomers are “specifically ordered to target” the genomic locus or gene of interest. In plant genomes, the natural TALE-binding sites always begin with a thymine (T), which may be specified by a cryptic signal within the non-repetitive N-terminus of the TALE polypeptide; in some cases, this region may be referred to as repeat 0. In animal genomes, TALE binding sites do not necessarily have to begin with a thymine (T) and polypeptides of the invention may target DNA sequences that begin with T, A, G or C. The tandem repeat of TALE monomers always ends with a half-length repeat or a stretch of sequence that may share identity with only the first 20 amino acids of a repetitive full-length TALE monomer and this half repeat may be referred to as a half-monomer. Therefore, it follows that the length of the nucleic acid or DNA being targeted is equal to the number of full monomers plus two.
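By way of non-limiting illustration, the arithmetic stated above can be written out explicitly: one targeted position is contributed by each full monomer, one by the terminal half-monomer, and one by the initial nucleotide specified by the N-terminus (repeat 0). The function name `targeted_site_length` is hypothetical, and the attribution of the two additional positions is this sketch's reading of the paragraph above.

```python
def targeted_site_length(n_full_monomers: int) -> int:
    """Length of the targeted DNA: number of full monomers plus two."""
    if n_full_monomers < 0:
        raise ValueError("number of full monomers cannot be negative")
    half_monomer_positions = 1  # terminal half-length repeat
    repeat_0_positions = 1      # initial nucleotide specified by the N-terminus
    return n_full_monomers + half_monomer_positions + repeat_0_positions
```

For example, an array of 18 full monomers would correspond under this sketch to a 20-nucleotide target.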

[0239] As described in Zhang et al., Nature Biotechnology 29: 149-153 (2011), TALE polypeptide binding efficiency may be increased by including amino acid sequences from the “capping regions” that are directly N-terminal or C-terminal of the DNA binding region of naturally occurring TALEs into the engineered TALEs at positions N-terminal or C-terminal of the engineered TALE DNA binding region. Thus, in certain embodiments, the TALE polypeptides described herein further comprise an N-terminal capping region and/or a C-terminal capping region.

[0240] An exemplary amino acid sequence of a N-terminal capping region is:

[0241]

[0242] An exemplary amino acid sequence of a C-terminal capping region is:

[0243]

[0244] As used herein, the predetermined “N-terminus” to “C-terminus” orientation of the N-terminal capping region, the DNA binding domain comprising the repeat TALE monomers, and the C-terminal capping region provides a structural basis for the organization of different domains in the d-TALEs or polypeptides of the invention.

[0245] The entire N-terminal and/or C-terminal capping regions are not necessary to enhance the binding activity of the DNA binding region. Therefore, in certain embodiments, fragments of the N-terminal and/or C-terminal capping regions are included in the TALE polypeptides described herein.

[0246] In certain embodiments, the TALE polypeptides described herein contain an N-terminal capping region fragment that includes at least 10, 20, 30, 40, 50, 54, 60, 70, 80, 87, 90, 94, 100, 102, 110, 117, 120, 130, 140, 147, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260 or 270 amino acids of an N-terminal capping region. In certain embodiments, the N-terminal capping region fragment amino acids are of the C-terminus (the DNA-binding region proximal end) of an N-terminal capping region. As described in Zhang et al., Nature Biotechnology 29: 149-153 (2011), N-terminal capping region fragments that include the C-terminal 240 amino acids enhance binding activity equal to the full-length capping region, while fragments that include the C-terminal 147 amino acids retain greater than 80% of the efficacy of the full-length capping region, and fragments that include the C-terminal 117 amino acids retain greater than 50% of the activity of the full-length capping region.

[0247] In some embodiments, the TALE polypeptides described herein contain a C-terminal capping region fragment that includes at least 6, 10, 20, 30, 37, 40, 50, 60, 68, 70, 80, 90, 100, 110, 120, 127, 130, 140, 150, 155, 160, 170, or 180 amino acids of a C-terminal capping region. In certain embodiments, the C-terminal capping region fragment amino acids are of the N-terminus (the DNA-binding region proximal end) of a C-terminal capping region. As described in Zhang et al., Nature Biotechnology 29: 149-153 (2011), C-terminal capping region fragments that include the C-terminal 68 amino acids enhance binding activity equal to the full-length capping region, while fragments that include the C-terminal 20 amino acids retain greater than 50% of the efficacy of the full-length capping region.

[0248] In certain embodiments, the capping regions of the TALE polypeptides described herein do not need to have identical sequences to the capping region sequences provided herein. Thus, in some embodiments, the capping regions of the TALE polypeptides described herein have sequences that are at least 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical or share identity to the capping region amino acid sequences provided herein. Sequence identity is related to sequence homology. Homology comparisons may be conducted by eye or, more usually, with the aid of readily available sequence comparison programs. These commercially available computer programs may calculate percent (%) homology between two or more sequences and may also calculate the sequence identity shared by two or more amino acid or nucleic acid sequences. In some preferred embodiments, the capping regions of the TALE polypeptides described herein have sequences that are at least 95% identical or share identity to the capping region amino acid sequences provided herein.

[0249] Sequence homologies can be generated by any of a number of computer programs known in the art, which include but are not limited to BLAST or FASTA. Suitable computer programs for carrying out alignments like the GCG Wisconsin Bestfit package may also be used. Once the software has produced an optimal alignment, it is possible to calculate % homology, preferably % sequence identity. The software typically does this as part of the sequence comparison and generates a numerical result.
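The final step described here, deriving a percent identity from an alignment that software has already produced, can be sketched as follows. This is an illustration only and does not reproduce BLAST, FASTA or Bestfit: the function name is hypothetical, the input is assumed to be two pre-aligned strings of equal length with `-` marking gaps, and the choice of denominator (all aligned columns that are not gap-gap) is an assumption, since tools differ on that convention.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored. Columns where only
    one sequence has a gap count toward the denominator but never as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # uninformative column
        compared += 1
        if x == y:
            matches += 1  # identical residues (neither can be a gap here)
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared
```

A result from such a calculation could then be compared against a threshold such as the 95% identity mentioned for preferred capping region embodiments.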

[0250] In some embodiments described herein, the TALE polypeptides of the invention include a nucleic acid binding domain linked to one or more effector domains. The terms “effector domain” or “regulatory and functional domain” refer to a polypeptide sequence that has an activity other than binding to the nucleic acid sequence recognized by the nucleic acid binding domain. By combining a nucleic acid binding domain with one or more effector domains, the polypeptides of the invention may be used to target the one or more functions or activities mediated by the effector domain to a particular target DNA sequence to which the nucleic acid binding domain specifically binds.

[0251] In some embodiments of the TALE polypeptides described herein, the activity mediated by the effector domain is a biological activity. For example, in some embodiments the effector domain is a transcriptional inhibitor (i.e., a repressor domain), such as an mSin interaction domain (SID), SID4X domain or a Krüppel-associated box (KRAB) or fragments of the KRAB domain. In some embodiments the effector domain is an enhancer of transcription (i.e., an activation domain), such as the VP16, VP64 or p65 activation domain. In some embodiments, the nucleic acid binding domain is linked, for example, with an effector domain that includes but is not limited to a transposase, integrase, recombinase, resolvase, invertase, protease, DNA methyltransferase, DNA demethylase, histone acetylase, histone deacetylase, nuclease, transcriptional repressor, transcriptional activator, transcription factor recruiting, protein nuclear-localization signal or cellular uptake signal.

[0252] In some embodiments, the effector domain is a protein domain which exhibits activities which include but are not limited to transposase activity, integrase activity, recombinase activity, resolvase activity, invertase activity, protease activity, DNA methyltransferase activity, DNA demethylase activity, histone acetylase activity, histone deacetylase activity, nuclease activity, nuclear-localization signaling activity, transcriptional repressor activity, transcriptional activator activity, transcription factor recruiting activity, or cellular uptake signaling activity. Other preferred embodiments of the invention may include any combination of the activities described herein.

Meganucleases

[0253] In some embodiments, a meganuclease or system thereof can be used to modify a polynucleotide. Meganucleases are endodeoxyribonucleases characterized by a large recognition site (double-stranded DNA sequences of 12 to 40 base pairs). Exemplary methods for using meganucleases can be found in US Patent Nos. 8,163,514, 8,133,697, 8,021,867, 8,119,361, 8,119,381, 8,124,369, and 8,129,134, which are specifically incorporated by reference.

SEQUENCES RELATED TO NUCLEUS TARGETING AND TRANSPORTATION

[0254] In some embodiments, one or more components (e.g., the Cas protein and/or deaminase, Zn Finger protein, TALE, or meganuclease) in the composition for engineering cells may comprise one or more sequences related to nucleus targeting and transportation. Such sequence may facilitate the one or more components in the composition for targeting a sequence within a cell. In order to improve targeting of the CRISPR-Cas protein and/or the nucleotide deaminase protein or catalytic domain thereof used in the methods of the present disclosure to the nucleus, it may be advantageous to provide one or both of these components with one or more nuclear localization sequences (NLSs).

[0255] In some embodiments, the NLSs used in the context of the present disclosure are heterologous to the proteins. Non-limiting examples of NLSs include an NLS sequence derived from: the NLS of the SV40 virus large T-antigen, having the amino acid sequence P

(SEQ ID No. 10) or ; the NLS from nucleoplasmin (e.g., the nucleoplasmin bipartite NLS with the sequence

the c-myc NLS having the amino acid sequence or the having the sequence ( ); the sequence of the IBB domain from importin-alpha; the sequences VSRKRPRP (SEQ ID No. 17) and PPKKARED (SEQ ID No. 18) of the myoma T protein; the sequence PQPKKKPL (SEQ ID No. 19) of human p53; the sequence ) of mouse c-abl IV; the sequences DRLRR and PKQKKRK (SEQ ID No. 22) of the influenza virus NS1; the sequence of the Hepatitis virus delta antigen; the sequence of the mouse Mx1 protein; the sequence KRKGDEVDGVDEVAKKKSKK (SEQ ID No. 25) of the human poly(ADP-ribose) polymerase; and the sequence RKCLQAGMNLEARKTKK (SEQ ID No. 26) of the steroid hormone receptors (human) glucocorticoid. In general, the one or more NLSs are of sufficient strength to drive accumulation of the DNA-targeting Cas protein in a detectable amount in the nucleus of a eukaryotic cell. In general, strength of nuclear localization activity may derive from the number of NLSs in the CRISPR-Cas protein, the particular NLS(s) used, or a combination of these factors. Detection of accumulation in the nucleus may be performed by any suitable technique. For example, a detectable marker may be fused to the nucleic acid-targeting protein, such that location within a cell may be visualized, such as in combination with a means for detecting the location of the nucleus (e.g., a stain specific for the nucleus such as DAPI). Cell nuclei may also be isolated from cells, the contents of which may then be analyzed by any suitable process for detecting protein, such as immunohistochemistry, Western blot, or enzyme activity assay. Accumulation in the nucleus may also be determined indirectly, such as by an assay for the effect of nucleic acid-targeting complex formation (e.g., assay for deaminase activity) at the target sequence, or assay for altered gene expression activity affected by DNA-targeting complex formation and/or DNA-targeting, as compared to a control not exposed to the CRISPR-Cas protein and deaminase protein, or exposed to a CRISPR-Cas and/or deaminase protein lacking the one or more NLSs.

[0256] The CRISPR-Cas and/or nucleotide deaminase proteins may be provided with 1 or more, such as with 2, 3, 4, 5, 6, 7, 8, 9, 10, or more heterologous NLSs. In some embodiments, the proteins comprise about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the amino-terminus, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the carboxy-terminus, or a combination of these (e.g., zero or at least one or more NLS at the amino-terminus and zero or at least one or more NLS at the carboxy-terminus). When more than one NLS is present, each may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies. In some embodiments, an NLS is considered near the N- or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, or more amino acids along the polypeptide chain from the N- or C-terminus. In preferred embodiments of the CRISPR-Cas proteins, an NLS is attached to the C-terminus of the protein.

[0257] In certain embodiments, the CRISPR-Cas protein and the deaminase protein are delivered to the cell or expressed within the cell as separate proteins. In these embodiments, each of the CRISPR-Cas and deaminase protein can be provided with one or more NLSs as described herein. In certain embodiments, the CRISPR-Cas and deaminase proteins are delivered to the cell or expressed within the cell as a fusion protein. In these embodiments one or both of the CRISPR-Cas and deaminase protein is provided with one or more NLSs. Where the nucleotide deaminase is fused to an adaptor protein (such as MS2) as described above, the one or more NLS can be provided on the adaptor protein, provided that this does not interfere with aptamer binding. In particular embodiments, the one or more NLS sequences may also function as linker sequences between the nucleotide deaminase and the CRISPR-Cas protein.

[0258] In certain embodiments, guides of the disclosure comprise specific binding sites (e.g., aptamers) for adapter proteins, which may be linked to or fused to a nucleotide deaminase or catalytic domain thereof. When such a guide forms a CRISPR complex (e.g., CRISPR-Cas protein binding to guide and target), the adapter proteins bind, and the nucleotide deaminase or catalytic domain thereof associated with the adapter protein is positioned in a spatial orientation which is advantageous for the attributed function to be effective.

[0259] The skilled person will understand that modifications to the guide which allow for binding of the adapter + nucleotide deaminase, but not proper positioning of the adapter + nucleotide deaminase (e.g., due to steric hindrance within the three dimensional structure of the CRISPR complex) are modifications which are not intended. The one or more modified guide may be modified at the tetra loop, the stem loop 1, stem loop 2, or stem loop 3, as described herein, preferably at either the tetra loop or stem loop 2, and in some cases at both the tetra loop and stem loop 2.

[0260] In some embodiments, a component (e.g., the dead Cas protein, the nucleotide deaminase protein or catalytic domain thereof, or a combination thereof) in the systems may comprise one or more nuclear export signals (NES), one or more nuclear localization signals (NLS), or any combinations thereof. In some cases, the NES may be an HIV Rev NES. In certain cases, the NES may be a MAPK NES. When the component is a protein, the NES or NLS may be at the C-terminus of the component. Alternatively or additionally, the NES or NLS may be at the N-terminus of the component. In some examples, the Cas protein and optionally said nucleotide deaminase protein or catalytic domain thereof comprise one or more heterologous nuclear export signal(s) (NES(s)) or nuclear localization signal(s) (NLS(s)), preferably an HIV Rev NES or MAPK NES, preferably C-terminal.

Templates

[0261] In some embodiments, the composition for engineering cells comprises a template, e.g., a recombination template. A template may be a component of another vector as described herein, contained in a separate vector, or provided as a separate polynucleotide. In some embodiments, a recombination template is designed to serve as a template in homologous recombination, such as within or near a target sequence nicked or cleaved by a nucleic acid-targeting effector protein as a part of a nucleic acid-targeting complex.

[0262] In an embodiment, the template nucleic acid alters the sequence of the target position. In an embodiment, the template nucleic acid results in the incorporation of a modified or non-naturally occurring base into the target nucleic acid.

[0263] The template sequence may undergo a breakage mediated or catalyzed recombination with the target sequence. In an embodiment, the template nucleic acid may include sequence that corresponds to a site on the target sequence that is cleaved by a Cas protein mediated cleavage event. In an embodiment, the template nucleic acid may include sequence that corresponds to both, a first site on the target sequence that is cleaved in a first Cas protein mediated event, and a second site on the target sequence that is cleaved in a second Cas protein mediated event.

[0264] In certain embodiments, the template nucleic acid can include sequence which results in an alteration in the coding sequence of a translated sequence, e.g., one which results in the substitution of one amino acid for another in a protein product, e.g., transforming a mutant allele into a wild type allele, transforming a wild type allele into a mutant allele, and/or introducing a stop codon, insertion of an amino acid residue, deletion of an amino acid residue, or a nonsense mutation. In certain embodiments, the template nucleic acid can include sequence which results in an alteration in a non-coding sequence, e.g., an alteration in an exon or in a 5' or 3' non-translated or non-transcribed region. Such alterations include an alteration in a control element, e.g., a promoter, enhancer, and an alteration in a cis-acting or trans-acting control element.

[0265] A template nucleic acid having homology with a target position in a target gene may be used to alter the structure of a target sequence. The template sequence may be used to alter an unwanted structure, e.g., an unwanted or mutant nucleotide. The template nucleic acid may include sequence which, when integrated, results in: decreasing the activity of a positive control element; increasing the activity of a positive control element; decreasing the activity of a negative control element; increasing the activity of a negative control element; decreasing the expression of a gene; increasing the expression of a gene; increasing resistance to a disorder or disease; increasing resistance to viral entry; correcting a mutation or altering an unwanted amino acid residue conferring, increasing, abolishing or decreasing a biological property of a gene product, e.g., increasing the enzymatic activity of an enzyme, or increasing the ability of a gene product to interact with another molecule.

[0266] The template nucleic acid may include sequence which results in a change in sequence of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more nucleotides of the target sequence.

[0267] A template polynucleotide may be of any suitable length, such as about or more than about 10, 15, 20, 25, 50, 75, 100, 150, 200, 500, 1000, or more nucleotides in length. In an embodiment, the template nucleic acid may be 20+/-10, 30+/-10, 40+/-10, 50+/-10, 60+/-10, 70+/-10, 80+/-10, 90+/-10, 100+/-10, 110+/-10, 120+/-10, 130+/-10, 140+/-10, 150+/-10, 160+/-10, 170+/-10, 180+/-10, 190+/-10, 200+/-10, 210+/-10, or 220+/-10 nucleotides in length. In an embodiment, the template nucleic acid may be 30+/-20, 40+/-20, 50+/-20, 60+/-20, 70+/-20, 80+/-20, 90+/-20, 100+/-20, 110+/-20, 120+/-20, 130+/-20, 140+/-20, 150+/-20, 160+/-20, 170+/-20, 180+/-20, 190+/-20, 200+/-20, 210+/-20, or 220+/-20 nucleotides in length. In an embodiment, the template nucleic acid is 10 to 1,000, 20 to 900, 30 to 800, 40 to 700, 50 to 600, 50 to 500, 50 to 400, 50 to 300, 50 to 200, or 50 to 100 nucleotides in length.

[0268] In some embodiments, the template polynucleotide is complementary to a portion of a polynucleotide comprising the target sequence. When optimally aligned, a template polynucleotide might overlap with one or more nucleotides of a target sequence (e.g., about or more than about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or more nucleotides). In some embodiments, when a template sequence and a polynucleotide comprising a target sequence are optimally aligned, the nearest nucleotide of the template polynucleotide is within about 1, 5, 10, 15, 20, 25, 50, 75, 100, 200, 300, 400, 500, 1000, 5000, 10000, or more nucleotides from the target sequence.

[0269] The exogenous polynucleotide template comprises a sequence to be integrated (e.g., a mutated gene). The sequence for integration may be a sequence endogenous or exogenous to the cell. Examples of a sequence to be integrated include polynucleotides encoding a protein or a non-coding RNA (e.g., a microRNA). Thus, the sequence for integration may be operably linked to an appropriate control sequence or sequences. Alternatively, the sequence to be integrated may provide a regulatory function.

[0270] An upstream or downstream sequence may comprise from about 20 bp to about 2500 bp, for example, about 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, or 2500 bp. In some methods, the exemplary upstream or downstream sequences have about 200 bp to about 2000 bp, about 600 bp to about 1000 bp, or more particularly about 700 bp to about 1000 bp.

[0271] An upstream or downstream sequence may comprise from about 20 bp to about 2500 bp, for example, about 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, or 2500 bp. In some methods, the exemplary upstream or downstream sequences have about 200 bp to about 2000 bp, about 600 bp to about 1000 bp, or more particularly about 700 bp to about 1000 bp.

[0272] In certain embodiments, one or both homology arms may be shortened to avoid including certain sequence repeat elements. For example, a 5' homology arm may be shortened to avoid a sequence repeat element. In other embodiments, a 3' homology arm may be shortened to avoid a sequence repeat element. In some embodiments, both the 5' and the 3' homology arms may be shortened to avoid including certain sequence repeat elements.
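The upstream and downstream sequences (homology arms) discussed in the preceding paragraphs can be illustrated with a short sketch that assembles a donor template from a reference sequence. This is a hypothetical illustration, not a design method from the specification: the function name, its parameters and the 800 bp default are assumptions, the 20 bp to 2500 bp bounds are taken from the stated range, and allowing each arm to have its own length is simply one way to accommodate shortening an arm to avoid a repeat element.

```python
MIN_ARM_BP = 20    # lower bound of the stated arm length range
MAX_ARM_BP = 2500  # upper bound of the stated arm length range


def build_donor(reference, cut_site, insert, arm_5=800, arm_3=800):
    """Assemble 5' arm + insert + 3' arm around a cut site (0-based index).

    The 5' arm is the reference sequence immediately upstream of the cut
    site and the 3' arm is the sequence immediately downstream. Either arm
    may be given a shorter length, e.g. to stop short of a repeat element.
    """
    if not 0 <= cut_site <= len(reference):
        raise ValueError("cut site lies outside the reference sequence")
    for label, arm in (("5'", arm_5), ("3'", arm_3)):
        if not MIN_ARM_BP <= arm <= MAX_ARM_BP:
            raise ValueError(f"{label} arm must be {MIN_ARM_BP}-{MAX_ARM_BP} bp")
    if cut_site - arm_5 < 0 or cut_site + arm_3 > len(reference):
        raise ValueError("reference is too short for the requested arms")
    left_arm = reference[cut_site - arm_5:cut_site]
    right_arm = reference[cut_site:cut_site + arm_3]
    return left_arm + insert + right_arm
```

In practice the arm lengths and any repeat masking would be chosen per locus; the sketch only shows how the pieces fit together.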

[0273] In some methods, the exogenous polynucleotide template may further comprise a marker. Such a marker may make it easy to screen for targeted integrations. Examples of suitable markers include restriction sites, fluorescent proteins, or selectable markers. The exogenous polynucleotide template of the disclosure can be constructed using recombinant techniques (see, for example, Sambrook et al., 2001 and Ausubel et al., 1996).

[0274] In certain embodiments, a template nucleic acid for correcting a mutation may be designed for use as a single-stranded oligonucleotide. When using a single-stranded oligonucleotide, 5' and 3' homology arms may range up to about 200 base pairs (bp) in length, e.g., at least 25, 50, 75, 100, 125, 150, 175, or 200 bp in length.

[0275] In certain embodiments, a template nucleic acid for correcting a mutation may be designed for use with a homology-independent targeted integration system. Suzuki et al. describe in vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration (2016, Nature 540:144-149). Schmid-Burgk, et al. describe use of the CRISPR-Cas9 system to introduce a double-strand break (DSB) at a user-defined genomic location and insertion of a universal donor DNA (Nat Commun. 2016 Jul 28;7:12338). Gao, et al. describe “Plug-and-Play Protein Modification Using Homology-Independent Universal Genome Engineering” (Neuron. 2019 Aug 21;103(4):583-597).

RNAi

[0276] In some embodiments, the genetic modulating agents may be interfering RNAs. In certain embodiments, a disease caused by a dominant mutation in a gene is targeted by silencing the mutated gene using RNAi. In some cases, the nucleotide sequence may comprise coding sequence for one or more interfering RNAs. In certain examples, the nucleotide sequence may be interfering RNA (RNAi). As used herein, the term “RNAi” refers to any type of interfering RNA, including but not limited to siRNA, shRNA, endogenous microRNA and artificial microRNA. For instance, it includes sequences previously identified as siRNA, regardless of the mechanism of down-stream processing of the RNA (i.e., although siRNAs are believed to have a specific method of in vivo processing resulting in the cleavage of mRNA, such sequences can be incorporated into the vectors in the context of the flanking sequences described herein). The term “RNAi” can include both gene silencing RNAi molecules and RNAi effector molecules which activate the expression of a gene.

[0277] In certain embodiments, a modulating agent may comprise silencing one or more endogenous genes. As used herein, “gene silencing” or “gene silenced” in reference to an activity of an RNAi molecule, for example a siRNA or miRNA, refers to a decrease in the mRNA level in a cell for a target gene by at least about 5%, about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, about 99%, or about 100% of the mRNA level found in the cell without the presence of the miRNA or RNA interference molecule. In one preferred embodiment, the mRNA levels are decreased by at least about 70%, about 80%, about 90%, about 95%, about 99%, or about 100%.

[0278] As used herein, a “siRNA” refers to a nucleic acid that forms a double stranded RNA, which double stranded RNA has the ability to reduce or inhibit expression of a gene or target gene when the siRNA is present or expressed in the same cell as the target gene. The double stranded RNA siRNA can be formed by the complementary strands. In one embodiment, a siRNA refers to a nucleic acid that can form a double stranded siRNA. The sequence of the siRNA can correspond to the full-length target gene, or a subsequence thereof. Typically, the siRNA is at least about 15-50 nucleotides in length (e.g., each complementary sequence of the double stranded siRNA is about 15-50 nucleotides in length, and the double stranded siRNA is about 15-50 base pairs in length, preferably about 19-30 base nucleotides, preferably about 20-25 nucleotides in length, e.g., 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length).
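The definition of gene silencing in paragraph [0277] amounts to a percent decrease in target mRNA relative to a cell without the RNAi molecule. A minimal sketch of that arithmetic follows; the function names are hypothetical, the inputs are assumed to be mRNA levels in the same arbitrary units (for example, normalized expression values), and the 70% default threshold is taken from the lowest value in the preferred embodiment.

```python
def percent_silencing(control_mrna, treated_mrna):
    """Percent decrease in mRNA level relative to the untreated control."""
    if control_mrna <= 0:
        raise ValueError("control mRNA level must be positive")
    return 100.0 * (control_mrna - treated_mrna) / control_mrna


def is_silenced(control_mrna, treated_mrna, threshold=70.0):
    """True if the decrease meets the chosen threshold (default: about 70%)."""
    return percent_silencing(control_mrna, treated_mrna) >= threshold
```

With a control level of 200 units and a treated level of 50 units, the decrease is 75%, which meets the 70% threshold; a treated level of 100 units (a 50% decrease) does not.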

[0279] As used herein, “shRNA” or “small hairpin RNA” (also called stem loop) is a type of siRNA. In one embodiment, these shRNAs are composed of a short, e.g., about 19 to about 25 nucleotide, antisense strand, followed by a nucleotide loop of about 5 to about 9 nucleotides, and the analogous sense strand. Alternatively, the sense strand can precede the nucleotide loop structure and the antisense strand can follow.
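The shRNA layout described above (antisense strand, loop, sense strand, or the reverse order) can be sketched as simple string assembly. This is an illustration under stated assumptions: the function names are hypothetical, the sense strand is taken to be the exact reverse complement of the antisense strand, sequences are written 5' to 3' as RNA, and any loop sequence supplied by a caller is a placeholder rather than a sequence given in the specification.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.upper().translate(_COMPLEMENT)[::-1]


def build_shrna(antisense, loop, antisense_first=True):
    """Assemble an shRNA as antisense-loop-sense (or sense-loop-antisense).

    Length checks follow the ranges in [0279]: an antisense strand of about
    19 to 25 nucleotides and a loop of about 5 to 9 nucleotides.
    """
    antisense, loop = antisense.upper(), loop.upper()
    if set(antisense + loop) - set("ACGU"):
        raise ValueError("sequences must contain only A, C, G and U")
    if not 19 <= len(antisense) <= 25:
        raise ValueError("antisense strand should be 19 to 25 nucleotides")
    if not 5 <= len(loop) <= 9:
        raise ValueError("loop should be 5 to 9 nucleotides")
    sense = reverse_complement(antisense)
    if antisense_first:
        return antisense + loop + sense
    return sense + loop + antisense
```

Because the two stem strands are reverse complements, the resulting molecule can fold back on itself at the loop, which is the hairpin the paragraph describes.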

[0280] The terms “microRNA” or “miRNA” are used interchangeably herein and refer to endogenous RNAs, some of which are known to regulate the expression of protein-coding genes at the posttranscriptional level. Endogenous microRNAs are small RNAs naturally present in the genome that are capable of modulating the productive utilization of mRNA. The term artificial microRNA includes any type of RNA sequence, other than endogenous microRNA, which is capable of modulating the productive utilization of mRNA. MicroRNA sequences have been described in publications such as Lim, et al., Genes & Development, 17, p. 991-1008 (2003), Lim et al., Science 299, 1540 (2003), Lee and Ambros, Science, 294, 862 (2001), Lau et al., Science 294, 858-861 (2001), Lagos-Quintana et al., Current Biology, 12, 735-739 (2002), Lagos-Quintana et al., Science 294, 853-857 (2001), and Lagos-Quintana et al., RNA, 9, 175-179 (2003), which are incorporated by reference. Multiple microRNAs can also be incorporated into a precursor molecule. Furthermore, miRNA-like stem-loops can be expressed in cells as a vehicle to deliver artificial miRNAs and short interfering RNAs (siRNAs) for the purpose of modulating the expression of endogenous genes through the miRNA and/or RNAi pathways.

[0281] As used herein, “double stranded RNA” or “dsRNA” refers to RNA molecules that are comprised of two strands. Double-stranded molecules include those comprised of a single RNA molecule that doubles back on itself to form a two-stranded structure. For example, the stem loop structure of the progenitor molecules from which the single-stranded miRNA is derived, called the pre-miRNA (Bartel et al. 2004. Cell 116:281-297), comprises a dsRNA molecule.

Antibodies

[0282] In certain embodiments, the one or more agents is an antibody. The term "antibody" is used interchangeably with the term "immunoglobulin" herein, and includes intact antibodies, fragments of antibodies, e.g., Fab, F(ab')2 fragments, and intact antibodies and fragments that have been mutated either in their constant and/or variable region (e.g., mutations to produce chimeric, partially humanized, or fully humanized antibodies, as well as to produce antibodies with a desired trait, e.g., enhanced binding and/or reduced FcR binding). The term "fragment" refers to a part or portion of an antibody or antibody chain comprising fewer amino acid residues than an intact or complete antibody or antibody chain. Fragments can be obtained via chemical or enzymatic treatment of an intact or complete antibody or antibody chain. Fragments can also be obtained by recombinant means. Exemplary fragments include Fab, Fab', F(ab')2, Fabc, Fd, dAb, VHH and scFv and/or Fv fragments.

[0283] As used herein, a preparation of antibody protein having less than about 50% of non-antibody protein (also referred to herein as a "contaminating protein"), or of chemical precursors, is considered to be "substantially free." Preferably, a preparation having less than about 40%, 30%, 20%, 10%, and more preferably 5% (by dry weight) of non-antibody protein, or of chemical precursors, is considered to be substantially free. When the antibody protein or biologically active portion thereof is recombinantly produced, it is also preferably substantially free of culture medium, i.e., culture medium represents less than about 30%, preferably less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the volume or mass of the protein preparation.

[0284] The term "antigen-binding fragment" refers to a polypeptide fragment of an immunoglobulin or antibody that binds antigen or competes with intact antibody (i.e., with the intact antibody from which they were derived) for antigen binding (i.e., specific binding). As such these antibodies or fragments thereof are included in the scope of the invention, provided that the antibody or fragment binds specifically to a target molecule.

[0285] It is intended that the term "antibody" encompass any Ig class or any Ig subclass (e.g., the IgG1, IgG2, IgG3, and IgG4 subclasses of IgG) obtained from any source (e.g., humans and non-human primates, and in rodents, lagomorphs, caprines, bovines, equines, ovines, etc.).

[0286] The term "Ig class" or "immunoglobulin class", as used herein, refers to the five classes of immunoglobulin that have been identified in humans and higher mammals, IgG, IgM, IgA, IgD, and IgE. The term "Ig subclass" refers to the two subclasses of IgM (H and L), three subclasses of IgA (IgA1, IgA2, and secretory IgA), and four subclasses of IgG (IgG1, IgG2, IgG3, and IgG4) that have been identified in humans and higher mammals. The antibodies can exist in monomeric or polymeric form; for example, IgM antibodies exist in pentameric form, and IgA antibodies exist in monomeric, dimeric or multimeric form.

[0287] The term "IgG subclass" refers to the four subclasses of immunoglobulin class IgG - IgG1, IgG2, IgG3, and IgG4 - that have been identified in humans and higher mammals by the heavy chains of the immunoglobulins, γ1 - γ4, respectively. The term "single-chain immunoglobulin" or "single-chain antibody" (used interchangeably herein) refers to a protein having a two-polypeptide chain structure consisting of a heavy and a light chain, said chains being stabilized, for example, by interchain peptide linkers, which has the ability to specifically bind antigen. The term "domain" refers to a globular region of a heavy or light chain polypeptide comprising peptide loops (e.g., comprising 3 to 4 peptide loops) stabilized, for example, by β-pleated sheet and/or intrachain disulfide bond. Domains are further referred to herein as "constant" or "variable", based on the relative lack of sequence variation within the domains of various class members in the case of a "constant" domain, or the significant variation within the domains of various class members in the case of a "variable" domain. Antibody or polypeptide "domains" are often referred to interchangeably in the art as antibody or polypeptide "regions". The "constant" domains of an antibody light chain are referred to interchangeably as "light chain constant regions", "light chain constant domains", "CL" regions or "CL" domains. The "constant" domains of an antibody heavy chain are referred to interchangeably as "heavy chain constant regions", "heavy chain constant domains", "CH" regions or "CH" domains. The "variable" domains of an antibody light chain are referred to interchangeably as "light chain variable regions", "light chain variable domains", "VL" regions or "VL" domains. The "variable" domains of an antibody heavy chain are referred to interchangeably as "heavy chain variable regions", "heavy chain variable domains", "VH" regions or "VH" domains.

[0288] The term "region" can also refer to a part or portion of an antibody chain or antibody chain domain (e.g., a part or portion of a heavy or light chain or a part or portion of a constant or variable domain, as defined herein), as well as more discrete parts or portions of said chains or domains. For example, light and heavy chains or light and heavy chain variable domains include "complementarity determining regions" or "CDRs" interspersed among "framework regions" or "FRs", as defined herein.

[0289] The term "conformation" refers to the tertiary structure of a protein or polypeptide (e.g., an antibody, antibody chain, domain or region thereof). For example, the phrase "light (or heavy) chain conformation" refers to the tertiary structure of a light (or heavy) chain variable region, and the phrase "antibody conformation" or "antibody fragment conformation" refers to the tertiary structure of an antibody or fragment thereof.

[0290] The term “antibody-like protein scaffolds” or “engineered protein scaffolds” broadly encompasses proteinaceous non-immunoglobulin specific-binding agents, typically obtained by combinatorial engineering (such as site-directed random mutagenesis in combination with phage display or other molecular selection techniques). Usually, such scaffolds are derived from robust and small soluble monomeric proteins (such as Kunitz inhibitors or lipocalins) or from a stably folded extra-membrane domain of a cell surface receptor (such as protein A, fibronectin or the ankyrin repeat).

[0291] Such scaffolds have been extensively reviewed in Binz et al. (Engineering novel binding proteins from nonimmunoglobulin domains. Nat Biotechnol 2005, 23:1257-1268), Gebauer and Skerra (Engineered protein scaffolds as next-generation antibody therapeutics. Curr Opin Chem Biol. 2009, 13:245-55), Gill and Damle (Biopharmaceutical drug discovery using novel protein scaffolds. Curr Opin Biotechnol 2006, 17:653-658), Skerra (Engineered protein scaffolds for molecular recognition. J Mol Recognit 2000, 13:167-187), and Skerra (Alternative non-antibody scaffolds for molecular recognition. Curr Opin Biotechnol 2007, 18:295-304), and include without limitation affibodies, based on the Z-domain of staphylococcal protein A, a three-helix bundle of 58 residues providing an interface on two of its alpha-helices (Nygren, Alternative binding proteins: Affibody binding proteins developed from a small three-helix bundle scaffold. FEBS J 2008, 275:2668-2676); engineered Kunitz domains based on a small (ca. 58 residues) and robust, disulphide-crosslinked serine protease inhibitor, typically of human origin (e.g. LACI-D1), which can be engineered for different protease specificities (Nixon and Wood, Engineered protein inhibitors of proteases. Curr Opin Drug Discov Dev 2006, 9:261-268); monobodies or adnectins based on the 10th extracellular domain of human fibronectin III (10Fn3), which adopts an Ig-like beta-sandwich fold (94 residues) with 2-3 exposed loops, but lacks the central disulphide bridge (Koide and Koide, Monobodies: antibody mimics based on the scaffold of the fibronectin type III domain. Methods Mol Biol 2007, 352:95-109); anticalins derived from the lipocalins, a diverse family of eight-stranded beta-barrel proteins (ca. 180 residues) that naturally form binding sites for small ligands by means of four structurally variable loops at the open end, which are abundant in humans, insects, and many other organisms (Skerra, Alternative binding proteins: Anticalins - harnessing the structural plasticity of the lipocalin ligand pocket to engineer novel binding activities. FEBS J 2008, 275:2677-2683); DARPins, designed ankyrin repeat domains (166 residues), which provide a rigid interface arising from typically three repeated beta-turns (Stumpp et al., DARPins: a new generation of protein therapeutics. Drug Discov Today 2008, 13:695-701); avimers (multimerized LDLR-A module) (Silverman et al., Multivalent avimer proteins evolved by exon shuffling of a family of human receptor domains. Nat Biotechnol 2005, 23:1556-1561); and cysteine-rich knottin peptides (Kolmar, Alternative binding proteins: biological activity and therapeutic potential of cystine-knot miniproteins. FEBS J 2008, 275:2684-2690).

[0292] "Specific binding" of an antibody means that the antibody exhibits appreciable affinity for a particular antigen or epitope and, generally, does not exhibit significant cross reactivity. "Appreciable" binding includes binding with an affinity of at least 25 μM. Antibodies with affinities greater than 1 x 10⁷ M⁻¹ (or a dissociation coefficient of 1 μM or less or a dissociation coefficient of 1 nM or less) typically bind with correspondingly greater specificity. Values intermediate of those set forth herein are also intended to be within the scope of the present invention and antibodies of the invention bind with a range of affinities, for example, 100 nM or less, 75 nM or less, 50 nM or less, 25 nM or less, for example 10 nM or less, 5 nM or less, 1 nM or less, or in embodiments 500 pM or less, 100 pM or less, 50 pM or less or 25 pM or less. An antibody that "does not exhibit significant crossreactivity" is one that will not appreciably bind to an entity other than its target (e.g., a different epitope or a different molecule). For example, an antibody that specifically binds to a target molecule will appreciably bind the target molecule but will not significantly react with non-target molecules or peptides. An antibody specific for a particular epitope will, for example, not significantly crossreact with remote epitopes on the same protein or peptide. Specific binding can be determined according to any art-recognized means for determining such binding. Preferably, specific binding is determined according to Scatchard analysis and/or competitive binding assays.

[0293] As used herein, the term "affinity" refers to the strength of the binding of a single antigen-combining site with an antigenic determinant. Affinity depends on the closeness of stereochemical fit between antibody combining sites and antigen determinants, on the size of the area of contact between them, on the distribution of charged and hydrophobic groups, etc. Antibody affinity can be measured by equilibrium dialysis or by the kinetic BIACORE™ method. The dissociation constant, Kd, and the association constant, Ka, are quantitative measures of affinity.
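As an illustration of how the two affinity measures relate, the following minimal sketch assumes a simple 1:1 binding equilibrium, for which Kd = 1/Ka. The function names kd_from_ka and format_molar are hypothetical and are introduced here only for the example. Under that assumption, an association constant of 1 x 10⁷ M⁻¹ corresponds to a dissociation constant of 100 nM.

```python
# Minimal sketch, assuming simple 1:1 binding so that Kd = 1 / Ka.
# Function names are illustrative only and are not part of the disclosure.

def kd_from_ka(ka_per_molar: float) -> float:
    """Return the dissociation constant Kd (in M) for an association constant Ka (in M^-1)."""
    if ka_per_molar <= 0:
        raise ValueError("Ka must be positive")
    return 1.0 / ka_per_molar


def format_molar(concentration_molar: float) -> str:
    """Express a molar concentration in the largest unit that gives a value >= 1."""
    units = (("M", 1.0), ("mM", 1e-3), ("uM", 1e-6), ("nM", 1e-9), ("pM", 1e-12))
    for unit, scale in units:
        if concentration_molar >= scale:
            return f"{concentration_molar / scale:g} {unit}"
    # Anything below 1 pM is still reported in pM.
    return f"{concentration_molar / 1e-12:g} pM"


if __name__ == "__main__":
    for ka in (1e6, 1e7, 1e9):
        print(f"Ka = {ka:g} M^-1  ->  Kd = {format_molar(kd_from_ka(ka))}")
```

A lower Kd (equivalently, a higher Ka) indicates tighter binding, which is why the affinity thresholds in this section are stated as "or less" for dissociation values.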

[0294] As used herein, the term "monoclonal antibody" refers to an antibody derived from a clonal population of antibody-producing cells (e.g., B lymphocytes or B cells) which is homogeneous in structure and antigen specificity. The term "polyclonal antibody" refers to a plurality of antibodies originating from different clonal populations of antibody-producing cells which are heterogeneous in their structure and epitope specificity but which recognize a common antigen. Monoclonal and polyclonal antibodies may exist within bodily fluids, as crude preparations, or may be purified, as described herein.

[0295] The term "binding portion" of an antibody (or "antibody portion") includes one or more complete domains, e.g., a pair of complete domains, as well as fragments of an antibody that retain the ability to specifically bind to a target molecule. It has been shown that the binding function of an antibody can be performed by fragments of a full-length antibody. Binding fragments are produced by recombinant DNA techniques, or by enzymatic or chemical cleavage of intact immunoglobulins. Binding fragments include Fab, Fab', F(ab')2, Fabc, Fd, dAb, Fv, single chains, single-chain antibodies, e.g., scFv, and single domain antibodies.

[0296] "Humanized" forms of non-human (e.g., murine) antibodies are chimeric antibodies that contain minimal sequence derived from non-human immunoglobulin. For the most part, humanized antibodies are human immunoglobulins (recipient antibody) in which residues from a hypervariable region of the recipient are replaced by residues from a hypervariable region of a non-human species (donor antibody) such as mouse, rat, rabbit or nonhuman primate having the desired specificity, affinity, and capacity. In some instances, FR residues of the human immunoglobulin are replaced by corresponding non-human residues. Furthermore, humanized antibodies may comprise residues that are not found in the recipient antibody or in the donor antibody. These modifications are made to further refine antibody performance. In general, the humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the hypervariable regions correspond to those of a non-human immunoglobulin and all or substantially all of the FR regions are those of a human immunoglobulin sequence. The humanized antibody optionally also will comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin.

[0297] Examples of portions of antibodies or epitope-binding proteins encompassed by the present definition include: (i) the Fab fragment, having VL, CL, VH and CH1 domains; (ii) the Fab' fragment, which is a Fab fragment having one or more cysteine residues at the C-terminus of the CH1 domain; (iii) the Fd fragment having VH and CH1 domains; (iv) the Fd' fragment having VH and CH1 domains and one or more cysteine residues at the C-terminus of the CH1 domain; (v) the Fv fragment having the VL and VH domains of a single arm of an antibody; (vi) the dAb fragment (Ward et al., 341 Nature 544 (1989)) which consists of a VH domain or a VL domain that binds antigen; (vii) isolated CDR regions or isolated CDR regions presented in a functional framework; (viii) F(ab')2 fragments which are bivalent fragments including two Fab' fragments linked by a disulphide bridge at the hinge region; (ix) single chain antibody molecules (e.g., single chain Fv; scFv) (Bird et al., 242 Science 423 (1988); and Huston et al., 85 PNAS 5879 (1988)); (x) "diabodies" with two antigen binding sites, comprising a heavy chain variable domain (VH) connected to a light chain variable domain (VL) in the same polypeptide chain (see, e.g., EP 404,097; WO 93/11161; Hollinger et al., 90 PNAS 6444 (1993)); (xi) "linear antibodies" comprising a pair of tandem Fd segments (VH-CH1-VH-CH1) which, together with complementary light chain polypeptides, form a pair of antigen binding regions (Zapata et al., Protein Eng. 8(10):1057-62 (1995); and U.S. Patent No. 5,641,870).

[0298] As used herein, a "blocking" antibody or an antibody "antagonist" is one which inhibits or reduces biological activity of the antigen(s) it binds. In certain embodiments, the blocking antibodies or antagonist antibodies or portions thereof described herein completely inhibit the biological activity of the antigen(s).

[0299] Antibodies may act as agonists or antagonists of the recognized polypeptides. For example, the present invention includes antibodies which disrupt receptor/ligand interactions either partially or fully. The invention features both receptor-specific antibodies and ligand-specific antibodies. The invention also features receptor-specific antibodies which do not prevent ligand binding but prevent receptor activation. Receptor activation (i.e., signaling) may be determined by techniques described herein or otherwise known in the art. For example, receptor activation can be determined by detecting the phosphorylation (e.g., tyrosine or serine/threonine) of the receptor or of one of its down-stream substrates by immunoprecipitation followed by western blot analysis. In specific embodiments, antibodies are provided that inhibit ligand activity or receptor activity by at least 95%, at least 90%, at least 85%, at least 80%, at least 75%, at least 70%, at least 60%, or at least 50% of the activity in absence of the antibody.

[0300] The invention also features receptor-specific antibodies which both prevent ligand binding and receptor activation as well as antibodies that recognize the receptor-ligand complex. Likewise, encompassed by the invention are neutralizing antibodies which bind the ligand and prevent binding of the ligand to the receptor, as well as antibodies which bind the ligand, thereby preventing receptor activation, but do not prevent the ligand from binding the receptor. Further included in the invention are antibodies which activate the receptor. These antibodies may act as receptor agonists, i.e., potentiate or activate either all or a subset of the biological activities of the ligand-mediated receptor activation, for example, by inducing dimerization of the receptor. The antibodies may be specified as agonists, antagonists or inverse agonists for biological activities comprising the specific biological activities of the peptides disclosed herein. The antibody agonists and antagonists can be made using methods known in the art. See, e.g., International Patent Publication No. WO 96/40281; U.S. Pat. No. 5,811,097; Deng et al., Blood 92(6):1981-1988 (1998); Chen et al., Cancer Res. 58(16):3668-3678 (1998); Harrop et al., J. Immunol. 161(4):1786-1794 (1998); Zhu et al., Cancer Res. 58(15):3209-3214 (1998); Yoon et al., J. Immunol. 160(7):3170-3179 (1998); Prat et al., J. Cell. Sci. 111(Pt 2):237-247 (1998); Pitard et al., J. Immunol. Methods 205(2):177-190 (1997); Liautard et al., Cytokine 9(4):233-241 (1997); Carlson et al., J. Biol. Chem. 272(17):11295-11301 (1997); Taryman et al., Neuron 14(4):755-762 (1995); Muller et al., Structure 6(9):1153-1167 (1998); Bartunek et al., Cytokine 8(1):14-20 (1996).

[0301] The antibodies as defined for the present invention include derivatives that are modified, i.e., by the covalent attachment of any type of molecule to the antibody such that covalent attachment does not prevent the antibody from generating an anti-idiotypic response. For example, but not by way of limitation, the antibody derivatives include antibodies that have been modified, e.g., by glycosylation, acetylation, pegylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, linkage to a cellular ligand or other protein, etc. Any of numerous chemical modifications may be carried out by known techniques, including, but not limited to specific chemical cleavage, acetylation, formylation, metabolic synthesis of tunicamycin, etc. Additionally, the derivative may contain one or more non-classical amino acids.

[0302] Simple binding assays can be used to screen for or detect agents that bind to a target protein, or disrupt the interaction between proteins (e.g., a receptor and a ligand). Because certain targets of the present invention are transmembrane proteins, assays that use the soluble forms of these proteins rather than full-length protein can be used, in some embodiments. Soluble forms include, for example, those lacking the transmembrane domain and/or those comprising the IgV domain or fragments thereof which retain their ability to bind their cognate binding partners. Further, agents that inhibit or enhance protein interactions for use in the compositions and methods described herein, can include recombinant peptido-mimetics.

[0303] Detection methods useful in screening assays include antibody-based methods, detection of a reporter moiety, detection of cytokines as described herein, and detection of a gene signature as described herein.

[0304] Another variation of assays to determine binding of a receptor protein to a ligand protein is through the use of affinity biosensor methods. Such methods may be based on the piezoelectric effect, electrochemistry, or optical methods, such as ellipsometry, optical wave guidance, and surface plasmon resonance (SPR).

Aptamers

[0305] In certain embodiments, the one or more agents is an aptamer. Nucleic acid aptamers are nucleic acid species that have been engineered through repeated rounds of in vitro selection or equivalently, SELEX (systematic evolution of ligands by exponential enrichment) to bind to various molecular targets such as small molecules, proteins, nucleic acids, cells, tissues and organisms. Nucleic acid aptamers have specific binding affinity to molecules through interactions other than classic Watson-Crick base pairing. Aptamers are useful in biotechnological and therapeutic applications as they offer molecular recognition properties similar to antibodies. In addition to their discriminate recognition, aptamers offer advantages over antibodies as they can be engineered completely in a test tube, are readily produced by chemical synthesis, possess desirable storage properties, and elicit little or no immunogenicity in therapeutic applications. In certain embodiments, RNA aptamers may be expressed from a DNA construct. In other embodiments, a nucleic acid aptamer may be linked to another polynucleotide sequence. The polynucleotide sequence may be a double stranded DNA polynucleotide sequence. The aptamer may be covalently linked to one strand of the polynucleotide sequence. The aptamer may be ligated to the polynucleotide sequence. The polynucleotide sequence may be configured, such that the polynucleotide sequence may be linked to a solid support or ligated to another polynucleotide sequence.

[0306] Aptamers, like peptides generated by phage display or monoclonal antibodies ("mAbs"), are capable of specifically binding to selected targets and modulating the target's activity, e.g., through binding, aptamers may block their target's ability to function. A typical aptamer is 10-15 kDa in size (30-45 nucleotides), binds its target with sub-nanomolar affinity, and discriminates against closely related targets (e.g., aptamers will typically not bind other proteins from the same gene family). Structural studies have shown that aptamers are capable of using the same types of binding interactions (e.g., hydrogen bonding, electrostatic complementarity, hydrophobic contacts, steric exclusion) that drive affinity and specificity in antibody-antigen complexes.
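The correspondence between the stated nucleotide count and the stated mass can be checked with a rough calculation. The sketch below assumes an average mass of about 330 Da per nucleotide, a common rule of thumb that is not taken from this disclosure; actual masses depend on sequence, sugar chemistry and any modifications. The function name is hypothetical.

```python
# Rough estimate only: assumes ~330 Da per nucleotide (a rule-of-thumb value,
# not a figure from the disclosure). Real aptamer masses vary with sequence
# and chemical modifications.

AVERAGE_NUCLEOTIDE_MASS_DA = 330.0  # assumed average


def approximate_aptamer_mass_kda(nucleotides: int) -> float:
    """Approximate mass in kDa of an unmodified oligonucleotide of the given length."""
    if nucleotides <= 0:
        raise ValueError("length must be positive")
    return nucleotides * AVERAGE_NUCLEOTIDE_MASS_DA / 1000.0


if __name__ == "__main__":
    for length in (30, 45):
        print(f"{length} nt ~ {approximate_aptamer_mass_kda(length):.1f} kDa")
```

Under this assumption, 30 and 45 nucleotides give roughly 9.9 kDa and 14.9 kDa, consistent with the 10-15 kDa range given above.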

[0307] Aptamers have a number of desirable characteristics for use in research and as therapeutics and diagnostics including high specificity and affinity, biological efficacy, and excellent pharmacokinetic properties. In addition, they offer specific competitive advantages over antibodies and other protein biologics. Aptamers are chemically synthesized and are readily scaled as needed to meet production demand for research, diagnostic or therapeutic applications. Aptamers are chemically robust. They are intrinsically adapted to regain activity following exposure to factors such as heat and denaturants and can be stored for extended periods (>1 yr) at room temperature as lyophilized powders. Not being bound by a theory, aptamers bound to a solid support or beads may be stored for extended periods.

[0308] Oligonucleotides in their phosphodiester form may be quickly degraded by intracellular and extracellular enzymes such as endonucleases and exonucleases. Aptamers can include modified nucleotides conferring improved characteristics on the ligand, such as improved in vivo stability or improved delivery characteristics. Examples of such modifications include chemical substitutions at the ribose and/or phosphate and/or base positions. SELEX identified nucleic acid ligands containing modified nucleotides are described, e.g., in U.S. Pat. No. 5,660,985, which describes oligonucleotides containing nucleotide derivatives chemically modified at the 2' position of ribose, 5 position of pyrimidines, and 8 position of purines, U.S. Pat. No. 5,756,703 which describes oligonucleotides containing various 2'-modified pyrimidines, and U.S. Pat. No. 5,580,737 which describes highly specific nucleic acid ligands containing one or more nucleotides modified with 2'-amino (2'-NH2), 2'-fluoro (2'-F), and/or 2'-O-methyl (2'-OMe) substituents. Modifications of aptamers may also include modifications at exocyclic amines, substitution of 4-thiouridine, substitution of 5-bromo or 5-iodo-uracil; backbone modifications, phosphorothioate or allyl phosphate modifications, methylations, and unusual base-pairing combinations such as the isobases isocytidine and isoguanosine. Modifications can also include 3' and 5' modifications such as capping. As used herein, the term phosphorothioate encompasses one or more non-bridging oxygen atoms in a phosphodiester bond replaced by one or more sulfur atoms. In further embodiments, the oligonucleotides comprise modified sugar groups, for example, one or more of the hydroxyl groups is replaced with halogen, aliphatic groups, or functionalized as ethers or amines. In one embodiment, the 2'-position of the furanose residue is substituted by any of an O-methyl, O-alkyl, O-allyl, S-alkyl, S-allyl, or halo group. Methods of synthesis of 2'-modified sugars are described, e.g., in Sproat, et al., Nucl. Acid Res. 19:733-738 (1991); Cotten, et al., Nucl. Acid Res. 19:2629-2635 (1991); and Hobbs, et al., Biochemistry 12:5138-5145 (1973). Other modifications are known to one of ordinary skill in the art. In certain embodiments, aptamers include aptamers with improved off-rates as described in International Patent Publication No. WO 2009012418, “Method for generating aptamers with improved off-rates,” incorporated herein by reference in its entirety. In certain embodiments aptamers are chosen from a library of aptamers. Such libraries include, but are not limited to those described in Rohloff et al., “Nucleic Acid Ligands With Protein-like Side Chains: Modified Aptamers and Their Use as Diagnostic and Therapeutic Agents,” Molecular Therapy Nucleic Acids (2014) 3, e201. Aptamers are also commercially available (see, e.g., SomaLogic, Inc., Boulder, Colorado). In certain embodiments, the present invention may utilize any aptamer containing any modification as described herein.

METHODS OF DETECTION AND DIAGNOSTIC METHODS

[0309] In certain embodiments, target cell types are identified using biomarkers and/or gene signatures. In certain embodiments, biomarkers and/or signatures are identified in a population of cells in response to modulating agents that drive differentiation.

[0310] In certain example embodiments, the population of cells is an ex vivo cell-based system that faithfully recapitulates an in vivo phenotype or target system of interest. Source starting materials may include cultured cell lines or cells or tissues isolated directly from an in vivo source, including explants and biopsies. The source materials may be pluripotent cells including stem cells. The cell (sub)type(s) and cell state(s) of the ex vivo system may likewise be determined using known lineage markers. The analysis of cell (sub)type(s) and cell state(s) may be obtained at the time of running the methods described herein. Based on the identified differences, steps to modulate the source material to induce a shift in cell (sub)type(s) and/or cell state(s) may then be selected and applied. In certain embodiments, gene signatures, pathways and/or biomarkers are modulated to shift differentiation of an ex vivo or in vivo system (see modulating agents herein).

[0311] The invention provides biomarkers (e.g., phenotype specific or cell type) for the identification, diagnosis, prognosis and manipulation of cell properties, for use in a variety of diagnostic and/or therapeutic indications. Biomarkers in the context of the present invention encompass, without limitation, nucleic acids, proteins, reaction products, and metabolites, together with their polymorphisms, mutations, variants, modifications, subunits, fragments, and other analytes or sample-derived measures. In certain embodiments, biomarkers include the signature genes or signature gene products, and/or cells as described herein.

[0312] Biomarkers are useful in methods of diagnosing, prognosing and/or staging an immune response in a subject by detecting a first level of expression, activity and/or function of one or more biomarkers and comparing the detected level to a control level, wherein a difference between the detected level and the control level indicates the presence of an immune response in the subject.

[0313] The terms “diagnosis” and “monitoring” are commonplace and well-understood in medical practice. By means of further explanation and without limitation the term “diagnosis” generally refers to the process or act of recognising, deciding on or concluding on a disease or condition in a subject on the basis of symptoms and signs and/or from results of various diagnostic procedures (such as, for example, from knowing the presence, absence and/or quantity of one or more biomarkers characteristic of the diagnosed disease or condition).

[0314] The terms “prognosing” or “prognosis” generally refer to an anticipation on the progression of a disease or condition and the prospect (e.g., the probability, duration, and/or extent) of recovery. A good prognosis of the diseases or conditions taught herein may generally encompass anticipation of a satisfactory partial or complete recovery from the diseases or conditions, preferably within an acceptable time period. A good prognosis of such may more commonly encompass anticipation of not further worsening or aggravating of such, preferably within a given time period. A poor prognosis of the diseases or conditions as taught herein may generally encompass anticipation of a substandard recovery and/or unsatisfactorily slow recovery, or to substantially no recovery or even further worsening of such.

[0315] The biomarkers of the present invention are useful in methods of identifying patient populations at risk or suffering from an immune response based on a detected level of expression, activity and/or function of one or more biomarkers. These biomarkers are also useful in monitoring subjects undergoing treatments and therapies for suitable or aberrant response(s) to determine efficaciousness of the treatment or therapy and for selecting or modifying therapies and treatments that would be efficacious in treating, delaying the progression of or otherwise ameliorating a symptom. The biomarkers provided herein are useful for selecting a group of patients at a specific state of a disease with accuracy that facilitates selection of treatments.

[0316] The term “monitoring” generally refers to the follow-up of a disease or a condition in a subject for any changes which may occur over time.

[0317] The terms also encompass prediction of a disease. The terms “predicting” or “prediction” generally refer to an advance declaration, indication or foretelling of a disease or condition in a subject not (yet) having said disease or condition. For example, a prediction of a disease or condition in a subject may indicate a probability, chance or risk that the subject will develop said disease or condition, for example within a certain time period or by a certain age. Said probability, chance or risk may be indicated inter alia as an absolute value, range or statistics, or may be indicated relative to a suitable control subject or subject population (such as, e.g., relative to a general, normal or healthy subject or subject population). Hence, the probability, chance or risk that a subject will develop a disease or condition may be advantageously indicated as increased or decreased, or as fold-increased or fold-decreased relative to a suitable control subject or subject population. As used herein, the term “prediction” of the conditions or diseases as taught herein in a subject may also particularly mean that the subject has a 'positive' prediction of such, i.e., that the subject is at risk of having such (e.g., the risk is significantly increased vis-a-vis a control subject or subject population). The term “prediction of no” diseases or conditions as taught herein in a subject may particularly mean that the subject has a 'negative' prediction of such, i.e., that the subject’s risk of having such is not significantly increased vis-a-vis a control subject or subject population.

[0318] Suitably, an altered quantity or phenotype of the immune cells in the subject compared to a control subject having normal immune status or not having a disease comprising an immune component indicates that the subject has an impaired immune status or has a disease comprising an immune component or would benefit from an immune therapy.

[0319] Hence, the methods may rely on comparing the quantity of immune cell populations, biomarkers, or gene or gene product signatures measured in samples from patients with reference values, wherein said reference values represent known predictions, diagnoses and/or prognoses of diseases or conditions as taught herein.

[0320] For example, distinct reference values may represent the prediction of a risk (e.g., an abnormally elevated risk) of having a given disease or condition as taught herein vs. the prediction of no or normal risk of having said disease or condition. In another example, distinct reference values may represent predictions of differing degrees of risk of having such disease or condition.

[0321] In a further example, distinct reference values can represent the diagnosis of a given disease or condition as taught herein vs. the diagnosis of no such disease or condition (such as, e.g., the diagnosis of healthy, or recovered from said disease or condition, etc.). In another example, distinct reference values may represent the diagnosis of such disease or condition of varying severity.

[0322] In yet another example, distinct reference values may represent a good prognosis for a given disease or condition as taught herein vs. a poor prognosis for said disease or condition. In a further example, distinct reference values may represent varyingly favourable or unfavourable prognoses for such disease or condition.

[0323] Such comparison may generally include any means to determine the presence or absence of at least one difference and optionally of the size of such difference between values being compared. A comparison may include a visual inspection, an arithmetical or statistical comparison of measurements. Such statistical comparisons include, but are not limited to, applying a rule.

[0324] Reference values may be established according to known procedures previously employed for other cell populations, biomarkers and gene or gene product signatures. For example, a reference value may be established in an individual or a population of individuals characterised by a particular diagnosis, prediction and/or prognosis of said disease or condition (i.e., for whom said diagnosis, prediction and/or prognosis of the disease or condition holds true). Such population may comprise without limitation 2 or more, 10 or more, 100 or more, or even several hundred or more individuals.

[0325] A“deviation” of a first value from a second value may generally encompass any direction (e.g., increase: first value > second value; or decrease: first value < second value) and any extent of alteration.

[0326] For example, a deviation may encompass a decrease in a first value by, without limitation, at least about 10% (about 0.9-fold or less), or by at least about 20% (about 0.8-fold or less), or by at least about 30% (about 0.7-fold or less), or by at least about 40% (about 0.6-fold or less), or by at least about 50% (about 0.5-fold or less), or by at least about 60% (about 0.4-fold or less), or by at least about 70% (about 0.3-fold or less), or by at least about 80% (about 0.2-fold or less), or by at least about 90% (about 0.1-fold or less), relative to a second value with which a comparison is being made.

[0327] For example, a deviation may encompass an increase of a first value by, without limitation, at least about 10% (about 1.1-fold or more), or by at least about 20% (about 1.2-fold or more), or by at least about 30% (about 1.3-fold or more), or by at least about 40% (about 1.4-fold or more), or by at least about 50% (about 1.5-fold or more), or by at least about 60% (about 1.6-fold or more), or by at least about 70% (about 1.7-fold or more), or by at least about 80% (about 1.8-fold or more), or by at least about 90% (about 1.9-fold or more), or by at least about 100% (about 2-fold or more), or by at least about 150% (about 2.5-fold or more), or by at least about 200% (about 3-fold or more), or by at least about 500% (about 6-fold or more), or by at least about 700% (about 8-fold or more), or like, relative to a second value with which a comparison is being made.
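The percent and fold figures in the two preceding paragraphs follow a single relationship: a change of p percent relative to the second value corresponds to a fold of 1 + p/100, with p negative for a decrease. The following minimal sketch illustrates that arithmetic; the function names are hypothetical and chosen only for this example.

```python
# Minimal sketch of the percent-change / fold-change arithmetic.
# Function names are illustrative only.

def percent_change(first: float, second: float) -> float:
    """Signed percent change of the first value relative to the second value."""
    if second == 0:
        raise ValueError("second (reference) value must be non-zero")
    return (first - second) / second * 100.0


def fold_from_percent_change(percent: float) -> float:
    """Fold of the first value relative to the second, e.g. +50% -> 1.5, -30% -> 0.7."""
    if percent < -100.0:
        raise ValueError("a non-negative quantity cannot decrease by more than 100%")
    return 1.0 + percent / 100.0


if __name__ == "__main__":
    for p in (-90, -50, -10, 10, 100, 150, 200, 500, 700):
        print(f"{p:+d}%  ->  {fold_from_percent_change(p):g}-fold")
```

For instance, an increase of 700% gives 8-fold and a decrease of 90% gives 0.1-fold, matching the values listed above.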

[0328] Preferably, a deviation may refer to a statistically significant observed alteration. For example, a deviation may refer to an observed alteration which falls outside of error margins of reference values in a given population (as expressed, for example, by standard deviation or standard error, or by a predetermined multiple thereof, e.g., ±1xSD or ±2xSD or ±3xSD, or ±1xSE or ±2xSE or ±3xSE). Deviation may also refer to a value falling outside of a reference range defined by values in a given population (for example, outside of a range which comprises ≥40%, ≥50%, ≥60%, ≥70%, ≥75% or ≥80% or ≥85% or ≥90% or ≥95% or even ≥100% of values in said population).
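As a minimal sketch of the error-margin test described above, an observed value may be compared against the mean of the reference values plus or minus a chosen multiple of the standard deviation or standard error. The function name, its parameters and the default multiple of 2 are assumptions made for illustration and are not prescribed by the disclosure.

```python
import math
import statistics


def outside_error_margin(observed, reference_values, k=2.0, use_standard_error=False):
    """Return True if `observed` lies outside mean +/- k x SD (or k x SE)
    of the reference values. Illustrative helper only."""
    if len(reference_values) < 2:
        raise ValueError("at least two reference values are required")
    mean = statistics.mean(reference_values)
    sd = statistics.stdev(reference_values)  # sample standard deviation
    margin = sd / math.sqrt(len(reference_values)) if use_standard_error else sd
    return abs(observed - mean) > k * margin
```

A call such as outside_error_margin(12, [9, 10, 11, 10, 10]) would flag the observed value as a deviation at ±2xSD, whereas an observed value of 10.5 would not be flagged.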

[0329] In a further embodiment, a deviation may be concluded if an observed alteration is beyond a given threshold or cut-off. Such threshold or cut-off may be selected as generally known in the art to provide for a chosen sensitivity and/or specificity of the prediction methods, e.g., sensitivity and/or specificity of at least 50%, or at least 60%, or at least 70%, or at least 80%, or at least 85%, or at least 90%, or at least 95%.

[0330] For example, receiver-operating characteristic (ROC) curve analysis can be used to select an optimal cut-off value of the quantity of a given immune cell population, biomarker or gene or gene product signatures, for clinical use of the present diagnostic tests, based on acceptable sensitivity and specificity, or related performance measures which are well-known per se, such as positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+), negative likelihood ratio (LR-), Youden index, or similar.
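One simple way to select a cut-off from ROC-type data is to scan each candidate threshold and keep the one that maximizes the Youden index (sensitivity + specificity - 1). The sketch below assumes that higher values indicate the condition and that a value at or above the cut-off is called positive; the function name and data layout are hypothetical.

```python
def youden_cutoff(values, labels):
    """Scan candidate cut-offs and return (cutoff, youden_j, sensitivity, specificity)
    for the cut-off with the highest Youden index.

    values: measured quantities; labels: True if the condition is present.
    A sample is called positive when its value >= cutoff (assumption)."""
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both positive and negative samples are required")
    best = None
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, l in zip(values, labels) if l and v >= cutoff)
        true_neg = sum(1 for v, l in zip(values, labels) if not l and v < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best
```

Other performance measures named above (PPV, NPV, LR+, LR-) could be computed from the same true/false positive and negative counts.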

Single cell RNA sequencing

[0331] In certain embodiments, the invention involves single cell RNA sequencing (see, e.g., Kalisky, T., Blainey, P. & Quake, S. R. Genomic Analysis at the Single-Cell Level. Annual review of genetics 45, 431-445, (2011); Kalisky, T. & Quake, S. R. Single-cell genomics. Nature Methods 8, 311-314 (2011); Islam, S. et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Research, (2011); Tang, F. et al. RNA-Seq analysis to capture the transcriptome landscape of a single cell. Nature Protocols 5, 516-535, (2010); Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nature Methods 6, 377-382, (2009); Ramskold, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nature Biotechnology 30, 777-782, (2012); and Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification. Cell Reports, Volume 2, Issue 3, p666-673, 2012).

[0332] In certain embodiments, the invention involves plate based single cell RNA sequencing (see, e.g., Picelli, S. et al., 2014, “Full-length RNA-seq from single cells using Smart-seq2” Nature protocols 9, 171-181, doi: 10.1038/nprot.2014.006).

[0333] In certain embodiments, the invention involves high-throughput single-cell RNA-seq. In this regard reference is made to Macosko et al., 2015, “Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets” Cell 161, 1202-1214; International patent application number PCT/US2015/049178, published as WO2016/040476 on March 17, 2016; Klein et al., 2015, “Droplet Barcoding for Single-Cell Transcriptomics Applied to Embryonic Stem Cells” Cell 161, 1187-1201; International patent application number PCT/US2016/027734, published as WO2016168584A1 on October 20, 2016; Zheng, et al., 2016, “Haplotyping germline and cancer genomes with high-throughput linked-read sequencing” Nature Biotechnology 34, 303-311; Zheng, et al., 2017, “Massively parallel digital transcriptional profiling of single cells” Nat. Commun. 8, 14049 doi: 10.1038/ncomms14049; International patent publication number WO2014210353A2; Zilionis, et al., 2017, “Single-cell barcoding and sequencing using droplet microfluidics” Nat Protoc. Jan;12(1):44-73; Cao et al., 2017, “Comprehensive single cell transcriptional profiling of a multicellular organism by combinatorial indexing” bioRxiv preprint first posted online Feb. 2, 2017, doi: dx.doi.org/10.1101/104844; Rosenberg et al., 2017, “Scaling single cell transcriptomics through split pool barcoding” bioRxiv preprint first posted online Feb. 2, 2017, doi: dx.doi.org/10.1101/105163; Rosenberg et al., “Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding” Science 15 Mar 2018; Vitak, et al., “Sequencing thousands of single-cell genomes with combinatorial indexing” Nature Methods, 14(3):302-308, 2017; Cao, et al., Comprehensive single-cell transcriptional profiling of a multicellular organism. Science, 357(6352):661-667, 2017; Gierahn et al., “Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput” Nature Methods 14, 395-398 (2017); and Hughes, et al., “Highly Efficient, Massively-Parallel Single-Cell RNA-Seq Reveals Cellular States and Molecular Features of Human Skin Pathology” bioRxiv 689273; doi: doi.org/10.1101/689273, all the contents and disclosure of each of which are herein incorporated by reference in their entirety.

[0334] In certain embodiments, the invention involves single nucleus RNA sequencing. In this regard reference is made to Swiech et al., 2014, “In vivo interrogation of gene function in the mammalian brain using CRISPR-Cas9” Nature Biotechnology Vol. 33, pp. 102-106; Habib et al., 2016, “Div-Seq: Single-nucleus RNA-Seq reveals dynamics of rare adult newborn neurons” Science, Vol. 353, Issue 6302, pp. 925-928; Habib et al., 2017, “Massively parallel single-nucleus RNA-seq with DroNc-seq” Nat Methods. 2017 Oct; 14(10):955-958; International Patent Application No. PCT/US2016/059239, published as WO2017164936 on September 28, 2017; International Patent Application No. PCT/US2018/060860, published as WO/2019/094984 on May 16, 2019; International Patent Application No. PCT/US2019/055894, published as WO/2020/077236 on April 16, 2020; and Drokhlyansky, et al., “The enteric nervous system of the human and mouse colon at a single-cell resolution,” bioRxiv 746743; doi: doi.org/10.1101/746743, which are herein incorporated by reference in their entirety.

[0335] In certain example embodiments, assessing the cell (sub)types and states present in the ex vivo system may comprise analysis of expression matrices from scRNA-seq data, performing dimensionality reduction and graph-based clustering, and deriving a list of cluster-specific genes in order to identify cell types and/or states present in the system. Further, the clustering and gene expression matrix analysis allow for the identification of key genes in the ex vivo system, such as differences in the expression of key transcription factors.

[0336] In certain embodiments, dimension reduction is used to cluster nuclei from single cells based on differentially expressed genes. In certain embodiments, the dimension reduction technique may be, but is not limited to, Uniform Manifold Approximation and Projection (UMAP), t-SNE, or PHATE (see, e.g., Becht et al., Evaluation of UMAP as an alternative to t-SNE for single-cell data, bioRxiv 298430; doi.org/10.1101/298430; Becht et al., 2019, Dimensionality reduction for visualizing single-cell data using UMAP, Nature Biotechnology volume 37, pages 38-44; and Moon et al., PHATE: A Dimensionality Reduction Method for Visualizing Trajectory Structures in High-Dimensional Biological Data, bioRxiv 120378; doi: doi.org/10.1101/120378).

Other Detection Methods

[0337] Modulation may be monitored in a number of ways. For example, expression of one or more key marker genes may be measured at regular intervals to assess increases in expression levels. Shifting of the ex vivo system may also be measured phenotypically. For example, imaging and immunocytochemistry for key in vivo markers may be assessed at regular intervals to detect increased expression of the key in vivo markers. Likewise, flow cytometry may be used in a similar manner. In addition to detecting key in vivo markers, imaging modalities may be used to further detect changes in cell morphology of the ex vivo system.

[0338] In one embodiment, the signature genes, biomarkers, and/or cells may be detected or isolated by immunofluorescence, immunohistochemistry (IHC), fluorescence activated cell sorting (FACS), mass spectrometry (MS), mass cytometry (CyTOF), RNA-seq, single cell RNA-seq (described further herein), quantitative RT-PCR, single cell qPCR, FISH, RNA-FISH, MERFISH (multiplex (in situ) RNA FISH) and/or by in situ hybridization. Other methods including absorbance assays and colorimetric assays are known in the art and may be used herein. Detection may comprise primers and/or probes or fluorescently bar-coded oligonucleotide probes for hybridization to RNA (see, e.g., Geiss GK, et al., Direct multiplexed measurement of gene expression with color-coded probe pairs. Nat Biotechnol. 2008 Mar;26(3):317-25).

[0339] The present invention also may comprise a kit with a detection reagent that binds to one or more biomarkers or can be used to detect one or more biomarkers.

MS methods

[0340] Biomarker detection may also be evaluated using mass spectrometry methods. A variety of configurations of mass spectrometers can be used to detect biomarker values. Several types of mass spectrometers are available or can be produced with various configurations. In general, a mass spectrometer has the following major components: a sample inlet, an ion source, a mass analyzer, a detector, a vacuum system, an instrument-control system, and a data system. Differences in the sample inlet, ion source, and mass analyzer generally define the type of instrument and its capabilities. For example, an inlet can be a capillary-column liquid chromatography source or can be a direct probe or stage such as used in matrix-assisted laser desorption. Common ion sources are, for example, electrospray, including nanospray and microspray, or matrix-assisted laser desorption. Common mass analyzers include a quadrupole mass filter, ion trap mass analyzer and time-of-flight mass analyzer. Additional mass spectrometry methods are well known in the art (see Burlingame et al., Anal. Chem. 70:647R-716R (1998); Kinter and Sherman, New York (2000)).

[0341] Protein biomarkers and biomarker values can be detected and measured by any of the following: electrospray ionization mass spectrometry (ESI-MS), ESI-MS/MS, ESI-MS/(MS)n, matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS), surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS), desorption/ionization on silicon (DIOS), secondary ion mass spectrometry (SIMS), quadrupole time-of-flight (Q-TOF), tandem time-of-flight (TOF/TOF) technology, called ultraflex III TOF/TOF, atmospheric pressure chemical ionization mass spectrometry (APCI-MS), APCI-MS/MS, APCI-(MS)^N, atmospheric pressure photoionization mass spectrometry (APPI-MS), APPI-MS/MS, and APPI-(MS)^N, quadrupole mass spectrometry, Fourier transform mass spectrometry (FTMS), quantitative mass spectrometry, and ion trap mass spectrometry.

[0342] Sample preparation strategies are used to label and enrich samples before mass spectroscopic characterization of protein biomarkers and determination of biomarker values. Labeling methods include but are not limited to isobaric tag for relative and absolute quantitation (iTRAQ) and stable isotope labeling with amino acids in cell culture (SILAC). Capture reagents used to selectively enrich samples for candidate biomarker proteins prior to mass spectroscopic analysis include but are not limited to aptamers, antibodies, nucleic acid probes, chimeras, small molecules, an F(ab')2 fragment, a single chain antibody fragment, an Fv fragment, a single chain Fv fragment, a nucleic acid, a lectin, a ligand-binding receptor, affybodies, nanobodies, ankyrins, domain antibodies, alternative antibody scaffolds (e.g., diabodies, etc.), imprinted polymers, avimers, peptidomimetics, peptoids, peptide nucleic acids, threose nucleic acid, a hormone receptor, a cytokine receptor, and synthetic receptors, and modifications and fragments of these.

Immunoassays

[0343] Immunoassay methods are based on the reaction of an antibody to its corresponding target or analyte and can detect the analyte in a sample depending on the specific assay format. To improve specificity and sensitivity of an assay method based on immunoreactivity, monoclonal antibodies are often used because of their specific epitope recognition. Polyclonal antibodies have also been successfully used in various immunoassays because of their increased affinity for the target as compared to monoclonal antibodies. Immunoassays have been designed for use with a wide range of biological sample matrices. Immunoassay formats have been designed to provide qualitative, semi-quantitative, and quantitative results.

[0344] Quantitative results may be generated through the use of a standard curve created with known concentrations of the specific analyte to be detected. The response or signal from an unknown sample is plotted onto the standard curve, and a quantity or value corresponding to the target in the unknown sample is established.
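By way of a non-limiting sketch, reading an unknown sample off a standard curve can be modeled as piecewise-linear interpolation between the known standards. Real immunoassay curves are frequently fit with other models; the linear interpolation, the function name and the data layout below are assumptions made purely for illustration.

```python
def concentration_from_signal(signal, standards):
    """Estimate analyte concentration for an unknown sample by linear
    interpolation on a standard curve.

    standards: list of (known_concentration, measured_signal) pairs.
    Assumes the signal increases monotonically with concentration."""
    points = sorted(standards, key=lambda pair: pair[1])  # order by signal
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0  # flat segment: report the lower standard
            fraction = (signal - s0) / (s1 - s0)
            return c0 + fraction * (c1 - c0)
    raise ValueError("signal lies outside the range of the standard curve")
```

With hypothetical standards of (0, 0.0), (10, 0.5) and (20, 1.0), a sample signal of 0.75 would be assigned a concentration of 15 in the same units as the standards.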

[0345] Numerous immunoassay formats have been designed. ELISA or EIA can be quantitative for the detection of an analyte/biomarker. This method relies on attachment of a label to either the analyte or the antibody and the label component includes, either directly or indirectly, an enzyme. ELISA tests may be formatted for direct, indirect, competitive, or sandwich detection of the analyte. Other methods rely on labels such as, for example, radioisotopes (I-125) or fluorescence. Additional techniques include, for example, agglutination, nephelometry, turbidimetry, Western blot, immunoprecipitation, immunocytochemistry, immunohistochemistry, flow cytometry, Luminex assay, and others (see ImmunoAssay: A Practical Guide, edited by Brian Law, published by Taylor & Francis, Ltd., 2005 edition).

[0346] Exemplary assay formats include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay, fluorescent, chemiluminescence, and fluorescence resonance energy transfer (FRET) or time resolved-FRET (TR-FRET) immunoassays. Examples of procedures for detecting biomarkers include biomarker immunoprecipitation followed by quantitative methods that allow size and peptide level discrimination, such as gel electrophoresis, capillary electrophoresis, planar electrochromatography, and the like.

[0347] Methods of detecting and/or quantifying a detectable label or signal generating material depend on the nature of the label. The products of reactions catalyzed by appropriate enzymes (where the detectable label is an enzyme; see above) can be, without limitation, fluorescent, luminescent, or radioactive or they may absorb visible or ultraviolet light. Examples of detectors suitable for detecting such detectable labels include, without limitation, x-ray film, radioactivity counters, scintillation counters, spectrophotometers, colorimeters, fluorometers, luminometers, and densitometers.

[0348] Any of the methods for detection can be performed in any format that allows for any suitable preparation, processing, and analysis of the reactions. This can be, for example, in multi-well assay plates (e.g., 96 wells or 384 wells) or using any suitable array or microarray. Stock solutions for various agents can be made manually or robotically, and all subsequent pipetting, diluting, mixing, distribution, washing, incubating, sample readout, data collection and analysis can be done robotically using commercially available analysis software, robotics, and detection instrumentation capable of detecting a detectable label.

Hybridization assays

[0349] Such applications are hybridization assays in which a nucleic acid array that displays "probe" nucleic acids for each of the genes to be assayed/profiled in the profile to be generated is employed. In these assays, a sample of target nucleic acids is first prepared from the initial nucleic acid sample being assayed, where preparation may include labeling of the target nucleic acids with a label, e.g., a member of a signal producing system. Following target nucleic acid sample preparation, the sample is contacted with the array under hybridization conditions, whereby complexes are formed between target nucleic acids that are complementary to probe sequences attached to the array surface. The presence of hybridized complexes is then detected, either qualitatively or quantitatively. Specific hybridization technology which may be practiced to generate the expression profiles employed in the subject methods includes the technology described in U.S. Pat. Nos. 5,143,854; 5,288,644; 5,324,633; 5,432,049; 5,470,710; 5,492,806; 5,503,980; 5,510,270; 5,525,464; 5,547,839; 5,580,732; 5,661,028; 5,800,992; the disclosures of which are herein incorporated by reference; as well as International Patent Publication Nos. WO 95/21265, WO 96/31622, WO 97/10365, and WO 97/27317, and European Patent Application Nos. EP 373 203 and EP 785 280. In these methods, an array of "probe" nucleic acids that includes a probe for each of the biomarkers whose expression is being assayed is contacted with target nucleic acids as described above. Contact is carried out under hybridization conditions, e.g., stringent hybridization conditions as described above, and unbound nucleic acid is then removed. The resultant pattern of hybridized nucleic acids provides information regarding expression for each of the biomarkers that have been probed, where the expression information is in terms of whether or not the gene is expressed and, typically, at what level, where the expression data, i.e., expression profile, may be both qualitative and quantitative.

[0350] Optimal hybridization conditions will depend on the length (e.g., oligomer vs. polynucleotide greater than 200 bases) and type (e.g., RNA, DNA, PNA) of labeled probe and immobilized polynucleotide or oligonucleotide. General parameters for specific (i.e., stringent) hybridization conditions for nucleic acids are described in Sambrook et al., supra, and in Ausubel et al., "Current Protocols in Molecular Biology", Greene Publishing and Wiley-Interscience, NY (1987), which is incorporated in its entirety for all purposes. When the cDNA microarrays are used, typical hybridization conditions are hybridization in 5xSSC plus 0.2% SDS at 65 °C for 4 hours followed by washes at 25 °C in low stringency wash buffer (1xSSC plus 0.2% SDS) followed by 10 minutes at 25 °C in high stringency wash buffer (0.1xSSC plus 0.2% SDS) (see Schena et al., Proc. Natl. Acad. Sci. USA, Vol. 93, p. 10614 (1996)). Useful hybridization conditions are also provided in, e.g., Tijssen, "Hybridization With Nucleic Acid Probes", Elsevier Science Publishers B.V. (1993) and Kricka, "Nonisotopic DNA Probe Techniques", Academic Press, San Diego, Calif. (1992).

ATAC-seq

[0351] In certain embodiments, the invention involves the Assay for Transposase Accessible Chromatin using sequencing (ATAC-seq) as described (see, e.g., Buenrostro, et al., Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nature methods 2013; 10 (12): 1213-1218; Buenrostro et al., Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486-490 (2015); Cusanovich, D. A., Daza, R., Adey, A., Pliner, H., Christiansen, L., Gunderson, K. L., Steemers, F. J., Trapnell, C. & Shendure, J. Multiplex single-cell profiling of chromatin accessibility by combinatorial cellular indexing. Science. 2015 May 22;348(6237):910-4. doi: 10.1126/science.aab1601. Epub 2015 May 7; U.S. Patent Publication Nos. US20160208323A1 and US20160060691A1; and International Patent Publication No. WO2017156336A1).

SCREENING METHODS

Screening for Modulating Agents

[0352] A further aspect of the invention relates to a method for identifying an agent capable of modulating one or more phenotypic aspects of a cell or cell population as disclosed herein, comprising: a) applying a candidate agent to the cell or cell population; and b) detecting modulation of one or more phenotypic aspects of the cell or cell population by the candidate agent, thereby identifying the agent. The phenotypic aspects of the cell or cell population that are modulated may be a gene signature or biological program specific to a cell type or cell phenotype or phenotype specific to a population of cells. In certain embodiments, steps can include administering candidate modulating agents to cells, detecting identified cell (sub)populations for changes in signatures, or identifying relative changes in cell (sub)populations which may comprise detecting relative abundance of particular gene signatures.

[0353] The term “modulate” broadly denotes a qualitative and/or quantitative alteration, change or variation in that which is being modulated. Where modulation can be assessed quantitatively - for example, where modulation comprises or consists of a change in a quantifiable variable such as a quantifiable property of a cell or where a quantifiable variable provides a suitable surrogate for the modulation - modulation specifically encompasses both increase (e.g., activation) and decrease (e.g., inhibition) in the measured variable. The term encompasses any extent of such modulation, e.g., any extent of such increase or decrease, and may more particularly refer to statistically significant increase or decrease in the measured variable. By means of example, modulation may encompass an increase in the value of the measured variable by at least about 10%, e.g., by at least about 20%, preferably by at least about 30%, e.g., by at least about 40%, more preferably by at least about 50%, e.g., by at least about 75%, even more preferably by at least about 100%, e.g., by at least about 150%, 200%, 250%, 300%, 400% or by at least about 500%, compared to a reference situation without said modulation; or modulation may encompass a decrease or reduction in the value of the measured variable by at least about 10%, e.g., by at least about 20%, by at least about 30%, e.g., by at least about 40%, by at least about 50%, e.g., by at least about 60%, by at least about 70%, e.g., by at least about 80%, by at least about 90%, e.g., by at least about 95%, such as by at least about 96%, 97%, 98%, 99% or even by 100%, compared to a reference situation without said modulation. Preferably, modulation may be specific or selective, hence, one or more desired phenotypic aspects of an immune cell or immune cell population may be modulated without substantially altering other (unintended, undesired) phenotypic aspect(s).

[0354] The term “agent” broadly encompasses any condition, substance or agent capable of modulating one or more phenotypic aspects of a cell or cell population as disclosed herein. Such conditions, substances or agents may be of physical, chemical, biochemical and/or biological nature. The term “candidate agent” refers to any condition, substance or agent that is being examined for the ability to modulate one or more phenotypic aspects of a cell or cell population as disclosed herein in a method comprising applying the candidate agent to the cell or cell population (e.g., exposing the cell or cell population to the candidate agent or contacting the cell or cell population with the candidate agent) and observing whether the desired modulation takes place.

[0355] Agents may include any potential class of biologically active conditions, substances or agents, such as for instance antibodies, proteins, peptides, nucleic acids, oligonucleotides, small molecules, or combinations thereof, as described herein.

[0356] The methods of phenotypic analysis can be utilized for evaluating environmental stress and/or state, for screening of chemical libraries, and to screen or identify structural, syntenic, genomic, and/or organism and species variations. For example, a culture of cells can be exposed to an environmental stress, such as but not limited to heat shock, osmolarity, hypoxia, cold, oxidative stress, radiation, starvation, a chemical (for example a therapeutic agent or potential therapeutic agent) and the like. After the stress is applied, a representative sample can be subjected to analysis, for example at various time points, and compared to a control, such as a sample from an organism or cell, for example a cell from an organism, or a standard value. By exposing cells, or fractions thereof, tissues, or even whole animals, to different members of the chemical libraries, and performing the methods described herein, different members of a chemical library can be screened for their effect on immune phenotypes thereof simultaneously in a relatively short amount of time, for example using a high throughput method.

[0357] Aspects of the present disclosure relate to the correlation of an agent with the spatial proximity and/or epigenetic profile of the nucleic acids in a sample of cells. In some embodiments, the disclosed methods can be used to screen chemical libraries for agents that modulate chromatin architecture, epigenetic profiles, and/or relationships thereof.

[0358] In some embodiments, screening of test agents involves testing a combinatorial library containing a large number of potential modulator compounds. A combinatorial chemical library may be a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis, by combining a number of chemical "building blocks" such as reagents. For example, a linear combinatorial chemical library, such as a polypeptide library, is formed by combining a set of chemical building blocks (amino acids) in every possible way for a given compound length (for example the number of amino acids in a polypeptide compound). Millions of chemical compounds can be synthesized through such combinatorial mixing of chemical building blocks.
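The combinatorial growth described above can be sketched by enumerating every ordered combination of a set of building blocks for a fixed compound length. The function names and the single-letter representation of building blocks are illustrative assumptions only.

```python
from itertools import product


def linear_library(building_blocks, length):
    """Enumerate every linear compound of the given length that can be formed
    from the building blocks (e.g., one-letter amino acid codes)."""
    return ["".join(combo) for combo in product(building_blocks, repeat=length)]


def library_size(n_building_blocks, length):
    """Number of distinct linear compounds: one independent choice per position."""
    return n_building_blocks ** length
```

Under this counting, 20 amino acid building blocks at a compound length of 5 already yield 20^5 = 3,200,000 distinct pentapeptides, consistent with the statement that millions of compounds can be generated by such mixing.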

[0359] In certain embodiments, the present invention provides for gene signature screening. The concept of signature screening was introduced by Stegmaier et al. (Gene expression-based high-throughput screening (GE-HTS) and application to leukemia differentiation. Nature Genet. 36, 257-263 (2004)), who realized that if a gene-expression signature was the proxy for a phenotype of interest, it could be used to find small molecules that effect that phenotype without knowledge of a validated drug target. The signatures or biological programs of the present invention may be used to screen for drugs that reduce the signature or biological program in cells as described herein. The signature or biological program may be used for GE-HTS. In certain embodiments, pharmacological screens may be used to identify drugs that are selectively toxic to cells having a signature.

[0360] The Connectivity Map (cmap) is a collection of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules and simple pattern-matching algorithms that together enable the discovery of functional connections between drugs, genes and diseases through the transitory feature of common gene-expression changes (see, Lamb et al., The Connectivity Map: Using Gene-Expression Signatures to Connect Small Molecules, Genes, and Disease. Science 29 Sep 2006: Vol. 313, Issue 5795, pp. 1929-1935, DOI: 10.1126/science.1132939; and Lamb, J., The Connectivity Map: a new tool for biomedical research. Nature Reviews Cancer January 2007: Vol. 7, pp. 54-60). In certain embodiments, cmap can be used to screen in silico for small molecules capable of modulating a signature or biological program of the present invention.

[0361] The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.

EXAMPLES

Example 1 - Modulation of Paneth cell development and function

Scalable platform to study Paneth cell development and function

[0362] Applicants developed a scalable, functional, phenotypic assay for screening ENR+CD-treated cells (FIG. 1). To overcome the limitations to scaling organoid-derived cultures, Applicants first sought to develop a method to preserve the important material and signaling cues supplied by Matrigel scaffolding while enabling automated plating through robotic liquid handlers used in high-throughput screening. Applicants adapted the conventional “3-D” Matrigel droplet culture approach to a 96-well plate pseudo-monolayer “2.5-D” scheme in which organoids are re-plated partially embedded on the surface of a thick layer of Matrigel (at the Matrigel-media interface) rather than fully encapsulated in the Matrigel structure. This allows for a two-step automated plating procedure, where Matrigel is first deposited and gelled in 96- (or 384-) well plates, and culture media containing suspended cell clusters is then added into each well and briefly centrifuged at low force to loosely deposit clusters on the surface of the thick Matrigel scaffold. This allows for Matrigel plating, cell seeding, and media additions to be fully automated by a liquid handler and readily scaled. Further, because the deposited cell clusters are now apically exposed to the culture media, assaying apically-secreted agents (such as LYZ) should be greatly enhanced, while allowing for the multiplexed assaying of underlying embedded cells (FIG. 2).

[0363] Applicants seeded 6-day ENR+CD differentiated clusters per well and assayed for cellular ATP as well as basal LYZ secretion over 3 hours. The optimal plating density is between 75 and 300 clusters / well (2.5 to 10 clusters / µL media). The 2.5-D platform in a 96-well plate provided an ability to discriminate PC function between conventional, ISC-enriched, and PC-enriched organoids.

[0364] Applicants applied the screening platform as a tool to investigate and validate the actions of proposed agents which modulate in vivo PC function. Applicants can determine the approximate standardized effect size of treatment at different doses using replicate-based strictly standardized mean difference (SSMD), an effect size measure based on the statistics of contrast variables, and particularly useful in screening applications as it is a measure which takes into account both mean difference and variance in a single measure. Using the replicate based uniformly minimal variance unbiased estimate (UMVUE) SSMD, Applicants can identify modulators of LYZ secretion. Applicants can distinguish between strong and weak effect agonists of LYZ secretion, and in the case of strong agonists (e.g., cholinergic agonists (CCh)), are able to provide a relatively high-resolution dose-response profile with associated EC50 concentration.
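As a hedged illustration of the statistic named above, the sketch below computes SSMD for paired replicate differences (treated minus matched control) using both the method-of-moments form and the UMVUE form given in the screening-statistics literature for paired designs. The disclosure does not state which design or code Applicants used; the paired form, the function names and the input layout are assumptions.

```python
import math
import statistics


def ssmd_mm_paired(differences):
    """Method-of-moments SSMD for paired replicates: mean difference / SD."""
    sd = statistics.stdev(differences)
    if sd == 0:
        raise ValueError("differences have zero variance")
    return statistics.mean(differences) / sd


def ssmd_umvue_paired(differences):
    """UMVUE SSMD for n paired replicate differences (requires n >= 3):
    Gamma((n-1)/2) / Gamma((n-2)/2) * sqrt(2/(n-1)) * mean / SD."""
    n = len(differences)
    if n < 3:
        raise ValueError("the UMVUE estimate needs at least 3 replicates")
    correction = (math.gamma((n - 1) / 2) / math.gamma((n - 2) / 2)
                  * math.sqrt(2 / (n - 1)))
    return correction * ssmd_mm_paired(differences)
```

Because the statistic divides the mean difference by its variability, a compound with a modest but highly reproducible effect can score as strongly as one with a larger but noisier effect, which is the property that makes it useful for ranking hits across doses.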

[0365] In total, the platform is capable of scaling an organoid-derived culture reproducibly with a simple phenotypic assay for PC function (LYZ secretion). The platform can be used to assess potential short and long-term modulators of differentiated cell function. A primary strength of this PC screening platform is in studying agents that may enhance the pace of PC development and may serve as therapeutic candidates to increase or improve PC quality in diseases where there is a loss of PC number or function, such as Crohn’s disease.

Primary small molecule screen for molecular targets to enhance Paneth cell development

[0366] While the scaled screening approach enables phenotypic study of differentiated PCs at scale, it also offers an opportunity to examine modulators of the differentiation ‘trajectory’ from ISC precursors. Intervening during the 6-day differentiation of ENR+CD organoids allows for the study of unappreciated molecular pathway activity which influences PC differentiation and may be readily translated beyond the model. Applicants therefore sought to use the scalable “2.5-D” platform to demonstrate a proof-of-concept screen for the developmental process of PCs in vitro, and to elucidate molecular pathways and small molecule agents which may afford an axis to enhance PC number therapeutically.

[0367] Using a modified version of the 96-well “2.5-D” system and functional assays, Applicants screened for agents which enhance in vitro PC differentiation or survival using a target-selective inhibitor (TSI) library (Selleck Chemicals (Houston, TX) L3500) containing 433 small molecules covering 184 well-characterized unique molecular targets with high specificity. This library offers translational advantages as many of the molecules are either presently used in the clinic for a wide variety of conditions or have been used in clinical trials and animal models. Applicants scaled the 96-well 2.5-D system to a 384-well plate format with a single-well stimulation protocol and assessed the activity of secreted LYZ (basal, LYZ.NS, and CCh-stimulated, LYZ.S) and cell pellet ATP following treatment with each compound. ISC-enriched “small clusters” from 3 biological donors were seeded as 3 replicate screens into differentiation media (ENR+CD), and then screened with each of the 433 different compounds at 4 doses covering the nanomolar to micromolar range. By screening in the presence of the PC differentiation media, Applicants sought to assess how the library compounds may act outside of the known WNT and Notch pathways to influence PC differentiation or function, while simultaneously generating a PC-enriched system with which to robustly assay for PC function. Applicants performed the three sequential assays 6 days after initial plating with ENR+CD+library treatment and had an additional media change and drug treatment at day 3 (FIG. 2). Each screen plate was log10 transformed and LOESS normalized to reduce plate effects, and each well value was reported as fold change (FC) relative to the median assay value of its respective plate (under the assumption that many of the compounds and doses on the plate will not be biologically active and therefore serve as a suitable control).

[0368] In the LYZ.NS assay (first assay; all +drug wells are sampled for basal secretion), secreted LYZ in non-stimulated ENR+CD controls was significantly greater than in no-cell controls (adj. p < 0.0001), and secreted LYZ in 10 mM CCh-stimulated ENR+CD controls was significantly greater than in non-stimulated positive controls (adj. p < 0.0001). In the LYZ.S assay (second assay; all +drug wells are sampled for CCh-induced secretion), non-stimulated ENR+CD controls subsequently stimulated with 10 mM CCh versus non-stimulated controls (adj. p < 0.05), and doubly non-stimulated positive controls versus no-cell controls (adj. p < 0.0001), showed significant differences. Further, each plate across each replicate was relatively well-correlated for all three assays.

[0369] To assess treatment effect size and define primary screen ‘hits’, replicate SSMD was calculated and hits were determined by an SSMD greater than the false-positive and false-negative derived cutoffs (errors equalized to minimum for 3 replicates, alpha = 0.087) for both LYZ.NS and LYZ.S assays (FIG. 3A). Of the resulting 47 hits, 19 were also hits in the ATP assay, with the remaining 28 being hits in the two LYZ assays only. Interestingly, a plurality of hits was assay-specific, suggesting that single-assay hits may either arise from system ‘noise’ or may reflect unique biological effects (FIG. 3C). Applicants further refined the set of 47 double-LYZ assay hits by only including the most biologically ‘potent’ treatments, using a cutoff corresponding to treatments which would fall in the top 10% of a normal distribution (z-scored FCs > 1.282) (FIG. 3E).
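The two-stage hit definition described above can be illustrated with a minimal Python sketch. The function name, argument names, and the example numbers are hypothetical and are not taken from the screen; the SSMD cutoff is passed in as a parameter rather than assumed. The only fixed quantity is the potency cutoff, which is the 90th percentile of the standard normal distribution (approximately 1.282).

```python
from statistics import NormalDist

# 90th percentile of the standard normal distribution; z-scored fold
# changes above this value fall in the top 10% (approximately 1.282).
POTENCY_Z_CUTOFF = NormalDist().inv_cdf(0.90)


def is_potent_hit(ssmd_ns, ssmd_s, z_ns, z_s, ssmd_cutoff):
    """Return True when a treatment-dose passes both filters.

    ssmd_ns, ssmd_s: replicate SSMD in the LYZ.NS and LYZ.S assays.
    z_ns, z_s:       z-scored fold change in the same two assays.
    ssmd_cutoff:     statistical cutoff supplied by the caller (hypothetical
                     parameter; its value is not fixed in this sketch).
    """
    passes_ssmd = ssmd_ns > ssmd_cutoff and ssmd_s > ssmd_cutoff
    passes_potency = z_ns > POTENCY_Z_CUTOFF and z_s > POTENCY_Z_CUTOFF
    return passes_ssmd and passes_potency
```

Requiring both assays to pass each filter mirrors the "double-LYZ" criterion in the text; a treatment that clears the cutoff in only one LYZ assay is not counted.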

[0370] To validate the 13 most potent enhancers of PC differentiation identified in primary screens, Applicants conducted secondary screening using the same assays and screening format as before with an increased number of well replicates per treatment, a lower tolerance for false positives (without consideration for false negatives; see High-throughput screening: 384-well secondary screen format and early vs. full treatment analysis), and a more stringent control population (ENR+CD or ENR, compared to the whole plate in the primary screen).

[0371] Applicants again assessed treatment effect size with replicate SSMD (using a difference between well replicates and median assay value for all same-plate control wells), and hits were considered statistically validated by an SSMD greater than the false-positive-derived cutoffs for both LYZ.NS and LYZ.S assays (FIG. 3B). Per this new cutoff, 10 dose-treatment combinations corresponding to 7 small molecules were chosen as hits, with every passing dose-treatment having an ATP effect size greater than 0. Applicants also profiled the biological potency (mean fold change between treatment and ENR+CD+DMSO for the LYZ.NS and LYZ.S assays) of the 10 validated dose-treatment combinations, showing that the compounds increased basal and stimulated LYZ secretion by 25%-75% relative to the control (FIG. 3D). For each compound with multiple validated doses (KPT-330, PHA-665752, and Nilotinib), the most biologically active dose was advanced, and, because Nilotinib and Bosutinib have similar known mechanisms, only Nilotinib (the more biologically potent) was advanced (FIG. 3F) to additional profiling.

[0372] To clarify differences between early and full treatment during differentiation, Applicants looked at the differences between early and late treatment across a range of doses for 3 compounds in the LYZ assays normalized to their matched ATP values (a measure of LYZ/ATP suggests functional capacity per cell) as well as their ATP basis (FIG. 4). The hits ‘act’ at different points in differentiation.

Validation of XPO1 inhibition as a means to enhance Paneth cell differentiation

[0373] Of the six promising lead small molecules identified, KPT-330 appears to most significantly enhance Paneth cell differentiation, and as such Applicants sought to better understand the mechanism through which KPT-330 may be acting, whether by canonical XPO1 inhibition or other means. Applicants determined the effect of the hits on gene expression by population RNA sequencing. KPT-330 showed the most differentially expressed genes (FIG. 5A). Gene set scores for the intestinal cell types showed that secretory cells (Paneth, goblet, and EEC) were increased after treatment with the small molecules (FIG. 5B, C). Assays in the conventional (3D) system confirmed KPT-330-enhanced differentiation (FIG. 6A, B).

[0374] KPT-330 is a chromosome region maintenance-1 (CRM1) inhibitor with antineoplastic activity. CRM1 (also known as XPO1, emb, exp1, exportin 1, CRM-1) is a eukaryotic protein that mediates the nuclear export of proteins, rRNA, snRNA, and some mRNA. KPT-330 acts via the selective inhibition of nuclear export (SINE) approach: by modifying the essential CRM1-cargo binding residue C528, KPT-330 irreversibly inactivates CRM1-mediated nuclear export of cargo proteins, including growth regulation proteins. CRM1 co-immunoprecipitates with p27kip1, a protein whose constitutive expression causes cell cycle arrest in the G1 phase that precedes differentiation. Upregulation of CRM1 and decreased levels of p27kip1 are observed in mucosal biopsies of patients with active Crohn’s disease. Based on the results showing an increase in PC number and function with ENR+CD+KPT-330 treatment, CRM1 inhibition by KPT-330 may promote p27kip1-mediated cell cycle arrest to allow ISCs to transition first to a secretory cell progenitor, then to terminally differentiated PCs.

[0375] To address whether enhanced Paneth cell differentiation within ENR+CD organoids following KPT-330 treatment stems from the known mechanism of XPO1 inhibition, or from a potential off-target or non-canonical effect, Applicants repeated organoid differentiation with two additional known XPO1 inhibitors, KPT-8602 and Leptomycin B (FIG. 7A). As measured by flow cytometry, statistically significant increases in Paneth cell representation were observed following treatment with KPT-8602 and Leptomycin B (FIG. 7B).

[0376] KPT-330 administration drives Paneth cell differentiation through canonical XPO1 inhibition, as confirmed by parallel assessment with the additional known XPO1 inhibitors KPT-8602 and Leptomycin B, which lead to similar, statistically significant increases in the Paneth cell fraction within ENR+CD differentiated organoids.

[0377] Using the high-fidelity PC model in the traditional 3D enteroid culture system, Applicants differentiated ISC-enriched organoids from one biological donor and then sought to assess to what extent XPO1 inhibition is interdependent on Wnt agonism and Notch antagonism to drive secretory (Paneth) cell differentiation. Applicants assessed this through studies of bulk transcripts, protein, and functional assays.

[0378] Assaying bulk protein extracted from organoids differentiated under either ENR+CV, ENR+CD, or ENR for LYZ, it is apparent that the addition of any of the XPO1 inhibitors KPT-330, KPT-8602, and Leptomycin B appreciably increases LYZ abundance on a per-mass protein normalized basis (FIG. 8). Additionally, for both ENR and ENR+CD following 6-day differentiation, functional lysozyme secretion is greatly enhanced when XPO1 inhibitors are added (FIG. 9).

[0379] Assaying bulk transcripts from organoids differentiated under either ENR+CV, ENR+CD, or ENR, it is apparent that the addition of the XPO1 inhibitors KPT-330 and KPT-8602 increases secretory cells (Paneth, goblet, and EEC) (FIG. 10A, B).

Single cell RNA sequencing to determine pathways involved in differentiation

[0380] Applicants tested a time course of LYZ/ATP after KPT-330 treatment for use in further analyzing KPT-330 function (FIG. 11). Applicants performed Seq-Well on cells taken at each time point and treatment and determined quality control factors for each sample (FIG. 12). In total, 17,839 cells passed quality control.

[0381] Single cells were analyzed using dimension reduction (UMAP). The single cells clustered by cell type (FIG. 13A, C). The single cells were also analyzed for clustering by treatment and time (FIG. 13B, left). Lineage markers were projected on the clusters and confirm cell lineage and proliferating cells (FIG. 13B, right). The cell numbers and fraction of each cell type over the time course for control and KPT-330 treatment were determined, showing that KPT-330 enhances stem conversion to mature cells (FIG. 14A, B; 13C, right). Applicants analyzed the expression of XPO1 and NES transcripts in the single cells and found the highest expression in stem cells (FIG. 15A, B, C). XPO1 expression was lowest in Paneth cells, enterocytes and Paneth precursor cells (non-stem cells). Applicants determined the number of differentially expressed genes across the single cell types (FIG. 16).

[0382] Applicants determined the correlation of genes differentially expressed between control and KPT-330 treated cells. Genes that are correlated go up or down in expression together. Applicants found correlation between genes involved in stress response and secretory markers in the upregulated differentially expressed genes (FIG. 17A; Table 1). Applicants found correlation between genes involved in the cell cycle in the downregulated differentially expressed genes (FIG. 17B; Table 2). Thus, Xpo1 inhibition leads to an increase in stress response genes and downregulation of cell cycle genes, and differentiation may therefore be enhanced by treatment with a cell cycle inhibitor. Transcription factors are known to drive differentiation. ATF3 is a transcription factor upregulated by Xpo1 inhibition (KPT-330) and may function to drive differentiation of stem cells to mature secretory cells (FIG. 17C). Thus, enhancing the activity of ATF3 (e.g., by modifying post-translational modification sites) or increasing its expression may be used to drive differentiation in a more focused manner than Xpo1 inhibition.

[0383] Applicants also determined that KPT-330 induces a quiescent ISC signature and reduces an active ISC signature in Day 0-1 stem cell populations (FIG. 18A, B). Applicants further identified that induction of stem cell quiescence enhances the effect of KPT-330. Applicants determined that differentiation was enhanced using the combination of a map kinase inhibitor, cobimetinib (Cob), and KPT-330 (FIG. 19). Thus, contacting cells with an agent that induces a quiescent ISC signature can be used to drive differentiation.

[0384] Applicants determined the effect of KPT-330 in vivo (FIGs. 20-22). Applicants observed that high concentration of KPT-330 was toxic to mice (FIG. 20). Thus, a more focused treatment may be beneficial to drive differentiation of stem cells in vivo. Applicants observed using histology that KPT-330 provides a pro-differentiation effect (FIG. 22). In the proximal small intestines, Applicants observed no significant shift in Paneth cell number. In the distal small intestines, Applicants observed that the high dose decreased Paneth cells. In the proximal/distal small intestines Applicants observed no significant shift in goblet cell numbers, but the trend was an increase. In the distal small intestines, Applicants observed a significant increase in cycling cell number at low dose (5 mg/g).

Materials and Methods

[0385] Western blotting. ISC-enriched organoids cultured in 3D Matrigel with ENRCV media were passaged to 24-well plates in 3D Matrigel with ENRCV. At day zero, media were replaced with ENR or ENRCD with or without indicated compounds, and media were replaced every other day. At day six, cells were harvested from Matrigel by mechanical disruption and suspended in basal media. Cell pellets were lysed with Pierce® IP lysis buffer (ThermoFisher, 87787) containing Halt™ Protease Inhibitor Cocktail, EDTA-Free (ThermoFisher, 87785) after 3-minute centrifugation at 300 x g at 4°C. Cell extracts were resolved by NuPAGE® SDS-PAGE Gel system (ThermoFisher) and electroblotted onto polyvinylidene difluoride membranes using Criterion™ Blotter (Biorad). The membranes were blocked with 2% Blotting-Grade Blocker (Biorad, 1706404) in TBS-T (50 mM Tris-HCl, 150 mM NaCl, and 0.1% Tween 20, pH 8.0) and then probed with appropriate antibodies. Detection was performed using ECL Select™ Western Blotting Detection Reagent (Amersham, 45-000-999) and ImageQuant LAS4000 (GE Healthcare).

[0386] Lysozyme assay (3D culture). ISC-enriched organoids cultured in 3D Matrigel with ENRCV media were passaged to 48-well plates in 3D Matrigel with ENRCV. At day zero, media were replaced with ENR or ENRCD with or without indicated compounds, and media were replaced every other day. At day six, cells were washed twice with basal media and treated with carbamyl choline chloride (Sigma, C4382) for 3 hours. Media were collected and lysozyme activity was measured by EnzChek Lysozyme Assay Kit (ThermoFisher, E22013). Simultaneously, cell viability was measured by CellTiter-Glo® 3D Cell Viability Assay (Promega, G9681).

[0387] Population RNA-sequencing. Population RNA-seq was performed using a derivative of the Smart-Seq2 protocol for single cells. In brief, organoid media was aspirated and RLT+BME (Qiagen) was added to each well, and the plate was shaken for 30 minutes to fully lyse. Lysate was aliquoted into 4 identical fractions and stored at -80 °C until reverse transcription. RNA lysate was thawed and cleaned with a 2.2X SPRI ratio using Agencourt RNAClean XP beads (Beckman Coulter, A63987). RNA-seq was performed on a bulk population of sorted basal cells using Smart-Seq2 chemistry, starting with a 2.2X SPRI ratio cleanup. After oligo-dT priming, Maxima H Minus Reverse Transcriptase (ThermoFisher EP0753) was used to synthesize cDNA with an elongation step at 52 °C before PCR amplification (15 cycles for tissue, 18 cycles for sorted basal cells) using KAPA HiFi PCR Mastermix (Kapa Biosystems KK2602). Sequencing libraries were prepared using the Nextera XT DNA tagmentation kit (Illumina FC-131-1096) with 250 pg input for each sample. Libraries were pooled post-Nextera and cleaned using Agencourt AMPure SPRI beads with successive 0.7X and 0.8X ratio SPRIs and sequenced with an Illumina 75 Cycle NextSeq500/550v2 kit (Illumina FC-404-2005) with loading density at 2.2 pM, with paired-end 35-cycle read structure. A total of 96 samples were sequenced at an average read depth of 8.44 million reads per sample.

[0388] Organoid samples were aligned to the mm10 genome and transcriptome using STAR and RSEM. Differential expression analysis was conducted using the DESeq2 package for R. Genes regarded as significantly differentially expressed were determined based on an adjusted P value using the Benjamini-Hochberg procedure to correct for multiple comparisons with a false discovery rate <0.01. For module scoring with the in vivo Paneth and enteroendocrine cell-defining genes, Applicants used previously published gene sets and computed per-sample module scores through the Seurat package.

[0389] Crypt isolation and organoid culture. Small intestinal crypts were isolated from C57BL/6 mice of both sexes, aged between three and six months in all experiments. Crypts were then cultured in a Matrigel culture system. Briefly, crypts were resuspended in basal culture medium (Advanced DMEM/F12 with 2 mM GlutaMAX and 10 mM HEPES; Thermo Fisher Scientific) at a 1:1 ratio with Corning™ Matrigel™ Membrane Matrix - GFR (Fisher Scientific) and plated at the center of each well of 24-well plates. Following Matrigel polymerization, 500 µL of small intestinal crypt culture medium (basal media plus 100X N2 supplement, 50X B27 supplement; Life Technologies, 500X N-acetyl-L-cysteine; Sigma-Aldrich) supplemented with growth factors EGF - E (50 ng/mL, Life Technologies), Noggin - N (100 ng/mL, PeproTech) and R-spondin 1 - R (500 ng/mL, PeproTech) and small molecules CHIR99021 - C (3 µM, LC Laboratories) and valproic acid - V (1 mM, Sigma-Aldrich) was added to each well. ROCK inhibitor Y-27632 - Y (10 µM, R&D Systems) was added for the first 2 days of culture. Cells were cultured at 37°C with 5% CO2, and cell culture medium was changed every other day. After 6 days of culture, crypt organoids were isolated from Matrigel by mechanical dissociation. To expand enriched ISCs (ENR+CV/Y) or Paneth cells (ENR+CD), organoids were cultured in 24-well plates, suspended in 40 µL 3-D gels (50-50 GFR MATRIGEL®, basal culture media), with 500 µL of crypt media supplemented with the necessary growth factors and small molecules. ROCK inhibitor (Y) was added for the first two days of ISC culture following reconstitution from cryopreservation or TrypLE passaging to single cells. Cell culture medium was changed every other day. After 4 days of culture in ENR+CV, cell clusters were differentiated to PCs under the ENR+CD condition for 96-well short and long-term screens. For 384-well differentiation screens, 4-day ENR+CV clusters were passaged to single cells using TrypLE, replated and expanded another 3 days in 3-D ENR+CVY and then passaged directly into screens. Basal culture medium: Advanced DMEM/F12 with 2 mM GlutaMAX and 10 mM HEPES; Thermo Fisher Scientific.

[0390] High-throughput screening: 96-well format. For 96-well plate high-throughput screening, 4-day differentiated (ENR+CD) cell clusters in 3D Matrigel were transferred to a “2.5-D” 96-well plate culture system. Briefly, cell culture gel and medium were homogenized via mechanical disruption and centrifuged at 300 g for 3 min at 4°C. Supernatant was removed, and the pellet resuspended in basal culture medium repeatedly until the cloudy Matrigel was almost gone. On the last repeat, the pellet was resuspended in basal culture medium, the number of cell clusters counted, and centrifuged at 300 g for 3 min at 4°C. The cell pellet was resuspended in ENR-CD medium and plated using a Tecan Evo liquid handler at the center of each well of 96-well plates prepared with a 45 µL polymerized 70% Matrigel (30% basal media) coating in each well. Plates were centrifuged at 50 g for 1 min at 4°C to allow for cells to partially embed in the Matrigel coating. At end time points (following 2 days in culture and 3 hours of stimulation), lysozyme secretion and cell viability were assessed using Lysozyme Assay Kit and CellTiter-Glo 3D Cell Viability Assay (Promega), respectively, according to the manufacturers’ protocols. Briefly, 2.5-D 96-well culture plates are spun at high speed (>2000 g) for 5 min at RT to pellet cell debris, then 25 µL of conditioned supernatant is removed from the top of each well and mixed with 75 µL lysozyme working solution using a black 96-well flat bottom plate (LYZ screen plate). The LYZ screen plate is covered, shaken for 10 min, incubated for 20 min at 37°C, then fluorescence measured (494 nm/518 nm). 25 µL CTG 3D is added to each well of the 2.5-D culture plate, which is then shaken for 15 min before reading luminescence (integration time between 0.5 and 1 s).
Replicate strictly standardized mean difference (SSMD) was used to determine the statistical effect size of each data point (treatment and dose grouped by replicate) relative to the untreated (basal non-stimulated) control using the formula for the robust uniformly minimal variance unbiased estimate (UMVUE) under the assumption that treatment has the same variance as the control.
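As an illustration of a replicate-based SSMD estimate, the following Python sketch computes the UMVUE form commonly given for paired replicate differences, in which the mean of the per-replicate differences is divided by their standard deviation and scaled by a gamma-function coefficient. This is a sketch under stated assumptions: the function name and the example values are hypothetical, and the paired form shown here is an assumption that may differ from the equal-variance variant referenced above.

```python
import math
from statistics import mean, stdev


def paired_ssmd_umvue(differences):
    """UMVUE-style SSMD estimate from paired replicate differences.

    differences: one value per replicate, e.g. the normalized treatment
    value minus the control value on the same plate (assumed input format).
    """
    n = len(differences)
    if n < 3:
        # gamma((n - 2) / 2) is undefined for n = 2, so three or more
        # replicates are needed for this estimator.
        raise ValueError("at least 3 replicate differences are required")
    s = stdev(differences)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("replicate differences have zero variance")
    coefficient = (
        math.gamma((n - 1) / 2) / math.gamma((n - 2) / 2)
    ) * math.sqrt(2 / (n - 1))
    return coefficient * mean(differences) / s


# Hypothetical example with three biological replicates.
example_ssmd = paired_ssmd_umvue([1.0, 2.0, 3.0])
```

With three replicates the coefficient reduces to 1/sqrt(pi), roughly 0.564, so the estimate is noticeably smaller than the naive ratio of mean to standard deviation; this correction matters most at the small replicate counts typical of screens.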

[0391] Flow cytometry profiling of organoids in “2.5-D”. For flow cytometry profiling of the 6 validated small molecules, ISC-enriched ‘small clusters’ in 3D Matrigel culture were passaged to a “2.5-D” 96-well plate culture system for six days of ENR, or ENR+CD + drug culture in the same manner as described previously with the exception of plating in 96-well plates prepared with a polymerized 70% Matrigel coating in each well. Plates were centrifuged at 50 g for 1 min at 4°C to allow for cells to partially embed in the Matrigel coating. Drugs were pinned into their respective wells using the Tecan from a drug stamp plate. Media was changed at day three, including pinning of the drug treatments. At day six, cells were washed 3x with basal media, then harvested from Matrigel by mechanical disruption in TrypLE Express to remove Matrigel and dissociate organoids to single cells. After vigorous pipetting and incubation at 37°C for 20 min, dissociated organoids were strained through a 96-well filter plate with a 30-40 µm filter (Pall) into an ultra low-bind 96-well plate (Costar) by centrifuging at 300 x g for 3 min at 4°C. The cell filtrate was centrifuged again at 300 x g for 3 min at 4°C to pellet the cells. Cell pellets were resuspended in FACS buffer (2% FBS in PBSO), then transferred to an ultra low-bind 96-well plate for flow prep. Cells were stained with Zombie-violet viability dye (BioLegend) at 100X for viability staining and/or antibody staining solution. FITC-conjugated antibody for lysozyme and APC-conjugated antibody for CD24 were used at 100X dilution (BioLegend). Flow cytometry was performed using an LSR Fortessa (BD; Koch Institute Flow Cytometry Core at MIT). Flow cytometry data were analyzed using FlowJo X v10.1 software.

[0392] Antibodies and Reagents. An antibody against lysozyme was purchased from Abcam (Cambridge, Massachusetts, ab108508). KPT-330 (S7252) and KPT-8602 (S8397) were purchased from Selleck Chemicals (Houston, TX), and leptomycin B was purchased from Cayman Chemical (Ann Arbor, Michigan; 10004976).

[0393] Flow cytometry analysis (3D culture). ISC-enriched organoids cultured in 3D Matrigel with ENRCV media were passaged to 24-well plates in 3D Matrigel with ENRCV. At day zero, media were replaced with ENRCD with or without indicated compounds, and media were replaced every other day. At day six, cells were washed with basal media, and then harvested from Matrigel by mechanical disruption in TrypLE Express (ThermoFisher, 12605010) to remove Matrigel and dissociate organoids to single cells. After vigorous pipetting and incubation at 37°C for 20 min, dissociated organoids were strained through a 35 µm cell strainer into a tube (Falcon, 352235). The cell filtrate was centrifuged at 300 x g for 3 min at 4°C to pellet the cells. Cell pellets were resuspended in FACS buffer (2% FBS in PBS), and then transferred to an ultra low-bind 96-well plate (Corning, 7007). Cells were stained with Zombie-violet viability dye (BioLegend, 423107) at 100X for viability staining and/or antibody staining solution. FITC-conjugated antibody for lysozyme (Dako, F0372) and APC-conjugated antibody for CD24 (BioLegend, 138505) were used at 100X dilution. Flow cytometry was performed using an LSR Fortessa (BD; Koch Institute Flow Cytometry Core at MIT). The data were analyzed using FlowJo v10 software.

Example 2 - Inhibition of nuclear exporter Xpo1 rebalances intestinal stem cell differentiation towards Paneth cells

[0394] Here, Applicants utilize a chemical-induction approach to model Paneth cell differentiation in vitro in an organoid system, and perform a phenotypic screen to identify pharmaceutically-actionable and biologically significant pathways which enhance Paneth cell differentiation independent of Wnt or Notch cues. Organoid models, broadly defined as three-dimensional, stem cell-derived, tissue-like cellular structures, have provided a powerful new tool to understand the adult stem cell niche and key developmental pathways in stem cell differentiation (Sato et al., 2009; Yin et al., 2014). High-throughput phenotypic drug discovery provides an efficient platform for the discovery and selection of Paneth cell-specific modulators and requires no prior knowledge of molecular mechanism of action (Ranga et al., 2014); however, high-fidelity models amenable to high-throughput screening are necessary to find effective therapeutics. Cellular models with improved physiological representation of the intestine include tissue explants (Dionne et al., 2003), organoid models (Clevers, 2016), and organ-on-a-chip approaches (Bhatia and Ingber, 2014); however, these systems are often inflexible, heterogeneous and cannot be scaled, limiting their utility in early drug discovery. Building on a protocol to grow ISC-enriched organoids and their apparent progeny, including Paneth cells (Yin et al., 2014), has offered a method to overcome limitations in scalability and high-fidelity lineage-specific representation (Mead et al., 2018). Using a rationally-designed organoid model to conduct a phenotypic screen, Applicants uncover novel targets and clinically-relevant small molecules which enhance ISC differentiation to the Paneth cell lineage. The results identify a series of inhibitors targeting nuclear exporter XPO1. Applicants validate differentiation trajectories by XPO1 inhibition with single-cell RNA-seq of organoids, and orthogonal studies. Overall, Applicants provide a framework to construct organoid models of lineage-specific differentiation that can uncover pathways regulating differentiation and discover compounds controlling barrier tissue composition.

Small molecule screen for regulators of Paneth cell differentiation

[0395] To enable small molecule screening in a model of Paneth cell development, Applicants employ a method of chemically-enriching and differentiating intestinal organoids from ISCs to Paneth cells. Murine intestinal organoids are conventionally expanded as heterogeneous structures in a culture media enriched with growth factors and small molecules intended to mimic the ISC niche, namely epidermal growth factor - EGF (E), BMP-antagonist noggin (N), and the aforementioned Wnt-pathway enhancer, R-spondin1 (R). The cellular structures within these cultures contain cycling ISCs and immature absorptive and secretory progeny, including Paneth cells. Compositionally-enriched and functionally-mature Paneth cells can be generated from an expanded ISC-enriched organoid population (cultured indefinitely in the presence of CHIR99021 (C) and valproic acid (V)) using a chemical differentiation with the small molecules (C) and DAPT (D) in a 3-D Matrigel scaffold, as Applicants have previously shown (Mead et al., 2018; Yin et al., 2014). ENR+CD offers a reproducible model from which to screen for new biology along the axis of ISC to Paneth differentiation and is amenable to high-throughput screening by measuring Paneth cell-specific function with a commercially-available assay for secreted lysozyme (LYZ).

[0396] However, the original 3-D ENR+CD system required the intricate plating of organoids within temperature-sensitive Matrigel domes, which presented a barrier to the secretion of Paneth cell-derived LYZ, and limited the capacity to automate screening assays. Applicants circumvented these limitations by adapting the existing 3-D system into a 2.5-D pseudo-monolayer platform, as described by others (Langhans, 2018). In brief, Applicants plate ISC-enriched organoids within 384-well plates partially embedded on the surface of a thick layer of Matrigel (at the Matrigel-media interface) rather than fully encapsulated in the Matrigel structure. This technique allows for Matrigel plating, cell seeding, and media additions to be performed in a fully-automated setting, making it amenable to high-throughput applications, and allows LYZ secretion directly into cell culture media. Applicants therefore sought to use the scalable 2.5-D platform to demonstrate a proof-of-concept screen for the developmental process of Paneth cells in vitro, and to elucidate molecular pathways and small molecule agents which may afford an axis to enhance Paneth cell number therapeutically.

[0397] Applicants applied a small molecule library over a 6-day differentiation of ENR+CD organoids from ENR+CV ISC-enriched precursors (n = 3 biological replicates) and at day 6, measured functional secretion of lysozyme (LYZ) in media supernatants (Fig. 23A). Small molecules were pinned into distinct wells at four doses per compound (‘quadrant stamp’) at day 0 (day of plating) and day 3 (media change). At day 6, cells were assayed for phenotypic Paneth cell markers, namely basal LYZ secretion (LYZ.NS) and carbachol (CCh)-induced secretion (LYZ.S), and for ATP as a measure of cell number using a CellTiterGlo (ATP) assay multiplexed within a given well. Applicants used a target-selective inhibitor library (Selleck Chem) with 184 unique molecular targets and 433 compounds with high specificity, with many of those targets being implicated in stem cell differentiation (see Methods).

[0398] Screen plates were first normalized and assessed for reproducibility and quality. Briefly, raw values were log10 transformed, then a LOESS normalization was applied to each plate to remove systematic error and column/row/edge effects (with wells for controls and compounds being randomly distributed throughout each plate). Following normalization, fold change (FC) was calculated by subtracting the median of the plate (as control) from the LOESS normalized values. For each assay this resulted in approximately normal distributions, with lower-level tails corresponding to toxic compounds (Fig. 27A). Median FC was also used for each treatment-dose to determine a robust z-score. Following data normalization, plates were assessed for correlations between biological replicates for all three assays, indicating an acceptable level of reproducibility across plates and biological replicates (Fig. 27B). As well, the difference in FCs between no-cell controls and cell-containing positive control (ENR+CD) wells was statistically significant (adj. p < 0.0001), indicating positive control wells on average (across plates) contained viable cells (Fig. 27C). Discrimination of biological function was confirmed in the basal LYZ secretion assay (LYZ.NS), where non-stimulated positive controls had significantly greater secreted LYZ than that of no-cell controls (adj. p < 0.0001), and 10 mM CCh-stimulated positive controls were significantly greater than that of non-stimulated positive controls (adj. p < 0.0001) (Fig. 27C). Biological function was further observed in the stimulated LYZ secretion assay (LYZ.S), where a significant increase in secreted LYZ was observed in non-stimulated positive controls subsequently stimulated with 10 mM CCh versus non-stimulated (adj. p < 0.05) and those doubly non-stimulated positive controls versus no-cell controls (adj. p < 0.0001) (Fig. 27C).
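The plate normalization described above can be sketched in Python as follows. This is a simplified illustration, not a reproduction of the analysis: the LOESS step is deliberately omitted, the function names are hypothetical, and the robust z-score shown (median-centered and scaled by 1.4826 times the median absolute deviation) is one common definition that is assumed here because the text does not spell out the formula.

```python
import math
from statistics import median


def plate_fold_change(raw_values):
    """Log10-transform raw well values and subtract the plate median.

    In log space the subtraction is equivalent to dividing by the median
    well, so each result is a log10 fold change relative to the plate.
    The LOESS correction for row/column/edge effects is omitted here.
    """
    logged = [math.log10(v) for v in raw_values]
    plate_median = median(logged)
    return [x - plate_median for x in logged]


def robust_z_scores(values):
    """Median/MAD-based z-scores (assumed definition of 'robust z-score')."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        raise ValueError("median absolute deviation is zero")
    scale = 1.4826 * mad  # makes the MAD comparable to a normal std. dev.
    return [(v - center) / scale for v in values]


# Hypothetical raw readings from three wells of one plate.
example_fc = plate_fold_change([10.0, 100.0, 1000.0])
example_z = robust_z_scores(example_fc)
```

Using the plate median as the reference relies on the same assumption stated in the text, namely that most wells on a plate are not biologically active.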

[0399] Primary screen ‘hits’ were defined as having replicate strictly standardized mean differences (SSMDs) for both LYZ assays greater than the calculated optimal critical value (β_α1 = 0.997) (Fig. 23B, data in Table 5A). β_α1 was determined as the intersection of the false positive and false negative levels (FPL and FNL) for up-regulation SSMD-based decisions (Zhang, 2011). This cutoff identified treatment-dose combinations (grouped by biological replicate) that had a statistical effect size greater than the FNL and FPL levels of the plates, i.e., treatments that had a statistically significant effect on increasing LYZ.NS and LYZ.S secretion without regard to viability (though most hits per this criterion had positive effects on cellular ATP). Following these criteria, the 47 selected treatment-dose hits are hits either in both LYZ assays or in all three assays, meaning that hits improve Paneth cell function and/or survival. The 47 hits were narrowed down to 15 treatment-dose combinations using the z-scored FC to select for combinations that elicited a biological effect in the top 10% of values for both LYZ assays relative to the plate (> 1.282). Thus, 15 drugs (covering 18 treatment-dose conditions) from 13 unique molecular targets were identified as primary screen hits (Fig. 23C). For molecular targets with more than one hit treatment-dose (TGF-β inhibitors, PI3K/Akt/mTOR inhibitors, and Tyr kinase inhibitors), only the most robust treatment-dose was selected for further investigation. For TGF-β inhibitors, SB431542 performed almost identically to LY215799 on both LYZ assays but outperformed LY215799 on increasing cell viability and was thus selected as a hit. Both PI3K/Akt/mTOR inhibitors were selected as hits because their targets are substantially different and could mechanistically show a p70-specific effect or a multi-target effect within the whole pathway. For Tyr kinase inhibitors, only Rebastinib was excluded from the hit list due to underperformance on all three assays compared to Bosutinib, with which it shares a primary target. Optimal doses for treatments that were the sole hit in their respective pathway, namely Safinamide mesylate (catecholamine metabolism), Rolipram (cAMP), Varespladib (phospholipase inhibitor), Dapagliflozin (glucose transporter), KPT-330 (nuclear transporter), and finasteride (5α-reductase inhibitor), were determined by assessing dose response curves for both LYZ assays primarily and the CTG assay secondarily.
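The two-stage hit selection in paragraph [0399] can be sketched as follows. This is an illustration only: the SSMD is computed with the simple method-of-moments estimate for replicate values (mean divided by sample standard deviation), which is an assumption, as the text does not specify which SSMD estimator was used. The function names are hypothetical. The z cutoff of 1.282 is recovered as the 90th percentile of the standard normal distribution.

```python
from statistics import NormalDist, mean, stdev

def replicate_ssmd(fold_changes):
    """Method-of-moments SSMD for one treatment-dose: mean of replicate FCs over their sample SD."""
    return mean(fold_changes) / stdev(fold_changes)

def passes_ssmd(lyz_ns, lyz_s, critical=0.997):
    """A treatment-dose passes only if BOTH LYZ assays exceed the critical value."""
    return replicate_ssmd(lyz_ns) > critical and replicate_ssmd(lyz_s) > critical

# Top 10% of a standard normal distribution corresponds to z > ~1.282.
TOP_10_PERCENT_Z = NormalDist().inv_cdf(0.90)

def passes_effect_size(z_ns, z_s, cutoff=TOP_10_PERCENT_Z):
    """Second filter: z-scored FC in the top 10% for both LYZ assays."""
    return z_ns > cutoff and z_s > cutoff
```

Requiring both filters separates treatments with a reproducible effect (SSMD) from those with a large effect relative to the plate (z-score).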

[0400] To validate primary screen hits against an ENR+CD control, identify narrowed dose-response ranges, and further narrow hits to only the most potent activator(s) of Paneth cell differentiation, Applicants performed secondary screening with the 13 primary screen compounds (Table 5B). Drugs were tested at a narrowed dose range around each treatment’s identified optimal dose from the primary screen (4X below, 2X above and below). Hits in the validation screen were chosen by SSMDs for both LYZ assays greater than the calculated optimal critical value (β_α1 = 0.889), with 6 compounds passing this threshold (Fig. 27D). The same treatment-dose conditions passing the SSMD threshold also had the greatest biological effect, and in particular one compound, KPT-330, a known Xpo1 inhibitor, had two doses representing the greatest and near-greatest biological effects (~50-75% increases in LYZ.NS and LYZ.S relative to ENR+CD control) (Fig. 23D).

Table 5. Results from primary and secondary lysozyme (LYZ) secretion screening grouped by compound and dose, reported as log2 fold change (FC), standard z score, and strictly standardized mean difference (SSMD).

Table 5A. Primary Screen

Table 5B. Secondary Screen

[0401] The results of primary and secondary screening reflect a mixture of potential effects arising from small molecule treatment, any of which may result in increases in total LYZ secretion. These include contributions from enhanced Paneth cell differentiation, altered Paneth cell activity, and changes in total cell number concurrent with differentiation. To better inform how the 6 compounds identified in the screen increased total secreted LYZ, and to isolate only those which robustly enhance Paneth differentiation, Applicants next utilized flow cytometry to measure the changes in Paneth cell representation within the treated organoids. As another measure to ensure that Applicants do not select for compounds which manifest their behavior only in specific in vitro settings, Applicants performed the analyses in the conventional 3-D culture method to control for 2.5-D culture system-specific effects. Single live cells were selected by several gating strategies and Paneth cells were identified as lysozyme-high, CD24-mid, side scatter-high (SSC-high) (Fig. 27E). The 6 hit compounds were provided at the most potent dose from 2.5-D screening, with organoids in ENR+CD media for 6 days with media change every other day. Only one compound, KPT-330, the most potent compound in validation screening, significantly enhanced the mature Paneth cell population within organoids, suggesting that KPT-330 induces Paneth cell (PC) differentiation (Fig. 23E). To evaluate whether the culture media supplements C and D may alter the effects of the 6 hits, Applicants also performed the lysozyme assay in the canonical ENR culture condition in 3-D (because Paneth cells exist in an immature state within ENR, Applicants were unable to robustly quantify Paneth cell number via flow cytometry). This again showed that only KPT-330 enhances Paneth cell-specific activity in the conventional organoid culture condition, and led Applicants to focus solely on KPT-330 and its potential mechanism of Xpo1 inhibition going forward (Fig. 27F).

Validating Xpo1 as a molecular target enhancing Paneth cell differentiation

[0402] Applicants next sought to confirm the predicted on-target activity of KPT-330 and demonstrate dose-dependency of treatment in enhancing Paneth cell differentiation. KPT-330 is a first-in-class, orally administered, FDA-approved drug against multiple myeloma that targets a nuclear transporter, XPO1 (also known as CRM-1).

[0403] Administration of KPT-330 at concentrations up to 160 nM for 6 days (in primary screening higher concentrations proved toxic) showed enhanced Paneth cell activity in a dose-dependent manner, with 160 nM of KPT-330 as the most effective dose among tested concentrations, as evidenced by the LYZ secretion levels in both basal and CCh-stimulated conditions (Fig. 23F). To confirm that KPT-330 is acting via the inhibition of Xpo1, Applicants used two additional Xpo1 inhibitors: KPT-8602, a second-generation compound based on KPT-330, and leptomycin B, a canonical inhibitor. Flow cytometry analyses revealed that both KPT-8602 and leptomycin B increased the proportion of PCs in the organoids (Fig. 23G). Additionally, LYZ secretion assays with the additional Xpo1 inhibitors showed similar Paneth cell enrichment in both conventional (ENR) and Paneth-differentiation (ENR+CD) culture conditions (Fig. 27G, 27H). Applicants also utilized Western blotting, with a different antibody than that used in flow cytometry, as an alternative method to assess the abundance of lysozyme within organoids and thereby indirectly measure Paneth cell enrichment. LYZ expression levels per unit weight were enhanced by all three XPO1 inhibitors (Fig. 27I), consistent with the results of LYZ secretion assays and flow cytometry analyses.

[0404] KPT-330 is a type of small molecule known as a selective inhibitor of nuclear export (SINE); these molecules act by suppressing the Xpo1-regulated nuclear export of multiple proteins and mRNAs from the nucleus to the cytoplasm, including genes involved in stem cell maintenance and differentiation as well as inflammatory stress response (Sendino et al., 2018). Proteins shuttled by the transporter Xpo1 are marked with a nuclear export signal (NES). Additionally, Xpo1 is known to regulate cell cycle through Xpo1’s export-independent role in the regulation of mitosis (Forbes et al., 2015). Based on this evidence, Applicants hypothesized that Xpo1 inhibition was providing for enhanced Paneth cell differentiation by directing ISCs to modulate their differentiation trajectories through alterations in developmental signaling within the nucleus and/or interference with cell cycle.

Longitudinal scRNA-seq of differentiation reveals dynamic population shifts with Xpo1 inhibition resulting in Paneth cell enrichment

[0405] To test the hypothesis that KPT-330 drives Paneth differentiation by altering ISC behavior, Applicants utilized single-cell RNA-sequencing (scRNA-seq) via the recently updated Seq-Well microwell technology (Hughes et al., 2019) to perform a longitudinal comparison between untreated and KPT-330 treated organoids over the same 6-day differentiation, with a particular emphasis on early timepoints (Fig. 24A).

[0406] Applicants collected 18 samples corresponding to pre-differentiation ENR+CV organoids (n = 2) and both ENR+CD and ENR+CD + KPT-330 (160 nM) at 6 hours (0.25 days, n = 1), 1 day (n = 1), 2 days (n = 1), 3 days (n = 2), 4 days (n = 1), and 6 days (n = 2). For time points beyond 2 days, media was refreshed every other day. Following the Seq-Well protocol, Applicants followed a standard library preparation, sequencing, and alignment process (see Methods). Due to a technical failure in membrane sealing during Seq-Well, one sample (day 3 ENR+CD) was not included in analysis. Prior to analysis, cell-by-gene digital expression matrices were pre-processed to remove cellular barcodes with fewer than 500 unique genes, barcodes with greater than 35% of unique molecular identifiers (UMIs) corresponding to mitochondrial genes, low outliers in standardized housekeeping gene expression (Tirosh et al., 2016), barcodes with greater than 30,000 UMIs, and cellular doublets identified through manual inspection and use of the DoubletFinder algorithm (McGinnis et al., 2019). The resulting dataset consists of 19,877 cells spanning the 17 samples analyzed. UMI, percent mitochondrial, and detected gene distributions are similar across samples, with differences likely due to variations in library preparation and sequencing depth (Fig. 28A). To better control for potential batch effects that may arise from differences in library preparation, dimensional reduction and clustering were performed following normalization with regularized negative binomial regression as implemented in Seurat V3 via SCTransform (Hafemeister and Satija, 2019).
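The numeric quality-control thresholds stated in paragraph [0406] can be expressed as a simple per-barcode filter. The sketch below is illustrative and assumes per-barcode summary arrays are already available; the function and argument names are hypothetical. The housekeeping-gene outlier filter and doublet removal (manual inspection and DoubletFinder) are not reproduced here.

```python
import numpy as np

def qc_mask(n_genes, n_umis, mito_fraction,
            min_genes=500, max_umis=30_000, max_mito=0.35):
    """Return a boolean mask of barcodes to KEEP under the three stated thresholds."""
    n_genes = np.asarray(n_genes)
    n_umis = np.asarray(n_umis)
    mito_fraction = np.asarray(mito_fraction, dtype=float)
    return (
        (n_genes >= min_genes)          # drop barcodes with fewer than 500 unique genes
        & (n_umis <= max_umis)          # drop barcodes with more than 30,000 UMIs
        & (mito_fraction <= max_mito)   # drop barcodes with >35% mitochondrial UMIs
    )
```

Applying the mask to the columns of a gene-by-cell matrix (or the rows of a cell-by-gene matrix) would yield the filtered matrix passed on to normalization.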

[0407] Unsupervised UMAP reduction of the complete dataset reveals the time-course structure along with branches suggestive of distinct lineages arising over the course of differentiation (Fig. 24B). Louvain clustering separated the data into 8 clusters, which Applicants manually annotated (Fig. 24C) as mature epithelial and stem cells. Importantly, Applicants observed that each cluster possessed highly similar quality metrics (Fig. 28B), while being enriched for expression of canonical markers of multiple intestinal epithelial cell types (Fig. 24D). To contextualize the cellular identity of the 8 clusters, Applicants used lineage-defining gene sets from a murine small intestinal scRNA-seq atlas (Haber et al., 2017) to score each cell relative to all others using the strategy of Tirosh et al., 2016 (Fig. 28C). In addition to showing that the UMAP branches correspond to lineages along trajectories of differentiation over real time from a stem-like pool, scores also correspond well with unique clusters of the 8 identified (Fig. 24E). Lineage module scoring combined with the expression of select lineage-defining genes allowed Applicants to classify the 8 clusters as 3 stem-like, 2 enterocyte, 2 Paneth, and 1 enteroendocrine, aligning with the expectation that ENR+CD differentiation should enrich for secretory epithelium, principally Paneth and to a lesser extent enteroendocrine. To better contextualize the 3 stem-like clusters, and assess potential physiological relevance, Applicants again performed module scoring over gene sets identified to correspond to known ISC subsets in vivo (Biton et al., 2018) (Fig. 28D). Applicants see clear alignment with the type III and slight enrichment for type I ISCs per the nomenclature of Biton et al. Additionally, there is slight enrichment for a distinct type II (Fig. 28E), though it may also be that this population is a transient intermediate between stem populations, sharing markers with the other two (Fig. 24D). Accordingly, Applicants adopted the naming scheme of Biton et al. to describe the three ISC populations: type I appearing enriched for canonical markers of ISCs, including LGR5; type III being most distinguished by the high expression of many genes involved in cell cycle; and type II appearing as a transitory or intermediate population between I and III.
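The idea behind the module scoring used in paragraph [0407] can be shown with a deliberately simplified sketch: each cell's score is the mean expression of the lineage gene set minus the mean expression of a control gene set. This is not the full strategy of Tirosh et al., 2016, which selects control genes from expression-matched bins; here the control set is simply supplied by the caller, and all names are hypothetical.

```python
import numpy as np

def module_score(expression, gene_idx, control_idx):
    """Simplified per-cell module score.

    expression: cells x genes array of normalized expression values.
    gene_idx / control_idx: column indices of the gene set and of the control genes.
    """
    expression = np.asarray(expression, dtype=float)
    return expression[:, gene_idx].mean(axis=1) - expression[:, control_idx].mean(axis=1)
```

A positive score indicates that a cell expresses the gene set above the background defined by the control genes, which is how clusters can be matched to lineage or ISC-subset signatures.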

[0408] Having annotated the complete dataset, Applicants next sought to explore what factors distinguish the organoids treated with KPT-330 versus control. Importantly, in the combined dataset Applicants see that across all three conditions (day 0 ENR+CV, and day 0.25-6 ENR+CD and ENR+CD + KPT-330) there are no unique cell populations, but rather shifts in the abundance of cell types with KPT-330 treatment (Fig. 28F). In fact, looking over the time course, Applicants see clear changes in the relative abundance of different cell populations between KPT-330 treated and control. Both conditions begin with over 75% of cells as either stem II or stem III, but by day 0.25 Applicants see a relative shift from mostly stem III to mostly stem II (from rapidly cycling to transitory stem). Further, by day 2 and extending through day 6 Applicants see the emergence of stem I, accounting for approximately 25% of the cells in the control condition, but a much smaller proportion in KPT-330 treated organoids. Looking at the differentiating populations, Applicants see the rapid emergence of early enterocytes and early Paneth cells at day 1, with the continued differentiation to enterocytes and eventual disappearance of enterocytes by day 4. Early Paneth appears to crest with enterocytes, followed by a transition to Paneth cells continuing to day 6. The major distinguishing factors between KPT-330 treated and control in the differentiating populations are the relative early enrichment for enterocytes, later enrichment for Paneth cells, and suppression of the non-cycling stem I population (Fig. 24E). To better quantify the differences in representation between KPT-330 treated and control over time, Applicants constructed a 2x2 contingency table of each individual cell type relative to all others at each timepoint where that cluster accounted for at least 1% of cells in both KPT-330 and control samples. Applicants present the relative enrichment or depletion of a cell population with KPT-330 treatment over time as the odds ratio (OR) with a corresponding 95% confidence interval. This again shows the relative depletion of stem I (and stem II & III) and enteroendocrine cells over time along with the corresponding enrichment of enterocytes and Paneth cells (Fig. 24G). This agrees to a large extent with the flow cytometry observations of a 2-fold increase in mature Paneth cells at day 6 of differentiation with KPT-330 (the day 6 Paneth cell OR from scRNA-seq being ~2), while also showing the unexpected early enrichment of enterocytes and longer-term depletion of a subset of stem cells, stem I.
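The 2x2 contingency analysis in paragraph [0408] can be sketched as below. The odds ratio follows directly from the table of (cell type vs. all others) by (KPT-330 vs. control). The 95% confidence interval is computed with the standard Wald approximation on the log odds ratio, which is an assumption, since the text does not state how the interval was derived. The function name and the example counts are hypothetical.

```python
import math

def odds_ratio_ci(treated_in, treated_total, control_in, control_total, z=1.96):
    """Odds ratio of a cell type in treated vs. control, with a Wald 95% CI.

    All four cells of the 2x2 table must be non-zero.
    """
    a, b = treated_in, treated_total - treated_in      # treated: in cluster / all others
    c, d = control_in, control_total - control_in      # control: in cluster / all others
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 20 of 100 treated cells vs. 10 of 100 control cells in a cluster.
print(odds_ratio_ci(20, 100, 10, 100))
```

An OR above 1 indicates enrichment of the population with KPT-330 treatment and an OR below 1 indicates depletion; an interval that excludes 1 supports a real shift at that timepoint.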

Xpo1 inhibition drives cycling ‘stem II / III’ ISCs into a pro-differentiation state via stress response and suppression of mitogen signaling

[0409] The observation of compositional changes over the course of differentiation is consistent with Xpo1 inhibition acting as a pro-differentiation agent within the stem II / III populations most abundant at the beginning of differentiation. In fact, Applicants see that in non-treated organoids, the expression of Xpo1 is significantly enriched in the actively cycling stem III population (Fig. 25A & Fig. 29A). Applicants also observe that the expression of genes known to contain a NES, which is required for nuclear efflux via Xpo1 and is therefore affected by Xpo1 inhibition, is enriched in the stem cell populations, most significantly in stem III (Fig. 25B & Fig. 29B) (Fu et al., 2013). More specifically, Applicants know that Xpo1 is an important mediator of nuclear signaling processes including the mitogen-activated protein kinase (MAPK) pathway, NFAT, AP-1, and Aurora kinase activity during cell division (Sendino et al., 2018; Sun et al., 2016). Applicants observe the expression of many key mediators in these pathways within the stem populations, and see particular stem III-enrichment in members of MAPK (Mapk1, Mapk9, Mapk13, Mapk14), NFAT (Nfatc3), AP-1 (Atf1), and Aurora kinases (Aurka, Aurkb) (Fig. 29C).

[0410] To further establish whether the stem II / III population is the principal cellular target of Xpo1 inhibition, Applicants leveraged the dynamic nature of the system and exposed differentiating organoids to KPT-330 over varied timespans. Because the abundance of stem, differentiating, and mature populations changes through this course, by inhibiting Xpo1 over every continuous 2-, 4-, and 6-day interval in the 6-day differentiation and measuring final abundance and function of mature Paneth cells at the end of differentiation, Applicants can infer the relative effect of Xpo1 inhibition on each cell type (Fig. 25C). Applicants see that of all 2-day KPT-330 treatments, day 0-2 results in the greatest enrichment in mature Paneth cells, with longer durations of exposure following day 2 providing additional, albeit lesser, enrichment. Further, Applicants see that day 2-4 produces moderate enrichment, while day 4-6 is no different than untreated (by flow cytometry) or minorly enriched (by LYZ secretion assay) (Fig. 25D & Fig. 29D). Using an additional SINE, KPT-8602, Applicants observe similar enrichment behavior as with KPT-330 (Fig. 29E). These data support that Xpo1 inhibition is principally altering stem II / III differentiation: the largest effects of Xpo1 inhibition are concurrent with periods in the differentiation course where stem II / III populations are most abundant. However, these data also suggest that the effect of Xpo1 inhibition may not be entirely stem-dependent, given the lesser, but significant, increases in Paneth cell number and function with later treatment.

[0411] To better understand the pleiotropic effects of Xpo1 inhibition which may mediate differentiation within stem II / III, Applicants examined the differentially expressed genes between KPT-330 treated and untreated stem II / III populations in the earliest stages of differentiation when they are most abundant (day 0.25-2). Both the most significantly enriched (Xpo1) and depleted (Kpnb1, a nuclear importin) genes suggest that these cells are significantly impacted by KPT-330 treatment and are enacting changes in expression to reestablish homeostasis of nuclear cargo transit (Fig. 25E & Table 3A). Notable genes with significantly increased expression include Arrdc3 (known to regulate proliferative processes), Slc16a6 (a principal transporter of ketone bodies, recently shown to be instructional in ISC fate decisions), Tbrg1 (a growth inhibitor), and Atf3 (regulates stress response in ISCs) (Cheng et al., 2019; Draheim et al., 2010; Jadhav and Zhang, 2017; Zhou et al., 2017a). Genes down-regulated by KPT-330 treatment appear related to proliferation and cell cycle, notably the marker Mki67. Of these responses to KPT-330 treatment within stem II / III, Xpo1, Atf3, Trp53 (p53), Ccnd1, Cdk4/6, and Cdkn1a (p21) expression are increased across all cell types (at all times), but with significant differences in the fraction of cells in each which express each gene (Fig. 29F). This suggests that there are both stem II / III specific responses and pan-epithelial responses to Xpo1 inhibition.

[0412] To better contextualize the transcriptional response to KPT-330 treatment in stem II / III cells, Applicants performed gene set enrichment analyses (GSEA) using the v7 molecular signatures database (MSigDB) hallmark collection, which represents specific well-defined biological states or processes across systems (Liberzon et al., 2015; Subramanian et al., 2005). Significant gene sets with FDR < 0.05 reveal two major programs differentially enriched following KPT-330 treatment, with enrichment or depletion quantified through the GSEA normalized enrichment score (a quantification of the degree to which a gene set is over-represented at either extreme of the full ranked list of differentially expressed genes) (Fig. 25F & Table 3B). KPT-330 treatment suppresses programs downstream of mitogen-driven signaling, notably targets of E2F and MYC, as well as genes involved in cell cycle (G2M checkpoint), while up-regulating programs broadly resembling a complex stress response (NF-κB signaling, hypoxia, inflammatory response). Compellingly, these responses are in strong agreement with the known effects of Xpo1 inhibition in the context of malignancy.

[0413] Applicants next sought to examine whether either of these responses, as embodied by the significant differentially expressed genes in stem II / III (day 0.25-2), may be pan-epithelial or restricted to the actively cycling stem II / III populations. Interestingly, Applicants see that the stress response module is substantially increased across all cell populations during differentiation, with the greatest effects in the stem II / III and early mature cell populations, and lowest effect in the mature Paneth cells (Fig. 25G). Conversely, Applicants see that the mitogen signaling module is selectively decreased in the stem II / III and early enterocyte populations relative to all others. This selectivity is likely due to the fact that the majority of mitogen signaling occurs within the proliferative stem II / III populations, and is closer to a floor in the mature populations. Combined, Applicants see that the SINE-induced stress response appears to be a pan-epithelial response, while the modulation of mitogen signaling is restricted to the actively cycling stem cells. Recent work on mitogen and stress response control of re-entry into cell cycle may provide important context on the necessity of overlap of these two responses (Yang et al., 2017). Specifically, mother cells will transmit P53 protein and Ccnd1 transcripts to daughter cells, which, based on the abundance of transmitted signal, will either immediately re-enter cell cycle or commit to a quiescent state.

[0414] Transitions between quiescence and proliferation within the ISC niche have important roles in tissue homeostasis and regeneration. Quiescent pools of crypt-residing or adjacent cells serve as reserve populations which, upon injury-dependent depletion of cycling stem cells, will re-establish cycling progenitors and maintain homeostatic tissue regeneration (Ayyaz et al., 2019; Yousefi et al., 2017). Further, Applicants know that a transient quiescent intestinal stem cell state facilitates secretory enteroendocrine cell differentiation (Basak et al., 2017). To explore whether a similar phenomenon is occurring following Xpo1 inhibition, Applicants mapped the gene modules identified by Basak et al. of active and quiescent ISCs onto the early (day 0.25-2) stem II / III cells, where Applicants do in fact see a transition into quiescence with KPT-330 treatment (Fig. 25H). Combined with the observation that Xpo1 inhibition blocks the emergence of the non-cycling stem I population, the data suggest a model wherein SINE-induced stress response and disruption of mitogen signaling instruct proliferative progenitors to exit cell cycle and differentiate preferentially towards the Paneth and enterocyte lineages, while limiting the accumulation of ‘reserve’ non-cycling stem I cells and enteroendocrine cells.

[0415] Applicants sought to clarify this conceptual model with the use of additional small molecule inhibitors known to modulate discrete components of the hypothesized differentiation process. Applicants began by treating organoids along the ENR+CD differentiation course with a small molecule inhibitor of AP-1, SR11302, both alone and in combination with KPT-330, to test whether AP-1 is critical to the SINE-induced stress response. Applicants observe that when added with KPT-330, SR11302 significantly decreases functional LYZ secretion at the end of the 6-day differentiation; however, when added alone, SR11302 also decreases functional LYZ secretion (Fig. 25I). This suggests that AP-1 signaling is an important mediator of Paneth differentiation from ISCs, though it is only suggestive, not conclusive, that Xpo1 inhibition acts on AP-1 to mediate its pro-differentiation effect. Applicants next tested whether P53 is an important downstream mediator by repeating the above assay with two known P53 inhibitors, pifithrin-α (PFTα) and serdemetan (serd.). Across a wide dose range, neither P53 inhibitor altered Paneth cell differentiation either alone or in combination with KPT-330, suggesting that the KPT-330 stress response is not dependent on P53 signaling (Fig. 29G). With the same assay, Applicants began to probe the mitogen signaling response by adding the MEK inhibitor cobimetinib (shown by Basak et al. to induce the quiescent ISC population) in combination with KPT-330. Interestingly, cobimetinib alone did not significantly alter Paneth cell differentiation; however, when used in combination with KPT-330 it significantly enhanced Paneth cell differentiation as measured by functional LYZ secretion (Fig. 25I). Applicants next sought to test whether regulation of cell cycle via mitogen signaling may be an important downstream mediator following Xpo1 inhibition. Inhibition of Cdk4/6 with palbociclib alone and in combination with KPT-330 did not alter Paneth cell differentiation (Fig. 29H); however, inhibition of aurora kinase B with ZM447439 did significantly increase Paneth cell differentiation (notably, ZM447439 was also a lower-effect-size hit of the primary screen) (Fig. 29I). Combined, these experiments suggest that the SINE-induced stress response may be mediated by AP-1 and not P53, that suppression of mitogen signaling is not dependent on ERK but is further enhanced by ERK inhibition, and that the non-export-related action of Xpo1 during cell cycle (which interacts with aurora kinase) may further contribute to the observed pro-differentiation effect.

[0416] In total, the analyses suggest that Xpo1 inhibition drives Paneth cell enrichment via the induction of a pan-epithelial stress response and suppression of mitogen signaling within the cycling ISC population (stem II / III). This response results in the cycling stem population becoming transiently quiescent, thereby favoring differentiation towards the Paneth and enterocyte lineages (the latter being a short-lived population relative to the former) over a more balanced transition to the mature lineages and the quiescent stem pool (stem I) (Fig. 25J).

Low dose oral Xpo1 inhibitor administration in vivo induces selective expansion of the Paneth cell compartment

[0417] Based on the understanding of Xpo1 inhibition in stem-enriched organoids, Applicants hypothesized that SINE compounds may selectively enrich the epithelium for Paneth cells in vivo. The findings in organoids suggest that SINE treatment is independent of the niche cues of Wnt and Notch and acts specifically on cycling stem cells, which are abundant in the epithelial crypts. While Xpo1 inhibition may enrich for both Paneth cells and enterocytes, by virtue of the relatively long Paneth cell lifespan (Ireland et al., 2005), Applicants would expect a longer-term accumulation of Paneth cells in vivo relative to enterocytes. Additionally, because the organoid data suggest that Xpo1 inhibition does not expand the stem cell pool, but rather rebalances patterns of differentiation, Applicants may expect an increase in Paneth cell number following SINE treatment in vivo to be both restricted to the spatial constraints of non-hypertrophic crypts and proportional to the total abundance of cycling progenitors, suggesting that the total increase in Paneth cell number may be modest and require a particularly sensitive method of quantification.

[0418] Following a similar protocol as previously reported for SINE treatment in the context of cancer (Arango et al., 2017; Azmi et al., 2013; Hing et al., 2016; Zheng et al., 2014), KPT-330 was administered at a dose of 10 mg/kg via oral gavage every other day over a two-week span in C57BL/6 wild-type mice, and body weight was monitored for any clear toxicity. Within the treatment group Applicants observed significant weight loss (Fig. 30A) indicative of toxicity. Given animal weight loss on the standard chemotherapeutic dosage regimen, and additional evidence that sustained dosage of SINEs adversely impacts T cell populations (Tyler et al., 2017), Applicants sought to explore dosing regimens well below 10 mg/kg, to see if a pro-Paneth phenotype may exist below potential toxicities. Accordingly, Applicants repeated the two-week study with oral gavage of KPT-330 every other day at doses corresponding to 50-fold (0.2 mg/kg), 200-fold (0.05 mg/kg), and 1,000-fold (0.01 mg/kg) below the 10 mg/kg dose conventionally used in the setting of cancer. Applicants tracked animal weight every other day, and at day 14 collected the proximal and distal thirds of the small intestine for histological quantification of Paneth, stem, and goblet populations (Fig. 26A). In this lower-dose regime, Applicants observe no significant changes in animal weight at any dose, suggesting Applicants are outside the range of gross toxicity (Fig. 30B). Samples were prepared for histology by the ‘swiss-roll’ technique and, following staining, were blinded and randomized before manual counting of well-preserved features. Paneth cells were counted within well-preserved crypts, with at minimum 30 crypts quantified per animal (representative images Fig. 30C), and then averaged to get mean Paneth cells per crypt in distal and proximal thirds of the small intestine. Compellingly, within this lower dose regime Applicants observed significant increases in Paneth abundance in both the proximal and distal small intestine at doses of 0.01 mg/kg, and proximally at 0.2 mg/kg. Applicants additionally quantified the abundance of Olfm4+ stem cells as well as PAS+ goblet cells within the same animals to ascertain whether the effect of SINE treatment was restricted to the Paneth cell compartment (representative images Fig. 30D,E). Interestingly, Applicants observe a significant increase in Olfm4+ stem cells within the distal SI at doses of 0.01 mg/kg, corresponding to the group with the greatest increase in Paneth cells (Fig. 26C), suggesting a potential expansion of the stem cell niche commensurate with increased Paneth cell abundance. Applicants did not observe any significant changes in the developmentally related goblet cell population (Fig. 26D). In total, these data suggest that SINE treatment may be a meaningful approach to specifically increase Paneth cell abundance in vivo, and further validate the framework for using models of organoid differentiation for small molecule screening.
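The dose arithmetic and per-animal crypt averaging in paragraph [0418] can be checked with a short sketch. The fold reductions and the 30-crypt minimum come from the text; the function names and the example counts are hypothetical.

```python
from statistics import mean

REFERENCE_DOSE_MG_PER_KG = 10.0  # conventional dose used in the cancer setting

def reduced_dose(fold_below):
    """Dose (mg/kg) that is `fold_below`-fold lower than the 10 mg/kg reference."""
    return REFERENCE_DOSE_MG_PER_KG / fold_below

def mean_paneth_per_crypt(counts_per_crypt, min_crypts=30):
    """Average Paneth cells per crypt for one animal, enforcing the stated minimum of 30 crypts."""
    if len(counts_per_crypt) < min_crypts:
        raise ValueError(f"need at least {min_crypts} crypts, got {len(counts_per_crypt)}")
    return mean(counts_per_crypt)

for fold in (50, 200, 1000):
    print(fold, reduced_dose(fold))
```

The three fold reductions reproduce the 0.2, 0.05, and 0.01 mg/kg doses named in the text.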

Discussion

[0419] Paneth cells of the small intestine are involved in a broad range of activities including maintenance of the small intestinal epithelial barrier, shaping the gut microbiota, and communicating with the immune system. With the Paneth cell differentiation model Applicants have advanced a scalable platform to probe for drivers of Paneth cell differentiation from ISCs. Building upon previously established small molecule-driven enrichment and differentiation of LGR5+ ISCs into secretory and absorptive progeny of the intestinal epithelium, Applicants set forth to characterize the secretory cells derived from WNT activation and Notch inhibition with a goal of advancing a Paneth-cell enriched culture. By assessing ISC-enriched organoid differentiation, Applicants see greatly increased markers of Paneth cells, after the described conditions.

[0420] The present invention provides motivation for the delivery of low doses of small molecules that inhibit nuclear export directly to the crypts. Additionally, there is motivation to use methods of delivery such that low doses are delivered to the crypts for sustained periods. Applicants can test for the ideal window of measurement, as PCs are long-lived. Applicants hypothesize that if PCs are measured after 2+ weeks there will be further accumulation. Applicants hypothesize that barrier function can be increased if SINEs (pro-differentiation) are combined with agents that increase the stem cell pool, such as CHIR or VPA. The results provide for the pleiotropic nature of Xpo1 inhibition, as Xpo1 inhibition was previously used as a chemotherapeutic approach at high doses.

[0421] The cells described herein show rapid transcriptional maturity and are morphologically similar to in vivo cells. The cell enrichment described herein is far superior to existing models. The organoid model enables specific investigation of the dynamics of single cell types, revealing signals that would otherwise be obscured in vivo.

Methods

[0422] Mice. Proximal small intestine was isolated from wild-type C57BL/6 mice of both sexes, aged between one and six months in all experiments.

[0423] Crypt isolation and culture. Small intestinal crypts were isolated as previously described (23). Briefly, the small intestine was harvested, opened longitudinally, and washed with ice-cold Dulbecco's Phosphate Buffered Saline without calcium chloride and magnesium chloride (PBS0) (Sigma-Aldrich) to clear the luminal contents. The tissue was cut into 2-4 mm pieces with scissors and washed repeatedly by gently pipetting the fragments using a 10-ml pipette until the supernatant was clear. Fragments were rocked on ice with crypt isolation buffer (2 mM EDTA in PBS0; Life Technologies) for 30 min. After isolation buffer was removed, fragments were washed with cold PBS0 by pipetting up and down to release the crypts. Crypt-containing fractions were combined, passed through a 70-μm cell strainer (BD Bioscience), and centrifuged at 300 rcf for 5 min. The cell pellet was resuspended in basal culture medium (2 mM GlutaMAX (Thermo Fisher Scientific) and 10 mM HEPES (Life Technologies) in Advanced DMEM/F12 (Invitrogen)) and centrifuged at 200 rcf for 2 min to remove single cells. Crypts were then cultured in a Matrigel culture system (described below) in small intestinal crypt medium (100X N2 supplement (Life Technologies), 100X B27 supplement (Life Technologies), 500X N-acetyl-L-cysteine (Sigma-Aldrich) in basal culture medium) supplemented with differentiation factors at 37°C with 5% CO2. Pen/strep (100X) was added for the first four days of culture post-isolation only.

[0424] Small intestinal crypts were cultured as previously described. Briefly, crypts were resuspended in basal culture medium at a 1:1 ratio with Corning™ Matrigel™ Membrane Matrix - GFR (Fisher Scientific) and plated at the center of each well of 24-well plates. Following Matrigel polymerization, 500 μL crypt culture medium (ENR+CV) containing growth factors EGF (50 ng/ml, Life Technologies), Noggin (100 ng/ml, PeproTech) and R-spondin 1 (500 ng/ml, PeproTech) and small molecules CHIR99021 (3 μM, LC Laboratories or Selleckchem) and valproic acid (1 μM, Sigma-Aldrich) was added to each well. ROCK inhibitor Y-27632 (Y, 10 μM, R&D Systems) was added for the first two days of ISC culture only. Cell culture medium was changed every other day. After 4 days of culture, crypt organoids were expanded and enriched for ISCs under the ENR+CV condition. Expanding ISCs were passaged every 6 days in the ENR+CV condition.

[0425] Organoid culture, differentiation, and passaging. After 4 days of culture under the ENR+CV condition, ISCs were differentiated to PCs. Briefly, ISC culture gel and medium were homogenized via mechanical disruption and centrifuged at 300 g for 3 min at 4°C. Supernatant was removed and the pellet resuspended in basal culture medium repeatedly until the cloudy Matrigel was almost gone. On the last repeat, the pellet was resuspended in basal culture medium, the number of organoids counted, and the suspension centrifuged at 300 g for 3 min at 4°C. The cell pellet was resuspended in basal culture medium at a 1:1 ratio with Matrigel and plated at the center of each well of 24-well plates (~250 organoids/well). Following Matrigel polymerization, 500 μL crypt culture medium (ENR+CD) containing growth factors EGF (50 ng/ml), Noggin (100 ng/ml) and R-spondin 1 (500 ng/ml) and small molecules CHIR99021 and DAPT (10 μM, Sigma-Aldrich) was added to each well. Cell culture medium was changed every other day.

[0426] High-throughput screening. For 384-well plate high-throughput screening, ISC-enriched organoids were passaged and split to single cells with TrypLE (Thermo Fisher Scientific) and cultured for 2-3 days in ENR+CVY prior to transfer to a "2.5D" 384-well plate culture system. To prepare for "2.5D" plating, cell-laden Matrigel and media were homogenized via mechanical disruption and centrifuged at 300 g for 3 min at 4°C. Supernatant was removed and the pellet washed and spun in basal culture medium repeatedly until the cloudy Matrigel above the cell pellet was gone. On the final wash, the pellet was resuspended in basal culture medium, the number of organoids counted, and the cell pellet was resuspended in ENR+CD medium at ~7 clusters/μL. 384-well plates were first filled with 10 μL of 70% Matrigel (30% basal media) coating in each well using a Tecan Evo 150 Liquid Handling Deck, and allowed to gel at 37°C for 5 minutes. Then 30 μL of cell-laden media was plated at the center of each well of 384-well plates with the liquid handler, and the plates were spun down at 100 g for 2 minutes to embed organoids on the Matrigel surface. Compound libraries were pinned into prepped cell plates using 50 nL pins into 30 μL media/well. Cells were cultured at 37°C with 5% CO2 for six days in ENR+CD medium supplemented with the tested compounds, with a media change at three days. On day six, lysozyme secretion and cell viability were assessed using Lysozyme Assay Kit (EnzChek) and CellTiter-Glo 3D Cell Viability Assay (Promega), respectively, according to the manufacturers' protocols. Briefly, screen plates were washed 3x with FluoroBrite basal media (2 mM GlutaMAX and 10 mM HEPES in FluoroBrite DMEM (Thermo Fisher Scientific)) using a BioTek 406 plate washer with 10 min incubations followed by a 1 min centrifugation at 200 g to settle media between washes.
After removal of the third wash, 30 μL of non-stimulated FluoroBrite basal media was added to each screen well using a Tecan Evo 150 Liquid Handling Deck from a non-stimulated treatment master plate, and plates were incubated for 30 min at 37°C. After 30 minutes, the top 15 μL of media from each well of the screen plate was transferred to a non-stimulated LYZ assay plate containing 15 μL of 20X DQ LYZ assay working solution using a Tecan Evo 150 Liquid Handling Deck. The non-stimulated LYZ assay plate was covered, shaken for 10 min, incubated for 50 min at 37°C, then fluorescence was measured (shake 10 s; 494 nm/518 nm) using a Tecan M1000 Plate Reader. After the media transfer to the non-stimulated LYZ assay plate, the remaining media was removed from the screen plate and 30 μL of stimulated FluoroBrite basal media (supplemented with 10 μM CCh) was added to each screen well using a Tecan Evo 150 Liquid Handling Deck from a stimulated treatment master plate, and plates were incubated for 30 min at 37°C. After 30 minutes, the top 15 μL of media from each well of the screen plate was transferred to a stimulated LYZ assay plate containing 15 μL of 20X DQ LYZ assay working solution using a Tecan Evo 150 Liquid Handling Deck. The stimulated LYZ assay plate was covered, shaken for 10 min, incubated for 50 min at 37°C, then fluorescence was measured (shake 10 s; 494 nm/518 nm) using a Tecan M1000 Plate Reader. Finally, 8 μL of CTG 3D was added to each well of the screen plate, which was shaken for 30 min at room temperature, then luminescence was read (shake 10 s; integration time 0.5-1 s) to measure ATP.

[0427] Primary screens were performed using the Target Selective Inhibitor Library (Selleck Chem). Assays were performed in triplicate using four compound concentrations (0.08, 0.4, 2, and 10 μM).

[0428] Screen Analysis. A custom R script and pipeline was used for analysis of all screen results. Results (Excel or .csv files) were converted into a data frame containing raw assay measurements and the corresponding metadata for plate position, treatments, doses, cell type, and stimulation. Raw values were log10 transformed, then a LOESS normalization was applied to each plate and assay to remove systematic error and column/row/edge effects using the formula (Mpindi et al., 2015):

$$\hat{x}_{ij} = x_{ij} - \mathrm{fit}^{\mathrm{loess}}_{ij} \qquad (1)$$

[0429] where $\hat{x}_{ij}$ is the loess fit result, $x_{ij}$ is the log10 transformed value at row i and column j, and $\mathrm{fit}^{\mathrm{loess}}_{ij}$ is the value from loess smoothed data at row i and column j calculated using the R loess function with span 1.

[0430] Following LOESS normalization, a plate-wise fold change (FC) calculation was performed on each well to normalize plates across the experiment. This was calculated by subtracting the median of the plate (as control) from the LOESS normalized values:

$$FC_{ij} = \hat{x}_{ij} - \mathrm{median}(\hat{x}_{\mathrm{plate}}) \qquad (2)$$
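The plate-wise fold change described above can be illustrated with a minimal Python sketch. It is not the Applicants' R pipeline; the function name and the example plate values are hypothetical, and the input is assumed to already hold LOESS-normalized log10 values for one plate.

```python
from statistics import median


def platewise_fold_change(normalized_plate):
    """Subtract the plate median from every well of one plate.

    `normalized_plate` is a list of rows, each a list of LOESS-normalized
    log10 values. On a log10 scale the difference is a log fold change
    relative to the plate median, which serves as the control.
    """
    plate_median = median(v for row in normalized_plate for v in row)
    return [[v - plate_median for v in row] for row in normalized_plate]


# Hypothetical 2 x 3 plate of log10-scale values, for illustration only.
example_plate = [
    [2.0, 2.1, 1.9],
    [2.4, 2.0, 1.6],
]
fold_changes = platewise_fold_change(example_plate)
```

Using the plate median as the reference assumes that most wells on a screening plate are inactive, so the median approximates an untreated well.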

[0431] Replicate strictly standardized mean difference (SSMD) was used to determine the statistical effect size of each treatment in each assay (treatment and dose grouped by replicate, n=3) relative to the plate using the formula for the robust uniformly minimal variance unbiased estimate (UMVUE) (Zhang, 2011):

$$SSMD_i = \frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n-2}{2}\right)} \sqrt{\frac{2}{n-1}} \, \frac{\bar{d}_i}{\sqrt{w_i s_i^2 + w_0 s_0^2}} \qquad (3)$$

[0432] where $\bar{d}_i$ and $s_i$ are respectively the sample mean and standard deviation of the $d_{ij}$'s, where $d_{ij}$ is the FC for the ith treatment on the jth plate. $\Gamma(\cdot)$ is the gamma function. $s_0^2$ is an adjustment factor equal to the median of all $s_i^2$'s to provide a more stable estimate of variance. $w_i$ and $w_0$ are weights equal to 0.5 with the constraint of $w_i + w_0 = 1$. n is the replicate number.
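As an illustration of the replicate SSMD described above, the following Python sketch computes a gamma-function scaling term, the adjustment variance (the median of all per-treatment sample variances), and the weighted estimate with both weights set to 0.5. The function names and the example fold changes are hypothetical, and the form of the estimator follows the symbol definitions given in the text rather than the Applicants' actual R code.

```python
import math
from statistics import mean, median, variance


def umvue_factor(n):
    """Gamma-function scaling term for n replicate differences (n >= 3)."""
    if n < 3:
        raise ValueError("at least 3 replicates are required")
    return (math.gamma((n - 1) / 2) / math.gamma((n - 2) / 2)
            * math.sqrt(2 / (n - 1)))


def adjustment_variance(fc_by_treatment):
    """s_0^2: the median of all per-treatment sample variances s_i^2."""
    return median(variance(values) for values in fc_by_treatment.values())


def robust_ssmd(fc_values, s0_squared, w_i=0.5, w_0=0.5):
    """Replicate SSMD for one treatment from its per-plate fold changes."""
    n = len(fc_values)
    d_bar = mean(fc_values)            # sample mean of the d_ij
    s_i_squared = variance(fc_values)  # sample variance of the d_ij
    blended_sd = math.sqrt(w_i * s_i_squared + w_0 * s0_squared)
    return umvue_factor(n) * d_bar / blended_sd


# Hypothetical fold changes on three replicate plates (illustrative only).
example_fc = {
    "compound_A": [0.40, 0.50, 0.60],
    "compound_B": [-0.10, 0.00, 0.10],
    "compound_C": [0.05, 0.25, 0.15],
}
s0_sq = adjustment_variance(example_fc)
ssmd_by_treatment = {
    name: robust_ssmd(values, s0_sq) for name, values in example_fc.items()
}
```

Blending each treatment's own variance with the screen-wide median variance keeps a treatment with three nearly identical replicates from receiving an inflated effect size.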

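[0433] Mean FC (the arithmetic mean of all samples grouped by treatment and dose across replicates) was used to determine the z-score for each treatment and dose with the formula:

$$z_i = \frac{\overline{FC}_i - \mu_{pop}}{SD_{pop}} \qquad (4)$$

where $\overline{FC}_i$ is the mean FC of the ith treatment and dose, and $\mu_{pop}$ and $SD_{pop}$ are the mean and standard deviation of all mean-FCs.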

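[0434] All calculated statistics were combined in one finalized data table and exported as a .csv file for hit identification. A primary screen "hit" was defined as having SSMDs for both LYZ assays greater than the optimal critical value ($\beta_{\alpha_1}$ = 0.997) and being in the top 10% of a normal distribution of FC values for both assays, with a z-score cutoff > 1.282. $\beta_{\alpha_1}$ was determined by minimizing the false positive (FPL) and false negative (FNL) levels for up-regulation SSMD-based decisions by solving for the intersection of the formulas (Zhang, 2011):

$$FPL = 1 - F_{t(n-1,\,\sqrt{n}\,\beta_2)}\left(\frac{\sqrt{n}\,\beta_{\alpha_1}}{k}\right), \qquad FNL = F_{t(n-1,\,\sqrt{n}\,\beta_1)}\left(\frac{\sqrt{n}\,\beta_{\alpha_1}}{k}\right) \qquad (5)$$

[0435] where $F_{t(n-1,\,\sqrt{n}\,b)}(\cdot)$ is the cumulative distribution function of the non-central t-distribution $t(n-1, \sqrt{n}\,b)$, n is the number of replicates, $k = \frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n-2}{2}\right)}\sqrt{\frac{2}{n-1}}$ is the scaling term of formula (3), $\beta_2$ is an SSMD bound for FPL of 0.25 (at least very weak effect), and $\beta_1$ is an SSMD bound for FNL of 3 (at least strong effect).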

[0436] Hit treatments were thus selected to have a well-powered statistical effect size as well as a strong biological effect size. Optimal dose per hit treatment was determined by SSMD for both LYZ assays.
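The two-part hit definition above (an SSMD cutoff plus a z-score cutoff, each applied to both LYZ assays) can be sketched in Python as follows. This is a minimal illustration, not the Applicants' script: the function names are hypothetical, and the z-score is assumed to be centered on the mean of all mean-FC values and scaled by their population standard deviation.

```python
from statistics import mean, pstdev

SSMD_CUTOFF = 0.997  # optimal critical value reported for the primary screen
Z_CUTOFF = 1.282     # upper 10% tail of a standard normal distribution


def z_scores(mean_fc_by_treatment):
    """z-score of each treatment's mean FC against all mean-FC values."""
    values = list(mean_fc_by_treatment.values())
    center = mean(values)
    sd_pop = pstdev(values)  # assumption: population standard deviation
    return {
        name: (fc - center) / sd_pop
        for name, fc in mean_fc_by_treatment.items()
    }


def is_primary_hit(ssmd_ns, ssmd_s, z_ns, z_s):
    """True only if both LYZ assays (non-stimulated and stimulated) pass
    the SSMD cutoff and the z-score cutoff."""
    passes_ssmd = min(ssmd_ns, ssmd_s) > SSMD_CUTOFF
    passes_z = min(z_ns, z_s) > Z_CUTOFF
    return passes_ssmd and passes_z
```

Requiring both criteria means a hit must be reproducible across replicates (SSMD) and also large relative to the rest of the library (z-score).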

[0437] Secondary lysozyme secretion assay screen. Confirmatory secondary screening with primary hits was performed using the above 384-well plate method. The screen was conducted with 4-plate replicates with a base media of ENR+CD. Media was supplemented with compound at day 0 and day 3 (n=8 well replicates per dose) at four different doses: two-fold above, two-fold below, and four-fold below the optimal final dose for each respective treatment. Additionally, each plate carried a large number of ENR+DMSO or ENR+CD+DMSO (vehicle) control wells (n=100 for ATP, and n=25 for LYZ.NS and LYZ.S) for robust normalization. ATP, non-stimulated lysozyme activity, and CCh-stimulated lysozyme activity were again measured, and the collected data were again processed in a custom R script as in the primary screen, with slight modification. Values were log10 transformed, and a plate-wise FC was calculated for each well based on the median value of ENR+CD+DMSO (vehicle) control wells to normalize plate-to-plate variability. The following formula was used:

$$FC_{ij} = x_{ij} - \mathrm{median}(x_{pos}) \qquad (6)$$

[0438] where $x_{ij}$ is the log10 transformed value at row i and column j, and $x_{pos}$ are the values of the positive control wells. For the ATP assay, all vehicle-only wells were used as the control. For the LYZ.NS assay, non-stimulated vehicle-only wells were used. For the LYZ.S assay, vehicle-only wells that were non-stimulated in the LYZ.NS assay then stimulated in the LYZ.S assay were used.

[0439] Once normalized, the replicate SSMD was calculated using formula (3) to quantify statistical effect size, with 8 replicate differences taken relative to the respective plate ENR+DMSO or ENR+CD+DMSO median value. A primary hit was considered validated when SSMDs for both LYZ assays were greater than the optimal critical value ($\beta_{\alpha_1}$) of 0.889. $\beta_{\alpha_1}$ was determined using formula (5) with an FPL error of 0.05 for a more stringent cutoff; FNL was not considered. Optimal doses were chosen for treatments with multiple validated doses by taking the most potent (highest mean fold change relative to ENR+CD control) dose in both LYZ assays.

[0440] Lysozyme secretion assay. ISC-enriched organoids in 3D Matrigel culture were passaged to a 48- or 96-well plate and cultured with ENR or ENR+CD media containing DMSO or each drug for 6 days. DMSO- or drug-containing media were changed every other day. On day 6, cells were washed with basal media twice and treated with basal media with or without 10 μM carbamoylcholine chloride for 3 h in a CO2 incubator at 37°C. A part of the conditioned media was collected and used for lysozyme assay (Thermo, E-22013) following the manufacturer's instructions. The fluorescence was measured using excitation/emission of 485/530 nm. CellTiter-Glo 3D Reagent (Promega, G9681) was added afterward, and the cell culture plate was incubated on an orbital shaker at RT for 30 min to induce cell lysis and to stabilize the luminescent signal. The solution was transferred to a 96-well white microplate, and luminescent signals were measured by a microplate reader (Infinite M200, Tecan). The standard curve was prepared by diluting recombinant ATP (Promega, P1132). For both assays, a cubic polynomial curve was fitted to a set of standard data, and each sample value was calculated in Microsoft Excel.

[0441] Flow cytometry. ISC-enriched organoids in 3D Matrigel culture were passaged to a 48-well plate and differentiated for 6 days in ENR+CD media containing DMSO or each drug indicated in the figures. DMSO- or drug-containing media were changed every other day. On day 6, cells were washed twice with basal media, then harvested from Matrigel by mechanical disruption in TrypLE Express (Thermo, #12605010) to remove Matrigel and dissociate organoids to single cells. After vigorous pipetting and incubation at 37°C for 15 min, the cell solution was diluted twice with basal media and centrifuged at 300 rcf for 3 min. The cell pellet was resuspended in FACS buffer (PBS containing 2% FBS) and transferred into a 96-well Clear Round Bottom Ultra-Low Attachment Microplate (Corning, #7007). The cell solution was centrifuged again at 300 rcf for 3 min at 4°C to pellet the cells. Cells were stained with Zombie-violet dye (BioLegend, #423113) at 100X for viability staining for 20 min at RT in the dark. After centrifugation for 3 min at 300 rcf, cells were fixed in fixation buffer (FACS buffer containing 1% formaldehyde (Thermo, #28906)) for 15 min on ice in the dark. Cells were centrifuged again for 3 min at 300 rcf and blocked with staining buffer (FACS buffer containing 0.5% Tween20 (Sigma, P2287)) for 15 min at RT in the dark. Cells were pelleted by centrifugation for 3 min at 300 rcf and stained with FITC-conjugated anti-lysozyme antibody (Dako, F0372) and APC-conjugated anti-CD24 antibody (BioLegend, #138505) at 100X for 45 min at RT in the dark. The cell pellet was washed once with FACS buffer, resuspended in FACS buffer, and filtered through a 5 mL test tube with cell strainer snap cap (Corning, #352235). Flow cytometry was performed using an LSR Fortessa (BD; Koch Institute Flow Cytometry Core at MIT). Flow cytometry data were analyzed using FlowJo X v10.6.1 software.

[0442] Western blotting. Organoid-containing gel was homogenized in basal medium and centrifuged at 300 rcf for 3 min. The organoid pellet was lysed with ice-cold Pierce IP Lysis Buffer (Thermo Fisher Scientific, #87787) containing Halt Protease Inhibitor Cocktail, EDTA-Free (Thermo Fisher Scientific, #87785) and incubated on ice for 20 min. The lysate was centrifuged at 17,000 rcf for 10 min, and the supernatant was combined with NuPAGE LDS Sample Buffer (Thermo Fisher Scientific, NP0007). Protein concentration was determined by Pierce 660 nm Protein Assay (Thermo Fisher Scientific, #22660) and normalized to the lowest concentration among each sample set. Samples were incubated at 70°C for 10 minutes and resolved by SDS-PAGE using NuPAGE 4-12% Bis-Tris Protein Gels (Thermo Fisher Scientific) followed by electroblotting onto Immun-Blot PVDF Membrane (Biorad, 1620174) using Criterion Blotter with Plate Electrodes (Biorad, #1704070). The membranes were blocked with 2% Blotting-Grade Blocker (Biorad, 1706404) in TBS-T (25 mM Tris-HCl, 140 mM NaCl, 3 mM potassium chloride and 0.1% Tween 20) and then probed with appropriate antibodies, diluted in TBS-T containing 2% BSA (Sigma, A7906) and 0.05% sodium azide (Sigma, #71289). The primary antibody against lysozyme was purchased from Abcam (ab108508). HRP-linked anti-rabbit IgG antibodies were purchased from Cell Signaling Technology (#7074). Chemiluminescent signals were detected by LAS4000 (GE Healthcare) using Amersham ECL Select Western Blotting Detection Reagent (GE Healthcare, #45-000-999), and total protein signals were obtained by Odyssey Imaging System (LI-COR Biosciences) using REVERT Total Protein Stain Kit (LI-COR Biosciences, #926-11010).

[0443] Animal study. 8-10-week-old wild-type C57BL/6NCrl male mice (#027) were purchased from Charles River. Mice were housed under a 12 h light/dark cycle and provided food and water ad libitum. KPT-330 at 0.01, 0.05, 0.2 or 10 mg/kg was administered orally using a disposable gavage needle (Cadence Science, #9921) at 10 μL/g body weight. KPT-330 was dissolved in DMSO initially and further diluted in sterile PBS containing Pluronic F-68 Non-ionic Surfactant (Gibco, #24040032) and Polyvinylpyrrolidone (PVP, Alfa Aesar, A14315, average M.W. 58,000); the final concentration of DMSO was 2%, Pluronic 0.5%, and PVP 0.5%. KPT-330 was administered every other day for two weeks, 7 doses in total (days 0, 2, 4, 6, 8, 10, 12), and mice were sacrificed at day 14. All animal studies were approved by the Committee on Animal Care (CAC) at Massachusetts Institute of Technology.
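The dosing arithmetic implied by this protocol can be sketched in Python. The sketch assumes a gavage volume of 10 μL per gram of body weight; the function names and the 25 g example mouse are hypothetical and serve only to show how dose, volume, and formulation concentration relate.

```python
GAVAGE_UL_PER_G = 10.0  # assumed gavage volume: 10 uL per gram body weight


def dosing_days(first_day=0, interval_days=2, n_doses=7):
    """Days on which compound is given (every other day, 7 doses)."""
    return [first_day + interval_days * i for i in range(n_doses)]


def gavage_volume_ul(body_weight_g):
    """Volume to administer to one mouse, in microliters."""
    return GAVAGE_UL_PER_G * body_weight_g


def formulation_concentration_mg_per_ml(dose_mg_per_kg):
    """Concentration needed so the fixed volume delivers the target dose.

    mg/kg equals ug/g; dividing by uL/g gives ug/uL, which equals mg/mL.
    """
    return dose_mg_per_kg / GAVAGE_UL_PER_G


schedule = dosing_days()                 # [0, 2, 4, 6, 8, 10, 12]
volume_25g_mouse = gavage_volume_ul(25)  # 250 uL for a hypothetical 25 g mouse
concentrations = {
    dose: formulation_concentration_mg_per_ml(dose)
    for dose in (0.01, 0.05, 0.2, 10)
}
```

Because the volume scales with body weight, each dose group needs only one formulation concentration regardless of individual mouse weight.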

[0444] Histology. The small intestine (SI) was collected from mice and divided into three parts. Only proximal and distal SI were kept in PBS, and medial SI was discarded. Each SI was opened longitudinally and washed in PBS. SI was rolled using the Swiss-rolling technique and incubated in 10% Neutral Buffered Formalin (VWR, 10790-714) for 24 h at RT. Fixed tissues were embedded in paraffin, and 4 μm sections were mounted on slides. For immunohistochemistry, slides were deparaffinized, antigen retrieved using heat-induced epitope retrieval at 97°C for 20 min using citrate buffer pH 6, and probed with appropriate antibodies followed by DAB staining. For McManus Periodic Acid Schiff (PAS) reaction, slides were deparaffinized, oxidized in periodic acid, and stained with Schiff reagent (Poly Scientific, s272) followed by counterstaining with Harris Hematoxylin. Slides were scanned by Aperio Slide Scanner (Leica) and cells were counted on Aperio eSlide Manager. Slides were randomized before counting, and all cell types were counted in all well-preserved crypts along the longitudinal crypt-villus axis.

[0445] Single-cell RNA-sequencing. A single-cell suspension was obtained from organoids cultured under conditions for the differentiation time course as described above. Applicants utilized the Seq-Well platform for massively parallel scRNA-seq to capture transcriptomes of single cells on barcoded mRNA capture beads. Full methods on implementation of this platform are available in Hughes, et al. (2019). Highly Efficient, Massively-Parallel Single-Cell RNA-Seq Reveals Cellular States and Molecular Features of Human Skin Pathology. BioRxiv 689273. In brief, 20,000 cells from one organoid condition were loaded onto one array containing 100,000 barcoded mRNA capture beads. The loaded arrays containing cells and beads were then sealed using a polycarbonate membrane with a pore size of 0.01 μm, which allows for exchange of buffers but retains biological molecules confined within each microwell. Subsequent exchange of buffers allows for cell lysis, transcript hybridization, and bead recovery before performing reverse transcription en masse. Following reverse transcription and exonuclease treatment to remove excess primers, PCR amplification was carried out using KAPA HiFi PCR Mastermix with 2,000 beads per 50 μL reaction volume. Libraries were then pooled and purified using Agencourt AMPure XP beads (Beckman Coulter, A63881) by a 0.6X SPRI followed by a 0.8X SPRI and quantified using Qubit hsDNA Assay (Thermo Fisher). Libraries were constructed using the Nextera Tagmentation method on a total of 800 pg of pooled cDNA library per sample. Tagmented and amplified sequences were purified at a 0.6X SPRI ratio, yielding library sizes with an average distribution of 650-750 base pairs in length as determined using the Agilent hsD1000 Screen Tape System (Agilent Genomics). Arrays were sequenced with an Illumina NovaSeq System.
The read structure was paired end with Read 1 starting from a custom read 1 primer containing 20 bases with a 12bp cell barcode and 8bp unique molecular identifier (UMI) and Read 2 being 50 bases containing transcript information.

[0446] Single-cell RNA-sequencing computational pipelines and analysis. Read alignment was performed as in (Macosko et al., 2015). Briefly, for each sequencing run, raw sequencing data was converted to demultiplexed FASTQ files using bcl2fastq2 based on Nextera N700 indices corresponding to individual samples/arrays. Reads were then aligned to the mm10 genome using the Drop-Seq tools v2.3 on the Terra portal maintained by the Broad Institute using standard settings. Individual reads were tagged according to the 12-bp barcode sequence and the 8-bp UMI contained in Read 1 of each fragment. Following alignment, reads were binned onto 12-bp cell barcodes and collapsed by their 8-bp UMI. Digital gene expression matrices (i.e., cell-by-gene tables) for each sample were obtained from quality-filtered and mapped reads and UMI-collapsed data, are deposited in GSE100274, and were utilized as input into Seurat v3 for further analysis.
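The barcode binning and UMI collapsing described above can be illustrated with a short Python sketch. It is a simplified stand-in for what Drop-Seq tools perform, not a reimplementation: it assumes reads are already aligned and assigned to a gene, it places the 12-bp cell barcode before the 8-bp UMI in Read 1, and it omits barcode error correction. The function names and input format are hypothetical.

```python
from collections import defaultdict

BARCODE_LEN = 12  # cell barcode length in Read 1
UMI_LEN = 8       # unique molecular identifier length in Read 1


def split_read1(read1):
    """Split a 20-base Read 1 into (cell barcode, UMI)."""
    if len(read1) < BARCODE_LEN + UMI_LEN:
        raise ValueError("Read 1 is shorter than barcode + UMI")
    barcode = read1[:BARCODE_LEN]
    umi = read1[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    return barcode, umi


def digital_expression(aligned_reads):
    """Build a cell-by-gene table of UMI counts.

    `aligned_reads` is an iterable of (read1_sequence, gene_name) pairs.
    Reads sharing barcode, gene, and UMI are counted once, which removes
    PCR duplicates.
    """
    umis_seen = defaultdict(set)
    for read1, gene in aligned_reads:
        barcode, umi = split_read1(read1)
        umis_seen[(barcode, gene)].add(umi)

    table = defaultdict(dict)
    for (barcode, gene), umis in umis_seen.items():
        table[barcode][gene] = len(umis)
    return dict(table)
```

For example, two identical reads for the same gene from one cell contribute a single count, while a third read with a different UMI raises the count to two.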

[0447] To analyze ENR+CV, ENR, and ENR+CD organoids together, Applicants merged UMI matrices across all genes detected in any condition and generated a matrix retaining all cells with at least 1000 UMI detected. This table was then utilized to set up the Seurat object, in which any cell with at least 400 unique genes was retained and any gene expressed in at least 5 cells was retained. The object was initiated with log-normalization, scaling, and centering set to True. Before performing dimensionality reduction, data was subset to include cells with less than 8,000 UMI, and a list of 1,676 most variable genes was generated by including genes with an average normalized and scaled expression value greater than 0.14 and with a dispersion (variance/mean) greater than 0.4. The total number of ENR+CV, ENR, and ENR+CD cells included in the analysis was 985, 2,544, and 2,382, respectively, with quality metrics for nGene, nUMI, and percentage of ribosomal and mitochondrial genes reported in Fig. 30. Applicants then performed principal component analysis over the list of variable genes. For both clustering and t-distributed stochastic neighbor embedding (tSNE), Applicants utilized the first 12 principal components based on the elbow method, as upon visual inspection of the genes contained within, each contributed to important biological processes of intestinal cells. Applicants used FindClusters with a resolution of 1.35 and 1000 iterations of tSNE to identify 14 clusters across the 3 input samples. To identify genes which defined each cluster, Applicants performed a ROC test implemented in Seurat with a threshold set to an AUC of 0.60.
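The cell and gene quality thresholds stated above can be sketched in plain Python. This is an illustration only and does not use Seurat: the function name and the dictionary-based input format are hypothetical, and the sketch applies the cell filters in a single pass, whereas the text describes applying them at successive stages.

```python
def filter_cells_and_genes(counts, min_umi=1000, max_umi=8000,
                           min_genes=400, min_cells=5):
    """Apply the stated QC thresholds to a cell -> gene -> UMI count table.

    Cells are kept with at least `min_umi` and fewer than `max_umi` total
    UMI and at least `min_genes` detected genes. Genes are kept when
    detected in at least `min_cells` of the retained cells.
    """
    kept_cells = {}
    for cell, genes in counts.items():
        n_umi = sum(genes.values())
        n_genes = sum(1 for count in genes.values() if count > 0)
        if min_umi <= n_umi < max_umi and n_genes >= min_genes:
            kept_cells[cell] = genes

    cells_per_gene = {}
    for genes in kept_cells.values():
        for gene, count in genes.items():
            if count > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, n in cells_per_gene.items() if n >= min_cells}

    return {
        cell: {g: c for g, c in genes.items() if g in kept_genes}
        for cell, genes in kept_cells.items()
    }
```

The defaults mirror the thresholds in the text; smaller values can be passed for toy data.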

[0448] Transcriptional Scoring. To determine the fractional contribution to a cell's transcriptome of a gene list, Applicants summed the total log(scaled UMI+1) expression values for genes within a list of interest and divided by the total amount of scaled UMI detected in that cell, giving a proportion of a cell's transcriptome dedicated to producing those genes. From the proteomic screen, Applicants took a list of upregulated proteins (249) or downregulated proteins (212) that were detected within the single-cell RNA-sequencing data. To determine the relationship to in vivo Paneth cells and EECs, Applicants took reference data from two Seq-Well experiments run on epithelial cells dissociated from the ileal region of the small intestine of two C57BL/6J mice run in separate experiments. Ileum was first rinsed in 30 mL of ice-cold PBS and allowed to settle. The segment was then sliced with scissors and transferred to 10 mL epithelial cell solution (HBSS Ca/Mg-Free, 10 mM EDTA, 100 U/mL penicillin, 100 μg/mL streptomycin, 10 mM HEPES, 2% FCS (ThermoFisher)) freshly supplemented with 200 μL of 0.5 M EDTA. The epithelial separation from the underlying lamina propria was performed for 15 minutes at 37°C in a rotisserie rack with end-over-end rotation. The tube was then removed and placed on ice immediately for 10 minutes before shaking vigorously 15 times. Visual macroscopic inspection of the tube at this point should yield visible epithelial sheets, and microscopic examination confirms the presence of single-layer sheets and crypt-villus structures. The epithelial fraction was spun down at 400 g for 7 minutes and resuspended in 1 mL of epithelial cell solution before transferring to a 1.5 mL Eppendorf tube to minimize time spent centrifuging. Cells were spun down at 800 g for 2 minutes and resuspended in TrypLE Express for 5 minutes in a 37°C bath followed by gentle trituration with a P1000 pipette.
Cells were spun down at 800 g for 2 minutes and resuspended in ACK lysis buffer (ThermoFisher) for 3 minutes on ice to remove red blood cells and dying cells. Cells were spun down at 800 g for 2 minutes and resuspended in 1 mL of epithelial cell solution and placed on ice for 3 minutes before triturating with a P1000 pipette and filtering into a new Eppendorf tube through a 40 μm cell strainer (Falcon/VWR). Cells were spun down at 800 g for 2 minutes and then resuspended in 200 μL of epithelial cell solution and placed on ice for counting. Single-cell RNA-seq data was then generated as described in the "Single-cell RNA-sequencing" and "Single-cell RNA-sequencing computational pipelines and analysis" sections of the methods. To generate Paneth and EEC signatures, Applicants ran unbiased SNN-graph based clustering, performed a ROC test, identified the two mature Paneth and EEC clusters, and report all genes with an AUC above 0.60, and use all genes with an AUC above 0.65 for scoring, within each cluster (gene lists in Table 6), representing any gene with enrichment in Paneth and EE cells. These lists capture genes which are enriched in Paneth (Lyz-high) and EE (Chga-high) cells and separate them from the rest of the cells present in the intestinal epithelium. For pathway analysis, Applicants inspected curated gene lists deposited in the GSEA platform and used KEGG-derived Wnt and Reactome-derived Notch and Respiratory Electron Transport Chain signatures (Table 3B). In vivo transcription factors for PCs and EECs were determined by matching the PC and EEC signature gene sets with transcription factors from the Riken Transcription Factor Database (TFdb - genome.gsc.riken.jp/TFdb/), and then including only those TFs which were robustly identified in the proteome dataset.
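The transcriptional score defined at the start of this section can be sketched as a simple ratio. The Python below is one reading of that description, not the Applicants' code: it assumes each cell is represented as a mapping from gene name to its log(scaled UMI+1) value and that the denominator is the sum of those values over all genes in the cell. The function name is hypothetical.

```python
def transcriptional_score(cell_expression, gene_list):
    """Fraction of a cell's normalized expression assigned to a gene list.

    `cell_expression` maps gene name -> log(scaled UMI + 1) for one cell.
    Genes in `gene_list` that were not detected contribute zero.
    """
    total = sum(cell_expression.values())
    if total == 0:
        return 0.0
    in_list = sum(cell_expression.get(gene, 0.0) for gene in gene_list)
    return in_list / total
```

A cell whose signature genes account for half of its summed expression scores 0.5, so scores from cells with different sequencing depth remain comparable.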

Table 6. Marker genes from organoid differentiation time course single-cell RNA-seq, as determined by Wilcoxon differential expression testing of cluster versus rest. The list of genes was obtained using the following significance cut-offs: false discovery rate (FDR) < 0.05, log2 fold-change > 0.5.

[0449] Quantification and statistical analysis. In each experiment, multiple mice were analyzed as biological replicates: n = 3 mice for data reported in Figure 23; n = 5 for data reported in Figure 23E; n = 8 single-well replicates randomly selected from 5 donor mice for data in Figure 23F; n = 13 co-culture well replicates randomly selected from 4 donor mice for data reported in Figure 23G; n = 2 mice (2 technical replicates each) for data reported in Figure 24; n = 1 C57BL/6 mouse and 1 [in vivo genotype] mouse for data reported in Figure 25; n = 3 mice for data reported in Figure 26. Graphs show mean ± SEM, unless otherwise noted. Unpaired two-tailed t-test and 2-way ANOVA with multiple comparisons were used to assess statistical significance. * indicates p < .05, ** p < .01, *** p < .001, and **** p < .0001.
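The asterisk convention above maps directly onto p-value thresholds. A minimal Python sketch of that mapping follows; the function name is hypothetical.

```python
def significance_stars(p_value):
    """Return the asterisk label used in the figures for a p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # not significant at the 0.05 level
```

The thresholds are checked from most to least stringent so that each p-value receives only its strongest label.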

References

Arango, N.P., Yuca, E., Zhao, M., Evans, K.W., Scott, S., Kim, C., Gonzalez-Angulo, A.M., Janku, F., Ueno, N.T., Tripathy, D., et al. (2017). Selinexor (KPT-330) demonstrates anti-tumor efficacy in preclinical models of triple-negative breast cancer. Breast Cancer Res. 19, 93.

Ayyaz, A., Kumar, S., Sangiorgi, B., Ghoshal, B., Gosio, J., Ouladan, S., Fink, M., Barutcu, S., Trcka, D., Shen, J., et al. (2019). Single-cell transcriptomes of the regenerating intestine reveal a revival stem cell. Nature 569, 121-125.

Azmi, A.S., Aboukameel, A., Bao, B., Sarkar, F.H., Philip, P.A., Kauffman, M., Shacham, S., and Mohammad, R.M. (2013). Selective Inhibitors of Nuclear Export Block Pancreatic Cancer Cell Proliferation and Reduce Tumor Growth in Mice. Gastroenterology 144, 447-456.

Barker, N., van Es, J.H., Kuipers, J., Kujala, P., van den Born, M., Cozijnsen, M., Haegebarth, A., Korving, J., Begthel, H., Peters, P.J., et al. (2007). Identification of stem cells in small intestine and colon by marker gene Lgr5. Nature 449, 1003-1007.

Basak, O., Beumer, J., Wiebrands, K., Seno, H., van Oudenaarden, A., and Clevers, H. (2017). Induced Quiescence of Lgr5+ Stem Cells in Intestinal Organoids Enables Differentiation of Hormone-Producing Enteroendocrine Cells. Cell Stem Cell 20, 177-190.e4.

Beyaz, S., Mana, M.D., Roper, J., Kedrin, D., Saadatpour, A., Hong, S.-J., Bauer-Rowe, K.E., Xifaras, M.E., Akkad, A., Arias, E., et al. (2016). High-fat diet enhances stemness and tumorigenicity of intestinal progenitors. Nature 531, 53-58.

Bhatia, S.N., and Ingber, D.E. (2014). Microfluidic organs-on-chips. Nat. Biotechnol. 32, 760-772.

Biton, M., Haber, A.L., Rogel, N., Burgin, G., Beyaz, S., Schnell, A., Ashenberg, O., Su, C.-W., Smillie, C., Shekhar, K., et al. (2018). T Helper Cell Cytokines Modulate Intestinal Stem Cell Renewal and Differentiation. Cell 1-14.

Cheng, C.-W., Biton, M., Haber, A.L., Gunduz, N., Eng, G., Gaynor, L.T., Tripathi, S., Calibasi-Kocal, G., Rickelt, S., Butty, V.L., et al. (2019). Ketone Body Signaling Mediates Intestinal Stem Cell Homeostasis and Adaptation to Diet. Cell 178, 1115-1131.e15.

Clevers, H. (2016). Modeling Development and Disease with Organoids. Cell 165, 1586-1597.

Dionne, S., Laberge, S., Deslandres, C., and Seidman, E.G. (2003). Modulation of cytokine release from colonic explants by bacterial antigens in inflammatory bowel disease. Clin. Exp. Immunol. 133, 108-114.

Draheim, K.M., Chen, H.B., Tao, Q., Moore, N., Roche, M., and Lyle, S. (2010). ARRDC3 suppresses breast cancer progression by negatively regulating integrin β4. Oncogene 29, 5032-5047.

Eriguchi, Y., Takashima, S., Oka, H., Shimoji, S., Nakamura, K., Uryu, H., Shimoda, S., Iwasaki, H., Shimono, N., Ayabe, T., et al. (2012). Graft-versus-host disease disrupts intestinal microbial ecology by inhibiting Paneth cell production of α-defensins. Blood 120, 223-231.

Forbes, D.J., Travesa, A., Nord, M.S., and Bernis, C. (2015). Nuclear transport factors: Global regulation of mitosis. Curr. Opin. Cell Biol. 35, 78-90.

Fre, S., Huyghe, M., Mourikis, P., Robine, S., Louvard, D., and Artavanis-Tsakonas, S. (2005). Notch signals control the fate of immature progenitor cells in the intestine. Nature 435, 964-968.

Fu, S.C., Huang, H.C., Horton, P., and Juan, H.F. (2013). ValidNESs: A database of validated leucine-rich nuclear export signals. Nucleic Acids Res. 41, 338-343.

Gassler, N. (2017). Paneth cells in intestinal physiology and pathophysiology. World J. Gastrointest. Pathophysiol. 8, 150-160.

Haber, A.L., Biton, M., Rogel, N., Herbst, R.H., Shekhar, K., Smillie, C., Burgin, G., Delorey, T.M., Howitt, M.R., Katz, Y., et al. (2017). A single-cell survey of the small intestinal epithelium. Nature.

Hafemeister, C., and Satija, R. (2019). Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol. 20, 296.

Han, T., Schatoff, E.M., Murphy, C., Zafra, M.P., Wilkinson, J.E., Elemento, O., and Dow, L.E. (2017). R-Spondin chromosome rearrangements drive Wnt-dependent tumour initiation and maintenance in the intestine. Nat. Commun. 8, 15945.

Hayase, E., Hashimoto, D., Nakamura, K., Noizat, C., Ogasawara, R., Takahashi, S., Ohigashi, H., Yokoi, Y., Sugimoto, R., Matsuoka, S., et al. (2017). R-Spondin1 expands Paneth cells and prevents dysbiosis induced by graft-versus-host disease. J. Exp. Med. 214, 3507-3518.

Hing, Z.A., Fung, H.Y.J., Ranganathan, P., Mitchell, S., El-Gamal, D., Woyach, J.A., Williams, K., Goettl, V.M., Smith, J., Yu, X., et al. (2016). Next-generation XPO1 inhibitor shows improved efficacy and in vivo tolerability in hematological malignancies. Leukemia 30, 2364-2372.

Hughes, T.K., Wadsworth, M.H., Gierahn, T.M., Do, T., Weiss, D., Andrade, P.R., Ma, F., Silva, B.J. de A., Shao, S., Tsoi, L.C., et al. (2019). Highly Efficient, Massively-Parallel Single-Cell RNA-Seq Reveals Cellular States and Molecular Features of Human Skin Pathology. BioRxiv 689273.

Ireland, H., Houghton, C., Howard, L., and Winton, D.J. (2005). Cellular inheritance of a Cre-activated reporter gene to determine Paneth cell longevity in the murine small intestine. Dev. Dyn. 233, 1332-1336.

Jadhav, K., and Zhang, Y. (2017). Activating transcription factor 3 in immune response and metabolic regulation. Liver Res. 1, 96-102.

Jensen, J., Pedersen, E.E., Galante, P., Hald, J., Heller, R.S., Ishibashi, M., Kageyama, R., Guillemot, F., Serup, P., and Madsen, O.D. (2000). Control of endodermal endocrine development by Hes-1. Nat. Genet. 24, 36-44.

Khor, B., Gardet, A., and Xavier, R.J. (2011). Genetics and pathogenesis of inflammatory bowel disease. Nature 474, 307-317.

Kim, K.-A., Kakitani, M., Zhao, J., Oshima, T., Tang, T., Binnerts, M., Liu, Y., Boyle, B., Park, E., Emtage, P., et al. (2005). Mitogenic influence of human R-spondin1 on the intestinal epithelium. Science 309, 1256-1259.

Langhans, S.A. (2018). Three-Dimensional in Vitro Cell Culture Models in Drug Discovery and Drug Repositioning. Front. Pharmacol. 9, 1-14.

Liberzon, A., Birger, C., Thorvaldsdottir, H., Ghandi, M., Mesirov, J.P., and Tamayo, P. (2015). The Molecular Signatures Database Hallmark Gene Set Collection. Cell Syst. 1, 417-425.

Liu, T.-C., Gurram, B., Baldridge, M.T., Head, R., Lam, V., Luo, C., Cao, Y., Simpson, P., Hayward, M., Holtz, M.L., et al. (2016). Paneth cell defects in Crohn's disease patients promote dysbiosis. JCI Insight 1, e86907.

Macosko, E.Z., Basu, A., Satija, R., Nemesh, J., Shekhar, K., Goldman, M., Tirosh, I., Bialas, A.R., Kamitaki, N., Martersteck, E.M., et al. (2015). Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202-1214.

McElroy, S.J., Underwood, M.A., and Sherman, M.P. (2013). Paneth cells and necrotizing enterocolitis: a novel hypothesis for disease pathogenesis. Neonatology 103, 10-20.

McGinnis, C.S., Murrow, L.M., and Gartner, Z.J. (2019). DoubletFinder: Doublet Detection in Single-Cell RNA Sequencing Data Using Artificial Nearest Neighbors. Cell Syst. 8, 329-337.e4.

McGuckin, M.A., Eri, R., Simms, L.A., Florin, T.H.J., and Radford-Smith, G. (2009). Intestinal barrier dysfunction in inflammatory bowel diseases. Inflamm. Bowel Dis. 15, 100-113.

Mead, B.E., Ordovas-Montanes, J., Braun, A.P., Levy, L.E., Bhargava, P., Szucs, M.J., Ammendolia, D.A., MacMullan, M.A., Yin, X., Hughes, T.K., et al. (2018). Harnessing single-cell genomics to improve the physiological fidelity of organoid-derived cell types. BMC Biol. 16, 62.

von Moltke, J., Ji, M., Liang, H., and Locksley, R.M. (2016). Tuft-cell-derived IL-25 regulates an intestinal ILC2-epithelial response circuit. Nature 529, 221-225.

Mpindi, J.P., Swapnil, P., Dmitrii, B., Jani, S., Saeed, K., Wennerberg, K., Aittokallio, T., Ostling, P., and Kallioniemi, O. (2015). Impact of normalization methods on high-throughput screening data with high hit rates and drug testing with dose-response data. Bioinformatics 31, 3815-3821.

Naik, S., Larsen, S.B., Gomez, N.C., Alaverdyan, K., Sendoel, A., Yuan, S., Polak, L., Kulukian, A., Chai, S., and Fuchs, E. (2017). Inflammatory memory sensitizes skin epithelial stem cells to tissue damage. Nature 550, 475-480.

Okubo, T., and Hogan, B.L.M. (2004). Hyperactive Wnt signaling changes the developmental potential of embryonic lung endoderm. J. Biol. 3, 11.

Ordovas-Montanes, J., Dwyer, D.F., Nyquist, S.K., Buchheit, K.M., Vukovic, M., Deb, C., Wadsworth, M.H., Hughes, T.K., Kazer, S.W., Yoshimoto, E., et al. (2018). Allergic inflammatory memory in human respiratory epithelial progenitor cells. Nature 560, 649-654.

Pinto, D., Gregorieff, A., Begthel, H., and Clevers, H. (2003). Canonical Wnt signals are essential for homeostasis of the intestinal epithelium. Genes Dev. 17, 1709-1713.

Ranga, A., Gjorevski, N., and Lutolf, M.P. (2014). Drug discovery through stem cell-based organoid models. Adv. Drug Deliv. Rev. 69-70C, 19-28.

Sansom, O.J., Reed, K.R., Hayes, A.J., Ireland, H., Brinkmann, H., Newton, I.P., Batlle, E., Simon-Assmann, P., Clevers, H., Nathke, I.S., et al. (2004). Loss of Apc in vivo immediately perturbs Wnt signaling, differentiation, and migration. Genes Dev. 18, 1385-1390.

Sato, T., Vries, R.G., Snippert, H.J., van de Wetering, M., Barker, N., Stange, D.E., van Es, J.H., Abo, A., Kujala, P., Peters, P.J., et al. (2009). Single Lgr5 stem cells build crypt-villus structures in vitro without a mesenchymal niche. Nature 459, 262-265.

Sendino, M., Omaetxebarria, M.J., and Rodriguez, J.A. (2018). Hitting a moving target: inhibition of the nuclear export receptor XPO1/CRM1 as a therapeutic approach in cancer. Cancer Drug Resist.

Sherman, M.P., Bennett, S.H., Hwang, F.F.Y., Sherman, J., and Bevins, C.L. (2005). Paneth cells and antibacterial host defense in neonatal small intestine. Infect. Immun. 73, 6143-6146.

Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., et al. (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. 102, 15545-15550.

Sun, Q., Chen, X., Zhou, Q., Burstein, E., Yang, S., and Jia, D. (2016). Inhibiting cancer cell hallmark features through nuclear export inhibition. Signal Transduct. Target. Ther. 7, 34-36.

Tanner, S.M., Berryhill, T.F., Ellenburg, J.L., Jilling, T., Cleveland, D.S., Lorenz, R.G., and Martin, C.A. (2015). Pathogenesis of necrotizing enterocolitis: modeling the innate immune response. Am. J. Pathol. 185, 4-16.

Tetteh, P.W., Basak, O., Farin, H.F., Wiebrands, K., Kretzschmar, K., Begthel, H., van den Born, M., Korving, J., de Sauvage, F., van Es, J.H., et al. (2016). Replacement of Lost Lgr5-Positive Stem Cells through Plasticity of Their Enterocyte-Lineage Daughters. Cell Stem Cell 18, 203-213.

Tirosh, I., Izar, B., Prakadan, S.M., Wadsworth, M.H., Treacy, D., Trombetta, J.J., Rotem, A., Rodman, C., Lian, C., Murphy, G., et al. (2016). Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189-196.

Tyler, P.M., Servos, M.M., de Vries, R.C., Klebanov, B., Kashyap, T., Shacham, S., Landesman, Y., Dougan, M., and Dougan, S.K. (2017). Clinical Dosing Regimen of Selinexor Maintains Normal Immune Homeostasis and T-cell Effector Function in Mice: Implications for Combination with Immunotherapy. Mol. Cancer Ther. 16, 428-439.

VanDussen, K.L., and Samuelson, L.C. (2010). Mouse atonal homolog 1 directs intestinal progenitors to secretory cell rather than absorptive cell fate. Dev. Biol. 346, 215-223.

VanDussen, K.L., Carulli, A.J., Keeley, T.M., Patel, S.R., Puthoff, B.J., Magness, S.T., Tran, I.T., Maillard, I., Siebel, C., Kolterud, A., et al. (2012). Notch signaling modulates proliferation and differentiation of intestinal crypt base columnar stem cells. Development 139, 488-497.

White, J.R., Gong, H., Pope, B., Schlievert, P., and McElroy, S.J. (2017). Paneth-cell-disruption-induced necrotizing enterocolitis in mice requires live bacteria and occurs independently of TLR4 signaling. Dis. Model. Mech. 10, 727-736.

Wu, A., Yu, B., Zhang, K., Xu, Z., Wu, D., He, J., Luo, J., Luo, Y., Yu, J., Zheng, P., et al. (2020). Transmissible gastroenteritis virus targets Paneth cells to inhibit the self-renewal and differentiation of Lgr5 intestinal stem cells via Notch signaling. Cell Death Dis. 11, 40.

Xavier, R.J., and Podolsky, D.K. (2007). Unravelling the pathogenesis of inflammatory bowel disease. Nature 448, 427-434.

Yan, K.S., Gevaert, O., Zheng, G.X.Y., Anchang, B., Probert, C.S., Larkin, K.A., Davies, P.S., Cheng, Z., Kaddis, J.S., Han, A., et al. (2017). Intestinal Enteroendocrine Lineage Cells Possess Homeostatic and Injury-Inducible Stem Cell Activity. Cell Stem Cell 21, 78-90.e6.

Yang, H.W., Chung, M., Kudo, T., and Meyer, T. (2017). Competing memories of mitogen and p53 signalling control cell-cycle entry. Nature 549, 404-408.

Yin, X., Farin, H.F., van Es, J.H., Clevers, H., Langer, R., and Karp, J.M. (2014). Niche-independent high-purity cultures of Lgr5+ intestinal stem cells and their progeny. Nat. Methods 11, 106-112.

Yousefi, M., Li, L., and Lengner, C.J. (2017). Hierarchy and Plasticity in the Intestinal Stem Cell Compartment. Trends Cell Biol. 27, 753-764.

Zhang, X.D. (2011). Hit Selection in Genome-Scale RNAi Screens with Replicates. In Optimal High-Throughput Screening, (Cambridge: Cambridge University Press), pp. 83-108.

Zheng, Y., Gery, S., Sun, H., Shacham, S., Kauffman, M., and Koeffler, H.P. (2014). KPT-330 inhibitor of XPO1-mediated nuclear export has anti-proliferative activity in hepatocellular carcinoma. Cancer Chemother. Pharmacol. 74, 487-495.

Zhou, J., Edgar, B.A., and Boutros, M. (2017a). ATF3 acts as a rheostat to control JNK signalling during intestinal regeneration. Nat. Commun. 8, 1-15.

Zhou, X., Geng, L., Wang, D., Yi, H., Talmon, G., and Wang, J. (2017b). R-Spondin1/LGR5 Activates TGFβ Signaling and Suppresses Colon Cancer Metastasis. Cancer Res. 77, 6589-6602.

[0450] Various modifications and variations of the described methods, pharmaceutical compositions, and kits of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific embodiments, it will be understood that it is capable of further modifications and that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in the art are intended to be within the scope of the invention. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth.